Patent 2399776 Summary

(12) Patent Application: (11) CA 2399776
(54) English Title: NOVEL NUCLEIC ACIDS AND POLYPEPTIDES
(54) French Title: ACIDES NUCLEIQUES ET POLYPEPTIDES
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/12 (2006.01)
  • A61K 31/7088 (2006.01)
  • A61K 38/16 (2006.01)
  • C07K 14/47 (2006.01)
  • C07K 14/52 (2006.01)
  • C07K 16/18 (2006.01)
  • C12N 9/16 (2006.01)
  • C12N 9/64 (2006.01)
  • C12N 15/19 (2006.01)
  • C12N 15/66 (2006.01)
  • C12Q 1/68 (2006.01)
  • G01N 33/543 (2006.01)
  • G01N 33/566 (2006.01)
  • A61K 38/00 (2006.01)
(72) Inventors :
  • TANG, Y. TOM (United States of America)
  • LIU, CHENGHUA (United States of America)
  • DRMANAC, RADOJE T. (United States of America)
  • ASUNDI, VINOD (United States of America)
  • ZHOU, PING (United States of America)
  • XU, CHONGJUN (United States of America)
  • CAO, YICHENG (United States of America)
  • MA, YUNQUING (United States of America)
  • ZHAO, QING A. (United States of America)
  • WANG, DUNRUI (United States of America)
  • WANG, JIAN-RUI (United States of America)
  • ZHANG, JIE (United States of America)
  • REN, FEIYAN (United States of America)
  • CHEN, RUI-HONG (United States of America)
  • WANG, ZHIWEI (United States of America)
  • XUE, AIDONG J. (United States of America)
  • YANG, YONGHONG (United States of America)
  • WEHRMAN, TOM (United States of America)
  • GOODRICH, RYLE (United States of America)
(73) Owners :
  • NUVELO, INC. (United States of America)
(71) Applicants :
  • HYSEQ, INC. (United States of America)
(74) Agent: SMART & BIGGAR
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2001-02-05
(87) Open to Public Inspection: 2001-08-09
Examination requested: 2005-09-16
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2001/004098
(87) International Publication Number: WO2001/057190
(85) National Entry: 2002-08-02

(30) Application Priority Data:
Application No. Country/Territory Date
09/496,914 United States of America 2000-02-03
09/560,875 United States of America 2000-04-27
09/598,075 United States of America 2000-06-20
09/620,325 United States of America 2000-07-19
09/654,936 United States of America 2000-09-01
09/663,561 United States of America 2000-09-15
09/693,325 United States of America 2000-10-20
09/728,422 United States of America 2000-11-30

Abstracts

English Abstract




The present invention provides novel nucleic acids, novel polypeptide
sequences encoded by these nucleic acids and uses thereof.


French Abstract

L'invention concerne des acides nucléiques, des séquences polypeptidiques codées par ces acides nucléiques et leurs utilisations correspondantes.

Claims

Note: Claims are shown in the official language in which they were submitted.



WHAT IS CLAIMED IS:

1. An isolated polynucleotide comprising a nucleotide sequence selected from the group
consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a full length protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature protein
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active domain
coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, and complementary
sequences thereof.

2. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said
polynucleotide hybridizes to the polynucleotide of claim 1 under stringent
hybridization
conditions.

3. An isolated polynucleotide encoding a polypeptide with biological activity,
wherein said
polynucleotide has greater than about 90% sequence identity with the
polynucleotide of claim 1.

4. The polynucleotide of claim 1 wherein said polynucleotide is DNA.

5. An isolated polynucleotide of claim 1 wherein said polynucleotide comprises
the
complementary sequences.

6. A vector comprising the polynucleotide of claim 1.

7. An expression vector comprising the polynucleotide of claim 1.

8. A host cell genetically engineered to comprise the polynucleotide of claim
1.

9. A host cell genetically engineered to comprise the polynucleotide of claim
1 operatively
associated with a regulatory sequence that modulates expression of the
polynucleotide in the host
cell.

10. An isolated polypeptide, wherein the polypeptide is selected from the
group consisting of:

(a) a polypeptide encoded by any one of the polynucleotides of claim 1; and

(b) a polypeptide encoded by a polynucleotide hybridizing under stringent conditions
with any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.


11. A composition comprising the polypeptide of claim 10 and a carrier.

12. An antibody directed against the polypeptide of claim 10.

13. A method for detecting the polynucleotide of claim 1 in a sample,
comprising:

a) contacting the sample with a compound that binds to and forms a complex
with the polynucleotide of claim 1 for a period sufficient to form the
complex; and

b) detecting the complex, so that if a complex is detected, the polynucleotide
of claim 1 is detected.

14. A method for detecting the polynucleotide of claim 1 in a sample,
comprising:

a) contacting the sample under stringent hybridization conditions with
nucleic acid primers that anneal to the polynucleotide of claim 1 under such
conditions;

b) amplifying a product comprising at least a portion of the polynucleotide of
claim 1; and

c) detecting said product and thereby the polynucleotide of claim 1 in the
sample.

15. The method of claim 14, wherein the polynucleotide is an RNA molecule and
the method
further comprises reverse transcribing an annealed RNA molecule into a cDNA
polynucleotide.

16. A method for detecting the polypeptide of claim 10 in a sample,
comprising:

a) contacting the sample with a compound that binds to and forms a complex
with the polypeptide under conditions and for a period sufficient to form the
complex; and

b) detecting formation of the complex, so that if a complex formation is
detected, the polypeptide of claim 10 is detected.

17. A method for identifying a compound that binds to the polypeptide of claim
10,
comprising:

a) contacting the compound with the polypeptide of claim 10 under
conditions sufficient to form a polypeptide/compound complex; and

b) detecting the complex, so that if the polypeptide/compound complex is
detected, a compound that binds to the polypeptide of claim 10 is identified.


18. A method for identifying a compound that binds to the polypeptide of claim
10,
comprising:

a) contacting the compound with the polypeptide of claim 10, in a cell, under
conditions sufficient to form a polypeptide/compound complex, wherein the
complex drives
expression of a reporter gene sequence in the cell; and

b) detecting the complex by detecting reporter gene sequence expression, so
that if the polypeptide/compound complex is detected, a compound that binds to
the polypeptide
of claim 10 is identified.

19. A method of producing the polypeptide of claim 10, comprising,

a) culturing a host cell comprising a polynucleotide sequence selected from
the group consisting of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, a mature
protein coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, an active
domain coding portion of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954,
complementary sequences thereof and a polynucleotide sequence hybridizing under stringent
conditions to SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, under conditions
sufficient to express the polypeptide in said cell; and

b) isolating the polypeptide from the cell culture or cells of step (a).

20. An isolated polypeptide comprising an amino acid sequence selected from the group
consisting of any one of the polypeptides SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960, the mature protein portion thereof, or the active domain thereof.

21. The polypeptide of claim 20 wherein the polypeptide is provided on a
polypeptide array.

22. A collection of polynucleotides, wherein the collection comprises the sequence
information of at least one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954.

23. The collection of claim 22, wherein the collection is provided on a
nucleic acid array.

24. The collection of claim 23, wherein the array detects full-matches to any
one of the
polynucleotides in the collection.

25. The collection of claim 23, wherein the array detects mismatches to any one of the
polynucleotides in the collection.



26. The collection of claim 22, wherein the collection is provided in a
computer-readable
format.

27. A method of treatment comprising administering to a mammalian subject in
need thereof
a therapeutic amount of a composition comprising a polypeptide of claim 10 or
20 and a
pharmaceutically acceptable carrier.

28. A method of treatment comprising administering to a mammalian subject in
need thereof
a therapeutic amount of a composition comprising an antibody that specifically
binds to a
polypeptide of claim 10 or 20 and a pharmaceutically acceptable carrier.



Description

Note: Descriptions are shown in the official language in which they were submitted.





JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME
THIS IS VOLUME 1 OF 3
CONTAINING PAGES 1 TO 208
NOTE: For additional volumes, please contact the Canadian Patent Office


CA 02399776 2002-08-02
WO 01/57190 PCT/US01/04098
NOVEL NUCLEIC ACIDS AND POLYPEPTIDES
1. TECHNICAL FIELD
The present invention provides novel polynucleotides and proteins encoded by such
polynucleotides, along with uses for these polynucleotides and proteins, for example in
therapeutic, diagnostic and research methods.
2. BACKGROUND
Technology aimed at the discovery of protein factors (including e.g., cytokines, such as
lymphokines, interferons, CSFs, chemokines, and interleukins) has matured rapidly over the past
decade. The now routine hybridization cloning and expression cloning techniques clone novel
polynucleotides "directly" in the sense that they rely on information directly related to the
discovered protein (i.e., partial DNA/amino acid sequence of the protein in the case of
hybridization cloning; activity of the protein in the case of expression cloning). More recent
"indirect" cloning techniques such as signal sequence cloning, which isolates
DNA sequences
based on the presence of a now well-recognized secretory leader sequence
motif, as well as
various PCR-based or low stringency hybridization-based cloning techniques,
have advanced the
state of the art by making available large numbers of DNA/amino acid sequences
for proteins
that are known to have biological activity, for example, by virtue of their
secreted nature in the
case of leader sequence cloning, by virtue of their cell or tissue source in
the case of PCR-based
techniques, or by virtue of structural similarity to other genes of known
biological activity.
Identified polynucleotide and polypeptide sequences have numerous applications in, for
example, diagnostics, forensics, gene mapping, identification of mutations responsible for
genetic disorders or other traits, assessment of biodiversity, and production of many other
types of data and products dependent on DNA and amino acid sequences.
3. SUMMARY OF THE INVENTION
The compositions of the present invention include novel isolated polypeptides,
novel
isolated polynucleotides encoding such polypeptides, including recombinant DNA
molecules,
cloned genes or degenerate variants thereof, especially naturally occurring
variants such as allelic
variants, antisense polynucleotide molecules, and antibodies that specifically
recognize one or more
epitopes present on such polypeptides, as well as hybridomas producing such
antibodies.
The compositions of the present invention additionally include vectors, including expression
vectors, containing the polynucleotides of the invention, cells genetically engineered to contain such
polynucleotides and cells genetically engineered to express such polynucleotides.
The present invention relates to a collection or library of at least one novel
nucleic acid
sequence assembled from expressed sequence tags (ESTs) isolated mainly by
sequencing by
hybridization (SBH), and in some cases, sequences obtained from one or more
public databases.
The invention relates also to the proteins encoded by such polynucleotides,
along with therapeutic,
diagnostic and research utilities for these polynucleotides and proteins.
These nucleic acid sequences are designated as SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954. The polypeptide sequences are designated SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960. The nucleic acids and polypeptides are provided in the Sequence Listing.
In the nucleic acids provided in the Sequence Listing, A is adenosine; C is cytosine; G is guanine;
T is thymine; and N is any of the four bases. In the amino acids provided in the Sequence Listing,
* corresponds to the stop codon.
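As an illustration only, the numbering scheme described above can be captured in a small lookup. The ranges below are copied from the text; the function and variable names are hypothetical and are not part of the application, and no pairing between a given nucleic acid number and a given polypeptide number is assumed.

```python
# SEQ ID NO ranges as stated in the text; all names here are illustrative.
NUCLEIC_ACID_RANGES = [(1, 984), (1969, 2952), (3937, 3942), (3949, 3954)]
POLYPEPTIDE_RANGES = [(985, 1968), (2953, 3936), (3943, 3948), (3955, 3960)]


def in_ranges(seq_id, ranges):
    """Return True if seq_id falls inside any inclusive (low, high) range."""
    return any(low <= seq_id <= high for low, high in ranges)


def classify_seq_id(seq_id):
    """Classify a SEQ ID NO as a nucleic acid, a polypeptide, or not listed."""
    if in_ranges(seq_id, NUCLEIC_ACID_RANGES):
        return "nucleic acid"
    if in_ranges(seq_id, POLYPEPTIDE_RANGES):
        return "polypeptide"
    return "not listed"


def count_ids(ranges):
    """Count how many SEQ ID NOs the inclusive ranges cover."""
    return sum(high - low + 1 for low, high in ranges)


if __name__ == "__main__":
    for seq_id in (1, 985, 3943, 3949, 3961):
        print(seq_id, classify_seq_id(seq_id))
    print(count_ids(NUCLEIC_ACID_RANGES), count_ids(POLYPEPTIDE_RANGES))
```

Under these ranges the two groups cover the same number of identifiers, which is consistent with one polypeptide sequence being listed per nucleic acid sequence, although the text above does not state that correspondence explicitly.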
The nucleic acid sequences of the present invention also include nucleic acid sequences that
hybridize to the complement of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 under
stringent hybridization conditions; nucleic acid sequences which are allelic variants or species
homologues of any of the nucleic acid sequences recited above; or nucleic acid sequences that
encode a peptide comprising a specific domain or truncation of the peptides encoded by SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Also included is a polynucleotide comprising a
nucleotide sequence having at least 90% identity to an identifying sequence of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a degenerate variant or fragment thereof. The identifying
sequence can be 100 base pairs in length.
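By way of a hedged illustration, the "at least 90% identity" criterion can be sketched as a position-by-position comparison of two sequences of equal length. This is a simplification adopted for the example: the passage above does not specify how identity is computed, and in practice identity is usually determined after aligning the sequences with a dedicated alignment program. The function names and the 90% default are assumptions made for this sketch.

```python
def percent_identity(seq_a, seq_b):
    """Ungapped percent identity between two equal-length sequences.

    Simplifying assumption: the sequences are already aligned position by
    position, with no insertions or deletions.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold=90.0):
    """Return True if the ungapped identity is at least `threshold` percent."""
    return percent_identity(seq_a, seq_b) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 25               # a made-up 100-base identifying sequence
    variant = "T" * 10 + reference[10:]   # first 10 positions overwritten with T
    print(percent_identity(reference, variant))
    print(meets_identity_threshold(reference, variant))
```

For a 100-base identifying sequence, the threshold in this sketch tolerates at most ten mismatched positions.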
The nucleic acid sequences of the present invention also include the sequence information
from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The
sequence information can be a segment of any one of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or
3949-3954 that uniquely identifies or represents the sequence information of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954.
A collection as used in this application can be a collection of only one
polynucleotide. The
collection of sequence information or identifying information of each sequence
can be provided on
a nucleic acid array. In one embodiment, segments of sequence information are provided on a
nucleic acid array to detect the polynucleotide that contains the segment. The array can be designed
to detect full-match or mismatch to the polynucleotide that contains the segment. The collection
can also be provided in a computer-readable format.
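The distinction between full-match and mismatch detection can be illustrated in silico with plain string comparison. The sketch below is an assumption-laden stand-in, not a model of array hybridization: it compares a segment against every window of a target written in the same orientation and reports windows that agree within a chosen number of mismatches. All names and the example sequences are hypothetical.

```python
def count_mismatches(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_sites(segment, target, max_mismatches=0):
    """Return (position, mismatches) for each window of `target` that agrees
    with `segment` within `max_mismatches` positions.

    max_mismatches=0 reports full matches only; a larger value also reports
    near matches.
    """
    k = len(segment)
    hits = []
    for start in range(len(target) - k + 1):
        d = count_mismatches(segment, target[start:start + k])
        if d <= max_mismatches:
            hits.append((start, d))
    return hits


if __name__ == "__main__":
    target = "TTACGTACGGA"   # made-up sequence for illustration
    segment = "ACGTA"
    print(find_sites(segment, target))                     # full matches only
    print(find_sites(segment, target, max_mismatches=1))   # full and single-mismatch
```

A physical array would involve hybridization to the complementary strand under defined conditions; the sketch only shows the bookkeeping that separates a full match from a mismatch.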
This invention also includes the reverse or direct complement of any of the
nucleic acid
sequences recited above; cloning or expression vectors containing the nucleic
acid sequences; and
host cells or organisms transformed with these expression vectors. Nucleic acid sequences (or their
reverse or direct complements) according to the invention have numerous applications in a variety
of techniques known to those skilled in the art of molecular biology, such as use as hybridization
probes, use as primers for PCR, use in an array, use in computer-readable media, use in sequencing
full-length genes, use for chromosome and gene mapping, use in the recombinant production of
protein, and use in the generation of anti-sense DNA or RNA, their chemical analogs and the like.
In a preferred embodiment, the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954 or novel segments or parts of the nucleic acids of the invention are used as
primers in expression assays that are well known in the art. In a particularly preferred embodiment,
the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or novel
segments or parts of the nucleic acids provided herein are used in diagnostics for identifying
expressed genes or, as well known in the art and exemplified by Vollrath et al., Science 258:52-59
(1992), as expressed sequence tags for physical mapping of the human genome.
The isolated polynucleotides of the invention include, but are not limited to, a
polynucleotide comprising any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; a polynucleotide comprising any of the full length protein
coding sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; and a polynucleotide
comprising any of the nucleotide sequences of the mature protein coding sequences of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. The polynucleotides of the present invention also
include, but are not limited to, a polynucleotide that hybridizes under stringent hybridization
conditions to (a) the complement of any one of the nucleotide sequences set forth in SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954; (b) a nucleotide sequence encoding any one of the
amino acid sequences set forth in the Sequence Listing; (c) a polynucleotide
which is an allelic
variant of any polynucleotides recited above; (d) a polynucleotide which
encodes a species homolog
(e.g. orthologs) of any of the proteins recited above; or (e) a polynucleotide
that encodes a
polypeptide comprising a specific domain or truncation of any of the
polypeptides comprising an
amino acid sequence set forth in the Sequence Listing.
The isolated polypeptides of the invention include, but are not limited to, a polypeptide
comprising any of the amino acid sequences set forth in SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or 3955-3960; or the corresponding full length or mature protein. Polypeptides of the
invention also include polypeptides with biological activity that are encoded by (a) any of the
polynucleotides having a nucleotide sequence set forth in SEQ ID NO: 1-984, 1969-2952,
3937-3942 or 3949-3954; or (b) polynucleotides that hybridize to the complement of the polynucleotides
of (a) under stringent hybridization conditions. Biologically or immunologically active variants of
any of the polypeptide sequences in the Sequence Listing, and "substantial equivalents" thereof
(e.g., with at least about 65%, 70%, 75%, 80%, 85%, 90%, 95%, 98% or 99% amino acid sequence
identity) that preferably retain biological activity are also contemplated. The polypeptides of the
invention may be wholly or partially chemically synthesized but are preferably
produced by
recombinant means using the genetically engineered cells (e.g. host cells) of
the invention.
The invention also provides compositions comprising a polypeptide of the
invention.
Polypeptide compositions of the invention may further comprise an acceptable
carrier, such as a
hydrophilic, e.g., pharmaceutically acceptable, carrier.
The invention also provides host cells transformed or transfected with a
polynucleotide of
the invention.
The invention also relates to methods for producing a polypeptide of the
invention
comprising growing a culture of the host cells of the invention in a suitable
culture medium
under conditions permitting expression of the desired polypeptide, and
purifying the polypeptide
from the culture or from the host cells. Preferred embodiments include those
in which the
protein produced by such process is a mature form of the protein.
Polynucleotides according to the invention have numerous applications in a
variety of
techniques known to those skilled in the art of molecular biology. These techniques include use
as hybridization probes, use as oligomers, or primers, for PCR, use for chromosome and gene
mapping, use in the recombinant production of protein, and use in generation of anti-sense DNA
or RNA, their chemical analogs and the like. For example, when the expression of an mRNA is
largely restricted to a particular cell or tissue type, polynucleotides of the
invention can be used
as hybridization probes to detect the presence of the particular cell or
tissue mRNA in a sample
using, e.g., in situ hybridization.
In other exemplary embodiments, the polynucleotides are used in diagnostics as
expressed sequence tags for identifying expressed genes or, as well known in the art and
exemplified by Vollrath et al., Science 258:52-59 (1992), as expressed sequence tags for physical
mapping of the human genome.
The polypeptides according to the invention can be used in a variety of conventional
procedures and methods that are currently applied to other proteins. For example, a polypeptide
of the invention can be used to generate an antibody that specifically binds
the polypeptide. Such
antibodies, particularly monoclonal antibodies, are useful for detecting or
quantitating the
polypeptide in tissue. The polypeptides of the invention can also be used as
molecular weight
markers, and as a food supplement.
Methods are also provided for preventing, treating, or ameliorating a medical
condition
which comprises the step of administering to a mammalian subject a
therapeutically effective
amount of a composition comprising a polypeptide of the present invention and a
pharmaceutically acceptable carrier.
In particular, the polypeptides and polynucleotides of the invention can be utilized, for
example, in methods for the prevention and/or treatment of disorders involving aberrant protein
expression or biological activity.
The present invention further relates to methods for detecting the presence of the
polynucleotides or polypeptides of the invention in a sample. Such methods can, for example, be
utilized as part of prognostic and diagnostic evaluation of disorders as recited herein and for the
identification of subjects exhibiting a predisposition to such conditions. The invention provides
a method for detecting the polynucleotides of the invention in a sample, comprising contacting
the sample with a compound that binds to and forms a complex with the polynucleotide of
interest for a period sufficient to form the complex and under conditions sufficient to form a
complex and detecting the complex such that if a complex is detected, the polynucleotide of
interest is detected. The invention also provides a method for detecting the polypeptides of the
invention in a sample comprising contacting the sample with a compound that binds to and forms
a complex with the polypeptide under conditions and for a period sufficient to form the complex
and detecting the formation of the complex such that if a complex is formed, the polypeptide is
detected.
The invention also provides kits comprising polynucleotide probes and/or
monoclonal
antibodies, and optionally quantitative standards, for carrying out methods of
the invention.
Furthermore, the invention provides methods for evaluating the efficacy of
drugs, and
monitoring the progress of patients, involved in clinical trials for the
treatment of disorders as
recited above.
The invention also provides methods for the identification of compounds that modulate
(i.e., increase or decrease) the expression or activity of the polynucleotides and/or polypeptides
of the invention. Such methods can be utilized, for example, for the identification of compounds
that can ameliorate symptoms of disorders as recited herein. Such methods can include, but are
not limited to, assays for identifying compounds and other substances that interact with (e.g.,
bind to) the polypeptides of the invention. The invention provides a method for identifying a
compound that binds to the polypeptides of the invention comprising contacting the compound
with a polypeptide of the invention in a cell for a time sufficient to form a polypeptide/compound
complex, wherein the complex drives expression of a reporter gene sequence in the cell; and
detecting the complex by detecting the reporter gene sequence expression such that if expression
of the reporter gene is detected the compound that binds to a polypeptide of the invention is
identified.
The invention also provides methods for treatment which involve the
administration of the polynucleotides or polypeptides of the invention to individuals exhibiting
symptoms or tendencies. In addition, the invention encompasses methods for treating diseases or
disorders as recited herein comprising administering compounds and other
substances that
modulate the overall activity of the target gene products. Compounds and other
substances can
effect such modulation either on the level of target gene/protein expression
or target protein
activity.
The polypeptides of the present invention and the polynucleotides encoding
them are also
useful for the same functions known to one of skill in the art as the
polypeptides and
polynucleotides to which they have homology (set forth in Tables 2 and 9); for which they have
a signature region (as set forth in Tables 3 and 10); or for which they have homology to a gene
family (as set forth in Tables 4 and 11). If no homology is set forth for a sequence, then the
polypeptides and polynucleotides of the present invention are useful for a
variety of applications,
as described herein, including use in arrays for detection.
4. DETAILED DESCRIPTION OF THE INVENTION
4.1 DEFINITIONS
It must be noted that as used herein and in the appended claims, the singular
forms "a",
"an" and "the" include plural references unless the context clearly dictates
otherwise.
The term "active" refers to those forms of the polypeptide which retain the
biologic
and/or immunologic activities of any naturally occurring polypeptide.
According to the
invention, the terms "biologically active" or "biological activity" refer to a
protein or peptide
having structural, regulatory or biochemical functions of a naturally
occurring molecule.
Likewise "immunologically active" or "immunological activity" refers to the
capability of the
natural, recombinant or synthetic polypeptide to induce a specific immune
response in
appropriate animals or cells and to bind with specific antibodies.
The term "activated cells" as used in this application refers to those cells which are engaged in
extracellular or intracellular membrane trafficking, including the export of
secretory or
enzymatic molecules as part of a normal or disease process.
The terms "complementary" or "complementarity" refer to the natural binding of
polynucleotides by base pairing. For example, the sequence 5'-AGT-3' binds to
the
complementary sequence 3'-TCA-5'. Complementarity between two single-stranded
molecules
may be "partial" such that only some of the nucleic acids bind or it may be
"complete" such that
total complementarity exists between the single stranded molecules. The degree of
complementarity between the nucleic acid strands has significant effects on the efficiency and
strength of the hybridization between the nucleic acid strands.
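The base-pairing example above (5'-AGT-3' with 3'-TCA-5') and the distinction between "partial" and "complete" complementarity can be sketched in a few lines. The pairing table is the standard Watson-Crick scheme for DNA; the function names are illustrative, and the paired-fraction measure is only one simple way to express the degree of complementarity, not a measure defined in this application.

```python
# Watson-Crick pairs for DNA; the table and function names are illustrative only.
PAIR = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq):
    """Base-by-base complement; read against a 5'->3' input, the result runs 3'->5'."""
    return "".join(PAIR[base] for base in seq.upper())


def reverse_complement(seq):
    """Complementary strand written in the conventional 5'->3' direction."""
    return complement(seq)[::-1]


def paired_fraction(strand, partner):
    """Fraction of positions that base-pair when two equal-length strands, both
    written 5'->3', are aligned antiparallel. 1.0 corresponds to complete
    complementarity; anything lower is partial."""
    if len(strand) != len(partner) or not strand:
        raise ValueError("strands must be non-empty and of equal length")
    expected = reverse_complement(strand)
    pairs = sum(a == b for a, b in zip(expected, partner.upper()))
    return pairs / len(strand)


if __name__ == "__main__":
    print(complement("AGT"))                # TCA, i.e. 3'-TCA-5'
    print(reverse_complement("AGT"))        # ACT, the same strand written 5'->3'
    print(paired_fraction("AGTC", "GACA"))  # 0.75, a partial match
```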


The term "embryonic stem cells (ES)" refers to a cell that can give rise to
many
differentiated cell types in an embryo or an adult, including the germ cells.
The term "germ line
stem cells (GSCs)" refers to stem cells derived from primordial stem cells
that provide a steady
and continuous source of germ cells for the production of gametes. The term
"primordial germ
cells (PGCs)" refers to a small population of cells set aside from other cell
lineages particularly
from the yolk sac, mesenteries, or gonadal ridges during embryogenesis that
have the potential to
differentiate into germ cells and other cells. PGCs are the source from which GSCs and ES cells
are derived. The PGCs, the GSCs and the ES cells are capable of self-renewal. Thus these cells
not only populate the germ line and give rise to a plurality of terminally differentiated cells that
comprise the adult specialized organs, but are able to regenerate themselves.
The term "expression modulating fragment," EMF, means a series of nucleotides
which
modulates the expression of an operably linked ORF or another EMF.
As used herein, a sequence is said to "modulate the expression of an operably
linked
sequence" when the expression of the sequence is altered by the presence of
the EMF. EMFs
include, but are not limited to, promoters, and promoter modulating sequences (inducible
elements). One class of EMFs are nucleic acid fragments which induce the
expression of an
operably linked ORF in response to a specific regulatory factor or
physiological event.
The terms "nucleotide sequence" or "nucleic acid" or "polynucleotide" or
"oligonucleotide" are used interchangeably and refer to a heteropolymer of nucleotides or the
sequence of these nucleotides. These phrases also refer to DNA or RNA of
genomic or synthetic
origin which may be single-stranded or double-stranded and may represent the
sense or the
antisense strand, to peptide nucleic acid (PNA) or to any DNA-like or RNA-like
material. In the
sequences herein A is adenine, C is cytosine, T is thymine, G is guanine and N
is A, C, G or T
(U). It is contemplated that where the polynucleotide is RNA, the T (thymine)
in the sequences
provided herein is substituted with U (uracil). Generally, nucleic acid
segments provided by this
invention may be assembled from fragments of the genome and short
oligonucleotide linkers, or
from a series of oligonucleotides, or from individual nucleotides, to provide
a synthetic nucleic
acid which is capable of being expressed in a recombinant transcriptional unit
comprising
regulatory elements derived from a microbial or viral operon, or a eukaryotic
gene.
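Two conventions from the definition above lend themselves to a short sketch: reading a listed sequence as RNA by substituting U for T, and treating N as any of A, C, G or T. The helper names below are hypothetical, and enumerating every expansion is practical only for sequences with a handful of N positions, since the count grows as a power of four.

```python
from itertools import product

BASES = ("A", "C", "G", "T")


def to_rna(dna):
    """Write a listed DNA sequence as RNA by substituting U for T."""
    return dna.upper().replace("T", "U")


def expand_n(seq):
    """List every concrete sequence obtained by replacing each N with A, C, G or T.

    The number of results is 4 raised to the number of N positions.
    """
    choices = [BASES if base == "N" else (base,) for base in seq.upper()]
    return ["".join(combo) for combo in product(*choices)]


if __name__ == "__main__":
    print(to_rna("ATGN"))
    print(expand_n("AN"))
    print(len(expand_n("NN")))
```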
The terms "oligonucleotide fragment" or a "polynucleotide fragment",
"portion," or
"segment" or "probe" or "primer" are used interchangeably and refer to a
sequence of nucleotide
residues which are at least about 5 nucleotides, more preferably at least
about 7 nucleotides,
more preferably at least about 9 nucleotides, more preferably at least about
11 nucleotides and
most preferably at least about 17 nucleotides. The fragment is preferably less
than about 500
nucleotides, preferably less than about 200 nucleotides, more preferably less than about 100
nucleotides, more preferably less than about 50 nucleotides and most preferably less than 30
nucleotides. Preferably the probe is from about 6 nucleotides to about 200
nucleotides,
preferably from about 15 to about 50 nucleotides, more preferably from about
17 to 30
nucleotides and most preferably from about 20 to 25 nucleotides. Preferably
the fragments can
be used in polymerase chain reaction (PCR), various hybridization procedures
or microarray
procedures to identify or amplify identical or related parts of mRNA or DNA
molecules. A
fragment or segment may uniquely identify each polynucleotide sequence of the
present
invention. Preferably the fragment comprises a sequence substantially similar
to any one of SEQ
ID NOs: 1-20.
Probes may, for example, be used to determine whether specific mRNA molecules
are
present in a cell or tissue or to isolate similar nucleic acid sequences from
chromosomal DNA as
described by Walsh et al. (Walsh, P.S. et al., 1992, PCR Methods Appl 1:241-250).
They may
be labeled by nick translation, Klenow fill-in reaction, PCR, or other methods
well known in the
art. Probes of the present invention, their preparation and/or labeling are
elaborated in
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor
Laboratory, NY; or Ausubel, F.M. et al., 1989, Current Protocols in Molecular
Biology, John
Wiley & Sons, New York NY, both of which are incorporated herein by reference
in their
entirety.
The nucleic acid sequences of the present invention also include the sequence
information from the nucleic acid sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or
3949-3954. The sequence information can be a segment of any one of SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 that uniquely identifies or represents the
sequence
information of that sequence of SEQ ID NO:1-984, 1969-2952, 3937-3942 or 3949-
3954. One
such segment can be a twenty-mer nucleic acid sequence because the probability
that a twenty-
mer is fully matched in the human genome is 1 in 300. In the human genome,
there are three
billion base pairs in one set of chromosomes. Because 4^20 possible
twenty-mers exist, there are
300 times more twenty-mers than there are base pairs in a set of human
chromosomes. Using the
same analysis, the probability for a seventeen-mer to be fully matched in the
human genome is
approximately 1 in 5. When these segments are used in arrays for expression
studies, fifteen-
mer segments can be used. The probability that the fifteen-mer is fully
matched in the expressed
sequences is also approximately one in five because expressed sequences
comprise less than
approximately 5% of the entire genome sequence.
Similarly, when using sequence information for detecting a single mismatch, a
segment can
be a twenty-five mer. The probability that the twenty-five mer would appear in
a human genome
with a single mismatch is calculated by multiplying the probability for a full
match (1/4^25) times the
increased probability for mismatch at each nucleotide position (3 x 25). The
probability that an
eighteen mer with a single mismatch can be detected in an array for expression
studies is
approximately one in five. The probability that a twenty-mer with a single
mismatch can be
detected in a human genome is approximately one in five.
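The arithmetic behind these estimates can be sketched as follows. This is a rough illustration only, assuming uniformly random, independent bases; the function names are hypothetical, and the genome size and expressed fraction are the round figures quoted in the text.

```python
# Illustrative sketch only; assumes uniformly random, independent bases.
GENOME_BP = 3_000_000_000   # base pairs in one set of human chromosomes (figure from the text)
EXPRESSED_FRACTION = 0.05   # expressed sequences as a share of the genome (upper bound from the text)


def expected_full_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected number of chance exact matches of one k-mer among target_bp positions."""
    return target_bp / 4 ** k


def expected_single_mismatch_matches(k: int, target_bp: float = GENOME_BP) -> float:
    """Expected chance matches with exactly one mismatch: (1/4^k) x (3 x k) per position."""
    return target_bp * 3 * k / 4 ** k


expressed_bp = GENOME_BP * EXPRESSED_FRACTION
cases = [
    ("20-mer, full match, genome", expected_full_matches(20)),
    ("17-mer, full match, genome", expected_full_matches(17)),
    ("15-mer, full match, expressed sequences", expected_full_matches(15, expressed_bp)),
    ("20-mer, single mismatch, genome", expected_single_mismatch_matches(20)),
    ("18-mer, single mismatch, expressed sequences",
     expected_single_mismatch_matches(18, expressed_bp)),
]
for label, value in cases:
    print(f"{label}: about 1 in {1 / value:.0f}")
```

Under these assumptions the computed values fall in the same range as the rounded figures given above; they are approximations, not exact probabilities for a real genome.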
The term "open reading frame," ORF, means a series of nucleotide triplets
coding for
amino acids without any termination codons and is a sequence translatable into
protein.
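As a minimal sketch of this definition, the following Python function checks that a sequence is a whole number of triplets with no termination codon in frame. The function name is hypothetical, and the three stop codons of the standard genetic code are assumed.

```python
# Illustrative sketch only; assumes the standard genetic code.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def is_open_reading_frame(seq: str) -> bool:
    """True if seq is a series of complete triplets with no in-frame termination codon."""
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    if len(seq) == 0 or len(seq) % 3 != 0:
        return False
    codons = (seq[i:i + 3] for i in range(0, len(seq), 3))
    return all(codon not in STOP_CODONS for codon in codons)


print(is_open_reading_frame("ATGGCCAAA"))
print(is_open_reading_frame("ATGTAAGCC"))
```

Only the reading frame starting at the first base is examined; a stop triplet that appears out of frame does not interrupt the ORF.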
The terms "operably linked" or "operably associated" refer to functionally
related nucleic
acid sequences. For example, a promoter is operably associated or operably
linked with a coding
sequence if the promoter controls the transcription of the coding sequence.
While operably
linked nucleic acid sequences can be contiguous and in the same reading frame,
certain genetic
elements, e.g., repressor genes, are not contiguously linked to the coding
sequence but still control
transcription/translation of the coding sequence.
The term "pluripotent" refers to the capability of a cell to differentiate
into a number of
differentiated cell types that are present in an adult organism. A pluripotent
cell is restricted in its
differentiation capability in comparison to a totipotent cell.
The terms "polypeptide" or "peptide" or "amino acid sequence" refer to an
oligopeptide,
peptide, polypeptide or protein sequence or fragment thereof and to naturally
occurring or
synthetic molecules. A polypeptide "fragment," "portion," or "segment" is a
stretch of amino
acid residues of at least about 5 amino acids, preferably at least about 7
amino acids, more
preferably at least about 9 amino acids and most preferably at least about 17
or more amino
acids. The peptide preferably is not greater than about 500 amino acids, more
preferably less
than 200 amino acids, more preferably less than 150 amino acids and most
preferably less than
100 amino acids. Preferably the peptide is from about 5 to about 200 amino
acids. To be active,
any polypeptide must have sufficient length to display biological and/or
immunological activity.
The term "naturally occurring polypeptide" refers to polypeptides produced by
cells that
have not been genetically engineered and specifically contemplates various
polypeptides arising
from post-translational modifications of the polypeptide including, but not
limited to, acetylation,
carboxylation, glycosylation, phosphorylation, lipidation and acylation.
The term "translated protein coding portion" means a sequence which encodes
for the full
length protein which may include any leader sequence or any processing
sequence.
The term "mature protein coding sequence" means a sequence which encodes a
peptide
or protein without a signal or leader sequence. The "mature protein portion"
means that portion
of the protein which does not include a signal or leader sequence. The peptide
may have been
produced by processing in the cell which removes any leader/signal sequence.
The mature
protein portion may or may not include the initial methionine residue. The
methionine residue
may be removed from the protein during processing in the cell. The peptide may
be produced
synthetically or the protein may have been produced using a polynucleotide
only encoding for
the mature protein coding sequence.
The term "derivative" refers to polypeptides chemically modified by such
techniques as
ubiquitination, labeling (e.g., with radionuclides or various enzymes),
covalent polymer
attachment such as pegylation (derivatization with polyethylene glycol) and
insertion or
substitution by chemical synthesis of amino acids such as ornithine, which do
not normally occur
in human proteins.
The term "variant" (or "analog") refers to any polypeptide differing from
naturally
occurring polypeptides by amino acid insertions, deletions, and substitutions,
created using, e.g.,
recombinant DNA techniques. Guidance in determining which amino acid residues
may be
replaced, added or deleted without abolishing activities of interest, may be
found by comparing
the sequence of the particular polypeptide with that of homologous peptides
and minimizing the
number of amino acid sequence changes made in regions of high homology
(conserved regions)
or by replacing amino acids with consensus sequence.
Alternatively, recombinant variants encoding these same or similar
polypeptides may be
synthesized or selected by making use of the "redundancy" in the genetic code.
Various codon
substitutions, such as the silent changes which produce various restriction
sites, may be
introduced to optimize cloning into a plasmid or viral vector or expression in
a particular
prokaryotic or eukaryotic system. Mutations in the polynucleotide sequence may
be reflected in
the polypeptide or domains of other peptides added to the polypeptide to
modify the properties of
any part of the polypeptide, to change characteristics such as ligand-binding
affinities, interchain
affinities, or degradation/turnover rate.
Preferably, amino acid "substitutions" are the result of replacing one amino
acid with
another amino acid having similar structural and/or chemical properties, i.e.,
conservative amino
acid replacements. "Conservative" amino acid substitutions may be made on the
basis of
similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity,
and/or the amphipathic
nature of the residues involved. For example, nonpolar (hydrophobic) amino
acids include
alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and
methionine; polar
neutral amino acids include glycine, serine, threonine, cysteine, tyrosine,
asparagine, and
glutamine; positively charged (basic) amino acids include arginine, lysine,
and histidine; and
negatively charged (acidic) amino acids include aspartic acid and glutamic
acid. "Insertions" or
"deletions" are preferably in the range of about 1 to 20 amino acids, more
preferably 1 to 10
amino acids. The variation allowed may be experimentally determined by
systematically making
insertions, deletions, or substitutions of amino acids in a polypeptide
molecule using
recombinant DNA techniques and assaying the resulting recombinant variants for
activity.
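The grouping of amino acids given above lends itself to a simple lookup. The following sketch is illustrative only: it treats a substitution as "conservative" when both residues fall in the same group listed in the text, which is one possible reading, and the names used are hypothetical.

```python
# Illustrative sketch only; groups follow the lists in the text, in one-letter codes.
AMINO_ACID_GROUPS = {
    "nonpolar": "ALIVPFWM",       # Ala, Leu, Ile, Val, Pro, Phe, Trp, Met
    "polar neutral": "GSTCYNQ",   # Gly, Ser, Thr, Cys, Tyr, Asn, Gln
    "basic": "RKH",               # Arg, Lys, His
    "acidic": "DE",               # Asp, Glu
}
GROUP_OF = {aa: group for group, members in AMINO_ACID_GROUPS.items() for aa in members}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues belong to the same group listed in the text."""
    return GROUP_OF[original.upper()] == GROUP_OF[replacement.upper()]


print(is_conservative("L", "I"))
print(is_conservative("L", "K"))
```

Whether a given replacement actually preserves activity remains an experimental question, as the text notes; the lookup only reflects the stated groupings.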
Alternatively, where alteration of function is desired, insertions, deletions
or
non-conservative alterations can be engineered to produce altered
polypeptides. Such alterations
can, for example, alter one or more of the biological functions or biochemical
characteristics of
the polypeptides of the invention. For example, such alterations may change
polypeptide
characteristics such as ligand-binding affinities, interchain affinities, or
degradation/turnover
rate. Further, such alterations can be selected so as to generate polypeptides
that are better suited
for expression, scale up and the like in the host cells chosen for expression.
For example,
cysteine residues can be deleted or substituted with another amino acid
residue in order to
eliminate disulfide bridges.
The terms "purified" or "substantially purified" as used herein denote that
the indicated
nucleic acid or polypeptide is present in the substantial absence of other
biological
macromolecules, e.g., polynucleotides, proteins, and the like. In one
embodiment, the
polynucleotide or polypeptide is purified such that it constitutes at least
95% by weight, more
preferably at least 99% by weight, of the indicated biological macromolecules
present (but water,
buffers, and other small molecules, especially molecules having a molecular
weight of less than
1000 daltons, can be present).
The term "isolated" as used herein refers to a nucleic acid or polypeptide
separated from
at least one other component (e.g., nucleic acid or polypeptide) present with
the nucleic acid or
polypeptide in its natural source. In one embodiment, the nucleic acid or
polypeptide is found in
the presence of (if anything) only a solvent, buffer, ion, or other component
normally present in a
solution of the same. The terms "isolated" and "purified" do not encompass
nucleic acids or
polypeptides present in their natural source.
The term "recombinant," when used herein to refer to a polypeptide or protein,
means
that a polypeptide or protein is derived from recombinant (e.g., microbial,
insect, or mammalian)
expression systems. "Microbial" refers to recombinant polypeptides or proteins
made in
bacterial or fungal (e.g., yeast) expression systems. As a product,
"recombinant microbial"
defines a polypeptide or protein essentially free of native endogenous
substances and
unaccompanied by associated native glycosylation. Polypeptides or proteins
expressed in most
bacterial cultures, e.g., E. coli, will be free of glycosylation
modifications; polypeptides or
proteins expressed in yeast will have a glycosylation pattern in general
different from those
expressed in mammalian cells.
The term "recombinant expression vehicle or vector" refers to a plasmid or
phage or virus
or vector, for expressing a polypeptide from a DNA (RNA) sequence. An
expression vehicle can
comprise a transcriptional unit comprising an assembly of (1) a genetic
element or elements
having a regulatory role in gene expression, for example, promoters or
enhancers, (2) a structural
or coding sequence which is transcribed into mRNA and translated into protein,
and (3)
appropriate transcription initiation and termination sequences. Structural
units intended for use
in yeast or eukaryotic expression systems preferably include a leader sequence
enabling
extracellular secretion of translated protein by a host cell. Alternatively,
where recombinant
protein is expressed without a leader or transport sequence, it may include an
amino terminal
methionine residue. This residue may or may not be subsequently cleaved from
the expressed
recombinant protein to provide a final product.
The term "recombinant expression system" means host cells which have stably
integrated
a recombinant transcriptional unit into chromosomal DNA or carry the
recombinant
transcriptional unit extrachromosomally. Recombinant expression systems as
defined herein will
express heterologous polypeptides or proteins upon induction of the regulatory
elements linked
to the DNA segment or synthetic gene to be expressed. This term also means
host cells which
have stably integrated a recombinant genetic element or elements having a
regulatory role in
gene expression, for example, promoters or enhancers. Recombinant expression
systems as
defined herein will express polypeptides or proteins endogenous to the cell
upon induction of the
regulatory elements linked to the endogenous DNA segment or gene to be
expressed. The cells
can be prokaryotic or eukaryotic.
The term "secreted" includes a protein that is transported across or through a
membrane,
including transport as a result of signal sequences in its amino acid sequence
when it is expressed
in a suitable host cell. "Secreted" proteins include without limitation
proteins secreted wholly
(e.g., soluble proteins) or partially (e.g., receptors) from the cell in which
they are expressed.
"Secreted" proteins also include without limitation proteins that are
transported across the
membrane of the endoplasmic reticulum. "Secreted" proteins are also intended
to include
proteins containing non-typical signal sequences (e.g. Interleukin-1 Beta, see
Krasney, P.A. and
Young, P.R. (1992) Cytokine 4(2):134 -143) and factors released from damaged
cells (e.g.
Interleukin-1 Receptor Antagonist, see Arend, W.P. et al. (1998) Annu. Rev.
Immunol.
16:27-55).
Where desired, an expression vector may be designed to contain a "signal or
leader
sequence" which will direct the polypeptide through the membrane of a cell.
Such a sequence
may be naturally present on the polypeptides of the present invention or
provided from
heterologous protein sources by recombinant DNA techniques.
The term "stringent" is used to refer to conditions that are commonly
understood in the
art as stringent. Stringent conditions can include highly stringent conditions
(i.e., hybridization
to filter-bound DNA in 0.5 M NaHPO4, 7% sodium dodecyl sulfate (SDS), 1 mM
EDTA at
65°C, and washing in 0.1X SSC/0.1% SDS at 68°C), and moderately
stringent conditions (i.e.,
washing in 0.2X SSC/0.1% SDS at 42°C). Other exemplary hybridization
conditions are
described herein in the examples.
In instances of hybridization of deoxyoligonucleotides, additional exemplary
stringent
hybridization conditions include washing in 6X SSC/0.05% sodium pyrophosphate
at 37°C (for
14-base oligonucleotides), 48°C (for 17-base oligos), 55°C (for
20-base oligonucleotides), and
60°C (for 23-base oligonucleotides).
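For convenience, the oligonucleotide wash temperatures listed above can be held in a small lookup table, sketched below. The names are hypothetical, and no temperature is inferred for lengths the text does not list.

```python
# Illustrative sketch only; values are the exemplary wash temperatures listed in the text
# (washing in 6X SSC/0.05% sodium pyrophosphate).
OLIGO_WASH_TEMP_C = {14: 37, 17: 48, 20: 55, 23: 60}  # oligonucleotide length -> degrees C


def wash_temperature_c(oligo_length: int) -> int:
    """Return the listed wash temperature; lengths not listed are not interpolated."""
    if oligo_length not in OLIGO_WASH_TEMP_C:
        raise ValueError(f"no exemplary temperature listed for {oligo_length}-base oligonucleotides")
    return OLIGO_WASH_TEMP_C[oligo_length]


print(wash_temperature_c(17))
```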
As used herein, "substantially equivalent" can refer both to nucleotide and
amino acid
sequences, for example a mutant sequence, that varies from a reference
sequence by one or more
substitutions, deletions, or additions, the net effect of which does not
result in an adverse
functional dissimilarity between the reference and subject sequences.
Typically, such a
substantially equivalent sequence varies from one of those listed herein by no
more than about
35% (i.e., the number of individual residue substitutions, additions, and/or
deletions in a
substantially equivalent sequence, as compared to the corresponding reference
sequence, divided
by the total number of residues in the substantially equivalent sequence is
about 0.35 or less).
Such a sequence is said to have 65% sequence identity to the listed sequence.
In one
embodiment, a substantially equivalent, e.g., mutant, sequence of the
invention varies from a
listed sequence by no more than 30% (70% sequence identity); in a variation of
this embodiment,
by no more than 25% (75% sequence identity); and in a further variation of
this embodiment, by
no more than 20% (80% sequence identity) and in a further variation of this
embodiment, by no
more than 10% (90% sequence identity) and in a further variation of this
embodiment, by no
more than 5% (95% sequence identity). Substantially equivalent, e.g., mutant,
amino acid
sequences according to the invention preferably have at least 80% sequence
identity with a listed
amino acid sequence, more preferably at least 85% sequence identity, more
preferably at least
90% sequence identity, more preferably at least 95% sequence identity, more
preferably at least
98% sequence identity and most preferably at least 98% identity. Substantially
equivalent
nucleotide sequences of the invention can have lower percent sequence
identities, taking into
account, for example, the redundancy or degeneracy of the genetic code.
Preferably, a nucleotide
sequence has at least about 65% identity, more preferably at least about 75%
identity, more
preferably at least about 80% identity, more preferably at least about 85%
identity, more
preferably at least about 90% identity, and most preferably at least about 95%
identity, more
preferably at least 98% and most preferably at least about 99% identity. For
the purposes of the
present invention, sequences having substantially equivalent biological
activity and substantially
equivalent expression characteristics are considered substantially equivalent.
For the purposes of
determining equivalence, truncation of the mature sequence (e.g., via a
mutation which creates a
spurious stop codon) should be disregarded. Sequence identity may be
determined, e.g., using
the Jotun Hein method (Hein, J. (1990) Methods Enzymol. 183:626-645). Identity
between
sequences can also be determined by other methods known in the art, e.g. by
varying
hybridization conditions.
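The variation measure defined above (residue substitutions, additions, and/or deletions divided by the total number of residues in the substantially equivalent sequence) can be sketched as follows. This is an illustration only: it assumes the two sequences have already been aligned, with "-" marking gaps, and it does not implement the Jotun Hein method or any other alignment procedure. The function names are hypothetical.

```python
# Illustrative sketch only; expects pre-aligned sequences of equal length with '-' for gaps.
def fraction_varied(reference: str, subject: str) -> float:
    """Differences (substitutions, additions, deletions) divided by residues in the subject."""
    if len(reference) != len(subject):
        raise ValueError("aligned sequences must have the same length")
    differences = sum(1 for r, s in zip(reference, subject) if r != s)
    subject_residues = sum(1 for s in subject if s != "-")
    if subject_residues == 0:
        raise ValueError("subject sequence contains no residues")
    return differences / subject_residues


def within_variation_limit(reference: str, subject: str, max_variation: float = 0.35) -> bool:
    """True if the subject varies from the reference by no more than max_variation."""
    return fraction_varied(reference, subject) <= max_variation


aligned_reference = "ACGTACGTAC"
aligned_subject = "ACGTTCGTAC"   # one substitution in ten residues
print(f"sequence identity: {1 - fraction_varied(aligned_reference, aligned_subject):.0%}")
```

The default limit of 0.35 corresponds to the 65% identity figure in the text; the check covers sequence variation only and says nothing about functional equivalence, which the definition also requires.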
The term "totipotent" refers to the capability of a cell to differentiate into
all of the cell
types of an adult organism.
The term "transformation" means introducing DNA into a suitable host cell so
that the
DNA is replicable, either as an extrachromosomal element, or by chromosomal
integration. The
term "transfection" refers to the taking up of an expression vector by a
suitable host cell, whether
or not any coding sequences are in fact expressed. The term "infection" refers
to the introduction
of nucleic acids into a suitable host cell by use of a virus or viral vector.
As used herein, an "uptake modulating fragment," UMF, means a series of
nucleotides
which mediate the uptake of a linked DNA fragment into a cell. UMFs can be
readily identified
using known UMFs as a target sequence or target motif with the computer-based
systems
described below. The presence and activity of a UMF can be confirmed by
attaching the
suspected UMF to a marker sequence. The resulting nucleic acid molecule is
then incubated
with an appropriate host under appropriate conditions and the uptake of the
marker sequence is
determined. As described above, a UMF will increase the frequency of uptake of
a linked
marker sequence.
Each of the above terms is meant to encompass all that is described for each,
unless the
context dictates otherwise.
4.2 NUCLEIC ACIDS OF THE INVENTION
Nucleotide sequences of the invention are set forth in the Sequence Listing.
The isolated polynucleotides of the invention include a polynucleotide
comprising the
nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954; a
polynucleotide encoding any one of the peptide sequences of SEQ ID NO: 985-
1968, 2953-3936,
3943-3948 or 3955-3960; and a polynucleotide comprising the nucleotide
sequence encoding the
mature protein coding sequence of the polypeptides of any one of SEQ ID NO:
985-1968, 2953-
3936, 3943-3948 or 3955-3960. The polynucleotides of the present invention
also include, but
are not limited to, a polynucleotide that hybridizes under stringent
conditions to (a) the
complement of any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942
or 3949-3954; (b) nucleotide sequences encoding any one of the amino acid
sequences set forth
in the Sequence Listing as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-
3960; (c) a
polynucleotide which is an allelic variant of any polynucleotide recited
above; (d) a
polynucleotide which encodes a species homolog of any of the proteins recited
above; or (e) a
polynucleotide that encodes a polypeptide comprising a specific domain or
truncation of the
polypeptides of SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960. Domains
of
interest may depend on the nature of the encoded polypeptide; e.g., domains in
receptor-like
polypeptides include ligand-binding, extracellular, transmembrane, or
cytoplasmic domains, or
combinations thereof; domains in immunoglobulin-like proteins include the
variable
immunoglobulin-like domains; domains in enzyme-like polypeptides include
catalytic and
substrate binding domains; and domains in ligand polypeptides include receptor-
binding
domains.
The polynucleotides of the invention include naturally occurring or wholly or
partially
synthetic DNA, e.g., cDNA and genomic DNA, and RNA, e.g., mRNA. The
polynucleotides
may include all of the coding region of the cDNA or may represent a portion of
the coding
region of the cDNA.
The present invention also provides genes corresponding to the cDNA sequences
disclosed
herein. The corresponding genes can be isolated in accordance with known
methods using the
sequence information disclosed herein. Such methods include the preparation of
probes or primers
from the disclosed sequence information for identification and/or
amplification of genes in
appropriate genomic libraries or other sources of genomic materials. Further
5' and 3' sequence can
be obtained using methods known in the art. For example, full length cDNA or
genomic DNA that
corresponds to any of the polynucleotides of SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-
3954 can be obtained by screening appropriate cDNA or genomic DNA libraries
under suitable
hybridization conditions using any of the polynucleotides of SEQ ID NO: 1-984,
1969-2952, 3937-
3942 or 3949-3954 or a portion thereof as a probe. Alternatively, the
polynucleotides of SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 may be used as the basis for
suitable primer(s)
that allow identification and/or amplification of genes in appropriate genomic
DNA or cDNA
libraries.
The nucleic acid sequences of the invention can be assembled from ESTs and
sequences
(including cDNA and genomic sequences) obtained from one or more public
databases, such as
dbEST, gbpri, and UniGene. The EST sequences can provide identifying sequence
information,
representative fragment or segment information, or novel segment information
for the full-length
gene.
The polynucleotides of the invention also provide polynucleotides including
nucleotide
sequences that are substantially equivalent to the polynucleotides recited
above. Polynucleotides
according to the invention can have, e.g., at least about 65%, at least about
70%, at least about
75%, at least about 80%, 81%, 82%, 83%, 84%, more typically at least about
85%, 86%, 87%,
88%, 89%, and more typically at least about 90%, 91%, 92%, 93%, 94%, and even
more
typically at least about 95%, 96%, 97%, 98%, 99%, sequence identity to a
polynucleotide recited
above.
Included within the scope of the nucleic acid sequences of the invention are
nucleic acid
sequence fragments that hybridize under stringent conditions to any of the
nucleotide sequences
of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or complements
thereof, which
fragment is greater than about 5 nucleotides, preferably 7 nucleotides, more
preferably greater
than 9 nucleotides and most preferably greater than 17 nucleotides. Fragments
of, e.g. 15, 17, or
20 nucleotides or more that are selective for (i.e., specifically hybridize to)
any one of the
polynucleotides of the invention are contemplated. Probes capable of
specifically hybridizing to
a polynucleotide can differentiate polynucleotide sequences of the invention
from other
polynucleotide sequences in the same family of genes or can differentiate
human genes from
genes of other species, and are preferably based on unique nucleotide
sequences.
The sequences falling within the scope of the present invention are not
limited to these
specific sequences, but also include allelic and species variations thereof.
Allelic and species
variations can be routinely determined by comparing the sequence provided in SEQ
ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954, a representative fragment thereof, or a
nucleotide sequence at
least 90% identical, preferably 95% identical, to SEQ ID NO: 1-984, 1969-2952,
3937-3942 or
3949-3954 with a sequence from another isolate of the same species.
Furthermore, to accommodate
codon variability, the invention includes nucleic acid molecules coding for
the same amino acid
sequences as do the specific ORFs disclosed herein. In other words, in the
coding region of an
ORF, substitution of one codon for another codon that encodes the same amino
acid is expressly
contemplated.
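Codon substitution of this kind can be checked by translating both coding sequences and comparing the results, as sketched below. The sketch assumes the standard genetic code; the names are hypothetical and the example sequences are arbitrary, not taken from the Sequence Listing.

```python
# Illustrative sketch only; assumes the standard genetic code ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(coding_seq: str) -> str:
    """Translate a coding sequence, read in frame from its first base."""
    seq = coding_seq.upper().replace("U", "T")
    if len(seq) % 3 != 0:
        raise ValueError("length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


def encode_same_amino_acids(seq_a: str, seq_b: str) -> bool:
    """True if two coding sequences differ only by synonymous codon substitutions."""
    return translate(seq_a) == translate(seq_b)


print(encode_same_amino_acids("ATGGCTAAA", "ATGGCCAAG"))
```

Here GCT and GCC both encode alanine and AAA and AAG both encode lysine, so the two arbitrary example sequences encode the same amino acid sequence.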
The nearest neighbor or homology result for the nucleic acids of the present
invention,
including SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, can be obtained
by searching a
database using an algorithm or a program. Preferably, BLAST, which stands for
Basic Local
Alignment Search Tool, is used to search for local sequence alignments
(Altschul, S.F. J. Mol. Evol.
36:290-300 (1993) and Altschul, S.F. et al. J. Mol. Biol. 215:403-410 (1990)).
Alternatively, a
FASTA version 3 search against Genpept, using the Fastxy algorithm, can be used.
Species homologs (or orthologs) of the disclosed polynucleotides and proteins
are also
provided by the present invention. Species homologs may be isolated and
identified by making
suitable probes or primers from the sequences provided herein and screening a
suitable nucleic
acid source from the desired species.
The invention also encompasses allelic variants of the disclosed
polynucleotides or
proteins; that is, naturally-occurring alternative forms of the isolated
polynucleotide which also
encode proteins which are identical, homologous or related to that encoded by
the
polynucleotides.
The nucleic acid sequences of the invention are further directed to sequences
which
encode variants of the described nucleic acids. These amino acid sequence
variants may be
prepared by methods known in the art by introducing appropriate nucleotide
changes into a
native or variant polynucleotide. There are two variables in the construction
of amino acid
sequence variants: the location of the mutation and the nature of the
mutation. Nucleic acids
encoding the amino acid sequence variants are preferably constructed by
mutating the
polynucleotide to encode an amino acid sequence that does not occur in nature.
These nucleic
acid alterations can be made at sites that differ in the nucleic acids from
different species
(variable positions) or in highly conserved regions (constant regions). Sites
at such locations
will typically be modified in series, e.g., by substituting first with
conservative choices (e.g.,
hydrophobic amino acid to a different hydrophobic amino acid) and then with
more distant
choices (e.g., hydrophobic amino acid to a charged amino acid), and then
deletions or insertions
may be made at the target site. Amino acid sequence deletions generally range
from about 1 to
30 residues, preferably about 1 to 10 residues, and are typically contiguous.
Amino acid
insertions include amino- and/or carboxyl-terminal fusions ranging in length
from one to one
hundred or more residues, as well as intrasequence insertions of single or
multiple amino acid
residues. Intrasequence insertions may range generally from about 1 to 10
amino acid residues,
preferably from 1 to 5 residues. Examples of terminal insertions include the
heterologous signal
sequences necessary for secretion or for intracellular targeting in different
host cells and
sequences such as FLAG or poly-histidine sequences useful for purifying the
expressed protein.
In a preferred method, polynucleotides encoding the novel amino acid sequences
are
changed via site-directed mutagenesis. This method uses oligonucleotide
sequences to alter a
polynucleotide to encode the desired amino acid variant, as well as sufficient
adjacent
nucleotides on both sides of the changed amino acid to form a stable duplex on
either side of the
site being changed. In general, the techniques of site-directed mutagenesis
are well known to
those of skill in the art and this technique is exemplified by publications
such as Adelman et al.,
DNA 2:183 (1983). A versatile and efficient method for producing site-specific
changes in a
polynucleotide sequence was published by Zoller and Smith, Nucleic Acids Res.
10:6487-6500
(1982). PCR may also be used to create amino acid sequence variants of the
novel nucleic acids.
When small amounts of template DNA are used as starting material, primer(s)
that differ
slightly in sequence from the corresponding region in the template DNA can
generate the desired
amino acid variant. PCR amplification results in a population of product DNA
fragments that
differ from the polynucleotide template encoding the polypeptide at the
position specified by the
primer. The product DNA fragments replace the corresponding region in the
plasmid and this
gives a polynucleotide encoding the desired amino acid variant.
A further technique for generating amino acid variants is the cassette
mutagenesis
technique described in Wells et al., Gene 34:315 (1985); and other mutagenesis
techniques well
known in the art, such as, for example, the techniques in Sambrook et al.,
supra, and Current
Protocols in Molecular Biology, Ausubel et al. Due to the inherent degeneracy
of the genetic
code, other DNA sequences which encode substantially the same or a
functionally equivalent
amino acid sequence may be used in the practice of the invention for the
cloning and expression
of these novel nucleic acids. Such DNA sequences include those which are
capable of
hybridizing to the appropriate novel nucleic acid sequence under stringent
conditions.
Polynucleotides encoding preferred polypeptide truncations of the invention
can be used
to generate polynucleotides encoding chimeric or fusion proteins comprising
one or more
domains of the invention and heterologous protein sequences.
The polynucleotides of the invention additionally include the complement of
any of the
polynucleotides recited above. The polynucleotide can be DNA (genomic, cDNA,
amplified, or
synthetic) or RNA. Methods and algorithms for obtaining such polynucleotides
are well known
to those of skill in the art and can include, for example, methods for
determining hybridization
conditions that can routinely isolate polynucleotides of the desired sequence
identities.
In accordance with the invention, polynucleotide sequences comprising the
mature
protein coding sequences corresponding to any one of SEQ ID NO: 1-984, 1969-
2952, 3937-
3942 or 3949-3954, or functional equivalents thereof, may be used to generate
recombinant
DNA molecules that direct the expression of that nucleic acid, or a functional
equivalent thereof,
in appropriate host cells. Also included are the cDNA inserts of any of the
clones identified
herein.
A polynucleotide according to the invention can be joined to any of a variety
of other
nucleotide sequences by well-established recombinant DNA techniques (see
Sambrook J et al.
(1989) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory,
NY). Useful
nucleotide sequences for joining to polynucleotides include an assortment of
vectors, e.g.,
plasmids, cosmids, lambda phage derivatives, phagemids, and the like, that are
well known in the
art. Accordingly, the invention also provides a vector including a
polynucleotide of the
invention and a host cell containing the polynucleotide. In general, the
vector contains an origin
of replication functional in at least one organism, convenient restriction
endonuclease sites, and a
selectable marker for the host cell. Vectors according to the invention
include expression


CA 02399776 2002-08-02
WO 01/57190 PCT/US01/04098
vectors, replication vectors, probe generation vectors, and sequencing
vectors. A host cell
according to the invention can be a prokaryotic or eukaryotic cell and can be
a unicellular
organism or part of a multicellular organism.
The present invention further provides recombinant constructs comprising a
nucleic acid
having any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952, 3937-
3942 or 3949-
3954 or a fragment thereof or any other polynucleotides of the invention. In
one embodiment, the
recombinant constructs of the present invention comprise a vector, such as a
plasmid or viral
vector, into which a nucleic acid having any of the nucleotide sequences of
SEQ ID NO: 1-984,
1969-2952, 3937-3942 or 3949-3954 or a fragment thereof is inserted, in a
forward or reverse
orientation. In the case of a vector comprising one of the ORFs of the present
invention, the
vector may further comprise regulatory sequences, including for example, a
promoter, operably
linked to the ORF. Large numbers of suitable vectors and promoters are known
to those of skill
in the art and are commercially available for generating the recombinant
constructs of the present
invention. The following vectors are provided by way of example. Bacterial:
pBs, phagescript,
PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene);
pTrc99A,
pKK223-3, pKK233-3, pDR540, pRIT5 (Pharmacia). Eukaryotic: pWLneo, pSV2cat,
pOG44,
pXT1, pSG (Stratagene); pSVK3, pBPV, pMSG, pSVL (Pharmacia).
The isolated polynucleotide of the invention may be operably linked to an
expression
control sequence such as the pMT2 or pED expression vectors disclosed in
Kaufman et al.,
Nucleic Acids Res. 19, 4485-4490 (1991), in order to produce the protein
recombinantly. Many
suitable expression control sequences are known in the art. General methods of
expressing
recombinant proteins are also known and are exemplified in R. Kaufman,
Methods in
Enzymology 185, 537-566 (1990). As defined herein "operably linked" means that
the isolated
polynucleotide of the invention and an expression control sequence are
situated within a vector
or cell in such a way that the protein is expressed by a host cell which has
been transformed
(transfected) with the ligated polynucleotide/expression control sequence.
Promoter regions can be selected from any desired gene using CAT
(chloramphenicol
transferase) vectors or other vectors with selectable markers. Two appropriate
vectors are
pKK232-8 and pCM7. Particular named bacterial promoters include lacI, lacZ,
T3, T7, gpt,
lambda PR, and trc. Eukaryotic promoters include CMV immediate early, HSV
thymidine
kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-
I. Selection of
the appropriate vector and promoter is well within the level of ordinary skill
in the art.
Generally, recombinant expression vectors will include origins of replication
and selectable
markers permitting transformation of the host cell, e.g., the ampicillin
resistance gene of E. coli
and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed
gene to direct
transcription of a downstream structural sequence. Such promoters can be
derived from operons
encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor,
acid
phosphatase, or heat shock proteins, among others. The heterologous structural
sequence is
assembled in appropriate phase with translation initiation and termination
sequences, and
preferably, a leader sequence capable of directing secretion of translated
protein into the
periplasmic space or extracellular medium. Optionally, the heterologous
sequence can encode a
fusion protein including an amino terminal identification peptide imparting
desired
characteristics, e.g., stabilization or simplified purification of expressed
recombinant product.
Useful expression vectors for bacterial use are constructed by inserting a
structural DNA
sequence encoding a desired protein together with suitable translation
initiation and termination
signals in operable reading phase with a functional promoter. The vector will
comprise one or
more phenotypic selectable markers and an origin of replication to ensure
maintenance of the
vector and to, if desirable, provide amplification within the host. Suitable
prokaryotic hosts for
transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and
various species
within the genera Pseudomonas, Streptomyces, and Staphylococcus, although
others may also be
employed as a matter of choice.
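By way of illustration only, the requirement that the structural sequence be in operable reading phase with its translation initiation and termination signals can be checked with a short Python sketch. The function name in_frame_problems and the example sequences are hypothetical, and the check assumes an ATG initiation codon and the standard termination codons.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def in_frame_problems(coding_dna: str) -> list[str]:
    """Return the reading-phase problems found in a candidate coding insert (empty if none)."""
    seq = coding_dna.upper()
    problems = []
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[0] != "ATG":
        problems.append("no ATG initiation codon at the start")
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no termination codon at the end")
    if any(codon in STOP_CODONS for codon in codons[1:-1]):
        problems.append("premature in-frame termination codon")
    return problems

# Hypothetical inserts: the first is in phase, the second carries an internal stop.
print(in_frame_problems("ATGGCTCTGAAATAA"))
print(in_frame_problems("ATGGCTTAACTGAAATAA"))
```

An empty list indicates that the insert begins with an initiation codon, ends with a termination codon and contains no in-frame stop in between; it says nothing about promoter function or expression level.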
As a representative but non-limiting example, useful expression vectors for
bacterial use
can comprise a selectable marker and bacterial origin of replication derived
from commercially
available plasmids comprising genetic elements of the well known cloning
vector pBR322
(ATCC 37017). Such commercial vectors include, for example, pKK223-3
(Pharmacia Fine
Chemicals, Uppsala, Sweden) and GEM 1 (Promega Biotech, Madison, WI, USA).
These
pBR322 "backbone" sections are combined with an appropriate promoter and the
structural
sequence to be expressed. Following transformation of a suitable host strain
and growth of the
host strain to an appropriate cell density, the selected promoter is induced
or derepressed by
appropriate means (e.g., temperature shift or chemical induction) and cells
are cultured for an
additional period. Cells are typically harvested by centrifugation, disrupted
by physical or
chemical means, and the resulting crude extract retained for further
purification.
Polynucleotides of the invention can also be used to induce immune responses.
For
example, as described in Fan et al., Nat. Biotech. 17:870-872 (1999),
incorporated herein by
reference, nucleic acid sequences encoding a polypeptide may be used to
generate antibodies
against the encoded polypeptide following topical administration of naked
plasmid DNA or
following injection, and preferably intramuscular injection of the DNA. The
nucleic acid
sequences are preferably inserted in a recombinant expression vector and may
be in the form of
naked DNA.
4.3 ANTISENSE
Another aspect of the invention pertains to isolated antisense nucleic acid
molecules that
are hybridizable to or complementary to the nucleic acid molecule comprising
the nucleotide
sequence of SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954, or fragments,
analogs or
derivatives thereof. An "antisense" nucleic acid comprises a nucleotide
sequence that is
complementary to a "sense" nucleic acid encoding a protein, e.g.,
complementary to the coding
strand of a double-stranded cDNA molecule or complementary to an mRNA
sequence. In
specific aspects, antisense nucleic acid molecules are provided that comprise
a sequence
complementary to at least about 10, 25, 50, 100, 250 or 500 nucleotides or an
entire coding
strand, or to only a portion thereof. Nucleic acid molecules encoding
fragments, homologs,
derivatives and analogs of a protein of any of SEQ ID NO: 985-1968, 2953-3936,
3943-3948 or
3955-3960 or antisense nucleic acids complementary to a nucleic acid sequence
of SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954 are additionally provided.
In one embodiment, an antisense nucleic acid molecule is antisense to a
"coding region"
of the coding strand of a nucleotide sequence of the invention. The term
"coding region" refers
to the region of the nucleotide sequence comprising codons which are
translated into amino acid
residues. In another embodiment, the antisense nucleic acid molecule is
antisense to a
"noncoding region" of the coding strand of a nucleotide sequence of the
invention. The term
"noncoding region" refers to 5' and 3' sequences which flank the coding region
that are not
translated into amino acids (i.e., also referred to as 5' and 3' untranslated
regions).
Given the coding strand sequences encoding a nucleic acid disclosed herein
(e.g., SEQ ID
NO: 1-984, 1969-2952, 3937-3942 or 3949-3954), antisense nucleic acids of the
invention can be
designed according to the rules of Watson and Crick or Hoogsteen base
pairing. The antisense
nucleic acid molecule can be complementary to the entire coding region of a
mRNA, but more
preferably is an oligonucleotide that is antisense to only a portion of the
coding or noncoding
region of a mRNA. For example, the antisense oligonucleotide can be
complementary to the
region surrounding the translation start site of a mRNA. An antisense
oligonucleotide can be, for
example, about 5, 10, 15, 20, 25, 30, 35, 40, 45 or 50 nucleotides in length.
An antisense nucleic
acid of the invention can be constructed using chemical synthesis or enzymatic
ligation reactions
using procedures known in the art. For example, an antisense nucleic acid
(e.g., an antisense
oligonucleotide) can be chemically synthesized using naturally occurring
nucleotides or
variously modified nucleotides designed to increase the biological stability
of the molecules or to
increase the physical stability of the duplex formed between the antisense and
sense nucleic
acids, e.g., phosphorothioate derivatives and acridine substituted nucleotides
can be used.
Examples of modified nucleotides that can be used to generate the antisense
nucleic acid
include: 5-fluorouracil, 5-bromouracil, 5-chlorouracil, 5-iodouracil,
hypoxanthine, xanthine,
4-acetylcytosine, 5-(carboxyhydroxylmethyl) uracil,
5-carboxymethylaminomethyl-2-thiouridine, 5-carboxymethylaminomethyluracil,
dihydrouracil, beta-D-galactosylqueosine,
inosine, N6-isopentenyladenine, 1-methylguanine, 1-methylinosine, 2,2-
dimethylguanine,
2-methyladenine, 2-methylguanine, 3-methylcytosine, 5-methylcytosine, N6-
adenine,
7-methylguanine, 5-methylaminomethyluracil, 5-methoxyaminomethyl-2-thiouracil,
beta-D-mannosylqueosine, 5'-methoxycarboxymethyluracil, 5-methoxyuracil,
2-methylthio-N6-isopentenyladenine, uracil-5-oxyacetic acid (v), wybutoxosine,
pseudouracil,
queosine, 2-thiocytosine, 5-methyl-2-thiouracil, 2-thiouracil, 4-thiouracil, 5-
methyluracil,
uracil-5-oxyacetic acid methylester, uracil-5-oxyacetic acid (v), 5-methyl-2-
thiouracil,
3-(3-amino-3-N-2-carboxypropyl) uracil, (acp3)w, and 2,6-diaminopurine.
Alternatively, the
antisense nucleic acid can be produced biologically using an expression vector
into which a
nucleic acid has been subcloned in an antisense orientation (i.e., RNA
transcribed from the
inserted nucleic acid will be of an antisense orientation to a target nucleic
acid of interest,
described further in the following subsection).
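By way of illustration only, the Watson and Crick design rule described above, applied to an oligonucleotide complementary to the region surrounding the translation start site, can be sketched in Python as follows. The function names, the example coding strand and the chosen oligonucleotide length are hypothetical; Hoogsteen pairing and modified nucleotides are not modelled.

```python
# Watson-Crick partners; U is mapped to A so an mRNA sequence can be used as input.
COMPLEMENT = str.maketrans("ACGTUacgtu", "TGCAAtgcaa")

def reverse_complement(seq: str) -> str:
    """Watson-Crick reverse complement of a DNA or RNA sequence, returned as DNA."""
    return seq.translate(COMPLEMENT)[::-1]

def antisense_oligo(coding_strand: str, start_codon_index: int, length: int = 20) -> str:
    """Return a DNA antisense oligonucleotide spanning the translation start site.

    coding_strand     -- sense-strand sequence, written 5' to 3'
    start_codon_index -- 0-based position of the A of the ATG start codon
    length            -- oligonucleotide length in nucleotides
    """
    # Place the start codon roughly in the middle of the target window.
    left = max(0, start_codon_index - (length - 3) // 2)
    window = coding_strand[left:left + length]
    return reverse_complement(window)

# Hypothetical coding strand with the ATG start codon at index 9.
sense = "GCCGCCACCATGGCTCTGAAAGGC"
oligo = antisense_oligo(sense, start_codon_index=9, length=15)
print(oligo)
```

The returned oligonucleotide is written 5' to 3' and is the reverse complement of the selected window of the coding strand, so it pairs with the corresponding stretch of the mRNA.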
The antisense nucleic acid molecules of the invention are typically
administered to a
subject or generated in situ such that they hybridize with or bind to cellular
mRNA and/or
genomic DNA encoding a protein according to the invention to thereby inhibit
expression of the
protein, e.g., by inhibiting transcription and/or translation. The
hybridization can be by
conventional nucleotide complementarity to form a stable duplex, or, for
example, in the case of
an antisense nucleic acid molecule that binds to DNA duplexes, through
specific interactions in
the major groove of the double helix. An example of a route of administration
of antisense
nucleic acid molecules of the invention includes direct injection at a tissue
site. Alternatively,
antisense nucleic acid molecules can be modified to target selected cells and
then administered
systemically. For example, for systemic administration, antisense molecules
can be modified
such that they specifically bind to receptors or antigens expressed on a
selected cell surface, e.g.,
by linking the antisense nucleic acid molecules to peptides or antibodies that
bind to cell surface
receptors or antigens. The antisense nucleic acid molecules can also be
delivered to cells using
the vectors described herein. To achieve sufficient intracellular
concentrations of antisense
molecules, vector constructs in which the antisense nucleic acid molecule is
placed under the
control of a strong pol II or pol III promoter are preferred.
In yet another embodiment, the antisense nucleic acid molecule of the
invention is an
α-anomeric nucleic acid molecule. An α-anomeric nucleic acid molecule forms
specific
double-stranded hybrids with complementary RNA in which, contrary to the usual
β-units, the
strands run parallel to each other (Gaultier et al. (1987) Nucleic Acids Res
15: 6625-6641). The
antisense nucleic acid molecule can also comprise a 2'-O-methylribonucleotide
(Inoue et al.
(1987) Nucleic Acids Res 15: 6131-6148) or a chimeric RNA-DNA analogue (Inoue
et al. (1987)
FEBS Lett 215: 327-330).
4.4 RIBOZYMES AND PNA MOIETIES
In still another embodiment, an antisense nucleic acid of the invention is a
ribozyme.
Ribozymes are catalytic RNA molecules with ribonuclease activity that are
capable of cleaving a
single-stranded nucleic acid, such as a mRNA, to which they have a
complementary region.
Thus, ribozymes (e.g., hammerhead ribozymes (described in Haseloff and
Gerlach (1988)
Nature 334:585-591)) can be used to catalytically cleave mRNA transcripts
to thereby inhibit
translation of a mRNA. A ribozyme having specificity for a nucleic acid of the
invention can be
designed based upon the nucleotide sequence of a DNA disclosed herein (i.e.,
SEQ ID NO: 1-
984, 1969-2952, 3937-3942 or 3949-3954). For example, a derivative of a
Tetrahymena L-19
IVS RNA can be constructed in which the nucleotide sequence of the active site
is
complementary to the nucleotide sequence to be cleaved in a SECX-encoding
mRNA. See, e.g.,
Cech et al. U.S. Pat. No. 4,987,071; and Cech et al. U.S. Pat. No. 5,116,742.
Alternatively,
SECX mRNA can be used to select a catalytic RNA having a specific ribonuclease
activity from
a pool of RNA molecules. See, e.g., Bartel et al., (1993) Science 261:1411-
1418.
Alternatively, gene expression can be inhibited by targeting nucleotide
sequences
complementary to the regulatory region (e.g., promoter and/or enhancers) to
form triple helical
structures that prevent transcription of the gene in target cells. See
generally, Helene (1991)
Anticancer Drug Des. 6: 569-84; Helene et al. (1992) Ann. N.Y. Acad. Sci.
660:27-36; and
Maher (1992) Bioessays 14: 807-15.
In various embodiments, the nucleic acids of the invention can be modified at
the base
moiety, sugar moiety or phosphate backbone to improve, e.g., the stability,
hybridization, or
solubility of the molecule. For example, the deoxyribose phosphate backbone of
the nucleic
acids can be modified to generate peptide nucleic acids (see Hyrup et al.
(1996) Bioorg Med
Chem 4: 5-23). As used herein, the terms "peptide nucleic acids" or "PNAs"
refer to nucleic acid
mimics, e.g., DNA mimics, in which the deoxyribose phosphate backbone is
replaced by a
pseudopeptide backbone and only the four natural nucleobases are retained. The
neutral
backbone of PNAs has been shown to allow for specific hybridization to DNA and
RNA under
conditions of low ionic strength. The synthesis of PNA oligomers can be
performed using
standard solid phase peptide synthesis protocols as described in Hyrup et al.
(1996) above;
Perry-O'Keefe et al. (1996) PNAS 93: 14670-675.
PNAs of the invention can be used in therapeutic and diagnostic applications.
For
example, PNAs can be used as antisense or antigene agents for sequence-
specific modulation of
gene expression by, e.g., inducing transcription or translation arrest or
inhibiting replication.
PNAs of the invention can also be used, e.g., in the analysis of single base
pair mutations in a
gene by, e.g., PNA directed PCR clamping; as artificial restriction enzymes
when used in
combination with other enzymes, e.g., S1 nucleases (Hyrup B. (1996) above); or
as probes or
primers for DNA sequence and hybridization (Hyrup et al. (1996), above; Perry-
O'Keefe (1996),
above).
In another embodiment, PNAs of the invention can be modified, e.g., to enhance
their
stability or cellular uptake, by attaching lipophilic or other helper groups
to PNA, by the
formation of PNA-DNA chimeras, or by the use of liposomes or other techniques
of drug
delivery known in the art. For example, PNA-DNA chimeras can be generated that
may
combine the advantageous properties of PNA and DNA. Such chimeras allow DNA
recognition
enzymes, e.g., RNase H and DNA polymerases, to interact with the DNA portion
while the PNA
portion would provide high binding affinity and specificity. PNA-DNA chimeras
can be linked
using linkers of appropriate lengths selected in terms of base stacking,
number of bonds between
the nucleobases, and orientation (Hyrup (1996) above). The synthesis of PNA-
DNA chimeras
can be performed as described in Hyrup (1996) above and Finn et al. (1996)
Nucl Acids Res 24:
3357-63. For example, a DNA chain can be synthesized on a solid support using
standard
phosphoramidite coupling chemistry, and modified nucleoside analogs, e.g.,
5'-(4-methoxytrityl)amino-5'-deoxy-thymidine phosphoramidite, can be used
between the PNA
and the 5' end of DNA (Mag et al. (1989) Nucl Acid Res 17: 5973-88). PNA
monomers are then
coupled in a stepwise manner to produce a chimeric molecule with a 5' PNA
segment and a 3'
DNA segment (Finn et al. (1996) above). Alternatively, chimeric molecules can
be synthesized
with a 5' DNA segment and a 3' PNA segment. See, Petersen et al. (1975) Bioorg
Med Chem
Lett 5: 1119-1124.
In other embodiments, the oligonucleotide may include other appended groups
such as
peptides (e.g., for targeting host cell receptors in vivo), or agents
facilitating transport across the
cell membrane (see, e.g., Letsinger et al., 1989, Proc. Natl. Acad. Sci.
U.S.A. 86:6553-6556;
Lemaitre et al., 1987, Proc. Natl. Acad. Sci. 84:648-652; PCT Publication No.
WO88/09810) or
the blood-brain barrier (see, e.g., PCT Publication No. WO89/10134). In
addition,
oligonucleotides can be modified with hybridization triggered cleavage agents
(See, e.g., Krol et
al., 1988, BioTechniques 6:958-976) or intercalating agents. (See, e.g., Zon,
1988, Pharm. Res.
5: 539-549). To this end, the oligonucleotide may be conjugated to another
molecule, e.g., a
peptide, a hybridization triggered cross-linking agent, a transport agent, a
hybridization-triggered
cleavage agent, etc.
4.5 HOSTS
The present invention further provides host cells genetically engineered to
contain the
polynucleotides of the invention. For example, such host cells may contain
nucleic acids of the
invention introduced into the host cell using known transformation,
transfection or infection
methods. The present invention still further provides host cells genetically
engineered to express
the polynucleotides of the invention, wherein such polynucleotides are in
operative association
with a regulatory sequence heterologous to the host cell which drives
expression of the
polynucleotides in the cell.
Knowledge of nucleic acid sequences allows for modification of cells to
permit, or
increase, expression of endogenous polypeptide. Cells can be modified (e.g.,
by homologous
recombination) to provide increased polypeptide expression by replacing, in
whole or in part, the
naturally occurring promoter with all or part of a heterologous promoter so
that the cells express
the polypeptide at higher levels. The heterologous promoter is inserted in
such a manner that it
is operatively linked to the encoding sequences. See, for example, PCT
International Publication
No. WO94/12650, PCT International Publication No. WO92/20808, and PCT
International
Publication No. WO91/09955. It is also contemplated that, in addition to
heterologous promoter
DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional CAD gene
which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and
dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked
to the coding
sequence, amplification of the marker DNA by standard selection methods
results in co-
amplification of the desired protein coding sequences in the cells.
The host cell can be a higher eukaryotic host cell, such as a mammalian cell,
a lower
eukaryotic host cell, such as a yeast cell, or the host cell can be a
prokaryotic cell, such as a
bacterial cell. Introduction of the recombinant construct into the host cell
can be effected by
calcium phosphate transfection, DEAE-dextran mediated transfection, or
electroporation (Davis,
L. et al., Basic Methods in Molecular Biology (1986)). The host cells
containing one of the
polynucleotides of the invention, can be used in conventional manners to
produce the gene
product encoded by the isolated fragment (in the case of an ORF) or can be
used to produce a
heterologous protein under the control of the EMF.
Any host/vector system can be used to express one or more of the ORFs of the
present
invention. These include, but are not limited to, eukaryotic hosts such as
HeLa cells, CV-1 cells,
COS cells, 293 cells, and Sf9 cells, as well as prokaryotic hosts such as E.
coli and B. subtilis.


The most preferred cells are those which do not normally express the
particular polypeptide or
protein or which express the polypeptide or protein at a low natural level.
Mature proteins can
be expressed in mammalian cells, yeast, bacteria, or other cells under the
control of appropriate
promoters. Cell-free translation systems can also be employed to produce such
proteins using
RNAs derived from the DNA constructs of the present invention. Appropriate
cloning and
expression vectors for use with prokaryotic and eukaryotic hosts are described
by Sambrook, et
al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring
Harbor, New
York (1989), the disclosure of which is hereby incorporated by reference.
Various mammalian cell culture systems can also be employed to express
recombinant
protein. Examples of mammalian expression systems include the COS-7 lines of
monkey kidney
fibroblasts, described by Gluzman, Cell 23:175 (1981). Other cell lines
capable of expressing a
compatible vector are, for example, the C127, monkey COS cells, Chinese
Hamster Ovary
(CHO) cells, human kidney 293 cells, human epidermal A431 cells, human Colo205
cells, 3T3
cells, CV-1 cells, other transformed primate cell lines, normal diploid cells,
cell strains derived
from in vitro culture of primary tissue, primary explants, HeLa cells, mouse L
cells, BHK,
HL-60, U937, HaK or Jurkat cells. Mammalian expression vectors will comprise
an origin of
replication, a suitable promoter and also any necessary ribosome binding
sites, polyadenylation
site, splice donor and acceptor sites, transcriptional termination sequences,
and 5' flanking
nontranscribed sequences. DNA sequences derived from the SV40 viral genome,
for example,
SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may
be used to provide
the required nontranscribed genetic elements. Recombinant polypeptides and
proteins produced
in bacterial culture are usually isolated by initial extraction from cell
pellets, followed by one or
more salting-out, aqueous ion exchange or size exclusion chromatography steps.
Protein
refolding steps can be used, as necessary, in completing configuration of the
mature protein.
Finally, high performance liquid chromatography (HPLC) can be employed for
final purification
steps. Microbial cells employed in expression of proteins can be disrupted by
any convenient
method, including freeze-thaw cycling, sonication, mechanical disruption, or
use of cell lysing
agents.
Alternatively, it may be possible to produce the protein in lower eukaryotes
such as yeast
or insects or in prokaryotes such as bacteria. Potentially suitable yeast
strains include
Saccharomyces cerevisiae, Schizosaccharomyces pombe, Kluyveromyces strains,
Candida, or
any yeast strain capable of expressing heterologous proteins. Potentially
suitable bacterial
strains include Escherichia coli, Bacillus subtilis, Salmonella typhimurium,
or any bacterial
strain capable of expressing heterologous proteins. If the protein is made in
yeast or bacteria, it
may be necessary to modify the protein produced therein, for example by
phosphorylation or
glycosylation of the appropriate sites, in order to obtain the functional
protein. Such covalent
attachments may be accomplished using known chemical or enzymatic methods.
In another embodiment of the present invention, cells and tissues may be
engineered to
express an endogenous gene comprising the polynucleotides of the invention
under the control of
inducible regulatory elements, in which case the regulatory sequences of the
endogenous gene
may be replaced by homologous recombination. As described herein, gene
targeting can be used
to replace a gene's existing regulatory region with a regulatory sequence
isolated from a different
gene or a novel regulatory sequence synthesized by genetic engineering
methods. Such
regulatory sequences may be comprised of promoters, enhancers, scaffold-
attachment regions,
negative regulatory elements, transcriptional initiation sites, regulatory
protein binding sites or
combinations of said sequences. Alternatively, sequences which affect the
structure or stability
of the RNA or protein produced may be replaced, removed, added, or otherwise
modified by
targeting. These sequences include polyadenylation signals, mRNA stability
elements, splice
sites, leader sequences for enhancing or modifying transport or secretion
properties of the
protein, or other sequences which alter or improve the function or stability
of protein or RNA
molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the
gene under the control of the new regulatory sequence, e.g., inserting a new
promoter or
enhancer or both upstream of a gene. Alternatively, the targeting event may be
a simple deletion
of a regulatory element, such as the deletion of a tissue-specific negative
regulatory element.
Alternatively, the targeting event may replace an existing element; for
example, a tissue-specific
enhancer can be replaced by an enhancer that has broader or different cell-
type specificity than
the naturally occurring elements. Here, the naturally occurring sequences
are deleted and new
sequences are added. In all cases, the identification of the targeting event
may be facilitated by
the use of one or more selectable marker genes that are contiguous with the
targeting DNA,
allowing for the selection of cells in which the exogenous DNA has integrated
into the host cell
genome. The identification of the targeting event may also be facilitated by
the use of one or
more marker genes exhibiting the property of negative selection, such that the
negatively
selectable marker is linked to the exogenous DNA, but configured such that the
negatively
selectable marker flanks the targeting sequence, and such that a correct
homologous
recombination event with sequences in the host cell genome does not result in
the stable
integration of the negatively selectable marker. Markers useful for this
purpose include the
Herpes Simplex Virus thymidine kinase (TK) gene or the bacterial xanthine-
guanine
phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance with
this aspect of the invention are more particularly described in U.S. Patent
No. 5,272,071 to
Chappel; U.S. Patent No. 5,578,461 to Sherwin et al.; International
Application No.
PCT/US92/09627 (WO93/09222) by Selden et al.; and International Application
No.
PCT/US90/06436 (WO91/06667) by Skoultchi et al., each of which is
incorporated by reference
herein in its entirety.
4.6 POLYPEPTIDES OF THE INVENTION
The isolated polypeptides of the invention include, but are not limited to, a
polypeptide
comprising: the amino acid sequences set forth as any one of SEQ ID NO: 985-
1968, 2953-3936,
3943-3948 or 3955-3960 or an amino acid sequence encoded by any one of the
nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954 or the
corresponding full
length or mature protein. Polypeptides of the invention also include
polypeptides preferably with
biological or immunological activity that are encoded by: (a) a polynucleotide
having any one of
the nucleotide sequences set forth in SEQ ID NO: 1-984, 1969-2952, 3937-3942
or 3949-3954 or
(b) polynucleotides encoding any one of the amino acid sequences set forth as
SEQ ID NO: 985-
1968, 2953-3936, 3943-3948 or 3955-3960 or (c) polynucleotides that hybridize
to the
complement of the polynucleotides of either (a) or (b) under stringent
hybridization conditions.
The invention also provides biologically active or immunologically active
variants of any of the
amino acid sequences set forth as SEQ ID NO: 985-1968, 2953-3936, 3943-3948 or
3955-3960
or the corresponding full length or mature protein; and "substantial
equivalents" thereof (e.g., at
least about 65%, at least about 70%, at least about 75%, at least about 80%,
81%, 82%, 83%,
84%, more typically at least about 85%, 86%, 87%, 88%, 89%, and more typically
at least about
90%, 91%, 92%, 93%, 94%, and even more typically at least about 95%, 96%, 97%,
98%, 99%
sequence identity) that retain biological activity. Polypeptides encoded by
allelic variants may
have a similar, increased, or decreased activity compared to polypeptides
comprising SEQ ID
NO: 985-1968, 2953-3936, 3943-3948 or 3955-3960.
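By way of illustration only, a percent sequence identity of the kind recited above can be computed from a pairwise alignment with a short Python sketch. The function name percent_identity and the example peptides are hypothetical, and the convention assumed here (identical residues divided by the number of alignment columns, gaps included) is only one of several in use; no particular alignment program or scoring scheme is implied.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over the columns of a pairwise alignment ('-' marks a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Ignore columns in which both sequences carry a gap.
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b)
        if not (a == "-" and b == "-")
    ]
    if not columns:
        raise ValueError("alignment has no residues")
    matches = sum(1 for a, b in columns if a == b and a != "-")
    return 100.0 * matches / len(columns)

# Hypothetical aligned peptides differing at one of six positions.
identity = percent_identity("MALKWT", "MALRWT")
print(round(identity, 1), identity >= 80.0)
```

Under this convention a variant differing from a reference at one of six aligned positions has about 83% identity and would satisfy an "at least about 80%" threshold but not an "at least about 85%" threshold.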
Fragments of the proteins of the present invention which are capable of
exhibiting
biological activity are also encompassed by the present invention. Fragments
of the protein may
be in linear form or they may be cyclized using known methods, for example, as
described in H.
U. Saragovi, et al., Bio/Technology 10, 773-778 (1992) and in R. S. McDowell,
et al., J. Amer.
Chem. Soc. 114, 9245-9253 (1992), both of which are incorporated herein by
reference. Such
fragments may be fused to carrier molecules such as immunoglobulins for many
purposes,
including increasing the valency of protein binding sites.
The present invention also provides both full-length and mature forms (for
example,
without a signal sequence or precursor sequence) of the disclosed proteins.
The protein coding
sequence is identified in the sequence listing by translation of the disclosed
nucleotide
sequences. The mature form of such protein may be obtained by expression of a
full-length
polynucleotide in a suitable mammalian cell or other host cell. The sequence
of the mature form
of the protein is also determinable from the amino acid sequence of the full-
length form. Where
proteins of the present invention are membrane bound, soluble forms of the
proteins are also
provided. In such forms, part or all of the regions causing the proteins to be
membrane bound
are deleted so that the proteins are fully secreted from the cell in which
they are expressed.
Protein compositions of the present invention may further comprise an
acceptable carrier,
such as a hydrophilic, e.g., pharmaceutically acceptable, carrier.
The present invention further provides isolated polypeptides encoded by the
nucleic acid
fragments of the present invention or by degenerate variants of the nucleic
acid fragments of the
present invention. By "degenerate variant" is intended nucleotide fragments
which differ from a
nucleic acid fragment of the present invention (e.g., an ORF) by nucleotide
sequence but, due to
the degeneracy of the genetic code, encode an identical polypeptide sequence.
Preferred nucleic
acid fragments of the present invention are the ORFs that encode proteins.
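The notion of a degenerate variant can be made concrete with a short sketch. It assumes the standard genetic code and uses two toy open reading frames invented for illustration (they are not sequences of the invention); the helper names are likewise hypothetical. Two ORFs are treated as degenerate variants of one another when their nucleotide sequences differ but their translations are identical.

```python
from itertools import product

# Standard genetic code, built in TCAG order for each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def normalize(orf: str) -> str:
    """Upper-case the sequence and write RNA bases as DNA bases."""
    return orf.upper().replace("U", "T")


def translate(orf: str) -> str:
    """Translate an ORF codon by codon; '*' marks a stop codon."""
    orf = normalize(orf)
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def is_degenerate_variant(orf_a: str, orf_b: str) -> bool:
    """True when the ORFs differ in nucleotide sequence but encode the same polypeptide."""
    return normalize(orf_a) != normalize(orf_b) and translate(orf_a) == translate(orf_b)


if __name__ == "__main__":
    # Both toy ORFs encode Met-Ala-Leu-Ser using different codons.
    print(translate("ATGGCTCTGTCA"), translate("ATGGCCTTAAGC"))
    print(is_degenerate_variant("ATGGCTCTGTCA", "ATGGCCTTAAGC"))
```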
A variety of methodologies known in the art can be utilized to obtain any one
of the
isolated polypeptides or proteins of the present invention. At the simplest
level, the amino acid
sequence can be synthesized using commercially available peptide synthesizers.
The
synthetically-constructed protein sequences, by virtue of sharing primary,
secondary or tertiary
structural and/or conformational characteristics with proteins may possess
biological properties
in common therewith, including protein activity. This technique is
particularly useful in
producing small peptides and fragments of larger polypeptides. Fragments are
useful, for
example, in generating antibodies against the native polypeptide. Thus, they
may be employed
as biologically active or immunological substitutes for natural, purified
proteins in screening of
therapeutic compounds and in immunological processes for the development of
antibodies.
The polypeptides and proteins of the present invention can alternatively be
purified from
cells which have been altered to express the desired polypeptide or protein.
As used herein, a
cell is said to be altered to express a desired polypeptide or protein when
the cell, through genetic
manipulation, is made to produce a polypeptide or protein which it normally
does not produce or
which the cell normally produces at a lower level. One skilled in the art can
readily adapt
procedures for introducing and expressing either recombinant or synthetic
sequences into
eukaryotic or prokaryotic cells in order to generate a cell which produces one
of the polypeptides
or proteins of the present invention.
The invention also relates to methods for producing a polypeptide comprising
growing a
culture of host cells of the invention in a suitable culture medium, and
purifying the protein from
the cells or the culture in which the cells are grown. For example, the
methods of the invention
include a process for producing a polypeptide in which a host cell containing
a suitable
expression vector that includes a polynucleotide of the invention is cultured
under conditions that
allow expression of the encoded polypeptide. The polypeptide can be recovered
from the
culture, conveniently from the culture medium, or from a lysate prepared from
the host cells and
further purified. Preferred embodiments include those in which the protein
produced by such
process is a full length or mature form of the protein.
In an alternative method, the polypeptide or protein is purified from
bacterial cells which
naturally produce the polypeptide or protein. One skilled in the art can
readily follow known
methods for isolating polypeptides and proteins in order to obtain one of the
isolated
polypeptides or proteins of the present invention. These include, but are not
limited to,
immunochromatography, HPLC, size-exclusion chromatography, ion-exchange
chromatography,
and immuno-affinity chromatography. See, e.g., Scopes, Protein Purification:
Principles and
Practice, Springer-Verlag (1994); Sambrook, et al., in Molecular Cloning: A Laboratory
Manual; Ausubel et al., Current Protocols in Molecular Biology. Polypeptide
fragments that
retain biological/immunological activity include fragments comprising greater
than about 100
amino acids, or greater than about 200 amino acids, and fragments that encode
specific protein
domains.
The purified polypeptides can be used in in vitro binding assays which are
well known in
the art to identify molecules which bind to the polypeptides. These molecules
include but are not
limited to, e.g., small molecules, molecules from combinatorial libraries,
antibodies or other
proteins. The molecules identified in the binding assay are then tested for
antagonist or agonist
activity in in vivo tissue culture or animal models that are well known in the
art. In brief, the
molecules are titrated into a plurality of cell cultures or animals and then
tested for either
cell/animal death or prolonged survival of the animal/cells.
In addition, the peptides of the invention or molecules capable of binding to
the peptides
may be complexed with toxins, e.g., ricin or cholera, or with other compounds
that are toxic to
cells. The toxin-binding molecule complex is then targeted to a tumor or other
cell by the
specificity of the binding molecule for SEQ ID NO: 985-1968, 2953-3936, 3943-
3948 or 3955-
3960.
The protein of the invention may also be expressed as a product of transgenic
animals,
e.g., as a component of the milk of transgenic cows, goats, pigs, or sheep
which are characterized
by somatic or germ cells containing a nucleotide sequence encoding the
protein.
The proteins provided herein also include proteins characterized by amino acid
sequences
similar to those of purified proteins but into which modifications are
naturally provided or
deliberately engineered. For example, modifications, in the peptide or DNA
sequence, can be
made by those skilled in the art using known techniques. Modifications of
interest in the protein
sequences may include the alteration, substitution, replacement, insertion or
deletion of a
selected amino acid residue in the coding sequence. For example, one or more
of the cysteine
residues may be deleted or replaced with another amino acid to alter the
conformation of the
molecule. Techniques for such alteration, substitution, replacement, insertion
or deletion are
well known to those skilled in the art (see, e.g., U.S. Pat. No. 4,518,584).
Preferably, such
alteration, substitution, replacement, insertion or deletion retains the
desired activity of the
protein. Regions of the protein that are important for the protein function
can be determined by
various methods known in the art including the alanine-scanning method which
involves
systematic substitution of single or strings of amino acids with alanine,
followed by testing the
resulting alanine-containing variant for biological activity. This type of
analysis determines the
importance of the substituted amino acid(s) in biological activity. Regions
of the protein that are
important for protein function may be determined by the eMATRIX program.
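The alanine-scanning approach described above can be sketched as the enumeration of single-residue variants to be made and tested. This is a minimal illustration covering single substitutions only (not strings of residues); the function name, the variant labels and the toy peptide are assumptions made for the example, and the activity testing itself is a laboratory step that the code does not model.

```python
def alanine_scan(protein: str) -> list[tuple[str, str]]:
    """Return (label, variant) pairs, one per non-alanine residue.

    Each variant replaces exactly one residue with alanine. Labels follow the
    common "original residue, 1-based position, A" convention, e.g. "K2A".
    """
    protein = protein.upper()
    variants = []
    for index, residue in enumerate(protein):
        if residue == "A":
            continue  # substituting alanine for alanine is uninformative
        variant = protein[:index] + "A" + protein[index + 1:]
        variants.append((f"{residue}{index + 1}A", variant))
    return variants


if __name__ == "__main__":
    # Hypothetical four-residue peptide used only to show the output shape.
    for label, variant in alanine_scan("MKAC"):
        print(label, variant)
```

Residues whose substitution abolishes or reduces activity in the subsequent assay would be the ones flagged as important for function.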
Other fragments and derivatives of the sequences of proteins which would be
expected to
retain protein activity in whole or in part and are useful for screening or
other immunological
methodologies may also be easily made by those skilled in the art given the
disclosures herein.
Such modifications are encompassed by the present invention.
The protein may also be produced by operably linking the isolated
polynucleotide of the
invention to suitable control sequences in one or more insect expression
vectors, and employing
an insect expression system. Materials and methods for baculovirus/insect cell
expression
systems are commercially available in kit form from, e.g., Invitrogen, San
Diego, Calif., U.S.A.
(the MaxBac™ kit), and such methods are well known in the art, as described
in Summers and
Smith, Texas Agricultural Experiment Station Bulletin No. 1555 (1987),
incorporated herein by
reference. As used herein, an insect cell capable of expressing a
polynucleotide of the present
invention is "transformed."
The protein of the invention may be prepared by culturing transformed host
cells under
culture conditions suitable to express the recombinant protein. The resulting
expressed protein
may then be purified from such culture (i.e., from culture medium or cell
extracts) using known
purification processes, such as gel filtration and ion exchange
chromatography. The purification
of the protein may also include an affinity column containing agents which
will bind to the
protein; one or more column steps over such affinity resins as concanavalin
A-agarose, heparin-toyopearl™ or Cibacron blue 3GA Sepharose™; one or more steps
involving
hydrophobic interaction chromatography using such resins as phenyl ether,
butyl ether, or propyl
ether; or immunoaffinity chromatography.
Alternatively, the protein of the invention may also be expressed in a form
which will
facilitate purification. For example, it may be expressed as a fusion protein,
such as those of
maltose binding protein (MBP), glutathione-S-transferase (GST) or thioredoxin
(TRX), or as a
His tag. Kits for expression and purification of such fusion proteins are
commercially available
from New England BioLabs (Beverly, Mass.), Pharmacia (Piscataway, N.J.) and
Invitrogen,
respectively. The protein can also be tagged with an epitope and subsequently
purified by using
a specific antibody directed to such epitope. One such epitope ("FLAG®") is
commercially
available from Kodak (New Haven, Conn.).
Finally, one or more reverse-phase high performance liquid chromatography (RP-
HPLC)
steps employing hydrophobic RP-HPLC media, e.g., silica gel having pendant
methyl or other
aliphatic groups, can be employed to further purify the protein. Some or all
of the foregoing
purification steps, in various combinations, can also be employed to provide a
substantially
homogeneous isolated recombinant protein. The protein thus purified is
substantially free of
other mammalian proteins and is defined in accordance with the present
invention as an "isolated
protein."
The polypeptides of the invention include analogs (variants). This embraces
fragments,
as well as peptides in which one or more amino acids has been deleted,
inserted, or substituted.
Also, analogs of the polypeptides of the invention embrace fusions of the
polypeptides or
modifications of the polypeptides of the invention, wherein the polypeptide or
analog is fused to
another moiety or moieties, e.g., targeting moiety or another therapeutic
agent. Such analogs
may exhibit improved properties such as activity and/or stability. Examples of
moieties which
may be fused to the polypeptide or an analog include, for example, targeting
moieties which
provide for the delivery of polypeptide to pancreatic cells, e.g., antibodies
to pancreatic cells,
antibodies to immune cells such as T-cells, monocytes, dendritic cells,
granulocytes, etc., as well
as receptor and ligands expressed on pancreatic or immune cells. Other
moieties which may be
fused to the polypeptide include therapeutic agents which are used for
treatment, for example,
immunosuppressive drugs such as cyclosporin, FK506, azathioprine, CD3
antibodies and
steroids. Also, polypeptides may be fused to immune modulators, and other
cytokines such as
alpha or beta interferon.
4.6.1 DETERMINING POLYPEPTIDE AND POLYNUCLEOTIDE IDENTITY
AND SIMILARITY
Preferred identity and/or similarity are designed to give the largest match
between the
sequences tested. Methods to determine identity and similarity are codified in
computer
programs including, but not limited to, the GCG program package, including
GAP
(Devereux, J., et al., Nucleic Acids Research 12(1):387 (1984); Genetics
Computer Group,
University of Wisconsin, Madison, WI), BLASTP, BLASTN, BLASTX, FASTA
(Altschul, S.F.
et al., J. Molec. Biol. 215:403-410 (1990)), PSI-BLAST (Altschul S.F. et al.,
Nucleic Acids Res.
vol. 25, pp. 3389-3402, herein incorporated by reference), eMatrix software
(Wu et al., J. Comp.
Biol., Vol. 6, pp. 219-235 (1999), herein incorporated by reference), eMotif
software (Nevill-
Manning et al, ISMB-97, Vol. 4, pp. 202-209, herein incorporated by
reference), pFam software
(Sonnhammer et al., Nucleic Acids Res., Vol. 26(1), pp. 320-322 (1998), herein
incorporated by
reference) and the Kyte-Doolittle hydrophobicity prediction algorithm (J. Mol.
Biol., 157, pp.
105-31 (1982), incorporated herein by reference). The BLAST programs are
publicly available
from the National Center for Biotechnology Information (NCBI) and other
sources (BLAST
Manual, Altschul, S., et al. NCBI NLM NIH Bethesda, MD 20894; Altschul, S., et
al., J. Mol.
Biol. 215:403-410 (1990)).
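As an illustration of one of the tools named above, the Kyte-Doolittle hydrophobicity prediction can be sketched as a sliding-window average of per-residue hydropathy values. The values below are those of the published Kyte-Doolittle scale; the function name, the default window length of nine residues and the toy peptide are assumptions made for this example.

```python
# Kyte-Doolittle hydropathy index for the twenty standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence.

    The result has len(protein) - window + 1 entries; higher values indicate
    more hydrophobic stretches.
    """
    protein = protein.upper()
    if window < 1 or window > len(protein):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in protein]
    return [
        sum(scores[start:start + window]) / window
        for start in range(len(scores) - window + 1)
    ]


if __name__ == "__main__":
    # Hypothetical sixteen-residue peptide; prints one mean value per window.
    for value in hydropathy_profile("MKTAYIAKQRQISFVK", window=9):
        print(round(value, 2))
```

Longer windows are commonly chosen when looking for membrane-spanning segments, which is why the window length is exposed as a parameter.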
4.7 CHIMERIC AND FUSION PROTEINS
The invention also provides chimeric or fusion proteins. As used herein, a
"chimeric
protein" or "fusion protein" comprises a polypeptide of the invention
operatively linked to
another polypeptide. Within a fusion protein the polypeptide according to the
invention can
correspond to all or a portion of a protein according to the invention. In one
embodiment, a
fusion protein comprises at least one biologically active portion of a protein
according to the
invention. In another embodiment, a fusion protein comprises at least two
biologically active
portions of a protein according to the invention. Within the fusion protein,
the term "operatively
linked" is intended to indicate that the polypeptide according to the
invention and the other
polypeptide are fused in-frame to each other. The polypeptide can be fused to
the N-terminus or
C-terminus.
For example, in one embodiment a fusion protein comprises a polypeptide
according to
the invention operably linked to the extracellular domain of a second protein.
In another embodiment, the fusion protein is a GST-fusion protein in which the
polypeptide
sequences of the invention are fused to the C-terminus of the GST (i.e.,
glutathione
S-transferase) sequences.
In another embodiment, the fusion protein is an immunoglobulin fusion protein
in which
the polypeptide sequences according to the invention comprise one or more
domains fused to
sequences derived from a member of the immunoglobulin protein family. The
immunoglobulin
fusion proteins of the invention can be incorporated into pharmaceutical
compositions and
administered to a subject to inhibit an interaction between a ligand and a
protein of the invention
on the surface of a cell, to thereby suppress signal transduction in vivo. The
immunoglobulin
fusion proteins can be used to affect the bioavailability of a cognate ligand.
Inhibition of the
ligand/protein interaction may be useful therapeutically for both the
treatment of proliferative
and differentiative disorders, e.g., cancer as well as modulating (e.g.,
promoting or inhibiting)
cell survival. Moreover, the immunoglobulin fusion proteins of the invention
can be used as
immunogens to produce antibodies in a subject, to purify ligands, and in
screening assays to
identify molecules that inhibit the interaction of a polypeptide of the
invention with a ligand.
A chimeric or fusion protein of the invention can be produced by standard
recombinant
DNA techniques. For example, DNA fragments coding for the different
polypeptide sequences
are ligated together in-frame in accordance with conventional techniques,
e.g., by employing
blunt-ended or stagger-ended termini for ligation, restriction enzyme
digestion to provide for
appropriate termini, filling-in of cohesive ends as appropriate, alkaline
phosphatase treatment to
avoid undesirable joining, and enzymatic ligation. In another embodiment, the
fusion gene can
be synthesized by conventional techniques including automated DNA
synthesizers.
Alternatively, PCR amplification of gene fragments can be carried out using
anchor primers that
give rise to complementary overhangs between two consecutive gene fragments
that can
subsequently be annealed and reamplified to generate a chimeric gene sequence
(see, for
example, Ausubel et al. (eds.) CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John
Wiley &
Sons, 1992). Moreover, many expression vectors are commercially available that
already encode
a fusion moiety (e.g., a GST polypeptide). A nucleic acid encoding a
polypeptide of the
invention can be cloned into such an expression vector such that the fusion
moiety is linked
in-frame to the protein of the invention.
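The in-frame requirement for a fusion gene can be checked with a short sketch. It assumes each coding sequence is supplied as a whole number of codons, drops a terminal stop codon from the upstream partner (for example, a GST coding sequence) so that translation reads through into the downstream partner, and rejects any premature stop codon in the joined frame. The function names and the toy sequences are hypothetical and do not represent GST or any sequence of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def split_codons(sequence: str) -> list[str]:
    """Split a coding sequence into consecutive three-base codons."""
    return [sequence[i:i + 3] for i in range(0, len(sequence), 3)]


def fuse_in_frame(upstream_orf: str, downstream_orf: str) -> str:
    """Join two ORFs so the downstream ORF stays in the upstream reading frame."""
    upstream = upstream_orf.upper()
    downstream = downstream_orf.upper()
    if len(upstream) % 3 or len(downstream) % 3:
        raise ValueError("each ORF must be a whole number of codons")
    upstream_codons = split_codons(upstream)
    if upstream_codons and upstream_codons[-1] in STOP_CODONS:
        # Remove the upstream stop so translation continues into the partner.
        upstream_codons = upstream_codons[:-1]
    fused = upstream_codons + split_codons(downstream)
    # A stop codon is acceptable only as the final codon of the fusion.
    if any(codon in STOP_CODONS for codon in fused[:-1]):
        raise ValueError("premature stop codon in the fused reading frame")
    return "".join(fused)


if __name__ == "__main__":
    # Toy upstream ORF (ends in TAA) fused to a toy downstream ORF (ends in TGA).
    print(fuse_in_frame("ATGTCCCCTTAA", "ATGGCTTGA"))
```

In practice a linker or protease cleavage site is often placed between the two partners; that would be an additional whole-codon segment inserted between the two lists.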
4.8 GENE THERAPY
Mutations in the polynucleotides of the invention may result in loss of
normal
function of the encoded protein. The invention thus provides gene therapy to
restore normal
activity of the polypeptides of the invention; or to treat disease states
involving polypeptides of
the invention. Delivery of a functional gene encoding polypeptides of the
invention to
appropriate cells is effected ex vivo, in situ, or in vivo by use of vectors,
and more particularly
viral vectors (e.g., adenovirus, adeno-associated virus, or a retrovirus), or
ex vivo by use of
physical DNA transfer methods (e.g., liposomes or chemical treatments). See,
for example,
Anderson, Nature, supplement to vol. 392, no. 6679, pp. 25-30 (1998). For
additional reviews of
gene therapy technology see Friedmann, Science, 244: 1275-1281 (1989); Verma,
Scientific
American: 68-84 (1990); and Miller, Nature, 357: 455-460 (1992). Introduction
of any one of
the nucleotides of the present invention or a gene encoding the polypeptides
of the present
invention can also be accomplished with extrachromosomal substrates (transient
expression) or
artificial chromosomes (stable expression). Cells may also be cultured ex vivo
in the presence of
proteins of the present invention in order to proliferate or to produce a
desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for
therapeutic purposes.
Alternatively, it is contemplated that in other human disease states,
preventing the expression of
or inhibiting the activity of polypeptides of the invention will be useful in
treating the disease
states. It is contemplated that antisense therapy or gene therapy could be
applied to negatively
regulate the expression of polypeptides of the invention.
Other methods inhibiting expression of a protein include the introduction of
antisense
molecules to the nucleic acids of the present invention, their complements, or
their translated RNA
sequences, by methods known in the art. Further, the polypeptides of the
present invention can be
inhibited by using targeted deletion methods, or the insertion of a negative
regulatory element such
as a silencer, which is tissue specific.
The present invention still further provides cells genetically engineered
in vivo to express the
polynucleotides of the invention, wherein such polynucleotides are in
operative association with a
regulatory sequence heterologous to the host cell which drives expression of
the polynucleotides in
the cell. These methods can be used to increase or decrease the expression of
the polynucleotides of
the present invention.
Knowledge of DNA sequences provided by the invention allows for modification
of cells to
permit, increase, or decrease, expression of endogenous polypeptide. Cells can
be modified (e.g., by
homologous recombination) to provide increased polypeptide expression by
replacing, in whole or
in part, the naturally occurring promoter with all or part of a heterologous
promoter so that the cells
express the protein at higher levels. The heterologous promoter is inserted in
such a manner that it is
operatively linked to the desired protein encoding sequences. See, for
example, PCT International
Publication No. WO 94/12650, PCT International Publication No. WO 92/20808, and
PCT
International Publication No. WO 91/09955. It is also contemplated that, in
addition to heterologous
promoter DNA, amplifiable marker DNA (e.g., ada, dhfr, and the multifunctional
CAD gene which
encodes carbamyl phosphate synthase, aspartate transcarbamylase, and
dihydroorotase) and/or
intron DNA may be inserted along with the heterologous promoter DNA. If linked
to the desired
protein coding sequence, amplification of the marker DNA by standard selection
methods results in
co-amplification of the desired protein coding sequences in the cells.
In another embodiment of the present invention, cells and tissues may be
engineered to
express an endogenous gene comprising the polynucleotides of the invention
under the control of
inducible regulatory elements, in which case the regulatory sequences of
the endogenous gene may
be replaced by homologous recombination. As described herein, gene targeting
can be used to
replace a gene's existing regulatory region with a regulatory sequence
isolated from a different gene
or a novel regulatory sequence synthesized by genetic engineering methods.
Such regulatory
sequences may be comprised of promoters, enhancers,
scaffold-attachment regions, negative
regulatory elements, transcriptional initiation sites, regulatory protein
binding sites or combinations
of said sequences. Alternatively, sequences which affect the structure or
stability of the RNA or
protein produced may be replaced, removed, added, or otherwise modified by
targeting. These
sequences include polyadenylation signals, mRNA stability elements, splice
sites, leader sequences
for enhancing or modifying transport or secretion properties of the protein,
or other sequences
which alter or improve the function or stability of protein or RNA molecules.
The targeting event may be a simple insertion of the regulatory sequence,
placing the gene
under the control of the new regulatory sequence, e.g., inserting a new
promoter or enhancer or both
upstream of a gene. Alternatively, the targeting event may be a simple
deletion of a regulatory
element, such as the deletion of a tissue-specific negative regulatory
element. Alternatively, the
targeting event may replace an existing element; for example, a tissue-
specific enhancer can be
replaced by an enhancer that has broader or different cell-type specificity
than the naturally
occurring elements. Here, the naturally occurring sequences are deleted and
new sequences are
added. In all cases, the identification of the targeting event may be
facilitated by the use of one or
more selectable marker genes that are contiguous with the targeting DNA,
allowing for the selection
of cells in which the exogenous DNA has integrated into the cell genome. The
identification of the
targeting event may also be facilitated by the use of one or more marker genes
exhibiting the
property of negative selection, such that the negatively selectable marker is
linked to the exogenous
DNA, but configured such that the negatively selectable marker flanks the
targeting sequence, and
such that a correct homologous recombination event with sequences in the host
cell genome does
not result in the stable integration of the negatively selectable marker.
Markers useful for this
purpose include the Herpes Simplex Virus thymidine kinase (TK) gene or the
bacterial
xanthine-guanine phosphoribosyl-transferase (gpt) gene.
The gene targeting or gene activation techniques which can be used in
accordance with this
aspect of the invention are more particularly described in U.S. Patent No.
5,272,071 to Chappel;
U.S. Patent No. 5,578,461 to Sherwin et al.; International Application No.
PCT/US92/09627
(WO 93/09222) by Selden et al.; and International Application No.
PCT/US90/06436
(WO 91/06667) by Skoultchi et al., each of which is incorporated by reference
herein in its entirety.
4.9 TRANSGENIC ANIMALS
In preferred methods to determine biological functions of the polypeptides of
the
invention in vivo, one or more genes provided by the invention are either over
expressed or
inactivated in the germ line of animals using homologous recombination
[Capecchi, Science
244:1288-1292 (1989)]. Animals in which the gene is over expressed, under the
regulatory
control of exogenous or endogenous promoter elements, are known as transgenic
animals.
Animals in which an endogenous gene has been inactivated by homologous
recombination are
referred to as "knockout" animals. Knockout animals, preferably non-human
mammals, can be
prepared as described in U.S. Patent No. 5,557,032, incorporated herein by
reference. Transgenic
animals are useful to determine the roles polypeptides of the invention play
in biological
processes, and preferably in disease states. Transgenic animals are useful as
model systems to
identify compounds that modulate lipid metabolism. Transgenic animals,
preferably non-human
mammals, are produced using methods as described in U.S. Patent No. 5,489,743
and PCT
Publication No. WO 94/28122, incorporated herein by reference.
Transgenic animals can be prepared wherein all or part of a promoter of the
polynucleotides of the invention is either activated or inactivated to alter
the level of expression
of the polypeptides of the invention. Inactivation can be carried out using
homologous
recombination methods described above. Activation can be achieved by
supplementing or even
replacing the homologous promoter to provide for increased protein expression.
The homologous
promoter can be supplemented by insertion of one or more heterologous enhancer
elements
known to confer promoter activation in a particular tissue.
The polynucleotides of the present invention also make possible the
development,
through, e.g., homologous recombination or knock out strategies, of animals
that fail to express
polypeptides of the invention or that express a variant polypeptide. Such
animals are useful as
models for studying the in vivo activities of polypeptide as well as for
studying modulators of the
polypeptides of the invention.
4.10 USES AND BIOLOGICAL ACTIVITY
The polynucleotides and proteins of the present invention are expected to
exhibit one or
more of the uses or biological activities (including those associated with
assays cited herein)
identified herein. Uses or activities described for proteins of the present
invention may be
provided by administration or use of such proteins or of polynucleotides
encoding such proteins
(such as, for example, in gene therapies or vectors suitable for introduction
of DNA). The
mechanism underlying the particular condition or pathology will dictate
whether the
polypeptides of the invention, the polynucleotides of the invention or
modulators (activators or
inhibitors) thereof would be beneficial to the subject in need of treatment.
Thus, "therapeutic
compositions of the invention" include compositions comprising isolated
polynucleotides
(including recombinant DNA molecules, cloned genes and degenerate variants
thereof) or
polypeptides of the invention (including full length protein, mature protein
and truncations or
domains thereof), or compounds and other substances that modulate the overall
activity of the
target gene products, either at the level of target gene/protein expression or
target protein
activity. Such modulators include polypeptides, analogs (variants), including
fragments and
fusion proteins, antibodies and other binding proteins; chemical compounds
that directly or
indirectly activate or inhibit the polypeptides of the invention (identified,
e.g., via drug screening
assays as described herein); antisense polynucleotides and polynucleotides
suitable for triple
helix formation; and in particular antibodies or other binding partners that
specifically recognize
one or more epitopes of the polypeptides of the invention.
The polypeptides of the present invention may likewise be involved in cellular
activation
or in one of the other physiological pathways described herein.
4.10.1 RESEARCH USES AND UTILITIES
The polynucleotides provided by the present invention can be used by the
research
community for various purposes. The polynucleotides can be used to express
recombinant
protein for analysis, characterization or therapeutic use; as markers for
tissues in which the
corresponding protein is preferentially expressed (either constitutively or at
a particular stage of
tissue differentiation or development or in disease states); as molecular
weight markers on gels;
as chromosome markers or tags (when labeled) to identify chromosomes or to map
related gene
positions; to compare with endogenous DNA sequences in patients to identify
potential genetic
disorders; as probes to hybridize and thus discover novel, related DNA
sequences; as a source of
information to derive PCR primers for genetic fingerprinting; as a probe to
"subtract-out" known
sequences in the process of discovering other novel polynucleotides; for
selecting and making
oligomers for attachment to a "gene chip" or other support, including for
examination of
expression patterns; to raise anti-protein antibodies using DNA immunization
techniques; and as
an antigen to raise anti-DNA antibodies or elicit another immune response.
Where the
polynucleotide encodes a protein which binds or potentially binds to another
protein (such as, for
example, in a receptor-ligand interaction), the polynucleotide can also be
used in interaction trap
assays (such as, for example, that described in Gyuris et al., Cell 75:791-803
(1993)) to identify
polynucleotides encoding the other protein with which binding occurs or to
identify inhibitors of
the binding interaction.
The polypeptides provided by the present invention can similarly be used in
assays to
determine biological activity, including in a panel of multiple proteins for
high-throughput
screening; to raise antibodies or to elicit another immune response; as a
reagent (including the
labeled reagent) in assays designed to quantitatively determine levels of the
protein (or its
receptor) in biological fluids; as markers for tissues in which the
corresponding polypeptide is
preferentially expressed (either constitutively or at a particular stage of
tissue differentiation or
development or in a disease state); and, of course, to isolate correlative
receptors or ligands.
Proteins involved in these binding interactions can also be used to screen for
peptide or small
molecule inhibitors or agonists of the binding interaction.
Any or all of these research utilities are capable of being developed into
reagent grade or
kit format for commercialization as research products.
Methods for performing the uses listed above are well known to those skilled
in the art.
References disclosing such methods include without limitation "Molecular
Cloning: A
Laboratory Manual", 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J.,
E. F. Fritsch
and T. Maniatis eds., 1989, and "Methods in Enzymology: Guide to Molecular
Cloning
Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., 1987.


CA 02399776 2002-08-02
WO 01/57190 PCT/US01/04098
4.10.2 NUTRITIONAL USES
Polynucleotides and polypeptides of the present invention can also be used as
nutritional
sources or supplements. Such uses include without limitation use as a protein
or amino acid
supplement, use as a carbon source, use as a nitrogen source and use as a
source of carbohydrate. In
such cases the polypeptide or polynucleotide of the invention can be added to
the feed of a
particular organism or can be administered as a separate solid or liquid
preparation, such as in the
form of powder, pills, solutions, suspensions or capsules. In the case of
microorganisms, the
polypeptide or polynucleotide of the invention can be added to the medium in
or on which the
microorganism is cultured.
4.10.3 CYTOKINE AND CELL PROLIFERATION/DIFFERENTIATION
ACTIVITY
A polypeptide of the present invention may exhibit activity relating to
cytokine, cell
proliferation (either inducing or inhibiting) or cell differentiation (either
inducing or inhibiting)
activity or may induce production of other cytokines in certain cell
populations. A
polynucleotide of the invention can encode a polypeptide exhibiting such
attributes. Many
protein factors discovered to date, including all known cytokines, have
exhibited activity in one
or more factor-dependent cell proliferation assays, and hence the assays serve
as a convenient
confirmation of cytokine activity. The activity of therapeutic compositions of
the present
invention is evidenced by any one of a number of routine factor dependent cell
proliferation
assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9,
B9/11, BaF3,
MC9/G, M+(preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e, CMK,
HUVEC, and Caco. Therapeutic compositions of the invention can be used in the
following:
Assays for T-cell or thymocyte proliferation include without limitation those
described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-
Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic
studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J.
Immunol.
145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341,
1991; Bertagnolli,
et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol. 152:1756-
1761, 1994.
Assays for cytokine production and/or proliferation of spleen cells, lymph
node cells or
thymocytes include, without limitation, those described in: Polyclonal T cell
stimulation,
Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E.
e.a. Coligan
eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and
Measurement of mouse
and human interleukin-γ, Schreiber, R. D. In Current Protocols in Immunology.
J. E. e.a. Coligan
eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and
lymphopoietic cells
include, without limitation, those described in: Measurement of Human and
Murine Interleukin 2
and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current
Protocols in
Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and
Sons, Toronto. 1991;
deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature
336:690-692, 1988;
Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983;
Measurement of mouse
and human interleukin 6--Nordan, R. In Current Protocols in Immunology. J. E.
Coligan eds. Vol
1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc.
Natl. Acad. Sci.
U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11--Bennett, F.,
Giannotti, J.,
Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E.
Coligan eds. Vol 1 pp.
6.15.1 John Wiley and Sons, Toronto. 1991; Measurement of mouse and human
Interleukin
9--Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current
Protocols in Immunology.
J. E. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among
others, proteins
that affect APC-T cell interactions as well as direct T-cell effects by
measuring proliferation and
cytokine production) include, without limitation, those described in: Current
Protocols in
Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M.
Shevach, W. Strober,
Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In
Vitro assays for Mouse
Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors;
Chapter 7,
Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA
77:6091-6095,
1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J.
Immunol.
137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.
4.10.4 STEM CELL GROWTH FACTOR ACTIVITY
A polypeptide of the present invention may exhibit stem cell growth factor
activity and
be involved in the proliferation, differentiation and survival of pluripotent
and totipotent stem
cells including primordial germ cells, embryonic stem cells, hematopoietic
stem cells and/or
germ line stem cells. Administration of the polypeptide of the invention to
stem cells in vivo or
ex vivo is expected to maintain and expand cell populations in a totipotential
or pluripotential
state which would be useful for re-engineering damaged or diseased tissues,
transplantation,
manufacture of bio-pharmaceuticals and the development of bio-sensors. The
ability to produce
large quantities of human cells has important working applications for the
production of human
proteins which currently must be obtained from non-human sources or donors,
implantation of
cells to treat diseases such as Parkinson's, Alzheimer's and other
neurodegenerative diseases;
tissues for grafting such as bone marrow, skin, cartilage, tendons, bone,
muscle (including
cardiac muscle), blood vessels, cornea, neural cells, gastrointestinal cells
and others; and organs
for transplantation such as kidney, liver, pancreas (including islet cells),
heart and lung.
It is contemplated that multiple different exogenous growth factors and/or
cytokines may
be administered in combination with the polypeptide of the invention to
achieve the desired
effect, including any of the growth factors listed herein, other stem cell
maintenance factors, and
specifically including stem cell factor (SCF), leukemia inhibitory factor
(LIF), Flt-3 ligand (Flt-
3L), any of the interleukins, recombinant soluble IL-6 receptor fused to IL-6,
macrophage
inflammatory protein 1-alpha (MIP-1-alpha), G-CSF, GM-CSF, thrombopoietin
(TPO), platelet
factor 4 (PF-4), platelet-derived growth factor (PDGF), neural growth factors
and basic fibroblast
growth factor (bFGF).
Since totipotent stem cells can give rise to virtually any mature cell type,
expansion of
these cells in culture will facilitate the production of large quantities of
mature cells. Techniques
for culturing stem cells are known in the art and administration of
polypeptides of the invention,
optionally with other growth factors and/or cytokines, is expected to enhance
the survival and
proliferation of the stem cell populations. This can be accomplished by direct
administration of
the polypeptide of the invention to the culture medium. Alternatively, stromal
cells transfected
with a polynucleotide that encodes for the polypeptide of the invention can be
used as a feeder
layer for the stem cell populations in culture or in vivo. Stromal support
cells for feeder layers
may include embryonic bone marrow fibroblasts, bone marrow stromal cells,
fetal liver cells, or
cultured embryonic fibroblasts (see U.S. Patent No. 5,690,926).
Stem cells themselves can be transfected with a polynucleotide of the
invention to induce
autocrine expression of the polypeptide of the invention. This will allow for
generation of
undifferentiated totipotential/pluripotential stem cell lines that are
useful as is or that can then be
differentiated into the desired mature cell types. These stable cell lines can
also serve as a source
of undifferentiated totipotential/pluripotential mRNA to create cDNA libraries
and templates for
polymerase chain reaction experiments. These studies would allow for the
isolation and
identification of differentially expressed genes in stem cell populations that
regulate stem cell
proliferation and/or maintenance.
Expansion and maintenance of totipotent stem cell populations will be useful
in the
treatment of many pathological conditions. For example, polypeptides of the
present invention
may be used to manipulate stem cells in culture to give rise to
neuroepithelial cells that can be
used to augment or replace cells damaged by illness, autoimmune disease,
accidental damage or
genetic disorders. The polypeptide of the invention may be useful for
inducing the proliferation
of neural cells and for the regeneration of nerve and brain tissue, i.e. for
the treatment of central
and peripheral nervous system diseases and neuropathies, as well as mechanical
and traumatic
disorders which involve degeneration, death or trauma to neural cells or nerve
tissue. In addition,
the expanded stem cell populations can also be genetically altered for gene
therapy purposes and
to decrease host rejection of replacement tissues after grafting or
implantation.
Expression of the polypeptide of the invention and its effect on stem cells
can also be
manipulated to achieve controlled differentiation of the stem cells into more
differentiated cell
types. A broadly applicable method of obtaining pure populations of a specific
differentiated
cell type from undifferentiated stem cell populations involves the use of a
cell-type specific
promoter driving a selectable marker. The selectable marker allows only cells
of the desired type
to survive. For example, stem cells can be induced to differentiate into
cardiomyocytes (Wobus
et al., Differentiation, 48: 173-182, (1991); Klug et al., J. Clin. Invest.,
98(1): 216-224, (1996))
or skeletal muscle cells (Browder, L. W. In: Principles of Tissue Engineering
eds. Lanza et al.,
Academic Press (1997)). Alternatively, directed differentiation of stem cells
can be
accomplished by culturing the stem cells in the presence of a differentiation
factor such as
retinoic acid and an antagonist of the polypeptide of the invention which
would inhibit the
effects of endogenous stem cell factor activity and allow differentiation to
proceed.
In vitro cultures of stem cells can be used to determine if the polypeptide of
the invention
exhibits stem cell growth factor activity. Stem cells are isolated from any
one of various cell
sources (including hematopoietic stem cells and embryonic stem cells) and
cultured on a feeder
layer, as described by Thomson et al., Proc. Natl. Acad. Sci. U.S.A., 92: 7844-
7848 (1995), in
the presence of the polypeptide of the invention alone or in combination with
other growth
factors or cytokines. The ability of the polypeptide of the invention to
induce stem cell
proliferation is determined by colony formation on a semi-solid support, e.g., as
described by
Bernstein et al., Blood, 77: 2316-2321 (1991).
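One simple way to summarize such a colony formation readout is the ratio of mean colonies per well with and without the polypeptide. The Python sketch below assumes hypothetical well counts and a hypothetical function name; it is not part of the cited protocol, and a real analysis would add replicates and a statistical test.

```python
# Illustrative sketch only: the well counts and function name are hypothetical.
from statistics import mean


def colony_fold_change(treated_counts, control_counts):
    """Ratio of mean colonies per well with the polypeptide to mean colonies without it."""
    if not treated_counts or not control_counts:
        raise ValueError("both conditions need at least one counted well")
    control_mean = mean(control_counts)
    if control_mean == 0:
        raise ValueError("control wells formed no colonies; ratio is undefined")
    return mean(treated_counts) / control_mean


# Hypothetical colony counts per well after culture on semi-solid medium.
treated = [20, 30]
control = [10, 10]
print(colony_fold_change(treated, control))
```

A ratio above 1 would be consistent with, though not proof of, proliferative activity under the assumed conditions.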
4.10.5 HEMATOPOIESIS REGULATING ACTIVITY
A polypeptide of the present invention may be involved in regulation of
hematopoiesis
and, consequently, in the treatment of myeloid or lymphoid cell disorders.
Even marginal
biological activity in support of colony forming cells or of factor-dependent
cell lines indicates
involvement in regulating hematopoiesis, e.g. in supporting the growth and
proliferation of
erythroid progenitor cells alone or in combination with other cytokines,
thereby indicating utility,
for example, in treating various anemias or for use in conjunction with
irradiation/chemotherapy
to stimulate the production of erythroid precursors and/or erythroid cells; in
supporting the
growth and proliferation of myeloid cells such as granulocytes and
monocytes/macrophages (i.e.,
traditional CSF activity) useful, for example, in conjunction with
chemotherapy to prevent or
treat consequent myelo-suppression; in supporting the growth and proliferation
of
megakaryocytes and consequently of platelets thereby allowing prevention or
treatment of
various platelet disorders such as thrombocytopenia, and generally for use in
place of or
complementary to platelet transfusions; and/or in supporting the growth and
proliferation of
hematopoietic stem cells which are capable of maturing to any and all of the
above-mentioned
hematopoietic cells and therefore find therapeutic utility in various stem
cell disorders (such as
those usually treated with transplantation, including, without limitation,
aplastic anemia and
paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell
compartment
post irradiation/chemotherapy, either in-vivo or ex-vivo (i.e., in conjunction
with bone marrow
transplantation or with peripheral progenitor cell transplantation (homologous
or heterologous))
as normal cells or genetically manipulated for gene therapy.
Therapeutic compositions of the invention can be used in the following:
Suitable assays for proliferation and differentiation of various hematopoietic
lines are
cited above.
Assays for embryonic stem cell differentiation (which will identify, among
others,
proteins that influence embryonic differentiation hematopoiesis) include,
without limitation,
those described in: Johansson et al. Cellular Biology 15:141-151, 1995; Keller
et al., Molecular
and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915,
1993.
Assays for stem cell survival and differentiation (which will identify, among
others,
proteins that regulate lympho-hematopoiesis) include, without limitation,
those described in:
Methylcellulose colony forming assays, Freshney, M. G. In Culture of
Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994;
Hirayama et al.,
Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony
forming cells
with high proliferative potential, McNiece, I. K. and Briddell, R. A. In
Culture of Hematopoietic
Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York,
N.Y. 1994; Neben et
al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell
assay,
Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al.
eds. Vol pp. 1-21,
Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the
presence of
stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of
Hematopoietic Cells. R. I.
Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994;
Long term culture
initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R.
I. Freshney, et al.
eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.
4.10.6 TISSUE GROWTH ACTIVITY
A polypeptide of the present invention also may be involved in bone,
cartilage, tendon,
ligament and/or nerve tissue growth or regeneration, as well as in wound
healing and tissue
repair and replacement, and in healing of burns, incisions and ulcers.
A polypeptide of the present invention which induces cartilage and/or bone
growth in
circumstances where bone is not normally formed, has application in the
healing of bone
fractures and cartilage damage or defects in humans and other animals.
Compositions of a
polypeptide, antibody, binding partner, or other modulator of the invention
may have
prophylactic use in closed as well as open fracture reduction and also in the
improved fixation of
artificial joints. De novo bone formation induced by an osteogenic agent
contributes to the repair
of congenital, trauma induced, or oncologic resection induced craniofacial
defects, and also is
useful in cosmetic plastic surgery.
A polypeptide of this invention may also be involved in attracting bone-
forming cells,
stimulating growth of bone-forming cells, or inducing differentiation of
progenitors of
bone-forming cells. Treatment of osteoporosis, osteoarthritis, bone
degenerative disorders, or
periodontal disease, such as through stimulation of bone and/or cartilage
repair or by blocking
inflammation or processes of tissue destruction (collagenase activity,
osteoclast activity, etc.)
mediated by inflammatory processes may also be possible using the composition
of the
invention.
Another category of tissue regeneration activity that may involve the
polypeptide of the
present invention is tendon/ligament formation. Induction of tendon/ligament-
like tissue or
other tissue formation in circumstances where such tissue is not normally
formed, has application
in the healing of tendon or ligament tears, deformities and other tendon or
ligament defects in
humans and other animals. Such a preparation employing a tendon/ligament-like
tissue inducing
protein may have prophylactic use in preventing damage to tendon or ligament
tissue, as well as
use in the improved fixation of tendon or ligament to bone or other tissues,
and in repairing
defects to tendon or ligament tissue. De novo tendon/ligament-like tissue
formation induced by
a composition of the present invention contributes to the repair of congenital,
trauma induced, or
other tendon or ligament defects of other origin, and is also useful in
cosmetic plastic surgery for
attachment or repair of tendons or ligaments. The compositions of the present
invention may
provide an environment to attract tendon- or ligament-forming cells, stimulate
growth of tendon- or
ligament-forming cells, induce differentiation of progenitors of tendon- or
ligament-forming
cells, or induce growth of tendon/ligament cells or progenitors ex vivo for
return in vivo to effect
tissue repair. The compositions of the invention may also be useful in the
treatment of tendinitis,
carpal tunnel syndrome and other tendon or ligament defects. The compositions
may also include
an appropriate matrix and/or sequestering agent as a carrier as is well known
in the art.
The compositions of the present invention may also be useful for proliferation
of neural
cells and for regeneration of nerve and brain tissue, i.e. for the treatment
of central and peripheral
nervous system diseases and neuropathies, as well as mechanical and traumatic
disorders, which
involve degeneration, death or trauma to neural cells or nerve tissue. More
specifically, a
composition may be used in the treatment of diseases of the peripheral nervous
system, such as
peripheral nerve injuries, peripheral neuropathy and localized neuropathies,
and central nervous
system diseases, such as Alzheimer's, Parkinson's disease, Huntington's
disease, amyotrophic
lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be
treated in
accordance with the present invention include mechanical and traumatic
disorders, such as spinal
cord disorders, head trauma and cerebrovascular diseases such as stroke.
Peripheral neuropathies
resulting from chemotherapy or other medical therapies may also be treatable
using a
composition of the invention.
Compositions of the invention may also be useful to promote better or faster
closure of
non-healing wounds, including without limitation pressure ulcers, ulcers
associated with vascular
insufficiency, surgical and traumatic wounds, and the like.
Compositions of the present invention may also be involved in the generation
or
regeneration of other tissues, such as organs (including, for example,
pancreas, liver, intestine,
kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular
(including vascular
endothelium) tissue, or for promoting the growth of cells comprising such
tissues. Part of the
desired effects may be by inhibition or modulation of fibrotic scarring to
allow normal tissue
to regenerate. A polypeptide of the present invention may also exhibit
angiogenic activity.
A composition of the present invention may also be useful for gut protection
or
regeneration and treatment of lung or liver fibrosis, reperfusion injury in
various tissues, and
conditions resulting from systemic cytokine damage.
A composition of the present invention may also be useful for promoting or
inhibiting
differentiation of tissues described above from precursor tissues or cells; or
for inhibiting the
growth of tissues described above.
Therapeutic compositions of the invention can be used in the following:
Assays for tissue generation activity include, without limitation, those
described in:
International Patent Publication No. WO95/16035 (bone, cartilage, tendon);
International Patent
Publication No. WO95/05846 (nerve, neuronal); International Patent Publication
No.
WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described
in: Winter,
Epidermal Wound Healing, pp. 71-112 (Maibach, H. I. and Rovee, D. T., eds.),
Year Book
Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J.
Invest. Dermatol.
71:382-84 (1978).
4.10.7 IMMUNE STIMULATING OR SUPPRESSING ACTIVITY
A polypeptide of the present invention may also exhibit immune stimulating or
immune
suppressing activity, including without limitation the activities for which
assays are described
herein. A polynucleotide of the invention can encode a polypeptide exhibiting
such activities. A
protein may be useful in the treatment of various immune deficiencies and
disorders (including
severe combined immunodeficiency (SCID)), e.g., in regulating (up or down)
growth and
proliferation of T and/or B lymphocytes, as well as affecting the cytolytic
activity of NK cells
and other cell populations. These immune deficiencies may be genetic or be
caused by viral (e.g.,
HIV) as well as bacterial or fungal infections, or may result from autoimmune
disorders. More
specifically, infectious diseases caused by viral, bacterial, fungal or other
infection may be
treatable using a protein of the present invention, including infections by
HIV, hepatitis viruses,
herpes viruses, mycobacteria, Leishmania spp., malaria spp. and various fungal
infections such
as candidiasis. Of course, in this regard, proteins of the present invention
may also be useful
where a boost to the immune system generally may be desirable, i.e., in the
treatment of cancer.
Autoimmune disorders which may be treated using a protein of the present
invention
include, for example, connective tissue disease, multiple sclerosis, systemic
lupus erythematosus,
rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre
syndrome,
autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia
gravis, graft-versus-host
disease and autoimmune inflammatory eye disease. Such a protein (or
antagonists thereof,
including antibodies) of the present invention may also be useful in the
treatment of allergic
reactions and conditions (e.g., anaphylaxis, serum sickness, drug reactions,
food allergies, insect
venom allergies, mastocytosis, allergic rhinitis, hypersensitivity
pneumonitis, urticaria,
angioedema, eczema, atopic dermatitis, allergic contact dermatitis, erythema
multiforme,
Stevens-Johnson syndrome, allergic conjunctivitis, atopic
keratoconjunctivitis, vernal
keratoconjunctivitis, giant papillary conjunctivitis and contact allergies),
such as asthma
(particularly allergic asthma) or other respiratory problems. Other
conditions, in which immune
suppression is desired (including, for example, organ transplantation), may
also be treatable
using a protein (or antagonists thereof) of the present invention. The
therapeutic effects of the
polypeptides or antagonists thereof on allergic reactions can be evaluated by
in vivo animal
models such as the cumulative contact enhancement test (Lastbom et al.,
Toxicology 125: 59-66,
1998), skin prick test (Hoffmann et al., Allergy 54: 446-54, 1999), guinea
pig skin sensitization
test (Vohr et al., Arch. Toxicol. 73: 501-9), and murine local lymph node
assay (Kimber et al.,
J. Toxicol. Environ. Health 53: 563-79).
Using the proteins of the invention it may also be possible to modulate immune
responses in a number of ways. Down regulation may be in the form of
inhibiting or blocking an
immune response already in progress or may involve preventing the induction of
an immune
response. The functions of activated T cells may be inhibited by suppressing T
cell responses or
by inducing specific tolerance in T cells, or both. Immunosuppression of T
cell responses is
generally an active, non-antigen-specific, process which requires continuous
exposure of the T
cells to the suppressive agent. Tolerance, which involves inducing non-
responsiveness or anergy
in T cells, is distinguishable from immunosuppression in that it is generally
antigen-specific and
persists after exposure to the tolerizing agent has ceased. Operationally,
tolerance can be
demonstrated by the lack of a T cell response upon reexposure to specific
antigen in the absence
of the tolerizing agent.
Down regulating or preventing one or more antigen functions (including
without
limitation B lymphocyte antigen functions (such as, for example, B7)), e.g.,
preventing high
level lymphokine synthesis by activated T cells, will be useful in situations
of tissue, skin and
organ transplantation and in graft-versus-host disease (GVHD). For example,
blockage of T cell
function should result in reduced tissue destruction in tissue
transplantation. Typically, in tissue
transplants, rejection of the transplant is initiated through its recognition
as foreign by T cells,
followed by an immune reaction that destroys the transplant. The
administration of a therapeutic
composition of the invention may prevent cytokine synthesis by immune cells,
such as T cells,
and thus acts as an immunosuppressant. Moreover, a lack of costimulation may
also be sufficient
to anergize the T cells, thereby inducing tolerance in a subject. Induction of
long-term tolerance
by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated
administration
of these blocking reagents. To achieve sufficient immunosuppression or
tolerance in a subject, it
may also be necessary to block the function of a combination of B lymphocyte
antigens.
The efficacy of particular therapeutic compositions in preventing organ
transplant
rejection or GVHD can be assessed using animal models that are predictive of
efficacy in
humans. Examples of appropriate systems which can be used include allogeneic
cardiac grafts in
rats and xenogeneic pancreatic islet cell grafts in mice, both of which have
been used to examine
the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described
in Lenschow et
al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA,
89:11102-11105
(1992). In addition, murine models of GVHD (see Paul ed., Fundamental
Immunology, Raven
Press, New York, 1989, pp. 846-847) can be used to determine the effect of
therapeutic
compositions of the invention on the development of that disease.
Blocking antigen function may also be therapeutically useful for treating
autoimmune
diseases. Many autoimmune disorders are the result of inappropriate activation
of T cells that are
reactive against self tissue and which promote the production of cytokines and
autoantibodies
involved in the pathology of the diseases. Preventing the activation of
autoreactive T cells may
reduce or eliminate disease symptoms. Administration of reagents which block
stimulation of T
cells can be used to inhibit T cell activation and prevent production of
autoantibodies or T
cell-derived cytokines which may be involved in the disease process.
Additionally, blocking
reagents may induce antigen-specific tolerance of autoreactive T cells which
could lead to
long-term relief from the disease. The efficacy of blocking reagents in
preventing or alleviating
autoimmune disorders can be determined using a number of well-characterized
animal models of
human autoimmune diseases. Examples include murine experimental autoimmune
encephalitis,
systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine
autoimmune
collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine
experimental
myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New
York, 1989, pp.
840-856).
Upregulation of an antigen function (e.g., a B lymphocyte antigen function),
as a means
of up regulating immune responses, may also be useful in therapy. Upregulation
of immune
responses may be in the form of enhancing an existing immune response or
eliciting an initial
immune response. For example, enhancing an immune response may be useful in
cases of viral
infection, including systemic viral diseases such as influenza, the common
cold, and encephalitis.
Alternatively, anti-viral immune responses may be enhanced in an infected
patient by
removing T cells from the patient, costimulating the T cells in vitro with
viral antigen-pulsed
APCs either expressing a peptide of the present invention or together with a
stimulatory form of
a soluble peptide of the present invention and reintroducing the in vitro
activated T cells into the
patient. Another method of enhancing anti-viral immune responses would be to
isolate infected
cells from a patient, transfect them with a nucleic acid encoding a protein of
the present
invention as described herein such that the cells express all or a portion of
the protein on their
surface, and reintroduce the transfected cells into the patient. The infected
cells would now be
capable of delivering a costimulatory signal to, and thereby activate, T cells
in vivo.
A polypeptide of the present invention may provide the necessary stimulation
signal to T
cells to induce a T cell mediated immune response against the transfected
tumor cells. In
addition, tumor cells which lack MHC class I or MHC class II molecules, or
which fail to
reexpress sufficient amounts of MHC class I or MHC class II molecules, can be
transfected with
nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain
truncated portion) of an
MHC class I alpha chain protein and β2 microglobulin protein or an MHC class
II alpha chain
protein and an MHC class II beta chain protein to thereby express MHC class I
or MHC class II
proteins on the cell surface. Expression of the appropriate class I or class
II MHC in conjunction
with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-
2, B7-3) induces a T
cell mediated immune response against the transfected tumor cell.
Optionally, a gene encoding
an antisense construct which blocks expression of an MHC class II associated
protein, such as
the invariant chain, can also be cotransfected with a DNA encoding a peptide
having the activity
of a B lymphocyte antigen to promote presentation of tumor associated antigens
and induce
tumor specific immunity. Thus, the induction of a T cell mediated immune
response in a human
subject may be sufficient to overcome tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured
by the
following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without
limitation,
those described in: Current Protocols in Immunology, Ed by J. E. Coligan, A.
M. Kruisbeek, D.
H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function
3.1-3.19;
Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad.
Sci. USA
78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et
al., J.
Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986;
Takai et al., J.
Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998;
Bertagnolli et al.,
Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol.
153:3079-3092, 1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching
(which
will identify, among others, proteins that modulate T-cell dependent antibody
responses and that
affect Th1/Th2 profiles) include, without limitation, those described in:
Maliszewski, J.
Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro
antibody production,
Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a.
Coligan eds. Vol 1
pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others,
proteins
that generate predominantly Th1 and CTL responses) include, without
limitation, those described
in: Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D.
H. Margulies, E.
M. Shevach, W. Strober, Pub. Greene Publishing Associates and
Wiley-Interscience (Chapter 3,
In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic
studies in
Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J.
Immunol. 140:508-512,
1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.
Dendritic cell-dependent assays (which will identify, among others, proteins
expressed by
dendritic cells that activate naive T-cells) include, without limitation,
those described in: Guery
et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental
Medicine
173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995;
Porgador et
al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal
of Virology
67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al.,
Journal of
Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of
Clinical Investigation
94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine
172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others,
proteins
that prevent apoptosis after superantigen induction and proteins that regulate
lymphocyte
homeostasis) include, without limitation, those described in: Darzynkiewicz et
al., Cytometry
13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al.,
Cancer Research
53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of
Immunology
145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et
al., International
Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and
development
include, without limitation, those described in: Antica et al., Blood 84:111-
117, 1994; Fine et al.,
Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995;
Toki et al.,
Proc. Natl. Acad. Sci. USA 88:7548-7551, 1991.
4.10.8 ACTIVIN/INHIBIN ACTIVITY
A polypeptide of the present invention may also exhibit activin- or inhibin-
related
activities. A polynucleotide of the invention may encode a polypeptide
exhibiting such
characteristics. Inhibins are characterized by their ability to inhibit the
release of follicle
stimulating hormone (FSH), while activins are characterized by their
ability to stimulate the
release of follicle stimulating hormone (FSH). Thus, a polypeptide of the
present invention,
alone or in heterodimers with a member of the inhibin family, may be useful as
a contraceptive
based on the ability of inhibins to decrease fertility in female mammals and
decrease
spermatogenesis in male mammals. Administration of sufficient amounts of other
inhibins can
induce infertility in these mammals. Alternatively, the polypeptide of the
invention, as a
homodimer or as a heterodimer with other protein subunits of the inhibin
group, may be useful as
a fertility inducing therapeutic, based upon the ability of activin molecules
to stimulate FSH
release from cells of the anterior pituitary. See, for example, U.S. Pat. No.
4,798,885. A
polypeptide of the invention may also be useful for advancement of the onset
of fertility in
sexually immature mammals, so as to increase the lifetime reproductive
performance of domestic
animals such as, but not limited to, cows, sheep and pigs.
The activity of a polypeptide of the invention may, among other means, be
measured by
the following methods.
Assays for activin/inhibin activity include, without limitation, those
described in: Vale et
al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986;
Vale et al., Nature
321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al.,
Proc. Natl. Acad. Sci.
USA 83:3091-3095, 1986.
4.10.9 CHEMOTACTIC/CHEMOKINETIC ACTIVITY
A polypeptide of the present invention may be involved in chemotactic or
chemokinetic
activity for mammalian cells, including, for example, monocytes, fibroblasts,
neutrophils,
T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. A
polynucleotide of the
invention can encode a polypeptide exhibiting such attributes. Chemotactic and
chemokinetic
receptor activation can be used to mobilize or attract a desired cell
population to a desired site of
action. Chemotactic or chemokinetic compositions (e.g. proteins, antibodies,
binding partners, or
modulators of the invention) provide particular advantages in treatment of
wounds and other
trauma to tissues, as well as in treatment of localized infections. For
example, attraction of
lymphocytes, monocytes or neutrophils to tumors or sites of infection may
result in improved
immune responses against the tumor or infecting agent.
A protein or peptide has chemotactic activity for a particular cell population
if it can
stimulate, directly or indirectly, the directed orientation or movement of
such cell population.
Preferably, the protein or peptide has the ability to directly stimulate
directed movement of cells.
Whether a particular protein has chemotactic activity for a population of
cells can be readily
determined by employing such protein or peptide in any known assay for cell
chemotaxis.
Therapeutic compositions of the invention can be used in the following:
Assays for chemotactic activity (which will identify proteins that induce or
prevent
chemotaxis) consist of assays that measure the ability of a protein to induce
the migration of cells
across a membrane as well as the ability of a protein to induce the adhesion
of one cell
population to another cell population. Suitable assays for movement and
adhesion include,
without limitation, those described in: Current Protocols in Immunology, Ed by
J. E. Coligan, A.
M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene
Publishing Associates
and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines
6.12.1-6.12.28); Taub et al. J. Clin. Invest. 95:1370-1376, 1995; Lind et al.
APMIS 103:140-146,
1995; Muller et al Eur. J. Immunol. 25:1744-1748; Gruber et al. J. of Immunol.
152:5860-5867,
1994; Johnston et al. J. of Immunol. 153:1762-1768, 1994.
4.10.10 HEMOSTATIC AND THROMBOLYTIC ACTIVITY
A polypeptide of the invention may also be involved in hemostasis or
thrombolysis or
thrombosis. A polynucleotide of the invention can encode a polypeptide
exhibiting such
attributes. Compositions may be useful in treatment of various coagulation
disorders (including
hereditary disorders, such as hemophilias) or to enhance coagulation and other
hemostatic events
in treating wounds resulting from trauma, surgery or other causes. A
composition of the
invention may also be useful for dissolving or inhibiting formation of
thromboses and for
treatment and prevention of conditions resulting therefrom (such as, for
example, infarction of
cardiac and central nervous system vessels (e.g., stroke)).
Therapeutic compositions of the invention can be used in the following:
Assays for hemostatic and thrombolytic activity include, without limitation,
those
described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et
al., Thrombosis Res.
45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub,
Prostaglandins
35:467-474, 1988.
4.10.11 CANCER DIAGNOSIS AND THERAPY
Polypeptides of the invention may be involved in cancer cell generation,
proliferation or
metastasis. Detection of the presence or amount of polynucleotides or
polypeptides of the
invention may be useful for the diagnosis and/or prognosis of one or more
types of cancer. For
example, the presence or increased expression of a polynucleotide/polypeptide
of the invention
may indicate a hereditary risk of cancer, a precancerous condition, or an
ongoing malignancy.
Conversely, a defect in the gene or absence of the polypeptide may be
associated with a cancer
condition. Identification of single nucleotide polymorphisms associated with
cancer or a
predisposition to cancer may also be useful for diagnosis or prognosis.
Cancer treatments promote tumor regression by inhibiting tumor cell
proliferation,
inhibiting angiogenesis (growth of new blood vessels that is necessary to
support tumor growth)
and/or prohibiting metastasis by reducing tumor cell motility or invasiveness.
Therapeutic
compositions of the invention may be effective in adult and pediatric oncology
including in solid
phase tumors/malignancies, locally advanced tumors, human soft tissue
sarcomas, metastatic
cancer, including lymphatic metastases, blood cell malignancies including
multiple myeloma,
acute and chronic leukemias, and lymphomas, head and neck cancers including
mouth cancer,
larynx cancer and thyroid cancer, lung cancers including small cell carcinoma
and non-small cell
cancers, breast cancers including small cell carcinoma and ductal carcinoma,
gastrointestinal
cancers including esophageal cancer, stomach cancer, colon cancer, colorectal
cancer and polyps
associated with colorectal neoplasia, pancreatic cancers, liver cancer,
urologic cancers including
bladder cancer and prostate cancer, malignancies of the female genital tract
including ovarian
carcinoma, uterine (including endometrial) cancers, and solid tumor in the
ovarian follicle,
kidney cancers including renal cell carcinoma, brain cancers including
intrinsic brain tumors,
neuroblastoma, astrocytic brain tumors, gliomas, metastatic tumor cell
invasion in the central
nervous system, bone cancers including osteomas, skin cancers including
malignant melanoma,
tumor progression of human skin keratinocytes, squamous cell carcinoma, basal
cell carcinoma,
hemangiopericytoma and Kaposi's sarcoma.
Polypeptides, polynucleotides, or modulators of polypeptides of the invention
(including
inhibitors and stimulators of the biological activity of the polypeptide of
the invention) may be
administered to treat cancer. Therapeutic compositions can be administered in
therapeutically
effective dosages alone or in combination with adjuvant cancer therapy such as
surgery,
chemotherapy, radiotherapy, thermotherapy, and laser therapy, and may provide
a beneficial
effect, e.g. reducing tumor size, slowing rate of tumor growth, inhibiting
metastasis, or otherwise
improving overall clinical condition, without necessarily eradicating the
cancer.
The composition can also be administered in therapeutically effective amounts
as a
portion of an anti-cancer cocktail. An anti-cancer cocktail is a mixture of
the polypeptide or
modulator of the invention with one or more anti-cancer drugs in addition to a
pharmaceutically
acceptable carrier for delivery. The use of anti-cancer cocktails as a cancer
treatment is routine.
Anti-cancer drugs that are well known in the art and can be used as a
treatment in combination
with the polypeptide or modulator of the invention include: Actinomycin D,
Aminoglutethimide,
Asparaginase, Bleomycin, Busulfan, Carboplatin, Carmustine, Chlorambucil,
Cisplatin (cis-DDP),
Cyclophosphamide, Cytarabine HCl (Cytosine arabinoside), Dacarbazine,
Dactinomycin,
Daunorubicin HCl, Doxorubicin HCl, Estramustine phosphate sodium, Etoposide
(V16-213),
Floxuridine, 5-Fluorouracil (5-Fu), Flutamide, Hydroxyurea (hydroxycarbamide),
Ifosfamide,
Interferon Alpha-2a, Interferon Alpha-2b, Leuprolide acetate (LHRH-releasing
factor analog),
Lomustine, Mechlorethamine HCl (nitrogen mustard), Melphalan, Mercaptopurine,
Mesna,
Methotrexate (MTX), Mitomycin, Mitoxantrone HCl, Octreotide, Plicamycin,
Procarbazine HCl,
Streptozocin, Tamoxifen citrate, Thioguanine, Thiotepa, Vinblastine sulfate,
Vincristine sulfate,
Amsacrine, Azacitidine, Hexamethylmelamine, Interleukin-2, Mitoguazone,
Pentostatin,
Semustine, Teniposide, and Vindesine sulfate.
In addition, therapeutic compositions of the invention may be used for
prophylactic
treatment of cancer. There are hereditary conditions and/or environmental
situations (e.g.
exposure to carcinogens) known in the art that predispose an individual to
developing cancers.
Under these circumstances, it may be beneficial to treat these individuals
with therapeutically
effective doses of the polypeptide of the invention to reduce the risk of
developing cancers.
In vitro models can be used to determine the effective doses of the
polypeptide of the
invention as a potential cancer treatment. These in vitro models include
proliferation assays of
cultured tumor cells, growth of cultured tumor cells in soft agar (see
Freshney, (1987) Culture of
Animal Cells: A Manual of Basic Technique, Wiley-Liss, New York, NY Ch 18 and
Ch 21),
tumor systems in nude mice as described in Giovanella et al., J. Natl. Can.
Inst., 52: 921-30
(1974), mobility and invasive potential of tumor cells in Boyden Chamber
assays as described in
Pilkington et al., Anticancer Res., 17: 4107-9 (1997), and angiogenesis assays
such as induction
of vascularization of the chick chorioallantoic membrane or induction of
vascular endothelial
cell migration as described in Ribatta et al., Intl. J. Dev. Biol., 40: 1189-97
(1999) and Li et al.,
Clin. Exp. Metastasis, 17:423-9 (1999), respectively. Suitable tumor cell
lines are available,
e.g. from American Type Tissue Culture Collection catalogs.
4.10.12 RECEPTOR/LIGAND ACTIVITY
A polypeptide of the present invention may also demonstrate activity as a
receptor,
receptor ligand or inhibitor or agonist of receptor/ligand interactions. A
polynucleotide of the
invention can encode a polypeptide exhibiting such characteristics. Examples
of such receptors
and ligands include, without limitation, cytokine receptors and their ligands,
receptor kinases and
their ligands, receptor phosphatases and their ligands, receptors involved in
cell-cell interactions
and their ligands (including without limitation, cellular adhesion molecules
(such as selectins,
integrins and their ligands)) and receptor/ligand pairs involved in antigen
presentation, antigen
recognition and development of cellular and humoral immune responses.
Receptors and ligands
are also useful for screening of potential peptide or small molecule
inhibitors of the relevant
receptor/ligand interaction. Proteins of the present invention (including,
without limitation,
fragments of receptors and ligands) may themselves be useful as inhibitors of
receptor/ligand
interactions.
The activity of a polypeptide of the invention may, among other means, be
measured by
the following methods:
Suitable assays for receptor-ligand activity include without limitation those
described in:
Current Protocols in Immunology, Ed by J. E. Coligan, A. M. Kruisbeek, D. H.
Margulies, E. M.
Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience
(Chapter 7.28,
Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22),
Takai et al., Proc.
Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med.
168:1145-1156, 1988;
Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J.
Immunol. Methods
175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.


By way of example, the polypeptides of the invention may be used as a receptor
for a
ligand(s) thereby transmitting the biological activity of that ligand(s).
Ligands may be identified
through binding assays, affinity chromatography, dihybrid screening assays,
BIAcore assays, gel
overlay assays, or other methods known in the art.
Studies characterizing drugs or proteins as agonists, antagonists, partial
agonists or
partial antagonists require the use of other proteins as competing ligands. The
polypeptides of the
present invention or ligand(s) thereof may be labeled by being coupled to
radioisotopes,
colorimetric molecules or toxin molecules by conventional methods. ("Guide
to Protein
Purification" Murray P. Deutscher (ed) Methods in Enzymology Vol. 182 (1990)
Academic
Press, Inc. San Diego). Examples of radioisotopes include, but are not limited
to, tritium and
carbon-14. Examples of colorimetric molecules include, but are not limited
to, fluorescent
molecules such as fluorescamine, or rhodamine or other colorimetric molecules.
Examples of
toxins include, but are not limited to, ricin.
4.10.13 DRUG SCREENING
This invention is particularly useful for screening chemical compounds by
using the
novel polypeptides or binding fragments thereof in any of a variety of drug
screening techniques.
The polypeptides or fragments employed in such a test may either be free in
solution, affixed to a
solid support, borne on a cell surface or located intracellularly. One method
of drug screening
utilizes eukaryotic or prokaryotic host cells which are stably transformed
with recombinant
nucleic acids expressing the polypeptide or a fragment thereof. Drugs are
screened against such
transformed cells in competitive binding assays. Such cells, either in viable
or fixed form, can
be used for standard binding assays. One may measure, for example, the
formation of complexes
between polypeptides of the invention or fragments and the agent being tested
or examine the
diminution in complex formation between the novel polypeptides and an
appropriate cell line,
which are well known in the art.
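By way of illustration only, the diminution in complex formation measured in such a competitive binding assay can be expressed as a percent inhibition relative to an untreated control. The Python sketch below is a minimal example under stated assumptions: the function name, the background-subtraction step and the numeric readings are hypothetical and are not taken from this specification.

```python
def percent_inhibition(control_signal, test_signal, background=0.0):
    """Percent decrease in complex formation relative to an untreated control.

    All inputs are raw assay signals in the same arbitrary units (for example,
    counts from a labeled ligand). Hypothetical helper for illustration only.
    """
    specific_control = control_signal - background
    if specific_control <= 0:
        raise ValueError("control signal must exceed background")
    specific_test = test_signal - background
    return 100.0 * (1.0 - specific_test / specific_control)


# Made-up readings: control 1000, agent-treated 400, background 200.
inhibition = percent_inhibition(1000.0, 400.0, background=200.0)
```

A test agent that produces a larger percent inhibition displaces more of the complex. Any threshold used to call an agent a "hit" would have to be chosen for the particular assay.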
Sources for test compounds that may be screened for ability to bind to or
modulate (i.e.,
increase or decrease) the activity of polypeptides of the invention include
(1) inorganic and
organic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries
comprised of either random or mimetic peptides, oligonucleotides or organic
molecules.
Chemical libraries may be readily synthesized or purchased from a number of
commercial sources, and may include structural analogs of known compounds or
compounds
that are identified as "hits" or "leads" via natural product screening.
The sources of natural product libraries are microorganisms (including
bacteria and
fungi), animals, plants or other vegetation, or marine organisms, and
libraries of mixtures for
screening may be created by: (1) fermentation and extraction of broths from
soil, plant or marine
microorganisms or (2) extraction of the organisms themselves. Natural product
libraries include
polyketides, non-ribosomal peptides, and (non-naturally occurring) variants
thereof. For a
review, see Science 282:63-68 (1998).
Combinatorial libraries are composed of large numbers of peptides,
oligonucleotides or
organic compounds and can be readily prepared by traditional automated
synthesis methods,
PCR, cloning or proprietary synthetic methods. Of particular interest are
peptide and
oligonucleotide combinatorial libraries. Still other libraries of interest
include peptide, protein,
peptidomimetic, multiparallel synthetic collection, recombinatorial, and
polypeptide libraries.
For a review of combinatorial chemistry and libraries created therefrom, see
Myers, Curr. Opin.
Biotechnol. 8:701-707 (1997). For reviews and examples of peptidomimetic
libraries, see
Al-Obeidi et al., Mol. Biotechnol, 9(3):205-23 (1998); Hruby et al., Curr Opin
Chem Biol,
1(1):114-19 (1997); Dorner et al., Bioorg Med Chem, 4(5):709-15 (1996)
(alkylated dipeptides).
Identification of modulators through use of the various libraries described
herein permits
modification of the candidate "hit" (or "lead") to optimize the capacity of
the "hit" to bind a
polypeptide of the invention. The molecules identified in the binding assay
are then tested for
antagonist or agonist activity in in vivo tissue culture or animal models that
are well known in the
art. In brief, the molecules are titrated into a plurality of cell cultures or
animals and then tested
for either cell/animal death or prolonged survival of the animal/cells.
The binding molecules thus identified may be complexed with toxins, e.g.,
ricin or
cholera, or with other compounds that are toxic to cells such as
radioisotopes. The toxin-binding
molecule complex is then targeted to a tumor or other cell by the specificity
of the binding
molecule for a polypeptide of the invention. Alternatively, the binding
molecules may be
complexed with imaging agents for targeting and imaging purposes.
4.10.14 ASSAY FOR RECEPTOR ACTIVITY
The invention also provides methods to detect specific binding of a
polypeptide, e.g., a
ligand or a receptor. The art provides numerous assays particularly useful for
identifying
previously unknown binding partners for receptor polypeptides of the
invention. For example,
expression cloning using mammalian or bacterial cells, or dihybrid screening
assays can be used
to identify polynucleotides encoding binding partners. As another example,
affinity
chromatography with the appropriate immobilized polypeptide of the invention
can be used to
isolate polypeptides that recognize and bind polypeptides of the invention.
There are a number
of different libraries used for the identification of compounds, and in
particular small molecules,
that modulate (i.e., increase or decrease) biological activity of a
polypeptide of the invention.
Ligands for receptor polypeptides of the invention can also be identified by
adding exogenous
ligands, or cocktails of ligands, to two cell populations that are genetically
identical except for
the expression of the receptor of the invention: one cell population expresses
the receptor of the
invention whereas the other does not. The responses of the two cell populations
to the addition of
ligand(s) are then compared. Alternatively, an expression library can be
co-expressed with the
polypeptide of the invention in cells and assayed for an autocrine response to
identify potential
ligand(s). As still another example, BIAcore assays, gel overlay assays, or
other methods known
in the art can be used to identify binding partner polypeptides, including,
(1) organic and
inorganic chemical libraries, (2) natural product libraries, and (3)
combinatorial libraries
comprised of random peptides, oligonucleotides or organic molecules.
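As an illustration of the two-population comparison described above, the Python sketch below flags ligands whose measured response in receptor-expressing cells exceeds the response in the otherwise identical control cells by a chosen fold change. The function name, the dictionary layout, the fold-change criterion and its default value are assumptions made for this example and are not part of the specification.

```python
def candidate_ligands(receptor_cells, control_cells, min_fold=2.0):
    """Return ligands whose response in receptor-expressing cells is at least
    min_fold times the response in control cells lacking the receptor.

    Both arguments map a ligand name to a measured response (arbitrary units).
    Hypothetical helper for illustration only.
    """
    hits = []
    for ligand, response in receptor_cells.items():
        baseline = control_cells.get(ligand)
        if baseline is None or baseline <= 0:
            continue  # no usable control measurement; skip rather than guess
        if response / baseline >= min_fold:
            hits.append(ligand)
    return sorted(hits)


# Made-up responses for three hypothetical ligands.
hits = candidate_ligands(
    {"ligand_A": 10.0, "ligand_B": 3.0, "ligand_C": 5.0},
    {"ligand_A": 2.0, "ligand_B": 2.5},
)
```

Ligands with no control measurement are skipped and not scored, so a missing control well cannot produce a false candidate.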
The role of downstream intracellular signaling molecules in the signaling
cascade of the
polypeptide of the invention can be determined. For example, a chimeric
protein in which the
cytoplasmic domain of the polypeptide of the invention is fused to the
extracellular portion of a
protein, whose ligand has been identified, is produced in a host cell. The
cell is then incubated
with the ligand specific for the extracellular portion of the chimeric
protein, thereby activating
the chimeric receptor. Known downstream proteins involved in intracellular
signaling can then
be assayed for expected modifications i.e. phosphorylation. Other methods
known to those in the
art can also be used to identify signaling molecules involved in receptor
activity.
4.10.15 ANTI-INFLAMMATORY ACTIVITY
Compositions of the present invention may also exhibit anti-inflammatory
activity. The
anti-inflammatory activity may be achieved by providing a stimulus to cells
involved in the
inflammatory response, by inhibiting or promoting cell-cell interactions (such
as, for example,
cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the
inflammatory
process, inhibiting or promoting cell extravasation, or by stimulating or
suppressing production
of other factors which more directly inhibit or promote an inflammatory
response. Compositions
with such activities can be used to treat inflammatory conditions (including
chronic or acute
conditions), including without limitation inflammation associated with infection
(such as septic
shock, sepsis or systemic inflammatory response syndrome (SIRS)),
ischemia-reperfusion injury,
endotoxin lethality, arthritis, complement-mediated hyperacute rejection,
nephritis, cytokine or
chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease or
resulting from
over production of cytokines such as TNF or IL-1. Compositions of the
invention may also be
useful to treat anaphylaxis and hypersensitivity to an antigenic substance or
material.
Compositions of this invention may be utilized to prevent or treat conditions
such as, but not
limited to, sepsis, acute pancreatitis, endotoxin shock, cytokine induced
shock, rheumatoid
arthritis, chronic inflammatory arthritis, pancreatic cell damage from
diabetes mellitus type 1,
graft versus host disease, inflammatory bowel disease, inflammation associated
with pulmonary
disease, other autoimmune disease or inflammatory disease, or as an
antiproliferative agent such as for
acute or chronic myelogenous leukemia or in the prevention of premature labor
secondary to
intrauterine infections.
4.10.16 LEUKEMIAS
Leukemias and related disorders may be treated or prevented by administration
of a
therapeutic that promotes or inhibits function of the polynucleotides and/or
polypeptides of the
invention. Such leukemias and related disorders include but are not limited to
acute leukemia,
acute lymphocytic leukemia, acute myelocytic leukemia, myeloblastic,
promyelocytic,
myelomonocytic, monocytic, erythroleukemia, chronic leukemia, chronic
myelocytic
(granulocytic) leukemia and chronic lymphocytic leukemia (for a review of such
disorders, see
Fishman et al., 1985, Medicine, 2d Ed., J.B. Lippincott Co., Philadelphia).
4.10.17 NERVOUS SYSTEM DISORDERS
Nervous system disorders, involving cell types which can be tested for
efficacy of
intervention with compounds that modulate the activity of the polynucleotides
and/or
polypeptides of the invention, and which can be treated upon thus observing an
indication of
therapeutic utility, include but are not limited to nervous system injuries,
and diseases or
disorders which result in either a disconnection of axons, a diminution or
degeneration of
neurons, or demyelination. Nervous system lesions which may be treated in a
patient (including
human and non-human mammalian patients) according to the invention include but
are not
limited to the following lesions of either the central (including spinal cord,
brain) or peripheral
nervous systems:
(i) traumatic lesions, including lesions caused by physical injury or
associated with
surgery, for example, lesions which sever a portion of the nervous system, or
compression
injuries;
(ii) ischemic lesions, in which a lack of oxygen in a portion of the nervous
system
results in neuronal injury or death, including cerebral infarction or
ischemia, or spinal cord
infarction or ischemia;
(iii) infectious lesions, in which a portion of the nervous system is
destroyed or injured
as a result of infection, for example, by an abscess or associated with
infection by human
immunodeficiency virus, herpes zoster, or herpes simplex virus or with Lyme
disease,
tuberculosis, syphilis;
(iv) degenerative lesions, in which a portion of the nervous system is
destroyed or
injured as a result of a degenerative process including but not limited to
degeneration associated
with Parkinson's disease, Alzheimer's disease, Huntington's chorea, or
amyotrophic lateral
sclerosis;
(v) lesions associated with nutritional diseases or disorders, in which a
portion of the
nervous system is destroyed or injured by a nutritional disorder or disorder
of metabolism
including but not limited to, vitamin B12 deficiency, folic acid deficiency,
Wernicke disease,
tobacco-alcohol amblyopia, Marchiafava-Bignami disease (primary degeneration
of the corpus
callosum), and alcoholic cerebellar degeneration;
(vi) neurological lesions associated with systemic diseases including but not
limited to
diabetes (diabetic neuropathy, Bell's palsy), systemic lupus erythematosus,
carcinoma, or
sarcoidosis;
(vii) lesions caused by toxic substances including alcohol, lead, or
particular
neurotoxins; and
(viii) demyelinated lesions in which a portion of the nervous system is
destroyed or
injured by a demyelinating disease including but not limited to multiple
sclerosis, human
immunodeficiency virus-associated myelopathy, transverse myelopathy or various
etiologies,
progressive multifocal leukoencephalopathy, and central pontine myelinolysis.
Therapeutics which are useful according to the invention for treatment of a
nervous
system disorder may be selected by testing for biological activity in
promoting the survival or
differentiation of neurons. For example, and not by way of limitation,
therapeutics which elicit
any of the following effects may be useful according to the invention:
(i) increased survival time of neurons in culture;
(ii) increased sprouting of neurons in culture or in vivo;
(iii) increased production of a neuron-associated molecule in culture or in
vivo, e.g.,
choline acetyltransferase or acetylcholinesterase with respect to motor
neurons; or
(iv) decreased symptoms of neuron dysfunction in vivo.
Such effects may be measured by any method known in the art. In preferred,
non-limiting embodiments, increased survival of neurons may be measured by the
method set
forth in Arakawa et al. (1990, J. Neurosci. 10:3507-3515); increased sprouting
of neurons may
be detected by methods set forth in Pestronk et al. (1980, Exp. Neurol. 70:65-
82) or Brown et al.
(1981, Ann. Rev. Neurosci. 4:17-42); increased production of neuron-associated
molecules may
be measured by bioassay, enzymatic assay, antibody binding, Northern blot
assay, etc.,
depending on the molecule to be measured; and motor neuron dysfunction may be
measured by
assessing the physical manifestation of motor neuron disorder, e.g., weakness,
motor neuron
conduction velocity, or functional disability.
In specific embodiments, motor neuron disorders that may be treated according
to the
invention include but are not limited to disorders such as infarction,
infection, exposure to toxin,
trauma, surgical damage, degenerative disease or malignancy that may affect
motor neurons as
well as other components of the nervous system, as well as disorders that
selectively affect
neurons such as amyotrophic lateral sclerosis, and including but not limited
to progressive spinal
muscular atrophy, progressive bulbar palsy, primary lateral sclerosis,
infantile and juvenile
muscular atrophy, progressive bulbar paralysis of childhood (Fazio-Londe
syndrome),
poliomyelitis and the post polio syndrome, and Hereditary Motorsensory
Neuropathy
(Charcot-Marie-Tooth Disease).
4.10.18 OTHER ACTIVITIES
A polypeptide of the invention may also exhibit one or more of the following
additional
activities or effects: inhibiting the growth, infection or function of, or
killing, infectious agents,
including, without limitation, bacteria, viruses, fungi and other parasites;
effecting (suppressing
or enhancing) bodily characteristics, including, without limitation, height,
weight, hair color, eye
color, skin, fat to lean ratio or other tissue pigmentation, or organ or body
part size or shape
(such as, for example, breast augmentation or diminution, change in bone form
or shape);
effecting biorhythms or circadian cycles or rhythms; effecting the fertility
of male or female
subjects; effecting the metabolism, catabolism, anabolism, processing,
utilization, storage or
elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals,
co-factors or other
nutritional factors or component(s); effecting behavioral characteristics,
including, without
limitation, appetite, libido, stress, cognition (including cognitive
disorders), depression
(including depressive disorders) and violent behaviors; providing analgesic
effects or other pain
reducing effects; promoting differentiation and growth of embryonic stem cells
in lineages other
than hematopoietic lineages; hormonal or endocrine activity; in the case of
enzymes, correcting
deficiencies of the enzyme and treating deficiency-related diseases; treatment
of
hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-
like activity (such
as, for example, the ability to bind antigens or complement); and the ability
to act as an antigen
in a vaccine composition to raise an immune response against such protein or
another material or
entity which is cross-reactive with such protein.
4.10.19 IDENTIFICATION OF POLYMORPHISMS
The demonstration of polymorphisms makes possible the identification of such
polymorphisms in human subjects and the pharmacogenetic use of this
information for diagnosis
and treatment. Such polymorphisms may be associated with, e.g., differential
predisposition or
susceptibility to various disease states (such as disorders involving
inflammation or immune
response) or a differential response to drug administration, and this genetic
information can be
used to tailor preventive or therapeutic treatment appropriately. For example,
the existence of a
polymorphism associated with a predisposition to inflammation or autoimmune
disease makes
possible the diagnosis of this condition in humans by identifying the presence
of the
polymorphism.
Polymorphisms can be identified in a variety of ways known in the art which
all
generally involve obtaining a sample from a patient, analyzing DNA from the
sample, optionally
involving isolation or amplification of the DNA, and identifying the presence
of the
polymorphism in the DNA. For example, PCR may be used to amplify an
appropriate fragment
of genomic DNA which may then be sequenced. Alternatively, the DNA may be
subjected to
allele-specific oligonucleotide hybridization (in which appropriate
oligonucleotides are
hybridized to the DNA under conditions permitting detection of a single base
mismatch) or to a
single nucleotide extension assay (in which an oligonucleotide that hybridizes
immediately
adjacent to the position of the polymorphism is extended with one or more
labeled nucleotides).
In addition, traditional restriction fragment length polymorphism analysis
(using restriction
enzymes that provide differential digestion of the genomic DNA depending on
the presence or
absence of the polymorphism) may be performed. Arrays with nucleotide
sequences of the
present invention can be used to detect polymorphisms. The array can comprise
modified
nucleotide sequences of the present invention in order to detect the
nucleotide sequences of the
present invention. In the alternative, any one of the nucleotide sequences of
the present
invention can be placed on the array to detect changes from those sequences.
Alternatively a polymorphism resulting in a change in the amino acid sequence
could
also be detected by detecting a corresponding change in amino acid sequence of
the protein, e.g.,
by an antibody specific to the variant sequence.
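The sequencing route described above (amplify a fragment, sequence it, and look for the polymorphism) can be sketched as a comparison of a patient read against a reference. This is a minimal illustration only: the function name find_substitutions and the example sequences are hypothetical, and the sketch assumes the two sequences are already aligned and of equal length, which the text does not specify.

```python
def find_substitutions(reference, sample):
    """Return single-base differences between two pre-aligned sequences.

    Each difference is reported as (1-based position, reference base, sample base).
    Assumes equal-length, already aligned sequences (an assumption of this sketch).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    ref = reference.upper()
    smp = sample.upper()
    differences = []
    for index, (ref_base, sample_base) in enumerate(zip(ref, smp)):
        if ref_base != sample_base:
            differences.append((index + 1, ref_base, sample_base))
    return differences


if __name__ == "__main__":
    # Hypothetical sequences chosen only to show the call pattern.
    for position, ref_base, sample_base in find_substitutions("ACGTAC", "ACGAAC"):
        print(f"position {position}: {ref_base} -> {sample_base}")
```

Insertions and deletions would require a real alignment step before this comparison, which is outside the scope of the sketch.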
4.10.20 ARTHRITIS AND INFLAMMATION
The immunosuppressive effects of the compositions of the invention against
rheumatoid
arthritis are determined in an experimental animal model system. The
experimental model system
is adjuvant induced arthritis in rats, and the protocol is described by J.
Holoshitz, et al., 1983,
Science, 219:56, or by B. Waksman et al., 1963, Int. Arch. Allergy Appl.
Immunol., 23:129.
Induction of the disease can be caused by a single injection, generally
intradermally, of a
suspension of killed Mycobacterium tuberculosis in complete Freund's adjuvant
(CFA). The
route of injection can vary, but rats may be injected at the base of the tail
with an adjuvant
mixture. The polypeptide is administered in phosphate buffered solution (PBS)
at a dose of about
1-5 mg/kg. The control consists of administering PBS only.
The procedure for testing the effects of the test compound would consist of
intradermally
injecting killed Mycobacterium tuberculosis in CFA followed by immediately
administering the
test compound and subsequent treatment every other day until day 24. At 14,
15, 18, 20, 22, and
24 days after injection of Mycobacterium CFA, an overall arthritis score may
be obtained as
described by J. Holoshitz above. An analysis of the data would reveal that the
test compound
would have a dramatic effect on the swelling of the joints as measured by a
decrease of the
arthritis score.
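The timeline of the protocol above can be laid out as a simple day-by-day schedule. The sketch below is illustrative only: the helper names are hypothetical, and it assumes that "every other day until day 24" means treatment on days 0, 2, 4, ..., 24 counted from the day of induction. The scoring days are those listed in the text.

```python
# Day numbers are counted from the day of Mycobacterium/CFA injection (day 0).
INDUCTION_DAY = 0
LAST_TREATMENT_DAY = 24
SCORING_DAYS = (14, 15, 18, 20, 22, 24)  # days listed in the protocol text


def treatment_days(first_day=INDUCTION_DAY, last_day=LAST_TREATMENT_DAY, interval=2):
    """Days on which the test compound (or the PBS control) is given.

    Assumes dosing starts on the induction day and repeats every `interval` days.
    """
    return list(range(first_day, last_day + 1, interval))


def build_schedule():
    """Map each active day to the list of events planned for that day."""
    treat = set(treatment_days())
    schedule = {}
    for day in range(INDUCTION_DAY, LAST_TREATMENT_DAY + 1):
        events = []
        if day == INDUCTION_DAY:
            events.append("induce")
        if day in treat:
            events.append("treat")
        if day in SCORING_DAYS:
            events.append("score")
        if events:
            schedule[day] = events
    return schedule


if __name__ == "__main__":
    for day, events in build_schedule().items():
        print(f"day {day:2d}: {', '.join(events)}")
```

Under this reading, day 15 is a scoring day without treatment, while days 14, 18, 20, 22 and 24 combine both.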
4.11 THERAPEUTIC METHODS
The compositions (including polypeptide fragments, analogs, variants and
antibodies or
other binding partners or modulators including antisense polynucleotides) of
the invention have
numerous applications in a variety of therapeutic methods. Examples of
therapeutic applications
include, but are not limited to, those exemplified herein.
4.11.1 EXAMPLE
One embodiment of the invention is the administration of an effective amount
of the
polypeptides or other composition of the invention to individuals affected by
a disease or
disorder that can be modulated by regulating the peptides of the invention.
While the mode of
administration is not particularly important, parenteral administration is
preferred. An
exemplary mode of administration is to deliver an intravenous bolus. The
dosage of the
polypeptides or other composition of the invention will normally be determined
by the
prescribing physician. It is to be expected that the dosage will vary
according to the age, weight,
condition and response of the individual patient. Typically, the amount of
polypeptide
administered per dose will be in the range of about 0.01 µg/kg to 100 mg/kg
of body weight, with
the preferred dose being about 0.1 µg/kg to 10 mg/kg of patient body weight.
For parenteral
administration, polypeptides of the invention will be formulated in an
injectable form combined
with a pharmaceutically acceptable parenteral vehicle. Such vehicles are well
known in the art
and examples include water, saline, Ringer's solution, dextrose solution, and
solutions consisting
of small amounts of human serum albumin. The vehicle may contain minor
amounts of
additives that maintain the isotonicity and stability of the polypeptide or
other active ingredient.
The preparation of such solutions is within the skill of the art.
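The weight-based ranges given above translate into absolute per-dose amounts once a body weight is fixed. The following is a minimal arithmetic sketch, not a dosing tool: the function name dose_bounds_ug and the 70 kg example weight are assumptions introduced here, and the actual dosage remains a matter for the prescribing physician as stated in the text.

```python
UG_PER_MG = 1000.0

# Per-dose ranges stated in the text, expressed in micrograms per kg body weight.
BROAD_RANGE_UG_PER_KG = (0.01, 100 * UG_PER_MG)     # about 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_UG_PER_KG = (0.1, 10 * UG_PER_MG)   # about 0.1 µg/kg to 10 mg/kg


def dose_bounds_ug(body_weight_kg, range_ug_per_kg):
    """Return (low, high) total amount in micrograms for a single dose."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    low_per_kg, high_per_kg = range_ug_per_kg
    return (low_per_kg * body_weight_kg, high_per_kg * body_weight_kg)


if __name__ == "__main__":
    # 70 kg is a hypothetical example weight, not a value from the text.
    low, high = dose_bounds_ug(70, PREFERRED_RANGE_UG_PER_KG)
    print(f"preferred range: {low:.1f} µg to {high / UG_PER_MG:.0f} mg per dose")
```

Keeping both limits in a single unit (micrograms) avoids unit slips when the lower bound is quoted in µg/kg and the upper bound in mg/kg.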
4.12 PHARMACEUTICAL FORMULATIONS AND ROUTES OF
ADMINISTRATION
A protein or other composition of the present invention (from whatever source
derived,
including without limitation from recombinant and non-recombinant sources and
including
antibodies and other binding partners of the polypeptides of the invention)
may be administered
to a patient in need, by itself, or in pharmaceutical compositions where it is
mixed with suitable
carriers or excipient(s) at doses to treat or ameliorate a variety of
disorders. Such a composition
may optionally contain (in addition to protein or other active ingredient and
a carrier) diluents,
fillers, salts, buffers, stabilizers, solubilizers, and other materials well
known in the art. The term
"pharmaceutically acceptable" means a non-toxic material that does not
interfere with the
effectiveness of the biological activity of the active ingredient(s). The
characteristics of the
carrier will depend on the route of administration. The pharmaceutical
composition of the
invention may also contain cytokines, lymphokines, or other hematopoietic
factors such as
M-CSF, GM-CSF, TNF, IL-1, IL-2, IL-3, IL-4, IL-5, IL-6, IL-7, IL-8, IL-9,
IL-10, IL-11, IL-12,
IL-13, IL-14, IL-15, IFN, TNF0, TNF1, TNF2, G-CSF, Meg-CSF, thrombopoietin,
stem cell
factor, and erythropoietin. In further compositions, proteins of the invention
may be combined
with other agents beneficial to the treatment of the disease or disorder in
question. These agents
include various growth factors such as epidermal growth factor (EGF), platelet-
derived growth
factor (PDGF), transforming growth factors (TGF-α and TGF-β), insulin-like
growth factor
(IGF), as well as cytokines described herein.
The pharmaceutical composition may further contain other agents which either
enhance
the activity of the protein or other active ingredient or complement its
activity or use in
treatment. Such additional factors and/or agents may be included in the
pharmaceutical
composition to produce a synergistic effect with protein or other active
ingredient of the
invention, or to minimize side effects. Conversely, protein or other active
ingredient of the
present invention may be included in formulations of the particular clotting
factor, cytokine,
lymphokine, other hematopoietic factor, thrombolytic or anti-thrombotic
factor, or anti-inflammatory
agent to minimize side effects of the clotting factor, cytokine,
lymphokine, other
hematopoietic factor, thrombolytic or anti-thrombotic factor, or
anti-inflammatory agent (such as
IL-1Ra, IL-1 Hy1, IL-1 Hy2, anti-TNF, corticosteroids, immunosuppressive
agents). A protein
of the present invention may be active in multimers (e.g., heterodimers or
homodimers) or
complexes with itself or other proteins. As a result, pharmaceutical
compositions of the
invention may comprise a protein of the invention in such multimeric or
complexed form.
As an alternative to being included in a pharmaceutical composition of the
invention
including a first protein, a second protein or a therapeutic agent may be
concurrently
administered with the first protein (e.g., at the same time, or at differing
times provided that
therapeutic concentrations of the combination of agents are achieved at the
treatment site).
Techniques for formulation and administration of the compounds of the instant
application may
be found in "Remington's Pharmaceutical Sciences," Mack Publishing Co.,
Easton, PA, latest
edition. A therapeutically effective dose further refers to that amount of the
compound sufficient
to result in amelioration of symptoms, e.g., treatment, healing, prevention or
amelioration of the
relevant medical condition, or an increase in rate of treatment, healing,
prevention or
amelioration of such conditions. When applied to an individual active
ingredient, administered
alone, a therapeutically effective dose refers to that ingredient alone. When
applied to a
combination, a therapeutically effective dose refers to combined amounts of
the active
ingredients that result in the therapeutic effect, whether administered in
combination, serially or
simultaneously.
In practicing the method of treatment or use of the present invention, a
therapeutically
effective amount of protein or other active ingredient of the present
invention is administered to
a mammal having a condition to be treated. Protein or other active ingredient
of the present
invention may be administered in accordance with the method of the invention
either alone or in
combination with other therapies such as treatments employing cytokines,
lymphokines or other
hematopoietic factors. When co-administered with one or more cytokines,
lymphokines or other
hematopoietic factors, protein or other active ingredient of the present
invention may be
administered either simultaneously with the cytokine(s), lymphokine(s), other
hematopoietic
factor(s), thrombolytic or anti-thrombotic factors, or sequentially. If
administered sequentially,
the attending physician will decide on the appropriate sequence of
administering protein or other
active ingredient of the present invention in combination with cytokine(s),
lymphokine(s), other
hematopoietic factor(s), thrombolytic or anti-thrombotic factors.
4.12.1 ROUTES OF ADMINISTRATION
Suitable routes of administration may, for example, include oral, rectal,
transmucosal, or
intestinal administration; parenteral delivery, including intramuscular,
subcutaneous,
intramedullary injections, as well as intrathecal, direct intraventricular,
intravenous,
intraperitoneal, intranasal, or intraocular injections. Administration of
protein or other active
ingredient of the present invention used in the pharmaceutical composition or
to practice the
method of the present invention can be carried out in a variety of
conventional ways, such as oral
ingestion, inhalation, topical application or cutaneous, subcutaneous,
intraperitoneal, parenteral
or intravenous injection. Intravenous administration to the patient is
preferred.
Alternately, one may administer the compound in a local rather than systemic
manner, for
example, via injection of the compound directly into arthritic joints or in
fibrotic tissue, often in
a depot or sustained release formulation. In order to prevent the scarring
process frequently
occurring as a complication of glaucoma surgery, the compounds may be
administered topically,
for example, as eye drops. Furthermore, one may administer the drug in a
targeted drug delivery
system, for example, in a liposome coated with a specific antibody, targeting,
for example,
arthritic or fibrotic tissue. The liposomes will be targeted to and taken up
selectively by the
afflicted tissue.
The polypeptides of the invention are administered by any route that delivers
an effective
dosage to the desired site of action. The determination of a suitable route of
administration and
an effective dosage for a particular indication is within the level of skill
in the art. Preferably for
wound treatment, one administers the therapeutic compound directly to the
site. Suitable dosage
ranges for the polypeptides of the invention can be extrapolated from these
dosages or from
similar studies in appropriate animal models. Dosages can then be adjusted as
necessary by the
clinician to provide maximal therapeutic benefit.
4.12.2 COMPOSITIONS/FORMULATIONS
Pharmaceutical compositions for use in accordance with the present invention
thus may
be formulated in a conventional manner using one or more physiologically
acceptable carriers
comprising excipients and auxiliaries which facilitate processing of the
active compounds into
preparations which can be used pharmaceutically. These pharmaceutical
compositions may be
manufactured in a manner that is itself known, e.g., by means of conventional
mixing,
dissolving, granulating, dragee-making, levigating, emulsifying,
encapsulating, entrapping or
lyophilizing processes. Proper formulation is dependent upon the route of
administration chosen.
When a therapeutically effective amount of protein or other active ingredient
of the present
invention is administered orally, protein or other active ingredient of the
present invention will
be in the form of a tablet, capsule, powder, solution or elixir. When
administered in tablet form,
the pharmaceutical composition of the invention may additionally contain a
solid carrier such as
a gelatin or an adjuvant. The tablet, capsule, and powder contain from about 5
to 95% protein or
other active ingredient of the present invention, and preferably from about 25
to 90% protein or
other active ingredient of the present invention. When administered in liquid
form, a liquid
carrier such as water, petroleum, oils of animal or plant origin such as
peanut oil, mineral oil,
soybean oil, or sesame oil, or synthetic oils may be added. The liquid form of
the
pharmaceutical composition may further contain physiological saline solution,
dextrose or other
saccharide solution, or glycols such as ethylene glycol, propylene glycol or
polyethylene glycol.
When administered in liquid form, the pharmaceutical composition contains from
about 0.5 to
90% by weight of protein or other active ingredient of the present invention,
and preferably from
about 1 to 50% protein or other active ingredient of the present invention.
When a therapeutically effective amount of protein or other active ingredient
of the
present invention is administered by intravenous, cutaneous or subcutaneous
injection, protein or
other active ingredient of the present invention will be in the form of a
pyrogen-free, parenterally
acceptable aqueous solution. The preparation of such parenterally acceptable
protein or other
active ingredient solutions, having due regard to pH, isotonicity, stability,
and the like, is within
the skill in the art. A preferred pharmaceutical composition for intravenous,
cutaneous, or
subcutaneous injection should contain, in addition to protein or other active
ingredient of the
present invention, an isotonic vehicle such as Sodium Chloride Injection,
Ringer's Injection,
Dextrose Injection, Dextrose and Sodium Chloride Injection, Lactated Ringer's
Injection, or
other vehicle as known in the art. The pharmaceutical composition of the
present invention may
also contain stabilizers, preservatives, buffers, antioxidants, or other
additives known to those of
skill in the art. For injection, the agents of the invention may be
formulated in aqueous solutions,
preferably in physiologically compatible buffers such as Hanks' solution,
Ringer's solution, or
physiological saline buffer. For transmucosal administration, penetrants
appropriate to the
barrier to be permeated are used in the formulation. Such penetrants are
generally known in the
art.
For oral administration, the compounds can be formulated readily by combining
the
active compounds with pharmaceutically acceptable carriers well known in the
art. Such carriers
enable the compounds of the invention to be formulated as tablets, pills,
dragees, capsules,
liquids, gels, syrups, slurries, suspensions and the like, for oral ingestion
by a patient to be
treated. Pharmaceutical preparations for oral use can be obtained from a solid
excipient,
optionally grinding a resulting mixture, and processing the mixture of
granules, after adding
suitable auxiliaries, if desired, to obtain tablets or dragee cores. Suitable
excipients are, in
particular, fillers such as sugars, including lactose, sucrose, mannitol, or
sorbitol; cellulose
preparations such as, for example, maize starch, wheat starch, rice starch,
potato starch, gelatin,
gum tragacanth, methyl cellulose, hydroxypropylmethyl-cellulose, sodium
carboxymethylcellulose, and/or polyvinylpyrrolidone (PVP). If desired,
disintegrating agents
may be added, such as the cross-linked polyvinyl pyrrolidone, agar, or alginic
acid or a salt
thereof such as sodium alginate. Dragee cores are provided with suitable
coatings. For this
purpose, concentrated sugar solutions may be used, which may optionally
contain gum arabic,
talc, polyvinyl pyrrolidone, carbopol gel, polyethylene glycol, and/or
titanium dioxide, lacquer
solutions, and suitable organic solvents or solvent mixtures. Dyestuffs or
pigments may be
added to the tablets or dragee coatings for identification or to characterize
different combinations
of active compound doses.
Pharmaceutical preparations which can be used orally include push-fit capsules
made of
gelatin, as well as soft, sealed capsules made of gelatin and a plasticizer,
such as glycerol or
sorbitol. The push-fit capsules can contain the active ingredients in
admixture with filler such as
lactose, binders such as starches, and/or lubricants such as talc or magnesium
stearate and,
optionally, stabilizers. In soft capsules, the active compounds may be
dissolved or suspended in
suitable liquids, such as fatty oils, liquid paraffin, or liquid polyethylene
glycols. In addition,
stabilizers may be added. All formulations for oral administration should be
in dosages suitable
for such administration. For buccal administration, the compositions may take
the form of
tablets or lozenges formulated in conventional manner.
For administration by inhalation, the compounds for use according to the
present
invention are conveniently delivered in the form of an aerosol spray
presentation from
pressurized packs or a nebuliser, with the use of a suitable propellant, e.g.,
dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane,
carbon dioxide or
other suitable gas. In the case of a pressurized aerosol the dosage unit may
be determined by
providing a valve to deliver a metered amount. Capsules and cartridges of,
e.g., gelatin for use in
an inhaler or insufflator may be formulated containing a powder mix of the
compound and a
suitable powder base such as lactose or starch. The compounds may be
formulated for parenteral
administration by injection, e.g., by bolus injection or continuous infusion.
Formulations for
injection may be presented in unit dosage form, e.g., in ampules or in multi-
dose containers, with
an added preservative. The compositions may take such forms as suspensions,
solutions or
emulsions in oily or aqueous vehicles, and may contain formulatory agents such
as suspending,
stabilizing and/or dispersing agents.
Pharmaceutical formulations for parenteral administration include aqueous
solutions of
the active compounds in water-soluble form. Additionally, suspensions of the
active compounds
may be prepared as appropriate oily injection suspensions. Suitable lipophilic
solvents or
vehicles include fatty oils such as sesame oil, or synthetic fatty acid
esters, such as ethyl oleate or
triglycerides, or liposomes. Aqueous injection suspensions may contain
substances which
increase the viscosity of the suspension, such as sodium carboxymethyl
cellulose, sorbitol, or
dextran. Optionally, the suspension may also contain suitable stabilizers or
agents which
increase the solubility of the compounds to allow for the preparation of
highly concentrated
solutions. Alternatively, the active ingredient may be in powder form for
constitution with a
suitable vehicle, e.g., sterile pyrogen-free water, before use.
The compounds may also be formulated in rectal compositions such as
suppositories or
retention enemas, e.g., containing conventional suppository bases such as
cocoa butter or other
glycerides. In addition to the formulations described previously, the
compounds may also be
formulated as a depot preparation. Such long acting formulations may be
administered by
implantation (for example subcutaneously or intramuscularly) or by
intramuscular injection.
Thus, for example, the compounds may be formulated with suitable polymeric or
hydrophobic
materials (for example as an emulsion in an acceptable oil) or ion exchange
resins, or as
sparingly soluble derivatives, for example, as a sparingly soluble salt.
A pharmaceutical carrier for the hydrophobic compounds of the invention is a
co-solvent
system comprising benzyl alcohol, a nonpolar surfactant, a water-miscible
organic polymer, and
an aqueous phase. The co-solvent system may be the VPD co-solvent system. VPD
is a solution
of 3% w/v benzyl alcohol, 8% w/v of the nonpolar surfactant polysorbate 80,
and 65% w/v
polyethylene glycol 300, made up to volume in absolute ethanol. The VPD co-
solvent system
(VPD:5W) consists of VPD diluted 1:1 with a 5% dextrose in water solution.
This co-solvent
system dissolves hydrophobic compounds well, and itself produces low toxicity
upon systemic
administration. Naturally, the proportions of a co-solvent system may be
varied considerably
without destroying its solubility and toxicity characteristics. Furthermore,
the identity of the
co-solvent components may be varied: for example, other low-toxicity nonpolar
surfactants may
be used instead of polysorbate 80; the fraction size of polyethylene glycol
may be varied; other
biocompatible polymers may replace polyethylene glycol, e.g. polyvinyl
pyrrolidone; and other
sugars or polysaccharides may substitute for dextrose. Alternatively, other
delivery systems for
hydrophobic pharmaceutical compounds may be employed. Liposomes and emulsions
are well
known examples of delivery vehicles or carriers for hydrophobic drugs. Certain
organic solvents
such as dimethylsulfoxide also may be employed, although usually at the cost
of greater toxicity.
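The 1:1 dilution of VPD with 5% dextrose in water described above halves each component's concentration in the final mixture. The sketch below works through that arithmetic; the function name dilute is hypothetical, and it assumes volumes are additive on mixing (ideal mixing), which the text does not state.

```python
# Concentrations in % w/v as given in the text.
VPD_PERCENT_WV = {
    "benzyl alcohol": 3.0,
    "polysorbate 80": 8.0,
    "polyethylene glycol 300": 65.0,
}
D5W_PERCENT_WV = {"dextrose": 5.0}


def dilute(stock, diluent, stock_parts=1, diluent_parts=1):
    """Return % w/v of every component after mixing stock and diluent by volume.

    Assumes additive volumes (an assumption of this sketch).
    """
    if stock_parts <= 0 or diluent_parts < 0:
        raise ValueError("parts must be positive (diluent may be zero)")
    total_parts = stock_parts + diluent_parts
    mixed = {}
    for name, percent in stock.items():
        mixed[name] = mixed.get(name, 0.0) + percent * stock_parts / total_parts
    for name, percent in diluent.items():
        mixed[name] = mixed.get(name, 0.0) + percent * diluent_parts / total_parts
    return mixed


if __name__ == "__main__":
    for name, percent in dilute(VPD_PERCENT_WV, D5W_PERCENT_WV).items():
        print(f"{name}: {percent:.2f}% w/v")
```

The same helper can be used to explore the varied proportions the text mentions by changing stock_parts and diluent_parts.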
Additionally, the compounds may be delivered using a sustained-release system,
such as
semipermeable matrices of solid hydrophobic polymers containing the
therapeutic agent.
Various types of sustained-release materials have been established and are
well known by those
skilled in the art. Sustained-release capsules may, depending on their
chemical nature, release the
compounds for a few weeks up to over 100 days. Depending on the chemical
nature and the
biological stability of the therapeutic reagent, additional strategies for
protein or other active
ingredient stabilization may be employed.
The pharmaceutical compositions also may comprise suitable solid or gel phase
carriers
or excipients. Examples of such carriers or excipients include but are not
limited to calcium
carbonate, calcium phosphate, various sugars, starches, cellulose derivatives,
gelatin, and
polymers such as polyethylene glycols. Many of the active ingredients of the
invention may be
provided as salts with pharmaceutically compatible counter ions. Such
pharmaceutically
acceptable base addition salts are those salts which retain the biological
effectiveness and
properties of the free acids and which are obtained by reaction with inorganic
or organic bases
such as sodium hydroxide, magnesium hydroxide, ammonia, trialkylamine,
dialkylamine,
monoalkylamine, dibasic amino acids, sodium acetate, potassium benzoate,
triethanol amine and
the like.
The pharmaceutical composition of the invention may be in the form of a
complex of the
protein(s) or other active ingredient(s) of the present invention along with
protein or peptide
antigens. The protein and/or peptide antigen will deliver a stimulatory signal
to both B and T
lymphocytes. B lymphocytes will respond to antigen through their surface
immunoglobulin
receptor. T lymphocytes will respond to antigen through the T cell receptor
(TCR) following
presentation of the antigen by MHC proteins. MHC and structurally related
proteins including
those encoded by class I and class II MHC genes on host cells will serve to
present the peptide
antigen(s) to T lymphocytes. The antigen components could also be supplied as
purified
MHC-peptide complexes alone or with co-stimulatory molecules that can directly
signal T cells.
Alternatively antibodies able to bind surface immunoglobulin and other
molecules on B cells as
well as antibodies able to bind the TCR and other molecules on T cells can be
combined with the
pharmaceutical composition of the invention.
The pharmaceutical composition of the invention may be in the form of a
liposome in
which protein of the present invention is combined, in addition to other
pharmaceutically
acceptable carriers, with amphipathic agents such as lipids which exist in
aggregated form as
micelles, insoluble monolayers, liquid crystals, or lamellar layers in aqueous
solution. Suitable
lipids for liposomal formulation include, without limitation, monoglycerides,
diglycerides,
sulfatides, lysolecithins, phospholipids, saponin, bile acids, and the like.
Preparation of such
liposomal formulations is within the level of skill in the art, as disclosed,
for example, in U.S.
Patent Nos. 4,235,871; 4,501,728; 4,837,028; and 4,737,323, all of which are
incorporated
herein by reference.
The amount of protein or other active ingredient of the present invention in
the
pharmaceutical composition of the present invention will depend upon the
nature and severity of
the condition being treated, and on the nature of prior treatments which the
patient has
undergone. Ultimately, the attending physician will decide the amount of
protein or other active
ingredient of the present invention with which to treat each individual
patient. Initially, the
attending physician will administer low doses of protein or other active
ingredient of the present
invention and observe the patient's response. Larger doses of protein or other
active ingredient
of the present invention may be administered until the optimal therapeutic
effect is obtained for
the patient, and at that point the dosage is not increased further. It is
contemplated that the
various pharmaceutical compositions used to practice the method of the present
invention should
contain about 0.01 µg to about 100 mg (preferably about 0.1 µg to about 10
mg, more preferably
about 0.1 µg to about 1 mg) of protein or other active ingredient of the
present invention per kg
body weight. For compositions of the present invention which are useful for
bone, cartilage,
tendon or ligament regeneration, the therapeutic method includes administering
the composition
topically, systemically, or locally as an implant or device. When
administered, the therapeutic
composition for use in this invention is, of course, in a pyrogen-free,
physiologically acceptable
form. Further, the composition may desirably be encapsulated or injected in a
viscous form for
delivery to the site of bone, cartilage or tissue damage. Topical
administration may be suitable
for wound healing and tissue repair. Therapeutically useful agents other than
a protein or other
active ingredient of the invention which may also optionally be included in
the composition as
described above, may alternatively or additionally, be administered
simultaneously or
sequentially with the composition in the methods of the invention. Preferably
for bone and/or
cartilage formation, the composition would include a matrix capable of
delivering the
protein-containing or other active ingredient-containing composition to the
site of bone and/or
cartilage damage, providing a structure for the developing bone and cartilage
and optimally
capable of being resorbed into the body. Such matrices may be formed of
materials presently in
use for other implanted medical applications.
The choice of matrix material is based on biocompatibility, biodegradability,
mechanical
properties, cosmetic appearance and interface properties. The particular
application of the
compositions will define the appropriate formulation. Potential matrices for
the compositions
may be biodegradable and chemically defined calcium sulfate, tricalcium
phosphate,
hydroxyapatite, polylactic acid, polyglycolic acid and polyanhydrides. Other
potential materials
are biodegradable and biologically well-defined, such as bone or dermal
collagen. Further
matrices are comprised of pure proteins or extracellular matrix components.
Other potential
matrices are nonbiodegradable and chemically defined, such as sintered
hydroxyapatite, bioglass,
aluminates, or other ceramics. Matrices may be comprised of combinations of
any of the above
mentioned types of material, such as polylactic acid and hydroxyapatite or
collagen and
tricalcium phosphate. The bioceramics may be altered in composition, such as
in
calcium-aluminate-phosphate and processing to alter pore size, particle size,
particle shape, and
biodegradability. Presently preferred is a 50:50 (mole weight) copolymer of
lactic acid and
glycolic acid in the form of porous particles having diameters ranging from
150 to 800 microns.
In some applications, it will be useful to utilize a sequestering agent, such
as carboxymethyl
cellulose or autologous blood clot, to prevent the protein compositions from
disassociating from
the matrix.
A preferred family of sequestering agents is cellulosic materials such as
alkylcelluloses
(including hydroxyalkylcelluloses), including methylcellulose, ethylcellulose,
hydroxyethylcellulose, hydroxypropylcellulose, hydroxypropyl-methylcellulose,
and
carboxymethylcellulose, the most preferred being cationic salts of
carboxymethylcellulose
(CMC). Other preferred sequestering agents include hyaluronic acid, sodium
alginate,
poly(ethylene glycol), polyoxyethylene oxide, carboxyvinyl polymer and
poly(vinyl alcohol).
The amount of sequestering agent useful herein is 0.5-20 wt %, preferably 1-10
wt % based on
total formulation weight, which represents the amount necessary to prevent
desorption of the
protein from the polymer matrix and to provide appropriate handling of the
composition, yet not
so much that the progenitor cells are prevented from infiltrating the matrix,
thereby providing the
protein the opportunity to assist the osteogenic activity of the progenitor
cells. In further
compositions, proteins or other active ingredients of the invention may be
combined with other
agents beneficial to the treatment of the bone and/or cartilage defect, wound,
or tissue in
question. These agents include various growth factors such as epidermal growth
factor (EGF),
platelet derived growth factor (PDGF), transforming growth factors (TGF-α and
TGF-β), and
insulin-like growth factor (IGF).
The therapeutic compositions are also presently valuable for veterinary
applications.
Particularly domestic animals and thoroughbred horses, in addition to humans,
are desired
patients for such treatment with proteins or other active ingredients of the
present invention. The
dosage regimen of a protein-containing pharmaceutical composition to be used
in tissue
regeneration will be determined by the attending physician considering various
factors which
modify the action of the proteins, e.g., amount of tissue weight desired to be
formed, the site of
damage, the condition of the damaged tissue, the size of a wound, type of
damaged tissue (e.g.,
bone), the patient's age, sex, and diet, the severity of any infection, time
of administration and
other clinical factors. The dosage may vary with the type of matrix used in
the reconstitution and
with inclusion of other proteins in the pharmaceutical composition. For
example, the addition of
other known growth factors, such as IGF I (insulin like growth factor I), to
the final composition,
may also affect the dosage. Progress can be monitored by periodic assessment
of tissue/bone
growth and/or repair, for example, X-rays, histomorphometric determinations
and tetracycline
labeling.
Polynucleotides of the present invention can also be used for gene therapy.
Such
polynucleotides can be introduced either in vivo or ex vivo into cells for
expression in a
mammalian subject. Polynucleotides of the invention may also be administered
by other known
methods for introduction of nucleic acid into a cell or organism (including,
without limitation, in
the form of viral vectors or naked DNA). Cells may also be cultured ex vivo in
the presence of
proteins of the present invention in order to proliferate or to produce a
desired effect on or
activity in such cells. Treated cells can then be introduced in vivo for
therapeutic purposes.
4.12.3 EFFECTIVE DOSAGE
Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective
amount to achieve their
intended purpose. More specifically, a therapeutically effective amount means
an amount
effective to prevent development of or to alleviate the existing symptoms of
the subject being
treated. Determination of the effective amount is well within the capability
of those skilled in
the art, especially in light of the detailed disclosure provided herein. For
any compound used in
the method of the invention, the therapeutically effective dose can be
estimated initially from
appropriate in vitro assays. For example, a dose can be formulated in animal
models to achieve a
circulating concentration range that can be used to more accurately determine
useful doses in
humans. For example, a dose can be formulated in animal models to achieve a
circulating
concentration range that includes the IC50 as determined in cell culture
(i.e., the concentration of
the test compound which achieves a half maximal inhibition of the protein's
biological activity).
Such information can be used to more accurately determine useful doses in
humans.
A therapeutically effective dose refers to that amount of the compound that
results in
amelioration of symptoms or a prolongation of survival in a patient. Toxicity
and therapeutic
efficacy of such compounds can be determined by standard pharmaceutical
procedures in cell
cultures or experimental animals, e.g., for determining the LD50 (the dose
lethal to 50% of the
population) and the ED50 (the dose therapeutically effective in 50% of the
population). The dose
ratio between toxic and therapeutic effects is the therapeutic index and it
can be expressed as the
ratio between LD50 and ED50. Compounds which exhibit high therapeutic indices
are preferred.
The data obtained from these cell culture assays and animal studies can be
used in formulating a
range of dosage for use in humans. The dosage of such compounds lies preferably
within a range
of circulating concentrations that include the ED50 with little or no
toxicity. The dosage may
vary within this range depending upon the dosage form employed and the route
of administration
utilized. The exact formulation, route of administration and dosage can be
chosen by the
individual physician in view of the patient's condition. See, e.g., Fingl et
al., 1975, in "The
Pharmacological Basis of Therapeutics", Ch. 1 p.1. Dosage amount and interval
may be adjusted
individually to provide plasma levels of the active moiety which are
sufficient to maintain the
desired effects, or minimal effective concentration (MEC). The MEC will vary
for each
compound but can be estimated from in vitro data. Dosages necessary to achieve
the MEC will
depend on individual characteristics and route of administration. However,
HPLC assays or
bioassays can be used to determine plasma concentrations.
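The therapeutic index described above is a simple ratio, and a short calculation makes the comparison between compounds concrete. The following is a minimal sketch only: the function name, the compound names and all LD50 and ED50 values are hypothetical and are not taken from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return the therapeutic index, i.e. the ratio LD50 / ED50.

    Both values must be expressed in the same dose units (e.g. mg/kg).
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Hypothetical candidates: name -> (LD50, ED50), both in mg/kg.
candidates = {
    "compound_a": (500.0, 10.0),
    "compound_b": (200.0, 20.0),
}

# Rank candidates from highest to lowest therapeutic index,
# since compounds with high indices are preferred.
ranked = sorted(
    candidates,
    key=lambda name: therapeutic_index(*candidates[name]),
    reverse=True,
)

for name in ranked:
    print(name, therapeutic_index(*candidates[name]))
```

A higher ratio indicates a wider margin between the therapeutically effective dose and the lethal dose.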
Dosage intervals can also be determined using the MEC value. Compounds should be
administered using a regimen which maintains plasma levels above the MEC for
10-90% of the
time, preferably between 30-90% and most preferably between 50-90%. In cases
of local
administration or selective uptake, the effective local concentration of the
drug may not be
related to plasma concentration.
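The fraction of time that plasma levels remain above the MEC can be estimated from sampled plasma concentrations. The sketch below assumes linear interpolation between sampling times, which is a simplification and not a method stated in this disclosure; the function names, sampling times, concentrations and MEC value are all hypothetical.

```python
def fraction_time_above_mec(times, concs, mec):
    """Estimate the fraction of the sampled interval spent at or above the MEC.

    Assumes the concentration changes linearly between consecutive samples.
    """
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration samples")
    above = 0.0
    for i in range(len(times) - 1):
        t0, t1 = times[i], times[i + 1]
        c0, c1 = concs[i], concs[i + 1]
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        if c0 >= mec and c1 >= mec:
            above += dt                      # whole interval above the MEC
        elif c0 < mec and c1 < mec:
            continue                         # whole interval below the MEC
        else:
            # The curve crosses the MEC once inside this interval.
            t_cross = t0 + dt * (mec - c0) / (c1 - c0)
            above += (t_cross - t0) if c0 >= mec else (t1 - t_cross)
    return above / (times[-1] - times[0])


def preference_band(fraction):
    """Map a coverage fraction onto the ranges named in the text."""
    if 0.5 <= fraction <= 0.9:
        return "most preferred (50-90%)"
    if 0.3 <= fraction <= 0.9:
        return "preferred (30-90%)"
    if 0.1 <= fraction <= 0.9:
        return "within 10-90%"
    return "outside 10-90%"


# Hypothetical 12-hour dosing interval (hours, arbitrary concentration units).
times = [0.0, 2.0, 4.0, 8.0, 12.0]
concs = [0.0, 8.0, 6.0, 2.0, 0.0]
coverage = fraction_time_above_mec(times, concs, mec=4.0)
print(round(coverage, 3), preference_band(coverage))
```

As the text notes, such a plasma-based estimate may not reflect the effective local concentration in cases of local administration or selective uptake.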
An exemplary dosage regimen for polypeptides or other compositions of the
invention
will be in the range of about 0.01 µg/kg to 100 mg/kg of body weight daily,
with the preferred
dose being about 0.1 µg/kg to 25 mg/kg of patient body weight daily, varying
in adults and
children. Dosing may be once daily, or equivalent doses may be delivered at
longer or shorter
intervals.
The amount of composition administered will, of course, be dependent on the
subject
being treated, on the subject's age and weight, the severity of the
affliction, the manner of
administration and the judgment of the prescribing physician.
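The per-kilogram ranges given in this section translate into total daily amounts by simple multiplication by body weight. The sketch below illustrates only that arithmetic; the function name and the 70 kg body weight are hypothetical, and the result is not a dosing recommendation, since the text leaves the actual amount to the judgment of the prescribing physician.

```python
UG_PER_MG = 1000.0  # micrograms per milligram

# Ranges stated in the text, converted to mg per kg of body weight per day.
BROAD_RANGE_MG_PER_KG = (0.01 / UG_PER_MG, 100.0)     # 0.01 µg/kg to 100 mg/kg
PREFERRED_RANGE_MG_PER_KG = (0.1 / UG_PER_MG, 25.0)   # 0.1 µg/kg to 25 mg/kg


def daily_dose_range_mg(body_weight_kg, low_mg_per_kg, high_mg_per_kg):
    """Return (low, high) total daily dose in mg for the given body weight."""
    if body_weight_kg <= 0:
        raise ValueError("body weight must be positive")
    if low_mg_per_kg > high_mg_per_kg:
        raise ValueError("lower bound exceeds upper bound")
    return body_weight_kg * low_mg_per_kg, body_weight_kg * high_mg_per_kg


# Hypothetical 70 kg adult.
low_mg, high_mg = daily_dose_range_mg(70.0, *PREFERRED_RANGE_MG_PER_KG)
print(f"preferred daily range: {low_mg:.4f} mg to {high_mg:.1f} mg")
```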
4.12.4 PACKAGING
The compositions may, if desired, be presented in a pack or dispenser device
which may
contain one or more unit dosage forms containing the active ingredient. The
pack may, for
example, comprise metal or plastic foil, such as a blister pack. The pack or
dispenser device may
be accompanied by instructions for administration. Compositions comprising a
compound of the
invention formulated in a compatible pharmaceutical carrier may also be
prepared, placed in an
appropriate container, and labeled for treatment of an indicated condition.
4.13 ANTIBODIES
Also included in the invention are antibodies to proteins, or fragments of
proteins of the
invention. The term "antibody" as used herein refers to immunoglobulin
molecules and
immunologically active portions of immunoglobulin (Ig) molecules, i.e.,
molecules that contain
an antigen binding site that specifically binds (immunoreacts with) an
antigen. Such antibodies
include, but are not limited to, polyclonal, monoclonal, chimeric, single
chain, Fab, Fab' and F(ab')2
fragments, and an Fab expression library. In general, an antibody molecule
obtained from
humans relates to any of the classes IgG, IgM, IgA, IgE and IgD, which differ
from one another
by the nature of the heavy chain present in the molecule. Certain classes have
subclasses as well,
such as IgG1, IgG2, and others. Furthermore, in humans, the light chain may be
a kappa chain or
a lambda chain. Reference herein to antibodies includes a reference to all
such classes,
subclasses and types of human antibody species.
An isolated related protein of the invention may be intended to serve as an
antigen, or a
portion or fragment thereof, and additionally can be used as an immunogen to
generate
antibodies that immunospecifically bind the antigen, using standard techniques
for polyclonal
and monoclonal antibody preparation. The full-length protein can be used or,
alternatively, the
invention provides antigenic peptide fragments of the antigen for use as
immunogens. An
antigenic peptide fragment comprises at least 6 amino acid residues of the
amino acid sequence
of the full length protein, such as an amino acid sequence shown in SEQ ID
NO:985, and
encompasses an epitope thereof such that an antibody raised against the
peptide forms a specific
immune complex with the full length protein or with any fragment that contains
the epitope.
Preferably, the antigenic peptide comprises at least 10 amino acid residues,
or at least 15 amino
acid residues, or at least 20 amino acid residues, or at least 30 amino acid
residues. Preferred
epitopes encompassed by the antigenic peptide are regions of the protein that
are located on its
surface; commonly these are hydrophilic regions.
In certain embodiments of the invention, at least one epitope encompassed by
the
antigenic peptide is a region of the related protein that is located on the
surface of the protein, e.g., a
hydrophilic region. A hydrophobicity analysis of the human related protein
sequence will
indicate which regions of a related protein are particularly hydrophilic and,
therefore, are likely
to encode surface residues useful for targeting antibody production. As a
means for targeting
antibody production, hydropathy plots showing regions of hydrophilicity and
hydrophobicity
may be generated by any method well known in the art, including, for example,
the Kyte-Doolittle
or the Hopp-Woods methods, either with or without Fourier
transformation. See, e.g.,
Hopp and Woods, 1981, Proc. Nat. Acad. Sci. USA 78: 3824-3828; Kyte and
Doolittle 1982, J.
Mol. Biol. 157: 105-142, each of which is incorporated herein by reference in
its entirety.
Antibodies that are specific for one or more domains within an antigenic
protein, or derivatives,
fragments, analogs or homologs thereof, are also provided herein.
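A hydropathy analysis of the kind referred to above can be sketched as a sliding-window average over the Kyte-Doolittle hydropathy scale, in which positive values indicate hydrophobic residues and negative values indicate hydrophilic residues. The sketch below makes several assumptions that are not part of this disclosure: the window length of 7, the threshold of 0.0, the function names and the example sequence are all hypothetical, and no Fourier transformation is applied.

```python
# Kyte-Doolittle hydropathy values (Kyte and Doolittle, 1982).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=7):
    """Return the mean hydropathy of every window along the sequence."""
    sequence = sequence.upper()
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def hydrophilic_windows(sequence, window=7, threshold=0.0):
    """Return (start index, peptide, score) for windows below the threshold.

    Windows with a negative mean score are hydrophilic on this scale and are
    therefore candidate surface regions.
    """
    sequence = sequence.upper()
    profile = hydropathy_profile(sequence, window)
    return [
        (start, sequence[start:start + window], score)
        for start, score in enumerate(profile)
        if score < threshold
    ]


# Hypothetical example sequence, for illustration only.
example = "MKTIIALSYIFCLVFADYKDDDDK"
for start, peptide, score in hydrophilic_windows(example):
    print(start, peptide, round(score, 2))
```

Candidate regions found this way would still need to be confirmed experimentally, since hydrophilicity only suggests surface exposure.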
A protein of the invention, or a derivative, fragment, analog, homolog or
ortholog
thereof, may be utilized as an immunogen in the generation of antibodies that
immunospecifically bind these protein components.
Various procedures known within the art may be used for the production of
polyclonal or
monoclonal antibodies directed against a protein of the invention, or against
derivatives,
fragments, analogs, homologs or orthologs thereof (see, for example,
Antibodies: A Laboratory
Manual, Harlow E, and Lane D, 1988, Cold Spring Harbor Laboratory Press, Cold
Spring
Harbor, NY, incorporated herein by reference). Some of these antibodies are
discussed below.
5.13.1 Polyclonal Antibodies
For the production of polyclonal antibodies, various suitable host animals
(e.g., rabbit,
goat, mouse or other mammal) may be immunized by one or more injections with
the native
protein, a synthetic variant thereof, or a derivative of the foregoing. An
appropriate
immunogenic preparation can contain, for example, the naturally occurring
immunogenic
protein, a chemically synthesized polypeptide representing the immunogenic
protein, or a
recombinantly expressed immunogenic protein. Furthermore, the protein may be
conjugated to
a second protein known to be immunogenic in the mammal being immunized.
Examples of such
immunogenic proteins include but are not limited to keyhole limpet hemocyanin,
serum albumin,
bovine thyroglobulin, and soybean trypsin inhibitor. The preparation can
further include an
adjuvant. Various adjuvants used to increase the immunological response
include, but are not
limited to, Freund's (complete and incomplete), mineral gels (e.g., aluminum
hydroxide), surface
active substances (e.g., lysolecithin, pluronic polyols, polyanions, peptides,
oil emulsions,
dinitrophenol, etc.), adjuvants usable in humans such as Bacille Calmette-
Guerin and
Corynebacterium parvum, or similar immunostimulatory agents. Additional
examples of
adjuvants which can be employed include MPL-TDM adjuvant (monophosphoryl Lipid
A,
synthetic trehalose dicorynomycolate).
The polyclonal antibody molecules directed against the immunogenic protein can
be
isolated from the mammal (e.g., from the blood) and further purified by well
known techniques,
such as affinity chromatography using protein A or protein G, which provide
primarily the IgG
fraction of immune serum. Subsequently, or alternatively, the specific antigen
which is the
target of the immunoglobulin sought, or an epitope thereof, may be immobilized
on a column to
purify the immune specific antibody by immunoaffinity chromatography.
Purification of
immunoglobulins is discussed, for example, by D. Wilkinson (The Scientist,
published by The
Scientist, Inc., Philadelphia PA, Vol. 14, No. 8 (April 17, 2000), pp. 25-28).
5.13.2 Monoclonal Antibodies
The term "monoclonal antibody" (MAb) or "monoclonal antibody composition", as
used
herein, refers to a population of antibody molecules that contain only one
molecular species of
antibody molecule consisting of a unique light chain gene product and a unique
heavy chain
gene product. In particular, the complementarity determining regions (CDRs) of
the monoclonal
antibody are identical in all the molecules of the population. MAbs thus
contain an antigen
binding site capable of immunoreacting with a particular epitope of the
antigen characterized by
a unique binding affinity for it.
Monoclonal antibodies can be prepared using hybridoma methods, such as those
described by Kohler and Milstein, Nature, 256:495 (1975). In a hybridoma
method, a mouse,
hamster, or other appropriate host animal, is typically immunized with an
immunizing agent to
elicit lymphocytes that produce or are capable of producing antibodies that
will specifically bind
to the immunizing agent. Alternatively, the lymphocytes can be immunized in
vitro.
The immunizing agent will typically include the protein antigen, a fragment
thereof or a fusion
protein thereof. Generally, either peripheral blood lymphocytes are used if
cells of human origin
are desired, or spleen cells or lymph node cells are used if non-human
mammalian sources are
desired. The lymphocytes are then fused with an immortalized cell line using a
suitable fusing
agent, such as polyethylene glycol, to form a hybridoma cell (Goding,
Monoclonal Antibodies:
Principles and Practice, Academic Press, (1986) pp. 59-103). Immortalized cell
lines are usually
transformed mammalian cells, particularly myeloma cells of rodent, bovine and
human origin.
Usually, rat or mouse myeloma cell lines are employed. The hybridoma cells can
be cultured in
a suitable culture medium that preferably contains one or more substances that
inhibit the growth
or survival of the unfused, immortalized cells. For example, if the parental
cells lack the enzyme
hypoxanthine guanine phosphoribosyl transferase (HGPRT or HPRT), the culture
medium for
the hybridomas typically will include hypoxanthine, aminopterin, and thymidine
("HAT
medium"), which substances prevent the growth of HGPRT-deficient cells.
Preferred immortalized cell lines are those that fuse efficiently, support
stable high level
expression of antibody by the selected antibody-producing cells, and are
sensitive to a medium
such as HAT medium. More preferred immortalized cell lines are murine myeloma
lines, which
can be obtained, for instance, from the Salk Institute Cell Distribution
Center, San Diego,
California and the American Type Culture Collection, Manassas, Virginia. Human
myeloma and
mouse-human heteromyeloma cell lines also have been described for the
production of human
monoclonal antibodies (Kozbor, J. Immunol., 133:3001 (1984); Brodeur et al.,
Monoclonal
Antibody Production Techniques and Applications, Marcel Dekker, Inc., New
York, (1987) pp.
51-63).
The culture medium in which the hybridoma cells are cultured can then be
assayed for
the presence of monoclonal antibodies directed against the antigen.
Preferably, the binding
specificity of monoclonal antibodies produced by the hybridoma cells is
determined by
immunoprecipitation or by an in vitro binding assay, such as radioimmunoassay
(RIA) or
enzyme-linked immunosorbent assay (ELISA). Such techniques and assays are
known in the
art. The binding affinity of the monoclonal antibody can, for example, be
determined by the
Scatchard analysis of Munson and Pollard, Anal. Biochem., 107:220 (1980).
Preferably,
antibodies having a high degree of specificity and a high binding affinity for
the target antigen
are isolated.
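The binding affinity determination mentioned above can be illustrated with the classical linear Scatchard transformation, in which the ratio of bound to free ligand is plotted against bound ligand; for a single class of non-interacting sites the slope equals -1/Kd and the x-intercept equals Bmax. The sketch below is an illustration under that single-site assumption and is not the fitting procedure of the cited reference; the function name and the synthetic data (Kd of 2.0 and Bmax of 10.0, in arbitrary units) are hypothetical.

```python
def scatchard_fit(bound, free):
    """Fit bound/free against bound by least squares.

    Returns (Kd, Bmax) assuming a single class of non-interacting sites,
    for which bound/free = (Bmax - bound) / Kd.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two matching bound/free values")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope >= 0:
        raise ValueError("slope must be negative for a single-site model")
    kd = -1.0 / slope          # slope = -1 / Kd
    bmax = intercept * kd      # y-intercept = Bmax / Kd
    return kd, bmax


# Synthetic, noise-free data generated from a single-site binding model.
TRUE_KD, TRUE_BMAX = 2.0, 10.0
free_ligand = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_ligand = [TRUE_BMAX * f / (TRUE_KD + f) for f in free_ligand]

kd_est, bmax_est = scatchard_fit(bound_ligand, free_ligand)
print(round(kd_est, 3), round(bmax_est, 3))
```

A smaller Kd corresponds to a higher binding affinity, which is the property selected for when antibodies are isolated.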
After the desired hybridoma cells are identified, the clones can be subcloned
by limiting
dilution procedures and grown by standard methods. Suitable culture media for
this purpose
include, for example, Dulbecco's Modified Eagle's Medium and RPMI-1640 medium.
Alternatively, the hybridoma cells can be grown in vivo as ascites in a
mammal.
The monoclonal antibodies secreted by the subclones can be isolated or
purified from the culture
medium or ascites fluid by conventional immunoglobulin purification procedures
such as, for
example, protein A-Sepharose, hydroxylapatite chromatography, gel
electrophoresis, dialysis, or
affinity chromatography.
The monoclonal antibodies can also be made by recombinant DNA methods, such as
those described in U.S. Patent No. 4,816,567. DNA encoding the monoclonal
antibodies of the
invention can be readily isolated and sequenced using conventional procedures
(e.g., by using
oligonucleotide probes that are capable of binding specifically to genes
encoding the heavy and
light chains of murine antibodies). The hybridoma cells of the invention serve
as a preferred
source of such DNA. Once isolated, the DNA can be placed into expression
vectors, which are
then transfected into host cells such as simian COS cells, Chinese hamster
ovary (CHO) cells, or
myeloma cells that do not otherwise produce immunoglobulin protein, to obtain
the synthesis of
monoclonal antibodies in the recombinant host cells. The DNA also can be
modified, for
example, by substituting the coding sequence for human heavy and light chain
constant domains
in place of the homologous murine sequences (U.S. Patent No. 4,816,567;
Morrison, Nature 368,
812-13 (1994)) or by covalently joining to the immunoglobulin coding sequence
all or part of the
coding sequence for a non-immunoglobulin polypeptide. Such a non-
immunoglobulin
polypeptide can be substituted for the constant domains of an antibody of the
invention, or can
be substituted for the variable domains of one antigen-combining site of an
antibody of the
invention to create a chimeric bivalent antibody.
5.13.2 Humanized Antibodies
The antibodies directed against the protein antigens of the invention can
further comprise
humanized antibodies or human antibodies. These antibodies are suitable for
administration to
humans without engendering an immune response by the human against the
administered
immunoglobulin. Humanized forms of antibodies are chimeric immunoglobulins,
immunoglobulin chains or fragments thereof (such as Fv, Fab, Fab', F(ab')2 or
other antigen-
binding subsequences of antibodies) that are principally comprised of the
sequence of a human
immunoglobulin, and contain minimal sequence derived from a non-human
immunoglobulin.
Humanization can be performed following the method of Winter and co-workers
(Jones et al.,
Nature, 321:522-525 (1986); Riechmann et al., Nature, 332:323-327 (1988);
Verhoeyen et al.,
Science, 239:1534-1536 (1988)), by substituting rodent CDRs or CDR sequences
for the
corresponding sequences of a human antibody. (See also U.S. Patent No.
5,225,539.) In some
instances, Fv framework residues of the human immunoglobulin are replaced by
corresponding
non-human residues. Humanized antibodies can also comprise residues which are
found neither
in the recipient antibody nor in the imported CDR or framework sequences. In
general, the
humanized antibody will comprise substantially all of at least one, and
typically two, variable
domains, in which all or substantially all of the CDR regions correspond to
those of a non-human
immunoglobulin and all or substantially all of the framework regions are those
of a human
immunoglobulin consensus sequence. The humanized antibody optimally also will
comprise at
least a portion of an immunoglobulin constant region (Fc), typically that of a
human
immunoglobulin (Jones et al., 1986; Riechmann et al., 1988; and Presta, Curr.
Op. Struct. Biol.,
2:593-596 (1992)).
5.13.3 Human Antibodies
Fully human antibodies relate to antibody molecules in which essentially the
entire
sequences of both the light chain and the heavy chain, including the CDRs,
arise from human
genes. Such antibodies are termed "human antibodies", or "fully human
antibodies" herein.
Human monoclonal antibodies can be prepared by the trioma technique; the human
B-cell
hybridoma technique (see Kozbor, et al., 1983 Immunol Today 4: 72) and the
EBV hybridoma
technique to produce human monoclonal antibodies (see Cole, et al., 1985 In:
MONOCLONAL
ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96). Human
monoclonal
antibodies may be utilized in the practice of the present invention and may be
produced by using
human hybridomas (see Cote, et al., 1983. Proc Natl Acad Sci USA 80: 2026-
2030) or by
transforming human B-cells with Epstein Barr Virus in vitro (see Cole, et al.,
1985 In:
MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).
In addition, human antibodies can also be produced using additional
techniques,
including phage display libraries (Hoogenboom and Winter, J. Mol. Biol.,
227:381 (1991);
Marks et al., J. Mol. Biol., 222:581 (1991)). Similarly, human antibodies can
be made by
introducing human immunoglobulin loci into transgenic animals, e.g., mice in
which the
endogenous immunoglobulin genes have been partially or completely inactivated.
Upon
challenge, human antibody production is observed, which closely resembles that
seen in humans
in all respects, including gene rearrangement, assembly, and antibody
repertoire. This approach
is described, for example, in U.S. Patent Nos. 5,545,807; 5,545,806;
5,569,825; 5,625,126;
5,633,425; 5,661,016, and in Marks et al. (Bio/Technology 10, 779-783 (1992));
Lonberg et al.
(Nature 368 856-859 (1994)); Morrison (Nature 368, 812-13 (1994)); Fishwild
et al. (Nature
Biotechnology 14, 845-51 (1996)); Neuberger (Nature Biotechnology 14, 826
(1996)); and
Lonberg and Huszar (Intern. Rev. Immunol. 13 65-93 (1995)).
Human antibodies may additionally be produced using transgenic nonhuman
animals
which are modified so as to produce fully human antibodies rather than the
animal's endogenous
antibodies in response to challenge by an antigen. (See PCT publication
WO 94/02602). The
endogenous genes encoding the heavy and light immunoglobulin chains in the
nonhuman host
have been incapacitated, and active loci encoding human heavy and light chain
immunoglobulins
are inserted into the host's genome. The human genes are incorporated, for
example, using yeast
artificial chromosomes containing the requisite human DNA segments. An animal
which
provides all the desired modifications is then obtained as progeny by
crossbreeding intermediate
transgenic animals containing fewer than the full complement of the
modifications. The
preferred embodiment of such a nonhuman animal is a mouse, and is termed the
Xenomouse™
as disclosed in PCT publications WO 96/33735 and WO 96/34096. This animal
produces B cells
which secrete fully human immunoglobulins. The antibodies can be obtained
directly from the
animal after immunization with an immunogen of interest, as, for example, a
preparation of a
polyclonal antibody, or alternatively from immortalized B cells derived from
the animal, such as
hybridomas producing monoclonal antibodies. Additionally, the genes encoding
the
immunoglobulins with human variable regions can be recovered and expressed to
obtain the
antibodies directly, or can be further modified to obtain analogs of
antibodies such as, for
example, single chain Fv molecules.
An example of a method of producing a nonhuman host, exemplified as a mouse,
lacking
expression of an endogenous immunoglobulin heavy chain is disclosed in U.S.
Patent No.
5,939,598. It can be obtained by a method including deleting the J segment
genes from at least
one endogenous heavy chain locus in an embryonic stem cell to prevent
rearrangement of the
locus and to prevent formation of a transcript of a rearranged immunoglobulin
heavy chain locus,
the deletion being effected by a targeting vector containing a gene encoding a
selectable marker;
and producing from the embryonic stem cell a transgenic mouse whose somatic
and germ cells
contain the gene encoding the selectable marker.
A method for producing an antibody of interest, such as a human antibody, is
disclosed in
U.S. Patent No. 5,916,771. It includes introducing an expression vector that
contains a
nucleotide sequence encoding a heavy chain into one mammalian host cell in
culture, introducing
an expression vector containing a nucleotide sequence encoding a light chain
into another
mammalian host cell, and fusing the two cells to form a hybrid cell. The
hybrid cell expresses an
antibody containing the heavy chain and the light chain.
In a further improvement on this procedure, a method for identifying a
clinically relevant
epitope on an immunogen, and a correlative method for selecting an antibody
that binds
immunospecifically to the relevant epitope with high affinity, are disclosed
in PCT publication
WO 99/53049.
5.13.4 Fab Fragments and Single Chain Antibodies
According to the invention, techniques can be adapted for the production of
single-chain
antibodies specific to an antigenic protein of the invention (see e.g., U.S.
Patent No. 4,946,778).
In addition, methods can be adapted for the construction of Fab expression
libraries (see e.g.,
Huse, et al., 1989 Science 246: 1275-1281) to allow rapid and effective
identification of
monoclonal Fab fragments with the desired specificity for a protein or
derivatives, fragments,
analogs or homologs thereof. Antibody fragments that contain the idiotypes to
a protein antigen
may be produced by techniques known in the art including, but not limited to:
(i) an F(ab')2
fragment produced by pepsin digestion of an antibody molecule; (ii) an Fab
fragment generated
by reducing the disulfide bridges of an F(ab')2 fragment; (iii) an Fab
fragment generated by the
treatment of the antibody molecule with papain and a reducing agent and (iv)
Fv fragments.
5.13.5 Bispecific Antibodies
Bispecific antibodies are monoclonal, preferably human or humanized,
antibodies that
have binding specificities for at least two different antigens. In the present
case, one of the
binding specificities is for an antigenic protein of the invention. The second
binding target is any
other antigen, and advantageously is a cell-surface protein or receptor or
receptor subunit.
Methods for making bispecific antibodies are known in the art. Traditionally,
the
recombinant production of bispecific antibodies is based on the co-expression
of two
immunoglobulin heavy-chain/light-chain pairs, where the two heavy chains have
different
specificities (Milstein and Cuello, Nature, 305:537-539 (1983)). Because of
the random
assortment of immunoglobulin heavy and light chains, these hybridomas
(quadromas) produce a
potential mixture of ten different antibody molecules, of which only one has
the correct
bispecific structure. The purification of the correct molecule is usually
accomplished by affinity
chromatography steps. Similar procedures are disclosed in WO 93/08829,
published 13 May
1993, and in Traunecker et al., 1991 EMBO J., 10:3655-3659.
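The figure of ten different antibody molecules can be reproduced by simple enumeration. The sketch below assumes that each antibody molecule consists of two heavy-chain/light-chain arms, that either heavy chain can pair with either light chain, and that the two arms of a molecule are interchangeable; the chain labels H1, H2, L1 and L2 are hypothetical, with H1/L1 and H2/L2 taken as the two parental (cognate) pairs.

```python
from itertools import combinations_with_replacement, product

heavy_chains = ("H1", "H2")
light_chains = ("L1", "L2")

# Random assortment: each arm is any heavy chain paired with any light chain.
arms = [heavy + light for heavy, light in product(heavy_chains, light_chains)]

# A molecule is an unordered pair of arms (the two arms may be identical).
molecules = list(combinations_with_replacement(arms, 2))

# Only the molecule carrying both cognate arms has the bispecific structure.
bispecific = [m for m in molecules if set(m) == {"H1L1", "H2L2"}]

print(len(arms), "possible arms")
print(len(molecules), "possible molecules")
print(len(bispecific), "with the correct bispecific structure:", bispecific)
```

Because only one of the possible species is the desired product, the affinity chromatography steps mentioned above are needed to recover it from the mixture.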
Antibody variable domains with the desired binding specificities (antibody-
antigen
combining sites) can be fused to immunoglobulin constant domain sequences. The
fusion
preferably is with an immunoglobulin heavy-chain constant domain, comprising
at least part of
the hinge, CH2, and CH3 regions. It is preferred to have the first heavy-chain
constant region
(CH1) containing the site necessary for light-chain binding present in at
least one of the fusions.
DNAs encoding the immunoglobulin heavy-chain fusions and, if desired, the
immunoglobulin
light chain, are inserted into separate expression vectors, and are co-
transfected into a suitable
host organism. For further details of generating bispecific antibodies see,
for example, Suresh et
al., Methods in Enzymology, 121:210 (1986).
According to another approach described in WO 96/27011, the interface between
a pair
of antibody molecules can be engineered to maximize the percentage of
heterodimers which are
recovered from recombinant cell culture. The preferred interface comprises at
least a part of the
CH3 region of an antibody constant domain. In this method, one or more small
amino acid side
chains from the interface of the first antibody molecule are replaced with
larger side chains (e.g.
tyrosine or tryptophan). Compensatory "cavities" of identical or similar size
to the large side
chain(s) are created on the interface of the second antibody molecule by
replacing large amino
acid side chains with smaller ones (e.g. alanine or threonine). This provides
a mechanism for
increasing the yield of the heterodimer over other unwanted end-products such
as homodimers.
Bispecific antibodies can be prepared as full length antibodies or antibody
fragments (e.g.
F(ab')2 bispecific antibodies). Techniques for generating bispecific
antibodies from antibody
fragments have been described in the literature. For example, bispecific
antibodies can be
prepared using chemical linkage. Brennan et al., Science 229:81 (1985)
describe a procedure
wherein intact antibodies are proteolytically cleaved to generate F(ab')2
fragments. These
fragments are reduced in the presence of the dithiol complexing agent sodium
arsenite to
stabilize vicinal dithiols and prevent intermolecular disulfide formation. The
Fab' fragments
generated are then converted to thionitrobenzoate (TNB) derivatives. One of
the Fab'-TNB
derivatives is then reconverted to the Fab'-thiol by reduction with
mercaptoethylamine and is
mixed with an equimolar amount of the other Fab'-TNB derivative to form the
bispecific
antibody. The bispecific antibodies produced can be used as agents for the
selective
immobilization of enzymes.
Additionally, Fab' fragments can be directly recovered from E. coli and
chemically
coupled to form bispecific antibodies. Shalaby et al., J. Exp. Med. 175:217-225
(1992) describe
the production of a fully humanized bispecific antibody F(ab')2 molecule. Each
Fab' fragment
was separately secreted from E. coli and subjected to directed chemical
coupling in vitro to form
the bispecific antibody. The bispecific antibody thus formed was able to bind
to cells
overexpressing the ErbB2 receptor and normal human T cells, as well as trigger
the lytic activity
of human cytotoxic lymphocytes against human breast tumor targets.
Various techniques for making and isolating bispecific antibody fragments
directly from
recombinant cell culture have also been described. For example, bispecific
antibodies have been
produced using leucine zippers. Kostelny et al., J. Immunol. 148(5):1547-1553
(1992). The
leucine zipper peptides from the Fos and Jun proteins were linked to the Fab'
portions of two
different antibodies by gene fusion. The antibody homodimers were reduced at
the hinge region
to form monomers and then re-oxidized to form the antibody heterodimers. This
method can
also be utilized for the production of antibody homodimers. The "diabody"
technology
described by Hollinger et al., Proc. Natl. Acad. Sci. USA 90:6444-6448 (1993)
has provided an
alternative mechanism for making bispecific antibody fragments. The fragments
comprise a
heavy-chain variable domain (VH) connected to a light-chain variable domain
(VL) by a linker
which is too short to allow pairing between the two domains on the same chain.
Accordingly,
the VH and VL domains of one fragment are forced to pair with the
complementary VL and VH
domains of another fragment, thereby forming two antigen-binding sites.
Another strategy for
making bispecific antibody fragments by the use of single-chain Fv (sFv)
dimers has also been
reported. See, Gruber et al., J. Immunol. 152:5368 (1994).
Antibodies with more than two valencies are contemplated. For example,
trispecific
antibodies can be prepared. Tutt et al., J. Immunol. 147:60 (1991).
Exemplary bispecific antibodies can bind to two different epitopes, at least
one of which
originates in the protein antigen of the invention. Alternatively, an anti-
antigenic arm of an
immunoglobulin molecule can be combined with an arm which binds to a
triggering molecule on
a leukocyte such as a T-cell receptor molecule (e.g. CD2, CD3, CD28, or B7),
or Fc receptors for
IgG (FcγR), such as FcγRI (CD64), FcγRII (CD32) and FcγRIII (CD16) so as to
focus cellular
defense mechanisms to the cell expressing the particular antigen. Bispecific
antibodies can also
be used to direct cytotoxic agents to cells which express a particular
antigen. These antibodies
possess an antigen-binding arm and an arm which binds a cytotoxic agent or a
radionuclide
chelator, such as EOTUBE, DPTA, DOTA, or TETA. Another bispecific antibody of
interest
binds the protein antigen described herein and further binds tissue factor
(TF).
5.13.6 Heteroconjugate Antibodies
Heteroconjugate antibodies are also within the scope of the present invention.
Heteroconjugate antibodies are composed of two covalently joined antibodies.
Such antibodies
have, for example, been proposed to target immune system cells to unwanted
cells (U.S. Patent
No. 4,676,980), and for treatment of HIV infection (WO 91/00360; WO 92/200373;
EP 03089).
It is contemplated that the antibodies can be prepared in vitro using known
methods in synthetic
protein chemistry, including those involving crosslinking agents. For example,
immunotoxins
can be constructed using a disulfide exchange reaction or by forming a
thioether bond.
Examples of suitable reagents for this purpose include iminothiolate and
methyl-4-
mercaptobutyrimidate and those disclosed, for example, in U.S. Patent No.
4,676,980.
5.13.7 Effector Function Engineering
It can be desirable to modify the antibody of the invention with respect to
effector function, so as
to enhance, e.g., the effectiveness of the antibody in treating cancer. For
example, cysteine
residue(s) can be introduced into the Fc region, thereby allowing interchain
disulfide bond
formation in this region. The homodimeric antibody thus generated can have
improved
internalization capability and/or increased complement-mediated cell killing
and antibody-dependent cellular cytotoxicity (ADCC). See Caron et al., J. Exp. Med.,
176:1191-1195 (1992)
and Shopes, J. Immunol., 148: 2918-2922 (1992). Homodimeric antibodies with
enhanced anti-
tumor activity can also be prepared using heterobifunctional cross-linkers as
described in Wolff
et al. Cancer Research, 53: 2560-2565 (1993). Alternatively, an antibody can
be engineered that
has dual Fc regions and can thereby have enhanced complement lysis and ADCC
capabilities.
See Stevenson et al., Anti-Cancer Drug Design, 3: 219-230 (1989).
5.13.8 Immunoconjugates
The invention also pertains to immunoconjugates comprising an antibody
conjugated to a
cytotoxic agent such as a chemotherapeutic agent, toxin (e.g., an
enzymatically active toxin of
bacterial, fungal, plant, or animal origin, or fragments thereof), or a
radioactive isotope (i.e., a
radioconjugate).
Chemotherapeutic agents useful in the generation of such immunoconjugates have
been
described above. Enzymatically active toxins and fragments thereof that can be
used include
diphtheria A chain, nonbinding active fragments of diphtheria toxin, exotoxin
A chain (from
Pseudomonas aeruginosa), ricin A chain, abrin A chain, modeccin A chain, alpha-
sarcin,
Aleurites fordii proteins, dianthin proteins, Phytolacca americana proteins
(PAPI, PAPII, and
PAP-S), Momordica charantia inhibitor, curcin, crotin, Saponaria officinalis
inhibitor, gelonin,
mitogellin, restrictocin, phenomycin, enomycin, and the tricothecenes. A
variety of
radionuclides are available for the production of radioconjugated antibodies.
Examples include
212Bi, 131I, 131In, 90Y, and 186Re.
Conjugates of the antibody and cytotoxic agent are made using a variety of
bifunctional
protein-coupling agents such as N-succinimidyl-3-(2-pyridyldithiol) propionate
(SPDP),
iminothiolane (IT), bifunctional derivatives of imidoesters (such as dimethyl
adipimidate HCl),
active esters (such as disuccinimidyl suberate), aldehydes (such as
glutaraldehyde), bis-azido
compounds (such as bis (p-azidobenzoyl) hexanediamine), bis-diazonium
derivatives (such as
bis-(p-diazoniumbenzoyl)-ethylenediamine), diisocyanates (such as toluene
2,6-diisocyanate),
and bis-active fluorine compounds (such as 1,5-difluoro-2,4-dinitrobenzene).
For example, a
ricin immunotoxin can be prepared as described in Vitetta et al., Science,
238: 1098 (1987).
Carbon-14-labeled 1-isothiocyanatobenzyl-3-methyldiethylene
triaminepentaacetic acid (MX-DTPA) is an exemplary chelating agent for
conjugation of radionuclide to the antibody. See
WO 94/11026.
In another embodiment, the antibody can be conjugated to a "receptor" (such as
streptavidin) for utilization in tumor pretargeting wherein the antibody-
receptor conjugate is
administered to the patient, followed by removal of unbound conjugate from the
circulation
using a clearing agent and then administration of a "ligand" (e.g., avidin)
that is in turn
conjugated to a cytotoxic agent.
4.14 COMPUTER READABLE SEQUENCES
In one application of this embodiment, a nucleotide sequence of the present
invention can
be recorded on computer readable media. As used herein, "computer readable
media" refers to
any medium which can be read and accessed directly by a computer. Such media
include, but
are not limited to: magnetic storage media, such as floppy discs, hard disc
storage medium, and
magnetic tape; optical storage media such as CD-ROM; electrical storage media
such as RAM
and ROM; and hybrids of these categories such as magnetic/optical storage
media. A skilled
artisan can readily appreciate how any of the presently known computer
readable mediums can
be used to create a manufacture comprising computer readable medium having
recorded thereon
a nucleotide sequence of the present invention. As used herein, "recorded"
refers to a process for
storing information on computer readable medium. A skilled artisan can readily
adopt any of the
presently known methods for recording information on computer readable medium
to generate
manufactures comprising the nucleotide sequence information of the present
invention.
A variety of data storage structures are available to a skilled artisan for
creating a
computer readable medium having recorded thereon a nucleotide sequence of the
present
invention. The choice of the data storage structure will generally be based on
the means chosen
to access the stored information. In addition, a variety of data processor
programs and formats
can be used to store the nucleotide sequence information of the present
invention on computer
readable medium. The sequence information can be represented in a word
processing text file,
formatted in commercially-available software such as WordPerfect and Microsoft
Word, or
represented in the form of an ASCII file, stored in a database application,
such as DB2, Sybase,
Oracle, or the like. A skilled artisan can readily adapt any number of data
processor structuring
formats (e.g. text file or database) in order to obtain computer readable
medium having recorded
thereon the nucleotide sequence information of the present invention.
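The two storage options named above, an ASCII text file and a database application, can be illustrated as follows. This is a minimal sketch under stated assumptions: the records are hypothetical placeholder sequences (not sequences of the disclosure), FASTA is used as one possible ASCII layout, and SQLite from the Python standard library stands in for a database application such as DB2, Sybase or Oracle. The function names are illustrative.

```python
import sqlite3

# Hypothetical placeholder records; these are NOT sequences from the disclosure.
records = {
    "example_seq_1": "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG",
    "example_seq_2": "ATGAAACGCATTAGCACCACCATTACCACCACCATCACC",
}


def to_fasta(records, width=60):
    """Format the records as FASTA text, a plain ASCII representation."""
    lines = []
    for name, seq in records.items():
        lines.append(f">{name}")
        # Wrap the residues at a fixed line width.
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


def to_database(records):
    """Store the records in an in-memory SQLite table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences (name TEXT PRIMARY KEY, residues TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?)", records.items())
    conn.commit()
    return conn


print(to_fasta(records))
conn = to_database(records)
print(conn.execute("SELECT COUNT(*) FROM sequences").fetchone()[0])
```

Which structure is preferable depends, as noted above, on how the stored information will be accessed: a flat file suits sequential scanning, while a keyed table suits lookup by identifier.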
By providing any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952, 3937-
3942
or 3949-3954 or a representative fragment thereof; or a nucleotide sequence at
least 95%
identical to any of the nucleotide sequences of SEQ ID NO: 1-984, 1969-2952,
3937-3942 or
3949-3954 in computer readable form, a skilled artisan can routinely access
the sequence
information for a variety of purposes. Computer software is publicly available
which allows a
skilled artisan to access sequence information provided in a computer readable
medium. The
examples which follow demonstrate how software which implements the BLAST
(Altschul et
al., J. Mol. Biol. 215:403-410 (1990)) and BLAZE (Brutlag et al., Comp. Chem.
17:203-207
(1993)) search algorithms on a Sybase system is used to identify open reading
frames (ORFs)
within a nucleic acid sequence. Such ORFs may be protein encoding fragments
and may be
useful in producing commercially important proteins such as enzymes used in
fermentation
reactions and in the production of commercially useful metabolites.
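For illustration only, ORF identification can be sketched without BLAST, BLAZE or Sybase. The following is a naive scan, swapped in for those tools, that reports every stretch running from an ATG codon to the first in-frame stop codon in all six reading frames. The function names and the min_codons threshold are assumptions made for this example.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Return (strand, start, end, orf) tuples for ATG...stop stretches.

    start and end are 0-based offsets on the strand that was scanned;
    the stop codon is included and counted toward min_codons.
    """
    orfs = []
    forward = seq.upper()
    for strand, s in (("+", forward), ("-", reverse_complement(forward))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first start codon opens a candidate ORF
                elif start is not None and codon in STOP_CODONS:
                    if (i + 3 - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATTTTAGGG"))
```

A production pipeline would normally add a realistic minimum length and compare candidate ORFs against known sequences, which is the role the search algorithms cited above play.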
As used herein, "a computer-based system" refers to the hardware means,
software
means, and data storage means used to analyze the nucleotide sequence
information of the
present invention. The minimum hardware means of the computer-based systems of
the present
invention comprises a central processing unit (CPU), input means, output
means, and data
storage means. A skilled artisan can readily appreciate that any one of the
currently available
computer-based systems are suitable for use in the present invention. As
stated above, the
computer-based systems of the present invention comprise a data storage means
having stored
therein a nucleotide sequence of the present invention and the necessary
hardware means and
software means for supporting and implementing a search means. As used herein,
"data storage
means" refers to memory which can store nucleotide sequence information of the
present
invention, or a memory access means which can access manufactures having
recorded thereon
the nucleotide sequence information of the present invention.
As used herein, "search means" refers to one or more programs which are
implemented
on the computer-based system to compare a target sequence or target structural
motif with the
sequence information stored within the data storage means. Search means are
used to identify
fragments or regions of a known sequence which match a particular target
sequence or target
motif. A variety of known algorithms are disclosed publicly and a variety of
commercially
available software packages for conducting search means exist and can be used in the
computer-based
systems of the present invention. Examples of such software include, but are
not limited to,
Smith-Waterman, MacPattern (EMBL), BLASTN and BLASTA (NCBI). A
skilled artisan can readily recognize that any one of the available algorithms
or implementing
software packages for conducting homology searches can be adapted for use in
the present
computer-based systems. As used herein, a "target sequence" can be any nucleic
acid or amino
acid sequence of six or more nucleotides or two or more amino acids. A skilled
artisan can
readily recognize that the longer a target sequence is, the less likely a
target sequence will be
present as a random occurrence in the database. The most preferred sequence
length of a target
sequence is from about 10 to 300 amino acids, more preferably from about 30 to
100 nucleotide
residues. However, it is well recognized that searches for commercially
important fragments,
such as sequence fragments involved in gene expression and protein processing,
may be of
shorter length.
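The relationship between target length and chance occurrence can be made concrete. The sketch below assumes a deliberately simple model that is not taken from this disclosure: residues are independent and equally likely, and only one strand is searched. Under that model the expected number of chance matches is the number of start positions divided by the alphabet size raised to the target length. The function name is hypothetical.

```python
def expected_random_hits(target_length, database_length, alphabet_size=4):
    """Expected number of chance matches of one fixed target in a random database.

    Assumes independent, equiprobable residues: each of the
    (database_length - target_length + 1) start positions matches with
    probability alphabet_size ** -target_length.
    """
    positions = max(database_length - target_length + 1, 0)
    return positions / alphabet_size ** target_length


# Compare a six-nucleotide target with a thirty-nucleotide target
# in a database of one billion nucleotides.
for length in (6, 30):
    print(length, expected_random_hits(length, 10**9))

# The same calculation for amino acid targets uses a 20-letter alphabet.
print(expected_random_hits(10, 10**8, alphabet_size=20))
```

Real databases are compositionally biased and contain repeats, so these values are only rough guides to how quickly chance matches fall away as the target grows longer.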
As used herein, "a target structural motif," or "target motif," refers to any
rationally
selected sequence or combination of sequences in which the sequence(s) are
chosen based on a
three-dimensional configuration which is formed upon the folding of the target
motif. There are
a variety of target motifs known in the art. Protein target motifs include,
but are not limited to,
enzyme active sites and signal sequences. Nucleic acid target motifs include,
but are not limited
to, promoter sequences, hairpin structures and inducible expression elements
(protein binding
sequences).
4.15 TRIPLE HELIX FORMATION
In addition, the fragments of the present invention, as broadly described, can
be used to
control gene expression through triple helix formation or antisense DNA or
RNA, both of which
methods are based on the binding of a polynucleotide sequence to DNA or RNA.
Polynucleotides suitable for use in these methods are preferably 20 to 40
bases in length and are
designed to be complementary to a region of the gene involved in transcription
(triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan
et al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of
RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into
polypeptide. Both techniques have been demonstrated to be effective in model
systems.
Information contained in the sequences of the present invention is necessary
for the design of an
antisense or triple helix oligonucleotide.
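How sequence information feeds antisense design can be sketched as follows. This minimal example, under stated assumptions, tiles an mRNA with windows of a chosen length in the 20 to 40 base range given above and returns the reverse complement of each window as a candidate antisense RNA oligonucleotide. The mRNA shown is a hypothetical placeholder, the function name is illustrative, and no attempt is made to rank candidates by target accessibility or specificity.

```python
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_candidates(mrna, length=20):
    """Return (start, oligo) pairs, one per window of the given length.

    Each oligo is the RNA reverse complement of mrna[start:start + length],
    written 5' to 3'.
    """
    if not 20 <= length <= 40:
        raise ValueError("length should be between 20 and 40 bases")
    mrna = mrna.upper().replace("T", "U")  # accept DNA-style input
    candidates = []
    for start in range(len(mrna) - length + 1):
        window = mrna[start:start + length]
        candidates.append((start, window.translate(RNA_COMPLEMENT)[::-1]))
    return candidates


# Hypothetical placeholder mRNA, not a sequence from the disclosure.
example_mrna = "AUGGCCAUUGUAAUGGGCCGCUGA"
for start, oligo in antisense_candidates(example_mrna):
    print(start, oligo)
```

Exhaustive tiling is only a starting point; in practice candidates would be screened experimentally or filtered with additional criteria.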
4.16 DIAGNOSTIC ASSAYS AND KITS
The present invention further provides methods to identify the presence or
expression of
one of the ORFs of the present invention, or homolog thereof, in a test
sample, using a nucleic
acid probe or antibodies of the present invention, optionally conjugated or
otherwise associated
with a suitable label.
In general, methods for detecting a polynucleotide of the invention can
comprise
contacting a sample with a compound that binds to and forms a complex with the
polynucleotide
for a period sufficient to form the complex, and detecting the complex, so
that if a complex is
detected, a polynucleotide of the invention is detected in the sample. Such
methods can also
comprise contacting a sample under stringent hybridization conditions with
nucleic acid primers
that anneal to a polynucleotide of the invention under such conditions, and
amplifying annealed
polynucleotides, so that if a polynucleotide is amplified, a polynucleotide of
the invention is
detected in the sample.
In general, methods for detecting a polypeptide of the invention can comprise
contacting
a sample with a compound that binds to and forms a complex with the
polypeptide for a period
sufficient to form the complex, and detecting the complex, so that if a
complex is detected, a
polypeptide of the invention is detected in the sample.
In detail, such methods comprise incubating a test sample with one or more of
the
antibodies or one or more of the nucleic acid probes of the present invention
and assaying for
binding of the nucleic acid probes or antibodies to components within the test
sample.
Conditions for incubating a nucleic acid probe or antibody with a test sample
vary.
Incubation conditions depend on the format employed in the assay, the
detection methods
employed, and the type and nature of the nucleic acid probe or antibody used
in the assay. One
skilled in the art will recognize that any one of the commonly available
hybridization,
amplification or immunological assay formats can readily be adapted to employ
the nucleic acid
probes or antibodies of the present invention. Examples of such assays can be
found in Chard,
T., An Introduction to Radioimmunoassay and Related Techniques, Elsevier
Science Publishers,
Amsterdam, The Netherlands (1986); Bullock, G.R. et al., Techniques in
Immunocytochemistry,
Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985);
Tijssen, P., Practice
and Theory of Immunoassays: Laboratory Techniques in Biochemistry and
Molecular Biology,
Elsevier Science Publishers, Amsterdam, The Netherlands (1985). The test
samples of the
present invention include cells, protein or membrane extracts of cells, or
biological fluids such as
sputum, blood, serum, plasma, or urine. The test sample used in the above-
described method
will vary based on the assay format, nature of the detection method and the
tissues, cells or
extracts used as the sample to be assayed. Methods for preparing protein
extracts or membrane
extracts of cells are well known in the art and can readily be adapted in
order to obtain a
sample which is compatible with the system utilized.
In another embodiment of the present invention, kits are provided which
contain the
necessary reagents to carry out the assays of the present invention.
Specifically, the invention
provides a compartment kit to receive, in close confinement, one or more
containers which
comprises: (a) a first container comprising one of the probes or antibodies of
the present
invention; and (b) one or more other containers comprising one or more of the
following: wash
reagents, reagents capable of detecting presence of a bound probe or antibody.
In detail, a compartment kit includes any kit in which reagents are contained
in separate
containers. Such containers include small glass containers, plastic containers
or strips of plastic
or paper. Such containers allow one to efficiently transfer reagents from one
compartment to
another compartment such that the samples and reagents are not cross-
contaminated, and the
agents or solutions of each container can be added in a quantitative fashion
from one
compartment to another. Such containers will include a container which will
accept the test
sample, a container which contains the antibodies used in the assay,
containers which contain
wash reagents (such as phosphate buffered saline, Tris-buffers, etc.), and
containers which
contain the reagents used to detect the bound antibody or probe. Types of
detection reagents
include labeled nucleic acid probes, labeled secondary antibodies, or in the
alternative, if the
primary antibody is labeled, the enzymatic, or antibody binding reagents which
are capable of
reacting with the labeled antibody. One skilled in the art will readily
recognize that the disclosed
probes and antibodies of the present invention can be readily incorporated
into one of the
established kit formats which are well known in the art.
4.17 MEDICAL IMAGING
The novel polypeptides and binding partners of the invention are useful in
medical
imaging of sites expressing the molecules of the invention (e.g., where the
polypeptide of the
invention is involved in the immune response, for imaging sites of
inflammation or infection).
See, e.g., Kunkel et al., U.S. Pat. No. 5,413,778. Such methods involve
chemical attachment of
a labeling or imaging agent, administration of the labeled polypeptide to a
subject in a
pharmaceutically acceptable carrier, and imaging the labeled polypeptide in
vivo at the target
site.
4.18 SCREENING ASSAYS
Using the isolated proteins and polynucleotides of the invention, the present
invention
further provides methods of obtaining and identifying agents which bind to a
polypeptide
encoded by an ORF corresponding to any of the nucleotide sequences set forth
in SEQ ID NO:
1-984, 1969-2952, 3937-3942 or 3949-3954, or bind to a specific domain of the
polypeptide
encoded by the nucleic acid. In detail, said method comprises the steps of:
(a) contacting an agent with an isolated protein encoded by an ORF of the
present
invention, or nucleic acid of the invention; and
(b) determining whether the agent binds to said protein or said nucleic acid.
In general, therefore, such methods for identifying compounds that bind to a
polynucleotide of the invention can comprise contacting a compound with a
polynucleotide of
the invention for a time sufficient to form a polynucleotide/compound complex,
and detecting
the complex, so that if a polynucleotide/compound complex is detected, a
compound that binds
to a polynucleotide of the invention is identified.
Likewise, in general, therefore, such methods for identifying compounds that
bind to a
polypeptide of the invention can comprise contacting a compound with a
polypeptide of the
invention for a time sufficient to form a polypeptide/compound complex, and
detecting the
complex, so that if a polypeptide/compound complex is detected, a compound
that binds to a
polypeptide of the invention is identified.
Methods for identifying compounds that bind to a polypeptide of the invention
can also
comprise contacting a compound with a polypeptide of the invention in a cell
for a time
sufficient to form a polypeptide/compound complex, wherein the complex drives
expression of a
reporter gene sequence in the cell, and detecting the complex by detecting
reporter gene
sequence expression, so that if a polypeptide/compound complex is detected, a
compound that
binds a polypeptide of the invention is identified.
Compounds identified via such methods can include compounds which modulate the
activity of a polypeptide of the invention (that is, increase or decrease its
activity, relative to
activity observed in the absence of the compound). Alternatively, compounds
identified via such
methods can include compounds which modulate the expression of a
polynucleotide of the
invention (that is, increase or decrease expression relative to expression
levels observed in the
absence of the compound). Compounds, such as compounds identified via the
methods of the
invention, can be tested using standard assays well known to those of skill in
the art for their
ability to modulate activity/expression.
The agents screened in the above assay can be, but are not limited to,
peptides,
carbohydrates, vitamin derivatives, or other pharmaceutical agents. The agents
can be selected
and screened at random or rationally selected or designed using protein
modeling techniques.
For random screening, agents such as peptides, carbohydrates, pharmaceutical
agents and
the like are selected at random and are assayed for their ability to bind to
the protein encoded by
the ORF of the present invention. Alternatively, agents may be rationally
selected or designed.
As used herein, an agent is said to be "rationally selected or designed" when
the agent is chosen
based on the configuration of the particular protein. For example, one skilled
in the art can
readily adapt currently available procedures to generate peptides,
pharmaceutical agents and the
like, capable of binding to a specific peptide sequence, in order to generate
rationally designed
antipeptide peptides (see, for example, Hurby et al., "Application of Synthetic
Peptides: Antisense
Peptides," in Synthetic Peptides, A User's Guide, W.H. Freeman, NY (1992),
pp. 289-307, and
Kaspczak et al., Biochemistry 28:9230-8 (1989)), or pharmaceutical agents, or
the like.
In addition to the foregoing, one class of agents of the present invention, as
broadly
described, can be used to control gene expression through binding to one of
the ORFs or EMFs
of the present invention. As described above, such agents can be randomly
screened or
rationally designed/selected. Targeting the ORF or EMF allows a skilled
artisan to design
sequence specific or element specific agents, modulating the expression of
either a single ORF or
multiple ORFs which rely on the same EMF for expression control. One class of
DNA binding
agents are agents which contain base residues which hybridize or form a triple
helix formation
by binding to DNA or RNA. Such agents can be based on the classic
phosphodiester,
ribonucleic acid backbone, or can be a variety of sulfhydryl or polymeric
derivatives which have
base attachment capacity.
Agents suitable for use in these methods preferably contain 20 to 40 bases and
are
designed to be complementary to a region of the gene involved in transcription
(triple helix - see
Lee et al., Nucl. Acids Res. 6:3073 (1979); Cooney et al., Science 241:456
(1988); and Dervan et
al., Science 251:1360 (1991)) or to the mRNA itself (antisense - Okano, J.
Neurochem. 56:560
(1991); Oligodeoxynucleotides as Antisense Inhibitors of Gene Expression, CRC
Press, Boca
Raton, FL (1988)). Triple helix-formation optimally results in a shut-off of
RNA transcription
from DNA, while antisense RNA hybridization blocks translation of an mRNA
molecule into
polypeptide. Both techniques have been demonstrated to be effective in model
systems.
Information contained in the sequences of the present invention is necessary
for the design of an
antisense or triple helix oligonucleotide and other DNA binding agents.
Agents which bind to a protein encoded by one of the ORFs of the present
invention can
be used as a diagnostic agent. Agents which bind to a protein encoded by one
of the ORFs of the
present invention can be formulated using known techniques to generate a
pharmaceutical
composition.
4.19 USE OF NUCLEIC ACIDS AS PROBES
Another aspect of the subject invention is to provide for polypeptide-specific
nucleic acid
hybridization probes capable of hybridizing with naturally occurring
nucleotide sequences. The
hybridization probes of the subject invention may be derived from any of the
nucleotide
sequences SEQ ID NO: 1-984, 1969-2952, 3937-3942 or 3949-3954. Because the
corresponding gene is only expressed in a limited number of tissues, a
hybridization probe
derived from any of the nucleotide sequences SEQ ID NO: 1-984, 1969-2952,
3937-3942 or
3949-3954 can be used as an indicator of the presence of RNA of cell type of
such a tissue in a
sample.
Any suitable hybridization technique can be employed, such as, for example, in
situ
hybridization. PCR as described in US Patents Nos. 4,683,195 and 4,965,188
provides
additional uses for oligonucleotides based upon the nucleotide sequences. Such
probes used in
PCR may be of recombinant origin, may be chemically synthesized, or a mixture
of both. The
probe will comprise a discrete nucleotide sequence for the detection of
identical sequences or a
degenerate pool of possible sequences for identification of closely related
genomic sequences.
Other means for producing specific hybridization probes for nucleic acids
include the
cloning of nucleic acid sequences into vectors for the production of mRNA
probes. Such vectors
are known in the art and are commercially available and may be used to
synthesize RNA probes
in vitro by means of the addition of the appropriate RNA polymerase, such as T7 or
SP6 RNA
polymerase and the appropriate radioactively labeled nucleotides. The
nucleotide sequences may
be used to construct hybridization probes for mapping their respective genomic
sequences. The
nucleotide sequence provided herein may be mapped to a chromosome or specific
regions of a
chromosome using well known genetic and/or chromosomal mapping techniques.
These
techniques include in situ hybridization, linkage analysis against known
chromosomal markers,
hybridization screening with libraries or flow-sorted chromosomal preparations
specific to
known chromosomes, and the like. The technique of fluorescent in situ
hybridization of
chromosome spreads has been described, among other places, in Verma et al
(1988) Human
Chromosomes: A Manual of Basic Techniques, Pergamon Press, New York NY.
Fluorescent in situ hybridization of chromosomal preparations and other
physical
chromosome mapping techniques may be correlated with additional genetic map
data. Examples
of genetic map data can be found in the 1994 Genome Issue of Science (265:1981f).
Correlation
between the location of a nucleic acid on a physical chromosomal map and a
specific disease (or
predisposition to a specific disease) may help delimit the region of DNA
associated with that
genetic disease. The nucleotide sequences of the subject invention may be
used to detect
differences in gene sequences between normal, carrier or affected individuals.
4.20 PREPARATION OF SUPPORT BOUND OLIGONUCLEOTIDES
Oligonucleotides, i.e., small nucleic acid segments, may be readily prepared
by, for
example, directly synthesizing the oligonucleotide by chemical means, as is
commonly practiced
using an automated oligonucleotide synthesizer.
Support bound oligonucleotides may be prepared by any of the methods known to
those of
skill in the art using any suitable support such as glass, polystyrene or
Teflon. One strategy is to
precisely spot oligonucleotides synthesized by standard synthesizers.
Immobilization can be
achieved using passive adsorption (Inouye & Hondo, (1990) J. Clin. Microbiol.
28(6) 1469-72);
using UV light (Nagata et al., 1985; Dahlen et al., 1987; Morrissey & Collins,
(1989) Mol. Cell
Probes 3 (2) 189-207) or by covalent binding of base modified DNA (Keller et
al., 1988; 1989); all
references being specifically incorporated herein.
Another strategy that may be employed is the use of the strong biotin-
streptavidin
interaction as a linker. For example, Broude et al. (1994) Proc. Natl. Acad.
Sci. USA 91 (8) 3072-6,
describe the use of biotinylated probes, although these are duplex probes,
that are immobilized on
streptavidin-coated magnetic beads. Streptavidin-coated beads may be purchased
from Dynal, Oslo.
Of course, this same linking chemistry is applicable to coating any surface
with streptavidin.
Biotinylated probes may be purchased from various sources, such as, e.g.,
Operon Technologies
(Alameda, CA).
Nunc Laboratories (Naperville, IL) also sells suitable material that could be
used. Nunc Laboratories have developed a method, termed CovaLink NH, by which
DNA can be covalently bound to the microwell surface. CovaLink NH is a
polystyrene surface grafted with
secondary amino
groups (>NH) that serve as bridge-heads for further covalent coupling.
CovaLink Modules may be
purchased from Nunc Laboratories. DNA molecules may be bound to CovaLink
exclusively at the
5'-end by a phosphoramidate bond, allowing immobilization of more than 1 pmol
of DNA
(Rasmussen et al., (1991) Anal. Biochem. 198(1) 138-42).
The use of CovaLink NH strips for covalent binding of DNA molecules at the
5'-end has been described (Rasmussen et al., (1991)). In this technology, a
phosphoramidate bond is employed (Chu et al., (1983) Nucleic Acids Res. 11(8)
6513-29). This is beneficial as immobilization using only a single covalent
bond is preferred. The phosphoramidate bond joins the DNA to the CovaLink NH
secondary amino groups, which are positioned at the end of 2 nm long spacer
arms covalently grafted onto the polystyrene surface. To link an
oligonucleotide to CovaLink NH via a phosphoramidate bond, the oligonucleotide
terminus must have a 5'-end phosphate group. It may even be possible for biotin
to be covalently bound to CovaLink and then streptavidin used to bind the
probes.
More specifically, the linkage method includes dissolving DNA in water (7.5
ng/ul), denaturing for 10 min. at 95°C and cooling on ice for 10 min. Ice-cold
0.1 M 1-methylimidazole, pH 7.0 (1-MeIm7), is then added to a final
concentration of 10 mM 1-MeIm7. The ssDNA solution is then dispensed into
CovaLink NH strips (75 ul/well) standing on ice.
Carbodiimide, 0.2 M 1-ethyl-3-(3-dimethylaminopropyl)-carbodiimide (EDC)
dissolved in 10 mM 1-MeIm7, is made fresh and 25 ul added per well. The strips
are incubated for 5 hours at 50°C. After incubation the strips are washed
using, e.g., Nunc-Immuno Wash; first the wells are washed 3 times, then they
are soaked with washing solution for 5 min., and finally they are washed 3
times (wherein the washing solution is 0.4 N NaOH, 0.25% SDS heated to 50°C).
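The concentrations in the linkage protocol follow from simple dilution
arithmetic (C1 x V1 = C2 x V2). The following is a minimal sketch of that
arithmetic, not part of the described method. The function names and the 900 ul
sample volume are hypothetical and chosen only for illustration; the 0.1 M
stock, 10 mM target, 0.2 M EDC, 25 ul and 75 ul figures come from the text.

```python
def stock_volume_to_add(sample_volume_ul, stock_molar, final_molar):
    """Volume of stock to add so that the mixture reaches final_molar.

    Solves stock_molar * v = final_molar * (sample_volume_ul + v) for v.
    """
    if not 0 < final_molar < stock_molar:
        raise ValueError("final concentration must be positive and below the stock")
    return sample_volume_ul * final_molar / (stock_molar - final_molar)


def concentration_after_mixing(stock_molar, stock_volume_ul, other_volume_ul):
    """Concentration of a reagent after its stock is mixed into another volume."""
    return stock_molar * stock_volume_ul / (stock_volume_ul + other_volume_ul)


# Hypothetical 900 ul DNA sample brought to 10 mM 1-MeIm7 from a 0.1 M stock.
meim_volume_ul = stock_volume_to_add(900.0, 0.1, 0.010)

# 25 ul of 0.2 M EDC added to the 75 ul already in each well.
edc_in_well_molar = concentration_after_mixing(0.2, 25.0, 75.0)
```

Under these assumptions about 100 ul of stock is needed per 900 ul of sample,
and the EDC concentration in the filled well is about 0.05 M.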
It is contemplated that a further suitable method for use with the present
invention is that described in PCT Patent Application WO 90/03382 (Southern &
Maskos), incorporated herein by reference. This method of preparing an
oligonucleotide bound to a support involves attaching a nucleoside 3'-reagent
through the phosphate group by a covalent phosphodiester link to aliphatic
hydroxyl groups carried by the support. The oligonucleotide is then synthesized
on the supported nucleoside and protecting groups removed from the synthetic
oligonucleotide chain under standard conditions that do not cleave the
oligonucleotide from the support. Suitable reagents include nucleoside
phosphoramidite and nucleoside hydrogen phosphonate.
An on-chip strategy for the preparation of DNA probe arrays may be employed.
For example, addressable laser-activated photodeprotection may be employed in
the chemical synthesis of oligonucleotides directly on a glass surface, as
described by Fodor et al. (1991) Science 251(4995) 767-73, incorporated herein
by reference. Probes may also be immobilized on nylon supports as described by
Van Ness et al. (1991) Nucleic Acids Res. 19(12) 3345-50; or linked to Teflon
using the method of Duncan & Cavalier (1988) Anal. Biochem. 169(1) 104-8; all
references being specifically incorporated herein.
Linking an oligonucleotide to a nylon support, as described by Van Ness et al.
(1991), requires activation of the nylon surface via alkylation and selective
activation of the 5'-amine of oligonucleotides with cyanuric chloride.
One particular way to prepare support bound oligonucleotides is to utilize the
light-generated synthesis described by Pease et al., (1994) PNAS USA 91(11)
5022-6 (incorporated herein by reference). These authors used current
photolithographic techniques to generate arrays of immobilized oligonucleotide
probes (DNA chips). These methods, in which light is used to direct the
synthesis of oligonucleotide probes in high-density, miniaturized arrays,
utilize photolabile 5'-protected N-acyl-deoxynucleoside phosphoramidites,
surface linker chemistry and versatile combinatorial synthesis strategies. A
matrix of 256 spatially defined oligonucleotide probes may be generated in this
manner.
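As an illustration of the combinatorial aspect, 256 equals 4^4, the number of
distinct sequences obtained when four positions each vary over the four bases.
The sketch below assumes that reading of the 256-probe matrix; the 16 x 16 grid
and the function names are hypothetical and are not taken from Pease et al.

```python
from itertools import product


def all_kmers(k, alphabet="ACGT"):
    """Every sequence of length k over the alphabet, in lexicographic order."""
    return ["".join(letters) for letters in product(alphabet, repeat=k)]


def grid_layout(probes, columns):
    """Assign each probe a (row, column) address on a rectangular array."""
    return {probe: divmod(index, columns) for index, probe in enumerate(probes)}


probes = all_kmers(4)              # four variable positions -> 4**4 probes
layout = grid_layout(probes, 16)   # hypothetical 16 x 16 arrangement
```

Each probe then has a unique, spatially defined address that can be looked up
from its sequence.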
4.21 PREPARATION OF NUCLEIC ACID FRAGMENTS
The nucleic acids may be obtained from any appropriate source, such as cDNAs,
genomic DNA, chromosomal DNA, microdissected chromosome bands, cosmid or YAC
inserts, and RNA, including mRNA without any amplification steps. For example,
Sambrook et al. (1989) describe three protocols for the isolation of high
molecular weight DNA from mammalian cells (p. 9.14-9.23).
DNA fragments may be prepared as clones in M13, plasmid or lambda vectors
and/or prepared directly from genomic DNA or cDNA by PCR or other amplification
methods. Samples may be prepared or dispensed in multiwell plates. About
100-1000 ng of DNA samples may be prepared in 2-500 ml of final volume.
The nucleic acids would then be fragmented by any of the methods known to
those of skill
in the art including, for example, using restriction enzymes as described at
9.24-9.28 of Sambrook et
al. (1989), shearing by ultrasound and NaOH treatment.
Low pressure shearing is also appropriate, as described by Schriefer et al.
(1990) Nucleic Acids Res. 18(24) 7455-6 (incorporated herein by reference). In
this method, DNA samples are passed through a small French pressure cell at a
variety of low to intermediate pressures. A lever device allows controlled
application of low to intermediate pressures to the cell. The results of these
studies indicate that low-pressure shearing is a useful alternative to sonic
and enzymatic DNA fragmentation methods.
One particularly suitable way for fragmenting DNA is contemplated to be that
using the two
base recognition endonuclease, CviJI, described by Fitzgerald et al. (1992)
Nucleic Acids Res.
20(14) 3753-62. These authors described an approach for the rapid
fragmentation and fractionation
of DNA into particular sizes that they contemplated to be suitable for shotgun
cloning and
sequencing.
The restriction endonuclease CviJI normally cleaves the recognition sequence
PuGCPy between the G and C to leave blunt ends. Atypical reaction conditions,
which alter the specificity of this enzyme (CviJI**), yield a quasi-random
distribution of DNA fragments from the small molecule pUC19 (2688 base pairs).
Fitzgerald et al. (1992) quantitatively evaluated the randomness of this
fragmentation strategy, using a CviJI** digest of pUC19 that was size
fractionated by a rapid gel filtration method and directly ligated, without end
repair, to a lacZ minus M13 cloning vector. Sequence analysis of 76 clones
showed that CviJI** restricts PyGCPy and PuGCPu, in addition to PuGCPy sites,
and that new sequence data is accumulated at a rate consistent with random
fragmentation.
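The two specificities described above can be sketched as pattern searches. This
is a minimal illustration, assuming a linear, single-strand representation of
the sequence (pUC19 itself is circular) and a cut placed between the G and the
C of every matching site; the pattern names, function names and example
sequence are hypothetical.

```python
import re

# Pu = A or G, Py = C or T. Lookaheads let overlapping sites be reported.
NORMAL_SITES = re.compile(r"(?=[AG]GC[CT])")                       # PuGCPy
RELAXED_SITES = re.compile(r"(?=[AG]GC[CT]|[CT]GC[CT]|[AG]GC[AG])")  # CviJI**


def cut_positions(sequence, site_pattern):
    """Indices at which the sequence is cut, i.e. between the G and the C."""
    return [match.start() + 2 for match in site_pattern.finditer(sequence.upper())]


def fragments(sequence, cuts):
    """Split a linear sequence at the given cut indices."""
    bounds = [0] + sorted(set(cuts)) + [len(sequence)]
    return [sequence[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


example = "TTAGCTAACGCCTT"   # contains one PuGCPy site and one PyGCPy site
normal_fragments = fragments(example, cut_positions(example, NORMAL_SITES))
relaxed_fragments = fragments(example, cut_positions(example, RELAXED_SITES))
```

With the relaxed pattern the example is cut at both sites rather than one,
which illustrates why the altered specificity yields more, and smaller,
fragments.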
As reported in the literature, advantages of this approach compared to
sonication and agarose gel fractionation include: smaller amounts of DNA are
required (0.2-0.5 ug instead of 2-5 ug); and fewer steps are involved (no
preligation, end repair, chemical extraction, or agarose gel electrophoresis
and elution are needed).
Irrespective of the manner in which the nucleic acid fragments are obtained or
prepared, it is
important to denature the DNA to give single stranded pieces available for
hybridization. This is
achieved by incubating the DNA solution for 2-5 minutes at 80-90°C. The
solution is then cooled
quickly to 2°C to prevent renaturation of the DNA fragments before they
are contacted with the
chip. Phosphate groups must also be removed from genomic DNA by methods known
in the art.
4.22 PREPARATION OF DNA ARRAYS
Arrays may be prepared by spotting DNA samples on a support such as a nylon
membrane.
Spotting may be performed by using arrays of metal pins (the positions of
which correspond to an array of wells in a microtiter plate) to repeatedly
transfer about 20 nl of a DNA solution to a nylon membrane. By offset printing,
a density of dots higher than the density of the wells is achieved. One to 25
dots may be accommodated in 1 mm2, depending on the type of label used. By
avoiding spotting in some preselected number of rows and columns, separate
subsets (subarrays)
may be formed. Samples in one subarray may be the same genomic segment of DNA
(or the same
gene) from different individuals, or may be different, overlapped genomic
clones. Each of the
subarrays may represent replica spotting of the same samples. In one example,
a selected gene
segment may be amplified from 64 patients. For each patient, the amplified
gene segment may be in
one 96-well plate (all 96 wells containing the same sample). A plate for each
of the 64 patients is
prepared. By using a 96-pin device, all samples may be spotted on one 8 x 12
cm membrane.
Subarrays may contain 64 samples, one from each patient. Where the 96
subarrays are identical, the
dot span may be 1 mm2 and there may be a 1 mm space between subarrays.
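The geometry of the 64-patient example can be sketched as follows. The 9 mm pin
spacing (the usual 96-well plate pitch) and the 8 x 8 arrangement of the 64
dots within each subarray are assumptions made for illustration; only the
96-pin device, the 64 patients, the 1 mm dot span and the 8 x 12 cm membrane
come from the text. All names are hypothetical.

```python
PIN_ROWS, PIN_COLS = 8, 12   # 96-pin device
PIN_PITCH_MM = 9.0           # assumed pin spacing
DOT_PITCH_MM = 1.0           # one dot per 1 mm step (offset printing)
SUBARRAY_SIDE = 8            # assumed 8 x 8 dots = 64 patients per subarray


def dot_position_mm(pin_row, pin_col, patient):
    """(x, y) of the dot that a given pin deposits for a given patient plate."""
    if not (0 <= pin_row < PIN_ROWS and 0 <= pin_col < PIN_COLS):
        raise ValueError("pin index out of range")
    if not 0 <= patient < SUBARRAY_SIDE ** 2:
        raise ValueError("patient index out of range")
    offset_row, offset_col = divmod(patient, SUBARRAY_SIDE)
    x = pin_col * PIN_PITCH_MM + offset_col * DOT_PITCH_MM
    y = pin_row * PIN_PITCH_MM + offset_row * DOT_PITCH_MM
    return (x, y)


def total_dots():
    """Number of dots when every pin spots every patient plate once."""
    return PIN_ROWS * PIN_COLS * SUBARRAY_SIDE ** 2
```

Under these assumptions each subarray spans 8 mm of the 9 mm pin pitch, which
leaves the 1 mm space between subarrays, and the last dot lies at about
106 x 70 mm, inside the 8 x 12 cm membrane.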
Another approach is to use membranes or plates (available from NUNC,
Naperville, Illinois)
which may be partitioned by physical spacers e.g. a plastic grid molded over
the membrane, the grid
being similar to the sort of membrane applied to the bottom of multiwell
plates, or hydrophobic
strips. A fixed physical spacer is not preferred for imaging by exposure to
flat phosphor-storage
screens or x-ray films.
The present invention is illustrated in the following examples. Upon
consideration of the
present disclosure, one of skill in the art will appreciate that many other
embodiments and variations
may be made in the scope of the present invention. Accordingly, it is intended
that the broader
aspects of the present invention not be limited to the disclosure of the
following examples. The
present invention is not to be limited in scope by the exemplified embodiments
which are intended
as illustrations of single aspects of the invention, and compositions and
methods which are
functionally equivalent are within the scope of the invention. Indeed,
numerous modifications and
variations in the practice of the invention are expected to occur to those
skilled in the art upon
consideration of the present preferred embodiments. Consequently, the only
limitations which
should be placed upon the scope of the invention are those which appear in the
appended claims.
All references cited within the body of the instant specification are hereby
incorporated by
reference in their entirety.
5.0 EXAMPLES
5.1 EXAMPLE 1
Novel Nucleic Acid Sequences Obtained From Various Libraries
A plurality of novel nucleic acids were obtained from cDNA libraries prepared
from various
human tissues and in some cases isolated from a genomic library derived from
human chromosome
using standard PCR, SBH sequence signature analysis and Sanger sequencing
techniques. The
inserts of the library were amplified with PCR using primers specific for the
vector sequences which
flank the inserts. Clones from cDNA libraries were spotted on nylon membrane
filters and screened
with oligonucleotide probes (e.g., 7-mers) to obtain signature sequences. The
clones were clustered
into groups of similar or identical sequences. Representative clones were
selected for sequencing.
In some cases, the 5' sequence of the amplified inserts was then deduced using
a typical Sanger sequencing protocol. PCR products were purified and subjected
to fluorescent dye terminator cycle sequencing. Single pass gel sequencing was
done using a 377 Applied Biosystems (ABI) sequencer to obtain the novel nucleic
acid sequences. In some cases RACE (Rapid Amplification of cDNA Ends) was
performed to further extend the sequence in the 5' direction.
5.2 EXAMPLE 2
Assemblage of Novel Nucleic Acids
The contigs or nucleic acids of the present invention, designated as SEQ ID
NO: 1969-2951 and 3949-3954, were assembled using an EST sequence as a seed.
Then a recursive algorithm was used to extend the seed EST into an extended
assemblage, by pulling additional sequences from different databases (i.e.,
Hyseq's database containing EST sequences, dbEST version 114, gb pri 114, and
UniGene version 101) that belong to this assemblage. The algorithm terminated
when there were no additional sequences from the above databases that would
extend the assemblage. Inclusion of component sequences into the assemblage was
based on a BLASTN hit to the extending assemblage with BLAST score greater than
300 and percent identity greater than 95%.
Tables 6 and 8 set forth the novel predicted polypeptides (including
proteins) encoded by the novel polynucleotides (SEQ ID NO: 2953-3936 and
3949-3954) of the present invention, and their corresponding nucleotide
locations to each of SEQ ID NO: 2953-3936 and 3955-3960. Tables
6 and 8 also indicate the method by which the polypeptide was predicted.
Method A refers to a polypeptide obtained by using a software program called
FASTY (available from http://fasta.bioch.virginia.edu) which selects a
polypeptide based on a comparison of the translated novel polynucleotide to
known polynucleotides (W.R. Pearson, Methods in Enzymology, 183:63-98 (1990),
herein incorporated by reference). Method B refers to a polypeptide obtained by
using a software program called GenScan for human/vertebrate sequences
(available from Stanford University, Office of Technology Licensing) that
predicts the polypeptide based on a probabilistic model of gene
structure/compositional properties (C. Burge and S. Karlin, J. Mol. Biol.,
268:78-94 (1997), incorporated herein by reference). Method C refers to a
polypeptide obtained by using a Hyseq proprietary software program that
translates the novel polynucleotide and its complementary strand into six
possible amino acid sequences (forward and reverse frames) and chooses the
polypeptide with the longest open reading frame.
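The idea behind Method C can be illustrated with a generic six-frame
translation. The Hyseq program itself is proprietary and not described further,
so the sketch below is only an assumption-laden stand-in: it uses the standard
genetic code, treats an open reading frame as running from the first methionine
after a stop (or after the frame start) to the next stop or the end of the
frame, and uses hypothetical function names.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    return sequence.translate(COMPLEMENT)[::-1]


def translate(sequence):
    """Translate complete codons; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(sequence[i:i + 3], "X") for i in range(0, len(sequence) - 2, 3)
    )


def six_frame_translations(sequence):
    """Three forward and three reverse-strand reading frames."""
    sequence = sequence.upper()
    strands = (sequence, reverse_complement(sequence))
    return [translate(strand[offset:]) for strand in strands for offset in range(3)]


def longest_orf(sequence):
    """Longest Met-initiated stretch without a stop codon in any of six frames."""
    best = ""
    for protein in six_frame_translations(sequence):
        for segment in protein.split("*"):
            start = segment.find("M")
            if start != -1 and len(segment) - start > len(best):
                best = segment[start:]
    return best
```

For example, `longest_orf("ATGAAATTTTAG")` evaluates to "MKF", since only the
first forward frame contains a methionine codon.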
5.3 EXAMPLE 3
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), full length gene cDNA
sequences and their corresponding protein sequences were generated from the
assemblage. Any frame shifts and incorrect stop codons were corrected by hand
editing. During editing, the sequence was checked using FASTY and/or BLAST
against Genbank. Other computer programs which may have been used in the
editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences are shown in the Sequence Listing as SEQ ID NO: 1-351. The amino
acids are SEQ ID NO: 985-1335.
Table 1 shows the various tissue sources of SEQ ID NO: 1-351.
The nearest neighbor results for SEQ ID NO: 1-351 were obtained by a BLASTP
version
2.0a1 19MP-WashU search against Genpept release 120 and Geneseq October 12,
2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the
closest
homologue for SEQ ID NO: 1-351 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The
homologs
with identifiable functions for SEQ ID NO: 1-351 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the
sequences were
examined to determine whether they had identifiable signature regions. Table 3
shows the
signature region found in the indicated polypeptide sequences, the description
of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences were
examined for domains with homology to certain peptide domains. Table 4 shows
the name of
the domain found, the description, the p-value and the pFam score for the
identified domain
within the sequence.
The nucleotide sequence within each sequence that codes for a signal peptide,
and the corresponding cleavage site, can be determined using the Neural Network
SignalP V1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik
Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites", Protein Engineering, Vol. 10, no. 1, pp.
1-6 (1997), incorporated herein by reference. A maximum S score and a mean S
score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each
of the polypeptides and the maximum score and mean score associated with that
signal peptide.
5.4 EXAMPLE 4
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During
editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 117, gb
pri 117,
UniGene version 117, Genpept release 117). Other computer programs which may
have been used
in the editing process were phredPhrap and Consed (University of Washington)
and ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are shown
in the Sequence Listing as SEQ ID NOS: 352-766. The corresponding amino acids
are SEQ ID NO: 1336-1750.
Table 1 shows the various tissue sources of SEQ ID NO: 352-766.
The nearest neighbor results for SEQ ID NO: 352-766 were obtained by a BLASTP
version 2.0a1 19MP-WashU search against Genpept release 120 and Geneseq
October 12, 2000
release 21 (Derwent), using BLAST algorithm. The nearest neighbor result
showed the closest homologue for SEQ ID NO: 352-766 from Genpept. The
translated amino acid sequences which the nucleic acid sequences encode are
shown in the Sequence Listing. The
homologs with
identifiable functions for SEQ ID NO: 352-766 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the
sequences were
examined to determine whether they had identifiable signature regions. Table 3
shows the
signature region found in the indicated polypeptide sequences, the description
of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences were
examined for domains with homology to certain peptide domains. Table 4 shows
the name of
the domain found, the description, the p-value and the pFam score for the
identified domain
within the sequence.
The nucleotide sequence within each sequence that codes for a signal peptide,
and the corresponding cleavage site, can be determined using the Neural Network
SignalP V1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik
Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites", Protein Engineering, Vol. 10, no. 1, pp.
1-6 (1997), incorporated herein by reference. A maximum S score and a mean S
score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each
of the polypeptides and the maximum score and mean score associated with that
signal peptide.
5.5 EXAMPLE 5
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During
editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e., dbEST version 118, gb
pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may
have been used
in the editing process were phredPhrap and Consed (University of Washington)
and ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are shown
in the Sequence Listing as SEQ ID NOS: 767-930. The corresponding amino acid
sequences are SEQ ID NO: 1751-1914.
Table 1 shows the various tissue sources of SEQ ID NO: 767-930.
The homology results for SEQ ID NO: 767-930 were obtained by a BLASTP version
2.0a1 19MP-WashU search against Genpept release 120 and Geneseq October 12,
2000 release
21 (Derwent), using BLAST algorithm. The nearest neighbor result showed the
homologs for
SEQ ID NO: 767-930 from Genpept. The translated amino acid sequences which the
nucleic acid sequences encode are shown in the Sequence Listing. The homologues
with
identifiable
functions for SEQ ID NO: 767-930 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the
sequences were
examined to determine whether they had identifiable signature regions. Table 3
shows the
signature region found in the indicated polypeptide sequences, the description
of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences were
examined for domains with homology to certain peptide domains. Table 4 shows
the name of
the domain found, the description, the p-value and the pFam score for the
identified domain
within the sequence.
The nucleotide sequence within each sequence that codes for a signal peptide,
and the corresponding cleavage site, can be determined using the Neural Network
SignalP V1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik
Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites", Protein Engineering, Vol. 10, no. 1, pp.
1-6 (1997), incorporated herein by reference. A maximum S score and a mean S
score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each
of the polypeptides and the maximum score and mean score associated with that
signal peptide.
5.6 EXAMPLE 6
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During
editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 118, gb
pri 118,
UniGene version 118, Genpept release 118). Other computer programs which may
have been used
in the editing process were phredPhrap and Consed (University of Washington)
and ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are shown
in the Sequence Listing as SEQ ID NOS: 931-965. The corresponding amino acid
sequences are shown in SEQ ID NO: 1915-1949.
Table 1 shows the various tissue sources of SEQ ID NO: 931-965.
The nearest neighbor results for SEQ ID NO: 931-965 were obtained by a BLASTP
version 2.0a1 19MP-WashU search against Genpept release 120 and Geneseq
October 12, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed
the closest
homologue for SEQ ID NO: 931-965 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The
homologs
with identifiable functions for SEQ ID NO: 931-965 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the
sequences were
examined to determine whether they had identifiable signature regions. Table 3
shows the
signature region found in the indicated polypeptide sequences, the description
of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences were
examined for domains with homology to certain peptide domains. Table 4 shows
the name of
the domain found, the description, the p-value and the pFam score for the
identified domain
within the sequence.
The nucleotide sequence within each sequence that codes for a signal peptide,
and the corresponding cleavage site, can be determined using the Neural Network
SignalP V1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik
Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites", Protein Engineering, Vol. 10, no. 1, pp.
1-6 (1997), incorporated herein by reference. A maximum S score and a mean S
score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each
of the polypeptides and the maximum score and mean score associated with that
signal peptide.
5.7 EXAMPLE 7
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame
shifts and incorrect stop codons were corrected by hand editing. During
editing, the sequence was
checked using FASTY and/or BLAST against Genbank (i.e. dbEST version 119, gb
pri 119,
UniGene version 119, Genpept release 119). Other computer programs which may
have been used
in the editing process were phredPhrap and Consed (University of Washington)
and ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are shown
in the Sequence Listing as SEQ ID NOS: 966-974. The corresponding amino acid
sequences are SEQ ID NO: 1950-1958.
Table 1 shows the various tissue sources of SEQ ID NO: 966-974.
The nearest neighbor results for SEQ ID NO: 966-974 were obtained by a BLASTP
version 2.0a1 19MP-WashU search against Genpept release 120 and Geneseq
October 12, 2000
release (Derwent), using BLAST algorithm. The nearest neighbor result showed
the closest
homologue for SEQ ID NO: 966-974 from Genpept. The translated amino acid sequences
which the nucleic acid sequences encode are shown in the Sequence Listing. The
homologs
with identifiable functions for SEQ ID NO: 966-974 are shown in Table 2 below.
Using eMatrix software package (Stanford University, Stanford, CA) (Wu et al.,
J. Comp.
Biol., Vol. 6 pp. 219-235 (1999) herein incorporated by reference), all the
sequences were
examined to determine whether they had identifiable signature regions. Table 3
shows the
signature region found in the indicated polypeptide sequences, the description
of the signature,
the eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1)
pp. 320-322 (1998) herein incorporated by reference) all the polypeptide
sequences were
examined for domains with homology to certain peptide domains. Table 4 shows
the name of
the domain found, the description, the p-value and the pFam score for the
identified domain
within the sequence.
The nucleotide sequence within each sequence that codes for a signal peptide,
and the corresponding cleavage site, can be determined using the Neural Network
SignalP V1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by Henrik
Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites", Protein Engineering, Vol. 10, no. 1, pp.
1-6 (1997), incorporated herein by reference. A maximum S score and a mean S
score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in each
of the polypeptides and the maximum score and mean score associated with that
signal peptide.
5.8 EXAMPLE 8
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame shifts and incorrect stop codons were corrected by hand
editing. During editing, the sequence was checked using FASTY and/or BLAST
against Genbank (i.e. dbEST version 120, gb pri 120, UniGene version 120,
Genpept release 120). Other computer programs which may have been used in the
editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are shown
in the Sequence Listing as SEQ ID NOS: 975-984. The corresponding amino acid
sequences are SEQ ID NO: 1959-1968.
Table 1 shows the various tissue sources of SEQ ID NO: 975-984.
The nearest neighbor results for SEQ ID NO: 975-984 were obtained by a BLASTP
version 2.0a1 19MP-WashU search against Genpept release 120 and Geneseq
October 21, 2000 release (Derwent), using the BLAST algorithm. The nearest
neighbor result showed the closest homologue for SEQ ID NO: 975-984 from
Genpept. The translated amino acid sequences that the nucleic acid sequences
encode are shown in the Sequence Listing. The homologs with identifiable
functions for SEQ ID NO: 975-984 are shown in Table 2 below.
Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et
al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by
reference), all the sequences were examined to determine whether they had
identifiable signature regions. Table 3 shows the signature region found in
the indicated polypeptide sequences, the description of the signature, the
eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998), herein incorporated by reference), all the
polypeptide sequences were examined for domains with homology to certain
peptide domains. Table 4 shows the name of the domain found, the description,
the p-value and the pFam score for the identified domain within the sequence.
The nucleotide sequence within the sequences that codes for signal peptide
sequences and their cleavage sites can be determined using the Neural Network
SignalP V 1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites," Protein Engineering, Vol. 10, no. 1,
pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean
S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 7 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with
that signal peptide.
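As a rough illustration of the two summary values reported for each signal peptide, the sketch below computes a maximum and a mean from a list of per-residue S scores. The per-residue scores, the function name `s_score_summary`, and the convention that the mean is taken over the residues of the predicted signal peptide are assumptions made for illustration; the authoritative definitions are those in the Nielsen et al. reference, and SignalP itself is not invoked here.

```python
# Hypothetical summary of per-residue S scores for one polypeptide.
def s_score_summary(s_scores: list[float], signal_length: int) -> tuple[float, float]:
    """Return (maximum S score, mean S score over the predicted signal peptide).

    s_scores      -- one score per residue, N-terminus first (assumed input)
    signal_length -- number of residues in the predicted signal peptide
    """
    if not 0 < signal_length <= len(s_scores):
        raise ValueError("signal_length must lie within the scored sequence")
    region = s_scores[:signal_length]
    max_s = max(s_scores)                 # highest score anywhere in the input
    mean_s = sum(region) / len(region)    # average over the signal peptide only
    return max_s, mean_s


# Made-up scores: high at the N-terminus, dropping after a 3-residue peptide.
max_s, mean_s = s_score_summary([0.9, 0.8, 0.7, 0.2, 0.1], signal_length=3)
```

The position of the signal peptide and these two values correspond to the columns the text attributes to the signal peptide table.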
5.9 EXAMPLE 9
Novel Nucleic Acids
Using PHRAP (Univ. of Washington) or CAP4 (Paracel), a full length gene cDNA
sequence and its corresponding protein sequence were generated from the
assemblage. Any frame shifts and incorrect stop codons were corrected by hand
editing. During editing, the sequence was checked using FASTY and/or BLAST
against Genbank (i.e. dbEST version 120, gb pri 120, UniGene version 120,
Genpept release 120). Other computer programs which may have been used in the
editing process were phredPhrap and Consed (University of Washington) and
ed-ready, ed-ext and gc-zip-2 (Hyseq, Inc.). The full-length nucleotide
sequences, including splice variants, resulting from these procedures are
shown in the Sequence Listing as SEQ ID NOS: 3937-3942. The corresponding
peptide sequences are SEQ ID NO: 3943-3948.
Table 1 shows the various tissue sources of SEQ ID NO: 3937-3942.
The nearest neighbor results for SEQ ID NO: 3937-3942 were obtained by a
BLASTP version 2.0a1 19MP-WashU search against Genpept release 120 and Geneseq
October 12, 2000 release 21 (Derwent), using the BLAST algorithm. The nearest
neighbor result showed the closest homologue for SEQ ID NO: 3937-3942 from
Genpept. The translated amino acid sequences that the nucleic acid sequences
encode are shown in the Sequence Listing. The homologs with identifiable
functions for SEQ ID NO: 3937-3942 are shown in Table 9 below.
Using the eMatrix software package (Stanford University, Stanford, CA) (Wu et
al., J. Comp. Biol., Vol. 6 pp. 219-235 (1999), herein incorporated by
reference), all the sequences were examined to determine whether they had
identifiable signature regions. Table 10 shows the signature region found in
the indicated polypeptide sequences, the description of the signature, the
eMatrix p-value(s) and the position(s) of the signature within the
polypeptide sequence.
Using the pFam software program (Sonnhammer et al., Nucleic Acids Res., Vol.
26(1) pp. 320-322 (1998), herein incorporated by reference), all the
polypeptide sequences were examined for domains with homology to certain
peptide domains. Table 11 shows the name of the domain found, the description,
the p-value and the pFam score for the identified domain within the sequence.
The nucleotide sequence within the sequences that codes for signal peptide
sequences and their cleavage sites can be determined using the Neural Network
SignalP V 1.1 program (from the Center for Biological Sequence Analysis, The
Technical University of Denmark). The process for identifying prokaryotic and
eukaryotic signal peptides and their cleavage sites is also disclosed by
Henrik Nielsen, Jacob Engelbrecht, Soren Brunak, and Gunnar von Heijne in the
publication "Identification of prokaryotic and eukaryotic signal peptides and
prediction of their cleavage sites," Protein Engineering, Vol. 10, no. 1,
pp. 1-6 (1997), incorporated herein by reference. A maximum S score and a mean
S score, as described in the Nielsen et al. reference, were obtained for the
polypeptide sequences. Table 12 shows the position of the signal peptide in
each of the polypeptides and the maximum score and mean score associated with
that signal peptide.
Tables 5 and 13 are correlation tables of all of the sequences and the SEQ ID
NOS.
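The SEQ ID NOS column of Table 1 uses a compact notation in which single numbers and hyphenated ranges (for example 234-235) are separated by spaces. A minimal sketch of expanding one such cell into individual sequence numbers is given here; the function name `expand_seq_ids` is hypothetical, and the sketch assumes that entries wrapped across lines have already been rejoined so that no token ends in a dangling hyphen.

```python
# Hypothetical helper: expand a space-separated SEQ ID cell into integers.
def expand_seq_ids(cell: str) -> list[int]:
    """Expand tokens such as '3', '11' or '234-235' into a flat list of ints."""
    ids: list[int] = []
    for token in cell.split():
        if "-" in token:
            # An inclusive range, e.g. '234-235' -> 234, 235.
            start, end = token.split("-")
            ids.extend(range(int(start), int(end) + 1))
        else:
            ids.append(int(token))
    return ids


# A fragment in the style of the first row of Table 1.
lung_ids = expand_seq_ids("3 11 25 229 234-235 267")
```

Expanding the cells this way makes it straightforward to ask which libraries a given SEQ ID NO appears in.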
TABLE 1
Tissue Origin   RNA Source   Library Name   SEQ ID NOS:


lung 3 11 25 49 65 75 114 141 156
160 172


190 198 209 217 224 229 234-235
267


269 274 277 282 284 303 308
312 320


334 336 352 372 396 398 412
414 437


453 464 470 481 492-494 508-509
532


539 581 584 617-619 621 628
633 643


688 691 745 752 761 768 794
822 837


848 876 887 953 967 973


adult brain GIBCO AB3001 1 3 12-13 16 22-24 28-29 41
48 58 65 78


82 89-90 94 97 103 112 114-115
117 120


122 130-131 168 181 184 186-187
189-


190 198 208 216 247 249 259
270 277


297 301 308 312 314 321 333
348 374


396 403 406 410 412 416-417
420 423


426-427 431 456 474 481 484-485
488


498 500 508-509 530 549 553
558 563-


564 583 596 602-603 608 612
621-622


624 643 650 674 699 711 736
738-739


753 770 779-780 785-786 802-803
816


822 839 842 848 859 861 871
893-894


897 900 903 925 954 958 967
969


adult brain GIBCO ABD003 3 19 21-25 28-29 31 33-34 37
39 41 46-48


53 58 63-64 66 72 78 80 99 103
109-110


112 114 118 120-124 126 132-133
135


139 143 146 148-149 159 163
168 174


176 179-180 184-185 188-190
202 208-


209 216-217 221 223 230 234-235
240


244 249 251 253 255 258-259
263 269-


270 277 282 285-286 290 294-295
297


301-302 304-305 307-308 311-312
314


320 329 333 335-336 342 344
346 349


354 358 365 370 373-374 377
380 382-


383 388 394-396 399 401-402
406 409-


410 413 416 420-421 425 428
430-431


436-437 442 456 462 464 466-467
474


484 486 495-496 500-501 506
508-509


519 530 537 542 549 561-562
564 572


574 577-578 580-583 586-587
589 592-


593 596-597 601 608 610 612-614
617-


624 630-632 635 637 650 658
663-664


668 676 679 681 689-690 693
699 724


726 732 736 742-743 747 767-770
780


784 789 793 799 802-805 813
817-818


822 824 829-831 837 839 845
848 856


859-860 864 871-872 875-876
881 887


896-897 901 903 907 910-911
925 930


933 943-944 947 952-953 958
962-963


965 967 972 977


adult brain Clontech ABR001 3 53 66 113 115 126 135 160
172 179 185


204 263 273 305 312 323 358
380 383


395-396 403 420 428-429 431
461 542


583 586 606-607 611 620 645-646
688


690 715 732 736 740 748 754
768 784-


786 790 796 800 878 897 906-907
947


977


adult brain Clontech ABR006 19 32 49 53 60 72 91 103 118
125 130-


131 134 184 224 275 338 350
354 361-


363 374 384 390 394 396 431-432
434-


435 445 468 549 621 732 734-736
745


760-761 764 768-769 775 787
806 811


818 887 903 906 918 930 942
947 957


973 977


adult brain Clontech ABR008 2-3 9-11 14 17 21 23-25 28-29
31-35 37


41-42 45 47-48 56-57 65-66 69-70
72 75


77-78 88 91-92 97-99 101 103
112-115


118-128 130-131 135 138-140
142 144-


146 148 152 156-157 159-160
163 168


172 174 176 178-180 182-190
194 196-


198 200-201 204 209-214 218
220-225


228-230 232-233 238-240 243-244
246


254-256 260-264 270 272-274
278-279


282-285 289-291 293-294 296-297
301


303-306 312-314 317 321-322
325-328


334 336 338 340-342 344 346
348 350-


352 354 356-358 363 366 369-374
376


379-381 383-386 388-394 398-399
402-


403 405 409-412 414 418-421
423-424


426-427 430 433-437 443 445-450
452


456-457 460 462 464 471 479
482-483


485 488 490-498 505 507 510
516 519-


522 524 527-532 535 538-539
542-545


548 551 553 555 561-562 566
569 571


574 580-583 588-589 593 597
601-608


611-612 614-615 617-618 621-622
624


630-635 642 644 646-648 650-652
655


657 659-661 664-665 668 672
674 689


693-699 701-702 708 711 715
717 724


728-730 732 734-735 738-740
745 747-


750 753-755 757 761 763-764
766-769


772-773 775 780-781 789-791
793-795


799-800 802-806 809 812 818-819
821-


822 826 829-830 832 834-835
841 843


845 856 858-859 861 864 866
870 872


876 880 883 885 887 893-898
902 906-


916 918 921 925-926 930-931
933 942-


943 946 948 950-951 953-954
958-960


962-965 967 969-970 972 977


adult brain Clontech ABR011 57 196 270 304 344 436 834


adult brain BioChain ABR012 14 82 121-122 168 691


adult brain Invitrogen ABR013 72 108 263 270 336 425 492-494
732 787


790 826 880


adult brain Invitrogen ABR014 293 394 399 764 768-769 928
967


adult brain Invitrogen ABR015 738-739 764


adult brain Invitrogen ABR016 320 374 396 399 405 684 742-743
767


931 947 967


adult brain Invitrogen ABT004 21 33-34 37-38 47 52 57-58 69
72 91-93


109 119 122-124 126-127 135
142-143


158 167-168 185-188 194 200
212 232


242 246 255 258 270 277 279
293 301


312-313 319 322-323 331 341
346 348


371 374 388 391 394 399 401
409 411


429 436-437 456 462 477 488
496 498


510 512 515 539 542 545 549
559 563


573 579 587 589 601-605 612
620-621


624 640 643 647 681 715 723
728 732


735-736 740 745 748 753 766
785-786


792-793 797-801 812 822 829-831
853-


856 859 876-877 884 893-894
908-909


918 925 933 950 969 978


cultured preadipocytes Strategene ADP001 4 28-29 69 93 114 121 132-133
135 151-


152 159 167 172 178 181 184 190 194-


195 203-204 209 217 219 240
248 260-


262 267 273-274 277 282 297
301 304


312 314 326-327 361-362 371
374 388


394 401 403 405 411 420 437
453 466-


467 470 474 478 496 507-509
517 530


532-533 584 588 593 602-603
608 610


617-621 630-631 633 639 642-643
661


693 729 746 761 765 769 834 842
848


887 907 923 947-950 957 967
969


adrenal gland Clontech ADR002 1 3 12-13 21 23-24 27-29 67
74 78 103-


105 108-109 113 115 118 120-121
128-


133 149 156 160 172 177 182 214
217


223 232-233 247 254 269-270
273-274


277 283 285 288 298-299 308
317 319


328 338 340 342 361-362 364
372 376-


377 382 384 401-402 405-406
416 420


431 437 444 446 448 457 462
484 500


507 517 524 532-533 539 545
554 561-


562 564 588 597 602-603 606-607
635


642 646 649 658 664 674 693
703 730


740 745 752 759 765 767 775
779 799


809 817-818 839 845 856 859
863 887


890-891 896 948 953 958 961-963
973


adult heart GIBCO AHR001 1 3-4 8 10 14 20-21 25 28-29
33-34 37-38


41 48 54-57 65 69-72 75 78 80
82-83 97


99-100 108 112-115 117-121 123-124


128-133 141 144-146 149 152
159 162-


163 168 172 176 179 181 184
186-187


190-191 201 203 208-209 212
216-218


221 223 227 229 233 244 247
249 253-


255 258 263-264 267 269-270
274 278


280-282 285 289 291 295 297-299
301


303-304 308 313 317 321-322
326 328


334 344 348 352 358 361-363
370-371


380 382-383 388 394-396 398
401 403


405-406 410-416 423 425-427
430-431


436 452-453 464-465 470-474
481-484


487-488 490 492-494 496 499-500
505-


506 508-509 514 523 529-530
533 547-


548 553 558 563-565 577-578
586-588


590 593 597 601-603 606-608
610-613


617-619 621-622 626-628 637-638
642-


644 652 658 661 672 682-683
688 691


693 697 699 708 711 713 715
732 737


745 747-748 750-753 759 761
765 768-


770 775 790 802-803 814-815
818-819


830 837 839-840 842 845 848
859 861-


862 867 876-877 887 891-892
896 900-


901 903 905-906 908-909 919-920
922


925 928 936 939-940 946-947
950 953


959 967 970-971 973 977


adult kidney GIBCO AKD001 1 3 8 12-14 17 19-25 28-29 33-34
37-39


41 46-48 50 52 55-60 62 65-67
69 71-72


75 77-78 82 84 89-90 93 97 108-110 114-


116 118-121 123-125 128 130-133
135


138 144 146 149 156 159-161
163-164


167-172 176 179 184 186-187
189-190


194 196 200-202 204 209 211-212
216-


217 219 221 223-224 229 232-235
244


247 250 253 255-256 258 263-264
268-


272 274 277-281 283 286 288-290
292


294-295 297 301 303-309 311-314
316


319-323 325 328-338 342 348-349
352


354-355 358 361-363 365 370-371
373


376-378 380 382-383 388 395-399
401-


403 405-406 409-413 416 418-420
425-


428 430-431 440 442 452-454
462 464-


465 470 472-474 477 479 481
483-485


487-489 492-495 498-500 504
506 510


517 522 525 529-530 532-533
539 542-


543 547 551-552 558 560-564
569-570


573-574 577-578 580-583 585-590
594-


596 601-608 610-613 617-621
624 626-


628 630-631 634-636 639 642-643
648


652 656 658 664-665 676-677
679 681


688-691 693 697 699 708 711
715 717


720-722 724 729-732 738-741
747-748


751-753 761 765 770-778 780
784 789


791 793 797 804 813 817 823-824
834


837 839 842-843 845 848 859
861-862


864 867 870 876-877 887 889
892-894


896-897 900-901 903 907 913-915
918


921 923 925 929-930 932 939
942 946-


947 949-950 953 958-959 961-963
967


969 972 977


adult kidney Invitrogen AKT002 1 3 16 21 30 32 35 38-41 46-47
56 77 92


109 123-124 130-131 146 149
161 167-


168 172 176 190 209 212 234-235
258


279 292 301 303 308 314 333
355 363


372 380 383 396 399 402 418-419
426-


427 431 448 454 461 471-474
488-489


495 498 504 506 508-509 520-521
530


537 539-541 545 547 563 582-583
592


613 617-618 621 623-624 633
655 688


690 693 699 704 713 732 745
752-753


761 766-768 770 784 789 797
837 842


848-849 866-867 877 887 893-894
903


914-915 925 929-930 937 944-945
947-


949 955 961 967 984


adult lung GIBCO ALG001 1 3 14 18 28-29 38 54-56 59
92 110 114-


115 130-131 146 149 156 159
164 167


176 184 209 217 234-236 240
255-256


258 263-264 269 271 276 280-281
297


305 308 312 314 322 325 332
336 344


353 361-362 388 401 410 420-421
426-


427 431 465 469 474 484 498
500 506


508-509 517 530 532 573 592
596 613


619-620 623 626-628 638 658
679 681


684 689 717 731 741 771 791
799 817


834 845 861-862 864 875-876
901 921


925 928 932 940 947 949 959
962-963


967


lymph node Clontech ALN001 3 10 110 146 160 168 196 209
221 269


278 301 336 348 394 405 411
420 422


459 464 474 485 503 506-507
532 563


582 619 623 630-631 642 669
684 697


713 715 727 747 767 769 789
825 839


842 849 887 896 913 921 925


young liver GIBCO ALV001 3 14 16 37-38 41 51 56 60 97
104-105


108 110 117 119 128 130-131
134 139


149 152 169-172 176 184 189-190
200


209 212 216 218 228 232 255
258 263


270-271 275 285-286 292 295
298-299


301 304 314 341 358 365 368
376 400


410-412 431 474 481-482 485
496 500


504-505 517 520-522 524 530
532-533


547 551 563 581 583 610-611
621 624


635 643 691 708 711 715 720
752 755


761 768 796-797 811 818 830
845-847


852 864-865 867-869 896 899
910-911


949 958 965 969 972-973


adult liver Invitrogen ALV002 3 37 42 56 60 71 82 104-105
114-115


117-118 125 130-131 134-135
164 169-


172 176 179 200 203-204 212
217 223


226 232 237 244 263 274-275
292 301


310-312 314 317 349 354 364 368 372


376 398-399 402 426-427 439
442 451


458 465 474 482 485 490 506
515 525


527 545 547 552 568 571 573-575
582


587 594-595 604-605 608 610
621 630-


631 634-635 637 657 664 690
693 699


723 726 745 751 763 767 784
793 811


822 845 848 852 856 861-862
864 892


899 908-909 925 950 958 967
983


adult liver Clontech ALV003 60 134 169-171 275


adult ovary Invitrogen AOV001 1 3 9-10 12-14 16 18 20 22-25
28-29 33-


35 37 39 41-42 46 48-50 55-57
59 63-67


69 71-72 75 77-80 82 88-89 92
101 103-


106 108-110 113 115 119-121
123-126


128-133 135 138 142-146 149
151-152


159-161 167-168 172 174 176-177
179


181 184-190 194 198 200 203
208-209


211-212 214 217 219 221 224
226 232-


235 240-242 246-247 249 251
254-255


258-259 264 269-271 274 276-277
279-


283 285 288 290 293-294 297
301-304


306-308 311 314 319-322 325-326
328-


329 331-332 335-338 341-342
344 348


354-358 361-363 365 368 370-372
374


376 379-380 382-383 388 394-396
398-


399 401-402 405-406 409-412
416 418-


421 423 425-433 438 442-443
449-452


454 462 464 466-467 469-471
474 479


482-484 488 490 492-496 498
500-504


506-509 511 515-518 520-524
529-530


532-533 537 539-542 545 551
555 558


560-565 569 571 573 577-578
581-583


585-590 592-593 596-597 600-605
608


610-611 613-614 617-628 633-637
639


642-643 646-648 650 652 654
656 658


664 668-670 672 674 679 681
684 688


691 693 697-699 701-702 713
717 721-


722 724 729-732 738-744 747-750
752-


753 755 759 761 765 767-774
779-780


783-784 789 793 795-797 801
813-818


823-824 828 830-832 834 837
839 841-


842 845 848-851 856 859 862
864 866-


867 870-871 874-878 881-883
887-889


891 893-894 896-897 901 903
906-911


913 919-922 925 928 930 936
939-940


943-944 946-947 949-950 952-953
955


957-958 962-963 965 967 969
971 973


977 981-982


adult placenta Invitrogen APL001 41 56 67 253 301 304 334 380
383 451


474 479 500 577-578 643 648
729 767


856 859 866 873 962-963


placenta Invitrogen APL002 3 21 31 38 63-64 78 135 143
168 186-187


212 232 244 263 280-281 334
336 344


348 371 374 394 399 461 490
582 588


602-607 610 620 699 745 769
793 817


822 859 897-898 923 928 931
943 949


969 973


adult spleen GIBCO ASP001 1 3 21-22 46 52 54-55 57-58
61-62 72 74


78 82 88 118 121 130-131 137
152 159


168 172 189 203 209 217 223
234-235


252 255 263 269 271 274 282
288 290


301 314 322 335 350 363 394
403 405-


406 410-412 415 431 459 464
472-474


482 488 500 506 510 514 517
532 537


542 561-563 589 593 602-603
610 613


619 621 636 642-643 655 658
662 674


676 679 681-682 684 689 691-692
697


699 715 720 723 729 747-748
769-770


782 793 818 830 834 845 856
859 862


877 887 893-894 896 903 906-907
914-


915 918 925 928 930 940 946
965 967


977 982


testis GIBCO ATS001 6 22 28-29 33-34 41 48 52 62
65 72 97


106 109 118 132-133 145-146
168 172


176 183 185 189-191 195 209
211-212


214 221 223 230 254-255 258
263 269


283 297 312 314 321 342 352
361-362


365 380 383 388 395 401 405-406
412


430-431 441 469-470 474 479
495-496


500 506 520-521 533 543 545
548 560


563 574 582 589-590 593 608
616-618


620 623-624 638 642-643 697
699 708


711 745 747-748 765 767-768
779 784


789 812-813 834 837 839 848
859 862


868-869 875-877 887 889 893-894
896


928 944 947 953-955 972 981


Genomic DNA from BAC 63I18   Research Genetics (CITB BAC Library)   BAC001   515


Genomic DNA from BAC 393I6   Research Genetics (CITB BAC Library)   BAC002   640


Genomic DNA from BAC 393I6   Research Genetics (CITB BAC Library)   BAC003   640


adult bladder Invitrogen BLD001 50 55 66 71 111 143-144 148
160 201 209


223 255-256 280-281 286 305
315 319


340 394 431 442 488 497 505
518 552


588-589 621 636 664 676 715
738-739


769 790 824 837 845 877 887
936 940


948 962-963 967


bone marrow Clontech BMD001 3 10-13 16 18 20-21 25 28-29
31-34 41 45


48 52 54-55 57 59 61 65 67
72-73 75 78


80 82 84 99 103 108 110 114-115
118-


120 123-124 128 130-133 143-144
148


152 159-161 163 168 172 174
176 178


190 192 198 203 209 211 217-218
221


223-224 227 233-236 244 247
249 252


254 258 260-262 267 269 272
278 280-


281 284-285 288 290 294-297
301 304


308 314 317-318 320-321 325
328-330


333-335 349 351-354 358 363
365 367


377 382 388 394-397 400 405
408 410-


412 418-421
425-428 431 433 435 442


449-450 453 455 459 464 468-470
474


478-479 481 484 490 496 504
506 508-


509 511 519-521 530 532 539
553 558-


559 561-563 580 582 586 592
599 608


610 613-614 617-619 623 625-628
635


638 641-643 658 664 672 682
699 711


713 717 731 734 740 742-743
745 761


768-771 774 776-778 784 787
789 813


817-818 822 834 839-840 842
848 862


866 870 876 885-887 891 896-898
900


903 906 913 919 921-922 927-928
939


944 947 950 953 959 961-963
967-968


970 973 977


bone marrow Clontech BMD002 3 9-10 15-19 30 33-34 39 45
54 57 63-64


71 82 102 116 119 130-133 148
152 156


159-160 168 176 182 224 254-255
271-


272 282 285 290 297-299 301
305 323


333 340 344 351-355 358 361-362
364


367 370 372 387 394-395 399
403 405


409 411 449-450 459 461 468
474 488-


489 524 530 532 580-582 592
602-603


611 617-618 621-622 630-632
642 661


663 694 717 730 734 740 745
752 755


761 767 769-771 775-778 784
787 811


813 818 832 840 842 849 859
878 887


893-894 896-898 903 906 908-909
923


928 944 946-949 953 958-963
965 982


bone marrow Clontech BMD004 54


bone marrow Clontech BMD007 766 887 928


adult colon Invitrogen CLN001 22 37 67 97 117 121 148-149
168 172 190


200 204-205 232 244 263 268
292 301-


302 363 377 384 452 455 459
470 530


582 602-603 619 687 723 728
751 761


831 861 887 914-916 934 955
969 984


Mixture of 16 tissues - mRNAs*   Various Vendors*   CTL016   358 740 760


Mixture of 16 tissues - mRNAs*   Various Vendors*   CTL021   468 527 928


adult cervix BioChain CVX001 1 3 10 14 22 28-30 37 41 47-48
51-52 54-


57 71 82 89-90 92 106 108 110-111
117-


118 121 129-131 135 141 143-146
160-


161 164 168 172 177 189-190
193 195


200 204 209 211-212 217 226
229-230


232 234-235 240-242 246 254
260-263


268-270 274 277 282 285 292
295 297


305-308 314-316 319 328 343-344
348


354 358 363 368 380 382-384
389 394


396 399 401 405-407 410 416
418-421


428 430-431 437 442 453-454
459 464


469 471-473 476 480 484 492-495
500


504 506-509 516-517 526 530
532 545


550-551 563-565 569 577-578
585-586


590 608 611 613 619 621 623
628 630-


631 634-637 641 643 648 656-658
664-


665 674 679 682 689-690 693
700 703


708 713 721-722 724 728 732
742-743


747 750 752 755 757 761 763
767-769


* The 16 tissue mRNAs and their vendor sources are as follows: 1) normal adult
brain mRNA (Invitrogen), 2) normal adult kidney mRNA (Invitrogen), 3) normal
adult liver mRNA (Invitrogen), 4) normal fetal brain mRNA (Invitrogen),
5) normal fetal kidney mRNA (Invitrogen), 6) normal fetal liver mRNA
(Invitrogen), 7) normal fetal skin mRNA (Invitrogen), 8) human adrenal gland
mRNA (Clontech), 9) human bone marrow mRNA (Clontech), 10) human leukemia
lymphoblastic mRNA (Clontech), 11) human thymus mRNA (Clontech), 12) human
lymph node mRNA (Clontech), 13) human spinal cord mRNA (Clontech), 14) human
thyroid mRNA (Clontech), 15) human esophagus mRNA (BioChain), 16) human
conceptional umbilical cord mRNA (BioChain).
739-780 784 788 810-811 813-815 822
834 836-837 839 848 861 866-867 871
874 877 887 891-894 897-898 901 913
916 919 921-922 925 946-947 953 958-
959 967 969 973
diaphragm BioChain DIA002 3 39 184 203 431 563 ~54~5 yd ~
endothelial cells Strategene EDT001 3 6 8-10 14 19-24 28-29 33-34 37 39 41
46 48 52 55-58 62-65 67 69 71-72 75 78
80 82-83 87 101-102 108-109 114-115
117 123-124 128 130-133 135 138 143
145-146 149 156 159-160 167-168 172
174 176-177 179 181 184-187 189-190
194-195 200 203 208-209 212 216-217
219 223-224 226-227 229 234-235 244
248-249 254-256 258 263-264 267 269
271 274 276-282 285 290-291 294 297
301-304 308 311 313-314 316-317 320-
321 323 325-326 328-329 331-332 334-
337 339-341 344 348-349 352 354-355
358 361-363 365 367 371-372 375 379-
380 383 389 394-395 398-403 405-406
409-412 425-428 437 442-443 448 454
464 466-467 474 479 481 490 492-498
500 503 506-509 511 517 520-521 523-
524 530 532 537 540-542 558 561-563
565 569-570 573 581-583 586 588-589
596 602-608 610-611 613 617-622 625
628 630-631 633-637 642-643 646 648
650 652 659 661-662 682 688 690-693
696 698-699 708 712 715 717 720-722
724 727 729 740 745 748-750 752 761
765 767-770 772-773 779 784 789 792-
794 796 802-803 811 817-818 821 824
827-828 830 834-835 837 842 845 848
859 861-862 864 866-867 870 876 885
887 891 893-894 897-898 900 903 906-
907 913 916 921 925 939 947 950 953
955 957-958 962-963 967 973 978 984
Genomic clones from the short arm of chromosome 8   Genomic DNA from Genetic Research   EPM001   324 515 640


esophagus BioChain ESO002 97 103 128 371 474


fetal brain Clontech FBR001 67 129 156 159 232 267 433 446
503 845


952


fetal brain Clontech FBR004 28-29 185 213 277 350 384 432
485 501


549 651 747 754 761 780 787
848 870


887 906 958


fetal brain Clontech FBR006 10-11 14 21 30 32 47 49 56 65 69 72 77-
78 82 84 97 101 115 118 121 125 128
130-131 138 142 148 152 159-160 179
185 188 194 197 203 210 212 214 219


222 227-229 243-246 249 252 256
264


270 273 282 285 290-291 293
301-303


305-306 312 321-322 325 327
339-340


344 346 350 354-357 363 367-371
374


388 391 394-395 399 402 405-406
410


414 420 426-427 436-437 442 444
454


456-457 464 462 464 470 480
485 492-


494 507 510 516 524 528 530-532
539-


542 549 553-554 561-562 580-582
588-


589 602-608 611 615 617-619
621-622


624 632 636 641-642 646-647
651-653


661-662 666-669 672 677 691
715-716


730 735 740 752 754 761 767-770
772-


775 780-781 799-801 808 818
822-823


835 843 845 856 859 864 867
876 880


885 887 890 893-894 896 913
918 926


942 946-947 951 957-959 962-963
970-


971


fetal brain Clontech FBRs03 130-131 312 517 637 691 738-739


fetal brain Invitrogen FBT002 3 22 28-31 47 57 63-64 72 75
77-78 86


94-95 97-98 126-127 135 140
143 156


159-160 167-168 177 185 190
196 201


203-204 214 217 230 254-255
258 267


273-274 277 279 282-283 292
301-302


305 312 314 323 329 346 348
367 374


382 394 399 401 403 412 415
420 432


437 474 482 485 495 507 513
517 527


529-530 539-542 548 552 579
587-588


600 604-605 612 617-618 621-622
624


634 642-643 647-648 650 679
689 693


699 712 715 742-743 745 748-749
753


768-769 793 797 829-831 834
845 848


856 859 893-894 908-909 913
916 931


933 940 950 967 969


fetal heart Invitrogen FHR001 19 57 130-131 394 431 642 769
844


fetal kidney Clontech FKD001 3 31 33-34 38 48 54 72 160 208-209
211


223 264 269 277 283 290 313
325 341


348 358 396 418-420 474 484
506 508-


509 517 520-521 532 547 553
558 567


569 587 596 608 610 613 619
622 626-


627 642 679 734 745 818 843
887 896


903 916 969 971


fetal kidney Clontech FKD002 19 474 726 903


fetal kidney Invitrogen FKD007 3 118 186-187 230 244 271 432
887 969


fetal lung Clontech FLG001 69 132-133 156 168 208-209 217
267 269


274-275 286 354 394 396 406
462 483-


484 608 619 751 769 771 834
914-915


925


fetal lung Invitrogen FLG003 3 8 28-29 32 39 50 66 82 88
92 168 186-


187 200 204 212 226 229 246
274 309


327 332 368 374 382 394 398
426-427


431-432 442 485 536 555-557
587 604-


605 621 624 636 642-643 661
677-678


724 753 769 848 859 864 877-878
896


902 904 914-915 958


fetal lung Clontech FLG004 130-131 394 664 769 942


fetal liver-spleen Columbia University FLS001 3 8-10 12-13 16-17 19-25 27-29
33-35 37-


38 41 45-46 48 52 55-58 60-67
69 71-74


77-78 80 82 84 87-90 104-106
108-109


112-121 123-125 128-134 138
141 143-


146 149 151 156 159 163-164
167-172


174 176-179 181 184 186-188
190 194


200-201 203 208-209 211-212
216-217


219 224-227 229-230 232 234-235
237


241 243-244 246-248 254-255
258 260-


263 267 269-270 273-282 284-285
288-


290 292-295 297-299 301-306
308 311-


318 320-323 326 328 332 335
341-344


348 352 354-359 361-365 367-368
371-


374 376-380 382-383 388-389 394-396


398-399 401-411 413-414 416
418-421


425 428-430 432-433 437 439
442-444


449-450 452 456-457 461-470
472-474


478-479 481-482 484-485 487
490-494


497-499 504-507 511 514-515
517-521


523-524 526 529 532 537 540-541
547


555 558-559 563 575 577-578
580-596


598-599 601-603 606-608 610-613
617-


624 626-628 630-631 634-636
639 642-


643 647-648 654-656 663-665
672 674-


675 679 681 684 686 688 691
693-699


711 713 715 717 719-726 729
732-733


738-740 745 748-749 751-753
757 759


761 767-770 776-778 780 784
787 792-


794 799 804 809 811 813 817-819
822-


825 830-831 834 837 840 842
845-848


852 856 859 861-862 865 867-869
871


874-878 887-888 891 893-894
896-900


903 905-911 913 916 918 923
928 930-


931 936 939 942 944 946-950
952 958-


959 961-963 965 967 969-970
972-973


976-977 981-983


fetal liver-spleen Columbia University FLS002 3 8-13 15-17 19-20 22 25 28-29
33-35 37


41 45-46 52 54-56 60-61 63-64
66-70 73-


74 78 80 82 92 99 104-106 108-109
112


115-116 118 120-121 123-125
128 132-


135 139 141 143-144 146 149
152 156


159-161 167 169-172 174 176-177
179


181 185 188 190 194 196-197
200 204


212 214 216-218 223-224 226-230
232-


235 237 246-247 252 254-255
258-263


267 270-277 284-286 288 292
294-295


297-299 301 303-305 308 310
314 318


320 323 328 330-332 335-337
340 342-


344 352 354-355 358 361-365
367-368


371 373-374 376-377 382 388
394-396


398-399 401 405-406 409-411
413 418-


421 429 431 439-440 442-444
451-452


457 462-463 466-468 470 474
477-479


481 483-484 487-488 491 495
499 504


508-509 516 519-521 524 526-528
530


532 537 540-541 543 545-547
550-551


553 555 560 564 568 574-575
577-578


580-592 596-597 600 602-603
608 610-


611 613-614 617-618 621-622
628 630-


631 634 637 639 642 644 647
654 658-


659 665-667 669-675 679 681
684-685


688-690 693 695 697 708 711
713 715


717-719 723-727 729 731-734
738-739


741 745-746 749-750 753 759
761 766-


767 769-770 776-779 782 784
791-792


794 805 808 817-818 822 824-825
830


834 837 842 845-849 852 856
859 864-


865 867 874-878 888 891-892
896-900


903 905-906 908-909 913 916
918 921


923 925 932 936 939-940 942
944 946-


947 949-950 953 955-956 958-959
961-


963 965 968-970 973 977-978
981


fetal liver-spleen Columbia University FLS003 19 60 78 224 273 275 370 373-374
401


602-603 639 643 730 732 738-739
748


752 770 782 928 930 947 949


fetal liver Invitrogen FLV001 37 55 60 69 72-73 97 104-105
108 113-


114 116-118 121 135 143 152
167-168


186-187 195 200-201 209 217
223 240


244 253 255 275 284 301 311
314 317


336 342 348-349 358 371 374
382 394


402 411-412 418-419 428 430
442 453


517 568-569 580 582 584 587
589 601-


603 606-608 617-618 624 634
639 642-


644 646 664-665 669 679 715
717 720


726 745 748 751 769-770 782
791 794


797 824 830-831 845-847 852
859 870


899 913-916 925 928 948 956
958 969


976 982


fetal liver Clontech FLV002 72 418-419 632


fetal liver Clontech FLV004 3 160 169-171 355 367 374 376
547 617-


618 621 646 717 741 771 836
878 976


fetal muscle Invitrogen FMS001 15 27 32 37 67 72 83 99 112
121 138 167


174 177 186-187 190 203-204
211 215


230 252 259 312 374 403 406
409 457


461 485 505 517 528 530 540-541
544


549 554 558 579-580 583 602-603
608


639 642-643 654 664 699 715
730 737


751 772-773 788 802-803 810
848 856


859 864 868-869 887 893-894
905-906


910-911 923 948 967


fetal muscle Invitrogen FMS002 15 99 130-131 223 361-362 431
474 505


581 639 643 666-667 784 790
808 810-


811 874 880 887 903 946 950
958 962-


963 973


fetal skin Invitrogen FSK001 3 6 20-22 32-34 41-45 47 49-52
55 63-64


66 69 77 80 88 91 98 101 111-112
115


126 130-131 135 142 144 146
160 163


167 176 188-190 196 201 204
208 213


215 217-218 229 232 244 246
248 255


263 265-269 274 279-281 283
285 288


292 294 297 301 303 308 314
321 341-


342 344 348 354-355 358 361-362
366


369 371-372 374 381-382 384
386 394


401 403 405 413 415 428 431
437 440


460 466-467 472-473 477 481
483 495


499 504 517 522 532 536-537
539-541


545 556-558 569 574 576-578
580 584-


585 587-589 592-593 602-603
606-608


612 617-618 621 624 634 637
639 642-


643 647 664 673-674 676 680-681
689


699 705-707 709-715 724 728-730
738-


740 745 748 752 765 768-769
772-773


793 797 817 823 830 834 842
848 859


861 864 870 874 883 887-888
893-894


901 904 908-909 913-916 923
925 947


950 958 962-964 967 975


fetal skin Invitrogen FSK002 3 130-131 146 194 306 354 367
400 405


474 489 520-521 547 558 561-562
585


596 730 740 748 755 767 771
810 840


893-894 946 959


fetal spleen BioChain FSP001 276 563 842


umbilical cord BioChain FUC001 3 20 33-34 39 48 50 52 55-57
65 67 69 72


77 79 82 92 109 112-113 121
132-133


138-143 156 167-168 172 174
179 184-


185 190 194-196 200 202-203
208-209


229-230 244 269-271 278 284-285
290


297-299 303 305 308 320 331-332
336


338 342-343 363 367 372 374
379-380


383-384 392-394 397 399 402
405-406


410 425-427 429-430 449-450
474 476


484 497 499 501 504-505 510
515 517


532-533 539 549 551 558 563
569 574


577-578 581 586-587 597 602-603
608


610 617-619 621 626-627 634-637
639


642-643 658 663-664 674 690-691
693-


694 699 713 715-717 720 724
726 729


738-739 746-747 749 759 761
765 768-


769 774-775 793 797 807 818
822 837


848-849 856 862 868-869 874
885 887


892-894 903 906-907 916-917
919-920


928 936 939 944 946-947 962-963
967


969


fetal brain GIBCO HFB001 3 9-10 12-14 16 21 25 28-30 32-34
37-39


41 47-48 52-53 56 65 67 69 71-72
75 80


84 92 97 103 106 110 114 117-119
123-


124 127 129 132-133 135 138
141-142


144-146 148-149 152 156 159-160
168


172 174 176 179 181 184-185
190 198


208-209 212 214 219 221 223-224
229-


230 233-236 240 244 247 251 253-255


258-259 270 273 276-277 285
297 304-


305 308 312 314 322-323 325
328 332-


333 335-337 339-340 342-344
346 352


354 358 363 365 370-372 374
382 394-


396 398 401 403 405-406 409-412
414


416 425-427 431-432 437 442
445 453


456 462 466-467 469-470 472-474
479


483 488 490 492-497 500-501
504 506-


510 520-521 524 530 537 539
545 549


552 558 560-562 564 569 579
582-583


586-587 596 602-608 610-612
614 617-


624 626-628 630-631 633 635
638 641


643 647-648 656 658 661 676
679 688-


689 693 696-697 711-712 715
724 726


731 735 745 747-749 752 754
761 765


767-770 774 779-781 784-786
789 799-


800 802-803 813 818-819 823-824
831


834-835 837 839 845 848 859
864 866-


867 871 874-875 881 887 891
893-894


896-897 900 906-907 910-911
918 921-


922 925 927-928 934 943-944
946-947


950 953 962-963 965 969 972-973
977


macrophage Invitrogen HMP001 86 168 186-187 297 537 608 681
761 845


877


infant brain Columbia University IB2002 2-3 9-10 12-14 16 21 25 27-30
32 37-38


46-47 49 55-56 58 65 69 71-72
78-79 82


84-86 91-92 98-99 106 109-110
113-115


118 127-128 130-133 135 138
142 144


151 156 168 173-176 180-181
185-188


192 194 196-201 203 208 210-212
214


217-218 224 229-231 233 236
238 240-


241 244 246 251-256 259 263
270-271


277-279 284-285 287 293-294
296 301-


302 308 312-314 317 322-323
327 330


333 339 342 345-346 351 354
358 361-


362 365-366 368 370-371 373-374
382


388 394-396 402 405-406 411-412
415-


416 420 424-425 428 431 436-437
440-


441 444-445 453 456 460 465
474 479


482-483 488 495-496 498 501
503-504


506-510 515-517 520-521 524-525
529


531-532 534-535 537 539-542
544-545


549 561-562 569 574 577-578
580-583


586-587 589 592 596 600-608
610 612-


613 616-618 620 622 624 bzy-b;~z t~j~-
635 637 641 643-644 650-651 653 661
663-664 676-677 689 693 695-698 708
711 720-722 724 730 732 735 740 745-
748 754 765-766 768-769 779-781 785-
786 789 791 796 798 800-803 807 811-
813 818-819 822-824 830-831 834-835
837 839 842-843 845 854 856 858 864
867-869 875-877 879 881 887 892-894
896 903 907-911 913 916 919-920 925
930-932 936 939 943 946-947 953 958
970-973 977-978 982 984
infant brain Columbia University IB2003 3 12-13 21 27-29 32 39 49 69 72 82 91
113 116 126 128 132-133 142 144 156
176-177 184-185 188 194 208 212 223-
224 228 230 244 255 259 267 270 273
276 293-294 312 320 326-327 337 342
346 354-355 358 361-363 382 388 390
394 396 399 402 420 425 431 442 462
474 482 484 488 495-496 510 520-522
524 529 540-541 549 563 582 586 588-
589 596 600-603 606-607 612 617-618
620-621 632 647 650 679 720-722 724
735-736 746 751 754 769 785-786 793
800 807 811-813 818-819 822 824 831
834 838-840 843 856 864 892 896 907
919-920 925 930-931 936 947 950 957
973 982
infant brain Columbia University IBM002 16 47 82 84 201 263 302 376 394 4~,1 44C
488 537 592 606-607 635 740 769 887
892 906 921 926 971
infant brain Columbia University IBS001 84 86 180 185 198 201 203 230 279 312
326 346 354 366 388 488 542 581 588
620 647 664 732 740 785-786 801 807
822 827 910-911 925 931
lung, fibroblast Stratagene LFB001 3 11 25 49 65 75 114 141 156 160 172
190 198 209 217 224 229 234-235 267
269 274 277 282 284 303 308 312 320
334 336 352 372 396 398 412 414 437
453 464 470 481 492-494 508-509 532
539 581 584 617-619 621 628 633 643
688 691 745 752 761 768 794 822 837
848 876 887 953 967 973
lung tumor Invitrogen LGT002 1 3 9-10 12-13 20 31 38 41 46 48 51-52
56 58 63-64 72 74-75 78 82 88 101 106-
107 110 114-115 117-118 120-121 123-
124 128-133 135 143-146 149 151 156
159-161 163-164 167-168 172 176 178-
179 184-185 189-191 194-196 200 203
209 212 216-217 226 228-229 232 234-
236 241 246 248 256 258-259 263-264
269-271 274 282-283 285-286 290 292
294 297 301 308-309 311 314
317 321


326 328-329 331 333-334 341
348 352


354-355 363 365 371 380 382-383
388


394-395 398-402 405-406 410-411
413


416 418-419 426-427 439 442
452-453


458-459 461-462 464-465 470-471
474


478 483-484 490 495-496 499
510 522


524 528 536-537 540-541 543
548 556-


558 560-565 571-573 580 582
587-588


592 597 602-605 608 610 612-613
617-


622 625-629 633-634 636 642-644
648


661 664 669 679 688-689 691
693 699-


700 708 717 723-724 730 733-734
738-


740 745 747 749 752-753 761
767-768


770 779 782 784-786 789 793-794
797


817-818 820 823-824 834 837
842 845


848 855 857 859 862 864 866
870 875-


877 887 892 896 900-901 907-909
914-


915 919-920 923-925 939 943
947 949


953 958 962-963 965 968 970
972-973


977


lymphocytes ATCC LPC001 3 9-11 32 47 50 56 71 75 88
97 99 102


121 125 128-129 135 138 141
149 163


167-168 212-213 217 233 255
290 294


301 305 311 314 342 372 377
388 398-


399 410 437 442 453 470 474
481 495


500 506 510 529 532 537 542
558 571


579 604-605 610 620 628 637
643 658


666-667 676 679 697 708 713
728 730


734 749 765 768 796 807 818
822 834


839 848 859 875 885 887 896
903 906


914-915 928 947 973 981-982


leukocyte GIBCO LUC001 1 3 9 11 18-19 21 23-25 27 31-34
39 41-


42 46-48 52 54-58 62-69 71-72
74-75 78-


80 82 89-90 93 99 110 115-121
123-124


128-133 135 138 141 143-146
149 152


156 159-161 163 167-168 176
179 181


186-187 189-190 194 198 200
203-204


209 211-212 218-219 226 232-236
240


244 247 251 253-255 258-259
263-264


269 271 274 278-279 282-283
285 288-


290 294-295 297 301-306 311
313-314


317 320-321 325 328 330-331
335 337


342 344 348 350-351 353-354
358-359


361-365 368 371-372 375 388-389
394-


395 397-401 403 405 407 409-412
421


425-427 432 437 442 448-450
452 457


460-461 468-471 474 476 479-482
484


492-494 496-498 500 506-510
516-517


520-521 524 529-530 532 537
540-544


551 553-554 558 560-565 569
577-578


580-583 586-587 589 592 596-597
602-


603 606-608 610-624 626-628
630-631


634-635 641-643 654 657-658
661 663-


665 669 672 677 679 684-689
691 696-


697 699 708 711 713 715 717
721-724


728 730 738-740 747-749 755
761 765


767-769 771 774-779 782 784
789 791-


792 794-795 797 807-808 811-815
817-


818 822 824 828 830 832 834
839-840


842 845 848 856 859 862 864
867 871


875-877 887 891 893-894 896-898
903


906-911 913-916 921 923 925
927-928


930 932 935-936 939 943-944
947 949-


950 953 958-959 961-963 965
967 972-


973 982


leukocyte Clontech LUC003 1 41 82 106 119 123-124 160
177 184 201


212 221 228 271 279 285 295
321 325


372 394 411-412 443 468-470
530 532


537 551 569 580-581 613 619
623 626-


627 642 655 697 761 767 769
775 789


809 867 887 923 928 950


melanoma from cell line ATCC #CRL 1424 Clontech MEL004 3 25 55-56 67 71 78 109 121
129 146 167 172-173 176 200 209 212 258-259 263
278 297 301 306 312 335 338 340 352
361-362 367 388 395 402 410 418-419


429 437 454 464-465 481 496
500 503


507 524 532 539 560-562 581-582
587


589 599 612-613 617-621 623
643 657


663-664 672 715 724 748 752
761 767-


768 770 785-786 789 835 848
877 887


896 916 919-920 947 967 978-980


mammary gland Invitrogen MMG001 1 14 19 21 28-29 31-37 47 49-51
55 57


63-67 69 71-72 75-78 92 108-109
111 116


121 123-124 126 128 130-133
135 143-


144 148-150 156 159 164 168
172 177-


179 184 186-187 190 194 200-204
209


212 217 226 230 232-236 241
244 246-


247 252 255 258-259 263 268
270 275


279-283 285 290 292-293 301
304-305


311 313-314 317 320 322-323
326-327


330 332 338 342-344 348-349
354 360


363 367 371 374 380 382-383
385 388


394-395 398 401-403 407 409
411-412


418-420 426-427 430 435 437
442 449-


453 459 461 465-468 470 474
477-478


480 483 485 488 498 500 503-504
507


515 519 522 524 529-532 538-541
544


547 555 560 563 565 569 573-574
579-


580 582 584 587-589 593 597
601-610


612-613 615-618 620-622 624
634 636-


637 639 642-644 646-647 650 657
663-


664 674 676 679 688-689 691
693 696


701-703 713 715 717 728 730
732 738-


739 741-743 745 749 751 753 763
767


769 772-773 785-786 793 796-797
812


821-824 830-833 837 848 856 859
861


864 868-870 876-877 887 891 893-894


898 903-904 907-911 913-918 921
923


925-926 930-931 936 942 949-950
958


961 966-967 969 972-973


induced neuron cells Stratagene NTD001 9 65 82 92 106 113 142 146 156
172 176 191 208 221 258 277 328 333 346
361-362 371-372 375 388 410 414 418-419


440 471 484 495 516 524 529-530
592


610 628 642 650 745 748 752 761
793


818 848 851 897


retinoic acid induced neuron cells Stratagene NTR001 19 87 184 305 385 440 474 626-627
643 748 799 834 977


neuronal cells Stratagene NTU001 19 33-34 42 70 82 87 109 115
126 146


172 185 188 194 212 255 269 274
283


312 317 329 340 361-362 367 379
394


399 401 410 420 426-427 474 479
507


530 579 582-583 610 617-618 636
643


658 732 740 765 769 784 791 793
799


802-803 818 842 851 864 897 907
932


pituitary gland Clontech PIT004 3 19 123-124 194 255 354 358
373-374


377 426-427 462 492-494 635 785-786


793 893-894


placenta Clontech PLA003 138 176 574 896 972


prostate Clontech PRT001 3 9 16 57 65 75 83 108 130-134
138 141


146 149-150 159 182 186-187 190
203


209 234-235 276 283 322 413 415
442


449-450 453 480 484 490 499-500
503


505-506 523 537 543 564 583
602-603


611 619 623 643 650 697 711 729
761


765 770 776-778 784 789 819 822
831


839 862 866 887 904 907 921 935
962-


963 967 973


rectum Invitrogen REC001 19 30 33-34 66 108-109 123-124
126 129-131 143 149 151 156 164 190 201
240


247 250 263 268 274 279 287 295
298-


299 310 314 332 341 354 384 394
401


420 425 442 446 459 483 485 520-521


532 545 559 580-581 584 592 602-607


610 612 615 619 634 637 646 655
664


683-684 741 769 793 822 870 908-911


914-916 934 937-938 942 967 973
982


salivary gland Clontech SAL001 16 68 74 84 121 123-124 156 172
190 203


209 232 248 254 269 292 294 363
377


395 398 400 402 405-406 410 430
442


459 462 474 483 485 563-564 579
587-


588 599 602-603 643 658 699 728
730


737 741 748 794 822 867 876 897
903


981


salivary gland Clontech SALs03 217 254 270 388 610


skin fibroblast ATCC SFB001 517 949


skin fibroblast ATCC SFB002 269 688


skin fibroblast ATCC SFB003 3 203 897 907


small intestine Clontech SIN001 3-4 47 57 68-69 92 99 125-126
130-131


135 149 151-152 156 159 185
204 241


246 291-292 318-319 338 343
348 363


373 375 382 388-389 392-394
397 400


437 466-467 471 484 500 517
520-521


525 547 560 580-581 588 599
602-603


612 624 643 711 731 733-734
757 761


769 774-775 794 824 864 904
906 910-


911 913 948 953 959 976 984


skeletal muscle Clontech SKM001 15 75 135 146 172 190 218 267
282 308


410 426-427 474 505 588 620
623 658


692 713 737 779 790 862 874
878 887


952 962-963


skeletal muscle Clontech SKMs04 215


spinal cord Clontech SPC001 14 20-21 25 28-29 31 39 46
48 59 78 83-


84 91-92 103 112-113 135 160
168 172


176 188 190 205 209 229 232
258 285


301 308 312-314 321 323 329
346 374


377 380 383 388 394 398 406
409-410


431 449-450 453 455 466-467
470-471


484-486 488 495 497 500 503
508-509


524 537 539 558 581 586 604-605
611


619 623 630-631 633 656 663
711 715


729 736 740-741 761 767 769
776-778


780 818 822 831 835-836 840
843 859


861 871 875 887-888 897 906-907
913


919-920 928 931 953 958


adult spleen Clontech SPLc01 3 6 12-13 66 130-131 178 365
403 431


461 558 610 715 797 809 876
947 967


stomach Clontech STO001 35 114 130-131 144 155 176
189 206-207


249 260-262 336 382 398 425
431 453


461 483 496 500 527 530 580
642 657


663 669 748 765 768 802-803
839 891


942 981


thalamus Clontech THA002 30-32 48 66 109 127 130-131
135 142


145 156-158 168 172 174 185
199 224-


225 233 246 277 282 286 293
322 332


334 346 374 384 400 402 420
424 435-


437 446 466-467 485 503 506
527 542


549 572 612 615 622 624 633
643-644


658 676 736 790 794 824 831
835 896


907 950 969


thymus Clontech THM001 10 16 20 28-29 32 37 41 52
57 66-67 74-


75 110 118 121 129-131 141
151 159-160


208 211 218 247 269 289 295
297 320


325 354 358 365 367 372 378
388-389


395 398 411-412 420 423 435
452 500


508-509 517 524 532 537 551
558 560


569 577-578 582 586 598 608
611 622


643 684 715 721-723 728 740
766 772-


773 795 834 837 849 864 885
900 921


946 948 958 962-963 965 972-973
982


thymus Clontech THMc02 1 3 9-11 16 21 27 32-34 38-39
51 55-57


66 72 74 77-78 80 82 89-90
101 112 115


118-119 121 123-124 126 138
144 152


159 168 174 176 178 186-188
197 200


208 212-214 217 225 233 243-244
246


254 256-262 279 282 285 288-289
296-


297 313-314 322 334 343 354-355
358-


359 363-364 367-368 372-373
382 387-


389 395 400 402 411 414 426-427
437


440 442 449-450 454 457 462
464 469


474 479 481 485 490-491 506
508-509


511 517 522 526 528 532 542
551 554


561-562 564 566-570 580-582
585 589


597 599-600 602-608 611 613-614
619-


621 625 628 630-631 644 646
655 669


672 677 684 686-693 697 713
717 720


728 740 746 749 760-762 767
771 775


794 797 804 808 811 816 818-819
837


840 859 880 883 887-888 896-897
903


908-911 913 916 924 936 947-948
950


962-963 965 967 970


thyroid gland Clontech THR001 3 8-9 14-15 19-22 28-29 39
41 55-56 66


69 71-72 78-79 97 104-105 109
113 115


119 121 123-124 130-133 135
138 143-


144 146 148 151-152 156 159-163
165


168 172 174 177 183-184 196
199-200


203 209 211 215-218 228-229
232-236


244 254-255 258 273 282 290
292 294


297 303-306 308 311 317-318
322-323


325-326 334-335 340 342 348
354 358


373 377 381-382 387 394 398
401-402


405-406 409-412 416 422 425-427
429-


431 440 449-453 462 466-468
474 478-


479 481-484 490 492-496 500-501
505-


506 517-518 522-525 532 537
540-541


545 551 558 560 563-564 580
583 587-


589 593 597 599 606-607 610
617-621


625-628 633 635 641-643 658-659
664-


669 674 682 686 688-691 696
699 715


724 730 740 742-743 747 750
752 759


761 765-766 768-769 779 789
796 802-


803 813 818-819 822 831 837
843 845


848-849 862 864 868-869 871
874 876-


877 887 893-894 896-897 907-909
912


919-921 923 925 928 936 940-942
944


946-947 950 953 955 958-959
962-963


967 969 973 981


trachea Clontech TRC001 33-34 55-56 69 74 163 172 190
209 212


267 270 297 305 314 352 413
426-427


466-467 500 502 504 580 586
610 613


633 642 688 691 711 724 738-739
774


782 816 820 839 848 862 868-869
914-


915 928 968


uterus Clontech UTR001 4 9 18 37 63-64 74 108 114-115
130-131


160 166 179 184 190 209 233
249 269


285 301 314 327 337 348 384
394 399-


400 403 406 411 425 431 434
437 440


462 474 485 490 508-509 526
532 579


617-619 636 642-643 672 761
769 793


837 849 864 887 903 906 928
934 947


967


TABLE 2
SEQ ID NO:  ACCESSION NUMBER  SPECIES  DESCRIPTION  SMITH-WATERMAN SCORE  % IDENTITY


1  L06175  Homo sapiens  occurs in MHC class I region; ORF  308  98
2  Y70775  Homo sapiens  Follistatin-related protein zfsta.  3094  98
3  X15187  Homo sapiens  precursor polypeptide (AA -21 to 782)  4112  100
4  AF110640  Homo sapiens  orphan seven-transmembrane receptor  344  100
5  G03798  Homo sapiens  Human secreted protein, SEQ ID NO: 7879.  158  72
6  W85607  Homo sapiens  Secreted protein clone da228_6.  1477  100
7  Y30162  Homo sapiens  Human dorsal root receptor 4 hDRR4.  884  88
8  Y15227  Homo sapiens  Leu1  391  100
9  Y28817  Homo sapiens  pt326_4 secreted protein.  3338  100
10  X92106  Homo sapiens  bleomycin hydrolase  2445  100
11  Y15228  Homo sapiens  Leu2  445  100
12  U27838  Mus musculus  glycosyl-phosphatidyl-inositol-anchored protein homolog  432  34
13  U27838  Mus musculus  glycosyl-phosphatidyl-inositol-anchored protein homolog  320  27
14  Y71062  Homo sapiens  Human membrane transport protein, MTRP-7.  2323  99
15  U96781  Homo sapiens  Ca2+ ATPase of fast-twitch skeletal muscle sarcoplasmic reticulum, adult isoform  5145  100
16  M16653  Homo sapiens  pancreatic elastase IIB zymogen  1435  99
17  Y13398  Homo sapiens  Amino acid sequence of protein PRO346.  1749  99
18  Y02283  Homo sapiens  Secreted protein clone br342_11 polypeptide sequence.  1399  99
19  Y53030  Homo sapiens  Human secreted protein clone d24_1 protein sequence SEQ ID NO:66.  1371  100
20  AL031320  Homo sapiens  dJ20N2.5 (novel protein similar to fucosidase, alpha-L-1, tissue (EC 3.2.1.51, alpha-L-fucosidase fucohydrolase))  2597  99
21  B01384  Homo sapiens  Neuron-associated protein.  1876  100
22  Y68778  Homo sapiens  Amino acid sequence of a human phosphorylation effector PHSP-10.  2470  100




23  Y55935  Homo sapiens  Human KHS2 protein.  4781  99
24  Y55935  Homo sapiens  Human KHS2 protein.  2807  100
25  AC024792  Caenorhabditis elegans  contains similarity to TR:O95029  463  31
26  Y07972  Homo sapiens  Human secreted protein fragment  1540  100
27  X97630  Homo sapiens  serine/threonine protein kinase  3781  98
28  AF150755  Mus musculus  microtubule-actin crosslinking factor  3514  68
29  AF150755  Mus musculus  microtubule-actin crosslinking factor  3725  70
30  Z38011  Mus musculus  DMR-N9  2988  86
31  AJ000522  Homo sapiens  axonemal dynein heavy chain  6058  99
32  AF037256  Mus musculus  ES2 protein  2260  91
33  S62140  Homo sapiens  TLS=nuclear RNA-binding protein  2917  100
34  S62140  Homo sapiens  TLS=nuclear RNA-binding protein  2890  98
36  AB038237  Homo sapiens  G protein-coupled receptor C5L2  1767  100
37  D79994  Homo sapiens  similar to ankyrin of Chromatium vinosum.  6089  99
38  X63380  Homo sapiens  serum response factor-related protein  1966  99
39  AL022072  Schizosaccharomyces pombe  lipoic acid synthetase  1067  61
40  J03930  Homo sapiens  alkaline phosphatase  2751  100
41  AF132968  Homo sapiens  CGI-34 protein  1088  98
42  AL117637  Homo sapiens  hypothetical protein  2208  100
43  AL021393  Homo sapiens  bK747E2.1 (novel protein)  1526  100
44  X68011  Homo sapiens  ZNF81  1886  100
45  AC002464  Homo sapiens  organic cation transporter; 50% similarity to JC4884 (PID:g2143892)  2423  100
46  W78245  Homo sapiens  Fragment of human secreted protein encoded by gene 19.  1949  100
47  Y41765  Homo sapiens  Human PRO1083 protein sequence.  3604  100
48  AF097330  Homo sapiens  H1 chloride channel; p64H1; CLIC4  1305  99
50  U09413  Homo sapiens  zinc finger protein ZNF135  1361  57
51  AF061812  Homo sapiens  keratin 16  2374  100
52  W63681  Homo sapiens  Human secreted protein 1.  1326  99
53  AB035303  Homo sapiens  cadherin-10  4094  100
54  A12022  synthetic construct  MRP-8  485  100
55  AL121897  Homo sapiens  bA392M18.3 (KIAA0180)  1867  100
56  Y73330  Homo sapiens  HTRM clone 397663 protein sequence.  818  96
57  AF151018  Homo sapiens  HSPC184  955  100
58  AF125042  Homo sapiens  bisphosphate 3'-nucleotidase  1586  100
59  AF118670  Homo sapiens  orphan G protein-coupled receptor  1971  100
60  X04494  Homo sapiens  precursor polypeptide  1903  100
61  AF208865  Homo sapiens  EDRF  528  100
62  D15057  Homo sapiens  DAD-1  567  100
63  AF260665  Homo sapiens  histone acetyltransferase  1510  100
64  AF260665  Homo sapiens  histone acetyltransferase  1429  96
65  AJ277145  Homo sapiens  ras-related small GTPase RAB18  1073  100
66  Y94950  Homo sapiens  Human secreted protein clone dh1073_12 protein sequence SEQ ID NO:106.  348  100
67  Y82744  Homo sapiens  DNA replication and repair associated protein (DRASP).  1028  100
68  Y44486  Homo sapiens  Human GPRW receptor polypeptide.  1721  100
69  AL031228  Homo sapiens  dJ1033B10.2 (WD40 protein BING4 (similar to S. cerevisiae YER082C, M. sexta MNG10 and C. elegans F28D1.1))  3196  100




70  AJ276316  Homo sapiens  zinc finger protein 304  1751  52
71  Y18314  Homo sapiens  paraplegin-like protein  4146  99
72  AF157028  Homo sapiens  protein phosphatase methylesterase-1  2017  100
74  Y71082  Homo sapiens  Human B-aggressive lymphoma (BAL) protein.  1765  99
75  AF225420  Homo sapiens  AD025  734  100
76  X95235  Homo sapiens  transcription factor AP2  217  100
77  AF108420  Takifugu rubripes  1-aminocyclopropane-carboxylate synthase  733  56
78  G01349  Homo sapiens  Human secreted protein, SEQ ID NO: 5430.  650  99
79  AL117635  Homo sapiens  hypothetical protein  922  99
81  Z85986  Homo sapiens  dJ108K11.3 (similar to yeast suppressor protein SRP40)  865  77
82  AF183414  Homo sapiens  hemin-sensitive initiation factor 2a kinase  3231  99
83  G01143  Homo sapiens  Human secreted protein, SEQ ID NO: 5224.  495  98
84  U03985  Homo sapiens  N-ethylmaleimide-sensitive factor  3744  99
85  Y17791  Homo sapiens  VAX2 protein  1496  100
87  AF263538  Homo sapiens  growth differentiation factor 3  1944  99
88  Y19757  Homo sapiens  SEQ ID NO 475 from WO9922243.  1361  100
89  AF161493  Homo sapiens  HSPC144  1185  100
90  AF161493  Homo sapiens  HSPC144  856  100
91  B25780  Homo sapiens  Human secreted protein SEQ ID  647  41
92  U57344  Mus musculus  Meis3  1007  89
93  AF172854  Homo sapiens  cardiotrophin-like cytokine CLC  1197  98
94  AL390114  Leishmania major  extremely cysteine/valine rich protein  223  29
95  AB016886  Arabidopsis thaliana  contains similarity to adenylate kinase~gene_id:MCA23.18  287  38
96  AC005525  Homo sapiens  F22162_1  1855  96
97  B20997  Homo sapiens  Human nucleic acid-binding protein, NuABP-1.  3836  99
98  AJ006692  Homo sapiens  ultra high sulfur keratin  507  70
99  AF172264  Homo sapiens  Traf2 and NCK interacting kinase, splice variant 1  6942  99
100  L11239  Homo sapiens  homeobox protein  717  100
101  AC004890  Homo sapiens  similar to zinc finger proteins; similar to AAC01956 (PID:g2843171)  2154  98
102  AC003682  Homo sapiens  R28830_2  1287  48
103  AF201839  Rattus norvegicus  dynamin IIIbb isoform  4270  95
104  Y79510  Homo sapiens  Human carbohydrate-associated protein CRBAP-6.  1394  100
105  Y79510  Homo sapiens  Human carbohydrate-associated protein CRBAP-6.  1209  90
106  AL096748  Homo sapiens  hypothetical protein  1216  100
108  X97260  Homo sapiens  Metallothionein 2  381  100
109  AL034422  Homo sapiens  dJ1141E15.2 (novel protein)  433  100
110  AF191338  Homo sapiens  anaphase-promoting complex subunit 4  683  100
111  AL021712  Arabidopsis thaliana  putative protein  185  26
112  AF250138  Homo sapiens  small stress protein-like protein HSP22  1063  100
113  AL109976  Homo sapiens  dJ794I6.1.1 (novel protein)  4176  99
114  Y36151  Homo sapiens  Human secreted protein  668  100




115  AF110399  Homo sapiens  elongation factor Ts  1666  100
116  AF210317  Homo sapiens  facilitative glucose transporter family member GLUTS  2052  99
117  Y73328  Homo sapiens  HTRM clone 082843 protein sequence.  931  100
118  X04085  Homo sapiens  catalase  2846  100
119  AF147717  Homo sapiens  ubiquitin C-terminal hydrolase UCH37  1695  100
120  X73882  Homo sapiens  microtubule associated protein  3801  99
121  AC004882  Homo sapiens  similar to CAA16821 (PID:g3255952)  3223  100
122  M93311  Homo sapiens  metallothionein-III  421  100
123  G03827  Homo sapiens  Human secreted protein, SEQ ID NO: 7908.  557  94
124  G03827  Homo sapiens  Human secreted protein, SEQ ID NO: 7908.  222  53
125  AF232009  Homo sapiens  peroxisomal trans 2-enoyl CoA reductase  1565  99
126  AB004906  Ipomoea purpurea  transposase  146  20
127  M60165  Homo sapiens  guanine nucleotide-binding regulatory protein 2  1832  99
128  Y10319  Homo sapiens  carnitine carrier  1592  100
129  U75467  Drosophila melanogaster  Atu  937  36
130  Z21507  Homo sapiens  human elongation factor-1-delta  494  87
131  Z21507  Homo sapiens  human elongation factor-1-delta  938  100
132  Y58633  Homo sapiens  Protein regulating gene expression PRGE-26.  6745  100
133  Y58633  Homo sapiens  Protein regulating gene expression PRGE-26.  4818  95
134  M13692  Homo sapiens  alpha-1 acid glycoprotein precursor  1064  99
135  U72970  Sus scrofa  calcium/calmodulin-dependent protein kinase II isoform gamma-B  2723  99
136  G03213  Homo sapiens  Human secreted protein, SEQ ID NO: 7294.  450  100
137  AC005102  Homo sapiens  small inducible cytokine subfamily A member 24  627  99
138  AF155648  Homo sapiens  putative zinc finger protein  5855  92
139  AF144638  Homo sapiens  sphingosine-1-phosphate lyase  2977  100
140  AF152318  Homo sapiens  protocadherin gamma A1  4778  100
141  B08517  Homo sapiens  Amino acid sequence of a beta-tubulin antigen.  5841  100
142  X56667  Homo sapiens  calretinin  1410  99
143  X92763  Homo sapiens  tafazzins  1605  100
144  Y95293  Homo sapiens  Human GEF containing NEK-like kinase substrate sGNK.  4092  99
145  AF226046  Homo sapiens  GK003  1198  100
146  M22877  Homo sapiens  cytochrome c  554  98
147  AJ272212  Homo sapiens  protein serine kinase  2196  100
148  AB026491  Homo sapiens  PICK1  2114  98
149  AB018580  Homo sapiens  hluPGFS  1699  100
150  X91868  Homo sapiens  six1  1509  100
151  AF266505  Mus musculus  pseudouridine synthase 3  2135  84
152  U29170  Drosophila melanogaster  ANON-23D  883  43
153  G04075  Homo sapiens  Human secreted protein, SEQ ID NO: 8156.  567  99
154  AY009128  Homo sapiens  ISCU2  138  100




155  AF141315  Homo sapiens  alpha-1,4-N-acetylglucosaminyltransferase  1842  100
156  AF110645  Homo sapiens  candidate tumor suppressor p33 ING1 homolog  1294  99
157  AF159297  Zea mays  extensin-like protein  238  25
158  AL133325  Homo sapiens  dJ984P4.3 (Homeobox protein NKX2B)  1437  100
159  AF073298  Homo sapiens  small EDRK-rich factor 2  294  100
160  AC004858  Homo sapiens  U1 small ribonucleoprotein 1SNRP homolog; match to PID:g4050087  4032  100
161  AB012109  Homo sapiens  APC10  990  100
162  AL162751  Arabidopsis thaliana  putative protein  194  32
163  AJ005698  Homo sapiens  poly(A)-specific ribonuclease  3351  100
164  AF117646  Homo sapiens  long CBL-3 protein  2547  99
165  AC004002  Homo sapiens  similar to ciliary dynein beta heavy chain; 78% Similarity to P23098 (PID:g118965)  5065  100
166  M10942  Homo sapiens  human metallothionein-Ie  381  100
167  AF126484  Homo sapiens  CARD4  4961  100
168  AF161518  Homo sapiens  HSPC169  1604  100
169  M64983  Homo sapiens  fibrinogen beta chain  2482  100
170  M64983  Homo sapiens  fibrinogen beta chain  2679  100
171  M58514  Gallus gallus  fibrinogen beta chain  1059  78
172  AF078845  Homo sapiens  16.7Kd protein  786  100
173  AC004774  Homo sapiens  Dlx-6  923  100
174  Z98974  Schizosaccharomyces pombe  putative vacuolar protein sorting-associated protein  185  31
175  X56203  Plasmodium falciparum  Liver stage antigen  283  23
176  W74726  Homo sapiens  Human secreted protein fg949_3.  1879  100
177  AJ222967  Homo sapiens  cystinosin  1920  100
178  AC024796  Caenorhabditis elegans  contains similarity to TR:O76167  221  27
179  Y66632  Homo sapiens  Membrane-bound protein PRO276.  1370  100
180  AF151803  Homo sapiens  CGI-45 protein  215  28
181  G02694  Homo sapiens  Human secreted protein, SEQ ID NO: 6775.  283  100
182  Y17292  Homo sapiens  Human cell death preventing kinase (DPK-1) protein sequence.  2676  100
183  AF234765  Rattus norvegicus  serine-arginine-rich splicing regulatory protein SRRP86  148  27
184  AF151855  Homo sapiens  CGI-97 protein  1214  96
185  AF289664  Mus musculus  CYLN2  4673  90
186  AL022238  Homo sapiens  dJ1042K10.2 (supported by GENSCAN, FGENES and GENEWISE)  4059  100
187  AL022238  Homo sapiens  dJ1042K10.2 (supported by GENSCAN, FGENES and GENEWISE)  2332  100
188  X83543  Homo sapiens  APXL  8513  99
189  AF059569  Homo sapiens  actin binding protein MAYVEN  3106  99
190  M18135  Rattus norvegicus  smooth-muscle alpha-tropomyosin  1306  95
191  AF242194  Drosophila melanogaster  brakeless-B  147  52
192  D30689  Bacillus subtilis  subunit of nitrite reductase  113  29
193  Y44984  Homo sapiens  Human epidermal protein-1.  538  97




194 B25679 Homo sapiensHuman secreted protein760 100
sequence
encoded by gene 15
SEQ ID N0:68.


195 AB020315787 homologue of mouse 1466 100
dkk-1 gene:Acc


196 U35730 Mus musculusjerky 2021 75


197 AL136450Homo SapiensdJ510O21.1 (novel protein)632 - 100


198 X56203 Plasmodium liver stage antigen 512 24
falciparum


199 Y70775 Homo SapiensFollistatin-related 2027 63
protein zfsta.


200 X87237 Homo Sapiensa-glucosidase I 4447 99


201 AF101078CaenorhabditisCLU-1 1393 46
elegans


202 X04571 Homo Sapiensprecursor polypeptide 6611 100
(AA -22 to
1185)


203 X00474 Homo SapienspS2 precursor 466 100


204 AB029333HalocynthiaHrPET-1 974 54
roretzi


205 AF146019Homo Sapienshepatocellular carcinoma998 100
antigen
gene 520


206 AF071002Homo Sapiensminx-related peptide 632 100
1; MiRPl


207 AB038162Homo Sapienstrefoil factor 2 744 100


208 U30521 Homo SapiensP311 HUM 363 100


209 AB000911Sus scrofa ribosomal protein 782 100


210 AB021227Homo sapiensmembrane-type-5 matrix3545 100
metalloproteinase


211 AF180920Homo Sapienscycliri L ania-6a 2722 100


212 AF105365Homo SapiensK-Cl cotransporter 5624 100
KCC4


213 U29244 Caenorhabditissimilar to human (TRE)602 32
elegans transforming
protein (PIR:S22157)


214 AL033538Homo SapiensdJ477H23.1 (novel protein)3195 100


215 X52011 Homo Sapiensmuscle determination 1262 100
factor


216 AF083248Homo Sapiensribosomal protein L26 739 100
homolog


217 AF006751Homo SapiensES/130 4793 99


218 AB007859Homo SapiensKIAA0399 protein 3559 99


219 AK026291Homo Sapiensunnamed protein product826 100


221 Y84045 Homo SapiensSplice variant of cancer5851 97
associated
polypeptide CH1-9a11-2.


222 267996 Homo Sapienstenascin-R (restrictin)7186 100


223 AF134802Homo Sapienscofilin isoform 1 846 100


224 Y17711 Homo sapiensatopy related autoantigen1611 99
CALC


225 AF190051 Gallus gallus hepatocyte nuclear factor 1a dimerization cofactor isoform 443 81


226 AK026256Homo sapiensunnamed protein product866 98


227 269368 Schizosaccharomyces pombe nuf2-like coiled-coil protein 230 25


228 AF275948Homo SapiensABCA1 11763 99


229 AF161384Homo SapiensHSPC266 2006 98


230 Y16270 Homo Sapiensparalemin 1951 100


231 AJ245599Homo Sapiensputative secreted ligand2379 99


232 W88499 Homo SapiensHuman stomach carcinoma1545 99
clone
HP10412-encoded protein.


233 AF096286Mus musculuspecanex 1 3623 93


234 V64619_cdHomo Sapiens30-NOV-1990 Human HE1 796 100
1 cDNA.


235 V64619_cdHomo Sapiens30-NOV-1990 Human HE1 470 98
1 cDNA.


236 AF227258Bos taurus RPGR-interacting protein-11262 38


237 AJ132445Homo Sapiensclaudin-14 1181 100


238 AL034562 Homo sapiens dJ684024.2 (prodynorphin (Beta-Neoendorphin-Dynorphin precursor, Proenkephalin B precursor)) 1330 100


239 AF262027Homo SapienseIF-5A2 808 100


240 AL079344Arabidopsisputative protein 194 33
thaliana


241 AC002394 Homo sapiens Gene product with similarity to dynein beta subunit 1542 51


242 AJ271361Takifugu FRANK2 protein 303 30
rubripes


243 AL021918 Homo sapiens b34I8.1 (Kruppel related Zinc Finger protein 184) 1476 48


244 AF190167Homo Sapiensmembrane associated 1736 99
protein SLP-2


245 Y10601 Homo Sapiensankyrin-like protein 5877 100


246 AL121771 Homo sapiens dJ548G19.1.1 (novel protein (ortholog of mouse zinc finger protein ZFP64) (translation of cDNA NT2RP3001398 (Em:AK001596)) (isoform 1)) 3628 100


247 L25314 Drosophila melanogaster actin-related protein 984 47


248 X63745 Homo SapiensKDEL receptor 1095 100


249 AF112208 Homo sapiens 13kDa differentiation-associated protein 816 100


250 AP001707 Homo sapiens human gene for claudin-8, Accession No. AJ250711 1172 100


251 AL136125Homo SapiensdJ304B14.1 (novel protein)778 100


252 AL031186Homo SapiensbK984G1.1 (supported 532 100
by FGENES)


253 Y17531 Homo SapiensHuman secreted protein639 100
clone BL205
14 protein.


254 AL049843Homo SapiensdJ392M17.3 (KIAA0349 6741 99
protein)


255 AJ242972 Homo sapiens TOLLIP protein 1424 99


256 Y94873 Homo SapiensHuman protein clone 1876 100
HP02632.


257 AF279865Homo Sapienskinesin-like protein 2903 100
GAKIN


258 AL024498Homo SapiensdJ417M14.1 (novel protein)589 100


259 866278 Homo SapiensTherapeutic polypeptide830 100
from
glioblastoma cell line.


260 AF101784Homo Sapiensb-TRCP variant E3RS-IkappaB3226 99


261 AF101784Homo Sapiensb-TRCP variant E3RS-IkappaB2821 100


262 AF101784Homo sapiensb-TRCP variant E3RS-IkappaB3149 99


263 AF197060Homo sapienssrc homology 3 domain-containing2257 100
protein HIP-55


264 Y86262 Homo SapiensHuman secreted protein766 100
HAQAR23,
SEQ ID N0:177.


265 Y56966 Homo sapiens Human SBPSAPL polypeptide. 2779 100


266 Y56966 Homo sapiensHuman SBPSAPL polypeptide.1018 99


267 AJ300465Homo Sapiensputative white family 1557 95
ATP-binding
cassette transporter


268 AC004030Homo SapiensF21856_2 3579 99


269 X55954 Homo sapiensHL23 ribosomal protein714 100


270 AB033921 Mus musculus Ndr1 related protein Ndr2 1855 94


271 AF081886Homo sapiensERO1-like protein 1905 99


272 AF166492Homo sapienssmall GTPase RAB6B 1060 100


273 AL022238Homo SapiensdJ1042K10.4 (novel 2201 100
protein)


274 W88667 Homo SapiensSecreted protein encoded1530 99
by gene
134 clone HAIBP89.


275 X00129 Homo sapiensprecursor RBP 1044 97


276 247500 Homo sapiens11-MAY-1998 Human RHOH1161 100
cdl gene
sequence.


277 AB049188 Equus caballus ubiquitin C-terminal hydrolase 1118 96




278 AF270647 Homo sapiensGTTl 1564 100


279 AF143956 Mus musculuscoronin-2 2414 94


280 885151 Homo SapiensEndothelial cell polypeptide.911 92


281 885151 Homo SapiensEndothelial cell polypeptide.1031 100


282 D83948 Rattus S1-1 protein 3975 90
norvegicus


283 Y14768 Homo SapiensI Kappa B-like protein2037 100


286 AL031316 Homo sapiensdJ28O10.3(HSD11B1 294 100
(hydroxysteroid (11-beta)
dehydrogenase 1)


287 D64109 Homo sapiens tob family 1773 99


288 AB026043 Homo SapiensMS4A7 1230 100


289 M61866 Homo SapiensKrueppel-related DNA-binding209 90
protein


290 AJ001810 Homo SapiensmRNA cleavage factor 1217 100
I 25 kDa
subunit


291 Y99454 Homo sapiens Human PRO1605 (UNQ786) amino acid sequence SEQ ID NO:395. 694 100


292 Y44824 Homo SapiensHuman molecule associated2370 100
with cell
proliferation, MACP-4.


293 AJ276101 Homo sapiens GPRC5B protein 2099 100


294 AF161406 Homo sapiensHSPC288 719 100


295 Y58628 Homo SapiensProtein regulating 1276 100
gene expression
PRGE-21.


296 U91561 Rattus pyridoxine 5'-phosphate1239 87
norvegicusoxidase


297 L02956 Xenopus ribonucleoprotein 1624 83
laevis


298 AF226730 Homo sapiens Cyt19 1729 99


299 AF226730 Homo sapiens Cyt19 906 98


300 Y54324 Homo SapiensAmino acid sequence 718 89
of a human
gastric cancer antigen
protein.


301 AF125533 Homo SapiensNADH-cytochrome b5 1606 100
reductase
isoform


302 Y32206 Homo SapiensHuman receptor molecule1676 98
(REC)
encoded by Incyte clone
2825826.


303 AF247565 Homo sapiens hepatocellular carcinoma associated ring finger protein 525 100


304 AF208844 Homo SapiensBM-002 428 100


305 AC004983 Homo sapienssimilar to PID:g38779441988 100


306 AL132978 Arabidopsisputative protein 210 25
thaliana


307 Y10530 Homo Sapiensolfactory receptor 1645 100


308 AF180681 Homo Sapiensguanine nucleotide 3597 100
exchange factor


309 AF111856 Homo Sapienssodium dependent phosphate3591 99
transporter isoform
NaPi-3b


310 Y13583 Homo SapiensG-protein coupled receptor2171 100


311 273420 Homo sapiens cE146D10.2 (mercaptopyruvate sulfurtransferase (EC 2.8.1.2)) 1598 100


312 X79535 Homo Sapiensbeta tubulin 2348 100


313 AF070658 Homo SapiensHSPC002 861 100


314 AF078866 Homo SapiensSURF-4 1395 100


317 237986 Homo Sapiensphenylalkylamine binding1258 100
protein


320 AB047892 Macaca hypothetical protein 258 82
fascicularis


321 Y25755 Homo sapiensHuman secreted protein1440 100
encoded
from gene 45.


322 AB016531 Homo sapiensPEX16 1741 100


323 AL391141 Arabidopsis thaliana putative protein 274 49


325 AF140501 Homo sapiens DNA polymerase iota 3691 99


326 X96698 Homo sapiensD1075-like 1450 96


327 AF152325 Homo Sapiensprotocadherin gamma 4769 100
AS


328 AF151803 Homo SapiensCGI-45 protein 1970 100


329 X74070 Homo Sapienstranscription factor 639 81
BTF3


330 AF171102 Homo Sapiensretinal degeneration 1302 95
B beta


331 W54040 Homo SapiensHuman interferon-inducible484 98
protein,
HIFI.


332 AF024617 Homo Sapienstranscription-associated691 100
zinc ribbon
protein


333 U19181 Rattus Rabin3 2129 90
norvegicus


334 603877 Homo sapiensHuman secreted protein,621 100
SEQ ID
NO: 7958.


335 AL008582 Homo SapiensbK223H9.2 (ortholog 626 100
of A. thaliana
F23F 1.8)


336 AF110774 Homo Sapiensadrenal gland protein 647 100
AD-001


337 AB011414 Homo SapiensKruppel-type zinc finger1674 58
protein


338 AF207600 Homo Sapiensethanolamine kinase 129 100


340 AC020579 Arabidopsisputative 3283 50
thaliana phosphoribosylformylglycinamidine
synthase; 25509-29950


341 Y28576 Homo sapiensSecreted peptide clone944 100
pe503 1.


342 U32274 Saccharomyces cerevisiae Ydr386wp; CAI: 0.12 191 37


343 A01771 synthetic construct vascular anticoagulating protein 1661 99


344 AF220052 Homo sapiensuncharacterized hematopoietic1285 100
stem/progenitor cells
protein
MDS032


345 Y70400 Homo sapiensHuman cell-signalling 754 100
protein-2.


346 Y50926 Homo SapiensHuman fetal brain cDNA962 100
clone
vcl6_1 derived protein.


347 AF183428 Homo Sapiens28.4 kDa protein 1329 100


348 AC006069 Arabidopsisputative cleavage and 1383 55
thaliana polyadenylation specifity
factor


349 AL032631 CaenorhabditisY106G6H.8 194 39
elegans


350 U70669 Homo SapiensFas-ligand associated 167 23
factor 3


351 Y93468 Homo sapiensAmino acid sequence 1182 92
of a potassium
channel interactor
protein.


352 AF005856 Drosophilaanon2A5 111 45
yakuba


353 AJ271684 Homo Sapiensmyeloid DAP12-associating1013 100
lectin


354 AF099100 Homo SapiensWD-repeat protein 6 2882 99


355 U51730 Murine reverse transcriptase 316 42
leukemia
virus


356 D50617 Saccharomyces cerevisiae YFL042C 279 27


357 D50617 Saccharomyces cerevisiae YFL042C 279 27


358 AF161432 Homo SapiensHSPC314 1059 93


359 AB029488 Homo sapiens C11orf21 758 99


360 AJ251024 Homo Sapiensputative odorant binding1239 100
protein ag


361 U43281 Saccharomyces cerevisiae Lpg22p 2074 74


362 U43281 Saccharomyces cerevisiae Lpg22p 2153 74




363 AC007153 Arabidopsis100632 156 24
thaliana


364 AF197927 Homo SapiensAF.5q31 protein 3992 99


365 D28500 Homo sapiens mitochondrial isoleucine tRNA synthetase 4286 98


366 X97868 Homo Sapiensarylsulphatase 3141 98


367 AL162048 Homo Sapienshypothetical protein 1532 100


368 L36062 Mus musculussteroidogenic acute 189 25
regulatory
protein


369 AF113249 Homo Sapiensmultiple domain putative1022 59
nuclear
protein


370 M15888 Bos taurusendozepine-related 2425 84
protein precursor


371 X66363 Homo Sapiensserine/threonine protein2562 100
kinase


372 W74802 Homo SapiensHuman secreted protein1532 89
encoded by
gene 73 clone HSQEL25.


373 AF100772 Homo Sapienstenascin-M1 11535 99


374 AF090934 Homo sapiens PRO0518 382 100


375 AB021643 Homo Sapiensgonadotropin inducible2761 99
transcription
repressor-3


376 AB049758 Homo SapiensMAWD binding protein 1331 100


377 AF070666 Homo SapiensKruppel-associated 466 97
box protein


378 559342 Mus Sp. nuclear pore complex 464 60
glycoprotein
p62


379 AF149205 Mus musculusSu(var)3-9 homolog 1690 88
Suv39h2


380 AF227906 Homo SapiensUDP-glucose:glycoprotein7851 99
glucosyltransferase
2 precursor


381 AF118566 Mus musculushematopoietic zinc 1769 92
finger protein


382 AK000619 Homo sapiensunnamed protein product810 100


383 AF227906 Homo SapiensUDP-glucose:glycoprotein7851 99
glucosyltransferase
2 precursor


384 AF117946 Homo SapiensLink guanine nucleotide2363 100
exchange
factor II


385 AF125390 DrosophilaL82G 139 41
melanogaster


386 Y94907 Homo SapiensHuman secreted protein1092 50
clone
ca106_19x protein sequence
SEQ ID
N0:20.


387 U18795 Saccharomyces cerevisiae Yel064cp 206 28


388 AF177388 Homo Sapienscancer-amplified transcriptional10748 99
coactivator ASC-2


389 AJ002744 Homo sapiens UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase 7 3469 96


390 AF097366 Homo sapienscone sodium-calcium 3166 100
potassium
exchanger


391 AF217525 Homo SapiensDown syndrome cell 5337 60
adhesion
molecule


392 U81035 Rattus ankyrin binding cell 3967 91
norvegicusadhesion
molecule neurofascin


393 X65224 Gallus neurofascin 4097 78
gallus


394 X13916 Homo sapiensLDL-receptor related 4292 99
precursor (AA
-19 to 4525)


395 AF151083 Homo SapiensHSPC249 444 98


396 AB017026 Mus musculusoxysterol-binding protein2173 98


397 AL035587 Homo SapiensdJ475N16.4 (KIAA0240) 2393 100


398 W74813 Homo sapiensHuman secreted protein722 92
encoded by
gene 85 clone HSDFV29.


399 Y71110 Homo sapiens Human Hydrolase protein-8 (HYDRL-8). 1637 99




400 AF039718 Caenorhabditis elegans contains similarity to lupus LA protein homologs 325 43


401 AE000877 Methanothermobacter thermoautotrophicus conserved protein 231 36


402 Y27795 Homo SapiensHuman secreted protein 1539 99
encoded by
gene No. 79.


403 250853 Homo SapiensCLPP 615 100


405 X03475 Rattus norvegicus ribosomal protein L35a (aa 1-110) 576 99


406 AF144237Homo SapiensLOMP protein 252 44


407 U20239 Mus musculusfibrosin 288 76


409 AL033378 Homo sapiens dJ323M4.1 (KIAA0790 protein) 6026 99


410 X54326 Homo sapiens glutaminyl-tRNA synthetase 7577 99


411 X61585 Bos tauruspolynucleotide adenylyltransferase3715 97


412 AF217190Homo SapiensMLEL1 protein 5271 99


414 602815 Homo sapiens Human secreted protein, SEQ ID NO: 6896. 314 95


415 AJ245922 Homo sapiens alpha-tubulin 8 2370 100


416 AF203032Homo Sapiensneurofilament protein 220 21


417 297653 Homo Sapiensc380A1.2.1 (novel protein1567 100
(isoform
1))


418 AJ404326Homo sapiensSR+89 1871 99


419 AJ404326 Homo sapiens SR+89 902 64


420 AF134726 Homo sapiens G9A 5334 99


421 L28125 Podospora anserina beta transducin-like protein 288 39


422 W21733 Homo SapiensNIP-1 encoded by clone 110 72
59.


423 567970 Homo sapiens ZNF75=KRAB zinc finger 951 76


424 L28035 Mus musculusprotein kinase C gamma 3768 98


426 Y73373 Homo SapiensHTRM clone 921803 protein555 56
sequence.


427 Y73373 Homo SapiensHTRM clone 921803 protein266 49
sequence.


428 X61118 Homo SapiensTTG-2a/RBTN-2a 876 100


429 296932 Homo sapiens nuclear autoantigen of 14 kDa 496 83


430 AJ277291Homo SapiensHELG protein 678 72


431 X82157 Homo Sapienshevin 3525 99


432 AC007192 Homo sapiens P85B HUMAN; PTDINS-3-KINASE P85-BETA 3825 99


433 AL021918Homo Sapiensb34I8.1 (Kruppel related1713 50
Zinc Finger
protein 184)


434 AF084464Rattus GTP-binding protein 141 29
norvegicusREM2


435 AL049795Homo SapiensdJ622L5.2 (novel protein)1756 98


436 M14513 Rattus (Na+ and K+) ATPase, 4269 99
norvegicusalpha(III)
catalytic subunit


437 U33460 Homo SapiensDNA-directed RNA polymerase8777 98
I,
largest subunit


438 D87076 Homo Sapienssimilar to human bromodomain3067 100
protein BR140(JC2069)


439 L43912 Macaca mannose-binding protein589 93
mulatta A


440 D31763 Homo Sapiensha0946 protein is Kruppel-related.927 49


441 U70976 Homo Sapiensarrestin 2068 99


442 B08069 Homo SapiensA human beta-alanine-pyruvate2343 99
aminotransferase (NAPA).


443 AF100662 Caenorhabditis elegans contains similarity to ubiquitin carboxyl-terminal hydrolase (Pfam: UCH-1.hmm, score: 28.46) (Pfam: UCH-2.hmm, score: 47.53) 166 24


444 D78017 Rattus NFI-A1 2667 98


norvegicus


445 AL049569Homo sapiensdJ37C10.3 (novel ATPase)2418 100


448 AJ242540Volvox hydroxyproline-rich 165 34
carteri glycoprotein


f. nagariensisDZ-HRGP


449 AJ133352Homo SapiensZNF237 protein 2006 100


450 AJ133352Homo SapiensZNF237 protein 1025 96


451 AF170708Homo SapiensT-box protein TBX3 3700 99


452 AK002080Homo Sapiensunnamed protein product1546 99


453 L32977 Homo sapiensRieske Fe-S protein 1239 93


454 X51760 Homo sapiens zinc finger protein (583 AA) 1533 57


455 Y01141 Homo SapiensSecreted protein encoded1453 99
by gene 7


clone HTLFA90.


456 AB006631Homo SapiensThe human homolog of 6559 100
mouse Cux-2


457 AF067165 Homo sapiens zinc finger protein 3 977 64


458 AF038169Homo Sapiensunknown 154 38


459 W75214 Homo SapiensHuman secreted protein 1180 95
encoded by


gene 19 clone HRSMC69.


460 U97002 Caenorhabditis elegans similar to acyl-CoA dehydrogenases and epoxide hydrolases; Pfam domain PF00441 (Acyl-CoA_dh), Score=57.4, E-value=1.7e-16, N=2; contains similarity to Pfam domain PF00702 (Hydrolase), Score=57.4, E-value=1e-13, N=1 583 37


461 AK023114 Homo sapiens unnamed protein product 1041 99


462 M93134 Friend pol protein 289 44
murine


leukemia
virus


463 AF055473Homo SapiensGAGE-8 232 47


466 Y51415 Homo sapiensHuman wild type pKe83 2625 100
protein.


467 Y51417 787 Human pKe83 splice variant2433 100
protein


468 Y57936 Homo sapiensHuman transmembrane 1629 96
protein


HTMPN-60.


469 D38552 Homo SapiensThe ha1539 protein is 2995 100
related to


cyclophilin.


470 Y70013 Homo sapiensHuman Protease and associated3530 100


protein-7 (PPRG-7).


471 AJ224747Homo SapiensC-terminal variant of 7969 100
hINADL


including 2 amino acid
exchanges


and an insertion of
28 amino acids in


frame.


472 W99665 Homo SapiensHuman secreted protein 1546 100
clone


du157_12 protein.


473 W99665 Homo SapiensHuman secreted protein 998 98
clone


du157_12 protein.


474 X63526 Homo Sapienshomologue to elongation2273 99
factor 1-


gamma from A.salina


475 X15940 Homo Sapiensribosomal protein L31 644 100
(AA 1-125)


476 M60832 Homo Sapiensalpha-2 type VIII collagen3581 99


477 AF039697Homo Sapiensantigen NY-CO-31 1213 97


478 AF156929Sus scrofainflammatory response 1588 83
protein 6


479 AF264717Homo SapiensFYVE domain-containing 5610 99
dual


specificity protein
phosphatase


FYVE-DSP2


480 AF044578Homo sapiensputative DNA polymerase;2478 94
POL4P


481 X89750 Homo sapiensTGIF protein 1413 100




482 M93107 Homo Sapiens(R)-3-hydroxybutyrate 1663 96


dehydrogenase


483 U58334 Homo SapiensBbp/53BP2 1556 41


484 AF151538 Homo sapiens deoxycytidyltransferase; Rev1p 4281 99


485 298884 Homo SapiensdJ467L1.1 (KIAA0833) 699 73


486 AJ243874Homo sapiensoligophrenin-4 3682 100


487 211737 Homo sapiens flavin-containing monooxygenase 4 2969 100


488 X56123 Mus musculustalin 4353 77


489 AJ278112Homo sapiensputative cell cycle 335 23
control protein


490 W74843 Homo SapiensHuman secreted protein 1013 98
encoded by


gene 115 clone HOVBA03.


491 Y41337 Homo SapiensHuman secreted protein 509 36
encoded by


gene 30 clone HRDDV47.


492 X90530 Homo Sapiensraga 1926 99


493 X90530 Homo Sapiensraga 1405 99


494 X90530 Homo sapiens raga 1893 96


495 AL022394Homo sapiensdJ511B24.3 (KIAA0395 4990 99
(probable


homeobox protein))


496 Y11395 Homo Sapienslanthionine synthetase 2168 100
C-like protein


1


497 AJ010119Homo SapiensRibosomal protein kinase4001 100
B (RSK-B)


498 601563 Homo SapiensHuman secreted protein,330 100
SEQ ID


NO: 5644.


499 X54131 Homo sapiens protein-tyrosine phosphatase 10465 99


500 601082 Homo SapiensHuman secreted protein,549 100
SEQ ID


NO: 5163.


501 AC004142Homo sapienssimilar to murine leucine-rich3676 100
repeat


protein; possible role
in neural


development by protein-protein


interactions; 93% similarity
to


D49802 (PID:g1369906)


502 AL117544Homo Sapienshypothetical protein 1226 100


503 AF203032Homo Sapiensneurofilament protein 5115 99


504 AL034417Homo SapiensbK21SD11.2 (similar 2476 100
to rat gene 33)


505 X69090 Homo sapiens190kD protein 7546 99


506 U58755 Caenorhabditis elegans coded for by C. elegans cDNA yk34b1.5; coded for by C. elegans cDNA yk13h10.5; coded for by C. elegans cDNA yk46e8.5; coded for by C. elegans cDNA yk46d5.5; coded for by C. elegans cDNA yk43c2.5; coded for by C. elegans cDNA yk46e8.3; coded for by C. elegans cDNA yk43c2.3; coded for by C. elegans cDNA yk46d5.3; coded for by C. elegans cDNA yk13f10.3; coded for by C. elegans cDNA yk34b1.3 782 55


507 AJ293309Homo SapiensNHP2 protein 801 100


508 U39045 Rattus norvegicus cytoplasmic dynein intermediate chain 2B 3241 97


509 AF063231Mus musculuscytoplasmic dynein intermediate3159 97


chain 2


510 AF202893 Mus musculus Kif21b 4336 95


511 Y13115 Homo Sapiensserine/threonine protein5071 99
kinase


512 AB030207Homo SapiensG gamma subunit 364 100


513 AF039571Homo Sapiensperipheral benzodiazepine495 33
receptor


interacting protein;
PBR-IP/PRAXl


514 AB037883Homo SapiensGb3/CD77 synthase 1916 99




515 D90868 Escherichiasimilar to 1489 100
coli


516 X98834 Homo sapiens zinc finger protein Hsal2 5290 100


517 AF055668Mus musculusapoptosis-linked gene 2904 78
4, deltaC form


518 AF019926Mus musculusprotein kinase 1694 90


519 M34513 Homo Sapiensomega protein 317 91


520 Y08612 Homo Sapiens88kDa nuclear pore 2313 99
complex protein


521 Y08612 Homo Sapiens88kDa nuclear pore 1561 99
complex protein


522 AL096766Homo SapiensdA59H18.1 (KIAA0767 2497 100
protein)


523 AF186249Homo sapienssix transmembrane epithelial1790 100
antigen
of prostate


524 AB029012Homo SapiensKIAA1089 protein 4933 100


525 AB026893Homo Sapiensvascular cadherin-2 5962 100


526 X74331 Homo SapiensDNA primase (p58 subunit)1720 100


528 AC007228Homo SapiensR31665_2 1488 47


529 X14830 Homo Sapiensacetylcholine receptor2639 100
beta-subunit
preprotein


530 U80446 Caenorhabditiscoded for by C. elegans420 39
elegans cDNA
yk172e6.3; coded for
by C. elegans
cDNA yk158f7.3; coded
for by C.
elegans cDNA yk158f7.5;
coded for
by C. elegans cDNA
yk172e6.5


531 576838 Mus Sp. Dbs 4821 88


532 282215 Homo SapiensdJ68O2.2 (myosin, heavy9828 100
polypeptide 9, non-muscle)


533 AF245505Homo Sapiensadlican 277 31


534 AF300612Homo SapiensN-acetylgalactosamine-4-O-993 59
sulfotransferase


535 AL121928Homo SapiensbA18I14.3 (pleckstrin 3333 99
and Sec7
domain protein)


536 AJ271055 Mus musculus Iroquois homeobox protein 6 1724 76


537 AF180473Homo SapiensNot2p 2267 100


538 AF071059 Mus musculus zinc finger RNA binding protein 1089 51


539 AF023453Homo Sapiensactin-related protein 2219 100
3-beta


540 AC003030Homo SapiensR29828_1 1401 70


541 AC003030Homo SapiensR29828_1 2294 100


542 AL121889Homo SapiensdJ1076E17.1 (KIAA0823 2152 100
protein
(continues in AL023803))


543 AB006135Rattus db83 1238 98
norvegicus


544 602650 Homo SapiensHuman secreted protein,644 97
SEQ ID
NO: 6731.


545 Y07595 Homo sapiens transcription factor TFIIH 2373 100


546 AL133545Homo SapiensbA386N14.1 (novel protein964 99
similar
to a dual specificity
phosphatase)


547 X83618 Homo Sapienshydroxymethylglutaryl-CoA2647 100
synthase


548 AF134726Homo SapiensNG37 4359 99


549 AB035356Homo Sapiensneurexin I-alpha protein6948 99


551 AB037901Homo sapiensgene amplified in squamous5215 99
cell
carcinoma-1


552 AB043634Homo sapiensPAR-6A 885 100


553 AP000693Homo Sapienspartial CDS 4875 99


554 AF002223Homo Sapiensmyotubularin related 3490 100
1


555 AC004893Homo Sapienssimilar to NEDD-4 (KIA0093);1611 100
similar to P46934 (PID:g1171682)


556 AJ404468Homo Sapiensaxonemal dynein heavy 8328 100
chain


557 AJ404468Homo Sapiensaxonemal dynein heavy 11137 100
chain




558 X65873 Homo Sapienskinesin heavy chain 4860 100


559 AJ277365Homo sapienspolyglutamine-containing592 36
protein


560 AF205600Homo sapienstransposase-like protein407 27


561 X71125 Homo Sapiensglutaminyl-peptide cyclotransferase1914 100


562 X71125 Homo Sapiensglutaminyl-peptide cyclotransferase1456 97


563 X54304 Homo sapiensmyosin regulatory light897 100
chain


564 AF250842Drosophilamultiple asters 130 23
melanogaster


565 Y58608 Homo SapiensProtein regulating gene1619 99
expression
PRGE-1.


566 AL121893Homo sapiensbA189K.21.5 (novel protein1012 100
similar
to retinoblastoma binding
protein
(RBBP9))


567 AL117352 Homo sapiens dJ876B10.2 (novel protein (ortholog of rat EX084)) 3713 99


568 AF228603Homo Sapienspleckstrin 2 1841 100


569 AF239243Homo Sapienshistone deacetylase 3244 86
7


570 AF087695 Mus musculus veli 3 989 100


571 AB046381 Homo sapiens testis-abundant finger protein 1346 99


572 AC005551Homo SapiensR26529_2, partial CDS 1020 100


573 Y90290 Homo sapiens Human peptidase, HPEP-7 protein sequence. 274 52


574 W76734 Homo SapiensHuman mDia Rho targeting712 32
protein.


575 AL121935 Homo sapiens bA5-17H2.3 (t-complex 10 (a murine tcp homolog)) 853 78


576 Y86217 Homo SapiensHuman secreted protein 2123 99
HWHGU54,
SEQ ID N0:132.


577 AL121716Homo SapiensdJ202D23.2 (novel protein)6329 99


578 AL121716Homo SapiensdJ202D23.2 (novel protein)6329 99


579 X92715 Homo SapiensKRAB 1C2H2 zinc finger 3102 97
protein


580 X54637 Homo Sapiensprotein tyrosine kinase5564 98


581 X78817 Homo Sapiensp115 1148 44


582 AJ251245 Rattus norvegicus SECIS binding protein 2 3086 71


583 AF113125Homo SapiensE-1 enzyme 581 100


584 M19529 Sus scrofafollistatin A 1906 98


585 AF169677Homo Sapiensleucine-rich repeat 3403 100
transmembrane
protein FLRT3


586 D87685 Homo Sapienssimilar to human transcription8083 99
factor
TFIIS (534159).


587 Y00876 Homo SapiensHuman LAPH-1 protein 2110 100
sequence.


588 Y99674 Homo sapiensHuman GTPase associated2111 99
protein-
25.


589 D86973 Homo sapienssimilar to Yeast translation12033 99
activator
GCN1 (P1:A48126)


590 AL034452 Homo sapiens dJ682J15.1 (novel Collagen triple helix repeat containing protein) 1979 100


591 Y57396 Homo sapiens Human lysoenzyme LYC4 polypeptide. 814 100


592 AJ297743Mus musculustorsinB protein 1448 85


593 AF164796Homo SapiensNADH:ubiquinone oxidoreductase469 100
Q subunit homolog
MLR


594 Y41312 Homo sapiens Human secreted protein encoded by gene 5 clone HLDRM43. 749 94


595 Y41312 Homo SapiensHuman secreted protein 824 100
encoded by
gene 5 clone HLDRM43.


596 Y77123 Homo SapiensHuman neurotransmission-associated2102 98
protein (NTAP) 998868.


597 AF215703 Drosophila melanogaster KISMET-L long isoform 1880 65


598 AF070447Homo Sapiensbarrier-to-autointegration290 90
factor


599 X56203 Plasmodiumliver stage antigen 372 22
falciparum


600 X79828 Mus musculusNK10 202 53


601 AB004109Cricetulusphosphatidylserine synthase2262 92
griseus II


602 U94988 Mus musculus Nulp1 2912 89


603 U94988 Mus musculus Nulp1 2800 86


604 AF006264Homo Sapiensrecombination and sister2850 100
chromatid
cohesion protein homolog


605 AF006264Homo Sapiensrecombination and sister2530 100
chromatid
cohesion protein homolog


606 X82260 Homo sapiens RanGAP1 2929 100


607 X82260 Homo sapiens RanGAP1 1843 97


608 AF160909DrosophilaBcDNA.LD03471 943 58
melanogaster


610 X74801 Homo sapiensgamma subunit of CCT 2745 99
chaperonin


611 AL031427 Homo sapiens dJ167A19.1 (novel protein) 1608 100


612 Y71072 Homo SapiensHuman membrane transport445 100
protein,
MTRP-17.


613 X16396 Homo Sapiensprecursor polypeptide 1749 100
(AA -29 to
315)


614 AK000281Homo Sapiensunnamed protein product1814 99


615 AB011128Homo SapiensKIAA0556 protein 5761 99


616 U19361 PetromyzonNF-180 205 21
marinus


617 AF045555 Homo sapiens wbscr1 1208 100


618 AF045555 Homo sapiens wbscr1 alternative spliced product 1318 100


619 U22229 Felis catus ribosomal protein L41 128 100


620 Y17169 Homo sapiens A6 related protein 1819 100


621 Y12065 Homo SapienshNop56 2956 99


622 AF177758Homo Sapiensubiquitin specific protease2998 100
16


623 AF317425Homo SapiensGAC-1 3866 100


624 AL050297Homo Sapienshypothetical protein 1227 99


625 AC007204Homo SapiensBC273239_1 3398 99


626 268747 Homo Sapiensimogen 38 2024 99


627 268747 Homo Sapiensimogen 38 1958 97


628 Y70229 Homo SapiensHuman RNA-associated 3424 99
protein-10
(RNAAP-10).


629 AF191492Homo Sapiensnasopharyngeal carcinoma613 100
associated
gene protein-8


630 AF119664Homo Sapienstranscriptional regulator1574 100
protein
HCNGP


631 AF119664Homo Sapienstranscriptional regulator1150 89
protein
HCNGP


632 Y17849 Homo sapiensganglioside-induced 1839 98
differentiation
associated protein 1


633 X55740 Homo Sapiens5'-nucleotidase 3012 100


634 AF039688Homo Sapiensantigen NY-CO-3 931 100


635 AF119662Homo SapiensE46 protein 2424 100


636 AB007836Homo SapiensHic-5 2544 100


637 AF077818Mus musculussyntrophin-associated 2027 44
serine-
threonine protein kinase


638 AL035455Homo SapiensdJ1018E9.1 (VAMP (vesicle-150 26
associated membrane
protein)-
associated protein B
and C)


639 AF078844 Homo sapiens hqp0376 protein 416 81




640 U28377 Escherichia coli ORF f239; was ORF f191 and ORF_f194 before splice 1198 100


641 AK024442Homo SapiensFLJ00032 protein 1677 56


642 U58682 Homo sapiens ribosomal protein S28 340 100


643 X57432 Rattus rattusribosomal protein S2 1520 98


644 AB002348 Homo sapiens KIAA0350 protein 5186 99


646 Y96202 Homo sapiens IkappaB kinase (IKK) binding protein, Y2H56. 1178 98


647 AB029482Mus musculusJNK-binding protein 4609 81
JNKBP1


648 AB009053Arabidopsiscontains similarity 407 44
thaliana to isoamyl
acetate-hydrolyzing
esterase~gene_id:MQB2.25


650 AC002550Homo SapiensUnknown gene product 858 99


651 U26592 Homo Sapiensdiabetes mellitus type253 66
I autoantigen


652 X60155 Homo sapiens zinc finger 41 4349 100


653 X53330 Platynereis dumerilii H4 protein (AA 1 - 103) 523 100


654 AC003682 Homo sapiens 827945 2 2558 100


655 X80473 Mus musculus rab19 596 56


656 J02649 Rattus norvegicus unknown protein 201 95


657 AC006014Homo Sapienssimilar to RFP transforming1331 99
protein;
similar to P14373 (PID:g132517)


658 X92972 Homo Sapiensprotein phosphatase 1666 100
6


659 L35269 Homo Sapienszinc finger protein 2803 99


660 AC003682Homo SapiensF18547_1 3184 96


661 X79204 Homo Sapiensataxin-1 4195 99


662 X17620 Homo SapiensNm23 protein 965 99


663 AB015617 Homo sapiens ELKS 1501 80


664 256281 Homo Sapiensinterferon regulatory 2331 100
factor 3


665 AJ248283 Pyrococcus abyssi LACTOYLGLUTATHIONE LYASE (EC 4.4.1.5) (METHYLGLYOXALASE) (ALDOKETOMUTASE) (GLYOXALASE I). 254 40


666 270200 Homo SapiensU5 snRNP-specific 200kD8819 99
protein


667 270200 Homo sapiens U5 snRNP-specific 200kD protein 8589 97


668 AF153450 Manduca sexta juvenile hormone esterase binding protein 225 32


669 AF227198Homo sapiensCrkRS 7231 99


670 X99586 Homo sapiensSMT3C protein 441 87


671 261589 Homo Sapiens17-AUG-1998 DNA encoding2593 100
cdI a
human OC-2 protein.


672 AJ132702Mus musculusATFa-associated factor3240 88


673 AF204159 Homo sapiens potassium large conductance calcium-activated channel beta 3a subunit 1486 100


674 602061 Homo SapiensHuman secreted protein,558 99
SEQ ID
NO: 6142.


675 601246 Homo sapiensHuman secreted protein,141 77
SEQ ID
NO: 5327.


676 AB016839Homo Sapiensmobl 419 42


677 D86970 Homo sapiens similar to myosin heavy chain: Containing ATP/GTP-binding site motif A(P-loop) 161 28


678 U83115 Homo sapiens non-lens beta gamma-crystallin like protein 8569 99


679 AF203687Homo Sapiensprolactin regulatory 2181 100
element-binding
protein




680 M27685 Mus musculus ultra-high sulphur keratin 650 58


681 U04968 Cricetulus nucleotide excision 3712 97
griseus repair protein


682 AF119663Homo SapiensG-protein gamma-12 356 100
subunit


683 603733 Homo sapiensHuman secreted protein,342 100
SEQ ID
NO: 7814.


684 X67699 Homo SapiensCDw52 antigen 297 100


685 AF022789Homo Sapiensubiquitin hydrolyzing 1892 100
enzyme I


686 AJ001006Mus musculusEMeg32 protein 938 96


687 W03516 Homo sapiens Prostaglandin DP receptor. 1864 100


688 AF019661 Mus musculus zeta proteasome chain; PSMA5 1214 100


689 AF156557Homo Sapiensstomatin related protein2036 100


690 603960 Homo SapiensHuman secreted protein,593 100
SEQ ID
NO: 8041.


691 AF161512 Homo sapiens HSPC163 738 100


692 AL031115Homo SapiensZXDA, ZXDB (zinc finger4298 100
X-linked
protein)


693 L40410 Homo sapiensthyroid receptor interactor806 100


694 AC004542Homo SapiensOXYSTEROL-BINDING 2533 99
PROTEIN-like; similar
to P22059
(PID:g129308)


695 AF169411 Rattus norvegicus PAPIN 4144 52


696 Y58168 Homo sapiens Human hydrolase homologue HHH-4. 2144 100


697 AF271994 Homo sapiens dopamine responsive protein DRG-1 1613 100


698 Y41741 Homo SapiensHuman PRO704 protein 1323 100
sequence.


699 AL133506 Unknown /prediction=(method:"genscan", version:"1.0", score:"109.13"); /prediction=(method: 825 48


700 Y96870 Homo SapiensHuman goose-type lysozyme1032 100
(GOLY).


701 AC003034Homo sapiensGene with similarity 1190 100
to rat kidney-
specific (KS) gene


702 AC003034Homo SapiensGene with similarity 937 95
to rat kidney-
specific (KS) gene


703 | AJ242832 | Homo sapiens | calpain | 3756 | 100
704 | S52624 | Homo sapiens | unknown | 185 | 100
705 | AF005081 | Homo sapiens | skin-specific protein | 652 | 100
706 | Y16793 | Homo sapiens | keratin, type I | 2232 | 100
707 | Y44985 | Homo sapiens | Human epidermal protein-2. | 455 | 69
708 | AF113220 | Homo sapiens | MSTP040 | 686 | 100
709 | Y44985 | Homo sapiens | Human epidermal protein-2. | 408 | 65
710 | Y16132 | Homo sapiens | CDT6 | 1874 | 100
711 | Y68775 | Homo sapiens | Amino acid sequence of a human phosphorylation effector PHSP-7. | 2407 | 100
712 | X63422 | Homo sapiens | H(+)-transporting ATP synthase | 209 | 100
713 | AF169968 | Mus musculus | DNA binding protein DESRT | 1467 | 79
714 | X52563 | Bos taurus | permeability increasing protein | 383 | 29
715 | AJ277739 | Homo sapiens | RPB11b1alpha protein | 480 | 98
716 | AL135791 | Homo sapiens | bA162G10.3 (zinc finger protein) | 401 | 98
717 | AF223466 | Homo sapiens | HT015 protein | 1311 | 97
719 | AF117383 | Homo sapiens | placental protein 13; PP13 | 746 | 100
720 | Z98743 | Homo sapiens | dJ181C9.2 (Rho GTPase activating protein 8 (RhoGAP, p50RhoGAP)) | 324 | 100
721 | AL163815 | Arabidopsis thaliana | putative protein | 653 | 61
722 | G01436 | Homo sapiens | Human secreted protein, SEQ ID NO: 5517. | 418 | 96


723 | AF282919 | Mus musculus | Zfp228 | 349 | 49
724 | AB023191 | Homo sapiens | KIAA0974 protein | 2953 | 100
725 | AL031778 | Homo sapiens | dJ34B21.1 (novel BZRP (benzodiazapine receptor (peripheral) (MBR, PBR, PBKS, IBP, Isoquinoline-binding protein)) LIKE protein) | 920 | 100
726 | AL021939 | Homo sapiens | dJ352A20.2 (aldehyde dehydrogenase family protein) | 1764 | 100
727 | AF182426 | Rattus norvegicus | arylacetamide deacetylase | 791 | 42
728 | Y08565 | Homo sapiens | UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferase | 3331 | 99
729 | AF155135 | Homo sapiens | novel retinal pigment epithelial cell protein | 1652 | 99
730 | AL078606 | Arabidopsis thaliana | putative protein | 277 | 55
731 | Y73352 | Homo sapiens | HTRM clone 1732368 protein sequence. | 1720 | 100
732 | AF178432 | Homo sapiens | SH3 protein | 3302 | 100
733 | Y17832 | Human endogenous retrovirus K | env protein | 223 | 34
734 | Y28859 | Homo sapiens | Human mesoderm induction early response protein ER1. | 2067 | 99
735 | U09355 | Oryctolagus cuniculus | protein phosphatase 2A1 B gamma subunit | 2352 | 99


736 | Y94922 | Homo sapiens | Human secreted protein clone pv6_1 protein sequence SEQ ID NO:50. | 724 | 99
737 | AB027003 | Mus musculus | protein phosphatase | 378 | 84
738 | AF112200 | Homo sapiens | NADH-oxidoreductase B18 subunit | 739 | 100
739 | AF112200 | Homo sapiens | NADH-oxidoreductase B18 subunit | 613 | 99
740 | AF302154 | Homo sapiens | SPG protein | 6556 | 100
741 | B25681 | Homo sapiens | Human secreted protein sequence encoded by gene 17 SEQ ID NO:70. | 1410 | 99
742 | L27479 | Homo sapiens | X123 | 1237 | 99
743 | L27479 | Homo sapiens | X123 | 1206 | 97
744 | Y66745 | Homo sapiens | Membrane-bound protein PRO1186. | 588 | 99
745 | AJ001019 | Homo sapiens | ring finger protein | 1292 | 99
746 | X68453 | Sus scrofa | tubulin-tyrosine ligase | 1992 | 94
747 | Y57897 | Homo sapiens | Human transmembrane protein HTMPN-21. | 1173 | 100
748 | AF151069 | Homo sapiens | HSPC235 | 1694 | 96
749 | AF182404 | Homo sapiens | mitochondrial uncoupling protein 1 | 1674 | 100
750 | AL121993 | Homo sapiens | dJ776P7.1 (Novel protein) | 2500 | 99
751 | AF149825 | Homo sapiens | PACSIN3 | 2253 | 100
752 | AL008635 | Homo sapiens | dJ510H16.2 (high-mobility group protein 2-Like 1) | 3026 | 99
753 | Y57914 | Homo sapiens | Human transmembrane protein HTMPN-38. | 1124 | 100
754 | AF285109 | Homo sapiens | septin 3 isoform B | 1766 | 100
755 | AF004161 | Oryctolagus cuniculus | peroxisomal Ca-dependent solute carrier | 2371 | 95
756 | Z19585 | Homo sapiens | thrombospondin-4 | 4239 | 100
757 | AP001745 | Homo sapiens | similar to zinc finger 5 protein | 1857 | 100
758 | AF190664 | Mus musculus | LMBR2 | 555 | 72
759 | AF090326 | Mus musculus | AE-1 binding protein AEBP2 | 1540 | 97
760 | AL096677 | Homo sapiens | dJ322G13.3 (novel protein similar to bovine and mouse beta-soluble NSF attachment protein (SNAP-beta)) | 999 | 94


761 | AC003007 | Homo sapiens | Unknown gene product (partial) | 649 | 96
762 | U66372 | Bos taurus | ribosomal protein S29 | 230 | 73
764 | Y90899 | Homo sapiens | D1-like dopamine receptor activity modifying protein SEQ ID NO:1. | 1152 | 100
765 | U88169 | Caenorhabditis elegans | similar to molybdoterin biosynthesis MOEB proteins | 1204 | 65
766 | AL118506 | Homo sapiens | dJ591C20.3.1 (novel DnaJ domain protein, similar to mouse and bovine cysteine string protein) | 1091 | 100
767 | AK024693 | Homo sapiens | unnamed protein product | 3767 | 100
768 | Z11518 | Homo sapiens | histidyl-tRNA synthetase | 2582 | 100
769 | X13916 | Homo sapiens | LDL-receptor related precursor (AA -19 to 4525) | 25529 | 100
770 | AC009360 | Arabidopsis thaliana | Contains 3 PF~00400 WD40, G-beta repeat domains. | 333 | 33
771 | AB037685 | Mus musculus | LANP-like protein | 1246 | 91
772 | AL161578 | Arabidopsis thaliana | putative protein | 335 | 46
773 | AL161578 | Arabidopsis thaliana | putative protein | 333 | 47
774 | AY008271 | Homo sapiens | helicase SMARCAD1 | 5264 | 99
775 | Y21591 | Homo sapiens | Human secreted protein (clone CC332-33). | 1127 | 96
776 | W88853 | Homo sapiens | Polypeptide fragment encoded by gene 89. | 752 | 100
777 | W88853 | Homo sapiens | Polypeptide fragment encoded by gene 89. | 752 | 100
778 | W88853 | Homo sapiens | Polypeptide fragment encoded by gene 89. | 752 | 100
779 | AF196481 | Homo sapiens | RING finger protein; FXY2 | 3644 | 100
780 | AL035427 | Homo sapiens | dJ769N13.1 (KIAA0443 protein.) | 1609 | 54
781 | AB026187 | Homo sapiens | protocadherin-Xa | 5244 | 100
782 | B24458 | Homo sapiens | Human secreted protein sequence encoded by gene 22 SEQ ID NO:83. | 1002 | 100
783 | AB027289 | Homo sapiens | cyclin-E binding protein 1 | 5421 | 100
784 | G02916 | Homo sapiens | Human secreted protein, SEQ ID NO: 6997. | 627 | 100


785 | AJ245822 | Homo sapiens | type I transmembrane receptor | 4560 | 100
786 | AJ245820 | Homo sapiens | type I transmembrane receptor | 4624 | 100
787 | Z48042 | Homo sapiens | GPI-anchored protein p137 | 3340 | 99
788 | AL031782 | Homo sapiens | dJ708F5.1 (PUTATIVE novel Collagen alpha 1 LIKE protein) | 2739 | 100
789 | AJ131245 | Homo sapiens | Sec24B protein | 6602 | 100
790 | AF107203 | Homo sapiens | ataxin 2-binding protein | 2008 | 100
791 | Y14690 | Homo sapiens | procollagen alpha2(V) | 600 | 34
792 | AL031055 | Homo sapiens | dJ28H20.2 (novel protein) | 1267 | 100
793 | Y36194 | 787 | Human secreted protein | 2051 | 99
794 | AB028127 | Homo sapiens | mannosyltransferase | 2138 | 96
795 | AC007228 | Homo sapiens | 831665_2 | 2738 | 79
796 | AL049482 | Arabidopsis thaliana | putative protein | 436 | 47
797 | AC004528 | Homo sapiens | 832184_3 | 891 | 91
798 | AB037830 | Homo sapiens | KIAA1409 protein | 7532 | 100
799 | X53793 | Homo sapiens | 5' half of the product is homologues to Bacillus subtilis SAICAR synthetase, 3' half corresponds to the catalytic subunit of AIR carboxylase | 2232 | 100




800 | Y99350 | Homo sapiens | Human PRO1378 (UNQ715) amino acid sequence SEQ ID NO:33. | 1343 | 100
801 | AB042636 | Homo sapiens | junctophilin type3 | 1225 | 47
802 | AB029324 | Rattus norvegicus | TIP120-family protein TIP120B | 3916 | 90
803 | AB029324 | Rattus norvegicus | TIP120-family protein TIP120B | 4961 | 90
804 | AF251040 | Homo sapiens | putative nuclear protein | 2119 | 100
805 | AB033281 | Homo sapiens | F-box and WD-repeats protein beta-TRCP2 isoform C | 2879 | 100
806 | U87305 | Rattus norvegicus | transmembrane receptor UNC5H1 | 3257 | 90
807 | AF118889 | Rattus norvegicus | b-tomosyn isoform | 3155 | 97
808 | AF226993 | Rattus norvegicus | selective LIM binding factor | 8793 | 95
809 | W19919 | Homo sapiens | Human Ksr-1 (kinase suppresser of Ras). | 3939 | 99
810 | AL031782 | Homo sapiens | dJ708F5.1 (PUTATIVE novel Collagen alpha 1 LIKE protein) | 1546 | 100
811 | AC002542 | Homo sapiens | similar to C. elegans F11A10.5; 80% similarity to Z68297 (PID:g1130619) | 2294 | 100
812 | U83246 | Homo sapiens | copine I | 606 | 52
813 | AF242552 | Gallus gallus | retinovin | 945 | 34
814 | X52332 | Homo sapiens | zinc finger protein 10 | 1651 | 93
815 | X52332 | Homo sapiens | zinc finger protein 10 | 2423 | 99
816 | Y09631 | Homo sapiens | PIBF1 protein | 2935 | 99
817 | X71997 | Rattus norvegicus | myosin I | 3883 | 98
818 | AY004877 | Mus musculus | cytoplasmic dynein heavy chain | 11105 | 98
819 | Y27196 | Homo sapiens | Human cyclic nucleotide phosphodiester PDEBB(E) amino acid sequence. | 3790 | 100


820 | AF081947 | Mus musculus | tektin | 1134 | 81
821 | AL035106 | Homo sapiens | dJ998C11.1 (continues in Em:AL445192 as bA269H4.1) | 871 | 100
822 | AF022795 | Homo sapiens | TGF beta receptor associated protein-1 | 385 | 24
823 | AF015770 | Mus musculus | radical fringe | 1422 | 82
824 | U82695 | Homo sapiens | expressed-Xq28STS protein | 1444 | 99
825 | X77371 | Mesocricetus auratus | COR1 | 641 | 78
826 | AB014576 | Homo sapiens | KIAA0676 protein | 296 | 79
827 | AL049733 | Homo sapiens | dJ875H3.1 (APK1 antigen) | 1584 | 72
828 | AF222980 | Homo sapiens | disrupted in Schizophrenia 1 protein | 4418 | 100
829 | Z31560 | Homo sapiens | sox-2 | 1683 | 100
830 | AF295773 | Homo sapiens | ral guanine nucleotide dissociation stimulator | 4717 | 99
831 | AB041926 | Homo sapiens | GCK family kinase MINK-2 | 6866 | 100
832 | L04948 | Saccharomyces cerevisiae | mitochondrial transporter protein | 338 | 35
833 | AJ007012 | Mus musculus | Fish protein | 704 | 94
834 | Z34289 | Homo sapiens | nucleolar phosphoprotein p130 | 3455 | 99
835 | U10991 | Homo sapiens | G2 | 8436 | 98
836 | AF230877 | Homo sapiens | MIP-T3 | 2945 | 99
837 | X58288 | Homo sapiens | protein-tyrosine phosphatase | 7734 | 99
838 | X56958 | Homo sapiens | ankyrin (brank-2) | 9631 | 100
839 | AC024791 | Caenorhabditis elegans | contains similarity to beta-lactamases | 370 | 24




840 | D83197 | Homo sapiens | ankyrin repeat protein | 802 | 99
841 | AF053711 | Serinus canaria | neurofilament medium subunit | 192 | 31
842 | AF283772 | Homo sapiens | similar to Homo sapiens ribosomal protein L10 encoded by GenBank Accession Number L25899 | 990 | 96
843 | U76343 | Homo sapiens | GABA transport protein | 2992 | 98
844 | Y13645 | Homo sapiens | uroplakin II | 897 | 100
845 | D21064 | Homo sapiens | similar to rat general mitochondrial matrix processing protease mRNA (RATMPP). | 2710 | 99
846 | AF192522 | Homo sapiens | Niemann-Pick C3 protein; NPC3 | 7047 | 100
847 | AF192522 | Homo sapiens | Niemann-Pick C3 protein; NPC3 | 5472 | 100
848 | X60489 | Homo sapiens | elongation factor-1-beta | 1162 | 100
849 | AC007204 | Homo sapiens | BC273239_1 | 2277 | 67
850 | AC003682 | Homo sapiens | R28830_1 | 2401 | 100
851 | AL121583 | Homo sapiens | bA358N2.1 (novel protein) | 353 | 61
852 | Z48475 | Homo sapiens | glucokinase regulator | 3155 | 99
853 | Z83844 | Homo sapiens | dJ37E16.2 (SH3-domain binding protein 1) | 1884 | 98
854 | AF233323 | Homo sapiens | Fas-associated phosphatase-1 | 390 | 36
855 | AF062741 | Rattus norvegicus | pyruvate dehydrogenase phosphatase isoenzyme 2 | 447 | 80
856 | Y11411 | Homo sapiens | pristanoyl-CoA oxidase | 3595 | 98
857 | M97188 | Strongylocentrotus purpuratus | tektin A1 | 290 | 46
858 | AB001105 | Homo sapiens | hippocalcin-like protein 4 | 995 | 100
859 | AF164791 | Homo sapiens | putative 38.3kDa protein | 1795 | 100
860 | AF298117 | Homo sapiens | homeobox protein OTX2 | 1477 | 93
861 | AF015264 | Rattus norvegicus | golgi peripheral membrane protein p65 | 1820 | 81


862 | X16901 | Homo sapiens | 30kb subunit of RAB30/74 | 1284 | 100
863 | M12140 | Homo sapiens | envelope protein | 202 | 81
864 | AF161459 | Homo sapiens | HSPC109 | 815 | 98
865 | AL109983 | Homo sapiens | dJ718P11.1.1 (novel class II aminotransferase similar to serine palmotyltransferase (isoform 1)) | 444 | 100
866 | M77183 | Rattus norvegicus | alpha-1-macroglobulin | 227 | 45
867 | AF272663 | Homo sapiens | gephyrin | 3785 | 100
868 | X75285 | Mus musculus | fibulin-2 | 3258 | 87
869 | X82494 | Homo sapiens | fibulin-2 | 3407 | 99
870 | AJ297743 | Mus musculus | torsinB protein | 169 | 43
871 | AJ278313 | Homo sapiens | phospholipase C-beta-1a | 6258 | 99
872 | AF073344 | Homo sapiens | ubiquitin-specific protease 3 | 256 | 43
873 | Y91955 | Homo sapiens | Human cytoskeleton associated protein 10 (CYSKP-10). | 535 | 100
874 | AJ000414 | Homo sapiens | Cdc42-interacting protein 4 | 1136 | 53
875 | AF265555 | Homo sapiens | ubiquitin-conjugating BIR-domain enzyme APOLLON | 627 | 100
876 | Y48586 | Homo sapiens | Human breast tumour-associated protein 47. | 2537 | 98
877 | AF182198 | Homo sapiens | intersectin 2 long isoform | 8764 | 99
878 | L17308 | Gossypium hirsutum | proline-rich cell wall protein | 192 | 35
879 | AF177169 | Homo sapiens | tropomodulin 2 | 1769 | 100
880 | W03627 | Homo sapiens | Human follicle stimulating hormone GPR N-terminal sequence. | 210 | 23




881 | AL021068 | Homo sapiens | dJ206D15.3 | 2615 | 99
882 | AC005498 | Homo sapiens | R31665_2 | 318 | 82
883 | AF165518 | Homo sapiens | MAGOH isoform | 182 | 94
884 | D21211 | Homo sapiens | protein tyrosine phosphatase (PTP-BAS, type 3) | 368 | 43
885 | U13045 | Homo sapiens | nuclear respiratory factor-2 subunit beta 1 | 869 | 62
886 | X52836 | Homo sapiens | tryptophan hydroxylase (AA 1 - 444) | 2320 | 98
887 | X51466 | Homo sapiens | elongation factor 2 | 4460 | 100
888 | AB039903 | Homo sapiens | interferon-responsive finger protein 1 long form | 1096 | 98
889 | X51760 | Homo sapiens | zinc finger protein (583 AA) | 3130 | 100
890 | AJ243396 | Homo sapiens | voltage-gated sodium channel beta-3 subunit | 1024 | 100
891 | W67928 | Homo sapiens | Fragment of human secreted protein encoded by gene 4. | 391 | 100
892 | AB020598 | Homo sapiens | peptide transporter 3 | 3017 | 100
893 | Y66648 | Homo sapiens | Membrane-bound protein PRO1120. | 4722 | 99
894 | Y66648 | Homo sapiens | Membrane-bound protein PRO1120. | 3606 | 96
895 | A29218_cd1 | Homo sapiens | 19-NOV-1998 DNA encoding G-protein coupled 7 TM receptor with AXOR15 activity. | 2178 | 100
896 | AJ000332 | Homo sapiens | Glucosidase II | 5063 | 99
897 | X98259 | Homo sapiens | M-phase phosphoprotein 8 | 1085 | 100
898 | X57110 | Homo sapiens | c-cbl protein | 4849 | 99
899 | X63652 | Homo sapiens | inter-alpha-trypsin inhibitor heavy chain ITIH1 | 3376 | 98
900 | X85134 | Homo sapiens | RB protein binding protein | 2816 | 99
901 | L11672 | Homo sapiens | zinc finger protein | 2047 | 58
902 | Y85565 | Homo sapiens | Human homologue of UNC-53 (Hs-UNC-53/2) sequence. | 369 | 83


903 | X54871 | Homo sapiens | ras related protein Rab5b | 1094 | 100
904 | Z98265 | Homo sapiens | plakophilin 3 | 4065 | 100
905 | AL035295 | Homo sapiens | hypothetical protein | 959 | 99
906 | AF051782 | Homo sapiens | diaphanous 1 | 801 | 35
907 | AF208536 | Homo sapiens | nucleotide binding protein; NBP | 1372 | 100
908 | U79240 | Homo sapiens | serine/threonine protein kinase | 2365 | 98
909 | U79240 | Homo sapiens | serine/threonine protein kinase | 2386 | 99
910 | AJ132545 | Homo sapiens | protein kinase | 2921 | 100
911 | AJ132545 | Homo sapiens | protein kinase | 1637 | 99
912 | AL121733 | Homo sapiens | hypothetical protein | 1344 | 99
913 | Y67579 | Homo sapiens | Human death inducer-obliterator 1 (DIO-1) polypeptide. | 1586 | 100
914 | X87342 | Homo sapiens | Human giant larvae homologue | 5317 | 99
915 | X87342 | Homo sapiens | Human giant larvae homologue | 3495 | 96
916 | M94362 | Homo sapiens | lamin B2 | 2357 | 93
917 | AJ011654 | Homo sapiens | triple LIM domain protein | 3432 | 100
918 | AJ131899 | Rattus norvegicus | proline rich synapse associated protein 1 | 5776 | 88
919 | AF054986 | Homo sapiens | putative transmembrane GTPase | 1816 | 100
920 | U95822 | Homo sapiens | putative transmembrane GTPase | 1237 | 100
921 | Y11588 | Homo sapiens | apoptosis specific protein | 1492 | 100
922 | X84195 | Homo sapiens | acylphosphatase | 510 | 100
923 | U72882 | Homo sapiens | interferon-induced leucine zipper protein | 1409 | 99
924 | AE000660 | Homo sapiens | hADV3651 | 573 | 100
925 | AF126245 | Homo sapiens | acyl-Coenzyme A dehydrogenase-8 precursor | 2162 | 100




926 | AE001968 | Deinococcus radiodurans | hypothetical protein | 147 | 27
927 | W81576 | Homo sapiens | EBV-induced G-protein coupled receptor (EBI-2) polypeptide. | 1778 | 100
928 | U01317 | Homo sapiens | beta-globin | 687 | 94
929 | X98333 | Homo sapiens | organic cation transporter | 2933 | 100
930 | Y91444 | Homo sapiens | Human secreted protein sequence encoded by gene 42 SEQ ID NO:165. | 1401 | 100
931 | Y91644 | Homo sapiens | Human secreted protein sequence encoded by gene 43 SEQ ID NO:317. | 1243 | 100
932 | D90279 | Homo sapiens | collagen alpha 1(V) chain precursor | 569 | 39
933 | Z31560 | Homo sapiens | sox-2 | 1587 | 96
934 | AF147790 | Homo sapiens | transmembrane mucin 12 | 3047 | 99
935 | Z85996 | Homo sapiens | match: multiple proteins; match: Q08151 P28185 Q01111 Q43554; match: Q08150 Q40195 P20340 Q39222; match: Q40368 P36412 P40393 Q40723; match: CE01798 Q38923 Q40191 Q41022; match: Q39433 Q40177 Q40218 Q08146; match: P10949 P11023 Q16948 Q20337; match: Q25389 P25228 P20336 P05713; match: P35276 Q08147 P17609 P22128; match: Q15771 P36410 P35291; GTP-binding | 726 | 94


936 | AB041533 | Homo sapiens | sperm antigen | 1054 | 38
937 | X91906 | Homo sapiens | voltage-gated chloride ion channel | 3914 | 100
938 | AB032481 | Homo sapiens | homeobox transcription factor | 1744 | 100
939 | AF111106 | Homo sapiens | protein serine/threonine phosphatase 4 regulatory subunit 1 | 4682 | 99
940 | Y17999 | Homo sapiens | Dyrk1B protein kinase | 3331 | 99
941 | AF305872 | Homo sapiens | thyroglobulin | 455 | 92
942 | AF263462 | Homo sapiens | cingulin | 5939 | 99
943 | AK024442 | Homo sapiens | FLJ00032 protein | 1616 | 61
944 | Y35911 | Homo sapiens | Extended human secreted protein sequence, SEQ ID NO. 160. | 262 | 35
945 | AB015320 | Homo sapiens | sigma1B subunit of AP-1 clathrin adaptor complex | 599 | 71
946 | Z82287 | Caenorhabditis elegans | ZK550.2 | 229 | 35
947 | D84223 | Homo sapiens | leucyl tRNA synthetase | 6207 | 99
948 | U49057 | Rattus norvegicus | rA9 | 3846 | 62
949 | AK000568 | Homo sapiens | unnamed protein product | 1659 | 100
950 | AL021578 | Homo sapiens | dJ453C12.6.1 (uncharacterized hypothalamus protein (isoform 1)) | 257 | 42
951 | AB032435 | Homo sapiens | differentiation-associated Na-dependent inorganic phosphate cotransporter | 3063 | 99
952 | AF110532 | Homo sapiens | uncoupling protein UCP-4 | 1561 | 100
953 | X83587 | Mus musculus | 1A13 protein | 1420 | 59
954 | AL031665 | Homo sapiens | dJ545L17.5.1 (novel protein) | 386 | 53
955 | Y87600 | Homo sapiens | Human fatty acid synthase-like protein (HFASLP). | 2377 | 100
956 | Y99421 | Homo sapiens | Human PRO1433 (UNQ738) amino acid sequence SEQ ID NO:292. | 522 | 55




957 | U68535 | Mus musculus | aldo-keto reductase | 451 | 73
958 | AC007067 | Arabidopsis thaliana | T10024.10 | 1594 | 57
959 | U72194 | Mus musculus | muskelin | 3947 | 99
960 | AE003661 | Drosophila melanogaster | CG15168 gene product | 277 | 54
961 | X80332 | Mus musculus | rab20 | 983 | 82
962 | Y67315 | Homo sapiens | Human secreted protein BL89_13 amino acid sequence. | 3916 | 99
963 | Y67315 | Homo sapiens | Human secreted protein BL89_13 amino acid sequence. | 3916 | 99
964 | L32602 | Rattus norvegicus | homeodomain 159..341 | 1821 | 96
965 | Z97832 | Homo sapiens | dJ329A5.3 (KIAA0646 protein) | 3581 | 99
966 | W88995 | Homo sapiens | Polypeptide fragment encoded by gene 146. | 176 | 39
967 | U12465 | Homo sapiens | ribosomal protein L35 | 604 | 100
968 | AF151803 | Homo sapiens | CGI-45 protein | 1101 | 78
969 | W74865 | Homo sapiens | Human secreted protein encoded by gene 137 clone HMWIF35. | 1348 | 98
970 | L21936 | Homo sapiens | succinate dehydrogenase flavoprotein subunit | 703 | 100
971 | AJ133521 | Drosophila buzzatii | protease, reverse transcriptase, ribonuclease H, integrase | 194 | 23
972 | AC006017 | Homo sapiens | N-acetylgalactosaminyltransferase; similar to Q10473 (PID:g1709559) | 3271 | 100


973 | Z81317 | Schizosaccharomyces pombe | DNA2-NAM7 helicase family protein | 685 | 31
974 | M17885 | Homo sapiens | acidic ribosomal phosphoprotein (P0) | 792 | 100
975 | U22829 | Mus musculus | P2Y purinoceptor | 399 | 40
976 | AL132772 | Homo sapiens | dJ1013A22.1 (hepatic nuclear factor 4, alpha) | 2466 | 99
977 | AC003973 | Homo sapiens | ZNF91L | 1550 | 43
978 | J04031 | Homo sapiens | MDMCSF (EC 1.5.1.5; EC 3.5.4.9; EC 6.3.4.3) | 2824 | 63
979 | AF136715 | Homo sapiens | taxol resistant associated protein | 217 | 76
980 | AF136715 | Homo sapiens | taxol resistant associated protein | 306 | 95
981 | Z92822 | Caenorhabditis elegans | ZK520.1 | 1109 | 44
982 | AJ295149 | Homo sapiens | putative dipeptidase | 1564 | 99
983 | AL021331 | Homo sapiens | dJ366N23.3 (KIAA0173 and Tubulin-Tyrosine Ligase LIKE) | 1492 | 100
984 | AL161501 | Arabidopsis thaliana | putative adenosine deaminase | 370 | 38


TABLE 3
SEQ ID NO: | ACCESSION NO. | DESCRIPTION | RESULTS*


2 | BL00282 | Kazal serine protease inhibitors family proteins. | BL00282 16.88 4.259e-14 97-120
3 | BL00298 | Heat shock hsp90 proteins family proteins. | BL00298A 10.97 1.000e-40 74-119; BL00298E 27.30 1.000e-40 321-376; BL00298F 11.21 1.000e-40 409-464; BL00298H 20.50 1.000e-40 553-607; BL00298C 16.40 2.286e-40 186-230; BL00298B 15.64 1.290e-39 134-181; BL00298G 24.57 5.345e-39 465-520; BL00298I 30.07 7.818e-34 661-715; BL00298D 17.97 6.226e-33 242-282
4 | PR00237 | RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE | PR00237A 11.48 4.316e-13 57-82
5 | PD02454 | !!!! PROTEIN ALU SUBFAMILY WARNING ENTRY NUCLEAR PHOSPHO. | PD02454B 11.61 4.309e-17 75-103
6 | DM00864 | EGF-LIKE DOMAIN. | DM00864A 15.21 7.429e-09 98-119


7 | PR00237 | RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE | PR00237A 11.48 1.750e-11 29-54; PR00237D 8.94 7.000e-09 138-160; PR00237B 13.50 8.250e-09 61-83
9 | PF00855 | PWWP domain proteins. | PF00855 13.75 5.667e-15 272-289
| BL00139 | Eukaryotic thiol (cysteine) proteases cysteine proteins. | BL00139D 9.24 4.400e-11 391-408; BL00139A 10.29 7.511e-09 67-77
12 | BL01113 | C1q domain proteins. | BL01113B 18.26 9.294e-19 689-725; BL01113C 13.18 4.857e-11 757-777; BL01113D 7.47 2.161e-10 790-800
13 | BL01113 | C1q domain proteins. | BL01113B 18.26 3.813e-14 599-635; BL01113C 13.18 4.857e-11 667-687; BL01113D 7.47 2.161e-10 700-710
14 | BL00594 | Aromatic amino acids permeases proteins. | BL00594A 16.75 6.531e-10 50-94
| BL01047 | Heavy-metal-associated domain proteins. | BL01047B 19.73 4.913e-13 707-728
16 | PR00625 | DNAJ PROTEIN FAMILY SIGNATURE | PR00625A 12.84 7.462e-18 310-330; PR00625B 13.48 3.939e-15 340-361
18 | BL00615 | C-type lectin domain proteins. | BL00615A 16.68 3.700e-09 144-162
| PR00741 | GLYCOSYL HYDROLASE FAMILY 29 SIGNATURE | PR00741D 16.11 9.082e-21 175-195; PR00741F 14.66 9.262e-21 243-265; PR00741B 14.23 1.947e-18 128-145; PR00741G 9.29 2.180e-17 318-340; PR00741C 9.16 7.328e-17 147-166; PR00741H 10.32 2.141e-13 351-374; PR00741A 9.24 3.596e-13 89-105; PR00741E 13.39 3.535e-12 215-232


22 | BL00107 | Protein kinases ATP-binding region proteins. | BL00107A 18.39 3.647e-20 117-148; BL00107B 13.31 1.000e-16 182-198
23 | BL00107 | Protein kinases ATP-binding region proteins. | BL00107A 18.39 1.600e-23 126-157
24 | BL00107 | Protein kinases ATP-binding region proteins. | BL00107A 18.39 1.600e-23 126-157
27 | BL00239 | Receptor tyrosine kinase class II proteins. | BL00239B 25.15 2.324e-16 91-139
28 | BL00018 | EF-hand calcium-binding domain proteins. | BL00018 7.41 3.250e-10 681-694; BL00018 7.41 6.400e-10 717-730
29 | BL00018 | EF-hand calcium-binding domain proteins. | BL00018 7.41 3.250e-10 681-694; BL00018 7.41 6.400e-10 717-730
30 | BL01113 | C1q domain proteins. | BL01113A 17.99 9.308e-09 54-81
33 | PD01168 | SYNTHETASE LIGASE PROTEIN ALANYL. | PD01168L 9.47 1.667e-09 401-416
34 | PD01168 | SYNTHETASE LIGASE PROTEIN ALANYL. | PD01168L 9.47 1.667e-09 411-426


36 | PR00426 | C5A-ANAPHYLATOXIN RECEPTOR SIGNATURE | PR00426D 10.59 3.618e-12 110-122
37 | PF00791 | Domain present in ZO-1 and Unc5-like netrin receptors. | PF00791B 28.49 2.049e-10 1080-1135
38 | BL00350 | MADS-box domain proteins. | BL00350 20.79 1.000e-40 1-55
40 | BL00123 | Alkaline phosphatase proteins. | BL00123B 19.31 1.000e-40 90-133; BL00123C 24.61 1.000e-40 145-195; BL00123E 22.25 1.000e-40 304-358; BL00123G 26.01 1.000e-40 438-488; BL00123F 19.03 8.714e-35 364-399; BL00123A 10.80 9.000e-24 52-77; BL00123D 12.73 1.000e-17 216-229
44 | PD00066 | PROTEIN ZINC-FINGER METAL-BINDI. | PD00066 13.92 2.800e-14 346-359; PD00066 13.92 4.600e-14 486-499; PD00066 13.92 1.000e-13 374-387; PD00066 13.92 6.000e-13 458-471; PD00066 13.92 2.714e-12 234-247; PD00066 13.92 3.143e-12 430-443; PD00066 13.92 8.714e-12 514-527; PD00066 13.92 3.739e-11 402-415; PD00066 13.92 2.038e-10 318-331


45 | DM00973 | 3 kw RESISTANCE BENOMYL YLL028W CYCLOHEXIMIDE. | DM00973A 21.17 2.946e-10 180-217
47 | BL00649 | G-protein coupled receptors family 2 proteins. | BL00649C 17.82 1.682e-10 475-501; BL00649B 20.68 7.387e-09 417-463
50 | PD00066 | PROTEIN ZINC-FINGER METAL-BINDI. | PD00066 13.92 8.200e-16 445-458; PD00066 13.92 5.846e-15 305-318; PD00066 13.92 1.000e-14 221-234; PD00066 13.92 1.000e-14 417-430; PD00066 13.92 2.800e-14 249-262; PD00066 13.92 2.800e-14 277-290; PD00066 13.92 8.800e-14 333-346; PD00066 13.92 9.400e-14 361-374; PD00066 13.92 4.000e-13 389-402; PD00066 13.92 6.571e-12 473-486
51 | BL00226 | Intermediate filaments proteins. | BL00226D 19.10 1.000e-40 417-464; BL00226B 23.86 3.348e-35 251-299; BL00226C 13.23 1.429e-24 316-347; BL00226A 12.77 1.857e-15 151-166
52 | PR00217 | 43 KD POSTSYNAPTIC PROTEIN SIGNATURE | PR00217C 10.91 5.648e-09 133-149
53 | BL00232 | Cadherins extracellular repeat proteins domain proteins. | BL00232B 32.79 1.000e-40 143-191; BL00232A 27.72 2.350e-28 49-82; BL00232B 32.79 7.052e-21 252-300; BL00232C 10.65 6.625e-20 250-268; BL00232B 32.79 1.314e-11 367-415; BL00232C 10.65 9.308e-10 470-488
54 | BL00303 | S-100/ICaBP type calcium binding protein. | BL00303B 26.15 8.759e-23 125-162; BL00303A 21.77 1.000e-21 82-119


58 | PR00378 | INOSITOL PHOSPHATASE SIGNATURE | PR00378D 16.86 1.000e-15 242-261; PR00378B 13.80 9.250e-13 109-129
59 | PR00425 | BRADYKININ RECEPTOR SIGNATURE | PR00425C 13.23 9.040e-12 120-140
60 | BL00280 | Pancreatic trypsin inhibitor (Kunitz) family proteins. | BL00280 24.61 6.727e-38 238-282; BL00280 24.61 1.514e-30 294-338
65 | BL01019 | ADP-ribosylation factors family proteins. | BL01019A 13.20 1.222e-11 43-83
68 | PR00237 | RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE | PR00237E 13.03 5.091e-13 188-212; PR00237G 19.63 7.207e-13 268-295; PR00237A 11.48 4.375e-11 24-49; PR00237C 15.69 3.057e-10 101-124; PR00237D 8.94 4.750e-10 137-159; PR00237F 13.57 5.364e-10 230-255; PR00237B 13.50 9.438e-10 57-79
70 | PD01066 | PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. | PD01066 19.43 7.938e-28 31-70
71 | PR00830 | ENDOPEPTIDASE LA (LON) SERINE PROTEASE (S16) SIGNATURE | PR00830A 8.41 8.759e-12 348-368
72 | BL00120 | Lipases, serine proteins. | BL00120B 11.37 2.149e-10 148-163
77 | PR00753 | 1-AMINOCYCLOPROPANE-1-CARBOXYLATE SYNTHASE SIGNATURE | PR00753E 8.01 3.552e-11 191-216; PR00753D 6.85 2.778e-09 131-153
78 | PR00506 | D21 CLASS N6 ADENINE-SPECIFIC DNA METHYLTRANSFERASE SIGNATURE | PR00506C 19.40 8.017e-09 96-119
82 | BL00107 | Protein kinases ATP-binding region proteins. | BL00107A 18.39 3.571e-16 436-467
84 | BL00675 | Sigma-54 interaction domain proteins ATP-binding region A proteins. | BL00675A 24.86 8.800e-10 256-300
85 | BL00027 | 'Homeobox' domain proteins. | BL00027 26.43 2.286e-30 117-160
87 | BL00250 | TGF-beta family proteins. | BL00250A 21.24 6.786e-36 264-300; BL00250B 27.37 1.450e-26 328-364


91 | BL00215 | Mitochondrial energy transfer proteins. | BL00215A 15.82 9.250e-17 10-35; BL00215A 15.82 6.000e-16 221-246; BL00215A 15.82 7.857e-12 108-133; BL00215B 10.44 9.526e-11 168-181
92 | BL00027 | 'Homeobox' domain proteins. | BL00027 26.43 9.526e-24 324-367
95 | PR00094 | ADENYLATE KINASE SIGNATURE | PR00094C 12.94 1.000e-08 119-136
96 | PD02327 | GLYCOPROTEIN ANTIGEN PRECURSOR IMMUNOGLO. | PD02327B 19.84 2.091e-09 143-165
97 | BL00752 | XPA protein. | BL00752B 19.17 7.309e-09 28-72
98 | PR00876 | NEMATODE METALLOTHIONEIN SIGNATURE | PR00876B 7.66 2.268e-10 135-149
99 | PR00109 | TYROSINE KINASE CATALYTIC DOMAIN SIGNATURE | PR00109B 12.27 9.824e-12 122-141
100 | BL00027 | 'Homeobox' domain proteins. | BL00027 26.43 7.429e-31 118-161
101 | BL00028 | Zinc finger, C2H2 type, domain proteins. | BL00028 16.07 6.870e-12 370-387; BL00028 16.07 6.885e-11 398-415; BL00028 16.07 8.269e-11 342-359; BL00028 16.07 4.300e-10 229-246; BL00028 16.07 6.100e-10 258-275


102 | PR00048 | C2H2-TYPE ZINC FINGER SIGNATURE | PR00048A 10.52 7.750e-14 665-679; PR00048A 10.52 8.500e-14 581-595; PR00048A 10.52 9.250e-14 637-651; PR00048A 10.52 2.059e-12 609-623; PR00048A 10.52 2.588e-12 469-483; PR00048A 10.52 7.353e-12 553-567; PR00048A 10.52 2.895e-11 525-539; PR00048A 10.52 4.316e-11 441-455; PR00048A 10.52 5.263e-11 413-427; PR00048B 6.02 2.125e-10 569-579; PR00048B 6.02 4.938e-10 513-523; PR00048A 10.52 5.696e-10 497-511; PR00048B 6.02 8.875e-10 429-439; PR00048B 6.02 1.000e-09 457-467; PR00048B 6.02 6.684e-09 485-495
103 | PR00195 | DYNAMIN SIGNATURE | PR00195A 11.94 5.364e-22 31-50; PR00195B 9.47 1.783e-21 56-74; PR00195C 11.50 3.455e-21 126-144; PR00195D 11.76 8.714e-21 175-194; PR00195F 16.20 8.500e-20 217-237; PR00195E 9.82 8.650e-20 194-211


104 | BL01113 | C1q domain proteins. | BL01113A 17.99 1.865e-09 121-148; BL01113A 17.99 5.846e-09 82-109
105 | BL00420 | Speract receptor repeat proteins domain proteins. | BL00420A 20.42 6.400e-11 70-99; BL00420A 20.42 8.525e-10 73-102; BL00420A 20.42 5.708e-09 85-114
108 | PR00860 | VERTEBRATE METALLOTHIONEIN SIGNATURE | PR00860B 7.04 2.929e-20 27-41; PR00860A 5.46 5.500e-16 5-18; PR00860C 9.61 1.474e-14 41-51
112 | BL01031 | Heat shock hsp20 proteins family profile. | BL01031C 17.68 6.400e-10 122-147
114 | DM01840 | kw SPAC24B11.09 R07E5.13. | DM01840B 22.04 2.688e-40 59-103; DM01840A 10.95 9.571e-13 31-43
115 | BL01126 | Elongation factor Ts proteins. | BL01126A 18.48 2.317e-30 46-89; BL01126B 13.15 7.387e-19 116-135; BL01126C 9.20 9.735e-11 190-203
116 | BL00216 | Sugar transport proteins. | BL00216B 27.64 4.375e-21 35-85
118 | BL00437 | Catalase proximal heme-ligand proteins. | BL00437A 18.82 1.000e-40 49-101; BL00437B 16.28 1.000e-40 114-168; BL00437C 21.86 1.000e-40 190-239; BL00437D 25.72 1.000e-40 248-301; BL00437E 23.95 1.000e-40 327-379
119 | BL00140 | Ubiquitin carboxyl-terminal hydrolase family 1 cysteine activ. | BL00140D 22.64 8.274e-14 164-208; BL00140C 11.80 5.444e-10 77-102
120 | BL00224 | Clathrin light chain proteins. | BL00224B 16.94 6.712e-10 95-148


122 | BL00203 | Vertebrate metallothioneins proteins. | BL00203 13.94 1.000e-40 16-62
123 | PR00041 | CAMP RESPONSE ELEMENT BINDING (CREB) PROTEIN SIGNATURE | PR00041D 7.95 2.906e-09 24-41
124 | PR00041 | CAMP RESPONSE ELEMENT BINDING (CREB) PROTEIN SIGNATURE | PR00041D 7.95 2.906e-09 24-41


125 | BL00061 | Short-chain dehydrogenases/reductases family proteins. | BL00061C 7.86 3.250e-10 212-222
126 | PD01066 | PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. | PD01066 19.43 6.400e-25 251-290
127 | PR00318 | ALPHA G-PROTEIN (TRANSDUCIN) SIGNATURE | PR00318D 16.28 1.900e-34 219-248; PR00318B 14.79 3.455e-27 168-191; PR00318C 12.09 7.000e-23 197-215; PR00318A 7.84 1.600e-19 35-51; PR00318E 7.23 2.500e-12 265-275
128 | PR00927 | ADENINE NUCLEOTIDE TRANSLOCATOR 1 SIGNATURE | PR00927E 14.93 9.743e-10 67-89; PR00927B 14.66 4.575e-09 69-91
130 | BL00824 | Elongation factor 1 beta/beta'/delta chain proteins. | BL00824B 9.21 7.750e-22 133-153
131 | BL00824 | Elongation factor 1 beta/beta'/delta chain proteins. | BL00824C 14.58 1.000e-40 166-204; BL00824D 14.04 1.621e-38 204-239; BL00824B 9.21 7.750e-22 133-153; BL00824E 12.49 1.000e-19 247-263
132 | PR00209 | ALPHA/BETA GLIADIN FAMILY SIGNATURE | PR00209B 4.88 9.222e-13 1209-1228
133 | PR00209 | ALPHA/BETA GLIADIN FAMILY SIGNATURE | PR00209B 4.88 9.222e-13 1168-1187
134 | PR00708 | ALPHA-1-ACID GLYCOPROTEIN SIGNATURE | PR00708D 14.67 1.000e-27 141-168; PR00708C 11.77 1.643e-25 98-120; PR00708B 15.15 2.174e-24 73-95; PR00708E 13.33 1.600e-21 189-207; PR00708A 14.40 2.636e-21 51-70


135 PR00109 TYROSINE KINASE CATALYTIC PR00109B 12.27 8.468e-13
126-


DOMAIN SIGNATURE 145


136 PF00023 Ank repeat proteins. PF00023A 16.03 3.250e-10
201-


217


137 BL00471 Small cytokines (intercrine/chemokine) BL00471 23.92 7.480e-10
42-90


C-x-C subfamily signat.


140 PR00205 CADHERIN SIGNATURE PR00205B 11.39 5.582e-10
328-


346 PR00205B 11.39
9.018e-10


543-561


141 BL00412 Neuromodulin (GAP-43) BL00412D 16.54 7.704e-09
proteins. 976-


1027


143 PR00979 TAFAZZIN SIGNATURE PR00979E 10.83 5.950e-26
192-


214 PR00979A 11.91
8.773e-25


63-83 PR00979C 12.16
6.400e-19


108-124 PR00979D 12.38
7.955e-


19 170-185 PR00979F
10.14


3.382e-15 230-244 PR00979B


15.59 5.636e-15 94-106


145 DM00686 kw REPLICATION REP 28K DM00686C 14.14 7.720e-09
17.7K. 111-


131


146 PR00604 CLASS IA AND IB CYTOCHROME PR00604D 15.86 1.000e-17
C 87-


SIGNATURE 104 PR00604B 12.73
9.591e-16


57-73 PR00604C 10.21
8.200e-12


73-84 PR00604E 10.13
1.000e-11


106-117 PR00604A 11.13
8.800e-


11 44-52 PR00604F 8.60
1.000e-


10 123-132


147 BL00107 Protein kinases ATP-binding BL00107A 18.39 3.864e-15
region 266-


proteins. 297 BL00107B 13.31
6.143e-11


335-351


148 PD00289 PROTEIN SH3 DOMAIN REPEAT PD00289 9.97 8.448e-09
67-81


PRESYNA.


149 PR00069 ALDO-KETO REDUCTASE PR00069D 19.36 1.857e-30
187-


SIGNATURE 217 PR00069A 16.01
7.429e-25


41-66 PR00069E 18.14
3.100e-22


235-260 PR00069C 16.03
7.000e-


20 151-169 PR00069B
11.33


8.071e-19 101-120


150 BL00027 'Homeobox' domain proteins. BL00027 26.43 2.688e-27
139-182


151 PD02906 SYNTHASE I PSEUDOURIDYLATE PD02906C 24.17 7.070e-22
165-


PSEUDOURIDINE LYASE TR. 200 PD02906B 15.35
8.393e-15


114-127 PD02906A 10.84
6.500e-


09 71-84


153 BL00479 Phorbol esters / diacylglycerol BL00479A 19.86 5.091e-12
binding 891-


domain proteins. 914 BL00479B 12.57
1.837e-11


915-931


158 BL00027 'Homeobox' domain proteins. BL00027 26.43 6.786e-31
143-186


160 BL00422 Granins proteins. BL00422C 16.18 7.750e-12
420-


448


162 PR00625 DNAJ PROTEIN FAMILY PR00625A 12.84 9.297e-11
62-82


SIGNATURE


164 BL01282 BIR repeat proteins. BL01282B 30.49 6.182e-10
347-


386


166 PR00860 VERTEBRATE METALLOTHIONEIN PR00860B 7.04 2.929e-20
83-97


SIGNATURE PR00860A 5.46 1.000e-18
61-74


PR00860C 9.61 1.900e-15
97-107


167 PR00449 TRANSFORMING PROTEIN P21 PR00449A 13.20 7.052e-09
RAS 196-


SIGNATURE 218


169 BL00514 Fibrinogen beta and gamma BL00514C 17.41 1.346e-39
chains C- 316-


terminal domain proteins. 353 BL00514G 15.98
2.241e-34


471-501 BL00514H 14.95
6.571e-


27 510-535 BL00514E
14.28


1.273e-16 388-405 BL00514D


15.35 9.100e-15 369-382


BL00514B 16.42 4.857e-14
260-


276 BL00514F 11.65
9.690e-14


416-431 BL00514A 11.68
8.200e-


11 149-159


170 BL00514 Fibrinogen beta and gamma BL00514C 17.41 1.346e-39
chains C- 268-


terminal domain proteins. 305 BL00514G 15.98
2.241e-34


423-453 BL00514H 14.95
6.571e-


27 462-487 BL00514E
14.28


1.273e-16 340-357 BL00514D


15.35 9.100e-15 321-334


BL00514B 16.42 4.857e-14
212-


228 BL00514F 11.65
9.690e-14


368-383 BL00514A 11.68
8.200e-


11 101-111


171 BL00514 Fibrinogen beta and gamma BL00514G 15.98 2.241e-34
chains C- 385-


terminal domain proteins. 415 BL00514H 14.95
6.571e-27


424-449 BL00514C 17.41
4.632e-


24 230-267 BL00514E
14.28


1.273e-16 302-319 BL00514D


15.35 9.100e-15 283-296


BL00514B 16.42 4.857e-14
212-


228 BL00514F 11.65
9.690e-14


330-345 BL00514A 11.68
8.200e-


11 101-111


173 BL00027 'Homeobox' domain proteins. BL00027 26.43 9.400e-29
119-162


174 DM01970 0 kw ZK632.12 YDR313C DM01970B 8.60 5.119e-15
1391-


ENDOSOMAL III. 1404


176 BL00773 Chitinases family 19 proteins. BL00773C 9.42 8.000e-09
2-16


182 PR00109 TYROSINE KINASE CATALYTIC PR00109B 12.27 9.163e-14
141-


DOMAIN SIGNATURE 160


183 PD01937 DNA PROTEIN POLYMERASE PD01937A 6.68 3.475e-09
221-


ENDONUCLEASE DNA-. 232


185 BL00845 CAP-Gly domain proteins. BL00845 16.43 2.946e-23
247-272


BL00845 16.43 1.628e-21
107-132


186 PR00452 SH3 DOMAIN SIGNATURE PR00452B 11.65 6.538e-11
525-


541


187 PR00452 SH3 DOMAIN SIGNATURE PR00452B 11.65 6.538e-11
497-


513


188 DM01803 1 HERPESVIRUS GLYCOPROTEIN DM01803A 10.51 1.000e-09
H.


1081-1102


189 PF00651 BTB (also known as BR-C/Ttk) PF00651 15.00 5.091e-15
domain 69-82


proteins.


190 PR00194 TROPOMYOSIN SIGNATURE PR00194C 6.38 1.900e-35
145-


174 PR00194E 8.74 3.250e-30


231-257 PR00194D 9.57
1.500e-


26 175-199 PR00194B
10.24


5.200e-24 120-141 PR00194A


7.86 4.857e-21 84-102


192 PD02042 IRON-SULFUR ELECTRON PD02042B 16.75 5.154e-09
131-


TRANSPORT AROMATIC 146 PD02042A 21.13
5.909e-09


HYDROCARB. 94-121


193 PR00021 SMALL PROLINE-RICH PROTEIN PR00021A 4.31 2.200e-10
2-15


SIGNATURE


195 BL00463 Fungal Zn(2)-Cys(6) binuclear BL00463 8.22 5.071e-09
cluster 111-123


domain proteins.


196 PR00118 BETA-LACTAMASE CLASS A PR00118F 16.42 9.386e-09
165-


SIGNATURE 181


197 DM00215 PROLINE-RICH PROTEIN 3. DM00215 19.43 5.424e-09
234-


267


198 BL00660 Band 4.1 family domain BL00660A 31.50 5.500e-11
proteins. 714-


767


199 BL00282 Kazal serine protease BL00282 16.88 8.820e-13
inhibitors family 70-93


proteins.


202 PR00009 TYPE I EGF SIGNATURE PR00009A 14.15 5.345e-15
971-


987 PR00009C 14.11
8.773e-13


996-1008 PR00009D 16.83


8.000e-11 1008-1018
PR00009C


14.11 1.882e-09 892-904


203 BL00025 P-type 'Trefoil' domain BL00025 17.17 4.536e-19
proteins. 38-59


205 BL00018 EF-hand calcium-binding BL00018 7.41 7.300e-10
domain 165-178


proteins.


206 PR00168 SLOW VOLTAGE-GATED PR00168D 12.88 6.865e-11
67-86


POTASSIUM CHANNEL SIGNATURE


207 BL00025 P-type 'Trefoil' domain BL00025 17.17 3.423e-20
proteins. 39-60


BL00025 17.17 8.750e-16
88-109


209 BL00646 Ribosomal protein S13 BL00646B 21.42 6.100e-30
proteins. 110-


143 BL00646A 25.82
6.192e-29


14-62


210 PR00138 MATRIXIN SIGNATURE PR00138D 16.56 3.605e-25
279-


305 PR00138C 16.41
3.000e-24


218-247 PR00138E 6.01
8.714e-


13 314-328 PR00138A
15.14


9.538e-13 134-148 PR00138B


15.82 4.522e-12 188-204


211 DM01206 CORONAVIRUS NUCLEOCAPSID DM01206B 10.69 8.429e-12
386-


PROTEIN. 406 DM01206B 10.69
1.247e-10


384-404 DM01206B 10.69


5.068e-10 388-408


212 PD01941 TRANSMEMBRANE PD01941A 14.81 1.000e-40
163-


COTRANSPORTER SYMP. 217 PD01941B 15.02
9.705e-30


420-467 PD01941E 15.92
8.714e-


23 837-884 PD01941C
19.96


8.200e-20 508-563 PD01941D


27.18 1.600e-16 661-710


PD01941F 28.52 9.645e-15
1005-


1060


213 BL00362 Ribosomal protein S15 BL00362 24.67 8.313e-09
proteins. 330-373


214 BL00115 Eukaryotic RNA polymerase BL00115Z 3.12 2.125e-09
II 1178-


heptapeptide repeat proteins. 1227 BL00115Z 3.12
6.096e-09


1164-1213


215 BL00038 Myc-type, 'helix-loop-helix' BL00038B 16.97 7.600e-18
dimerization 125-


domain proteins. 146 BL00038A 13.61
1.474e-13


102-118


216 BL01108 Ribosomal protein L24 BL01108A 20.33 2.241e-22
proteins. 49-82


BL01108B 11.40 8.457e-10
96-


107


217 PR00381 KINESIN LIGHT CHAIN SIGNATURE PR00381A 9.55 1.321e-10
360-


378


222 BL00514 Fibrinogen beta and gamma BL00514C 17.41 2.358e-26
chains C- 1166-


terminal domain proteins. 1203 BL00514G 15.98
9.000e-15


1289-1319 BL00514D
15.35


6.936e-12 1207-1220
BL00514F


11.65 4.288e-10 1253-1268


BL00514H 14.95 8.636e-10
1318-


1343


223 BL00325 Actin-depolymerizing proteins. BL00325B 21.66 1.000e-40
93-


139 BL00325A 24.83
9.333e-24


61-93


224 BL00018 EF-hand calcium-binding BL00018 7.41 1.450e-10
domain 231-244


proteins.


225 PF01329 Pterin 4 alpha carbinolamine PF01329B 18.52 1.692e-18
dehydratase. 67-92


228 BL00211 ABC transporters family BL00211B 13.37 6.250e-18
proteins. 1033-


1065 BL00211B 13.37
8.875e-18


2045-2077 BL00211A
12.23


1.900e-09 931-943


230 PR00761 BINDIN PRECURSOR SIGNATURE PR00761A 5.81 9.366e-09
275-


292


23 PR00049 WILM'S TUMOUR PROTEIN PR00049D 0.00 3.500e-10
2 54-69


SIGNATURE


232 BL00412 Neuromodulin (GAP-43) BL00412D 16.54 1.978e-10
proteins. 109-


160 BL00412D 16.54
4.122e-09


133-184


233 BL01210 Caveolins proteins. BL01210B 13.92 8.129e-09
106-


156


236 BL00939 Ribosomal protein L1e BL00939F 17.27 5.393e-09
proteins. 861-


891


238 BL01252 Endogenous opioids neuropeptides BL01252D 18.25 3.571e-28
205-


precursors proteins. 233 BL01252B 19.09
5.034e-27


37-67 BL01252C 18.10
1.621e-21


164-190 BL01252A 14.22
7.107e-


18 14-34


239 BL00302 Eukaryotic initiation BL00302 14.81 1.000e-40
factor 5A hypusine 25-79


proteins.


240 PR00420 AROMATIC-RING HYDROXYLASE PR00420A 14.78 8.851e-13
26-49


(FLAVOPROTEIN


MONOOXYGENASE) SIGNATURE


241 PD02929 ADHESION GLYCOPROTEIN PD02929A 28.27 4.529e-09
235-


PRECURSOR I. 289


243 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 8.527e-25
11-50


FINGER METAL-BINDING NU.


244 BL01270 Band 7 protein family BL01270C 16.91 6.745e-17
proteins. 115-


144 BL01270B 18.74
6.857e-17


76-115 BL01270E 13.03
6.016e-


15 182-211 BL01270D
20.87


9.160e-13 144-182


245 PF00791 Domain present in ZO-1 PF00791B 28.49 6.305e-12
and Unc5-like 253-


netrin receptors. 308 PF00791B 28.49
1.909e-11


427-482 PF00791B 28.49
2.651e-


09 179-234 PF00791B
28.49


3.890e-09 112-167


246 PD00066 PROTEIN ZINC-FINGER METAL- PD00066 13.92 2.500e-13
277-290


BINDI. PD00066 13.92 9.143e-12
193-206


PD00066 13.92 5.304e-11
165-178


PD00066 13.92 6.478e-11 249-262


PD00066 13.92 3.423e-10
221-234


247 BL00406 Actins proteins. BL00406D 12.58 6.400e-20
465-


520 BL00406B 5.47 4.857e-14


249-304 BL00406E 8.44
1.000e-


11 522-572 BL00406C
6.75


5.449e-11 313-368


248 BL00951 ER lumen protein retaining BL00951C 19.35 1.000e-40
receptor 112-


proteins. 161 BL00951A 15.10
7.750e-39


21-57 BL00951D 13.94
6.000e-38


161-196 BL00951B 14.23
3.100e-


31 57-88


252 BL01113 C1q domain proteins. BL01113A 17.99 9.129e-15
200-


227 BL01113A 17.99
4.818e-14


194-221 BL01113A 17.99
7.818e-


14 182-209 BL01113A
17.99


1.730e-13 185-212 BL01113A


17.99 6.595e-13 191-218


BL01113A 17.99 6.077e-12
203-


230 BL01113A 17.99
9.182e-11


179-206 BL01113A 17.99
2.532e-


10 176-203 BL01113A
17.99


9.043e-10 218-245 BL01113A


17.99 9.426e-10 209-236


BL01113A 17.99 4.115e-09
137-


164


257 BL00845 CAP-Gly domain proteins. BL00845 16.43 1.837e-21 466-491


259 PR00248 METABOTROPIC GLUTAMATE PR00248G 12.67 2.688e-09
53-78


GPCR SIGNATURE


260 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 3.400e-10
proteins. 441-452


BL00678 9.67 5.800e-10
481-492


BL00678 9.67 8.800e-10
358-369


261 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 3.400e-10
proteins. 415-426


BL00678 9.67 5.800e-10
455-466


BL00678 9.67 8.800e-10
332-343


262 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 3.400e-10
proteins. 468-479


BL00678 9.67 5.800e-10
508-519


BL00678 9.67 8.800e-10
385-396


263 BL50002 Src homology 3 (SH3) domain BL50002B 15.18 2.200e-10
proteins 415-


profile. 429


264 BL00049 Ribosomal protein L14 BL00049C 17.38 3.040e-12
proteins. 94-


130


265 PD01469 GLYCOPROTEIN PROTEIN PD01469 20.59 2.091e-14
438-470


PRECURSOR SA.


266 PD01469 GLYCOPROTEIN PROTEIN PD01469 20.59 2.091e-14
279-311


PRECURSOR SA.


267 BL00567 Phosphoribulokinase proteins. BL00567A 10.66 1.161e-12
36-55


269 BL00049 Ribosomal protein L14 BL00049C 17.38 2.688e-28
proteins. 92-


128 BL00049B 18.42
6.806e-24


54-86 BL00049A 13.86
8.333e-19


19-42 BL00049D 13.47
5.765e-12


129-140


272 BL01115 GTP-binding nuclear protein BL01115A 10.22 9.735e-12
ran proteins. 14-58


273 PR00021 SMALL PROLINE-RICH PROTEIN PR00021A 4.31 1.911e-09
819-


SIGNATURE 832


275 PR00179 LIPOCALIN SIGNATURE PR00179B 9.56 2.895e-13
124-


137 PR00179A 13.78
3.250e-11


36-49 PR00179C 19.02
6.040e-11


154-170


276 PR00449 TRANSFORMING PROTEIN P21 PR00449A 13.20 8.364e-17
RAS 22-44


SIGNATURE PR00449C 17.27 1.000e-13
62-85


PR00449E 13.50 4.000e-12
172-


195 PR00449B 14.34
5.680e-10


45-62


277 BL00140 Ubiquitin carboxyl-terminal BL00140D 22.64 1.000e-40
hydrolase 161-


family 1 cysteine activ. 205 BL00140C 11.80
9.053e-30


79-104 BL00140A 15.96
9.400e-


28 5-35 BL00140B 12.29
4.649e-


17 37-55


278 PD02712 ELEMENT TRANSPOSASE FOR PD02712A 23.03 8.013e-09
47-83


TRANSPOSON TRANSPOSABLE.


279 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 1.474e-09
proteins. 100-111


282 DM00892 3 RETROVIRAL PROTEINASE. DM00892C 23.55 4.767e-21
864-


898


283 BL00048 Protamine P1 proteins. BL00048 6.39 9.550e-09
56-83


286 PR00081 GLUCOSE/RIBITOL PR00081A 10.53 1.878e-11
36-54


DEHYDROGENASE FAMILY


SIGNATURE


287 PR00310 ANTI-PROLIFERATIVE PROTEIN PR00310B 10.59 4.231e-17
29-59


BTG1 FAMILY SIGNATURE PR00310D 9.10 6.679e-16
89-119


289 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 7.000e-36
37-76


FINGER METAL-BINDING NU.


293 BL00979 G-protein coupled receptors BL00979L 20.63 3.800e-12
family 3 111-


proteins. 152


295 PD02411 PROTEIN TRANSCRIPTION PD02411 21.89 7.000e-16 195-229


REGULATION NUCLEAR.


296 BL01064 Pyridoxamine 5'-phosphate BL01064A 27.84 8.313e-28
oxidase 77-


proteins. 129 BL01064C 15.22
7.136e-25


202-235


297 BL00030 Eukaryotic RNA-binding BL00030A 14.39 2.929e-13
region RNP-1 37-56


proteins. BL00030B 7.03 1.900e-11
167-


177 BL00030A 14.39
2.000e-10


128-147


298 BL01183 ubiE/COQ5 methyltransferase BL01183B 21.31 6.660e-12
family 143-


proteins. 188


299 BL01279 Protein-L-isoaspartate(D-aspartate) BL01279A 24.27 5.862e-11
O- 57-


methyltransferase signa. 105


301 BL00191 Cytochrome b5 family, heme-binding BL00191K 17.38 4.951e-27
184-


domain proteins. 228 BL00191J 11.37
6.447e-17


128-150


302 DM00892 3 RETROVIRAL PROTEINASE. DM00892C 23.55 3.893e-16
33-67


306 PF01140 Matrix protein (MA), p15. PF01140D 15.54 2.988e-09
416-


451


307 PR00245 OLFACTORY RECEPTOR PR00245A 18.03 4.818e-21
59-81


SIGNATURE PR00245C 7.84 5.154e-20
238-


254 PR00245D 10.47
4.000e-15


274-286 PR00245B 10.38
8.200e-


15 177-192 PR00245E
12.40


5.714e-12 291-306


309 BL00203 Vertebrate metallothioneins BL00203 13.94 2.245e-10
proteins. 612-658


310 BL00237 G-protein coupled receptors BL00237A 27.68 7.632e-23
proteins. 119-


159 BL00237C 13.19
3.864e-15


251-278 BL00237D 11.23
3.739e-


12 312-329


311 BL00380 Rhodanese proteins. BL00380D 15.90 8.200e-28
110-


136 BL00380G 11.26
5.800e-16


267-280 BL00380B 14.77
7.000e-


14 49-62 BL00380F
9.76 5.886e-


13 203-214 BL00380C
15.67


7.387e-13 82-98 BL00380E
12.44


7.000e-11 181-193
BL00380A


10.48 1.000e-09 10-20


312 BL00227 Tubulin subunits alpha, BL00227B 19.29 1.000e-40
beta, and gamma 50-


proteins. 105 BL00227C 25.48
1.000e-40


111-163 BL00227D 18.46
1.000e-


40 220-274 BL00227F
21.16


1.000e-40 372-426
BL00227A


24.55 3.250e-39 1-35
BL00227E


24.15 8.500e-34 324-359


327 BL00232 Cadherins extracellular BL00232B 32.79 7.362e-21 225-
repeat proteins


domain proteins. 273 BL00232B 32.79
2.588e-17


435-483 BL00232B 32.79
6.301e-


15 116-164 BL00232B
32.79


6.769e-13 330-378
BL00232C


10.65 9.341e-12 223-241


BL00232C 10.65 5.696e-11
328-


346 BL00232C 10.65
3.942e-10


433-451


329 PD02749 TRANSCRIPTION PROTEIN FACTOR PD02749B 12.75 2.241e-37
35-71


BTF3 REGULATION NUCL. PD02749C 13.96 4.892e-28
87-


121 PD02749A 9.56
6.000e-15 2-


15


330 PR00391 PHOSPHATIDYLINOSITOL PR00391E 12.50 7.785e-15
211-


TRANSFER PROTEIN SIGNATURE 231 PR00391B 8.39
1.000e-13


83-104 PR00391D 12.21
9.328e-


13 191-207 PR00391A
7.83


5.390e-11 16-36


332 BL01030 RNA polymerases M / 15 BL01030 23.44 1.818e-23
Kd subunits 87-125


proteins.


337 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 2.929e-32
6-45


FINGER METAL-BINDING NU.


340 PD02711 SYNTHASE PD02711B 14.26 1.973e-20
944-


PHOSPHORIBOSYLFORMYLGLY. 968


343 BL00223 Annexins repeat proteins BL00223C 24.79 1.000e-40
domain 245-


proteins. 300 BL00223B 28.47
8.714e-38


168-218 BL00223A 15.59
8.250e-


27 98-132 BL00223A
15.59


8.750e-27 26-60 BL00223C
24.79


9.438e-16 13-68 BL00223C
24.79


2.735e-15 85-140 BL00223A


15.59 2.253e-11 258-292


346 PR00345 STATHMIN FAMILY SIGNATURE PR00345B 7.12 2.800e-28
81-110


PR00345E 8.54 7.652e-28
158-


183 PR00345C 4.54 9.100e-28


110-134 PR00345D 10.97
1.964e-


24 134-158 PR00345A
13.46


5.645e-16 52-71


347 BL00586 Ribosomal protein L16 BL00586B 17.00 3.215e-15
proteins. 184-


221


348 PR00388 3',5'-CYCLIC NUCLEOTIDE PR00388A 10.45 2.778e-09
CLASS II 86-


PHOSPHODIESTERASE SIGNATURE 105


351 BL00018 EF-hand calcium-binding BL00018 7.41 3.118e-11
domain 160-173


proteins. BL00018 7.41 2.350e-10
244-257


354 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 1.947e-09
proteins. 256-267


358 DM01206 CORONAVIRUS NUCLEOCAPSID DM01206B 10.69 3.278e-09
175-


PROTEIN. 195 DM01206B 10.69
6.696e-09


183-203 DM01206B 10.69


8.633e-09 132-152 DM01206B


10.69 8.861e-09 181-201


DM01206B 10.69 9.316e-09
177-


197


361 PD01498 OXIDASE BIOSYNTHESIS PD01498C 24.90 6.880e-14
219-


OXIDOREDUCTASE PORP. 263


362 PD01498 OXIDASE BIOSYNTHESIS PD01498C 24.90 6.880e-14
219-


OXIDOREDUCTASE PORP. 263


365 BL00178 Aminoacyl-transfer RNA BL00178B 7.11 1.000e-11
synthetases 589-


class-I proteins. 600 BL00178A 14.23
8.500e-09


46-56


366 BL00523 Sulfatases proteins. BL00523E 19.27 1.000e-23
318-


348 BL00523A 13.36
5.500e-16


30-47 BL00523B 8.64
1.964e-13


78-90 BL00523C 12.64
9.625e-13


129-140 BL00523G 9.46
5.500e-


10 506-516


369 BL00107 Protein kinases ATP-binding BL00107A 18.39 4.818e-09
region 21-52


proteins.


370 BL00880 Acyl-CoA-binding protein. BL00880 17.52 1.000e-40
75-125


371 BL00107 Protein kinases ATP-binding BL00107A 18.39 1.000e-23
region 276-


proteins. 307 BL00107B 13.31
1.692e-12


342-358


372 PR00211 GLUTELIN SIGNATURE PR00211B 0.86 6.602e-11
326-


347 PR00211B 0.86 6.106e-10


320-341 PR00211B 0.86
3.167e-


09 333-354


373 BL00279 Membrane attack complex BL00279E 37.11 9.349e-10
components / 749-


perforin proteins. 797


375 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 1.231e-33
10-49


FINGER METAL-BINDING NU.


377 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 7.563e-28
10-49


FINGER METAL-BINDING NU.


379 BL00598 Chromo domain proteins. BL00598 14.45 5.781e-16
3-25


380 PR00413 HALOACID PR00413D 11.28 8.941e-09
864-


DEHALOGENASE/EPOXIDE 878


HYDROLASE FAMILY SIGNATURE


383 PR00413 HALOACID PR00413D 11.28 8.941e-09
864-


DEHALOGENASE/EPOXIDE 878


HYDROLASE FAMILY SIGNATURE


387 BL01060 Flagella transport protein BL01060A 15.65 1.535e-09
fliP family 131-


proteins. 174


388 PR00209 ALPHA/BETA GLIADIN FAMILY PR00209B 4.88 6.318e-11
1009-


SIGNATURE 1028


389 PR00837 ALLERGEN V5/TPX-1 FAMILY PR00837B 11.64 1.000e-10
469-


SIGNATURE 483


391 BL00240 Receptor tyrosine kinase BL00240B 24.70 7.907e-10
class III 118-


proteins. 142


392 PR00014 FIBRONECTIN TYPE III REPEAT PR00014D 12.04 8.412e-10
691-


SIGNATURE 706


393 PR00014 FIBRONECTIN TYPE III REPEAT PR00014D 12.04 8.412e-10
706-


SIGNATURE 721


394 BL01209 LDL-receptor class A (LDLRA) BL01209 9.31 3.368e-15
domain 47-60


proteins. BL01209 9.31 5.500e-13
92-105


395 BL00634 Ribosomal protein L30 BL00634 34.38 4.090e-13
proteins. 70-121


396 BL01013 Oxysterol-binding protein BL01013D 26.81 8.000e-26
family 358-


proteins. 402 BL01013A 25.14
7.231e-21


45-81 BL01013C 9.97
1.000e-13


132-142 BL01013B 11.33
1.000e-


11 110-121


397 BL00930 Peripherin / rom-1 proteins. BL00930E 17.80 1.000e-40
56-92


BL00930D 9.12 4.632e-37
12-56


BL00930F 16.91 2.800e-36
92-


133


400 PR00780 LEUSERPIN 2 SIGNATURE PR00780B 4.89 4.491e-09
262-


285


401 PR00819 CBXX/CFQX SUPERFAMILY PR00819B 10.83 7.158e-11 4-20


SIGNATURE


403 BL00381 Endopeptidase Clp serine BL00381C 23.84 1.250e-32
proteins. 150-


194 BL00381A 16.48
2.286e-22


74-111 BL00381B 21.42
8.326e-


14 78-130


405 BL01105 Ribosomal protein L35Ae BL01105A 17.37 1.000e-40
proteins. 4-49


BL01105B 12.95 1.000e-40
68-


108


406 BL00344 GATA-type zinc finger BL00344 17.99 7.000e-12
domain proteins. 814-852


407 PR00211 GLUTELIN SIGNATURE PR00211B 0.86 9.750e-09
73-94


409 PR00910 LUTEOVIRUS ORF6 PROTEIN PR00910A 2.51 4.321e-09
9-22


SIGNATURE


410 BL00762 WHEP-TRS domain proteins. BL00762A 23.43 1.000e-28
752-


789 BL00762A 23.43
4.400e-21


903-940 BL00762A 23.43
5.415e-


18 825-862 BL00762B
16.14


8.759e-12 1154-1168


412 BL00690 DEAH-box subfamily ATP-dependent BL00690B 13.38 5.320e-15
262-


helicases proteins. 280 BL00690A 6.87 1.818e-13


230-240


415 BL00227 Tubulin subunits alpha, BL00227B 19.29 1.000e-40
beta, and gamma 52-


proteins. 107 BL00227C 25.48
1.000e-40


113-165 BL00227D 18.46
1.000e-


40 222-276 BL00227F
21.16


1.000e-40 382-436 BL00227E


24.15 1.750e-34 326-361


BL00227A 24.55 1.000e-33
1-35


416 PF00992 Troponin. PF00992A 16.67 1.711e-09
557-


592


418 BL00541 Nuclear transition protein BL00541 8.44 9.875e-09
1 proteins. 256-310


419 BL00541 Nuclear transition protein BL00541 8.44 9.875e-09
1 proteins. 197-251


420 PF00856 SET domain proteins. PF00856A 26.14 9.074e-13
901-


938 PF00856B 16.42
2.397e-12


951-973


421 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 8.200e-12
proteins. 33-44


423 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 8.600e-30
130-169


FINGER METAL-BINDING NU.


424 PF00564 Octicosapeptide repeat PF00564B 24.74 1.305e-17
proteins. 421-


472


426 PR00988 URIDINE KINASE SIGNATURE PR00988A 6.39 4.569e-12
3-21


427 PR00988 URIDINE KINASE SIGNATURE PR00988A 6.39 4.569e-12
3-21


428 BL00478 LIM domain proteins. BL00478B 14.79 3.250e-13
115-


130 BL00478B 14.79
9.036e-13


50-65


431 BL00282 Kazal serine protease BL00282 16.88 8.875e-12
inhibitors family 464-487


proteins.


432 PD00930 PROTEIN GTPASE DOMAIN PD00930B 33.72 7.800e-18
316-


ACTIVATION. 357 PD00930A 25.62
9.617e-12


125-151 PD00930B 33.72
2.521e-


10 214-255


433 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 4.649e-34
34-73


FINGER METAL-BINDING NU.


434 PR00449 TRANSFORMING PROTEIN P21 PR00449A 13.20 7.563e-11
RAS 56-78


SIGNATURE


436 PR00120 H+-TRANSPORTING ATPASE PR00120C 9.90 5.800e-19
705-


(PROTON PUMP) SIGNATURE 722


437 BL00115 Eukaryotic RNA polymerase BL00115T 8.45 7.273e-29
II 1208-


heptapeptide repeat proteins. 1242 BL00115Q 18.08
2.776e-21


953-983 BL00115Y 11.86
8.000e-


17 1604-1650 BL00115M
19.19


8.130e-16 731-774 BL00115H


14.34 9.392e-16 463-496


BL00115A 15.44 7.414e-15
43-82


BL00115R 6.50 6.128e-14
983-


1010 BL00115J 16.71
9.289e-14


591-617 BL00115I 8.33
4.336e-


13 535-590 BL00115L
12.25


5.939e-13 662-694 BL00115G


11.65 6.011e-13 435-463


BL00115K 15.03 3.417e-10
617-


659 BL00115O 16.76
5.805e-10


863-913 BL00115P 11.54
7.538e-


10 913-953 BL00115S
18.24


7.968e-10 1010-1052
BL00115U


10.34 4.475e-09 1242-1265


438 PF00628 PHD-finger. PF00628 15.84 4.536e-10
219-234


440 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 6.351e-34
10-49


FINGER METAL-BINDING NU.


441 PR00309 ARRESTIN SIGNATURE PR00309A 9.68 5.250e-24
32-55


PR00309D 7.09 4.938e-23
290-


309 PR00309B 7.81 2.800e-21


69-88 PR00309C 8.22
1.621e-19


165-183 PR00309E 9.82
9.438e-


15 374-389


442 BL00600 Aminotransferases class-III BL00600B 19.60 7.324e-14
pyridoxal- 103-


phosphate attachment si. 129 BL00600G 12.43
2.125e-12


306-325 BL00600F 8.77
8.105e-


12 271-284 BL00600E
16.43


3.167e-11 228-257
BL00600D


8.71 8.650e-09 207-221


443 BL00972 Ubiquitin carboxyl-terminal BL00972A 11.93 3.160e-18
hydrolases 69-87


family 2 proteins.


444 BL00349 CTF/NF-I proteins. BL00349A 10.07 1.000e-40
8-54


BL00349C 9.33 1.000e-40
82-125


BL00349E 10.79 1.000e-40
152-


195 BL00349F 11.81
1.000e-40


213-255 BL00349H 15.70
7.387e-


36 361-399 BL00349B
10.51


2.227e-34 54-82 BL00349D
11.70


9.100e-34 125-152
BL00349G


19.72 5.781e-30 323-356


445 BL00154 E1-E2 ATPases phosphorylation BL00154F 8.23 8.941e-21 271-
site


proteins. 295 BL00154E 20.37
2.620e-15


124-165


448 DM00215 PROLINE-RICH PROTEIN 3. DM00215 19.43 4.882e-11
82-115


DM00215 19.43 6.492e-09
87-120


451 BL01283 T-box domain proteins. BL01283A 24.15 3.100e-40
112-


160 BL01283D 11.70
6.000e-39


253-286 BL01283B 23.17
6.538e-


38 170-212 BL01283C
13.05


7.750e-19 222-236


452 PR00420 AROMATIC-RING HYDROXYLASE PR00420A 14.78 2.579e-11
3-26


(FLAVOPROTEIN


MONOOXYGENASE) SIGNATURE


453 PR00162 RIESKE 2FE-2S SUBUNIT PR00162B 12.77 7.429e-17
215-


SIGNATURE 228 PR00162A 9.35
2.324e-14


193-205 PR00162C 8.10
7.120e-


14 227-240


454 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 7.000e-30
87-126


FINGER METAL-BINDING NU.


456 BL00027 'Homeobox' domain proteins. BL00027 26.43 9.333e-18
1149-


1192


457 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 2.737e-24
16-55


FINGER METAL-BINDING NU.


459 BL00290 Immunoglobulins and major BL00290A 20.89 1.529e-14
154-


histocompatibility complex 177 BL00290B 13.17
proteins. 9.000e-12


214-232


460 PR00413 HALOACID PR00413F 14.91 7.333e-11
193-


DEHALOGENASE/EPOXIDE 214 PR00413E 15.78
5.714e-09


HYDROLASE FAMILY SIGNATURE 175-192


463 PR00759 BASIC PROTEASE (KUNITZ-TYPE) PR00759B 11.26 8.385e-09
74-85


INHIBITOR FAMILY SIGNATURE


466 BL00019 Actinin-type actin-binding BL00019D 15.33 4.200e-19
domain 300-


proteins. 330


467 BL00019 Actinin-type actin-binding BL00019D 15.33 4.200e-19
domain 300-


proteins. 330


469 PR00153 CYCLOPHILIN PEPTIDYL-PROLYL PR00153D 11.99 3.250e-15
510-


CIS-TRANS ISOMERASE 523 PR00153C 11.01 4.682e-14


SIGNATURE 495-511 PR00153E 9.10
8.548e-


14 523-539 PR00153B
11.57


1.720e-13 452-465


470 BL00491 Aminopeptidase P and proline BL00491C 12.15 3.912e-09
557-


dipeptidase proteins. 572


471 PD00289 PROTEIN SH3 DOMAIN REPEAT PD00289 9.97 1.000e-14
1482-


PRESYNA. 1496 PD00289 9.97 8.650e-11


1122-1136


474 BL50040 Elongation factor 1 gamma BL50040D 17.41 1.000e-40
chain profile. 279-


329 BL50040E 18.79
1.000e-40


333-388 BL50040F 18.99
5.320e-


40 390-428 BL50040C
22.62


3.739e-38 141-184 BL50040B


13.65 7.000e-30 59-85
BL50040A


12.98 1.450e-14 10-22


475 BL01144 Ribosomal protein L31 BL01144 25.07 1.000e-40
a proteins. 22-74


476 PR00007 COMPLEMENT C1Q DOMAIN PR00007C 15.60 2.421e-21
589-


SIGNATURE 611 PR00007B 14.16
3.500e-21


544-564 PR00007A 19.33
6.897e-


20 517-544 PR00007D
9.64


6.571e-12 623-634


477 BL50002 Src homology 3 (SH3) domain BL50002A 14.19 5.846e-10
proteins 170-


profile. 189


479 DM01970 0 kw ZK632.12 YDR313C DM01970B 8.60 9.500e-17
967-


ENDOSOMAL III. 980


480 PR00868 DNA-POLYMERASE FAMILY PR00868C 13.76 5.688e-17
A (POL 284-


I) SIGNATURE 308 PR00868A 16.33
3.186e-13


224-247 PR00868H 12.51
3.388e-


13 431-448 PR00868I
10.87


7.938e-11 462-476 PR00868E


13.19 1.608e-10 340-366


481 BL00027 'Homeobox' domain proteins. BL00027 26.43 9.182e-22
53-96


482 BL00061 Short-chain dehydrogenases/reductases BL00061B 25.79 3.647e-21
188-


family proteins. 226


483 BL50002 Src homology 3 (SH3) domain BL50002A 14.19 1.750e-12
proteins 1032-


profile. 1051


485 PF00023 Ank repeat proteins. PF00023A 16.03 9.625e-10
760-


776 PF00023A 16.03
3.571e-09


715-731


486 PD02870 RECEPTOR INTERLEUKIN-1 PD02870B 18.83 9.262e-20
103-


PRECURSOR. 136 PD02870D 15.74
9.426e-09


201-236


487 PR00370 FLAVIN-CONTAINING PR00370G 10.45 3.769e-28
471-


MONOOXYGENASE (FMO) 493 PR00370B 10.91
1.000e-24


SIGNATURE 27-46 PR00370C 12.72
4.000e-21


140-157 PR00370E 11.96
9.229e-


21 320-339 PR00370D
16.33


1.750e-20 185-204 PR00370F


17.75 7.395e-20 375-395


PR00370A 3.35 2.038e-18
4-20


489 PD01675 GLYCOPROTEIN MAJOR ENVELOPE PD01675C 19.89 2.330e-10
55-89


PROBABLE U3.


492 BL00211 ABC transporters family BL00211A 12.23 5.050e-09
proteins. 45-57


493 BL00211 ABC transporters family BL00211A 12.23 5.050e-09
proteins. 45-57


494 BL00211 ABC transporters family BL00211A 12.23 5.050e-09
proteins. 58-70


495 BL00027 'Homeobox' domain proteins. BL00027 26.43 6.786e-12
509-552


BL00027 26.43 9.143e-12
319-362


BL00027 26.43 2.600e-11
627-670


BL00027 26.43 3.625e-10
779-822


497 BL00107 Protein kinases ATP-binding BL00107A 18.39 5.800e-22
region 214-


proteins. 245 BL00107B 13.31
1.000e-13


281-297 BL00107A 18.39
3.520e-


13 583-614 BL00107B
13.31


8.615e-12 652-668


499 BL00383 Tyrosine specific protein BL00383E 10.35 1.000e-14
phosphatases 1902-


proteins. 1913 BL00383D 11.92
3.077e-14


1862-1875 BL00383A
13.34


5.500e-14 1730-1745
BL00383C


10.10 2.000e-13 1785-1796


BL00383F 15.51 9.069e-12
1940-


1956 BL00383B 7.61
1.692e-11


1755-1764


501 PR00019 LEUCINE-RICH REPEAT PR00019B 11.36 1.360e-09
136-


SIGNATURE 150 PR00019A 11.19
1.667e-09


91-105 PR00019B 11.36
4.600e-


09 160-174


503 BL00226 Intermediate filaments BL00226D 19.10 1.000e-40
proteins. 367-


414 BL00226B 23.86
6.143e-27


195-243 BL00226A 12.?7
7.840e-


14 96-111 BL00226C
13.23


2.600e-13 309-340 BL00226C


13.23 6.143e-12 266-297


BL00226B 23.86 1.209e-09
146-


194


505 PD02407 3-BISPHOSPHOGLYCERATE- PD02407F 7.61 6.739e-09
916-


INDEPENDENT PHOSPHOGLYCER. 930


506 PF00632 HECT-domain (ubiquitin-transferase). PF00632C 20.66 9.830e-19
991-


1023 PF00632B 18.45
1.155e-11


940-968


507 BL01082 Ribosomal protein L7Ae BL01082 20.37 4.273e-20
proteins. 76-116


508 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 2.421e-09
proteins. 493-504


509 BL00678 Trp-Asp (WD) repeat proteins BL00678 9.67 2.421e-09
proteins. 473-484


510 PR00320 G-PROTEIN BETA WD-40 REPEAT PR00320B 12.19 4.774e-11
567-


SIGNATURE 582 PR00320B 12.19
5.886e-10


763-778 PR00320C 13.01
6.760e-


10 567-582 PR00320A
16.74


7.618e-10 846-861 PR00320A


16.74 3.415e-09 763-778


PR00320A 16.74 6.268e-09
567-


582


511 BL00479 Phorbol esters / diacylglycerol BL00479C 12.01 3.250e-12
binding 170-


domain proteins. 183


512 BL50058 G-protein gamma subunit BL50058 27.23 7.494e-09
profile. 10-58


513 BL00524 Somatomedin B domain proteins. BL00524A 9.65 8.925e-14
80-101


515 BL00041 Bacterial regulatory proteins, BL00041 23.99 1.964e-19
araC family 492-524


proteins.


516 PD00066 PROTEIN ZINC-FINGER METAL- PD00066 13.92 8.500e-13
391-404


BINDI.


517 BL00415 Synapsins proteins. BL00415E 4.82 9.291e-09
959-


996


518 PR00109 TYROSINE KINASE CATALYTIC PR00109B 12.27 9.471e-12
126-


DOMAIN SIGNATURE 145


519 BL00290 Immunoglobulins and major BL00290B 13.17 4.750e-09
47-65


histocompatibility complex
proteins.


522 PR00505 D12 CLASS N6 ADENINE-SPECIFIC PR00505A 14.15 7.128e-09
364-


DNA METHYLTRANSFERASE 381


SIGNATURE


525 BL00312 Glycophorin A proteins. BL00312B 9.22 5.781e-10
891-


920


528 PD01066 PROTEIN ZINC FINGER ZINC- PD01066 19.43 2.500e-32
16-55


FINGER METAL-BINDING NU.


529 PR00254 NICOTINIC ACETYLCHOLINE PR00254D 15.50 4.000e-17
131-


RECEPTOR SIGNATURE 150 PR00254A 11.23
4.706e-14


61-78 PR00254C 11.36
4.000e-12


113-126 PR00254B 12.97
1.486e-


11 95-110


531 BL00741 Guanine-nucleotide dissociation BL00741B 14.27 6.870e-16
787-


stimulators CDC24 family 810
sign.


532 PR00193 MYOSIN HEAVY CHATN PR00193D 14.36 3.143e-34
447-


SIGNATURE 476 PR00193C 12.60
7.632e-32


216-244 PR00193B 11.69
7.750e-


29 167-193 PR00193A
15.41


2.588e-22 111-131 PR00193E


19.47 2.200e-21 501-530


533 PD02870 RECEPTOR INTERLEUKIN-1 PD02870B 18.83 5.596e-09
348-


PRECURSOR. 3 81


535 PR00683 SPECTRIN PLECKSTRIN PR00683D 15.87 2.452e-10
465-


HOMOLOGY DOMAIN SIGNATURE484


536 BL00027 'Homeobox' domain proteins.BL00027 26.43 6.684e-24
164-207


538 PR00239 MOLLUSCAN RHODOPSIN C- PR00239E 1.58 2.739e-09
225-


TERMINAL TAIL SIGNATURE 237


539 BL00406 Actins proteins. BL00406C 6.75 1.000e-40
157-


212 BL00406B 5.47 6.143e-37


90-145 BL00406D 12.58
4.600e-


36 291-346 BL00406E
8.44


2.200e-33 364-414 BL00406A


9.95 4.441e-23 7-42


540 PR00456 RIBOSOMAL PROTEIN P2 SIGNATURE PR00456E 3.06 9.625e-10 44-59


541 PR00456 RIBOSOMAL PROTEIN P2 SIGNATURE PR00456E 3.06 9.625e-10 44-59


542 PF00023 Ank repeat proteins. PF00023A 16.03 7.857e-11 138-154


544 PF00642 Zinc finger C-x8-C-x5-C-x3-H type (and similar). PF00642 11.59 9.082e-10 838-849


546 BL00383 Tyrosine specific protein phosphatases proteins. BL00383E 10.35 4.115e-10 104-115


547 BL01226 Hydroxymethylglutaryl-coenzyme A synthase proteins. BL01226A 13.79 1.000e-40 50-89
BL01226C 13.51 1.000e-40 127-167
BL01226D 11.60 1.000e-40 174-210
BL01226E 13.74 1.000e-40 212-253
BL01226H 17.74 1.000e-40 386-434
BL01226I 25.06 1.000e-40 460-508
BL01226G 15.76 3.483e-32 292-321
BL01226B 13.35 1.818e-31 95-127
BL01226F 9.78 8.714e-23 253-271


549 BL00964 Syndecans proteins. BL00964B 12.05 2.426e-10 1246-1289


551 DM01930 2 kw FINGER SMCX SMCY YDR096W. DM01930E 15.41 1.367e-37 170-215
DM01930F 14.16 8.232e-28 267-303
DM01930B 19.86 9.163e-10 37-71


552 BL00195 Glutaredoxin proteins. BL00195B 15.31 7.158e-09 9-29


554 BL00383 Tyrosine specific protein phosphatases proteins. BL00383E 10.35 2.756e-12 436-447


555 PR00403 WW DOMAIN SIGNATURE PR00403B 12.19 7.612e-11 122-137
PR00403A 16.82 3.912e-10 107-121
PR00403B 12.19 2.068e-09 76-91


558 PR00380 KINESIN HEAVY CHAIN SIGNATURE PR00380A 14.18 2.714e-26 76-98
PR00380D 9.93 3.000e-24 275-297
PR00380C 13.18 5.154e-20 226-245
PR00380B 12.64 9.400e-20 195-213


559 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 5.333e-09 522-531


561 PD01795 PROTEIN AMINOPEPTIDASE PRECURSOR HYDROLASE SIGNA. PD01795B 11.56 2.333e-12 159-172
PD01795A 10.27 1.000e-09 135-144


562 PD01795 PROTEIN AMINOPEPTIDASE PRECURSOR HYDROLASE SIGNA. PD01795B 11.56 2.333e-12 110-123
PD01795A 10.27 1.000e-09 86-95


563 BL00018 EF-hand calcium-binding domain proteins. BL00018 7.41 1.391e-09 41-54


565 BL00348 p53 tumor antigen proteins. BL00348F 23.19 4.143e-09 188-231


567 PD00301 PROTEIN REPEAT MUSCLE CALCIUM-BI. PD00301B 5.49 4.115e-09 284-295


569 PF00850 Histone deacetylase family. PF00850E 8.88 6.553e-21 756-782
PF00850D 14.76 1.519e-16 722-746
PF00850F 15.70 1.118e-11 794-827
PF00850G 22.75 8.375e-11 833-875


570 PD00289 PROTEIN SH3 DOMAIN REPEAT PRESYNA. PD00289 9.97 4.960e-10 137-151


571 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 8.800e-11 44-53


573 BL00299 Ubiquitin domain proteins. BL00299 28.84 1.123e-11 123-175


574 PF01140 Matrix protein (MA), p15. PF01140D 15.54 3.700e-10 986-1021


576 BL00284 Serpins proteins. BL00284C 28.56 5.200e-26 200-242
BL00284A 15.64 4.913e-18 71-95
BL00284B 17.99 7.261e-15 173-194
BL00284D 16.34 5.846e-13 306-333
BL00284E 19.15 7.429e-12 387-412


579 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 6.553e-29 15-54


580 BL50001 Src homology 2 (SH2) domain proteins profile. BL50001B 17.40 4.500e-12 1010-1031


581 PD00930 PROTEIN GTPASE DOMAIN ACTIVATION. PD00930B 33.72 3.189e-22 608-649
PD00930A 25.62 6.806e-17 505-531


584 BL00612 Osteonectin domain proteins. BL00612B 11.35 2.034e-11 93-126


585 DM01551 kw OSTEOINDUCTIVE YOPM MEMBRANE OUTER. DM01551C 14.62 8.859e-10 102-122


586 PF00628 PHD-finger. PF00628 15.84 3.455e-12 235-250


587 BL00027 'Homeobox' domain proteins. BL00027 26.43 6.063e-10 85-128


588 PR00326 GTP1/OBG GTP-BINDING PROTEIN FAMILY SIGNATURE PR00326A 8.75 7.525e-16 227-248
PR00326C 9.79 6.760e-15 276-292
PR00326D 19.09 6.657e-13 293-312
PR00326B 16.74 9.229e-13 248-267


589 BL00422 Granins proteins. BL00422A 28.34 7.429e-09 2349-2378


590 BL00415 Synapsins proteins. BL00415N 4.29 9.794e-10 295-339


591 BL00128 Alpha-lactalbumin / lysozyme C proteins. BL00128A 20.76 3.423e-13 35-65
BL00128C 19.34 2.980e-11 110-132


596 PR00049 WILM'S TUMOUR PROTEIN SIGNATURE PR00049D 0.00 3.136e-09 31-46


597 DM00547 1 kw CHROMO BROMODOMAIN SHADOW GLOBAL. DM00547C 17.30 1.667e-19 207-229
DM00547E 13.94 6.200e-18 319-342
DM00547B 11.28 1.000e-17 179-193
DM00547D 11.60 9.250e-13 289-303
DM00547F 23.43 6.727e-12 679-726
DM00547A 12.38 4.818e-11 158-170


600 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 1.882e-27 13-52


601 BL00192 Cytochrome b/b6 heme-ligand proteins. BL00192A 11.90 6.400e-09 390-430


602 BL00936 Ribosomal protein L35 proteins. BL00936B 27.27 8.615e-09 118-157


603 BL00936 Ribosomal protein L35 proteins. BL00936B 27.27 8.615e-09 118-157


606 PR00019 LEUCINE-RICH REPEAT SIGNATURE PR00019B 11.36 7.300e-10 292-306
PR00019A 11.19 5.667e-09 323-337


607 PR00019 LEUCINE-RICH REPEAT SIGNATURE PR00019B 11.36 7.300e-10 292-306
PR00019A 11.19 5.667e-09 323-337


608 PR00320 G-PROTEIN BETA WD-40 REPEAT SIGNATURE PR00320C 13.01 9.500e-12 168-183
PR00320A 16.74 2.853e-10 60-75
PR00320A 16.74 4.706e-10 14-29
PR00320C 13.01 5.320e-10 60-75
PR00320C 13.01 5.680e-10 14-29
PR00320A 16.74 6.049e-09 217-232
PR00320B 12.19 8.875e-09 168-183


610 BL00750 Chaperonins TCP-1 proteins. BL00750B 16.17 1.000e-40 70-120
BL00750A 20.07 6.211e-37 26-69
BL00750G 20.12 8.800e-31 431-471
BL00750F 18.40 5.125e-30 370-411
BL00750E 24.59 8.650e-29 295-332
BL00750H 21.44 1.000e-27 489-524
BL00750C 25.65 5.345e-17 149-181
BL00750D 16.16 6.318e-14 203-222


613 BL00766 Tetrahydrofolate dehydrogenase/cyclohydrolase proteins. BL00766B 24.49 1.000e-40 142-190
BL00766E 13.78 1.000e-40 322-359
BL00766C 25.86 5.500e-39 208-256
BL00766D 17.05 4.536e-26 283-313
BL00766A 21.48 6.063e-24 102-132


615 BL00256 Adipokinetic hormone family proteins. BL00256 12.28 3.298e-10 746-755


616 BL00319 Amyloidogenic glycoprotein extracellular domain proteins. BL00319C 17.12 9.053e-09 419-453


617 BL00030 Eukaryotic RNA-binding region RNP-1 proteins. BL00030A 14.39 4.429e-09 44-63


618 BL00030 Eukaryotic RNA-binding region RNP-1 proteins. BL00030A 14.39 4.429e-09 44-63


620 BL00325 Actin-depolymerizing proteins. BL00325B 21.66 5.817e-16 77-123


622 BL00972 Ubiquitin carboxyl-terminal hydrolases family 2 proteins. BL00972A 11.93 5.500e-19 213-231
BL00972D 22.55 2.742e-16 501-526
BL00972B 9.45 1.000e-11 297-307
BL00972C 16.48 3.160e-11 370-385
BL00972E 20.72 7.517e-10 526-548


625 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 6.333e-39 6-45


628 BL00039 DEAD-box subfamily ATP-dependent helicases proteins. BL00039D 21.67 7.750e-31 478-524
BL00039A 18.44 2.000e-25 198-237
BL00039C 15.63 1.844e-15 327-351
BL00039B 19.19 5.636e-14 242-268


630 PD00306 PROTEIN GLYCOPROTEIN PRECURSOR RE. PD00306A 10.26 7.000e-12 232-246


631 PD00306 PROTEIN GLYCOPROTEIN PRECURSOR RE. PD00306A 10.26 7.000e-12 290-304


633 BL00785 5'-nucleotidase proteins. BL00785C 9.45 3.625e-16 108-122
BL00785E 15.85 4.000e-16 279-295
BL00785A 9.73 6.500e-14 29-40
BL00785B 10.65 5.500e-13 72-86
BL00785D 9.89 4.000e-12 135-145


636 PR00832 PAXILLIN SIGNATURE PR00832E 14.43 9.901e-14 85-108


637 PR00109 TYROSINE KINASE CATALYTIC DOMAIN SIGNATURE PR00109B 12.27 6.362e-13 221-240


638 PF00635 MSP (Major sperm protein) domain proteins. PF00635B 15.84 4.900e-11 463-502


639 PR00860 VERTEBRATE METALLOTHIONEIN SIGNATURE PR00860B 7.04 1.900e-18 85-99
PR00860C 9.61 1.474e-14 99-109
PR00860A 5.46 1.720e-14 63-76


641 PD00066 PROTEIN ZINC-FINGER METAL-BINDI. PD00066 13.92 4.462e-15 271-284
PD00066 13.92 4.462e-15 299-312
PD00066 13.92 2.800e-14 327-340
PD00066 13.92 2.800e-14 383-396
PD00066 13.92 2.800e-14 411-424
PD00066 13.92 7.000e-14 355-368
PD00066 13.92 8.800e-14 439-452
PD00066 13.92 8.800e-14 495-508
PD00066 13.92 1.500e-13 551-564
PD00066 13.92 7.000e-13 467-480
PD00066 13.92 7.000e-13 523-536
PD00066 13.92 9.500e-13 215-228
PD00066 13.92 9.500e-13 243-256
PD00066 13.92 9.500e-13 579-592
PD00066 13.92 8.615e-10 607-620
PD00066 13.92 1.600e-09 187-200


642 BL00961 Ribosomal protein S28e proteins. BL00961B 11.24 7.429e-37 67-100
BL00961A 9.90 4.079e-26 42-66


643 BL00585 Ribosomal protein S5 proteins. BL00585A 28.43 1.391e-40 103-155
BL00585B 18.78 3.250e-30 193-230


647 BL00678 Trp-Asp (WD) repeat proteins proteins. BL00678 9.67 9.400e-10 181-192


648 PR00876 NEMATODE METALLOTHIONEIN SIGNATURE PR00876C 6.15 9.229e-09 112-126


652 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 5.941e-27 29-68


653 BL00047 Histone H4 proteins. BL00047A 13.53 1.000e-40 2-41
BL00047B 6.51 1.429e-40 41-74
BL00047C 12.18 1.310e-38 74-104


654 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 4.109e-25 30-69


655 BL01115 GTP-binding nuclear protein ran proteins. BL01115A 10.22 3.483e-17 19-63


657 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 8.286e-10 31-40


658 BL00125 Serine/threonine specific protein phosphatases proteins. BL00125B 21.48 1.000e-40 89-135
BL00125C 19.97 1.000e-40 153-200
BL00125D 33.11 1.000e-40 213-268
BL00125A 14.83 8.941e-38 47-84


659 PD00066 PROTEIN ZINC-FINGER METAL-BINDI. PD00066 13.92 8.200e-16 492-505
PD00066 13.92 9.308e-15 380-393
PD00066 13.92 6.000e-13 352-365
PD00066 13.92 7.000e-13 240-253
PD00066 13.92 7.500e-13 268-281
PD00066 13.92 7.500e-13 408-421
PD00066 13.92 2.174e-11 464-477
PD00066 13.92 1.000e-10 436-449


660 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 2.189e-26 29-68


661 BL00795 Involucrin proteins. BL00795C 17.06 7.882e-15 193-238
BL00795C 17.06 3.797e-13 187-232
BL00795C 17.06 5.014e-13 188-233
BL00795C 17.06 4.506e-12 196-241
BL00795C 17.06 7.896e-12 191-236
BL00795C 17.06 1.667e-11 185-230
BL00795C 17.06 2.000e-11 198-243
BL00795C 17.06 3.778e-11 171-216
BL00795C 17.06 6.111e-11 197-242
BL00795C 17.06 6.444e-11 194-239
BL00795C 17.06 8.000e-11 189-234
BL00795C 17.06 8.556e-11 192-237
BL00795C 17.06 1.733e-10 195-240
BL00795C 17.06 2.779e-10 184-229
BL00795C 17.06 4.035e-10 199-244
BL00795C 17.06 5.081e-10 186-231
BL00795C 17.06 6.965e-10 190-235
BL00795C 17.06 2.700e-09 200-245
BL00795C 17.06 5.800e-09 175-220
BL00795C 17.06 6.500e-09 182-227
BL00795C 17.06 6.600e-09 201-246
BL00795C 17.06 6.600e-09 202-247
BL00795C 17.06 6.600e-09 208-253


662 BL00469 Nucleoside diphosphate kinases proteins. BL00469 22.22 1.000e-40 149-204


663 BL01160 Kinesin light chain repeat proteins. BL01160B 19.54 9.411e-11 331-385


664 BL00601 Tryptophan pentad repeat proteins (IRF family) proteins. BL00601A 20.29 5.500e-23 7-46
BL00601B 20.92 3.631e-13 69-98


665 BL00082 Extradiol ring-cleavage dioxygenases proteins. BL00082A 19.07 8.615e-12 49-72


666 DM01537 kw SKI2W SKI2 NUCLEOLAR HELICASE. DM01537B 21.63 4.073e-37 834-881
DM01537B 21.63 9.750e-21 1669-1716
DM01537A 15.14 8.650e-18 698-718
DM01537A 15.14 6.766e-12 1537-1557


667 DM01537 kw SKI2W SKI2 NUCLEOLAR HELICASE. DM01537B 21.63 7.923e-38 820-867
DM01537B 21.63 9.750e-21 1655-1702
DM01537A 15.14 8.650e-18 684-704
DM01537A 15.14 6.766e-12 1523-1543


669 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 6.786e-24 849-880
BL00107B 13.31 6.727e-13 916-932


670 BL00299 Ubiquitin domain proteins. BL00299 28.84 9.735e-27 37-89


671 BL00027 'Homeobox' domain proteins. BL00027 26.43 6.571e-12 432-475


676 PR00861 ALPHA-LYTIC ENDOPEPTIDASE SERINE PROTEASE (S2A) SIGNATURE PR00861E 9.88 2.385e-09 206-221


678 BL00225 Crystallins beta and gamma 'Greek key' motif proteins. BL00225B 18.06 7.517e-24 1805-1840
BL00225B 18.06 8.297e-20 1987-2022
BL00225B 18.06 2.575e-19 1896-1931
BL00225B 18.06 8.200e-19 175-210
BL00225B 18.06 8.200e-19 1698-1733
BL00225B 18.06 4.808e-14 73-108
BL00225B 18.06 4.808e-14 1596-1631
BL00225B 18.06 5.500e-14 2077-2112
BL00225A 13.82 5.829e-12 2043-2064
BL00225A 13.82 3.127e-09 1759-1780


679 PR00320 G-PROTEIN BETA WD-40 REPEAT SIGNATURE PR00320C 13.01 4.240e-10 169-184
PR00320A 16.74 6.294e-10 169-184


680 BL00243 Integrins beta chain cysteine-richBL0024313I.77 I.143e-I
domain 1 172-


proteins. 215


681 PR00852 XERODERMA PIGMENTOSUM GROUP D PROTEIN SIGNATURE PR00852H 5.90 1.000e-29 612-635
PR00852E 8.14 3.769e-27 348-371
PR00852D 11.38 8.875e-27 309-331
PR00852B 11.08 2.800e-25 249-269
PR00852I 17.26 3.500e-25 683-704
PR00852F 11.85 5.909e-24 379-398
PR00852G 16.19 4.462e-23 468-486
PR00852C 8.81 9.143e-23 284-303


682 BL50058 G-protein gamma subunit profile. BL50058 27.23 1.375e-35 15-63


685 BL00972 Ubiquitin carboxyl-terminal hydrolases family 2 proteins. BL00972A 11.93 7.500e-20 40-58
BL00972D 22.55 3.903e-16 300-325
BL00972B 9.45 1.000e-13 120-130
BL00972E 20.72 5.500e-11 325-347


687 BL00237 G-protein coupled receptors proteins. BL00237A 27.68 4.273e-14 98-138


688 BL00388 Proteasome A-type subunits proteins. BL00388A 23.14 1.000e-40 8-54
BL00388B 31.38 3.864e-33 66-108
BL00388D 20.71 1.000e-21 153-184
BL00388C 18.79 8.147e-16 126-148


689 PD02796 PROTEIN STEROL CARRIER LIPID-TRAN. PD02796B 20.92 1.105e-15 347-394


691 PD01572 PHOTOSYSTEM II REACTION CENTRE T PROTEIN PHOTOS. PD01572 8.77 4.083e-09 1 ~ 1


692 BL00028 Zinc finger, C2H2 type, domain proteins. BL00028 16.07 7.600e-10 488-505


694 BL01013 Oxysterol-binding protein family proteins. BL01013A 25.14 9.357e-33 527-563
BL01013D 26.81 8.235e-23 814-858
BL01013C 9.97 6.211e-14 615-625
BL01013B 11.33 3.605e-13 592-603


695 PD00289 PROTEIN SH3 DOMAIN REPEAT PRESYNA. PD00289 9.97 3.571e-13 164-178
PD00289 9.97 8.650e-11 2147-2161
PD00289 9.97 2.552e-09 23-37


698 PR00161 NICKEL-DEPENDENT HYDROGENASE B-TYPE CYTOCHROME SIGNATURE PR00161C 9.51 4.930e-09 282-302


700 PR00749 LYSOZYME G SIGNATURE PR00749F 13.63 8.636e-13 139-156
PR00749H 8.22 3.681e-12 173-194
PR00749B 16.54 1.419e-11 48-70
PR00749C 7.26 3.060e-11 72-91
PR00749A 10.33 4.815e-10 24-45


703 PR00704 CALPAIN CYSTEINE PROTEASE (C2) FAMILY SIGNATURE PR00704I 9.52 1.000e-29 476-505
PR00704D 11.05 2.500e-27 132-158
PR00704E 12.55 5.500e-27 162-186
PR00704F 13.61 1.000e-22 187-215
PR00704G 13.87 1.237e-21 317-339
PR00704H 13.38 8.138e-21 367-385
PR00704A 14.68 2.125e-19 27-51
PR00704C 11.88 1.257e-17 96-113
PR00704B 17.94 1.833e-15 72-95


705 PR00859 PROKARYOTE METALLOTHIONEIN SIGNATURE PR00859C 7.06 2.776e-09 94-111


706 BL00226 Intermediate filaments proteins. BL00226D 19.10 9.581e-26 369-416
BL00226B 23.86 3.250e-24 203-251
BL00226C 13.23 8.269e-21 268-299
BL00226A 12.77 8.200e-14 103-118


707 PR00021 SMALL PROLINE-RICH PROTEIN SIGNATURE PR00021A 4.31 2.440e-10 2-15


708 BL00361 Ribosomal protein S10 proteins. BL00361B 18.34 5.101e-10 82-105


709 PR00021 SMALL PROLINE-RICH PROTEIN SIGNATURE PR00021A 4.31 2.200e-10 2-15


710 BL00514 Fibrinogen beta and gamma chains C-terminal domain proteins. BL00514C 17.41 8.412e-27 160-197
BL00514E 14.28 8.909e-16 219-236
BL00514H 14.95 1.551e-15 317-342
BL00514G 15.98 7.750e-15 284-314
BL00514D 15.35 4.789e-10 201-214


711 PD00930 PROTEIN GTPASE DOMAIN ACTIVATION. PD00930B 33.72 8.714e-12 49-90


714 BL00400 LBP / BPI / CETP family proteins. BL00400C 24.53 6.029e-17 158-202
BL00400D 23.26 2.080e-14 222-259
BL00400A 21.59 1.600e-10 27-59


715 BL01154 RNA polymerases L / 13 to 16 Kd subunits proteins. BL01154B 24.55 5.500e-36 40-76
BL01154A 18.70 3.000e-22 19-40


716 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 9.786e-32 10-49


717 BL00215 Mitochondrial energy transfer proteins. BL00215A 15.82 9.206e-14 77-102
BL00215A 15.82 8.412e-10 175-200


719 BL00309 Vertebrate galactoside-binding lectin proteins. BL00309C 18.65 2.241e-09 62-87


726 BL00687 Aldehyde dehydrogenases glutamic acid proteins. BL00687E 25.37 7.136e-33 266-316
BL00687D 26.00 5.333e-28 151-198
BL00687B 17.54 3.647e-26 39-81
BL00687C 24.13 6.087e-22 96-133
BL00687F 9.55 2.500e-11 352-363


727 DM01354 kw TRANSCRIPTASE REVERSE II ORF2. DM01354N 13.17 1.000e-40 129-174
DM01354O 8.73 6.605e-15 180-226


734 PD00301 PROTEIN REPEAT MUSCLE CALCIUM-BI. PD00301A 10.24 6.400e-09 101-112


735 BL01024 Protein phosphatase 2A regulatory subunit PR55 proteins. BL01024A 10.26 1.000e-40 22-69
BL01024B 8.91 1.000e-40 86-127
BL01024C 7.80 1.000e-40 146-185
BL01024D 13.22 1.000e-40 185-222
BL01024E 11.96 1.000e-40 222-266
BL01024F 9.42 1.000e-40 266-317
BL01024G 11.09 1.000e-40 317-349
BL01024H 13.88 1.000e-40 389-442


736 PF00913 Trypanosome variant surface glycoprotein. PF00913D 11.90 7.130e-10 24-51


737 PR00700 PROTEIN TYROSINE PHOSPHATASE SIGNATURE PR00700D 12.47 2.200e-09 82-101


740 PR00320 G-PROTEIN BETA WD-40 REPEAT SIGNATURE PR00320C 13.01 1.600e-09 68-83
PR00320A 16.74 7.366e-09 68-83


743 PR00871 DNA NUCLEOTIDYLEXOTRANSFERASE (TDT) SIGNATURE PR00871G 14.48 8.000e-09 178-201


745 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 2.286e-10 33-42


749 BL00215 Mitochondrial energy transfer proteins. BL00215A 15.82 5.200e-15 221-246
BL00215A 15.82 7.618e-14 20-45
BL00215A 15.82 8.851e-11 123-148
BL00215B 10.44 9.526e-11 69-82
BL00215B 10.44 7.300e-09 272-285
BL00215B 10.44 8.500e-09 165-178


751 BL50002 Src homology 3 (SH3) domain proteins profile. BL50002A 14.19 1.000e-14 370-389
BL50002B 15.18 2.200e-10 408-422


752 BL00353 HMG1/2 proteins. BL00353B 11.47 3.089e-12 390-440


753 PF00622 Domain in SPla and the RYanodine Receptor. PF00622B 21.00 4.214e-14 47-69


754 BL00211 ABC transporters family proteins. BL00211A 12.23 8.941e-10 66-78


755 PR00926 MITOCHONDRIAL CARRIER PROTEIN SIGNATURE PR00926F 17.75 7.750e-19 392-415
PR00926C 16.07 5.935e-17 253-274
PR00926D 10.53 2.059e-15 301-320
PR00926E 11.70 4.971e-15 344-363
PR00926B 16.07 9.526e-13 210-225
PR00926A 10.41 1.514e-12 197-211


756 BL01187 Calcium-binding EGF-Like domain proteins pattern proteins. BL01187A 9.98 2.125e-12 324-336
BL01187A 9.98 4.789e-11 377-389
BL01187B 12.04 3.057e-10 439-455


757 PF00651 BTB (also known as BR-C/Ttk) domain proteins. PF00651 15.00 4.429e-10 43-56


758 PR00055 HIV TAT DOMAIN SIGNATURE PR00055A 8.13 8.855e-09 144-156


759 PD00066 PROTEIN ZINC-FINGER METAL-BINDI. PD00066 13.92 5.304e-11 110-123


760 PR00448 NSF ATTACHMENT PROTEIN SIGNATURE PR00448D 12.42 3.455e-27 162-186
PR00448A 10.74 1.273e-22 37-57
PR00448B 16.01 9.379e-21 100-118
PR00448C 11.46 1.000e-20 129-147


765 BL01042 Homoserine dehydrogenase proteins. BL01042A 13.29 5.909e-11 74-95


766 PR00625 DNAJ PROTEIN FAMILY SIGNATURE PR00625A 12.84 2.154e-18 26-46
PR00625B 13.48 9.000e-16 57-78


768 BL00762 WHEP-TRS domain proteins. BL00762A 23.43 8.500e-28 112-149
BL00762B 16.14 3.793e-12 64-78
BL00762A 23.43 6.625e-12 6-43
BL00762C 15.58 4.176e-09 459-472
BL00762D 11.15 9.667e-09 210-220


769 PR00709 AVIDIN SIGNATURE PR00709A 4.60 1.934e-09 1-20


770 PR00320 G-PROTEIN BETA WD-40 REPEAT SIGNATURE PR00320C 13.01 1.720e-10 262-277
PR00320A 16.74 2.853e-10 262-277
PR00320C 13.01 4.300e-09 96-111
PR00320B 12.19 5.500e-09 262-277
PR00320A 16.74 6.268e-09 55-70


771 PR00019 LEUCINE-RICH REPEAT SIGNATURE PR00019B 11.36 8.714e-12 87-101
PR00019A 11.19 1.000e-10 90-104


772 PD02807 APOLIPOPROTEIN E PRECURSOR APO-E GLYCOPROTEIN PLAS. PD02807C 8.91 6.308e-10 110-159


773 PD02807 APOLIPOPROTEIN E PRECURSOR APO-E GLYCOPROTEIN PLAS. PD02807C 8.91 6.308e-10 155-204


774 DM00547 1 kw CHROMO BROMODOMAIN SHADOW GLOBAL. DM00547F 23.43 3.942e-28 943-990
DM00547E 13.94 9.750e-21 652-675
DM00547B 11.28 1.818e-18 518-532
DM00547C 17.30 3.531e-17 546-568
DM00547A 12.38 1.273e-11 497-509
DM00547D 11.60 9.200e-11 622-636


776 PR00779 INOSITOL 1,4,5-TRISPHOSPHATE-BINDING PROTEIN RECEPTOR SIGNATURE PR00779F 14.51 5.147e-09 769-792


777 PR00779 INOSITOL 1,4,5-TRISPHOSPHATE-BINDING PROTEIN RECEPTOR SIGNATURE PR00779F 14.51 5.147e-09 742-765


778 PR00779 INOSITOL 1,4,5-TRISPHOSPHATE-BINDING PROTEIN RECEPTOR SIGNATURE PR00779F 14.51 5.147e-09 742-765




779 BL01282 BIR repeat proteins. BL01282B 30.49 2.543e-09 6-45


781 PR00205 CADHERIN SIGNATURE PR00205B 11.39 3.118e-11 654-672
PR00205B 11.39 8.588e-11 230-248
PR00205B 11.39 8.527e-10 551-569
PR00205B 11.39 4.203e-09 336-354


783 BL00625 Regulator of chromosome condensation (RCC1) proteins. BL00625B 17.69 2.167e-19 193-227
BL00625A 16.21 5.500e-17 199-228
BL00625B 17.69 1.885e-16 140-174
BL00625B 17.69 2.770e-16 245-279
BL00625A 16.21 9.115e-16 251-280
BL00625A 16.21 6.507e-14 146-175


785 PF00084 Sushi domain proteins (SCR repeat proteins. PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668


786 PF00084 Sushi domain proteins (SCR repeat proteins. PF00084B 9.45 7.188e-10 595-607
PF00084B 9.45 6.400e-09 656-668


787 BL00826 MARCKS family proteins. BL00826C 7.63 6.738e-09 203-230


788 PR00453 VON WILLEBRAND FACTOR TYPE A DOMAIN SIGNATURE PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90


789 PR00102 ORNITHINE CARBAMOYLTRANSFERASE SIGNATURE PR00102B 14.82 5.418e-09 963-977


790 BL00030 Eukaryotic RNA-binding region RNP-1 proteins. BL00030B 7.03 5.500e-11 199-209


791 BL00415 Synapsins proteins. BL00415N 4.29 9.519e-10 393-437
BL00415N 4.29 2.117e-09 103-147
BL00415N 4.29 3.628e-09 97-141
BL00415N 4.29 5.664e-09 387-431


795 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 2.091e-36 105-144


799 PF00731 AIR carboxylase. PF00731C 23.16 7.333e-35 337-380
PF00731B 19.47 7.429e-28 299-336
PF00731A 19.32 6.333e-24 268-297


804 BL00170 Cyclophilin-type peptidyl-prolyl cis-trans isomerase signatur. BL00170B 20.97 8.071e-09 297-337


805 BL00678 Trp-Asp (WD) repeat proteins proteins. BL00678 9.67 3.400e-10 378-389
BL00678 9.67 5.800e-10 418-429
BL00678 9.67 8.800e-10 295-306


806 PD01719 PRECURSOR GLYCOPROTEIN SIGNAL RE. PD01719A 12.89 7.571e-14 290-318


807 PR00320 G-PROTEIN BETA WD-40 REPEAT SIGNATURE PR00320B 12.19 9.100e-09 451-466


809 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 4.462e-12 564-595


810 PR00453 VON WILLEBRAND FACTOR TYPE A DOMAIN SIGNATURE PR00453A 12.79 1.310e-14 36-54
PR00453B 14.65 8.568e-10 75-90


814 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 2.047e-31 16-55


815 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 2.047e-31 16-55


817 PR00193 MYOSIN HEAVY CHAIN SIGNATURE PR00193D 14.36 5.154e-36 125-154
PR00193E 19.47 3.919e-18 179-208


818 PR00830 ENDOPEPTIDASE LA (LON) SERINE PROTEASE (S16) SIGNATURE PR00830A 8.41 9.571e-11 115-135


819 BL00126 3'5'-cyclic nucleotide phosphodiesterases proteins. BL00126C 22.07 7.857e-24 528-569
BL00126E 35.22 3.714e-15 669-724
BL00126D 25.50 1.173e-14 584-623
BL00126B 15.20 1.000e-12 502-514
BL00126A 27.56 3.361e-09 461-498


820 PR00511 TEKTIN SIGNATURE PR00511B 12.25 8.826e-22 174-195
PR00511A 13.59 7.723e-11 155-172


821 BL00741 Guanine-nucleotide dissociation stimulators CDC24 family sign. BL00741B 14.27 2.800e-15 13-36


822 PF00780 Domain found in NIK1-like kinases, mouse citron and yeast ROM. PF00780I 14.69 4.825e-09 231-261


827 BL00030 Eukaryotic RNA-binding region RNP-1 proteins. BL00030A 14.39 5.235e-11 144-163


828 BL00326 Tropomyosins proteins. BL00326D 8.76 9.357e-11 545-586


829 PD02448 TRANSCRIPTION PROTEIN DNA-BINDIN. PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 235-261
PD02448F 14.22 9.654e-25 279-303
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 305-318


830 BL00720 Guanine-nucleotide dissociation stimulators CDC25 family sign. BL00720B 16.57 4.500e-23 483-507


831 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 6.625e-21 143-174
BL00107B 13.31 4.214e-10 213-229


832 BL00215 Mitochondrial energy transfer proteins. BL00215A 15.82 5.787e-11 32-57


833 PR00497 NEUTROPHIL CYTOSOL FACTOR P40 SIGNATURE PR00497A 6.92 4.375e-09 41-59


834 BL00229 Tau and MAP proteins tubulin-binding domain proteins. BL00229A 23.57 9.565e-10 99-138


835 BL00421 Transmembrane 4 family proteins. BL00421E 20.97 2.216e-09 1053-1083


836 BL00795 Involucrin proteins. BL00795B 12.41 7.931e-09 405-445


837 PR00020 MAM DOMAIN SIGNATURE PR00020A 18.17 1.000e-17 34-53
PR00020B 15.52 5.846e-16 68-85
PR00020D 12.70 2.543e-15 147-162
PR00020C 13.66 3.483e-13 95-107
PR00020E 8.64 6.586e-13 165-179


838 BL50017 Death domain proteins profile. BL50017B 17.60 6.897e-13 1499-1515


839 PF00850 Histone deacetylase family. PF00850C 14.55 9.542e-09 1352-1369


840 PF00023 Ank repeat proteins. PF00023A 16.03 4.500e-12 44-60
PF00023B 14.20 7.923e-11 73-83
PF00023B 14.20 9.000e-10 139-149
PF00023B 14.20 5.500e-09 40-50


842 BL01194 Ribosomal protein L15e proteins. BL01194B 13.66 1.000e-40 37-85
BL01194C 12.35 9.250e-40 103-138
BL01194A 18.70 7.632e-38 2-37
BL01194D 19.02 2.658e-36 139-178


843 BL00610 Sodium:neurotransmitter symporter family proteins. BL00610A 17.73 1.000e-40 40-90
BL00610B 23.65 1.000e-40 104-154
BL00610C 12.94 1.000e-40 206-258
BL00610E 20.34 1.000e-40 355-398
BL00610F 29.02 1.000e-40 454-509
BL00610D 20.97 6.063e-35 272-325
BL00610G 12.89 8.588e-13 514-537


845 BL00143 Insulinase family, zinc-binding region proteins. BL00143A 20.91 4.300e-20 94-121
BL00143C 14.16 5.500e-13 245-258
BL00143B 14.41 9.053e-10 141-156


846 PR00543 OESTROGEN RECEPTOR SIGNATURE PR00543D 10.87 1.355e-09 898-914


847 PR00543 OESTROGEN RECEPTOR SIGNATURE PR00543D 10.87 1.355e-09 898-914


848 BL00824 Elongation factor 1 beta/beta'/delta chain proteins. BL00824C 14.58 1.000e-40 129-167
BL00824D 14.04 6.192e-39 167-202
BL00824B 9.21 2.080e-21 96-116
BL00824E 12.49 3.333e-19 210-226
BL00824A 13.78 8.650e-14 19-34


849 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 1.000e-40 12-51


850 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 7.316e-24 10-49


852 BL01272 Glucokinase regulatory protein family proteins. BL01272B 19.61 6.870e-30 136-171
BL01272C 11.68 3.314e-25 249-274
BL01272A 6.49 1.231e-18 99-117


853 PD00930 PROTEIN GTPASE DOMAIN ACTIVATION. PD00930B 33.72 9.341e-20 65-106


854 PD00289 PROTEIN SH3 DOMAIN REPEAT PRESYNA. PD00289 9.97 6.850e-11 140-154


858 PR00450 RECOVERIN FAMILY SIGNATURE PR00450C 12.22 3.250e-25 68-90
PR00450B 11.76 8.125e-23 22-42
PR00450D 16.58 8.920e-22 92-112
PR00450E 12.14 1.581e-19 114-133
PR00450G 15.33 5.500e-19 166-187
PR00450F 12.30 4.375e-15 140-156
PR00450A 13.58 1.857e-14 8-23


860 BL00027 'Homeobox' domain proteins. BL00027 26.43 7.188e-27 74-117


866 BL00477 Alpha-2-macroglobulin family thiolester region proteins. BL00477L 23.51 7.480e-20 54-87


867 BL01078 Molybdenum cofactor biosynthesis proteins. BL01078B 14.20 1.621e-20 408-429
BL01078A 10.16 2.000e-13 366-379
BL01078D 5.99 3.455e-11 566-576
BL01078C 10.52 3.793e-11 501-513


868 BL01177 Anaphylatoxin domain proteins. BL01177E 20.64 5.800e-24 462-489
BL01177C 17.39 5.333e-19 416-435
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 441-459


869 BL01177 Anaphylatoxin domain proteins. BL01177E 20.64 5.800e-24 415-442
BL01177C 17.39 5.333e-19 369-388
BL01177B 13.61 7.840e-16 122-138
BL01177D 17.50 1.900e-15 394-412


871 BL50007 Phosphatidylinositol-specific phospholipase X-box domain proteins prof. BL50007A 19.61 1.000e-40 322-368
BL50007D 19.54 1.000e-40 589-631
BL50007B 20.90 6.700e-36 383-421
BL50007E 25.63 9.053e-33 748-785
BL50007C 8.97 5.200e-19 452-469


872 BL00972 Ubiquitin carboxyl-terminal hydrolases family 2 proteins. BL00972D 22.55 3.250e-17 90-115


874 PR00452 SH3 DOMAIN SIGNATURE PR00452B 11.65 4.250e-09 370-386


877 BL00741 Guanine-nucleotide dissociation stimulators CDC24 family sign. BL00741B 14.27 5.500e-13 1343-1366


878 DM00215 PROLINE-RICH PROTEIN 3. DM00215 19.43 2.525e-09 52-85


881 PD02807 APOLIPOPROTEIN E PRECURSOR APO-E GLYCOPROTEIN PLAS. PD02807E 10.90 4.702e-09 358-407


882 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 7.1 ~8e-37 8-47


885 PF00023 Ank repeat proteins. PF00023A 16.03 8.071e-09 10-26


886 PR00372 BIOPTERIN-DEPENDENT AROMATIC AMINO ACID HYDROXYLASE SIGNATURE PR00372B 10.30 9.308e-27 225-248
PR00372A 13.39 7.000e-24 134-154
PR00372E 12.62 2.125e-23 360-380
PR00372C 7.90 3.025e-22 289-309
PR00372F 13.09 6.333e-21 395-414
PR00372D 10.22 1.000e-19 329-348


887 BL00301 GTP-binding elongation factors proteins. BL00301B 20.09 2.800e-24 103-135
BL00301A 12.41 4.316e-13 21-33


888 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 1.667e-09 30-39


889 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 4.906e-26 6-45


890 DM00179 w KINASE ALPHA ADHESION T-CELL. DM00179 13.97 7.652e-09 113-123


892 BL01022 PTR2 family proton/oligopeptide symporters proteins. BL01022B 22.19 6.016e-14 72-118
BL01022E 23.51 1.173e-12 472-508
BL01022A 11.58 9.135e-12 42-61
BL01022D 9.42 3.455e-11 199-212


893 PD02407 3-BISPHOSPHOGLYCERATE-INDEPENDENT PHOSPHOGLYCER. PD02407K 12.59 6.529e-10 360-383


894 PD02407 3-BISPHOSPHOGLYCERATE-INDEPENDENT PHOSPHOGLYCER. PD02407K 12.59 6.529e-10 360-383


895 PR00237 RHODOPSIN-LIKE GPCR SUPERFAMILY SIGNATURE PR00237B 13.50 9.100e-14 116-138
PR00237F 13.57 1.360e-13 312-337
PR00237G 19.63 9.069e-13 353-380
PR00237E 13.03 7.120e-12 243-267
PR00237D 8.94 4.150e-11 194-216
PR00237A 11.48 4.375e-11 83-108


896 BL00129 Glycosyl hydrolases familyBL00129D 16.76 8.258e-26
31 proteins. 634-


678 BL00129A 26.21
1.720e-25


384-430 BL00229E 22.60
4.857e-


181


CA 02399776 2002-08-02
WO 01/57190 PCT/USO1/04098
SEQ ACCESSION DESCRIPTION RESULTS*


ID NO.


NO:


23 698-734 BL00129C
15.12


1.750e-22 596-624 BL00129B


19.19 5.891 e-18 495-522


BL00129F 26.19 7.545e-I5
814-


852


897 BL00598 Chromo domain proteins. BL00598 14.45 1.220e-13 9-31
898 BL00518 Zinc finger, C3HC4 type (RING finger), proteins. BL00518 12.23 6.000e-09 396-405
899 PD01101 INHIBITOR HEAVY CHAIN CHANNEL IN. PD01101B 21.53 1.000e-40 274-327
PD01101D 24.45 1.000e-40 457-512
PD01101A 18.25 6.268e-23 83-117
PD01101C 12.69 1.237e-16 366-386
PD01101E 6.73 7.750e-12 566-576
900 PR00600 PROTEIN PHOSPHATASE PP2A 55KD REGULATORY SUBUNIT SIGNATURE PR00600A 11.61 5.979e-09 31-52
901 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 8.116e-31 24-63
903 BL01115 GTP-binding nuclear protein ran proteins. BL01115A 10.22 1.509e-11 21-65


906 DM00215 PROLINE-RICH PROTEIN 3. DM00215 19.43 2.174e-13 539-572
DM00215 19.43 4.750e-12 549-582
DM00215 19.43 9.824e-11 551-584
DM00215 19.43 2.929e-10 548-581
DM00215 19.43 4.054e-10 550-583
DM00215 19.43 5.339e-10 552-585
DM00215 19.43 7.107e-10 544-577
907 PR00988 URIDINE KINASE SIGNATURE PR00988A 6.39 6.276e-12 314-332
908 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 5.950e-17 1125-1156
909 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 5.950e-17 1118-1149
910 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 8.560e-13 150-181
911 BL00107 Protein kinases ATP-binding region proteins. BL00107A 18.39 8.560e-13 150-181
912 PF00856 SET domain proteins. PF00856A 26.14 4.553e-11 243-280
913 PF00628 PHD-finger. PF00628 15.84 6.400e-13 197-212


914 PR00962 LETHAL(2) GIANT LARVAE PROTEIN SIGNATURE PR00962D 10.40 1.000e-27 435-459
PR00962G 15.71 4.086e-26 593-618
PR00962B 11.98 9.122e-26 296-319
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 348-369
PR00962F 12.39 9.769e-21 552-572
PR00962H 13.32 2.636e-20 623-643
PR00962I 11.68 9.786e-20 692-712
PR00962E 8.81 2.915e-18 515-534
915 PR00962 LETHAL(2) GIANT LARVAE PROTEIN SIGNATURE PR00962D 10.40 1.000e-27 365-389
PR00962G 15.71 4.086e-26 523-548
PR00962A 13.28 6.143e-22 15-34
PR00962C 8.00 4.000e-21 278-299
PR00962F 12.39 9.769e-21 482-502
PR00962H 13.32 2.636e-20 553-573
PR00962I 11.68 9.786e-20 622-642
PR00962E 8.81 2.915e-18 445-464


916 BL00134 Serine proteases, trypsin family, histidine proteins. BL00134A 11.96 5.886e-14 90-107
917 BL00478 LIM domain proteins. BL00478B 14.79 8.393e-13 211-226
BL00478B 14.79 6.712e-10 271-286
918 PR00049 WILM'S TUMOUR PROTEIN SIGNATURE PR00049D 0.00 5.729e-09 973-988
922 BL00150 Acylphosphatase proteins. BL00150 25.33 1.000e-40 37-84
924 DM00031 IMMUNOGLOBULIN V REGION. DM00031B 15.41 8.063e-09 79-113
925 BL00072 Acyl-CoA dehydrogenases proteins. BL00072D 30.08 2.837e-24 280-331
BL00072E 24.12 8.200e-24 368-411
BL00072C 25.30 7.873e-20 226-267
BL00072B 9.48 6.049e-12 183-196
927 BL00237 G-protein coupled receptors proteins. BL00237C 13.19 1.692e-13 229-256
BL00237A 27.68 6.657e-13 90-130
BL00237D 11.23 9.571e-13 290-307
928 BL01033 Globins profile. BL01033A 16.94 7.923e-18 25-47
BL01033B 13.81 1.000e-15 93-105
929 BL00216 Sugar transport proteins. BL00216B 27.64 8.714e-13 203-253


932 BL00415 Synapsins proteins. BL00415N 4.29 9.519e-10 353-397
BL00415N 4.29 2.117e-09 63-107
BL00415N 4.29 3.628e-09 57-101
BL00415N 4.29 5.664e-09 347-391
933 PD02448 TRANSCRIPTION PROTEIN DNA-BINDIN. PD02448A 9.37 1.000e-40 46-85
PD02448B 10.17 1.000e-40 85-133
PD02448C 13.62 1.000e-40 152-189
PD02448E 11.33 9.000e-30 223-249
PD02448F 14.22 9.654e-25 267-291
PD02448D 11.48 3.659e-18 197-211
PD02448G 10.73 7.857e-16 293-306
934 DM00191 w SPAC8A4.04C RESISTANCE SPAC8A4.05C DAUNORUBICIN. DM00191D 13.94 9.083e-10 136-175
935 BL01115 GTP-binding nuclear protein ran proteins. BL01115A 10.22 4.696e-10 67-111
936 BL00019 Actinin-type actin-binding domain proteins. BL00019D 15.33 8.138e-14 865-895


937 PR00762 CHLORIDE CHANNEL SIGNATURE PR00762A 14.22 4.000e-22 183-201
PR00762C 9.29 1.000e-21 268-288
PR00762E 12.07 3.250e-20 520-537
PR00762D 11.29 1.000e-19 470-491
PR00762F 15.12 1.429e-19 538-558
PR00762B 12.12 1.818e-18 214-234
PR00762G 14.13 3.455e-17 577-592
938 BL00027 'Homeobox' domain proteins. BL00027 26.43 9.500e-25 291-334
939 DM01111 4 kw PHOSPHATASE TRANSFORMING 61K PDF1. DM01111E 17.28 1.568e-10 248-297
DM01111E 17.28 5.168e-10 659-708
DM01111D 16.76 5.263e-09 279-325
DM01111M 10.67 8.674e-09 911-935
940 BL00107 Protein kinases ATP-binding region proteins. BL00107B 13.31 1.000e-14 293-309
BL00107A 18.39 6.760e-13 229-260
942 BL01160 Kinesin light chain repeat proteins. BL01160B 19.54 9.832e-11 543-597


943 PD01066 PROTEIN ZINC FINGER ZINC-FINGER METAL-BINDING NU. PD01066 19.43 3.500e-35 8-47
945 BL00989 Clathrin adaptor complexes small chain proteins. BL00989B 26.51 1.000e-40 66-117
BL00989A 11.66 1.000e-13 5-19
946 PR00178 FATTY ACID-BINDING PROTEIN SIGNATURE PR00178D 13.52 9.571e-09 450-469
947 BL00178 Aminoacyl-transfer RNA synthetases class-I proteins. BL00178B 7.11 4.857e-09 713-724
948 PF00628 PHD-finger. PF00628 15.84 8.412e-14 201-216
951 BL00216 Sugar transport proteins. BL00216B 27.64 2.050e-10 180-230
952 PR00926 MITOCHONDRIAL CARRIER PROTEIN SIGNATURE PR00926F 17.75 4.300e-11 26-49
PR00926F 17.75 6.348e-09 134-157
955 PF00109 Beta-ketoacyl synthase. PF00109 13.08 2.846e-12 342-357
957 PR00069 ALDO-KETO REDUCTASE SIGNATURE PR00069A 16.01 8.826e-24 26-51
PR00069B 11.33 1.514e-17 86-105
PR00069C 16.03 8.816e-14 155-173
958 PF00583 Acetyltransferase (GNAT) family. PF00583A 12.53 5.500e-10 631-642


961 PR00328 GTP-BINDING SAR1 PROTEIN SIGNATURE PR00328A 10.62 8.740e-10 7-31
962 BL00354 HMG-I and HMG-Y DNA-binding domain proteins (A+T-hook). BL00354A 3.83 9.438e-10 1489-1499
963 BL00354 HMG-I and HMG-Y DNA-binding domain proteins (A+T-hook). BL00354A 3.83 9.438e-10 1489-1499
964 BL00027 'Homeobox' domain proteins. BL00027 26.43 7.188e-27 53-96
965 PF00992 Troponin. PF00992A 16.67 2.421e-09 581-616
966 PR00515 5-HYDROXYTRYPTAMINE 1F RECEPTOR SIGNATURE PR00515D 7.91 5.741e-09 13-33
967 BL00579 Ribosomal protein L29 proteins. BL00579B 21.99 5.065e-21 164-194
970 BL00504 Fumarate reductase / succinate dehydrogenase FAD-binding site proteins. BL00504C 18.68 2.227e-24 34-59
BL00504D 10.43 7.261e-21 75-93
973 PF00580 UvrD/REP helicase. PF00580A 13.37 4.720e-09 249-271
974 PR00456 RIBOSOMAL PROTEIN P2 SIGNATURE PR00456F 5.86 1.000e-10 242-254
975 BL00237 G-protein coupled receptors proteins. BL00237A 27.68 4.429e-22 99-139
976 BL00031 Nuclear hormones receptors DNA-binding region proteins. BL00031A 19.55 7.158e-33 60-93
BL00031B 22.25 5.500e-28 94-126


977 PD00066 PROTEIN ZINC-FINGER METAL-BINDI. PD00066 13.92 8.200e-16 196-209
PD00066 13.92 8.200e-16 336-349
PD00066 13.92 2.385e-15 476-489
PD00066 13.92 9.308e-15 252-265
PD00066 13.92 2.800e-14 448-461
PD00066 13.92 4.600e-14 392-405
PD00066 13.92 5.200e-14 280-293
PD00066 13.92 4.000e-13 224-237
PD00066 13.92 4.429e-12 308-321
PD00066 13.92 9.571e-12 420-433
PD00066 13.92 6.870e-11 168-181


978 BL00721 Formate--tetrahydrofolate ligase proteins. BL00721B 13.21 1.000e-40 346-401
BL00721D 13.90 1.000e-40 538-592
BL00721E 13.46 1.000e-40 597-646
BL00721I 18.79 2.500e-40 814-860
BL00721H 21.20 8.239e-39 763-814
BL00721A 15.31 9.719e-32 287-321
BL00721C 16.92 4.000e-30 498-535
BL00721F 15.96 8.232e-27 660-702
BL00721G 7.97 3.017e-10 721-734
981 PD00126 PROTEIN REPEAT DOMAIN TPR NUCLEA. PD00126A 22.53 2.552e-09 180-201
982 BL00869 Renal dipeptidase proteins. BL00869C 12.58 3.172e-19 59-95
BL00869E 13.12 9.129e-18 120-157
BL00869J 15.60 6.032e-17 270-310
BL00869H 11.08 1.840e-16 219-242
BL00869G 13.55 2.543e-16 192-214
BL00869F 12.77 7.031e-14 157-192
BL00869I 12.92 3.274e-12 242-270
BL00869D 14.02 5.282e-10 95-124
BL00869B 15.55 9.382e-10 31-61
983 PR00196 ANNEXIN FAMILY SIGNATURE PR00196F 13.89 2.125e-09 92-108
984 BL00485 Adenosine and AMP deaminase proteins. BL00485D 30.82 2.427e-10 154-209


* Results include, in order: accession number subtype; raw score; p-value; position of signature in amino acid sequence.
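For illustration only, the result format described in the footnote above can be read mechanically. The following is a minimal sketch, assuming each result is a whitespace-separated group in the stated order (accession number subtype, raw score, p-value, position as start-end). The names SignatureHit and parse_hits are hypothetical and are not part of the specification.

```python
import re
from typing import List, NamedTuple


class SignatureHit(NamedTuple):
    """One result group: subtype, raw score, p-value, position (hypothetical type)."""
    subtype: str      # accession number subtype, e.g. "BL00301B"
    raw_score: float  # raw score
    p_value: float    # p-value in scientific notation
    start: int        # first residue of the signature position
    end: int          # last residue of the signature position


# subtype, score, p-value (e.g. 2.800e-24), then "start-end"
_HIT = re.compile(
    r"(\S+)\s+(-?\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?e[-+]?\d+)\s+(\d+)-(\d+)"
)


def parse_hits(results: str) -> List[SignatureHit]:
    """Return every result group found in a RESULTS cell, in order."""
    return [
        SignatureHit(m.group(1), float(m.group(2)), float(m.group(3)),
                     int(m.group(4)), int(m.group(5)))
        for m in _HIT.finditer(results)
    ]


# The RESULTS cell of SEQ ID NO: 887 as given in Table 3
hits = parse_hits("BL00301B 20.09 2.800e-24 103-135 BL00301A 12.41 4.316e-13 21-33")
```

Groups whose p-value is illegible in the source would simply not match this pattern and would need to be reviewed by hand.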
TABLE 4
SEQ ID NO: PFAM NAME DESCRIPTION p-value PFAM SCORE
2 ig Immunoglobulin domain 3.9e-17 60.3
3 HSP90 Hsp90 protein 0 1548.4
6 tsp_1 Thrombospondin type 1 domain 0.002 22.1
7 7tm_1 7 transmembrane receptor (rhodopsin family) 6.7e-08 27.3
9 PWWP PWWP domain 8.1e-16 66.0
12 C1q C1q domain 1.7e-26 101.5
13 C1q C1q domain 2e-20 81.3
14 Aa_trans Transmembrane amino acid transporter protein 2.7e-42 153.9
15 E1-E2_ATPase E1-E2 ATPase 6.3e-124 412.2
16 trypsin Trypsin 1.2e-87 278.6
17 ig Immunoglobulin domain 7.6e-12 43.2
18 lectin_c Lectin C-type domain 0.0003 21.2
20 Alpha_L_fucos Alpha-L-fucosidase 1.2e-217 736.5




22 pkinase Eukaryotic protein kinase domain 3.3e-87 303.1
23 pkinase Eukaryotic protein kinase domain 2.7e-85 296.8
24 pkinase Eukaryotic protein kinase domain 2.7e-85 296.8
25 ank Ank repeat 5.5e-14 59.9
27 pkinase Eukaryotic protein kinase domain 1.5e-100 347.4
28 spectrin Spectrin repeat 4e-57 203.2
29 spectrin Spectrin repeat 4e-57 203.2
30 WD40 WD domain, G-beta repeat 1.2e-07 38.8
33 rrm RNA recognition motif. 1.1e-17 72.2
34 rrm RNA recognition motif. 1.1e-17 72.2
36 7tm_1 7 transmembrane receptor (rhodopsin family) 3e-36 117.3
37 ank Ank repeat 5.9e-25 96.3
38 SRF-TF SRF-type transcription factor 1.4e-36 133.9
40 alk_phosphatase Alkaline phosphatase 0 1034.9
44 zf-C2H2 Zinc finger, C2H2 type 8.6e-103 354.9
45 sugar_tr Sugar (and other) transporter 3.1e-08 40.3
47 7tm_2 7 transmembrane receptor (Secretin family) 6.4e-79 275.6
50 zf-C2H2 Zinc finger, C2H2 type 1.3e-98 341.0
51 filament Intermediate filament proteins 1.2e-176 600.3
52 zf-C3HC4 Zinc finger, C3HC4 type (RING finger) 2.7e-10 37.7
53 Cadherin_C_term Cadherin cytoplasmic region 1.9e-94 327.2
54 S_100 S-100/ICaBP type calcium binding domain 5.2e-18 73.3
58 inositol_P Inositol monophosphatase family 5e-13 49.8
59 7tm_1 7 transmembrane receptor (rhodopsin family) 8.8e-46 147.6
60 Kunitz_BPTI Kunitz/Bovine pancreatic trypsin inhibito 3.7e-47 148.6
62 DAD DAD family 2.5e-74 260.3
63 MOZ_SAS MOZ/SAS family 5.9e-133 455.1
64 MOZ_SAS MOZ/SAS family 1.7e-123 423.6
65 ras Ras family 9.3e-89 308.3
67 Ham1p_like Ham1 family 3.7e-49 176.7
68 7tm_1 7 transmembrane receptor (rhodopsin family) 5.2e-39 126.1
70 zf-C2H2 Zinc finger, C2H2 type 1.5e-112 387.3
71 Peptidase_M41 Peptidase family M41 1.2e-110 381.0
72 abhydrolase alpha/beta hydrolase fold 9.8e-05 26.5
81 K_tetra K+ channel tetramerisation domain 0.022 -16.8
82 pkinase Eukaryotic protein kinase domain 5e-49 176.3
84 AAA ATPases associated with various cellular act 1.3e-77 271.3
85 homeobox Homeobox domain 1.4e-28 108.3
87 TGF-beta Transforming growth factor beta like 6.7e-68 210.2
91 mito_carr Mitochondrial carrier proteins 4.6e-57 198.5
95 adenylatekinase Adenylate kinase 1.1e-15 60.0
96 ig Immunoglobulin domain 4.1e-20 69.8
99 CNH CNH domain 3.4e-120 412.7
100 homeobox Homeobox domain 7.4e-32 119.3
101 zf-C2H2 Zinc finger, C2H2 type 2.2e-47 170.8
102 zf-C2H2 Zinc finger, C2H2 type 4.4e-89 309.4
103 dynamin Dynamin family 1.4e-150 513.6
104 lectin_c Lectin C-type domain 4.2e-15 63.6
105 lectin_c Lectin C-type domain 4.2e-15 63.6
108 metalthio Metallothionein 2e-25 97.9



112 HSP20 Hsp20/alpha crystallin family 2.6e-20 77.7
115 EF_TS Elongation factor TS 3.8e-63 221.1
116 sugar_tr Sugar (and other) transporter 4e-63 223.1
118 catalase Catalase 0 1158.9
119 UCH Ubiquitin carboxyl-terminal hydrolase, famil 1e-10 24.4
122 metalthio Metallothionein 2.8e-25 97.4
125 adh_short short chain dehydrogenase 1.6e-45 164.6
126 KRAB KRAB box 7.9e-25 95.9
127 G-alpha G-protein alpha subunit 1e-249 843.0
128 mito_carr Mitochondrial carrier proteins 2e-65 227.2
131 EF1BD EF-1 guanine nucleotide exchange domain 4.9e-53 189.6
132 GYF GYF domain 4.9e-28 106.6
133 GYF GYF domain 4.9e-28 106.6
134 lipocalin Lipocalin / cytosolic fatty-acid binding pr 2.1e-33 119.1
135 pkinase Eukaryotic protein kinase domain 3.3e-86 299.8
136 ank Ank repeat 2.2e-29 111.1
137 IL8 Small cytokines (intecrine/chemokine), inter 3.1e-18 65.2
139 pyridoxal_deC Pyridoxal-dependent decarboxylase conse 0.00011 19.0
140 cadherin Cadherin domain 1.3e-88 307.8
142 efhand EF hand 5.7e-33 123.0
143 Acyltransferase Acyltransferase 2e-29 111.2
146 cytochrome_c Cytochrome c 1.7e-33 124.7
147 pkinase Eukaryotic protein kinase domain 2.3e-86 300.3
148 PDZ PDZ domain (Also known as DHR or GLGF). 1.7e-09 45.0
149 aldo_ket_red Aldo/keto reductase family 7.4e-189 640.8
150 homeobox Homeobox domain 3.2e-08 38.7
151 PseudoU_synth_1 tRNA pseudouridine synthase 4.7e-57 203.0
152 abhydrolase alpha/beta hydrolase fold 1.7e-31 118.0
153 PDZ PDZ domain (Also known as DHR or GLGF). 1.1e-09 45.6
156 PHD PHD-finger 7.6e-15 62.8
157 fn3 Fibronectin type III domain 0.015 21.9
158 homeobox Homeobox domain 2.7e-27 104.1
160 PWI PWI domain 3.9e-24 93.6
162 DnaJ DnaJ domain 2e-06 34.8
164 Cbl_N CBL proto-oncogene N-terminal domain 8e-117 401.5
166 metalthio Metallothionein 3.1e-26 100.6
167 LRR Leucine Rich Repeat 0.00069 26.3
169 fibrinogen_C Fibrinogen beta and gamma chains, C-term 5.3e-180 611.4
170 fibrinogen_C Fibrinogen beta and gamma chains, C-term 5.3e-180 611.4
171 fibrinogen_C Fibrinogen beta and gamma chains, C-term 1e-149 510.8
173 homeobox Homeobox domain 1.5e-29 111.6
174 FYVE FYVE zinc finger 7.4e-28 103.8
175 GRIP GRIP domain 3.9e-08 40.5
182 pkinase Eukaryotic protein kinase domain 3.4e-71 250.0
185 CAP_GLY CAP-Gly domain 5.6e-51 182.8
186 TBC TBC domain 2.2e-50 180.8
187 TBC TBC domain 2.2e-50 180.8




188 PDZ PDZ domain (Also known as DHR or GLGF). 4e-13 57.0
189 Kelch Kelch motif 5.2e-106 365.6
190 Tropomyosin Tropomyosins 3.8e-171 535.4
192 Rieske Rieske [2Fe-2S] domain 0.0016 18.5
199 ig Immunoglobulin domain 5.9e-19 66.1
202 EGF EGF-like domain 3.4e-54 193.5
203 trefoil Trefoil (P-type) domain 1e-24 95.5
204 TBC TBC domain 8.5e-38 139.0
205 efhand EF hand 0.0096 22.6
206 ISK_Channel Slow voltage-gated potassium channel 0.0031 8.1
207 trefoil Trefoil (P-type) domain 2.9e-48 173.7
209 Ribosomal_S13 Ribosomal protein S13/S18 1.2e-78 274.7
210 hemopexin Hemopexin 1.3e-62 221.5
213 TBC TBC domain 2.5e-48 174.0
215 Basic Myogenic Basic domain 4.3e-50 179.8
216 Ribosomal_L24 KOW motif 8.2e-23 89.2
222 fn3 Fibronectin type III domain 7.3e-141 481.4
223 cofilin_ADF Cofilin/tropomyosin-type actin-binding pr 9.3e-47 168.8
224 efhand EF hand 6.1e-06 33.2
225 Pterin_4a Pterin 4 alpha carbinolamine dehydratase 9.3e-42 152.1
228 ABC_tran ABC transporter 4.1e-110 379.2
234 E1_DerP2_DerF2 E1 family 3.7e-90 312.9
235 E1_DerP2_DerF2 E1 family 1.6e-48 174.6
237 PMP22_Claudin PMP-22/EMP/MP20/Claudin family 1.7e-25 98.1
238 Opiods_neuropep Vertebrate endogenous opioids neurope 1.8e-159 543.2
239 eIF-5a Eukaryotic initiation factor 5A hypusine 5.9e-104 358.8
240 Amino_oxidase Flavin containing amine oxidase 2.5e-11 37.8
243 zf-C2H2 Zinc finger, C2H2 type 2.1e-99 343.6
244 Band_7 SPFH domain / Band 7 family 2.3e-53 190.7
245 ank Ank repeat 1.6e-88 307.5
246 zf-C2H2 Zinc finger, C2H2 type 6.7e-49 175.9
247 actin Actin 2.3e-42 140.3
248 ER_lumen_recept ER lumen protein retaining receptor 2.4e-155 529.5
250 PMP22_Claudin PMP-22/EMP/MP20/Claudin family 2.2e-38 140.9
252 Collagen Collagen triple helix repeat (20 copies) 1.4e-13 58.6
255 C2 C2 domain 0.052 7.8
257 CAP_GLY CAP-Gly domain 1.4e-20 81.8
260 WD40 WD domain, G-beta repeat 9.9e-62 218.5
261 WD40 WD domain, G-beta repeat 9.9e-62 218.5
262 WD40 WD domain, G-beta repeat 9.9e-62 218.5
263 cofilin_ADF Cofilin/tropomyosin-type actin-binding pr 7.8e-21 82.6
264 Ribosomal_L14 Ribosomal protein L14p/L23e 9.2e-10 40.6
265 SAPA Saposin A-type domain 4.4e-27 103.4
266 SAPA Saposin A-type domain 4.4e-27 103.4
267 ABC_tran ABC transporter 9.5e-39 142.2
269 Ribosomal_L14 Ribosomal protein L14p/L23e 6.2e-62 219.2
270 abhydrolase alpha/beta hydrolase fold 0.042 -3.3
272 ras Ras family 4.3e-87 302.8




273 rrm RNA recognition motif. 0.074 14.6
275 lipocalin Lipocalin / cytosolic fatty-acid binding pr 2.5e-41 146.4
276 ras Ras family 1.1e-67 238.3
277 UCH Ubiquitin carboxyl-terminal hydrolase, famil 1.2e-147 503.9
278 START START domain 3.2e-09 44.1
279 WD40 WD domain, G-beta repeat 1.8e-27 104.7
282 G-patch G-patch domain 7.$e-22 86.0
287 Anti_proliferat BTG1 family 1.2e-101 351.0
289 KRAB KRAB box 7.1e-21 82.8
293 7tm_3 7 transmembrane receptor 3.3e-73 256.6
295 SET SET domain 5e-30 113.2
296 Pyridox_oxidase Pyridoxamine 5'-phosphate oxidase 1.3e-76 268.0
297 rrm RNA recognition motif. 5.4e-45 162.9
298 Ubie_methyltran ubiE/COQ5 methyltransferase family 6.3e-05 -96.3
299 Ubie_methyltran ubiE/COQ5 methyltransferase family 0.0024 -118.1
301 Cyt_reductase FAD/NAD-binding Cytochrome reductase 7.7e-61 215.5
302 G-patch G-patch domain 3.1e-14 60.7
307 7tm_1 7 transmembrane receptor (rhodopsin family) 7.7e-43 138.2
308 PH PH domain 0.0015 17.8
310 7tm_1 7 transmembrane receptor (rhodopsin family) 1.4e-84 270.8
311 Rhodanese Rhodanese-like domain 3.3e-64 226.7
312 tubulin Tubulin/FtsZ family 4.9e-286 963.6
314 SURF4 SURF4 family 1.2e-199 676.6
325 IMS impB/mucB/samB family 2e-58 207.5
327 cadherin Cadherin domain 4.3e-91 316.0
329 NAC NAC domain 2.1e- 8 107.8
330 IP_trans Phosphatidylinositol transfer protein 6.5e-98 338.7
332 TFIIS Transcription factor S-II (TFIIS) 8.8e-05 29.3
337 zf-C2H2 Zinc finger, C2H2 type 3.6e-61 216.6
340 AIRS AIR synthase related protein 4e-32 120.2
343 annexin Annexin 4.6e-80 279.4
346 Stathmin Stathmin family 1.8e-90 314.0
347 Ribosomal_L16 Ribosomal protein L16 4.6e-09 34.9
348 lactamase_B Metallo-beta-lactamase superfamily 0.012 -6.0
351 efhand EF hand 2.5e-14 61.0
353 lectin_c Lectin C-type domain 1.3e-05 32.1
354 WD40 WD domain, G-beta repeat 2.2e-18 74.5
360 lipocalin Lipocalin / cytosolic fatty-acid binding pr 6.3e-10 38.3
362 Acetyltransf Acetyltransferase (GNAT) family 0.0019 24.9
365 tRNA-synt_1 tRNA synthetases class I (I, L, M and V) 4.6e-185 628.2
366 Sulfatase Sulfatase 6.1e-228 770.6
368 START START domain 3.8e-11 50.5
369 pkinase Eukaryotic protein kinase domain 2.4e-10 41.3
370 ACBP Acyl CoA binding protein 4.4e-56 199.7
371 pkinase Eukaryotic protein kinase domain 1.6e-94 327.5
373 EGF EGF-like domain 2.6e-12 54.3
375 zf-C2H2 Zinc finger, C2H2 type 8.2e-64 225.4
377 KRAB KRAB box 3.7e-27 103.7
379 SET SET domain 7.3e-61 215.6
380 Glyco_transf_8 Glycosyl transferase family 8 0.0028 -40.1
381 zf-C2H2 Zinc finger, C2H2 type 4.3e-06 33.7
383 Glyco_transf_8 Glycosyl transferase family 8 0.0028 -40.1




384 RasGEF RasGEF domain 8.1e-43 155.7
385 TBC TBC domain 0.017 -66.6
389 Glycos_transf_2 Glycosyl transferases 1.3e-15 65.3
390 Na_Ca_Ex Sodium/calcium exchanger protein 3.9e-105 362.7
391 fn3 Fibronectin type III domain 4.1e-102 352.6
392 fn3 Fibronectin type III domain 3.4e-45 163.6
393 fn3 Fibronectin type III domain 3.4e-45 163.6
394 ldl_recept_b Low-density lipoprotein receptor repeat 7.1e-49 175.8
395 Ribosomal_L30 Ribosomal protein L30p/L7e 0.0023 16.0
396 Oxysterol_BP Oxysterol-binding protein 1.5e-94 327.5
397 RDS_ROM1 Peripherin/rom-1 2.9e-33 123.9
399 lactamase_B Metallo-beta-lactamase superfamily 3.4e-39 143.6
402 F-box F-box domain. 0.0002 28.1
403 CLP_protease Clp protease 4.8e-64 226.2
405 Ribosomal_L35Ae Ribosomal protein L35Ae 6e-77 269.0
406 LIM LIM domain containing proteins 0.00021 20.7
410 tRNA-synt_1c tRNA synthetases class I (E and Q) 1e-236 799.8
411 NTP_transf_2 Nucleotidyltransferase domain 3.9e-16 67.0
412 DEAD DEAD/DEAH box helicase 0.00016 17.2
414 DUF94 Domain of unknown function DUF94 0.00011 26.9
415 tubulin Tubulin/FtsZ family 4.5e-289 973.7
420 SET SET domain 3.3e-57 203.5
421 WD40 WD domain, G-beta repeat 6.1e-29 109.6
423 zf-C2H2 Zinc finger, C2H2 type 1.5e-39 144.9
424 pkinase Eukaryotic protein kinase domain 8.9e-75 261.8
428 LIM LIM domain containing proteins 1.8e-34 126.7
431 kazal Kazal-type serine protease inhibitor domain 3.7e-18 73.8
432 SH2 Src homology domain 2 1.4e-67 198.4
433 zf-C2H2 Zinc finger, C2H2 type 2.8e-144 492.7
434 ras Ras family 0.012 -106.8
436 E1-E2_ATPase E1-E2 ATPase 1.6e-117 391.0
437 RNA_pol_A RNA polymerase alpha subunit 0 1077.7
438 PHD PHD-finger 1.6e-11 51.7
439 lectin_c Lectin C-type domain 4.7e-30 113.3
440 zf-C2H2 Zinc finger, C2H2 type 1.1e-65 231.6
441 arrestin Arrestin (or S-antigen) 2.9e-254 858.1
442 aminotran_3 Aminotransferases class-III pyridoxal-pho 8.2e-80 231.1
443 UCH-1 Ubiquitin carboxyl-terminal hydrolases famil 8.5e-12 52.6
444 CTF_NFI CTF/NF-I family 2.6e-277 934.6
451 T-box T-box 3.8e-117 402.6
453 Rieske Rieske [2Fe-2S] domain 2.6e-13 57.7
454 zf-C2H2 Zinc finger, C2H2 type 3.9e-64 226.5
456 homeobox Homeobox domain 2.8e-08 38.9
459 ig Immunoglobulin domain 2.6e-20 70.5
460 Hydrolase haloacid dehalogenase-like hydrolase 4e-25 96.9
462 rve Integrase core domain 1.6e-13 50.7
466 CH Calponin homology (CH) domain 2.4e-17 71.1
467 CH Calponin homology (CH) domain 2.4e-17 71.1
468 Sterol_desat Sterol desaturase 7.5e-38 139.2
469 pro_isomerase Cyclophilin type peptidyl-prolyl cis- 2.6e-63 220.9
470 Peptidase_M24 metallopeptidase family M24 6e-08 28.1
471 PDZ PDZ domain (Also known as DHR or GLGF). 5.4e-129 441.9




472 myb_DNA-binding Myb-like DNA-binding domain 3.6e-06 33.9
473 ZZ Zinc finger present in dystrophin, CB 0.012 20.0
474 EF1G_domain Elongation factor 1 gamma, conserved doma 6.3e-88 305.5
475 Ribosomal_L31e Ribosomal protein L31e 6.1e-66 232.5
476 C1q C1q domain 2.5e-75 263.7
477 SH3 SH3 domain 1.1e-12 55.6
478 MoaA_NifB_PqqE moaA / nifB / pqqE family 0.002 -17.7
479 FYVE FYVE zinc finger 9.3e-21 78.6
480 DNA_pol_A DNA polymerase family A 2.3e-46 167.4
482 adh_short short chain dehydrogenase 1.2e-62 221.6
483 ank Ank repeat 1.3e-17 71.9
484 IMS impB/mucB/samB family 2.2e-83 290.5
486 TIR TIR domain 3.2e-19 67.8
487 FMO-like Flavin-binding monooxygenase-like 0 1425.5
488 I_LWEQ I/LWEQ domain 9.5e-101 341.0
495 homeobox Homeobox domain 3.6e-06 30.8
497 pkinase Eukaryotic protein kinase domain 2.3e-166 566.1
499 fn3 Fibronectin type III domain 2.5e-237 801.8
501 LRR Leucine Rich Repeat 9.3e-31 115.6
502 RGS Regulator of G protein signaling domain 0.041 11.9
503 filament Intermediate filament proteins 1e-142 487.5
505 fn3 Fibronectin type III domain 1.3e-100 347.7
506 HECT HECT-domain (ubiquitin-transferase). 1e-13 59.0
507 Ribosomal_L7Ae Ribosomal protein L7Ae 5.7e-26 99.7
508 WD40 WD domain, G-beta repeat 0.063 19.8
509 WD40 WD domain, G-beta repeat 0.063 19.8
510 WD40 WD domain, G-beta repeat 2.1e-42 154.3
511 pkinase Eukaryotic protein kinase domain 2.3e-86 300.4
512 G-gamma GGL domain 1.9e-08 34.3
513 SH3 SH3 domain 3e-06 34.2
515 HTH_AraC Bacterial regulatory helix-turn-helix protei 3.9e-27 103.6
516 zf-C2H2 Zinc finger, C2H2 type 1.7e-34 128.0
517 S1 S1 RNA binding domain 6.1e-58 205.9
518 pkinase Eukaryotic protein kinase domain 1.8e-75 264.2
525 cadherin Cadherin domain 2e-80 280.6
528 zf-C2H2 Zinc finger, C2H2 type 4e-70 246.4
529 neur_chan Neurotransmitter-gated ion-channel 5.8e-222 750.8
531 RhoGEF RhoGEF domain 3.5e-44 160.2
532 myosin_head Myosin head (motor domain) 0 1494.5
533 LRR Leucine Rich Repeat 8.3e-15 62.6
535 Sec7 Sec7 domain 5.1e-92 319.1
536 homeobox Homeobox domain 4.8e-05 26.4
539 actin Actin 2.4e-100 330.6
542 ank Ank repeat 1.9e-35 131.2
544 zf-CCCH Zinc finger C-x8-C-x5-C-x3-H type 2.8e-10 41.7
546 DSPc Dual specificity phosphatase, catalytic doma 2.4e-40 147.4
547 HMG_CoA_synt Hydroxymethylglutaryl-coenzyme A synthas 0 1250.8
549 laminin_G Laminin G domain 3.3e-76 266.6
551 PHD PHD-finger 0.008 9.3
552 PDZ PDZ domain (Also known as DHR or 0.0017 25.0


54 1038 2022 3006 787CIP2 56 7591


55 1039 2023 3007 787CIP2 57 7600


56 1040 2024 3008 787CIP2 58 7604


57 1041 2025 3009 787CIP2 59 7612


58 1042 2026 3010 787CIP2 60 7613


59 1043 2027 3011 787CIP2 61 7615


60 1044 2028 3012 787CIP2 62 7616


61 1045 2029 3013 787CIP2 63 7617


62 1046 2030 3014 787CIP2 64 7623


63 1047 2031 3015 787CIP2 65 7625


64 1048 2032 3016 787CIP2 66 7625


65 1049 2033 3017 787CIP2 67 7630


66 1050 2034 3018 787CIP2 68 7638


67 1051 2035 3019 787CIP2 69 7640


68 1052 2036 3020 787CIP2 70 7670


69 1053 2037 3021 787CIP2 71 7676


70 1054 2038 3022 787CIP2 72 7688


71 1055 2039 3023 787CIP2 73 7690


72 1056 2040 3024 787CIP2 74 7700


73 1057 2041 3025 787CIP2 75 7774


74 1058 2042 3026 787CIP2 76 7784


75 1059 2043 3027 787CIP2 77 7785


76 1060 2044 3028 787CIP2 78 7792


77 1061 2045 3029 787CIP2 79 7798


78 1062 2046 3030 787CIP2 80 7807


79 1063 2047 3031 787CIP2 81 7810


80 1064 2048 3032 787CIP2 82 7812


81 1065 2049 3033 787CIP2 83 7816


82 1066 2050 3034 787CIP2 84 7826


83 1067 2051 3035 787CIP2 85 7842


84 1068 2052 3036 787CIP2 86 7850


85 1069 2053 3037 787CIP2 87 7865


86 1070 2054 3038 787CIP2 88 7882


87 1071 2055 3039 787CIP2 89 7891


88 1072 2056 3040 787CIP2 90 7892


89 1073 2057 3041 787CIP2 91 7896


90 1074 2058 3042 787CIP2 92 7896


91 1075 2059 3043 787CIP2 93 7907


92 1076 2060 3044 787CIP2 94 7913


93 1077 2061 3045 787CIP2 95 7914


94 1078 2062 3046 787CIP2 96 7915


95 1079 2063 3047 787CIP2 97 7920


96 1080 2064 3048 787CIP2 98 7921


97 1081 2065 3049 787CIP2 99 7924


98 1082 2066 3050 787CIP2 100 7927


99 1083 2067 3051 787CIP2 101 7929


100 1084 2068 3052 787CIP2 102 7937


101 1085 2069 3053 787CIP2 103 7940


102 1086 2070 3054 787CIP2 104 7942


103 1087 2071 3055 787CIP2 105 7944


104 1088 2072 3056 787CIP2 106 7951


105 1089 2073 3057 787CIP2 107 7951


106 1090 2074 3058 787CIP2 108 7962


107 1091 2075 3059 787CIP2 109 7964


108 1092 2076 3060 787CIP2 110 7977


109 1093 2077 3061 787CIP2 111 7978


110 1094 2078 3062 787CIP2 112 7980


111 1095 2079 3063 787CIP2 113 7982


112 1096 2080 3064 787CIP2 114 8000


113 1097 2081 3065 787CIP2 115 8003


174 1158 2142 3126 787CIP2 176 8149


175 1159 2143 3127 787CIP2 177 8150


176 1160 2144 3128 787CIP2 178 8157


177 1161 2145 3129 787CIP2 179 8161


178 1162 2146 3130 787CIP2 180 8162


179 1163 2147 3131 787CIP2 181 8165


180 1164 2148 3132 787CIP2 182 8166


181 1165 2149 3133 787CIP2 183 8167


182 1166 2150 3134 787CIP2 184 8169


183 1167 2151 3135 787CIP2 185 8170


184 1168 2152 3136 787CIP2 186 8172


185 1169 2153 3137 787CIP2 187 8173


186 1170 2154 3138 787CIP2 188 8174


187 1171 2155 3139 787CIP2 189 8174


188 1172 2156 3140 787CIP2 191 8182


189 1173 2157 3141 787CIP2 192 8186


190 1174 2158 3142 787CIP2 193 8188


191 1175 2159 3143 787CIP2 194 8191


192 1176 2160 3144 787CIP2 195 8192


193 1177 2161 3145 787CIP2 196 8193


194 1178 2162 3146 787CIP2 197 8194


195 1179 2163 3147 787CIP2 198 8195


196 1180 2164 3148 787CIP2 199 8196


197 1181 2165 3149 787CIP2 200 8200


198 1182 2166 3150 787CIP2 201 8201


199 1183 2167 3151 787CIP2 202 8202


200 1184 2168 3152 787CIP2 203 8205


201 1185 2169 3153 787CIP2 204 8206


202 1186 2170 3154 787CIP2 205 8207


203 1187 2171 3155 787CIP2 206 8208


204 1188 2172 3156 787CIP2 207 8209


205 1189 2173 3157 787CIP2 208 8210


206 1190 2174 3158 787CIP2 209 8211


207 1191 2175 3159 787CIP2 210 8212


208 1192 2176 3160 787CIP2 211 8213


209 1193 2177 3161 787CIP2 212 8214


210 1194 2178 3162 787CIP2 213 8215


211 1195 2179 3163 787CIP2 214 8216


212 1196 2180 3164 787CIP2 215 8217


213 1197 2181 3165 787CIP2 217 8221


214 1198 2182 3166 787CIP2 218 8222


215 1199 2183 3167 787CIP2 219 8223


216 1200 2184 3168 787CIP2 220 8224


217 1201 2185 3169 787CIP2 221 8225


218 1202 2186 3170 787CIP2 222 8227


219 1203 2187 3171 787CIP2 223 8232


220 1204 2188 3172 787CIP2 224 8235


221 1205 2189 3173 787CIP2 225 8236


222 1206 2190 3174 787CIP2 227 8238


223 1207 2191 3175 787CIP2 228 8239


224 1208 2192 3176 787CIP2 229 8240


225 1209 2193 3177 787CIP2 230 8242


226 1210 2194 3178 787CIP2 231 8246


227 1211 2195 3179 787CIP2 232 8252


228 1212 2196 3180 787CIP2 233 8257


229 1213 2197 3181 787CIP2 234 8288


230 1214 2198 3182 787CIP2 235 8310


231 1215 2199 3183 787CIP2 236 8311


232 1216 2200 3184 787CIP2 237 8315


233 1217 2201 3185 787CIP2 238 8318


594 1 578 562 3 546 7 87CIP2B 244 156
2 7


595 1 579 563 3 547 787CIP2B 245 7156
2


596 1580 564 3 548 787CIP2B 246 7171
2


597 1581 565 3549 787CIP2B 248 7265
2


598 1582 2566 3550 787CIP2B 249 7268


599 1583 2567 3551 787CIP2B 250 7308


600 1584 2568 3552 787CIP2B 251 7336


601 1585 2569 3553 787CIP2B 252 7347


602 1586 2570 3554 787CIP2B 253 7405
'


603 1587 2571 3555 254 7405
787CIP2B


604 1588 2572 3556 787CIP2B 255 7412
l


605 1589 2573 3557 256 7412
787CIP2B


606 1590 2574 3558 787CIP2B 257 7436


607 1591 2575 3559 787CIP2B 258 7436


608 1592 2576 3560 787CIP2B 259 7454


609 1593 2577 3561 787CIP2B 260 7476


610 1594 2578 3562 787CIP2B 261 7598


611 1595 2579 3563 787CIP2B 262 7619


612 1596 2580 3564 787CIP2B 263 7644


613 1597 2581 3565 787CIP2B 264 7648


614 1598 2582 3566 787CIP2B 265 7659


615 1599 2583 3567 787CIP2B 266 7661


616 1600 2584 3568 787CIP2B 267 7669


617 1601 2585 3569 787CIP2B 268 7686


618 1602 2586 3570 787CIP2B 269 7686


619 1603 2587 3571 787CIP2B 270 7694


620 1604 2588 3572 787CIP2B 271 7697


621 1605 2589 3573 787CIP2B 272 7733


622 1606 2590 3574 787CIP2B 273 7734


623 1607 2591 3575 787CIP2B 274 7744


624 1608 2592 3576 787CIP2B 275 7751


625 1609 2593 3577 787CIP2B 276 7756


626 1610 2594 3578 787CIP2B 277 7761


627 1611 2595 3579 787CIP2B 278 7761


628 1612 2596 3580 787CIP2B 279 7776


629 1613 2597 3581 787CIP2B 280 7783


630 1614 2598 3582 787CIP2B 281 7800


631 1615 2599 3583 787CIP2B 282 7800


632 1616 2600 3584 787CIP2B 283 7801


633 1617 2601 3585 787CIP2B 284 7811


634 1618 2602 3586 787CIP2B 285 7817


635 1619 2603 3587 787CIP2B 286 7821


636 1620 2604 3588 787CIP2B 287 7822


637 1621 2605 3589 787CIP2B 288 7841


638 1622 2606 3590 787CIP2B 289 7847


639 1623 2607 3591 787CIP2B 290 7880


640 1624 2608 3592 787CIP2B 291 7910


641 1625 2609 3593 787CIP2B 293 7936


642 1626 2610 3594 787CIP2B 294 7945


643 1627 2611 3595 787CIP2B 295 7948


644 1628 2612 3596 787CIP2B 296 7963


645 1629 2613 3597 787CIP2B 297 7984


646 1630 2614 3598 787CIP2B 298 7985


647 1631 2615 3599 787CIP2B 299 8014


648 1632 2616 3600 787CIP2B 301 8029


649 1633 2617 3601 787CIP2B 302 8043


650 1634 2618 3602 787CIP2B 303 8164


651 1635 2619 3603 787CIP2B 304 8175


652 1636 2620 3604 787CIP2B 305 8250


653 1637 2621 3605 787CIP2B 306 8253


1698 2682 3666 787CIP2B 367 8646


714 1699 2683 3667 787CIP2B 368 8657


715 1700 2684 3668 787CIP2B 369 8661


716 1701 2685 3669 787CIP2B 370 8670


717 1702 2686 3670 787CIP2B 371 8692


718 1703 2687 3671 787CIP2B 372 8698


719 1704 2688 3672 787CIP2B 373 8762


720 1705 2689 3673 787CIP2B 374 8768


721 1706 2690 3674 787CIP2B 375 8768


722 1707 2691 3675 787CIP2B 376 8799


723 1708 2692 3676 787CIP2B 377 8806


724 1709 2693 3677 787CIP2B 378 8809


725 1710 2694 3678 787CIP2B 379 8814


726 1711 2695 3679 787CIP2B 380 8822


727 1712 2696 3680 787CIP2B 381 8833


728 1713 2697 3681 787CIP2B 382 8835


729 1714 2698 3682 787CIP2B 383 8877


730 1715 2699 3683 787CIP2B 384 8886


731 1716 2700 3684 787CIP2B 385 9003


732 1717 2701 3685 787CIP2B 386 9157


733 1718 2702 3686 787CIP2B 387 9175


734 1719 2703 3687 787CIP2B 388 9205


735 1720 2704 3688 787CIP2B 389 9260


736 1721 2705 3689 787CIP2B 390 9295


737 1722 2706 3690 787CIP2B 391 9307


738 1723 2707 3691 787CIP2B 392 9307


739 1724 2708 3692 787CIP2B 393 9312


740 1725 2709 3693 787CIP2B 394 9347


741 1726 2710 3694 787CIP2B 395 9370


742 2711 3695 787CIP2B 396 9370


743 1727 2712 3696 787CIP2B 397 9382


744 1728 2713 3697 787CIP2B 398 9591


745 1729 2714 3698 787CIP2B 399 9650


746 1730 2715 3699 787CIP2B 400 9655


747 1731 2716 3700 787CIP2B 401 9663


748 1732 2717 3701 787CIP2B 402 9715


749 1733 2718 3702 787CIP2B 403 9755


750 1734 3703 787CIP2B 404 9766


751 1735 2719 3704 787CIP2B 405 9771


752 1736 2720 3705 787CIP2B 406 9784


753 1737 2721 3706 787CIP2B 407 9925


754 1738 2722 3707 787CIP2B 408 9970


755 1739 2723 787CIP2B 409 9997


756 1740 2724 3708 10008


757 1741 2725 3709 787CIP2B 410


1742 2726 3710 787CIP2B 411 10010


758 1743 2727 3711 787CIP2B 412 10023


759 1744 2728 3712 787CIP2B 413 10043


760 1745 2729 3713 787CIP2B 414 10093


761 1746 2730 3714 787CIP2B 415 10172


762 2731 3715 787CIP2B 416 10184


763 1747 3716 787CIP2B 417 10205


764 1748 2732 787CIP2B 418 10246


765 1749 2733 3717 10298


766 1750 2734 3718 787CIP2B 419


767 1751 2735 3719 787CIP2C 1 886


768 1752 2736 3720 787CIP2C 2 1028


769 1753 2737 3721 787CIP2C 3 1916


770 1754 2738 3722 787CIP2C 4 2072


771 1755 2739 3723 787CIP2C 5 2424


772 1756 2740 3724 787CIP2C 6 2474


773 1757 2741 3725 787CIP2C 7 2474



JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME
THIS IS VOLUME 1 OF 3
CONTAINING PAGES 1 TO 208
NOTE: For additional volumes, please contact the Canadian Patent Office
FILE NAME:
VOLUME NOTE:

Representative Drawing

Sorry, the representative drawing for patent document number 2399776 was not found.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History should be consulted.

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2001-02-05
(87) PCT Publication Date 2001-08-09
(85) National Entry 2002-08-02
Examination Requested 2005-09-16
Dead Application 2009-02-05

Abandonment History

Abandonment Date Reason Reinstatement Date
2008-02-05 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $300.00 2002-08-02
Maintenance Fee - Application - New Act 2 2003-02-05 $100.00 2002-12-12
Registration of a document - section 124 $100.00 2003-11-05
Registration of a document - section 124 $100.00 2003-11-05
Registration of a document - section 124 $100.00 2003-11-05
Registration of a document - section 124 $100.00 2003-11-05
Registration of a document - section 124 $100.00 2003-11-05
Registration of a document - section 124 $100.00 2003-11-05
Maintenance Fee - Application - New Act 3 2004-02-05 $100.00 2003-12-12
Maintenance Fee - Application - New Act 4 2005-02-07 $100.00 2004-12-10
Request for Examination $800.00 2005-09-16
Maintenance Fee - Application - New Act 5 2006-02-06 $200.00 2005-12-12
Maintenance Fee - Application - New Act 6 2007-02-05 $200.00 2006-12-14
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
NUVELO, INC.
Past Owners on Record
ASUNDI, VINOD
CAO, YICHENG
CHEN, RUI-HONG
DRMANAC, RADOJE T.
GOODRICH, RYLE
HYSEQ, INC.
LIU, CHENGHUA
MA, YUNQUING
REN, FEIYAN
TANG, Y. TOM
WANG, DUNRUI
WANG, JIAN-RUI
WANG, ZHIWEI
WEHRMAN, TOM
XU, CHONGJUN
XUE, AIDONG J.
YANG, YONGHONG
ZHANG, JIE
ZHAO, QING A.
ZHOU, PING
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description Date (yyyy-mm-dd) Number of pages Size of Image (KB)
Description 2002-08-02 78 5,811
Description 2002-08-02 197 15,292
Cover Page 2002-12-19 2 37
Description 2002-08-02 210 15,307
Abstract 2002-08-02 1 80
Claims 2002-08-02 4 160
Drawings 2002-08-02 1 13
Assignment 2002-08-02 3 131
Prosecution-Amendment 2002-08-02 1 19
Correspondence 2002-12-16 1 24
Prosecution-Amendment 2003-02-03 1 34
Correspondence 2003-11-05 2 73
Assignment 2003-11-05 50 2,603
PCT 2002-08-02 1 59
PCT 2002-08-02 1 57
Prosecution-Amendment 2005-09-16 1 36

Biological Sequence Listings


Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.

No BSL files available.