Patent 2361743 Summary

(12) Patent Application: (11) CA 2361743
(54) English Title: GENES ASSOCIATED WITH DISEASES OF THE COLON
(54) French Title: GENES ASSOCIES A DES MALADIES DU COLON
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/12 (2006.01)
  • A61K 38/00 (2006.01)
  • A61K 38/17 (2006.01)
  • A61K 39/395 (2006.01)
  • A61K 48/00 (2006.01)
  • C07K 14/47 (2006.01)
  • C07K 16/18 (2006.01)
  • C12N 15/63 (2006.01)
(72) Inventors :
  • WALKER, MICHAEL G. (United States of America)
  • VOLKMUTH, WAYNE (United States of America)
  • KLINGLER, TOD M. (United States of America)
  • LAL, PREETI (United States of America)
(73) Owners :
  • INCYTE GENOMICS, INC.
(71) Applicants :
  • INCYTE GENOMICS, INC. (United States of America)
(74) Agent: SMART & BIGGAR LP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2000-02-01
(87) Open to Public Inspection: 2000-08-31
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2000/002595
(87) International Publication Number: US2000002595
(85) National Entry: 2001-07-31

(30) Application Priority Data:
Application No. Country/Territory Date
09/255,381 (United States of America) 1999-02-22

Abstracts

English Abstract


The invention provides colon cancer genes and polypeptides encoded by those
genes. The invention also provides expression vectors, host cells, and
antibodies. The invention also provides methods for diagnosing, treating or
preventing diseases of the colon.


French Abstract

L'invention concerne de gènes du cancer du côlon ainsi que des polypeptides encodés par ces gènes. L'invention concerne également des vecteurs d'expression, des cellules hôtes et des anticorps. L'invention concerne enfin des procédés pour le diagnostic, le traitement ou la prévention de maladies du côlon.

Claims

Note: Claims are shown in the official language in which they were submitted.


What is claimed is:
1. A substantially purified polynucleotide comprising a gene that is
coexpressed with one or
more known colon cancer genes in a plurality of biological samples, wherein
each known colon cancer
gene is selected from the group consisting of carbonic anhydrase I, II, and IV
(CA I, II, and IV),
carcinoembryonic antigen family of proteins (cea), colorectal carcinoma tumor-associated antigen (CO-029),
down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec), glutathione
peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8 and 20),
cadherin (cadher), and intestinal
mucin (muc-2).
2. The polynucleotide of claim 1, comprising a polynucleotide sequence
selected from the group consisting of:
(a) a polynucleotide sequence selected from the group consisting of SEQ ID
NOs:1-7;
(b) a polynucleotide encoding a polypeptide sequence selected from the group
consisting of SEQ
ID NOs:8 and 9;
(c) a polynucleotide sequence having at least 75% identity to the
polynucleotide sequence of (a)
or (b);
(d) a polynucleotide sequence which is complementary to the polynucleotide
sequence of (a), (b)
or (c);
(e) a polynucleotide sequence comprising at least 18 sequential nucleotides of
the polynucleotide
sequence of (a), (b), (c), or (d); and
(f) a polynucleotide which hybridizes under stringent conditions to the
polynucleotide of (a), (b),
(c), (d), or (e).
3. A substantially purified polypeptide comprising the gene product of a gene
that is coexpressed
with one or more known colon cancer genes in a plurality of biological
samples, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2).
4. The polypeptide of claim 3, comprising a polypeptide sequence selected from
the group
consisting of:
(a) the polypeptide having the amino acid sequence selected from the group
consisting of SEQ
ID NOs:8 and 9;
(b) a polypeptide sequence having at least 85% identity to the polypeptide
sequence of (a); and
(c) a polypeptide sequence comprising at least 6 sequential amino acids of the
polypeptide
sequence of (a) or (b).
5. An expression vector comprising the polynucleotide of claim 2.
6. A host cell comprising the expression vector of claim 5.
7. A pharmaceutical composition comprising the polynucleotide of claim 2 in
conjunction with a
suitable pharmaceutical carrier.
8. A pharmaceutical composition comprising the polypeptide of claim 3 in
conjunction with a
suitable pharmaceutical carrier.
9. An antibody or antibody fragment comprising an antigen binding site,
wherein the antigen
binding site specifically binds to the polypeptide of claim 4.
10. An immunoconjugate comprising the antigen binding site of the antibody or
antibody
fragment of claim 9 joined to a therapeutic agent.
11. A method for diagnosing a disease or condition associated with the altered
expression of a
gene that is coexpressed with one or more known colon cancer genes, wherein
each known colon cancer
gene is selected from the group consisting of carbonic anhydrase I, II, and IV
(CA I, II, and IV),
carcinoembryonic antigen family of proteins (cea), colorectal carcinoma tumor-associated antigen (CO-029),
down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec), glutathione
peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8 and 20),
cadherin (cadher), and intestinal
mucin (muc-2), the method comprising the steps of:
(a) providing a biological sample;
(b) hybridizing a polynucleotide of claim 2 to the biological sample under
conditions effective to
form one or more hybridization complexes;
(c) detecting the hybridization complexes; and
(d) comparing the levels of the hybridization complexes with the level of
hybridization
complexes in a non-diseased sample, wherein the altered level of hybridization
complexes compared with
the level of hybridization complexes of a nondiseased sample correlates with
the presence of the disease
or condition.
12. A method for treating or preventing a disease associated with the altered
expression of a gene
that is coexpressed with one or more known colon cancer genes in a subject in
need, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2), the method comprising the step of administering
to the subject in need the
pharmaceutical composition of claim 7 in an amount effective for treating or
preventing the disease.
13. A method for treating or preventing a disease associated with the altered
expression of a gene
that is coexpressed with one or more known colon cancer genes in a subject in
need, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2), the method comprising the step of administering
to the subject in need the
pharmaceutical composition of claim 8 in an amount effective for treating or
preventing the disease.
14. A method for treating or preventing a disease associated with the altered
expression of a gene
that is coexpressed with one or more known colon cancer genes in a subject in
need, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2), the method comprising the step of administering
to the subject in need the
antibody or the antibody fragment of claim 9 in an amount effective for
treating or preventing the disease.
15. A method for treating or preventing a disease associated with the altered
expression of a gene
that is coexpressed with one or more known colon cancer genes in a subject in
need, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2), the method comprising the step of administering
to the subject in need the
immunoconjugate of claim 10 in an amount effective for treating or preventing
the disease.
16. A method for treating or preventing a disease associated with the altered
expression of a gene
that is coexpressed with one or more known colon cancer genes in a subject in
need, wherein each known
colon cancer gene is selected from the group consisting of carbonic anhydrase
I, II, and IV (CA I, II, and
IV), carcinoembryonic antigen family of proteins (cea), colorectal carcinoma
tumor-associated antigen
(CO-029), down-regulated in adenoma (dra), fatty-acid binding protein (fabp),
galectin (galec),
glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20 (ker 8
and 20), cadherin (cadher),
and intestinal mucin (muc-2), the method comprising the step of administering
to the subject in need the
polynucleotide sequence of claim 2 in an amount effective for treating or
preventing the disease.

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 02361743 2001-07-31
WO 00/50588 PCT/US00/02595
GENES ASSOCIATED WITH DISEASES OF THE COLON
TECHNICAL FIELD
The invention relates to seven genes associated with diseases of the colon,
particularly colon
cancer, as identified by their coexpression with known colon cancer genes. The
invention also relates to
the use of these biomolecules in diagnosis, prognosis, prevention, treatment,
and evaluation of therapies
for diseases of the colon.
BACKGROUND ART
Colon cancer is the third leading cause of cancer deaths in the United States.
Each year over
100,000 new cases are diagnosed, and 50,000 patients die from the disease. In
large part this death rate is
due to the inability to diagnose the disease at an early stage (Wanebo (1993)
Colorectal Cancer, Mosby,
St Louis MO). Although some of the genes that participate in or regulate the
growth of colon cells are
known, many other genes remain to be identified. Identification of new genes
with significant levels of
expression in cells of the diseased colon will provide new diagnostics,
opportunities for earlier patient
diagnosis, and targets for the development of therapeutic agents.
The present invention satisfies a need in the art by providing new
compositions, seven genes
associated with diseases of the colon identified by their coexpression
patterns with genes expressed in
colon cancer, that are useful for diagnosis, prognosis, treatment, prevention,
and evaluation of therapies
for diseases of the colon.
SUMMARY OF THE INVENTION
In one aspect, the invention provides for a substantially purified
polynucleotide comprising a
gene that is coexpressed with one or more known colon cancer genes in a
plurality of biological samples.
Preferably, known colon cancer genes are selected from the group consisting of
carbonic anhydrase I, II,
and IV (CA I, II, and IV), carcinoembryonic antigen family of proteins (cea),
colorectal carcinoma tumor-
associated antigen (CO-029), down-regulated in adenoma (dra), fatty-acid
binding protein (fabp), galectin
(galec), glutathione peroxidase (gpx2), guanylin (guan), cytokeratin 8 and 20
(ker 8 and 20), cadherin
(cadher), and intestinal mucin (muc-2). Preferred embodiments include: (a) a
polynucleotide sequence
selected from SEQ ID NOs:1-7; (b) a polynucleotide sequence which encodes the
polypeptide of SEQ
ID NOs:8 or 9; (c) a polynucleotide sequence having at least 75% identity to
the polynucleotide
sequence of (a) or (b); (d) a polynucleotide sequence which is complementary
to the polynucleotide
sequence of (a), (b), or (c); (e) a polynucleotide sequence comprising at
least 10, preferably at least 18,
sequential nucleotides of the polynucleotide sequence of (a), (b), (c), or
(d); or (f) a polynucleotide
which hybridizes under stringent conditions to the polynucleotide of (a), (b),
(c), (d) or (e). Furthermore,
the invention provides an expression vector comprising any of the
polynucleotides described above and
host cells comprising the expression vector. Still further, the invention
provides a method for treating or
preventing a disease or condition associated with the altered expression of a
gene that is coexpressed with
one or more known colon cancer genes comprising administering to a subject in
need a polynucleotide
described above in an amount effective for treating or preventing the disease.
In a second aspect, the invention provides a substantially purified
polypeptide comprising the
gene product of a gene that is coexpressed with one or more known colon cancer
genes in a plurality of
biological samples. The known colon cancer gene may be selected from the group
consisting of carbonic
anhydrase I, II, and IV, carcinoembryonic antigen family of proteins,
colorectal carcinoma tumor-
associated antigen, down-regulated in adenoma, fatty-acid binding protein,
galectin, glutathione
peroxidase, guanylin, cytokeratin 8 and 20, cadherin, and intestinal mucin.
Preferred embodiments are
(a) the polypeptide sequence of SEQ ID NOs:8 and 9; (b) a polypeptide sequence
having at least 85%
identity to the polypeptide sequence of (a); and (c) a polypeptide sequence
comprising at least 6
sequential amino acids of the polypeptide sequence of (a) or (b).
Additionally, the invention provides
antibodies that bind specifically to any of the above described polypeptides
and a method for treating or
preventing a disease or condition associated with the altered expression of a
gene that is coexpressed with
one or more known colon cancer genes comprising administering to a subject in
need such an antibody in
an amount effective for treating or preventing the disease.
In another aspect, the invention provides a pharmaceutical composition
comprising the
polynucleotide of claim 2 or the polypeptide of claim 3 in conjunction with a
suitable pharmaceutical
carrier and a method for treating or preventing a disease or condition
associated with the altered
expression of a gene that is coexpressed with one or more known colon cancer
genes comprising
administering to a subject in need such a composition in an amount effective
for treating or preventing the
disease.
In a further aspect, the invention provides a method for diagnosing a disease
or condition
associated with the altered expression of a gene that is coexpressed with one
or more known colon cancer
genes, wherein each known colon cancer gene is selected from the group
consisting of carbonic
anhydrase I, II, and IV, carcinoembryonic antigen family of proteins,
colorectal carcinoma tumor-
associated antigen, down-regulated in adenoma, fatty-acid binding protein,
galectin, glutathione
peroxidase, guanylin, cytokeratin 8 and 20, cadherin, and intestinal mucin.
The method comprises the
steps of (a) providing a sample comprising one or more of the coexpressed
genes; (b) hybridizing the
polynucleotide of claim 2 to the coexpressed genes under conditions effective
to form one or more
hybridization complexes; (c) detecting the hybridization complexes; and (d)
comparing the levels of the
hybridization complexes with the level of hybridization complexes in a
nondiseased sample, wherein
altered levels of one or more of the hybridization complexes in a diseased
sample compared with the level
of hybridization complexes in a non-diseased sample correlates with the
presence of the disease or
condition.
Additionally, the invention provides antibodies, antibody fragments, and
immunoconjugates that
exhibit specificity to any of the above described polypeptides and methods for
treating or preventing
diseases or conditions of the colon.
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
The Sequence Listing provides exemplary colon cancer gene sequences including
polynucleotide
sequences, SEQ ID NOs:1-7, and the polypeptide sequences, SEQ ID NOs:8 and 9.
Each sequence is
identified by a sequence identification number (SEQ ID NO) and by the Incyte
clone number with which
the sequence was first identified.
DESCRIPTION OF THE INVENTION
It must be noted that as used herein and in the appended claims, the singular
forms "a", "an", and
"the" include the plural reference unless the context clearly dictates
otherwise. Thus, for example, a
reference to "a host cell" includes a plurality of such host cells, and a
reference to "an antibody" is a
reference to one or more antibodies and equivalents thereof known to those
skilled in the art, and so forth.
DEFINITIONS
"NSEQ" refers generally to a polynucleotide sequence of the present invention,
including SEQ ID
NOs:1-7. "PSEQ" refers generally to a polypeptide sequence of the present
invention, SEQ ID NOs:8
and 9.
A "fragment" refers to a nucleic acid sequence that is preferably at least 20
nucleic acids in
length, more preferably 40 nucleic acids, and most preferably 60 nucleic acids
in length, and
encompasses, for example, fragments consisting of nucleic acids 1-50, 51-400,
401-4000, 4001-12,000,
and the like, of SEQ ID NOs:1-7.
"Gene" refers to the partial or complete coding sequence of a gene and to its
5' or 3' untranslated
regions. The gene may be in a sense or antisense (complementary) orientation.
"Colon cancer gene" refers to a gene whose expression pattern is similar to
that of known colon
cancer genes which are useful in the diagnosis, treatment, prognosis, or
prevention of diseases of the
colon, particularly colon cancer and other diseases associated with abnormal
cell growth. "Known colon
cancer gene" refers to a sequence which has been previously identified as
useful in the diagnosis,
treatment, prognosis, or prevention of diseases of the colon. Typically, this
means that the known gene is
expressed at higher levels (i.e., has more abundant transcripts) in diseased
or cancerous colon tissue than
in normal or non-diseased colon or any other tissue.
"Polynucleotide" refers to a nucleic acid molecule, nucleic acid sequence,
oligonucleotide,
nucleotide, or any fragment thereof. It may be DNA or RNA of genomic or
synthetic origin,
double-stranded or single-stranded, and combined with carbohydrate, lipids,
protein or other materials to
perform a particular activity or form a useful composition. "Oligonucleotide"
is substantially equivalent
to the terms amplimer, primer, oligomer, element, and probe.
"Polypeptide" refers to an amino acid molecule, amino acid sequence,
oligopeptide, peptide, or
protein or portions thereof whether naturally occurring or synthetic.
A "portion" refers to peptide sequence which is preferably at least 5 to about
15 amino acids in
length, most preferably at least 10 amino acids long, and which retains some
biological or immunological
activity of, for example, a portion of SEQ ID NOs:8 and 9.
"Sample" is used in its broadest sense. A sample containing nucleic acids may
comprise a bodily
fluid; an extract from a cell, chromosome, organelle, or membrane isolated
from a cell; genomic DNA,
RNA, or cDNA in solution or bound to a substrate; a cell; a tissue; a tissue
print; and the like.
"Substantially purified" refers to a nucleic acid or an amino acid sequence
that is removed from
its natural environment and that is isolated or separated, and is at least
about 60% free, preferably about
75% free, and most preferably about 90% free, from other components with which
it is naturally present.
"Substrate" refers to any suitable rigid or semi-rigid support to which
polynucleotides or
polypeptides are bound and includes membranes, filters, chips, slides, wafers,
fibers, magnetic or
nonmagnetic beads, gels, capillaries or other tubing, plates, polymers, and
microparticles with a variety of
surface forms including wells, trenches, pins, channels, and pores.
A "variant" refers to a polynucleotide whose sequence diverges from SEQ ID
NOs:1-7 or to a
polypeptide whose sequence diverges from SEQ ID NOs:8 and 9, respectively.
Polynucleotide sequence
divergence may result from mutational changes such as deletions, additions,
and substitutions of one or
more nucleotides; it may also be introduced to accommodate differences in
codon usage. Each of these
types of changes may occur alone, or in combination, one or more times in a
given sequence. Polypeptide
variants include sequences that possess at least one structural or functional
characteristic of SEQ ID
NOs:8 and 9.
THE INVENTION
The present invention encompasses a method for identifying biomolecules that
are associated
with a specific disease, regulatory pathway, subcellular compartment, cell
type, tissue type, or species. In
particular, the method identifies genes useful in diagnosis, prognosis,
treatment, prevention, and
evaluation of therapies for diseases of the colon including, but not limited to,
colon cancer, metastatic colon
cancer, atrophic gastritis, cholecystitis, Crohn's disease, irritable bowel
syndrome, ulcerative colitis, and
the like.
The method entails first identifying polynucleotides that are expressed in a
plurality of cDNA
libraries. The identified polynucleotides include genes of known or unknown
function which are known
to be expressed in a specific disease process, subcellular compartment, cell
type, tissue type, or species.
The expression patterns of the genes with known function are compared with
those of the genes with
unknown function to determine whether a specified coexpression probability
threshold is met. Through
this comparison, a subset of the polynucleotides having a high coexpression
probability with the known
genes can be identified. The high coexpression probability correlates with a
particular coexpression
probability threshold which is preferably less than 0.001 and more preferably
less than 0.00001.
The polynucleotides originate from cDNA libraries derived from a variety of
sources including,
but not limited to, eukaryotes such as human, mouse, rat, dog, monkey, plant,
and yeast, and prokaryotes
such as bacteria; and viruses. These polynucleotides can also be selected from
a variety of sequence
types including, but not limited to, expressed sequence tags (ESTs), assembled
polynucleotide sequences,
full length gene coding regions, promoters, introns, enhancers, 5'
untranslated regions, and 3' untranslated
regions. To have statistically significant analytical results, the
polynucleotides need to be expressed in at
least three cDNA libraries.
The cDNA libraries used in the coexpression analysis of the present invention
can be obtained
from adrenal gland, biliary tract, bladder, blood cells, blood vessels, bone
marrow, brain, bronchus,
cartilage, chromaffin system, colon, connective tissue, cultured cells,
embryonic stem cells, endocrine
glands, epithelium, esophagus, fetus, ganglia, heart, hypothalamus, immune
system, intestine, islets of
Langerhans, kidney, larynx, liver, lung, lymph, muscles, neurons, ovary,
pancreas, penis, peripheral
nervous system, phagocytes, pituitary, placenta, pleura, prostate, salivary
glands, seminal vesicles,
skeleton, spleen, stomach, testis, thymus, tongue, ureter, uterus, and the
like. The number of cDNA
libraries selected can range from as few as 3 to greater than 10,000.
Preferably, the number of the cDNA
libraries is greater than 500.
In a preferred embodiment, genes are assembled to reflect related sequences,
such as assembled
sequence fragments derived from a single transcript. Assembly of the
polynucleotide sequences can be
performed using sequences of various types including, but not limited to,
ESTs, extensions, or shotgun
sequences. In a most preferred embodiment, the polynucleotide sequences are
derived from human
sequences that have been assembled using the algorithm disclosed in "System
and Methods for Analyzing
Biomolecular Sequences", USSN 09/276,534, filed March 25, 1999, incorporated
herein by reference.
Experimentally, differential expression of the polynucleotides can be
evaluated by methods
including, but not limited to, differential display by spatial immobilization
or by gel electrophoresis,
genome mismatch scanning, representational difference analysis, and transcript
imaging. Additionally,
differential expression can be assessed by microarray technology. These
methods may be used alone or
in combination.
Known colon cancer genes can be selected based on the use of these genes as
diagnostic or
prognostic markers or as therapeutic targets. Preferably, the known colon
cancer genes include carbonic
anhydrase I, II, and IV, carcinoembryonic antigen family of proteins,
colorectal carcinoma tumor-
associated antigen, down-regulated in adenoma, fatty-acid binding protein,
galectin, glutathione
peroxidase, guanylin, cytokeratin 8 and 20, cadherin, intestinal mucin, and
the like.
The procedure for identifying novel genes that exhibit a statistically
significant coexpression
pattern with known colon cancer genes is as follows. First, the presence or
absence of a gene in a cDNA
library is defined: a gene is present in a cDNA library when at least one cDNA
fragment corresponding
to that gene is detected in a cDNA sample taken from the library, and a gene
is absent from a library when
no corresponding cDNA fragment is detected in the sample.
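The presence/absence definition above can be sketched in a few lines of Python. This is an illustrative sketch only, not the applicants' implementation: the function name `occurrence_vector`, the library names, and the fragment counts are all hypothetical.

```python
from typing import Dict, List


def occurrence_vector(fragment_counts: Dict[str, int], libraries: List[str]) -> List[int]:
    """Return 1 for each library in which at least one cDNA fragment of the
    gene was detected, and 0 for each library in which none was detected."""
    return [1 if fragment_counts.get(lib, 0) >= 1 else 0 for lib in libraries]


# Hypothetical data: number of cDNA fragments detected per library for one gene.
libraries = ["lib1", "lib2", "lib3", "lib4"]
gene_a_counts = {"lib1": 3, "lib2": 1}  # no fragments seen in lib3 or lib4

gene_a = occurrence_vector(gene_a_counts, libraries)
print(gene_a)  # [1, 1, 0, 0]
```

Note that the count itself is discarded: a gene seen once and a gene seen many times in a library are both recorded as present.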
Second, the significance of gene coexpression is evaluated using a probability
method to measure
a due-to-chance probability of the coexpression. The probability method can be
the Fisher exact test, the
chi-squared test, or the kappa test. These tests and examples of their
applications are well known in the
art and can be found in standard statistics texts (Agresti (1990) Categorical
Data Analysis, John Wiley &
Sons, New York NY; Rice (1988) Mathematical Statistics and Data Analysis,
Duxbury Press, Pacific
Grove CA). A Bonferroni correction (Rice, supra, page 384) can also be applied
in combination with one
of the probability methods for correcting statistical results of one gene
versus multiple other genes. In a
preferred embodiment, the due-to-chance probability is measured by a Fisher
exact test, and the threshold
of the due-to-chance probability is set preferably to less than 0.001, more
preferably to less than 0.00001.
To determine whether two genes, A and B, have similar coexpression patterns,
occurrence data
vectors can be generated as illustrated in Table 1. The presence of a gene
occurring at least once in a
library is indicated by a one, and its absence from the library, by a zero.
Table 1. Occurrence data for genes A and B
          Library 1   Library 2   Library 3   ...   Library N
gene A        1           1           0       ...       0
gene B        1           0           1       ...       0
For a given pair of genes, the occurrence data in Table 1 can be summarized in
a 2 x 2 contingency table.
Table 2. Contingency table for co-occurrences of genes A and B
                  Gene A present   Gene A absent   Total
Gene B present          8                2           10
Gene B absent           2               18           20
Total                  10               20           30
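As an illustration of how occurrence vectors of the kind shown in Table 1 reduce to the four counts of Table 2, the following Python sketch tallies co-occurrences. The function name `contingency_counts` and the example vectors are hypothetical; the vectors were constructed only so that they reproduce the counts in Table 2.

```python
from typing import List, Tuple


def contingency_counts(a: List[int], b: List[int]) -> Tuple[int, int, int, int]:
    """Summarize two 0/1 occurrence vectors as
    (both present, B only, A only, neither)."""
    if len(a) != len(b):
        raise ValueError("occurrence vectors must cover the same libraries")
    both = sum(1 for x, y in zip(a, b) if x == 1 and y == 1)
    only_b = sum(1 for x, y in zip(a, b) if x == 0 and y == 1)
    only_a = sum(1 for x, y in zip(a, b) if x == 1 and y == 0)
    neither = sum(1 for x, y in zip(a, b) if x == 0 and y == 0)
    return both, only_b, only_a, neither


# Hypothetical vectors over 30 libraries, built to match Table 2.
gene_a = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 18
gene_b = [1] * 8 + [1] * 2 + [0] * 2 + [0] * 18

table = contingency_counts(gene_a, gene_b)
print(table)  # (8, 2, 2, 18)
```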

Table 2 presents co-occurrence data for gene A and gene B in a total of 30
libraries. Both gene A
and gene B occur 10 times in the libraries. Table 2 summarizes and presents:
1) the number of times
gene A and B are both present in a library, 2) the number of times gene A and
B are both absent in a
library, 3) the number of times gene A is present and gene B is absent, and 4)
the number of times gene
B is present and gene A is absent. The upper left entry is the number of times
the two genes co-occur in a
library, and the entry in the second row and second column is the number of
times neither gene occurs in a library. The off
diagonal entries are the number of times one gene occurs and the other does
not. Both A and B are
present eight times and absent 18 times. Gene A is present and gene B is
absent two times; and gene B is
present and gene A is absent two times. The probability ("p-value") that the
above association occurs due
to chance as calculated using a Fisher exact test is 0.0003. Associations are
generally considered
significant if a p-value is less than 0.01 (Agresti, supra; Rice, supra).
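The p-value quoted above can be checked with a short calculation. The sketch below is an assumption-laden illustration, not the applicants' code: it implements a two-sided Fisher exact test directly from the hypergeometric distribution using only the Python standard library, and the function name `fisher_exact_two_sided` is hypothetical. For the counts in Table 2 it gives approximately 0.00029, which agrees with the reported 0.0003 after rounding.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher exact p-value for the 2 x 2 table [[a, b], [c, d]].

    With row and column totals fixed, the upper left count follows a
    hypergeometric distribution. The p-value sums the probabilities of all
    tables that are no more likely than the observed one.
    """
    row1 = a + b          # e.g. libraries with gene B present
    col1 = a + c          # e.g. libraries with gene A present
    n = a + b + c + d     # total number of libraries
    denom = comb(n, col1)

    def prob(x: int) -> float:
        return comb(row1, x) * comb(n - row1, col1 - x) / denom

    lowest = max(0, col1 - (n - row1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so floating-point ties count as "no more likely".
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Counts from Table 2: both present, B only, A only, neither.
p_value = fisher_exact_two_sided(8, 2, 2, 18)
print(round(p_value, 4))  # 0.0003
```

Where SciPy is available, `scipy.stats.fisher_exact` offers the same test; the hand-written version is used here only to keep the sketch self-contained.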
This method of estimating the probability for coexpression of two genes makes
several
assumptions. The method assumes that the libraries are independent and are
identically sampled.
However, in practical situations, the selected cDNA libraries are not entirely
independent, because more
than one library may be obtained from a single subject or tissue. Nor are they
entirely identically
sampled, because different numbers of cDNAs may be sequenced from each
library. The number of
cDNAs sequenced typically ranges from 5,000 to 10,000 cDNAs per library. In
addition, because a
Fisher exact coexpression probability is calculated for each gene versus
41,419 other assembled genes, a
Bonferroni correction for multiple statistical tests is necessary.
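A Bonferroni correction of the kind mentioned above can be sketched as follows. This is a minimal illustration: the function names and the significance level `alpha = 0.01` are assumptions made for the example, and the text does not state exactly how the correction was combined with the thresholds given elsewhere in the description. Only the number of comparisons, 41,419, comes from the text.

```python
def bonferroni_adjust(p_value: float, n_tests: int) -> float:
    """Multiply the raw p-value by the number of tests, capped at 1.0."""
    return min(1.0, p_value * n_tests)


def is_significant(p_value: float, n_tests: int, alpha: float = 0.01) -> bool:
    """Assumed decision rule: significant if the adjusted p-value is below alpha."""
    return bonferroni_adjust(p_value, n_tests) < alpha


N_TESTS = 41419  # each gene is compared against 41,419 other assembled genes

# A raw p-value of 0.0003 does not survive correction over this many tests.
print(bonferroni_adjust(0.0003, N_TESTS))  # 1.0
# A much smaller raw p-value does (hypothetical value for illustration).
print(is_significant(1e-9, N_TESTS))       # True
```

The example shows why very small raw thresholds are needed when each gene is tested against tens of thousands of others.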
Using the method of the present invention, we have identified seven novel
genes that exhibit
strong association, or coexpression, with known genes that are specific to
colon cancer. These known
colon cancer genes include carbonic anhydrase I, II, and IV, carcinoembryonic
antigen family of proteins,
colorectal carcinoma tumor-associated antigen, down-regulated in adenoma,
fatty-acid binding protein,
galectin, glutathione peroxidase, guanylin, cytokeratin 8 and 20, cadherin,
and intestinal mucin. The
results presented in Table 6 show that the expression of the seven novel genes
has a direct or indirect
association with the expression of known colon cancer genes. Therefore, the
novel genes can potentially
be used in diagnosis, treatment, prognosis, or prevention of diseases of the
colon or in the evaluation of
therapies for diseases of the colon. Further, the gene products of the seven
novel genes are either
potential therapeutic proteins or targets of therapeutics against diseases of
the colon.
Therefore, in one embodiment, the present invention encompasses a
polynucleotide sequence
comprising the sequence of SEQ ID NOs:1-7. These seven polynucleotides are
shown by the method of
the present invention to have strong coexpression association with known colon
cancer genes and with
each other. The invention also encompasses a variant of the polynucleotide
sequence, its complement, or
18 consecutive nucleotides of a sequence provided in the above described
sequences. Variant

CA 02361743 2001-07-31
WO 00/50588 PCT/US00/02595
polynucleotide sequences typically have at least about 75%, more preferably at
least about 85%, and most
preferably at least about 95% polynucleotide sequence identity to NSEQ.
NSEQ or the encoded PSEQ may be used to search against the GenBank primate
(pri), rodent
(rod), mammalian (mam), vertebrate (vrtp), and eukaryote (eukp) databases,
SwissProt, BLOCKS
(Bairoch et al. (1997) Nucleic Acids Res 25:217-221), PFAM, and other
databases that contain previously
identified and annotated motifs, sequences, and gene functions. Methods that
search for primary
sequence patterns with secondary structure gap penalties (Smith et al. (1992)
Protein Engineering 5:35-
51) as well as algorithms such as Basic Local Alignment Search Tool (BLAST;
Altschul (1993) J Mol
Evol 36:290-300; Altschul et al. (1990) J Mol Biol 215:403-410), BLOCKS
(Henikoff and Henikoff
(1991) Nucleic Acids Research 19:6565-6572), Hidden Markov Models (HMM; Eddy
(1996) Cur Opin
Str Biol 6:361-365; Sonnhammer et al. (1997) Proteins 28:405-420), and the
like, can be used to
manipulate and analyze nucleotide and amino acid sequences. These databases,
algorithms and other
methods are well known in the art and are described in Ausubel et al. (1997;
Short Protocols in Molecular
Biology, John Wiley & Sons, New York NY, unit 7.7) and in Meyers (1995;
Molecular Biology and
Biotechnology, Wiley VCH, New York NY, p 856-853).
Also encompassed by the invention are polynucleotide sequences that are
capable of hybridizing
to SEQ ID NOs:1-7, and fragments thereof, under stringent conditions. Stringent
conditions can be
defined by salt concentration, temperature, and other chemicals and conditions
well known in the art.
Suitable conditions can be selected, for example, by varying the
concentrations of salt in the
prehybridization, hybridization, and wash solutions or by varying the
hybridization and wash
temperatures. With some substrates, the temperature can be decreased by adding
formamide to the
prehybridization and hybridization solutions.
Hybridization can be performed at low stringency, with buffers such as 5xSSC
with 1% sodium
dodecyl sulfate (SDS) at 60° C, which permits complex formation between
two nucleic acid sequences
that contain some mismatches. Subsequent washes are performed at higher
stringency with buffers such
as 0.2xSSC with 0.1% SDS at either 45° C (medium stringency) or
68° C (high stringency), to maintain
hybridization of only those complexes that contain completely complementary
sequences. Background
signals can be reduced by the use of detergents such as SDS, Sarcosyl, or
Triton X-100, and/or a blocking
agent, such as salmon sperm DNA. Hybridization methods are described in detail
in Ausubel (supra,
units 2.8-2.11, 3.18-3.19 and 4.6-4.9) and Sambrook et al. (1989; Molecular
Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY).
NSEQ can be extended utilizing a partial nucleotide sequence and employing
various PCR-based
methods known in the art to detect upstream sequences such as promoters and
other regulatory elements.
(See, e.g., Dieffenbach and Dveksler (1995) PCR Primer, a Laboratory Manual,
Cold Spring Harbor
Press, Plainview NY). Additionally, one may use an XL-PCR kit (PE Biosystems,
Foster City CA),
nested primers, and commercially available cDNA (Life Technologies, Rockville
MD) or genomic
libraries (Clontech, Palo Alto CA) to extend the sequence. For all PCR-based
methods, primers may be
designed using commercially available software, such as OLIGO 4.06 Primer
analysis software (National
Biosciences, Plymouth MN) or another appropriate program, to be about 18 to 30
nucleotides in length, to
have a GC content of about 50%, and to form a hybridization complex at
temperatures of about 68°C to
72°C.
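The primer guidelines above can be sketched as a simple screening function. This is an illustration only and does not reproduce the OLIGO software: the 40-60% GC window is an assumed reading of "about 50%", the annealing-temperature criterion is not modeled, and the function names are hypothetical.

```python
def gc_fraction(primer):
    """Fraction of G and C bases in a primer (case-insensitive)."""
    seq = primer.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("primer must be a non-empty string of A, C, G, T")
    return (seq.count("G") + seq.count("C")) / len(seq)


def primer_meets_guidelines(primer, min_len=18, max_len=30,
                            gc_low=0.40, gc_high=0.60):
    """Check length (about 18 to 30 nt) and GC content (about 50%).

    gc_low and gc_high are assumed tolerances, not values from the text.
    """
    length_ok = min_len <= len(primer) <= max_len
    gc_ok = gc_low <= gc_fraction(primer) <= gc_high
    return length_ok and gc_ok


print(primer_meets_guidelines("ATGCATGCATGCATGCATGC"))  # True: 20 nt, 50% GC
print(primer_meets_guidelines("ATATATATATATATATATAT"))  # False: 0% GC
```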
In another aspect of the invention, NSEQ can be cloned in recombinant DNA
molecules that
direct the expression of PSEQ or structural or functional fragments thereof,
in appropriate host cells. Due
to the inherent degeneracy of the genetic code, other DNA sequences which
encode substantially the same
or a functionally equivalent amino acid sequence may be produced and used to
express the polypeptide
encoded by NSEQ. The nucleotide sequences of the present invention can be
engineered using methods
generally known in the art in order to alter the nucleotide sequences for a
variety of purposes including,
but not limited to, modification of the cloning, processing, and/or expression
of the gene product. DNA
shuffling by random fragmentation and PCR reassembly of gene fragments and
synthetic oligonucleotides
may be used to engineer the nucleotide sequences. For example, oligonucleotide-
mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction
sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.
In order to express a biologically active protein, NSEQ, or derivatives
thereof, may be inserted
into an appropriate expression vector, i.e., a vector which contains the
necessary elements for
transcriptional and translational control of the inserted coding sequence in a
particular host. These
elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5'
and 3' untranslated regions. Methods which are well known to those skilled in
the art may be used to
construct such expression vectors. These methods include in vitro recombinant
DNA techniques,
synthetic techniques, and in vivo genetic recombination. (See, e.g., Sambrook,
supra; and Ausubel,
supra).
A variety of expression vector/host cell systems may be utilized to express
NSEQ. These include,
but are not limited to, microorganisms such as bacteria transformed with
recombinant bacteriophage,
plasmid, or cosmid DNA expression vectors; yeast transformed with yeast
expression vectors; insect cell
systems infected with baculovirus vectors; plant cell systems transformed with
viral or bacterial
expression vectors; or animal cell systems. For long term production of
recombinant proteins in
mammalian systems, stable expression in cell lines is preferred. For example,
NSEQ can be transformed
into cell lines using expression vectors which may contain viral origins of
replication and/or endogenous
expression elements and a selectable or visible marker gene on the same or on
a separate vector. The
invention is not to be limited by the vector or host cell employed.
In general, host cells that contain NSEQ and that express PSEQ may be
identified by a variety of
procedures known to those of skill in the art. These procedures include, but
are not limited to,
DNA-DNA or DNA-RNA hybridizations, PCR amplification, and protein bioassay or
immunoassay
techniques which include membrane, solution, or chip based technologies for
the detection and/or
quantification of nucleic acid or protein sequences. Immunological methods for
detecting and measuring
the expression of PSEQ using either specific polyclonal or monoclonal
antibodies are known in the art.
Examples of such techniques include enzyme-linked immunosorbent assays
(ELISAs),
radioimmunoassays (RIAs), and fluorescence activated cell sorting (FACS).
Host cells transformed with NSEQ may be cultured under conditions suitable for
the expression
and recovery of the protein from cell culture. The protein produced by a
transgenic cell may be secreted
or retained intracellularly depending on the sequence and/or the vector used.
As will be understood by
those of skill in the art, expression vectors containing NSEQ may be designed
to contain signal sequences
which direct secretion of the protein through a prokaryotic or eukaryotic cell
membrane.
In addition, a host cell strain may be chosen for its ability to modulate
expression of the inserted
sequences or to process the expressed protein in the desired fashion. Such
modifications of the
polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation,
lipidation, and acylation. Post-translational processing which cleaves a
"prepro" form of the protein may
also be used to specify protein targeting, folding, and/or activity. Different
host cells which have specific
cellular machinery and characteristic mechanisms for post-translational
activities (e.g., CHO, HeLa,
MDCK, HEK293, and WI38) are available from the American Type Culture
Collection (ATCC, Manassas
VA) and may be chosen to ensure the correct modification and processing of the
expressed protein.
In another embodiment of the invention, natural, modified, or recombinant
nucleic acid sequences
are ligated to a heterologous sequence resulting in translation of a fusion
protein containing heterologous
protein moieties in any of the aforementioned host systems. Such heterologous
protein moieties facilitate
purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but
are not limited to, glutathione S-transferase, maltose binding protein,
thioredoxin, calmodulin binding
peptide, 6-His, FLAG, c-myc, hemagglutinin, and monoclonal antibody epitopes.
In another embodiment, the nucleic acid sequences are synthesized, in whole or
in part, using
chemical or enzymatic methods well known in the art (Caruthers et al. (1980)
Nucl Acids Symp Ser (7)
215-233; Ausubel, supra). For example, peptide synthesis can be performed
using various solid-phase
techniques (Roberge et al. (1995) Science 269:202-204), and machines such as
the ABI 431A Peptide
synthesizer (PE Biosystems) can be used to automate synthesis. If desired, the
amino acid sequence may
be altered during synthesis and/or combined with sequences from other proteins
to produce a variant
protein.
In another embodiment, the invention entails a substantially purified
polypeptide comprising the
amino acid sequence of SEQ ID NOs:8 and 9 or fragments thereof.
DIAGNOSTICS and THERAPEUTICS
The polynucleotide sequences can be used in diagnosis, prognosis, treatment,
prevention, and
evaluation of therapies for diseases of the colon including, but not limited to,
colon cancer, metastatic colon
cancer, atrophic gastritis, cholecystitis, Crohn's disease, irritable bowel
syndrome, ulcerative colitis, and
the like.
In one preferred embodiment, the polynucleotide sequences are used for
diagnostic purposes to
determine the absence, presence, and excess expression of the protein. The
polynucleotides may be at
least 18 nucleotides long and consist of complementary RNA and DNA molecules,
branched nucleic
acids, and/or peptide nucleic acids (PNAs). In one alternative, the
polynucleotides are used to detect and
quantify gene expression in samples in which expression of NSEQ is correlated
with disease. In another
alternative, NSEQ can be used to detect genetic polymorphisms associated with
a disease. These
polymorphisms may be detected in the transcript cDNA.
The specificity of the probe is determined by whether it is made from a unique
region, a
regulatory region, or from a conserved motif. Both probe specificity and the
stringency of diagnostic
hybridization or amplification (maximal, high, intermediate, or low) will
determine whether the probe
identifies only naturally occurring, exactly complementary sequences, allelic
variants, or related
sequences. Probes designed to detect related sequences should preferably have
at least 75% sequence
identity to any of the nucleic acid sequences encoding PSEQ.
Methods for producing hybridization probes include the cloning of nucleic acid
sequences into
vectors for the production of mRNA probes. Such vectors are known in the art,
are commercially
available, and may be used to synthesize RNA probes in vitro by adding
appropriate RNA polymerases
and labeled nucleotides. Hybridization probes may incorporate nucleotides
labeled by a variety of
reporter groups including, but not limited to, radionuclides such as 32P
or 35S, enzymatic labels such as
alkaline phosphatase coupled to the probe via avidin/biotin coupling systems,
fluorescent labels, and the
like. The labeled polynucleotide sequences may be used in Southern or northern
analysis, dot blot, or
other membrane-based technologies; in PCR technologies; and in microarrays
utilizing samples from
subjects to detect altered PSEQ expression.
NSEQ can be labeled by standard methods and added to a sample from a subject
under conditions
suitable for the formation and detection of hybridization complexes. After
incubation the sample is
washed, and the signal associated with hybrid complex formation is quantitated
and compared with a
standard value. Standard values are derived from any control sample, typically
one that is free of the
suspect disease. If the amount of signal in the subject sample is altered in
comparison to the standard
value, then the presence of altered levels of expression in the sample
indicates the presence of the disease.
Qualitative and quantitative methods for comparing the hybridization complexes
formed in subject
samples with previously established standards are well known in the art.
Such assays may also be used to evaluate the efficacy of a particular
therapeutic treatment
regimen in animal studies, in clinical trials, or to monitor the treatment of
an individual subject. Once the
presence of disease is established and a treatment protocol is initiated,
hybridization or amplification
assays can be repeated on a regular basis to determine if the level of
expression in the subject begins to
approximate that which is observed in a healthy subject. The results obtained
from successive assays may
be used to show the efficacy of treatment over a period ranging from several
days to many years.
The polynucleotides may be used for the diagnosis of a variety of diseases
associated with the
colon. These include, but are not limited to, colon cancer, metastatic colon
cancer, atrophic gastritis,
cholecystitis, Crohn's disease, irritable bowel syndrome, ulcerative colitis,
and the like.
The polynucleotides may also be used as targets in a microarray. The
microarray can be used to
monitor the expression patterns of large numbers of genes simultaneously
and to identify splice variants,
mutations, and polymorphisms. Information derived from analyses of the
expression patterns may be
used to determine gene function, to understand the genetic basis of a disease,
to diagnose a disease, and to
develop and monitor the activities of therapeutic agents used to treat a
disease. Microarrays may also be
used to detect genetic diversity, single nucleotide polymorphisms which may
characterize a particular
population, at the genome level.
In yet another alternative, polynucleotides may be used to generate
hybridization probes useful in
mapping the naturally occurring genomic sequence. Fluorescent in situ
hybridization (FISH) may be
correlated with other physical chromosome mapping techniques and genetic map
data as described in
Heinz-Ulrich et al. (In: Meyers, supra, pp 965-968).
In another embodiment, antibodies or antibody fragments comprising an antigen
binding site that
specifically binds PSEQ may be used for the diagnosis of diseases
characterized by the over- or underexpression
of PSEQ. A variety of protocols for measuring PSEQ, including
ELISAs, RIAs, and FACS,
are well known in the art and provide a basis for diagnosing altered or
abnormal levels of expression.
Standard values for PSEQ expression are established by combining samples taken
from healthy subjects,
preferably human, with antibody to PSEQ under conditions suitable for complex
formation. The amount
of complex formation may be quantitated by various methods, preferably by
photometric means.
Quantities of PSEQ expressed in disease samples are compared with standard
values. Deviation between
standard and subject values establishes the parameters for diagnosing or
monitoring disease.
Alternatively, one may use competitive drug screening assays in which
neutralizing antibodies capable of
binding PSEQ specifically compete with a test compound for binding the
protein. Antibodies can be used
to detect the presence of any peptide which shares one or more antigenic
determinants with PSEQ. In one
aspect, the anti-PSEQ antibodies of the present invention can be used for
treatment or monitoring
therapeutic treatment for diseases of the colon, particularly colon cancer.
In another aspect, the NSEQ, or its complement, may be used therapeutically
for the purpose of
expressing mRNA and protein, or conversely to block transcription or
translation of the mRNA.
Expression vectors may be constructed using elements from retroviruses,
adenoviruses, herpes or vaccinia
viruses, or bacterial plasmids, and the like. These vectors may be used for
delivery of nucleotide
sequences to a particular target organ, tissue, or cell population. Methods
well known to those skilled in
the art can be used to construct vectors to express nucleic acid sequences or
their complements. (See,
e.g., Maulik et al. (1997) Molecular Biotechnology: Therapeutic Applications
and Strategies, Wiley-Liss,
New York NY.) Alternatively, NSEQ, or its complement, may be used for somatic
cell or stem cell gene
therapy. Vectors may be introduced in vivo, in vitro, and ex vivo. For ex vivo
therapy, vectors are
introduced into stem cells taken from the subject, and the resulting
transgenic cells are clonally
propagated for autologous transplant back into that same subject. Delivery of
NSEQ by transfection,
liposome injections, or polycationic amino polymers may be achieved using
methods which are well
known in the art. (See, e.g., Goldman et al. (1997) Nature Biotechnology
15:462-466.) Additionally,
endogenous NSEQ expression may be inactivated using homologous recombination
methods which insert
an inactive gene sequence into the coding region or other appropriate targeted
region of NSEQ. (See, e.g.
Thomas et al. (1987) Cell 51:503-512.)
Vectors containing NSEQ can be transformed into a cell or tissue to express a
missing protein or
to replace a nonfunctional protein. Similarly a vector constructed to express
the complement of NSEQ
can be transformed into a cell to downregulate the overexpression of PSEQ.
Complementary or antisense
sequences may consist of an oligonucleotide derived from the transcription
initiation site; nucleotides
between about positions -10 and +10 from the ATG are preferred. Similarly,
inhibition can be achieved
using triple helix base-pairing methodology. Triple helix pairing is useful
because it causes inhibition of
the ability of the double helix to open sufficiently for the binding of
polymerases, transcription factors, or
regulatory molecules. Recent therapeutic advances using triplex DNA have been
described in the
literature. (See, e.g., Gee et al. In: Huber and Carr (1994) Molecular and
Immunologic Approaches,
Futura Publishing, Mt. Kisco NY, pp 163-177.)
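The selection of an antisense sequence spanning the start codon can be sketched as follows. This is an illustration under stated assumptions: the A of the ATG is taken as position +1 with no position 0, so the window covers the 10 bases upstream of the ATG and the first 10 bases beginning at the A; the input is a sense-strand cDNA written 5' to 3'; and the function name antisense_around_start is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_around_start(cdna, atg_index, flank=10):
    """Return the antisense (reverse-complement) oligonucleotide covering
    positions -flank to +flank around the start codon.

    cdna      : sense-strand sequence, 5' to 3'
    atg_index : 0-based index of the A in the ATG start codon
    """
    seq = cdna.upper()
    if seq[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    if atg_index < flank or atg_index + flank > len(seq):
        raise ValueError("sequence too short for the requested window")
    window = seq[atg_index - flank:atg_index + flank]
    # Complement each base, then reverse so the result reads 5' to 3'.
    return window.translate(_COMPLEMENT)[::-1]


# Made-up example sequence; the ATG starts at index 10.
example = "AACCGGTTAAATGGCCAAGCTTT"
oligo = antisense_around_start(example, atg_index=10)
print(oligo)  # GCTTGGCCATTTAACCGGTT
```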
Ribozymes, enzymatic RNA molecules, may also be used to catalyze the cleavage
of mRNA and
decrease the levels of particular mRNAs, such as those comprising the
polynucleotide sequences of the
invention. (See, e.g., Rossi (1994) Current Biology 4:469-471.) Ribozymes may
cleave mRNA at
specific cleavage sites. Alternatively, ribozymes may cleave mRNAs at
locations dictated by flanking
regions that form complementary base pairs with the target mRNA. The
construction and production of
ribozymes are well known in the art and are described in Meyers (supra).
RNA molecules may be modified to increase intracellular stability and half
life. Possible
modifications include, but are not limited to, the addition of flanking
sequences at the 5' and/or 3' ends of
the molecule, or the use of phosphorothioate or 2' O-methyl rather than
phosphodiester linkages within
the backbone of the molecule. Alternatively, nontraditional bases such as
inosine, queuosine, and
wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms
of adenine, cytidine, guanine,
thymine, and uridine which are not as easily recognized by endogenous
endonucleases, may be included.
Further, an antagonist, or an antibody that binds specifically to PSEQ may be
administered to a
subject to treat or prevent a disease associated with colon cancer. The
antagonist, antibody, or fragment
may be used directly to inhibit the activity of the protein or indirectly to
deliver a therapeutic agent to
cells or tissues which express the PSEQ. An immunoconjugate comprising a PSEQ
binding site of the
antibody or the antagonist and a therapeutic agent may be administered to a
subject in need to treat or
prevent disease. The therapeutic agent may be a cytotoxic agent selected from
a group including, but not
limited to, abrin, ricin, doxorubicin, daunorubicin, taxol, ethidium bromide,
mitomycin, etoposide,
teniposide, vincristine, vinblastine, colchicine, dihydroxy anthracin dione,
actinomycin D, diphtheria
toxin, Pseudomonas exotoxin A and 40, radioisotopes, and glucocorticoid.
Antibodies to PSEQ may be generated using methods that are well known in the
art. Such
antibodies may include, but are not limited to, polyclonal, monoclonal,
chimeric, and single chain
antibodies, Fab fragments, and fragments produced by a Fab expression library.
Neutralizing antibodies,
such as those which inhibit dimer formation, are especially preferred for
therapeutic use. Monoclonal
antibodies to PSEQ may be prepared using any technique which provides for the
production of antibody
molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma, the
human B-cell hybridoma, and the EBV-hybridoma techniques. In addition,
techniques developed for the
production of chimeric antibodies can be used. (See, e.g., Pound (1998)
Immunochemical Protocols,
Methods Mol Biol, Vol 80). Alternatively, techniques described for the
production of single chain
antibodies may be employed. Antibody fragments which contain specific binding
sites for PSEQ may
also be generated. Various immunoassays may be used to identify antibodies
having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric
assays using either
polyclonal or monoclonal antibodies with established specificities are well
known in the art.
Yet further, an agonist of PSEQ may be administered to a subject to treat or
prevent a disease
associated with decreased expression, longevity or activity of PSEQ.
An additional aspect of the invention relates to the administration of a
pharmaceutical or sterile
composition, in conjunction with a pharmaceutically acceptable carrier, for
any of the therapeutic
applications discussed above. Such pharmaceutical compositions may consist of
PSEQ or antibodies,
mimetics, agonists, antagonists, or inhibitors of the polypeptide. The
compositions may be administered
alone or in combination with at least one other agent, such as a stabilizing
compound, which may be
administered in any sterile, biocompatible pharmaceutical carrier including,
but not limited to, saline,
buffered saline, dextrose, and water. The compositions may be administered to
a subject alone or in
combination with other agents, drugs, or hormones.
The pharmaceutical compositions utilized in this invention may be administered
by any number
of routes including, but not limited to, oral, intravenous, intramuscular,
intra-arterial, intramedullary,
intrathecal, intraventricular, transdermal, subcutaneous, intraperitoneal,
intranasal, enteral, topical,
sublingual, or rectal means.
In addition to the active ingredients, these pharmaceutical compositions may
contain suitable
pharmaceutically-acceptable carriers comprising excipients and auxiliaries
which facilitate processing of
the active compounds into preparations which can be used pharmaceutically.
Further details on
techniques for formulation and administration may be found in the latest
edition of Remington's
Pharmaceutical Sciences (Mack Publishing, Easton PA).
For any compound, the therapeutically effective dose can be estimated
initially either in cell
culture assays or in animal models such as mice, rats, rabbits, dogs, or pigs.
An animal model may also
be used to determine the appropriate concentration range and route of
administration. Such information
can then be used to determine useful doses and routes for administration in
humans.
A therapeutically effective dose refers to that amount of active ingredient
which ameliorates the
symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical
procedures in cell cultures or with experimental animals, such as by
calculating and contrasting the ED50
(the dose therapeutically effective in 50% of the population) and LD50 (the
dose lethal to 50% of the
population) statistics. Any of the therapeutic compositions described above
may be applied to any subject
in need of such therapy, including, but not limited to, mammals such as dogs,
cats, cows, horses, rabbits,
monkeys, and most preferably, humans.
EXAMPLES
It is to be understood that this invention is not limited to the particular
devices, machines,
materials and methods described. Although particular embodiments are
described, equivalent
embodiments may be used to practice the invention. The described embodiments
are not intended to limit
the scope of the invention which is limited only by the appended claims. The
examples below are
provided to illustrate the subject invention and are not included for the
purpose of limiting the invention.
I cDNA Library Construction
The COLNTUT16 cDNA library, in which Incyte clone 2790708 was discovered, was
constructed from colon tumor tissue obtained from a 60 year-old Caucasian male
during a left
hemicolectomy. Pathology indicated an invasive grade 2 adenocarcinoma, a
sessile mass located three
cm from the distal margin. The tumor extended through the submucosa and
superficially into the
muscularis propria. The margins of resection were free of involvement. One of
nine regional lymph
nodes contained metastatic adenocarcinoma. The patient presented with blood in
the stool and a change
in bowel habits. Patient history included thrombophlebitis, inflammatory
polyarthropathy, prostatic
inflammatory disease, and depressive disorder. Previous surgeries included
resection of the rectum, a
vasectomy, and exploration of the spinal canal. Family history included a
malignant colon neoplasm in a
sibling. The COLNNOT08 cDNA library in which Incyte clone 1843578 was
discovered is from the
same patient.
The frozen tissue was homogenized and lysed in TRIZOL reagent (1 gm tissue/10
ml TRIZOL;
Life Technologies), a monophasic solution of phenol and guanidine
isothiocyanate, using a Polytron
homogenizer (PT-3000; Brinkmann Instruments, Westbury NY). After a brief
incubation on ice,
chloroform was added (1:5 v/v), and the lysate was centrifuged. The chloroform
layer was removed to a
fresh tube, and the RNA extracted with isopropanol, resuspended in DEPC-
treated water, and treated with
DNase for 25 min at 37°C. The RNA was re-extracted once with acid
phenol-chloroform pH 4.7 and
precipitated using 0.3M sodium acetate and 2.5 volumes ethanol. The mRNA was
isolated with the
OLIGOTEX kit (Qiagen, Valencia CA) and used to construct the cDNA library.
The mRNA was handled according to the recommended protocols in the SUPERSCRIPT
plasmid
system (Life Technologies). The cDNAs were fractionated on a SEPHAROSE CL4B
column
(Amersham Pharmacia Biotech, Piscataway NJ), and those cDNAs exceeding 400 bp
were ligated into
pINCY 1 plasmid (Incyte Pharmaceuticals, Palo Alto CA). The plasmid was
subsequently transformed
into DH5α competent cells (Life Technologies).
II Isolation and Sequencing of cDNA Clones
Plasmid DNA was released from the cells and purified using the REAL Prep 96
plasmid kit
(Qiagen). This kit enabled the simultaneous purification of 96 samples in a 96-
well block using
multi-channel reagent dispensers. The recommended protocol was employed except
for the following
changes: 1) the bacteria were cultured in 1 ml of sterile Terrific Broth (Life
Technologies) with
carbenicillin at 25 mg/L and glycerol at 0.4%; 2) after inoculation, the
cultures were incubated for 19
hours; at the end of incubation, the cells were lysed with 0.3 ml of lysis
buffer; and 3) following
isopropanol precipitation, the plasmid DNA pellet was resuspended in 0.1 ml of
distilled water, after
which samples were transferred to a 96-well block for storage at 4° C.
The cDNAs were prepared using a MICROLAB 2200 (Hamilton, Reno NV) in
combination with
DNA ENGINE thermal cycler (PTC200; MJ Research, Watertown MA). cDNAs were
sequenced by the
method of Sanger et al. (1975; J Mol Biol 94:441f) using ABI PRISM 377 DNA
sequencing systems
(PE Biosystems) or MEGABACE 1000 sequencing systems (Molecular Dynamics,
Sunnyvale CA).
Most of the sequences disclosed herein were sequenced using standard ABI
protocols and ABI
kits (Cat. Nos. 79345, 79339, 79340, 79357, 79355; PE Biosystems). The
solution volumes were used at
0.25x-1.0x concentrations. Some of the sequences disclosed herein were
sequenced using solutions and
dyes from Amersham Pharmacia Biotech.
III Selection, Assembly, and Characterization of Sequences
The sequences used for coexpression analysis were assembled from EST
sequences, 5' and 3'
longread sequences, and full length coding sequences. Selected assembled
sequences were expressed in
at least three cDNA libraries.
The assembly process is described as follows. EST sequence chromatograms were
processed and
verified. Quality scores were obtained using PHRED (Ewing et al. (1998) Genome
Res 8:175-185;
Ewing and Green (1998) Genome Res 8:186-194), and edited sequences were loaded
into a relational
database management system (RDBMS). The sequences were clustered using BLAST
with a product
score of 50. All clusters of two or more sequences created a bin, and each bin
with its resident sequences
represents one transcribed gene.
Assembly of the component sequences within each bin was performed using a
modification of
Phrap, a publicly available program for assembling DNA fragments (Green,
University of Washington,
Seattle WA). Bins that showed 82% identity from a local pair-wise alignment
between any of the
consensus sequences were merged.
Bins were annotated by screening the consensus sequence in each bin against
public databases,
such as GBpri and GenPept from NCBI. The annotation process involved a FASTn
screen against the
gbpri database in GenBank. Those hits with a percent identity of greater than
or equal to 75% and an
alignment length of greater than or equal to 100 base pairs were recorded as
homolog hits. The residual
unannotated sequences were screened by FASTx against GenPept. Those hits with
an E value of less
than or equal to 10^-8 were recorded as homolog hits.
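The two annotation cutoffs described above can be expressed as simple predicates. This is a sketch only: it assumes the hit statistics have already been parsed from the FASTn and FASTx output, it takes 1e-8 as the intended E-value threshold, and the function names are hypothetical.

```python
def is_nucleotide_homolog(percent_identity, alignment_length_bp,
                          min_identity=75.0, min_length_bp=100):
    """Nucleotide screen: identity >= 75% over an alignment of >= 100 bp."""
    return (percent_identity >= min_identity
            and alignment_length_bp >= min_length_bp)


def is_protein_homolog(e_value, max_e_value=1e-8):
    """Protein screen: E value at or below the assumed 1e-8 cutoff."""
    return e_value <= max_e_value


print(is_nucleotide_homolog(81.5, 240))  # True
print(is_nucleotide_homolog(92.0, 60))   # False: alignment too short
print(is_protein_homolog(3e-12))         # True
print(is_protein_homolog(1e-5))          # False
```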
Sequences were then reclustered using BLASTn and Cross-Match, a program for
rapid protein
and nucleic acid sequence comparison and database search (Green, supra),
sequentially. Any BLAST
alignment between a sequence and a consensus sequence with a score greater
than 150 was realigned
using Cross-Match. The sequence was added to the bin whose consensus sequence
gave the highest
Smith-Waterman score (Smith et al. supra) amongst local alignments with at
least 82% identity. Non-
matching sequences were moved into new bins, and assembly processes were
performed for the new bins.
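The bin-assignment rule in the reclustering step can be sketched as follows. The sketch assumes the realignment results are already available as tuples; the function name `assign_to_bin` and the tuple layout are hypothetical, and no alignment is performed here.

```python
def assign_to_bin(alignments, min_identity=82.0):
    """Pick a bin for one sequence from its realigned local alignments.

    alignments: list of (bin_id, smith_waterman_score, percent_identity).
    Returns the bin_id with the highest Smith-Waterman score among the
    alignments with at least `min_identity` percent identity, or None when
    nothing qualifies (the sequence would then be moved into a new bin).
    """
    qualifying = [a for a in alignments if a[2] >= min_identity]
    if not qualifying:
        return None
    return max(qualifying, key=lambda a: a[1])[0]


# binB has the highest score but fails the identity floor, so binC wins.
print(assign_to_bin([("binA", 310, 90.0), ("binB", 420, 79.0), ("binC", 350, 85.5)]))
```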
IV Coexpression Analyses of Known Colon Cancer Genes
Fourteen known colon cancer genes were selected to identify novel genes that
are closely
CA 02361743 2001-07-31
WO 00/50588 PCT/US00/02595
associated with diseases of the colon. These known genes were carbonic
anhydrase I, II, and IV, carcinoembryonic antigen family of proteins,
colorectal carcinoma tumor-associated antigen, down-regulated in adenoma,
fatty-acid binding protein, galectin, glutathione peroxidase, guanylin,
cytokeratin 8 and 20, cadherin, and intestinal mucin. The colon cancer genes
which were examined in this analysis and brief descriptions of their
functions are listed in Table 4.
TABLE 4
GENE              DESCRIPTION AND REFERENCES
CA I, II, and IV  Carbonic anhydrase I, II, and IV
                  Isoenzymes in colorectal mucosa, differentially expressed in colon
                  cancer (Mori et al. (1993) Gastroenterology 105:820-6)
CEA               Carcinoembryonic antigen family of proteins
                  Cell adhesion glycoprotein, diagnostic marker for colon cancer,
                  prognostic for survival from colon cancer (Carpelan-Holmstrom et al.
                  (1996) Dis Colon Rectum 39:799-805; Harrison et al. (1997) J Am Coll
                  Surg 185:55-59; Graham et al. (1998) Ann Surg 228:59-63)
CO-029            CO-029 colorectal carcinoma tumor-associated antigen
                  Cell surface glycoprotein (Seta et al. (1989) Hybridoma 8:481-491;
                  Szala et al. (1990) Proc Natl Acad Sci 87:6833-6837)
DRA               Down-regulated in adenoma (DRA)
                  Anion transporter expressed predominantly in colon mucosa, expression
                  decreased in colon tumors, marker for progression of colon tumor
                  (Schweinfest et al. (1993) Proc Natl Acad Sci 90:4166-4170; Byeon et
                  al. (1996) Oncogene 12:387-396; Antalis et al. (1998) Clin Cancer Res
                  4:1857-1863)
FABP              Fatty-acid binding protein
                  Hydrophobic ligand-binding protein expressed in liver and intestines,
                  differentially expressed in colon and other cancers (Davidson et al.
                  (1993) Lab Invest 68:663-675; Khan (1994) Proc Natl Acad Sci
                  91:848-852; Gromova et al. (1998) Int J Oncol 13:379-383)
Galec             Galectin family (Alternate name: IgE-binding protein)
                  Modulate cell adhesion, cell proliferation, and cell death,
                  differentially expressed in colon cancer including the metastatic
                  phase (Sanjuan et al. (1997) Gastroenterology 113:1906-15; Bresalier
                  et al. (1998) Gastroenterology 115:287-296; Perillo et al. (1998)
                  J Mol Med 76:402-412)
Gpx2              Glutathione peroxidase
                  Anti-oxidant, differentially expressed in colon cancers (Jendryczko
                  et al. (1993) Neoplasma 40:107-109; Bravard et al. (1994) Int J
                  Cancer 59:843-7; Beno et al. (1995) Neoplasma 42:265-9)
Guan              Guanylin
                  Regulates chloride transport in epithelial tissues such as colon and
                  shows decreased expression in colorectal adenocarcinoma (Cohen et al.
                  (1998) Lab Invest 78:101-108)
ker 8 and 20      Cytokeratin 8 and 20
                  Cytoskeleton filaments and serum markers for colon cancer including
                  the metastatic phase (Funaki et al. (1997) Life Sci 60:643-652;
                  Nakamori et al. (1997) Dis Colon Rectum 40:S29-36)
Cadher            Cadherin family
                  Cell adhesion proteins and differentiation markers which are
                  differentially expressed in colon and other cancers (Breen et al.
                  (1995) Ann Surg Oncol 2:378-385; Eckert et al. (1997) Anticancer Res
                  17:7-12; Kreft et al. (1997) J Cell Biol 136:1109-1121; Efstathiou et
                  al. (1998) Proc Natl Acad Sci 95:3122-3127)
MUC-2             Intestinal mucin
                  Expression decreased in majority of colorectal carcinomas (Ho et al.
                  (1996) Oncol Res 8:53-61; Hanski et al. (1997) J Pathol 182:385-391;
                  Hanski et al. (1997) Lab Invest 77:685-95)
From a total of 41,419 assembled gene sequences, we have identified seven
novel genes that
show strong association with 14 known colon cancer genes. Initially, the
degree of association was
measured by probability values using a cutoff p value less than 0.00001. The
sequences were further
examined to ensure that the genes that passed the probability test had strong
association with known colon
cancer genes. The process was reiterated so that the initial 41,419 genes were
reduced to the final seven
colon disease associated genes. Details of the expression patterns for the 14
known and seven novel
colon disease genes are presented in Tables 5 and 6.
Table 5 Co-Expression of the 14 Known Colon Cancer Genes (-log p)
             1   2   3   4   5   6   7   8   9  10  11  12  13
Guan     1
Cadher   2   7
CA IV    3   6   3
FABP     4  13   8   4
Galec    5  10  13   7  17
CO-029   6   7  11   3  13  23
DRA      7  13  10  10  20  21  17
MUC-2    8  13   5   8  18  18  12  15
CA I     9  15   4   5  11   7   5   9   8
CEA     10  10  13   4  18  24  20  18  15   8
Gpx2    11   8  12   5  16  25  19  15  11   6  21
CA II   12   6   5   4   8  11   4  12   6   7   7   7
ker20   13  14  10   7  16  21  19  18  16  10  24  18   7
ker8    14   4   5   3   8  17  12   9   7   3  12  17   3   8
Table 6 Co-Expression of Seven Novel Genes and 14 Known Colon Cancer Genes (-log p)
Clone    Guan  Cadher  CA IV  FABP  Galec  CO-029  DRA  MUC-2  CA I  CEA  Gpx2  CA II  ker20  ker8
2790708     8       4      3     5      6       3    8      3     4    4     5      3      4     2
1961467     2       3      1     4      4       2    8      4     2    3     4      3      3     3
1580553     5       4      6    12     12       8   10     15     5   13    12      4     15     5
2296694     2       3      3     2      7       9    2      1     1    6     7      1      3    16
1843578    10       5      3     7      6       3    8      7     8    5     4      5      8     2
2516888    14       6      6    20     21      13   17     16     8   14    14      7     15     8
3235282    10       8      5    12     16      12   17     10     9   14    18      8     15     7
We examined genes that are coexpressed with the 14 known colon cancer genes,
and identified
seven novel genes that are strongly coexpressed. Each of the seven novel genes
is coexpressed with at least one of the 14 known genes with a p-value of less
than 10^-5. The coexpression of the seven novel genes with the 14 known genes
is shown in Table 6. The entries in Table 6 are the negative log of the
p-value (-log p) for the coexpression of the two genes. The novel genes
identified are listed in the table by their Incyte clone numbers, and the
known genes, by their abbreviated names as shown in Example V.
For convenience, all the genes in Table 5 are assigned an identifying
number, 1 to 14.
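The statement that every novel gene has at least one strongly coexpressed partner can be checked directly against the Table 6 entries. This is a minimal sketch under stated assumptions: the logarithm is taken to be base 10, so an entry above 5 corresponds to p < 0.00001; the truncated column headings are expanded to the Table 5 abbreviations; and the garbled entries "I" and "l4" are read as 1 and 14. The names `TABLE_6` and `strong_partners` are hypothetical.

```python
KNOWN_GENES = ["Guan", "Cadher", "CA IV", "FABP", "Galec", "CO-029", "DRA",
               "MUC-2", "CA I", "CEA", "Gpx2", "CA II", "ker20", "ker8"]

# -log p values per Incyte clone, in the column order of KNOWN_GENES.
TABLE_6 = {
    "2790708": [8, 4, 3, 5, 6, 3, 8, 3, 4, 4, 5, 3, 4, 2],
    "1961467": [2, 3, 1, 4, 4, 2, 8, 4, 2, 3, 4, 3, 3, 3],
    "1580553": [5, 4, 6, 12, 12, 8, 10, 15, 5, 13, 12, 4, 15, 5],
    "2296694": [2, 3, 3, 2, 7, 9, 2, 1, 1, 6, 7, 1, 3, 16],
    "1843578": [10, 5, 3, 7, 6, 3, 8, 7, 8, 5, 4, 5, 8, 2],
    "2516888": [14, 6, 6, 20, 21, 13, 17, 16, 8, 14, 14, 7, 15, 8],
    "3235282": [10, 8, 5, 12, 16, 12, 17, 10, 9, 14, 18, 8, 15, 7],
}


def strong_partners(scores, cutoff=5):
    """Return the known genes whose -log p exceeds the cutoff."""
    return [gene for gene, s in zip(KNOWN_GENES, scores) if s > cutoff]


for clone, scores in TABLE_6.items():
    print(clone, strong_partners(scores))
```

Under these assumptions each clone has at least one partner above the cutoff; clone 1961467 has only one (DRA).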
V Novel Genes Associated with Colon Diseases
Using the co-expression analysis method, we have identified seven novel genes
that exhibit
strong association, or co-expression, with 14 known colon cancer genes.
Nucleic acids comprising the consensus sequences of SEQ ID NOs:1-7 of the
present invention were first identified from Incyte Clones 1580553, 1843578,
1961467, 2296694, 2516888, 2790708, and 3235282, respectively, and assembled
according to Example III. BLAST and other motif searches were performed for
SEQ ID NOs:1-7 according to Example VII. SEQ ID NOs:1-7 were translated and
sequence identity was sought via comparison to known sequences. SEQ ID NOs:8
and 9 of the present invention were encoded by the nucleic acids of SEQ ID
NOs:6 and 7, respectively. SEQ ID NOs:8 and 9 were also analyzed using BLAST
and other motif search tools as disclosed in Example VI. Analysis of the
novel genes is as follows.
SEQ ID NO:1 (Incyte clone 1580553) is 219 nucleotides in length and has about
74% identity to the nucleic acid sequence of a mouse mucin glycoprotein
(g2583092). SEQ ID NO:2 (Incyte clone 2296694) is 252 nucleotides in length
and has no known homologs in any of the public databases described in this
application. SEQ ID NO:3 (Incyte clone 2516888) is 285 nucleotides in length
and has no known homologs in any of the public databases described in this
application. SEQ ID NO:4 (Incyte clone 2790708) is 1010 nucleotides in length
and has about 56% identity to the nucleic acid sequence from nucleotide 107789
to nucleotide 108777 of human chromosome 9 (g2564750). SEQ ID NO:5 (Incyte
clone 3235282) is 2616 nucleotides in length and has about 64% identity to the
nucleic acid sequence encoding a mouse calcium sensitive chloride conductance
protein (g3925280) and 70% identity to a partial cDNA of a colon specific
gene, CSGS, which is 878 nucleotides long. SEQ ID NO:6 (Incyte clone 1843578)
is 795 nucleotides in length and has about 64% identity to a nucleic acid
sequence encoding a mouse calcium sensitive chloride conductance protein
(g3925280). SEQ ID NO:7 (Incyte clone 1961467) is 2225 nucleotides in length
and has about 6% identity to human gene signature HUMGS07792. SEQ ID NO:8 has
115 amino acids which are encoded by SEQ ID NO:6 and has no known homologs in
any of the public databases described in this application. Motif analysis of
SEQ ID NO:8 shows a potential phosphorylation site at S83. SEQ ID NO:9 has 90
amino acids which are encoded by SEQ ID NO:7 and has no known homologs in any
of the public databases described in this application. Motif analysis of SEQ
ID NO:9 shows five potential phosphorylation sites at T10, T6, T21,
S66, and S86.
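The site labels can be cross-checked against the residues of SEQ ID NO:9 as given in the Sequence Listing. The sketch below only confirms that each labelled position holds the stated residue type; it does not predict phosphorylation. It assumes the first site label reads T10, and the helper names are hypothetical.

```python
# SEQ ID NO:9, three-letter codes, 15 residues per row as in the listing.
SEQ_ID_NO_9 = """
Met Pro Thr Ala Gly Thr Arg Arg Arg Thr Pro Leu Asn Ile His
Ser Cys Thr Trp Gly Thr Pro Arg Ser Pro Pro Gln Pro Ser Pro
Leu Ser Cys Cys Pro Pro Gln Gly Asn Tyr Ile Ala Leu Arg Glu
Pro Pro Gln Gly Leu Leu Cys Gln Ala Pro Ser Pro Pro Thr His
Pro His Phe Gly Thr Ser Ala Arg His Thr Ala Ala Arg Val Gly
Thr Leu Pro Ser Gln Ala Ser Val Ala Trp Ser Trp Arg Arg Gly
""".split()

ONE_TO_THREE = {"S": "Ser", "T": "Thr", "Y": "Tyr"}


def site_matches(site, residues=SEQ_ID_NO_9):
    """Check that a label such as 'T21' names the residue at that position."""
    expected = ONE_TO_THREE[site[0]]
    position = int(site[1:])  # positions are 1-based
    return residues[position - 1] == expected


for site in ["T10", "T6", "T21", "S66", "S86"]:
    print(site, site_matches(site))
```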
VI Homology Searching for Colon Disease Genes and Their Encoded Proteins
The polynucleotide sequences, SEQ ID NOs:1-7, and polypeptide sequences, SEQ
ID NOs:8 and
9, were queried against databases derived from sources such as GenBank and
SwissProt. These
databases, which contain previously identified and annotated sequences, were
searched for regions of
similarity using BLAST (Altschul, supra). BLAST searched for matches and
reported only those that
satisfied the probability thresholds of 10^-25 or less for nucleotide
sequences and 10^-8 or less for
polypeptide sequences.
The polypeptide sequences were also analyzed for known motif patterns using
MOTIFS,
SPSCAN, BLIMPS, and HMM-based protocols. MOTIFS (Genetics Computer Group,
Madison WI)
searches polypeptide sequences for patterns that match those defined in the
Prosite Dictionary of Protein
Sites and Patterns (Bairoch, supra) and displays the patterns found and their
corresponding literature
abstracts. SPSCAN (Genetics Computer Group) searches for potential signal
peptide sequences using a
weighted matrix method (Nielsen et al. (1997) Prot Eng 10:1-6). Hits with a
score of 5 or greater were
considered. BLIMPS uses a weighted matrix analysis algorithm to search for
sequence similarity
between the polypeptide sequences and those contained in BLOCKS, a database
consisting of short amino
acid segments, or blocks of 3-60 amino acids in length, compiled from the
PROSITE database (Henikoff,
supra; Bairoch, supra), and those in PRINTS, a protein fingerprint database
based on non-redundant
sequences obtained from sources such as SwissProt, GenBank, PIR, and NRL-3D
(Attwood et al. (1997)
J. Chem Inf Comput Sci 37:417-424). For the purposes of the present invention,
the BLIMPS searches
reported matches with a cutoff score of 1000 or greater and a cutoff
probability value of 1.0 x 10^-3.
HMM-based protocols were based on a probabilistic approach and searched for
consensus primary
structures of gene families in the protein sequences (Eddy, supra; Sonnhammer,
supra). More than 500
known protein families with cutoff scores ranging from 10 to 50 bits were
selected for use in this
invention.
VII Labeling of Probes and Hybridization Analyses
Blotting
Polynucleotide sequences are isolated from a biological source and applied to
a solid matrix (a
blot) suitable for standard nucleic acid hybridization protocols by one of the
following methods. A
mixture of target nucleic acids is fractionated by electrophoresis through a
0.7% agarose gel in 1x TAE
[40 mM Tris acetate, 2 mM ethylenediamine tetraacetic acid (EDTA)] running
buffer and transferred to a
nylon membrane by capillary transfer using 20x saline sodium citrate (SSC).
Alternatively, the target
nucleic acids are individually ligated to a vector and inserted into bacterial
host cells to form a library.
Target nucleic acids are arranged on a blot by one of the following methods.
In the first method, bacterial
cells containing individual clones are robotically picked and arranged on a
nylon membrane. The
membrane is placed on bacterial growth medium, LB agar containing
carbenicillin, and incubated at 37°C
for 16 hours. Bacterial colonies are denatured, neutralized, and digested with
proteinase K. Nylon
membranes are exposed to UV irradiation in a STRATALINKER UV-crosslinker
(Stratagene, La Jolla
CA) to cross-link DNA to the membrane.
In the second method, target nucleic acids are amplified from bacterial
vectors by thirty cycles of
PCR using primers complementary to vector sequences flanking the insert.
Amplified target nucleic acids
are purified using SEPHACRYL-400 (Amersham Pharmacia Biotech). Purified target
nucleic acids are
robotically arrayed onto a glass microscope slide. The slide was previously
coated with 0.05%
aminopropyl silane (Sigma-Aldrich, St Louis MO) and cured at 110°C. The
arrayed glass slide
(microarray) is exposed to UV irradiation in a STRATALINKER UV-crosslinker
(Stratagene).
Probe Preparation
cDNA probe sequences are made from mRNA templates. Five micrograms of mRNA is
mixed
with 1 µg random primer (Life Technologies), incubated at 70°C for 10
minutes, and lyophilized. The
lyophilized sample is resuspended in 50 µl of 1x first strand buffer (cDNA
Synthesis system; Life
Technologies) containing a dNTP mix, [α-32P]dCTP, dithiothreitol, and MMLV
reverse transcriptase
(Stratagene), and incubated at 42°C for 1-2 hours. After incubation,
the probe is diluted with 42 µl dH2O,
heated to 95°C for 3 minutes, and cooled on ice. mRNA in the probe is
removed by alkaline degradation.
The probe is neutralized, and degraded mRNA and unincorporated nucleotides are
removed using a
PROBEQUANT G-50 MicroColumn (Amersham Pharmacia Biotech). Probes can be
labeled with
fluorescent markers, Cy3-dCTP or Cy5-dCTP (Amersham Pharmacia Biotech), in
place of the
radionuclide, [32P]dCTP.
Hybridization
Hybridization is carried out at 65°C in a hybridization buffer
containing 0.5 M sodium phosphate
(pH 7.2), 7% SDS, and 1 mM EDTA. After the blot is incubated in hybridization
buffer at 65°C for at
least 2 hours, the buffer is replaced with 10 ml of fresh buffer containing
the probe sequences. After
incubation at 65°C for 18 hours, the hybridization buffer is removed,
and the blot is washed sequentially
under increasingly stringent conditions, up to 40 mM sodium phosphate, 1%
SDS, 1 mM EDTA at 65°C.
To detect signal produced by a radiolabeled probe hybridized on a membrane,
the blot is exposed to a
PHOSPHORIMAGER cassette (Molecular Dynamics), and the image is analyzed using
IMAGEQUANT
data analysis software (Molecular Dynamics). To detect signals produced by a
fluorescent probe
hybridized on a microarray, the blot is examined by confocal laser microscopy,
and images are collected
and analyzed using GEMTOOLS gene expression analysis software (Incyte
Pharmaceuticals).
VIII Production of Specific Antibodies
SEQ ID NOs:8-9, or portions thereof, substantially purified using
polyacrylamide gel
electrophoresis or other purification techniques, are used to immunize rabbits
and to produce antibodies
using standard protocols as described in Pound (supra).
Alternatively, the amino acid sequence is analyzed using LASERGENE software
(DNASTAR,
Madison WI) to determine regions of high immunogenicity, and a corresponding
oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in
the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in
hydrophilic regions, are well
described in the art. Typically, oligopeptides 15 residues in length are
synthesized using an ABI 431A
Peptide synthesizer (PE Biosystems) using Fmoc-chemistry and coupled to
keyhole limpet hemocyanin
(KLH, Sigma-Aldrich) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide
ester (Ausubel,
supra) to increase immunogenicity. Rabbits are immunized with the
oligopeptide-KLH complex in
complete Freund's adjuvant. Resulting antisera are tested for antipeptide
activity by, for example, binding
the peptide to plastic, blocking with 1% BSA, reacting with rabbit antisera,
washing, and reacting with
radio-iodinated goat anti-rabbit IgG.
SEQUENCE LISTING
<110> INCYTE PHARMACEUTICALS, INC.
Walker, Michael, G.
Volkmuth, Wayne
Klingler, Tod, M.
Lal, Preeti
<120> GENES ASSOCIATED WITH DISEASES OF THE COLON
<130> PB-0007 PCT
<140> To be assigned
<141> Herewith
<150> 09/255,381
<151> 1999-02-22
<160> 9
<170> PERL Program
<210> 1
<211> 219
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 1580553CB1
<400> 1
caccttctat atctctccag gctcaatgga aacaacatta gccagcacta ccacaacacc 60
aggcctcagt gcaaaatcta ccatccttta cagtagctcc agatcaccag accaaacact 120
ctcacctgcc agcatgagaa gctccagcat cagtggagaa cccaccagct tgtatagcca 180
agcagagtca acacacacaa cagcgttccc tgccagcac 219
<210> 2
<211> 252
<212> DNA
<213> Homo sapiens
<220>
<221> unsure
<222> 201
<223> a or g or c or t, unknown, or other
<220>
<221> misc_feature
<223> Incyte ID No.: 2296694CB1
<400> 2
cttttcagaa ccccagatga gagccaatgt cagataaagt aagcatagca atgtagcagg 60
aactacaata gaagacattt tcactggaat tacaaagcag aattaaaatt atattgtaga 120
aggaaacacc aagaaaagaa tttccaggga aaatcctctt tgcaggtatt aattcttata 180
attttttgtc ttttggataa nctgtttact gcctcatctg aactgatccc aggtgaacgg 240
tttattgcct ag 252
<210> 3
<211> 285
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 2516888CB1
<400> 3
gtggatgaca gggttggcca ccatggagca cctccaggct gacagagttg agacaagaac 60
ccatacctcc taactggcgc cactccaccc aggaggactc agccagccct tgagcacaca 120
gggacacact gctgaacctt atattgactt ccaatatgta tctttgctga gagaatgaat 180
gaaggaatga ttgtcagggg cactgccact gtggggggca tggccatcct ccaggtcact 240
gcggacttac ccctggccat ggcccagggc cctgctgtta ttatc 285
<210> 4
<211> 1010
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 2790708CB1
<400> 4
attttccttt actttttaaa taggttgttg cctcttatat atttattcta tgatgcaaat 60
gtcactatcc taattcctca gtttatgttt aacagcacac agtggcactt ctatgattca 120
aatacatttg ataacctttg aaatcaatca gaatactgca aaattaattt ttctaaaaca 180
atgcttttat cgttatttct cctgttgaat catcagtaca atttccaatt gaaaacactt 240
aaaataatct catattacaa tctttctcta acagaaccat gatgtaagga cagtgataac 300
aaatatctga caatgatatg attatttcct catccatgga aattttcctt aataaactaa 360
agggctattt tctaaaaagc caaagcattg cttacaagaa cttttcatca tgacatggat 420
agacactcag attcatacat tcaaagggaa gtgtcatgta ttccctttca atccacccta 480
ttctattgtg ttatcttcct aaattatttt ctatctacat tcttcattct ctttcccatt 540
gaccctatgt tctgtgtgat aaaaattgcg tcattggagg ctttttaagg ttaagtatta 600
tgccccattt caccattaat caacatacaa cccttctcca tattttgtaa ttcctttcat 660
atacagaaaa aaagatacta taatttcttc aaaatgcttg atattaatga tatatgggaa 720
aacaattatt ttgtgcagca atcttcagat aactgggaaa ggccggggaa aaagagagat 780
actggtggtt atcaatgacc catgtataaa ttgtttttat tatgtaagct gtcttcacaa 840
atgtcttctt atgtatgatc attagaactg ttttatatat atatgtaaaa tttccacatt 900
atcgagacat tactttcagc agtgaagtaa tcctttttta actgccactt aatgaattca 960
ataaaatata atttattgta ttttgctata ataaactatt gatgactatt 1010
<210> 5
<211> 2616
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 3235282CB1
<400> 5
aaaaatcgaa gcaacaaggt gttccgcagt atctctggta gaaatagagt ttataagtgt 60
caaggaggca gctgtcttag tagagcatgc agaattgatt ctacaacaaa actgtatgga 120
aaagattgtc aattctttcc tgataaagta caaacagaaa aagcatccat aatgtttatg 180
caaagtattg attctgttgt tgaattttgt aacgaaaaaa cccataatca agaagctcca 240
agcctacaaa acataaagtg caattttaga agtacatggg aggtgattag caattctgag 300
gattttaaaa acaccatacc catggtgaca ccacctcctc cacctgtctt ctcattgctg 360
aagatcagtc aaagaattgt gtgcttagtt cttgataagt ctggaagcat ggggggtaag 420
gaccgcctaa atcgaatgaa tcaagcagca aaacatttcc tgctgcagac tgttgaaaat 480
ggatcctggg tggggatggt tcactttgat agtactgcca ctattgtaaa taagctaatc 540
caaataaaaa gcagtgatga aagaaacaca ctcatggcag gattacctac atatcctctg 600
ggaggaactt ccatctgctc tgaaattaaa tatacatttc aggtgattgg agagctacat 660
_cccaactcg atggatccga agtactgctg ctgactgatg gggaggataa cactgcaagt 720
tcttgtattg atgaagtgaa acaaagtggg gccattgttc attttattgc tttgggaaga 780
gctgctgatg aagcagtaat agagatgagc aagataacag gaggaagtca tttttatgtt 840
tcagatgaag ctcagaacaa tggcctcatt gatgcttttg gggctcttac atcaggaaat 900
actgatctct cccagaagtc ccttcagctc gaaagtaagg gattaacact gaatagtaat 960
gcctggatga acgacactgt cataattgat agtacagtgg gaaaggacac gttctttctc 1020
atcacatgga acagtctgcc tcccagtatt tctctctggg atcccagtgg aacaataatg 1080
gaaaatttca cagtggatgc aacttccaaa atggcctatc tcagtattc~ aggaactgca 1140
aaggtgggca cttgggcata caatcttcaa gccaaagcga acccagaaac attaactatt 1200
acagtaactt ctcgagcagc aaattcttct gtgcctccaa tcacagtgaa tgctaaaatg 1260
aataaggacg taaacagttt ccccagccca atgattgttt acgcagaaat tctacaagga 1320
tatgtacctg ttcttggagc caatgtgact gctttcattg aatcacagaa tggacataca 1380
gaagttttgg aacttttgga taatggtgca ggcgctgatt ctttcaagaa tgatggagtc 1440
tactccaggt attttacagc atatacagaa aatggcagat atagcttaaa agttcgggct 1500
catggaggag caaacactgc caggctaaaa ttacggcctc cactgaatag agccgcgtac 1560
ataccaggct gggtagtgaa cggggaaatt gaagcaaacc cgccaagacc tgaaattgat 1620
gaggatactc agaccacctt ggaggatttc agccgaacag catccggagg tgcatttgtg 1680
gtatcacaag tcccaagcct tcccttgcct gaccaatacc caccaagtca aatcacagac 1740
cttgatgcca cagttcatga ggataagatt attcttacat ggacagcacc aggagataat 1800
tttgatgttg gaaaagttca acgttatatc ataagaataa gtgcaagtat tcttgatcta 1860
agagacagtt ttgatgatgc tcttcaagta aatactactg atctgtcacc aaaggaggcc 1920
aactccaagg aaagctttgc atttaaacca gaaaatatct cagaagaaaa tgcaacccac 1980
atatttattg ccattaaaag tatagataaa agcaatttga catcaaaagt atccaacatt 2040
gcacaagtaa ctttgtttat ccctcaagca aatcctgatg acattgatcc tacacctact 2100
cctactccta ctcctactcc tgataaaagt cataattctg gagttaatat ttctacgctg 2160
gtattgtctg tgattgggtc tgttgtaatt gttaacttta ttttaagtac caccatttga 2220
accttaacga agaaaaaaat cttcaagtag acctagaaga gagttttaaa aaacaaaaca 2280
atgtaagtaa aggatatttc tgaatcttaa aattcatccc atgtgtgatc ataaactcat 2340
aaaaataatt ttaagatgtc ggaaaaggat actttgatta aataaaaaca ctcatggata 2400
tgtaaaaact gtcaagatta aaatttaata gtttcattta tttgttattt tatttgtaag 2460
aaatagtgat gaacaaagat cctttttcat actgatacct ggttgtatat tatttgatgc 2520
aacagttttc tgaaatgata tttcaaattg catcaagaaa ttaaaatcat ctatctgagt 2580
agtcaaaata caagtaaagg agagcaaata aacatc 2616
<210> 6
<211> 795
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 1843578CB1
<400> 6
aggagaccca ggggtcccag agctgggctg gcgggaggcg taatccggcg gggtgagggt 60
tgatcgaaga gccccgcgcg cactgccgct cacagcccct tcccgagtgc agagcgggca 120
gagaagtcca ctgcttttaa ggccctgcac tgaaaatgca agctcaggcg ccggtggtcg 180
ttgtgaccca acctggagtc ggtcccggtc cggcccccca gaactccaac tggcagacag 240
gcatgtgtga ctgtttcagc gactgcggag tctgtctctg tggcacattt tgtttcccgt 300
gccttgggtg tcaagttgca gctgatatga atgaatgctg tctgtgtgga acaagcgtcg 360
caatgaggac tctctacagg acccgatatg gcatccctgg atctatttgt gatgactata 420
tggcaactct ttgctgtcct cattgtactc tttgccaaat caagagagat atcaacagaa 480
ggagagccat gcgtactttc taaaaactga tggtgaaaag ctcttaccga agcaacaaaa 540
ttcagcagac acctcttcag cttgagttct tcaccatctt ttgcaactga aatatgatgg 600
atatgcttaa gtacaactga tggcatgaaa aaaatcaaat ttttgattta ttataaatga 660
atgttgtccc tgaacttagc taaatggtgc aacttagttt ctccttgctt tcatattatc 720
gaatttcctg gcttataaac tttttaaatt acatttgaaa tataaaccaa atgaaatatt 780
aaaaaaaaaa aaaaa 795
<210> 7
<211> 2225
<212> DNA
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 1961467CB1
<400> 7
gttcgggtcc tcggaccaca ctctggtttt ctatgctgtt ttggtgcaag tacaactg~c 60
gtagtcatgg ctttaggagc aataggattt taataaacag aacccatccc aaagccat~a 120
ctacgacagt tgtacttgca ccaaaacagc atagaaaacc agagtgtggt gggaggaccc 180
gaagccggtt gggggaggat gtgagtaggg gcctggaggg tgcagggtca ttaatctg~~ 240
gggagaacat tgtgctttag cccagggagg ggaggggtgg ggcaaatgca ccgaggtccc 300
cactttttcc tgctgccctc ggcaccctgg ggatgcaggc atctgggcac atctgcccc= 360
tattgctgcc caccagcgtt aaacgccccc gatcccaaca ctagcaccac aggtggttcc 420
ggggcaggga gaggcaggaa tgggaaaatt gcttagagaa agattccact agaatccact 480
gaattgtgct cagttctctt tacttcctac aaccgagtac atgggtcaca gggtggaggg 540
tgcaacagga catggaacat gcccctccgt gccccccaac acacacctgc acacagga~~ 600
gtggtgtctg cagcatcaca ggtcatgcag ggcatgggga aggggaggtt cacacacaca 660
tagatgccca cagcgggtac cagacggaga acacccctga atatacatag ctgtacatgg 720
ggaaccccca ggtccccacc ccaaccctct cccctgtctt gctgtccccc gcaggggaac 780
tatattgctt tgagagagcc accccagggg ctgctctgcc aggcaccctc ccctcccacc 840
cacccccatt ttggcacatc tgcaagacac acagcagcga gagtaggcac cctcccttcc 900
caggcttctg tggcctggag ctggagaagg gggtaggaga cttcatcctc catcctcccc 960
taacccttcc caaacccctg ccaaacccac tcaagccaga acccaccccc acccccca~~ 1020
cacacataca aagctgagct atccaggaac acaagggaaa caaggagatt gtccagggtg 1080
ggagcggagg cagcggggga agaagactgg aagcagagac ctcccccctt gtggggggca 1140
gactggcaca acagctactt tagtgcaatt ggagagggtg cccagagtga gaggtggaga 1200
agggagggaa ggcggtcccc aacttccctg ggggcaaagt caggcttcca gattccccag 1260
ggaaagggcc tagcaggagt gggtgagggc caaggtggat cctctggtta cccgccaccc 1320
tctgccctcc caaatgcagt gacagtgtcc ccctcacacc taagtgggca acagcagcct 1380
tggagtcagt accttcaagt aattcaaaga gcagaccctc cccaccccag cttcacccc~ 1440
tctctgggat ttggtcgctt ctctaggggt tgggttggga ggagggagcc cccaaggcag 1500
acccttccct ctctacctcc cgattcccag accactgggc ttggtcctca aagattcctc 1560
acctccgccc ttgcccaacc tgggtcaagg ctgcagaagg ctggagccac cacaattaga 1620
ggggaagggg ctgctttgtt ccttatccct ccttcttaaa aggtagggtt caaactaggc 1680
gggatggggg cccatactgg tttgccccag gagtagggtt tctgggctag ggtctgtaag 1740
gctattttcc tttgcggtgg gaaggggagg taggggatga acactgggta tgggaagtgg 1800
gtgagaaatg gctgagaggg aaggaggaag gggcctcccc gctggagcag tcactggagt 1860
catttagaca aaaacactca tgtgcataag atacacagtg cgcaaactca gccctgccag 1920
cccggcccca atcccacctc tcaggactcc ttccaagacc ctggaggagg ttctggggat 1980
acagctgtag aaccgttcac tctggcccca tccaccccac ctccagcctc ttctcccctt 2040
ctaggtccag ggagtaagaa ggtgctcggg tgggcagaca gtggtggaaa cagtattgag 2100
ttttcctttg gttacatatt gaaggcaaag gtgagctgga cttacagtca aaacggatag 2160
gggtgaggaa ggaagagggg ccatggctgg ggttggagag ggaggtaggc cctcgtcagc 2220
ccctc 2225
<210> 8
<211> 115
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 1843578CD1
<400> 8
Met Gln Ala Gln Ala Pro Val Val Val Val Thr Gln Pro Gly Val
1 5 10 15
Gly Pro Gly Pro Ala Pro Gln Asn Ser Asn Trp Gln Thr Gly Met
20 25 30
Cys Asp Cys Phe Ser Asp Cys Gly Val Cys Leu Cys Gly Thr Phe
35 40 45
Cys Phe Pro Cys Leu Gly Cys Gln Val Ala Ala Asp Met Asn Glu
50 55 60
Cys Cys Leu Cys Gly Thr Ser Val Ala Met Arg Thr Leu Tyr Arg
65 70 75
Thr Arg Tyr Gly Ile Pro Gly Ser Ile Cys Asp Asp Tyr Met Ala
80 85 90
Thr Leu Cys Cys Pro His Cys Thr Leu Cys Gln Ile Lys Arg Asp
95 100 105
Ile Asn Arg Arg Arg Ala Met Arg Thr Phe
110 115
<210> 9
<211> 90
<212> PRT
<213> Homo sapiens
<220>
<221> misc_feature
<223> Incyte ID No.: 1961467CD1
<400> 9
Met Pro Thr Ala Gly Thr Arg Arg Arg Thr Pro Leu Asn Ile His
1 5 10 15
Ser Cys Thr Trp Gly Thr Pro Arg Ser Pro Pro Gln Pro Ser Pro
20 25 30
Leu Ser Cys Cys Pro Pro Gln Gly Asn Tyr Ile Ala Leu Arg Glu
35 40 45
Pro Pro Gln Gly Leu Leu Cys Gln Ala Pro Ser Pro Pro Thr His
50 55 60
Pro His Phe Gly Thr Ser Ala Arg His Thr Ala Ala Arg Val Gly
65 70 75
Thr Leu Pro Ser Gln Ala Ser Val Ala Trp Ser Trp Arg Arg Gly
80 85 90
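Each record in the listing declares its length under <211>, which gives a simple consistency check on the transcribed residues. The following is a minimal sketch using SEQ ID NO:1; the helper name `read_residues` is hypothetical, and it assumes each sequence line ends with a running position count, as in the listing above.

```python
def read_residues(lines):
    """Join sequence lines, dropping the running position count at line end."""
    parts = []
    for line in lines:
        tokens = line.split()
        if tokens and tokens[-1].isdigit():
            tokens = tokens[:-1]  # trailing number is a position, not sequence
        parts.append("".join(tokens))
    return "".join(parts)


# SEQ ID NO:1 as printed in the listing (<211> 219).
SEQ_ID_NO_1 = [
    "caccttctat atctctccag gctcaatgga aacaacatta gccagcacta ccacaacacc 60",
    "aggcctcagt gcaaaatcta ccatccttta cagtagctcc agatcaccag accaaacact 120",
    "ctcacctgcc agcatgagaa gctccagcat cagtggagaa cccaccagct tgtatagcca 180",
    "agcagagtca acacacacaa cagcgttccc tgccagcac 219",
]

sequence = read_residues(SEQ_ID_NO_1)
print(len(sequence))
```

The same check applied to a record with damaged characters would flag a mismatch between the declared and observed lengths or an unexpected symbol.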


Administrative Status

2024-08-01:As part of the Next Generation Patents (NGP) transition, the Canadian Patents Database (CPD) now contains a more detailed Event History, which replicates the Event Log of our new back-office solution.

Please note that "Inactive:" events refer to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Event History, Maintenance Fee and Payment History should be consulted.

Event History

Description Date
Inactive: IPC expired 2018-01-01
Inactive: IPC from MCD 2006-03-12
Application Not Reinstated by Deadline 2006-02-01
Time Limit for Reversal Expired 2006-02-01
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2005-02-01
Inactive: Abandon-RFE+Late fee unpaid-Correspondence sent 2005-02-01
Inactive: Correspondence - Transfer 2002-08-16
Inactive: Office letter 2002-08-16
Letter Sent 2002-08-09
Inactive: Single transfer 2002-05-28
Inactive: Office letter 2002-03-11
Inactive: Notice - National entry - No RFE 2002-01-26
Inactive: Courtesy letter - Evidence 2001-12-24
Inactive: Cover page published 2001-12-13
Inactive: First IPC assigned 2001-12-09
Inactive: Applicant deleted 2001-12-07
Application Received - PCT 2001-11-26
Application Published (Open to Public Inspection) 2000-08-31

Abandonment History

Abandonment Date Reason Reinstatement Date
2005-02-01

Maintenance Fee

The last payment was received on 2004-01-23

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Fee History

Fee Type Anniversary Year Due Date Paid Date
Basic national fee - standard 2001-07-31
Registration of a document 2001-10-18
MF (application, 2nd anniv.) - standard 02 2002-02-01 2002-01-30
Registration of a document 2002-05-28
MF (application, 3rd anniv.) - standard 03 2003-02-03 2003-01-24
MF (application, 4th anniv.) - standard 04 2004-02-02 2004-01-23
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
INCYTE GENOMICS, INC.
Past Owners on Record
MICHAEL G. WALKER
PREETI LAL
TOD M. KLINGLER
WAYNE VOLKMUTH
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Description 2001-07-30 28 1,675
Claims 2001-07-30 3 173
Abstract 2001-07-30 1 54
Reminder of maintenance fee due 2001-12-09 1 112
Notice of National Entry 2002-01-25 1 193
Request for evidence or missing transfer 2002-07-31 1 109
Courtesy - Certificate of registration (related document(s)) 2002-08-08 1 134
Reminder - Request for Examination 2004-10-03 1 121
Courtesy - Abandonment Letter (Request for Examination) 2005-04-11 1 166
Courtesy - Abandonment Letter (Maintenance Fee) 2005-03-28 1 174
PCT 2001-07-30 7 266
Correspondence 2001-12-18 1 24
Correspondence 2002-03-11 1 21
PCT 2001-07-31 6 263
Correspondence 2002-08-15 1 13
