Note: Descriptions are shown in the official language in which they were submitted.
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
NUCLEOTIDE POLYMORPHISMS ASSOCIATED WITH OSTEOPOROSIS
A portion of the disclosure of this patent document contains material which is
subject
to copyright protection. The copyright owner has no objection to the facsimile
reproduction
by anyone of the patent document or the patent disclosure, as it appears in
the Patent and
Trademark Office patent file or records, but otherwise reserves all copyright
rights
whatsoever.
TECHNICAL FIELD
The invention relates in general to polymorphisms in genes associated with
susceptibility to low bone mineral density and bone remodeling and methods of
identifying
individuals having a gene containing a polymorphism associated with
osteoporosis. The
invention also relates to a method of detecting an increased susceptibility to
a disease in an individual resulting from the presence of a polymorphism or mutation in the
gene coding sequence of an osteoporosis and bone remodeling associated gene.
BACKGROUND
Single nucleotide substitutions and small unique insertions and deletions are
the most
frequent form of DNA polymorphism and disease-causing mutation in the human
genome.
These DNA sequence variations, called single nucleotide polymorphisms (SNPs),
have
gained popularity and have been proposed as the genetic markers of choice for
the study of
complex genetic traits (Collins et al. 1997 Science 278: 1580-1581; Risch and Merikangas
1996 Science 273: 1516-1517). Despite the fact that on average approximately
one
nucleotide position in every 1000 bases along the human chromosome is
estimated to differ
between any two copies of the chromosome (Cooper et al. 1985 Human Genetics 69: 201-205;
Kwok et al. 1996 Genomics 31: 123-126), developing SNP markers is not easy.
It has been suggested that association studies (such as linkage disequilibrium studies)
with a set of single nucleotide polymorphism (SNP) markers evenly spaced across the
genome at approximately 100 kb intervals would provide the necessary power to detect the
small effects of each gene involved in a complex trait (Hauser et al. 1996 Genetic
Epidemiology 13: 117-137 in Kwok and Chen 1998 Genetic Engineering 20: 125-134,
Plenum Press, New York). Alternatively, one can take a candidate gene approach
in
performing association studies with the use of a set of gene-associated SNP
markers to detect
these genetic factors (ibid.).
Nucleotide sequence mutations which occur in a gene or gene family, where the gene
or gene family is associated with a given disease, indicate susceptibility to or development of
the disease.
Osteoporosis is a common disease characterized by low bone mineral density
(BMD),
deterioration of bone micro-architecture and increased risk of bone damage,
such as fracture.
Common types of osteoporosis include postmenopausal and senile osteoporosis,
which
generally occur later in life, e.g., 70+ years. Osteoporosis is a major health
problem in
virtually all societies. It is estimated that 30 million Americans and 100
million people
worldwide are at risk for osteoporosis. In European populations, one in three
women and one
in twelve men over the age of fifty is at risk. These numbers are growing as
the elderly
population increases. It is estimated that by the middle of the next century
the number of
osteoporosis sufferers will double in the West, but may increase six-fold in Asia and South
America. Fracture is the most serious endpoint of osteoporosis, particularly
fracture of the
hip which affects up to 1.7 million people worldwide each year. It is
estimated that by the
year 2050, the number of hip fractures worldwide will increase to over 6
million, as life
expectancy and age of the population increase (See Spangles et al. "The
Genetic Component
of Osteoporosis Mini-review"; http://www.csa.com.osteointro.html). Thus,
osteoporosis is a
major public health problem which affects quality of life and increases costs
to health care
providers.
Peak bone mass is mainly genetically determined, though dietary factors and
physical
activity can have positive effects. Peak bone mass is attained at the point
when skeletal
growth ceases, after which time bone loss starts. In contrast to the positive
balance that
occurs during growth, in osteoporosis, the resorbed cavity is not completely
refilled by bone.
Despite recent successes with drugs that inhibit bone resorption, there is a
clear need
for specific anabolic agents that will considerably increase bone formation in
people who
have already suffered substantial bone loss. There are no such drugs currently
approved.
Current treatment for osteoporosis helps stop further bone loss and fractures.
Common therapeutics include HRT (hormone replacement therapy),
bisphosphonates, e.g.,
alendronate (Fosamax), as well as estrogen and estrogen receptor modulators,
progestin,
calcitonin, and vitamin D. While there may be numerous factors that determine
whether any
particular person will develop osteoporosis, a step towards prevention,
control or treatment of
osteoporosis is determining whether one is at risk for osteoporosis. Genetic
factors also play
an important role in the pathogenesis of osteoporosis. Some attribute 50-60%
of total bone
variation (Bone Mineral Density; BMD), depending upon the bone area, to
genetic effects.
However, up to 85%-90% of the variance in bone mineral density may be
genetically
determined.
Family histories, twin studies, and racial factors indicate that there may be a
predisposition for osteoporosis. Several candidate genes may be involved in this
process, which is most probably multigenic.
Osteoporosis can be considered a complex genetic trait with variants of
several genes
underlying the genetic determination of the variability of the phenotype. Low
bone mineral
density (BMD) is an important risk factor for fractures, the clinically most
relevant feature of
osteoporosis. Segregation analysis in families has shown that BMD is under
polygenic control
while, in addition, biochemical markers of bone turnover have also been shown
to have
strong genetic components. Several candidate genes have been analysed in
relation to BMD
but the most widely studied gene in this respect, the vitamin D receptor (VDR)
gene, explains
only a small part of the genetic effect on BMD. Numerous studies focussing on the BsmI
allele of the vitamin D receptor gene have concluded that absence of the restriction site
correlates with low bone mineral density.
Diagnosis of those at risk of developing osteoporosis allows more effective
preventive
measures. Strategies for the prevention of this disease include development of
bone density
in early adulthood, and minimisation of bone loss in later life. Changes in
lifestyle, nutrition
and hormonal factors have been shown to affect bone loss.
There is a need for clinical and epidemiological research on the prevention and
treatment of osteoporosis, in order to gain deeper knowledge of the factors controlling bone cell
activity and the regulation of bone mineral and matrix formation and remodelling.
SUMMARY
One or more of these novel polymorphisms at the positions indicated in Table 2
found in
U.S. Provisional Patent Application Serial No. 60/342,711 entitled "Nucleotide
Polymorphisms
Associated with Osteoporosis" filed December 20, 2001 (which is hereby
incorporated herein by
reference in its entirety) may be responsible for increased susceptibility to
low bone mineral
density (BMD) and/or bone fracture, which indicates bone damage and related
conditions such
as osteoporosis. In particular, the polymorphisms of the present invention,
either alone or in
combination with other polymorphisms, may be useful in identifying individuals
susceptible or
resistant to low BMD and/or bone damage, and for those individuals that are susceptible to low
BMD and/or bone damage in the prevention or treatment of this condition.
The present invention is applicable to any disease in which low BMD and/or
bone
fracture is a factor, and is therefore particularly concerned with diseases
such as osteoporosis.
Low BMD is defined as two standard deviations below the age-matched mean of
bone mineral
density for a given population. Bone damage may be defined as any form of
structural damage
such as fractures, bones or chips, and degradation or deterioration of the
bone other than normal
wear and tear resulting from low bone mineral density or another cause. Such
low BMD and/or
bone damage is associated with osteoporosis.
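The two-standard-deviation definition above can be sketched as a short calculation. This is only an illustration: the function names, the example reference values, and the choice to count a value lying exactly two standard deviations below the mean as low are assumptions, not part of the disclosure.

```python
def bmd_z_score(bmd, age_matched_mean, age_matched_sd):
    """Number of standard deviations a BMD value lies from the age-matched mean."""
    if age_matched_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (bmd - age_matched_mean) / age_matched_sd


def is_low_bmd(bmd, age_matched_mean, age_matched_sd, threshold_sd=2.0):
    # Assumption: "two standard deviations below" is read as "at or below".
    return bmd_z_score(bmd, age_matched_mean, age_matched_sd) <= -threshold_sd
```

For example, with a hypothetical age-matched mean of 1.00 g/cm2 and standard deviation of 0.12 g/cm2, a measurement of 0.70 g/cm2 lies 2.5 standard deviations below the mean and would be classed as low BMD, whereas 0.90 g/cm2 would not.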
The invention may be practised on any mammalian subject. Preferably, the
mammalian
subject will be a human, and most preferably an adult, preferably female.
The polynucleotide of this invention is preferably DNA, or may be RNA or other
options.
In a second aspect, fragments of the nucleic acid sequences of the first
aspect are
provided, which comprise one or more nucleotide substitutions, insertions or
deletions. The
novelty of a fragment according to the present embodiment may be easily
ascertained using
sequence comparison methods as previously described.
Preferred fragments may be 10 to 40 nucleotides in length. More preferably,
the
fragments are between 5 to 10, 5 to 20, or 10 to 20 nucleotides in length. For
example, the
fragments may be 5, 8, 10, 12, 15, 18, 20, 22, 25, 28, 30, or 35 nucleotides
in length. The
fragments may be useful in a variety of diagnostic, prognostic or therapeutic
methods, or may be
useful as research tools for example in drug screening.
In a third aspect of the invention, there are provided non-coding,
complementary
sequences which hybridize to a nucleic acid sequence of the first aspect. Such
"anti-sense"
sequences are useful as probes or primers for detecting an allele of a
polymorphism of the
invention, or in the regulation of the genes. They may also be used as agents
for use in the
identification and/or treatment of individuals having or being susceptible to
low bone mineral
density.
The anti-sense polynucleotides of this embodiment may be the full length of
sequence of
the first aspect, or more preferably may be 5 to 30 nucleotides in length.
Preferred
polynucleotides are 5 to 10 or 10 to 25 nucleotides in length. Primers, in
particular, are typically
10 to 15 nucleotides long, and may occasionally be 16 to 25.
In a preferred embodiment, the polynucleotides of the aforementioned aspects
of the
invention may be in the form of a vector, to enable the in vitro or in vivo
expression of the
polynucleotide sequence. The polynucleotides may be operably linked to one or
more regulatory
elements including a promoter; regions upstream or downstream of a promoter
such as enhancers
which regulate the activity of the promoter; an origin of replication;
appropriate restriction sites
to enable cloning of inserts adjacent to the polynucleotide sequence; markers,
for example
antibiotic resistance genes; ribosome binding sites; RNA splice sites and
transcription
termination regions; polymerisation sites; or any other element which may
facilitate the cloning
and/or expression of the polynucleotide sequence. Where two or more
polynucleotides of the
invention are introduced into the same vector, each may be controlled by its
own regulatory
sequences, or all sequences may be controlled by the same regulatory
sequences. In the same
manner, each sequence may comprise a 3' polyadenylation site. The vectors may
be introduced
into microbial, yeast or animal DNA, either chromosomal or mitochondrial, or
may exist
independently as plasmids. Examples of suitable vectors will be known to
persons skilled in the
art and include pBluescript II, LambdaZap, and pCMV-Script (Stratagene Cloning
Systems, La
Jolla (USA)).
In another aspect of the present invention, there is provided a host cell comprising a
comprising a
polynucleotide according to any of the aforementioned aspects, for expression
of the
polynucleotide. The host cell may comprise an expression vector, or naked DNA
encoding said
polynucleotides. A wide variety of suitable host cells are available, both
eukaryotic and
prokaryotic.
In a further aspect of the present invention, there is provided a transgenic
non-human
animal comprising a polynucleotide according to an aforementioned aspect of
the invention.
Preferably, the transgenic, non-human animal comprises a polynucleotide according to the second
or third aspects. Transgenic non-human animals are useful for the analysis of the
single nucleotide
polymorphisms and their phenotypic effect.
In an eighth aspect of the present invention there is provided a method of
screening for
agents for use in the prognosis, diagnosis or treatment of individuals having,
or being susceptible
to, low bone mineral density, said method comprising contacting a putative
agent with a
polynucleotide or protein according to an aforementioned aspect of the present
invention, and
monitoring the reaction therebetween. Preferably, the method further
comprises contacting a
putative agent with a reference polynucleotide or protein, and comparing the
reaction between
(i) the agent and the polynucleotide or protein encoding the reference allele;
and (ii) the agent and
polynucleotide or protein of the invention. Potential agents are those which
react differently with
a variant of the invention and a reference allele. It is envisaged that the
present method may be
carried out by contacting a putative agent with a host cell or transgenic non-
human animal
comprising a polynucleotide or protein according to the invention. Putative
agents will include
those known to persons skilled in the art, and include chemical or biological
compounds, such
as anti-sense polynucleotide sequences, complementary to the coding sequences
of the first
aspect, or polyclonal or monoclonal antibodies which bind to a product such as
a protein or
protein fragment of the second aspect. They may also be useful in determining
susceptibility to
low bone mineral density, or in the diagnosis, prognosis or treatment of
related conditions.
In a ninth aspect of the present invention, there is provided a method of
diagnosing, or
determining susceptibility of a subject to low bone mineral density and/or bone damage, said
method comprising analysing the genetic material of a subject to determine which allele(s) of the
gene is/are present. The method may include determining whether one or more
particular alleles
are present, or which combination of alleles (i.e. a haplotype) is present.
The method may also
include determining whether subjects are homozygous or heterozygous for a
particular allele or
haplotype. In a preferred embodiment, the method comprises determining which
allele of one
or more of the polymorphisms of the invention is/are present. In particular,
the method may
include determining the presence of the polymorphism of the gene which in
combination with
polymorphisms defined herein or other polymorphisms may define a risk
haplotype.
In another preferred embodiment of the ninth aspect, the method may comprise
determining which allele is present in the protein. Preferably, the method
comprises determining
whether the allele of the polymorphism of the fourth aspect is present. Any
method for
determining the presence of an allele may be used. One such method involves
the use of
antibodies in diagnosing or determining susceptibility to low bone mineral
density. The method
may comprise removing a sample from a subject, contacting the sample with an
antibody to an
antigen of the protein, and detecting binding of the antibody to the antigen,
wherein binding is
indicative of the presence of a particular allele or form of the protein and
thus risk of low BMD.
Tissue samples as described above are suitable for this method.
In a further aspect of the present invention, there is provided a method of
predicting the
response of a subject to treatment, said method comprising analysing genetic
material of a subject
to determine which allele(s) of the gene is/are present. Preferably, the method
is carried out
according to the ninth aspect. This aspect of the invention is based upon the
observation that the
effectiveness of treatment depends upon the underlying cause of disease.
Therefore, depending
upon the presence of particular allele(s), and their effect, certain
treatments may be effective,
whereas others may not. This will be the case where different alleles or
haplotypes result in low
bone mineral density, but mediate their effect via different biological
mechanisms. The method
preferably also comprises comparing the alleles present in a subject with
those of the genes which
require particular treatments. This may be done by use of a chart or visual
aid detailing the
therapies which are most appropriate for particular genotypes.
In a further aspect, the present invention provides a kit to determine which allele(s) of the
gene is/are present. Preferably, the kit will be suitable for determining which alleles of the
polymorphisms of the first aspect are present. The kit may contain
polynucleotides, most
preferably anti-sense sequences such as those of the third aspect, for use as
probes or primers;
antibodies which bind to alleles of the protein, such as those of the fifth
aspect; or restriction
enzymes for use in detecting the presence of a polynucleotide, protein, or
fragment thereof.
Preferably, the kit will also comprise means for detection of a reaction, such
as nucleotide label
detection means, labelled secondary antibodies or size detection means. In yet
a further preferred
embodiment, the polynucleotides, or antibodies may be fixed to a substrate,
for example an
array. The kit further comprises means for indicating correlation between the genotype of a
subject and risk of low BMD. Such means may be in the form of a chart or visual aid, which
indicates that the presence of one or more alleles of the gene, including alleles
of the polymorphisms
of the invention, is/are associated with low BMD.
DESCRIPTION
The invention provides novel polynucleotides and polymorphic polynucleotides
associated with a given human disease, for example, with osteoporosis. The
invention also
provides a gene sequence containing one or more polymorphic nucleotides
associated with a
predisposition to or the development of a given human disease such as
osteoporosis. The
invention also relates to polypeptides encoded by the novel polynucleotides or
the polymorphism-
containing gene. The invention also provides methods of detecting a
polymorphism according
to the invention in individuals at risk for osteoporosis, and for determining
if a given
polymorphism is associated with a predisposition to the disease. The invention
also discloses
polymorphism(s) that are either associated with or are not associated with
(i.e., are neutral)
osteoporosis. A polymorphism in a given gene can be utilized in various
diagnostic and
therapeutic methods and procedures, for example, in nucleic acid and peptide
diagnosis, drug
screening and design, and in gene and peptide therapy. A polymorphism
associated with a given
gene can be utilized in various gene expression systems and assays designed to
analyze gene
regulation and expression.
Definitions
As used herein, "polymorphism" refers to a nucleotide alteration that either
predisposes
an individual to a disease or is not associated with a disease, which occurs
as a result of a
substitution, insertion or deletion.
More particularly, a "polymorphism" or "polymorphic variation" may be a
nucleic acid
sequence variation, as compared to the naturally occurring sequence, resulting
from either a
nucleotide deletion, an insertion or addition, or a substitution, which is
present at a frequency of
greater than 1% in a population.
As used herein, "neutral polymorphism" refers to a polymorphism which is
present at a
frequency of greater than 1% in a population, which does not alter gene
function or phenotype,
and thus is not associated with a predisposition to or development of a
disease.
As used herein "polynucleotide sequence" refers to a sense or antisense
nucleic acid
sequence comprising RNA, cDNA, genomic DNA, synthetic forms and mixed
polymers, that
may be chemically or biochemically modified or may contain non-natural or
derivatized
nucleotide bases.
As used herein "mutation" refers to a variation in the nucleotide sequence of
a gene or
regulatory sequence as compared to the naturally occurring or normal
nucleotide sequence. A
mutation may result from the deletion, insertion or substitution of more than
one nucleotide (e.g.,
2, 3, 4, or more nucleotides) or a single nucleotide change such as a
deletion, insertion or
substitution. The term "mutation" also encompasses chromosomal rearrangements.
As used herein, "nucleic acid probe" refers to an oligonucleotide, nucleotide
or
polynucleotide, and fragments and portions thereof, and to DNA or RNA of
genomic or synthetic
origin which may be single- or double- stranded, which represents the sense or
antisense strand.
Both terms "nucleic acid probe" and "DNA fragment" refer to a length of
polynucleotide, for
example, as small as 5 nucleotides, 10, 20, 25, 40, 50, 75, 100, 250, 400, 500
and 1 kb, and as
large as 5-10 kb.
As used herein, "alteration" refers to a change in either a nucleotide or
amino acid
sequence, as compared to the naturally occurring sequence, resulting from a
deletion, an insertion
or addition, or a substitution.
As used herein, "deletion" refers to a change in either nucleotide or amino
acid sequence
wherein one or more nucleotides or amino acid residues, respectively, are
absent.
As used herein, "insertion" or "addition" refers to a change in either
nucleotide or amino
acid sequence wherein one or more nucleotides or amino acid residues,
respectively, have been
added.
As used herein, "substitution" refers to a replacement of one or more
nucleotides or
amino acids by different nucleotides or amino acid residues, respectively.
As used herein, "specifically hybridizable" refers to a nucleic acid or
fragment thereof
that hybridizes to another nucleic acid (or a complementary strand thereof)
due to the presence
of a region that is at least approximately 90% homologous, preferably at least
approximately 90-
95% homologous, and more preferably approximately 98-100% homologous, as are
polynucleotides that hybridize to a partner under stringent hybridization
conditions. "Stringent"
hybridization conditions are defined hereinbelow for various hybridization
protocols. A probe
that is specifically hybridizable to a given sequence can be used to detect a 1 bp out of 10 bp
(10%) or a 1 bp out of 20 bp (5%) difference between nucleic acid sequences and is therefore
useful for discriminating between a wild type and a mutant form of a gene of
interest.
As used herein, "amino acid sequence" refers to the sequential array of amino
acids that
have been joined by peptide bonds between the carboxylic acid group of one
amino acid and the
amino group of the adjacent amino acid to form long linear polymers comprising
proteins.
As used herein, "amino acid" refers to protein subunit molecules that contain
a carboxylic
acid group and an amino group, both linked to a single carbon atom.
A polypeptide is said to be "encoded" by a polynucleotide if the
polynucleotide, either
in its native state or in a recombinant form can be transcribed and/or
translated to produce the
mRNA for and/or the polypeptide or a fragment thereof.
As used herein, "gene" refers to a region of DNA which includes a portion which can
be transcribed into RNA, and which may contain an open reading frame or coding region (also
referred to as an exon) which encodes a protein, a non-coding region (also referred to as an
intron), and a specific regulatory region comprising the DNA regulatory
elements which control
expression of the transcribed region.
As used herein, "coding region" refers to a region of DNA which encodes a
protein, also
known as an exon.
As used herein, "non-coding region" refers to a region of DNA which does not encode
a protein, also known as an intron, and is not included in the
RNA molecule that
is synthesized from a particular gene.
As used herein, "regulatory region" refers to DNA sequences which are located
either 5'
of the transcription start site, 3' of the transcription termination site, or
within an intron or exon,
capable of ensuring that the gene is transcribed at the proper time and in the
appropriate cell type.
As used herein, "consensus DNA sequence" or "wild-type DNA sequence" refers to
a
sequence wherein every position represents the nucleotide that occurs with the
highest frequency
when many actual sequences are compared. As used herein, "consensus DNA
sequence" or
"wild-type DNA sequence" also refers to the normal, naturally occurring DNA
sequence.
As used herein, a given sequence (or mutation or polymorphism) "associated
with"
osteoporosis refers to a nucleic acid sequence that increases susceptibility
to the disease,
predisposes an individual to the disease or contributes to the disease,
wherein the nucleic acid
sequence is present at a higher frequency (at least 5%, preferably 10%, more
preferably 25%
higher) in individuals with the disease as compared to individuals who do not
have the disease.
As used herein, a sequence "not associated with" osteoporosis refers to a
nucleic acid
sequence that does not increase susceptibility to the disease, predispose an
individual to the
disease or contribute to the disease, wherein the nucleic acid sequence is not
present at a higher
frequency in
individuals with the disease, and thus is present at a frequency about equal
to its frequency in
individuals who do not have the disease.
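The frequency comparison in the two definitions above can be expressed numerically. The text does not state whether "5% higher" is a relative or an absolute difference. The sketch below assumes a relative difference and labels that assumption in the code. The function names and the example frequencies are illustrative.

```python
def relative_excess(freq_cases, freq_controls):
    """Fractional excess of a sequence's frequency in cases over controls."""
    if freq_controls <= 0:
        raise ValueError("control frequency must be positive")
    return (freq_cases - freq_controls) / freq_controls


def meets_association_threshold(freq_cases, freq_controls, min_excess=0.05):
    # Assumption: "at least 5% higher" is read as a relative excess of 0.05.
    # Pass 0.10 or 0.25 for the preferred thresholds quoted in the text.
    return relative_excess(freq_cases, freq_controls) >= min_excess
```

A sequence found at a frequency of 0.30 in affected individuals and 0.20 in unaffected individuals has a relative excess of 0.5 and meets every threshold quoted above. Equal frequencies in both groups give an excess of zero, which corresponds to a sequence "not associated with" the disease. A real association study would also test whether the difference is statistically significant, which this sketch does not attempt.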
As used herein, "amplifying" refers to producing additional copies of a
nucleic acid
sequence, preferably by the method of polymerase chain reaction (Mullis and
Faloona, 1987,
Methods Enzymol, 155: 335).
As used herein, "oligonucleotide primers" refer to single stranded DNA or RNA
molecules that are hybridizable to a nucleic acid template and prime enzymatic
synthesis of a
second nucleic acid strand. Oligonucleotide primers useful according to the
invention are
between 5 to 100 nucleotides in length, preferably 20-60 nucleotides in
length, and more
preferably 20-40 nucleotides in length.
As used herein, "sequencing" refers to determining the precise nucleotide
composition
or sequence of a nucleic acid region by methods well known in the art (see
Ausubel et al., supra
and Sambrook et al., supra).
As used herein, "comparing" a sequence refers to determining if the
nucleotides at one
or more positions in a particular region of a nucleic acid fragment are
identical for any two or
more sequences. According to the invention, sequence comparisons can be
performed by using
computer program analysis as described below in Section F entitled
"Identification and
Characterization of Polymorphisms".
As used herein, "sequence differences" or "sequence variations" refer to
nucleotide
changes, at one or more positions between any two or more sequences being
compared.
As used herein, "determining the presence of polymorphic variations" refers to
using
methods well known in the art to identify a nucleotide, at one or more
positions within a
particular nucleic acid region, that is distinct from the nucleotide present
in the naturally
occurring, wild-type or consensus sequence, resulting from either a nucleotide
deletion, an
insertion or addition, or a substitution.
As used herein, "determining the absence of polymorphic variations" refers to
using
methods well known in the art to determine that the nucleotides present at
every position
analyzed in a particular nucleic acid region are identical to the nucleotides
present in the naturally
occurring, wild type or consensus sequence.
As used herein, "genotyping" refers to determining the composition of the
genetic
material that is inherited by an organism from its parents.
As used herein, "biological sample" refers to a tissue or fluid sample
containing a
polynucleotide or polypeptide of interest, and isolated from an individual
including but not
limited to plasma, serum, spinal fluid, lymph fluid, urine, stool, external
secretions of the skin,
respiratory, intestinal and genitourinary tracts, saliva, blood cells, tumors,
organs, tissue and
samples of in vitro cell culture constituents.
As used herein, "amplimers" refer to specific fragments of DNA generated by PCR that
are at least 30 bp in length, preferably between 50 and 100 bp in length, and more
preferably between 150 and 300 bp in length, with a melting temperature in the range of
approximately 60-62°C.
As used herein, "phenotype" refers to the biological appearances of an
organism or a
tissue derived from an organism, wherein biological appearances include
chemical, structural and
behavioral attributes, and excludes genetic constitution.
As used herein, "genotype" refers to the genetic material that is inherited by
an organism
from its parents.
As used herein, "genetic susceptibility to osteoporosis" refers to an
increased risk of
developing osteoporosis resulting from specific DNA differences relative to
non-susceptible
individuals. Preferably an individual who is genetically susceptible to
osteoporosis has a 5-100%,
and more preferably a 25-50% greater chance of developing osteoporosis, as
compared to non-
susceptible individuals.
As used herein, "diagnostic" refers to the practice of identifying a disease
from the signs
and symptoms of an individual including the DNA sequences of genes that are
associated with
an increased susceptibility to the disease. "Diagnostic" also refers to the
practice of stratifying
patient populations based on the efficacy or toxicity of a composition, and
the predictive
placement of an individual in a response stratum based on strata-associated
parameters.
As used herein, "prognosis" refers to the possibility of recovering from a
particular
disease or condition, and also refers to risk assessment of developing a
particular disease or
condition.
A. Design and Synthesis of Oligonucleotide Primers
According to the present invention, oligonucleotide primers are disclosed that
are useful
for determining the sequence of a particular allele of a gene. The invention
also discloses
oligonucleotide primers designed to amplify a region of a gene that is known
to contain a
polymorphism. The invention also discloses oligonucleotide primers designed to
anneal
specifically to a particular allele of a gene.
Oligonucleotide primers useful according to the invention are single-stranded
DNA or
RNA molecules that are hybridizable to a nucleic acid template and prime
enzymatic synthesis
of a second nucleic acid strand. The primer is complementary to a portion of a
target molecule
present in a pool of nucleic acid molecules. It is contemplated that
oligonucleotide primers
according to the invention are prepared by synthetic methods, either chemical
or enzymatic.
Alternatively, such a molecule or a fragment thereof is naturally-occurring,
and is isolated from
its natural source or purchased from a commercial supplier. Oligonucleotide
primers are 5 to 100
nucleotides in length, ideally from 20 to 40 nucleotides, although
oligonucleotides of different
length are of use.
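Two properties from this paragraph, complementarity to the target and the stated length ranges, can be illustrated as follows. The helper names are hypothetical and the sketch covers DNA primers only.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Sequence of the strand that pairs with `seq`, written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def primer_length_class(primer):
    """Classify primer length against the ranges given in the text."""
    n = len(primer)
    if n < 5 or n > 100:
        return "out of range"
    if 20 <= n <= 40:
        return "ideal"
    return "acceptable"
```

A primer intended to anneal to the target stretch `ATGC` would have the sequence `GCAT`. A 25-nucleotide primer falls in the ideal range, a 10-nucleotide primer is within the broader 5 to 100 range, and a 3-nucleotide oligonucleotide is outside it.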
Pairs of single-stranded DNA primers can be annealed to sequences within or
surrounding
a gene on chromosome Y in order to prime amplifying DNA synthesis of a region
of a gene. A
complete set of gene primers will allow synthesis of all of the nucleotides of
the coding
sequences, e.g., the exons, introns and control regions. Preferably, the set
of primers will also
allow synthesis of both intron and exon sequences.
Allele-specific primers are also useful, according to the invention. Such
primers will
anneal only to a particular mutant allele (e.g. an allele containing a
polymorphism), and thus will
only amplify a product if the template also contains the polymorphism. Allele
specific primers
that anneal only to a wild type gene sequence are also useful according to the
invention.
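The behavior of an allele-specific primer can be illustrated with a minimal sketch. It assumes a deliberately simplified model in which a forward primer primes synthesis only when its entire sequence, including the 3'-terminal base placed on the polymorphic site, is present in the template. The sequences, the position of the polymorphism and the function name are hypothetical and are not taken from the disclosure.

```python
def primes_synthesis(primer: str, template_top_strand: str) -> bool:
    """Toy model: a forward primer amplifies only if its whole sequence,
    including the 3'-terminal base, occurs in the template's top strand."""
    return primer.upper() in template_top_strand.upper()


# Hypothetical 23-nt region; index 12 (0-based) is the polymorphic site.
wild_type = "ATGGCCTTAGACCGTACGTTAGC"
mutant = wild_type[:12] + "T" + wild_type[13:]  # C -> T substitution

# Each allele-specific primer ends (3' end) on the polymorphic base.
wild_type_primer = wild_type[2:13]
mutant_primer = mutant[2:13]

for name, primer in (("wild-type primer", wild_type_primer),
                     ("mutant primer", mutant_primer)):
    print(name,
          "| product on wild-type template:", primes_synthesis(primer, wild_type),
          "| product on mutant template:", primes_synthesis(primer, mutant))
```

In practice a single 3' mismatch reduces rather than abolishes priming, so the all-or-nothing rule above is only an approximation of the selectivity described.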
Typically, selective hybridization occurs when two nucleic acid sequences are
substantially complementary (at least about 65% complementary over a stretch
of at least 14 to 25 nucleotides, preferably at least about 75%, more
preferably at least about 90% complementary). See Kanehisa, M., 1984, Nucleic
Acids Res. 12: 203, incorporated herein by reference. As a result, it is
expected that a certain degree of mismatch at the priming site is tolerated.
Such mismatch may be small, such as a mono-, di- or tri-nucleotide.
Alternatively, it may encompass loops, which are defined as regions in which
there exists a mismatch in an uninterrupted series of four or more nucleotides.
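The complementarity thresholds recited above can be checked with a short calculation. The following is a sketch only: it assumes ungapped, antiparallel Watson-Crick pairing between a primer and a target site of equal length, both written 5' to 3', and the function names and labels are illustrative.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def percent_complementary(primer: str, target_site: str) -> float:
    """Percentage of primer bases that pair with the target site.

    Both sequences are written 5'->3', so the primer is compared against
    the target site read in reverse (antiparallel pairing)."""
    if len(primer) != len(target_site):
        raise ValueError("primer and target site must have the same length")
    paired = sum(COMPLEMENT[p] == t
                 for p, t in zip(primer.upper(), reversed(target_site.upper())))
    return 100.0 * paired / len(primer)


def complementarity_class(percent: float) -> str:
    """Map a percentage onto the thresholds recited in the text."""
    if percent >= 90:
        return "more preferred (at least about 90%)"
    if percent >= 75:
        return "preferred (at least about 75%)"
    if percent >= 65:
        return "substantially complementary (at least about 65%)"
    return "below the recited threshold"


print(complementarity_class(percent_complementary("AAAC", "TTTT")))
```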
Numerous factors influence the efficiency and selectivity of hybridization of
the primer
to a second nucleic acid molecule. These factors, which include primer length,
nucleotide
sequence and/or composition, hybridization temperature, buffer composition and
potential for
steric hindrance in the region to which the primer is required to hybridize,
will be considered
when designing oligonucleotide primers according to the invention.
A positive correlation exists between primer length and both the efficiency
and accuracy
with which a primer will anneal to a target sequence. In particular, longer
sequences have a
higher melting temperature (Tm) than do shorter ones, and are less likely to
be repeated within
a given target sequence, thereby minimizing promiscuous hybridization. Primer
sequences with
a high G-C content or that comprise palindromic sequences tend to self-
hybridize, as do their
intended target sites, since unimolecular, rather than bimolecular,
hybridization kinetics are
generally favored in solution. However, it is also important to design a
primer that contains
sufficient numbers of G-C nucleotide pairings since each G-C pair is bound by
three hydrogen
bonds, rather than the two that are found when A and T bases pair to bind the
target sequence,
and therefore forms a tighter, stronger bond. Hybridization temperature varies
inversely with
primer annealing efficiency, as does the concentration of organic solvents,
e.g. formamide, that
might be included in a priming reaction or hybridization mixture, while
increases in salt
concentration facilitate binding. Under stringent annealing conditions, longer
hybridization
probes (of use, for example, in Northern analysis), or synthesis primers,
hybridize more
efficiently than do shorter ones, which are sufficient under more permissive
conditions. Stringent
hybridization conditions typically include salt concentrations of less than
about 1M, more usually
less than about 500 mM and preferably less than about 200 mM. Hybridization
temperatures
range from as low as 0°C to greater than 22°C, greater than
about 30°C, and (most often) in
excess of about 37°C. Longer fragments may require higher hybridization
temperatures for
specific hybridization. As several factors affect the stringency of
hybridization, the combination
of parameters is more important than the absolute measure of a single factor.
Oligonucleotide primers can be designed with these considerations in mind and
synthesized according to the following methods.
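The dependence of melting temperature on primer length and G-C content can be illustrated numerically. The sketch below uses the Wallace rule, Tm = 2(A+T) + 4(G+C) in degrees Celsius, which is a common rule of thumb for short oligonucleotides and is not a formula given in this disclosure; salt, formamide and primer concentration are ignored.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Approximate Tm in degrees C by the Wallace rule (an assumption here):
    2 degrees per A/T base plus 4 degrees per G/C base."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# Hypothetical primers of equal length but different G-C content.
for primer in ("AAAATTTTAAAATTTT", "GGGGCCCCGGGGCCCC"):
    print(primer, gc_fraction(primer), wallace_tm(primer))
```

The G-C rich primer receives the higher estimate, consistent with the three hydrogen bonds of a G-C pair noted above.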
1. Oligonucleotide Primer Design Strategy
The design of a particular oligonucleotide primer for the purpose of
sequencing or PCR
involves selecting a sequence that is capable of recognizing the target
sequence, but has a
minimal predicted secondary structure. The oligonucleotide sequence binds only
to a single site
in the target nucleic acid. Furthermore, the Tm of the oligonucleotide is
optimized by analysis
of the length and GC content of the oligonucleotide. Furthermore, when
designing a PCR primer
useful for the amplification of genomic DNA, the selected primer sequence does
not demonstrate
significant matches to sequences in the GenBank database (or other available
databases).
The design of a primer is facilitated by the use of readily available computer
programs,
developed to assist in the evaluation of the several parameters described
above and the
optimization of primer sequences. Examples of such programs are "PrimerSelect"
of the DNAStar™ software package (DNAStar, Inc.; Madison, WI), OLIGO 4.0
(National Biosciences, Inc.), PRIMER, Oligonucleotide Selection Program, PGEN
and Amplify (described in Ausubel et al., 1995, Short Protocols in Molecular
Biology, 3rd Edition, John Wiley & Sons). Primers are
designed with sequences that serve as targets for other primers to produce a
PCR product that has
known sequences on the ends which serve as targets for further amplification
(e.g. to sequence
the PCR product). If many different genes are amplified with specific primers
that share a
common 'tail' sequence, the PCR products from these distinct genes can
subsequently be
sequenced with a single set of primers. Alternatively, in order to facilitate
subsequent cloning of
amplified sequences, primers are designed with restriction enzyme site
sequences appended to
their 5' ends. Thus, all nucleotides of the primers are derived from gene
sequences or sequences
adjacent to a gene, except for the few nucleotides necessary to form a
restriction enzyme site.
Such enzymes and sites are well known in the art. If the genomic sequence of a
gene and the
sequence of the open reading frame of a gene are known, design of particular
primers is well
within the skill of the art.
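Appending a restriction enzyme site or a common tail to the 5' end of a gene-specific primer can be sketched as a simple string operation. The EcoRI recognition sequence GAATTC is used as an example site; the gene-specific sequence and the function names are hypothetical.

```python
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence, used here as an example


def add_five_prime_tail(gene_specific: str, tail: str) -> str:
    """Return a primer (5'->3') with `tail` appended to its 5' end.

    The 3' portion stays gene-derived, so annealing and extension still
    depend on the gene sequence."""
    return tail.upper() + gene_specific.upper()


def gene_derived_fraction(primer: str, tail: str) -> float:
    """Fraction of primer nucleotides derived from the gene sequence."""
    return (len(primer) - len(tail)) / len(primer)


gene_specific = "ATGGCCTTAGACCGTACGTT"  # hypothetical 20-nt gene sequence
cloning_primer = add_five_prime_tail(gene_specific, ECORI_SITE)
print(cloning_primer, round(gene_derived_fraction(cloning_primer, ECORI_SITE), 2))
```

The same function can append a shared sequencing tail in place of the restriction site, so that products from distinct genes carry identical ends.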
2. Synthesis
The primers themselves are synthesized using techniques which are also well
known in the art. Once designed, oligonucleotides are prepared by a suitable
method, e.g. the phosphoramidite method described by Beaucage and Carruthers
(1981, Tetrahedron Lett., 22:1859) or the triester method according to
Matteucci et al. (1981, J. Am. Chem. Soc., 103:3185), both incorporated herein
by reference, or by other chemical methods using either a commercially
available automated oligonucleotide synthesizer or VLSIPS™ technology.
B. Production of a Polynucleotide Sequence
The invention discloses polynucleotide sequences comprising polymorphisms. The
polynucleotide sequences of the invention are specifically hybridizable to a
mutant form of a
gene and are therefore useful for discriminating between a wild-type form of a
gene and a mutant
form of a gene. The polynucleotide sequences of the invention may also be
useful for expression
of the encoded protein or a fragment thereof. The invention also features
antisense polynucleotide
sequences complementary to polynucleotide sequences comprising polymorphisms.
Antisense
polynucleotide sequences are useful according to the invention for inhibiting
expression of an
allelic form of a gene.
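The relationship between a sense-strand sequence comprising a polymorphism and its antisense sequence can be shown with a reverse-complement calculation. This is a minimal sketch for DNA sequences written 5' to 3'; the example sequence is hypothetical.

```python
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the antisense strand (5'->3') of a DNA sequence."""
    return "".join(_PAIRS[base] for base in reversed(seq.upper()))


sense = "ATGGCCTTAGACT"  # hypothetical sense-strand fragment
antisense = reverse_complement(sense)
print(sense, antisense)
```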
The present invention utilizes polynucleotide sequences and fragments
comprising RNA,
cDNA, genomic DNA, synthetic forms, and mixed polymers. The invention includes
both sense
and antisense strands of the polynucleotide sequences. According to the
invention, the
polynucleotide sequences may be chemically or biochemically modified or may
contain non-
natural or derivatized nucleotide bases. Such modifications include, for
example, labels,
methylation, substitution of one or more of the naturally occurring
nucleotides with an analog,
internucleotide modifications such as uncharged linkages (e.g. methyl
phosphonates, phosphorodithioates, etc.), pendent moieties (e.g. polypeptides),
intercalators (e.g. acridine, psoralen, etc.), chelators, alkylators, and
modified linkages (e.g. alpha anomeric nucleic acids, etc.). Also included are
synthetic molecules that mimic polynucleotides in their ability to bind to
a designated sequence via hydrogen bonding and other chemical interactions.
Such molecules are known in the art and include, for example, those in which
peptide linkages substitute for phosphate linkages in the backbone of the
molecule.
The polynucleotide may be a naturally occurring polynucleotide, or may be a
structurally
related variant of such a polynucleotide having modified bases and/or sugars
and/or linkages. The
term "polynucleotide" as used herein is intended to cover all such variants.
Modifications, which may be made to the polynucleotide may include (but are
not limited
to) the following types:
a) Backbone modifications
i) phosphorothioates (X or Y or W or Z = S or any combination of two or more
with the remainder as O),
e.g. Y=S (Stein et al., 1988, Nucleic Acids Res., 15:3209), X=S (Cosstick and
Vyle, 1989, Tetrahedron Letters, 30:4693), Y and Z=S (Brill et al., 1989,
J. Amer. Chem. Soc., 111:2321)
ii) methylphosphonates (e.g. Z=methyl (Miller et al., 1980, J. Biol. Chem.,
255:9569))
iii) phosphoramidates (Z = N-(alkyl)2, e.g. alkyl = methyl, ethyl, butyl)
(Z = morpholine or piperazine) (Agrawal et al., 1988, Proc. Natl. Acad. Sci.
USA, 85:7079) (X or W = NH) (Mag and Engels, 1988, Nucleic Acids Res., 16:3525)
iv) phosphotriesters (Z=O-alkyl, e.g. methyl, ethyl, etc.) (Miller et al.,
Biochemistry, 21:5468)
v) phosphorus-free linkages (e.g. carbamate, acetamidate, acetate) (Gait et
al., 1974, J. Chem. Soc. Perkin I, 1684; Gait et al., 1979, J. Chem. Soc.
Perkin I, 1389)
b) Sugar modifications
i) 2'-deoxynucleosides (R=H)
ii) 2'-O-methylated nucleosides (R=OMe) (Sproat et al., 1989, Nucleic Acids
Res., 17:
3373)
iii) 2'-fluoro-2'-deoxynucleosides (R=F) (Krug et al.,1989, Nucleosides and
Nucleotides,
8:1473)
c) Base modifications - (for a review see Jones, 1979, Int. J. Biology.
Macromolecules,
1:194)
i) pyrimidine derivatives substituted in the 5-position (e.g. methyl, bromo,
fluoro etc) or
replacing a carbonyl group by an amino group (Piccirilli et al., 1990, Nature,
343:33).
ii) purine derivatives lacking specific nitrogen atoms (e.g. 7-deaza adenine,
hypoxanthine)
or functionalized in the 8-position (e.g. 8-azido adenine, 8-bromo adenine)
d) Polynucleotides covalently linked to reactive functional groups, e.g.:
i) psoralens (Miller et al., 1988, Nucleic Acids Res. Special Pub. No. 20:113),
phenanthrolines (Sun et al., 1988, Biochemistry, 27:6039), mustards (Vlassov et
al., 1988, Gene, 72:313) (irreversible cross-linking agents with or without the
need for co-reagents)
ii) acridine (intercalating agents) (Helene et al., 1985, Biochimie, 67:777)
iii) thiol derivatives (reversible disulfide formation with proteins)
(Connolly and
Newman, 1989, Nucleic Acids Res., 17:4957)
iv) aldehydes (Schiff's base formation)
v) azido, bromo groups (UV cross-linking)
vi) ellipticines (photolytic cross-linking) (Perrouault et al., 1990, Nature,
344:358)
e) Polynucleotides covalently linked to lipophilic groups or other reagents
capable of improving uptake by cells, e.g.:
i) cholesterol (Letsinger et al., 1989, Proc. Natl. Acad. Sci. USA, 86:6553),
polyamines
(Lemaitre et al., 1987, Proc. Natl. Acad. Sci. USA, 84: 648), other soluble
polymers (e.g.
polyethylene glycol)
f) Polynucleotides containing alpha-nucleosides (Morvan et al., Nucleic Acids
Res., 15: 3421)
g) Combinations of modifications a)-f)
It should be noted that such modified polynucleotides, while sharing features
with
polynucleotides designed as "anti-sense" inhibitors, are distinct in that the
compounds
correspond to sense-strand sequences and the mechanism of action depends on
protein-nucleic
acid interactions and does not depend upon interactions with nucleic acid
sequences.
1. Polynucleotide Sequences Comprising DNA
a. Cloning
Polynucleotide sequences comprising DNA can be isolated from cDNA or genomic
libraries (including YAC and BAC libraries) by cloning methods well known to
those skilled in
the art (Ausubel et al., supra). Briefly, isolation of a DNA clone comprising
a particular
polynucleotide sequence involves screening a recombinant DNA or cDNA library
and identifying
the clone containing the desired sequence. Cloning will involve the following
steps. The clones
of a particular library are spread onto plates, transferred to an appropriate
substrate for screening,
denatured, and probed for the presence of a particular sequence. A description
of hybridization
conditions, and methods for producing labeled probes is included below.
The desired clone is preferably identified by hybridization to a nucleic acid
probe or by
expression of a protein that can be detected by an antibody. Alternatively,
the desired clone is
identified by polymerase chain amplification of a sequence defined by a
particular set of primers
according to the methods described below.
The selection of an appropriate library involves identifying tissues or cell
lines that are
an abundant source of the desired sequence. Furthermore, if the polynucleotide
sequence of
interest contains regulatory sequence or intronic sequence a genomic library
is screened (Ausubel
et al., supra).
b. Genomic DNA
Polynucleotide sequences of the invention are amplified from genomic DNA.
Genomic
DNA is isolated from tissues or cells according to the following method.
To facilitate detection of a variant form of a gene from a particular tissue,
the tissue is
isolated free from surrounding normal tissues. To isolate genomic DNA from
mammalian tissue,
the tissue is minced and frozen in liquid nitrogen. Frozen tissue is ground
into a fine powder with
a prechilled mortar and pestle, and suspended in digestion buffer (100 mM
NaCl, 10 mM TrisCl,
pH 8.0, 25 mM EDTA, pH 8.0, 0.5% (w/v) SDS, 0.1 mg/ml proteinase K) at 1.2 ml
digestion
buffer per 100 mg of tissue. To isolate genomic DNA from mammalian tissue
culture cells, cells
are pelleted by centrifugation for 5 min at 500 x g, resuspended in 1-10 ml
ice-cold PBS,
repelleted for 5 min at 500 x g and resuspended in 1 volume of digestion
buffer.
Samples in digestion buffer are incubated (with shaking) for 12-18 hours at
50°C, and
then extracted with an equal volume of phenol/chloroform/isoamyl alcohol. If
the phases are not
resolved following a centrifugation step (10 min at 1700 x g), another volume
of digestion buffer
(without proteinase K) is added and the centrifugation step is repeated. If a
thick white material
is evident at the interface of the two phases, the organic extraction step is
repeated. Following
extraction the upper, aqueous layer is transferred to a new tube to which will
be added 1/2 volume of 7.5 M ammonium acetate and 2 volumes of 100% ethanol.
The nucleic acid is pelleted by centrifugation for 2 min at 1700 x g, washed
with 70% ethanol, air dried and resuspended in TE buffer (10 mM TrisCl, pH 8.0,
1 mM EDTA, pH 8.0) at 1 mg/ml. Residual RNA is removed by incubating the sample
for 1 hour at 37°C in the presence of 0.1% SDS and 1 mg/ml DNase-free
RNase, and repeating the extraction and ethanol precipitation steps. The yield
of genomic DNA according to this method is expected to be approximately 2 mg
DNA/1 g cells or tissue (Ausubel et al., supra). Genomic DNA isolated according
to this method can be used for
Southern blot
analysis, restriction enzyme digestion, dot blot analysis or PCR analysis,
according to the
invention.
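The scaling of the tissue protocol can be worked through as simple arithmetic. The sketch below uses only the two ratios stated above (1.2 ml of digestion buffer per 100 mg of tissue and an expected yield of approximately 2 mg DNA per 1 g of tissue); the function names and the 250 mg example are illustrative.

```python
def digestion_buffer_ml(tissue_mg: float) -> float:
    """Digestion buffer volume at 1.2 ml per 100 mg of tissue."""
    return 1.2 * tissue_mg / 100.0


def expected_dna_yield_mg(tissue_mg: float) -> float:
    """Expected genomic DNA yield at roughly 2 mg per 1 g of tissue."""
    return 2.0 * tissue_mg / 1000.0


tissue_mg = 250.0  # hypothetical sample mass
print(f"buffer: {digestion_buffer_ml(tissue_mg):.2f} ml, "
      f"expected yield: {expected_dna_yield_mg(tissue_mg):.2f} mg")
```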
c. Restriction digest (of cDNA or genomic DNA)
Following the identification of a desired cDNA or genomic clone containing a
particular
sequence, polynucleotides of the invention are isolated from these clones by
digestion with
restriction enzymes.
The technique of restriction enzyme digestion is well known to those skilled
in the art
(Ausubel et al., supra). Reagents useful for restriction enzyme digestion are
readily available
from commercial vendors including New England Biolabs, Boehringer Mannheim,
Promega, as
well as other sources.
d. PCR
Polynucleotide sequences of the invention are amplified from genomic DNA or
other
natural sources by the polymerase chain reaction (PCR). PCR methods are well-
known to those
skilled in the art.
PCR provides a method for rapidly amplifying a particular DNA sequence by
using
multiple cycles of DNA replication catalyzed by a thermostable, DNA-dependent
DNA
polymerase to amplify the target sequence of interest. PCR requires the
presence of a nucleic acid
to be amplified, two single stranded oligonucleotide primers flanking the
sequence to be
amplified, a DNA polymerase, deoxyribonucleoside triphosphates, a buffer and
salts.
The method of PCR is well known in the art. PCR is performed as described in
Mullis
and Faloona, 1987, Methods Enzymol., 155: 335, herein incorporated by
reference.
PCR is performed using template DNA (at least 1 pg; more usefully, 1-1000 ng)
and at least 25 pmol of oligonucleotide primers. A typical reaction mixture
includes: 2 µl of DNA, 25 pmol of oligonucleotide primer, 2.5 µl of 10X PCR
buffer 1 (Perkin-Elmer, Foster City, CA), 0.4 µl of 1.25 mM dNTP, 0.15 µl (or
2.5 units) of Taq DNA polymerase (Perkin Elmer, Foster City, CA) and deionized
water to a total volume of 25 µl. Mineral oil is overlaid and the PCR is
performed using a programmable thermal cycler.
The length and temperature of each step of a PCR cycle, as well as the number
of cycles,
are adjusted according to the stringency requirements in effect. Annealing
temperature and timing
are determined both by the efficiency with which a primer is expected to
anneal to a template and
the degree of mismatch that is to be tolerated. The ability to optimize the
stringency of primer
annealing conditions is well within the knowledge of one of moderate skill in
the art. An
annealing temperature of between 30°C and 72°C is used. Initial
denaturation of the template
molecules normally occurs at between 92°C and 99°C for 4
minutes, followed by 20-40 cycles
consisting of denaturation (94-99°C for 15 seconds to 1 minute),
annealing (temperature
determined as discussed above; 1-2 minutes), and extension (72°C for 1
minute). The final
extension step is generally carried out for 4 minutes at 72°C, and may
be followed by an
indefinite (0-24 hour) step at 4°C.
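The cycling parameters above can be collected into a small program description and checked against the recited ranges. This is a sketch under stated assumptions: the default values are one arbitrary choice within those ranges, ramp times between temperatures are ignored, and the class and method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CyclingProgram:
    initial_denaturation_s: int = 240  # 4 minutes at 92-99 degrees C
    cycles: int = 30                   # text recites 20-40 cycles
    denaturation_s: int = 30           # text recites 15 seconds to 1 minute
    annealing_s: int = 60              # text recites 1-2 minutes
    extension_s: int = 60              # 1 minute at 72 degrees C
    final_extension_s: int = 240       # 4 minutes at 72 degrees C

    def validate(self) -> None:
        """Raise ValueError if a parameter falls outside the recited ranges."""
        if not 20 <= self.cycles <= 40:
            raise ValueError("cycles must be between 20 and 40")
        if not 15 <= self.denaturation_s <= 60:
            raise ValueError("denaturation must last 15-60 seconds")
        if not 60 <= self.annealing_s <= 120:
            raise ValueError("annealing must last 60-120 seconds")

    def total_seconds(self) -> int:
        """Programmed hold time, excluding ramping and the 4 degree C hold."""
        per_cycle = self.denaturation_s + self.annealing_s + self.extension_s
        return (self.initial_denaturation_s
                + self.cycles * per_cycle
                + self.final_extension_s)


program = CyclingProgram()
program.validate()
print(program.total_seconds() / 60, "minutes")
```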
Several techniques for detecting PCR products quantitatively without
electrophoresis may
be useful according to the invention in order to make it more suitable for
easy clinical use. One
of these techniques, for which there are commercially available kits such as
Taqman™ (Perkin
Elmer, Foster City, CA), is performed with a transcript-specific antisense
probe. This probe is
specific for the PCR product (e.g. a nucleic acid fragment derived from a
gene) and is prepared
with a quencher and fluorescent reporter probe complexed to the 5' end of the
oligonucleotide.
Different fluorescent markers can be attached to different reporters, allowing
for measurement
of two products in one reaction. When Taq DNA polymerase is activated, it
cleaves off the
fluorescent reporters of the probe bound to the template by virtue of its 5'-
to-3' nucleolytic
activity. In the absence of the quenchers, the reporters now fluoresce. The
color change in the
reporters is proportional to the amount of each specific product and is
measured by a fluorometer;
therefore, the amount of each color can be measured and the PCR product can be
quantified. The
PCR reactions can be performed in 96 well plates so that samples derived from
many individuals
can be processed and measured simultaneously. The Taqman™ system has the
additional
advantage of not requiring gel electrophoresis and allows for quantification
when used with a
standard curve.
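Quantification against a standard curve can be sketched as a linear fit. The example assumes that the instrument reports a threshold cycle (Ct) for each reaction and that Ct is linear in the logarithm of the starting quantity; neither the Ct readout nor the data values appear in this disclosure, and all numbers are hypothetical.

```python
import math


def fit_standard_curve(quantities, cts):
    """Least-squares fit of Ct = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(cts) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, cts))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def quantity_from_ct(ct, slope, intercept):
    """Interpolate the starting quantity of an unknown from its Ct."""
    return 10 ** ((ct - intercept) / slope)


# Hypothetical dilution series of a standard (copies per reaction) and Ct values.
standards = [1e1, 1e2, 1e3, 1e4]
standard_cts = [30.0, 26.7, 23.4, 20.1]

slope, intercept = fit_standard_curve(standards, standard_cts)
print(round(slope, 3), round(intercept, 3),
      round(quantity_from_ct(25.05, slope, intercept), 1))
```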
2. Polynucleotide Sequences Comprising RNA
The present invention also provides a polynucleotide sequence comprising RNA.
A
polynucleotide comprising RNA is useful for detecting SNPs and polymorphisms
by techniques
including but not limited to hybridization methods or the RNase protection
method. A
polynucleotide comprising RNA is also useful as a template for the in vitro
production of protein.
A polynucleotide comprising RNA is also useful for detecting and localizing
specific mRNA
sequences by in situ hybridization.
Polynucleotide sequences comprising RNA can be produced according to the
method of in vitro transcription.
The technique of in vitro transcription is well known to those of skill in the
art. Briefly,
the gene of interest is inserted into a vector containing an SP6, T3 or T7
promoter. The vector
is linearized with an appropriate restriction enzyme that digests the vector
at a single site located
downstream of the coding sequence. Following a phenol/chloroform extraction,
the DNA is
ethanol precipitated, washed in 70% ethanol, dried and resuspended in sterile
water. The in vitro
transcription reaction is performed by incubating the linearized DNA with
transcription buffer
(200 mM TrisCl, pH 5.0, 40 mM MgCl2, 10 mM spermidine, 250 mM NaCl [T7 or T3]
or 200 mM TrisCl, pH 7.5, 30 mM MgCl2, 10 mM spermidine [SP6]), dithiothreitol,
RNase inhibitors, each
of the four ribonucleoside triphosphates, and either SP6, T7 or T3 RNA
polymerase for 30 min
at 37°C. To prepare a radiolabeled polynucleotide comprising RNA,
unlabeled UTP will be
omitted and 35S-UTP will be included in the reaction mixture. The DNA template
is then removed
by incubation with DNaseI. Following ethanol precipitation, an aliquot of the
radiolabeled RNA
is counted in a scintillation counter to determine the cpm/ml (Ausubel et al.,
supra).
Alternatively, polynucleotide sequences comprising RNA are prepared by
chemical
synthesis techniques such as solid phase phosphoramidite (described above).
3. Polynucleotide Sequences Comprising Oligonucleotides
A polynucleotide sequence comprising oligonucleotides can be made by using
oligonucleotide synthesizing machines which are commercially available
(described above).
4. Polynucleotide Sequences Encoding Fusion Proteins
Polynucleotide sequences of the invention can be used to express the protein
product (or
fragment thereof) of the gene of interest by inserting the polynucleotide
sequence into an
expression vector. Expression vectors suitable for protein expression in
mammalian cells,
bacterial cells, insect cells or plant cells are well known in the art and are
described in Section
H entitled "Production of a Mutant Protein".
Polynucleotide sequences of the invention can be used to prepare hybrid
polynucleotides
comprising a sequence of a gene adjacent to a sequence encoding a foreign
protein or a fragment
thereof (e.g. lacZ, trpE, glutathione S-transferase or thioredoxin) or a
protein tag
(hemagglutinin or FLAG). Such hybrid polynucleotides produce fusion proteins
that are useful,
according to the invention, for improved expression and/or rapid isolation of
a protein or protein
fragment, encoded by the sequence of a gene. Hybrid polynucleotides are also
useful as a source
of antigen for the production of antibodies.
Nucleic acid constructs comprising a polynucleotide of genomic, cDNA,
synthetic or
semi- synthetic origin in association with a polynucleotide sequence encoding
a foreign protein
or a fragment thereof, (carrier sequence) can be generated by recombinant
nucleic acid techniques
well known in the art (See Ausubel et al., supra). According to this method,
the cloned gene is
introduced into an expression vector at a position located 3' to a carrier
sequence coding for the
amino terminus of a highly expressed protein, an entire functional moiety of a
highly expressed
protein or the entire protein. It is preferable to use a carrier sequence from
an E. coli gene or from
any gene that is expressed at high levels in E. coli. It is often preferable
to select a carrier
sequence that will facilitate protein purification, either with antibodies, or
with an affinity
purification protocol that is specific for the carrier protein being used. For
example, the
purification protocol can be designed in accordance with the unique physical
properties of the
carrier protein (e.g. heat stability). Alternatively, the tag sequence may
encode a protein (e.g.
glutathione-S-transferase (GST)) which can be purified by a chemical
interaction (for
example glutathione purification of GST). Alternatively, some carrier
proteins, such as
thioredoxin (Trx) can be selectively released from intact cells by osmotic
shock or freeze/thaw
procedures. Often, proteins that are fused to these carrier proteins can be
purified away from
intracellular contaminants by virtue of the physical attributes of the carrier
protein (Ausubel et
al., supra).
To ensure that a fusion protein is useful, according to the invention, it may
be necessary
to modify the expression protocol to produce a soluble protein. Due to the
fact that high-level
expression of certain proteins can lead to the formation of inclusion bodies,
if a soluble protein
is required it may be necessary to modify the following variables. The
temperature at which
expression is induced can affect inclusion body formation since inclusion body
formation is
induced at higher temperatures (37°C and 42°C) and inhibited at
lower temperatures (30°C). In
certain instances, lowering the total level of protein expression can lead to
an increase in the
proportion of soluble protein that is produced. The strain background of the
cells in which the
protein is being produced can affect the proportion of a particular protein
that is expressed in a
soluble form. Furthermore, the choice of carrier protein can affect the
solubility of an expressed
fusion protein (Ausubel et al., supra).
An additional problem that can be encountered when producing fusion proteins
in E. coli
is formation of an unstable protein, or a protein that is cleaved at the site
of the junction between
the carrier sequence and the sequence of the protein of interest. To decrease
complications due
to protein instability one can arrange for the fusion protein to be expressed
as insoluble
aggregates. Alternatively, one can express the fusion protein in E. coli
strains that are deficient
in proteases (Ausubel et al., supra).
Often it is useful to remove the carrier protein moiety from the protein of
interest to
facilitate biochemical and functional analyses. Methods for cleavage of fusion
proteins to remove
the carrier are known to those skilled in the art. The choice of a method is
usually determined by
the composition, sequence, and physical characteristics of the particular
protein. Reagents such
as cyanogen bromide, hydroxylamine or low pH can be used to chemically cleave
fusion proteins.
To avoid complications resulting from chemical cleavage (e.g. the presence of
chemical cleavage
sites in the protein of interest and/or the occurrence of side reactions
resulting in protein
modification), enzymatic cleavage methods can be used. Enzymatic cleavage
protocols are
advantageous because they can be carried out under relatively mild reaction
conditions, and
because they involve highly specific cleavage reactions. Enzymes useful for
enzymatic cleavage
of fusion proteins include factor Xa, thrombin, enterokinase, renin and
collagenase (Ausubel et
al., supra).
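Whether chemical cleavage would also cut within the protein of interest can be screened from its amino acid sequence. The sketch below relies on two well-known specificities that are not spelled out in this disclosure: cyanogen bromide cleaves on the carboxyl side of methionine, and hydroxylamine cleaves asparagine-glycine bonds. The protein sequence and function names are hypothetical.

```python
def cnbr_cleavage_positions(protein: str) -> list:
    """1-based positions of Met residues; cyanogen bromide cleaves after each."""
    return [i + 1 for i, aa in enumerate(protein.upper()) if aa == "M"]


def hydroxylamine_cleavage_positions(protein: str) -> list:
    """1-based positions of Asn residues that are followed by Gly;
    hydroxylamine cleaves the Asn-Gly bond."""
    p = protein.upper()
    return [i + 1 for i in range(len(p) - 1) if p[i:i + 2] == "NG"]


protein_of_interest = "MKTNGLLMAVR"  # hypothetical sequence, one-letter code
print("CNBr sites:", cnbr_cleavage_positions(protein_of_interest))
print("Hydroxylamine sites:", hydroxylamine_cleavage_positions(protein_of_interest))
```

A protein with internal sites for a given reagent is a candidate for one of the enzymatic cleavage methods instead.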
Recombinant constructs encoding fusion proteins wherein the carrier sequence
is on the
order of 9-15 codons, can be generated by PCR methods. According to this
method, a PCR
primer will be designed to contain at least 13 nucleotides that are identical
to the target sequence
on either side of the nucleotide sequence encoding the carrier sequence.
Preferably, the PCR
primer will also contain a restriction enzyme site to facilitate cloning of
the amplified product
into an appropriate expression vector. PCR will be carried out as described
above and the
sequence of the amplified product will be confirmed by sequence analysis as
described in Section
D entitled "Isolation of a Wild type Gene".
Alternatively, recombinant constructs encoding fusion proteins can be
generated by
site/oligonucleotide directed mutagenesis (Ausubel et al., supra). According
to the method of site directed mutagenesis, the DNA to be mutated is inserted
into a plasmid which has an F1 origin of replication. A mutagenesis
oligonucleotide is designed to contain 13 bp that are 100% identical to the
target sequence, on either side of a sequence coding for the 9-15 codons of
carrier sequence that is to be added by the mutagenesis protocol.
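The layout of such a mutagenesis oligonucleotide (13 target-identical nucleotides on each side of a 9-15 codon carrier sequence) can be sketched as follows. The target sequence and insertion point are hypothetical, and the 9-codon carrier shown encodes the hemagglutinin epitope YPYDVPDYA as an illustrative choice. Depending on which strand is packaged as the single-stranded template, the reverse complement of the designed sequence may be the one actually synthesized.

```python
def design_insertion_oligo(target: str, insert_at: int,
                           carrier: str, flank: int = 13) -> str:
    """Return flank + carrier + flank, with both flanks identical to the target.

    insert_at is the 0-based index in `target` before which the carrier
    sequence is to be inserted."""
    if len(carrier) % 3 != 0:
        raise ValueError("carrier must be a whole number of codons")
    if not 9 <= len(carrier) // 3 <= 15:
        raise ValueError("carrier must be 9-15 codons long")
    if insert_at < flank or insert_at + flank > len(target):
        raise ValueError("not enough target sequence for both flanks")
    left = target[insert_at - flank:insert_at]
    right = target[insert_at:insert_at + flank]
    return (left + carrier + right).upper()


# Hypothetical 40-nt target and a 9-codon carrier (HA epitope, YPYDVPDYA).
target = "ATGGCTAGCAAGGAGCTGACCATTGACGGTCTGAAACGTT"
ha_carrier = "TACCCATACGATGTTCCAGATTACGCT"

oligo = design_insertion_oligo(target, insert_at=20, carrier=ha_carrier)
print(len(oligo), oligo)
```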
A single stranded preparation of the vector is prepared by the following
method.
Following transformation of an appropriate bacterial strain (e.g. CJ236) with
the recombinant
plasmid and plating of the bacteria on LB agar plates, a single resulting
colony is grown in 4x5
ml of LB plus ampicillin for 1 hour at 37°C with vigorous shaking.
M13K07 helper phage (2 µl,
approximately 10^10-10^11 plaque forming units) is added and the bacteria
are grown for an additional hour at 37°C with vigorous shaking. Following the
addition of 7 µl of kanamycin (50 mg/ml), the bacteria are grown overnight at
37°C with vigorous shaking. The following day bacterial cultures are pooled
and cells are separated by centrifugation. After the addition of 2.6 ml of 20%
polyethylene glycol 200-800/2M NaCl to 20 ml of bacterial supernatant, the
sample is incubated for 1 - 1.5 hours on ice. The sample is pelleted by
centrifugation at 9000 rpm for 20 minutes. Following removal of the
supernatant, residual supernatant is removed by centrifugation at 3000 rpm for
5 minutes. The pellet is resuspended in 400 µl of TE, extracted twice with
phenol and four times with phenol:chloroform and ethanol precipitated. The
resulting pellet is resuspended in 40 µl TE.
Mutagenesis is performed by using a Muta-Gene kit (Bio-Rad, Hercules, CA)
according to the following method. To kinase the oligonucleotide primer, 1 µl
(200 ng) of oligonucleotide is incubated in the presence of 2 µl of 10X kinase
buffer (0.5 M Tris, pH 8.0, 70 mM MgCl2, 10 mM DTT), 2 µl 10 mM rATP, 2 µl
polynucleotide kinase and 13 µl H2O at 37°C for 1 hour. To carry out the
annealing and synthesis steps, 2.5 µl of single-stranded template are mixed
with 1 µl of kinased oligonucleotide, 1.0 µl of 10X annealing buffer (200 mM
Tris-HCl, pH 7.4, 20 mM MgCl2, 500 mM NaCl) and 5.5 µl H2O for 10 min at 65°C.
The reaction mixture is slow-cooled to 37°C. Once the sample has reached 37°C,
the sample is spun briefly in a microfuge. Following the addition of 1.0 µl of
10X synthesis buffer (5 mM each dATP, dCTP, dGTP, dTTP, 10 mM ATP, 100 mM
Tris-HCl, pH 7.4, 50 mM MgCl2, 20 mM DTT), 1.0 µl T4 DNA ligase and 0.5 µl
of T4 DNA polymerase, the sample is incubated for 5 minutes on ice, 5 minutes
at room temperature and 1 hour at 37°C. A 2 µl aliquot of the sample is used to
transform E. coli.
DNA is isolated from the transformed E. coli cells by mini prep methods known
in the
art (Ausubel et al., supra), and sequenced according to methods known in the
art (described in
Section D entitled "Isolation of a Wild Type Gene").
C. Production of a Nucleic acid Probe
The invention discloses nucleic acid probes. Preferably, the nucleic acid
probes of the
invention are specifically hybridizable to a mutant gene but not to a wild
type form of a gene due
to the presence of one or more polymorphisms. These allele specific probes can
be used to screen
DNA sequences of a gene which have been amplified by PCR, or are present in a
genomic DNA
or RNA test sample. Hybridization of a particular allele specific probe to an
amplified gene
sequence, under stringent conditions (described below), indicates that the
polymorphism
contained in the probe is present in the amplified sequence. Hybridization of
a particular allele
specific probe to a test sample comprising genomic DNA or RNA, under stringent
conditions
(described below), indicates that the polymorphism contained in the probe is
present in the
nucleic acid of the test sample. Nucleic acid probes that are specifically
hybridizable to a wild
type form of a gene but not to a mutant form of a gene are also useful
according to the invention.
In another embodiment, the probes of the claimed invention will be specific
for a nucleic
acid region that is adjacent to a region that is thought to contain one or
more polymorphisms.
These probes will be useful for detecting the presence of one or more
polymorphisms in the
adjacent region by the method of primer extension (as described in Section F
entitled
"Identification and Characterization of Polymorphisms").
In other embodiments, probes of the claimed invention will be used to detect a
gain or
loss of a restriction enzyme site known to contain one or more polymorphisms
of the claimed
invention. Nucleic acid probes, according to this embodiment, are able to
detect a restriction
enzyme fragment that is of a size that can be easily separated on an agarose
gel and visualized
by Southern blot analysis. Probes that are useful according to this embodiment
of the claimed
invention can be specific for any region within a gene or outside of a gene.
The nucleic acid probes of the invention are useful for a variety of
hybridization-based
analyses including but not limited to Southern hybridization to genomic DNA,
cDNA sequences
or PCR amplification products, Northern hybridization to mRNA and RNase
protection assays,
DNA sequencing and isolation of genomic or cDNA clones of a gene. The probes
may also be
used to determine whether mRNA encoded by a gene is present in a cell or
tissue by the
method of in situ hybridization. These techniques are well known in the art
and can be performed
as described in Ausubel et al., supra.
According to the methods of the above-referenced hybridization assays,
polymorphisms
associated with alleles of a gene, which either predispose to a particular
disease (e.g.
osteoporosis) or are not associated with a particular disease (e.g.
osteoporosis), will be detected
by the formation of a stable hybrid consisting of a polynucleotide probe
comprising one or more
polymorphisms and a target sequence, that also comprises one or more
polymorphisms, under
stringent to moderately stringent hybridization and wash conditions. If it is
expected that the
probes will be perfectly complementary to the target sequence, stringent
conditions will be used.
Hybridization stringency may be lessened if some mismatching is expected, for
example, if
variants are expected with the result that the probe will not be completely
complementary.
Conditions are chosen which rule out nonspecific/adventitious bindings, that
is, which minimize
noise. Since such indications identify neutral DNA polymorphisms as well as
mutations, these
indications need further analysis (such as assays described in Section F
entitled "Identification
and Characterization of Polymorphisms") to demonstrate detection of a
susceptibility allele of
a gene.
Probes for alleles of a gene may be derived from genomic DNA or cDNA sequences
specific for the gene of interest. The probes may be of any suitable length,
which span all or a
portion of the region containing the gene. If the target sequence contains a
sequence identical to
that of the probe, the probes may be short, e.g., in the range of about 8-30
base pairs, since the
hybrid will be relatively stable under even stringent conditions. If some
degree of mismatch is
expected with the probe, i.e., if it is suspected that the probe will
hybridize to a variant region,
a longer probe may be employed which hybridizes to the target sequence with
the requisite
specificity.
Probes according to the invention also include an isolated polynucleotide
attached to a
label or a reporter molecule which may be useful for isolating other
polynucleotide sequences
having sequence similarity by standard methods, including but not limited to
the above-
referenced hybridization-based assays. Techniques for preparing and labeling
probes (as
described in Ausubel et al., supra) are included below. A wide variety of
labels and conjugation
techniques are known by those skilled in the art and can be used in various
nucleic acid and
amino acid assays. Means for producing labeled hybridization or PCR probes for
detecting
related sequences include oligolabeling, nick translation, end-labeling or PCR
amplification
using a labeled nucleotide. Alternatively, the protein-encoding sequence, or
any portion of it, may
be cloned into a vector for the production of an mRNA probe. Such vectors are
known in the art,
are commercially available, and may be used to synthesize RNA probes in vitro
by addition of
an appropriate RNA polymerase such as T7, T3 or SP6 and labeled nucleotides.
A number of companies such as Pharmacia Biotech (Piscataway NJ), Promega
(Madison
WI) and US Biochemical Corp (Cleveland OH) supply commercial lots and
protocols for these
procedures. Suitable reporter molecules or labels include those radionuclides,
enzymes,
fluorescent, chemiluminescent, or chromogenic agents as well as substrates,
cofactors, inhibitors,
magnetic particles and the like. Patents teaching the use of such labels
include US Patents
3,817,838; 3,350,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149 and
4,366,241. Also,
recombinant immunoglobulins may be produced as shown in US Patent No.
4,816,567
incorporated herein by reference.
Probes comprising synthetic oligonucleotides or other polynucleotides of the
present
invention may be derived from naturally occurring or recombinant single- or
double- stranded
polynucleotides, or be chemically synthesized.
Portions of the polynucleotide sequence having at least approximately 5
nucleotides,
preferably 9-15 nucleotides, fewer than about 6 kb and usually fewer than
about 1 kb, from a
polynucleotide sequence encoding a gene are preferred as probes.
A DNA probe useful according to the present invention can be isolated from a
gene or
a polynucleotide construct derived from a gene, or from a cDNA sequence
specific for a gene or
a cDNA construct specific for a gene by the methods of PCR or restriction
enzyme digestion, as
described above. Riboprobes useful according to the invention can be
synthesized by the method
of in vitro transcription, or by chemical synthesis methods, as described
above.
An oligonucleotide probe useful according to the invention can be designed, as
described
above, and synthesized in a commercially available automated synthesizer.
Nucleic acid hybridization rate and stability will be affected by a variety of
experimental
parameters including salt concentration, temperature, the presence of organic
solvents, the
viscosity of the hybridization solution, the base composition of the probe,
the length of the
duplex, and the number of mismatches between the hybridizing nucleic acids
(Ausubel et al.,
supra), and as described in Section A entitled "Design and Synthesis of
Oligonucleotide
Primers".
Southern blot analysis can be used to detect sequence variations in a gene
from a PCR
amplified product or from a total genomic DNA test sample via a non-PCR based
assay. The
method of Southern blot analysis is well known in the art (Ausubel et al.,
supra, Sambrook et al.,
1989, Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor
Laboratory
Press, Cold Spring Harbor, NY). This technique involves the transfer of DNA
fragments from
an electrophoresis gel to a membrane support resulting in the immobilization
of the DNA
fragments. The resulting membrane carries a semipermanent reproduction of the
banding pattern
of the gel.
Southern blot analysis is performed according to the following method.
Genomic DNA
(5-20 µg) is digested with the appropriate restriction enzyme and separated on
a 0.6-1.0%
agarose gel in TAE buffer. The DNA is transferred to a commercially available
nylon or
nitrocellulose membrane (e.g. Hybond-N membrane, Amersham, Arlington Heights,
IL) by
methods well known in the art (Ausubel et al., supra, Sambrook et al., supra).
Following transfer
and UV cross linking, the membrane is hybridized with a radiolabeled probe in
hybridization
solution (e.g. under stringent conditions in 5X SSC, 5X Denhardt solution, 1%
SDS) at 65°C.
Alternatively, high stringency hybridization can be performed at 68°C
or in a hybridization buffer
containing a decreased concentration of salt, for example 0.1X SSC. The
hybridization conditions
can be varied as necessary according to the parameters described in Section A
entitled "Design
and Synthesis of Oligonucleotide Primers". Following hybridization, the
membrane is washed
at room temperature in 2X SSC/0.1% SDS and at 65°C in 0.2X SSC/0.1%
SDS, and exposed to
film. The stringency of the wash buffers can also be varied depending on the
amount of the
background signal (Ausubel et al., supra).
Detection of a nucleic acid probe-target nucleic acid hybrid will include the
step of
hybridizing a nucleic acid probe to the DNA target. This probe may be
radioactively labeled or
covalently linked to an enzyme such that the covalent linkage does not
interfere with the
specificity of the hybridization. A resulting hybrid can be detected with a
labeled probe. Methods
for radioactively labeling a probe include random oligonucleotide primed
synthesis, nick
translation or kinase reactions (see Ausubel et al., supra). Alternatively, a
hybrid can be detected
via non-isotopic methods. Non-isotopically labeled probes can be produced by
the addition of
biotin or digoxigenin, fluorescent groups, chemiluminescent groups (e.g.
dioxetanes, particularly
triggered dioxetanes), enzymes or antibodies. Typically, non-isotopic probes
are detected by
fluorescence or enzymatic methods. Detection of a radiolabeled probe-target
nucleic acid
complex can be accomplished by separating the complex from free probe and
measuring the level
of complex by autoradiography or scintillation counting. If the probe is
covalently linked to an
enzyme, the enzyme-probe-conjugate- target nucleic acid complex will be
isolated away from
the free probe enzyme conjugate and a substrate will be added for enzyme
detection. Enzymatic
activity will be observed as a change in color development or luminescent
output resulting in a
10^3- to 10^6-fold increase in sensitivity. An example of the preparation and use of
nucleic acid probe-
enzyme conjugates as hybridization probes (wherein the enzyme is alkaline
phosphatase) is
described in (Jablonski et al., 1986, Nucleic Acids Res., 14:6115).
Two-step label amplification methodologies are known in the art. These assays
are based
on the principle that a small ligand (such as digoxigenin, biotin, or the
like) is attached to a
nucleic acid probe capable of specifically binding to a gene. Allele specific
gene probes are also
useful according to this method.
According to the method of two-step label amplification, the small ligand
attached to the
nucleic acid probe will be specifically recognized by an antibody-enzyme
conjugate. For
example, digoxigenin will be attached to the nucleic acid probe and
hybridization will be
detected by an antibody-alkaline phosphatase conjugate wherein the alkaline
phosphatase reacts
with a chemiluminescent substrate. For methods of preparing nucleic acid probe-
small ligand
conjugates, see (Martin et al., 1990, BioTechniques, 9:762). Alternatively,
the small ligand will
be recognized by a second ligand-enzyme conjugate that is capable of
specifically complexing
to the first ligand. A well known example of this manner of small ligand
interaction is the biotin-
avidin interaction. Methods for labeling nucleic acid probes and their use in
biotin-avidin based
assays are described in (Rigby et al., 1977, J. Mol. Biol., 113:237 and Nguyen
et al., 1992,
BioTechniques, 13:116).
Variations of the basic hybrid detection protocol are known in the art, and
include
modifications that facilitate separation of the hybrids to be detected from
extraneous materials
and/or that employ the signal from the labeled moiety. A number of these
modifications are
reviewed in, e.g., Matthews & Kricka, 1988, Anal. Biochem., 169:1; Landegren
et al., 1988,
Science, 242:229; Mifflin, 1989, Clinical Chem. 35:1819; U.S. Pat. No.
4,868,105, and in EPO
Publication No. 225,807.
D. Isolation of a Wild Type Gene
A wild type version of a candidate gene according to the invention can be
isolated by
cloning from an appropriately selected genomic library according to methods
well known in the
art. Methods of cloning are described in Section B entitled "Production of a
Polynucleotide
Sequence."
The sequence of the cloned gene will be determined by sequencing methods well
known
in the art (see Ausubel et al., supra and Sambrook et al., supra). Methods of
sequencing employ
such enzymes as the Klenow fragment of DNA polymerase I, Sequenase (US
Biochemical
Corp, Cleveland, OH), Taq polymerase (Perkin Elmer, Norwalk, CT), thermostable
T7
polymerase (Amersham, Chicago, IL), or combinations of recombinant polymerases
and
proofreading exonucleases such as the ELONGASE Amplification System (Gibco
BRL,
Gaithersburg, MD). Preferably, the process is automated with machines such as
the Hamilton
Micro Lab 2200 (Hamilton, Reno NV), Peltier Thermal Cycler (PTC200; MJ
Research,
Watertown, MA) and the ABI 377 DNA sequencers (Perkin Elmer).
E. Isolation of a Mutant Gene
A mutant version of a candidate gene according to the invention can be
isolated by
cloning from an appropriately selected genomic library according to methods
well known in the
art. Methods of cloning are described in Section B entitled "Production of a
Polynucleotide
Sequence."
The sequence of the cloned gene will be determined by sequencing methods
described
in Section D entitled "Isolation of a Wild Type Gene."
F. Identification and Characterization of Polymorphisms
a. Identification of SNPs by in silico methods (isSNPs)
1. Identification of Polymorphisms in Candidate Genes
The starting point is a set of experimentally derived nucleic acid sequences.
In order to
be useful for SNP discovery by the invention, it is preferred that the
sequences have complete
chromatogram files from a gel or capillary electrophoresis sequencing machine.
When this is not
available, quality score data which assigns a score to each base in the
sequence indicating the
likelihood of error for the basecall may be used. If neither of these data are
available, the
sequence may be used to assist the clustering of other sequences and in some
cases to provide
additional verification for a discovered SNP, but is not used by the
invention for the
identification of the polymorphism.
The population of sequences used may constitute either a database of cDNA-
derived
sequences or genomic sequence. In a preferred embodiment, sequences used by
the invention
are from an assembled cDNA database, such as the LifeSeqGold database (Incyte
Genomics,
Inc. (Incyte), Palo Alto, CA).
Derivation of Nucleic Acid Sequences
cDNA was isolated from libraries constructed using RNA derived from normal and
diseased human tissues and cell lines. The human tissues and cell lines used
for cDNA library
construction were selected from a broad range of sources to provide a diverse
population of
cDNAs representative of gene transcription throughout the human body.
Descriptions of the
human tissues and cell lines used for cDNA library construction are provided
in the LIFESEQ
database (Incyte Pharmaceuticals, Inc. (Incyte), Palo Alto CA). Human tissues
were broadly
selected from, for example, cardiovascular, dermatologic, endocrine,
gastrointestinal,
hematopoietic/immune system, musculoskeletal, neural, reproductive, and
urologic sources.
Cell lines used for cDNA library construction were derived from, for example,
leukemic
cells, teratocarcinomas, neuroepitheliomas, cervical carcinoma, lung
fibroblasts, and endothelial
cells. Such cell lines include, for example, THP-1, Jurkat, HUVEC, hNT2,
WI38, HeLa, and
other cell lines commonly used and available from public depositories
(American Type Culture
Collection, Manassas VA). Prior to mRNA isolation, cell lines were untreated,
treated with a
pharmaceutical agent such as 5'-aza-2'-deoxycytidine, treated with an
activating agent such as
lipopolysaccharide in the case of leukocytic cell lines, or, in the case of
endothelial cell lines,
subjected to shear stress.
Sequencing of the cDNAs
Methods for DNA sequencing are well known in the art. Conventional enzymatic
methods employ the Klenow fragment of DNA polymerase I, SEQUENASE DNA
polymerase
(U.S. Biochemical Corporation, Cleveland OH), Taq polymerase (The Perkin-Elmer
Corporation
(Perkin-Elmer), Norwalk CT), thermostable T7 polymerase (Amersham Pharmacia
Biotech, Inc.
(Amersham Pharmacia Biotech), Piscataway NJ), or combinations of polymerases
and
proofreading exonucleases such as those found in the ELONGASE amplification
system (Life
Technologies Inc. (Life Technologies), Gaithersburg MD), to extend the nucleic
acid sequence
from an oligonucleotide primer annealed to the DNA template of interest.
Methods have been
developed for the use of both single-stranded and double-stranded templates.
Chain termination
reaction products may be electrophoresed on urea-polyacrylamide gels and
detected either by
autoradiography (for radioisotope-labeled nucleotides) or by fluorescence (for
fluorophore-
labeled nucleotides). Automated methods for mechanized reaction preparation,
sequencing, and
analysis using fluorescence detection methods have been developed. Machines
used to prepare
cDNAs for sequencing can include the MICROLAB 2200 liquid transfer system
(Hamilton
Company (Hamilton), Reno NV), Peltier thermal cycler (PTC200; MJ Research,
Inc. (MJ
Research), Watertown MA), and ABI CATALYST 800 thermal cycler (Perkin-Elmer).
Sequencing can be carried out using, for example, the ABI 373 or 377 (Perkin-
Elmer) or
MEGABACE 1000 (Molecular Dynamics, Inc. (Molecular Dynamics), Sunnyvale CA)
DNA
sequencing systems, or other automated and manual sequencing systems well
known in the art.
The nucleotide sequences have been prepared by current, state-of-the-art,
automated
methods and, as such, may contain occasional sequencing errors or unidentified
nucleotides.
Such unidentified nucleotides are designated by an N. These infrequent
unidentified bases do
not represent a hindrance to practicing the invention for those skilled in the
art. Several methods
employing standard recombinant techniques may be used to correct errors and
complete the
missing sequence information. (See, e.g., those described in Ausubel, F.M. et
al. (1997) Short
Protocols in Molecular Biology, John Wiley & Sons, New York NY; and Sambrook,
J. et al.
(1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Press,
Plainview NY.)
Assembly of cDNA Sequences
Human polynucleotide sequences may be assembled using programs or algorithms
well
known in the art. Sequences to be assembled are related, wholly or in part,
and may be derived
from a single or many different transcripts. Assembly of the sequences can be
performed using
such programs as PHRAP (Phil's Revised Assembly Program) and the GELVIEW
fragment
assembly system (GCG), or other methods known in the art.
Alternatively, cDNA sequences are used as "component" sequences that are
assembled
into "template" or "consensus" sequences as follows. Sequence chromatograms
are processed,
verified, and quality scores are obtained using PHRED. Raw sequences are
edited using an
editing pathway known as Block 1 (See, e.g., the LIFESEQ Assembled User Guide,
Incyte
Pharmaceuticals, Palo Alto, CA). A series of BLAST comparisons is performed
and low-
information segments and repetitive elements (e.g., dinucleotide repeats, Alu
repeats, etc.) are
replaced by "n's", or masked, to prevent spurious matches. Mitochondrial and
ribosomal RNA
sequences are also removed. The processed sequences are then loaded into a
relational database
management system (RDMS) which assigns edited sequences to existing templates,
if available.
When additional sequences are added into the RDMS, a process is initiated
which modifies
existing templates or creates new templates from works in progress (i.e.,
nonfinal assembled
sequences) containing queued sequences or the sequences themselves. After the
new sequences
have been assigned to templates, the templates can be merged into bins. If
multiple templates
exist in one bin, the bin can be split and the templates reannotated.
A resultant template sequence may contain either a partial or a full length
open reading
frame, or all or part of a genetic regulatory element. This variation is due
in part to the fact that
the full length cDNAs of many genes are several hundred, and sometimes several
thousand, bases
in length. With current technology, cDNAs comprising the coding regions of
large genes cannot
be cloned because of vector limitations, incomplete reverse transcription of
the mRNA, or
incomplete "second strand" synthesis. Template sequences may be extended to
include
additional contiguous sequences derived from the parent RNA transcript using a
variety of
methods known to those of skill in the art. Extension may thus be used to
achieve the full length
coding sequence of a gene.
Analysis of the cDNA Sequences
The cDNA sequences are analyzed using a variety of programs and algorithms
which are
well known in the art. (See, e.g., Ausubel, supra, Chapter 7.7; Meyers, R.A.
(Ed.) (1995)
Molecular Biology and Biotechnology, Wiley VCH, New York NY, pp. 856-853).
These
analyses comprise both reading frame determinations, e.g., based on triplet
codon periodicity for
particular organisms (Fickett, J.W. (1982) Nucleic Acids Res. 10:5303-5318);
analyses of
potential start and stop codons; and homology searches.
Computer programs known to those of skill in the art for performing computer-
assisted
searches for amino acid and nucleic acid sequence similarity, include, for
example, Basic Local
Alignment Search Tool (BLAST; Altschul, S.F. (1993) J. Mol. Evol. 36:290-300;
Altschul,
S.F. et al. (1990) J. Mol. Biol. 215:403-410). BLAST is especially useful in
determining exact
matches and comparing two sequence fragments of arbitrary but equal lengths,
whose alignment
is locally maximal and for which the alignment score meets or exceeds a
threshold or cutoff score
set by the user (Karlin, S. et al. (1988) Proc. Natl. Acad. Sci. USA
85:841-845). Using an
appropriate search tool (e.g., BLAST or HMM), GenBank, SwissProt, BLOCKS, PFAM
and
other databases may be searched for sequences containing regions of homology
to a query rbosm
or RBOSM of the present invention.
Other approaches to the identification, assembly, storage, and display of
nucleotide and
polypeptide sequences are provided in "Relational Database for Storing
Biomolecule
Information," U.S.S.N. 08/947,845, filed October 9, 1997; "Project-Based Full-
Length
Biomolecular Sequence Database," U.S.S.N. 08/811,758, filed March 6, 1997;
and "Relational
Database and System for Storing Information Relating to Biomolecular
Sequences," U.S.S.N.
09/034,807, filed March 4, 1998, all of which are incorporated by reference
herein in their
entirety.
Protein hierarchies can be assigned to the putative encoded polypeptide based
on, e.g.,
motif, BLAST, or biological analysis. Methods for assigning these hierarchies
are described, for
example, in "Database System Employing Protein Function Hierarchies for
Viewing
Biomolecular Sequence Data," U.S.S.N. 08/812,290, filed March 6, 1997,
incorporated herein
by reference.
Identification of Sequence Variants and Polymorphisms
The method comprises a series of filters to distinguish isSNPs from other
sequencing variants
and errors. The filters can be grouped into the following five sets of filters
by the order of
application in the method:
Preliminary Filters: the main filter in the first group removes the majority
of base call
errors by requiring a minimum phred quality score of 15. Additional filters at
this stage deal with
sequence alignment errors as well as errors resulting from improper trimming
of vector sequence,
chimeras and splice junctions.
Advanced Chromatogram Analysis: additional base call errors are then detected
by
examining the original chromatogram files in the vicinity of a putative SNP by
an automated
procedure resulting in a set of SNPs wherein the base call error rate is
reduced to less than 5%.
Clone Error Filters: errors introduced during laboratory processing such as
those caused
by reverse transcriptase, polymerase or somatic mutation are among the most
difficult to
distinguish from true SNPs. The Clone Error filters use statistically
generated algorithms to
identify these sources of error. A small percentage of actual SNPs will be
discarded at this stage.
Clustering Error Filters: these types of errors result from the incorrect
clustering of close
homologs, pseudogenes or from contamination by nonhuman sequences. The
filters developed
to minimize these clustering errors are also statistically based. As above,
these filters may
reject a fraction of actual SNPs.
Finishing Filters: these filters remove duplicate and redundant SNPs from the
generated
list of SNPs, and remove SNPs which are from the hypervariable regions of
hypervariable genes
such as immunoglobulin and T cell receptors.
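By way of illustration only, the preliminary quality filter can be sketched as follows. The sketch relies on the standard phred definition, under which a quality score Q corresponds to an estimated base call error probability of 10^(-Q/10), so that the minimum score of 15 corresponds to an error probability of roughly 3%. The function names are hypothetical and are not part of the disclosed method.

```python
def phred_to_error_probability(q):
    """Standard phred relation: P(error) = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def passes_preliminary_filter(q, min_q=15):
    """Keep a base call only if its phred score meets the minimum (15 in the text)."""
    return q >= min_q
```

Under this relation, raising the minimum score from 15 to 20 would lower the tolerated per-base error probability from about 3% to 1%, at the cost of discarding more base calls.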
Pre-processing steps
The sequences must first be trimmed to eliminate vector sequence,
contamination and
repetitive sequences. Then certain low information content sequences (for
example, long runs
of a single base, or two or three-base repeats) and repetitive sequences (for
example Alu
sequences in humans) must be masked (changed to N's) to prevent over-
clustering errors. The
clustering process then identifies the sets of sequences that are believed to
be derived from the
same original DNA sequence or gene. The sequences in each cluster are then
aligned using a
method such as phrap which also defines a consensus sequence. It will be well
recognized by
those skilled in the art that there are numerous existing programs for
carrying out these processes,
and the SNP discovery process described herein will work equally well with any
of them. In the
instant embodiment, the preferred processes are Block 1 for trimming and
masking, a variety
of different algorithms for clustering, and phrap for the alignment. It will
be recognized by those
skilled in the art that phrap and other alignment methods carry out a
secondary clustering step
which divides clusters into contigs, and carry out a secondary trimming step
which defines the
end points of the portion of each sequence which participates in the contig.
The contigs then
may be searched for the occurrence of SNPs.
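The masking of low information content sequence described above may be sketched as follows. This is a minimal illustration and not the Block 1 procedure itself: the function name and the thresholds (a run of at least 10 identical bases, or at least 5 consecutive copies of a two- or three-base unit) are assumptions chosen for the example, since the text does not state the cutoffs used.

```python
import re


def mask_low_information(seq, min_run=10, min_repeat_units=5):
    """Replace long single-base runs and 2- or 3-base repeats with N's.

    The thresholds are illustrative assumptions, not values from the text.
    """
    patterns = [
        # A single base repeated at least min_run times.
        r"([ACGT])\1{%d,}" % (min_run - 1),
        # A 2- or 3-base unit repeated at least min_repeat_units times.
        r"([ACGT]{2,3})\1{%d,}" % (min_repeat_units - 1),
    ]
    for pattern in patterns:
        # Preserve sequence length so that alignment coordinates stay valid.
        seq = re.sub(pattern, lambda m: "N" * len(m.group(0)), seq)
    return seq
```

Masking with N's of equal length, rather than deleting the region, keeps the coordinates of the sequence unchanged, which allows the same region to be unmasked later.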
Errors in the trimming, clustering and alignment processes will cause SNP
discovery
errors, usually false positives (the prediction of SNPs where they do not
exist). Additional filters
which are the subject of the invention are designed to recognize and remove
these errors by
providing the ability to identify likely errors in the processes and to
correct them.
In some instances, it is preferred, as an optional step, to unmask, after
clustering, regions of sequences
which were masked (because of low information content or repetitive sequence)
during the
clustering process, to allow discovery of SNPs within these
regions.
Identification of Candidate SNP Sequences
The first step in identifying candidate SNP sequences is to redefine the end
points of each
sequence as the points within the previous end points where a stretch of at
least 10 consecutive
base calls, containing at least eight base changes, matches the consensus
sequence exactly.
Sequence trimming errors (both at single sequence stage and at the alignment
stage) contribute
to the false positives when foreign sequence (vector, chimera or splice
variant) is similar to the
real sequence and the true boundary is difficult to determine. This step is a
conservative
approach to avoid false positives and also filters out lower-quality sequence
at the ends. The
reason the length of the match with a consensus is measured in base changes is
to avoid low
significance matches on repetitive sequence such as polyA.
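The end point redefinition described above may be sketched as follows. This is a minimal illustration, not the disclosed implementation. It assumes that the read and the consensus are supplied as equal-length strings in alignment coordinates, and that a "base change" means a position whose base differs from the base immediately preceding it, so that a polyA run contributes no base changes. The function names are hypothetical.

```python
def find_anchor(read, consensus, min_len=10, min_changes=8):
    """Return the first index that begins an exactly matching stretch of at
    least min_len calls containing at least min_changes base changes."""
    n = min(len(read), len(consensus))
    for i in range(n):
        changes = 0
        j = i
        while j < n and read[j] == consensus[j]:
            if j > i and read[j] != read[j - 1]:
                changes += 1
            if j - i + 1 >= min_len and changes >= min_changes:
                return i
            j += 1
    return None


def redefine_end_points(read, consensus):
    """Return (start, end) with end exclusive, or None if no anchor exists."""
    start = find_anchor(read, consensus)
    if start is None:
        return None
    # Scan the reversed strings to locate the anchor nearest the other end.
    tail = find_anchor(read[::-1], consensus[::-1])
    return start, len(read) - tail
```

A read consisting only of polyA matches the consensus at every position yet yields no anchor, which is the low significance case the base change requirement is intended to exclude.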
The next step is, at each position of the alignment, to compare the base calls
of all the
aligned sequences which are between their start and end positions and which
have quality scores
greater than a set threshold, and which have neighboring base calls which
agree with a consensus
sequence and where the neighboring base calls also have a quality score > the
threshold.
Preferably the threshold is a phred quality score greater than or equal to 15.
The possibilities are
A, C, G, T, and -(deletion).
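The per-position comparison may be sketched as follows, under stated assumptions. Each read is represented by a hypothetical AlignedRead record holding one base call and one phred score per alignment column together with its redefined end points. A call is kept when its score is at least 15, and a call whose neighbor falls outside the read's end points is rejected; that last rule is an assumption, since the text does not address it.

```python
from dataclasses import dataclass


@dataclass
class AlignedRead:
    bases: str    # one call per alignment column, '-' for a deletion
    quals: list   # phred score per alignment column
    start: int    # first usable column (inclusive)
    end: int      # last usable column (exclusive)


def usable_calls(reads, consensus, col, min_q=15):
    """Collect the base calls at column col that pass the quality and
    neighbor-agreement conditions."""
    calls = []
    for r in reads:
        if not (r.start <= col < r.end) or r.quals[col] < min_q:
            continue
        neighbors_ok = all(
            r.start <= nb < r.end
            and r.bases[nb] == consensus[nb]
            and r.quals[nb] >= min_q
            for nb in (col - 1, col + 1)
        )
        if neighbors_ok:
            calls.append(r.bases[col])
    return calls
```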
The next step is a Clone Filter where if there has been more than one base
call for a
sequence position, then the clone for each sequence is identified and the
sequences corresponding
to each clone are compared. If the base calls for different sequences from the
same clone
disagree, then all the sequences for this clone at this base position are
removed from
consideration.
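The Clone Filter at a single alignment position may be sketched as follows. The representation of the input as (clone identifier, base call) pairs and the function name are assumptions made for the illustration.

```python
from collections import defaultdict


def clone_filter(calls):
    """Drop every call from any clone whose sequences disagree at this position.

    calls is a list of (clone_id, base) pairs for one alignment position.
    """
    bases_by_clone = defaultdict(set)
    for clone_id, base in calls:
        bases_by_clone[clone_id].add(base)
    return [(c, b) for c, b in calls if len(bases_by_clone[c]) == 1]
```

A clone sequenced twice with calls A and G is removed entirely at that position, whereas a clone sequenced twice with the same call is retained.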
After all of these filters, positions for which there is more than one base
call are candidate
SNPs. The "wild type" base call is the one in the consensus sequence and the
others are
designated candidate SNPs. If the wild type base call is a deletion, then the
SNP is considered
to be an insertion at the previous base.
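The designation of candidate SNPs at a position may be sketched as follows. The function name, the dictionary layout and the type labels are hypothetical; the logic follows the text in treating the consensus call as wild type and in reporting an insertion at the previous base when the wild type call is a deletion.

```python
def designate_candidates(position, consensus_base, calls):
    """Return candidate SNPs for one position, given the surviving base calls."""
    observed = set(calls)
    if len(observed) < 2:
        # Only one distinct base call: not a candidate position.
        return []
    candidates = []
    for base in sorted(observed - {consensus_base}):
        if consensus_base == "-":
            # Wild type is a deletion: report an insertion at the previous base.
            entry = {"position": position - 1, "type": "insertion", "allele": base}
        elif base == "-":
            entry = {"position": position, "type": "deletion", "allele": base}
        else:
            entry = {"position": position, "type": "substitution", "allele": base}
        candidates.append(entry)
    return candidates
```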
Automated Chromatogram Checking
The next filters require opening of the chromatogram files for the sequences
identified
as containing candidate SNPs. At each candidate SNP position, the chromatogram
data of each
sequence passing the Identification Filters is extracted. The first step in
this process utilizes a
program ABIdump to translate binary ABI chromatogram files into usable form.
Multiple Base Call Algorithm filter: the ABI base calls for each sequence are
compared
to the phred base calls. If the base calls do not agree at the SNP position
and the two adjacent
flanking positions, then the sequences are removed from consideration.
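The Multiple Base Call Algorithm filter may be sketched as follows. The sketch assumes that the ABI and phred base calls for a sequence are available as strings indexed identically, and that a candidate position lacking a flanking base on either side fails the check; both points are assumptions, and the function name is hypothetical.

```python
def base_callers_agree(abi_calls, phred_calls, pos):
    """True if both base callers agree at pos and at its two flanking positions."""
    lo, hi = pos - 1, pos + 2
    if lo < 0 or hi > min(len(abi_calls), len(phred_calls)):
        return False
    return abi_calls[lo:hi] == phred_calls[lo:hi]
```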
Intensity Filter: if the SNP is a single base change (this step is skipped for
insertions and
deletions), then the processed intensity values for each of the four bases at the
chromatogram
location of the candidate SNP base call are used to compute a ratio. If we call the
intensity of wild
type, "wt", the intensity of the SNP base "snp", the minimum of the other two
"min", and the
phred quality of the base call "Q", then the wild type sequences must have
(snp-min) < (wt-min)(Q-17)/37 and Q>=17 to be considered high-quality, and
(snp-min)<(wt-min)(Q-4)/37 and Q>=15 to be considered a low quality pass.
The basis for these formulas is that if a base is miscalled, then there is
likely to be a residual peak
for the correct base. The larger the peak for the wild type base, the less
likely that the call of the
SNP is correct. The actual thresholds in the formula are based on empirical
data from clones
which were sequence multiple times and which gave a set of confirmed SNPs and
error rates for
algorithm optimization.
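The two inequalities can be written out directly. The sketch below applies
them, as stated, to a wild type sequence; the function name and the return
values "high", "low" and None are illustrative assumptions.

```python
def intensity_pass_level(wt, snp, other_a, other_b, q):
    """Classify one wild type read at a candidate SNP position.

    wt, snp: peak intensities of the wild type and SNP bases.
    other_a, other_b: intensities of the two remaining bases.
    q: phred quality of the base call.
    Returns "high", "low", or None (fail).
    """
    floor = min(other_a, other_b)   # "min" in the text
    residual = snp - floor          # (snp - min)
    signal = wt - floor             # (wt - min)
    if q >= 17 and residual < signal * (q - 17) / 37:
        return "high"
    if q >= 15 and residual < signal * (q - 4) / 37:
        return "low"
    return None
```

The high-quality test is tried first, so a read that satisfies both
inequalities is reported as "high".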
The candidate SNP passes only if at least one wild type sequence passes and
at least one SNP sequence passes. The quality of the candidate SNP is the
lower of the highest wild type pass level and the highest SNP pass level (if
there is a high-quality wild type sequence but only low quality SNP
sequences, then the candidate is low quality). A SNP quality value is
returned.
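The combination rule can be sketched as below, assuming each sequence has
already been assigned a pass level of "high", "low" or None (fail). These
labels and the function name are assumptions made for illustration.

```python
RANK = {None: 0, "low": 1, "high": 2}

def candidate_quality(wt_levels, snp_levels):
    """Return the candidate SNP quality: "high", "low", or None (fail).

    wt_levels, snp_levels: pass levels of the wild type and SNP sequences.
    """
    best_wt = max(wt_levels, key=RANK.get, default=None)
    best_snp = max(snp_levels, key=RANK.get, default=None)
    if best_wt is None or best_snp is None:
        return None  # at least one passing sequence is needed on each side
    # The candidate takes the lower of the two best pass levels.
    return min(best_wt, best_snp, key=RANK.get)
```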
Clone Error Quality Filters (somatic mutation/reverse transcriptase/polymerase
errors)
The purpose of these filters is to remove errors which are actually in the
clone, that is, the
clone sequence was correct but the clone does not represent the individual
being sequenced.
Three possible sources of these errors are somatic mutations, errors made by
reverse transcriptase
in the process of making cDNA, and DNA polymerase errors in those situations
where the DNA
has been amplified by PCR at some point prior to inserting in the cloning
vector. Somatic
mutations can be a particular problem in sequencing clones derived from cell
lines.
Polymerase errors are specific to the type of sequencing protocol used. For
example,
reverse transcriptase is involved in EST sequencing but not genomic clone
sequencing.
Polymerase is involved in the creation of extension clones (polymerase is used
in all sequencing
reactions, but errors are less likely to arise because only a fraction of the
templates are affected
in contrast to the extension process where a single polymerase product becomes
a template for
the entire reaction). This filter is not applied to genomic sequences in the
current embodiment
on the premise that the genomic sequences do not have polymerase errors, and
that somatic
mutations are likely to have the same profile as real SNPs.
This filter also filters out rare SNPs as well as apparent SNPs which are not
real. It is
difficult to determine and confirm by experiments to what extent SNP
candidates are too rare to
be confirmed vs. simply not real. For many applications, very rare SNPs are of
less utility than
common ones such that this is not a problem; however in some applications it
may be advisable
to turn this filter off.
Base change sequence analysis filter
The premise of this filter is that the probabilities of different mutations
differ depending on the source. For example, true SNPs may be mostly
transitions whereas reverse transcriptase mutations could be primarily G to T
mutations. While this does not allow one to determine for sure that a given
change is a true SNP, it allows one to evaluate the relative likelihood that
a given mutation is a true SNP. SNP confirmation data suggest that G/T SNP
candidates in which there is only one clone having the T allele have a very
low probability of being real SNPs. These SNP candidates are excluded from
the high confidence set (they are kept in a different file; their
confirmation rate is well below 50 percent). The other set which had a very
low confirmation rate is any A/T SNP.
Frequency filter
This filter is based on the concept that true SNPs have a different frequency
profile than clone errors and that a candidate SNP which is evident in only
one clone in a deep alignment is less likely to be real than one which
appears in one clone in a shallow alignment. The likelihood of finding a SNP
at a given sequence location is a function of the number of chromosomes
sequenced. This curve is distinctly non-linear, as most SNPs are sufficiently
frequent to be found with relatively few sequences. The probability of an
error of this type, however, is essentially linear in the number of sequences
since the chance of the change occurring in two different sequences is
independent. This means that the probability that a candidate SNP observed in
a single clone is a true SNP is lower if the alignment is deep than if it is
shallow. Any SNP occurring in a single clone in an alignment of more than 20
clones (counting only high-quality sequences which have a chance of
contributing a candidate SNP) is excluded from the high confidence set.
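The rule and the reasoning behind it can be sketched as follows. The
detection curve below uses the standard expression 1 - (1 - f)^n for seeing
an allele of frequency f at least once in n chromosomes; that formula and
both function names are assumptions added for illustration and are not
stated in the text above.

```python
def detection_probability(allele_freq, n_chromosomes):
    """Chance of observing an allele at least once in n sampled chromosomes."""
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes

def passes_frequency_filter(snp_clone_count, high_quality_depth, max_depth=20):
    """Return False for a candidate seen in a single clone of a deep alignment.

    high_quality_depth counts only clones with high-quality sequence at the
    candidate position; max_depth=20 is the threshold given in the text.
    """
    return not (snp_clone_count == 1 and high_quality_depth > max_depth)
```

The detection curve saturates quickly for common alleles, whereas the chance
of at least one clone error keeps growing roughly in proportion to depth,
which is why a singleton in a deep alignment is treated as suspect.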
This filter is the basis of a secondary method used to develop the base
change sequence analysis filter. Comparing the set of single clone SNPs from
shallow alignments with those from deep alignments, which are more likely to
be errors, will reveal base changes which are more likely to be associated
with polymerase errors and somatic mutations.
Clustering Error Filters
These filters are intended to remove candidate SNPs which result from the
incorrect clustering of similar sequences such as highly homologous genes,
similar genomic sequences, and contamination from other species where the
sequences of the species have been mislabeled as human.
Number of base changes filter
This filter distinguishes homologous sequences from SNPs on the basis of the
frequency of variants. True SNPs occur about once per kb when comparing two
sequences, or once per 2 kb if the length of both sequences is included, and
this fraction decreases as the depth of the alignment increases. Since EST
sequences tend to be about 500 bp or less in length, it would be expected to
have not more than one SNP per four sequences. The number of SNPs in the
cluster is divided by the number of sequences in the cluster and SNPs for
which this number is larger than one are discarded. The higher the number,
the less likely the SNP is to be real. The threshold value of one was chosen
because it appears to correspond to roughly a 50 percent success rate;
however, the threshold value could be adjusted to a higher value to accept
lower confidence SNPs.
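As a minimal sketch, the ratio test amounts to the following. The function
name is hypothetical; the default threshold of one is the value given in the
text.

```python
def passes_base_change_count_filter(n_snps, n_sequences, threshold=1.0):
    """Keep a cluster's SNPs unless SNPs per sequence exceeds the threshold."""
    if n_sequences <= 0:
        raise ValueError("a cluster must contain at least one sequence")
    return n_snps / n_sequences <= threshold
```

For example, a cluster of 12 sequences carrying 3 candidate SNPs gives a
ratio of 0.25 and is kept, whereas 13 candidates in the same cluster gives a
ratio above one and the candidates are discarded.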
Distance from next polymorphism filter
This filter calculates the number of SNPs for which the sequence is the only
representative within a window of 100 bases on either side, and discards any
of the SNPs for which there is more than one other SNP in this window. This
threshold can be set higher, but the actual fraction of SNP candidates which
are true SNPs then drops off to less than 50 percent.
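One reading of this filter is sketched below. It assumes the input is the
list of distinct positions of the candidate SNPs for which a given sequence
is the only representative; the function name and this input format are
assumptions.

```python
def distance_filter(snp_positions, window=100, max_neighbours=1):
    """Discard SNPs with more than max_neighbours other SNPs within the window.

    snp_positions: distinct positions of the candidate SNPs supported only by
    one particular sequence (an assumed input format).
    """
    kept = []
    for pos in snp_positions:
        neighbours = sum(
            1 for other in snp_positions
            if other != pos and abs(other - pos) <= window
        )
        if neighbours <= max_neighbours:
            kept.append(pos)
    return kept
```

Raising max_neighbours corresponds to setting the threshold higher, which
according to the text lowers the fraction of surviving candidates that are
true SNPs.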
Haplotype clustering filter
When sequences from different sources are inappropriately clustered, it is
possible to
divide them into two or more clusters which are consistent. In particular, if
we take any two
differences between homologs and consider the haplotypes of the clones which
overlap both
SNPs, there are only two haplotypes. In other words, a 2x2 matrix of
haplotypes is diagonal
having only two non-zero entries. If there are only two sequences, then this
is expected. For
each SNP, a 2x2 haplotype matrix with each other SNP is computed. If it is
diagonal, and there
are more than two sequences, then the sum of the diagonal elements minus one
is a "cluster total" for this SNP. This "cluster total" number has proven to
be empirically correlated with the confirmation rate, probably because it
predicts clusters which contain paralogs, homologs and contamination from
other species. Candidate SNPs which have a cluster total of less than eight
are kept. This threshold value for the cluster total can be varied.
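The 2x2 haplotype matrix for one pair of SNPs can be sketched as follows. The
sketch assumes alleles are coded 0 (consensus) and 1 (variant) and treats
either diagonal of the matrix as "diagonal", since both correspond to exactly
two observed haplotypes; this coding, that interpretation and the function
name are assumptions. How totals from several SNP pairs are combined is not
specified in the text and is not modelled here.

```python
from collections import Counter

def cluster_total(haplotypes):
    """Return the cluster total for one SNP pair, or None if not applicable.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, one per
    clone overlapping both SNPs, with alleles coded 0 or 1.
    """
    counts = Counter(haplotypes)
    matrix = [[counts[(i, j)] for j in (0, 1)] for i in (0, 1)]
    main = (matrix[0][0], matrix[1][1])
    anti = (matrix[0][1], matrix[1][0])
    for diag, off in ((main, anti), (anti, main)):
        # "Diagonal" means exactly two non-zero entries lying on one diagonal.
        if all(diag) and not any(off):
            total = sum(diag)
            # With only two sequences a diagonal matrix is expected anyway.
            return total - 1 if total > 2 else None
    return None
```

A candidate would then be kept when no cluster total is returned or when the
returned value is below the threshold of eight.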
Redundancy/finishing filters
Redundant SNP filter: SNPs in different contigs of the same gene which have
the same
base change and surrounding sequence are flagged as redundant. To accommodate
possible
splice variants, this redundancy filter also applies to SNPs whose
surrounding sequence matches on only one side.
T cell receptor/immunoglobulin filters
Sequences containing SNPs are filtered to remove SNPs in sequences that are
homologs
to T cell receptors and immunoglobulin genes because both types of genes have
hyper-variable
regions which could result in false positives.
Output file
SNP related data: With each candidate SNP a variety of data is kept, including
the
number and sources of all contributing sequences (for example gene album,
HTPS, FL,
WashU/Merck, etc.), the surrounding sequence, measures of the ratio and
quality scores for the
"best" sequence representing each allele, etc.
Sequence related data: for each sequence associated with each SNP, the
following data is kept: the distance in each direction to the end of the
sequence, the distance in each direction to the next base different from the
consensus and passing the initial quality filters, the library, tissue ID,
donor ID and comments (for example tumor, diseased, normal).
b. Identification of polymorphisms in osteoporosis associated genes by SSCP
The invention provides methods for detecting the presence of polymorphisms in
candidate
genes of the invention. The invention also provides methods for distinguishing
polymorphisms
which contribute to a particular disease (e.g. osteoporosis) over
polymorphisms which do not
contribute to the disease.
1. Identification of Polymorphisms in Candidate Genes
Identification of polymorphisms in a candidate gene, according to the
invention, will
involve the steps of isolating the candidate gene, determining its genomic
structure and
identifying polymorphisms in the DNA sequences in any portion of the entire
protein-coding
region. The invention also provides methods for identifying polymorphisms in
the DNA
sequences corresponding to RNA splice junctions. The invention also provides
methods for
identifying polymorphisms in the DNA sequence corresponding to the regulatory
(promoter)
region of the candidate gene.
A candidate gene is isolated by cloning methods well known in the art
(described above).
Preferably the genomic structure of a candidate gene is determined by Southern
blot analysis, as
described in Section C. It is expected that the entire sequence of an open
reading frame (ORF)
of an average entire gene can be spanned by 16 PCR-amplified DNA fragments or
amplimers of
an average length of 225 bp. It is expected that a smaller gene can be spanned
by 1-2 amplimers
and that >50 amplimers are required to span extremely large genes. Primers
useful for production
of the amplimers of a particular candidate gene are designed based on
preexisting knowledge of
the sequence of the wild type gene, according to the primer design strategies
described in Section
A entitled "Design and Synthesis of Oligonucleotide Primers."
For PCR amplification of a region to be tested by SSCP it is preferable to
design primers
that amplify overlapping regions of the candidate gene. If a sequence
variation is located in a
region of a candidate gene that corresponds to the region to which the primers
hybridize, the
primers will likely not bind, the region containing this sequence variation
will not be amplified
and the variation will not be detected in PCR based assays. By producing
overlapping amplimers
it is expected that virtually all of the sequence variations in a particular
candidate gene will be
detected. The amount of overlap in the amplimers is somewhat variable
(approximately 20%) and
the precise location of the overlapping regions will depend on the location of
regions comprising
a sequence that is an appropriate primer sequence. It is a possibility that a
polymorphism will be
located at a position just adjacent to the primer site. Consequently, sequence
information will be
available for only 20 bp on one side of the polymorphism and for 104-279 bp on
the other side
of the polymorphism. However, this should be a sufficient amount of sequence
information to
allow definition of a unique sequence context in which to define the
particular polymorphism.
Based on screening analysis of 92 samples (184 chromosomes), it is expected
that about
50% of the amplimers will demonstrate polymorphisms, and that approximately
80% of these
amplimers will detect changes at single positions while the remaining 20% will
detect base
changes at two positions. Based on these estimates, it is expected that there
will be approximately 10 sequence variations per open reading frame. However,
the number of amplimers that demonstrate polymorphisms will vary depending on
the number of individuals
tested, the
ethnicity and structure of the population being tested, and the region of DNA
being tested.
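The estimate of approximately 10 variations per open reading frame follows
from the figures given above, taking the average of 16 amplimers per open
reading frame stated earlier. A short arithmetic check (the function name is
illustrative only):

```python
def expected_variations_per_orf(n_amplimers=16, frac_polymorphic=0.5,
                                frac_single=0.8, frac_double=0.2):
    """Expected number of sequence variations in an average ORF."""
    # Average number of variant positions in an amplimer that is polymorphic.
    per_polymorphic = frac_single * 1 + frac_double * 2
    return n_amplimers * frac_polymorphic * per_polymorphic
```

With the default values this is 16 x 0.5 x 1.2 = 9.6, which rounds to the
figure of about 10 quoted in the text.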
Preferably, each polymorphism will be detected in the context of an SSCP
fragment.
Polymorphism analysis by fluorescent SSCP (fSSCP, described in detail in
Section F entitled
"Identification and Characterization of Polymorphisms") uses PCR to generate
an amplimer of
DNA to be studied. The region to be tested is defined as the region between
the primers (e.g. the
region that is incorporated into the PCR product and reflects the sequence of
the DNA sample
being tested). The PCR primers reflect the sequence of the DNA sample being
tested and are
incorporated into the PCR product as one end of each strand of DNA in the PCR
product. If a
polymorphism occurs in a primer binding site either the PCR primer does not
bind due to the
mismatch and the PCR will not produce a product, or the primer binds, an
amplification step
occurs wherein the primer is incorporated, but the amplified product does not
contain the
polymorphism which occurs at the primer binding site. Therefore, fSSCP
provides a method of
screening a DNA sequence located between PCR primers for the presence of
polymorphisms.
The sensitivity of the technique of fSSCP for detecting a polymorphism is
affected by
length, such that there is a substantial decrease in the detection of
polymorphisms in amplimers
that are greater than 300 bp in length. However, different conditions for
performing SSCP at high sensitivity with larger fragments, e.g. 800-1500 bp,
have also been described. If the length of DNA screened per amplimer is
decreased then more amplimers are required to screen a region of a given
size. Therefore, efficient screening of a gene dictates that the lower limit
of the size of an amplimer is 125 bp. To attain specificity for a particular
gene sequence, primers are usually 20-25 bp in length, and additional
criteria such as G:C content, and intra- and
inter-primer
complementarity are important considerations in primer design (as described
above). All of these
considerations are addressed if the primer3 program (Copyright (c) 1996
Whitehead Institute for
Biomedical Research) is employed to design pairs of primers suitable for use
in a single PCR
reaction. Typically, program parameters are set so that multiple amplimers are
designed in the
length range of 150-300 bp, with predicted primer melting temperatures in the
narrow range 60-62°C. The narrow temperature range increases the likelihood
that a single set of PCR conditions can be used to generate a wide variety of
different amplimers.
If it is desirable to screen a contiguous stretch of DNA which is larger than
the maximum
fragment size desired for sensitive polymorphism detection by fSSCP (300 bp)
it is necessary to
use multiple amplimers (which are assayed separately) which span the region of
interest. Since
the primer sites in an amplimer are not tested, these sequences need to be
contained within
another amplimer. To test the primer sequence, overlapping amplimers are
designed by an
algorithm that evaluates a large number of amplimers generated by the primer3
program for the
optimum overlapping set according to a cost function. Thus, a series of
overlapping PCR
amplification products can be used to test a contiguous stretch of DNA.
Constraints on primer
design are such that the absolute minimum overlap is rarely possible. As a
result, some regions of overlap occur that result in 'double testing' of a
particular segment of DNA. The detection efficiency is affected by the
sequence context of the polymorphism; it is possible that a polymorphic site
will be detected in only one of two different amplimers which overlap the
same site. One strategy that is useful for increasing polymorphism detection
efficiency is to design overlapping amplimers to generate 2-fold coverage of
all sequences.
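The requirement that every primer site lies inside the tested interior of
another amplimer can be illustrated with a deliberately simplified tiling.
The sketch below is not the cost-function algorithm described above: it
assumes fixed-length amplimers and fixed-length primers and ignores primer
quality entirely. The function name and defaults are assumptions.

```python
def tile_amplimers(region_len, amplimer_len=300, primer_len=20):
    """Return (start, end) amplimer coordinates covering a region.

    Consecutive amplimers are stepped by amplimer_len - 2 * primer_len so
    that the untested primer sites of one amplimer fall inside the tested
    interior of its neighbour.
    """
    if amplimer_len <= 2 * primer_len:
        raise ValueError("amplimer must be longer than its two primers")
    if region_len <= amplimer_len:
        return [(0, region_len)]
    step = amplimer_len - 2 * primer_len
    starts = []
    s = 0
    while s + amplimer_len < region_len:
        starts.append(s)
        s += step
    starts.append(region_len - amplimer_len)  # last amplimer ends at the edge
    return [(start, start + amplimer_len) for start in starts]
```

In this simplified scheme only the outermost primer sites of the region
remain untested; a real design would relax the fixed lengths to satisfy the
primer constraints, which is why some regions end up double tested.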
SSCP does not detect 100% of polymorphisms. The invention provides for
detection of
polymorphisms with an efficiency of 95% under a single set of conditions using
single coverage
of sequences; a 2-fold screening strategy can be employed if it is necessary
to increase this
detection efficiency.
It is expected that the polymorphism can be located and detected anywhere in
the SSCP fragment except in the regions at each end that correspond to the
sequence of the PCR primers. The precise location and identity of the
sequence variation(s) of a particular
SSCP fragment can
be confirmed by sequencing the fragment as described in Section D entitled
"Isolation of a Wild
Type Gene". The sequence of a candidate gene will be compared to the known
sequence of a
wild-type version of the gene by using the following DNA/protein sequence
analysis programs
and methods.
There are a large number of freely available methods for performing sequence
comparisons. These methods differ in their speed of execution, their
sensitivity, and the type of
comparisons they are able to make. For example one can compare two DNA
sequences, two
protein sequences, a DNA sequence to a protein sequence by conceptual
translation, or DNA
sequences as if they were protein sequences, again by conceptual translation.
The BLAST suite
of programs (Altschul et al., 1990, J.Mol.Biol. 215:403) are commonly used to
perform the
above-referenced type of analysis. Although the BLAST suite of programs
provides a rapid
method of determining multiple distinct similarities between two sequences,
these programs are
not guaranteed to find an optimal solution when comparing two sequences
according to a
particular set of parameters. PSI-BLAST is a more sensitive variant of BLAST
that operates by iteratively searching the database while simultaneously
refining the query pattern based on the results of the searches. Other
packages of programs that are available and which have different specific
properties include the HMMER, SAM, WISE, STADEN and FASTA packages, and the
programs est_genome, dotter, e-PCR, Clustal, cross_match and phrap (Pearson,
1996, Methods Enzymol. 266:227).
If sequence information is available for the intron-exon boundaries and for a
region of the intron (of approximately 30-150 bp) located immediately 5' of
an intron-exon boundary,
primers can be designed to produce amplimers useful for identifying
polymorphisms located in
the RNA splice junctions. Similarly, if the promoter region of a candidate
gene has been
sequenced, primers can be designed to produce amplimers useful for identifying
polymorphisms
located in the promoter region. Additional methods for detecting and isolating
polymorphisms
include, but are not limited to, fluorescent polarization-TDI, mass
spectroscopy, denaturing gradient gel electrophoresis, chemical cleavage of
mismatch, constant
denaturant capillary
electrophoresis, RNase cleavage, heteroduplex analysis, sequencing by
hybridization, DNA
sequencing, representational difference analysis, and denaturing high
performance liquid
chromatography, described below in Section F entitled, "Identification and
Characterization of
Polymorphisms".
2. Methods of Determining if a Polymorphism Contributes to Osteoporosis
No two individuals (excluding identical twins or other clones) have the same
sequence
of DNA in their genome. Variability in gene sequences between individuals
accounts for many
of the obvious phenotypic differences (such as pigmentation of hair, skin,
etc.) and many non-
obvious ones (such as drug tolerance and disease susceptibility). In a
population, the DNA
sequence that occurs at the highest frequency at any given site is commonly
referred to as the
wild type sequence. The term "wild type sequence" can be misleading, however,
because in
different populations an alternative form of a DNA sequence may be predominant
and thus
considered wild type for that particular population. DNA polymorphisms are
located throughout
the genome, within and between genes, and the various forms may or may not
result in
differential gene function (as determined by comparing the function of two
alternative forms of
the same sequence). Most polymorphisms do not alter gene function and are
called neutral
polymorphisms. Some polymorphisms do have an effect on gene function, for
example by
changing the amino acid sequence of a protein, or by altering control
sequences such as
promoters or RNA splicing or degradation signals.
Polymorphisms can be used in genetic studies to identify a gene involved in a
disease.
If a polymorphism alters a gene function such that it increases disease
susceptibility, then it will
be present more often in individuals with the disease than in those without
the disease.
Alternatively, if a particular DNA variant is protective against a disease, it
will be found more
often in individuals without the disease than in those with the disease.
Statistical methods are
used to evaluate polymorphism frequencies found in diseased as compared to
normal
populations, and provide a means for establishing a causal link between a
polymorphism and a
phenotype. To detect a significant association between a disease and a
polymorphic site, different
tests may be used with either genotypic or allelic distributions. The simplest
test consists of a t-
test wherein the frequency of the polymorphic alleles in normal individuals
and individuals with
the disease phenotype is compared. A comparison of the genotypic distribution
in normal
individuals and individuals with the disease phenotype can also be performed
using a chi-square
test of homogeneity. These tests are implemented in all commercially or freely
available
statistical packages, for example SAS and S+, and are even included in
Microsoft Excel. More sophisticated analyses will be performed by
incorporating covariates, using methods such as linear regression or logistic
regression, and by accounting for the information provided by adjacent
polymorphic sites (multipoint analysis). An example of this type of program
is the freely available program "Analyze" by JD Terwilliger (currently
available at the WWW site ftp://ftp.well.ox.ac.uk/pub/genetics/analyze).
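As an illustration of the chi-square test of homogeneity, the sketch below
compares allele counts in cases and controls using a 2x2 table. It is a
minimal sketch only: it uses allele counts rather than genotype counts for
simplicity, applies no continuity correction, and assumes every row and
column total is non-zero. The function name is hypothetical.

```python
import math

def chi_square_2x2(case_a, case_b, control_a, control_b):
    """Pearson chi-square statistic and p-value for a 2x2 allele count table."""
    table = [[case_a, case_b], [control_a, control_b]]
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in (0, 1)]
    n = sum(row)
    chi2 = 0.0
    for i in (0, 1):
        for j in (0, 1):
            expected = row[i] * col[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For one degree of freedom the upper tail is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

For instance, 60 versus 40 copies of the two alleles in cases against 40
versus 60 in controls gives a statistic of 8.0 with a p-value below 0.005.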
If a polymorphism has a phenotypic effect, a bias will exist in the
distribution of
polymorphisms between groups that have and do not have the disease phenotype.
This manner
of analysis can be used to study a trait that is not necessarily a disease;
any trait can be studied
by comparing a group with a particular phenotypic form of a trait to a group
with a different
phenotypic form of that trait. It is important that the cases and controls
are correctly matched with regard to ethnicity, environmental influences, and
other factors which could affect the phenotype being studied. Studies which
test polymorphism frequencies within groups exhibiting different phenotypes
and use statistical methods to compare the group polymorphism frequencies and
identify correlations with phenotypes are known as "association studies".
Some polymorphisms that occur in a single gene can alter the function of a
gene
sufficiently such that the polymorphism results in a disease (monogenic
disease). However, many
common human diseases are polygenic; that is they are the result of complex
interactions of
various forms of multiple genes. In the case of polygenic diseases, the
alteration of a single gene
may not be detrimental per se, but in combination with certain sequence
variants of other genes,
this altered DNA sequence may contribute to a disease phenotype. DNA variants
leading to
monogenic diseases are usually rare in a population due to the process of
natural selection against
those carrying the disease gene. As variants in genes that are involved in
polygenic disease do
not produce the disease phenotype unless they occur in the appropriate
combination with other
gene variants, normal individuals can carry a subset of the disease-
contributing variants without
suffering adverse effects. Thus, disease-contributing gene variants that are
associated with
polygenic diseases may exist at a high frequency in a normal population.
Selection against these
disease variant forms of a gene will only occur when they are present in the
appropriate disease-
causing combination and there may not necessarily be selection against these
gene variants in
individuals carrying a subset of the disease-contributing variants. Neutral
DNA variants do not
alter gene function or contribute to a disease, are under no selective
pressure and occur at variable
frequencies within populations.
Monogenic diseases tend to be rare within the population, and therefore few
patients may
be available for studies of these diseases. A polymorphism in a single
specific gene is necessary
and usually sufficient to cause a monogenic disease, such that associations
between the variant
gene and the phenotype are usually readily apparent. In cases where the
expression of a mutation
phenotype is complete ("complete penetrance"), the polymorphism present in
the disease gene
will not be found upon examination of a large number of normal individuals. If
there is not
complete penetrance then some apparently normal individuals will contain the
mutation; the
difference in frequency of occurrence of the variant gene in the disease group
as compared to the
normal population will reveal that the variant is associated with the disease.
In polygenic diseases, variation at different genes occurs in a combination
which alters
susceptibility to the disease. Although several genes may have variant forms
which can
contribute to a disease phenotype, it is not always necessary for a
contributing variant to be
present at every gene potentially contributing to the disease in a given
affected individual. For
example, a hypothetical disease could be caused by a particular combination of
variants at three
of four genes, designated as A, B, C, and D. Appropriate susceptibility
variants in combination
at any three of the genes can cause the susceptibility, i.e. one person with
increased susceptibility
may have susceptibility variants in genes A, B, and C, while another
individual with increased
susceptibility to the same disease will have susceptibility variants in genes
B, C, and D.
Therefore, although not all affected individuals will have the same
susceptibility variants, the net
result is that a diseased population will have susceptibility variant forms of
genes A, B, C, and
D at a higher frequency than an unaffected population (as detected by
association studies).
Unlike monogenic diseases which result from polymorphisms that are not present
in
control populations, the polymorphisms which contribute to the polygenic
disease are also
present in a normal population. As described in the example above, an
individual with
susceptibility polymorphisms in only one or two of the genes potentially
contributing to the
disease susceptibility will be normal with regard to disease susceptibility.
Therefore, normal
populations can be used to identify polymorphic regions of the genome in the
population, and
these regions can then be specifically tested in larger patient and control
populations. Typically,
a gene is analyzed for the presence of polymorphisms by testing between 2 and
100 normal
individuals in order to establish if a particular polymorphism is present for
that gene in the
population. Once a polymorphic site(s) has been defined, the polymorphic site
is then tested in
case (disease) and control (normal) populations and statistical analyses are
performed to identify
polymorphisms which occur at significantly different frequencies in the two
populations.
The determination of the statistical significance of polymorphism frequency
differences
is dependent upon the size of the observed frequency difference between the
populations, and on
the size of the populations being studied. If a significant difference is
found, then it can be
concluded that an association exists between the polymorphism and the
phenotype being studied.
A statistically significant difference is a frequency difference at a
particular site between
populations which would be expected to occur by chance in only 5 out of 100
tests. That is, a
difference which has a 95% probability of being a true difference due to the
effect of the gene.
The foregoing discussion describes a method of testing for an association
between a
polymorphism which is the direct contributor to a disease and the disease
phenotype. However,
polymorphisms which do not directly contribute to a disease can also be used
to identify regions
of the genome which contain genes that contribute to the disease by virtue of
their proximity to
disease-contributing polymorphisms.
In humans, DNA exists as 23 homologous pairs of linear molecules
(chromosomes).
Recombination is a process which results in reciprocal exchanges of short
homologous DNA
segments between these homologous DNA pairs. Only one of each of the 23 pairs
of
chromosomes is inherited by the offspring. The inherited chromosome is thus
made up of
tandemly arrayed segments of DNA derived from both of a pair of chromosomes.
Consequently,
DNA is transferred in segments from one generation to the next. Although the
boundaries of each
inherited segment may vary in each generation, the net effect is that
sequences of DNA which
are adjacent along the length of the molecule are inherited together at a
higher frequency than
sequences that are farther apart. If a region (continuous linear segment) of
DNA has two or more
polymorphisms that are close together, they will be co-inherited at a higher
frequency than
polymorphisms that are farther apart, as they are more likely to remain on the
same segment of
DNA during recombination. Therefore, if two or more polymorphisms are close
together, they
will occur together at a higher frequency in a population than would be
expected by random
segregation. This effect is known as linkage. Linkage studies are performed
using multiply
affected individuals within families; the most commonly used approach is to
test markers located
throughout the genome in many sets of affected sib pairs that share the same
phenotype. Markers
which are located in the region of a genome that contributes to the phenotype
will be inherited
in both siblings, along with the phenotype, at a higher frequency than
expected by chance.
Studies wherein data from many such families is compared can be used to
implicate a region of
a genome as one that contributes to a particular phenotype.
Linkage disequilibrium (LD) association studies provide another method for
using
polymorphisms in genetic studies. The method of LD involves making a
correlation at the
population level, between the alleles (alternative polymorphic forms of the
same sequence site)
present at different genomic sites. If site 1 has two variant forms, A and a,
and site 2 has two
variant forms B and b, the observation in a population that allele A at site 1
is more often found
with allele B at locus 2 than with allele b is an example of LD. If allele B
is a disease-
contributing polymorphism, then testing at allele A may show an association
with the disease.
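The A/B example can be quantified with two commonly used LD measures, the
coefficient D and the squared correlation r^2. These measures and the
function below are not named in the text above and are introduced here only
as an illustrative assumption.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, r_squared) for alleles A (site 1) and B (site 2).

    p_ab: frequency of chromosomes carrying both A and B.
    p_a, p_b: frequencies of allele A and allele B.
    """
    d = p_ab - p_a * p_b  # zero when the two sites are independent
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r_squared = d * d / denom if denom > 0 else float("nan")
    return d, r_squared
```

If A and B each have a frequency of 0.5 but occur together on 40 percent of
chromosomes rather than the 25 percent expected under independence, D is 0.15
and r^2 is 0.36.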
Linkage disequilibrium may be generated in several ways. Maintenance of LD in
a
population allows a disease association to be detected many generations after
the formation of
LD. The maintenance of LD is explained by linkage: the closer the two loci,
the longer (in terms
of number of generations) that particular LD is maintained. As a result,
polymorphisms which
do not directly contribute to a disease can be used to identify regions of the
genome which
contain a disease contributing polymorphism. If a polymorphism affects gene
function such that
it contributes to a phenotype being studied and is found to be associated with
the phenotype,
nearby (neutral) polymorphisms which are in LD with the disease polymorphism
may also show
an association with the disease. Conversely, if a polymorphism does not affect
gene function but
is found to be associated with a particular phenotype, this polymorphism is in
LD with a
different, but adjacent polymorphism that affects gene function such that it
contributes to the
phenotype being studied. If a neutral polymorphism is always inherited with a
phenotype-
contributing polymorphism, then the strength of the association of the neutral
polymorphism to
the phenotype will be equal to that of the polymorphism which affects gene
function and is
contributing to the phenotype. A polymorphism which shows an association with
a phenotype
(for instance with disease susceptibility) is a marker for that phenotype and
implicates the region
in which the polymorphism resides as a region containing a polymorphism which
contributes to
the phenotype. Additional flanking polymorphisms can be tested to determine
the precise
location of the true phenotype-contributing variant.
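The statement that closer loci retain LD for more generations can be illustrated
with the textbook decay relation D_t = D_0 (1 - r)^t, where r is the
recombination fraction between the two loci and t is the number of generations.
This relation assumes random mating and is not given in the text; the function
names are hypothetical.

```python
import math


def ld_after_generations(d_initial, recombination_fraction, generations):
    """LD remaining after a number of generations of random mating."""
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie in [0, 0.5]")
    if generations < 0:
        raise ValueError("generations must be non-negative")
    return d_initial * (1.0 - recombination_fraction) ** generations


def generations_to_halve_ld(recombination_fraction):
    """Number of generations needed for LD to fall to half its value."""
    if not 0.0 < recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie in (0, 0.5]")
    return math.log(0.5) / math.log(1.0 - recombination_fraction)


# Tightly linked loci (r = 0.001) keep LD far longer than loose ones (r = 0.1).
for r in (0.1, 0.01, 0.001):
    print(r, round(generations_to_halve_ld(r), 1))
```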
Linkage studies on families, and LD studies on populations have different
degrees of
resolution with regards to defining the size of a DNA region which contains
the phenotype-
contributing polymorphism. In general, linkage studies define an interval
which potentially
contains tens to hundreds of genes, while LD studies have been used to
implicate single genes
in the development of a particular phenotype.
3. Test Populations Useful for Polymorphism Genotyping
The invention provides methods of determining allelic frequencies by
performing
genotypic analyses in appropriate test populations.
The following study populations from the FAMOS study group may be utilized.
Bone Fracture Cohort: 1000 multiple or low trauma fracture cases and 1000
control cases
to determine genetic association with fracture.
BMD (Bone Mineral Density) Cohort: 300 high and 300 low BMD cases to study genetic
association with high or low BMD.
BMD Case Control Cohort: 500 low BMD and normal BMD case controls to study genetic
association with low BMD/fracture.
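One way allelic frequencies in such case and control cohorts might be compared is
a chi-square test on a 2 x 2 table of allele counts. The sketch below is an
illustration only: the text does not specify a statistical test, and the function
name and the counts are hypothetical. The test is computed without a continuity
correction.

```python
import math


def allelic_association(case_risk, case_other, control_risk, control_other):
    """Chi-square test (1 degree of freedom) on a 2 x 2 table of allele counts."""
    a, b, c, d = case_risk, case_other, control_risk, control_other
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column of the table must be non-empty")
    chi2 = n * (a * d - b * c) ** 2 / denominator
    # With 1 degree of freedom the survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (a * d) / (b * c) if b * c else math.inf
    return chi2, p_value, odds_ratio


# Hypothetical allele counts: 1000 cases and 1000 controls, 2000 alleles each.
chi2, p_value, odds_ratio = allelic_association(600, 1400, 500, 1500)
print(f"chi2={chi2:.2f} p={p_value:.1e} OR={odds_ratio:.2f}")
```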
4. Assays Useful for Determining the Association of a Polymorphism with Osteoporosis
Preventative treatment for osteoporosis is most effective at the time when
bone loss is increasing
and before the bones have become fragile and prone to fracturing. Established
diagnostic
techniques use x-ray and ultrasonography to measure skeletal parameters of
bone size, volume
and mineral density to predict fracture risk and to assess response to
therapy. Such
measurements give a "static" value which can be compared to normal values to
aid diagnosis of
low bone mass and fracture risk (Schott, Cormier et al. 1998). The World
Health Organization
defines osteoporosis as present when the bone mineral density levels are more
than 2.5 standard
deviations below the young normal mean. The various techniques used to measure
bone
mineral density are:
dual energy X-ray absorptiometry (DXA) - used to measure bone mass at the
lumbar
spine and hip, but it can also be applied to measuring total skeletal bone
mass, soft-tissue
composition and other regional bone measurements. Considered the "gold
standard" for BMD
measurement.
high-resolution quantitative computed tomography (QCT) - highly sensitive,
accurate
and specific spinal measurements. This technique is more costly and involves
higher radiation
doses than other techniques and is not widely available.
single-energy x-ray absorptiometry (SXA) - provides accurate radius BMD
measurements.
quantitative ultrasound (QUS) - new and promising technique which may have
applications in both BMD measurement and assessment of architectural
deterioration of bone tissue. Recent studies suggest QUS of calcaneus bone
predicts hip fracture as well as DXA (Hans, Dargent-Molina et al. 1996).
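The World Health Organization criterion quoted above, a bone mineral density more
than 2.5 standard deviations below the young normal mean, can be expressed as a
score in standard deviation units (commonly called a T-score). In the sketch
below the reference mean and standard deviation are hypothetical placeholders, and
treating a value exactly at the threshold as osteoporotic is an assumption.

```python
def t_score(bmd, young_mean, young_sd):
    """Number of standard deviations a BMD value lies from the young normal mean."""
    if young_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (bmd - young_mean) / young_sd


def is_osteoporotic(bmd, young_mean, young_sd, threshold=-2.5):
    """True when the BMD value is at or below the threshold (assumed boundary)."""
    return t_score(bmd, young_mean, young_sd) <= threshold


# Hypothetical reference values in g/cm2; real values depend on site and device.
YOUNG_MEAN, YOUNG_SD = 1.00, 0.12
for bmd in (0.94, 0.64):
    score = t_score(bmd, YOUNG_MEAN, YOUNG_SD)
    print(bmd, round(score, 2), is_osteoporotic(bmd, YOUNG_MEAN, YOUNG_SD))
```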
An alternative method to predict fracture independently of bone mass is to
measure bone
turnover. High turnover (bone resorption and formation) is associated with
rapid bone loss and
is likely to contribute to micro-architectural deterioration (Ross and
Knowlton 1998). This is a
"dynamic" measurement which is assessed with biochemical markers in urine or
serum and can
be used very effectively in therapy monitoring in preference to BMD
measurements which alter
more slowly (results of PEPI trial and Merck Research Laboratories). When used
in combination
with bone mass assessment, biomarkers can provide more accurate fracture
predictions over bone
mass measurement alone. Several markers for bone resorption (deoxypyridinoline
crosslinks),
and bone formation (bone alkaline phosphatase, osteocalcin) have been
developed for use in
diagnostic kits. The current challenge is to reduce the variability of the
measurements and
improve their reliability and applicability.
5. Methods of Genotyping Polymorphisms
The invention discloses methods for performing polymorphism genotyping. These
methods can be used to detect the presence of a polymorphism in a sample
comprising DNA or
RNA.
A DNA sample for analysis according to the invention may be prepared from any
tissue
or cell line, and preparative procedures are well-known in the art. The
preparation of genomic
DNA is performed as described in Section B.
RNA samples may also be useful for genotyping according to the invention.
Isolation of
RNA can be performed according to the following methods.
RNA is purified from mammalian tissue according to the following method.
Following removal of the tissue of interest, pieces of tissue of ≤2 g are cut and
quick frozen in liquid nitrogen to prevent degradation of RNA. Upon the addition
of a volume of 20 ml tissue
guanidinium solution per 2 g of tissue, tissue samples are ground in a
tissuemizer with two or three 10-second bursts. To prepare tissue guanidinium
solution (1 L), 590.8 g guanidinium isothiocyanate is dissolved in approximately
400 ml DEPC-treated H2O. 25 ml of 2 M Tris-Cl, pH 7.5 (0.05 M final) and 20 ml
Na2EDTA (0.01 M final) are added, the solution is stirred overnight, the volume
is adjusted to 950 ml, and 50 ml 2-ME is added.
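The final concentrations quoted in the recipe above can be checked with simple
dilution arithmetic. In this sketch the molecular weight of guanidinium
isothiocyanate (118.16 g/mol) is an assumption supplied for illustration, as is
the inference of the EDTA stock concentration from the stated volume and final
concentration; neither value appears in the text.

```python
GUANIDINIUM_ISOTHIOCYANATE_MW = 118.16  # g/mol; assumed, not given in the text
FINAL_VOLUME_ML = 1000.0                # the recipe prepares 1 L


def molarity_from_mass(grams, mol_weight, final_volume_ml):
    """Molar concentration of a solid dissolved to a final volume."""
    return grams / mol_weight / (final_volume_ml / 1000.0)


def molarity_after_dilution(stock_molar, stock_ml, final_volume_ml):
    """Molar concentration after diluting a stock to a final volume."""
    return stock_molar * stock_ml / final_volume_ml


def implied_stock_molarity(final_molar, stock_ml, final_volume_ml):
    """Stock concentration implied by a stated volume and final concentration."""
    return final_molar * final_volume_ml / stock_ml


guanidinium_m = molarity_from_mass(590.8, GUANIDINIUM_ISOTHIOCYANATE_MW, FINAL_VOLUME_ML)
tris_m = molarity_after_dilution(2.0, 25.0, FINAL_VOLUME_ML)
edta_stock_m = implied_stock_molarity(0.01, 20.0, FINAL_VOLUME_ML)
two_me_percent = 50.0 / FINAL_VOLUME_ML * 100.0

print(round(guanidinium_m, 2), round(tris_m, 3), round(edta_stock_m, 2), two_me_percent)
```

The Tris value reproduces the 0.05 M final concentration stated in the recipe.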
Homogenized tissue samples are subjected to centrifugation for 10 min at
12,000 x g at
12°C. The resulting supernatant is incubated for 2 min at 65°C in the presence of
0.1 volume of 20% Sarkosyl, layered over 9 ml of a 5.7 M CsCl solution (0.1 g
CsCl/ml), and separated by centrifugation overnight at 113,000 x g at 22°C. After
careful removal of the supernatant, the tube
is inverted and drained. The bottom of the tube (containing the RNA pellet) is
placed in a 50 ml
plastic tube and incubated overnight (or longer) at 4°C in the presence of 3 ml
tissue resuspension buffer (5 mM EDTA, 0.5% (v/v) Sarkosyl, 5% (v/v) 2-ME) to
allow complete resuspension of the RNA pellet. The resulting RNA solution is
extracted sequentially with
25:24:1
phenol/chloroform/isoamyl alcohol, followed by 24:1 chloroform/isoamyl
alcohol, precipitated
by the addition of 3 M sodium acetate, pH 5.2, and 2.5 volumes of 100%
ethanol, and
resuspended in DEPC water (Chirgwin et al., 1979, Biochemistry, 18: 5294).
Alternatively, RNA is isolated from mammalian tissue according to the
following single
step protocol. The tissue of interest is prepared by homogenization in a
glass-Teflon homogenizer in 1 ml denaturing solution (4 M guanidinium
thiocyanate, 25 mM sodium citrate, pH 7.0, 0.1 M 2-ME, 0.5% (w/v)
N-laurylsarcosine) per 100 mg tissue. Following transfer of the homogenate
to a 5-ml polypropylene tube, 0.1 ml of 2 M sodium acetate, pH 4, 1 ml water-
saturated phenol,
and 0.2 ml of 49:1 chloroform/isoamyl alcohol are added sequentially. The
sample is mixed after
the addition of each component, and incubated for 15 min at 0-4°C after
all components have
been added. The sample is separated by centrifugation for 20 min at 10,000 x
g, 4°C, precipitated
by the addition of 1 ml of 100% isopropanol, incubated for 30 minutes at -
20°C and pelleted by
centrifugation for 10 minutes at 10,000 x g, 4°C. The resulting RNA
pellet is dissolved in 0.3 ml
denaturing solution, transferred to a microfuge tube, precipitated by the
addition of 0.3 ml of
100% isopropanol for 30 minutes at -20°C, and centrifuged for 10
minutes at 10,000 x g at 4°C.
The RNA pellet is washed in 70% ethanol, dried, and resuspended in 100-200 ul
DEPC-treated water or DEPC-treated 0.5% SDS (Chomczynski and Sacchi, 1987, Anal.
Biochem. 162: 156).
RNA prepared according to either of these methods can be used for genotyping
by the
methods of Northern blot analysis, S1 nuclease analysis and primer extension
analysis (Ausubel et al., supra).
cDNA samples also may be prepared according to the invention, i.e., DNA that
is
complementary to RNA such as mRNA. The preparation of cDNA is well-known and
well-
documented in the prior art.
cDNA is prepared according to the following method. Total cellular RNA is
isolated (as
described) and passed through a column of oligo(dT)-cellulose to isolate polyA
RNA. The bound
polyA mRNAs are eluted from the column with a low ionic strength buffer. To
produce cDNA
molecules, short deoxythymidine oligonucleotides (12-20 nucleotides) are
hybridized to the
polyA tails to be used as primers for reverse transcriptase, an enzyme that
uses RNA as a
template for DNA synthesis. Alternatively, mRNA species can be primed from
many positions
by using short oligonucleotide fragments comprising numerous sequences
complementary to the
mRNA of interest as primers for cDNA synthesis. The resultant RNA-DNA hybrid
can be
converted to a double stranded DNA molecule by a variety of enzymatic steps
well-known in the
art (Watson et al., 1992, Recombinant DNA, 2nd edition, Scientific American
Books, New
York).
Tissues or fluids which are useful for obtaining a DNA or RNA sample according
to the
invention include but are not limited to plasma, serum, spinal fluid, lymph
fluid, external secretions of the skin, respiratory, intestinal and genitourinary
tracts, saliva, blood cells, tumors, organs, tissue and samples of in vitro cell
culture constituents.
Genotyping methods which are useful according to the invention, i.e., for the
detection
of polymorphisms in nucleic acid samples isolated from individuals, are
disclosed below.
Single Strand Conformation Polymorphism (SSCP) Screening and Fluorescent SSCP
Screening (fSSCP)
SSCP Analysis
One technique for detecting DNA sequence variations in a biological sample is
single
strand conformation polymorphism (SSCP) (Glavac et al., 1993, Hum. Mut. 2:404;
Sheffield et
al.,1993, Genomics 16:325). SSCP is a simple and effective technique for the
detection of single
base changes. This technique is based on the principle that single-stranded
DNA molecules
assume specific sequence-based secondary structures (conformers) under
nondenaturing
conditions. The detection of point mutations by single stranded conformation
polymorphism is
believed to be due to an alteration in the structure of single stranded DNA.
Molecules differing
by only a single base substitution may assume different conformers and migrate
differently in a
nondenaturing polyacrylamide gel. Single stranded DNAs that contain sequence
variations are
identified by an abnormal mobility on polyacrylamide gels. SSCP detects all
types of point
mutations and short insertions or deletions that are located between the PCR
primers (within the
probe region) with apparently equal efficiency. This technique has proven
useful for detection
of multiple mutations and polymorphisms, including SNPs. SSCP sensitivity
varies dramatically
with the size of the DNA fragment being analysed. The optimal size fragment
for sensitive
detection by SSCP is approximately 125-300bp.
The mobility of a single stranded DNA or double stranded DNA fragment during
electrophoresis through a gel matrix is dependent on its size. Small molecules
migrate more
rapidly than large molecules because they pass through the pores in the matrix
more easily.
Conventionally, electrophoresis of single stranded DNA involves a 'denaturing'
gel which
maintains the single strandedness of the molecules. The denaturant is
typically urea in
polyacrylamide gels, and typically formamide or sodium hydroxide in agarose
gels. In contrast, according to the SSCP screening protocol, single-stranded DNA
is analysed on a 'nondenaturing' gel. When single stranded DNA is analysed on a
'non-denaturing' gel, intramolecular interactions
can occur. In particular, the single stranded DNA is able to (partially) bind
to itself.
Consequently, DNA that is separated by electrophoresis on an SSCP gel does not
migrate as a
linear molecule but rather, the mobility of the DNA on an SSCP gel is governed
by both its size
and tertiary structure (conformation). The tertiary structure of a single
stranded DNA fragment
is dependent on the sequence of the entire fragment. Therefore, if a
polymorphism exists in a
given fragment, the conformation will usually be altered. The technique is
performed as follows.
One or more test DNA samples are prepared for analysis as described above, and
subject
to PCR amplification. Oligonucleotide primers are designed and synthesized as
described above.
Amplifications are performed in a total volume of 10 ul containing 50 mM KCl,
10 mM Tris-HCl, pH 9.0 (at 25°C), 0.1% Triton X-100, 1.5 mM MgCl2, 0.2 mM each
of dGTP, dATP, dTTP, 0.02 mM of non-radioactive dCTP, 0.05 ul [α-33P] dCTP
(1,000-3,000 Ci mmol-1; 10 mCi ml-1), 0.2
uM each primer, 50 ng genomic DNA (or 1 ng of cloned DNA template) and 0.1 U
Taq DNA
polymerase. The PCR cycling profile is as follows : preheating to 94°C
for 3 min followed by
94°C, 1 min; annealing temperature, 30 sec; 72°C, 45 sec for 35
cycles and a final extension at
72°C for 5 min. Annealing temperature is different for each PCR primer
pair and can be
optimized according to the parameters described above. Amplifications using
Vent Taq
polymerase (New England Biolabs) are performed in a total volume of 10 ul
using the buffer
provided by the manufacturer with 1 mM each of dGTP, dATP, dTTP, 0.02 mM dCTP,
0.25 ul
[α-33P] dCTP (1,000-3,000 Ci mmol-1; 10 mCi ml-1), 0.2 uM of each primer, 50 ng
of genomic
DNA (or 1 ng of cloned DNA template) and 0.1 U of Vent Taq DNA polymerase.
Samples are
heated to 98°C for 5 min prior to addition of enzyme and nucleotides.
The PCR cycling profile
is 98°C, 1 min; annealing temperature, 45 sec; 72°C, 1 min for
35 cycles, followed by a final
extension at 72°C for 5 min. The length and temperature of each step of
a PCR cycle, as well as
the number of cycles, is adjusted in accordance to the stringency
requirements, as described
above.
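The two cycling profiles given above can be written down as data, which also
allows a rough estimate of the programmed run time. The sketch below is
illustrative only: the class and variable names are hypothetical, ramp times
between temperatures are ignored, and the 98°C pre-heating step of the Vent
protocol is counted as the initial hold.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CyclingProfile:
    initial_hold_s: int
    cycle_steps_s: Tuple[int, int, int]  # denaturation, annealing, extension
    cycles: int
    final_extension_s: int

    def total_seconds(self) -> int:
        """Programmed hold time, excluding temperature ramping."""
        per_cycle = sum(self.cycle_steps_s)
        return self.initial_hold_s + self.cycles * per_cycle + self.final_extension_s


# Taq profile: 94°C 3 min; 35 x (94°C 1 min, anneal 30 sec, 72°C 45 sec); 72°C 5 min.
TAQ_PROFILE = CyclingProfile(180, (60, 30, 45), 35, 300)
# Vent profile: 98°C 5 min; 35 x (98°C 1 min, anneal 45 sec, 72°C 1 min); 72°C 5 min.
VENT_PROFILE = CyclingProfile(300, (60, 45, 60), 35, 300)

for name, profile in (("Taq", TAQ_PROFILE), ("Vent", VENT_PROFILE)):
    print(name, profile.total_seconds() / 60.0, "min")
```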
SSCP analysis is performed as follows. Ten ul of formamide dye (95% formamide,
20mM EDTA, 0.05% bromophenol blue, 0.05% xylene cyanol) are added to 10 ul
aliquots of
radiolabeled PCR product. Following denaturation at 100°C for 5 min,
the reaction mixture is
placed on ice. Two ul aliquots are loaded onto 8% acrylamide:bisacrylamide
(37.5:1), 0.5X TBE
(45 mM Tris-borate, 1 mM EDTA), 5% glycerol gels. Electrophoresis is carried
out at 25W at
4°C for 8 hours in 0.5X TBE. Dried gels are exposed to X-OMAT AR film (Kodak) and
the autoradiographs are analysed and scored for aberrant migration of bands (band
shifts). SSCP may be optimized, as desired, as taught in Glavac et al., 1993,
Hum. Mut. 2:404.
fSSCP Analysis
Techniques for screening multiple DNA samples simultaneously are also useful
for
performing rapid genotyping analysis on a large number of samples according to
the invention.
By pooling and multiplexing DNA samples in fluorescent SSCP (fSSCP) assays,
the high
throughput required for detecting sequence variations in a large number of
samples is achieved
(Makino et al., 1992, PCR Methods Appl. 2:10; Ellison et al., 1993,
BioTechniques 15:684).
According to the method of fSSCP, PCR products are visualized and analysed
using an ABI
fluorescent DNA sequencing machine. Different primer pairs are identified by
different color
fluorochromes (4 different fluorochromes are now available). fSSCP offers the
following
advantages over SSCP. Unlike SSCP, fSSCP does not require handling of
radioactive materials.
Furthermore, the fSSCP technique allows for automated data collection and
automated data analysis programs that detect aberrantly migrating samples. In
contrast, SSCP
evaluation involves visual
examination by an individual, and does not provide a means for correcting for
lane to lane
variations in electrophoretic conditions, as does fSSCP analysis.
fSSCP Analysis is performed as follows.
Amplifications are performed in a total volume of 10 ul containing 50 mM KCl,
10 mM Tris-HCl, pH 9.0 (at 25°C), 0.1% Triton X-100, 1.5 mM MgCl2, 0.2 mM each
of dGTP, dATP, dTTP, dCTP, 0.2 uM primer labeled with one of the fluorochromes
HEX, FAM, TET or JOE, 50 ng genomic DNA (or 1 ng of cloned DNA template) and
0.1 U Taq DNA polymerase. The PCR cycling profile is as follows: preheating to
94°C for 3 min followed by 94°C, 1 min; annealing temperature, 30 sec; 72°C,
45 sec for 35 cycles and a final extension at 72°C for 5 min. Annealing
temperature is different for each PCR primer pair. Amplifications using Vent
Taq polymerase
(New England Biolabs) are performed in a total volume of 10 ul using the
buffer provided by the
manufacturer with 1 mM each of dGTP, dATP, dTTP, dCTP, 0.2 uM primer labeled
with one
of the fluorochromes HEX, FAM, TET or JOE, 50 ng genomic DNA (or 1 ng of
cloned DNA
template) and 0.1 U of Vent Taq DNA polymerase. Samples are heated to
98°C for 5 min prior
to addition of enzyme and nucleotides. The PCR cycling profile is 98°C,
1 min; annealing
temperature, 45 sec; 72°C, 1 min for 35 cycles, followed by a final
extension at 72°C for 5 min.
Annealing temperature is different for each PCR primer pair. Two ul of
fluorescent PCR products
are added to 3 ul formamide dye (95% formamide, 20mM EDTA, 0.05% bromophenol
blue,
0.05% xylene cyanol), denatured at 100°C for 5 min, then placed on ice.
Thereafter, 0.5-1 ul of Genescan™ 1500 size markers are added as an internal
standard. Two ul of the mix is loaded
onto 8% or 10% acrylamide:bisacrylamide (37.5:1), 0.5X TBE (45 mM Tris-borate,
1 mM
EDTA), 5% glycerol gels and electrophoresis is performed on an ABI 377 DNA
sequencing
machine. Gel temperature is maintained between 4° and 10°C by an
external cooling unit
connected to the internal cooling plumbing and chambers. Electrophoresis is
carried out at 2500-
3500 volts for 4 - 10 hours in 0.5X TBE. Data is automatically collected and
analysed with
Genescan and Genotype analysis software (ABI).
The fSSCP procedure identifies regions of 150-300 base pairs containing a
sequence
variation. To identify the exact sequence change, the fragment which
demonstrates the aberrant
migration is amplified again from the same biological sample, using non
fluorescent primers. The
sequence is then determined using standard DNA sequencing methods well known
to those
skilled in the art (Ausubel et al., supra).
Although SSCP and fSSCP techniques are preferred according to the invention,
other
methods for detecting sequence variations, including DNA sequencing, can be
employed.
Additional techniques for detecting DNA sequence variations useful according
to the invention
are described below.
Fluorescence Polarization-TDI
Fluorescence polarization-TDI is another preferred technique according to the
invention
for the detection of sequence variations. Template-directed primer extension
is a dideoxy chain
terminating DNA sequencing protocol designed to ascertain the nature of the
one base
immediately 3' to the sequencing primer that is annealed to the target DNA
immediately upstream
from the polymorphic site. In the presence of DNA polymerase and the
appropriate
dideoxyribonucleoside triphosphate (ddNTP), the primer is extended
specifically by one base as
dictated by the target DNA sequence at the polymorphic site. By determining
which ddNTP is
incorporated, the alleles present in the target DNA can be determined.
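The single-base extension logic described above can be sketched in a few lines.
For simplicity the sketch represents each allele by the strand that has the same
sense as the primer (the complement of the template actually copied), so that the
base added to the primer is simply the base that follows the primer in that
strand. The sequences and function names are hypothetical.

```python
def extended_base(sense_strand, primer):
    """Return the base that a polymerase would add to the 3' end of the primer.

    sense_strand is written 5' to 3' and has the same sense as the primer,
    so it contains the primer sequence followed by the polymorphic base.
    """
    start = sense_strand.find(primer)
    if start == -1:
        raise ValueError("primer does not match the target sequence")
    position = start + len(primer)
    if position >= len(sense_strand):
        raise ValueError("no base follows the primer in the target sequence")
    return sense_strand[position]


def genotype(allele_strands, primer):
    """Combine the bases incorporated on each allele into a genotype string."""
    bases = sorted({extended_base(strand, primer) for strand in allele_strands})
    return "/".join(bases)


PRIMER = "ACGTTGCA"           # 3' end sits immediately upstream of the site
ALLELE_1 = "ACGTTGCAGTTC"     # polymorphic base G
ALLELE_2 = "ACGTTGCAATTC"     # polymorphic base A

print(genotype([ALLELE_1, ALLELE_2], PRIMER))  # heterozygous sample
print(genotype([ALLELE_1, ALLELE_1], PRIMER))  # homozygous sample
```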
Fluorescence polarization is based on the observation that when a fluorescent
molecule
is excited by plane-polarized light, it emits polarized fluorescent light into
a fixed plane if the
molecules remain stationary between excitation and emission. However, because
the molecule
rotates and tumbles in solution, fluorescence polarization is not observed
fully by an external
detector. The fluorescence polarization of a molecule is proportional to the
molecule's rotational
relaxation time, which is related to the viscosity of the solvent, absolute
temperature, molecular
volume, and the gas constant. If the viscosity and temperature are held
constant, then
fluorescence polarization is directly proportional to the molecular volume,
which is directly
proportional to the molecular weight. If the fluorescent molecule is large
(with high molecular
weight), it rotates and tumbles more slowly in solution and fluorescence
polarization is
preserved. If the molecule is small (with low molecular weight), it rotates
and tumbles faster and
fluorescence polarization is largely lost (depolarized).
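The dependence described above is conventionally summarized by the Perrin
equation. The form below uses standard notation that is not taken from the text:
P is the observed polarization, P_0 the limiting polarization of the immobile
fluorophore, \tau the fluorescence lifetime, \rho the rotational relaxation time,
\eta the solvent viscosity, V the molecular (molar) volume, R the gas constant
and T the absolute temperature.

```latex
\left(\frac{1}{P}-\frac{1}{3}\right)
  = \left(\frac{1}{P_0}-\frac{1}{3}\right)\left(1+\frac{3\tau}{\rho}\right),
\qquad
\rho = \frac{3\eta V}{RT}
```

At fixed viscosity and temperature a larger V gives a larger \rho, so P moves
toward P_0, which corresponds to the preserved polarization of a large molecule.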
In the FP-TDI assay, the sequencing primer is an unmodified primer with its 3'
end
immediately upstream from a polymorphic or mutation site. When incubated in
the presence of
ddNTPs labeled with different fluorophores, the allele-specific dye ddNTP is
incorporated onto
the TDI primer in the presence of DNA polymerase and target DNA. The genotype
of the target
DNA molecule can be determined simply by exciting the fluorescent dye in the
reaction and
determining whether a change in fluorescence polarization occurs (Chen et al.,
1999, Genome Res. 9:492).
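A genotype call from the two dye readings might be sketched as follows. This is
an illustration under stated assumptions rather than the scoring rule of the
cited reference: polarization is taken to be reported in millipolarization (mP)
units, each dye is compared with a no-template control, and the 40 mP minimum
rise is a hypothetical threshold.

```python
def call_fp_tdi_genotype(fp_dye1, fp_dye2, control_dye1, control_dye2,
                         min_rise_mp=40.0):
    """Call a genotype from fluorescence polarization of two dye-ddNTPs.

    A dye whose polarization rises above its control by at least
    min_rise_mp is treated as incorporated onto the TDI primer.
    """
    dye1_incorporated = (fp_dye1 - control_dye1) >= min_rise_mp
    dye2_incorporated = (fp_dye2 - control_dye2) >= min_rise_mp
    if dye1_incorporated and dye2_incorporated:
        return "heterozygous"
    if dye1_incorporated:
        return "homozygous allele 1"
    if dye2_incorporated:
        return "homozygous allele 2"
    return "no call"


# Hypothetical readings (mP) for three samples against controls of 50 and 55 mP.
for reading in ((160.0, 58.0), (150.0, 140.0), (52.0, 57.0)):
    print(reading, call_fp_tdi_genotype(reading[0], reading[1], 50.0, 55.0))
```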
One or more test DNA samples are prepared for analysis as described above, and
subject
to PCR amplification. Oligonucleotide primers are designed and synthesized as
described above.
Amplifications are performed in a total volume of 10 ul containing 50 mM KCl,
10 mM Tris-HCl, pH 9.0 (at 25°C), 0.1% Triton X-100, 1.5 mM MgCl2, 0.2 mM each
of dGTP, dATP, dTTP, 0.02 mM of non-radioactive dCTP, 0.05 ul [α-33P] dCTP
(1,000-3,000 Ci mmol-1; 10 mCi ml-1), 0.2
uM each primer, 50 ng genomic DNA (or 1 ng of cloned DNA template) and 0.1 U
Taq DNA
polymerase. The PCR cycling profile is as follows : preheating to 94°C
for 3 min followed by
94°C, 1 min; annealing temperature, 30 sec; 72°C, 45 sec for 35
cycles and a final extension at
72°C for 5 min. Annealing temperature is different for each PCR primer
pair and can be
optimized according to the parameters described above. Amplifications using
Vent Taq
polymerase (New England Biolabs) are performed in a total volume of 10 ul
using the buffer
provided by the manufacturer with 1 mM each of dGTP, dATP, dTTP, 0.02 mM dCTP,
0.25 ul
[α-33P] dCTP (1,000-3,000 Ci mmol-1; 10 mCi ml-1), 0.2 uM of each primer, 50 ng
of genomic
DNA (or 1 ng of cloned DNA template) and 0.1 U of Vent Taq DNA polymerase.
Samples are
heated to 98°C for 5 min prior to addition of enzyme and nucleotides.
The PCR cycling profile
is 98°C, 1 min; annealing temperature, 45 sec; 72°C, 1 min for
35 cycles, followed by a final
extension at 72°C for 5 min. The length and temperature of each step of
a PCR cycle, as well as
the number of cycles, is adjusted in accordance to the stringency
requirements, as described
above.
Following PCR amplification, unused PCR primers and dNTPs are destroyed by
adding 2 ul of PCR product to 2 ul of SAP/Exonuclease cocktail (0.1 U shrimp
alkaline phosphatase (1
U/ul, Amersham Pharmacia Biotech, Inc., Piscataway, NJ) and 0.2 U E. coli
exonuclease I (10 U/ul, Amersham) in SAP buffer (20 mM Tris-HCl, pH 8.0; 10 mM
MgCl2, Amersham)) per well of a 384-well Black PCR plate (ABI). The mixtures are
incubated at 37°C for 60 min before the
for 60 min before the
enzymes are heat inactivated at 95°C for 15 min. The mixture is held at
4°C until used in the FP
TDI assay.
To the enzymatically treated PCR product is added 2 ul of TDI reaction cocktail
containing TDI buffer (50 mM Tris-HCl (pH 9.0), 50 mM KCl, 5 mM NaCl, 2 mM
MgCl2, 8% glycerol), 1 uM TDI primer, 12.5 nM of each of two allele-specific
dye-labeled ddNTPs (ROX-ddGTP, BFL-ddATP, Tamra-ddCTP, or R6G-ddUTP; NEN Life
Science Products, Inc., Boston, MA), and 0.32 U Thermo Sequenase (Amersham). The
reaction mixtures are incubated at 94°C for 15 min, followed by 34 cycles of
94°C for 30 seconds and 55°C for 15 seconds. Upon completion of the reaction
cycles, the samples are held at 4°C.
After the primer extension reaction, 24 ul of TE buffer/methanol (2:1) is
added to each
sample well, and the fluorescence polarization is measured using a LJL Analyst
(LJL Biosystems,
Sunnyvale, CA).
Denaturing Gradient Gel Electrophoresis
Denaturing gradient gel electrophoresis (DGGE) is a gel system which allows
electrophoretic separation of DNA fragments differing in sequence by a single
base pair. The
separation is based upon differences in the temperature of strand
dissociation of the wild-type
and mutant molecules. During electrophoresis, fragments migrating through the
gel are exposed
to an increasing concentration of denaturant in the gel. When the DNA
fragments are exposed
to a critical level of denaturant, the DNA strands begin to dissociate. This
dissociation causes a
significant reduction in the mobility of the fragment. The position in the gel
at which the level
of denaturant is critical for a particular DNA fragment is a function of
the Tm of the DNA
fragment and is therefore different for wild-type versus mutant fragments.
Consequently, upon
migration to the position at which the level of denaturant is at the critical
point, for either the
wild-type or the mutant fragment, the mobility of these two molecules will
become different, thus
resulting in their separation. The mutation detection rate of DGGE approaches
100%. Although
the technique of DGGE is relatively simple to perform, and does not
require radioisotopes or
toxic chemicals, it does require some specialized equipment. Furthermore, DGGE
can only be
used to analyze fragments between 100 and 800bp due to the resolution limit of
polyacrylamide
gels. DGGE is advantageous over other methods useful for detecting sequence
variations because
the behavior of DNA molecules on DGGE gels can be modeled by computer thereby
making it
possible to accurately predict the detectability of a mutation in a given
fragment. Genomic DNA
fragments can be efficiently transferred from the gel following DGGE as
described in US Patent
No. 5,190,856.
Chemical Cleavage of Mismatches
Chemical cleavage of mismatch (CCM) is another technique for detection of
sequence
variations that is useful according to the invention. CCM is based upon the
ability of
hydroxylamine and osmium tetroxide to react with the mismatch in a DNA
heteroduplex and the
ability of piperidine to cleave the heteroduplex at the point of mismatch.
According to the
method of CCM, sequence variations are detected by the appearance of fragments
that are smaller
than the untreated heteroduplex following denaturing polyacrylamide gel
electrophoresis.
DNA fragments up to 1kb in size can be analysed by CCM with a probable 100%
detection rate for sequence variation. CCM is particularly useful for either
detecting all of the
sequence variations in a particular fragment of DNA or for determining that
there are no
sequence variations in a particular fragment of DNA.
Constant Denaturant Capillary Electrophoresis (CDCE) Analysis
CDCE analysis is particularly useful in high throughput screening, i.e.,
wherein large
numbers of DNA samples are analysed. CDCE analysis combines several elements
of both
replaceable linear polyacrylamide capillary electrophoresis and constant
denaturant gel
electrophoresis. The technique of CDCE is a rapid, high resolution procedure
that demonstrates
a high dynamic range, and is automatable. The method of CDCE, as described in
detail in
Khrapko et al., 1994, Nucleic Acids Res. 22:364, involves the use of a zone of
constant
temperature and a denaturant concentration in capillary electrophoresis.
Linear polyacrylamide
gel electrophoresis is performed at viscosity levels that permit facile
replacement of the matrix
after each run. For a typical 100 bp fragment of DNA, point mutation-containing
heteroduplexes are separated from wild type homoduplexes in less than 30
minutes. Using laser-induced fluorescence to detect fluorescent-tagged DNA, the
system has an absolute limit of detection of 3 x 10^4 molecules with a linear
dynamic range of six orders of magnitude. The relative limit of detection is
about 3/10,000, i.e., 100,000 mutant sequences are recognized among 3 x 10^8
wild type sequences. This approach is applicable to analysis of low frequency
mutations, and to
genetic screening of pooled samples for detection of rare variants.
RNase Cleavage
An additional method for genotyping that is useful according to the invention
is RNASE
Cleavage. Various ribonuclease enzymes, including RNASE A, RNASE T1 and RNASE
T2
specifically digest single stranded RNA. When RNA is annealed to form double
stranded RNA
or an RNA/DNA duplex, it can no longer be digested with these enzymes.
However, when a
mismatch is present in the double stranded molecule, cleavage at the point of
mismatch may
occur.
RNASE Cleavage is preferably performed with RNASE A. Ribonuclease A
specifically
digests single stranded RNA but can also cleave heteroduplex molecules at the
point of
mismatch. The extent of cleavage at single base mismatches depends on both the
type of
mismatch, and the sequence of DNA flanking the mismatch. Sequence variations
leading to
mismatch are indicated by the presence of fragments that are smaller than the
uncleaved
heteroduplex on denaturing polyacrylamide gels.
According to the invention, RNASE Cleavage involves forming a heteroduplex
between
a radiolabeled single stranded RNA probe (riboprobe) and a PCR product derived
from a
biological sample. If a point mutation is present in the PCR product,
following treatment of the
resulting RNA/DNA heteroduplex with RNASE A, the RNA strand of the duplex may
be
cleaved. The sample is then denatured by heating and analysed on a denaturing
polyacrylamide
gel. If the RNA probe has not been cleaved, it will be the same size as the
PCR product. If the
probe has been cleaved, it will be smaller than the PCR product. RNASE
Cleavage can be used
to easily detect a 1 bp deletion. However, small insertions may not be as
easily detected as small
deletions, by RNASE Cleavage, as 'looping-out' occurs on the target strand
rather than the probe
strand.
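The size comparison used to read an RNase cleavage gel can be illustrated with
an idealized sketch. It assumes complete cleavage of the riboprobe at a single
mismatch position and ignores partial digestion; the function name, the probe
length and the mismatch position are hypothetical.

```python
def riboprobe_fragments(probe_length, mismatch_position=None):
    """Return the expected probe fragment sizes, largest first.

    mismatch_position counts nucleotides from the 5' end of the probe to the
    cleavage point; None means the probe is fully matched and stays intact.
    """
    if probe_length <= 0:
        raise ValueError("probe length must be positive")
    if mismatch_position is None:
        return [probe_length]
    if not 0 < mismatch_position < probe_length:
        raise ValueError("mismatch must lie inside the probe")
    fragments = [mismatch_position, probe_length - mismatch_position]
    return sorted(fragments, reverse=True)


print(riboprobe_fragments(300))        # no sequence variation: intact probe
print(riboprobe_fragments(300, 120))   # cleaved at a mismatch: two smaller bands
```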
Heteroduplex Analysis
Another method for genotyping according to the invention is heteroduplex
analysis.
Heteroduplex molecules, i.e., double stranded DNA molecules containing a
mismatch, can be
separated from homoduplex molecules on ordinary gels. The exact rate of
detection of sequence
variations by heteroduplex analysis is unknown, but is clearly significantly
lower than 100%.
Presumably, the sequence of DNA flanking the mismatch, rather than the actual
mismatch affects
the detectability. Mismatches that are located in the middle of a DNA fragment
are detected most
easily. Although heteroduplex analysis is less sensitive than some of the
other genotyping
methods described, it may be considered useful according to the invention due
to its simplicity.
Mismatch Repair Detection (MRD)
Another technique that is useful for genotyping according to the invention is
mismatch
repair detection (MRD). MRD is an in vivo method that detects DNA sequence
variation by the
occurrence of a change in bacterial colony color. DNA fragments to be screened
for variation are
cloned into two MRD plasmids, and bacteria are transformed with heteroduplexes
of these
constructs. The resulting colonies are blue in the absence of a mismatch and
white in the presence
of a mismatch. MRD can be used to detect a single mismatch in a DNA fragment
as large as 10
kb in size. MRD permits high-throughput screening of genetic mutations, and is
described in
detail in Faham et al., 1995, Genome Research 5:474.
Mismatch Recognition by DNA Repair Enzymes
Another technique that is useful for detecting sequence variations according
to the
invention is Mismatch Recognition by DNA Repair Enzymes. The E.coli mismatch
correction
systems are well-understood. Three of the proteins required for the methyl-
directed DNA repair
pathway, MutS, MutL and MutH, are sufficient to recognize 7 of the possible 8
single base-pair
mismatches (C/C mismatches are not recognized) and cut/nick the DNA at the
nearest GATC
sequence. The MutY protein, which is involved in a distinct repair system, can
also be used to
detect A/G and A/C mismatches. Some mammalian enzymes are also useful for
mismatch
recognition: thymidine glycosylase can recognize all types of T mismatch and
'all-type
endonuclease' or Topoisomerase I is capable of detecting all 8 mismatches, but
does so with
varying efficiencies, depending on both the type of mismatch and the
neighboring sequence.
The MutS gene product is the methyl-directed repair protein which binds to the
mismatch.
Purified MutS protein has been used to detect mutations by several different
methods. Gel
mobility assays can be performed in which DNA bound to the MutS protein
migrates more
slowly through an acrylamide gel than free DNA. This method has been used to
detect single
base mismatches.
An alternative method for the use of MutS in mismatch recognition, which does
not
require gel electrophoresis, involves the immobilization of MutS protein on
nitrocellulose
membranes. Labeled heteroduplexed DNA is used to probe the membrane in a dot-
blot format.
When both DNA strands are used, all mismatches can be recognized by binding of
the DNA to
the protein attached to the membrane. Although C/C mismatches are not
detected, the
corresponding G/G mismatch derived from the other strand is recognized. This
technique is
particularly useful because it is simple, inexpensive, and amenable to
automation. However, the
detection efficiency of this method may be limited by the size of the DNA
fragment. In particular,
this method works well for very short fragments.
Sequencing by Hybridization (SBH)
An alternative method for detecting sequence variations according to the
invention is
sequencing by hybridization (SBH). According to this method, arrays of short
(8-10 base long)
oligonucleotides are immobilized on a solid support in a manner similar to the
reverse dot-blot
protocol, and probed with a target DNA fragment. In particular,
oligonucleotides are synthesized
together and directly onto the support.
The synthesis system begins with a silicon chip coated with a nucleotide
linked to a light-sensitive
chemical blocking group. Light is used to illuminate particular grid
co-ordinates, removing the
blocking group at these positions. The chip is then exposed to the next
photoprotected nucleotide,
which polymerizes onto the exposed nucleotides.
In this manner, as a result of successive rounds of nucleotide additions,
oligonucleotides
of different sequences can be synthesized at different positions on the solid
support. Thirty-two
cycles of specific additions (i.e., 8 additions of each of the four
nucleotides) should enable the
production of all 65,536 possible 8-mer oligonucleotides at defined positions
on the chip.
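The arithmetic behind these figures can be checked with a short sketch. It is illustrative only: the names `all_kmers` and `synthesis_cycles` are hypothetical and are not part of any chip-synthesis software.

```python
from itertools import product

BASES = "ACGT"

def all_kmers(k):
    """Yield every possible oligonucleotide of length k over A, C, G and T."""
    return ("".join(p) for p in product(BASES, repeat=k))

def synthesis_cycles(k):
    """One addition cycle per nucleotide type per position: 4 * k cycles."""
    return len(BASES) * k

# Count the 8-mers by enumeration rather than by formula.
n_8mers = sum(1 for _ in all_kmers(8))
print(n_8mers, synthesis_cycles(8))
```

Enumerating the sequences gives 4^8 = 65,536 distinct 8-mers, and 4 x 8 = 32 addition cycles, in agreement with the text.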
When the chip is probed with a DNA molecule, e.g., a fluorescently labeled PCR
product,
fully matched hybrids should give a high intensity of fluorescence and hybrids
with one or more
mismatches should give substantially less intense fluorescence. The
combination of the position
and intensity of the signals on the chip enables computers to derive the
sequence of the DNA
molecule being analysed for the presence of sequence variations.
Allele-Specific Oligonucleotide Hybridization
The technique of allele-specific oligonucleotide (ASO) hybridization or the
'dot-blot' is
also useful for genotyping according to the invention. Under specific
hybridization conditions,
an oligonucleotide will only bind to a PCR product if the two are 100%
identical. A single base
pair mismatch is sufficient to prevent hybridization. A pair of
oligonucleotides, one carrying the
wild type base and the other carrying a single base change, as compared to the
wild type
sequence, can be used to determine if a PCR product is homozygous wild type,
heterozygous or
homozygous mutant for a particular base change. When performing conventional
dot blots, the
PCR product is fixed onto a nylon membrane and probed with a labeled
oligonucleotide. When
performing a 'reverse dot blot', an oligonucleotide is fixed to a membrane and
probed with a
labeled PCR product. The probe may be isotopically labeled, or non-
isotopically labeled. The
technique allows for the genotyping of multiple PCR amplified samples for the
presence of a
single base change.
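The genotype call from a pair of allele-specific probes can be sketched as follows. This is a simplified model, not a laboratory protocol: hybridization is idealized as an exact substring match, probes and alleles are written in the same strand orientation, and the function names and sequences are hypothetical.

```python
def hybridizes(probe, target):
    """Idealized stringent hybridization: the probe binds only on a perfect match."""
    return probe in target

def aso_genotype(alleles, wild_type_probe, variant_probe):
    """Call a genotype from the two allele sequences present in a PCR product."""
    wt = any(hybridizes(wild_type_probe, a) for a in alleles)
    var = any(hybridizes(variant_probe, a) for a in alleles)
    if wt and var:
        return "heterozygous"
    if wt:
        return "homozygous wild type"
    if var:
        return "homozygous mutant"
    return "no call"

# Hypothetical alleles differing at a single base (G versus A).
WT_ALLELE = "TTGACCGTAGCTA"
MUT_ALLELE = "TTGACCATAGCTA"
WT_PROBE = "ACCGTAG"
MUT_PROBE = "ACCATAG"

print(aso_genotype([WT_ALLELE, MUT_ALLELE], WT_PROBE, MUT_PROBE))
```

A signal from both probes indicates a heterozygote, and a signal from only one probe indicates the corresponding homozygote.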
Allele-Specific PCR
Many methods for identifying sequence variations involve the analysis of PCR-
amplified
DNA. The allele-specific polymerase chain reaction (also called the
amplification refractory
mutation system or ARMS) comprises an assay that occurs during the PCR
reaction itself. ARMS
requires the use of sequence-specific PCR primers which differ from each other
at their terminal
3' nucleotide and are designed to amplify only the normal allele in one
reaction, and only the
mutant allele in another reaction. When the 3' end of a specific primer is
100% identical to the
target, amplification occurs. When the 3' end of a specific primer is not 100%
identical to the
target, amplification does not occur. Agarose gel electrophoresis is used to
detect the presence
of an amplified product. A heterozygous sample is characterized by
amplification products in both reactions, a homozygous wild-type sample
generates product in only the normal reaction, and a homozygous mutant sample
generates product in
only the mutant reaction.
This technique can be modified so that the 5' ends of the allele-specific
primers are
labeled with different fluorescent labels, and the 5' ends of the common
primers are biotin labeled.
According to this alternate protocol, the wild-type specific and the mutant-
specific reactions are
performed in a single tube. The advantages of this approach are that a gel
electrophoresis step
is not required, and the method is amenable to automation.
Primer-Introduced Restriction Analysis
The method of primer-introduced restriction analysis (PIRA) can also be used
for
genotyping according to the invention. PIRA is a technique which allows known
sequence
variations to be detected by restriction digestion. By introducing a base
change close to the
position of a known sequence variation (for example by using a PCR primer
containing a
mismatch, as compared to the target sequence), it is possible to create a
restriction endonuclease
recognition site that indicates the presence of a particular sequence change.
The combination of
the altered base in the primer sequence and the altered base at the mutation
site, creates a new
restriction enzyme target site. This approach may be used to create a new
restriction enzyme site
in either the wild-type allele or the mutant allele. If a novel restriction
enzyme site is introduced
in the mutant allele then, following digestion with the appropriate
restriction enzyme, the
homozygous wild-type form would produce a single band of the full-length size,
the homozygous
mutant form would produce a single band of the reduced size and the
heterozygous form would
produce both full length and reduced sized bands. Band size will be analysed
by gel
electrophoresis.
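The expected band patterns can be tabulated with a small sketch. It assumes the new restriction site lies in the mutant allele and that the short fragment released next to the primer is too small to score on the gel. The function name `pira_bands` and the fragment sizes are hypothetical.

```python
def pira_bands(alleles, full_length, cut_offset):
    """Return the scorable band sizes (bp), largest first, for a diploid sample.

    alleles     -- two entries, each "wt" or "mut"
    full_length -- size of the undigested PCR product in bp
    cut_offset  -- distance in bp from the primer end to the introduced cut site
    """
    bands = set()
    for allele in alleles:
        if allele == "mut":
            # Site is present only in the mutant allele, so the product is shortened.
            bands.add(full_length - cut_offset)
        else:
            # Wild-type product lacks the site and stays full length.
            bands.add(full_length)
    return sorted(bands, reverse=True)

print(pira_bands(("wt", "mut"), 150, 20))
```

For a 150 bp product cut 20 bp from the primer end, a heterozygote gives bands of 150 and 130 bp, while each homozygote gives a single band.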
Oligonucleotide Ligation Assay
The technique of oligonucleotide ligation can also be used for genotyping
according to
the invention.
The method of oligonucleotide ligation is based on the following observations.
If two
oligonucleotides are annealed to a strand of DNA and are exactly juxtaposed,
they can be joined
by the enzyme DNA ligase. If there is a single base pair mismatch at the
junction of the two
oligonucleotides then ligation will not occur. According to the method of
oligonucleotide
ligation, the two oligonucleotides used in the assay are modified by the
addition of two different
labels. According to this method, the assay for a ligated product involves
detecting a ligated
product by assaying for the appearance of the labels of the two
oligonucleotides on a single
molecule rather than visualization of a new, larger sized DNA fragment by gel
electrophoresis.
When ligation reactions are conducted in 96-well microtiter plates and
ligation is scored
by ELISA, the oligonucleotide ligation assay can be performed by a robot and
the results can be
analysed by a plate reader and fed directly into a computer. This method is
therefore extremely
useful for detecting the presence of a sequence variation in a large number of
samples. The
oligonucleotide ligation assay is performed on PCR-amplified DNA. A
modification of this
assay, termed the ligase chain reaction, is performed on genomic DNA and
involves
amplification with a thermostable DNA ligase.
Direct DNA Sequencing
Genotyping according to the invention may also be carried out by directly
sequencing the
DNA sample in the region of the gene of interest, using DNA sequencing
procedures well-known
in the art (described above in Section D, entitled "Isolation of a Wild Type
Gene").
Mini-Sequencing
The technique of mini-sequencing (also known as single nucleotide primer
extension) can
also be used to detect any known point mutation, deletion or insertion,
according to the invention.
Obtaining sequence information for just a single base pair only requires the
sequencing of that
particular base. This can be done by including only one base in the sequencing
reaction rather
than all four. When this base is labeled and complementary to the first base
immediately 3' to
the primer (on the target strand), the label will be incorporated. Thus, a
given base pair can
be sequenced on the basis of label incorporation or failure of incorporation
without the need for
electrophoretic size separation.
5' Nuclease Assay
Genotyping according to the invention can also be performed by the method of
5'
nuclease assay. The 5' nuclease assay is a technique that monitors the extent
of amplification in
a PCR reaction on the basis of the degree of fluorescence in the reaction mix.
A low level of
fluorescence indicates no amplification or very poor amplification and a high
level of
fluorescence indicates good amplification. This system can be adapted to
permit identification
of known sequence variations, without the need for any post-PCR analysis
other than
fluorescence emission analysis.
PCR amplification is detected by measuring the 5' to 3' exonuclease activity
of Taq
polymerase. Taq polymerase cleaves 5' terminal nucleotides of double stranded
DNA. The
preferred substrate for Taq polymerase is a partially double stranded
molecule. Taq polymerase
cleaves the strand that contains the closest free 5' end. According to the 5'
nuclease assay, an
oligonucleotide 'probe' which is phosphorylated at its 3' end so as to render
it incapable of
serving as a DNA synthesis primer, is included in the PCR reaction. The probe
is designed to
anneal to a position between the two amplification primers. When an actively
extending Taq
polymerase molecule reaches the probe molecule, it partially displaces the
probe and then cleaves
the probe at or near the single stranded/double stranded junction.
The polymerase continues this process
of
displacement and cleavage until the entire probe is broken up and removed from
the template.
The probe is labeled in a manner that permits detection of the removal of the
probe. In particular,
the probe is labeled at different positions with two different fluorescent
labels. One label has a
localized quenching effect on the fluorescence of the other (reporter) label.
This effect is
mediated by energy transfer from one dye to the other, and requires that the
two dyes are in close
proximity to each other. If the probe is cleaved at a position between the
reporter and the
quencher dyes, the two dyes become physically separated thereby resulting in
an increase in
fluorescence which is proportional to the yield of the PCR product.
Representational Difference Analysis (RDA)
Genotyping according to the invention can also be carried out by
Representational
Difference Analysis (RDA). RDA is described in detail in Lisitsyn et al.,
1993, Science 259:946,
and an adaptation which combines selective breeding with RDA is described in
Lisitsyn et al.,
1993, Nature Genet. 6:57. RDA identifies sequence dissimilarities through the
application of a
powerful approach to subtractive hybridization. According to the method of
RDA, one first
creates simplified representations, called amplicons, from two samples that
are being compared.
An amplicon can comprise, for example, the set of BglII fragments that are
small enough to be
amplified by the PCR. The iterative subtraction step begins with the ligation
of a special adaptor
to the 5' end of fragments contained in the amplicon derived from the test
sample (tester
amplicon). The tester amplicon is then melted and briefly reannealed in the
presence of a large
excess of amplicon, derived from the wild type sample (driver amplicon). Those
tester fragments
that reanneal (presumably fragments absent from the wild type, driver
amplicon) can serve as a
template for the addition of the adaptor sequence to the 3'-end of the
"partner" fragment. As a
result, these tester fragments can be exponentially amplified by PCR. This
procedure is then
repeated to achieve successively higher enrichment.
RDA may be used to clone sequences that are either wholly absent from the wild
type
sample or are present in the wild type DNA, but are contained in a restriction
fragment that is too
large to be amplified in the amplicon. The former case may arise from a total
deletion; the latter
from a restriction fragment length polymorphism with the short allele present
in the tester but not
the wild type DNA. RDA is useful for subtracting DNA from an individual with a
particular
disease from normal DNA so as to identify regions showing homozygous or
heterozygous
deletions; locating fragments present in a parent with a dominant disorder but
absent in his
unaffected offspring; and locating mRNAs expressed in normal tissue but not
present in tissue
isolated from an individual with a particular disease.
Denaturing High Performance Liquid Chromatography
According to the scanning method of Denaturing High Performance Liquid
Chromatography (DHPLC), partial heat denaturation and a linear acetonitrile
gradient are used
to identify polymorphisms in DNA fragments. DHPLC provides a method of
comparative DNA
sequencing based on the capability of ion-pair reverse phase liquid
chromatography on alkylated
nonporous poly(styrene-divinylbenzene) particles to resolve homo- from
heteroduplex molecules
under conditions of partial denaturation. This method can potentially be
automated to allow for
rapid analysis of a large number of samples (Underhill et al., 1996, Proc.
Natl. Acad. Sci. USA,
93:196).
Mass Spectroscopy
Matrix-assisted laser desorption-ionization-time-of-flight (MALDI-TOF) mass
spectroscopy is another method according to the invention by which genotyping
can be
performed. The method of MALDI-TOF mass spectroscopy is based on the
irradiation of crystals
formed by suitable small organic molecules (referred to as the matrix) with a
short laser pulse
at a wavelength close to the resonant absorption band of the matrix molecules.
This causes an
energy transfer and desorption process producing matrix ions. Low
concentrations of nucleic
acid molecules are added to the matrix molecules while in solution and become
embedded in the
solid matrix crystals upon drying of the mixture. The intact nucleic acids are
then desorbed into
the gas phase and ionized upon irradiation with a laser allowing their mass
analysis. MALDI is
used primarily with time-of-flight spectrometers where the time of flight is
related to the mass-to-
charge ratio of the nucleic acid molecules. Reviewed in Griffin T.J. and
Smith L.M., 2000,
Trends Biotech 18:77.
Genotyping can be performed by any of the following MALDI-TOF mass
spectroscopy
approaches including sequencing of PCR products (Fu, D-J et al., 1998, Nat.
Biotechnol. 16:381;
Kirpekar, F. et al., Nucleic Acids Res. 26:2554), direct mass-analysis of PCR
products (Ross,
P.L. et al., 1998, Anal. Chem. 70:2067), analysis of allele-specific PCR
(Taranenko, N.I. et al.,
1996, Genet. Anal. Biomol. Eng. 13:87) or LCR (ligase chain reaction; Jurinke,
C. et al., 1996,
Anal. Biochem. 237:174) products, analysis of RFLP-PCR products (Srinivasan,
J.R. et al., 1998,
Rapid Commun. Mass Spectrom. 12:1045), minisequencing (Haff, L.A. and Smirnov,
I.P., 1997,
Genome Res. 7:378; Higgens, G.S. et al., 1997, BioTechniques 23:710), analysis
of PNA
(peptide nucleic acid) hybridization probes (Griffin, T.J. et al., 1997, Nat.
Biotech. 15:1368;
Ross, P.L., Anal. Chem. 69:4197; Jiang-Baucom, P. et al., 1997, Anal. Chem.
69:4894), or direct
analysis of invasive cleavage products (Griffin, T.J. et al., 1999, Proc. Natl.
Acad. Sci. USA
96:6301).
6. Methods of Specifying a Polymorphism
The invention provides methods for specifying a particular polymorphism. By
"specifying
a polymorphism" is meant defining a polymorphism in the context of a larger
region of nucleic
acid which contains the polymorphism, and is of sufficient length to be easily
differentiated from
any other position in the genome.
A unique nucleotide position (e.g. a polymorphic site) in the human genome can
be
specified by describing a unique sequence of DNA within the genome, and
providing the location
of the unique nucleotide position relative to that sequence. Preferably this
is done by providing
the sequence identity of a length of unique DNA containing the polymorphism,
and indicating
which of the nucleotide sites is polymorphic.
A calculation can be made to determine a sequence length which will be unique
in the 3
billion nucleotide human genome. If it is assumed that the genome contains
equal numbers of the
nucleotides A, G, C and T, and that they occur randomly in the genome, one can
determine the
probability of any given sequence of a defined length occurring in the genome;
a random 12-mer
is expected to appear in a random 3,000,000,000 bp genome 179 times, a random
15-mer is expected to appear in a
random 3,000,000,000 bp genome 3 times and a random 16-mer is expected to
appear in a random
3,000,000,000 bp genome 1 time.
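These figures follow from dividing the genome size by the number of possible sequences of a given length. A minimal sketch of the calculation, under the same assumptions of equal base frequencies and random order (the function name `expected_occurrences` is hypothetical):

```python
GENOME_SIZE = 3_000_000_000  # bp, the approximate size used in the text

def expected_occurrences(k, genome_size=GENOME_SIZE):
    """Expected number of times a given k-mer occurs in a random genome.

    Each position matches a specific k-mer with probability (1/4)**k, so the
    expectation is approximately genome_size / 4**k (edge effects are ignored).
    """
    return genome_size / 4 ** k

for k in (12, 15, 16):
    print(k, round(expected_occurrences(k)))
```

Rounded to the nearest whole number, this gives 179 for a 12-mer, 3 for a 15-mer and 1 for a 16-mer.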
Thus, it would appear that specifying 16 bp would uniquely define a sequence
in the
genome. However, the genome is not composed of random sequence and does not
contain equal
amounts of A, G, C and T. In fact, 10-12 bp sequences are likely to be specific
for 95% of genes.
Some sequences may even be specified by as few as 8 nucleotides. The minimum
sequence
length that is useful according to the invention for identifying polymorphisms
in most gene and
intergenic sequences is approximately 9-15 bp.
In the case of repeat sequences and sequences associated with gene families,
the
probability of observing a particular sequence is greatly increased and it
becomes difficult to
specify a polymorphism in the context of a sequence that is only on the order
of 9-15 bp. There
are many types of repeats including tandem repeats, where a larger sequence
block has within it
smaller repeat units (e.g. microsatellites). Tandem repeats usually occur
within non-genic areas,
but can also occur within genes and subsequently affect gene function; they
can be 10-1000s of
bp long, or, if located in centromeres and telomeres, be megabase sized. Some
repeats are
composed of blocks which do not have sub-repeat units and are non-functional
(e.g. 300 bp Alu
repeats). These occur by duplication/dispersal throughout the genome.
It may be difficult to specify a polymorphism that occurs in a gene that is a
member of
a gene family. Through the mechanism of gene duplication, gene families,
comprising multiple
copies of a gene in which some, but not all of the DNA sequence has diverged,
have been
formed. Thus, certain regions of a gene may be conserved in different gene
family members.
With time, a duplicated gene can lose function and the sequence of the
duplicated gene can
deteriorate; the amount of homology between the original gene and the
duplicated version
depends upon the time since duplication. Other duplications maintain function
and retain some
level of similarity with the original gene in the important domains. Some
related genes can share
nearly 100% homology across a region that is hundreds of bp long, and yet have
no significant
homology at any other location. In these cases, it may be necessary to specify
dozens or more
nucleotides to provide a unique sequence.
To identify a unique sequence, a search must be done wherein a specific
sequence is
compared to all known human sequences and the minimum unique sequence is
defined.
However, in the absence of a complete sequence for the human genome, it cannot
be guaranteed
that a sequence is truly unique. Empirical experimentation can be used to
determine the minimum
sequence for specificity/uniqueness. In the case of a gene family member, if
sequence
information is available for the region corresponding to the region of
interest in other members
of the gene family, then it may be possible to define a unique short (9-15 bp)
sequence that
contains a polymorphism and has specificity. In the event that a particular
region cannot be
defined as unique, a larger region of nucleic acid which contains the
polymorphism will be
required to define a polymorphism in a gene that is a member of a gene family.
It is predicted that
a sequence of 9-15 bp will be sufficient to define a polymorphism in 99% of
all cases.
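The search for a minimum unique sequence can be sketched as follows. This is an illustration under simplifying assumptions: the reference is a single plain string, only one strand is searched, and only exact matches are counted. The function names are hypothetical, and a real search would be run against all known human sequences.

```python
def count_occurrences(reference, query):
    """Count occurrences of query in reference, including overlapping ones."""
    count, start = 0, reference.find(query)
    while start != -1:
        count += 1
        start = reference.find(query, start + 1)
    return count

def shortest_unique_window(reference, site, max_len=50):
    """Find the shortest substring that contains index `site` and occurs once.

    Returns (window, offset_of_site_within_window), or None if no window of at
    most max_len bases is unique.
    """
    for length in range(1, max_len + 1):
        first = max(0, site - length + 1)
        last = min(site, len(reference) - length)
        for start in range(first, last + 1):
            window = reference[start:start + length]
            if count_occurrences(reference, window) == 1:
                return window, site - start
    return None

print(shortest_unique_window("GATTACAGATTGCA", 10))
```

In a repetitive reference the window has to grow before it becomes unique, which mirrors the difficulty described above for repeats and gene families.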
Methods of specifying a polymorphism that involve using sequences which either
encompass or overlap the polymorphic site to be tested or do not encompass or
overlap the
polymorphic site to be tested are useful according to the invention and are
described below.
Oligonucleotide Hybridization
An oligonucleotide is designed such that it is specific for a target sequence,
and
hybridizes only at the target sequence site. This oligonucleotide will not
hybridize if the target
sequence differs at the position in the sequence to be tested. Another
oligonucleotide is designed
such that it hybridizes with the polymorphic form of the sequence. A DNA
sample is tested for
hybridization with each of the two probes independently. If the DNA
hybridizes to only one of
the probes, it can be concluded that the individual is homozygous for the
corresponding
sequence. If both probes hybridize to a test DNA sample, then the individual
is heterozygous.
Hybridization will be detected by the method of Southern blot analysis (as
described in Section
C entitled "Production of a Nucleic Acid Probe").
Specifying a Polymorphism by PCR
An alternative method for specifying a particular polymorphism involves a PCR-
based
strategy. According to this method, a region of a candidate gene to be tested
is amplified by PCR
(as described). The amplified fragment is digested with a restriction enzyme
that will not cut a
fragment that contains a polymorphism, due to the location of the polymorphism
within the
recognition site of this restriction enzyme. The products of the digestion
reaction mixture are size
separated in an agarose gel, stained with ethidium bromide, and visualized
under ultraviolet light
to determine if the amplified product has been digested. According to this
method, the PCR
primers provide the specificity for a particular polymorphism by virtue of the
specific sequence
of the two primers, as well as by the location of the primer binding sites in
the target DNA.
Although multiple sites for primer binding may exist in a target DNA
sequence, only the sites
that are close enough together will produce an amplified product that includes
the nucleic acid
region containing the polymorphism.
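The digestion readout can be simulated in a few lines. The sketch uses EcoRI (recognition site GAATTC, cut after the first G) purely as a familiar example. The amplicon sequences and the function name `digest` are hypothetical, and only the top strand is modeled.

```python
def digest(amplicon, site, cut_index):
    """Return fragment lengths after cutting at every occurrence of `site`.

    cut_index is the offset within the recognition site at which the top
    strand is cut (1 for EcoRI, which cuts G^AATTC).
    """
    fragments, start = [], 0
    pos = amplicon.find(site)
    while pos != -1:
        fragments.append(pos + cut_index - start)
        start = pos + cut_index
        pos = amplicon.find(site, pos + 1)
    fragments.append(len(amplicon) - start)
    return fragments

ECORI_SITE = "GAATTC"
REFERENCE_AMPLICON = "ACGTGAATTCTTGCA"  # site intact, so the product is cut
VARIANT_AMPLICON = "ACGTGAGTTCTTGCA"    # A-to-G change destroys the site

print(digest(REFERENCE_AMPLICON, ECORI_SITE, 1))
print(digest(VARIANT_AMPLICON, ECORI_SITE, 1))
```

An amplicon carrying the polymorphism within the recognition site stays as one full-length fragment, while the unaltered amplicon yields two shorter fragments.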
Alternatively, a PCR reaction is carried out with PCR primers that contain
polymorphisms. According to this embodiment, if the template nucleic acid
lacks the
polymorphism present in the primers there will be no PCR product. Thus,
according to this
embodiment of the invention, the absence of a PCR product indicates that a
polymorphism is not
present in the target sequence.
Primer Extension
A DNA fragment comprising the region containing a polymorphism is PCR
amplified
from an individual to be tested. The PCR product is denatured and one strand
is retained for
analysis. An oligonucleotide probe is designed such that it is specific for a
region in the sequence
and hybridizes such that its 3' terminal nucleotide is paired with the
nucleotide adjacent to the
one to be tested. The PCR product and probe are combined with a polymerase and
terminating,
differentially colored, nucleotides. The polymerase extends the probe by one
base, and only the
base which is complementary to the site being tested is added. The reaction is
washed, and the
color of the reaction indicates the nucleotide that has been added and the
sequence at the position
of interest.
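The single-base extension step can be modeled as a lookup of the template base next to the primer's 3' end. This is a minimal sketch with hypothetical sequences and function names. The template is written 5' to 3', the primer is assumed to anneal by perfect complementarity, and the dye colors are replaced by the base letters themselves.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def extended_base(template, primer):
    """Return the terminating nucleotide added to the primer's 3' end.

    Returns None if the primer does not anneal at exactly one position or if
    no template base remains beyond the primer's 3' end.
    """
    binding_site = reverse_complement(primer)  # where the primer pairs on the template
    pos = template.find(binding_site)
    if pos <= 0 or template.find(binding_site, pos + 1) != -1:
        return None
    # The primer's 3' end pairs with template[pos]; the polymerase reads template[pos - 1].
    return COMPLEMENT[template[pos - 1]]

PRIMER = "CTAACG"
print(extended_base("GGCATCGTTAGC", PRIMER))  # template carries T at the tested site
print(extended_base("GGCACCGTTAGC", PRIMER))  # template carries C at the tested site
```

The base that is incorporated is the complement of the template base at the tested position, so the label identifies the allele directly.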
The PCR step provides one level of specificity by amplifying a region (1 -
10000 bp as
desired between the PCR primers) from a complex (3,000,000,000 bp) mixture.
The PCR
primers must be unique in both their hybridization specificity and their
proximity to one another.
Since proximity of the two PCR primers is needed (i.e. a distance across which
a polymerase can
extend to join the primers), shorter PCR primers can be used, e.g. in theory a
small enough region
could be amplified with an 8-10 bp binding site for a PCR primer. To ensure
that a primer
hybridizes with specificity, a primer must be at least 5 bp.
A second level of specificity is provided by the primer which is extended in
the primer
extension reaction. Since this primer is hybridizing to a short piece of DNA,
it can be short and
unique for the fragment with which it binds. The primer is at least 5bp and
preferably 8bp.
Although the primer used for the primer extension step is located
adjacent to the
polymorphic site, the PCR primers should not overlap with the polymorphic site
being tested.
Southern Blotting
One method for detecting a previously defined polymorphism involves Southern
blot
analysis of wild type and mutant DNA following digestion with a restriction
enzyme which has
a recognition sequence which includes the polymorphic site to be tested.
According to this
method, a particular restriction enzyme cuts wild type DNA but does not cut
mutant DNA due
to the presence of a polymorphism within the recognition site of this
restriction enzyme. Many
restriction enzymes exist which recognize 4 bp sequences. The resulting fragments will
be size separated
in an agarose gel, transferred to a membrane and probed with a nucleic acid
probe. If the site is
uncut, the fragment is one length and if the site is cut the fragment will be
of a shorter length.
The nucleic acid hybridization probe will provide specificity to the
particular
polymorphism being tested by defining the polymorphism in the context of a
larger stretch of
nucleic acid sequence. The nucleic acid probe may comprise the nucleic acid
sequence
corresponding to the region known to contain the polymorphism. The sequence-
specific probe
may be located 10, 100, 1000, or even 100s of thousands of bases from the
region containing the
polymorphism. If the probe is located some distance from the region containing
the
polymorphism, an intervening recognition site for the restriction enzyme
cannot be located
between the probe hybridization site and the region of interest containing the
polymorphism site.
Typically, a hybridization probe useful according to this method will be much
larger than the
minimum length of a sequence (9-15 bp) required to give specificity to, or
define a particular
polymorphism.
Alternatively, a chemical or enzyme which recognizes a unique pair of
nucleotides at the
site of a polymorphism, can be used to detect the polymorphism. According to
this method, the
amount of sequence required for recognition by a chemical or enzyme is 2 bp
(providing that the
2 bp sequence is unique in a region large enough to produce a fragment which
can then be bound
by a specific probe).
According to a variation of the above method, a labeled chemical or enzyme
which binds
to one sequence of the polymorphic recognition site and not another is used.
This method
involves the steps of digesting the DNA with a restriction enzyme, and adding
a labeled,
sequence-specific binding protein (e.g. a restriction enzyme that lacks
cleavage capability). The
sequence-specific binding protein will bind to multiple sites in the genome,
including the site to
be tested. The fragments will be separated on a gel and then probed with a
probe specific for the
test sequence. If the fragment identified by the second probe is identical to
a fragment identified
by the first probe (e.g. the labeled chemical or enzyme), then the sequence
being tested for is
present.
7. Determination of the Phenotypic Outcome of a Polymorphism
To determine the phenotypic outcome of a polymorphism according to the
invention, it
is necessary to screen suitable populations to obtain a statistically
significant measure of the
association of a polymorphism with a particular disease (e.g. osteoporosis).
The invention
provides methods for performing polymorphism genotyping in appropriate
populations
(described above). The invention also provides in vitro and in vivo assays
useful for determining
the phenotypic outcome of a polymorphism in a candidate gene.
Every polymorphism has the potential to alter the genetic activity of an
individual. At the
level of a single gene, the effect of a polymorphism can range from an
inconsequential, silent
change to a change that causes a complete loss of protein function to a gain
of aberrant or
detrimental function mutation. The severity of the effect of a polymorphism on
gene activity will
depend on the exact molecular consequences of the particular polymorphism. For
example,
alterations of a single pre-mRNA splicing dinucleotide could have profound
effects on both the
quantitative and qualitative properties of gene activity since alterations in
splicing efficiency can
both reduce the overall level of normal transcription as well as cause "exon
skipping". If the
skipped exon is a coding exon, then exon skipping will lead to an
alteration in the amino
acid composition of the resulting protein and will likely affect protein activity.
To accurately assess
the role of a particular polymorphism in the regulation of various molecular
events, appropriate
assays for both gene expression and protein function must be carried out.
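The consequence of exon skipping for the reading frame can be illustrated with a short sketch. It assumes only that the lengths of the coding exons are known; the function name and the exon lengths are hypothetical. Skipping an exon whose length is a multiple of three removes whole codons' worth of sequence, whereas any other length shifts the downstream reading frame.

```python
def skipping_consequence(exon_lengths, skipped_index):
    """Classify the effect of skipping one coding exon on the reading frame.

    exon_lengths: lengths in nucleotides of consecutive coding exons.
    skipped_index: 0-based index of the exon lost through aberrant splicing.
    Returns ("in-frame deletion", codons_lost) or ("frameshift", None).
    """
    skipped = exon_lengths[skipped_index]
    if skipped % 3 == 0:
        # Frame preserved: the protein loses skipped // 3 codons' worth of sequence.
        return ("in-frame deletion", skipped // 3)
    # Frame shifted: every codon downstream of the junction is misread.
    return ("frameshift", None)

# Hypothetical exon lengths, for illustration only.
in_frame = skipping_consequence([120, 87, 200], 1)
shifted = skipping_consequence([120, 87, 200], 2)
```

Even an in-frame skip may alter the codon that spans the new exon junction, so the sketch gives a first approximation rather than a prediction of the protein sequence.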
In vitro assays useful for determining the effects of a polymorphism on gene
expression
and protein function include, but are not limited to, the following.
i. Transcriptional Regulation
The transcriptional regulation of a candidate gene containing a polymorphism
may be
altered, as compared to the wild type gene.
Promoter Activity
If a polymorphism is located in the promoter, enhancer or repressor region of
a candidate
gene, promoter assays (well known in the art) wherein the altered promoter of
the candidate gene
is used to drive the expression of a reporter gene (e.g. CAT, luciferase,
GFP) are performed.
Changes in the transcriptional regulation of a candidate gene due to the
presence of a
polymorphism can also be detected by methods useful for measuring the level of
mRNA
including S1 nuclease mapping and RT-PCR.
S1 Analysis
The S1 enzyme is a single-strand-specific endonuclease that will digest both
single-stranded RNA and DNA. According to the method of S1 analysis, a probe that
has been efficiently labeled to a high specific activity at the 5' end through the
use of a kinase is used to determine either the amount of an mRNA species or the
5' end of a message. A single-stranded probe that is complementary to the sequence
of the RNA species of interest is utilized in S1 analysis. If the structure of a
particular mRNA species is known, S1 analysis is performed with oligonucleotide
probes of at least 40 bp that are complementary to the RNA of interest. It is
preferable to use oligonucleotides wherein the 5' end of the oligonucleotide is
complementary to the RNA. It is also preferable to use oligonucleotides wherein
the 5' terminal residues contain dG or dC residues. If S1 nuclease analysis will be
utilized to determine the 5' termini of an RNA species, the 3' end of the
oligonucleotide should extend at least 4 nucleotides beyond the RNA coding
sequence. The inclusion of additional nucleotides facilitates differentiation
of a band resulting from an RNA:DNA duplex from a band representing the probe.
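The probe preferences stated above can be collected into a simple check, sketched below. The function name and its two arguments are hypothetical, and for simplicity the sketch tests only the first 5' residue for dG or dC, which is an assumption about how "5' terminal residues" might be applied.

```python
def check_s1_probe(probe, overhang_3prime):
    """Return a list of design problems for an S1 oligonucleotide probe.

    probe: probe sequence written 5'->3'.
    overhang_3prime: number of 3' nucleotides extending beyond the RNA sequence.
    An empty list means that none of the three checks failed.
    """
    probe = probe.upper()
    problems = []
    if len(probe) < 40:
        problems.append("probe is shorter than 40 nucleotides")
    if not probe or probe[0] not in "GC":
        problems.append("5' terminal residue is not dG or dC")
    if overhang_3prime < 4:
        problems.append("3' end extends fewer than 4 nucleotides beyond the RNA")
    return problems

# Hypothetical probes, for illustration only.
good = check_s1_probe("G" + "ACGT" * 10, overhang_3prime=4)
bad = check_s1_probe("ATG", overhang_3prime=2)
```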
A hybridization probe for S1 analysis is prepared by incubating 2 pmol of an
oligonucleotide in the presence of 150 µCi [γ-32P]ATP (3000-7000 Ci/mmol), 2.5 µl
10X T4 polynucleotide kinase buffer (700 mM Tris-Cl, pH 7.5, 100 mM MgCl2, 50 mM
dithiothreitol, 1 mM spermidine-Cl, 1 mM EDTA), and 10 U T4 polynucleotide kinase
at 37°C for 30-60 minutes. The radiolabeled probe is ethanol precipitated and
resuspended at 1 µl/0.3 ng oligonucleotide or 10^5 cpm.
The hybridization reaction is performed as follows. An amount of probe equal
to 5 x 10^4 Cerenkov counts is added to 50 µg RNA on ice and ethanol precipitated.
The resulting pellet is resuspended in 20 µl S1 hybridization solution (80%
deionized formamide, 40 mM PIPES, pH 6.4, 400 mM NaCl, 1 mM EDTA, pH 8),
denatured for 10 min at 65°C and hybridized overnight at 30°C. The following day,
300 µl of a mixture of 150 µl 2X S1 nuclease buffer (0.56 M NaCl, 0.1 M sodium
acetate, pH 4.5, 9 mM ZnSO4), 3 µl 2 mg/ml single-stranded calf thymus DNA,
147 µl H2O and 300 U S1 nuclease is added to the hybridization reaction and
incubated for 60 minutes at 30°C. Following the addition of 80 µl S1 stop buffer
(4 M ammonium acetate, 20 mM EDTA, 40 µg/ml tRNA), the sample is ethanol
precipitated, resuspended in formamide loading dye, denatured and analysed on a
denaturing polyacrylamide/urea gel of the appropriate percentage for the expected
size of the protected band (Ausubel et al., supra).
RT-PCR
The method of RT-PCR is useful according to the invention for RNA expression
analysis.
According to the method of reverse transcription/polymerase chain reaction
(RT-PCR), during
the reverse transcription (RT) step the RNA is converted to first strand
cDNA, which is
relatively stable and is a suitable template for a PCR reaction. In the second
step, the cDNA
template of interest is amplified using PCR. This is accomplished by repeated
rounds of
annealing sequence-specific primers to either strand of the template and
synthesizing new
strands of complementary DNA from them using a thermostable DNA polymerase.
An RNA sample is ethanol precipitated with a cDNA primer. It may be preferable
to use a cDNA primer that is identical to one of the amplification primers. To the
pellet is added 12 µl H2O, 4 µl 400 mM TrisCl, pH 8.3, and 4 µl 400 mM KCl. The
mixture is heated to 90°C, slow cooled to 67°C, microfuged and incubated for 3
hours at 52°C. Following the addition of 29 µl reverse transcriptase buffer (per
sample: 2.5 µl 400 mM TrisCl, pH 8.3, 2.5 µl 400 mM KCl, 1 µl 300 mM MgCl2, 5 µl
100 mM DTT, 5 µl 5 mM 4dNTP mix, 2 µl actinomycin D, 11 µl H2O) and 0.5 µl (16 U)
AMV reverse transcriptase, the sample is incubated for 1 hour at a temperature
between 37°C and 55°C. The temperature will be adjusted in accordance with the
composition of the primer and the RNA of interest. The sample is then extracted
sequentially with phenol and chloroform, and ethanol precipitated. The resulting
cDNA pellet is resuspended in 40 µl H2O. 5 µl of the cDNA sample is mixed with
5 µl of each amplification primer (~20 µM each), 4 µl 5 mM 4dNTP mix, 10 µl 10X
amplification buffer (500 mM KCl, 100 mM TrisCl, pH 8.4, 1 mg/ml gelatin) and
70.5 µl H2O. After the mixture is heated for 2 minutes at 94°C, 0.5 µl (2.5 U) Taq
DNA polymerase is added and the sample is overlaid with mineral oil. PCR
amplification of the cDNA will be performed using the following automated
amplification cycles: 39 cycles (2 minutes at 55°C, 2 minutes at 72°C, 1 minute at
94°C), 1 cycle (2 minutes at 55°C, 7 minutes at 72°C). The number of cycles can be
varied in accordance with the abundance of RNA (Ausubel et al., supra).
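The amplification cycles recited above can be written down as a small data structure, which makes the programmed run time and the theoretical upper bound on amplification easy to recompute when the number of cycles is varied. The names below are hypothetical, ramping time between temperatures is ignored, and perfect doubling in every cycle is an idealising assumption that real reactions do not reach.

```python
# Each entry: (number of cycles, [(minutes, temperature in deg C), ...]).
PROGRAM = [
    (39, [(2, 55), (2, 72), (1, 94)]),
    (1, [(2, 55), (7, 72)]),
]

def total_minutes(program):
    """Sum the programmed hold times, ignoring ramping between temperatures."""
    return sum(cycles * sum(minutes for minutes, _ in steps)
               for cycles, steps in program)

def max_fold_amplification(program):
    """Upper bound on amplification, assuming perfect doubling in every cycle."""
    return 2 ** sum(cycles for cycles, _ in program)

run_time = total_minutes(PROGRAM)  # 39 * 5 + 9 minutes of programmed holds
upper_bound = max_fold_amplification(PROGRAM)
```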
If a polymorphism is located in a transcription factor binding site, assays
including but
not limited to the yeast two-hybrid assay (Fields et al., 1994, Trends Genet.,
10:286) can be used
to determine the effects of a polymorphism on transcription factor binding.
If the protein product of the gene of interest is a DNA binding protein, the
phenotypic
outcome of a polymorphism may be impaired nuclear transport, DNA binding,
chromatin
assembly or chromatin structure, methylation or histone deacetylation.
Nuclear Transport
Immunocytochemical methods or cell fractionation techniques (as described
above) are
used to determine if the protein is correctly localized in the nucleus.
The DNA binding properties of a transcription factor are determined by gel
shift analysis
(as described in Ausubel et al., supra), oligonucleotide selection,
southwestern assays or by
immunohistochemical analysis of fixed chromosomes.
Gel Shift Analysis
The method of gel shift analysis is used to detect sequence specific DNA-
binding proteins
from crude extracts. According to this method, proteins that bind to an end-
labeled DNA
fragment will retard the mobility of the fragment. The change in the mobility
of the labeled
fragment is detected by the appearance of a discrete band comprising the DNA-
protein complex.
A number of methods for preparing nuclear and cytoplasmic extracts useful for
gel shift
analysis are known in the art. For example, nuclear extracts are prepared
according to the
following method. A cell pellet is washed in PBS, resuspended in a volume of
hypotonic buffer (10 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM KCl, 0.2 mM PMSF,
0.5 mM DTT) that is approximately equal to 3 times the packed cell volume and
allowed to swell on ice for 10 minutes. Cells are homogenized in a glass Dounce
homogenizer and the nuclei are collected by centrifugation and resuspended in a
volume of low-salt buffer (20 mM HEPES, pH 7.9, 25% (v/v) glycerol, 1.5 mM MgCl2,
0.02 M KCl, 0.2 mM EDTA, 0.2 mM PMSF, 0.5 mM DTT) equivalent to one-half of the
packed nuclear volume. Following the addition of a volume of high-salt buffer
(20 mM HEPES, pH 7.9, 25% (v/v) glycerol, 1.5 mM MgCl2, 1.2 M KCl, 0.2 mM EDTA,
0.2 mM PMSF, 0.5 mM DTT) equivalent to one-half of the packed nuclear volume
(dropwise with stirring) to the nuclei, nuclear extraction is carried out for 30
minutes with continuous gentle stirring. The nuclei are collected by
centrifugation and the nuclear extract is dialyzed against 50
volumes of dialysis buffer (20 mM HEPES, pH 7.9, 20% (v/v) glycerol, 100 mM
KCl, 0.2 mM
EDTA, 0.2 mM PMSF, 0.5 mM DTT) until the conductivities of extract and buffer
are
equivalent. The extract is removed from the dialysis tubing and analysed for
protein
concentration (Ausubel et al., supra).
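The buffer volumes in the extraction above scale with the packed cell volume and the packed nuclear volume, and the KCl concentration during extraction follows from mixing equal volumes of the low-salt and high-salt buffers. The sketch below works through that arithmetic; the function names and the example volumes are hypothetical, and the contribution of the nuclei themselves to the final volume is neglected.

```python
def extraction_volumes(packed_cell_volume_ml, packed_nuclear_volume_ml):
    """Buffer volumes (ml) for the nuclear extraction described above."""
    return {
        "hypotonic": 3.0 * packed_cell_volume_ml,
        "low_salt": 0.5 * packed_nuclear_volume_ml,
        "high_salt": 0.5 * packed_nuclear_volume_ml,
    }

def mixed_kcl_molar(low_salt_ml, high_salt_ml, low_kcl=0.02, high_kcl=1.2):
    """KCl molarity after combining the two buffers (nuclear volume neglected)."""
    total = low_salt_ml + high_salt_ml
    return (low_salt_ml * low_kcl + high_salt_ml * high_kcl) / total

# Hypothetical pellet sizes, for illustration only.
vols = extraction_volumes(2.0, 1.0)
kcl = mixed_kcl_molar(vols["low_salt"], vols["high_salt"])
```

With equal volumes of 0.02 M and 1.2 M KCl buffers, the mixture comes to approximately 0.61 M KCl before dialysis.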
Probes useful for gel shift analysis include a fragment of plasmid DNA or a
gel-purified
double stranded oligonucleotide. Preferably the probe is labeled with Klenow
fragment by incubating a 100 µl solution of plasmid DNA or oligonucleotide with
100 µCi of the desired [α-32P]dNTP, 4 µl of 5 mM 3dNTP mix and 2.5 U Klenow
fragment for 20 minutes at room temperature. Upon the addition of 4 µl of a
solution comprising 5 mM of the dNTP corresponding to the radioactive dNTP, the
sample is incubated for 5 minutes at room
temperature. The radiolabeled probe is ethanol precipitated, resuspended in TE
buffer and gel
purified.
Gel shift analysis is performed by incubating 10,000 cpm of the labeled probe
(0.1-0.5 ng) with 2 µg poly(dI-dC)-poly(dI-dC), 300 µg BSA, and approximately
15 µg of a nuclear extract or buffered crude protein extract prepared, for
example, as described above, for 15 minutes at 30°C. An aliquot of the binding
reaction is analysed by electrophoresis on a prewarmed low-ionic strength gel
(e.g. a 4% polyacrylamide gel in TBE) and autoradiography
(Ausubel et al., supra).
Oligoselection Assays for DNA Binding Activity
DNA binding activity is an essential property of proteins involved in many
basic cell
biological events, such as chromatin structure, transcriptional regulation,
DNA replication and
repair. The biological activity of a DNA binding protein can be assayed by
defining the optimal
target DNA binding site. Using the PCR based primer selection technique
(Blackwell, 1990,
Science, 250:1104), the canonical nucleotide sequence defining the binding site
is elucidated in
vitro by mixing purified full length protein, or just the DNA binding domain
of a protein of
interest, with an oligonucleotide duplex pool containing a completely
randomized central region
flanked by primer-annealing sites. Multiple rounds of immunoprecipitation and
amplification by
PCR enrich for high affinity sites, which are cloned and sequenced in order
to define a canonical
binding site.
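Once the selected sites have been cloned and sequenced, a canonical site can be derived by tallying the bases at each position. The following is a minimal sketch under the assumption that the sequenced sites have already been aligned to the same length; the function name, the 60% threshold and the example sequences are hypothetical.

```python
from collections import Counter

def consensus(sites, min_fraction=0.6):
    """Derive a consensus from equal-length, pre-aligned selected sites.

    A position is reported as a base when that base occurs in at least
    min_fraction of the sites, and as 'N' otherwise.
    """
    if not sites:
        raise ValueError("no sites given")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("sites must be aligned to the same length")
    result = []
    for column in zip(*sites):
        base, count = Counter(column).most_common(1)[0]
        result.append(base if count / len(sites) >= min_fraction else "N")
    return "".join(result)

# Hypothetical sequenced sites, for illustration only.
selected = ["CACGTG", "CACGTG", "CACGTT", "CAGGTG", "CACGTG"]
motif = consensus(selected)
```

A position weight matrix would retain more information than a single consensus string; the string form is used here only because it mirrors the "canonical binding site" language of the text.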
The ability of a DNA binding protein to correctly regulate chromatin assembly
and
structure can be determined by DNase hypersensitivity assays. Alternatively,
coimmunoprecipitation experiments or Western blot analysis can be used to
determine if the
DNA binding protein is associated with a component of the chromatin.
Southwestern Blot Assay for Protein-DNA Interactions
The ability of a protein to bind DNA is measured by using the "Southwestern"
blot
technique (for example see Antalis et al., 1993, Gene, 134:201). According to
this method,
radiolabelled DNA is incubated with protein that has been immobilized on
nitrocellulose filters
and the amount of bound DNA is measured by scintillation counting or
autoradiography followed
by densitometry. The protein to be tested can be pure protein,
immunoprecipitated protein, crude
cell lysates or even recombinant protein denatured directly from bacterial
colonies, yeast or cell
culture.
Assay of Protein Binding to Chromosomes in Vivo: Immunocytology of Fixed
Chromosomes
Numerous biologically important nuclear proteins are in direct contact with
genomic
DNA. The presence of these proteins can be detected immunocytologically by
fixing metaphase
chromosomes such that the protein is permanently fixed at the region of DNA to
which it
normally binds. The presence and cytological location of the protein can then
be determined by
incubating the fixed chromosomes with an antibody directed against the protein
of interest, and
performing standard methods of immunohistochemical staining (Zink and Paro,
1989, Nature,
337:468).
Coimmunoprecipitation Assay for Chromatin Assembly/Structure.
If an antibody specific for a protein of interest exists, immunoprecipitation
can be used
to test for the presence of the protein (Otto and Lee, 1993, Methods Cell
Biol., 37:119, Banting,
1995, In Gene Probes 1: A practical approach. Chapter 8: Antibody probes, pp.
225-227, IRL
press.). The following methods are used for determining if a protein of
interest is associated with
a particular subcellular component. According to one method, proteins are
immunoprecipitated
with an antibody specific for a cellular component (e.g. chromatin or nuclear
antigens), the
immunoprecipitated material is analysed on a gel by denaturing polyacrylamide
gel
electrophoresis and western blot analysis is performed with an antibody
specific for the protein
of interest, to determine if a physical association exists between the
cellular component and the
protein of interest. Various incubation and wash treatments of the cell lysate
are used to remove
background contamination and enhance the sensitivity of detection (Banting,
1995, supra).
Alternatively, the initial immunoprecipitation can be carried out with the
antibody specific for
the protein of interest, and the western blot analysis can be performed with
an antibody specific
for a cellular component. According to a variation of this method, prior to
immunoprecipitation
the cells can be treated with a protein crosslinker to ensure that protein-
protein interactions are
maintained during immunoprecipitation. According to another variation of this
method, proteins
can be cross-linked to DNA and then precipitated (Dedon et al., 1991, Anal.
Biochem., 197:83).
If DNA coprecipitates with a particular protein, this suggests that DNA is
associated with, and
presumably bound to the protein. The coprecipitating DNA can be sequenced to
identify the
bound sequence.
DNAse Hypersensitivity
The transcriptionally active promoter region of a gene can be analysed for
susceptibility
to cleavage by DNAseI (Montecino et al., 1994, Biochemistry, 33:348). Efficient
cleavage of
genomic DNA is dependent on the accessibility of this enzyme to the DNA, and
is influenced by
several factors, including nucleosome packaging, overall chromatin
configuration, and the
presence of DNA binding proteins such as transcription factors. DNA sequence
variations within
the promoter DNA may have profound effects on these factors and result in
aberrant regulation
of gene transcription and ultimately abnormal biological activity of the gene.
Therefore, altered
gene activity around a polymorphic site can be detected as increased or
decreased DNAseI
hypersensitivity (Vaishnaw et al., 1995, Immunogenetics, 41:354).
Assay for DNA Methylation
Accurate mapping of DNA methylation patterns, for example in CpG islands, which
are unmethylated regions of DNA, is used to investigate and gain a better
understanding of diverse biological processes such as the regulation of imprinted
genes, X chromosome inactivation and tumor suppressor gene silencing in human
cancer. DNA methylation at specific sites is most frequently studied by use of
methylation-sensitive restriction endonucleases (for example HpaII) and Southern
blotting (Sambrook et al., supra). The sensitivity of this method can be enhanced
several hundred-fold by performing a ligation-mediated PCR step (as described in
Steigerwald et al., 1990, Nucleic Acids Res., 6:1435) after enzyme treatment. An
alternative strategy, termed methylation-specific PCR (Herman et al., 1996, Proc
Natl Acad Sci USA., 93:9821), is used to determine the methylation status of CpG
islands without the use of methylation-sensitive
restriction enzymes.
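The logic of the methylation-sensitive digest can be sketched as follows. HpaII recognises CCGG and cleaves after the first cytosine unless the site is methylated, so the fragment pattern seen on a Southern blot depends on which sites are protected. The function name, the example sequence and the way methylated sites are supplied (as 0-based start indices) are assumptions made for this illustration.

```python
def hpaii_fragments(sequence, methylated_sites=()):
    """Predict fragment lengths after HpaII digestion of a linear sequence.

    HpaII recognises CCGG and cuts after the first C (C^CGG). Sites whose
    0-based start index appears in methylated_sites are treated as protected.
    """
    sequence = sequence.upper()
    protected = set(methylated_sites)
    cuts = []
    start = sequence.find("CCGG")
    while start != -1:
        if start not in protected:
            cuts.append(start + 1)  # cut falls between the two cytosines
        start = sequence.find("CCGG", start + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

# Hypothetical 20-nucleotide sequence with CCGG sites at indices 4 and 14.
seq = "AAAACCGGTTTTTTCCGGAA"
unmethylated = hpaii_fragments(seq)
methylated = hpaii_fragments(seq, methylated_sites=[14])
```

A larger fragment in the methylated case (here one 15-nucleotide fragment in place of fragments of 10 and 5) is the signal that the second site was protected.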
Histone Deacetylation
Transcription of chromatin-packaged genes involves highly regulated changes in
nucleosome structure that control DNA accessibility. Changes in nucleosome
structure can be
mediated by enzymatic complexes which control the acetylation and
deacetylation of histones.
Transcription elongation is required for the formation of the unfolded
structure of transcribing
nucleosomes, and histone acetylation is required for the maintenance of these
structures (Walia
et al., 1998, J. Biol. Chem., 3:14516). Deacetylation can be prevented by
incubating cells with
histone deacetylase inhibitors such as sodium butyrate or trichostatin A. To
assay for changes
in acetylation and the state of transcriptional activity, chromatin fractions
are purified using
organomercury and hydroxylapatite dissociation chromatographic techniques
(Walia et al.,
supra).
ii. Transcription Start Site
To determine if a particular polymorphism causes a change in the
transcriptional start site of a candidate gene, S1 nuclease mapping and primer
extension can be performed. The presence of a polymorphism may cause an mRNA to be
aberrantly expressed. In particular, a polymorphism may change the tissue
specificity or developmental expression pattern of an mRNA species. A variety of
molecular methods for detecting mRNA known in the art can be performed to
determine the expression pattern of an mRNA. These methods include, but are not
limited to, the following: Northern blot analysis, RT-PCR, S1 analysis, RNASE
protection analysis, or in situ hybridization analysis of sections, wherein the
samples are derived from multiple different tissues or from a tissue at different
stages of development. Northern blot analysis, RT-PCR and S1 analysis can also be
used to determine if a polymorphism results in an
altered pattern of mRNA splicing.
Northern Blotting
The method of Northern blotting is well known in the art. This technique
involves the
transfer of RNA from an electrophoresis gel to a membrane support to allow the
detection of
specific sequences in RNA preparations.
Northern blot analysis is performed according to the following method. An RNA
sample
(prepared by the addition of MOPS buffer, formaldehyde and formamide) is
separated on an
agarose/formaldehyde gel in 1X MOPS buffer. Following staining with ethidium
bromide and
visualization under ultra violet light to determine the integrity of the RNA,
the RNA is
hydrolyzed by treatment with 0.05 M NaOH/1.5 M NaCl followed by incubation with
0.5 M Tris-Cl (pH 7.4)/1.5 M NaCl. The RNA is transferred to a commercially
available nylon or nitrocellulose membrane (e.g. Hybond-N membrane, Amersham,
Arlington Heights, IL) by methods well known in the art (Ausubel et al., supra,
Sambrook et al., supra). Following transfer and UV cross-linking, the membrane is
hybridized with a radiolabeled probe in hybridization solution (e.g. in
50% formamide/2.5% Denhardt's/100-200 µg denatured salmon sperm DNA/0.1%
SDS/5X SSPE) at 42°C. The hybridization conditions can be varied as necessary
as described in Ausubel et al., supra and Sambrook et al., supra. Following
hybridization, the membrane is washed at
room temperature in 2X SSC/0.1% SDS, at 42°C in 1X SSC/0.1% SDS, at
65°C in 0.2X
SSC/0.1% SDS, and exposed to film. The stringency of the wash buffers can also
be varied
depending on the amount of background signal (Ausubel et al., supra).
RNASE Protection Analysis
RNASE Protection analysis can be used to analyze RNA structure and amount and
determine the endpoint of a specific RNA.
The method of RNASE protection is more sensitive than S1 analysis since it
utilizes a
sequence specific hybridization probe that is labeled to a high specific
activity. The probe is
hybridized to sample RNAs and treated with ribonuclease to remove free probe.
Following
ribonuclease treatment, the fragments comprising probe annealed to homologous
sequences in
the sample RNA are recovered by ethanol precipitation, and analysed by
electrophoresis on a
sequencing gel. The presence of the target mRNA is indicated by the presence
of an appropriately
sized fragment of the probe.
A probe is labeled by the method of in vitro transcription (in the presence
of [α-32P]CTP) as described in Section B entitled "Production of a Polynucleotide
Sequence". The RNA sample to be analysed is ethanol precipitated and resuspended
in 30 µl hybridization buffer (4 parts formamide/1 part 200 mM PIPES, pH 6.4, 2 M
NaCl, 5 mM EDTA) containing 5 x 10^5 cpm of the probe RNA. The mixture is
denatured for 5 minutes at 85°C and incubated at the desired hybridization
temperature (30°C to 60°C) for >8 hours. To each reaction mixture is added 350 µl
ribonuclease digestion buffer (10 mM Tris-Cl, pH 7.5, 300 mM NaCl, 5 mM EDTA)
containing 40 µg/ml ribonuclease A and 2 µg/ml ribonuclease T1. The sample is
incubated for 30-60 minutes at 30°C. Following the addition of 10 µl 20% SDS and
2.5 µl 20 mg/ml proteinase K, the sample is incubated for 15 minutes at 37°C. The
sample is extracted with phenol/chloroform/isoamyl alcohol, ethanol precipitated,
resuspended in RNA loading buffer (80% (v/v) formamide, 1 mM EDTA, pH 8.0, 0.1%
bromophenol blue, 0.1% xylene cyanol),
denatured and analysed by electrophoresis on a denaturing polyacrylamide/urea
gel and
autoradiography (Ausubel et al., supra).
Primer Extension
The method of primer extension is used to map the 5' end of an RNA and to
quantitate
the amount of an RNA of interest by using reverse transcriptase to extend a
primer that is
complementary to a region of a given RNA.
An oligonucleotide primer is labeled in a kinase reaction as described for S1
analysis. The primer extension reaction is performed by mixing 10-50 µg total
cellular RNA (in 10 µl) with 1.5 µl 10X hybridization buffer (1.5 M KCl, 0.1 M
TrisCl, pH 8.3, 10 mM EDTA) and 3.5 µl labeled oligonucleotide. Samples are heated
to 65°C for 90 minutes and allowed to slow cool at room temperature. To each
sample is added 30 µl of primer extension reaction mixture (0.9 µl Tris-Cl, pH
8.3, 0.9 µl 0.5 M MgCl2, 0.25 µl DTT, 6.75 µl 1 mg/ml actinomycin D, 1.33 µl 5 mM
4dNTP mix, 20 µl H2O, 0.2 µl 25 U/µl AMV reverse transcriptase). Samples are
incubated for 1 hour at 42°C and then, following the addition of 105 µl RNASE
reaction mix (100 µg/ml salmon sperm DNA, 20 µg/ml RNASE A), for 15 minutes at
37°C. Samples are extracted in phenol/chloroform/isoamyl alcohol, ethanol
precipitated, resuspended in stop/loading dye (20 mM EDTA, pH 8.0, 0.05%
bromophenol blue, 0.05% xylene cyanol in formamide), heated at
65°C and analysed by electrophoresis on a 9% acrylamide/7M urea gel and
autoradiography.
In Situ Hybridization
Cytological techniques well known in the art can be used to determine the
temporal and
spatial expression patterns of mRNA (in situ hybridization of tissue
sections) and protein
(immunohistochemistry in individual cells).
Preparation of histological samples
Tissue samples intended for use in in situ detection of either RNA or protein
are fixed
using conventional reagents; such samples may comprise whole or squashed
cells, or sectioned
tissue. Fixatives useful for such procedures include, but are not limited to,
formalin, 4% paraformaldehyde in an isotonic buffer, formaldehyde (each of which
confers a measure of RNAase resistance to the nucleic acid molecules of the
sample) or a multi-component fixative, such as FAAG (85% ethanol, 4%
formaldehyde, 5% acetic acid, 1% EM grade glutaraldehyde). For the detection of
RNA, water used in the preparation of an aqueous component of a solution to which
the tissue is exposed until it is embedded is RNAase-free, i.e. treated with 0.1%
diethylpyrocarbonate (DEPC) at room temperature overnight and subsequently
autoclaved for 1.5 to 2 hours. Tissue will be fixed at 4°C, either on a sample
roller or a rocking platform, for 12 to
48 hours in order to allow the fixative to reach the center of the sample.
Prior to embedding, excess fixative will be removed and the sample will be
dehydrated
by a series of two- to ten-minute washes in increasingly high concentrations
of ethanol, beginning
at 60% and ending with two washes in 95% and another two in 100% ethanol,
followed by two
ten-minute washes in xylene. Samples will be embedded in one of a variety of
sectioning
supports, e.g. paraffin, plastic polymers or a mixed paraffin/polymer medium
(e.g.
Paraplast Plus Tissue Embedding Medium, supplied by Oxford Labware). For
example, fixed, dehydrated tissue will be transferred from the second xylene wash
to paraffin or a paraffin/polymer resin in the liquid-phase at about 58°C. The
paraffin or paraffin/polymer resin will be replaced three to six times over a
period of approximately three hours to dilute out residual xylene. The sample
will be incubated overnight at 58°C under a vacuum, in order to optimize
infiltration of the embedding medium into the tissue. The next day, following
several additional changes of medium at 20 minute to one hour intervals, also at
58°C, the tissue sample will be positioned in a sectioning mold, the mold will be
surrounded by ice water and the
medium will be allowed to harden. Sections of 6 µm thickness will be taken and
affixed to
'subbed' slides, which are slides coated with a proteinaceous substrate
material, usually bovine
serum albumin (BSA), to promote adhesion. Other methods of fixation and
embedding are also
applicable for use according to the methods of the invention; examples of
these are found in
Humason, G.L., 1979, Animal Tissue Techniques, 4th ed. (W.H. Freeman & Co., San
Francisco),
as is frozen sectioning (Serrano et al., 1989, supra).
In situ Hybridization Analysis
According to the method of in situ hybridization a specifically labeled
nucleic acid probe
is hybridized to cellular RNA present in individual cells or tissue sections.
In situ hybridization
can be performed on either paraffin or frozen sections. Depending on the
desired sensitivity and
resolution, either film or emulsion autoradiography can be utilized to detect
the hybridized
radioactive probe.
The following method of in situ hybridization is performed by incubating
slides
containing cell or tissue specimens in a slide rack contained within a glass
staining dish.
According to this method, it is preferable to use solutions that have been
prepared fresh. Prior
to the hybridization steps, slides are dewaxed to remove the sectioning
support material. The
dewaxing protocol involves sequential washes in xylene, rehydration by
sequential washes in
100%, 95%, 70% and 50% ethanol, and denaturation in 0.2N HCl. Following a heat
denaturation
step (70°C in 2X SSC), samples are postfixed in a freshly prepared
solution of 4% PFA, washed in PBS, incubated in 10 mM DTT (10 min at 45°C) and
blocked in 400 ml PBS containing 0.617 g DTT, 0.74 g iodoacetamide and 0.5 g
N-ethylmaleimide, for 30 min at 45°C in a water bath covered with aluminum foil,
due to the light sensitivity of iodoacetamide and N-ethylmaleimide. The samples
are washed in PBS and equilibrated sequentially in freshly prepared 0.1 M
triethanolamine (TEA buffer), TEA buffer/0.25% acetic anhydride, and TEA
buffer/0.5% acetic anhydride. Following a blocking step in 2X SSC, the samples
are dehydrated by sequential washes in 50%, 70%, 95%, and 100% ethanol and air
dried. 35S-labeled riboprobes and competitor probes prepared in the absence of a
radiolabel (prepared as described in Section B entitled "Production of a
Polynucleotide Sequence") or double-stranded DNA probes (prepared with
[35S]dNTPs by methods well known in the art including nick translation or random
oligonucleotide-primed synthesis) are heated to 100°C for 3 min and diluted to a
final probe concentration of 0.3 µg/ml in 50% formamide, 0.3 M NaCl, 10 mM
TrisCl, pH 8.0, 1 mM EDTA, 1X Denhardt solution, 500 µg/ml yeast tRNA, 500 µg/ml
poly(A) (Pharmacia), 50 mM DTT, 10% polyethylene glycol (MW 6000). The
hybridization step is carried out by covering the sample with an appropriate
amount of probe, and incubating for 30 min to 4 hours at 45°C in a chamber
designed to prevent dilution or concentration of the hybridization solution.
Samples are washed sequentially at 55°C in solution A (50% (v/v) formamide, 2X
SSC, 20 mM 2-mercaptoethanol), and solution B (50% (v/v) formamide, 2X SSC, 20 mM
2-mercaptoethanol, 0.5% (v/v) Triton-X-100) and at room temperature in solution C
(2X SSC, 20 mM 2-mercaptoethanol). Following a 15 minute incubation with RNASE,
samples are washed at 50°C in solution C, and at room temperature in 2X SSC.
Samples are dehydrated by sequential washes in 50% ethanol/0.3 M ammonium
acetate, 70% ethanol/0.3 M ammonium acetate, 95%
ethanol/0.3 M ammonium acetate, and 100% ethanol. Slides are air dried and
analysed by film
or by emulsion autoradiography (Ausubel et al., supra).
iii. mRNA Stability/Control of Turnover and mRNA Transcription Rate
Changes in mRNA stability/control of turnover and mRNA transcription rates due
to the
presence of a polymorphism can be detected by the following methods.
mRNA Stability
Gene expression can be regulated by variations in mRNA stability (Liebhaber,
1997, Nucleic Acids Symp Ser., 36:29 and Ross J., 1996, Trends Genet., 5:171). Any
gene variation occurring within the cis-acting elements which control mRNA
abundance may influence gene expression levels (Peltz et al., 1992, Curr Opin
Cell Biol., 4:979). Quantitative RT-PCR (Kohler et al., 1995, Quantitation of
mRNA by polymerase chain reaction, Springer) and mRNA radiolabelling techniques
are two methods for measuring relative mRNA abundance and stability.
Quantitative PCR employs an internal standard to provide a direct comparison
between alternative reactions, enabling comparison of low abundance transcripts
or transcripts derived from a sample that is only available in a limited quantity
(McPherson MJ et al., eds., 1995, PCR 2:
A practical approach. IRL Press).
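As an illustration of how relative abundance measurements might be turned into a stability estimate, the sketch below normalises a target signal to the internal standard and fits an exponential decay to a time course. The time course design (sampling after transcription has been blocked), the function names and the numbers are assumptions introduced for this example and are not taken from the cited references.

```python
import math

def normalise(target_signal, standard_signal):
    """Target signal expressed relative to the co-amplified internal standard."""
    return target_signal / standard_signal

def half_life(times, abundances):
    """Least-squares half-life from a decay time course.

    Fits ln(abundance) = intercept - k * time and returns ln(2) / k,
    in the same units as `times`.
    """
    n = len(times)
    logs = [math.log(a) for a in abundances]
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    slope = sxy / sxx
    if slope >= 0:
        raise ValueError("abundance does not decay over the time course")
    return math.log(2) / -slope

# Hypothetical normalised abundances at 0, 2, 4 and 6 hours.
hours = [0, 2, 4, 6]
relative = [normalise(t, 50.0) for t in (5000.0, 2500.0, 1250.0, 625.0)]
t_half = half_life(hours, relative)
```

Comparing the fitted half-life of transcripts carrying each allele would be one way to express a difference in stability as a single number.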
Assay for mRNA Transcription Rates.
Genetic polymorphism within the regulatory regions of a gene can significantly
alter
transcription rate and mRNA stability, resulting in reduced biological
activity of the encoded
protein. One of the most sensitive assays for measuring the rate of gene
transcription is the
nuclear runoff assay (Groudine and Casimir, 1984, Nucleic Acids Res 12: 1427).
Nuclei isolated
from cell lines expressing the target gene of interest are treated with
radiolabelled UTP and the
level of incorporation of radiolabel into nascent RNA transcripts is
determined by filter
hybridization to immobilized cDNA derived from the target gene.
iv. Intracellular mRNA Localization
A genetic variation can cause a change in the localization of a particular
mRNA species
(e.g. to the cytoskeleton, or to the nuclear scaffold).
Immunohistochemistry
Changes in RNA localization can be detected by immunohistochemical methods
well
known in the art (e.g. in situ analysis described above).
Oocyte Injection Assay
In many cases mRNA, like protein, is localized in relation to the polarity of
the cell or the
cytoskeletal architecture (St. Johnston, 1995, Cell, 81:161). The Xenopus
oocyte is a popular,
experimentally tractable, system for studying intracellular trafficking of
mRNA (Nakielny et al.,
1997, Annu. Rev. Neurosci., 20:269). Fluorescently labelled RNA is
microinjected into the large
oocyte cell where its location can be detected using standard microscopy
methods. Polymorphic
variants of a particular mRNA species may differ in their response to cellular
mechanisms
responsible for partitioning mRNA within the cell. This method has been useful
for
demonstrating that sequence variations can affect sub-cellular localization
(Grimm et al.,
1997, EMBO J., 16:793).
v. Post-Translational Alterations
Post-translational alterations resulting from premature stop codons,
translational
readthrough or multiple open reading frames and translational suppression may
occur as a result
of a polymorphism. To detect post-translational alterations, a polynucleotide
comprising one or
more polymorphisms is subjected to in vitro transcription and in vitro
translation (as described
in sections B and J entitled "Production of a Polynucleotide Sequence" and
"Preparation of a
Labeled Protein"). The translation product(s) are analysed for the appearance
of aberrantly sized
proteins. Additional post-translational alterations that may occur as a result
of a polymorphism
include changes in localization due to an altered signal sequence, and changes
in glycosylation,
myristoylation, and susceptibility to or sites of proteolytic cleavage.
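Before the in vitro translation is run, the expected size shift can be estimated in silico. The following is a minimal sketch that uses the standard genetic code and a rule-of-thumb mean residue mass of about 110 Da; the two short sequences are hypothetical and are not taken from any sequence of the invention.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(cds):
    """Translate a coding sequence from its first codon up to the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3].upper()]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

def approx_mass_kda(protein, mean_residue_da=110.0):
    """Rough molecular mass; 110 Da per residue is an approximation only."""
    return len(protein) * mean_residue_da / 1000.0

# Hypothetical reference allele and a variant carrying a premature stop (TGG -> TGA)
reference = "ATGGCTTGGAAATAA"
variant = "ATGGCTTGAAAATAA"
reference_protein = translate(reference)
variant_protein = translate(variant)
```

A variant product predicted to be shorter than the reference product is the kind of aberrantly sized protein that the gel analysis described above would be expected to reveal.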
The method of immunocytochemistry can be used to determine if a protein is
incorrectly
localized, due to the presence of an altered signal sequence.
Immunohistochemistry
Immunohistochemical techniques including indirect immunofluorescence,
immunoperoxidase labeling or immunogold labeling, are used for protein
localization.
Immunofluorescent labeling of tissue sections (prepared as for in situ
analysis, described
above) is performed by the following method. Slides containing the sample of
interest are
equilibrated to room temperature, washed in PBS, incubated with an appropriate
dilution of
primary antibody (1 hour at room temperature), washed in PBS, incubated with
an appropriate
dilution of secondary antibody (1 hour at room temperature), washed in PBS and
analysed under
a microscope (Ausubel et al., supra). Alternatively, the sensitivity of the
immunohistochemical
reaction is increased by using a streptavidin-secondary antibody conjugate
reacted with a biotin-
fluorochrome conjugate. Alternatively, immunogold labeling is used to detect a
protein of interest
by using an immunogold-conjugated secondary antibody.
Immunoperoxidase labeling of tissue sections is performed by the following
method.
Slides are pretreated in 0.25% hydrogen peroxide, incubated with primary
antibody, washed in
PBS and incubated (1 hour at room temperature) with a specific secondary
bridging antibody
capable of recognizing both the primary antibody and a horseradish
peroxidase-antiperoxidase
(PAP) complex. The slides are washed in PBS and developed in diaminobenzidine
substrate
solution (0.03% (w/v) 3,3'-diaminobenzidine in 200 ml PBS) at room temperature
(Ausubel et
al., supra).
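The substrate concentration quoted above can be converted into a weighed mass with simple arithmetic. The helper below is a hypothetical convenience function and assumes the usual definition of % (w/v) as grams of solute per 100 ml of solution.

```python
def solute_mass_g(percent_w_v, volume_ml):
    """Mass of solute (g) for a % (w/v) solution: grams per 100 ml times volume."""
    return percent_w_v / 100.0 * volume_ml

# 0.03% (w/v) substrate in 200 ml PBS, expressed in milligrams
dab_mg = solute_mass_g(0.03, 200) * 1000
```

Under this definition, 0.03% (w/v) in 200 ml corresponds to approximately 60 mg of substrate.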
Alternatively, protein localization is determined by cell fractionation
wherein cells are
biosynthetically labeled, the labeled material is fractionated, and the
radiolabeled proteins in each
fraction are analysed by immunoprecipitation with an antibody specific for the
protein of interest.
Assay for Glycosylation Inhibition
Changes in protein glycosylation can be detected by radiolabelling a protein
of interest
with sugars, determining if a change in the cellular localization (by
immunocytochemistry) of the
protein in culture has occurred due to aberrant glycosylation, or by
determining the effects of
inhibitors of glycosylation on the migration pattern of proteins analysed by
polyacrylamide gel
electrophoresis.
Post-translational glycosylation of proteins plays an important role in
defining protein
function (Baeziger, 1994, FASEB J., 13:1019; Jacob, 1995, Curr. Opin. Struct.
Biol., 5:605).
Protein glycosylation can be inhibited by tunicamycin, an antibiotic, as well
as by several sugar
analogues (Schwarz, 1991, Behring Inst Mitt., 89:198). These reagents are used
to characterize
the effects of sequence changes on protein glycosylation.
Assay for Post-Translational Modification with Lipids
Changes in protein modification with lipids (e.g. myristoylation) are detected
by
radiolabelling a protein of interest with myristic acid or by determining if a
change in the cellular
localization of the protein in culture has occurred as a result of aberrant
lipid modification (by
immunocytochemistry).
Covalent attachment of lipids is a mechanism by which eukaryotic cells direct
and, in
some cases, control, membrane localization of proteins (Casey, 1994, Curr.
Opin. Cell. Biol.,
2:219). Such post-translational addition of myristyl, palmityl or prenyl side-
chains has a key role
in the functional regulation of many proteins (Chow et al., 1992, Curr. Opin.
Cell. Biol., 4:629;
Resh, 1994, Cell, 763:411). Assays for detecting proteins that are covalently
modified by the
attachment of lipids include labeling with [3H]myristate (Stevenson et al.,
1992, J. Exp. Med.,
176:1053), or a combination of enzymatic and chemical cleavage techniques
performed in
conjunction with tandem mass spectrometry to determine sites of modification
(Papac et al.,
1992, J. Biol. Chem., 267:16889).
Proteolytic Cleavage
Post-translational cleavage of polypeptides is an important mechanism for
modulating
protein function in many physiological processes. Protease activity is
involved in zymogen
processing, activation of enzyme catalysis, tissue/cell remodelling, signal
transduction cascades,
protein degradation and cell death pathways (Rappay, 1989, Prog Histochem
Cytochem., 18:1).
A protein that is predicted to be a protease or the target of a protease can
be assayed in vitro using
purified proteins or cell extracts (Muta et al., 1995, J. Biol. Chem. 270:892)
where cleavage
efficiency is monitored by standard PAGE or western blotting. Alternatively,
proteases and/or
their targets can be expressed from expression plasmids in in vivo cell
culture systems in order
to monitor their biological activity (Zhang, et al.,1998, J. Biol. Chem.
273:1144). The specificity
of proteolytic cleavage is determined using inhibitors that selectively block
serine, cysteine,
aspartic and metallo-proteolytic activity (e.g. pepstatin A selectively
inhibits aspartic proteases)
(Rich, et al., 1985, Biochemistry., 24: 3165).
To determine if a protein has been modified such that the sites of proteolytic
cleavage
have been altered, or susceptibility to proteolytic cleavage has changed, pulse-chase
experiments
with radiolabeled protein can be carried out to determine the precursor-product
relationship
following digestion with a protease of a given specificity. The method of
pulse-chase labeling is
described in Ausubel et al., supra. Alternatively, inhibitors of proteases
(e.g. acid proteases or
serine proteases) can be used to identify protease cleavage sites.
vi. Changes in Receptor Properties
If the gene of interest encodes a receptor protein, a polymorphism may modify
the
properties of the receptor such that receptor binding/turnover or activation
is altered. Receptor
formation can be impaired if a polymorphism causes improper receptor
localization or assembly.
Receptor Localization
To determine if a receptor protein is being expressed at the proper location
(e.g. nucleus,
cytoplasm, cell surface), the receptor can be localized by immunocytochemical
techniques.
Alternatively, cells that are expressing the receptor can be fractionated and
subjected to Western
blot analysis or biosynthetically labeled, fractionated and analysed by
immunoprecipitation.
Protein-Protein Interactions/In Vitro Assembly Assays for Receptors
A number of methods can be used to determine if a receptor is colocalized with
the
appropriate protein partner.
The function of a protein may be dependent on the ability of the protein to
interact with
other proteins as part of a large complex. For example, certain cell surface
receptors consist of
a receptor complex that is composed of several homo- or heteromeric protein
subunits, and
activation by ligand can result in altered protein-protein interactions both
within the receptor
complex and with "downstream" targets such as G-proteins (Okada and Pessin,
1996, J. Biol.
Chem., 271:25533). Protein-protein interactions can be assayed immunologically
by co-
immunoprecipitation of native (Gilboa et al.,1998, J. Biol. Chem.,140:767) or
chemically cross-
linked complexes (Haniu et al., 1997, J. Biol. Chem., 272:25296), or through
protein-protein
mobility shift assays (Stern and Frieden, 1993, Anal. Biochem., 212:221). If
all of the
components of a receptor complex have been identified, one can employ in
vitro reconstitution
assays to assess whether a single protein alteration can affect the
functioning of the entire
complex (Durovic et al., 1994, J. Biol. Chem., 269:30320).
Assay for In Vitro Assembly of Multimeric Protein Complexes
To determine whether these genetic variations have affected protein complex
assembly,
experiments are carried out wherein recombinant mutant subunits are
transfected into cells and
coexpressed with the other subunit components in vitro. Proper assembly is
assessed by
immunoprecipitation of the protein complex in question with antibodies
specific for the various
members of the complex followed by PAGE analysis (Koster et al.,1998,
Biophys. J., 74:1821).
Assay for Receptor Binding/Turnover.
Receptor-ligand interaction is essential for the functionality of the bound
complex.
Genetic changes that alter either ligand or receptor can dramatically affect
receptor binding,
turnover, and subsequent activation of downstream signaling events. Receptor
binding/turnover
can be measured by standard Scatchard analysis of radiolabelled ligand binding
in vitro
(Culouscou et al., 1993, J. Biol. Chem. 268:10458) or in cellular based assays
(Greenlund et al.,
1993, J. Biol. Chem. 268: 18103).
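A Scatchard analysis of this kind can be sketched as follows. The sketch assumes a single class of non-interacting binding sites, for which a plot of bound/free against bound is linear with slope -1/Kd and x-intercept Bmax. The function name and the binding data are hypothetical; the data are generated from a one-site model purely for illustration.

```python
def scatchard_fit(bound, free):
    """Fit bound/free versus bound by least squares.

    Returns (Kd, Bmax) under a one-site model, where
    bound/free = (Bmax - bound) / Kd.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_b = sum(bound) / n
    mean_r = sum(ratios) / n
    sxx = sum((b - mean_b) ** 2 for b in bound)
    sxy = sum((b - mean_b) * (r - mean_r) for b, r in zip(bound, ratios))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_b
    kd = -1.0 / slope            # slope is -1/Kd
    bmax = -intercept / slope    # x-intercept of the fitted line
    return kd, bmax

# Hypothetical data simulated with Kd = 2 nM and Bmax = 10 nM
free_nm = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_nm = [10.0 * f / (2.0 + f) for f in free_nm]
kd, bmax = scatchard_fit(bound_nm, free_nm)
```

A shift in the fitted Kd or Bmax between the reference and variant receptor would indicate altered affinity or an altered number of binding sites, respectively.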
Ligand Binding as Measured by Affinity Chromatography
Alternatively, affinity chromatography methods (well known in the art) can be
employed
to determine if a receptor is demonstrating aberrant binding characteristics.
According to the
method of affinity chromatography, receptor-ligand interactions are allowed to
occur, and the
binding efficiency of receptor and ligand and/or turnover of receptor-ligand
complexes is
measured. Alternatively, affinity chromatography can be used to isolate one or
more components
of a receptor ligand interaction for further analysis (March et al., 1974,
Adv. Exp. Med. Biol.,
42:3). The method of affinity chromatography typically involves immobilizing
on a solid support
one component, for example a known ligand for a receptor, and then incubating
the immobilized
ligand with radiolabelled protein under optimal binding conditions. To measure
the exact binding
affinity of a given ligand-receptor pair, an increasing amount of non-labeled
competitor is added.
This assay can be used to assess altered binding efficiency resulting from the
presence of a
polymorphism in a protein of interest.
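The competition step described above can be analysed along the following lines. This is a sketch under stated assumptions: the IC50 is read by log-linear interpolation between the two competitor concentrations that bracket 50% binding, and is converted to an inhibition constant with the Cheng-Prusoff relation, which holds for simple competitive binding at a single site. The function names and data are hypothetical, and competitor concentrations are assumed to be positive and listed in ascending order.

```python
import math

def ic50_from_competition(competitor_conc, fraction_bound):
    """Interpolate (on a log scale) the competitor concentration giving 50% binding."""
    points = list(zip(competitor_conc, fraction_bound))
    for (c1, f1), (c2, f2) in zip(points, points[1:]):
        if f1 >= 0.5 >= f2:
            if f1 == f2:
                return c1
            t = (f1 - 0.5) / (f1 - f2)
            log_c = math.log10(c1) + t * (math.log10(c2) - math.log10(c1))
            return 10 ** log_c
    raise ValueError("binding never crosses 50% in the supplied range")

def cheng_prusoff_ki(ic50, labelled_conc, labelled_kd):
    """Ki = IC50 / (1 + [L]/Kd) for a competitive, single-site interaction."""
    return ic50 / (1.0 + labelled_conc / labelled_kd)

# Hypothetical competition curve (fraction of labelled protein still bound)
conc_nm = [1.0, 10.0, 100.0]
fraction = [1.0 / (1.0 + c / 10.0) for c in conc_nm]
ic50 = ic50_from_competition(conc_nm, fraction)
ki = cheng_prusoff_ki(ic50, labelled_conc=2.0, labelled_kd=1.0)
```

Running the same analysis for the reference and variant proteins allows the binding constants to be compared directly.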
Receptor Activation Assays: Phosphorylation, Kinase Activity and Mitogenic Stimulation
Almost all signaling that occurs through cell surface receptors is regulated
by
phosphorylation, a reversible post-translational event that occurs at specific
amino acid residues
and is catalyzed by a protein kinase activity present within the receptor
itself
(autophosphorylation) or in trans via direct interaction with an associated
kinase (Hunter, 1997,
Philos Trans R Soc Lond B Biol Sci., 353:583). The specific effect of
phosphorylation on a
biological activity depends on the receptor, but often results in modulation
of endogenous
receptor kinase activity or interaction with associated proteins, which are
also often kinases. The
results of a phosphorylation event are passed on through a cascade of protein
kinases/phosphatases which ultimately affect downstream processes controlling
gene
transcription, cell proliferation, metabolism, movement and differentiation
(Patarca, 1996, Crit
Rev Oncog., 7:343). The biological function of a receptor is usually assayed
in cell culture
following over-expression. The phosphorylated state of a receptor can be
assayed directly by
immunological methods by employing an antibody that specifically recognizes a
phosphorylated
residue (Bangalore, 1992, Proc Natl Acad Sci USA., 89:11637). Endogenous
kinase activity
associated with a receptor is measured via the incorporation of radiolabelled
phosphate in
immunoprecipitated receptor complex (Kazlauskas and Cooper, 1989, Cell
58:1121).
"Downstream" events of receptor activity including mitogenic stimulation or
MAP kinase activity,
can be measured by tritiated thymidine incorporation (Luo et al., 1996, Cancer
Res. 56:4983),
or by mobility-shift analysis of MAP kinase on western blots (Vietor, 1993,
J. Biol. Chem.
268:18994), respectively.
Immunocytochemical methods can be used to determine if a receptor-ligand
complex is
correctly translocated to the nucleus. Alternatively, nuclear preparations
(prepared as described
below) can be analysed by Western blot or immunoprecipitation for the presence
of the receptor
protein.
If a receptor is a transcriptional activator, the ability of the receptor to
induce gene
expression can be measured by a variety of methods including Northern blot
analysis, or reporter
gene assays wherein the promoter region isolated from a gene that is activated
by the receptor
regulates the expression of a reporter protein.
vii. Enzyme Catalysis
The gene of interest may encode a protein that has an enzymatic activity
wherein the
enzyme catalyzes a reaction that is critical to the general metabolism of a
cell. To determine if
a mutated protein is impaired in its enzymatic function, assays can be
performed to measure the
enzymatic activity of the protein. There are many important enzymatic
activities associated with
normal cellular metabolism, including: glycosidation, esterification,
amidation, hydroxylation,
acetylation, sulfonylation and alkylation. Each of these activities is assayed
using in vitro methods
employing overexpressed or purified proteins, well known in the art (Eisenthal
and Danson,
1992, Enzyme Assays: A Practical Approach, Rickwood et al., Eds., IRL Press.
Oxford,
England).
The protein of interest may also be involved in various aspects of DNA
synthesis or
replication. In vitro assays for the enzymatic reactions involved in DNA
synthesis or replication
(e.g. polymerase, ligase, exonuclease or helicase activity) are known in the
art. The biological
activity of the proteins catalyzing these activities is assayed in vitro
using standard enzymatic
techniques (Adams,199, DNA Replication: A Practical Approach I, Rickwood, et
al., Eds., IRL
Press. Oxford, England).
If the protein of interest is involved in glycolysis or energy transport,
assays for measuring
transporter activity or the activity of ATP dependent pumps are useful,
according to the
invention, for determining if a mutated protein is impaired in these
functions.
Transporter Activity
Mammalian cells possess a variety of transporter systems, for example amino
acid
transporters, which have overlapping substrate specificity (Van Winkle et al.,
1993, Biochim
Biophys Acta, 1154:157). To determine if a polymorphism in a candidate gene of
interest has
altered the function of the protein product of this gene as a molecular
transporter, the full-length
cDNA clone is isolated by standard expression cloning strategies, and a change
in activity of the
full-length cDNA or antisense cDNA upon microinjection into Xenopus laevis
oocytes is
determined by measuring changes in influx/efflux transport of radiolabelled
amino acid
molecules (Broer et al.,1995, Biochem J., 312(Pt 3):863), neurotransmitters or
their metabolites.
ATP-Dependent Pump Activity
Mammalian cells possess a variety of molecules that are categorized as ATP-
binding
cassette or ATP-dependent transporters or pumps. These include the Na+-K+-
ATPase ion pump,
the calcium uptake pump, (K+ + H+)-ATPase and the human multidrug resistant
protein termed
P-glycoprotein. Alterations in pump activity are investigated by expressing
the clone specific for
the pump protein(s) of interest in Xenopus oocytes, and performing tracer
studies which measure
the changes in ATP-dependent uptake or extrusion of a radiolabelled substrate,
and changes in
the coupling ratios (e.g. moles substrate transported/mole ATP hydrolyzed)
(Shapiro et al.,1998,
Eur. J. Biochem., 254:189).
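The coupling ratio mentioned above is a simple quotient, sketched below. The sketch assumes that ATP-dependent transport is obtained by subtracting the uptake measured without ATP from the uptake measured with ATP; the function names and the tracer values are hypothetical.

```python
def atp_dependent_uptake(uptake_with_atp_mol, uptake_without_atp_mol):
    """Net ATP-dependent transport (mol), correcting for ATP-independent uptake."""
    return uptake_with_atp_mol - uptake_without_atp_mol

def coupling_ratio(substrate_transported_mol, atp_hydrolysed_mol):
    """Moles of substrate transported per mole of ATP hydrolysed."""
    if atp_hydrolysed_mol <= 0:
        raise ValueError("ATP hydrolysed must be positive")
    return substrate_transported_mol / atp_hydrolysed_mol

# Hypothetical tracer measurements for one pump variant
net_uptake = atp_dependent_uptake(7.5e-9, 1.5e-9)
ratio = coupling_ratio(net_uptake, 3.0e-9)
```

A variant pump whose ratio departs from that of the reference protein would suggest altered coupling between ATP hydrolysis and transport.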
viii. Ion Channel
The gene of interest may encode a protein that is a component of an ion
channel.
Immunocytochemical methods can be used to determine if an ion channel protein
demonstrates
the appropriate cell type specificity.
The activity of an ion channel can be measured by electrophysiological methods
in
oocytes. Alternatively, the sensitivity of ion channel activity to a
particular inhibitor can be
determined.
Assays for Ion Channel Activity in Oocytes
Polymorphisms which alter ion channel function and regulation are studied
using the
oocytes of Xenopus laevis. Injection of the oocytes with exogenous in vitro
transcribed mRNA
results in the production and functional expression of foreign membrane
proteins, including
voltage- and neurotransmitter-operated ion channels (Dascal et al., 1987,
CRC Crit Rev
Biochem., 224:317). Changes in the oocyte transmembrane current in response to
expression of
an exogenous mRNA are measured. This technique has been improved by the
development of
rapid superfusion systems that utilize a dual role perfusion micropipette that
controls internal
solution as well as monitoring voltage (Costa et al., 1994, Biophys J.,
67:395). This technology
represents a useful system for studying various aspects of ion channels
encoded for by foreign
mRNAs including channel expression, single-channel behavior, and the response
of channels to
the action of pharmacologically active substances (Sigel, 1987, J. Physiol.,
386: 73).
Patch Clamp Assays for Ion Channel Activity.
The function of individual channel proteins is determined by the high
resolution patch
clamp technique. This technique (which is useful in a variety of cell types,
including Xes2opus
oocytes described above) involves measuring changes in transmembrane current
across the cell
membrane in vitro (Sachs et al., 1983, Methods Enzymol., 103: 147). Processes
such as
signaling, secretion, and synaptic transmission are examined at the cellular
level by the patch
clamp method. The gene expression pattern and protein structure of ionic
channels can be
determined by combining information derived from high-resolution
electrophysiological
recordings obtained by the patch clamp method with molecular biological
analysis (Liem et al.,
1995, Neurosurgery, 36: 382).
A polymorphic variation in a gene that encodes a protein that is a member of a
multimeric
protein complex, such as an ion channel or a cytoskeletal structural
component, can alter the
assembly and function of the multimeric protein complex (Lee et al., 1994,
Biophys J., 66: 667).
A gene variation may affect protein-protein interaction, or disrupt the
production of components
of a multimeric complex, thereby disrupting stoichiometry and consequently
decreasing stability.
Assay for In Vitro Assembly of Multimeric Protein Complexes
In vitro assembly assays (described above) can be performed to determine if a
polymorphism has affected the assembly of an ion channel.
ix. Cellular Properties
The influence of a polymorphism on general aspects of cell behavior, including
cell
morphology, adhesive properties, differentiation and proliferation can be
assessed using a
combination of methods including microscopic observation of cell cultures
(Azuma et al.,1994,
Histol. Histopathol., 9:781), immunohistochemistry, and FACS analysis
techniques (Beesley,
1993, Immunocytochemistry: a Practical Approach, Rickwood, et al., (Eds), IRL
Press and
Ormerod, 1994, Flow Cytometry: a practical Approach, Rickwood et al., (Eds),
IRL Press.
Oxford, England).
Assays for Measuring Apoptosis
Apoptosis has been implicated in the etiology and pathophysiology of a variety
of human
diseases. Gene variants which influence the process of apoptosis can be
assessed by a variety of
methods of analysis involving either the tissues or cells (Allen et al., 1997,
J Pharmacol Toxicol
Methods, 37:215). Cell cultures expressing the gene variants of interest
are analysed using
Annexin V which interacts strongly with phosphatidylserine residues that have
been exposed as
a result of plasma membrane breakdown occurring in the early stages of
apoptosis. Either vital
or fixed material can be analysed by Annexin V labeling in combination with
microscopy and
flow cytometry detection methods (van Engeland et al., 1998, Cytometry, 31:1).
TdT-mediated
deoxyuridine triphosphate (dUTP)-biotin nick end-labeling (TUNEL) is a
preferred method for
specific staining of apoptotic cells in histological sections and cytology
specimens (Labat-Moleur
et al., 1998, J. Histochem Cytochem., 46:327; Sasano et al., 1998., Diagn
Cytopathol.,18:398).
Apoptosis is also detected by quantification of DNA fragmentation by ethidium
bromide staining
and gel electrophoresis, or by the use of saturation labeling of 3' ends of
DNA fragments (Peng
and Liu, 1997, Lab Invest., 77:547).
Assay for In Vivo Receptor Function: Growth Cone Guidance Assay.
Activation of cell-surface receptors can result in the stimulation of cell
motility. There
are many different families of signaling molecules, for example the netrins,
(Serafini et al.,1994,
Cell, 78:409), which are responsible for both contact-mediated and chemo-mediated
attraction and
repulsion of migrating cells. A classic model for this activity is the
trajectory that the leading
edge "growth cone" takes when a neuron is stimulated to grow out from
explanted neural tissue
in cell culture (Goodman, 1996, Annu Rev Neurosci. 19: 341). Ligands present
in the culture
medium or immobilized on a substrate bind to receptors on the cell-surface of
the growth cone
and trigger second-messenger signals thereby dictating an appropriate steering
response. The
biological activity of such receptors or ligands can be measured by
overexpressing the receptor
or ligand protein in culture and then monitoring growth cone guidance
(Kremoser et al., 1995,
Cell 82: 359). Attraction or repulsion of cells which is observed to be
different than normal is
an indication of the role of this protein in growth guidance, and identifies
the polymorphisms as
altering function.
x. Changes in gene expression or protein function that result from the
presence of a
polymorphism can be detected by in vivo assays including the production of
transgenic animals,
knock out animals or the analysis of naturally occurring animal models of a
particular disease.
Transgenic Animals
Transgenic mice provide a useful tool for genetic and developmental biology
studies and
for the determination of a function of a novel sequence. According to the
method of conventional
transgenesis, additional copies of normal or modified genes are injected into
the male pronucleus
of the zygote and become integrated into the genomic DNA of the recipient
mouse. The
transgene is transmitted in a Mendelian manner in established transgenic
strains.
Constructs useful for creating transgenic animals comprise genes under the
control of
either their normal promoters or an inducible promoter, reporter genes under
the control of
promoters to be analysed with respect to their patterns of tissue expression
and regulation, and
constructs containing dominant mutations, mutant promoters, and artificial
fusion genes to be
studied with regard to their specific developmental outcome. Transgenic mice
are useful
according to the invention for analysis of the dominant effects of
overexpressing a candidate gene
in mouse. Typically, DNA fragments on the order of 10 kilobases or less are
used to construct
a transgenic animal (Reeves,1998, New. Anat., 253:19). Transgenic animals can
be created with
a construct comprising a candidate gene containing one or more polymorphisms
according to the
invention. Alternatively, a transgenic animal expressing a candidate gene
containing a single
polymorphism can be crossed to a second transgenic animal expressing a
candidate gene
containing a different polymorphism and the combined effects of the two
polymorphisms can be
studied in the offspring animals. Transgenic mice engineered to overexpress a
number of genes,
including PCK1 (Valera et al., 1994, Proc. Natl. Acad. Sci. USA, 91: 9151),
INS (Mitanchez et
al., FEBS Letters, 421: 285), IAPP (D'Alession et al.,1994, Osteoporosis,
43:1457), Asp (Klebig
et al., Proc. Natl. Acad. Sci. USA, 92: 4728) and Agrt (Graham et al., Nature
Genetics, 17:273),
have been prepared and may be useful for studying osteoporosis.
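When two transgenic lines carrying different polymorphisms are crossed as described above, the expected proportion of offspring carrying both transgenes follows from Mendelian segregation. The sketch below assumes that the two transgenes are integrated at unlinked loci and that each parent carries only its own transgene, in one copy (hemizygous) or two copies (homozygous). The function name and parameters are hypothetical.

```python
from fractions import Fraction
from itertools import product

def cross_transgenic_lines(copies_in_parent_a=1, copies_in_parent_b=1):
    """Expected offspring classes from crossing a carrier of transgene A with a
    carrier of transgene B. Keys are (inherits_A, inherits_B)."""
    # Each parent contributes one of two gametes; a hemizygous parent passes
    # its transgene in only one of them.
    gametes_a = [True, copies_in_parent_a == 2]
    gametes_b = [True, copies_in_parent_b == 2]
    counts = {}
    for has_a, has_b in product(gametes_a, gametes_b):
        counts[(has_a, has_b)] = counts.get((has_a, has_b), 0) + 1
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}

offspring = cross_transgenic_lines()          # both parents hemizygous
double_transgenic = offspring[(True, True)]   # carries both polymorphisms
```

With two hemizygous parents, one quarter of the offspring are expected to carry both transgenes, which indicates how many animals must be genotyped to obtain a given number of double-transgenic animals.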
Knock Out Animals
i. Standard
Knock out animals are produced by the method of creating gene deletions with
homologous recombination. This technique is based on the development of
embryonic stem (ES)
cells that are derived from embryos, are maintained in culture and have the
capacity to participate
in the development of every tissue in the mouse when introduced into a host
blastocyst. A knock
out animal is produced by directing homologous recombination to a specific
target gene in the
ES cells, thereby producing a null allele of the gene. The potential
phenotypic consequences of
this null allele (either in heterozygous or homozygous offspring) can be
analysed (Reeves, supra).
Single or double knock out mice that may be useful for studying osteoporosis
have been
produced for a number of genes including IRS1 (Araki et al., 1994, Nature,
372:186, Tamemoto
et al., 1994, Nature, 372:182), IRS2 (Withers et al., 1998, Nature, 391:900),
INSR, BIRKO,
MIRKO, INSR (Lamothe et al., 1998, FEBS Letter, 426:381), GLUT2, GLUT4 (Katz
et al.,
1995, Nature, 377:151), GLP1R (Gallwitz and Schmidt, 1997, Z. Gastroenterol,
35:655), GCK
(Sakura et al., 1998, Diabetologia, 41:654), GCK/IRS1, IRS1/INSR, MC4R
(Huszar et al., 1997,
Cell, 88:131) and BRS3 (Ohki-Hamazaki et al., 1997, Nature, 390:165).
ii. In Vivo Tissue-Specific Knock Out in Mice Using Cre-lox.
The method of targeted homologous recombination has been improved by the
development of a system for site-specific recombination based on the
bacteriophage P1 site
specific recombinase Cre. The Cre-loxP site-specific DNA recombinase from
bacteriophage P1
is used in transgenic mouse assays in order to create gene knockouts
restricted to defined tissues
or developmental stages. Regionally restricted genetic deletion, as opposed to
global gene
knockout, has the advantage that a phenotype can be attributed to a particular
cell/tissue (Marth,
1996, Clin. Invest. 97: 1999). In the Cre-loxP system one transgenic mouse
strain is engineered
such that loxP sites flank one or more exons of the gene of interest.
Homozygotes for this so
called 'floxed gene' are crossed with a second transgenic mouse that expresses
the Cre gene under
control of a cell/tissue type transcriptional promoter. Cre protein then
excises DNA between loxP
recognition sequences and effectively removes target gene function (Sauer,
1998, Methods,
14:381). There are now many in vivo examples of this method, including the
inducible
inactivation of mammary tissue specific genes (Wagner et al., 1997, Nucleic
Acids Res.,
25:4323).
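The excision step can be modelled on a sequence string as follows. This is a simplified sketch: it assumes two loxP sites in the same orientation on one strand, uses a commonly quoted 34 bp loxP sequence (to be verified against the construct actually used), and ignores inverted or mutant sites. The function name and the flanking sequences are hypothetical.

```python
# Commonly quoted 34 bp loxP site; treat as an assumption for this sketch.
LOXP = "ATAACTTCGTATAGCATACATTATACGAAGTTAT"

def cre_excise(sequence, site=LOXP):
    """Remove the DNA between the first two directly repeated sites,
    leaving a single site behind. Returns the input unchanged if fewer
    than two sites are found."""
    first = sequence.find(site)
    if first == -1:
        return sequence
    second = sequence.find(site, first + len(site))
    if second == -1:
        return sequence
    return sequence[:first + len(site)] + sequence[second + len(site):]

# Hypothetical floxed allele: flank + loxP + exon + loxP + flank
floxed_allele = "AAAC" + LOXP + "GGGCCCGGG" + LOXP + "TTTG"
recombined_allele = cre_excise(floxed_allele)
```

In tissues that do not express Cre the floxed allele is left intact, which is the basis of the regional restriction described above.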
iii. BAC Rescue of Knock Out Phenotype
In order to verify that a particular genetic polymorphism/mutation is
responsible for
altered protein function in vivo, one can "rescue" the altered protein
function by introducing a
wild-type copy of the gene in question. In vivo complementation with
bacterial artificial
chromosome (BAC) clones expressed in transgenic mice can be used for these
purposes. This
method has been used for the identification of the mouse circadian Clock gene
(Antoch et al.,
1997, Cell 89: 655).
G. Production of an Amplified Product
Amplified products useful according to the invention can be prepared by
utilizing the
method of PCR as described in Section B entitled "Production of a
Polynucleotide Sequence".
Primers useful for producing an amplified product according to the invention
(e.g. an amplified
product comprising one or more polymorphisms) can be designed and synthesized
as described
in Section A entitled "Design and Synthesis of Oligonucleotide Primers".
The invention provides methods (e.g. Southern blot analysis, PCR, primer
extension and
oligonucleotide hybridization), of detecting a polymorphism in an amplified
product.
H. Production of a Mutant Protein
1. Expression of the Nucleotide Sequence
In accordance with the present invention, polynucleotide sequences which
encode
candidate gene protein fragments, fusion proteins or functional equivalents
thereof may be used
in recombinant DNA molecules that direct the expression of a candidate gene
protein in
appropriate host cells. Due to the inherent degeneracy of the genetic code,
other DNA sequences
which encode substantially the same or a functionally equivalent amino acid
sequence, may be
used to clone and express the candidate gene protein. As will be understood by
those of skill in
the art, it may be advantageous to produce candidate gene-encoding nucleotide
sequences
possessing non-naturally occurring codons. Codons preferred by a particular
prokaryotic or
eukaryotic host (Murray et al., 1989, Nucleic Acids Res 17:477) can be
selected, for example, to
increase the rate of protein expression or to produce recombinant RNA
transcripts having
desirable properties, such as a longer half-life as compared to transcripts
produced from the
naturally occurring sequence.
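Selection of host-preferred codons can be sketched as a back-translation of the protein sequence. The preference table below is illustrative only and covers a few residues; it is not measured codon-usage data for any host, and the function name, stop codon choice and peptide are hypothetical.

```python
# Illustrative preference table only; not measured codon-usage data.
PREFERRED_CODON = {
    "M": "ATG", "A": "GCC", "W": "TGG",
    "K": "AAG", "L": "CTG", "S": "AGC",
}

def back_translate(protein, preferred=PREFERRED_CODON, stop_codon="TGA"):
    """Encode a protein using one preferred codon per residue, plus a stop codon."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon defined for: {', '.join(missing)}")
    return "".join(preferred[residue] for residue in protein) + stop_codon

# Hypothetical four-residue peptide
recoded = back_translate("MAWK")
```

Because the genetic code is degenerate, the recoded sequence encodes the same amino acid sequence while using the codons assumed to be preferred by the host.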
The nucleotide sequences of the present invention can be engineered in order
to alter a
candidate gene-encoding sequence for a variety of reasons, including but not
limited to,
alterations which modify the cloning, processing and/or expression of the gene
product. For
example, mutations may be introduced using techniques which are well known in
the art, e.g.,
site-directed mutagenesis to insert new restriction sites, to alter
glycosylation patterns, to change
codon preference or to produce splice variants.
In another embodiment of the invention, a natural, modified or recombinant
candidate
gene protein-encoding sequence may be ligated to a heterologous sequence to
encode a fusion
protein (as described in Section B entitled "Production of a Polynucleotide
Sequence"). For
example, for screening of peptide libraries for inhibitors of candidate gene
protein activity, it may
be useful to encode a chimeric protein that is recognized by a commercially
available antibody.
A fusion protein may also be engineered to contain a cleavage site located
between a candidate
protein and the heterologous protein sequence, so that the protein of interest
may be substantially
purified away from the heterologous moiety following cleavage.
In another embodiment of the invention, the sequence encoding the candidate
gene
protein may be synthesized, whole or in part, using chemical methods well
known in the art (see
Caruthers, et al., 1980, Nuc Acids Res Symp Ser, 7:215; Horn, et al., 1980, Nuc
Acids Res Symp
Ser, 225, etc.). Alternatively, the protein itself, or a portion thereof, could
be produced using
chemical methods of synthesis. For example, peptide synthesis can be performed
using various
solid-phase techniques (Roberge, et al., 1995, Science, 269:202) and automated
synthesis may
be achieved, for example, using the ABI 431A Peptide Synthesizer (Perkin
Elmer) in accordance
with the instructions provided by the manufacturer.
The newly synthesized peptide can be substantially purified by preparative
high
performance liquid chromatography (e.g., Creighton, 1983, Proteins, Structures
and Molecular
Principles, WH Freeman and Co. New York NY). The composition of the synthetic
peptides may
be confirmed by amino acid analysis or sequencing (e.g., the Edman degradation
procedure;
Creighton, supra). Additionally the amino acid sequence of interest, or any
part thereof, may be
altered during direct synthesis and/or combined using chemical methods with
sequences from
other proteins, or any part thereof, to produce a variant polypeptide.
2. Expression Systems
In order to express a biologically active protein, the nucleotide sequence
encoding the
protein of interest or its functional equivalent, is inserted into an
appropriate expression vector,
i.e., a vector which contains the necessary elements for the transcription and
translation of the
inserted coding sequence.
Methods which are well known to those skilled in the art can be used to
construct
expression vectors containing a protein-encoding sequence and appropriate
transcriptional or
translational controls. These methods include in vivo recombination or
genetic recombination.
Such techniques are described in Ausubel et al., supra and Sambrook et al.,
supra.
A variety of expression vector/host systems may be utilized to contain and
express a
protein product of a candidate gene according to the invention. These include
but are not limited
to microorganisms such as bacteria transformed with recombinant bacteriophage,
plasmid or
cosmid DNA expression vectors; yeast transformed with yeast expression
vectors; insect cell
systems infected with virus expression vectors (e.g., baculovirus); plant cell
systems transfected
with virus expression vector (e.g., cauliflower mosaic virus, CaMV; tobacco
mosaic virus, TMV)
or transformed with bacterial expression vectors (e.g., Ti or pBR322 plasmid);
or animal cell
systems.
The "control elements" or "regulatory sequences" of these systems vary in
their strength
and specificities and are those nontranslated regions of the vector,
enhancers, promoters, and 3'
untranslated regions, which interact with host cellular proteins to carry out
transcription and
translation. Depending on the vector system and host utilized, any number of
suitable
transcription and translation elements, including constitutive and inducible
promoters, may be
used. For example, when cloning in bacterial systems, inducible promoters such
as the hybrid
lacZ promoter of the Bluescript® phagemid (Stratagene, La Jolla CA) or pSport1
(Gibco BRL)
and ptrp-lac hybrids and the like may be used. The baculovirus polyhedrin
promoter may be used
in insect cells. Promoters or enhancers derived from the genomes of plant
cells (e.g., heat shock,
RUBISCO; and storage protein genes) or from plant virus (e.g. viral promoters
or leader
sequences) may be cloned into the vector. In mammalian cell systems promoters
from the
mammalian genes or from mammalian viruses are most appropriate. If it is
necessary to generate
a cell line that contains multiple copies of the sequence encoding the protein
product of the gene
of interest, vectors based on SV40 or EBV may be used with an appropriate
selectable marker.
In bacterial systems, a number of expression vectors may be selected depending
upon the
use intended for the protein of interest. For example, when large quantities
of a protein are
required for the production of antibodies, vectors which direct high level
expression of fusion
proteins that are readily purified may be desirable. Such vectors include, but
are not limited to,
the multifunctional E. coli cloning and expression vectors such as Bluescript®
(Stratagene), in
which the sequence encoding the protein of interest may be ligated into the
vector in frame with
sequences encoding the amino-terminal Met and the subsequent 27 residues of
β-galactosidase
so that a hybrid protein is produced; pIN vectors (Van Heeke & Schuster, 1989,
J Biol Chem
264:5503); and the like. pGEX vectors (Promega, Madison WI) may also be used
to express
foreign polypeptides as fusion proteins with GST. In general, such fusion
proteins are soluble and
can easily be purified from lysed cells by adsorption to glutathione-agarose
beads followed by
elution in the presence of free glutathione. Proteins made in such systems are
designed to include
heparin, thrombin or factor Xa protease cleavage sites so that the cloned
polypeptide of interest
can be released from the GST moiety at will.
In the yeast, Saccharomyces cerevisiae, a number of vectors containing
constitutive or
inducible promoters such as alpha factor, alcohol oxidase and PGH may be used.
For reviews,
see Ausubel et al (supra) and Grant et al., 1987, Methods in Enzymology
153:516.
In cases where plant expression vectors are used, the expression of a sequence
encoding
a protein of interest may be driven by any of a number of promoters. For
example, viral
promoters such as the 35S and 19S promoters of CaMV (Brisson et al., 1984,
Nature 310:511)
may be used alone or in combination with the omega leader sequence from TMV
(Takamatsu et
al.,1987, EMBO J 6:307). Alternatively, plant promoters such as the small
subunit of RUBISCO
(Coruzzi et al., 1984, EMBO J 3:1671; Broglie et al., 1984, Science, 224:838);
or heat shock
promoters (Winter J and Sinibaldi RM, 1991, Results Probl Cell Differ., 17:85)
may be used.
These constructs can be introduced into plant cells by direct DNA
transformation or pathogen-mediated
transfection. For reviews of such techniques, see Hobbs S or Murry LE
in McGraw Hill
Yearbook of Science and Technology (1992) McGraw Hill New York NY, pp 191-196
or
Weissbach and Weissbach (1988) Methods for Plant Molecular Biology, Academic
Press, New
York, pp 421-463.
An alternative expression system which could be used to express a protein of
interest is
an insect system. In one such system, Autographa californica nuclear
polyhedrosis virus
(AcNPV) is used as a vector to express foreign genes in Spodoptera frugiperda
cells or in
Trichoplusia larvae. The sequence encoding the protein of interest may be
cloned into a
nonessential region of the virus, such as the polyhedrin gene, and placed
under control of the
polyhedrin promoter. Successful insertion of the sequence encoding the protein
of interest will
render the polyhedrin gene inactive and produce recombinant virus lacking coat
protein.
The recombinant viruses are then used to infect S. frugiperda cells or
Trichoplusia larvae in which
the protein of interest is expressed (Smith et al., 1983, J Virol 46:584;
Engelhard, et al., 1994,
Proc Natl Acad Sci 91:3224).
In mammalian host cells, a number of viral-based expression systems may be
utilized. In
cases where an adenovirus is used as an expression vector, a sequence encoding
the protein of
interest may be ligated into an adenovirus transcription/translation complex
consisting of the late
promoter and tripartite leader sequence. Insertion in a nonessential E1 or E3
region of the viral
genome will result in a viable virus capable of expressing in infected host
cells (Logan and
Shenk, 1984, Proc Natl Acad Sci, 81:3655). In addition, transcription
enhancers, such as the Rous
sarcoma virus (RSV) enhancer, may be used to increase expression in mammalian
host cells.
Specific initiation signals may also be required for efficient translation of
a sequence
encoding the protein of interest. These signals include the ATG initiation
codon and adjacent
sequences. In cases where the sequence encoding the protein, its initiation
codon and upstream
sequences are inserted into the most appropriate expression vector, no
additional translational
control signals may be needed. However, in cases where only coding sequence,
or a portion
thereof, is inserted, exogenous translational control signals including the
ATG initiation codon
must be provided. Furthermore, the initiation codon must be in the correct
reading frame to
ensure translation of the entire insert. Exogenous translational elements
and initiation codons
can be of various origins, both natural and synthetic. The efficiency of
expression may be
enhanced by the inclusion of enhancers appropriate to the cell system in use
(Scharf, et al., 1994,
Results Probl Cell Differ, 20:125; Bittner et al., 1987, Methods in Enzymol,
153:516).
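The reading-frame requirement described above can be checked on a construct sequence before cloning. The following is a minimal sketch under stated assumptions: the function name `in_frame_orf_length` and its arguments are hypothetical, the first ATG in the construct is taken to be the initiation codon, and only the coding strand is examined.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def in_frame_orf_length(construct: str, insert_start: int) -> int:
    """Count codons read from the first ATG up to the first stop codon.

    Returns 0 if no ATG lies at or before the insert, or if the insert
    does not begin a whole number of codons after that ATG.

    construct:    vector-plus-insert sequence, 5' to 3'
    insert_start: 0-based index where the inserted coding sequence begins
    """
    seq = construct.upper()
    atg = seq.find("ATG")
    if atg == -1 or atg > insert_start:
        return 0
    if (insert_start - atg) % 3 != 0:
        return 0  # insert is out of frame with the initiation codon
    codons = 0
    for i in range(atg, len(seq) - 2, 3):
        if seq[i:i + 3] in STOP_CODONS:
            break
        codons += 1
    return codons


if __name__ == "__main__":
    demo = "GGATGGCTAAACCCTAA"  # ATG GCT AAA CCC TAA after a 2-base leader
    print(in_frame_orf_length(demo, insert_start=5))
```

A return value of 0 flags a construct in which the insert would not be translated in its intended frame, which is the failure the paragraph above warns against.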
In addition, a host cell strain may be chosen for its ability to modulate the
expression of
the inserted sequences or to process the expressed protein in the desired
fashion. Such
modifications of the polypeptide include but are not limited to, acetylation,
carboxylation,
glycosylation, phosphorylation, lipidation and acylation. Post-translational
processing which
cleaves a "prepro" form of the protein may also be important for correct
insertion, folding and/or
function. Different host cells such as CHO, HeLa, MDCK, 293, WI38, etc. have
specific cellular
machinery and characteristic mechanisms for such post-translational activities
and may be chosen
to ensure the correct modification and processing of the introduced, foreign
protein.
For long-term, high-yield production of recombinant proteins, stable
expression is
preferred. For example, cell lines which stably express a foreign protein may
be transformed
using expression vectors which contain viral origins of replication or
endogenous expression
elements and a selectable marker gene. Following the introduction of the
vector, cells may be
allowed to grow for 1-2 days in an enriched media before they are switched to
selective media.
The purpose of the selectable marker is to confer resistance to selection, and
its presence allows
growth and recovery of cells which successfully express the introduced
sequences. Resistant
clumps of stably transformed cells can be expanded using tissue culture
techniques appropriate
to the cell type.
Any number of selection systems may be used to recover transformed cell lines.
These
include, but are not limited to, the herpes simplex virus thymidine kinase
(Wigler et al., 1977,
Cell 11:223) and adenine phosphoribosyltransferase (Lowy, et al., 1980, Cell
22:817) genes
which can be employed in tk- or aprt- cells, respectively. Also,
antimetabolite, antibiotic or
herbicide resistance can be used as the basis for selection; for example, dhfr
which confers
resistance to methotrexate (Wigler et al., 1980, Proc Natl Acad Sci 77:3567);
npt, which confers
resistance to the aminoglycosides neomycin and G-418 (Colbere-Garapin et al.,
1981, J Mol
Biol, 150:1) and als or pat, which confer resistance to chlorsulfuron and
phosphinothricin,
respectively (Murry, supra). Additional selectable genes
have been described,
for example, trpB, which allows cells to utilize indole in place of
tryptophan, or hisD, which
allows cells to utilize histidinol in place of histidine (Hartman and
Mulligan, 1988, Proc Natl Acad
Sci 85:8047). Recently, the use of visible markers has gained popularity with
such markers as
anthocyanins, β-glucuronidase (GUS) and its substrate, and luciferase and its
substrate, luciferin,
being widely used not only to identify transformants, but also to quantify the
amount of transient
or stable protein expression attributable to a specific vector system (Rhodes
et al.,1995, Methods
Mol Biol 55:121).
3. Identification of Transformants Containing the Polynucleotide Sequence
Although the presence/absence of marker gene expression suggests that the gene
of
interest is also present, its presence and expression should be confirmed. For
example, if the
sequence encoding a foreign protein is inserted within a marker gene sequence,
recombinant cells
containing the sequence encoding the foreign protein can be identified by the
absence of marker
gene function. Alternatively, a marker gene can be placed in tandem with the
sequence encoding
the foreign protein under the control of a single promoter. Expression of the
marker gene in
response to induction or selection usually indicates expression of the tandem
sequences as well.
Alternatively, host cells which contain the coding sequence for a protein of
interest and
express the protein of interest may be identified by a variety of procedures
known to those of skill
in the art. These procedures include, but are not limited to, DNA-DNA or DNA-
RNA
hybridization and protein bioassay or immunoassay techniques which include
membrane,
solution, or chip based technologies for the detection and/or quantification
of the nucleic acid or
protein.
The presence of the polynucleotide sequence encoding the protein of interest
can be
detected by DNA-DNA or DNA-RNA hybridization or amplification using probes,
portions or
fragments of the sequence encoding the foreign protein of interest.
A variety of protocols for detecting and measuring the expression of the
foreign protein,
using either polyclonal or monoclonal antibodies specific for the protein are
known in the art.
Examples include enzyme-linked immunosorbent assay (ELISA), radioimmunoassay
(RIA) and
fluorescence-activated cell sorting (FACS). A two-site, monoclonal-based
immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on the protein
of interest is
preferred, but a competitive binding assay may be employed. These and other
assays are
described in Hampton et al., 1990, Serological Methods, a Laboratory Manual,
APS Press, St
Paul MN and Maddox et al., 1983, J Exp Med 158:1211.
4. Purification of the Protein of Interest
Host cells transformed with a nucleotide sequence encoding a protein of
interest may be
cultured under conditions suitable for the expression and recovery of the
encoded protein from
cell culture. The protein produced by a recombinant cell may be secreted or
contained
intracellularly depending on the sequence and/or the vector used. As will be
understood by those
of skill in the art, expression vectors containing a sequence encoding a
protein of interest can be
designed with signal sequences which direct secretion of the protein of
interest through a
prokaryotic or eukaryotic cell membrane. Other recombinant constructions may
join the sequence
encoding the protein of interest to the nucleotide sequence encoding a
polypeptide domain which
will facilitate purification of soluble proteins (Kroll et al., 1993, DNA Cell
Biol, 12:441).
The protein of interest may also be expressed as a recombinant protein with
one or more
additional polypeptide domains added to facilitate protein purification. Such
purification
facilitating domains include, but are not limited to, metal chelating peptides
such as histidine-tryptophan
modules that allow purification on immobilized metals, protein A
domains that allow
purification on immobilized immunoglobulin, and the domain utilized in the
FLAGS
extension/affinity purification system (Immunex Corp, Seattle WA). The
inclusion of cleavable
linker sequences, such as those specific for Factor Xa or enterokinase (Invitrogen, San Diego CA),
between the
purification domain and the protein of interest is useful for facilitating
purification. One such
expression vector provides for expression of a fusion protein comprising the
sequence encoding
a foreign protein and nucleic acid sequence encoding 6 histidine residues
followed by thioredoxin
and an enterokinase cleavage site. The histidine residues facilitate
purification while the
enterokinase cleavage site provides a means for purifying the foreign protein
from the fusion
protein.
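The tag-and-cleave design described above can be modeled on amino acid strings. This is a rough sketch, not the layout of any specific commercial vector: the function names are hypothetical, the carrier argument is a short placeholder standing in for thioredoxin, and the only enzyme fact relied on is that enterokinase recognizes the motif DDDDK and cleaves after the lysine.

```python
HIS_TAG = "HHHHHH"  # six histidine residues, as in the text
EK_SITE = "DDDDK"   # enterokinase recognition motif; cleavage follows the K


def build_fusion(carrier: str, target: str) -> str:
    """Assemble Met-His6-carrier-EK site-target as one polypeptide string."""
    return "M" + HIS_TAG + carrier + EK_SITE + target


def cleave_with_enterokinase(fusion: str) -> tuple[str, str]:
    """Split after the first DDDDK motif; return (tag portion, released protein)."""
    cut = fusion.index(EK_SITE) + len(EK_SITE)
    return fusion[:cut], fusion[cut:]


if __name__ == "__main__":
    # "GSGS" is a placeholder carrier, and "MKTAYIAK" a made-up target peptide.
    fusion = build_fusion(carrier="GSGS", target="MKTAYIAK")
    tag_part, released = cleave_with_enterokinase(fusion)
    print(tag_part, released)
```

Placing the cleavage motif immediately upstream of the target means the released protein carries no residues from the purification domain, which is the point of including the linker.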
In addition to recombinant production, fragments of the protein of interest
may be
produced by direct peptide synthesis using solid-phase techniques (Stewart et
al., 1969, Solid-Phase
Peptide Synthesis, WH Freeman Co., San Francisco; Merrifield, 1963, J Am
Chem Soc,
85:2149). In vitro protein synthesis may be performed using manual techniques
or by automation.
Automated synthesis may be achieved, for example, using Applied Biosystems 431A
Peptide
Synthesizer (Perkin Elmer, Foster City CA) in accordance with the instructions
provided by the
manufacturer. Various fragments of a protein of interest may be chemically
synthesized
separately and combined using chemical methods to produce the full length
molecule.
I. Preparation of Antibodies
Antibodies specific for the protein products of the candidate genes of the
invention are
useful for protein purification, for the diagnosis and treatment of various
diseases (e.g.,
osteoporosis) and for drug screening and drug design methods useful for
identifying and
developing compounds to be used in the treatment of various diseases (e.g.
osteoporosis). By
antibody, we include constructions using the binding (variable) region of such
an antibody, and
other antibody modifications. Thus, an antibody useful in the invention may
comprise a whole
antibody, an antibody fragment, a polyfunctional antibody aggregate, or in
general a substance
comprising one or more specific binding sites from an antibody. The antibody
fragment may be
a fragment such as an Fv, Fab or F(ab')2 fragment or a derivative thereof, such
as a single chain
Fv fragment. The antibody or antibody fragment may be non-recombinant,
recombinant or
humanized. The antibody may be of an immunoglobulin isotype, e.g., IgG, IgM,
and so forth.
In addition, an aggregate, polymer, derivative and conjugate of an
immunoglobulin or a fragment
thereof can be used where appropriate. Neutralizing antibodies are especially
useful according
to the invention for diagnostics, therapeutics and methods of drug screening
and drug design.
Although a protein product (or fragment or oligopeptide thereof) of a
candidate gene of
the invention that is useful for the production of antibodies does not require
biological activity,
it must be antigenic. Peptides used to induce specific antibodies may have an
amino acid
sequence consisting of at least five amino acids and preferably at least 10
amino acids.
Preferably, they should be identical to a region of the natural protein and
may contain the entire
amino acid sequence of a small, naturally occurring molecule. Short stretches
of amino acids
corresponding to the protein product of a candidate gene of the invention may
be fused with
amino acids from another protein such as keyhole limpet hemocyanin or GST, and
antibody will
be produced against the chimeric molecule. Procedures well known in the art
can be used for the
production of antibodies to the protein products of the candidate genes of the
invention.
For the production of antibodies, various hosts including goats, rabbits,
rats, mice, etc.,
may be immunized by injection with the protein products (or any portion,
fragment, or
oligopeptide thereof which retains immunogenic properties) of the candidate
genes of the
invention. Depending on the host species, various adjuvants may be used to
increase the
immunological response. Such adjuvants include but are not limited to
Freund's, mineral gels
such as aluminum hydroxide, and surface active substances such as
lysolecithin, pluronic polyols,
polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, and
dinitrophenol. BCG (bacilli
Calmette-Guerin) and Corynebacterium parvum are potentially useful human
adjuvants.
1. Polyclonal antibodies.
The antigen protein may be conjugated to a conventional carrier in order to
increase its
immunogenicity, and an antiserum to the peptide-carrier conjugate will be
raised. Coupling of
a peptide to a carrier protein and immunizations may be performed as described
(Dymecki et al.,
1992, J. Biol. Chem., 267: 4815). The serum can be titered against protein
antigen by ELISA
(below) or alternatively by dot or spot blotting (Boersma and Van Leeuwen,
1994, J Neurosci.
Methods, 51: 317). At the same time, the antiserum may be used in tissue
sections prepared
as described. A useful serum will react strongly with the appropriate peptides
by ELISA, for
example, following the procedures of Green et al., 1982, Cell, 28: 477.
2. Monoclonal antibodies.
Techniques for preparing monoclonal antibodies are well known, and monoclonal
antibodies may be prepared using a candidate antigen whose level is to be
measured or which is
to be either inactivated or affinity-purified, preferably bound to a carrier,
as described by
Arnheiter et al., 1981, Nature, 294:278.
Monoclonal antibodies are typically obtained from hybridoma tissue cultures or
from
ascites fluid obtained from animals into which the hybridoma tissue was
introduced.
Monoclonal antibody-producing hybridomas (or polyclonal sera) can be screened
for
antibody binding to the target protein.
3. Antibody Detection Methods
Particularly preferred immunological tests rely on the use of either
monoclonal or
polyclonal antibodies and include enzyme-linked immunoassays (ELISA),
immunoblotting and
immunoprecipitation (see Voller, 1978, Diagnostic Horizons, 2:1,
Microbiological Associates
Quarterly Publication, Walkersville, MD; Voller et al., 1978, J. Clin.
Pathol., 31: 507; U.S.
Reissue Pat. No. 31,006; UK Patent 2,019,408; Butler, 1981, Methods Enzymol.,
73: 482;
Maggio, E. (ed.), 1980, Enzyme Immunoassay, CRC Press, Boca Raton, FL) or
radioimmunoassays (RIA) (Weintraub, B., Principles of radioimmunoassays,
Seventh Training
Course on Radioligand Assay Techniques, The Endocrine Society, March 1986, pp.
1-5, 46-49
and 68-78). For analysing tissues for the presence or absence of a protein
produced by a candidate
gene according to the present invention, immunohistochemistry techniques may
be used. It will
be apparent to one skilled in the art that the antibody molecule may have to
be labelled to
facilitate easy detection of a target protein. Techniques for labelling
antibody molecules are well
known to those skilled in the art (see Harlow and Lane, 1989, Antibodies, Cold
Spring Harbor
Laboratory).
J. Preparation of a Labeled Protein
1. Labeling of protein
Labeling techniques are useful, according to the invention, for studying the
biochemical
properties, processing, intracellular transport, secretion and degradation of
proteins.
Biosynthetic labeling of proteins produced by candidate genes of the invention
is
preferably performed with 35S-methionine due to the high specific activity
(>800 Ci/mmol) and
ease of detection of this amino acid. Another amino acid should be used to
label a protein that
contains little or no methionine.
According to the following protocol, either suspension cells or adherent cells
are labeled
with 35S-methionine. Briefly, cells are washed and incubated for 15 min at
37°C in short-term
labeling medium (complete serum-free, methionine free RPMI or DMEM containing
5% (v/v)
dialyzed fetal bovine serum) to deplete intracellular pools of methionine.
Cells are then incubated
in the presence of 35S-methionine working solution (0.1 to 0.2 mCi/ml in
37°C short-term
labeling medium) such that 4 ml of 35S-methionine working solution is added per
2 x 10^7
suspension cells and 2 to 4 ml of 35S-methionine working solution is added per
100 mm dish of
adherent cells (0.5-2 x 10^7 cells), for a period of 30 min to 3 hours in a
humidified, 37°C, 5% CO2
incubator. Upon completion of labeling, suspension cells are washed by
centrifugation in ice-cold
PBS. Following removal of labeling medium, adherent cells are washed with PBS,
scraped and
collected by centrifugation. Labeled cells are processed and analysed by
immunoaffinity
chromatography, immunoprecipitation and one- and two-dimensional gel
electrophoresis
(Ausubel et al., supra).
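The working-solution volumes in this protocol scale linearly with cell number, which can be sketched as below. The function names are hypothetical, and the reference cell count for suspension cells is taken as 2 x 10^7 per 4 ml, which is an assumed reading of the figure given in the source text.

```python
ML_PER_BATCH = 4.0      # ml of 35S-methionine working solution per batch
CELLS_PER_BATCH = 2e7   # suspension cells per batch (assumed reading of the source)


def suspension_label_volume_ml(cell_count: float) -> float:
    """Working-solution volume (ml) for a given number of suspension cells."""
    return ML_PER_BATCH * cell_count / CELLS_PER_BATCH


def activity_range_mci(volume_ml: float, low: float = 0.1, high: float = 0.2):
    """Total label (mCi) at the 0.1 to 0.2 mCi/ml working concentration."""
    return volume_ml * low, volume_ml * high


if __name__ == "__main__":
    vol = suspension_label_volume_ml(5e7)
    print(vol, activity_range_mci(vol))
```

Computing the total activity alongside the volume makes it easier to plan how much label a scaled-up experiment will consume.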
If the protein of interest is synthesized at a relatively low rate or is in a
steady state, it may
be necessary to label cells for an extended period of time. When performing
long-term
biosynthetic labeling of cells, it is necessary to include unlabeled
methionine in the medium to
maintain cell viability and to ensure that incorporation of label is
maintained during the course
of the experiment. According to this method, cells can be labeled in the
presence of 35S-
methionine in long term labeling medium (90% methionine free RPMI or DMEM) for
up to 16
hours (Ausubel et al., supra).
2. In vitro Translation
The protein product of the cloned candidate gene of the invention can be
produced by the
methods of in vitro transcription and in vitro translation. In vitro
transcription is performed
essentially as described in Section B entitled "Production of a Polynucleotide
Sequence" in the
absence of a labeled ribonucleoside. The RNA produced by the in vitro
transcription reaction will
be extracted with phenol, ethanol precipitated twice and resuspended in 10 µl
of TE buffer. In
vitro translation is performed by adding 1 to 10 µl of RNA to an in vitro
translation kit (e.g.
wheat germ or reticulocyte lysate) in the presence of 15 µCi [35S]methionine,
following the
directions provided by the manufacturer. A typical reaction is carried out in
a 30 µl volume at
room temperature for 30 to 60 minutes (Ausubel et al., supra).
K. Production of Cells Expressing a Nucleotide Sequence Comprising a
Polymorphism
Mammalian cells expressing a nucleotide sequence comprising a polymorphism are
useful, according to the invention for determining the biochemical and
functional properties of
the protein product of a nucleotide sequence comprising a polymorphism, for
analyzing
expression of a candidate gene, for large scale production of a protein of
interest, for drug
screening and for the production of transgenic animals or knockout mice.
Methods of efficiently introducing foreign DNA into mammalian cells are known
in the
art and include calcium phosphate transfection, DEAE-dextran transfection,
electroporation and
liposome-mediated transfection (Ausubel et al., supra).
Transfection Protocols
1. Calcium-Phosphate Transfection
The method of calcium phosphate transfection involves preparing a precipitate
by slowly
mixing a HEPES-buffered saline solution with a mixture of calcium chloride and
DNA.
According to this method, up to 10% of the cells on a dish will incorporate
DNA.
Cells to be transfected are split one day prior to transfection so that on the
day of
transfection cells are well-separated on the plate. A 10 cm dish of cells is
fed with 9.0 ml of
complete medium approximately 2 to 4 hours before the addition of the
precipitate. DNA to be
transfected (10-50 µg per 10-cm plate) is ethanol precipitated, resuspended in
450 µl sterile water
and mixed with 50 µl of 2.5 M CaCl2. The DNA/CaCl2 solution is added dropwise to
a 15-ml
conical tube containing 500 µl 2X HeBS (0.283 M NaCl, 0.023 M HEPES acid, 1.5 mM
Na2HPO4, pH 7.05). It is preferable to bubble the HeBS solution during the
addition of the DNA
mixture. After the precipitate has formed for 20 minutes at room temperature,
it is added evenly
to the cells. The cells are incubated with the precipitate at 37°C in a
humidified CO2 incubator
for 4-16 hours. Following removal of the precipitate, the cells are washed
with PBS and fed in
complete medium. Glycerol or dimethyl sulfoxide shock can be used to increase
the DNA uptake
by certain types of cells (Ausubel et al., supra).
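The dilutions implied by the calcium phosphate protocol can be worked through as a short calculation. This is a sketch under the assumption that the reagent volumes are in microliters (450 of DNA in water, 50 of 2.5 M CaCl2, 500 of 2X HeBS) and that the precipitate is added to 9.0 ml of medium; the helper name `dilute` is hypothetical.

```python
def dilute(stock_conc: float, stock_vol: float, total_vol: float) -> float:
    """Concentration after diluting stock_vol of stock into total_vol."""
    return stock_conc * stock_vol / total_vol


# Volumes in microliters (assumed unit), concentrations in mol/l.
DNA_UL, CACL2_UL, HEBS_UL = 450.0, 50.0, 500.0
CACL2_STOCK_M = 2.5
MEDIUM_UL = 9000.0  # 9.0 ml of complete medium already on the dish

dna_mix_ul = DNA_UL + CACL2_UL                                   # 500 ul
cacl2_in_dna_mix = dilute(CACL2_STOCK_M, CACL2_UL, dna_mix_ul)   # 0.25 M
precipitate_ul = dna_mix_ul + HEBS_UL                            # 1000 ul
cacl2_in_precipitate = dilute(CACL2_STOCK_M, CACL2_UL, precipitate_ul)  # 0.125 M
hebs_strength = dilute(2.0, HEBS_UL, precipitate_ul)             # 1X HeBS
cacl2_on_dish = dilute(CACL2_STOCK_M, CACL2_UL, precipitate_ul + MEDIUM_UL)

if __name__ == "__main__":
    print(cacl2_in_dna_mix, cacl2_in_precipitate, hebs_strength, cacl2_on_dish)
```

Mixing equal volumes of the DNA/CaCl2 solution and 2X HeBS brings the buffer to 1X, which is why the two 500 µl volumes are matched.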
2. DEAE-Dextran Transfection
Cells to be transfected are plated at a concentration such that after 3 days
of growth they
are 30-50% confluent. The DNA to be transfected (approximately 4 µg) is ethanol
precipitated,
resuspended in 40 µl TBS and added slowly while shaking to 80 µl of warm
10 mg/ml
DEAE-dextran in TBS. After cells have been washed with PBS and fed with 4 ml of
DMEM containing
10% Nu Serum per 10 cm dish, the DEAE-dextran/DNA mixture is evenly distributed
over the entire
plate. Cells are incubated with the DNA for approximately 4 hours in a
humidified CO2
incubator. Following the removal of the DEAE-dextran/DNA mixture, cells are
shocked by the
addition of 5 ml of 10% DMSO in PBS. After a 1 minute incubation at
room temperature, cells
are washed with PBS and fed with complete medium (Ausubel et al., supra).
3. Electroporation
Alternatively, DNA can be introduced into cells by the use of high-voltage
electric
shocks, a technique termed electroporation. Briefly, according to the method
of electroporation,
cells are suspended in an appropriate electroporation buffer and placed in an
electroporation
cuvette. Following the addition of DNA, the cuvette is connected to a power
supply and the cells
are subjected to a high-voltage electrical pulse of a defined magnitude and
length, optimized for
the cell type being transfected. After a brief period of recovery, the cells
are placed in normal
culture medium.
A population of cells to be transfected by electroporation is grown to late-
log phase in
complete medium. Typically stable transfection requires 5 x 10^6 cells, and
transient transfection
requires 1-4 x 10^7 cells. Cells are harvested by centrifugation for 5 minutes
at 640 x g at 4°C. The
resulting cell pellet is resuspended in half of the original volume of ice-
cold electroporation
buffer (e.g. PBS without calcium or magnesium, Hepes buffered saline, tissue
culture medium
without serum, or phosphate buffered sucrose (272 mM sucrose/7 mM K2HPO4, pH
7.4/1 mM
MgCl2)). The choice of an electroporation buffer is dictated by the cell line.
Cells are then
harvested by centrifugation for 5 minutes at 640 x g at 4°C, and
resuspended at 1 x 10^7/ml in
electroporation buffer at 0°C for stable transfection or at a higher
concentration (up to 8 x
10^7/ml) for transient transfection. Aliquots of the cells (0.5 ml) are
transferred into the desired
number of electroporation cuvettes and placed on ice.
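The resuspension step above amounts to a volume calculation, sketched here. The function name `resuspension_plan` is hypothetical, and the default density of 1 x 10^7 cells/ml is an assumed reading of the source figure for stable transfection.

```python
def resuspension_plan(total_cells: float,
                      cells_per_ml: float = 1e7,
                      aliquot_ml: float = 0.5):
    """Return (buffer volume in ml, whole cuvettes filled, cells per cuvette)."""
    volume_ml = total_cells / cells_per_ml
    n_cuvettes = int(volume_ml // aliquot_ml)
    cells_per_cuvette = cells_per_ml * aliquot_ml
    return volume_ml, n_cuvettes, cells_per_cuvette


if __name__ == "__main__":
    # Stable transfection at the default density.
    print(resuspension_plan(4e7))
    # Transient transfection at the upper density of 8 x 10^7 cells/ml.
    print(resuspension_plan(4e7, cells_per_ml=8e7))
```

With these readings a 0.5 ml aliquot holds 5 x 10^6 cells at the stable-transfection density and up to 4 x 10^7 cells at the transient-transfection density, which agrees with the per-transfection cell numbers stated earlier in the protocol.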
DNA is added to the cell suspension in the cuvettes on ice. For stable
transfection, DNA
(optimally 1-10 µg) should be linearized with a restriction enzyme that cuts at
a site in a non-
essential region, purified by phenol extraction and ethanol precipitated.
Supercoiled DNA
(optimally 10 µg) may be used for transient transfection. The DNA/cell
suspension is mixed, and
incubated on ice for 5 minutes.
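Choosing an enzyme for linearization means finding one that cuts the plasmid exactly once, outside any region the construct needs. A minimal sketch follows, assuming a palindromic recognition site (so scanning one strand is sufficient) and essential regions given as hypothetical 0-based half-open intervals that do not span the plasmid origin; all names and the demo sequence are illustrative.

```python
def cut_positions(plasmid: str, site: str) -> list[int]:
    """Start indices of a recognition site on a circular plasmid sequence."""
    seq, site = plasmid.upper(), site.upper()
    circular = seq + seq[:len(site) - 1]  # let matches wrap around the origin
    return [i for i in range(len(seq)) if circular.startswith(site, i)]


def linearizes_safely(plasmid: str, site: str,
                      essential: list[tuple[int, int]]) -> bool:
    """True if the site occurs exactly once and overlaps no essential region."""
    hits = cut_positions(plasmid, site)
    if len(hits) != 1:
        return False
    start, end = hits[0], hits[0] + len(site)
    return all(end <= a or start >= b for a, b in essential)


if __name__ == "__main__":
    demo = "AAAAGAATTCAAAATTTTCCCCGGGG"  # made-up sequence with one GAATTC site
    print(linearizes_safely(demo, "GAATTC", essential=[(14, 26)]))
```

An enzyme with zero sites leaves the plasmid circular and one with several sites fragments it, so both cases are rejected.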
The cuvette is placed in the holder in the electroporation apparatus (at room
temperature)
and shocked one or more times at the desired voltage and capacitance settings.
An
electroporation apparatus useful according to the invention is the Bio-Rad
Gene Pulser. The
number of shocks and the voltage and capacitance settings will vary depending
on the cell type,
and should be optimized. The two parameters that are critical for successful
electroporation are
the maximum voltage for the shock and the duration of the current pulse.
Following electroporation, the cuvette containing the mixture of cells and DNA
is
incubated on ice for 10 minutes. The transfected cells are diluted 20-fold in
complete culture
medium. For stable transfection, cells are grown for 48 hours in nonselective
medium and then
transferred to antibiotic containing medium. For transient transfection, cells
are incubated 50-60
hours and then harvested for the desired transient assay.
L. Production of Animals Expressing a Nucleotide Sequence Comprising a
Polymorphism
Transgenic animals expressing a construct comprising a candidate gene
containing a
polymorphism, according to the invention can be produced by methods well known
in the art
(reviewed in Reeves et al., supra). Knock out mice wherein a candidate gene
according to the
invention has been disrupted can be produced by methods well known in the art
(reviewed in
Moreadith and Radford, 1997, J. Mol. Med., 75:208 and Shastry, 1998, Mol. Cell.
Biochem.,
181:163). These animals provide useful models for studying the functional
consequences of one
or more polymorphisms in a gene of interest.
M. Production of a Candidate Gene Library
The invention provides a method of producing a candidate gene library
comprising genes
that are potentially associated with the susceptibility to, or pathogenesis of
a disease. A candidate
gene library is useful for determining the genetic basis of a disease of
interest.
Genetic susceptibility to a disease must occur as a result of specific DNA
differences
relative to non-susceptible individuals. In the case of osteoporosis, many
genes are known which
are potentially involved in the susceptibility to, or pathogenesis of the
disease. These genes are
included in the candidate gene library and the association of these genes with
osteoporosis is
determined from population studies according to the invention. Unlike linkage
studies wherein
a region of the genome that is thought to be involved in a disease is
determined, the candidate
gene strategy, including association studies, addresses the involvement of a
particular gene in a
disease. The results of association studies of candidate genes are used to
identify genes that
should be intensively studied as potential therapeutics or therapeutic
targets.
According to the invention, the full range of polymorphic sites within each
candidate gene
is identified and examined in diseased and normal populations. The frequency
of each gene
variant (allele) in each population is then compared to the other. If a
specific polymorphism
under analysis contributes to the disease phenotype, it will be present in the
diseased population
at a higher frequency than in the normal population. In addition, if the
specific polymorphism
under analysis does not itself contribute to the disease phenotype but resides
elsewhere in, or is
near to a gene containing a contributory polymorphism, a significant
association may be seen
with the polymorphic marker being tested. This is because the two markers are
in linkage
disequilibrium with each other due to their close proximity.
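The comparison of allele frequencies between diseased and normal populations
described above can be sketched in Python as follows. This is a minimal
illustration under stated assumptions, not the method of the invention: the
function names, the allele counts and the haplotype frequencies are
hypothetical, a Pearson chi-square test on a 2 x 2 table of allele counts is
assumed as the measure of association, and r^2 is assumed as the measure of
linkage disequilibrium between two biallelic sites.

```python
import math


def allele_chi_square(case_counts, control_counts):
    """Pearson chi-square for a 2x2 table of allele counts.

    case_counts and control_counts are (variant_allele, reference_allele)
    counts; each diploid individual contributes two alleles.
    Returns (chi2, p_value) for one degree of freedom.
    """
    a, b = case_counts
    c, d = control_counts
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs a non-zero total")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For one degree of freedom the upper tail equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def ld_r_squared(p_ab, p_a, p_b):
    """r^2 between two biallelic sites.

    p_ab is the frequency of the haplotype carrying allele A at the first
    site and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


# Hypothetical data: 200 cases and 200 controls, i.e. 400 alleles per group.
chi2, p_value = allele_chi_square((120, 280), (80, 320))
print(f"chi2 = {chi2:.2f}, p = {p_value:.4f}")
print(f"r2 = {ld_r_squared(0.25, 0.30, 0.30):.3f}")
```

A marker in strong linkage disequilibrium with a contributory polymorphism
(r^2 close to 1) would be expected to show nearly the same case/control
frequency difference as the contributory polymorphism itself.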
1. Strategies for Identifying Genes Associated with a Disease
There are a number of methods known in the art for the identification of genes
involved
in a disease. These methods include familial linkage studies followed by
positional cloning,
differential gene expression studies on tissues, and population-based
candidate gene association
studies. Although positional cloning has proven to be useful for diseases
resulting from a single
mutation, this technique is not suitable for identifying genetic linkage in
diseases where multiple
genetic variants combine to create disease susceptibility. Furthermore, it has
been demonstrated
that the etiological basis of the majority of diseases comprises more than one
gene.
The goal of linkage studies is to determine the approximate position of
disease genes by
studying related individuals in families. According to linkage strategies, DNA
markers that are
randomly spaced throughout the genome, but are rarely located within genes,
are tested for the
frequency of their presence along with the particular disease phenotype. There
is approximately
a 50% chance of an unlinked gene and marker co-segregating. If a
particular marker is present
at a significantly higher frequency than expected in disease individuals, this
indicates that the
marker is located in the vicinity of the disease gene. Usually the disease
gene is delimited to a
large region (containing tens to hundreds of genes). After a disease gene has
been grossly
mapped, this entire region must be extensively characterized to determine what
genes are present
in the region. Any gene that is identified according to this method becomes a
candidate gene.
Linkage studies have been used successfully to identify the genes responsible
for certain
genetic diseases originating from mutations in a single gene (monogenic
diseases). However,
most common human diseases are of polygenic origin wherein changes in multiple
genes cause
an increased susceptibility to or pathogenesis of a particular disease.
Because the DNA changes
associated with genes which contribute to polygenic diseases are common in the
population,
thereby diluting the contribution of a given region of the genome to the
disease, it is difficult to
perform linkage studies on diseases of polygenic origin.
Linkage analysis
A series of genetic crosses is performed in an animal model system of a
particular defect
that is characteristic of a disease of interest (e.g. osteoporosis) between
individuals having an
observable mutant phenotype and normal individuals of a control strain. At
least one disease-related locus is used as a marker in these crosses. Alternatively, linkage
analysis can be performed
using chromosomal markers that do not comprise a disease related locus
(described below). If
non-random assortment of the mutant trait with a marker locus is observed, and
if that non-
random assortment is statistically significant (for example, if a Student's t
test or ANOVA is
applied to the results) the trait is linked to the marker locus.
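As an illustration of how non-random assortment may be quantified, the
following Python sketch computes a LOD score from the number of recombinant
offspring observed in a cross. The text above names Student's t test or
ANOVA; the LOD score used here is a different, commonly used linkage
statistic substituted for illustration only. The function names and the
counts are hypothetical, and fully informative meioses are assumed.

```python
import math


def lod_score(recombinants, meioses, theta):
    """log10 likelihood ratio of linkage at recombination fraction theta
    versus independent assortment (theta = 0.5)."""
    if not 0 < theta < 0.5:
        raise ValueError("theta must lie strictly between 0 and 0.5")
    non_recombinants = meioses - recombinants
    log_linked = (recombinants * math.log10(theta)
                  + non_recombinants * math.log10(1 - theta))
    log_unlinked = meioses * math.log10(0.5)
    return log_linked - log_unlinked


def best_lod(recombinants, meioses):
    """Scan theta from 0.01 to 0.49 and return (max LOD, theta at the maximum)."""
    thetas = [i / 100 for i in range(1, 50)]
    return max((lod_score(recombinants, meioses, t), t) for t in thetas)


# Hypothetical cross: 2 recombinant offspring among 20 informative meioses.
lod, theta = best_lod(2, 20)
print(f"maximum LOD = {lod:.2f} at theta = {theta:.2f}")
```

By convention a LOD score of 3 or more is usually taken as evidence of
linkage between the trait and the marker locus.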
Similarly, linkage analysis using an existing human or other mammalian
pedigree may
be performed. Pedigree analysis is a useful technique for identifying genes
for which variant
alleles may contribute to the risk, onset or progression of a disease in a
family containing
multiple individuals afflicted with a disease; according to this method,
numerous genetic loci
from affected and unaffected family members are compared. Non-random
assortment of a given
genetic marker between affected and unaffected family members relative to the
distributions
observed for other genetic loci indicates that the marker (for example, a
variant isoform of a
gene) either contributes to the disease or is in physical proximity to another
that does so.
If a non-random assortment of the disease-related phenotype with a marker
locus is
observed, using either approach, this is indicative of an association between
the gene underlying
the defect and that locus. Because the strength of any conclusion drawn from
linkage analysis is
statistically-based, the accuracy of the results is thought to be proportional
to the number of
crosses or family members and genetic loci analysed.
Positional Cloning
If linkage is confirmed it is preferable to perform a molecular analysis of
the region in
which the peak of linkage maps. The wide availability of yeast artificial
chromosome (YAC) or
bacterial artificial chromosome (BAC) libraries facilitates this analysis. A
nucleic acid sequence
specific for a region encompassing a gene which is determined to occupy a map
location of a
particular locus of interest is examined, and open reading frames are
evaluated to determine their
relationship with the observed phenotype. An initial evaluation may be
performed with the
assistance of a computer program, such as the PathCalling™ (CuraGen)
biological pathway
discovery platform. All or a subset of the open reading frames present in the
region are then
cloned (e.g., by PCR) from mutant animals or affected family members and from
their healthy
counterparts (either control animals or unaffected family members), and the
sequences of these
open reading frames are compared. If a mutation or other allelic variant is
found to be linked to
individuals displaying the disease phenotype (in a statistically-significant,
non-random manner),
it can be concluded that this mutation is associated with a disease phenotype.
A nucleic acid
fragment containing this gene can be labeled and used as a probe for in situ
hybridization analysis
of fixed chromosomes of the human or other mammal to determine precisely the
physical
location of the gene. Furthermore, a gene that has been mapped and isolated in
this manner may
be useful as a candidate target for disease diagnosis and for drug targeting
according to the
invention (see below).
2. Identification of Genes to be Included in Candidate Gene Library
A candidate gene library according to the invention will include i. genes that
are involved
in known or predicted disease pathways, ii. new genes that are identified by a
relevant pattern of
specific tissue or cell expression, iii. genes that map to genomic regions of
known linkage, and
iv. gene sequences (from sequence databases) that are homologs of the above
referenced
categories of potential candidate genes. The choice of potentially related
genes to be selected
from a database will depend on the percent identity as calculated by Fast DB
and based upon
mismatch penalty, gap penalty, gap size penalty and joining penalty.
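The notion of percent identity can be illustrated with the short Python
sketch below, which scores two sequences that have already been aligned.
This is not the Fast DB algorithm and does not model its mismatch, gap, gap
size or joining penalties; the function name, the example sequences and the
convention of dividing by the number of aligned columns are assumptions made
for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over the aligned columns of two gapped sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Drop columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a.upper(), aligned_b.upper())
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no aligned columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Hypothetical alignment of a candidate gene fragment against a database hit.
identity = percent_identity("ACGT-ACGT", "ACGTTACCT")
print(f"{identity:.1f}% identity")
```

A database sequence would then be retained as a potential homolog when its
percent identity exceeds a threshold chosen by the user.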
Based on the physiological changes associated with a disease of interest,
predictions can
be made regarding a cell or tissue-type that would be expected to express high
or low levels of
candidate genes associated with a particular disease. For osteoporosis, it is
expected that muscle,
adipose, pancreas or liver tissue or tissue comprising insulin secreting
pancreatic β-cells, would
be useful for identifying candidate genes according to the invention.
Differences in the expression of known and unknown genes in normal and
disease tissue
can be determined by methods known in the art including Serial Analysis of
Gene Expression
(SAGE) (Velculescu et al., 1995, Science, 270:484), subtractive
hybridization/screening
(described below), differential display (Liang and Pardee, 1992, Science,
257:967) and high-density
microarray expression testing.
The technique of SAGE allows for the rapid, detailed analysis of thousands of
transcripts.
SAGE depends on the following two principles. First, sufficient information is
contained within
a short nucleotide sequence (approximately 9-10 bp), isolated from a defined
location within a
transcript, to uniquely identify a transcript. Second, the concatenation of
short tags of sequence
allows transcripts to be analysed serially by sequencing multiple tags within
a single clone.
The method of SAGE is performed by synthesizing double-stranded cDNA from
mRNA,
cleaving the resulting cDNA with an anchoring restriction endonuclease that is
expected to
cleave most transcripts at least one time, and isolating the most 3' region of
the cleaved cDNA
by binding to streptavidin beads. This protocol allows for the identification
of a unique site on
a transcript that corresponds to the restriction site located closest to the
polyA tail. Replicate
samples of the most 3' region of the cDNA are ligated to one of two linker
molecules that contain
a type IIS restriction site for a tagging enzyme. The cleavage site for Type
IIS restriction
endonucleases is located at a defined distance up to 20 bp from the asymmetric
recognition site.
Linkers are designed such that upon cleavage of the ligation product with the
tagging enzyme
there is release of the linker and an attached short region of cDNA.
Following the creation of blunt ends, the two pools of released tags are
ligated to each
other and the resulting ligated product is used as a template for PCR
amplification in the presence
of primers that are specific for each linker. The PCR product is cleaved with
the anchoring
enzyme and amplification products, comprising two tags linked tail to tail,
are isolated,
concatenated by ligation, cloned and sequenced (Velculescu et al., supra).
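The tag-definition principle of SAGE can be sketched in silico as shown
below. The sketch locates the anchoring-enzyme site closest to the 3' end of
each cDNA and reads the adjacent short tag. It is an illustration only: the
anchoring site CATG (the recognition site of NlaIII), the tag length of 10 bp,
the function names and the sequences are assumptions that are not taken from
the text above.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme recognition site (NlaIII)
TAG_LENGTH = 10       # assumed tag length in bp, within the 9-10 bp range


def extract_tag(cdna, anchor=ANCHOR_SITE, tag_length=TAG_LENGTH):
    """Return the tag that follows the 3'-most anchor site, or None."""
    cdna = cdna.upper()
    position = cdna.rfind(anchor)
    if position == -1:
        return None  # transcript is not cut by the anchoring enzyme
    start = position + len(anchor)
    tag = cdna[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Tally tags over a collection of cDNA sequences (sense strand, 5' to 3')."""
    tags = (extract_tag(sequence) for sequence in transcripts)
    return Counter(tag for tag in tags if tag is not None)


# Hypothetical cDNA sequences; the first two represent the same transcript.
library = [
    "AAACATGTTTTCATGACGTACGTACGGG",
    "AAACATGTTTTCATGACGTACGTACGGG",
    "GGGGCATGTTGACCATTGCAAAA",
]
for tag, count in count_tags(library).most_common():
    print(tag, count)
```

The count obtained for each tag serves as an estimate of the relative
abundance of the corresponding transcript in the sample.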
Differential display provides a method for separating and cloning individual
mRNAs by
PCR analysis. According to the method of differential display, oligonucleotide
primers are
selected wherein one primer is anchored to the polyadenylate tail of a subset
of mRNA species
and the other primer is short and of an arbitrary sequence such that it
anneals at different
positions relative to the first primer. The mRNA subpopulations that are
identified with these
primer pairs are subjected to reverse transcription, amplified and analysed
on a DNA sequencing
gel. By using multiple sets of primers, a reproducible pattern of amplified
cDNA fragments that
demonstrate a requirement for the sequence specificity of either primer can be
obtained (Liang
and Pardee, supra).
According to the method of high-density microarray expression testing, DNA
sequences
to be tested for expression are spotted onto a surface, usually at high-
density to allow for the
testing of many genes. The surface containing the DNA sequences is typically
referred to as a 'chip'.
The spotted DNA can be either cDNA clones or oligonucleotides. RNA is prepared
from the two
cells or tissues to be compared. The RNA from one cell/tissue will be labeled
red and the RNA
from the other cell/tissue will be labeled yellow. Both RNA preparations are
hybridized to the
DNA array. The ratio of red to yellow is indicative of the relative levels of
expression between
the two cells/tissues.
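The ratio calculation described above can be sketched in Python as follows.
The sketch converts the two label intensities measured at each spot into a
log2 ratio and lists the spots whose expression differs by at least a chosen
fold change. The function names, gene names, intensities, the signal floor
and the two-fold threshold are hypothetical, and background subtraction and
normalization between the two labels are omitted.

```python
import math


def log2_ratios(channel_one, channel_two, floor=1.0):
    """log2(channel_one / channel_two) per spot, after flooring low signals."""
    ratios = {}
    for spot, signal_one in channel_one.items():
        signal_two = channel_two[spot]
        ratios[spot] = math.log2(max(signal_one, floor) / max(signal_two, floor))
    return ratios


def differentially_expressed(ratios, fold=2.0):
    """Spots whose absolute log2 ratio reaches the given fold change."""
    threshold = math.log2(fold)
    return sorted(spot for spot, value in ratios.items()
                  if abs(value) >= threshold)


# Hypothetical intensities for the two labeled RNA preparations.
first_label = {"geneA": 800.0, "geneB": 100.0, "geneC": 210.0}
second_label = {"geneA": 200.0, "geneB": 400.0, "geneC": 200.0}
ratios = log2_ratios(first_label, second_label)
print(differentially_expressed(ratios))
```

A positive log2 ratio indicates higher expression in the first cell or
tissue, and a negative value indicates higher expression in the second.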
3. Mapping a candidate gene
Molecular and cytogenetic methods of mapping candidate genes are known in the
art and
are summarized below. Linkage analysis provides a method for identifying genes
mapping to
genomic regions of known linkage.
Linkage analysis
As described above, linkage analysis may be performed between an unmapped
candidate
gene and one or more of the disease-related loci or by analyzing the genetic
linkage between the
candidate gene and chromosomal markers which are not themselves linked to a
disease-related
locus, according to the same method. For the latter type of analysis it is
preferable that the
spacing of markers throughout the genome of the test organism is approximately
one every cM
or less. This spacing will ensure complete coverage of the genome and will
facilitate accurate
mapping.
Other methods for mapping a candidate gene are provided below.
Syntenic similarity
As a result of classical genetic studies and, more recently, multi-laboratory
genomic
sequencing collaborations such as the Human Genome Project and Mouse Genome
Project, the
human and mouse genomes have been extensively characterized. It is now known
that there is
a significant degree of co-linearity among humans, mice and rats wherein there
is conservation
relative to one another among these several species in the chromosomal map
positions of
numerous genes and groups of genes. Examination of the human and/or mouse
chromosomal
maps in the regions comparable to those to which a particular locus of interest
maps in the rat will
yield candidate genes which may be responsible for the physiological changes
associated with
a disease of interest. The methods of radiation hybrid mapping or fluorescence
in situ
hybridization at low stringency to rat chromosomes using labeled fragments
derived from the
human or mouse genes can be used to confirm that genes present in these
regions of the human
and/or mouse are present in the regions of interest in the rat.
Radiation hybrid (RH) mapping is a somatic cell hybrid technique that was
developed to
create high resolution, contiguous maps of mammalian chromosomes. The method
is useful for
ordering DNA markers spanning millions of base pairs of DNA at a resolution
not easily
obtained by other mapping methods (Cox et al., 1990, Science, 250: 245;
Burmeister et al., 1991,
Genomics, 9:19; Warrington et al., 1992, Genomics, 13: 803; Abel et al., 1993,
Genomics,
17:632). Radiation hybrid mapping facilitates the mapping of non-polymorphic
DNA markers
that cannot be used for meiotic mapping.
According to the method of radiation hybrid mapping a lethal dose of X-
irradiation is
used to fragment the chromosomes of the donor cell line. Chromosome fragments
from the donor
cell line are then retained, in a non-selective manner, following cell fusion
with a recipient cell
line. The resulting hybrid clones are then analysed for the presence or
absence of specific donor
chromosome markers. It is expected that markers that are further apart on a
chromosome are
more likely to be broken apart by radiation and to segregate independently in
the RH cells than
markers that are closer together. By performing a statistical analysis of the
co-segregation of
various loci in hybrid clones, it is possible to construct a map that provides
information regarding
the relative order and distance of markers (Cox et al., 1990, supra;
Warrington et al., 1991,
Genomics, 11: 701; Ceccherini et al., 1992, Proc. Natl. Acad. Sci. USA, 89:
104).
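The statistical analysis of co-segregation can be illustrated with the
following Python sketch, which estimates the breakage frequency between two
markers from their retention patterns across a panel of hybrid clones. The
two-point estimator shown is one commonly cited form and is assumed here for
illustration; the function name and the retention data are hypothetical, and
practical map construction uses multipoint methods over many markers.

```python
import math


def rh_two_point(retention_a, retention_b):
    """Two-point breakage frequency and distance for markers A and B.

    retention_a and retention_b are equal-length sequences of booleans,
    one entry per hybrid clone, True when the marker is retained.
    Returns (theta, distance in centiRays).
    """
    total = len(retention_a)
    if total == 0 or total != len(retention_b):
        raise ValueError("need the same non-empty set of clones for both markers")
    # Clones that retain exactly one of the two markers.
    discordant = sum(1 for a, b in zip(retention_a, retention_b) if a != b)
    r_a = sum(retention_a) / total
    r_b = sum(retention_b) / total
    # Expected discordant fraction if a break always separates the markers.
    expected_if_broken = r_a + r_b - 2 * r_a * r_b
    if expected_if_broken == 0:
        raise ValueError("markers retained in all or no clones are uninformative")
    theta = min(discordant / (total * expected_if_broken), 0.999)
    distance_centirays = -100 * math.log(1 - theta)
    return theta, distance_centirays


# Hypothetical retention patterns for two markers in ten hybrid clones.
marker_a = [True, True, True, True, True, False, False, False, False, False]
marker_b = [True, True, True, True, False, True, False, False, False, False]
theta, distance = rh_two_point(marker_a, marker_b)
print(f"theta = {theta:.2f}, distance = {distance:.1f} cR")
```

Markers that lie close together are rarely separated by a radiation-induced
break, so they yield a small breakage frequency and a short map distance.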
Subtractive screening
In view of the observation that only a subset of an organism's genes are
expressed in a
given tissue, there is a high probability that transcripts which differ in
expression between cells
of the same tissue in a mutant and control animal are responsible for the
observed mutant
phenotype.
According to the method of subtractive cloning, mRNA is isolated from a tissue
of
choice, wherein the tissue is obtained from two distinct organisms and wherein
one organism
displays a mutant phenotype with regard to a particular trait while the
other is normal in that
respect. Methods well known in the art are used to prepare cDNA from the mRNA
derived from
the organism. The mRNA template is then degraded, either by hydrolysis under
alkaline
conditions or by RNase H-mediated cleavage, and the cDNA is returned to a
buffer in which
mRNA is stable, and mixed with a molar excess of mRNA prepared from the second
organism
under conditions of stringent hybridization. The mixture is then passed over a
hydroxyapatite
column, which binds double-stranded nucleic acids but allows single stranded
nucleic acid
molecules to pass through. Reverse transcripts derived from the first sample
which do not
hybridize to mRNA molecules derived from the second organism (in other words,
reverse
transcripts specific to the first tissue sample) are present in the flow-
through fraction and are
cloned into a vector to create a subtraction library. The reciprocal
experiment (in which the
cDNA is derived from the second mRNA preparation) is also carried out to
create a complete set
of transcripts specific to the tissue samples derived from the two organisms.
This procedure will provide transcripts that can be labeled and used as probes
in situ
hybridization analysis of immobilized chromosomes. The method of subtractive
screening
therefore yields both cloned genes and reagents useful for determining
if the cloned genes
co-localize with a locus of interest. If a particular gene is found to
co-localize to a locus of interest,
the gene may be analysed functionally (e.g., in a phenotypic rescue
experiment, as described
below or by the phenotypic assays described in Section F entitled
"Identification and
Characterization of Polymorphisms"). Ultimately, these genes may be used as
targets for drugs
targets for drugs
or disease diagnostic methods, or even as therapeutic nucleic acids.
Mutagenic transposon mapping
The selection of insertional events that lie within genes (e.g., within coding
or regulatory
sequences) is facilitated by the use of entrapment vectors, first described in
bacteria (Casadaban
and Cohen, 1979, Proc. Natl. Acad. Sci. U.S.A., 76: 4530; Casadaban et al.,
1980, J Bacteriol,
143: 971). By employing animal models, entrapment vectors can be introduced
into pluripotent
ES cells in culture (for example, using electroporation or a retrovirus) and
then passed into the
germline via chimeras (Gossler et al., 1989, Science, 244: 463; Skarnes, 1990,
Biotechnology,
8:827). Alternatively, transgenic animals containing entrapment vectors may be
generated by
standard oocyte injection protocols.
These methods result in DNA integrations that are highly mutagenic because
they
interrupt the endogenous coding sequence. It is estimated that the frequency
of obtaining a
mutation in some gene in the genome using a promoter or gene trap is
about 45%. For a
detailed description of retroviral insertion mutagenesis see Methods Enzymol.,
vol. 225, 1990.
Genes which are expressed in a tissue of interest and for which a biochemical
assay of a
particular activity has been developed in animal models are most useful
according to this
method. Promoter or gene trap vectors often contain a reporter gene, e.g.,
lacZ, Cat or green
fluorescent protein (Gfp) that lacks its own upstream promoter and/or splice
acceptor sequence.
That is, promoter gene traps contain a reporter gene with a splice site but no
promoter. If the
vector integrates within a gene and is spliced into the gene product, then the
reporter gene will
be expressed. Enhancer traps contain a reporter gene and have a minimal
promoter which
requires the activity of an enhancer in order to function. If the vector
integrates near an enhancer
(whether in a gene or not), then the reporter gene will be expressed.
Activation of the reporter
gene can only occur when the vector is integrated within an active host gene
and generates a
fusion transcript with the host gene. The activity of a reporter gene provides
an easy assay for
determining if a vector has been integrated into an expressed gene. Methods
for detecting reporter
gene activity in transfected cells or tissues of a transgenic animal are well
known in the art.
The mutagenic vector may be mapped using standard cytogenetic techniques, such
as in
situ hybridization, wherein a labeled fragment comprising vector-specific
sequence is used as a
probe. Co-localization of the probe with a particular locus of interest
indicates that the associated
gene is a suitable candidate and should be subjected to further analysis. A
gene that has been
identified in this manner can be cloned as described.
N. Diagnostic Indicators, Screens and Disease Symptoms
In another embodiment of the invention, there is provided a method of
diagnosing or
determining susceptibility of a subject to low BMD and/or bone damage. This
method involves
analyzing the genetic material of a subject to determine which allele(s) of the
gene is/are present.
The method may include determining whether one or more particular alleles are
present, or which
combination of alleles (i.e. a haplotype) is present. The method may also
include determining
whether subjects are homozygous or heterozygous for a particular allele or
haplotype.
In a preferred embodiment, the method comprises determining which allele of
one or
more of the polymorphisms of the invention is/are present. In particular, the
method may include
determining the presence of the polymorphism of the gene which in combination
with
polymorphisms defined herein or other polymorphisms may define a risk
haplotype. The
polynucleotide sequences for these particular alleles may be used for
diagnostic purposes. The
polynucleotides which may be used include oligonucleotides, complementary RNA
and DNA
molecules and PNAs. The polynucleotides may be used to determine whether
subjects are
homozygous or heterozygous for a particular allele or haplotype making them
susceptible to low
BMD and/or bone damage, and hence, osteoporosis.
In one aspect, hybridization with PCR probes capable of detecting a particular
polymorphism may be used to identify nucleic acid sequences of particular
alleles or haplotypes. These probes must be specific to these particular
alleles and the stringency
of the hybridization or amplification must be such that the probe identifies
only this particular
allele.
Means for producing specific hybridization probes for these polynucleotides of
particular
alleles include the cloning of these polynucleotide sequences into vectors for
the production of
mRNA probes, which is well known to one skilled in the art. Such vectors are known in
the art, are
commercially available, and may be used to synthesize RNA probes in vitro by
means of the
addition of the appropriate RNA polymerases and the appropriate labeled
nucleotides.
Hybridization probes may be labeled by a variety of reporter groups, for
example, by
radionuclides such as 32P or 35S, or by enzymatic labels, such as alkaline
phosphatase coupled to
the probe via avidin/biotin coupling systems, and the like.
Polynucleotides of particular alleles or haplotype may be used in Southern or
northern
analysis, dot blot, or other membrane-based technologies; in PCR technologies;
in dipstick, pin,
and multiformat ELISA-like assays; and in microarrays utilizing fluids or
tissues from patients
to detect susceptibility to low BMD and/or bone damage. Such qualitative
methods are well
known in the art.
In a particular embodiment, polynucleotides of particular alleles or haplotype
may be used
in assays that detect susceptibility to low BMD and/or bone damage,
particularly those mentioned
above. Polynucleotides complementary to sequences of a particular allele or
haplotype may be
labeled by standard methods and added to a fluid or tissue sample from a
patient under conditions
suitable for the formation of hybridization complexes. After a suitable
incubation period, the
sample is washed and examined for a signal. If a signal is found,
then the presence of
the polynucleotide of a particular allele, alleles or haplotype in the sample
indicates the
susceptibility to low BMD and/or bone damage, and hence, osteoporosis. Such
assays may also
be used to determine the particular therapeutic treatment regimen for an
individual patient.
With respect to osteoporosis, the presence of a particular polymorphism or
polymorphisms in a tissue sample from an individual may indicate a
predisposition for low BMD
and/or bone damage, or may provide a means for detecting osteoporosis prior to
the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may
allow health
professionals to employ preventative measures or aggressive treatment earlier,
thereby preventing
the development or further progression of osteoporosis.
Additional diagnostic uses for oligonucleotides designed from the
polynucleotide
sequences of a particular allele or haplotype may involve the use of PCR.
These oligomers may
be chemically synthesized, generated enzymatically, or produced in vitro.
Oligomers will contain
a fragment of a polynucleotide of a particular allele, alleles or haplotype or a
fragment of a
polynucleotide complementary to the polynucleotide of a particular allele,
alleles or haplotype, and
will be employed under optimized conditions for identification of a specific
polymorphism,
polymorphisms or haplotype. Oligomers may also be employed under very
stringent conditions
for detection of these particular DNA or RNA sequences. Examples of particular
primer
sequences and annealing temperatures for specific polymorphism markers are
found in Table 10
of U.S. Provisional Patent Application Serial Number 60/423559, entitled
"Nucleotide
Polymorphisms Associated with Osteoporosis" filed November 4, 2002, which is
hereby
incorporated herein by reference in its entirety, and in Tables 10 and 13 of
this application.
In further embodiments, oligonucleotides or longer fragments derived from any
of the
polynucleotides described herein may be used as elements on a microarray. The
microarray can
be used in transcript imaging techniques to detect a particular polymorphism,
polymorphisms or
haplotype simultaneously as described below. In particular, this information
may be used to
develop a pharmacogenomic profile of a patient in order to select the most
appropriate and
effective treatment regimen for that patient. For example, therapeutic agents
which are highly
effective and display the fewest side effects may be selected for a patient
based on his/her
pharmacogenomic profile.
In another embodiment, a method involves the use of antibodies in diagnosing
or
determining the susceptibility to low BMD and/or bone damage. The antibodies
would
specifically bind to an epitope of a particular allele or form of the protein
and may be used to
determine susceptibility to low BMD and/or bone damage, and hence,
osteoporosis. Antibodies
useful for diagnostic purposes may be prepared in the same manner as described
above for
therapeutics. Diagnostic assays for determining susceptibility to low BMD
and/or bone damage
include methods which utilize the antibody and a label to detect a particular
allele or form of the
protein in human body fluids or in extracts of cells or tissues. The
antibodies may be used with
or without modification, and may be labeled by covalent or non-covalent
attachment of a reporter
molecule. A wide variety of reporter molecules, several of which are described
above, are known
in the art and may be used.
A variety of protocols for measuring a particular allele or form of the
protein, including
ELISAs, RIAs, and FACS, are known in the art and provide a basis for
diagnosing susceptibility
to low BMD and/or bone damage.
In another embodiment, fragments of ABBR, or antibodies specific for ABBR
may
be used as elements on a microarray.
Microarrays may be prepared, used, and analysed using methods known in the art
(Brennan, T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al.
(1996) Proc. Natl.
Acad. Sci. USA 93:10614-10619; Baldeschweiler et al. (1995) PCT application
WO95/251116;
Shalon, D. et al. (1995) PCT application WO95/35505; Heller, R.A. et al. (1997)
Proc. Natl.
Acad. Sci. USA 94:2150-2155; Heller, M.J. et al. (1997) U.S. Patent No.
5,605,662). Various
types of microarrays are well known and thoroughly described in Schena, M.,
ed. (1999; DNA
Microarrays: A Practical Approach, Oxford University Press, London).
O. Preparation of a Human Sample
The presence of an allelic form of a gene containing a sequence variation,
according to
the invention, can be detected by testing any tissue of a human subject. Human
samples that are
useful according to the invention include tissue or fluid samples containing a
polynucleotide or
polypeptide of interest, including but not limited to plasma, serum, spinal
fluid, lymph fluid,
urine, stool, external secretions of the skin, respiratory, intestinal and
genitourinary tracts, saliva,
blood cells, tumors, organs, tissue and samples of in vitro cell culture
constituents. Genomic
DNA, cDNA or RNA can be prepared from the human sample according to the
methods
described above.
P. Methods of Use
1. Nucleic Acid Diagnosis and Diagnostic Kits
In order to detect the presence of an allele of a gene predisposing an
individual to
osteoporosis, a biological sample such as blood is prepared and analysed for
the presence or
absence of susceptibility alleles of a gene containing a polymorphism,
according to the invention.
Results of these tests and interpretive information will be returned to the
health care provider for
communication to the tested individual. Such diagnoses may be performed by
diagnostic
laboratories, or, alternatively, diagnostic kits are manufactured and sold to
health care providers
or to private individuals for self-diagnosis.
Initially, the screening method will involve amplification of the relevant
gene sequences.
In another preferred embodiment of the invention, the screening method
involves a non-PCR
based strategy. Such non-PCR based screening methods include Southern blot
analysis to detect
the presence of a variant form of a gene in a sample comprising total genomic
DNA from the
individual being tested. Alternatively, northern blot analysis can be used to
detect an aberrant
mRNA encoded by a gene, that exhibits altered stability or is the result of
alternative splicing in
a sample comprising RNA from an individual being tested. The methods of S1
nuclease analysis,
RNase protection and primer extension can also be used to determine both the
endpoint and the
amount of a gene specific mRNA (Ausubel et al., supra). Both PCR and non-PCR
based
screening strategies can detect target sequences with a high level of
sensitivity.
The preferred method, according to the invention, is target amplification.
According to
this method, the target nucleic acid sequence is amplified with polymerases.
One particularly
preferred method using polymerase-driven amplification is PCR (described
above). The
polymerase chain reaction and other polymerase-driven amplification assays can
achieve over
a million-fold increase in copy number through the use of polymerase-driven
amplification
cycles. PCR primers useful for target amplification according to the
invention, will be designed
to amplify a region of DNA containing one or more polymorphisms. Allele
specific primers
(comprising one or more polymorphisms) are also useful for detecting gene
sequence variations
by PCR methodologies according to the invention. The absence of a particular
polymorphism
will be indicated by the absence of an amplified product when the
amplification step is carried
out in the presence of allele specific primers. Once amplified, the resulting
nucleic acid can be
sequenced and the specific sequence of the test DNA will be compared with the
wild type
sequence by using the computer programs described in Section F entitled
"Identification and
Characterization of Polymorphisms". Alternatively, the amplified product will
be analysed by
Southern blot assay with nucleic acid probes. Nucleic acid probes, useful
according to the
invention, will be specifically hybridizable to a mutant form of a gene but
not to the wild type
gene due to the presence of one or more polymorphisms.
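The comparison of an amplified, sequenced test DNA with the wild type sequence can be illustrated by a minimal sketch. It assumes the two sequences are already aligned and of equal length, so only substitutions are reported; the function name and the example sequences are hypothetical and are not the computer programs referred to in Section F.

```python
def find_substitutions(reference: str, test: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, test base) for each mismatch.

    Assumption: both sequences are pre-aligned and equal in length, so
    insertions and deletions are outside the scope of this sketch.
    """
    if len(reference) != len(test):
        raise ValueError("sequences must be the same length in this sketch")
    ref, tst = reference.upper(), test.upper()
    return [(i + 1, r, t) for i, (r, t) in enumerate(zip(ref, tst)) if r != t]


# Hypothetical sequences, for illustration only.
wild_type = "ATGGCCTTAGACCGT"
amplified = "ATGGCCTTGGACCGT"
print(find_substitutions(wild_type, amplified))  # [(9, 'A', 'G')]
```

A real comparison would first align the sequences so that insertions and deletions are also detected.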
When a probe comprising the target sequence, according to the invention, is
used to
detect the presence of the target sequences via non-PCR-based strategies (for
example, in
screening for osteoporosis susceptibility), the biological sample to be
analysed, such as blood or
serum, may be treated, if desired, to extract the nucleic acids (as described
above). The sample
nucleic acids (isolated from a biological sample or amplified by PCR) may be
prepared in various
ways to facilitate detection of the target sequence, e.g. by denaturation,
restriction digestion,
electrophoresis or dot blotting. Preferably, the targeted region of the
nucleic acids being analysed
is at least partially single-stranded to form hybrids with the targeting
sequence of the probe. If
the sequence is naturally single-stranded, denaturation will not be required.
However, if the
sequence is double-stranded, the sequence will probably need to be
denatured. Denaturation can
be carried out by various techniques known in the art.
To detect the presence of a sequence variation in a gene, according to the
invention,
analyte nucleic acid and probe will be incubated under conditions which
promote stable hybrid
formation of the target sequence in the probe with the putative targeted
sequence in the sample
DNA. If the region of the probe which is used to bind to the analyte is
designed to be completely
complementary to the targeted region, high stringency conditions are desirable
in order to prevent
false positives. However, conditions of high stringency will be used only if
the probes are
complementary to regions of the chromosome which are unique in the genome. The
stringency
of hybridization is determined by a number of factors (described above).
Detection, if any, of the
resulting hybrid is usually accomplished by the use of labeled probes.
Alternatively, the probe
may be unlabeled, but may be detectable by specific binding with a ligand
which is labeled, either
directly or indirectly. Suitable labels, and methods for labeling probes and
ligand are known in
the art, and are described in Section C entitled "Production of a Nucleic Acid
Probe".
Accordingly, the foregoing screening method may be modified to identify
individuals
having a gene containing a neutral polymorphism not associated with
osteoporosis, by preferably
amplifying DNA fragments of a gene derived from a particular individual. The
amplified DNA
fragments are sequenced and the sequence is compared to the consensus gene
sequence
containing neutral polymorphisms. At this time, differences between the
individual's coding
sequence for a gene and a consensus sequence for the same gene are determined
wherein the
presence of any neutral polymorphisms and the absence of polymorphisms not
previously
identified as neutral polymorphisms can be correlated with an absence of
increased genetic
susceptibility to osteoporosis resulting from a mutation in a gene coding
sequence.
In another embodiment of the invention, detection of a polymorphism will be
performed
by detecting loss of a restriction enzyme recognition site due to the presence
of one or more
polymorphisms. According to this embodiment, a polymorphism will be detected
with a
polynucleotide probe that is capable of detecting a restriction enzyme
fragment containing the
polymorphism, wherein the fragment is of a size that can be easily separated
on an agarose gel
and visualized by Southern blot analysis. A polynucleotide probe according to
this embodiment
of the invention can be specific for a sequence within the candidate gene or
outside of the
candidate gene.
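The loss of a restriction enzyme recognition site changes the fragment pattern seen after digestion, which is what the probe detects on the gel. The following is a minimal sketch of that idea for a linear molecule and a complete digest. The function name, the example sequences and the choice of EcoRI (recognition site GAATTC, cut after the first G) are illustrative assumptions, not part of the disclosed method.

```python
def fragment_lengths(sequence: str, site: str, cut_offset: int) -> list[int]:
    """Fragment lengths from a complete digest of a linear molecule.

    cut_offset is the number of bases into the recognition site at which
    the top strand is cut (1 for EcoRI, G^AATTC).
    """
    seq, site = sequence.upper(), site.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical alleles: the variant carries an A-to-G change inside the site.
reference_allele = "AAAAGAATTCTTTTTT"
variant_allele = "AAAAGAGTTCTTTTTT"
print(fragment_lengths(reference_allele, "GAATTC", 1))  # [5, 11]
print(fragment_lengths(variant_allele, "GAATTC", 1))    # [16]
```

The variant allele yields a single uncut fragment, so a probe hybridizing to this region would detect one larger band in place of the two smaller ones.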
It is also contemplated within the scope of this invention that the nucleic
acid probe
assays of this invention will employ a mixture of nucleic acid probes capable
of detecting a gene.
Thus, in one example to detect the presence of a gene in a test sample, more
than one probe
complementary to a gene is employed and in particular the number of different
probes is
alternatively 2, 3, or 5 different nucleic acid probe sequences. In another
example, to detect the
presence of mutations in the gene sequence in a patient, more than one probe
complementary to
a gene is employed wherein the probe mixture includes probes capable of
binding to the allele-
specific mutations identified in populations of patients with alterations in a
gene. In this
embodiment, any number of probes can be used, and will preferably include
probes
corresponding to the major gene mutations identified as predisposing an
individual to
osteoporosis.
Northern blot analysis, S1 nuclease analysis, RNase protection and primer
extension
(Ausubel et al., supra) are also methods according to the invention for
detecting changes in
mRNA resulting from the presence of one or more polymorphisms in the sequence
of a gene.
Additionally, any of the methods of genotyping described in Section F entitled
"Identification
and Characterization of Polymorphisms" can be used for diagnostics according
to the invention.
2. Peptide Diagnosis and Diagnostic Kits
Osteoporosis can also be detected on the basis of an alteration of the wild-type
polypeptide. Such alterations can be determined by sequence analysis in
accordance with
conventional techniques. More preferably, antibodies (polyclonal or
monoclonal) are used to
detect differences in, or the absence of peptides derived from a gene of
interest. The antibodies
may be prepared as described above in Section I entitled "Preparation of
Antibodies". Preferably,
antibodies will immunoprecipitate the protein product of a gene from solution
as well as react
with the protein product of a gene on Western or immunoblots of polyacrylamide
gels.
Antibodies useful according to the invention will also detect the protein
product of a gene in
paraffin or frozen tissue sections, using immunocytochemical techniques.
Preferred embodiments relating to methods for detecting wild type or mutant
forms of the
protein product of a gene include enzyme linked immunosorbent assays (ELISA),
radioimmunoassay (RIA), immunoradiometric assays (IRMA) and immunoenzymatic
assays
(IEMA), including sandwich assays using monoclonal and/or polyclonal
antibodies. Exemplary
sandwich assays are described by David et al. in U.S. Pat. Nos. 4,376,110 and
4,486,530, hereby
incorporated by reference.
3. Drug Screening
This invention is particularly useful for screening therapeutic compounds by
using the
mutant gene or protein product or binding fragment of the gene in any of a
variety of drug
screening techniques.
The protein product or fragment of a gene employed in such a test may either
be free in
solution, affixed to a solid support, expressed on the surface of a cell, or
located intracellularly.
One method of drug screening utilizes eukaryotic or procaryotic host cells
which are stably
transformed with a recombinant polynucleotide expressing the polypeptide or
fragment,
preferably in competitive binding assays. Such cells, either in viable or
fixed form, can be used
for standard binding assays. In particular, these cells can be used to measure
formation of a
complex comprising the protein product or fragment of a gene and the agent
being tested.
Alternatively, these cells can be used to determine if the formation of a
complex between the
protein product or fragment of a gene and a known ligand is interfered with by
an agent being
tested.
Thus, the present invention discloses methods useful for drug screening
wherein such
methods comprise contacting a candidate drug with a polypeptide or fragment
derived from a
gene and assaying (i) for the presence of a complex between the drug and the
polypeptide
or fragment derived from a gene, or (ii) for the presence of a complex between
the polypeptide
or fragment derived from a gene and a ligand, by methods well known in the
art. Preferably, the
polypeptide or fragment derived from a gene is labeled for use in competitive
binding assays.
Methods for producing a labeled protein by in vitro translation are described
in Section J entitled
"Preparation of a Labeled Protein". Free polypeptide or fragment will be
separated from that
present in a protein:protein complex, and the amount of free (i.e.,
uncomplexed) label will be
used as a measure of the binding of the test drug to the polypeptide or the
ability of the test drug
to interfere with protein:ligand binding.
Another method of drug screening allows for high throughput screening for
compounds
exhibiting suitable binding affinity to the polypeptides and is described in
detail in Geysen, WO
84/03564. According to this method, large numbers of different small peptide
test compounds
are synthesized on a solid substrate, such as plastic pins or another suitable
surface. The peptide
test compounds are reacted with the polypeptides or peptide fragments derived
from a gene, and
washed. Bound polypeptide is then detected by methods well known in the art.
Purified protein can be coated directly onto plates for use in the
aforementioned drug
screening techniques. Alternatively, non-neutralizing antibodies to the
polypeptide can be used
to capture the polypeptide or peptide fragment of interest and immobilize it
on the solid support.
Competitive drug screening assays in which neutralizing antibodies capable of
specifically binding the polypeptide of interest compete with a test compound
for binding to the
polypeptide or fragments thereof of interest are also useful according to the
invention. According
to this method, antibodies can be used to detect the presence of any test
peptide which shares one
or more antigenic determinants with the polypeptide of interest.
An additional technique for drug screening involves the use of host eukaryotic
cell lines
or cells (such as described above) which have a gene that produces a defective
protein. According
to this method, the host cell lines or cells are grown in the presence of a
test drug compound. The
rate of growth of the host cells is measured to determine if the compound is
capable of regulating
the growth of cells expressing a nonfunctional protein product of a gene.
Alternatively, the ability
of the test compound to restore the function of the mutant gene protein can be
measured by using
an appropriate in vitro assay for function of the protein product of a gene.
Suitable in vitro
functional assays are described in Section F entitled "Identification and
Characterization of
Polymorphisms". If the host cell lines or cells express a protein product of a
gene that exhibits
an aberrant pattern of cellular localization, the ability of the test compound
to alter the cellular
localization of the protein will be determined. Changes in the cellular
localization of a protein
of interest will be detected by performing cellular fractionation studies with
biosynthetically
labeled cells. Alternatively, the cellular localization of a protein of
interest can be determined by
immunocytochemical methods well known in the art.
A method of drug screening may involve the use of host eukaryotic cell lines
or cells
(described above) which have an altered gene that demonstrates an aberrant
pattern of expression.
By aberrant pattern of expression is meant the level of expression is either
abnormally high or
low, or the temporal pattern of expression is different from that of the wild
type gene. The ability
of a test drug to alter the expression of a mutant form of a gene can be
measured by Northern blot
analysis, S1 nuclease analysis, primer extension or RNase protection assays.
Alternatively, if
a mutant form of a gene contains a polymorphism in the promoter region of a
gene, cells can
be engineered to express a reporter construct comprising a mutant gene
promoter driving
expression of a reporter gene (e.g. CAT, luciferase, green fluorescent
protein). These cells can
be grown in the presence of a test compound and the ability of a test compound
to alter the level
of activity of the mutant gene promoter can be determined by standard assays
for each reporter
gene which are well known in the art.
Candidate Drugs
A "candidate drug" as used herein, is any compound with a potential to
modulate a
phenotype associated with a particular disease according to the invention.
A candidate drug is tested in a concentration range that depends upon the
molecular
weight of the drug and the type of assay. For example, for inhibition of
protein/protein complex
formation, small molecules (as defined below) may be tested in a concentration
range of 1 pg -
100 mg/ml, preferably at about 100 pg - 10 ng/ml; large molecules, e.g.,
peptides, may be tested
in the range of 10 ng - 100 mg/ml, preferably 100 ng - 10 mg/ml.
Candidate drug compounds from large libraries of synthetic or natural
compounds can
be screened. Numerous means are currently used for random and directed
synthesis of saccharide,
peptide, and nucleic acid based compounds. Synthetic compound libraries are
commercially
available from a number of companies including Maybridge Chemical Co.
(Trevillet, Cornwall,
UK), Comgenex (Princeton, NJ), Brandon Associates (Merrimack, NH), and
Microsource (New
Milford, CT). A rare chemical library is available from Aldrich (Milwaukee,
WI). Combinatorial
libraries are available and can be prepared. Alternatively, libraries of
natural compounds in the
form of bacterial, fungal, plant and animal extracts are available from e.g.,
Pan Laboratories
(Bothell, WA) or MycoSearch (NC), or are readily produceable by methods well
known in the
art. Additionally, natural and synthetically produced libraries and compounds
are readily
modified through conventional chemical, physical, and biochemical means.
Useful compounds may be found within numerous chemical classes, though
typically they
are organic compounds, and preferably small organic compounds. Small organic
compounds
have a molecular weight of more than 50 yet less than about 2,500 daltons,
preferably less than
about 750 daltons, more preferably less than about 350 daltons. Exemplary
classes include
heterocycles, peptides, saccharides, steroids, and the like. The compounds may
be modified to
enhance efficacy, stability, pharmaceutical compatibility, and the like.
Structural identification
of an agent may be used to identify, generate, or screen additional agents.
For example, where
peptide agents are identified, they may be modified in a variety of ways to
enhance their stability,
such as using an unnatural amino acid, such as a D-amino acid, particularly D-
alanine, by
functionalizing the amino or carboxylic terminus, e.g. for the amino group,
acylation or
alkylation, and for the carboxyl group, esterification or amidification, or
the like.
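The molecular weight preferences stated above for small organic compounds can be expressed as a simple filter over a compound library. This is a minimal sketch: the function name and the returned labels are hypothetical, and the cutoffs, which the text qualifies with "about", are treated here as exact values for illustration only.

```python
def classify_by_molecular_weight(mw_daltons: float) -> str:
    """Label a compound according to the stated molecular weight ranges.

    Assumption: the approximate limits (50, 350, 750 and 2,500 daltons)
    are applied as exact thresholds.
    """
    if not 50 < mw_daltons < 2500:
        return "outside small organic compound range"
    if mw_daltons < 350:
        return "more preferred"
    if mw_daltons < 750:
        return "preferred"
    return "small organic compound"


# Hypothetical molecular weights, in daltons.
for mw in (180.2, 500.0, 1200.0, 5000.0):
    print(mw, classify_by_molecular_weight(mw))
```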
Determination of Activity of a Drug
A candidate drug, assayed according to the invention as described above, is
determined
to be effective if its use results in a change of about 10% of a phenotype
associated with a disease
according to the invention.
The level of modulation by a candidate modulator of a phenotype associated
with a
disease according to the invention, may be quantified using any acceptable
limits, for example,
via the following formula, which describes detections performed with a
radioactively labeled
probe (e.g., a radiolabeled antibody in an immunobinding experiment or a
radiolabeled nucleic
acid probe in a Northern hybridization).
\[
\text{Percent Modulation} = \frac{\mathrm{CPM}_{\text{control}} - \mathrm{CPM}_{\text{sample}}}{\mathrm{CPM}_{\text{control}}} \times 100
\]
where $\mathrm{CPM}_{\text{control}}$ is the average of the cpm in antibody/ligand complexes or on
Northern blots
resulting from assays that lack the candidate modulator (in other words,
untreated controls), and
$\mathrm{CPM}_{\text{sample}}$ is the cpm in antibody/ligand complexes or on Northern blots
resulting from assays
containing the candidate modulator. A similar calculation is performed where
the assay
comprises use of a labeling system or system of measuring enzymatic activity
in which there is
a linear relationship between the amount of label detected and the amount of
protein or nucleic
acid being represented per unit of label or the amount of protein or nucleic
acid represented by
a unit of enzymatic activity.
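The percent modulation calculation can be sketched as follows. The function names and the example counts are hypothetical. Treating "a change of about 10%" as an absolute change of at least 10 is an assumption made only to give the sketch a concrete threshold.

```python
from statistics import mean


def percent_modulation(control_cpm: list[float], sample_cpm: float) -> float:
    """Percent modulation relative to the average of the untreated controls."""
    cpm_control = mean(control_cpm)
    if cpm_control == 0:
        raise ValueError("average control cpm must be non-zero")
    return 100 * (cpm_control - sample_cpm) / cpm_control


def is_effective(modulation: float, threshold: float = 10.0) -> bool:
    """Assumed reading of 'a change of about 10%': absolute change >= threshold."""
    return abs(modulation) >= threshold


# Hypothetical counts per minute, for illustration only.
controls = [1000.0, 1100.0, 900.0]
result = percent_modulation(controls, 850.0)
print(result, is_effective(result))  # 15.0 True
```

A negative result indicates that the signal in the presence of the candidate modulator exceeded the control average.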
4. Rational Drug Design
Rational drug design is useful for producing either structural analogs of
biologically
active polypeptides of interest or small molecules with which polypeptides of
interest interact
(e.g., agonists, antagonists, inhibitors) in order to design drugs which are,
for example, more
active or stable forms of the polypeptide, or which enhance or interfere with
the function of a
polypeptide in vivo. See, e.g., Hodgson, 1991, BioTechnology, 9:19. According
to one method
of rational drug design, the three-dimensional structure of a protein of
interest (e.g., the
polypeptide product of the gene), or the complex comprising the protein
product of a gene in
association with its ligand, is determined by x-ray crystallography, by
computer modeling or most
typically, by a combination of approaches. Alternatively, useful information
regarding the
structure of a polypeptide may be obtained by modeling based on the structure
of homologous
proteins. Rational drug design has been used successfully in the development
of HIV protease
inhibitors (Erickson et al., 1990, Science, 249: 527).
Rational drug design may also involve the analysis of peptides derived from
the protein
product of a gene by an alanine scan (Wells, 1991, Methods in Enzymol., 202:
390). According
to this method, each of the amino acid residues of the peptide is sequentially
replaced by alanine,
and the effect of this amino acid substitution on the peptide's activity is
determined. This
technique can be used to determine the functionally relevant regions of the
peptide.
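The set of variants examined in an alanine scan can be enumerated with a short sketch. The function name and the example peptide are hypothetical. Skipping positions that are already alanine is an assumption of this sketch, since substituting alanine for alanine yields no new variant.

```python
def alanine_scan(peptide: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, original residue, variant peptide) for each
    single alanine substitution, using one-letter amino acid codes."""
    peptide = peptide.upper()
    variants = []
    for i, residue in enumerate(peptide):
        if residue == "A":
            continue  # assumption: positions already alanine are skipped
        variants.append((i + 1, residue, peptide[:i] + "A" + peptide[i + 1:]))
    return variants


# Hypothetical four-residue peptide, for illustration only.
for position, residue, variant in alanine_scan("MKAL"):
    print(position, residue, variant)
```

Each variant would then be assayed for activity, and positions where substitution reduces activity point to functionally relevant residues.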
Another experimental approach to rational drug design will involve the
isolation of a
target-specific antibody (selected by a functional assay) and the
determination of the crystal
structure of this antibody. Theoretically, this approach will yield a
pharmacore upon which
subsequent drug design can be based. Alternatively, if anti-idiotypic
antibodies (anti-ids) specific
for a functional, pharmacologically active antibody are generated, there is no
need to determine
the crystallographic structure of the target-specific antibody. It is expected
that the binding site
of the anti-ids will be an analog of the original receptor. The anti-id could
then be used to identify
and isolate potentially therapeutic peptides from banks of chemically or
biologically produced
peptides. These selected peptides would then function as pharmacores.
According to these methods it may be possible to design drugs which
demonstrate
increased activity or stability of the protein product of a gene or which
function as inhibitors,
agonists, antagonists, etc. of the activity of a protein product of a gene.
The availability of cloned
gene sequences, including polymorphisms, ensures that sufficient amounts of
the polypeptide
product of a gene are available to facilitate analytical studies such as x-ray
crystallography.
Furthermore, the knowledge of the sequence of the protein product of a gene
provided herein will
guide those using computer modeling techniques in place of, or in addition to
x-ray
crystallography.
5. Gene Therapy
The present invention also provides a method of supplying wild-type gene
function to a
cell which carries a mutant allele of a gene. By replacing a mutant gene with
a wild type gene,
it may be possible to reverse the symptoms of osteoporosis in the recipient
cells. A full-length
version of the wild-type gene, or a fragment of the gene, may be introduced
into the cell in a
vector such that the gene remains extrachromosomal and is expressed by the
cell from the
extrachromosomal location. More preferably, following introduction into the
mutant cell, the
wild-type gene or gene fragment should recombine with the endogenous mutant
gene already
present in the cell. Such recombination requires a double recombination event
which results in
the correction of the gene mutation. Vectors for introduction of genes both
for recombination and
for extrachromosomal maintenance are known in the art, and any suitable vector
may be used.
Methods for introducing DNA into cells such as electroporation, calcium
phosphate co-
precipitation and lipofection are known in the art (described above). Cells
transformed with the
wild-type gene can be used as model systems to study changes in the intensity
of symptoms
associated with osteoporosis and drug treatments which promote such changes.
As generally discussed above, a gene or a fragment thereof, where applicable,
may be
used in gene therapy methods in order to increase the amount of the expression
products of such
genes in cells of patients with osteoporosis. It may also be useful to
increase the level of
expression of a gene even in those cells in which the mutant gene is expressed
at a "normal"
level, but the gene product is not fully functional.
In other embodiments of the invention, it may be useful to increase the amount
of the
expression products of a mutant form of a gene in a cell that expresses the
wild type protein.
Gene therapy can be carried out according to generally accepted methods, for
example, as
described by Friedman (1991, in Therapy for Genetic Diseases, T. Friedman ed.,
Oxford
University Press, pp. 105-121). Initially, the appropriate cells from a
patient with osteoporosis
would be analysed by the diagnostic methods described above, to determine the
level of
production of a polypeptide from a gene and the activity of a polypeptide
product of a gene. A
virus or plasmid vector (see further details below), comprising a copy of a
gene and suitable
expression control elements, and capable of replicating inside the cells, will
be prepared. Suitable
vectors are known and are disclosed in U.S. Pat. No. 5,252,479 and PCT
published application
WO 93/07282. The vector will be injected into the patient, either locally at
an appropriate site
according to the invention or systemically.
Gene transfer systems known in the art may be useful in the practice of the
gene therapy
methods of the present invention. These include viral and nonviral transfer
methods. A number
of viruses have been used as gene transfer vectors, including papovaviruses,
e.g., SV40 (Madzak
et al., 1992, J. Gen. Virol., 73:1533), adenovirus (Berkner, 1992, Curr. Top.
Microbiol. Immunol.,
158:39; Berkner et al., 1988, BioTechniques, 6:616; Gorziglia and Kapikian,
1992, J. Virol.,
66:4407; Quantin et al., 1992, Proc. Natl. Acad. Sci. USA, 89:2581; Rosenfeld
et al., 1992, Cell,
68:143; Wilkinson et al., 1992, Nucleic Acids Res., 20:2233;
Stratford-Perricaudet et al., 1990,
Hum. Gene Ther., 1:241), vaccinia virus (Moss, 1992, Curr. Top. Microbiol.
Immunol., 158:25),
adeno-associated virus (Muzyczka, 1992, Curr. Top. Microbiol. Immunol.,
158:97; Ohi et al.,
1990, Gene, 89:279), herpesviruses including HSV and EBV (Margolskee, 1992,
Curr. Top.
Microbiol. Immunol., 158:67; Johnson et al., 1992, J. Virol., 66:2952; Fink et
al., 1992, Hum.
Gene Ther., 3:11; Breakfield and Geller, 1987, Mol. Neurobiol., 1:337; Freese
et al., 1990,
Biochem. Pharmacol., 40:2189), and retroviruses of avian (Bandyopadhyay and
Temin, 1984,
Mol. Cell. Biol., 4:749; Petropoulos et al., 1992, J. Virol., 66:3391), murine
(Miller, 1992, Curr.
Top. Microbiol. Immunol., 158:1; Miller et al., 1985, Mol. Cell. Biol., 5:431;
Sorge et al., 1984,
Mol. Cell. Biol., 4:1730; Mann and Baltimore, 1985, J. Virol., 54:401; Miller
et al., 1988, J.
Virol., 62:4337), and human origin (Shimada et al., 1991, J. Clin. Invest.,
88:1043; Helseth et
al., 1990, J. Virol., 64:2416; Page et al., 1990, J. Virol., 64:5370;
Buchschacher and
Panganiban, 1992, J. Virol., 66:2731). Most human gene therapy protocols have
been based on
disabled murine retroviruses.
Nonviral gene transfer methods known in the art include chemical techniques
such as
calcium phosphate coprecipitation (Graham and van der Eb, 1973, Virology,
52:456; Pellicer et
al., 1980, Science, 209:1414); mechanical techniques, for example
microinjection (Anderson et
al., 1980, Proc. Natl. Acad. Sci. USA, 77:5399; Gordon et al., 1980, Proc.
Natl. Acad. Sci.
USA, 77:7380; Brinster et al., 1981, Cell, 27:223; Constantini and Lacy, 1981,
Nature, 294:92);
membrane fusion-mediated transfer via liposomes (Felgner et al., 1987, Proc.
Natl. Acad. Sci.
USA, 84:7413; Wang and Huang, 1989, Biochemistry, 28:9508; Kaneda et al. 1989,
J. Biol.
Chem., 264:12126; Stewart et al., 1992, Hum. Gen. Ther., 3:267; Nabel et al.,
1990, Science,
249:1285; Lim et al., 1992, Circulation, 83:2007); and direct DNA uptake and
receptor-mediated
DNA transfer (Wolff et al., 1990, Science, 247:1465; Wu et al., 1991, J. Biol.
Chem., 266:14338;
Zenke et al., 1990, Proc. Natl. Acad. Sci. USA, 87:3655; Wu et al., 1989b, J.
Biol. Chem.,
264:16985; Wolff et al., 1991, BioTechniques, 11:474; Wagner et al., 1990,
Proc. Natl. Acad.
Sci. USA, 87:3410; Wagner et al., 1991, Proc. Natl. Acad. Sci. USA, 88:4255;
Cotten et al., 1990,
Proc. Natl. Acad. Sci. USA, 87:4033; Curiel et al., 1991a, Proc. Natl. Acad.
Sci. USA, 88:8850;
Curiel et al., 1991b, Hum. Gene Ther., 3:147).
In an approach which combines biological and physical gene transfer methods,
plasmid
DNA of any size is combined with a polylysine-conjugated antibody specific to
the adenovirus
hexon protein, and the resulting complex is bound to an adenovirus vector. The
trimolecular
complex is then used to infect cells. The adenovirus vector permits efficient
binding,
internalization, and degradation of the endosome before the coupled DNA is
damaged.
Liposome/DNA complexes have been shown to be capable of mediating direct in
vivo
gene transfer. While in standard liposome preparations the gene transfer
process is nonspecific,
localized in vivo uptake and expression have been reported in tumor deposits,
for example,
following direct in situ administration (Nabel, 1992, Hum. Gen. Ther.,
3:399).
Gene transfer techniques which target DNA directly to an appropriate tissue,
e.g., a tissue
that normally expresses the protein product of the candidate gene of the
invention, are preferred.
Receptor-mediated gene transfer, for example, is accomplished by the
conjugation of DNA
(usually in the form of covalently closed supercoiled plasmid) to a protein
ligand via polylysine.
Ligands are chosen on the basis of the presence of the corresponding ligand
receptors on the cell
surface of the target cell/tissue type. These ligand-DNA conjugates can be
injected directly into
the blood if desired and are directed to the target tissue where receptor
binding and
internalization of the DNA-protein complex occur. To overcome the problem of
intracellular
destruction of DNA, coinfection with adenovirus can be included to disrupt
endosome function.
6. Peptide Therapy
Peptides which have gene activity can be supplied to cells which carry mutant
or missing
alleles of a gene. Alternatively, peptides specific for a mutant form of the
protein product of a
gene can be supplied to cells carrying a wild type protein. The protein
product of a gene can be
produced by expression of the cDNA sequence in bacteria, for example, using
known expression
vectors (as described in Section H entitled "Production of a Mutant Protein").
Alternatively, the
protein product of a gene can be extracted from mammalian cells engineered to
produce the
protein product of a gene of interest. In addition, the techniques of
synthetic chemistry can be
employed to synthesize the protein product of a gene. Any of the above
techniques can provide
a preparation of protein product of a gene that is substantially free of other
human proteins. This
is most readily accomplished by carrying out protein synthesis in a
microorganism or in vitro.
Active gene molecules can be introduced into cells by microinjection or by the
use of
liposomes, for example. Alternatively, some active molecules may be taken up
by cells, actively
or by diffusion. Extracellular application of the protein product of a gene
may be sufficient to
decrease or reverse the physiological effects of osteoporosis. Other molecules
with the activity
of a protein product of a gene (for example, peptides, drugs or organic
compounds) may also be
used to effect such a reversal. Modified polypeptides having substantially
similar function may
also be useful for peptide therapy.
7. Transformed Hosts
Cells and animals which carry a mutant allele of a gene can be used as model
systems to
study and test for substances which have potential as therapeutic agents.
Following application
of a test substance to the cells, the phenotype of the cell will be
determined. Any variety of
phenotypic changes associated with osteoporosis can be assessed, including
insulin resistance
and combined insulin resistance/insulin secretion defect. Assays for each of
these traits are
known in the art.
Animals useful for testing therapeutic agents can be selected after
mutagenesis of whole
animals or after treatment of germline cells or zygotes. Such treatments
include insertion of
mutant alleles of a gene, usually from a second animal species, as well as
insertion of disrupted
homologous genes. Alternatively, the endogenous gene of the animals may be
disrupted by
insertion or deletion mutation or other genetic alterations using conventional
techniques
(Capecchi, 1989, Science, 244:1288; Valancius and Smithies, 1991, Mol. Cell.
Biol., 11:1402;
Hasty et al., 1991, Nature, 350:243; Shinkai et al., 1992, Cell, 68:855;
Mombaerts et al., 1992,
Cell, 68:869; Philpott et al., 1992, Science, 256:1448; Snouwaert et al., 1992,
Science, 257:1083;
Donehower et al., 1992, Nature, 356:215). Following the administration of test
substances, the
physiological changes associated with osteoporosis will be assessed. If the
test substance prevents
or suppresses any of these physiological changes, then the test substance will
be considered a
candidate therapeutic agent for the treatment of osteoporosis. These animal
models provide an
extremely important testing vehicle for potential therapeutic products.
8. Use of a Polynucleotide as a Unique Sequence Marker:
Polynucleotides can be used to mark objects or substances for the purposes of
later
identification. Thus, polynucleotides of the invention are useful for tracking
the manufacture and
distribution of a large number of diverse substances, including but not
limited to: (1) natural
resources such as animals, plants, oil, minerals, and water; (2) chemicals
such as drugs, solvents,
petroleum products, and explosives; (3) commercial by-products including
pollutants such as
radioactive or other hazardous waste; and (4) articles of manufacture such as
guns, typewriters,
automobiles and automobile parts. A nucleic acid according to the invention,
when used as a
marker, thus aids in the determination of product identity and so provides
information useful to
manufacturers and consumers.
Polynucleotides have the advantage over other marking materials of being
readily
amplifiable through the use of polymerase chain reaction (PCR) technology. The
method of PCR
is well known in the art. PCR is performed as described by Mullis & Faloona,
1987, Methods
Enzymol., 155:335, herein incorporated by reference. It is the unique sequence
of a
polynucleotide which renders it useful as a marker, since the sequence, or a
characteristic pattern
derived from its sequence, confers a property on the polynucleotide which
permits it to be
tracked.
It is contemplated that a novel polynucleotide sequence of the invention, or
fragments or
derivatives of it may be used as markers by their attachment to or mixture in
objects or
substances to be marked. Methods for marking various classes of substances and
later detection
of the tags in those substances are disclosed in U.S. Patent Nos. 5,451,505,
and 5,643,728.
Briefly, the use of a polynucleotide of the invention as a marker may entail
combining
a polynucleotide with the substance or object to be marked, using methods
appropriate to that
substance or object; and detecting the marker through amplification of the
polynucleotide
sequence using PCR technology, followed by either sequence analysis or
identification by other
means known in the art (e.g., hybridization assays).
The methods of applying a marker nucleic acid to a substance or object and
subsequent
detection of that nucleic acid will vary depending upon the nature of the
substance or object and
the environment to which it will be exposed. For example, inert solids such as
paper, many
pharmaceutical products, wood, some foodstuffs, etc., can be either processed
with the marker
nucleic acid, or the nucleic acid may be sprayed onto their surfaces.
Chemically active
substances, such as foodstuffs with enzymatic activity, polymers with charged
groups, or acidic
pharmaceuticals may require that a protective composition (e.g., liposomes) be
added to the
nucleic acid being used as a marker.
In order to mark liquids, the nucleic acid may be mixed directly with the
liquid, or, if the
chemical nature of the liquid is not compatible with this approach (i.e.,
nucleic acids are not
soluble in the liquid), the nucleic acid may be mixed with a detergent to
enhance its solubility.
Containerized gases may be marked simply by adding a nucleic acid to the
container in dry form,
as it will be dispersed throughout the gas as the gas is released.
The amount of nucleic acid to add to a substance as a marker will also vary
with the given
situation, as will the detection strategy. PCR technology, however, allows the
amplification and
detection of as little as one molecule from a sample. Other means of
detection, such as
hybridization assays require that more nucleic acid be recovered from a sample
to efficiently
detect it. PCR can be combined with a hybridization assay, however, to enhance
the sensitivity
of the method.
A nucleic acid sequence used as a marker will generally be from 20 to 1,000
bases long,
and preferably will be 60 to 1,000 bases long when PCR is to be used to detect
the marker.
One example of a substance for which nucleic acid marking is suited is
gunpowder.
Marked gunpowder may be prepared as follows: 1) add 16 ng of nucleic acid
bearing the chosen
marker sequence (derived from a polynucleotide of the invention) to 1 ml of
distilled water; 2)
mix the solution of nucleic acid with 1 g of nitrocellulose-based gunpowder;
and 3) dry in air or
under vacuum at 85°C. To recover the marker from gunpowder: 1) wash the
gunpowder sample
with 1 ml of distilled water; 2) add 50 µl of the wash solution to a standard
PCR mix, or,
alternatively, place gunpowder flakes directly into a 100 µl PCR mix; and 3)
amplify according
to standard PCR methods using primers which anneal at opposite ends and on
opposite strands
of the sequence used as a marker (annealing and extension conditions will
depend upon the exact
sequences chosen for oligonucleotide primers, and may be adjusted according to
methods known
in the art).
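The primer placement described above (primers annealing at opposite ends and on opposite strands of the marker) can be sketched in Python. This is only an illustration: the function names, the default primer length of 20 bases, and the example sequence are assumptions and do not come from the specification. The 60 to 1,000 base limit is the preferred marker length stated above for PCR detection.

```python
from typing import Tuple

# Complement table for unambiguous DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def end_primers(marker: str, primer_len: int = 20) -> Tuple[str, str]:
    """Pick a primer pair flanking the whole marker sequence.

    The forward primer copies the 5' end of the given strand; the
    reverse primer is the reverse complement of its 3' end, so the two
    primers anneal at opposite ends and on opposite strands.
    primer_len is a hypothetical default, not a value from the text.
    """
    marker = marker.upper()
    if set(marker) - set("ACGT"):
        raise ValueError("marker must contain only A, C, G, T")
    if not 60 <= len(marker) <= 1000:
        raise ValueError("marker should be 60 to 1,000 bases for PCR detection")
    if 2 * primer_len > len(marker):
        raise ValueError("primers would overlap on this marker")
    forward = marker[:primer_len]
    reverse = reverse_complement(marker[-primer_len:])
    return forward, reverse


if __name__ == "__main__":
    # Hypothetical 70-base marker used only to exercise the function.
    fwd, rev = end_primers("A" * 40 + "G" * 30)
    print(fwd, rev)
```

Real primer selection would also weigh melting temperature, secondary structure and specificity, which this sketch deliberately ignores.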
Another example of a substance which may be marked with a nucleic acid
according to
the invention is ink. To prepare a marked ink sample: 1) if the ink is water
insoluble, mix the
nucleic acid with detergents as for oil. If the ink is water soluble, add
nucleic acid directly to the
ink to a concentration of about 1 to 20 ng per ml. To recover the marker from
ink, proceed as for
oils and medicines.
In the above examples, the presence of an amplification product of the proper
size
(visualized, for example by gel electrophoresis alongside nucleic acid size
markers followed by
ethidium bromide staining of the gel, according to standard methods) will
indicate the presence
of the marker in the sample. In some instances, the PCR product may be further
subjected to
hybridization analysis or to sequencing to enhance the accuracy of the method.
A method of
hybridization analysis which can be used is described herein.
9. Use of a Polynucleotide of the Invention as a Marker for Chromosome
Mapping:
Because a polynucleotide of the invention is novel (that is, its sequence is
unique), it is
useful as a marker for chromosomal mapping. There are a number of methods of
chromosomal
mapping known in the art. Prominent among them is the variant of the in situ
hybridization
technique known as "Fluorescence In Situ Hybridization", or FISH. Details of
methods and
solutions used for in situ hybridization are well-known in the art. There are
many variations of
the FISH technique itself; however, the basic approach is similar in each case.
Essentially, in situ
hybridization of cells, nuclei, or metaphase chromosome spreads is performed
with a
polynucleotide probe either directly labeled with a fluorochrome, or labeled
with a moiety which
will be bound by a fluorochrome tagged entity. The hybridized probe is
visualized by irradiation
of the sample with light in the wavelength which excites fluorescence from the
fluorochrome.
When combined with standard methods of karyotyping known in the art, this
method allows the
polynucleotide sequence to be localized to a particular arm of a particular
chromosome. Once
mapped to a specific chromosome, the location of the novel polynucleotide
sequence on that
chromosome may be further localized by in situ hybridization along with probes
specific for
known genes or sequences, labeled with other fluorescent tags which allow the
differentiation
of the signals from the different probes. Such an approach and various
adaptations of it allow
the localization of the novel gene relative to a known gene. Methods of
generating and using
fluorescence-labeled polynucleotide probes for FISH and chromosome mapping are
known in
the art (for example, see Malcolm et al., 1981, Ann. Hum. Genet., 45:134; Bar-
Am et al., 1992,
Genes, Chromosomes & Cancer, 4:314; Pinkel et al., 1988, Proc. Natl. Acad. Sci.
USA, 85:9138;
U.S. Patent No. 5,728,527). Additional variations of the chromosome mapping
method utilize
a PCR approach (Dionne et al., 1990, BioTechniques, 8(2):190 and Iggo et
al., 1989, Proc. Natl.
Acad. Sci. USA, 86:6211).
In addition to being able to determine the chromosomal location of the novel
polynucleotide, similar technology, in which FISH is combined with flow
cytometry, will allow
the polynucleotide of the invention to be used to sort chromosomes, nuclei, or
whole cells
containing various dosages (i.e., gene copy numbers) of the gene encoding that
polynucleotide
(Hulfdin et al., 1998, Nuc. Acids Res., 26:3651).
10. Use of a Polynucleotide of the Invention as a Marker for Analysis of
Forensic Materials:
Forensic science depends heavily on methods for determining the source of
various
compounds associated with criminal activity. In particular, the identification
of individuals
involved in criminal activity through analysis of substances found at the
crime scenes is critical.
Such identification is possible with genetic typing, which involves the
determination of the
genotype of an individual with regard to loci which are polymorphic within the
population. As
used herein, "polymorphic" refers to a gene or other segment of DNA which
shows nucleotide
sequence variability from individual to individual. The use of PCR techniques
and nucleotide
probes to detect even single nucleotide changes in a polynucleotide sequence
has revolutionized
the field of forensic serology (see Reynolds and Sensabaugh, 1991, Anal.
Chem., 63:2). For an
example of polymorphisms useful for forensic identification and methods of
typing samples with
regard to those polymorphisms, see U.S. Patent # 5,273,883.
If a polynucleotide of the invention is found to have nucleotide sequence
variation among
individuals within a population, it may be useful in the analysis of forensic
samples. There are
a number of methods known to those skilled in the art for typing nucleic acids
with regard to
polymorphisms. It should be understood that any such method is acceptable
according to the
invention. One particular method is termed the "reverse dot blot" method. The
basic steps
involved are: 1) oligonucleotides bearing the sequences of various polymorphic
forms of the
polynucleotide region to be analysed are bound to membranes; 2) labeled, PCR-
amplified
fragments, derived from the sample to be genotyped, and corresponding to the
polymorphic
region ("target DNA") are allowed to hybridize to the bound oligonucleotides
under conditions
which only allow the hybridization of molecules with 100%
complementary sequences; 3)
unbound target DNA is removed; and 4) hybridized molecules are detected.
The specific genotype of the individual from whom the target sample was
obtained
(amplified), with regard to the polymorphic region of a polynucleotide of the
invention, may thus
be determined by screening a panel of probes containing the known polymorphic
sequence
variations of that region. It should be understood that the hybridization
conditions may be
adjusted by one of skill in the art so that limited amounts of non-
complementarity, including
single base mismatches, may be detected with this method.
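The reverse dot blot logic above can be mimicked in software as a rough illustration. In this sketch, stringent hybridization is modeled simply as an exact substring match between an allele-specific probe and the amplified target, written on the same strand for readability. The probe panel, allele names and sequences are hypothetical and are not taken from the specification.

```python
from typing import Dict, Iterable, List


def hybridizes(probe: str, target: str) -> bool:
    """Model hybridization under conditions permitting only 100%
    complementarity: the probe sequence must occur in the target
    without any mismatch."""
    return probe.upper() in target.upper()


def call_genotype(targets: Iterable[str], probe_panel: Dict[str, str]) -> List[str]:
    """Return the sorted names of all alleles whose probe matches at
    least one amplified target fragment from the sample."""
    fragments = list(targets)
    return sorted(
        allele
        for allele, probe in probe_panel.items()
        if any(hybridizes(probe, fragment) for fragment in fragments)
    )


if __name__ == "__main__":
    # Hypothetical panel: two probes differing at a single base.
    panel = {"allele_A": "ACGTAGCTA", "allele_G": "ACGTGGCTA"}
    # A heterozygous sample yields both kinds of amplified fragment.
    sample = ["TTACGTAGCTATT", "TTACGTGGCTATT"]
    print(call_genotype(sample, panel))
```

A single matching probe would indicate a homozygous sample, and two matching probes a heterozygous one, under the assumptions of this simplified model.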
Q. Pharmaceutical Compositions--Prevention and Treatment
1. Administration of Pharmaceutical Compositions
Administration of pharmaceutical compositions is accomplished orally or
parenterally.
Methods of parenteral delivery include topical, intra-arterial (directly to
the tumor),
intramuscular, subcutaneous, intramedullary, intrathecal, intraventricular,
intravenous,
intraperitoneal, or intranasal administration. In addition to the active
ingredients, these
pharmaceutical compositions may contain suitable pharmaceutically acceptable
carrier
preparations which can be used pharmaceutically.
Pharmaceutical compositions for oral administration can be formulated using
pharmaceutically acceptable carriers well known in the art in dosages suitable
for oral
administration. Such carriers enable the pharmaceutical compositions to be
formulated as tablets,
pills, dragees, capsules, liquids, gels, syrups, slurries, suspensions and the
like, for ingestion by
the patient.
Pharmaceutical preparations for oral use can be obtained through combination
of active
compounds with solid excipient, optionally grinding a resulting mixture, and
processing the
mixture of granules, after adding suitable auxiliaries, if desired, to obtain
tablets or dragee cores.
Suitable excipients are carbohydrate or protein fillers such as sugars,
including lactose, sucrose,
mannitol, or sorbitol; starch from corn, wheat, rice, potato, or other plants;
cellulose such as
methyl cellulose, hydroxypropylmethyl-cellulose, or sodium carboxymethyl
cellulose; and gums
including arabic and tragacanth; and proteins such as gelatin and collagen. If
desired,
disintegrating or solubilizing agents may be added, such as the cross-linked
polyvinyl
pyrrolidone, agar, alginic acid, or a salt thereof, such as sodium alginate.
Dragee cores are provided with suitable coatings such as concentrated sugar
solutions,
which may also contain gum arabic, talc, polyvinylpyrrolidone, carbopol gel,
polyethylene glycol,
and/or titanium dioxide, lacquer solutions, and suitable organic solvents or
solvent mixtures.
Dyestuffs or pigments may be added to the tablets or dragee coatings for
product identification
or to characterize the quantity of active compound, i.e., dosage.
Pharmaceutical preparations which can be used orally include push-fit capsules
made of
gelatin, as well as soft, sealed capsules made of gelatin and a coating such
as glycerol or sorbitol.
Push-fit capsules can contain active ingredients mixed with a filler or
binders such as lactose or
starches, lubricants such as talc or magnesium stearate, and, optionally,
stabilizers. In soft
capsules, the active compounds may be dissolved or suspended in suitable
liquids, such as fatty
oils, liquid paraffin, or liquid polyethylene glycol with or without
stabilizers.
Pharmaceutical formulations for parenteral administration include aqueous
solutions of
active compounds. For injection, the pharmaceutical compositions of the
invention may be
formulated in aqueous solutions, preferably in physiologically compatible
buffers such as Hank's
solution, Ringer's solution, or physiologically buffered saline. Aqueous
injection suspensions may
contain substances which increase the viscosity of the suspension, such as
sodium carboxymethyl
cellulose, sorbitol, or dextran. Additionally, suspensions of the active
compounds may be prepared as appropriate oily injection suspensions. Suitable
lipophilic solvents or vehicles include
fatty oils such as sesame oil, or synthetic fatty acid esters, such as ethyl
oleate or triglycerides,
or liposomes. Optionally, the suspension may also contain suitable stabilizers
or agents which
increase the solubility of the compounds to allow for the preparation of
highly concentrated
solutions.
For topical or nasal administration, penetrants appropriate to the particular
barrier to be
permeated are used in the formulation. Such penetrants are generally known in
the art.
2. Manufacture and Storage
The pharmaceutical compositions of the present invention may be manufactured
in a
manner that is known in the art, e.g., by means of conventional mixing,
dissolving, granulating,
dragee-making, levigating, emulsifying, encapsulating, entrapping or
lyophilizing processes.
The pharmaceutical composition may be provided as a salt and can be formed
with many
acids, including but not limited to hydrochloric, sulfuric, acetic, lactic,
tartaric, malic, succinic,
etc. Salts tend to be more soluble in aqueous or other protonic solvents
than are the
corresponding free base forms. In other cases, the preferred preparation may
be a lyophilized
powder in 1 mM-50 mM histidine, 0.1%-2% sucrose, 2%-7% mannitol at a pH range
of 4.5 to 5.5
that is combined with buffer prior to use.
After pharmaceutical compositions comprising a compound of the invention
formulated
in an acceptable carrier have been prepared, they can be placed in an
appropriate container and
labeled for treatment of an indicated condition with information including
amount, frequency and
method of administration.
3. Therapeutically Effective Dose
Pharmaceutical compositions suitable for use in the present invention include
compositions wherein the active ingredients are contained in an effective
amount to achieve the
intended purpose. The determination of an effective dose is well within the
capability of those
skilled in the art.
For any compound, the therapeutically effective dose can be estimated
initially either in
cell culture assays, or in animal models, usually mice, rabbits, dogs, or
pigs. The animal model
is also used to achieve a desirable concentration range and route of
administration. Such
information can then be used to determine useful doses and routes for
administration in humans.
A therapeutically effective dose refers to that amount of protein or its
antibodies,
antagonists, or inhibitors which ameliorate the symptoms or conditions.
Therapeutic efficacy and
toxicity of such compounds can be determined by standard pharmaceutical
procedures in cell
cultures or experimental animals, e.g., ED50 (the dose therapeutically effective
in 50% of the
population) and LD50 (the dose lethal to 50% of the population). The dose
ratio between
therapeutic and toxic effects is the therapeutic index, and it can be
expressed as the ratio,
LD50/ED50. Pharmaceutical compositions which exhibit large therapeutic indices
are preferred.
The data obtained from cell culture assays and animal studies is used in
formulating a range of
dosage for human use. The dosage of such compounds lies preferably within a
range of
circulating concentrations that include the ED50 with little or no toxicity.
The dosage varies
within this range depending upon the dosage form employed, sensitivity of the
patient, and the
route of administration.
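The therapeutic index defined above is a simple ratio, sketched here in Python. The function name and the example LD50 and ED50 values are hypothetical and serve only to show the arithmetic; both doses must be expressed in the same units.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Return the therapeutic index LD50/ED50.

    ld50: dose lethal to 50% of the population.
    ed50: dose therapeutically effective in 50% of the population.
    """
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


if __name__ == "__main__":
    # Hypothetical values in mg/kg, not data from the specification.
    print(therapeutic_index(ld50=500.0, ed50=5.0))
```

A larger ratio means a wider margin between the effective and the lethal dose, which is why compositions with large therapeutic indices are preferred.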
The exact dosage is chosen by the individual physician in view of the patient
to be
treated. Dosage and administration are adjusted to provide sufficient levels
of the active moiety
or to maintain the desired effect. Additional factors which may be taken into
account include the
severity of the disease state; age, weight and gender of the patient; diet,
time and frequency of
administration, drug combination(s), reaction sensitivities, and
tolerance/response to therapy.
Long acting pharmaceutical compositions might be administered every 3 to 4
days, every week,
or once every two weeks depending on a half-life and clearance rate of the
particular formulation.
Dosage amounts may vary from 0.1 to 100,000 micrograms per person per day, for
example, 1 µg, 10 µg, 100 µg, 500 µg, 1 mg, 10 mg, and even up to a total dose of
about 1g per person
per day, depending upon the route of administration. Guidance as to particular
dosages and
methods of delivery is provided in the literature. See U.S. Patent Nos.
4,657,760; 5,206,344; or
5,225,212, hereby incorporated by reference. Those skilled in the art will
employ different
formulations for nucleotides than for proteins or their inhibitors. Similarly,
delivery of
polynucleotide or polypeptides will be specific to particular cells,
conditions, locations, etc.
Without further elaboration, it is believed that one skilled in the art can,
using the
preceding description, utilize the present invention to its fullest extent.
The following
embodiments are, therefore, to be construed as merely illustrative, and not
limitative of the
remainder of the disclosure in any way whatsoever.
EXAMPLE 1
Establishment of an Association Between a Given Polynucleotide Sequence and
Osteoporosis
A polynucleotide sequence according to the invention containing a mutation
which is
believed to be associated with osteoporosis, can be statistically linked to
osteoporosis by linkage
analysis. An animal model system exhibiting a particular phenotypic defect
that is characteristic
of the osteoporosis is selected. A series of genetic crosses is performed in
this animal model
system between individuals having an observable mutant phenotype and normal
individuals of
a control strain. At least one disease-related locus or a chromosomal marker
that does not
comprise a disease related locus is used as a marker in these crosses. If a
statistically significant
pattern of non-random assortment of the mutant trait with a marker locus is
observed, the trait
is linked to the marker locus.
Similarly, linkage analysis can be performed on an existing human or other
mammalian
pedigree. According to this method, numerous genetic loci from affected and
unaffected family
members are compared. Non-random assortment of a given genetic marker between
affected and
unaffected family members relative to the distributions observed for other
genetic loci indicates
that the marker (for example, a variant isoform of a gene) either contributes
to the disease or is
in physical proximity to another that does so.
If either approach demonstrates a non-random assortment of the osteoporosis-related
phenotype with a marker locus, this is indicative of an association between
the gene underlying
the defect and that locus. Because the strength of any conclusion drawn from
linkage analysis is
statistically-based, the accuracy of the results is thought to be proportional
to the number of
crosses or family members and genetic loci analysed.
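One way to make "statistically significant non-random assortment" concrete is a chi-square test on the progeny of a test cross. The specification does not name a particular test, so the sketch below is an assumption: it compares counts of parental and recombinant offspring against the 1:1 ratio expected under independent assortment, using the standard 5% critical value for one degree of freedom. Function names and counts are hypothetical.

```python
# Chi-square critical value for 1 degree of freedom at the 5% level.
CRITICAL_5PCT_DF1 = 3.841


def linkage_chi_square(parental: int, recombinant: int) -> float:
    """Chi-square statistic against the 1:1 parental:recombinant ratio
    expected when trait and marker assort independently."""
    total = parental + recombinant
    if total == 0:
        raise ValueError("at least one offspring is required")
    expected = total / 2
    return ((parental - expected) ** 2 + (recombinant - expected) ** 2) / expected


def recombination_fraction(parental: int, recombinant: int) -> float:
    """Fraction of offspring that are recombinant."""
    return recombinant / (parental + recombinant)


def is_linked(parental: int, recombinant: int) -> bool:
    """Linkage is suggested when recombinants are significantly rarer
    than expected under independent assortment."""
    return (
        recombinant < parental
        and linkage_chi_square(parental, recombinant) > CRITICAL_5PCT_DF1
    )


if __name__ == "__main__":
    # Hypothetical counts from 100 scored offspring.
    print(linkage_chi_square(80, 20), recombination_fraction(80, 20), is_linked(80, 20))
```

Because the statistic grows with the number of offspring scored, the same recombination fraction gives stronger evidence in a larger cross, consistent with the remark above about the number of crosses analysed.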
EXAMPLE 2
Screening Assay For Osteoporosis
A polynucleotide sequence according to the invention can be used as a marker
for a
normal phenotype or for a phenotype associated with osteoporosis.
If it can be demonstrated by the methods of phenotyping, described above, that
a
particular sequence is associated with an osteoporosis phenotype, this
sequence can be used as
a marker for osteoporosis. A sequence of interest can be used as a probe to
screen genomic DNA
from individuals by Southern blot analysis according to the method described
above. If the
sequence of interest is detected by Southern blot analysis, and the presence
of this sequence is
confirmed by direct sequencing, it can be concluded that the individual from
which the genomic
DNA has been isolated has an increased frequency for the development of
osteoporosis for which
the sequence is a marker.
The marker can also be used as an osteoporosis indicator according to the
method of
PCR. A genomic DNA sample of interest can be analysed in a PCR reaction
wherein one of the
primers contains the marker sequence. If the marker sequence is present in the
sample DNA, a
PCR product will be produced. Alternatively, the PCR primers can be designed
such that they
amplify a region containing the marker sequence. The amplified product can be
analysed by
hybridization methods, described above, to determine the presence of the
sequence of interest.
EXAMPLE 3
Use of a Given Polynucleotide as a Target for Drug Screening
A polynucleotide according to the invention, containing a mutation which is
believed
to be associated with osteoporosis, can be used as a target for drug screening.
One method of drug screening utilizes eukaryotic or procaryotic host cells
which are
stably transformed with a polynucleotide according to the invention and either
exhibit a
particular phenotype characteristic of the presence of the polynucleotide or
express a
polypeptide or fragment encoded by the polynucleotide. Such cells, either in
viable or fixed
form, can be used for standard competitive binding assays. In particular,
these cells can be
used to measure formation of a complex comprising the protein product or
fragment of a
polynucleotide according to the invention and the agent being tested.
Alternatively, these cells
can be used to determine if the formation of a complex between the protein
product or
fragment of a polynucleotide according to the invention and a known ligand is
interfered with
by an agent being tested.
An alternative method for drug screening involves the use of eukaryotic cell
lines or
cells (such as described above) which contain a polynucleotide according to
the invention that
produces a defective protein. According to this method, the host cell lines or
cells are grown
in the presence of a test drug. The rate of growth of the host cells is
measured to determine if
the compound is capable of regulating the growth of cells expressing a
nonfunctional protein
product of the polynucleotide according to the invention. Preferably, a drug
that is useful
according to the invention will increase or decrease the growth rate of a cell
by at least 10%.
Alternatively, the ability of the test compound to restore the function of the
mutant gene
protein by at least 10% can be measured by using an appropriate in vitro assay
for function of
the protein product of a gene (as described in Section F entitled
"Identification and
Characterization of Polymorphisms"). If the host cell lines or cells express a
protein product
of a gene that exhibits an aberrant pattern of cellular localization, the
ability of the test
compound to alter the cellular localization of the protein by at least 10%
will be determined.
Changes in the cellular localization of a protein of interest will be detected
by performing
cellular fractionation studies with biosynthetically labeled cells.
Alternatively, the cellular
localization of a protein of interest can be determined by immunocytochemical
methods well
known in the art.
A method of drug screening may also involve the use of host eukaryotic cell
lines or
cells (described above) which have an altered gene that demonstrates an
aberrant pattern of
expression where the level of expression is either abnormally high or low, or
the temporal
pattern of expression is different from that of the wild type gene. The
ability of a test drug to
alter the expression of a mutant form of a gene by at least 10% can be
measured by Northern
blot analysis, S1 nuclease analysis, primer extension or RNase protection
assays, as described
above. Alternatively, if a mutant form of a gene contains a polymorphism in
the promoter
region of a gene, cells can be engineered to express a reporter construct
comprising a mutant
gene promoter driving expression of a reporter gene (e.g. CAT, luciferase,
green fluorescent
protein). These cells can be grown in the presence of a test compound and the
ability of a test
compound to alter the level of activity of the mutant gene promoter can be
determined by
standard assays for each reporter gene which are well known in the art.
A transgenic animal whose genomic DNA contains a polynucleotide associated
with a
particular phenotypic defect that is characteristic of osteoporosis and a
normal, control animal
(not containing the polynucleotide) can be treated with a candidate drug
according to the
invention. The ability of a candidate drug to ameliorate symptoms of the
disease, by at least
10%, will be analysed by assessing the disease symptoms and their
amelioration.
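The "at least 10%" criterion used throughout this example can be expressed as a small calculation. The sketch below is illustrative only: the function names and the readouts are hypothetical, and the choice of the untreated control as the baseline is an assumption, since the specification does not state how the percentage is computed.

```python
def percent_change(control: float, treated: float) -> float:
    """Percent change of a treated readout relative to the untreated
    control (growth rate, expression level, reporter activity, etc.)."""
    if control == 0:
        raise ValueError("control readout must be non-zero")
    return 100.0 * (treated - control) / abs(control)


def meets_threshold(control: float, treated: float, threshold: float = 10.0) -> bool:
    """True when the readout increases or decreases by at least
    `threshold` percent relative to the control."""
    return abs(percent_change(control, treated)) >= threshold


if __name__ == "__main__":
    # Hypothetical reporter activities in arbitrary units.
    print(percent_change(200.0, 150.0), meets_threshold(200.0, 150.0))
```

In practice, replicate measurements and a statistical comparison would be needed before treating a change of this size as a real drug effect.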
EXAMPLE 4
Polymorphisms in Genes Associated With Osteoporosis
The osteoporosis candidate gene list was compiled using gene or gene sequences
selected from literature sources, using sequence homology, library subtraction
and expression
analysis.
Expression analysis was performed using "guilt-by-association" queries to
identify
Incyte-novel and known genes not previously associated with osteoporosis which
have
similar expression patterns to genes known to be associated with osteoporosis.
Guilt-by-
association analysis was performed as described in Walker et al. 1999 Genome
Res 9:1198
and U.S. Provisional Patent Application Serial No. 60/342,711 entitled
"Nucleotide
Polymorphisms Associated with Osteoporosis" filed December 20, 2001, both of
which are
incorporated by reference in their entirety.
Polymorphism discovery was by fSSCP as described in section F "Identification
and
Characterization of Polymorphisms". The polymorphisms were mapped to cDNA
sequences
in the LifeSeqGold database (Incyte Genomics, Inc., Palo Alto, CA) to identify
the affected
gene.
EXAMPLE 5
Frequency of polymorphisms in Osteoporosis associated genes and polynucleotides
in various
populations.
Polymorphisms identified in Example 4 were genotyped against populations
described below by fSSCP or FP-TDI as described above. The results of the
population
frequency studies are given in Table 2 found in U.S. Provisional Patent
Application Serial
No. 60/342,711 entitled "Nucleotide Polymorphisms Associated with
Osteoporosis" filed
December 20, 2001, which is hereby incorporated herein by reference in its
entirety.
Two panels of human DNA have been developed to support the identification of
frequent SNPs within an ethnically diverse population. The genomic Human
Diversity Panel
(HDP) will be used where full genomic structure for the selected candidate
genes is available
to allow screening of the open reading frame of the gene including splice
junctions. A cDNA
version of the HDP (generated from lymphoblastoid cell lines to obviate the
need for
intron/exon structure in 50% of human genes) will be used where full genomic
structure for
the selected candidate genes may not be available to permit screening of the
open reading
frame of the gene.
This HDP is derived from 47 consenting individuals from four ethnic groups
(Caucasian, African-American, Asian and Hispanic). The panel is sufficiently
sized to
enable identification of 95% of SNPs with allele population frequencies >= 5%.
Comparable utility of the HDP with the NIH Diversity panel was demonstrated
by parallel
screening of 90 kilobases of coding sequence from each panel.
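The panel-size statement above can be checked with a simple sampling calculation. The sketch assumes, beyond what the specification states, that the 2n chromosomes of n diploid individuals are independent draws from the population and that any variant allele present in the panel is detected by the assay. Under those assumptions, the chance of seeing an allele of frequency p at least once is 1 - (1 - p)^(2n). Function names are hypothetical.

```python
import math


def detection_probability(n_individuals: int, allele_freq: float) -> float:
    """Probability that an allele of the given population frequency
    appears at least once among 2 * n_individuals chromosomes."""
    chromosomes = 2 * n_individuals
    return 1.0 - (1.0 - allele_freq) ** chromosomes


def min_individuals(allele_freq: float, power: float) -> int:
    """Smallest number of diploid individuals giving at least `power`
    probability of sampling the allele one or more times."""
    chromosomes = math.ceil(math.log(1.0 - power) / math.log(1.0 - allele_freq))
    return math.ceil(chromosomes / 2)


if __name__ == "__main__":
    # 47 individuals and a 5% allele frequency, as described for the HDP.
    print(detection_probability(47, 0.05))
    print(min_individuals(0.05, 0.95))
```

With 47 individuals the sampling probability for a 5% allele comes out at roughly 0.99, so the stated 95% figure leaves room for imperfect assay sensitivity and for frequency differences between the ethnic groups in the panel.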
EXAMPLE 6
Osteoporosis study population recruitment and clinical data collection
Families were identified through probands with a BMD Z score of at least -1.6
(equivalent to approximately the lower 5% of the normal distribution of BMD)
at either the
femoral neck or the lumbar spine (L2-L4). A "proband" is defined as the first
person
identified with a particular phenotype (in this case low BMD) within a family.
The initial phase of family collection focused on nuclear families of European
Caucasoid origin. These families were used primarily for a genome-wide scan
for genetic
determinants of BMD. BMD was measured in all participating family members and
treated
as a quantitative trait.
First degree relatives of probands will be invited to participate. These
included
parents, siblings and offspring over the age of twenty. Spouses could take part to act as
part to act as
controls and to assist the analysis of their children's genotype.
If a relative is found to have a low bone density (cut off at Z of
approximately -1.28;
equivalent to the lower 10% of the phenotypic range), the invitation to
participate will be
extended to their first-degree relatives. In some cases the parents of a
proband will be
deceased. If a strong family history suggesting osteoporosis in deceased
parents is present
then secondary relatives such as aunts, uncles, cousins will be invited to
participate.
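The correspondence between the Z score cut-offs quoted above and fractions of the phenotypic range can be checked against the standard normal distribution. The sketch below assumes that Z scores in the reference population are standard normal (mean 0, standard deviation 1); the function name `lower_tail_fraction` is hypothetical.

```python
from statistics import NormalDist


def lower_tail_fraction(z_cutoff: float) -> float:
    """Fraction of a standard normal population with a Z score at or below
    z_cutoff. Assumes Z scores are normally distributed."""
    return NormalDist(mu=0.0, sigma=1.0).cdf(z_cutoff)


# Cut-offs quoted in the text for relatives, probands and a stricter criterion.
for z in (-1.28, -1.6, -2.0):
    percent = 100.0 * lower_tail_fraction(z)
    print(f"Z <= {z:+.2f}: about {percent:.1f}% of the reference population")
```

Under this assumption a cut-off of -1.28 corresponds to roughly the lowest tenth of the distribution and -1.6 to roughly the lowest twentieth, in line with the approximate figures given in the text.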
The size and nature of families will therefore depend on a number of factors
including the age of the proband, family history of osteoporosis or fractures
and whether
other family members are willing to participate. It is expected, judged from
previous
experience, that the average number of volunteers per family will be five
individuals. The
absolute minimum family that was accepted into the study is a pair of
siblings, either
concordant or discordant for BMD where one of the siblings was a proband.
A collection of large numbers of simplex families for linkage disequilibrium studies
was carried out for the finer mapping stages of positional cloning and for systematic mapping
of functional candidate genes. At a later stage, families from other ethnic
groups will
provide genetic diversity for haplotype analysis to help identify the primary
disease-
predisposing sequences. Cape Town and Singapore were selected to collect
material from
ethnic groups.
Potential probands were identified if they had a femoral neck or lumbar spine BMD
equal to or lower than a Z score of -2.0, were between 20 and 85 years of age, were of
European, white Caucasian origin and were fully mobile. They were excluded from the study
if they had secondary osteoporosis; prednisolone usage at a dose of 7.5 mg per day for six
months or longer, or equivalent steroid doses of dexamethasone 0.75 mg per day or
hydrocortisone 30 mg per day; were hypothyroid patients on thyroxine with TSH below the
laboratory normal range; had a malignancy (including myeloma) within five years; had
malabsorption; had an inflammatory bowel disease; had premenopausal (aged less than 45
years) amenorrhoea of greater than six months, other than pregnancy; had previous or
current alcohol intake estimated at greater than 30 units per week for more than six
months; or had chronic renal failure (creatinine > 150 µmol/l) or
chronic liver dysfunction (AST > twice normal).
Volunteers gave blood samples for DNA extraction for genetic studies, as well
as
blood samples for calcium, creatinine, liver function (if over 60 years), TSH
and vitamin D
(if over 60 years) tests, and a second voided urine sample for markers of bone
turnover.
For genetic studies, at least 10 ml of venous blood was collected from a
forearm vein
into EDTA tubes. Blood collected into plastic tubes will be frozen straight
away. Blood
collected into glass tubes was transferred to plastic tubes before freezing.
Once frozen the
blood will not be thawed until DNA extraction takes place. DNA extraction was
performed
using standard procedures. The blood was frozen as quickly as possible to -70°C and then
stored at -70°C.
A 10-ml venous blood sample was also taken from all subjects for biochemical
assays of calcium, creatinine, liver function, TSH and vitamin D. Blood was
collected into a
plain container. Separated serum was stored at -70°C.
Second voided urine samples for analysis of biochemical markers of bone
turnover
were taken. These samples were stored at -70°C.
In addition to BMD at femoral neck and lumbar spine, height and weight were
measured. Volunteers were scanned at the femoral neck and lower spine (L2-L4). For the
femoral neck, the volunteer will be placed in the dorsal decubitus position
with a 10-degree
internal rotation of the hip, according to the manufacturer's protocol. For a
satisfactory
lumbar spine scan (e.g. no scoliosis, severe degenerative disease or obvious
fracture), the
volunteer will be placed, as described in the manufacturer's manual, in a
comfortable supine
position with legs raised and supported so as to ensure that the lumbar spine
is as horizontal
as possible. The axis of the spine should be parallel to the axis of the
scanning machine.
Bone mineral density measurements were performed using dual energy X-ray
absorptiometry (DXA) scanning. The bone density data was standardized by the
use at each
center of the same male and female reference population databases for hip and
spine. The Z
score at both the femoral neck and the lumbar spine for an individual
volunteer was
calculated using the regression line and standard deviation from the
respective reference
database. This is done by using the regression equation y = mx + c, where y is the absolute
BMD, x is the volunteer's age and m and c are the slope and constant, respectively. From
this, one can calculate the predicted BMD value. The Z score was calculated by subtracting
the predicted BMD from the actual BMD, and then dividing the difference by the reference
population standard deviation.
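The Z score calculation described above can be sketched as follows. The function names and all numeric values in the usage example (slope, constant, standard deviation, age and measured BMD) are hypothetical and chosen only for illustration; they are not taken from any reference database.

```python
def predicted_bmd(age_years: float, slope: float, constant: float) -> float:
    """Predicted BMD from the reference regression line y = m*x + c."""
    return slope * age_years + constant


def bmd_z_score(actual_bmd: float, age_years: float, slope: float,
                constant: float, reference_sd: float) -> float:
    """Z score = (actual BMD - predicted BMD) / reference population SD."""
    if reference_sd <= 0:
        raise ValueError("reference_sd must be positive")
    return (actual_bmd - predicted_bmd(age_years, slope, constant)) / reference_sd


# Hypothetical reference parameters and volunteer (illustrative values only).
z = bmd_z_score(actual_bmd=0.70, age_years=60, slope=-0.004,
                constant=1.10, reference_sd=0.12)
print(f"Z score: {z:.2f}")
```

In practice the slope, constant and standard deviation would come from the sex-specific and site-specific reference database used at each center.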
EXAMPLE 7
Data Handling and Statistical Analysis to determine association of polymorphisms with
Osteoporosis
Bone mineral density was analysed as a quantitative trait in probands and
family
members. Selection of probands with a low BMD increased the power to detect
linkage of
genetic marker loci.
Power is defined here as the probability of observing positive evidence for
linkage at
a single additive quantitative locus, assuming that a genetic susceptibility
locus exists, using
the variance component model of Amos (1994). Positive evidence of linkage
means a LOD
score of 3.0 (p<0.001) or greater, the accepted scientific standard.
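The relationship between a LOD score and a pointwise p-value can be illustrated with a short calculation. This is a sketch under a stated assumption, not the procedure used in the study: it takes the likelihood-ratio statistic 2 ln(10) x LOD to follow, under the null hypothesis, an equal mixture of a point mass at zero and a chi-square distribution with one degree of freedom, which is a common large-sample approximation for a single variance component. The function name `lod_to_pointwise_p` is hypothetical.

```python
import math


def lod_to_pointwise_p(lod: float) -> float:
    """Approximate one-sided pointwise p-value for a LOD score.

    Assumption (not from the source): the statistic 2*ln(10)*LOD is
    distributed as a 50:50 mixture of 0 and chi-square(1 df) under the null.
    """
    chi_square = 2.0 * math.log(10.0) * max(lod, 0.0)
    # Upper tail of chi-square with 1 df is erfc(sqrt(x / 2)); halve for the mixture.
    return 0.5 * math.erfc(math.sqrt(chi_square / 2.0))


p_value = lod_to_pointwise_p(3.0)
print(f"LOD 3.0 corresponds to a pointwise p of about {p_value:.1e}")
```

Under this approximation a LOD score of 3.0 falls comfortably below the p < 0.001 threshold quoted in the text.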
Power to detect an additive locus was estimated from observed phenotypic data
given
assumed values for:
Broad sense heritability (a measure of the overall genetic component of the phenotype).
Narrow sense heritability (a measure of the genetic contribution of a specific locus).
The frequencies of the alleles at the quantitative locus.
The broad sense heritability for osteoporosis was estimated to be between 0.3
and 0.8.
Theoretical calculations were based on the analysis of 108 nuclear families
with an
average of 2.9 phenotyped siblings per family using the same recruitment
strategy as
described above, assuming:
An intermediate value of 0.5 for broad sense heritability.
The presence of two alleles at a given quantitative trait locus.
That the locus behaves in an additive manner.
That the marker locus is highly informative.
That the recombination frequency between the trait locus and the marker is negligible.
While it is difficult to predict both the contribution of a specific locus to
genetic
susceptibility and the frequency of the associated allele, it is likely that
for a multifactorial
disease like osteoporosis, susceptibility will be due to common alleles with
small gene
effects. Power calculations were performed, therefore, for a range of narrow
sense
heritability and allele frequency combinations; narrow sense heritability > 8% and allele
frequency of 0.22 or less, and narrow sense heritability > 12% and allele frequency of 0.28 or
less are typical examples. Thus, if the modeling assumptions are reasonable approximations
to reality, it was estimated that about 1200 families will be needed to detect a quantitative
locus at 80% power using families in which the proband has a BMD Z score of -2.0 or less.
Similar calculations suggest that by tightening the proband inclusion
criterion to a
BMD Z score of -2.0 (approximately the lower 2.5% of the phenotypic
range) the number of
families required would be reduced to an estimated 800.
Thus, it is proposed that family recruitment begins initially using the
proband
inclusion criterion of a BMD Z score of -2.0. Depending on the rate of family
recruitment,
which will be monitored constantly, and also on the ongoing genetic analyses,
a more
stringent proband criterion of Z score of -2.0 or less will be adopted.
Statistical Analysis
The genome scan was performed using a strategy of replication. This is a
powerful
approach based on the premise that it is highly improbable that false positive
evidence for
linkage will be replicated in the analysis of additional data sets. There was
an interim
analysis on an initial population of 200 families from the Oxford region,
followed by analysis
of additional family data sets from the UK and The Netherlands.
Linkage analysis was performed using the variance-components analysis program.
BMD was corrected for height, weight, age and sex. The experimental
threshold for positive
evidence for linkage was p <0.001 at one locus or p <0.01 at two or more
adjacent loci.
EXAMPLE 8
Identification of Single Nucleotide Polymorphisms
Single nucleotide polymorphisms (SNPs) were identified using Incyte's
proprietary
fSSCP method. Fluorescently labeled primers were synthesized and PCR was performed on 47
DNAs from a Coriell-derived Human Diversity Panel. The PCR products were
electrophoresed on an ABI 377 machine and 8% nondenaturing, 12 cm SSCP gels were used.
The resulting traces were aligned in ABI Genotyper software and where variant
traces
(indicating underlying polymorphisms) were found, examples of each variant
were
sequenced.
Biallelic polymorphism genotyping by Pyrosequencing™
A pair of oligonucleotides for amplification by PCR was designed on either
side of
each biallelic polymorphism to produce a product size between 50bp and 350bp.
A
sequencing oligonucleotide was designed to end within 30bp either 5' or 3' to
each
polymorphic site. All amplification oligonucleotides used to generate the
complementary
strand to the sequencing primer were labeled with a 5' - Biotin. Examples of
the particular
sequencing primers used are found in Table 10 of U.S. Provisional Patent
Application Serial
Number 60/423559, entitled "Nucleotide Polymorphisms Associated with
Osteoporosis" filed
November 4, 2002, which is incorporated by reference in its entirety, and in
Tables 10 and 13
of this application.
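The design constraints described above (a PCR product of 50bp to 350bp and a sequencing oligonucleotide ending within 30bp of the polymorphic site) can be expressed as a simple check. This is an illustrative sketch only: the function name, the use of inclusive 1-based coordinates on a single reference strand, and the requirement that the primer end at least 1bp away from the site are assumptions and are not taken from the source.

```python
from typing import List

MIN_PRODUCT_BP = 50          # lower product size bound from the text
MAX_PRODUCT_BP = 350         # upper product size bound from the text
MAX_PRIMER_TO_SITE_BP = 30   # sequencing primer must end within this distance


def check_assay_design(amplicon_start: int, amplicon_end: int,
                       site_position: int, seq_primer_end: int) -> List[str]:
    """Return a list of design problems (empty if the design passes).

    Coordinates are hypothetical inclusive 1-based positions.
    """
    problems = []
    product_size = amplicon_end - amplicon_start + 1
    if not MIN_PRODUCT_BP <= product_size <= MAX_PRODUCT_BP:
        problems.append(f"product size {product_size}bp outside 50-350bp")
    if not amplicon_start <= site_position <= amplicon_end:
        problems.append("polymorphic site lies outside the amplicon")
    distance = abs(site_position - seq_primer_end)
    if not 1 <= distance <= MAX_PRIMER_TO_SITE_BP:
        problems.append(f"sequencing primer ends {distance}bp from the site")
    return problems


print(check_assay_design(100, 299, 200, 190))   # a design that passes
print(check_assay_design(100, 120, 110, 150))   # a design with two problems
```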
For each marker, all samples genotyped were amplified by PCR using the PCR
amplification oligonucleotides. Each reaction used: 20 ng DNA (dried down), 0.6 units of
AmpliTaq Gold™ DNA polymerase, 1X PCR Buffer II, 2.5 mM MgCl2, 1 mM dNTP, and
10 pmol of each PCR oligonucleotide in a final volume of 10 µl. The PCR cycling conditions
used were: 95°C for 12 min, 45 cycles of: 94°C for 15 sec, TA (the marker-specific annealing
temperature) for 15 sec, 72°C for 30 sec, and
72°C for 5 min.
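The cycling conditions above can be written down as a small data structure, which makes the total programmed hold time easy to compute. This is a sketch for illustration: the class and variable names are hypothetical, the annealing temperature is left unset because it is marker-specific, and ramp times between temperatures are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    label: str
    temperature_c: Optional[float]  # None marks the marker-specific annealing step
    seconds: int


INITIAL_DENATURATION = Step("initial denaturation", 95.0, 12 * 60)
CYCLE_STEPS: List[Step] = [
    Step("denaturation", 94.0, 15),
    Step("annealing", None, 15),
    Step("extension", 72.0, 30),
]
NUMBER_OF_CYCLES = 45
FINAL_EXTENSION = Step("final extension", 72.0, 5 * 60)


def total_hold_seconds() -> int:
    """Sum of programmed hold times, excluding ramping between temperatures."""
    per_cycle = sum(step.seconds for step in CYCLE_STEPS)
    return (INITIAL_DENATURATION.seconds
            + NUMBER_OF_CYCLES * per_cycle
            + FINAL_EXTENSION.seconds)


print(f"Programmed hold time: {total_hold_seconds() / 60:.0f} min")
```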
After amplification the DNA strand of each PCR template complementary to the
sequencing primer was isolated, ready for pyrosequencing (PSQ). To do this, 1) 50 µl of
Dynabead solution (2 mg/ml Dynabeads®, 5 mM Tris-HCl, 1 M NaCl, 0.5 mM EDTA, 0.05%
Tween 20) was added to the PCR product and shaken at 65°C for 15 min, 2) the template was
transferred using magnets to 50 µl of 0.5 M NaOH for 1 min, 3) the template was transferred
using magnets to 100 µl of 1X Annealing buffer (20 mM Tris-Acetate, 5 mM MgAc2) for 1
min, and 4) the template was transferred using magnets to 45 µl of 1X Annealing buffer
containing 15 pmol of sequencing oligonucleotide. Examples of particular sequencing
primers and specific annealing temperatures are found in Table 10 of U.S. Provisional Patent
Application Serial Number 60/423559, entitled "Nucleotide Polymorphisms
Associated with
Osteoporosis" filed November 4, 2002, which is incorporated by reference in
its entirety.
After template isolation, the sequencing oligonucleotide was annealed to the
template
by denaturing at 80°C for 2min and then cooling to room temperature for
10 min. Each
marker/sample combination was then sequenced/genotyped by Pyrosequencing™ on a
PSQ96™ (Pyrosequencing AB). Genotype results were stored in the PSQ Oracle® database
ready for statistical analysis.
EXAMPLE 9
Genes Analyzed for Polymorphism Association with Osteoporosis
The following genes were found to have polymorphism-associated effects on the
susceptibility to low bone mineral density and/or bone damage, and hence osteoporosis:
1) Aortic carboxypeptidase-like protein (ACLP)
mRNA: NM_001129 Protein: NP_001120
The ACLP, also known as the adipocyte enhancer(AE)-binding protein 1 (AEBP1),
is
a transcriptional repressor with carboxypeptidase activity and may play a role
in
adipogenesis.
2) A kinase anchor protein 9 (AKAP9)
mRNA: NM_005751 Protein: NP_005742
mRNA: NM_147166 Protein: NP_671695
mRNA: NM_147171 Protein: NP_671700
mRNA: NM_147185 Protein: NP_671714
AKAP9, also known as YOTIAO, is a scaffold protein that binds type I protein
phosphatase (PP1) and cAMP-dependent protein kinase (PKA) to NMDA receptors.
AKAP9
also anchors protein kinases and phosphatases to the centrosome and the Golgi
apparatus.
3) Bone morphogenetic protein receptor, type II (BMPR2)
Variant 1: mRNA: NM_001204 Protein: NP_001195
Variant 2: mRNA: NM_033346 Protein: NP_203132
BMPR2, also known as the serine/threonine kinase type II activin receptor-like kinase, is
a transforming growth factor beta (TGF-beta) receptor that can also bind type
I receptors and
is involved in bone and other morphogenesis. Mutations in the gene are
associated with
familial primary pulmonary hypertension.
4) Fibroblast growth factor receptor 2 (FGFR2)
mRNA: NM_000141 Protein: NP_000132
mRNA: NM_022969 Protein: NP_075258
mRNA: NM_022970 Protein: NP_075259
mRNA: NM_022971 Protein: NP_075260
mRNA: NM_022972 Protein: NP_075261
mRNA: NM_022973 Protein: NP_075262
mRNA: NM_022974 Protein: NP_075263
mRNA: NM_022975 Protein: NP_075264
mRNA: NM_022976 Protein: NP_075265
mRNA: NM_023028 Protein: NP_075417
mRNA: NM_023029 Protein: NP_075418
mRNA: NM_023030 Protein: NP_075419
mRNA: NM_023031 Protein: NP_075420
FGFR2 is a high-affinity receptor, depending on the isoform, for acidic, basic
and/or
keratinocyte growth factor. This receptor is a member of the fibroblast growth
factor receptor
family, where amino acid sequence is highly conserved between members and
throughout
evolution. FGFR family members differ from one another in their ligand
affinities and tissue
distribution. A full-length representative protein consists of an
extracellular region, composed
of three immunoglobulin-like domains, a single hydrophobic membrane-spanning
segment
and a cytoplasmic tyrosine kinase domain. The extracellular portion of the
protein interacts
with fibroblast growth factors, setting in motion a cascade of downstream
signals, ultimately
influencing mitogenesis and differentiation. Mutations in this gene are
associated with many
craniosynostotic syndromes and bone malformations. The genomic organization of
this gene
encompasses 20 exons. Alternative splicing in multiple exons, including those
encoding the
Ig-like domains, the transmembrane region and the carboxyl terminus, results
in varied
isoforms which differ in structure and specificity.
5) FBJ murine osteosarcoma viral oncogene homolog B (FOSB)
mRNA: NM_006732 Protein: NP_006723
FOSB is a DNA-binding member of the Fos family that forms the AP-1 transcription factor
complex with Jun proteins. FOSB may be involved in the pathogenesis of breast tumors. An
alternative form, deltaFosB, plays a role in persistent neuroplasticity
associated with cocaine
addiction.
6) Follistatin-like 1 (FSTL1)
mRNA: NM_007085 Protein: NP_009016
FSTL1, also known as follistatin-related protein, is a nuclear activin-binding protein
that is induced by TGF beta 1 (TGFB1) and inhibits cell proliferation. FSTL1 is also an
autoantigen in systemic rheumatic diseases. FSTL1 is abundantly expressed
(0.33%) in
trabecular bone libraries.
7) Insulin-like growth factor binding protein 5 (IGFBP5)
mRNA: NM_000599 Protein: NP_000590
IGFBP5 is a member of the insulin-like growth factor binding family of
proteins that
bind to and modulate insulin-like growth factor activity, regulates bone
formation and may
serve in muscle and cartilage development. IGFBP5 has tissue specificity with
osteosarcoma,
and at lower levels in liver, kidney, and brain. IGFBP5 can also alter the
interaction of insulin
growth factors with their cell surface receptors.
8) Insulin receptor substrate 1 (IRS1)
mRNA: NM_005544 Protein: NP_005535
IRS1, also known as HIRS-1, is a cytoplasmic docking protein that mediates IGF1
signaling to SH2-containing effector molecules such as Grb2 and PI3-kinase and inhibits
apoptosis. IRS1 also plays a role in cell proliferation and glucose
transport.
9) Alpha V subunit integrin (ITGAV)
mRNA: NM_002210 Protein: NP_002201
ITGAV is a subunit of the vitronectin receptor that is involved in cell-cell
and
cell-matrix interactions, plays a role in tumor angiogenesis and may
contribute to
tumorigenicity of cutaneous malignant melanoma. Integrins serve as major
receptors for
extracellular matrix-mediated cell adhesion and migration, cytoskeletal
organization, cell
proliferation, survival, and differentiation. Alpha-V integrins comprise a
subset sharing a
common alpha-V subunit combined with 1 of 5 beta subunits (beta-1, -3, -5, -6,
or -8). All or
most alpha-V integrins recognize the sequence RGD in a variety of ligands
(vitronectin,
fibronectin, osteopontin, bone sialoprotein, thrombospondin, fibrinogen, von
Willebrand
factor, tenascin, and agrin) and, in the case of alpha-V-beta-8, laminin and type IV collagen.
Vitronectin is a multifunctional glycoprotein present in blood and in the
extracellular
matrix. It binds glycosaminoglycans, collagen, plasminogen and the urokinase-
receptor, and
also stabilizes the inhibitory conformation of plasminogen activation
inhibitor-1. By its
localization in the extracellular matrix and its binding to plasminogen
activation inhibitor-1,
vitronectin can potentially regulate the proteolytic degradation of this
matrix. In addition,
vitronectin binds to complement, to heparin and to thrombin-antithrombin III
complexes,
implicating its participation in the immune response and in the regulation of
clot formation.
The biological functions of vitronectin can be modulated by proteolytic
enzymes, and by exo-
and ecto-protein kinases present in blood.
Vitronectin contains an Arg-Gly-Asp (RGD) sequence, through which it binds to
the
integrin receptor alpha v beta 3, and is involved in the cell attachment,
spreading and
migration.
Bone resorption requires the tight attachment of the bone-resorbing cells, the
osteoclasts, to the bone mineralized matrix. Integrins, a class of cell
surface adhesion
glycoproteins, play a key role in the attachment process. Most integrins bind
to their ligands
via the RGD tripeptide present within the ligand sequence. The interaction
between integrins
and ligands results in bidirectional transfer of signals across the plasma
membrane. Tyrosine
phosphorylation occurs within cells as a result of integrin binding to ligands
and probably
plays a role in the formation of the osteoclast clear zone, a specialized
region of the osteoclast
membrane maintained by cytoskeletal structure and involved in bone resorption.
Human osteoclasts express alpha 2 beta 1 and alpha v beta 3 integrins on their
surface.
The alpha v beta 3 integrin, a vitronectin receptor, plays an essential role
in bone resorption.
For example, echistatin, an RGD-containing protein from a snake venom, binds
to the alpha v
beta 3 integrin and blocks bone resorption both in vitro and in vivo. (Dresner-
Pollak R,
Rosenblatt M. J Cell Biochem 1994 Nov;56(3):323-30).
The crystal structure of the extracellular portion of integrin alpha-V-beta-3 has been
determined at 3.1-angstrom resolution. Its 12 domains assemble into an ovoid head and 2 tails.
In the crystal,
alpha-V-beta-3 is severely bent at a defined region in its tails, reflecting
an unusual flexibility
that may be linked to integrin regulation.
Alpha-V integrins have been implicated in many developmental processes and are
therapeutic targets for inhibition of angiogenesis and osteoporosis.
Surprisingly, the ablation
of the gene for the alpha-V integrin subunit, eliminating all 5 alpha-V
integrins, although
causing lethality, allows considerable development and organogenesis
including, most
notably, extensive vasculogenesis and angiogenesis. Eighty percent of embryos
die in
midgestation, probably because of placental defects, but all embryos develop
normally to
E9.5, and 20% are born alive. These liveborn alpha-V-null mice consistently
exhibit
intracerebral and intestinal hemorrhages and cleft palates. These results
necessitate
reevaluation of the primacy of alpha-V integrins in many functions including
vascular
development, despite reports that blockade of these integrins with antibodies
or peptides
prevents angiogenesis.
10) KJ_bonlib4/Eukaryotic translation initiation factor 4 gamma 2 (EIF4G2)
mRNA: NM_001418 Protein: NP_001409
KJ_bonlib4 is also known as p97, DAP5, NAT1 and Eukaryotic translation initiation
factor 4G-like 1. KJ_bonlib4 is a translational repressor that binds EIF3 and
EIF4A, but not
EIF4E, promotes IFNG-induced programmed cell death and is cleaved by caspase-3
(CASP3)
during apoptosis.
11) KJ_bonlib7
mRNA: NM_018067
KJ_bonlib7 has an unknown function.
12) KJ_opgba1
XM_053496
KJ_opgba1 is a member of the sulfatase family, which hydrolyze sulfate esters, and has a
region of moderate similarity to a region of N-acetylglucosamine-6-sulfate sulfatase (human
GNS), which is associated with Sanfilippo disease IIID upon deficiency.
13) KJ_opgba13
mRNA: NM_007021 Protein: NP_008952
KJ_opgba13 is also known as DEPP. Pfam model results indicate that
KJ_opgba13 is a thermophilic metalloprotease.
14) KJ_opgba14
mRNA: NM_015429 Protein: NP_056244
KJ_opgba14 is also known as NESHBP and TARSH. KJ_opgba14 contains a
fibronectin type III domain, which is involved in cell surface binding.
15) KJ_opgba47
mRNA: NM_024843 Protein: NP_079119
KJ_opgba47 is a member of the cytochrome b561 family, has moderate similarity
to
uncharacterized cytochrome b561 (human CYB561), which is an integral membrane
protein
found in neuroendocrine secretory vesicles.
16) KJ_opgba115
mRNA: NM_152309 Protein: NP_689522
KJ_opgba115 has a strong similarity to B cell phosphoinositide 3-kinase (PI3K)
adaptor
(mouse Bcap), which binds to SH2 domains of PI3K and may recruit PI3K to
glycolipid-enriched
microdomains leading to BCR-mediated PI3K activation.
17) KJ_opgba136
mRNA: NM_015493 Protein: NP_056308
KJ_opgba136 has an unknown function.
18) Lumican (LUM)
mRNA: NM_002345 Protein: NP_002336
LUM is an extracellular matrix keratan sulfate proteoglycan that may be
involved in the
development and maintenance of corneal transparency.
19) Matrix metalloproteinase 1 (MMP1)
mRNA: NM_002421 Protein: NP_002412
MMP1, also known as interstitial collagenase, is a matrix metalloprotease that
cleaves
fibrillar collagen type I to gelatin and functions in collagen turnover in
most tissues and may play
a role in cartilage destruction in rheumatoid arthritis.
20) Mitogen-activated protein kinase 8 (MAPK8)
MAPK8 isoform 1: mRNA: NM_139049 Protein: NP_620637
MAPK8 isoform 2: mRNA: NM_002750 Protein: NP_002741
MAPK8 isoform 3: mRNA: NM_139046 Protein: NP_620634
MAPK8 isoform 4: mRNA: NM_139047 Protein: NP_620635
MAPK8 is also known as JNK, JNK1, PRKM8, SAPK1, JNK1A2 and JNK21B1/2.
MAPK8 is a serine-threonine kinase that regulates c-Jun (JUN) and plays a
role in the induction
of apoptosis and other cellular responses to stressors such as ultraviolet
light, reactive oxygen
and hypoxia.
21) Nuclear factor of kappa light polypeptide gene enhancer in B-cells 2
(NFKB2)
mRNA: NM_002502 Protein: NP_002493
NFKB2 is a transcription factor involved in immune response; it may coordinate pre-mRNA
splicing and transcription and may play a role in HIV infection,
leukemia, breast cancer
and lymphoid neoplasia.
22) Notch (Drosophila) homolog 3 (NOTCH3)
mRNA: NM_008716 Protein: NP_032742
NOTCH3 encodes the third discovered human homologue of the Drosophila
melanogaster type I membrane protein notch. In Drosophila, notch interaction with its
cell-bound ligands (delta, serrate) establishes an intercellular signalling pathway
bound ligands (delta, serrate) establishes an intercellular signalling pathway
that plays a key role
in neural development. Homologues of the notch-ligands in human function in
CNS development
and are upregulated in renal cell carcinoma. Mutations in NOTCH3 have been
identified as the
underlying cause of cerebral autosomal dominant arteriopathy with subcortical
infarcts and
leukoencephalopathy (CADASIL).
Alignment of available genomic sequence to the CDS contig identified at least
29 exons.
Screening of 14 of these exons by SSCP revealed 11 conformational variants
within 7 exons, 10
of which were observed in 14 unrelated patients and none of 200 control
chromosomes. Each was
shown by nucleotide sequencing to be due to nucleotide substitutions resulting
in an amino acid
change. Cosegregation of the abnormal conformer with the affected phenotype was established in
6 pedigrees available in the set of 14 patients. The eleventh variant was seen in patients and in
controls, and sequencing showed that it was due to a silent nucleotide change. Notch is known
for its role in specifying cell fate during Drosophila development. It was stated that the only
human disorder implicating a Notch gene previously was an adult T-cell leukemia, which is
associated with truncation of the NOTCH1 transcript. No developmental abnormality or
neoplasia is associated with CADASIL. On the basis of an analysis of
Drosophila mutants, it had
been proposed that Notch may be a receptor with different functional domains,
the intracellular
domain having the signal-transducing activity of the intact protein and the
extracellular domain
possessing a ligand-binding and regulatory activity.
23) Osteoblast specific factor 2 (OSF2)
mRNA: NM_006475 Protein: NP_006466
OSF2 is also known as periostin.
24) Osteoglycin (OGN)
mRNA: NM_014057 Protein: NP_054776
mRNA: NM_024416 Protein: NP_077727
mRNA: NM_033014 Protein: NP_148935
OGN is a member of the keratan sulfate proteoglycan group of the small leucine-
rich
proteoglycan family and may play a role in regulating corneal transparency.
25) Osteomodulin (OMD)
mRNA: NM_005014 Protein: NP_005005
OMD, also known as osteoadherin, is a leucine-rich repeat containing
proteoglycan that
may play a role in bone mineralization and has moderate similarity to proline
arginine-rich and
leucine-rich repeat protein (human PRELP).
26) Plasminogen activator inhibitor 1 (PAI1)
mRNA: NM_000602 Protein: NP_000593
PAI1 is a member of the serpin family of serine protease inhibitors and plays a role in
regulating blood coagulation by inhibiting fibrinolysis, contributes to tumor progression and is
a risk factor for cardiovascular diseases.
27) Prostaglandin endoperoxide synthase 1 (PTGS1)
mRNA: NM_000962 Protein: NP_000953
mRNA: NM_080591 Protein: NP_542158
PTGS1, also known as COX1, catalyzes the conversion of arachidonic acid to
prostaglandin H2 and may be involved in inflammation and blood coagulation. PTGS1's activity
is irreversibly inhibited by aspirin.
28) CCL2 chemokine (C-C motif) ligand 2 (SCYA2)
mRNA: NM_002982 Protein: NP_002973
SCYA2 is also known as monocyte secretory protein JE, monocyte chemoattractant
protein-1, monocyte chemotactic and activating factor, small inducible cytokine subfamily A
(Cys-Cys) member 2, and monocyte chemotactic protein 1, and is homologous to mouse
Sig-je. SCYA2 is a cytokine A2, CC chemokine that attracts monocytes, memory T-cells,
natural killer cells and
endothelial cells. SCYA2 plays a role in the inflammatory response to
infection and in
inflammatory diseases including arthritis, multiple sclerosis and
atherosclerosis.
29) Tissue inhibitor of metalloproteinase 1 (TIMP1)
mRNA: NM_003254 Protein: NP_003245
TIMP is also known as erythroid potentiating activity, EPA, EPO, HCI and CLGI.
TIMP
inhibits matrix metalloproteases including MMP2, stimulates growth of
erythroid cells and
attenuates metastasis of tumorigenic cells when overexpressed.
30) Transglutaminase 2 (TGM1)
mRNA: NM_000359 Protein: NP_000350
TGM1 is membrane bound and catalyzes the crosslinking of extracellular matrix
(ECM)
proteins and other cellular proteins, modulates the ECM, cell growth,
adhesion, signaling, and
apoptosis, and has been associated with Alzheimer's, Huntington, and celiac
disease.
31) Tumor Necrosis Factor-Alpha-Induced Protein 6 (TNFAIP6)
mRNA: NM_007115 Protein: NP_009046
TNFAIP6 is a metalloprotease. TNFAIP6 is transcribed in normal fibroblasts and
activated by binding of the TNFa. Similar to CD44, TNFAIP6 binds hyaluronate
and is involved
in plasmin inhibition and the inhibition of inflammation.
32) Vascular endothelial growth factor (VEGF)
mRNA: NM_003376 Protein: NP_003367
VEGF, which is structurally related to platelet-derived growth factor, induces
endothelial
cell proliferation and migration, vascular permeability, angiogenesis and NO-
mediated signal
transduction. Many polypeptide mitogens, such as basic fibroblast growth
factor and platelet-
derived growth factors are active on a wide range of different cell types. In
contrast, vascular
endothelial growth factor is a mitogen primarily for vascular endothelial
cells. Data suggest that
mutations of p53 and activation of the Ras/MAPK pathway may play a role in the
induction of
VEGF expression in human colorectal cancer. Up-regulation of vascular
endothelial growth
factor by membrane-type 1 matrix metalloproteinase stimulates human glioma
xenograft growth
and angiogenesis. Both VEGF-induced PI 3-kinase activation and beta(1)
integrin-mediated
binding to fibronectin are required for the recruitment and activation of PKC
alpha.
EXAMPLE 10
QTDT analysis of results
Quantitative transmission disequilibrium tests (QTDT) for association between
100 SNP loci and 13 phenotypic traits are reported. The traits comprise
calibrated bone
mineral density (BMD) values for four skeletal sites, the corresponding Z
scores (mean = 0,
variance = 1), the occurrence of fractures, and four other traits not directly
related to
osteoporosis. For each marker-trait combination the significance of
stratification is tested.
The significance of association between marker and trait is tested both
unpartitioned, and
partitioned into between-family and within-family components. These analyses
are
performed for the sexes pooled, and using phenotypic data from males only and
females only.
There is little evidence of stratification, and interpretation is therefore
focused on the
unpartitioned association. Of the many significant associations, the most
consistent are those
with SNP loci OGN_02, OMD_03 and OMD_01. The first two loci show significant
associations with phenotypic traits in all three sub-sets of the data (sexes pooled, males only
and females only), while OMD_01 shows six associations that are significant at the 1% level.
The effect of an individual significant marker-trait association (measured as the difference
between either homozygote and the heterozygote) ranges from 2.8% to 10.4% of the mean
value for the trait in the case of the calibrated BMD traits, and from 0.114
to 0.448 units for
the Z scores. In those cases where stratification is significant, the within-
family association
effect is consistently much smaller than the unpartitioned effect. Six
additional individual
SNP-trait associations are significant at the 1 % level. The most notable of
these is between
SNP locus ITGA08 and BMD in lumbar vertebrae 2 to 4 in males. The effect of
this
association is 4.1 % of the mean value for calibrated BMD, and 0.237 units for
the Z score.
The traits comprise calibrated bone mineral density (BMD) values for four
skeletal
sites, the corresponding Z scores, the occurrence of fractures, and four other
traits. The
skeletal sites studied are lumbar vertebrae 2 to 4 (mean value), the neck of
the femur, the
trochanter, and the total of BMD values over three sites in the hip (neck of
femur, trochanter
and 'inter'). Calibrated BMD values are given in units of g/cm2. The
corresponding Z
scores (calculated within Oxagen Limited) are obtained by adjusting these
values for the age
and sex of the individual, and scaling them so that mean = 0 and variance = 1.
The
occurrence of fractures is scored as 0 = no fractures, 1 = fractures. The four
other traits,
which are not directly related to osteoporosis (though they are associated
with it) and which
are included for purposes of comparison, are the ages of onset and cessation
of periods in
females, and height and weight in both sexes.
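The scaling step used for the Z scores can be illustrated with a minimal sketch. The text does not specify how the adjustment for age and sex is made, so only the final scaling to mean = 0 and variance = 1 is shown. The function name, the use of the population variance and the BMD values below are assumptions for illustration and are not study data.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Scale values so that they have mean 0 and (population) variance 1."""
    mu = fmean(values)
    sd = pstdev(values)
    if sd == 0:
        raise ValueError("cannot standardize values that are all identical")
    return [(v - mu) / sd for v in values]


# Hypothetical calibrated BMD values in g/cm2 (illustrative only)
bmd = [0.85, 0.92, 1.01, 1.10, 1.22]
z_scores = standardize(bmd)
```

In practice the same scaling would be applied to the age- and sex-adjusted values rather than to the raw measurements.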
Statistical analysis is performed using the software QTDT. For each marker-
trait
combination, the significance of stratification is tested. If stratification
is present, the
between-pedigrees component of the marker-trait association is not entirely
due to linkage
and only the within-pedigree component can legitimately be used to measure the
effect of the
locus. However, if stratification is absent the unpartitioned association
provides a stronger
test of significance and a more precise measure of the effect of the locus.
Therefore each
marker-trait combination is tested for association both without partitioning
of the association,
and with partitioning into between- and within-pedigree components. The
interpretation of
the results then depends on the outcome of the test for stratification. These
analyses are
performed both for the sexes pooled, and also using phenotypic data from males
only and
females only.
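The interpretation rule described above can be sketched as a small decision function. This is an illustration only: the function name and return labels are hypothetical, and the threshold is the 5 % critical value of chi-square with 1 DF (3.841) quoted elsewhere in this example.

```python
# 5 % critical value of chi-square with 1 degree of freedom, as quoted in the text
CRITICAL_5PCT_DF1 = 3.841


def choose_association_measure(chi2_stratification, critical=CRITICAL_5PCT_DF1):
    """Pick which association result to interpret for one marker-trait pair.

    If the stratification test is significant, only the within-pedigree
    component is used; otherwise the unpartitioned association is used.
    """
    if chi2_stratification > critical:
        return "within-pedigree"
    return "unpartitioned"


# Hypothetical chi-square values for the stratification test
print(choose_association_measure(5.2))
print(choose_association_measure(1.0))
```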
Table A. SNP loci subjected to QTDT analysis
ACLP02  AKA906  FGF201  IRS102  K11503  KJ1403  MAP803  NOT302  PAI108  SDF106
ACLP04  AKA908  FGF202  IRS104  K11504  KJ1405  MMP101  NOT303  PMX101  SOD201
ACLP05  BMPA01  FOSB01  IRS105  K13601  KJ1_01  MMP103  NOT304  PTG101  TGM102
ACLP06  BMPA03  FOSB04  IRS107  KJ1303  KJ1_02  MMP104  OGN_02  SC2001  TGM103
ACLP07  BMPA04  FST101  IRS108  KJ1304  KJ4701  MMP105  OGN_03  SC2002  TGM106
ACLP08  CHUK01  FST102  ITGA02  KJ1306  KJ4702  MMP107  OMD_01  SC2003  TGM111
ACLP09  CHUK02  FST103  ITGA08  KJ1307  KJ4703  NFK201  OMD_03  SCY201  TIF102
ACLP10  CHUK03  FST104  ITGA11  KJ1308  KJ4704  NFK202  PAI102  SCY202  TNF601
AD1204  CY1701  IGF401  ITGA12  KJ1311  LIF_02  NFK203  PAI105  SDF101  TNF602
AKA901  CY1705  IGF503  K11501  KJ1401  LUM_01  NOT301  PAI107  SDF104  VEGF01
Table B. Phenotypic traits subjected to QTDT analysis
Age_Periods_Started   CalL2_L4BMD    Zox_L24
AgePeriodsStopped     CalNeckBMD     Zox_neck
ht_filled_in          CalTrochBMD    Zox_troch
wt_filled_in          CalHTotalBMD   Zox_ht
fracture_numeric
The script contained in the file heritability/run_QTDT_heritability is then
run. This
script fits a QTDT model with options -a- -We -Veg to each phenotypic trait
for the sexes
pooled, both for the complete set of phenotypic values and with the exclusion
of outliers for
BMD. These options indicate that no model of association is to be fitted, and
that the
variance components Ve (null model) and Ve+ Vg (full model) are to be
estimated.
Heritability is then estimated as
\[ h^2 = \frac{V_g}{V_g + V_e} \]
The corresponding analysis was performed using the phenotypic values of males
only
and using those of females only. This script fits a QTDT model with options
-at -Weg for each
phenotypic trait, for the sexes pooled, in combination with each SNP locus.
These options
indicate that the association between the trait and the SNP locus is to be
estimated, and that the
model is also to include the variance components Ve + Vg. However, the
association is not to
be partitioned into between-pedigree and within-pedigree components. The same
model is
fitted for the phenotypes of males only and for females only. The chi-square
value for
association of each SNP with each trait is extracted from these files and
these values are
stored. They are transferred to an Excel workbook. The unpartitioned effect of
each SNP on
each trait is extracted from the output files and is stored. These values are
transferred to
3rd_export_QTDT.xls, worksheets. The effect is measured as the difference
between the
between the
value of the trait for an individual homozygous for allele 1 and for a
heterozygous individual.
It is expressed both in the units of the trait (cm for height, kg for weight
etc.) and as a
percentage of the mean value for the trait. The script is then run. This
script fits a QTDT
model with options -ao -Weg. These options specify the same model as the
previous set,
except that the association between the phenotypic trait and the SNP locus is
partitioned into
between-pedigree and within-pedigree components. The same models are fitted
for each sex
separately. The chi-square value for within-family association of each SNP
with each trait is
extracted from these files and these values are stored. They are transferred
to
3rd_export_QTDT.xls, worksheets together with the frequency of the rare allele
at each SNP
locus. The within-family effect of each SNP on each trait (expressed both in
the units of the
trait and as a percentage of the mean value for the trait) is extracted from
the output files and is
stored. These values are transferred to 3rd_export_QTDT.xls, worksheets. The
script is then
run. This script fits a QTDT model with options -ap -Weg. These options
specify that the
models with and without partitioning of the association into between-pedigree
and within-
pedigree components are to be compared. The same models are fitted for each
sex separately.
The chi-square value for the improvement of fit due to partitioning of the
association of each
SNP with each trait is extracted from these files and these values are stored.
They are
transferred to 3rd_export_QTDT.xls, worksheets. In each worksheet that
contains chi-square
values, the background of each cell holding a chi-square value above the 5 %
critical value (χ² = 3.841, DF = 1) is highlighted, and the proportion of
chi-square values that exceeds this value is presented
for each SNP over the eight BMD traits (i.e. the four calibrated BMD values
and the
corresponding Z scores), for each trait over the 100 SNPs, and for the 800 (=
100 × 8) SNP-BMD
trait combinations. The chi-square value needed to achieve significance
following the
Bonferroni correction is presented for all BMD traits at a single SNP locus (8
tests), for all
SNP loci for a single trait (100 tests) and for all SNP-trait combinations
(800 tests). The
frequency of the rare allele at each SNP locus is extracted from the file and
is also presented in
each of these worksheets.
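The Bonferroni thresholds mentioned above can be reproduced approximately with a short sketch. It assumes a family-wise significance level of 5 % and uses the fact that a chi-square variable with 1 DF is the square of a standard normal variable. The function names are hypothetical, and the values actually entered in the worksheets are not given in the text.

```python
from statistics import NormalDist


def chi2_critical_df1(alpha):
    """Critical value of chi-square with 1 DF at significance level alpha.

    For 1 DF this equals the square of the two-sided standard normal quantile.
    """
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return z * z


def bonferroni_critical(alpha, n_tests):
    """Chi-square value (1 DF) needed for significance after Bonferroni correction."""
    return chi2_critical_df1(alpha / n_tests)


# 1 test, all BMD traits at one SNP (8), all SNPs for one trait (100),
# and all SNP-trait combinations (800)
for n_tests in (1, 8, 100, 800):
    print(n_tests, round(bonferroni_critical(0.05, n_tests), 3))
```

With a single test the function returns the uncorrected 5 % critical value quoted in the text (3.841 to three decimal places).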
The mean and heritability of each phenotypic trait, for the sexes pooled (with
and
without the inclusion of outliers) and for each sex individually, are
presented in Table 1. As
expected, for the non-BMD traits the exclusion of outliers makes little
difference to either the
mean or the heritability. For the BMD traits (calibrated BMD values and Z
scores) the
exclusion of outliers consistently raises the mean value and lowers the
heritability. This is to
be expected, as the outliers were identified on the basis of high BMD.
Heritability of the
BMD traits is strikingly lower in females than in males. In particular, that
of CalL2_L4BMD
is zero. Conversion of the BMD values to Z scores causes a small but
consistent increase in
heritability in females, but has no consistent effect in males.
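As a worked illustration of the heritability estimate, the ratio of the genetic variance component Vg to the total variance Vg + Ve can be computed as follows. The function name and the variance values are hypothetical; they are not taken from Table 1.

```python
def heritability(vg, ve):
    """Heritability as the share of total variance due to the genetic component."""
    total = vg + ve
    if total <= 0:
        raise ValueError("total variance must be positive")
    return vg / total


# Hypothetical variance components (illustrative only)
print(heritability(0.3, 0.7))
# A trait with no estimated genetic variance has heritability zero
print(heritability(0.0, 1.0))
```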
The test for association between a phenotypic trait and a SNP within pedigrees
is less
powerful than the non-partitioned test, provided that stratification is
absent. It is therefore to
be expected that the chi-square value for non-partitioned association (-at
model in QTDT) will
be larger than that for association within pedigrees (-ao model), unless there
is strong
association due to stratification in the opposite direction to that caused by
linkage. In the
present case there were no exceptions to this expectation. The significance
tests for
stratification (given by the -ap model) are summarized in Tables 2 and 3.
There are a few
significant values (P < 0.05), but not many more than the 5 % expected by
chance. It is
therefore concluded that stratification is not strong or widespread in these
data, and attention is
therefore focused on the non-partitioned model of association.
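The comparison between the observed proportion of significant tests and the 5 % expected by chance can be sketched as below. The function names and the chi-square values are hypothetical; the only figure taken from the text is the 5 % critical value of 3.841 for 1 DF.

```python
def proportion_significant(chi2_values, critical=3.841):
    """Fraction of chi-square values (1 DF) that exceed the critical value."""
    values = list(chi2_values)
    if not values:
        raise ValueError("no chi-square values supplied")
    return sum(1 for v in values if v > critical) / len(values)


def expected_by_chance(n_tests, alpha=0.05):
    """Number of tests expected to be significant under the null hypothesis."""
    return n_tests * alpha


# Hypothetical stratification chi-square values (illustrative only)
observed = proportion_significant([0.5, 4.0, 1.2, 6.3])
# With 800 SNP-BMD trait combinations, about 40 are expected to exceed
# the 5 % critical value by chance alone
chance_count = expected_by_chance(800)
```

An observed proportion close to 0.05 is what would be expected in the absence of stratification.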
The significance tests for non-partitioned association are summarized in
Tables 4 and
5. Summarizing over all BMD traits and SNPs, the proportion of significant chi-
square
values (P < 0.05) is substantially higher than the 5 % expected by chance in
the sexes pooled
and in males, but fairly close to 5 % in females. This suggests that less
emphasis should be
placed on the search for association between SNPs and traits in females,
though individual
SNP-trait combinations with highly significant chi-square values, and SNPs
associated with
several traits, are still worth pursuing. In the sexes pooled there are nine
SNPs each of which is
associated with four or more of the BMD traits, and in the males only there are
eight SNPs that
meet this criterion, whereas in the females only there are only five such
SNPs. The identity of
these SNPs, and the chi-square values for their association with each BMD
trait, are presented
in Table 6. The chi-square values for stratification are presented in Table 7
for the same SNP-
trait combinations, and the magnitude of the effect of each of these SNPs on
each of the BMD
phenotypes is presented in Table 8. The SNP loci OGN_02 and OMD_03 show
significant
associations with phenotypic traits in all three sub-sets of the data (sexes
pooled, males only
and females only), and OMD_01 shows six associations that are significant at
the 1 % level.
The difference between either homozygote and the heterozygote in these marker-
trait
associations ranges from 2.8 % to 10.4 % of the mean value for the trait in
the case of the
calibrated BMD traits, and from 0.114 to 0.448 units for the Z scores. In the
cases where
stratification is significant, the within-family association effect is
consistently much smaller
than the unpartitioned effect.
The six additional individual SNP-trait associations that are significant at
the 1 % level
are presented in Table 9. In none of these is there significant evidence of
stratification. The
most notable of these associations is that between SNP locus ITGA08 and BMD in
lumbar
vertebrae 2 to 4 in males. The effect of this association is 4.1 % of the mean
value for
calibrated BMD, and 0.237 units for the Z score.
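The way an effect is expressed as a percentage of the trait mean can be shown with a minimal sketch. As described earlier in this example, the effect is the difference between the trait value of an individual homozygous for allele 1 and that of a heterozygous individual. The function name and the numerical inputs are hypothetical and chosen only so that the result matches the order of magnitude reported.

```python
def effect_percent_of_mean(homozygote_value, heterozygote_value, trait_mean):
    """Effect (homozygote minus heterozygote) as a percentage of the trait mean."""
    if trait_mean == 0:
        raise ValueError("trait mean must be non-zero")
    return 100.0 * (homozygote_value - heterozygote_value) / trait_mean


# Hypothetical calibrated BMD values in g/cm2 (illustrative only)
percent = effect_percent_of_mean(1.041, 1.000, 1.000)
```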
The strongest and most consistent associations between SNP loci and phenotypic
traits
related to BMD are at loci OGN_02, OMD_03, OMD_01 and ITGA08. The first three
of
these each show several associations significant at the 5 % level, with
effects ranging from 2.8
% to 10.4 % of the mean value for the trait in the case of the calibrated BMD
traits, and from
0.114 to 0.448 units for the Z scores. ITGA08 shows significant association
only with BMD
in lumbar vertebrae 2 to 4 in males, but this effect is significant at the 1 %
level. Its
magnitude lies in the same range as those at the other three loci.
Some of the polymorphism markers and their association with various effects on
susceptibility to low BMD and bone damage, and hence osteoporosis, are
summarized in Table
of U.S. Provisional Patent Application Serial Number 60/423559, entitled
"Nucleotide
Polymorphisms Associated with Osteoporosis" filed November 4, 2002, which is
incorporated
by reference in its entirety.
In addition, Table 10 of this application provides a list of the polymorphism
markers of
the thirty-two (32) genes listed in Example 9 which have been found to have
various effects on
susceptibility to low BMD and bone damage. Tables 11 and 12 rank into groups
the
polymorphic markers by the relevance of their association to the
susceptibility to low BMD by
sexes (Table 11, males; Table 12, females). The markers ranked in Group A
are those
that show the strongest association with susceptibility to low BMD; Group B
shows weaker
association and Group C shows the weakest association.
EXAMPLE 11
Gene-Gene Interaction between OMD and ITGAV
The gene by gene interactions were assessed by logistic regression. The
interaction
was assessed for every pair of OMD-ITGAV SNPs (OMD01 and OMD03 versus ITGAV02,
08, 11 and 12). The logistic regression models were
MEN: OP or Frx (0 or 1) = OMD(snp) + ITGAV(snp) + OMD(snp)*ITGAV(snp) + age + weight
WOMEN: OP or Frx (0 or 1) = OMD(snp) + ITGAV(snp) + OMD(snp)*ITGAV(snp) + age + height
The weight was not included for the women and height for the men since they
did not show
significant effect on OP or Frx in the sample set used.
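How an interaction term enters a logistic regression model of this kind can be sketched as follows. This is not the fitting procedure used in the study: the genotype coding (0 or 1 for carrying a given variant), the coefficient values and all names are assumptions made for illustration, and the sketch only evaluates the model for given coefficients.

```python
import math


def linear_predictor(coef, omd, itgav, age, covariate):
    """Linear predictor with two main effects, their interaction and covariates.

    omd and itgav are genotype indicators (assumed coding: 0 or 1);
    covariate stands for weight (men) or height (women).
    """
    return (coef["intercept"]
            + coef["omd"] * omd
            + coef["itgav"] * itgav
            + coef["omd_x_itgav"] * omd * itgav
            + coef["age"] * age
            + coef["covariate"] * covariate)


def probability(eta):
    """Logistic function: probability of the outcome (OP or Frx = 1)."""
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (illustrative only, not estimates from the study)
coef = {"intercept": -2.0, "omd": 0.3, "itgav": 0.1,
        "omd_x_itgav": 0.8, "age": 0.02, "covariate": -0.01}

p_both = probability(linear_predictor(coef, 1, 1, 60, 70))
p_omd_only = probability(linear_predictor(coef, 1, 0, 60, 70))
```

The interaction coefficient contributes only when both genotype indicators are non-zero, which is why a pair of SNPs can show a significant interaction even if neither main effect is significant.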
The study group was made up of unrelated individuals with osteoporosis (OP)
from the FAMOS cohort who (1) had been diagnosed with OP, (2) had
fractures and had a maximum Z (spine) of -1 or (3) were probands and had a
maximum Z
(spine) of -0.5. For fractures, unrelated individuals who had self-reported
fractures were used.
Healthy individuals were unrelated subjects who were not probands, had not
been diagnosed with OP, had no fractures and had a
maximum Z (spine) > -1.
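The selection criteria above can be written as a simple classification rule. This sketch rests on an interpretation that the text does not spell out: "a maximum Z (spine) of -1" is read as a spine Z score no higher than -1, and likewise for -0.5. The requirement that individuals be unrelated is not modelled, and the function name and labels are hypothetical.

```python
def classify_subject(diagnosed_op, has_fracture, is_proband, z_spine):
    """Assign a subject to the OP group, the healthy group, or neither."""
    if (diagnosed_op
            or (has_fracture and z_spine <= -1.0)
            or (is_proband and z_spine <= -0.5)):
        return "OP"
    if not is_proband and not has_fracture and z_spine > -1.0:
        return "healthy"
    return "unclassified"


# Hypothetical subjects (illustrative only)
print(classify_subject(False, True, False, -1.2))
print(classify_subject(False, False, True, -0.7))
print(classify_subject(False, False, False, 0.3))
```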
Regardless of whether or not the individual SNPs had significant effect, the
following
interactions were found to be nominally or nearly-nominally significant:
Table C
ITGAV SNP   OMD SNP   TRAIT   GENDER   P-VALUE
11          1         OP      Women    0.06
2           3         Frx     Women    0.1
11          3         Frx     Women    0.1
12          3         OP      Women    0.05
2           3         OP      Women    0.05
8           3         OP      Men      0.02
11          3         OP      Men      0.06
The odds ratio of the predisposing variant (AA for OMD_01 and A+ for OMD_03)
was computed by logistic regression, analyzing separately the two or three
ITGAV genotypes. The significance level of the association was determined by
computing Wald's chi-square for the OMD SNP (model tested: OP = OMDsnp + age +
height (if women) + weight (if men))
independently for each ITGAV genotype. This is illustrated by the plots in
Figure 1A-E.
These results are the first evidence to demonstrate that OMD SNPs show a
significant
association with osteoporosis in both men and women. In addition, these
results provide the
first experimental evidence that ITGAV and OMD interact in bone metabolism,
as well as
confirming the hypothesized role of OMD in promoting integrin-mediated
cell binding.
These results further indicate pharmacogenetic implications, since various
integrin-targeting compounds are currently in development (e.g. at GSK and
Pharmacia) and the OMD variants may
affect the
efficacy or ideal dosage of such compounds.
OTHER EMBODIMENTS
Other embodiments will be evident to those of skill in the art. It should be
understood
that the foregoing detailed description is provided for clarity only and is
merely exemplary.
The spirit and scope of the present invention are not limited to the above
examples, but are
encompassed by the following claims.
The disclosures of all patents, applications and publications mentioned above
are
expressly incorporated by reference herein.
[Tables for Example 10: tabular data not recoverable from the source scan.]
<IMG>
190
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
r
0
0
0
V
d
:::::
>v~<>~=:;;'~
X~ I~ Lt7 r OW.C7 :~~;'#~:: I~ ~ C7 N ~ r O I~ ,~~',' O CO I~ O) C~ r p
p C~~7 d' O r I~ :,'~:z?t::0 O 00 ~ ~ d0 d7 O ~ d' W r r 00 O C'~ _
N r r O O r :;~'''~:: O O O O ~r O O O ,"r O r O N O
U :::::: :i~:::::; ~::::~:: ._
::. :~.'~:::::: :........
:::: . .. r ~ ~:.~.SS. . O ~ O ~, r ::'~'.:'.'~:: :. _
.. C'C O O N O CC
~ N d' r N CC :~::.'.I~ ::~'~<; ~~7:;:;::: C
x r O r N N ::::: 0~ i~ C'~ ::;3;.::.C~ d' O ~ O ~t»:;:::N O I~ r ~ O
O .O r O O N ~'4~':> O ~ O >:~:~.~,~"S,('~~~':N C'~7 O O O ~~E7'N O O O ~-
...__ O O
.:v...: ..:
:::~::.~..
p :. .~.:.. .::::.; ss::
O U ~'''': .:..;.sy~ >
C .::.a::::. '..::::::~':. U
p
CO d' r O r N O OO CO O C'~ CO CC O ::~~'~.~1??'~' r r
p Cr7 N O r ~ d' CO O Cfl d' N O O C4 O ::v ~ O W- ::.~',h,~.:.::~::O
N r O O O O Cr7 ~- O O C~ O CV O O O 1~'j:C~j~ O O O #'.'.~'.~.~'.O U
C
O
c~ z
C x~ O O O O t.C~ O C~ O .d' ~ OD C'~ O 1~ N N r O d' I~ N d'
p r O O O 'd' 00 Ln r O O N O O CO 'd' td7 d' r L(7 Ln ~S7 O O
N O O O O r N r O O CV O t- O O O N O O O O r O
L. ._:~:_..~:. :::::::::.>.: V
.........
.:..~.w. ':~..~.~~:
"~>
._
.::::::: ~ ':::>:::
.., ~..... 0....s
. s .~.."..
,v. ~::'::'::
.:::::::: :~::::::::: ,U
'U = I~ ~(7 O O) N CO CO 00 I~ kf5'.'.~.::0 I~ I~ C'0 I' :QV'.'': 1.f) C~~ r O
CO C~'~
~ 00 CO O O C'C d' 00 00 00 :t~,3,?:::~ C'~ O) r r O :'..EY7::::a: 00 ~ 'dy O
00 Cr7
,- r O O N N O O p ~::::.,~:,; . . . . . ,. ,, . .
::'Cl~.:'"v' '.'. °%::: r O r O r O
() U ,... ..:._. O r r r O
~ ''...2.;,
:~::~::~::~:;~: p
..... a..
..... .
::::.y::
..... ...
:' :v~::' >,
I- 07 N 00 O r LC? C~ Cr7 r I~ O O 1~ r d' '00 N Cfl ~ C~ C'rJ
N ~ ~t7 r r r O 1~ O tS7 C~0 N d' r 00 Cfl C~C ::C3::::: N ~h' I~ O CO O
U O r O O N N r r O N r ~ O O O -."!~~:~>.: N O O O r O
.........
'.'~'.'~:~'.>:~2;:
U .........
'.:. O
r z LC~ I~ o r ~ 00 I~ 00 00 CO d' r ~ 00 O ~v:' 00 d' I~ N ~ CO
~ O P7 O O ~ 00 ct C~ 00 O O ~ N 1~ f~ ~:1~: N ~C7 O N O r V
:: ::.::
U N O O O r r r O O r O C''~ r O O :~1'::: N O O O C~ O
p
> N~
-1 O N r O I~ Lf7 I~ d' ~ N d' CO O 1~ r N C'7 O d' r O d'
(jf O O O O 00 CY7 I~ r O I~ O O l.~ CO O d' '~' r Lf7 O In O
C~
U O O O O r N r r O r O r O ~O O N O O O O r- O
p p (~ N p ~ CCi C~ N p ~ N ~ N CLi p N p N Cif N N U
U r~ °Q. °~~ E ~ °Q..~ ~ E °Q Q.E Q.E ~ n.E
~..~ ~ E ~
N r ~ r 1.
~ O O O O O O r O O O~ ~~~ O~ O O _O
O ~ ~ ~ ~ ~ ~ ~ ~ LI ~ / [5
Y ~ '7 ~ ~ '~ ~ U- ~ ~ ~ ~~'J- Z
i- ~ Q Q Y Y Y Y Y Y Z O O O a. I- r
191
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
~ ~ m
0 0 o a~ o
O . O U >
O
C
I ~ ~ ~ (0
~
X r o r r r r r C'~V 7 m O h
m '- r N ~' N ~ n C N
O c V
4 M N
O ~'~,'V' h , .
0 ' C~ c9 . .
N
N ~ q o 0 0 0 o q q 0 o q o 0
a o o o 0 q 0
q 0
, m
-
v ~ m
o o '
s 0 0 0 _
0
0 ,,
0 o ~ N ~ ~ n n n ~ m u~ ~
I T o ~ o~ ao o o o m N
~
X M O '~O O ~ O)N O If7 n T
r N ~ N C
N
.~ r , m ~ ch ro
O O - o N ~ N
~ N
m
O O O O O O O O O C
N ~ 0 0 O 0 0 0 0 0 O U
a 0 0
0
m
O ~ U
y O O ~
N t~
~ ' ' ~ "' V t0 a
N t0
'
r r - ~ T 'n ; . ' a
r r r N l~ T n ~ m
; N o
th N
N
O r N - d. O
.d. o
N o q o q o o q 0 0 Q q
a~ q o o 0
0
t N
'~-' t' O
N ~ O
XI ~ C~ ~ ~ C9 m ~ T N V CO m 0 vJ
~ 7 ~ ~ O M O
r r
_ W, ch N U
O r O ~ r r (h V N N N
N
r
O O O q O O O O O O
N 0 0 0 q 0 0 0 0 0 O O ~ .
a~ 0
0
~ >
m v t
v
E o o c
o ao coa> '~u~h, '~ o> N
V '- cD N 07V . In o t0 co
. m . to '- O
. c0
o CV N CVN N njnjIfjf0 ~ ~ c0 to
o r ~,, r ~ to ~ V v
tp '
, O
O
D U
v m
0 0 ~ a~
a o q ~n
= N a0 ,~.V LU M N OJN n N M n _ d.
U n M N t0 t M V, ~ N O N
N 07 n
(~'J
0 n V (D t
c0 N O N O O N N p O (O n V ~t ciY
O r O '~' lU O O O a
O O '~' O O
. O O.O . . O O U
U o q o q q o o q O O o q o
a~ q o q q O q q
o 0
0
0
r
a
U
~ r N = O
r V-
(O ~ N r N N ~ ('7N ~ ~ f0 (p (O C~J ti ~
N ~ N 07 N Ln N
(D cQ
, O
~
_ N
U U
E-
C CII O G N
O O
U
O
w cn O O ttO n V m ~ O t0 O ~ O
_ N O IO N n U
N
N O O O N O O~ N n O O
O N ~ N ~ ~ O O
O N
~
n ~ o 0 0 0 0 0 0 O
. 0 0 0 0
0
0
0 0 0 0 0 0 0 ~ o
U 0 0 0 0 0 0 0 0 0 0
a~ 0 ,
0
a
a~ ,
c
o ~ ,
c
m ~
io
'
E '. ~, ~
a
O N U
O V n n r tU(O ~: O) CO d
n (~ N W
O U
V o N CV N N C~ N N 47(O n O c0 PJ fn ~
CV N M ', ~ 47
Ch
~ M . p =
~ O
a
~ V t
~ ~7
.'
..
p = a~
~ vl
m o m m n o~ O N ~.n _N
N Z rn M N N M O V LU O O v, 41
U N N M n M N v
N l0
.
yN O O N O O N N O..O ~ 07 N V = O
N O N O - O' O O M N O 07
- O . i ~t O O O
O O O O
O
O
U o q o q ~? o o q o 0 o q q v
a~ q o q 0 0 q q o
0
~
0
y o ~ s
' ~a
m ~
a
~ U ~
O _ UI
~
U O ~ r N n c~7C~7~ M t0 O) N O 'C O
N '~ M N l0 ~ (U
O n
. D N CV r N N N N CVCV oD . ~ (O
1' o ,r CV ~ CO .d.d'
M tn (h
N G ' ' '
.
. C N
t
O -~ _ O N
I N Ch N (D M M COM (p M 07 t0 _ "'
U ~ N ~ N M 07 a0 ~ O
o 07 r
N N T.N N N N IfjN t0 C
N ~ ~ O O O ~h U
O ~ O V
O O O
O O O
O . ~ r
a~ U o q o 0 0 0 0 o 0 ~ . ~
n~ q o 0 o q q o
0 q 0 0
' - 0
U
_ , >
~ O U ~ ~ ~ ~' N
0
O rt5 Q7N N m N N N 0
m f0 N c0 N N ~
U f
a m ~ E E E .
a E ~ ~
E
~
$.a a a a ~ ~ E ~ ~
' E E >
a
C ~ ~
. O
E ~a
~ 0 r N o o o o o
m m o 0 o o I I f ~
m I I W co c~7n Y Z D D M o
O 7 --j-7' 7 7 LLCJ ~ ~ .f- = O
H' 7 z N
~
_ Q Y Y Y Y Y Y Z O O O y- ~ N
Q o
192
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
Y
rtr
O
r
O
O
II
U
U
~o
0
r
+~
U
X
N
C
O
ctf
~
U
O
CLf
C
N
f
O
N
O
CCS O r ~ ~ O
o i ~ N r d'
i
O N ~c7 Wit'
I~
r O O
N
O O O p o
O
O
j
O
U
O
N N O N r
C'~
v) a..L(7O Cr7O C~
O
_ O O O r C~
cn O
U
N O
pp L O
~
~ O C~
O
~U
.
O O d- O COr cf'.
N
~ _C~ O '00 O I' N
r
. U c~accic~ co00 ~ o
r
D D
o m m ' U
c m O
c
' - ~ ~ ~ n
J u~
C .d d ,
~ I O O ~ CLS
~
~ F-
o l J Z Z --
~ l I
O . x _ _ _ _
~ O x ~
L ~
O
Q f- N U U U U
N
a~ a~
L
ca u7 a ~
cd
a~
O
~ N N O O C~ I ~
N Q t~ D
LL 1- Q .
U ~ ~.
E- ~ L. ~'-' Y
~ L
193
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
,;5 0 ,~ o ~ o
~ Q ~ Q ~ n.
a~ a~ a~
U ~ U ~ U
C ~ C f~/~ C
(CS
d d D_
i _d
d'
yC y-
U f.3 U C'3 ~- U a U ~3 U a I- U (.~ (3 ~- U
a C'3 U I- U U U (~ C'3 ~ E- U U ~- a a U I-
H U U U H U tU- Q (.a'3 a U ~ U Q C3 U U (.3
U IU- U U E- Q U U U U U U U U U (~ C~ U C~'3
a a ~ U a~- a a U C'3 ~
UUUQ~'C'3~ UQ~aC3UC~'3 UUUUQUU
cn U U a C'3 ~ C3 U U U !- ~ Q U I- Q Q U U C3 C3 U
IU-~UQU~U UUUUU~ UaUUaQ
U U i- (~ U U a U C3 U a a U C'3 U U C'3 U
U a I- C'3 C3 U U a ~- C3 a U C3 a U C3 U
U H C'3 C~'3 U C3 C3 C3 ~ Q C3 U U C3 U U C.3 U
t
..
o U >-
0
a
I- C3 U U C3 U a C3 C'3 (~ U U a (~ U f.3 U a U a
a a U a a U U U C'3 h- C.3 a U ~-- I-- ~ C3 U C.3 E-. U
~- a ~' U a a U i- a C.3 U ~- U U a ~-- I- I- U a
U C3 (~~ a I- C3 U a U I- I- C5 a U U U (~ U U U C'3
C'3 I- U C'3 C'3 U a U U U (~ U I- U ~- I- H- I- t-
a' U U a a U U C3 I- t- U U U C3 I- C3 C3 C3 U U U
as ~ a a U a C.W- U U I- ~-- a U C'3 U C~ a C~ C'3 U U U
N C.3 a Q a a ~ U U C.3 U U C'3 U U i- a U I- U U C.3 Q C.3 Q
in a f- U C'3 U (~ U f- I- a U U C3 C3 I- U (~ I- U a I-
U C.~ (.3 U E- U U C.~ U y- U a C'3 ~- 1- U U U U
C3 C'3 ~ C3 U U U U a U ~ (~ a H a a C3 (3 I- U C.3
C.3 a a C'3 a U C'3 C3 ~-- ~ U U U C3 I- C.3 a I- U a
a U U U L'3 U (~ U U U a I- U a a U ~- I- U U I-
U C.3 ~- U a U a a C~ ~"' U a a I- U t- (~ C.3 I- C3 U
N d'
D_ d D_
U U U
a a a
U
d U
U
N
U
U a. a
J J
U Q Q
a
d
c5 Q a Q
194
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
a~ a~
L
L
o o
,;c .~
~
Q. -a n.
w a~ a~
~U ~ -U
.C ~ C
f~ dS
d d
z z
O
~ o
d N
O
'rte
~
.
=
U E- C.3 U U ~' (3 U U U C.3 U C'3
~""' a U U f-
~. C3 I- C3 ~- U U C3 I- C3 U a C'3
U U a ~ C3
C3 ~ U (~ U U I- U C.3 C'3 C'3 ~'
I- U a U 1-
a U E- U I- U C'3 C3 a U ~ I- ~'3
U U C'3 C3
~"' ~ U
U a U
I-
'
a U U
C~ i- a
U I- (~ C
U U C'3 U U U a U 3 C3
U U U U a (~ C5 U
C.3 a C3
a
U U Q ~ Q U U U ~ '3
~ U H H U Q U ~ H
H C
~ ~ ~ a I- U E- U C3 U U C3 U U C3
i~ a U C'3 C'S
U a U I- U a U C5 U a ~ I- C3 C5
U U C3 C'3 U
U I- a U U U ~' U C3
~- Q ~ C3 C3 U a C.3
a'3 a U U ~ ~ ~ U U ~ V CU3 U U C3
Q C'3 U
C
. a a a C'3 U H U
~ U C~'3 (3 U U H U 1-
~ Q '
U a '
U U
a
I- U 3 C
U (.~ a 3 U a U t-
U U a a U U U U U C3 U C
C3 a t-- U U U 1-
U C'3
.Q o >' >'
0
a
U U a C'3 U C3 C'3 I- U a U U U C'3
~- C5 ~ U C3 C'3
t-- U U C3 C~ I- U t- C3 U a C'3
a U I- U I- (~ a a U
U a a 1- C'3 U C'3 a U C3 U ~'3 U a
I- (.3 a U a I-
U U ~ U C.3 I- C3 U C3 C'3 C'3 a ~
~"' U C3 C.3 a 1- a a U
' I-
'
'
a U C3 U I- a a C 3 C
3 C3 C'3 C3 3 a a (~ U
U U U a a a C
Q U
a U U
U N U
U Q
a
U
N - ~
~ U C'3
U t
l - C3 H Q
- U U a U a C3 C3
l ~ U a U
I-
-
a C3 ~ a U C3 C3
C3 C'3 (~ C'3
C3 C.3 U U ~ C.3 ~ ~ U I- (3 C3 a
IU- C3 U ~ U ~a U Q C3
U C3 U U H U
U '
C ~' U
3 ~
(,~ U a
~- C
. 3
I- C3 E' U I- ~ C3 U I-
a U C.3 U a 1- ~ U U a U
E- ~ U I- C
Cy 3
U a C3
C
3
~-. Q .
- I- I-
U U U C3 I- a
. C3 U a a a U I-
U U U a I- a
C'S a a C3 C3 I-
a C'3 C3 a
r
d 0~_.
U U
a a
as
R U
U U
d M
a
U
a
d ~ a_
G J J
a Q
o ,;~ o
~ ~
~. fl..
a~
W
~U ~ ~U
C N .C
(~
z z
y d o
= d o i
d
'rte r N
~
.
=
~~~Q~~~a~ Q~~~~~~aa
a I- U U U U U C'3 U I- U (
3 C~ C3
a .
I- U (~ U a U U U a U
U U ~, U a a a ~- C'3
U U
U U U U U U U V Q Q N a a Q
a a C3 a
~ ~ C'3 a U Q Q U
' U U ~ U
U
U U
C (.3
3 ~-
C.3 UU
U U
a
V ~~Q C
~"~ C '3
'3U~~U QC3Uf.3V~V
I- U C3 U ~ a C'3 U ~ C3 U a U C3
(3 C'3 C'3 I- a
C3
U ~- U C'S a C3 C'3 U U U (.5 U
I- U C3 a C3 C'3 U U
U ~
U U
U a
a U
a
~
'3QC '3Q
-HC UC
'3C '3
'3HU I
HI -~~
U U a C3 a a C'3 '3C
U U '3~
Q ~ U U C'S I- U
~
(~ I- U U I- C'3 U
a U I- U C'3 E- U I-
a~'3C'3C3C'3C3C'3t-UUUC'3aaC'3UUU
U
a
f
o
a
U U U ~-- U C3 U ~ I- C3 I- C.3 ~.
U ~' (.3 C'3 a U U
~-Qa~ e~c~ ~ c~~c~aa~aaa~~
a C.5 U ~ C3 C3 (3 a t- U U U U
~ C'3 ~ C~ U U ~.
C3 Q a U IU- V U Q U H Q a U C~ C~
U U V U IU- U
N U'3C~C'3C~'3U C.~~C'3t3UUQC.~'3(~'3C'3~
UUU~C~~--C
. C~'3UHCa'3C.3UUC.3UUH
UUQaHUUC~'3QaU ~ I- U
U U U C3 ~
U ~'3 '
U a (~
U E- C~ C
~- ~ U U ~- a a 3
a C'3 C.3 ~ a a U ~ C'3 U U
I- U U
U
HUi-U-(.5C'S~~U~H -UU
C.a'3UQUtU-V C3Ut
U
'
U ~
'
a
U
U U C3 (.~ C3 a U
~- U a C.3 C3 C
3
C
3
C3
N
a
N U
T
U
U
a ~-
Q a
c~ Q Q
L
U
f~
L
O
w.. ~~ O
C7 'a ~.
C
LIJ f~ N
~U
.C
N
f~
N
Z
L
d o 0
d'
U U ~ U ~ V ~ C.3 C3 U a a Q
U IU-
a
VU~UUUQQ~ QQU~
Q
a
QH~~aC'3UUV -
U QC.a'3aQC3l
- V U ~ U ~ ~ ~ Q ~ Q
Q U ~ UI Q
c~ U V ~ U ~ Q C3 C'3 U ~ ~ U Q
U Q U
~ U ~ U ~ U ~ Q Q V Q C'3 a
Q C~'3
U ~3 U U Q C~ C.~'3 a U a Q H
C3 a Q
' a a
U a
U
f- U C.3 ~ I-
C a
3 (.3. a C3 a C'3 a C'3
U F- ~- (~ U (~" (y- (~
C~
O
N
'
s
a
o Y '-r
0
a
c~c~Uac~aUac~c~c~ c~Q~c~aUa
aac~c~U~Uac~c~c~ c~~c~~-ac~Q
~~Qaa~~~~~a ~Q~aa~c~
a U U U ~ Q a U H ~ U C'3 CU'3
U IU- U Q U Q
N C3UUC.3UUUUUUUU aU~~".UUU~-
U
U ~ ~ a C'3 ~ U Q Q Q I-a- Q
U C Q Q Q
'3 U
U~~UU~~UU~~ ~QNQUQQ
a
~QC'3C3~UUU~C3U la-C'3C.a'3QQQQ
N
O T
N T O
Y
U a
a
N
L
Y
a
o>
Y
Q a
~s ~ o ~ p '~ ,~, '~ '
p
c~ ui
~
_ ca ~
_ c~ c ~~o
~m o
~p ~m
\
o. a ~ ~
~ ~
~
o
N ~ m ~ ~
~
~
~ c ~ _ o
~ N ~
~
.
E C~ -f- O ~ c~ cfl N ~ :--
~ ~ M O ~ O
'~ '
""'
Q LO
'
d'
N o - L N C
~ ~ ~ ~ .S
~ ~ ~ '~~
_ ~
C ~ - p~
- Q~s~_a ~,~o ~o ~.~o~~ ~ >
L ~~ +'
N (SS N ~
~
,~ N N O ~ ~ U >' N
UJ ~ ~ N ~ Ln ~ ~ c~
~ ~ ~ d' N ~ -
~ nj
= C
Q
O ~
~ ~ ~ ~ U
~ U V ~ N
~ U ~ ~ N ~ ~
o O
~ N
~ '~
'~
Q C N _~
0 p ~ -p
~ ~
tO
~ N C ~ ~ ~ ~
Q ~ ~ Q
~ Q ~
. . O
i C .
N ~
tn
~
.
~
U
~
~ tn O
U '~ O ~ p fLf
(t~
p c~ ~
~ >
Q
>
'~ O
~. N L
a _
O ~
O o
O M
_ r
~
O
~ ~
U
U
a
I-
a C'3
a
a a U
a U a ~ ~
M ~~~a
Q
a
C'3Ua~- U
U I-
~
U
U
~Q
-U
I
ac~a a
0
0
a
E-aa~-Uaa
a U a U I- a
U
H
U
Q
I
- C3 U
U
C3 U U C'3 I- a
Q
-
C3I-Uaa~'I
1- U a I- C3 U a
(~ I- a C~ C3 U U
ir7 U a a a a U a
a U U a (~ Q U
aQ~
QUQU
~c~a~Q
c~aaU~-~Q
N
O
_ M
O
O
Y
'- a
tC
d U
R
c a
d
L a
to Y
a
d
C
Y
a
L ~ ~ ~ ~ ~ O ~ T Y ~ T U
C ~ .° -O U ~ O N
o ~ .~ ~ ~ m O ~ ~ Q ~ ~ ~ O r C r o
~ O Q
Q. O ~ p ~ C
~ ~ v N .~ .~- T .~ L ~ N C ~ >' E O p
o ~ r ~ a~ o o ° -a ~ °_> ~~ Q- > ~ n. ° ~ ~
G 'Q ~ ~ c O o ~ O N i ~ In O ~ -p »-~ ~ ~ ~ m o
O \ ~~ .~. N >. C V ~ _N' ~r L O
.,... o O C~ ~ ~ p
LL! -O ~ fo' v~ ~ ~ ~ 'a c ~ C C~ 'ln ~ Q. ~ ~ ~ ~ V
C ~ '~ O C CY7 V C f~ .,r ~ U Cn O
.~~r 'O In C ~ f~ U lO O
C p ~ p M ~ ;-' ~O U
~U ~ O N Q _N 'p (~ ~ V 'U .~ CO O
~3 ~ O ~ ~ c~ ~ cn ~ ~ 'a
L
'c °' m ~ -a U .~ ~s ~ ~ ~ ~ a m tn fn
s.
0
'r~~ ~ = N '_
aQa~a ~~Qaaa
ac~~c~a~ QaaQQQ
a a a a a U f- a (~ U
a C'3 f- U C'3 ~ U I- U a Q U
U C'3 C3 C.3 a U a a I- U a
a Qaaa~-E- a a
QQQC~c~c~ c~Q~aa~a
aa~~~Q ~~Q~~e~
a UU~Q~a
~~a~~~ ~a~~ac~
U UU
UCa'3t-~Q~ UQUCa'3C3C3
O N
Q.
Y >-
0
a
Q Ca'3 Q a (.a'3 U a~-- V ~ U U CU'3 ~ U
aUUaUI-I~- QQ~C~'3QU~
Qt-QQC.3UU C3C'3UaUQ~
I-- C3 a
Q ~ Q U Q ~ ~ Ca'3 Q U I-I-U- C.~'3 U C3 a
U
~ U a U U ~ a V ~ V ~ U C~
a
~ N Q a ~ ~ Q H Q H U Ca'3 H
N
op O
O
Y
L a a
U
a Q
d ~ a
c
Q a
c c
._ ._
O D
m ~ ~. ~ ~r
L ,- ~ o ~ o
,~, ~ ~ o '~ o '~ o
d ~ ~ °- ~ ~ ~ c
3 0 ~ ~ E
.o .o
N ~ a
Q
a
o °-' W y
c ~ d ° o~
r N
I-aaUa UC.~UUa(~ UUC'3 aE-
a~~Q~~- ~~~~a~ t
aQa~~Q ~~c~~~Q ~~Q~~~
as a ~a~~a ae~Q~ac~
Q ~QQ~UUQ QQQ~Qy-,
cn U t- C3 U a U U U C.3 C'3 I- C.~ U ~ (~ U
ih N ~ U ~ Q ~ U a F- ~- U H U C~ a U y- a
,- ac~aa~~ c~U~aUa
c~QC~~-a
C3QQla_U~ aIV--~aCUU'3U Q(.~'3QU~U
U
la- Q ~- ~ t3 U U la- ~ (.3 Q U Q C3 V Q
O N
s j
a.
o U
E
I- p
a
C'3 I- C3 a C3 C'3 U U C'3 a U U C3 U U a a I- U a
a a a f- ~- U U a E' a ~- C'3 U a a U C3 U U
i- C'3 a C3 ~ I- U C'3 a a C3 1- U U a U a U
U U a a C.3 ~ U I- a a U U U U U U a a (~ H'
a a I- a a I- U U a I- a U C'3 C'S ~3 U I- U
a' ~ a U a a C~ C3 C5 I- (.3 U U I- y- C5 U a a C'3 ~ I~'_
N Q ~ a Q Q ~ U I- U I- Q I- C3 U U U U C3 U C3 a a U
in ~' ~ a a N tU- f-f-U- U U a tU- C.~'S U a ~ C.3 a U I- C'S U
I- (.3 U ~ (~ U I- a ~ U a a U Q U C'S Q C'3
H Q H H a a a U ~ I- U a a C'3
C3 (~ a U a a a
E- a a a ~ t3 Q U ~ U a a C.3 ~ Q Q U ~ U U ~
~~'3 ~ Q ~ H Q IU- C'3 Q U U U C3 U I- a a I- C'3 V
r N
d LNL. LNL
m
F- U
U o 0
U H
N N
i m U
U
c
D D
m m
U U
O N
O
.C O -_~ O
O 'J O
'a Q. 'a Q.
N N
~U ~U
O O
fn W
Q a
o °'
c d d
C'3~~C.~ ~3UC'3U ~c~Q~~~ ' Q~a~~~
U U Q (~ Q
IU- H U Q ~ Q U V ~
U U C3 H- U U V (~ U ~ C'3 U Q Q Q (~ U Q a (~ Q
UU~UUUUQUQ UQQUQQ QU~UU~
C3 C5 ~ (Q,'3 U U U H U V Q C.~'3 C.~'3 (.3 Q U U fQ- U Q ~ V Q
H C3 C~'3 Ca'3 U C~ ~ U H U U H ~ Q ~ U U U I~- Q U C5
U Q I- (~ ~' Q Q I- U U Q Q U I- ~ Q I- U I-
U Q (~-.3 1- U U U U U~- ~' (~ U ~ Q U Q U U a U U
C3 U
H Q ~ U (3 tU- U IU- I- U ~ U ~' a U
(.3UUI--i-UQUC~ C'3 U~Ca'3Q QUQU~C'3
O N
Q.
p CG tn
0
a
U U U ~ U U U ~ U ~ U U ~ Q ~ Q U U U C'3 U h Q U
~ Q U I- Q y- C3 (~ U ~ Q I- U Q U U U U ~ U Q U ~ Q
U F- Q (~ I- ~- U U C'3 C'3 U U U C'3 C3 U U U U E- U Q
U C3 U U ~ U H U Q C'3 Q Q V Q U U ~ U U~- Q C3 Q ~ U Q
UUU~UCSIU-UUQUV UtU-C.~'3~~VU (.~'3~UQUUI-a-
N U U ~ U C'3 Q U C3 U (~ U Q I- U Q N U U (.3 C3 ~ V U Q U U
in U C.3 C'3 U U U Q Q (~ U U U U C'S U U U C'3 U Q U
t- (3 ~' C3 Q U C3 C3 U U Q C'3 ~ U f- ~ Q 1- ~- ~ U C3
U (~ t- U C3 U I- Q U E- U C.~ U U U U U U C3 U Q U C'3 Q U
U I- C'3 C'3 U C'S Q Q U U C3 C'3 U C'3 C'3 I- a U C3 I- U ~' I- (~ I
H U U ~ Q C.U'3 Q U Q ~ U H U U la- C3 C.~'3 U ~ U U Q ~ C5
N
I_C ,- d~ r
m
O O
LL LL
U f
m T
m
J
0
~ ~ a~ ~n ~ ~ o
~ >, ~
~ .n o
~
~ :
~
;,..,
~ .
cn .
~
~ ~ ~ cfl 0 0 ~ o
~ o
'
"~ ~ N
~
~
.~
a
t~ '
v~ ~
Q
c~ 'a ~ C '~ ~ c
-
~ _~ ~ ~ = N ~ ~
,p_~ '~
j cCS~~~t~n~cp~~~c
d ~ O
~
0
ai
~ ~ ~ p O U
LLJ N ~-
~ U ~
C 7, U o v-
'a
~
~ ~
N ..~~. O
Q.
~ CD U
~ Y O ~ ~
~ p) -~
,
~ C 'i ~ ~ ~ ~ ~
~ O
G! o
is N dog.
.=
~QHQ Q
~
~
~
U~ UUUC
C'3 '3C
U H '3QC.3C
U '3
a' Q
a U
'
t- U Q
'
~
~'
U
t -
- C. I-
3 C
t 3
- - C
U Q U C3 U U 3
3
I
Q E- U U Q C'3 U
(~ U
U
a
d ~QUU U QUUU
C'31-
-e3~
U V ~ ~ Q ~ U Q (.3 U
~
U
ch V ~ C~'3 U Q ~ U Q Q C.~ U ~ H
Q
~ ~ U Q ~ ~ U U ~ C3 ~ H Q H
Q
a U Q U U C~'3 Q Q H ~ ~ Q ~ ~ a
Q C'S C'3 I- ~ t- U I- ~ Q Q Q
0
Q.
O
0
a
H QQ~U~U ~'
E-Q
UU
UFU-QU
a'
Q Q C.
Q ~ H ~ U 3
C
3
U C~'3 Q Q U I- ~
f- U
a ~- QU~UH~~~~U~
Q~U~QUV
V C.~ U Q Q ~ Q Q I- f- C'S U Q Q
U U Q~-. U
U
~
U
in QQUC3QQ C'3C.3C'3
U Q Q I- Q ~ ~QQC3
Q
U C
3 (
3 ~' I-
Q U Q ('3
Q Q ~ (~ U Q U .
.
Q
I- Q I- U Q ~ Q Q
C3
a
~
U C.a'3 U H IU- Q '3
U ~ U Q C'3 tU- U ~
H Q ( C
Q C'3 ~ '3 C
C'3
U U ~
Q U
~ I-
U U
N
0
R o
d C~ M
T O
T
T
'i ~ J
d
C
d7
_ U O L
.n- fn o ~ ~ ~ ~ ~ ~ ~ ~ ~ -C ~ N
O . C ,d.
,0*~"~'~O O'"'O~ Op00~~O
O -fn In M In -p ''~ O ~ O ~ "~ .C ~ p ~" ~ O
C'3 > w ca ~ c~ ~ ~~ :,_. ~ >' o ~ -a o o = ~ ~ o
- .p o
~ ~ a~ ~ ~ ~ ~ p' °= o m ~ ~ o ~ Q-
Q a~ -~ ~ E cn ~ ~ ~ ~ a~ ~ ~
0 0 ~ .v~- ~ ~ ~ ~ c o ~ ~ '~ ~ c~a .-~ v -°'a ~ c ~3
LU U ~ ~ ~ ~ fn O O ~ tn U .°-' ~ ~ ~ .~ Q- -° ~ ~ c
N ~ ~ 'U ~. O ,-Cue, N O p O c~ ~ ~ ~ ~ ~ p (~ N
~ ~ +,' ~ c .~ p -o ~ ~ U n. ~ ~ N ~ n
a> a ~ ~ ~'? ~ c~u
0
0 o I-~- c''_'n ~ m O u~
d o
d
U C3 I- ~ (.~ C'3 C3 Q U U U C3 U I- ~ U C'3 C5
N U
QUC~'JU~UU~~ UUQC~'SU
C.U'3 U C'3 Q IU- C~'3 U Q Q U U Q Q ~ a ~.." UI- I-
a QUI-C~I-C~ UU UI-C~UQUU~C~
as C3 U I- U Q ~ I- Q I- (~ C~ U U Q (~ I-
C3 U U ~' C3 ~ ~ U U Q Q ~'3 U ~ U Q U
Q ~ U H I~- V U ~ H U Q Q H C'3 U ~ ~ Q
H ~ a Q ~ V G~'3 U U C.~'3 U ~ Q Q CU'3
C'3 H C3 H U U E- C.3 ~ Q U U Q f- Q C3 Q ~'3
I- Q f- Q a U (~ I- U C3 U Q C'3 Q U f- I- U
O N
Q.
0
0
a
Q I- Q f-- Q ~-. ~- U a U U (> U i- C3 U U U
U Q Q U Q I- U Q ~ Q U U I- U U U Q C3 Q F-
U C3 U U U U C'3 U I- U (3 U Q C'3 U U C'S U
~ U C3 Q Q C'S C3 ~- C'3 Q C3 Q ~ U Q C3 U I- ~ (3
a U Q C'3 U ~ Q Q U C.~'3 C~'3 U U Q U ~ U U Q Ca'3 C3 Q Ca'3
Q~UUQQQC.~'3la-UQ C3QQC'3QHNQQUCU'3
Q U (~ Cy- C'3 Q I- U Q U Q Q U ~ ~ Q U ~ C'3
Q Q ~ ~ Q C~'3 ~ ~ U Q U ~ C3 U U Q U a U
Q Q Q U Q U Q Q C.U'S Q H C3 U U U Q Q Q Q ~
QUUU~3tU-QC'3C.~~ UUCa'3UCU'3U(.a'3NQU
0 0
T
O
O
O
C ~ r
U
J a.
is
~ o ~ -- o ~ -- o
O m O O N m O O N
UQ NOQ-~ N~Q-0
O In - -C Q- 'p O C Q- 'O V
N .~ O ~ Q.
p f~
_ Q p _ ~
O ~ O ~ (~ L O ~ f~ L
V Q ~ N ~ .r.. _ N ~ +~
cn ~ Y O d. ~ Y O d.
C
LJJ
C C N O C C O O
N m O ~ O fl. O L
Q. ~V p ~ C_ U O ~ C
O O C ~Q. O O
f~ Q c~ Q c~
i
0 0 0
I~ ~t
N M
a~~Q~~a~~ a~aaQ~ a~~~~~a~~
U I- ~ C.3 Q U V ~ U Q U Q Q Q U ~ U ~ I-I-a- U Q ~ U U
C3 C'3 Q C'3 U U ~- U ~-- U ~- U U C'3 Q C.3 Q U I- C3 Q U
Q U Q U U U Q Q Q H Q ~ Q Q IQ- ~ t- ~' Q U Q Q U U C~
I- I- Q Q C'3 U U
U C3 C.3 Q U U U U U U U Q Q Q
i~ E- Q U Q C') C~ Q U Q V ~ U U Q Q
U C'3 Q U U Q U ~ U U ~ ~ ~. U Q Q C3 Q Q U V Q I-U
I- I- U U U C'3 C5 U U U I- ~ I- U U C'3 ~ Q U Q C3 Q C'3
I- Q U Q ~ I- Q U U U C'3 Q U U Q C3 Q C'3 ~ Q C'3
Ca'3 H C'3 C.a'3 U U U V a Q Q Q Q ~ U Q U Q C.~'3 U Q ~ U U
t
N
a
0
a
Q U Q C'3 Q Q U C3 C'3 ~ Q 1- C'3 C.~ C3 Q U Q I- Q U U U
U U Q ~ y-- U f.3 Q Q U Q U ~ C~ U Q
U Q Ca'3 U C3 I- U U f- C'3 Q Q U U I- i- I- Q Q U Q Q
Q U I- U I- Q U C'3 U U Q H f- I- U U C3 Q ~ U f- f- Q (.3 Q I- Q
~.U- Q ~ U C3 Q C.3 U U U U ~ U ~ Q ~ ~ Q Q ~ U U a U Q U Q
U U Q U t- Q I- Q Q C3 U Q U Q U Q I- U U C3 f- ~ ~ Q ~ Q C'3 Q
N U U Q U (3 C~ C'3 Q U (~~ U ~ Q (.3 Q U Q Q U a I- Q Q (~ Q U Q Q C.3
Q Q C'3 Q Q C'3 ~ Q U U U Q Q Q Q Q Q I-
in UQU C'3~--E
C'3 I- U I- a U Q Q ~ C3 ~ U Q ~ a Q Q Q Q Q Q H' Q ~ fQ- Q Q
Q C~ U C~ U C~ U C'3 U Q ~ U F- C'3
U U U H ~ U ~ U ~ U Q Q ~ I~- 1-~- U Q ~ U Q U ~ Q Q C'3 U U
U U ~ U (~ C3 U I- U C.3 U U Q C'3 U U Q Q Q Q Q Q Q I- C'3
c_a
ca O Q° Q
i.. T
Q U
r
N I- M
s- C3 > U
Q
C~
H
> >
C~ ~ H H
~-~o ~--o
c
0 0 ~ o 0
m o Q.o m o ~o
c_ Q'~° c_ Q.~° m
~°- L E
a~ o
o _c
E ~ v Wit"
U O U O
O C O ~ O C O ~ ~ O f~
_O E'~ _O EQ. ~~E
p ~ _C ~ O ~ _
N ~ ~O Vf~J Q .O C~ Q C
N
a ~ a
O ~
= d d
~ 4- N d' d'
ac~U a a ~-acs
zaa~aa a~aUC~~
c~aaa ~- ~-U~-~ ~- ~U~
~~a~-~U as ac~~ c~~aaa~
a U a C.3 I- a U ~ a I- ~- U a a a a
' Q a U F- ~ U a C3 Q a a ~
d C3UC~'3~atU- UUaUUa U~aC'3Qa
Q~~~Q~'
U Q ~ a C'3 Q Q H (.a'3 ~ C'3 ~ U Va- Q ~ a
U I- a (~ U a a ~ U a ~ U a a U
U a (~ U a a U ~ a a ~ a ~ ~' U ~Q
~zUUU
a U I- a ~ a U I- a U a ~' a a
O N
t
Q
o OC >- Y
0
a
aa~~-~-a~ as ~a~ c~ ~-c~~-
U C3 a C3 U U U a I- ~ a c~ ~ ~ ~ a a a
ac~aa~a~- ~-a~Q~Q~ ~~a~a~a
c~
a Q U U Q C'3 Ca'3 z ~ Q la- ~- Q Q ~ Q Q U U ~ a a~-
C5C'SQVUC~'3Q~UQQUQ~U UQQQQ~~
C'3 I- U ~ ~'3 U a ~ I- a (~ U I- U ~- I- a I- U a U
C3 ~ a Q U Q a U Q a U
~ Q ~ Q C3 H C'3 U C3 Q fa- Q Q I- ~ U I- (~ U U C.a'3
~c~~.~c~c~c~ ~-~aaaa
r N
LC r r r
~L
d a
0 0
C'3 U
> >
C'3
H
C~ ~ ~ H
c
c m c
D ~ D c ~
~o
m :.c o m ~ n.
v
c~
'~ '= T N '=
o N
O
V .~
V
cn O
N
can ~ .~ ~ m
s.
C G7 L ~ d.
~- U (~ I- a U U t- U ~- U C'S I- U U I- I- E-
(~ U a a ~ a ~- U (~ I- U I- a U (~ U C'3 U I-
a U H ~-' U U C.~'3 U a U ~ U Q H U f.3 H H U
Q a Q U H U a a I-U- C.3 I- ~' U ~- ~ I- U U C'3 (~
a ~3 U U (~ I- U ~ (.~ C3 a C.3 U U U i- C'3 C'3 (~
d ~ U ~ U a ~ U Q U U C'3 U U U U ~ a t- I-
"~ Ca'3aQCa'3UC3Q~U~' (.~'3C3C~'3UUUUC'3UH V
a U a U U~- a ~a- C3 C3 C3 ~ U U U U H U
Q U Q U Q I- Q I- a C'3 ~3 U U U U C3 C~ C'3 U
c~~-c~,-c~~c~c~aa
i-a i--ac~a c~c~~~ac~~aUe~
U 1-a- U U I-U- Q U Q C.a'3 U C~ IU- C~ C'3 1-a- H tU- H'
N
o OC
0
a
c~~~-c~a~U~aU a~-~c~ac~~~-U
a ~Qac~~-c~ UU ~Ue~ac~c~
U a Q U U U U C.3 t3 a U U (~ a C3 C.3 ~' (.3
C~'3QaUQUQUQ~ QQ~3~QC~'3UUU~
a
~n Q ~ ~ U a U V V Q U ~ Q Q Q a a~ ~ ~ Q ~ I-U
U a U a U i- U U
Q a ~ U C.3 I-U- Q ~ ~' Q (3 U U U a ~ U C'3 (.a'3 C.3
Q ~ IU- C.a'S C3 U U C3 Q N U (.3 C3 ~' ~ C3 H (.a'3 ~
U
H ~ Q U C'3 U H U Q C~'3 U Ca'3 Q C'3 V Q CU'3 ~
N
R
is o 0
d m
Y Y
co
> ' >
d ~~ C~ CVO ~-
r n
z~a zo U
a~ o o a~ r o
r
~ Q
T
.-
Y Y
0
w" ~ ~ ~ C _C L L ~ y_ ~ ~ r.
o c~ ~ .'~ N ,-~ ~ 'a V C~ fn CO~ ~ ~ O ~ ~ ~ ~ ~ ~ In ~ ;~ O C~
0 0 ~~ ~ ~ ~ '."~- ~ °o o .c -Q :~ ~ ~ ~ ~ ~p o o ~ v °
"p -U
E ~ N ~ cIS ~ N '.~ ~ ~ a7.~ ~ N ~ ~ + ~ ~ .> N ~ = Q
O C fO L '~ +. C
O ~U ~ ~ C ,_~ O ~ O N (~ ~ U 7 ~ O '-~ O ,-O.-. ~O O ~ tn ~~ C ~
.~ c O N . ~ Q. ~ ~ ~ O ~ N -a O N '~ N ''~ L V ~' .~ N ~
LLJ N :~ N c~ N cd ~ ~ I-. '''' >, ~ G ~. ~ ~ c 'a ca 'v ~ ~ c~ ~ O
a' ~ ~ p ~ ~ ~ vi ~ ° o ~ ~ a~ Wa ~ -tea ~ ~
aop~pv~mo0oc~uQ,LU.~~~~ a0~'~ n~_-~N-,
,_ Q'+~ U ~ O X 'V ~ ~ '' ~ ~ O ~ ~ ~ 'O
°'E~~~>~~~ ago ~u°~U~~c?oo
-° o ,~ ~ ~~°-~' ~ H v ~ m o ~ o '_' '. ~ .n
i v E-
o ~ W w
'rte ~ .i N N
~c~-~-Q ~~- Uc~aQ~Q~~-
Q ~ V ~ ~ IU- ~ H U U U~ Ca'3 1U-
QI-U-V~Q~ H~Ca'3UU~QU
d ~ la- U ~ Q U U H fa'3 U Q Q Q Q
C3
i~ G~ Q ~ U U ta- ~ U Q Q Q ~ Q U a
V E' f.~~ ~ Q Q Q Q Q ~ Q H H' Q
C'3UU~Q QU~UCSUIa-U
Q ~ac~Q
H Q U ~ a ~ Q U V C3 (3 Q Q ~
C3 Q ~ Q Q
N
Q.
o >' >'
0
a
~ tQ- U U Q ~ V Q U Q ~ Q Q ~
U Q U ~ ~ Q Q CU'3 ~ ~ U Q C.a'3 Q
I- I- U Q
~~~QUUQQ Q~~QQ~Q
U Q Q U Q C3 ~ Q Q Q Q ~ G.~ C'3
~ ~ ~ V ~ Q U U Q U ~ Q ~ Q Ca'3
QQU~C~SQU H~Q~~QQ
UI-~U~'QQ QU~C3C'3UQ
N
r N
(0 O O
I I.
r r
Y Y
U U
d
H H
c~a c~a
.n
d
~I
Y Y
C3 YI ~ , Y
_ _
0 0 o E ~H ~ ~ .~ ~ ~ °o ° ~ c ~ o ° ~ o ° -
°
o +~ r ~ 'O _O _~ ° ~ = V (LS(LS ° N _O ~ (A ~ ° N M
O O p 'a O o ~ ~ ~ Q ~ ~ tO Co\O C ~ O. ~ ~ 00
O
Y fl- +.. ~ ~ O 7 N ~ ° ~ ' ~ ~ ' Q ° fn
U ~ (0 O ~ ~. a0 O ~ N Q- (~ p' ~ v 'p ~ p Q cIf Q- cn
v- ~ O ~ ~ ~V O ~ ~ ~ ~ ~ N '~ ~ ~ V .fir Q
a~ ~ °'~ ~-a ~.~ ~c~ o ~ o ~ QN gas cu E~ ° o.~
>, ° o ~ c'~n ° ~ . o >, o o ~ ~ ° ~ -a ,,r o
a> c ~ c ~ ~ v ~ V ~ o ?, ~n a~ '~ ~ c -° ~ o ~~ ~ a~ .~
o cn -~ ~ N ' ~ ~ c~ p ~ ~- ao N ~ ~ '- c ~
U ~ f~ f~ . cn N :~ p U U ~ ~ F- ~ ~ c~ ~ ' [W N
° ~ V ~ N '~ U7 ~ Q ~ ~ ~ O U ~ ° Q, N v- ~ O U
Q .C m O ~- .~ O Q ~ .~~. o ~ O Q
L
C_ d G7 \ o Iw
v.L. O r N
UUQUUU aUUC.~'JQQ I-I-UC'3Q
U Q CU'3 U Q ~ la- Q (.a'3 U Q N C'S U H U C'3 U
~ U Q I-U- I-U- Q Q Q Q ~ C.~'J Q V ~ U ~ U U
UVUQUCSU UIQ-VC3~~ UVUHUCa'3U
N COQ ~QUQ ~QU~QQ ~UQ~U
V ~ ~ H (~ C3 U Q Q Q Q Q C3 U Q C3 U C'3 U ~
U U Q U C.3 N Q ~ Q U C.3 C'3
V U ('3 Q U Q Q Q U Q Q U U U U C~'3 Q
U F- I- C3 Q t- I- Q Q Q Q U C3 U f- U U
DC Y Y
>.
0
a
U C3 U I- ~-- I- Q ~' C.3 C'3 U Q ~ i- Q (.3 U U Q U
U I- U U U L'3 C3 U Q U ~- C.3 U Q U ~ Q U U U C3
U U ~ 1- ~ U Q V I- U I- ~ Q ~ U U ~ ~ H Q
QQ~FU-UHQ C3U~ ~'IQ-Q UQVH~C'3V
a U U U U ~' U U ~ Q Q ~ U Q U U U C~ U C~ Q
N UUC3UUUU aQUUQQH UUI-~--C'3~'UU
U U f- Q Q C'S U U Q U Q Q C3 U U Q U I- U U U
Q U U CU'3 U IU- ~ U C.U'3 ~ U Q Q Q U Q ~ U C.U'S Q
C'3 Q Q Q ~ U Q U Q I- ~ Q Q ~ U U U U 1- C'S U
Q U U (3 ~- I- ~ C5 Q U U I- Q Q U U U C3 C3 Q
Q C3 ~-- C'3 I- U U I- Q I- U Q C'3 U C.~ U Q (3 U U
C.3UUI-UUU UUQU~'C3 UUQUI-(.3U
To 0 0 0
L
r r r
Y Y Y
_
r r_
R (LS C~ f~ E-
r
L ~r
,C ° r
L ~ ~ ~ ~ Ct.
°
Y
Q. r Q. r Q r
YI -Q Y ~ Y -°
~ ~ ~ >~ ~
c~
~
o
.~ ~ o ~ _ c ~n
~ C o .~ N 'O .C ~ V ~
N
.~ O ~ r ~ 'a C U (lf
In ~ p>
~ 00 N ~ ~ o ~> "-_' cCl
p
~ ~ (O ., C ~-' N ~ >
~ vi
cn tL
~ I
N
'~
.,r p
~ ~ p
~ -p
cn
7 p
O O '~ ~ O
W p ~ N
~
C
O
p Q ~ a. 0 ~ ~: N ~ ct5
E ~ ~ ~ ~ ~ ~ .cCf
c cp ~ c ~
U '~ ~ U as v ~ ~ a'~
~ ~
~ ~ . a~ c~ ~ o c ~
v '~ E
o ~ m ~ ~ ~~
-
n.
o W
~' 0
~ N
.
f:
a C'3 U a a (~ U U I- ~ ~- C3 a a U
U ~' (.3 U U E- a U ~ a a
a U a a (.3 U U ~-
(~ U
U
a C3 I- a U U 1- a C'3 I-
C'3 a C'3 U a U C'3 ~- C3 U C3 (~ a C.3
U C'3 a ~ a I- a I-
U a
U
C3 U a U C3 U 1- U a C3 U U C'3 C3 U
C3 U
U
U
a IQ-UQQU~~I-al- ~~~'~UQI
-
~U
I- U C.3 U U C3 I- a ~ C'3 U U C'3 U a
a C3 H C'3 Q
Q ~'
U ~ V U U U Q
V ~ H U Q a
~
c~ ~ C.
C.3 ~ 3 U
C3 Q U U C~ ~ a (3 U C.
'3
a U I- U
U U Q U
C3 a U
~ U C3 U U a C'3
U '
H U H U
H Q Q
C. 3 (~
'S ~ C'3 a I- C
t-- U
~ C3 ~"' U U (.3 U
U U a U U C3 C3 C3 U U U U
U ~. C'3 U C3 f- t- (~ a U a U I- C3 ~'
a U I-
Q.
O
O
a
U ~ C'3 ~ H, ~ ~ U C~'3 Q e~'3 U U Q U U
Q U U U IU-
U '
U C3 U f._' a ~ U U ~ U
U U ~- ~- U U ~ U ~ C
U U U U U (~ ~ C'3 C3 3
U C3 U U ~. H C'3 a U
~- U U
U
a' UI-~I-C.3UU~'aI-I- aUUC3UUUUUU
U I- U U a U a I- U U U C3 U 1- U ~--
U
U
d C3 U ~' U U ~ C.3 U a U
U C3
a C3 U a U U
I- ~ a U C3 U U Q Q ~ U Q U Q C~'3 U U
in ~a U C'3 U C'3
UUUE-
U I- f.3 a I- ~I-
I~'3-VU~I-UUU(.3U1- UUUCSI-UUUU~I-
~ ~ U C3 U U ~ C3 ~ a U ~ C'3 U
U ~' U U
a'
a
U C.3
U C
U ~ U U 3
a C3
~ ~'
H U
U U
U
U
C3 a
U I-I- a I-
- C3
I I
- -
C3 C'3 U U U (
U 3 U U U U U C3
- a
(~ U U
U .
(~
F
a
N
cc
.~a o 0
i M M
,- r
Y Y
d M M
r
G O M
. a i U
~ Y
a
Y ~
ca ~ ~ o o O z3 cn ~-
y~ ~ ~ o ~ a? ~ ~ ~ -~ ~ ~ ~ ~ ~ c_
'~ co
CO I~
~ '~
O ~
' ~
O f\ i 00 ~ 'a O n N .O
a f~
~ \
C U O
cLf ~ O . o
>,
O _
N (CS ~ ~; M p ~ o ~ LI~ ~' L V
~ ~ ~ p "- CO tO ~
r N
O C~ O ~ N N ~ O
- ~
v '
O
N .Q r ~ ~ .
p ~ fn O ?~ .C
tf? p i O
~U C O + fn C
O O =
/~ i. ~
O C ~
. .,r f~ f-
~ c~
~ ~ ~ ~ O ~ O ~ f~
~
~ . O p
N ~ N
~ n. D yn ~ .~-. ~ ~
v~ (~ ~ S ~ ~
p_ a~ ~'
~ ~ .
, uj _
.f. r ~
O ' cd E O ~ C f~
U o .~ D o ~
N p ~ CD ~ ~ ~ ~ ~ O ' Q. ~ p
Q. C ~
+ p Q- O ~ O
p
~ v ~~ E
H' ,~ .~
O I-- .,-. U
O
-p
O ~ \ o
'r~~ M M
~
~
=
~aa~aaaaU aU~r;~aa
~ ~ ~ U V U a ~ U '
a
U Q
'
~~~aaa~~c~ C
3
C
3 U
aaQC~c~~
ac~~ U~~c~~ Uc~~c~c~
a
d a Q IU- U U U U U Ca'3 ~
U U U'
C~'3 U C
'3
~- (.
~ 3
W- C IU- Q U U U ~'3
'3 U Q Q a a ~ U U
U ~ U a U
(.~ U U ~ (~ a a
Q Q U C
3 Q U ~ ~ U
U U Q V Q a a U .
U U a C.3 U U C3 a
~ U
a (
3 C'3
'
'
U
a .
a C
a 3 C
C'3 U a U ~ i-- a C'3 3
a C~ a W- f- C'3
Q.
E
>.
0
a
~~~a~-~-c~~-aa ~-~c~c~QC~c~
U U U C'3 U a U a U U a ~' I- a a I-
U U
U U U
E- U H- U C
U 3
(~ U a
U U .
~ ~'3 U H a C3
I- C.3
C.3 C'3 U U C'3 U C'3 I- U a U a C3
U U U a
t- a I- a U ~- C~ ~ C'3 a
U
U
~ U U
a U ~
U H U U ~ a a
U
a'
C. ~
3 c~~~~~~
Uac~
aQ~
aa~
c~ a
c~ U'3 ~ U U C
U'3 Q U C
Q U Q U Q U V ( 3 V U
. .
U .
a I- U I- a a a i-
C3 a ~3 U U U C3 U
I- a
C3 a I- U (~ U C'3 a C3 a C3 U ~'3
U U a U C'3
U
I- C'3 U C~ U a C~ a C3 U C3 U C3
a I- U a
C'3 C'S
' '
'
'
U
C3 (,3 U C
3 t- a U
C 3
3 C
3 U I- U
C3 C
N
~_a
r
r _M
Y 'S
O M M
M
~
C O ~
O
Q QM
~C O M O
r
-,~ a I
~ a
Y Y
d ~M QM
Q
(3 Y .
L O ~ ~ ~ ~ ~ N ~
+.. .~ C .- O p cLf
~ p ~ ~ O
N Q
D O ~ _C
O
O Y ~ tn
o ~' ~ .
'a Q- ~ V
O
L U > CO
cCS
_ >,
C ~
N ~
U ~
Q ~ O 0 O c~
O N Q
~ O O -O ~ M LI)
, ;=. c~
~
'~ V "- . ~
V ~t ~
O N o .~ ~ ~ O ~ .~-.
~ C :~
~
~_ O. O C
_ +. -O c
~~ ~~~' E O
~ET ~ ~
'O o ~ ~ ~
~ f
~ m ~
O '
U
c~ C ~ n C
'~ ~
~ O >
~ C
~
p ~ N N ~ ~ ~
~ N '
~
y ~- U c~ ~
.~ O
~''~ -a
co o O o 'W
U Q o ~.
O ~ o iu ~ o .: ~
~ c~a n ~
~
~
tn O O Q V ~ ~ p O
~ > O ~
Q~ ~ o ~
o
d
c M
d r N
d
~
.
f:
~ ~ U ~ Q ~- ~ ~ U Q I-
U
Q ~ tU- U ~ H
C~Q ~U~UUQ
a~
a ~V i- I- C'3 ~ U C'3
C3 I- f-
U U
Q Q
U
~QUQUU UU~U~
~ ~'3 C3
C3
Q C3 U U a
Q ~.
~ Q C~ ~ Q U Q
Q U C'3 C'3
~ C
~- ~ ~ U U ~
U
~ U ~ Q
~
I- C'3
- C
'3
U U ~ Q f- Q U
C'3
I-' C3 C3
O
N
Q.
O
O
a
aQaQQ~~ a~~~~~a
U Q Q Q U Q 1- IU- Q U Q U U
U U
a UQUU~~~ U~QUV~
Q~UQQQU ~H UUUN
~
' ~
U U f- U
U
~
a U U Q Q
~ U
U ~ ~ ~
U U t
C -
'3 Q
U U ~ Q U U
I- Q -
C3 Q H Q Q U Q I
Q Q
Q Q E- C3 I--
~. Q U
U
U
N
O
_
T M
.~
M
f T
Y Y
d
d'
.
ca
c ~ ~ ~ o
d ~ o ~ N
Y IC'3 IU
Y -~
Y
of
Y Y
o
0 o a~ o o ~ z3 L 'vi .~
.t ~ ~ ~ ~ ca
3 ~ ~ v .,- a~ ~ o .c~ E ° p ~ ~ c~ -~
c o > ~ a~
o ~ E o o W ~ ~ ° ~ '> ~ ~ .~-->'' ~ r~
o ~ ~ o a~ o ~ o ° m ~ ~ ~n c ~r
~.o.~ ~ Q-Tao c ~ ~ ~-a ~ N ~~ no
° ~ ~' ~ ~ ~ a~ r °' U ~ ~ ~ ~ ~ o ~ ch ~'
~ c o °- in >~. U o o ° o~° o 0 0 o vi
_ T
Y.. C ~ L
U ~ ~ o ~ C N ~ M ~ In C ~ (Cj >
U .°_~ ~ o ~ U ~ C3 ~ o ~ O o ,~ .n°
° ~ ~ U .,r cu L o
Lov oo°~o.~p° o>o.,.-.
0
n. ~ Q. c~ ~ m 0. ~ o
_d
c d ~ ° °op
U U
a
a U a ~ Q C3 ~~'3 a U U
Q U~U
a
d Q a ~ ~- U ~ U U a U a U
M aQ~~Q~ ~ a~~Ca'3~Q
C.3 a ~ U ~- U U a H H- ~' U
a U a U ~ U a (.a'3 ~ Q a
~aa~ ~ac~aQ
aQa~~~ c~Q~~U
0
a
I- o
a
aUaUaa
~~~~~e~a Q~a~~~~
aU
a QaUQUa~ UaUUV~Q
C'3 ~ ~ 1- V- a H ~ Q (.~'S U a U ,
Qa~~Q~a QQ~~~~~a
F- H a a C3 a
U Q Q U C3 U
~ C3 U U U U Q Q ~ U Q C3 Q Q
QQQ~H~~ ~QHHUUQ
~a o 0
L M
T
Y
d
r r
~ U .n U
d~
Qo
d o T o
-~I U ~I ~'3
r~ Y '-G
C'~~3
Y
ya ~r ~ o ~ N .~ y o~
..... ~ N o ~ o o ~ ~ ~ o Q ~ N o d.
C ~ c~ O
j ~U 0 \ ~-' c~ N v- ~ ~~. ~ CO ~ O
V r ~ o ~ CO ~ > fn ~ 'a Q. O
M N O m p- M Q ~ ~ - V '~ ~ ~ ~ O'
O p Q ~ ~ Q Cr7 O N C 'O ~ T ~ ~ tn
O tn .Q ;~ O O O t T O N ~ ~ ~ *' p
d ~ ~ ~ ~ ~ a O - p O ~ ~ o ~ O ~ :~_.
C ~ O ~ ~ ~ ~ ~ Q. ~ dM' ~ ~ ~ .N
O O ~ - 'O O N +-. ~ in ~ M ~ Ct. Cp 'a
O o '~ .-C~. .~ m p ~G N ~ .C tn ~ 'a
O ~ M ~ C~ ~ O N :~~ m O ~ V C ~
U ~. ~ of -~ C.~ can c ca ~ L °o °
n. o ~ " ~ ~ ~ .~ ~ ~ o a~
E ~ ~' h-~ o ~ 'a °° ~
c~ ~
o °-'
r N N
~~~Qa~ Q~~~Q~ ~QQaac~
U ~ C.~'3 U a I- Q C3 ~ Q Q U U U Q Q Q ~
Q U (~ I- a U ~ U Q Q C3 (~ Q i-. U Q
n ~~~~Q~ a~~~~~a ~aaQQ
as ~~~ Q~ ~~aaQ~
c~aQ c~~~~a Q~-ac~U~
Q~a~~V ~~~Qaa QQaaQ~
UQ ~ Q Q Q a ~ C'3 U U ~ ~ ~ a U Q U H
I- Q U Q U (.3 I- U Q Q Q i-- C'3 a
Q.
o GC fn >-
0
a
~_~-~~c~aa a~ aU~ aaaQ~~-a
t-- U Q I- (.3 I- U U ~ C5 U Q U a I- ~ C'3 U
Q C3 Q U Q ~ Q Q ~ U Q U Q U ~ C~'3 U a C3 E- U
a' U ~ IQ- ~ U Q U ~ ~ U U H U ~' a Q Q U Q ~ Q
Y U I- U U U Q Q I-- I-a- Q Q U U C~'3 CQ'3 ~ Q Q ~ U
U f- a a U U Q U U U U Q I- ~ U a Q Q a
ire U Q a C'3 Q U ~ U ~' a C3 U f- U U ~ U a U f-
1~-~UQQC.U'3CQ~ QIU-~Q(.~'3(U,~Q C'3UQ~UNQ
C.~~UCQ'3QQ~ QQQUUQQ (.~'3C~'3C.~'3QQQQ
N
r N M
Y Y Y
m o
o c~
r r
Q C4 cCS ~ cd
M
Y
Q. a.
Y pl i
Y Y
o ~ o ~ o
Y Y Y
a~ a~ Q- o
co ,~ . ~ >,
o ~ ~ in ~ ~. ~ p
o ~
N E o ~ ~,~ a a~~ ~~
~o ° c ~~'> ~y~ ~ vo
'''' Q c ~ .'~ c o m ~ ~' °-~ ~ ~ c o
v~ ~° c
11! ~ ° r= ~ N .~ ~ ~ N v .~ o ~ N
~ c~ ~
U Q ~ ~ ~ ~ o ~ ~ ~ ~ ~. .
~ ° -~ ~ r ° v~ ~ ~ o
a~ a~ o a~ ~ ~ d. o ~ ~ a~ ~
~ ,~ N c_ ~ ° ~ .° ~ Q
H
y' _d o
00
R ~"
Q U U L'3 C~'3 U ~ Q Q U ~ U U
Q ~ Q U ~ ~ Q U ~ I- ~ a ~ tU- Q
~ Q U ~ Q Q V Q U Q C'3 Q U U
a~ U ~' U I- Q Q ~ Q Q (~ Q Q Q Q IU
ch Q U U ~ ~ U C3 Q (.~'S C.~'3 Q Q Q Q Q
U C3 E- ~' Q ~ ~ U
Q Q Q Ca'3 ~ Q Q Q ~ ~ Q U ~ ~ U
a~QQ~a ~aQ~a~~~c~
N
Q.
0
0
a
c~
Q ~UUUUQUQ~U
U U C'3 Q I- Q U U ~-
U Q tQ- U U I"' U h U U U U U U Q
C'3C3UUU~HU UQ~QC~'3~QUQ~Q
ire (~ (~ U I- C.3 U' U C~'3 Q IU- Q Q ~ Q ~ Q Q
C3 U U ~- C3
~ U ~ la- ~ U (.a'3 Q Q Q U U U ~ Q Q ~
Q ~ Q U C'3 U C'3 Q ~ C'3 U Q U U Q U
N
~I
47 due'
Y
J
T
Y
d7
l3 p J
I
Y
Y ~ J
~o
~o
V
U Q
U N
N
+~
C
0
N
- C
U
a O
L
'_2 10 ~ = ~f' N tn
~ac~a ~-UUC~aa c~a~aa
a C3 I- a ~ a U I- (3 I- U a a I- ~ C'3 U
U i- U a I- U~- U a C'3 I- a U U a
~ a I- a ~ ~ U (~ y- a a U U U C3 ~ (~
H- U U a U I- a C3 U ~- a U a a U I- a
~a~aUQ ~U~UHU UUQQ~Q
d C.~'3 (3 C'3 ~ a E- H U ~ C3 U ~ a C5 i- I
N ~- ~- ~- a a a a
QQQHC3~ V~~UQU Q~-Ca'JtU-QHU.
V U I-I-U- (.a'3 ~ ~ U C'3 U C3 V C~'3 ~ U U C'3 Q
C'3 ~ U a ~- ~ a ~ a c~ ~- a a a ~ ~- a
~'aaUa aUU~-U~ C3~'aUa
Q.
0
a
Q~~-~-Q~-c~c~a.UQ c~Q~c~
a Q ~ U U ~ C3 Q U ~ U C5 C'3 U C3 U U C5 ~ ~. a
C'~ U U ~ ~ U ~ U Q U H U Q U Q ~ U Q C.a'S V Q
a~c~c~~~-~ a~.c~c~c~~a
N ~ U ~ Q U Q C3 a ~ U ~ Q Q U U ~ U U U a U Q I-
c~~- a a~ aQ
~aQQQ~~~~~~~~Q ~~U~~~Q
~Q~aa~~~QaQaQ~ ~a~~~-c~a
U U ~. ~ C3 C3 C'3 a U a U ~ a ~
r M
M
O O O_
M ~ d
Y a
E ~ U a
r M CND
C (~ I-
i. M r
Y
d r r
c
~ ~ o
U c~ = ~ N ~ a~
a>
-~ ~ o _c 'O -~
ct
~ c~ Q .~ ~
~ o ~ o
~
.O O
>, ~ c Q.
,~, ~ ~ ~ ~ = ~ o
U
f~ - In :,r. .a-
~ c~ ~. ~ c
O
C
O
I~ti -O
r-. > f~ C
a~ U ~ ~ ~ o ~ ~
y o
s
c m c~ c~ ~
~ ,~ a~ N
U
O C_ N c~ ~
O >, O fn . O -O
N
~
cLS
N C O :'' N
C ~
~
~ 'O
c~ O +
N
L
d
d
IC
'"-
QU Q
U
U~~U~Q ~
Q~~a
~
Q Q Q t- U C.3
~ U C3 Q E- Q
Q
1- Q
Q C'3 f- Q U Q Q
U ~ ~"' V Q Q U V y'3 Q U ~'
Q ~ I-
a ~ Q
~,r~ U I- I- Q U
(~ Q U
I- a
~ Q ~ ~- U C3 Q ~ l
a U - Q~-
U U U U
a
U ~-
~UU ~
UU
~
~QV U
C'3 U U UQ
C3 N
N
o >- Y
0
a
Q U U ~ Q (~ Q U I- U I- (.3
U Q
U ~
UQQI- ~QUUC3QC
-QVU '3
H
~
H H Q
H
Q Q Q ~ Q U ~ Q V ~
~ (.
'3
a
~
'S~ '3U1-
Q~-UQUC -
QUUUC
a
U
QQ~ta-~QU '3
-HQQ~-C
QI
Q U U U Q Q V U Q Q C3 Q
E- Q ~ C3 I- I- C'3 U
U C'3 U I-
(~ C3 U
N
d'
0 0
r r
L
U N
co
R ~ N
L
r r
° ~ ~. -a
~ wn ~ _o ~ ~ c~
~ U _..~ .gin ~ c ~ ~ ~ ° .''~'nc o
0
o ~ =-°.~ ~ v ~
a~ o ~ c~°a ~ ~ ~ ~ o ~ ao
c c . ~- o L v y
° ° ~ o o .-c °' _c ~ ~ ~ ?. o
d .~ .~ ° a~ -
LiJ ~ ~ ~ ~ ~ ~ ~ o U ~ ~? .07
V N ~ ~ ~ O ~ 'D
(~
C N 'O N ~~- pp "'- N ;~ a7
~ O ~
N Q. ~-. c~ 'a ~ ~ L ~- U
n'~U ~ ~ ~~~~ '.n n
h- c.,..,u ~~ ~ U c~ o ~ N ° ~o
a~
N C4
R '"-
U ~ I- (~ I- U I- a U U ~- U a U
~' a U Q Q ~ ~ I- U U Q U C.~ U
U~Qa~Qaa~ ~UU~U~
U I-a- C'S Q (a,'3 C5 U E- C'3
C'3(.~UQ~aQQU UUI-I-a-~UUa
N (.3 C'3 U a C3 ~ a U U U ~ a I- Ca'3 U Q
U U ~ U C3
C'3 C'3 I- a U a U
~~Qac~c~a~a
~UUIa-V I-a-QUU C~UUC'3QU
U (~ C'S U U U U C3 C3 a a a C'3 C'3 C3
U (~ a U a U a a U C'3 U ~-- U i- a
N
t
Q
o ~ >'
0
a
a (~ a U (~ I- I- f- a I- U U U U U C.3 a
Q C.3 Q V H Q U Q ~ U C~'3 U U U U Ca'3
Q U a a Q ~' ~ Q ~' Q C~'3 U ~ U U H U
a U ~ ~ U U U a a a a a C3 a U U C'3 C3
N (.~a HQQQUC'3UQ U~I-aC'3(~f-a
m Q Q ~ Q Q ~ ~ ~ ~ Q U ~IU- ~ U U U V a
U~aaUaaU~U UU~UQUU
Q C'3 tU- ~ a Q ~ I- Q U H U U U U I- a
N
r
jp O
N
.NY ~ U
Z
N ~ H
O
U U
p m
z
N r N
Y
Z
N a\o .~ ~_ d°°-. ~ o ~ c c c~
~s U ~ ~ a~ ~ o a~ ~ ~ ~ ~ o ~ ~ o
U ~ ° ~ c ~ Q.c_ ~ o ~ ~ m o a.
E c E ,~, ~ r
o ~ ~ ~ a~ ~ ~ a~ o a~ o a~ o . o ca o
U o °.°n °:.=o° te'a'= ~-°o
o cn :~, o ~ *' ~ ~ ~ c~ ~ c~ 0 0
I~1I o a> ca ~-' a~ a~ '> a> o -a -a ~ -a v
o .~ ~ X ~ ,'~ 'a ~ .~ c~~a ~ ~ c~~a ~ n.
0 0 0 0 ~ as '~ :~ ~ cn .-. pn c~ ~ v
0
o .~ .°Q °° ~ ~ °~ c ° o ~ o
U t. o -o u~ ,~ Q o Q
L
G7 47 \ O
L N ~r~ N
I-C~Q~UU U~OI-I-C~ UUUQI-U
Iv-C~QUCa'SQ III--C~'3UQC3 ~~UQ~'3U
C'3 (~ U Q Q U Q Q C3 C3 U Q U I- U C'3 C3 t-
I- U ~- Q t- U I- U Q I- ~ U I- U U C3 C.3 C'3
C'3 U ~3 (~ U U U U C'3 C'3 U U U Q O C3 ~ U U
d Q U U V IU- Q U (.a'S U U U U Q O H ~ U U U C'3 U
~QU~CU'3C.a'3U ~UQCa'3C3CU'3Q Q~CU'3~H V C5
U ~ V H U U U U U Q C'3 V C.3 U Q V U Q (.3
~ V ~ C'3 Q C'3 U ~ U C3 C'3 Q C'3 ~ U U C3 C.U'3
Q U U V U ~3 L'3 U Q Q Q ~ U Q Q V U U'3
UUQ UCH
.~
a
0
a
U U (~ U U Q Q Q U U (~ Q U U C3 ~- C'3 Q U U U
f- U Q C3 I- U Q C'3 U U Q Q U Q Q Q C3 Q U U U
Q I- C'3 C3 U Q I- Q C3 Q U Q U U U Q C'3 U U U
U C3 ~ Q I- U H U Q I- Q, C3 C'3 I- C3 Q (.~ U
U (~ U Q C3 Q ~'3 C3 Q U H I- C.3 U U U I- C3 Q U (.3
a ~- E- C'3 Q (~ U C3 I- U C'3 C3 U I- U U C.3 C.3 C'3 ~"' U C~ 1- U
U Q U C'S Q U Q C3 U Q Q C3 U U U ~- C3 (.~ U C3 (~ U
U U i- Q U U C'3 I- C3 U Q Q U f- U C3 U Q Q U ~ i- C'3
U ~ U U U U U U U U f- Q U U Q U U Q C'3 Q
U U I- (~ ~ H I- Q ~ Q Q U C3 ~'3 U ~ h- I- ~ U ~-
U U U U C3 C'3 U (~ U C3 U U (~ Q U Q C'3 U U U U
C3UI- I-!-t- I-UQC3C'3I-C3 UC3C'3C3 C'S
Q U C'3 ~ C3 Q U C'3 C3 C3 Q C.3 C3 C3 C'3 C'3 C3 H- U U I-
U Q ~- U Q U Q Q C'3 U U (~ C3 ~ C'3 C3 C'3 C3 U U U
N
r
O
L
1i O
Z Z Z
Q
oUo 0
c Q ,N-
d m N Z
m
Z Z O
Z
d
U
Y I- c~
d
Z Z Z
c c c~ ~ ~ ~ ~ 0 0
o ~ ~ o ~ ~ o
~ ~ ~U
U U m ~ Q
~ Q m ~ Q
(~ ~ C c~ ~ C ~ N N
O ~ ~ O ~ ~ O ~
V
O ~ O ~ ~ O ~ O ,
~
, -U O
V V U V
Q 'J ~ Q. .~-. Q.
U
~ U ~ ~ U ~
c~ fLf ~U
".'- N ~ ..s--.
a ~~ a ~~ a ~
0 o s
0
r r
l~
U U U U U ~ U U U ~ C'3 U Q C3 U U C.3 U Q
al-C'3U1-Q Q U
1-UUUUU QC31-UUC3UC'3C~
U C3 Q U U U Q C~ (.~ U Q C3 C~ C'3 C'3
U U U C'3 U Q
Q Q U U Q C'3 Q U C3 C'3 U C'3 Q C.3 Q Q (.3
Q C3 C'3 U
U U ~- Q U U Q Q C~ U U Q U U Q U Q Q Q
(~ Q ~3 U C3 Q U C'3 Q
U Q C3 C3 C'3 U C.3 U Q Q U C3 U
!- U Q U U
a U U U U C~ Q Q Q U C~ U
U ~- U U C~ U (~ H C~ Q
Q C'3 Q Q U U U U Q E- 3 I- Q (3 Q U I- C3
C3 U U U Q
~- C
U (~ Q U Q U (.3 f3 C.3 .
i~ I- C.3 U Q C'3 U I- C3 U e3 U U U
C~ U C'J Q C~ U Q I- C~ C.3 C3
C~ U C~ ~ U C
U U C~ Q (
~ U Q C~ f-
U Q E- U C3 Q .
C3 ~- I- U U .
U U U U U Q U U C3 1- U i- Q C3 Q
U U C3 U Q U U C3
U (.~ U U U Q U U C'3
Q
U U C'3 U i- U i- C'3 C3 U C3 t- C3 U Q U C3
I- U Q U
C3 Q Q Q U C3 U U I- I- U Q U Q I- U U (.3
U U U
C3 I- U U C'3 I- Q U U U U Q Q U U U C'3 U
U f- I-
Q U Q U C'3 ~3 Q Q U U U U U ~3 U Q C'3 C'3
U U C3
N
O
a
U U U U U U C'3 U U U Q ~ C3 ~ U U Q Q U U
U U U U U
U H C3 Q C.3 Q U U U Q Q C'3 ~ Q
I- U 3 a
' Q U U U Q (
E-
I- Q U ~ U U U U C .
U 3 U C3
U U U Q U Q ~ Q U C3 C'3 U (~ U (~ U U a C5
U Q U C'S U
Q I- U
~ '
Q
U I- Q U Q U I- C3 Q U U U
U I- U U C
H U Q U 3 Q
U U U H C3
U H U Q Q C3 Q ~ V ~
U
U
Q V ~ U
N H U C. Q U U C'3 U C3 U Q U
t C.3 U Q U U Q Q U
- U C'3 U
'3 U
U Q C.3 U U
C'3 I- U
Q U Q H U ~ C~'3 ~ U Ca'3 Q U Q ~' C3 U Q V
Q U U Q ~- U C'3
' Q U U U
'
U
U I-
3 H- C3 I- Q Q H C
Q Q U C 3 U C3 U U S
C'3 ~ I- U U C3 U
U U U ( U
U ~'3 U U (.3 U C.3
~ (.3 V U
C
~
Q U U ~- Q . Q U Q C3 C3 (~ C~
U U Q Q U U I- t3 C3 U
U U C3 .
U I- U C3 U
U ' C'3 Q
Q
U
U U U
U U ~ S U U
U C U C3 C3
I-
N
R
I- y-
Y ~ ~ 0
Z Z Z
H
~ N
c d C~ U
Q
M M M
U U
V
O O p
Z
d
O M O M O M
Z Z Z
L a~ a~
a~ ~ o
a~
C~ a~ o c~ ~ ~ ~ ~ y
~
N -p ~ O O ~
~ -p ~ N ~ .fl ~ p
~
V
~ O ~ ~ ~ Q.r
~
~ ~
i O ~ O - I~ O ~ tn C ~ ~
!~ O
_
o O ~ ''' ~ N N O O O O C o O
o
~
'
a W C C O c0 V >
N _
_O \ O .-:
.
~ O Q
~
O ~
~
C Y O c~ N N ~
O O to
O ~
~
~
""' v c~
W ~ O
~ C ~' ~ U
N O ~
~' ~ O ~ c ~ ~
O
~ ~ .~ ~ ~ ~ ~
n.
_O
,_,
~
O
N ~ ~ ~ ~ ~ ~ ~
C ~ p ~
N
N 7,
r
E
O ~ N U ~ Q
p
.
~ O O
~
Q~
N
~ O' N
CD O ~ ~ O 'a
O
j
;~ ~
" O '
F- cI1
O
O
f.
O o
G7
O r
d
U
~'~QU
U
Q
fa-Q~
a-
a'
(.
31
Q U Q U U (~
a
N UQH U
Q
E- U a
~
Q V Q C'3 U U
QU~~
~U
Q Ca'3 ~ IU- a U
O
N
Q.
O
a
c~a~a~-~U
F
U UQ
V
I- a U U
a
QQQ
Q~~~
a
Q
Q~a~
aU
a~
~-
a
N
O N
OI
z
.~c C3
C!
d
O r
a
z
c~
0
z
0
v~ o ~ ~ ~, c ~ ~ ~ ~ ~ ~ ~ g ~ L ~ v~ ~, o
~n ~ ~ :.... ~ ~ ° ~ ~ .~ ~°-., ~ Wit' ~ m ~ ~ ~ -oa v ~ y
.o
w ~ o ~ ~ E ~ ~ ~ ~' .~ pc c =' :~ ~ o ~ ~ .c c c~a ~ N ~ o a~
a> o .,-.. ~ o o ~ <n .,., o u~ o v~ a~ ;~,
Q ~ ~ i O +L. ~ >, ~ p~ ~_' .p O d. i v- O ° ~ r - ~ ~ O
~ Q ~ U ~ ~ ~ ~ ~ ~ ~ N p N N T U Cry 'p '''' O
L p
O U O ~ ,-~ ~ ~ ~ .+U-, ~ Op C~5 Q ~ c~u' O O ~ O ~ ~ O ~ O ~ Q.
Q c o ~ ca ~_ ~ ~ ~n ~ O Q Q p ~ ~ o o ° -o o ° o 'o ~ I~ N
C ~ ~ Q ~ ~ O I- ~ U V _C .C ~ m ~ ~ N ~ p '''' ~ ~ 'O -fl O MO
C ~ . :~ c~ +. d' t~ (iS
N~'a~p_~~~MNtn ., .~0~~ ~~C ~
O -O N ~ c~ O
~Opo~~r~.~ (~~~ptQ'p"''e~~mN~N;~._p V
N ~ Up ~ N ~ ~ ~ O ~ .fl N MO' ~ ~ .~ ~ ~ N
co ~ °~ °~ ~ ~ ~ ~ H ~ ~ ~ N $ ~ ~ '~ ~ ~u~
O O
i
O ~
QQ~- aa~a~~Q~
U ~ C'3 C5 U Q ~ Q I- f- Q U Q Q (~
~ Q ~ Q ~' ~ H ~ Q a Ca'3 Q Q U
Q Q ~ Q (3 Q Q C.3 U ~ C3 Q ~--
~ ~ U I- U ~"' Q Q U Q ~ Q U C~
Q H ~ Q Q Q Q ~ U Q U Q C3 U
U O I- U
Q U Q H ~ Q Q Q E- U H- Q Q
C.3 Q I-' U Q Q ~ U Q U I- Q Q U
~I-U-U~HQ QU~U~QQQ~
C3U~QQ C'3~U~QE-QU
t
n.
0
O
a
Q U Q C'3 C3 Q Q Q ~ U I- I- O U
U ~ Q ~ U Q Q ~ U ~ ~ QIU- Q a ~ H
U~QUQaU Q~QQHIa-QQHQ
Q Q Q Q IU- Q Q~- Q Q U Q a CQ'3 U Q
Q Q Q a U Q ~ ~ Q la- Q ~ Q Q ia- ~ V a
V Q Q Q ~ Q Q Q ~ la- Q ~ U U ~ H
~ f- U I- ~' I- Q U Q Q Q Q U Q
C'3 O ~ C3 Q Q Q Q Q U Q U I- Q I-
Q U Q U C.3 ~'3 a ~' U ~ Q Q Q U U O Q
O M
O Z
Y
O O
d
Il7
13 O _O
U Q
D
O O
Z
O O
O .~ ~ M O N ~ '~ ~. ~ O N O
O
- U
C ~ O O
O m O o Y O O 'a ~ ~ C
d' -a ~ O ~ N ~ o ~ ~ ~ Q ~ ~
~ O O Q
T
~N O ~~ ~ tn o ~ .> O U Q Q 7 O O N
0 - v1 .C 'p M .C O M L O y'= ~ O O d5
~ ~ ~
'~ O O ~ O N i O ~ U ~ ~ ~ ~ ~ ~ ~ O ~ E
a ~
.
G1 ~ ~
p ~ d' ~ ~ ~ ~ O " O
,0 '~ ~ f~ O .
>
0 p V
.. O
, ~' N ~
W ~ C ~ ~ ~ ~ ~ .~ ~ O (~ 'Q
~ c tt5 ~ln O ~
'
~
'
p O ~ E CD
O
~
O m O ~ ~ ~ ~ ~ ~ '+U-.. N O.
N
'
V
U ~ i
O ~ 'V C .~
~ ~ C O Cn N O L L
O ~ +. ~ ~
L- -
O O Q. ~
lI~
O p
~ O
O
Q ~ ~ .~ ~ "-' ~ ~ ~ Q ~ +~ N
~ +- ,_ -
N
O
p
C i- ~ cC m !n . c~
O O O
~ ~
. Q ~
N O U
O p' 0
O ~ ~ ~ '~ N ~
~
- m
~ m
O
C C
O
~
~ ~
p, O
~ .f..
V
.
.r t~ m E-
O
~ N
a0. d.
M
Q U U Q U V Q U a ~..' I- I-
~
Q a
a C3
U a
HU~H~ ~Q
UUQ HRH
' U
C I-
3 -
C'3 a U a a a U U a
~~QQ~~a~Q Q~c~c~~a
a a ~aa~- ~a~~~~-
a a U U a U C'3
U ~
'
~
a ~
C a I- I- a C3
3 C'3
~ U a U ~- a C3 a
a~a ae~aa aaa~-a
U ~ ~ Q U Q
~ U Q
a
H U Q
Q f- C
U U ~ U a a ~ U a '3
U U I- a a U I- a a I- I- I-
a (~
a a C? a a
a C'3 U I- a
a C3 C3 U C3 a U U C3
~
~-~-
~-~-~aa
Q.
O
O
a
UaUQa~~aa~ ~~~~aa~
U ~
a
U a C.3 a i- U U a
I- a U C3 a
E- U
U C3
~
~- a (~ a ~
a a U U a a
U
Q ~~aa~
~ ~
~~~aQa
~Q ~
N ~ ~-
a
~U
~
U~
~ c~~-ac~
a~ C3 a (~ U U
Q~
a U E- ~- (~ U ~- C'3
a
a
'3 U ~ a C.3 ~ a
'3 a V Q C. ~ C'3
H ~ ~ Q C Q
Q U ~ la- ~ Q ~ Q I-U- a C'3
a U U U a a a U U a U U Q
a C3 U U a
~, f-
(> U ~3 a a U
a a U a C3 ~- ~ U a U a ~3
O M N
~I N
LL
O N ~r
a
a
a
0
d N
222
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
-a c o
0 0 _ o c~ ~ o
a~ ca W >. -a 'gin _o
o M ~ ~ ~ ~ -a o
o W ~ N ~ cn ~ C
a n °ao vwv ~ ~ ~ 0 0
o ~ ~ c~u ~ ~
o ~ ~ a~ .~ ° o ~ ~_
a c
Q o > as o ~ c ui
E c U t '~ o ~ o ~ c
c t cn :~
a r~ .~ v ~ .~c
d o
due'
~UQ~~U ~UU~~UQV~.
a ~' H C3 U la- ~ IU- C3 a U U U U U
U Q C'3 V U U Q U ~ Q U U U U U
U U U
N UQQUUHQ ~UQUQ~~U~~
~aUaaQ UHUU~UQ~UU
C'3UUU~C'3 UHHHCU'3a~UC3
(.a'3 U C~'3 U U H ~ C'3 U U ~ Q U C.~'3 H
a U U ~ U U I- I- U a U U U U a
0
Q.
0
0
a
e~aa~ac~c~a e~c~UC~~-U~-U~-U
aC3Uaai-U aaUl-aUUI-UC3
a a U C.3 ~- C3 I- a U U U C.~ U ~' C3 U C'3
a C'3 a U a a C'3 a U U U C'3 U I- a (~
a a ~ (~ a U U a U I- C3 a a U a U I- U
a U C'3 U (~ a U U a U U U a U ~ U ~ U U
U U U I- U U C'3 U U U C3 U C3 U a U U U
N U C.3 C3 C'3 a a E- C'3 a C5 a U a a U U U i- (.3
I- C.a'3 U I-Q- U Q C~'3 Q ~ U U U U U ~- C.3 Q U
a Q U a a a C.3 C3 a C'3 U ~ a a I- ~
C3 a (.3 a U C'3 C5 C'3 U (~ U ~ U ~ U U U U
U U a a U I- I- a (~ I- U U U U C3 U U
C'3 U C3 <3 C3 C'3 U U a U t- ~ ~- t- a I- C'3
N
R_
O O
L T' r
a °
0
r a
a
T T
223
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
+.
d
uJ
d o
N
V.L. ~ T
(~ U U U U I- U U Q U E- Q C.3
U U
U U U U C3 Q C3 U Q U U U C3
(~
U H Q H
Q a
~ ~
U
Q U C'S
Q I U C
- '3 C
C'3 '3
Q U
a ~
a' U V (. C~ C.
'3 ~ U Q IQ- 1- t- C~3
U '3 Q Q
~ C~ U U
U C3 C'3
in f~ ~ U U Q (~'3 ~ ~ U Q U Q
~'
C3 a
C ~UUUUU
3
~UQ~ Q~U
U ~
U
Q
Q U
(~ Q
UQU U
U'
C.~Q~U C3f
QC'3 -C
C5 Q I- U U U U Q 3
I- U C3 Q C3 (5
I-
O
N
O ~'
O
a
U H U U Q H U Q U H U H Q H C'3
U U
I- U Q U C.3 U C3 (,3 ~ U I-
U ~- ~ Q U I- U
U U U U (.3 I- y-. f- Q U U I-
U U Q U C'3
C5 C'3 (~ Q U U U U C5 U ~ C'S
Q 1- Q U C'3
U U
a
~
a e3 U U ~- Q (~
N ~- Q U U C'3
f.3 U C'3 Q Q U Q
Q i-- U U Q U Q ~- U ~- C3
U ~ C.3
(~ C'3 U U C3 C.3 U CQ'3 U ~
is7 C3 C3 U ~ 1-~3- Q U U
t-f-U~--C'SQQt-Q U
C'3 U C'3 U U U U U Q Q U U U
U U U U 1-
C'3 C3 U Q U C'S I- U U U U
Q H I- U U U (~
Q C'3 t- Q U U U U U E- C'3
(~ Q Q C3 a
U U U U ~ U ~
U ~
a U U
'3 U C'S C
I 'S
- Q U Q C
N
R
_
O O O
r
s.. r
d d d
L
O
O
O O
r
r r
224
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
.° ~ L '° -a
Q c ~ -o ~ ~ M ~ ~ ~ ~ ~ ~ >. o c°u yn
m.c~ .,-. ° ca ~ c ~ ~ '~' r Y o
o ~ ~ ~ ~ ~ ~ ° o ~ ~ °' ~ ~°v ~ c a
~oa~c~ao~~~~°'
v o ~ O O t .~ '> O c~'n ~ ~ ~ .O j N N ~ o Q ~ c~
d ..- ~ .~ -Q = ~ .~ ~ f~ r cS5 'O ~ N N -fl C p tn
LJJ O ~ O ~ C C ~ ~ O ~ =. '~ ~~ ~ ~, -O .~' O- O C
U '. o 'O O O M
L V~ ~ ~ L ~ ~ ~ ~ r O
~ ~ .n -a g ~ c a~ ~ o <n ~ ° c o 0
o ~ °. c~ CD ~ o 'a .~ ~ c o o ° o
"~ C ~ "'' L N ~ (~ N ~ ~ ' ~ O :V V fl-
~ r ~ .~ -> a~ N ~ '> o ° °' o a
~ c~ o M o .~ ~ o ~ °
''r o ~ ~ M -o ~n ~ a
O ~ W o
d d T r
'r2~ ~ . ~ r d'
~-c~U U~ ~ c~aaaUa
c~ a c~ ~-
~~Qa~a
U U Q v» ~ ~ v U ~""' a U U Q
d U U U c~u v ~ ~ ~ U Q Q ~ C3
m C~'3~C~'3 U ~~° ;; IU-C~U~aU
M ~Q~ ~~ °'_~,~ ~~aaaa
QUH ~ v ~ ~U VUI-a-~~Q
Q U U ~ ~ ~ ~ Q Q Q Q U a
a I- U I- ~ ca U U a a a
O.
0
a
U aU U Q ~ U UU C3 C'3 U U 1- IQ
H C'3 C3 U Q
~UUUQU~ QUCU'3QIU-I~-C.~'3
a a a C'3 ~ U U U U ~ I- t- U U a
U I- C3 C3 U a a U V U U U U Q a
ire V U Q Q U U Q Q U U U a U ~
V~~'3UU~UU f-aaQC'3~'IU
~ Q V Q ~ ~ U ~U'3 Q Q H ~ a Q
r r
~~ O N
d
O O
O M
tC r r
U a
d
r Q
d
C'3
U
225
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
°~ o °~ o °~ o
o Q o Q o
m
._cF.- 3
O L O L O L
U ~ ~U ~ ~U
O ,~ O ~ O ,~
N U ~ U N U
Q o Q o Q o
L
o a
0 0
Y~ T T
UUI-~C.~U UUI-~'C3U C'3al-E-aU
U U U a 1- a U U U a I- a C'J a U a U a
C'3 C~ 1- a C3 U C'3 C'3 I- a C3 U a C'3 U a U U
a (~ U U a C'3 a U U U a C3 a I- I- U C.3 U
U U a (~ I- U U a C'3 I- C'3 U C'3 a a a
a ~ U U a C3 a ~ U U a C3 C3 C3 I- U i- U
U a C'S U f- I- I- U a (~ U I- I- t- C'S U U ~ U a
N a Ca'3 U U U U U Q ~ V U U U U Q U I-I-U- U a U C'3
i~ U ~- t- U I- U C3 U E- I- U h (~ C'3 a U C3 U U t-
i- a U U U ~- f- a U U U ~- a C3 C3 U a U
a (~ C3 U (~ a (~ C'3 U (~ U (~ (~ Q ~ U
C'3 U ~ C'3 U a C3 (~ ~ C.3 U a a C.3 a a a
C'3 1- a C'3 C'3 U C'3 I- a C'3 C'S U I- I- ~ a a a
a (.3 U U H a a C.3 U U 1- a C3 a <3 ~-- U a
C3 I- f- C'3 U a C'3 1- ~- C.3 U a C'3 f.3 C3 U I- a
N
's
a
o ~ >- oc
0
a
~~~~Q~~ ~~~~a~~
~aQaUC~~- ~aQaUC~~- aa~-~~-acs
U C.3 U a C3 a t3 U C3 U a C.3 a C'3 a U U U U U E-
~ a ~, U U U U ~ a ~ U U U U U C'3 (~ U a a U
a' U C3 U U U U a U U C3 U U U U a U C3 U I- ~"' a U (>
N ~ C3 ~ C3 tU- C.~'3 tU- U ~ ~ ~ C3 H C~'3 1-U- U (U,> (3 C.~ U U (.5 IU- Q
UUU~CU' CU'3'3aUUUU~CU'3CU'3aUUQl-~-CU'3U~U
a ~ U Q H H C.~ a H U Q IU- H C'3 Q Q a ~' C'3 U N
U CU'3 U Q U CU'3 Q U C3 U Q U C'3 a H Q C3 C3 Q Q Q
N
N N M
O _O _O
d
I- ~- H
d
U U N
a
L
T T T
L
($ I- E-
226
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
DD
xs o xs o m m
o Q o Q
c -° o
m E o ~ n.
"''
d ~ ° .~ ° 'o
u~.t .~ .~ .c .S ~ o c~
a~
o ~ o
~ °
U = U O
O t O ,~
~U
U U ~ U
O ~ O t°n
a '; a ~' a .c
L
O O
la ~= d. a°0 c°~
N
(3 a ~ (3 H a I-
~-aU~-UU a~~~~~ ~~a~~a
U C'3 U U U U a ~- I- U U a a a I- a U
a C'S ~ U a U U U U U U a U U (~ a a
U I- U U U ~ U U C3 U (.3 ~- (.3 C'3 U h C3 (.3
C3 C'3 U C'3 U a U U U a U U U C'3 U C3 I-
a' a a a U I- U a U U C~ U U C'3 U a U ~ I- U
U C'3 ~'3 U U C3 C.3 U C'3 H- E- I- C'3 a (~ U (~ a ~ a a
t- (~ ~- I- U I- C'3 a C'3 U U C3 I- U a a U U a ~ a
U a C.U'S U ~ N U H U ~ ~ ~ U U C.~'S H Q ~ H
UCa'3C'3UHU UUC~~UU C'3~"'~"'~C.a'3H
~'3 U (.3 U I- U U V H N U C.3 C'3 C.a'3 U U ~ IU-
a ~ a C3 U U U C.3 C'3 U U C3 C3 C3 U U a C3
Q.
.Q o ~ OC OC
a
a a a (.~3 a ~- C3 I- U H- U C3 a U I- f- U U U a
UaaU~C3a aaI-UUaU U('3aC'3aUU
U C'3 C.3 I- C3 U U U U U ~-' I- U (~ a U C'3 1- U
UUCa')UC~U~ UQFU-C'3IU-UIU- QQC'3U~HU
a' I- U I- ~ U I- a U a U U U U a U U U a U ~- C'3 I- U
N I-I-Ca'3IU-(U3~~UQQ(.U3UUC.U'3QUUUUUUQQ~
in a Q U U U U C3 a U C3 ~ a U U U a a ~ C'3 U I-
C'3 U f- C'3 U ~ U C3 U a a U U ~ U U a (~ U a
C3 I- U ~- C.3 U C.3 U C3 U ~- I- ~ C'3 U ~ a U ~. (~
I- U a ~ a a U (~ i- a U U a a ~' C'S U C'3 U (~ a
U Q C~'3 C3 tU- C~'3 U ~' U C.~ U U a C3 U U U a U I- U
a C'3 I- a C3 a C3 U C3 U C3 ~-- U C'3
N
O r N
O r O
r r
°' a
M
r N
U U C~
L
r
r r r
C
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
c d'
~ J M
O O
O O D
~ O
D D ~ CO Q
V Q.
V m M m M
01 ~ O ,~ O "-
O 0 O
~/ ~ .~,
O O O .
M D
O
a a ~vm
ag
d o 0
d ~ 00 N
d
'- r M
aQ~a~ ~a~a~ ac~UaaQ
~
U
Q a H Q Q U C U U U U Q C'J
~ '3 I-I-
- Q U ~
Q U ~ Q~ ~ Ca'3 Q Q Q C'3 Q (.3
U ~ U C'3 ~
a U
~
Q Q I- Q Q '3 U Q C'3 C3
' '3 C U f
~ ~ Q U U C - Q
a U U (
~ a C~
N 3 I- a a U a U ~
M U C a U Q U U
a U U U H U
a C~ U ~ a'
' Q
U C'3 a (.
~ Q ~ U U a ~ U I- (~ 3
I- a ~ C'J
I- a U U ~
~.
(~ C'3 I- a U a ~. (.3 C3 ~- U a a
a U a
Q ~ U
U
~
C3 ~ U C. ~ Q U Q U U
'3 Q a U
~
U
C. Q C.U'3 Ca'3C3 C.
'3 a Q Q C.a'3U I- f- '3 Q H ~ U
o OC GC OC
0
a
~ a ~"' I- I- ~"' C~ ~- U U C'S
a i- a a C'3 U a C'3 C'3
U a a U a a a U U U a c~ ~- ~, a
U ~ ~ ~ a a
E-
U
a
a a C.3 a I- ~ U (~ C'3
I- a C.3 U a a a U
a I- I- U U C3 ~ I- U C'3
a a ~' C'3 f- a a
C'3 U U ' '
U
U Q U ~-- U a a a C C
s U 3 a 3 a a t~ C'S
U C~5 U U (~ C~ a ~ ~ C3 U
U a U U ~- a I- t~
U U a ~
'
a
N ~ ~ ~ a ~ a U U ~-- U 3 U U a
U a U U ~ C
U C'3 f- a
a C3 a a ' a U
C3
~
ire a C a
C'3 U a (~ 3 U I-
U U a U U ~ (~ U
a ~- a U a I- U a
' C3 a
3 a C3 a U U a C'3 a I- U U C'3
C'3 C3 (~ U
U, U C Q ~ U IU- U H U ~ ~ V
U C'3 ~ I-U- U Q
Q H U
~' ~ a'
a ~~~ U'
U U aUUQ
U a Q
~
a'
C U C
3QI - 3
- I- C
31 -I- 3
-
C.
N
N
r N
IC O O
d
z z w
E- ~-- >
~_
r
a
z ~"
d ~ a
a ~ ~ ~ ~o c~
c~ ~ z >
228
CA 02471376 2004-06-21
WO 03/054218 PCT/US02/40948
Table 11
MALES ONLY - CHI SQUARED VALUES

Marker | Gene | Position/order | Marker name | tot bone chisq

Group A
OGN_02 | OGN | 31995 | OGN A189G | 85.56
OFS202 | OSF2 | 356 | OSF2 A147G | 71.67
OMD_03 | OMD | 9201 | OMD A9201G | 54.59
OMD_01 | OMD | 10596 | OMD A195G | 48.47
KJ4703 | KJ_opgba47 | 15307 | KJ_opgba47 C15307T | 42.14
LUM_01 | LUM | 91974 | LUM G78A | 37.76
ITGA13 | ITGAV | 1849 | ITGAV G76T | 32.67
KJ1311 | KJ_opgba13 | 39015 | KJ_opgba13 A39015C | 32.51
FGF202 | FGFR2 | 72611 | FGFR2 T105C | 30.57
AKA906 | AKAP9 | 138995 | AKAP9 A81G | 30.01
MAP803 | MAPK8 | 185530 | MAPK8 G218A | 26.66
AKA910 | AKAP9 | 145008 | AKAP9 C75T | 24.12
BMPA03 | BMPR2 | 1790 | BMPR2 CTCTT26CT | 23.97
KJ1_01 | KJ_oagba1 | 15318 | KJ_oagba1 T136C | 23.2

Group B
IGF503 | IGFBP5 | 152006 | IGFBP5 C152006T | 19.62
KJ1303 | KJ_opgba47 | 17052 | KJ_opgba47 C250T | 19.53
K13601 | KJ_opgba136 | 13512 | KJ_opgba136 A13512G | 19.46
IRS102 | IRS1 | 3035 | IRS1 G87C | 19.19
INS101 | INSIG1 | 6271 | INSIG1 C183T | 18.72
KJX702 | KJ_obexp7 | | wetSNP GB:AC090953_1.v51636.C>T | 18.65
ITGA08 | ITGAV | 13752 | ITGAV C13752G | 18.58
KJ1307 | KJ_opgba13 | 37414 | KJ_opgba13 A37414G | 18.58
TNF602 | TNFAIP6 | 144773 | TNFAIP6 A83G | 18.26
TIM104 | TIMP1 | 18389 | TIMP1 A148G | 17.94
KJX706 | KJ_obexp7 | | wetSNP GB:AC090953_1.v88698.G>A (isSNP SNP00067306) | 17.15
NFK203 | NFKB2 | 47577 | NFKB2 T210C | 17.14
PMX101 | PMX1 | 116641 | PMX1 C116641T | 15.56
TNF601 | TNFAIP6 | 140934 | TNFAIP6 G117A | 15.08
KJB405 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v119748.G>A | 14.55
K11501 | KJ_opgba115 | 1422 | KJ_opgba115 A113G | 14.33
ACLP10 | ACLP | 144713 | ACLP G139T | 14.24
IL6S02 | IL6ST | 41580 | IL6ST C227G | 13.89
KJ1406 | KJ_opgba14 | 20529 | KJ_opgba14 A92G | 13.54
TIM102 | TIMP1 | 18711 | TIMP1 G173A | 13.22
KJ1_02 | KJ_oagba1 | 16444 | KJ_oagba1 T222C | 13.2
KJX505 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000001.v27596.C>T | 12.71
SCY201 | SCYA2 | 60517 | SCYA2 A130G | 12.56
FST101 | FSTL1 | 219 | FSTL1 T169C | 12.51
KJ4705 | KJ_opgba47 | 15418 | KJ_opgba47 G15418T | 11.89
PTG101 | PTGS1 | 153 | PTGS1 C100A | 11.56
ACLP05 | ACLP | 145005 | ACLP C21T | 11.45
MMP101 | MMP1 | 6586 | MMP1 T237C | 10.63
PAI105 | PAI1 | 96900 | PAI1 A96900G | 10.51
KJ1405 | KJ_opgba14 | 5204 | KJ_opgba14 A32G | 10.2
KJ1304 | KJ_opgba47 | 17323 | KJ_opgba47 C144G | 9.65
KJB410 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v123216.C>T | 9.5
KJ4701 | KJ_opgba47 | 16681 | KJ_opgba47 A16681G | 9.45
KJ9701 | KJ_opgba97 | 31213 | KJ_opgba97 C66G | 9.13
KJX705 | KJ_obexp7 | | wetSNP GB:AC090953_1.v76755.G>A (isSNP SNP00067305) | 9.08
CHUK01 | CHUK | 19499 | CHUK G93A | 9.03
KJX703 | KJ_obexp7 | | wetSNP GB:AC090953_1.v71898.C>T | 8.69
ITGA02 | ITGAV | 7950 | ITGAV T49C | 8.52
ACLP06 | ACLP | 138330 | ACLP 35(G)C34T | 8.44
CHUK03 | CHUK | 8892 | CHUK A33C | 8.4
KJX707 | KJ_obexp7 | | wetSNP GB:AC090953_1.v103500.T>A | 7.99
NOT303 | NOTCH3 | 35571 | NOTCH3 G92A | 7.76
CY1701 | CYP17 | 6609 | CYP17 T245G | 7.65
SDF104 | SDF1 | 17023 | SDF1 C17023T | 7.51
KJ1403 | KJ_opgba14 | 2706 | KJ_opgba14 C170T | 7.45
KJ1306 | KJ_opgba47 | 30234 | KJ_opgba47 G104C | 7.17

Group C
INS102 | INSIG1 | 6272 | INSIG1 T184G | 6.99
ITGA12 | ITGAV | 16247 | ITGAV C101T | 6.99
OFS201 | OSF2 | 182 | OSF2 G97A | 6.75
AD1205 | ADAM12 | 40801 | ADAM12 G112C | 6.73
TGM103 | TGM1 | 36093 | TGM1 A226G | 6.59
TIM103 | TIMP1 | 17434 | TIMP1 G221A | 6.54
SDF106 | SDF1 | 23726 | SDF1 C23726T | 6.53
AKA908 | AKAP9 | 166561 | AKAP9 T39G | 6.39
SC2002 | SCYA20 | 75660 | SCYA20 G187A | 6.23
AD1201 | ADAM12 | 2358 | ADAM12 A76G | 6.2
TGM108 | TGM1 | 52478 | TGM1 G70A | 5.67
KJA302 | KJ_ATHGBA3 | 937 | KJ_ATHGBA3 C239A | 5.63
BMPA04 | BMPR2 | 27645 | BMPR2 G61A | 5.45
KJX508 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000004.v960.A>G | 5.44
KJ1308 | KJ_opgba13 | 38686 | KJ_opgba13 C38686T | 5.41
MMP104 | MMP1 | 9247 | MMP1 T153C | 5.4
FST103 | FSTL1 | 10033 | FSTL1 A10033G | 5.23
KJX506 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000003.v3531.C>G (isSNP SNP00015562) | 5.18
IRS104 | IRS1 | 850 | IRS1 C66T | 5.14
IRS108 | IRS1 | 3262 | IRS1 G161C | 4.95
GB3401 | GBA_IB343 | 91092 | GBA_IB34 G49T | 4.8
AKA901 | AKAP9 | 60717 | AKAP9 G204T | 4.75
SCY202 | SCYA2 | 61292 | SCYA2 A21T | 4.72
PAI102 | PAI1 | 102102 | PAI1 G93A | 4.7
TGM106 | TGM1 | 38617 | TGM1 C119T | 4.67
FOSB04 | FOSB | 29825 | FOSB C210G | 4.65
CY1705 | CYP17 | 6263 | CYP17 C6263G | 4.61
KJB402 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v117086.A>G | 4.61
ACLP09 | ACLP | 143645 | ACLP T116C | 4.23
NFK201 | NFKB2 | 46569 | NFKB2 C70T | 3.95
IGF401 | IGFBP4 | 13372 | IGFBP4 C22T | 3.79
SOD201 | SOD2 | 1183 | SOD2 C47T | 3.71
ACLP07 | ACLP | 139649 | ACLP C26T | 3.68
AD1202 | ADAM12 | 9118 | ADAM12 T23C | 3.64
LIF_02 | LIF | 7435 | LIF A7435G | 3.5
VEGF01 | VEGFA | 142442 | VEGFA G172A | 3.49
ACLP08 | ACLP | 143621 | ACLP (ACTCAG)92 | 3.44
KJB413 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v124233.C>T | 3.3
FL_04 | FL_2847104 | 101647 | FL_2847104 A101647G | 3.08
KJ1401 | KJ_opgba14 | 17038 | KJ_opgba14 G107A | 2.81
IL6S03 | IL6ST | 21425 | ILS6T C127G | 2.57
IRS107 | IRS1 | 2995 | IRS1 A47G | 2.51
KJA303 | KJ_ATHGBA3 | 9846 | KJ_ATHGBA3 G166A | 2.32
NFK202 | NFKB2 | 47085 | NFKB2 A48C | 2.28
IRS105 | IRS1 | 1285 | IRS1 G262A | 2.27
AD1206 | ADAM12 | 11872 | ADAM12 C142T | 2.23
FST102 | FSTL1 | 2797 | FSTL1 A134G | 2.22
KJB709 | KJ_bonlib7 | | wetSNP GB:AL138901_2.v58092.C>T | 2.21
CHUK02 | CHUK | 35542 | CHUK A104G | 1.9
NOT304 | NOTCH3 | 4347 | NOTCH3 C226T | 1.87
FOSB01 | FOSB | 30724 | FOSB A30724G | 1.86
KJ1_04 | KJ_oagba1 | 15057 | KJ_oagba1 T113C | 1.76
MMP107 | MMP1 | 9056 | MMP1 C54T | 1.69
ACLP02 | ACLP | 139671 | ACLP CC48CCC | 1.65
FGF201 | FGFR2 | 36718 | FGFR2 C107T | 1.57
MMP108 | MMP1 | 11784 | MMP1 C118T | 1.57
AD1204 | ADAM12 | 12300 | ADAM12 A119G | 1.56
K11503 | KJ_opgba115 | 15215 | KJ_opgba115 G141T | 1.56
K11504 | KJ_opgba115 | 2459 | KJ_opgba115 G169T | 1.55
FL_05 | FL_2847104 | 120151 | FL_2847104 C120151T | 1.39
KJ4702 | KJ_opgba47 | 15058 | KJ_opgba47 C15058G | 1.25
TGM102 | TGM1 | 26887 | TGM1 C206T | 1.24
KJB711 | KJ_bonlib7 | | wetSNP GB:AL138901_2.v58172.C>T | 1.22
NOT302 | NOTCH3 | 21959 | NOTCH3 A43T | 1.13
PAI107 | PAI1 | 104415 | PAI1 C104415T | 1.05
KJ9703 | KJ_opgba97 | 35165 | KJ_opgba97 G131A | 1.02
PAI108 | PAI1 | 96894 | PAI1 G85A | 0.96
SDF101 | SDF1 | 16619 | SDF1 G46A | 0.84
TGM111 | TGM1 | 61558 | TGM1 G21A | 0.8
KJX302 | KJ_obexp3 | | SNP00033624 C/T | 0.77
KJ4704 | KJ_opgba47 | 15815 | KJ_opgba47 A15815G | 0.68
FBL201 | FBLN2 | 21246 | FBLN2 C148T | 0.66
ITGA11 | ITGAV | 5756 | ITGAV G180A | 0.61
FBL205 | FBLN2 | 29968 | FBLN2 T195C | 0.57
K13602 | KJ_opgba136 | 13354 | KJ_opgba136 C100T | 0.46
MMP103 | MMP1 | 9126 | MMP1 G32A | 0.29
OGN_03 | OGN | 38656 | OGN C107A | 0.25
NOT301 | NOTCH3 | 4117 | NOTCH3 G128A | 0.15
ACLP04 | ACLP | 142383 | ACLP T132C | 0
MMP105 | MMP1 | 9365 | MMP1 G271T | 0
SC2001 | SCYA20 | 75634 | SCYA20 C161G | 0
SC2003 | SCYA20 | 75689 | SCYA20 G216C | 0
Table 12
FEMALES ONLY - CHI SQUARED

Marker | Gene | Position/order | Marker name | tot bone chisq

Group A
KJB402 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v117086.A>G | 48.06
KJ1401 | KJ_opgba14 | 17038 | KJ_opgba14 G107A | 41.74
AKA908 | AKAP9 | 166561 | AKAP9 T39G | 36.71
IRS105 | IRS1 | 1285 | IRS1 G262A | 32.83
TIM102 | TIMP1 | 18711 | TIMP1 G173A | 30.69
KJ4704 | KJ_opgba47 | 15815 | KJ_opgba47 A15815G | 29.98
SCY201 | SCYA2 | 60517 | SCYA2 A130G | 25.47
KJB709 | KJ_bonlib7 | | wetSNP GB:AL138901_2.v58092.C>T | 24.31
KJ4703 | KJ_opgba47 | 15307 | KJ_opgba47 C15307T | 23.85
KJ1303 | KJ_opgba47 | 17052 | KJ_opgba47 C250T | 23.74
NOT304 | NOTCH3 | 4347 | NOTCH3 C226T | 22.8
IGF503 | IGFBP5 | 152006 | IGFBP5 C152006T | 22.71

Group B
OGN_02 | OGN | 31995 | OGN A189G | 22.47
K11501 | KJ_opgba115 | 1422 | KJ_opgba115 A113G | 21.74
KJ1306 | KJ_opgba47 | 30234 | KJ_opgba47 G104C | 18.35
KJX702 | KJ_obexp7 | | wetSNP GB:AC090953_1.v51636.C>T | 17.55
FL_04 | FL_2847104 | 101647 | FL_2847104 A101647G | 17.48
MMP107 | MMP1 | 9056 | MMP1 C54T | 16.24
KJX705 | KJ_obexp7 | | wetSNP GB:AC090953_1.v76755.G>A (isSNP SNP00067305) | 15.95
KJ9703 | KJ_opgba97 | 35165 | KJ_opgba97 G131A | 15.84
AKA906 | AKAP9 | 138995 | AKAP9 A81G | 14.3
MMP104 | MMP1 | 9247 | MMP1 T153C | 14.23
KJX703 | KJ_obexp7 | | wetSNP GB:AC090953_1.v71898.C>T | 13.67
IL6S02 | IL6ST | 41580 | IL6ST C227G | 13.34
PAI105 | PAI1 | 96900 | PAI1 A96900G | 13.12
KJ4705 | KJ_opgba47 | 15418 | KJ_opgba47 G15418T | 13.05
VEGF01 | VEGFA | 142442 | VEGFA G172A | 12.67
KJ4702 | KJ_opgba47 | 15058 | KJ_opgba47 C15058G | 12.58
AD1201 | ADAM12 | 2358 | ADAM12 A76G | 12.5
LIF_02 | LIF | 7435 | LIF A7435G | 12.46
FOSB01 | FOSB | 30724 | FOSB A30724G | 11.73
FST102 | FSTL1 | 2797 | FSTL1 A134G | 11.62
TGM106 | TGM1 | 38617 | TGM1 C119T | 11.55
OMD_03 | OMD | 9201 | OMD A9201G | 11.49
FGF201 | FGFR2 | 36718 | FGFR2 C107T | 11.08
AD1206 | ADAM12 | 11872 | ADAM12 C142T | 11.06
KJX508 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000004.v960.A>G | 10.88
IRS102 | IRS1 | 3035 | IRS1 G87C | 10.81
FBL201 | FBLN2 | 21246 | FBLN2 C148T | 10.79
MMP103 | MMP1 | 9126 | MMP1 G32A | 10.7
ACLP06 | ACLP | 138330 | ACLP 35(G)C34T | 9.61
KJ1307 | KJ_opgba13 | 37414 | KJ_opgba13 A37414G | 9.57
ITGA13 | ITGAV | 1849 | ITGAV G76T | 9.49
IRS108 | IRS1 | 3262 | IRS1 G161C | 9.31
KJ1_04 | KJ_oagba1 | 15057 | KJ_oagba1 T113C | 8.96
ACLP05 | ACLP | 145005 | ACLP C21T | 8.85
FGF202 | FGFR2 | 72611 | FGFR2 T105C | 8.82
SC2002 | SCYA20 | 75660 | SCYA20 G187A | 8.79
FOSB04 | FOSB | 29825 | FOSB C210G | 8.76
KJ1_02 | KJ_oagba1 | 16444 | KJ_oagba1 T222C | 8.72
TNF602 | TNFAIP6 | 144773 | TNFAIP6 A83G | 8.71
SDF104 | SDF1 | 17023 | SDF1 C17023T | 8.67
IL6S03 | IL6ST | 21425 | ILS6T C127G | 8.32
KJB413 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v124233.C>T | 8.15
K11504 | KJ_opgba115 | 2459 | KJ_opgba115 G169T | 8.01
FST101 | FSTL1 | 219 | FSTL1 T169C | 7.51
KJ1304 | KJ_opgba47 | 17323 | KJ_opgba47 C144G | 7.4
GB3401 | GBA_IB343 | 91092 | GBA_IB34 G49T | 7.3
OGN_03 | OGN | 38656 | OGN C107A | 7.19
K11503 | KJ_opgba115 | 15215 | KJ_opgba115 G141T | 6.9
KJ1406 | KJ_opgba14 | 20529 | KJ_opgba14 A92G | 6.75
ITGA08 | ITGAV | 13752 | ITGAV C13752G | 6.68
NOT301 | NOTCH3 | 4117 | NOTCH3 G128A | 6.61
PTG101 | PTGS1 | 153 | PTGS1 C100A | 6.61
KJ1403 | KJ_opgba14 | 2706 | KJ_opgba14 C170T | 6.54
KJ1311 | KJ_opgba13 | 39015 | KJ_opgba13 A39015C | 6.4
IGF401 | IGFBP4 | 13372 | IGFBP4 C22T | 6.3
LUM_01 | LUM | 91974 | LUM G78A | 6.24

Group C
KJX707 | KJ_obexp7 | | wetSNP GB:AC090953_1.v103500.T>A | 5.76
AD1204 | ADAM12 | 12300 | ADAM12 A119G | 5.73
TGM108 | TGM1 | 52478 | TGM1 G70A | 5.58
MMP101 | MMP1 | 6586 | MMP1 T237C | 5.42
ACLP07 | ACLP | 139649 | ACLP C26T | 5.38
ACLP08 | ACLP | 143621 | ACLP (ACTCAG)92 | 5.3
TGM103 | TGM1 | 36093 | TGM1 A226G | 5.21
TGM102 | TGM1 | 26887 | TGM1 C206T | 5.06
PAI102 | PAI1 | 102102 | PAI1 G93A | 5.02
PMX101 | PMX1 | 116641 | PMX1 C116641T | 4.99
NFK202 | NFKB2 | 47085 | NFKB2 A48C | 4.72
TNF601 | TNFAIP6 | 140934 | TNFAIP6 G117A | 4.7
KJB405 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v119748.G>A | 4.62
KJX302 | KJ_obexp3 | | SNP00033624 C/T | 4.43
SOD201 | SOD2 | 1183 | SOD2 C47T | 4.4
CHUK01 | CHUK | 19499 | CHUK G93A | 4.39
INS102 | INSIG1 | 6272 | INSIG1 T184G | 4.39
TGM111 | TGM1 | 61558 | TGM1 G21A | 4.32
INS101 | INSIG1 | 6271 | INSIG1 C183T | 4.22
SDF106 | SDF1 | 23726 | SDF1 C23726T | 4.16
BMPA04 | BMPR2 | 27645 | BMPR2 G61A | 4.12
OFS202 | OSF2 | 356 | OSF2 A147G | 4.04
CY1701 | CYP17 | 6609 | CYP17 T245G | 3.86
ACLP02 | ACLP | 139671 | ACLP CC48CCC | 3.66
KJX505 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000001.v27596.C>T | 3.65
KJX506 | KJ_obexp5 | | wetSNP GBI:AC012651_6_000003.v3531.C>G (isSNP SNP00015562) | 3.52
KJ1308 | KJ_opgba13 | 38686 | KJ_opgba13 C38686T | 3.46
AD1205 | ADAM12 | 40801 | ADAM12 G112C | 3.39
NFK203 | NFKB2 | 47577 | NFKB2 T210C | 2.99
FL_05 | FL_2847104 | 120151 | FL_2847104 C120151T | 2.97
ACLP10 | ACLP | 144713 | ACLP G139T | 2.91
AKA901 | AKAP9 | 60717 | AKAP9 G204T | 2.89
KJA303 | KJ_ATHGBA3 | 9846 | KJ_ATHGBA3 G166A | 2.84
K13601 | KJ_opgba136 | 13512 | KJ_opgba136 A13512G | 2.74
PAI107 | PAI1 | 104415 | PAI1 C104415T | 2.72
NOT303 | NOTCH3 | 35571 | NOTCH3 G92A | 2.66
FBL205 | FBLN2 | 29968 | FBLN2 T195C | 2.65
ITGA02 | ITGAV | 7950 | ITGAV T49C | 2.55
KJB711 | KJ_bonlib7 | | wetSNP GB:AL138901_2.v58172.C>T | 2.44
OFS201 | OSF2 | 182 | OSF2 G97A | 2.43
BMPA03 | BMPR2 | 1790 | BMPR2 CTCTT26CT | 2.37
AKA910 | AKAP9 | 145008 | AKAP9 C75T | 2.28
FST103 | FSTL1 | 10033 | FSTL1 A10033G | 2.26
KJ4701 | KJ_opgba47 | 16681 | KJ_opgba47 A16681G | 2.22
CY1705 | CYP17 | 6263 | CYP17 C6263G | 2.16
KJX706 | KJ_obexp7 | | wetSNP GB:AC090953_1.v88698.G>A (isSNP SNP00067306) | 2.08
KJA302 | KJ_ATHGBA3 | 937 | KJ_ATHGBA3 C239A | 2.04
MMP108 | MMP1 | 11784 | MMP1 C118T | 1.73
ACLP09 | ACLP | 143645 | ACLP T116C | 1.71
K13602 | KJ_opgba136 | 13354 | KJ_opgba136 C100T | 1.7
KJ9701 | KJ_opgba97 | 31213 | KJ_opgba97 C66G | 1.66
CHUK03 | CHUK | 8892 | CHUK A33C | 1.58
IRS107 | IRS1 | 2995 | IRS1 A47G | 1.5
KJ1405 | KJ_opgba14 | 5204 | KJ_opgba14 A32G | 1.5
ITGA11 | ITGAV | 5756 | ITGAV G180A | 1.41
MAP803 | MAPK8 | 185530 | MAPK8 G218A | 1.34
TIM103 | TIMP1 | 17434 | TIMP1 G221A | 1.24
IRS104 | IRS1 | 850 | IRS1 C66T | 1.17
KJ1_01 | KJ_oagba1 | 15318 | KJ_oagba1 T136C | 1.15
AD1202 | ADAM12 | 9118 | ADAM12 T23C | 1.13
OMD_01 | OMD | 10596 | OMD A195G | 1.12
NOT302 | NOTCH3 | 21959 | NOTCH3 A43T | 1.04
ITGA12 | ITGAV | 16247 | ITGAV C101T | 0.91
KJB410 | KJ_bonlib4 | | wetSNP GB:AC025744_4.v123216.C>T | 0.81
CHUK02 | CHUK | 35542 | CHUK A104G | 0.65
PAI108 | PAI1 | 96894 | PAI1 G85A | 0.64
TIM104 | TIMP1 | 18389 | TIMP1 A148G | 0.64
NFK201 | NFKB2 | 46569 | NFKB2 C70T | 0.6
SDF101 | SDF1 | 16619 | SDF1 G46A | 0.58
SCY202 | SCYA2 | 61292 | SCYA2 A21T | 0.41
ACLP04 | ACLP | 142383 | ACLP T132C | 0
MMP105 | MMP1 | 9365 | MMP1 G271T | 0
SC2001 | SCYA20 | 75634 | SCYA20 C161G | 0
SC2003 | SCYA20 | 75689 | SCYA20 G216C | 0
[Pages 237-239: rotated table pages (nucleotide sequence listings by marker); text not recoverable from this extraction.]