Patent 2569095 Summary

(12) Patent Application: (11) CA 2569095
(54) English Title: CRYSTAL STRUCTURE OF DIPEPTIDYL PEPTIDASE IV (DPP-IV) AND USES THEREOF
(54) French Title: STRUCTURE CRISTALLINE DE DIPEPTIDYL PEPTIDASE IV (DPP-IV) ET SES UTILISATIONS
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 9/64 (2006.01)
  • C07K 14/00 (2006.01)
  • C07K 14/47 (2006.01)
  • C12N 9/48 (2006.01)
  • G01N 33/53 (2006.01)
  • C40B 30/02 (2006.01)
  • G06F 19/00 (2006.01)
(72) Inventors :
  • QIU, XIAYANG (United States of America)
(73) Owners :
  • PFIZER PRODUCTS INC. (United States of America)
(71) Applicants :
  • PFIZER PRODUCTS INC. (United States of America)
(74) Agent: SMART & BIGGAR LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2005-05-26
(87) Open to Public Inspection: 2005-12-15
Examination requested: 2006-11-28
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/IB2005/001630
(87) International Publication Number: WO2005/119526
(85) National Entry: 2006-11-28

(30) Application Priority Data:
Application No. Country/Territory Date
60/576,877 United States of America 2004-06-03

Abstracts

English Abstract




A crystal structure of dipeptidyl peptidase IV (DPP-IV), and the three-
dimensional atomic coordinates of the DPP-IV extracellular domain, as
described and used for the identification of ligands, including DPP-IV
inhibitors, used for the treatment of diseases that are associated with
proteins that are subject to processing by DPP-IV.


French Abstract

Selon l'invention, une structure cristalline de dipeptidyl peptidase IV (DPP-IV) et les trois coordonnées atomiques tridimensionnelles du domaine extracellulaire de DPP-IV, telles que décrites et utilisées pour l'identification de ligands, dont les inhibiteurs de DPP-IV, sont utilisées pour traiter des maladies associées à des protéines soumises à un traitement par DPP-IV.

Claims

Note: Claims are shown in the official language in which they were submitted.



CLAIMS
What is claimed is:

1. A crystalline composition of the extracellular domain of mammalian
dipeptidyl peptidase
IV (DPP-IV) comprising one molecule per crystal asymmetric unit.
2. A crystal of claim 1, wherein said extracellular domain has a three
dimensional structure
characterized by the atomic structure coordinates of Fig 2.
3. The DPP-IV crystal comprising SEQ ID NO 2, or a homologue, analogue or
variant
thereof.
4. A crystal having one molecule per asymmetric unit comprising a polypeptide
with an
amino acid sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID NO
1, or a homologue,
analogue or variant thereof.
5. The crystal of claim 4, wherein the homologue or variant has an amino acid
identity of at
least 95% with a polypeptide having an amino acid sequence spanning amino
acids Gly31 to Pro766
listed in SEQ ID NO 1.
6. The crystal of claim 4 or 5, wherein the homologue or variant thereof has a
protein
backbone comprising the atomic coordinates, or portions thereof, that are
within a root mean square
deviation of less than 2.5, 2.0, 1.5, 1.2, 1.0, 0.7, 0.5, or 0.2 Å of the
atomic coordinates, or portions
thereof, listed in FIG 2.
7. A polypeptide comprising the amino acid sequence set forth in SEQ ID NO. 1
or a
homologue, or variant thereof, wherein the molecules are arranged in a
crystalline manner in a space
group of P43212 so as to form a unit cell of dimensions a=b=68.7 Å, c=421.2 Å
and which effectively
diffracts X-rays for determination of the atomic coordinates of DPP-IV
polypeptide to a resolution of about
2.7 Å.
8. A crystal of a protein-ligand molecule or molecular complex comprising (a)
a polypeptide
with an amino acid sequence from Asp38 to Pro766 listed in SEQ ID NO 1, or a
homologue, or variant
thereof, (b) a ligand, (c) and the crystal effectively diffracts X-rays for
the determination of atomic
coordinates of the protein-ligand complex to a resolution of greater than 2.7
Angstroms.
9. A method of designing a compound that binds to DPP-IV comprises the amino
acid
sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID NO-1, or a
homologue, or variant
thereof using the crystal of claim 1, comprising selecting a compound by
performing structure-based drug
design with the atomic coordinates determined for the crystal, wherein said
selecting is performed in
conjunction with computer modeling.
10. A method for crystallizing a DPP-IV polypeptide molecule or molecular
complex
comprising (a) preparing a mixture of an aqueous solution comprising a
polypeptide with an amino acid
sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID NO:1, or a
homologue, or a variant
thereof; (b) mixing said aqueous solution with a reservoir solution comprising
a precipitant to form a mixed
volume; and (c) crystallizing said mixed volume.
11. The method of claim 10 wherein step (c) is carried out by vapor diffusion
crystallization,
batch crystallization, liquid bridge crystallization, or dialysis
crystallization.


12. A computer for producing a three-dimensional representation of a
polypeptide with an
amino acid sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID
NO:1, or a homologue, or a
variant thereof, comprising: a computer-readable data storage medium
comprising a data storage material
encoded with computer-readable data, wherein said data comprises the structure
coordinates of FIG. 2, or
portions thereof; a working memory for storing instructions for processing
said computer-readable data; a
central-processing unit coupled to said working memory and to said computer-
readable data storage
medium for processing said computer-machine readable data into said three-
dimensional representation;
and a display coupled to said central-processing unit for displaying said
representation.
13. A computer for producing a three-dimensional representation of a molecule
or molecular
complex comprising the atomic coordinates having a root mean square deviation
of less than 2.5, 2.0, 1.7,
1.5, 1.2, 1.0, 0.7, 0.5, or 0.2 Å from the atomic coordinates for the
carbon backbone atoms listed in FIG.2
comprising: a computer-readable data storage medium comprising a data storage
material encoded with
computer-readable data, wherein said data comprises the structure coordinates
of FIG. 2, or portions
thereof; a working memory for storing instructions for processing said
computer-readable data; a central-
processing unit coupled to said working memory and to said computer-readable
data storage medium for
processing said computer-machine readable data into said three-dimensional
representation; and a
display coupled to said central-processing unit for displaying said
representation.
14. A computer for producing a three-dimensional representation of a molecule
or molecular
complex comprising: a binding site defined by the structure coordinates in
FIG. 2, or the structural
coordinates of a portion of the residues in FIG. 2, or the structural
coordinates of one or more DPP-IV
amino acids in SEQ ID NO:1 selected from Glu205, Glu206, Tyr547, Ser630,
Tyr631, Tyr662, Tyr666,
Asp708, Asn710 and His740 wherein said computer comprises; a computer-readable
data storage
medium comprising a data storage material encoded with computer-readable data,
wherein said data
comprises the structure coordinates of FIG. 2, or portions thereof; a working
memory for storing
instructions for processing said computer-readable data; a central-processing
unit coupled to said working
memory and to said computer-readable data storage medium for processing said
computer-machine
readable data into said three-dimensional representation; and a display
coupled to said central-processing
unit for displaying said representation.
15. A method for identifying potential ligands for DPP-IV, or homologues,
analogues or
variants thereof, comprising: displaying three dimensional structure of DPP-IV
enzyme, or portions
thereof, as defined by atomic coordinates in FIG. 2, on a computer display
screen; optionally replacing
one or more DPP-IV enzyme amino acid residues listed in SEQ ID NO:1, or one or
more of the amino
acids listed in Table 1, or one or more amino acid residues selected from
Glu205, Glu206, Tyr547,
Ser630, Tyr631, Tyr662, Tyr666, Asp708, Asn710 and His740, in said three-
dimensional structure with a
different naturally occurring amino acid or an unnatural amino acid; employing
said three-dimensional
structure to design or select said ligand; contacting said ligand with DPP-IV,
or variant thereof, in the
presence of one or more substrates; and measuring the ability of said ligand
to modulate the activity of DPP-IV.

Description

Note: Descriptions are shown in the official language in which they were submitted.



JUMBO APPLICATIONS / PATENTS

THIS SECTION OF THE APPLICATION / PATENT CONTAINS MORE
THAN ONE VOLUME.

THIS IS VOLUME 1 OF 2

NOTE: For additional volumes please contact the Canadian Patent Office.


CA 02569095 2006-11-28
WO 2005/119526 PCT/IB2005/001630
CRYSTAL STRUCTURE OF DIPEPTIDYL PEPTIDASE IV (DPP-IV) AND USES THEREOF

FIELD OF THE INVENTION

The present invention relates to crystalline compositions of mammalian
dipeptidyl peptidase IV
(DPP-IV), methods of preparing said compositions, methods of determining the
three-dimensional (3-D) X-
ray atomic coordinates of said composition, methods of identifying ligands of
DPP-IV using structure based
drug design, the use of 3-D crystal structures to design, modify and assess
the activity of potential inhibitors,
and to the use of such inhibitors for the treatment of, for example, diabetes,
glucose tolerance, obesity,
appetite regulation, lipidemia, osteoporosis, neuropeptide metabolism and T-
cell activation, among others.

BACKGROUND OF THE INVENTION
The serine peptidase dipeptidyl peptidase IV (DPP-IV) is a multifunctional
type II cell surface
glycoprotein, which is widely expressed in a variety of cell types,
particularly on differentiated epithelial cells
of the intestine, liver, prostate tissue, corpus luteum, and kidney proximal
tubules (Thoma et al., Structure,
11, 947-959, 947 (2003), citing Hartel et al., Histochemistry 89, 151-161
(1988); McCaughan et al.,
Hepatology 11, 534-544 (1990)), as well as on leukocyte subsets (Thoma et al.,
(2003), citing Gorrell et al.,
Cell. Immunol. 134, 205-215 (1991)).
DPP-IV has roles in many biological processes, including modulating the
biological
activity of several peptide hormones, chemokines and neuropeptides by
specifically cleaving after a
proline or alanine at amino acid position 2 from the N terminus, a
rate-limiting step in the degradation of
peptides (Mentlein, R. Regul Pept. 85, 9-24 (1999)). The natural
substrates of DPP-IV therefore include
several chemokines, cytokines, neuropeptides, circulating hormones, and
bioactive peptides, which, as
those of skill in the art will appreciate, suggests a key regulatory role in
the metabolism of peptide
hormones and in amino acid transport (Lambeir et al., FEBS Lett. 507, 327-330
(2001); Hildebrandt et
al., Clin. Sci. 99, 93-104).
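By way of illustration only, the cleavage specificity described above can be
sketched in code. The following minimal Python sketch assumes one-letter amino
acid codes and encodes only the rule stated here (proline or alanine at
position 2 from the N terminus). The function name dpp4_cleave, the minimum
length of three residues, and the example strings are hypothetical and are not
part of the original disclosure.

```python
from typing import Optional, Tuple


def dpp4_cleave(peptide: str) -> Optional[Tuple[str, str]]:
    """Apply the simple position-2 rule to a peptide in one-letter code.

    Returns (released N-terminal dipeptide, remaining peptide) when residue 2
    is proline (P) or alanine (A); returns None otherwise. Real substrate
    recognition depends on more than this single rule.
    """
    # Assumption: require at least one residue to remain after cleavage.
    if len(peptide) < 3:
        return None
    if peptide[1] in ("P", "A"):
        return peptide[:2], peptide[2:]
    return None


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from the disclosure.
    for seq in ("HAEGTF", "YPSKPD", "GSLKQR"):
        print(seq, dpp4_cleave(seq))
```

A peptide for which the sketch returns None simply does not match the
position-2 rule; nothing further about its stability is implied.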
DPP-IV has been implicated in the control of glucose homeostasis because its
substrates include
the incretin peptides glucagon-like peptide 1 (GLP-1) and gastric inhibitory
polypeptide (GIP). Cleavage
of the N-terminal amino acids from these peptides renders them functionally
inactive. As such, GLP-1 has
been shown to be an effective anti-diabetic therapy in Type 2 diabetic
patients and to reduce the meal-
related insulin requirement in Type 1 diabetic patients. Moreover, GLP-1
and/or GIP are believed to
regulate satiety, lipidemia and osteogenesis, and therefore exogenous GLP-1
has been proposed as a
treatment for patients suffering from acute coronary syndrome, angina and
ischemic heart disease.
As those of skill in the art will appreciate, several methods have been used
in the past and
continue to be used to discover selective inhibitors of biomolecular targets
such as DPP-IV. The various
approaches include ligand-directed drug discovery (LDD), quantitative
structure activity relationship
(QSAR) analyses, and comparative molecular field analysis (CoMFA). CoMFA is a
particular type of
QSAR method that uses statistical correlation techniques for the analysis of
the quantitative relationship
between the biological activity of a set of compounds with a specified
alignment, and their
three-dimensional electronic and steric properties. Other properties such as
hydrophobicity and hydrogen
bonding can also be incorporated into the analysis.


An invaluable component of these drug discovery approaches is structure based
design, which is
a design strategy for new chemical entities, or optimization of lead compounds
identified by other
methods, using the 3D structure of the biological macromolecule target
obtained by, for example, X-ray or
nuclear magnetic resonance (NMR) studies, or from homology models. Analyzing
3-D structures of
proteins provides crucial insights into the behavior and mechanics of drug
binding and biological activity.
Coupled with computational techniques including modeling and simulation, the
study of biomolecular
interactions provides details of events that may be difficult to investigate
experimentally in the laboratory,
and can help reveal topological features important for determining molecular
recognition. As those skilled
in the art will recognize, this information can, in turn, be used for
predicting ligand-receptor complex
formation, and for designing ligands and protein mutations that produce
desired ligand receptor
interactions.
Although crystal structures of DPP-IV have been previously disclosed, none
has provided
a three-dimensional structure with only one molecule per asymmetric
unit, which in turn provides a greater
advantage for iterative structure based design. (Aertgeerts et al., Protein
Science, 13:412-421 (2004),
hDPP-IV complexed with a decapeptide; Engel et al., PNAS, 100:9:5063-5068
(2003), crystal structure of
native porcine DPP-IV; Hiramatsu et al., BBRC, 302:849-854 (2003), crystal
structure of hDPP-IV at 2.6 Å
resolution; Rasmussen et al., Nature Structural Biology, 10:19-25 (2003), 2.5 Å
structure of the extracellular
region of DPP-IV in complex with the inhibitor valine-pyrrolidide; Thoma et
al., Structure, 11:947-959
(2003), expressed and purified the ectodomain of human DPP-IV in Pichia
pastoris and determined X-ray
structure at 2.1 Å resolution.)
To that end, the quest for specific and potent DPP-IV inhibitors for use in
physiological studies
and therapeutic settings continues. Thus, obtaining three-dimensional (3D)
structures of DPPs, such as
DPP-IV as a single molecule of an asymmetric unit, by, for example, X-ray or
NMR studies, or from
homology models, and analyzing the structures using computational methods,
facilitates such discovery
efforts.

SUMMARY OF THE INVENTION

The present invention provides crystalline compositions of DPP-IV, and
specifically of DPP-IV
having one molecule per asymmetric unit. The invention further provides
methods of preparing said
compositions, methods of determining the 3-D X-ray atomic coordinates of said
crystalline compositions,
methods of using the atomic coordinates in conjunction with computational
methods to identify binding
site(s), methods to elucidate the 3-D structure of homologues of DPP-IV, and
methods to identify ligands
which interact with the binding site(s) to agonize or antagonize the
biological activity of DPP-IV, methods for
identifying inhibitors of DPP-IV, pharmaceutical compositions of inhibitors,
and methods of treatment of Type
2 diabetes, Type 1 diabetes, impaired glucose tolerance, hyperglycemia,
metabolic syndrome (syndrome X
and/or insulin resistance syndrome), glucosuria, metabolic acidosis,
arthritis, cataracts, diabetic neuropathy,
diabetic nephropathy, diabetic retinopathy, diabetic cardiomyopathy, obesity,
conditions exacerbated by
obesity, hypertension, hyperlipidemia, atherosclerosis, osteoporosis,
osteopenia, frailty, bone loss, bone
fracture, acute coronary syndrome, short stature due to growth hormone
deficiency, infertility due to
polycystic ovary syndrome, anxiety, depression, insomnia, chronic fatigue,
epilepsy, eating disorders,
chronic pain, alcohol addiction, diseases associated with intestinal motility,
ulcers, irritable bowel syndrome,


inflammatory bowel syndrome; short bowel syndrome; and the prevention of
disease progression in Type 2
diabetes, using said pharmaceutical compositions.
In a preferred embodiment the invention provides crystalline compositions of
the extracellular
domain of DPP-IV (residues 31-766 of SEQ ID NO:1), whereby the crystal
structure is derived from
a mammal, preferably a human.
One aspect of the present invention provides methods for crystallizing a
DPP-IV polypeptide.
Preferably the methods for crystallizing the DPP-IV polypeptide comprising an
amino acid sequence
spanning the amino acids 31 to 766 listed in SEQ ID NO:1 comprise the steps
of: (a) preparing solutions
of the polypeptide and precipitant; (b) growing a crystal comprising molecules
of the polypeptide from said
mixture solution; and (c) separating said crystal from said solution. The
crystallization growth can be
carried out by various techniques known by those skilled in the art, such as,
for example, batch
crystallization, liquid bridge crystallization, vapor diffusion
crystallization, or dialysis
crystallization. Preferably, the
crystallization growth is achieved using vapor diffusion techniques.
An embodiment of the present invention provides crystalline compositions of
DPP-IV comprising a
crystalline form of a polypeptide with an amino acid sequence spanning the
amino acids Gly31 to Pro766
listed in SEQ ID NO:1, wherein the crystalline composition has a space group
P43212 and unit cell
dimensions a=b=68.7 Å, c=421.2 Å.
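By way of illustration only, the stated unit cell dimensions can be turned
into a cell volume and a volume per asymmetric unit. The minimal Python sketch
below assumes a tetragonal cell (a = b, all angles 90 degrees) and uses the
standard count of eight general positions for space group P43212, a
crystallographic fact that is not stated in the text. All names are
hypothetical.

```python
# Unit cell dimensions as stated in the text (angstroms).
A_ANGSTROM = 68.7
C_ANGSTROM = 421.2

# Assumption (standard crystallography, not from the text): space group
# P4(3)2(1)2 has eight general positions, i.e. eight asymmetric units per cell.
ASYMMETRIC_UNITS_PER_CELL = 8

# One molecule per asymmetric unit, as stated in the text.
MOLECULES_PER_ASYMMETRIC_UNIT = 1


def tetragonal_cell_volume(a: float, c: float) -> float:
    """Volume of a tetragonal unit cell (a = b, all angles 90 degrees)."""
    return a * a * c


cell_volume = tetragonal_cell_volume(A_ANGSTROM, C_ANGSTROM)
volume_per_asymmetric_unit = cell_volume / ASYMMETRIC_UNITS_PER_CELL
molecules_per_cell = ASYMMETRIC_UNITS_PER_CELL * MOLECULES_PER_ASYMMETRIC_UNIT

if __name__ == "__main__":
    print(f"cell volume (cubic angstroms): {cell_volume:.1f}")
    print(f"volume per asymmetric unit:    {volume_per_asymmetric_unit:.1f}")
    print(f"molecules per unit cell:       {molecules_per_cell}")
```

With the stated dimensions the cell volume is roughly 1.99 x 10^6 cubic
angstroms, shared among eight molecules under the assumptions above.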
In a second aspect, the present invention provides vectors useful in methods
for preparing a
substantially purified extracellular domain of DPP-IV comprising the
polypeptide with an amino acid
sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID NO:1.
Yet another embodiment of the present invention provides a DPP-IV crystal of
SEQ ID NO:2, or a
homologue, analogue or variant thereof.
In a third aspect, the present invention provides methods for determining the
X-ray atomic
coordinates of the crystalline compositions at a 2.7 Å resolution.
In a fourth aspect, the present invention provides a molecule or molecular
complex crystal, wherein
the crystal has substantially similar atomic coordinates to the atomic
coordinates listed in FIG. 2 or portions
thereof, or any scalable variations thereof.
In a fifth aspect, the present invention provides a molecule or molecular
complex crystal, wherein
the crystal comprises a polypeptide with an amino acid sequence spanning the
amino acids Gly31 to Pro766
listed in SEQ ID NO:1. A further embodiment of the invention provides a
crystal comprising an amino acid
sequence that is at least 98% or 95% homologous to a polypeptide with an amino
acid sequence spanning
the amino acids Gly31 to Pro766 listed in SEQ ID NO:1.
An even further embodiment of the invention provides a crystal comprising an
amino acid sequence
that is at least 98% or 95% homologous to a polypeptide with an amino acid
sequence spanning the amino
acids Gly31 to Pro766 listed in SEQ ID NO:1, and having the atomic coordinates
listed in FIG. 2.
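By way of illustration only, a percentage threshold such as 95% or 98% can be
checked on a pairwise alignment. The minimal Python sketch below assumes two
pre-aligned sequences of equal length with "-" marking gaps, and counts
identical residues over the columns in which neither sequence has a gap. Other
definitions of identity or homology use a different denominator, and the text
does not specify one. The function name and example strings are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == "-" or res_b == "-":
            continue  # gapped columns are skipped under this definition
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from SEQ ID NO:1.
    identity = percent_identity("ACDEFGHIKL", "ACDEFGHIKV")
    print(f"{identity:.1f}% identical; meets 95% threshold: {identity >= 95.0}")
```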
In a sixth aspect, the present invention provides a molecule or molecular
complex crystal, wherein
the crystal comprises a polypeptide, or a portion thereof, with atomic
coordinates having a root mean square
deviation from the protein backbone atoms (N, Cα, C, and O) listed in FIG. 2
of less than 0.2, 0.5, 0.7, 1.0,
1.2, 1.5, 2.0 or 2.5 Å.
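By way of illustration only, a root mean square deviation between two sets of
backbone atoms can be computed as sketched below. This minimal Python sketch
assumes the two coordinate lists are already superimposed and list equivalent
atoms in the same order; it does not perform the superposition step. The
function names and the example coordinates are hypothetical and are not taken
from FIG. 2.

```python
import math
from typing import Optional, Sequence, Tuple

Coordinate = Tuple[float, float, float]

# Thresholds listed in the text, in angstroms, from tightest to loosest.
THRESHOLDS = (0.2, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0, 2.5)


def rmsd(coords_a: Sequence[Coordinate], coords_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation between paired, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def tightest_threshold(value: float) -> Optional[float]:
    """Smallest listed threshold that the value falls below, or None."""
    for threshold in THRESHOLDS:
        if value < threshold:
            return threshold
    return None


if __name__ == "__main__":
    # Made-up coordinates for demonstration only.
    reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
    candidate = [(0.0, 0.0, 0.3), (1.5, 0.0, 0.3), (3.0, 0.0, 0.3)]
    value = rmsd(reference, candidate)
    print(f"RMSD = {value:.2f}; below threshold: {tightest_threshold(value)}")
```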
In a seventh aspect, the present invention provides a scalable, or
translatable, three dimensional
configuration of points derived from structural coordinates of at least a
portion of a DPP-IV molecule or


molecular complex comprising a polypeptide with an amino acid sequence
spanning the amino acids Gly31
to Pro766 listed in SEQ ID NO:1. In an embodiment of this aspect, the
invention also comprises the
structural coordinates of at least a portion of a molecule or a molecular
complex that is structurally
homologous to a DPP-IV molecule or molecular complex. On a molecular scale,
the configuration of points
derived from a homologous molecule or molecular complex has a root mean square
deviation of less than
about 0.2, 0.5, 0.7, 1.0, 1.2, 1.5, 2.0 or 2.5 Å from the structural
coordinates provided in FIG. 2.
In an eighth aspect, the invention provides computers for producing a
three-dimensional
representation of aspect eight of the present invention can be used to design
and identify potential
ligands or inhibitors of DPP-IV by, for example, commercially available
molecular modeling software in
conjunction with structure-based drug design as provided herein.
In a further aspect, the present invention provides computers for producing
three-dimensional
representations of:
a. a molecule or molecular complex comprising a polypeptide with an amino acid
sequence
spanning amino acids Gly31 to Pro766 listed in SEQ ID NO:1, or a homologue, or
a variant thereof;

b. a molecule or molecular complex, wherein the atoms of the molecule or
molecular complex are
represented by atomic coordinates that are substantially similar to, or are
subsets of, the atomic
coordinates listed in FIG. 2;

c. a molecule or molecular complex, wherein the molecule or molecular complex
comprises
atomic coordinates having a root mean square deviation of less than 0.2, 0.5,
0.7, 1.0, 1.2, 1.5, 2.0 or
even 2.5 Å from the atomic coordinates for the carbon backbone atoms listed in
FIG. 2; or

d. a molecule or molecular complex, wherein the molecule or molecular complex
comprises a
binding pocket or site defined by the structure coordinates that are
substantially similar to the atomic
coordinates listed in FIG. 2, or a subset thereof, or more preferably the
structural coordinates in FIG. 2
corresponding to one or more DPP-IV amino acids, or conservative replacements
thereof, in SEQ ID
NO:1 selected from Glu205, Glu206, Tyr547, Ser630, Tyr631, Tyr662, Tyr666,
Asp708, Asn710 and
His740;

wherein said computer comprises:
(i) a computer-readable data storage medium comprising a data storage material
encoded with computer-readable data, wherein said data comprises the structure
coordinates of FIG. 2,
or portions thereof, or substantially similar coordinates thereof;
(ii) a working memory for storing instructions for processing said computer-
readable data;
(iii) a central-processing unit coupled to said working memory and to said
computer-
readable data storage medium for processing said computer-machine readable
data into said three-
dimensional representation; and
(iv) a display coupled to said central-processing unit for displaying said
representation.
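By way of illustration only, the processing of computer-readable structure
coordinates can be sketched as follows. The minimal Python sketch below
assumes the coordinates are stored as fixed-column PDB-style ATOM records; the
text does not state the file format of FIG. 2, so this is an assumption. It
selects the atoms belonging to the binding-site residues named above. The
helper names are hypothetical, and the demonstration coordinates are invented
placeholders, not values from FIG. 2.

```python
# Binding-site residues named in the text (three-letter code, residue number).
BINDING_SITE = {
    ("GLU", 205), ("GLU", 206), ("TYR", 547), ("SER", 630), ("TYR", 631),
    ("TYR", 662), ("TYR", 666), ("ASP", 708), ("ASN", 710), ("HIS", 740),
}


def format_atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a PDB-style ATOM record (hypothetical helper for demo data)."""
    return (f"ATOM  {serial:5d} {name:<4s} {resname:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def parse_atom_line(line):
    """Read the fixed-column fields of a PDB-style ATOM record."""
    return {
        "atom": line[12:16].strip(),
        "resname": line[17:20].strip(),
        "chain": line[21],
        "resseq": int(line[22:26]),
        "xyz": (float(line[30:38]), float(line[38:46]), float(line[46:54])),
    }


def select_binding_site(lines):
    """Return parsed atoms whose residue is one of the named binding-site residues."""
    selected = []
    for line in lines:
        if not line.startswith("ATOM"):
            continue
        atom = parse_atom_line(line)
        if (atom["resname"], atom["resseq"]) in BINDING_SITE:
            selected.append(atom)
    return selected


if __name__ == "__main__":
    # Placeholder records with invented coordinates, for demonstration only.
    demo = [
        format_atom_line(1, "CA", "GLY", "A", 31, 1.0, 2.0, 3.0),
        format_atom_line(2, "OG", "SER", "A", 630, 4.5, 5.5, 6.5),
        format_atom_line(3, "NE2", "HIS", "A", 740, 7.0, 8.0, 9.0),
    ]
    for atom in select_binding_site(demo):
        print(atom["resname"], atom["resseq"], atom["atom"], atom["xyz"])
```

The selected subset could then be passed to whatever display or modeling
software is used; that step is outside the scope of this sketch.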
In a ninth aspect, the present invention provides methods involving molecular
replacement to
obtain structural information about a molecule or molecular complex of unknown
structure. In one
embodiment, the method includes crystallizing the molecule or molecular
complex, generating an x-ray
diffraction pattern from the crystallized molecule or molecular complex, and
applying at least a portion of


the structure coordinates set forth in FIG. 2 to the x-ray diffraction pattern
to generate a three-dimensional
electron density map of at least a portion of the molecule or molecular
complex.

In yet another aspect, the present invention provides methods for generating
3-D atomic
coordinates of protein homologues, analogues, or variants of DPP-IV using the
x-ray coordinates of DPP-IV
described in FIG. 2, comprising:

a. identifying one or more homologous polypeptide sequences to DPP-IV;

b. aligning said sequences with the sequence of DPP-IV which comprises a
polypeptide with an
amino acid sequence spanning amino acids Gly31 to Pro766 listed in SEQ ID
NO:1;

c. identifying structurally conserved and structurally variable regions between
said homologous
sequence(s) and DPP-IV;

d. generating 3-D coordinates for structurally conserved residues of the said
homologous
sequence(s) from those of DPP-IV using the coordinates listed in FIG. 2;

e. generating conformations for the loops in the structurally variable regions
of said homologous
sequence(s);

f. building the side-chain conformations for said homologous sequence(s); and

g. combining the 3-D coordinates of the conserved residues, loops and side-
chain conformations
to generate full or partial 3-D coordinates for said homologous sequences.
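By way of illustration only, step c above can be approximated from a pairwise
alignment. The minimal Python sketch below assumes that alignment columns
without gaps are treated as structurally conserved and that gapped columns are
treated as variable (loop) regions. This is a deliberate simplification; the
text does not prescribe how the regions are identified, and practical methods
also draw on structural information. The function name and example strings are
hypothetical.

```python
from typing import List, Tuple

Region = Tuple[str, int, int]  # (label, first column, last column), 0-based


def classify_regions(template_aln: str, target_aln: str) -> List[Region]:
    """Split an alignment into runs of 'conserved' and 'variable' columns.

    A column is 'variable' if either sequence has a gap ('-') there, and
    'conserved' otherwise (simplifying assumption for this sketch).
    """
    if len(template_aln) != len(target_aln):
        raise ValueError("aligned sequences must have the same length")
    regions: List[Region] = []
    for index, (t_res, q_res) in enumerate(zip(template_aln, target_aln)):
        label = "variable" if "-" in (t_res, q_res) else "conserved"
        if regions and regions[-1][0] == label:
            # Extend the current run to include this column.
            regions[-1] = (label, regions[-1][1], index)
        else:
            regions.append((label, index, index))
    return regions


if __name__ == "__main__":
    # Illustrative strings only; they are not taken from SEQ ID NO:1.
    for region in classify_regions("ACDE--GHIK", "ACDEFFGH-K"):
        print(region)
```

Coordinates for the conserved runs would be copied from the template in step
d, while the variable runs mark where loop conformations are generated in
step e.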

Embodiments of the ninth aspect provide methods which further comprise
refining and evaluating
the full or partial 3-D coordinates. These methods may thus be used, for
example, to generate 3-
dimensional structures for proteins for which 3-dimensional atomic coordinates
have not been
determined. As such, the newly generated structure can help to elucidate
enzymatic mechanisms, or be
used in conjunction with other molecular modeling techniques in structure
based drug design.
In the tenth aspect, the present invention provides methods for identifying
inhibitors, ligands, and
the like, of DPP-IV by providing the coordinates of a molecule of DPP-IV to a
computerized modeling
system; identifying chemical entities that are likely to bind or interact with
the molecule (e.g., by screening
a small molecule library); and, optionally, procuring or synthesizing and
assaying the compounds or
analogues derived thereof for bioactivity. Further aspects of the present
invention relate to methods for
identifying potential ligands for DPP-IV or homologues, or analogue or
variants thereof comprising:

a. displaying the three dimensional structure of DPP-IV enzyme or homologue or
analogue or
variant thereof, as defined by atomic coordinates that are substantially
similar to the atomic coordinates
listed in FIG. 2 on a computer display screen;

b. optionally replacing one or more of the enzyme amino acid residues listed in
SEQ ID NO:1, or
preferably one or more amino acid residues selected from Glu205, Glu206,
Tyr547, Ser630, Tyr631,
Tyr662, Tyr666, Asp708, Asn710 and His740, in said three-dimensional structure
with a different naturally
occurring amino acid or an unnatural amino acid to display a variant
structure;

c. optionally conducting ab initio, molecular mechanics or molecular dynamics
calculations on the
displayed three dimensional structure to generate a modified structure;


d. employing said three-dimensional structure, variant structure, or modified
structure to design or
select said ligand;

e. synthesizing or obtaining said ligand;

f. contacting said ligand with said enzyme in the presence of one or more
substrates; and
g. measuring the ability of said ligand to modulate the activity of said
enzyme.

Those of skill in the art can appreciate that the information obtained by the
methods for identifying
inhibitors and ligands of DPP-IV, as described above, can be used to
iteratively refine or modify the structure
of the original ligand. Thus, once a ligand is found to modulate the activity of
said enzyme, the structural
aspects of the ligand may be modified to generate a structural analog of the
ligand. This analog can then be
used in the above methods to identify binding ligands. One of ordinary skill
in the art will know the various
ways a structure may be modified.
In embodiments, the methods further comprise computationally modifying the
structure of the
ligand; computationally determining the fit of the modified ligand using the
three-dimensional coordinates
described in FIG. 2, or portions thereof; contacting said modified ligand with
said enzyme, or homologue,
or variant thereof in an in vitro or in vivo setting; and measuring the
ability of said ligand to modulate the
activity of said enzyme.
In an eleventh aspect, the present invention provides compositions, such as
pharmaceutical
compositions, comprising the inhibitors or ligands designed according to any of
the methods of the present
invention. In one embodiment, a composition is provided that includes an
inhibitor or ligand designed or
identified by any of the above methods. In another embodiment, the composition
is a pharmaceutical
composition.
In a twelfth aspect, the present invention provides methods for treating Type 2
diabetes, Type 1
diabetes, impaired glucose tolerance, hyperglycemia, metabolic syndrome
(syndrome X and/or insulin
resistance syndrome), glucosuria, metabolic acidosis, arthritis, cataracts,
diabetic neuropathy, diabetic
nephropathy, diabetic retinopathy, diabetic cardiomyopathy, obesity,
conditions exacerbated by obesity,
hypertension, hyperlipidemia, atherosclerosis, osteoporosis, osteopenia,
frailty, bone loss, bone fracture,
acute coronary syndrome, short stature due to growth hormone deficiency,
infertility due to polycystic ovary
syndrome, anxiety, depression, insomnia, chronic fatigue, epilepsy, eating
disorders, chronic pain, alcohol
addiction, diseases associated with intestinal motility, ulcers, irritable
bowel syndrome, inflammatory bowel
syndrome; short bowel syndrome; and the prevention of disease progression in
Type 2 diabetes, comprising
administering pharmaceutical compositions, identified by structure based
design using the atomic
coordinates, or portions thereof, listed in FIG. 2, effective in treating the
disorders or conditions.

BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 is an orthogonal view of an embodiment of DPP-IV shown in a ribbon
representation.
The N- and C-termini of the polypeptide are also depicted.
Figure 2 is a list of the X-ray coordinates of a DPP-IV crystal as described
in the Examples.


DETAILED DESCRIPTION OF THE INVENTION

The present invention relates to crystalline compositions of DPP-IV, 3-D X-ray
atomic coordinates
of said crystalline compositions, methods of preparing said compositions,
methods of determining the 3-D
X-ray atomic coordinates of said crystalline compositions, and methods of
using said atomic coordinates
in conjunction with computational methods to identify binding site(s), or
identify ligands which interact with
said binding site(s) to agonize or antagonize DPP-IV.
For convenience, certain terms employed in the specification, examples, and
appendant claims
are collected here. Unless defined otherwise, all technical and scientific
terms used herein have the same
meaning as commonly understood by one of ordinary skill in the arts.
The term "affinity" as used herein refers to the tendency of a molecule to
associate with another.
The affinity of a drug is its ability to bind to its biological target
(receptor, enzyme, transport system, etc.).
For pharmacological receptors, affinity can be thought of as the frequency
with which the drug, when
brought into the proximity of a receptor by diffusion, will reside at a
position of minimum free energy within
the force field of that receptor.
The term "agonist" as used herein refers to an endogenous substance or a drug
that can interact
with a receptor and initiate a physiological or a pharmacological response
characteristic of that receptor
(contraction, relaxation, secretion, enzyme activation, etc.).
The term "analog" as used herein refers to a drug or chemical compound whose
structure is
related in some way to that of another drug or chemical compound, but whose
chemical and biological
properties may be quite different.
The term "antagonist" as used herein refers to a drug or a compound that
opposes the
physiological effects of another. At the receptor level, it is a chemical
entity that opposes the receptor-
associated responses normally induced by another bioactive agent.
The term "asymmetric unit" refers to the basic motif, which is repeated in 3-D
space by the
symmetry operators of the crystallographic space group, and for which the
coordinates of the structure are
determined. It is the smallest part of the crystal structure from which the
complete structure can be built
using space group symmetry.
As used herein the term "binding site" refers to a specific region (or atom)
in a molecular entity
that is capable of entering into a stabilizing interaction with another
molecular entity. In certain
embodiments the term also refers to the reactive parts of a macromolecule that
directly participate in its
specific combination with another molecule. In other embodiments, a binding
site may be comprised or
defined by the three dimensional arrangement of one or more amino acid
residues within a folded
polypeptide. In yet further embodiments, the binding site further comprises
prosthetic groups, water
molecules or metal ions which may interact with one or more amino acid
residues. Prosthetic groups,
water molecules, or metal ions may be apparent from X-ray crystallographic
data, or may be added to an
apo protein or enzyme using in silico methods.
The term "bioactivity" refers to DPP-IV activity that exhibits a biological
property conventionally
associated with a DPP-IV agonist or antagonist, such as a property that would
allow treatment of one or
more of the various diseases such as, for example, diabetes, glucose
tolerance, obesity, appetite regulation,
lipidemia, osteoporosis, neuropeptide metabolism and T-cell activation, among
others.


The term "catalytic domain" as used herein, refers to the catalytic domain of
the DPP-IV class of
enzymes, which features a conserved segment of amino acids in the carboxy-
terminal portion of the
proteins, wherein this segment has been demonstrated to include the catalytic
site of these enzymes.
This conserved catalytic domain extends approximately from residue 552 to 766
of the full-length enzyme
of DPP-IV (SEQ ID NO:1).
"To clone" as used herein, means obtaining exact copies of a given
polynucleotide molecule using
recombinant DNA technology. Furthermore, "to clone into" may be meant as
inserting a given first
polynucleotide sequence into a second polynucleotide sequence, preferably such
that a functional unit
combining the functions of the first and the second polynucleotides results.
For example, without
limitation, a polynucleotide from which a fusion protein may be
translationally provided, which fusion
protein comprises amino acid sequences encoded by the first and the second
polynucleotide sequences.
Specifics of molecular cloning can be found in a number of commonly used
laboratory protocol books
such as Molecular Cloning: A Laboratory Manual, 2nd Ed., ed. by Sambrook,
Fritsch and Maniatis (Cold
Spring Harbor Laboratory Press: 1989).
The term "co-crystallization" as used herein is taken to mean crystallization
of a preformed
protein/ligand complex.
The terms "complex" and "co-complex" are used interchangeably and refer to a
DPP-IV molecule, or
a variant or homologue of DPP-IV, in covalent or non-covalent association with
a substrate, ligand, or
inhibitor.
The term "contacting" as used herein applies to in silico, in vitro, and/or in
vivo experiments.
"Diseases" and particularly "diseases that are associated with proteins that
are subject to
processing by DPP-IV", include, but are not limited to, for example, Type 2
diabetes, Type 1 diabetes,
impaired glucose tolerance, hyperglycemia, metabolic syndrome (syndrome X
and/or insulin resistance
syndrome), glucosuria, metabolic acidosis, arthritis, cataracts, diabetic
neuropathy, diabetic nephropathy,
diabetic retinopathy, diabetic cardiomyopathy, obesity, conditions exacerbated
by obesity, hypertension,
hyperlipidemia, atherosclerosis, osteoporosis, osteopenia, frailty, bone loss,
bone fracture, acute coronary
syndrome, short stature due to growth hormone deficiency, infertility due to
polycystic ovary syndrome,
anxiety, depression, insomnia, chronic fatigue, epilepsy, eating disorders,
chronic pain, alcohol addiction,
diseases associated with intestinal motility, ulcers, irritable bowel
syndrome, inflammatory bowel
syndrome; short bowel syndrome; and the prevention of disease progression in
Type 2 diabetes.
The term "extracellular domain" as used herein, refers to the extracellular
domain of DPP-IV,
which features a conserved segment of amino acids, whereby this segment has
been demonstrated to
include glycosylation sites, a cysteine-rich region and the catalytic active
site. This conserved
extracellular domain extends approximately from residue Gly31 to Pro766 of the
full length enzyme (SEQ
ID NO:1).
As used herein, the terms "gene", "recombinant gene" and "gene construct"
refer to a nucleic acid
comprising an open reading frame encoding a polypeptide, including both exon
and (optionally) intron
sequences. The term "intron" refers to a DNA sequence present in a given gene
which is not translated
into protein and is generally found between exons.

"Human DPP-IV" (hDPP-IV) is a cell surface type-II membrane glycoprotein and
is also called
adenosine deaminase (ADA) binding protein or CD26, which is known as a T-cell
activation antigen.


hDPP-IV is a single polypeptide chain of 766 amino acids, which consists of
five regions: a cytoplasmic
region (residues about 1-6), a transmembrane region (residues about 7-28), a
highly glycosylated region
(residues about 29-323), a cysteine-rich region (residues about 324-551), and
a catalytic region (residues
about 552-766) (Hiramatsu et al., (2003)).
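The region boundaries quoted above can be held in a small lookup table. The following is only an illustrative sketch: the names HDPP4_REGIONS and region_of are hypothetical and not part of the disclosure, and the boundaries are the approximate ("about") values given in the text, numbered according to SEQ ID NO:1.

```python
# Approximate hDPP-IV regions as quoted in the text (SEQ ID NO:1 numbering).
# The source gives these boundaries as "about", so treat them as approximate.
HDPP4_REGIONS = [
    ("cytoplasmic", 1, 6),
    ("transmembrane", 7, 28),
    ("highly glycosylated", 29, 323),
    ("cysteine-rich", 324, 551),
    ("catalytic", 552, 766),
]


def region_of(residue_number):
    """Return the name of the region containing the residue, or None."""
    for name, first, last in HDPP4_REGIONS:
        if first <= residue_number <= last:
            return name
    return None


# Ser630, a catalytic-triad residue, falls in the catalytic region.
print(region_of(630))
```

A flat list of inclusive ranges is used here because the five regions are contiguous and non-overlapping, so a linear scan is sufficient.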
The term "high affinity" as used herein means strong binding affinity between
molecules with a
dissociation constant KD of no greater than 1 µM. In a preferred case, the KD
is less than 100 nM, 10 nM,
1 nM, 100 pM, or even 10 pM or less. In a most preferred embodiment, the two
molecules can be
covalently linked (KD is essentially 0).
The term "homologue" as used herein means a protein, polypeptide,
oligopeptide, or portion
thereof, having preferably at least 95% amino acid sequence identity with DPP-
IV enzyme as described in
SEQ ID NO: 1 or SEQ ID NO:2 or with any extracellular domain described herein,
or with any functional or
structural domain of lipid binding protein. SEQ ID NO:1 is the full-length
amino acid sequence of a wild-
type human DPP-IV, while SEQ ID NO:2 is the hDPP-IV construct including
residues 31-766 which, as
described in the Examples, was purified, expressed and crystallized. Those of
skill in the art understand
that a set of structure coordinates determined by X-ray crystallography is not
without standard error. As
used herein, and for the purpose of this invention, the term "substantially
similar atomic coordinates" or
atomic coordinates that are "substantially similar" refers to any set of
structure coordinates of DPP-IV or
DPP-IV homologues, or DPP-IV variants, polypeptide fragments, described by
atomic coordinates that
have a root mean square deviation for the atomic coordinates of protein
backbone atoms (N, Cα, C, and
O) of less than about 2.5, 2.0, 1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å when
superimposed using backbone
atoms of structure coordinates listed in FIG. 2. For the purpose of this
invention, structures that have
substantially similar coordinates as those listed in FIG. 2 shall be
considered identical to the coordinates
listed in FIG. 2. The term "substantially similar" also applies to an assembly
of amino acid residues that
may or may not form a contiguous polypeptide chain, but whose three
dimensional arrangement of atomic
coordinates has a root mean square deviation for the atomic coordinates of
protein backbone atoms (N,
Cα, C, and O), or the side chain atoms, of less than about 2.5, 2.0, 1.5, 1.2,
1.0, 0.7, 0.5, or even 0.2 Å
when superimposed (using backbone atoms, or the side chain atoms) on the atomic
coordinates of similar
or the same amino acids from the coordinates listed in FIG. 2. Those skilled
in the art further understand
that the coordinates listed in FIG. 2 or portions thereof may be transformed
into a different set of
coordinates using various mathematical algorithms without departing from the
present invention. For
example, the coordinates listed in FIG. 2, or portions thereof, may be
transformed by algorithms which
translate or rotate the atomic coordinates. Alternatively, molecular
mechanics, molecular dynamics or ab
initio algorithms may modify the atomic coordinates. Atomic coordinates
generated from the coordinates
listed in FIG. 2, or portions thereof, using any of the aforementioned
algorithms shall be considered
identical to the coordinates listed in FIG. 2.
The term "in silico" as used herein refers to experiments carried out using
computer simulations.
In an embodiment, the in silico methods are molecular modeling methods
wherein 3-dimensional models
of macromolecules or ligands are generated. In other embodiments, the in
silico methods comprise
computationally assessing ligand binding interactions.
The term "ligand" describes any molecule, e.g., protein, peptide,
peptidomimetics, oligopeptide,
small organic molecule, polysaccharide, polynucleotide, etc., which binds or
interacts, generally but not
necessarily specifically to or with another molecule. In one aspect the ligand
is an agonist, whereby the
molecule upregulates (i.e., activates or stimulates, e.g., by agonizing or
potentiating) activity, while in
another aspect of the invention the ligand is an inhibitor or antagonist,
whereby the molecule
down-regulates (i.e., inhibits or suppresses, e.g., by antagonizing, decreasing or
inhibiting) the activity.
The term "modulate" as used herein refers to both upregulation (i.e.,
activation or stimulation, e.g.,
by agonizing or potentiating) and down-regulation (i.e., inhibition or
suppression, e.g., by antagonizing,
decreasing or inhibiting) of a bioactivity.
The term "pharmacophore" as used herein refers to the ensemble of steric and
electronic features
of a particular structure that is necessary to ensure the optimal
supramolecular interactions with a specific
biological target structure and to trigger (or to block) its biological
response. In an embodiment, a
pharmacophore is an abstract concept that accounts for the common molecular
interaction capacities of a
group of compounds towards their target structure. In yet a further
embodiment, the term can be
considered as the largest common denominator shared by a set of active
molecules. Pharmacophoric
descriptors are used to define a pharmacophore, including H-bonding,
hydrophobic and electrostatic
interaction sites, defined by atoms, ring centers and virtual points.
Accordingly, in the context of enzyme
ligands, such as for example agonists or antagonists, a pharmacophore may
represent an ensemble of
steric and electronic factors which are necessary to ensure supramolecular
interactions with a specific
biological target structure. As such, a pharmacophore may represent a template
of chemical properties for
an active site of a protein/enzyme representing these properties' spatial
relationship to one another that
theoretically defines a ligand that would bind to that site.
The term "precipitant" as used herein includes any substance that, when added
to a solution,
causes a precipitate to form or crystals to grow. Examples of suitable
precipitants include, but are not
limited to, alkali (e.g., Li, Na, or K), or alkaline earth metal (e.g., Mg, or
Ca) salts, and transition metal
(e.g., Mn, or Zn) salts. Common counter ions to the metal ions include, but
are not limited to, halides,
phosphates, citrates and sulfates.
The term "prodrug" as used herein refers to drugs that, once administered, are
chemically
modified by metabolic processes to become pharmaceutically active. In certain
embodiments the term
also refers to any compound that undergoes biotransformation before exhibiting
its pharmacological
effects. Prodrugs can thus be viewed as drugs containing specialized non-toxic
protective groups used in
a transient manner to alter or to eliminate properties, usually undesirable,
in the parent molecule.
The term "receptor" as used herein refers to a protein or a protein complex
in or on a cell that
specifically recognizes and binds to a compound acting as a molecular
messenger (neurotransmitter,
hormone, lymphokine, lectin, drug, etc.). In a broader sense, the term
receptor is used interchangeably
with any specific (as opposed to non-specific, such as binding to plasma
proteins) drug binding site, also
including nucleic acids such as DNA.
The term "recombinant protein" refers to a polypeptide which is produced by
recombinant DNA
techniques, wherein generally, DNA encoding a polypeptide is inserted into a
suitable expression vector
which is, in turn, used to transform a host cell to produce the polypeptide
encoded by the DNA. This
polypeptide may be one that is naturally expressed by the host cell, or it may
be heterologous to the host
cell, or the host cell may have been engineered to have lost the capability to
express the polypeptide
which is otherwise expressed in wild type forms of the host cell. The
polypeptide may also be, for
example, a fusion polypeptide. Moreover, the phrase "derived from", with
respect to a recombinant gene,
is meant to include within the meaning of "recombinant protein" those proteins
having an amino acid
sequence of a native polypeptide, or an amino acid sequence similar thereto
which is generated by
mutations, including substitutions, deletions and truncation, of a naturally
occurring form of the
polypeptide.
As used herein, the term "selective DPP-IV inhibitor" refers to a substance,
such as for example,
an organic molecule, that effectively inhibits an enzyme from the DPP-IV
family to a greater extent than
any other DPP enzyme. A selective DPP-IV inhibitor is a substance having a Ki
for inhibition of DPP-IV
that is less than about one-half, one-fifth, or one-tenth the Ki that the
substance has for inhibition of any
other DPP enzyme. In other words, the substance inhibits DPP-IV activity to
the same degree at a
concentration of about one-half, one-fifth, one-tenth or less of the
concentration required for any other
DPP enzyme. In general a substance is considered to effectively inhibit DPP-IV
if it has an IC50 or Ki of
less than or about 10 mM, 1 mM, 500 nM, 100 nM, 50 nM or 10 nM.
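The selectivity criterion above reduces to a comparison of inhibition constants. The following minimal sketch illustrates that comparison; the function names and the example Ki values are hypothetical, chosen only for illustration, and all Ki values are assumed to be in the same units.

```python
def selectivity_fold(ki_dpp4, ki_other_dpps):
    """Ratio of the lowest Ki against any other DPP enzyme to the Ki for DPP-IV."""
    return min(ki_other_dpps) / ki_dpp4


def is_selective(ki_dpp4, ki_other_dpps, max_ratio=0.5):
    """True if Ki(DPP-IV) is below max_ratio times the Ki for every other DPP enzyme.

    max_ratio = 0.5, 0.2 or 0.1 corresponds to the one-half, one-fifth and
    one-tenth criteria mentioned in the text.
    """
    return all(ki_dpp4 < max_ratio * ki for ki in ki_other_dpps)


# Hypothetical values in nM: 10 nM for DPP-IV, 500 nM and 2000 nM for two
# other DPP enzymes.
print(selectivity_fold(10.0, [500.0, 2000.0]))
print(is_selective(10.0, [500.0, 2000.0], max_ratio=0.1))
```

The comparison is made against every other DPP enzyme because the definition requires the criterion to hold for "any other DPP enzyme", so the least favourable off-target Ki decides the outcome.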
As used herein the term "small molecules" refers to drugs as they are orally
available (unlike
proteins which must be administered by injection, topically or by inhalation).
The size of the small molecules
is generally under 1000 Daltons, but many estimates seem to range between 300
and 700 Daltons.
The term "space group" refers to the lattice and symmetry of the crystal. In a
space group
designation the capital letter indicates the lattice type and the other
symbols represent symmetry
operations that can be carried out on the contents of the asymmetric unit
without changing its
appearance.
By "therapeutically effective" amount is meant that amount which is capable of
at least partially
reversing and/or treating the symptoms of the disease. A therapeutically
effective amount can be determined
on an individual basis and will be based, at least in part, on a consideration
of the species of the mammal,
the size of the mammal, the type of delivery system used, and the type of
administration relative to the
progression of the disease. A therapeutically effective amount can be
determined by one of ordinary skill
in the arts.
As used herein, the term "transfection" means the introduction of a nucleic
acid, e.g., via an
expression vector, into a recipient cell by nucleic acid-mediated gene
transfer. "Transformation" refers to a
process in which a cell's genotype is changed as a result of the cellular
uptake of exogenous DNA or RNA
and, for example, the transformed cell expresses a recombinant form of a
polypeptide or, in the case of
anti-sense expression from the transferred gene, the expression of a naturally-
occurring form of the
polypeptide is disrupted.
The term "variants" in relation to the polypeptide sequence in SEQ ID NO:1 or
SEQ ID NO:2
includes any substitution of, variation of, modification of, replacement of,
deletion of, or addition of one or
more amino acids from or to the sequence providing a resultant polypeptide
sequence for an enzyme
having DPP-IV activity. Preferably the variant, homologue, fragment or
portion, of SEQ ID NO:1 or SEQ
ID NO:2, comprises a polypeptide sequence of at least 5 contiguous amino
acids, preferably at least 10
contiguous amino acids, preferably at least 15 contiguous amino acids,
preferably at least 20 contiguous
amino acids, preferably at least 25 contiguous amino acids, or preferably at
least 30 contiguous amino
acids.


The term "vector" refers to a nucleic acid molecule capable of transporting
another nucleic acid to
which it has been linked. One type of preferred vector is an episome, i.e., a
nucleic acid capable of extra-
chromosomal replication. Suitable host-vector systems include but are not
limited to mammalian cell
systems infected with virus (e.g., vaccinia virus, adenovirus, etc.); insect
cell systems infected with virus
(e.g., baculovirus); microorganisms such as yeast containing yeast vectors; or
bacteria transformed with
bacteriophage DNA, plasmid DNA, or cosmid DNA. The expression elements of
vectors vary in their
strengths and specificities. Depending on the host-vector system
utilized, any one of a number of
suitable transcription and translation elements may be used. Preferred vectors
are those capable of
autonomous replication and/or expression of nucleic acids to which they are
linked. Vectors capable of
directing the expression of genes to which they are operatively linked are
referred to herein as
"expression vectors". In general, expression vectors of utility in recombinant
DNA techniques are often in
the form of "plasmids" which refer generally to circular double stranded DNA
loops which, in their vector
form are not bound to the chromosome. In the present specification, "plasmid"
and "vector" are used
interchangeably as the plasmid is the most commonly used form of vector.
However, the invention is
intended to include such other forms of expression vectors which serve
equivalent functions and which
become known in the art subsequently hereto.
The following amino acid abbreviations are used throughout this disclosure:
A = Ala = Alanine          T = Thr = Threonine
V = Val = Valine           C = Cys = Cysteine
L = Leu = Leucine          Y = Tyr = Tyrosine
I = Ile = Isoleucine       N = Asn = Asparagine
P = Pro = Proline          Q = Gln = Glutamine
F = Phe = Phenylalanine    D = Asp = Aspartic Acid
W = Trp = Tryptophan       E = Glu = Glutamic Acid
M = Met = Methionine       K = Lys = Lysine
G = Gly = Glycine          R = Arg = Arginine
S = Ser = Serine           H = His = Histidine
A. Clones and Expressions
As would be appreciated by those skilled in the art, the nucleotide sequence
coding for a DPP-IV
polypeptide, or a functional fragment, including the C-terminal peptide
fragment of the catalytic domain of
DPP-IV protein, and/or derivatives or analogs thereof, including a chimeric
protein, thereof, can be
inserted into a suitable expression vector, i.e., a vector which contains the
necessary elements for the
transcription and translation of the inserted protein-coding sequence. The
elements mentioned above are
termed herein a "promoter." Thus, the nucleic acid encoding a DPP-IV
polypeptide of the invention or a
functional fragment comprising the extracellular domain of the DPP-IV
protein, or a homologue, an
analog, or variant thereof, is operationally associated with a promoter in an
expression vector of the
invention. In preferred embodiments, the expression vector contains the
nucleotide sequence coding for
the polypeptide comprising the amino acid sequence spanning amino acids Gly31
to Pro766 listed in SEQ
ID NO:1. Both cDNA and genomic sequences can be cloned and expressed under the
control of such
regulatory sequences. An expression vector also preferably includes a
replication origin. The necessary
transcriptional and translational signals can be provided on a recombinant
expression vector. As detailed
below, all genetic manipulations described for the DPP-IV gene in this
section, may also be employed for
genes encoding a functional fragment, including the C-terminal peptide
fragment of the catalytic domain of
the DPP-IV protein, derivatives or analogs thereof, including a chimeric
protein thereof.


Suitable host-vector systems include but are not limited to mammalian cell
systems infected with
virus (e.g., vaccinia virus, adenovirus, etc.); insect cell systems infected
with virus (e.g., baculovirus);
microorganisms such as yeast containing yeast vectors; or bacteria transformed
with bacteriophage
DNA, plasmid DNA, or cosmid DNA. The expression elements of vectors vary in
their strengths and
specificities. Depending on the host-vector system utilized, any one
of a number of suitable
transcription and translation elements may be used.
A recombinant DPP-IV protein of the invention may be expressed chromosomally,
after
integration of the coding sequence by recombination. In this regard, any of a
number of amplification
systems may be used to achieve high levels of stable gene expression. (See
Sambrook et al., 1989,
infra).
A suitable cell for purposes of this invention is one into which the
recombinant vector comprising
the nucleic acid encoding DPP-IV protein has been introduced, and which is
cultured in an appropriate cell culture medium under
conditions that provide for expression of DPP-IV protein by the cell.
Any of the methods previously described for the insertion of DNA fragments
into a cloning vector
may be used to construct expression vectors containing a gene consisting of
appropriate
transcriptional/translational control signals and the protein coding
sequences. These methods may include
in vitro recombinant DNA and synthetic techniques, and in vivo recombination
(genetic recombination).
Expression of the DPP-IV protein may be controlled by any promoter/enhancer
element known in
the art, provided that these regulatory elements must be functional in the
host selected for expression, as
would be appreciated by those of skill in the art.
Vectors containing a nucleic acid encoding a DPP-IV protein of the invention
can be identified, for
example, by four general approaches: (1) PCR amplification of the desired
plasmid DNA or specific
mRNA; (2) nucleic acid hybridization; (3) presence or absence of selection
marker gene functions; and (4)
expression of inserted sequences. The invention is further intended to include
other forms of identification
of vectors, containing a nucleic acid encoding a DPP-IV protein of the present
invention, which serve
equivalent functions and which become known in the art subsequently hereto. In
the first approach, the
nucleic acids can be amplified by PCR to provide for detection of the
amplified product. In the second
approach, the presence of a foreign gene inserted in an expression vector can
be detected by nucleic acid
hybridization using probes comprising sequences that are homologous to an
inserted marker gene. In the
third approach, the recombinant vector/host system can be identified and
selected based upon the
presence or absence of certain "selection marker" gene functions (e.g., beta-
galactosidase activity,
thymidine kinase activity, resistance to antibiotics, transformation
phenotype, occlusion body formation in
baculovirus, etc.) caused by the insertion of foreign genes in the vector. In
another example, if the nucleic
acid encoding DPP-IV protein is inserted within the "selection marker" gene
sequence of the vector,
recombinant vectors containing the DPP-IV protein insert can be identified by
the absence of the DPP-IV
protein gene function. In the fourth approach, recombinant expression vectors
can be identified by
assaying for the activity, biochemical, or immunological characteristics of
the gene product expressed by
the recombinant vector, provided that the expressed protein assumes a
functionally active conformation.
A wide variety of host/expression vector combinations may be employed in
expressing the DNA
sequences of this invention as known by those of skill in the art.


Once a particular recombinant DNA molecule is identified and isolated, several
methods known in
the art may be used to propagate it. Once a suitable host system and growth
conditions are established,
recombinant expression vectors can be propagated and prepared in quantity. As
noted above, the
expression vectors which can be used include, but are not limited to, the
following vectors or their
derivatives: human or animal viruses such as vaccinia virus or adenovirus;
insect viruses such as
baculovirus; yeast vectors; bacteriophage vectors (e.g., lambda); and plasmid
and cosmid DNA vectors,
to name but a few.
Vectors can be introduced into the desired host cells by methods known in the
art, e.g.,
transfection, electroporation, microinjection, transduction, cell fusion, DEAE
dextran, calcium phosphate
precipitation, lipofection (lysosome fusion), use of a gene gun, or a DNA
vector transporter (see, e.g., Wu
et al., 1992, J. Biol. Chem. 267:963-967; Wu and Wu, 1988, J. Biol. Chem.
263:14621-14624; Hartmut et
al., Canadian Patent Application No. 2,012,311, filed Mar. 15, 1990).

B. Crystal and Space Groups
X-ray structure coordinates define a unique configuration of points in space.
Those skilled in the
art understand that a set of structure coordinates for a protein or a
protein/ligand complex, or a portion
thereof, define a relative set of points that, in turn, define a configuration
in three dimensions. A similar or
identical configuration can be defined by an entirely different set of
coordinates, provided the distances
and angles between atomic coordinates remain essentially the same. In
addition, a scalable configuration
of points can be defined by increasing or decreasing the distances between
coordinates by a scalar factor
while keeping the angles essentially the same.
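The point made above, that a configuration is defined by its internal distances and angles and not by the particular coordinate frame, can be checked numerically. The sketch below uses four arbitrary, hypothetical points (not coordinates from FIG. 2) and hypothetical helper names. It shows that a rotation plus translation leaves all pairwise distances unchanged, while a scalar factor multiplies every distance by the same amount, which leaves the angles unchanged.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """All inter-point distances, in a fixed pair order."""
    return [math.dist(a, b) for a, b in combinations(points, 2)]


def rotate_z_and_translate(points, angle_deg, shift):
    """Rotate about the z axis by angle_deg, then translate by shift."""
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
            for x, y, z in points]


def scale(points, factor):
    """Multiply every coordinate by a scalar factor."""
    return [(factor * x, factor * y, factor * z) for x, y, z in points]


# Arbitrary illustrative points.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
moved = rotate_z_and_translate(points, 30.0, (5.0, -2.0, 1.0))
scaled = scale(points, 2.0)

d0 = pairwise_distances(points)
# Rigid motion: same distances, hence the same configuration.
print(all(abs(a - b) < 1e-9 for a, b in zip(d0, pairwise_distances(moved))))
# Uniform scaling: every distance multiplied by the same factor.
print(all(abs(2.0 * a - b) < 1e-9 for a, b in zip(d0, pairwise_distances(scaled))))
```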
One aspect of the present invention relates to a crystalline composition
comprising a polypeptide
with an amino acid sequence spanning amino acids Gly31 to Pro766 listed in SEQ
ID NO:1.
In another embodiment, the crystallized complex is characterized by the
structural coordinates
listed in FIG. 2, or portions thereof. In yet a further embodiment, the atoms
of the ligand are within
about 5 angstroms of one or more DPP-IV amino acids in SEQ ID NO:1 preferably
selected from Glu205,
Glu206, Tyr547, Ser630, Tyr631, Tyr662, Tyr666, Asp708, Asn710 and His740. One
embodiment of the
crystallized complex is characterized as belonging to the space group P4₃2₁2
and having unit cell dimensions
a = b = 68.7 Å, c = 421.2 Å, α = β = γ = 90.0°. This embodiment is encompassed by the
structural coordinates of
FIG. 2. The ligand may be a small molecule which binds to DPP-IV extracellular
domain defined by SEQ
ID NO:2, or portions thereof, with a Ki of less than about 10 mM, 1 mM, 500
nM, 100 nM, 50 nM, or 10
nM.
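The 5 angstrom proximity criterion above can be evaluated directly from atomic coordinates. The following is a minimal sketch under stated assumptions: the function name residues_near_ligand, the (residue label, coordinate) input layout and the toy coordinates are all hypothetical, and are not taken from FIG. 2.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return residue labels having at least one atom within cutoff of the ligand.

    protein_atoms: list of (residue_label, (x, y, z)) tuples, one per atom.
    ligand_atoms:  list of (x, y, z) tuples.
    Distances are in angstroms; labels are returned in order of first hit.
    """
    near = []
    for label, xyz in protein_atoms:
        if label in near:
            continue  # this residue already qualified through another atom
        if any(math.dist(xyz, lig) <= cutoff for lig in ligand_atoms):
            near.append(label)
    return near


# Toy coordinates for illustration only.
protein = [
    ("Ser630", (0.0, 0.0, 0.0)),
    ("Ser630", (1.0, 0.0, 0.0)),
    ("Tyr547", (3.0, 0.0, 0.0)),
    ("Arg125", (20.0, 0.0, 0.0)),
]
ligand = [(0.0, 3.0, 0.0)]
print(residues_near_ligand(protein, ligand))
```

A residue is counted once as soon as any of its atoms falls within the cutoff, which matches the wording "atoms of the ligand are within about 5 angstroms of one or more DPP-IV amino acids".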
Various computational methods can be used to determine whether a molecule or a
binding pocket
portion thereof is "structurally equivalent," defined in terms of its three-
dimensional structure, to all or part
of DPP-IV or its binding pocket(s). Such methods may be carried out in current
software applications,
such as the molecular similarity application of QUANTA (Accelrys Inc., San
Diego, Calif.). The molecular
similarity application permits comparisons between different structures,
different conformations of the
same structure, and different parts of the same structure. The procedure used
in molecular similarity to
compare structures is divided into four steps: (1) load the structures to be
compared; (2) optionally define
the atom equivalences in these structures; (3) perform a fitting operation;
and (4) analyze the results.
Each structure is identified by a name. One structure is identified as the
target (i.e., the fixed structure),
while all remaining structures are working structures (i.e., moving
structures). Since atom equivalency
within molecular similarity applications is defined by user input, for the
purpose of this invention equivalent
atoms are defined as protein backbone atoms (N, Cα, C, and O) for all
conserved residues between the
two structures being compared. A conserved residue is defined as a residue
that is structurally or
functionally equivalent (See Table 4 infra). In further embodiments rigid
fitting operations are considered.
In other embodiments, flexible fitting operations may be considered.
When a rigid fitting method is used, the working structure is translated and
rotated to obtain an
optimum fit with the target structure. The fitting operation uses an algorithm
that computes the optimum
translation and rotation to be applied to the moving structure, such that the
root mean square deviation of
the fit over the specified pairs of equivalent atoms is an absolute minimum.
This number, given in
angstroms (Å), is reported by the molecular similarity application.
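A rigid fit of the kind described above can be sketched in a few lines. The example below is not the algorithm used by QUANTA, which the text does not specify. It substitutes the unit-quaternion least-squares method of Horn (1987), with the leading eigenvector found by shifted power iteration so that no external library is needed. The names rigid_fit, rmsd and centroid, the iteration count and the test points are all assumptions made for illustration.

```python
import math


def centroid(points):
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))


def rmsd(a, b):
    """Root mean square deviation over paired points, without any fitting."""
    sq = sum((pa[i] - pb[i]) ** 2 for pa, pb in zip(a, b) for i in range(3))
    return math.sqrt(sq / len(a))


def rigid_fit(moving, target, iterations=500):
    """Translate and rotate `moving` onto `target`; return (fitted, rmsd)."""
    cm, ct = centroid(moving), centroid(target)
    P = [tuple(p[i] - cm[i] for i in range(3)) for p in moving]
    Q = [tuple(q[i] - ct[i] for i in range(3)) for q in target]
    # Cross-covariance S[a][b] = sum over atoms of P_a * Q_b.
    S = [[sum(p[a] * q[b] for p, q in zip(P, Q)) for b in range(3)]
         for a in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = S
    # Horn's symmetric 4x4 matrix; its top eigenvector is the best quaternion.
    N = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    # Gershgorin shift makes N + shift*I positive semidefinite, so power
    # iteration converges to the eigenvector of the largest eigenvalue.
    shift = max(sum(abs(v) for v in row) for row in N)
    q = [1.0, 0.5, 0.25, 0.125]  # arbitrary asymmetric starting vector
    for _ in range(iterations):
        nxt = [sum(N[r][c] * q[c] for c in range(4)) + shift * q[r]
               for r in range(4)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            break  # degenerate input (all points coincide)
        q = [v / norm for v in nxt]
    w, x, y, z = q
    R = [
        [w * w + x * x - y * y - z * z, 2 * (x * y - w * z), 2 * (x * z + w * y)],
        [2 * (x * y + w * z), w * w - x * x + y * y - z * z, 2 * (y * z - w * x)],
        [2 * (x * z - w * y), 2 * (y * z + w * x), w * w - x * x - y * y + z * z],
    ]
    fitted = [tuple(sum(R[i][j] * p[j] for j in range(3)) + ct[i]
                    for i in range(3)) for p in P]
    return fitted, rmsd(fitted, target)


# Illustrative check: the target is the moving set rotated 90 degrees about z
# and translated, so the fitted RMSD should be close to zero.
moving = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
target = [(-y + 5.0, x - 2.0, z + 1.0) for x, y, z in moving]
fitted, fit_rmsd = rigid_fit(moving, target)
print(rmsd(moving, target), fit_rmsd)
```

Power iteration is chosen here only to keep the sketch dependency-free; it can converge slowly for nearly collinear point sets, where a full eigen-decomposition or a singular value decomposition (as in the Kabsch method) would be more robust.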
Any molecule or molecular complex or binding pocket thereof, or any portion
thereof, that has a
root mean square deviation of conserved residue backbone atoms (N, Cα, C, and
O) of less than about
2.5 Å, 2.0 Å, 1.5 Å, 1.0 Å, 0.7 Å, 0.5 Å or even 0.2 Å, when superimposed on the
relevant backbone atoms
described by the reference structure coordinates listed in FIG. 2, is
considered "structurally equivalent" to
the reference molecule. That is to say, the crystal structures of those
portions of the two molecules are
substantially identical, within acceptable error. Particularly preferred
structurally equivalent molecules or
molecular complexes are those that are defined by the entire set of structural
coordinates listed in FIG. 2,
plus or minus a root mean square deviation from the conserved backbone atoms
of those amino acids of
not more than 2.5 Å. More preferably, the root mean square deviation is less
than about 1.0 Å.
The term "root mean square deviation" means the square root of the arithmetic
mean of the
squares of the deviations. It is a way to express the deviation or variation
from a trend or object. For
purposes of this invention, the "root mean square deviation" defines the
variation in the backbone of a
protein from the backbone of DPP-IV or a binding pocket portion thereof, as
defined by the structural
coordinates of DPP-IV described herein.
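Written out as a formula, the definition above takes the following form. The symbols are introduced here for illustration and do not appear in the source: N is the number of equivalent backbone atoms compared, r_i is the position of atom i in the structure being tested after superposition, and r_i^ref is the position of the equivalent atom in the reference DPP-IV coordinates.

```latex
\mathrm{RMSD} \;=\; \sqrt{\frac{1}{N}\sum_{i=1}^{N}
  \left\lVert \mathbf{r}_i - \mathbf{r}_i^{\mathrm{ref}} \right\rVert^{2}}
```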
The refined X-ray coordinates of the extracellular domain of DPP-IV (amino
acids 38 to 766 as
listed in SEQ ID NO:2), Zn2+, Mg2+, and 32 water molecules are as listed in
FIG. 2.
One orthogonal view of the molecule is shown in FIG. 1.

The crystal structure of the extracellular domain (amino acids 31-766 of SEQ
ID NO:1) was solved to a resolution of 2.7 Å. The asymmetric unit is composed
of one dimer. The structure of the present invention includes two domains, the
β-propeller domain (residues 55-497) and the catalytic domain (residues
508-766), together with two linker regions (1-54 and 498-507). The propeller
domain packs against the hydrolase domain, and the catalytic triad of DPP-IV,
composed of residues Ser630, His740 and Asp708, which are located within the
last 140 residues of the C-terminal region, is at the interface of the two
domains.
The present invention provides a molecule or molecular complex that includes
at least a portion of
a DPP-IV and/or substrate binding pocket. In one embodiment, the DPP-IV
binding pocket includes the
amino acids listed in Table 1, the binding pocket being defined by a set of
points having a root mean
square deviation of less than about 2.5, 2.0, 1.5, 1.2, 1.0, 0.7, 0.5, or even
0.2 Å, from points representing
the backbone atoms of the amino acids in Table 1. In another embodiment, the
DPP-IV substrate binding
pocket includes the amino acids selected from Glu205, Glu206, Tyr547, Ser630,
Tyr631, Tyr662, Tyr666,
Asp708, Asn710 and His740 from SEQ ID NO:1.


CA 02569095 2006-11-28
WO 2005/119526 - 16 - PCT/IB2005/001630
Table 1: Identified residues 5 Å away from the binding pocket of the DPP-IV
crystal structure.
Arg125 His126 Trp201 Glu205 Glu206 Val207 Phe208 Ser209
Ala210 Tyr256 Arg356 Phe357 Arg358 Tyr547 Gly549 Pro550
Ser551 Tyr558 Trp627 Trp629 Ser630 Tyr631 Val653 Ala654
Val656 Tyr662 Asp663 Tyr666 Asn710 Val711 Gln715 His740
Gly741 Ile742 His748 Tyr752
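
A residue list such as Table 1 is typically produced by a distance search. The sketch below shows one plausible reading of the 5 Å criterion, namely that a residue is selected when any of its atoms lies within 5 Å of any reference point in the pocket (for example, an atom of a bound ligand). The source does not state how the table was generated, and the function name, input layout and residue-label format are assumptions made for illustration.

```python
import math


def residues_within(protein_atoms, pocket_points, cutoff=5.0):
    """Return residue labels having any atom within `cutoff` of the pocket.

    protein_atoms: list of (residue_label, (x, y, z)), e.g. ("Ser630", ...)
    pocket_points: list of (x, y, z) reference points, e.g. ligand atoms
    The labels are assumed to be a three-letter code followed by a number.
    """
    selected = set()
    for label, xyz in protein_atoms:
        if any(math.dist(xyz, point) <= cutoff for point in pocket_points):
            selected.add(label)
    # Sort by residue number so the output reads like the table.
    return sorted(selected, key=lambda name: int(name[3:]))
```

The coordinates passed in would come from the FIG. 2 coordinate listing; any example values used to try the function are hypothetical.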
C. Isolated Polypeptides and Variants

One embodiment of the invention describes an isolated polypeptide consisting
of a portion of
DPP-IV which functions as the binding site when folded in a 3-D orientation.
One embodiment is an
isolated polypeptide comprising a portion of DPP-IV, wherein the portion
starts at about amino acid
residue Gly31, and ends at about amino acid residue Pro766 as described in SEQ
ID NO:1, or a
sequence that is at least 95% or 98% homologous to a polypeptide with an amino
acid sequence
spanning amino acids Gly31 to Pro766 as listed in SEQ ID NO:1.
Another embodiment comprises crystalline compositions comprising variants of
DPP-IV. Variants
of the present invention may have an amino acid sequence that is different by
one or more amino acid
substitutions to the sequence disclosed in SEQ ID NO:1 or SEQ ID NO:2.
Embodiments which comprise
amino acid deletions and/or additions are also provided. The variant may, for
example, have conservative
changes (amino acid similarity), wherein a substituted amino acid has similar
structural or chemical
properties, for example, the replacement of leucine with isoleucine. Those
skilled in the art will
understand that determining which and how many amino acid residues may be
substituted, inserted, or
deleted without adversely affecting biological or pharmacological activity may
be reasonably inferred in
view of this disclosure, and may further be found using computer programs well
known in the art, for
example, DNAStar software (DNAStar Inc. Madison, WI).
As those skilled in the art will appreciate, amino acid substitutions may be
made, for instance, on
the basis of similarity in polarity, charge, solubility, hydrophobicity,
hydrophilicity, and/or the amphipathic
nature of the residues provided that a biological and/or pharmacological
activity of the native molecule is
retained.
Negatively charged amino acids include aspartic acid and glutamic acid;
positively charged amino
acids include lysine and arginine; amino acids with uncharged polar head
groups having similar
hydrophilicity values include leucine, isoleucine, and valine; amino acids
with aliphatic head groups
include glycine, alanine, asparagine, glutamine, and serine; and amino acids
with aromatic side chains include
threonine, phenylalanine, and tyrosine.
Examples of conservative substitutions are set forth in Table 4 as follows:


Table 4:
Original Residue    Example conservative substitutions
Ala (A)             Gly; Ser; Val; Leu; Ile; Pro
Arg (R)             Lys; His; Gln; Asn
Asn (N)             Gln; His; Lys; Arg
Asp (D)             Glu
Cys (C)             Ser
Gln (Q)             Asn
Glu (E)             Asp
Gly (G)             Ala; Pro
His (H)             Asn; Gln; Arg; Lys
Ile (I)             Leu; Val; Met; Ala; Phe
Leu (L)             Ile; Val; Met; Ala; Phe
Lys (K)             Arg; Gln; His; Asn
Met (M)             Leu; Tyr; Ile; Phe
Phe (F)             Met; Leu; Tyr; Val; Ile; Ala
Pro (P)             Ala; Gly
Ser (S)             Thr
Thr (T)             Ser
Trp (W)             Tyr; Phe
Tyr (Y)             Trp; Phe; Thr; Ser
Val (V)             Ile; Leu; Met; Phe; Ala

"Homology" is a measure of the identity of nucleotide sequences or amino acid
sequences. To
characterize the homology, subject sequences are aligned so that the highest
percentage homology
(match) is obtained, after introducing gaps, if necessary, to achieve maximum
percent homology. N- or C-terminal extensions shall not be construed as
affecting homology. "Identity" per se has an art-recognized
meaning and can be calculated using published techniques. Computer program
methods to determine
identity between two sequences, for example, include DNAStar® software; the
GCG® program package
(Devereux, J., et al. Nucleic Acids Research (1984) 12(1): 387); BLASTP,
BLASTN, FASTA (Altschul, S.F.
et al., J. Molec Biol (1990) 215: 403). Homology (identity) as defined herein
is determined conventionally
using the well-known computer program, BESTFIT® (Wisconsin Sequence Analysis
Package, Version 8
for Unix, Genetics Computer Group, University Research Park, 575 Science
Drive, Madison, WI, 53711).
When using BESTFIT® or any other sequence alignment program (such as the
Clustal algorithm from
MegAlign software (DNAStar®)) to determine whether a particular sequence is,
for example, about 95%
homologous to a reference sequence, according to the present invention, the
parameters are set such
that the percentage of identity is calculated over the full length of the
reference nucleotide sequence or
amino acid sequence and that gaps in homology of up to about 90% of the total
number of nucleotides in
the reference sequence are allowed.


Ninety percent homology is therefore determined, for example, using the
BESTFIT® program with
parameters set such that the percentage of identity is calculated over the
full length of the reference
sequence, e.g., SEQ ID NO:1, and wherein up to 5% of the amino acids in the
reference sequence may
be substituted with another amino acid. Percent homologies are likewise
determined, for example, to
identify preferred species, within the scope of the claims appended hereto,
which reside within the range
of about 95% to 100% homology to SEQ ID NO:1 as well as the binding site
thereof. As noted above, N-
or C-terminal extensions shall not be construed as affecting homology. Thus,
when comparing two
sequences, the reference sequence is generally the shorter of the two
sequences. This means that, for
example, if a sequence of 50 nucleotides in length with precise identity to a
50 nucleotide region within a
100 nucleotide polynucleotide is compared, there is 100% homology as opposed
to only 50% homology.
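The convention described above, in which identity is scored over the full length of the shorter (reference) sequence and terminal extensions are ignored, can be illustrated with a small sketch. This is not the BESTFIT algorithm: it only slides the shorter sequence along the longer one without gaps and reports the best match. The function name `percent_identity` is a hypothetical helper introduced here.

```python
def percent_identity(seq_a, seq_b):
    """Best ungapped percent identity, scored over the shorter sequence.

    Illustrative stand-in for an alignment program: no gaps are introduced,
    and overhanging ends of the longer sequence do not count against the score.
    """
    reference, other = sorted((seq_a, seq_b), key=len)
    if not reference:
        raise ValueError("sequences must not be empty")
    best = 0
    for offset in range(len(other) - len(reference) + 1):
        window = other[offset:offset + len(reference)]
        matches = sum(r == o for r, o in zip(reference, window))
        best = max(best, matches)
    return 100.0 * best / len(reference)
```

With this scoring, a 50-nucleotide sequence that exactly matches a region inside a 100-nucleotide sequence scores 100%, as in the example given in the text, rather than 50%.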
Although the natural polypeptide of SEQ ID NO:1 and a variant polypeptide may
only possess a
certain percentage identity, e.g., 95%, they are actually likely to possess a
higher degree of similarity,
depending on the number of dissimilar codons that are conservative changes.
Conservative amino acid
substitutions can frequently be made in a protein without altering either the
conformation or function of the
protein. Similarity between two sequences includes direct matches as well as
conserved amino acid
substitutions which possess similar structural or chemical properties, e.g.,
similar charge as described in
Table 2.
Percentage similarity (conservative substitutions) between two polypeptides
may also be scored
by comparing the amino acid sequences of the two polypeptides by using
programs well known in the art,
including the BESTFIT program, by employing default settings for determining
similarity.
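The difference between identity and similarity can be made concrete with a short sketch that scores two pre-aligned, equal-length sequences against the substitutions listed in Table 4. The one-letter encoding of Table 4, the function name and the requirement that the inputs are already aligned without gaps are assumptions made for illustration; an alignment program with its own default scoring matrix would give different numbers.

```python
# Table 4 transcribed with one-letter codes: original residue -> substitutions.
CONSERVATIVE = {
    "A": "GSVLIP", "R": "KHQN", "N": "QHKR", "D": "E", "C": "S",
    "Q": "N", "E": "D", "G": "AP", "H": "NQRK", "I": "LVMAF",
    "L": "IVMAF", "K": "RQHN", "M": "LYIF", "F": "MLYVIA", "P": "AG",
    "S": "T", "T": "S", "W": "YF", "Y": "WFTS", "V": "ILMFA",
}


def identity_and_similarity(reference, variant):
    """Return (percent identity, percent similarity) for aligned sequences.

    A position counts as similar when the residues are identical or when the
    variant residue is a Table 4 substitution for the reference residue.
    """
    if len(reference) != len(variant) or not reference:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = sum(r == v for r, v in zip(reference, variant))
    similar = sum(r == v or v in CONSERVATIVE.get(r, "")
                  for r, v in zip(reference, variant))
    n = len(reference)
    return 100.0 * identical / n, 100.0 * similar / n
```

For example, the hypothetical pair `"ADLKS"` and `"ADIRT"` shares only two identical positions, but the remaining three differences (Leu to Ile, Lys to Arg, Ser to Thr) are all listed in Table 4, so the similarity is higher than the identity.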
A further embodiment of the invention is a crystal comprising the coordinates
of FIG. 2, wherein
the amino acid sequence is represented by SEQ ID NO:1. A further embodiment of
the invention is a
crystal comprising the coordinates of FIG. 2, wherein the amino acid sequence
is at least 95% or 98%
homologous to the amino acid sequence represented by SEQ ID NO:1.
Various methods for obtaining atomic coordinates of structurally homologous
molecules and
molecular complexes using homology modeling are disclosed in, for example, US
Patent No: 6,356,845.
D. Structure Based Drug Design

Once the three-dimensional structure of a crystal comprising a DPP-IV protein,
a functional
domain thereof, homologue, analogue or variant thereof, is determined, a
ligand (antagonist or agonist)
may be examined through the use of computer modeling using a docking program
such as GRAM,
DOCK, or AUTODOCK (See for example, Morris et al., J. Computational Chemistry,
19:1639-1662
(1998)). This procedure can include in silico fitting of potential ligands to
the DPP-IV crystal structure to
ascertain how well the shape and the chemical structure of the potential
ligand will complement or
interfere with the catalytic domain of DPP-IV. (Bugg et al., Scientific
American, December:92-98 (1993);
West et al., TIPS, 16:67-74 (1995)). As those of skill in the art can
appreciate, computer programs can
also be employed to estimate the attraction, repulsion, and steric hindrance
of the ligand to the binding
site. Generally the tighter the fit (e.g., the lower the steric hindrance,
and/or the greater the attractive
force) the more potent the potential drug will be since these properties are
consistent with a tighter binding
constant. Furthermore, the more specificity in the design of a potential drug
the more likely that the drug
will not interfere with the properties of other proteins. This will minimize
potential side-effects due to
unwanted interactions with other proteins.


One embodiment of the present invention relates to methods of identifying
agents that bind to a
binding site on DPP-IV extracellular domain wherein the binding site comprises
amino acid residues Glu205,
Glu206, Tyr547, Ser630, Tyr631, Tyr662, Tyr666, Asp708, Asn710 and His740 of
SEQ ID NO:1,
comprising: contacting DPP-IV with a test ligand under conditions suitable for
binding of the test compound
to the binding site, and determining whether the test ligand binds in the
binding site, wherein if binding
occurs, the test ligand is an agent that binds in the binding site. In further
embodiments, the testing may be
carried out in silico using a variety of molecular modeling software
algorithms including, but not limited to,
DOCK, ALADDIN, CHARMM simulations, AFFINITY, C2-LIGAND FIT, Catalyst, LUDI,
CAVEAT, and
CONCORD (Brooks, et al. CHARMM: a program for macromolecular energy,
minimization, and dynamics
calculations. J Comp Chem 1983, 4:187-217; E.C. Meng, B.K. Shoichet & I.D.
Kuntz. Automated docking
with grid-based energy evaluation. J Comp Chem 1992, 13:505-524).
In another embodiment, a potential ligand may be obtained by screening a
random peptide library
produced by a recombinant bacteriophage (Scott and Smith, Science, 249:386-390
(1990); Cwirla et al.,
Proc. Natl. Acad. Sci., 87:6378-6382 (1990); Devlin et al., Science, 249:404-
406 (1990)) or a chemical
library, or the like. A ligand selected in this manner can then be
systematically modified by computer
modeling programs until one or more promising potential ligands are
identified. Such analysis, for
example, has been shown to be effective in the development of HIV protease
inhibitors. (Lam et al.,
Science 263:380-384 (1994); Wlodawer et al., Ann. Rev. Biochem. 62:543-585
(1993); Appelt,
Perspectives in Drug Discovery and Design 1:23-48 (1993); Erickson,
Perspectives in Drug Discovery and
Design 1:109-128 (1993)).
Computer modeling allows the selection of a finite number of rational chemical
modifications, as
opposed to the countless number of essentially random chemical modifications
that could be made, of
which any one might lead to a useful drug. Each chemical modification requires
additional chemical steps,
which, while reasonable for the synthesis of a finite number of
compounds, quickly become
overwhelming if all possible modifications must
actually be synthesized. Thus,
through the use of the three-dimensional structure disclosed herein and
computer modeling, a large
number of these compounds can be rapidly screened on a computer monitor
screen, and a few likely
candidates can be determined without the laborious synthesis of untold numbers
of compounds.
Once a potential ligand (agonist or antagonist) is identified, it can be
either selected from a library
of chemicals as are commercially available from most large chemical companies
or, alternatively, the
potential ligand may be synthesized de novo. As mentioned above, the de novo
synthesis of one or even
a relatively small group of specific compounds is reasonable in the art of
drug design. The potential ligand
can be placed into any standard binding assay as well known to those skilled
in the art to test its effect on
DPP-IV activity.
When a suitable drug is identified, a supplemental crystal can be grown
comprising a protein-
ligand complex formed between a DPP-IV protein and the drug. Preferably the
crystal diffracts X-rays
allowing the determination of the atomic coordinates of the protein-ligand
complex to a resolution of less
than 5.0 Angstroms, more preferably less than 3.0 Angstroms, and even more
preferably less than 2.0
Angstroms. The three-dimensional structure of the supplemental crystal can be
determined by Molecular
Replacement Analysis. Molecular replacement uses a known three-dimensional
structure as a search
model to determine the structure of a closely related molecule or protein-
ligand complex in a new crystal
form. The measured X-ray diffraction properties of the new crystal are
compared with the search model
structure to compute the position and orientation of the protein in the new
crystal. Computer programs that
can be used include: X-PLOR and AMORE (J. Navaza, Acta Crystallographica A50,
157-163 (1994)). As
those of skill in the art can appreciate, once the position and orientation
are known, an electron density
map can be calculated using the search model to provide X-ray phases.
Thereafter, the electron density is
inspected for structural differences, and the search model is modified to
conform to the new structure.
Using this approach, the claimed structure of DPP-IV can
be used to solve the three-
dimensional structures of any such DPP-IV complexed with a ligand. Other
suitable computer programs
that can be used to solve the structures of such crystals include:
QUANTA; CHARMM; INSIGHT;
SYBYL; MACROMODEL; and ICM.
Suitable in silico methods for screening, designing or selecting ligands are
disclosed in, for
example, U.S. Patent No. 6,356,845.

E. Ligands
In one aspect, the present invention discloses binding agents which interact
with a binding site of
DPP-IV defined by a set of points having a root mean square deviation of less
than about 2.5, 2.0, 1.7,
1.5, 1.2, 1.0, 0.7, 0.5, or even 0.2 Å from points representing the backbone
atoms of the amino acids
represented by the structure coordinates listed in FIG. 2. Such embodiments
represent variants of the
DPP-IV crystal.
In another aspect, the present invention provides ligands which bind to a
folded polypeptide
comprising an amino acid sequence spanning amino acids 31 to 766 listed in SEQ
ID NO:1, or a
homologue or variant thereof. In further embodiments, the ligand is a
competitive or uncompetitive
inhibitor of DPP-IV. In yet further embodiments the ligand inhibits DPP-IV
with an IC50 or Ki of less than
about 10 mM, 1 mM, 500 nM, 100 nM, 50 nM or 10 nM. In yet further embodiments,
the ligand inhibits
DPP-IV with a Ki that is less than about one-half, one-fifth, or one-tenth the
Ki that the substance has for
inhibition of any other DPP-IV enzyme. In other words, the substance inhibits
DPP-IV activity to the same
degree at a concentration of about one-half, one-fifth, one-tenth or less than
the concentration required for
any other DPP enzyme.
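The selectivity criterion above amounts to a simple ratio of inhibition constants. The sketch below expresses it as a fold-selectivity value and checks it against the one-half, one-fifth and one-tenth levels named in the text. The function names are hypothetical, and any Ki values used to try the code are made up for illustration; no such values are given in the source.

```python
def selectivity_fold(ki_dpp4, ki_other):
    """Ratio Ki(other enzyme) / Ki(DPP-IV); both values in the same units."""
    if ki_dpp4 <= 0 or ki_other <= 0:
        raise ValueError("Ki values must be positive")
    return ki_other / ki_dpp4


def selectivity_levels(ki_dpp4, ki_other, folds=(2, 5, 10)):
    """Map each fold level to True if Ki(DPP-IV) is below Ki(other) / fold."""
    ratio = selectivity_fold(ki_dpp4, ki_other)
    return {fold: ratio > fold for fold in folds}
```

A Ki for DPP-IV that is less than one-fifth of the Ki for another enzyme corresponds to a ratio greater than 5 in this notation.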
One embodiment of the present invention relates to ligands, such as proteins,
peptides,
peptidomimetics, small organic molecules, etc., designed or developed with
reference to the crystal
structure of DPP-IV as represented by the coordinates presented herein in FIG.
2, and portions thereof.
Such binding agents interact with the binding site of the DPP-IV represented
by one or more amino acid
residues selected from Glu205, Glu206, Tyr547, Ser630, Tyr631, Tyr662, Tyr666,
Asp708, Asn710 and
His740.

F. Machine Readable Storage Media

Transformation of the structure coordinates for all or a portion of DPP-IV or
one of its binding
pockets, for structurally homologous molecules as defined below, or for the
structural equivalents of any
of these molecules or molecular complexes as defined above, into three-
dimensional graphical
representations of the molecule or complex can be conveniently achieved
through the use of
commercially-available software.



The invention thus further provides a machine-readable storage medium
comprising a data
storage material encoded with machine readable data which, when using a
machine programmed with
instructions for using said data, is capable of displaying a graphical three-
dimensional representation of
any of the molecule or molecular complexes of this invention that have been
described above. In a
preferred embodiment, the machine-readable data storage medium comprises a
data storage material
encoded with machine readable data which, when using a machine programmed with
instructions for
using said data, is capable of displaying a graphical three-dimensional
representation of a molecule or
molecular complex comprising all or any parts of a DPP-IV binding pocket, as
defined above. In another
preferred embodiment, the machine-readable data storage medium is capable of
displaying a graphical
three-dimensional representation of a molecule or molecular complex defined by
the structure coordinates
of the amino acids listed in FIG. 4, plus or minus a root mean square
deviation from the backbone atoms
of said amino acids of not more than 2.0 Å.
In an alternative embodiment, the machine-readable data storage medium
comprises a data
storage material encoded with a first set of machine readable data which
comprises the Fourier transform
of the structural coordinates set forth in FIG. 2, and which, when using a
machine programmed with
instructions for using said data, can be combined with a second set of machine
readable data comprising
the X-ray diffraction pattern of a molecule or molecular complex to determine
at least a portion of the
structural coordinates corresponding to the second set of machine readable
data.
For example, a system for reading a data storage medium may include a computer
comprising a
central processing unit ("CPU"), a working memory which may be, e.g., RAM
(random access memory) or
"core" memory, mass storage memory (such as one or more disk drives or CD-ROM
drives), one or more
display devices (e.g., cathode-ray tube ("CRT") displays, light emitting diode
("LED") displays, liquid
crystal displays ("LCDs"), electroluminescent displays, vacuum fluorescent
displays, field emission
displays ("FEDs"), plasma displays, projection panels, etc.), one or more user
input devices (e.g.,
keyboards, microphones, mice, touch screens, etc.), one or more input lines,
and one or more output
lines, all of which are interconnected by a conventional bidirectional system
bus. The system may be a
stand-alone computer, or may be networked (e.g., through local area networks,
wide area networks,
intranets, extranets, or the internet) to other systems (e.g., computers,
hosts, servers, etc.). The system
may also include additional computer controlled devices such as consumer
electronics and appliances.
Input hardware may be coupled to the computer by input lines and may be
implemented in a
variety of ways. Machine-readable data of this invention may be inputted via
the use of a modem or
modems connected by a telephone line or dedicated data line. Alternatively or
additionally, the input
hardware may comprise CD-ROM drives or disk drives. In conjunction with a
display terminal, a keyboard
may also be used as an input device.
Output hardware may be coupled to the computer by output lines and may
similarly be
implemented by conventional devices. By way of example, the output hardware
may include a display
device for displaying a graphical representation of a binding pocket of this
invention using a program such
as QUANTA as described herein. Output hardware might also include a printer,
so that hard copy output
may be produced, or a disk drive, to store system output for later use.
In operation, a CPU coordinates the use of the various input and output
devices, coordinates data
accesses from mass storage devices, accesses to and from working memory, and
determines the


CA 02569095 2006-11-28
WO 2005/119526 PCT/IB2005/001630
-22-
sequence of data processing steps. A number of programs may be used to process
the machine-readable
data of this invention. Such programs are discussed in reference to the
computational methods of drug
discovery as described herein. References to components of the hardware system
are included as
appropriate throughout the following description of the data storage medium.
Machine-readable storage devices useful in the present invention include, but
are not limited to,
magnetic devices, electrical devices, optical devices, and combinations
thereof. Examples of such data
storage devices include, but are not limited to, hard disk devices, CD
devices, digital video disk devices,
floppy disk devices, removable hard disk devices, magneto-optic disk devices,
magnetic tape devices,
flash memory devices, bubble memory devices, holographic storage devices, and
any other mass storage
peripheral device. It should be understood that these storage devices include
necessary hardware (e.g.,
drives, controllers, power supplies, etc.) as well as any necessary media
(e.g., disks, flash cards, etc.) to
enable the storage of data.

G. Pharmaceutical Compositions

The present invention provides methods for treating certain diseases in a
mammal, preferably a
human being, in need of such treatment using the ligands, and preferably the
inhibitors, as described herein.
The ligand can be advantageously formulated into pharmaceutical compositions
comprising a therapeutically
effective amount of the ligand, a pharmaceutically acceptable carrier and
other compatible ingredients, such
as adjuvants, Freund's complete or incomplete adjuvant, suitable for
formulating such pharmaceutical
compositions as is known to those skilled in the art. Pharmaceutical
compositions containing the ligand can
be used for treatment of diseases that are associated with proteins that are
subject to processing by DPP-IV,
such as Type 2 diabetes, Type 1 diabetes, impaired glucose tolerance,
hyperglycemia, metabolic syndrome
(syndrome X and/or insulin resistance syndrome), glucosuria, metabolic
acidosis, arthritis, cataracts, diabetic
neuropathy, diabetic nephropathy, diabetic retinopathy, diabetic
cardiomyopathy, obesity, conditions
exacerbated by obesity, hypertension, hyperlipidemia, atherosclerosis,
osteoporosis, osteopenia, frailty,
bone loss, bone fracture, acute coronary syndrome, short stature due to growth
hormone deficiency,
infertility due to polycystic ovary syndrome, anxiety, depression, insomnia,
chronic fatigue, epilepsy, eating
disorders, chronic pain, alcohol addiction, diseases associated with
intestinal motility, ulcers, irritable bowel
syndrome, inflammatory bowel syndrome; short bowel syndrome; and the
prevention of disease progression
in Type 2 diabetes.
The pharmaceutical composition is administered to the mammal in a
therapeutically effective
amount such that treatment of the disease occurs.
The present invention is further illustrated by the following examples, which
should not be
construed as limiting in any way. The contents of all cited references
(including literature references,
patents, published patent applications as cited throughout this application)
are hereby expressly
incorporated by reference in their entireties.
The practice of the present invention will employ, unless otherwise indicated,
conventional
techniques of cell biology, cell culture, molecular biology, microbiology and
recombinant DNA, X-ray
crystallography, and molecular modeling which are within the skill of the art.
As those of skill in the art will
understand, such techniques are explained fully in the literature. See, for
example, Molecular Cloning: A
Laboratory Manual, 2nd Ed., ed. by Sambrook, Fritsch and Maniatis (Cold Spring
Harbor Laboratory
Press: 1989); DNA Cloning, Volumes I and II (D. N. Glover ed., 1985);
Oligonucleotide Synthesis (M. J.
Gait ed., 1984); Mullis et al. U.S. Patent No: 4,683,195; Nucleic Acid
Hybridization (B. D. Hames & S. J.
Higgins eds. 1984); Transcription And Translation (B. D. Hames & S. J. Higgins
eds. 1984); B. Perbal, A
Practical Guide To Molecular Cloning (1984); the treatise, Methods In
Enzymology (Academic Press, Inc.,
N.Y.); Methods In Enzymology, Vols. 154 and 155 (Wu et al. eds.),
Crystallography Made Clear: A Guide
For Users Of Macromolecular Models (Gale Rhodes, 2nd Ed. San Diego: Academic
Press, 2000).
EXAMPLES
Example 1: Cloning and Expression of human DPP-IV in Sf21 insect cells

A. Construction of hDPP-IV:

Residues 31 to 766 of Homo sapiens (human) wild type DPP-IV (SEQ ID NO:1) were
amplified by PCR using the following primers: DPPIV-Fc31-BamF
(5'-TTAAGGATCCTGGCACAGATGATGCTACAGCTGAC-3' (SEQ ID NO: 3)), which introduced a
BamHI site at the N-terminus, and DPPIV-ChisT-XhoR
(5'-AATTCTCGAGTTACTAGTGATGATGGTGGTGATGGCTGCCGCGCGGCACCAGAGGTAAAGAGAAACATTGTTTTATGAAGTGGC-3'
(SEQ ID NO: 4)), which introduced a His6-tag, a thrombin cleavage site and an
XhoI site at the C-terminus. The spin-column (Roche Applied Sciences,
Indianapolis, IN) purified PCR product (2208 bp) was digested with BamHI and
XhoI restriction enzymes and subcloned into a baculovirus transfer vector
treated with the corresponding enzymes. The vector contained a polyhedrin
promoter and the honeybee melittin secretion signal for efficient, high-level
secretion of the recombinant protein.
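The features attributed to the two primers can be checked directly from their sequences. The sketch below searches for the BamHI (GGATCC) and XhoI (CTCGAG) recognition sites and translates the reverse complement of the reverse primer to expose the encoded C-terminal extension. The primer sequences are copied from SEQ ID NO: 3 and SEQ ID NO: 4 as printed above; the helper names are hypothetical, and reading Leu-Val-Pro-Arg-Gly-Ser as the thrombin cleavage site is an assumption based on the commonly used recognition sequence.

```python
FORWARD = "TTAAGGATCCTGGCACAGATGATGCTACAGCTGAC"            # SEQ ID NO: 3
REVERSE = ("AATTCTCGAGTTACTAGTGATGATGGTGGTGATGGCTGCCGCGCGGCACCAG"
           "AGGTAAAGAGAAACATTGTTTTATGAAGTGGC")               # SEQ ID NO: 4

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate complete codons from the start of seq ('*' marks a stop)."""
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


# The reverse primer anneals to the coding strand, so its reverse complement
# reads in the sense direction.
sense = reverse_complement(REVERSE)
frames = [translate(sense[offset:]) for offset in range(3)]
tagged_frame = next(p for p in frames if "HHHHHH" in p)

has_bamhi = "GGATCC" in FORWARD
has_xhoi = "CTCGAG" in REVERSE
coding_length_bp = (766 - 31 + 1) * 3  # residues 31 to 766 inclusive
```

In the frame containing the six histidines, the translation shows the thrombin site immediately before the His6-tag, followed by two stop codons. The coding length computed for residues 31 to 766 is 2208 bp, the same figure quoted for the PCR product.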
B. Production of recombinant baculovirus
Cloning steps were monitored by restriction endonuclease mapping and
sequencing analysis. E. coli
clones with recombinant bacmid were obtained after transformation of E. coli
DH10Bac cells (Invitrogen
Corp., Frederick, MD) with 5 ng of the final construct that encodes hDPP-IV
residues 31-766 in pMCG243
(baculovirus transfer vector) (DPPIV-HBM31-HT plasmid DNA) and
blue/white-screening according to
manufacturer's protocol (Invitrogen Corp.). Monolayers of Sf9 cells (20 × 10⁶
cells in a 162 cm² culture
flask) were transfected by overlaying 20 ml of transfection mixture containing
100 µl of mini-prep bacmid
DNA and 100 µl of CellFECTIN reagent (Gibco BRL, Gaithersburg, MD) in Sf-900 II
SFM. The transfection
mixture was removed after 5 h of incubation (27 °C) and the cells were overlaid
with 25 ml of Sf-900 II SFM.
The recombinant virus was harvested at 72 h post-transfection and further
amplification of the virus was
achieved by infecting 100 ml of Sf-9 cells (1.2 × 10⁶ cells/ml) with 2 ml of
the recombinant virus for 65-72 h.
Baculo Infected Insect Cells (BIIC) stocks were prepared as follows: when the
cell diameter had increased by 2-4
µm above the baseline (usually 65-72 h) and while the cell viability was still >
80%, the BIICs were gently spun
down and re-suspended in a freezing medium (90% SF900 II, 1% w/v BSA, 10% v/v
DMSO) at 1 × 10⁷
viable cells/ml. The BIICs were frozen down as 1 ml aliquots using normal
cryopreservation methods and
stored at -70 °C or in liquid nitrogen for long-term storage.
C. Expression of recombinant DPP-IV
Cloning steps were monitored by restriction endonuclease mapping and
sequencing analysis. E. coli
clones with recombinant bacmid were obtained after transformation of E. coli
DH10Bac cells (Invitrogen
Corp., Frederick, MD) with 5 ng of DPPIV-HBM31-HT plasmid DNA and
blue/white-screening according to
manufacturer's protocol (Invitrogen). Monolayers of Sf9 cells (20 × 10⁶ cells
in a 162 cm² culture flask) were
transfected by overlaying 20 ml of transfection mixture containing 100 µl of
mini-prep bacmid DNA and 100
µl of CellFECTIN reagent (Gibco BRL) in Sf-900 II SFM. The transfection
mixture was removed after 5 h of
incubation (27 °C) and the cells were overlaid with 25 ml of Sf-900 II SFM. The
recombinant virus was
harvested at 72 h post-transfection and further amplification of the virus was
achieved by infecting 100 ml of
Sf-9 cells (1.2 × 10⁶ cells/ml) with 2 ml of the recombinant virus for 65-72
h. Baculo Infected Insect Cells
(BIIC) stocks were prepared as follows: when the cell diameter had increased by
2-4 µm above the baseline
(usually 65-72 h) and while the cell viability was still > 80%, the BIICs were
gently spun down and
re-suspended in a freezing medium (90% SF900 II, 1% w/v BSA, 10% v/v DMSO) at
1 × 10⁷ viable cells/ml.
The BIICs were frozen down as 1 ml aliquots using normal cryopreservation
methods and stored at -70 °C
or in liquid nitrogen for long-term storage.
Example 2: Purification of His6-tagged DPP-IV wild type extracellular domain

After clarification by centrifugation and filtration, 10 liters of culture
media containing secreted
human DPP-IV-31-766-C-his6 was concentrated 10-20 fold using a hollow fiber
filter unit which had been
washed with exchange Buffer A (50 mM Tris, 0.3 M NaCl, 1 mM TCEP, pH 8). The
concentrated media was
exchanged with 5 volumes of Buffer A. After clarification by filtration,
imidazole was added to 10 mM by
addition of Buffer B (50 mM TrisCl, 0.3 M NaCl, 0.25 M imidazole, 1 mM TCEP,
pH 8). The sample was
applied at 6-8 mL/min to a 40 mL immobilized metal affinity column (Ni-NTA
Superflow, Qiagen), which had been
equilibrated in Buffer A (50 mM TrisCl, 0.3 M NaCl, 1 mM TCEP, pH 8). The
column was
washed with Buffer A to achieve a stable baseline at 280 nm. Bound protein was
eluted at a lower flow rate
in a linear gradient from 0 - 20% B in 4 column volumes (cv) (5% B / cv)
followed by a step to 100% B, held
isocratically for 4 cv. Fractions were analyzed by SDS-PAGE on 4-12% bis-tris
in MOPS buffer using the
NuPAGE system (Invitrogen). Fractions containing DPP-IV were pooled and
dialyzed at 4 °C
against 2 changes of dialysis buffer (50 mM TrisCl, 0.1 M NaCl, 1 mM TCEP, pH
8). After dialysis, the
sample was concentrated to 6-10 mg/mL and fractionated by size exclusion
chromatography on Superdex
200 prep grade HiLoad 16/60 (Amersham Biosciences). DPP-IV eluted as an
apparent dimer. Fractions
were analyzed by SDS-PAGE on 4-12% bis-tris in MOPS buffer using the NuPAGE
system (Invitrogen).
DPP-IV isolated in this manner was used for crystallization.
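The imidazole concentrations implied by the buffer recipes and the gradient in Example 2 follow from simple mixing arithmetic. The sketch below assumes ideal mixing of Buffer A (no imidazole) and Buffer B (0.25 M imidazole) and ignores any dead volume in the chromatography system. The function names are hypothetical.

```python
BUFFER_B_IMIDAZOLE_MM = 250.0  # Buffer B contains 0.25 M imidazole


def imidazole_mm(percent_b):
    """Imidazole concentration (mM) of an A/B mixture at a given %B."""
    return BUFFER_B_IMIDAZOLE_MM * percent_b / 100.0


def percent_b_for(target_mm):
    """%B needed to reach a target imidazole concentration (mM)."""
    return 100.0 * target_mm / BUFFER_B_IMIDAZOLE_MM


def gradient_profile(start_b=0.0, end_b=20.0, column_volumes=4):
    """(cv, imidazole mM) at each whole column volume of a linear gradient."""
    step = (end_b - start_b) / column_volumes
    return [(cv, imidazole_mm(start_b + step * cv))
            for cv in range(column_volumes + 1)]
```

Under these assumptions, loading at 10 mM imidazole corresponds to 4% B, the 0 - 20% B gradient ends at 50 mM imidazole with a slope of 12.5 mM per column volume, and the final step to 100% B reaches 250 mM.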
Example 3: Crystallization of DPP-IV

DPP-IV of Example 2 was concentrated into buffer containing 50 mM TrisCl, 25
mM NaCl, 1 mM TCEP, pH 8, to 8-10 mg/mL. Leads were obtained through sparse
matrix screening at 22 °C. Optimized crystals grew in drops made from 1.5 µL
of protein + 1.5 µL of reservoir solution (0.1 M TrisCl, pH 8.5, 0.2 M sodium
acetate, 10-16% PEG 4000) equilibrating over the same reservoir solution.
Crystals were transferred to a solution containing 0.1 M TrisCl, pH 8.5, 0.2 M
sodium acetate, 14-16% PEG 4000, and 20% ethylene glycol. Crystals were flash
frozen in gaseous or liquid nitrogen for data collection.
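
Because each drop in Example 3 is made from equal volumes of protein and
reservoir solution, every component starts at the volume-weighted average of
its two source concentrations. The Python sketch below illustrates that
arithmetic. It is not part of the original procedure: the function and
dictionary key names are hypothetical, 10 mg/mL and 14% PEG 4000 are single
values picked from the stated ranges, and Tris is tracked as total
concentration without regard to the differing pH of the two solutions.

```python
def mix_drop(protein, protein_ul, reservoir, reservoir_ul):
    """Volume-weighted starting concentrations in a drop made from two solutions.

    Each solution is a dict mapping a component name to its concentration.
    A component missing from one solution is treated as zero there.
    """
    total_ul = protein_ul + reservoir_ul
    if total_ul <= 0:
        raise ValueError("total drop volume must be positive")
    names = sorted(set(protein) | set(reservoir))
    return {
        name: (protein.get(name, 0.0) * protein_ul
               + reservoir.get(name, 0.0) * reservoir_ul) / total_ul
        for name in names
    }


# Protein solution from Example 3 (10 mg/mL is the top of the 8-10 mg/mL range).
protein_solution = {
    "protein_mg_per_ml": 10.0,
    "tris_mM": 50.0,
    "nacl_mM": 25.0,
    "tcep_mM": 1.0,
}

# Reservoir solution from Example 3 (14% PEG 4000 is one value from 10-16%).
reservoir_solution = {
    "tris_mM": 100.0,
    "sodium_acetate_mM": 200.0,
    "peg4000_percent": 14.0,
}

# 1.5 uL of protein + 1.5 uL of reservoir, as in Example 3.
drop = mix_drop(protein_solution, 1.5, reservoir_solution, 1.5)
```

With these inputs the drop starts at half the protein and half the precipitant
concentration of the source solutions, and then concentrates as it
equilibrates against the reservoir.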
Example 4: X-ray data collection, structure determination and refinement of
DPP-IV

The crystals prepared in Example 3 were transferred to a cryoprotectant
solution, made up of the reservoir solution with 15-25% ethylene glycol, and
then flash-frozen in a stream of cold nitrogen gas at 100 K. A full data set
was collected from one crystal frozen in this manner at the Advanced Photon
Source of Argonne National Laboratory on an ADSC Quantum 210 CCD detector.
Data was processed using the HKL2000 suite of software (Otwinowski, Z. &
Minor, W. Methods Enzymology 276, 307-326 (1997)). Data collection statistics
are summarized in Table 5a.


The crystals belong to space group P43212 with unit cell dimensions a = b =
68.7 Å, c = 421.2 Å, α = β = γ = 90°. They contain one molecule of the
polypeptide per asymmetric unit.
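
As a rough plausibility check on one molecule per asymmetric unit, the unit
cell volume and the Matthews coefficient can be estimated from the cell
dimensions. The Python sketch below is illustrative and not part of the
original analysis. It reads 421.2 Å as the third cell edge (c), uses the 8
asymmetric units per cell of space group P43212, and applies the common
approximation that the solvent fraction is 1 - 1.23/Vm. The molecular weight
is an ASSUMED placeholder, not a value from this document, and the function
names are hypothetical.

```python
# Space group P43212 has 8 general positions, i.e. 8 asymmetric units per cell.
ASYMMETRIC_UNITS_P43212 = 8


def tetragonal_cell_volume(a: float, c: float) -> float:
    """Unit cell volume in cubic angstroms for a tetragonal cell (a = b, all angles 90 degrees)."""
    return a * a * c


def matthews_coefficient(cell_volume: float, mw_da: float,
                         molecules_per_asu: int = 1,
                         n_asu: int = ASYMMETRIC_UNITS_P43212) -> float:
    """Matthews coefficient Vm in cubic angstroms per dalton."""
    return cell_volume / (n_asu * molecules_per_asu * mw_da)


def solvent_fraction(vm: float) -> float:
    """Approximate solvent fraction from Vm (standard 1.23 protein constant)."""
    return 1.0 - 1.23 / vm


# Cell dimensions from the text: a = b = 68.7, c = 421.2 (angstroms).
volume = tetragonal_cell_volume(68.7, 421.2)

# ASSUMPTION: placeholder molecular weight for the crystallized construct.
# This number is NOT given in the source; substitute the measured value.
ASSUMED_MW_DA = 85_000.0

vm = matthews_coefficient(volume, ASSUMED_MW_DA, molecules_per_asu=1)
solvent = solvent_fraction(vm)
```

A Vm within the range commonly observed for protein crystals would support
the stated asymmetric unit content; the result depends directly on the
molecular weight supplied.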
The structure was solved by the method of molecular replacement, using the
program AMORE (Navaza, J., Acta Cryst., 157-163 (1994)). The crystal mosaicity
is 0.6°. The data is 96.5% complete to 2.7 Å resolution with an Rmerge of
0.062 and an average redundancy of 4.4. The final model was built with manual
rebuilding on the graphics screen, using the program XtalView (McRee, D. E.,
Practical Protein Crystallography, Academic Press, San Diego, 1993).
Refinement in Refmac was carried out using all data in the resolution range
50.0-2.7 Å. The R-factor for the current model is 0.257 (free R-factor, 5% of
the data, 0.340). The refinement statistics are summarized in Table 5b.
The current model contains residues 38-766 (others are disordered in the
crystal), 12 sugar and 32
water molecules.
Table 5a - Data statistics
Resolution range (Å)              50-2.7
Number of observations, total     124470
Number of observations, unique    28252
Completeness (%)                  96.5 (96.9) [1]
I/σ(I)                            23.5 (2.0) [1]
Rsym [2]                          0.062 (0.544) [1]
[1] Numbers in parentheses refer to the highest resolution range (2.80-2.70 Å).
[2] Rsym = Σ|I - <I>| / Σ<I>
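
The average redundancy quoted in the text follows from the observation counts
in Table 5a, and the Rsym footnote can be read as the sum of absolute
deviations from the mean intensity divided by the sum of the mean intensities,
taken over all observations. The Python sketch below illustrates both
calculations. It is not the HKL2000 implementation; the function names are
hypothetical and the intensity values in the example are made-up toy numbers
used only to exercise the formula.

```python
def redundancy(total_observations: int, unique_reflections: int) -> float:
    """Average number of observations per unique reflection."""
    return total_observations / unique_reflections


def r_sym(groups) -> float:
    """Rsym = sum |I - <I>| / sum <I>, summed over every observation.

    `groups` is an iterable of lists; each list holds the intensities of the
    symmetry-equivalent observations of one unique reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in groups:
        if not intensities:
            continue  # skip reflections with no observations
        mean_i = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean_i) for i in intensities)
        denominator += mean_i * len(intensities)
    if denominator == 0.0:
        raise ValueError("no intensity data to sum")
    return numerator / denominator


# Counts from Table 5a.
avg_redundancy = redundancy(124470, 28252)

# Toy data (NOT from the patent): two unique reflections, two observations each.
toy_rsym = r_sym([[100.0, 110.0], [50.0, 50.0]])
```

Rounded to one decimal place, the ratio of total to unique observations
reproduces the redundancy of 4.4 stated in the text.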

Table 5b - Refinement statistics
Nr. of reflections used              23609
Nr. of reflections used for Rfree    1269
Rcryst/Rfree [1]                     0.257/0.340
Number of atoms                      6169
[1] R = Σ||Fobs| - |Fcalc|| / Σ|Fobs|
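
The R-factor footnote of Table 5b can be read as the sum of absolute
differences between observed and calculated structure factor amplitudes
divided by the sum of the observed amplitudes. The Python sketch below
illustrates that formula and checks the stated free-set size. It is not the
Refmac implementation: the function names are hypothetical, no scale factor
between Fobs and Fcalc is applied, the amplitudes in the example are made-up
toy numbers, and the free fraction assumes that the 23609 reflections are the
working set and exclude the 1269 free reflections.

```python
def r_factor(f_obs, f_calc) -> float:
    """R = sum ||Fobs| - |Fcalc|| / sum |Fobs| over paired reflections."""
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0.0:
        raise ValueError("sum of observed amplitudes is zero")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    return numerator / denominator


def free_fraction(n_working: int, n_free: int) -> float:
    """Fraction of reflections set aside for Rfree."""
    return n_free / (n_working + n_free)


# Counts from Table 5b (working-set interpretation is an assumption).
fraction_free = free_fraction(23609, 1269)

# Toy amplitudes (NOT from the patent), used only to exercise the formula.
toy_r = r_factor([10.0, 20.0], [9.0, 22.0])
```

Under the stated assumption the free set is close to the 5% of the data
quoted in the text. Rcryst is computed over the working reflections and Rfree
over the reserved ones, using the same formula.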

Equivalents
While specific embodiments of the subject invention have been discussed, the
above specification is
illustrative and not restrictive. Many variations of the invention will become
apparent to those skilled in the
art upon review of this specification. The appended claims should be
interpreted by reference to the claims,
along with their full scope of equivalents, and the specification, along with
such variations.



JUMBO APPLICATIONS / PATENTS

THIS SECTION OF THE APPLICATION / PATENT CONTAINS MORE
THAN ONE VOLUME.

THIS IS VOLUME 1 OF 2

NOTE: For additional volumes please contact the Canadian Patent Office.

Administrative Status

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2005-05-26
(87) PCT Publication Date 2005-12-15
(85) National Entry 2006-11-28
Examination Requested 2006-11-28
Dead Application 2009-05-26

Abandonment History

Abandonment Date Reason Reinstatement Date
2008-05-26 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type                                  Anniversary Year  Due Date    Amount Paid  Paid Date
Request for Examination                                                 $800.00      2006-11-28
Registration of a document - section 124                                $100.00      2006-11-28
Application Fee                                                         $400.00      2006-11-28
Maintenance Fee - Application - New Act   2                 2007-05-28  $100.00      2006-11-28
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
PFIZER PRODUCTS INC.
Past Owners on Record
QIU, XIAYANG
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Drawings              2006-11-28         112              7,297
Claims                2006-11-28         2                130
Abstract              2006-11-28         1                50
Description           2006-11-28         27               1,750
Description           2006-11-28         9                181
Cover Page            2007-02-13         1                28
PCT                   2006-11-28         5                162
Assignment            2006-11-28         3                119
PCT                   2006-11-30         7                275