Patent 2325463 Summary

(12) Patent Application: (11) CA 2325463
(54) English Title: A METHOD FOR INCREASING THE PROTEIN CONTENT OF PLANTS
(54) French Title: PROCEDE PERMETTANT D'AUGMENTER LE CONTENU PROTEINIQUE DES VEGETAUX
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/82 (2006.01)
(72) Inventors :
  • JAYNES, JESSE M. (United States of America)
(73) Owners :
  • DEMEGEN, INC.
(71) Applicants :
  • DEMEGEN, INC. (United States of America)
(74) Agent: OSLER, HOSKIN & HARCOURT LLP
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 1999-04-27
(87) Open to Public Inspection: 1999-11-04
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US1999/009067
(87) International Publication Number: US1999009067
(85) National Entry: 2000-10-13

(30) Application Priority Data:
Application No. Country/Territory Date
09/066,056 (United States of America) 1998-04-27

Abstracts

English Abstract


A de novo designed, artificial storage protein has been stably expressed in
plants. This protein was designed to have high levels of all the essential
amino acids needed for human nutrition. Expressing the gene coding for this
protein in crop plants will greatly improve the nutritional quality of the
resulting crops. The gene has also been observed to increase the overall level
of protein production in a plant. This property will allow enhanced levels of
production of other valuable proteins by a plant. For example, a transgenic
plant with a gene encoding for insulin may produce higher levels of insulin
when the plant also expresses a gene for an artificial storage protein. The
method will also allow enhanced production of nonprotein products.
Cotransformation with a gene of interest can result in enhanced levels of the
protein product of the gene of interest or a product synthesized thereby.


French Abstract

L'invention concerne une protéine artificielle de stockage, conçue de novo, de manière stable, dans des végétaux. Cette protéine a été mise au point de manière à produire des niveaux élevés de tous les acides aminés essentiels, nécessaires à la nutrition de l'homme. L'expression du gène codant pour cette protéine, dans les plantes cultivées, améliorera grandement la qualité nutritionnelle des plantes cultivées résultantes. Le gène a également été mis au point, de manière à augmenter le niveau global de production de protéine dans les végétaux. Cette propriété permettra d'améliorer les niveaux de production de toutes les autres protéines intéressantes dans une plante. Par exemple, une plante transgénique possédant un gène codant pour l'insuline, peut produire des niveaux d'insuline plus élevés, lorsque la plante exprime également un gène pour une protéine artificielle de stockage. Ce procédé permet également d'améliorer la production de produits non protéiniques. Une contransformation avec un gène considéré peut produire des niveaux améliorés du produit protéinique du gène considéré, ou un produit synthétisé par celui-ci.

Claims

Note: Claims are shown in the official language in which they were submitted.


WHAT IS CLAIMED IS:
1. A plant wherein said plant is a transgenic plant comprising a heterologous
gene selected
from the group consisting of (i) a gene which encodes a protein comprising an
amphiphilic α-helix and (ii) a gene which encodes a protein comprising a
β-pleated
sheet, wherein said transgenic plant produces more protein per tissue weight
than does
said plant when said plant is not a transgenic plant and wherein said tissue
is root, tuber,
seed, leaf, stem, edible portion, flower or whole plant.
2. The plant of claim 1 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino acid
residue and can vary along said protein, h is a hydrophobic amino acid residue
and can
vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z
is selected from
the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19,
20 and any
whole number greater than 20, and w equals 0, 1 or 2 and wherein the value of
x in one
((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y) and
wherein the
value of y in one ((h)x(H)y) need not be the same as the value of y in another
((h)x(H)y),
and if said gene is selected from (ii) then said gene encodes a protein
comprising a
sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary along
said protein, h is a hydrophobic amino acid residue and can vary along said
protein, r
equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
3. The plant of claim 2 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
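
As an illustration only, and not part of the claims, the residue patterns of claims 2 and 3 can be read as regular expressions over one-letter amino acid codes. The sketch below assumes that reading: the function names `is_alpha_unit` and `is_beta_unit` are hypothetical, the residue sets are those listed in claim 3, and a match says only that a string fits the claimed pattern, not that the peptide folds as an α-helix or β-pleated sheet.

```python
import re

# Residue sets from claim 3, written as one-letter codes. Glycine (G) and
# threonine (T) appear in both sets, exactly as the claim lists them.
HYDROPHOBIC = "GILMFTWV"  # h
HYDROPHILIC = "REGHKT"    # H

h = f"[{HYDROPHOBIC}]"
H = f"[{HYDROPHILIC}]"

# (H)u((h)x(H)y)z(h)w with u in {0,1}, x and y in {1,2}, z >= 5, w in {0,1,2}.
# x and y may differ from one repeat to the next, which the regex allows.
ALPHA_UNIT = re.compile(f"{H}?(?:{h}{{1,2}}{H}{{1,2}}){{5,}}{h}{{0,2}}")

# (H)r(hH)s(h)t with r in {0,1}, s >= 1, t in {0,1,2}.
BETA_UNIT = re.compile(f"{H}?(?:{h}{H})+{h}{{0,2}}")


def is_alpha_unit(seq: str) -> bool:
    """True if the whole string fits the amphiphilic alpha-helix pattern."""
    return ALPHA_UNIT.fullmatch(seq) is not None


def is_beta_unit(seq: str) -> bool:
    """True if the whole string fits the beta-pleated sheet pattern."""
    return BETA_UNIT.fullmatch(seq) is not None
```

Because the claims say "comprising", a protein that merely contains such a stretch could be tested with `ALPHA_UNIT.search(seq)` instead of `fullmatch`; the stricter form is used here to check a single unit.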
4. The plant of claim 2 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:31,
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is
selected from (ii) then said gene encodes a protein comprising a sequence
selected from
the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID
NO:39,
SEQ ID NO:40 and SEQ ID NO:41.
5. The plant of claim 2 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:3, SEQ
ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID
NO:27 and if said gene is selected from (ii) then said gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13,
SEQ
ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
6. The plant of claim 2 wherein if said gene is selected from (i) then said
gene encodes
multiple units of said amphiphilic α-helix wherein each unit of
amphiphilic α-helix is
defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue
and can vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the
group consisting
of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole
number greater
than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix
is separated
from any neighboring unit of amphiphilic α-helix by a helix breaker and
wherein any
unit of amphiphilic α-helix can be different from any other unit of
amphiphilic α-helix
and wherein a value of x in one ((h)x(H)y) need not be the same as a value of
x in another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then
said gene
encodes multiple units of said β-pleated sheet wherein each unit of
β-pleated sheet is
defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2
and wherein
each unit of β-pleated sheet is separated from any neighboring unit of
β-pleated sheet
by a helix breaker and wherein any unit of β-pleated sheet can be
different from any
other unit of β-pleated sheet and wherein a value of r in one unit of
β-pleated sheet need
not be the same as a value of r in another unit of β-pleated sheet and
wherein a value of
s in one unit of β-pleated sheet need not be the same as a value of s in
another unit of
β-pleated sheet and wherein a value of t in one unit of β-pleated
sheet need not be the same
as a value of t in another unit of β-pleated sheet.
7. The plant of claim 6 wherein said helix breaker is SEQ ID NO:8.
8. The plant of claim 6 wherein if said gene is selected from (i) then said
gene encodes from
4 to 8 units of amphiphilic α-helix and if said gene is selected from
(ii) then said gene
encodes from 4 to 8 units of β-pleated sheet.
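
The multi-unit arrangement of claims 6 and 8 can be sketched in the same spirit, for the α-helix branch only. This is a hedged illustration, not part of the claims: the function names are hypothetical, the residue sets are borrowed from claim 3 (on which claim 8 does not depend), and the helix breaker is passed in as a parameter because the sequence of SEQ ID NO:8 is not reproduced in this record. The `"PP"` string in the usage note is a placeholder, not SEQ ID NO:8.

```python
import re

# Amphiphilic alpha-helix unit, (H)u((h)x(H)y)z(h)w, using the claim 3
# residue sets as an assumption: h = GILMFTWV, H = REGHKT.
ALPHA_UNIT = re.compile(
    r"[REGHKT]?(?:[GILMFTWV]{1,2}[REGHKT]{1,2}){5,}[GILMFTWV]{0,2}"
)


def count_alpha_units(seq: str, breaker: str) -> int:
    """Return the number of units if seq is a run of alpha-helix units
    joined by the given breaker string, otherwise 0."""
    units = seq.split(breaker)
    if all(ALPHA_UNIT.fullmatch(unit) for unit in units):
        return len(units)
    return 0


def has_four_to_eight_units(seq: str, breaker: str) -> bool:
    """True if seq holds from 4 to 8 breaker-separated alpha-helix units."""
    return 4 <= count_alpha_units(seq, breaker) <= 8
```

For example, `has_four_to_eight_units("PP".join(["LKLKLKLKLK"] * 4), "PP")` evaluates to `True`, while the same call with three or nine units evaluates to `False`. The split is only unambiguous when the breaker contains a residue outside both sets, as proline is here.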
9. The plant of claim 7 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:2, SEQ
ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID
NO:28 and if said gene is selected from (ii) then said gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14,
SEQ
ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
10. A plant wherein said plant is a transgenic plant comprising a heterologous
gene which
encodes a protein comprising a combination of amphiphilic α-helix and
β-pleated sheet
and wherein said transgenic plant produces more protein per tissue weight than
does said
plant when said plant is not a transgenic plant wherein said tissue is root,
tuber, seed, leaf,
stem, edible portion, flower or whole plant.
11. The plant of claim 10 wherein said gene encodes a protein comprising a
sequence of units
of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each Xn or
Xm, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
12. The plant of claim 11 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
13. The plant of claim 11 wherein X is SEQ ID NO:8.
14. A gene encoding a protein selected from the group consisting of (i) a
protein comprising
an amphiphilic α-helical sequence wherein said sequence comprises
(H)u((h)x(H)y)z(h)w
wherein H is a hydrophilic amino acid residue and can vary along said protein,
h is a
hydrophobic amino acid residue and can vary along said protein, u equals 0 or
1, x equals
1 or 2, y equals 1 or 2, z is selected from the group consisting of
5,6,7,8,9,10,11,12,
13,14,15,16,17,18,19,20 and any whole number greater than 20, and w equals 0,
1
or 2 and wherein the value of x in one ((h)x(H)y) need not be the same as the
value of x
in another ((h)x(H)y) and wherein the value of y in one ((h)x(H)y) need not be
the same
as the value of y in another ((h)x(H)y), and (ii) a protein comprising a
β-pleated sheet
sequence wherein said sequence comprises (H)r(hH)s(h)t wherein H is a
hydrophilic
amino acid residue and can vary along said protein, h is a hydrophobic amino
acid
residue and can vary along said protein, r equals 0 or 1, s is any whole
number greater
than 0, and t equals 0, 1 or 2.
15. The gene of claim 14 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
16. The gene of claim 14 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:31,
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is
selected from (ii) then said gene encodes a protein comprising a sequence
selected from
the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID
NO:39,
SEQ ID NO:40 and SEQ ID NO:41.
17. The gene of claim 14 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:3, SEQ
ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID
NO:27 and if said gene is selected from (ii) then said gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13,
SEQ
ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
18. The gene of claim 14 wherein if said gene is selected from (i) then said
gene encodes
multiple units of said amphiphilic α-helical sequence wherein each unit
of amphiphilic
α-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2,
z is selected
from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and
any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit
of
amphiphilic α-helical sequence is separated from any neighboring unit of
amphiphilic
α-helical sequence by a helix breaker and wherein any unit of
amphiphilic α-helical
sequence can be different from any other unit of amphiphilic α-helical
sequence and
wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in
another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then
said gene
encodes multiple units of said β-pleated sheet sequence wherein each unit
of β-pleated
sheet sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino
acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, r equals 0 or 1, s is any whole number greater than 0, and
t equals 0,
1 or 2 and wherein each unit of β-pleated sheet sequence is separated
from any
neighboring unit of β-pleated sheet sequence by a helix breaker and
wherein each unit
of β-pleated sheet sequence can be different from any other unit of
β-pleated sheet
sequence and wherein a value of r in one unit of β-pleated sheet need not
be the same as
a value of r in another unit of β-pleated sheet and wherein a value of s
in one unit of
β-pleated sheet need not be the same as a value of s in another unit of
β-pleated sheet and
wherein a value of t in one unit of β-pleated sheet need not be the same
as a value of t in
another unit of β-pleated sheet.
19. The gene of claim 18 wherein said helix breaker is SEQ ID NO:8.
20. The gene of claim 18 wherein if said gene is selected from (i) then said
gene encodes
from 4 to 8 units of amphiphilic α-helical sequence and if said gene is
selected from (ii)
then said gene encodes from 4 to 8 units of β-pleated sheet sequence.
21. The gene of claim 19 wherein if said gene is selected from (i) then said
gene encodes a
protein comprising a sequence selected from the group consisting of SEQ ID
NO:2, SEQ
ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID
NO:28 and if said gene is selected from (ii) then said gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14,
SEQ
ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
22. A gene which encodes a protein comprising a sequence of units of
(((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v
wherein H is a hydrophilic amino acid residue and can vary along said protein,
h is a
hydrophobic amino acid residue and can vary along said protein, X is any amino
acid and
may be different for each Xn or Xm, u equals 0 or 1, x equals 1 or 2, y equals
1 or 2, z is
selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15,
16, 17, 18, 19,
20 and any whole number greater than 20, w equals 0, 1 or 2, n equals any
whole number
including 0, v equals any whole number greater than 0, r equals 0 or 1, s
equals any
whole number greater than 0, t equals 0, 1 or 2, m equals any whole number
including
0, and p equals any whole number greater than 0 and wherein any one unit
within said
protein can differ from any other unit within said protein and wherein the
values of u, x,
y, z, w, n, v, r, s, t, m and p for one unit can differ from any other unit
and wherein a
value of x in one ((h)x(H)y) need not be the same as a value of x in another
((h)x(H)y) and
wherein a value of y in one ((h)x(H)y) need not be the same as a value of y in
another
((h)x(H)y).
23. The gene of claim 22 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
24. The gene of claim 22 wherein X is SEQ ID NO:8.
25. A protein selected from the group consisting of (i) a protein comprising
an amphiphilic
α-helical sequence wherein said sequence comprises (H)u((h)x(H)y)z(h)w
wherein H is a
hydrophilic amino acid residue and can vary along said protein, h is a
hydrophobic amino
acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or
2, y equals 1
or 2, z is selected from the group consisting of
5,6,7,8,9,10,11,12,13,14,15,16,17,
18,19,20 and any whole number greater than 20, and w equals 0, 1 or 2 and
wherein the
value of x in one ((h)x(H)y) need not be the same as the value of x in another
((h)x(H)y)
and wherein the value of y in one ((h)x(H)y) need not be the same as the value
of y in
another ((h)x(H)y), and (ii) a protein comprising a β-pleated sheet
sequence wherein said
sequence comprises (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue
and can
vary along said protein, h is a hydrophobic amino acid residue and can vary
along said
protein, r equals 0 or 1, s is any whole number greater than 0, and t equals
0, 1 or 2.
26. The protein of claim 25 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
27. The protein of claim 25 wherein if said protein is selected from (i) then
said protein
comprises a sequence selected from the group consisting of SEQ ID NO:31, SEQ
ID
NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said protein is
selected from (ii) then said protein comprises a sequence selected from the
group
consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID NO:39, SEQ ID
NO:40 and SEQ ID NO:41.
28. The protein of claim 25 wherein if said protein is selected from (i) then
said protein
comprises a sequence selected from the group consisting of SEQ ID NO:3, SEQ ID
NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and SEQ ID
NO:27 and if said protein is selected from (ii) then said protein comprises a
sequence
selected from the group consisting of SEQ ID NO:9, SEQ ID NO:13, SEQ ID NO:17,
SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
29. The protein of claim 25 wherein if said protein is selected from (i) then
said gene encodes
multiple units of said amphiphilic α-helical sequence wherein each unit
of amphiphilic
α-helical sequence is defined by (H)u((h)x(H)y)z(h)w wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or
2, z is selected
from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and
any whole number greater than 20, and w equals 0, 1 or 2 and wherein each unit
of
amphiphilic α-helical sequence is separated from any neighboring unit of
amphiphilic
α-helical sequence by a helix breaker and wherein any unit of
amphiphilic α-helical
sequence can be different from any other unit of amphiphilic α-helical
sequence and
wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in
another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and if said protein is selected from (ii) then said
gene encodes
multiple units of said β-pleated sheet sequence wherein each unit of
β-pleated sheet
sequence is defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid
residue and
can vary along said protein, h is a hydrophobic amino acid residue and can
vary along
said protein, r equals 0 or 1, s is any whole number greater than 0, and t
equals 0, 1 or 2
and wherein each unit of β-pleated sheet sequence is separated from any
neighboring unit
of β-pleated sheet sequence by a helix breaker and wherein each unit of
β-pleated sheet
sequence can be different from any other unit of β-pleated sheet sequence
and wherein
a value of r in one unit of β-pleated sheet need not be the same as a
value of r in another
unit of β-pleated sheet and wherein a value of s in one unit of β-pleated
sheet need not
be the same as a value of s in another unit of β-pleated sheet and
wherein a value of t in
one unit of β-pleated sheet need not be the same as a value of t in
another unit of
β-pleated sheet.
30. The protein of claim 29 wherein said helix breaker is SEQ ID NO:8.
31. The protein of claim 29 wherein if said protein is selected from (i) then
said protein
comprises 4 to 8 units of amphiphilic α-helical sequence and if said
protein is selected
from (ii) then said protein comprises from 4 to 8 units of β-pleated
sheet sequence.
32. The protein of claim 30 wherein if said protein is selected from (i) then
said protein
comprises a sequence selected from the group consisting of SEQ ID NO:2, SEQ ID
NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and SEQ ID
NO:28 and if said protein is selected from (ii) then said protein
comprises a sequence
selected from the group consisting of SEQ ID NO:10, SEQ ID NO:14, SEQ ID
NO:18,
SEQ ID NO:22, SEQ ID NO:26 and SEQ ID NO:30.
33. A protein comprising a sequence of units of
(((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p
or (((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic
amino acid
residue and can vary along said protein, h is a hydrophobic amino acid residue
and can
vary along said protein, X is any amino acid and may be different for each Xn
or Xm, u
equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number
greater than
20, w equals 0, 1 or 2, n equals any whole number including 0, v equals any
whole
number greater than 0, r equals 0 or 1, s equals any whole number greater than
0, t equals
0, 1 or 2, m equals any whole number including 0, and p equals any whole
number
greater than 0 and wherein any one unit within said protein can differ from
any other unit
within said protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m
and p for one
unit can differ from any other unit and wherein a value of x in one ((h)x(H)y)
need not be
the same as a value of x in another ((h)x(H)y) and wherein a value of y in one
((h)x(H)y)
need not be the same as a value of y in another ((h)x(H)y).
34. The protein of claim 33 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
35. The protein of claim 33 wherein X is SEQ ID NO:8.
36. A plant cell wherein said plant cell is a transgenic plant cell comprising
a heterologous
gene selected from the group consisting of (i) a gene which encodes a protein
comprising
an amphiphilic α-helix and wherein said transgenic plant cell produces
more protein per
gram of plant cell than does said plant cell when said plant cell is not a
transgenic plant
cell and (ii) a gene which encodes a protein comprising a β-pleated sheet
and wherein
said transgenic plant cell produces more protein per gram of cells than does
said plant cell
when said plant cell is not a transgenic plant cell.
37. The plant cell of claim 36 wherein if said gene is selected from (i) then
said gene encodes
a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1
or 2, z is selected
from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20 and
any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value
of x in
one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y)
and wherein the
value of y in one ((h)x(H)y) need not be the same as the value of y in another
((h)x(H)y),
and if said gene is selected from (ii) then said gene encodes a protein
comprising a
sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary along
said protein, h is a hydrophobic amino acid residue and can vary along said
protein, r
equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
38. The plant cell of claim 37 wherein h is selected from the group consisting
of glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
39. The plant cell of claim 37 wherein if said gene is selected from (i) then
said gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:31,
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is
selected from (ii) then said gene encodes a protein comprising a sequence
selected from
the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID
NO:39,
SEQ ID NO:40 and SEQ ID NO:41.
40. The plant cell of claim 37 wherein if said gene is selected from (i) then
said gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:3,
SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and
SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ
ID
NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
41. The plant cell of claim 37 wherein if said gene is selected from (i) then
said gene encodes
multiple units of said amphiphilic α-helix wherein each unit of
amphiphilic α-helix is
defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue
and can vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the
group consisting of
5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number
greater than
20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic α-helix
is separated from
any neighboring unit of amphiphilic α-helix by a helix breaker and
wherein any unit of
amphiphilic α-helix can be different from any other unit of amphiphilic
α-helix and
wherein a value of x in one ((h)x(H)y) need not be the same as a value of x in
another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and if said gene is selected from (ii) then said gene
encodes
multiple units of said β-pleated sheet wherein each unit of β-pleated
sheet is defined by
(H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can vary along
said
protein, h is a hydrophobic amino acid residue and can vary along said
protein, r equals
0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2 and
wherein each unit
of β-pleated sheet is separated from any neighboring unit of β-pleated
sheet by a helix
breaker and wherein any unit of β-pleated sheet can be different from any
other unit of
β-pleated sheet and wherein a value of r in one unit of β-pleated
sheet need not be the
same as a value of r in another unit of β-pleated sheet and wherein a
value of s in one
unit of β-pleated sheet need not be the same as a value of s in another
unit of β-pleated
sheet and wherein a value of t in one unit of β-pleated sheet need not be
the same as a
value of t in another unit of β-pleated sheet.
42. The plant cell of claim 41 wherein said helix breaker is SEQ ID NO:8.
43. The plant cell of claim 41 wherein if said gene is selected from (i) then
said gene encodes
from 4 to 8 units of amphiphilic α-helix and if said gene is selected
from (ii) then said
gene encodes from 4 to 8 units of β-pleated sheet.
44. The plant cell of claim 42 wherein if said gene is selected from (i) then
said gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:2,
SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and
SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ
ID
NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
45. A plant cell wherein said plant cell is a transgenic plant cell comprising
a heterologous
gene which encodes a protein comprising a combination of amphiphilic α-helix
and
β-pleated sheet and wherein said transgenic plant cell produces more
protein per gram of
cells than does said plant cell when said plant cell is not a transgenic plant
cell.
46. The plant cell of claim 45 wherein said gene encodes a protein comprising
a sequence of
units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino
acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each Xn or
Xm, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
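Read with the trailing letters as repeat counts, the two unit orderings recited in claim 46 can be typeset as below. This is only a typographic sketch: treating each letter that follows a parenthesis as a subscript is an assumption about the flattened text, not wording taken from the claim.

```latex
\[
\Bigl(\bigl((H)_u\,((h)_x (H)_y)_z\,(h)_w\,X_n\bigr)_v\;
      \bigl((H)_r\,(hH)_s\,(h)_t\,X_m\bigr)\Bigr)_p
\quad\text{or}\quad
\Bigl(\bigl((H)_r\,(hH)_s\,(h)_t\,X_m\bigr)\Bigr)_p\;
\bigl((H)_u\,((h)_x (H)_y)_z\,(h)_w\,X_n\bigr)_v
\]
```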
47. The plant cell of claim 46 wherein h is selected from the group consisting
of glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
48. The plant cell of claim 46 wherein X is SEQ ID NO:8.
49. A method for increasing production of a protein or a nonprotein product in
a plant of a
specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to
produce a
transgenic cell or transgenic cells wherein said gene is selected from the
group consisting
of (i) a gene which encodes a protein which comprises an amphiphilic .alpha.-
helical sequence
and (ii) a gene which encodes a protein which comprises a .beta.-pleated sheet
sequence;
b) growing said transgenic cell or cells to produce a transgenic plant; and
c) growing said transgenic plant.
50. The method of claim 49 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or
2, z is selected
from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20 and
any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value
of x in
one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y)
and wherein the
value of y in one ((h)x(H)y) need not be the same as the value of y in another
((h)x(H)y),
and if said gene is selected from (ii) then said gene encodes a protein
comprising a
sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary along
said protein, h is a hydrophobic amino acid residue and can vary along said
protein, r
equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
51. The method of claim 50 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
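The two sequence patterns of claim 50, restricted to the residue sets of claim 51, can be sketched as regular expressions. This is a minimal illustration, not part of the claims: the one-letter residue codes are the standard ones, while the translation into regex syntax and the names `fits_helix_pattern` and `fits_sheet_pattern` are assumptions made for the example. Because glycine and threonine appear in both sets, a given residue may be read as either h or H.

```python
import re

# One-letter codes for the residues listed in claim 51.
HYDROPHOBIC = "GILMFTWV"  # h: Gly, Ile, Leu, Met, Phe, Thr, Trp, Val
HYDROPHILIC = "REGHKT"    # H: Arg, Glu, Gly, His, Lys, Thr

h = f"[{HYDROPHOBIC}]"
H = f"[{HYDROPHILIC}]"

# (H)u((h)x(H)y)z(h)w with u in {0,1}, x and y in {1,2}, z >= 5, w in {0,1,2}
HELIX_PATTERN = re.compile(f"{H}?(?:{h}{{1,2}}{H}{{1,2}}){{5,}}{h}{{0,2}}")

# (H)r(hH)s(h)t with r in {0,1}, s >= 1, t in {0,1,2}
SHEET_PATTERN = re.compile(f"{H}?(?:{h}{H})+{h}{{0,2}}")


def fits_helix_pattern(seq: str) -> bool:
    """True if the whole sequence fits the amphiphilic alpha-helix pattern."""
    return HELIX_PATTERN.fullmatch(seq) is not None


def fits_sheet_pattern(seq: str) -> bool:
    """True if the whole sequence fits the beta-pleated sheet pattern."""
    return SHEET_PATTERN.fullmatch(seq) is not None


if __name__ == "__main__":
    print(fits_helix_pattern("LKLKLKLKLK"))  # True: five (h)(H) repeats
    print(fits_helix_pattern("LKLKLKLK"))    # False: only four repeats
    print(fits_sheet_pattern("VKVKVKV"))     # True: three (hH) pairs plus one h
```

Matching a pattern says nothing about whether a peptide actually folds as a helix or a sheet; the check is purely formal.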
52. The method of claim 50 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:31,
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is
selected from (ii) then said gene encodes a protein comprising a sequence
selected from
the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID
NO:39,
SEQ ID NO:40 and SEQ ID NO:41.
53. The method of claim 50 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:3,
SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and
SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ
ID
NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
54. The method of claim 50 wherein if said gene is selected from (i) then said
gene encodes
multiple units of said amphiphilic .alpha.-helix wherein each unit of
amphiphilic .alpha.-helix is
defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue
and can vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the
group consisting
of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole
number greater
than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-
helix is separated
from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and
wherein any
unit of amphiphilic .alpha.-helix can be different from any other unit of
amphiphilic .alpha.-helix
and wherein a value of x in one ((h)x(H)y) need not be the same as a value of
x in another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then
said gene
encodes multiple units of said .beta.-pleated sheet wherein each unit of
.beta.-pleated sheet is
defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2
and wherein
each unit of .beta.-pleated sheet is separated from any neighboring unit of
.beta.-pleated sheet
by a helix breaker and wherein any unit of .beta.-pleated sheet can be
different from any
other unit of .beta.-pleated sheet and wherein a value of r in one unit of
.beta.-pleated sheet need
not be the same as a value of r in another unit of .beta.-pleated sheet and
wherein a value of
s in one unit of .beta.-pleated sheet need not be the same as a value of s in
another unit of .beta.-
pleated sheet and wherein a value of t in one unit of .beta.-pleated sheet
need not be the same
as a value of t in another unit of .beta.-pleated sheet.
55. The method of claim 54 wherein said helix breaker is SEQ ID NO:8.
56. The method of claim 54 wherein if said gene is selected from (i) then said
gene encodes
from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected
from (ii) then said
gene encodes from 4 to 8 units of .beta.-pleated sheet.
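The arrangement in claims 54 and 56 (repeating units separated by a helix breaker, with 4 to 8 units in total) can be sketched as a join-and-count check. Everything here is illustrative: the breaker is a hypothetical single proline chosen because it belongs to neither residue set of claim 51, it is not the sequence of SEQ ID NO:8, and the function names are made up for the example.

```python
# Hypothetical stand-in breaker; NOT the residues of SEQ ID NO:8.
BREAKER = "P"


def join_units(units: list[str], breaker: str = BREAKER) -> str:
    """Concatenate units so that neighbouring units are separated by the breaker."""
    return breaker.join(units)


def count_units(protein: str, breaker: str = BREAKER) -> int:
    """Count the non-empty segments found between breakers."""
    return len([unit for unit in protein.split(breaker) if unit])


def within_claimed_range(protein: str, breaker: str = BREAKER,
                         low: int = 4, high: int = 8) -> bool:
    """True if the number of units lies in the 4-to-8 range of claim 56."""
    return low <= count_units(protein, breaker) <= high
```

For example, `within_claimed_range(join_units(["LKLKLKLKLK"] * 4))` evaluates to `True`, while the same call with three units evaluates to `False`. The sketch does not check that each unit itself fits the helix or sheet pattern.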
57. The method of claim 55 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:2,
SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and
SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ
ID
NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
58. A method for increasing production of a protein or a nonprotein product in
a plant of a
specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to
produce a
transgenic cell or transgenic cells wherein said gene encodes a protein which
comprises
a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet
sequence;
b) growing said transgenic cell or cells to produce a transgenic plant; and
c) growing said transgenic plant.
59. The method of claim 58 wherein said gene encodes a protein comprising a
sequence of
units of (((H)u((h)x(H)y)z(h)wXn)v((H)r(hH)s(h)tXm))p or
(((H)r(hH)s(h)tXm))p((H)u((h)x(H)y)z(h)wXn)v wherein H is a hydrophilic amino
acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each Xn or
Xm, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
60. The method of claim 59 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
61. The method of claim 59 wherein X is SEQ ID NO:8.
62. A method for increasing protein production in plant cells of a specified
species wherein
said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to
produce a
transgenic cell or transgenic cells wherein said gene is selected from the
group consisting
of (i) a gene which encodes a protein which comprises an amphiphilic .alpha.-
helical sequence
and (ii) a gene which encodes a protein which comprises a .beta.-pleated sheet
sequence; and
b) growing said transgenic cell or cells in culture or in a bioreactor.
63. The method of claim 62 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or
2, z is selected
from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
18, 19, 20 and
any whole number greater than 20, and w equals 0, 1 or 2 and wherein the value
of x in
one ((h)x(H)y) need not be the same as the value of x in another ((h)x(H)y)
and wherein the
value of y in one ((h)x(H)y) need not be the same as the value of y in another
((h)x(H)y),
and if said gene is selected from (ii) then said gene encodes a protein
comprising a
sequence (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary along
said protein, h is a hydrophobic amino acid residue and can vary along said
protein, r
equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2.
64. The method of claim 63 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
65. The method of claim 63 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:31,
SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if said gene is
selected from (ii) then said gene encodes a protein comprising a sequence
selected from
the group consisting of SEQ ID NO:36, SEQ ID NO:37, SEQ ID NO:38, SEQ ID
NO:39,
SEQ ID NO:40 and SEQ ID NO:41.
66. The method of claim 63 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:3,
SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID NO:23 and
SEQ ID NO:27 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:9, SEQ
ID
NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID NO:29.
67. The method of claim 63 wherein if said gene is selected from (i) then said
gene encodes
multiple units of said amphiphilic .alpha.-helix wherein each unit of
amphiphilic .alpha.-helix is
defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic amino acid residue
and can vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the
group consisting
of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole
number greater
than 20, and w equals 0, 1 or 2 and wherein each unit of amphiphilic .alpha.-
helix is separated
from any neighboring unit of amphiphilic .alpha.-helix by a helix breaker and
wherein any
unit of amphiphilic .alpha.-helix can be different from any other unit of
amphiphilic .alpha.-helix
and wherein a value of x in one ((h)x(H)y) need not be the same as a value of
x in another
((h)x(H)y) and wherein a value of y in one ((h)x(H)y) need not be the same as
a value of
y in another ((h)x(H)y) and wherein if said gene is selected from (ii) then
said gene
encodes multiple units of said .beta.-pleated sheet wherein each unit of
.beta.-pleated sheet is
defined by (H)r(hH)s(h)t wherein H is a hydrophilic amino acid residue and can
vary
along said protein, h is a hydrophobic amino acid residue and can vary along
said protein,
r equals 0 or 1, s is any whole number greater than 0, and t equals 0, 1 or 2
and wherein
each unit of .beta.-pleated sheet is separated from any neighboring unit of
.beta.-pleated sheet
by a helix breaker and wherein any unit of .beta.-pleated sheet can be
different from any
other unit of .beta.-pleated sheet and wherein a value of r in one unit of
.beta.-pleated sheet need
not be the same as a value of r in another unit of .beta.-pleated sheet and
wherein a value of
s in one unit of .beta.-pleated sheet need not be the same as a value of s in
another unit of
.beta.-pleated sheet and wherein a value of t in one unit of .beta.-pleated
sheet need not be the same
as a value of t in another unit of .beta.-pleated sheet.
68. The method of claim 67 wherein said helix breaker is SEQ ID NO:8.
69. The method of claim 67 wherein if said gene is selected from (i) then said
gene encodes
from 4 to 8 units of amphiphilic .alpha.-helix and if said gene is selected
from (ii) then said
gene encodes from 4 to 8 units of .beta.-pleated sheet.
70. The method of claim 68 wherein if said gene is selected from (i) then said
gene encodes
a protein comprising a sequence selected from the group consisting of SEQ ID
NO:2,
SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID NO:24 and
SEQ ID NO:28 and if said gene is selected from (ii) then said gene encodes a
protein
comprising a sequence selected from the group consisting of SEQ ID NO:10, SEQ
ID
NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26, and SEQ ID NO:30.
71. A method for increasing production of a protein or a nonprotein product in
plant cells of
a specified species wherein said method comprises:
a) transforming a cell or cells of said species with a heterologous gene to
produce a
transgenic cell or transgenic cells wherein said gene encodes a protein which
comprises
a combination of amphiphilic .alpha.-helical sequence and .beta.-pleated sheet
sequence; and
b) growing said transgenic cell or cells in culture or in a bioreactor.
72. The method of claim 71 wherein said gene encodes a protein comprising a
sequence of
units of (((H)u((h)x(H)y)z(h)w X n)v((H)r(hH)s(h)t X m))p or
(((H)r(hH)s(h)t X m))p((H)u((h)x(H)y)z(h)w X n)v wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each X n or X
m, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
73. The method of claim 72 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
74. The method of claim 72 wherein X is SEQ ID NO:8.
75. A method of increasing production of a first protein, or of a nonprotein
product in which
case said first protein catalyzes a step in a synthesis of said nonprotein
product, in a plant
of a specified species wherein said first protein is encoded by a first gene
with which said
plant is transformed or an ancestor of said plant had been transformed wherein
said
method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if
said cell or cells
were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if
said cell or cells
or an ancestor of said cell or cells had not previously been transformed with
said second
gene, to form a transgenic cell or transgenic cells wherein said second gene
is selected
from the group consisting of (i) a heterologous gene which encodes a protein
which
comprises an amphiphilic .alpha.-helical sequence and (ii) a heterologous gene
which encodes
a protein which comprises a .beta.-pleated sheet sequence;
(d) growing said transgenic cell or cells to produce a transgenic plant
comprising both
said first gene and said second gene; and
(e) growing said transgenic plant,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can
be performed
before step (c), step (c) can be performed before step (b), or steps (b) and
(c) can be
performed simultaneously.
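The conditional steps (b) and (c) of claim 75, and the freedom in their order, can be sketched as a small decision helper. This is a hedged illustration only; the function names and the string labels for the steps are hypothetical and appear nowhere in the claims.

```python
def required_transformations(has_first_gene: bool, has_second_gene: bool) -> list[str]:
    """Return which of steps (b) and (c) still need to be performed."""
    steps = []
    if not has_first_gene:
        steps.append("b")  # transform with the first gene
    if not has_second_gene:
        steps.append("c")  # transform with the second gene
    return steps


def permitted_orders(steps: list[str]) -> list[tuple[str, ...]]:
    """List the orders the claim permits for the required steps."""
    if len(steps) < 2:
        return [tuple(steps)]
    # Both steps needed: (b) first, (c) first, or both simultaneously.
    return [("b", "c"), ("c", "b"), ("b+c",)]
```

For a cell line that already carries the first gene but not the second, `required_transformations(True, False)` returns `["c"]`, and only that single step precedes steps (d) and (e).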
76. The method of claim 75 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is
a
hydrophilic amino acid residue and can vary along said protein, h is a
hydrophobic amino
acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or
2, y equals 1
or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17,
18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and
wherein the
value of x in one ((h)x(H)y) need not be the same as the value of x in another
((h)x(H)y)
and wherein the value of y in one ((h)x(H)y) need not be the same as the value
of y in
another ((h)x(H)y), and if said second gene is selected from (ii) then said
second gene
encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, r equals 0 or 1, s is any whole number greater
than 0, and
t equals 0, 1 or 2.
77. The method of claim 76 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
78. The method of claim 76 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if
said second gene is selected from (ii) then said second gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37,
SEQ
ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
79. The method of claim 76 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID
NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said
second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID
NO:29.
80. The method of claim 76 wherein if said second gene is selected from (i)
then said second
gene encodes multiple units of said amphiphilic .alpha.-helix wherein each
unit of amphiphilic
.alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is
selected from the
group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
and any whole
number greater than 20, and w equals 0, 1 or 2 and wherein each unit of
amphiphilic .alpha.-
helix is separated from any neighboring unit of amphiphilic .alpha.-helix by a
helix breaker
and wherein any unit of amphiphilic .alpha.-helix can be different from any
other unit of
amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not
be the same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be
the same as a value of y in another ((h)x(H)y), and wherein if said second
gene is selected
from (ii) then said second gene encodes multiple units of said .beta.-pleated
sheet wherein
each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, r equals 0 or 1, s is any whole number greater
than 0, and t
equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated
from any
neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any
unit of .beta.-pleated
sheet can be different from any other unit of .beta.-pleated sheet and wherein
a value of r in
one unit of .beta.-pleated sheet need not be the same as a value of r in
another unit of
.beta.-pleated sheet and wherein a value of s in one unit of .beta.-pleated
sheet need not be the same
as a value of s in another unit of .beta.-pleated sheet and wherein a value of
t in one unit of
.beta.-pleated sheet need not be the same as a value of t in another unit of
.beta.-pleated sheet.
81. The method of claim 80 wherein said helix breaker is SEQ ID NO:8.
82. The method of claim 80 wherein if said second gene is selected from (i)
then said second
gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said second
gene is selected
from (ii) then said second gene encodes from 4 to 8 units of .beta.-pleated
sheet.
83. The method of claim 81 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID
NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said
second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ
ID NO:30.
84. A method of increasing production of a first protein, or of a nonprotein
product in which
case said first protein catalyzes a step in a synthesis of said nonprotein
product, in a plant
of a specified species wherein said protein is encoded by a first gene with
which said
plant is transformed or an ancestor of said plant had been transformed wherein
said
method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if
said cell or cells
were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if
said cell or cells
or an ancestor of said cell or cells had not previously been transformed with
said second
gene, to form a transgenic cell or transgenic cells wherein said second gene
encodes a
protein which comprises a combination of amphiphilic .alpha.-helical sequence
and .beta.-pleated
sheet sequence and wherein said second gene does not naturally occur in said
plant;
(d) growing said transgenic cell or cells to produce a transgenic plant
comprising both
said first gene and said second gene; and
(e) growing said transgenic plant,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can
be performed
before step (c), step (c) can be performed before step (b), or steps (b) and
(c) can be
performed simultaneously.
85. The method of claim 84 wherein said second gene encodes a protein
comprising a
sequence of units of (((H)u((h)x(H)y)z(h)w X n)v((H)r(hH)s(h)t X m))p or
(((H)r(hH)s(h)t X m))p((H)u((h)x(H)y)z(h)w X n)v wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each X n or X
m, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
86. The method of claim 85 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
87. The method of claim 85 wherein X is SEQ ID NO:8.
88. A method of increasing production of a first protein, or of a nonprotein
product in which
case said first protein catalyzes a step in a synthesis of said nonprotein
product, in a plant
cell or plant cells of a specified species wherein said first protein is
encoded by a first
gene with which said plant cell or plant cells are transformed or an ancestor
of said plant
had been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if
said cell or cells
were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if
said cell or cells
or an ancestor of said cell or cells had not previously been transformed with
said second
gene, to form a transgenic cell or transgenic cells wherein said second gene
is selected
from the group consisting of (i) a gene which encodes a protein which
comprises an
amphiphilic .alpha.-helical sequence and wherein said gene does not naturally occur
in said
plant and (ii) a gene which encodes a protein which comprises a .beta.-pleated
sheet sequence
and wherein said gene does not naturally occur in said plant; and
(d) growing said transgenic cell or cells in culture or in a bioreactor,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can
be performed
before step (c), step (c) can be performed before step (b), or steps (b) and
(c) can be
performed simultaneously.
89. The method of claim 88 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence (H)u((h)x(H)y)z(h)w wherein H is
a
hydrophilic amino acid residue and can vary along said protein, h is a
hydrophobic amino
acid residue and can vary along said protein, u equals 0 or 1, x equals 1 or
2, y equals 1
or 2, z is selected from the group consisting of 5, 6, 7, 8, 9, 10, 11, 12,
13, 14, 15, 16, 17,
18, 19, 20 and any whole number greater than 20, and w equals 0, 1 or 2 and
wherein the
value of x in one ((h)x(H)y) need not be the same as the value of x in another
((h)x(H)y)
and wherein the value of y in one ((h)x(H)y) need not be the same as the value
of y in
another ((h)x(H)y), and if said second gene is selected from (ii) then said
second gene
encodes a protein comprising a sequence (H)r(hH)s(h)t wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, r equals 0 or 1, s is any whole number greater
than 0, and t
equals 0, 1 or 2.
90. The method of claim 89 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
91. The method of claim 89 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:31, SEQ ID NO:32, SEQ ID NO:33, SEQ ID NO:34 and SEQ ID NO:35 and if
said second gene is selected from (ii) then said second gene encodes a protein
comprising
a sequence selected from the group consisting of SEQ ID NO:36, SEQ ID NO:37,
SEQ
ID NO:38, SEQ ID NO:39, SEQ ID NO:40 and SEQ ID NO:41.
92. The method of claim 89 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:3, SEQ ID NO:6, SEQ ID NO:11, SEQ ID NO:15, SEQ ID NO:19, SEQ ID
NO:23 and SEQ ID NO:27 and if said second gene is selected from (ii) then said
second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:9, SEQ ID NO:13, SEQ ID NO:17, SEQ ID NO:21, SEQ ID NO:25 and SEQ ID
NO:29.
93. The method of claim 89 wherein if said second gene is selected from (i)
then said second
gene encodes multiple units of said amphiphilic .alpha.-helix wherein each
unit of amphiphilic
.alpha.-helix is defined by (H)u((h)x(H)y)z(h)w wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, u equals 0 or 1, x equals 1 or 2, y equals 1 or 2, z is
selected from the
group consisting of 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20
and any whole
number greater than 20, and w equals 0, 1 or 2 and wherein each unit of
amphiphilic
.alpha.-helix is separated from any neighboring unit of amphiphilic .alpha.-
helix by a helix breaker
and wherein any unit of amphiphilic .alpha.-helix can be different from any
other unit of
amphiphilic .alpha.-helix and wherein a value of x in one ((h)x(H)y) need not
be the same as
a value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be
the same as a value of y in another ((h)x(H)y), and wherein if said second
gene is selected
from (ii) then said second gene encodes multiple units of said .beta.-pleated
sheet wherein
each unit of .beta.-pleated sheet is defined by (H)r(hH)s(h)t wherein H is a
hydrophilic amino
acid residue and can vary along said protein, h is a hydrophobic amino acid
residue and
can vary along said protein, r equals 0 or 1, s is any whole number greater
than 0, and t
equals 0, 1 or 2 and wherein each unit of .beta.-pleated sheet is separated
from any
neighboring unit of .beta.-pleated sheet by a helix breaker and wherein any
unit of .beta.-pleated
sheet can be different from any other unit of .beta.-pleated sheet and wherein
a value of r in
one unit of .beta.-pleated sheet need not be the same as a value of r in
another unit of .beta.-
pleated sheet and wherein a value of s in one unit of .beta.-pleated sheet
need not be the same
as a value of s in another unit of .beta.-pleated sheet and wherein a value of
t in one unit of
.beta.-pleated sheet need not be the same as a value of t in another unit of
.beta.-pleated sheet.
94. The method of claim 93 wherein said helix breaker is SEQ ID NO:8.

95. The method of claim 93 wherein if said second gene is selected from (i)
then said second
gene encodes from 4 to 8 units of amphiphilic .alpha.-helix and if said second
gene is selected
from (ii) then said second gene encodes from 4 to 8 units of .beta.-pleated
sheet.
96. The method of claim 94 wherein if said second gene is selected from (i)
then said second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:2, SEQ ID NO:7, SEQ ID NO:12, SEQ ID NO:16, SEQ ID NO:20, SEQ ID
NO:24 and SEQ ID NO:28 and if said second gene is selected from (ii) then said
second
gene encodes a protein comprising a sequence selected from the group
consisting of SEQ
ID NO:10, SEQ ID NO:14, SEQ ID NO:18, SEQ ID NO:22, SEQ ID NO:26 and SEQ
ID NO:30.
97. A method of increasing production of a first protein, or of a nonprotein
product in which
case said first protein catalyzes a step in a synthesis of said nonprotein
product, in a plant
cell or plant cells of a specified species wherein said protein is encoded by
a first gene
with which said plant cell or plant cells are transformed or an ancestor of
said plant had
been transformed wherein said method comprises the steps of:
(a) selecting a cell or cells of said species;
(b) transforming said cell or cells of said species with said first gene if
said plant cell or
plant cells were not already transformed with said first gene;
(c) transforming said cell or cells of said species with a second gene, if
said cell or cells
or an ancestor of said cell or cells had not previously been transformed with
said second
gene, to form a transgenic cell or transgenic cells wherein said second gene
encodes a
protein which comprises a combination of amphiphilic .alpha.-helical sequence and
.beta.-pleated
sheet sequence and wherein said gene does not naturally occur in said plant;
and
(d) growing said transgenic cell or cells in culture or in a bioreactor,
wherein, if it is necessary to perform steps (b) and (c), either step (b) can
be performed
before step (c), step (c) can be performed before step (b), or steps (b) and
(c) can be
performed simultaneously.

98. The method of claim 97 wherein said second gene encodes a protein
comprising a
sequence of units of (((H)u((h)x(H)y)z(h)w X n)v((H)r(hH)s(h)t X m))p or
(((H)r(hH)s(h)t X m)p((H)u((h)x(H)y)z(h)w X n)v wherein H is a hydrophilic
amino acid residue
and can vary along said protein, h is a hydrophobic amino acid residue and can
vary
along said protein, X is any amino acid and may be different for each X n or X
m, u equals
0 or 1, x equals 1 or 2, y equals 1 or 2, z is selected from the group
consisting of 5, 6, 7,
8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 and any whole number greater
than 20, w
equals 0, 1 or 2, n equals any whole number including 0, v equals any whole
number
greater than 0, r equals 0 or 1, s equals any whole number greater than 0, t
equals 0, 1 or
2, m equals any whole number including 0, and p equals any whole number
greater than
0 and wherein any one unit within said protein can differ from any other unit
within said
protein and wherein the values of u, x, y, z, w, n, v, r, s, t, m and p for
one unit can differ
from any other unit and wherein a value of x in one ((h)x(H)y) need not be the
same as a
value of x in another ((h)x(H)y) and wherein a value of y in one ((h)x(H)y)
need not be the
same as a value of y in another ((h)x(H)y).
99. The method of claim 98 wherein h is selected from the group consisting of
glycine,
isoleucine, leucine, methionine, phenylalanine, threonine, tryptophan and
valine and
wherein H is selected from the group consisting of arginine, glutamate,
glycine, histidine,
lysine and threonine.
100. The method of claim 98 wherein X is SEQ ID NO:8.
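
As an informal illustration only, and not part of the claims, the unit definitions recited in claims 93 and 99 can be sketched as regular expressions. The one-letter residue codes, the function names and the example sequences are assumptions introduced here. Because glycine and threonine appear in both the h and H groups of claim 99, and because the two patterns overlap, a match shows only that a sequence fits the recited pattern, not which secondary structure it adopts.

```python
import re

# Residue groups from claim 99, written in one-letter codes (our notation).
HYDROPHOBIC = "GILMFTWV"  # h: Gly, Ile, Leu, Met, Phe, Thr, Trp, Val
HYDROPHILIC = "REGHKT"    # H: Arg, Glu, Gly, His, Lys, Thr

# (H)u((h)x(H)y)z(h)w with u in {0,1}; x, y in {1,2}; z >= 5; w in {0,1,2}
HELIX_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?"
    rf"(?:[{HYDROPHOBIC}]{{1,2}}[{HYDROPHILIC}]{{1,2}}){{5,}}"
    rf"[{HYDROPHOBIC}]{{0,2}}"
)

# (H)r(hH)s(h)t with r in {0,1}; s >= 1; t in {0,1,2}
SHEET_UNIT = re.compile(
    rf"[{HYDROPHILIC}]?"
    rf"(?:[{HYDROPHOBIC}][{HYDROPHILIC}])+"
    rf"[{HYDROPHOBIC}]{{0,2}}"
)


def is_helix_unit(seq: str) -> bool:
    """True if the whole sequence fits the amphiphilic helix unit pattern."""
    return HELIX_UNIT.fullmatch(seq) is not None


def is_sheet_unit(seq: str) -> bool:
    """True if the whole sequence fits the pleated sheet unit pattern."""
    return SHEET_UNIT.fullmatch(seq) is not None


if __name__ == "__main__":
    # Hypothetical test sequences, not taken from the sequence listing.
    for candidate in ("LLKKLLKKLLKKLLKKLLKK", "LKLK", "VKVKVK"):
        print(candidate, is_helix_unit(candidate), is_sheet_unit(candidate))
```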

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 02325463 2000-10-13
WO 99/55890 PCT/US99/09067
TITLE OF THE INVENTION
A METHOD FOR INCREASING THE PROTEIN CONTENT OF PLANTS
BACKGROUND OF THE INVENTION
The composition of plant storage proteins, a major food reservoir for the
developing
seeds, determines the nutritional value of plants and grains when they are
used as foods for man
and domestic animals. The amount of protein varies with genotype or cultivar,
but in general,
cereals contain 10% of the dry weight of the seed as protein, while in
legumes, the protein
content varies between 20% and 30% of the dry weight. In many seeds, the
storage proteins
account for 50% or more of the total protein and thus determine the protein
quality of seeds.
Each year the total world cereal harvest amounts to some 1,700 million tons of
grain (Keris et
al. 1985). This yields about 85 million tons of cereal storage proteins
harvested each year and
contributes a majority of the total protein intake of humans and animals.
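
The 85 million ton figure is consistent with the percentages quoted in this paragraph. The short sketch below reproduces it, assuming a 10% protein content for cereals and a 50% storage protein share; the text does not state how the figure was derived, so this is one reading of the numbers, and the function name is ours.

```python
def storage_protein_harvest(grain_mt: float,
                            protein_fraction: float,
                            storage_fraction: float) -> float:
    """Storage protein harvested, in the same units as grain_mt."""
    return grain_mt * protein_fraction * storage_fraction


# 1,700 million tons of grain, 10% protein, 50% of protein as storage protein
estimate = storage_protein_harvest(1700, 0.10, 0.50)
print(f"{estimate:.0f} million tons of storage protein")
```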
With respect to human and animal nutrition, most seeds do not provide a
balanced source
of protein because of deficiencies in one or more of the essential amino acids
in the storage
proteins. For example, humans require from foods eight amino acids:
isoleucine, leucine, lysine,
methionine, phenylalanine, threonine, tryptophan and valine, to maintain a
balanced diet.
Consumption of proteins of unbalanced composition of amino acids can lead to a
malnourished
state which is most often found in children in developing countries where
plants are the major
source of protein intake. Therefore, the development of nutritionally-balanced
proteins for
introduction into plants is of extreme importance.
AMINO ACID REQUIREMENTS
The biosynthesis of amino acids from simpler precursors is a process vital to
all forms
of life as these amino acids are the building blocks of proteins. Organisms
differ markedly with
respect to their ability to synthesize amino acids. In fact, virtually all
members of the animal
kingdom are incapable of manufacturing some amino acids. There are twenty
common amino
acids which are utilized in the fabrication of proteins and essential amino
acids are those protein
building blocks which cannot be synthesized by the animal. It is generally
agreed that humans
require eight of the twenty common amino acids in their diet. Protein
deficiencies can usually
be ascribed to a diet which is deficient in one or more of the essential amino
acids. A

nutritionally adequate diet must include a minimum daily consumption of these
amino acids
(Figure 1).
When diets are high in carbohydrates and low in protein, over a protracted
period,
essential amino acid deficiencies result. The name given to this
undernourished condition in
humans is "Kwashiorkor" which is an African word meaning "deposed child"
(deposed from the
mother's breast by a newborn sibling). This debilitating and malnourished
state, characterized by
a bloated stomach and reddish-orange discolored hair, is more often found in
children than adults
because of their great need for essential amino acids during growth and
development. In order
for normal physical and mental maturation to occur, the above mentioned daily
source of
essential amino acids is a requisite. Essential amino acid content, or protein
quality, is as
important a feature of the diet as total protein quantity or total calorie
intake.
Some foods, such as milk, eggs, and meat, have very high nutritional values
because they
contain a disproportionately high level of essential amino acids. On the other
hand, most
foodstuffs obtained from plants possess a poor nutritional value because of
their relatively low
content of some or, in a few cases, all of the essential amino acids.
Generally, the essential amino
acids which are found to be most limiting in plants are isoleucine, lysine,
methionine, threonine,
and tryptophan (MLEAA) (Figure 2).
It has been difficult to produce significant increases in the essential amino
acid content
of crop plants utilizing classical plant breeding approaches. This is
primarily due to the fact that
the genetics of plant breeding is complex and that an increase in essential
amino acid content
may be offset by a loss in other agronomically important characters. Also, it
is probable that the
storage proteins are very conserved in their structure and their essential
amino acid composition
would be little modified by these conventional techniques.
STRUCTURE AND CLASSIFICATION OF NATURAL STORAGE PROTEINS
Seed storage proteins can be characterized by several main features (Pernollet
and Mosse
1983): 1) their main function is to provide amino acids or nitrogen to the
young seedling; 2) the
general absence of any other known function; 3) their peculiar amino acid
composition in cereal
and legume seeds; and 4) their localization within storage organelles called
protein bodies, at
least during seed development. Several classes of storage proteins are
generally recognized based
on their solubilities in different solvents. Proteins soluble in water are
called "albumins"; proteins

soluble in 5% saline, "globulins"; and proteins soluble in 70% ethanol,
"prolamins". The proteins
that remain following these extractions are treated further with dilute acid
or alkali, and are
named "glutelins". Most cereals contain primarily prolamin type proteins and
can be classified
into different groups on the basis of the relative proportions of prolamins,
glutelins, and
globulins, and the subcellular location of these proteins in the mature seed.
The first group
corresponds to the Panicoideae sub-family, the second group the Triticeae
tribe, and the last one
to oat and rice storage proteins.
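
The solubility-based classification described above amounts to a sequential extraction: a protein is assigned to the first solvent in the series that dissolves it. A minimal sketch follows, assuming the extraction order given in the text; the data structure and function name are illustrative only.

```python
# Extraction steps in the order given in the text: solvent, then class name.
EXTRACTION_STEPS = [
    ("water", "albumin"),
    ("5% saline", "globulin"),
    ("70% ethanol", "prolamin"),
    ("dilute acid or alkali", "glutelin"),
]


def classify_storage_protein(soluble_in):
    """Return the class of the first solvent in the series that dissolves
    the protein, or None if none of the listed solvents does."""
    for solvent, protein_class in EXTRACTION_STEPS:
        if solvent in soluble_in:
            return protein_class
    return None


print(classify_storage_protein({"70% ethanol"}))
print(classify_storage_protein({"water", "5% saline"}))
```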
The principal members of the Panicoideae sub-family are maize, sorghum, and
millet.
Their major storage proteins are prolamins (50 to 60% of seed protein) and
glutelins (35 to 40%
of seed protein) (Pernollet and Mosse 1983). Prolamins are stored within
protein bodies, but
glutelins are located both inside and outside these organelles. The Triticeae
tribe which includes
wheat, barley, and rye, differs from the Panicoideae mainly in storage protein
localization and
structure. In the starchy endosperm of the seeds belonging to this tribe, no
protein bodies are left
at maturity. Clusters of proteins are then deposited between starch granules,
but are no longer
surrounded by a membrane.
In legumes and most other dicots, the major storage proteins are salt-soluble
globulins
(80%) and prolamins (10-15%). Globulins can be divided into vicillins and
legumins (Agros,
1985), based on their sedimentation coefficient (7S/11S), oligomeric
organization
(trimeric/hexameric), and polypeptide chain structure (single chain/disulphide-
linked pair of
chains). In the legume seed cotyledon, protein bodies are embedded between
starch granules
(Pernollet and Mosse 1983). They are membrane-bound organelles, a few microns
in diameter,
mainly filled with storage proteins and phytates. Besides storage proteins,
protein bodies also
contain other proteins, such as enzymes or lectins, although in lesser
amounts.
The structures of the soluble globulins have been studied more than those of the insoluble
prolamins and
glutelins. Vicillin appears as a homo- or heterotrimer, sometimes able to
associate into hexameric
form. Soybean β-conglycin and French bean phaseolin (Bollini and Chrispeels
1978) are the
structurally best known vicillins. Recently, the three-dimensional structure
of phaseolin was
determined by X-ray crystallographic analysis (Lawrence et al. 1990). However,
unlike other
vicillins, the phaseolin trimer can associate into a dodecamer (tetramer of
trimer) below pH 4.5.
Each polypeptide of the trimeric form comprises two structurally similar units
each made up of
a β-barrel and an α-helical domain.

Glycinin, the soybean legumin, has a quaternary structure that was suggested
by Badiey
et al. (1975) to be twelve subunits packed in two identical hexagons. In
general, the legumin
molecule is a polymer formed by the association of six monomers. Each monomer
consists of
two subunits, acidic and basic. Sometimes, these subunits are associated by
disulfide linkages.
On the other hand, arachin, the peanut legumin, was found to consist of
different kinds of
subunits. The arachin hexamer association does not need different kinds of
subunits, which
suggests that the subunits have a very similar structure.
The most studied storage proteins, in terms of structure, are the corn
prolamins called
zeins. These proteins perform no known enzymatic function. Three types of zeins
(α, β and γ)
(Esen 1986) are synthesized on rough endoplasmic reticulum and aggregate
within this
membrane as protein bodies. The zein protein readily self associates to form
protein bodies and
is insoluble in water even in low concentrations of salt. The presence of all
types of zeins is not
necessary for the formation of a protein body as a single type of zein can
aggregate into a dense
structure and is generally found at the surface of protein bodies (Lending et
al. 1988; Wallace et
al. 1988). The mechanism responsible for protein body formation is thought to
involve
hydrophobic and weak polar interactions between individual zein molecules
(Wallace et al. 1988;
Agros et al. 1982), while they require a high amount of ethanol in aqueous
systems to maintain
their strict molecular conformation (Agros et al. 1982).
Circular dichroic measurements, amino acid sequence analysis, and electron
microscopy
of zein protein suggest that zein secondary structure is primarily helical
with nine adjacent,
topologically antiparallel helices clustered within a distorted cylinder
(Agros et al., 1982;
Larkins, 1983; Larkins, et al.,1984). Polar and hydrophobic residues are
appropriately distributed
along the helical surfaces allowing intra- and intermolecular hydrogen bonds
and van der Waals
interactions among neighboring helices, such that rod-shaped zein molecules
can aggregate and
then stack through glutamate interactions at the cylindrical caps. Because of
this structure, zein
is much less soluble under physiological conditions than the globulin
phaseolin, and precipitation
of insoluble zein in the tightly packed protein body may make it less
available for proteolytic
degradation (Greenwood and Chrispeels 1985).
The storage protein structures are adapted to a maximal packing within protein
bodies
(Pernollet and Mosse 1983). Maximal packing is achieved in at least one of two
ways. The
folding of the polypeptide chain may favor the maximal packing of amino acids
within the

protein molecule, or the compacting of proteins is increased by the formation
of closely packed
quaternary structure. High degrees of polymerization can be observed in pearl
millet pennisetin
(Pernollet and Mosse 1983) or zein (Lending et al. 1988; Wallace et al. 1988).
Also, wheat
prolamins and glutelin associate into aggregates arising in the formation of
insoluble gluten.
These insoluble forms of protein deposits are osmotically inactive and
stable during the long
period of storage between the time of seed maturation and germination.
REGULATION OF STORAGE PROTEIN GENES
All storage proteins which have been investigated are encoded by multigene
families
(Bartels and Tompson 1983; Crouch et al. 1983; Forde et al. 1985; Kasarda et
al. 1984; Lycett
et al. 1985; Rafalski et al. 1984; Slightom et al. 1983). The structure of
these families varies. In
some cases, as in wheat or barley, two major subgroups can be noted: the α-
and γ-gliadins and
the B- and C-hordeins, respectively (Forde et al. 1985; Kasarda et al. 1984;
Rafalski et al. 1984).
Within each subgroup, several subfamilies can be distinguished. Often short
repeats account for
at least part of the structure of the polypeptides. These repeats constitute
links through which
different subfamilies within the same species are related.
Storage protein genes, like most other plant genes characterized to date, are
transcribed
in a regulated rather than a constitutive fashion. Expression is frequently
tissue-specific and/or
temporally regulated. Cis-acting DNA sequences involved in developmental
and/or tissue-
specific regulation of gene expression can be defined by introducing plant
storage protein gene
regulatory regions coupled to bacterial reporter genes (Twell and Ooms 1987;
Wenzler et al.
1989, Marnes et al. 1988; Chen et al. 1988), or by introducing entire or
dissected genes (Colot
et al. 1987; Chen et al. 1986) into a transgenic environment. Unfortunately, a
transformation
system for the nutritionally important cereal species has not yet been well
established. Therefore,
most regulation mechanisms have been studied with transgenic dicot plants.
However, there is
increasing evidence that gene expression is controlled, at least partly, by
the interaction between
regulatory molecules and short sequences that are present in the 5' flanking
region of the gene.
The regulatory sequences of potato storage protein were investigated using
transgenic
potato plants. A 2.5 kb 5' flanking DNA fragment containing the promoter and
the patatin gene
was used to construct a transcriptional fusion gene with chloramphenicol
acetyl transferase
(CAT) or the β-glucuronidase (GUS) gene (Twell and Ooms 1987; Wenzler et al.
1989). When

reintroduced into potato, these chimeric genes were expressed in tubers, but
not in leaves, stems
or roots.
The expression pattern of storage protein genes of cereals is retained in
tobacco, not only
with respect to tissue, but also to temporal expression. The 5' upstream
regions of wheat glutenin
genes possess regulatory sequences that determine endosperm-specific
expression in transgenic
tobacco (Colot et al. 1987). Deletion analysis of the low molecular weight
(LMW) glutenin
sequence indicated that sequences present between 326 bp and 160 bp upstream
of the
transcriptional start point are necessary to confer endosperm-specific
expression. Furthermore,
cis-acting elements determining the regulation of each gene in the cluster are
recognized by the
tobacco trans-acting factor, and cis-acting elements directing
expression of one gene do
not affect expression of neighboring genes. This was demonstrated by the
transfer of a 17. Z kb
soybean DNA containing a seed lectin gene with at least four nonseed protein
genes to transgenic
tobacco plants (Okamuro, 1986). The genes in this cluster were expressed in a
manner similar
to that in soybean; i.e., the lectin gene products accumulated in seeds, and
the other genes were
expressed in tobacco leaves, stems, and roots.
The expression of several DNA deletion mutants with a 257 bp 5' flanking
sequence of
the α'-conglycin gene indicates that this region contained enhancer-like
elements (Chen et al.
1986). Only a low level of expression of the α' gene occurred in developing
seeds of transgenic
plants that contain the α' gene flanked by 159 nucleotides 5' of the
transcriptional start site.
However, a 20-fold increase in expression occurred when an additional 98
nucleotides of
upstream sequence were included. The DNA sequence between 143 and 257
contained five
repeats of the sequence AA(G)CCCA, and played a role in conferring tissue-
specific and
developmental regulation. The 35S promoter containing this sequence in
different positions and
different orientations is able to enhance the expression of the CAT gene by 25
to 40 fold (Chen
et al. 1988).
Trans-acting factors directly involved in storage protein gene regulation have
not yet been
reported. However, in some cases, the level of amino acids can control the
expression of storage
protein. Vegetative storage protein (VSP) gene expression in leaves, stems and
seed pods is
closely related to whether these organs are currently a sink for nitrogen or a
source for mobilized
nitrogen for other organs (Staswick 1989). The leaves have a sensitive
mechanism for detecting
changes in sink demand of mobilizing reserves, and VSP gene expression can be
rapidly adjusted

accordingly. Sequestering excess amino acids in this way may prevent their
accumulation to
toxic levels.
GENETIC ENGINEERING USING AGROBACTERIUM TUMEFACIENS
One of the most significant recent advances in the area of plant molecular
biology has
been the development of the Agrobacterium tumefaciens Ti plasmid as a vector
system for the
transformation of plants. In nature, A. tumefaciens infects most
dicotyledonous and some
monocotyledonous plants by entry through wound sites. The bacteria bind to
cells in the wound
and are stimulated by phenolic compounds released from these cells to transfer
a portion of their
endogenous, 200 kb Ti plasmid into the plant cell (Weiler and Schroder 1987).
The transferred
portion of the Ti plasmid, (T-DNA), becomes covalently integrated into the
plant genome, where
it directs the biosynthesis of phytohormones using enzymes which it encodes.
The vir gene in
the bacterial genome is known to be responsible for this process. In addition
to vir gene products,
directly repeating sequences of 25 bases called "border" sequences are
essential, but only the
right terminus has been shown to be used for T-DNA transfer and integration.
Expression of the T-DNA gene inside the plants results in the uncontrolled
growth of
these and surrounding cells, leading to formation of a gall (Weiler and
Schroder 1987). Ti
plasmids, from which these disease-producing genes have been removed or
replaced, are referred
to as "disarmed" and can be used for the introduction of foreign genes into
plants. The great size
of the disarmed Ti plasmid and lack of unique restriction endonuclease sites
prohibit direct
cloning into the T-DNA. Instead, intermediate vectors such as pMON237 or
pBI121 can be used
to introduce genes into the Ti plasmid. Currently, two kinds of vector systems
are available as
intermediate vectors: cointegrating vectors and binary vectors. A
cointegrating transformation
vector must include a region of homology between the vector plasmid and the Ti
plasmid. Once
recombination occurs, the cointegrated plasmid is replicated by the Ti plasmid
origin of
replication. The cointegrate system, while more difficult to use, does offer
advantages. Once the
cointegrate has been formed, the plasmid is stable in Agrobacterium.
A binary vector contains an origin of replication from a broad host-range
plasmid instead
of a region of homology with the Ti plasmid. Since the plasmid does not need
to form a
cointegrate, these plasmids are considerably easier to introduce into
Agrobacterium. The other
advantage to binary vectors is that this vector can be introduced into any
Agrobacterium host

containing any Ti or Ri plasmid, as long as the vir helper function is
provided. Using these
systems, the gene regulation mechanism of storage proteins has been
elucidated.
IMPROVEMENT OF NUTRITIONAL QUALITIES OF PLANTS
The amino acid composition of the cereal endosperm protein is characterized by
a high
content of proline and glutamine while the amount of essential amino acids,
lysine and
tryptophan in particular, is a limiting factor (Pernollet and Mosse 1983). In
legumes, sulfur
containing amino acids such as methionine and cysteine are the major limiting
essential amino
acids for the efficient utilization of plant protein as animal or human food
while roots and tubers
are deficient in almost all of the essential amino acids.
There has been a great deal of effort to overcome these amino acid limitations
by
breeding and selecting for more nutritionally balanced varieties. Plants have
been mutated in
hopes of recovering individuals with more nutritious storage proteins. Neither
of these
approaches has been very successful, although some naturally occurring and
artificially produced
mutants of cereals were shown to contain a more nutritionally balanced amino
acid composition.
These mutations cause a significant reduction in the amount of storage protein
synthesized and
thereby result in a higher percentage of lysine in the seed; however, the
softer kernels and low
yield of such strains have limited their usefulness (Pernollet and Mosse
1983). The reduction in
storage protein also causes the seeds to become more brittle; as a result,
these seeds shatter more
easily during storage. The lower levels of prolamin also result in flours with
unfavorable
functional properties which cause brittleness in the baked products (Pernollet
et al. 1983). Thus,
no satisfactory solution has yet been found for improving the amino acid
composition of storage
proteins.
One direct approach to this problem is to modify the nucleotide sequence of
genes
encoding storage proteins so that they contain high levels of essential amino
acids. To achieve
this aim, several laboratories have tried to modify and express storage
proteins in the host plants.
Modified natural storage proteins have been created by inserting into the
natural storage protein
genes exogenous DNA sequences coding for essential amino acids. The basic idea
is to produce
modified proteins which are similar to the naturally occurring proteins, but
which have inserted
into them sequences of essential amino acid residues. There are at least three
problems
encountered with this approach: (1) Dilution. Even if this approach is
successful, the modified

protein will still have high levels of non-essential amino acids, effectively
"diluting" the net
concentrations of the encoded essential amino acids. (2) Instability. The
modified proteins are
typically susceptible to proteolytic attack in the plant. Because a natural
storage protein is a
highly evolved structure, artificial modifications to it are likely to
destabilize it. For example,
a stabilizing glutamic acid-lysine salt bridge might be broken. (3) Multiple
copies of genes.
Naturally occurring storage proteins are typically encoded by multiple gene
copies. A mutation
in just one of the copies of the gene will likely have only a limited effect.
In vitro mutagenesis was used to supplement the sulfur amino acid codon
content of a
gene encoding β-phaseolin, a Phaseolus vulgaris storage protein (Hoffmann et
al. 1988). The
nutritional quality of β-phaseolin was increased by the insertion of 15 amino
acids six of which
were methionine. The inserted peptide was essentially a duplication of a
naturally occurring
sequence found in the maize 15 kD zein storage protein (Pederson et al. 1986).
However, this
modified phaseolin achieved less than 1% of the expression level of normal
phaseolin in
transformed seeds. Recently it has been found that this insertion was made in
part of a major
structural element of the phaseolin trimer (Lawrence et al. 1990). Therefore,
an inclusion of 15
residues at this site could distort the structure at the tertiary and/or
quaternary level.
Lysine and tryptophan-encoding oligonucleotides were introduced at several
positions
into a 19 kD α-type zein complementary DNA by oligonucleotide-mediated
mutagenesis
(Wallace et al. 1988). Messenger RNA for the modified zein was synthesized in
vitro and
injected into Xenopus laevis oocytes. The modified zein aggregated into
structures similar to
membrane-bound protein bodies. This experiment suggested the possibility of
creating high-
lysine corn by genetic engineering.
There are alternative approaches that might be more practical. One of these is
to transfer
heterologous storage protein genes that encode storage proteins with higher
levels of the desired
amino acids. For this purpose, a chimeric gene encoding a Brazil nut
methionine-rich protein
which contains 18% methionine has been transferred to tobacco and expressed in
the developing
seeds (Altenbach et al. 1989). The remarkably high level of accumulation of
the methionine-rich
protein in the seed of tobacco results in a significant increase in methionine
levels of ~30%.
The maize 15 kD zein structural gene was placed under the regulation of French
bean β-phaseolin
gene flanking regions and expressed in tobacco (Hoffmann et al.
1987). Zein
accumulation was obtained as high as 1.6% of the total seed protein. Zein was
found in roots,

hypocotyls, and cotyledons of the germinating transgenic tobacco seeds. Zein
was deposited and
accumulated in the vacuolar protein bodies of the tobacco embryo and
endosperm. The storage
proteins of legume seeds such as the common bean (Phaseolus vulgaris) and
soybean (Glycine
max) are deficient in sulfur-containing amino acids. The nutritional quality
of soybean could be
improved by introducing and expressing the gene encoding methionine-rich 15
kD zein
(Pederson et al. 1986).
A synthetic gene (HEAAE I = High Essential Amino Acid Encoding) which encoded
a
protein domain high in essential amino acid was expressed as a CAT-HEAAE I
fusion protein
in potato (Jaynes et al. 1986; Yang et al. 1989). However, structural
instability limited the
high-level expression of this fusion protein in the potato system. Also, the
content of essential amino
acids was diluted to less than 40% of the original encoded protein by
constructing this fusion.
There are several precautions that should be considered in engineering storage
proteins
(Larkins, 1983). First, in vitro mutational change must not be in regions of
the protein that
perturb the normal protein structure; otherwise, the proteins might be
unstable. Second, when
attempting to increase nutritional quality by introducing a gene encoding a
heterologous protein
in crop plants, it is important that the protein encoded by an introduced gene
does not produce
any adverse effects in humans or livestock, the ultimate consumers of the
engineered seed
proteins (Altenbach et al. 1989). Finally, it is critical that the amino acids
present in the
introduced protein are able to be utilized by the animal for growth and
development.
DE NOVO DESIGN OF PROTEINS
Recently, a new field in protein research, de novo design of proteins, has
made
remarkable progress due to a better understanding of the rules which govern
protein folding and
topology. Protein design has two components: the design of activity and the
design of structure.
This review will concentrate on the design of structurally stable storage
protein-like proteins.
The usual approach for the design of helical bundle proteins consists of
linking sequences
with a propensity for forming an α-helix via short loop sequences to get
linear polypeptide
chains. This chain can fold into the predetermined 'globular type' tertiary
structure in aqueous
solution (Mutter 1988; DeGrado et al. 1989). α-Helical secondary structures
are stabilized by
interatomic interactions that can be classified according to the distance
between interacting atoms
in the sequence of the protein (DeGrado et al. 1989).

Short range interactions account for different amino acids having different
conformational
preferences. Both statistical (Chou and Fasman 1978) and experimental (Sueki
et al. 1984)
methods show that residues such as Glu, Ala and Met tend to stabilize helices,
whereas residues
such as Gly and Pro are destabilizing. However, these intrinsic preferences
are not sufficient to
determine the stability of helices in globular proteins.
Analysis of the free-energy requirements for helix initiation and propagation
indicates that peptides of 10 to 20 residues should show little helix
formation in water (Bierzynski et al. 1982) when the Zimm-Bragg equation
(Zimm and Bragg 1959) is used, with parameters (σ and S) determined by
host-guest experiments, where σ is the helix nucleation constant, n is the
number of H-bonded residues in the helix and S is an average stability
constant for one residue:
$$\sigma S^{n-1} / (S - 1)$$
Nevertheless, the 13 amino acid C-peptide obtained from RNase A does show
measurable
helicity (~25%) at low temperature (Bierzynski et al. 1982; Brown and Klee
1981). The stability
of this peptide is 1000-fold greater than the value calculated from the Zimm-
Bragg equation.
Specific side-chain interactions, factors that are not considered in the Zimm-
Bragg model, are
responsible, at least in part, for the fact that the C-peptide is much more
helical than predicted
(Scheraga, 1985).
Medium-range interactions are responsible for the additional stabilization of
secondary
structures (DeGrado et al. 1989). Interactions between side chains are
regarded as important medium-range interactions (Shoemaker et al. 1987;
Marqusee and Baldwin 1987). These include electrostatic interactions,
hydrogen bonding, and the perpendicular stacking of aromatic residues
(Blundell et al. 1986). An α-helix possesses a dipole moment as a result of
the alignment of its peptide bonds. The positive and negative ends of the
amide group dipole point toward the helix NH2-terminus and COOH-terminus,
respectively, giving rise to a significant macrodipole. Appropriately charged
residues near the ends of the helix can favorably interact with the helical
dipole and stabilize helix formation. It was estimated that the electrostatic
interaction energy of a pair of antiparallel α-helices is about 20 kcal/mol
lower than that of a pair of parallel α-helices (Hol and Sanders 1981).
Hydrogen bonds between side chains and terminal helical N-H
and C=O groups
also participate in the stabilization of helical structure (Richardson and
Richardson 1988; Presta
and Rose 1988; Richardson and Richardson 1989).

Protein structures contain several long-range stabilizing interactions which
include
hydrophobic and packing interactions, and hydrogen bonds. Among these, the
hydrophobic effect
is a prime contributor to the folding and stabilizing of protein structures.
The driving force for
helix formation in RNase A arises from long-range interactions between C-
peptide and S-protein,
a large fragment of the protein from which C-peptide was excised (Komoriya and
Chaiken 1985).
The role of hydrophobic interactions in determining secondary structures was
studied for
a series of peptides containing only Glu and Lys in their sequence (DeGrado
and Lear 1985). Glu
and Lys residues were chosen as charged residues for the solvent-accessible
exterior of the
protein to help stabilize helix formation by electrostatic interaction.
STABILITY OF DESIGNED PROTEINS
Hydrophobic residues often repeat every three to four residues in an α-helix
and form an amphiphilic structure (DeGrado et al. 1989). Amphiphilicity is
important for the stabilization of the secondary structures of peptides and
proteins which bind in aqueous solution to extrinsic apolar surfaces,
including phospholipid membranes, air, and the hydrophobic binding sites of
regulatory proteins (DeGrado and Lear 1985). This amphiphilic secondary
structure can be stabilized relative to other conformations by self
association. Therefore, short peptides often form the α-helix in water only
because the helix is amphiphilic and is stabilized by peptide aggregation
along the hydrophobic surface. Natural globular proteins are
folded by a similar
mechanism, involving hydrophobic interaction between neighboring segments of
secondary
structure (Presnell and Cohen 1989). Using the concept of an amphiphilic
helix, DeGrado and
coworkers have successfully built peptide-hormone analogs with minimal
homology to the native
sequences. These peptides, like the native ones, are not helical in solution
but do form helices
at the hydrophobic surfaces of membranes.
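The link between hydrophobic periodicity and amphiphilicity can be made concrete with a small calculation. The following is a minimal sketch, not part of the original disclosure: it assumes an ideal α-helix that advances 100° per residue, scores residues with a crude two-valued scale (+1 apolar, -1 polar), and uses made-up sequences. The function names `design_amphiphilic` and `hydrophobic_moment` are hypothetical.

```python
import math

# Assumptions for illustration only: an ideal alpha-helix advances 100 degrees
# per residue (3.6 residues per turn), and residues are scored on a crude
# two-valued scale. Neither the scale nor the sequences come from the patent.
DEGREES_PER_RESIDUE = 100.0
APOLAR = set("AILMFVW")


def design_amphiphilic(length, apolar="L", polar="K"):
    """Place an apolar residue wherever the helical-wheel position falls on
    one half of the helix surface, and a polar residue elsewhere."""
    residues = []
    for i in range(length):
        angle = math.radians(i * DEGREES_PER_RESIDUE)
        residues.append(apolar if math.cos(angle) > 0 else polar)
    return "".join(residues)


def hydrophobic_moment(seq):
    """Length of the mean hydrophobicity vector on the helical wheel."""
    x = y = 0.0
    for i, aa in enumerate(seq):
        score = 1.0 if aa in APOLAR else -1.0
        angle = math.radians(i * DEGREES_PER_RESIDUE)
        x += score * math.cos(angle)
        y += score * math.sin(angle)
    return math.hypot(x, y) / len(seq)


designed = design_amphiphilic(18)
print(designed)                      # pattern of apolar (L) and polar (K) residues
print(hydrophobic_moment(designed))  # large value: one apolar face, one polar face
print(hydrophobic_moment("L" * 18))  # close to zero: no sidedness at all
```

In this toy model the apolar residues of the designed sequence fall at i+3/i+4 spacings, which is the periodicity the paragraph describes, while a uniformly apolar sequence has no preferred face.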
Designed synthetic peptides have been used to show how hydrophobic periodicity
in a
protein sequence stabilizes the formation of simple secondary structures such
as an amphiphilic
α-helix (Ho and DeGrado 1987). The strategies used in the design of the
helices in the four-helix
bundles are: 1) the helices should be composed of strong helix-forming amino
acids and 2) the
helices should be amphiphilic; i.e., they should have an apolar face to
interact with neighboring
helices and a polar face to maintain water solubility of the ensuing
aggregates. The results show
that hydrophobic periodicity can determine the structure of a peptide.
Therefore, the peptides
tend to have random conformations in very dilute solution, but form secondary
structures when
they self associate (at high concentration) or bind to the air-water surface.
The free energy associated with dimerization or tetramerization of the
designed peptides
could be experimentally determined from the concentration dependence of the CD
spectra for the
peptides (DeGrado et al. 1989; Lear et al. 1988; DeGrado and Lear 1985). At
low concentrations,
the peptides were found to be monomeric and have low helical contents, whereas
at high
concentration they could self associate and stabilize the secondary structure.
Therefore, possible
hairpin loops between helices can affect the stability of the secondary
structure by enhancing the
self association between the helical monomers. A strong helix breaker (Chou
and Fasman 1978; Kabsch and Sander 1983; Sueki et al. 1984; Scheraga 1978)
was included as the first and last residue to set the stage for adding a
hairpin loop between the helices. A single proline residue appeared capable
of serving as a suitable link if the C- and N-terminal glycine residues are
slightly unwound. Glycine lacks a β-carbon, which is essential for the
reverse turn where positive dihedral angles are required. The pyrrolidine
ring of proline constrains its φ dihedral angle to about -60°.
Thus, proline should be destabilizing at positions where significantly
different backbone torsion
angles are required. This amino acid, as well as glycine, has a high tendency
to break helices and
occurs frequently at turns (Creighton 1987).
The direct evidence for stabilization of protein structure by adding the
linking sequence
was observed by comparing the guanidine denaturation curve for a monomer,
dimer and tetramer
(DeGrado et al. 1989). The gene encoding the tetrameric protein was expressed in
E. coli and purified
to homogeneity. In the series of mono-, di-, and tetramer, the stability
toward guanidine
denaturation increases concomitantly with the increase in covalent cross-links
between helical monomers. At equivalent peptide concentrations, the midpoints
of the denaturation curves occurred at 0.55, 4.5 and 6.5 M guanidine for the
mono-, di-, and tetramer, respectively.
Furthermore, as the
number of covalent cross-links was increased, the curves became increasingly
cooperative. Thus,
the linker sequence stabilized the formation of the four helix structures at
low concentration of
the peptides (<1 mg/ml).
Structural stability of proteins is directly related to in vivo proteolysis
(Parasell and Sauer
1989). Proteolysis depends on the accessibility of the scissile peptide bonds
to the attacking
protease. The sites of proteolytic processing are generally in relatively
flexible interdomain segments or on the surface of the loops, in contrast to
the less accessible intradomain peptide
bonds (Neurath 1989). This suggests that the stability of the folded state of
the protein is the most
important determinant for its proteolytic degradation rate. The effect of a
folded structure on the
proteolytic degradation has been proven by several experiments. First,
proteins that contain
amino acid analogs or are prematurely terminated are often degraded rapidly in
the cells
(Goldberg and St. John 1976). Second, there are good correlations between
the thermal stabilities
of specific mutant proteins and their rates of degradation in E. coli (Pakula
and Sauer 1986,
Parasell and Sauer 1989). Finally, second-site suppressor mutations that
increase the
thermodynamic stability of unstable mutant proteins have also been shown to
increase resistance
to intracellular proteolysis (Pakula and Sauer 1989). The solubility of
proteins could also affect
their proteolytic resistance as some proteins aggregate to form inclusion
bodies that escape
proteolytic attack (Kane and Hartley 1988).
Metabolic stability is another factor influencing the in vivo stability of
proteins. Usually,
damaged and abnormal proteins are metabolically unstable in vivo (Finley and
Varshavsky 1985;
Pontremoli and Melloni 1986). In eukaryotes, covalent conjugation of ubiquitin
with proteins is
essential for the selective degradation of short-lived proteins (Finley and
Varshavsky 1985). It
was found that the amino acid at the amino-terminus of the protein determined
the rate of
ubiquitination (Bachmair et al. 1986). Both prokaryotic and eukaryotic long-
lived proteins have
stabilizing amino acids such as methionine, serine, alanine, glycine,
threonine, and valine at the
amino terminus. On the other hand, amino acids such as leucine,
phenylalanine, aspartic acid,
lysine, and arginine destabilize the target proteins.
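The two residue classes listed above lend themselves to a simple lookup. The sketch below only encodes the residues named in the preceding paragraph; the function name `n_terminal_class` and the example sequences are hypothetical, and residues the paragraph does not mention are reported as "not listed" rather than guessed.

```python
# Residue classes exactly as listed in the paragraph above (one-letter codes).
STABILIZING = set("MSAGTV")    # Met, Ser, Ala, Gly, Thr, Val
DESTABILIZING = set("LFDKR")   # Leu, Phe, Asp, Lys, Arg


def n_terminal_class(seq):
    """Classify a protein by its amino-terminal residue."""
    first = seq[0].upper()
    if first in STABILIZING:
        return "stabilizing"
    if first in DESTABILIZING:
        return "destabilizing"
    return "not listed"


# Hypothetical sequences, for illustration only.
for s in ("MKELLKK", "RKELLKK", "QKELLKK"):
    print(s, n_terminal_class(s))
```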
Recently, many laboratories have attempted to improve the nutritional quality
of plant
storage proteins by transferring heterologous storage protein genes from other
plants (Pederson
et al. 1986). The development of recombinant DNA technology and the
Agrobacterium-based
vector system has made this approach possible. However, genes encoding storage
proteins
containing a more favorable amino acid balance do not exist in the genomes of
major crop plants.
Furthermore, modification of native storage proteins has met with difficulty
because of their
instability, low level of expression, and limited host range. One possible
alternative is the de
novo design of a more nutritionally-balanced protein which retains certain
characteristics of the
natural storage proteins of plants.
Our initial work described the use of small fragments of DNA which encoded
spans of
protein high in essential amino acids (Jaynes et al. 1985; Yang et al. 1989).
Subsequently, the
genes encoding these protein domains were cloned into an existing protein and
the expression
level of this modified protein determined in transgenic potato plants.
However, because of some
of the problems mentioned above, the results were somewhat less than desirable
(Yang et al.
1989).
The publications and other materials used herein to illuminate the background
of the
invention or provide additional details respecting the practice, are
incorporated by reference, and
for convenience are respectively grouped in the appended List of References.
SUMMARY OF THE INVENTION
Experiments were performed which were designed to produce transgenic plants
which
produce higher levels of essential amino acids. For this purpose, plants were
made transgenic
with a synthetic nucleic acid construct which encoded a protein containing
high levels of
essential amino acids. Resulting transgenic plants produced not only higher
levels of essential
amino acids, but unexpectedly these plants also produced higher levels of
protein in general.
This increase in total protein content ranged from approximately 2-fold to
5-fold.
One aspect of the invention is a transgenic plant comprising a gene which
encodes a
protein which causes the transgenic plant to overproduce total protein as
compared to a
nontransgenic plant.
A second aspect of the invention is a gene encoding a protein wherein plants
which are
transgenic for this gene overproduce total protein as compared to a
nontransgenic plant.
A third aspect of the invention is a protein wherein if a plant is made
transgenic for a gene
encoding said protein said transgenic plant will overproduce total plant
protein as compared to
the plant when it is not transgenic. This protein may comprise an amphiphilic
α-helical sequence, a β-pleated sheet sequence, or a combination of α-helix
and β-pleated sheet.
Another aspect of the invention is a transgenic plant cell which contains a
gene encoding
a protein which causes the plant cell to overproduce total protein as compared
to a nontransgenic
cell.
Yet another aspect of the invention is a method for increasing the production
of a specific
protein in a plant or plant cell by transforming the plant or plant cell with
a gene which encodes
a protein which causes the overproduction of total protein in the transgenic
plant or plant cell.

Still another aspect of the invention is a method for increasing the
production of a
nonprotein product in a plant or plant cell by transforming the plant or plant
cell with a gene
encoding a protein which causes the overproduction of total protein in the
transgenic plant or
plant cell and thereby results in the increased synthesis of nonproteinaceous
material.
Yet another aspect of the invention is a method for enhancing the production
of a specific
protein or nonprotein in a plant or plant cell by cotransforming the plant or
plant cell with 1) a
gene encoding the specific protein or a protein involved as an enzyme in the
synthetic pathway
of the nonprotein product and 2) a gene encoding a protein which results in
the generalized
overproduction of total plant or plant cell protein.
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1 shows the average essential amino acid requirement for both children
and adults
in mg per kg body weight.
Figure 2 shows the amounts of foodstuffs which must be consumed in grams per
day in
order to meet the minimum daily requirement of all essential amino acids.
Figure 3 illustrates how the amino acid composition of the ASP1 monomer was
chosen.
Figure 4 shows the percentage of essential amino acids (EAA) and percentage of
most limiting essential amino acids (MLEAA) in ASP1 tetramer compared with
natural proteins.
Figure 5 is a depiction of the amphiphilicity of the ASP1 monomer where
hydrophobic amino acids are in the white rectangle and hydrophilic amino acids
are in the shaded rectangle. There are interactions between the Glu (E) and
Lys (K) residues which are shown as dark lines depicting salt-bridges.
Figure 6 shows the amino acid sequence of the ASP1 tetramer (SEQ ID NO:2).
Hydrophilic amino acids are underlined and β-turns are indicated.
Figures 7A-7B show the protein content of plants. Figure 7A shows the overall
protein
content determined by amino acid analysis. P-2 TC is a control plant and P-7
T, P-11 T, P-17
T and P-29 T are plants transformed with ASP1 tetramer. Figure 7B shows the %
increase of
protein content in the transformed plants as compared to the control plant.
These data were
derived from seedlings obtained from transformed mother plants. A minimum of
four separate
assays were used and the variation was no more than 30%.

Figure 8A depicts the overall protein content of leaves from control and ASP1
tetramer
seedlings. The plants are labeled as for Figures 7A-7B. Figure 8B shows the %
increase in
protein content for the transformed plants as compared to the control plant.
DETAILED DESCRIPTION OF THE INVENTION
The present invention uses quite a different approach. Rather than mutate or
transfer a
gene for a naturally occurring protein, an artificial protein has been
constructed de novo. This
de novo protein has nutritionally balanced proportions of the essential amino
acids, is stable
following expression in a plant, and shares some of the characteristics of
naturally occurring
plant storage proteins. Transgenic plants have been produced which contain
such a gene. These
plants not only produce more essential amino acids compared to controls, but
surprisingly the
total amount of protein produced by these plants is also increased.
Furthermore, the total amount
of nonproteinaceous components can also be increased via these methods.
There are at least two fundamental difficulties in achieving efficient
expression of
designed proteins. First, it is not yet known what stabilizes a protein
against proteolytic
breakdown and second, the mechanisms for folding of an amino acid sequence
into a
biologically-stable tertiary structure have not yet been fully delineated. For
the construction of
DNP 1 (Designed Nutritional Protein), we focused on the design of a
physiologically-stable as
well as a highly nutritious, storage protein-like, artificial protein.
DESIGNED NUTRITIONAL PROTEINS
We designed the synthetic protein DNP 1 to contain a high content of those
amino acids
which are essential to the diet of animals. The optimized content of essential
amino acids for this
new protein was obtained empirically by determining the amounts of essential
amino acids
necessary for normal metabolism of the animal. See Table 1, which gives
essential amino acid
requirement (grams/day) (in the following order) for children at 3 months,
children at 5 years,
children at 10 years, average for children at these three ages, adults at 25
years, adults at 75 years,
average for adults at these two ages, and overall average. We also determined
the 'deficiency
values' or the ratios of deficient essential amino acids for the 10 primary
crops animals consume
throughout the world (Figure 3). See Table 2, which gives essential amino acid
deficiency ratios
for the ten major crop plants consumed by humans. From these data, we then
found the ratio of
essential amino acids needed to totally complement each particular plant
foodstuff. We averaged
these values and derived a set of numbers we call the 'Average Ratio for All
Crops Idealized to the DNP 1 Monomer' (Figure 4). This set of numbers
represents the ratio of essential amino acids necessary to complement the
deficiencies found in all 10 crops for all human age groups.

Table 1. Essential amino acid requirement (grams/day)

           Infant   Child    Child    Child    Adult    Adult    Adult    Overall
           3 mo     5 yr     10 yr    Ave      25 yr    75 yr    Ave      Ave
Ile        0.258    0.524    0.879    0.554    0.754    0.754    0.754    0.654
Leu        0.521    1.234    1.382    1.046    1.102    1.102    1.102    1.074
Lys        0.370    1.150    1.382    0.967    1.020    1.020    1.020    0.994
Met+Cys    0.235    0.468    0.691    0.465    0.986    0.986    0.986    0.725
Phe+Tyr    0.403    1.178    1.124    0.902    1.102    1.102    1.102    1.002
Thr        0.241    0.636    0.879    0.585    0.522    0.522    0.522    0.554
Trp        0.095    0.185    0.240    0.173    0.260    0.260    0.260    0.217
Val        0.308    0.655    0.785    0.583    0.754    0.754    0.754    0.668

Table 2. Essential amino acid deficiency ratios for the ten major crop plants

E.A.A.       Wheat     Corn     Rice            Barley    Sorghum
Ile          1.72      2.23     1.71            1.94      1.51
Leu          1.85      1.25     1.57            1.98      0.85
Lys          4.08      3.36     3.03            3.68      4.54
Met + Cys    1.73      2.41     3.86            2.53      2.55
Phe + Tyr    1.20      1.30     1.10            1.32      1.49
Thr          2.14      1.67     1.75            2.01      1.89
Trp          1.61      2.20     1.78            1.37      1.69
Val          1.68      1.39     1.20            1.18      1.50

E.A.A.       Cassava   Taro     Sweet Potato    Potato    Plantain
Ile          1.88      1.69     1.92            1.68      1.33
Leu          2.13      1.64     2.70            2.45      2.10
Lys          2.83      2.29     2.96            2.08      2.24
Met + Cys    3.18      3.52     2.84            3.53      3.74
Phe + Tyr    1.58      2.37     1.29            1.67      1.71
Thr          1.58      1.55     1.62            1.54      2.28
Trp          1.02      1.41     1.37            1.60      1.39
Val          1.23      1.31     1.19            1.01      1.36
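The averaging step can be illustrated with the Table 2 values. This is a sketch under a stated assumption: the text does not say how the ten crops were weighted or how the result was scaled to the monomer, so the code takes a plain arithmetic mean of each amino acid's deficiency ratio across the crops. The variable names are hypothetical.

```python
from statistics import mean

CROPS = ["Wheat", "Corn", "Rice", "Barley", "Sorghum",
         "Cassava", "Taro", "Sweet Potato", "Potato", "Plantain"]

# Deficiency ratios transcribed from Table 2, in the crop order given above.
DEFICIENCY = {
    "Ile":       [1.72, 2.23, 1.71, 1.94, 1.51, 1.88, 1.69, 1.92, 1.68, 1.33],
    "Leu":       [1.85, 1.25, 1.57, 1.98, 0.85, 2.13, 1.64, 2.70, 2.45, 2.10],
    "Lys":       [4.08, 3.36, 3.03, 3.68, 4.54, 2.83, 2.29, 2.96, 2.08, 2.24],
    "Met + Cys": [1.73, 2.41, 3.86, 2.53, 2.55, 3.18, 3.52, 2.84, 3.53, 3.74],
    "Phe + Tyr": [1.20, 1.30, 1.10, 1.32, 1.49, 1.58, 2.37, 1.29, 1.67, 1.71],
    "Thr":       [2.14, 1.67, 1.75, 2.01, 1.89, 1.58, 1.55, 1.62, 1.54, 2.28],
    "Trp":       [1.61, 2.20, 1.78, 1.37, 1.69, 1.02, 1.41, 1.37, 1.60, 1.39],
    "Val":       [1.68, 1.39, 1.20, 1.18, 1.50, 1.23, 1.31, 1.19, 1.01, 1.36],
}

# Assumption: an unweighted arithmetic mean over the ten crops.
average_ratio = {aa: mean(values) for aa, values in DEFICIENCY.items()}

# The amino acid with the largest average ratio is the most deficient overall.
most_limiting = max(average_ratio, key=average_ratio.get)

for aa, value in sorted(average_ratio.items(), key=lambda kv: -kv[1]):
    print(f"{aa:10s} {value:.3f}")
print("largest average deficiency ratio:", most_limiting)
```

Under this unweighted average, lysine and the sulfur amino acids carry the largest ratios, so a protein built to these proportions would be weighted most heavily toward them.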
From the above set of numbers, we designed a nutritional protein for humans
(ASP1). The amino acid sequence for ASP1 is shown in Figure 6 and is SEQ ID
NO:2. The DNA sequence used to encode this protein is shown as SEQ ID NO:1.
It has 1.8 times more of the essential amino acids compared to zein or
phaseolin. The difference in MLEAA is much higher, containing 3 times more
than phaseolin and 6.5 times more than zein. The helical region of ASP1 is
amphipathic (hydrophobic residues clustered on one face of the helix while
hydrophilic residues are found on the other face) and is stabilized by
several Glu-Lys salt bridges (Figure 5). The helix breaker Gly-Pro-Gly-Arg
(SEQ ID NO:8) has been used as a turn sequence. The design results in an
antiparallel tetramer which achieves an extraordinarily stable secondary and
tertiary structure even at low concentration.
The structural stability of a protein is important in determining its
susceptibility to
proteolysis. Most native proteins are relatively resistant to cleavage by
proteolytic enzymes,
whereas denatured proteins are much more sensitive (Pace and Barret 1984).
Several findings
suggest that the stability of a folded protein is an important determinant of
its rate of degradation.
Therefore, in addition to improved nutritional quality, ASP1 has been
designed to have a stable storage protein-like structure in plants. Its
design is based on the structurally well-studied corn storage zein proteins
(Z19 and Z22), which are comprised of 9 repeated helical units (Agros et
al. 1982). Each helical unit of zein, 16 to 26 amino acids long, is flanked by
turn regions and
turn regions and
forms an antiparallel helical bundle. Most of the amino acids in the helices
are hydrophobic
residues. On the other hand, ASP1 is comprised of 4 helical repeating units,
each 20 amino acids
long (Figure 6). Increased gene copy number by concatenation can increase the
protein yields.
At the same time, gene concatenation increases the molecular mass of the
encoded protein.
Such an increase in size and concatenation can significantly stabilize an
otherwise unstable
product (Shen 1984).
The gene encoding this novel peptide was chemically synthesized and cloned
into an E.
coli expression vector. This gene contains plant consensus sequences at the 5'
end of the
translation initiation site to optimize the expression of proteins in vivo. It
was placed under the
control of the 35S cauliflower mosaic virus (CaMV) promoter in order to permit
the constitutive
expression of this gene in tobacco. The gene can also be cloned into other
microorganisms, such
as yeasts, through standard means known in the art.
Unless otherwise clearly indicated by context, the term "ASP1" is intended to
encompass
any one or more of the following: (1) the peptide whose sequence is SEQ ID
NO:3; (2) the peptide whose sequence is SEQ ID NO:4; (3) any polymer,
copolymer, oligomer or co-oligomer of one or both of SEQ ID NO:3 and SEQ ID
NO:4, such as the tetrameric ASP1 whose sequence is SEQ ID NO:2; or (4) any
peptide or protein having substantially the same amino acid sequence as any
of the above, and substantially the same stability upon expression in at
least one plant, but whose amino acid sequence has been modified in a manner
which will naturally occur to one of skill in the art, such as by insertions,
deletions, and/or transpositions which are not substantially detrimental to
the stability of, or to the nutritionally balanced essential amino acid
composition of, the protein. By way of example, numerous transpositions,
insertions, or deletions in the amino acid residue sequence of ASP1 or other
proteins of the invention will occur to those of skill in the art. It will be
desirable to maintain overall amphipathy of the structure to promote
stability; and it will also be desirable to have as internal sequences
glu-X-X-X-lys (SEQ ID NO:5), to promote salt bridges in the α-helix, which
also promote protein stability. While other acid-X-X-X-base sequences may
also serve this function, glu-X-X-X-lys (SEQ ID NO:5) is preferred: lysine is
preferred as the base because it is an essential amino acid; and glutamic
acid is preferred as the acid because it has been observed to stabilize an
α-helix better than does aspartic acid. This same type of definition as set
out for ASP1 also applies to all other polypeptides or proteins which are
disclosed herein.
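A candidate sequence can be scanned for the preferred glu-X-X-X-lys spacing with a short pattern search. The sketch below is illustrative only: the function name `salt_bridge_positions` and the example sequence are hypothetical (the sequence is not ASP1), and the search checks spacing alone, not whether a salt bridge would actually form.

```python
import re

# Glu followed four residues later by Lys. The zero-width lookahead lets
# overlapping motifs be reported as well.
SALT_BRIDGE = re.compile(r"(?=E...K)")


def salt_bridge_positions(seq):
    """Return the 0-based index of each Glu that has a Lys four residues on."""
    return [m.start() for m in SALT_BRIDGE.finditer(seq)]


example = "EAAAKEELLKK"  # hypothetical sequence, for illustration only
print(salt_bridge_positions(example))
```

Other acid-X-X-X-base pairings mentioned in the text could be covered by widening the character classes, for example `[DE]...[KR]`.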
The protein should also be designed for ready digestibility by the proteases
of the
intended consumer. For example, frequent lysine (or arginine) sites will
promote proteolytic
attack by trypsin. Frequent phenylalanine (or tyrosine) sites will promote
proteolytic attack by
chymotrypsin.
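The effect of placing cleavage sites in a designed sequence can be previewed with a simple in silico digest. This is a deliberately simplified sketch: it cuts after every listed residue and ignores known refinements of protease specificity (such as reduced cleavage before proline). The function name `digest` and the test sequence are hypothetical.

```python
TRYPSIN = set("KR")        # cuts after Lys or Arg, as stated in the text
CHYMOTRYPSIN = set("FY")   # cuts after Phe or Tyr, as stated in the text


def digest(seq, residues):
    """Split seq after every residue that belongs to the given set."""
    fragments, start = [], 0
    for i, aa in enumerate(seq):
        if aa in residues:
            fragments.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        fragments.append(seq[start:])
    return fragments


example = "GPGRMKELLKKFYEAL"  # hypothetical sequence, for illustration only
print(digest(example, TRYPSIN))
print(digest(example, CHYMOTRYPSIN))
```

More frequent sites give shorter fragments, which is the sense in which the text expects them to promote digestibility.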
It may be desirable to tailor the essential amino acid content of the protein
specifically
to complement the essential amino acid content of a particular crop of
interest, rather than an
average for several crops. It may also be desirable to tailor the essential
amino acid content to
match the nutritional requirements of the intended consumer species. For
example, an artificial
storage protein to be expressed in maize might have one composition if the
maize is intended for
human consumption, and a somewhat different composition if the maize is
intended for feeding
pigs.
An amphipathic peptide or protein is one in which the hydrophobic amino acid
residues are predominantly on one side, while the hydrophilic amino acid
residues are predominantly on the opposite side, resulting in a peptide or
protein which is predominantly hydrophobic on one face, and predominantly
hydrophilic on the opposite face.
PREDICTION OF THE STRUCTURE OF ASP1
Without wishing to be bound by the following discussion of inferences
regarding ASP1's
structure, the following gives the inventor's best current information and
inferences regarding
that structure. The secondary structures of the ASP1 monomer and tetramer
were predicted by PREDICT-SECONDARY in β-SYBYL. The percentage of α-helix
content predicted by information-theory was higher than that given by the
other two prediction methods (Bayes-statistic and neural-net) in
PREDICT-SECONDARY. The predicted
secondary
structures by information-theory gave 100% helical content for the monomer and
74% for the
tetramer.
However, the accuracy of the three widely used prediction methods ranged from
49% to
56% for prediction of three states; helix, sheet, and coil (Kabsch and Sander
1983). This
inaccuracy might be due to the small size of the data base and/or the fact
that secondary structure
is determined by tertiary interactions which are not included in the local
sequences. For further
predictions of structure, the structures predicted by information-theory were
energy minimized
using SYBYL MAXIMIN2.
A perfect amphiphilic α-helical conformation was predicted for the
ASP1-monomer after minimization. The tertiary structure of the ASP1-tetramer
after minimization showed the antiparallel conformation as was designed.
These minimization results suggested the high probability of stable secondary
structure (α-helix and β-turn) formation of the ASP1-monomer and -tetramer.
STRUCTURAL ANALYSIS OF ASP1 PROTEIN
The structural stability of ASP1-monomer and tetramer could not be determined
by minimization only. Therefore, the stability of the α-helical secondary
structure of ASP1-monomer was investigated. HPLC analysis of the gel-filtered
synthetic ASP1-monomer showed that purity was more than 90%, and amino acid
analysis of the purified fraction gave the expected molar ratios. This
fraction was also analyzed by mass spectrometry, and the molecular weight
peak corresponding to the ASP1-monomer (2896.5) was present. Since the
structural stability of ASP1-monomer and tetramer could not be determined by
minimization only, the stability of the α-helical secondary structure of
ASP1-monomer was investigated by circular dichroism (CD) analysis. CD spectra
of ASP1-monomer showed the typical pattern of alpha helical proteins with
double minima at 208 and 222 nm in aqueous solution (data not shown). The
stability of the secondary structure can be induced by the inter-molecular
interaction between the helical chains (DeGrado et al. 1989). Therefore,
stable aggregation between monomers, presumably through hydrophobic
interactions, could stabilize the helical structure. Besides, proper packing
of the apolar side chains and proper electrostatic interaction might play
important roles in stabilizing the secondary structure of ASP1. The stable
interaction among the monomeric ASP1 molecules is an important determinant
for the proper folding into the tertiary structure of the ASP1-tetramer.
Therefore, the self association capability of the ASP1-monomers was
investigated by using size exclusion chromatography. The hydrodynamic
behavior of this peptide showed that it was aggregated into a hexamer form
with an apparent molecular weight of about 17 kD. This hexameric aggregate
could be maintained in either low or high ionic strength solutions. This
result provides proof of the stable globular type tertiary structure
formation of tetrameric ASP1.
Three potential β-turn (Gly-Pro-Gly-Arg (SEQ ID NO:8)) sequences were
inserted between four monomers for the ASP1-tetramer construction. The β-turn
could play an important role for structural stability of the ASP1-tetramer
when it is expressed in vivo. It can also help stabilize tertiary structure
formation. The interactions between the helical monomers might be much faster
due to the proximity effect when they are connected. This proximity effect
might be critical for folding at the low concentrations of ASP1-tetramer that
are possible when it is expressed in vivo. At the same time, the stability of
the secondary structure is increased by the hydrophobic interactions between
helical monomers. In addition, this β-turn sequence has a tryptic digestion
site (Gly-Arg) which can increase the digestibility of this protein when it
is consumed by animals.
The stability of the folded structure of a protein has a close relation to its
proteolytic
degradation rate (Pace and Barret 1984; Pakula and Sauer 1986; Parasell and
Sauer 1989; Pakula and Sauer 1989). In this respect, we expected high
stability of folded ASP1-tetramer against proteolytic degradation when it is
expressed in vivo. Stable quaternary structure is essential for the formation
of protein bodies of storage proteins in zein or phaseolin
(Lawrence et al. 1990).
These higher order structures can be achieved through the interaction and
close packing of the
stable tertiary structures. The major driving force for this quaternary
structure formation is also
hydrophobic interaction between the tertiary structures.
INTRODUCTION OF ASP1 GENE INTO TOBACCO
The correct insertion and orientation of the pBI derivative containing the
ASP1 tetramer
was screened for by EcoRI and HindIII digestion (it was found in E. coli that
the most stable
form of the gene was the tetramer form). The EcoRI digestion gave a fragment
of the expected
size, 3.2 kb, which consisted of 3'NOS of ASP1 and the GUS gene (data not
shown). Also, the ASP1 gene with its 35S promoter and 3'NOS sequences was
detected as a 1.4 kb band by HindIII digestion. Stable transformation of the
ASP1 gene into A. tumefaciens LBA4404 was confirmed by HindIII digestion of
isolated plasmid DNA. It could be isolated from Agrobacterium and detected by
enzyme digestion because pBI121 is a binary vector. Leaf discs, transformed
with
transformed with
LBA4404 carrying the ASP1 gene, gave about 5 to 7 shoots two to three weeks
after infection.
A total of 565 kanamycin-resistant shoots were regenerated from 120 leaf
discs. These shoots
were excised from the leaf discs and transferred to new media to grow several
more weeks, and
then transferred to rooting media. After three weeks in rooting medium, 126
rooted shoots were
analyzed for β-glucuronidase (GUS). Root tips of 56 out of 126 plants showed
various levels of
various levels of
GUS activity. Not all the kanamycin-resistant shoots showed the GUS positive
result. Although
kanamycin resistance was due to the expression of neomycin phosphotransferase
(NPT II gene),
regeneration of nontransgenic shoots in the presence of kanamycin has been
reported. Therefore,
escapes from the screening based on kanamycin sensitivity might have occurred
in the
nontransformed plants, making them kanamycin resistant.
Thirty-six plantlets which showed high levels of β-glucuronidase activity were
transplanted into jiffy pots. After establishment of the plants, a more
accurate fluorogenic assay for GUS activity was done to quantify the expression
level of this gene (Table 3). GUS activity was measured as pmole 4-methyl
umbelliferone produced per mg protein per minute, all at an excess of 4-methyl
umbelliferone glucuronide. Some of these transformed tobacco plants showed
higher levels of β-glucuronidase activity than others. The level of expression
might be primarily affected by whether the gene is incorporated into an active
or inactive site of chromatin. Activity of chromatin, methylation of DNA and
nuclease hypersensitivity are closely related to each other. It has been found
that the nuclease hypersensitive sites correlate with active transcription
(Gross and Garrard 1987). The degree of methylation of DNA is inversely related
to gene expression. Furthermore, if the gene is located near the plant's
endogenous promoter or enhancer sites, the level of expression of this gene
will be increased by these nearby enhancing factors. Therefore, the difference
in the levels of GUS activity between the transformed plants might be due to
this positional effect, which was determined by the sites of incorporation of
this gene into the tobacco genome.
Table 3
Transgenic Plant    GUS Activity (pmole 4-methyl umbelliferone/mg protein/min)
ASP1 #1               200
ASP1 #9               315
ASP1 #11            3,790
ASP1 #13              360
ASP1 #17            2,400
ASP1 #29              200
pBI121 #2             320
Wildtype #37           10
ANALYSIS OF TRANSFORMED PLANTS
DNA Analysis
Although GUS activity and kanamycin resistance are good indicators of
transformation, rearrangement in the T-DNA after incorporation in the plant
genome can inactivate or silence the other genes transferred. Correct
incorporation of the ASP1 gene into the tobacco genome was therefore determined
by Southern blotting using the ASP1 tetrameric fragment as a probe. A distinct
1.4 kb HindIII band appeared in 7 out of 9 tobacco genomic DNA samples
analyzed, but did not appear in negative control samples. As a positive
control, and to check the copy number, HindIII-digested plasmid pBI
ASP1-tetramer was also loaded, corresponding to 1 and 5 copies of the inserted
gene in tobacco DNA. Multiple positive bands were observed from most of the
transformed plants, with the expected size of 1.4 kb. Extra bands appeared
which were bigger than 1.4 kb, and which showed different patterns between the
individual plants. These results suggested that the ASP1 gene, alone or with
neighboring genes, might be inserted into several sites in the chromosomes,
with or without rearrangement. The copy number of the correct band varied among
the plants, and ranged from 1 to 5 by densitometric measurement.
The copy number of a gene can affect its expression level; a gene with a high
copy number can give a higher level of expression. The impact of copy number on
the extent of expression varies from one system to another. In some cases there
are positive correlations to expression, but not always. It should not be
expected that all the copies of the gene are equally active, because the
position of a copy in the genome can affect its level of transcription.
However, as the number of chromosomal sites containing foreign DNA increases,
the likelihood that at least one of the pieces of DNA will integrate into a
transcriptionally active region also increases.
Kanamycin Gene Segregation Test
First generation progeny from self-fertilized, transformed parents were tested
for kanamycin gene segregation. Because the integrated T-DNA is inherited as a
dominant Mendelian trait, the number of loci carrying the ASP1 gene can be
determined from the kanamycin segregation pattern of the progeny. The results
showed that most transformed plants had multiple copies of the ASP1 gene. See
Table 4, in which Kn(r) is the number showing kanamycin resistance, and Kn(s)
is the number showing kanamycin sensitivity. The progenies of transformed
plants carrying single, double, or triple genetic NPT II loci (the gene
bestowing kanamycin resistance) are expected to segregate in 3:1, 15:1 or 63:1
ratios, respectively. Therefore, plants #2 and #13 have one NPT II locus;
plants #1, #11 and #29 have two NPT II loci; and plant #17 has more than three
loci of the gene encoding kanamycin resistance.
Table 4
Transgenic Plant    Kn(r)    Kn(s)    Kn(r)/Kn(s)
ASP1 #1             136       8       15:1
ASP1 #11            127       8       15:1
ASP1 #13            112      37       3:1
ASP1 #17            175       1       175:1
ASP1 #29            131       9       15:1
pBI121 #2           107      34       3:1
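The expected ratios follow from independent assortment: a selfed parent that is hemizygous at n unlinked NPT II loci yields kanamycin-sensitive progeny with probability (1/4)^n, so the resistant:sensitive ratio is (4^n - 1):1. The following is a minimal sketch, not part of the original analysis, that compares the Table 4 counts with the 1-, 2- and 3-locus hypotheses using a chi-square goodness-of-fit statistic. The function names and the 3.84 cutoff (5% level, one degree of freedom) are assumptions made for this illustration, and the statistic is only approximate when the expected number of sensitive seedlings is small.

```python
# Illustrative sketch: which n-locus hypotheses fit the Table 4 counts?
# Assumes unlinked, hemizygous NPT II loci in a self-fertilized parent.

TABLE_4 = {  # plant: (Kn(r), Kn(s)), copied from Table 4
    "ASP1 #1": (136, 8),
    "ASP1 #11": (127, 8),
    "ASP1 #13": (112, 37),
    "ASP1 #17": (175, 1),
    "ASP1 #29": (131, 9),
    "pBI121 #2": (107, 34),
}

CHI2_CUTOFF = 3.84  # assumed 5% critical value for 1 degree of freedom


def expected_ratio(n_loci):
    """Resistant:sensitive ratio for n unlinked loci, i.e. (4**n - 1):1."""
    return 4 ** n_loci - 1, 1


def chi_square(resistant, sensitive, n_loci):
    """Goodness-of-fit statistic of the observed counts to the n-locus ratio."""
    total = resistant + sensitive
    expected_sensitive = total * 0.25 ** n_loci
    expected_resistant = total - expected_sensitive
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


def compatible_loci(resistant, sensitive, max_loci=3):
    """Locus numbers (1..max_loci) not rejected at the assumed cutoff."""
    return [n for n in range(1, max_loci + 1)
            if chi_square(resistant, sensitive, n) < CHI2_CUTOFF]


if __name__ == "__main__":
    for plant, (kn_r, kn_s) in TABLE_4.items():
        print(plant, compatible_loci(kn_r, kn_s))
```

Because only three hypotheses are tested here, a plant such as #17 that fits the 63:1 ratio may equally fit a higher number of loci; raising `max_loci` explores that.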

RNA Analysis
Efficient transcription of inserted ASP1 genes in the tobacco plants was tested
by Northern blot analysis. The polyA RNA was analyzed using the ASP1-tetramer
probe. The correct gene size transcribed was about 490 bases, which included 30
bases upstream and 170 bases downstream of the ASP1-tetramer gene. In addition
to this message, eukaryotic mRNA contains polyA tails of different sizes.
Therefore, the expected size of the ASP1-tetramer message should be around
600 ± 100 bases long. Bands were observed which corresponded to this expected
size from all the samples which were analyzed. However, the levels of
transcription of the ASP1 genes were dramatically different among the different
transformed plants. Transformed plant #17 accumulated 5- to 50-fold more
transcripts than the other transformed plants. Such differences in accumulation
could be explained by the effect of position, or by the effect of multiple copy
insertion. The expression levels of the ASP1 gene and its neighboring GUS gene
correlated with each other in some transformed plants (such as in plant #17),
but not in all. These results suggested that the level of expression of two
closely connected genes can be dramatically different. Multiple transcripts
with different sized bands (500-700 bases) were observed from several
transformed plants. This result might be due to multiple insertion of the ASP1
gene into the tobacco genome. These inserted genes may be rearranged, but still
produce transcripts. Another possibility might be strong secondary structure
which could be formed due to the four directly repeated sequences of the
tetrameric ASP1 transcripts. Different mobilities could result, depending on
the secondary structure.
Expression of ASP1
Standard means known in the art were used to raise polyclonal antibody against
synthetic ASP1 monomer. This antibody was used to detect the production of
stable ASP1 protein in tobacco. If desired, standard means known in the art can
also be used to prepare monoclonal antibodies against ASP1. High levels of the
tetrameric form (11.2 kD) of the ASP1 protein were detected from plant #17 by
Western blot analysis (data not shown). Therefore, direct correlation was found
between gene copy number, number of genetic NPT II loci, GUS expression,
accumulation of ASP1 transcript and protein expression level in the case of
plant #17. Some heterologous seed proteins undergo specific degradation when
expressed in transgenic plants. A significant amount of the immunoreactive
protein accumulated in tobacco seed expressing the phaseolin gene is smaller
than the final processed protein (Sengupta et al. 1985). A similar result was
found when β-conglycinin was expressed in transgenic petunia (Beachy et al.
1985). In contrast to these results, the ASP1 protein appears to be quite
stable in transgenic tobacco plants.
Amino acid and total protein analyses were conducted on leaf tissue from
several of the transgenic plants which produced detectable levels of ASP1.
Surprisingly, we found that the overall levels of all amino acids were
increased, with some of the plants being remarkably high (Table 5, Figures
7A-7B). These data were derived from seedlings from transformed mother plants.
A minimum of four separate assays was used; variation was no more than 30%. The
data shown in Table 5 were determined from the dry weight of the whole plant.
This rather disconcerting result has been repeated numerous times and the
overall levels of all amino acids in the transgenic plants remain significantly
elevated. See Table 6 and Figures 8A-8B. Table 6 gives percentages of the amino
acids above that of the control (pBI121 #2) for protein isolated from various
ASP1 tetramer seedlings. These values were derived from amino acid analysis.
Figures 8A-8B depict overall protein content from equivalent samples (by
weight) taken from leaves of control (P-2TC) and various ASP1 tetramer
seedlings. Other methods of determining overall protein content have been used
with similar trends observed. For example, comparison of total protein
densitometric values derived from SDS-PAGE of equivalent samples (on a weight
basis) yields the same results (data not shown). Therefore, in addition to
being a very stable protein in a plant cell, ASP1 must function as a general
'protein stabilizer' that reduces overall protein turnover without apparent
deleterious effects to the plants, since there is no observable difference in
growth characteristics between the plants producing high amounts of ASP1 and
control plants.
Table 5
                       Total Protein    % Above Control
Transformed Control    12                 0
ASP1 #7                19                58
ASP1 #11               28               133
ASP1 #17               24               100
ASP1 #29               17                42
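The "% Above Control" column can be recomputed from the total protein values as 100 × (transformant − control) / control. The short sketch below is only an arithmetic check on the tabulated numbers; the function and variable names are illustrative and do not come from the original.

```python
# Illustrative check of the "% Above Control" column of Table 5.

CONTROL_TOTAL_PROTEIN = 12  # transformed control, from Table 5

TOTAL_PROTEIN = {  # plant number: total protein, from Table 5
    7: 19,
    11: 28,
    17: 24,
    29: 17,
}


def percent_above_control(value, control=CONTROL_TOTAL_PROTEIN):
    """Relative increase over the control, rounded to a whole percent."""
    return round(100 * (value - control) / control)


if __name__ == "__main__":
    for plant, protein in TOTAL_PROTEIN.items():
        print(f"plant #{plant}: {percent_above_control(protein)}% above control")
```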

Table 6
        % of 7      % of 11     % of 17     % of 29
        Above C     Above C     Above C     Above C
Asx      60.55       80.59       47.00       35.00
Glx      65.26       56.18       46.42       20.68
Ser      30.00      109.67       73          28.00
Gly      14.46      115.96       78.01       23.69
His      31.30       94.27       63.74       27.23
Arg      39.06       86.5        58.28       23.24
Thr      31.00      106.55       79.48       23.44
Ala       6.21      123.91       76.55       27.54
Pro      11.68      114.53       73.65       19.66
Tyr     254.95      261.26      236.49      121.02
Val      14.45       80.06       53.32       23.70
Met      ND          ND          ND          ND
Cys      ND          ND          ND          ND
Ile      19.86       69.51       45.3        22.65
Leu       3.95       99.81       68.17       22.54
Phe       2.65      101.77       72.42       14.45
Lys     -25.90      119.51       79.34        3.61
Trp      ND          ND          ND          ND
Expression of ASP1 in Sweet Potato
The above results indicate the surprising overall increase in protein
production in tobacco plants which were transformed with ASP1. Similar results
have also been found in sweet potato. These results indicate that the increase
in total protein content is a general phenomenon which is applicable to at
least most plants. Table 7 lists the percentage of total protein, as a function
of dry weight, of the transformed controls and ASP1 transformants of sweet
potato. The numbers are the average of 5 separate assays. Table 8 indicates the
amount of essential amino acid in mg/100 grams edible portion of the sweet
potato; the numbers are the average of 3 separate assays. Table 9 expresses
these essential amino acid contents as a percentage of the transformed control,
the numbers being the average of 3 separate assays. Table 10 shows data for a
repeat of the experiments of Table 8, but with the content of more of the amino
acids determined. The numbers in Table 10 are the average of 3 separate assays.
Table 11 shows the increase in transformant #5. Table 12 shows the % protein
(wet weight basis) of roots and leaves, with the numbers being the average of
at least 3 separate assays, while Table 13 depicts the overall protein content
of the roots of transformed plants on a dry weight basis, together with percent
dry matter and overall moisture content.
Table 7
Overall Protein Content and Percentage of the Control Transformed Plant
                       % Total Protein    % of Control
Transformed Control    3.3 ± 0.31         100
ASP1 Transformant 1    6.3 ± 0.46         191
ASP1 Transformant 2    5.2 ± 0.14         158
ASP1 Transformant 3    4.8 ± 0.06         146
ASP1 Transformant 4    9.1 ± 0.16         276
ASP1 Transformant 5    9.6 ± 0.19         291
Table 8
Essential Amino Acid Content in mg/100 Grams Edible Portion
Essential AA     T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine        90         225       255       290       315       270
Leucine          175         360       415       465       455       430
Lysine           148         275       315       350       395       365
Methionine        55         115       135       135       155       135
Phenylalanine    135         275       340       375       430       350
Threonine        135         225       305       350       385       325

Table 9
Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA     T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine       100         250       283       322       350       300
Leucine          100         206       237       266       260       246
Lysine           100         186       213       237       267       247
Methionine       100         209       246       246       282       246
Phenylalanine    100         204       252       277       319       259
Threonine        100         167       226       259       285       241
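Table 9 is Table 8 re-expressed relative to the transformed control, that is, 100 × (transformant value) / (control value). As a hedged illustration, the sketch below recomputes three rows and reports the largest difference from the tabulated percentages; only the isoleucine, leucine and lysine rows are included, and all names are assumptions made for this example.

```python
# Illustrative consistency check between Table 8 (mg/100 g) and Table 9 (%).
# Column order in every row: T-Control, T1, T2, T3, T4, T5.

TABLE_8 = {
    "Isoleucine": (90, 225, 255, 290, 315, 270),
    "Leucine": (175, 360, 415, 465, 455, 430),
    "Lysine": (148, 275, 315, 350, 395, 365),
}

TABLE_9 = {
    "Isoleucine": (100, 250, 283, 322, 350, 300),
    "Leucine": (100, 206, 237, 266, 260, 246),
    "Lysine": (100, 186, 213, 237, 267, 247),
}


def percent_of_control(row):
    """Express each entry of a row as a percentage of its first (control) entry."""
    control = row[0]
    return [100.0 * value / control for value in row]


def max_deviation(measured, reported):
    """Largest absolute gap between recomputed and tabulated percentages."""
    worst = 0.0
    for name, row in measured.items():
        for computed, listed in zip(percent_of_control(row), reported[name]):
            worst = max(worst, abs(computed - listed))
    return worst


if __name__ == "__main__":
    gap = max_deviation(TABLE_8, TABLE_9)
    print(f"largest deviation: {gap:.2f} percentage points")
```

For these rows the recomputed values agree with Table 9 to within one percentage point.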

Table 10
Essential Amino Acid Content in mg/100 grams Edible Portion (Sweet Potato)
Essential AA      T-Control   ASP1 T1   ASP1 T2   ASP1 T3   ASP1 T4   ASP1 T5
Isoleucine         80           320       420       383       388       433
Leucine           155           567       680       633       625       687
Lysine            125           450       510       493       493       537
Methionine         30           143       190       173       165       197
Phenylalanine      90           497       600       560       540       617
Threonine         105           423       480       430       445       487
Tryptophan          0.5          83        65        51        57        61
Valine            110           473       610       573       588       637
Nonessential AA
Aspartic Acid     230         2,260     1,267     1,533     2,395     2,567
Serine             95           513       450       547       558       660
Glutamic Acid     245         1,100       913       993     1,000     1,210
Proline           110           223       270       333       358       400
Glycine           100           387       333       397       393       443
Alanine           105           473       367       397       480       487
Tyrosine           55           327       310       367       358       403
Histidine          40           197       153       160       198       210
Arginine           95           417       357       590       450       507
Ammonium           45           250       140       160       260       277
Protein             2.300        10.150     7.593     9.473    10.403    11.083

Table 11
Essential Amino Acid Content as a Percentage of the Transformed Control Plants
Essential AA     T-Control   ASP1 T5
Isoleucine       100           542
Leucine          100           443
Lysine           100           429
Methionine       100           656
Phenylalanine    100           685
Threonine        100           464
Tryptophan       100         1,230
Valine           100           579
Table 12
Overall Protein Content (Fresh Weight) in Storage Roots and Leaves
                       % Protein in Roots    % Protein in Leaves
Transformed Control    0.36                  0.94
ASP1 Transformant 1    2.42                  1.93
ASP1 Transformant 2    1.60                  1.21
ASP1 Transformant 3    2.11                  1.26
ASP1 Transformant 4    2.23                  1.60
ASP1 Transformant 5    2.46                  2.03

Table 13
Composition of Roots (Protein Content on a Dry Weight Basis)
                       % Protein    % Dry Matter    % Moisture
Transformed Control     2.3         17.0            83
ASP1 Transformant 1    10.2         23.7            76
ASP1 Transformant 2     8.0         21.7            79
ASP1 Transformant 3     9.5         23.0            78
ASP1 Transformant 4    10.6         20.7            80
ASP1 Transformant 5    11.7         21.9            79
In a field study of 5 separate transformed lines of sweetpotato which were
transformed
with ASP1, it was seen that 3 of the 5 lines grew more slowly than the control
plants and
produced fewer storage roots, while the remaining two transformed lines which
had the highest
protein levels grew normally. Application of nitrogenous fertilizer did not
make any significant
difference in the yield of these two lines.
Other Gene Constructs Which Can Yield Increased Overall Protein Content
Protein constructs similar to ASP1 will also cause plants to give elevated
total protein yields when the plant is transformed with a gene construct which
expresses such a protein. One such protein is HDNP1, which has the following
monomeric amino acid sequence:
MLEEIFKKMTEWIEKVLKTM (SEQ ID NO:6)
hhHHhhHHhHHhhHHhhHhh (SEQ ID NO:31)
The "h" and "H" below the amino acid sequence refer to "hydrophobic" and
"hydrophilic", respectively. Hydrophobic amino acids comprise: isoleucine,
methionine, phenylalanine, tryptophan, valine, leucine, alanine and cysteine.
Hydrophilic amino acids comprise: arginine, glutamic acid, histidine, lysine,
asparagine, aspartic acid, glutamine, tyrosine and proline. Glycine, threonine
and serine can act as either hydrophilic or hydrophobic amino acid residues
depending upon their immediate environment. The HDNP1 monomer is composed of 20
amino acids in the structural motif to render an amphiphilic α-helix. The
tetrameric form is:
MLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTMgpgrMLEEIFKKMTE
WIEKVLKTMgpgrMLEEIFKKMTEWIEKVLKTM (SEQ ID NO:7).

This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 92 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
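The h/H annotation of the HDNP1 monomer can be checked mechanically against the residue classes listed above. The sketch below is an illustration only: it maps each one-letter residue to "h", "H", or "?" (for glycine, threonine and serine, which the text says can act as either), and then tests whether a sequence is consistent with a stated pattern. The function names and the "?" convention are assumptions introduced for this example.

```python
# Illustrative sketch: compare a sequence with its h/H hydrophobicity pattern,
# using the residue classes given in the text (one-letter amino acid codes).

HYDROPHOBIC = set("IMFWVLAC")   # Ile, Met, Phe, Trp, Val, Leu, Ala, Cys
HYDROPHILIC = set("REHKNDQYP")  # Arg, Glu, His, Lys, Asn, Asp, Gln, Tyr, Pro
EITHER = set("GTS")             # Gly, Thr, Ser: context dependent

HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"  # SEQ ID NO:6
HDNP1_PATTERN = "hhHHhhHHhHHhhHHhhHhh"  # SEQ ID NO:31


def classify(residue):
    """Return 'h' (hydrophobic), 'H' (hydrophilic) or '?' (either)."""
    if residue in HYDROPHOBIC:
        return "h"
    if residue in HYDROPHILIC:
        return "H"
    if residue in EITHER:
        return "?"
    raise ValueError(f"unknown residue: {residue!r}")


def matches_pattern(sequence, pattern):
    """True if every unambiguous residue agrees with the h/H pattern."""
    if len(sequence) != len(pattern):
        return False
    return all(cls == "?" or cls == expected
               for cls, expected in zip(map(classify, sequence), pattern))


if __name__ == "__main__":
    print("".join(classify(r) for r in HDNP1_MONOMER))
    print(matches_pattern(HDNP1_MONOMER, HDNP1_PATTERN))
```

The two threonine residues are the only positions left open by the classification; the pattern assigns one of them to each class.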
HDNP1 is quite similar to ASP1 except that the Leu in position 5 of the ASP1
monomer has been changed to Ile, and the Ile in position 17 of the ASP1 monomer
has been changed to a Leu in HDNP1. For the tetramer, these changes are made
throughout the protein, as can be seen by comparing the amino acid sequences.
Yet another protein which can yield similarly elevated protein levels, when
plants are transformed with a gene construct expressing the gene, is the
protein HDNP2, which has the following monomeric sequence:
MTIEWKVELKFEMKIELKMT (SEQ ID NO:9)
hHhHhHhHhHhHhHhHhHhH (SEQ ID NO:36)
This monomer is composed of 20 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKFEMKIELKMTgpgrMTIEWKVELKF
EMKIELKMTgpgrMTIEWKVELKFEMKIELKMT (SEQ ID NO:10).
The tetramer is composed of 92 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
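Each tetramer described here is four copies of the monomer joined by three gpgr turns, so its length is 4 × (monomer length) + 12. A minimal sketch of that construction, using the HDNP2 monomer as the example, is given below; the function names are illustrative assumptions.

```python
# Illustrative sketch: assemble a tetramer from a monomer and the gpgr turn.

TURN = "gpgr"                            # beta-turn, SEQ ID NO:8
HDNP2_MONOMER = "MTIEWKVELKFEMKIELKMT"   # SEQ ID NO:9


def make_tetramer(monomer, copies=4, turn=TURN):
    """Join `copies` monomers with the turn sequence between neighbours."""
    return turn.join([monomer] * copies)


def tetramer_length(monomer_length, copies=4, turn_length=len(TURN)):
    """Total residues: the monomers plus (copies - 1) turns."""
    return copies * monomer_length + (copies - 1) * turn_length


if __name__ == "__main__":
    tetramer = make_tetramer(HDNP2_MONOMER)
    print(tetramer)
    print(len(tetramer), tetramer_length(len(HDNP2_MONOMER)))
```

The same formula can be applied to any monomer length to check a stated tetramer size.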
Protein Designs Useful for Organisms Other than Humans
The proteins ASP1, HDNP1 and HDNP2 were designed to yield high levels of
essential amino acids especially suitable for humans. Each type of animal has
its own set of required essential amino acids, and these sets, while usually
overlapping, differ from each other. Other proteins can be designed which yield
higher levels of essential amino acids more suitable for organisms other than
humans. For example, pigs have one set of essential amino acids, chickens have
a different set, and fish have yet a different set. Transgenic plants can be
engineered as feed for one particular species of animal. For example, various
transgenic corn plants can be produced wherein one transgenic form is most
suitable for humans, a second transgenic form will produce a high level of
those essential amino acids suited for pigs, and a third transgenic form can be
made which is most suited for chickens. The design of such proteins can be
based on the design of ASP1, HDNP1 and HDNP2. One of skill in the art knows how
to prepare DNA which will encode each desired protein. The following are
examples of the monomeric and tetrameric forms of proteins which may be used
for specific species of animals. DNAs encoding these proteins are easily
designed and used to make transgenic plants as described above.
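One simple way to compare candidate designs is to compute the fraction of residues that belong to a chosen set of essential amino acids. The sketch below is an illustration under stated assumptions: the default set is the eight amino acids listed as essential in Table 10 (Ile, Leu, Lys, Met, Phe, Thr, Trp, Val), and a species-specific set would have to be supplied by the user, since the text does not enumerate one. The function name is hypothetical.

```python
# Illustrative sketch: fraction of a designed protein made up of essential
# amino acids. The default set is the one tabulated as "Essential AA" in
# Table 10; other species would need their own set (not given in the text).

from collections import Counter

ESSENTIAL_TABLE_10 = frozenset("ILKMFTWV")

HDNP1_MONOMER = "MLEEIFKKMTEWIEKVLKTM"  # SEQ ID NO:6
HDNP2_MONOMER = "MTIEWKVELKFEMKIELKMT"  # SEQ ID NO:9


def essential_fraction(sequence, essential=ESSENTIAL_TABLE_10):
    """Share of residues in `sequence` that fall in the `essential` set."""
    counts = Counter(sequence.upper())
    essential_total = sum(n for residue, n in counts.items()
                          if residue in essential)
    return essential_total / len(sequence)


if __name__ == "__main__":
    for name, seq in (("HDNP1", HDNP1_MONOMER), ("HDNP2", HDNP2_MONOMER)):
        print(f"{name}: {essential_fraction(seq):.0%} essential residues")
```

Passing a different `essential` set allows the same calculation for a design aimed at another species.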
A protein intended for use with swine is SDNP1, which has the amino acid
sequence:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKM (SEQ ID NO:11)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHHh (SEQ ID NO:32)
This monomer is composed of 41 amino acids in the structural motif to render an
amphiphilic α-helix. The tetrameric form is:
MFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETM
HKWEEVIKKFVTMVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVT
MVEETLKKFEEITKKMgpgrMFETIVKLVEETMHKWEEVIKKFVTMVEETLKKFEEIT
KKM (SEQ ID NO:12).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 176 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
A second protein for swine is SDNP2, which has the monomeric amino acid
sequence:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM (SEQ ID NO:13)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:37)
This monomer is composed of 41 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVET
HWEMKIEVKFETKIEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETK
IEVKTEMKLEVKFTMgpgrMTIEFKVELKVETHWEMKIEVKFETKIEVKTEMKLEVKFTM
(SEQ ID NO:14).
The tetramer is composed of 176 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
A protein intended for use with poultry is PDNP1, which has the amino acid
sequence:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:15)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHh (SEQ ID NO:33)
This monomer is composed of 37 amino acids in the structural motif to render an
amphiphilic α-helix. The tetrameric form is:
MFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHW
TEVFGKIFEMGTRFLEGFTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEG
FTKMgpgrMFEGLVKIMEEVLRHWTEVFGKIFEMGTRFLEGFTKM (SEQ ID NO:16).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 160 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
A second protein for poultry is PDNP2, which has the monomeric amino acid
sequence:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:17)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHh (SEQ ID NO:38)
This monomer is composed of 37 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMH
VGFELKIGFTVEMRLGFETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGF
ETKMgpgrMEFKVGIELRFTWEMHVGFELKIGFTVEMRLGFETKM (SEQ ID NO:18).
The tetramer is composed of 160 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
A protein intended for use with fish is FDNP1, which has the amino acid
sequence:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK (SEQ ID NO:19)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhhHH (SEQ ID NO:34)
This monomer is composed of 40 amino acids in the structural motif to render an
amphiphilic α-helix. The tetrameric form is:
MFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMK
KWEEVFKRVLHILEEFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILE
EFVRKFEETMRKgpgrMFEELVRTIEELMKKWEEVFKRVLHILEEFVRKFEETMRK
(SEQ ID NO:20).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 172 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
A second protein for fish is FDNP2, which has the monomeric amino acid
sequence:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK (SEQ ID NO:21)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:39)
This monomer is composed of 40 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVE
LKVEWRIEFHTELKMELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELK
MELRVELRFEMKgpgrMEIKLEVRFETKVELKVEWRIEFHTELKMELRVELRFEMK
(SEQ ID NO:22).
The tetramer is composed of 172 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
A protein intended for use with dogs is DDNP1, which has the amino acid
sequence:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:23)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an
amphiphilic α-helix. The tetrameric form is:
MVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWE
EMLHKFVEVLTKLFETFTKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETF
TKIMgpgrMVETFIKLVEEIVRKWEEMLHKFVEVLTKLFETFTKIM (SEQ ID NO:24).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
A second protein for dogs is DDNP2, which has the monomeric amino acid
sequence:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:25)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhh (SEQ ID NO:40)
This monomer is composed of 38 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKW
EVHLEIRFEVKLEMKFTLTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFT
LTMVgpgrMTVEFKLEIKVTIEFKWEVHLEIRFEVKLEMKFTLTMV (SEQ ID NO:26).
The tetramer is composed of 164 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
A protein intended for use with cats is CDNP1, which has the amino acid
sequence:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:27)
hhHHhhHhhHHhhHHhHHhhHHhhHhhHHhhHHhHHhh (SEQ ID NO:35)
This monomer is composed of 38 amino acids in the structural motif to render an
amphiphilic α-helix. The tetrameric form is:
MLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWE
EMFKHVLTFMEEIVKRITRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRI
TRLMgpgrMLETLFKIVEETLRKWEEMFKHVLTFMEEIVKRITRLM (SEQ ID NO:28).
This tetrameric form shows the 4 α-helices interspaced with the β-turn gpgr
(SEQ ID NO:8). The tetramer is composed of 164 amino acids, including the 12
amino acids comprising the 3 β-turns, in the structural motif to render an
amphiphilic α-helix.
A second protein for cats is CDNP2, which has the monomeric amino acid
sequence:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:29)
hHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhHhH (SEQ ID NO:41)
This monomer is composed of 38 amino acids in the structural motif to render an
amphiphilic β-pleated sheet. The tetrameric form shown below has each stretch
of β-pleated sheet interspaced with the β-turn gpgr (SEQ ID NO:8). The sequence
of the tetrameric form is:
MTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIK
VELKTEVRIEMKFEVRLEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVR
LEFRMTgpgrMTLEFKLTMELHWEIKVELKTEVRIEMKFEVRLEFRMT (SEQ ID NO:30).
The tetramer is composed of 164 amino acids, including the 12 amino acids
comprising the 3 β-turns, in the structural motif to render an amphiphilic
β-pleated sheet.
Use of Vectors to Increase the Production of a Second Protein for which the Plant is Transformed
The generally enhanced levels of protein production can be useful in expressing
other valuable proteins. For example, if a gene coding for insulin were cloned
into a plant expressing the ASP1 gene, it is expected that levels of insulin
production will be higher, as compared to control plants having the insulin
gene but lacking the ASP1 gene. Therefore, plants which are transgenic both for
the ASP1 gene (or a similar gene which also results in increased total protein
production) and for a second gene which encodes a protein of interest will make
more of the protein of interest than if the plant were transformed solely with
the gene encoding the protein of interest and not with ASP1 or a similar gene.
It is irrelevant whether the plant is first transformed with ASP1 or a similar
gene and later transformed with a gene of interest, or whether the plant is
first transformed with the gene of interest and later transformed with ASP1 or
a similar gene. Also, a transformation can be performed using both genes
simultaneously.
Use of Vectors to Increase the Production of Nonprotein Products
Plants and plant cells which have been made transgenic for ASP1 or similar
amphipathic proteins produce greater amounts of all protein than do
nontransgenic plants or cells. As a result of this generally higher level of
protein, higher levels of nonprotein products will also be made. This result is
expected because there will be an increase in the levels of enzymes which are
used in the synthesis of such products. For example, taxol is naturally
synthesized by certain plants and the synthesis of taxol is dependent on
enzymes. Increased levels of those enzymes will lead to increased levels of
taxol. Similarly, many plants, e.g., sugarcane, produce sugars. Again, the
synthesis of sugars is dependent on enzymes within the plant. Increased levels
of these enzymes will yield increased levels of the sugars. Therefore, simply
making a plant or plant cell transgenic for ASP1 or a similar amphipathic
protein will result in the plant or cell producing more product, wherein said
product need not be a protein but is synthesized by protein (enzyme) action.
Similarly, if one knows the enzymes involved in the synthetic pathway of a
desired product, e.g., taxol or sugar, one can co-transform a plant or plant
cell with a gene encoding ASP1 or a similar amphipathic protein and with a gene
encoding the enzyme which is utilized in synthesizing the desired product. In
this way one can further enhance the production of the desired product. This
can be especially useful if there is one limiting enzyme and the gene for this
limiting enzyme of the pathway is used.
Sweetpotato was transformed with ASP1 and two transformed lines were assayed
for sugar content and overall amount of dry matter versus moisture content.
Results are shown in Table 14.

Table 14
                  % Sucrose    % Glucose    % Fructose
Control           0.54         0.06         0.06
Transformant 1    0.95         0.06         0.06
Transformant 2    1.57         0.15         0.13
Table 14 shows that Transformant 1 had an increased production of sucrose but
normal production of both glucose and fructose, whereas Transformant 2 had
increased production of all 3 sugars as compared to the control plant.
Table 13 indicates that the overall amount of dry matter is increased from 17%
in the control to roughly 22% in the transformants. This is approximately a 30%
increase in dry matter as a percent of the total weight of the plant.
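The "roughly 22%" and "approximately 30%" figures can be reproduced from the dry matter column of Table 13. The following is a small arithmetic sketch only; the variable and function names are illustrative.

```python
# Illustrative arithmetic check using the % dry matter column of Table 13.

CONTROL_DRY_MATTER = 17.0  # transformed control, % dry matter

TRANSFORMANT_DRY_MATTER = [23.7, 21.7, 23.0, 20.7, 21.9]  # transformants 1-5


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def relative_increase(value, control):
    """Increase of `value` over `control`, as a percentage of `control`."""
    return 100.0 * (value - control) / control


if __name__ == "__main__":
    average = mean(TRANSFORMANT_DRY_MATTER)
    print(f"mean dry matter of transformants: {average:.1f}%")
    print(f"relative increase over control: "
          f"{relative_increase(average, CONTROL_DRY_MATTER):.1f}%")
```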
Use in Plants Other than Tobacco and Sweet Potato
High-level, tissue-specific expression of ASP1 or related genes can also be
performed, in a manner generally analogous to that described above for tobacco
and sweet potato, in certain economically important plants such as rice, wheat,
barley, sorghum, maize, potato, plantain, cassava, taro, soybean, alfalfa, or a
forage grass. It is desirable to incorporate suitable promoters or other
regulatory sequences to encourage expression (preferably constitutive
expression) primarily in the part of the plant intended as a foodstuff. For
example, in rice or maize, expression is desired primarily in the seeds; while
in potato or sweet potato, expression is desired primarily in the tuber. Where
necessary, transformation protocols known in the art other than the
Agrobacterium protocol will be used, such as transformation through DNA
particle gun or via plant protoplasts. See, e.g., Klein et al. (1987) and
Croughan et al. (1989). These plants can be transformed with vectors encoding
not only ASP1, but also any similar protein, including any of the proteins
disclosed above.
Cell Culture of Transgenic Plant Cells
It is not necessary to make transgenic plants to perform the invention. Plant cells can be made transgenic with a gene encoding ASP1 or another amphipathic protein, and these transgenic cells can be grown in culture or in a bioreactor. This avoids the necessity of having to regenerate a plant. These transgenic cells will produce enhanced levels of protein and other products, as was seen in the transgenic plants. These cells can be cotransformed with any gene of interest, for example a gene encoding insulin. The desired product will be overproduced as compared to a nontransgenic plant cell or a cell not transformed with a gene encoding ASP1 or another amphipathic protein. The desired product can be purified from the cultured cells.
As used in the claims below, unless otherwise clearly indicated by context, the term "higher plant" is intended to encompass gymnosperms, monocotyledons, and dicotyledons, as well as any cells, tissues, or organs taken or derived from any of the above, including without limitation any seeds, leaves, stems, flowers, roots, tubers, single cells, gametes, or protoplasts taken or derived from any gymnosperm, monocotyledon, or dicotyledon. Also, the term "protein" is meant to include peptides such as dipeptides or any longer peptide as well as proteins.
Although the bulk of the above discussion regarding this invention has focused on de novo proteins having an α-helical structure, the same basic approach can work in designing de novo proteins having a β-sheet structure. To generate amphipathic β-sheets (which are not believed to have been reported in nature), amino acid residues will alternate between being hydrophobic and being hydrophilic, so that one side of the structure is hydrophobic and the other side is hydrophilic. This structure was seen in the sequences disclosed above. Salt bridges to promote stability can be formed with internal sequences glu-X-lys. Other acid-X-base sequences may also serve this function. Lysine is preferred as the base because it is an essential amino acid. It may be possible to substitute aspartic acid for glutamic acid, however, to give the internal sequence asp-X-lys. Turns between adjacent monomer units may be promoted, for example, by the internal sequence gly-asn, to form oligomers or polymers of the main peptide structure.
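The alternation rule and the acid-X-base motif described above can be illustrated with a short Python sketch. It checks whether a sequence strictly alternates between hydrophobic and hydrophilic residues and reports where glu-X-lys or asp-X-lys motifs occur. The set of residues treated as hydrophobic is an assumption made for this sketch, not a classification given in the disclosure, and the function names are hypothetical. The HDNP2 monomer (SEQ ID NO:9 in the sequence listing) is used as the example input.

```python
import re
from typing import List

# Assumption for this sketch: these residues count as hydrophobic,
# everything else (including Thr) counts as hydrophilic.
HYDROPHOBIC = set("AVLIMFW")

# HDNP2 monomer, SEQ ID NO:9, written in one-letter code.
HDNP2_MONOMER = "MTIEWKVELKFEMKIELKMT"


def alternates(seq: str) -> bool:
    """True if hydrophobic and hydrophilic residues strictly alternate."""
    flags = [res in HYDROPHOBIC for res in seq]
    return all(a != b for a, b in zip(flags, flags[1:]))


def acid_x_base_sites(seq: str) -> List[int]:
    """1-based start positions of Glu-X-Lys or Asp-X-Lys motifs.

    A lookahead is used so that overlapping motifs are also reported.
    """
    return [m.start() + 1 for m in re.finditer(r"(?=[ED].K)", seq)]


if __name__ == "__main__":
    print("alternating:", alternates(HDNP2_MONOMER))
    print("acid-X-base motifs start at:", acid_x_base_sites(HDNP2_MONOMER))
```

Under this classification the HDNP2 monomer alternates along its whole length and contains four glu-X-lys motifs; a different hydrophobicity scale could give a different answer for borderline residues such as threonine.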
While the invention has been disclosed in this patent application by reference to the details of preferred embodiments of the invention, it is to be understood that the disclosure is intended in an illustrative rather than in a limiting sense, as it is contemplated that modifications will readily occur to those skilled in the art, within the spirit of the invention and the scope of the appended claims.

LITERATURE CITED
Agros, P., Pederson, K., Marks, D. and Larkins, B. A. 1982. A structural model for maize zein proteins. J. Biol. Chem. 257: 9984-9990.
Agros, P., Naravana, S. V. L., and Nielsen, N. C. 1985. Structural similarity between legumin and vicillin storage proteins from legumes. The EMBO J. 4: 1111-1117.
Altenbach, S. B., Pederson, K. W., Meeker, G., Staraci, L. C., and Sun, S. S. M. 1989. Enhancement of the methionine content of seed proteins by the expression of a chimeric gene encoding a methionine-rich protein in transgenic plants. Plant Mol. Biol. 13: 513-522.
Bachmair, A., Finley, D., and Varshavsky, A. 1986. In vivo half-life of a protein is a function of its amino-terminal residue. Science 234: 179-186.
Badley, R.A., Atkinson, D., Hauler, H., Oldani, D., Green, J.P., and Stubbs, J.M. 1975. The structure, physical and chemical properties of the soybean protein glycinin. Biochim. Biophys. Acta 412: 214-228.
Bartels, D., and Tompson, R.D. 1983. The characterization of cDNA clones coding for wheat storage proteins. Nucleic Acids Res. 11: 2961-2977.
Beachy R.N., Chen Z.L., Horsch R.B., Rogers S.G., Hoffman N.J., and Fraley R.T. 1985. Accumulation and assembly of soybean β-conglycinin in seeds of transformed petunia plants. EMBO J. 4: 3047-3053.
Bierzynski, A., Kim, P. S., and Baldwin, R. L. 1982. A salt bridge stabilizes the helix formed by isolated c-peptide of RNAse A. Proc. Natl. Acad. Sci. U.S.A. 79: 2470-2474.
Blundell, T.L., Thornton, S. J., Burley, S. K., and Petsco, G. A. 1986. Atomic interactions. Science 234: 1005-1009.
Bollini, R. and Chrispeels, M.J. 1978. Characterization and subcellular localization of vicillin and phyto-hemagglutinin, the two major reserve proteins of Phaseolus vulgaris. Planta 142: 291-298.
Brown, J. E. and Klee, W. A. 1971. Helix-coil transition of the isolated amino terminus of Ribonuclease. Biochemistry 10: 470-476.
Chen, Z. L., Pan, N. S., and Beachy, R. N. 1988. A DNA sequence element that confers seed-specific enhancement of a constitutive promoter. The EMBO J. 7: 297-302.
Chen, Z. L., Schuler, M. A. and Beachy, R. N. 1986. Functional analysis of regulatory elements in a plant embryo-specific gene. Proc. Natl. Acad. Sci. U.S.A. 83: 8560-8564.
Chou, P. Y. and Fasman, G. D. 1978. Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. 47: 45-148.
Colot, V., Robert, L. S., Kavanagh, T. A., Beavan, M. W. and Tompson, R. D. 1987. Localization of sequences in wheat endosperm protein genes which confer tissue-specific expression in tobacco. The EMBO J. 6: 3559-3564.
Creighton, T.E. 1984. Proteins. New York: Freeman.
Crouch, M., Tenberge, K., Simone, N.E., and Ferl, R. 1983. Sequence of the 1.7K storage protein of Brassica napus. Mol. Appl. Genet. 2: 273-283.
Croughan et al. 1989. Advances in Plant Biotechnology, pp. 107-114.
Degrado, W. F., and Lear, J. D. 1985. Induction of peptide conformation at apolar/water interfaces. J. Am. Chem. Soc. 107: 7684-7689.
Degrado, W. F., Wasserman, Z. R., and Lear, J.D. 1989. Protein design, a minimalist approach. Science 241: 622-628.
Esen, E. 1986. Separation of alcohol-soluble proteins (zeins) from maize into three fractions by differential solubility. Plant Physiol. 80: 623-627.
Finley, D. and Varshavsky, A. 1985. The ubiquitin system: functions and mechanisms. Trends Biochem. Sci. 10: 343-346.
Forde, B.G., Kreis, M., Williamson, M.S., Fry, R.P. and Pywell, J. 1985. Short tandem repeats shared by B- and C-hordein cDNAs suggest a common evolutionary origin for two groups of cereal storage protein genes. The EMBO J. 4: 9-15.
Goldberg, A.L., and St John, A.C. 1976. Intracellular protein degradation in mammalian and bacterial cells: part 2. Annu. Rev. of Biochem. 45: 747-803.
Greenwood, J. S., and Chrispeels, M. J. 1985. Correct targeting of the bean storage protein phaseolin in the seeds of transformed tobacco. Plant Physiol. 79: 65-71.
Gross, D.S., and Garrard, W.T. 1987. Poising chromatin for transcription. Trends in Biochem. 12: 293-296.
Ho, S. P., and Degrado, W. F. 1987. Design of a 4-helix bundle protein: synthesis of peptides which self associate into helical protein. J. Am. Chem. Soc. 109: 6751-6758.
Hoffmann, L.E., Donaldson, D. D., and Herman, E. M. 1988. A modified storage protein is synthesized, processed, and degraded in the seeds of transgenic plants. Plant Mol. Biol. 11: 717-729.
Hoffmann, L. E., Donaldson, D. D., Bookland, R., Rashka, K. and Herman, E. M. 1987. Synthesis and protein body deposition of maize 15-kd zein in transgenic tobacco seeds. The EMBO J. 6: 3213-3221.
Hol, W. G. and Sander, H. C. 1981. Dipole of the α-helix and β-sheet: their role in protein folding. Nature 294: 532-536.
Horsch, R.B., Fry, J., Hoffmann, N., Neidermeyer, J., Rogers, S.G. and Fraley, R.T. 1988. In Plant Mol. Biol. Manual ed. S. B. Gelvin and R. A. Schilperoort, Dordrecht: Kluwer Academic.
Jaynes, J. M., Nagpala, P., Destefano, L., Denny, T., Clark, C., and Kim, J-H. 1992. Expression of a de novo designed peptide in transgenic tobacco plants confers enhanced resistance to Pseudomonas solanacearum infection. Plant Science 89: 43-53.
Jaynes, J. M., Yang, M. S., Espinoza, N. O., and Dodds, J. H. 1986. Plant protein improvement by genetic engineering: use of synthetic genes. Trends in Biotechnol. 4: 314-320.
Jones, J.D.G., and Gilbert, D.E. 1987. T-DNA structure and gene expression in petunia plants transformed by Agrobacterium tumefaciens C58 derivatives. Mol. Gen. Genet. 207: 478-485.
Kabsch, W., and Sander, C. 1983. How good are predictions of protein structure? FEBS Lett. 155: 179-182.
Kane, J.F. and Hartley, D.L. 1988. Formation of recombinant protein inclusion bodies in Escherichia coli. Trends in Biotechnol. 6: 95-101.
Kasarda, D.D., Okita, T.W., Bernardin, J.E., Baecker, P.A., and Nimmo, C.C. 1984. DNA and amino acid sequences of alpha and gamma gliadins. Proc. Natl. Acad. Sci. U.S.A. 81: 4712-4716.
Keris, M., Shewry, P. R., Forde, B. G., Forde, G. and Miflin, J. 1985. Structure and evolution of seed storage proteins and their genes with particular reference to those of wheat, barley and rye. Oxford Survey of Plant Mol. and Cell Biol. 2: 253-317.
Klein, T.M., Wolf, E.D., Wu, R. and Sanford, J.C. 1987. High-velocity microprojectiles for delivering nucleic acids into living cells. Nature 327: 70-73.
Komoriya, A., and Chaiken, J. M. 1982. Sequence modeling using semisynthetic Ribonuclease S. J. Biol. Chem. 257: 2599-2604.
Larkins, B.A. 1983. Genetic engineering of seed storage protein. In Genetic Engineering of Plants ed. B. A. Larkins, pp. 93-120. New York: Plenum.
Larkins, B.A., Pederson, K., Mark, M.D., and Wilson, D.R. 1984. The zein protein of maize endosperm. Trends in Biochem. 9: 306-308.
Lawrence, M.C., Suzuki, E., Varghes, J.N., Davis, P.C., Van Donkelaar, A., Tulloch, P.A. and Collman, P.M. 1990. The three-dimensional structure of the seed storage protein phaseolin at 3 Å resolution. The EMBO J. 9: 9-15.
Lear, J. D., Wasserman, Z. R. and Degrado, W. F. 1988. Synthetic amphiphilic peptide model for protein ion channels. Science 240: 1177-1181.
Lending, C. R., Kriz, A., Larkins, B. A. and Bracker, C. E. 1988. Structure of maize protein bodies and immunocytochemical localization of zeins. Protoplasma 143: 51-62.
Lycett, G. W., Cory, R.D., Shirsat, A. H., Richards, D. M., and Boulter, D. 1985. The 5'-flanking regions of three pea legumin genes: comparison of DNA sequences. Nucleic Acids Res. 13: 6733-6743.
Marqusee, S. and Baldwin, R. 1987. Helix stabilization by GLU-LYS salt bridges in short peptides of de novo design. Proc. Natl. Acad. Sci. U.S.A. 84: 8898-8902.
Marries, C., Gallois, P., Copley, J. and Keris, M. 1988. The 5' flanking region of a barley B hordein gene controls tissue and developmental specific CAT expression in tobacco plants. Plant Mol. Biol. 10: 359-366.
Mutter, M. 1988. Nature's rules and chemist's tools: a way for creating novel proteins. Trends in Biochem. 13: 260-264.
Pace, C.N. and Barnet, A.J. 1984. Kinetics of tryptic hydrolysis of the arginine-valine bond in folded and unfolded ribonuclease T1. Biochem. J. 219: 411-417.
Pakula, A.A. and Sauer, R.T. 1986. Bacteriophage λ Cro mutation: effect on activity and intracellular degradation. Proc. Natl. Acad. Sci. U.S.A. 82: 8829-8833.
Pakula, A.A. and Sauer, R.T. 1989. Amino acid substitutions that increase the thermal stability of the λ Cro protein. Proteins 5: 202-210.
Parasell, D.A. and Sauer, R.T. 1989. The structural stability of a protein is an important determinant of its proteolytic susceptibility in Escherichia coli. J. Biol. Chem. 264: 7590-7595.
Pederson, K., Agros, P., Naravana, S. V. L., and Larkins, B. A. 1986. Sequence analysis and characterization of a maize gene encoding a high-sulfur zein protein of Mw 15,000. J. Biol. Chem. 201: 6279-6284.
Pernollet, J. C. and Mosse, J. 1983. Structure and location of legume and cereal seed storage protein. Seed Proteins. Phytochemical Soc. of Eur. Sym. Series 20: 155-187.
Pontremoli, S. and Melloni, E. 1986. Extralysosomal protein degradation. Annu. Rev. Biochem. 55: 455-481.
Presnell, S.R., and Cohen, F.E. 1989. Topological distribution of a four-α-helix bundle. Proc. Natl. Acad. Sci. U.S.A. 86: 6592-6596.
Presta, L. G. and Rose, G. D. 1988. Helix signals in proteins. Science 240: 1632-1641.
Rafalski, J.A., Scheets, K., Metzler, M., and Peterson, D.M. 1984. Developmentally regulated plant genes: the nucleotide sequence of a wheat gliadin genomic clone. The EMBO J. 3: 1409-1415.
Richardson, J. S. and Richardson, D. C. 1988. Amino acid preferences for specific locations at the ends of α-helices. Science 240: 1648-1652.
Richardson, J. S. and Richardson, D. C. 1989. The de novo design of protein structures. Trends in Biochem. 14: 304-309.
Sanders, P.R., Winter, J.A., Barnason, A.R. and Rogers, S.G. 1987. Comparison of cauliflower mosaic virus 35S and nopaline synthetase promoters in transgenic plants. Nucleic Acids Res. 15: 1543-1558.
Scheraga, H. 1978. Use of random copolymers to determine helix-coil stability constants of the naturally occurring amino acids. Pure Appl. Chem. 50: 315-324.
Scheraga, H. A. 1985. Effect of side chain-backbone electrostatic interaction on the stability of α-helices. Proc. Natl. Acad. Sci. U.S.A. 82: 5585-5587.
Scott, R.J., and Draper, J. 1987. Transformation of carrot tissue derived from proembryogenic suspension cells: a useful model system for gene expression studies in plants. Plant Mol. Biol. 8: 265-274.
Sengupta, G. C., Reichert, N. A., Baker, R. F., Hall, T. C. and Kemp, J. D. 1985. Developmentally regulated expression of the bean β-phaseolin gene in tobacco seed. Proc. Natl. Acad. Sci. U.S.A. 82: 3320-3324.
Shen, S-H. 1984. Multiple joined genes prevent product degradation in E. coli. Proc. Natl. Acad. Sci. U.S.A. 81: 4627-4631.
Slightom, J.L., Sun, S.M., and Hall, T.C. 1983. Complete sequence of french bean storage protein gene: phaseolin. Proc. Natl. Acad. Sci. U.S.A. 80: 1897-1901.
Staswick, P. E. 1989. Preferential loss of an abundant storage protein from soybean pods during seed development. Plant Physiol. 90: 1251-1255.
Stockhaus, J., Eckes, P., Blau, A., Schell, J., and Willmitzer, L. 1987. Organ-specific and dosage-dependent expression of a leaf/stem specific gene from potato after tagging and transfer into potato and tobacco plants. Nucleic Acids Res. 15: 3479-3491.
Sueki, M., Lee, S., Power, S. P., Denton, J. B., Konishi, Y., and Scheraga, H. 1984. Helix-coil stability constants for the naturally occurring amino acids in water. Macromolecules 17: 148-155.
Twell, D. and Ooms, G. 1987. The 5' flanking DNA of a patatin gene directs tuber specific expression of a chimeric gene in potato. Plant Mol. Biol. 9: 365-375.
Wallace, J. C., Galili, G., Kawata, E. E., Cuellar, R. E., Shotwell, M. A., and Larkins, B. A. 1988. Aggregation of lysine containing zeins into protein bodies in Xenopus oocytes. Science 240: 662-664.
Wenzler, H. C., Mignery, G. A., Fisher, L. M., and Park, W. D. 1989. Analysis of a chimeric class I patatin-GUS gene in transgenic potato plants: high level expression of tubers and sucrose-inducible expression in cultured leaf and stem explants. Plant Mol. Biol. 12: 41-50.
Yang, M.S., Espinoza, N. O., Dodds, J. H., and Jaynes, J. M. 1989. Expression of a synthetic gene for improved protein quality in transformed potato plants. Plant Science 64: 99-111.
Zimm, B. H. and Bragg, J. R. 1959. Theory of the phase transition between helix and random coil in polypeptide chains. J. Chem. Phys. 31: 526-535.

SEQUENCE LISTING
<110> Demegen, Inc.
<120> A Method for Increasing the Protein Content of Plants
<130> 2093-124
<140>
<141>
<150> U.S. 09/066,056
<151> 1998-04-27
<160> 41
<170> PatentIn Ver. 2.0
<210> 1
<211> 293
<212> DNA
<213> Artificial Sequence
<220>
<221> CDS
<222> (10)..(285)
<220>
<223> Description of Artificial Sequence:Synthetic DNA
to encode artificial protein ASP1.
<400> 1
gatccaaca atg ctt gaa gag ctg ttc aaa aag atg acc gag tgg atc gag 51
Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu
1 5 10
aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctg ttc 99
Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe
15 20 25 30
aaa aag atg acc gag tgg atc gag aaa gtg atc aaa acg atg gga cca 147
Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro
35 40 45
ggc agg atg ctc gag gag ctg ttc aaa aag atg acc gag tgg atc gag 195
Gly Arg Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu
50 55 60
aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctc ttt 243
Lys Val Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe
65 70 75
aaa aaa atg act gag tgg atc gaa aaa gtg atc aaa act atg taggaatt 293
Lys Lys Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met
80 85 90

<210> 2
<211> 92
<212> PRT
<213> Artificial Sequence
<400> 2
Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
1 5 10 15
Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe Lys Lys
20 25 30
Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met Gly Pro Gly Arg
35 40 45
Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
50 55 60
Ile Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Leu Phe Lys Lys
65 70 75 80
Met Thr Glu Trp Ile Glu Lys Val Ile Lys Thr Met
85 90
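The relationship between SEQ ID NO:1 and SEQ ID NO:2 can be checked with a short Python sketch that translates the annotated coding region, positions (10)..(285), using the standard genetic code. The nucleotide string is copied from the listing above with its spacing intact; the function and variable names are illustrative and not part of the listing.

```python
# Standard genetic code, with bases ordered t, c, a, g at each codon position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# SEQ ID NO:1 as printed in the listing; the spaces are removed below.
SEQ_ID_1 = (
    "gatccaaca atg ctt gaa gag ctg ttc aaa aag atg acc gag tgg atc gag "
    "aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctg ttc "
    "aaa aag atg acc gag tgg atc gag aaa gtg atc aaa acg atg gga cca "
    "ggc agg atg ctc gag gag ctg ttc aaa aag atg acc gag tgg atc gag "
    "aaa gtg atc aaa acg atg gga cca ggc agg atg ctc gag gag ctc ttt "
    "aaa aaa atg act gag tgg atc gaa aaa gtg atc aaa act atg taggaatt"
).replace(" ", "")


def translate(dna: str, start: int, end: int) -> str:
    """Translate the 1-based, inclusive interval start..end of `dna`."""
    cds = dna[start - 1:end]
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Coding region as annotated in the <222> field of SEQ ID NO:1.
asp1 = translate(SEQ_ID_1, 10, 285)

if __name__ == "__main__":
    print(len(SEQ_ID_1), "nucleotides;", len(asp1), "residues")
    print(asp1)
```

The translation gives a 92-residue protein made of four copies of a 20-residue unit separated by Gly-Pro-Gly-Arg, which can be compared residue by residue with SEQ ID NO:2. The codon immediately after position 285 is the stop codon tag.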
<210> 3
<211> 20
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:Version of
ASP1.
<400> 3
Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
1 5 10 15
Ile Lys Thr Met
<210> 4
<211> 24
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:Version of
ASP1.
<400> 4
Met Leu Glu Glu Leu Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
1 5 10 15

Ile Lys Thr Met Gly Pro Gly Arg
<210> 5
<211> 5
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Protein segment
to promote salt bridges.
<400> 5
Glu Xaa Xaa Xaa Lys
1 5
<210> 6
<211> 20
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:HDNP1 monomer.
<400> 6
Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
1 5 10 15
Leu Lys Thr Met
<210> 7
<211> 92
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:HDNP1 tetramer.
<400> 7
Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
1 5 10 15
Leu Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Ile Phe Lys Lys
20 25 30
Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met Gly Pro Gly Arg
35 40 45
Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val
50 55 60

Leu Lys Thr Met Gly Pro Gly Arg Met Leu Glu Glu Ile Phe Lys Lys
65 70 75 80
Met Thr Glu Trp Ile Glu Lys Val Leu Lys Thr Met
85 90
<210> 8
<211> 4
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:Protein segment
to act as helix breaker.
<400> 8
Gly Pro Gly Arg
1
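The tetramer entries in this listing appear to be four copies of the corresponding monomer joined by the Gly-Pro-Gly-Arg helix breaker of SEQ ID NO:8. The following Python sketch assembles such an oligomer from the HDNP1 monomer (SEQ ID NO:6) so that the result can be compared with SEQ ID NO:7. The helper names are hypothetical, and the three-letter lookup table covers only the residues used here.

```python
# Three-letter to one-letter codes for the residues that occur in
# SEQ ID NO:6 and SEQ ID NO:8 (not a complete amino acid table).
THREE_TO_ONE = {
    "Met": "M", "Leu": "L", "Glu": "E", "Ile": "I", "Phe": "F", "Lys": "K",
    "Thr": "T", "Trp": "W", "Val": "V", "Gly": "G", "Pro": "P", "Arg": "R",
}


def to_one_letter(three_letter: str) -> str:
    """Convert a space-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter.split())


def make_oligomer(monomer: str, linker: str, copies: int) -> str:
    """Join `copies` monomers, placing `linker` between adjacent copies."""
    return linker.join([monomer] * copies)


# SEQ ID NO:6 (HDNP1 monomer) and SEQ ID NO:8 (helix breaker), as listed.
HDNP1_MONOMER = to_one_letter(
    "Met Leu Glu Glu Ile Phe Lys Lys Met Thr Glu Trp Ile Glu Lys Val "
    "Leu Lys Thr Met"
)
HELIX_BREAKER = to_one_letter("Gly Pro Gly Arg")

hdnp1_tetramer = make_oligomer(HDNP1_MONOMER, HELIX_BREAKER, 4)

if __name__ == "__main__":
    print(len(hdnp1_tetramer), hdnp1_tetramer)
```

With a monomer of n residues and three 4-residue linkers, the tetramer length is 4n + 12, which gives 92 for the 20-residue HDNP1 monomer and matches the <211> value of SEQ ID NO:7.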
<210> 9
<211> 20
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:HDNP2 monomer.
<400> 9
Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu
1 5 10 15
Leu Lys Met Thr
<210> 10
<211> 92
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:HDNP2 tetramer.
<400> 10
Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu
1 5 10 15
Leu Lys Met Thr Gly Pro Gly Arg Met Thr Ile Glu Trp Lys Val Glu
20 25 30
Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr Gly Pro Gly Arg
35 40 45

Met Thr Ile Glu Trp Lys Val Glu Leu Lys Phe Glu Met Lys Ile Glu
50 55 60
Leu Lys Met Thr Gly Pro Gly Arg Met Thr Ile Glu Trp Lys Val Glu
65 70 75 80
Leu Lys Phe Glu Met Lys Ile Glu Leu Lys Met Thr
85 90
<210> 11
<211> 41
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:SDNP1 monomer
for use with swine.
<400> 11
Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp
1 5 10 15
Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys
20 25 30
Lys Phe Glu Glu Ile Thr Lys Lys Met
35 40
<210> 12
<211> 176
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:SDNP1 tetramer
for use with swine.
<400> 12
Met Phe Glu Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp
1 5 10 15
Glu Glu Val Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys
20 25 30
Lys Phe Glu Glu Ile Thr Lys Lys Met Gly Pro Gly Arg Met Phe Glu
35 40 45
Thr Ile Val Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val
50 55 60
Ile Lys Lys Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu
65 70 75 80

Glu Ile Thr Lys Lys Met Gly Pro Gly Arg Met Phe Glu Thr Ile Val
85 90 95
Lys Leu Val Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys
100 105 110
Phe Val Thr Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr
115 120 125
Lys Lys Met Gly Pro Gly Arg Met Phe Glu Thr Ile Val Lys Leu Val
130 135 140
Glu Glu Thr Met His Lys Trp Glu Glu Val Ile Lys Lys Phe Val Thr
145 150 155 160
Met Val Glu Glu Thr Leu Lys Lys Phe Glu Glu Ile Thr Lys Lys Met
165 170 175
<210> 13
<211> 41
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:SDNP2 monomer
for use with swine.
<400> 13
Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu
1 5 10 15
Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu
20 25 30
Met Lys Leu Glu Val Lys Phe Thr Met
35 40
<210> 14
<211> 176
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:SDNP2 tetramer
for use with swine.
<400> 14
Met Thr Ile Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu
1 5 10 15

Met Lys Ile Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu
20 25 30
Met Lys Leu Glu Val Lys Phe Thr Met Gly Pro Gly Arg Met Thr Ile
35 40 45
Glu Phe Lys Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile
50 55 60
Glu Val Lys Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu
65 70 75 80
Glu Val Lys Phe Thr Met Gly Pro Gly Arg Met Thr Ile Glu Phe Lys
85 90 95
Val Glu Leu Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys
100 105 110
Phe Glu Thr Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys
115 120 125
Phe Thr Met Gly Pro Gly Arg Met Thr Ile Glu Phe Lys Val Glu Leu
130 135 140
Lys Val Glu Thr His Trp Glu Met Lys Ile Glu Val Lys Phe Glu Thr
145 150 155 160
Lys Ile Glu Val Lys Thr Glu Met Lys Leu Glu Val Lys Phe Thr Met
165 170 175
<210> 15
<211> 37
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:PDNP1 monomer
for use with poultry.
<400> 15
Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp
1 5 10 15
Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu
20 25 30
Gly Phe Thr Lys Met

<210> 16
<211> 160
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:PDNP1 tetramer
for use with poultry.
<400> 16
Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg His Trp
1 5 10 15
Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu
20 25 30
Gly Phe Thr Lys Met Gly Pro Gly Arg Met Phe Glu Gly Leu Val Lys
35 40 45
Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly Lys Ile
50 55 60
Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met Gly Pro
65 70 75 80
Gly Arg Met Phe Glu Gly Leu Val Lys Ile Met Glu Glu Val Leu Arg
85 90 95
His Trp Thr Glu Val Phe Gly Lys Ile Phe Glu Met Gly Thr Arg Phe
100 105 110
Leu Glu Gly Phe Thr Lys Met Gly Pro Gly Arg Met Phe Glu Gly Leu
115 120 125
Val Lys Ile Met Glu Glu Val Leu Arg His Trp Thr Glu Val Phe Gly
130 135 140
Lys Ile Phe Glu Met Gly Thr Arg Phe Leu Glu Gly Phe Thr Lys Met
145 150 155 160
<210> 17
<211> 37
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:PDNP2 monomer
for use with poultry.
<400> 17
Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His
1 5 10 15

Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly
20 25 30
Phe Glu Thr Lys Met
<210> 18
<211> 160
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:PDNP2 tetramer
for use with poultry.
<400> 18
Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His
1 5 10 15
Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly
20 25 30
Phe Glu Thr Lys Met Gly Pro Gly Arg Met Glu Phe Lys Val Gly Ile
35 40 45
Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu Lys Ile
50 55 60
Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met Gly Pro
65 70 75 80
Gly Arg Met Glu Phe Lys Val Gly Ile Glu Leu Arg Phe Thr Trp Glu
85 90 95
Met His Val Gly Phe Glu Leu Lys Ile Gly Phe Thr Val Glu Met Arg
100 105 110
Leu Gly Phe Glu Thr Lys Met Gly Pro Gly Arg Met Glu Phe Lys Val
115 120 125
Gly Ile Glu Leu Arg Phe Thr Trp Glu Met His Val Gly Phe Glu Leu
130 135 140
Lys Ile Gly Phe Thr Val Glu Met Arg Leu Gly Phe Glu Thr Lys Met
145 150 155 160

<210> 19
<211> 40
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:FDNP1 monomer
for use with fish.
<400> 19
Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp
1 5 10 15
Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg
20 25 30
Lys Phe Glu Glu Thr Met Arg Lys
35 40
<210> 20
<211> 172
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:FDNP1 tetramer
for use with fish.
<400> 20
Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp
1 5 10 15
Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg
20 25 30
Lys Phe Glu Glu Thr Met Arg Lys Gly Pro Gly Arg Met Phe Glu Glu
35 40 45
Leu Val Arg Thr Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe
50 55 60
Lys Arg Val Leu His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu
65 70 75 80
Thr Met Arg Lys Gly Pro Gly Arg Met Phe Glu Glu Leu Val Arg Thr
85 90 95
Ile Glu Glu Leu Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu
100 105 110
His Ile Leu Glu Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys
115 120 125
Gly Pro Gly Arg Met Phe Glu Glu Leu Val Arg Thr Ile Glu Glu Leu
130 135 140

Met Lys Lys Trp Glu Glu Val Phe Lys Arg Val Leu His Ile Leu Glu
145 150 155 160
Glu Phe Val Arg Lys Phe Glu Glu Thr Met Arg Lys
165 170
<210> 21
<211> 40
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:FDNP2 monomer
for use with fish.
<400> 21
Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys
1 5 10 15
Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg
20 25 30
Val Glu Leu Arg Phe Glu Met Lys
35 40
<210> 22
<211> 172
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:FDNP2 tetramer
for use with fish.
<400> 22
Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys
1 5 10 15
Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg
20 25 30
Val Glu Leu Arg Phe Glu Met Lys Gly Pro Gly Arg Met Glu Ile Lys
35 40 45
Leu Glu Val Arg Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg
50 55 60
Ile Glu Phe His Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg
65 70 75 80
Phe Glu Met Lys Gly Pro Gly Arg Met Glu Ile Lys Leu Glu Val Arg
85 90 95

Phe Glu Thr Lys Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His
100 105 110
Thr Glu Leu Lys Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys
115 120 125
Gly Pro Gly Arg Met Glu Ile Lys Leu Glu Val Arg Phe Glu Thr Lys
130 135 140
Val Glu Leu Lys Val Glu Trp Arg Ile Glu Phe His Thr Glu Leu Lys
145 150 155 160
Met Glu Leu Arg Val Glu Leu Arg Phe Glu Met Lys
165 170
<210> 23
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:DDNP1 monomer
for use with dogs.
<400> 23
Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp
1 5 10 15
Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu
20 25 30
Thr Phe Thr Lys Ile Met
<210> 24
<211> 164
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:DDNP1 tetramer
for use with dogs.
<400> 24
Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp
1 5 10 15
Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu
20 25 30
Thr Phe Thr Lys Ile Met Gly Pro Gly Arg Met Val Glu Thr Phe Ile
35 40 45

Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu Met Leu His Lys
50 55 60
Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe Thr Lys Ile Met
65 70 75 80
Gly Pro Gly Arg Met Val Glu Thr Phe Ile Lys Leu Val Glu Glu Ile
85 90 95
Val Arg Lys Trp Glu Glu Met Leu His Lys Phe Val Glu Val Leu Thr
100 105 110
Lys Leu Phe Glu Thr Phe Thr Lys Ile Met Gly Pro Gly Arg Met Val
115 120 125
Glu Thr Phe Ile Lys Leu Val Glu Glu Ile Val Arg Lys Trp Glu Glu
130 135 140
Met Leu His Lys Phe Val Glu Val Leu Thr Lys Leu Phe Glu Thr Phe
145 150 155 160
Thr Lys Ile Met
<210> 25
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:DDNP2 monomer
for use with dogs.
<400> 25
Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys
1 5 10 15
Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys
20 25 30
Phe Thr Leu Thr Met Val
<210> 26
<211> 164
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:DDNP2 tetramer
for use with dogs.

<400> 26
Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys
1 5 10 15
Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys
20 25 30
Phe Thr Leu Thr Met Val Gly Pro Gly Arg Met Thr Val Glu Phe Lys
35 40 45
Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu Val His Leu Glu
50 55 60
Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr Leu Thr Met Val
65 70 75 80
Gly Pro Gly Arg Met Thr Val Glu Phe Lys Leu Glu Ile Lys Val Thr
85 90 95
Ile Glu Phe Lys Trp Glu Val His Leu Glu Ile Arg Phe Glu Val Lys
100 105 110
Leu Glu Met Lys Phe Thr Leu Thr Met Val Gly Pro Gly Arg Met Thr
115 120 125
Val Glu Phe Lys Leu Glu Ile Lys Val Thr Ile Glu Phe Lys Trp Glu
130 135 140
Val His Leu Glu Ile Arg Phe Glu Val Lys Leu Glu Met Lys Phe Thr
145 150 155 160
Leu Thr Met Val
<210> 27
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:CDNP1 monomer
for use with cats.
<400> 27
Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp
1 5 10 15
Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys
20 25 30
Arg Ile Thr Arg Leu Met

<210> 28
<211> 164
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:CDNP1 tetramer
for use with cats.
<400> 28
Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp
1 5 10 15
Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys
20 25 30
Arg Ile Thr Arg Leu Met Gly Pro Gly Arg Met Leu Glu Thr Leu Phe
35 40 45
Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu Met Phe Lys His
50 55 60
Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile Thr Arg Leu Met
65 70 75 80
Gly Pro Gly Arg Met Leu Glu Thr Leu Phe Lys Ile Val Glu Glu Thr
85 90 95
Leu Arg Lys Trp Glu Glu Met Phe Lys His Val Leu Thr Phe Met Glu
100 105 110
Glu Ile Val Lys Arg Ile Thr Arg Leu Met Gly Pro Gly Arg Met Leu
115 120 125
Glu Thr Leu Phe Lys Ile Val Glu Glu Thr Leu Arg Lys Trp Glu Glu
130 135 140
Met Phe Lys His Val Leu Thr Phe Met Glu Glu Ile Val Lys Arg Ile
145 150 155 160
Thr Arg Leu Met
<210> 29
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:CDNP2 monomer
for use with cats.
<400> 29
Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys
1 5 10 15

Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg
20 25 30
Leu Glu Phe Arg Met Thr
<210> 30
<211> 164
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence:CDNP2 tetramer
for use with cats.
<400> 30
Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys
1 5 10 15
Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg
20 25 30
Leu Glu Phe Arg Met Thr Gly Pro Gly Arg Met Thr Leu Glu Phe Lys
35 40 45
Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu Leu Lys Thr Glu
50 55 60
Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu Phe Arg Met Thr
65 70 75 80
Gly Pro Gly Arg Met Thr Leu Glu Phe Lys Leu Thr Met Glu Leu His
85 90 95
Trp Glu Ile Lys Val Glu Leu Lys Thr Glu Val Arg Ile Glu Met Lys
100 105 110
Phe Glu Val Arg Leu Glu Phe Arg Met Thr Gly Pro Gly Arg Met Thr
115 120 125
Leu Glu Phe Lys Leu Thr Met Glu Leu His Trp Glu Ile Lys Val Glu
130 135 140
Leu Lys Thr Glu Val Arg Ile Glu Met Lys Phe Glu Val Arg Leu Glu
145 150 155 160
Phe Arg Met Thr
<210> 31
<211> 20
<212> PRT
<213> Artificial Sequence

<220>
<223> Amino acid residues 1, 2, 5, 6, 9, 12, 13, 16, 17,
19 and 20 are hydrophobic and the rest are
hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:6.
<400> 31
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa
<210> 32
<211> 41
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16,
19, 20, 23, 24, 26, 27, 30, 31, 34, 37, 38 and 41
are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:11.
<400> 32
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
35 40
<210> 33
<211> 37
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16,
19, 20, 23, 24, 26, 27, 30, 31, 34 and 37 are
hydrophobic and the rest are hydrophilic.
<220>

<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:15.
<400> 33
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa
<210> 34
<211> 40
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16,
19, 20, 23, 24, 26, 27, 30, 31, 34, 37 and 38 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:19.
<400> 34
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
35 40
<210> 35
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 2, 5, 6, 8, 9, 12, 13, 16,
19, 20, 23, 24, 26, 27, 30, 31, 34, 37 and 38 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:23 or 27.

<400> 35
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa
<210> 36
<211> 20
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17
and 19 are hydrophobic and the rest are
hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:9.
<400> 36
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa
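
The Xaa records in this part of the listing carry their information in the <223> position lists rather than in the residues. As a reading aid, the position list can be expanded into a per-residue mask. The sketch below is an illustration only: the function name hp_mask and the H/P letters (H for hydrophobic, P for hydrophilic) are assumptions and not notation used by the listing. It is applied to SEQ ID NO:36 (length 20, hydrophobic at positions 1, 3, ..., 19).

```python
def hp_mask(length, hydrophobic_positions):
    """Return a string of 'H'/'P' flags, one per residue, using 1-based positions."""
    hydrophobic = set(hydrophobic_positions)
    out_of_range = sorted(p for p in hydrophobic if not 1 <= p <= length)
    if out_of_range:
        raise ValueError(f"positions outside 1..{length}: {out_of_range}")
    return "".join("H" if i in hydrophobic else "P" for i in range(1, length + 1))


# SEQ ID NO:36: <211> 20, hydrophobic residues 1, 3, 5, ..., 19.
seq36_mask = hp_mask(20, range(1, 20, 2))
print(seq36_mask)  # prints: HPHPHPHPHPHPHPHPHPHP
```

Any other Xaa record can be expanded the same way by passing its <211> length and the positions named in its <223> line.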
<210> 37
<211> 41
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37, 39 and 41
are hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:13.
<400> 37
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
35 40
<210> 38
<211> 37
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35 and 37 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:17.
<400> 38
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa
<210> 39
<211> 40
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37 and 39 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:21.
<400> 39
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
35 40

<210> 40
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35, 37 and 38 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:25.
<400> 40
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa
<210> 41
<211> 38
<212> PRT
<213> Artificial Sequence
<220>
<223> Amino acid residues 1, 3, 5, 7, 9, 11, 13, 15, 17,
19, 21, 23, 25, 27, 29, 31, 33, 35 and 37 are
hydrophobic and the rest are hydrophilic.
<220>
<223> Description of Artificial Sequence:Hydrophilic and
hydrophobic representation as exemplified by SEQ
ID NO:29.
<400> 41
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5 10 15
Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
20 25 30
Xaa Xaa Xaa Xaa Xaa Xaa

Administrative Status

Please note that "Inactive:" events refer to events no longer in use in our new back-office solution.

For a clearer understanding of the status of the application/patent presented on this page, consult the site Disclaimer, as well as the definitions for Patent, Event History, Maintenance Fee and Payment History.

Event History

Description Date
Inactive: IPC expired 2018-01-01
Application Not Reinstated by Deadline 2004-04-27
Time Limit for Reversal Expired 2004-04-27
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2003-04-28
Inactive: Cover page published 2001-01-17
Inactive: First IPC assigned 2001-01-11
Letter Sent 2000-12-19
Letter Sent 2000-12-19
Inactive: Notice - National entry - No RFE 2000-12-19
Application Received - PCT 2000-12-16
Amendment Received - Voluntary Amendment 2000-10-13
Application Published (Open to Public Inspection) 1999-11-04

Abandonment History

Abandonment Date Reason Reinstatement Date
2003-04-28

Maintenance Fee

The last payment was received on 2002-04-24


Fee History

Fee Type Anniversary Year Due Date Paid Date
Basic national fee - standard 2000-10-13
Registration of a document 2000-10-13
MF (application, 2nd anniv.) - standard 02 2001-04-27 2001-04-26
MF (application, 3rd anniv.) - standard 03 2002-04-29 2002-04-24
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
DEMEGEN, INC.
Past Owners on Record
JESSE M. JAYNES
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


List of published and non-published patent-specific documents on the CPD.

Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Representative drawing 2001-01-16 1 5
Description 2000-10-12 68 3,162
Claims 2000-10-12 29 1,508
Abstract 2000-10-12 1 58
Drawings 2000-10-12 6 141
Cover Page 2001-01-16 1 57
Reminder of maintenance fee due 2000-12-27 1 112
Notice of National Entry 2000-12-18 1 195
Courtesy - Certificate of registration (related document(s)) 2000-12-18 1 113
Courtesy - Certificate of registration (related document(s)) 2000-12-18 1 113
Courtesy - Abandonment Letter (Maintenance Fee) 2003-05-25 1 176
Reminder - Request for Examination 2003-12-29 1 123
PCT 2000-10-12 16 622
Fees 2002-04-23 1 37
Fees 2001-04-25 1 42
