Note: Descriptions are shown in the official language in which they were submitted.
CA 02343586 2001-04-25
Background of the Invention
This invention relates generally to the fields of biology and chemistry.
More particularly, the invention is directed to modified fluorescent proteins
and to methods for the preparation and use thereof.
In biochemistry, molecular biology and medical diagnostics, it is often
desirable to add a fluorescent label to a protein so that the protein can be
easily tracked and quantified. The normal procedure for labeling requires that
the protein be covalently reacted in vitro with fluorescent dyes, then
repurified to remove excess dye and any damaged protein. If the labeled protein
is to be used inside cells, it usually has to be microinjected; this is a
difficult and time-consuming operation that cannot be performed on large
numbers of cells. These problems may, however, be eliminated by joining a
nucleotide sequence coding for the protein of interest with the sequence for a
naturally fluorescent protein, then expressing the fusion protein.
The green fluorescent protein (GFP) of the jellyfish Aequorea victoria is a
remarkable protein with strong visible absorbance and fluorescence from a
p-hydroxybenzylideneimidazolone chromophore, which is generated by cyclization
and oxidation of the protein's own Ser-Tyr-Gly sequence at positions 65 to 67.
A cDNA sequence [SEQ ID NO:1] for one isotype of GFP has been reported
[Prasher, D. C. et al., Gene 111, 229-233 (1992)]; cloning of this cDNA has
enabled GFP expression in different organisms. The finding that the expressed
protein becomes fluorescent in cells from a wide variety of organisms
[Chalfie, M. et al., Science 263, 802-805 (1994)] makes GFP a powerful new tool
in molecular and cell biology and indicates that the oxidative cyclization must
be either spontaneous or dependent only on ubiquitous enzymes and reactants.
A major question in protein photophysics is how a single chromophore can
give widely different spectra depending on its local protein environment. This
question has received the most attention with respect to the multiple colors of
visual pigments based on retinal [Merbs, S. L. & Nathans, J. Science 258,
464-466 (1992)], but is also important in GFP. The GFP from Aequorea and that
of the sea pansy Renilla reniformis share the same chromophore, yet Aequorea
GFP has two absorbance peaks at 395 and 475 nm, whereas Renilla GFP has only a
single absorbance peak at 498 nm, with about 5.5 fold greater monomer
extinction coefficient than the major 395 nm peak of the Aequorea protein
[Ward, W. W. in Bioluminescence and Chemiluminescence (eds. DeLuca, M.A. &
McElroy, W.D.) 235-242 (Academic Press, New York, 1981)]. The spectra of the
isolated chromophore and denatured protein at neutral pH do not match the
spectra of either native protein [Cody, C. W. et al., Biochemistry 32,
1212-1218 (1993)].
For many practical applications, the spectrum of Renilla GFP would be
preferable to that of Aequorea, because wavelength discrimination between
different fluorophores and detection of resonance energy transfer are easier if
the component spectra are tall and narrow rather than low and broad.
Furthermore, the longer wavelength excitation peak (475 nm) of Aequorea GFP is
almost ideal for fluorescein filter sets and is resistant to photobleaching,
but has lower amplitude than the shorter wavelength peak at 395 nm, which is
more susceptible to photobleaching [Chalfie et al. (1994), supra]. For all
these reasons, it would clearly be advantageous to convert the Aequorea GFP
excitation spectrum to a single peak, and preferably at longer wavelengths.
There is also a need in the art for proteins which fluoresce at different
wavelengths. Variants of fluorescent proteins with different colors would also
be very useful for simultaneous comparisons of multiple protein fates,
developmental lineages, and gene expression levels.
Accordingly, it is an object of the present invention to provide improved
fluorescent proteins which do not suffer from the drawbacks of native Aequorea
GFP.
Summary of the Invention
In accordance with the present invention, it has been determined that
particular modifications in the polypeptide sequence of an Aequorea wild-type
GFP [SEQ ID NO:2] lead to formation of products having markedly different
excitation and emission spectra from corresponding products derived from
wild-type GFP. Visibly distinct colors and/or increased intensities of emission
make these products useful in a wide variety of contexts, such as tracking of
differential gene expression and protein localization.
Brief Description of the Drawings
The invention may be better understood with reference to the accompanying
drawings, in which:
Fig. 1 compares different versions of GFP by gel electrophoresis
and Coomassie blue staining;
Fig. 2 illustrates a proposed biosynthetic scheme for GFP;
Figs. 3a and 3b illustrate the excitation and emission spectra of
wild-type and a first group of mutant GFPs;
Figs. 4a and 4b illustrate the excitation and emission spectra of
wild-type and a second group of mutant GFPs;
Fig. 5 illustrates the rate of fluorophore formation in the wild-type
GFP and the Ser 65→Thr mutant;
Figs. 6a and 6b illustrate the behavior of wild-type GFP and the
Ser 65→Thr mutant, respectively, upon progressive irradiation with
ultraviolet light; and
Fig. 7 illustrates fluorescence excitation and emission spectra of a
third group of GFP mutants.
Detailed Description of the Invention
GFP was expressed in E. coli under the control of a T7 promoter for
quantitative analysis of the properties of the recombinant protein. Gel
electrophoresis under denaturing conditions showed protein of the expected
molecular weight (27 kDa) as a dominant band (Fig. 1), which could be
quantified simply by densitometry of staining with Coomassie blue. Soluble
recombinant GFP proved to have identical spectra and the same or even slightly
more fluorescence per mole of protein as GFP purified from Aequorea victoria,
showing that the soluble protein in E. coli undergoes correct folding and
oxidative cyclization with as high an efficiency as in the jellyfish.
The bacteria also contained inclusion bodies consisting of protein
indistinguishable from jellyfish or soluble recombinant protein on denaturing
gels (Fig. 1). However, this material was completely non-fluorescent, lacked
the visible absorbance bands of the chromophore, and could not be made
fluorescent even when solubilized and subjected to protocols that renature GFP
[Ward, W. W. & Bokman, S. H., Biochemistry 21, 4535-4540 (1982); Surpin, M. A.
& Ward, W. W., Photochem. Photobiol. 49, Abstract, 25S (1989)]. Therefore,
protein from inclusion bodies seemed permanently unable to generate the
internal chromophore. An interesting intermediate stage in protein maturation
could be generated by growing the bacteria anaerobically. The soluble protein
again looked the same as GFP on denaturing gels (Fig. 1) but was
non-fluorescent. In this case, fluorescence gradually developed after admission
of air, even when fresh protein synthesis was blocked using puromycin and
tetracycline. Evidently, the soluble non-fluorescent protein synthesized under
anaerobic conditions was ready to become fluorescent once atmospheric oxygen
was readmitted. The fluorescence per protein molecule approached its final
asymptotic value with a single-exponential time course and a rate constant of
0.24 ± 0.06 hr⁻¹ (at 22°C) measured either in intact cells with
protein-synthesis inhibitors or in a lysate in which the soluble proteins and
cofactors were a thousand fold more dilute. Such pseudo-first order kinetics
strongly suggest that no enzymes or cofactors are necessary for the final step
of fluorophore formation in GFP.
It has thus been determined that formation of the final fluorophore requires
molecular oxygen and proceeds in wild-type protein with a time constant of
~4 h at 22°C and atmospheric pO2. This was independent of dilution, implying
that the oxidation does not require enzymes or cofactors.
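The relationship between the reported rate constant and the quoted time constant can be checked with a short calculation. The following Python sketch is illustrative only: it assumes the ideal single-exponential approach to the final fluorescence described above, and the function and variable names are hypothetical rather than part of the original disclosure.

```python
import math

# Rate constant reported in the text for wild-type GFP (per hour, at 22 °C).
K_PER_HOUR = 0.24


def fraction_matured(t_hours, k=K_PER_HOUR):
    """Fraction of the final fluorescence reached t_hours after air is admitted,
    assuming an ideal single-exponential time course."""
    return 1.0 - math.exp(-k * t_hours)


# Time constant (1/k) and half-time (ln 2 / k) implied by the rate constant.
time_constant_h = 1.0 / K_PER_HOUR
half_time_h = math.log(2) / K_PER_HOUR

if __name__ == "__main__":
    print(f"time constant: {time_constant_h:.1f} h")
    print(f"half-time:     {half_time_h:.1f} h")
    print(f"matured after 4 h: {fraction_matured(4.0):.0%}")
```

Under this assumption the reciprocal of 0.24 per hour is a little over 4 hours, which agrees with the time constant of about 4 h stated above.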
A molecular interpretation is presented in Fig. 2. If the newly translated
apoprotein (top left) evades precipitation into inclusion bodies, the amino
group of Gly 67 might cyclize onto the carbonyl group of Ser 65 to form an
imidazolidin-5-one, where the process would stop (top center) if O2 is absent.
The new N=C double bond would be expected to promote dehydrogenation to form a
conjugated chromophore; imidazolidin-5-ones are indeed known to undergo
autoxidative formation of double bonds at the 4-position [Kjaer, A. Acta Chem.
Scand. 7, 1030-1035 (1953); Kidwai, A. R. & Devasia, G. M. J. Org. Chem. 27,
4527-4531 (1962)], which is exactly what is necessary to complete the
fluorophore (upper right). The protonated and deprotonated species (upper and
lower right) may be responsible for the 395 and 470-475 nm excitation peaks,
respectively. The excited states of phenols are much more acidic than their
ground states, so that emission would come only from a deprotonated species.
The Aequorea GFP cDNA was subjected to random mutagenesis by
hydroxylamine treatment or polymerase chain reaction. Approximately six
thousand bacterial colonies on agar plates were illuminated with alternating
395 and 475 nm excitation and visually screened for altered excitation
properties or emission colors.
According to a first aspect of the present invention, modifications are
provided which result in a shift in the ratio of the two excitation peaks of
the product after oxidation and cyclization relative to the wild type. Three
mutants were found with significant alterations in the ratio of the two main
excitation peaks (Table I). The mutations were sequenced and recombined with
the wild-type gene in different ways to eliminate neutral mutations and assign
the fluorescence effects to single amino acid substitutions, except for H9
where two neighboring mutations have not yet been separated. They all lay in
the C terminal part of the protein (Table I), remote in primary sequence from
the chromophore formed from residues 65-67.
These and other modifications are defined herein with reference to the
amino acid sequence [SEQ ID NO:2] encoded by the reported cDNA [SEQ ID NO:1];
the first amino acid identified is the one found at the indicated location in
the reported sequence, while the second indicates the substitution found in the
modified form. The fluorescent product derived from a wild-type or modified GFP
polypeptide sequence is no longer strictly speaking a simple polypeptide after
oxidation and cyclization; however, reference is sometimes made for sake of
simplicity herein to the polypeptide (e.g., "wild-type GFP" or "modified GFP")
where what is intended would be obvious from the context. Compared with
wild-type GFP, H9 (Ser 202→Phe, Thr 203→Ile) had increased fluorescence at
395 nm excitation; P9 (Ile 167→Val) and P11 (Ile 167→Thr) were more
fluorescent at 475 nm excitation.
One possibility for these spectral perturbations in P9 and P11 is that the
mutations at Ile 167 shift a positive charge slightly closer to the phenolic
group of the fluorophore; this should both increase the percentage of phenolic
anion, which is probably the species responsible for the 470-475 nm excitation
peak, and shift the emission peak hypsochromically. However, the hypothesized
ionizable phenolic group would have to be buried inside the protein at normal
pH, because the ratio of 471 to 396 nm peaks in the mutants could not be
further affected by external pH until it was raised to 10, just below the
threshold for denaturation. The pH-sensitivity of wild-type GFP is similar
[Ward, W. W. et al., Photochem. Photobiol. 35, 803-808 (1982)].
According to another aspect of the invention, a mutant P4 (Tyr 66→His)
was identified which was excitable by ultraviolet and fluoresced bright blue in
contrast to the green of wild type protein. The excitation and emission maxima
were hypsochromically shifted by 14 and 60 nm respectively from those of
wild-type GFP. The mutated DNA was sequenced and found to contain five amino
acid substitutions, only one of which proved to be critical: replacement of
Tyr 66 in the center of the chromophore by His (corresponding to a change in
the GFP cDNA sequence [SEQ ID NO:1] at 196-198 from TAT to CAT).
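The correspondence between the quoted nucleotide positions and the residue number can be illustrated with a short calculation. The sketch below assumes that cDNA positions are counted from the first nucleotide of the start codon, so that positions 1-3 encode residue 1; that numbering convention is an assumption, not a statement from the sequence listing. The helper names are hypothetical, and the codon table contains only the standard genetic-code entries needed for the codon changes quoted in this description.

```python
# Standard genetic-code entries for the codons mentioned in this description.
CODON_TABLE = {
    "TAT": "Tyr",
    "CAT": "His",
    "TCT": "Ser",
    "ACT": "Thr",
    "TGT": "Cys",
}


def residue_number(first_nt, last_nt):
    """Map a codon's 1-based nucleotide range to its 1-based residue number.

    Assumes numbering starts at the first nucleotide of the start codon.
    """
    if last_nt - first_nt != 2 or (first_nt - 1) % 3 != 0:
        raise ValueError("range is not an in-frame codon")
    return (first_nt - 1) // 3 + 1


def describe_change(first_nt, last_nt, old_codon, new_codon):
    """Return a label such as 'Tyr 66 -> His' for a codon replacement."""
    position = residue_number(first_nt, last_nt)
    return f"{CODON_TABLE[old_codon]} {position} -> {CODON_TABLE[new_codon]}"


if __name__ == "__main__":
    # Change quoted above for the P4 mutant.
    print(describe_change(196, 198, "TAT", "CAT"))
```

With this numbering, positions 196-198 fall on residue 66 and TAT to CAT reads as Tyr to His, consistent with the text; the same arithmetic places positions 193-195 on residue 65.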
The surprising tolerance for substitution at this key residue prompted further
site-directed mutagenesis to Trp and Phe at this position. Trp gave excitation
and emission wavelengths intermediate between Tyr and His (Table I) but was
only weakly fluorescent, perhaps due to inefficiency of folding or chromophore
formation due to steric considerations. Phe gave weak fluorescence with an
excitation maximum at 358 nm and an emission maximum at 442 nm. Accordingly,
pursuant to this aspect of the invention modified GFP proteins which fluoresce
at different wavelengths (preferably, different by at least 10 nm and more
preferably, by at least 50 nm) relative to the native protein are provided, for
example, those wherein Tyr 66 is replaced by Phe, His or Trp.
In a further embodiment of this aspect of the invention, a double mutant
Y66H, Y145F was identified which had almost the same wavelengths as the single
mutant Y66H but almost twice the brightness, due mainly to a higher quantum
efficiency of fluorescence. The double mutant also developed its fluorescence
during overnight growth, whereas the single mutant required several days.
In accordance with further embodiments of this aspect of the invention, a
first round of mutagenesis to increase the brightness of Y66W yielded
M153T/V163A/N212K as additional substitutions. This mutant was subjected to
another round of mutagenesis, resulting in two further sets, N146I and
I123V/Y145H/H148R (Table II). The quantum efficiency of these mutants is now
comparable to wild-type GFP. The clustering of the substitutions in residues
145 to 163 suggests that those residues lie relatively close to the chromophore
and that reductions in the size of their side chains might be compensating for
the larger size of tryptophan compared to tyrosine.
Pursuant to yet another aspect of the present invention, modified GFP
proteins are provided which give substantially more intense fluorescence per
molecule than the wild type protein. Modifications at Ser 65 to Ala, Leu, Cys,
Val, Ile or Thr provide proteins with red-shifted and brighter spectra relative
to the native protein. In particular, the Thr mutant (corresponding to a change
in the GFP cDNA sequence [SEQ ID NO:1] at 193-195 from TCT to ACT) and Cys
mutant (corresponding to a change in the GFP cDNA sequence [SEQ ID NO:1] at
193-195 from TCT to TGT) are about six times brighter than wild type when
excited at the preferred long-wavelength band above 450 nm. As a consequence,
these modified proteins are superior to wild type proteins for practically all
applications. Further, the brightness of these modified proteins matches the
brightness reported in the literature for Renilla GFP; thus, these proteins
clearly obviate the objections to the dimness of Aequorea GFP. In fact, it is
speculated that the chromophores in these modified proteins may exhibit the
optimum brightness which could be achieved with a general structure derived
from the Aequorea GFP chromophore. In particular, these mutations provide
products exhibiting one or more of the following salient characteristics which
distinguish them clearly over the corresponding product from a wild-type GFP:
reduced efficiency of excitation by wavelengths between about 350 and 420 nm;
enhanced excitation and emission efficiency when excited with wavelengths
longer than about 450 nm; increased resistance to light-induced shifts in the
excitation spectrum; and faster kinetics of fluorophore generation. In
contrast, mutations to Trp, Arg, Asn, Phe and Asp did not provide improved
brightness.
Mutagenesis of S65T to shift its wavelengths further to the red yielded
M153A/K238E (Table II) as the GFP variant with the longest-wavelength
excitation maximum yet described, 504 nm vs. 490 nm for S65T. Surprisingly, the
emission peak hardly changed (514 nm vs. 511 nm), so that the separation
between the excitation and emission peaks (Stokes' shift) is extremely narrow,
only 10 nm. This is one of the smallest values reported for any fluorophore in
aqueous solution at room temperature. As in the Y66W series, M153 seems to be
influential. It is doubtful that K238E is important, because this substitution
has been found to be without effect in other mutants.
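The Stokes' shift quoted above follows directly from the peak positions given in the same paragraph. The short Python sketch below simply repeats that subtraction for the two variants; the dictionary keys and function name are illustrative labels, not designations from the original text.

```python
# Excitation and emission maxima (nm) as quoted in the paragraph above.
PEAKS_NM = {
    "S65T": (490, 511),
    "S65T/M153A/K238E": (504, 514),
}


def stokes_shift(excitation_nm, emission_nm):
    """Separation between emission and excitation maxima, in nm."""
    return emission_nm - excitation_nm


if __name__ == "__main__":
    for name, (excitation, emission) in PEAKS_NM.items():
        print(f"{name}: Stokes' shift = {stokes_shift(excitation, emission)} nm")
```

For the M153A/K238E variant the difference is 10 nm, as stated; for S65T itself the same subtraction gives 21 nm.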
As would be readily apparent to those working in the field, to provide the
desired fluorescent protein it would not be necessary to include the entire
sequence of GFP. In particular, minor deletions at either end of the protein
sequence are expected to have little or no impact on the fluorescence spectrum
of the protein. Therefore, by a mutant or wild-type GFP sequence for purposes
of the present invention are contemplated not only the complete polypeptide and
oligonucleotide sequences discussed herein, but also functionally-equivalent
portions thereof (i.e., portions of the polypeptide sequences which exhibit the
desired fluorescence properties and oligonucleotide sequences encoding these
polypeptide sequences). For example, whereas the chromophore itself (position
65-67) is obviously crucial, the locations of known neutral mutations suggest
that amino acids 76-115 are less critical to the spectroscopic properties of
the product. In addition, as would be immediately apparent to those working in
the field, the use of various types of fusion sequences which lengthen the
resultant protein and serve some functional purpose in the preparation or
purification of the protein would also be routine and are contemplated as
within the scope of the present invention. For example, it is common practice
to add amino acid sequences including a polyhistidine tag to facilitate
purification of the product proteins. As such fusions do not significantly
alter the salient properties of the molecules comprising same, modified GFPs as
described herein including such fusion sequences at either end thereof are also
clearly contemplated as within the scope of the present invention.
Similarly, in addition to the specific mutations disclosed herein, it is well
understood by those working in the field that in many instances modifications
in particular locations in the polypeptide sequence may have no effect upon the
properties of the resultant polypeptide. Unlike the specific mutations
described in detail herein, other mutations provide polypeptides which have
properties essentially or substantially indistinguishable from those of the
specific polypeptides disclosed herein. For example, the following
substitutions have been found to be neutral (i.e., have no significant impact
on the properties of the product): Lys 3→Arg; Asp 76→Gly; Phe 99→Ile;
Asn 105→Ser; Glu 115→Val; Thr 225→Ser; and Lys 238→Glu. These equivalent
polypeptides (and oligonucleotide sequences encoding these polypeptides) are
also regarded as within the scope of the present invention. In general, the
polypeptides and oligonucleotide sequences of the present invention (in
addition to containing at least one of the specific mutations identified
herein) will be at least about 85% homologous, more preferably at least about
90% homologous, and most preferably at least about 95% homologous, to the
wild-type GFP described herein. Because of the significant difference in
properties observed upon introduction of the specified modifications into a GFP
sequence, the presence of the specified modifications relative to the
corresponding reported sequence for wild-type GFP [SEQ ID NO:2] is regarded as
central to the invention.
The oligonucleotide sequences of the present invention are particularly
useful in processes for labelling polypeptides of interest, e.g., by the
construction of genes encoding fluorescent fusion proteins. Fluorescence
labeling via gene fusion is site-specific and eliminates the present need to
purify and label proteins in vitro and microinject them into cells. Sequences
encoding the modified GFPs of the present invention may be used for a wide
variety of purposes as are well known to those working in the field. For
example, the sequences may be employed as reporter genes for monitoring the
expression of the sequence fused thereto; unlike other reporter genes, the
sequences require neither substrates nor cell disruption to evaluate whether
expression has been achieved. Similarly, the sequences of the present invention
may be used as a means to trace lineage of a gene fused thereto during the
development of a cell or organism. Further, the sequences of the present
invention may be used as a genetic marker; cells or organisms labeled in this
manner can be selected by, e.g., fluorescence-activated cell sorting. The
sequences of the present invention may also be used as a fluorescent tag to
monitor protein expression in vivo, or to encode donors or acceptors for
fluorescence resonance energy transfer. Other uses for the sequences of the
present invention would be readily apparent to those working in the field, as
would appropriate techniques for fusing a gene of interest to an
oligonucleotide sequence of the present invention in the proper reading frame
and in a suitable expression vector so as to achieve expression of the combined
sequence.
The availability of several forms of GFP with such different spectral
properties should facilitate two-color assessment of differential gene
expression, developmental fate, or protein trafficking. For example, if one
wanted to screen for a drug that is specific to activate expression of gene A
but not gene B, one could fuse the cDNA for one color of GFP to the promoter
region of gene A and fuse the cDNA for another color to the promoter region of
gene B. Both constructs would be transfected into target cells and the
candidate drugs could be assayed to determine if they stimulate fluorescence of
the desired color, but not fluorescence of the undesired color. Similarly, one
could test for the simultaneous expression of both A and B by searching for the
presence of both colors simultaneously.
As another example, to examine the precise temporal or spatial relationship
between the generation or location of recombinant proteins X and Y within a
cell
or an organism, one could fuse genes for different colors of GFP to the genes
for
proteins X and Y, respectively. If desired, DNA sequences encoding flexible
oligopeptide spacers could be included to allow the linked domains to function
autonomously in a single construct. By examining the appearance of the two
distinguishable colors of fluorescence in the very same cells or organisms,
one
could compare and contrast the generation or location of the proteins X and Y
with
much greater precision and less biological variability than if one had to
compare
two separate sets of cells or organisms, each containing just one color of GFP
fused to either protein X or Y. Other examples of the usefulness of two colors
would be obvious to those skilled in the art.
The further mutations to brighten the Y66H and Y66W variants of GFP
enhance the possibility of using two or three colors of fluorescent protein to
track differential gene expression, protein localizations or cell fates. For
example, mutants P4-3 (Y66H/Y145F), W7 (Y66W/N146I/M153T/V163A/N212K) and
S65T can all be distinguished from each other. P4-3 is specifically detected by
exciting at 290-370 nm and collecting emission at 420-460 nm. W7 is
specifically detected by exciting at 410-457 nm and collecting emission at
465-495 nm. S65T is specifically detected by exciting at 483-493 nm and
collecting emission at wavelengths greater than 510 nm. Bacteria carrying these
three proteins are readily discriminated under a microscope using the above
wavelength bandpass filters.
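The three detection windows listed above can be written down as a small lookup to show why they separate the variants: none of the excitation bands overlap one another, and none of the emission bands do either. The following Python sketch is illustrative only; the data structure and function names are hypothetical, the band edges are copied from the text, and the open-ended S65T emission window is represented here as 510 nm to infinity.

```python
import math
from itertools import combinations

# Excitation and emission windows (nm) quoted in the text for each variant.
# The S65T emission window ("greater than 510 nm") is modelled as open-ended;
# treating its lower edge as inclusive is a simplification of this sketch.
CHANNELS = {
    "P4-3": {"excitation": (290, 370), "emission": (420, 460)},
    "W7": {"excitation": (410, 457), "emission": (465, 495)},
    "S65T": {"excitation": (483, 493), "emission": (510, math.inf)},
}


def in_band(wavelength_nm, band):
    """True if the wavelength lies within the (low, high) band."""
    low, high = band
    return low <= wavelength_nm <= high


def detected_variants(excitation_nm, emission_nm):
    """Names of variants whose windows contain both wavelengths."""
    return [
        name
        for name, channel in CHANNELS.items()
        if in_band(excitation_nm, channel["excitation"])
        and in_band(emission_nm, channel["emission"])
    ]


def bands_overlap(band_a, band_b):
    """True if two (low, high) bands share any wavelength."""
    return band_a[0] <= band_b[1] and band_b[0] <= band_a[1]


def any_overlap(kind):
    """True if any two variants share part of an excitation or emission band."""
    return any(
        bands_overlap(CHANNELS[a][kind], CHANNELS[b][kind])
        for a, b in combinations(CHANNELS, 2)
    )


if __name__ == "__main__":
    print(detected_variants(350, 440))
    print(detected_variants(430, 480))
    print(detected_variants(488, 520))
    print(any_overlap("excitation"), any_overlap("emission"))
```

The sketch describes only the filter windows; it says nothing about how much each protein's actual spectrum spills into a neighbouring window.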
The chromophore in GFP is well buried inside the rest of the protein, so
much of the dimness of the original point mutants was presumably due to steric
mismatch between the substituted amino acid and the cavity optimized for
tyrosine. The location of the beneficial mutations implies that residues
145-163 are probably close to the chromophore. The M153A/S65T mutant has the
longest wavelengths and smallest Stokes' shift of any known fluorescent protein
that does not use a cofactor.
The invention may be better understood with reference to the accompanying
examples, which are intended for purposes of illustration only and should not
be construed as in any sense limiting the scope of the invention as defined by
the claims appended hereto.
Example 1
The coding region of GFP clone 10.1 [Prasher et al. (1992), supra] was
amplified by PCR to create NdeI and BamHI sites at the 5' and 3' ends,
respectively, and was cloned behind the T7 promoter of pGEMEX2 (Promega)
replacing most of the T7 gene 10. The resulting plasmid was transformed into
the strain JM109(DE3) (Promega Corp., Madison, WI), and high level expression
was achieved by growing the cultures at 24°C to saturation without induction by
IPTG. To prepare soluble extracts, 1.5 ml of cell suspension was collected,
washed and resuspended in 150 µl 50 mM Tris/HCl, pH 8.0, 2 mM EDTA. Lysozyme
and DNAse I were added to 0.2 mg/ml and 20 µg/ml, respectively, and the samples
were incubated on ice until lysis occurred (1-2 hours). The lysates were then
clarified by centrifuging at 12,000 x g for 15 minutes. Inclusion bodies were
obtained as described in the literature [Sambrook, J. et al. in Molecular
Cloning: A Laboratory Manual Vol. 2, 17.37-17.41 (Cold Spring Harbor Press,
Cold Spring Harbor, New York, 1989)].
As illustrated in Fig. 1, soluble extracts of E. coli expressing GFP show a
predominant band which is absent in extracts from control cells and has the
same electrophoretic mobility as native GFP isolated from the jellyfish
A. victoria. Inclusion bodies of expressing cells consist mainly of
non-fluorescent GFP which has the same mobility as soluble GFP. Non-fluorescent
soluble GFP of anaerobically grown cultures is also a major band with correct
mobility. Soluble extracts of the mutated clones H9, P9, P11 and P4 again
contain a dominant protein with essentially the same molecular weight.
Random mutagenesis of the GFP cDNA was done by increasing the error
rate of the polymerase chain reaction with 0.1 mM MnCl2, 50 µM dATP and 200 µM
of dGTP, dCTP, and dTTP [Muhlrad, D. et al., Yeast 8, 79-82 (1992)]. The
product was ligated into pGEMEX2 and subsequently transformed into
JM109(DE3). Colonies on agar were visually screened for different emission
colors and ratios of brightness when excited at 475 vs. 395 nm.
Figs. 3a and 3b illustrate the excitation and emission spectra of wild-type
and mutant GFPs. In Figs. 3a and 3b, -- wild-type; - - S202F,T203I;
- - - I167T; ----- Y66W; - · - · Y66H. Samples were soluble fractions from
E. coli expressing the proteins at high level, except for Y66W, which was
obtained in very low yield and measured on intact cells. Autofluorescence was
negligible for all spectra except those of Y66W, whose excitation spectrum
below 380 nm may be contaminated by autofluorescence. Excitation and emission
spectra were measured with 1.8 nm bandwidths and the non-scanning wavelength
set to the appropriate peak. Excitation spectra were corrected with a rhodamine
B quantum counter, while emission spectra (except for Y66W) were corrected for
monochromator and detector efficiencies using manufacturer-supplied correction
spectra. All amplitudes have been arbitrarily normalized to a maximum value of
1.0. A comparison of brightness at equal protein concentrations is provided in
Table I.
Table I
Characteristics of mutated vs. wild-type GFP

Variant     Mutation        Excitation        Emission          Relative^c
                            Maxima (nm)^a     Maxima (nm)^b     Fluorescence
wild type   none            396 (476)         508 (503)         (≡100%)
H9          Ser 202→Phe,    398               511               117%^d
            Thr 203→Ile
P9          Ile 167→Val     471 (396)         502 (507)         166%^e
P11         Ile 167→Thr     471 (396)         502 (507)         188%^e
P4          Tyr 66→His      382               448               57%^f
-           Tyr 66→Trp      458               480               n.d.

^a Values in parentheses are lower-amplitude peaks.
^b Primary values were observed when exciting at the main excitation peak;
values in parentheses were observed when illuminating at the lower-amplitude
excitation peak.
^c Equal amounts of protein were used based on densitometry of gels stained
with Coomassie Blue (Fig. 1).
^d Emission maxima of spectra recorded at excitation 395 nm were compared.
^e Emission maxima of spectra recorded at excitation 475 nm were compared.
^f Emission spectrum of P4 recorded at 378 nm excitation was integrated and
compared to the integrated emission spectrum of wild type recorded at 475 nm
excitation; both excitation and emission characteristics were corrected.
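The main-peak values in Table I can be compared with wild type by simple subtraction. The Python sketch below is illustrative only: it copies the primary (non-parenthesized) maxima from the table and uses hypothetical names for the data structure and helper. A negative result denotes a shift to shorter wavelength.

```python
# Main excitation and emission maxima (nm) from Table I, parenthesized
# lower-amplitude peaks omitted. The last key is a label chosen for this
# sketch, since the table gives no variant name for that row.
TABLE_I_MAIN_PEAKS = {
    "wild type": (396, 508),
    "H9": (398, 511),
    "P9": (471, 502),
    "P11": (471, 502),
    "P4": (382, 448),
    "Tyr 66->Trp": (458, 480),
}


def shift_from_wild_type(variant):
    """Return (excitation shift, emission shift) in nm relative to wild type."""
    excitation, emission = TABLE_I_MAIN_PEAKS[variant]
    wt_excitation, wt_emission = TABLE_I_MAIN_PEAKS["wild type"]
    return excitation - wt_excitation, emission - wt_emission


if __name__ == "__main__":
    for name in TABLE_I_MAIN_PEAKS:
        print(name, shift_from_wild_type(name))
```

For P4 this gives shifts of -14 nm and -60 nm, matching the hypsochromic shifts of 14 and 60 nm stated in the description. For P9 and P11 the large excitation difference reflects a change in which of the two peaks dominates rather than a displacement of a single peak.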
Example 2
Oligonucleotide-directed mutagenesis at the codon for Ser-65 of GFP cDNA
was performed by the literature method [Kunkel, T.A. (1985) Proc. Natl. Acad.
Sci. USA 82, 488] using the Muta-Gene Phagemid in Vitro Mutagenesis Kit
version 2, commercially available from Bio-Rad, Richmond, CA. The method
employs a bacterial host strain deficient for dUTPase (dut) and
uracil-N-glycosylase (ung), which results in an occasional substitution of
uracil for thymine in newly-synthesized DNA. When the uracil-containing DNA is
used as a wild-type template for oligonucleotide-directed in vitro mutagenesis,
the complementary (mutant) strand can be synthesized in the presence of
deoxynucleotides, ligase and polymerase using the mutagenic oligonucleotide to
prime DNA synthesis; the Version 2 kit utilizes unmodified T7 DNA polymerase to
synthesize the complementary strand. When the heteroduplex molecule is
transformed into a host with an active uracil-N-glycosylase (which cleaves the
bond between the uracil base and the ribose molecule, yielding an apyrimidinic
site), the uracil-containing wild-type strand is inactivated, resulting in an
enrichment of the mutant strand.
The coding region of GFP cDNA was cloned into the BamHI site of the
phagemid pRSETB from Invitrogen (San Diego, CA). This construct was
introduced into the dut, ung double mutant E. coli strain CJ236 provided with
the Muta-Gene kit and superinfected with helper phage VCSM13 (Stratagene, La
Jolla, CA) to produce phagemid particles with single-stranded DNA containing
some uracils in place of thymine. The uracil-containing DNA was purified to
serve as templates for in vitro synthesis of the second strands using the
mutagenic nucleotides as primers. The DNA hybrids were transformed into the
strain XL1-Blue (available from Stratagene), which has a functional
uracil-N-glycosylase; this enzyme inactivates the parent wild-type DNA strand
and selects for mutant clones. DNA of several colonies was isolated and checked
for proper mutation by sequencing.
To express the mutant proteins, the DNA constructs obtained by
mutagenesis were transformed into E. coli strain BL21(DE3)LysS (Novagen,
Madison, WI), which has a chromosomal copy of T7 polymerase to drive
expression from the strong T7 promoter. At room temperature 3 ml cultures were
grown to saturation (typically, overnight) without induction. Cells from 1 ml
of culture were collected, washed and finally resuspended in 100 µl of 50 mM
Tris pH 8.0, 300 mM NaCl. The cells were then lysed by three cycles of
freeze/thawing (liquid nitrogen/30°C water bath). The soluble fraction was
obtained by pelleting cell debris and unbroken cells in a microfuge.
To facilitate purification of the recombinant proteins, the vector used fuses
a histidine tag (6 consecutive His) to the N-terminus of the expressed proteins.
The strong interaction between histidine hexamers and Ni²⁺ ions permitted
purification of the proteins by Ni-NTA resin (available commercially from Qiagen,
Chatsworth, CA). Microcolumns (10 µl bed volume) were loaded with 100 µl
soluble extract (in 50 mM Tris pH 8.0, 300 mM NaCl), washed with 10 bed
volumes of the same buffer and with 10 volumes of the buffer containing 20 mM
imidazole. The recombinant proteins were then eluted with the same buffer
containing 100 mM imidazole.
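The load, wash and elution buffers described above differ only in imidazole concentration, so their recipes can be worked out with C1·V1 = C2·V2. The following Python sketch is illustrative only: the stock concentrations (1 M Tris pH 8.0, 5 M NaCl, 1 M imidazole), the 1 ml final volume and the helper name `stock_volumes_ul` are assumptions and are not taken from this description.

```python
# Assumed stock solutions in mM; the text does not specify any stocks.
STOCKS_MM = {"Tris pH 8.0": 1000.0, "NaCl": 5000.0, "imidazole": 1000.0}


def stock_volumes_ul(final_ul, targets_mM, stocks_mM=STOCKS_MM):
    """Return the volume (µl) of each stock, plus water, for the target buffer."""
    # C1 * V1 = C2 * V2  ->  V1 = V2 * C2 / C1
    volumes = {name: final_ul * conc / stocks_mM[name]
               for name, conc in targets_mM.items()}
    water = final_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("target concentrations cannot be reached from these stocks")
    volumes["water"] = water
    return volumes


# Buffers named in the text: base buffer, 20 mM imidazole wash, 100 mM imidazole elution.
for label, imidazole_mM in (("load/wash", 0.0), ("wash", 20.0), ("elution", 100.0)):
    recipe = stock_volumes_ul(
        1000.0, {"Tris pH 8.0": 50.0, "NaCl": 300.0, "imidazole": imidazole_mM})
    print(label, recipe)
```

With a 10 µl bed volume, each 10-bed-volume wash corresponds to 100 µl of buffer, so 1 ml of each recipe would cover several microcolumns.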
Aliquots of the purified mutant GFP proteins were run along with wild-type
GFP on a denaturing polyacrylamide gel. The gel was stained with Coomassie
blue and the protein bands were quantified by scanning on a densitometer. Based
on these results, equal amounts of each version of protein were used to run
fluorescence emission and excitation spectra.
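One way to turn densitometry readings into equal protein amounts is to scale the volume of each sample inversely to its band intensity. The Python sketch below assumes that band intensity per microliter loaded is proportional to protein mass (that is, staining is within its linear range); the function name `equal_load_volumes` and the example intensities are hypothetical and are not measurements from this work.

```python
def equal_load_volumes(intensity_per_ul, max_volume_ul=10.0):
    """Return the volume of each sample that contains the same protein amount.

    The most dilute sample is used at max_volume_ul; every other sample is
    scaled down in proportion to how much more concentrated it is.
    """
    if any(value <= 0 for value in intensity_per_ul.values()):
        raise ValueError("band intensities must be positive")
    weakest = min(intensity_per_ul.values())
    return {name: max_volume_ul * weakest / value
            for name, value in intensity_per_ul.items()}


# Hypothetical densitometer readings (arbitrary units per µl loaded).
readings = {"wild-type": 2.0, "S65T": 4.0, "S65A": 1.0}
print(equal_load_volumes(readings))
```

Under this assumption, a sample whose band is twice as intense is used at half the volume, so every spectrum is recorded on the same amount of protein.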
Figs. 4a and 4b compare the excitation and emission spectra of wild-type
and Ser 65 mutants. In Fig. 4a, -- S65T; - - S65A; - - - S65C; - ~ -
wild-type (emission at 508 nm). In Fig. 4b, -- S65T; - - S65A; - - - S65C;
~ ~ ~ wild-type (excitation at 395 nm); - ~ - ~ wild-type (excitation at 475 nm).
Excitation and emission spectra were measured with 1.8 nm bandwidths and the
non-scanning wavelength set to the appropriate peak. As is apparent from Fig. 4b,
all three mutants exhibited substantially higher intensity of emission relative to the
wild-type protein.
Fig. 5 illustrates the rates of fluorophore formation in wild-type GFP and
in the Ser 65→Thr mutant. E. coli expressing either wild-type or mutant GFP were
grown anaerobically. At time = 0, each sample was exposed to air; further growth
and protein synthesis were prevented by transferring the cells to nutrient-free
medium also containing sodium azide as a metabolic inhibitor. Fluorescence was
subsequently monitored as a function of time. For each culture, the fluorescence
intensities are expressed as a fraction of the final fluorescence intensity obtained
at t = 18 to 20 hours, after oxidation had proceeded to completion. From Fig. 5,
it is apparent that development of fluorescence proceeds much more quickly in the
mutant than in wild-type GFP, even after normalization of the absolute brightnesses
(Figs. 4a and 4b). Therefore, when the development of GFP fluorescence is used
as an assay for promoter activation and gene expression, the mutant clearly gives
a more rapid and faithful measure than wild-type protein.
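The normalization used for Fig. 5 can be sketched as follows: each time course is divided by its own plateau value, taken here as the mean of the readings between 18 and 20 hours. This is a minimal Python illustration; the function name `fraction_of_final`, the use of a mean over the plateau window and the example readings are assumptions, not data or procedures stated in this description.

```python
def fraction_of_final(times_h, intensities, plateau_window=(18.0, 20.0)):
    """Express each intensity as a fraction of the mean plateau intensity."""
    if len(times_h) != len(intensities):
        raise ValueError("times and intensities must have the same length")
    start, end = plateau_window
    plateau = [f for t, f in zip(times_h, intensities) if start <= t <= end]
    if not plateau:
        raise ValueError("no readings fall inside the plateau window")
    final = sum(plateau) / len(plateau)
    return [f / final for f in intensities]


# Hypothetical readings (hours after exposure to air, arbitrary fluorescence units).
times = [0.0, 1.0, 2.0, 4.0, 18.0, 20.0]
signal = [0.0, 20.0, 35.0, 60.0, 99.0, 101.0]
print(fraction_of_final(times, signal))
```

Because each culture is scaled to its own final value, curves for proteins of different absolute brightness can be compared purely on the rate at which fluorescence develops.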
Figs. 6a and 6b illustrate the behavior of wild-type GFP and the Ser
65→Thr mutant, respectively, upon progressive irradiation with ultraviolet light.
Numbers indicate minutes of exposure to illumination at 280 nm; intensity was the
same for both samples. Wild-type GFP (Fig. 6a) suffered photoisomerization, as
shown by a major change in the shape of the excitation spectrum. Illumination
with broad band (240-400 nm) UV caused qualitatively similar behavior but with
less increase of amplitude in the 430-500 nm region of the spectrum. The
photoisomerization was not reversible upon standing in the dark. This
photoisomerization would clearly be undesirable for most uses of wild-type GFP,
because the protein rapidly loses brightness when excited at its main peak near 395
nm. The mutant (Fig. 6b) showed no such photoisomerization or spectral shift.
Example 3
GFP cDNAs encoding Tyr66→His (Y66H), Tyr66→Trp (Y66W), or
Ser65→Thr (S65T) were separately further mutagenized by the polymerase chain
reaction and transformed into E. coli for visual screening of colonies with unusual
intensities or colors. Isolation, spectral characterization (Table II and Fig. 7), and
DNA sequencing yielded several additional useful variants.
Random mutagenesis of the gfp cDNA was done by increasing the error
rate of the PCR with 0.1 mM MnCl₂ and unbalanced nucleotide concentrations.
The GFP mutants S65T, Y66H and Y66W had been cloned into the BamHI site
of the expression vector pRSETB (Invitrogen), which includes a T7 promoter and
a polyhistidine tag. The GFP coding region (shown in bold) was flanked by the
following 5' and 3' sequences: 5'-G GAT CCC CCC GCT GAA TTC ATG ...
AAA TAA TAA GGA TCC-3'. The 5' primer for the mutagenic PCR was the T7
primer matching the vector sequence; the 3' primer was 5'-GGT AAG CTT TTA
TTT GTA TAG TTC ATC CAT GCC-3', specific for the 3' end of GFP, creating
a HindIII restriction site next to the stop codon. Amplification was over 25 cycles
(1 min at 94 °C, 1 min at 52 °C, 1 min at 72 °C) using the AmpliTaq polymerase from
Perkin Elmer. Four separate reactions were run in which the concentration of a
different nucleotide was lowered from 200 µM to 50 µM. The PCR products were
combined, digested with BamHI and HindIII and ligated to the pRSETB cut with
BamHI and HindIII. The ligation mixture was dialyzed against water, dried and
subsequently transformed into the bacterial strain BL21(DE3) by electroporation
(50 µl electrocompetent cells in 0.1 cm cuvettes, 1900 V, 200 ohm, 25 µF).
Colonies on agar were visually screened for brightness as previously described
herein. The selected clones were sequenced with the Sequenase version 2.0 kit
from United States Biochemical.
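Two details of the mutagenic PCR lend themselves to a quick check: the four unbalanced nucleotide mixes, and the way the 3' primer places a HindIII site next to the stop codon. The Python sketch below is illustrative only. The function names are hypothetical, and the GFP 3' end used for comparison is assembled from the last codons of SEQ ID NO:1 plus the TAA stop codon shown in the flanking sequence quoted above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def unbalanced_dntp_mixes(normal_uM=200.0, lowered_uM=50.0):
    """One mix per reaction, each with a different nucleotide lowered."""
    return [{f"d{base}TP": (lowered_uM if base == limiting else normal_uM)
             for base in "ACGT"}
            for limiting in "ACGT"]


PRIMER_3 = "GGTAAGCTTTTATTTGTATAGTTCATCCATGCC"  # 3' primer from the text, spaces removed
GFP_3_END = "GGCATGGATGAACTATACAAATAA"          # Gly Met Asp Glu Leu Tyr Lys + stop (assumed TAA)
HINDIII_SITE = "AAGCTT"

# Read the primer on the coding strand: GFP 3' end, then the restriction site.
coding_view = reverse_complement(PRIMER_3)
print(coding_view)
print(coding_view.startswith(GFP_3_END))
print(coding_view[len(GFP_3_END):len(GFP_3_END) + 6] == HINDIII_SITE)

for mix in unbalanced_dntp_mixes():
    print(mix)
```

Viewed on the coding strand, the primer covers the final GFP codons and the stop codon, followed immediately by AAGCTT, which is consistent with the statement that the HindIII site is created next to the stop codon.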
Cultures with freshly transformed cells were grown at 37 °C to an optical
density of 0.8 at 600 nm, then induced with 0.4 mM isopropylthiogalactoside
overnight at room temperature. Cells were washed in PBS pH 7.4, resuspended in
50 mM Tris pH 8.0, 300 mM NaCl and lysed in a French press. The
polyhistidine-tagged GFP proteins were purified from cleared lysates on nickel-chelate
columns (Qiagen) using 100 mM imidazole in the above buffer to elute the protein.
Excitation spectra were obtained by collecting emission at the respective
peak wavelengths and were corrected by a Rhodamine B quantum counter.
Emission spectra were likewise measured at the respective excitation peaks and
were corrected using factors from the fluorometer manufacturer (Spex Industries,
Edison, NJ). In cleavage experiments emission spectra were recorded at excitation
368 nm. For measuring molar extinction coefficients, 20 to 30 µg of protein were
used in 1 ml of PBS pH 7.4. Quantum yields of wild-type GFP, S65T, and P4-1
mutants were estimated by comparison with fluorescein in 0.1 N NaOH as a
standard of quantum yield 0.91 [ed. Miller, J.N., Standards in Fluorescence
Spectrometry (Chapman and Hall, New York, 1981)]. Mutants P4 and P4-3 were
likewise compared to 9-amino-acridine in water (quantum yield 0.98). W2 and W7
were compared to both standards, which fortunately gave concordant results.
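The two quantities measured here can be illustrated with short calculations. The Python sketch below uses the common single-point relative method for quantum yield and the Beer-Lambert law for the extinction coefficient. It assumes that sample and standard are excited at the same wavelength in dilute solutions of similar refractive index, and it uses a hypothetical molecular weight of 27,000 g/mol; the text does not state which exact procedure or molecular weight was used, and all input numbers are invented for illustration.

```python
def relative_quantum_yield(sample_area, sample_abs, ref_area, ref_abs, ref_qy):
    """Quantum yield from integrated emission per unit absorbance, relative to a standard."""
    return ref_qy * (sample_area / sample_abs) / (ref_area / ref_abs)


def molar_extinction(absorbance, mass_ug, volume_ml, mol_weight, path_cm=1.0):
    """Beer-Lambert: epsilon = A / (c * l), with c in mol/l."""
    moles = mass_ug * 1e-6 / mol_weight
    conc_molar = moles / (volume_ml * 1e-3)
    return absorbance / (conc_molar * path_cm)


# Hypothetical inputs: a sample emitting 60% as much as fluorescein (0.91)
# at matched absorbance, and 27 µg of protein in 1 ml giving A = 0.020.
print(relative_quantum_yield(60.0, 0.05, 100.0, 0.05, ref_qy=0.91))
print(molar_extinction(0.020, mass_ug=27.0, volume_ml=1.0, mol_weight=27000.0))
```

Comparing a protein against two independent standards, as was done for W2 and W7, is a useful consistency check on this kind of relative measurement.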
Fig. 7 illustrates the fluorescence excitation and emission spectra of
different GFP mutants. All spectra were normalized to a maximal value of 1.
Each pair of excitation and emission spectra is depicted by a distinct line style.
The fluorescence properties of the obtained GFP mutants are reported in
Table II.
Table II
Fluorescence properties of GFP mutants

Clone  Mutations                    Excitation  Emission   Extinct. Coeff.  Quantum
                                    max (nm)    max (nm)   (M⁻¹ cm⁻¹)       yield
P4-3   Y66H, Y145F                  381         445        14,000           0.38
W7     Y66W, N146I, M153T,          433 (453)   475 (501)  18,000 (17,100)  0.67
       V163A, N212K
W2     Y66W, I123V, Y145H, H148R,   432 (453)   480        10,000 (9,600)   0.72
       M153T, V163A, N212K
P4-1   S65T, M153A, K238E           504 (396)   514        14,500 (8,600)   0.54
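The mutation names in Table II follow the pattern wild-type residue, position, new residue (for example, S65T replaces Ser 65 with Thr). The Python sketch below applies such substitutions to the wild-type sequence and checks the stated wild-type residue first. The one-letter sequence is a transcription of SEQ ID NO:2 made for this illustration, and the function name `apply_mutations` is hypothetical.

```python
import re

# One-letter transcription of SEQ ID NO:2 (238 residues), 16 residues per chunk.
WILD_TYPE_GFP = (
    "MSKGEELFTGVVPILV" "ELDGDVNGHKFSVSGE" "GEGDATYGKLTLKFIC"
    "TTGKLPVPWPTLVTTF" "SYGVQCFSRYPDHMKR" "HDFFKSAMPEGYVQER"
    "TIFFKDDGNYKTRAEV" "KFEGDTLVNRIELKGI" "DFKEDGNILGHKLEYN"
    "YNSHNVYIMADKQKNG" "IKVNFKIRHNIEDGSV" "QLADHYQQNTPIGDGP"
    "VLLPDNHYLSTQSALS" "KDPNEKRDHMVLLEFV" "TAAGITHGMDELYK"
)


def apply_mutations(sequence, mutations):
    """Apply point substitutions such as 'S65T' (1-based positions)."""
    residues = list(sequence)
    for mutation in mutations:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
        if match is None:
            raise ValueError(f"cannot parse mutation {mutation!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"{mutation}: expected {old} at {position}, found {residues[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


# Clone P4-1 from Table II.
p4_1 = apply_mutations(WILD_TYPE_GFP, ["S65T", "M153A", "K238E"])
print(WILD_TYPE_GFP[64:67], "->", p4_1[64:67])
```

Checking the wild-type residue before substituting catches numbering or transcription mistakes, which is helpful when mutation lists are copied from a table.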
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: Tsien, Roger Y.
Heim, Roger
(ii) TITLE OF INVENTION: MODIFIED GREEN FLUORESCENT PROTEINS
(iii) NUMBER OF SEQUENCES: 2
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Robbins, Berliner & Carson
(B) STREET: 201 North Figueroa Street, Suite 500
(C) CITY: Los Angeles
(D) STATE: California
(E) COUNTRY: USA
(F) ZIP: 90012
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Spitals, John P.
(B) REGISTRATION NUMBER: 29,215
(C) REFERENCE/DOCKET NUMBER: 1279-178
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (213) 977-1001
(B) TELEFAX: (213) 977-1003
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 717 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..717
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
ATG AGT AAA GGA GAA GAA CTT TTC ACT GGA GTT GTC CCA ATT CTT GTT 48
Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
GAA TTA GAT GGT GAT GTT AAT GGG CAC AAA TTT TCT GTC AGT GGA GAG 96
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30
GGT GAA GGT GAT GCA ACA TAC GGA AAA CTT ACC CTT AAA TTT ATT TGC 144
Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
ACT ACT GGA AAA CTA CCT GTT CCA TGG CCA ACA CTT GTC ACT ACT TTC 192
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60
TCT TAT GGT GTT CAA TGC TTT TCA AGA TAC CCA GAT CAT ATG AAA CGG 240
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80
CAT GAC TTT TTC AAG AGT GCC ATG CCC GAA GGT TAT GTA CAG GAA AGA 288
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
ACT ATA TTT TTC AAA GAT GAC GGG AAC TAC AAG ACA CGT GCT GAA GTC 336
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110
AAG TTT GAA GGT GAT ACC CTT GTT AAT AGA ATC GAG TTA AAA GGT ATT 384
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
GAT TTT AAA GAA GAT GGA AAC ATT CTT GGA CAC AAA TTG GAA TAC AAC 432
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
TAT AAC TCA CAC AAT GTA TAC ATC ATG GCA GAC AAA CAA AAG AAT GGA 480
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
ATC AAA GTT AAC TTC AAA ATT AGA CAC AAC ATT GAA GAT GGA AGC GTT 528
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175
CAA CTA GCA GAC CAT TAT CAA CAA AAT ACT CCA ATT GGC GAT GGC CCT 576
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
GTC CTT TTA CCA GAC AAC CAT TAC CTG TCC ACA CAA TCT GCC CTT TCG 624
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205
AAA GAT CCC AAC GAA AAG AGA GAC CAC ATG GTC CTT CTT GAG TTT GTA 672
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
ACA GCT GCT GGG ATT ACA CAT GGC ATG GAT GAA CTA TAC AAA TA 717
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235
(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 238 amino acids
(B) TYPE: amino acid
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
Met Ser Lys Gly Glu Glu Leu Phe Thr Gly Val Val Pro Ile Leu Val
1 5 10 15
Glu Leu Asp Gly Asp Val Asn Gly His Lys Phe Ser Val Ser Gly Glu
20 25 30
Gly Glu Gly Asp Ala Thr Tyr Gly Lys Leu Thr Leu Lys Phe Ile Cys
35 40 45
Thr Thr Gly Lys Leu Pro Val Pro Trp Pro Thr Leu Val Thr Thr Phe
50 55 60
Ser Tyr Gly Val Gln Cys Phe Ser Arg Tyr Pro Asp His Met Lys Arg
65 70 75 80
His Asp Phe Phe Lys Ser Ala Met Pro Glu Gly Tyr Val Gln Glu Arg
85 90 95
Thr Ile Phe Phe Lys Asp Asp Gly Asn Tyr Lys Thr Arg Ala Glu Val
100 105 110
Lys Phe Glu Gly Asp Thr Leu Val Asn Arg Ile Glu Leu Lys Gly Ile
115 120 125
Asp Phe Lys Glu Asp Gly Asn Ile Leu Gly His Lys Leu Glu Tyr Asn
130 135 140
Tyr Asn Ser His Asn Val Tyr Ile Met Ala Asp Lys Gln Lys Asn Gly
145 150 155 160
Ile Lys Val Asn Phe Lys Ile Arg His Asn Ile Glu Asp Gly Ser Val
165 170 175
Gln Leu Ala Asp His Tyr Gln Gln Asn Thr Pro Ile Gly Asp Gly Pro
180 185 190
Val Leu Leu Pro Asp Asn His Tyr Leu Ser Thr Gln Ser Ala Leu Ser
195 200 205
Lys Asp Pro Asn Glu Lys Arg Asp His Met Val Leu Leu Glu Phe Val
210 215 220
Thr Ala Ala Gly Ile Thr His Gly Met Asp Glu Leu Tyr Lys
225 230 235