Note: Descriptions are shown in the official language in which they were submitted.
DNA sequence coding for a signal peptide of
levansucrase and vectors containing the same
The sucrose metabolic system is an important
model for studying the regulation of gene expression
and secretion in Bacillus subtilis. Biochemical and
genetical studies have already been made on this system
which comprises at least eight different loci [Lepesant,
J.A. et al (1976) In : Schlessinger D. (Ed) Micro-
biology, Am. Soc. Microbiol. Washington DC, pp. 58 -
69]. It includes three structural genes specifically induced by sucrose. One of them, sacB, codes for the
exocellular levansucrase (lvs). Among the five known
regulatory loci, four control the expression of sacB.
Levansucrase is a β-D-fructofuranosyl transferase
(E.C. 2.4.1.10) which synthesizes polymers
of fructose called levans. It has been obtained in a
pure state [DEDONDER, R. (1966) Methods in Enzymol.,
8, 500 - 505]. It has a molecular weight of 50000
daltons; its primary structure has been determined by
protein sequencing [DELFOUR, A. (1981) Doctorate Thesis
Paris VII University] as well as its tertiary structure
by X-ray diffraction analysis [Lebrun, E. et al (1980)
J. Biol. Chem., 255, 12034-12036].
The present invention is based on the discovery
of the existence of a DNA sequence comprised in the sacB gene, which codes for a
signal peptide, more generally of the existence of a
precursor of levansucrase. This signal peptide has
been located among the expression products of an insert
contained in a modified vector which had been used for
transforming an E. coli strain, whereby said insert
contained a sequence including most of said sacB gene
including a regulatory gene associated therewith.
The modified vector had been obtained by
inserting sized partial Sau3AI digests of the DNA of
the QB 2010 Bacillus subtilis strain, which carries a sacR
mutation and secretes levansucrase constitutively, into
the BamHI sites of a λEMBL3 phage. The modified phages
were first screened by molecular hybridization with a
radioactively labelled probe containing the sacB gene,
and a restriction map of the phages retained by this
first screening was made. One modified phage was
retained as the starting phage for the study which led
to the invention by comparing the different restriction
sites obtained with those deduced by a computer
compilation from the known aminoacid sequence of the
protein. The finally selected phage included the entire
sacB gene, except for the last twenty-six nucleotides.
An EcoRI-TaqI fragment of about 400 bp which
overlaps the NH2-terminal sequence of the protein and
its putative signal sequence was subcloned into the
large EcoRI-ClaI fragment of pBR322. The nucleotide
sequence was determined starting from pBR322 EcoRI
or HindIII sites by the method of Maxam and Gilbert
[(1980) Methods in Enzymol., 65, 449 - 559]. The deduced
aminoacid sequence was compared to that obtained by protein
sequencing by A. Delfour and showed total
identity downstream of the Lys residue which is the mature
protein N-terminal aminoacid.
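The comparison described above can be sketched in a few lines. This is only an illustration: the codons are those read from formula (I) for the first 17 residues of the mature protein, and the residue string is taken from the aminoacid line of the same formula, standing in for the protein-sequencing data, which the text does not reproduce. The names `translate`, `MATURE_5PRIME` and `PROTEIN_N_TERMINUS` are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon (one-letter aminoacid codes)."""
    dna = dna.replace(" ", "").upper()
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Assumption: codons +1 to +17 as read from formula (I).
MATURE_5PRIME = ("AAA GAA ACG AAC CAA AAG CCA TAT AAG GAA ACA TAC "
                 "GGC ATT TCC CAT ATT")
# Assumption: stand-in for the N-terminal residues found by protein sequencing.
PROTEIN_N_TERMINUS = "KETNQKPYKETYGISHI"

deduced = translate(MATURE_5PRIME)
# 1-based positions at which the deduced and sequenced residues differ.
mismatches = [i + 1 for i, (d, p) in enumerate(zip(deduced, PROTEIN_N_TERMINUS))
              if d != p]
print(deduced, mismatches)
```

An empty mismatch list corresponds to the "total identity" reported in the text for the region downstream of the N-terminal Lys.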
Upstream of this Lys residue an open reading
frame of 29 aminoacids was found starting from an ATG
codon. This sequence is characterized by the presence
of a hydrophilic region of 8 aminoacids with 3 Lys
residues followed by a hydrophobic stretch of 21 resi-
dues with an Ala as last aminoacid.
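The two regions described above can be checked directly against the 29-residue sequence given in formula (2). The sketch below is illustrative only: the Kyte-Doolittle hydropathy scale is an added choice, not something used in the text, and the names `n_region`, `h_region` and `mean_hydropathy` are hypothetical.

```python
# Signal peptide of formula (2), in one-letter code.
SIGNAL_PEPTIDE = "MNIKKFAKQATILTFTTALLAGGATQAFA"

# Kyte-Doolittle hydropathy values (assumption: one possible scale;
# the text only speaks of "hydrophilic" and "hydrophobic" regions).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(peptide: str) -> float:
    """Average Kyte-Doolittle value over a peptide."""
    return sum(KYTE_DOOLITTLE[res] for res in peptide) / len(peptide)


n_region = SIGNAL_PEPTIDE[:8]   # hydrophilic stretch with the Lys residues
h_region = SIGNAL_PEPTIDE[8:]   # hydrophobic stretch ending in Ala

print(len(SIGNAL_PEPTIDE), n_region, n_region.count("K"))
print(len(h_region), h_region[-1])
print(round(mean_hydropathy(n_region), 2), round(mean_hydropathy(h_region), 2))
```

On this scale the first eight residues average below zero and the remaining 21 average above it, and the second stretch carries no charged residue (Lys, Arg, Asp or Glu).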
There follows a formula (I) of the DNA and
aminoacid sequences in the NH2-terminal region of
levansucrase. The NH2-terminal Lys of the mature
levansucrase was taken as aminoacid +1. The site of
cleavage between the signal sequence and the mature
levansucrase (when the latter is excreted by its
natural host) is indicated by a vertical bar.
                       -29
TAAAAAGGAG ACA TGA ACG ATG AAC ATC AAA AAG TTT GCA AAA CAA GCA ACG ATN
                      fMet Asn Ile Lys Lys Phe Ala Lys Gln Ala Thr Ile

                                                                -1
TTA ACC TTT ACT ACC GCN CTG CTG GCA GGA GGC GCN ACN CAA GCG TTT GCG |
Leu Thr Phe Thr Thr Ala Leu Leu Ala Gly Gly Ala Thr Gln Ala Phe Ala |

 +1
AAA GAA ACG AAC CAA AAG CCA TAT AAG GAA ACA TAC GGC ATT TCC CAT ATT
Lys Glu Thr Asn Gln Lys Pro Tyr Lys Glu Thr Tyr Gly Ile Ser His Ile

ACA CGC CAT GAN ATG CTG CAA ATC CCC GAA CAG CAA AAA AAT GAN AAG
Thr Arg His Asp Met Leu Gln Ile Pro Glu Gln Gln Lys Asn Glu Lys

(I)
(N: nucleotide illegible in the source copy; the residue it encodes is given by the aminoacid line.)
A TGA stop codon in the same reading frame 2
codons upstream from the ATG codon (numbered -29) of
the precursor comprising the corresponding 29 triplets
of nucleotides coding for the corresponding 29 aminoacyl
residues of the signal peptide is also shown hereabove.
Accordingly the ATG codon corresponding to
the Met residue located 29 aminoacids before the N-terminal
Lys residue of the mature protein may be considered
as the start codon. A good putative ribosome
binding site was also found 9 bp upstream from the ATG.
Preliminary sequencing results indicate that the stop
codon could correspond to the end of another gene since
it is preceded by an open reading frame of at least
185 bp. No promoter nucleotide consensus sequence was
found in this sequence.
Expression of the sacB gene in E. coli minicells
To confirm the existence of the precursor and
to determine its length, the sacB gene was expressed
in an E. coli minicell producing strain. The plasmid
pLS8, which contains the entire sacB gene, was obtained
by Gay, P. et al from a library [J.A. Hoch (1983)
J. Bacteriol., 153, 1424 - 1431]. It was introduced
by transformation into the E. coli AR1062 minicell
producing strain. Very few recombinants were obtained
and the majority of them had lost the levansucrase
activity. Two recombinant clones possessing the Lvs+
(levansucrase+) phenotype detected directly on plate
[Lepesant J.A. et al (1972) Molec. Gen. Genet. 118, 135
- 160] were picked and further analyzed. The presence
of an active levansucrase was confirmed by enzymatic
assays on the sonicated extract (the specific activity
found was about 0.1 enzyme unit per mg bacterial protein).
Furthermore, one of the clones recovered contained
a plasmid with a DNA insertion located in the
0.95 kb HindIII fragment at the end of the sacB locus.
This sequence is about 800 bp long and contains a PstI
site; it probably corresponds to IS-1. The other clone
harbored a plasmid identical to the original pLS8. The
proteins of this latter clone were analyzed by SDS-PAGE
after incorporation of [35S] L-methionine. Several
additional bands were detected, compared to the pattern
obtained in a distinct lane for the extract of the cells
containing the vector pBR325 alone. One of these
bands had a MW which corresponds to that of authentic
levansucrase (50000), and another one had a MW of about
53000. Another gel was submitted to immunoblotting
and it was observed that the same two bands reacted
with anti-levansucrase antibodies, suggesting that the
smaller polypeptide is the mature form and the larger
one a precursor form of the enzyme.
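The gap of about 3000 between the two bands can be compared with the mass expected for the 29-residue signal peptide of formula (2). The sketch below sums standard average residue masses; it is a rough consistency check only, since molecular weights read from SDS-PAGE are approximate, and the names used are hypothetical.

```python
# Average residue masses in daltons (aminoacid minus one water),
# standard textbook values.
RESIDUE_MASS = {
    "G": 57.05, "A": 71.08, "S": 87.08, "P": 97.12, "V": 99.13,
    "T": 101.10, "C": 103.14, "L": 113.16, "I": 113.16, "N": 114.10,
    "D": 115.09, "Q": 128.13, "K": 128.17, "E": 129.12, "M": 131.19,
    "H": 137.14, "F": 147.18, "R": 156.19, "Y": 163.18, "W": 186.21,
}

# Signal peptide of formula (2), in one-letter code.
SIGNAL_PEPTIDE = "MNIKKFAKQATILTFTTALLAGGATQAFA"

# Apparent molecular weights of the two immunoreactive bands (from the text).
PRECURSOR_MW = 53000
MATURE_MW = 50000

# Mass contributed by the signal peptide residues within the precursor chain.
signal_mass = sum(RESIDUE_MASS[res] for res in SIGNAL_PEPTIDE)
observed_gap = PRECURSOR_MW - MATURE_MW

print(round(signal_mass), observed_gap)
```

The computed value is close to 3000 daltons, in line with the larger band being the mature enzyme extended by the signal peptide.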
The invention is thus more particularly con-
cerned with a nucleotide sequence coding for the signal
peptide of formula (2).
MET ASN ILE LYS LYS PHE ALA LYS GLN ALA THR
ILE LEU THR PHE THR THR ALA LEU LEU ALA GLY GLY ALA THR
GLN ALA PHE ALA
(2)
The corresponding preferred nucleotide sequence
flows from formula (I) hereabove. It is however
understood that any of the triplets of the corresponding
sequence may be replaced by another one coding for
the same aminoacid, as it can be deduced from the
classical genetic code.
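To give an idea of how many nucleotide sequences fall under this statement, the following sketch lists the synonymous codons of the standard genetic code and counts the distinct DNA sequences that encode the signal peptide of formula (2). It is an illustration only; the names `SYNONYMS` and `count_encodings` are hypothetical.

```python
from collections import defaultdict
from math import prod

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each aminoacid (one-letter code) to all codons that encode it.
SYNONYMS = defaultdict(list)
for i, a in enumerate(BASES):
    for j, b in enumerate(BASES):
        for k, c in enumerate(BASES):
            SYNONYMS[AMINO[16 * i + 4 * j + k]].append(a + b + c)

# Signal peptide of formula (2), in one-letter code.
SIGNAL_PEPTIDE = "MNIKKFAKQATILTFTTALLAGGATQAFA"


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences coding for the given peptide."""
    return prod(len(SYNONYMS[res]) for res in peptide)


print(sorted(SYNONYMS["A"]))            # the four alanine codons
print(count_encodings(SIGNAL_PEPTIDE))  # all sequences coding for formula (2)
```

Each alanine of the signal peptide, for example, may be written with any of the four GCN codons without changing the peptide.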
This nucleotide sequence is of particular
significance as the major function of causing the
excretion of the levansucrase by its natural host can
be attributed to the corresponding signal peptide.
The invention is further concerned with the vectors,
particularly plasmids, capable of transforming B. subtilis
and which contain said particular nucleotide sequence or
fragment, hereafter referred to as "signal fragment" and
optionally, part of the DNA sequence which follows said
signal sequence in the sacB gene and, accordingly, codes
for the first aminoacids of the mature levansucrase (hereafter
referred to as "partial levansucrase gene").
Preferably the vector of the invention is free of
said partial levansucrase gene. The latter may then be
replaced by another DNA sequence coding for a determined
polypeptide, particularly a polypeptide whose production
by B. subtilis is sought. Advantage is then taken of the
capability of said signal sequence of directing the synthesis
and excretion by the transformed B. subtilis of said
predetermined polypeptide when a DNA sequence coding for
said predetermined protein has been substituted for the DNA
sequence coding for the mature levansucrase. Thus the
invention opens an additional field of use of B. subtilis
for the production and excretion of determined polypeptides
or peptides as diverse as, for instance, interferon,
polypeptides having immunogenic properties against various
pathological microorganisms, e.g. viral hepatitis B, for
instance a polypeptide having immunogenic properties
analogous to those of the HBsAg antigen, or the so-called
crystal protein of Bacillus thuringiensis, that is a protein
having potent insecticidal properties.
When the partial levansucrase gene is present, the
vector of the invention then consists of an intermediate
vector, then capable of being modified by an insert coding
for said determined polypeptide, particularly by substitu-
tion of the appropriate DNA sequence for said partial levan-
sucrase gene. Preferred methods for carrying out said
substitution will be disclosed hereafter.
Preferably said partial levansucrase gene contains
not more than the 112 first base pairs of the gene sequence
coding for the mature levansucrase, the 112th base
pair being located within an EcoRI site in the natural
gene sequence.
Advantageously the DNAs and vectors of this
invention also possess the endogenous ribosome binding
site which, in the natural sacB gene has been deemed
to be located 9 base pairs upstream from the ATG first
codon of the signal fragment, i.e. a AAAAAGGAG se-
quence. Even more preferably the DNA or vectors of
this invention further comprise the whole promoter
region which, in the natural DNA including the sacB
gene, is within a Sau3AI - EcoRI fragment containing
about 750 base pairs, said fragment including the
signal sequence and the 112 first base pairs of the
levansucrase gene (and terminating at the EcoRI site
already referred to hereabove).
It may be preferable to substitute the
promoter region (sacRc) obtained from a mutant of B.
subtilis which is constitutive for levansucrase, for
the promoter region of sacR obtained from B. subtilis
in which the synthesis of levansucrase must be induced,
for instance in the presence of sucrose.
Another promoter region effective to control
the transcription of a gene in B. subtilis may also be
substituted for the endogenous promoter region normally
associated with the levansucrase and precursor genes,
more particularly upstream from the abovesaid ribosome
binding site. Examples of such promoter regions are
those normally associated with the endocellular sucrase
coded for by the sacA gene or the promoter normally
associated with the gene coding for the crystal protein
in B. thuringiensis.
Preferred vectors according to the invention
are those which further comprise a replicon enabling
them to be replicated also in E. coli. Such vectors
may accordingly be amplified in E. coli.
Thus preferred vectors of the invention will
comprise the abovesaid Sau3AI-EcoRI sequence, such
vectors being further modifiable by insertions therein
of any appropriate DNA sequence coding for a selected
predetermined polypeptide. Needless to say, any
part of said Sau3AI-EcoRI sequence which is of no
effect on the capability of the signal sequence to be
transcribed or translated can possibly be deleted.
It will however be of particular advantage to
modify it in such manner that said appropriate DNA
sequence be inserted close to the terminal GCG codon of
the signal sequence, for instance by the insertion of
an appropriate linker including a suitable restriction
site within the partial levansucrase gene. Most
preferably however the insertion of said appropriate
DNA sequence ought to take place contiguous to and
immediately downstream from the signal sequence. This
particular mode of insertion must be considered when
the determined polypeptide sought is intended for subsequent
pharmaceutical use. It is then indeed most preferable
that the excreted polypeptide be free of even a short
N-terminal peptide,
which would be foreign to the determined polypeptide.
It is well known that the presence of even but one
additional aminoacid (or the substitution of another
distinct aminoacid for the first natural N-terminal
aminoacid occurring in the polypeptide sought) may
damage the prospect that the polypeptide so modified be
useful for therapeutical purposes.
One manner of achieving the recombination of
the signal sequence and of said appropriate sequence
coding for the predetermined protein under such strict
contiguous conditions involves, starting from a vector
comprising the abovesaid Sau3AI - EcoRI sequence, subjecting
the plasmid to an EcoRI treatment to linearize
it, then subjecting the linearized plasmid to
a treatment with an exonucleolytic enzyme such as Bal31
under conditions controlled so as to remove the 112
base pairs of the partial levansucrase gene, then
ligating the appropriate gene coding for the determined
polypeptide or corresponding insert to the terminal
base pair of the signal sequence by means of a ligase,
such as T4 DNA ligase, and, either simultaneously with the
preceding operation or subsequent thereto, recircularizing
the plasmid to provide a final vector containing
said insert and capable of transforming B. subtilis and
rendering the latter capable of excreting the determined
polypeptide in its culture medium.
It may be noted that the correct removal of
the 112 nucleotide base pairs of the partial levansucrase
gene may have to be checked prior to the subsequent ligation
procedure, e.g. by recloning the exonucleolytically-treated
fragments, recovering the cloned fragments and
analyzing them, such as by the Maxam and Gilbert sequencing
method, thereby identifying the fragment including an intact
signal sequence freed of up to the first nucleotide of the
partial levansucrase gene. The identified fragment can then
be used in the subsequent ligation step mentioned hereabove.
According to a second alternative the starting vector
can be hydrolysed with the HhaI restriction enzyme to
linearize the plasmid at the level of the Ala residue in the
-6 position, with respect to the terminal -1 Ala, i.e. in the
corresponding GCGC site, a synthetic polynucleotide of 17
base pairs then being ligated to the so formed extremity to
restore the final part of the signal sequence coding for the
C-terminal Ala-Thr-Gln-Ala-Phe-Ala sequence of the signal
peptide, the so restricted signal sequence then being ligated
to the appropriate DNA insert as disclosed hereabove.
The synthetic polynucleotide sequence is either
identical to or different from the corresponding natural
sequence, yet with the proviso that it codes for the same
peptidic sequence, for instance with a view to creating an
appropriate restriction site at the level of the codon
coding for the Ala in the -1 position.
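The figure of 17 base pairs can be retraced from the end of the signal sequence. The sketch below uses the last seven codons as read from formula (I), with N standing for bases that cannot be read in the source copy. The cut position within GCGC (top strand cleaved after the third base) is the usual description of HhaI and is an assumption here, since the text does not state it; all names are hypothetical.

```python
import re

# Codons -7 to -1 of the signal sequence (Gly Ala Thr Gln Ala Phe Ala),
# as read from formula (I); N marks a base illegible in the source.
SIGNAL_3PRIME = "GGC GCN ACN CAA GCG TTT GCG".replace(" ", "")

HHAI_SITE = "GCGC"
HHAI_TOP_CUT = 3  # assumption: HhaI cleaves the top strand as GCG^C

# Start positions (0-based) of every HhaI site, overlapping ones included.
sites = [m.start() for m in re.finditer(f"(?={HHAI_SITE})", SIGNAL_3PRIME)]

# Top-strand cut position and the part of the signal sequence lost after it.
cut = sites[0] + HHAI_TOP_CUT
removed = SIGNAL_3PRIME[cut:]

print(sites, cut, removed, len(removed))
```

The single GCGC site straddles the Gly (-7) and Ala (-6) codons, and 17 nucleotides of the signal sequence lie downstream of the cut whatever the unread bases are, which agrees with the length of the synthetic polynucleotide mentioned above.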
A still further alternative for producing a vector including
an appropriate DNA sequence consists in
ligating the vector linearized by the HhaI enzyme as
discussed in relation to the second alternative (and
missing the 17 last nucleotides of the signal sequence)
with the appropriate DNA, however previously modified
with a synthetic polynucleotidic sequence corresponding
to the missing part of the signal sequence upstream from
the starting codon of said appropriate DNA and provided
with the additional nucleotides in the direction opposite
to that of the subsequent transcription and so selected as
to enable the reconstitution of the HhaI site, by digestion
with the endonuclease.
Needless to say, other restriction sites may
be used for achieving similar recombinations. For instance
the preceding second, third or fourth alternatives
may be carried out upon using the MnlI site formed
in the nucleotide double-codon coding for the Gly-Gly
dipeptidic sequence (aminoacids at the -8 and -7 positions).
It will thus be understood that the different
alternatives which have been indicated hereabove are but
illustrative of the many embodiments which can be contemplated
by the man skilled in the art to achieve similar
results.
Further aspects of the invention will appear
as the description of the preferred vectors of this
invention proceeds, particularly in relation to the
drawings in which :
- fig. 1 is a restriction map of an HindIII-EcoRI fragment
which is used below to construct plasmid pBS610 and which
includes a DNA sequence of the invention; and
- fig. 2 and fig. 3 are diagrammatical representations of
the two plasmids constructed below.
1. Construction of plasmid pBS610.
DNA extracted from B. subtilis strain 168, available
at the "Bacillus Genetic Stock Center" of the Department
of Microbiology of the Ohio State University, Columbus,
Ohio, was partially digested with HindIII. After polyacrylamide
gel electrophoresis the 4 kb fragments were cloned
in plasmid pUC8 disclosed by J. Vieira and J. Messing,
Gene, 1982, 19, 259-268. Plasmid pUC8 replicates only in
E. coli. The transformant plasmid was recovered from the
clone which was found to synthesize levansucrase. The
latter was detected by its action on sucrose, which is
hydrolyzed into glucose and fructose polymers (levans)
under the action of levansucrase.
The plasmid recovered was digested with EcoRI and
BamHI. The EcoRI-BamHI fragment which was isolated
encompassed a HindIII-EcoRI fragment of about 2 kb. The
restriction map of the latter fragment is shown diagrammatically
in fig. 1. It contained the DNA sequence of
the invention. The said DNA sequence, shown as a thickened
line in fig. 1, is terminated by the EcoRI site at
the level of the 112th nucleotide of the levansucrase
gene. The HhaI site shown is at the level of the codon
for the aminoacid at the -6 position with respect to
the terminal L-Alanyl (position -1).
The abovesaid BamHI-EcoRI fragment was then recombined
into a new plasmid with the BamHI-EcoRI
fragment of 4 kb obtained from pBR322, after digestion
of the latter with the BamHI and EcoRI enzymes, said
last mentioned fragment also including the replicon elements
responsible for the replication of pBR322 in E. coli.
A HindIII-HindIII fragment of about 2.9 kb,
consisting of plasmid pC194, and including the replicon
elements which enable said pC194 to replicate in
B. subtilis was inserted in the recombinant plasmid
previously obtained, more particularly in the HindIII site
contained in the BamHI-EcoRI fragment including the
DNA sequence of the invention, to finally provide
plasmid pBS610, diagrammatically represented in fig. 2.
Parts shown by arcs a and b (of about 2 kb and 20 base
pairs respectively) originate from the last mentioned
BamHI-EcoRI fragment, part shown by arc c (of about
2.9 kb) from pC194 and part d (of about 4 kb) from
pBR322. The fragments from which pBS610 is formed
are also designated by the names of the plasmids or DNAs
from which they were respectively obtained. pBS610 (comprising
about 8.9 kb) can be replicated in both E. coli and
B. subtilis.
The plasmid pC194 can be
obtained from plasmid pHV33. An E. coli strain (SK 1592)
transformed by pHV33 has been deposited at the
"Collection Nationale des Cultures de Micro-organismes"
(C.N.C.M., National Collection of Cultures of Microorganisms)
of the Pasteur Institute of Paris, France,
under number I-191, as mentioned in the published
European Patent Application 83 400826 published Nov. 2,
1983.
2. Construction of pBS620
A HindIII-EcoRI fragment, containing the DNA
sequence of the invention, analogous to that contemplated
in the preceding example, was obtained by digestion in
the presence of the EcoRI and HindIII endonucleases of
the modified phage obtained by the modification of the
λEMBL3 phage referred to in the beginning of this disclosure.
It comprises the sacRc promoter region of a B.
subtilis synthesizing levansucrase constitutively. A
strain of E. coli containing the so-modified phage has
been deposited at the C.N.C.M. on July 6th, 1984, under
accession Nr. I-314.
The EcoRI-HindIII fragment of 2 kb containing the
DNA sequence of this invention and the sacRc
locus was recombined with the EcoRI-HindIII fragment of
about 4.3 kb isolated from pBR322 and containing the
replicon elements which enable pBR322 to replicate in
E. coli. The plasmid obtained was digested again with
HindIII and recombined with the fragment of 2.9 kb
already mentioned above, from pC194, to provide plasmid
pBS620 (of about 9.2 kb) diagrammatically shown in fig. 3.
The different parts of plasmid pBS620 are identified in
fig. 3 with reference to the plasmids or DNAs from
which they were respectively obtained. Like pBS610,
pBS620 can be replicated in both B. subtilis and E. coli.
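The sizes quoted for the two plasmids can be cross-checked against the sizes of the fragments from which they are assembled. The sketch below only adds up the approximate figures given in the two examples; the dictionary layout and the part labels are hypothetical.

```python
# Approximate fragment sizes in kb, as quoted in examples 1 and 2.
PLASMIDS = {
    "pBS610": {
        "stated_kb": 8.9,
        "parts_kb": {
            "B. subtilis DNA (signal sequence)": 2.0,
            "short segment of about 20 bp": 0.02,
            "pC194 HindIII fragment": 2.9,
            "pBR322 BamHI-EcoRI fragment": 4.0,
        },
    },
    "pBS620": {
        "stated_kb": 9.2,
        "parts_kb": {
            "EcoRI-HindIII fragment with sacRc": 2.0,
            "pBR322 EcoRI-HindIII fragment": 4.3,
            "pC194 HindIII fragment": 2.9,
        },
    },
}

# Sum of the parts for each plasmid, to compare with the stated size.
totals = {name: sum(info["parts_kb"].values()) for name, info in PLASMIDS.items()}

for name, info in PLASMIDS.items():
    print(f"{name}: parts add up to {totals[name]:.2f} kb, "
          f"stated about {info['stated_kb']} kb")
```

Within the rounding of the quoted figures, both sums agree with the stated plasmid sizes.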
These plasmids can be further modified as disclosed
above to substitute therein an appropriate DNA sequence
coding for a determined polypeptide, preferably contiguous
to the signal sequence, for the partial levansucrase gene.
The DNA sequence coding for the determined polypeptide is
possibly prolonged, downstream from the sequence to be
translated, by a linker including an appropriate restriction
site, for instance an HhaI site, when recourse is had, for
making the modification, to the third or fourth
alternatives disclosed hereabove.
It should be understood that the different recombination
steps in vitro which have been referred to in
the examples, may be carried out in any manner known per
se. The different restriction enzymes which have been
used are commercially available. They can be used
according to the manufacturers' recommendations.
The modified plasmids so obtained can then be
used to transform B. subtilis to cause the latter to
effectively produce and excrete the determined polypeptide.
The following procedure can be used.
For instance B. subtilis strain QB666 (C.N.C.M.
I-192) can be used. The transformation may be carried
out according to the known technique described by
DUBNAU, D., DAVIDOFF-ABELSON, R., SCHER, B. and
CIRIGLIANO, C. (1973, J. Bacteriol. 114, 273-286). The
cells containing plasmids conferring antibiotic resistance can
be selected on SP agar plates supplemented with chloramphenicol
(5 micrograms/ml).
The B. subtilis transformants can then be
detected, for instance by bringing into play antibodies
previously prepared and active against the polypeptide
sought. The cells of the detected colonies can then be
recovered, and used for making cell cultures synthesizing
and excreting the polypeptide in the medium. The
polypeptide may then be recovered from the culture
medium in any suitable manner. The λEMBL3 phage has been
disclosed by FRISCHAUF A.M. et al, J. Mol. Biol. (1983) Vol. 170,
pp. 827 - 842.
Another advantageous modification which can be brought
to the final codon of the signal sequence and to the starting
codon of the levansucrase gene consists in the substitution
- of GCC for GCG and
- of GGC (coding for glycine) for AAA.
The GCC/GGC junction so formed can be cleaved as
shown by the vertical line between GCC and GGC (GCC|GGC) to
thereby provide corresponding blunt extremities.
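The effect of this double substitution can be sketched as follows. The example assumes, as the vertical line in the text suggests, that cleavage takes place exactly between the two codons; the enzyme is not named in the text and none is assumed here. The function and variable names are hypothetical.

```python
def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# Natural codons at the processing site: Ala (-1) and Lys (+1).
natural_junction = "GCG" + "AAA"

# Substituted codons: GCC still codes for Ala, GGC codes for Gly.
modified_last, modified_first = "GCC", "GGC"
modified_junction = modified_last + modified_first

# A site equal to its own reverse complement reads identically on both
# strands, so a cut at its centre leaves two flush (blunt) ends.
is_palindromic = modified_junction == reverse_complement(modified_junction)

# Assumed cleavage point: between the third and fourth base (GCC|GGC).
upstream_end = modified_junction[:3]
downstream_start = modified_junction[3:]

print(natural_junction, modified_junction, is_palindromic)
print(upstream_end, downstream_start)
```

After such a cut the upstream fragment ends with the complete final Ala codon of the signal sequence, so that a blunt-ended insert can be joined directly after the last codon of the signal peptide.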