Note: Descriptions are shown in the official language in which they were submitted.
CA 02268004 1999-04-06
TRANSFORMATION SYSTEM IN CANDIDA UTILIS.
Technical Sector.
The present invention relates to the field of genetic
engineering and biotechnology, and in particular to the
development of a host-vector system for the genetic
transformation of the yeast Candida utilis, which permits the
expression and secretion of heterologous proteins in this
yeast for a variety of purposes.
Prior Art.
Genetic engineering and biotechnology have opened
unprecedented possibilities for the production of many
proteins of medical, nutritional or industrial interest,
which bring considerable benefits.
The bacterium Escherichia coli has been the microorganism
most widely employed for these purposes by biotechnology
companies, owing to the knowledge of its genetics, its easy
manipulation and its high-density cultivation systems.
However, the prospects for producing proteins of interest in
this microorganism are limited by several factors. First of
all, the pyrogenic and toxic compounds in the cell wall of
Escherichia coli have led to regulations limiting its use
when the products obtained are intended as medicaments or
food for humans. In addition, proteins over-expressed in
Escherichia coli generally appear in an insoluble form that
cannot be secreted. Furthermore, the mechanisms of
transcription, translation and post-translational
modification differ from those of eukaryotic systems,
resulting in recombinant proteins that differ in some
respects from those obtained from natural sources.
The possibility of producing heterologous proteins in
eukaryotic systems, such as yeast, has some advantages over
prokaryotic systems. Among these, one could
mention the capacity to grow to high cell densities and
the possibility of adapting their cultivation to continuous
systems. Yeasts are also able to secrete proteins into the
culture medium in considerably larger amounts than
Escherichia coli, and the media used for the growth of yeast
are more economical than those used for bacteria
(Lemoine, Y., 1988. Heterologous expression in yeast. 8th
International Biotechnology Symposium, Paris, July 17-22).
These systems can also carry out other post-translational
modifications, such as glycosylation, which is absent in
bacterial systems (Fiers, W., 1988. Engineering Maximal
Expression of Heterologous Gene in Microorganism. 8th
International Biotechnology Symposium, Paris, July 17-22).
In addition, these systems generally show a preference for
the same codon usage as higher eukaryotic systems (Kingsman,
S.M. et al., 1990. Heterologous Gene Expression in
Saccharomyces cerevisiae, Biotechnology & Genetic
Engineering Reviews, 3, Ed. G.E. Russell).
All this has led to the development and dissemination of new
eukaryotic transformation systems, with particular interest
in yeast. These were first described for species of the genus
Saccharomyces, with special emphasis on Saccharomyces
cerevisiae. However, the expression of proteins in
Saccharomyces has met with problems: the expression levels
obtained using its homologous promoters, and the
hyperglycosylation observed in the proteins secreted into
the medium. These are the main reasons why the search for
non-conventional yeasts for the expression of heterologous
proteins has intensified in recent years.
The development of transformation systems in other
non-Saccharomyces yeasts, such as Hansenula polymorpha, Pichia
pastoris and yeasts of the genus Kluyveromyces (Sudbery, P.,
1994. Yeast 10: 1707-1726), has permitted rapid advances in
the knowledge and development of these systems, and has
increased the number of foreign proteins expressed in them
for vaccination, diagnostic and industrial purposes.
Within the genus Candida, several transformation and
expression systems have also been reported, including Candida
tropicalis, Candida boidinii, Candida glabrata, Candida
parapsilosis, Candida maltosa and Candida albicans, all of
marked medical interest because many of these species cause
opportunistic illnesses in humans.
Candida utilis, within the genus Candida, is of special
interest owing to its particular characteristics. First of
all, Candida utilis uses a great variety of inexpensive
carbon sources, such as xylose, sucrose and maltose, among
others. Another interesting feature is that a large amount of
cells can be produced efficiently in continuous culture.
In addition, Candida utilis, like Saccharomyces cerevisiae
and Kluyveromyces lactis, has been approved by the FDA (Food
and Drug Administration) as a safe source of foodstuffs.
Besides, Candida utilis has been used in industry for the
production of L-glutamine, ethyl acetate and invertase, among
other products.
A preliminary transformation system for Candida utilis was
described by Ho, I. et al., 1984 (Biotechnology and
Bioengineering Symp. 14: 295-301). This report is incomplete
because neither the presence of the drug resistance marker
nor direct evidence of the transformation process is
disclosed. More recently, a novel strategy for a
transformation system for Candida utilis was reported by
Kondo, K. et al., 1995 (J. Bacteriol. 177: 7171-7177). They
obtained cycloheximide (CYH) resistant transformants by using
a marker gene containing a mutated form of the ribosomal
protein L41, which conferred
resistance, and also used a ribosomal DNA (rDNA) fragment as
a multicopy target for plasmid integration, because the
marker needs to be present in multiple copies for selection
of CYH-resistant transformants.
Many attempts have been made to use Candida utilis as a host
for heterologous gene expression; nevertheless, a
transformation procedure for Candida utilis using auxotrophic
mutants has not been developed up to now.
Taking into account the knowledge gained in the industrial
exploitation of Candida utilis and the novelty of its
genetics, it can be considered an attractive microorganism
for commercial use as an expression system for heterologous
proteins.
Disclosure of the Invention.
The object of the present invention is to provide a
transformation system useful for expressing heterologous
proteins in the yeast Candida utilis, based on the isolation
of auxotrophic mutants of this species as well as on the
isolation, from a genomic library, of different genes which
complement said auxotrophies.
The transformation process described herein provides means to
introduce DNA fragments or sequences into Candida utilis host
cells and allows Candida utilis to be used as a host system
for gene expression and protein production.
Furthermore, transformed yeast cells can be identified and
selected by the methods described in the present invention.
Novel strains of Candida utilis, vectors and subclones are
provided. Novel yeast strains are used as hosts for
introduction of recombinant DNA fragments.
The invention further relates to the stable transformation
and maintenance of DNA in host cells, where the marker is
integrated into the genome of the yeast by homologous
recombination.
Specifically, the present invention consists of a
transformation system for the yeast Candida utilis which uses
as hosts new auxotrophic mutants isolated from the strain
NRRL Y-1084 of said yeast. These mutants are defective in the
enzyme orotidine-5'-phosphate decarboxylase of the uracil
biosynthetic pathway, or in the histidine biosynthetic
pathway, and were obtained by classical mutagenesis using
both UV and NTG, mutagenic agents known from the prior art
(Sherman, F. et al., 1986. Laboratory Course Manual for
Methods in Yeast Genetics. Cold Spring Harbor Laboratory
Press, NY). These mutants are highly stable (reversion
frequency of approximately 10^-8) and can be efficiently
transformed with the procedure described in this invention.
In addition, two genes were isolated as selection markers
for the Candida utilis mutants: URA3, encoding the enzyme
orotidine-5'-phosphate decarboxylase, and HIS3, encoding the
enzyme imidazoleglycerol-phosphate dehydratase. Both were
isolated from a Candida utilis gene library in pUC19 and
identified by complementation of the pyrF and hisB463
mutations, respectively, in the strain Escherichia coli
MC1066. Similarly, the ura3 mutation of Saccharomyces
cerevisiae strain SEY 2202 was used to identify the URA3
gene from C. utilis. The complete sequence of these genes
was determined, and the predicted amino acid sequences show
high similarity to those of the same genes from other yeasts
and fungi.
The vectors used in the transformation system were the
plasmids pURA5 and pUREC3, which comprise the URA3 gene and
are capable of being integrated into the homologous locus of
the Candida utilis mutant host by homologous recombination.
The present invention also provides a set of plasmids based
on those described formerly, which are used for the
transformation of the mutants isolated from C. utilis in
order to obtain heterologous proteins.
The transformation system of the present invention uses as
hosts new auxotrophic mutants obtained from the strain NRRL
Y-1084 of Candida utilis, which are defective mainly in the
uracil and histidine pathways. Among them, the mutant CUT-35
(ura-) and the mutant TMN-3 (his-) were selected for their
characteristics.
EXAMPLES
Example 1: Mutagenesis of Candida utilis
To develop a transformation system in a microorganism, three
elements are generally required:
(1) a marker for the selection of the transformants, which
may be an auxotrophic or a dominant marker,
(2) a mutant or other appropriate host for this selection and
(3) a method to reproducibly and efficiently introduce the
exogenous DNA into the host.
To obtain the second of these elements, a classical
mutagenesis was carried out in the yeast Candida utilis.
Cultures of the selected yeast strain (NRRL Y-1084) were
inoculated in 100 ml of YPG medium (yeast extract 1%,
peptone 2%, glucose 2%) and incubated in a shaker at 30°C
for 10-20 hours. 50 ml of the culture was centrifuged at
3000 rpm for 5 minutes. The cells were then washed twice
with sterile 0.1 M citrate buffer (pH 5.5) and resuspended
in 50 ml of the same buffer. Next, 10 ml of this suspension
was incubated with a solution of NTG at a final
concentration of 50 mg/ml. The suspension was incubated for
[...] minutes at 30°C without agitation.
The NTG was removed from the suspension by washing twice
with distilled water. The cells were resuspended in 50 ml of
YPG and then transferred to an Erlenmeyer flask with 100 ml
of YPG. This culture of mutagenized cells was incubated at
30°C for 48 hours.
Enrichment with nystatin
Approximately 5 ml of the 48-hour YPG culture was used to
inoculate 100 ml of minimum medium. The minimum medium (YNB,
Yeast Nitrogen Base) used for the enrichment with the
antibiotic was not supplemented with the metabolite produced
by the biosynthetic pathway in which the defect is sought.
For example, for the isolation of uracil auxotrophs, uracil
is not added to the medium.
The incubation was continued until the optical density (OD)
of the culture reached 20 to 30% of the initial OD. When the
culture reached the desired OD, the cell suspension was
treated with a nystatin solution at 25 units/ml. The
suspension with the antibiotic was incubated at 30°C for 30
minutes without agitation. The nystatin was removed from the
medium by washing the cell suspension twice with distilled
water, and the cells were then resuspended in an appropriate
volume in order to obtain 150 to 200 colonies per plate.
Screening and selection.
The mutagenized colonies obtained according to Example 1
were plated on YNB medium with and without uracil. The
colonies that did not grow in the absence of uracil were
taken for further analysis.
Specifically, in order to identify ura3 or ura5 mutants, the
cells were grown in the presence of 5-fluoro-orotic acid
(5-FOA). The resistant colonies were selected as ura3- or
ura5-like mutants.
Example 2: Isolation of ura3 mutants.
After the nystatin enrichment the culture was washed twice
with distilled water and plated directly on YNB plates
containing 0.75 µg/ml of 5-FOA (5-fluoro-orotic acid, Fluka)
and 40 µg/ml of uracil. The plates were incubated for four
days and the colonies that grew were analyzed in order to
check the ura- phenotype. From the 4x10^[...] viable cells
obtained after the nystatin enrichment, 79 colonies showed
resistance to 5-FOA. These colonies could be ura3, ura5, or
simply resistant to 5-FOA. To confirm the uracil auxotrophy,
the putative mutants were plated on YPG medium, incubated for
48 hours at 30°C and replica-plated on YNB plates with and
without uracil. A total of 67 colonies were unable to grow on
YNB without uracil, showing a ura- phenotype.
The reversion frequency of all these mutants was determined.
A group of 23 mutants stood out, with a reversion frequency
in the order of 10^-8, which makes them stable enough to be
used as hosts for a transformation system.
The orotidine-5'-monophosphate decarboxylase (ODCase)
activity of all the uracil auxotrophic mutants was determined
by the method of Yoshimoto et al., 1978 (Methods Enzymol. 51:
74-79), and their growth characteristics were also
determined. These results are shown in Table 1.
Table 1. Summary of the characteristics of the most
significant ura3 mutants.
Name     Reversion frequency   OMPDCase activity   Growth
CUT35    < 5x10^-[...]         -                   +++
CUT43    < 1x10^-[...]         -                   +++
CUT61    < 1x10^-8             -                   +++
CUT65    < 1x10^-[...]         -                   ++
CUT70    < 1x10^-[...]         -                   +
CUT86    < 7x10^-[...]         -                   +++
CUT93    1x10^-[...]           -                   +++
CUT166   6x10^-[...]           -                   +++
Example 3: Isolation of mutants other than those with the
ura- phenotype.
With the objective of having a variety of auxotrophic mutants
other than uracil auxotrophs, the cell suspension obtained
after the nystatin enrichment was plated on YPG and
incubated at 30°C for 50 hours. The colonies on the YPG
plates were then replica-plated on plates containing YNB
medium and incubated at 30°C for 48 hours. The colonies
unable to grow on the YNB plates were taken for further
analysis.
Around 2411 colonies were screened, of which 2% were
auxotrophic mutants. These mutants were checked using the
Holliday and the Fincham tests. Of them, 90% were his-
mutants, 2% corresponded to the phenotype lys-, 1% to the
phenotype leu-, 1% to the phenotype met-, 1% to the phenotype
ade-, and 5% did not show a simple auxotrophic phenotype
(Naa).
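The screening percentages above can be converted into approximate colony counts. The following Python sketch only does that arithmetic; rounding the number of auxotrophs to a whole colony is an assumption of the example, and the per-phenotype values are estimates rather than counts reported in the text.

```python
# Figures quoted in Example 3.
SCREENED_COLONIES = 2411
AUXOTROPH_FRACTION = 0.02
# Share of each phenotype among the auxotrophs, in percent.
PHENOTYPE_SHARES = {"his": 90, "lys": 2, "leu": 1, "met": 1, "ade": 1, "Naa": 5}

# Assumption: round to the nearest whole colony.
auxotrophs = round(SCREENED_COLONIES * AUXOTROPH_FRACTION)

# Estimated number of mutants per phenotype (not rounded, since these
# are back-calculated from rounded percentages).
estimates = {name: auxotrophs * pct / 100 for name, pct in PHENOTYPE_SHARES.items()}

print("auxotrophic mutants:", auxotrophs)
for name, value in estimates.items():
    print(f"{name}: about {value:.1f}")
```

The shares add up to 100%, so the classification accounts for every auxotroph recovered.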
The mutants having a reversion frequency between 10^-[...]
and 10^-[...] were selected for further analysis (Table 2).
Table 2:
Name     Phenotype   Reversion frequency
TMN13    his-        1x10^-[...]
TMN131   his-        1x10^-[...]
TMN164   his-        1x10^-[...]
TMN79    his-        4x10^-[...]
TMN12    his-        5x10^-[...]
TMN3     his-        2.5x10^-[...]
TMN162   his-        8x10^-[...]
TMN174   his-        2x10^-[...]
TMN178   his-        2x10^-[...]
TMN45    lys-        8x10^-6
TMN71    his-        2x10^-6
TMN82    Naa         2x10^-6
Example 4: Construction of a Candida utilis genomic library.
The chromosomal DNA extracted from Candida utilis NRRL Y-1084
was partially digested with the enzyme Sau3A, and fragments
with sizes between 6 and 9 kb were isolated by
electrophoresis in low gelling temperature (LGT) agarose.
These fragments were ligated into the pUC19 vector previously
digested with BamHI and treated with alkaline phosphatase.
The ligation mixture was used to transform the Escherichia
coli MC1066 strain (F-, ΔlacX74, hsr, hsm, rpsL, galU, galK,
trpC9030F, leuB, pyrF::Tn5). Approximately 95% of the clones
obtained in the genomic library were recombinants.
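To judge how many clones such a library needs, the standard Clarke-Carbon estimate N = ln(1 - P) / ln(1 - I/G) can be applied, where I is the insert size, G the genome size and P the desired probability of finding any given sequence. The sketch below is illustrative only: the genome size is a hypothetical placeholder (the text does not give one), and the 7.5 kb insert size is simply the midpoint of the 6-9 kb fraction.

```python
import math

def clones_needed(probability, insert_bp, genome_bp):
    """Clarke-Carbon estimate: N = ln(1 - P) / ln(1 - insert/genome)."""
    fraction = insert_bp / genome_bp
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - fraction))

# Midpoint of the 6-9 kb size fraction used for the library.
MEAN_INSERT_BP = 7_500
# HYPOTHETICAL genome size, chosen only for illustration.
ASSUMED_GENOME_BP = 15_000_000
# "Approximately 95%" of the clones are recombinants.
RECOMBINANT_FRACTION = 0.95

recombinants = clones_needed(0.99, MEAN_INSERT_BP, ASSUMED_GENOME_BP)
# Total colonies to pick so that enough of them carry an insert.
colonies = math.ceil(recombinants / RECOMBINANT_FRACTION)
print(recombinants, colonies)
```

With a different genome size the numbers scale roughly in proportion, so the result should be read as an order of magnitude, not as a property of the library described here.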
Example 5: Isolation of the URA3 gene from Candida utilis.
As a marker for transformation of the Candida utilis ura3
host, the URA3 gene from Candida utilis was isolated and
characterised. DNA fragments containing the Candida utilis
URA3 gene were isolated from a Candida utilis pUC19 genomic
library by their ability to complement the Escherichia coli
pyrF mutation, taking into account that the URA3 gene from
Saccharomyces cerevisiae complements the pyrF mutation of E.
coli by means of fortuitous promoter activity in Escherichia
coli.
When this library was spread on uracil-deficient medium, 12
independent pyrF+ colonies were isolated. Two of these clones
(pURA-2 and pURA-5) carried the same 2.6-kb genomic Candida
utilis DNA insert in pUC19, as shown by HindIII and EcoRI
restriction digestions. DNA from both plasmids transformed
Escherichia coli MC1066 to Ura+ at a high frequency. The map
of one of the Candida utilis URA3 gene-pUC19 recombinant
plasmids (pURA-5) is shown in Figure 1. This plasmid was used
for further complementation and sequence analysis.
Example 6: Demarcation and sequence analysis of the Candida
utilis URA3 gene.
The plasmid pURA5 was digested with several restriction
enzymes. The fragments obtained from the EcoRI (1.9 kb),
HincII (1.3 kb) and SacI (1.1 kb) digestions were subcloned
in pBluescript SK(+), giving rise to the plasmids pUREc-3,
pURHinc-1 and pURSac-4, respectively. The fragment carried by
the plasmid pURSac-4 was not able to complement the pyrF
mutation of Escherichia coli (Figure 2).
The 1.9 kb EcoRI fragment (pUREc-3, Figure 3) containing
the URA3 gene of Candida utilis was completely sequenced on
both strands by the method of Sanger et al. (1977, Proc.
Natl. Acad. Sci. USA 74: 5463-5467).
To this end, universal oligonucleotides of the M13mp/pUC
series, as well as internal oligonucleotides derived from
the sequence, were used. The complete sequence of 1179 bp of
the EcoRI fragment is shown in Figure 4 (Seq. Id. No.: 1, 2).
This fragment contains an open reading frame of 800 bp
(266 codons). The Candida utilis URA3 gene codes for a
protein with a theoretical molecular mass of 29 436 Da. The
nucleotide sequence flanking the ATG initiation codon
(GAAAATG) corresponds well with the consensus reported for
yeast (A/YAA/YAATG) by Cigan and Donahue, 1987 (Gene 59:
1-18).
The 3' untranslated region contains a putative
polyadenylation site (TATAAAA, consensus AATAAAA) of the kind
present in the 3' terminal region of most eukaryotic genes
(Guo, Z. and Sherman, F., 1995. Mol. Cell. Biol. 15:
5983-5990).
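How closely the two motifs quoted above follow their consensus sequences can be counted position by position. The Python sketch below is one way to do so; it assumes that the consensus written as A/YAA/YAATG means (A or Y)-A-(A or Y)-A-ATG, with Y standing for a pyrimidine (C or T). That reading is an interpretation made for this example.

```python
def agreement(observed, consensus):
    """Count the positions where the observed base is allowed by the consensus.

    `consensus` is a list with one string of allowed bases per position.
    """
    if len(observed) != len(consensus):
        raise ValueError("sequence and consensus must have the same length")
    hits = [base in allowed for base, allowed in zip(observed.upper(), consensus)]
    return sum(hits), hits

# Assumed reading of A/YAA/YAATG: (A|C|T) A (A|C|T) A A T G.
INITIATION_CONSENSUS = ["ACT", "A", "ACT", "A", "A", "T", "G"]
# Polyadenylation consensus AATAAAA, one base per position.
POLYA_CONSENSUS = list("AATAAAA")

init_hits, init_mask = agreement("GAAAATG", INITIATION_CONSENSUS)
polya_hits, polya_mask = agreement("TATAAAA", POLYA_CONSENSUS)
print(init_hits, "of", len(INITIATION_CONSENSUS))
print(polya_hits, "of", len(POLYA_CONSENSUS))
```

Under this reading both motifs differ from their consensus only at the first position.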
Example 7: Complementation analysis in Saccharomyces
cerevisiae.
In order to verify that the cloned fragment corresponds to
the Candida utilis URA3 gene and not to a DNA fragment with
suppressor activity, the 2.8 kb KpnI/XbaI fragment of the
pURA5 plasmid was cloned in a pBR322-derived vector
(pBSARTR-3). The pBSARTR-3 vector possesses an autonomous
replicating sequence (ARS1) and the TRP1 gene as selection
marker, both from Saccharomyces cerevisiae. The resulting
plasmid pUT64 (Figure 5) was used to transform the
Saccharomyces cerevisiae strain SEY2202 (ura3-52, leu2-112,
his3) using the lithium acetate method previously reported
by Ito et al., 1983 (J. Bacteriol. 153: 163-168).
The transformants were obtained 48 hours after the
transformation. The presence of the replicative plasmid was
checked using both colony hybridization and Southern blot
experiments.
The transformation frequency obtained (2-5 x 10^2
transformants/mg) is in agreement with that reported in the
literature for other auxotrophic markers from other yeasts.
It was thus demonstrated that the URA3 gene from Candida
utilis is able to complement the ura3 mutation of
Saccharomyces cerevisiae.
Example 8: Transformation of Candida utilis CUT35 with the
plasmids pURA5 and pUCURA3 using the LiAc method.
The ura3 mutant strain of Candida utilis CUT35, phenotype
ura- and deposited with accession number CBS 100C83 at the
Centraalbureau voor Schimmelcultures on October [...], was
transformed using the lithium acetate method reported by
Ito et al. (1983), and using the previously isolated URA3
gene from Candida utilis as a selection marker. The vectors
(pURA5 and pUCURA3) used in the transformation system were
designed to integrate directly into the homologous locus of
the Candida utilis mutant strain by homologous recombination.
The plasmid pUCURA3 was obtained by cloning the 1.9 kb EcoRI
fragment containing the Candida utilis URA3 gene in the
corresponding site of the vector pUC19 (Figure 6). Prior to
the transformation procedure, both plasmids were digested
with XhoI, which is located 5' of the structural gene. The
linearization of the plasmids favors homologous integration
at the genomic locus.
The transformation procedure was performed as previously
described by Ito et al., 1983 for the lithium acetate
procedure, except that the concentration used for LiAc was
30 mM in the present case. In order to carry out this method,
after colony growth on a YPD plate, the selected strain is
cultured with shaking in 5 ml of YPD liquid medium at 30°C
for about 8 hours, then inoculated in 100 ml of YPD liquid
medium at a concentration of OD600 = 0.003 and cultured with
shaking at 30°C for about 1[...] hours. After the cells have
grown to logarithmic phase (OD600 = 0.6), they are collected
by centrifugation at 3000 rpm for 5 minutes. The cells are
washed once with 3 ml of sterilised water and then suspended
in 1 ml of 30 mM LiAc. Afterward they are incubated with
agitation at 30°C for 1 hour. 100 µl of cells are then
aliquoted into a microfuge tube and 5 µg of DNA is added.
Then the mix
of cells and DNA was incubated at 30°C [...]. Afterward
the cells/DNA were treated with 0.7 ml of 40% PEG in
[...] mM LiAc and incubated at 30°C for 1 hour. A heat
shock [...] was then applied in a water bath, followed
immediately by 30 seconds [...]. The mix of cells/DNA was
washed twice in Tris 10 mM, EDTA 1 mM, the supernatant was
removed after centrifugation, and the cells were suspended
in [...] µl of Tris 10 mM, EDTA 1 mM as the final volume.
The selection of the transformants was carried out in minimum
medium lacking uracil. The mitotic stability of these
transformants was high, due to the mechanism of homologous
integration. The transformation frequency coincides with that
reported for Saccharomyces cerevisiae and other
non-conventional yeasts using integrative vectors.
Example 9: Transformation of Candida utilis CUT35 with the
plasmids pURA5 and pUCURA3 using the electroporation method.
The ura3 mutant strain of Candida utilis CUT35 was
transformed using the electroporation method reported by
Kondo, K. et al., 1995 (J. Bacteriol. 177: 7171-7177), using
the previously isolated URA3 gene from Candida utilis as a
selection marker. The vectors (pURA5 and pUCURA3) used in
the transformation system were designed to integrate directly
into the homologous locus of the Candida utilis mutant strain
by homologous recombination.
The procedure is based on the treatment of intact yeast
cells with an electric field. The following conditions were
used: a pulse of 0.7 kV (3.5 kV/cm), a resistance of 800 Ω
and a capacitance of 25 µF.
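Two quantities follow directly from these settings: the electrode gap implied by the voltage and field strength, and the nominal RC time constant of the pulse. The Python sketch below computes both. It assumes an exponential-decay pulse generator in which the set resistance dominates; the sample's own resistance, which the text does not give, would shorten the real time constant.

```python
def electroporation_summary(voltage_kv, field_kv_per_cm, resistance_ohm, capacitance_uf):
    """Return (electrode gap in cm, nominal RC time constant in ms)."""
    gap_cm = voltage_kv / field_kv_per_cm
    # tau = R * C, with C converted from microfarads to farads,
    # then expressed in milliseconds.
    tau_ms = resistance_ohm * (capacitance_uf * 1e-6) * 1e3
    return gap_cm, tau_ms

# Settings quoted in Example 9.
gap_cm, tau_ms = electroporation_summary(0.7, 3.5, 800, 25)
print(f"electrode gap: {gap_cm:.2f} cm")
print(f"nominal time constant: {tau_ms:.1f} ms")
```

The implied gap of about 0.2 cm matches a common cuvette size, which is consistent with the two voltage figures being quoted together.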
Prior to the transformation procedure, both plasmids were
digested with XhoI, which is located 5' of the structural
gene, facilitating homologous integration at the genomic
locus.
The selection of the transformants was done in YNB minimal
medium without uracil.
The transformation frequency using both pURA5 and pUCURA3
depended on the plasmid concentration. A comparison of both
methods (LiAc and electroporation) is shown in Table 3.
Table 3
                               Transformation frequency
                               (transformants/µg)
Vector     DNA concentration   LiAc    Electroporation
           (µg)
pUCURA-3   0.1                 -       70-90
           0.5                 -       640
           3.0                 22      -
pURA-5     0.1                 -       40-50
           0.5                 -       670
           3.0                 21      -
The mitotic stability of these transformants was high, owing
to the mechanism of integration.
The transformation frequencies coincide with those reported
for Saccharomyces cerevisiae and other non-conventional
yeasts using integrative vectors.
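The values in Table 3 can be compared numerically. The Python sketch below takes the highest frequency reported for each method and vector and computes the ratio between them; taking the midpoint of a reported range such as 70-90 is an assumption of the example. The two methods were tested at different DNA amounts, so the ratio is indicative only.

```python
def midpoint(value):
    """Accept a number or a 'low-high' range string and return one number."""
    if isinstance(value, str):
        low, high = (float(x) for x in value.split("-"))
        return (low + high) / 2
    return float(value)

# Table 3: transformants per microgram, keyed by DNA amount in micrograms.
table3 = {
    "pUCURA-3": {"LiAc": {3.0: 22}, "electroporation": {0.1: "70-90", 0.5: 640}},
    "pURA-5": {"LiAc": {3.0: 21}, "electroporation": {0.1: "40-50", 0.5: 670}},
}

fold = {}
for vector, methods in table3.items():
    best_liac = max(midpoint(v) for v in methods["LiAc"].values())
    best_elec = max(midpoint(v) for v in methods["electroporation"].values())
    fold[vector] = best_elec / best_liac

for vector, ratio in fold.items():
    print(f"{vector}: electroporation about {ratio:.0f}-fold higher")
```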
Figure 7 shows an outline of the possible integration events
in the genome of Candida utilis, as well as the Southern blot
of some transformants.
Example 10: Isolation of the HIS3 gene of Candida utilis.
The HIS3 gene from Candida utilis was isolated from the
library described in Example 4 and characterized. DNA
fragments containing the Candida utilis HIS3 gene were
isolated from the Candida utilis genomic library by their
ability to complement the hisB463 mutation in Escherichia
coli KC8 (hsd, hisB463, leuB6, pyrF::Tn5 Kmr, trpC9830,
Δ(lacZYA), strA, galU, galK), taking into account that the
HIS3 gene from Saccharomyces cerevisiae complements the
hisB463 mutation of Escherichia coli by means of fortuitous
promoter activity in Escherichia coli.
In order to isolate the HIS3 gene, 10^5 cells were spread on
minimum medium (M9) supplemented with uracil, tryptophan and
leucine. Plasmid DNA was extracted from colonies able to
grow in this medium and therefore able to complement the
hisB463 mutation of the Escherichia coli KC8 mutant strain.
The isolated plasmid DNAs were used to retransform
the Escherichia coli KC8 mutant strain. All the plasmids able
to complement the histidine requirement of the mutant strain
were named pHCU. In order to confirm that the his+ colonies
contained the HIS3 gene of Candida utilis and not a DNA
fragment with suppressor activity, two of the plasmids
obtained from the his+ transformants (pHCU37 and pHCU40) were
subjected to a PCR reaction. Two degenerate oligonucleotides
derived from two regions highly conserved in five IGPDase
sequences from yeasts and fungi were used. The nucleotide
sequences of the degenerate oligonucleotides and the
corresponding amino acid sequences are shown in Figure 8.
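A degenerate oligonucleotide is a mixture of all sequences obtained by resolving each IUPAC ambiguity code, and the size of that mixture is the product of the number of bases allowed at each position. The Python sketch below shows the calculation. The primer used is a hypothetical example and is not one of the oligonucleotides of Figure 8.

```python
from itertools import product
from math import prod

# Standard IUPAC nucleotide codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    return prod(len(IUPAC[base]) for base in oligo.upper())

def expand(oligo):
    """List every concrete sequence represented by a degenerate oligo."""
    pools = (IUPAC[base] for base in oligo.upper())
    return ["".join(combo) for combo in product(*pools)]

# HYPOTHETICAL primer, for illustration only.
example = "GGNCAYAT"
variants = expand(example)
print(degeneracy(example), variants)
```

Keeping the degeneracy low, for instance by favouring codons frequently used by the organism, raises the proportion of primer molecules that match the target exactly.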
A PCR band of approximately 500 bp, corresponding to the
coding sequence of the HIS3 gene from Candida utilis, was
amplified. This PCR fragment, which was shown by Southern
blot to hybridize to Candida utilis genomic DNA, was cloned
in a T-vector (pMOSBlue, Amersham), and the predicted amino
acid translation of its sequence was shown to be highly
similar to His3p from other yeasts and fungi. The plasmid
pHCU37 (Figure 9) was used for the determination of the
entire sequence of the HIS3 gene from Candida utilis.
Example 11: Sequencing of the HIS3 gene of Candida utilis.
The HIS3 gene from Candida utilis was completely sequenced on
both strands using the method of Sanger et al. (1977).
Oligonucleotides of the universal M13mp/pUC series were used.
Primers taken from the PCR fragment were used to initiate
sequencing of the entire gene. A total of 1190 bp of pHCU37
was sequenced. The entire HIS3 sequence from Candida utilis
is shown in Figure 10 (Seq. Id. No.: 3, 4).
This fragment contains an open reading frame of 210 codons.
The Candida utilis HIS3 gene codes for a protein with a
theoretical molecular mass of 24 518 Da.
Example 12: Isolation of the INV1 gene that codes for the
invertase of Candida utilis.
In order to isolate the INV1 gene that codes for the enzyme
invertase in Candida utilis, advantage was taken of the fact
that the amino acid sequence of this enzyme contains regions
that are highly conserved between different species; the
sequences of β-fructofuranosidases from yeasts were therefore
aligned. Two degenerate oligonucleotides for use in PCR were
designed according to the codon usage of Candida utilis. The
polypeptide sequences, as well as the degenerate
oligonucleotides, are shown in Figure 11.
The PCR generated a band of 417 bp, which was subcloned in a
T-vector (pMOSBlue, Amersham). This band was completely
sequenced, and the translation of the DNA fragment confirmed
the presence of consensus regions and a high homology with
the invertases reported in the literature. This demonstrated
that the isolated fragment belonged to the INV1 gene, which
codes for this enzyme in Candida utilis. This fragment was
used as a probe for isolating the INV1 gene from Candida
utilis.
After screening the Candida utilis library, a total of 6
clones carrying the INV1 gene were isolated. Two of these
clones (pCI-6 and pCI-12) were selected for sequencing on the
basis of their size, using the oligonucleotides employed for
PCR in the previous step. These oligonucleotides were used to
start the complete sequencing of the gene on both strands of
plasmid pCI-6.
Example 13: Sequencing of the INV1 gene of Candida utilis.
A total of 2607 bp of the clone pCI-6, containing the INV1
gene coding for the invertase of Candida utilis, was
completely sequenced by the method of Sanger et al. (1977).
To this end, universal oligonucleotides of the M13mp/pUC
series were used, as well as internal
oligonucleotides derived from the sequence. The complete
sequence of the 2607 bp fragment is shown in Figure 12 (Seq.
Id. No.: 5, 6). Said fragment contains an open reading frame
of 1602 bp (534 codons). The INV1 gene of Candida utilis
codes for a protein with a theoretical molecular weight of
60 703 Da.
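The reading-frame and mass figures given for the three genes can be cross-checked with simple arithmetic: an open reading frame of n codons spans 3n bp, and dividing the theoretical mass by the number of codons gives an average mass per residue that should fall near the roughly 110 Da typical of proteins. The Python sketch below performs this check. Treating every codon as one residue is a simplification of the example.

```python
# (codons, theoretical mass in Da) as quoted in Examples 6, 11 and 13.
GENES = {
    "URA3": (266, 29436),
    "HIS3": (210, 24518),
    "INV1": (534, 60703),
}

def orf_length_bp(codons):
    """Length in base pairs of an open reading frame with this many codons."""
    return 3 * codons

def mean_residue_mass(codons, mass_da):
    """Average mass per residue, counting one residue per codon (simplification)."""
    return mass_da / codons

summary = {
    name: (orf_length_bp(codons), mean_residue_mass(codons, mass))
    for name, (codons, mass) in GENES.items()
}

for name, (length, mean_mass) in summary.items():
    print(f"{name}: {length} bp, about {mean_mass:.1f} Da per residue")
```

For INV1 the computed length reproduces the 1602 bp quoted above, and all three averages lie between 110 and 117 Da per residue.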
Considering that the invertase of Candida utilis is a
periplasmic enzyme, it should have a signal peptide at its
N-terminal end. Analysis of the sequence towards the 5' end
of the gene reveals two ATG codons (ATG1 and ATG2 in Figure
12), giving rise to ORFs coding for proteins that differ
only in the size of their N-terminal ends. Applying von
Heijne's algorithm (1986, Nucl. Acids Res. 14: 4683-4690) to
predict the signal peptidase cleavage site of the mature
protein derived from each ATG reveals that the cleavage
sites are located between residues S39 and S40 for ATG1 and
between residues S26 and S27 for ATG2. This gives rise to
signal peptides of 39 and 26 amino acids respectively. Taking
into account the average size of signal sequences in yeast
(estimated at about 20 residues), it can be suggested that
the start codon of the INV1 gene is the second ATG.
Eleven potential N-glycosylation sites, according to the
general rule N-X-T/S, were found at the asparagines in
positions 40, 88, 141, 187, 245, 277, 344, 348, 365, 373, 379
and 399 of the sequence of the mature protein.
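A search for N-X-T/S sequons of this kind can be sketched with a regular expression. In the Python example below the protein sequence is a short hypothetical string, not the invertase sequence, and the option to skip sequons with proline at position X reflects the commonly applied refinement of the rule rather than anything stated in the text.

```python
import re

def find_sequons(protein, exclude_proline=True):
    """Return 1-based positions of Asn residues in N-X-(S/T) sequons.

    A lookahead is used so that overlapping sequons are all reported.
    """
    middle = "[^P]" if exclude_proline else "."
    pattern = re.compile(rf"N(?={middle}[ST])")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]

# HYPOTHETICAL sequence, for illustration only.
toy_protein = "MNASQNPTLNNTS"

strict = find_sequons(toy_protein)
relaxed = find_sequons(toy_protein, exclude_proline=False)
print(strict)
print(relaxed)
```

Positions are counted from the first residue of the string passed in, so to reproduce numbering relative to the mature protein the signal peptide would have to be removed first.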
The 5' untranslated region shows two possible TATA boxes
(consensus TATAA), in the regions -18 to -14 and -212 to
-208, and also several possible binding sites for the
repressor Mig1 (consensus SYGGRG).
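Motifs written with IUPAC ambiguity codes, such as SYGGRG (S = C or G, Y = C or T, R = A or G), can be located in an upstream region and reported in coordinates relative to the start codon. The Python sketch below shows one way to do this. The upstream sequence is hypothetical, and the convention that the base immediately before the ATG is position -1 is an assumption chosen to match the inclusive ranges quoted above.

```python
import re

# IUPAC codes needed for the motifs discussed here.
IUPAC_CLASSES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "S": "[CG]", "Y": "[CT]", "R": "[AG]", "W": "[AT]", "N": "[ACGT]",
}

def find_motif(upstream, motif):
    """Return inclusive (start, end) coordinates of each motif occurrence.

    `upstream` is the sequence ending just before the ATG, so its last
    base is position -1. Overlapping matches are reported.
    """
    body = "".join(IUPAC_CLASSES[base] for base in motif.upper())
    pattern = re.compile(f"(?=({body}))")
    length = len(upstream)
    hits = []
    for match in pattern.finditer(upstream.upper()):
        start = match.start() - length
        hits.append((start, start + len(motif) - 1))
    return hits

# HYPOTHETICAL upstream region, for illustration only.
toy_upstream = "AACCGGGGTTTATAAGC"

mig1_sites = find_motif(toy_upstream, "SYGGRG")
tata_boxes = find_motif(toy_upstream, "TATAA")
print(mig1_sites, tata_boxes)
```

A match on one strand only is reported here; a fuller search would also scan the reverse complement.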
BRIEF DESCRIPTION OF THE FIGURES:
Figure 1. Plasmid pURA5, obtained from the Candida utilis
genomic library by complementation of the Escherichia coli
MC1066 pyrF and Saccharomyces cerevisiae SEY 2202 ura3
mutations.
Figure 2. Restriction enzyme map, sequencing strategy and
complementation analysis of the URA3 gene from Candida
utilis.
Figure 3. Plasmid pUREC3, obtained by cloning the 1.9 kb
EcoRI fragment of the plasmid pURA5 in pBLUESCRIPT SK(+).
Figure 4. DNA sequence of the URA3 gene and the amino acid
sequence deduced from it.
Figure 5. Plasmid pUT64, obtained for the complementation
experiment in the Saccharomyces cerevisiae ura3 mutant
strain.
Figure 6. Plasmid pUCURA3, used in the Candida utilis
transformation experiments.
Figure 7. (A) Predicted arrangements of the vector DNA
integrated at the URA3 locus by homologous
recombination.
(B) DNA-blot hybridization of genomic DNA from
some transformants.
Figure 8. DNA sequence of the primers used in the isolation
of the HIS3 gene, and the corresponding amino acid sequence.
Figure 9. Plasmid pHCU37, obtained from the Candida utilis
genomic library by complementation of the Escherichia coli
KC8 hisB463 mutation.
Figure 10. DNA sequence of the HIS3 gene and the amino acid
sequence deduced from it.
Figure 11. Amino acid sequence and corresponding DNA sequence
of the oligonucleotides used in the PCR for the isolation of
the INV1 gene of Candida utilis.
Figure 12. DNA sequence corresponding to the fragment
containing the INV1 gene of Candida utilis.
SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT:
(A) NAME: CENTRO DE INGENIERIA GENETICA Y BIOTECNOLOGIA
(B) STREET: AVE. 31 ENTRE 158 Y 190, CUBANACAN, PLAYA.
(C) CITY: CIUDAD DE LA HABANA
(D) STATE: CIUDAD DE LA HABANA
(E) COUNTRY: CUBA
(F) POSTAL CODE (ZIP): 12100
(G) TELEPHONE: 53 7 216013
(H) TELEFAX: 53 7 336008
(ii) TITLE OF INVENTION: TRANSFORMATION SYSTEM IN CANDIDA UTILIS.
(iii) NUMBER OF SEQUENCES: 6
(iv) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)
(vi) PRIOR APPLICATION DATA:
(A) APPLICATION NUMBER: A2/96
(B) FILING DATE: 03-OCT-1996
(2) INFORMATION FOR SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1179 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..1179
(D) OTHER INFORMATION: /product= "Enzyme orotidin-5-phosphate decarboxylase"
/gene= "URA3"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CAAATAGCTC TCTACTTGCT TCTGCTCAAC AAGCTGCTGG AACTGCTGCT GCTCTTTTGG 60
GTTGAATTGG TCCATCCTTG CTACTTTTCC GCCTAGT'.CTC GATTCCGATT CTGATAGAGA 120
AGCCC:AGCTA TC~AATGGAAG AAATTTTTCA CTTTTOT1~TG TCCTTTTTTT CACGCTTCGT 180
TGCTTCGGAC AAAAAAATAG TOGAGGCACT C'GGTOGAL9GG AAGCTATCCT CGAGATGAAA 240
AATTTCAAGC TCATCTCATC GTCCAAGTGC3 GAGAGCAAGC TGAdGCTTCT GAAGAGOTTG 300
AGGAAAATGG TCACC'.ACGTT ATCOTACACA GAGAGO(1<:AT CGCACCCTTC GCCACTTGCT 360
AAGCGTCTGT TTTCGCTTAT GGAGTCCAAG AAGACGAi~CC TGTGTGCCAG TG?CGATGTT 420
CGTACCAC:AG AGGAGTTGCT CAAGC?CGTT GATACGC7.'TG GTCCTTATAT CTGTCTGT?G 480
AAGACGCATA TTGATATCAT TGATGACTTC TCTATGGAGT CTACTGTGGC TCCACTGTTG 540
GAGCTTTCAA AAGAGC.'ACAA TTTCCTCATC TTTGAGCiF~CC GTAAGTTTGC TGATATCGGC 600
AACACCGTCA AGGC,'ACAGTA CGCCGOTGGT GCGTTCAAGA TTGCACAATG GGCAGACATC 660
ACCAACGCCC ACGGTGTCAC CGGTCGAGGT ATCGTCAFvGG GGTTCiAAGGA GGCTGCACAG 720
GAAACCACGG ATGAGCCAAG AGGGCTGTTG ATGCTTGCTG AGCTAAGCTC CAAGGGCTCC 780
TTCGCTCACG GGACA2ATAC CGAGGAGACC OTGGAGA7.'TCi CCAAAACTGA TAAC'sG'ACTT'I 840
TGTATTGGAT TCATCGCACA GAGAGACATG GGTGGCACiAG AAGATGGGTT CGACTGGATC 900
ATCATGACAC CAGGCGTGGG ACTCGACGAT AAGGGCGF~CT CCCTGGGCCA ACAGTACAGA 960
ACTGTCGATG AGGTTGTCAG TGGTGGCTGT CJACATCA7.'CA TCGTTGGTAG AGGCTTGTTT 1020
GGAAAGGGAA GAGATCCAAC AGTGGAACGT GAGCGTTF~TA GAAAACiCAGG CTGGGATGCT 1080
TATCTCAAGA GATACTCAGC TCAATAAACG TTOAGCTC.'rG OCTTGTATAG GTTCACT"TGT 1140
ATAAAATGTT CATTACTGTT TTCGGAAGTT GTAGATTGC 1179
(2) INFORMATION FOR SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 266 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..266
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Val Thr Thr Leu Ser Tyr Thr Glu Arg Ala Ser His Pro Ser Pro
1 5 10 15
Leu Ala Lys Arg Leu Phe Ser Leu Met Glu Ser Lys Lys Thr Asn Leu
20 25 30
Cys Ala Ser Val Asp Val Arg Thr Thr Glu Glu Leu Leu Lys Leu Val
35 40 45
Asp Thr Leu Gly Pro Tyr Ile Cys Leu Leu Lys Thr His Ile Asp Ile
50 55 60
Ile Asp Asp Phe Ser Met Glu Ser Thr Val Ala Pro Leu Leu Glu Leu
65 70 75 80
Ser Lys Glu His Asn Phe Leu Ile Phe Glu Asp Arg Lys Phe Ala Asp
85 90 95
Ile Gly Asn Thr Val Lys Ala Gln Tyr Ala Gly Gly Ala Phe Lys Ile
100 105 110
Ala Gln Trp Ala Asp Ile Thr Asn Ala His Gly Val Thr Gly Arg Gly
115 120 125
Ile Val Lys Gly Leu Lys Glu Ala Ala Gln Glu Thr Thr Asp Glu Pro
130 135 140
Arg Gly Leu Leu Met Leu Ala Glu Leu Ser Ser Lys Gly Ser Phe Ala
145 150 155 160
His Gly Thr Tyr Thr Glu Glu Thr Val Glu Ile Ala Lys Thr Asp Lys
165 170 175
Asp Phe Cys Ile Gly Phe Ile Ala Gln Arg Asp Met Gly Gly Arg Glu
180 185 190
Asp Gly Phe Asp Trp Ile Ile Met Thr Pro Gly Val Gly Leu Asp Asp
195 200 205
Lys Gly Asp Ser Leu Gly Gln Gln Tyr Arg Thr Val Asp Glu Val Val
210 215 220
Ser Gly Gly Cys Asp Ile Ile Ile Val Gly Arg Gly Leu Phe Gly Lys
225 230 235 240
Gly Arg Asp Pro Thr Val Glu Gly Glu Arg Tyr Arg Lys Ala Gly Trp
245 250 255
Asp Ala Tyr Leu Lys Arg Tyr Ser Ala Gln
260 265
(2) INFORMATION FOR SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1190 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..1190
(D) OTHER INFORMATION: /product= "Enzyme imidazol-glycerol phosphate dehydratase"
/gene= "HIS3"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
ACCTCCCAAT CGCACAGGCA ACGATACAAA TTGAACGA3T ATTAACCATC TTCTGTGCTA 60
AAAAGAGTCG AAGAACRACA GTGCGCCRAA AAARAAAC'iC CGGACCGCAC ACGACTCATC 120
GCTCTCGGAA TATCCCTCGG AATGCGCCAC TTCCGGCiT~:~C GTGGCCATCG GAAGAGCGAA 180
GAO?CATCAC CATCGTACTT TAACGACTTA CTATTCTCAT TGAGTATTGA GAAGAAGGAT 240
AGAGAAATGG CTGAACGAAC GGTGAAACCC CAGAGRAG.AG CTCTTGTGAA TCGTACAACA 300
AACGAAACGA AGATCCAGAT TTCCTTGAGT TTGGATGG'TG GATACGTAAC GGTTCCGGAG 360
TCAATCTTCA AGGATAAGAA GTACGACGAT GCTACTCA~AO TCACCTCTTC TCAGGTGATT 420
TCAATCAACA CGGGCGTTGG ATTCCTGGAC CACATGATCC ATGCTCTTGC GAAGCATGGT 480
GGGTGGAGTT TGAT?GTGGA GTGTATTGGT GATTTGCACA TTGACGACCA CCACACCACC 540
GACGACGTTG GTATTGCGCT GGGAGACGCC GTCAAGGA~3G CCTTGGCATA TAGAGGTGTC 600
AAGAGATTTG GTAGCGGGTT TGCTCCATTG GACGAOGC'TC TGAGCAGAGC CGTTGTTGAT 660
CTGAOTAACC GTCCOTTTGC CGTTGTTGAG CTGGGACTCA AGAGGGAAAA GATCGGTGAC 720
TTGTCATGTG AGATGATTCC TCACTTCTTG GAGAGTTT'TG CCCAAGCAGC TCATATCACG 780
ATGCATGTTG ACTGTTTGAG AOGCTTCAAC GACCATCAi~A GAGCTGAATC CGCATTCAAG 840
GCCCTGGCRG TCGCCATTAA GGAATCCATC TCCAO?AACG GCACCAATGA TGTTCCCTCA 900
ACAAAGGGTG TTTTGTTCTA GATAGCAGTC TTTCTGTC'TC TCTATTTATT CGATAAATAA 960
GAACTATGTA TATCTTTCTC TTTTAATTGT ATATOTACAT GCACAGCTGA CTTCATCAAC 1020
GGAAGATGTT ATTOAGTGCA GCCATTOTCT GACTGTCGTT ATCCTTCTTT GCGGATTTAC 1080
CAAGGACTCT ACGACCACTG GTOOCTTTGA TATGATT'CCC T~3CCAGTACT TGTAACAGOT 1140
GCAACGTCAA TGGAAACOOC ACCGTTAGCC TTGATCiO'.CIG CACGGGTAGG 1190
(2) INFORMATION FOR SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 210 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..210
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:
Met Ala Glu Arg Thr Val Lys Pro Gln Arg Arg Ala Leu Val Asn Arg
1 5 10 15
Thr Thr Asn Glu Thr Lys Ile Gln Ile Ser Leu Ser Leu Asp Gly Gly
20 25 30
Tyr Val Thr Val Pro Glu Ser Ile Phe Lys Asp Lys Lys Tyr Asp Asp
35 40 45
Ala Thr Gln Val Thr Ser Ser Gln Val Ile Ser Ile Asn Thr Gly Val
50 55 60
Gly Phe Leu Asp His Met Ile His Ala Leu Ala Lys His Gly Gly Trp
65 70 75 80
Ser Leu Ile Val Glu Cys Ile Gly Asp Leu His Ile Asp Asp His His
85 90 95
Thr Thr Glu Asp Val Gly Ile Ala Leu Gly Asp Ala Val Lys Glu Ala
100 105 110
Leu Ala Tyr Arg Gly Val Lys Arg Phe Gly Ser Gly Phe Ala Pro Leu
115 120 125
Asp Glu Ala Leu Ser Arg Ala Val Val Asp Leu Ser Asn Arg Pro Phe
130 135 140
Ala Val Val Glu Leu Gly Leu Lys Arg Glu Lys Ile Gly Asp Leu Ser
145 150 155 160
Cys Glu Met Ile Pro His Phe Leu Glu Ser Phe Ala Gln Ala Ala His
165 170 175
Ile Thr Met His Val Asp Cys Leu Arg Gly Phe Asn Asp His His Arg
180 185 190
Ala Glu Ser Ala Phe Lys Ala Leu Ala Val Ala Ile Lys Glu Ser Ile
195 200 205
Ser Ser
210
(2) INFORMATION FOR SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 2607 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: mat_peptide
(B) LOCATION: 1..2607
(D) OTHER INFORMATION: /product= "Enzyme invertase (beta-fructofuranosidase)"
/gene= "INV1"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
ATCGGCACAG AAGCGACACT GATGTCCTCC GTCTAAAACT CATCGTTTAA TAACTTCTGC 60
ATTGGCAGCT CCGGAGCACA CTCAATTGGCi ACTAAAAGF,A GTAACATTTa TACTACAATG 120
AGTCGTATAG AGTCATGTAT AAGAAGAACA GCAAGAAAA,G AAAATATTGG TGCAGAATTC 180
AACAGCTTCT GAGATCGTAA GAACAGCCAA TCATTTACC:G GAATTCATTA TGATACCTAT 240
AGRAAGACAC AAATTGTTGG GTAAAACAAC AGAACATAC!C TGTATAGGGG TTTATACGAG 300
AATTTTCTTA GACGTCTCCC CCAGTGTCCG CC,'AAAGCAA,C TTACATGTGG AOTTTGAATT 360
TGGATGCGCC TTTTCCTTTA AACGGTCACC TGAGGTCTGA ATCTCAATGC AAATATCATT 420
ACACCAATRA TAAAGGTGCA TATAACCCCA TAACCTGTAC ATAAA4AACG GCRCATGATC 480
Ci0.ATTTATCGACGTTATGCC TTGTCAGACC ATCGTCGTGA 540
AC?TTTCTAA ACCGGATAAA
CTCTCGCACG OATTATAACG TGCGTCTGTG ATATGCAC'CC CGGRAAAAF.C CCCCGTGGAG 600
AAGTGAAGCG GCCACCTGTG GAGCAGRAAT TTCGATCGi~C OTTTCAAGTT CAAATGGTTT 660
CCTGTTGTCA AAGGOCTTGA GATTTACCAC TTGACCAT'CT GTGCTCAGAA TTCGGAGAGC 720
ATTCCCATGA GTGGTGTCCA AAAAGACTAT ARAAGCAGCA CAGGGATGTC GTTGACAAAA 780
GATGCCTCAG AGGACCAAGA AGACATCAAG AGTCTt:ACGA TGAACACTA(i TTTAGTTGAT 840
TCCAGCATTT ACAGACCATT AGTCCATCTA ACGCCACCAG TGGGGTGGAT GAACGACCCT 900
AATGGTCTCT ?CTACGATTC ATCTGAATCT ACTTACCA'CG TGTACTACCA ATACAACCCA 960
AACGATACGA TTTGGGGATT GCCTCTATAT TGGGGAt:Jt.'.CG CCACCTCTGA TGATTTGTTA 1020
ACGTGGGACC ACCATGCGCC TOCAATTGGA CCTGAGAA'CG ATGATGAGGG TATTTACTCT 1080
GGATCTATAG TCATAGACTA CGATAATACC TCAGGGT"Tt~2 TTGACOA'rTC AACAAGACCA 1140
GAACAGAGAA TCGTTGCCAT TTATACCAAT AACTTACCi~G ATGTCGAGAC GCAAGACATT 1200
GCCTATTCCA CGGACGGTGG TTATACTTTC GAAAAGTA'CG AAAACAACCC AGTTATAGAC 1260
GTCiaATTCGA CCCAATTTAG GGATCCGAAG GTGATTTGL3T ATGAGGAAAC TGAACAATGG 1320
GTCATGACTG TGGCRAAGAG TCi4AGAGTAC AAGATCCAc3A TTTACACCTC TGACAATTTG 1380
AAAGACTGGA GTTTGGCCTC GAATTTCTCA ACCAAGGG',CT ATOTTGGTTA TCAGTATGAA 1440
TGTCt:AGGTC TATTCGAAGC CACTATTGAA AACCCAAAGA GTGGTGACCC AGAGAAGAAA 1500
TGGGTTATGG TCTTAGCAAT CAATCCAGGC TCACCTCT'CG GTGGTTCCAT AAATGAATAC 1560
TTTGTTGGTG ATTTCAACGG TACTGAATTC ATTCCAGA'CG ATGACGCTAC AAGATTTATG 1620
GATACTGGTA AGGACTTCTA TGCCTTCCAA OCGTTCTTCA ATGCACCGGA GAATCGGTCA 1680
ATTGGAGTTG CCTGGTCATC GAACTGGCAL3 TATTCCAACC AGG?TCCGGA TCCTGATOGA 1740
TA?AGAAGCT CCATGTCATC AATCAGAGAG TACACTCTt3A GATATOTCAG TACGAATCCA 1800
GAATCTGAAC AGTTGATCCT TTGTCAAAAA CCATTCTT'CG TGAACGAGAC AQACTTGAAG 1860
GTGGTTGAAG AGTACAAGGT TTCAAACAG? TCTTTQACCG TGGACCACAC 0?TTGGAAG? 1920
AGCTTTGCAA ACTCCAACAC CACTGGACTG TTGGATTTt:a1 ACATGACTTT CACGGTTAAC 1980
GGTACAACTG ACGTTACGCA GAAOGACTCC GTCACCTT'CCi AGCTCAGAAT CAAATCTAAC 2040
CAAAGCGACG AGGCAATTGC GCTTOGTTAC GATTACAACA ACGAGCRATT CTACATCAAC 2100
AGAGCCACAG AGAGCTACTT CCAGAGAACC AACCAGTTCT TCCAGGAGAG ATGGTCCACG 2160
TACGTTCAGC CTCTCACAAT CACCGAATCT GGTGATRAa0.C AGTACCAGCT CTACGGATTG 2220
GT'TGATAACA ACATCCTTGA GTTGTACTTC AACGACGG~:~(3 CATTCACATC CACAAACACC 2280
TTCTTCTTGG AGAAGGGCAA GCCATCAAAC GTCGATATCG TOC3CAAGCTC CTCCAAGGAG 2340
GCTTACCACC GTGGACCAQC TGACTGAGAC GTCTCACTOT TTGACGAATA CGCACGTGAA 2400
AGCTATATAA GGGATCACGT GGTCTAGCCA CCCCAGTC'.CA AAAOCTTCAG CAAACCGCCA 2460
CTATATAAAC AGACAGGTTT GTCACTTTTC AACAAAAGiA ATATCTTCTT CTTTTACCCT 2520
TCAGAGTAGT TTGTACGAGT GCTTTTTTCA ATTATATA'.CA CAACAACGTG AGCTGCCTTT 2580
GGATATGCAA TCAACAC3CGC TCTCTTT 2607
(2) INFORMATION FOR SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 533 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO
(vi) ORIGINAL SOURCE:
(A) ORGANISM: Candida utilis
(B) STRAIN: NRRL Y-1084
(ix) FEATURE:
(A) NAME/KEY: Protein
(B) LOCATION: 1..533
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:
Met Ser Leu Thr Lys Asp Ala Ser Glu Asp Gln Glu Asp Ile Lys Ser
1 5 10 15
Leu Thr Met Asn Thr Ser Leu Val Asp Ser Ser Ile Tyr Arg Pro Leu
20 25 30
Val His Leu Thr Pro Pro Val Gly Trp Met Asn Asp Pro Asn Gly Leu
35 40 45
Phe Tyr Asp Ser Ser Glu Ser Thr Tyr His Val Tyr Tyr Gln Tyr Asn
50 55 60
Pro Asn Asp Thr Ile Trp Gly Leu Pro Leu Tyr Trp Gly His Ala Thr
65 70 75 80
Ser Asp Asp Leu Leu Thr Trp Asp His His Ala Pro Ala Ile Gly Pro
85 90 95
Glu Asn Asp Asp Glu Gly Ile Tyr Ser Gly Ser Ile Val Ile Asp Tyr
100 105 110
Asp Asn Thr Ser Gly Phe Phe Asp Asp Ser Thr Arg Pro Glu Gln Arg
115 120 125
Ile Val Ala Ile Tyr Thr Asn Asn Leu Pro Asp Val Glu Thr Gln Asp
130 135 140
Ile Ala Tyr Ser Thr Asp Gly Gly Tyr Thr Phe Glu Lys Tyr Glu Asn
145 150 155 160
Asn Pro Val Ile Asp Val Asn Ser Thr Gln Phe Arg Asp Pro Lys Val
165 170 175
Ile Trp Tyr Glu Glu Thr Glu Gln Trp Val Met Thr Val Ala Lys Ser
180 185 190
Gln Glu Tyr Lys Ile Gln Ile Tyr Thr Ser Asp Asn Leu Lys Asp Trp
195 200 205
Ser Leu Ala Ser Asn Phe Ser Thr Lys Gly Tyr Val Gly Tyr Gln Tyr
210 215 220
Glu Cys Pro Gly Leu Phe Glu Ala Thr Ile Glu Asn Pro Lys Ser Gly
225 230 235 240
Asp Pro Glu Lys Lys Trp Val Met Val Leu Ala Ile Asn Pro Gly Ser
245 250 255
Pro Leu Gly Gly Ser Ile Asn Glu Tyr Phe Val Gly Asp Phe Asn Gly
260 265 270
Thr Glu Phe Ile Pro Asp Asp Asp Ala Thr Arg Phe Met Asp Thr Gly
275 280 285
Lys Asp Phe Tyr Ala Phe Gln Ala Phe Phe Asn Ala Pro Glu Asn Arg
290 295 300
Ser Ile Gly Val Ala Trp Ser Ser Asn Trp Gln Tyr Ser Asn Gln Val
305 310 315 320
Pro Asp Pro Asp Gly Tyr Arg Ser Ser Met Ser Ser Ile Arg Glu Tyr
325 330 335
Thr Leu Arg Tyr Val Ser Thr Asn Pro Glu Ser Glu Gln Leu Ile Leu
340 345 350
Cys Gln Lys Pro Phe Phe Val Asn Glu Thr Asp Leu Lys Val Val Glu
355 360 365
Glu Tyr Lys Val Ser Asn Ser Ser Leu Thr Val Asp His Thr Phe Gly
370 375 380
Ser Ser Phe Ala Asn Ser Asn Thr Thr Gly Leu Leu Asp Phe Asn Met
385 390 395 400
Thr Phe Thr Val Asn Gly Thr Thr Asp Val Thr Gln Lys Asp Ser Val
405 410 415
Thr Phe Glu Leu Arg Ile Lys Ser Asn Gln Ser Asp Glu Ala Ile Ala
420 425 430
Leu Gly Tyr Asp Tyr Asn Asn Glu Gln Phe Tyr Ile Asn Arg Ala Thr
435 440 445
Glu Ser Tyr Phe Gln Arg Thr Asn Gln Phe Phe Gln Glu Arg Trp Ser
450 455 460
Thr Tyr Val Gln Pro Leu Thr Ile Thr Glu Ser Gly Asp Lys Gln Tyr
465 470 475 480
Gln Leu Tyr Gly Leu Val Asp Asn Asn Ile Leu Glu Leu Tyr Phe Asn
485 490 495
Asp Gly Ala Phe Thr Ser Thr Asn Thr Phe Phe Leu Glu Lys Gly Lys
500 505 510
Pro Ser Asn Val Asp Ile Val Ala Ser Ser Ser Lys Glu Ala Tyr His
515 520 525
Arg Gly Pro Ala Asp
530