Note: Descriptions are shown in the official language in which they were submitted.
WO 98/12308   CA 02264487 1999-04-08   PCT/IT97/00228

SOLUBLE POLYPEPTIDES WITH ACTIVITY OF THE NS3 SERINE PROTEASE OF HEPATITIS C VIRUS, AND PROCESS FOR THEIR PREPARATION AND ISOLATION

DESCRIPTION

The hepatitis C virus (HCV) is the main etiologic agent of non-A, non-B hepatitis (NANB). It is estimated that HCV causes at least 90% of post-transfusional NANB viral hepatitis and 50% of sporadic NANB hepatitis. Although great progress has been made in the selection of blood donors and in the immunological characterisation of blood used for transfusions, there is still a high level of acute HCV infection among those receiving blood transfusions, resulting in one million or more infections every year throughout the world. Approximately 50% of HCV infected individuals develop cirrhosis of the liver within a period that can range from 5 to 40 years, and recent clinical studies suggest that there is a correlation between chronic HCV infection and the development of hepatocellular carcinoma.

HCV is an enveloped virus containing a positive-strand RNA genome of approximately 9.4 kb. This virus is a member of the Flaviviridae family, the other members of which are the pestiviruses and flaviviruses.

The RNA genome of HCV has recently been sequenced. Comparison of sequences from the HCV genomes isolated in various parts of the world has shown that these sequences can be extremely heterogeneous. Most of the HCV genome is occupied by an open reading frame (ORF) that can vary between 9030 and 9099 nucleotides. This ORF codes for a single viral polyprotein, the length of which can accordingly vary from 3010 to 3033 amino acids. During the virus infection cycle, the polyprotein is proteolytically processed into the individual gene products necessary for replication of the virus.

The genes coding for HCV structural proteins are located at the 5' end of the ORF, whereas the region coding for the non-structural proteins occupies the rest of the ORF.
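The relation between the ORF length and the polyprotein length quoted above is a plain division by the codon size. The following is a minimal sketch of that arithmetic; the function name is illustrative, and it assumes that the quoted nucleotide counts cover sense codons only (no stop codon), which is what the figures in the text imply.

```python
def polyprotein_length(orf_nucleotides: int) -> int:
    """Return the number of amino acids encoded by an ORF.

    Assumption (not stated in the description): the nucleotide count
    covers sense codons only, so every triplet encodes one residue.
    """
    if orf_nucleotides <= 0 or orf_nucleotides % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_nucleotides // 3


# Lower and upper ORF lengths quoted in the description
shortest = polyprotein_length(9030)
longest = polyprotein_length(9099)
print(shortest, longest)
```

Both bounds are exact multiples of three, so the quoted amino acid range follows directly from the quoted nucleotide range.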
The structural proteins consist of: C (core, 21 kDa), E1 (envelope, gp37) and E2 (NS1, gp61). C is a non-glycosylated protein of 21 kDa, which probably forms the viral nucleocapsid. The protein E1 is a glycoprotein of approximately 37 kDa and is believed to be a structural protein of the outer viral envelope. E2, another membrane glycoprotein of 61 kDa, is probably a second structural protein of the outer envelope of the virus.

With regard to HCV structural regions, truncated peptides derived therefrom are disclosed in a previous patent application (WO 92/12992), only as having epitopes for HCV, and therefore solely for their immunological properties. They are in fact considered therein as peptides suitable for developing diagnostic methods or methods for preventing and/or treating HCV disease.

Solely immunological properties also characterise the immunogenic peptides disclosed in the article of Khudyakov et al. in Virology vol. 206, 1995, pages 666-672, which correspond however to the non-structural proteins of HCV coded by the non-structural regions of the virus.

The non-structural region starts with NS2 (p24), a hydrophobic protein of 24 kDa whose function is not known. NS3, a protein of 68 kDa which follows NS2 in the polyprotein, has two functional domains: a serine protease domain in the first 180 amino-terminal amino acids and an RNA-dependent ATPase domain in the carboxy-terminal part. The gene region corresponding to NS4 codes for NS4A (p6), a membrane protein of 54 amino acids, and NS4B (p26). The gene corresponding to NS5 codes for two proteins, NS5A (p56) and NS5B (p65), of 56 and 65 kDa, respectively. Recently it has been shown that the NS5B region has an RNA-dependent RNA polymerase activity (1).

Various molecular biological studies indicate that the signal peptidase, a protease associated with the endoplasmic reticulum of the host cell, is responsible for proteolytic processing in the structural region, that is to say at the sites C/E1, E1/E2 and E2/NS2 (2). A first protease activity of HCV is responsible for the cleavage between NS2 and NS3. This activity is contained in a region comprising both a part of NS2 and the part of NS3 containing the serine protease domain, but does not use the same catalytic mechanism (3). On the contrary, the serine protease contained in the 180 amino acids at the amino-terminal of NS3 is responsible for cleavage at the junctions between NS3 and NS4A, between NS4A and NS4B, between NS4B and NS5A, and between NS5A and NS5B (4-8). In particular it has been found that the cleavage produced by this serine protease leaves a residue of cysteine or threonine on the amino-terminal side (position P1) and a residue of alanine or serine on the carboxy-terminal side (position P1') of the substrate (6, 9). Recently it has been shown that NS4A binds the N-terminal end of NS3 with its central hydrophobic portion, thereby increasing the proteolytic activity of NS3 at all the cleavage sites on the polyprotein (10-12).

The NS3 protein also contains amino acid motifs of an RNA helicase, positioned in the C-terminal domain of the protein. An article by Kim D. W. et al., in Biochem. Biophys. Research Communications vol. 215, no. 1, 1995, pages 160-166, discloses a recombinant protein comprising 466 amino acids from the C-terminal of HCV NS3, purified by a His-tag expression system, which does in fact possess an RNA helicase activity. The His tail attached thereto has the sole function of enabling an easier purification of the protein.

Inhibition of the protease activity would therefore stop the proteolytic processing of the non-structural portion of the HCV polyprotein and, as a consequence, would prevent virus replication in infected cells.
This sequence of events has been verified for a flavivirus, homologous to the hepatitis C virus, which infects cells in culture. In this case it has been possible to show that genetic manipulation, producing a protease that is no longer capable of exerting its catalytic activity, abolishes the ability of the virus to replicate (13). Furthermore it has been widely demonstrated, both in vitro and in clinical studies, that compounds capable of interfering with the activity of the HIV protease are capable of inhibiting the replication of this virus (14). Finally there is evidence that the NS5 region of HCV, which as mentioned above has an RNA-dependent RNA polymerase activity, does not display this function except after processing by the NS3 protease.

Therefore a substance capable of interfering with the proteolytic activity associated with the NS3 protein could be a new therapeutic agent. From this point of view detailed knowledge of the three-dimensional structure of the protease takes on a great deal of importance, as it would allow both a greater understanding of the biological phenomena in which it is involved, and the analysis, study and design of inhibitor molecules capable of interfering with the protease activity, thus paving the way for the development of pharmaceutical compositions suitable for treatment of hepatitis C. Nevertheless, determination of the structure, whether using NMR methods or X-ray crystallography, requires large amounts of soluble protein, and at the present time it is not possible to meet this requirement.
In fact, although the simplest and most economical manner of obtaining large amounts of the desired polypeptide is expression of the corresponding gene in bacteria, and although there is a widespread availability of numerous eucaryotic promoters and methods for maximising the expression of heterologous genes in E. coli, nevertheless an efficient production of the polypeptide in question, although necessary, might not be sufficient. Many recombinant proteins do not fold the polypeptidic chain correctly when they are expressed in E. coli. The result is the synthesis of polypeptides which are either degraded in the host cell, or are accumulated in an insoluble form in the so-called inclusion bodies (15). Furthermore, in the case of extremely hydrophobic proteins, proteins of viral origin or proteins that are toxic for the bacterial cell (as is the case for certain proteases of viral origin) there are insurmountable difficulties in producing them in a native, soluble form.

In the case of the NS3 serine protease of the hepatitis C virus, due to the conditions in which the protein is normally produced, it has not been possible to date to obtain in E. coli a native type, soluble protease in amounts sufficient to enable the study of the structural nature of this protein, which requires solutions containing a high millimolar concentration of the protein.

It has now unexpectedly been found that these important limitations can be overcome by using the method according to the present invention. As will be seen from the following, this method is based on the unexpected discovery that the NS3 serine protease domain, in its native conformation, binds a Zn2+ ion.

Because, as mentioned above, the structure of the HCV NS3 protease is not yet known, a structural model of the protein was prepared, to be used as a guide during experiments.
However, the similarity of the NS3 protease to other serine proteases of known structure is extremely low (less than 15%), which does not allow good alignment between sequences and as a result does not allow construction of a three-dimensional model based solely on homology. For this reason, the available serine protease structures were used to build a multiple alignment of the structurally conserved regions and to draw up in this way a profile with which the sequence of the NS3 protease could subsequently be aligned. In this way it was possible to build an approximate three-dimensional model of the HCV NS3 protease (9, 16).

Recently, three new viruses responsible for human hepatitis have been discovered (17). These new viruses, known as GBV-A, GBV-B and GBV-C, show a polyprotein organisation in common with that of HCV (18, 19). From alignment of the region corresponding to NS3 in these three new viruses with that of various HCV serotypes, several preserved amino acids were identified. These residues comprise: the amino acids in the active site, some glycines and prolines (probably involved in stabilising the structure of the protein) and three cysteines and one histidine (figure 1). In the model suggested by us for the NS3 protease these last four residues are found in a region of the molecule opposite the active site, in a close spatial relationship, and their relative position is such that it forms a binding site for a divalent metallic ion, such as for example the ion Zn2+ (figure 2).

This observation was subsequently confirmed experimentally. In fact, as will be illustrated in greater detail in the examples, the HCV NS3 protease actually has a metal content equivalent to one mole of zinc to each mole of protein, and as is the case in other proteins the zinc is necessary to enable the protein to take on its native structure and become catalytically active (20, 21).

The fact that the NS3 protease has a binding site for a metal ion and that this binding site is so well preserved, even in viruses that are not phylogenetically close, opens the way to the study of antiviral therapeutic agents whose target site is this very region of the protein. In fact, in the case of another viral protein that binds Zn2+ ions, that is to say the HIV virus nucleocapsid, it has been possible to identify compounds that interfere selectively with the bond between the protein and the Zn2+ ions (22, 23) and it has also been seen that these compounds interfere with the viral infection of cells grown in culture medium.

An object of the present invention is therefore to provide a method for high-yield expression, in a native form, that is to say as a protein containing a bivalent metallic ion, and in a highly soluble form, of the HCV NS3 protease using heterologous expression systems, such as E. coli cells transformed using suitable genetic constructs and cultivated in a medium enriched with salts containing divalent metal ions.

A further object of the present invention is to provide a general method allowing preparation and isolation in a native, pure and highly soluble form, of large amounts of polypeptides (containing Zn2+, Co2+ or Cd2+) with the protease activity of HCV NS3.

Furthermore, an additional object of the present invention is to provide a method that allows preparation and isolation in a native, pure and highly soluble form of large amounts of polypeptides with the protease activity of HCV NS3, which are at the same time labelled using stable heavy isotopes such as 13C or 15N, as required for experiments to determine the three-dimensional structure of the protein using NMR.

Finally, the present invention provides new genetic constructs for the expression, in E. coli
cells, of modified polypeptides with the protease activity of HCV NS3, having a high yield of the native and soluble form of the HCV NS3 protease.

These and other objects are achieved using one or more of the embodiments of the present invention described below.

In an embodiment of the invention a procedure is provided for obtaining production of the NS3 serine protease domain in its native form, that is to say containing a bivalent metal ion, which is necessary for the structural integrity of the protein. The innovation in the procedure consists in the addition, to the culture medium in which the transformed bacterial cells are grown, of compounds containing metals such as Zn, Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V. These compounds provide the culture medium with the ions required by the protein to take on its native structure. In this way the protein is found in its native, soluble form in the cytoplasm of the bacterial cells, instead of being held in the inclusion bodies, from which it can only be obtained by applying difficult resolubilisation procedures.

In another embodiment of the invention, a procedure is provided that makes it possible to replace the zinc ion in the protease, which is spectroscopically silent, with other ions (for example Co2+ or Cd2+), which are spectroscopically active, so as to permit the study of possible inhibitors capable of co-ordinating the metal contained in the protein and therefore of disturbing the bond between the protein and the metal.

In another embodiment of the invention, the addition of bivalent metal ions to a minimum culture medium, containing glucose and ammonium salts enriched with 13C or 15N as the sole sources of carbon and nitrogen, respectively, makes it possible to obtain large amounts of soluble protein labelled with stable heavy isotopes such as 13C or 15N. This type of isotope enrichment is necessary to determine the structure using NMR techniques.

In
a further embodiment of the present invention polypeptide sequences are provided that contain the NS3 serine protease domain of hepatitis C virus, suitably modified. These polypeptides are characterised in that they have at their C-terminal end a sequence of extremely hydrophilic amino acids, such as for example a series of lysines, which are not present in the original sequence. By using this other new method there is a substantial improvement in terms of solubility and integrity of the protein produced. These modified protease molecules are also to be considered as a subject of the present invention.

Subjects of the present invention are therefore:

a) Isolated and purified polypeptides containing the HCV NS3 serine protease domain, characterised in that they have at their C-terminal end a tail of at least three lysines.

b) A process for the preparation of polypeptides containing the HCV NS3 serine protease domain in a soluble form, of use for enzymological experiments and for determination of the three-dimensional structure of the enzyme both by means of NMR and using X-ray crystallography, comprising the following operations:
- transformation of a prokaryotic host cell with an expression vector containing a DNA sequence coding for a polypeptide with the proteolytic activity of the HCV NS3 protease;
- growth of the prokaryotic host cell on a special culture medium containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the polypeptide without having to resort to resolubilisation protocols, and without the need for renaturation of the protein from inclusion bodies.

c) A process for the renaturation in vitro of the above polypeptides, characterised in that it comprises the following operations:
- transformation of a prokaryotic host cell with an expression vector containing a DNA sequence
coding for a polypeptide with the proteolytic activity of HCV NS3 protease;
- expression of the DNA sequence required to produce the chosen polypeptide;
- purification of the denatured polypeptide and renaturation of the protein using buffers containing Zn2+ or alternatively salts of transition metals such as Co, Cd, Mn, Cu, Ni, Ag, Fe, Cr, Hg, Au, Pt, V.

d) Expression vectors for the production of the polypeptides represented by the sequences SEQ ID NO:1 to SEQ ID NO:4 with the proteolytic activity of HCV NS3, comprising: a polynucleotide coding for one of said polypeptides; regulation, transcription and translation sequences, operating in said host cell, operationally bonded to said polynucleotide; and, optionally, a selectable marker.

e) A prokaryotic cell transformed with an expression vector containing a DNA sequence coding for polypeptides with the proteolytic activity of the HCV NS3 protease, so as to allow said host cell to express the specific polypeptide which is coded in the chosen sequence.

Figure 1 shows the alignment between the HCV NS3 serine protease sequence and the viruses GBV-A, GBV-B and GBV-C/HGV (Hcv, Hga, Hgb, Hgc), with the poliovirus (Pol) 2A cysteine protease. Amino acids conserved in the HCV proteases and in the viruses GBV-A, GBV-B and GBV-C/HGV are shaded. The catalytic residues are underlined and the residues that bind zinc are indicated using a symbol.

Figure 2 shows a diagrammatic model of the NS3 serine protease domain.
In particular it shows the position within the structure of the amino acids involved in binding zinc (dark grey) and the catalytic triad (light grey).

Figure 3 shows the effect of the zinc ion on HCV NS3 serine protease activity.

Figure 4 shows the effects of the zinc ion on the production of HCV NS3 protease as a soluble protein in E. coli on a minimum culture medium. Column 2 refers to the results of the experiment carried out on the cells without inducing protease production (-IPTG). Columns 3, 4 and 5 indicate that in the absence of ZnCl2 and following the induction of protease production (+IPTG) the protein remains locked in the insoluble portion (indicated by the abbreviation PT). On the contrary, in the presence of ZnCl2 the protease is found entirely in the soluble portion (indicated by the abbreviation SN).

Figure 5 shows the electronic spectra of the HCV NS3 protease. Figure 5a shows the visible and near-UV spectrum of the Co2+-protease. Figure 5b shows the UV absorption spectra of the Zn2+-protease and of the Cd2+-protease.

DEPOSITS

Strains of E. coli DH1 bacteria transformed with the plasmids pT7-7(Pro BK-asK4), pT7-7(Pro J-asK4), pT7-7(Pro H-asK4) and pT7-7(Pro J8-asK4) and coding for the amino acid sequences SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3 and SEQ ID NO:4, respectively, were deposited on August 8, 1996 with The National Collections of Industrial and Marine Bacteria Ltd (NCIMB), Aberdeen, Scotland, U.K., under access numbers NCIMB 40821, NCIMB 40822, NCIMB 40823 and NCIMB 40824, respectively.

Up to this point a general description has been given of the present invention. With the aid of the following examples a more detailed description of specific embodiments of the invention will now be given, with the aim of clarifying the objects, characteristics, advantages and methods of application thereof.

EXAMPLE 1
EXPRESSION AND PURIFICATION OF POLYPEPTIDES WITH THE PROTEASE ACTIVITY OF HCV NS3, IN THEIR
NATIVE SOLUBLE FORM

The plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) were constructed to allow expression in E. coli of polypeptides characterised in that they have a sequence chosen from the ones in the group from SEQ ID NO:1 to SEQ ID NO:4. The polypeptides contain the NS3 protease domain of various HCV isolates (BK, H, J and J8, respectively) with the addition of a "tail" of four lysines at the C-terminal end.

pT7-7(Pro BK-asK4) contains the sequence for HCV-BK (EMBL data bank access number: M58335) between the nucleotides 3411 and 3950, cloned in the vector pT7-7.

pT7-7(Pro H-asK4) contains the sequence for HCV-H (EMBL data bank access number: M67463) between the nucleotides 3420 and 3959, cloned in the vector pT7-7.

pT7-7(Pro J-asK4) contains the sequence for HCV-J (EMBL data bank access number: D90208) between the nucleotides 3408 and 3947, cloned in the vector pT7-7.

pT7-7(Pro J8-asK4) contains the sequence for HCV-J8 (EMBL data bank access number: D10988/D01221) between the nucleotides 3432 and 3971, cloned in the vector pT7-7.

The expression vector pT7-7 is a derivative of pBR322 which contains, in addition to the gene for β-lactamase and the replication origin of ColE1, the promoter and the ribosome binding site of the T7 bacteriophage Φ10 gene (24). The fragments coding for the HCV NS3 protease were cloned downstream of the T7 bacteriophage Φ10 promoter,
in reading frame with the first ATG codon of the gene 10 protein of phage T7 using methods known to the art.

The cDNA fragment containing the sequence HCV-BK between nucleotides 3411 and 3950 was amplified by Polymerase Chain Reaction (PCR), using the oligonucleotides PROT(BK-K4)S (SEQ ID NO:5) and PROT(BK-K4)AS (SEQ ID NO:6) as primers. The cDNA fragment so obtained was digested with the restriction enzyme NdeI, and cloned in pT7-7, which was first linearised with the restriction enzymes NdeI and SmaI.

The cDNA fragment containing the sequence HCV-H between nucleotides 3420 and 3959 was amplified by PCR, using the oligonucleotides PROT(H-K4)S (SEQ ID NO:7) and PROT(H-K4)AS (SEQ ID NO:8) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.

The cDNA fragment containing the sequence HCV-J between nucleotides 3408 and 3947 was amplified by PCR, using the oligonucleotides PROT(J-K4)S (SEQ ID NO:9) and PROT(J-K4)AS (SEQ ID NO:10) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.

The cDNA fragment containing the sequence HCV-J8 between nucleotides 3432 and 3971 was amplified by PCR, using the oligonucleotides PROT(J8-K4)S (SEQ ID NO:11) and PROT(J8-K4)AS (SEQ ID NO:12) as primers. The cDNA fragment so obtained was digested with the restriction enzymes NdeI and EcoRI, and cloned in pT7-7, which was first linearised with the same restriction enzymes.

The plasmids pT7-7(Pro BK-asK4), pT7-7(Pro H-asK4), pT7-7(Pro J-asK4) and pT7-7(Pro J8-asK4) containing NS3 sequences also contain the gene for β-lactamase, which can be used as a selection marker for E.
coli cells transformed with these plasmids.

The plasmids are then transformed into the E. coli strain BL21(DE3), normally used for high levels of expression of genes cloned in expression vectors containing the T7 promoter. In this strain the T7 polymerase gene is carried on the bacteriophage λ DE3, which is integrated into the chromosome of BL21 cells (25). Expression of the gene is induced by incubating the cultures at an A600 nm of 0.7-0.9 with 0.4 mM of isopropyl-1-thio-β-D-galactopyranoside (IPTG) for 3 hours at 20°C in LB culture medium supplemented with ZnCl2 at a concentration that can vary from 50 µM to 1 mM. After the three hours have passed the cells are harvested and washed in a saline phosphate buffer solution (20 mM sodium phosphate pH 7.5, 140 mM NaCl), after which they are re-suspended in 25 mM sodium phosphate at pH 7.5, 10% glycerol, 500 mM NaCl, 10 mM DTT, 0.5% CHAPS (10 ml per 1 litre of culture medium). The cells are then lysed by passing twice through a "French pressure cell" and the homogenate obtained in this way is centrifuged at 100,000xg for 1 hour, while the nucleic acids are removed by precipitation with 0.5% polyethylenimine. The supernatants are loaded onto a HiLoad 16/10 SP Sepharose High Performance column (Pharmacia), balanced with 50 mM sodium phosphate at pH 7.5, 5% glycerol, 3 mM DTT, 0.1% CHAPS (buffer A). The column was washed repeatedly with buffer A and the protease was eluted by applying a gradient of from 0 to 0.6 M NaCl. The fractions containing the protease were then collected and concentrated using a chamber for ultrafiltration under magnetic stirring, equipped with a YM-10 membrane (Amicon). The sample was then loaded onto an HR 26/60 HiLoad Superdex 75 column (Pharmacia), balanced with buffer A, operating at a flow rate of 1 ml/min. The fractions containing NS3 were collected and further purified on an HR 5/5 Mono S column (Pharmacia), balanced with buffer B and operating at a flow rate of 1 ml/min. The protease was eluted from the column in pure form applying a linear gradient of 0-0.6 M NaCl in buffer A.

After this passage the protein was preserved in stocks at concentrations of 50-150 µM at a temperature of -80°C after freezing in liquid nitrogen. The concentration of protein was estimated by the determination of absorbancy at 280 nm using a coefficient of extinction derived from the sequence data or from quantitative amino acid analysis. Both methods give the same results, with an error factor of 10%. The purity of the enzyme was ascertained on SDS polyacrylamide gel and by HPLC using an inverse phase Vydac C4 column (4.6x250 mm, 5 µm, 300 Å). The eluents used were H2O/0.1% TFA (A) and acetonitrile/0.1% TFA (B). A linear gradient of from 3% to 95% B over 60 minutes was used. Analysis of the N-terminal end, carried out using Edman degradation on a gas-phase sequencer (Applied Biosystems model 470A), and the analysis by mass spectroscopy revealed that more than 96% of the purified protein has the N-terminal sequence PITAYSSQ. The remaining 3% has the sequence MAPITAYSSQ, as foreseen from the data on the nucleotide sequence.

In order to measure the enzymatic activity of the protease, the peptide DEEMECSSHLPYK was used as a substrate.
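The substrate peptide can be checked against the P1/P1' preference stated earlier in the description (cysteine or threonine at P1, alanine or serine at P1'). The sketch below simply scans a sequence for adjacent residue pairs that satisfy that rule. The function and constant names are illustrative, and the rule is only the two-position preference quoted in the text, so this is a filter for candidate sites and not a predictor of actual cleavage.

```python
# Residue preferences quoted in the description for the NS3 serine protease
P1_RESIDUES = {"C", "T"}        # amino-terminal side of the scissile bond
P1_PRIME_RESIDUES = {"A", "S"}  # carboxy-terminal side of the scissile bond


def candidate_cleavage_sites(peptide: str) -> list[tuple[int, str, str]]:
    """Return (1-based position of P1, P1 residue, P1' residue) for every
    adjacent pair that matches the two-position preference above.

    Assumption: only P1 and P1' are considered; other substrate positions,
    which may also influence cleavage, are ignored.
    """
    sites = []
    for i in range(len(peptide) - 1):
        p1, p1_prime = peptide[i], peptide[i + 1]
        if p1 in P1_RESIDUES and p1_prime in P1_PRIME_RESIDUES:
            sites.append((i + 1, p1, p1_prime))
    return sites


# Substrate peptide used in the activity assay
print(candidate_cleavage_sites("DEEMECSSHLPYK"))
```

For this peptide the scan reports a single candidate, the Cys-Ser pair at positions 6 and 7, which is consistent with a substrate built around one junction.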
This peptide was derived from the cleavage sequence of the NS4A-NS4B junction. A peptide with 14 amino acids corresponding to the central hydrophobic region of the protein NS4A (from position 21 to position 34) (Pep4A21-34: GSVVIVGRIILSGR) was used as a protease cofactor. The peptides were synthesised by solid phase synthesis based on Fmoc chemistry. After washing and deprotection, the "raw" peptides were purified by HPLC to 98% purity. The identity of the peptides was determined by mass spectrometry. The stored peptide solutions were prepared in DMSO and preserved at -80°C; the concentrations were determined by quantitative amino acid analysis carried out on samples hydrolysed with HCl.

The cleavage tests were carried out using 300 nM - 1.6 µM of enzyme in 30 µl of 50 mM Tris pH 7.5, 50% glycerol, 2% CHAPS, 30 mM DTT and appropriate amounts of substrate and/or peptide-NS4A at 22°C. The reaction was stopped by addition of 70 µl of H2O containing 0.1% TFA. Cleavage of the peptide substrate was determined by HPLC using a Merck-Hitachi chromatograph. After this, 90 µl of each sample were injected into an inverse phase Lichrospher C18 cartridge column (4x125 mm, 5 µm, Merck) and the fragments were separated using an acetonitrile gradient of 3-100% at 2%/min. Identification of the peak was achieved following both the absorbancy at 220 nm and the fluorescence of the tyrosine (λex = 260 nm, λem = 305 nm).

Tables 1 and 2 give the data for solubility and yield relating to the NS3 protease corresponding to various HCV virus isolates. Table 1 gives the data for production of the various forms of protease both with and without the addition of four lysines at the C-terminal end, and both with and without the addition of ZnCl2 in the culture medium. The data are expressed as the percentage of protein recovered in the soluble fraction of the cell extracts and the protein found in the inclusion bodies. Table 2 gives the yields and solubility of the various forms of protease, purified from E. coli cells grown in the presence of ZnCl2. As can be seen from the results given, the modified proteases (BK-ASK4, J-ASK4, H-ASK4) are between 10 and 20 times more soluble and, when expressed in a culture medium containing an excess of ZnCl2, they give a yield up to 10 times greater than the respective proteases without the lysine tail.

TABLE 1

Construct     Culture medium   Soluble portion   Inclusion bodies
Pro J         LB               5%                95%
Pro J-asK4    LB               20%               80%
Pro J         LB + ZnCl2       99%               <1%
Pro J-asK4    LB + ZnCl2       99%               <1%
Pro H         LB               <2%               >98%
Pro H-asK4    LB               3-4%              >95%
Pro H         LB + ZnCl2       5%                95%
Pro H-asK4    LB + ZnCl2       50%               50%

TABLE 2

Construct     Yield (mg/l medium)   Solubility (mg/ml)
Pro BK        1-2                   1-2
Pro BK-asK4   10-15                 >40
Pro H         0.1-0.2               1-2
Pro H-asK4    1-2                   >40
Pro J         1-2                   0.5-1
Pro J-asK4    15-20                 >10

EXAMPLE 2
DETERMINATION OF THE METAL CONTENT OF POLYPEPTIDES WITH THE PROTEOLYTIC ACTIVITY OF HCV NS3 PROTEASE

The polypeptides purified according to the procedure described in examples 1, 3 and 5 were further dialysed against buffers containing a chelating agent, in order to remove any metal ions bound to the protein, and their metal content was determined by atomic absorption spectrometry using a Perkin-Elmer Instrument spectrometer. The glass equipment used for analysis of the metal content was washed using 30% nitric acid and rinsed completely with deionised water. The protease (at a concentration of 4 mg/ml) was dialysed for a period of at least 16 hours against a buffer containing 50 mM Tris/HCl pH 7.5, 3 mM DTT, 10% glycerol, 0.1% CHAPS. A Chelex-100 resin (2.5 g/l) was held in suspension in the dialysis buffer to prevent contamination by adventitious metal ions.
The protein was then hydrolysed with nitric acid and then used to determine the metal content. The standardised Zn2+, Co2+ and Cd2+ solutions were purchased from Merck.

The metal content was found to be 1 g-atom per 1 mole of enzyme (see table 3), with the exception of the apoprotein, which has a negligible metal content.

TABLE 3

Protein     Zn (g-atoms/mole)   Co (g-atoms/mole)   Cd (g-atoms/mole)
Zn2+-NS3    1.09                n.d.                n.d.
Apo-NS3     0.02                n.d.                n.d.
Co2+-NS3    0.19                0.90                n.d.
Cd2+-NS3    0.09                n.d.                1.15

n.d.: not determined

EXAMPLE 3
PROCEDURE FOR THE RENATURATION OF THE NS3 PROTEASE

To ascertain whether or not zinc is required for HCV NS3 serine protease activity, its proteolytic activity was first measured on a synthetic substrate peptide. This measurement was carried out in the presence of increasing concentrations of EDTA or of 1,10-phenanthroline. It was found that these two compounds do not inhibit proteolysis by NS3 at concentrations lower than 1 mM. Above these concentrations both EDTA and 1,10-phenanthroline show only a modest level of inhibition of NS3 activity. However, a similar inhibition behaviour was obtained in control experiments using a compound structurally similar to 1,10-phenanthroline which is not capable of chelating zinc ions, and the activity was not restored in the presence of an excess of Zn2+ ions. These results suggest that either zinc is not required for enzymatic activity, or that it is so strongly bonded to the protein that it cannot be removed by treatment with chelating agents. It was therefore decided to proceed with preparation of a protein containing no zinc (apoprotein) and to measure its biochemical activity in the absence and in the presence of this metal.
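Returning to the metal contents of Table 3, the claim of roughly one g-atom of metal per mole of enzyme can be checked by summing the determined values in each row. The sketch below is only a bookkeeping aid: the dictionary layout and function name are illustrative, and entries reported as "n.d." are represented as None and left out of the sum, so each total is a lower bound and not a full metal balance.

```python
# Values transcribed from Table 3 (g-atoms of metal per mole of protein).
# None stands for "n.d." (not determined); it is NOT treated as zero.
TABLE_3 = {
    "Zn2+-NS3": {"Zn": 1.09, "Co": None, "Cd": None},
    "Apo-NS3":  {"Zn": 0.02, "Co": None, "Cd": None},
    "Co2+-NS3": {"Zn": 0.19, "Co": 0.90, "Cd": None},
    "Cd2+-NS3": {"Zn": 0.09, "Co": None, "Cd": 1.15},
}


def total_determined_metal(row: dict) -> float:
    """Sum of all determined metal contents in one table row."""
    return round(sum(v for v in row.values() if v is not None), 2)


for protein, row in TABLE_3.items():
    print(f"{protein}: {total_determined_metal(row):.2f} g-atoms/mole")
```

Under these assumptions the three metal-loaded forms come out close to one metal ion per protein molecule, while the apoprotein is essentially metal-free, which is how the text summarises the table.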
Bound zinc cannot be removed by dialysis against chelators at a pH exceeding 7, whereas prolonged dialysis of the enzyme at a pH of less than 5 and in the presence of 10 mM EDTA causes a loss of zinc accompanied by irreversible precipitation of the sample. These observations suggest that the zinc is strongly bound and that it is essential for the structural integrity of the protein. In order to facilitate the release of zinc, the apoprotein was obtained by applying the following procedure: 1.7 mg of NS3 protease were denatured by addition of TFA to a final concentration of 1%. The denatured protein was then purified on a Resource RPC 3 ml column using an acetonitrile gradient from 0% to 85% in the presence of 0.1% TFA. The flow rate of the column was 2 ml/min and the volume of the gradient was 45 ml. The zinc content of the apoprotein was found to be negligible.

The enzymatic activity of the apoprotein was then tested in the presence and in the absence of zinc. The apoprotein was diluted to a final concentration of 60 nM in the activity buffer containing the concentrations of ZnCl2 shown in the graph and 10 mM DTT to prevent oxidation of the thiol groups. After an incubation period of 1 hour at 22°C the reaction was started by adding the substrate peptide at a concentration of 40 mM. The reaction was then allowed to proceed for another hour before taking the measurements. As shown in figure 3, reconstitution of the enzymatic activity depends on the concentration of zinc ions in the buffer. Maximum reactivation was observed at a ZnCl2 concentration of 25 µM. At this concentration the enzymatic activity is found to be approximately 50% when compared to the protease containing zinc (diluted in the same buffer at the same final concentration).
This experiment gives unequivocal proof that zinc is necessary in order for the enzyme to be structurally complete and active, and it also provides a method for reconstitution of NS3 serine protease activity starting from the apoprotein.

EXAMPLE 9
PROCESS FOR THE PRODUCTION OF HCV NS3 PROTEASE IN A FORM THAT CAN BE USED FOR DETERMINATION OF THE THREE-DIMENSIONAL STRUCTURE THEREOF USING NMR TECHNIQUES

The discovery that HCV NS3 protease contains a structural zinc atom has been used to increase the production of soluble protein in bacterial cells (E. coli) and therefore to produce a protein in a form that can be used for experiments aimed at determining the structure by means of NMR.

Determination of structure by means of NMR involves metabolic labelling with 15N and 13C, to be carried out in a minimal culture medium, for example modified M9 culture medium ((NH4)2SO4 1 g/l, K-phosphate 100 mM, MgSO4 0.5 mM, CaCl2 0.5 mM, biotin 5 µM, thiamine 7 µM, ampicillin 5 µg/ml, glucose 4 g/l, FeSO4·7H2O 13 µM). Induction in this culture medium, which does not include zinc salts in its composition, inevitably results in the production of insoluble protein, whereas the addition of 50 µM of ZnCl2 results in the production of a completely soluble protease. In this way it is possible to produce a labelled protein using (15NH4)2SO4 as a source of nitrogen and 13C-glucose as a source of carbon.

Following this new procedure, a protein has been obtained that remains in a soluble form in the cytoplasm and is not captured by the inclusion bodies, as was the case using the old procedures. In this way the resolubilisation procedures become unnecessary, which results in considerable advantages, as these procedures have an extremely variable yield, require extremely controlled conditions and also frequently cause irreversible alterations in the protein.
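As a worked illustration of the zinc supplement described above, the following minimal sketch converts the stated 50 µM ZnCl2 into a mass per litre of medium. It assumes the anhydrous salt and rounded standard atomic masses, neither of which is specified in the text; the helper names molar_mass and mg_per_litre are hypothetical.

```python
# Rounded standard atomic masses in g/mol (an assumption, not from the text).
ATOMIC_MASS = {"Zn": 65.38, "Cl": 35.45}


def molar_mass(composition):
    """Molar mass in g/mol for a composition given as {element: count}."""
    return sum(ATOMIC_MASS[element] * count for element, count in composition.items())


def mg_per_litre(conc_micromolar, molar_mass_g_per_mol):
    """Mass concentration in mg/l for a molar concentration given in µM."""
    # µmol/l multiplied by g/mol gives µg/l; divide by 1000 to obtain mg/l.
    return conc_micromolar * molar_mass_g_per_mol / 1000.0


# Anhydrous ZnCl2 is assumed here.
ZNCL2_MOLAR_MASS = molar_mass({"Zn": 1, "Cl": 2})
zinc_supplement = mg_per_litre(50, ZNCL2_MOLAR_MASS)

print(f"ZnCl2 at 50 µM: {zinc_supplement:.2f} mg per litre of medium")
```

Under these assumptions the supplement comes to roughly 6.8 mg of ZnCl2 per litre, which indicates how small the addition is relative to the other components of the medium.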
Figure 4 shows how the protease (at approximately 21 kDa, indicated in the figure by an arrow) is produced as an insoluble aggregate (PT) when the bacterial cells are grown in minimal culture medium without zinc (lanes 3, 4 and 5). By contrast, if ZnCl2 is added to the culture medium at a concentration of 50 µM the protein is found in the soluble fraction (SN) (lanes 6, 7 and 8) and disappears from the insoluble fraction (PT).

EXAMPLE 10
REPLACEMENT OF THE Zn2+ BOUND TO NS3 WITH SPECTROSCOPIC PROBES SUCH AS Co2+ OR Cd2+

The Zn2+ binding site of the HCV NS3 protease can be studied by replacing the zinc with metals that make spectroscopic studies possible. The tight binding of the structural zinc to the enzyme makes it difficult to remove the metal and replace it in vitro. As a result, the Zn2+ was replaced by Co2+ and Cd2+ by incorporation in vivo. The bacterial cells (E. coli) were transformed with an appropriate expression vector and grown in minimal culture medium containing potassium phosphate at pH 7.0, 0.5 mM MgSO4, 0.5 mM CaCl2, 13 µM FeSO4, 7 µM thiamine, 6 µM biotin. Glucose (4 g/l) and (NH4)2SO4 (1 g/l) were used as sources of carbon and nitrogen, respectively. To reduce the amount of Zn2+ in the culture medium, the phosphate buffer was passed through a Chelex-100 column. To obtain production of Co2+-NS3 or Cd2+-NS3, 50 mM of CoCl2 and CdCl2 were added, respectively, 20 minutes before addition of IPTG.

Purification
of the Co2+ and Cd2+ proteases was obtained using the procedure described in Example 1, except for the fact that all the buffers used were treated with Chelex-100 resin (2.5 g/l) and the DTT was eliminated.

The addition of CoCl2 or CdCl2 to the culture medium still results in production of the soluble enzyme, which indicates that the Co2+ and Cd2+ ions can replace zinc in the metal binding site of the protease.

The proteases containing Co2+ and Cd2+ were subjected to electronic absorption spectroscopic analysis. The protease containing Co2+ shows a typical absorption spectrum in the visible region (figure 6a), which indicates a binding site with a tetrahedral geometry (26). The two main bands at 640 nm and at 685 nm and the minima at 585 nm and at 740 nm indicate d-d transitions. The energy of these transitions and the molar extinction coefficients are characteristic of complexes with a distorted tetrahedral co-ordination geometry (27). The d-d transition energy is consistent with a mixed sulphur-nitrogen co-ordination. Furthermore, the centroid of the band corresponding to the d-d transition indicates a Co2+ complex with S3N co-ordination (26). A typical S -> Co2+ charge transfer band was observed at around 365 nm (figure 6a), implying that the metal ion is co-ordinated by thiolates.

In accordance with these data, the UV absorbance spectrum of the Cd2+ protease (figure 6b) shows an increase in absorbance at around 256 nm, which in all probability is due to an S -> Cd2+ charge transfer band (28).

In conclusion, spectroscopic analysis of the Co2+ and Cd2+ proteases is completely consistent with the three-dimensional model proposed by us. In fact, in the model the binding site for the metal is made up of three thiol groups of three cysteines and of a nitrogen atom from the side chain of a histidine.
Each of the residues that, according to the model, form the binding site for the metal has been changed to alanine and, as expected, none of the mutants obtained is capable of being expressed in a soluble form in E. coli.

BIBLIOGRAPHY

1. Behrens S.E., Tomei L., De Francesco R. (1996) EMBO J. 15:12-22.
2. Hijikata M., Kato N., Ootuyama Y., Nakagawa M. & Shimotohno K. (1991) Proc. Natl. Acad. Sci. USA 88:5547-5551.
3. Grakoui A., McCourt D.W., Wychowski C., Feinstone S.M., Rice C.M. (1993) Proc. Natl. Acad. Sci. USA 90:10583-10587.
4. Bartenschlager R., Ahlborn-Laake L., Mous J. & Jacobsen H. (1993) J. Virol. 68:1045-1055.
5. Eckart M.R., Selby M., Masiarz F., Lee C., Berger K., Crawford K., Kuo C., Kuo G., Houghton M. & Choo Q.-L. (1993) Biochem. Biophys. Res. Comm. 192:399-406.
6. Grakoui A., McCourt D.W., Wychowski C., Feinstone S.M. & Rice C.M. (1993) J. Virol. 67:2832-2843.
7. Tomei L., Failla C., Santolini E., De Francesco R. & La Monica N. (1993) J. Virol. 67:4017-4026.
8. Manabe S., Fuke I., Tanishita O., Kaji C., Gomi Y., Yoshida S., Mori C., Takamizawa A., Yohida I. & Okayama H. (1994) Virology 198:636-644.
9. Pizzi E., Tramontano A., Tomei L., La Monica N., Failla C., Sardana M., Wood T., De Francesco R. (1994) Proc. Natl. Acad. Sci. USA 91:888-892.
10. Shimuzu Y., Yamanii K., Masuho Y., Yokota T., Inouhe H., Sudo K., Satoh S. & Shimothono K. (1996) J. Virol. 70:127-132.
11. Lin C., Thomson J.A. & Rice C.M. (1995) J. Virol. 69:4373-4380.
12. Tomei L., Failla C., Vitale R.L., Bianchi E. & De Francesco R. (1996) J. Gen. Virol. 77:1065-1070.
13. Chambers T.J., Weir R.C., Grakoui A., McCourt D.W., Bazan J.F., Fletterick R.J., Rice C.M. (1990) Proc. Natl. Acad. Sci. USA 87:8898-8902.
14. Lam P.Y., Jadhav P.K., Eyermann C.J., Hodge C.N., Ru Y., Bacheler L.T., Meek J.L., Otto M.J., Rayner M.M., Wong Y.N., Chang C.-H., Weber P.C., Jackson D.A., Sharpe T.R., Erickson-Viitanen S. (1994) Science 263:380-384.
15. Georgiou G.
and Valax P. (1996) Current Opinion in Biotechnology 7:190-197.
16. Failla C., Pizzi E., De Francesco R., Tramontano A. (1996) Folding & Design 1:35-42.
17. Zuckermann A.J. (1996) The Lancet 347:558-559.
18. Muerhoff A.S., Leary T.P., Simons J.N., Pilot-Matias T.J., Dawson G.J., Erker J.C., Chalmers M.L., Schlauder G.C., Desai S.M. & Mushahwar I.K. (1995) J. Virol. 69:5621-5630.
19. Leary T.P., Muerhoff A.S., Simons J.N., Pilot-Matias T.J., Erker J.C., Chalmers M.L., Schlauder G.C., Dawson G.J., Desai S.M. & Mushahwar I.K. (1996) J. Medical Virol. 48:60-67.
20. Yu S.F. and Lloyd R.E. (1992) Virology 186:725-735.
21. Voss T., Meyer R. and Sommergruber W. (1995) Protein Science 4:2526-2531.
22. Rice W.G., Schaeffer C.A., Harten B., Villinger F., South T.L., Summers M.F., Henderson L.E., Bess J.W.J., Arthur L.O., McDougall J.S., Orloff S.L., Mendeleyev J. & Kun E. (1993) 361:473-475.
23. Rice W.G., Supko J.G., Malspeis L., Buckheit R.W.J., Clanton D., Bu M., Graham L., Schaeffer C.A., Turpin J.A., Domogala J., Gogliotti R., Bader J.P., Halliday S.M., Coren L., Sowder R.C.I., Arthur L.O. & Henderson L.E. (1995) Science 270:1194-1197.
24. Tabor S. & Richardson C.C. (1985) Proc. Natl. Acad. Sci. USA 82:1074-1078.
25. Studier F.W. and Moffatt (1986) J. Mol. Biol. 189:113-130.
26. Maret W. & Vallee B.L. (1993) Meth. Enzym. 226:52-71.
27. Bertini I. & Luchinat C. (1984) Adv. Inorg. Biochem. 6:72-111.
28. Fitzgerald D.W. and Coleman J.E.
(1991) Biochemistry 30:5195-5201.

SEQUENCE LISTING

GENERAL INFORMATION
(i) APPLICANT: ISTITUTO DI RICERCHE DI BIOLOGIA MOLECOLARE P. ANGELETTI S.p.A.
(ii) TITLE OF INVENTION: "SOLUBLE POLYPEPTIDES WITH ACTIVITY OF THE NS3 SERINE PROTEASE OF HEPATITIS C VIRUS, AND PROCESS FOR THEIR PREPARATION AND ISOLATION"
(iii) NUMBER OF SEQUENCES: 12
(iv) MAILING ADDRESS:
(A) ADDRESSEE: Societa' Italiana Brevetti
(B) STREET: Piazza di Pietra, 39
(C) CITY: Rome
(D) COUNTRY: Italy
(E) POST CODE: I-00186
(v) COMPUTER-READABLE FORM:
(A) TYPE OF SUPPORT: Floppy disk 3.5" 1.44 MBYTES
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS Rev. 5.0
(D) SOFTWARE: Microsoft Word 6.0
(viii) AGENT INFORMATION
(A) NAME: DI CERBO Mario (Dr.)
(B) REFERENCE: RM/X88878/PC-DC
(ix) TELECOMMUNICATIONS INFORMATION
(A) TELEPHONE: 06/6785941
(B) TELEFAX: 06/6794692
(C) TELEX: 612287 ROPAT

(1) INFORMATION ON SEQUENCE SEQ ID NO: 1:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro BK-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate BK.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
            20                  25                  30
Glu Val Gln Val Val Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
        35                  40                  45
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
    50                  55                  60
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Gln Ala Pro Pro Gly Ala Arg Ser Leu Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Val Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ser Gly His Ala Val Gly Ile Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Val Pro Val Glu Ser Met
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185

(2) INFORMATION ON SEQUENCE SEQ ID NO: 2:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS:
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro H-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate H.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:

Met Ala Pro Ile Thr Ala Tyr Ala Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Glu Gly
            20                  25                  30
Glu Val Gln Ile Val Ser Thr Ala Thr Gln Thr Phe Leu Ala Thr Cys
        35                  40                  45
Ile Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Thr Arg Thr
    50                  55                  60
Ile Ala Ser Pro Lys Gly Pro Val Ile Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Pro Ala Pro Gln Gly Ser Arg Ser Leu Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Ile Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ala Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Asn Leu
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185

(3) INFORMATION ON SEQUENCE SEQ ID NO: 3:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 187 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro J-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate J.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

Met Ala Pro Ile Thr Ala Tyr Ser Gln Gln Thr Arg Gly Leu Leu Gly
1               5                   10                  15
Cys Ile Ile Thr Ser Leu Thr Gly Arg Asp Lys Asn Gln Val Asp Gly
            20                  25                  30
Glu Val Gln Val Leu Ser Thr Ala Thr Gln Ser Phe Leu Ala Thr Cys
        35                  40                  45
Val Asn Gly Val Cys Trp Thr Val Tyr His Gly Ala Gly Ser Lys Thr
    50                  55                  60
Leu Ala Gly Pro Lys Gly Pro Ile Thr Gln Met Tyr Thr Asn Val Asp
65                  70                  75                  80
Gln Asp Leu Val Gly Trp Pro Ala Pro Pro Gly Ala Arg Ser Met Thr
                85                  90                  95
Pro Cys Thr Cys Gly Ser Ser Asp Leu Tyr Leu Val Thr Arg His Ala
            100                 105                 110
Asp Val Val Pro Val Arg Arg Arg Gly Asp Ser Arg Gly Ser Leu Leu
        115                 120                 125
Ser Pro Arg Pro Ile Ser Tyr Leu Lys Gly Ser Ser Gly Gly Pro Leu
    130                 135                 140
Leu Cys Pro Ser Gly His Val Val Gly Ile Phe Arg Ala Ala Val Cys
145                 150                 155                 160
Thr Arg Gly Val Ala Lys Ala Val Asp Phe Ile Pro Val Glu Ser Met
                165                 170                 175
Glu Thr Thr Met Arg Ala Ser Lys Lys Lys Lys
            180                 185

(4) INFORMATION ON SEQUENCE SEQ ID NO: 4:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 186 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ix) FEATURE
(A) NAME: Pro J8-asK4
(D) OTHER INFORMATION: sequence for the NS3 protease of HCV isolate J8.
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Ala Pro Ile Thr Ala Tyr Thr Gln Gln Thr Arg Gly Leu Leu Gly Ala
1               5                   10                  15
Ile Val Val Ser Leu Thr Gly Arg Asp Lys Asn Glu Gln Ala Gly Gln
            20                  25                  30
Val Gln Val Leu Ser Ser Val Thr Gln Thr Phe Leu Gly Thr Ser Ile
        35                  40                  45
Ser Gly Val Leu Trp Thr Val Tyr His Gly Ala Gly Asn Lys Thr Leu
    50                  55                  60
Ala Gly Pro Lys Gly Pro Val Thr Gln Met Tyr Thr Ser Ala Glu Gly
65                  70                  75                  80
Asp Leu Val Gly Trp Pro Ser Pro Pro Gly Thr Lys Ser Leu Asp Pro
                85                  90                  95
Cys Thr Cys Gly Ala Val Asp Leu Tyr Leu Val Thr Arg Asn Ala Asp
            100                 105                 110
Val Ile Pro Val Arg Arg Lys Asp Asp Arg Arg Gly Ala Leu Leu Ser
        115                 120                 125
Pro Arg Pro Leu Ser Thr Leu Lys Gly Ser Ser Gly Gly Pro Val Leu
    130                 135                 140
Cys Ser Arg Gly His Ala Val Gly Leu Phe Arg Ala Ala Val Cys Ala
145                 150                 155                 160
Arg Gly Val Ala Lys Ser Ile Asp Phe Ile Pro Val Glu Ser Leu Asp
                165                 170                 175
Val Ala Thr Arg Ala Ser Lys Lys Lys Lys
            180                 185

(5) INFORMATION ON SEQUENCE SEQ ID NO: 5:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 26 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE:
(ix) FEATURE
(A) NAME: PROT(BK-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

GCATACATAT GGCGCCCATC ACGGCC 26

(6) INFORMATION ON SEQUENCE SEQ ID NO: 6:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 33 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(BK-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

CTACTTCTTC TTCTTGCTAG CCCGCATAGT AGT 33

(7) INFORMATION ON SEQUENCE SEQ ID NO: 7:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 26 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(H-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

GAGATACATA TGGCGCCTAT CACGGC 26

(8) INFORMATION ON SEQUENCE SEQ ID NO: 8:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(H-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8:

TTTGAATTCC TACTTCTTCT TCTTGCTAGC TCTCATGGTT GT 42

(9) INFORMATION ON SEQUENCE SEQ ID NO: 9:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 27 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

TTTCATATGG CGCCTATCAC GGCCTAT 27

(10) INFORMATION ON SEQUENCE SEQ ID NO: 10:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J-K4)AS
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:

TTTGAATTCC TACTTCTTCT TCTTGCTAGC CCGCATGGTA GT 42

(11) INFORMATION ON SEQUENCE SEQ ID NO: 11:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 24 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: No
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J8-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11:

GGGAATTCCA TATGGCTCCC ATTACTGCT ACAC 24

(12) INFORMATION ON SEQUENCE SEQ ID NO: 12:
(i) SEQUENCE CHARACTERISTICS
(A) LENGTH: 42 nucleotides
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: Synthetic DNA
(iv) ANTISENSE: Yes
(vii) IMMEDIATE SOURCE: oligonucleotide synthesiser
(ix) FEATURE
(A) NAME: PROT(J8-K4)S
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

TTTGAATTCC TACTTCTTCT TCTTGCTAGC CCGTGTGGCG AC 42
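
As an illustrative check that is not part of the original listing, an antisense primer can be reverse-complemented and translated to show which C-terminal residues it encodes. The minimal sketch below assumes the standard genetic code and that the primer ends on a codon boundary; the helper names reverse_complement and translate are hypothetical. It is applied here to SEQ ID NO: 6 (PROT(BK-K4)AS).

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    first + second + third: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, first in enumerate(BASES)
    for j, second in enumerate(BASES)
    for k, third in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna):
    """Translate DNA in frame 1 into one-letter residues ('*' marks a stop)."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# SEQ ID NO: 6 as given in the listing, with the spacing removed.
antisense_primer = "CTACTTCTTCTTCTTGCTAGCCCGCATAGTAGT"
coding_strand = reverse_complement(antisense_primer)

print(coding_strand)             # ACTACTATGCGGGCTAGCAAGAAGAAGAAGTAG
print(translate(coding_strand))  # TTMRASKKKK*
```

Read this way, the primer encodes Thr-Thr-Met-Arg followed by Ala-Ser, four lysines and a stop codon, which is consistent with the "asK4" designation of the constructs.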