Patent 2166313 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 2166313
(54) English Title: NON-A, NON-B, NON-C, NON-D, NON-E HEPATITIS REAGENTS AND METHODS FOR THEIR USE
(54) French Title: REACTIFS POUR L'HEPATITE NON-A, NON-B, NON-C, NON-D ET PROCEDE POUR LEUR UTILISATION
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • C12N 15/51 (2006.01)
  • A61K 39/29 (2006.01)
  • A61K 39/385 (2006.01)
  • C07K 14/18 (2006.01)
  • C07K 16/10 (2006.01)
  • C07K 19/00 (2006.01)
  • C12N 5/10 (2006.01)
  • C12N 7/00 (2006.01)
  • C12Q 1/70 (2006.01)
  • G01N 33/576 (2006.01)
  • G01N 33/577 (2006.01)
  • A61K 39/00 (2006.01)
(72) Inventors :
  • SIMONS, JOHN N. (United States of America)
  • PILOT-MATIAS, TAMI J. (United States of America)
  • DAWSON, GEORGE J. (United States of America)
  • SCHLAUDER, GEORGE G. (United States of America)
  • DESAI, SURESH M. (United States of America)
  • LEARY, THOMAS P. (United States of America)
  • MUERHOFF, ANTHONY SCOTT (United States of America)
  • ERKER, JAMES CARL (United States of America)
  • BUIJK, SHERI L. (United States of America)
  • MUSHAHWAR, ISA K. (United States of America)
(73) Owners :
  • ABBOTT LABORATORIES (United States of America)
(71) Applicants :
(74) Agent: NORTON ROSE FULBRIGHT CANADA LLP/S.E.N.C.R.L., S.R.L.
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 1995-02-14
(87) Open to Public Inspection: 1995-08-17
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US1995/002118
(87) International Publication Number: WO1995/021922
(85) National Entry: 1995-12-28

(30) Application Priority Data:
Application No. Country/Territory Date
08/196,030 United States of America 1994-02-14
08/242,654 United States of America 1994-05-13
08/283,314 United States of America 1994-07-29
08/344,185 United States of America 1994-11-23
08/344,190 United States of America 1994-11-23
08/377,557 United States of America 1995-01-27

Abstracts

English Abstract






Hepatitis GB Virus (HGBV) nucleic acid and amino acid sequences useful for a variety of diagnostic and therapeutic applications, kits
for using the HGBV nucleic acid or amino acid sequences, HGBV immunogenic particles, and antibodies which specifically bind to HGBV.
Also provided are methods for producing antibodies, polyclonal or monoclonal, from the HGBV nucleic acid or amino acid sequences.


French Abstract

Séquences d'acide nucléique et d'acides aminés du virus de l'hépatite GB (VHGB) utiles pour diverses applications dans le diagnostic et le traitement, kits pour la mise en oeuvre des séquences d'acides nucléiques et d'acides aminés du VHGB, particules immunogènes du VHGB et anticorps se liant spécifiquement au VHGB. L'invention décrit également des procédés pour la production d'anticorps polyclonaux ou monoclonaux à partir des séquences d'acides nucléiques ou d'acides aminés du VHGB.

Claims

Note: Claims are shown in the official language in which they were submitted.



WHAT IS CLAIMED IS:
1. A purified polynucleotide or fragment thereof derived from hepatitis
GB virus (HGBV) capable of selectively hybridizing to the genome of HGBV or
the complement thereof.

2. The purified polynucleotide or fragment thereof of claim 1 wherein
said polynucleotide is characterized by a positive stranded RNA genome wherein
said genome comprises an open reading frame (ORF) encoding a polyprotein
wherein said polyprotein comprises an amino acid sequence having at least 35%
identity to an amino acid sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

3. The purified polynucleotide or fragment thereof of claim 1 wherein
said polynucleotide is characterized by a positive stranded RNA genome wherein
said genome comprises an open reading frame (ORF) encoding a polyprotein
wherein said polyprotein comprises an amino acid sequence having at least 40%
identity to an amino acid sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

4. The purified polynucleotide or fragment thereof of claim 1 wherein
said polynucleotide is characterized by a positive stranded RNA genome wherein
said genome comprises an open reading frame (ORF) encoding a polyprotein
wherein said polyprotein comprises an amino acid sequence having at least 60%
identity to an amino acid sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

5. A recombinant polynucleotide or fragment thereof derived from
hepatitis GB virus (HGBV) capable of selectively hybridizing to the genome of
HGBV or the complement thereof.

6. The recombinant polynucleotide of claim 5 wherein said nucleotide
comprises a sequence that encodes at least one epitope of HGBV.

7. The recombinant polynucleotide of claim 6 wherein said
recombinant nucleotide is characterized by a positive stranded RNA genome
wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity to an amino acid sequence selected from the group consisting of
HGBV-A, HGBV-B and HGBV-C.

8. The recombinant polynucleotide of claim 5 wherein said
polynucleotide is contained within a recombinant vector.

9. The polynucleotide of claim 8 further comprising a host cell
transformed with said vector.

10. A hepatitis GB virus (HGBV) recombinant polynucleotide or
fragment thereof comprising a nucleotide sequence derived from an HGBV
genome.

11. The HGBV recombinant polynucleotide of claim 10 wherein said
polynucleotide is contained within a recombinant vector.

12. The HGBV recombinant polynucleotide of claim 10 further
comprising a host cell transformed with said vector.

13. The HGBV recombinant polynucleotide of claim 10, wherein said
sequence encodes an epitope of HGBV.

14. The HGBV recombinant polynucleotide of claim 13, wherein said
sequence is characterized by a positive stranded RNA genome wherein said
genome comprises an open reading frame (ORF) encoding a polyprotein wherein
said polyprotein comprises an amino acid sequence having at least 35% identity to
an amino acid sequence selected from the group consisting of HGBV-A, HGBV-B
and HGBV-C.

15. The HGBV recombinant polynucleotide of claim 13 wherein said
polynucleotide is contained within a recombinant vector.

16. The HGBV recombinant polynucleotide of claim 15 further
comprising a host cell transformed with said vector.

17. A recombinant expression system comprising an open reading
frame of DNA or RNA derived from hepatitis GB virus (HGBV) wherein said
open reading frame comprises a sequence of HGBV genome or cDNA and
wherein said open reading frame is operably linked to a control sequence
compatible with a desired host.

18. The expression system of claim 17 further comprising a cell
transformed with said recombinant expression system.

19. The expression system of claim 18 further comprising a
polypeptide of at least about eight amino acids in length produced by said cell.

20. Purified hepatitis GB virus (HGBV).

21. The purified virus of claim 20 further comprising a preparation of
HGBV polypeptide or fragment thereof.

22. A purified polypeptide derived from hepatitis GB virus (HGBV)
comprising an amino acid sequence or fragment thereof wherein said sequence is
characterized by a positive stranded RNA genome wherein said genome comprises
an open reading frame (ORF) encoding a polyprotein wherein said polyprotein
comprises an amino acid sequence having at least 35% identity to an amino acid
sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

23. A recombinant polypeptide comprising an amino acid sequence or
fragment thereof wherein said sequence is characterized by a positive stranded
RNA genome wherein said genome comprises an open reading frame (ORF)
encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C.

24. A recombinant polypeptide comprising an amino acid sequence or
fragment thereof characterized by a positive stranded RNA genome wherein said
genome comprises an open reading frame (ORF) encoding a polyprotein wherein
said polyprotein comprises an amino acid sequence having at least 35% identity to
an amino acid sequence selected from the group consisting of HGBV-A, HGBV-B
and HGBV-C.

25. An antibody directed against at least one hepatitis GB virus
(HGBV) epitope.

26. The antibody of claim 25 wherein said antibody is polyclonal.

27. The antibody of claim 25 wherein said antibody is monoclonal.

28. A fusion polypeptide comprising at least one hepatitis GB virus
(HGBV) polypeptide or fragment thereof.

29. A particle that is immunogenic against hepatitis GB virus (HGBV)
infection, comprising a non-HGBV polypeptide having an amino acid sequence
capable of forming a particle when said sequence is produced in a eukaryotic or
prokaryotic host, and at least one HGBV epitope.

30. A polynucleotide probe for hepatitis GB virus (HGBV) wherein
said polynucleotide probe is characterized by a positive stranded RNA genome
wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity to an amino acid sequence selected from the group consisting of
HGBV-A, HGBV-B and HGBV-C.

31. An assay kit for determining the presence of hepatitis GB virus
(HGBV) antigen or antibody in a test sample comprising a container containing a
polypeptide possessing at least one HGBV epitope present in an HGBV antigen.

32. The assay kit of claim 31, wherein said polypeptide is characterized
by a positive stranded RNA genome wherein said genome comprises an open
reading frame (ORF) encoding a polyprotein wherein said polyprotein comprises
an amino acid sequence having at least 35% identity to an amino acid sequence
selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

33. The assay kit of claim 32 wherein said polypeptide is attached to a
solid phase.

34. A kit for determining the presence of hepatitis GB virus (HGBV)
antigen or antibody in a test sample comprising a container containing an antibody
which specifically binds to an HGBV antigen, wherein said antigen comprises an
HGBV epitope encoded by a sequence having at least about 60% sequence
similarity to a sequence of HGBV.

35. The kit of claim 34 wherein said antibody is attached to a solid
phase.

36. A kit for determining the presence of hepatitis GB virus (HGBV)
polynucleotides in a test sample suspected of containing said polynucleotides,
comprising a container containing a polynucleotide probe wherein said
polynucleotide probe comprises a nucleotide sequence characterized by a positive
stranded RNA genome wherein said genome comprises an open reading frame
(ORF) encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C.

37. A method for producing a polypeptide containing at least one
hepatitis GB virus (HGBV) epitope comprising incubating host cells transformed
with an expression vector comprising a sequence encoding a polypeptide
characterized by a positive stranded RNA genome wherein said genome comprises
an open reading frame (ORF) encoding a polyprotein wherein said polyprotein
comprises an amino acid sequence having at least 35% identity to an amino acid
sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

38. A method for detecting hepatitis GB virus (HGBV) nucleic acid in a
test sample suspected of containing HGBV comprising:
a. reacting the test sample with a probe for an HGBV polynucleotide
encoded by a sequence of HGBV or fragment thereof wherein said sequence is
characterized by a positive stranded RNA genome wherein said genome comprises
an open reading frame (ORF) encoding a polyprotein wherein said polyprotein
comprises an amino acid sequence having at least 35% identity to an amino acid
sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C,
under conditions and for a time which allows the formation of a complex
between the probe and the HGBV nucleic acid in the test sample;
b. detecting the complex which contains the probe.

39. The method of claim 38 further comprising the step of amplifying
the probe of step (a) by the polymerase chain reaction (PCR) technique.

40. The method of claim 38 further comprising the step of amplifying
the probe of step (a) by the ligase chain reaction (LCR) technique.

41. A method for detecting hepatitis GB virus (HGBV) antigen in a test
sample suspected of containing HGBV comprising:
a. contacting the test sample with an antibody or fragment thereof
which specifically binds to at least one HGBV antigen, for a time and under
conditions sufficient to allow the formation of antibody/antigen complexes;
b. detecting said complex containing the antibody.

42. The method of claim 41 wherein said antibody is attached to a solid
phase.

43. The method of claim 41 wherein said antibody is a monoclonal or
polyclonal antibody.

44. A method for detecting hepatitis GB virus (HGBV) antibodies in a
test sample suspected of containing said antibodies, comprising:
a. contacting the test sample with a probe polypeptide wherein said
polypeptide contains at least one HGBV epitope comprising an amino acid
sequence or fragment thereof is characterized by a positive stranded RNA genome
wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity to an amino acid sequence selected from the group consisting of
HGBV-A, HGBV-B and HGBV-C, for a time and under conditions sufficient to
allow antigen/antibody complexes to form;
b. detecting said complexes which contain the probe polypeptide.

45. The method of claim 42 wherein said probe polypeptide is attached
to a solid phase.

46. The method of claim 42 wherein said solid phase is selected from
the group consisting of beads, microtiter wells, walls of test tube, nitrocellulose
strips, magnetic beads and non-magnetic beads.




47. The method of claim 44 wherein said polypeptide is a recombinant
protein or a synthetic peptide which encodes at least one epitope of HGBV is
characterized by a positive stranded RNA genome wherein said genome comprises
an open reading frame (ORF) encoding a polyprotein wherein said polyprotein
comprises an amino acid sequence having at least 35% identity to an amino acid
sequence selected from the group consisting of HGBV-A, HGBV-B and HGBV-C.

48. The method of claim 44 wherein said sequence is characterized by a
positive stranded RNA genome wherein said genome comprises an open reading
frame (ORF) encoding a polyprotein wherein said polyprotein comprises an amino
acid sequence having at least 35% identity to an amino acid sequence selected from
the group consisting of HGBV-A, HGBV-B and HGBV-C.

49. A vaccine for treatment of hepatitis GB virus (HGBV) infection
comprising a pharmacologically effective dose of an immunogenic HGBV
polypeptide or fragment thereof which polypeptide is characterized by a positive
stranded RNA genome wherein said genome comprises an open reading frame
(ORF) encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C, in a pharmaceutically
acceptable excipient.

50. A vaccine for treatment of hepatitis GB virus (HGBV) infection
comprising an inactivated or attenuated HGBV in a pharmacologically effective
dose in a pharmaceutically acceptable excipient.

51. A tissue culture grown cell infected with hepatitis GB virus
(HGBV).

52. The tissue culture grown cell of claim 51 wherein said HGBV is
transfected into a cell.

53. The tissue culture grown cell of claim 51 wherein said HGBV
comprises a subgenomic fragment of the HGBV gene.





54. A method for producing antibodies to hepatitis GB virus (HGBV)
comprising administering to an individual an isolated immunogenic polypeptide or
fragment thereof comprising at least one HGBV epitope in an amount sufficient to
produce an immune response.

55. A synthetic peptide encoding an epitope of hepatitis GB virus
(HGBV) comprising a sequence of HGBV or fragment thereof is characterized by
a positive stranded RNA genome wherein said genome comprises an open reading
frame (ORF) encoding a polyprotein wherein said polyprotein comprises an amino
acid sequence having at least 35% identity to an amino acid sequence selected from
the group consisting of HGBV-A, HGBV-B and HGBV-C.

56. The synthetic polypeptide of claim 55 attached to a solid support.

57. A diagnostic reagent comprising a polynucleotide derived from
hepatitis GB virus (HGBV), wherein said polynucleotide or fragment thereof
encodes at least one epitope of HGBV and is characterized by a positive stranded
RNA genome wherein said genome comprises an open reading frame (ORF)
encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C.

58. A diagnostic reagent comprising a polypeptide or fragment thereof
derived from hepatitis GB virus (HGBV), wherein said polypeptide or fragment
thereof encodes at least one epitope of HGBV and is characterized by a positive
stranded RNA genome wherein said genome comprises an open reading frame
(ORF) encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C.

Description

Note: Descriptions are shown in the official language in which they were submitted.



JUMBO APPLICATIONS/PATENTS

THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE
THAN ONE VOLUME

THIS IS VOLUME 1 OF [illegible]

NOTE: For additional volumes please contact the Canadian Patent Office

WO 95/21922 PCT/US95/02118

NON-A, NON-B, NON-C, NON-D, NON-E HEPATITIS REAGENTS
AND METHODS FOR THEIR USE

This application is a continuation-in-part application of U.S. Serial No.
08/377,557 filed January 27, 1995, which is a continuation-in-part of U.S. Serial
No. 08/344,185 filed November 23, 1994 and U.S. Serial No. 08/344,190 filed
November 23, 1994, which are each continuation-in-part applications of
08/283,314 filed July 29, 1994, which is a continuation-in-part application of
U.S. Serial No. 08/242,654, filed May 13, 1994, which is a continuation-in-part
application of U.S. Serial No. 08/196,030 filed February 14, 1994, all of which
enjoy common ownership and each of which is incorporated herein by reference.

Background of the Invention
This invention relates generally to a group of infectious viral agents causing
hepatitis in man, and more particularly, relates to materials such as
polynucleotides derived from this group of viruses, polypeptides encoded
therein, antibodies which specifically bind to these polypeptides, and
diagnostics and vaccines that employ these materials.
Hepatitis is one of the most important diseases transmitted from a donor to
a recipient by transfusion of blood products, organ transplantation and
hemodialysis; it also can be transmitted via ingestion of contaminated food stuffs
and water, and by person to person contact. Viral hepatitis is known to include a
group of viral agents with distinctive viral genes and modes of replication, causing
hepatitis with differing degrees of severity of hepatic damage through different
routes of transmission. In some cases, acute viral hepatitis is clinically diagnosed
by well-defined patient symptoms including jaundice, hepatic tenderness and an
elevated level of liver transaminases such as aspartate transaminase (AST), alanine
transaminase (ALT) and isocitrate dehydrogenase (ISD). In other cases, acute
viral hepatitis may be clinically inapparent. The viral agents of hepatitis include
hepatitis A virus (HAV), hepatitis B virus (HBV), hepatitis C virus (HCV),
hepatitis delta virus (HDV), hepatitis E virus (HEV), Epstein-Barr virus (EBV)
and cytomegalovirus (CMV).
Although specific serologic assays available by the late 1960's to screen
blood donations for the presence of HBV surface antigen (HBsAg) were
successful in reducing the incidence of post-transfusion hepatitis (PTH) in blood
recipients, PTH continued to occur at a significant rate. H. J. Alter et al., Ann.
Int. Med. 77:691-699 (1972); H. J. Alter et al., Lancet ii:838-841 (1975).



Investigators began to search for a new agent, termed "non-A, non-B hepatitis"
(NANBH), that caused viral hepatitis not associated with exposure to viruses
previously known to cause hepatitis in man (HAV, HBV, CMV and EBV). See,
for example, S. M. Feinstone et al., New Engl. J. Med. 292:767-770 (1975);
Anonymous editorial, Lancet ii:64-65 (1975); F. B. Hollinger in B. N. Fields and
D. M. Knipe et al., Virology, Raven Press, New York, pp. 2239-2273 (1990).
Several lines of epidemiological and laboratory evidence have suggested
the existence of more than one parenterally transmitted NANB agent, including
multiple attacks of acute NANBH in intravenous drug users; distinct incubation
periods of patients acquiring NANBH post-transfusion; the outcome of
cross-challenge chimpanzee experiments; the ultrastructural liver pathology of
infected chimpanzees; and the differential resistance of the putative agents to
chloroform. J. L. Dienstag, Gastroenterology 85:439-462 (1983); J. L. Dienstag,
Gastroenterology 85:743-768 (1983); F. B. Hollinger et al., J. Infect. Dis.
142:400-407 (1980); D. W. Bradley in F. Chisari, ed., Advances in Hepatitis
Research, Masson, New York, pp. 268-280 (1984); and D. W. Bradley et al., J.
Infect. Dis. 148:254-265 (1983).
A serum sample obtained from a surgeon who had developed acute
hepatitis was shown to induce hepatitis when inoculated into tamarins (Saguinus
species). Four of four tamarins developed elevated liver enzymes within a few
weeks following their inoculation, suggesting that an agent in the surgeon's serum
could produce hepatitis in tamarins. Serial passage in various non-human primates
demonstrated that this hepatitis was caused by a transmissible agent; filtration
studies suggested the agent to be viral in nature. The transmissible agent
responsible for these cases of hepatitis in the surgeon and tamarins was termed the
"GB agent." F. Deinhardt et al., J. Exper. Med. 125:673-688 (1967). F.
Deinhardt et al., J. Exper. Med., supra; E. Tabor et al., J. Med. Virol. 5:103-108
(1980); R. O. Whittington et al., Viral and Immunological Diseases in Nonhuman
Primates, Alan R. Liss, Inc., New York, pp. 221-224 (1983).
Although it was suggested that the GB agent may be an agent causing
NANBH in humans and that the GB agent was not related to the known NANBH
agents studied in various laboratories, no definitive or conclusive studies on the
GB agent are known, and no viral agent has been discovered or molecularly
characterized. F. Deinhardt et al., Am. J. Med. Sci. 270:73-80 (1975); and J. L.
Dienstag et al., Nature 264:260-261 (1976). See also E. Tabor et al., J. Med.
Virol., supra; E. Tabor et al., J. Infect. Dis. 140:794-797 (1979); R. O.
Whittington et al., supra; and P. Karayiannis et al., Hepatology 9:186-192 (1989).



Early studies indicated that the GB agent was unrelated to any known
human hepatitis virus. S. M. Feinstone et al., Science 182:1026-1028 (1973); P.
J. Provost et al., Proc. Soc. Exp. Biol. Med. 148:532-539 (1975); J. L. Melnick,
Intervirology 18:105-106 (1982); A. W. Holmes et al., Nature 243:419-420
(1973); and F. Deinhardt et al., Am. J. Med. Sci., supra. However, questions
were raised regarding whether the GB agent was a virus which induced hepatitis
infection in humans, or a latent tamarin virus activated by the GB serum and once
activated, easily passaged to other tamarins, inducing hepatitis in them. Also, a
small percentage of marmosets inoculated with GB-positive serum did not develop
clinical hepatitis (4 of 52, or 7.6%), suggesting that these animals may have been
naturally immune and thus, that the GB agent may be a marmoset virus. W. P.
Parks et al., J. Infect. Dis. 120:539-547 (1969); W. P. Parks et al., J. Infect. Dis.
120:548-559 (1969). Morphological studies have been equivocal, with immune
electron microscopy studies in one report indicating that the GB agent formed
immune complexes with a size distribution of 20-22 nm and resembling the
spherical structure of a parvovirus, while another study reported that immune
electron microscopy data obtained from liver homogenates of GB-positive tamarins
indicated that aggregates of 34-36 nm with icosahedral symmetry were detected,
suggesting that the GB agent was a calici-like virus. See, for example, J. D.
Almeida et al., Nature 261:608-609 (1976); J. L. Dienstag et al., Nature, supra.
Two hepatitis-causing viruses recently have been discovered and reported:
HCV, which occurs primarily through parenteral transmission, and HEV, which is
transmitted enterically. See, for example, Q. L. Choo et al., Science 244:359-362
(1989), G. Kuo et al., Science 244:362-364 (1989), E. P. Publication No. 0 318
216 (published May 31, 1989), G. R. Reyes et al., Science 247:1335-1339
(1990). HCV is responsible for a majority of PTH ascribed to the NANBH
agent(s) and many cases of acute NANBH not acquired by transfusion.
Anonymous editorial, Lancet 335:1431-1432 (1990); J. L. Dienstag,
Gastroenterology 99:1177-1180 (1990); and M. J. Alter et al., JAMA
264:2231-2235 (1990).
While the detection of HCV antibody in donor samples eliminates 70 to
80% of NANBH infected blood in the blood supply system, the discovery and
detection of HCV has not totally prevented the transmission of hepatitis. H. Alter
et al., New Eng. J. Med. 321:1494-1500 (1989). Recent publications have
questioned whether additional hepatitis agents may be responsible for PTH and for
community acquired acute and/or chronic hepatitis that is not associated with PTH.
For example, of 181 patients monitored in a prospective clinical survey conducted
in France from 1988 to 1990, investigators noted a total of 18 cases of PTH.
Thirteen of these 18 patients tested negative for anti-HCV antibodies, HBsAg,
HBV and HCV nucleic acids. The authors speculated as to the potential
importance of a non-A, non-B, non-C agent causing PTH. V. Thiers et al., J.
Hepatology 18:34-39 (1993). Also, of 1,476 patients monitored in another study
conducted in Germany from 1985 to 1988, 22 documented cases of PTH
were not related to infection with HBV or HCV. T. Peters et al., J. Med. Virol.
39:139-145 (1993).
It would be advantageous to identify and provide materials derived from a
group of novel and unique viruses causing hepatitis, such as polynucleotides,
recombinant and synthetic polypeptides encoded therein, antibodies which
specifically bind to these polypeptides, and diagnostics and vaccines that employ
these materials. Such materials could greatly enhance the ability of the medical
community to more accurately diagnose acute and/or chronic viral hepatitis and
could provide a safer blood and organ supply by detecting non-A, non-B and
non-C hepatitis in these blood and organ donations.

Summary of the Invention
The present invention provides a purified polynucleotide or fragment
thereof derived from hepatitis GB virus (HGBV) capable of selectively hybridizing
to the genome of HGBV or the complement thereof, wherein said polynucleotide is
characterized by a positive stranded RNA genome wherein said genome comprises
an open reading frame (ORF) encoding a polyprotein wherein said polyprotein
comprises an amino acid sequence having at least 35% identity, more preferably,
40% identity, even more preferably, 60% identity, and yet more preferably, 80%
identity to an amino acid sequence selected from the group consisting of HGBV-A,
HGBV-B and HGBV-C. Also provided is a recombinant polynucleotide or
fragment thereof derived from hepatitis GB virus (HGBV) capable of selectively
hybridizing to the genome of HGBV or the complement thereof, wherein said
nucleotide comprises a sequence that encodes at least one epitope of HGBV, and
wherein said recombinant nucleotide is characterized by a positive stranded RNA
genome wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity to an amino acid sequence selected from the group consisting of
HGBV-A, HGBV-B and HGBV-C. Such a recombinant polynucleotide is
contained within a recombinant vector and further comprises a host cell
transformed with said vector.
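
The identity thresholds recited above (35%, 40%, 60% and 80%) can be illustrated with a minimal Python sketch. It assumes the two amino acid sequences have already been aligned, with gaps written as "-", and it counts identity only over columns in which neither sequence has a gap. This portion of the specification does not state which convention is used, so that convention, the function names and the toy sequences below are illustrative assumptions and not HGBV data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: gaps are written as '-' and gapped columns are ignored.
    Other conventions (e.g. dividing by the shorter sequence length) exist.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns where both sequences have a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != "-" and b != "-"]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


def meets_threshold(aligned_a: str, aligned_b: str,
                    threshold: float = 35.0) -> bool:
    """True if percent identity is at least `threshold` (default 35%)."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical toy peptides, not real HGBV-A, HGBV-B or HGBV-C sequences.
    query = "MKV-LA"
    reference = "MKIALA"
    print(percent_identity(query, reference))
    print(meets_threshold(query, reference, threshold=60.0))
```

In practice the alignment itself would come from a dedicated alignment tool; the sketch only shows how a threshold such as "at least 35% identity" could be checked once an alignment is available.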



The present invention also provides a hepatitis GB virus (HGBV)
recombinant polynucleotide or fragment thereof comprising a nucleotide sequence
derived from an HGBV genome, wherein said polynucleotide is contained within a
recombinant vector and further comprises a host cell transformed with said vector,
and further wherein said sequence encodes an epitope of HGBV. The HGBV
recombinant polynucleotide is characterized by a positive stranded RNA genome
wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity to an amino acid sequence selected from the group consisting of
HGBV-A, HGBV-B and HGBV-C. The present invention provides a
recombinant expression system comprising an open reading frame of DNA or
RNA derived from hepatitis GB virus (HGBV) wherein said open reading frame
comprises a sequence of HGBV genome or cDNA and wherein said open reading
frame is operably linked to a control sequence compatible with a desired host, and
further comprises a cell transformed with said recombinant expression system and
a polypeptide of at least about eight amino acids in length produced by said cell.
The present invention additionally provides a purified hepatitis GB virus
(HGBV) comprising a preparation of HGBV polypeptide or fragment thereof, a
recombinant polypeptide comprising an amino acid sequence or fragment thereof
wherein said sequence is characterized by a positive stranded RNA genome
wherein said genome comprises an open reading frame (ORF) encoding a
polyprotein wherein said polyprotein comprises an amino acid sequence having at
least 35% identity, more preferably 40% identity and yet more preferably 60%
identity to an amino acid sequence selected from the group consisting of HGBV-A,
HGBV-B and HGBV-C. Antibodies, both polyclonal and monoclonal, are
provided by the present invention, as well as a fusion polypeptide comprising at
least one hepatitis GB virus (HGBV) polypeptide or fragment thereof, a particle
that is immunogenic against hepatitis GB virus (HGBV) infection, comprising a
non-HGBV polypeptide having an amino acid sequence capable of forming a
particle when said sequence is produced in a eukaryotic or prokaryotic host, and at
least one HGBV epitope, and a polynucleotide probe for hepatitis GB virus
(HGBV) wherein said polynucleotide probe is characterized by a positive stranded
RNA genome wherein said genome comprises an open reading frame (ORF)
encoding a polyprotein wherein said polyprotein comprises an amino acid
sequence having at least 35% identity to an amino acid sequence selected from the
group consisting of HGBV-A, HGBV-B and HGBV-C.

WO 95/21922 Z 1 6 ~ 3 1 3 PCT/US95/02118 _


Assay kits also are provided, as well as methods for producing a
polypeptide containing at least one hepatitis GB virus (HGBV) epitope comprising
incubating host cells transformed with an expression vector comprising a sequence
encoding a polypeptide characterized by a positive stranded RNA genome wherein
said genome comprises an open reading frame (ORF) encoding a polyprotein
wherein said polyprotein comprises an amino acid sequence having at least 35%
identity to an amino acid sequence selected from the group consisting of HGBV-A,
HGBV-B and HGBV-C. Also provided are methods of detecting HGBV nucleic
acids, antigens and antibodies in test samples, including methods which utilize
solid phases, recombinant or synthetic peptides, or probes. Vaccines also are
provided by the present invention, as are tissue culture grown cells infected with
hepatitis GB virus (HGBV), a method for producing antibodies to hepatitis GB
virus (HGBV) comprising administering to an individual an isolated immunogenic
polypeptide or fragment thereof comprising at least one HGBV epitope in an
amount sufficient to produce an immune response. Diagnostic reagents also are
provided herein which comprise polynucleotides or polypeptides or fragments
thereof.

Brief Description of the Drawings
FIGURES 1-12 are graphs of individual tamarins which plot the amount of
liver enzyme (ALT or ICD) as measured in mU/ml against time (weeks post
inoculation), where ALT CO indicates the cutoff value for ALT, and ICD CO
indicates the cutoff value of ICD, wherein
FIGURE 1 shows the graph of tamarin T-1053;
FIGURE 2 shows the graph of tamarin T-1048;
FIGURE 3 shows the graph of tamarin T-1057;
FIGURE 4 shows the graph of tamarin T-1061;
FIGURE 5 shows the graph of tamarin T-1047;
FIGURE 6 shows the graph of tamarin T-1042;
FIGURE 7 shows the graph of tamarin T-1044;
FIGURE 8 shows the graph of tamarin T-1034;
FIGURE 9 shows the graph of tamarin T-1055;
FIGURE 10 shows the graph of tamarin T-1051;
FIGURE 11 shows the graph of tamarin T-1038; and
FIGURE 12 shows the graph of tamarin T-1049.
FIGURE 13 presents a flow diagram of the steps involved in
representational difference analysis (RDA), the procedure used for identifying
clones.
FIGURE 14 shows an ethidium bromide stained 2.0% agarose gel of the
products from the representational difference analysis (RDA) performed on
pre-inoculation and acute phase HGBV-infected tamarin plasma.
FIGURE 15 shows an autoradiogram from a Southern blot of genomic
DNA, amplicon DNA and products from the first three rounds of
subtraction/hybridization.
FIGURE 16 shows the same autoradiogram as described in FIGURE 15,
except that an alternative radiolabeled probe is used.
FIGURE 17 shows an ethidium bromide stained 1.5% agarose gel of
polymerase chain reaction (PCR) amplified product from genomic DNA.
FIGURE 18 shows an autoradiogram from a Southern blot of the 1.5%
agarose gel in FIGURE 17.
FIGURE 19 shows an ethidium bromide stained 1.5% agarose gel of RT-PCR
product obtained from normal human serum and pre-inoculation and acute
phase tamarin plasmas.
FIGURE 20 shows an autoradiogram from a Southern blot of the same gel
described in FIGURE 19.
FIGURES 21 A and B show autoradiograms from Northern blots of total
cellular RNA extracted from the liver of an uninfected tamarin and an HGBV-
infected tamarin.
FIGURE 22 shows a diagram that demonstrates that each of the recombinant
polynucleotide isolates is present on contiguous RNA species.
FIGURES 23 A-C show dot plot analyses of the nucleic acid sequences
wherein:
FIGURE 23A shows a dot plot comparison of HGBV-A;
FIGURE 23B shows a dot plot comparison of HGBV-B;
FIGURE 23C shows a dot plot comparison of HGBV-A v.
HGBV-B.
FIGURES 24 A-B show the conserved residues as follows:
FIGURE 24A shows the conserved residues in the putative NTP-
binding helicase domain of predicted translation products of HGBV-A, HGBV-B
and HCV-1 NS3,
FIGURE 24B shows the conserved residues of the RNA-dependent
RNA polymerase domain of predicted translation products of HGBV-A,
HGBV-B and HCV-1 NS5b.
FIGURES 25 A-B show Coomassie-stained 10% SDS-polyacrylamide
gels of CKS fusion protein whole cell lysates; three CKS fusion proteins
demonstrate immunoreactivity with HGBV-infected tamarin sera.
FIGURES 26 to 30 are graphs of individual tamarins which plot 1) the
amount of liver enzyme (ALT) as measured in mU/ml against time (weeks post
inoculation) as shown by a solid line; 2) ELISA absorbance values for the CKS-1.7
recombinant protein as shown by filled circles connected by dotted lines; 3)
ELISA absorbance values for the CKS-1.4 recombinant protein as shown by open
circles connected by dotted lines; 4) ELISA absorbance values for the CKS-4.1
recombinant protein as shown by crosses connected by dotted lines; 5) negative
PCR results using SEQ ID #21 primers as shown by empty squares; 6) positive
PCR results using SEQ ID #21 primers as shown by filled squares; 7) negative
PCR results using SEQ ID #26 primers as shown by empty diamonds; 8) positive
PCR results using SEQ ID #26 primers as shown by filled diamonds; 9)
inoculation dates are indicated by the arrowheads, wherein
FIGURE 26 shows the graph of tamarin T-1048;
FIGURE 27 shows the graph of tamarin T-1057;
FIGURE 28 shows the graph of tamarin T-1061;
FIGURE 29 shows the graph of tamarin T-1051; and
FIGURE 30 shows the graph of tamarin T-1034.
FIGURES 31-34 are graphs of human test specimens which plot 1) the
amount of liver enzyme (ALT) as measured in mU/ml against time (weeks post
inoculation) as shown by a solid line; 2) ELISA absorbance values for the CKS-1.7
recombinant protein as shown by dotted lines, filled circles; 3) ELISA
absorbance values for the CKS-1.4 recombinant protein as shown by dotted lines,
open circles, wherein
FIGURE 31 shows a graph of patient 101;
FIGURE 32 shows a graph of patient 257;
FIGURE 33 shows a graph of patient 260; and
FIGURE 34 shows a graph of patient 340.
FIGURE 35 shows conserved residues, wherein
FIGURE 35A shows the conserved residues in the putative NTP-binding
helicase domain of predicted translation products of Contig. A, Contig. B
and HCV-1 NS3, and
FIGURE 35B shows the conserved residues of the RNA-dependent
RNA polymerase domain of predicted translation products of Contig.
A, Contig. B and HCV-1 NS5b.
FIGURE 36 shows a nucleotide alignment of HGBV-A, HGBV-B,
HGBV-C and HCV-1.
FIGURE 37 shows a PhosphoImage (Molecular Dynamics, Sunnyvale,
CA) from a Southern blot of the PCR products after hybridization with the
radiolabeled probe from GB-C.
FIGURE 38 shows a nucleotide alignment of HGBV-C with two variant
clones.
FIGURE 39 presents a schematic of the assembled contig of HGBV-C.
FIGURE 40 shows a nucleotide alignment of HGBV-C with four variant
clones.
FIGURE 41 shows a PhosphoImage (Molecular Dynamics, Sunnyvale,
CA) of a Southern blot of PCR products generated from a Canadian hepatitis
patient after hybridization with radiolabeled probe from Canadian patient GB-C.5.
FIGURE 42 depicts a phylogenetic tree produced from alignment of the
helicase domains of the viruses indicated.
FIGURE 43 depicts a phylogenetic tree produced from alignment
of the RNA-dependent RNA polymerase domains of the viruses indicated.
FIGURE 44 presents a phylogenetic tree produced from alignment of the
large open reading frames (putative precursor polyproteins) of the viruses
indicated.
Detailed Description of the Invention
The present invention provides characterization of newly ascertained
etiological agents of non-A, non-B, non-C, non-D and non-E hepatitis,
collectively so-termed "Hepatitis GB Virus," or "HGBV." The present
invention provides a method for determining the presence of the HGBV etiological
agents, methods for obtaining the nucleic acid of these etiological agents created
from infected serum, plasma or liver homogenates from individuals, either humans
or tamarins, with HGBV to detect newly synthesized antigens derived from the
genome of heretofore unisolated viral agents, and of selecting clones which
produced products which are only found in infectious individuals as compared to
non-infected individuals.
Portions of the nucleic acid sequences derived from HGBV are useful as
probes to determine the presence of HGBV in test samples, and to isolate naturally
occurring variants. These sequences also make available polypeptide sequences of
HGBV antigens encoded within the HGBV genome(s) and permit the production
of polypeptides which are useful as standards or reagents in diagnostic tests and/or
as components of vaccines. Monoclonal and polyclonal antibodies directed against
at least one epitope contained within these polypeptide sequences also are useful
for diagnostic tests as well as therapeutic agents, for screening of antiviral agents,
and for the isolation of the HGBV agent from which these nucleic acid sequences
are derived. Isolation and sequencing of other portions of the HGBV genome also
can be accomplished by utilizing probes or PCR primers derived from these
nucleic acid sequences, thus allowing additional probes and polypeptides of the
HGBV to be established, which will be useful in the diagnosis and/or treatment of
HGBV, both as a prophylactic and therapeutic agent.
According to one aspect of the invention, there will be provided a purified
HGBV polynucleotide, a recombinant HGBV polynucleotide, a recombinant
polynucleotide comprising a sequence derived from an HGBV genome; a
recombinant polypeptide encoding an epitope of HGBV; a synthetic peptide
encoding an epitope of HGBV; a recombinant vector containing any of the above
described recombinant polypeptides, and a host cell transformed with any of these
vectors. These recombinant polypeptides and synthetic peptides may be used
alone or in combination, or in conjunction with other substances representing
epitopes of HGBV.
In another aspect of the invention there will be provided purified HGBV; a
preparation of polypeptides from the purified HGBV; a purified HGBV
polypeptide; a purified polypeptide comprising an epitope which is
immunologically identical with an epitope contained in HGBV.
In yet another aspect of the invention there will be provided a recombinant
expression system comprising an open reading frame (ORF) of DNA derived from
an HGBV genome or from HGBV cDNA, wherein the ORF is operably linked to a
control sequence compatible with a desired host, a cell transformed with the
recombinant expression system, and a polypeptide produced by the transformed
cell.
Additional aspects of the present invention include at least one recombinant
HGBV polypeptide, at least one recombinant polypeptide comprised of a sequence
derived from an HGBV genome or from HGBV cDNA; at least one recombinant
polypeptide comprised of an HGBV epitope and at least one fusion polypeptide
comprised of an HGBV polypeptide.
The present invention also provides methods for producing a monoclonal
antibody which specifically binds to at least one epitope of HGBV; a purified
preparation of polyclonal antibodies which specifically bind to at least one HGBV
epitope; and methods for using these antibodies, which include diagnostic,
prognostic and therapeutic uses.
In still another aspect of the invention there will be provided a particle
which immunizes against HGBV infection comprising a non-HGBV polypeptide
having an amino acid sequence capable of forming a particle when said sequence is
produced in an eukaryotic host, and an HGBV epitope.
A polynucleotide probe for HGBV also will be provided.
The present invention provides kits containing reagents which can be used
for the detection of the presence and/or amount of polynucleotides derived from
HGBV, such reagents comprising a polynucleotide probe containing a nucleotide
sequence from HGBV of about 8 or more nucleotides in a suitable container; a
reagent for detecting the presence and/or amount of an HGBV antigen comprising
an antibody directed against the HGBV antigen to be detected in a suitable
container; a reagent for detecting the presence and/or amount of antibodies directed
against an HGBV antigen comprising a polypeptide containing an HGBV epitope
present in the HGBV antigen, provided in a suitable container. Other kits for
various assay formats also are provided by the present invention as described
herein.
Other aspects of the present invention include a polypeptide comprising at
least one HGBV epitope attached to a solid phase and an antibody to an HGBV
epitope attached to a solid phase. Also included are methods for producing a
polypeptide containing an HGBV epitope comprising incubating host cells
transformed with an expression vector containing a sequence encoding a
polypeptide containing an HGBV epitope under conditions which allow expression
of the polypeptide, and a polypeptide containing an HGBV epitope produced by
this method.
The present invention also provides assays which utilize the recombinant or
synthetic polypeptides provided by the invention, as well as the antibodies
described herein in various formats, any of which may employ a signal generating
compound in the assay. Assays which do not utilize signal generating compounds
to provide a means of detection also are provided. All of the assays described
generally detect either antigen or antibody, or both, and include contacting a test
sample with at least one reagent provided herein to form at least one
antigen/antibody complex and detecting the presence of the complex. These assays
are described in detail herein.
Vaccines for treatment of HGBV infection comprising an immunogenic
peptide containing an HGBV epitope, or an inactivated preparation of HGBV, or
an attenuated preparation of HGBV, or the use of recombinant vaccines that
express HGBV epitope(s) and/or the use of synthetic peptides, also are included in
the present invention. An effective vaccine may make use of combinations of these
immunogenic peptides (such as, a cocktail of recombinant antigens, synthetic
peptides and native viral antigens administered simultaneously or at different
times); some of these may be utilized alone and be supplemented with other
representations of immunogenic epitopes at later times. Also included in the
present invention is a method for producing antibodies to HGBV comprising
administering to an individual an isolated immunogenic polypeptide containing an
HGBV epitope in an amount sufficient to produce an immune response in the
inoculated individual.
Also provided by the present invention is a tissue culture grown cell
infected with HGBV.
In yet another aspect of the present invention is provided a method for
isolating DNA or cDNA derived from the genome of an unidentified infectious
agent, which is a unique modification of representational difference analysis
(RDA), and which is described in detail hereinbelow.
Definitions
The term "Hepatitis GB Virus" or "HGBV", as used herein, collectively
denotes a viral species which causes non-A, non-B, non-C, non-D, non-E
hepatitis in man, and attenuated strains or defective interfering particles derived
therefrom. This may include acute viral hepatitis transmitted by contaminated
foodstuffs, drinking water, and the like; hepatitis due to HGBV transmitted via
person to person contact (including sexual transmission, respiratory and parenteral
routes) or via intravenous drug use. The methods as described herein will allow
the identification of individuals who have acquired HGBV. Individually, the
HGBV isolates are specifically referred to as "HGBV-A", "HGBV-B" and
"HGBV-C." As described herein, the HGBV genome is comprised of RNA.
Analysis of the nucleotide sequence and deduced amino acid sequence of the
HGBV reveals that viruses of this group have a genome organization similar to
that of the Flaviviridae family. Based primarily, but not exclusively, upon
similarities in genome organization, the International Committee on the Taxonomy
of Viruses has recommended that this family be composed of three genera:
Flavivirus, Pestivirus, and the hepatitis C group. Similarity searches at the amino
acid level reveal that the hepatitis GB virus subclones have some, albeit low,
sequence resemblance to hepatitis C virus. The information provided herein is
sufficient to allow classification of other strains of HGBV.
Several lines of evidence demonstrate that HGBV-C is not a genotype of
HCV. First, sera containing HGB-C sequences were tested for the presence of
HCV antibody. Routine detection of individuals exposed to or infected with HCV
relies upon antibody tests which utilize antigens derived from three or more
regions from HCV-1. These tests allow detection of antibodies to the known
genotypes of HCV (See, for example, Sakamoto et al., J. Gen. Virol. 75:1761-1768
(1994) and Stuyver et al., J. Gen. Virol. 74:1093-1102 (1993)). HCV-specific
ELISAs failed to detect sera containing GB-C sequences in six of eight
cases (TABLE A). Second, several human sera that were seronegative for HCV
antibodies have been shown to be positive for HCV genomic RNA by a highly
sensitive RT-PCR assay (Sugitani, Lancet 339:1018-1019 (1992)). This assay
failed to detect HCV RNA in seven of eight sera containing HGB-C sequences
and molecular assays.
The alignment of a portion of the predicted translation product of HGB-C
within the helicase region with the homologous region of HGBV-A, HGBV-B,
HCV-1 and additional members of the Flaviviridae, followed by phylogenetic
analysis of the aligned sequences suggests that HGBV-C is more closely related to
HGBV-A than to any member of the HCV group. The sequences of HGBV-C and
HGBV-A, while exhibiting an evolutionary distance of 0.42, are not as divergent
as HGBV-C is from HGBV-B, which shows an evolutionary distance of 0.92
(TABLE 33, infra.). Thus, HGBV-A and HGBV-C may be considered to be
members of one subgroup of the GB viruses and GBV-B a member of its own
subgroup. The phylogenetic analysis of the helicase sequences from various HCV
isolates shows that they form a much less diverged group, exhibiting a maximum
evolutionary distance of 0.20 (TABLE 32, infra.). A comparison of the HCV
group and the HGBV group shows a minimum evolutionary distance between any
two sequences from each group of 0.69. The distance values reported hereinabove
were used to generate a phylogenetic tree presented in FIGURE 42. The relatively
high degree of divergence among these viruses suggests that the GB viruses are
not merely types or subtypes within the hepatitis C group; rather, they constitute
their own phyletic group (or groups). Phylogenetic analysis using sequence
information derived from a small portion of HCV viral genomes has been shown
to be an acceptable method for the assignment of new isolates into genotypic
groups (Simmonds et al., Hepatology 19:1321-1324 (1994)). In the current
analysis, the use of a 110 amino acid sequence within the helicase gene from
representative HCV isolates has properly grouped them into their respective
genotypes (Simmonds et al., J. Gen. Virol. 75:1053-1061 (1994)). Therefore, the
evolutionary distances shown, in all likelihood, accurately reflect the high degree of
divergence between the GB viruses and the hepatitis C virus.
In previous applications, it was stated that "HGBV strains are identifiable
on the polypeptide level and that HGBV strains are more than 40% homologous,
preferably more than about 60% homologous, and even more preferably more than
about 80% homologous at the polypeptide level." As it is used, the term
"homologous," when referring to the degree of relatedness of two polynucleotide
or polypeptide sequences, can be ambiguous and actually implies an evolutionary
relationship. As is now the current convention in the art, the term "homologous"
is no longer used; instead the terms "similarity" and/or "identity" are used to
describe the degree of relatedness between two polynucleotide or polypeptide
sequences. The techniques for determining amino acid sequence "similarity"
and/or "identity" are well-known in the art and include, for example, directly
determining the amino acid sequence and comparing it to the sequences provided
herein; determining the nucleotide sequence of the genomic material of the putative
HGBV (usually via a cDNA intermediate), and determining the amino acid
sequence encoded therein, and comparing the corresponding regions. In general,
by "identity" is meant the exact match-up of either the nucleotide sequence of
HGBV and that of another strain(s) or the amino acid sequence of HGBV and that
of another strain(s) at the appropriate place on each genome. Also, in general, by
"similarity" is meant the exact match-up of amino acid sequence of HGBV and that
of another strain(s) at the appropriate place, where the amino acids are identical or
possess similar chemical and/or physical properties such as charge or
hydrophobicity. The programs available in the Wisconsin Sequence Analysis
Package, Version 8 (available from the Genetics Computer Group, Madison,
Wisconsin, 53711), for example, the GAP program, are capable of calculating
both the identity and similarity between two polynucleotide or two polypeptide
sequences. Other programs for calculating identity and similarity between two
sequences are known in the art.
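As a rough illustration of the distinction between "identity" and "similarity"
drawn above, the following minimal Python sketch scores two sequences that have
already been aligned. It is not the GAP program and does not reproduce its
scoring; the function name, the gap character, the choice to skip gapped
columns, and the residue groupings in SIMILAR_GROUPS are all assumptions made
for illustration only.

```python
# Hypothetical grouping of amino acids by broadly similar chemical or physical
# properties (charge, hydrophobicity). These groups are an assumption for
# illustration, not the scoring table of any particular program.
SIMILAR_GROUPS = [
    set("ILVM"),  # aliphatic, hydrophobic
    set("FYW"),   # aromatic
    set("KRH"),   # positively charged
    set("DE"),    # negatively charged
    set("ST"),    # small hydroxyl
    set("NQ"),    # amide
    set("AG"),    # small
]


def percent_identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Columns containing a gap in either sequence are skipped (an assumption).
    Identical residues count toward both identity and similarity.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


identity, similarity = percent_identity_and_similarity("MKV-LE", "MRVALD")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

In this toy alignment three of the five ungapped columns are identical and the
remaining two pair residues from the same assumed group, so similarity exceeds
identity, which is the relationship the definitions above describe.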
Additionally, the following parameters are applicable, either alone or in
combination, in identifying a strain of HGBV-A, HGBV-B or HGBV-C. It is
expected that the overall nucleotide sequence identity of the genomes between
HGBV-A, HGBV-B or HGBV-C and a strain of one of these hepatitis GB viruses
will be about 45% or greater, since it is now believed that the HGBV strains may
be genetically related, preferably about 60% or greater, and more preferably, about
80% or greater.
Also, it is expected that the overall sequence identity of the genomes
between HGBV-A and a strain of HGBV-A at the amino acid level will be about
35% or greater since it is now believed that the HGBV strains may be genetically
related, preferably about 40% or greater, more preferably, about 60% or greater,
and even more preferably, about 80% or greater. In addition, there will be
corresponding contiguous sequences of at least about 13 nucleotides, which may
be provided in combination of more than one contiguous sequence. Also, it is
expected that the overall sequence identity of the genomes between HGBV-B and
a strain of HGBV-B at the amino acid level will be about 35% or greater since it is
now believed that the HGBV strains may be genetically related, preferably about
40% or greater, more preferably, about 60% or greater, and even more preferably,
about 80% or greater. In addition, there will be corresponding contiguous
sequences of at least about 13 nucleotides, which may be provided in combination
of more than one contiguous sequence. Also, it is expected that the overall
sequence identity of the genomes between HGBV-C and a strain of HGBV-C at
the amino acid level will be about 35% or greater since it is now believed that the
HGBV strains may be genetically related, preferably about 40% or greater, more
preferably, about 60% or greater, and even more preferably, about 80% or greater.
In addition, there will be corresponding contiguous sequences of at least about 13
nucleotides, which may be provided in combination of more than one contiguous
sequence.
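The two criteria recited above (a tiered amino acid identity threshold and
shared contiguous runs of at least about 13 nucleotides) can be sketched as
follows. This is only an illustrative reading: the function names and tier
labels are hypothetical, the thresholds are treated as hard cut-offs although
the text says "about", and exact matching of 13-mers is an assumed
simplification of "corresponding contiguous sequences".

```python
def shared_kmers(seq_a, seq_b, k=13):
    """Return the set of exact k-nucleotide runs present in both sequences."""
    a = seq_a.upper()
    b = seq_b.upper()
    kmers_a = {a[i:i + k] for i in range(len(a) - k + 1)}
    kmers_b = {b[i:i + k] for i in range(len(b) - k + 1)}
    return kmers_a & kmers_b


def identity_tier(percent_identity):
    """Map an amino acid percent identity onto the tiers recited in the text.

    The labels are hypothetical names for the 35/40/60/80% thresholds.
    """
    tiers = (
        (80.0, "even more preferred"),
        (60.0, "more preferred"),
        (40.0, "preferred"),
        (35.0, "expected minimum"),
    )
    for threshold, label in tiers:
        if percent_identity >= threshold:
            return label
    return "below expected range"


# Made-up sequences for demonstration only; they are not HGBV sequences.
reference = "ACGTACGTTAGCCGATAGGCTA"
candidate = "TTTT" + reference[3:18] + "GGGG"
print(sorted(shared_kmers(reference, candidate)))
print(identity_tier(62.5))
```

A candidate strain would, under this reading, be flagged when it both reaches
one of the identity tiers and shares at least one 13-nucleotide run with the
reference.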
The compositions and methods described herein will enable the
propagation, identification, detection and isolation of HGBV and its possible
strains. Moreover, they also will allow the preparation of diagnostics and vaccines
for the possible different strains of HGBV, and will have utility in screening
procedures for anti-viral agents. The information will be sufficient to allow a viral
taxonomist to identify other strains which fall within the species. We believe that
HGBV encodes the sequences that are included herein. Methods for assaying for
the presence of these sequences are known in the art and include, for example,
amplification methods such as ligase chain reaction (LCR), polymerase chain
reaction (PCR) and hybridization. In addition, these sequences contain open
reading frames from which an immunogenic viral epitope may be found. This
epitope is unique to HGBV when compared to other known hepatitis-causing
viruses. The uniqueness of the epitope may be determined by its immunological
reactivity with HGBV and lack of immunological reactivity with Hepatitis A, B, C,
D and E viruses. Methods for determining immunological reactivity are known in
the art and include, for example, radioimmunoassay (RIA), enzyme-linked
immunosorbent assay (ELISA), hemagglutination (HA) and fluorescence polarization
immunoassay (FPIA); several examples of suitable techniques are described
herein.
A polynucleotide "derived from" a designated sequence, for example, the
HGBV cDNA, or from the HGBV genome, refers to a polynucleotide sequence
which is comprised of a sequence of at least about 6 nucleotides, is
preferably at least about 8 nucleotides, is more preferably at least about 10-12
nucleotides, and even more preferably is at least about 15-20 nucleotides
corresponding, i.e., similar to or complementary to, a region of the designated
nucleotide sequence. Preferably, the sequence of the region from which the
polynucleotide is derived is similar to or complementary to a sequence which is
unique to the HGBV genome. Whether or not a sequence is complementary to or
similar to a sequence which is unique to an HGBV genome can be determined by
techniques known to those skilled in the art. Comparisons to sequences in
databanks, for example, can be used as a method to determine the uniqueness of a
designated sequence. Regions from which sequences may be derived include but
are not limited to regions encoding specific epitopes, as well as non-translated
and/or non-transcribed regions.
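The notion of a fragment "corresponding, i.e., similar to or complementary to"
a region of a designated sequence can be illustrated with a small Python
sketch. It makes two simplifying assumptions that the text does not: it
requires an exact match rather than mere similarity, and it uses 8 nucleotides
(one of the preferred minimum lengths above) as the default cut-off. The
function names and the example sequence are hypothetical.

```python
# Translation table for complementing DNA or RNA bases; U is treated like T.
_COMPLEMENT = str.maketrans("ACGTU", "TGCAA")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence as DNA."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def corresponds_to(fragment, designated, min_len=8):
    """True if the fragment, or its reverse complement, occurs exactly in the
    designated sequence and is at least min_len nucleotides long."""
    fragment = fragment.upper().replace("U", "T")
    designated = designated.upper().replace("U", "T")
    if len(fragment) < min_len:
        return False
    return fragment in designated or reverse_complement(fragment) in designated


# Made-up designated sequence for demonstration only.
designated = "GGATCCAAGCTTGACGTCA"
print(corresponds_to("AAGCTTGACG", designated))                       # sense match
print(corresponds_to(reverse_complement("ATCCAAGCTTG"), designated))  # antisense match
print(corresponds_to("AAGCTT", designated))                           # too short
```

Uniqueness to the HGBV genome, which the text says may be assessed by
comparison to databank sequences, is outside the scope of this sketch.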
The derived polynucleotide will not necessarily be derived physically from
the nucleotide sequence of HGBV, but may be generated in any manner, including
but not limited to chemical synthesis, replication or reverse transcription or
transcription, which are based on the information provided by the sequence of
bases in the region(s) from which the polynucleotide is derived. In addition,
combinations of regions corresponding to that of the designated sequence may be
modified in ways known in the art to be consistent with an intended use.
A "polypeptide" or "amino acid sequence" derived from a designated nucleic
acid sequence or from the HGBV genome refers to a polypeptide having an amino
acid sequence identical to that of a polypeptide encoded in the sequence or a
portion thereof wherein the portion consists of at least 3 to 5 amino acids, and
more preferably at least 8 to 10 amino acids, and even more preferably 15 to 20
amino acids, or which is immunologically identifiable with a polypeptide encoded
in the sequence.
A "recombinant polypeptide" as used herein means at least a polypeptide of
genomic, semisynthetic or synthetic origin which by virtue of its origin or
manipulation is not associated with all or a portion of the polypeptide with which it
is associated in nature or in the form of a library and/or is linked to a
polynucleotide other than that to which it is linked in nature. A recombinant or
derived polypeptide is not necessarily translated from a designated nucleic acid
sequence of HGBV or from an HGBV genome. It also may be generated in any
manner, including chemical synthesis or expression of a recombinant expression
system, or isolation from mutated HGBV.
The term "synthetic peptide" as used herein means a polymeric form of
amino acids of any length, which may be chemically synthesized by methods
well-known to the routineer. These synthetic peptides are useful in various
applications.
The term "polynucleotide" as used herein means a polymeric form of
nucleotides of any length, either ribonucleotides or deoxyribonucleotides. This
term refers only to the primary structure of the molecule. Thus, the term includes
double- and single-stranded DNA, as well as double- and single-stranded RNA. It
also includes modifications, either by methylation and/or by capping, and
unmodified forms of the polynucleotide.
"HGBV containing a sequence corresponding to a cDNA" means that the
HGBV contains a polynucleotide sequence which is similar to or complementary to
a sequence in the designated DNA. The degree of similarity or complementarity to
the cDNA will be approximately 50% or greater, will preferably be at least about
70%, and even more preferably will be at least about 90%. The sequence which
corresponds will be at least about 70 nucleotides, preferably at least about 80
nucleotides, and even more preferably at least about 90 nucleotides in length. The
correspondence between the HGBV and the cDNA can be determined by methods
known in the art, and include, for example, a direct comparison of the sequenced
material with the cDNAs described, or hybridization and digestion with single
strand nucleases, followed by size determination of the digested fragments.
"Purified viral polynucleotide" refers to an HGBV genome or fragment
thereof which is essentially free, i.e., contains less than about 50%, preferably less
than about 70%, and even more preferably, less than about 90% of polypeptides
with which the viral polynucleotide is naturally associated. Techniques for
purifying viral polynucleotides are well known in the art and include, for example,
disruption of the particle with a chaotropic agent, and separation of the
polynucleotide(s) and polypeptides by ion-exchange chromatography, affinity
chromatography, and sedimentation according to density. Thus, "purified viral
polypeptide" means an HGBV polypeptide or fragment thereof which is essentially
free, that is, contains less than about 50%, preferably less than about 70%, and
even more preferably, less than about 90% of cellular components with which
the viral polypeptide is naturally associated. Methods for purifying are known to
the routineer.
"Polypeptide" as used herein indicates a molecular chain of amino acids
and does not refer to a specific length of the product. Thus, peptides,
oligopeptides, and proteins are included within the definition of polypeptide. This
term, however, is not intended to refer to post-expression modifications of the
polypeptide, for example, glycosylations, acetylations, phosphorylations and the
like.
"Recombinant host cells," "host cells," "cells," "cell lines," "cell cultures,"
and other such terms denoting microorganisms or higher eucaryotic cell lines
cultured as unicellular entities refer to cells which can be, or have been, used as
recipients for recombinant vector or other transfer DNA, and include the original
progeny of the original cell which has been transfected.
As used herein "replicon" means any genetic element, such as a plasmid, a
chromosome or a virus, that behaves as an autonomous unit of polynucleotide
replication within a cell. That is, it is capable of replication under its own control.
A "vector" is a replicon in which another polynucleotide segment is
attached, such as to bring about the replication and/or expression of the attached
segment.
The term "control sequence" refers to polynucleotide sequences which are
necessary to effect the expression of coding sequences to which they are ligated.
The nature of such control sequences differs depending upon the host organism.
In prokaryotes, such control sequences generally include promoter, ribosomal
binding site and terminators; in eukaryotes, such control sequences generally
include promoters, terminators and, in some instances, enhancers. The term
"control sequence" thus is intended to include at a minimum all components whose
presence is necessary for expression, and also may include additional components
whose presence is advantageous, for example, leader sequences.
"Operably linked" refers to a situation wherein the components described
are in a relationship permitting them to function in their intended manner. Thus,
for example, a control sequence "operably linked" to a coding sequence is ligated in
such a manner that expression of the coding sequence is achieved under conditions
compatible with the control sequences.
The term "open reading frame" or "ORF" refers to a region of a
polynucleotide sequence which encodes a polypeptide; this region may represent a
portion of a coding sequence or a total coding sequence.

A "coding sequence" is a polynucleotide sequence which is transcribed into
mRNA and/or translated into a polypeptide when placed under the control of
appropriate regulatory sequences. The boundaries of the coding sequence are
determined by a translation start codon at the 5'-terminus and a translation stop
codon at the 3'-terminus. A coding sequence can include, but is not limited to,
mRNA, cDNA, and recombinant polynucleotide sequences.
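The start-codon and stop-codon boundaries just described can be sketched as a simple open reading frame scan. This is a minimal illustration only: it assumes a cDNA (DNA-alphabet) sequence, the standard ATG start codon and the standard TAA/TAG/TGA stop codons, and scans the three forward frames of the given strand; the function name `find_orfs` and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)


def find_orfs(seq: str, min_codons: int = 2):
    """Return sorted (start, end) pairs for ATG...stop stretches.

    `end` is exclusive and includes the stop codon. Only the three
    forward reading frames of the supplied strand are scanned.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the frame
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                start = None  # look for the next start codon
    return sorted(orfs)
```

A region returned by such a scan may represent a portion of a coding sequence or a total coding sequence, consistent with the definition of "ORF" given above.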
The term "immunologically identifiable with/as" refers to the presence of
epitope(s) and polypeptide(s) which also are present in and are unique to the
designated polypeptide(s), usually HGBV proteins. Immunological identity may
be determined by antibody binding and/or competition in binding. These
techniques are known to the routineer and also are described herein. The
uniqueness of an epitope also can be determined by computer searches of known
data banks, such as GenBank, for the polynucleotide sequences which encode the
epitope, and by amino acid sequence comparisons with other known proteins.
As used herein, "epitope" means an antigenic determinant of a polypeptide.
Conceivably, an epitope can comprise three amino acids in a spatial conformation
which is unique to the epitope. Generally, an epitope consists of at least five such
amino acids, and more usually, it consists of at least eight to ten amino acids.
Methods of examining spatial conformation are known in the art and include, for
example, x-ray crystallography and two-dimensional nuclear magnetic resonance.
A polypeptide is "immunologically reactive" with an antibody when it
binds to an antibody due to antibody recognition of a specific epitope contained
within the polypeptide. Immunological reactivity may be determined by antibody
binding, more particularly by the kinetics of antibody binding, and/or by
competition in binding using as competitor(s) a known polypeptide(s) containing
an epitope against which the antibody is directed. The methods for determining
whether a polypeptide is immunologically reactive with an antibody are known in
the art.
As used herein, the term "immunogenic polypeptide containing an HGBV
epitope" means naturally occurring HGBV polypeptides or fragments thereof, as
well as polypeptides prepared by other means, for example, chemical synthesis or
the expression of the polypeptide in a recombinant organism.
The term "transformation" refers to the insertion of an exogenous
polynucleotide into a host cell, irrespective of the method used for the insertion.
For example, direct uptake, transduction, or f-mating are included. The
exogenous polynucleotide may be maintained as a non-integrated vector, for
example, a plasmid, or alternatively, may be integrated into the host genome.

"Treatment" refers to prophylaxis and/or therapy.
The term "individual" as used herein refers to vertebrates, particularly
members of the mammalian species and includes but is not limited to domestic
animals, sports animals, primates and humans; more particularly the term refers to
tamarins and humans.
The term "plus strand" (or "+") as used herein denotes a nucleic acid that
contains the sequence that encodes the polypeptide. The term "minus strand" (or
"-") denotes a nucleic acid that contains a sequence that is complementary to that of
the "plus" strand.
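The relationship between the two strands can be shown with a short sketch that derives the minus strand as the reverse complement of the plus strand. It assumes a DNA alphabet (A, C, G, T) and 5'-to-3' notation for both strands; the function name `minus_strand` is hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")  # Watson-Crick base pairing


def minus_strand(plus: str) -> str:
    """Return the complementary (minus) strand, written 5' to 3'."""
    return plus.upper().translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the original plus strand, which is a convenient sanity check on any sequence handled this way.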
"Positive stranded genome" of a virus denotes that the genome, whether
RNA or DNA, is single-stranded and encodes a viral polypeptide(s).
The term "test sample" refers to a component of an individual's body
which is the source of the analyte (such as, antibodies of interest or antigens of
interest). These components are well known in the art. These test samples include
biological samples which can be tested by the methods of the present invention
described herein and include human and animal body fluids such as whole blood,
serum, plasma, cerebrospinal fluid, urine, lymph fluids, and various external
secretions of the respiratory, intestinal and genitourinary tracts, tears, saliva,
milk, white blood cells, myelomas and the like; biological fluids such as cell
culture supernatants; fixed tissue specimens; and fixed cell specimens.
"Purified HGBV" refers to a preparation of HGBV which has been isolated
from the cellular constituents with which the virus is normally associated, and
from other types of viruses which may be present in the infected tissue. The
techniques for isolating viruses are known to those skilled in the art and include,
for example, centrifugation and affinity chromatography.
"PNA" denotes a "peptide nucleic analog" which may be utilized in a
procedure such as an assay to determine the presence of a target. PNAs are
neutrally charged moieties which can be directed against RNA targets or DNA.
PNA probes used in assays in place of, for example, DNA probes, offer
advantages not achievable when DNA probes are used. These advantages include
manufacturability, large scale labeling, reproducibility, stability, insensitivity to
changes in ionic strength and resistance to enzymatic degradation which is present
in methods utilizing DNA or RNA. These PNAs can be labeled with such signal
generating compounds as fluorescein, radionucleotides, chemiluminescent
compounds, and the like. PNAs thus can be used in methods in place of DNA or
RNA. Although assays are described herein utilizing DNA, it is within the scope
of the routineer that PNAs can be substituted for RNA or DNA with appropriate
changes if and as needed in assay reagents.
General Uses
After preparing recombinant proteins, synthetic peptides, or purified viral
polypeptides of choice as described by the present invention, the recombinant or
synthetic peptides can be used to develop unique assays as described herein to
detect either the presence of antigen or antibody to HGBV. These compositions
also can be used to develop monoclonal and/or polyclonal antibodies with a
specific recombinant protein or synthetic peptide which specifically bind to the
immunological epitope of HGBV which is desired by the routineer. Also, it is
contemplated that at least one polynucleotide of the invention can be used to
develop vaccines by following methods known in the art.
It is contemplated that the reagent employed for the assay can be provided
in the form of a test kit with one or more containers such as vials or bottles, with
each container containing a separate reagent such as a monoclonal antibody, or a
cocktail of monoclonal antibodies, or a polypeptide (either recombinant or
synthetic) employed in the assay. Other components such as buffers, controls,
and the like, known to those of ordinary skill in the art, may be included in such
test kits.
"Solid phases" ("solid supports") are known to those in the art and include
the walls of wells of a reaction tray, test tubes, polystyrene beads, magnetic beads,
nitrocellulose strips, membranes, microparticles such as latex particles, sheep (or
other animal) red blood cells, duracytes and others. The "solid phase" is not
critical and can be selected by one skilled in the art. Thus, latex particles,
microparticles, magnetic or non-magnetic beads, membranes, plastic tubes, walls
of microtiter wells, glass or silicon chips, sheep (or other suitable animal's) red
blood cells and duracytes are all suitable examples. Suitable methods for
immobilizing peptides on solid phases include ionic, hydrophobic, covalent
interactions and the like. A "solid phase", as used herein, refers to any material
which is insoluble, or can be made insoluble by a subsequent reaction. The solid
phase can be chosen for its intrinsic ability to attract and immobilize the capture
reagent. Alternatively, the solid phase can retain an additional receptor which has
the ability to attract and immobilize the capture reagent. The additional receptor can
include a charged substance that is oppositely charged with respect to the capture
reagent itself or to a charged substance conjugated to the capture reagent. As yet
another alternative, the receptor molecule can be any specific binding member
which is immobilized upon (attached to) the solid phase and which has the ability
to immobilize the capture reagent through a specific binding reaction. The receptor
molecule enables the indirect binding of the capture reagent to a solid phase
material before the performance of the assay or during the performance of the
assay. The solid phase thus can be a plastic, derivatized plastic, magnetic or
non-magnetic metal, glass or silicon surface of a test tube, microtiter well, sheet,
bead, microparticle, chip, sheep (or other suitable animal's) red blood cells, duracytes
and other configurations known to those of ordinary skill in the art.
It is contemplated and within the scope of the invention that the solid phase
also can comprise any suitable porous material with sufficient porosity to allow
access by detection antibodies and a suitable surface affinity to bind antigens.
Microporous structures are generally preferred, but materials with gel structure in
the hydrated state may be used as well. Such useful solid supports include:
natural polymeric carbohydrates and their synthetically modified, cross-linked or
substituted derivatives, such as agar, agarose, cross-linked alginic acid, substituted
and cross-linked guar gums, cellulose esters, especially with nitric acid and
carboxylic acids, mixed cellulose esters, and cellulose ethers; natural polymers
containing nitrogen, such as proteins and derivatives, including cross-linked or
modified gelatins; natural hydrocarbon polymers, such as latex and rubber;
synthetic polymers which may be prepared with suitably porous structures, such
as vinyl polymers, including polyethylene, polypropylene, polystyrene,
polyvinylchloride, polyvinylacetate and its partially hydrolyzed derivatives,
polyacrylamides, polymethacrylates, copolymers and terpolymers of the above
polycondensates, such as polyesters, polyamides, and other polymers, such as
polyurethanes or polyepoxides; porous inorganic materials such as sulfates or
carbonates of alkaline earth metals and magnesium, including barium sulfate,
calcium sulfate, calcium carbonate, silicates of alkali and alkaline earth metals,
aluminum and magnesium; and aluminum or silicon oxides or hydrates, such as
clays, alumina, talc, kaolin, zeolite, silica gel, or glass (these materials may be
used as filters with the above polymeric materials); and mixtures or copolymers of
the above classes, such as graft copolymers obtained by initializing polymerization
of synthetic polymers on a pre-existing natural polymer. All of these materials
may be used in suitable shapes, such as films, sheets, or plates, or they may be
coated onto or bonded or laminated to appropriate inert carriers, such as paper,
glass, plastic films, or fabrics.
The porous structure of nitrocellulose has excellent absorption and
adsorption qualities for a wide variety of reagents including monoclonal
antibodies. Nylon also possesses similar characteristics and also is suitable. It is
contemplated that such porous solid supports described hereinabove are preferably
in the form of sheets of thickness from about 0.01 to 0.5 mm, preferably about 0.1
mm. The pore size may vary within wide limits, and is preferably from about
0.025 to 15 microns, especially from about 0.15 to 15 microns. The surfaces of
such supports may be activated by chemical processes which cause covalent
linkage of the antigen or antibody to the support. The irreversible binding of the
antigen or antibody is obtained, however, in general, by adsorption on the porous
material by poorly understood hydrophobic forces. Suitable solid supports also
are described in U.S. Patent Application Serial No. 227,272.
The "indicator reagent" comprises a "signal generating compound" (label)
which is capable of generating and generates a measurable signal detectable by
external means conjugated (attached) to a specific binding member for HGBV.
"Specific binding member" as used herein means a member of a specific binding
pair. That is, two different molecules where one of the molecules through
chemical or physical means specifically binds to the second molecule. In addition
to being an antibody member of a specific binding pair for HGBV, the indicator
reagent also can be a member of any specific binding pair, including either
hapten-anti-hapten systems such as biotin or anti-biotin, avidin or biotin, a carbohydrate
or a lectin, a complementary nucleotide sequence, an effector or a receptor
molecule, an enzyme cofactor and an enzyme, an enzyme inhibitor or an enzyme,
and the like. An immunoreactive specific binding member can be an antibody, an
antigen, or an antibody/antigen complex that is capable of binding either to HGBV
as in a sandwich assay, to the capture reagent as in a competitive assay, or to the
ancillary specific binding member as in an indirect assay.
The various "signal generating compounds" (labels) contemplated include
chromogens, catalysts such as enzymes, luminescent compounds such as
fluorescein and rhodamine, chemiluminescent compounds such as dioxetanes,
acridiniums, phenanthridiniums and luminol, radioactive elements, and direct
visual labels. Examples of enzymes include alkaline phosphatase, horseradish
peroxidase, beta-galactosidase, and the like. The selection of a particular label is
not critical, but it will be capable of producing a signal either by itself or in
conjunction with one or more additional substances.
The present invention provides assays which utilize specific binding
members. A "specific binding member," as used herein, is a member of a specific
binding pair. That is, two different molecules where one of the molecules through
chemical or physical means specifically binds to the second molecule. Therefore,
in addition to antigen and antibody specific binding pairs of common
immunoassays, other specific binding pairs can include biotin and avidin,
carbohydrates and lectins, complementary nucleotide sequences, effector and
receptor molecules, cofactors and enzymes, enzyme inhibitors and enzymes, and
the like. Furthermore, specific binding pairs can include members that are analogs
of the original specific binding member, for example, an analyte-analog.
Immunoreactive specific binding members include antigens, antigen fragments,
antibodies and antibody fragments, both monoclonal and polyclonal, and
complexes thereof, including those formed by recombinant DNA molecules. The
term "hapten", as used herein, refers to a partial antigen or non-protein binding
member which is capable of binding to an antibody, but which is not capable of
eliciting antibody formation unless coupled to a carrier protein.
"Analyte," as used herein, is the substance to be detected which may be
present in the test sample. The analyte can be any substance for which there exists
a naturally occurring specific binding member (such as, an antibody), or for which
a specific binding member can be prepared. Thus, an analyte is a substance that
can bind to one or more specific binding members in an assay. "Analyte" also
includes any antigenic substances, haptens, antibodies, and combinations thereof.
As a member of a specific binding pair, the analyte can be detected by means of
naturally occurring specific binding partners (pairs) such as the use of intrinsic
factor protein as a member of a specific binding pair for the determination of
Vitamin B12, the use of folate-binding protein to determine folic acid, or the use of
a lectin as a member of a specific binding pair for the determination of a
carbohydrate. The analyte can include a protein, a peptide, an amino acid, a
nucleotide target, and the like.
Other embodiments which utilize various other solid phases also are
contemplated and are within the scope of this invention. For example, ion capture
procedures for immobilizing an immobilizable reaction complex with a negatively
charged polymer, described in co-pending U. S. Patent Application Serial No.
150,278 corresponding to EP publication 0326100 and U. S. Patent Application
Serial No. 375,029 (EP publication no. 0406473), can be employed according to
the present invention to effect a fast solution-phase immunochemical reaction. An
immobilizable immune complex is separated from the rest of the reaction mixture
by ionic interactions between the negatively charged poly-anion/immune complex
and the previously treated, positively charged porous matrix and detected by using
various signal generating systems previously described, including those described
in chemiluminescent signal measurements as described in co-pending U.S. Patent
Application Serial No. 921,979 corresponding to EPO Publication No. 0 273,115.

Also, the methods of the present invention can be adapted for use in
systems which utilize microparticle technology including in automated and
semi-automated systems wherein the solid phase comprises a microparticle (magnetic
or non-magnetic). Such systems include those described in pending U. S. Patent
Applications 425,651 and 425,643, which correspond to published EPO
applications Nos. EP 0 425 633 and EP 0 424 634, respectively.
The use of scanning probe microscopy (SPM) for immunoassays also is a
technology to which the monoclonal antibodies of the present invention are easily
adaptable. In scanning probe microscopy, in particular in atomic force
microscopy, the capture phase, for example, at least one of the monoclonal
antibodies of the invention, is adhered to a solid phase and a scanning probe
microscope is utilized to detect antigen/antibody complexes which may be present
on the surface of the solid phase. The use of scanning tunnelling microscopy
eliminates the need for labels which normally must be utilized in many
immunoassay systems to detect antigen/antibody complexes. Such a system is
described in pending U. S. patent application Serial No. 662,147. The use of
SPM to monitor specific binding reactions can occur in many ways. In one
embodiment, one member of a specific binding partner (analyte specific substance
which is the monoclonal antibody of the invention) is attached to a surface suitable
for scanning. The attachment of the analyte specific substance may be by
adsorption to a test piece which comprises a solid phase of a plastic or metal
surface, following methods known to those of ordinary skill in the art. Or,
covalent attachment of a specific binding partner (analyte specific substance) to a
test piece which test piece comprises a solid phase of derivatized plastic, metal,
silicon, or glass may be utilized. Covalent attachment methods are known to those
skilled in the art and include a variety of means to irreversibly link specific binding
partners to the test piece. If the test piece is silicon or glass, the surface must be
activated prior to attaching the specific binding partner. Activated silane
compounds such as triethoxy amino propyl silane (available from Sigma Chemical
Co., St. Louis, MO), triethoxy vinyl silane (Aldrich Chemical Co., Milwaukee,
WI), and (3-mercapto-propyl)-trimethoxy silane (Sigma Chemical Co., St. Louis,
MO) can be used to introduce reactive groups such as amino-, vinyl, and thiol,
respectively. Such activated surfaces can be used to link the binding partner
directly (in the cases of amino or thiol) or the activated surface can be further
reacted with linkers such as glutaraldehyde, bis (succinimidyl) suberate, SPPD
(succinimidyl 3-[2-pyridyldithio] propionate), SMCC (succinimidyl
4-[N-maleimidomethyl] cyclohexane-1-carboxylate), SIAB (succinimidyl [4-iodoacetyl]
aminobenzoate), and SMPB (succinimidyl 4-[1-maleimidophenyl] butyrate) to
separate the binding partner from the surface. The vinyl group can be oxidized to
provide a means for covalent attachment. It also can be used as an anchor for the
polymerization of various polymers such as poly acrylic acid, which can provide
multiple attachment points for specific binding partners. The amino surface can be
reacted with oxidized dextrans of various molecular weights to provide hydrophilic
linkers of different size and capacity. Examples of oxidizable dextrans include
Dextran T-40 (molecular weight 40,000 daltons), Dextran T-110 (molecular
weight 110,000 daltons), Dextran T-500 (molecular weight 500,000 daltons),
Dextran T-2M (molecular weight 2,000,000 daltons) (all of which are available
from Pharmacia), or Ficoll (molecular weight 70,000 daltons; available from
Sigma Chemical Co., St. Louis, MO). Also, polyelectrolyte interactions may be
used to immobilize a specific binding partner on a surface of a test piece by using
techniques and chemistries described by pending U. S. Patent applications Serial
No. 150,278, filed January 29, 1988, and Serial No. 375,029, filed July 7, 1989.
The preferred method of attachment is by covalent means. Following attachment
of a specific binding member, the surface may be further treated with materials
such as serum, proteins, or other blocking agents to minimize non-specific
binding. The surface also may be scanned either at the site of manufacture or point
of use to verify its suitability for assay purposes. The scanning process is not
anticipated to alter the specific binding properties of the test piece.
Various other assay formats may be used, including "sandwich"
immunoassays and probe assays. For example, the monoclonal antibodies of the
present invention can be employed in various assay systems to determine the
presence, if any, of HGBV proteins in a test sample. Fragments of these
monoclonal antibodies provided also may be used. For example, in a first assay
format, a polyclonal or monoclonal anti-HGBV antibody or fragment thereof, or a
combination of these antibodies, which has been coated on a solid phase, is
contacted with a test sample which may contain HGBV proteins, to form a
mixture. This mixture is incubated for a time and under conditions sufficient to
form antigen/antibody complexes. Then, an indicator reagent comprising a
monoclonal or a polyclonal antibody or a fragment thereof, which specifically
binds to an HGBV region, or a combination of these antibodies, to which a signal
generating compound has been attached, is contacted with the antigen/antibody
complexes to form a second mixture. This second mixture then is incubated for a
time and under conditions sufficient to form antibody/antigen/antibody complexes.
The presence of HGBV antigen present in the test sample and captured on the solid
phase, if any, is determined by detecting the measurable signal generated by the
signal generating compound. The amount of HGBV antigen present in the test
sample is proportional to the signal generated.
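Because the amount of antigen is stated to be proportional to the signal, an unknown sample could in principle be quantified against standards of known concentration. The following is a minimal sketch under that assumption: it fits a straight line to hypothetical calibrator readings by ordinary least squares and inverts it for an unknown. The function names, the use of a linear fit with an intercept (to allow for background signal) and any numeric values are illustrative and are not taken from the specification.

```python
def fit_line(concs, signals):
    """Ordinary least-squares fit of signal = slope * conc + intercept."""
    n = len(concs)
    if n < 2 or n != len(signals):
        raise ValueError("need at least two paired calibrator readings")
    mean_x = sum(concs) / n
    mean_y = sum(signals) / n
    sxx = sum((x - mean_x) ** 2 for x in concs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(concs, signals))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # background signal at zero antigen
    return slope, intercept


def estimate_concentration(signal, slope, intercept):
    """Invert the calibration line to estimate antigen concentration."""
    return (signal - intercept) / slope
```

In practice the linear range of any given label and detection system would have to be established experimentally before such an inversion is relied upon.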
Alternatively, a polyclonal or monoclonal anti-HGBV antibody or fragment
thereof, or a combination of these antibodies which is bound to a solid support, the
test sample and an indicator reagent comprising a monoclonal or polyclonal
antibody or fragments thereof, which specifically binds to HGBV antigen, or a
combination of these antibodies to which a signal generating compound is
attached are contacted to form a mixture. This mixture is incubated for a time and
under conditions sufficient to form antibody/antigen/antibody complexes. The
presence, if any, of HGBV proteins present in the test sample and captured on the
solid phase is determined by detecting the measurable signal generated by the
signal generating compound. The amount of HGBV proteins present in the test
sample is proportional to the signal generated.
In another alternate assay format, one or a combination of at least two
monoclonal antibodies of the invention can be employed as a competitive probe for
the detection of antibodies to HGBV protein. For example, HGBV proteins, either
alone or in combination, can be coated on a solid phase. A test sample suspected
of containing antibody to HGBV antigen then is incubated with an indicator
reagent comprising a signal generating compound and at least one monoclonal
antibody of the invention for a time and under conditions sufficient to form
antigen/antibody complexes of either the test sample and indicator reagent to the
solid phase or the indicator reagent to the solid phase. The reduction in binding of
the monoclonal antibody to the solid phase can be quantitatively measured. A
measurable reduction in the signal compared to the signal generated from a
confirmed negative NANB, non-C, non-D, non-E hepatitis test sample indicates
the presence of anti-HGBV antibody in the test sample.
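The comparison against a confirmed negative sample can be expressed as a percent reduction in signal. The sketch below is illustrative only: the function names are hypothetical, and the 50% cutoff is an assumed placeholder, since the text requires only a "measurable reduction" and does not state a threshold.

```python
def percent_inhibition(sample_signal: float, negative_control_signal: float) -> float:
    """Percent reduction in signal relative to the confirmed negative control."""
    if negative_control_signal <= 0:
        raise ValueError("negative control signal must be positive")
    return 100.0 * (1.0 - sample_signal / negative_control_signal)


def is_reactive(sample_signal: float, negative_control_signal: float,
                cutoff_pct: float = 50.0) -> bool:
    """Flag a sample as reactive when inhibition meets the assumed cutoff.

    cutoff_pct is a hypothetical value chosen for illustration; a real
    assay cutoff would be set from validation data.
    """
    return percent_inhibition(sample_signal, negative_control_signal) >= cutoff_pct
```

A sample whose signal is one quarter of the negative control signal would show 75% inhibition and, under the assumed cutoff, would be read as containing anti-HGBV antibody.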
In yet another detection method, each of the monoclonal or polyclonal
antibodies of the present invention can be employed in the detection of HGBV
antigens in fixed tissue sections, as well as fixed cells by immunohistochemical
analysis. Cytochemical analysis wherein these antibodies are labelled directly
(fluorescein, colloidal gold, horseradish peroxidase, alkaline phosphatase, etc.) or
are labelled by using secondary labelled anti-species antibodies (with various labels
as exemplified herein) to track the histopathology of disease also are within the
scope of the present invention.
In addition, these monoclonal antibodies can be bound to matrices similar
to CNBr-activated Sepharose and used for the affinity purification of specific
HGBV proteins from cell cultures, or biological tissues such as blood and liver
such as to purify recombinant and native viral HGBV antigens and proteins.
The monoclonal antibodies of the invention can also be used for the
generation of chimeric antibodies for therapeutic use, or other similar applications.
The monoclonal antibodies or fragments thereof can be provided
individually to detect HGBV antigens. Combinations of the monoclonal antibodies
(and fragments thereof) provided herein also may be used together as components
in a mixture or "cocktail" of at least one anti-HGBV antibody of the invention with
antibodies to other HGBV regions, each having different binding specificities.
Thus, this cocktail can include the monoclonal antibodies of the invention which
are directed to HGBV proteins and other monoclonal antibodies to other antigenic
determinants of the HGBV genome.
The polyclonal antibody or fragment thereof which can be used in the assay
formats should specifically bind to a specific HGBV region or other HGBV
proteins used in the assay. The polyclonal antibody used preferably is of
mammalian origin; human, goat, rabbit or sheep anti-HGBV polyclonal antibody
can be used. Most preferably, the polyclonal antibody is rabbit polyclonal
anti-HGBV antibody. The polyclonal antibodies used in the assays can be used either
alone or as a cocktail of polyclonal antibodies. Since the cocktails used in the
assay formats are comprised of either monoclonal antibodies or polyclonal
antibodies having different HGBV specificity, they would be useful for diagnosis,
evaluation and prognosis of HGBV infection, as well as for studying HGBV
protein differentiation and specificity.
It is contemplated and within the scope of the present invention that the
HGBV group of viruses may be detectable in assays by use of a synthetic,
recombinant or native peptide that is common to all HGBV viruses. It also is
within the scope of the present invention that different synthetic, recombinant or
native peptides identifying different epitopes from HGBV-A, HGBV-B, HGBV-C,
or yet other HGBV viruses, can be used in assay formats. In the latter case,
these can be coated onto one solid phase, or each separate peptide may be coated
on separate solid phases, such as microparticles, and then combined to form a
mixture of peptides which can be later used in assays. Such variations of assay
formats are known to those of ordinary skill in the art and are discussed
hereinbelow.
In another assay format, the presence of antibody and/or antigen to HGBV
can be detected in a simultaneous assay, as follows. A test sample is
simultaneously contacted with a capture reagent of a first analyte, wherein said
capture reagent comprises a first binding member specific for a first analyte
attached to a solid phase and a capture reagent for a second analyte, wherein said
capture reagent comprises a first binding member for a second analyte attached to a
second solid phase, to thereby form a mixture. This mixture is incubated for a
time and under conditions sufficient to form capture reagent/first analyte and
capture reagent/second analyte complexes. These so-formed complexes then are
contacted with an indicator reagent comprising a member of a binding pair specific
for the first analyte labelled with a signal generating compound and an indicator
reagent comprising a member of a binding pair specific for the second analyte
labelled with a signal generating compound to form a second mixture. This second
mixture is incubated for a time and under conditions sufficient to form capture
reagent/first analyte/indicator reagent complexes and capture reagent/second
analyte/indicator reagent complexes. The presence of one or more analytes is
determined by detecting a signal generated in connection with the complexes
formed on either or both solid phases as an indication of the presence of one or
more analytes in the test sample. In this assay format, proteins derived from
human expression systems may be utilized as well as monoclonal antibodies
produced from the proteins derived from the mammalian expression systems as
disclosed herein. Such assay systems are described in greater detail in pending
U.S. Patent Application Serial No. 07/574,821 entitled Simultaneous Assay for
Detecting One Or More Analytes, which corresponds to EP Publication No.
0473065.
In yet other assay formats, recombinant proteins and/or synthetic peptides
may be utilized to detect the presence of anti-HGBV in test samples. For example,
a test sample is incubated with a solid phase to which at least one recombinant
protein or synthetic peptide has been attached. These are reacted for a time and
under conditions sufficient to form antigen/antibody complexes. Following
incubation, the antigen/antibody complex is detected. Indicator reagents may be
used to facilitate detection, depending upon the assay system chosen. In another
assay format, a test sample is contacted with a solid phase to which a recombinant
protein or synthetic peptide produced as described herein is attached, and also is
contacted with a monoclonal or polyclonal antibody specific for the protein, which
preferably has been labelled with an indicator reagent. After incubation for a time
and under conditions sufficient for antibody/antigen complexes to form, the solid
phase is separated from the free phase, and the label is detected in either the solid
or free phase as an indication of the presence of HGBV antibody. Other assay
formats utilizing the proteins of the present invention are contemplated. These
include contacting a test sample with a solid phase to which at least one antigen
from a first source has been attached, incubating the solid phase and test sample
for a time and under conditions sufficient to form antigen/antibody complexes, and
then contacting the solid phase with a labelled antigen, which antigen is derived
from a second source different from the first source. For example, a recombinant
protein derived from a first source such as E. coli is used as a capture antigen on a
solid phase, a test sample is added to the so-prepared solid phase, and a
recombinant protein derived from a different source (i.e., non-E. coli) is utilized as
a part of an indicator reagent. Likewise, combinations of a recombinant antigen on
a solid phase and synthetic peptide in the indicator phase also are possible. Any
assay format which utilizes an antigen specific for HGBV from a first source as the
capture antigen and an antigen specific for HGBV from a different second source
is contemplated. Thus, various combinations of recombinant antigens, as well as
the use of synthetic peptides, purified viral proteins, and the like, are within the
scope of this invention. Assays such as this and others are described in U.S.
Patent No. 5,254,458, which enjoys common ownership and is incorporated
herein by reference.
Other assay systems utilize an antibody (polyclonal, monoclonal or
naturally-occurring) which specifically binds HGBV viral particles or sub-viral
particles housing the viral genome (or fragments thereof) by virtue of a contact
between the specific antibody and the viral protein (peptide, etc.). This captured
particle then can be analyzed by methods such as LCR or PCR to determine
whether the viral genome is present in the test sample. Test samples which can be
assayed according to this method include blood, liver, sputum, urine, fecal
material, saliva, and the like. The advantage of utilizing such an antigen capture
amplification method is that it can separate the viral genome from other molecules
in the test specimen by use of a specific antibody. Such a method has been
described in pending U.S. patent application Serial No. 08/141,429.
While the present invention discloses the preference for the use of solid
phases, it is contemplated that the reagents such as antibodies, proteins and
peptides of the present invention can be utilized in non-solid phase assay systems.
These assay systems are known to those skilled in the art, and are considered to be
within the scope of the present invention.
Materials and Methods
General Techniques
Conventional and well-known techniques and methods in the fields of
molecular biology, microbiology, recombinant DNA and immunology are
employed in the practice of the invention unless otherwise noted. Such techniques
are explained and detailed in the literature. See, for example, J. Sambrook et al.,
Molecular Cloning: A Laboratory Manual, 2nd edition, Cold Spring Harbor Press,
Cold Spring Harbor, N.Y. (1989); D. N. Glover, ed., DNA Cloning, Volumes I
and II (1985); M.J. Gait ed., Oligonucleotide Synthesis (1984); B.D. Hames et
al., eds., Nucleic Acid Hybridization (1984); B.D. Hames et al., eds.,
Transcription and Translation (1984); R. I. Freshney ed., Animal Cell Culture
(1986); Immobilized Cells and Enzymes, IRL Press (1986); B. Perbal, A Practical
Guide to Molecular Cloning (1984); the series, Methods in Enzymology,
Academic Press, Inc., Orlando, Florida; J. H. Miller et al., eds., Gene Transfer
Vectors For Mammalian Cells, Cold Spring Harbor Laboratory, Cold Spring
Harbor, N.Y. (1987); Wu et al., eds., Methods in Enzymology, Vol. 154 and 155;
Mayer et al., eds., Immunological Methods In Cell and Molecular Biology,
Academic Press, London (1987); Scopes, Protein Purification: Principles and
Practice, 2nd ed., Springer-Verlag, N.Y.; and D. Weir et al., eds., Handbook Of
Experimental Immunology, Volumes I-IV (1986); N. Lisitsyn et al., Science
259:946-951 (1993).
The reagents and methods of the present invention are made possible by the
provision of a family of closely related nucleotide sequences, isolated by
representational difference analysis modified as described herein, present in the
plasma, serum or liver homogenate of an HGBV infected individual, either tamarin
or human. This family of nucleotide sequences is not of human or tamarin origin,
since it will be shown that it hybridizes to neither human nor tamarin genomic
DNA from uninfected individuals, since nucleotides of this family of sequences are
present only in liver (or liver homogenates), plasma or serum of individuals
infected with HGBV, and since the sequence is not present in GenBank. In
addition, the family of sequences will show no significant identity at the nucleic
acid level to sequences contained within the HAV, HBV, HCV, HDV and HEV
genome, and low level identity, considered not significant, as translation products.
Infectious sera, plasma or liver homogenates from HGBV infected humans contain
these polynucleotide sequences, whereas sera, plasma or liver homogenates from
non-infected humans do not contain these sequences. Northern blot analysis of
infected liver with some of these polynucleotide sequences demonstrates that they
are derived from a large RNA transcript similar in size to a viral genome. Sera,
plasma or liver homogenates from HGBV-infected humans contain antibodies
which bind to this polypeptide, whereas sera, plasma or liver homogenates from
non-infected humans do not contain antibodies to this polypeptide; these antibodies
are induced in individuals following acute non-A, non-B, non-C, non-D and
non-E infection. By these criteria, it is believed that the sequence is a viral sequence,
wherein the virus causes or is associated with non-A, non-B, non-C, non-D and
non-E hepatitis.
The availability of this family of nucleic acid sequences permits the
construction of DNA probes and polypeptides useful in diagnosing non-A, non-B,
non-C, non-D, non-E hepatitis due to HGBV infections, and in screening blood
donors, donated blood, blood products and individuals for infection. For
example, from the sequence it is possible to synthesize DNA oligomers of about
eight to ten nucleotides, or larger, which are useful as hybridization probes or PCR
primers to detect the presence of the viral genome in, for example, sera of subjects
suspected of harboring the virus, or for screening donated blood for the presence
of the virus. The family of nucleic acid sequences also allows the design and
production of HGBV specific polypeptides which are useful as diagnostic reagents
for the presence of antibodies raised during infection with HGBV. Antibodies to
purified polypeptides derived from the nucleic acid sequences may also be used to
detect viral antigens in infected individuals and in blood. These nucleic acid
sequences also enable the design and production of polypeptides which may be
used as vaccines against HGBV, and also for the production of antibodies, which
then may be used for protection against the disease, and/or for therapy of HGBV
infected individuals.
The family of nucleic acid sequences also enables further characterization
of the HGBV genome. Polynucleotide probes derived from these sequences may
be used to screen genomic or cDNA libraries for additional overlapping nucleic
acid sequences which then may be used to obtain more overlapping sequences.
Unless the genome is segmented and the segments lack common sequences, this
technique may be used to gain the sequence of the entire genome. However, if the
genome is segmented, other segments of the genome can be obtained by either
repeating the RDA cloning procedure as described and modified hereinbelow or by
repeating the lambda-gt11 serological screening procedure discussed hereinbelow
to isolate the clones which will be described herein, or alternatively by isolating the
genome from purified HGBV particles.
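
The walking strategy described above, in which each newly isolated clone is
joined to the growing sequence through the region it shares with it, can be
sketched in Python. This is only an illustrative sketch under stated
assumptions: the function names (merge_overlapping, walk_genome), the
exact-match criterion and the minimum overlap of 8 nucleotides are hypothetical
choices made for the example and are not taken from this specification.

```python
from typing import Iterable, Optional


def merge_overlapping(left: str, right: str, min_overlap: int = 8) -> Optional[str]:
    """Join two sequences when a suffix of `left` exactly equals a prefix of `right`.

    Returns the merged sequence, or None when no overlap of at least
    `min_overlap` nucleotides exists (hypothetical threshold).
    """
    longest = min(len(left), len(right))
    for size in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:size]):
            return left + right[size:]
    return None


def walk_genome(seed: str, clones: Iterable[str], min_overlap: int = 8) -> str:
    """Extend `seed` in either direction with every clone that overlaps it."""
    contig = seed
    pending = list(clones)
    progress = True
    while progress and pending:
        progress = False
        for clone in list(pending):
            merged = merge_overlapping(contig, clone, min_overlap)
            if merged is None:
                merged = merge_overlapping(clone, contig, min_overlap)
            if merged is not None:
                contig = merged
                pending.remove(clone)
                progress = True
    return contig
```

Clones that share no sequence with the growing contig are simply left
unmerged, which mirrors the caveat above regarding a segmented genome whose
segments lack common sequences.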
The family of cDNA sequences and the polypeptides derived from these
sequences, as well as antibodies directed against these polypeptides, also are
useful in the isolation and identification of the HGBV etiological agent(s). For
example, antibodies directed against HGBV epitopes contained in polypeptides
derived from the nucleic acid sequences may be used in methods based upon
affinity chromatography to isolate the virus. Alternatively, the antibodies can be
used to identify viral particles isolated by other techniques. The viral antigens and
the genomic material within the isolated viral particles then may be further
characterized.
The information obtained from further sequencing of the HGBV
genome(s), as well as from further characterization of the HGBV antigens and
characterization of the genome enables the design and synthesis of additional
probes and polypeptides and antibodies which may be used for diagnosis,
prevention and therapy of HGBV induced non-A, non-B, non-C, non-D, non-E
hepatitis, and for screening of infected blood and blood-related products.
The availability of probes for HGBV, including antigens, antibodies and
polynucleotides derived from the genome from which the family of nucleic acid
sequences is derived also allows for the development of tissue culture systems
which will be of major use in elucidating the biology of HGBV. Once this is
known, it is contemplated that new treatment regimens may be developed based
upon antiviral compounds which preferentially inhibit the replication of or infection
by HGBV.
In one method used to identify and isolate the etiological agent of HGBV,
the cloning/isolation of the GB agent was achieved by modifying the published
procedure known as representational difference analysis (RDA), as reported by N.
Lisitsyn et al., Science 259: 946-951 (1993). This method is based upon the
principles of subtractive hybridization for cloning DNA differences between two
complex mammalian genomes. Briefly, in this procedure, the two genomes under
evaluation are identified generically as the "tester" (containing the target sequence
of interest) and the "driver" (representing normal DNA). Lisitsyn et al.'s
description of RDA is limited to identifying and cloning DNA differences between
complex, but similar DNA backgrounds. These differences may include any large
DNA viruses (e.g., >~5,000 base pairs of DNA) that are present in a cell line, blood,
plasma or tissue sample and absent in an uninfected cell line, blood, plasma or
tissue sample. Because previous literature suggested that HGBV may be a small
virus containing either a DNA or RNA genome of <10,000 bases, the RDA
protocol was modified so as to allow the detection of small viruses. The major
steps of the procedure are described hereinbelow and are diagrammed in FIGURE
13.
Briefly, in step 1, total nucleic acid (DNA and RNA) is isolated using
commercially available kits. RDA requires that the sample be highly matched.
Ideally, tester and driver nucleic acid samples should be obtained from the same
source (animal, human or other). It may be possible to use highly related, but
non-identical, material for the source of the tester and driver nucleic acids. Double
stranded DNA is generated from the total nucleic acid by random primed reverse
transcription of the RNA followed by random primed DNA synthesis. This
treatment converts single strand RNA viruses and single strand DNA viruses to
double strand DNA molecules which are amenable to RDA. If one chooses to
assume that an unknown virus has a DNA or an RNA genome, a DNA-only or
RNA-only extraction procedure can be employed and double-stranded DNA can be
generated as described in the art.
In step 2, the tester and driver nucleic acids are amplified to generate an
abundant amount of material which represents the total nucleic acid extracted from
the pre-inoculation and infectious plasma sources (i.e., the tester amplicon and the
driver amplicon). This is achieved by cleaving double-stranded DNA prepared as
described above with a restriction endonuclease which has a 4 bp recognition site
(such as Sau3A I). The DNA fragments are ligated to oligonucleotide adaptors (set
#1). The DNA fragments are end-filled and PCR amplified. Following PCR
amplification, the oligonucleotide adaptor (set #1) is then removed by restriction
endonuclease digestion (for example, with Sau3A I), liberating a large amount of
tester and driver nucleic acid to be used in subsequent subtractive hybridization
techniques.
In step 3, the experimental design is to enrich for DNA unique to the tester
genome. This is achieved by combining subtractive hybridization and kinetic
enrichment into a single step. Briefly, an oligonucleotide adaptor set (#2 or #3) is
ligated to the 5' ends of the tester amplicon. The tester amplicon and an excess of
driver amplicon are mixed, denatured and allowed to hybridize for 20 hours. A
large amount of the sequences that are held in common between the tester and
driver DNA will anneal during this time. In addition, sequences that are unique to
the tester amplicon will reanneal. However, because of the limited time of
hybridization, some single-stranded tester and driver DNA will remain.
In step 4, the 3' ends of the reannealed tester and driver DNA are filled in
using a thermostable DNA polymerase at elevated temperature as described in the
art. The reannealed sequences that are unique to the tester contain the ligated
adaptor on both strands of the annealed sequence. Thus, 3' end-filling of these
molecules creates sequences complementary to PCR primers on both DNA
strands. As such, these DNA species will be amplified exponentially when
subjected to PCR. In contrast, the relatively large amount of hybrid molecules
containing sequences held in common between tester and driver amplicons (i.e., one
strand was derived from the tester amplicon and one strand was derived from the
driver amplicon) will be amplified linearly when subjected to PCR. This is
because only one strand (derived from the tester amplicon) contains the ligated
adaptor sequence, and 3' end filling will only generate sequences complementary
to the PCR primer on the strand derived from the driver amplicon.
In step 5, the double-strand DNA of interest is enriched quantitatively
using PCR for 10 cycles of amplification. As stated above in step 4, reannealed
tester sequences will be amplified exponentially whereas sequences held in
common between tester and driver amplicons will be amplified linearly.
In step 6, single-strand DNA which remains is removed by a single strand
DNA nuclease digestion using mung bean nuclease as described in the art.
In step 7, double-stranded DNA which remains after nuclease digestion is
PCR amplified an additional 15 to 25 cycles.
Finally, in step 8, these DNA products are cleaved with restriction
endonuclease to remove the oligonucleotide adaptors. These DNA products can
then be subjected to subsequent rounds of amplification (beginning at step #3
using the oligonucleotide adaptor set that was not used in the previous cycle of
RDA) or cloned into a suitable plasmid vector for further analysis.
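
The difference between exponential and linear amplification relied upon in
steps 4 and 5 can be made concrete with a short calculation. The following is
an idealized sketch only: it assumes perfect doubling in every cycle and a
single starting molecule, neither of which is stated in this specification,
and the function names are hypothetical.

```python
def copies_after_pcr(cycles: int, adaptor_on_both_strands: bool, start: int = 1) -> int:
    """Idealized copy number after `cycles` rounds of PCR.

    Tester:tester duplexes carry the adaptor on both strands, so both strands
    are primed and the product doubles each cycle. Tester:driver hybrids carry
    the adaptor on one strand only, so one new strand is copied from the same
    template each cycle and the product grows by one unit per cycle.
    """
    if adaptor_on_both_strands:
        return start * 2 ** cycles   # exponential growth
    return start * (1 + cycles)      # linear growth


def fold_enrichment(cycles: int) -> float:
    """Ratio of tester-unique product to common (hybrid) product."""
    return copies_after_pcr(cycles, True) / copies_after_pcr(cycles, False)


if __name__ == "__main__":
    for n in (10, 15, 25):
        print(n, copies_after_pcr(n, True), copies_after_pcr(n, False))
```

Under these assumptions the 10 cycles of step 5 give 1024 copies of a
tester-unique duplex against 11 copies of a hybrid molecule, which illustrates
why the hybrid background becomes negligible before the nuclease treatment of
step 6.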
The RDA procedure as described supra is a modification of the
representational difference analysis known in the art. The method was modified to
isolate viral clones from pre-inoculation and infectious sera sources. These
modifications are discussed further below and relate to the preparation of
amplicons for both tester and driver DNA. First, the starting material was not
double-stranded DNA obtained from the genomic DNA of mammalian cells as
reported previously, but total nucleic acid extracted from infectious and pre-
inoculation biological blood samples obtained from tamarins. It is possible that
other biological samples (for example, organs, tissue, bile, feces or urine) could be
used as sources of nucleic acid from which tester and driver amplicons are
generated. Second, the amount of starting nucleic acid is substantially less than
that described in the art. Third, a restriction endonuclease with a 4 bp instead of a
6 bp recognition site was used. This is substantially different from the prior art.
Lisitsyn et al. teach that RDA works because the generation of amplicons (i.e.,
representations) decreases the complexity of the DNA that is being hybridized (i.e.,
subtracted).
In the prior art, restriction enzymes that have 6 bp recognition sites were
used to fragment the genome. These restriction endonucleases cleave
approximately every 4000 bp. However, the PCR conditions described in the
prior art amplify sequences <1500 bp in size. Therefore, subsequent PCR
amplification of a complex species of DNA (such as a genome) that has been
fragmented with a restriction enzyme that recognizes a 6 bp sequence results in the
generation of amplicons that contain the fraction of the DNA that was <1500 bp in
size after restriction endonuclease digestion. This reduction in DNA complexity
(estimated to be a 10- to 50-fold reduction) is reported to be necessary for the
hybridization step of RDA to work. If the complexity is not reduced, unique
sequences in the tester will not be able to efficiently hybridize during the
subtraction step, and therefore, these unique sequences will not be amplified
exponentially during the subsequent PCR steps of RDA.
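
The cutting frequency quoted above follows from simple probability. The sketch
below assumes a random sequence in which the four bases are equally frequent
and independent, which is an idealization not stated in this specification;
the function names are hypothetical.

```python
def expected_spacing(site_length: int) -> int:
    """Mean distance in bp between occurrences of one specific site of
    `site_length` bases in a random, equal-base-frequency sequence."""
    return 4 ** site_length


def prob_next_site_within(site_length: int, distance: int) -> float:
    """Probability that at least one further site starts within `distance`
    positions downstream of a given site, treating positions as independent."""
    p = 1.0 / expected_spacing(site_length)
    return 1.0 - (1.0 - p) ** distance


if __name__ == "__main__":
    for k in (4, 6):
        print(k, expected_spacing(k), round(prob_next_site_within(k, 1500), 3))
```

For a 6 bp site the expected spacing is 4096 bp, in line with the figure of
approximately 4000 bp given above, so only a minority of fragments fall under
the 1500 bp limit of the PCR conditions; for a 4 bp site the expected spacing
is 256 bp and nearly every fragment falls under that limit.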
The reduction of complexity of the nucleic acid sequences being subjected
to RDA undermines using RDA effectively to isolate relatively small viruses. The
occurrence of two 6 bp recognition sites within 1.5 kb of each other is
sufficiently rare that one might miss a small (<10 kb) virus (TABLE 1).
TABLE 1
Virus         Enzyme                  # of Fragments <1.5 kb
              BamH I                  0
(~50 kb)      Bgl II                  3
              Hind III
Parvo B19     BamH I                  0
(~5 kb)       Bgl II                  0
              Hind III                2
              Sau3A I (4 bp site)     5-7
HBV           BamH I                  1-2
(~3.2 kb)     Bgl II                  1-2
              Hind III                0
              Sau3A I (4 bp site)     12
However, we have discovered that RDA may be useful in cloning small viruses if
a more frequently cutting restriction endonuclease is used to fragment the DNA
being subjected to RDA. As shown in TABLE 1, amplicons based on 4 bp
recognition site enzymes will almost certainly contain several fragments from any
small virus, as restriction endonucleases which have 4 bp recognition sites
fragment DNA approximately every 250 base pairs. However, it is likely that
amplicons will be as complex as the source of the nucleic acid from which they
were generated because nearly all of the DNA species will be <1500 bp after
digestion with a 4 bp recognizing restriction endonuclease and thus, subject to
PCR amplification. Since the relative viral sequence copy number is predicted to
be higher than any specific or endogenous sequence copy number, the unique viral
sequences that are present in the tester amplicon should be able to form double
stranded molecules during the hybridization step (step 3, above). Therefore, these
sequences will be amplified exponentially as described above. It is reasoned that
as the relative viral sequence copy number becomes closer to that of the
background or endogenous nucleic acid sequence copy number, a restriction
endonuclease which recognizes a redundant 6 bp sequence (for example BstYI or
HincII) and cleaves approximately every 1000 bp, or the simultaneous use of
several restriction endonucleases which recognize 6 bp sequences, may be used to
fragment the DNA prior to amplification by PCR. In this way, one can moderately
reduce the complexity of the amplicons being subjected to RDA while minimizing
the risk of excluding viral sequences from the tester amplicon. The utility of this
procedure is demonstrated by the cloning of HGBV sequences from infectious
tamarin plasma described herein.
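
A count of the kind tabulated in TABLE 1 can be approximated in silico. The
sketch below is illustrative only and makes several assumptions that do not
come from this specification: the genome is treated as linear, the cut is
placed at the first base of each recognition site, only fragments bounded by a
site at both ends are counted (since only those would receive an adaptor at
both ends), and the function names are hypothetical. Recognition sequences are
written with IUPAC ambiguity codes, so that a redundant site such as RGATCY
(the commonly cited BstYI site, stated here as an assumption) can be compared
with GATC (Sau3A I).

```python
import re
from typing import List

# IUPAC codes used in the example sites; extend as needed.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}


def cut_positions(sequence: str, site: str) -> List[int]:
    """Start index of every occurrence of `site` (overlapping hits included)."""
    pattern = "".join(IUPAC[base] for base in site.upper())
    # A lookahead makes every match zero-width, so overlapping sites are found.
    return [m.start() for m in re.finditer("(?=" + pattern + ")", sequence.upper())]


def count_amplifiable_fragments(sequence: str, site: str, max_len: int = 1500) -> int:
    """Number of site-to-site fragments shorter than `max_len` bp."""
    cuts = cut_positions(sequence, site)
    lengths = [b - a for a, b in zip(cuts, cuts[1:])]
    return sum(1 for n in lengths if n < max_len)
```

Applied to a viral genome sequence, a 4 bp site would be expected to return
several fragments under 1500 bp, whereas a 6 bp site will often return none,
which is the pattern TABLE 1 reports.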
Immunoscreening to identify HGBV immunoreactive epitopes
Immunoscreening, as described hereinbelow, also provided an
additional means of identifying HGBV sequences. Pooled or individual serum,
plasma or liver homogenates from an individual meeting the criteria and within the
parameters set forth below with acute or chronic HGBV infection is used to isolate
viral particles. Nucleic acids isolated from these particles are used as the template
in the construction of a genomic and/or cDNA library to the viral genome. The
procedures used for isolation of putative HGBV particles and for constructing the
genomic and/or cDNA library in lambda-gt11 or similar systems known in the art
are discussed hereinbelow. Lambda-gt11 is a vector that has been developed
specifically to express inserted cDNAs as fusion polypeptides with beta-
galactosidase and to screen large numbers of recombinant phage with specific
antisera raised against a defined antigen. The lambda-gt11 cDNA library generated
from a cDNA pool containing cDNA is screened for encoded epitopes that can bind
specifically with sera derived from individuals who previously had experienced
non-A, non-B, non-C, non-D and non-E hepatitis. See V. Huynh et al., in D.
Glover, ed., DNA Cloning Techniques: A Practical Approach, IRL Press, Oxford,
England, pp. 49-78 (1985). Approximately 10^6 to 10^7 phage are screened, from
which positive phage are identified, purified, and then tested for specificity of
binding to sera from different individuals previously infected with the HGBV
agent. Phage which selectively bind sera or plasma from patients meeting the
criteria described hereinbelow, and not sera or plasma from patients who did not
meet these described criteria, are preferred for further study. By utilizing the
technique of isolating overlapping nucleic acid sequences, clones containing
additional upstream and downstream HGBV sequences are obtained. Analysis of the
nucleotide sequences of the HGBV nucleic acid sequences encoded within the
isolated clones is performed to determine whether the composite sequence contains
one long continuous ORF.
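
The check for one long continuous ORF can be sketched as a six-frame scan for
the longest run of codons uninterrupted by a stop codon. This is a minimal
sketch under stated assumptions: the standard stop codons TAA, TAG and TGA are
used, no start codon is required (the clones may be internal fragments of a
larger ORF), and the function names are hypothetical rather than taken from
this specification.

```python
from typing import Dict, Tuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    return sequence.upper().translate(COMPLEMENT)[::-1]


def open_frame_lengths(sequence: str) -> Dict[Tuple[str, int], int]:
    """Longest stop-free run, in codons, for each strand and reading frame."""
    result = {}
    strands = (("+", sequence.upper()), ("-", reverse_complement(sequence)))
    for strand, seq in strands:
        for frame in range(3):
            best = run = 0
            for i in range(frame, len(seq) - 2, 3):
                if seq[i:i + 3] in STOP_CODONS:
                    run = 0            # a stop codon ends the current run
                else:
                    run += 1
                    best = max(best, run)
            result[(strand, frame)] = best
    return result


def longest_open_frame(sequence: str) -> Tuple[Tuple[str, int], int]:
    """Return ((strand, frame), codons) for the longest stop-free run."""
    frames = open_frame_lengths(sequence)
    key = max(frames, key=frames.get)
    return key, frames[key]
```

A composite sequence in which one frame spans essentially the full length
without a stop codon, while the other five frames are repeatedly interrupted,
would be consistent with a single long continuous ORF.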
The sequences (and their complements) retrieved from the HGBV sequence
as provided herein, and the sequences or any portion thereof, can be prepared
using synthetic methods or by a combination of synthetic methods with retrieval of
partial sequences using methods similar to those described herein. This
description thus provides one method by which genomic or cDNA sequences
corresponding to the entire HGBV genome may be isolated. Other methods for
isolating these sequences, however, will be obvious to those skilled in the art and
are considered to be within the scope of the present invention.
Deposit of Strains.
Strains replicated (clones 2, 4, 10, 16, 18, 23 and 50) from the HGBV
nucleic acid sequence library have been deposited at the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, as of February
10, 1994, under the terms of the Budapest Treaty and will be maintained for a
period of thirty (30) years from the date of deposit, or for five (5) years after the
last request for the deposit, or for the enforceable period of the U.S. patent,
whichever is longer. The deposits and any other deposited material described
herein are provided for convenience only, and are not required to practice the
present invention in view of the teachings provided herein. The HGBV cDNA
sequences in all of the deposited materials are incorporated herein by reference.
The plasmids were accorded the following A.T.C.C. deposit numbers: Clone 2
was accorded A.T.C.C. Deposit No. 69556; Clone 4 was accorded A.T.C.C.
Deposit No. 69557; Clone 10 was accorded A.T.C.C. Deposit No. 69558; Clone
16 was accorded A.T.C.C. Deposit No. 69559; Clone 18 was accorded A.T.C.C.
Deposit No. 69560; Clone 23 was accorded A.T.C.C. Deposit No. 69561; and
Clone 50 was accorded A.T.C.C. Deposit No. 69562.
Strains replicated (clones 11, 13, 48 and 119) from the HGBV nucleic acid
sequence library have been deposited at the American Type Culture Collection,
12301 Parklawn Drive, Rockville, Maryland 20852, as of April 29, 1994, under
the terms of the Budapest Treaty and will be maintained for a period of thirty (30)
years from the date of deposit, or for five (5) years after the last request for the
deposit, or for the enforceable period of the U.S. patent, whichever is longer. The
deposits and any other deposited material described herein are provided for
convenience only, and are not required to practice the present invention in view of
the teachings provided herein. The HGBV cDNA sequences in all of the deposited
materials are incorporated herein by reference. The plasmids were accorded the
following A.T.C.C. deposit numbers: Clone 11 was accorded A.T.C.C. Deposit
No. 69613; Clone 13 was accorded A.T.C.C. Deposit No. 69611; Clone 48
was accorded A.T.C.C. Deposit No. 69610; and Clone 119 was accorded
A.T.C.C. Deposit No. 69612.
Additional strains (clones 4-B1.1, 66-3A1.49, 70-3A1.37 and 78-1C1.17)
from the HGBV nucleic acid sequence library have been deposited at the American
Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852, as
of July 28, 1994, under the terms of the Budapest Treaty and will be maintained
for a period of thirty (30) years from the date of deposit, or for five (5) years after
the last request for the deposit, or for the enforceable period of the U.S. patent,
whichever is longer. The deposits and any other deposited material described
herein are provided for convenience only, and are not required to practice the
present invention in view of the teachings provided herein. The HGBV cDNA
sequences in all of the deposited materials are incorporated herein by reference.
The plasmids were accorded the following A.T.C.C. deposit numbers: Clone
4-B1.1 was accorded A.T.C.C. Deposit No. 69666; Clone 66-3A1.49 was
accorded A.T.C.C. Deposit No. 69665; Clone 70-3A1.37 was accorded A.T.C.C.
Deposit No. 69664; and Clone 78-1C1.17 was accorded A.T.C.C. Deposit No.
69663.
Clone pHGBV-C clone #1 was deposited at the American Type Culture
Collection, 12301 Parklawn Drive, Rockville, Maryland 20852 as of November 8,
1994, under the terms of the Budapest Treaty and will be maintained for a period
of thirty (30) years from the date of deposit, or for five (5) years after the last
request for the deposit, or for the enforceable period of the U.S. patent, whichever
is longer. The deposits and any other deposited material described herein are
provided for convenience only, and are not required to practice the present
invention in view of the teachings provided herein. pHGBV-C clone #1 was
accorded A.T.C.C. Deposit No. 69711. The HGBV cDNA sequences in all of the
deposited materials are incorporated herein by reference.
Preparation of Viral Polypeptides and Fragments
The availability of nucleic acid sequences permits the construction of
expression vectors encoding antigenically active regions of the polypeptide
encoded in either strand. These antigenically active regions may be derived from
structural regions of the virus, including, for example, envelope (coat) or core
antigens, in addition to nonstructural regions of the virus, including, for example,
polynucleotide binding proteins, polynucleotide polymerase(s), and other viral
proteins necessary for replication and/or assembly of the viral particle. Fragments
encoding the desired polypeptides are derived from the genomic or cDNA clones
using conventional restriction digestion or by synthetic methods, and are ligated
into vectors which may, for example, contain portions of fusion sequences such as
beta-galactosidase (β-gal) or superoxide dismutase (SOD) or CMP-KDO
synthetase (CKS). Methods and vectors which are useful for the production of
polypeptides which contain fusion sequences of SOD are described in EPO
0196056, published October 1, 1986, and those of CKS are described in EPO
Publication No. 0331961, published September 13, 1989. Any desired portion of
the nucleic acid sequence containing an open reading frame, in either sense strand,
can be obtained as a recombinant protein, such as a mature or fusion protein;
alternatively, a polypeptide encoded in the HGBV genome or cDNA can be
provided by chemical synthesis.
The nucleic acid sequence encoding the desired polypeptide, whether in
fused or mature form, and whether or not containing a signal sequence to permit
secretion, may be ligated into expression vectors suitable for any convenient host.
Both eucaryotic and prokaryotic host systems are used in the art to form
recombinant proteins, and some of these are listed herein. The polypeptide then is
isolated from lysed cells or from the culture medium and purified to the extent
needed for its intended use. Purification can be performed by techniques known in
the art, and include salt fractionation, chromatography on ion exchange resins,
affinity chromatography, centrifugation, among others. Such polypeptides may be
used as diagnostic reagents, or for passive immunotherapy. In addition,
antibodies to these polypeptides are useful for isolating and identifying HGBV
particles. The HGBV antigens also may be isolated from HGBV virions. These
virions can be grown in HGBV infected cells in tissue culture, or in an infected
individual.
Preparation of Antigenic Polypeptides and Conjugation With Solid Phase
An antigenic region or fragment of a polypeptide generally is relatively
small, usually about 8 to 10 amino acids or less in length. Fragments of as few as
5 amino acids may characterize an antigenic region. These segments may
correspond to regions of HGBV antigen. By using the HGBV genomic or cDNA
sequences as a basis, nucleic acid sequences encoding short segments of HGBV
polypeptides can be expressed recombinantly either as fusion proteins or as
isolated polypeptides. These short amino acid sequences also can be obtained by
chemical synthesis. The small chemically synthesized polypeptides may be linked
to a suitable carrier molecule when the synthesized polypeptide provided is
correctly configured to provide the correct epitope but too small to be antigenic.
Linking methods are known in the art and include but are not limited to using
N-succinimidyl-3-(2-pyridylthio)propionate (SPDP) and succinimidyl
4-(N-maleimidomethyl)cyclohexane-1-carboxylate (SMCC). Polypeptides lacking
sulfhydryl groups can be modified by adding a cysteine residue. These reagents
create a disulfide linkage between themselves and peptide cysteine residues on one
protein and an amide linkage through the epsilon-amino on a lysine, or other free
amino group in the other. A variety of such disulfide/amide-forming agents are
known. Other bifunctional coupling agents form a thioether rather than a disulfide
linkage. Many of these thioether-forming agents are commercially available and
are known to those of ordinary skill in the art. The carboxyl groups can be
activated by combining them with succinimide or 1-hydroxyl-2-nitro-4-sulfonic
acid, sodium salt. Any carrier which does not itself induce the production of
antibodies harmful to the host can be used. Suitable carriers include proteins,
polysaccharides such as latex functionalized sepharose, agarose, cellulose,
cellulose beads, polymeric amino acids such as polyglutamic acid, polylysine,
amino acid copolymers and inactive virus particles, among others. Examples of
protein substrates include serum albumins, keyhole limpet hemocyanin,
immunoglobulin molecules, thyroglobulin, ovalbumin, tetanus toxoid, and yet
other proteins known to those skilled in the art.
Preparation of Hybrid Particle Immunogens Containing HGBV Epitopes
The immunogenicity of HGBV epitopes also may be enhanced by
preparing them in mammalian or yeast systems fused with or assembled with
particle-forming proteins such as those associated with HBV surface antigen.
Constructs wherein the HGBV epitope is linked directly to the particle-forming
protein coding sequences produce hybrids which are immunogenic with respect to
the HGBV epitope. In addition, all of the vectors prepared include epitopes
specific for HGBV, having varying degrees of immunogenicity. Particles
constructed from particle forming protein which include HGBV sequences are
immunogenic with respect to HGBV and HBV.
Hepatitis B surface antigen has been determined to be formed and
assembled into particles in S. cerevisiae and mammalian cells; the formation of
these particles has been reported to enhance the immunogenicity of the monomer
subunit. P. Valenzuela et al., Nature 298:334 (1982); P. Valenzuela et al., in I.
Millman et al., eds., Hepatitis B, Plenum Press, pp. 225-236 (1984). The
constructs may include immunodominant epitopes of HBsAg. Such constructs
have been reported expressible in yeast, and hybrids including heterologous viral
sequences for yeast expression have been disclosed. See, for example, EPO 174,444
and EPO 174,261. These constructs also have been reported capable of being
expressed in mammalian cells such as Chinese hamster ovary (CHO) cells.
Michelle et al., International Symposium on Viral Hepatitis, 1984. In HGBV,
portions of the particle-forming protein coding sequence may be replaced with
codons encoding an HGBV epitope. In this replacement, regions that are not
required to mediate the aggregation of the units to form immunogenic particles in
yeast or mammals can be deleted, thus eliminating additional HGBV antigenic sites
from competition with the HGBV epitope.
Vaccine Preparation
Vaccines may be prepared from one or more immunogenic polypeptides or
nucleic acids derived from HGBV nucleic acid sequences or from the HGBV
genome to which they correspond. Vaccines may comprise recombinant
polypeptides containing epitope(s) of HGBV. These polypeptides may be
expressed in bacteria, yeast or mammalian cells, or alternatively may be isolated
from viral preparations. It also is anticipated that various structural proteins may
contain epitopes of HGBV which give rise to protective anti-HGBV antibodies.
Synthetic peptides therefore also can be utilized when preparing these vaccines.
Thus, polypeptides containing at least one epitope of HGBV may be used, either
singly or in combinations, in HGBV vaccines. It also is contemplated that
nonstructural proteins as well as structural proteins may provide protection against
viral pathogenicity, even if they do not cause the production of neutralizing
antibodies.
Considering the above, multivalent vaccines against HGBV may comprise
one or more structural proteins, and/or one or more nonstructural proteins. These
vaccines may be comprised of, for example, recombinant HGBV polypeptides
and/or polypeptides isolated from the virions and/or synthetic peptides. These
immunogenic epitopes can be used in combinations, i.e., as a mixture of
recombinant proteins, synthetic peptides and/or polypeptides isolated from the
virion; these may be administered at the same or different time. Additionally, it
may be possible to use inactivated HGBV in vaccines. Such inactivation may be
by preparation of viral lysates, or by other means known in the art to cause
inactivation of hepatitis-like viruses, for example, treatment with organic solvents
or detergents, or treatment with formalin. Attenuated HGBV strain preparation
also is disclosed in the present invention. It is contemplated that some of the
proteins in HGBV may cross-react with other known viruses, and thus that shared
epitopes may exist between HGBV and other viruses which would then give rise
to protective antibodies against one or more of the disorders caused by these
pathogenic agents. It is contemplated that it may be possible to design multiple
purpose vaccines based upon this belief.
The preparation of vaccines which contain at least one immunogenic
peptide as an active ingredient is known to one skilled in the art. Typically, such
vaccines are prepared as injectables, either as liquid solutions or suspensions; solid
forms suitable for solution in or suspension in liquid prior to injection also may be
prepared. The preparation may be emulsified or the protein may be encapsulated in
liposomes. The active immunogenic ingredients often are mixed with
pharmacologically acceptable excipients which are compatible with the active
ingredient. Suitable excipients include but are not limited to water, saline,
dextrose, glycerol, ethanol and the like; combinations of these excipients in various
amounts also may be used. The vaccine also may contain small amounts of
auxiliary substances such as wetting or emulsifying reagents, pH buffering agents,
and/or adjuvants which enhance the effectiveness of the vaccine. For example,
such adjuvants can include aluminum hydroxide,
N-acetyl-muramyl-L-threonyl-D-isoglutamine (thr-MDP),
N-acetyl-normuramyl-L-alanyl-D-isoglutamine (CGP 11687, also referred to as
nor-MDP),
N-acetylmuramyl-L-alanyl-D-isoglutaminyl-L-alanine-2-(1',2'-dipalmitoyl-sn-glycero-3-hydroxyphosphoryloxy)-ethylamine
(CGP 19835A, also referred to as MTP-PE), and RIBI (MPL + TDM + CWS) in a
2% squalene/Tween-80 emulsion. The effectiveness of an adjuvant
may be determined by measuring the amount of antibodies directed against an
immunogenic polypeptide containing an HGBV antigenic sequence resulting from
administration of this polypeptide in vaccines which also are comprised of the
various adjuvants.
The vaccines usually are administered by intravenous or intramuscular
injection. Additional formulations which are suitable for other modes of
administration include suppositories and, in some cases, oral formulations. For
suppositories, traditional binders and carriers may include but are not limited to
polyalkylene glycols or triglycerides. Such suppositories may be formed from
mixtures containing the active ingredient in the range of about 0.5% to about 10%,
preferably, about 1% to about 2%. Oral formulations include such normally
employed excipients as, for example, pharmaceutical grades of mannitol, lactose,
starch, magnesium stearate, sodium saccharine, cellulose, magnesium carbonate
and the like. These compositions may take the form of solutions, suspensions,
tablets, pills, capsules, sustained release formulations or powders and contain
about 10% to about 95% of active ingredient, preferably about 25% to about 70%.
The proteins used in the vaccine may be formulated into the vaccine as
neutral or salt forms. Pharmaceutically acceptable salts include acid addition salts
(formed with free amino groups of the peptide), which are formed with
inorganic acids such as hydrochloric or phosphoric acids, or with organic acids
such as acetic, oxalic, tartaric, maleic, and others known to those skilled in the art.
Salts formed with the free carboxyl groups also may be derived from inorganic
bases such as sodium, potassium, ammonium, calcium or ferric hydroxides and
the like, and from organic bases such as isopropylamine, trimethylamine,
2-ethylamino ethanol, histidine, procaine, and others known to those skilled in the
art.
Vaccines are administered in a way compatible with the dosage
formulation, and in such amounts as will be prophylactically and/or therapeutically
effective. The quantity to be administered generally is in the range of about 5
micrograms to about 250 micrograms of antigen per dose, and depends upon the
subject to be dosed, the capacity of the subject's immune system to synthesize
antibodies, and the degree of protection sought. Precise amounts of active
ingredient required to be administered also may depend upon the judgment of the
practitioner and may be unique to each subject. The vaccine may be given in a
single or multiple dose schedule. A multiple dose is one in which a primary course
of vaccination may be with one to ten separate doses, followed by other doses
given at subsequent time intervals required to maintain and/or to reinforce the
immune response, for example, at one to four months for a second dose, and if
required by the individual, a subsequent dose(s) after several months. The dosage
regimen also will be determined, at least in part, by the need of the individual, and
be dependent upon the practitioner's judgment. It is contemplated that the vaccine
containing the immunogenic HGBV antigen(s) may be administered in conjunction
with other immunoregulatory agents, for example, with immune globulins.
Preparation of Antibodies Against HGBV Epitopes
The immunogenic peptides prepared as described herein are used to
produce antibodies, either polyclonal or monoclonal. When preparing polyclonal
antibodies, a selected mammal (for example, a mouse, rabbit, goat, horse or the
like) is immunized with an immunogenic polypeptide bearing at least one HGBV
epitope. Serum from the immunized animal is collected after an appropriate
incubation period and treated according to known procedures. If serum containing
polyclonal antibodies to an HGBV epitope contains antibodies to other antigens,
the polyclonal antibodies can be purified by, for example, immunoaffinity
chromatography. Techniques for producing and processing polyclonal antibodies
are known in the art and are described in, among others, Mayer and Walker, eds.,
Immunochemical Methods In Cell and Molecular Biology, Academic Press,
London (1987). Polyclonal antibodies also may be obtained from a mammal
previously infected with HGBV. An example of a method for purifying antibodies
to HGBV epitopes from serum of an individual infected with HGBV using affinity
chromatography is provided herein.
Monoclonal antibodies directed against HGBV epitopes also can be
produced by one skilled in the art. The general methodology for producing such
antibodies is well-known and has been described in, for example, Kohler and
Milstein, Nature 256:494 (1975) and reviewed in J.G.R. Hurrel, ed., Monoclonal
Hybridoma Antibodies: Techniques and Applications, CRC Press Inc., Boca
Raton, FL (1982), as well as that taught by L. T. Mimms et al., Virology
176:604-619 (1990). Immortal antibody-producing cell lines can be created by cell
fusion, and also by other techniques such as direct transformation of B lymphocytes
with oncogenic DNA, or transfection with Epstein-Barr virus. See also, M. Schreier
et al., Hybridoma Techniques, Scopes (1980) Protein Purification, Principles and
Practice, 2nd Edition, Springer-Verlag, New York (1984); Hammerling et al.,
Monoclonal Antibodies and T-Cell Hybridomas (1981); Kennet et al., Monoclonal
Antibodies (1980). Examples of uses and techniques of monoclonal antibodies are
disclosed in U.S. patent applications Serial Nos. 748,292; 748,563; 610,175;
648,473; 648,477; and 648,475.
Monoclonal and polyclonal antibodies thus developed, directed against
HGBV epitopes, are useful in diagnostic and prognostic applications, and also,
those which are neutralizing are useful in passive immunotherapy. Monoclonal
antibodies especially can be used to produce anti-idiotype antibodies. These
anti-idiotype antibodies are immunoglobulins which carry an "internal image" of the
antigen of the infectious agent against which protection is desired. See, for
example, A. Nisonoff et al., Clin. Immunol. Immunopath. 21:397-406 (1981),
and Dreesman et al., J. Infect. Dis. 151:761 (1985). Techniques for raising such
idiotype antibodies are known in the art and exemplified, for example, in Grych et
al., Nature 316:74 (1985); MacNamara et al., Science 226:1325 (1984); and
Uytdehaag et al., J. Immunol. 134:1225 (1985). These anti-idiotypic antibodies
also may be useful for treatment of HGBV infection, as well as for elucidation of
the immunogenic regions of HGBV antigens.
Diagnostic Oligonucleotide Probes and Kits

Using determined portions of the isolated HGBV nucleic acid sequences as
a basis, oligomers of approximately eight nucleotides or more can be prepared,
either by excision or synthetically, which hybridize with the HGBV genome and
are useful in identification of the viral agent(s), further characterization of the viral
genome, as well as in detection of the virus(es) in diseased individuals. The
natural or derived probes for HGBV polynucleotides are a length which allows the
detection of unique viral sequences by hybridization. While six to eight
nucleotides may be a workable length, sequences of ten to twelve nucleotides are
preferred, and those of about 20 nucleotides may be most preferred. These
sequences preferably will derive from regions which lack heterogeneity. These
probes can be prepared using routine, standard methods including automated
oligonucleotide synthetic methods. A complement of any unique portion of the
HGBV genome will be satisfactory. Complete complementarity is desirable for
use as probes, although it may be unnecessary as the length of the fragment is
increased.
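As an illustration only, the derivation of a fully complementary probe from a chosen region of a target sequence can be sketched as follows. The function names and the placeholder sequence are hypothetical and are not taken from the disclosure; the placeholder is an invented string and is not an HGBV sequence. The sketch assumes a DNA target written 5' to 3' and enforces only the approximate eight-nucleotide lower limit discussed above.

```python
# Hypothetical helpers for illustration; not part of the disclosed methods.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def design_probe(target: str, start: int, length: int = 20) -> str:
    """Return a probe fully complementary to target[start:start + length].

    The default of 20 nucleotides follows the length the text calls most
    preferred; lengths under 8 are rejected.
    """
    if length < 8:
        raise ValueError("probe is shorter than about eight nucleotides")
    region = target[start:start + length]
    if len(region) < length:
        raise ValueError("requested region runs past the end of the target")
    return reverse_complement(region)


if __name__ == "__main__":
    # Invented placeholder sequence, used only to exercise the functions.
    placeholder_target = "ATGGCTAGCTTAGGCCATCGATCGGATTACGCTAGC"
    print(design_probe(placeholder_target, start=5, length=20))
```

Whether a region is unique to the viral genome or lacks heterogeneity cannot be decided by such a helper; that selection would have to be made from sequence comparisons before the probe is derived.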
When used as diagnostic reagents, the test sample to be analyzed, such as
blood or serum, may be treated such as to extract the nucleic acids contained
therein. The resulting nucleic acid from the sample may be subjected to gel
electrophoresis or other size separation techniques; or, the nucleic acid sample may
be dot-blotted without size separation. The probes then are labelled. Suitable
labels and methods for attaching labels to probes are known in the art, and include
but are not limited to radioactive labels incorporated by nick translation or
kinasing, biotin, fluorescent and chemiluminescent probes. Examples of many of
these labels are disclosed herein. The nucleic acids extracted from the sample then
are treated with the labelled probe under hybridization conditions of suitable
stringencies.
The probes can be made completely complementary to the HGBV genome.
Therefore, usually high stringency conditions are desirable in order to prevent false
positives. However, conditions of high stringency should be used only if the
probes are complementary to regions of the HGBV genome which lack
heterogeneity. The stringency of hybridization is determined by a number of
factors during the washing procedure, including temperature, ionic strength, length
of time and concentration of formamide. See, for example, J. Sambrook (supra).
Hybridization can be carried out by a number of various techniques. Amplification
can be performed, for example, by Ligase Chain Reaction (LCR), Polymerase
Chain Reaction (PCR), Q-beta replicase, NASBA, etc.
It is contemplated that the HGBV genome sequences may be present in
serum of infected individuals at relatively low levels, for example, approximately
10^2 to 10^3 sequences per ml. This level may require that amplification techniques be
used in hybridization assays, such as the Ligase Chain Reaction or the Polymerase
Chain Reaction. Such techniques are known in the art. For example, the
"Bio-Bridge" system uses terminal deoxynucleotide transferase to add unmodified
3'-poly-dT-tails to a nucleic acid probe (Enzo Biochem. Corp.). The poly dT-tailed
probe is hybridized to the target nucleotide sequence, and then to a biotin-modified
poly-A. Also, in EP 124221 there is described a DNA hybridization assay
wherein the analyte is annealed to a single-stranded DNA probe that is
complementary to an enzyme-labelled oligonucleotide, and the resulting tailed
duplex is hybridized to an enzyme-labelled oligonucleotide. EP 204510 describes
a DNA hybridization assay in which analyte DNA is contacted with a probe that
has a tail, such as a poly-dT-tail, an amplifier strand that has a sequence that
hybridizes to the tail of the probe, such as a poly-A sequence, and which is
capable of binding a plurality of labelled strands. The technique first may involve
amplification of the target HGBV sequences in sera to approximately 10^6
sequences/ml. This may be accomplished by following the methods described by
Saiki et al., Nature 324:163 (1986). The amplified sequence(s) then may be
detected using a hybridization assay such as those known in the art. The probes
can be packaged in diagnostic kits which include the probe nucleic acid sequence
which sequence may be labelled; alternatively, the probe may be unlabelled and the
ingredients for labelling could be included with the kit. The kit also may contain
other suitably packaged reagents and materials needed or desirable for the
particular hybridization protocol, for example, standards as well as instructions for
performing the assay.
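By way of a rough worked example, the amount of amplification implied by going from roughly 10^2 to 10^3 sequences per ml up to about 10^6 sequences per ml can be estimated as below. This is a minimal sketch under an assumption the text does not state: that each cycle multiplies the copy number by (1 + efficiency), with efficiency equal to 1.0 for ideal doubling. The function name and parameters are hypothetical.

```python
import math


def cycles_needed(start_copies: float, target_copies: float,
                  efficiency: float = 1.0) -> int:
    """Estimate the cycles required to reach target_copies.

    Assumes idealized exponential amplification in which every cycle
    multiplies the copy number by (1 + efficiency). Real reactions
    plateau and seldom hold a constant efficiency.
    """
    if start_copies <= 0 or target_copies <= 0:
        raise ValueError("copy numbers must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in the interval (0, 1]")
    if target_copies <= start_copies:
        return 0
    fold = target_copies / start_copies
    return math.ceil(math.log(fold) / math.log(1.0 + efficiency))


if __name__ == "__main__":
    for start in (1e2, 1e3):
        print(start, cycles_needed(start, 1e6))
```

Under ideal doubling this gives 14 cycles from 10^2 and 10 cycles from 10^3 sequences per ml; any lower per-cycle efficiency raises these figures.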
Other known amplification methods which can be utilized herein include
but are not limited to the so-called "NASBA" or "3SR" technique taught in PNAS
USA 87:1874-1878 (1990) and also discussed in Nature 350 (No. 6313):91-92
(1991) and Q-beta replicase.
Fluorescence in situ hybridization ("FISH") also can be performed utilizing
the reagents described herein. In situ hybridization involves taking
morphologically intact tissues, cells or chromosomes through the nucleic acid
hybridization process to demonstrate the presence of a particular piece of genetic
information and its specific location within individual cells. Since it does not
require homogenization of cells and extraction of the target sequence, it provides
precise localization and distribution of a sequence in cell populations. In situ
hybridization can identify the sequence of interest concentrated in the cells
containing it. It also can identify the type and fraction of the cells in a
heterogeneous cell population containing the sequence of interest. DNA and RNA
can be detected with the same assay reagents. PNAs can be utilized in FISH
methods to detect targets without the need for amplification. If increased signal is
desired, multiple fluorophores can be used to increase signal and thus, sensitivity
of the method. Various methods of FISH are known, including a one-step method
using multiple oligonucleotides or the conventional multi-step method. It is within
the scope of the present invention that these types of methods can be automated by
various means including flow cytometry and image analysis.
Immunoassay and Diagnostic Kits
Both the polypeptides which react immunologically with serum containing
HGBV antibodies and composites thereof, and the antibodies raised against the
HGBV specific epitopes in these polypeptides are useful in immunoassays to
detect the presence of HGBV antibodies, or the presence of the virus and/or viral
antigens in biological test samples. The design of these immunoassays is subject
to variation, and a variety of these are known in the art; a variety of these have
been described herein. The immunoassay may utilize one viral antigen, such as a
polypeptide derived from any clone-containing HGBV nucleic acid sequence, or
from the composite nucleic acid sequences derived from the HGBV nucleic acid
sequences in these clones, or from the HGBV genome from which the nucleic acid
sequences in these clones is derived. Or, the immunoassay may use a combination
of viral antigens derived from these sources. It may use, for example, a
monoclonal antibody directed against the same viral antigen, or polyclonal
antibodies directed against different viral antigens. Assays can include but are not
limited to those based on competition, direct reaction or sandwich-type assays.
Assays may use solid phases or may be performed by immunoprecipitation or any
other methods which do not utilize solid phases. Examples of assays which utilize
labels as the signal generating compound and those labels are described herein.
Signals also may be amplified by using biotin and avidin, enzyme labels or biotin
anti-biotin systems, such as that described in pending U.S. patent application
Serial Nos. 608,849; 070,647; 418,981; and 687,785. Recombinant polypeptides
which include epitopes from immunodominant regions of HGBV may be useful
for the detection of viral antibodies in biological test samples of infected
individuals. It also is contemplated that antibodies may be useful in discriminating
acute from non-acute infections. Kits suitable for immunodiagnosis and
containing the appropriate reagents are constructed by packaging the appropriate
materials, including the polypeptides of the invention containing HGBV epitopes
or antibodies directed against HGBV epitopes in suitable containers, along with the
remaining reagents and materials required for the conduct of the assay, as well as
suitable assay instructions.
Assay formats can be designed which utilize the recombinant proteins
detailed herein, and although we describe and detail CKS proteins, it also is
contemplated that other expression systems, such as superoxide dismutase (SOD),
and others, can be used in the present invention to generate fusion proteins capable
of use in a variety of ways, including as antigens in immunoassays, immunogens
for antibody production, and the like. In an assay format to detect the presence of
antibody against a specific analyte (for example, an infectious agent such as a
virus) in a human test sample, the human test sample is contacted and incubated
with a solid phase coated with at least one recombinant protein (polypeptide). If
antibodies are present in the test sample, they will form a complex with the
antigenic polypeptide and become affixed to the solid phase. After the complex
has formed, unbound materials and reagents are removed by washing the solid
phase. The complex is reacted with an indicator reagent and allowed to incubate for
a time and under conditions for second complexes to form. The presence of
antibody in the test sample to the CKS recombinant polypeptide(s) is determined
by detecting the signal generated. Signal generated above a cut-off value is
indicative of antibody to the analyte present in the test sample. With many
indicator reagents, such as enzymes, the amount of antibody present is
proportional to the signal generated. Depending upon the type of test sample, it
may be diluted with a suitable buffer reagent, concentrated, or contacted with the
solid phase without any manipulation ("neat"). For example, it usually is preferred
to test serum or plasma samples which previously have been diluted, or
concentrate specimens such as urine, in order to determine the presence and/or
amount of antibody present.
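The cut-off comparison described above can be sketched as follows. This is an illustration under stated assumptions, not the assay's defined interpretation rule: the text does not say how the cut-off value is derived, so it is taken here as a supplied parameter, and the function names and the "reactive" and "nonreactive" labels are hypothetical.

```python
def signal_to_cutoff(signal: float, cutoff: float) -> float:
    """Return the ratio of the measured signal to the cut-off value."""
    if cutoff <= 0:
        raise ValueError("cut-off value must be positive")
    return signal / cutoff


def classify_sample(signal: float, cutoff: float) -> str:
    """Label a sample by comparing its signal with the cut-off value.

    A signal strictly above the cut-off is treated as indicating antibody
    to the analyte; how the cut-off is established is assumed to be
    decided elsewhere, for example from negative control samples.
    """
    return "reactive" if signal_to_cutoff(signal, cutoff) > 1.0 else "nonreactive"


if __name__ == "__main__":
    # Invented absorbance readings, used only to exercise the functions.
    for reading in (0.08, 0.35, 1.42):
        print(reading, classify_sample(reading, cutoff=0.35))
```

A sample whose signal equals the cut-off is labelled nonreactive in this sketch because the text speaks of signal above the cut-off value; a laboratory could reasonably choose a different convention or an equivocal zone.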
In addition, more than one recombinant protein can be used in the assay
format just described to test for the presence of antibody against a specific
infectious agent by utilizing CKS fusion proteins against various antigenic epitopes
of the viral genome of the infectious agent under study. Thus, it may be preferred
to use recombinant polypeptides which contain epitopes within a specific viral
antigenic region as well as epitopes from other antigenic regions from the viral
genome to provide assays which have increased sensitivity and perhaps greater
specificity than using a polypeptide from one epitope. Such an assay can be
utilized as a confirmatory assay. In this particular assay format, a known amount
of test sample is contacted with (a) known amount(s) of at least one solid support
coated with at least one recombinant protein for a time and under conditions
sufficient to form recombinant protein/antibody complexes. The complexes are
contacted with known amount(s) of appropriate indicator reagent(s) for a time and
under suitable conditions for a reaction to occur, wherein the resultant signal
generated is compared to a negative test sample in order to determine the presence
of antibody to the analyte in the test sample. It further is contemplated that, when
using certain solid phases such as microparticles, each recombinant protein utilized
in the assay can be attached to a separate microparticle, and a mixture of these
microparticles made by combining the various coated microparticles, which can be
optimized for each assay.
Variations to the above-described assay formats include the incorporation
of CKS-recombinant proteins of different analytes attached to the same or to
different solid phases for the detection of the presence of antibody to either analyte
(for example, CKS-recombinant proteins specific for certain antigenic regions of
one infective agent coated on the same or different solid phase with
CKS-recombinant proteins specific for certain antigenic region(s) of a different
infective agent, to detect the presence of either (or both) infective agents).
In yet another assay format, CKS recombinant proteins containing
antigenic epitopes are useful in competitive assays such as neutralization assays.
To perform a neutralization assay, a recombinant polypeptide representing epitopes
of an antigenic region of an infectious agent such as a virus, is solubilized and
mixed with a sample diluent to a final concentration of between 0.5 and 50.0 µg/ml.
A known amount of test sample (preferably 10 µl), either diluted or non-diluted, is
added to a reaction well, followed by 400 µl of the sample diluent containing the
recombinant polypeptide. If desired, the mixture may be preincubated for
approximately 15 minutes to two hours. A solid phase coated with the CKS
recombinant protein described herein then is added to the reaction well, and
incubated for one hour at approximately 40°C. After washing, a known amount of
an indicator reagent, for example, 200 µl of a peroxidase labelled goat anti-human
IgG in a conjugate diluent is added and incubated for one hour at 40°C. After
washing and when using an enzyme conjugate such as described, an enzyme
substrate, for example, OPD substrate, is added and incubated at room temperature
for thirty minutes. The reaction is terminated by adding a stopping reagent such as
1N sulfuric acid to the reaction well. Absorbance is read at 492 nm. Test samples
which contain antibody to the specific peptide generate a reduced signal caused
by the competitive binding of the peptides to these antibodies in solution. The
percentage of competitive binding may be calculated by comparing absorbance
value of the sample in the presence of recombinant polypeptide to the absorbance
value of the sample assayed in the absence of a recombinant polypeptide at the
same dilution. Thus, the difference in the signals generated between the sample in
the presence of recombinant protein and the sample in the absence of recombinant
protein is the measurement used to determine the presence or absence of antibody.
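One common way to express such a comparison is as a percentage reduction in absorbance, sketched below. The text states only that the two absorbance values are compared, so the specific formula used here, (A_absent - A_present) / A_absent x 100, is an assumption, and the function and parameter names are hypothetical.

```python
def percent_competition(abs_without_competitor: float,
                        abs_with_competitor: float) -> float:
    """Return the percentage reduction in absorbance caused by competition.

    abs_without_competitor: absorbance (for example at 492 nm) of the sample
        assayed in the absence of recombinant polypeptide in solution.
    abs_with_competitor: absorbance of the same sample, at the same
        dilution, assayed in the presence of the recombinant polypeptide.
    """
    if abs_without_competitor <= 0:
        raise ValueError("reference absorbance must be positive")
    reduction = abs_without_competitor - abs_with_competitor
    return reduction / abs_without_competitor * 100.0


if __name__ == "__main__":
    # Invented absorbance readings, used only to exercise the function.
    print(percent_competition(1.0, 0.25))
```

With the invented readings shown, a drop from 1.0 to 0.25 corresponds to 75 percent competition. What percentage is treated as confirming specific antibody is not specified in the text and would have to be set for the particular assay.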
In another assay format, the recombinant proteins can be used in
immunodot blot assay systems. The immunodot blot assay system uses a panel of
purified recombinant polypeptides placed in an array on a nitrocellulose solid
support. The prepared solid support is contacted with a sample and captures
specific antibodies (specific binding member) to the recombinant protein (other
specific binding member) to form specific binding member pairs. The captured
antibodies are detected by reaction with an indicator reagent. Preferably, the
conjugate specific reaction is quantified using a reflectance optics assembly within
an instrument which has been described in U. S. Patent Application Serial No.
07/227,408 filed August 2, 1988. The related U. S. Patent Application Serial No.
07/227,586 and 07/227,590 (both of which were filed on August 2, 1988) further
described specific methods and apparatus useful to perform an immunodot assay,
as well as U. S. Patent No. 5,075,077 (U.S. Serial No. 07/227,272 filed August
2, 1988), which enjoys common ownership and is incorporated herein by
reference. Briefly, a nitrocellulose-base test cartridge is treated with multiple
antigenic polypeptides. Each polypeptide is contained within a specific reaction
zone on the test cartridge. After all the antigenic polypeptides have been placed on
the nitrocellulose, excess binding sites on the nitrocellulose are blocked. The test
cartridge then is contacted with a test sample such that each antigenic polypeptide
in each reaction zone will react if the test sample contains the appropriate antibody.
After reaction, the test cartridge is washed and any antigen-antibody reactions are
identified using suitable well-known reagents. As described in the patents and
patent applications listed herein, the entire process is amenable to automation. The
specifications of these applications related to the method and apparatus for
performing an immunodot blot assay are incorporated herein by reference.
CKS fusion proteins can be used in assays which employ a first and
second solid support, as follows, for detecting antibody to a specific antigen of an
analyte in a test sample. In this assay format, a first aliquot of a test sample is
contacted with a first solid support coated with CKS recombinant protein specific
for an analyte for a time and under conditions sufficient to form recombinant
protein/analyte antibody complexes. Then, the complexes are contacted with an
in~ tor reagent specific for the recombinant antigen. The in-iic~tor reagent is
rlPtected to deLc~ e the presence of antibody to the l~colll~ t protein in the
test sample. Following this, the presence of a dirr~ nt ~ntigçnic delc. ,~lin~ of
the same analyte is delcllllilled by cont~rting a second aliquot of a test sample with
s a second solid support coated with CKS recombinant protein specific for the
second antibody for a time and under conditions sufficient to form recolllbillant
protein/ second antibody complexes. The complexes are con~;~ed with a second
inrlir~tor reagent specific for the antibody of the complex. The signal is detected in
order to dtlc. ~ r the ~lcscnce of antibody in the test sample, whelcill the
0 pr~,sence of antibody to either analyte l~colllbinalll protein, or both, in-lic~tP~ the
presence of anti-analyte in the test sample. It also is col~lcln~lated that the solid
~U~J~ll~i can be tested simultaneously.
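The interpretation rule of the two-support format can be written out compactly. The following is only a minimal sketch, assuming that each solid support yields a numeric signal that is compared against a cutoff; the function name, the signal units and the cutoff values are hypothetical and are not taken from the specification.

```python
def anti_analyte_present(signal_1, cutoff_1, signal_2, cutoff_2):
    """Return True if antibody to either recombinant protein, or both, is detected.

    signal_1, signal_2: indicator-reagent signals read from the first and
    second solid supports (hypothetical units).
    cutoff_1, cutoff_2: assay cutoffs (hypothetical; a real assay would
    derive them from negative and positive controls).
    """
    reactive_on_first = signal_1 >= cutoff_1
    reactive_on_second = signal_2 >= cutoff_2
    # Reactivity on either support is read as presence of anti-analyte.
    return reactive_on_first or reactive_on_second
```

Because the two supports are read independently, the same rule applies whether the supports are tested sequentially or simultaneously.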
The use of haptens is known in the art. It is contemplated that haptens also
can be used in assays employing CKS fusion proteins in order to enhance
performance of the assay.
Further Characterization of the HGBV Genome, Virions, and Viral Antigens
Using Probes
The HGBV nucleic acid sequences may be used to gain further information
on the sequence of the HGBV genome, and for identification and isolation of the
HGBV agent. Thus, it is contemplated that this knowledge will aid in the
characterization of HGBV including the nature of the HGBV genome, the structure
of the viral particle, and the nature of the antigens of which it is composed. This
information, in turn, can lead to additional polynucleotide probes, polypeptides
derived from the HGBV genome, and antibodies directed against HGBV epitopes
which would be useful for the diagnosis and/or treatment of HGBV caused non-A,
non-B, non-C, non-D and non-E hepatitis.
The nucleic acid sequence information is useful for the design of probes or
PCR primers for the isolation of additional nucleic acid sequences which are
derived from yet undefined regions of the HGBV genome. For example, PCR
primers or labelled probes containing a sequence of 8 or more nucleotides, and
preferably 20 or more nucleotides, which are derived from regions close to the 5'-
termini or 3'-termini of the family of HGBV nucleic acid sequences may be used to
isolate overlapping nucleic acid sequences from HGBV genomic or cDNA libraries
or directly from viral nucleic acid. These sequences which overlap the HGBV
nucleic acid sequences, but which also contain sequences derived from regions of
the genome from which the above-mentioned HGBV nucleic acid sequences are not
derived, may then be used to synthesize probes for identification of other
overlapping fragments which do not necessarily overlap the nucleic acid sequences
in the clones. Unless the HGBV genome is segmented and the segments lack
common sequences, it is possible to sequence the entire viral genome(s) utilizing
the technique of isolation of overlapping nucleic acid sequences derived from the
viral genome(s). Characterization of the genomic segments alternatively could be
from the viral genome(s) isolated from purified HGBV particles. Methods for
purifying HGBV particles and for detecting them during the purification procedure
are described herein. Procedures for isolating polynucleotide genomes from viral
particles are well-known in the art. The isolated genomic segments then could be
cloned and sequenced. Thus, it is possible to clone and sequence the HGBV
genome(s) irrespective of their nature.
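The walking strategy described above (probes taken from the termini of known sequence, followed by joining of overlapping clones) can be illustrated in a few lines. This is a minimal sketch under stated assumptions: sequences are plain strings of A, C, G and T, the function names are hypothetical, and an exact suffix/prefix match stands in for hybridization, which in practice tolerates mismatches.

```python
def terminal_probes(known_seq, length=20):
    """Return (5'-terminal, 3'-terminal) probe sequences of the given length."""
    if length < 8:
        raise ValueError("the text calls for probes of 8 or more nucleotides")
    if len(known_seq) < length:
        raise ValueError("known sequence is shorter than the requested probe")
    seq = known_seq.upper()
    return seq[:length], seq[-length:]


def merge_overlapping(left, right, min_overlap=8):
    """Join two sequences when a suffix of `left` equals a prefix of `right`.

    Returns the merged sequence, or None if no exact overlap of at least
    `min_overlap` nucleotides exists.
    """
    left, right = left.upper(), right.upper()
    longest = min(len(left), len(right))
    # Try the longest possible overlap first, then shorter ones.
    for size in range(longest, min_overlap - 1, -1):
        if left[-size:] == right[:size]:
            return left + right[size:]
    return None
```

Repeating the two steps (take a new 3'-terminal probe from the merged sequence, isolate the next overlapping clone, merge again) extends the known sequence one clone at a time.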
Methods for constructing HGBV genomic or cDNA libraries are known in
the art, and vectors useful for this purpose are known in the art. These vectors
include lambda-gt11, lambda-gt10, and others. The HGBV derived nucleic acid
sequence detected by the probes derived from the HGBV genomic or cDNAs, may
be isolated from the clone by digestion of the isolated polynucleotide with the
appropriate restriction enzyme(s), and sequenced.
The sequence information derived from these overlapping HGBV nucleic
acid sequences is useful for determining areas of homology and heterogeneity
within the viral genome(s), which could indicate the presence of different strains of
the genome and/or of populations of defective particles. It is also useful for the
design of hybridization probes to detect HGBV or HGBV antigens or HGBV
nucleic acids in biological samples, and during the isolation of HGBV, utilizing the
techniques described herein. The overlapping nucleic acid sequences may be used
to create expression vectors for polypeptides derived from the HGBV genome(s).
Encoded within the family of nucleic acid sequences are antigen(s) containing
epitopes which are contemplated to be unique to HGBV, i.e., antibodies directed
against these antigens are absent from individuals infected with HAV, HBV,
HCV, and HEV, and comparisons with the genomic sequences in GenBank are
contemplated to indicate that minimal homology exists between these nucleic acid
sequences and the polynucleotide sequences of those sources. Thus, antibodies
directed against the antigens encoded with the HGBV nucleic acid sequences may
be used to identify the non-A, non-B, non-C, non-D and non-E particle isolated
from infected individuals. In addition, they also are useful for the isolation of the
HGBV agent(s).
HGBV particles may be isolated from the sera of infected individuals or
from cell cultures by any of the methods known in the art, including, for example,
techniques based on size discrimination such as sedimentation or exclusion
methods, or techniques based on density such as ultracentrifugation in density
gradients, or precipitation with agents such as polyethylene glycol (PEG), or
chromatography on a variety of materials such as anionic or cationic exchange
materials, and materials which bind due to hydrophobic interactions, as well as
affinity columns. During the isolation procedure the presence of HGBV may be
detected by hybridization analysis of the extracted genome, using probes derived
from HGBV nucleic acid sequences or by immunoassay which utilize as probes
antibodies directed against HGBV antigens encoded within the family of HGBV
nucleic acid sequences. The antibodies may be polyclonal or monoclonal, and it
may be desirable to purify the antibodies before their use in the immunoassay.
Such antibodies directed against HGBV antigens which are affixed to solid phases
are useful for the isolation of HGBV by immunoaffinity chromatography.
Methods for immunoaffinity chromatography are known in the art, and include
methods for affixing antibodies to solid phases so that they retain their
immunoselective activity. These methods include adsorption, and covalent
binding. Spacer groups may be included in the bifunctional coupling agents such
that the antigen binding site of the antibody remains accessible.
During the purification procedure the presence of HGBV may be detected
and/or verified by nucleic acid hybridization or PCR, utilizing as probes or primers
polynucleotides derived from a family of HGBV genomic or cDNA sequences, as
well as from overlapping HGBV nucleic acid sequences. Fractions are treated
under conditions which would cause the disruption of viral particles, such as by
use of detergents in the presence of chelating agents, and the presence of viral
nucleic acid determined by hybridization techniques or PCR. Further confirmation
that the isolated particles are the agents which induce HGBV infection may be
obtained by infecting an individual which is preferably a tamarin with the isolated
virus particles, followed by a determination of whether the symptoms of non-A,
non-B, non-C, non-D and non-E hepatitis, as described herein, result from the
infection.
Such viral particles obtained from the purified preparations then may be
further characterized. The genomic nucleic acid, once purified, can be tested to
determine its sensitivity to RNAse or DNAse I; based on these tests, the
determination of HGBV as a RNA genome or DNA genome may be made. The
strandedness and circularity or non-circularity can be determined by methods
known in the art, including its visualization by electron microscopy, its migration in
density gradients and its sedimentation characteristics. From hybridization of the
HGBV genome, the negative or positive strandedness of the purified nucleic acid
can be determined. In addition, the purified nucleic acid can be cloned and
sequenced by known techniques, including reverse transcriptase, if the genomic
material is RNA. Utilizing the nucleic acid derived from the viral particles, it then
is possible to sequence the entire genome, whether or not it is segmented.
Determination of polypeptides containing conserved sequences may be
useful for selecting probes which bind the HGBV genome, thus allowing its
isolation. In addition, conserved sequences in conjunction with those derived
from the HGBV nucleic acid sequences, may be used to design primers for use in
systems which amplify genomic sequences. Further, the structure of HGBV also
may be determined and its components isolated. The morphology and size may be
determined by electron microscopy, for example. The identification and
localization of specific viral polypeptide antigens such as envelope (coat) antigens,
or internal antigens such as nucleic acid binding proteins or core antigens, and
polynucleotide polymerase(s) also may be determined by ascertaining whether the
antigens are present in major or minor viral components, as well as by utilizing
antibodies directed against the specific antigens encoded within isolated nucleic
acid sequences as probes. This information may be useful for diagnostic and
therapeutic applications. For example, it may be preferable to include an exterior
antigen in a vaccine preparation, or perhaps multivalent vaccines may be comprised
of a polypeptide derived from the genome encoding a structural protein as well as a
polypeptide from another portion of the genome, such as a nonstructural
polypeptide.
Cell Culture Systems and Animal Model Systems for HGBV Replication
Generally, suitable cells or cell lines for culturing HGBV may include the
following: monkey kidney cells such as MK2 and VERO, porcine kidney cell lines
such as PS, baby hamster kidney cell lines such as BHK, murine macrophage cell
lines such as P388D1, MK1 and Mm1, human macrophage cell lines such as U-937,
human peripheral blood leukocytes, human adherent monocytes, hepatocytes
or hepatocytic cell lines such as HUH7 and HepG2, embryos or embryonic cells
such as chick embryo fibroblasts or cell lines derived from invertebrates,
preferably from insects such as Drosophila cell lines or more preferably from
arthropods such as mosquito cell lines or tick cell lines. It also is possible that
primary hepatocytes can be cultured and then infected with HGBV. Alternatively,
the hepatocyte cultures could be derived from the livers of infected individuals
(human or tamarin). The latter case is an example of a cell line which is infected
in vivo being passaged in vitro. In addition, various immortalization methods can
be used to obtain cell lines derived from hepatocyte cultures. For example,
primary liver cultures (before and after enrichment of the hepatocyte population)
may be fused to a variety of cells to maintain stability. Also, cultures may be
infected with transforming viruses, or transfected with transforming genes in order
to create permanent or semi-permanent cell lines. In addition, cells in liver cultures
may be fused to established cell lines such as HepG2. Methods for cell fusion are
well-known to the routineer, and include the use of fusion agents such as PEG and
Sendai Virus, among others.
It is contemplated that HGBV infection of cell lines may be accomplished
by techniques such as incubating the cells with viral preparations under conditions
which allow viral entry into the cell. It also may be possible to obtain viral
production by transfecting the cells with isolated viral polynucleotides. Methods
for transfecting tissue culture cells are known in the art and include but are not
limited to techniques which use electroporation and precipitation with DEAE-Dextran
or calcium phosphate. Transfection with cloned HGBV genomic or
cDNA should result in viral replication and the in vitro propagation of the virus. In
addition to cultured cells, animal model systems may be used for viral replication.
HGBV replication thus may occur in chimpanzees and also in, for example,
marmosets and suckling mice.
Screening for Anti-Viral Agents For HGBV
The availability of cell culture and animal model systems for HGBV also
renders screening for anti-viral agents which inhibit HGBV replication possible,
and particularly for those agents which preferentially allow cell growth and
multiplication while inhibiting viral replication. These screening methods are
known in the art. Generally, the anti-viral agents are tested at a variety of
concentrations, for their effect on preventing viral replication in cell culture
systems which support viral replication, and then for an inhibition of infectivity or
of viral pathogenicity, and a low level of toxicity, in an animal model system. The
methods and composition provided herein for detecting HGBV antigens and
HGBV polynucleotides are useful for screening of anti-viral agents because they
provide an alternative, and perhaps a more sensitive means, for detecting the
agent's effect on viral replication than the cell plaque assay or ID50 assay. For
example, the HGBV polynucleotide probes described herein may be used to
quantitate the amount of viral nucleic acid produced in a cell culture. This could be
performed by hybridization or competition hybridization of the infected cell nucleic
acids with a labelled HGBV polynucleotide probe. Also, anti-HGBV antibodies
may be used to identify and quantitate HGBV antigen(s) in the cell culture utilizing
the immunoassays described herein. Also, since it may be desirable to quantitate
HGBV antigens in the infected cell culture by a competition assay, the
polypeptides encoded within the HGBV nucleic acid sequences described herein
are useful for these assays. Generally, a recombinant HGBV polypeptide derived
from the HGBV genomic or cDNA would be labelled and the inhibition of
binding of this labelled polypeptide to an HGBV polypeptide due to the antigen
produced in the cell culture system would be monitored. These methods are
especially useful in cases where the HGBV may be able to replicate in a cell line
without causing cell death.
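Testing an agent "at a variety of concentrations" against a quantitative readout of viral nucleic acid or antigen amounts to computing an inhibition value at each concentration. The following is a minimal sketch, assuming the readout is a probe or immunoassay signal proportional to the amount of virus; the function names, the units and the optional background correction are illustrative assumptions, not part of the specification.

```python
def percent_inhibition(treated_signal, untreated_signal, background=0.0):
    """Percent reduction of the viral signal relative to an untreated culture."""
    window = untreated_signal - background
    if window <= 0:
        raise ValueError("untreated signal must exceed background")
    return 100.0 * (1.0 - (treated_signal - background) / window)


def inhibition_profile(signal_by_concentration, untreated_signal, background=0.0):
    """Map each tested concentration to its percent inhibition, lowest first."""
    return {
        conc: percent_inhibition(signal, untreated_signal, background)
        for conc, signal in sorted(signal_by_concentration.items())
    }
```

A parallel measurement of cell growth at the same concentrations would be needed to pick out agents that inhibit replication while still allowing cell multiplication.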
Preparation of Attenuated Strains of HGBV
It may be possible to isolate attenuated strains of HGBV by utilizing the
tissue culture systems and/or animal models systems provided herein. These
attenuated strains would be useful for vaccines, or for the isolation of viral
antigens. Attenuated strains are isolatable after multiple passages in cell culture
and/or an animal model. Detection of an attenuated strain in an infected cell or
individual is achievable by following methods known in the art and could include
the use of antibodies to one or more epitopes encoded in HGBV as a probe or the
use of a polynucleotide containing an HGBV sequence of at least about 8
nucleotides in length as a probe. Also or alternatively, an attenuated strain may be
constructed utilizing the genomic information of HGBV provided herein, and
utilizing recombinant techniques. Usually an attempt is made to delete a region of
the genome encoding a polypeptide related to pathogenicity but not to viral
replication. The genomic construction would allow the expression of an epitope
which gives rise to neutralizing antibodies for HGBV. The altered genome then
could be used to transform cells which allow HGBV replication, and the cells
grown under conditions to allow viral replication. Attenuated HGBV strains are
useful not only for vaccine purposes, but also as sources for the commercial
production of viral antigens, since the processing of these viruses would require
less stringent protection measures for the employees involved in viral production
and/or the production of viral products.
Hosts and Expression Control Sequences
Although the following are known in the art, included herein are general
techniques used in extracting the genome from a virus, preparing and probing a
genomic library, sequencing clones, constructing expression vectors, transforming
cells, performing immunological assays, and for growing cells in culture.
Both prokaryotic and eukaryotic host cells may be used for expression of
desired coding sequences when appropriate control sequences which are
compatible with the designated host are used. Among prokaryotic hosts, E. coli is
most frequently used. Expression control sequences for prokaryotes include
promoters, optionally containing operator portions, and ribosome binding sites.
Transfer vectors compatible with prokaryotic hosts are commonly derived from the
plasmid pBR322 which contains operons conferring ampicillin and tetracycline
resistance, and the various pUC vectors, which also contain sequences conferring
antibiotic resistance markers. These markers may be used to obtain successful
transformants by selection. Commonly used prokaryotic control sequences
include the beta-lactamase (penicillinase) and lactose promoter systems (Chang et al.,
Nature 198:1056 [1977]), the tryptophan promoter system (reported by Goeddel et
al., Nucleic Acid Res 8:4057 [1980]) and the lambda-derived PL promoter and N
gene ribosome binding site (Shimatake et al., Nature 292:128 [1981]) and the
hybrid Tac promoter (De Boer et al., Proc. Natl. Acad. Sci. USA 292:128 [1983])
derived from sequences of the trp and lac UV5 promoters. The foregoing systems
are particularly compatible with E. coli; however, other prokaryotic hosts such as
strains of Bacillus or Pseudomonas may be used if desired, with corresponding
control sequences.
Eukaryotic hosts include yeast and mammalian cells in culture systems.
Saccharomyces cerevisiae and Saccharomyces carlsbergensis are the most
commonly used yeast hosts, and are convenient fungal hosts. Yeast compatible
vectors carry markers which permit selection of successful transformants by
conferring prototrophy to auxotrophic mutants or resistance to heavy metals on
wild-type strains. Yeast compatible vectors may employ the 2 micron origin of
replication (as described by Broach et al., Meth. Enz. 101:307 [1983]), the
combination of CEN3 and ARS1 or other means for assuring replication, such as
sequences which will result in incorporation of an appropriate fragment into the
host cell genome. Control sequences for yeast vectors are known in the art and
include promoters for the synthesis of glycolytic enzymes, including the promoter
for 3-phosphoglycerate kinase. See, for example, Hess et al., J. Adv. Enzyme
Reg. 7:149 (1968), Holland et al., Biochemistry 17:4900 (1978) and Hitzeman J.
Biol. Chem. 255:2073 (1980). Terminators also may be included, such as those
derived from the enolase gene as reported by Holland, J. Biol. Chem. 256:1385
(1981). It is contemplated that particularly useful control systems are those which
comprise the glyceraldehyde-3 phosphate dehydrogenase (GAPDH) promoter or
alcohol dehydrogenase (ADH) regulatable promoter, terminators also derived from
GAPDH, and if secretion is desired, leader sequences from yeast alpha factor. In
addition, the transcriptional regulatory region and the transcriptional initiation
region which are operably linked may be such that they are not naturally associated
in the wild-type organism.
Mammalian cell lines available as hosts for expression are known in the art
and include many immortalized cell lines which are available from the American
Type Culture Collection. These include HeLa cells, Chinese hamster ovary (CHO)
cells, baby hamster kidney (BHK) cells, and others. Suitable promoters for
mammalian cells also are known in the art and include viral promoters such as that
from Simian Virus 40 (SV40), Rous sarcoma virus (RSV), adenovirus (ADV),
bovine papilloma virus (BPV), and cytomegalovirus (CMV). Mammalian cells also
may require terminator sequences and poly A addition sequences; enhancer
sequences which increase expression also may be included, and sequences which
cause amplification of the gene also may be desirable. These sequences are known
in the art. Vectors suitable for replication in mammalian cells may include viral
replicons, or sequences which ensure integration of the appropriate sequences
encoding non-A, non-B, non-C, non-D, non-E epitopes into the host genome. An
example of a mammalian expression system for HCV is described in U.S. Patent
Application Serial No. 07/830,024, filed January 31, 1992.
Transformations
Transformation may be by any known method for introducing
polynucleotides into a host cell, including packaging the polynucleotide in a virus
and transducing a host cell with the virus, and by direct uptake of the
polynucleotide. The transformation procedure selected depends upon the host to
be transformed. Bacterial transformation by direct uptake generally employs
treatment with calcium or rubidium chloride. Cohen, Proc. Natl. Acad. Sci. USA
69:2110 (1972). Yeast transformation by direct uptake may be conducted using
the calcium phosphate precipitation method of Graham et al., Virology 52:526
(1978), or modification thereof.
Vector Construction
Vector construction employs methods known in the art. Generally, site-specific
DNA cleavage is performed by treating with suitable restriction enzymes
under conditions which generally are specified by the manufacturer of these
commercially available enzymes. Usually, about 1 microgram (µg) of plasmid or
DNA sequence is cleaved by 1-10 units of enzyme in about 20 µl of buffer
solution by incubation at 37°C for 1 to 2 hours. After incubation with the
restriction enzyme, protein is removed by phenol/chloroform extraction and the
DNA recovered by precipitation with ethanol. The cleaved fragments may be
separated using polyacrylamide or agarose gel electrophoresis methods, according
to methods known by the routineer.
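The fragment sizes expected on a gel after site-specific cleavage can be predicted from the sequence. The following is a minimal sketch, assuming a linear DNA molecule written as a string and a single enzyme; EcoRI (recognition site GAATTC, cutting after the first G) is used as the default because its site is well established, while the function name and the example sequence are hypothetical.

```python
def digest_linear(seq, site="GAATTC", cut_offset=1):
    """Cut a linear sequence at every occurrence of `site`.

    cut_offset is the number of bases from the start of the site to the cut
    on the top strand (1 for EcoRI, which cuts G^AATTC). Returns the
    top-strand fragments in 5'->3' order.
    """
    seq = seq.upper()
    cut_positions = []
    start = seq.find(site)
    while start != -1:
        cut_positions.append(start + cut_offset)
        start = seq.find(site, start + 1)

    fragments = []
    previous = 0
    for cut in cut_positions:
        fragments.append(seq[previous:cut])
        previous = cut
    fragments.append(seq[previous:])  # remainder after the last cut
    return fragments


# Hypothetical 20-nucleotide molecule carrying two EcoRI sites.
fragment_lengths = [len(f) for f in digest_linear("AAAGAATTCCCCGAATTCTT")]
```

A circular plasmid would yield one fragment fewer than this linear model predicts, since the first and last pieces are joined.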
Sticky end cleavage fragments may be blunt ended using E. coli DNA
polymerase 1 (Klenow) in the presence of the appropriate deoxynucleotide
triphosphates (dNTPs) present in the mixture. Treatment with S1 nuclease also
may be used, resulting in the hydrolysis of any single stranded DNA portions.
Ligations are performed using standard buffer and temperature conditions
using T4 DNA ligase and ATP. Sticky end ligations require less ATP and less
ligase than blunt end ligations. When vector fragments are used as part of a
ligation mixture, the vector fragment often is treated with bacterial alkaline
phosphatase (BAP) or calf intestinal alkaline phosphatase to remove the 5'-
phosphate and thus prevent religation of the vector. Or, restriction enzyme
digestion of unwanted fragments can be used to prevent ligation. Ligation
mixtures are transformed into suitable cloning hosts such as E. coli and successful
transformants selected by methods including antibiotic resistance, and then
screened for the correct construction.
Construction of Desired DNA Sequences
Synthetic oligonucleotides may be prepared using an automated
oligonucleotide synthesizer such as that described by Warner, DNA 3:401 (1984).
If desired, the synthetic strands may be labelled with 32P by treatment with
polynucleotide kinase in the presence of 32P-ATP, using standard conditions for
the reaction. DNA sequences, including those isolated from genomic or cDNA
libraries, may be modified by known methods which include site directed
mutagenesis as described by Zoller, Nucleic Acids Res. 10:6487 (1982). Briefly,
the DNA to be modified is packaged into phage as a single stranded sequence, and
converted to a double stranded DNA with DNA polymerase using, as a primer, a
synthetic oligonucleotide complementary to the portion of the DNA to be modified,
and having the desired modification included in its own sequence. Cultures of the
transformed bacteria, which contain replications of each strand of the phage, are
plated in agar to obtain plaques. Theoretically, 50% of the new plaques contain
phage having the mutated sequence, and the remaining 50% have the original
sequence. Replicates of the plaques are hybridized to labelled synthetic probe at
temperatures and conditions suitable for hybridization with the correct strand, but
not with the unmodified sequence. The sequences which have been identified by
hybridization are recovered and cloned.
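The primer used in site directed mutagenesis is complementary to the single stranded template except at the base to be changed. The following is a minimal sketch of that design step, assuming a single-base substitution and a template written 5' to 3' as a string; the function names, the flank length and the example template are hypothetical and are not drawn from Zoller's procedure.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (5'->3')."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def mutagenic_primer(template, position, new_base, flank=10):
    """Design a primer that anneals around `position` and encodes `new_base`.

    template: single stranded template, 5'->3'.
    position: 0-based index of the template base to be replaced.
    new_base: base wanted at that position in the modified sequence.
    flank:    number of matching bases kept on each side of the change.
    The primer is the reverse complement of the *modified* window, so it
    pairs with the original template everywhere except at one position.
    """
    template = template.upper()
    if not 0 <= position < len(template):
        raise IndexError("position lies outside the template")
    start = max(0, position - flank)
    end = min(len(template), position + flank + 1)
    modified = template[:position] + new_base.upper() + template[position + 1:]
    return reverse_complement(modified[start:end])


# Hypothetical template: change the C at index 10 to T, keeping 5 bases each side.
example_primer = mutagenic_primer("ATGCCGTAAGCTTGGCATCGA", 10, "T", flank=5)
```

The same oligonucleotide, once labelled, can serve as the probe for the plaque hybridization step, since it matches the mutated sequence perfectly and the unmodified sequence with one mismatch.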
Hybridization With Probe


HGBV genomic or DNA libraries may be probed using the procedure
described by Grunstein and Hogness, Proc. Natl. Acad. Sci. USA 73:3961
(1975). Briefly, the DNA to be probed is immobilized on nitrocellulose filters,
denatured and prehybridized with a buffer which contains 0-50% formamide, 0.75
M NaCl, 75 mM Na citrate, 0.02% (w/v) each of bovine serum albumin (BSA),
polyvinylpyrrolidone and Ficoll, 50 mM Na Phosphate (pH 6.5), 0.1% SDS and
100 µg/ml carrier denatured DNA. The percentage of formamide in the buffer, as
well as the time and temperature conditions of the prehybridization and subsequent
hybridization steps depends on the stringency required. Oligomeric probes which
require lower stringency conditions are generally used with low percentages of
formamide, lower temperatures, and longer hybridization times. Probes
containing more than 30 or 40 nucleotides such as those derived from cDNA or
genomic sequences generally employ higher temperatures, for example, about 40
to 42°C, and a high percentage, for example, 50% formamide. Following
prehybridization, a 32P-labelled oligonucleotide probe is added to the buffer, and
the filters are incubated in this mixture under hybridization conditions. After
washing, the treated filters are subjected to autoradiography to show the location of
the hybridized probe. DNA in corresponding locations on the original agar plates
is used as the source of the desired DNA.
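The prehybridization buffer above is normally made up from concentrated stocks, which is a C1V1 = C2V2 calculation for each component. The following is a minimal sketch under stated assumptions: the final concentrations are those given in the text (0.75 M NaCl with 75 mM Na citrate corresponds to 5x SSC, and 0.02% each of BSA, polyvinylpyrrolidone and Ficoll corresponds to 1x Denhardt's solution), whereas the stock concentrations, the function names and the default 100 ml batch are assumptions chosen for illustration.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that final_conc is reached (C1*V1 = C2*V2)."""
    if final_conc > stock_conc:
        raise ValueError("final concentration cannot exceed stock concentration")
    return final_conc / stock_conc * final_volume_ml


def prehybridization_recipe(final_volume_ml=100.0, formamide_percent=50.0):
    """Return ml of each assumed stock, plus water, for the buffer in the text."""
    # (final concentration, assumed stock concentration), same units in each pair
    components = {
        "formamide, 100% stock": (formamide_percent, 100.0),
        "SSC, 20x stock": (5.0, 20.0),
        "Denhardt's solution, 50x stock": (1.0, 50.0),
        "Na phosphate pH 6.5, 1 M stock (mM)": (50.0, 1000.0),
        "SDS, 10% stock": (0.1, 10.0),
        "carrier DNA, 10 mg/ml stock (ug/ml)": (100.0, 10000.0),
    }
    volumes = {
        name: stock_volume_ml(final, stock, final_volume_ml)
        for name, (final, stock) in components.items()
    }
    volumes["water"] = final_volume_ml - sum(volumes.values())
    return volumes
```

Lowering `formamide_percent` for a low-stringency oligomeric probe simply shifts the difference into the water volume, leaving the other components unchanged.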
Verification of Construction and Sequencing
For standard vector constructions, ligation mixtures are transformed into E.
coli strain XL-1 Blue or other suitable host, and successful transformants selected
by antibiotic resistance or other markers. Plasmids from the transformants then are
prepared according to the method of Clewell et al., Proc. Natl. Acad. Sci. USA
62:1159 (1969) usually following chloramphenicol amplification as reported by
Clewell et al., J. Bacteriol. 110:667 (1972). The DNA is isolated and analyzed
usually by restriction enzyme analysis and/or sequencing. Sequencing may be by
the well-known dideoxy method of Sanger et al., Proc. Natl. Acad. Sci. USA
74:5463 (1977) as further described by Messing et al., Nucleic Acid Res. 9:309
(1981), or by the method reported by Maxam et al., Methods in Enzymology
65:499 (1980). Problems with band compression, which are sometimes observed
in GC rich regions, are overcome by use of 7-deazaguanosine according to the
method reported by Barr et al., Biotechniques 4:428 (1986).
Enzyme-Linked Immunosorbent Assay
Enzyme-linked immunosorbent assay (ELISA) can be used to measure
either antigen or antibody concentrations. This method depends upon conjugation
of an enzyme label to either an antigen or antibody, and uses the bound enzyme
activity (signal generated) as a quantitative label (measurable generated signal).
Methods which utilize enzymes as labels are described herein, as are examples of
such enzyme labels.
Preparation of HGBV Nucleic Acid Sequences
The source of the non-A, non-B, non-C, non-D, non-E agent is an
individual or pooled plasma, serum or liver homogenate from a human or tamarin
infected with the HGBV virus meeting the clinical and laboratory criteria described
herein. A tamarin alternatively can be experimentally infected with blood from
another individual with non-A, non-B, non-C, non-E hepatitis meeting the criteria
described hereinbelow. A pool can be made by combining many individual
plasma, serum or liver homogenate samples containing high levels of alanine
transferase activity; this activity results from hepatic injury due to HGBV infection.
The TID (tamarin infective dose) of the virus has been calculated from one of our
experiments to be > 4 x 10^5/ml (see Example 2, below).
For example, a nucleic acid library from plasma, serum or liver
homogenate, preferably but not necessarily high titer, is generated as follows.
First, viral particles are isolated from the plasma, serum or liver homogenate; then
an aliquot is diluted in a buffered solution, such as one containing 50 mM Tris-HCl,
pH 8.0, 1 mM EDTA, 100 mM NaCl. Debris is removed by centrifugation,
for example, for 20 minutes at 15,000 x g at 20°C. Viral particles in the resulting
supernatant then are pelleted by centrifugation under appropriate conditions which
can be determined routinely by one skilled in the art. To release the viral genome,
the particles are disrupted by suspending the pellets in an aliquot of an SDS
suspension, for example, one containing 1% SDS, 120 mM EDTA, 10 mM Tris-HCl,
pH 7.5, which also contains 2 mg/ml proteinase K, which is followed by
incubation at appropriate conditions, for example, 45°C for 90 minutes. Nucleic
acids are isolated by adding, for example, 0.8 µg MS2 bacteriophage RNA as
carrier, and extracting the mixture four times with a 1:1 mixture of
phenol:chloroform (phenol saturated with 0.5M Tris-HCl, pH 7.5, 0.1% (v/v)
beta-mercaptoethanol, 0.1% (w/v) hydroxyquinolone), followed by extraction two
times with chloroform. The aqueous phase is concentrated with, for example,
1-butanol prior to precipitation with 2.5 volumes of absolute ethanol overnight at
-20°C. Nucleic acids are recovered by centrifugation in, for example, a Beckman
SW41 rotor at 40,000 rpm for 90 min at 4°C, and dissolved in water that is treated
with 0.05% (v/v) diethylpyrocarbonate and autoclaved.
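The protocol above states one centrifugation step as a force (15,000 x g) and another as a rotor speed (40,000 rpm), and calls for 2.5 volumes of ethanol. The following is a minimal sketch of the arithmetic involved, using the standard relation RCF = 1.118 x 10^-5 x radius (cm) x rpm^2; the function names are hypothetical, and the rotor radius must be taken from the rotor manufacturer's documentation since the specification does not give one.

```python
import math


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return 1.118e-5 * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Rotor speed needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (1.118e-5 * radius_cm))


def ethanol_volume_ml(aqueous_volume_ml, volumes=2.5):
    """Volume of absolute ethanol for precipitation (2.5 volumes in the text)."""
    return aqueous_volume_ml * volumes
```

For instance, `rpm_for_rcf(15000, radius_cm)` gives the speed setting for the debris-clearing spin once the radius of the rotor actually in use is known.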
Nucleic acid obtained by the above procedure is denatured with, for
example, 17.5 mM CH3HgOH; cDNA then is synthesized using this denatured
nucleic acid as template, and is cloned into the EcoRI site of phage lambda-gt11,
for example, by using methods described by Huynh (1985) supra, except that
random primers replace oligo(dT) 12-18 during the synthesis of the first nucleic
acid strand by reverse transcriptase (see Taylor et al., [1976]). The resulting
double stranded nucleic acid sequences are fractionated according to size on a
Sepharose CL-4B column, for example. Eluted material of approximate mean size
400, 300, 200 and 100 base-pairs are pooled into genomic pools. The lambda-gt11
cDNA library is generated from the cDNA in at least one of the pools.
Alternatively, if the etiological agent is a DNA virus, methods for cloning genomic
DNA may be useful and are known to those skilled in the art.
The so-generated lambda-gt11 genomic library is screened for epitopes that
can bind specifically with serum, plasma or a liver homogenate from an individual
who had previously experienced non-A, non-B, non-C, non-E hepatitis (one
which meets the criteria as set forth hereinbelow). About 10^4-10^7 phage are
screened with sera, plasma, or liver homogenates using the methods of Huynh et
al. (supra). Bound human antibody can be detected with sheep anti-human Ig
antisera that is radio-labelled with 125I or other suitable reporter molecules
including HRPO, alkaline phosphatase and others. Positive phage are identified
and purified. These phage then are tested for specificity of binding to sera from a
pre-determined number of different humans previously infected with the HGBV
agent, using the same method. Ideally, the phage will encode a polypeptide that
reacts with all or a majority of the sera, plasma or liver homogenates that are
tested, and will not react with sera, plasma or liver homogenates from individuals
who are determined to be "negative" according to the criteria set forth herein for the
HGBV agent as well as hepatitis A, B, C, D and E. By following these
procedures, a clone that encodes a polypeptide which is specifically recognized
immunologically by sera, plasma or liver homogenates from non-A, non-B, non-C,
non-D and non-E-identified patients can be isolated.
The present invention will now be described by way of examples, which
are meant to illustrate, but not to limit, the spirit and scope of the invention.

EXAMPLES
The examples provided herein describe in detail methods which led to the
discovery of the HGBV group of viruses. The examples are provided in
chronological order so that the discovery of the HGBV-A, HGBV-B and HGBV-C
viruses of the HGBV group can be followed. Generally, transmissibility and
infectivity studies were initially performed; these studies and subsequent ones
described herein led to evidence for the existence of two HCV-like viruses in
HGBV: GB-A and GB-B. Subsequent experiments also detailed herein utilizing
degenerative primers led to the discovery of HGBV-C. The prevalence of this
group of viruses in humans as evidenced by serological studies, the viral
characterization of this group of viruses, the relatedness of HGBV to other viruses
in its proposed genus and the interrelatedness of HGBV-A, HGBV-B and HGBV-C
also is taught.

Example 1. Transmissibility of HGBV
A. Experimental Protocol. Sixteen tamarins (Saguinus labiatus) were secured
through LEMSIP (Laboratory for Experimental Medicine and Surgery in Primates,
Tuxedo, New York) for the transmissibility and infectivity studies. All animals
were maintained and monitored at LEMSIP according to protocols approved by
LEMSIP. (Note: one animal died of natural causes and one ailing animal was
euthanized prior to the initiation of infectivity studies). Baseline serum liver
enzyme values were established for serum liver enzymes alanine transaminase
(ALT), gamma-glutamyltransferase (GGT) and isocitric dehydrogenase (ICD) for
two to three months on serum specimens obtained weekly or bi-weekly. A
minimum of eight serum liver enzyme values were obtained for each animal prior
to inoculation. Cutoff values (CO) were determined for each animal, based on the
mean liver enzyme value plus 3.75 times the standard deviation. Liver enzyme
values above the cutoff value were interpreted as abnormal and suggestive of liver
damage. Several tamarins were inoculated as described hereinbelow and
monitored for changes in ALT, GGT and ICD serum levels. At specified times
thereafter during the monitoring process, certain animals were sacrificed in order to
obtain serum and tissues for further studies.
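
The cutoff rule described above (mean baseline value plus 3.75 standard
deviations, computed per animal) can be illustrated with a minimal sketch. The
baseline readings below are hypothetical and are not taken from TABLE 2; the
use of the sample standard deviation (rather than the population form) is also
an assumption, since the text does not say which was used.

```python
from statistics import mean, stdev

def cutoff(baseline_values, k=3.75):
    """Return mean + k * SD for one animal's pre-inoculation enzyme readings.

    Assumption: the sample standard deviation is used.
    """
    if len(baseline_values) < 8:
        # The protocol states a minimum of eight baseline values per animal.
        raise ValueError("at least eight baseline values are expected")
    return mean(baseline_values) + k * stdev(baseline_values)

def is_abnormal(value, co):
    """A reading is treated as abnormal when it exceeds the animal's cutoff."""
    return value > co

# Hypothetical baseline ALT readings for a single animal (illustrative only).
baseline_alt = [30, 34, 28, 31, 33, 29, 32, 31]
co_alt = cutoff(baseline_alt)
print(f"ALT cutoff: {co_alt:.1f}")
print(is_abnormal(60, co_alt), is_abnormal(35, co_alt))
```

Because the cutoff is derived from each animal's own baseline, an animal with a
more variable baseline receives a correspondingly higher cutoff.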

B. Inoculation of Animals (Initial Study). A pool of known infectious tamarin
GB serum (passage 11, designated as H205 GB pass 11) was prepared from
serum collected during the early acute phase (19-24 days post inoculation) of
hepatitis from nine tamarins inoculated with the HGBV. This pool had been
previously described and studied in an effort to determine the etiological agent
involved. J. L. Dienstag et al., Nature 264 supra; E. Tabor et al., J. Med. Virol.
5, supra. Aliquots of this pool were maintained at Abbott Laboratories (North
Chicago, IL 60064) under liquid nitrogen storage conditions until utilized in this
study. Other aliquots of HGBV are available from the American Type Culture
Collection (A.T.C.C.), 12301 Parklawn Drive, Rockville, MD 20852, under
A.T.C.C. Deposit No. VR-806.
On day one, four tamarins of the initial group of remaining 14 tamarins,
identified as T-1053, T-1048, T-1057 and T-1061, were inoculated intravenously
with 0.25 ml of pool H205, passage 11, previously diluted 1:50. These animals
were monitored weekly for changes in the liver enzymes ALT, GGT and ICD.
TABLE 2 presents the pre- and post-inoculation liver enzyme data on these four
tamarins (T-1053, T-1048, T-1057 and T-1061); FIGURES 1-4 present the pre-
and post-inoculation ALT and ICD levels of these four tamarins. As the data
demonstrate, significant rises in ALT, GGT and ICD above the CO were obtained
in the four tamarins inoculated with the 1:50 dilution of pool H205.
On the same day (day one), one tamarin (T-1047) was inoculated
intravenously with 0.25 ml of pooled normal tamarin serum and used as a negative
control, and another tamarin (T-1042) was inoculated intravenously with 0.25 ml
of pooled normal human serum and served as an additional negative control.
FIGURES 5-6 and TABLE 3 present the pre- and post-inoculation ALT and ICD
levels of the two control tamarins (T-1047 and T-1042). As the data demonstrate,
no rise in ALT or ICD was documented post-inoculation for the two control
tamarins for a period of eight weeks.
On the same day (day one), one tamarin (T-1044) was inoculated
intravenously with 0.2 ml of convalescent sera obtained from the surgeon (original
GB source) approximately three weeks following the onset of acute hepatitis.
This specimen had been stored at -20°C. F. Deinhardt et al., J. Exper. Med.
125:673-688 (1967). Another tamarin (T-1034) was inoculated with 0.1 ml of this
convalescent sera. As FIGURES 7-8 and TABLE 4 demonstrate, no rise in serum
liver enzymes was observed in these tamarins for a period of eleven weeks post
inoculation. Thus, these data demonstrate that infective HGBV was not detectable
in the convalescent sera obtained from the original patient and stored at -20°C,
which could indicate that the individual had recovered from infection and that the
virus had been cleared from the patient's serum or that the viral titer had been
reduced to non-detectable levels upon storage at -20°C.
C. Further Studies. Tamarin T-1053 showed a significant rise in serum liver
enzymes one week post-inoculation, and was retested for liver enzymes on day 11
post-inoculation. At day 12 it was determined that significant elevations in serum
liver enzymes were present, and the animal was sacrificed on that day. Plasma,
liver and spleen tissue samples were obtained for further studies. The plasma from
T-1053 served as the source for the RDA procedure discussed in Example 3
below; the liver tissue was utilized in Example 8 below.
Tamarins T-1048, T-1057 and T-1061 were monitored for serum liver
enzyme values; all were observed to exhibit elevated serum liver enzyme levels
within two weeks following inoculation; these elevated values were noted for six
or more weeks post inoculation. All three tamarins were observed to have
decreasing serum liver enzyme levels below the CO by 84 days post inoculation.
On day 97 post inoculation, these three tamarins (T-1048, T-1057 and T-1061)
were rechallenged with 0.10 ml of neat plasma obtained from tamarin T-1053
(shown to be infectious, see Example 2) to determine whether hepatitis as
documented by elevations in serum liver enzymes could be re-induced. The data
are presented in TABLE 2 and FIGURES 1, 3 and 4. As the data indicate, serum
liver enzyme levels of two tamarins (T-1057 and T-1061) remained below the CO
for three weeks post reinoculation. One tamarin (T-1048) exhibited mild
elevations in serum liver enzyme levels two weeks immediately post-reinoculation.
It was hypothesized that the mild elevations in T-1048 were attributable to either
reinfection of liver tissue by HGBV or incomplete recovery from the initial
inoculation with H205.

Example 2. Infectivity Studies
A. Experimental Protocol. Baseline readings on four tamarins were obtained as
described in Example 1(A). Briefly, baseline serum liver enzymes (ALT, GGT
and ICD) were established for each animal prior to inoculation. Cutoff values
(CO) were determined for each animal, based on the mean liver enzyme value plus
3.75 times the standard deviation. Liver enzyme values above the cutoff were
interpreted as abnormal and suggestive of liver damage.
B. Inoculation of Tamarins. The plasma from Tamarin T-1053, sacrificed at day
12 post inoculation (see Example 1 [C]), was used as the inoculum for further
studies. On day one, one tamarin (T-1055) was inoculated intravenously with
0.25 ml of neat T-1053 plasma. On the same day, two tamarins (T-1038 and
T-1051) were inoculated intravenously with 0.25 ml of T-1053 plasma which had
been serially diluted to either 10^-4 (T-1038) or 10^-5 (T-1051) in pooled normal
tamarin plasma. On the same day, tamarin T-1049 was inoculated intravenously
with 0.25 ml of plasma T-1053 which had been filtered through a series of filters
of decreasing pore size (0.8 µm, 0.45 µm, 0.22 µm and 0.10 µm) and diluted at
10~ in pooled normal tamarin plasma.
All tamarins (T-1055, T-1038, T-1051 and T-1049) were monitored
weekly as described in Example 1 for changes in serum liver enzymes ALT, GGT
and ICD. TABLE 5 presents the pre- and post-inoculation liver enzyme data on
these four tamarins. FIGURE 9 presents the pre- and post-inoculation ALT and
ICD values for T-1055. Referring to FIGURE 9, it can be seen that elevations above
the CO in serum liver enzymes ALT and ICD occurred. This tamarin was sacrificed
on day 12 post-inoculation. FIGURES 10 and 11 present the pre- and post-
inoculation serum levels of ALT and ICD for tamarins T-1051 and T-1038,
respectively. Referring to FIGURES 10 and 11, it can be seen that elevations in
serum liver enzymes ALT and ICD occurred in both animals by 11 days post-
inoculation. T-1038 was sacrificed on day 14 post inoculation. TABLE 5 and
FIGURE 12 present the data obtained on T-1049. As can be seen from TABLE 5
and FIGURE 12, elevations in serum liver enzymes above the CO were observed
in T-1049 within 11 days post-inoculation.
The filtration study conducted on T-1049 indicates that HGBV can pass
through a 0.10 µm filter, thereby suggesting that HGBV is likely to be viral in
nature, and less than 0.1 µm in diameter. In addition, the infectivity titration
experiment conducted on T-1038 demonstrates that the T-1053 serum contains at
least 4 x 10^5 tamarin infectious doses per ml.
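
The arithmetic behind an end-point titer of this kind can be sketched as
follows. The reasoning assumed here is that if a given volume of a 10^-n
dilution still infects an animal, the undiluted material holds at least one
infectious dose per (volume x dilution). The function name is hypothetical and
the sketch is illustrative only. For a 0.25 ml inoculum, a 10^-5 dilution gives
4 x 10^5 doses per ml, the figure quoted above.

```python
def min_infectious_doses_per_ml(inoculum_ml, dilution_exponent):
    """Lower bound on infectious doses per ml of undiluted material.

    Assumes that inoculum_ml of a 10**-dilution_exponent dilution was
    sufficient to infect one animal, i.e. contained at least one dose.
    """
    if inoculum_ml <= 0:
        raise ValueError("inoculum volume must be positive")
    return (10 ** dilution_exponent) / inoculum_ml

# 0.25 ml inocula at the two dilutions described in Example 2(B).
for exponent in (4, 5):
    titer = min_infectious_doses_per_ml(0.25, exponent)
    print(f"10^-{exponent} dilution: at least {titer:.0e} doses per ml")
```

This is a lower bound only; an end-point dilution series with more animals per
dilution would be needed to estimate the titer itself.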
In order to show the transmissibility of a single HGBV agent, tamarin
T-1044 was inoculated with 0.25 ml of an inoculum consisting of T-1057 serum that
had been obtained 7 days after the H205 inoculation and diluted 1:500 in normal
tamarin serum. Mild elevations in ALT levels above the cutoff were observed
from days 14-63 PI (that is, elevations in the range of 82 to 106).
Tamarins T-1047 and T-1056 were subsequently inoculated with 0.25 ml of
T-1044 serum obtained 14 days PI and diluted 1:2 in normal tamarin serum.
Elevations in ALT levels above the cutoff were first observed in T-1047 and
T-1056 at 42 days PI and returned to normal levels at days 64 and 91 PI,
respectively. Tamarin T-1058 was inoculated with 0.25 ml of neat T-1057 serum
obtained 22 days after the challenge with T-1053 serum. Elevations in ALT levels
have not been observed for 112 days PI.

Example 3. Representational Difference Analysis (Subtractive Hybridization)
A. Generation of double-stranded DNA for Amplicons
Using the procedure described herein in Materials and Methods above and
referring to FIGURE 13, tester amplicon was prepared from total nucleic acid
obtained from tamarin T-1053 infectious plasma on day 12 post inoculation with
H205 serum (see Examples 1C and 2B). Driver amplicon was prepared from
Tamarin T-1053 pre-inoculation plasma pooled from days -17 to -30 (see
Example 1A). Briefly, both plasmas were filtered through a 0.1 µm filter as
described in Example 2B. Next, 50 µl of each filtered plasma was extracted using
a commercially available kit [United States Biochemical (USB), Cleveland, OH,
cat. #73750] and 10 µg yeast tRNA as a carrier. This nucleic acid was subjected
to random primed reverse transcription followed by random primed DNA
synthesis using commercially available kits. Briefly, an 80 µl reverse
transcription reaction was performed using Perkin Elmer's (Norwalk, CT) RNA
PCR kit (cat. # N808-0017) as directed by the manufacturer using random
hexamers and incubating for 10 minutes at 20°C followed by 2 hours incubation at
42°C. The reactions then were terminated and cDNA/RNA duplexes denatured by
incubation at 99°C for 2 minutes. The reactions were supplemented with 10 µl
10x RP buffer [100 mM NaCl, 420 mM Tris (pH 8.0), 50 mM DTT, 100 µg/ml
BSA], 250 pmoles random hexamers and 13 units Sequenase® version 2.0
polymerase (USB, cat. #70775) in a total volume of 20 µl. The reactions were
incubated at 20°C for 10 minutes followed by 37°C for 2 hours. After
phenol:chloroform extraction and ethanol precipitation, the double stranded DNA
products of these reactions were digested with 4 units of restriction endonuclease
Sau3A I (New England Biolabs [NEB], cat. #169L) in 30 µl reaction volumes for
30 minutes, as directed by the supplier.
B. Generation of amplicons.
Sau3AI-digested DNA was extracted and precipitated as described above.
The entire Sau3AI-digested product was annealed to 465 pmoles R Bgl 24
(SEQUENCE I.D. NO. 1) and 465 pmoles R Bgl 12 (SEQUENCE I.D. NO. 2) in
a 30 µl reaction volume buffered with 1x T4 DNA ligase buffer (NEB) by placing
the reaction in a 50-55°C dry heat block which was then incubated at 4°C for 1
hour. The annealed product was ligated by adding 400 units T4 DNA ligase
(NEB, cat. # 202S). After incubation for 14 hours at 16°C, a small scale PCR was
performed. Briefly, 10 µl of the ligation reaction was added to 60 µl H2O, 20 µl
5x PCR buffer (335 mM Tris, pH 8.8, 80 mM [NH4]2SO4, 20 mM MgCl2, 0.5
µg/ml bovine serum albumin, and 50 mM 2-mercaptoethanol), 8 µl of 4 mM
dNTP stock, 2 µl (124 pmoles) R Bgl 24 (SEQUENCE I.D. NO. 3) and 3.75
units of AmpliTaq® DNA polymerase (Perkin Elmer, cat. # N808-1012). The
PCR amplification was performed in a GeneAmp® 9600 thermocycler (Perkin
Elmer). Samples were incubated for 5 min. at 72°C to fill in the 5'-protruding
ends of the ligated adaptors. The samples were amplified for 25 to 30 cycles (1
min. at 95°C and 3 min. at 72°C) followed by an extension at 72°C for 10 min. After
agarose gel confirmation of successful amplicon generation (i.e., a smear of PCR
products ranging from approximately 100 bp to over 1500 bp), a large scale
amplification of tester and driver amplicons was performed. Forty 100 µl PCRs
and eight 100 µl PCRs were set up as described above for the generation of driver
and tester amplicons, respectively. Two µl from the small scale PCR product per
100 µl reaction served as the template for the large scale amplicon generation.
Thermocycling was performed as described above for an additional 15 to 20 cycles
of amplification. The PCR reactions for both driver and tester DNA were then
phenol/chloroform extracted twice, isopropanol precipitated, washed with 70%
ethanol and digested with Sau3AI to cleave away the adaptors. The tester
amplicon was further purified on a low melting point agarose gel. Briefly, 10 µg
of tester amplicon DNA was run on a 2% SeaPlaque® gel (FMC Bioproducts,
Rockland, ME). Fragments of 150-1500 base pairs were excised from the gel, the
gel slice was melted at 72°C for 20 minutes with 3 ml H2O, 400 µl 0.5 M MOPS
and 400 µl NaCl. DNA was recovered from the melted gel slice using a Qiagen-tip
20 (Qiagen, Inc., Chatsworth, CA) as directed by the manufacturer.
C. Hybridization and Selective Amplification of amplicons
Approximately 2 µg of purified tester DNA amplicon was ligated to N
Bgl 24 (SEQUENCE I.D. NO. 3) and N Bgl 12 (SEQUENCE I.D. NO. 4) as
described above. For the first subtractive hybridization, tester amplicon ligated to
the N Bgl primer set (0.5 µg) and driver amplicon (20 µg) were mixed,
phenol/chloroform extracted and ethanol precipitated. The DNA was resuspended
in 4 µl of EE x 3 buffer (30 mM EPPS, pH 8.0 at 20°C [Sigma, St. Louis, MO], 3
mM EDTA) and overlaid with 35 µl of mineral oil. Following heat denaturation (3
min at 99°C), 1 µl of 5 M NaCl was added to the denatured DNA and the DNA
was allowed to hybridize at 67°C for 20 hours. The aqueous phase was removed
to a new tube and 8 µl of tRNA (5 mg/ml) was added to the sample followed by
390 µl TE (10 mM Tris, pH 8.0 and 1 mM EDTA). Eighty µl of the hybridized
DNA solution was added to 480 µl H2O, 160 µl 5x PCR buffer (above), 64 µl 4
mM dNTPs and 6 µl (30 units) AmpliTaq® polymerase. This solution was
incubated at 72°C for 5 min. to fill in the 5' overhangs created by the ligated N Bgl
24 primer. N Bgl 24 (SEQUENCE I.D. NO. 3, 1.24 nmoles in 20 µl H2O) was
added, the reaction was aliquoted (100 µl/tube) and subjected to 10 cycles of
amplification as described above. The reaction was pooled, phenol/chloroform
extracted twice, isopropanol precipitated, washed with 70% ethanol and
resuspended in 40 µl H2O. Single-stranded DNA was removed by mung bean
nuclease (MBN). Briefly, 20 µl amplified DNA was digested with 20 units MBN
(NEB) in a 40 µl reaction as described by the supplier. One hundred and sixty µl
50 mM Tris, pH 8.8 was added to the MBN digest. The enzyme was heat
inactivated at 99°C for 5 min. Eighty µl of the MBN-digested DNA was PCR
amplified as described above for an additional 15 cycles. Again, the reaction was
pooled, phenol/chloroform extracted twice, isopropanol precipitated, washed with
70% ethanol and resuspended in H2O. The amplified DNA (3 to 5 µg) was then
digested with Sau3A I, extracted and precipitated as described above. The final
DNA pellet was resuspended in 100 µl TE.
D. Subsequent hybridization/amplification steps
One hundred ng of the DNA from the previous hybridization/selective
amplification was ligated to the J Bgl primer set (SEQUENCE I.D. NO. 5 and
SEQUENCE I.D. NO. 6) as described previously. This DNA (50 ng) was mixed
with 20 µg of driver amplicon and the hybridization and amplification procedures
were repeated as described above except that the extension temperature during the
thermocycling was 70°C and not 72°C as for the N Bgl primer set (SEQUENCE
I.D. NO. 3 and SEQUENCE I.D. NO. 4) and the final amplification step (after
MBN digestion) was for 25 cycles. One hundred ng of the second hybridization-
amplification product was then ligated to the N Bgl primer set (SEQUENCE I.D.
NO. 3 and SEQUENCE I.D. NO. 4), and 200 pg of this material together with 20
µg of driver amplicon was taken for the third round of hybridization/amplification
as described above with the final amplification for 25 cycles.
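
The increasing stringency of the three rounds can be seen from the
driver-to-tester mass ratios implied by the amounts given above. The following
sketch simply restates those amounts in picograms and divides; the variable and
function names are hypothetical.

```python
# Tester and driver amounts per round, in picograms, as stated in the text:
# round 1: 0.5 ug tester, round 2: 50 ng, round 3: 200 pg; driver 20 ug each time.
DRIVER_PG = 20_000_000
tester_pg_by_round = {1: 500_000, 2: 50_000, 3: 200}

def driver_to_tester_ratio(driver_pg, tester_pg):
    """Mass ratio of driver amplicon to tester amplicon in one hybridization."""
    return driver_pg / tester_pg

ratios = {
    rnd: driver_to_tester_ratio(DRIVER_PG, tester_pg)
    for rnd, tester_pg in tester_pg_by_round.items()
}
for rnd, ratio in ratios.items():
    print(f"round {rnd}: driver:tester = {ratio:,.0f}:1")
```

The driver excess rises from 40-fold to 100,000-fold, so sequences shared by
tester and driver are increasingly unlikely to re-anneal tester-to-tester and
be amplified.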
A 2% agarose gel of the products from the representational difference
analysis (RDA) performed on pre-HGBV inoculated and acute phase T-1053
plasma is shown in FIGURE 14. Referring to FIGURE 14, Lane 1 contains 150
ng of HaeIII digested Phi-X174 DNA marker (NEB) with the appropriate size (in
bp) of the DNA fragments. The complexity of the driver amplicon (lane 2) and the
tester amplicon (lane 3) is evidenced by the smear of DNA products seen in these
samples. This complexity drops dramatically as the tester sequences are subjected
to one (lane 4), two (lane 5) or three (lane 6) rounds of hybridization/selective
amplification.
E. Cloning of the difference products
The difference products were cloned into the BamHI site of pBluescript
II KS+ (Stratagene, La Jolla, CA, cat. # 212207), as follows. Briefly, 0.5 µg
pBluescript II was digested with BamHI (10 units, NEB) and 5' dephosphorylated
with calf intestinal phosphatase (10 units, NEB) as directed by the supplier. The
plasmid was phenol:chloroform extracted, ethanol precipitated, washed with 70%
ethanol and resuspended in 10 µl H2O (final concentration approximately 50 ng
pBluescript II per µl). The four largest bands from the second
hybridization/amplification products were excised from a 2% low melting point
agarose gel as described above. Four µl of the melted (72°C, 5 min.) gel slices
were ligated to 50 ng of the BamHI-cut, dephosphorylated pBluescript II in a 50 µl
reaction using the Takara DNA ligation kit (Takara Biochemical, Berkeley, CA).
After incubating at 16°C for 3.5 hours, 8 µl of the ligation reactions were used to
transform E. coli competent XL-1 Blue cells (Stratagene) as directed by the
supplier. The transformation mixtures were plated on LB plates supplemented
with ampicillin (150 µg/ml) and incubated overnight at 37°C. The resulting
colonies were grown up in liquid culture and miniprep plasmid DNA was analyzed
as described in the art to confirm the existence of cloned product.
In addition to the cloning of the four largest products from the second
hybridization/amplification step, the entire population of products from the third
hybridization/amplification step was cloned into pBluescript II. Briefly, 50 ng
pBluescript II vector (prepared as above) was ligated to 10 ng of the third
hybridization/amplification products in a 50 µl reaction as described above. After
incubation at 16°C for 2 hours, 10 µl ligation product was used to transform E.
coli competent XL-1 Blue cells as before. Sixty colonies from the resultant
transformation were grown up, and miniprep DNA was prepared and analyzed as
described and known in the art. Restriction endonuclease digestion and dot blot
hybridization experiments were used to identify unique clones.

Example 4. Immunoisolation of a cDNA Clone Encoding an
Antigenic Region of the HGBV Genome
A. Preparation of Concentrated Virus as a Source of Cloning Material
The following isolation scheme was employed to isolate the HGBV
genome in addition to the procedures exemplified in Example 3. Three tamarins
(T-1055, T-1038 and T-1049) were inoculated with serum prepared from tamarin
T-1053 as described in Example 2. Referring to TABLE 5, elevated liver enzyme
values were noted in all 3 tamarins by day 11 PI. Tamarin T-1055 was sacrificed
on day 12 PI and tamarins T-1038 and T-1049 were sacrificed on day 14 PI.
Approximately 3-4 ml of serum from each of these three tamarins were pooled,
providing a total volume of approximately 11.3 ml. The pooled serum was
clarified by centrifugation at 10,000 x g for 15 min at 15°C. It was then passed
successively through 0.8, 0.45, 0.2, and 0.1 µm syringe filters. This filtered
material was then concentrated by centrifugation through a 0.3 ml CsCl cushion
(density 1.6 g/ml, in 10 mM Tris, 150 mM NaCl, 1 mM EDTA, pH 8.0) in a
SW41-Ti rotor at 41,000 rpm at 4°C for 68 min. The CsCl layer, approximately
0.6 ml, was removed following centrifugation and stored in three 0.2 ml aliquots at
-70°C.
Tamarin T-1034 was subsequently inoculated with 0.25 ml of a 10^-6
dilution of this pelleted material (prepared in normal tamarin serum). Elevated
ALT liver enzyme values were first noted in T-1034 at 2 weeks PI, and remained
elevated for the next 7 weeks, finally normalizing by week 10 PI (see FIGURE
30, Example 14). This experiment demonstrated the infectivity of the material
concentrated from the pooled tamarin sera. Since this material was shown to be of
a relatively high titer, this concentrated source of virus was used as the source of
nucleic acid for the preparation of a cDNA library, as described below.
B. cDNA Library Construction
An aliquot (0.2 ml) of the concentrated virus (described above) was
extracted for RNA using a commercially available RNA extraction kit (Stratagene,
La Jolla, CA) as instructed by the supplier. The sample was divided into four
equal aliquots prior to the final precipitation step, and then precipitated in the
presence of 5 µg/ml yeast tRNA. Only one of these aliquots was used for cDNA
synthesis; the others were stored at -80°C. Phosphorylated, blunt-ended,
double-stranded cDNA was prepared from the RNA using a commercially available kit
(Stratagene, La Jolla, CA) as directed by the manufacturer. A double-stranded
linker/primer was then ligated to the cDNA ends (sense strand, SEQUENCE I.D.
NO. 7; antisense strand, SEQUENCE I.D. NO. 8) in a 10 µl reaction volume
using a T4 DNA ligase kit (Stratagene, La Jolla, CA) as directed by the
manufacturer. This provided all cDNAs in the mixture with identical 5' and 3'
ends containing Not I and Eco RI restriction enzyme recognition sites. G. Reyes
and J. Kim, Mol. Cell. Probes 5:473-481 (1991); A. Akowitz and L. Manuelidis,
Gene 81:295-306 (1989); and G. Inchauspe et al., in Viral Hepatitis and Liver
Disease. F.B. Hollinger et al., Eds., pp. 382-387 (1991). The sense-strand
oligonucleotide of the linker/primer was then used as a primer in a PCR reaction
such that all cDNAs were amplified independent of their sequence. This procedure
allowed for the amplification of rare cDNAs present within the total cDNA
population to a level which allowed them to be efficiently cloned, thus producing a
cDNA library that is representative of the sequences within the starting material.
PCR was performed on a 1 µl aliquot of the above ligate in the presence of
the sense-strand oligonucleotide primer (final concentration: 1 µM; reaction
volume: 50 µl) using the GeneAmp PCR kit (Perkin-Elmer) as directed by the
manufacturer in a PE-9600 thermocycler. Thirty cycles of PCR were performed as
follows: denaturation at 94°C for 0.5 min, annealing at 55°C for 0.5 min, and
extension at 72°C for 1.5 min. A 1 µl aliquot of the resulting products was then
re-amplified as described above. The final PCR reaction products were then
extracted once with an equal volume of phenol-chloroform (1:1, v/v) and once
with an equal volume of chloroform, and then precipitated on dry ice for 10 min
following the addition of sodium acetate (final concentration, 0.3 M) and 2.5
volumes of absolute ethanol. The resulting DNA pellet was resuspended in water
and digested with the restriction enzyme Eco RI (New England Biolabs) as
directed by the manufacturer. The digested cDNAs were then purified from the
reaction mixture using a DNA binding resin (Prep-a-Gene, BioRad Laboratories)
as directed by the manufacturer and eluted in 20 µl of distilled water.
The cDNAs (8 µl) were ligated to 3 µg lambda gt11 vector DNA arms
(Stratagene, La Jolla, CA) in a reaction volume of 30 µl at 4°C for 1-5 days.
Eleven microliters of the ligate was packaged into phage heads using GigaPack m
Gold packaging extract (Stratagene, La Jolla, CA) as directed by the manufacturer.
The resulting library contained a total of approximately 1.73 million members
(PFU) at a recombination frequency of 89.3% with an average insert size of
approximately 350 base pairs.
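
The library figures above can be combined into a rough estimate of how many
members carry an insert and how much cloned sequence that represents. This is a
back-of-the-envelope sketch only; it assumes every recombinant carries an insert
of the average size, and the function names are hypothetical.

```python
def recombinant_members(total_pfu, recombinant_fraction):
    """Number of library members expected to carry an insert."""
    return total_pfu * recombinant_fraction

def total_cloned_bp(recombinants, mean_insert_bp):
    """Total cloned sequence, assuming every insert has the mean size."""
    return recombinants * mean_insert_bp

TOTAL_PFU = 1_730_000          # approximately 1.73 million members
RECOMBINANT_FRACTION = 0.893   # 89.3% recombination frequency
MEAN_INSERT_BP = 350           # approximate average insert size

recombinants = recombinant_members(TOTAL_PFU, RECOMBINANT_FRACTION)
cloned_bp = total_cloned_bp(recombinants, MEAN_INSERT_BP)
print(f"recombinant members: about {recombinants / 1e6:.2f} million")
print(f"cloned sequence: about {cloned_bp / 1e6:.0f} million base pairs")
```

Because the cDNA was PCR-amplified before cloning, many of these members are
expected to be copies of the same starting molecules, so this total overstates
the number of independent sequences.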
C. Immunoscreening of the Recombinant GB cDNA Library
The antiserum used for immunoscreening of the cDNA library was
obtained from tamarins that had demonstrated elevations in their serum liver
enzyme levels following inoculation. Two separate pools of antisera were used for
immunoscreening. The first pool contained serum from two animals (T-1048 and
T-1051; see Example 1, TABLE 2, and Example 2, TABLE 5, respectively) while
the second pool contained serum from a single animal (T-1034; see FIGURE 30,
Example 14). The specific sera used are shown in TABLE 6.
At the time that these samples were chosen for use in cDNA library
immunoscreening, they had not been tested for their immunoreactivity with either
the 1.4 or 1.7 recombinant CKS proteins (Example 13). Therefore, the results
shown herein were obtained independent of any information regarding the
presence or absence of HGBV antibodies against these recombinant proteins
within the antiserum used.

TABLE 6
Tamarin Sera used for Immunoscreening of GB cDNA Library

Tamarin 1048(a)            Tamarin 1051(b)            Tamarin 1034(c)
Days Post-   Volume in     Days Post-   Volume in     Days Post-   Volume in
Inoculate    Pool          Inoculate    Pool          Inoculate    Pool
63           0.2 ml        63           0.2 ml        42           0.1 ml
77           0.2 ml        69           0.1 ml        49           0.1 ml
91           0.2 ml        91           0.2 ml        63           0.1 ml
97           0.2 ml        98           0.2 ml        70           0.1 ml
126          2.0 ml        105          0.2 ml        77           0.08 ml
                           109          5.3 ml

(a) Total T-1048 pool volume is 2.8 ml.
(b) Total T-1051 pool volume is 6.4 ml. One ml of each pool was saved and the
remainder of each was combined and used as the primary antiserum for
immunoscreening.
(c) Total T-1034 pool volume is 0.48 ml; the entire pool was used for
immunoscreening.

The procedure used for the immunoisolation of recombinant phage was
based upon the method described by Young and Davis with modifications as
described below. R.A. Young and R.W. Davis, PNAS 80:1194-1198 (1983).
Two immunoscreening experiments were performed, one utilizing antiserum
pooled from T-1048 and T-1051 and the other utilizing antiserum from T-1034. In
both cases, the primary antiserum was pre-adsorbed against E. coli extract prior to
use in order to reduce non-specific interactions of antibody with E. coli proteins. In
the first experiment, 1.29 million recombinant phage were immunoscreened with
the T-1048/T-1051 antiserum pool; in the second experiment 0.30 million
recombinant phage were immunoscreened with T-1034 antiserum. The
recombinant phage library was plated on a lawn of E. coli strain Y1090r- and
grown at 37°C for 3.5 hours. The plates were then overlayed with nylon filters
that were saturated with IPTG (10 mM) and the plates incubated at 42°C for 3.5
hours. The filters were then blocked in Tris-saline buffer containing 1% BSA, 1%
gelatin, and 3% Tween-20 ("blocking buffer") for 1 hour at 22°C. The filters were
then incubated in primary antiserum (1:100 dilution in blocking buffer) at 4°C for
16 hours. Primary antiserum was then removed and saved for subsequent rounds
of plaque purification, and the filters washed four times in Tris-saline containing
0.1% Tween-20. The filters were then incubated in blocking buffer containing
125-I-labeled (or alkaline-phosphatase conjugated) goat anti-human IgG (available
from Jackson ImmunoResearch, West Grove, PA) for 60 min at 22°C, washed as
described above, and then exposed to x-ray film (or subjected to color
development according to established procedures, as in J. Sambrook et al.,
Molecular Cloning: A Laboratory Manual. 2nd edition, Cold Spring Harbor Press,
Cold Spring Harbor, N.Y., 1989). Five immunopositive phage (4-3B1, 48-1A1,
66-3A1, 70-3A1, 78-1C1) were isolated from this library and subsequently tested
for specificity of binding to antisera from three infected tamarins (T-1048, T-1051,
T-1034) using the method described above. These recombinants encoded
polypeptides that reacted with convalescent sera, but not with pre-inoculation sera,
from each of the three infected tamarins (data not shown).
In order to verify the specificity of the immunological reactivity of the
polypeptide encoded by the rccolll~illant phage, each cDNA was rescued from the
lambda phage genome by PCR using primers located 5' (SEQUENOE I.D. NO. 9)
and 3' (SEQUENCE I.D. NO. 10) to the Eco RI cloning site. The PCR products
0 were then digested with Eco RI and subseq~lently ligated into the E. coli
expression plasmid pJO201 as described in Example 13. Insertion of the cDNAs
into the Eco RI site of pJO201 ~A;Ill~ d the translational reading frame of thiscDNA as present in the lambda phage clone. The subclones in the pJO201
eA~l~,ssion vector were ~ecigr^~d 4-3Bl.l, 48-lA1.1, 66-3A1.49, 70-3A1.37,
and 78-lCl.17. Llllll"lloblot analysis (as in Example 13) of E. coli lysates
prepared from cultures eA~JlGssillg these cDNAs with conv~lesc~--nt sera from
t~m~rins T-1034, T-1048, and T-1051 (1:100 dilution) dclllol~ cd specific
immunologic reactivity with a protein of the size predicted for each CKS-fusion
protein. (data not shown). The DNA sequence of each of the cDNAs was
detçrminP~ and it was found that these clones possesse~ nearly 100% sequence
identity with that of HGBV-B virus (SEQUENCE I.D. NO. 11). The sequence of
the 4-3Bl.l insert (SEQUENOE I.D. NOS. 12 and 13), although not detcllllined
in its entirety, those portions that have been sequenrecl exhibit 99.5% Sequenceidentity to a portion of the sequ~-n~e within HGBV- B (SEQUENCE I.D. NO. 11)
2s from base pairs 6834-7458. This region of the HGBV-B (SEQUENCE I.D. NO.
11) sequence showing identity with that of the sc~lucnce obtained from clone 4-
3B 1.1 was tr~ncl~tffl into the +1 reading frame and is ~l~se..~d in the sequenc~.
listing as SEQUENOE I.D. NO. 14. The sequence of the 48-lAl.l insert
(SEQUENOE I.D. NO. 15) exhibits 100% Sequen~e identity to a portion of the
sequence from HGBV-B (SEQUENOE I.D. NO. 11, see Example 9) from base
pairs 4523-4752. The DNA sequence corresponding to SEQUENOE I.D. NO. 15
was tr~ncl~çd into the +l reading frame and is pl~sel,led in the sequence listing as
SEQUENCE I.D. NO. 16. The sequence of the 66-3Al.49 insert (SEQUENCE
- I.D. NO. 17) exhibits essentially 100% sequence identity to that of clone 48-
3s lAl.l and thus no protein translation is shown in the sequence listing. The
sequence of the 70-3A.1.37 insert (~UENOE I.D. NO. 18) exhibits 100%
sequence identity to a portion of the sequen~e from HGBV-B (SEQUENCE I.D.

WO 95t21922 ~ PCI/US95/02118

76

NO. 11) from base pairs 6450-6732 except for a three base-pair deletion
co,~ onding to bases 6630-6632 of the HGBV-B sequence (SEQUENCE I.D.
NO. 11). The DNA sequence collc~ponding to SEQUENCE I.D. NO. 18 was
tr~n~l~te~ into the ~2 reading frame and is ~l~scn~cd in the sequence listing asSEQUENCE I.D. NO. 19. The sequence of the 78-lC1.17 insert (SEQUENCE
I.D. NO. 20) exhibits 100% sequence identity to that of clone 70-3A1.37 and thusno protein translation is shown in the sequence listing. These data demonstrate
that the cDNA clones j~gl~ted from the lambda gtl 1 cDNA library are derived from
the genome of the HGBV agent and that it çn~odes polypeptides which are
specifically recognized immlmologically by sera from GB-infected t~m~rin~
Clones 48-lAl.l("clone 48") 4-3B1.1, 66-3A1.49, 70-3A1.37, and 78-lC1.17
have been deposited at the ~."~. ;c~. Type Culture Collection as provided
hereinabove.
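
The base-pair coordinates quoted above can be sanity-checked with simple arithmetic. The following is a minimal sketch and not part of the original disclosure: the helper names `span_length` and `preserves_frame` are hypothetical, and only the coordinates are taken from the text. It shows how long each matched HGBV-B region is and why a three base-pair deletion leaves the reading frame intact.

```python
# Illustrative helpers; names are hypothetical and not taken from the source.

def span_length(start: int, end: int) -> int:
    """Length in base pairs of an inclusive coordinate range."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def preserves_frame(indel_length: int) -> bool:
    """An insertion or deletion keeps the reading frame only when its
    length is a multiple of three (one codon is three bases)."""
    return indel_length % 3 == 0


# Coordinates on HGBV-B (SEQUENCE I.D. NO. 11) as quoted in the text.
REGIONS = {
    "4-3B1.1": (6834, 7458),
    "48-1A1.1": (4523, 4752),
    "70-3A1.37": (6450, 6732),
}
DELETION_70_3A1_37 = (6630, 6632)

if __name__ == "__main__":
    for clone, (start, end) in REGIONS.items():
        print(clone, span_length(start, end), "bp")
    deleted = span_length(*DELETION_70_3A1_37)
    print("deletion:", deleted, "bp; frame preserved:", preserves_frame(deleted))
```

Because the ranges are inclusive, the length is `end - start + 1`; forgetting the `+ 1` is the usual off-by-one error when reading coordinates of this kind.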

Example 5. DNA sequence analysis of HGBV clones
Unique clones obtained in Example 3 were sequenced using the
dideoxynucleotide chain termination technique (Sanger, et al., supra) in a kit
form (Sequenase® version 2.0, USB). These sequences are non-overlapping and
are presented in the Sequence Listing as clone 4 (SEQUENCE I.D. NO. 21), clone 2
(SEQUENCE I.D. NO. 22), clone 10 (SEQUENCE I.D. NO. 23), clone 11
(SEQUENCE I.D. NO. 24), clone 13 (SEQUENCE I.D. NO. 25), clone 16
(SEQUENCE I.D. NO. 26), clone 18 (SEQUENCE I.D. NO. 27), clone 23
(SEQUENCE I.D. NO. 28), clone 50 (SEQUENCE I.D. NO. 29) and clone 119
(SEQUENCE I.D. NO. 30). Clones 4, 2, 10, 11, 13, 16, 18, 23, 50 and 119
have been deposited at the A.T.C.C. Clone 2 was accorded A.T.C.C. Deposit
No. 69556; Clone 4 was accorded A.T.C.C. Deposit No. 69557; Clone 10 was
accorded A.T.C.C. Deposit No. 69558; Clone 16 was accorded A.T.C.C. Deposit
No. 69559; Clone 18 was accorded A.T.C.C. Deposit No. 69560; Clone 23 was
accorded A.T.C.C. Deposit No. 69561; Clone 50 was accorded A.T.C.C.
Deposit No. 69562; Clone 11 was accorded A.T.C.C. Deposit No. 69613;
Clone 13 was accorded A.T.C.C. Deposit No. 69611; and Clone 119 was
accorded A.T.C.C. Deposit No. 69612.

The sequences were searched against the GenBank database using the
BLASTN algorithm (Altschul et al., J. Mol. Biol. 215:403-410 [1990]). None of
these sequences were found in GenBank, indicating that these sequences have
not been previously characterized in the literature. The DNA sequences were
translated into the six possible reading frames and are presented in the
sequence listing (SEQUENCE I.D. NO. 21 translates to SEQUENCE I.D. NOS. 31-36,
SEQUENCE I.D. NO. 22 translates to SEQUENCE I.D. NOS. 37-42,
SEQUENCE I.D. NO. 23 translates to SEQUENCE I.D. NOS. 43-48,
SEQUENCE I.D. NO. 26 translates to SEQUENCE I.D. NOS. 49-54,
SEQUENCE I.D. NO. 27 translates to SEQUENCE I.D. NOS. 55-60,
SEQUENCE I.D. NO. 28 translates to SEQUENCE I.D. NOS. 61-66, and
SEQUENCE I.D. NO. 29 translates to SEQUENCE I.D. NOS. 67-72).
SEQUENCE I.D. NO. 24 is contained within SEQUENCE I.D. NO. 73
(described in Example 9), which translates to SEQUENCE I.D. NOS. 74-79.
SEQUENCE I.D. NOS. 25 and 30 are contained within SEQUENCE I.D. NO. 80
(described in Example 9), which translates to SEQUENCE I.D. NOS. 81-86. The
translated sequences were used to search the SWISS-PROT database using the
BLASTX algorithm (Gish et al., Nature Genetics 3:266-272 [1993]). Again, none
of these sequences were found in SWISS-PROT, indicating that these sequences
have not been previously characterized in the literature.
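
Translation of a nucleotide sequence into the six possible reading frames, as described above, can be sketched as follows. This is an illustrative example only and not the procedure or software used in the source: the function names are hypothetical, the standard genetic code is assumed, and the frame labels (+1 to +3 for the given strand, -1 to -3 for its reverse complement) follow a common convention that the text does not spell out. The database searches themselves (BLASTN, BLASTX) are not reproduced.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons from position 0; unknown codons give 'X'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X")
        for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq: str) -> dict:
    """Translate a DNA sequence in all six reading frames."""
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


if __name__ == "__main__":
    # Made-up 15-base example, not a sequence from the Sequence Listing.
    for frame, protein in six_frame_translations("ATGGGCGATGATTAA").items():
        print(frame, protein)
```

Each DNA sequence therefore yields six candidate protein sequences, which matches the six consecutive SEQUENCE I.D. numbers assigned to the translations of each clone.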
Homology searches conducted using the BLASTN, BLASTX and
FASTdb algorithms demonstrate some, albeit low, sequence resemblance to
hepatitis C virus (TABLE 7, below). Specifically, translations of clones 4
(SEQUENCE I.D. NO. 35), 10 (SEQUENCE I.D. NO. 44), 11 (residues 1-166
of GB-A, frame 3 [SEQUENCE I.D. NO. 76]), 16 (SEQUENCE I.D. NO. 50),
23 (SEQUENCE I.D. NO. 65), 50 (SEQUENCE I.D. NOS. 70 and 72) and 119
(residues 912-988 of GB-A, frame 3 [SEQUENCE I.D. NO. 83]), are between
24.1% and 45.1% homologous to various HCV isolates at the amino acid level. Of
particular interest, translation of clone 10 (SEQUENCE I.D. NO. 44) showed
limited homology to the putative RNA-dependent RNA polymerase of HCV. A
comparison of the conserved amino acids present in the putative RNA-dependent
RNA polymerase of other positive strand viruses (Jiang et al. PNAS 90:10539-
10543 [1993]) with the putative amino acid translation of clone 10 (SEQUENCE
I.D. NO. 44) revealed that conserved amino acid residues of other RNA-dependent
RNA polymerases are also conserved in clone 10 (SEQUENCE I.D. NO. 44).
This includes the canonical GDD (Gly-Asp-Asp) signature sequence of
RNA-dependent RNA polymerases. Thus, clone 10 (SEQUENCE I.D. NO. 44)
appears to encode a viral RNA-dependent RNA polymerase. Surprisingly, only
clone 10 (SEQUENCE I.D. NO. 44) showed any sequence homology with HCV
at the nucleotide level when the BLASTN algorithm was used. Clones 4
(SEQUENCE I.D. NO. 21), 16 (SEQUENCE I.D. NO. 26), 23 (SEQUENCE
I.D. NO. 28), 50 (SEQUENCE I.D. NO. 29) and 119 (SEQUENCE I.D. NO.
30), which have low HCV homology at the amino acid level, were not detected by
BLASTN in searches of GenBank. In addition, clones 2 (SEQUENCE I.D. NOS.
37-42), 13 (SEQUENCE I.D. NO. 25 and 37-42) and 18 (SEQUENCE I.D.
NOS. 27 and 55-60) showed no significant nucleotide or amino acid homology to
HCV when searched against GenBank or SWISS-PROT as described
hereinabove.
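
Locating a short signature such as GDD in a translated sequence amounts to a simple motif scan. The sketch below is illustrative only: `find_motif` is a hypothetical helper, and the peptide in the usage example is made up rather than taken from SEQUENCE I.D. NO. 44. Finding the tripeptide alone is of course not evidence of polymerase function; the text relies on a broader comparison of conserved residues.

```python
import re


def find_motif(protein: str, motif: str = "GDD") -> list:
    """Return the 1-based start positions of every occurrence of `motif`.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile(f"(?={re.escape(motif.upper())})")
    return [match.start() + 1 for match in pattern.finditer(protein.upper())]


if __name__ == "__main__":
    # Hypothetical peptide used purely for illustration.
    print(find_motif("MKLGDDAAGDD"))
```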
TABLE 7
HCV Homology of HGBV Clones

              Homology
Clone  Nucleotide(a)       Amino Acid(b)    Strain(c)  Region(d)  Function(e)
4      none                28/73 (38.4%)    HCVTW      NS4        unknown
10     134/307 (43.6%)(f)  46/102 (45.1%)   HCVJ6      NS5        replicase
11     none                40/166 (24.1%)   HCVJT      NS5        replicase
16     none                55/177 (31.1%)   HCVJ8      NS2/3      protease
23     none                44/121 (36.4%)   HCVJA      NS3        helicase
50     none                29/112 (25.9%)   HCVH       NS4/5      unknown
119    none                27/77 (35.1%)    HCVTW      NS5        replicase

(a) Homology found to HCV when GB clones were searched against GenBank using
the BLAST algorithm.
(b) Homology found to HCV when translated GB clone sequences were searched
against SWISS-PROT using the FASTdb algorithm.
(c) Most homologous strain of HCV (SWISS-PROT designation).
(d, e) Region of homology and reputed function of clone compared with HCV
according to Houghton et al., Hepatology 14(2):381-388 (1991).
(f) BLASTN detected a segment of clone 10 that was 64% homologous with HCV
NS5 over 132 nucleotides. Alignment of the entire clone 10 sequence with the
homologous nucleotide sequence of HCVJ6 shows 43.6% homology.
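
The percentages in TABLE 7 follow directly from the matched/aligned residue counts. As a minimal sketch (the helper name `percent_identity` is hypothetical, and the unlabeled 29/112 row is assumed to belong to clone 50 on the basis of the clone list given in the text), they can be recomputed as follows:

```python
def percent_identity(matches: int, aligned_length: int) -> float:
    """Percentage of matching positions over the aligned length, to 0.1%."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return round(100.0 * matches / aligned_length, 1)


# (matches, aligned length) for the amino acid comparisons in TABLE 7.
# The clone "50" key is an assumption: that row carries no clone label.
AMINO_ACID_COUNTS = {
    "4": (28, 73),
    "10": (46, 102),
    "11": (40, 166),
    "16": (55, 177),
    "23": (44, 121),
    "50": (29, 112),
    "119": (27, 77),
}
# Nucleotide comparison reported for clone 10 only.
CLONE_10_NUCLEOTIDE = (134, 307)

if __name__ == "__main__":
    for clone, (matches, length) in AMINO_ACID_COUNTS.items():
        print(f"clone {clone}: {percent_identity(matches, length)}%")
    print(f"clone 10 nucleotide: {percent_identity(*CLONE_10_NUCLEOTIDE)}%")
```

The recomputed values span 24.1% to 45.1%, which agrees with the range quoted in the text for amino acid homology to HCV isolates.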

Example 6. Exogenicity of HGBV clones
The HGBV clones were not detected in normal or HGBV-infected
tamarin liver DNA, normal human lymphocyte DNA, yeast DNA or E. coli DNA.
This was demonstrated for HGBV clones 2 (SEQUENCE I.D. NO. 22) and 16
(SEQUENCE I.D. NO. 26) by Southern blot analysis. In addition, all HGBV
clones were analyzed by genomic PCR to confirm the exogenous origin of the
HGBV sequences with respect to the tamarin, human, yeast and E. coli genomes.
These data are consistent with the viral nature of the HGBV sequences described
in Example 5.
A. Southern Blot analysis.

Tamarin liver nuclei were obtained from low speed pelleting of liver
homogenates of HGBV-infected and normal tamarins (described hereinbelow).
DNA was extracted from nuclei using a commercially available kit (USB cat. #
73750) as directed by the supplier. The tamarin DNA was treated with RNase
during the extraction procedure. Human placental DNA (Clontech, Palo Alto,
CA), yeast DNA (Saccharomyces cerevisiae, Clontech) and E. coli DNA (Sigma)
were obtained from commercial sources.

Each DNA sample was digested with BamHI (NEB) according to the
supplier's directions. Digested DNAs (10 µg) and RDA products (0.5 µg each
from Example 3B) were electrophoresed on 1% agarose gels and capillary blotted
to Hybond-N+ nylon membranes (Amersham, Arlington Heights, IL) as described in
Sambrook et al. (pp. 9.34 ff). DNA was fixed to the membrane by alkali
treatment as directed by the membrane supplier. Membranes were prehybridized in
Rapid Hyb solution (Amersham) at 65°C for 30 min.

Radiolabeled probes of the HGBV sequences were prepared by PCR.
Briefly, 50 µl PCRs were set up using 1x PCR buffer II (Perkin Elmer), 2 mM
MgCl2, 20 µM dNTPs, 1 µM each of clone specific sense and antisense primers
(for clone 2, SEQUENCE I.D. NOS. 87 and 88; for clone 4, SEQUENCE I.D.
NOS. 89 and 90; for clone 10, SEQUENCE I.D. NOS. 91 and 92; for clone 16,
SEQUENCE I.D. NOS. 93 and 94; for clone 18, SEQUENCE I.D. NOS. 95 and
96; for clone 23, SEQUENCE I.D. NOS. 97 and 98; and for clone 50,
SEQUENCE I.D. NOS. 99 and 100), 1 ng HGBV clone plasmid (described in
Example 3[E]), 60 µCi α-32P-dATP (3000 Ci/mmol) and 1.25 units of
AmpliTaq® polymerase (Perkin Elmer). The reactions were incubated at 94°C for
30 sec., 55°C for 30 sec., and 72°C for 30 sec. for a total of 30 cycles of
amplification followed by a final extension at 72°C for 3 minutes.
Unincorporated label was removed by Quick-Spin® G-50 spin columns (Boehringer
Mannheim, Indianapolis, IN) as directed by the supplier. The probes were
denatured (99°C, 2 min.) prior to addition to the pre-hybridized membranes.

Radiolabeled probes were added to the prehybridized membranes (2 x
10^6 dpm/ml) and filters were hybridized at 65°C for 2.5 hours as directed by
the Rapid Hyb supplier. The hybridized membranes were washed under conditions
of moderate stringency (1x SSC, 0.1% SDS at 65°C) before being exposed to
autoradiographic film for 72 hours at -80°C with an intensifying screen. These
conditions were designed to detect a single copy gene with a similar
radiolabeled probe.

The results show that clone 2 (SEQUENCE I.D. NO. 22) and clone 16
(SEQUENCE I.D. NO. 26) sequences did not hybridize to DNA from normal or
HGBV-infected tamarin liver (FIGURES 15 and 16, lanes 1B and 3B,
respectively), human DNA (FIGURES 15 and 16, lane 1A), yeast DNA
(FIGURES 15 and 16, lane 2A) or E. coli DNA (FIGURES 15 and 16, lane 3A).
In addition, no hybridization was detected with the driver amplicon DNA
(FIGURES 15 and 16, lane 4A, derived from pre-HGBV-inoculated tamarin
plasma as described in Example 2.B). In contrast, strong hybridization signals
were seen with the tester amplicon (FIGURES 15 and 16, lane 6A, derived from
infectious HGBV tamarin plasma using total nucleic acid extraction and reverse
transcription steps as described in Example 2.B) and the products of the three
rounds of subtraction/selective amplification (FIGURES 15 and 16, lanes 7A, 8A
and 4B referring to the products from the first, second and third rounds of
subtraction/selective amplification, respectively). These data demonstrate that
HGBV clones 2 (SEQUENCE I.D. NO. 22) and 16 (SEQUENCE I.D. NO. 26)
can be detected in nucleic acid sequences amplified from infectious sources,
and that HGBV clones 2 (SEQUENCE I.D. NO. 22) and 16 (SEQUENCE I.D. NO. 26)
are not derived from tamarin, human, yeast or E. coli genomic DNA sequences.
B. Genomic PCR analysis.
To further demonstrate the exogenicity of the HGBV sequences and
support their viral origin, PCR was performed on genomic DNA from tamarin,
human, yeast and E. coli. DNA from normal tamarin kidney and liver tissue was
prepared as described by J. Sambrook et al., supra. Yeast, Rhesus monkey
kidney and human placental DNAs were obtained from Clontech. E. coli DNA
was obtained from Sigma.

PCR was performed using GeneAmp® reagents from Perkin-Elmer-
Cetus essentially as directed by the supplier's instructions. Briefly, 300 ng
of genomic DNA was used for each 100 µl reaction. PCR primers derived from
HGBV cloned sequences (for clone 2, SEQUENCE I.D. NOS. 87 and 88; for
clone 4, SEQUENCE I.D. NOS. 89 and 90; for clone 10, SEQUENCE I.D.
NOS. 91 and 92; for clone 16, SEQUENCE I.D. NOS. 93 and 94; for clone 18,
SEQUENCE I.D. NOS. 95 and 96; for clone 23, SEQUENCE I.D. NOS. 97
and 98; and for clone 50, SEQUENCE I.D. NOS. 99 and 100) were used at a
final concentration of 0.5 µM. PCR was performed for 35 cycles (94°C, 1 min;
55°C, 1 min; 72°C, 1 min) followed by an extension cycle of 72°C for 7 min.
The PCR products were separated by agarose gel electrophoresis and visualized
by UV irradiation after direct staining of the nucleic acid with ethidium
bromide and/or hybridization to a radiolabelled probe after Southern blot
transfer to a nitrocellulose filter. Probes were generated as described in
Example 6A. Filters were prehybridized in Fast-Pair Hybridization Solution
from Digene (Beltsville, MD) for 3-5 hours and then hybridized in Fast-Pair
Hybridization Solution with 100-200 cpm/cm2 at 42°C for 15-25 hours. Filters
were washed as described in G. G. Schlauder et al., J. Virol. Methods
37:189-200 (1992) and exposed to Kodak X-Omat-AR film for 15 to 72 hours at
-70°C with intensifying screens.
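
The two thermal cycling programs described in Example 6 (probe labeling in part A, genomic PCR in part B) can be written down as structured data, which makes the total programmed hold time easy to check. This is a minimal sketch under stated assumptions: the class and field names are hypothetical, only the temperatures, times and cycle counts come from the text, and instrument ramp times between steps are ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time at that temperature


@dataclass(frozen=True)
class PcrProgram:
    cycle: tuple            # the steps repeated in every cycle
    cycles: int             # number of repeats
    final_extension: Step   # single hold after the last cycle

    def hold_seconds(self) -> int:
        """Total programmed hold time, excluding ramping between steps."""
        per_cycle = sum(step.seconds for step in self.cycle)
        return self.cycles * per_cycle + self.final_extension.seconds


# Example 6A: radiolabeled probe synthesis.
PROBE_LABELING = PcrProgram(
    cycle=(Step(94, 30), Step(55, 30), Step(72, 30)),
    cycles=30,
    final_extension=Step(72, 180),
)

# Example 6B: genomic PCR.
GENOMIC_PCR = PcrProgram(
    cycle=(Step(94, 60), Step(55, 60), Step(72, 60)),
    cycles=35,
    final_extension=Step(72, 420),
)

if __name__ == "__main__":
    for name, program in (("probe labeling", PROBE_LABELING),
                          ("genomic PCR", GENOMIC_PCR)):
        print(f"{name}: {program.hold_seconds() / 60:.0f} min of programmed holds")
```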
FIGURE 17 shows an ethidium bromide stained 1.5% agarose gel.
FIGURE 18 shows an autoradiogram from a Southern blot from the same gel
after hybridization to the radiolabeled probe from clone 16 (SEQUENCE I.D.
NO. 26). Consistent with its exogenous nature, clone 16 (SEQUENCE I.D. NO.
26) sequences were not detected in tamarin (FIGURES 17 and 18, lanes 9 and
10), Rhesus monkey (lane 11) or human genomic DNAs (lane 12) or in yeast or
E. coli DNAs (data not shown) by genomic PCR analysis despite being able to
detect clone 16 (SEQUENCE I.D. NO. 26) sequences that have been spiked into
normal tamarin liver and kidney DNA at 0.05 genome equivalents (lanes 17 and
18). In addition, primers derived from the human dopamine D1 receptor gene,
1000-1019 base pairs (sense primer) and 1533-1552 base pairs (antisense
primer) (GenBank accession number X55760, R. K. Sunahara et al., Nature
347:80-83 [1990]) successfully amplified the dopamine D1 receptor DNA from
the primate genomic DNAs (FIGURE 17 lanes 2, 3, 4 and 5 corresponding to
tamarin kidney, tamarin liver, rhesus monkey and human DNAs) demonstrating
the utility of this method for detecting low copy number (i.e. single copy)
sequences. Lanes 1 and 8 are H2O controls for dopamine D1 receptor and clone
16 primers (SEQUENCE I.D. NOS. 93 and 94), respectively. Lane 6 contains
100 fg of clone 16 (SEQUENCE I.D. NO. 26) plasmid DNA amplified with the
dopamine receptor primers. Lanes 14, 15, 16 and 20 contain 1, 3, 10, and
100 fg, respectively, of clone 16 (SEQUENCE I.D. NO. 26) plasmid DNA.
Lanes 7 and 19 are markers. Similar results were obtained using PCR primers
specific for clones 2, 4, 10, 18, 23 and 50 described above (data not shown).
Clones 2 (SEQUENCE I.D. NO. 22), 4 (SEQUENCE I.D. NO. 21), 10
(SEQUENCE I.D. NO. 23), 18 (SEQUENCE I.D. NO. 27), 23 (SEQUENCE
I.D. NO. 28) and 50 (SEQUENCE I.D. NO. 29) are inconclusive at this time.
However, clones 4 (SEQUENCE I.D. NO. 21), 10 (SEQUENCE I.D. NO. 23),
18 (SEQUENCE I.D. NO. 27) and 50 (SEQUENCE I.D. NO. 29) sequences
were not detected in tamarin, human, yeast and E. coli DNA (Rhesus monkey
was not tested), indicating that these sequences are exogenous to the genomic
DNA sources tested and supporting the viral origin of these sequences.

Example 7. Presence of HGBV sequences in tamarin sera
The presence of the HGBV clone sequences in pre-inoculation and acute
phase T-1053 plasma was examined by PCR. Because the HGBV genome could
be DNA or RNA, PCR and RT-PCR were performed. Specifically, total nucleic
acids were extracted from plasma as described in Example 3(A). PCR was
performed on the equivalent of 5 µl plasma nucleic acids as described in
Example 6(B) and RT-PCR was performed using the GeneAmp® RNA PCR Kit from
Perkin-Elmer-Cetus essentially according to the manufacturer's instructions
using 1 µM concentration of primers (for clone 2, SEQUENCE I.D. NOS. 87
and 88; for clone 4, SEQUENCE I.D. NOS. 89 and 90; for clone 10,
SEQUENCE I.D. NOS. 91 and 92; for clone 16, SEQUENCE I.D. NOS. 93
and 94; for clone 18, SEQUENCE I.D. NOS. 95 and 96; for clone 23,
SEQUENCE I.D. NOS. 97 and 98; and for clone 50, SEQUENCE I.D. NOS.
99 and 100) in the PCRs. cDNA synthesis was primed with random hexamers.

Ethidium bromide staining and hybridization of the PCR products
demonstrated the presence of HGBV clone sequences 2 (SEQUENCE I.D. NO.
22), 4 (SEQUENCE I.D. NO. 21), 10 (SEQUENCE I.D. NO. 23), 16
(SEQUENCE I.D. NO. 26), 18 (SEQUENCE I.D. NO. 27), 23 (SEQUENCE
I.D. NO. 28) and 50 (SEQUENCE I.D. NO. 29) in the acute phase T-1053
plasma and not the pre-inoculation T-1053 plasma (data not shown). In addition,
HGBV clones 2 (SEQUENCE I.D. NO. 22), 4 (SEQUENCE I.D. NO. 21), 10
(SEQUENCE I.D. NO. 23), 18 (SEQUENCE I.D. NO. 27), 23 (SEQUENCE
I.D. NO. 28) and 50 (SEQUENCE I.D. NO. 29) sequences could be detected in
H205, the HGBV inoculum that was injected into tamarin T-1053 (see Example
1B). These results are summarized in TABLE 8. It should be noted that the
HGBV clone sequences were only detected by RT-PCR in the acute phase
plasma. The fact that the HGBV clone sequences were detected in the acute
phase plasma by PCR only after a reverse transcription step to convert RNA to
cDNA, taken together with the limited homology of some of these clones with
HCV isolates, and the presence of the sequences coding for the conserved amino
acids found in the RNA-dependent RNA polymerase in HGBV clone 10
(SEQUENCE I.D. NO. 23; Example 5) suggest that HGBV is an RNA virus.

RT-PCR analysis of a panel of tamarin plasmas with HGBV clone 16
sequence (SEQUENCE I.D. NO. 26) was undertaken to confirm the presence of
HGBV clone 16 (SEQUENCE I.D. NO. 26) in other individuals who had been
experimentally infected with HGBV. Briefly, nucleic acids were isolated as
previously described (G. G. Schlauder et al., J. Virological Methods
37:189-200 [1992]) from 25 µl of plasma from tamarins obtained prior to and
after experimental infection with the H205 inoculum. Ethanol precipitated
nucleic acids were resuspended in 3 µl of DEPC-treated H2O. cDNA synthesis and
PCR were performed using the GeneAmp RNA PCR Kit from Perkin-Elmer-
Cetus essentially according to the manufacturer's instructions. cDNA synthesis
was primed with random hexamers. The resulting cDNA was subjected to PCR
using clone 16 primers (SEQUENCE I.D. NOS. 93 and 94) at a final
concentration of 0.5 µM. PCR was performed for 35 cycles (94°C, 1 min; 55°C,
1 min; 72°C, 1 min) followed by an extension cycle of 72°C for 7 min. The PCR
products were separated by agarose gel electrophoresis and visualized by UV
irradiation after direct staining of the nucleic acid with ethidium bromide
and/or hybridization to a radiolabelled probe after Southern blot transfer to a
nitrocellulose filter as described in Example 6B.

FIGURE 19 shows an ethidium bromide stained 1.5% agarose gel.
FIGURE 20 shows an autoradiogram from a Southern blot from the same gel
after hybridization to the radiolabeled probe from clone 16 (SEQUENCE I.D.
NO. 26). H2O and normal human serum are shown in lanes 1 and 2. Lanes 3,
19 and 20 are markers. Lanes 4, 8, 12, and 16 are from uninfected tamarin sera
while lanes 6, 10, 14 and 18 are from infected tamarin sera. These results show
that HGBV clone 16 sequence (SEQUENCE I.D. NO. 26) was detected in other
individuals infected with HGBV, in addition to tamarin T-1053, and not in
uninfected individuals. Acute phase sera from five H205-infected animals were
tested. Clone 16 sequences (SEQUENCE I.D. NO. 26) were detected in sera
from three of these animals [lane 10, T-1049, 14 days post-inoculation (dpi);
lane 14, T-1051, 28 dpi; lane 18, T-1055, 16 dpi]. The clone 16 sequence
(SEQUENCE I.D. NO. 26) was not detected in pre-inoculation sera from any of
the five animals (lane 4, T-1048; lane 8, T-1049; lane 12, T-1051; lane 16,
T-1055; T-1057 not shown). These results suggest that the clone 16 sequence
(SEQUENCE I.D. NO. 26) may be derived from the infectious HGBV agent.
The absence of clone 16 sequence (SEQUENCE I.D. NO. 26) in two of five
acute phase plasmas (lane 6, T-1048, 28 dpi; T-1057, 14 dpi, not shown) may
be explained by the relative low sensitivity of the clone 16 RT-PCR (estimated
to be able to detect approximately 21000 copies of clone 16 sequence (SEQUENCE
I.D. NO. 26)) coupled with the acute resolving nature of HGBV infection in
tamarins. Thus, the acute plasma from the two negative animals may contain a
titer of HGBV that is below the detection level of the RT-PCR assay employed.
The observation that these two animals were positive for clone 4 (SEQUENCE
I.D. NO. 21) by RT-PCR (Example 14) may reflect the presence of RNA
sequences of one virus (containing clone 4) and the absence of detectable RNA
sequences from a second virus (containing clone 16).

Example 8. Northern blot analysis of HGBV sequences in infected tamarin liver
Because the HGBV clone sequences were detectable by RT-PCR in the
acute phase tamarin plasma and the H205 inoculum, it was likely that these
sequences originate from the HGBV genome. Additional RT-PCR studies
demonstrated the presence of the HGBV sequences in liver RNA extracted from
the H205-infected tamarin, T-1053 (data not shown). Therefore, to determine the
size of the HGBV genome, Northern analysis of H205-infected and uninfected
tamarin liver RNA was performed. Total cellular RNA was extracted from 1.25 g
liver of H205-infected tamarin T-1053 and from 1.0 g of liver from a control
(i.e. uninfected) tamarin T-1040 using an RNA isolation kit (Stratagene, La
Jolla, CA) as directed by the manufacturer. Total RNA (30 µg) was
electrophoresed through a 1% agarose gel containing 0.6 M formaldehyde (R.M.
Fourney, et al., Focus 10:5-7 [1988]) and then transferred to Hybond-N nylon
membrane (Amersham) by capillary action in 20X SSC (pH 7.0) as previously
described. J. Sambrook, et al., Molecular Cloning - A Laboratory Manual. 2nd
Edition (1989). The RNA was UV-crosslinked to the nylon membrane which was then
baked in a vacuum oven at 80°C for 60 min. The blots were prehybridized at 60°C
for 2 hours in 25 ml of a solution containing 0.05 M PIPES, 50 mM sodium
phosphate, 100 mM NaCl, 1 mM EDTA, and 5% SDS. G.D. Virca, et al.,
Biotechniques 8:370-371 (1990). Prior to hybridization with the radiolabeled
DNA probe, the solution was removed and 10 ml of fresh solution was added. The
probes used for hybridization were clone 4 (SEQUENCE I.D. NO. 21; 221 bp) and
clone 50 (SEQUENCE I.D. NO. 29; 337 bp) and the 2000 bp cDNA encoding human
β-actin. P. Gunning, et al., Mol. and Cell. Biol. 3:787-795 (1983). The probes
(50 ng) were radiolabeled using a random primer labeling kit (Stratagene, La
Jolla, CA) in the presence of [α-32P]dATP as directed by the manufacturer. The
specific activity of each probe was approximately 10^9 cpm/µg. The blots were
hybridized at 60°C for 16 hours and washed as described (G.D. Virca, et al.,
supra) and then exposed to Kodak X-Omat-AR film at -80°C. Photographs of the
resulting autoradiographs are shown in FIGURE 21A. Lanes 1, 3, and 5 contain
liver RNA from T-1040 and lanes 2, 4, and 6 contain liver RNA from T-1053.
Lanes 1 and 2 were hybridized with the human β-actin cDNA probe; lanes 3 and 4
were hybridized with the clone 4 probe (SEQUENCE I.D. NO. 21); and lanes 5 and
6 were hybridized with the clone 50 probe (SEQUENCE I.D. NO. 29). Exposure
times were as follows: lanes 1 and 2, 5 hours at -80°C; lanes 3-6, 56 hours at
-80°C. The positions of the 28S and 18S ribosomal RNAs are indicated by the
arrows. The relative sizes of these ribosomal RNAs are 6333 and 2366
nucleotides, respectively. J. Sambrook, et al., supra.

Clone 4 (SEQUENCE I.D. NO. 21) and clone 50 probes (SEQUENCE
I.D. NO. 29) hybridized with an RNA species present in RNA extracted from the
liver of the infected tamarin (T-1053) (FIGURE 21A, lanes 4 and 6). The size of
this hybridizable RNA species was calculated at approximately 8300 nucleotides
based on its relative mobility with respect to 28S and 18S ribosomal RNAs. Both
probes appear to hybridize to the same RNA species. Neither probe hybridized
with RNA extracted from the liver of the uninfected tamarin (T-1040) (FIGURE
21A, lanes 3 and 5). These results suggest that the sequences of clones 4
(SEQUENCE I.D. NO. 21) and 50 (SEQUENCE I.D. NO. 29) are present within
the same 8.3 Kb transcript.
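
Sizing an RNA band from its mobility relative to the 28S and 18S ribosomal RNAs is commonly done by assuming that the logarithm of the size varies linearly with migration distance. The text does not state the calculation or the measured distances, so the sketch below is an assumption throughout: the function names are hypothetical and the migration distances in the usage example are invented; only the marker sizes (6333 and 2366 nucleotides) are taken from the text.

```python
import math


def fit_log_linear(marker_a, marker_b):
    """Fit log10(size) = slope * distance + intercept through two markers.

    Each marker is a (migration_distance, size_in_nucleotides) pair.
    """
    (dist_a, size_a), (dist_b, size_b) = marker_a, marker_b
    if dist_a == dist_b:
        raise ValueError("markers must have different migration distances")
    slope = (math.log10(size_b) - math.log10(size_a)) / (dist_b - dist_a)
    intercept = math.log10(size_a) - slope * dist_a
    return slope, intercept


def estimate_size(distance, slope, intercept):
    """Estimated size in nucleotides of a band at the given distance."""
    return 10 ** (slope * distance + intercept)


if __name__ == "__main__":
    # Hypothetical migration distances in mm; the sizes are those quoted
    # in the text for the 28S and 18S ribosomal RNAs.
    slope, intercept = fit_log_linear((30.0, 6333), (60.0, 2366))
    band_mm = 22.0  # hypothetical band position above the 28S marker
    print(f"estimated size: {estimate_size(band_mm, slope, intercept):.0f} nt")
```

A band larger than the 28S marker lies outside the calibrated range, so an estimate of this kind is an extrapolation and should be read as approximate, which is consistent with the "approximately 8300 nucleotides" wording above.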
In order to determine the strandedness of the HGBV RNA genome, strand-
specific radiolabeled DNA probes were prepared by asymmetric PCR using the
GeneAmp® PCR kit from Perkin-Elmer essentially according to the
manufacturer's instructions. Purified clone 50 DNA (SEQUENCE I.D. NO. 29)
was used as template in separate reactions containing either the clone 50
negative strand-specific primer (SEQUENCE I.D. NO. 99) or the clone 50 positive
strand-specific primer (SEQUENCE I.D. NO. 100) at 1 µM final concentrations.
The reaction mixture contained [α-32P]dATP (Amersham; 3000 Ci/mmol) in place of
the dATP normally included in the reaction mixture. Following 30 cycles of
linear amplification of the template, the unincorporated [α-32P]dATP was
removed by Quick-Spin® Sephadex G50 spin columns (Boehringer-Mannheim,
Indianapolis, IN) according to the manufacturer's instructions. Hybridization
of the radiolabeled probes to DNA dot blots containing ten-fold serial
dilutions of double-stranded clone 50 DNA (SEQUENCE I.D. NO. 29) demonstrated
that the two probes possessed nearly identical sensitivities (data not shown).
The radiolabeled probes were then hybridized to RNA blots containing 30 µg of
total liver RNA extracted from uninfected tamarin T-1040 and from infected
tamarin T-1053 as described above. Photographs of the resulting autoradiographs
are shown in FIGURE 21B. Lanes 1 and 3 contain liver RNA from T-1040 and lanes
2 and 4 contain liver RNA from T-1053. Lanes 1 and 2 were hybridized with the
clone 50 positive strand probe (i.e., the positive strand is radiolabeled and
will detect the negative strand; SEQUENCE I.D. NO. 100); lanes 3 and 4 were
hybridized with the clone 50 negative strand probe (i.e., the negative strand
is radiolabeled and will detect the positive strand; SEQUENCE I.D. NO. 99). The
blots were exposed for 18 hours at -80°C. The positions of the 28S and 18S
ribosomal RNAs are indicated by the arrows.

As shown in FIGURE 21B, the clone 50 positive and negative strand
probes (SEQUENCE I.D. NOS. 100 and 99, respectively) hybridized to an RNA
species of approximately 8.3 kilobases extracted from the liver of the infected
tamarin T-1053 (FIGURE 21B, lanes 2 and 4), but not to RNA extracted from the
liver of the uninfected tamarin T-1040 (FIGURE 21B, lanes 1 and 3). This is
consistent with the Northern blot results obtained with the clone 4 (SEQUENCE
I.D. NO. 21) and clone 50 (SEQUENCE I.D. NO. 29) double-stranded probes
shown above. The more intense signal obtained with the clone 50 negative strand
probe (SEQUENCE I.D. NO. 99) (FIGURE 21B, lane 4 vs. lane 2) suggests that
the predominant RNA species present in the liver of infected tamarins is the
positive (i.e. coding) strand.

Example 9. Extending the HGBV clone Sequence
A. Generation of HGBV sequences.
The clones obtained as described in Example 3 and sequenced as described
in Example 5 hereinabove appear to be derived from separate regions of the HGBV
genome. Therefore, to obtain sequences from additional regions of the HGBV
genome that reside between the previously identified clones, and to confirm the
sequence of the RDA clones, several PCR walking experiments were performed.
Total nucleic acids were extracted from 50 µl aliquots of infectious T-1053
plasma as described in Example 3(A). Briefly, precipitated nucleic acids were
resuspended in 10 µl DEPC-treated H2O. Standard RT-PCR was performed using
the GeneAmp® RNA PCR kit (Perkin Elmer) as directed by the manufacturer.
Briefly, PCR was performed on the cDNA products of random primed reverse
transcription reactions of the extracted nucleic acids with 2 mM MgCl2 and
1 µM primers. Reactions were subjected to 35 cycles of denaturation-annealing-
extension (94°C, 30 sec; 55°C, 30 sec; 72°C, 2 min) followed by a 3 min
extension at 72°C. The reactions were held at 4°C prior to agarose gel
analysis. These products were cloned into pT7 Blue T-vector plasmid (Novagen)
as described in the art. TABLE 9 presents the results obtained when these
reactions were performed.
TABLE 9

Reaction  Primer 1               Primer 2               Product Size
1.1       SEQ ID #88             comp. of SEQ ID #93    878 bp
1.2       comp. of SEQ ID #87    SEQ ID #97             1191 bp
1.3       SEQ ID #90             SEQ ID #101            864 bp
1.4       comp. of SEQ ID #99    comp. of SEQ ID #102   1.4 kb
1.5       SEQ ID #102            SEQ ID #91             672 bp
1.6       SEQ ID #98             SEQ ID #99             2328 bp
1.7       comp. of SEQ ID #103   SEQ ID #104            1300 bp
1.8       comp. of SEQ ID #105   SEQ ID #87             900 bp
1.9       SEQ ID #93             SEQ ID #99             2323 bp
1.10      SEQ ID #92             SEQ ID #91             1216 bp
1.11      SEQ ID #90             SEQ ID #92             1570 bp
1.12      comp. of SEQ ID #106   SEQ ID #103            550 bp
1.13      comp. of SEQ ID #107   SEQ ID #108            900 bp
1.14      SEQ ID #107            comp. of SEQ ID #96    1100 bp
1.15      comp. of SEQ ID #109   SEQ ID #110            410 bp
1.16      SEQ ID #111            comp. of SEQ ID #112   600 bp
1.17      comp. of SEQ ID #113   SEQ ID #114            1000 bp
1.18      SEQ ID #98             comp. of SEQ ID #115   720 bp
1.19      comp. of SEQ ID #116   comp. of SEQ ID #117   825 bp
1.20      SEQ ID #118            comp. of SEQ ID #119   700 bp
1.21      SEQ ID #120            SEQ ID #95             900 bp
1.22      SEQ ID #121            comp. of SEQ ID #122   950 bp
1.23      SEQ ID #123            SEQ ID #124            420 bp
1.24      SEQ ID #87             SEQ ID #88             130 bp
1.25      SEQ ID #55             SEQ ID #89             450 bp

A modification of a PCR walking technique described by Sorensen et al.
(J. Virol. 67:7118-7124 [1993]) was utilized to obtain additional HGBV
sequences. Briefly, total nucleic acids were extracted from infectious tamarin
T-1053 plasma and reverse transcribed. The resultant cDNAs were amplified in
50 µl PCR reactions (PCR 1) as described by Sorensen et al. (supra) except
that 2 mM MgCl2 was used. The reactions were subjected to 35 cycles of
denaturation-annealing-extension (94°C, 30 sec; 55°C, 30 sec; 72°C, 2 min)
followed by a 3 min extension at 72°C. Biotinylated products were isolated
using streptavidin-coated paramagnetic beads (Promega) as described by
Sorensen et al. (supra). Nested PCRs (PCR 2) were performed on the
streptavidin-purified products as described by Sorensen et al. for a total of
20 to 35 cycles of denaturation-annealing-extension as described above. The
resultant products and the PCR primers used to generate them are listed in
TABLE 10.
TABLE 10

Reaction  Primer set PCR 1            Primer set PCR 2                    Size of PCR product
2.1       SEQ ID #103 / SEQ ID #125   SEQ ID #668 / SEQ ID #126           500 bp
2.2       SEQ ID #114 / SEQ ID #125   SEQ ID #105 / SEQ ID #126           1000 bp
2.3       SEQ ID #92 / SEQ ID #125    SEQ ID #123 / SEQ ID #126           400 bp
2.4       SEQ ID #127 / SEQ ID #128   comp. of SEQ ID #88 / SEQ ID #126   420 bp
2.5       SEQ ID #108 / SEQ ID #128   SEQ ID #106 / SEQ ID #126           900 bp
2.6       SEQ ID #129 / SEQ ID #125   SEQ ID #98 / SEQ ID #126            750 bp
2.7       SEQ ID #116 / SEQ ID #128   SEQ ID #115 / SEQ ID #126           825 bp
2.8       SEQ ID #130 / SEQ ID #125   SEQ ID #107 / SEQ ID #126           630 bp
2.9       SEQ ID #110 / SEQ ID #135   SEQ ID #131 / SEQ ID #126           390 bp
2.10      SEQ ID #132 / SEQ ID #125   SEQ ID #109 / SEQ ID #126           1000 bp
2.11      SEQ ID #111 / SEQ ID #128   SEQ ID #133 / SEQ ID #126           600 bp
2.12      SEQ ID #134 / SEQ ID #135   SEQ ID #112 / SEQ ID #126           580 bp
2.13      SEQ ID #136 / SEQ ID #125   SEQ ID #137 / SEQ ID #126           400 bp
2.14      SEQ ID #138 / SEQ ID #128   SEQ ID #113 / SEQ ID #126           500 bp
2.15      SEQ ID #139 / SEQ ID #128   SEQ ID #140 / SEQ ID #126           900 bp
2.16      SEQ ID #121 / SEQ ID #135   SEQ ID #141 / SEQ ID #126           400 bp
2.17      SEQ ID #142 / SEQ ID #125   comp. of SEQ ID #102 / SEQ ID #126  1000 bp
2.18      SEQ ID #143 / SEQ ID #135   SEQ ID #144 / SEQ ID #126           550 bp
2.19      SEQ ID #87 / SEQ ID #125    SEQ ID #90 / SEQ ID #126            220 bp

These products were isolated from low melting point agarose gels and cloned into
pT7 Blue T-vector plasmid (Novagen) as described in the art.
RNA ligase-mediated 5' RACE (rapid amplification of cDNA ends) was
employed to obtain the 5' end sequences from viral genomic RNAs as described
hereinabove. Briefly, the 5' AmpliFINDER™ RACE kit (Clontech, Palo Alto,
CA) was used as directed by the manufacturer. The source of the viral RNA was
acute phase T-1053 plasma that was extracted as described above. The virus-
specific oligonucleotides utilized for the reverse transcription (RT), the first PCR
amplification (PCR 1) and the second PCR amplification (PCR 2) are listed in
TABLE 11. The ligated anchor primer and its complementary PCR primer were
provided by the manufacturer. PCRs were performed with the GeneAmp® PCR
kit (Perkin Elmer) as directed by the manufacturer.
TABLE 11
Reaction  RT primer     PCR 1 primer   PCR 2 primer   Size of PCR 2 product
3.1       SEQ ID #145   SEQ ID #146    SEQ ID #147    190 bp
3.2       SEQ ID #148   SEQ ID #149    SEQ ID #150    620 bp
The products generated by RNA ligase-mediated 5' RACE were isolated from low
melting point agarose gels and cloned into pT7 Blue T-vector plasmid (Novagen)
as described in the art.
To obtain additional sequence at the 5' and 3' ends of the HGBV-B
sequence (see below, Evidence for the existence of two HCV-like flaviviruses
in HGBV), an RNA circularization experiment was performed. (This method is
based on that described by C.W. Mandl et al. (1991) Biotechniques, Vol. 10 (4):
485-486.) Total nucleic acids were purified from 50 µl of T-1057 plasma (14 days
post H205 inoculation), except that 1 µg glycogen replaced the tRNA in the
precipitation. The nucleic acid pellet was dissolved in 16.3 µl of DEPC-treated
water, and 25 µl of 2X TAP buffer (1X = 50 mM NaOAc, pH 5.0, 1 mM EDTA,
10 mM 2-mercaptoethanol, 2 mM ATP) and 8.7 µl of tobacco acid pyrophosphatase
(20 Units; Sigma) were added. The mixture was incubated at 37°C for 60 min.
The sample was extracted with phenol (water-saturated) followed by chloroform
and then precipitated with NaOAc/EtOH in the presence of glycogen (1 µg). The
pellet was dissolved in 83 µl of DEPC water, and 10 µl of 10X RNA ligase buffer
(New England Biolabs, NEB), 2 µl of RNase inhibitor (Perkin Elmer), and 5 µl of
T4 RNA ligase (NEB) were then added. The mixture was incubated at 4°C for 16
hours. The sample was then extracted with phenol (water-saturated) and then
chloroform as before and then precipitated with NaOAc/EtOH.
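
The component volumes quoted for the pyrophosphatase and ligase reactions can be checked with simple arithmetic. The following Python sketch is illustrative only: the helper name and the dictionary layout are assumptions made here and are not part of the protocol. It sums the components of each reaction and computes the final strength of the concentrated buffer stock.

```python
# Volume check for the two reactions described above (all volumes in microliters).
# The helper name and dictionary layout are assumptions made for this sketch.

def mix_summary(components, buffer_name, stock_strength):
    """Return (total volume, final buffer strength in X) for a reaction mix."""
    total = sum(components.values())
    final_strength = stock_strength * components[buffer_name] / total
    return total, final_strength

# Tobacco acid pyrophosphatase (TAP) reaction
tap_mix = {
    "RNA in DEPC-treated water": 16.3,
    "2X TAP buffer": 25.0,
    "tobacco acid pyrophosphatase": 8.7,
}

# T4 RNA ligase (circularization) reaction
ligation_mix = {
    "RNA in DEPC water": 83.0,
    "10X RNA ligase buffer": 10.0,
    "RNase inhibitor": 2.0,
    "T4 RNA ligase": 5.0,
}

tap_total, tap_strength = mix_summary(tap_mix, "2X TAP buffer", 2.0)
lig_total, lig_strength = mix_summary(ligation_mix, "10X RNA ligase buffer", 10.0)

print(f"TAP reaction: {tap_total:.1f} ul total, buffer at {tap_strength:.2f}X")
print(f"Ligation:     {lig_total:.1f} ul total, buffer at {lig_strength:.2f}X")
```

Under these figures the pyrophosphatase reaction totals 50 µl and the ligation totals 100 µl, so both buffer stocks end up at 1X.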
One-tenth of the ligated RNA was used in the reverse transcriptase (RT)
reaction using Superscript RT (GIBCO/BRL) and SEQUENCE ID. NO. 146 as
the primer as directed by the manufacturer. One-half of the RT reaction mix was
used for PCR1 in the presence of a biotinylated oligonucleotide primer
(SEQUENCE ID. NO. 146) and a second oligonucleotide primer
(SEQUENCE ID. NO. 133) as described above. PCR1 products were purified
from the reaction mixture using streptavidin-magnetic beads as described by
Sorensen et al. Purified PCR1 products (2 µl out of 30 µl) were used as the
template for PCR2. PCR2 using oligonucleotide primers (SEQUENCE ID. NOS.
147 and 154) yielded a 1200 bp product that was cloned into pT7 Blue T-vector
plasmid and sequenced as described below. Sequence analysis of two independent
clones from this experiment demonstrated 100% identity in the region of overlap
with known sequence (although one clone possessed a sequence of 18 T residues
and the other a sequence of 27 T residues), and an additional 270 bases of new
sequence.
The above circularization experiment provided sequence from both the 5'-
and 3'-ends of the HGBV-B viral genome that was not obtained using standard 3'-
or 5'-RACE techniques. However, the exact 5'-3' junction is difficult to
determine even after additional PCR experiments are performed using primers
designed from the newly obtained sequence. Thus, in order to better characterize
the 5'-end of the HGBV-B RNA genome, a primer extension experiment was
performed using RNA isolated from the liver of T-1053.
Total cellular RNA was isolated from the liver of T-1053 and a control (i.e.
uninfected) animal (T-1040) as described in Example 7. An antisense
oligonucleotide (SEQUENCE I.D. NO. 155) was end-labeled with γ-32P-ATP
using T4 polynucleotide kinase (NEB) to a specific activity of approximately 9.39
x 10^7 CPM/µg as described (Sambrook et al.). The primer was annealed to 30 µg
of T-1053 and T-1040 liver RNA in separate reactions and then extended using
MMLV reverse transcriptase (Perkin-Elmer) as previously described (Sambrook et
al.). The products were analyzed on a 6% sequencing gel. A sequence ladder
generated from one of the HGBV-B circularization clones using the same primer as
that utilized for the primer extension served as a size standard.
Primer extension products of 176 bp were obtained from T-1053. These
products were not obtained when primer extension was performed using liver
RNA from an uninfected animal (T-1040) and therefore represent products derived
from the HGBV-B genome. The length of the products obtained indicates that the
5'-end of the genome, as present in the liver of infected animals, is located 442
nucleotides upstream of the initiator AUG codon.
To confirm the 3' location of the sequence obtained in the circularization
experiment, RT-PCRs were performed using primers designed to the predicted 3'
termini (see reaction 1.25, TABLE 2). RT-PCR of infectious T-1053 plasma
(as described above) using SEQUENCE ID. NOS. 156 and 157 yielded a product
of 450 bp. In contrast, RT-PCR using the complement of
SEQUENCE ID. NO. 157 and SEQUENCE ID. NO. 147 did not yield a detectable
PCR product (data not shown). These data suggest that the 3' end of the genome
is located 50 nucleotides downstream of the poly T tract.
The cloned products from TABLES 9, 10 and 11, and the RNA
circularization experiment were sequenced as previously described in Example 5.
Interestingly, the cloned products of reactions 1.4, 1.6, 1.9, 1.10 and 1.11 were
found to contain only one of the two primer sequences at the termini, suggesting
that these products were the result of false priming events. PCR/sequencing
experiments have linked sequences detected in products 1.4, 1.6, 1.9, 1.10 and
1.11 with clone 4 (SEQUENCE I.D. NO. 21) and/or clone 50 (SEQUENCE I.D.
NO. 29). In addition, sequences derived from each of these reactions contain
limited HCV identity. Thus, these products, although a result of false priming at
one end of the PCR product, appear to contain authentic HGBV sequence. The
product from reaction 1.14 also appeared to be a result of false priming. Here, the
complement of SEQUENCE I.D. NO. 160 is found at the 5' end of the product
from reaction 1.14 (GB-B, FIGURE 22). This was unexpected because
SEQUENCE I.D. NO. 160 was derived from SEQUENCE I.D. NO. 161 which
resides in GB-A. However, the sequence identity between products from
reactions 1.14 and 2.8, together with additional PCR/sequencing experiments
(data not shown), demonstrates that reaction 1.14 contains authentic HGBV
sequence. Apparently, the complement of SEQUENCE I.D. NO. 160 had enough
identity to GB-B sequences upstream of SEQUENCE I.D. NO. 162 to act as a
PCR primer.
The sequences obtained from the products described in TABLES 9, 10 and
11 hereinabove, and the RNA circularization experiment, were assembled into
contigs using the GCG Package (version 7) of programs. A schematic of the
assembled contigs is presented in FIGURE 22. GB contig A (GB-A) is 9493 bp
in length, all of which has been sequenced and is presented in SEQUENCE I.D.
NO. 163. GB-A includes clones 2 (SEQUENCE I.D. NO. 22), 16 (SEQUENCE
I.D. NO. 26), 23 (SEQUENCE I.D. NO. 28), 18 (SEQUENCE I.D. NO. 27), 11
(SEQUENCE I.D. NO. 24) and 10 (SEQUENCE I.D. NO. 23). SEQUENCE
I.D. NO. 163 was translated into three possible reading frames and is presented in
the Sequence Listing as SEQUENCE I.D. NOS. 164-392. GB contig B (GB-B)
is 9143 bp and is presented in SEQUENCE I.D. NO. 393. GB-B (SEQUENCE
I.D. NO. 393) includes clones 4 (SEQUENCE I.D. NO. 21), 50 (SEQUENCE
I.D. NO. 29), 119 (SEQUENCE I.D. NO. 30) and 13 (SEQUENCE I.D. NO.
25). SEQUENCE I.D. NO. 393 was translated into one open reading frame and is
presented in the Sequence Listing as SEQUENCE I.D. NOS. 396 and 397. The UTRs
from the 5' and the 3' ends can each be translated into six reading frames.
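
The three-frame translation mentioned above can be illustrated with a short Python sketch. It is a minimal example that assumes the standard genetic code; the function name and the toy input are hypothetical, and it does not reproduce the GCG programs actually used or the way the Sequence Listing entries were split.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate_frames(seq):
    """Translate a nucleotide sequence in the three forward reading frames.

    RNA input is accepted (U is read as T). Stop codons are shown as '*'
    and codons containing unrecognized bases as 'X'.
    """
    seq = seq.upper().replace("U", "T")
    frames = []
    for offset in range(3):
        codons = (seq[i:i + 3] for i in range(offset, len(seq) - 2, 3))
        frames.append("".join(CODON_TABLE.get(c, "X") for c in codons))
    return frames

# Toy input (hypothetical, not taken from the Sequence Listing)
for number, protein in enumerate(translate_frames("ATGGCCTAA"), start=1):
    print(f"frame {number}: {protein}")
```

Six-frame translation, as mentioned for the UTRs, would apply the same function to the reverse complement of the sequence as well.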

B. Evidence for the existence of two HCV-like viruses in HGBV
1. Evidence for GB-A and GB-B representing two distinct RNA species.
Comparison of GB-A (SEQUENCE I.D. NO. 163), GB-B (SEQUENCE
I.D. NO. 393) and HCV-1 (GenBank accession # M67463) demonstrates that GB-
A (SEQUENCE I.D. NO. 163), GB-B (SEQUENCE I.D. NO. 393) and HCV-1
are all distinct sequences. Dot plot analyses of the nucleic acid sequences of GB-
A (SEQUENCE I.D. NO. 163), GB-B (SEQUENCE I.D. NO. 393) and HCV-1
were performed using the GCG Package (version 7). Using a window size of 21
and a stringency of 14, GB-A (SEQUENCE I.D. NO. 163), GB-B (SEQUENCE
I.D. NO. 393) and HCV-1 were found to clearly contain different nucleotide
sequences (FIGURE 23). Therefore, GB-A (SEQUENCE I.D. NO. 163) and
GB-B (SEQUENCE I.D. NO. 393) do not represent different strains or genotypes
of HCV or of each other. Short regions of limited nucleotide identity are found in
the putative NS3-like and NS5b-like sequences of GB-A (SEQ. ID. NO. 163) and
GB-B (SEQ. ID. NO. 393) and the NS3 and NS5b sequences of HCV by this
analysis. However, nucleotide identity in these regions is not surprising because
NS3 and NS5b code for the putative NTP-binding helicase and the RNA-
dependent RNA polymerase, respectively, which are conserved in all flaviviruses
(see below). That GB-A (SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE
I.D. NO. 393) represent separate RNA molecules and not different regions of the
same RNA molecule is evidenced by the 5' RACE experiments (above) and
supported by the Northern blot data (as described in Example 8). First, the 5'
RACE experiments show distinct 5' ends for GB-A (SEQUENCE I.D. NO. 163)
and GB-B (SEQUENCE I.D. NO. 393). Because RNA molecules can contain
only one 5' end, GB-A (SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE
I.D. NO. 393) represent separate RNA molecules. Second, the 8300 base RNA
molecule detected in infected tamarin liver RNA by probing Northern blots with
clones 4 and 50 (SEQUENCE I.D. NOS. 21 and 29, respectively, both from GB-
B [SEQUENCE I.D. NO. 393]; see Example 8) corresponds closely to the size of
GB-B (SEQUENCE I.D. NO. 393, 9143 bp). If GB-A and GB-B were part of
the same RNA molecule, one would expect a Northern blot product of at least
17,000 bases. These data demonstrate that GB-A (SEQUENCE I.D. NO. 163)
and GB-B (SEQUENCE I.D. NO. 393) represent the nucleotide sequences of two
distinct RNA molecules that are not variants of HCV or each other.
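
The window and stringency parameters used in the dot plot analysis can be illustrated with a small Python sketch. This shows the general idea only, assuming that a dot is recorded wherever at least `stringency` of the `window` aligned positions match; it is not the GCG implementation, and the function name and toy sequences are hypothetical.

```python
def dot_plot(seq_a, seq_b, window=21, stringency=14):
    """Return (i, j) window start positions that would be plotted as dots.

    A dot is recorded when at least `stringency` of the `window` aligned
    positions match between seq_a[i:i + window] and seq_b[j:j + window].
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in range(len(seq_b) - window + 1):
            matches = sum(
                a == b
                for a, b in zip(seq_a[i:i + window], seq_b[j:j + window])
            )
            if matches >= stringency:
                hits.append((i, j))
    return hits

# Toy sequences (hypothetical, far shorter than the ~9 kb contigs)
toy = "ACGTTGCAAGGCTTAACCGGTATCGGATCCA"
self_hits = dot_plot(toy, toy)
unrelated_hits = dot_plot("A" * 30, "C" * 30)

print("self comparison includes the diagonal:", (0, 0) in self_hits)
print("dots for unrelated sequences:", len(unrelated_hits))
```

Related sequences give runs of dots along a diagonal, whereas unrelated sequences give few or none. This brute-force loop is far too slow for full-length genomes and is meant only to make the two parameters concrete.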
Northern blot analysis and PCR studies of T-1053 provided evidence that
the two RNA species corresponding to GB-A (SEQUENCE I.D. NO. 163) and
GB-B (SEQUENCE I.D. NO. 393) were not at equivalent levels in the liver. As
stated above, clones 4 and 50 (SEQUENCE I.D. NOS. 21 and 29, respectively),
both from GB-B (SEQUENCE I.D. NO. 393), hybridized to an 8.3 kb RNA
species present in infected liver of T-1053 (as described in Example 8). In
contrast, clones 2 (SEQUENCE I.D. NO. 22), 10 (SEQUENCE I.D. NO. 23), 16
(SEQUENCE I.D. NO. 26) and 23 (SEQUENCE I.D. NO. 28), all from GB-A
(SEQUENCE ID. NO. 163), showed no hybridization with T-1053 liver RNA in
identical experiments (data not shown). In addition, clone 16 PCR generated
much less product than clone 4 PCR on cDNAs generated from T-1053 liver RNA,
as judged by ethidium staining, despite equivalent sensitivities of clone 4 and
clone 16 PCRs demonstrated using plasmid templates (data not shown). This is in
contrast to what is found in T-1053 plasma at the time of sacrifice. PCR titration
experiments for clone 4 (GB-B-specific, SEQUENCE I.D. NO. 393) and clone 16 (GB-A-
specific, SEQUENCE I.D. NO. 163) PCR on cDNAs generated from T-1053
plasma RNA suggest that equivalent amounts of GB-A (SEQUENCE I.D. NO.
163) RNA and GB-B (SEQUENCE I.D. NO. 393) RNA are present in T-1053
plasma (Example 4, E.2). Thus, although GB-A (SEQUENCE I.D. NO. 163)
RNA and GB-B (SEQUENCE I.D. NO. 393) RNA were at equivalent levels in T-
1053 plasma, there appeared to be a greater amount of GB-B (SEQUENCE I.D.
NO. 393) RNA relative to GB-A (SEQUENCE I.D. NO. 163) RNA present in T-
1053 liver at the time of sacrifice. Together, these results provide further evidence
for the existence of two different RNA molecules corresponding to GB-A
(SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE I.D. NO. 393) in T-1053
plasma and suggest that these RNAs are not necessarily present at equivalent levels
in infected liver RNA. Therefore, it is unlikely that GB-A (SEQUENCE I.D. NO.
163) and GB-B (SEQUENCE I.D. NO. 393) make up individual segments of a
single viral genome.
2. Evidence that GB-A (SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE
I.D. NO. 393) represent the genomes of two distinct viruses.
Infectivity and PCR studies provide evidence for the viral nature of GB-A
(SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE I.D. NO. 393). Specifically,
tamarins T-1049 and T-1051, which were inoculated with T-1053 plasma that had
been filtered (0.1 µm) and diluted to 10^-4, or unfiltered and diluted to 10^-5,
respectively, were positive for both clone 4 (GB-B [SEQUENCE I.D. NO. 393])
and clone 16 (GB-A [SEQUENCE I.D. NO. 163]) sequences. Prior to
inoculation, both of these animals were negative for clones 4 and 16 (Examples 4,
E.4 and 4, E.5). Therefore, the two RNA species present in the acute phase T-
1053 plasma corresponding to GB-A and GB-B can be filtered, diluted and
passaged to other animals, consistent with the proposed viral nature of GB-A
(SEQUENCE I.D. NO. 163) and GB-B (SEQUENCE I.D. NO. 393). That GB-
A and GB-B represent RNA molecules from separate viral particles is evidenced
by PCR studies of the H205-inoculated tamarins. Specifically, four of four
tamarins became positive for clone 4 (GB-B [SEQUENCE I.D. NO. 393]) by RT-
PCR after H205 inoculation. In contrast, only one of four H205-inoculated tamarins
(T-1053) became positive for clone 16 (GB-A [SEQUENCE I.D. NO. 163]) by
RT-PCR (Example 4.E.2). Therefore, assuming that GB-A (SEQUENCE I.D.
NO. 163) sequences were truly absent from T-1048, T-1057 and T-1061, and that
the negative clone 16 PCR results were not due to poor sensitivity, it would appear
that the virus corresponding to GB-B (SEQUENCE I.D. NO. 393) sequences (i.e.
hepatitis GB virus B [HGBV-B]) can be passaged independent of GB-A
(SEQUENCE I.D. NO. 163) sequences. An HGBV-B only sample from T-1057
has been passaged two additional times (Example 4). GB-A (SEQUENCE I.D.
NO. 163) sequences have not been detected in these animals by RT-PCR. In
addition, significant liver enzyme elevations have been noted in these animals
(Example 4), demonstrating that HGBV-B alone caused hepatitis in tamarins. GB-
A (SEQUENCE I.D. NO. 163) sequences have been identified in tamarins lacking
detectable GB-B (SEQUENCE I.D. NO. 393) sequences. Specifically, GB-B
only animals (T-1048, T-1057 and T-1061) challenged with T-1053 plasma
developed GB-A (SEQUENCE I.D. NO. 163) only viremias as detected by clone
16 specific RT-PCR. The GB-A only plasma from T-1057 has been passaged one
additional time (Example 4). Thus, it appears that a virus corresponding to GB-A
(SEQUENCE I.D. NO. 163) sequences (hepatitis GB virus A [HGBV-A]) can
replicate independent of HGBV-B. Additional passages of HGBV-A in the
absence of HGBV-B are ongoing. At this time it is not known whether HGBV-A
causes hepatitis in tamarins. However, the lack of elevated liver enzymes noted in
the T-1053 challenged tamarins with HGBV-A viremias and in the passage of the
HGBV-A only serum from T-1057 argues against the hepatotropic nature of
HGBV-A in tamarins.

The presence of two viruses in acute phase T-1053 plasma can be traced
back to the H205 inoculum. Specifically, data from Example 7 showed that clone
16 (SEQUENCE I.D. NO. 26, found in GB-A [SEQUENCE I.D. NO. 163]) was
absent in the preinoculation plasma from all 7 tamarins tested. In addition, clones
2, 10, 18 and 23 (SEQUENCE I.D. NOS. 22, 23, 27 and 28, respectively, all
from GB-A [SEQUENCE I.D. NO. 163]) have not been detected in any pre-
HGBV-inoculated tamarin plasma tested (Example 7). Similar negative results
were found when preinoculation tamarin plasma were tested for clones 4 and 50
(SEQUENCE I.D. NOS. 21 and 29, respectively, both from GB-B [SEQUENCE
I.D. NO. 393]). Thus, both HGBV-A and HGBV-B were absent in the
preinoculation tamarin plasma. In contrast, all of these clones (i.e. clones 2, 10,
16, 18 and 23 from GB-A [SEQUENCE I.D. NO. 163], and clones 4 and 50 from
GB-B [SEQUENCE I.D. NO. 393]) were detected in the H205 inoculum (TABLE
7). Interestingly, as found in cDNA made from T-1053 liver (above), several
different PCR targets in GB-A (SEQUENCE I.D. NO. 163) all generated less
product than similar PCR targets in GB-B (SEQUENCE I.D. NO. 393) using the
same random primed cDNAs from H205 (data not shown). Thus, we conclude
that HGBV-A and HGBV-B are present in the original GB inoculum, H205.
However, HGBV-B appears to be more abundant than HGBV-A in H205. The
low relative amount of HGBV-A in the H205 inoculum may explain why only one
of four tamarins was positive for HGBV-A after H205 inoculation (Example
4.E.2).
3. Evidence that HGBV-A and HGBV-B are members of the Flaviviridae.
Searches of the SWISS-PROT database with the three frame translation
products of GB-A (SEQUENCE I.D. NOS. 165-268, 270-384, 386-392) and GB-
B (SEQUENCE I.D. NO. 397) as described in Example 5 show limited, but
significant, amino acid sequence identity with various strains of HCV. Translation
products from GB-A (SEQUENCE I.D. NO. 164) and GB-B (SEQUENCE I.D.
NO. 393) show the closest homology to regions of the nonstructural proteins of
various HCV isolates (i.e. NS2, NS3, NS4 and NS5). For example, as shown in
FIGURE 24, the conserved residues (indicated by *) in the putative NTP-binding
helicase domain of flaviviruses (FIGURE 24A) and in the RNA-dependent RNA
polymerase domain of all viral RNA-dependent RNA polymerases (FIGURE 24B)
are held in common between HCV-1 NS3 and NS5b (SWISS-PROT accession
number P26664), respectively, and the predicted translation products of GB-A
(SEQUENCE I.D. NO. 390) and GB-B (SEQUENCE I.D. NO. 397). (See
Choo et al., PNAS 88:2451-2455 [1991] and Domier et al., Virology 158:20-27
[1987]). Therefore, it appears that both GB-A virus and GB-B virus encode
functional NTP-binding helicases and RNA-dependent RNA polymerases.
However, GB-A (SEQUENCE I.D. NO. 390) and GB-B (SEQUENCE I.D. NO.
397) do not share complete amino acid identity to each other and/or to HCV in
other regions of HCV NS3 and NS5b. Specifically, over the 200 residue region
of NS3 shown in FIGURE 24A, GB-A (SEQUENCE I.D. NO. 390, residues
1252-1449) virus and HCV-1 (SEQ. ID. NO. 398), GB-B (SEQUENCE I.D. NO.
397, residues 1212-1408) virus and HCV-1 (SEQUENCE I.D. NO. 398), and
GB-A (SEQUENCE I.D. NO. 390, residues 1252-1449) virus and GB-B
(SEQUENCE I.D. NO. 397, residues 1212-1408) virus are 47%, 55% and 43.5%
identical, respectively. In addition, over the 100 residue region of NS5b shown in
FIGURE 24B, GB-A (SEQUENCE I.D. NO. 390, residues 2644-2739) virus and
HCV-1 (SEQUENCE I.D. NO. 398), GB-B (SEQUENCE I.D. NO. 397,
residues 2599-2698) virus and HCV-1 (SEQUENCE I.D. NO. 398), and GB-A
(SEQUENCE I.D. NO. 390, residues 2644-2739) virus and GB-B (SEQUENCE
I.D. NO. 397, residues 2599-2698) virus are 36%, 41% and 44% identical,
respectively. Lower levels of homology are found in other putative nonstructural
genes of GB-A (SEQUENCE I.D. NO. 390) and GB-B (SEQUENCE I.D. NO.
397) when compared to HCV. The overall level of homology of the putative
nonstructural proteins of GB-A virus and GB-B virus compared with HCV
sequences present in GenBank suggests that GB-A (SEQUENCE I.D. NO.
164) and GB-B (SEQUENCE I.D. NO. 393) are derived from two separate
members of the Flaviviridae. Flaviviruses contain a single genomic RNA
molecule which codes for one NTP-binding helicase domain and one RNA-
dependent RNA polymerase domain. The presence of two contigs, each
containing a putative RNA helicase domain and a putative RNA-dependent RNA
polymerase, is consistent with the presence of two HCV-like flaviviruses in the
acute phase T-1053 plasma.
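
Pairwise percent identity figures such as those quoted above can be computed from an alignment as sketched below in Python. The text does not state which denominator was used, so this sketch assumes one common convention: identical residues divided by the number of aligned positions in which neither sequence has a gap. The function name and the toy peptides are hypothetical and are not taken from FIGURE 24.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Positions where either sequence has a gap are ignored (an assumed
    convention; other definitions divide by the full alignment length).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * identical / compared

# Toy aligned peptides (hypothetical)
identity = percent_identity("GSG-KST", "GAGAKTT")
print(f"{identity:.1f}% identical")
```

In this toy case four of the six ungapped positions match. Applied to the aligned regions of FIGURE 24, the same calculation would give one value for each of the three pairwise comparisons.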

Example 10. PCR
In order to determine the sequence relatedness of HGBV to hepatitis C
virus, the following PCR-based experiment was performed. PCR primers based
on the 5'-untranslated region (UTR) sequence of the HCV genome (J.H. Han,
PNAS 88:1711-1715 [1991]), which are highly conserved in HCV isolates from a
variety of geographic origins (Cha, T.-A., et al., J. Clin. Microbiol. 29:2528-
2534 [1991]), were utilized in attempts to detect similar sequences in H205-infected
tamarin T-1053 liver RNA. Total cellular RNA was extracted from the liver of
infected tamarin T-1053 and from the liver of an uninfected tamarin (T-1040) as
described in Example 8A. Thirty micrograms of each RNA sample were reverse
transcribed and PCR amplified using a kit available from Perkin-Elmer essentially
as described in the manufacturer's instructions. An antisense primer (primer 1)
was used for the reverse transcriptase reaction and comprised bases 249-268 of the
HCV 5'-UTR. Primer 1 and a primer comprising bases 13-46 of the HCV 5'-
UTR (primer 2) were then used for PCR amplification of the intervening
sequence. The conditions used for thermocycling were essentially as described by
Cha et al., supra.
In order to increase the sensitivity of this assay for the detection of HCV
5'-UTR sequences in H205-infected tamarin T-1053, the above PCR reaction was
subjected to a second amplification reaction which utilized "nested" PCR primers.
These primers are derived from sequences found internal to the sequences of
primers 1 and 2 above in the HCV 5'-UTR: primer 3 comprised bases
47-69 and primer 4, an antisense primer, comprised bases 188-210 of the HCV
5'-UTR. In this "nested" PCR reaction, PCR products (2 µl out of a total of 100
µl reaction volume) from the first PCR reaction were used as the source of DNA
template. The thermocycling parameters were essentially the same as described
above except that the annealing temperature was 55°C instead of 60°C. The
resulting PCR products from the second PCR reaction were then analyzed for the
expected DNA products by agarose gel electrophoresis and ethidium bromide
staining. The expected DNA fragment sizes, based on the sequence of the HCV
5'-UTR (Han et al., supra), are 253 bp for the product of the first PCR reaction and
163 bp for the product of the nested PCR reaction. PCR products of the anticipated
size were obtained in control experiments performed using 30 µg of total cellular
RNA extracted from the liver of an HCV-infected chimpanzee as described in
Example 8A (data not shown), thus demonstrating that this experimental procedure
was able to detect the 5'-UTR of HCV. However, neither of the expected products
was observed on the resulting ethidium bromide stained agarose gel when either
T-1053 liver RNA or T-1040 liver RNA was used (data not shown). This
inability to produce the predicted result may suggest that (i) the sequence of the 5'-
UTR of the agent differs significantly from that of HCV such that the
oligonucleotide primers used would not be able to anneal efficiently, thereby
preventing PCR amplification, or (ii) the agent lacks a 5'-UTR.
In either case it appears from these results that the nucleotide sequence of the agent
is significantly different from that of HCV.
In addition, nucleic acids were isolated as in Example 7 from a chimpanzee
plasma pool obtained during the acute phase of an experimental infection of HCV
(G. Schlauder et al., J. Clin. Microbiology 29:2175-2179 [1991]). RT-PCR was
performed as described in Example 7 using clone 16 primers (SEQUENCE I.D.
NOS. 93 and 94). No bands of the expected size for these primers were detected
by ethidium bromide staining or after hybridization to a clone 16 specific probe
(data not shown). These results support the unrelatedness of clone 16 sequence
(SEQUENCE I.D. NO. 26) to HCV.

Example 11. Reactivity of HGBV Infected Serum to Other Hepatitis Viruses
Serum specimens were obtained prior to, and after, inoculation with
HGBV using either the H205 inoculum (T-1048, T-1057, T-1061) or the T-1053
inoculum (T-1051) and tested for antibodies frequently detected following
exposure to known hepatitis viruses. Specimens were tested for antibodies to
hepatitis A virus (using the HAVAB assay, available from Abbott Laboratories,
Abbott Park, IL), the core protein of hepatitis B virus (using the Corzyme® test,
available from Abbott Laboratories, Abbott Park, IL), hepatitis E virus (HEV)
(using the HEV EIA, available from Abbott Laboratories, Abbott Park, IL) and
hepatitis C virus (HCV) (utilizing the HCV second generation test, available from
Abbott Laboratories, Abbott Park, IL). These tests were performed according to
the manufacturer's package inserts.
None of the tamarins tested positive for antibodies to HCV or to HEV
either prior to or after HGBV inoculation (see TABLE 12). Therefore, HGBV
infection does not elicit detectable antisera against HCV or HEV.
One of the tamarins (T-1061) was positive for antibodies to HAV prior to
and after inoculation with HGBV, suggesting a previous exposure to HAV
(TABLE 9, T-1061). However, the three remaining tamarins (T-1048, T-1057
and T-1051) show no HAV-specific antibodies after HGBV inoculation.
Therefore, HGBV infection does not elicit an anti-HAV response. One of the
tamarins (T-1048) was negative for antibodies to HBV core both prior to and after
inoculation with HGBV. Two of the tamarins (T-1061 and T-1057) were positive
prior to inoculation with HGBV. One of the tamarins (T-1051) was borderline
positive for antibodies to HBV prior to inoculation, but was negative after
inoculation. Based on these data, there is no evidence that infection with the
HGBV agent induces an immune response to HBV core. Taken together, these
data support that the HGBV agent is a unique viral agent, and is not related to any
of the viral agents commonly associated with hepatitis in man.

Example 12. Western Blot Analysis of HGBV Infected Liver.
As noted in Examples 1 and 2 above, elevated liver enzyme values are
noted in tamarins inoculated with HGBV. If HGBV is indeed a hepatotropic
virus, it would be expected that viral protein(s) would be produced in infected
liver cells, and that an immune response to those proteins would be generated. In
this example, evidence is presented which suggests that a unique protein appears in
livers obtained from HGBV-infected tamarins; this protein appears to be
specifically recognized via Western blot utilizing tamarin serum obtained in the
convalescent stage following infection with HGBV.
HGBV-infected tamarin livers and various control tamarin and
chimpanzee livers were diced and homogenized in PBS (approximately 1 g liver to
5 ml) using an Omni-mixer homogenizer. The resulting suspension was clarified by
centrifugation (10,000 x g, 1 hour, 4°C) and by micro-filtration through 5 µm, 0.8
µm and 0.45 µm filters. The clarified homogenate was centrifuged under
conditions pelleting all components of 100S or greater. Pellets (100S liver
fractions) were taken up in a small volume of buffer and stored at -70°C.
SDS polyacrylamide gel electrophoresis (PAGE) was carried out using
standard methods and reagents (Laemmli discontinuous gels). 100S liver fractions
were diluted 1:20 in a sample buffer containing SDS and 2-mercaptoethanol and
heated at 95°C for 5 minutes. The proteins were electrophoresed through either
12% acrylamide or 4-15% acrylamide linear gradient gels, 7 cm x 8 cm, at 200 volts
for 30 to 45 minutes. Proteins were electro-transferred to nitrocellulose membranes
using standard methods and reagents.
Western blots were developed using standard methods. Briefly, the
nitrocellulose membrane was rinsed in TBS/Tween and blocked overnight
in TBS/CS (100 mM Tris, 150 mM NaCl, 10 mM EDTA, 0.18% Tween-20,
4.0% calf serum, pH 8.0) at 4°C. The nitrocellulose was placed in the Multi-screen
apparatus, and 600 µl of sera was placed in the channels, followed by a 2
hour room temperature incubation and an overnight 4°C incubation. After removal
from the Multi-screen apparatus, the membrane was washed 3 times, 5 minutes
each, in 15 ml TBS/Tween (50 mM Tris, 150 mM NaCl, 0.05% Tween-20, pH
8.0). The membrane was incubated for 1 hour at room temperature in 15 ml goat
anti-human:HRPO conjugate (0.2 µg/ml TBS/CS). After washing as before, the
membrane was incubated in the TMB enzyme substrate solution, rinsed in water
and dried.
Proteins isolated from T-1053 liver at sacrifice (12 days post-GB
inoculation) and blotted as described above showed a unique immunogenic protein
with an apparent molecular weight of approximately 50 to 80 kDa when reacted
with T-1057 sera from 5, 6, 7, 9 or 11 weeks post-GB inoculation. The band was
not present when reacted with T-1057 sera pre-inoculation or 3 weeks post-GB
inoculation. This band did not appear in the lanes containing liver proteins
obtained from an uninoculated tamarin (T-1040) when reacted with any of these
T-1057 sera. In addition, a protein of the same size (50 to 80 kDa) was visible when
the T-1053 liver proteins were reacted with other post-GB inoculation sera (T-
1048 at 11 weeks post-GB inoculation and T-1051 at 8 weeks post-GB
inoculation) but not when they were reacted with pre-inoculation sera from these
same animals.
An additional Western blot experiment was performed to determine if this
immunoreactive band would be detected in liver tissues from other GB-inoculated
tamarins, or in liver tissues of chimpanzees infected either with HCV or HBV. In
each case, the nitrocellulose strips containing the liver proteins were reacted with a
pool of sera from T-1048 (5, 8, and 16 weeks post-GB inoculation) and T-1051 (8
and 12 weeks post-GB inoculation). All 5 sera in the pool were mixed in equal
proportion. A reactive protein band of 50-80 kDa was seen with all of the tamarin
liver samples obtained from GB-inoculated tamarins (T-1038, T-1049, and T-
1055 obtained at 14 days post-GB inoculation and T-1053 obtained at 12 days
post-GB inoculation). This immunoreactive band was not detected in the liver
preparations obtained from T-1040 (uninoculated) nor in any of the chimp liver
preparations (CHAS-457 (pre-HCV inoculation), CHAS-457 (HCV+), CRAIG-
454 (HCV+) and MUNA-376 (HBV+)).
Taken together, these data demonstrate the existence of an immunogenic
and antigenic protein with an apparent molecular weight of approximately 50 to 80
kDa specifically associated with HGBV-infected tamarin liver. The nature of this
HGBV-associated protein (i.e., whether it is virally encoded or of host origin) is
currently under investigation. Regardless of the source of the HGBV-associated
protein, these results are consistent with HGBV infection inducing an antibody
response to an antigen which is present in HGBV-infected tamarin liver.

Example 13. CKS-based expression and detection of immunogenic
HGBV-A and HGBV-B polypeptides
A. Cloning of HGBV-A and HGBV-B sequences
The cloning vectors pJO200, pJO201, and pJO202 allow the fusion of
recombinant proteins to the CMP-KDO synthetase (CKS) protein. Each of these
plasmids consists of the plasmid pBR322 with a modified lac promoter fused to a
kdsB gene fragment (encoding the first 239 of the entire 248 amino acids of the E.
coli CKS protein), and a synthetic linker fused to the end of the kdsB gene
fragment. The synthetic linkers include multiple restriction sites for insertion of
genes, translational stop signals, and the trpA rho-independent transcriptional
terminator. The unique restriction sites in this linker region include, from 5' to 3',
EcoRI, SacI, KpnI, SmaI, BamHI, XbaI, PstI, SphI, and HindIII. Each plasmid
allows for insertion in a different reading frame within the multiple cloning site.
The CKS method of protein synthesis as well as CKS vectors are disclosed in
U.S. Patent No. 5,124,255, which enjoys common ownership and is incorporated
herein by reference, and the use of CKS fusion proteins in assay formats and test
kits is described in United States Serial No. 07/903,043, which enjoys common
ownership and is incorporated herein by reference.
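As a minimal illustrative sketch of why three vectors are provided, the Python below selects the vector that keeps an insert in frame with the CKS coding sequence. The linker offsets assigned to pJO200, pJO201 and pJO202 are hypothetical placeholders (this description states only that the three plasmids differ in reading frame), and the function name is likewise an assumption made for illustration.

```python
def in_frame_vectors(insert_lead_bases, vector_offsets):
    """Return the vectors that place the insert's first codon in the CKS frame.

    insert_lead_bases: number of bases between the restriction cut and the
        first complete codon of the insert.
    vector_offsets: map of vector name -> number of linker bases between the
        last complete kdsB codon and the restriction cut (hypothetical values).
    """
    return [
        name
        for name, offset in vector_offsets.items()
        if (offset + insert_lead_bases) % 3 == 0
    ]


# Hypothetical offsets, NOT taken from the source: they only encode the idea
# that the three vectors shift the cloning site by 0, 1 and 2 bases.
HYPOTHETICAL_OFFSETS = {"pJO200": 0, "pJO201": 1, "pJO202": 2}

for lead in range(3):
    print(lead, in_frame_vectors(lead, HYPOTHETICAL_OFFSETS))
```

Under these assumptions exactly one of the three vectors is in frame for any given insert, which is why each fragment is ligated into all three and the productive construct is identified afterwards.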
The HGBV-A and HGBV-B sequences obtained from the walking
experiments described in TABLES 9 and 10 (Example 9) were liberated from the
appropriate pT7Blue T-vector clones using restriction enzymes listed in TABLES
13 and 14 (10 units, NEB), and purified from 1% low melting point agarose gels
as described in Example 3B. Plasmids pJO200, pJO201, and pJO202 were
digested with the same restriction enzymes (10 units, NEB) and dephosphorylated
with bacterial alkaline phosphatase (GIBCO BRL, Grand Island, NY). Each
purified HGBV fragment was ligated into the digested, dephosphorylated pJO200,
pJO201, and pJO202 and transformed into E. coli XL1 Blue as described in
Example 3B. Standard miniprep analyses confirmed the successful construction of
the CKS/HGBV expression vectors.
Two additional PCR products were generated specifically for expression.
The 2 products, designated 4.1 and 4.2, were predicted to encode the HGBV-B
and HGBV-A core regions, respectively (see FIGURE 22). PCR product 4.1 was
generated using primers coreB-s and coreB-a1 (SEQUENCE I.D. NOS. 708 and
709) and PCR product 4.2 was generated using primers coreA-s and 2.2.1'
(SEQUENCE I.D. NOS. 710 and 138), as described in Example 9. The 4.1 sense
and antisense primers had EcoRI and BamHI restriction sites, respectively,
designed into the ends. The 4.1 PCR product was digested, gel isolated, and
ligated to pJO200, pJO201, and pJO202 as described above. The sense primer for
the 4.2 PCR product had an EcoRI restriction site designed into the end, but the
antisense primer did not have a restriction site. Thus, the product was cut with
EcoRI, gel isolated, and ligated to pJO200, pJO201, and pJO202 which had been
digested with BamHI, end-filled with the Klenow fragment of DNA polymerase
and dNTPs, digested with EcoRI, and dephosphorylated with bacterial alkaline
phosphatase as described in the art.
B. Expression of HGBV-A and HGBV-B sequences.
E. coli XL1 Blue cultures containing the CKS/HGBV expression vectors
were grown at 37°C with shaking in media containing 32 gm/L tryptone, 20 gm/L
yeast extract, 5 gm/L NaCl, pH 7.4, plus 100 mg/L ampicillin and 3 mM glucose.
When the cultures reached an OD600 of between 1.0 and 2.0, IPTG was added to
a final concentration of 1 mM to induce expression from the modified lac promoter.
Cultures were allowed to grow at 37°C with shaking for an additional 3 hours, and
were then harvested. The cell pellets were resuspended to an OD600 of 10 in
SDS/PAGE loading buffer (62.5 mM Tris pH 6.8, 2% SDS, 10% glycerol, 5%
2-mercaptoethanol, and 0.1 mg/ml bromophenol blue), and boiled for 5 minutes.
Aliquots of the prepared whole cell lysates were run on a 10% SDS-
polyacrylamide gel, stained in a solution of 0.2% Coomassie blue dye in 40%
methanol/10% acetic acid and destained in 16.5% methanol/5% acetic acid until a
clear background was obtained.
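The resuspension step above normalizes every lysate to the same cell density before loading. A minimal sketch of that arithmetic follows, assuming OD600 scales linearly with cell density; the function name and the example culture values are hypothetical and are not taken from this description.

```python
def resuspension_volume_ml(harvest_od600, culture_volume_ml, target_od600=10.0):
    """Volume of loading buffer that brings a pelleted culture to target_od600.

    Assumes OD600 is proportional to cell density, so that
    harvest_od600 * culture_volume_ml == target_od600 * buffer_volume_ml.
    """
    if harvest_od600 <= 0 or culture_volume_ml <= 0 or target_od600 <= 0:
        raise ValueError("OD600 values and volumes must be positive")
    return harvest_od600 * culture_volume_ml / target_od600


# Hypothetical example: pellet from 1.0 ml of culture harvested at OD600 = 2.5.
print(resuspension_volume_ml(2.5, 1.0))
```

Normalizing this way means equal aliquots of different lysates carry roughly equal amounts of total cell protein, so band intensities on the gel can be compared between clones.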
The whole cell lysates were run on a second 10% SDS-polyacrylamide gel,
and electrophoretically transferred to nitrocellulose for immunoblotting. The
nitrocellulose sheet containing the transferred proteins was incubated in blocking
solution (5% Carnation nonfat dry milk in Tris-buffered saline) for 30 minutes at
room temperature followed by incubation for 1 hour at room temperature in goat
anti-CKS sera which had been preblocked against E. coli cell lysate then diluted
1:1000 in blocking solution. The nitrocellulose sheet was washed two times with
Tris-buffered saline (TBS), then incubated for 1 hour at room temperature with
alkaline phosphatase-conjugated rabbit anti-goat IgG, diluted 1:1000 in blocking
solution. The nitrocellulose was washed two times with TBS and the color was
developed in TBS containing nitroblue tetrazolium and 5-bromo-4-chloro-3-indolyl
phosphate. The appropriate reading frame for each fragment was identified based
on expression of an immunoreactive CKS fusion protein of the correct predicted
size, and further confirmed by DNA sequencing across the vector-insert junction.
After determining the appropriate reading frame for each of the fragments,
samples from cultures containing the appropriate constructs were analyzed by
SDS-polyacrylamide gel electrophoresis and Western blot. FIGURE 25A shows 2
Coomassie-stained 10% SDS-polyacrylamide gels containing the CKS fusion
protein whole cell lysates. Lanes 1 and 16 contain molecular weight standards
with the sizes in kilodaltons shown on the left. The loading order on gel 1
(HGBV-A samples) is as follows: lane 2, clone 1.17 prior to induction; lanes
3-15, clone 4.2, clone 1.17, clone 1.8, clone 1.2, clone 1.18 (SEQUENCE I.D.
NO. 390), clone 1.19, clone 1.20, clone 1.21, clone 1.22 (SEQUENCE I.D. NO.
390), clone 2.12, clone 1.5, clone 1.23, and clone 2.18 respectively, all after 3
hours of induction. The loading order on gel 2 (HGBV-B samples) is as follows:
lane 17, clone 4.1 prior to induction; lanes 18-29, clone 4.1, clone 1.15, clone
1.14, clone 2.8, clone 1.13, clone 1.12, clone 2.1, clone 1.7, clone 1.3, clone
1.4, clone 1.16, and clone 2.12 respectively, all after 3 hours of induction. These
proteins were run on 2 additional 10% gels, in the same loading order, and
transferred to nitrocellulose as described above. The samples were analyzed by
Western blot using a pool of sera from 2 convalescent tamarins, T-1048 and
T-1051, as follows: The nitrocellulose sheets containing the samples were incubated
for 30 minutes in blocking solution, followed by transfer to blocking solution
containing 10% E. coli lysate, 6 mg/ml XL1-Blue/CKS lysate, and a 1:100 dilution
of the pooled convalescent tamarin sera described in TABLE 6 (Example 4). After
overnight incubation at room temperature, the nitrocellulose sheets were washed
two times in TBS and then incubated for 1 hour at room temperature in
HRPO-conjugated goat anti-human IgG, diluted 1:500 in blocking solution. The
nitrocellulose sheets were washed two times in TBS and the color was developed
in TBS containing 2 mg/ml 4-chloro-1-naphthol, 0.02% hydrogen peroxide and
17% methanol. As shown in FIGURE 25B, three HGBV-B proteins
demonstrated immunoreactivity with the pooled tamarin sera: CKS fusions of
clones 1.4, 1.7, and 4.1. Clone 1.7 contains the sequence encoding an HGBV-B
immunogenic region (SEQUENCE I.D. NO. 610) and clone 1.4 contains the
sequence encoding two HGBV-B immunogenic regions (SEQ. ID. NOS. 12, 13
and 18), identified by immunoscreening of a cDNA library (Example 4) using the
same pool of convalescent tamarin sera.
The samples described in the previous paragraph were also analyzed by
Western blot as above using a 1:100 dilution of convalescent serum obtained
approximately three weeks following the onset of acute hepatitis from the surgeon
GB. The reactivities of the fusion proteins from HGBV-A and HGBV-B with this
serum are indicated in TABLES 13 and 14. Only one HGBV-B protein (2.1)
showed reactivity with this serum, and the reactivity was quite weak, while two
HGBV-A proteins (1.22 [SEQUENCE I.D. NO. 390] and 2.17) exhibited strong
reactivity with this serum. These two HGBV-A proteins overlap by 40 amino
acids, so this may reflect reactivity with one epitope or more than one epitope.
These two HGBV-A proteins were chosen for use in ELISA assays as described in
Example 16. It is of interest to note that although tamarins infected with the
eleventh passage GB material (H205 GB pass 11) demonstrate an immune
response to several HGBV-B epitopes but no HGBV-A epitopes, serum from the
original GB source demonstrates significant reactivity with at least one HGBV-A
epitope. This suggests that HGBV-A may have been the causative agent of
hepatitis in the surgeon GB.
Four additional human sera which had indicated the presence of antibodies
to one or more of the CKS/HGBV-A or CKS/HGBV-B fusion proteins by the
1.4, 1.7, or 2.17 ELISA's (see Examples 15 and 16) were chosen for Western
blot analysis. Three of these sera (G1-41, G1-14 and G1-31) are from the West
African "at risk" population and the fourth (341C) is from a nonA-E hepatitis
(Egypt) sample (see Example 15 for detailed description of these populations).
Additional 10% SDS-polyacrylamide gels containing the whole cell lysates from
some of the CKS fusion proteins discussed above were run and transferred to
nitrocellulose as described previously. Each of these blots was preblocked as
described, then incubated overnight with one of the human serum samples diluted
1:100 in blocking buffer containing 10% E. coli lysate and 6 mg/ml
XL1-Blue/CKS lysate. The blots were washed two times in TBS, then reacted with
HRPO-conjugated goat anti-human IgG and developed as indicated above.
The CKS/HGBV-B proteins were analyzed with two of these sera, G1-41
and G1-14, and the reactivities are indicated in TABLE 13. In addition to the three
proteins which showed reactivity with the tamarin sera, two additional proteins
(1.16 and 2.1) showed reactivity with one or the other of the two human sera. The
CKS/HGBV-A proteins were analyzed with all four of these human sera and the
reactivities are indicated in TABLE 14. In addition to the two proteins which
showed reactivity with GB serum, three additional proteins (1.5, 1.18, and 1.19)
showed reactivity with one or more of the human sera. Two of these (1.5 and
1.18) were chosen for use in ELISA assays as described in Example 16. It is of
particular interest to note that the G1-31 serum, which shows reactivity by Western
blot and/or ELISA (Examples 15 and 16) with two HGBV-A proteins (1.18 and
2.17) and one HGBV-B protein (1.7), is the serum from which the GB-C
sequence (SEQUENCE I.D. NO. 673, residues 2274-2640) was isolated (Example
17).

TABLE 13
HGBV-B Samples

                              Reactivity     Reactivity  Reactivity   Reactivity
PCR         Restriction       with T1048 +   with        with human   with human
product(a)  digest(b)         T1051 sera     GB sera     G1-41 sera   G1-14 sera

1.3         EcoRI, PstI
1.4         EcoRI, XbaI       +              -           +            +
1.7         EcoRI, HindIII    +              -           +
1.12        KpnI, PstI
1.13        EcoRI, XbaI
1.14        BamHI, HindIII
1.15        EcoRI, PstI
1.16        EcoRI, XbaI       -              -           +
2.1         EcoRI, HindIII    -              +/-         -            +
2.8         EcoRI, XbaI
2.12        KpnI, PstI
4.1         EcoRI, BamHI      +

(a) PCR product is as indicated in TABLE 9, TABLE 10, or Example 13.
(b) Restriction digests used to liberate the PCR fragment from pT7Blue T-vector
or for direct digestion of 4.1 PCR product.

Example 14. Epitope mapping of immunoreactive
HGBV-A and HGBV-B proteins
A. Epitope mapping of HGBV-B protein 1.7
Overlapping subclones within the HGBV-B immunogenic protein 1.7 were
generated by RT-PCR from T1053 serum as described in Example 7 in order to
determine the location of the immunogenic region or regions. Each PCR primer
had six extra bases on the 5' end to facilitate restriction enzyme digestion, followed
by either an EcoRI site (sense primers) or a HindIII site (antisense primers). In
addition, each antisense primer contained a stop codon just after the coding region.
After digestion, each fragment was cloned into EcoRI/HindIII-digested pJO201 as
described in Example 13. The CKS fusion proteins were expressed and analyzed
by Western blot with tamarin T1048/T1051 sera as described in Example 13. Five
overlapping clones, designated 1.7-1 through 1.7-5, were generated. The clones
encoded regions of the 1.7 protein ranging in size from 104 to 110 amino acids.
The PCR primers used to generate each clone, the sizes of the encoded
polypeptides, the location within the 1.7 sequence and the reactivity with tamarin
T1048/T1051 sera are shown in TABLE 15. Two further overlapping clones were
generated which encompassed the immunogenic region (SEQUENCE I.D. NO.
678) identified by immunoscreening of a cDNA library (Example 4). Each of
these clones, designated 1.7-6 and 1.7-7, encoded polypeptides of 75 amino acids.
The PCR primers, sizes of encoded polypeptides, location within the 1.7 sequence
and reactivity with tamarin T1048/T1051 sera are shown in TABLE 15. Two
immunogenic regions were identified within the 507 amino acid long 1.7 protein:
one near the N-terminus within residues 1-105, and another near the middle of the
protein, encompassing residues 185 to 410. It remains to be determined whether
there is a single epitope or multiple epitopes within each of these regions.
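The primer layout described in this section (six extra 5' bases, then a restriction site, then the gene-specific sequence, with a stop codon built into the antisense primer) can be sketched as follows. This is an illustration only: the six-base clamp "ATATAT", the choice of TAA as stop codon, and the short template fragments are all hypothetical, and the actual primers are those listed under the SEQUENCE I.D. numbers in TABLE 15. The recognition sequences used for EcoRI, HindIII and BamHI are the standard ones.

```python
# Standard recognition sequences for the enzymes named in the text.
SITES = {"EcoRI": "GAATTC", "HindIII": "AAGCTT", "BamHI": "GGATCC"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def sense_primer(coding_start, site="EcoRI", clamp="ATATAT"):
    """5' clamp + restriction site + first bases of the coding region."""
    return clamp + SITES[site] + coding_start


def antisense_primer(coding_end, site="HindIII", clamp="ATATAT", stop="TAA"):
    """5' clamp + restriction site + reverse complement of (coding end + stop).

    On the coding strand the amplified fragment therefore ends with
    coding_end, then the stop codon, then the restriction site.
    """
    return clamp + SITES[site] + reverse_complement(coding_end + stop)


# Hypothetical gene-specific fragments (real primers would be much longer).
print(sense_primer("ATGGCC"))
print(antisense_primer("GGCAAA"))
```

The sketch does not check that the restriction site leaves the insert in frame with CKS; in practice the spacing between the site and the first codon would be chosen to match the reading frame of the vector used (pJO201 in this Example).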
B. Epitope mapping of HGBV-B protein 1.4
Overlapping subclones within the HGBV-B immunogenic protein 1.4 were
generated by RT-PCR from T1053 serum as above in order to determine the
location of the immunoreactive region or regions. Each PCR primer had six extra
bases on the 5' end to facilitate restriction enzyme digestion, followed by either an
EcoRI site (sense primers) or a BamHI site (antisense primers). In addition, each
antisense primer contained a stop codon just after the coding region. After
digestion, each fragment was cloned into EcoRI/BamHI-digested pJO201 as
described in Example 13. The CKS fusion proteins were expressed and analyzed
by Western blot with tamarin T1048/T1051 sera as described in Example 13. Four
overlapping clones, designated 1.4-1 through 1.4-4, were generated. The clones
encoded regions of the 1.4 protein ranging in size from 137 to 138 amino acids.
The PCR primers used to generate each clone, the sizes of the encoded
polypeptides, the location within the 1.4 sequence and the reactivity with tamarin
T1048/T1051 sera are shown in TABLE 15. Two further overlapping clones were
generated which encompassed an immunogenic region identified by
immunoscreening of a cDNA library (Example 4). Each of these clones,
designated 1.4-5 and 1.4-6, encoded polypeptides of 75 amino acids. The PCR
primers, sizes of encoded polypeptides, location within the 1.4 sequence and
reactivity with tamarin T1048/T1051 sera are shown in TABLE 15. A 265 amino
acid sequence was identified as being the immunogenic region within the 522
amino acid long 1.4 protein, encompassing residues 129 to 393. It is likely that
there are at least two epitopes within this region, since library immunoscreening
(Example 4) identified two immunogenic non-contiguous clones within this
sequence.
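The overlapping-subclone layout used for protein 1.4 can be reproduced with a simple tiling rule. The sketch below is an assumption-laden illustration: the window length (137 residues) and overlap (9 residues) are inferred from the 1.4 coordinates in TABLE 15 rather than stated in the text, and the rule that the last window absorbs any short remainder is likewise an assumption.

```python
def tile_windows(protein_length, window, overlap):
    """Tile a protein (1-based, inclusive residues) with overlapping windows.

    Consecutive windows share `overlap` residues. When fewer than one full
    step of residues would remain after a window, that window is extended to
    the C-terminus and tiling stops.
    """
    if not 0 <= overlap < window <= protein_length:
        raise ValueError("require 0 <= overlap < window <= protein_length")
    step = window - overlap
    windows = []
    start = 1
    while True:
        end = start + window - 1
        if protein_length - end < step:
            windows.append((start, protein_length))
            return windows
        windows.append((start, end))
        start += step


# Parameters inferred from the 1.4 rows of TABLE 15 (522 amino acid protein).
for start, end in tile_windows(522, window=137, overlap=9):
    print(f"{start}-{end}  ({end - start + 1} aa)")
```

With these inferred parameters the rule yields the four 1.4 subclone ranges listed in TABLE 15, including the final 138-residue clone.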
C. Epitope mapping of HGBV-A proteins 1.22 (SEQUENCE I.D. NO. 390) and
2.17
The HGBV-A proteins 1.22 (SEQUENCE I.D. NO. 390) and 2.17
(SEQUENCE I.D. NO. 613) both showed immunoreactivity with GB serum by
Western blot (Example 13). Since these two proteins overlap by 40 amino acids,
the observed immunoreactivity may have resulted from the presence of one epitope
or more than one epitope. The complete 1.22/2.17 sequence is 641 amino acids
long. Overlapping subclones within this region were generated by RT-PCR from
T1053 serum as above in order to determine the location of the immunogenic
region or regions. Each PCR primer had six extra bases on the 5' end to facilitate
restriction enzyme digestion, followed by either an EcoRI site (sense primers) or a
BamHI site (antisense primers) for 1.22/2.17-2 through 1.22/2.17-6. However,
since clone 1.22/2.17-1 had an internal EcoRI site, a BamHI site was used in the
sense primer and a HindIII site was used in the antisense primer. In addition, each
antisense primer contained a stop codon just after the coding region. After
digestion, each fragment was cloned into EcoRI/BamHI-digested (or
BamHI/HindIII-digested for 1.22/2.17-1) pJO201 as described in Example 13.
The CKS fusion proteins were expressed and analyzed by Western blot with GB
serum as described in Example 13. The clones encoded regions of 1.22/2.17
ranging in size from 115 to 116 amino acids. The PCR primers used to generate
each clone, the sizes of the encoded polypeptides, the location within the HGBV-A
polypeptide sequence and the reactivity with GB serum are shown in TABLE 15.
The immunogenic region was narrowed down to a 220 amino acid long region in
the middle of the 1.22/2.17 protein. This encompassed the 40 amino acid region
of overlap between 1.22 and 2.17, and thus the immunoreactivity seen with the
two proteins individually may have been due to a shared epitope or to multiple
epitopes.


TABLE 15

             SIZE OF                                                   RESIDUES
             ENCODED      PRIMER                      T1048/T1051     IN SEQ I.D.
CLONE        POLYPEPTIDE  SET                         REACTIVITY      NO. 120
1.7-1        105 aa       SEQ ID #615/SEQ ID #616     +               1-105
1.7-2        109 aa       SEQ ID #617/SEQ ID #618     -               98-206
1.7-3        110 aa       SEQ ID #619/SEQ ID #620     +               199-308
1.7-4        110 aa       SEQ ID #621/SEQ ID #622     +/-             301-410
1.7-5        104 aa       SEQ ID #623/SEQ ID #624     -               403-507
1.7-6        75 aa        SEQ ID #625/SEQ ID #626     +               185-259
1.7-7        75 aa        SEQ ID #627/SEQ ID #628     +               251-325

             SIZE OF                                                   RESIDUES
             ENCODED      PRIMER                      T1048/T1051     IN SEQ I.D.
CLONE        POLYPEPTIDE  SET                         REACTIVITY      NO. 119
1.4-1        137 aa       SEQ ID #629/SEQ ID #630     -               1-137
1.4-2        137 aa       SEQ ID #631/SEQ ID #632     +               129-265
1.4-3        137 aa       SEQ ID #633/SEQ ID #634     +               257-393
1.4-4        138 aa       SEQ ID #635/SEQ ID #636     -               385-522
1.4-5        75 aa        SEQ ID #637/SEQ ID #638     +               138-212
1.4-6        75 aa        SEQ ID #639/SEQ ID #640     +               204-278

             SIZE OF                                                   RESIDUES
             ENCODED      PRIMER                      GB SERUM        IN SEQ I.D.
CLONE        POLYPEPTIDE  SET                         REACTIVITY      NO. 390
1.22/2.17-1  115 aa       SEQ ID #641/SEQ ID #642     -               1862-1976
1.22/2.17-2  115 aa       SEQ ID #643/SEQ ID #644     -               1967-2081
1.22/2.17-3  115 aa       SEQ ID #645/SEQ ID #646     +               2072-2186
1.22/2.17-4  115 aa       SEQ ID #647/SEQ ID #648     +               2177-2291
1.22/2.17-5  115 aa       SEQ ID #649/SEQ ID #650     -               2282-2396
1.22/2.17-6  116 aa       SEQ ID #651/SEQ ID #652     -               2387-2505
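The immunogenic regions quoted in Example 14 follow from TABLE 15 by taking the union of the residue ranges of all reactive subclones. A minimal sketch of that step is given below, assuming that a "+/-" result counts as reactive; the function name and data layout are illustrative choices, while the coordinates are copied from TABLE 15.

```python
def merge_reactive(clones):
    """Merge residue ranges of reactive clones into contiguous regions.

    clones: iterable of (start, end, reactive) with 1-based inclusive residues.
    Returns a sorted list of (start, end) regions covered by reactive clones.
    """
    spans = sorted((start, end) for start, end, reactive in clones if reactive)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1]:
            # Overlaps the previous region: extend it if this clone ends later.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# Coordinates and reactivities from TABLE 15 ("+/-" treated as reactive).
CLONES_1_7 = [(1, 105, True), (98, 206, False), (199, 308, True),
              (301, 410, True), (403, 507, False), (185, 259, True),
              (251, 325, True)]
CLONES_1_4 = [(1, 137, False), (129, 265, True), (257, 393, True),
              (385, 522, False), (138, 212, True), (204, 278, True)]
CLONES_1_22_2_17 = [(1862, 1976, False), (1967, 2081, False),
                    (2072, 2186, True), (2177, 2291, True),
                    (2282, 2396, False), (2387, 2505, False)]

for name, clones in [("1.7", CLONES_1_7), ("1.4", CLONES_1_4),
                     ("1.22/2.17", CLONES_1_22_2_17)]:
    for start, end in merge_reactive(clones):
        print(f"{name}: residues {start}-{end} ({end - start + 1} aa)")
```

This yields residues 1-105 and 185-410 for protein 1.7, residues 129-393 (265 amino acids) for protein 1.4, and a 220 amino acid region (residues 2072-2291) for 1.22/2.17, in agreement with the regions stated in the text. Non-reactive clones that overlap a reactive one are not subtracted, so each region is an upper bound on where the epitope or epitopes lie.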


Example 15. Serological Studies HGBV-B
A. Recombinant Protein Purification Protocol
Bacterial cell cultures expressing the CKS fusion proteins were frozen and
stored at -70°C. The bacterial cells from each of the three constructs were thawed
and disrupted by treating with lysozyme and DNAse, followed by sonication in the
presence of phenylmethanesulfonyl fluoride and other protease inhibitors to
produce mixtures of the individual recombinant antigen and E. coli proteins.
Individually for each of the three cultures, the insoluble recombinant antigen was
concentrated by centrifugation and subjected to a series of sequential washes to
eliminate the majority of non-recombinant E. coli proteins. The washes used in
this protocol included distilled water, 5% Triton X-100 and 50 mM Tris (pH 8.5).
The resulting pellets were solubilized in the presence of sodium dodecyl sulfate
(SDS). After determining protein concentration, 2-mercaptoethanol was added and
the mixtures were subjected to gel filtration column chromatography, with
Sephacryl S300 resin used to size and separate the various proteins. Fractions
were collected and analyzed by SDS-polyacrylamide gel electrophoresis
(SDS-PAGE). The electrophoretically separated proteins were then stained with
Coomassie Brilliant Blue R250 and examined for the presence of a protein having
a molecular weight of approximately 75 kD (CKS-1.7/SEQUENCE I.D. NO.
610), 80 kD (CKS-1.4/SEQUENCE I.D. NO. 611), or 42 kD (CKS-4.1/
SEQUENCE I.D. NO. 612). Fractions containing the protein of interest were
pooled and re-examined by SDS-PAGE.
The immunogenicity and structural integrity of the pooled fractions
containing the purified antigen were determined by immunoblot following
electrotransfer to nitrocellulose as described in Example 13. In the absence of a
qualified positive control, the recombinant proteins were identified by their
reactivity with a monoclonal antibody directed against the CKS portion of each
fusion protein. When the CKS-1.7 protein (SEQUENCE I.D. NO. 610) was
examined by Western blot, using the anti-CKS monoclonal antibody to detect the
recombinant antigen, a single band at approximately 75 kD was observed. This
corresponds to the expected size of the CKS-1.7 protein (SEQUENCE I.D. NO.
610). For the CKS-1.4 protein (SEQUENCE I.D. NO. 611), the anti-CKS
monoclonal antibody detected a quadruplet banding pattern between 60 and 70 kD.
These observed bands are smaller than the expected size of the full length protein
and probably represent truncation products. When the CKS-4.1 protein
(SEQUENCE I.D. NO. 612) was examined by Western blot, the anti-CKS
monoclonal antibody detected the recombinant antigen as a single band at
approximately 42 kD. This corresponds to the expected size of the CKS-4.1
protein (SEQUENCE I.D. NO. 612).

B. Polystyrene Bead Coating Procedure
The proteins were dialyzed and evaluated for their antigenicity on
polystyrene coated beads as described below. Separate enzyme-linked
immunosorbent assays (ELISA's) were developed for detecting antibodies to
HGBV using each of the three purified HGBV recombinant proteins (CKS-1.7
(SEQUENCE I.D. NO. 610); CKS-1.4 (SEQUENCE I.D. NO. 611); and the
CKS-4.1 protein (SEQUENCE I.D. NO. 612)). The ELISA's developed with
these proteins are referred to as the 1.7 ELISA (utilizing the CKS-1.7
(SEQUENCE I.D. NO. 610) recombinant protein), the 1.4 ELISA (utilizing the
CKS-1.4 (SEQUENCE I.D. NO. 611) recombinant protein), and the 4.1 ELISA
(utilizing the CKS-4.1 (SEQUENCE I.D. NO. 612) recombinant protein). In the
first study, one-quarter inch polystyrene beads were coated with various
concentrations of each of the purified proteins (approximately 60 beads per lot)
and evaluated in an ELISA test (described below) using serum from an
uninoculated tamarin as a negative control and convalescent sera from an
inoculated tamarin as a positive control. Additional controls included a pool of
human serum from individuals testing negative for various hepatitis viruses. An
additional positive control consisted of monoclonal antibodies to the CKS protein
to monitor the efficiency of bead coating. The bead coating conditions providing
the highest ratio of positive control signal to negative control signal were selected
for scaling up the bead coating process. For each of the four ELISA's at least two
lots of 1,000 beads were produced and utilized for serological studies.
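The selection rule stated above (choose the coating condition with the highest ratio of positive control signal to negative control signal) can be sketched as follows. All condition labels and absorbance values in this example are hypothetical and serve only to show the calculation; no such readings are reported in this description.

```python
def best_coating_condition(signals):
    """Pick the coating condition with the highest positive/negative ratio.

    signals: map of condition label -> (positive_control_signal,
        negative_control_signal), e.g. absorbance readings.
    Returns (best_label, ratios) where ratios maps each label to its P/N ratio.
    """
    ratios = {}
    for condition, (positive, negative) in signals.items():
        if negative <= 0:
            raise ValueError(f"negative control signal must be positive: {condition}")
        ratios[condition] = positive / negative
    best = max(ratios, key=ratios.get)
    return best, ratios


# Hypothetical readings; none of these values come from the source.
example = {
    "1 ug/ml, pH 5.0": (0.80, 0.10),
    "1 ug/ml, pH 9.5": (1.20, 0.10),
    "5 ug/ml, pH 9.5": (1.50, 0.30),
}
best, ratios = best_coating_condition(example)
print(best)
```

Ranking by the ratio rather than by the raw positive signal penalizes conditions that raise background binding along with specific signal, as in the third hypothetical condition.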
Briefly, polystyrene beads were coated with the purified proteins by adding
the washed beads to a scintillation vial and immersing the beads (approximately
0.233 ml per bead) in a buffered solution containing the recombinant antigen.
Several different concentrations of each of the recombinant antigens were evaluated
along with several different buffers prepared at pHs ranging from pH 5.0 to pH
9.5. The vials were then placed on a rotating device in a 40°C incubator for 2 hours
after which the fluids were aspirated and the beads were washed three times in
phosphate buffered saline (PBS), pH 6.8. The beads were then treated with 0.1%
Triton X-100 for 1 hour at 40°C and washed three times in PBS. Next, the beads
were overcoated with 5% bovine serum albumin and incubated at 40°C for 1 hour
with agitation. After additional washing steps with PBS, the beads were
overcoated with 5% sucrose for 20 minutes at room temperature and the fluids
were aspirated. Finally, the beads were air dried and then utilized for developing
ELISA's for detection of antibodies to HGBV.
C. ELISA Protocol for Detection of Antibodies to HGBV
An indirect assay format was utilized for the ELISA's. Briefly, sera or
plasma was diluted in specimen diluent and reacted with the antigen coated solid
phase. After a washing step, the beads were reacted with horseradish-peroxidase
(HRPO) labeled antibodies directed against human immunoglobulins to detect
tamarin or human antibodies bound to the solid phase. Specimens which produced
signals above a cutoff value were considered reactive. Additional details pertaining
to the ELISA's are described below.
The format for the ELISA's entails contacting the antigen-coated solid
phase with tamarin serum pre-diluted in specimen diluent (buffered solution
containing animal sera and non-ionic detergents). This specimen diluent was
formulated to reduce background signals obtained from non-specific binding of
immunoglobulins to the solid phase while enhancing the binding of specific
antibodies to the antigen-coated solid phase. Specifically, 10 µl of tamarin serum
was diluted in 150 µl of specimen diluent and vortexed. Ten microliters of this
pre-diluted specimen was then added to the well of a reaction tray, followed by the
addition of 200 µl of specimen diluent and an antigen coated polystyrene bead.
The reaction tray was then incubated in a Dynamic Incubator (Abbott Laboratories)
set for constant agitation at room temperature. After a 1 hour incubation, the fluids
were aspirated, and the wells containing the beads were washed three times in
distilled water (5 ml per wash). Next, 200 µl of HRPO-labeled goat anti-human
immunoglobulins diluted in a conjugate diluent (buffered solution containing
animal sera and non-ionic detergents) was added to each well and the reaction tray
was incubated again as above for 1 hour. The fluids were aspirated and the wells
containing the beads were washed three times in distilled water as above. The
beads containing antigen and bound immunoglobulins were removed from the
wells, each was placed in a test tube and reacted with 300 µl of a solution of
0.3% o-phenylenediamine-2 HCl in 0.1 M citrate buffer (pH 5.5) with 0.02%
H2O2. After 30 minutes at room temperature, the reaction was terminated by the
addition of 1 N H2SO4. The absorbance at 492 nm was read on a
spectrophotometer. The color produced was directly proportional to the amount of
antibody present in the test sample.
For each group of specimens, a preliminary cutoff value was set to separate
those specimens which presumably contain antibodies to the HGBV epitope from
those which did not.
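The two-step specimen dilution in this protocol can be checked with a short calculation. The sketch below assumes that volumes are additive and ignores the volume displaced by the bead; the function name is an illustrative choice.

```python
from fractions import Fraction


def dilution_factor(sample_ul, diluent_ul):
    """Fold dilution when sample_ul of specimen is mixed with diluent_ul."""
    return Fraction(sample_ul + diluent_ul, sample_ul)


# Step 1: 10 ul of serum into 150 ul of specimen diluent.
pre_dilution = dilution_factor(10, 150)
# Step 2: 10 ul of the pre-diluted specimen into 200 ul of specimen diluent.
well_dilution = dilution_factor(10, 200)

total_dilution = pre_dilution * well_dilution
serum_in_well_ul = Fraction(10) / pre_dilution

print(f"pre-dilution 1:{pre_dilution}, in-well 1:{well_dilution}, "
      f"overall 1:{total_dilution}")
print(f"neat serum per well: {float(serum_in_well_ul)} ul")
```

Under these assumptions the serum is diluted 1:16 and then 1:21, for an overall dilution of 1:336, which corresponds to 0.625 µl of neat serum per reaction well.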


D. Detection of HGBV derived RNA in Serum from Infected Individuals.
In order to correlate serological data obtained for 1.7 and 1.4 ELISA's with
the presence of HGBV RNA in tamarin serum or in human serum/plasma,
RT-PCR was performed as described in Example 7 of U.S. Serial No. 08/283,314,
previously incorporated herein by reference, utilizing oligonucleotides derived from
HGBV cloned sequences, at a final concentration of 0.5 µM for clone 4 (as
described in Example 7) derived from the HGBV-B genome and for clone 16,
derived from the HGBV-A genome.
E. Tamarin Serological Profiles.
Serum was obtained from tamarins housed at LEMSIP on a weekly basis
and tested for liver enzyme levels; the remaining volume from these specimens
was sent to Abbott Laboratories for further studies.
1. ELISA Results on Tamarins (Initial Infectivity Studies)
Four tamarins (T-1053, T-1048, T-1057 and T-1061) were inoculated with
GB serum (designated as H205 GB passage 11). Elevated liver enzymes were
noted in Tamarin T-1053 during the first week post-inoculation (PI): this tamarin
was euthanized on day 12 PI. Tamarins T-1048, T-1057 and T-1061 exhibited
elevated liver enzyme values within two weeks following their inoculation; these
elevated values persisted until 8-9 weeks PI (FIGURES 2-4) before returning to
pre-inoculation levels. On week 14 PI, these three tamarins were re-challenged
with 0.10 ml of neat serum obtained from tamarin T-1053 (which was shown to be
infectious - Example 2).
Sera from three convalescing tamarins (T-1048, T-1057 and T-1061) were
tested for antibodies to the CKS-1.7 (SEQUENCE I.D. NO. 610) recombinant
protein, the CKS-1.4 (SEQUENCE I.D. NO. 611) recombinant protein, and the
CKS-4.1 (SEQUENCE I.D. NO. 612) recombinant protein, using separate
ELISA's (FIGURES 3, 4 and 5). Specific antibodies to 1.7 (SEQUENCE I.D.
NO. 610), 1.4 (SEQUENCE I.D. NO. 611), 4.1 (SEQUENCE I.D. NO. 612), or
1.5 (SEQUENCE I.D. NO. 614) recombinant proteins were not detected in any of
the pre-inoculation specimens.
As shown in FIGURE 26, specific antibodies were detected in T-1048 sera
with the 1.7 and 1.4 ELISA's on days 56-84 but not on days 97 and 137 PI.
Specific antibodies were not detected in T-1048 sera tested with the 4.1 ELISA.
As shown in FIGURE 27, antibodies to the 1.7 protein (SEQUENCE I.D. NO.
610) were detected in T-1057 serum at 56 and 63 days PI, but not after 63 days
PI. Antibodies to the 4.1 protein (SEQUENCE I.D. NO. 612) were detected on
days 28-63 PI but not on days 84-97 PI. As noted above, tamarins were
challenged with a second dose of the H205 inoculum on day 97 PI. Specific
antibodies to the 4.1 protein (SEQUENCE I.D. NO. 612) were detected on days
112 and 126 PI, suggesting an anamnestic response to the inoculum. No antibody
reactivity was noted for the 1.4 recombinant protein (SEQUENCE I.D. NO. 611).
Specific antibodies to the recombinant 1.4 protein (SEQUENCE I.D. NO.
611) were detected in the serum of tamarin T-1061 between 84 and 112 days PI,
but were not detected after 126 days PI. As shown in FIGURE 28, Tamarin
T-1061 sera were negative for antibodies to the 1.7 protein (SEQUENCE I.D. NO.
610) and to the 4.1 protein (SEQUENCE I.D. NO. 612) for 350 days PI.
2. PCR Results on Tamarins (Initial Infectivity Studies)
Selected sera obtained from tamarins T-1048 and T-1057 were tested for
HGBV RNA via RT-PCR using primers from clone 4 (as described in Example 7)
and from clone 16 (as described in Example 7).
HGBV RNA was not detected via RT-PCR with either set of primers in
the serum obtained 10 and 17 days prior to inoculation (T-1048) as shown in
FIGURE 26, or 17, 37 and 59 days prior to inoculation (T-1057), as shown in
FIGURE 27. For T-1048, HGBV RNA was detected via RT-PCR using primers
from clone 4 on fifteen of seventeen different sera obtained between 7-137 days
PI. HGBV RNA was not detected via RT-PCR using primers from clone 16 in
any of the 10 sera obtained on days 7-97 PI. After the challenge with T-1053
plasma, four of five sera obtained between 8 and 40 days after the challenge were
positive for clone 16. For T-1057, positive RT-PCR results were obtained on four
sera obtained on days 7-28 PI, using primers from clone 4, as shown in FIGURE
27. RT-PCR performed on specimens drawn beyond day 28 PI was negative for
clone 4, except for day 287, which showed a weak hybridization signal. None of
the six specimens obtained from T-1057 on days 7-97 PI was positive via RT-PCR
using primers from clone 16. However, sera obtained between 8-85 days after the
T-1053 challenge were positive using primers from clone 16.
3. ELISA Results on Tamarins (Titration/Transmissibility Studies)
As described in Example 2, serum from tamarin T-1053 was inoculated
into four tamarins. Three of these four tamarins were euthanized during the acute
stage of the disease (between days 12 and 14 PI). The RT-PCR results obtained
on these three tamarins are described below. The surviving tamarin (T-1051) first
developed elevated liver enzyme values by day 14 PI and these values persisted for
at least 8 weeks PI. Specimens from tamarin T-1051 were tested in the 1.7 and
1.4 ELISA's; the results are shown in FIGURE 29. Specific antibodies were not
detected in the pre-inoculation serum nor in serum drawn in the first 41 days PI.
However, an antibody response was noted against the 1.4 protein (SEQUENCE
I.D. NO. 611), and the 1.7 protein (SEQUENCE I.D. NO. 610) between 49 and
113 days PI and the 4.1 protein (SEQUENCE I.D. NO. 612) between 28 and 105
days PI. The tamarin was euthanized during the 113th day PI.
Tamarin T-1034 was previously inoculated with 0.1 ml of potentially
infectious serum obtained from a patient (original GB source) who was recovering
from a recent hepatitis infection as described in Example 1 and in TABLE 4. No
elevations in liver enzyme values were noted in T-1034 for nearly 10 weeks after
inoculation. For this reason, it was decided that tamarin T-1034 could be used in
an additional study. Tamarin T-1034 was inoculated with a preparation of HGBV
prepared as described in Example 4 from a pool of serum obtained from three
tamarins (T-1055, T-1038 and T-1049) previously inoculated with serum from
tamarin T-1053.
These three tamarins (T-1055, T-1038 and T-1049) were inoculated with
serum prepared from tamarin T-1053 as described in Example 2. Elevated liver
enzyme values were noted in all 3 tamarins by day 11 PI. Tamarin T-1055 was
sacrificed on day 12 PI; tamarins T-1038 and T-1049 were sacrificed on day 14
PI. Serum from these tamarins was pooled, clarified and filtered. Tamarin
T-1034 was inoculated with 0.25 ml of a 10^-6 dilution (prepared in normal tamarin
serum) of this filtered material.
Elevated ALT liver enzyme values were first noted in T-1034 at 2 weeks
PI, and remained elevated for the next 7 weeks, finally normalizing by week 10
PI. As demonstrated in FIGURE 30, a specific antibody response to the 1.4
(SEQUENCE I.D. NO. 22) recombinant protein was first detected on day 49 PI
and continued to be detected on days 56-118 PI. The antibody response to the 4.1
(SEQUENCE I.D. NO. 52) recombinant protein was first detected on day 49 PI
and continued to be detected between days 56-77 PI, but was not detected
between days 84-118 PI. The antibody response to the 1.7 (SEQUENCE I.D.
NO. 610) recombinant protein was first detected on day 56 PI and continued to be
detected between days 63-118 PI. The tamarin was sacrificed on day 118 PI.
As described in Example 2, tamarin T-1044 was inoculated with serum
obtained from T-1057 that had been obtained 7 days after the H205 inoculation.
This inoculum was positive only for sequences detected with clone 4 primers.
The inoculum was negative by RT-PCR with clone 16 primers. Mild elevations in
ALT levels above the cutoff were observed from days 14-63 PI. As demonstrated
previously, a specific antibody response to the 1.7 (SEQUENCE I.D. NO. 610)
recombinant protein was detected between 63-84 days PI. No antibody response
to the 4.1 (SEQUENCE I.D. NO. 612) recombinant protein or to the 1.4
(SEQUENCE I.D. NO. 611) recombinant protein was detected. The tamarin was
sacrificed on 161 days PI.
4. PCR Results on Tamarins (Titration/Transmissibility Studies)
Sera obtained from T-1049 and T-1055 during the 8th week prior to
inoculation and T-1038 on the day of inoculation, were negative by RT-PCR for
sequences to clone 16 (SEQUENCE I.D. NO. 26) and clone 4 (SEQUENCE I.D.
NO. 21). Tamarins T-1049 and T-1055 were positive for clone 4 sequences
(SEQUENCE I.D. NO. 21) by RT-PCR 1 week after inoculation (clone 16 PCR
was not done). Prior to the day of sacrifice, T-1049 (14 days PI) as well as
T-1055 (11 days PI) were positive by RT-PCR for both clone 4 (SEQUENCE I.D.
NO. 21) and clone 16 sequences (SEQUENCE I.D. NO. 26). Tamarin T-1038
was positive with both sets of primers on the day of sacrifice (14 days PI).
As seen in FIGURE 30, T-1034 was positive by RT-PCR for sequences
detected with clone 4 primers on the first serum sample obtained after inoculation
(7 days PI) and remained positive to day 70 PI. A sample obtained on day 112 PI
was negative. All of these samples were negative by RT-PCR with clone 16
primers. Samples obtained 70 and 101 days prior to inoculation were negative
with both sets of primers.
As can be seen in FIGURE 29 for tamarin T-1051, HGBV RNA was not
detected with either set of primers (from clones 4 and 16 as described above) in the
serum specimen obtained 8 weeks prior to inoculation. HGBV RNA was detected
by RT-PCR using primers from clone 4 on six sera obtained between days 7-69
PI, but not on days 77, 84, 91, or 105 PI. HGBV RNA was detected by RT-PCR
using primers from clone 16 on nine samples obtained after inoculation.
As seen in FIGURE 7, T-1044 was positive by RT-PCR for sequences
detected with clone 4 primers on the first serum sample obtained after inoculation
(7 days PI) and remained positive to day 63 PI. Samples obtained between days
77-119 were negative. All of these samples were negative by RT-PCR with clone
16 primers. A sample obtained 42 days prior to inoculation was negative for both
sets of primers.
Tamarins T-1047 and T-1056 were inoculated with T-1044 serum obtained
14 days PI. Nine samples obtained between 7-64 days PI from both of these
animals were positive by RT-PCR with clone 4 primers (SEQUENCE I.D. NOS.
8 and 9) but negative with clone 16 primers.
Tamarin T-1058 was inoculated with neat T-1057 serum obtained 22 days
after the challenge with T-1053 serum. This inoculum was positive for sequences
detected with clone 16 primers but negative with clone 4 primers. Serum samples
obtained from this animal were tested with primers derived from GBV-A sequences
[clone 16, clone 2, clone 10 and clone 18] and GB-B sequences [clone 4 and
clone 50]. A sample obtained 9 days prior to inoculation was negative with all
primer sets. A sample obtained 14 days PI was positive only with clone 10 and 18
primers. A sample obtained 21 days PI was positive only with clone 16, 10 and
18 primers. A sample obtained 28 days PI was positive only with clone 18
primers. A sample obtained 35 days PI was positive only with clone 2, 16 and
18 primers. A sample obtained 41 days PI was positive only with clone 16 and
18 primers. All samples tested were negative with primers from clone 4 and clone
50.
5. Summary of Serological Studies in Tamarins
Five tamarins were inoculated with various preparations of HGBV and
developed elevated liver enzyme values by two weeks PI. These elevations
persisted for the next six to eight weeks. A specific antibody response to one or
more HGBV recombinant antigens, 1.7, 1.4, and 4.1 was noted in all five
tamarins. In all cases, the antibodies were first detected by six to ten weeks PI,
and persisted for two to seven or more weeks. In general, the antibody levels
peaked and then declined rapidly over the next several weeks. It is observed that
the antibodies become detectable shortly after the liver enzyme values returned to
normal levels, suggesting that the generation of antibodies may play a role in
clearing the viral infection.
6. Summary of PCR Studies on Tamarins
The results of the genomic walking experiments suggest that clone 4
(SEQUENCE I.D. NO. 21) and clone 16 (SEQUENCE I.D. NO. 26) reside on
separate RNA molecules. We previously provided arguments that supported the
idea that there are two distinct viral genomes, one comprised partly of clone 4
(SEQUENCE I.D. NO.21) and one comprised partly of clone 16 (SEQUENCE
I.D. NO. 26). The observation that some animals are positive with primers from
clone 4 and not with primers from clone 16 supported the existence of two distinct
viral genomes. However, it can also be argued that the inability to detect clone 16
(SEQUENCE I.D. NO. 26) sequence in some of the infected tamarins may reflect
a lower limit of sensitivity of the clone 16 primer set relative to the clone 4 primer
set. If this latter possibility was the case, then a tamarin positive for both primer
sets should exhibit a difference in sensitivity with these two primer sets. In order
to support the explanation that these results are explained by the existence of two
separate viruses, and not differences in sensitivities of these two primer sets, PCR
was performed on a dilution series of cDNA from tamarins T-1057 and T-1053.
T-1057 serum was positive at 5 x 10^-3 but negative at 5 x 10^-4 µl serum equivalents
with clone 4 primers. As much as 20 µl of T-1057 serum was used for RT-PCR
with clone 16 primers with negative results. If this difference was due to the
relative sensitivity of the two primer sets (clone 4 vs. clone 16), one would expect
that other specimens would also show a 4000 fold higher endpoint dilution when
tested by PCR. However, cDNA derived from T-1053 serum was found to be
positive at 2.5 x 10^-4 but negative at 2.5 x 10^-5 µl serum equivalents for both
clone 4 (SEQUENCE I.D. NO. 21) and clone 16 (SEQUENCE I.D. NO. 26)
sequences. These observations are therefore not consistent with a difference in
sensitivity of primer sets but are consistent with the existence of contig B-clone 4
(SEQUENCE I.D. NO.21) and contig A-clone 16 (SEQUENCE I.D. NO. 26)
sequences on separate viral genomes of roughly equal titer in T-1053 but differing
in titer by at least 4000 fold in T-1057. This data is therefore consistent with the
existence of two separate viruses which may have different relative endpoint titers
in different specimens.
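The 4000 fold figure in the argument above follows directly from the quoted dilution endpoints. The short Python sketch below reproduces that arithmetic; the function and constant names are illustrative only and are not part of the original disclosure.

```python
# Smallest serum-equivalent volumes (in microlitres) at which each primer set
# was reported positive, or the largest volume tested when it stayed negative.
T1057_CLONE4_POSITIVE_UL = 5e-3      # positive at 5 x 10^-3, negative at 5 x 10^-4
T1057_CLONE16_MAX_TESTED_UL = 20.0   # still negative with 20 ul of serum
T1053_CLONE4_POSITIVE_UL = 2.5e-4    # positive at 2.5 x 10^-4, negative at 2.5 x 10^-5
T1053_CLONE16_POSITIVE_UL = 2.5e-4   # same endpoint as clone 4


def fold_difference(volume_a_ul: float, volume_b_ul: float) -> float:
    """Return how many times more serum primer set B needs than primer set A."""
    if volume_a_ul <= 0 or volume_b_ul <= 0:
        raise ValueError("volumes must be positive")
    return volume_b_ul / volume_a_ul


# Lower bound only: clone 16 was never positive in T-1057, so the true
# difference is at least this large.
t1057_min_fold = fold_difference(T1057_CLONE4_POSITIVE_UL, T1057_CLONE16_MAX_TESTED_UL)
t1053_fold = fold_difference(T1053_CLONE4_POSITIVE_UL, T1053_CLONE16_POSITIVE_UL)

print(f"T-1057: clone 16 needs at least {t1057_min_fold:.0f}x more serum than clone 4")
print(f"T-1053: clone 16 / clone 4 endpoint ratio = {t1053_fold:.0f}")
```

If the two primer sets simply differed in sensitivity, the ratio would be expected to be similar in both animals; the two values computed here differ by more than three orders of magnitude.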
The observation that HGBV-B viremia alone was sufficient to cause
elevations in liver enzyme levels and that no elevations were observed during a
GBV-A-only viremic stage, indicated that HGBV-B was the probable causative
agent for hepatitis in these tamarins. The immune response to the HGBV-B
antigens appeared to be for a short duration, at most 150 days PI. One explanation
could be that the selection of epitopes used in these ELISAs was not from the
dominant epitopes to which the immune response is generated. Another
explanation could be that in tamarins the hepatic challenge may not be significant
enough to necessitate a long-lived response. This is consistent with histological
evidence from animals that were sacrificed during the acute phase of the disease or
had died of natural causes some time after the acute phase which showed that
hepatic inflammation ranged from mild to not significant (results not shown).
Five of six animals described in this study resolved viremia of HGBV-B
by 112 days PI. In contrast, Tamarin T-1048 remained viremic for 136 days and
was found to be viremic at the time of death (137 days PI). Of the four animals
that were positive for GBV-A sequence, three showed resolution by 77 days after
the first appearance of GBV-A sequence. In contrast, tamarin T-1061 was viremic
for 245 days up to the time the animal was sacrificed. In addition, tamarin T-1051
was viremic up to the time of sacrifice (day 113 PI); however, it is unclear if this
persistent viremia is due to the initial inoculation with T-1053 plasma or a result of
the subsequent challenge with additional T-1053 plasma 69 days later.
The average peak ALT value for the six animals positive for both HGBV-A
and HGBV-B was higher than the average value for the four HGBV-B-only
animals. In addition, the peak value occurred, on average, earlier in animals
positive for GBV-A and GBV-B than for animals positive only for GBV-B. These
results suggest that the intensity of the hepatitis may be related to the presence of
both agents at significant levels. The observation from the additional passage of
GBV-B into tamarins T-1047 and T-1056 that minimal elevation in liver enzymes
occurred with GBV-B viremia supports this assumption that both agents may be
necessary for major elevations in ALT levels to occur in tamarins. In addition to
the passage of HGBV-B alone, initial results from the inoculation of T-1058 with
HGBV-A inoculum suggest that HGBV-A can be transmitted independent of any
detectable HGBV-B as indicated by the absence of any detectable GB-B sequences
with clone 4 and clone 50 primers.
F. Experimental Protocol for Demonstrating Exposure to HGBV in Human
Populations
Specimens were obtained from various human populations and tested for
antibodies to HGBV using three separate ELISA's utilizing recombinant
proteins derived from HGBV-B. The 1.7 ELISA utilized the CKS-1.7
recombinant protein (SEQUENCE I.D. NO. 610) coated onto the solid phase; the
1.4 ELISA utilized the CKS-1.4 recombinant protein (SEQUENCE I.D.
NO. 611) coated on the solid phase and the 4.1 ELISA utilized the 4.1 recombinant
protein (SEQUENCE I.D. NO. 612) coated on the solid phase as described in
Example 15.B. As also noted in Example 15.E, tamarins inoculated with HGBV
produce a specific, but short-lived antibody response to these proteins. In view of
the transient nature of this detectable immune response, a negative result in human
populations would not necessarily exclude previous exposure to HGBV.
The objective of the serological studies conducted with human specimens
was two-fold. First, the seroprevalence of antibodies to the current HGBV
recombinant antigens in various human populations was to be determined. These
studies included testing (1) populations considered at "low risk" for exposure to
HGBV (e.g. healthy volunteer blood donors in U.S.); (2) populations considered
to be "at risk" for exposure to HGBV (e.g. specimens obtained from intravenous
drug users and hemophiliacs are frequently seropositive for parenterally
transmitted hepatitis viruses (HBV and HCV); specimens obtained from
individuals residing in developing nations are frequently seropositive for
enterically transmitted viruses (HAV and HEV)); (3) panels of specimens obtained
from individuals with "non-A-E hepatitis" that is not associated with exposure to
known hepatitis viruses (HAV, HBV, HCV, HDV or HEV) or to other viruses
associated with hepatitis such as cytomegalovirus (CMV) or Epstein-Barr Virus
(EBV). In some cases, members of the panels under the general heading of non
A-E hepatitis were not tested for antibodies to HEV. Therefore, all specimens in
the non A-E group which were reactive with the 1.7, 1.4 or 4.1 ELISA's were
retested with an HEV ELISA assay (available from Abbott Laboratories, Abbott
Park, IL). Positive anti-HEV results were noted with samples from three sites
(Pakistan, U.S. and New Zealand), as explained hereinbelow.
One would expect to observe higher seroprevalence rates among
populations "at risk" for exposure to HGBV and among individuals with non-A-E
hepatitis, than among populations considered to be at "low risk" for exposure to
HGBV.
The second objective of the serological studies was to examine specimens
found to be positive for antibodies to one or more HGBV epitopes by RT-PCR to
determine if the virus is present in serum. It is well known that HBV and HCV
can establish a viremic state which persists for months or years, and in general,
that HAV and HEV establish a short-lived viremia persisting in general for several
weeks. In cases of HBV and HCV infection which are acute, resolving hepatitis,
the viremic stage may also be short-lived persisting for several weeks. Thus, RT-
PCR can be used to provide evidence that the virus is present in an infected
individual. However, because the viremic state can be short-lived, a negative RT-
PCR result for a given agent can be observed in individuals who are infected with
that agent.
G. Cutoff Determination
Previous experience with other ELISA's utilizing the indirect assay format
indicated that a preliminary cutoff value can be calculated based on the absorbance
values obtained on a population presumably negative for antibodies to the protein
being studied. A preliminary cutoff value was calculated as the sum of the mean
absorbance value of the population plus 10 standard deviations from the population
mean. Since the cutoff value was to be used every time a panel was run, a more
convenient method to express the cutoff was as a factor of the negative control
(pool of normal human plasma - NHP) which was run in replicates of five for each
assay run. For the 1.7, 1.4 and 4.1 ELISA's, the negative control typically had an
absorbance value of between 0.030 and 0.060. As described below, the cutoff
values were calculated to be at an absorbance value of approximately 0.300 to
0.600, which was equivalent to an absorbance signal of ten times the negative
control value. Thus, in order for a specimen to be considered reactive, the ratio of
the sample (S) absorbance value to the negative (N) control absorbance value (S/N
ratio) had to be equal to or greater than 10.0.
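The S/N rule just stated can be written as a small function. The following Python sketch is illustrative only: the function names and the absorbance readings used in the example are hypothetical, while the threshold of 10.0 is the one given in the text.

```python
SN_CUTOFF = 10.0  # S/N ratio at or above which a specimen is considered reactive


def sn_ratio(sample_abs: float, negative_control_abs: float) -> float:
    """Ratio of sample absorbance (S) to negative control absorbance (N)."""
    if negative_control_abs <= 0:
        raise ValueError("negative control absorbance must be positive")
    return sample_abs / negative_control_abs


def is_reactive(sample_abs: float, negative_control_abs: float,
                cutoff: float = SN_CUTOFF) -> bool:
    """Apply the 'equal to or greater than 10.0' rule."""
    return sn_ratio(sample_abs, negative_control_abs) >= cutoff


# Hypothetical readings against a negative control of 0.045, which lies inside
# the typical 0.030 to 0.060 range quoted above.
negative_control = 0.045
for reading in (0.520, 0.200):
    verdict = "reactive" if is_reactive(reading, negative_control) else "nonreactive"
    print(f"A = {reading:.3f}, S/N = {sn_ratio(reading, negative_control):.1f} -> {verdict}")
```

Expressing the cutoff relative to the negative control of each run, rather than as a fixed absorbance, lets the same rule be reused every time a panel is run.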
H. Supplemental Testing
Specimens which were initially reactive were typically retested in duplicate.
If one or both of the retest absorbance values were above the cutoff value, the
specimen was considered repeatably reactive. Specimens which were repeatably
reactive were then tested with supplemental assays which may further support the
ELISA data. Repeatably reactive specimens which had sufficient volume may be
tested by Western blot to determine that the antibody response was directed against
the CKS-1.7 (SEQUENCE I.D. NO. 610), CKS-1.4 (SEQUENCE I.D. NO.
611) or CKS 4.1 (SEQUENCE I.D. NO. 612) antigens and not to E. coli proteins
which may have been co-coated on the solid phase with the major protein of
interest. For a Western blot result to be considered positive, a visible band had to
be detected at 80 kD for the 1.7 protein (SEQUENCE I.D. NO. 610), 60-70 kD
for the 1.4 protein (SEQUENCE I.D. NO. 611) or at 42 kD for the 4.1 protein
(SEQUENCE I.D. NO. 612). Since the Western blot has not been optimized to
match or exceed the sensitivity of the ELISA's, a negative result was not used to
discard the ELISA data. However, a positive result reinforced the reactivity
detected by the ELISA's.
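The retest rule described in this section (duplicate retest, with one or both retest values above the cutoff counting as repeatably reactive) can be sketched as follows. The function name, the category labels and the example S/N values are assumptions made for illustration; the rule itself and the S/N cutoff of 10.0 are taken from the text.

```python
from typing import Sequence


def classify_specimen(initial_sn: float, retest_sns: Sequence[float],
                      cutoff: float = 10.0) -> str:
    """Classify a specimen from its initial S/N and its duplicate retest S/N values."""
    if initial_sn < cutoff:
        return "nonreactive"
    # One or both retest values at or above the cutoff is sufficient.
    if any(sn >= cutoff for sn in retest_sns):
        return "repeatably reactive"
    return "initially reactive only"


# Hypothetical S/N values, for illustration only.
print(classify_specimen(12.4, (11.0, 8.7)))  # one of two retests above cutoff
print(classify_specimen(10.8, (6.1, 7.3)))   # neither retest above cutoff
print(classify_specimen(3.2, ()))            # never initially reactive
```

Only specimens in the "repeatably reactive" category would go on to the supplemental Western blot or RT-PCR testing described here.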
Repeatably reactive specimens which had sufficient volume may be tested
by RT-PCR (performed as described in Example 15.D using clone 4 primers) to
identify HGBV specific nucleotide sequences in serum. A positive result would
indicate a viremic specimen and would ultimately help in establishing the role of
HGBV in human hepatitis. A negative result, however, was not to be construed to
indicate that the ELISA results were incorrect. As noted in the tamarin study in
Example 15.E, RT-PCR results were positive in the first several weeks after
infection and then became negative at about the time when antibodies were just
beginning to be detected with the current ELISA's. These later specimens may be
RT-PCR negative but positive in one or both of the ELISA's.
I. Serological Data Obtained with Low-Risk Specimens
A population consisting of 100 sera and 100 plasma was obtained from
healthy, volunteer blood donors in Southeastern Wisconsin and tested for
antibodies to the 1.7 (SEQUENCE I.D. NO. 610) and 1.4 (SEQUENCE I.D. NO.
611) and 4.1 (SEQUENCE I.D. NO. 612) recombinant proteins utilizing the
ELISA's described above. The absorbance values obtained with the 1.7, 1.4 and
4.1 ELISA's for serum and plasma were plotted separately (FIGURES 9-14).
For the 1.7 ELISA, the mean absorbance values for the serum and plasma
specimens were 0.072 [with a standard deviation (SD) of 0.061] and 0.083
(SD=0.055), respectively. Thus, for the 1.7 ELISA's, the tentative cutoff values
for serum and plasma were 0.499 and 0.468, respectively. As discussed above,
the cutoff also was expressed as a factor of the negative control absorbance value;
specimens having S/N values above 10.0 were considered reactive. Using this
cutoff value, 0 of 200 specimens tested were reactive for antibodies to 1.7
(SEQUENCE I.D. NO. 610).
For the 1.4 ELISA, several specimens (three from the serum population
and six from the plasma population) had absorbance values greater than 0.300
(S/N's of 6-12, near or above the expected cutoff value). When retested, all nine
of these specimens produced S/N values of less than 10.0. The mean absorbance
values for the serum and plasma specimens were 0.072 (SD=0.052) and 0.108
(SD=0.062), respectively. The cutoff for the 1.4 ELISA was calculated using the
formula described above; the cutoff values for serum and plasma populations were
0.436 and 0.542, respectively. One specimen from the serum population was
initially reactive and when re-tested in duplicate was negative. Two specimens
from the plasma population were initially reactive but were negative upon re-test.
A second population of 200 normals was tested including 100 plasma and 100
serum. Using the proposed cutoff, two plasma and two sera were repeatably
reactive.
For the 4.1 ELISA, the mean absorbance values for the serum and plasma
specimens were 0.070 [with a standard deviation (SD) of 0.037] and 0.063
(SD=0.040), respectively. Thus, for the 4.1 ELISA, the tentative cutoff values for
serum and plasma were 0.329 and 0.511, respectively. As discussed above, the
cutoff also was expressed as a factor of the negative control absorbance value;
specimens having S/N values above 10.0 were considered reactive. Using this
cutoff value, 0 of 100 plasma specimens and 0 of 100 serum specimens were
initially reactive for antibodies to 4.1 (SEQUENCE I.D. NO.612).
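The tentative cutoffs quoted in this section can be checked against the formula of section G (population mean plus a multiple of the standard deviation). The Python sketch below does only that arithmetic; the names are illustrative. Section G states a multiplier of 10, whereas the figures quoted here are reproduced by a multiplier of 7 for five of the six populations, and the 4.1 plasma cutoff of 0.511 matches neither. The sketch therefore leaves the multiplier as a parameter and reports what the quoted numbers imply, without asserting which value was actually used.

```python
# (mean absorbance, standard deviation, tentative cutoff) as quoted in the text
REPORTED = {
    ("1.7", "serum"): (0.072, 0.061, 0.499),
    ("1.7", "plasma"): (0.083, 0.055, 0.468),
    ("1.4", "serum"): (0.072, 0.052, 0.436),
    ("1.4", "plasma"): (0.108, 0.062, 0.542),
    ("4.1", "serum"): (0.070, 0.037, 0.329),
    ("4.1", "plasma"): (0.063, 0.040, 0.511),
}


def cutoff(mean: float, sd: float, k: float) -> float:
    """Preliminary cutoff: population mean plus k standard deviations."""
    return mean + k * sd


def implied_multiplier(mean: float, sd: float, reported_cutoff: float) -> float:
    """Multiplier k that would reproduce the reported cutoff."""
    return (reported_cutoff - mean) / sd


def populations_matching(k: float, tolerance: float = 0.0005) -> list:
    """Populations whose quoted cutoff equals mean + k*SD to three decimals."""
    return [key for key, (m, s, r) in REPORTED.items()
            if abs(cutoff(m, s, k) - r) < tolerance]


matches_k7 = populations_matching(7)
matches_k10 = populations_matching(10)

for key, (m, s, r) in REPORTED.items():
    print(key, f"implied multiplier = {implied_multiplier(m, s, r):.1f}")
```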
An additional 760 plasma donors from the Interstate Blood Bank (Ohio)
were tested with the 1.7 and 1.4 ELISAs. A total of 9 specimens were repeatably
reactive. None of the specimens were reactive in both ELISAs. All 9 specimens
were repeatably reactive with the 1.4 ELISA.
In total, 960 specimens from plasma or blood donors residing in the U.S.
were tested for antibodies to the 1.7 and 1.4 proteins. A total of 13 specimens
were repeatably reactive by the 1.4 ELISA. None of the specimens were repeatably
reactive with the 1.7 ELISA.
In summary, these data indicate that, with the existing ELISA's, a total of
13 of 960 specimens obtained from U.S. blood donors were reactive for
antibodies in one or more of the ELISA's employing recombinant antigens from
HGBV-B. These data suggest that HGBV may be endemic in the U.S.
These data are summarized in TABLE 16.
J. Specimens Considered "At Risk" for Hepatitis
The data for these studies is summarized in TABLE 16.
(i) Specimens from West Africa
A total of 181 of 1300 specimens obtained from West Africa were
repeatably reactive in one or more of the ELISA's. One specimen was repeatably
reactive in all 3 ELISA's. A total of 43 specimens were repeatably reactive with
the 1.7 ELISA, 91 specimens were repeatably reactive with the 1.4 ELISA and 51
specimens were repeatably reactive in the 4.1 ELISA.
One of six specimens repeatably reactive in the 1.7 ELISA was reactive by
Western blot for the 1.7 protein (SEQUENCE I.D. NO. 610). Nine of 9
specimens (100%) which were repeatably reactive in the 1.4 ELISA were positive
by Western blot for antibodies to the 1.4 protein (SEQUENCE I.D. NO. 611).
One specimen was positive by Western blot for both proteins. Twelve of 12
specimens (100%) repeatably reactive in the 4.1 ELISA were positive by Western
blot for the 4.1 protein (SEQUENCE I.D. NO. 612).
Three repeatably reactive specimens (including one specimen positive in the
1.4 ELISA and one specimen positive in both ELISA's and both Western blots)
were tested for HGBV RNA by RT-PCR using primers from clone 4 as described
above. All three specimens were negative by RT-PCR.
These data suggest that HGBV may be endemic in West Africa.
(ii) Specimens from Intravenous Drug Users (IVDU's)
Set 1: Three of 112 specimens were positive with the 1.4 ELISA. Five
specimens were reactive on 4.1 ELISA and three on 1.7 ELISA. Two samples
were positive on more than one ELISA.
Set 2: A total of 99 specimens were obtained from a population of
intravenous drug users, as part of a study being conducted at Hines Veteran's
Administration Hospital, in Chicago, IL. None of these specimens were reactive
in the 1.7 or 4.1 ELISA. One specimen was repeatably reactive in the 1.4 ELISA.
This repeatably reactive specimen was tested for HGBV RNA by RT-PCR using
primers from clone 4 as described above. This specimen was RT-PCR negative.
K. Specimens obtained from individuals with non A-E Hepatitis
The data for these studies is summarized in TABLE 16.
Various populations of specimens were obtained from individuals
diagnosed as having non-A-E hepatitis and tested with the 1.7, 1.4, and 4.1
ELISA's described in Example 15.C. These specimens included: 180 specimens
obtained from a Japanese clinic; 56 specimens from a clinic in New Zealand; 73
specimens obtained from a clinic in Greece; 132 specimens from a clinic in Egypt;
64 specimens from a U.S. clinic in Texas (set T); 72 specimens from a research
center in Minnesota (set M); 62 specimens from U.S. (set #1); 82 specimens
obtained from a clinic in Pakistan; 10 specimens from a clinic in Italy. (Due to
insufficient volumes of some sera, certain specimens from these groups were not
tested on all of the available ELISAs).
(i) Specimens from Japan
These 180 specimens were obtained from 85 different patients. A total of 2 of
180 specimens were repeatably reactive in the 1.7 ELISA. These two reactive
specimens came from 2 individuals. These 2 specimens were tested by RT-PCR
using primers from clone 4 as described above. None of the specimens were
positive.
None of the specimens were positive in the 1.4 ELISA.
For the 4.1 ELISA, seven of 89 specimens were repeatably reactive in the
4.1 assay. (Note: these 89 specimens were obtained from 29 different patients).
Five of the reactive specimens were obtained from one patient. The remaining two
were from a different patient.
(ii) Specimens from New Zealand
A total of four of 56 specimens were repeatably reactive in one or more of the
ELISA's 1.7, 1.4, and 4.1. None of these specimens were reactive in two or
more ELISA's. One specimen was repeatably reactive in the 1.7 ELISA and two
specimens were repeatably reactive in the 1.4 ELISA. One specimen was
repeatably reactive with the 4.1 ELISA. PCR was performed on two repeatably
reactive specimens; both specimens were negative. One specimen which was
repeatably reactive in the 1.4 ELISA was also reactive for antibodies to HEV.
(iii) Specimens from Greece
A total of 5 of 73 specimens were found to be reactive for antibodies in the
1.7 and/or 1.4 ELISA's. These 73 specimens were obtained from a total of 11
patients. Two of the five repeatably reactive specimens were repeatably reactive
for both ELISA's and were obtained from one individual on different dates. Two
repeatably reactive specimens were tested by RT-PCR and were negative. None of
these specimens were reactive for antibodies with the 4.1 ELISA.
(iv) Specimens from Egypt
A total of 11 of 132 specimens were reactive in the 1.7, 1.4, or 4.1
ELISA's. Eight specimens were positive in both the 1.7 and 1.4 ELISA's. Nine
specimens were reactive for antibodies in the 1.7 ELISA and 9 specimens were
reactive in the 1.4 ELISA. One specimen was repeatably reactive in the 4.1 ELISA but
negative in the 1.7 and 1.4 ELISAs. One specimen repeatably reactive in the 1.7
ELISA was tested by Western blot and was negative for antibodies to the 1.7
recombinant protein (SEQUENCE I.D. NO. 610). Six of nine specimens
repeatably reactive in the 1.4 ELISA tested positive by Western blot for antibodies
to the 1.4 recombinant protein (SEQUENCE I.D. NO. 611). Seven of the
repeatably reactive specimens were tested by RT-PCR; none of the specimens were
reactive. These 132 specimens were obtained on different dates from 25 different
individuals. The 11 repeatably reactive specimens were obtained from five
different individuals. For one of these individuals (patient #101), the immune
response clearly mimics that observed with the tamarins (FIGURE 31). Note that
in FIGURE 31, the ALT levels were elevated at the time of presentation of
symptoms to the physician. In subsequent specimens, the ALT levels declined and
antibodies were detected utilizing the 1.4 and 1.7 ELISA's. The antibody
response declined over the next several weeks as was noted with the serologic
profiles observed in the tamarins. Three additional patients (257, 260, and 340)
exhibited serologic patterns similar to patient #101 (as shown in FIGURES 32-34).
These data provide supportive evidence that HGBV may be the etiologic agent
in these cases of hepatitis.
None of the seven specimens obtained from these four patients were
positive for HGBV RNA by RT-PCR. There are several potential reasons for
these results. First, the viremic phase may have been very short-lived: the virus
may have been cleared from the serum by the time of the first bleed date.
Secondly, these specimens were shipped from Egypt and may potentially have
been frozen and thawed or otherwise compromised during the storage and
shipping process, thus reducing the potential to detect HGBV RNA.
(v) Specimens from U.S. (Set T)
None of 64 specimens from the U.S. (set T) were repeatably reactive in
the 1.7, 1.4 or 4.1 ELISA.
(vi) Specimens from U.S. (Set M)
A total of 4 of 72 specimens from U.S. specimens (set M) were repeatably
reactive in one or more of the ELISA's. Two specimens were reactive with the 1.7
and 4.1 ELISA's. One specimen was reactive only with 1.7 and one specimen
was reactive only with the 4.1 ELISA.
(vii) Specimens from the United States (set I)
A total of three of 51 specimens from non A-E hepatitis U.S. set I were
repeatably reactive in one or both of the ELISA's. One specimen was repeatably
reactive in both ELISA's. One specimen was reactive in the 1.7 ELISA and three
specimens were repeatably reactive in the 1.4 ELISA. The specimen positive in
both ELISA's was positive by Western blot for the 1.4 recombinant protein
(SEQUENCE I.D. NO. 22) but negative for the 1.7 recombinant protein
(SEQUENCE I.D. NO. 23). One additional specimen was positive in the 1.4
ELISA and Western blot positive for the 1.4 recombinant protein (SEQUENCE
I.D. NO. 611). One specimen which was repeatably reactive in the 1.4 ELISA
was reactive for antibodies to HEV.
(viii) Specimens from Pakistan
A total of four of 82 specimens were repeatably reactive for antibodies in
1.4 and/or 1.7 ELISAs. None of the specimens were reactive in both ELISA's.
Two specimens were repeatably reactive in the 1.7 ELISA and two specimens
were repeatably reactive in the 1.4 ELISA. Two specimens repeatably reactive in
the 1.4 ELISA were also reactive for antibodies to HEV. None of these 82
specimens were positive with the 4.1 ELISA.
(ix) Specimens from Italy
None of the ten specimens were repeatably reactive in the 1.7, 1.4, or 4.1
ELISA.
L. Statistical Significance of Serological Results
These data indicate that specific antibodies to HGBV proteins (i.e.,
specimens repeatably reactive for antibodies in the 1.7, 1.4, or 4.1 ELISA's) can be
detected in all three categories of populations studied. Serological results obtained
with the various categories of specimens ("low risk", "at risk" and non A-E
hepatitis patients) were grouped together and analyzed for statistical significance
using the Chi square test. The data indicated that there is a significant difference when
comparing the seroprevalence of anti-HGBV in volunteer blood donors with either
the individuals considered "at risk" for exposure to HGBV or individuals
diagnosed with hepatitis of an unknown etiology.
Among West Africans, the seroprevalence rate is 13.9% and is
significantly higher than that of the baseline group (TABLE 17), with a p value of 0.000.
Similarly, for the IVDU's, there was a statistically significant difference (p value
of 0.000) when the results from IVDU's were compared with volunteer donors.
In countries (including Japan, New Zealand, U.S., Egypt, and Pakistan), there
were significant differences in antibody prevalence in patients with non A-E
hepatitis when compared to the volunteer blood donors from the US.
M. Summary
These data suggest that the ELISA's described herein may be useful in
diagnosing cases of hepatitis in humans in various geographical regions including
Japan, New Zealand, U.S., Egypt, and Pakistan. It is likely that these data
underestimate the seroprevalence of antibodies to HGBV among all categories of
specimens tested. It is expected that as additional HGBV epitopes are discovered
and evaluated, the utility of tests derived from the HGBV genome(s) will become
more important in diagnosing hepatitis among patients whose diagnosis cannot
currently be made. NOTE: Although the results of RT-PCR were negative in
these initial studies, subsequent data revealed flavi-like viral sequences in serum of
seropositive individuals (see Example 17).
As we have discussed supra, more than one strain of the HGBV is present.
These are considered to be within the scope of the present invention and are termed
"hepatitis GB Virus" ("HGBV").

Example 16. Serological studies with HGBV-A
A. Recombinant Protein Purification Protocol
Bacterial cells expressing the CKS fusion proteins were frozen and stored at
-70°C. The bacterial cells from each of the GBV-A constructs were thawed and
disrupted as described in Example 15 for GBV-B constructs. Further, the
recombinant proteins were purified as described for GBV-B recombinant proteins
in Example 15.
The fractions which were collected during the purification protocol were
electrophoretically separated, stained with Coomassie Brilliant Blue R250 and
examined for the presence of a protein having a molecular weight of approximately
60 kD (CKS 1.5/SEQUENCE NO. 614), 65 kD (CKS 2.17/SEQUENCE NO.
613), 55 kD (CKS 1.18/SEQUENCE NO. 390) and 66 kD (CKS
1.22/SEQUENCE NO. 390). Fractions containing the protein of interest were
pooled and re-examined by SDS-PAGE.
The immunogenicity and structural integrity of the pooled fractions
containing the purified antigen were determined by immunoblot following
electrotransfer to nitrocellulose as described in Example 13. In the absence of a
qualified positive control, the recombinant proteins were identified by their
reactivity with a monoclonal antibody directed against the CKS portion of each
fusion protein. When the CKS-1.5 protein (SEQUENCE I.D. NO. 614) was
examined by Western blot, using the anti-CKS monoclonal antibody to detect the
recombinant antigen, a single band at approximately 60 kD was observed. This
corresponds to the expected size of the CKS-1.5 protein (SEQUENCE I.D. NO.
614). Similarly, bands of the expected sizes were noted for the CKS-2.17 protein
(SEQUENCE I.D. NO. 613), the CKS-1.18 protein (SEQUENCE NO. 390)
and the CKS-1.22 protein (SEQUENCE I.D. NO. 390) when examined by
immunoblot.
B. Polystyrene Bead Coating Procedure
The proteins were dialyzed and evaluated for their antigenicity on polystyrene
beads as described in Example 15.
C. ELISA Protocol for Detection of Antibodies to HGBV
The ELISA's were performed as described in Example 15.
D. Detection of HGBV RNA in Serum of infected Individuals
Specimens which were repeatably reactive in the ELISAs were tested for HGBV
RNA as described in section D. of Example 15.
E. Tamarin Serological Profiles
None of the sera from the tamarins produced a specific immune response
when tested in the ELISA utilizing the CKS 1.5 protein, the CKS 2.17 protein, the
CKS 1.18 protein or the CKS 1.22 protein, all derived from the HGBV-A
genome. However, HGBV-A RNA was detected in several of the infected
tamarins as described in the previous example. (See Example 15 for a summary of
the tamarin serological profiles).
F. Experimental Protocol for Serologic Studies on Human Populations
In Example 15, ELISA's employing recombinant antigens from HGBV-B
were utilized to evaluate the presence of antibodies to HGBV-B in various human
populations. Many of the same specimens were then tested for antibodies to
HGBV-A utilizing the 1.5 ELISA employing the CKS-1.5 recombinant protein
(SEQUENCE I.D. NO. 614), the 2.17 ELISA employing the CKS-2.17
recombinant protein (SEQUENCE I.D. NO. 613), the 1.18 ELISA employing the
CKS-1.18 recombinant protein (SEQUENCE I.D. NO. 390), and the 1.22 ELISA
employing the CKS-1.22 recombinant protein (SEQUENCE I.D. NO. 390),
coated on the solid phase (as described in Example 15). As noted in Example 15,
all five of the convalescing tamarins inoculated with HGBV produced a specific
but short-lived antibody response to the HGBV-B recombinant proteins (as
detected with the 1.7, 1.4 and 4.1 ELISA's). Although none of the tamarins
produced a detectable antibody response in the 1.5, 2.17, 1.18 or 1.22 ELISAs,
some human specimens from West Africa produced a specific antibody response to
one or more of these recombinant proteins when tested via Western blot, and one of
the specimens obtained from the surgeon (who was the source of the GB agent) at
22 days after onset of hepatitis produced a specific antibody response to the 2.17
recombinant protein when tested by Western blot (see Example 3). In the current
example, we evaluated the utility of the 1.5, 2.17, 1.18 and 1.22 ELISA's in
detecting antibodies in various human populations.
G. Cutoff Determination
The cutoffs for the 1.5, 2.17, 1.18, and 1.22 ELISAs were determined as
described in Example 15.
H. Supplemental Testing
As noted in Example 15, specimens which were initially reactive were
typically retested; if the specimen was repeatably reactive, additional tests (e.g.
Western blot) may be performed to further support the ELISA data. For a Western
blot result to be considered positive, a visible band should be observed at 60 kD
for the 1.5 protein (SEQUENCE I.D. NO. 614), at 65 kD for the 2.17 protein
(SEQUENCE I.D. NO. 613), at 55 kD for the 1.18 protein (SEQUENCE I.D.
NO. 390) and at 66 kD for the 1.22 protein (SEQUENCE I.D. NO. 390). Since the
Western blot had not been optimized to match or exceed the sensitivity of the
ELISA's, a negative result was not used to discard the ELISA data. However, a
positive result reinforced the reactivity detected by the ELISA's.
As also noted in Example 15, repeatably reactive specimens which have
sufficient volume may be tested by RT-PCR (performed as described in Example
15) using primers to identify HGBV specific nucleotide sequences in serum.
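The supplemental testing logic described above can be sketched in code. This is only an illustration of the stated rules: the function name, argument names and result labels are hypothetical, and only the band sizes are taken from the text.

```python
# Expected Western blot band sizes (kD) quoted in the text for each antigen.
EXPECTED_BAND_KD = {"1.5": 60, "2.17": 65, "1.18": 55, "1.22": 66}

def interpret(initial_reactive, retest_reactive, band_seen=None):
    """Label one specimen in one ELISA (labels are illustrative only).

    band_seen is True or False when a Western blot was run and a band at
    the expected size was or was not visible, and None when no blot was run.
    """
    if not initial_reactive:
        return "nonreactive"
    if not retest_reactive:
        return "not repeatably reactive"
    if band_seen:
        # A positive blot reinforces the ELISA result.
        return "repeatably reactive, supported by Western blot"
    # A negative or missing blot does not discard the ELISA result.
    return "repeatably reactive"

print(interpret(True, True, band_seen=False))
```

Note that a negative Western blot and a missing Western blot give the same label here, which mirrors the statement that a negative blot was not used to discard ELISA data.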
I. Serological Data Obtained with Low-Risk Specimens
A total of 252 plasma specimens were obtained from the Interstate Blood
Bank in Ohio and tested for antibodies with the 1.5 ELISA which utilizes the 1.5
recombinant protein (SEQUENCE I.D. NO. 614). The mean absorbance value for
the population was 0.036 (SD=0.022). The cutoff was calculated to be 0.168,
corresponding to an S/N value of 10.0. A total of 760 plasma specimens
(including the 252 specimens utilized to determine the cutoff) were tested for
antibodies with the 1.5 ELISA. None of the specimens were repeatably reactive.
In addition, 100 plasma specimens were obtained from Southeastern Wisconsin
and tested for antibodies with the 1.5 ELISA. None of the specimens were
repeatably reactive.
Thus, there is no evidence that antibodies to the 1.5 protein were present in
U.S. blood donors.

A total of 200 specimens were obtained from Wisconsin blood donors and
tested for antibodies with the 2.17 ELISA which utilizes the 2.17 recombinant
protein (SEQUENCE I.D. NO. 60). The mean absorbance value for the
population was 0.058 (SD=0.025). The cutoff was calculated to be 0.208,
corresponding to an S/N value of approximately 10.0. One of the specimens was
repeatably reactive. Thus, the seroprevalence in U.S. blood donors (N=200) is
relatively low.
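The cutoff procedure itself is given in Example 15 and is not reproduced in this section. As a rough check, both cutoffs quoted above are numerically equal to the population mean plus six standard deviations. The sketch below assumes that relationship; the multiplier k=6 is inferred from the quoted figures and is not stated here, and the function name is hypothetical.

```python
def cutoff(mean_absorbance, sd, k=6.0):
    """Cutoff as population mean plus k standard deviations.

    k=6 is an assumption inferred from the figures quoted in the text.
    """
    return mean_absorbance + k * sd

print(round(cutoff(0.036, 0.022), 3))  # 1.5 ELISA  -> 0.168
print(round(cutoff(0.058, 0.025), 3))  # 2.17 ELISA -> 0.208
```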
The same 200 specimens described in the above paragraph were tested for
antibodies with the 1.18 and 1.22 ELISAs. None of the specimens were
repeatably reactive. Thus, there is no evidence that specimens from volunteer
blood donors are antibody positive for HGBV-A proteins as determined by the 1.5,
2.17, 1.18 and 1.22 ELISAs.
J. Specimens Considered "At Risk" for Hepatitis
The data for these studies is summarized in TABLE 18.
(i) Specimens from West Africa
A total of 58 of 1300 specimens were reactive with the 1.5 ELISA. Twelve
of 18 repeatably reactive specimens were positive by Western blot for antibodies to
the 1.5 protein (SEQUENCE I.D. NO. 614). A total of 43 of 817 specimens were
reactive in the 2.17 ELISA. These repeatably reactive specimens were not tested
by Western blot for antibodies to the 2.17 protein (SEQUENCE I.D. NO. 613).
Six of the 817 specimens were reactive with the 1.22 ELISA. Nine of the
353 specimens were reactive for 1.18 ELISA. Twenty-one specimens reactive
with the 2.17 ELISA were tested by Western blot and 13 were reactive. All eight
specimens that were repeatably reactive with the 1.18 ELISA were positive by
Western blot.
These data suggest that HGBV may be endemic in West Africa.
(ii) Specimens from Intravenous Drug Users
A total of 112 specimens were obtained from a population of intravenous
drug users, as part of a study being conducted at Hines Veteran's Administration
Hospital, in Chicago, IL. One specimen was repeatably reactive in the 2.17 ELISA
and an additional specimen was reactive in the 1.18 ELISA. None of these
specimens were positive in the 1.5 or 1.22 ELISA.
K. Specimens obtained from individuals with non A-E Hepatitis
The data for these studies is summarized in TABLE 18.
Various populations of specimens (described in Example 15.K) were
obtained from individuals with non-A-E hepatitis and tested with the 1.5, 2.17,
1.18 and 1.22 ELISAs (described in Example 15.C). Due to insufficient sample
volume, not all specimens were tested in all of the ELISAs.
(i) Specimens from Japan
A total of four of 89 specimens were repeatably reactive in the 1.5 ELISA,
with three of the specimens being from one individual and one of the specimens
from a second individual. One specimen which had tested negative for the 1.5
ELISA, the 1.18 ELISA and the 1.22 ELISA was reactive in the 2.17 ELISA.
None of the specimens were reactive in the 1.18 ELISA. These specimens were
not tested with the 1.22 ELISA.
(ii) Specimens from New Zealand
None of these 56 specimens were reactive in the 1.5 ELISA. These
specimens were not tested in the 2.17 ELISA, the 1.18 ELISA or the 1.22
ELISA.
(iii) Specimens from Greece
None of the 67 specimens (obtained from a total of 10 patients)
were reactive for antibodies with the 1.5, 2.17 or 1.22 ELISA.
(iv) Specimens from Egypt
None of 132 specimens were reactive in the 1.5 ELISA. A total of 7 of
132 specimens available for testing were reactive in the 2.17 ELISA. These
specimens were obtained from 25 individuals with acute non A-E hepatitis. Three
of the 25 patients were seropositive in the 2.17 ELISA on one or more separate
dates following the onset of hepatitis. None were reactive in the 1.18 or 1.22
ELISA.
(v) Specimens from the U.S. (Set M)
None of the 72 specimens were reactive with the 1.5 ELISA. Three of the
72 specimens were reactive for the 1.18 ELISA. Two of the specimens were
reactive in the 2.17 ELISA and four specimens were reactive with the 1.22 ELISA.
Two of the samples were reactive in one or more of the ELISAs.
(vi) Specimens from U.S. (Set T)
None of the 64 specimens were reactive with the 1.5, 1.22 or 2.17
ELISAs. One specimen was reactive for the 1.18 ELISA.
(vii) Specimens from U.S. (Set I)
A total of 3 of 62 specimens were reactive in one or more of the GBV-A
ELISAs. One specimen was repeatably reactive in both the 2.17 and 1.22 ELISA.
One specimen was reactive only in the 2.17 ELISA and an additional specimen
was reactive only in the 1.22 ELISA. None of the specimens were reactive in the
1.5 or 1.18 ELISA.

As we have discussed supra, it is possible that more than one strain of the
HGBV may be present, or that more than one distinct virus may be represented by
the sequences disclosed herein. These are considered to be within the scope of the
present invention and are termed "hepatitis GB Virus" ("HGBV").
L. Statistical Significance of Serological Results
These data indicated that specific antibodies to HGBV-A proteins (i.e.,
specimens repeatably reactive for antibodies in the 1.5, 2.17, 1.18 and 1.22 ELISA's)
were detected among individuals considered "at risk" for exposure to HGBV and
among individuals diagnosed with non A-E hepatitis, but were not frequently
detected among either volunteer or paid blood donors from the U.S. In TABLE
19, the serological results obtained with the various categories of specimens ("low
risk", "at risk" and non A-E hepatitis patients as shown in TABLE 18) were
grouped together and analyzed for statistical significance using the Chi square test.
Unlike the data in TABLE 18, which compiled the seroprevalence of antibodies to
HGBV proteins in the total number of specimens tested, the data in TABLE 19
reflect the results obtained with different individuals (persons). For the GBV-A
ELISAs, the data indicate that there is a significant difference (with a p value of
0.000) when comparing the seroprevalence of anti-HGBV in volunteer blood donors
with the individuals considered "at risk" for exposure to HGBV (West Africa) but
not in the IVDUs. In addition, there was a statistically significant difference
between the seroprevalence of antibodies to HGBV-A in individuals with non A-E
hepatitis in Egypt and the U.S. when compared to volunteer donors. These data
suggest that exposure to HGBV-A was associated with non-A through E hepatitis.
NOTE: although the results of RT-PCR were negative in these initial studies,
subsequent data revealed flavi-like viral sequences in serum of seropositive
individuals (see Example 19).
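The kind of comparison described in this section can be sketched as a 2x2 Chi square test. The text does not say whether a continuity correction was applied, so the uncorrected Pearson statistic is assumed here. The counts used are the specimen-level 1.5 ELISA figures quoted in sections I and J (58 of 1300 West African specimens, 0 of 760 donor specimens), not the per-individual counts of TABLE 19, and the function name is hypothetical.

```python
import math

def chi_square_2x2(pos_a, n_a, pos_b, n_b):
    """Uncorrected Pearson chi-square for two proportions (1 degree of freedom).

    Returns (statistic, p_value). Raises ZeroDivisionError if a row or
    column total is zero.
    """
    a, b = pos_a, n_a - pos_a
    c, d = pos_b, n_b - pos_b
    n = n_a + n_b
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

chi2, p = chi_square_2x2(58, 1300, 0, 760)
print(f"chi2 = {chi2:.2f}, p = {p:.3f}")
```

With these counts the p value is far below 0.0005, so it prints as 0.000 at three decimal places, the form in which p values are reported in the text.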
M. Summary
These data suggest that the ELISA described herein may be useful in
detecting antibodies among individuals residing in West Africa and among
individuals with non-A through E hepatitis. The risk for hepatitis among the West
Africans is relatively high; nearly 85% of these individuals are seropositive for
antibodies to Hepatitis B virus, and approximately 5% are positive for antibodies
to hepatitis C virus. It is likely that these data underestimate the seroprevalence of
antibodies to HGBV among all categories of specimens tested. It is expected that
as additional HGBV epitopes are discovered and evaluated, the utility of tests
derived from the HGBV genome(s) will become more important in diagnosing
hepatitis among patients whose diagnosis cannot currently be made.

Example 18. Identification of a GB-related virus in humans
A. Theory
Epitopes from both HGBV-A and HGBV-B have been identified (Example
3). These have been used as serologic markers to screen human serum and plasma
samples (Examples 5 and 6). A significant correlation between seroreactivity with
some of these markers and the incidence of nonA-E hepatitis has suggested that
HGBV-B is the causative agent of nonA-E hepatitis in humans (Example 5.G).
However, Western blot analysis of GB human sera gave no indication of reactivity
to HGBV-B epitopes (Example 3). Instead, at least one HGBV-A epitope was
identified with the GB human sera, suggesting that HGBV-A was the causative
agent of hepatitis in GB. Neither HGBV-A nor HGBV-B sequences have been
identified in patients with nonA-E hepatitis by RT-PCR (Example 5.E).
Therefore, proof of HGBV-A and/or HGBV-B infection in humans with nonA-E
hepatitis remains to be determined.
The failure to identify HGBV-A and/or HGBV-B sequences in human sera
or plasma sources may be due to several factors. First, we have looked at only a
limited number of HGBV-A and/or HGBV-B-seropositive samples by RT-PCR,
and the complete storage history of many of these samples is unknown. Thus, it is
possible that viral RNA present in these samples was compromised by incorrect
storage. Second, GB infection appears to be resolving in nature. As such, the
window of time in which GB sequences are present in an infected individual's
serum may be very narrow. Thus, the chances of obtaining serum samples
containing GB sequences may be extremely low. Finally, a limited number of
PCR primer sets were used to look for HGBV-A and/or HGBV-B sequences.
HGBV-A and/or HGBV-B are RNA viruses and, therefore, are likely to have high
rates of mutation (Holland, et al. (1982) Science 215:1577-1585). Thus, the
sequence of HGBV-A and/or HGBV-B present in the examined human sera may
be different enough from the sequence of our PCR primers such that HGBV-A
and/or HGBV-B may not be detected.
To address the possibility that the genomic variability of HGBV-A and/or
HGBV-B prevented detection of these viruses in our PCR studies, degenerate PCR primers
were designed to the highly conserved NS3-like regions of HGBV-A and HGBV-B
(see Fig. 17). It was reasoned that these highly conserved regions serve a
necessary function in the viral replicative cycle. Therefore, these sequences should
be maintained in HGBV-A and HGBV-B variants. PCR primers designed within
this region should be able to detect HGBV-A and/or HGBV-B genomic RNA by
RT-PCR. In addition, by designing degenerate PCR primers that can specifically
amplify HGBV-A, HGBV-B and HCV sequences, we reasoned that we might be
able to amplify sequences from viruses related to HGBV-A, HGBV-B and HCV.
Thus, if the limited seroreactivity detected in human serum and plasma samples
(Examples 5 and 6) is the result of cross-reactive antibodies to antigens from
distinct HGBV-A- or HGBV-B-related viruses, we may be able to obtain
sequences from these GB-related viruses. [This is similar to the experimental
approach that Nichol and colleagues took to identify the unique Hantavirus
associated with the recent outbreak of acute respiratory illness in the Southwest
United States. Nichol, et al. Science 262:914-917 (1993).]
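To illustrate what a degenerate primer represents, the sketch below expands a primer written with IUPAC ambiguity codes into the set of specific sequences it stands for. The example primer string is hypothetical and is not one of the primers of SEQUENCE ID. NOS. 671-673; the function names are likewise illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    count = 1
    for base in primer.upper():
        count *= len(IUPAC[base])
    return count

def expand(primer):
    """List every specific sequence represented by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]

example = "GGNTAYAC"  # hypothetical primer, for illustration only
print(degeneracy(example), expand(example))
```

Higher degeneracy lets a primer pool anneal to more sequence variants, at the cost of a lower effective concentration of each individual sequence in the pool.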
B. Cloning the NS3-like region of hepatitis GB virus C (HGBV-C).
In several models of virus infections, viremia occurs during the early stages
of infection and is often associated with the detection of IgM class antibodies to
viral proteins. As noted in Examples 5 and 6, several specimens were
immunoreactive in ELISA's which detected IgG class antibodies to recombinant
proteins derived from HGBV-A and HGBV-B. Additional ELISA's were
performed to determine if IgM class antibodies could be detected to these proteins.
Several seropositive specimens obtained from West African individuals (Example
5.E.i) were reactive for IgM class antibodies to the recombinant proteins (data not
shown). These specimens were thought to have a high probability of containing
virus. In addition, specimens obtained from HGBV-A- and HGBV-B-seropositive
Egyptian individuals (Example 5.F.vii) suffering from acute hepatitis
in the absence of detectable IgM class antibodies to HGBV-A or HGBV-B
recombinant proteins were also examined, since acute liver
disease is most likely linked to viral presence. A "hemi-nested" RT-PCR was
performed on the nucleic acids from these samples with degenerate oligonucleotide
primers which will amplify HGBV-A, HGBV-B and HCV-1 sequences using the
GeneAmp® RNA PCR kit (Perkin Elmer) as directed by the manufacturer.
Briefly, the first set of amplifications was performed on the cDNA products of
random-primed reverse transcription reactions of the extracted nucleic acids with 2
mM MgCl2 and 1 µM primers ns3.1-s and ns3.1-a (SEQUENCE ID. NOS. 671
and 672, respectively). Reactions were subjected to 40 cycles of
denaturation-annealing-extension [three cycles of (94°C, 30 sec; 37°C, 30 sec; 2 min ramp to
72°C; 72°C, 30 sec) followed by 37 cycles of (94°C, 30 sec; 55°C, 30 sec; 72°C,
30 sec)] followed by a 10 min extension at 72°C. Completed reactions were held
at 4°C. The second set of amplifications was as described above except that 4%
of the first PCR products were used as the template, and ns3.1-s and ns3-a
(SEQUENCE ID. NOS. 671 and 673, respectively) were used as the "hemi-nested"
primer set. Products from the first and second sets of PCRs were
analyzed by gel electrophoresis.
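The first-round cycling program described above can be written out as a small data structure to confirm that its two stages add up to the stated 40 cycles. This is a sketch only: the step labels and variable names are illustrative, and the time estimate counts only the programmed holds and the stated 2 min ramp, not instrument ramping between other steps or the final 4°C hold.

```python
# Each step: (label, temperature in deg C, duration in seconds).
LOW_STRINGENCY = [("denature", 94, 30), ("anneal", 37, 30),
                  ("ramp to extension", 72, 120), ("extend", 72, 30)]
STANDARD = [("denature", 94, 30), ("anneal", 55, 30), ("extend", 72, 30)]

# (number of cycles, steps per cycle), in the order given in the text.
PROGRAM = [(3, LOW_STRINGENCY), (37, STANDARD)]
FINAL_EXTENSION_SEC = 10 * 60  # 10 min at 72 deg C

def total_cycles(program):
    """Total number of amplification cycles across all stages."""
    return sum(n for n, _ in program)

def programmed_seconds(program, final_extension=FINAL_EXTENSION_SEC):
    """Sum of programmed step durations plus the final extension."""
    cycling = sum(n * sum(sec for _, _, sec in steps) for n, steps in program)
    return cycling + final_extension

print(total_cycles(PROGRAM))               # 40
print(programmed_seconds(PROGRAM) / 60.0)  # 76.0 (minutes)
```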
One sample from West Africa had a PCR product from the hemi-nested
reaction that migrated at approximately 386 bp (the expected size of a HGBV-A,
HGBV-B or HCV product). This product was cloned into pT7 Blue T-vector
plasmid (Novagen) as described in the art. The sequence obtained from this clone
(GB contig C [GB-C], SEQUENCE ID. NO. 673, residues 2274-2640) was
compared with GB contig A (GB-A, SEQUENCE ID. NO. 163, residues 4438-4804),
GB contig B (GB-B, SEQUENCE ID. NO. 393, residues 4218-4587) and
HCV-1 (SEQUENCE ID. NO. 398). FIGURE 36 shows a nucleotide alignment
of these sequences, while TABLE 20 shows the percent identity between these
sequences.

TABLE 20
         GB-A    GB-B    GB-C    HCV-1
GB-A     100.0   47.99   61.66   52.55
GB-B             100.0   52.55   54.96
GB-C                     100.0   57.37
HCV-1                            100.0
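The pairwise values in TABLE 20 can be held in a small lookup to read off the nearest relative of each sequence. The sketch below also includes a simple percent identity function over two pre-aligned sequences; how the values in TABLE 20 were actually computed is not stated here, so that function is an assumption (matches divided by aligned columns), and all names are hypothetical.

```python
# Pairwise percent nucleotide identities copied from TABLE 20.
IDENTITY = {
    ("GB-A", "GB-B"): 47.99, ("GB-A", "GB-C"): 61.66, ("GB-A", "HCV-1"): 52.55,
    ("GB-B", "GB-C"): 52.55, ("GB-B", "HCV-1"): 54.96, ("GB-C", "HCV-1"): 57.37,
}

def identity(x, y):
    """Look up the identity of two sequences regardless of order."""
    if x == y:
        return 100.0
    return IDENTITY[(x, y)] if (x, y) in IDENTITY else IDENTITY[(y, x)]

def closest(name):
    """Return the other sequence with the highest identity to `name`."""
    others = {n for pair in IDENTITY for n in pair} - {name}
    return max(others, key=lambda other: identity(name, other))

def percent_identity(a, b):
    """Percent of aligned columns with identical, non-gap bases (assumed definition)."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return 100.0 * matches / len(a)

print(closest("GB-C"))  # GB-A
```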
As demonstrated in FIGURE 36 and TABLE 20, nucleotide comparisons of GB-A,
GB-B and HCV-1 show that these sequences are 47.99 to 61.66% identical to
one another. This is not surprising when one considers the conserved amino acid
residues present in the NTP-binding helicase of these viruses (Example 2.B.3,
FIGURE 17A). The nucleotide comparison of the NS3 PCR product obtained
from the West African sample (GB-C, SEQUENCE ID. NO. 673, residues
2274-2640) with the other viruses suggests that the West African NS3 product (GB-C,
SEQUENCE ID. NO. 673, residues 2274-2640) is related to, but distinct from, the
NS3 sequences from GB-A (SEQUENCE ID. NO. 163, residues 4438-4804),
GB-B (SEQUENCE ID. NO. 393, residues 4218-4587) and HCV-1
(SEQUENCE ID. NO. 398). This sequence comparison suggests that GB-C may
be from a GB-like virus more closely related to GB-A than to GB-B or HCV.
BLASTN and BLASTX searches of nucleic acid and protein databases in the
Wisconsin Sequence Analysis Package (Version 8) with GB-C (SEQUENCE ID.
NO. 673, residues 2274-2640) found limited sequence identity with several strains
of HCV. The highest P values (i.e., odds of alignment being made by chance) for
nucleotide and amino acid searches were 1.9 x 10-2 and 5.3 x 10-3l, respectively
(data not shown). Together, these data suggest that GB-C (SEQUENCE ID. NO.
673, residues 2274-2640) may be from a unique GB-like virus related to HGBV-A,
HGBV-B and HCV, which we now designate HGBV-C.
C. GB-C is exogenous.
PCR primers to GB-C sequence were utilized to determine whether this
sequence could be detected in the genomes of humans, Rhesus monkeys, S.
cerevisiae and E. coli as described, for example, in Example 6.B. PCR was
performed using GeneAmp® reagents from Perkin-Elmer-Cetus essentially as
directed by the supplier's instructions. Briefly, 300 ng of genomic DNA was used
for each 100 µl reaction. PCR primers (SEQUENCE I.D. NOS. 675 and 676)
were used at a final concentration of 1.0 µM. PCR was performed for 40 cycles
(94°C, 30 sec; 55°C, 30 sec; 72°C, 30 sec) followed by an extension at 72°C for 10
min. PCR products were separated by agarose gel electrophoresis and visualized
by UV irradiation after direct staining of the nucleic acid with ethidium bromide,
followed by hybridization to a radiolabeled probe after Southern transfer to a
Hybond-N+ nylon filter. FIGURE 37 shows a PhosphoImage (Molecular
Dynamics, Sunnyvale, CA) from a Southern blot of the PCR products after
hybridization with the radiolabeled probe from GB-C (SEQUENCE I.D. NO. 673,
residues 2274-2640). GB-C (SEQUENCE I.D. NO. 673) sequences were not
detected in human (FIGURE 19, lane 1), Rhesus monkey (lane 2), S. cerevisiae
(lane 3) or E. coli (lane 4) genomic DNAs despite the detection of ~350 fg (one
genome copy equivalent, lane 5) and ~35 fg (0.1 genome copy equivalents, lane 6)
of GB-C plasmid template in 300 ng human genomic DNA. (Lane 7 contains the
PCR products from ~3.5 fg [0.01 genome copy equivalents] GB-C plasmid
template in 300 ng human genomic DNA.) Thus, using genomic PCR that can
detect 0.1 genome copy equivalents, GB-C (SEQUENCE I.D. NO. 673) cannot
be detected in the genomes of human, Rhesus monkey, S. cerevisiae, and E. coli.
These data are consistent with the purported exogenous (i.e. viral) origin of GB-C
(SEQUENCE I.D. NO. 673).
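The statement that ~350 fg of plasmid corresponds to about one genome copy equivalent in 300 ng of human DNA can be checked with a rough mass-to-copy-number calculation. Every constant below other than the two masses is an assumption not given in the text: an average of 650 g/mol per base pair, a haploid human genome of about 3.1 x 10^9 bp, and a plasmid (vector plus insert) of about 3250 bp.

```python
AVOGADRO = 6.022e23        # molecules per mole
BP_MASS = 650.0            # g/mol per base pair of dsDNA (approximate, assumed)
HUMAN_HAPLOID_BP = 3.1e9   # approximate haploid human genome size (assumed)
PLASMID_BP = 3250          # hypothetical size of vector plus GB-C insert

def copies(mass_g, length_bp):
    """Approximate number of dsDNA molecules of a given length in a given mass."""
    return mass_g * AVOGADRO / (length_bp * BP_MASS)

plasmid_copies = copies(350e-15, PLASMID_BP)        # ~350 fg plasmid
genome_copies = copies(300e-9, HUMAN_HAPLOID_BP)    # 300 ng human DNA
print(f"{plasmid_copies:.2e} plasmid copies per {genome_copies:.2e} haploid genomes")
```

Under these assumptions both figures come out on the order of 10^5, that is, roughly one plasmid copy per haploid genome.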
D. GB-C can be detected in additional human serum samples.
Additional HGBV-A and HGBV-B immunoreactive human serum samples
were tested for the presence of GB-C sequences using RT-PCR. As in Example
7, nucleic acids extracted from serum samples were reverse transcribed using
random hexamers, and cDNAs were subjected to 35-40 cycles of amplification
(94°C, 30 sec; 55°C, 30 sec; 72°C, 30-90 sec) followed by an extension at 72°C for
10 min. GB-C-specific PCR primers (gl31-sl and gl31-al, SEQUENCE ID.
NOS. 675 and 676) were used at 1.0 µM concentration. The PCR products
were separated by agarose gel electrophoresis and visualized by UV irradiation
after direct staining of the nucleic acid with ethidium bromide and hybridization to
a radiolabeled probe after Southern transfer to a Hybond-N+ nylon filter. A total
of 48 HGBV-immunopositive samples were tested from West Africa. Including
the original sample from which GB-C was identified, eight samples from West
Africa were positive for GB-C sequences by RT-PCR. A total of ten GB
seronegative West African serum samples were tested, none of which had
detectable GB-C sequences. PCR products from four of the positive samples were
cloned and sequenced as described above. Over the 156 nucleotides examined,
two of four clones examined were identical to GB-C sequence (SEQUENCE I.D.
NO. 673, residues 2274-2640), and two clones (SEQUENCE I.D. NOS. 677 and
678) contained sequences that were 88.4% and 83.6% identical to GB-C
(SEQUENCE I.D. NO. 673, residues 2274-2640) (FIGURE 38). However,
despite the divergence at the nucleotide level, the predicted translation product of
each clone is remarkably similar with only one amino acid change occurring in the
predicted translation of SEQUENCE ID NO. 678.
Additional serum samples from individuals with nonA-E hepatitis from
Greece, Egypt and the United States were tested for GB-C sequences as described
above. None of these samples contained detectable GB-C sequences. The lack of
detection of GB-C sequences in these samples may be due to several reasons (see
above, Theory). However, the sequence variation noted above between GB-C
(SEQUENCE I.D. NO. 673, residues 2274-2640) and the two GB-C variants
(SEQUENCE I.D. NOS. 678 and 677) suggests that if the closely related HGBV-C's
from West Africa can differ by 15.1% at the nucleotide level, it is likely that
the GB-C-specific PCR primers (gl31-sl, gl31-al, SEQUENCE ID. NOS. 675
and 676) may not hybridize sufficiently to geographically distinct isolates of GB-C
virus to generate a detectable PCR product. In this case, PCR primers designed to
a more conserved region (5' UTR) of the genome may allow the detection of GB-
C sequences in non-West African serum samples.
E. Extension of the HGBV-C sequences.
The PCR walking technique described in Example 2.A hereinabove was
utilized to obtain additional GB-C sequences. Briefly, total nucleic acids were
extracted from the West African human serum originally used to identify GB-C
(SEQUENCE I.D. NO. 673, residues 2274-2640). This nucleic acid was reverse
transcribed as described supra. The resultant cDNAs were amplified in 50 µl PCR
reactions (PCR 1) as described by Sorensen et al. except that 2 mM MgCl2 was
used. Reactions were subjected to 35 cycles of denaturation-annealing-extension
(94°C, 30 sec; 55°C, 30 sec; 72°C, 90 sec) followed by a 10 min extension at
72°C. Biotinylated products were isolated using streptavidin-coated paramagnetic
beads (Promega) as described by Sorensen et al. Nested PCRs (PCR 2) were
performed on the streptavidin-purified products as described by Sorensen et al. for
a total of 35 cycles of denaturation-annealing-extension as described above. The
resultant products and the PCR primers used to generate them are listed in TABLE
21.
TABLE 21
Reaction  Primer set PCR 1                  Primer set PCR 2          Size of PCR product
C.1       SEQ ID #679/SEQ ID #135           SEQ ID #680/SEQ ID #126   1250 bp
C.2       SEQ ID #681/SEQ ID #694           SEQ ID #686/SEQ ID #126   220 bp
C.3       SEQ ID #682/SEQ ID #694           SEQ ID #683/SEQ ID #126   250 bp
C.4       SEQ ID #684/SEQ ID #695           SEQ ID #685/SEQ ID #126   800 bp
C.5       comp. of SEQ ID #679/SEQ ID #695  SEQ ID #90/SEQ ID #126    750 bp
C.6       SEQ ID #688/SEQ ID #672           SEQ ID #92/SEQ ID #126    1150 bp
C.7       SEQ ID #690/SEQ ID #695           SEQ ID #94/SEQ ID #126    550 bp
C.8       SEQ ID #692/SEQ ID #695           SEQ ID #96/SEQ ID #126    250 bp
C.9       653/SEQ ID #135                   654/SEQ ID #126           625 bp
C.10      655/SEQ ID #694                   656/SEQ ID #126           350 bp
C.11      657/SEQ ID #694                   658/SEQ ID #126           550 bp
C.12      659/SEQ ID #695                   660/SEQ ID #126           450 bp
C.13      661/665                           662/SEQ ID #126           750 bp
C.14      663/FP3 (SEQ ID #13)              664/SEQ ID #126           550 bp
C.15      666/125                           667/SEQ ID #126           600 bp
In addition, a 1.3 kb product (C.16) was generated with oligonucleotide primers
SEQ ID # 669 and SEQ ID # 670 using PCR 1 conditions described above. This
product, together with those described in TABLE 21 were isolated from agarose
gels and cloned into pT7 Blue T-vector plasmid (Novagen) as described in the art.
The cloned products were sequenced as described in Example 5. The
sequences were assembled using the GCG Package (version 7) of programs. A
schematic of the assembled contig is presented in FIGURE 39. GB-C is 9034 bp
in length, all of which has been sequenced and is presented in SEQUENCE I.D.
NO. 400-606. These SEQUENCE I.D.'s correspond to the three forward
translation frames.

Example 19. CKS-based expression and detection of immunogenic
HGBV-C polypeptides
The HGBV-C sequences obtained from the walking experiments described
in Example 17 (TABLE 13) were cloned into the CKS expression vectors
pJO200, pJO201, and pJO202 using the restriction enzymes listed in TABLE 22
(10 units, NEB) as described in Example 13. Two additional PCR clones,
designated C.3/2 and C.8/12, were also expressed (FIGURE 39). PCR product
C.3/2 was generated using primers SEQUENCE I.D. NO. 681 and the
complement of SEQUENCE I.D No. 685 and PCR product C.8/12 was generated
using primers (SEQUENCE I.D. NO. 693 and its complement) as described in
0 Example 9. The PCR products were cloned into pT7Blue as described previously,
then liberated with the restriction enzymes listed in TABLE 22 and cloned into
pJO200, pJO201 and pJO202 as above.
Two human sera which had indicated the presence of antibodies to one or
more of the CKS/HGBV-A or CKS/HGBV-B fusion proteins by the 1.7, 4.1 or
2.17 ELISAs (see Examples 15 and 16) were chosen for Western blot analysis.
One of these sera (240D) was from an individual with non-A-E hepatitis (Egypt)
and the other (G8-81) was from a West African individual "at risk" for exposure to
HGBV (see Example 15). The CKS/HGBV-C fusion proteins were expressed
and transferred to nitrocellulose sheets as described above. The blots were
preblocked as described and incubated overnight with one of the human serum
samples diluted 1:100 in blocking buffer containing 10% E. coli lysate and 6 mg/ml
XL1-Blue/CKS lysate. The blots were washed two times in TBS, reacted with
HRPO-conjugated goat anti-human IgG and developed as indicated above. The
results are shown in TABLE 22.
Several of the HGBV-C proteins showed reactivity with one or the other of
the two sera, and three (C.1, C.6 and C.7) were chosen for use in ELISA assays
(see Example 20). Thus, samples previously identified as reactive with HGBV-A
and/or HGBV-B proteins additionally show reactivity with HGBV-C proteins.
The reactivity with multiple proteins from the three HGBV viruses may be due to
cross-reactivity resulting from shared epitopes between the viruses. Alternatively,
it may result from infection with multiple viruses, or from other unidentified
factors.
TABLE 22
HGBV-C Samples

PCR product (a)   Restriction digest (b)   Reactivity with human G8-81 serum   Reactivity with human 240D serum

GB-C     KpnI, XbaI +
C.1      EcoRI, XbaI +
C.3/2    EcoRI, XbaI
C.4      KpnI, XbaI
C.9      KpnI, PstI ND
C.10     EcoRI, XbaI ND
C.5      KpnI, XbaI +/-
C.6      KpnI, PstI +
C.7      NdeI-fill, BamHI - +
C.8/12   KpnI, XbaI +

(a) PCR product is as described in previous TABLES or Examples.
(b) Restriction digests used to liberate the PCR fragment from pT7Blue T-vector.
ND = not done.

Example 20. Serological studies with GBV-C
A. Recombinant Protein Purification Protocol
Bacterial cells expressing the CKS fusion proteins were frozen and stored at
-70°C. The bacterial cells from each of the GBV-C constructs were thawed and
disrupted as described in Example 15 for GBV-B constructs. The
recombinant proteins were then purified as described for GBV-B recombinant proteins
in Example 15.
The fractions collected during the purification protocol were
electrophoretically separated, stained with Coomassie Brilliant Blue R250 and
examined for the presence of a protein having a molecular weight of approximately
75 kD (CKS-C.1/SEQUENCE I.D. NO. 404), 71 kD (CKS-C.6/SEQUENCE I.D.
NO. 404), and 49 kD (CKS-C.7/SEQUENCE I.D. NO. 404). Protein bands of
the expected molecular weight were observed for the CKS-C.6 and CKS-C.7
recombinant proteins. For the CKS-C.1 protein, a band was observed which
corresponded to a molecular weight of 62 kD rather than the expected molecular
weight of 75 kD. It is unclear why the expected and
observed sizes differ. Fractions containing the protein of interest were pooled
and re-examined by SDS-PAGE.
The immunogenicity and structural integrity of the pooled fractions
containing the purified antigen were determined by immunoblot following
electrotransfer to nitrocellulose as described in Example 13. In the absence of a
qualified positive control, the recombinant proteins were identified by their
reactivity with a monoclonal antibody directed against the CKS portion of each
fusion protein. When the CKS-C.1 protein (SEQUENCE I.D. NO. 404) was
examined by Western blot, using the anti-CKS monoclonal antibody to detect the
recombinant antigen, a single band at approximately 65 kD was observed. This
differs from the expected size of 75 kD for the CKS-C.1 protein (SEQUENCE I.D.
NO. 404). Bands of the expected sizes were observed for the CKS-C.6 protein
(SEQUENCE I.D. NO. 404) and the CKS-C.7 protein (SEQUENCE I.D. NO.
404) when examined by immunoblot.
B. Polystyrene Bead Coating Procedure
The proteins were dialyzed and evaluated for their antigenicity on polystyrene beads
as described in Example 15.
C. ELISA Protocol for Detection of Antibodies to HGBV
The ELISAs were performed as described in Example 15.
D. Detection of HGBV RNA in Serum of infected Individuals
Specimens which were repeatably reactive in the ELISAs were tested for HGBV
RNA as described in section D of Example 15.
E. Tamarin Serological Profiles
None of the sera from the tamarins produced a specific immune response
when tested in the ELISAs utilizing the CKS-C.1 protein, the CKS-C.6 protein, or
the CKS-C.7 protein, all derived from the HGBV-C genome. See Example 15 for
a description of the tamarin serological profiles.
F. Supplemental Testing
As noted in Example 15, specimens which were initially reactive were
typically retested; if the specimen was repeatably reactive, additional tests (e.g.
Western blot) may be performed to further support the ELISA data. For a Western
blot result to be considered positive, a visible band should be observed at 65 kD for
the C.1 protein (SEQUENCE I.D. NO. 404), at 71 kD for the C.6 protein
(SEQUENCE I.D. NO. 404), or at 49 kD for the C.7 protein (SEQUENCE I.D.
NO. 404). Since the Western blot had not been optimized to match or exceed the
sensitivity of the ELISAs, a negative result was not used to discard the ELISA
data. However, a positive result reinforced the reactivity detected by the ELISAs.
As also noted in Example 15, repeatably reactive specimens which have
sufficient volume may be tested by RT-PCR (performed as described in Example
10 using primers corresponding to SEQUENCE I.D. NOS. 8 and 9) to identify
HGBV-C specific nucleotide sequences in serum.
G. Experimental Protocol.
In Example 15, ELISAs employing recombinant antigens from HGBV-B
were utilized to evaluate the presence of antibodies to HGBV-B and HGBV-A in
various human populations. Many of the same specimens were then tested for
antibodies to HGBV-C utilizing the C.1 ELISA employing the CKS-C.1
recombinant protein (SEQUENCE I.D. NO. 404), the C.6 ELISA employing the
CKS-C.6 recombinant protein (SEQUENCE I.D. NO. 404), and the C.7 ELISA
employing the CKS-C.7 recombinant protein (SEQUENCE I.D. NO. 404) coated
on the solid phase (as described in Example 14). As noted in Example 15, all five
of the convalescing tamarins inoculated with HGBV produced a specific but
short-lived antibody response to the HGBV-B recombinant proteins (as detected with the
1.7, 1.4 and 4.1 ELISAs). Although none of the tamarins produced a detectable
antibody response in the C.1, C.6 or C.7 ELISAs, some of the human specimens
produced a specific antibody response to the C.1, C.6, and C.7 recombinant
proteins when tested via Western blot (see Example 13). In the current example, we
evaluated the utility of the C.1, C.6, and C.7 ELISAs in detecting antibodies in
various human populations.
H. Cutoff Determination
The cutoffs for the C.1, C.6, and C.7 ELISAs were determined as
described in Example 15.
I. Serological Data Obtained with Low-Risk Specimens
A population consisting of 100 sera and 100 plasma specimens was obtained from
healthy, volunteer donors in Southeastern Wisconsin and tested for antibodies to
three recombinant proteins from GBV-C: the CKS-C.1 (SEQUENCE
I.D. NO. 404) protein in the C.1 ELISA, the CKS-C.6 (SEQUENCE I.D. NO.
404) protein in the C.6 ELISA, and the CKS-C.7 (SEQUENCE I.D. NO. 404)
protein in the C.7 ELISA.
For the C.1 ELISA, the mean absorbance values for the serum and plasma
specimens were 0.049 [with a standard deviation (SD) of 0.040] and 0.038
(SD=0.029), respectively. The cutoffs for serum and plasma were calculated to be
0.214 and 0.286, respectively. As discussed above, the cutoff value was also
expressed as a factor of the negative control absorbance value; specimens having
S/N values above 10.0 were considered reactive. Using this cutoff, 0 of 100
plasma specimens and 1 of 100 serum specimens were initially reactive and
repeatably reactive for antibodies to the C.1 protein (SEQUENCE I.D. NO. 404).
For the C.6 ELISA, the mean absorbance values for the serum and plasma
specimens were 0.102 [with a standard deviation (SD) of 0.046] and 0.105
(SD=0.047), respectively. Cutoff values were set such that specimens having an
S/N value of 10 or greater were considered reactive. Using this cutoff, three
specimens (two from the serum population and one from the plasma population)
were repeatably reactive (having S/N values of 10 or greater) for antibodies to the
C.6 protein (SEQUENCE I.D. NO. 404).
For the C.7 ELISA, the mean absorbance values for the serum and plasma
specimens were 0.061 [with a standard deviation (SD) of 0.040] and 0.050
(SD=0.055), respectively. Cutoff values were set such that specimens having an
S/N value of 10 or greater were considered reactive. Using this cutoff, none of the
specimens were repeatably reactive for antibodies to the C.7 protein (SEQUENCE
I.D. NO. 404).
Thus, there is evidence that antibodies to the C.1, C.6, or C.7 proteins are
present in approximately 1% of U.S. blood donors (N=200).
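The S/N rule used for the three ELISAs above can be sketched as follows. This is only an illustration: the function names and the absorbance values in the usage note are hypothetical, and the comparison is written as "10 or greater" (the wording used for the C.6 and C.7 ELISAs), whereas the C.1 text says "above 10.0".

```python
def signal_to_negative(sample_abs: float, negative_control_abs: float) -> float:
    """Return the S/N ratio: sample absorbance over negative-control absorbance."""
    if negative_control_abs <= 0:
        raise ValueError("negative control absorbance must be positive")
    return sample_abs / negative_control_abs


def is_reactive(sample_abs: float, negative_control_abs: float,
                sn_cutoff: float = 10.0) -> bool:
    """Assumed rule: a specimen is reactive when its S/N is at or above the cutoff."""
    return signal_to_negative(sample_abs, negative_control_abs) >= sn_cutoff
```

With a hypothetical negative-control absorbance of 0.04, a specimen reading 0.50 (S/N 12.5) would be called reactive and one reading 0.30 (S/N 7.5) would not.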
J. Specimens Considered "At Risk" for Hepatitis
The data for these studies is summarized in TABLE 23.
(i) Specimens from West Africa
A total of 20 of 137 specimens were reactive in one or more of the ELISAs
utilizing GBV-C proteins. A total of 12 of 97 were repeatably reactive in the C.1
ELISA, 3 of 52 were repeatably reactive in the C.6 ELISA, and 5 of 137 specimens
were reactive in the C.7 ELISA. Three of the C.1 reactive specimens were tested
on Western blot and found to be reactive.
These data suggest that HGBV may be endemic in West Africa.
(ii) Specimens from Intravenous Drug Users
A total of 112 specimens were obtained from a population of intravenous
drug users, as part of a study being conducted at Hines Veteran's Administration
Hospital, in Chicago, IL. A total of 2 of 112 specimens were repeatably reactive
for one or more proteins. One specimen was repeatably reactive in the C.1
ELISA, and one specimen was repeatably reactive in the C.7 ELISA. None of these
specimens were positive in the C.6 ELISA.
K. Specimens obtained from individuals with non A-E Hepatitis
The data for these studies is summarized in TABLE 23.
Various populations of specimens (described in Example 15.K) were
obtained from individuals with non-A-E hepatitis and tested with the 1.5, 2.17,
1.18 and 1.22 ELISAs (described in Example 15.C). Due to insufficient sample
volume, not all specimens were tested in all of the ELISAs.
(i) Specimens from Japan
None of a total of 89 specimens were repeatably reactive in the C.1 ELISA.
Due to lack of specimen volume, the specimens were not tested for antibodies in
the C.6 or C.7 ELISAs.
(ii) Specimens from Greece
A total of 67 specimens were tested with the C.1 and C.7 ELISAs. None
of the specimens were reactive.
(iii) Specimens from Egypt
A total of 18 of 132 specimens were reactive in one or more
ELISA. None of the specimens were reactive in the C.1 ELISA. A total of 15
specimens were reactive in the C.6 ELISA and three were reactive in the C.7
ELISA.
(iv) Specimens from U.S. (M set)
A total of 6 specimens were reactive in one or more ELISA. Two
specimens were repeatably reactive in the C.1 ELISA. Four specimens were
repeatably reactive in the C.6 ELISA. None of the specimens were reactive in the
C.7 ELISA.
(v) Specimens from U.S. (T set)
None of the 64 specimens were reactive in either the C.1 or the C.6
ELISAs. One specimen was repeatably reactive in the C.7 ELISA.
(vi) Specimens from various U.S. clinical sites (set 1)
In total, three of 62 specimens were reactive in one or more ELISAs. One
specimen was repeatably reactive in both the C.1 and C.6 ELISAs. Two
specimens were repeatably reactive in the C.7 ELISA.
As we have discussed supra, it is possible that more than one strain of the
HGBV may be present, or that more than one distinct virus may be represented by
the sequences disclosed herein. These are considered to be within the scope of the
present invention and are termed "hepatitis GB Virus" ("HGBV").
L. Statistical Significance of Serological Results
These data indicated that specific antibodies to HGBV-C proteins (i.e.
specimens repeatably reactive for antibodies in the C.1, C.6 and C.7 ELISAs) were
detected among individuals considered "at risk" for exposure to HGBV and among
individuals diagnosed with non A-E hepatitis, and at a low rate among volunteer or
paid blood donors from the U.S. In TABLE 24, the serological results obtained
with the various categories of specimens ("low risk", "at risk" and non A-E
hepatitis patients as shown in TABLE 23) were grouped together and analyzed for
statistical significance using the Chi square test. Unlike the data in TABLE 23,
which compiled the seroprevalence of antibodies to HGBV proteins in the total
number of specimens tested, the data in TABLE 24 reflect the results obtained with
different individuals (persons). For the GBV-C ELISAs, the data indicate that
there is a significant difference (with a p value of 0.000) in comparing the
seroprevalence of anti-HGBV in volunteer blood donors with the individuals
considered "at risk" for exposure to HGBV (West Africa) but not for the IVDUs.
In addition, there was a statistically significant difference between the
seroprevalence of antibodies to HGBV-C in individuals with non A-E hepatitis in
Egypt and the U.S. when compared to volunteer donors. These data suggest that
exposure to HGBV-C was associated with non-A through E hepatitis.
NOTE: although the results of RT-PCR were negative in these initial studies,
subsequent data revealed flavi-like viral sequences in serum of seropositive
individuals (see Example 19).
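
A chi-square comparison of two groups of the kind described in this section can be sketched as below. The text does not say which form of the test was used; this sketch assumes a Pearson chi-square on a 2x2 table with one degree of freedom and no continuity correction, and the function name is hypothetical. The counts in the usage note are specimen-level figures quoted earlier in this Example, not the per-individual data of TABLE 24.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square (no continuity correction) for the 2x2 table
        [[a, b],
         [c, d]]
    where rows are groups and columns are reactive / non-reactive counts.
    Returns (statistic, p_value) for one degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column total must be non-zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

As an illustration only, comparing 20 reactive of 137 West African specimens with 4 reactive of 200 low-risk donor specimens (the latter assuming that the one C.1-reactive and three C.6-reactive donor specimens are distinct) gives a statistic of roughly 19.5 and a p value below 0.001, which would print as 0.000 at three decimal places.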

Example 21. Presence of HGBV-C in humans with non-A-E hepatitis.
The generation of HGBV-C-specific ELISAs allowed the identification of
immunopositive sera from patients with non-A-E hepatitis (Example for HGBV-C
serology). These sera, together with several HGBV-A and/or HGBV-B-immunopositive
sera from individuals with documented cases of non-A-E hepatitis
(TABLE 25) were examined by RT-PCR for HGBV-C sequences. To increase the
likelihood of detecting HGBV-C variants, RT-PCR was performed using
degenerate NS3 oligonucleotide primers in a first round of amplification followed
by a second round of amplification with nested GB-C-specific primers. Briefly,
the first round amplification was performed on serum cDNA products generated as
described in Example 6, using 2 mM MgCl2 and 1 µM primers ns3.2-s1 and
ns3.2-a1 (SEQ. ID. NOS. 711 and 712, respectively). Reactions were subjected
to 40 cycles of denaturation-annealing-extension [three cycles of (94°C, 30 sec;
37°C, 30 sec; 2 min ramp to 72°C; 72°C, 30 sec) followed by 37 cycles of (94°C,
30 sec; 50°C, 30 sec; 72°C, 30 sec)] followed by a 10 min extension at 72°C.
Completed reactions were held at 4°C. A second round of amplification was
performed utilizing 2 mM MgCl2, 1 µM GB-C-specific primers (SEQUENCE I.D.
NOS. 675 and 676), and 4% of the first PCR products as template. The second
round of amplification employed a thermocycling protocol designed to amplify
specific products with oligonucleotide primers that may contain base pair
mismatches with the template to be amplified [Roux, BioTechniques 16:812-814
(1994)]. Specifically, reactions were thermocycled 43 times (94°C, 20 sec; 55°C
decreasing 0.3°C/cycle, 30 sec; 72°C, 1 min) followed by 10 cycles (94°C, 20 sec;
40°C, 30 sec; 72°C, 1 min) with a final extension at 72°C for 10 minutes. PCR
products were separated by agarose gel electrophoresis, visualized by UV
irradiation after direct staining of the nucleic acid with ethidium bromide, then
hybridized to a radiolabeled probe for GB-C after Southern transfer to Hybond-N+
nylon filter. PCR products were cloned and sequenced as described in the art.
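
The annealing temperatures implied by the second-round protocol above can be tabulated with a short sketch. It assumes that the first of the 43 cycles anneals at 55°C and that the 0.3°C decrement applies from the second cycle onward; the text does not state this explicitly, and the function and parameter names are hypothetical.

```python
def touchdown_schedule(start_tenths: int = 550, step_tenths: int = 3,
                       touchdown_cycles: int = 43,
                       final_anneal_c: float = 40.0,
                       final_cycles: int = 10) -> list[float]:
    """Return the annealing temperature (deg C) for every cycle.

    Temperatures are tracked in tenths of a degree so that repeated
    0.3 deg C decrements do not accumulate floating-point error."""
    touchdown = [(start_tenths - step_tenths * cycle) / 10
                 for cycle in range(touchdown_cycles)]
    return touchdown + [final_anneal_c] * final_cycles


schedule = touchdown_schedule()
print(len(schedule), schedule[0], schedule[42], schedule[-1])
```

Under that assumption the 43rd cycle anneals at 55 - 0.3 x 42 = 42.4°C, just above the fixed 40°C used for the last 10 cycles.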
Using the above methodology, GB-C.4, GB-C.5, GB-C.6 and GB-C.7
were obtained. These sequences are 82.1-86.6% identical to GB-C (SEQUENCE
I.D. NO. 400, bases 4167-4365). FIGURE 40 displays the sequence differences
of GB-C.4, GB-C.5, GB-C.6 and GB-C.7 aligned to the homologous region of
GB-C in the predicted codon triplets. As demonstrated, a majority of the
nucleotide differences do not result in amino acid changes from GB-C. This
overall sequence conservation at the amino acid level suggests that GB-C.4,
GB-C.5, GB-C.6 and GB-C.7 were derived from different strains of the same virus,
HGBV-C. In addition, the level of sequence divergence at the nucleotide level
demonstrates that these PCR products are not a result of contamination with any of
the previously identified GB-C sequences.
Three of these individuals (the sources of GB-C.4, GB-C.5 and GB-C.7)
had no evidence of infection with hepatitis A, hepatitis B or hepatitis C viruses.
The presence of GB-C sequences in these individuals with hepatitis of unknown
etiology suggests that HGBV-C is one of the causative agents of human hepatitis.
Serial samples were available for two of the individuals (containing GB-C.4 and
GB-C.5). To follow the HGBV-C sequence in these samples, clone-specific RT-PCRs
were developed. Briefly, nucleic acids extracted from serum were reverse
transcribed using random hexamers as in Example 7. The resultant cDNAs were
subjected to 40 cycles of amplification (94°C, 30 sec; 55°C, 30 sec; 72°C, 30 sec)
followed by an extension at 72°C for 10 min. GB-C.4- or GB-C.5-specific PCR
primers (GB-C.4-s1 and GB-C.4-a1, or GB-C.5-s1 and GB-C.5-a1, respectively)
were used at 1.0 µM concentration. PCR products were separated by agarose gel
electrophoresis, visualized by UV irradiation after direct staining of the nucleic acid
with ethidium bromide, then hybridized to a radiolabeled probe after Southern
transfer to Hybond-N+ nylon filter.
GB-C.4 was found in sera from an Egyptian patient with acute non-A-E
hepatitis. This patient was seropositive for a HGBV-A protein (see HGBV-A
ELISA Example). RT-PCR of five serial samples from the Egyptian patient
demonstrated a viremia that persisted for at least 20 days after normalization of the
serum ALT values (TABLE 26). The presence of GB-C sequence after serum
ALT normalization suggested that HGBV-C may establish chronic infections in
some individuals. However, the absence of additional samples from this patient
prevents a conclusion as to the chronic nature of HGBV-C. Additional samples
are being pursued to resolve this question.
GB-C.5 was obtained from a Canadian patient with hepatitis-associated
aplastic anemia. Each sample from this patient was seropositive in the C.7 ELISA
(Example 20). GB-C.5 was detected in the samples obtained from the Canadian
patient during aplastic anemia (day 13 post-presentation) and at the time of death
(day 14, FIGURE 41) using GB-C.5-specific primers (GB-C.5-s1 and GB-C.5-a1).
However, GB-C.5-specific PCR failed to detect GB-C.5 sequence at the time
of presentation (day 0, acute hepatitis) and on day 3 (liver failure). Thus, it is
unclear whether GB-C.5 was present below the limit of detection in the first
samples. If so, HGBV-C may have been the causative agent of this patient's
aplastic anemia. However, because GB-C.5 was detected by RT-PCR only during
aplastic crisis, GB-C.5 may have been acquired from a blood product administered
to combat the anemia. In this case, HGBV-C's association with aplastic anemia
would be similar to HCV's [Hibbs, et al. JAMA 267:2051-2054 (1992)].
Due to the distant relation of HGBV-C and HCV, it was of interest to
determine whether current methods for detecting HCV infection would recognize
human samples containing HGBV-C. Routine detection of individuals exposed to
or infected with HCV relies upon antibody tests which utilize antigens derived
from three or more regions of HCV-1. These tests allow detection of antibodies to
all of the known genotypes of HCV in most individuals [Sakamoto, et al. J. Gen.
Virol. 75:1761-1768 (1994); Stuyver, et al. J. Gen. Virol. 74:1093-1102 (1993)].
Second generation ELISAs for HCV were performed on the samples that contain
HGBV-C as described in Example 10 (TABLE 25). One of the 4 samples that
contain HGBV-C was seropositive for HCV antigens. A limited number of human
sera which are seronegative for HCV have been shown to be positive for HCV
genomic RNA by a highly sensitive RT-PCR assay [Sugitani, 1992 #65]. A
similar RT-PCR assay (as described in Example 9) confirmed the presence of an
HCV viremia in the seropositive sample. However, none of the HCV seronegative
samples were HCV viremic. Therefore, although 1 of the 4 individuals containing
HGBV-C sequences has evidence of HCV infection, the current assays for the
presence of HCV did not accurately predict the presence of HGBV-C. The one
HCV-positive patient appears to be co-infected with HGBV-C. It is unclear
whether the hepatitis noted in this patient was due to HCV, HGBV-C or the
presence of both viruses. That HGBV-C and HCV are found in the same patient
may suggest that common risk factors exist for acquiring these infections.


Using the PCR protocol described above, GB-C sequences (~85%
identical to the previous GB-C isolates shown in FIGURE 41, data not shown)
were identified in "normal" units of blood from two volunteer U.S. donors obtained
in 1994. These units tested negative for HBV and HCV, and had normal serum ALT
values. However, these units tested positive in the 1.4 ELISA. Finding HGBV-C
in at least two units of "normal" blood out of ~1000 units immunoscreened
suggests that this virus is currently in the U.S. blood supply. However, using
ELISAs developed from HGBV proteins and nucleotide probes from HGBV
sequences, we demonstrate that these units of blood can be identified.
The large amount of sequence variation in the various GB-C sequences
(FIGURE 41) should be noted. Although highly sensitive, PCR based assays for
viral nucleic acids are dependent on the sequence match between oligonucleotide
primers and the viral template. Therefore, because the PCR primers utilized in this
study were located in a region of the HGBV-C genome that is not well conserved
in various isolates, not all HGBV-C viremic samples tested may have been
detected by the RT-PCR assays employed here. Utilization of PCR primers from a
highly conserved region of the HGBV-C genome, as have been found in the HCV
5' untranslated region [Cha, et al. J. Clin. Microbiol. 29:2528-2534 (1991)],
should allow more accurate detection of HGBV-C viremic samples.
TABLE 25
GB-C containing sera

Sequence   Origin     Clinical               GB reactivity (1)   HCV ELISA (2)   HCV RNA
GB-C.4     Egyptian   Acute hepatitis        A                   0.25            0
GB-C.5     Canada     HA-AA (3)              C                   0.15            0
GB-C.6     U.S.       history of hepatitis   C                   11.51           +
GB-C.7     U.S.       hepatitis              A                   0.26            0

(1) Immunoreactivity detected to recombinant HGBV protein(s) from virus A, B or C.
(2) Sample to cutoff values reported. Values ≥1 (underlined) are considered positive.
(3) Hepatitis-associated aplastic anemia.

TABLE 26.
Egyptian Serial Samples

Days post-presentation   ALT (U/l) (1)   2.17 ELISA Reactivity (2)   GB-C.4 RT-PCR
0 128 61.0 +
78 62.9 +
49 69.4 +
33 39.1 +
55.9 +

(1) Upper limit of normal: 45 U/l.
(2) Sample to normal reported. Values ≥10 are considered positive.
Example 21. Sequence Comparisons and Phylogenetic Analysis
Information about the degree of relatedness of viruses can be obtained by
performing comparisons, i.e. alignments, of nucleotide and predicted amino acid
sequences. Performing alignments of the HGBV sequences with sequences of
other viruses can provide a quantitative assessment of the degree of similarity and
identity between the sequences. This information can then be used to develop a
rationale for the taxonomic classification of the HGBV viruses. In general, the
calculation of similarity between two amino acid sequences is based upon the
degree of likeness exhibited between the side chains of an amino acid pair in an
alignment. The degree of likeness is based upon the physical-chemical
characteristics of the amino acid side chains, i.e. size, shape, charge,
hydrogen-bonding capacity, and chemical reactivity; thus, similar amino acids possess side
chains that have similar physical-chemical characteristics. For example,
phenylalanine and tyrosine are amino acids containing aromatic side chains and
are, therefore, regarded as chemically similar. A discussion of the chemistry of
amino acids can be found in any basic biochemistry textbook, for example,
Biochemistry, Third Edition, Lubert Stryer, Editor, W.H. Freeman and Company,
New York, 1988. The calculation of identity between two aligned amino acid
sequences is, in general, an arithmetic calculation which counts the number of
identical pairs of amino acids in the alignment and divides this number by the
length of the sequence(s) in the alignment. Analogous to the method used for
amino acid sequence alignments, the determination of the degree of identity
between two aligned nucleotide sequences is an arithmetic calculation which counts
the number of identical pairs of nucleotide bases in the alignment and divides this
number by the length of the sequence(s) in the alignment. The calculation of
similarity between two aligned nucleotide sequences sometimes uses different
values for transitions and transversions between paired (i.e. matched) nucleotides
at various positions in the alignment; however, the magnitude of the similarity and
identity scores between pairs of nucleotide sequences are usually very close, i.e.
within one to two percent.
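
The identity and similarity calculations described above can be sketched as follows. Several choices here are assumptions rather than anything stated in the text: the denominator is the full alignment length including gapped columns, gaps never count as matches, and the amino acid groupings in `SIMILAR_GROUPS` are an illustrative partition (with phenylalanine and tyrosine placed together, as in the example above), not the scoring scheme of any particular program.

```python
# Illustrative side-chain groupings (an assumption, not a published matrix).
SIMILAR_GROUPS = [set("FYW"), set("ILVM"), set("KRH"),
                  set("DE"), set("ST"), set("NQ")]


def _check(aligned_a: str, aligned_b: str) -> None:
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")


def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Identical, non-gap pairs divided by the alignment length, as a percentage."""
    _check(aligned_a, aligned_b)
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != gap)
    return 100.0 * identical / len(aligned_a)


def percent_similarity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Pairs that are identical or share a side-chain group, as a percentage."""
    _check(aligned_a, aligned_b)

    def alike(x: str, y: str) -> bool:
        if gap in (x, y):
            return False
        return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)

    similar = sum(1 for x, y in zip(aligned_a, aligned_b) if alike(x, y))
    return 100.0 * similar / len(aligned_a)
```

For the toy alignment "ACGT-A" over "ACCTTA", four of six columns are identical, so the identity is about 66.7%. For the peptides "FKD" and "YRA" the identity is 0% but, with these groupings, the similarity is about 66.7%.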
As has been stated earlier, limited identity exists between amino acid
sequences of the HGBV agents and hepatitis C genotypes. In order to more
accurately determine the degree of relatedness between the HGBV agents and
HCV, amino acid sequence alignments were performed using the sequence of the
entire large open reading frame (ORF) of HGBV-A, B, and C, and the amino acid
sequence of the large ORF of several representative HCV isolates. In addition, the
degree of relatedness between the HGBV agents and HCV at the nucleotide level
was determined using the entire genomic nucleotide sequence of HGBV-A, B, and
C, and that of several representative HCV isolates. Alignment of the amino acid
and nucleotide sequences was performed using the program GAP of the Wisconsin
Sequence Analysis Package (Version 8) which is available from the Genetics
Computer Group, Inc., 575 Science Drive, Madison, Wisconsin, 53711. The gap
creation and gap extension penalties were 5.0 and 0.3, respectively, for nucleic
acid sequence alignments, and 3.0 and 0.1, respectively, for amino acid sequence
comparisons. The GAP program uses the algorithm of Needleman and Wunsch
(J. Mol. Biol. 48:443-453, 1970) to calculate the degree of similarity and identity,
expressed as percentages, between the two sequences being aligned.
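
A minimal sketch of Needleman-Wunsch global alignment is given below to make the dynamic-programming idea concrete. It is deliberately simplified and is not the GAP program: it uses a single linear gap penalty instead of separate gap creation and extension penalties, and a plain match/mismatch score instead of a substitution matrix. All names and default scores are assumptions chosen for illustration.

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -1):
    """Global alignment with a linear gap penalty.

    Returns (score, aligned_a, aligned_b), using '-' for gaps."""
    n, m = len(a), len(b)

    def sub(i: int, j: int) -> int:
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),  # pair a[i-1], b[j-1]
                              score[i - 1][j] + gap,            # gap in b
                              score[i][j - 1] + gap)            # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

The two aligned strings returned have equal length, so they can be passed directly to an identity calculation. Reproducing GAP's behavior would additionally require affine gap penalties (e.g. the 5.0/0.3 and 3.0/0.1 values quoted above) and the program's own scoring tables.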
The nucleotide and amino acid sequences of selected members of the major
hepatitis C virus (HCV) genotypes were obtained from GenBank and are shown
below with their respective accession numbers:

TABLE 27
HCV Isolate   Genotype designation   GenBank Accession Number
HCV-1         1a                     M62321
HCV-JK1       1b                     X61596
HCV-J6        2a                     D00944
HCV-J8        2b                     D10988
HCV-K3a       3a                     D28917
HCV-Tr        3b                     D26556
Results of pairwise comparisons of the predicted amino acid sequences of the large
open reading frame (i.e. putative precursor polyprotein) and the nucleotide
sequences between each of the above HCV genotypes and each of the HGBV
isolates are shown in Tables 28 and 29, respectively. The genotype designation
of each of the HCV isolates, which is based on the system of nomenclature for
HCV isolates described by Simmonds P. et al. (1994) Hepatology, 19:1321-1324,
is shown in the top row.
The data shown in TABLE 28 demonstrate that the lower limit of amino
acid sequence identity between the HCV genotypes is 69%. This value is very
close to that shown by Simmonds et al. [Simmonds, P. et al. Hepatology.
19:1321-1324, 1994] who reported that comparisons of the coding region (i.e.
large open reading frame) of eight complete HCV genomes from two major groups
showed amino acid sequence similarities of 67.1% to 68.6%; however, these
authors did not describe the method by which the similarities were calculated. This
value (69%) is also very close to the value of 71-84% identity reported by
Okamoto et al. [Virology, 188:331-341, 1992] for comparisons of HCV-J8 with
other major HCV isolates; however, these investigators did not describe the
method by which the identities were calculated. Comparisons of the HGBV
polyprotein sequences with each of the HCV genotypes reveal that the HGBV-encoded
polyprotein sequences exhibit no more than 33% identity to any of the
HCV polyproteins (TABLE 28). A comparison of the nucleotide sequences
(TABLE 29) demonstrates a maximum sequence identity of 44.2% between any
HGBV virus and any HCV isolate, whereas the minimum nucleotide sequence
identity between HCV isolates is 64.9%. Therefore, since HGBV-A, B, and C
possess nucleotide and predicted amino acid sequence identity with HCV that is
well outside the range of identities established for the known HCV genotypes, the
HGBV viruses cannot be considered genotypes of the hepatitis C viruses.
The relationship between the hepatitis C viruses and the hepatitis GB
viruses can be examined by performing phylogenetic analysis on their aligned
nucleotide or deduced amino acid sequences (i.e. large open reading frames) or on
a portion of these sequences. This approach has been applied to the hepatitis C
viruses and showed that the variability of HCV isolates delineated six equally
divergent main groups of sequences [Simmonds, P. et al., J. Gen. Virol. (1993)
74:2391-2399 and Simmonds, P. et al., J. Gen. Virol. (1994) 75:1053-1061].
This analysis resulted in the establishment of a system of nomenclature for the
hepatitis C viruses [Simmonds, P. et al. Hepatology. 19:1321-1324, 1994] where
the isolates are classified into genotypes based upon the evolutionary distance
between sequences.


In order to determine the phylogenetic relationship between the hepatitis
GB viruses and the hepatitis C viruses, alignments of amino acid sequences within
the putative helicase gene of NS3 and the putative RNA-dependent RNA
polymerase (RdRp) of NS5B were performed. Also included in the alignments
were related sequences from other viruses in the Flaviviridae and viruses that have
been shown to possess evolutionary relatedness within their helicase or
polymerase genes to members of the Flaviviridae [Koonin, E.V. & Dolja, V.V.
(1993) Crit. Rev. Biochem. Mol. Biol. 28, 375-430 and Koonin, E.V. (1991) J.
Gen. Virol. 72, 2179-2206].
The amino acid sequence alignments were made using the program
PILEUP of the Wisconsin Sequence Analysis Package (version 8). Phylogenetic
distances between pairs of aligned sequences were determined using the
PROTDIST program of the PHYLIP package (version 3.5c, 1993) kindly
provided by J. Felsenstein [Felsenstein, J. (1989) Cladistics 5:164-166]. These
computed distances were used for the construction of phylogenetic trees using the
program NEIGHBOR (neighbor-joining setting). The trees were plotted using the
program DRAWTREE. The trees shown are not rooted. The viral sequences used
and their corresponding GenBank accession numbers are shown in TABLES 31.
The evolutionary distances between each HCV genotype and each of the HGBV
viruses for alignments made within the helicase, RdRp, or complete large open
reading frame are presented below in TABLES 32, 33, and 34 respectively. The
distances calculated between the HCV genotypes or the HGBV viruses and the
other viruses listed in TABLE 30 are not shown. The phylogenetic trees produced
for amino acid alignments of the viral helicases, RdRps, or complete large open
reading frame sequences are shown in FIGURES 42, 43 and 44, respectively.
Amino acid sequence alignments of the putative RdRps, encoded within the
NS5B region, of HGBV-A, B and C with the RdRp of several HCV genotypes,
two of the pestiviruses, several representative flaviviruses, and several
positive-strand RNA plant viruses, show that they possess conserved sequence motifs
associated with the RdRps of positive-strand RNA viruses (data not shown).
Based on similar analyses, the HGBV-A and HGBV-B encoded helicases show
significant identity with the helicases of these positive-strand RNA viruses (data
not shown), with the exception of CARMV, TCV, and MNSV which presumably
do not possess helicase genes [Guilley, H. et al. (1985) Nucleic Acids Res.
13:6663-6677]. These results were not unexpected in view of the association of
the helicase and RdRp genes of these viruses into supergroups demonstrated by
previous phylogenetic analyses [Koonin, E.V. & Dolja, V.V. (1993) Crit. Rev.
Biochem. Mol. Biol. 28, 375-430]. However, examination of the phylogenetic
distances between the HGBV isolates and the HCV isolates based upon alignment
of the helicase or RdRp sequences (TABLES 30 and 31) demonstrates that there is
considerable distance between the members of these two groups. The distances
calculated demonstrate the close relationship among the HCV genotypes, where the
maximum distance between any two genotypes is 0.3696 (RdRp distance).
However, the distances calculated from the RdRp alignment between HGBV-A,
-B, or -C and any member of the HCV group range from 0.96042 to 1.46261.
Similarly, the distances calculated from the helicase alignments for any two HCV
genotypes range from 0.044555 to 0.19706, while distances between any member of
the HCV group and HGBV-A, -B, or -C range from 0.69130 to 0.87120. In addition,
alignment of the predicted amino acid sequence of the entire large open reading
frames of the HCV genotypes and the GB viruses demonstrates a narrow range of
evolutionary distance for the HCV isolates (0.17918-0.39646) while the minimum
distance between any GB virus and any HCV isolate is 1.68650. Thus, the
hepatitis GB viruses exhibit evolutionary distances that are clearly outside the
range demonstrated for the hepatitis C virus genotypes.
The phylogenetic analysis of the HGBV and HCV sequences attempts
to answer the question, "How does the divergence of the HGBV sequences from
the HCV sequences compare with the divergence among the HCV sequences? In
particular, might it be that the HGBV sequences are no more diverged from HCV
sequences than the HCV sequences are from one another?" A reasonable condition
to be met, if the HGBV sequences were no more diverged from HCV sequences
than HCV sequences are from one another, would be that the HGBV-A, HGBV-B,
and/or HGBV-C sequences would be at least as close to one of the HCV
sequences as the most distantly related pair of HCV sequences (i.e., the minimum
distance from any HGBV sequence to any HCV sequence is less than or equal to
the maximum observed distance among HCV sequences). This condition is not
met by the present sequence data; in Table 31 (RdRp alignment), the minimum
HCV-HGBV distance is 2.83 times the maximum HCV-HCV distance; and in
Table 32 (helicase alignment), the minimum HCV-HGBV distance is 3.51 times
the maximum HCV-HCV distance. Thus, the data do not support the idea that the
HGBV sequences are members of a group whose diversity is delimited by
previously characterized members of the HCV group.
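
The condition described above can be illustrated with a short calculation. The following is only a sketch: it uses the helicase and large open reading frame distance ranges quoted in the text, and the function and variable names (separation_ratio, quoted_ranges) are hypothetical and do not come from the PHYLIP package.

```python
# Sketch of the criterion: min(HCV-HGBV distance) / max(HCV-HCV distance).
# The distances are those quoted in the text; all names are illustrative.

quoted_ranges = {
    "helicase": {"max_hcv_hcv": 0.19706, "min_hcv_hgbv": 0.69130},
    "large ORF": {"max_hcv_hcv": 0.39646, "min_hcv_hgbv": 1.68650},
}


def separation_ratio(min_between, max_within):
    """Return the smallest between-group distance divided by the largest
    within-group distance. A value of 1 or less would mean that some HGBV
    sequence is no farther from an HCV sequence than two HCV sequences
    are from each other."""
    return min_between / max_within


for region, d in quoted_ranges.items():
    ratio = separation_ratio(d["min_hcv_hgbv"], d["max_hcv_hcv"])
    condition_met = ratio <= 1.0
    print(f"{region}: ratio = {ratio:.2f}, within HCV diversity: {condition_met}")
```

A ratio above 1 for every region corresponds to the statement that the condition is not met by the sequence data.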
The distribution of these relative distances can be examined with a test
based on the bootstrap [Efron, B. (1982) "The jackknife, the bootstrap, and other
resampling plans", Society for Industrial and Applied Mathematics: Philadelphia;
Efron, B. and Gong, G. (1983) "A leisurely look at the bootstrap, the jackknife,
and cross-validation." Am. Stat. 37: 36-48]. The results obtained from the
bootstrap sampling are shown in Table 32, which shows the comparison of the
HCV-HGBV divergence (minimum of all HCV-HGBV distances) to the HCV
diversity (maximum of all HCV-HCV distances) based on PAM distances as
calculated using the PROTDIST program. In 1000 bootstrap resamplings of the
columns in the sequence alignments, the greatest divergence among HCV
sequences was never as large as the smallest of the divergences of the HGBV
sequences from the HCV sequences (Table 32). Thus, in independent
measurements based on alignments of coding regions from two separate genes,
there was not a single instance in which the data were consistent with the HGBV
sequences falling within the genetic sequence diversity of HCV genotypes.
Leaning in the direction of a conservative estimate, there is less than one chance in
100,000 that the data for the HGBVs could be drawn from the same pool of
sequences as the HCV sequences.

TABLE 32
(a) Distances Determined from RdRp Alignment
Out of 1000 bootstrap samples:
Average min(HCV-HGBV distance)/max(HCV-HCV distance) = 2.543645 +/- 0.367443
Minimum min(HCV-HGBV distance)/max(HCV-HCV distance) = 1.617575
(b) Distances Determined from Helicase Alignment
Out of 1000 bootstrap samples:
Average min(HCV-HGBV distance)/max(HCV-HCV distance) = 3.346040 +/- 0.511875
Minimum min(HCV-HGBV distance)/max(HCV-HCV distance) = 2.092055
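
The column resampling summarized in Table 32 can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run: the distance used here is a simple proportion of differing sites (a stand-in for the PAM distances computed by PROTDIST), the toy alignment is invented, and every name in the block is hypothetical.

```python
import random


def p_distance(a, b):
    """Fraction of aligned, ungapped positions at which two sequences differ.
    This is a simple stand-in for a PAM-based distance."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return sum(x != y for x, y in pairs) / len(pairs)


def min_max_ratio(aln, group_a, group_b):
    """min(distance between groups) / max(distance within group_a)."""
    within = [p_distance(aln[i], aln[j]) for i in group_a for j in group_a if i < j]
    between = [p_distance(aln[i], aln[j]) for i in group_a for j in group_b]
    max_within = max(within)
    if max_within == 0:
        return float("inf")  # no diversity left within group_a in this resample
    return min(between) / max_within


def bootstrap_ratios(aln, group_a, group_b, n_samples=1000, seed=0):
    """Resample alignment columns with replacement and recompute the ratio."""
    rng = random.Random(seed)
    names = list(aln)
    length = len(aln[names[0]])
    ratios = []
    for _ in range(n_samples):
        cols = [rng.randrange(length) for _ in range(length)]
        resampled = {n: "".join(aln[n][c] for c in cols) for n in names}
        ratios.append(min_max_ratio(resampled, group_a, group_b))
    return ratios


# Invented toy alignment; these are not real viral sequences.
toy_alignment = {
    "HCV_1": "MKTAYIAKQR",
    "HCV_2": "MKTAYIAKQK",
    "HCV_3": "MKSAYIAKQR",
    "HGBV_A": "MRSGWLPKHE",
    "HGBV_B": "MRTGWLPRHD",
}
hcv_names = ["HCV_1", "HCV_2", "HCV_3"]
hgbv_names = ["HGBV_A", "HGBV_B"]

ratios = bootstrap_ratios(toy_alignment, hcv_names, hgbv_names)
inside = sum(r <= 1.0 for r in ratios)
print(f"resamples with ratio <= 1: {inside} of {len(ratios)}")
print(f"smallest ratio observed: {min(ratios):.3f}")
```

The count of resamples with a ratio of 1 or less plays the role of the "never as large as" statement in the text; the smallest ratio corresponds to the "Minimum" rows of Table 32.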

Assuming that the HCV sequences utilized in this study are representative
of the most divergent of the HCV genotypes, these results indicate that HGBV-A,
B and C are not genotypes of HCV. In addition, it appears that HGBV-A and
HGBV-C are more closely related to each other than either is to HGBV-B, which
suggests that HGBV-A and HGBV-C may be representatives of a separate viral
lineage. Similarly, HGBV-B may be the sole representative of its own viral
lineage. The relative evolutionary distances between the viral sequences analyzed
are readily apparent upon inspection of the unrooted phylogenetic trees presented in
Figures 45 and 46, where the branch lengths are proportional to the evolutionary
distance. The close evolutionary relationship of the HCV viruses is apparent and
is consistent whether the analysis is performed using a portion of the encoded
genomic sequence or the entire genome (FIGURE 44). The large degree of
divergence between HGBV-A, HGBV-B, and HGBV-C and other Flaviviridae
members demonstrates that, while being most closely related to the hepatitis C
viruses, the GB-agents cannot be considered genotypes of HCV and may actually
be representatives of a new virus group, or groups, within the Flaviviridae.
The present invention thus provides reagents and methods for determining
the presence of HGBV-A, HGBV-B and HGBV-C in a test sample. It is
contemplated and within the scope of the present invention that a polynucleotide or
polypeptide (or fragment[s] thereof) specific for HGBV-A, HGBV-B and HGBV-C
described herein, or antibodies produced from these polypeptides and
polynucleotides, can be combined with commonly used assay reagents and
incorporated into current assay procedures for the detection of antibody to these
viruses. Alternatively, the polynucleotides or polypeptides specific for HGBV-A,
HGBV-B and HGBV-C (or fragment[s] thereof) described herein, or
antibodies produced from such polypeptides and polynucleotides (or fragment[s]
thereof), can be used separately for detection of the HGBV-A, HGBV-B and
HGBV-C viruses.
Other uses or variations of the present invention will be apparent to those
of ordinary skill in the art when considering this disclosure. Therefore, the
present invention is intended to be limited only by the appended claims.

[Tables on pages 155 to 163 of WO 95/21922: contents not recoverable from the text extraction.]

Table 16 SEROLOGIC RESULTS HGBV-B
POS/TOTAL

CATEGORY  SPECIMENS  1.4 ELISA  4.1 ELISA  1.7 ELISA  TOTAL

Individuals Assumed "Low Risk" for HGBV Exposure
Volunteer Blood Donors
1  0/200  0/200  0/200  0/200
2  4/200  4/200
Interstate Blood Bank  9/760  ND  0/760  9/760

Individuals Assumed "At Risk" for HGBV Exposure
Intravenous Drug Users  13/112  5/112  3/112  9/112
2  1/99  0/99  0/99  1/99
Western Africa  91/1300  51/1300  43/1300  181/1300
Hemophiliacs  2/100  ND  1/100  2/100

Individuals with "Non A-E Hepatitis"
Clinics in Japan  0/180  7/89  2/180  9/180
Clinics in Greece  4/73  0/67  3/73  5/73
Clinics in U.S. (SET M)  1/72  2/72  3/72  4/72
Clinics in U.S. (SET T)  0/64  0/64  0/64  0/64
Clinics in U.S.  0/62  2/62  2/62  3/62
Clinics in Egypt  9/132  1/132  9/132  11/132
Clinics in New Zealand  2/56  1/56  1/56  4/56
Clinics in Costa Rica  2/100  ND  1/100  2/100
Clinics in Pakistan  2/82  ND  2/82  4/82
Clinics in Italy  0/10  0/10  0/10  0/10
Clinics in U.S. SET 1  0/56  ND  0/56  0/56
SET 2  0/20  ND  0/20  0/20
SET 3  3/51  ND  1/51  3/51

TABLE 17 HGBV-B Serological Results

                          Repeatably Reactive   Negative in
                          in 1.4, 1.7 or        1.4, 1.7 or
                          4.1 ELISA             4.1 ELISA         x2*  SIG**
Volunteer Blood Donors    0                     200
IBB Ohio                  9                     751               - ???*
Intravenous Drug Users    1                     99                - NS*
(US)                      9                     103               ???
West Africa               181                   1119              ???-
Clinics in Japan          4                     81                ???*
  in New Zealand          4                     52                ???*
  in Greece               1                     10                - ???*
  in Egypt                5                     20                - ???*
  in U.S.
    Set 1                 0                     56                NS*
    Set 2                 0                     20                NS*
    Set 3                 3                     51                ???
    Set M                 4                     68                ????
    Set T                 0                     64                NS*
Assumed Low Risk          0                     200
Paid Blood Donors         9                     751               ???
Assumed High Risk         191                   1321              ??
Non A-E Hepatitis         21                    431               - NS*
* Chi square value obtained by applying the Chi square test. ** Determination of statistical significance
based upon the Chi square analysis. † Not statistically significant by the Chi square test. Statistically
significant by the Chi square test, with p<0.050.
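
The Chi square comparison referred to in the table notes can be sketched for a single row. This is an illustration under stated assumptions: it takes the West Africa and volunteer blood donor counts from Table 17, assumes that each population is compared against the volunteer donors in a 2 x 2 table, and uses the uncorrected Pearson statistic. The text does not state which comparison group or correction was applied, and the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the 2 x 2 table
    [[a, b], [c, d]], where rows are groups and columns are
    reactive / negative counts."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / denom


# Chi-square critical value for 1 degree of freedom at p = 0.05.
CRITICAL_P05_DF1 = 3.841

# Counts as printed in Table 17 (reactive, negative).
west_africa = (181, 1119)
volunteers = (0, 200)  # assumed comparison group

chi2 = chi_square_2x2(*west_africa, *volunteers)
significant = chi2 > CRITICAL_P05_DF1
print(f"chi-square = {chi2:.2f}; significant at p<0.05: {significant}")
```

A statistic above the critical value corresponds to the "statistically significant, with p<0.050" note; a continuity-corrected or exact test would give somewhat different numbers.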

Table 18 SEROLOGIC RESULTS HGBV-A
POS/TOTAL

CATEGORY  SPECIMENS  1.18 ELISA  2.17 ELISA  1.22 ELISA  1.5 ELISA  TOTAL REACTIVE

Individuals Assumed "Low Risk" for HGBV Exposure
Volunteer Blood Donors
1  0/200  1/200  0/200  0/200  1/200
Interstate Blood Bank  ND  ND  ND  0/760  0/760

Individuals Assumed "At Risk" for HGBV Exposure
Intravenous Drug Users  1/112  1/112  0/112  0/112  2/112
Western Africa  9/353  43/817  6/817  58/1300  91/1300

Individuals with "Non A-E Hepatitis"
Clinics in Japan  0/89  1/89  ND  4/89  3/89
Clinics in Greece  0/67  0/67  0/67  0/67  0/67
Clinics in U.S. (Mayo)  3/72  2/72  4/72  0/72  7/72
Clinics in U.S. (Thiele)  0/64  0/64  0/64  0/64  1/64
Clinics in U.S. (1/3)  1/62  2/62  2/62  0/62  3/62
Clinics in Egypt  0/132  7/132  0/132  0/132  7/132
Clinics in New Zealand  ND  ND  ND  0/56  ND

Separate ELISAs were developed and cutoffs determined.
ND: Not Done

TABLE 19 HGBV-A Serological Results

                          Repeatably Reactive   Negative in
                          in 1.18, 2.17,        1.18, 2.17,
                          1.22, or 1.5 ELISA    1.22, or 1.5 ELISA  x2*  SIG**
Volunteer Blood Donors    1                     199
IBB Ohio                  0                     760                 - NS*
Intravenous Drug Users
(US)                      2                     110                 NS*
West Africa               91                    1209                ???-
Clinics in Japan          2                     83                  - ???*
  in New Zealand          0                     56                  - NS*
  in Greece               0                     11                  - NS*
  in Egypt                3                     22                  - ???*
  in U.S.
    Set 1                 ND                    ND
    Set 2                 ND                    ND
    Set 3                 ND                    ND
    Set M                 7                     65                  ???
    Set T                 1                     63                  ???
Assumed Low Risk          1                     200
Paid Blood Donors         0                     760                 NS*
Assumed High Risk         93                    1319                ???-
Non A-E Hepatitis         13                    300                 - ?????*
* Chi square value obtained by applying the Chi square test. ** Determination of statistical significance
based upon the Chi square analysis. † Not statistically significant by the Chi square test. Statistically
significant by the Chi square test, with p<0.050.

Table 23 SEROLOGIC RESULTS HGBV-C
POS/TOTAL

CATEGORY  SPECIMENS  C.7 ELISA  C.1 ELISA  C.6 ELISA  TOTAL

Individuals Assumed "Low Risk" for HGBV Exposure
Volunteer Blood Donors
1  0/200  1/200  3/200  4/200
Interstate Blood Bank  ND  ND  ND  ND

Individuals Assumed "At Risk" for HGBV Exposure
Intravenous Drug Users  1/112  1/112  0/112  2/112
Western Africa  5/137  12/97  3/52  20/137

Individuals with "Non A-E Hepatitis"
Clinics in Japan  ND  0/89  ND  0/89
Clinics in Greece  0/67  0/67  ND  0/67
Clinics in U.S. (SET M)  0/72  2/72  4/72  6/72
Clinics in U.S. (SET T)  1/64  0/64  0/64  1/64
Clinics in U.S. (SET 1/3)  2/62  1/62  1/62  3/62
Clinics in Egypt  3/132  0/132  15/132  18/132
Clinics in New Zealand  ND  ND  ND  ND

TABLE 24 HGBV-C Serological Results

                          Repeatably Reactive   Negative in
                          in C.1, C.6,          C.1, C.6,
                          or C.7 ELISA          or C.7 ELISA      x2*  SIG**
Volunteer Blood Donors    4                     196
IBB Ohio                  ND                    ND                - NS*
Intravenous Drug Users
(US)                      2                     110               NS*
West Africa               20                    117               ????
Clinics in Japan          0                     85                NS*
  in New Zealand          ND                    ND                - NS*
  in Greece               0                     11                - NS*
  in Egypt                6                     19                - ????
  in U.S.
    Set 1/3               3                     59                ????
    Set M                 6                     66                ???
    Set T                 1                     63                NS*
Assumed Low Risk          0                     200
Paid Blood Donors         9                     751               ???
Assumed High Risk         191                   1330              ???-
Non A-E Hepatitis         21                    303               - ???*
* Chi square value obtained by applying the Chi square test. ** Determination of statistical significance
based upon the Chi square analysis. † Not statistically significant by the Chi square test. Statistically
significant by the Chi square test, with p<0.050.

[Tables on pages 170 to 174 of WO 95/21922: contents not recoverable from the text extraction.]

SEQUENCE LISTING
(1) GENERAL INFORMATION:
(i) APPLICANT: JOHN N. SIMONS
TAMI J. PILOT-MATIAS
GEORGE J. DAWSON
GEORGE G. SCHLAUDER
SURESH M. DESAI
THOMAS P. LEARY
ANTHONY SCOTT MUERHOFF
JAMES C. ERKER
SHERI L. BUIJK
ISA K. MUSHAHWAR
(ii) TITLE OF INVENTION: NON-A, NON-B, NON-C, NON-D, NON-E HEPATITIS
REAGENTS AND METHODS FOR THEIR USE
(iii) NUMBER OF SEQUENCES: 720
(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: ABBOTT LABORATORIES D377/AP6D
(B) STREET: ONE ABBOTT PARK ROAD
(C) CITY: ABBOTT PARK
(D) STATE: IL
(E) COUNTRY: USA
(F) ZIP: 60064-3500
(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25
(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:
(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: POREMBSKI, PRISCILLA E.
(B) REGISTRATION NUMBER: 33,207
(C) REFERENCE/DOCKET NUMBER: 5527.PC.01
(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 708-937-6365
(B) TELEFAX: 708-938-2623
(2) INFORMATION FOR SEQ ID NO:1:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

AGCACTCTCC AGC~1~1CAC CGCA 24

(2) INFORMATION FOR SEQ ID NO:2:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GA1~1 GCG~l GA 12

(2) INFORMATION FOR SEQ ID NO:3:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
AGGCAACTGT GCTATCCGAG GGAA 24

(2) INFORMATION FOR SEQ ID NO:4:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
GATCTTCCCT CG 12

(2) INFORMATION FOR SEQ ID NO:5:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
ACCGACGTCG ACTATCCATG AACA 24

(2) INFORMATION FOR SEQ ID NO:6:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 12 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
GAl~ CA TG 12

(2) INFORMATION FOR SEQ ID NO:7:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 18 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

GGAATTCGCG GCCG~lCG 18

(2) INFORMATION FOR SEQ ID NO:8:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

CGAGCGGCCG CGAATTCCTT 20

(2) INFORMATION FOR SEQ ID NO:9:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

TTGACACCAG ACCAACTGGT AATG 24

(2) INFORMATION FOR SEQ ID NO:10:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GGTGGCGACG A~lC~lG~AG CCCG 24

(2) INFORMATION FOR SEQ ID NO:11:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8912 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

TGAATTCGTG lGG~llCGGT GGTGGTGGCG CTTTAGGCAG CCTCCACGCC CACCACCTCC 60
~AGC GGCGGCACTG TAGGGAAGAC CGGGGACCGG TCACTACCAA GGACGCAGAC 120
~ ~-lllllGA GTATCACGCC TCCGGAAGTA GTTGGGCAAG CCCACCTAYA L~ GGGA 180
TGGTTGGGGT TAGCCATCCA TACCGTACTG CCTGATAGGG TCCTTGCGAG GGGATCTGGG 240
A~l~l~-~LAG ACCGTAGCAC ATGC~l~ ~A TTTCTACTCA AACAAGTCCT GTACCTGCRC 300
CrA~-AArGCG ~AA~AA~G CAGACGCAGG CTTCATATCC ~ C~ATT AAAArATCTG 360
TTGAAAGGGG A~AAC~AGCA ARGCGCAAAG TCCAGCGCGA TG~lCGGC~-l CGTAATTACA 420
AAATTGCTGG TATCCATGAT GGCTTGCAGA CATTGGCTCA GGCTGCTTTR CCAGCTCATG 480
~llGGG~ACG C~AA~ACCCT CGCCATAAGT CTCGCAATCT TGGAATCCTT CTGGATTACC 540
CTTTGGGGTG GA~ l~AT GTTACAACTC ACACACCTCT AGTAGGCCCG ~lG~lGGCAG 600
GAGCGGTCGT TCGACCAGTC TGCCAGATAG TACGCTTGCT GGAGGATGGA GTCAACTGGG 660
CTA~lG~llG ~i lCG~i L-~ 1C CAC~ llG TGGTATGTCT GCTATYTTTG GCCTGTCCCT 7 20
GTAGTGGGGC GCGG~i ACT GACCCAGACA ~AAATAccAc AATCCTGACC AATTGCTGCC 7 80
AGCGTAATCA GGTTATCTAY 1~ll~lC~- CCACTTGCCT ACACGAGCCT G~11~1~1GA 840
~ GGA CGAGTGCTGG GTTCCCGCCA ATCCRTACAT CTCACACCCT TCCAATTGGA 900
CTGG QCGGA ~lC~ llG GCTGACCACA TTGA m TGT TATGGGCGCT -l~ GACCT 960
GTGACGCCCT TGA QTTGGT GA~ l~l~lG GTGC~ ~l ATTAGTCGGT GACTGGCTTG 1020
TCAGGCACTG GCTTATTCAC ATAGACCTCA ATGAAACTGG TA~ .AC CTGGAARTGC 1080
CTACTGGAAT AGA~C~-~GGG TTCCTAGGGT TTATCGGGTG GA~GGCCGGC AAGGTCGAGG 1140
CTGTCATCTT CTT~AC~AAA CTGGCTTCAC AAGTACCATA CGCTATTGCG ACTATGTTTA 1200
GCAGTGTACA CTACCTGGCG GTTGGCGCTC TGATCTACTA YGC~CGG GGCAAGTGGT 1260
ATCAGTTGCT CCTAGCGCTT AYGCTTTACA TAGAAGCGAC CTCTGGA~AC CCYATCAGGG 1320
TGCCCACTGG ATGCTCAATA GCTGAGTTTT GCTCGCCTTT GATGATACCA ~ C- GCC 1380
ACTCTTATTT GAGTGAGAAT GTGTCAGAAG TCA ~llA CAGTCCAAAG TGGACCAGGC 1440
CTGTCACTCT AGAGTATAAB AACTCCATAT ~-llG~lACCC CTATACAATC C~ G~lGCGA 1500
GGGGATGTAT GGTTAAATTC AAAAATAACA CATGGGGTTG CTGCC~ n C GCAATGTGCC 156D
ATCGTACTGC ACTATGGGCA CTGATGCAGT GTG~AACSA~ AGTCGCAACA CTTACGAAGC 1620

AlGCG~lA ACACCATGGC TAACAACCGC ATGGCACAAC GGCTCAGCCC TGAAATTGGC 1680
TATATTACAA TACC~-lGG~l CTAAAGAAAT GTTTAAACCT CATAATTGGA TGTCAGGCCA 1740
CTTGTATTTT GAGGGATCAG ATACCCCTAT AGTTTACTTT TATGACCCTG TGAATTCCAC 1800
~C~lACCA CCGrAr-~GGT GGGCTAGGTT GCCC~ACC CCAC~-~lGG TAC~ C 1860
llG~llACAG GTTCCGCAAG GTTTTACAGT GATGTGAAAG ACCTAGCCAC AGGATTGATC 1920
ACCAAAGACA AAGCCTGGAA AAATTATCAG YTCTTATATT CCGCCACGGG TG~-lll~l.-, 1980
CTTACGGr-AG TTArQC QA GGCC~ G CTAATTCTGT -.GGG~l~.G TGGCAGCAAG 2040
TATCTTATTT TAGCCTACCT CTGTTACTTG ~CC~lll~ll l~GGGCGCGC ..-~GG~AC 2100
M~-l-TTGCGTC CTGTGCTCCC ATCCCAGTCG TATCTCCAAG CTGGCTGGGA l~llll~l~l 2160
AAAGCTCAAG TAG~ -l lG~l~ATT ll~-ll~ATCT ~llG~lATCT CCGCTGCAGG 2220
CTACGTTATG ~GCC~-llll AGG~lll~lG CCCATGGCTG CGGG~-llGCC CCTAACTTTC 2280
lll~l~G~AG CAGCTGCTGC CCAACCAGAT TATGACTGGT GG~lGC~ACT GCTAGTGGCA 2340
GGGTTAGTTT TGTGGGCCGG CCGTGACCGT GGTCACGCAT AG~l~la~-ll GTA~lC~ll 2400
GGC~ laGl AGCG~ l AACC~ llG CATTTSSTKA CGC~lG~-l-A G~ll~ACA 2460
CCGAGATAAT TGGAGGGCTG ACAATACCAC CTGTAGTAGC ATTAGTTGTC Al~..-..~l 2520
TTGGCTTCTT TGCTCACTTG TTACCTCGCT ~.G~ll-AGT TAACTCCTAT ~ GGCAAC 2580
GTTGGGAGAA .~G~.-l.GG AACGTTACAC TAAGACCGGA GAG~ lC ~ rG~-laG 2640
lll~lllCCC CGGTGCGACA TATGACGTGC TGGTGACWTT ~-l~l~l~l~l QCGTAGCTC 2700
TTCTATGTTT AA QTC QGT GCAGCAYMGT l~lllGGGAC TGACTCTAGG GTTAGGGCCC 2760
ATAGAATGTT GGTGCGTCTC GGAAAGTGTC Al~-llG~lA TTCTCATTAT ~llC..AAGT 2820
.~llC~.ll A~l~lllG~l GAGAATGGTG l~l-----lA KAAGCACTTG CATGGTGATG 2880
l~-llGC~lAA TGAllllGCC TCGAAACTAC CATTGCAAGA GCCATTTTTC C~-llllGAAG 2940
GCAAGGCAAG GGTCTATAGG AATGAAGGAA GACG~llGGS KK~lGGGGAC ACG~llGATG 3000
GTTTGSSCGT TGTBGCGCGT CTCGGCGACC ll~llllCGC AGGGTTAGCT ATGCCGCCAG 3060
A~GG~lGGGC CATTACCGCA C~-llllACGC TGCAGTGTCT CTCTGAACGT GGCACGCTGT 3120
CAGCGATGGC AGTGGTCATG ACTGGTATAG ACCCCCGAAC TTGGACTGGA ACTATCTT Q 3180
GATTAGGATC TCTGGCCACT AGCTACATGG GAlll~lllG Tr-Ar~ACGTG TTGAATACTG 3240
CTCACCATGG CAGCA~CGGG GGCCG~llGG CTCATCCCAC AGGCTCCATA CACCCAATAA 3300
CC-~ll~ACGC GGCTAATGAC CAGGACATCT ATCAACCACC ATGTGGAGCT GG~lCC~-llA 3360

~-lCG~lGCTC ~GCGGG(;~ ACCAAGGGGT A~ G~lAAC ACGACTGGGG TCA.~ G 3420
AGGTCAACAA ATCCGATGAC CCTTATTGGT ~l~l~lGCGG GGCC~l~CCC ATGGCTGTTG 3480
CCAAGGGTTC TTCAGGTGCC CCGATTCTGT G~C~lCCGG GCATGTTATT GGGATGTTCA 3540
CCGCTGCTAG AAATTCTGGC GGTTCAGTCG GCCAGATTAG GGTTAGGCCG ll~ ~lG 3600
CTG~-ATACrA TCCCCAGTAC ACAGCACATG CCACTCTTGA TACAAAACCT ACTGTGCCTA 3660
ACGAGTATTC AGTGCAAATT TTAATTGCCC CCACTGGCAG CGGCAAGTCA ACCAAATTAC 3720
CA~ L11~-1 ~A CATGCAGGRG AAGYATGAGG ~ G~C~l AAATCCCAGT GTGGCTACAA 3780
CAGCATCAAT GCCAAAGTAC ATGCACGCGA CGTACGGCGT GAATCCAAAT TGCTATTTTA 3840
ATGGCAAATG TACCAACACA GGGG~ ~AC TTACGTACAG CACATATGGC ATGTACCTGA 3900
CCGGACGATG ~lCCCGGAAC TATGATGTAA TCAlll~l~A CGAATGCCAT GCTACCGATC 3960
GAACCACCGT G~GGG~ATT GG~AAGGTCC TAACCGAAGC TCCATCCAAA AATGTTAGGC 4020
TA~G~ TGCCACGGCT ACCCCCC~.G GAGTAATCCC TACACCACAT GCCAACATAA 4080
CTGAGATTCA ATTAACYGAT GAAGGCACTA lCCC~ A TGGAAAAAA~- ATTAA~Ar-G 4140
AAAATCTGAA GAAAGGGAGA CACCTTATCT TTGAGGCTAC CAAAAAACAC TGTGATGAGC 4200
TTGCTAACGA GTTAGCTCGA AAGG~-AATAA CAG~lGl~lC TTACTATAGG GGATGTGACA 4260
TCTCAAAAAT GCCTGAGGGC GA~l~l~lAG TAGTTGCCAC TGATGCCTTG TGTACAGGGT 4320
ACACTGGTGA C-l~lGATTCC GTGTATGACT GCAGCCTCAT GGTAGAAGGC ACATGCCATG 4380
TTGACCTTGA CCCTAC m C ACCATGGGTG ~C~-~l~lG CGGGGTTTCA GCAATAGTTA 4440
AAGGCCAGCG TAGGGGCCGC ACAGGCCG~G GGAGAGCTGG ~ATATACTAC TATGTAGACG 4500
GGAGTTGTAC CC~l~CGGGl A~ C~-lG AATGCAACAT TGTTGAAGCC TTCGACGCAG 4560
CCAAGGCATG GTA~G~G TCATCAACAG AAGCTCAAAC TATTCTGGAC ACCTATCGCA 4620
CCCAACCTGG GTTACCTGCG ATAGGAGCAA ATTTGGACGA ~GGG~l~AT ~ A 4680
TGGTCAACCC CGAACCTTCA ~l~AATA CTGCAAAAAG AACTGCTGAC AATTATGTTT 4740
TGTTGACTGC AGCCCAACTA CAA~ C ATCAGTATGG CTATGCTGCT CCCAATGACG 4800
; CACCACGGTG G QGGGAGCC CGG~llGGGA AAAAACCTTG TGGGGTTCTG TGGCG~ GG 4860
ACGG~l~lGA CGC~ C~l GGCCCAGAGC CCAGCGAGGT GACCAGATAC CAAATGTGCT 4920
TCACTGAAGT CAATACTTCT GGGACAGCCG CACTCGCTGT lGGC~llGGA ~GG~lATGG 4980
CTTATCTAGC CATTGACACT TTTGGCGCCA ~-~l~l~lGCG GC~llG~lGG TCTATTACAT 5D40
CA~lCC~AC CG~lG~lAcT ~CGCCC~AG lG~ACGA A~Gr.~AAT~ GTGGAGGAGT 5100

GTGCATCATT CAllCC~-lG GAGGCCATGG TTGCTGCAAT TGACAAGCTG AAGAGTA QA 5160
TCACCACAAC TA~C~.llC ACATTGGAAA CCGCC~-ll~A AAAACTTAAC AC~ llG 5220
GGCCl~ATGC AGCTACAATC ~-l~G~--ATCA TAGAGTATTG ~l~l`GG~l~A GTCACTTTAC 5280
CTGACAATCC CTTTGCATCA ~GC~ `G CTTTCATTGC GGGTATTACT ACCCCACTAC 5340
CTrAr~AT CAAAATGTTC CTGTCATTAT TTGGAGGCGC AA..GC~lCC AAGCTTACAG 5400
ACGCTAGAGR CGCACTGGCG TTCATGATGG CCGGGGCTGY GGGAACAGCT ~ll~-ACAT 5460
GGA QTCGGT GG~-l.-~C m GACATGC TAGGCGG~-.A TGCTGGCGCC TCATC QCTG 5520
~ G~ AC ATTTAAATGC TTGATGGGTG AGTGGCYCAC TATGGAT QG ~-llG~lG~l 5580
TAGTCTACTC CGC~CAAT CCGGCCGCAG GA~..~.GGG C-~ ~A G~ ~AA 5640
~ 111 GACAACAGCA GGGCQGATC AC-.~GCC~AA CAGACTTCTT ACTATGCTTG 5700
CTAGGAGCAA CACTGTATGT ARTGAGTACT TTATTGCCAC TCGTGACATC CGCAGGAAGA 5760
TA~lGGG~AT TCTGGAGGCA TCTACCCCCT GGAGTRTCAT ATCAGCTTGC AlCC~l~GC 5820
TYCACACCCC GACGGAGGAT GA-.~CGGCC TCA..~llG GGG~`lARAG A m GGCAGT 5880
ATGTGTGCAA 111.1~ Al...... G~ A A~-C-- .AA AGCTGGAGTT CAGAGCATGG 5940
TTAACATTCC .~l.~-C~. TTCTACAGCT GCCAGAAGGG GTACAAGGGC CC~.G~ATTG 6000
GATCAGGTAT GCTCCAAGCA CG~.~.C~AT GCG~.G-.~A ACT QTCTTT l~.`;..aAGA 6060
A.a~...lGC AAAACTTTAC AAPr-~rCCA GAA~.-~.~C AAATTACTGG AGAGGGGCTG 6120
TTCCAGTCAA CGCTAGGCTG .~laG~.CGG CTAGACCGGA CCCAACTGAT TGGACTAGTC 6180
-~,CGl~AA TTAlaGCGll AGGeACTACT GTA~ATPTGA GAAATTGGGA GATCACATTT 6240
TTGTTACAGC AGTATCCTCT CCAAATGTCT G m CACCCA G~lGCCCC~A ACCTTGAGAG 6300
CTGCAGTGGC CGTGGACCGC GTACAGGTTC AGYGTTATCT AGGTGAGCCC AAAACTCCTT 6360
G~-~C~-A~ATC lG~-~lG~.~. TACG~lC~G ACGGTAAGGG TAAAACTGTT AAG~lLCC~-l 6420
~CCGC~l~A CGG~A~A C~-~LG~.C GCATGCAACT TAA m GCGT GATCGACTTG 6480
AGGCAAATGA CTGTAATTCC ATAA~r~CA CTCCTAGTGA TGAAGCCGCA ~.~.CCGCTC 6540
~ CAA ACAGGAGTTG CGGC~-ACAA ACCAATTGCT TGAGGCAATT TCAGCTGGCG 6600
TTGACACCAC CAAACTGCCA GCCCC~-CCC AGATCGAAGA GGTAGTGGTA AGAAAGCGCC 6660
AGTTCCGGGC AAGAACTGGT .CG~.ACCT lGC-- CCCCC TCCGAGATCC GTCCCAGGAG 6720
TGTCATGTCC TGAAAGCCTG CAACGAAGTG ACCC~llAGA AG~.C~-..CA AjC~-lCC~l 6180
CTTCACCACC ~ ~CAG TTGGCCATGC CGAlGCCC~ GG~AGCA GGTGAGTGTA 6840

ACC~ AC TGCAATTGGA TGTGCAATGA CCGAAACARG YGGAGKCCCl MAKRATTTAC 6900
CCAGTTACCC TCCCAAAAAG GAG~ G AATGGTCAGA CGAAAGTTGG TCAACGACTA 6960
CAACCG~-~C CAGCTACGTT A~-~GGCCCCC CGTACCCTAA GATACGGGGC AAGGATTCCA 7020
CTCAATCAGC CACCGCCAAA CGGCClACAA AAA-A~AAGTT GGGAAAGAGT GA~llllC~l 7080
GCAGCATGAG CTACACTTGG ACCGACGTGA TTAGCTTCAA AACTGCTTCT AAA~ll~'l~l 7140
CTGCAACTCG GGCCATCACT A~lGGlllCC T~AAA~AAAG ATCATTGGTG TATGTGACTG 7200
AGCCGCGG~A TGCGGAGCTT A~-AAAA~AAA AAGTCACTAT TAATAGACAA C~l~l~llCC 7260
CCCCATCATA CCACAAGCAA GTGAGATTGG CTAA~-AAAA AGCTTCAAAA ~ll~lCG~lG 7320
TCAl~lGG~A CTATGATGAA GTAGCAGCTC ACACGCC~-lC TAAGTCTGCT AAGTCC QCA 7380
TCACTGGCCT TCGGGGCACT GAl~llC~ll CTGGAGCGGC CCGCAAGGCT ~ll~lGGACT 7440
TGCAGAAGTG TGTCGAGG Q GGT~-Ar-ATA~ CGAGTCATTA TCGGCAAACT GTGATAGTTC 7500
CAAAGGAGGA G~l~-llC~LG AAGACCCCCC AGAAACCAAC AAA~AAACCC CCAAGGCTTA 7560
'l~-l~'~lACCC CCACCTTGAA ATGAGATGTG TT~A~-AA~-AT GTACTACGGT CAGGTTGCTC 7620
CTGACGTAGT TAAAGCTGTC ATGGGAGATG CGTACGGGTT TGTAGATCCA CGTACCCGTG 7680
TCAAGCGTCT ~ll~lC~ATG TGGTCACCCG ATGCAGTCGG AGCCACATGC GATACAGTGT 7740
~llll~ACAG TACCATCACA CCCGAGGATA TCAlG~lG~A GACAGACATC TACTCAGCAG 7800
CTAAACTCAG TGACCAACAC CGAGCTGGCA TTCACACCAT TGCGAGGCAG TATCACGCTG 7860
GAGGACCGAT GATCGCTTAT GAlGGCC~AG AGATCGGATA TCGTAGGTGT AG~l~llCCG 7920
GC~l~lATAC TACCTCAAGT TC QA~AGTT TGAC~lG~lG GCTGAAGGTA AATGCTGCAG 7980
CCGAACAGGC TGGCATGAAG AACCCTCGCT TCCTTATTTG CGGCGATGAT TGCACCGTAA 8040
TTTGGAAGAG CGCCGGAGCA GATGCAGACA AACAAGCAAT GC~l~l~lll GCTAGCTGGA 8100
TGAAGGTGAT GGGTGCACCA CAAGATTGTG TGCCTCAACC CAAATACAGT TTG~AA~AT 8160
TAACATCATG CTCATCAAAT GTTACCTCTG GAATTAC QA AAGTGGCAAG CCTTACTACT 8220
TTCTTACAAG AGATCCTCGT AlCCCC~llG GCAGGTGCTC TGCCGAGGGT CTGGGATACA 8280
ACCCCAGKGC KGCGTGGATT GGGTATCTAA TACATCACTA CCCATGTTTG .GG~llAGCC 8340
~l~l~llGGC TGTCCATTTC ATGGAGCAGA TG~ - llGA GGACAAACTT CCCGAGACTG 8400
TGAC~lll~A CTGGTATGGG AAAAATTATA CGGTGCCTGT AGAAGATCTG CCCAGCATCA 8460
TTG~lG~l GCACGGTATT GAGG~lll~-l C~.G~GCG CTACACCAAC GCTGAGATCC 8520
TCAGAGTTTC CCAATCACTA ACAGACATGA CCATGCCCCC C~ '~AGCC TGGCGAAAGA 8580

AAGCCAGGGC G~lC~CGCC AGCGCCAAGA GGC~lGGCGG AG~A~A~-AA AAl-GG~uCG 8640
~lC~ C TGGCATGCTA CATCTAGACC TCTACCAGAT TTGGATAAGA CGAGCGTGGC 8700
TCGGTACACC ACTTTCAATT ATTGTGATGT TTACTCCCSG A~GG~ATGT GTTTATTACA 8760
CCA~A~-A~AA GATTGCAGAA ~ll.~ll~lG AAGTATTTGG CTGTCATTGT TTGTGCCCTA 8820
GGG~CATTG ~-~ ~ACT AGCCATCAGC TGAACCCCCA AATTCAAAAT TAATTAACAG 8880
.-llAGG GC 8912

(2) INFORMATION FOR SEQ ID NO:12:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 197 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

GAGTGTAACC CTTTCACTGC AATTGGATGT GCAATGACCG AAACAGGCGG AGGCCCTGAT 60
GA m ACCCA GTTACCCTCC CAAAAAGGAG ~l~l~l~AAT GGT~A~-~ AA~lla~l.A 120
ACGACTACAA CCG~llC~AG CTACGTTACT GGCCCCC~lA CCCTAA~-ATA CGG~AAAGGA 180
TTCCACTCAA TTAGCCC 197

(2) INFORMATION FOR SEQ ID NO:13:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 207 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:

CCTC~.~ AAG TC~A~AA~A~- CCTTGCGGGC TGCTCCAGAA CGAACATCAG 60
lGCCCC~AAG CCAGTGATGT GGGACTTAGC AGACTTAGAG GGC~l~AG CTGCTACTTC 120
ATCATAGTCC CACATGACAC CGACAACTTT TGAAGCTTTT TCCTTAGCCA ATCTCACTTG 180

ell~lG~lAT GAlGGGGGGA ACAGAGG 207

(2) INFORMATION FOR SEQ ID NO:14:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 208 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

Glu Cys Asn Pro Phe Thr Ala Ile Gly Cys Ala Met Thr Glu Thr Xaa
1 5 10 15
Gly Xaa Xaa Xaa Xaa Leu Pro Ser Tyr Pro Pro Lys Lys Glu Val Ser

Glu Trp Ser Asp Glu Ser Trp Ser Thr Thr Thr Thr Ala Ser Ser Tyr

Val Thr Gly Pro Pro Tyr Pro Lys Ile Arg Gly Lys Asp Ser Thr Gln

Ser Ala Thr Ala Lys Arg Pro Thr Lys Lys Lys Leu Gly Lys Ser Glu

Phe Ser Cys Ser Met Ser Tyr Thr Trp Thr Asp Val Ile Ser Phe Lys

Thr Ala Ser Lys Val Leu Ser Ala Thr Arg Ala Ile Thr Ser Gly Phe
100 105 110
Leu Lys Gln Arg Ser Leu Val Tyr Val Thr Glu Pro Arg Asp Ala Glu
115 120 125
Leu Arg Lys Gln Lys Val Thr Ile Asn Arg Gln Pro Leu Phe Pro Pro
130 135 140
Ser Tyr His Lys Gln Val Arg Leu Ala Lys Glu Lys Ala Ser Lys Val
145 150 155 160
Val Gly Val Met Trp Asp Tyr Asp Glu Val Ala Ala His Thr Pro Ser
165 170 175
Lys Ser Ala Lys Ser His Ile Thr Gly Leu Arg Gly Thr Asp Val Arg
180 185 190
Ser Gly Ala Ala Arg Lys Ala Val Leu Asp Leu Gln Lys Cys Val Glu
195 200 205

(2) INFORMATION FOR SEQ ID NO:15:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 230 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:

G~C~l~AA TGCAA QTTG TTGAAGCCTT CGACGCAGCC AAGGCATGGT A~G~.C 60
ATCAACAGAA GCTCAAACTA ~ ~GACAC CTATCGCACC CAAOE ~GG~ TACCTGCGAT 120
AGGAGCAAAT TTG~ r~T GGG~uGATCT ~ ATG GTCAACCCCG AACCTTCATT 180
TGTCAATACT GC~AA~A~ ~ACAA TTA,~,~,,~ TTGACTGCAG 230

(2) INFORMATION FOR SEQ ID NO:16:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 76 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:

Val Pro Glu Cys Asn Ile Val Glu Ala Phe Asp Ala Ala Lys Ala Trp

Tyr Gly Leu Ser Ser Thr Glu Ala Gln Thr Ile Leu Asp Thr Tyr Arg

Thr Gln Pro Gly Leu Pro Ala Ile Gly Ala Asn Leu Asp Glu Trp Ala

Asp Leu Phe Ser Met Val Asn Pro Glu Pro Ser Phe Val Asn Thr Ala

Lys Arg Thr Ala Asp Asn Tyr Val Leu Leu Thr Ala

(2) INFORMATION FOR SEQ ID NO:17:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 291 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:

GTATGGTTCC TGAATGCAAC All~llGAAG C~llCGACGC AGCCAAGGCA TGGTATGGTT 60
TGTCATCAAC AGAAGCTCAA ACTATTCTGG A QCCTATCG CACCCAACCT GGGTTACCTG 120
cr~T~r,r~c AAATTTGGAC GAGTGGGCTG A.~ llllC TATGGTCAAC CCCGAACCTT 180
CAll~ AA TACTGCAAAA AGAACTGCTG ACAATTATGT lll~ll~ACT GCAGCCCTGC 240
CACC~l~lG CGTCATTGGG AGCAGCATAG CCATACTGAT GACACAGTTG T 291

(2) INFORMATION FOR SEQ ID NO:18:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 281 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:

GCGCATGCAA CTTAATTTGC GTGATGCACT Tr~r~Ar~AAT GACTGTAATT CCATAAACAA 60
CACTCCTAGT GATGAAGCCG CA~l~lCCGC lCll~llllC AAArAGG~r~T TGCGGCGTAC 120
AAACCAATTG CTTGAGGCAA TTTCAGCTGG CGTTGACACC ACCAAACTGC CAGCCCC~lC 180
cATcr~Ar~AG GTAGTGGTAA GAAAGCGCCA ~llCCGGGCA AGAACTGGTT CGCTTACCTT 240
GC~-lCCCC~l CCGAGATCCG TCCC~GGAGT GTCATGTCCT G 281

(2) INFORMATION FOR SEQ ID NO:19:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 93 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

Arg Met Gln Leu Asn Leu Arg Asp Ala Leu Glu Thr Asn Asp Cys Asn
1 5 10 15
Ser Ile Asn Asn Thr Pro Ser Asp Glu Ala Ala Val Ser Ala Leu Val

Phe Lys Gln Glu Leu Arg Arg Thr Asn Gln Leu Leu Glu Ala Ile Ser

Ala Gly Val Asp Thr Thr Lys Leu Pro Ala Pro Ser Ile Glu Glu Val

Val Val Arg Lys Arg Gln Phe Arg Ala Arg Thr Gly Ser Leu Thr Leu
65 70 75 80
Pro Pro Pro Pro Arg Ser Val Pro Gly Val Ser Cys Pro
90

(2) INFORMATION FOR SEQ ID NO:20:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 281 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20:

GCGCATGCAA CTTAATTTGC GTGATGCACT T~-A~ AAT GACTGTAATT CCATAAACAA 60
CACTCCTAGT GATGAAGCCG CA~lCCGC .~ l.C AAACAGGAGT TGCGGC~lAC 120
AAACCAATTG CTTGAGGCAA m CAGCTGG CGTTGA QCC ACCAAACTGC CAGCCCCu.C 180
CATCGAAGAG GTAGTGGTAA GAAAGCGCCA ~l.CCGGG~A AGAACTGGTT CGCTTACw T 240
GC~-lCCCC~ CCGAGATCCG TCCQGGAGT GT QTGTCCT G 281

(2) INFORMATION FOR SEQ ID NO:21:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 221 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
GATCQTAGT GAGCCACTCA CCCATCAAGC ATTTAAATGT CAAGCAAGCA GTGGATGAGG 60
CGGQr-rATA GCCGC~-AGC ATGTCAAAGA CAAAACCCAC CGATGTCCAT GTACrAAr-Ar- 120
- ~1~11CC~AC AGCCCCGGCC ATCATGAACG CCA~1GCa.C TCTAGCGTCT GTAAGCTTGG 180
ACGCAATTGC GC~1C~AAAT AATGACAGGA ACATTTTGAT C 221

(2) INFORMATION FOR SEQ ID NO:22:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 737 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:

GATCGAAGCA CACCTCAAGC CCTAAGACGC .~-~.~GCTC CCGG~11ACC CCGCAGCTAC 60
CACr~ATACC AGCGG QGAC GACCC~--GC GAAGTG QTC GCr~r~AGCA CGGCAGCCCT 120
CACAGAGCCC AGGACATTCA GGTACGCCAC GACACA QTC ACACCCAGAC AACCAGTGAA 180
CC~CrA~TCC TGGGCTGCCC AGCCGACCAC CGGGGCGCAC ACCAGCTCGG GAGCCAGCGC 240
GC~-LC~ACGA CCGGCAAGTA AGCCCQACA m GACAACC AGGCCAGACC GGCAGCGAAC 300
GTTCG QGCT TGAGCCACGC GGGCCAGATG Tr~Cr~ACG~ CGGC~1~AGC ACCATCATTG 360
GQGCACCCC AGACCGC~--G AGCCCC~GGCC GTCAGGCCTG CCACCATGTA GCAACCAGCA 420
TTGTAGGTAG A~-CCGC~AC CCG~-~-A GAATTCGGAC AAGATGGAGT TGGAACAGTG 480
GGCGGAGTCC ACAATGGAAC AC m CAGTG GA~11C~1~A CAGAAGGGTG TATGATAACA 540
ATAGTGGCGG CAGATGCTCC ATTCAACCAC rArrA~TTG CCAGCATAAA CAGGGGGGCA 600
ACTCTAGCCT CAGCCAACTT CATCACTACC AACAGGGCCA GGACCATGTC AGTAAGCAAC 660
CAAGCCGCGG AAGACCTTCG CTGACCACTG TAAACCTGCT ~-~--~,,GCC m AACATGG 720
ATGAAGCCGT TGTGATC 737

(2) INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 307 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:

GATCACTGTG GACGCCACTT ~l~CGACTC ATCGATTGAT GAGCACGATA TGCAGGTGGA 60
GGC--lC~,G l~GCGGCGG CTAGTGACAA CCC~-lCAATG GTACATGCTT TGTGCAAGTA 120
CTA~ -lG~l GGCCCTATGG lllCCC~AGA lGGG~llCCC TTGGGGTACC GCCAGTGTAG 180
C~lCGGGC ~l~l.~ACAA CTAG~lCGGC GAACAGCATC A~-ll~llACA TTAAGGTCAG 240
CGCGGCCTGC AGGCGG~l~G GGATTAAGGC ACCATCATTC m ATAGCTG GAGATGATTG 300
CTTGATC 307

(2) INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 500 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:

GATCAGGCCG CTGAGCGGCC G~AGGTTA CAATCTGGAG GGGTGATAGG AAGTATGACA 60
AGCATTATGA GG~ C~ GAGGCTGTCC TGAAAAAGGC AGCCGCGACG AAGTCTCATG 120
GCTGGACCTA TTCCCAGGCT ATAGCTAAAG TTAGGCGCCG AGCAGCCGCT GGATACGGCA 180
GCAAGGTGAC CGC~-lCCACA TTGGCCACTG GTTGGCCTCA CGTGGAGGAG ATGCTGGACA 240
AAATAGCCAG GG~-A~AG~AA ~llC~l~ CA ~1, ll~l~AC CAAGCGAGAG ~lll~ll~l 300
CCAAAACTAC CCGTAAGCCC CCAAGATTCA TA~l~CCC AC~ GGAC TTCAGGATAG 360
CTGAAAAGAT GA~ ~G~l GACCCCGGCA TCGTTGCAAA GTCAATTCTG GGTGACGCTT 420
A~ 1C~A GTACACGCCC AATCAGAGGG TCAAAGCTCT GGTTAAGGCG TGGGAGGGGA 480

AGTTGCATCC CGCTGCGATC 500

(2) INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 479 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:25:

GATCACA m TTGTTACAGC AGTATCCTCT CCAAATGTCT GTTTCACCCA GGTGCCCCCA 60
ACCTTGAGAG CTGCAGTGGC CGTGGACCGC GTACAGGTTC AGYGTTATCT AGGTGAGCCC 120
AAAACTCCTT GrP~A~ATC ~Gu~lGul~l TAC~.C~lG ACGGTAAGGG TAAAACTGTT 180
AAGullCCu-, TCCGCGTTGA CGGACACACA CulG~lGGTC GCATGCAACT TAAlllGC~l 240
GATCGACTTG AGGCAAATGA CTGTAATTCC ATAAACAACA CTCCTAGTGA TGAAGCCGCA 300
CCGU-~C l~ll,,uAA ACAGGAGTTG CGGCGTACAA ACCAATTGCT TGAGGCAATT 360
TCAGCTGGCG TTGACACCAC CAAACTGCCA GCCCCulCCC AGATCGAAGA GGTAGTGGTA 420
AGAAAGCGCC A~l~CCGGGC AAGAACTGGT TCGCTTACCT TGCulCCCCC TCCGAGATC 479

(2) INFORMATION FOR SEQ ID NO:26:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 532 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:26:

GATCAACACC TCGTCACCCC GTCTCGCAAC CA QGGTTTC CCGTGGACCA Aul~,CuACA 60
GCCTAPrA~ CGAGCAGAGT CCC~AA~AAT AGCACAATCT lCull~l~A TGCTAACAGG 120
CTCAAGCGCA AAACCCCACT CTCGCAAGCG GGCAGCACCG CGCulGulAG TGTGACCGGC 180
~G~-lC~AG AGGAGGACGC CulGu~GCG CAGGACGCCC ACCAGCCAAG AGCAGGCCAG 240
CCGU-lCu-l~A GCAAGAGCTA AGGAGTCCAG CACCCGCGCC AAGCGCGCGA GA~G~l~A 300

GTTAACCAAG AGTACTTCCA AGATGAAATC AATGA QTCT AAA~GC.cA AACAGAGTAT 360
GAAGATGACG GAAACTGTGG CAA~ G GGGGAAGAAC CAAGCCACAA CCAACCAAGC 420
TTTC QGCAC GC~-lC~AACG GC QAAAGCT C~A~CGGCG A~,,~.~cAC CCACCGGC~A 480
ACC~ G~ AATTGACGGC CCAC~-~GG A TACCAAGTCA A~GGC~A TC 532

(2) INFORMATION FOR SEQ ID NO:27:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:27:

GATCCATCTT GACAATGACA A~ CGCAG GACAGTAGAC AC~lG~A CGAACTCATC 60
TTTGAGGAAG AAATCGTCAG GCATCACCGA ACTGCGTGGC ATCATCGTCA ACAATCTGTT 120
AACCr~TCT TGACCCACAC C~ ~AC ~-ACCA~-~GC AACAAGCCCA GAACCACACC 180
GGCCACCGAA GCCCCCGGAG AGGCCAGGCA ACTGACCAGG CACCAAGCGT CA~CG~G 240
TAA~CCCC GCCAGGAGGT CGAAGGTGAG TGAGCGCGGT TCACCGCCCC CTCCCAGCCT 300
CTGATC 306

(2) INFORMATION FOR SEQ ID NO:28:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 369 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28:

GAT QCCCAC ACCCCG~G GTTGGCACTT GCATGCCTGA AGG~AA~G CACCATTAGG 60
GAGCGGGTAG ACCGTGACGT CGTCACTCGC TAACCACCAC CGAGCATTGA CAGGACCGAA 120
AGCCC QCCA TAGGCCGGAC ~G~ACCA CGGTATGTCG TGTACATCAC ~CC~ACG 180

CAGCAGCCCA TGGAACGAGT TGTTGAAGTC CCAAGGACCA CCAC~llCCC GTGATGTTCG 240
GACGAGTCCT TGCCTGTCAT GGAG~,C~lC ACAACCCCGA AGAATCCCTT GCCAGCTTGA 300
TGAAGCACCA CGGGAGCAGT GG~r~AP~ CCAGGCGGAA GGTCGAACCG A~ ACA 360
CAACTGATC 369

(2) INFORMATION FOR SEQ ID NO:29:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:29:

GATCCAATCC AGGGGCCCTC GTACCCCTCC TGGCAGCTGT AGAAAGGACA ACCAGGAATG 60
TTAACCATGC TCTGAACTCC AGCTTTAAGG ACATTAAAGC AAATCACAAA GAAATTGCAC 120
ACATACTGCC AAATCTCTAG ACCCCAAGCA ATGAGGCCGC AATCATCCTC C~lCGGGGTG 180
TGGAGCCAAC GGATGCAAGC TGATATGATA CTCCAGGGGG TAGATGCCTC CAGAATGCCC 240
AGTATCTTCT GCGGATGTCA CGAGTGGCAA TAAAGTACTC ACTACATACA GTGTTGCTCC 300
TAGCAAGCAT AGTAA~-~GT ~l~llGGGCC AGTGATC 337

(2) INFORMATION FOR SEQ ID NO:30:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 234 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:30:

GATCAGGTAT GCTCCAAGCA CG~l~lC~AT GCG~lG81GA ACTCATCTTT lC-~llGAGA 60
Ala~~ GC AAAACTTTAC AAAGGACCCA GAA~ ~llC AAATTACTGG AGAGGGGCTG 120
TTCCAGTCAA CGCTAGGCTG ~lGGGlCGG CTAGACCGGA CCCAACTGAT TGGACTAGTC 180

ll~lC~AA TTAlGGC~l AGGGACTACT GTAAATATGA GAAATTGGGA GATC 234

(2) INFORMATION FOR SEQ ID NO:31:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:31:
Asp Pro Xaa Xaa Ala Thr His Pro Ser Ser Ile Xaa Met Ser Ser Lys
1 5 10 15
Gln Trp Met Arg Arg Gln His Ser Arg Leu Ala Cys Gln Arg Gln Asn

Pro Pro Met Ser Met Tyr Gln Glu Leu Phe Pro Gln Pro Arg Pro Ser

Xaa Thr Pro Val Arg Leu Xaa Arg Leu Xaa Ala Trp Thr Gln Leu Arg
50 55 60
Leu Gln Ile Met Thr Gly Thr Phe Xaa

(2) INFORMATION FOR SEQ ID NO:32:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
Ile His Ser Glu Pro Leu Thr His Gln Ala Phe Lys Cys Gln Ala Ser
1 5 10 15
Ser Gly Xaa Gly Gly Ser Ile Ala Ala Xaa His Val Lys Asp Lys Thr

His Arg Cys Pro Cys Thr Lys Ser Cys Ser His Ser Pro Gly His His


Glu Arg Gln Cys Val Ser Ser Val Cys Lys Leu Gly Arg Asn Cys Ala
50 55 60
Ser Lys Xaa Xaa Gln Glu His Phe Asp

(2) INFORMATION FOR SEQ ID NO:33:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
Ser Ile Val Ser His Ser Pro Ile Lys His Leu Asn Val Lys Gln Ala
1 5 10 15
Val Asp Glu Ala Ala Ala Xaa Pro Pro Ser Met Ser Lys Thr Lys Pro

Thr Asp Val His Val Pro Arg Ala Val Pro Thr Ala Pro Ala Ile Met

Asn Ala Ser Ala Ser Leu Ala Ser Val Ser Leu Asp Ala Ile Ala Pro
50 55 60
Pro Asn Asn Asp Arg Asn Ile Leu Ile

(2) INFORMATION FOR SEQ ID NO:34:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:34:
Asp Gln Asn Val Pro Val Ile Ile Trp Arg Arg Asn Cys Val Gln Ala
1 5 10 15
Tyr Arg Arg Xaa Arg Arg Thr Gly Val His Asp Gly Arg Gly Cys Gly

Asn Ser Ser Trp Tyr Met Asp Ile Gly Gly Phe Cys Leu Xaa His Ala


Arg Arg Leu Cys Cys Arg Leu Ile His Cys Leu Leu Asp Ile Xaa Met
50 55 60
Leu Asp Gly Xaa Val Ala His Tyr Gly

(2) INFORMATION FOR SEQ ID NO:35:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
Ile Lys Met Phe Leu Ser Leu Phe Gly Gly Ala Ile Ala Ser Lys Leu
1 5 10 15
Thr Asp Ala Arg Asp Ala Leu Ala Phe Met Met Ala Gly Ala Val Gly

Thr Ala Leu Gly Thr Trp Thr Ser Val Gly Phe Val Phe Asp Met Leu

Gly Gly Tyr Ala Ala Ala Ser Ser Thr Ala Cys Leu Thr Phe Lys Cys
50 55 60
Leu Met Gly Glu Trp Leu Thr Met Asp

(2) INFORMATION FOR SEQ ID NO:36:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 73 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36:
Ser Lys Cys Ser Cys His Tyr Leu Glu Ala Gln Leu Arg Pro Ser Leu
1 5 10 15
Gln Thr Leu Glu Thr His Trp Arg Ser Xaa Trp Pro Gly Leu Trp Glu

Gln Leu Leu Val His Gly His Arg Trp Val Leu Ser Leu Thr Cys Xaa


Ala Ala Met Leu Pro Pro His Pro Leu Leu Ala Xaa His Leu Asn Ala
50 55 60
Xaa Trp Val Ser Gly Ser Leu Trp Ile


(2) INFORMATION FOR SEQ ID NO:37:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:37:
Asp Arg Ser Thr Pro Gln Ala Leu Arg Arg Cys Val Ala Pro Gly Leu
1 5 10 15
Pro Arg Ser Tyr His Gln Tyr Gln Arg Gln Thr Thr Pro Cys Glu Val

His Arg His Lys His Gly Ser Pro His Arg Ala Gln Asp Ile Gln Val

Arg His Asp Thr His His Thr Gln Thr Thr Ser Glu Pro Pro Leu Leu

Gly Cys Pro Ala Asp His Arg Gly Ala His Gln Leu Gly Ser Gln Arg

Ala Ser Thr Thr Gly Lys Xaa Ala Pro Thr Phe Asp Asn Gln Ala Arg

Pro Ala Ala Asn Val Arg Ser Leu Ser His Ala Gly Gln Met Ser Pro
100 105 110
Thr Thr Ala Xaa Ala Pro Ser Leu Ala Ala Pro Gln Thr Ala Xaa Ala
115 120 125
Pro Ala Val Arg Pro Ala Thr Met Xaa Gln Pro Ala Leu Xaa Val Glu
130 135 140
Ser Ala Thr Pro Val Val Glu Phe Gly Gln Asp Gly Val Gly Thr Val
145 150 155 160
Gly Gly Val His Asn Gly Thr Leu Ser Val Asp Phe Val Thr Glu Gly
165 170 175
Cys Met Ile Thr Ile Val Ala Ala Asp Ala Pro Phe Asn His His His
180 185 190

Ile Ala Ser Ile Asn Arg Gly Ala Thr Leu Ala Ser Ala Asn Phe Ile
195 200 205
Thr Thr Asn Arg Ala Arg Thr Met Ser Val Ser Asn Gln Ala Ala Glu
210 215 220
Asp Leu Arg Xaa Pro Leu Xaa Thr Cys Cys Leu Leu Pro Leu Thr Trp
225 230 235 240
Met Lys Pro Leu Xaa
245
(2) INFORMATION FOR SEQ ID NO:38:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:38:
Ile Glu Ala His Leu Lys Pro Xaa Asp Ala Val Ser Leu Pro Gly Tyr
1 5 10 15
Pro Ala Ala Thr Thr Asn Thr Ser Gly Arg Arg Pro Leu Ala Lys Cys

Ile Ala Thr Ser Thr Ala Ala Leu Thr Glu Pro Arg Thr Phe Arg Tyr

Ala Thr Thr His Ile Thr Pro Arg Gln Pro Val Asn His His Ser Trp

Ala Ala Gln Pro Thr Thr Gly Ala His Thr Ser Ser Gly Ala Ser Ala
80
Pro Arg Arg Pro Ala Ser Lys Pro Gln His Leu Thr Thr Arg Pro Asp
Arg Gln Arg Thr Phe Ala Ala Xaa Ala Thr Arg Ala Arg Cys His Gln
100 105 110
Arg Arg Pro Glu His His His Trp Gln His Pro Arg Pro Pro Glu Pro
115 120 125
Arg Pro Ser Gly Leu Pro Pro Cys Ser Asn Gln His Cys Arg Xaa Ser
130 135 140
Pro Arg Leu Arg Trp Xaa Asn Ser Asp Lys Met Glu Leu Glu Gln Trp
145 150 155 160
Ala Glu Ser Thr Met Glu His Phe Gln Trp Thr Ser Xaa Gln Lys Gly

165 170 175
Val Xaa Xaa Gln Xaa Trp Arg Gln Met Leu His Ser Thr Thr Thr Thr
180 185 190
Leu Pro Ala Xaa Thr Gly Gly Gln Leu Xaa Pro Gln Pro Thr Ser Ser
195 200 205
Leu Pro Thr Gly Pro Gly Pro Cys Gln Xaa Ala Thr Lys Pro Arg Lys
210 215 220
Thr Phe Ala Asp His Cys Lys Pro Ala Val Cys Cys Leu Xaa His Gly
225 230 235 240
Xaa Ser Arg Cys Asp
245

(2) INFORMATION FOR SEQ ID NO:39:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:39:

Ser Lys His Thr Ser Ser Pro Lys Thr Leu Cys Arg Ser Arg Val Thr
1 5 10 15
Pro Gln Leu Pro Pro Ile Pro Ala Ala Asp Asp Pro Leu Arg Ser Ala

Ser Pro Gln Ala Arg Gln Pro Ser Gln Ser Pro Gly His Ser Gly Thr

Pro Arg His Thr Ser His Pro Asp Asn Gln Xaa Thr Thr Thr Pro Gly

Leu Pro Ser Arg Pro Pro Gly Arg Thr Pro Ala Arg Glu Pro Ala Arg

Leu Asp Asp Arg Gln Val Ser Pro Asn Ile Xaa Gln Pro Gly Gln Thr

Gly Ser Glu Arg Ser Gln Leu Glu Pro Arg Gly Pro Asp Val Thr Asn
100 105 110
Asp Gly Leu Ser Thr Ile Ile Gly Ser Thr Pro Asp Arg Leu Ser Pro
115 120 125

Gly Arg Gln Ala Cys His His Val Ala Thr Ser Ile Val Gly Arg Val
130 135 140
Arg Asp Ser Gly Gly Arg Ile Arg Thr Arg Trp Ser Trp Asn Ser Gly
145 150 155 160
Arg Ser Pro Gln Trp Asn Thr Phe Ser Gly Leu Arg Asp Arg Arg Val
165 170 175
Tyr Asp Asn Asn Ser Gly Gly Arg Cys Ser Ile Gln Pro Pro Pro His
180 185 190
Cys Gln His Lys Gln Gly Gly Asn Ser Ser Leu Ser Gln Leu His His
195 200 205
Tyr Gln Gln Gly Gln Asp His Val Ser Lys Gln Pro Ser Arg Gly Arg
210 215 220
Pro Ser Leu Thr Thr Val Asn Leu Leu Ser Val Ala Phe Asn Met Asp
225 230 235 240
Glu Ala Val Val Ile
245
(2) INFORMATION FOR SEQ ID NO:40:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:40:

Asp His Asn Gly Phe Ile His Val Lys Gly Asn Arg Gln Gln Val Tyr
1 5 10 15
Ser Gly Gln Arg Arg Ser Ser Ala Ala Trp Leu Leu Thr Asp Met Val

Leu Ala Leu Leu Val Val Met Lys Leu Ala Glu Ala Arg Val Ala Pro

Leu Phe Met Leu Ala Met Trp Trp Trp Leu Asn Gly Ala Ser Ala Ala

Thr Ile Val Ile Ile His Pro Ser Val Thr Lys Ser Thr Glu Ser Val

Pro Leu Trp Thr Pro Pro Thr Val Pro Thr Pro Ser Cys Pro Asn Ser


Thr Thr Gly Val Ala Asp Ser Thr Tyr Asn Ala Gly Cys Tyr Met Val
100 105 110
Ala Gly Leu Thr Ala Gly Ala Gln Ala Val Trp Gly Ala Ala Asn Asp
115 120 125
Gly Ala Gln Ala Val Val Gly Asp Ile Trp Pro Ala Trp Leu Lys Leu
130 135 140
Arg Thr Phe Ala Ala Gly Leu Ala Trp Leu Ser Asn Val Gly Ala Tyr
145 150 155 160
Leu Pro Val Val Glu Ala Arg Trp Leu Pro Ser Trp Cys Ala Pro Arg
165 170 175
Trp Ser Ala Gly Gln Pro Arg Ser Gly Gly Ser Leu Val Val Trp Val
180 185 190
Xaa Cys Val Ser Trp Arg Thr Xaa Met Ser Trp Ala Leu Xaa Gly Leu
195 200 205
Pro Cys Leu Trp Arg Cys Thr Ser Gln Gly Val Val Cys Arg Trp Tyr
210 215 220
Trp Trp Xaa Leu Arg Gly Asn Pro Gly Ala Thr Gln Arg Leu Arg Ala
225 230 235 240
Xaa Gly Val Leu Arg
245

(2) INFORMATION FOR SEQ ID NO:41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:41:

Ile Thr Thr Ala Ser Ser Met Leu Lys Ala Thr Asp Ser Arg Phe Thr
1 5 10 15
Val Val Ser Glu Gly Leu Pro Arg Leu Gly Cys Leu Leu Thr Trp Ser

Trp Pro Cys Trp Xaa Xaa Xaa Ser Trp Leu Arg Leu Glu Leu Pro Pro

Cys Leu Cys Trp Gln Cys Gly Gly Gly Xaa Met Glu His Leu Pro Pro


Leu Leu Leu Ser Tyr Thr Leu Leu Ser Arg Ser Pro Leu Lys Val Phe
His Cys Gly Leu Arg Pro Leu Phe Gln Leu His Leu Val Arg Ile Leu
Pro Pro Glu Ser Arg Thr Leu Pro Thr Met Leu Val Ala Thr Trp Trp
100 105 110
Gln Ala Xaa Arg Pro Gly Leu Arg Arg Ser Gly Val Leu Pro Met Met
115 120 125
Val Leu Arg Pro Ser Leu Val Thr Ser Gly Pro Arg Gly Ser Ser Cys
130 135 140
Glu Arg Ser Leu Pro Val Trp Pro Gly Cys Gln Met Leu Gly Leu Thr
145 150 155 160
Cys Arg Ser Ser Arg Arg Ala Gly Ser Arg Ala Gly Val Arg Pro Gly
165 170 175
Gly Arg Leu Gly Ser Pro Gly Val Val Val His Trp Leu Ser Gly Cys
180 185 190
Asp Val Cys Arg Gly Val Pro Glu Cys Pro Gly Leu Cys Glu Gly Cys
195 200 205
Arg Ala Cys Gly Asp Ala Leu Arg Lys Gly Ser Ser Ala Ala Gly Ile
210 215 220
Gly Gly Ser Cys Gly Val Thr Arg Glu Arg His Ser Val Leu Gly Leu
225 230 235 240
Glu Val Cys Phe Asp
245

(2) INFORMATION FOR SEQ ID NO:42:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 245 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:42:

Ser Gln Arg Leu His Pro Cys Xaa Arg Gln Gln Thr Ala Gly Leu Gln
1 5 10 15
Trp Ser Ala Lys Val Phe Arg Gly Leu Val Ala Tyr Xaa His Gly Pro


Gly Pro Val Gly Ser Asp Glu Val Gly Xaa Gly Xaa Ser Cys Pro Pro

Val Tyr Ala Gly Asn Val Val Val Val Glu Trp Ser Ile Cys Arg His

Tyr Cys Tyr His Thr Pro Phe Cys His Glu Val His Xaa Lys Cys Ser

Ile Val Asp Ser Ala His Cys Ser Asn Ser Ile Leu Ser Glu Phe Tyr

His Arg Ser Arg Gly Leu Tyr Leu Gln Cys Trp Leu Leu His Gly Gly
100 105 110
Arg Pro Asp Gly Arg Gly Ser Gly Gly Leu Gly Cys Cys Gln Xaa Trp
115 120 125
Cys Ser Gly Arg Arg Trp Xaa His Leu Ala Arg Val Ala Gln Ala Ala
130 135 140
Asn Val Arg Cys Arg Ser Gly Leu Val Val Lys Cys Trp Gly Leu Leu
145 150 155 160
Ala Gly Arg Arg Gly Ala Leu Ala Pro Glu Leu Val Cys Ala Pro Val
165 170 175
Val Gly Trp Ala Ala Gln Glu Trp Trp Phe Thr Gly Cys Leu Gly Val
180 185 190
Met Cys Val Val Ala Tyr Leu Asn Val Leu Gly Ser Val Arg Ala Ala
195 200 205
Val Leu Val Ala Met His Phe Ala Arg Gly Arg Leu Pro Leu Val Leu
210 215 220
Val Val Ala Ala Gly Xaa Pro Gly Ser Asp Thr Ala Ser Xaa Gly Leu
225 230 235 240
Arg Cys Ala Ser Ile
245
(2) INFORMATION FOR SEQ ID NO:43:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:43:
Asp His Cys Gly Arg His Leu Phe Arg Leu Ile Asp Xaa Xaa Ala Arg

1 5 10 15
Tyr Ala Gly Gly Gly Leu Gly Val Cys Gly Gly Xaa Xaa Gln Pro Leu

Asn Gly Thr Cys Phe Val Gln Val Leu Leu Trp Trp Pro Tyr Gly Phe

Pro Arg Trp Gly Ser Leu Gly Val Pro Pro Val Xaa Val Val Gly Arg

Val Asp Asn Xaa Leu Gly Glu Gln His His Leu Leu His Xaa Gly Gln

Arg Gly Leu Gln Ala Gly Gly Asp Xaa Gly Thr Ile Ile Leu Tyr Ser
Trp Arg Xaa Leu Leu Asp
100

(2) INFORMATION FOR SEQ ID NO:44:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:44:
Ile Thr Val Asp Ala Thr Cys Phe Asp Ser Ser Ile Asp Glu His Asp
1 5 10 15
Met Gln Val Glu Ala Ser Val Phe Ala Ala Ala Ser Asp Asn Pro Ser

Met Val His Ala Leu Cys Lys Tyr Tyr Ser Gly Gly Pro Met Val Ser

Pro Asp Gly Val Pro Leu Gly Tyr Arg Gln Cys Arg Ser Ser Gly Val

Leu Thr Thr Ser Ser Ala Asn Ser Ile Thr Cys Tyr Ile Lys Val Ser
Ala Ala Cys Arg Arg Val Gly Ile Lys Ala Pro Ser Phe Phe Ile Ala
90 95
Gly Asp Asp Cys Leu Ile
100

(2) INFORMATION FOR SEQ ID NO:45:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:45:

Ser Leu Trp Thr Pro Leu Val Ser Thr His Arg Leu Met Ser Thr Ile
1 5 10 15
Cys Arg Trp Arg Pro Arg Cys Leu Arg Arg Leu Val Thr Thr Pro Gln

Trp Tyr Met Leu Cys Ala Ser Thr Thr Leu Val Ala Leu Trp Phe Pro

Gln Met Gly Phe Pro Trp Gly Thr Ala Ser Val Gly Arg Arg Ala Cys

Xaa Gln Leu Ala Arg Arg Thr Ala Ser Leu Val Thr Leu Arg Ser Ala

Arg Pro Ala Gly Gly Trp Gly Leu Arg His His His Ser Leu Xaa Leu

Glu Met Ile Ala Xaa
100

(2) INFORMATION FOR SEQ ID NO:46:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:46:

Asp Gln Ala Ile Ile Ser Ser Tyr Lys Glu Xaa Trp Cys Leu Asn Pro
1 5 10 15
His Pro Pro Ala Gly Arg Ala Asp Leu Asn Val Thr Ser Asp Ala Val


Arg Arg Ala Ser Cys Gln His Ala Arg Arg Pro Thr Leu Ala Val Pro

Gln Gly Asn Pro Ile Trp Gly Asn His Arg Ala Thr Arg Val Val Leu

Ala Gln Ser Met Tyr His Xaa Gly Val Val Thr Ser Arg Arg Lys His

Arg Gly Leu His Leu His Ile Val Leu Ile Asn Arg Xaa Val Glu Thr
Ser Gly Val His Ser Asp
100

(2) INFORMATION FOR SEQ ID NO:47:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:47:

Ile Lys Gln Ser Ser Pro Ala Ile Lys Asn Asp Gly Ala Leu Ile Pro
1 5 10 15
Thr Arg Leu Gln Ala Ala Leu Thr Leu Met Xaa Gln Val Met Leu Phe

Ala Glu Leu Val Val Asn Thr Pro Asp Asp Leu His Trp Arg Tyr Pro

Lys Gly Thr Pro Ser Gly Glu Thr Ile Gly Pro Pro Glu Xaa Tyr Leu

His Lys Ala Cys Thr Ile Glu Gly Leu Ser Leu Ala Ala Ala Asn Thr

Glu Ala Ser Thr Cys Ile Ser Cys Ser Ser Ile Asp Glu Ser Lys Gln
Val Ala Ser Thr Val Ile
100
(2) INFORMATION FOR SEQ ID NO:48:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:48:

Ser Ser Asn His Leu Gln Leu Xaa Arg Met Met Val Pro Xaa Ser Pro
1 5 10 15
Pro Ala Cys Arg Pro Arg Xaa Pro Xaa Cys Asn Lys Xaa Cys Cys Ser

Pro Ser Xaa Leu Ser Thr Arg Pro Thr Thr Tyr Thr Gly Gly Thr Pro

Arg Glu Pro His Leu Gly Lys Pro Xaa Gly His Gln Ser Ser Thr Cys

Thr Lys His Val Pro Leu Arg Gly Cys His Xaa Pro Pro Gln Thr Pro

Arg Pro Pro Pro Ala Tyr Arg Ala His Gln Ser Met Ser Arg Asn Lys
Trp Arg Pro Gln Xaa
100
(2) INFORMATION FOR SEQ ID NO:49:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:49:

Asp Gln His Leu Val Thr Pro Ser Arg Asn His Arg Phe Pro Val Asp
1 5 10 15
Gln Leu Ser Thr Ala Xaa His Thr Ser Arg Val Pro Asn Asn Ser Thr

Ile Phe Leu Gly Tyr Ala Asn Arg Leu Lys Arg Lys Thr Pro Leu Ser

Gln Ala Gly Ser Thr Ala Pro Ala Ser Val Thr Gly Val Leu Val Glu


Glu Asp Ala Leu Leu Ala Gln Asp Ala His Gln Pro Arg Ala Gly Gln
Pro Leu Leu Ser Lys Ser Xaa Gly Val Gln His Pro Arg Gln Ala Arg
Glu Ile Trp Xaa Val Asn Gln Glu Tyr Phe Gln Asp Glu Ile Asn Asp
100 105 110
Ile Xaa Thr Ala Gln Thr Glu Tyr Glu Asp Asp Gly Asn Cys Gly Asn
115 120 125
Cys Leu Gly Glu Glu Pro Ser His Asn Gln Pro Ser Phe Pro Ala Arg
130 135 140
Leu Gln Arg Pro Lys Ala Pro Thr Gly Glu Leu Phe Thr His Arg Arg
145 150 155 160
Thr Leu Trp Xaa Leu Thr Ala His Leu Ala Tyr Gln Val Asn Leu Ala
165 170 175
Asp

(2) INFORMATION FOR SEQ ID NO:50:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:50:

Ile Asn Thr Ser Ser Pro Arg Leu Ala Thr Thr Gly Phe Pro Trp Thr
1 5 10 15
Asn Cys Pro Gln Pro Asn Thr Arg Ala Glu Ser Arg Thr Ile Ala Gln

Ser Ser Leu Val Met Leu Thr Gly Ser Ser Ala Lys Pro His Ser Arg

Lys Arg Ala Ala Pro Arg Leu Leu Val Xaa Pro Ala Cys Ser Xaa Arg

Arg Thr Pro Cys Leu Arg Arg Thr Pro Thr Ser Gln Glu Gln Ala Ser
Arg Ser Ser Ala Arg Ala Lys Glu Ser Ser Thr Arg Ala Lys Arg Ala
Arg Phe Gly Glu Leu Thr Lys Ser Thr Ser Lys Met Lys Ser Met Thr

100 105 110
Ser Lys Leu Leu Lys Gln Ser Met Lys Met Thr Glu Thr Val Ala Thr
115 120 125
Val Trp Gly Lys Asn Gln Ala Thr Thr Asn Gln Ala Phe Gln His Ala
130 135 140
Ser Asn Gly Gln Lys Leu Gln Pro Ala Ser Cys Ser Pro Thr Gly Glu
145 150 155 160
Pro Ser Gly Asn Xaa Arg Pro Thr Trp His Thr Lys Ser Ile Trp Leu
165 170 175
Ile

(2) INFORMATION FOR SEQ ID NO:51:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:51:
Ser Thr Pro Arg His Pro Val Ser Gln Pro Gln Val Ser Arg Gly Pro
1 5 10 15
Thr Val His Ser Leu Thr His Glu Gln Ser Pro Glu Gln Xaa His Asn

Leu Pro Trp Leu Cys Xaa Gln Ala Gln Ala Gln Asn Pro Thr Leu Ala

Ser Gly Gln His Arg Ala Cys Xaa Cys Asp Arg Arg Ala Arg Arg Gly

Gly Arg Pro Ala Cys Ala Gly Arg Pro Pro Ala Lys Ser Arg Pro Ala

Ala Pro Gln Gln Glu Leu Arg Ser Pro Ala Pro Ala Pro Ser Ala Arg

Asp Leu Val Ser Xaa Pro Arg Val Leu Pro Arg Xaa Asn Gln Xaa His
100 105 110
Leu Asn Cys Ser Asn Arg Val Xaa Arg Xaa Arg Lys Leu Trp Gln Leu
115 120 125
Phe Gly Gly Arg Thr Lys Pro Gln Pro Thr Lys Leu Ser Ser Thr Pro
130 135 140

Pro Thr Ala Lys Ser Ser Asn Arg Arg Val Val His Pro Pro Ala Asn
145 150 155 160
Pro Leu Val Ile Asp Gly Pro Pro Gly Ile Pro Ser Gln Ser Gly Xaa
165 170 175

(2) INFORMATION FOR SEQ ID NO:52:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:52:
Asp Gln Pro Asp Xaa Leu Gly Met Pro Gly Gly Pro Ser Ile Thr Arg
1 5 10 15
Gly Phe Ala Gly Gly Xaa Thr Thr Arg Arg Leu Glu Leu Leu Ala Val

Gly Gly Val Leu Glu Ser Leu Val Gly Cys Gly Leu Val Leu Pro Pro

Asn Ser Cys His Ser Phe Arg His Leu His Thr Leu Phe Glu Gln Phe

Arg Cys His Xaa Phe His Leu Gly Ser Thr Leu Gly Xaa Leu Thr Lys
Ser Arg Ala Leu Gly Ala Gly Ala Gly Leu Leu Ser Ser Cys Xaa Gly
Ala Ala Gly Leu Leu Leu Ala Gly Gly Arg Pro Ala Gln Ala Gly Arg
100 105 110
Pro Pro Leu Arg Ala Arg Arg Ser His Xaa Gln Ala Arg Cys Cys Pro
115 120 125
Leu Ala Arg Val Gly Phe Cys Ala Xaa Ala Cys Xaa His Asn Gln Gly
130 135 140
Arg Leu Cys Tyr Cys Ser Gly Leu Cys Ser Cys Val Arg Leu Trp Thr
145 150 155 160
Val Gly Pro Arg Glu Thr Cys Gly Cys Glu Thr Gly Xaa Arg Gly Val
165 170 175
Asp

(2) INFORMATION FOR SEQ ID NO:53:


(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 177 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:53:

Ile Ser Gln Ile Asp Leu Val Cys Gln Val Gly Arg Gln Leu Pro Glu
1 5 10 15
Gly Ser Pro Val Gly Glu Gln Leu Ala Gly Trp Ser Phe Trp Pro Leu

Glu Ala Cys Trp Lys Ala Trp Leu Val Val Ala Trp Phe Phe Pro Gln

Thr Val Ala Thr Val Ser Val Ile Phe Ile Leu Cys Leu Ser Ser Leu

Asp Val Ile Asp Phe Ile Leu Glu Val Leu Leu Val Asn Ser Pro Asn

Leu Ala Arg Leu Ala Arg Val Leu Asp Ser Leu Ala Leu Ala Glu Glu

Arg Leu Ala Cys Ser Trp Leu Val Gly Val Leu Arg Lys Gln Gly Val
100 105 110
Leu Leu Tyr Glu His Ala Gly His Thr Ser Arg Arg Gly Ala Ala Arg
115 120 125
Leu Arg Glu Trp Gly Phe Ala Leu Glu Pro Val Ser Ile Thr Lys Glu
130 135 140
Asp Cys Ala Ile Val Arg Asp Ser Ala Arg Val Leu Gly Cys Gly Gln
145 150 155 160
Leu Val His Gly Lys Pro Val Val Ala Arg Arg Gly Asp Glu Val Leu
165 170 175
Ile

(2) INFORMATION FOR SEQ ID NO:54:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 176 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:54:

Ser Ala Arg Leu Thr Trp Tyr Ala Arg Trp Ala Val Asn Tyr Gln Arg
1 5 10 15
Val Arg Arg Trp Val Asn Asn Ser Pro Val Gly Ala Phe Gly Arg Trp

Arg Arg Ala Gly Lys Leu Gly Trp Leu Trp Leu Gly Ser Ser Pro Lys

Gln Leu Pro Gln Phe Pro Ser Ser Ser Tyr Ser Val Xaa Ala Val Xaa

Met Ser Leu Ile Ser Ser Trp Lys Tyr Ser Trp Leu Thr His Gln Ile
Ser Arg Ala Trp Arg Gly Cys Trp Thr Pro Xaa Leu Leu Leu Arg Ser
Gly Trp Pro Ala Leu Gly Trp Trp Ala Ser Cys Ala Ser Arg Ala Ser
100 105 110
Ser Ser Thr Ser Thr Pro Val Thr Leu Ala Gly Ala Val Leu Pro Ala
115 120 125
Cys Glu Ser Gly Val Leu Arg Leu Ser Leu Leu Ala Xaa Pro Arg Lys
130 135 140
Ile Val Leu Leu Phe Gly Thr Leu Leu Val Cys Xaa Ala Val Asp Ser
145 150 155 160
Trp Ser Thr Gly Asn Leu Trp Leu Arg Asp Gly Val Thr Arg Cys Xaa
165 170 175

(2) INFORMATION FOR SEQ ID NO:55:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55:
Asp Pro Ser Xaa Gln Xaa Gln Leu Ser Gln Asp Ser Arg His Leu Gly
1 5 10 15
Asp Glu Leu Ile Phe Glu Glu Glu Ile Val Arg His His Arg Thr Ala


Trp His His Arg Gln Gln Ser Val Asn Pro Ile Leu Thr His Thr Leu

Phe Asp Arg Pro Glu Gln Gln Ala Gln Asn His Thr Gly His Arg Ser

Pro Arg Arg Gly Gln Ala Thr Asp Gln Ala Pro Ser Val Thr Arg Leu

Xaa Leu Pro Arg Gln Glu Val Glu Gly Glu Xaa Ala Arg Phe Thr Ala

Pro Ser Gln Pro Leu Ile
100

(2) INFORMATION FOR SEQ ID NO:56:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:56:
Ile His Leu Asp Asn Asp Asn Phe Arg Arg Thr Val Asp Thr Leu Val
1 5 10 15
Thr Asn Ser Ser Leu Arg Lys Lys Ser Ser Gly Ile Thr Glu Leu Arg

Gly Ile Ile Val Asn Asn Leu Leu Thr Gln Ser Xaa Pro Thr Pro Phe

Leu Thr Asp Gln Ser Asn Lys Pro Arg Thr Thr Pro Ala Thr Glu Ala

Pro Gly Glu Ala Arg Gln Leu Thr Arg His Gln Ala Ser Leu Ala Cys

Asn Phe Pro Ala Arg Arg Ser Lys Val Ser Glu Arg Gly Ser Pro Pro

Pro Pro Ser Leu Xaa
100

(2) INFORMATION FOR SEQ ID NO:57:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids

(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:
Ser Ile Leu Thr Met Thr Thr Phe Ala Gly Gln Xaa Thr Pro Trp Xaa
1 5 10 15
Arg Thr His Leu Xaa Gly Arg Asn Arg Gln Ala Ser Pro Asn Cys Val
20 25 30
Ala Ser Ser Ser Thr Ile Cys Xaa Pro Asn Leu Asp Pro His Pro Phe

Xaa Gln Thr Arg Ala Thr Ser Pro Glu Pro His Arg Pro Pro Lys Pro

Pro Glu Arg Pro Gly Asn Xaa Pro Gly Thr Lys Arg His Ser Leu Val

Thr Ser Pro Pro Gly Gly Arg Arg Xaa Val Ser Ala Val His Arg Pro
Leu Pro Ala Ser Asp
100

(2) INFORMATION FOR SEQ ID NO:58:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 102 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58:
Asp Gln Arg Leu Gly Gly Gly Gly Glu Pro Arg Ser Leu Thr Phe Asp
1 5 10 15
Leu Leu Ala Gly Lys Leu Gln Ala Ser Asp Ala Trp Cys Leu Val Ser

Cys Leu Ala Ser Pro Gly Ala Ser Val Ala Gly Val Val Leu Gly Leu

Leu Leu Trp Ser Val Lys Lys Gly Val Gly Gln Asp Trp Val Asn Arg

Leu Leu Thr Met Met Pro Arg Ser Ser Val Met Pro Asp Asp Phe Phe


Leu Lys Asp Glu Phe Val Thr Lys Val Ser Thr Val Leu Arg Lys Leu

Ser Leu Ser Arg Trp Ile
100

(2) INFORMATION FOR SEQ ID NO:59:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:59:
Ile Arg Gly Trp Glu Gly Ala Val Asn Arg Ala His Ser Pro Ser Thr
1 5 10 15
Ser Trp Arg Gly Ser Tyr Lys Arg Val Thr Leu Gly Ala Trp Ser Val

Ala Trp Pro Leu Arg Gly Leu Arg Trp Pro Val Trp Phe Trp Ala Cys

Cys Ser Gly Leu Ser Lys Arg Val Trp Val Lys Ile Gly Leu Thr Asp

Cys Xaa Arg Xaa Cys His Ala Val Arg Xaa Cys Leu Thr Ile Ser Ser

Ser Lys Met Ser Ser Ser Pro Arg Cys Leu Leu Ser Cys Glu Ser Cys

His Cys Gln Asp Gly
100

(2) INFORMATION FOR SEQ ID NO:60:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 101 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:60:
Ser Glu Ala Gly Arg Gly Arg Xaa Thr Ala Leu Thr His Leu Arg Pro
1 5 10 15
Pro Gly Gly Glu Val Thr Ser Glu Xaa Arg Leu Val Pro Gly Gln Leu

Pro Gly Leu Ser Gly Gly Phe Gly Gly Arg Cys Gly Ser Gly Leu Val

Ala Leu Val Cys Gln Lys Gly Cys Gly Ser Arg Leu Gly Xaa Gln Ile
50 55 60
Val Asp Asp Asp Ala Thr Gln Phe Gly Asp Ala Xaa Arg Phe Leu Pro

Gln Arg Xaa Val Arg His Gln Gly Val Tyr Cys Pro Ala Lys Val Val
Ile Val Lys Met Asp
100

(2) INFORMATION FOR SEQ ID NO:61:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61:

Asp His Pro His Pro Gly Trp Leu Ala Leu Ala Cys Leu Lys Ala Arg
1 5 10 15
Ser Thr Ile Arg Glu Arg Val Asp Arg Asp Val Val Thr Arg Xaa Pro

Pro Pro Ser Ile Asp Arg Thr Glu Ser Pro Thr Ile Gly Arg Thr Leu

Val Pro Arg Tyr Val Val Tyr Ile Thr Pro Phe Thr Gln Gln Pro Met

Glu Arg Val Val Glu Val Pro Arg Thr Thr Thr Phe Pro Xaa Cys Ser
Asp Glu Ser Leu Pro Val Met Glu Val Leu Thr Thr Pro Lys Asn Pro
Leu Pro Ala Xaa Xaa Ser Thr Thr Gly Ala Val Gly Thr Lys Pro Gly

100 105 110
Gly Arg Ser Asn Arg Leu Phe Thr Gln Leu Ile
115 120

(2) INFORMATION FOR SEQ ID NO:62:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:62:

Ile Thr His Thr Pro Val Gly Trp His Leu His Ala Xaa Arg Gln Glu
1 5 10 15
Ala Pro Leu Gly Ser Gly Xaa Thr Val Thr Ser Ser Leu Ala Asn His

His Arg Ala Leu Thr Gly Pro Lys Ala Pro Pro Xaa Ala Gly Arg Trp

Tyr His Gly Met Ser Cys Thr Ser Leu Arg Ser Arg Ser Ser Pro Trp

Asn Glu Leu Leu Lys Ser Gln Gly Pro Pro Arg Ser Arg Asp Val Arg

Thr Ser Pro Cys Leu Ser Trp Arg Ser Ser Gln Pro Arg Arg Ile Pro
90 95
Cys Gln Leu Asp Glu Ala Pro Arg Glu Gln Trp Glu Gln Ser Gln Ala
100 105 110
Glu Gly Arg Thr Asp Cys Ser His Asn Xaa
115 120

(2) INFORMATION FOR SEQ ID NO:63:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:63:

Ser Pro Thr Pro Arg Leu Val Gly Thr Cys Met Pro Glu Gly Lys Lys
1 5 10 15
His His Xaa Gly Ala Gly Arg Pro Xaa Arg Arg His Ser Leu Thr Thr

Thr Glu His Xaa Gln Asp Arg Lys Pro His His Arg Pro Asp Val Gly

Thr Thr Val Cys Arg Val His His Ser Val His Ala Ala Ala His Gly

Thr Ser Cys Xaa Ser Pro Lys Asp His His Val Pro Val Met Phe Gly
Arg Val Leu Ala Cys His Gly Gly Pro His Asn Pro Glu Glu Ser Leu
Ala Ser Leu Met Lys His His Gly Ser Ser Gly Asn Lys Ala Arg Arg
100 105 110
Lys Val Glu Pro Thr Val His Thr Thr Asp
115 120

(2) INFORMATION FOR SEQ ID NO:64:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 123 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:64:

Asp Gln Leu Cys Glu Gln Ser Val Arg Pro Ser Ala Trp Leu Cys Ser
1 5 10 15
His Cys Ser Arg Gly Ala Ser Ser Ser Trp Gln Gly Ile Leu Arg Gly

Cys Glu Asp Leu His Asp Arg Gln Gly Leu Val Arg Thr Ser Arg Glu

Arg Gly Gly Pro Trp Asp Phe Asn Asn Ser Phe His Gly Leu Leu Arg

Glu Arg Ser Asp Val His Asp Ile Pro Trp Tyr Gln Arg Pro Ala Tyr


Gly Gly Ala Phe Gly Pro Val Asn Ala Arg Trp Trp Leu Ala Ser Asp
Asp Val Thr Val Tyr Pro Leu Pro Asn Gly Ala Ser Cys Leu Gln Ala
100 105 110
Cys Lys Cys Gln Pro Thr Gly Val Trp Val Ile
115 120

(2) INFORMATION FOR SEQ ID NO:65:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:65:

Ile Ser Cys Val Asn Ser Arg Phe Asp Leu Pro Pro Gly Phe Val Pro
1 5 10 15
Thr Ala Pro Val Val Leu His Gln Ala Gly Lys Gly Phe Phe Gly Val

Val Arg Thr Ser Met Thr Gly Lys Asp Ser Ser Glu His His Gly Asn

Val Val Val Leu Gly Thr Ser Thr Thr Arg Ser Met Gly Cys Cys Val

Asn Gly Val Met Tyr Thr Thr Tyr Arg Gly Thr Asn Val Arg Pro Met
Val Gly Leu Ser Val Leu Ser Met Leu Gly Gly Gly Xaa Arg Val Thr
Thr Ser Arg Ser Thr Arg Ser Leu Met Val Leu Leu Ala Phe Arg His
100 105 110
Ala Ser Ala Asn Gln Pro Gly Cys Gly Xaa
115 120

(2) INFORMATION FOR SEQ ID NO:66:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 122 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:66:

Ser Val Val Xaa Thr Val Gly Ser Thr Phe Arg Leu Ala Leu Phe Pro
1 5 10 15
Leu Leu Pro Trp Cys Phe Ile Lys Leu Ala Arg Asp Ser Ser Gly Leu

Xaa Gly Pro Pro Xaa Gln Ala Arg Thr Arg Pro Asn Ile Thr Gly Thr

Trp Trp Ser Leu Gly Leu Gln Gln Leu Val Pro Trp Ala Ala Ala Xaa

Thr Glu Xaa Cys Thr Arg His Thr Val Val Pro Thr Ser Gly Leu Trp
Trp Gly Phe Arg Ser Cys Gln Cys Ser Val Val Val Ser Glu Xaa Arg
Arg His Gly Leu Pro Ala Pro Xaa Trp Cys Phe Leu Pro Ser Gly Met
100 105 110
Gln Val Pro Thr Asn Arg Gly Val Gly Asp
115 120

(2) INFORMATION FOR SEQ ID NO:67:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:67:
Asp Pro Ile Gln Gly Pro Ser Tyr Pro Ser Trp Gln Leu Xaa Lys Gly
1 5 10 15
Gln Pro Gly Met Leu Thr Met Leu Xaa Thr Pro Ala Leu Arg Thr Leu

Lys Gln Ile Thr Lys Lys Leu His Thr Tyr Cys Gln Ile Ser Arg Pro

Gln Ala Met Arg Pro Gln Ser Ser Ser Val Gly Val Trp Ser Gln Arg


Met Gln Ala Asp Met Ile Leu Gln Gly Val Asp Ala Ser Arg Met Pro

Ser Ile Phe Cys Gly Cys His Glu Trp Gln Xaa Ser Thr His Tyr Ile

Gln Cys Cys Ser Xaa Gln Ala Xaa Xaa Glu Val Cys Trp Ala Ser Asp
100 105 110

(2) INFORMATION FOR SEQ ID NO:68:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:68:

Ile Gln Ser Arg Gly Pro Arg Thr Pro Pro Gly Ser Cys Arg Lys Asp
1 5 10 15
Asn Gln Glu Cys Xaa Pro Cys Ser Glu Leu Gln Leu Xaa Gly His Xaa

Ser Lys Ser Gln Arg Asn Cys Thr His Thr Ala Lys Ser Leu Asp Pro

Lys Gln Xaa Gly Arg Asn His Pro Pro Ser Gly Cys Gly Ala Asn Gly

Cys Lys Leu Ile Xaa Tyr Ser Arg Gly Xaa Met Pro Pro Glu Cys Pro

Val Ser Ser Ala Asp Val Thr Ser Gly Asn Lys Val Leu Thr Thr Tyr

Ser Val Ala Pro Ser Lys His Ser Lys Lys Ser Val Gly Pro Val Ile
100 105 110

(2) INFORMATION FOR SEQ ID NO:69:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein


(xi) SEQUENCE DESCRIPTION: SEQ ID NO:69:

Ser Asn Pro Gly Ala Leu Val Pro Leu Leu Ala Ala Val Glu Arg Thr
1 5 10 15
Thr Arg Asn Val Asn His Ala Leu Asn Ser Ser Phe Lys Asp Ile Lys

Ala Asn His Lys Glu Ile Ala His Ile Leu Pro Asn Leu Xaa Thr Pro

Ser Asn Glu Ala Ala Ile Ile Leu Arg Arg Gly Val Glu Pro Thr Asp

Ala Ser Xaa Tyr Asp Thr Pro Gly Gly Arg Cys Leu Gln Asn Ala Gln
Tyr Leu Leu Arg Met Ser Arg Val Ala Ile Lys Tyr Ser Leu His Thr
Val Leu Leu Leu Ala Ser Ile Val Arg Ser Leu Leu Gly Gln Xaa
100 105 110

(2) INFORMATION FOR SEQ ID NO:70:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:70:

Asp His Trp Pro Asn Arg Leu Leu Thr Met Leu Ala Arg Ser Asn Thr
1 5 10 15
Val Cys Ser Glu Tyr Phe Ile Ala Thr Arg Asp Ile Arg Arg Arg Tyr

Trp Ala Phe Trp Arg His Leu Pro Pro Gly Val Ser Tyr Gln Leu Ala

Ser Val Gly Ser Thr Pro Arg Arg Arg Met Ile Ala Ala Ser Leu Leu

Gly Val Xaa Arg Phe Gly Ser Met Cys Ala Ile Ser Leu Xaa Phe Ala


Leu Met Ser Leu Lys Leu Glu Phe Arg Ala Trp Leu Thr Phe Leu Val

Val Leu Ser Thr Ala Ala Arg Arg Gly Thr Arg Ala Pro Gly Leu Asp
100 105 110

(2) INFORMATION FOR SEQ ID NO:71:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 112 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:71:
Ile Thr Gly Pro Thr Asp Phe Leu Leu Cys Leu Leu Gly Ala Thr Leu
1 5 10 15
Tyr Val Val Ser Thr Leu Leu Pro Leu Val Thr Ser Ala Glu Asp Thr

Gly His Ser Gly Gly Ile Tyr Pro Leu Glu Tyr His Ile Ser Leu His

Pro Leu Ala Pro His Pro Asp Gly Gly Xaa Leu Arg Pro His Cys Leu

Gly Ser Arg Asp Leu Ala Val Cys Val Gln Phe Leu Cys Asp Leu Leu

Xaa Cys Pro Xaa Ser Trp Ser Ser Glu His Gly Xaa His Ser Trp Leu

Ser Phe Leu Gln Leu Pro Gly Gly Val Arg Gly Pro Leu Asp Trp Ile
100 105 110

(2) INFORMATION FOR SEQ ID NO:72:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 111 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:72:

Ser Leu Ala Gln Gln Thr Ser Tyr Tyr Ala Cys Xaa Glu Gln His Cys
1 5 10 15
Met Xaa Xaa Val Leu Tyr Cys His Ser Xaa His Pro Gln Lys Ile Leu

Gly Ile Leu Glu Ala Ser Thr Pro Trp Ser Ile Ile Ser Ala Cys Ile

Arg Trp Leu His Thr Pro Thr Glu Asp Asp Cys Gly Leu Ile Ala Trp

Gly Leu Glu Ile Trp Gln Tyr Val Cys Asn Phe Phe Val Ile Cys Phe

Asn Val Leu Lys Ala Gly Val Gln Ser Met Val Asn Ile Pro Gly Cys

Pro Phe Tyr Ser Cys Gln Glu Gly Tyr Glu Gly Pro Trp Ile Gly
100 105 110
(2) INFORMATION FOR SEQ ID NO:73:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 795 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:73:

GATCAGGCCG CTGAGCGGCC GAGAAGGTTA CAATCTGGAG GGGTGATAGG AAGTATGACA 60
AGCATTATGA GG~l~lCG-~ GAGGCL~lCC T~-~AA~GC AGC~GC~ACG AAGTCTCATG 120
GCTGGACCTA TTCCCAGGCT ATAGCTAAAG TTAGGCGCCG AGCAGCCGCT G~-~TACGGCA 180
GCAAGGTGAC CGCCTCCACA TTGGCCACTG GTTGGCCTCA CGTG~-~ - ATGCTGGACA 240
AAATAGCCAG GGGACAGGAA ~llCClll ~A ~llll~l~AC CAAGCGAGAG Gll~ l 300
C~ CTAC CCGTAAGCCC CCAAGATTCA TA~l-~-CCC AC~.~.GGAC TTCAGGATAG 360
CTGAAAAGAT GATTCTGGGT GACCCCGG Q TCGTTGCAAA GTCAATTCTG GGTGACGCTT 420
A1~-1~11C~A GTACACGCCC AAT QGAGGG TCAAAGCTCT GGTTAAGGCG TGGGAGGGGA 480
AGTTGCATCC CGCTGCGATC ACCGTGkACG CCAull~lll CGACTCATCG ATTGATGAGC 540
ACr-~TGCA GGTGGAGGCT lCGGlGll.G CGGCGGCTAG T~-ACA~CCCC TCAATGGTAC 600
ATG~lll~lG CAAGTACTAC l~.G~lGGCC CTAlG~lllC CCCAGATGGG ~llCC~l~GG 660

GGTACCGCCA GTGTAGGTCG TCGGGCGTGT TGACAACTAG ~GGC~AAC AGCAT QCTT 720
GTTACATTAA GGTCAGCGCG GC~lG~AGGC GG~lGGG~AT TAAGGCAC Q TCAll~l.lA 780
TAGCTGGAGA TGATT 795

(2) INFORMATION FOR SEQ ID NO:74:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:74:

Asp Gln Ala Ala Glu Arg Pro Arg Arg Leu Gln Ser Gly Gly Val Ile
1 5 10 15
Gly Ser Met Thr Ser Ile Met Arg Leu Ser Leu Arg Leu Ser Xaa Lys

Arg Gln Pro Arg Arg Ser Leu Met Ala Gly Pro Ile Pro Arg Leu Xaa

Leu Lys Leu Gly Ala Glu Gln Pro Leu Asp Thr Ala Ala Arg Xaa Pro

Pro Pro His Trp Pro Leu Val Gly Leu Thr Trp Arg Arg Cys Trp Thr
Lys Xaa Pro Gly Asp Arg Lys Phe Leu Ser Leu Leu Xaa Pro Ser Glu
Arg Phe Ser Ser Pro Lys Leu Pro Val Ser Pro Gln Asp Ser Xaa Phe
100 105 110
Ser His Leu Trp Thr Ser Gly Xaa Leu Lys Arg Xaa Phe Trp Val Thr
115 120 125
Pro Ala Ser Leu Gln Ser Gln Phe Trp Val Thr Leu Ile Cys Ser Ser
130 135 140
Thr Arg Pro Ile Arg Gly Ser Lys Leu Trp Leu Arg Arg Gly Arg Gly
145 150 155 160
Ser Cys Ile Pro Leu Arg Ser Pro Xaa Thr Pro Leu Val Ser Thr His
165 170 175
Arg Leu Met Ser Thr Thr Cys Arg Trp Arg Leu Arg Cys Leu Arg Arg
180 185 190

Leu Val Thr Thr Pro Gln Trp Tyr Met Leu Cys Ala Ser Thr Thr Leu
195 200 205
Val Ala Leu Trp Phe Pro Gln Met Gly Phe Pro Trp Gly Thr Ala Ser
210 215 220
Val Gly Arg Arg Ala Cys Xaa Gln Leu Ala Arg Arg Thr Ala Ser Leu
225 230 235 240
Val Thr Leu Arg Ser Ala Arg Pro Ala Gly Gly Trp Gly Leu Arg His
245 250 255
His His Ser Leu Xaa Leu Glu Met Ile
260 265

(2) INFORMATION FOR SEQ ID NO:75:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 264 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:75:

Ile Arg Pro Leu Ser Gly Arg Glu Gly Tyr Asn Leu Glu Gly Xaa Xaa
1 5 10 15
Glu Val Xaa Gln Ala Leu Xaa Gly Cys Arg Xaa Gly Cys Pro Glu Lys

Gly Ser Arg Asp Glu Val Ser Trp Leu Asp Leu Phe Pro Gly Tyr Ser
35 40 45
Xaa Ser Xaa Ala Pro Ser Ser Arg Trp Ile Arg Gln Gln Gly Asp Arg

Leu His Ile Gly His Trp Leu Ala Ser Arg Gly Gly Asp Ala Gly Gln
Asn Ser Gln Gly Thr Gly Ser Ser Phe His Phe Cys Asp Gln Ala Arg
Gly Phe Leu Leu Gln Asn Tyr Pro Xaa Ala Pro Lys Ile His Ser Phe
100 105 110
Pro Thr Phe Gly Leu Gln Asp Ser Xaa Lys Asp Asp Ser Gly Xaa Pro
115 120 125
Arg His Arg Cys Lys Val Asn Ser Gly Xaa Arg Leu Ser Val Pro Val
130 135 140

His Ala Gln Ser Glu Gly Gln Ser Ser Gly Xaa Gly Val Gly Gly Glu
145 150 155 160
Val Ala Ser Arg Cys Asp His Arg Xaa Arg His Leu Phe Arg Leu Ile
165 170 175
Asp Xaa Xaa Ala Arg His Ala Gly Gly Gly Phe Gly Val Cys Gly Gly
180 185 190
Xaa Xaa Gln Pro Leu Asn Gly Thr Cys Phe Val Gln Val Leu Leu Trp
195 200 205
Trp Pro Tyr Gly Phe Pro Arg Trp Gly Ser Leu Gly Val Pro Pro Val
210 215 220
Xaa Val Val Gly Arg Val Asp Asn Xaa Leu Gly Glu Gln His His Leu
225 230 235 240
Leu His Xaa Gly Gln Arg Gly Leu Gln Ala Gly Gly Asp Xaa Gly Thr
245 250 255
Ile Ile Leu Tyr Ser Trp Arg Xaa
260

(2) INFORMATION FOR SEQ ID NO:76:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 264 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:76:

Ser Gly Arg Xaa Ala Ala Glu Lys Val Thr Ile Trp Arg Gly Asp Arg
1 5 10 15
Lys Tyr Asp Lys His Tyr Glu Ala Val Val Glu Ala Val Leu Lys Lys

Ala Ala Ala Thr Lys Ser His Gly Trp Thr Tyr Ser Gln Ala Ile Ala

Lys Val Arg Arg Arg Ala Ala Ala Gly Tyr Gly Ser Lys Val Thr Ala

Ser Thr Leu Ala Thr Gly Trp Pro His Val Glu Glu Met Leu Asp Lys

Ile Ala Arg Gly Gln Glu Val Pro Phe Thr Phe Val Thr Lys Arg Glu


Val Phe Phe Ser Lys Thr Thr Arg Lys Pro Pro Arg Phe Ile Val Phe
100 105 110
Pro Pro Leu Asp Phe Arg Ile Ala Glu Lys Met Ile Leu Gly Asp Pro
115 120 125
Gly Ile Val Ala Lys Ser Ile Leu Gly Asp Ala Tyr Leu Phe Gln Tyr
130 135 140
Thr Pro Asn Gln Arg Val Lys Ala Leu Val Lys Ala Trp Glu Gly Lys
145 150 155 160
Leu His Pro Ala Ala Ile Thr Val Xaa Ala Thr Cys Phe Asp Ser Ser
165 170 175
Ile Asp Glu His Asp Met Gln Val Glu Ala Ser Val Phe Ala Ala Ala
180 185 190
Ser Asp Asn Pro Ser Met Val His Ala Leu Cys Lys Tyr Tyr Ser Gly
195 200 205
Gly Pro Met Val Ser Pro Asp Gly Val Pro Leu Gly Tyr Arg Gln Cys
210 215 220
Arg Ser Ser Gly Val Leu Thr Thr Ser Ser Ala Asn Ser Ile Thr Cys
225 230 235 240
Tyr Ile Lys Val Ser Ala Ala Cys Arg Arg Val Gly Ile Lys Ala Pro
245 250 255
Ser Phe Phe Ile Ala Gly Asp Asp
260

(2) INFORMATION FOR SEQ ID NO:77:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 265 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:77:

Asn His Leu Gln Leu Xaa Arg Met Met Val Pro Xaa Ser Pro Pro Ala
1 5 10 15
Cys Arg Pro Arg Xaa Pro Xaa Cys Asn Lys Xaa Cys Cys Ser Pro Ser
Xaa Leu Ser Thr Arg Pro Thr Thr Tyr Thr Gly Gly Thr Pro Arg Glu


Pro His Leu Gly Lys Pro Xaa Gly His Gln Ser Ser Thr Cys Thr Lys

His Val Pro Leu Arg Gly Cys His Xaa Pro Pro Gln Thr Pro Lys Pro
65 70 75 80
Pro Pro Ala Cys Arg Ala His Gln Ser Met Ser Arg Asn Lys Trp Arg

Xaa Arg Xaa Ser Gln Arg Asp Ala Thr Ser Pro Pro Thr Pro Xaa Pro
100 105 110
Glu Leu Xaa Pro Ser Asp Trp Ala Cys Thr Gly Thr Asp Lys Arg His
115 120 125
Pro Glu Leu Thr Leu Gln Arg Cys Arg Gly His Pro Glu Ser Ser Phe
130 135 140
Gln Leu Ser Xaa Ser Pro Lys Val Gly Lys Leu Xaa Ile Leu Gly Ala
145 150 155 160
Tyr Gly Xaa Phe Trp Arg Arg Lys Pro Leu Ala Trp Ser Gln Lys Xaa
165 170 175
Lys Glu Leu Pro Val Pro Trp Leu Phe Cys Pro Ala Ser Pro Pro Arg
180 185 190
Glu Ala Asn Gln Trp Pro Met Trp Arg Arg Ser Pro Cys Cys Arg Ile
195 200 205
Gln Arg Leu Leu Gly Ala Xaa Leu Xaa Leu Xaa Pro Gly Asn Arg Ser
210 215 220
Ser His Glu Thr Ser Ser Arg Leu Pro Phe Ser Gly Gln Pro Gln Arg
225 230 235 240
Gln Pro His Asn Ala Cys His Thr Ser Tyr His Pro Ser Arg Leu Xaa
245 250 255
Pro Ser Arg Pro Leu Ser Gly Leu Ile
260 265

(2) INFORMATION FOR SEQ ID NO:78:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 264 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:78:

Ile Ile Ser Ser Tyr Lys Glu Xaa Trp Cys Leu Asn Pro His Pro Pro
1 5 10 15
Ala Gly Arg Ala Asp Leu Asn Val Thr Ser Asp Ala Val Arg Arg Ala

Ser Cys Gln His Ala Arg Arg Pro Thr Leu Ala Val Pro Gln Gly Asn

Pro Ile Trp Gly Asn His Arg Ala Thr Arg Val Val Leu Ala Gln Ser

Met Tyr His Xaa Gly Val Val Thr Ser Arg Arg Lys His Arg Ser Leu
His Leu His Val Val Leu Ile Asn Arg Xaa Val Glu Thr Ser Gly Val
His Gly Asp Arg Ser Gly Met Gln Leu Pro Leu Pro Arg Leu Asn Gln
100 105 110
Ser Phe Asp Pro Leu Ile Gly Arg Val Leu Glu Gln Ile Ser Val Thr
115 120 125
Gln Asn Xaa Leu Cys Asn Asp Ala Gly Val Thr Gln Asn His Leu Phe
130 135 140
Ser Tyr Pro Glu Val Gln Arg Trp Glu Asn Tyr Glu Ser Trp Gly Leu
145 150 155 160
Thr Gly Ser Phe Gly Glu Glu Asn Leu Ser Leu Gly His Lys Ser Glu
165 170 175
Arg Asn Phe Leu Ser Pro Gly Tyr Phe Val Gln His Leu Leu His Val
180 185 190
Arg Pro Thr Ser Gly Gln Cys Gly Gly Gly His Leu Ala Ala Val Ser
195 200 205
Ser Gly Cys Ser Ala Pro Asn Phe Ser Tyr Ser Leu Gly Ile Gly Pro
210 215 220
Ala Met Arg Leu Arg Arg Gly Cys Leu Phe Gln Asp Ser Leu Asn Asp
225 230 235 240
Ser Leu Ile Met Leu Val Ile Leu Pro Ile Thr Pro Pro Asp Cys Asn
245 250 255
Leu Leu Gly Arg Ser Ala Ala Xaa
260

(2) INFORMATION FOR SEQ ID NO:79:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 264 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:79:

Ser Ser Pro Ala Ile Lys Asn Asp Gly Ala Leu Ile Pro Thr Arg Leu
1 5 10 15
Gln Ala Ala Leu Thr Leu Met Xaa Gln Val Met Leu Phe Ala Glu Leu

Val Val Asn Thr Pro Asp Asp Leu His Trp Arg Tyr Pro Lys Gly Thr

Pro Ser Gly Glu Thr Ile Gly Pro Pro Glu Xaa Tyr Leu His Lys Ala

Cys Thr Ile Glu Gly Leu Ser Leu Ala Ala Ala Asn Thr Glu Ala Ser

Thr Cys Met Ser Cys Ser Ser Ile Asp Glu Ser Lys Gln Val Ala Xaa

Thr Val Ile Ala Ala Gly Cys Asn Phe Pro Ser His Ala Leu Thr Arg
100 105 110
Ala Leu Thr Leu Xaa Leu Gly Val Tyr Trp Asn Arg Xaa Ala Ser Pro
115 120 125
Arg Ile Asp Phe Ala Thr Met Pro Gly Ser Pro Arg Ile Ile Phe Ser
130 135 140
Ala Ile Leu Lys Ser Lys Gly Gly Lys Thr Met Asn Leu Gly Gly Leu
145 150 155 160
Arg Val Val Leu Glu Lys Lys Thr Ser Arg Leu Val Thr Lys Val Lys
165 170 175
Gly Thr Ser Cys Pro Leu Ala Ile Leu Ser Ser Ile Ser Ser Thr Xaa
180 185 190
Gly Gln Pro Val Ala Asn Val Glu Ala Val Thr Leu Leu Pro Tyr Pro
195 200 205
Ala Ala Ala Arg Arg Leu Thr Leu Ala Ile Ala Trp Glu Xaa Val Gln
210 215 220
Pro Xaa Asp Phe Val Ala Ala Ala Phe Phe Arg Thr Ala Ser Thr Thr
225 230 235 240
Ala Ser Xaa Cys Leu Ser Tyr Phe Leu Ser Pro Leu Gln Ile Val Thr
245 250 255

Phe Ser Ala Ala Gln Arg Pro Asp
260

(2) INFORMATION FOR SEQ ID NO:80:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 4268 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:80:

~aG~-~ATCC CACAGGCTCC ATACACCCAA TAACC~.-~A CGCGGCTAAT GACCAGGACA 60
TCTATCAACC ACCATGTGGA G~-~GGG.CCC TTA~C~G CTCTTGCGGG ~A~Ar~AAGG 120
GGTATCTGGT AArACG~TG GGGTCATTGG TTGAGGTCAA CAAATCCGAT GACCCTTATT 180
G~ .G CGGGGCCCTT CCCATGGCTG TTGCCAAGGG -l.cl,cAGGT GCCCCGATTC 240
TGTGCTCCTC CGGGCATGTT ATTGGGATGT TCACCGCTGC TA~AATTcT GGCGGl~cAG 300
TCGGCCAGAT TAGGGTTAGG CC~G~ G~G~ ATA CCATCCCCAG TA~ArAqCAC 360
ATGCCACT TGATACAAAA CCTACTGTGC CTA~C~AGTA TTCAGTGCAA ATTTTAATTG 420
CCCC~A~TGG CAGCGGCAAG TCAACCAAAT TACCACTTTC TTACATGCAG GRGAAGYATG 480
AG~ G~. CCTAAATCCC A~-~-GG~-A CAACAGCATC AATGCCAAAG TACATGCACG 540
CGACGTACGG CGTGAATCCA AATTGCTATT TTAATGGCAA ATGTACCAAC ACAGGGGCTT 600
QCTTACGTA QGrA~ATAT GGCATGTACC TGACCGGACG A.~;~-CCCGG AACTATGATG 660
TAATCATTTG T~-~C~ATGC CATGCTACCG ATCGAACCAC C~.~GGGC ATTG~AAA~G 720
TCCTAA~C~A AGCTCCATCC AAAAATGTTA GGCTAGTGGT TCTTGCCACG GCTACCCCCC 780
CTGGAGTAAT CCCTACAC Q CATGCCAACA TAACTGAGAT TCAATTAACY GATGAAGGCA 840
CTA,CCC~,l TCATGGAAA~A AAGATTAAGG AGGAA~TCT ~AA~AAGGG AGACACCTTA 900
~---~AGGC TAC~AAAA CACTGTGATG AG~G~-AA CGAGTTAGCT CGAAAGGGAA 960
TAACAGCTGT CTCTTACTAT AGGGGATGTG ACATCTCAAA AATGCCTGAG GGCGACTGTG 1020
TAGTAGTTGC CACTGATGCC -~.~.ACAG GGTACACTGG TGACTTTGAT ~CC~lATG 1080
ACTCr~Gr~T CATGr,T~AA ~ A~ATGCC ATGTTCACCT T~CCTA~T TTÇACCATGG 1140

lC~l~l ~laCGGG~ll TCAGCAATAG TTAAAGGCCA GCGTAGGGGC CGCACAGGCC 1200
GTGGr-A~AGC TGGr~TATAC TACTATGTAG ACGGGAGTTG TACCCCllCG GGTATGGTTC 1260
CTGAATGCAA CA.l~l.~AA GC~.C~ACG CAGCCAAGGC ATGGTATGGT TTGTCATCAA 1320
CAGAAGCTCA AACTATTCTG GACACCTATC GCACCCAACC lGG~l-ACCT GCGATAGGAG 1380
CAAATTTGGA CGA~lGGG~-- GAl~ . CTATGGTCAA CCCCGAACCT TCALll~l~A 1440
ATACTGCAAA AAGAACTGCT GACAATTATG llll~ll~AC TGCAGCCCAA CTACAACTGT 1500
GTCATCAGTA TGGCTATGCT G~-lCC~AATG ACGCACCACG GTGGCAGGGA GCCCGGCTTG 1560
Gr~AAAAAArC ll`-lGGG~l. ~ .aGCG~-. TGGACGGCTG TGACGCC.~l CCTGGCCCAG 1620
AGCCCAGCGA GGTGACCAGA TACCAAATGT GCTTCACTGA AGTCAATACT I~lGG~ACAG 1680
CCGCACTCGC l~llGGC~-l GGAGTGGCTA TGGCTTATCT AGCCATTGAC A~-l.llGGCG 1740
CCA~-ll~.~. GC~GC~..GC .~.~-ATTA CATCAGTCCC TACC6~-GCl A~-l~lCGCCC 1800
CA~G~ A Cr-~Ar-Ar,GAA A.C~.G4AGG AGTGTGCATC ATTCATTCCC TTGGAGGCCA 1860
TGGTTGCTGC AATTGACAAG CTGAAGAGTA CAATCACCAC AACTAGTCCT TTCACATTGG 1920
AAACCGCC--. TGAAAAACTT AACACCTTTC TTGGGCCTCA TGCAGCTACA ATCCTTGCTA 1980
TC~TAr-A~,TA 1lG~ GC TTAGTCACTT TACCTGACAA TCCCTTTGCA TCATGCGTGT 2040
TTG~-lTTCAT TGCGGGTATT ACTACCCCAC TACCTCACAA GATCAAAATG llC~-l~.CAT 2100
TA m GGAGG CGCAATTGCG TCCAAGCTTA CAGACGCTAG AGRCGCACTG GC~.l~ATGA 2160
TGGCCGGGGC TGYGGGAACA G~-..-llG~lA CATGGACATC G~lG~l.11 ~l~lllaACA 2220
TGCTAGGCGG CTA.~-.GGC GCCTCATCCA .~---GCTT GACA m AAA .G~-.laATGG 2280
GTGAGTGGCY CACTATGGAT CAGCTTGCTG G m AGTCTA ~.CCGC~.lC AATCCGGCCG 2340
CAGGAGTTGT GGGC~l~llG TCAGCTTGTG CAAl~ll.GC m GACAACA GCAGGGCCAG 2400
ATCACTGGCC r~Ar~A T CTTACTATGC TTGCTAGGAG CAACACTGTA TGTARTGAGT 2460
ACTTTATTGC CA~ ~l~AC ATCCGCAGGA AGATACTGGG CAll~lGGAG GCATCTACCC 2520
CCTGGAGTRT CATATCAGCT TGCATCCGTT GGCTYCACAC CCCGACGGAG GATGATTGCG 2580
GCCTCATTGC llGGG~l~lA RAGA m GGC AGTATGTGTG CAAll.~l.l GTGATTTGCT 2640
TTAATGTCCT TAAAGCTGGA GTTCAGAGCA TGGTTAACAT lC~l~ll~l C~-lll~lACA 2700
G~-.GC~AGAA GGGGTACAAG GGCCC~-.G~A TTGGATCAGG TATGCTCCAA GCACGCTGTC 2760
CAlGC~GlGC TGAACTCATC l.ll~-l~llG AGAATGG m TGrAAAACTT TA~AAr~r-~C 2820
Cr~ CTTG TTCAAATTAC TGr~rAGGGG ~ llC AGT CAACGCTAGG ~ .~lGG~- 2880



CGGCTAGACC GGACC QACT GATTGGACTA ~l~ C~l CAATTATGGC GTTAGGGACT 2940
ACTGTAAATA T~-A~AAATTG GGAGAT Q CA ~ ~llAC AGCAGTATCC TCTCCAAATG 3000
l~l~lll~AC CCAGG~GCCC C QACCTTGA GAGCTGCAGT GGCC~AC CGCGTACAGG 3060
TTCAGYGTTA TCTAGGTGAG CCCAAAACTC CTTGGACGAC Al~lG~llGC TGTTACGGTC 3120
CTGACGGTAA GGGTAAAACT GTTAAGCTTC C~-llCCGCGT T~-~G~-A~AC ACAC~lGGlG 3180
GTCG QTGCA ACTTAA m G CGTGATCGAC TTGAGG QAA TGACTGTAAT TCr~TAAA~A 3240
ACACTCCTAG TGATGAAGCC G Q~l~.CCG ~ AAArAG~A~. llGCGGC~lA 3300
CAAACCAATT GCTTGAGG Q A m CAGCTG GC~l-~ACAC CACCAAACTG CCAGCCCC~-l 3360
CCCAGATCGA AGAGGTAGTG GTAAC~AAGC GCCAGTTCCG GG~AA~AACT G~11CG~11A 3420
C~llGC~-lCC CC~-lCC~AGA .CC~lCC~AG GAGTGTCATG TCCTGAAAGC CTG~AA~-AA 3480
GTGACCCGTT AGAAGGTCCT T QAMCCTCC ~ ~ACC AC~ ~ Q~GGC~-A 3540
TGCCGATGCC C~ GG~A GCAGGTGAGT GTAACCCTTT CACTGCAATT GGATGTGCAA 3600
TGACCGAAAC ARGYGGAGKC CC~5P~RATT TACCCAGTTA CCCTCCCAAA AAGGAGGTCT 3660
CTGAATGGTC A~-~AAAGT TGGTCAACGA CTA~ACCGC TTCCAGCTAC GTTACTGGCC 3720
CCCC~1ACCC TAA~ATACGG GGCAAGGATT C QCTCAATC AGCCACCGCC AAACGGCClA 3780
QAAAAA~A~ GG';AAAG AGTGAGTTTT CGTGCAGCAT GAGCTACACT TGGACCGACG 3840
TGATTAGCTT CAAAACTGCT TCTAAAGTTC TGTCTGCAAC ~CGGGC~ATC ACTAGTGGTT 3900
TCCTCAAACA AAGATCATTG GTGTATGTGA CTGAGCCGCG GGATGCGGAG CTTAr~AAA~C 3960
AAAAAGT QC TATTAATA~A CAAC~l~l lCCCCC~ATC ATACCArAAG CAAGTGAGAT 4020
TGGCTAAGGA AAAAGCTTCA AAA~ll~lCG GTGT QTGTG GGACTATGAT GAAGTAGCAG 4080
CTCACACGCC CTCTAAGTCT GCTAAGTCCC A QTCACTGG C~-,,~GGGC ACTGATGTTC 4140
TGGACTTGCA GAA~l~lG~C GAGG QGGTG AGATACCGAG TCATTATCGG QAACTGTGA 4200
TAGTTCCAAA GGAGGAGGTC ll~lGAAGA CCCCC~AGAA AC~AAr~AAG AAACCCCCAA 4260
GGCTTATC 4268

(2) INFORMATION FOR SEQ ID NO:81:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:81:

Trp Leu Ile Pro Gln Ala Pro Tyr Thr Gln Xaa Pro Leu Thr Arg Leu
1 5 10 15
Met Thr Arg Thr Ser Ile Asn His His Val Glu Leu Gly Pro Leu Leu
20 25 30
Gly Ala Leu Ala Gly Arg Pro Arg Gly Ile Trp Xaa His Asp Trp Gly

His Trp Leu Arg Ser Thr Asn Pro Met Thr Leu Ile Gly Val Cys Ala

Gly Pro Phe Pro Trp Leu Leu Pro Arg Val Leu Gln Val Pro Arg Phe
Cys Ala Pro Pro Gly Met Leu Leu Gly Cys Ser Pro Leu Leu Glu Ile
Leu Ala Val Gln Ser Ala Arg Leu Gly Leu Gly Arg Trp Cys Val Leu
100 105 110
Asp Thr Ile Pro Ser Thr Gln His Met Pro Leu Leu Ile Gln Asn Leu
115 120 125
Leu Cys Leu Thr Ser Ile Gln Cys Lys Phe Xaa Leu Pro Pro Leu Ala
130 135 140
Ala Ala Ser Gln Pro Asn Tyr His Phe Leu Thr Cys Arg Xaa Ser Met
145 150 155 160
Arg Ser Trp Ser Xaa Ile Pro Val Trp Leu Gln Gln His Gln Cys Gln
165 170 175
Ser Thr Cys Thr Arg Arg Thr Ala Xaa Ile Gln Ile Ala Ile Leu Met
180 185 190
Ala Asn Val Pro Thr Gln Gly Leu His Leu Arg Thr Ala His Met Ala
195 200 205
Cys Thr Xaa Pro Asp Asp Val Pro Gly Thr Met Met Xaa Ser Phe Val
210 215 220
Thr Asn Ala Met Leu Pro Ile Glu Pro Pro Cys Trp Ala Leu Glu Arg
225 230 235 240
Ser Xaa Pro Lys Leu His Pro Lys Met Leu Gly Xaa Trp Phe Leu Pro
245 250 255
Arg Leu Pro Pro Leu Glu Xaa Ser Leu His His Met Pro Thr Xaa Leu

260 265 270
Arg Phe Asn Xaa Xaa Met Lys Ala Leu Ser Pro Phe Met Glu Lys Arg
275 280 285
Leu Arg Arg Lys Ile Xaa Arg Lys Gly Asp Thr Leu Ser Leu Arg Leu
290 295 300
Pro Lys Asn Thr Val Met Ser Leu Leu Thr Ser Xaa Leu Glu Arg Glu
305 310 315 320
Xaa Gln Leu Ser Leu Thr Ile Gly Asp Val Thr Ser Gln Lys Cys Leu
325 330 335
Arg Ala Thr Val Xaa Xaa Leu Pro Leu Met Pro Cys Val Gln Gly Thr
340 345 350
Leu Val Thr Leu Ile Pro Cys Met Thr Ala Ala Ser Trp Xaa Lys Ala
355 360 365
His Ala Met Leu Thr Leu Thr Leu Leu Ser Pro Trp Val Phe Val Cys
370 375 380
Ala Gly Phe Gln Gln Xaa Leu Lys Ala Ser Val Gly Ala Ala Gln Ala
385 390 395 400
Val Gly Glu Leu Ala Tyr Thr Thr Met Xaa Thr Gly Val Val Pro Leu
405 410 415
Arg Val Trp Phe Leu Asn Ala Thr Leu Leu Lys Pro Ser Thr Gln Pro
420 425 430
Arg His Gly Met Val Cys His Gln Gln Lys Leu Lys Leu Phe Trp Thr
435 440 445
Pro Ile Ala Pro Asn Leu Gly Tyr Leu Arg Xaa Glu Gln Ile Trp Thr
450 455 460
Ser Gly Leu Ile Ser Phe Leu Trp Ser Thr Pro Asn Leu His Leu Ser
465 470 475 480
Ile Leu Gln Lys Glu Leu Leu Thr Ile Met Phe Cys Xaa Leu Gln Pro
485 490 495
Asn Tyr Asn Cys Val Ile Ser Met Ala Met Leu Leu Pro Met Thr His
500 505 510
His Gly Gly Arg Glu Pro Gly Leu Gly Lys Asn Leu Val Gly Phe Cys
515 520 525
Gly Ala Trp Thr Ala Val Thr Pro Val Leu Ala Gln Ser Pro Ala Arg
530 535 540
Xaa Pro Asp Thr Lys Cys Ala Ser Leu Lys Ser Ile Leu Leu Gly Gln
545 550 555 560
Pro His Ser Leu Leu Ala Leu Glu Trp Leu Trp Leu Ile Xaa Pro Leu

565 570 575
Thr Leu Leu Ala Pro Leu Val Cys Gly Val Ala Gly Leu Leu His Gln
580 585 590
Ser Leu Pro Val Leu Leu Ser Pro Gln Trp Leu Thr Lys Arg Lys Ser
595 600 605
Trp Arg Ser Val His His Ser Phe Pro Trp Arg Pro Trp Leu Leu Gln
610 615 620
Leu Thr Ser Xaa Arg Val Gln Ser Pro Gln Leu Val Leu Ser His Trp
625 630 635 640
Lys Pro Pro Leu Lys Asn Leu Thr Pro Phe Leu Gly Leu Met Gln Leu
645 650 655
Gln Ser Leu Leu Ser Xaa Ser Ile Ala Val Ala Xaa Ser Leu Tyr Leu
660 665 670
Thr Ile Pro Leu His His Ala Cys Leu Leu Ser Leu Arg Val Leu Leu
675 680 685
Pro His Tyr Leu Thr Arg Ser Lys Cys Ser Cys His Tyr Leu Glu Ala
690 695 700
Gln Leu Arg Pro Ser Leu Gln Thr Leu Glu Xaa His Trp Arg Ser Xaa
705 710 715 720
Trp Pro Gly Leu Xaa Glu Gln Leu Leu Val His Gly His Arg Trp Val
725 730 735
Leu Ser Leu Thr Cys Xaa Ala Ala Met Leu Ala Pro His Pro Leu Leu
740 745 750
Ala Xaa His Leu Asn Ala Xaa Trp Val Ser Gly Xaa Leu Trp Ile Ser
755 760 765
Leu Leu Val Xaa Ser Thr Pro Arg Ser Ile Arg Pro Gln Glu Leu Trp
770 775 780
Ala Ser Cys Gln Leu Val Gln Cys Leu Leu Xaa Gln Gln Gln Gly Gln
785 790 795 800
Ile Thr Gly Pro Thr Asp Phe Leu Leu Cys Leu Leu Gly Ala Thr Leu
805 810 815
Tyr Val Xaa Ser Thr Leu Leu Pro Leu Val Thr Ser Ala Gly Arg Tyr
820 825 830
Trp Ala Phe Trp Arg His Leu Pro Pro Gly Val Ser Tyr Gln Leu Ala
835 840 845
Ser Val Gly Xaa Thr Pro Arg Arg Arg Met Ile Ala Ala Ser Leu Leu
850 855 860
Gly Val Xaa Arg Phe Gly Ser Met Cys Ala Ile Ser Leu Xaa Phe Ala

865 870 875 880
Leu Met Ser Leu Lys Leu Glu Phe Arg Ala Trp Leu Thr Phe Leu Val
885 890 895
Val Leu Ser Thr Ala Ala Arg Arg Gly Thr Arg Ala Pro Gly Leu Asp
900 905 910
Gln Val Cys Ser Lys His Ala Val His Ala Val Leu Asn Ser Ser Phe
915 920 925
Leu Leu Arg Met Val Leu Gln Asn Phe Thr Lys Asp Pro Glu Leu Val
930 935 940
Gln Ile Thr Gly Glu Gly Leu Phe Gln Ser Thr Leu Gly Cys Val Gly
945 950 955 960
Arg Leu Asp Arg Thr Gln Leu Ile Gly Leu Val Leu Ser Ser Ile Met
965 970 975
Ala Leu Gly Thr Thr Val Asn Met Arg Asn Trp Glu Ile Thr Phe Leu
980 985 990
Leu Gln Gln Tyr Pro Leu Gln Met Ser Val Ser Pro Arg Cys Pro Gln
995 1000 1005
Pro Xaa Glu Leu Gln Trp Pro Trp Thr Ala Tyr Arg Phe Ser Val Ile
1010 1015 1020
Xaa Val Ser Pro Lys Leu Leu Gly Arg His Leu Leu Ala Val Thr Val
1025 1030 1035 1040
Leu Thr Val Arg Val Lys Leu Leu Ser Phe Pro Ser Ala Leu Thr Asp
1045 1050 1055
Thr His Leu Val Val Ala Cys Asn Leu Ile Cys Val Ile Asp Leu Arg
1060 1065 1070
Gln Met Thr Val Ile Pro Xaa Thr Thr Leu Leu Val Met Lys Pro Gln
1075 1080 1085
Cys Pro Leu Leu Phe Ser Asn Arg Ser Cys Gly Val Gln Thr Asn Cys
1090 1095 1100
Leu Arg Gln Phe Gln Leu Ala Leu Thr Pro Pro Asn Cys Gln Pro Pro
1105 1110 1115 1120
Pro Arg Ser Lys Arg Xaa Trp Xaa Glu Ser Ala Ser Ser Gly Gln Glu
1125 1130 1135
Leu Val Arg Leu Pro Cys Leu Pro Leu Arg Asp Pro Ser Gln Glu Cys
1140 1145 1150
His Val Leu Lys Ala Cys Asn Glu Val Thr Arg Xaa Lys Val Leu Gln
1155 1160 1165
Xaa Ser Leu Leu His His Leu Phe Xaa Ser Trp Pro Cys Arg Cys Pro

1170 1175 1180
Cys Trp Glu Gln Val Ser Val Thr Leu Ser Leu Gln Leu Asp Val Gln
1185 1190 1195 1200
Xaa Pro Lys Gln Xaa Glu Xaa Xaa Xaa Ile Tyr Pro Val Thr Leu Pro
1205 1210 1215
Lys Arg Arg Ser Leu Asn Gly Gln Thr Lys Val Gly Gln Arg Leu Gln
1220 1225 1230
Pro Leu Pro Ala Thr Leu Leu Ala Pro Arg Thr Leu Arg Tyr Gly Ala
1235 1240 1245
Arg Ile Pro Leu Asn Gln Pro Pro Pro Asn Gly Leu Gln Lys Arg Ser
1250 1255 1260
Trp Glu Arg Val Ser Phe Arg Ala Ala Xaa Ala Thr Leu Gly Pro Thr
1265 1270 1275 1280
Xaa Leu Ala Ser Lys Leu Leu Leu Lys Phe Cys Leu Gln Leu Gly Pro
1285 1290 1295
Ser Leu Val Val Ser Ser Asn Lys Asp His Trp Cys Met Xaa Leu Ser
1300 1305 1310
Arg Gly Met Arg Ser Leu Glu Asn Lys Lys Ser Leu Leu Ile Asp Asn
1315 1320 1325
Leu Cys Ser Pro His His Thr Thr Ser Lys Xaa Asp Trp Leu Arg Lys
1330 1335 1340
Lys Leu Gln Lys Leu Ser Val Ser Cys Gly Thr Met Met Lys Xaa Gln
1345 1350 1355 1360
Leu Thr Arg Pro Leu Ser Leu Leu Ser Pro Thr Ser Leu Ala Phe Gly
1365 1370 1375
Ala Leu Met Phe Trp Thr Cys Arg Ser Val Ser Arg Gln Val Arg Tyr
1380 1385 1390
Arg Val Ile Ile Gly Lys Leu Xaa Xaa Phe Gln Arg Arg Arg Ser Ser
1395 1400 1405
Xaa Arg Pro Pro Arg Asn Gln Gln Arg Asn Pro Gln Gly Leu
1410 1415 1420

(2) INFORMATION FOR SEQ ID NO:82:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:82:

Gly Ser Ser His Arg Leu His Thr Pro Asn Asn Arg Xaa Arg Gly Xaa
1 5 10 15
Xaa Pro Gly His Leu Ser Thr Thr Met Trp Ser Trp Val Pro Tyr Ser

Val Leu Leu Arg Gly Asp Gln Gly Val Ser Gly Asn Thr Thr Gly Val

Ile Gly Xaa Gly Gln Gln Ile Arg Xaa Pro Leu Leu Val Cys Val Arg

Gly Pro Ser His Gly Cys Cys Gln Gly Phe Phe Arg Cys Pro Asp Ser
Val Leu Leu Arg Ala Cys Tyr Trp Asp Val His Arg Cys Xaa Lys Phe
Trp Arg Phe Ser Arg Pro Asp Xaa Gly Xaa Ala Val Gly Val Cys Trp
100 105 110
Ile Pro Ser Pro Val His Ser Thr Cys His Ser Xaa Tyr Lys Thr Tyr
115 120 125
Cys Ala Xaa Arg Val Phe Ser Ala Asn Phe Asn Cys Pro His Trp Gln
130 135 140
Arg Gln Val Asn Gln Ile Thr Thr Phe Leu His Ala Gly Glu Xaa Xaa
145 150 155 160
Gly Leu Gly Pro Lys Ser Gln Cys Gly Tyr Asn Ser Ile Asn Ala Lys
165 170 175
Val His Ala Arg Asp Val Arg Arg Glu Ser Lys Leu Leu Phe Xaa Trp
180 185 190
Gln Met Tyr Gln His Arg Gly Phe Thr Tyr Val Gln His Ile Trp His
195 200 205
Val Pro Asp Arg Thr Met Phe Pro Glu Leu Xaa Cys Asn His Leu Xaa
210 215 220
Arg Met Pro Cys Tyr Arg Ser Asn His Arg Val Gly His Trp Lys Gly
225 230 235 240
Pro Asn Arg Ser Ser Ile Gln Lys Cys Xaa Ala Ser Gly Ser Cys His
245 250 255
Gly Tyr Pro Pro Trp Ser Asn Pro Tyr Thr Thr Cys Gln His Asn Xaa
260 265 270
Asp Ser Ile Asn Xaa Xaa Arg His Tyr Pro Leu Ser Trp Lys Lys Asp

275 280 285
Xaa Gly Gly Lys Ser Glu Glu Arg Glu Thr Pro Tyr Leu Xaa Gly Tyr
290 295 300
Gln Lys Thr Leu Xaa Xaa Ala Cys Xaa Arg Val Ser Ser Lys Gly Asn
305 310 315 320
Asn Ser Cys Leu Leu Leu Xaa Gly Met Xaa His Leu Lys Asn Ala Xaa
325 330 335
Gly Arg Leu Cys Ser Ser Cys His Xaa Cys Leu Val Tyr Arg Val His
340 345 350
Trp Xaa Leu Xaa Phe Arg Val Xaa Leu Gln Pro His Gly Arg Arg His
355 360 365
Met Pro Cys Xaa Pro Xaa Pro Tyr Phe His His Gly Cys Ser Cys Val
370 375 380
Arg Gly Phe Ser Asn Ser Xaa Arg Pro Ala Xaa Gly Pro His Arg Pro
385 390 395 400
Trp Glu Ser Trp His Ile Leu Leu Cys Arg Arg Glu Leu Tyr Pro Phe
405 410 415
Gly Tyr Gly Ser Xaa Met Gln His Cys Xaa Ser Leu Arg Arg Ser Gln
420 425 430
Gly Met Val Trp Phe Val Ile Asn Arg Ser Ser Asn Tyr Ser Gly His
435 440 445
Leu Ser His Pro Thr Trp Val Thr Cys Asp Arg Ser Lys Phe Gly Arg
450 455 460
Val Gly Xaa Ser Leu Phe Tyr Gly Gln Pro Arg Thr Phe Ile Cys Gln
465 470 475 480
Tyr Cys Lys Lys Asn Cys Xaa Gln Leu Cys Phe Val Asp Cys Ser Pro
485 490 495
Thr Thr Thr Val Ser Ser Val Trp Leu Cys Cys Ser Gln Xaa Arg Thr
500 505 510
Thr Val Ala Gly Ser Pro Ala Trp Glu Lys Thr Leu Trp Gly Ser Val
515 520 525
Ala Leu Gly Arg Leu Xaa Arg Leu Ser Trp Pro Arg Ala Gln Arg Gly
530 535 540
Asp Gln Ile Pro Asn Val Leu His Xaa Ser Gln Tyr Phe Trp Asp Ser
545 550 555 560
Arg Thr Arg Cys Trp Arg Trp Ser Gly Tyr Gly Leu Ser Ser His Xaa
565 570 575
His Phe Trp Arg His Leu Cys Ala Ala Leu Leu Val Tyr Tyr Ile Ser

580 585 590
Pro Tyr Arg Cys Tyr Cys Arg Pro Ser Gly Xaa Arg Arg Gly Asn Arg
595 600 605
Gly Gly Val Cys Ile Ile His Ser Leu Gly Gly His Gly Cys Cys Asn
610 615 620
Xaa Gln Ala Glu Glu Tyr Asn His His Asn Xaa Ser Phe His Ile Gly
625 630 635 640
Asn Arg Pro Xaa Lys Thr Xaa His Leu Ser Trp Ala Ser Cys Ser Tyr
645 650 655
Asn Pro Cys Tyr His Arg Val Leu Leu Trp Leu Ser His Phe Thr Xaa
660 665 670
Gln Ser Leu Cys Ile Met Arg Val Cys Phe His Cys Gly Tyr Tyr Tyr
675 680 685
Pro Thr Thr Ser Gln Asp Gln Asn Val Pro Val Ile Ile Trp Arg Arg
690 695 700
Asn Cys Val Gln Ala Tyr Arg Arg Xaa Arg Arg Thr Gly Val His Asp
705 710 715 720
Gly Arg Gly Cys Gly Asn Ser Ser Trp Tyr Met Asp Ile Gly Gly Phe
725 730 735
Cys Leu Xaa His Ala Arg Arg Leu Cys Trp Arg Leu Ile His Cys Leu
740 745 750
Leu Asp Ile Xaa Met Leu Asp Gly Xaa Val Ala His Tyr Gly Ser Ala
755 760 765
Cys Trp Phe Ser Leu Leu Arg Val Gln Ser Gly Arg Arg Ser Cys Gly
770 775 780
Arg Leu Val Ser Leu Cys Asn Val Cys Phe Asp Asn Ser Arg Ala Arg
785 790 795 800
Ser Leu Ala Gln Gln Thr Ser Tyr Tyr Ala Cys Xaa Glu Gln His Cys
805 810 815
Met Xaa Xaa Val Leu Tyr Cys His Ser Xaa His Pro Gln Glu Asp Thr
820 825 830
Gly His Ser Gly Gly Ile Tyr Pro Leu Glu Xaa His Ile Ser Leu His
835 840 845
Pro Leu Ala Xaa His Pro Asp Gly Gly Xaa Leu Arg Pro His Cys Leu
850 855 860
Gly Ser Xaa Asp Leu Ala Val Cys Val Gln Phe Leu Cys Asp Leu Leu
865 870 875 880
Xaa Cys Pro Xaa Ser Trp Ser Ser Glu His Gly Xaa His Ser Trp Leu

885 890 895
Ser Phe Leu Gln Leu Pro Glu Gly Val Gln Gly Pro Leu Asp Trp Ile
900 905 910
Arg Tyr Ala Pro Ser Thr Leu Ser Met Arg Cys Xaa Thr His Leu Phe
915 920 925
Cys Xaa Glu Trp Phe Cys Lys Thr Leu Gln Arg Thr Gln Asn Leu Phe
930 935 940
Lys Leu Leu Glu Arg Gly Cys Ser Ser Gln Arg Xaa Ala Val Trp Val
945 950 955 960
Gly Xaa Thr Gly Pro Asn Xaa Leu Asp Xaa Ser Cys Arg Gln Leu Trp
965 970 975
Arg Xaa Gly Leu Leu Xaa Ile Xaa Glu Ile Gly Arg Ser His Phe Cys
980 985 990
Tyr Ser Ser Ile Leu Ser Lys Cys Leu Phe His Pro Gly Ala Pro Asn
995 1000 1005
Leu Glu Ser Cys Ser Gly Arg Gly Pro Arg Thr Gly Ser Xaa Leu Ser
1010 1015 1020
Arg Xaa Ala Gln Asn Ser Leu Asp Asp Ile Cys Leu Leu Leu Arg Ser
1025 1030 1035 1040
Xaa Arg Xaa Gly Xaa Asn Cys Xaa Ala Ser Leu Pro Arg Xaa Arg Thr
1045 1050 1055
His Thr Trp Trp Ser His Ala Thr Xaa Phe Ala Xaa Ser Thr Xaa Gly
1060 1065 1070
Lys Xaa Leu Xaa Phe His Lys Gln His Ser Xaa Xaa Xaa Ser Arg Ser
1075 1080 1085
Val Arg Ser Cys Phe Gln Thr Gly Val Ala Ala Tyr Lys Pro Ile Ala
1090 1095 1100
Xaa Gly Asn Phe Ser Trp Arg Xaa His His Gln Thr Ala Ser Pro Leu
1105 1110 1115 1120
Pro Asp Arg Arg Gly Ser Gly Lys Lys Ala Pro Val Pro Gly Lys Asn
1125 1130 1135
Trp Phe Ala Tyr Leu Ala Ser Pro Ser Glu Ile Arg Pro Arg Ser Val
1140 1145 1150
Met Ser Xaa Lys Pro Ala Thr Lys Xaa Pro Val Arg Arg Ser Phe Xaa
1155 1160 1165
Pro Pro Phe Phe Thr Thr Cys Ser Xaa Val Gly His Ala Asp Ala Pro
1170 1175 1180
Val Gly Ser Arg Xaa Val Xaa Pro Phe His Cys Asn Trp Met Cys Asn

1185 1190 1195 1200
Asp Arg Asn Xaa Xaa Xaa Pro Xaa Xaa Phe Thr Gln Leu Pro Ser Gln
1205 1210 1215
Lys Gly Gly Leu Xaa Met Val Arg Arg Lys Leu Val Asn Asp Tyr Asn
1220 1225 1230
Arg Phe Gln Leu Arg Tyr Trp Pro Pro Val Pro Xaa Asp Thr Gly Gln
1235 1240 1245
Gly Phe His Ser Ile Ser His Arg Gln Thr Ala Tyr Lys Lys Glu Val
1250 1255 1260
Gly Lys Glu Xaa Val Phe Val Gln His Glu Leu His Leu Asp Arg Arg
1265 1270 1275 1280
Asp Xaa Leu Gln Asn Cys Phe Xaa Ser Ser Val Cys Asn Ser Gly His
1285 1290 1295
His Xaa Trp Phe Pro Gln Thr Lys Ile Ile Gly Val Cys Asp Xaa Ala
1300 1305 1310
Ala Gly Cys Gly Ala Xaa Lys Thr Lys Ser His Tyr Xaa Xaa Thr Thr
1315 1320 1325
Ser Val Pro Pro Ile Ile Pro Gln Ala Ser Glu Ile Gly Xaa Gly Lys
1330 1335 1340
Ser Phe Lys Ser Cys Arg Cys His Val Gly Leu Xaa Xaa Ser Ser Ser
1345 1350 1355 1360
Ser His Ala Leu Xaa Val Cys Xaa Val Pro His His Trp Pro Ser Gly
1365 1370 1375
His Xaa Cys Ser Gly Leu Ala Glu Val Cys Arg Gly Arg Xaa Asp Thr
1380 1385 1390
Glu Ser Leu Ser Ala Asn Cys Asp Ser Ser Lys Gly Gly Gly Leu Arg
1395 1400 1405
Glu Asp Pro Pro Glu Thr Asn Lys Glu Thr Pro Lys Ala Tyr
1410 1415 1420

(2) INFORMATION FOR SEQ ID NO:83:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:83:

Ala His Pro Thr Gly Ser Ile His Pro Ile Thr Val Asp Ala Ala Asn
1 5 10 15
Asp Gln Asp Ile Tyr Gln Pro Pro Cys Gly Ala Gly Ser Leu Thr Arg

Cys Ser Cys Gly Glu Thr Lys Gly Tyr Leu Val Thr Arg Leu Gly Ser

Leu Val Glu Val Asn Lys Ser Asp Asp Pro Tyr Trp Cys Val Cys Gly

Ala Leu Pro Met Ala Val Ala Lys Gly Ser Ser Gly Ala Pro Ile Leu

Cys Ser Ser Gly His Val Ile Gly Met Phe Thr Ala Ala Arg Asn Ser
Gly Gly Ser Val Gly Gln Ile Arg Val Arg Pro Leu Val Cys Ala Gly
100 105 110
Tyr His Pro Gln Tyr Thr Ala His Ala Thr Leu Asp Thr Lys Pro Thr
115 120 125
Val Pro Asn Glu Tyr Ser Val Gln Ile Leu Ile Ala Pro Thr Gly Ser
130 135 140
Gly Lys Ser Thr Lys Leu Pro Leu Ser Tyr Met Gln Xaa Lys Xaa Glu
145 150 155 160
Val Leu Val Leu Asn Pro Ser Val Ala Thr Thr Ala Ser Met Pro Lys
165 170 175
Tyr Met His Ala Thr Tyr Gly Val Asn Pro Asn Cys Tyr Phe Asn Gly
180 185 190
Lys Cys Thr Asn Thr Gly Ala Ser Leu Thr Tyr Ser Thr Tyr Gly Met
195 200 205
Tyr Leu Thr Gly Arg Cys Ser Arg Asn Tyr Asp Val Ile Ile Cys Asp
210 215 220
Glu Cys His Ala Thr Asp Arg Thr Thr Val Leu Gly Ile Gly Lys Val
225 230 235 240
Leu Thr Glu Ala Pro Ser Lys Asn Val Arg Leu Val Val Leu Ala Thr
245 250 255
Ala Thr Pro Pro Gly Val Ile Pro Thr Pro His Ala Asn Ile Thr Glu
260 265 270
Ile Gln Leu Thr Asp Glu Gly Thr Ile Pro Phe His Gly Lys Lys Ile
275 280 285

Lys Glu Glu Asn Leu Lys Lys Gly Arg His Leu Ile Phe Glu Ala Thr
290 295 300
Lys Lys His Cys Asp Glu Leu Ala Asn Glu Leu Ala Arg Lys Gly Ile
305 310 315 320
Thr Ala Val Ser Tyr Tyr Arg Gly Cys Asp Ile Ser Lys Met Pro Glu
325 330 335
Gly Asp Cys Val Val Val Ala Thr Asp Ala Leu Cys Thr Gly Tyr Thr
340 345 350
Gly Asp Phe Asp Ser Val Tyr Asp Cys Ser Leu Met Val Glu Gly Thr
355 360 365
Cys His Val Asp Leu Asp Pro Thr Phe Thr Met Gly Val Arg Val Cys
370 375 380
Gly Val Ser Ala Ile Val Lys Gly Gln Arg Arg Gly Arg Thr Gly Arg
385 390 395 400
Gly Arg Ala Gly Ile Tyr Tyr Tyr Val Asp Gly Ser Cys Thr Pro Ser
405 410 415
Gly Met Val Pro Glu Cys Asn Ile Val Glu Ala Phe Asp Ala Ala Lys
420 425 430
Ala Trp Tyr Gly Leu Ser Ser Thr Glu Ala Gln Thr Ile Leu Asp Thr
435 440 445
Tyr Arg Thr Gln Pro Gly Leu Pro Ala Ile Gly Ala Asn Leu Asp Glu
450 455 460
Trp Ala Asp Leu Phe Ser Met Val Asn Pro Glu Pro Ser Phe Val Asn
465 470 475 480
Thr Ala Lys Arg Thr Ala Asp Asn Tyr Val Leu Leu Thr Ala Ala Gln
485 490 495
Leu Gln Leu Cys His Gln Tyr Gly Tyr Ala Ala Pro Asn Asp Ala Pro
500 505 510
Arg Trp Gln Gly Ala Arg Leu Gly Lys Lys Pro Cys Gly Val Leu Trp
515 520 525
Arg Leu Asp Gly Cys Asp Ala Cys Pro Gly Pro Glu Pro Ser Glu Val
530 535 540
Thr Arg Tyr Gln Met Cys Phe Thr Glu Val Asn Thr Ser Gly Thr Ala
545 550 555 560
Ala Leu Ala Val Gly Val Gly Val Ala Met Ala Tyr Leu Ala Ile Asp
565 570 575
Thr Phe Gly Ala Thr Cys Val Arg Arg Cys Trp Ser Ile Thr Ser Val
580 585 590

Pro Thr Gly Ala Thr Val Ala Pro Val Val Asp Glu Glu Glu Ile Val
595 600 605
Glu Glu Cys Ala Ser Phe Ile Pro Leu Glu Ala Met Val Ala Ala Ile
610 615 620
Asp Lys Leu Lys Ser Thr Ile Thr Thr Thr Ser Pro Phe Thr Leu Glu
625 630 635 640
Thr Ala Leu Glu Lys Leu Asn Thr Phe Leu Gly Pro His Ala Ala Thr
645 650 655
Ile Leu Ala Ile Ile Glu Tyr Cys Cys Gly Leu Val Thr Leu Pro Asp
660 665 670
Asn Pro Phe Ala Ser Cys Val Phe Ala Phe Ile Ala Gly Ile Thr Thr
675 680 685
Pro Leu Pro His Lys Ile Lys Met Phe Leu Ser Leu Phe Gly Gly Ala
690 695 700
Ile Ala Ser Lys Leu Thr Asp Ala Arg Xaa Ala Leu Ala Phe Met Met
705 710 715 720
Ala Gly Ala Xaa Gly Thr Ala Leu Gly Thr Trp Thr Ser Val Gly Phe
725 730 735
Val Phe Asp Met Leu Gly Gly Tyr Ala Gly Ala Ser Ser Thr Ala Cys
740 745 750
Leu Thr Phe Lys Cys Leu Met Gly Glu Trp Xaa Thr Met Asp Gln Leu
755 760 765
Ala Gly Leu Val Tyr Ser Ala Phe Asn Pro Ala Ala Gly Val Val Gly
770 775 780
Val Leu Ser Ala Cys Ala Met Phe Ala Leu Thr Thr Ala Gly Pro Asp
785 790 795 800
His Trp Pro Asn Arg Leu Leu Thr Met Leu Ala Arg Ser Asn Thr Val
805 810 815
Cys Xaa Glu Tyr Phe Ile Ala Thr Arg Asp Ile Arg Arg Lys Ile Leu
820 825 830
Gly Ile Leu Glu Ala Ser Thr Pro Trp Ser Xaa Ile Ser Ala Cys Ile
835 840 845
Arg Trp Leu His Thr Pro Thr Glu Asp Asp Cys Gly Leu Ile Ala Trp
850 855 860
Gly Leu Xaa Ile Trp Gln Tyr Val Cys Asn Phe Phe Val Ile Cys Phe
865 870 875 880
Asn Val Leu Lys Ala Gly Val Gln Ser Met Val Asn Ile Pro Gly Cys
885 890 895

Pro Phe Tyr Ser Cys Gln Lys Gly Tyr Lys Gly Pro Trp Ile Gly Ser
900 905 910
Gly Met Leu Gln Ala Arg Cys Pro Cys Gly Ala Glu Leu Ile Phe Ser
915 920 925
Val Glu Asn Gly Phe Ala Lys Leu Tyr Lys Gly Pro Arg Thr Cys Ser
930 935 940
Asn Tyr Trp Arg Gly Ala Val Pro Val Asn Ala Arg Leu Cys Gly Ser
945 950 955 960
Ala Arg Pro Asp Pro Thr Asp Trp Thr Ser Leu Val Val Asn Tyr Gly
965 970 975
Val Arg Asp Tyr Cys Lys Tyr Glu Lys Leu Gly Asp His Ile Phe Val
980 985 990
Thr Ala Val Ser Ser Pro Asn Val Cys Phe Thr Gln Val Pro Pro Thr
995 1000 1005
Leu Arg Ala Ala Val Ala Val Asp Arg Val Gln Val Gln Xaa Tyr Leu
1010 1015 1020
Gly Glu Pro Lys Thr Pro Trp Thr Thr Ser Ala Cys Cys Tyr Gly Pro
1025 1030 1035 1040
Asp Gly Lys Gly Lys Thr Val Lys Leu Pro Phe Arg Val Asp Gly His
1045 1050 1055
Thr Pro Gly Gly Arg Met Gln Leu Asn Leu Arg Asp Arg Leu Glu Ala
1060 1065 1070
Asn Asp Cys Asn Ser Ile Asn Asn Thr Pro Ser Asp Glu Ala Ala Val
1075 1080 1085
Ser Ala Leu Val Phe Lys Gln Glu Leu Arg Arg Thr Asn Gln Leu Leu
1090 1095 1100
Glu Ala Ile Ser Ala Gly Val Asp Thr Thr Lys Leu Pro Ala Pro Ser
1105 1110 1115 1120
Gln Ile Glu Glu Val Val Val Arg Lys Arg Gln Phe Arg Ala Arg Thr
1125 1130 1135
Gly Ser Leu Thr Leu Pro Pro Pro Pro Arg Ser Val Pro Gly Val Ser
1140 1145 1150
Cys Pro Glu Ser Leu Gln Arg Ser Asp Pro Leu Glu Gly Pro Ser Xaa
1155 1160 1165
Leu Pro Ser Ser Pro Pro Val Leu Gln Leu Ala Met Pro Met Pro Leu
1170 1175 1180
Leu Gly Ala Gly Glu Cys Asn Pro Phe Thr Ala Ile Gly Cys Ala Met
1185 1190 1195 1200

Thr Glu Thr Xaa Gly Xaa Pro Xaa Xaa Leu Pro Ser Tyr Pro Pro Lys
1205 1210 1215
Lys Glu Val Ser Glu Trp Ser Asp Glu Ser Trp Ser Thr Thr Thr Thr
1220 1225 1230
Ala Ser Ser Tyr Val Thr Gly Pro Pro Tyr Pro Lys Ile Arg Gly Lys
1235 1240 1245
Asp Ser Thr Gln Ser Ala Thr Ala Lys Arg Pro Thr Lys Lys Lys Leu
1250 1255 1260
Gly Lys Ser Glu Phe Ser Cys Ser Met Ser Tyr Thr Trp Thr Asp Val
1265 1270 1275 1280
Ile Ser Phe Lys Thr Ala Ser Lys Val Leu Ser Ala Thr Arg Ala Ile
1285 1290 1295
Thr Ser Gly Phe Leu Lys Gln Arg Ser Leu Val Tyr Val Thr Glu Pro
1300 1305 1310
Arg Asp Ala Glu Leu Arg Lys Gln Lys Val Thr Ile Asn Arg Gln Pro
1315 1320 1325
Leu Phe Pro Pro Ser Tyr His Lys Gln Val Arg Leu Ala Lys Glu Lys
1330 1335 1340
Ala Ser Lys Val Val Gly Val Met Trp Asp Tyr Asp Glu Val Ala Ala
1345 1350 1355 1360
His Thr Pro Ser Lys Ser Ala Lys Ser His Ile Thr Gly Leu Arg Gly
1365 1370 1375
Thr Asp Val Leu Asp Leu Gln Lys Cys Val Glu Ala Gly Glu Ile Pro
1380 1385 1390
Ser His Tyr Arg Gln Thr Val Ile Val Pro Lys Glu Glu Val Phe Val
1395 1400 1405
Lys Thr Pro Gln Lys Pro Thr Lys Lys Pro Pro Arg Leu Ile
1410 1415 1420

(2) INFORMATION FOR SEQ ID NO:84:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:84:

Asp Lys Pro Trp Gly Phe Leu Cys Trp Phe Leu Gly Gly Leu His Glu
Asp Leu Leu Leu Trp Asn Tyr His Ser Leu Pro Ile Met Thr Arg Tyr

Leu Thr Cys Leu Asp Thr Leu Leu Gln Val Gln Asn Ile Ser Ala Pro

Lys Ala Ser Asp Val Gly Leu Ser Arg Leu Arg Gly Arg Val Ser Cys

Tyr Phe Ile Ile Val Pro His Asp Thr Asp Asn Phe Xaa Ser Phe Phe
Leu Ser Gln Ser His Leu Leu Val Val Xaa Trp Gly Glu Gln Arg Leu
Ser Ile Asn Ser Asp Phe Leu Phe Ser Lys Leu Arg Ile Pro Arg Leu
100 105 110
Ser His Ile His Gln Xaa Ser Leu Phe Glu Glu Thr Thr Ser Asp Gly
115 120 125
Pro Ser Cys Arg Gln Asn Phe Arg Ser Ser Phe Glu Ala Asn His Val
130 135 140
Gly Pro Ser Val Ala His Ala Ala Arg Lys Leu Thr Leu Ser Gln Leu
145 150 155 160
Leu Phe Cys Arg Pro Phe Gly Gly Gly Xaa Leu Ser Gly Ile Leu Ala
165 170 175
Pro Tyr Leu Arg Val Arg Gly Ala Ser Asn Val Ala Gly Ser Gly Cys
180 185 190
Ser Arg Xaa Pro Thr Phe Val Xaa Pro Phe Arg Asp Leu Leu Phe Gly
195 200 205
Arg Val Thr Gly Xaa Ile Xaa Xaa Xaa Ser Xaa Cys Phe Gly His Cys
210 215 220
Thr Ser Asn Cys Ser Glu Arg Val Thr Leu Thr Cys Ser Gln Gln Gly
225 230 235 240
His Arg His Gly Gln Leu Xaa Asn Arg Trp Xaa Arg Arg Glu Xaa Xaa
245 250 255
Arg Thr Phe Xaa Arg Val Thr Ser Leu Gln Ala Phe Arg Thr Xaa His
260 265 270
Ser Trp Asp Gly Ser Arg Arg Gly Arg Gln Gly Lys Arg Thr Ser Ser
275 280 285
Cys Pro Glu Leu Ala Leu Ser Tyr His Tyr Leu Phe Asp Leu Gly Gly
290 295 300

Gly Trp Gln Phe Gly Gly Val Asn Ala Ser Xaa Asn Cys Leu Lys Gln
305 310 315 320
Leu Val Cys Thr Pro Gln Leu Leu Phe Glu Asn Lys Ser Gly His Cys
325 330 335
Gly Phe Ile Thr Arg Ser Val Val Tyr Gly Ile Thr Val Ile Cys Leu
340 345 350
Lys Ser Ile Thr Gln Ile Lys Leu His Ala Thr Thr Arg Cys Val Ser
355 360 365
Val Asn Ala Glu Gly Lys Leu Asn Ser Phe Thr Leu Thr Val Arg Thr
370 375 380
Val Thr Ala Ser Arg Cys Arg Pro Arg Ser Phe Gly Leu Thr Xaa Ile
385 390 395 400
Thr Leu Asn Leu Tyr Ala Val His Gly His Cys Ser Ser Gln Gly Trp
405 410 415
Gly His Leu Gly Glu Thr Asp Ile Trp Arg Gly Tyr Cys Cys Asn Lys
420 425 430
Asn Val Ile Ser Gln Phe Leu Ile Phe Thr Val Val Pro Asn Ala Ile
435 440 445
Ile Asp Asp Lys Thr Ser Pro Ile Ser Trp Val Arg Ser Ser Arg Pro
450 455 460
Thr Gln Pro Ser Val Asp Trp Asn Ser Pro Ser Pro Val Ile Xaa Thr
465 470 475 480
Ser Ser Gly Ser Phe Val Lys Phe Cys Lys Thr Ile Leu Asn Arg Lys
485 490 495
Asp Glu Phe Ser Thr Ala Trp Thr Ala Cys Leu Glu His Thr Xaa Ser
500 505 510
Asn Pro Gly Ala Leu Val Pro Leu Leu Ala Ala Val Glu Arg Thr Thr
515 520 525
Arg Asn Val Asn His Ala Leu Asn Ser Ser Phe Lys Asp Ile Lys Ala
530 535 540
Asn His Lys Glu Ile Ala His Ile Leu Pro Asn Leu Xaa Thr Pro Ser
545 550 555 560
Asn Glu Ala Ala Ile Ile Leu Arg Arg Gly Val Xaa Pro Thr Asp Ala
565 570 575
Ser Xaa Tyr Asp Thr Pro Gly Gly Arg Cys Leu Gln Asn Ala Gln Tyr
580 585 590
Leu Pro Ala Asp Val Thr Ser Gly Asn Lys Val Leu Xaa Thr Tyr Ser
595 600 605

Val Ala Pro Ser Lys His Ser Lys Lys Ser Val Gly Pro Val Ile Trp
610 615 620
Pro Cys Cys Cys Gln Ser Lys His Cys Thr Ser Xaa Gln Asp Ala His
625 630 635 640
Asn Ser Cys Gly Arg Ile Glu Arg Gly Val Asp Xaa Thr Ser Lys Leu
645 650 655
Ile His Ser Xaa Pro Leu Thr His Gln Ala Phe Lys Cys Gln Ala Ser
660 665 670
Ser Gly Xaa Gly Ala Ser Ile Ala Ala Xaa His Val Lys Asp Lys Thr
675 680 685
His Arg Cys Pro Cys Thr Lys Ser Cys Ser Xaa Ser Pro Gly His His
690 695 700
Glu Arg Gln Cys Xaa Ser Ser Val Cys Lys Leu Gly Arg Asn Cys Ala
705 710 715 720
Ser Lys Xaa Xaa Gln Glu His Phe Asp Leu Val Arg Xaa Trp Gly Ser
725 730 735
Asn Thr Arg Asn Glu Ser Lys His Ala Xaa Cys Lys Gly Ile Val Arg
740 745 750
Xaa Ser Asp Xaa Ala Thr Ala Ile Leu Tyr Asp Ser Lys Asp Cys Ser
755 760 765
Cys Met Arg Pro Lys Lys Gly Val Lys Phe Phe Lys Gly Gly Phe Gln
770 775 780
Cys Glu Arg Thr Ser Cys Gly Asp Cys Thr Leu Gln Leu Val Asn Cys
785 790 795 800
Ser Asn His Gly Leu Gln Gly Asn Glu Xaa Cys Thr Leu Leu His Asp
805 810 815
Phe Leu Phe Val Asn His Trp Gly Asp Ser Ser Thr Gly Arg Asp Xaa
820 825 830
Cys Asn Arg Pro Ala Thr Pro His Thr Ser Gly Ala Lys Ser Val Asn
835 840 845
Gly Xaa Ile Ser His Ser His Ser Asn Ala Asn Ser Glu Cys Gly Cys
850 855 860
Pro Arg Ser Ile Asp Phe Ser Glu Ala His Leu Val Ser Gly His Leu
865 870 875 880
Ala Gly Leu Trp Ala Arg Thr Gly Val Thr Ala Val Gln Ala Pro Gln
885 890 895
Asn Pro Thr Arg Phe Phe Pro Lys Pro Gly Ser Leu Pro Pro Trp Cys
900 905 910

Val Ile Gly Ser Ser Ile Ala Ile Leu Met Thr Gln Leu Xaa Leu Gly
915 920 925
Cys Ser Gln Gln Asn Ile Ile Val Ser Ser Ser Phe Cys Ser Ile Asp
930 935 940
Lys Xaa Arg Phe Gly Val Asp His Arg Lys Glu Ile Ser Pro Leu Val
945 950 955 960
Gln Ile Cys Ser Tyr Arg Arg Xaa Pro Arg Leu Gly Ala Ile Gly Val
965 970 975
Gln Asn Ser Leu Ser Phe Cys Xaa Xaa Gln Thr Ile Pro Cys Leu Gly
980 985 990
Cys Val Glu Gly Phe Asn Asn Val Ala Phe Arg Asn His Thr Arg Arg
995 1000 1005
Gly Thr Thr Pro Val Tyr Ile Val Val Tyr Ala Ser Ser Pro Thr Ala
1010 1015 1020
Cys Ala Ala Pro Thr Leu Ala Phe Asn Tyr Cys Xaa Asn Pro Ala His
1025 1030 1035 1040
Thr Asn Thr His Gly Glu Ser Arg Val Lys Val Asn Met Ala Cys Ala
1045 1050 1055
Phe Tyr His Glu Ala Ala Val Ile His Gly Ile Lys Val Thr Ser Val
1060 1065 1070
Pro Cys Thr Gln Gly Ile Ser Gly Asn Tyr Tyr Thr Val Ala Leu Arg
1075 1080 1085
His Phe Xaa Asp Val Thr Ser Pro Ile Val Arg Asp Ser Cys Tyr Ser
1090 1095 1100
Leu Ser Ser Xaa Leu Val Ser Lys Leu Ile Thr Val Phe Phe Gly Ser
1105 1110 1115 1120
Leu Lys Asp Lys Val Ser Pro Phe Leu Gln Ile Phe Leu Leu Asn Leu
1125 1130 1135
Phe Ser Met Lys Gly Asp Ser Ala Phe Ile Xaa Xaa Leu Asn Leu Ser
1140 1145 1150
Tyr Val Gly Met Trp Cys Arg Asp Tyr Ser Arg Gly Gly Ser Arg Gly
1155 1160 1165
Lys Asn His Xaa Pro Asn Ile Phe Gly Trp Ser Phe Gly Xaa Asp Leu
1170 1175 1180
Ser Asn Ala Gln His Gly Gly Ser Ile Gly Ser Met Ala Phe Val Thr
1185 1190 1195 1200
Asn Asp Tyr Ile Ile Val Pro Gly Thr Ser Ser Gly Gln Val His Ala
1205 1210 1215

Ile Cys Ala Val Arg Lys Xaa Ser Pro Cys Val Gly Thr Phe Ala Ile
1220 1225 1230
Lys Ile Ala Ile Trp Ile His Ala Val Arg Arg Val His Val Leu Trp
1235 1240 1245
His Xaa Cys Cys Cys Ser His Thr Gly Ile Xaa Asp Gln Asp Leu Xaa
1250 1255 1260
Leu Xaa Leu His Val Arg Lys Trp Xaa Phe Gly Xaa Leu Ala Ala Ala
1265 1270 1275 1280
Ser Gly Gly Asn Xaa Asn Leu His Xaa Ile Leu Val Arg His Ser Arg
1285 1290 1295
Phe Cys Ile Lys Ser Gly Met Cys Cys Val Leu Gly Met Val Ser Ser
1300 1305 1310
Thr His Gln Arg Pro Asn Pro Asn Leu Ala Asp Xaa Thr Ala Arg Ile
1315 1320 1325
Ser Ser Ser Gly Glu His Pro Asn Asn Met Pro Gly Gly Ala Gln Asn
1330 1335 1340
Arg Gly Thr Xaa Arg Thr Leu Gly Asn Ser His Gly Lys Gly Pro Ala
1345 1350 1355 1360
His Thr Pro Ile Arg Val Ile Gly Phe Val Asp Leu Asn Gln Xaa Pro
1365 1370 1375
Gln Ser Cys Tyr Gln Ile Pro Leu Gly Leu Pro Ala Arg Ala Pro Ser
1380 1385 1390
Lys Gly Pro Ser Ser Thr Trp Trp Leu Ile Asp Val Leu Val Ile Ser
1395 1400 1405
Arg Val Asn Gly Tyr Trp Val Tyr Gly Ala Cys Gly Met Ser
1410 1415 1420

(2) INFORMATION FOR SEQ ID NO:85:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:85:

Ile Ser Leu Gly Gly Phe Phe Val Gly Phe Trp Gly Val Phe Thr Lys
1 5 10 15

Thr Ser Ser Phe Gly Thr Ile Thr Val Cys Arg Xaa Xaa Leu Gly Ile

Ser Pro Ala Ser Thr His Phe Cys Lys Ser Arg Thr Ser Val Pro Arg

Arg Pro Val Met Trp Asp Leu Ala Asp Leu Glu Gly Val Xaa Ala Ala
50 55 60
Thr Ser Ser Xaa Ser His Met Thr Pro Thr Thr Phe Glu Ala Phe Ser

Leu Ala Asn Leu Thr Cys Leu Trp Tyr Asp Gly Gly Asn Arg Gly Cys

Leu Leu Ile Val Thr Phe Cys Phe Leu Ser Ser Ala Ser Arg Gly Ser
100 105 110
Val Thr Tyr Thr Asn Asp Leu Cys Leu Arg Lys Pro Leu Val Met Ala
115 120 125
Arg Val Ala Asp Arg Thr Leu Glu Ala Val Leu Lys Leu Ile Thr Ser
130 135 140
Val Gln Val Xaa Leu Met Leu His Glu Asn Ser Leu Phe Pro Asn Phe
145 150 155 160
Phe Phe Val Gly Arg Leu Ala Val Ala Asp Xaa Val Glu Ser Leu Pro
165 170 175
Arg Ile Leu Gly Tyr Gly Gly Pro Val Thr Xaa Leu Glu Ala Val Val
180 185 190
Val Val Asp Gln Leu Ser Ser Asp His Ser Glu Thr Ser Phe Leu Gly
195 200 205
Gly Xaa Leu Gly Lys Xaa Xaa Gly Xaa Pro Xaa Val Ser Val Ile Ala
210 215 220
His Pro Ile Ala Val Lys Gly Leu His Ser Pro Ala Pro Asn Arg Gly
225 230 235 240
Ile Gly Met Ala Asn Cys Arg Thr Gly Gly Glu Glu Gly Arg Xaa Glu
245 250 255
Gly Pro Ser Asn Gly Ser Leu Arg Cys Arg Leu Ser Gly His Asp Thr
260 265 270
Pro Gly Thr Asp Leu Gly Gly Gly Gly Lys Val Ser Glu Pro Val Leu
275 280 285
Ala Arg Asn Trp Arg Phe Leu Thr Thr Thr Ser Ser Ile Trp Glu Gly
290 295 300
Ala Gly Ser Leu Val Val Ser Thr Pro Ala Glu Ile Ala Ser Ser Asn
305 310 315 320

Trp Phe Val Arg Arg Asn Ser Cys Leu Lys Thr Arg Ala Asp Thr Ala
325 330 335
Ala Ser Ser Leu Gly Val Leu Phe Met Glu Leu Gln Ser Phe Ala Ser
340 345 350
Ser Arg Ser Arg Lys Leu Ser Cys Met Arg Pro Pro Gly Val Cys Pro
355 360 365
Ser Thr Arg Lys Gly Ser Leu Thr Val Leu Pro Leu Pro Ser Gly Pro
370 375 380
Xaa Gln Gln Ala Asp Val Val Gln Gly Val Leu Gly Ser Pro Arg Xaa
385 390 395 400
Xaa Xaa Thr Cys Thr Arg Ser Thr Ala Thr Ala Ala Leu Lys Val Gly
405 410 415
Gly Thr Trp Val Lys Gln Thr Phe Gly Glu Asp Thr Ala Val Thr Lys
420 425 430
Met Xaa Ser Pro Asn Phe Ser Tyr Leu Gln Xaa Ser Leu Thr Pro Xaa
435 440 445
Leu Thr Thr Arg Leu Val Gln Ser Val Gly Ser Gly Leu Ala Asp Pro
450 455 460
His Ser Leu Ala Leu Thr Gly Thr Ala Pro Leu Gln Xaa Phe Glu Gln
465 470 475 480
Val Leu Gly Pro Leu Xaa Ser Phe Ala Lys Pro Phe Ser Thr Glu Lys
485 490 495
Met Ser Ser Ala Pro His Gly Gln Arg Ala Trp Ser Ile Pro Asp Pro
500 505 510
Ile Gln Gly Pro Leu Tyr Pro Phe Trp Gln Leu Xaa Lys Gly Gln Pro
515 520 525
Gly Met Leu Thr Met Leu Xaa Thr Pro Ala Leu Arg Thr Leu Lys Gln
530 535 540
Ile Thr Lys Lys Leu His Thr Tyr Cys Gln Ile Xaa Arg Pro Gln Ala
545 550 555 560
Met Arg Pro Gln Ser Ser Ser Val Gly Val Xaa Ser Gln Arg Met Gln
565 570 575
Ala Asp Met Xaa Leu Gln Gly Val Asp Ala Ser Arg Met Pro Ser Ile
580 585 590
Phe Leu Arg Met Ser Arg Val Ala Ile Lys Tyr Ser Xaa His Thr Val
595 600 605
Leu Leu Leu Ala Ser Ile Val Arg Ser Leu Leu Gly Gln Xaa Ser Gly
610 615 620

Pro Ala Val Val Lys Ala Asn Ile Ala Gln Ala Asp Lys Thr Pro Thr
625 630 635 640
Thr Pro Ala Ala Gly Leu Asn Ala Glu Xaa Thr Lys Pro Ala Ser Xaa
645 650 655
Ser Ile Val Xaa His Ser Pro Ile Lys His Leu Asn Val Lys Gln Ala
660 665 670
Val Asp Glu Ala Pro Ala Xaa Pro Pro Ser Met Ser Lys Thr Lys Pro
675 680 685
Thr Asp Val His Val Pro Arg Ala Val Pro Xaa Ala Pro Ala Ile Met
690 695 700
Asn Ala Ser Ala Xaa Leu Ala Ser Val Ser Leu Asp Ala Ile Ala Pro
705 710 715 720
Pro Asn Asn Asp Arg Asn Ile Leu Ile Leu Xaa Gly Ser Gly Val Val
725 730 735
Ile Pro Ala Met Lys Ala Asn Thr His Asp Ala Lys Gly Leu Ser Gly
740 745 750
Lys Val Thr Lys Pro Gln Gln Tyr Ser Met Ile Ala Arg Ile Val Ala
755 760 765
Ala Xaa Gly Pro Arg Lys Val Leu Ser Phe Ser Arg Ala Val Ser Asn
770 775 780
Val Lys Gly Leu Val Val Val Ile Val Leu Phe Ser Leu Ser Ile Ala
785 790 795 800
Ala Thr Met Ala Ser Lys Gly Met Asn Asp Ala His Ser Ser Thr Ile
805 810 815
Ser Ser Ser Ser Thr Thr Gly Ala Thr Val Ala Pro Val Gly Thr Asp
820 825 830
Val Ile Asp Gln Gln Arg Arg Thr Gln Val Ala Pro Lys Val Ser Met
835 840 845
Ala Arg Xaa Ala Ile Ala Thr Pro Thr Pro Thr Ala Ser Ala Ala Val
850 855 860
Pro Glu Val Leu Thr Ser Val Lys His Ile Trp Tyr Leu Val Thr Ser
865 870 875 880
Leu Gly Ser Gly Pro Gly Gln Ala Ser Gln Pro Ser Lys Arg His Arg
885 890 895
Thr Pro Gln Gly Phe Phe Pro Ser Arg Ala Pro Cys His Arg Gly Ala
900 905 910
Ser Leu Gly Ala Ala Xaa Pro Tyr Xaa Xaa His Ser Cys Ser Trp Ala
915 920 925

Ala Val Asn Lys Thr Xaa Leu Ser Ala Val Leu Phe Ala Val Leu Thr
930 935 940
Asn Glu Gly Ser Gly Leu Thr Ile Glu Lys Arg Ser Ala His Ser Ser
945 950 955 960
Lys Phe Ala Pro Ile Ala Gly Asn Pro Gly Trp Val Arg Xaa Val Ser
965 970 975
Arg Ile Val Xaa Ala Ser Val Asp Asp Lys Pro Tyr His Ala Leu Ala
980 985 990
Ala Ser Lys Ala Ser Thr Met Leu His Ser Gly Thr Ile Pro Glu Gly
995 1000 1005
Val Gln Leu Pro Ser Thr Xaa Xaa Tyr Met Pro Ala Leu Pro Arg Pro
1010 1015 1020
Val Arg Pro Leu Arg Trp Pro Leu Thr Ile Ala Glu Thr Pro His Thr
1025 1030 1035 1040
Arg Thr Pro Met Val Lys Val Gly Ser Arg Ser Thr Trp His Val Pro
1045 1050 1055
Ser Thr Met Arg Leu Gln Ser Tyr Thr Glu Ser Lys Ser Pro Val Tyr
1060 1065 1070
Pro Val His Lys Ala Ser Val Ala Thr Thr Thr Gln Ser Pro Ser Gly
1075 1080 1085
Ile Phe Glu Met Ser His Pro Leu Xaa Xaa Glu Thr Ala Val Ile Pro
1090 1095 1100
Phe Arg Ala Asn Ser Leu Ala Ser Ser Ser Gln Cys Phe Leu Val Ala
1105 1110 1115 1120
Ser Lys Ile Arg Cys Leu Pro Phe Phe Arg Phe Ser Ser Leu Ile Phe
1125 1130 1135
Phe Pro Xaa Lys Gly Ile Val Pro Ser Ser Val Asn Xaa Ile Ser Val
1140 1145 1150
Met Leu Ala Cys Gly Val Gly Ile Thr Pro Gly Gly Val Ala Val Ala
1155 1160 1165
Arg Thr Thr Ser Leu Thr Phe Leu Asp Gly Ala Ser Val Arg Thr Phe
1170 1175 1180
Pro Met Pro Asn Thr Val Val Arg Ser Val Ala Trp His Ser Ser Gln
1185 1190 1195 1200
Met Ile Thr Ser Xaa Phe Arg Glu His Arg Pro Val Arg Tyr Met Pro
1205 1210 1215
Tyr Val Leu Tyr Val Ser Glu Ala Pro Val Leu Val His Leu Pro Leu
1220 1225 1230

Lys Xaa Gln Phe Gly Phe Thr Pro Tyr Val Ala Cys Met Tyr Phe Gly
1235 1240 1245
Ile Asp Ala Val Val Ala Thr Leu Gly Phe Arg Thr Lys Thr Ser Xaa
1250 1255 1260
Phe Xaa Cys Met Xaa Glu Ser Gly Asn Leu Val Asp Leu Pro Leu Pro
1265 1270 1275 1280
Val Gly Ala Ile Lys Ile Cys Thr Glu Tyr Ser Leu Gly Thr Val Gly
1285 1290 1295
Phe Val Ser Arg Val Ala Cys Ala Val Tyr Trp Gly Trp Tyr Pro Ala
1300 1305 1310
His Thr Asn Gly Leu Thr Leu Ile Trp Pro Thr Glu Pro Pro Glu Phe
1315 1320 1325
Leu Ala Ala Val Asn Ile Pro Ile Thr Cys Pro Glu Glu His Arg Ile
1330 1335 1340
Gly Ala Pro Glu Glu Pro Leu Ala Thr Ala Met Gly Arg Ala Pro His
1345 1350 1355 1360
Thr His Gln Xaa Gly Ser Ser Asp Leu Leu Thr Ser Thr Asn Asp Pro
1365 1370 1375
Ser Arg Val Thr Arg Tyr Pro Leu Val Ser Pro Gln Glu His Arg Val
1380 1385 1390
Arg Asp Pro Ala Pro His Gly Gly Xaa Xaa Met Ser Trp Ser Leu Ala
1395 1400 1405
Ala Ser Thr Val Ile Gly Cys Met Glu Pro Val Gly Xaa Ala
1410 1415 1420

(2) INFORMATION FOR SEQ ID NO:86:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 1422 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: protein
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:86:

Xaa Ala Leu Gly Val Ser Leu Leu Val Ser Gly Gly Ser Ser Arg Arg
1 5 10 15
Pro Pro Pro Leu Glu Leu Ser Gln Phe Ala Asp Asn Asp Ser Val Ser
20 25 30

His Leu Pro Arg His Thr Ser Ala Ser Pro Glu His Gln Cys Pro Glu
35 40 45
Gly Gln Xaa Cys Gly Thr Xaa Gln Thr Xaa Arg Ala Cys Glu Leu Leu
50 55 60
Leu His His Ser Pro Thr Xaa His Arg Gln Leu Leu Lys Leu Phe Pro
65 70 75 80
Xaa Pro Ile Ser Leu Ala Cys Gly Met Met Gly Gly Thr Glu Val Val
85 90 95
Tyr Xaa Xaa Xaa Leu Phe Val Phe Xaa Ala Pro His Pro Ala Ala Gln
100 105 110
Ser His Thr Pro Met Ile Phe Val Xaa Gly Asn His Xaa Xaa Trp Pro
115 120 125
Glu Leu Gln Thr Glu Leu Xaa Lys Gln Phe Xaa Ser Xaa Ser Arg Arg
130 135 140
Ser Lys Cys Ser Ser Cys Cys Thr Lys Thr His Ser Phe Pro Thr Ser
145 150 155 160
Phe Leu Xaa Ala Val Trp Arg Trp Leu Ile Glu Trp Asn Pro Cys Pro
165 170 175
Val Ser Xaa Gly Thr Gly Gly Gln Xaa Arg Ser Trp Lys Arg Leu Xaa
180 185 190
Ser Leu Thr Asn Phe Arg Leu Thr Ile Gln Arg Pro Pro Phe Trp Glu
195 200 205
Gly Asn Trp Val Asn Xaa Xaa Gly Leu Xaa Xaa Phe Arg Ser Leu His
210 215 220
Ile Gln Leu Gln Xaa Lys Gly Tyr Thr His Leu Leu Pro Thr Gly Ala
225 230 235 240
Ser Ala Trp Pro Thr Xaa Glu Gln Val Val Lys Lys Gly Gly Leu Lys
245 250 255
Asp Leu Leu Thr Gly His Phe Val Ala Gly Phe Gln Asp Met Thr Leu
260 265 270
Leu Gly Arg Ile Ser Glu Gly Glu Ala Arg Xaa Ala Asn Gln Phe Leu
275 280 285
Pro Gly Thr Gly Ala Phe Leu Pro Leu Pro Leu Arg Ser Gly Arg Gly
290 295 300
Leu Ala Val Trp Trp Cys Gln Arg Gln Leu Lys Leu Pro Gln Ala Ile
305 310 315 320
Gly Leu Tyr Ala Ala Thr Pro Val Xaa Lys Gln Glu Arg Thr Leu Arg
325 330 335

Leu His His Xaa Glu Cys Cys Leu Trp Asn Tyr Ser His Leu Pro Gln
340 345 350
Val Asp His Ala Asn Xaa Val Ala Cys Asp His Gln Val Cys Val Arg
355 360 365
Gln Arg Gly Arg Glu Ala Xaa Gln Phe Tyr Pro Tyr Arg Gln Asp Arg
370 375 380
Asn Ser Lys Gln Met Ser Ser Lys Glu Phe Trp Ala His Leu Asp Asn
385 390 395 400
Xaa Glu Pro Val Arg Gly Pro Arg Pro Leu Gln Leu Ser Arg Leu Gly
405 410 415
Ala Pro Gly Xaa Asn Arg His Leu Glu Arg Ile Leu Leu Xaa Gln Lys
420 425 430
Cys Asp Leu Pro Ile Ser His Ile Tyr Ser Ser Pro Xaa Arg His Asn
435 440 445
Xaa Arg Gln Asp Xaa Ser Asn Gln Leu Gly Pro Val Xaa Pro Thr His
450 455 460
Thr Ala Xaa Arg Xaa Leu Glu Gln Pro Leu Ser Ser Asn Leu Asn Lys
465 470 475 480
Phe Trp Val Leu Cys Lys Val Leu Gln Asn His Ser Gln Gln Lys Arg
485 490 495
Xaa Val Gln His Arg Met Asp Ser Val Leu Gly Ala Tyr Leu Ile Gln
500 505 510
Ser Arg Gly Pro Cys Thr Pro Ser Gly Ser Cys Arg Lys Asp Asn Gln
515 520 525
Glu Cys Xaa Pro Cys Ser Glu Leu Gln Leu Xaa Gly His Xaa Ser Lys
530 535 540
Ser Gln Arg Asn Cys Thr His Thr Ala Lys Ser Xaa Asp Pro Lys Gln
545 550 555 560
Xaa Gly Arg Asn His Pro Pro Ser Gly Cys Xaa Ala Asn Gly Cys Lys
565 570 575
Leu Ile Xaa Xaa Ser Arg Gly Xaa Met Pro Pro Glu Cys Pro Val Ser
580 585 590
Ser Cys Gly Cys His Glu Trp Gln Xaa Ser Thr His Tyr Ile Gln Cys
595 600 605
Cys Ser Xaa Gln Ala Xaa Xaa Glu Val Cys Trp Ala Ser Asp Leu Ala
610 615 620
Leu Leu Leu Ser Lys Gln Thr Leu His Lys Leu Thr Arg Arg Pro Gln
625 630 635 640

Leu Leu Arg Pro Asp Xaa Thr Arg Ser Arg Leu Asn Gln Gln Ala Asp
645 650 655
Pro Xaa Xaa Ala Thr His Pro Ser Ser Ile Xaa Met Ser Ser Lys Gln
660 665 670
Trp Met Arg Arg Gln His Ser Arg Leu Ala Cys Gln Arg Gln Asn Pro
675 680 685
Pro Met Ser Met Tyr Gln Glu Leu Phe Pro Gln Pro Arg Pro Ser Xaa
690 695 700
Thr Pro Val Arg Leu Xaa Arg Leu Xaa Ala Trp Thr Gln Leu Arg Leu
705 710 715 720
Gln Ile Met Thr Gly Thr Phe Xaa Ser Cys Glu Val Val Gly Xaa Xaa
725 730 735
Tyr Pro Gln Xaa Lys Gln Thr Arg Met Met Gln Arg Asp Cys Gln Val
740 745 750
Lys Xaa Leu Ser His Ser Asn Thr Leu Xaa Xaa Gln Gly Leu Xaa Leu
755 760 765
His Glu Ala Gln Glu Arg Cys Xaa Val Phe Gln Gly Arg Phe Pro Met
770 775 780
Xaa Lys Asp Xaa Leu Trp Xaa Leu Tyr Ser Ser Ala Cys Gln Leu Gln
785 790 795 800
Gln Pro Trp Pro Pro Arg Glu Xaa Met Met His Thr Pro Pro Arg Phe
805 810 815
Pro Leu Arg Gln Pro Leu Gly Arg Gln Xaa His Arg Xaa Gly Leu Met
820 825 830
Xaa Xaa Thr Ser Asn Ala Ala His Lys Trp Arg Gln Lys Cys Gln Trp
835 840 845
Leu Asp Lys Pro Xaa Pro Leu Gln Arg Gln Gln Arg Val Arg Leu Ser
850 855 860
Gln Lys Tyr Xaa Leu Gln Xaa Ser Thr Phe Gly Ile Trp Ser Pro Arg
865 870 875 880
Trp Ala Leu Gly Gln Asp Arg Arg His Ser Arg Pro Ser Ala Thr Glu
885 890 895
Pro His Lys Val Phe Ser Gln Ala Gly Leu Pro Ala Thr Val Val Arg
900 905 910
His Trp Glu Gln His Ser His Thr Asp Asp Thr Val Val Val Gly Leu
915 920 925
Gln Ser Thr Lys His Asn Cys Gln Gln Phe Phe Leu Gln Tyr Xaa Gln
930 935 940

Met Lys Val Arg Gly Xaa Pro Xaa Lys Arg Asp Gln Pro Thr Arg Pro
945 950 955 960
Asn Leu Leu Leu Ser Gln Val Thr Gln Val Gly Cys Asp Arg Cys Pro
965 970 975
Glu Xaa Phe Glu Leu Leu Leu Met Thr Asn His Thr Met Pro Trp Leu
980 985 990
Arg Arg Arg Leu Gln Gln Cys Cys Ile Gln Glu Pro Tyr Pro Lys Gly
995 1000 1005
Tyr Asn Ser Arg Leu His Ser Ser Ile Cys Gln Leu Ser His Gly Leu
1010 1015 1020
Cys Gly Pro Tyr Ala Gly Leu Xaa Leu Leu Leu Lys Pro Arg Thr His
1025 1030 1035 1040
Glu His Pro Trp Xaa Lys Xaa Gly Gln Gly Gln His Gly Met Cys Leu
1045 1050 1055
Leu Pro Xaa Gly Cys Ser His Thr Arg Asn Gln Ser His Gln Cys Thr
1060 1065 1070
Leu Tyr Thr Arg His Gln Trp Gln Leu Leu His Ser Arg Pro Gln Ala
1075 1080 1085
Phe Leu Arg Cys His Ile Pro Tyr Ser Lys Arg Gln Leu Leu Phe Pro
1090 1095 1100
Phe Glu Leu Thr Arg Xaa Gln Ala His His Ser Val Phe Trp Xaa Pro
1105 1110 1115 1120
Gln Arg Xaa Gly Val Ser Leu Ser Ser Asp Phe Pro Pro Xaa Ser Phe
1125 1130 1135
Phe His Glu Arg Gly Xaa Cys Leu His Xaa Leu Ile Glu Ser Gln Leu
1140 1145 1150
Cys Trp His Val Val Xaa Gly Leu Leu Gln Gly Gly Xaa Pro Trp Gln
1155 1160 1165
Glu Pro Leu Ala Xaa His Phe Trp Met Glu Leu Arg Leu Gly Pro Phe
1170 1175 1180
Gln Cys Pro Thr Arg Trp Phe Asp Arg Xaa His Gly Ile Arg His Lys
1185 1190 1195 1200
Xaa Leu His His Ser Ser Gly Asn Ile Val Arg Ser Gly Thr Cys His
1205 1210 1215
Met Cys Cys Thr Xaa Val Lys Pro Leu Cys Trp Tyr Ile Cys His Xaa
1220 1225 1230
Asn Ser Asn Leu Asp Ser Arg Arg Thr Ser Arg Ala Cys Thr Leu Ala
1235 1240 1245

Leu Met Leu Leu Xaa Pro His Trp Asp Leu Gly Pro Arg Pro His Xaa
1250 1255 1260
Ser Pro Ala Cys Lys Lys Val Val Ile Trp Leu Thr Cys Arg Cys Gln
1265 1270 1275 1280
Trp Gly Gln Leu Lys Phe Ala Leu Asn Thr Arg Xaa Ala Gln Xaa Val
1285 1290 1295
Leu Tyr Gln Glu Trp His Val Leu Cys Thr Gly Asp Gly Ile Gln His
1300 1305 1310
Thr Pro Thr Ala Xaa Pro Xaa Ser Gly Arg Leu Asn Arg Gln Asn Phe
1315 1320 1325
Xaa Gln Arg Xaa Thr Ser Gln Xaa His Ala Arg Arg Ser Thr Glu Ser
1330 1335 1340
Gly His Leu Lys Asn Pro Trp Gln Gln Pro Trp Glu Gly Pro Arg Thr
1345 1350 1355 1360
His Thr Asn Lys Gly His Arg Ile Cys Xaa Pro Gln Pro Met Thr Pro
1365 1370 1375
Val Val Leu Pro Asp Thr Pro Trp Ser Pro Arg Lys Ser Thr Glu Xaa
1380 1385 1390
Gly Thr Gln Leu His Met Val Val Asp Arg Cys Pro Gly His Xaa Pro
1395 1400 1405
Arg Gln Arg Leu Leu Gly Val Trp Ser Leu Trp Asp Glu Pro
1410 1415 1420
(2) INFORMATION FOR SEQ ID NO:87:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:87:
CTACCACCAA TACCAGCGGC 20


(2) INFORMATION FOR SEQ ID NO:88:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:88:

GACATGGTCC lGGCC~ GG 22


(2) INFORMATION FOR SEQ ID NO:89:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:89:
GATCCATAGT GAGCCACTCA C 21


(2) INFORMATION FOR SEQ ID NO:90:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:90:
CA~AATGTTC CTGTCATTAT TTG 23


(2) INFORMATION FOR SEQ ID NO:91:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:91:
CAATCATCTC CAGCTATAAA G 21

(2) INFORMATION FOR SEQ ID NO:92:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:92:

CTGTGGACGC CA~ C 20

(2) INFORMATION FOR SEQ ID NO:93:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:93:

CAATAGCACA Al~l-C~llG G 2l


(2) INFORMATION FOR SEQ ID NO:94:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:94:

GAAAGCTTGG l~G~LGG 20

(2) INFORMATION FOR SEQ ID NO:95:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:95:

CATCTTGACA ATGACAACTT TC 22

(2) INFORMATION FOR SEQ ID NO:96:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:96:

CCTCA CAC CTTCGACCTC 20

(2) INFORMATION FOR SEQ ID NO:97:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:97:

G~1~GG~ACT TGCATGCCTG 20

(2) INFORMATION FOR SEQ ID NO:98:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:98:

CCTGGCTTTG TTCCCACTGC 20

(2) INFORMATION FOR SEQ ID NO:99:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:99:

CTCGTACCCC 1C~GG~AGC 20

(2) INFORMATION FOR SEQ ID NO:100:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:100:

GCTAGGAGCA ACACTGTATG 20

(2) INFORMATION FOR SEQ ID NO:101:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 27 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:101:

CGCCATAATT GACGACAAGA CTAGTCC 27

(2) INFORMATION FOR SEQ ID NO:102:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:102:

CTATTCCCAG GCTATAGCTA AAG 23

(2) INFORMATION FOR SEQ ID NO:103:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 26 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:103:

CAGGTACATG CCATATGTGC TGTACG 26

(2) INFORMATION FOR SEQ ID NO:104:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:104:

CTTGGACGCA ATTGCGCCTC 20

(2) INFORMATION FOR SEQ ID NO:105:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:105:

GTCACTAGGT AACTGATGTT G 21

(2) INFORMATION FOR SEQ ID NO:106:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:106:

Q-~G~G~l ~.ATA~TGTC C 21

(2) INFORMATION FOR SEQ ID NO:107:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)


(xi) SEQUENCE DESCRIPTION: SEQ ID NO:107:

GTGTCAAAAG CTAAGCAGGC 20

(2) INFORMATION FOR SEQ ID NO:108:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 19 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:108:

AGATACCCCT ,~.~.CCC l9
(2) INFORMATION FOR SEQ ID NO:109:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:109:

CAGGATCTAT TCCAGTAGGC 20

(2) INFORMATION FOR SEQ ID NO:110:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:110:

GTATAGGGGT ACCAAGATAT GG 22

(2) INFORMATION FOR SEQ ID NO:111:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 25 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:111:

~clG~uAAG TCCCACATCA CTGGC 25

(2) INFORMATION FOR SEQ ID NO:112:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:112:

CATGA~GAAC CCTCGCTTCC 20

(2) INFORMATION FOR SEQ ID NO:113:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:113:

CACCCAACCC GAGGACTCCA G 21

(2) INFORMATION FOR SEQ ID NO:114:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:114:

CACTTCAGCG CATGCCAATA GC 22
(2) INFORMATION FOR SEQ ID NO:115:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:115:

GTACTA~ACC CATCCATTGC CAC 23

(2) INFORMATION FOR SEQ ID NO:116:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:116:

GCCGAATGAG TACGTCAAGG 20
(2) INFORMATION FOR SEQ ID NO:117:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:117:


GTAG~l~lGG CC~lGGGAAA G 21

(2) INFORMATION FOR SEQ ID NO:118:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:118:

CTGCCGAACT GAGGGClCAG 20
(2) INFORMATION FOR SEQ ID NO:119:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:119:

GGTTACCGTT CCCATTGACA ACCC 24

(2) INFORMATION FOR SEQ ID NO:120:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:120:

GGACGGG~.C IClG~ll~lA GTG 23
(2) INFORMATION FOR SEQ ID NO:121:
(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:121:

GTGAACCGCG CTCACTCACC TTCG 24
(2) INFORMATION FOR SEQ ID NO:122:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:122:

CCTCTAGAGC GGCCTGAGCA G 21

(2) INFORMATION FOR SEQ ID NO:123:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:123:

GGATTAAGGC ACCATCATTC 20

(2) INFORMATION FOR SEQ ID NO:124:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:124:

GCACGATTGG AlGCC~GGGA TAAC 23

(2) INFORMATION FOR SEQ ID NO:125:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:125:

CAGTTCAAGC ~ CCAGGA A~ NNC CGGT 34

(2) INFORMATION FOR SEQ ID NO:126:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:126:

CAGTTCAAGC ~ C~AGGA ATTC 24

(2) INFORMATION FOR SEQ ID NO:127:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:127:

GCCT QGCCA ACTTCATCAC 2Q



(2) INFORMATION FOR SEQ ID NO:128:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:128:

CAGTTCAAGC .~.C~AGGA Al.~NNNNG CGCT 34

(2) INFORMATION FOR SEQ ID NO:129:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:129:

GCG~l~AGCC TGTTAGrATA AC 22

(2) INFORMATION FOR SEQ ID NO:130:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:130:

CAGGC~.GG TATTGTCAGC 20

(2) INFORMATION FOR SEQ ID NO:131:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:131:

CA~llGGAC TGTAAr~AAT GAC 23

(2) INFORMATION FOR SEQ ID NO:132:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:132:

CATCCACCCG ATA-A-AcccTA G 21

(2) INFORMATION FOR SEQ ID NO:133:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:133:

CTTGCAGAAG l~l~lCGAGG CAGG 24

(2) INFORMATION FOR SEQ ID NO:134:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single

(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:134:

TAATGCTGCA GCCGACAGCT G 21

(2) INFORMATION FOR SEQ ID NO:135:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 34 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:135:

CAGTTCAAGC ll~lCCAGGA All~NNNNNG GCCT 34
(2) INFORMATION FOR SEQ ID NO:136:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:136:

CGGT GGTGCGCTAC 20

(2) INFORMATION FOR SEQ ID NO:137:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:137:


CAACGCTGAG ATCCTCAGAG 20
(2) INFORMATION FOR SEQ ID NO:138:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:138:

CCGTGAGAGG CGACTGGTGA G 21

(2) INFORMATION FOR SEQ ID NO:139:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:139:

CGCAGGACAG TAGACACCTT GGTG 24

(2) INFORMATION FOR SEQ ID NO:140:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:140:

CAGGCATCAC CGAACTGCGT GGC 23
(2) INFORMATION FOR SEQ ID NO:141:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:141:

CGAGTGACGC ~G~LGC~G GTC 23
(2) INFORMATION FOR SEQ ID NO:142:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:142:

CACCTTGCTG CCGTATCCAG 20

(2) INFORMATION FOR SEQ ID NO:143:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 24 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:143:

CCAATCGGCA GTGCTTTAGG GACC 24

(2) INFORMATION FOR SEQ ID NO:144:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 23 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:144:

GTATCCCCGG CATCCAATCG TGC 23

(2) INFORMATION FOR SEQ ID NO:145:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 22 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:145:
CAACCATCCC AACACATGTA GG 22
(2) INFORMATION FOR SEQ ID NO:146:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:146:
GGG~lGCCC AACTACTTCC 20

(2) INFORMATION FOR SEQ ID NO:147:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:147:

GGAGGCGTGA TACTCAAAAA G 21

(2) INFORMATION FOR SEQ ID NO:148:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:148:
CCGTGAGAGG CGACTGGTGA G 21
(2) INFORMATION FOR SEQ ID NO:149:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:149:
CACCCAACCC GAGGACTCCA G 21

(2) INFORMATION FOR SEQ ID NO:150:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:150:
CAGCAACCAC ACAGCCAAGC C 21

(2) INFORMATION FOR SEQ ID NO:151:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:151:

GGGCTTGCCC AACTACTTCC 20

(2) INFORMATION FOR SEQ ID NO:152:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:152:

TAATGCTGCA GCCGACAGCT G 21
(2) INFORMATION FOR SEQ ID NO:153:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:153:

GGAGGCGTGA TACTCAAAAA G 21
(2) INFORMATION FOR SEQ ID NO:154:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)


(xi) SEQUENCE DESCRIPTION: SEQ ID NO:154:

CATGAAGAAC CCTCGCTTCC 20
(2) INFORMATION FOR SEQ ID NO:155:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 28 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:155:
CCAAGTCAAG CTTGGCGCTT GTCATCAC 28

(2) INFORMATION FOR SEQ ID NO:156:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:156:

CAACGCTGAG ATCCTCAGAG 20
(2) INFORMATION FOR SEQ ID NO:157:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 21 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:157:
GATCCATAGT GAGCCACTCA C 21
(2) INFORMATION FOR SEQ ID NO:158:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 221 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:158:
GATCCATAGT GAGCCACTCA CCCATCAAGC ATTTAAATGT CAAGCAAGCA GTGGATGAGG 60
CGGCAGCATA GCCGC~lAGC ATGTCAAAGA CAAAACCCAC CGATGTCCAT GTACCAAGAG 120
~ lCC~AC AGCCCCGGCC ATCATGAACG CCAGTGCGTC TCTAGCGTCT GTAAGCTTGG 180
ACGCAATTGC GCCTCCAAAT AATGACAGGA ACATTTTGAT C 221

(2) INFORMATION FOR SEQ ID NO:159:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 337 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:159:

GATCCAATCC AGGGGCCCTC GTACCCCTCC TGGCAGCTGT AGAAAGGACA ACCAGGAATG 60
TT~ TGC TCTGAACTCC AGCTTTAAGG ACATTAAAGC AAATCACAAA GAAATTGCAC 120
ACATACTGCC AAATCTCTAG ACCCCAAGCA ATGAGGCCGC AATCATCCTC C~lCGGG~lG 180
TGGAGCCAAC GGATGCAAGC TGATATGATA CTCCAGGGGG TAGATGCCTC CAGAATGCCC 240
AGTATCTTCT GCGGATGTCA CGAGTGGCAA TAAAGTACTC ACTAC~TACA GTGTTGCTCC 300
TAGCAAGCAT AGTAAGAAGT CTGTTGGGCC AGTGATC 337

(2) INFORMATION FOR SEQ ID NO:160:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:160:

CCTCACTCAC CTTCGACCTC 20
(2) INFORMATION FOR SEQ ID NO:161:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 306 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:161:

GATCCATCTT GACAATGACA ACTTTCGCAG GACAGTAGAC AC~-llG~l~A CGAACTCATC 60
TTTGAGGAAG AAATCGTCAG GCATCACCGA ACTGCGTGGC ATCATCGTCA ACAATCTGTT 120
AACCCAATCT TGACCCACAC CClllll~AC AGACCAGAGC AACAAGCCCA GAACCACACC 180
GGCCACCGAA GCCCCCGGAG AGGCCAGGCA ACTGACCAGG CACCAAGCGT CACTCGCTTG 240
TAA~llCCCC GCCAGGAGGT CGAAGGTGAG TGAGCGCGGT TCACCGCCCC CTCCCAGCCT 300
CTGATC 306
(2) INFORMATION FOR SEQ ID NO:162:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 20 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:162:

GTGTCAA~AG CTAAGCAGGC 20
(2) INFORMATION FOR SEQ ID NO:163:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9364 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:163:

CGTGGGAGTC CGGGGCCCCG GACCTCCCAC CGAGGTGGGG GGAAAGGGGC CCTGGACCGG 60
CCGGGTGGAA GGCCCGGAAC CGGTCCATCT TCCTCAAGGT TGAGGAAGGG GTACGTCTAT 120
CGGTCCGGTC GGTCCGAAAG GCGTCTGGAT GCCTAGTGTT AGGGTTCGTA GGTGGTAAAT 180
CCCAGCTAGG CGTGAAAGCG CTATAGGATA GGCTTATCCC GGTGACCGCT GCCCCGGAAC 240
CAGCCCCGCG GKTCTTTGGA CACGGTCCAC AGGTTGGGGG TACCGGTGTG AATAACCCCC 300
CGACTGAAGC GTCAGTCGTT AAACGGAGAC GGTCTCCTGA GATCGCAACG ACGCCCCACG 360
TACGGGAACG CCGCCAAAAC CTTCGGGACA GCTATGCGGG TTGACAATCC CAGTGGGGGG 420
CCGGGGACCA GCTGATTACT TGTCCTGCGA GTTCCTCTTG AGACTGGCCG AAAGGCAGCC 480
ACGGGGCCAC CAAGGCGGCG CAGCGCTGCA TGCGGCAAGG GGAAAAATCC TTCGGGTGAC 540
CCCTGGTGGC AATCCCTTCC CTTAGGAGCA TGAGTGTGGT CGACACATTC ACCATGGCTT 600
GGCTGTGGTT GCTGGTTTGC TTCCCCCTCG CGGGGGGGGT GCTCTTCAAC TCGCGGCACC 660
AGTGCTTCAA TGGGGACCAT TATGTGCTTT CCAATTGTTG TTCCCGAGAC GAGGTTTACT 720
TCTGTTTCGG GGACGGATGT CTGGTGGCTT ATGGCTGTAC TGTTTGCACA CAGTCTTGCT 780
GGAAGCTCTA CCGGCCTGGG GTGGCTACTC GGCCCGGGTC CGAACCAGGT GAGCTGCTGG 840
GGAGATTTGG GAGTGTAATT GGTCCGGTGT CGGCTTCGGC TTACACCGCT GGAGTCCTCG 900
GGTTGGGTGA ACCTTACAGT TTGGCCTTCT TGGGGACGTT CCTCACCAGT CGCCTCTCAC 960
GGATTCCCAA CGTCACCTGC GTGAAGGCTT GTGACCTTGA GTTTACCTAC CCAGGCTTGT 1020
CCATCGATTT TGACTGGGCG TTTACCAAGA TCTTGCAGTT GCCGGCCAAG CTGTGGCGAG 1080
GCCTAACGGC RGCWCCGGTC TTGAGCCTCC TCGTGATCCT CATGCTGGTC CTCGAGCAGC 1140
GCCTCCTGAT AGCCTTCCTA CTGCTTTTGG TAGTGGGCGA GGCTCAGAGG GGGATGTTCG 1200
ACAACTGCGT GTGTGGTTAC TGGGGGGGCA AGAGGCCCCC GTCGGTGACC CCGCTGTACC 1260
GTGGCAACGG TACTGTGGTG TGTGACTGTG ATTTTGGAAA AATGCATTGG GCCCCCCCCT 1320
TGTGTTCCGG YCTGGTGTGG CGGGACGGTC ATAGGAGGGG CACCGTGCGC GACCTCCCCC 1380
CGGTTTGCCC CCGGGAGGTT CTCGGCACGG TGACAGTCAT GTGTCAGTGG GGTTCTGCCT 1440
ACTGGATTTG GAGATTTGGG GACTGGGTTG CATTGTACGA CGAGCTACCA CGATCAGCTC 1500
TCTGTACTTT CTTCTCAGGT CATGGTCCAC AACCTAAAGA TCTCTCAGTC TTGAATCCAT 1560

CCGGGGCACC TTGTGCTTCT TGCGTCGTTG ACCAGAGGCC GCTGAAATGT GGTTCCTGCG 1620
TCCGCGACTG CTGGGAGACG GGGGGTCCTG GGTTCGATGA GTGCGGTGTC GGTACTCGGA 1680
TGACGAAGCA CCTCGAGGCC GTCCTGGTTG ATGGAGGTGT GGAGTCCAAG GTGACAACGC 1740
CCAAGGGTGA GCGCCCCAAA TACATAGGTC AGCACGGTGT GGGAACCTAC TACGGCGCTG 1800
TCCGTAGCCT CAACATCAGT TACCTAGTGA CTGAGGTGGG GGGCTATTGG CATGCGCTGA 1860
AGTGCCCGTG CGACTTTGTG CCCCGAGTGC TCCCAGAAAG AATTCCAGGT AGGCCTGTGA 1920
ATGCATGTCT AGCTGGGAAG TCTCCGCACC CGTTCGCAAG TTGGGCTCCC GGTGGGTTTT 1980
ACGCCCCCGT GTTCACCAAG TGCAACTGGC CGAAGACCTC CGGAGTGGAT GTGTGTCCTG 2040
GGTTTGCTTT CGATTTCCCT GGTGATCACA ACGGCTTCAT CCATGTTAAA GGCAACAGAC 2100
AGCAGGTTTA CAGTGGTCAG CGAAGGTCTT CGCCGGCTTG GTTGCTTACT GACATGGTCC 2160
TGGCCCTGTT GGTGGTGATG AAGTTGGCTG AGGCTAGAGT TGTCCCCCTG TTTATGCTGG 2220
CAATGTGGTG GTGGTTGAAT GGAGCATCTG CTGCCACTAT TGTCATCATA CACCCTACTG 2280
TCACGAAGTC CACTGAAAGT GTTCCATTGT GGACTCCGCC CACTGTTCCA ACTCCATCTT 2340
GCCCGAATTC TACCACCGGA GTCGCGGACT CTACCTACAA TGCTGGTTGC TACATGGTGG 2400
CAGGCCTGGC GGCCGGGGCT CAGGCGGTCT GGGGTGCTGC CAATGATGGT GCTCAGGCCG 2460
TCGTTGGTGG CATCTGGCCC GCGTGGCTCA AGCTGCGAAG CTTCGCTGCC GGTCTGGCCT 2520
GGTTGTCAAA TGTTGGGGCT TACTTGCCGG TCGTCGAGGC CGCVCTGGCT CCCGAGCTGG 2580
TGTGCACCCC GGTGGTCGGC TGGGCAGCCC AGGAGTGGTG GTTCACTGGT TGTCTGGGTG 2640
TGATGTGTGT CGTGGCGTAC CTGAATGTCC TGGGCTCTGT RAGGGCTGCC GTGCTTGTGG 2700
CGATGCACTT CGCAAGGGGT GCTCTGCCGC TGGTATTGGT GGTAGCTGCC GGGGTRACCC 2760
GGGAGCGGCA CAGCGTCTTA GGGCTTGAGG TGTGCTTCGA TCTGGATGGT GGAGACTGGC 2820
CRGACGCCAG TTGGTCTTGG GGTTTAGCAG GCGTGGTGAG CTGGGCCCTC CTGGTGGGGG 2880
GTCTGATGAC CCACGGTGGC CGATCAGCCA GAYTGACTTG GTAYGCCAGG TGGGCCGTCA 2940
ATTAYCAGAG GGTTCGYCGG TGGGTGAACA ACTCACCGGT TGGAGCYTTT GGYCGTTGGM 3000
GGCGYGCCTG GAAAGCYTGG TTRGTKGTGG CTTGGTTCTT CCCCCAGACA GTTGCCACAG 3060
TYTCCGTCAT CTTCATACTC TGTTTGAGCA GTTTAGATGT CATTGATTTC ATCTTGGARG 3120
TACTCTTGGT TAACTCACCA AATCTCGCGC GCTTGGCGCG RGTGCTGGAC TCCTTAGCTC 3180
THGCTGAGGA GCGGCTGGCC TGCTCTTGGC TGGTGGGCGT CCTGCGCAAG CGGGGCGTCC 3240
TCCTCTACGA GCACGCYGGT CACACTAGCA GGCGCGGTGC TGCCCGCTTG CGAGAGTGGG 3300

GYTTTGCGCT YGAGCCKGTT AGYATAACCA AGGAAGATTG YGCYATTGTT CGGGACTCTG 3360
CTCGTGTGTT GGGCTGTGGA CAATTGGTCC ATGGGAAACC AGTGGTCGCG AGGCGAGGCG 3420
ACGAGGTGTT GATCGGCTGT GTGAACAGTC GGTTCGACCT TCCGCCTGGC TTTGTTCCCA 3480
CTGCTCCCGT GGTSCTTCAT CARGCWGGCA ARGGR~TYTT YGGGGTTGTG AAGACMTCCA 3540
TGACAGGCAA GGACCCGTCC GAACACCACG GRAACGTGGT GGTCCTWGGG ACTTCAACAA 3600
CKCGTTCCAT GGGCTGCTGC GTGAACGGAG TAGTGTACAC RACATACCAT GGYACCAACG 3660
CCCGRCCKAT GGCGGGGCCK TTTGGKCCYG TCAAYGCTCG GTGGTGGTCW GCGAGYGACG 3720
ACGTCACGGT YTACCCGCTC CCWAATGGYG CTTCTTGCCT YCARGCWTGY AAGTGCCAAC 3780
CAACTGGGGT GTGGGTGATC CGGAATGACG GAGCTCTTTG CCATGGAACT CTCGGCAAGG 3840
TGGTGGATTT AGATATGCCC GCTGAGTTGT CAGACTTTCG CGGGTCTTCT GGATCACCAA 3900
TCTTGTGCGA TGAGGGTCAT GCTGTTGGCA TGCTGATTTC GGTGCTTCAT AGGGGGAGTA 3960
GGGTTTCCTC GGTGCGGTAT ACCAAACCTT GGGAAACTCT CCCTCGGGAG ATTGAGGCTC 4020
GATCGGAGGC CCCCCCTGTG CCAGGAACCA CTGGATACAG GGAGGCGCCA CTGTTCCTGC 4080
CCACCGGAGC TGGCAAGTCG ACGCGCGTGC CGAATGAGTA CGTCAAGGCT GGACACAARG 4140
TGCTTGTACT AAACCCATCC ATTGCCACAG TGAGGGCCAT GGGCCCTTAC ATGGAAAAGT 4200
TAACCGGCAA ACATCCGTCG GTGTACTGTG GCCATGACAC TACTGCATAT TCCAGGACTA 4260
CTGACTCATC TTTGACCTAC TGTACATACG GCAGGTTTAT GGCCAATCCC AGGAAATACT 4320
TGCGGGGGAA CGACGTCGTA ATTTGCGACG AGTTGCACGT CACCGACCCG ACCTCAATTT 4380
TGGGGATGGG TCGGGCGAGG TTACTCGCTC GCGAGTGCGG CGTACGCCTC CTGCTTTTCG 4440
CTACGGCGAC CCCACCGGTC TCTCCGATGG CGAAGCATGA ATCTATTCAT GAGGAGATGT 4500
TGGGCAGTGA GGGGGAGGTC CCCTTCTATT GCCAATTCCT CCCACTGAGT AGGTATGCTA 4560
CTGGGAGACA CCTGCTGTTT TGTCATTCCA AGGTAGARTG CACTAGGTTA TCCTCAGCTT 4620
TGGCCAGCTT TGGTGTCAAC ACCGTTGTGT ACTTCAGAGG CAAAGAAACT GACATTCCAA 4680
CTGGTGACGT GTGCGTTTGC GCCACAGACG CACTTTCCAC TGGTTACACT GGCAATTTTG 4740
ACACCGTAAC AGACTGTGGT TTAATGGTTG AGGAGGTAGT GGAAGTGACC CTGGACCCGA 4800
CCATCACTAT CGGTGTGAAG ACCGTCCCGG CCC~-GCCGA ACTGAGGGCT CAGAGGCGTG 4860
GTAGGTGTGG CCGTGGGAAA GCGGGCACTT ACTATCAGGC ATTGATGTCT TCGGCGCCGG 4920
CGGGAACSGT C~---GGG G~-~-i-GGG CAG~-~l-~A GG~-G~-C .C~lG~lATG 4980
GCCTAGAGCC CGATGCTATT GGAGACCTGC TTAGGGC~-lA CGA~l~ CCTTATACTG 5040

CTGCCATCAG ~GC~-C~ATC GGAGAGGCCA TTGC~-..... TA~ r~A GTGCCAATGA 5100
GGAATTATCC TCA~lG~-l TGGGCCAAGC A~-AAGG~-~CA CAA~-.GGC~A ~-~ L l~lGG 5160
GTGTGCAGAG GCACATGTGT GAGGACGCGG G~.~-GG~CC KCCCGCTAAT GGTCCCGAAT 5220
GGAGCGGCAT CAGGGGAAAA GGGC~-~--C CC~ G CCGATGGGGT GGTGACTTGC 5280
CTGAGTCGGT GGCTCCGCAT CA~lGG~--G ATGACCTACA GGCCCGG~-C G~.~.GGCCG 5340
AGGGTTACAC .CC~lGCATT GCTGGACCGG .G~llllG~l CG~--LaGCG A-GGCGGGGG 5400
GGGCTATCCT GGCACACTGG ACGGGGTCTC TGGTTGTAGT GAC QGTTGG ~--~-~AATG 5460
GGAACGGTAA CCCGCTGATA CAAAGCGCCT CTAGGGGCGT GGCKACYAGC GGTCCATACC 5520
CAGTACCCCC AGATGGTGGT GAACGGTACC CATCAGACAT CAAGCCAATY ACTGAGGCTG 5580
TGACCACCCT TGAGACTGCG TGCG~lGGG GCCCAGCCGC GGCBAGTCTG GCTTATGTGA 5640
AGGC~-~-~A AACTGGAACC ATGTTGGCTG ACAARGCGAG G~-GC~.GG CAGGCTTGGG 5700
CTGCAAACAA ~---~lGCCT CCACCAGCAT CACACTCAAC llC~-.-~-l~ CAGAGCTTGG 5760
AYG~.~C~ll CACTTCAGCT TGGGATAGCG TGTTCACTCA CGGCC~llCC TTG~ll~llG 5820
GGTTCACAGC TGCTTACGGC GCTCGGCGGA ACCCACCGCT GGGCGTCGGA GCtl~lll~l 5880
lG~lGGGCAT GTCATCGAGC CACYTRACTC ACGTCAGACT TGCTGCTGCG TTGCTCCTCG 5940
GC~lCGGGGG TACC~lC~-A GGCACGCCTG CTACTGGGCT TGCTATGGCG GGTGCCTACT 6000
TCGCKGGGGG CAGCGTTACC GCTAACTGGC TGAGTATCAT TGTGGCTCTA ATCGGAGGCT 6060
GGGAGGGGGC RGTKAACGCA GCCTCACTCA C~--C~AYCT CCTGGCRGGG AAGTTACAAG 6120
CKAGYGAYGC TTGGTGCCTR GTCAGYTGCY .GGC~-.~.CC GGGGG~-.CG ~aG~l a 6180
TGGCDCTVGG Y~ lG~l~ lG~l--l~-~A ARAAGGGTGT GGGWCARGAY lGG~l.AACA 6240
GA~.~..aAC GATGATGCCA CGCAGTTCGG TGATGCCTGA CGA---~-.C CT~A~-~TG 6300
A~1C~1~AC CAAG~-~-~- A~l~lC~-GC GAAAGTTGTC ATTGTCAAGA TGGATCATGA 6360
~.~..~.GGA CAAGCGGGAG ATGGAGATGG AGACMCCCGC TTCTCAGATT ~lllGG~ACT 6420
TGCTTGACTG GTGCATCCGG CTRG~lCG~l lC~-~lACAA TAAACTYATG TTTGCTCTCC 6480
CTAGGTTGCG CCTGCCGCTT AlCG~llGCA GTACC~llG GGGTGGCCCG TGGGAGGGCA 6540
ATGGTCATTT GGAAACAAGG TGTACTTGTG G~l~l~.~AT TACCGG-aAT ATTCACGATG 6600
GTATATTGCA CGACCTACAT TATACCTCCC TACTGTGCAG ACATTACTAC AAGAGGACAG 6660
.~C~-l~llGG CGTCATGGGC AATGCTGAGG GAGCAGTCCC C~-l-~-GCCT A~-GGCG~lG 6720
GAATCAGGAC TTACCAAATT GGGACTTCTG A~.~ A GG~.`-~a~.C GTGCATGGGA 6780

CAATCACGGT GCACGCCACC A~.-G~-lATG AGTTGAAAGC TGCTGACGTT CGGAGGGCGG 6840
TGCGAGCCGG CCCGACTTAC ~-~G~GGCG TACCTTGCAG CTGGAGCGCG CC~ACTG 6900
CGCCTGCGCT CGTTTACAGG CTAGGCCAGG GCATCAAAAT CGATGGAGCG CGCC~-~CTGT 6960
~GCC~l~lGA CTTAGCACAG GGAGCGCGCC ACCCCCCGGT ATCTGGCAGT ~GCCGG~A 7020
~lG~llGGAC AGATGAGGAC GAGAGGGACT ~G~-GGAAAC CAAGGCTGCC GCCATCGAGG 7080
CCA~lGGGGC GGC~l~G~AC ~lCC~-- QC CGGAGGCTGC TCAGGCCGCT CTAGAGGCTT 7140
TGG~-r~GGC TGCC~-~lCC ~--~--GCCCC ATGTGCCCGT CATTATGGGT GATGACTGTT 7200
CATGCCGGGA TGAGGC~-C CAAGGCCACT TCATCCCAGA ACCCAATGTG ACAGAGGTAC 7260
CCATTGAGCC CACGGTCGGA GACGTGGAGG CACTCAAGCT GCGGGCTGCA GACCTGACCG 7320
CCAGGTTGCA AGACTTGGAG GCCATGGCTC TCGCCCGCGC TGAGTCAATC GAGGATGCTC 7380
GCGCAGCTTC GA GC~-~CG CT~CC~-~GG TGGACTCAAT GCCATCATTG GAGTCGAGCC 7440
CTTGCTCCTC CTTTGAACAA A~ -AA CTGAAAGTGA CCCTGAGACT ~c~lc~AGG 7500
CTGGCTTACC CTTGGAGTTC GTGAACTCCA ACACCGGGCC ~l~lCCGGCT CGGAGGATTG 7560
TCAGAATCCG ACAGGCTTGC l~ll~l~A Q GATCCACAAT GAAGGCCATG CC~--~-C~l 7620
TCA~-l~-CGG GGA~-GC~- C -C~--ACTC GCTATGACCC GGACGGTCAC CAA~--~ll~G 7680
ACGAGCGAGG TCCGATAGAG GTATCTACTC CTATATGTGA AGTGATTGGG GACAT QGGC 7740
TTCAGTGTGA CCA~ATTGAG GAAACTCCAA CATCTTACTC TTACATCTGG TCAGGGGCGC 7800
CCTTGGGTAC TGGGAGAAGT ~lCCCCCAAC CCATGACGCG CCCTATAGGG ACCCATCTGA 7860
CTTGTGACAC TACCAAAGTT TATGTTACTG ACCCTGATCG GGCCG~--~AG CGGGCCGAGA 7920
AGGTTACAAT CTGGAGGGGT GATAGGAAGT ATGACAAGCA TTATGAGGCT ~-C~-~AGG 7980
C~AA AAAGGCAGCC GCGACGAAGT CTCATGGCTG GACCTATTCC CAGGCTATAG 8040
CTA~AGTTAG GCGCCGAGCA GCCGCTGGAT ACGGCAGCAA GGTGACCGCC TCCACATTGG 8100
CCACTGGTTG GCCTCACGTG GAGGAGATGC TGGACAAAAT AGCCAGGGGA QGGAAGTTC 8160
CTTTCACTTT TGTGACCAAG CGAGAGGTTT 'L ~''1' L ~' ~C~AA AACTACCCGT AAGCCCC~aA 8220
GATTCATAGT lllCC~ACCT TTGGACTTCA GGATAGCTGA AAAGATGATT ~GGG-~ACC 8280
CCGG QTCGT TGCAAAGTCA ATTCTGGGTG ACGCTTATCT GTTCCAGTAC ACGCC QATC 8340
AGAGGGTCAA AGCTCTGGTT AAGGCGTGGG AGGGGAAGTT GCATCCCGCT GCGATCACTG 8400
TGGACGCCAC ll~lll~AC TCATCGATTG ATGAGCACGA QTGCAGGTG GAGG~--CGG 8460
l~ll~GC~GC GGCTAGTGAC AACCCCTCAA TGGTACATGC lll~lG~AAG TACTACTCTG 852Q

~GGCC~-.AT GG,llCCC~A GA.GGG~l~C C~-llGGG~lA CCGCCAGTGT AG~.C~.CGG 8580
GC~llAAC AACTAGCTCG GCGAACAGCA TCA~ll~l~A CATTAAGGTC AGCGCGGCCT 8640
GCAGGCGG~1 GGGGATTAAG GCACCATCAT TCTTTATAGC TGGAGATGAT TGCTTGATCA 8700
TCTATGAAAA TGATGGAACT GATCCCTGCC ~lG~l-llAA GGCTGCCCTG GCCAACTATG 8760
~-~TA~AGGTG TGAACCAACA AAGCATGCTT CACTGGACAC AGCTGAGTGT TGCTCGGCCT 8820
A~llGG~lGA ~lGC~lAGCT GGGG~lGCCA AGCGCTGGTG GTTGAGCACG GACATGAGGA 8880
AGCCGCTCGC AAGGGCGTCT TCCGAATATT CGGACCCAAT CGGCAGTGCT TTAGGGACCA 8940
TCTTGATGTA lCCCCGGCAT CCAATCGTGC GGTATGTTCT AATACCACAC GTACTAATAA 9000
lGG~llACAG GAGTGGCAGC ACACCGGATG AGllG~l~AT GTGTCAGGTT CAGGGAAATC 9060
ATTACTCTTT CCCGCTGCGG CTGCTGCCTC GC~l~llG~~ lACAT G~lCC~lG~l 9120
GCCTACAAGT CACCACGGAC AGTA~-~P~-A CTAGGATGGA GGCAGGCTCA GCSTTGCGGG 9180
A m AGGAAT GAAATCCCTA GCCTGGCACC GCCGACGTGC CGGAAATGTG CGCACTCGCC 9240
TCCTGAGGGG AGGCAAGGAG TGGGGGCACC TGGCCAGAGC C~lC~l.lGG CAYCCAGGKT 9300
TGAAGGAGCA YCCCCCRCCC ATAAATTCAC TTCCAGGTTT TCAGCTGGCG ACGCCTTACG 9360
AACACCATGA AGAGGTCTTG ATCTCGATCA AGAGTCGACC ACCTTGGATA AGGTGGATTC 9420
TTGGTGCTTG l~-lClC~llG CTGGCCGCCT TGCTGTGAAT TCGCTCCAGG CAGTAGGACC 9480
TTCGGGTCGG GGG 9493

(2) INFORMATION FOR SEQ ID NO:164:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9493 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)

(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 1..9493
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:164:
CGT GGG AGT CCG GGG CCC CGG ACC TCC CAC CGA GGT GGG GGG AAA GGG 48
Arg Gly Ser Pro Gly Pro Arg Thr Ser His Arg Gly Gly Gly Lys Gly
1 5 10 15

GCC CTG GAC CGG CCG GGT GGA AGG CCC GGA ACC GGT CCA TCT TCC TCA 96
Ala Leu Asp Arg Pro Gly Gly Arg Pro Gly Thr Gly Pro Ser Ser Ser
20 25 30
AGG TTG AGG AAG GGG TAC GTC TAT CGG TCC GGT CGG TCC GAA AGG CGT 144
Arg Leu Arg Lys Gly Tyr Val Tyr Arg Ser Gly Arg Ser Glu Arg Arg
35 40 45
CTG GAT GCC TAG TGT TAG GGT TCG TAG GTG GTA AAT CCC AGC TAG GCG 192
Leu Asp Ala * Cys * Gly Ser * Val Val Asn Pro Ser * Ala
50 55 60
TGA AAG CGC TAT AGG ATA GGC TTA TCC CGG TGA CCG CTG CCC CGG AAC 240
* Lys Arg Tyr Arg Ile Gly Leu Ser Arg * Pro Leu Pro Arg Asn
65 70 75 80
CAG CCC CGC GGK TCT TTG GAC ACG GTC CAC AGG TTG GGG GTA CCG GTG 288
Gln Pro Arg Xaa Ser Leu Asp Thr Val His Arg Leu Gly Val Pro Val
85 90 95
TGA ATA ACC CCC CGA CTG AAG CGT CAG TCG TTA AAC GGA GAC GGT CTC 336
* Ile Thr Pro Arg Leu Lys Arg Gln Ser Leu Asn Gly Asp Gly Leu
100 105 110
CTG AGA TCG CAA CGA CGC CCC ACG TAC GGG AAC GCC GCC AAA ACC TTC 384
Leu Arg Ser Gln Arg Arg Pro Thr Tyr Gly Asn Ala Ala Lys Thr Phe
115 120 125
GGG ACA GCT ATG CGG GTT GAC AAT CCC AGT GGG GGG CCG GGG ACC AGC 432
Gly Thr Ala Met Arg Val Asp Asn Pro Ser Gly Gly Pro Gly Thr Ser
130 135 140
TGA TTA CTT GTC CTG CGA GTT CCT CTT GAG ACT GGC CGA AAG GCA GCC 480
* Leu Leu Val Leu Arg Val Pro Leu Glu Thr Gly Arg Lys Ala Ala
145 150 155 160
ACG GGG CCA CCA AGG CGG CGC AGC GCT GCA TGC GGC AAG GGG AAA AAT 528
Thr Gly Pro Pro Arg Arg Arg Ser Ala Ala Cys Gly Lys Gly Lys Asn
165 170 175
CCT TCG GGT GAC CCC TGG TGG CAA TCC CTT CCC TTA GGA GCA TGA GTG 576
Pro Ser Gly Asp Pro Trp Trp Gln Ser Leu Pro Leu Gly Ala * Val
180 185 190
TGG TCG ACA CAT TCA CCA TGG CTT GGC TGT GGT TGC TGG TTT GCT TCC 624
Trp Ser Thr His Ser Pro Trp Leu Gly Cys Gly Cys Trp Phe Ala Ser
195 200 205
CCC TCG CGG GGG GGG TGC TCT TCA ACT CGC GGC ACC AGT GCT TCA ATG 672
Pro Ser Arg Gly Gly Cys Ser Ser Thr Arg Gly Thr Ser Ala Ser Met
210 215 220
GGG ACC ATT ATG TGC TTT CCA ATT GTT GTT CCC GAG ACG AGG TTT ACT 720
Gly Thr Ile Met Cys Phe Pro Ile Val Val Pro Glu Thr Arg Phe Thr
225 230 235 240
TCT GTT TCG GGG ACG GAT GTC TGG TGG CTT ATG GCT GTA CTG TTT GCA 768
Ser Val Ser Gly Thr Asp Val Trp Trp Leu Met Ala Val Leu Phe Ala
245 250 255
CAC AGT CTT GCT GGA AGC TCT ACC GGC CTG GGG TGG CTA CTC GGC CCG 816
His Ser Leu Ala Gly Ser Ser Thr Gly Leu Gly Trp Leu Leu Gly Pro
260 265 270
GGT CCG AAC CAG GTG AGC TGC TGG GGA GAT TTG GGA GTG TAA TTG GTC 864
Gly Pro Asn Gln Val Ser Cys Trp Gly Asp Leu Gly Val * Leu Val
275 280 285
CGG TGT CGG CTT CGG CTT ACA CCG CTG GAG TCC TCG GGT TGG GTG AAC 912
Arg Cys Arg Leu Arg Leu Thr Pro Leu Glu Ser Ser Gly Trp Val Asn
290 295 300
CTT ACA GTT TGG CCT TCT TGG GGA CGT TCC TCA CCA GTC GCC TCT CAC 960
Leu Thr Val Trp Pro Ser Trp Gly Arg Ser Ser Pro Val Ala Ser His
305 310 315 320
GGA TTC CCA ACG TCA CCT GCG TGA AGG CTT GTG ACC TTG AGT TTA CCT 1008
Gly Phe Pro Thr Ser Pro Ala * Arg Leu Val Thr Leu Ser Leu Pro
325 330 335
ACC CAG GCT TGT CCA TCG ATT TTG ACT GGG CGT TTA CCA AGA TCT TGC 1056
Thr Gln Ala Cys Pro Ser Ile Leu Thr Gly Arg Leu Pro Arg Ser Cys
340 345 350
AGT TGC CGG CCA AGC TGT GGC GAG GCC TAA CGG CRG CWC CGG TCT TGA 1104
Ser Cys Arg Pro Ser Cys Gly Glu Ala * Arg Xaa Xaa Arg Ser *
355 360 365
GCC TCC TCG TGA TCC TCA TGC TGG TCC TCG AGC AGC GCC TCC TGA TAG 1152
Ala Ser Ser * Ser Ser Cys Trp Ser Ser Ser Ser Ala Ser * *
370 375 380
CCT TCC TAC TGC TTT TGG TAG TGG GCG AGG CTC AGA GGG GGA TGT TCG 1200
Pro Ser Tyr Cys Phe Trp * Trp Ala Arg Leu Arg Gly Gly Cys Ser
385 390 395 400
ACA ACT GCG TGT GTG GTT ACT GGG GGG GCA AGA GGC CCC CGT CGG TGA 1248
Thr Thr Ala Cys Val Val Thr Gly Gly Ala Arg Gly Pro Arg Arg *
405 410 415
CCC CGC TGT ACC GTG GCA ACG GTA CTG TGG TGT GTG ACT GTG ATT TTG 1296
Pro Arg Cys Thr Val Ala Thr Val Leu Trp Cys Val Thr Val Ile Leu
420 425 430
GAA AAA TGC ATT GGG CCC CCC CCT TGT GTT CCG GYC TGG TGT GGC GGG 1344
Glu Lys Cys Ile Gly Pro Pro Pro Cys Val Pro Xaa Trp Cys Gly Gly
435 440 445
ACG GTC ATA GGA GGG GCA CCG TGC GCG ACC TCC CCC CGG TTT GCC CCC 1392
Thr Val Ile Gly Gly Ala Pro Cys Ala Thr Ser Pro Arg Phe Ala Pro
450 455 460
GGG AGG TTC TCG GCA CGG TGA CAG TCA TGT GTC AGT GGG GTT CTG CCT 1440
Gly Arg Phe Ser Ala Arg * Gln Ser Cys Val Ser Gly Val Leu Pro
465 470 475 480

ACT GGA TTT GGA GAT TTG GGG ACT GGG TTG CAT TGT ACG ACG AGC TAC 1488
Thr Gly Phe Gly Asp Leu Gly Thr Gly Leu His Cys Thr Thr Ser Tyr
485 490 495
CAC GAT CAG CTC TCT GTA CTT TCT TCT CAG GTC ATG GTC CAC AAC CTA 1536
His Asp Gln Leu Ser Val Leu Ser Ser Gln Val Met Val His Asn Leu
500 505 510
AAG ATC TCT CAG TCT TGA ATC CAT CCG GGG CAC CTT GTG CTT CTT GCG 1584
Lys Ile Ser Gln Ser * Ile His Pro Gly His Leu Val Leu Leu Ala
515 520 525
TCG TTG ACC AGA GGC CGC TGA AAT GTG GTT CCT GCG TCC GCG ACT GCT 1632
Ser Leu Thr Arg Gly Arg * Asn Val Val Pro Ala Ser Ala Thr Ala
530 535 540
GGG AGA CGG GGG GTC CTG GGT TCG ATG AGT GCG GTG TCG GTA CTC GGA 1680
Gly Arg Arg Gly Val Leu Gly Ser Met Ser Ala Val Ser Val Leu Gly
545 550 555 560
TGA CGA AGC ACC TCG AGG CCG TCC TGG TTG ATG GAG GTG TGG AGT CCA 1728
* Arg Ser Thr Ser Arg Pro Ser Trp Leu Met Glu Val Trp Ser Pro
565 570 575
AGG TGA CAA CGC CCA AGG GTG AGC GCC CCA AAT ACA TAG GTC AGC ACG 1776
Arg * Gln Arg Pro Arg Val Ser Ala Pro Asn Thr * Val Ser Thr
580 585 590
GTG TGG GAA CCT ACT ACG GCG CTG TCC GTA GCC TCA ACA TCA GTT ACC 1824
Val Trp Glu Pro Thr Thr Ala Leu Ser Val Ala Ser Thr Ser Val Thr
595 600 605
TAG TGA CTG AGG TGG GGG GCT ATT GGC ATG CGC TGA AGT GCC CGT GCG 1872
* * Leu Arg Trp Gly Ala Ile Gly Met Arg * Ser Ala Arg Ala
610 615 620
ACT TTG TGC CCC GAG TGC TCC CAG AAA GAA TTC CAG GTA GGC CTG TGA 1920
Thr Leu Cys Pro Glu Cys Ser Gln Lys Glu Phe Gln Val Gly Leu *
625 630 635 640
ATG CAT GTC TAG CTG GGA AGT CTC CGC ACC CGT TCG CAA GTT GGG CTC 1968
Met His Val * Leu Gly Ser Leu Arg Thr Arg Ser Gln Val Gly Leu
645 650 655
CCG GTG GGT TTT ACG CCC CCG TGT TCA CCA AGT GCA ACT GGC CGA AGA 2016
Pro Val Gly Phe Thr Pro Pro Cys Ser Pro Ser Ala Thr Gly Arg Arg
660 665 670
CCT CCG GAG TGG ATG TGT GTC CTG GGT TTG CTT TCG ATT TCC CTG GTG 2064
Pro Pro Glu Trp Met Cys Val Leu Gly Leu Leu Ser Ile Ser Leu Val
675 680 685
ATC ACA ACG GCT TCA TCC ATG TTA AAG GCA ACA GAC AGC AGG TTT ACA 2112
Ile Thr Thr Ala Ser Ser Met Leu Lys Ala Thr Asp Ser Arg Phe Thr
690 695 700
GTG GTC AGC GAA GGT CTT CGC CGG CTT GGT TGC TTA CTG ACA TGG TCC 2160
Val Val Ser Glu Gly Leu Arg Arg Leu Gly Cys Leu Leu Thr Trp Ser

705 710 715 720
TGG CCC TGT TGG TGG TGA TGA AGT TGG CTG AGG CTA GAG TTG TCC CCC 2208
Trp Pro Cys Trp Trp * * Ser Trp Leu Arg Leu Glu Leu Ser Pro
725 730 735
TGT TTA TGC TGG CAA TGT GGT GGT GGT TGA ATG GAG CAT CTG CTG CCA 2256
Cys Leu Cys Trp Gln Cys Gly Gly Gly * Met Glu His Leu Leu Pro
740 745 750
CTA TTG TCA TCA TAC ACC CTA CTG TCA CGA AGT CCA CTG AAA GTG TTC 2304
Leu Leu Ser Ser Tyr Thr Leu Leu Ser Arg Ser Pro Leu Lys Val Phe
755 760 765
CAT TGT GGA CTC CGC CCA CTG TTC CAA CTC CAT CTT GCC CGA ATT CTA 2352
His Cys Gly Leu Arg Pro Leu Phe Gln Leu His Leu Ala Arg Ile Leu
770 775 780
CCA CCG GAG TCG CGG ACT CTA CCT ACA ATG CTG GTT GCT ACA TGG TGG 2400
Pro Pro Glu Ser Arg Thr Leu Pro Thr Met Leu Val Ala Thr Trp Trp
785 790 795 800
CAG GCC TGG CGG CCG GGG CTC AGG CGG TCT GGG GTG CTG CCA ATG ATG 2448
Gln Ala Trp Arg Pro Gly Leu Arg Arg Ser Gly Val Leu Pro Met Met
805 810 815
GTG CTC AGG CCG TCG TTG GTG GCA TCT GGC CCG CGT GGC TCA AGC TGC 2496
Val Leu Arg Pro Ser Leu Val Ala Ser Gly Pro Arg Gly Ser Ser Cys
820 825 830
GAA GCT TCG CTG CCG GTC TGG CCT GGT TGT CAA ATG TTG GGG CTT ACT 2544
Glu Ala Ser Leu Pro Val Trp Pro Gly Cys Gln Met Leu Gly Leu Thr
835 840 845
TGC CGG TCG TCG AGG CCG CVC TGG CTC CCG AGC TGG TGT GCA CCC CGG 2592
Cys Arg Ser Ser Arg Pro Xaa Trp Leu Pro Ser Trp Cys Ala Pro Arg
850 855 860
TGG TCG GCT GGG CAG CCC AGG AGT GGT GGT TCA CTG GTT GTC TGG GTG 2640
Trp Ser Ala Gly Gln Pro Arg Ser Gly Gly Ser Leu Val Val Trp Val
865 870 875 880
TGA TGT GTG TCG TGG CGT ACC TGA ATG TCC TGG GCT CTG TRA GGG CTG 2688
* Cys Val Ser Trp Arg Thr * Met Ser Trp Ala Leu Xaa Gly Leu
885 890 895
CCG TGC TTG TGG CGA TGC ACT TCG CAA GGG GTG CTC TGC CGC TGG TAT 2736
Pro Cys Leu Trp Arg Cys Thr Ser Gln Gly Val Leu Cys Arg Trp Tyr
900 905 910
TGG TGG TAG CTG CCG GGG TRA CCC GGG AGC GGC ACA GCG TCT TAG GGC 2784
Trp Trp * Leu Pro Gly Xaa Pro Gly Ser Gly Thr Ala Ser * Gly
915 920 925
TTG AGG TGT GCT TCG ATC TGG ATG GTG GAG ACT GGC CRG ACG CCA GTT 2832
Leu Arg Cys Ala Ser Ile Trp Met Val Glu Thr Gly Xaa Thr Pro Val
930 935 940

GGT CTT GGG GTT TAG CAG GCG TGG TGA GCT GGG CCC TCC TGG TGG GGG 2880
Gly Leu Gly Val * Gln Ala Trp * Ala Gly Pro Ser Trp Trp Gly
945 950 955 960
GTC TGA TGA CCC ACG GTG GCC GAT CAG CCA GAY TGA CTT GGT AYG CCA 2928
Val * * Pro Thr Val Ala Asp Gln Pro Xaa * Leu Gly Xaa Pro
965 970 975
GGT GGG CCG TCA ATT AYC AGA GGG TTC GYC GGT GGG TGA ACA ACT CAC 2976
Gly Gly Pro Ser Ile Xaa Arg Gly Phe Xaa Gly Gly * Thr Thr His
980 985 990
CGG TTG GAG CYT TTG GYC GTT GGM GGC GYG CCT GGA AAG CYT GGT TRG 3024
Arg Leu Glu Xaa Leu Xaa Val Xaa Gly Xaa Pro Gly Lys Xaa Gly Xaa
995 1000 1005
TKG TGG CTT GGT TCT TCC CCC AGA CAG TTG CCA CAG TYT CCG TCA TCT 3072
Xaa Trp Leu Gly Ser Ser Pro Arg Gln Leu Pro Gln Xaa Pro Ser Ser
1010 1015 1020
TCA TAC TCT GTT TGA GCA GTT TAG ATG TCA TTG ATT TCA TCT TGG ARG 3120
Ser Tyr Ser Val * Ala Val * Met Ser Leu Ile Ser Ser Trp Xaa
1025 1030 1035 1040
TAC TCT TGG TTA ACT CAC CAA ATC TCG CGC GCT TGG CGC GRG TGC TGG 3168
Tyr Ser Trp Leu Thr His Gln Ile Ser Arg Ala Trp Arg Xaa Cys Trp
1045 1050 1055
ACT CCT TAG CTC THG CTG AGG AGC GGC TGG CCT GCT CTT GGC TGG TGG 3216
Thr Pro * Leu Xaa Leu Arg Ser Gly Trp Pro Ala Leu Gly Trp Trp
1060 1065 1070
GCG TCC TGC GCA AGC GGG GCG TCC TCC TCT ACG AGC ACG CYG GTC ACA 3264
Ala Ser Cys Ala Ser Gly Ala Ser Ser Ser Thr Ser Thr Xaa Val Thr
1075 1080 1085
CTA GCA GGC GCG GTG CTG CCC GCT TGC GAG AGT GGG GYT TTG CGC TYG 3312
Leu Ala Gly Ala Val Leu Pro Ala Cys Glu Ser Gly Xaa Leu Arg Xaa
1090 1095 1100
AGC CKG TTA GYA TAA CCA AGG AAG ATT GYG CYA TTG TTC GGG ACT CTG 3360
Ser Xaa Leu Xaa * Pro Arg Lys Ile Xaa Xaa Leu Phe Gly Thr Leu
1105 1110 1115 1120
CTC GTG TGT TGG GCT GTG GAC AAT TGG TCC ATG GGA AAC CAG TGG TCG 3408
Leu Val Cys Trp Ala Val Asp Asn Trp Ser Met Gly Asn Gln Trp Ser
1125 1130 1135
CGA GGC GAG GCG ACG AGG TGT TGA TCG GCT GTG TGA ACA GTC GGT TCG 3456
Arg Gly Glu Ala Thr Arg Cys * Ser Ala Val * Thr Val Gly Ser
1140 1145 1150
ACC TTC CGC CTG GCT TTG TTC CCA CTG CTC CCG TGG TSC TTC ATC ARG 3504
Thr Phe Arg Leu Ala Leu Phe Pro Leu Leu Pro Trp Xaa Phe Ile Xaa
1155 1160 1165
CWG GCA ARG GR~ TYT TYG GGG TTG TGA AGA CMT CCA TGA CAG GCA AGG 3552
Xaa Ala Xaa Xaa Xaa Xaa Gly Leu * Arg Xaa Pro * Gln Ala Arg
1170 1175 1180
ACC CGT CCG AAC ACC ACG GRA ACG TGG TGG TCC TWG GGA CTT CAA CAA 3600
Thr Arg Pro Asn Thr Thr Xaa Thr Trp Trp Ser Xaa Gly Leu Gln Gln
1185 1190 1195 1200
CKC GTT CCA TGG GCT GCT GCG TGA ACG GAG TAG TGT ACA CRA CAT ACC 3648
Xaa Val Pro Trp Ala Ala Ala * Thr Glu * Cys Thr Xaa His Thr
1205 1210 1215
ATG GYA CCA ACG CCC GRC CKA TGG CGG GGC CKT TTG GRC CYG TCA AYG 3696
Met Xaa Pro Thr Pro Xaa Xaa Trp Arg Gly Xaa Leu Xaa Xaa Ser Xaa
1220 1225 1230
CTC GGT GGT GGT CWG CGA GYG ACG ACG TCA CGG TYT ACC CGC TCC CWA 3744
Leu Gly Gly Gly Xaa Arg Xaa Thr Thr Ser Arg Xaa Thr Arg Ser Xaa
1235 1240 1245
ATG GYG CTT CTT GCC TYC ARG CWT GYA AGT GCC AAC CAA CTG GGG TGT 3792
Met Xaa Leu Leu Ala Xaa Xaa Xaa Xaa Ser Ala Asn Gln Leu Gly Cys
1250 1255 1260
GGG TGA TCC GGA ATG ACG GAG CTC TTT GCC ATG GAA CTC TCG GCA AGG 3840
Gly * Ser Gly Met Thr Glu Leu Phe Ala Met Glu Leu Ser Ala Arg
1265 1270 1275 1280
TGG TGG ATT TAG ATA TGC CCG CTG AGT TGT CAG ACT TTC GCG GGT CTT 3888
Trp Trp Ile * Ile Cys Pro Leu Ser Cys Gln Thr Phe Ala Gly Leu
1285 1290 1295
CTG GAT CAC CAA TCT TGT GCG ATG AGG GTC ATG CTG TTG GCA TGC TGA 3936
Leu Asp His Gln Ser Cys Ala Met Arg Val Met Leu Leu Ala Cys *
1300 1305 1310
TTT CGG TGC TTC ATA GGG GGA GTA GGG TTT CCT CGG TGC GGT ATA CCA 3984
Phe Arg Cys Phe Ile Gly Gly Val Gly Phe Pro Arg Cys Gly Ile Pro
1315 1320 1325
AAC CTT GGG AAA CTC TCC CTC GGG AGA TTG AGG CTC GAT CGG AGG CCC 4032
Asn Leu Gly Lys Leu Ser Leu Gly Arg Leu Arg Leu Asp Arg Arg Pro
1330 1335 1340
CCC CTG TGC CAG GAA CCA CTG GAT ACA GGG AGG CGC CAC TGT TCC TGC 4080
Pro Leu Cys Gln Glu Pro Leu Asp Thr Gly Arg Arg His Cys Ser Cys
1345 1350 1355 1360
CCA CCG GAG CTG GCA AGT CGA CGC GCG TGC CGA ATG AGT ACG TCA AGG 4128
Pro Pro Glu Leu Ala Ser Arg Arg Ala Cys Arg Met Ser Thr Ser Arg
1365 1370 1375
CTG GAC ACA ARG TGC TTG TAC TAA ACC CAT CCA TTG CCA CAG TGA GGG 4176
Leu Asp Thr Xaa Cys Leu Tyr * Thr His Pro Leu Pro Gln * Gly
1380 1385 1390
CCA TGG GCC CTT ACA TGG AAA AGT TAA CCG GCA AAC ATC CGT CGG TGT 4224
Pro Trp Ala Leu Thr Trp Lys Ser * Pro Ala Asn Ile Arg Arg Cys
1395 1400 1405

ACT GTG GCC ATG ACA CTA CTG CAT ATT CCA GGA CTA CTG ACT CAT CTT 4272
Thr Val Ala Met Thr Leu Leu His Ile Pro Gly Leu Leu Thr His Leu
1410 1415 1420
TGA CCT ACT GTA CAT ACG GCA GGT TTA TGG CCA ATC CCA GGA AAT ACT 4320
* Pro Thr Val His Thr Ala Gly Leu Trp Pro Ile Pro Gly Asn Thr
1425 1430 1435 1440
TGC GGG GGA ACG ACG TCG TAA TTT GCG ACG AGT TGC ACG TCA CCG ACC 4368
Cys Gly Gly Thr Thr Ser * Phe Ala Thr Ser Cys Thr Ser Pro Thr
1445 1450 1455
CGA CCT CAA TTT TGG GGA TGG GTC GGG CGA GGT TAC TCG CTC GCG AGT 4416
Arg Pro Gln Phe Trp Gly Trp Val Gly Arg Gly Tyr Ser Leu Ala Ser
1460 1465 1470
GCG GCG TAC GCC TCC TGC TTT TCG CTA CGG CGA CCC CAC CGG TCT CTC 4464
Ala Ala Tyr Ala Ser Cys Phe Ser Leu Arg Arg Pro His Arg Ser Leu
1475 1480 1485
CGA TGG CGA AGC ATG AAT CTA TTC ATG AGG AGA TGT TGG GCA GTG AGG 4512
Arg Trp Arg Ser Met Asn Leu Phe Met Arg Arg Cys Trp Ala Val Arg
1490 1495 1500
GGG AGG TCC CCT TCT ATT GCC AAT TCC TCC CAC TGA GTA GGT ATG CTA 4560
Gly Arg Ser Pro Ser Ile Ala Asn Ser Ser His * Val Gly Met Leu
1505 1510 1515 1520
CTG GGA GAC ACC TGC TGT TTT GTC ATT CCA AGG TAG ART GCA CTA GGT 4608
Leu Gly Asp Thr Cys Cys Phe Val Ile Pro Arg * Xaa Ala Leu Gly
1525 1530 1535
TAT CCT CAG CTT TGG CCA GCT TTG GTG TCA ACA CCG TTG TGT ACT TCA 4656
Tyr Pro Gln Leu Trp Pro Ala Leu Val Ser Thr Pro Leu Cys Thr Ser
1540 1545 1550
GAG GCA AAG AAA CTG ACA TTC CAA CTG GTG ACG TGT GCG TTT GCG CCA 4704
Glu Ala Lys Lys Leu Thr Phe Gln Leu Val Thr Cys Ala Phe Ala Pro
1555 1560 1565
CAG ACG CAC TTT CCA CTG GTT ACA CTG GCA ATT TTG ACA CCG TAA CAG 4752
Gln Thr His Phe Pro Leu Val Thr Leu Ala Ile Leu Thr Pro * Gln
1570 1575 1580
ACT GTG GTT TAA TGG TTG AGG AGG TAG TGG AAG TGA CCC TGG ACC CGA 4800
Thr Val Val * Trp Leu Arg Arg * Trp Lys * Pro Trp Thr Arg
1585 1590 1595 1600
CCA TCA CTA TCG GTG TGA AGA CCG TCC CGG CCC G CCG AAC TGA GGG 4848
Pro Ser Leu Ser Val * Arg Pro Ser Arg Pro Leu Pro Asn * Gly
1605 1610 1615
CTC AGA GGC GTG GTA GGT GTG GCC GTG GGA AAG CGG GCA CTT ACT ATC 4896
Leu Arg Gly Val Val Gly Val Ala Val Gly Lys Arg Ala Leu Thr Ile
1620 1625 1630
AGG CAT TGA TGT CTT CGG CGC CGG CGG GAA CSG TTC GGT CTG GGG CTC 4944
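
The codon-by-codon layout of SEQ ID NO:164 pairs each triplet with a three-letter amino acid, `*` for a stop codon, and `Xaa` wherever the codon contains an ambiguity code (for example GGK at residue 84). The sketch below, which is not part of the original listing, reproduces that layout in frame 1 with the standard genetic code. The rule "any codon with a non-ACGT letter becomes Xaa" is an assumption inferred from the listing, and the names `CODON_TABLE` and `translate` are illustrative.

```python
# Standard genetic code, enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# One-letter to three-letter names; "*" marks a stop codon as in the listing.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(dna):
    """Translate frame 1 of dna into three-letter codes, Xaa for ambiguous codons."""
    names = []
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    for pos in range(0, usable, 3):
        codon = dna[pos:pos + 3].upper()
        one_letter = CODON_TABLE.get(codon)
        # Assumption: a codon with any non-ACGT letter is reported as Xaa.
        names.append(THREE_LETTER[one_letter] if one_letter else "Xaa")
    return names


# First 48 bases of SEQ ID NO:164 (codons 1 to 16).
first_row = "CGTGGGAGTCCGGGGCCCCGGACCTCCCACCGAGGTGGGGGGAAAGGG"
print(" ".join(translate(first_row)))
```

Running it on the first row should give the same sixteen residues printed under that row in the listing, starting Arg Gly Ser Pro.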


JUMBO APPLICATIONS/PATENTS

THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE VOLUME.

THIS IS VOLUME 1 OF ~

NOTE: For additional volumes, please contact the Canadian Patent Office.


Administrative Status


Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 1995-02-14
(87) PCT Publication Date 1995-08-17
(85) National Entry 1995-12-28
Dead Application 2003-02-14

Abandonment History

Abandonment Date Reason Reinstatement Date
2002-02-14 FAILURE TO REQUEST EXAMINATION
2003-02-14 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $0.00 1995-12-28
Registration of a document - section 124 $0.00 1996-03-28
Maintenance Fee - Application - New Act 2 1997-02-14 $100.00 1996-12-20
Maintenance Fee - Application - New Act 3 1998-02-16 $100.00 1998-01-23
Maintenance Fee - Application - New Act 4 1999-02-15 $100.00 1999-01-29
Maintenance Fee - Application - New Act 5 2000-02-14 $150.00 1999-12-29
Maintenance Fee - Application - New Act 6 2001-02-14 $150.00 2001-01-19
Maintenance Fee - Application - New Act 7 2002-02-14 $150.00 2002-01-02
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
ABBOTT LABORATORIES
Past Owners on Record
BUIJK, SHERI L.
DAWSON, GEORGE J.
DESAI, SURESH M.
ERKER, JAMES CARL
LEARY, THOMAS P.
MUERHOFF, ANTHONY SCOTT
MUSHAHWAR, ISA K.
PILOT-MATIAS, TAMI J.
SCHLAUDER, GEORGE G.
SIMONS, JOHN N.
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents



Document Description Date (yyyy-mm-dd) Number of pages Size of Image (KB)
Description 1995-08-17 302 13,287
Description 1995-08-17 314 9,415
Cover Page 1996-04-30 1 29
Abstract 1995-08-17 1 53
Claims 1995-08-17 8 357
Drawings 1995-08-17 39 953
International Preliminary Examination Report 1995-12-28 7 169
Fees 1996-12-20 1 60