Patent 2393374 Summary

(12) Patent Application: (11) CA 2393374
(54) English Title: HIGH THROUGHPUT OR CAPILLARY-BASED SCREENING FOR A BIOACTIVITY OR BIOMOLECULE
(54) French Title: CRIBLAGE A HAUT RENDEMENT OU DE TYPE CAPILLAIRE DESTINE A IDENTIFIER UNE BIO-ACTIVITE OU UNE BIOMOLECULE
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Bibliographic Data
(51) International Patent Classification (IPC):
(72) Inventors :
  • SHORT, JAY M. (United States of America)
  • KELLER, MARTIN (United States of America)
  • LAFFERTY, WILLIAM MICHAEL (United States of America)
(73) Owners :
  • DIVERSA CORPORATION
(71) Applicants :
  • DIVERSA CORPORATION (United States of America)
(74) Agent: MBM INTELLECTUAL PROPERTY AGENCY
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2001-10-10
(87) Open to Public Inspection: 2002-04-18
Availability of licence: N/A
Dedicated to the Public: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2001/031806
(87) International Publication Number: US2001031806
(85) National Entry: 2002-06-04

(30) Application Priority Data:
Application No. Country/Territory Date
09/685,432 (United States of America) 2000-10-10
09/738,871 (United States of America) 2000-12-15
09/790,321 (United States of America) 2001-02-21
09/894,956 (United States of America) 2001-06-27
60/309,101 (United States of America) 2001-07-31

Abstracts

English Abstract


Provided is a method of screening or enriching a sample containing
polynucleotides from a mixed population of organisms. The method includes
creating a DNA library from a plurality of nucleic acid sequences of a mixed
population of organisms and separating clones containing a polynucleotide
sequence of interest on an analyzer that detects a detectable molecule on a
probe or bioactive substrate. The analyzer includes FACS devices, SQUID devices
and MCS devices. The separated or enriched library can then be further
processed by activity-based screening or sequence-based screening. In addition,
the enriched sequence can be compared to a database to identify sequences in
the database which have homology to a clone in the library, thereby obtaining a
nucleic acid profile of the mixed population of organisms.
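
As an illustration of the separation step described in the abstract, the following is a minimal sketch and not the applicants' implementation. It assumes each clone has already been reduced to a single detector reading for the labeled probe or substrate, and that a fixed gate separates positive from negative clones. The names `Clone`, `sort_clones` and `positive_fraction`, the readings, and the gate value are all hypothetical and do not come from the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Clone:
    clone_id: str   # hypothetical identifier for a library clone
    signal: float   # hypothetical detector reading, arbitrary units


def sort_clones(library: List[Clone], gate: float) -> List[Clone]:
    """Return the clones whose signal is at or above the gate."""
    return [clone for clone in library if clone.signal >= gate]


def positive_fraction(clones: List[Clone], gate: float) -> float:
    """Fraction of clones at or above the gate; 0.0 for an empty list."""
    if not clones:
        return 0.0
    return sum(clone.signal >= gate for clone in clones) / len(clones)


# Toy library: two clones carry a strong signal, two do not (made-up values).
library = [
    Clone("c1", 12.0),
    Clone("c2", 480.0),
    Clone("c3", 7.5),
    Clone("c4", 910.0),
]
enriched = sort_clones(library, gate=100.0)
```

Comparing `positive_fraction(library, 100.0)` with `positive_fraction(enriched, 100.0)` gives a rough sense of how much a single sorting pass enriches the library under these assumptions. A real analyzer would gate on instrument-specific, multi-parameter signals rather than on one number.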


French Abstract

L'invention concerne une méthode de criblage ou d'enrichissement d'un échantillon contenant des polynucléotides provenant d'une population mixte d'organismes. La méthode consiste à créer une banque d'ADN à partir d'une pluralité de séquences nucléotidiques d'une population mélangée d'organismes et à séparer des clones contenant une séquence polynucléotidique d'intérêt sur un analyseur détectant une molécule détectable sur une sonde ou un substrat bioactif. L'analyseur comprend des dispositifs de triage cellulaire activés par fluorescence (FACS), des dispositifs supraconducteurs à interférences quantiques (SQID) et des dispositifs de spectroscopie à couplage multipôle (MCS). La banque séparée ou enrichie peut être à nouveau traitée par un criblage basé sur l'activité ou un criblage basé sur la séquence. De plus, la séquence enrichie peut être comparée à une base de données afin d'identifier des séquences dans la base de données présentant une homologie vis-à-vis d'un clone dans la banque, obtenant ainsi un profil d'acide nucléique de la population mélangée d'organismes.

Claims

Note: Claims are shown in the official language in which they were submitted.


WHAT IS CLAIMED IS:
1. A method for identifying a polynucleotide in a liquid phase comprising:
a) contacting a plurality of polynucleotides derived from at least
one organism with at least one nucleic acid probe under conditions that allow
hybridization of the probe to the polynucleotides having complementary
sequences,
wherein the probe is labeled with a detectable molecule; and
b) identifying a polynucleotide of interest with an analyzer that
detects the detectable molecule.
2. The method of claim 1, wherein the polynucleotides are from a mixed
population of cells.
3. The method of claim 1, wherein the polynucleotides are in a library.
4. The method of claim 3 wherein the library is an expression library.
5. The method of claim 3 wherein the library is an environmental expression
library.
6. The method of claim 1, wherein the nucleic acid probe is from at least
about 15 bases to about 100 bases.
7. The method of claim 1, wherein the nucleic acid probe is from at least
about 100 bases to about 500 bases.
8. The method of claim 1, wherein the nucleic acid probe is from at least
about 500 bases to about 1,000 bases.
9. The method of claim 1, wherein the nucleic acid probe is from at least
about 1,000 bases to about 5,000 bases.
10. The method of claim 1, wherein the nucleic acid probe is from at least
about 5,000 bases to about 10,000 bases.

11. The method of claim 1, wherein the detectable molecule is a fluorescent
molecule.
12. The method of claim 1, wherein the detectable molecule is a magnetic
molecule.
13. The method of claim 1, wherein the detectable molecule modulates a
magnetic field.
14. The method of claim 1, wherein the detectable molecule modulates the
dielectric signature of the clone.
15. The method of claim 1, wherein the analyzer is a FACS analyzer.
16. The method of claim 1, wherein the analyzer is a magnetic field sensing
device.
17. The method of claim 16, wherein the magnetic field sensing device is a
Super Conducting Quantum Interference Device.
18. The method of claim 1, wherein the analyzer is a multipole coupling
spectroscopy device.
19. The method of claim 1, wherein the organism is from an environmental
sample.
20. The method of claim 19, wherein the environmental sample is selected from
the group consisting of geothermal fields, hydrothermal fields, acidic soils,
solfatara mud pots, boiling mud pots, pools, hot-springs, geysers, marine
actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil,
temperate soil, arid soil, compost piles, manure piles, marine sediments,
freshwater sediments, water concentrates, hypersaline sea ice, super-cooled
sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, marine snow,
microbial mats, whale falls, springs, hydrothermal vents, insect and nematode
gut microbial communities, plant endophytes, epiphytic water samples,
industrial sites and ex situ enrichments.
21. The method of claim 19, wherein the environmental sample is selected from
the group consisting of eukaryotes, prokaryotes, myxobacteria (epothilone),
air, water, sediment, soil or rock.
22. The method of claim 1, wherein the organism comprises a microorganism.
23. The method of claim 19, wherein the environmental sample contains
extremophiles.
24. The method of claim 23, wherein the extremophiles are selected from the
group consisting of hyperthermophiles, psychrophiles, halophiles,
psychrotrophs, alkalophiles, and acidophiles.
25. The method of claim 1, further comprising encapsulation of the
polynucleotide in a microenvironment.
26. The method of claim 25, wherein the microenvironment is selected from
beads, high temperature agaroses, gel microdroplets, cells, ghost red blood
cells, macrophages, or liposomes.
27. The method of claim 26, wherein the microenvironment is a gel
microdroplet.
28. The method of claim 25, wherein the detectable molecule is a biotinylated
substrate.
29. The method of claim 28, wherein the biotinylated substrate comprises a
core fluorophore structure, a spacer connected to the fluorophore structure by
a first connector and connected to the bioactivity or biomolecule of interest
by a second connector, and two functional groups, wherein each functional group
is attached to the fluorophore structure by a connector unit.

30. The method of claim 29, wherein the fluorophore is selected from the group
consisting of coumarins, resorufins and xanthenes.
31. The method of claim 29, wherein the spacer is selected from the group
consisting of alkanes, and oligoethyleneglycols.
32. The method of claim 29, wherein the connector units are selected from the
groups consisting of ether, amine, amide, ester, urea, thiourea and other
moieties.
33. The method of claim 29, wherein the functional groups are independently
selected from the group consisting of straight alkanes, branched alkanes,
monosaccharides, oligosaccharides, unsaturated hydrocarbons and aromatic
groups.
34. The method of claim 25, wherein the analyzer is a flow cytometer.
35. The method of claim 28, wherein the biotinylated substrate comprises a
core fluorophore structure, a spacer connected to the fluorophore structure by
a first connector and connected to the bioactivity or biomolecule of interest
by a second connector, and a quencher component, attached to the fluorophore by
a polymer.
36. The method of claim 35, wherein the fluorophore is selected from the group
consisting of acridines, coumarins, fluorescein, rhodamine, BODIPY,
resorufin, and porphyrins.
37. The method of claim 35, wherein the quencher is a moiety capable of
quenching fluorescence of the fluorophore.
38. The method of claim 35, wherein the polymer is selected from the group
consisting of amines, ethers, esters, amides, peptides and oligosaccharides.

39. The method of claim 35, wherein the spacer is selected from the group
consisting of: alkanes, and oligoethyleneglycols.
40. The method of claim 35, wherein the first and second connectors are
selected from the groups consisting of ether, amine, amide, ester, urea,
thiourea and other moieties.
41. The method of claim 1, wherein the polynucleotide of interest encodes an
enzyme.
42. The method of claim 41, wherein the enzyme is selected from the group
consisting of lipases, esterases, proteases, glycosidases, glycosyl
transferases, phosphatases, kinases, mono- and dioxygenases, haloperoxidases,
lignin peroxidases, diarylpropane peroxidases, epoxide hydrolases, nitrile
hydratases, nitrilases, transaminases, amidases, and acylases.
43. The method of claim 1, wherein the polynucleotide of interest encodes a
small molecule.
44. The method of claim 1, wherein the polynucleotide of interest, or
fragments thereof, comprise one or more operons, or portions thereof.
45. The method of claim 44, wherein the operons, or portions thereof, encode
a complete or partial metabolic pathway.
46. The method of claim 44, wherein the operons or portions thereof encoding a
complete or partial metabolic pathway encode polyketide synthases.
47. A method for identifying a polynucleotide encoding a polypeptide of
interest comprising:
co-encapsulating in a microenvironment a plurality of library clones
containing DNA obtained from a mixed population of organisms, with a
mixture of oligonucleotide probes comprising a detectable label and at least a
portion of a polynucleotide sequence encoding a polypeptide of interest having
a specified bioactivity under such conditions and for such time as to allow
interaction of complementary sequences; and
identifying clones containing a complement to the oligonucleotide
probe encoding the polypeptide of interest by separating clones with an
analyzer that detects the detectable label.
48. A method for high throughput screening of a polynucleotide library for a
polynucleotide of interest that encodes a molecule of interest, comprising:
(a) contacting a library containing a plurality of clones comprising
polynucleotides derived from a mixed population of organisms with a plurality
of oligonucleotide probes labeled with a detectable molecule; and
(b) separating clones with an analyzer that detects the detectable
molecule.
49. The method of claim 48, further comprising:
(a) contacting the separated clones with a reporter system that
identifies a polynucleotide encoding the molecule of interest; and
(b) identifying clones capable of modulating expression or activity
of the reporter system thereby identifying a polynucleotide of interest.
50. The method of claim 48, wherein the library is an expression library.
51. The method of claim 48, wherein the mixed population of organisms is from
an environmental sample.
52. The method of claim 51, wherein the environmental sample is selected from
the group consisting of: geothermal fields, hydrothermal fields, acidic soils,
solfatara mud pots, boiling mud pots, pools, hot-springs, geysers, marine
actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil,
temperate soil, arid soil, compost piles, manure piles, marine sediments,
freshwater sediments, water concentrates, hypersaline sea ice, super-cooled
sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, marine snow,
microbial mats, whale falls, springs, hydrothermal vents, insect and nematode
gut microbial communities, plant endophytes, epiphytic water samples,
industrial sites and ex situ enrichments.
53. The method of claim 51, wherein the environmental sample is selected from
the group consisting of: eukaryotes, prokaryotes, myxobacteria (epothilone),
air, water, sediment, soil or rock.
54. The method of claim 48, wherein the mixed population of organisms
comprises microorganisms.
55. The method of claim 51, wherein the environmental sample contains
extremophiles.
56. The method of claim 55, wherein the extremophiles are selected from the
group consisting of hyperthermophiles, psychrophiles, halophiles,
psychrotrophs, alkalophiles, and acidophiles.
57. The method of claim 49, wherein the reporter system is a bioactive
substrate.
58. The method of claim 57, wherein the bioactive substrate comprises C12FDG.
59. The method of claim 58, wherein the bioactive substrate further comprises
a lipophilic tail.
60. The method of claim 49, further comprising prior to (a):
(i) obtaining polynucleotides from a mixed population of
organisms; and
(ii) generating a polynucleotide library.
61. The method of claim 60, further comprising normalizing the polynucleotides
prior to generating the library.
62. The method of claim 48, further comprising encapsulation of the clones in
a gel microdrop.
63. The method of claim 62, wherein the detectable molecule is a biotinylated
substrate.
64. The method of claim 63, wherein the biotinylated substrate comprises a
core fluorophore structure, a spacer connected to the fluorophore structure by
a first connector and connected to the bioactivity or biomolecule of interest
by a second connector, and two functional groups, wherein each functional group
is attached to the fluorophore structure by a connector unit.
65. The method of claim 64, wherein the fluorophore is selected from the group
consisting of coumarins, resorufins and xanthenes.
66. The method of claim 64, wherein the spacer is selected from the group
consisting of: alkanes, and oligoethyleneglycols.
67. The method of claim 64, wherein the connector units are selected from the
groups consisting of ether, amine, amide, ester, urea, thiourea and other
moieties.
68. The method of claim 64, wherein the functional groups are independently
selected from the group consisting of straight alkanes, branched alkanes,
monosaccharides, oligosaccharides, unsaturated hydrocarbons and aromatic
groups.
69. The method of claim 62, wherein the analyzer is a flow cytometer.
70. The method of claim 63, wherein the biotinylated substrate comprises a
core fluorophore structure, a spacer connected to the fluorophore structure by
a first connector and connected to the bioactivity or biomolecule of interest
by a second connector, and a quencher component, attached to the fluorophore by
a polymer.
71. The method of claim 70, wherein the fluorophore is selected from the group
consisting of acridines, coumarins, fluorescein, rhodamine, BODIPY,
resorufin, and porphyrins.
72. The method of claim 70, wherein the quencher is a moiety capable of
quenching fluorescence of the fluorophore.
73. The method of claim 70, wherein the polymer is selected from the group
consisting of amines, ethers, esters, amides, peptides and oligosaccharides.
74. The method of claim 70, wherein the spacer is selected from the group
consisting of alkanes, and oligoethyleneglycols.
75. The method of claim 70, wherein the first and second connectors are
selected from the groups consisting of ether, amine, amide, ester, urea,
thiourea and other moieties.
76. The method of claim 48, wherein the polynucleotide of interest encodes an
enzyme.

77. The method of claim 76, wherein the enzyme is selected from the group
consisting of lipases, esterases, proteases, glycosidases, glycosyl
transferases, phosphatases, kinases, mono- and dioxygenases, haloperoxidases,
lignin peroxidases, diarylpropane peroxidases, epoxide hydrolases, nitrile
hydratases, nitrilases, transaminases, amidases, and acylases.
78. The method of claim 49, wherein the reporter system comprises a detectable
label.
79. The method of claim 49, wherein the reporter system comprises a first test
protein linked to a DNA binding moiety and a second test protein linked to a
transcriptional activation moiety, wherein modulation of the interaction of
the first test protein linked to a DNA binding moiety with the second test
protein linked to a transcription activation moiety results in a change in the
expression of a detectable protein.
80. The method of claim 49, wherein the polynucleotide of interest encodes a
small molecule.
81. The method of claim 49, wherein the polynucleotide of interest, or
fragments thereof, comprise one or more operons, or portions thereof.
82. The method of claim 81, wherein the operons, or portions thereof, encode
a complete or partial metabolic pathway.
83. The method of claim 82, wherein the operons or portions thereof encoding a
complete or partial metabolic pathway encode polyketide synthases.
84. The method of claim 48, wherein the analyzer is a fluorescence activated
cell sorting (FACS) apparatus.

85. The method of claim 48, wherein the analyzer is a magnetic field sensing
device.
86. The method of claim 85, wherein the magnetic field sensing device is a
Super Conducting Quantum Interference Device.
87. The method of claim 48, wherein the analyzer is a multipole coupling
spectroscopy device.
88. The method of claim 48, wherein the plurality of oligonucleotide probes
have different nucleic acid sequences.
89. The method of claim 88, wherein the sequences are portions of a
polynucleotide encoding a molecule of interest.
90. The method of claim 48, wherein the plurality of oligonucleotide probes
have the same nucleic acid sequence.
91. A method of screening for a polynucleotide encoding an activity of
interest, comprising:
(a) obtaining polynucleotides from an environmental sample;
(b) normalizing the polynucleotides obtained from the sample;
(c) generating a library from the normalized polynucleotides;
(d) contacting the library with a plurality of oligonucleotide probes
comprising a detectable label and at least a portion of a polynucleotide
sequence encoding a polypeptide of interest having a specified activity to
select library clones positive for a sequence of interest; and
(e) selecting clones with an analyzer that detects the detectable
label.
92. The method of claim 91, further comprising:
(a) contacting the selected clones with a reporter system that
identifies a polynucleotide encoding the activity of interest; and
(b) identifying clones capable of modulating expression or activity
of the reporter system thereby identifying a polynucleotide of interest;
wherein the positive clones contain a polynucleotide sequence encoding an
activity of interest which is capable of catalyzing the bioactive substrate.
93. A method for screening polynucleotides, comprising:
(a) contacting a library of polynucleotides wherein the
polynucleotides are derived from a mixed population of organisms with a probe
oligonucleotide labeled with a fluorescent molecule, which fluoresces upon
binding of the probe to a target polynucleotide of the library, to select
library polynucleotides positive for a sequence of interest;
(b) separating library members that are positive for the sequence of
interest with a fluorescent analyzer that detects fluorescence; and
(c) expressing the selected polynucleotides to obtain polypeptides.
94. The method of claim 93, further comprising:
(a) contacting the polypeptides with a reporter system; and
(b) identifying polynucleotides encoding polypeptides capable of
modulating expression or activity of the reporter system.
95. A method for obtaining an organism from a mixed population of organisms in
a sample comprising:
(a) encapsulating in a microenvironment at least one organism
from the sample;
(b) incubating the encapsulated at least one organism under such
conditions and for such a time to allow the at least one microorganism to grow
or proliferate; and
(c) sorting the encapsulated at least one organism by a flow
cytometer to obtain an organism from the sample.

96. The method of claim 95, wherein the mixed population of organisms is from
an environmental sample.
97. The method of claim 96, wherein the environmental sample is selected from
the group consisting of geothermal fields, hydrothermal fields, acidic soils,
solfatara mud pots, boiling mud pots, pools, hot-springs, geysers, marine
actinomycetes, metazoan, endosymbionts, ectosymbionts, tropical soil,
temperate soil, arid soil, compost piles, manure piles, marine sediments,
freshwater sediments, water concentrates, hypersaline sea ice, super-cooled
sea ice, arctic tundra, Sargasso Sea, open ocean pelagic, marine snow,
microbial mats, whale falls, springs, hydrothermal vents, insect and nematode
gut microbial communities, plant endophytes, epiphytic water samples,
industrial sites and ex situ enrichments.
98. The method of claim 96, wherein the environmental sample is selected from
the group consisting of eukaryotes, prokaryotes, myxobacteria (epothilone),
air, water, sediment, soil or rock.
99. The method of claim 95, wherein the mixed population of organisms
comprises microorganisms.
100. The method of claim 96, wherein the environmental sample contains
extremophiles.
101. The method of claim 100, wherein the extremophiles are selected from the
group consisting of hyperthermophiles, psychrophiles, halophiles,
psychrotrophs, alkalophiles, and acidophiles.
102. The method of claim 95, wherein the flow cytometer comprises a magnetic
field sensing device.
103. The method of claim 102, wherein the magnetic field sensing device is a
Super Conducting Quantum Interference Device.
104. The method of claim 95, wherein the flow cytometer is a multipole
coupling spectroscopy device.
105. A method for identifying a bioactivity or biomolecule of interest,
comprising:
(a) transferring a library containing a plurality of clones comprising
polynucleotides derived from a mixed population of organisms or more than one
organism, to a bacterial host cell;
(b) contacting the bacterial host cell with a mammalian host cell
containing a detectable reporter molecule in a microenvironment; and
(c) separating clones with an analyzer that detects the detectable molecule.
106. The method of claim 105, wherein the microenvironment is selected from
beads, high temperature agaroses, gel microdroplets, cells, ghost red blood
cells, macrophages, or liposomes.
107. The method of claim 106, wherein the liposomes are prepared from one or
more phospholipids, glycolipids, steroids, alkyl phosphates or fatty acid
esters.
108. The method of claim 107, wherein the phospholipids are selected from the
group consisting of lecithin, sphingomyelin and dipalmitoyl.
109. The method of claim 107, wherein the steroids are selected from the group
consisting of cholesterol, cholestanol and lanosterol.
110. The method of claim 105, wherein the detectable reporter contains a
bioluminescent molecule, a chemiluminescent molecule, a colorimetric
molecule, an electromagnetic molecule, an isotopic molecule, a thermal
molecule or an enzymatic substrate.
111. The method of claim 110, wherein the bioluminescent molecule is green
fluorescent protein (GFP) or red fluorescent protein (RFP).

112. The method of claim 105, wherein the analyzer is a FACS analyzer.
113. The method of claim 105, wherein the analyzer is a capillary-based
screening apparatus.
114. A method for identifying a bioactivity or biomolecule of interest,
comprising:
(a) transferring a library containing a plurality of clones comprising
polynucleotides derived from a mixed population of organisms or more than one
organism, to a first host cell;
(b) contacting the first host cell with a second host cell containing a
detectable reporter molecule in a microenvironment, wherein the first host
cell is different from the second host cell; and
(c) separating clones with an analyzer that detects the detectable molecule.
115. The method of claim 114, wherein the first host cell is a prokaryotic
cell.
116. The method of claim 114, wherein the first host cell is a eukaryotic
cell.
117. The method of claim 114, wherein the second host cell is a prokaryotic
cell.
118. The method of claim 114, wherein the second host cell is a eukaryotic
cell.
119. The method of claims 115 or 117, wherein the prokaryotic cell is a
bacterial cell.
120. The method of claims 116 or 118, wherein the eukaryotic cell is a
mammalian cell.
121. The method of claim 114, wherein the microenvironment is selected from
beads, high temperature agaroses, gel microdroplets, cells, ghost red blood
cells, macrophages, or liposomes.
122. The method of claim 121, wherein the liposomes are prepared from one or
more phospholipids, glycolipids, steroids, alkyl phosphates or fatty acid
esters.
123. The method of claim 122, wherein the phospholipids are selected from the
group consisting of lecithin, sphingomyelin and dipalmitoyl.

124. The method of claim 122, wherein the steroids are selected from the group
consisting of cholesterol, cholestanol and lanosterol.
125. The method of claim 114, wherein the detectable reporter contains a
bioluminescent molecule, a chemiluminescent molecule, a colorimetric
molecule, an electromagnetic molecule, an isotopic molecule, a thermal
molecule or an enzymatic substrate.
126. The method of claim 125, wherein the bioluminescent molecule is green
fluorescent protein (GFP) or red fluorescent protein (RFP).
127. The method of claim 114, wherein the analyzer is a FACS analyzer.
128. The method of claim 114, wherein the analyzer is a capillary-based
screening apparatus.
129. A method for identifying a bioactivity or biomolecule of interest,
comprising:
(a) transferring a library containing a plurality of clones comprising
polynucleotides derived from a mixed population of organisms or more
than one organism, to a first host cell; and
(b) contacting the first host cell with a second host cell containing a
detectable reporter molecule in a microenvironment, wherein the first
host cell and second host cell are different.
130. The method of claim 129, wherein the first host cell is a prokaryotic
cell.
131. The method of claim 129, wherein the first host cell is a eukaryotic
cell.
132. The method of claim 129, wherein the second host cell is a prokaryotic
cell.
133. The method of claim 129, wherein the second host cell is a eukaryotic
cell.
134. The method of claims 130 or 132, wherein the prokaryotic cell is a
bacterial cell.
135. The method of claims 131 or 133, wherein the eukaryotic cell is a
mammalian cell.
136. The method of claim 129, wherein the microenvironment is selected from
beads, high temperature agaroses, gel microdroplets, cells, ghost red blood
cells, macrophages, or liposomes.
137. The method of claim 136, wherein the liposomes are prepared from one or
more phospholipids, glycolipids, steroids, alkyl phosphates or fatty acid
esters.
138. The method of claim 137, wherein the phospholipids are selected from the
group consisting of lecithin, sphingomyelin and dipalmitoyl.
139. The method of claim 137, wherein the steroids are selected from the group
consisting of cholesterol, cholestanol and lanosterol.
140. The method of claim 129, wherein the detectable reporter contains a
bioluminescent molecule, a chemiluminescent molecule, a colorimetric
molecule, an electromagnetic molecule, an isotopic molecule, a thermal
molecule or an enzymatic substrate.
141. The method of claim 140, wherein the bioluminescent molecule is green
fluorescent protein (GFP) or red fluorescent protein (RFP).
142. The method of claim 129, further comprising separating the clones with an
analyzer that detects the detectable molecule.
143. The method of claim 142, wherein the analyzer is a FACS analyzer.
144. The method of claim 142, wherein the analyzer is a capillary-based
screening apparatus.
145. The method of claim 142, wherein the analyzer is a mass spectroscopic
screening apparatus.

146. A method for identifying a bioactivity or biomolecule of interest,
comprising:
(a) transferring the extract of a library containing a plurality of clones
comprising polynucleotides derived from a mixed population of organisms
or more than one organism, to a first host cell; and
(b) contacting the extract with a second host cell containing a detectable
reporter molecule.
147. The method of claim 146, wherein the first host cell is a prokaryotic
cell.
148. The method of claim 146, wherein the first host cell is a eukaryotic
cell.
149. The method of claim 146, wherein the second host cell is a prokaryotic
cell.
150. The method of claim 146, wherein the second host cell is a eukaryotic
cell.
151. The method of claims 147 or 149, wherein the prokaryotic cell is a
bacterial cell.
152. The method of claims 148 or 150, wherein the eukaryotic cell is a
mammalian cell.
153. The method of claim 146, wherein the extract is contacted with a host
cell in a microenvironment.
154. The method of claim 153, wherein the microenvironment is selected from
beads, high temperature agaroses, gel microdroplets, cells, ghost red blood
cells, macrophages, or liposomes.
155. The method of claim 154, wherein the liposomes are prepared from one or
more phospholipids, glycolipids, steroids, alkyl phosphates or fatty acid
esters.
156. The method of claim 155, wherein the phospholipids are selected from the
group consisting of lecithin, sphingomyelin and dipalmitoyl.
157. The method of claim 155, wherein the steroids are selected from the group
consisting of cholesterol, cholestanol and lanosterol.

158. The method of claim 146, wherein the detectable reporter contains a
bioluminescent molecule, a chemiluminescent molecule, a colorimetric
molecule, an electromagnetic molecule, an isotopic molecule, a thermal
molecule or an enzymatic substrate.
159. The method of claim 158, wherein the bioluminescent molecule is green
fluorescent protein (GFP) or red fluorescent protein (RFP).
160. The method of claim 146, further comprising separating the clones with an
analyzer that detects the detectable molecule.
161. The method of claim 160, wherein the analyzer is a FACS analyzer.
162. The method of claim 160, wherein the analyzer is a capillary-based
screening apparatus.
163. The method of claim 160, wherein the analyzer is a mass spectroscopic
screening apparatus.
164. A method for identifying a bioactivity or biomolecule of interest,
comprising:
a) running the extract of a library containing a plurality of clones
comprising
polynucleotides derived from a mixed population of organisms or more
than one organism, through a column;
b) transferring the extract to a first host cell;
c) contacting the extract with a second host cell containing a detectable
reporter molecule; and
d) measuring the mass spectra of the host cell with the extract, wherein a
difference in the mass spectra of the host cell with the extract from the
mass spectra without the extract is indicative of the presence of a
bioactivity or biomolecule of interest in the extract of the library.
165. A sample screening apparatus, comprising:
a plurality of capillaries held together in an array, wherein each capillary
comprises at least one wall defining a lumen for retaining a sample;
interstitial material disposed between adjacent capillaries in the array; and
one or more reference indicia formed within the interstitial material.
166. The apparatus of claim 165, wherein each capillary has an aspect ratio of
between 10:1 and 1000:1.
167. The apparatus of claim 166, wherein each capillary has an aspect ratio of
between 20:1 and 100:1.
168. The apparatus of claim 165, wherein each capillary has an aspect ratio of
between 40:1 and 50:1.
169. The apparatus of claim 165, wherein each capillary has a length of
between
5mm and 10 cm.
170. The apparatus of claim 165, wherein the lumen of each capillary has an
internal diameter of between 3µm and 500µm.
171. The apparatus of claim 165, wherein the lumen of each capillary has an
internal diameter of between 10µm and 500µm.
172. The apparatus of claim 165, wherein the plurality of capillaries are
fused
together to form the array.
173. The apparatus of claim 165, wherein the reference indicia are formed at
intervals of a number of capillaries.
174. The apparatus of claim 165, wherein the reference indicia are formed at
edges
of the array.
175. The apparatus of claim 165, wherein the reference indicia are formed of
glass.
176. A capillary for screening a sample, wherein the capillary is adapted for
being
held in an array of capillaries, the capillary comprising:
a first wall defining a lumen for retaining the sample, wherein the first wall
forms a waveguide for propagating detectable signals therein; and
a second wall formed of a filtering material, for filtering excitation energy
provided to the lumen to excite the sample.
177. The capillary of claim 176, wherein the second wall circumscribes the
first
wall.
178. The capillary of claim 176, wherein the second wall is formed of extra
mural
absorption (EMA) glass.
179. The capillary of claim 178, wherein the EMA glass is tuned to filter
specific
wavelengths of light.
180. A capillary array for screening a plurality of samples, comprising:
a plurality of capillaries, held together into the array, wherein each
capillary
includes a first wall defining a lumen for retaining the sample, and a second
wall
circumscribing the first wall, for filtering excitation energy provided to the
lumen to
excite the sample.
181. The array of claim 180, wherein the second wall of each capillary is
formed of
a filtering material.
182. The array of claim 181, wherein the filtering material is EMA glass.
183. The array of claim 182, wherein the EMA glass is tuned to filter specific
wavelengths of light.
184. The array of claim 180, further comprising interstitial material between
adjacent capillaries.
185. The array of claim 184, wherein the interstitial material is adapted to
absorb
light.
186. A method for incubating a bioactivity or biomolecule of interest,
comprising:
introducing a first component into at least a portion of a capillary of a
capillary
array, wherein each capillary of the capillary array comprises at least one
wall
defining a lumen for retaining the first component;
introducing air into the capillary behind the first component; and
introducing a second component into the capillary, wherein the second
component is separated from the first component by the air.
187. The method of claim 186, wherein either the first or second component
includes at least one particle of interest.
188. The method of claim 187, wherein the other of the first and second
component
includes a developer for causing an activity of interest by the particle of
interest.
189. The method of claim 187, wherein the particle of interest is a molecule.
190. The method of claim 186, further comprising disrupting the air to combine
the
first component with the second component.
191. The method of claim 186, wherein the first and second components are
liquids.
192. A method of incubating a sample of interest, comprising:
introducing a first liquid labeled with a detectable particle into a capillary
of a
capillary array, wherein each capillary of the capillary array comprises at
least one
wall defining a lumen for retaining the liquid and the detectable particle;
submersing one end of the capillary into a fluid bath containing a second
liquid; and
evaporating the first liquid from the opposite end of the capillary to draw
the second
liquid into the capillary tube.
193. The method of claim 192, wherein the second liquid contains a developer
for
causing an activity of interest by the detectable particle.
194. The method of claim 193, wherein the developer includes at least one
nutrient.
195. The method of claim 194, wherein the nutrient includes oxygen.
196. A method of incubating a sample of interest, comprising:
introducing a first liquid labeled with a detectable particle into a capillary
of a
capillary array, wherein each capillary of the capillary array comprises at
least one
wall defining a lumen for retaining the first liquid and the detectable
particle, and
wherein the at least one wall is coated with a binding material for binding
the
detectable particle to the at least one wall;
removing the first liquid from the capillary tube, wherein the bound
detectable
particle is maintained within the capillary; and
introducing a second liquid into the capillary tube.
197. The method of claim 196, wherein the binding material includes DNA.
198. The method of claim 197, wherein the binding material includes an
antibody.
199. A method of incubating a sample of interest, comprising:
introducing a liquid labeled with a detectable particle into a capillary of a
capillary array, wherein each capillary of the capillary array comprises at
least one
wall defining a lumen for retaining the liquid and the detectable particle;
introducing paramagnetic beads to the liquid; and
exposing the capillary containing the paramagnetic beads to a magnetic field
to cause movement of the paramagnetic beads in the liquid within the
capillary.
200. The method of claim 199, further comprising reversing polarity of the
magnetic field to cause reverse movement of the paramagnetic beads.
201. A method of recovering a sample from one of a plurality of capillaries in
a
capillary array, comprising:
determining a coordinate position of a recovery tool;
detecting a coordinate location of a capillary containing the sample;
correlating, via relative movement between the recovery tool and the capillary
containing the sample, the coordinate position of the recovery tool with the
coordinate
location of the capillary; and
providing contact between the capillary and the recovery tool.
202. The method of claim 201, further comprising removing, with the recovery
tool, the sample from the capillary containing the sample.
203. A recovery apparatus for a sample screening system, wherein the system
includes a plurality of capillaries formed into an array, the apparatus
comprising:
a recovery tool adapted to contact at least one capillary of the capillary
array and
recover a sample therefrom;
an ejector, connected with the recovery tool, for ejecting the recovered
sample
from the recovery tool.
204. The recovery apparatus of claim 203, wherein the recovery tool includes a
needle connected with a collection container.
205. The recovery apparatus of claim 203, wherein the recovery tool includes
an
aspirator for recovering the sample.
206. The recovery apparatus of claim 203, wherein the ejector includes a jet
mechanism adapted to expel the recovered sample.
207. The recovery apparatus of claim 203, wherein the jet mechanism is
operable
by thermal energy applied thereto.
208. The recovery apparatus of claim 207, further comprising a heating element
connected to the jet mechanism.
209. A sample screening apparatus, comprising:
a plurality of capillaries held together in a planar array, wherein each
capillary
comprises at least one wall defining a lumen for retaining a sample;
interstitial material disposed between adjacent capillaries in the array; and
one or more reference indicia formed within the interstitial material.
210. The sample screening apparatus of claim 209, wherein the planar array
includes approximately 1,000,000 capillaries.
211. A method of enriching for a polynucleotide encoding an activity of
interest,
comprising:
contacting a mixed population of polynucleotides derived from a mixed
population of organisms with at least one nucleic acid probe comprising a
detectable label and at least a portion of a polynucleotide sequence encoding
a
polypeptide of interest having a specified activity to enrich for
polynucleotides
positive for a sequence of interest.

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 02393374 2002-06-04
WO 02/31203 PCT/US01/31806
HIGH THROUGHPUT OR CAPILLARY-BASED
SCREENING FOR A BIOACTIVITY OR BIOMOLECULE
FIELD OF THE INVENTION
The present invention relates generally to screening of mixed populations of
organisms or nucleic acids and more specifically to the identification of
bioactive
molecules and bioactivities using screening techniques, including high
throughput
screening and capillary array platform for screening samples.
BACKGROUND
There is a critical need in the chemical industry for efficient catalysts for
the
practical synthesis of optically pure materials; enzymes can provide the
optimal
solution. All classes of molecules and compounds that are utilized in both
established
and emerging chemical, pharmaceutical, textile, food and feed, detergent
markets
must meet stringent economical and environmental standards. The synthesis of
polymers, pharmaceuticals, natural products and agrochemicals is often
hampered by
expensive processes which produce harmful byproducts and which suffer from low
enantioselectivity (Faber, 1995; Tonkovich and Gerber, U.S. Dept of Energy
study,
1995). Enzymes have a number of remarkable advantages which can overcome these
problems in catalysis: they act on single functional groups, they distinguish
between
similar functional groups on a single molecule, and they distinguish between
enantiomers. Moreover, they are biodegradable and function at very low mole
fractions in reaction mixtures. Because of their chemo-, regio- and
stereospecificity,
enzymes present a unique opportunity to optimally achieve desired selective
transformations. These are often extremely difficult to duplicate chemically,
especially in single-step reactions. The elimination of the need for
protection groups,
selectivity, the ability to carry out multi-step transformations in a single
reaction
vessel, along with the concomitant reduction in environmental burden, has led
to the
increased demand for enzymes in chemical and pharmaceutical industries (Faber,
1995). Enzyme-based processes have been gradually replacing many conventional
chemical-based methods (Wrotnowski, 1997). A current limitation to more
widespread industrial use is primarily due to the relatively small number of
commercially available enzymes. Only 300 enzymes (excluding DNA modifying
enzymes) are at present commercially available from the > 3000
non-DNA-modifying enzyme activities thus far described.
The use of enzymes for technological applications also may require
performance under demanding industrial conditions. This includes activities in
environments or on substrates for which the currently known arsenal of enzymes
was
not evolutionarily selected. Enzymes have evolved by selective pressure to
perform
very specific biological functions within the milieu of a living organism,
under
conditions of mild temperature, pH and salt concentration. For the most part,
the non-DNA modifying enzyme activities thus far described (Enzyme Nomenclature, 1992)
have been isolated from mesophilic organisms, which represent a very small
fraction
of the available phylogenetic diversity (Amann et al., 1995). The dynamic
field of
biocatalysis takes on a new dimension with the help of enzymes isolated from
microorganisms that thrive in extreme environments. Such enzymes must function
at
temperatures above 100 °C in terrestrial hot springs and deep sea
thermal vents, at
temperatures below 0 °C in arctic waters, in the saturated salt
environment of the
Dead Sea, at pH values around 0 in coal deposits and geothermal sulfur-rich
springs,
or at pH values greater than 11 in sewage sludge (Adams and Kelly, 1995). The
enzymes may also be obtained from: geothermal and hydrothermal fields, acidic
soils, solfatara and boiling mud pots, pools, hot-springs and geysers where
the enzymes are neutral to alkaline, marine actinomycetes, metazoan, endo and
ectosymbionts, tropical soil, temperate soil, arid soil, compost piles, manure
piles, marine sediments, freshwater sediments, water concentrates, hypersaline
and super-cooled sea ice, arctic tundra, Sargasso Sea, open ocean pelagic,
marine snow,
microbial mats (such as whale falls, springs and hydrothermal vents), insect
and
nematode gut microbial communities, plant endophytes, epiphytic water samples,
industrial sites and ex situ enrichments. Additionally, the enzymes may be
isolated
from eukaryotes, prokaryotes, myxobacteria (epothilone), air, water, sediment,
soil or
rock. Enzymes obtained from these extremophilic organisms open a new field in
biocatalysis.
For example, several esterases and lipases cloned and expressed from
extremophilic organisms are remarkably robust, showing high activity
throughout a
wide range of temperatures and pHs. The fingerprints of several of these
esterases
show a diverse substrate spectrum, in addition to differences in the optimum
reaction
temperature. Certain esterases recognize only short chain substrates while
others act only on long chain substrates, in addition to a huge difference in
the optimal reaction
temperature. These results suggest that more diverse enzymes fulfilling the
need for
new biocatalysts can be found by screening biodiversity. Substrates upon which
enzymes act are herein defined as bioactive substrates.
Furthermore, virtually all of the enzymes known so far have come from
cultured organisms, mostly bacteria and more recently archaea (Enzyme
Nomenclature, 1992). Traditional enzyme discovery programs rely solely on
cultured
microorganisms for their screening programs and are thus only accessing a
small
fraction of natural diversity. Several recent studies have estimated that only
a small percentage, conservatively less than 1 %, of organisms present in the
natural environment have been cultured (see Table I, Amann et al., 1995, Barns
et al., 1994, Torsvik, 1990). For example, Norman Pace's laboratory recently
reported
intensive
untapped diversity in water and sediment samples from the "Obsidian Pool" in
Yellowstone National Park, a spring which has been studied since the early
1960's by
microbiologists (Barns, 1994). Amplification and cloning of 16S rRNA encoding
sequences revealed mostly unique sequences with little or no representation of
the
organisms which had previously been cultured from this pool. This suggests
substantial diversity of archaea with so far unknown morphological,
physiological and
biochemical features which may be useful in industrial processes. David Ward's
laboratory in Bozeman, Montana has performed similar studies on the
cyanobacterial
mat of Octopus Spring in Yellowstone Park and came to the same conclusion,
namely,
tremendous uncultured diversity exists (Bateson et al., 1989). Giovannoni et
al.
(1990) reported similar results using bacterioplankton collected in the
Sargasso Sea
while Torsvik et al. (1990) have shown by DNA reassociation kinetics that
there is
considerable diversity in soil samples. Hence, this vast majority of
microorganisms
represents an untapped resource for the discovery of novel biocatalysts. In
order to
access this potential catalytic diversity, recombinant screening approaches
are
required.
The discovery of novel bioactive molecules other than enzymes is also
afforded by the present invention. For instance, antibiotics, antivirals,
antitumor
agents and regulatory proteins can be discovered utilizing the present
invention.
Bacteria and many eukaryotes have a coordinated mechanism for regulating
genes whose products are involved in related processes. The genes are
clustered, in
structures referred to as "gene clusters," on a single chromosome and are
transcribed
together under the control of a single regulatory sequence, including a single
promoter
which initiates transcription of the entire cluster. The gene cluster, the
promoter, and
additional sequences that function in regulation altogether are referred to as
an
"operon" and can include up to 30 or more genes, usually from 2 to 6 genes.
Thus, a
gene cluster is a group of adjacent genes that are either identical or
related, usually as
to their function.
Some gene families consist of one or more identical members. Clustering is a
prerequisite for maintaining identity between genes, although clustered genes
are not
necessarily identical. Gene clusters range from extremes where a duplication
is
generated of adjacent related genes to cases where hundreds of identical genes
lie in a
tandem array. Sometimes no significance is discernable in a repetition of a
particular
gene. A principal example of this is the expressed duplicate insulin genes in
some
species, whereas a single insulin gene is adequate in other mammalian species.
It is important to further research gene clusters and the extent to which the
full length of the cluster is necessary for the expression of the proteins
resulting
therefrom. Gene clusters undergo continual reorganization and, thus, the
ability to
create heterogeneous libraries of gene clusters from, for example, bacterial
or other
prokaryote sources is valuable in determining sources of novel proteins,
particularly
including enzymes such as, for example, the polyketide synthases that are
responsible
for the synthesis of polyketides having a vast array of useful activities. As
indicated,
other types of proteins and molecules that are the product(s) of gene clusters
are also
contemplated, including, for example, antibiotics, antivirals, antitumor
agents and
regulatory proteins, such as insulin.
Polyketides are molecules which are an extremely rich source of bioactivities,
including antibiotics (such as tetracyclines and erythromycin), anti-cancer
agents
(daunomycin), immunosuppressants (FK506 and rapamycin), and veterinary
products
(monensin). Many polyketides (produced by polyketide synthases) are valuable
as
therapeutic agents. Polyketide synthases are multifunctional enzymes that
catalyze
the biosynthesis of a huge variety of carbon chains differing in length and
patterns of
functionality and cyclization. Polyketide synthase genes fall into gene
clusters and at
least one type (designated type I) of polyketide synthases have large size
genes and
encoded enzymes, complicating genetic manipulation and in vitro studies of
these
genes/proteins. The method(s) of the present invention facilitate the rapid
discovery
of these gene clusters in gene expression libraries.
Gene libraries of microorganisms have been prepared for the purpose of
identifying genes involved in biosynthetic pathways that produce medicinally-
active
metabolites and specialty chemicals. These pathways require multiple proteins
(specifically, enzymes), entailing greater complexity than the single proteins
used as
drug targets. For example, genes encoding pathways of bacterial polyketide
synthases
(PKSs) were identified by screening gene libraries of the organism (Malpartida
et al.
1984, Nature 309:462; Donadio et al. 1991, Science 252:675-679). PKSs catalyze
multiple steps of the biosynthesis of polyketides, an important class of
therapeutic
compounds, and control the structural diversity of the polyketides produced. A
host-
vector system in Streptomyces has been developed that allows directed mutation
and
expression of cloned PKS genes (McDaniel et al. 1993, Science 262:1546-1550;
Kao
et al. 1994, Science 265:509-512). This specific host-vector system has been
used to
develop more efficient ways of producing polyketides, and to rationally
develop novel
polyketides (Khosla et al., WO 95/08548).
Another example is the production of the textile dye, indigo, by fermentation
in an E. coli host. Two operons containing the genes that encode the
multienzyme
biosynthetic pathway have been genetically manipulated to improve production
of
indigo by the foreign E. coli host (Ensley et al. 1983, Science 222:167-169;
Murdock
et al. 1993, Bio/Technology 11:381-386). Overall, conventional studies of
heterologous expression of genes encoding a metabolic pathway involve directed
cloning, sequence analysis, designed mutations, and rearrangement of specific
genes
that encode proteins known to be involved in previously characterized
metabolic
pathways.
In view of numerous advances in the understanding of disease mechanisms
and identification of drug targets, there is an increasing need for innovative
strategies
and methods for rapidly identifying lead compounds and channeling them toward
clinical testing. The methods of the present invention facilitate the rapid
discovery of
genes, gene pathways and gene clusters, particularly polyketide synthase
genes,
polyketide synthase gene pathways and polyketides, from gene expression
libraries.
Of particular interest are cellular "switches" known as receptors which
interact with a variety of biomolecules, such as hormones, growth factors, and
neurotransmitters, to mediate the transduction of an "external" cellular
signaling event
into an "internal" cellular signal. External signaling events include
the binding of a
ligand to the receptor, and internal events include the modulation of a
pathway in the
cytoplasm or nucleus involved in the growth, metabolism or apoptosis of the
cell.
Internal events also include the inhibition or activation of transcription of
certain
nucleic acid sequences, resulting in the increase or decrease in the
production or
presence of certain molecules (such as nucleic acid, proteins, and/or other
molecules
affected by this increase or decrease in transcription). Drugs to cure disease
or
alleviate its symptoms can activate or block any of these events to achieve a
desired
pharmaceutical effect.
Transduction can be accomplished by a transducing protein in the cell
membrane which is activated upon an allosteric change the receptor may undergo
upon binding to a specific biomolecule. The "active" transducing protein
activates
production of so-called "second messenger" molecules within the cell, which
then
activate certain regulatory proteins within the cell that regulate gene
expression or
alter some metabolic process. Variations on the theme of this "cascade" of
events
occur. For example, a receptor may act as its own transducing protein, or a
transducing protein may act directly on an intracellular target without
mediation by a
second messenger.
Signal transduction is a fundamental area of inquiry in biology. For instance,
ligand/receptor interactions and the receptor/effector coupling mediated by
Guanine
nucleotide-binding proteins (G-proteins) are of interest in the study of
disease. A
large number of G protein-linked receptors funnel extracellular signals as
diverse as
hormones, growth factors, neurotransmitters, primary sensory stimuli, and
other
signals through a set of G proteins to a small number of second-messenger
systems.
The G proteins act as molecular switches with an "on" and "off" state governed
by a
GTPase cycle. Mutations in G proteins may result in either constitutive
activation or
loss of expression mutations.
Many receptors convey messages through heterotrimeric G proteins, of which
at least 17 distinct forms have been isolated. Additionally, there are several
different
G protein-dependent effectors. The signals transduced through the
heterotrimeric G
proteins in mammalian cells influence intracellular events through the action
of
effector molecules.
Given the variety of functions subserved by G protein-coupled signal
transduction, it is not surprising that abnormalities in G protein-coupled
pathways can
lead to diseases with manifestations as dissimilar as blindness, hormone
resistance,
precocious puberty and neoplasia. G-protein-coupled receptors are extremely
important to drug research efforts. It is estimated that up to 60% of today's
prescription drugs work by somehow interacting with G protein-coupled
receptors.
However, these drugs were developed using classical medicinal chemistry and
without a knowledge of the molecular mechanism of action. A more efficient
drug
discovery program could be deployed by targeting individual receptors and
making
use of information on gene sequence and biological function to develop
effective
therapeutics. The present invention allows one to, for example, study
molecules
which affect the interaction of G proteins with receptors, or of ligands with
receptors.
Several groups have reported cells which express mammalian G proteins or
subunits thereof, along with mammalian receptors which interact with these
molecules. For example, WO 92/05244 (April 2, 1992) describes a transformed
yeast cell which is incapable of producing a yeast G protein α subunit, but
which has been
engineered to produce both a mammalian G protein α subunit and a mammalian
receptor which interacts with the subunit. The authors found that a modified
version
of a specific mammalian receptor integrated into the membrane of the cell, as
shown
by studies of the ability of isolated membranes to interact properly with
various
known agonists and antagonists of the receptor. Ligand binding resulted in G
protein-
mediated signal transduction.
Another group has described the functional expression of a mammalian
adenylyl cyclase in yeast, and the use of the engineered yeast cells in
identifying
potential inhibitors or activators of the mammalian adenylyl cyclase (WO
95/30012).
Adenylyl cyclase is among the best studied of the effector molecules which
function
in mammalian cells in response to activated G proteins. "Activators" of
adenylyl
cyclase cause the enzyme to become more active, elevating the cAMP signal of
the
yeast cell to a detectable degree. "Inhibitors" cause the cyclase to become
less active,
reducing the cAMP signal to a detectable degree. The method describes the use
of the
engineered yeast cells to screen for drugs which activate or inhibit adenylyl
cyclase
by their action on G protein-coupled receptors.
When attempting to identify genes encoding bioactivities of interest from
complex mixed population nucleic acid libraries, the rate limiting steps in
discovery
occur at both the DNA cloning level and at the screening level. Screening of
complex
mixed population libraries which contain, for example, 100s of different
organisms
requires the analysis of several million clones to cover this genomic
diversity. An
extremely high-throughput screening method has been developed to handle the
enormous numbers of clones present in these libraries.
In traditional flow cytometry, it is common to analyze very large numbers of
eukaryotic cells in a short period of time. Newly developed flow cytometers
can
analyze and sort up to 20,000 cells per second. In a typical flow cytometer,
individual
particles pass through an illumination zone and appropriate detectors, gated
electronically, measure the magnitude of a pulse representing the extent of
light
scattered. The magnitudes of these pulses are sorted electronically into "bins"
or
"channels", permitting the display of histograms of the number of cells
possessing a
certain quantitative property versus the channel number (Davey and Kell,
1996). It
was recognized early on that the data accruing from flow cytometric
measurements
could be analyzed (electronically) rapidly enough that electronic cell-sorting
procedures could be used to sort cells with desired properties into separate
"buckets",
a procedure usually known as fluorescence-activated cell sorting (Davey and
Kell,
1996).
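The channel-binning and sorting steps described above can be illustrated with a minimal sketch. It is not taken from the application or from any instrument's software: the function names, the number of channels, the full-scale pulse value and the gate bounds are all hypothetical, chosen only to show how pulse magnitudes map to channels, how a histogram of counts per channel is built, and how an electronic gate might divide events into two "buckets". The figure of 20,000 events per second is the sorting rate quoted above; the library size of three million clones is an illustrative stand-in for "several million".

```python
from collections import Counter


def channel_of(pulse, max_pulse, n_channels):
    """Map a pulse magnitude to an integer channel (bin) index.

    Assumes a linear scale from 0 to max_pulse; values at or above
    max_pulse are clamped into the last channel.
    """
    if pulse < 0:
        raise ValueError("pulse magnitude must be non-negative")
    idx = int(pulse / max_pulse * n_channels)
    return min(idx, n_channels - 1)


def histogram(pulses, max_pulse, n_channels):
    """Return the number of events falling in each channel."""
    counts = Counter(channel_of(p, max_pulse, n_channels) for p in pulses)
    return [counts.get(ch, 0) for ch in range(n_channels)]


def sort_into_buckets(pulses, max_pulse, n_channels, gate_low, gate_high):
    """Split events into (selected, rejected) using an inclusive channel gate."""
    selected, rejected = [], []
    for p in pulses:
        ch = channel_of(p, max_pulse, n_channels)
        (selected if gate_low <= ch <= gate_high else rejected).append(p)
    return selected, rejected


def seconds_to_screen(n_events, events_per_second=20_000):
    """Idealized time to pass n_events through the instrument at a fixed rate."""
    return n_events / events_per_second


if __name__ == "__main__":
    pulses = [0.5, 1.2, 3.9, 7.5, 9.99, 10.0]  # made-up pulse magnitudes
    print(histogram(pulses, max_pulse=10.0, n_channels=4))
    print(sort_into_buckets(pulses, 10.0, 4, gate_low=3, gate_high=3))
    print(seconds_to_screen(3_000_000))
```

Under these assumptions, a library of three million clones would pass the detector in roughly 150 seconds at the quoted rate, ignoring sample preparation, coincident events and re-sorting.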
Fluorescence-activated cell sorting has been primarily used in studies of
human and animal cell lines and the control of cell culture processes.
Fluorophore
labeling of cells and measurement of the fluorescence can give quantitative
data about
specific target molecules or subcellular components and their distribution in
the cell
population. Flow cytometry can quantitate virtually any cell-associated
property or
cell organelle for which there is a fluorescent probe (or natural
fluorescence). The
parameters which can be measured have previously been of particular interest
in
animal cell culture.
Flow cytometry has also been used in cloning and selection of variants from
existing cell clones. This selection, however, has required stains that
diffuse through
cells passively, rapidly and irreversibly, with no toxic effects or other
influences on
metabolic or physiological processes. Since, typically, flow sorting has been
used to
study animal cell culture performance, physiological state of cells, and the
cell cycle,
one goal of cell sorting has been to keep the cells viable during and after
sorting.
There currently are no reports in the literature of screening and discovery of
recombinant enzymes in E. coli expression libraries by fluorescence activated
cell
sorting of single cells. Furthermore there are no reports of recovering DNA
encoding
bioactivities screened by expression screening in E. coli using a FACS
machine. The
present invention provides these methods to allow the extremely rapid
screening of
viable or non-viable cells to recover desirable activities and the nucleic
acid encoding
those activities.
A limited number of papers describing various applications of flow cytometry
in the field of microbiology and sorting of fluorescence activated
microorganisms
have, however, been published (Davey and Kell, 1996). Fluorescence and other
forms
of staining have been employed for microbial discrimination and
identification, and in
the analysis of the interaction of drugs and antibiotics with microbial cells.
Flow
cytometry has been used in aquatic biology, where autofluorescence of
photosynthetic pigments is used in the identification of algae or DNA stains
are used to quantify and
quantify and
count marine populations (Davey and Kell, 1996). Thus, Diaper and Edwards used
flow cytometry to detect viable bacteria after staining with a range of
fluorogenic
esters including fluorescein diacetate (FDA) derivatives and CemChrome B, a
proprietary stain sold commercially for the detection of viable bacteria in
suspension
(Diaper and Edwards, 1994). Labeled antibodies and oligonucleotide probes have
also
been used for these purposes.
Papers have also been published describing the application of flow cytometry
to the detection of native and recombinant enzymatic activities in eukaryotes.
Betz et
al. studied native (non-recombinant) lipase production by the eukaryote,
Rhizopus
arrhizus with flow cytometry They found that spore suspensions of the mold
were
heterogeneous as judged by light-scattering data obtained with excitation at
633 nm,
and they sorted clones of the, subpopulations into the wells of microtiter
plates. After
germination and growth, lipase production was automatically assayed

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
11
(turbidimetrically) in the microtiter plates, and a representative set of the
most active
were reisolated, cultured, and assayed conventionally (Betz et al., 1984).
Srienc et al. have reported a flow cytometric method for detecting cloned
β-galactosidase activity in the eukaryotic organism S. cerevisiae. The ability of flow
cytometry to make measurements on single cells means that individual cells with high
levels of expression (e.g., due to gene amplification or higher plasmid copy number)
could be detected. In the method reported, a non-fluorescent compound
(β-naphthol-β-galactopyranoside) is cleaved by β-galactosidase and the liberated naphthol is
trapped to form an insoluble fluorescent product. The insolubility of the fluorescent
product is of great importance here to prevent its diffusion from the cell. Such
diffusion would not only lead to an underestimation of β-galactosidase activity in
highly active cells but could also lead to an overestimation of enzyme activity in
inactive cells or those with low activity, as they may take up the leaked fluorescent
compound, thus reducing the apparent heterogeneity of the population.
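The effect of product leakage described above can be illustrated with a toy calculation. The sketch below is not taken from the cited work. It assumes a hypothetical model in which every cell loses the same fraction of its fluorescent product (the illustrative parameter `leak_fraction`) and the leaked product is taken up equally by all cells in the population. Under those assumptions, highly active cells read low, inactive cells read high, and the spread of the population shrinks.

```python
from statistics import mean, pstdev


def apparent_signals(true_activity, leak_fraction):
    """Toy leakage model (an assumption, not a measured relationship).

    Each cell keeps (1 - leak_fraction) of its own fluorescent product and
    takes up an equal share of the product leaked by the whole population.
    """
    if not 0.0 <= leak_fraction <= 1.0:
        raise ValueError("leak_fraction must be between 0 and 1")
    shared = leak_fraction * mean(true_activity)
    return [(1.0 - leak_fraction) * a + shared for a in true_activity]


# Hypothetical population: three inactive cells and one highly active cell.
true_activity = [0.0, 0.0, 0.0, 100.0]
leaky = apparent_signals(true_activity, leak_fraction=0.5)

print("true signals:    ", true_activity)
print("apparent signals:", leaky)
print("spread (true vs apparent):", pstdev(true_activity), pstdev(leaky))
```

With an insoluble product, `leak_fraction` is effectively zero in this model and the apparent signals equal the true activities.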
One group has described the use of a FACS machine in an assay detecting
fusion proteins expressed from a specialized transducing bacteriophage in the
prokaryote Bacillus subtilis (Chung et al., J. of Bacteriology, Apr. 1994, p. 1977-1984;
Chung et al., Biotechnology and Bioengineering, Vol. 47, pp. 234-242 (1995)).
This group monitored the expression of a lacZ gene (which encodes β-galactosidase) fused
to the sporulation loci in B. subtilis (spo). The technique used to monitor β-galactosidase
expression from spo-lacZ fusions in single cells involved taking samples from a
sporulating culture, staining them with a commercially available fluorogenic substrate
for β-galactosidase called C8-FDG, and quantitatively analyzing fluorescence in
single cells by flow cytometry. In this study, the flow cytometer was used as a
detector to screen for the presence of the spo gene during the development of the
cells. The device was not used to screen and recover positive cells from a gene
expression library or nucleic acid for the purpose of discovery.
Another group has utilized flow cytometry to distinguish between the
developmental stages of the delta-proteobacterium Myxococcus xanthus (F. Russo-Marie
et al., PNAS, Vol. 90, pp. 8194-8198, September 1993). As in the previously
described study, this study employed the capabilities of the FACS machine to
detect
and distinguish genotypically identical cells in different development
regulatory
states. The screening of an enzymatic activity was used in this study as an
indirect
measure of developmental changes.
The lacZ gene from E. coli is often used as a reporter gene in studies of gene
expression regulation, such as those to determine promoter efficiency, the effects of
trans-acting factors, and the effects of other regulatory elements in bacterial, yeast,
and animal cells. Using a chromogenic substrate, such as ONPG
(o-nitrophenyl-β-D-galactopyranoside), one can measure expression of β-galactosidase in
cell cultures, but it is not possible to monitor expression in individual cells or to
analyze the heterogeneity of expression in cell populations. The use of fluorogenic
substrates, however, makes it possible to determine β-galactosidase activity in a large
number of individual cells by means of flow cytometry. This type of determination can be
more
informative with regard to the physiology of the cells, since gene expression
can be
correlated with the stage in the mitotic cycle or the viability under certain
conditions.
In 1994, Plovins et al. reported the use of fluorescein-di-β-D-galactopyranoside
(FDG) and C12-FDG as substrates for β-galactosidase detection in animal, bacterial,
and yeast cells. This study compared the two molecules as substrates for
β-galactosidase, and concluded that FDG is a better substrate for β-galactosidase
detection by flow cytometry in bacterial cells. The screening performed in
this study
was for the comparison of the two substrates. The detection capabilities of a
FACS
machine were employed to perform the study on viable bacterial cells.
Cells with chromogenic or fluorogenic substrates yield colored and
fluorescent products, respectively. Previously, it had been thought that the
flow
cytometry-fluorescence activated cell sorter approaches could be of benefit
only for
the analysis of cells that contain intracellularly, or are normally physically
associated

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
13
with, the enzymatic activity of small molecule of interest. On this basis, one
could
only use fluorogenic reagents which could penetrate the cell and which are
thus
potentially cytotoxic. To avoid clumping of heterogeneous cells, it is
desirable in
flow cytometry to analyze only individual cells, and this could limit the
sensitivity
and therefore the concentration of target molecules that can be sensed. Weaver
and his
colleagues at MIT and others have developed the use of gel microdroplets
containing
(physically) single cells which can take up nutrients, secrete products, and
grow to
form colonies. The diffusional properties of gel microdroplets may be made
such that
sufficient extracellular product remains associated with each individual gel
microdroplet, so as to permit flow cytometric analysis and cell sorting on the
basis of
concentration of secreted molecule within each microdroplet. Beads have also
been
used to isolate mutants growing at different rates, and to analyze antibody
secretion
by hybridoma cells and the nutrient sensitivity of hybridoma cells. The gel
microdroplet method has also been applied to the rapid analysis of
mycobacterial
growth and its inhibition by antibiotics.
The gel microdroplet technology has had significance in amplifying the
signals available in flow cytometric analysis, and in permitting the screening
of
microbial strains in strain improvement programs for biotechnology. Wittrup et al.
(Biotechnol. Bioeng. (1993) 42:351-356) developed a microencapsulation selection
method which allows the rapid and quantitative screening of more than 10^6 yeast cells
for enhanced secretion of Aspergillus awamori glucoamylase. The method provides a
400-fold single-pass enrichment for high-secretion mutants.
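The practical meaning of a 400-fold single-pass enrichment can be shown with a short calculation. The sketch below assumes that enrichment is defined as the ratio of the mutant fraction after sorting to the mutant fraction before sorting, and it uses a hypothetical starting frequency of one high-secretion mutant per 100,000 cells. Neither the definition nor the starting frequency is taken from the cited paper.

```python
def enriched_fraction(initial_fraction, enrichment_factor):
    """Fraction of target cells after one sorting pass.

    Assumes enrichment = (fraction after) / (fraction before); the result
    is capped at 1.0 because a fraction cannot exceed the whole population.
    """
    if not 0.0 <= initial_fraction <= 1.0:
        raise ValueError("initial_fraction must be between 0 and 1")
    return min(1.0, initial_fraction * enrichment_factor)


# Hypothetical starting point: 1 high-secretion mutant per 100,000 cells.
start = 1e-5
after_one_pass = enriched_fraction(start, 400)

print(f"before sorting: {start:.6f}")
print(f"after one pass: {after_one_pass:.6f}")
```

Under these assumptions a mutant that was one cell in 100,000 becomes roughly one cell in 250 after a single pass.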
Gel microdroplet or other related technologies can be used in the present
invention to localize as well as amplify signals in the high throughput
screening of
recombinant libraries. Cell viability during the screening is not an issue or
concern
since nucleic acid can be recovered from the microdroplet.
Different types of encapsulation strategies and compounds or polymers can be
used with the present invention. For instance, high temperature agaroses can
be
employed for making microdroplets stable at high temperatures, allowing stable
encapsulation of cells subsequent to heat kill steps utilized to remove all
background
activities when screening for thermostable bioactivities.
There are several hurdles which must be overcome when attempting to detect
and sort E. coli expressing recombinant enzymes, and recover encoding nucleic
acids.
FACS systems have typically been based on eukaryotic separations and have not been
refined to accurately sort single E. coli cells; the low forward and sideward scatter of
small particles like E. coli reduces the accuracy of sorting; and enzyme substrates
typically used in automated screening approaches, such as umbelliferyl-based
substrates, diffuse out of E. coli at rates which interfere with quantitation. Further,
recovery of very small amounts of DNA from sorted organisms can be
problematic.
The methods of the present invention address and overcome these hurdles with
the
novel screening approaches described herein.
There has been a dramatic increase in the need for bioactive compounds with
novel activities. This demand has arisen largely from changes in worldwide
demographics coupled with the clear and increasing trend in the number of
pathogenic
organisms that are resistant to currently available antibiotics as well as the
need for
new industrial processes for synthesis of compounds. For example, while there
has
been a surge in demand for antibacterial drugs in emerging nations with young
populations, countries with aging populations, such as the U.S., require a
growing
repertoire of drugs against cancer, diabetes, arthritis and other debilitating
conditions.
The death rate from infectious diseases increased 58% between 1980 and 1992,
and it has been estimated that the emergence of antibiotic resistant microbes has
added in excess of $30 billion annually to the cost of health care in the U.S. alone
(Adams et al., Chemical and Engineering News, 1995; Amann et al.,
Microbiological
Reviews, 59, 1995). As a response to this trend, pharmaceutical companies have
significantly increased their screening of microbial diversity for compounds
with
unique activities or specificities.
The majority of bioactive compounds currently in use are derived from soil
microorganisms. Many microbes inhabiting soils and other complex ecological
communities produce a variety of compounds that increase their ability to
survive and
proliferate. These compounds are generally thought to be nonessential for
growth of
the organism and are synthesized with the aid of genes involved in
intermediary
metabolism. Such secondary metabolites that influence the growth or survival
of
other organisms are known as "bioactive" compounds and serve as key components
of
the chemical defense arsenal of both micro- and macroorganisms. Humans have
exploited these compounds for use as antibiotics, antiinfectives and other
bioactive
compounds with activity against a broad range of prokaryotic and eukaryotic
pathogens (Barnes et al., Proc. Natl. Acad. Sci. U.S.A., 91, 1994).
The approach currently used to screen microbes for new bioactive compounds
has been largely unchanged since the inception of the field. New isolates of
bacteria,
particularly gram positive strains from soil environments, are collected and
their
metabolites tested for pharmacological activity.
There is still tremendous biodiversity that remains untapped as the source of
lead compounds. However, the currently available methods for screening and
producing lead compounds cannot be applied efficiently to these under-explored
resources. For instance, it is estimated that at least 99% of marine bacteria
species do
not survive on laboratory media, and commercially available fermentation
equipment
is not optimal for use in the conditions under which these species will grow,
hence
these organisms are difficult or impossible to culture for screening or re-supply.
Recollection, growth, strain improvement, media improvement and scale-up
production of the drug-producing organisms often pose problems for synthesis
and
development of lead compounds. Furthermore, the need for the interaction of
specific
organisms to synthesize some compounds makes their use in discovery extremely
difficult. New methods to harness the genetic resources and chemical diversity
of
these untapped sources of compounds for use in drug discovery are very
valuable.
A central core of modern biology is that genetic information resides in a
nucleic
acid genome, and that the information embodied in such a genome (i.e., the
genotype)
directs cell function. This occurs through the expression of various genes in
the genome
of an organism and regulation of the expression of such genes. The expression
of genes
in a cell or organism defines the cell or organism's physical characteristics
(i.e., its
phenotype). This is accomplished through the translation of genes into
proteins.
Determining the biological activity of a protein obtained from an
environmental sample
can provide valuable information about the role of proteins in the
environments. In
addition, such information can help in the development of biologics,
diagnostics,
therapeutics, and compositions for industrial applications.
Accordingly, the present invention provides methods and compositions to
access this untapped biodiversity and to rapidly screen for polynucleotides,
proteins
and small molecules of interest utilizing high throughput screening of
multiple
samples. These biomolecules can be derived from cultured or uncultured samples
of
organisms. In one embodiment, the present invention provides a
method for high throughput cultivation of unculturable microorganisms.
In the United States, cancer is the second leading cause of disease-related
deaths, second only to cardiovascular disease and it is projected to become
the leading
cause of death within a few years. The most common curative therapies for
cancers
found at an early stage include surgery and radiation (1). These methods are
not
nearly as successful in the more advanced stages of cancer. Current
chemotherapeutic
agents have been useful but are limited in their effectiveness. Significant
results are
obtained with chemotherapy in a small range of cancers including childhood
cancers
and certain adult malignancies such as lymphoma and leukemia (2). Despite
these
positive results, most chemotherapeutic treatments are not curative and serve
primarily as palliatives (1). Thus, it is clear that current medical science
still has a
long way to go before providing long-term survival to patients and curability
of most
cancers. However, basic research over the past 20 years has provided a vast
amount
of scientific information defining key players in the progression of cancers.
Understanding the disease processes at the molecular level provides the means
to
determine optimal molecular targets and presumably selectively kill cancerous
tissues.
Some of the key areas that have been identified in the progression of tumors
include
proliferative signal transduction, aberrant cell-cycle regulation, apoptosis,
telomere
biology, genetic instability and angiogenesis (3). This basic research is now
beginning to pay off as progress towards more effective treatments is
beginning to
emerge (4,5). New chemotherapeutic agents directed against these identified
areas
are in Phase I-III clinical trials with some of the most promising agents
active against
tyrosine kinases involved in signal transduction. Small molecule inhibitors of
Bcr-
abl, protein kinase C, VEGF receptors, and EGF receptors, to name a few, are
all in
clinical trials (4). Some specific examples include the EGF receptor
inhibitors,
ZD1839 and CP358774, which are in Phase II trials and appear to be well
tolerated by
patients with positive signs of clinical activity (6). Even with this
progress, the
complexities of tumorigenesis necessitate not only the ongoing discovery and
development of novel therapeutic agents but also the basic research to
elucidate the
underlying mechanisms of the disease. Presently, there are at least 50 known
cancer
related targets and it has been speculated that there may be up to several
hundred new
targets discovered (2). To make use of this influx of information, novel
methods for
the ultra high throughput screening of potential anti-cancer drugs must be
developed.
Recent technological developments in molecular biology, automation,
miniaturization, and information technology have facilitated the high
throughput
screening of novel compounds from a variety of sources. However, despite the
increased throughput, there is some disappointment in the industry regarding
the
number of novel drugs that have resulted from these efforts (7). One of the
significant
challenges is to find sufficient numbers of compounds with the structural
diversity
necessary to increase the chances of finding activity at the molecular target.
Currently, screened compounds come from chemical and combinatorial libraries,
historical compound collections and natural product libraries (8). Of these,
one of the
richest sources of drugs has been from natural product libraries. Cragg et al. (9)
reported that over 60% of the approved anticancer drugs and pre-NDA candidates
between 1984 and 1995 were from natural sources or derived from natural
products.
In fact, it is estimated that 39% of all 520 new approved drugs during this
time period
were from or derived from natural products with 80% of anti-infectives coming
from
nature. Typically, natural products are small molecules that have a much
greater
structural diversity than most combinatorial approaches. Small molecules in
general
are favored by the pharmaceutical industry because they are more "drug-like"
in
nature with the ability to penetrate tumors, be absorbed, and metabolized
easily.
However, natural products have their disadvantages, largely due to the
reproducibility
of the source, the labor-intensive extraction process, the abundance of the
supply, and
the concerns over rights to biodiversity (8).
The therapeutic agents from natural sources have been primarily of plant and
microbial origins. Of these, the greatest biodiversity exists in the
microorganisms that
populate virtually every corner of the earth. The approach currently used to
screen
microbes for new bioactive compounds has changed little over the last 50
years.
Microbiologists collect samples from the environment, isolate a pure culture,
grow up
sufficient material, extract the culture, and test their metabolites for
pharmacological
activity. Variations of these natural products can then be generated through
mutagenesis of the producing organism or through chemical or biochemical
modification of the original backbone molecules. Natural products are
typically made
by multi-enzyme systems in which each enzyme carries out one of the many
transformations required to make the final small molecule products, an example
being
antibiotics. These bioactive molecules are derived from the organism's ability
to
produce secondary metabolites in response to the specific needs and challenges
of
their local environments. The genes encoding these enzymes are often clustered
into
so-called "biosynthetic operons" which contain the blueprint for building a
natural
product (10). This blueprint for production of a small bioactive molecule is
typically
more than 25,000 nucleotides and can be greater than 100,000 nucleotides.
There are
many examples of entire pathways encoding for the production of such small
molecules as oxytetracycline, jadomycin, daunorubicin, to name just a few,
that have
been cloned as contiguous pieces of DNA from a producing organism (11). Some
of
these pathways (e.g. actinorhodin, tetracenomycin, puromycin, nikkomycin) have
been transferred to other microbial hosts and the small molecule
heterologously
expressed (11).
A more recent approach has been to use recombinant techniques to synthesize
hybrid antibiotic pathways by combining gene subunits from previously
characterized
pathways. This approach, called "combinatorial biosynthesis" has been focused
primarily on the polyketide antibiotics and has resulted in a number of
compounds
which have displayed activity (12,13). In one such approach using the
erythronolide
biosynthetic operon, enzymatic domains have been added to (14) and
repositioned
within the operon (15), thereby reprogramming polyketide biosynthesis.
However,
compounds with novel antibiotic activities have not yet been reported, an observation
that may be due to the fact that the pathway subunits are derived from those
encoding previously characterized compounds. What has not been accounted for
in
previous attempts to discover novel bioactive compounds is the relatively
recent
observation that only a small fraction of microbes in natural environments can
be
grown under laboratory conditions. Estimates are that far less than 1% of all
prokaryotes are capable of being grown in pure culture in the laboratory. This
implies
a need for culture-independent methods for bioactive compound discovery.
Culture-independent approaches to directly clone genes encoding both target
enzymes
and other bioactive molecules from environmental samples are based on the
construction of libraries which represent the collective genomes of naturally
occurring organisms, archived in cloning vectors that can be propagated in E.
coli,
Streptomyces, or other suitable hosts. Because the cloned DNA is initially
extracted
directly from environmental samples containing a mixed population of
organisms, the
representation of the libraries is not limited to the small fraction of
prokaryotes that
can be grown in pure culture, nor is it biased towards a few rapidly growing
species.
Samples can be obtained from virtually all ecosystems represented on earth,
including
such extreme environments as geothermal and hydrothermal vents, acidic soils
and
boiling mud pots, contaminated industrial sites, marine symbionts, etc.
Screening of complex mixed population libraries containing, for example, 100
different organisms requires the analysis of tens of millions of clones to
cover the
genomic diversity. An extremely high throughput screening method must be
implemented to handle the enormous numbers of clones present in these
libraries. In
the pharmaceutical industry today, high throughput screening typically has
throughput
rates on the order of 10,000 compounds per assay per day with some
laboratories
working at 100,000 assays per day. Most of the development in the industry has
centered around the miniaturization and automation of these screens to higher
density,
smaller volume plate formats. However, this strategy could be reaching the
practical
limits of conventional liquid-dispensing technology and current microplate
fabrication
processes, as well as the limits in controlling evaporation in open systems
with very
small well volumes.
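The throughput figures quoted above can be put in perspective with simple arithmetic. The sketch below assumes a hypothetical library of 50 million clones (one reading of "tens of millions") and treats each clone as one assay. The function name and the library size are illustrative only.

```python
import math


def days_to_screen(n_clones, assays_per_day):
    """Whole days needed to screen n_clones at a fixed daily assay rate."""
    if assays_per_day <= 0:
        raise ValueError("assays_per_day must be positive")
    return math.ceil(n_clones / assays_per_day)


# Hypothetical library size; the text only says "tens of millions of clones".
library_size = 50_000_000

typical_days = days_to_screen(library_size, 10_000)   # typical rate quoted above
fast_days = days_to_screen(library_size, 100_000)     # faster laboratories

print(f"at 10,000 assays/day:  {typical_days} days")
print(f"at 100,000 assays/day: {fast_days} days")
```

Even at the higher rate, a single pass through such a library would take well over a year under these assumptions, which is the gap that motivates an extremely high throughput method.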
Current platforms for screening micro-scale particles of interest include
plates
that are formed with small wells, or through-holes. The wells or through-holes
are
used to hold a sample to be analyzed. The sample typically contains the
particles of
interest. When wells are used, complex and inefficient sample delivery and
extraction
systems must be used in order to deposit the sample into the wells on the
plate, and
remove the sample from the wells for further analysis. Well-based platforms have a
bottom, so gravity is primarily relied upon to hold the sample on the plate while
the particles develop or the cells of interest incubate.
Another type of platform uses through-holes, which are typically machined
into a plate by one of a number of well-known methods. Through-holes rely on
capillary forces for introducing the sample to the plate, and utilize surface
tension for
suspending the sample in the through-holes. However, typical through-hole-based
devices are limited to relatively small aspect ratios (the ratio of length to internal
diameter of the hole). A small aspect ratio yields greater evaporative loss of a liquid
contained in the hole, and such evaporation is difficult to control. Through-holes are
also limited in their functionality. For example, the process of forming
through-holes
in a plate usually does not allow for the use of various materials to line the
inside of
the holes, or to clad the outside of the holes.
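The link between aspect ratio and evaporative loss can be sketched geometrically. The example below assumes a straight cylindrical hole that is completely filled and open at both ends, and it uses the exposed liquid area per unit volume as a rough proxy for relative evaporative loss. That proxy and the dimensions are assumptions made for illustration, not measurements from the text.

```python
import math


def aspect_ratio(length_mm, inner_diameter_mm):
    """Aspect ratio as defined above: length divided by internal diameter."""
    return length_mm / inner_diameter_mm


def exposed_area_per_volume(length_mm, inner_diameter_mm):
    """Open-end area (both ends) per unit liquid volume, in 1/mm.

    Assumes a filled cylinder open at both ends; used here only as a
    rough proxy for relative evaporative loss.
    """
    end_area = math.pi * (inner_diameter_mm / 2.0) ** 2
    volume = end_area * length_mm
    return 2.0 * end_area / volume


# Hypothetical holes with the same 0.2 mm internal diameter.
short_hole = (1.0, 0.2)    # length, diameter -> aspect ratio of about 5
long_hole = (10.0, 0.2)    # length, diameter -> aspect ratio of about 50

for name, (length, diameter) in (("short", short_hole), ("long", long_hole)):
    print(name, aspect_ratio(length, diameter),
          exposed_area_per_volume(length, diameter))
```

For a fixed diameter, the longer hole exposes less liquid surface per unit volume, which is consistent with the statement that small aspect ratios suffer greater evaporative loss.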
SUMMARY OF THE INVENTION
The present invention comprises methods for high throughput screening for
biomolecules of interest. In the present invention, nucleic acids or nucleic
acid libraries
derived from mixed populations of nucleic acids and/or organisms are screened very
rapidly for bioactivities of interest utilizing liquid phase screening methods. These
libraries can represent the genomes of multiple organisms, species or
subspecies. In one
aspect, the libraries are screened via hybridization methods, such as
"biopanning", or by
activity based screening methods. High throughput screening can be performed
by
utilizing single cell screening systems, such as fluorescence activated cell
sorting
(FACS) or by capillary array-based systems.
Accordingly, in one embodiment, the present invention provides a process for
identifying clones having a specified activity of interest, which process
comprises (i)
generating one or more gene libraries derived from nucleic acid isolated from
a mixed
population of organisms; and (ii) screening said libraries utilizing a high
throughput cell
analyzer, e.g., a fluorescence activated cell sorter or a non-optical cell
sorter, to identify
said clones.
More particularly, the invention provides a process for identifying clones
having
a specified activity of interest by (i) generating one or more libraries,
e.g., expression
libraries, made to contain nucleic acid directly or indirectly isolated from a
mixed
population of organisms; (ii) exposing said libraries to a particular
substrate or
substrates of interest; and (iii) screening said exposed libraries utilizing a
high
throughput cell analyzer, e.g., a fluorescence activated cell sorter or a non-
optical cell
sorter, to identify clones which react with the substrate or substrates.
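Step (iii) of the process above can be pictured as a gate applied to per-clone signal. The sketch below is a minimal illustration only: it assumes that a fluorescence reading is available for each clone after exposure to the substrate and that a single fixed threshold separates positive from negative clones. The function name, the clone identifiers, and the threshold are hypothetical and do not represent any instrument's software.

```python
def select_positive_clones(fluorescence_by_clone, threshold):
    """Return IDs of clones whose signal exceeds the threshold, brightest first."""
    hits = [
        (signal, clone_id)
        for clone_id, signal in fluorescence_by_clone.items()
        if signal > threshold
    ]
    hits.sort(reverse=True)
    return [clone_id for _, clone_id in hits]


# Hypothetical readings (arbitrary fluorescence units) after substrate exposure.
readings = {
    "clone_a": 12.0,
    "clone_b": 850.0,
    "clone_c": 40.0,
    "clone_d": 1300.0,
}

positives = select_positive_clones(readings, threshold=500.0)
print(positives)
```

In practice a sorter applies such a gate to each event as it passes the detector, and the nucleic acid of the selected clones is then recovered for analysis.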
In another aspect, the invention also provides a process for identifying
clones
having a specified activity of interest by (i) generating one or more gene
libraries derived
from nucleic acid directly or indirectly isolated from a mixed population of organisms;
and (ii) screening said exposed libraries utilizing an assay requiring a
binding event or
the covalent modification of a target, and a high throughput cell analyzer,
e.g., a
fluorescence activated cell sorter or non-optical cell sorter, to identify
positive clones.
The invention further provides a method of screening for an agent that
modulates
the activity of a target protein or other cell component (e.g., nucleic acid),
wherein the
target and a selectable marker are expressed by a recombinant cell, by co-
encapsulating
the agent in a microenvironment with the recombinant cell expressing the
target and
detectable marker and detecting the effect of the agent on the activity of the
target cell
component.
In another embodiment, the invention provides a method for enriching for
target
DNA sequences containing at least a partial coding region for at least one
specified
activity in a DNA sample by co-encapsulating a mixture of target DNA obtained
from a
mixture of organisms with a mixture of DNA probes including a detectable
marker and
at least a portion of a DNA sequence encoding at least one enzyme having a
specified
enzyme activity and a detectable marker; incubating the co-encapsulated
mixture under
such conditions and for such time as to allow hybridization of complementary
sequences
and screening for the target DNA. Optionally the method further comprises
transforming host cells with recovered target DNA to produce an expression
library of a
plurality of clones.
The invention further provides a method of screening for an agent that
modulates
the interaction of a first test protein linked to a DNA binding moiety and a
second test
protein linked to a transcriptional activation moiety by co-encapsulating the
agent with
the first test protein and second test protein in a suitable microenvironment
and
determining the ability of the agent to modulate the interaction of the first
test protein
linked to a DNA binding moiety with the second test protein covalently linked
to a
transcriptional activation moiety, wherein the agent enhances or inhibits the
expression
of a detectable protein.
In yet another aspect, the present invention provides a method for identifying
a
polynucleotide in a liquid phase, including contacting a plurality of
polynucleotides
derived from at least one organism, e.g., a mixed population of organisms,
including
microorganisms or plant tissue, with at least one nucleic acid probe under
conditions
that allow hybridization of the probe to the polynucleotides having
complementary
sequences, wherein the probe is labeled with a detectable molecule (e.g., a
fluorescent,
magnetic or other molecule). The detectable molecule changes, e.g.,
fluoresces, upon
interaction of the probe to a target polynucleotide in the library. Clones
from the library
are then separated with an analyzer that detects the change in the detectable
molecule,
e.g., fluorescence, magnetic field or dielectric signature. The detectable
molecule may
also be a bioluminescent molecule, a chemiluminescent molecule, a colorimetric
molecule, an electromagnetic molecule, an isotopic molecule, a thermal
molecule or
an enzymatic substrate. The separated clones can be contacted with a reporter
system
that identifies a polynucleotide encoding a polypeptide or a small molecule of
interest,
for example, and the clones capable of modulating expression or activity of the reporter
system are identified, thereby identifying a polynucleotide of interest. The liquid phase
of this embodiment includes a solution (cell-free), the interior of a cell, or a non-solid
phase.
In another embodiment, the invention provides a method for identifying a
polynucleotide encoding a polypeptide of interest. The method includes co-
encapsulating in a microenvironment a plurality of library clones containing
DNA
obtained from a mixed population of organisms with a mixture of
oligonucleotide probes
comprising a detectable marker and at least a portion of a polynucleotide
sequence
encoding a polypeptide of interest having a specified bioactivity. The
encapsulated
clones are incubated under such conditions and for such time as to allow interaction of
complementary sequences, and clones containing a complement to the oligonucleotide
probe encoding the polypeptide of interest are identified by separating clones with a
fluorescent analyzer or non-optical analyzer that detects the detectable marker.
In yet another embodiment, the invention provides a method for high throughput
screening of a polynucleotide library for a polynucleotide of interest that
encodes a
molecule of interest. The method includes contacting a library containing a
plurality of
clones comprising polynucleotides derived from a mixed population of organisms
with a
plurality of oligonucleotide probes labeled with a detectable molecule wherein
said
detectable molecule becomes detectable upon interaction of the probe to a
target
polynucleotide in the library; separating clones with an analyzer that detects
the
detectable marker; contacting the separated clones with a reporter system that
identifies a
polynucleotide encoding the molecule of interest; and identifying clones
capable of
modulating expression or activity of the reporter system thereby identifying a
polynucleotide of interest.
In another embodiment, the invention provides a method of screening for a
polynucleotide encoding an activity of interest. The method includes (a)
obtaining
polynucleotides from a sample containing a mixed population of organisms; (b)
normalizing the polynucleotides obtained from the sample; (c) generating a
library from
the normalized polynucleotides; (d) contacting the library with a plurality of
oligonucleotide probes comprising a detectable marker and at least a portion
of a
polynucleotide sequence encoding a polypeptide of interest having a specified
activity to
select library clones positive for a sequence of interest; (e) selecting
clones with an
analyzer (e.g. a fluorescent or non-optical analyzer) that detects the marker;
(f)
contacting the selected clones with a reporter system that identifies a
polynucleotide
encoding the activity of interest; and (g) identifying clones capable of
modulating
expression or activity of the reporter system thereby identifying a
polynucleotide of
interest; wherein the positive clones contain a polynucleotide sequence
encoding an
activity of interest which is capable of catalyzing the bioactive substrate.
In yet another embodiment, the present invention provides a method for
screening polynucleotides, comprising contacting a library of polynucleotides
derived
from a mixed population of organisms with a probe oligonucleotide labeled with
a
detectable molecule, which is detectable upon binding of the probe to a target
polynucleotide of the library, to select library polynucleotides positive for
a sequence of
interest; separating library members that are positive for the sequence of
interest with an
analyzer that detects the molecule; expressing the selected polynucleotides to
obtain
polypeptides; contacting the polypeptides with a reporter system; and
identifying
polynucleotides encoding polypeptides capable of modulating expression or
activity of
the reporter system.
In another embodiment, the invention provides a method for obtaining an
organism from
a mixed population of organisms in a sample. The method includes encapsulating
in a
microenvironment at least one organism from the sample; incubating the
encapsulated
organism under such conditions and for such a time to allow the at least one
microorganism to grow or proliferate; and sorting the encapsulated organism by
flow
cytometry to obtain an organism from the sample.
In another embodiment, the invention provides a method for identifying a
polynucleotide
in a liquid phase comprising:
a) contacting a plurality of polynucleotides derived from at least
one organism with at least one nucleic acid probe under conditions that allow
hybridization of the probe to the polynucleotides having complementary
sequences,
wherein the probe is labeled with a detectable molecule; and
b) identifying a polynucleotide of interest with an analyzer that
detects the detectable molecule.
According to another embodiment of the invention, a sample screening
apparatus includes a plurality of capillaries formed into an array of adjacent
capillaries, wherein each capillary comprises at least one wall defining a
lumen for
retaining a sample. The apparatus further includes interstitial material
disposed
between adjacent capillaries in the array, and one or more reference indicia
formed
within the interstitial material.
According to another embodiment of the invention, a capillary for screening a
sample, wherein the capillary is adapted for being bound in an array of
capillaries,
includes a first wall defining a lumen for retaining the sample, and a second
wall
formed of a filtering material, for filtering excitation energy provided to the
lumen to
excite the sample.
According to yet another embodiment of the invention, a method for
incubating a bioactivity or biomolecule of interest includes the steps of
introducing a
first component into at least a portion of a capillary of a capillary array,
wherein each
capillary of the capillary array comprises at least one wall defining a lumen
for
retaining the first component, and introducing an air bubble into the
capillary behind
the first component. The method further includes the step of introducing a
second
component into the capillary, wherein the second component is separated from
the
first component by the air bubble.
In yet another embodiment of the invention, a method of incubating a sample
of interest includes introducing a first liquid labeled with a detectable
particle into a
capillary of a capillary array, wherein each capillary of the capillary array
comprises
at least one wall defining a lumen for retaining the first liquid and the
detectable
particle, and wherein the at least one wall is coated with a binding material
for
binding the detectable particle to the at least one wall. The method further
includes
removing the first liquid from the capillary tube, wherein the bound
detectable
particle is maintained within the capillary, and introducing a second liquid
into the
capillary tube.
Another embodiment of the invention includes a recovery apparatus for a
sample screening system, wherein the system includes a plurality of
capillaries
formed into an array. The recovery apparatus includes a recovery tool adapted
to
contact at least one capillary of the capillary array and recover a sample
from the at
least one capillary. The recovery apparatus further includes an ejector,
connected
with the recovery tool, for ejecting the recovered sample from the recovery
tool.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 illustrates the protocol used in the cell sorting method of the
invention
to screen for a polynucleotide of interest, in this case using a library
excised into E. coli. The clones of interest are isolated by sorting.
Figure 2 shows a microtiter plate where clones or cells are sorted in
accordance
with the invention. Typically one cell or cells grown within a microdroplet
are dispersed
per well and grown up as clones.
Figure 3 depicts a co-encapsulation assay. Cells containing library clones are
coencapsulated with a substrate or labeled oligonucleotide. Encapsulation can
occur in a
variety of means, including GMDs, liposomes, and ghost cells. Cells are
screened via
high throughput screening on a fluorescence analyzer.
Figure 4 depicts a side scatter versus forward scatter graph of FACS sorted
gel-
microdroplets (GMDs) containing a species of Streptomyces which forms
unicells.
Empty gel-microdroplets are distinguished from free cells and debris, also.
Figure 5 is a depiction of a FACS/Biopanning method described herein and
described in Example 3, below.
Figure 6A shows an example of dimensions of a capillary array of the
invention.
Figure 6B illustrates an array of capillary arrays.
Figure 7 shows a top cross-sectional view of a capillary array.
Figure 8 is a schematic depicting the excitation of and emission from a sample
within the capillary lumen according to one embodiment of the invention.
Figure 9 is a schematic depicting the filtering of excitation and emission
light
to and from a sample within the capillary lumen according to an alternative
embodiment of the invention.
Figure 10 illustrates an embodiment of the invention in which a capillary
array
is wicked by contacting a sample containing cells, and humidified in a
humidified
incubator followed by imaging and recovery of cells in the capillary array.
Figure 11 illustrates a method for incubating a sample in a capillary tube by
an
evaporative and capillary wicking cycle.
Figure 12A shows a portion of a surface of a capillary array on which
condensation has formed.
Figure 12B shows the portion of the surface of the capillary array, depicted
in
Figure 12A, in which the surface is coated with a hydrophobic layer to inhibit
condensation near an end of individual capillaries.
Figures 13A-C depict a method of retaining at least two components within a
capillary.
Figure 14A depicts capillary tubes containing paramagnetic beads and cells.
Figure 14B depicts the use of the paramagnetic beads to stir a sample in a
capillary tube.
Figure 15 depicts an excitation apparatus for a detection system according to
an embodiment of the invention.
Figure 16 illustrates a system for screening samples using a capillary array
according to an embodiment of the invention.
Figure 17A illustrates one example of a recovery technique useful for
recovering a sample from a capillary array. In this depiction a needle is
contacted
with a capillary containing a sample to be obtained. A vacuum is created to
evacuate
the sample from the capillary tube and onto a filter.
Figure 17B illustrates one sample recovery method in which the recovery
device has an outer diameter greater than the inner diameter of the capillary
from
which a sample is being recovered.
Figure 17C illustrates another sample recovery method in which the recovery
device has an outer diameter approximately equal to or less than the inner
diameter of
the capillary.
Figure 17D shows the further processing of the sample once evacuated from
the capillary.
Figure 18 is a schematic showing high throughput enrichment of low copy
gene targets.
Figure 19 is a schematic of FACS-Biopanning using high throughput
culturing. Polyketide synthase sequences from environmental samples are shown
in
the alignment.
Figure 20 shows whole cell hybridization for biopanning.
Figure 21 is a schematic showing co-encapsulation of a eukaryotic cell and a
bacterial cell.
Figure 22 shows a whole cell hybridization schematic for biopanning and
FACS sorting.
Figure 23 shows a schematic of T7 RNA Polymerase Expression system.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a method for rapid sorting and screening of
libraries derived from a mixed population of organisms from, for example, an
environmental sample or an uncultivated population of organisms. In one
embodiment,
gene libraries are generated, clones are either exposed to a substrate or
substrates of
interest, or hybridized to a fluorescence labeled probe having a sequence
corresponding
to a sequence of interest and positive clones are identified and isolated via
fluorescence
activated cell sorting. Cells can be viable or non-viable during the process
or at the end
of the process, as nucleic acids encoding a positive activity can be isolated
and cloned
utilizing techniques well known in the art.
This invention differs from fluorescence activated cell sorting, as normally
performed, in several aspects. Previously, FACS machines have been employed in
studies focused on the analyses of eukaryotic and prokaryotic cell lines and
cell culture
processes. FACS has also been utilized to monitor production of foreign
proteins in both
eukaryotes and prokaryotes to study, for example, differential gene
expression. The
detection and counting capabilities of the FACS system have been applied in
these
examples. However, FACS has never previously been employed in a discovery
process
to screen for and recover bioactivities in prokaryotes. In addition, non-
optical methods
have not been used to identify or discover novel bioactivities or
biomolecules.
Furthermore, the present invention does not require cells to survive, as do
previously
described technologies, since the desired nucleic acid (recombinant clones)
can be
obtained from alive or dead cells. For example, the cells only need to be
viable long
enough to contain, carry or synthesize a complementary nucleic acid sequence
to be
detected, and can thereafter be either viable or non-viable cells so long as
the
complementary sequence remains intact. The present invention also solves
problems
that would have been associated with detection and sorting of E. coli
expressing
recombinant enzymes, and recovering encoding nucleic acids. The invention
includes
within its embodiments apparatus capable of detecting a molecule or marker
that is
indicative of a bioactivity or biomolecule of interest, including optical and
non-optical
apparatus. In one embodiment, the present invention includes within its
embodiments
any apparatus capable of detecting fluorescent wavelengths associated with
biological
material; such apparatuses are defined herein as fluorescent analyzers (one
example of
which is a FACS apparatus).
The use of a culture-independent approach to directly clone genes encoding
novel enzymes from, for example, an environmental sample containing a mixed
population of organisms allows one to access untapped resources of
biodiversity. In one
embodiment, the invention is based on the construction of "mixed population
libraries"
which represent the collective genomes of naturally occurring organisms
archived in
cloning vectors that can be propagated in suitable prokaryotic hosts. Because
the cloned
DNA is initially extracted directly from environmental samples, the libraries
are not
limited to the small fraction of prokaryotes that can be grown in pure
culture.
Additionally, a normalization of the DNA present in these samples could allow
more
equal representation of the DNA from all of the species present in the
original sample.
This can increase the efficiency of finding interesting genes from minor
constituents of
the sample which may be under-represented by several orders of magnitude
compared to
the dominant species.
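The effect of normalization on rare constituents can be illustrated with a short calculation. This is only a sketch under stated assumptions: the abundance values are hypothetical, clones are assumed to be sampled independently, and normalization is idealized as giving every species an equal share of the library. The function name `clones_needed` is illustrative and does not come from the specification.

```python
import math


def clones_needed(fraction: float, confidence: float = 0.95) -> int:
    """Number of independently sampled clones needed so that at least one
    comes from a source making up `fraction` of the library, with
    probability `confidence`."""
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no hit in n draws) = (1 - fraction) ** n <= 1 - confidence
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - fraction))


# Hypothetical sample: a minor constituent three orders of magnitude
# less abundant than the dominant species.
abundance = {"dominant": 1000.0, "minor": 1.0}
raw_fraction = abundance["minor"] / sum(abundance.values())
normalized_fraction = 1.0 / len(abundance)  # idealized equal representation

print("clones needed before normalization:", clones_needed(raw_fraction))
print("clones needed after normalization: ", clones_needed(normalized_fraction))
```

Under these assumptions the number of clones that must be examined to encounter the minor constituent drops by roughly the same factor by which it was under-represented.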
Prior to the present invention, the evaluation of complex mixed population
expression libraries was rate limiting. The present invention allows the rapid
screening
of complex mixed population libraries, containing, for example, genes from
thousands of
different organisms. The benefits of the present invention can be seen, for
example, in
screening a complex mixed population sample. Screening of a complex sample
previously required one to use labor intensive methods to screen several
million clones
to cover the genomic biodiversity. The invention represents an extremely high-
throughput screening method which allows one to assess this enormous number of
clones. The method disclosed herein allows the screening of anywhere from about
30
million to about 200 million clones per hour for a desired nucleic acid
sequence or
biological activity. This allows the thorough screening of mixed population
libraries for
clones expressing novel biomolecules.
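The stated throughput range can be turned into a rough screening-time estimate. The sketch below uses the rates given above (about 30 million to about 200 million clones per hour); the library size of 5 million clones is a hypothetical stand-in for "several million clones", and the function name is illustrative.

```python
def screening_hours(n_clones: float, clones_per_hour: float) -> float:
    """Hours needed to screen n_clones at a constant rate."""
    if clones_per_hour <= 0:
        raise ValueError("clones_per_hour must be positive")
    return n_clones / clones_per_hour


LOW_RATE = 30e6    # clones per hour, lower bound stated in the text
HIGH_RATE = 200e6  # clones per hour, upper bound stated in the text
library_size = 5e6  # hypothetical library of "several million clones"

for rate in (LOW_RATE, HIGH_RATE):
    minutes = screening_hours(library_size, rate) * 60
    print(f"{rate:.0e} clones/hour: {minutes:.1f} minutes")
```

The estimate ignores sample preparation and instrument setup, which the text does not quantify.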
The invention provides methods and compositions whereby one can screen, sort
or identify a polynucleotide sequence, polypeptide, or molecule of interest
from a mixed
population of organisms (e.g., organisms present in a mixed population sample)
based on
polynucleotide sequences present in the sample. Thus, the invention provides
methods
and compositions useful in screening organisms for a desired biological
activity or
biological sequence and to assist in obtaining sequences of interest that can
further be
used in directed evolution, molecular biology, biotechnology and industrial
applications.
By screening and identifying the nucleic acid sequences present in the sample,
the
invention increases the repertoire of available sequences that can be used for
the
development of diagnostics, therapeutics or molecules for industrial
applications.
Accordingly, the methods of the invention can identify novel nucleic acid
sequences
encoding proteins or polypeptides having a desired biological activity.
In one embodiment, the invention provides a method for high throughput
culturing of organisms. In one aspect, the organisms are a mixed population of
organisms. In another aspect, the organisms include host cells of a library
containing
nucleic acids. For example, such libraries include nucleic acid obtained from
various
isolates of organisms, which are then pooled; nucleic acid obtained from
isolate
libraries, which are then pooled; or nucleic acids derived directly from a
mixed
population of organisms. Generally, a sample containing the organisms is mixed
with
a composition that can form a microenvironment, as described herein, e.g., a
gel
microdroplet or a liposome. In one aspect, as illustrated in Example 8, a mixed
population of microorganisms is mixed with the encapsulation material in such
a way
that preferably fewer than 5 microorganisms are encapsulated. Preferably, only
one
microorganism is encapsulated in each microenvironment system.
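One way to reason about how dilute the cell suspension should be is to model the number of organisms per microenvironment as a Poisson random variable. That model is an assumption made here for illustration and is not stated in the specification; the mean loading values are likewise hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k organisms when the average loading is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def occupancy(mean: float) -> dict:
    """Summarize microenvironment occupancy for a given average loading."""
    p_empty = poisson_pmf(0, mean)
    p_single = poisson_pmf(1, mean)
    return {
        "empty": p_empty,
        "single": p_single,
        "fewer_than_5": sum(poisson_pmf(k, mean) for k in range(5)),
        # Share of occupied microenvironments holding exactly one organism.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


for mean in (0.1, 0.5, 1.0):  # hypothetical average organisms per droplet
    stats = occupancy(mean)
    print(mean, {name: round(value, 3) for name, value in stats.items()})
```

Under this model, a lower average loading raises the share of occupied microenvironments that contain a single organism, at the cost of more empty microenvironments to sort away.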
Once encapsulated, the cells are cultured in a manner which allows growth of
the organisms, e.g., host cells of a library. For example, Example 8 provides
growth
of the encapsulated organisms in a chromatography column which allows a flow
of
growth medium providing nutrients for growth and for removal of waste products
from cells. Over a period of time (20 minutes to several weeks or months), a
clonal
population of the preferably one organism grows within the microenvironment.
After a desired period of time, microenvironments, e.g., gel microdroplets,
can be sorted to eliminate "empty" microenvironments and to sort for the
occupied
microenvironments. The nucleic acid from organisms in the sorted
microenvironments can be studied directly, for example, by treating with a PCR
mixture and amplified immediately after sorting. In one Example described
herein,
16S rRNA genes from individual cells were studied and organisms assessed for
phylogenetic diversity from the samples.
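Separating occupied from "empty" microenvironments can be pictured as applying a gate to per-event scatter measurements. The sketch below is a simplified illustration only: the rectangular gate, its limits, and the example events are all hypothetical, and real instruments and gating strategies will differ.

```python
from dataclasses import dataclass
from typing import Iterable, List, Tuple


@dataclass(frozen=True)
class Event:
    """One detected particle; scatter values are in arbitrary units."""
    forward_scatter: float
    side_scatter: float


@dataclass(frozen=True)
class Gate:
    """Rectangular region in the forward-scatter / side-scatter plane."""
    fsc_min: float
    fsc_max: float
    ssc_min: float
    ssc_max: float

    def contains(self, event: Event) -> bool:
        return (self.fsc_min <= event.forward_scatter <= self.fsc_max
                and self.ssc_min <= event.side_scatter <= self.ssc_max)


def sort_events(events: Iterable[Event],
                occupied: Gate) -> Tuple[List[Event], List[Event]]:
    """Split events into those inside the 'occupied' gate and the rest."""
    kept: List[Event] = []
    discarded: List[Event] = []
    for event in events:
        (kept if occupied.contains(event) else discarded).append(event)
    return kept, discarded


# Hypothetical gate limits and events, chosen only for illustration.
OCCUPIED_GATE = Gate(fsc_min=400.0, fsc_max=900.0, ssc_min=300.0, ssc_max=800.0)
events = [
    Event(650.0, 520.0),  # falls inside the gate
    Event(620.0, 90.0),   # side scatter below the gate
    Event(40.0, 30.0),    # both values below the gate
]
kept, discarded = sort_events(events, OCCUPIED_GATE)
print(len(kept), "kept;", len(discarded), "discarded")
```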
In another aspect, the high throughput culturing methods of the invention
allow culturing of organisms and enrichment of low copy gene targets. For
example,
a library of nucleic acid obtained from various isolates of organisms, which
are then
pooled; nucleic acid obtained from isolate libraries, which are then pooled;
or nucleic
acids derived directly from a mixed population of organisms, for example, are
encapsulated, e.g., in a gel microdroplet or other microenvironment, and grown
under
conditions which allow clonal expansion of each organism in the
microenvironment.
In one aspect, the cells of the clonal population are lysed and treated with
proteinases
to yield nucleic acid (see Figure X) (e.g., the microcolonies are
deproteinized by
incubating gel microdroplets in lysis solution containing proteinase K at 37
degrees C
for 30 minutes). The nucleic acid entrapped in the microenvironments is then
denatured with alkaline denaturing solution (0.5M NaOH) and neutralized
(e.g., with Tris pH8). In one particular example,
nucleic acid
entrapped in the microenvironment is hybridized with Digoxigenin (DIG)-labeled
oligonucleotides (30-50 nt) in Dig Easy Hyb (available from Roche) overnight
at 37
degrees C, followed by washing with 0.3xSSC and 0.1xSSC at 38-50 degrees C to
achieve desired stringency. One of skill in the art will appreciate that this
is merely
an example and not meant to limit the invention in any way. For example, other
labels commonly used in the art, e.g., fluorescent labels such as GFP or
chemiluminescent labels, can be utilized in the invention methods.
The nucleic acid is hybridized with a probe which is preferably labeled. A
signal can be amplified with a secondary label (e.g., fluorescent) and the
nucleic acid
sorted for fluorescent microenvironments, e.g., gel microdroplets. Nucleic
acid that is
fluorescent can be isolated and further studied or cloned into a host cell for
further
manipulation. In one particular example, signals are amplified with Tyramide
Signal
Amplification (TSA) kit from Molecular Probes. TSA is an enzyme-mediated signal
amplification method that utilizes horseradish peroxidase (HRP) to deposit
fluorogenic tyramide molecules and generate high-density labeling of a target
nucleic
acid sequence in situ. The signal amplification is conferred by the turnover
of multiple
tyramide substrates per HRP molecule, and increases in signal strength of over
1,000-
fold have been reported. The procedure involves incubating GMDs with anti-DIG
conjugated horseradish peroxidase (anti-DIG-HRP) (Roche, IN) for 3 hours at
room
temperature. The tyramide substrate solution is then added and incubated
for 30
minutes at room temperature.
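The hybridization and signal amplification steps described above can be collected into a small checklist structure. This is only an illustrative sketch: the reagents, temperatures and times are taken from the text, while the `Step` record, the field names, and the use of `None` for durations the text does not state exactly (such as "overnight") are assumptions made here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Step:
    name: str
    reagent: str
    temperature: str
    minutes: Optional[int]  # None where the text gives no exact duration


PROTOCOL: List[Step] = [
    Step("deproteinize", "lysis solution containing proteinase K", "37 C", 30),
    Step("denature", "alkaline denaturing solution (0.5M NaOH)", "not stated", None),
    Step("neutralize", "Tris pH8", "not stated", None),
    Step("hybridize", "DIG-labeled oligonucleotides (30-50 nt) in Dig Easy Hyb",
         "37 C", None),  # described only as "overnight"
    Step("wash", "0.3xSSC, then 0.1xSSC", "38-50 C", None),
    Step("bind conjugate", "anti-DIG-HRP", "room temperature", 180),
    Step("amplify signal", "tyramide substrate solution", "room temperature", 30),
]


def timed_minutes(steps: List[Step]) -> int:
    """Total of the durations that the text states explicitly."""
    return sum(step.minutes for step in steps if step.minutes is not None)


def untimed_steps(steps: List[Step]) -> List[str]:
    """Names of steps whose duration is not stated exactly."""
    return [step.name for step in steps if step.minutes is None]


print("explicitly timed minutes:", timed_minutes(PROTOCOL))
print("steps without an exact duration:", untimed_steps(PROTOCOL))
```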
In one aspect, this high throughput culturing method, followed by sorting (e.g.,
FACS) and screening (e.g., biopanning), allows for identification of gene targets.
It may
be desirable to screen for nucleic acids encoding virtually any protein or any
bioactivity and to compare such nucleic acids among various species of
organisms in
a sample (e.g., study polyketide sequences from a mixed population). In
another
aspect, nucleic acid derived from high throughput culturing of organisms can
be
obtained for further study or for generation of a library. Such nucleic acid
can be
pooled and a library created, or alternatively, individual libraries from
clonal
populations of organisms can be generated and then nucleic acid pooled from
those
libraries to generate a more complex library. The libraries generated as
described
herein can be utilized for the discovery of biomolecules (e.g., nucleic acid
or
bioactivities) or for evolving nucleic acid molecules identified by the high
throughput
culturing methods described in the present invention. Such evolution
methods are known in the art or described herein, such as, shuffling, cassette
mutagenesis, recursive ensemble mutagenesis, sexual PCR, directed evolution,
exonuclease-mediated reassembly, codon site-saturation mutagenesis, amino acid
site-
saturation mutagenesis, gene site saturation mutagenesis, introduction of
mutations by
non-stochastic polynucleotide reassembly methods, synthetic ligation
polynucleotide
reassembly, gene reassembly, oligonucleotide-directed saturation mutagenesis,
in vivo
reassortment of polynucleotide sequences having partial homology, naturally
occurring recombination processes which reduce sequence complexity, and any
combination thereof.
Flow cytometry has been used in cloning and selection of variants from
existing
cell clones. This selection, however, has required stains that diffuse through
cells
passively, rapidly and irreversibly, with no toxic effects or other influences
on metabolic
or physiological processes. Since, typically, flow sorting has been used to
study animal
cell culture performance, physiological state of cells, and the cell cycle,
one goal of cell
sorting has been to keep the cells viable during and after sorting.
There currently are no reports in the literature of screening and discovery of
polynucleotide sequence in libraries by cell sorting based on fluorescence
(e.g.
fluorescent activated cell sorting), or non-optical markers (e.g., magnetic
fields and the
like). Furthermore there are no reports of recovering DNA encoding
bioactivities
screened by FACS or non-optical techniques and additionally screening for a
bioactivity
of interest. The present invention provides these methods to allow the
extremely rapid
screening of viable or non-viable cells to recover desirable activities and
the nucleic acid
encoding those activities.
Fluorescence and other forms of staining have been employed for microbial
discrimination and identification, and in the analysis of the interaction of
drugs and
antibiotics with microbial cells. Flow cytometry has been used in aquatic
biology, where
autofluorescence of photosynthetic pigments is used in the identification of
algae or
DNA stains are used to quantify and count marine populations (Davey and
Kell, 1996).
Diaper and Edwards used flow cytometry to detect viable bacteria after
staining with a
range of fluorogenic esters including fluorescein diacetate (FDA) derivatives
and
ChemChrome B, a stain sold commercially for the detection of viable bacteria in
suspension (Diaper and Edwards, 1994). Labeled antibodies and oligonucleotide
probes
can also be used for these purposes.
Papers have been published describing the application of flow cytometry to the
detection of native and recombinant enzymatic activities in eukaryotes. Betz
et al.
studied native (non-recombinant) lipase production by the eukaryote, Rhizopus
arrhizus
with flow cytometry. They found that spore suspensions of the mold were
heterogeneous as judged by light-scattering data obtained with excitation at
633 nm, and
they sorted clones of the subpopulations into the wells of microtiter plates.
After
germination and growth, lipase production was automatically assayed
(turbidimetrically)
in the microtiter plates, and a representative set of the most active were
reisolated,
cultured, and assayed conventionally (Betz et al., 1984). The ability of flow
cytometry
to make measurements on single cells means that individual cells with high
levels of
expression (e.g., due to gene amplification or higher plasmid copy number)
could be
detected.
Cells with chromogenic or fluorogenic substrates yield colored and fluorescent
products, respectively. Previously, it had been thought that the flow
cytometry-
fluorescence activated cell sorter approaches could be of benefit only for the
analysis of
cells that contain intracellularly, or are normally physically associated
with, the
enzymatic activity of a molecule of interest. On this basis, one could only
use
fluorogenic reagents which could penetrate the cell and which are thus
potentially
cytotoxic. In addition, gel microdroplets (GMDs) can be used during FACS
sorting and
culturing. The use of GMDs containing (physically) single cells which can take
up
nutrients, secrete products, and grow to form colonies is useful in the
present invention.
The diffusional properties of GMDs may be made such that sufficient
extracellular
product remains associated with each individual GMD, so as to permit flow
cytometric
analysis and cell sorting on the basis of concentration of secreted molecule
within each
microdroplet. Beads have also been used to isolate mutants growing at
different rates,
and to analyze antibody secretion by hybridoma cells and the nutrient
sensitivity of
hybridoma cells.
The GMD technology has had significance in amplifying the signals available in
flow cytometric analysis, and in permitting the screening and sorting of
microbial strains
in strain improvement and isolation programs. GMDs or other related
technologies can
be used in the present invention to localize, sort as well as amplify signals
in the high
throughput screening of recombinant libraries. Cell viability during the
screening is not
an issue or concern since nucleic acid can be recovered from the microdroplet.
Different types of encapsulation strategies and compounds or polymers can be
used with the present invention. For instance, high temperature agaroses can
be
employed for making microdroplets stable at high temperatures, allowing stable
encapsulation of cells subsequent to heat-kill steps utilized to remove all
background
activities when screening for thermostable bioactivities. Encapsulation can be
in beads,
high temperature agaroses, gel microdroplets, cells, such as ghost red blood
cells or
macrophages, liposomes, or any other means of encapsulating and localizing
molecules.
For example, methods of preparing liposomes have been described (i.e., U.S.
Patent No.'s 5,653,996, 5,393,530 and 5,651,981), as well as the use of
liposomes to
encapsulate a variety of molecules (U.S. Patent No.'s 5,595,756, 5,605,703,
5,627,159,
5,652,225, 5,567,433, 4,235,871, 5,227,170). Entrapment of proteins, viruses,
bacteria and DNA in erythrocytes during endocytosis has been described, as
well
(Journal of Applied Biochemistry 4, 418-435 (1982)). Erythrocytes employed as
carriers in vitro or in vivo for substances entrapped during hypo-osmotic
lysis or
dielectric breakdown of the membrane have also been described (reviewed in
Ihler, G.
M. (1983) J. Pharm. Ther). These techniques are useful in the present
invention to
encapsulate samples for screening.
"Microenvironment", as used herein, is any molecular structure which
provides an appropriate environment for facilitating the interactions
necessary for the
method of the invention. An environment suitable for facilitating molecular
interactions includes, for example, gel microdroplets, ghost cells, macrophages
or
liposomes. Liposomes can be prepared from a variety of lipids including
phospholipids, glycolipids, steroids, long-chain alkyl esters; e.g., alkyl
phosphates,
fatty acid esters; e.g., lecithin, fatty amines and the like. A mixture of
fatty material
may be employed, such as a combination of a neutral steroid, a charged amphiphile and
a
phospholipid. Illustrative examples of phospholipids include lecithin,
sphingomyelin
and dipalmitoylphosphatidylcholine. Representative steroids include
cholesterol,
cholestanol and lanosterol. Representative charged amphiphilic compounds
generally
contain from 12-30 carbon atoms and include mono- or dialkyl phosphate esters
and alkyl amines; e.g., dicetyl phosphate, stearyl amine, hexadecyl amine,
dilauryl phosphate, and the like.
The invention methods include a system and method for holding and screening
samples. According to one embodiment of the invention, a sample screening
apparatus includes a plurality of capillaries formed into an array of adjacent
capillaries, wherein each capillary comprises at least one wall defining a
lumen for
retaining a sample. The apparatus further includes interstitial material
disposed
between adjacent capillaries in the array, and one or more reference indicia
formed
within the interstitial material (see co-pending applications 09/687,219
and
09/894,956, herein incorporated by reference in their entirety).
According to another embodiment of the invention, a capillary for screening a
sample, wherein the capillary is adapted for being bound in an array of
capillaries,
includes a first wall defining a lumen for retaining the sample, and a second
wall
formed of a filtering material, for filtering excitation energy provided to
the lumen to
excite the sample.
According to yet another embodiment of the invention, a method for
incubating a bioactivity or biomolecule of interest includes the steps of
introducing a
first component into at least a portion of a capillary of a capillary array,
wherein each
capillary of the capillary array comprises at least one wall defining a lumen
for
retaining the first component, and introducing an air bubble into the
capillary behind
the first component. The method further includes the step of introducing a
second
component into the capillary, wherein the second component is separated from
the
first component by the air bubble.
In yet another embodiment of the invention, a method of incubating a sample
of interest includes introducing a first liquid labeled with a detectable
particle into a
capillary of a capillary array, wherein each capillary of the capillary array
comprises
at least one wall defining a lumen for retaining the first liquid and the
detectable
particle, and wherein the at least one wall is coated with a binding material
for
binding the detectable particle to the at least one wall. The method further
includes
removing the first liquid from the capillary tube, wherein the bound
detectable
particle is maintained within the capillary, and introducing a second liquid
into the
capillary tube.
Another embodiment of the invention includes a recovery apparatus for a
sample screening system, wherein the system includes a plurality of
capillaries
formed into an array. The recovery apparatus includes a recovery tool adapted
to
contact at least one capillary of the capillary array and recover a sample
from the at
least one capillary. The recovery apparatus further includes an ejector,
connected
with the recovery tool, for ejecting the recovered sample from the recovery
tool.
As used herein and in the appended claims, the singular forms "a," "an," and
"the" include plural referents unless the context clearly dictates otherwise.
Thus, for
example, reference to "a clone" includes a plurality of clones and reference
to "the
nucleic acid sequence" generally includes reference to one or more nucleic
acid
sequences and equivalents thereof known to those skilled in the art, and so
forth.
Unless defined otherwise, all technical and scientific terms used herein have
the
same meaning as commonly understood to one of ordinary skill in the art to
which the
invention belongs. Although any methods, devices and materials similar or
equivalent to
those described herein can be used in the practice or testing of the
invention, the
preferred methods, devices and materials are now described.
All publications mentioned herein are incorporated herein by reference in full
for
the purpose of describing and disclosing the databases, proteins, and
methodologies,
which are described in the publications which might be used in connection with
the
presently described invention. The publications discussed above and throughout
the text
are provided solely for their disclosure prior to the filing date of the
present application.
Nothing herein is to be construed as an admission that the inventors are not
entitled to
antedate such disclosure by virtue of prior invention.
An "amino acid" is a molecule having the structure wherein a central carbon
atom (the α-carbon atom) is linked to a hydrogen atom, a carboxylic acid
group (the
carbon atom of which is referred to herein as a "carboxyl carbon atom"), an
amino group
(the nitrogen atom of which is referred to herein as an "amino nitrogen
atom"), and a
side chain group, R. When incorporated into a peptide, polypeptide, or
protein, an
amino acid loses one or more atoms of its amino and carboxylic groups in the
dehydration reaction that links one amino acid to another. As a result, when
incorporated into a protein, an amino acid is referred to as an "amino acid
residue."
"Protein" or "polypeptide" refers to any polymer of two or more individual
amino acids (whether or not naturally occurring) linked via a peptide bond,
and occurs
when the carboxyl carbon atom of the carboxylic acid group bonded to the α-carbon of
one amino acid (or amino acid residue) becomes covalently bound to the amino
nitrogen
atom of the amino group bonded to the α-carbon of an adjacent amino acid. The
term
"protein" is understood to include the terms "polypeptide" and "peptide"
(which, at
times may be used interchangeably herein) within its meaning. In addition,
proteins
comprising multiple polypeptide subunits (e.g., DNA polymerase III, RNA
polymerase
II) or other components (for example, an RNA molecule, as occurs in
telomerase) will
also be understood to be included within the meaning of "protein" as used
herein.
Similarly, fragments of proteins and polypeptides are also within the scope of
the
invention and may be referred to herein as "proteins."
A particular amino acid sequence of a given protein (i.e., the polypeptide's
"primary structure," when written from the amino-terminus to carboxy-terminus)
is
determined by the nucleotide sequence of the coding portion of a mRNA, which
is in
turn specified by genetic information, typically genomic DNA (including
organelle
DNA, e.g., mitochondrial or chloroplast DNA). Thus, determining the sequence
of a
gene assists in predicting the primary sequence of a corresponding polypeptide
and
more particularly the role or activity of the polypeptide or proteins encoded by
that gene or
polynucleotide sequence.
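The relationship described above, in which a coding nucleotide sequence determines the primary structure of a polypeptide, can be illustrated with a minimal sketch. The codon table below is a deliberately partial subset of the standard genetic code, and the example gene, function names and the "X" placeholder are hypothetical choices made for illustration only; they are not part of the described invention.

```python
# Partial subset of the standard genetic code (illustrative assumption:
# only the codons needed for the example are listed).
CODON_TABLE = {
    "AUG": "M", "GCU": "A", "GCC": "A", "AAA": "K", "AAG": "K",
    "GAA": "E", "GAG": "E", "UGG": "W", "UUU": "F", "UUC": "F",
    "UAA": "*", "UAG": "*", "UGA": "*",  # "*" marks a stop codon
}


def transcribe(coding_dna):
    """Return the mRNA corresponding to a coding-strand DNA sequence."""
    return coding_dna.upper().replace("T", "U")


def translate(mrna):
    """Translate from the first AUG to the first stop codon.

    The result is written from the amino-terminus to the carboxy-terminus.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, so no polypeptide is predicted
    residues = []
    for i in range(start, len(mrna) - 2, 3):
        # "X" marks a codon that is missing from this partial table.
        residue = CODON_TABLE.get(mrna[i:i + 3], "X")
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Hypothetical coding sequence: ATG GCT AAA GAA TGG TTT TAA
gene = "ATGGCTAAAGAATGGTTTTAA"
protein = translate(transcribe(gene))
print(protein)  # MAKEWF
```

A complete implementation would use the full 64-codon table and would account for alternative genetic codes, such as those of organelle DNA.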
The term "isolated" means altered "by the hand of man" from its natural state;
i.e., if it occurs in nature, it has been changed or removed from its original
environment,
or both. For example, a naturally occurring polynucleotide or a polypeptide
naturally
present in a living animal, a biological sample or an environmental sample in
its natural
state is not "isolated", but the same polynucleotide or polypeptide separated
from the
coexisting materials of its natural state is "isolated", as the term is
employed herein.
Such polynucleotides, when introduced into host cells in culture or in whole
organisms,
still would be isolated, as the term is used herein, because they would not be
in their
naturally occurring form or environment. Similarly, the polynucleotides and
polypeptides may occur in a composition, such as a media formulation
(solutions for
introduction of polynucleotides or polypeptides, for example, into cells or
compositions
or solutions for chemical or enzymatic reactions).
"Polynucleotide" or "nucleic acid sequence" refers to a polymeric form of
nucleotides. In some instances a polynucleotide refers to a sequence that is
not
immediately contiguous with either of the coding sequences with which it is
immediately contiguous (one on the 5' end and one on the 3' end) in the
naturally
occurring genome of the organism from which it is derived. The term therefore
includes,
for example, a recombinant DNA which is incorporated into a vector; into an
autonomously replicating plasmid or virus; or into the genomic DNA of a
prokaryote or
eukaryote, or which exists as a separate molecule (e.g., a cDNA) independent
of other
sequences. The nucleotides of the invention can be ribonucleotides,
deoxyribonucleotides, or modified forms of either nucleotide. A
polynucleotide as used
herein refers to, among others, single- and double-stranded DNA, DNA that is a
mixture
of single- and double-stranded regions, single- and double-stranded RNA, and
RNA that
is a mixture of single- and double-stranded regions, hybrid molecules comprising
DNA
and RNA that may be single-stranded or, more typically, double-stranded or a
mixture of
single- and double-stranded regions.
In addition, polynucleotide as used herein refers to triple-stranded regions
comprising RNA or DNA or both RNA and DNA. The strands in such regions may be
from the same molecule or from different molecules. The regions may include
all of one
or more of the molecules, but more typically involve only a region of some of
the
molecules. One of the molecules of a triple-helical region often is an
oligonucleotide.
The term polynucleotide encompasses genomic DNA or RNA (depending upon the
organism, i.e., RNA genome of viruses), as well as mRNA encoded by the genomic
DNA, and cDNA.
As mentioned above, there is currently a need in the biotechnology and
chemical
industry for molecules that can optimally carry out biological or chemical
processes
(e.g., enzymes). Identifying novel enzymes in a mixed population environmental
sample is one solution to this problem. By rapidly identifying polypeptides
having an
activity of interest and polynucleotides encoding the polypeptide of interest,
the invention
provides methods, compositions and sources for the development of biologics,
diagnostics, therapeutics, and compositions for industrial applications.
All classes of molecules and compounds that are utilized in both established
and
emerging chemical, pharmaceutical, textile, food and feed, detergent markets
must meet
economical and environmental standards. The synthesis of polymers,
pharmaceuticals,
natural products and agrochemicals is often hampered by expensive processes
which
produce harmful byproducts and which suffer from poor or inefficient
catalysis.
Enzymes, for example, have a number of remarkable advantages which can
overcome
these problems in catalysis: they act on single functional groups, they
distinguish
between similar functional groups on a single molecule, and they distinguish
between
enantiomers. Moreover, they are biodegradable and function at very low mole
fractions
in reaction mixtures. Because of their chemo-, regio- and stereospecificity,
enzymes
present a unique opportunity to optimally achieve desired selective
transformations.
These are often extremely difficult to duplicate chemically, especially in
single-step
reactions. The elimination of the need for protection groups, selectivity, the
ability to
carry out multi-step transformations in a single reaction vessel, along with
the
concomitant reduction in environmental burden, has led to the increased demand
for
enzymes in chemical and pharmaceutical industries. Enzyme-based processes have
been
gradually replacing many conventional chemical-based methods. A current
limitation to
more widespread industrial use is primarily due to the relatively small number
of
commercially available enzymes. Only 300 enzymes (excluding DNA modifying
enzymes) are at present commercially available from the > 3000 non DNA-
modifying
enzyme activities thus far described.
The use of enzymes for technological applications also may require performance
under demanding industrial conditions. This includes activities in
environments or on
substrates for which the currently known arsenal of enzymes was not
evolutionarily
selected. However, the natural environment provides extreme conditions
including, for
example, extremes in temperature and pH. A number of organisms have adapted to
these conditions due in part to selection for polypeptides that can withstand
these
extremes.
Enzymes have evolved by selective pressure to perform very specific biological
functions within the milieu of a living organism, under conditions of
temperature, pH
and salt concentration. For the most part, the non-DNA modifying enzyme
activities
thus far described have been isolated from mesophilic organisms, which
represent a very
small fraction of the available phylogenetic diversity. The dynamic field of
biocatalysis
takes on a new dimension with the help of enzymes isolated from microorganisms
that
thrive in extreme environments. For example, such enzymes must function at
temperatures above 100°C in terrestrial hot springs and deep sea
thermal vents, at
temperatures below 0°C in arctic waters, in the saturated salt
environment of the Dead
Sea, at pH values around 0 in coal deposits and geothermal sulfur-rich
springs, or at pH
values greater than 11 in sewage sludge. Environmental samples obtained, for
example,
from extreme conditions containing organisms, polynucleotides or polypeptides
(e.g.,
enzymes) open a new field in biocatalysis. By rapidly screening for
polynucleotides
encoding polypeptides of interest, the invention provides not only a source of
materials
for the development of biologics, therapeutics, and enzymes for industrial
applications,
but also provides new materials for further processing by, for example,
directed
evolution and mutagenesis to develop molecules or polypeptides modified for
particular
activity or conditions.
In addition to the need for new enzymes for industrial use, there has been a
dramatic increase in the need for bioactive compounds with novel activities.
This
demand has arisen largely from changes in worldwide demographics coupled with
the
clear and increasing trend in the number of pathogenic organisms that are
resistant to
currently available antibiotics. For example, while there has been a surge in
demand for
antibacterial drugs in emerging nations with young populations, countries with
aging
populations, such as the U.S., require a growing repertoire of drugs against
cancer,
diabetes, arthritis and other debilitating conditions. The death rate from
infectious
diseases has increased 58% between 1980 and 1992 and it has been estimated
that the
emergence of antibiotic resistant microbes has added in excess of $30 billion
annually to
the cost of health care in the U.S. alone. (Adams et al., Chemical and
Engineering
News, 1995; Amann et al., Microbiological Reviews, 59, 1995). As a response to
this
trend pharmaceutical companies have significantly increased their screening of
microbial
diversity for compounds with unique activities or specificity. Accordingly,
the invention
can be used to obtain and identify polynucleotides and related sequence
specific
information from, for example, infectious microorganisms present in the
environment
such as, for example, in the gut of various macroorganisms.
In another embodiment, the methods and compositions of the invention provide
for the identification of lead drug compounds present in an environmental
sample. The
methods of the invention provide the ability to mine the environment for novel
drugs or
identify related drugs contained in different microorganisms. There are
several common
sources of lead compounds (drug candidates), including natural product
collections,
synthetic chemical collections, and synthetic combinatorial chemical
libraries, such as
nucleotides, peptides, or other polymeric molecules that have been identified
or
developed as a result of environmental mining. Each of these sources has
advantages
and disadvantages. The success of programs to screen these candidates depends
largely
on the number of compounds entering the programs, and pharmaceutical companies
have to date screened hundreds of thousands of synthetic and natural compounds
in
search of lead compounds. Unfortunately, the ratio of novel to previously-
discovered
compounds has diminished with time. The discovery rate of novel lead compounds
has
not kept pace with demand despite the best efforts of pharmaceutical
companies. There
exists a strong need for accessing new sources of potential drug candidates.
Accordingly, the invention provides a rapid and efficient method to identify
and
characterize environmental samples that may contain novel drug compounds.
The majority of bioactive compounds currently in use are derived from soil
microorganisms. Many microbes inhabiting soils and other complex ecological
communities produce a variety of compounds that increase their ability to
survive and
proliferate. These compounds are generally thought to be nonessential for
growth of the
organism and are synthesized with the aid of genes involved in intermediary
metabolism
hence their name - "secondary metabolites". Secondary metabolites are
generally the
products of complex biosynthetic pathways and are usually derived from common
cellular precursors. Secondary metabolites that influence the growth or
survival of other
organisms are known as "bioactive" compounds and serve as key components of
the
chemical defense arsenal of both micro- and macro-organisms. Humans have
exploited
these compounds for use as antibiotics, antiinfectives and other bioactive
compounds
with activity against a broad range of prokaryotic and eukaryotic pathogens.
Approximately 6,000 bioactive compounds of microbial origin have been
characterized,
with more than 60% produced by the gram positive soil bacteria of the genus
Streptomyces. (Barnes et al., Proc. Nat. Acad. Sci. U.S.A., 91, 1994). Of
these, at least
70 are currently used for biomedical and agricultural applications. The
largest class of
bioactive compounds, the polyketides, include a broad range of antibiotics,
immunosuppressants and anticancer agents which together account for sales of
over $5
billion per year.
Despite the seemingly large number of available bioactive compounds, it is
clear
that one of the greatest challenges facing modern biomedical science is the
proliferation
of antibiotic resistant pathogens. Because of their short generation time and
ability to
readily exchange genetic information, pathogenic microbes have rapidly evolved
and
disseminated resistance mechanisms against virtually all classes of antibiotic
compounds. For example, there are virulent strains of the human pathogens
Staphylococcus and Streptococcus that can now be treated with but a single
antibiotic,
vancomycin, and resistance to this compound will require only the transfer of
a single
gene, vanA, from resistant Enterococcus species for this to occur. (Bateson et
al.,
System. Appl. Microbiol., 12, 1989). When this crucial need for novel
antibacterial
compounds is superimposed on the growing demand for enzyme inhibitors,
immunosuppressants and anti-cancer agents it becomes readily apparent why
pharmaceutical companies have stepped up their screening of microbial samples
for
bioactive compounds.
The invention provides methods of identifying a nucleic acid sequence encoding
a polypeptide having either known or unknown function. For example, much of
the
diversity in microbial genomes results from the rearrangement of gene clusters
in the
genome of microorganisms. These gene clusters can be present across species or
phylogenetically related with other organisms.
For example, bacteria and many eukaryotes have a coordinated mechanism for
regulating genes whose products are involved in related processes. The genes
are
clustered, in structures referred to as "gene clusters," on a single
chromosome and are
transcribed together under the control of a single regulatory sequence,
including a single
promoter which initiates transcription of the entire cluster. The gene
cluster, the
promoter, and additional sequences that function in regulation altogether are
referred to
as an "operon" and can include up to 20 or more genes, usually from 2 to 6
genes. Thus,
a gene cluster is a group of adjacent genes that are either identical or
related, usually as
to their function. Gene clusters are generally 15 kb to greater than 120 kb
in length.
Some gene families consist of identical members. Clustering is a prerequisite
for
maintaining identity between genes, although clustered genes are not
necessarily
identical. Gene clusters range from extremes where a duplication is generated
to
adjacent related genes to cases where hundreds of identical genes lie in a
tandem array.
Sometimes no significance is discernable in a repetition of a particular gene.
A principal
example of this is the expressed duplicate insulin genes in some species,
whereas a
single insulin gene is adequate in other mammalian species.
Further, gene clusters undergo continual reorganization and, thus, the ability
to
create heterogeneous libraries of gene clusters from, for example, bacterial
or other
prokaryote sources is valuable in determining sources of novel proteins,
particularly
including enzymes such as, for example, the polyketide synthases that are
responsible for
the synthesis of polyketides having a vast array of useful activities. Other
types of
proteins that are the product(s) of gene clusters are also contemplated,
including, for
example, antibiotics, antivirals, antitumor agents and regulatory proteins,
such as insulin.
As an example, polyketide synthase enzymes fall in a gene cluster.
Polyketides
are molecules which are an extremely rich source of bioactivities, including
antibiotics
(such as tetracyclines and erythromycin), anti-cancer agents (daunomycin),
immunosuppressants (FK506 and rapamycin), and veterinary products (monensin).
Many polyketides (produced by polyketide synthases) are valuable as
therapeutic agents.
Polyketide synthases are multifunctional enzymes that catalyze the
biosynthesis of a
huge variety of carbon chains differing in length and patterns of
functionality and
cyclization. Polyketide synthase genes fall into gene clusters and at least
one type
(designated type I) of polyketide synthases has large genes and enzymes,
complicating genetic manipulation and in vitro studies of these
genes/proteins.
The ability to select and combine desired components from a library of
polyketides and postpolyketide biosynthesis genes for generation of novel
polyketides
for study is appealing. The method(s) of the present invention make it possible
to, and
facilitate the cloning of, novel polyketide synthases, since one can generate
gene banks
with clones containing large inserts (especially when using the f factor based
vectors),
which facilitates cloning of gene clusters.
Other biosynthetic genes include NRPS, glycosyl transferases and p450s.
For example, a gene cluster can be ligated into a vector containing
expression
regulatory sequences which can control and regulate the production of a
detectable
protein or protein-related array activity from the ligated gene clusters. Use
of vectors
which have an exceptionally large capacity for exogenous nucleic acid
introduction are
particularly appropriate for use with such gene clusters and are described by
way of
example herein to include artificial chromosome vectors, cosmids, and the f
factor (or
fertility factor) of E. coli. For example, the f factor of E. coli is a
plasmid which affects
high-frequency transfer of itself during conjugation and is ideal to achieve
and stably
propagate large nucleic acid fragments, such as gene clusters from samples of
mixed
populations of organisms.
The nucleic acid isolated or derived from these samples (e.g., a mixed
population
of microorganisms) can preferably be inserted into a vector or a plasmid prior
to
screening of the polynucleotides. Such vectors or plasmids are typically those
containing expression regulatory sequences, including promoters, enhancers and
the like.
Accordingly, the invention provides novel systems to clone and screen mixed
populations of organisms present, for example, in an environmental sample,
for
polynucleotides of interest, enzymatic activities and bioactivities of
interest in vitro. The
method(s) of the invention allow the cloning and discovery of novel bioactive
molecules
in vitro, and in particular novel bioactive molecules derived from
uncultivated or
cultivated samples. Large size gene clusters, genes and gene fragments can be
cloned,
sequenced and screened using the method(s) of the invention. Unlike previous
strategies, the method(s) of the invention allow one to clone, screen and
identify
polynucleotides and the polypeptides encoded by these polynucleotides in vitro
from a
wide range of mixed population samples.
The invention allows one to screen for and identify polynucleotide sequences
from complex mixed population samples. DNA libraries obtained from these
samples
can be created from cell free samples, so long as the sample contains nucleic
acid
sequences, or from samples containing cellular organisms or viral particles.
The
organisms from which the libraries may be prepared include prokaryotic
microorganisms, such as Eubacteria and Archaebacteria, lower eukaryotic
microorganisms such as fungi, algae and protozoa, as well as plants, plant
spores and
pollen. The organisms may be cultured organisms or uncultured organisms
obtained
from mixed population environmental samples and include extremophiles, such
as
thermophiles, hyperthermophiles, psychrophiles and psychrotrophs.
Sources of nucleic acids used to construct a DNA library can be obtained from
mixed population samples, such as, but not limited to, microbial samples
obtained from
Arctic and Antarctic ice, water or permafrost sources, materials of volcanic
origin,
materials from soil or plant sources in tropical areas, droppings from various
organisms
including mammals, invertebrates, as well as dead and decaying matter etc.
Thus, for
example, nucleic acids may be recovered from either a cultured or non-cultured
organism and used to produce an appropriate DNA library (e.g., a recombinant
expression library) for subsequent determination of the identity of the
particular
polynucleotide sequence or screening for bioactivity.
The following outlines a general procedure for producing libraries from both
culturable and non-culturable organisms as well as mixed population of
organisms,
which libraries can be probed, sequenced or screened to select therefrom
nucleic acid
sequences having an identified, desired or predicted biological activity
(e.g., an
enzymatic activity or a small molecule).
As used herein a mixed population sample is any sample containing
organisms or polynucleotides or a combination thereof, which can be obtained
from any
number of sources (as described above), including, for example, insect feces,
soil, water,
etc. Any source of nucleic acids in purified or non-purified form can be
utilized as
starting material. Thus, the nucleic acids may be obtained from any source
which is
contaminated by an organism or from any sample containing cells. The mixed
population sample can be an extract from any bodily sample such as blood,
urine, spinal
fluid, tissue, vaginal swab, stool, amniotic fluid or buccal mouthwash from
any
mammalian organism. For non-mammalian (e.g., invertebrates) organisms the
sample
can be a tissue sample, salivary sample, fecal material or material in the
digestive tract of
the organism. An environmental sample also includes samples obtained from
extreme
environments including, for example, hot sulfur pools, volcanic vents, and
frozen tundra.
In addition, the sample can come from a variety of sources. For example, in
horticulture
and agricultural testing the sample can be a plant, fertilizer, soil, liquid
or other
horticultural or agricultural product; in food testing the sample can be fresh
food or
processed food (for example infant formula, seafood, fresh produce and
packaged food);
and in environmental testing the sample can be liquid, soil, sewage treatment,
sludge and
any other sample in the environment which is considered or suspected of
containing an
organism or polynucleotides.
When the sample is a mixture of material (e.g., a mixed population of
organisms), for example, blood, soil and sludge, it can be treated with an
appropriate
reagent which is effective to open the cells and expose or separate the
strands of nucleic
acids. Mixed populations can comprise pools of cultured organisms or samples.
For
example, samples of organisms can be cultured prior to analysis in order to
purify a
particular population and thus obtain a purer sample. Organisms, such as
actinomycetes or myxobacteria, known to produce bioactivities of interest can
be
enriched for, via culturing. Culturing of organisms in the sample can include
culturing
the organisms in microdroplets and separating the cultured microdroplets with
a cell
sorter into individual wells of a multi-well tissue culture plate from which
further
processing may be performed.
Accordingly, the sample comprises nucleic acids from, for example, a diverse
and mixed population of organisms (e.g., microorganisms present in the gut of
an
insect). Nucleic acids are isolated from the sample using any number of
methods for
DNA and RNA isolation. Such nucleic acid isolation methods are commonly
performed
in the art. Where the nucleic acid is RNA, the RNA can be reverse transcribed
to DNA
using primers known in the art. Where the DNA is genomic DNA, the DNA can be
sheared using, for example, a 25 gauge needle.
The nucleic acids are then cloned into a vector. Cloning techniques are known
in
the art or can be developed by one skilled in the art, without undue
experimentation.
Vectors used in the present invention include: plasmids, phages, cosmids,
phagemids,
viruses (e.g., retroviruses, parainfluenzavirus, herpesviruses, reoviruses,
paramyxoviruses, and the like), artificial chromosomes, or selected portions
thereof (e.g.,
coat protein, spike glycoprotein, capsid protein). For example, cosmids and
phagemids
are typically used where the specific nucleic acid sequence to be analyzed or
modified is
large because these vectors are able to stably propagate large
polynucleotides.
The vector containing the cloned DNA sequence can then be amplified by
plating (i.e., clonal amplification) or transfecting a suitable host cell with
the vector (e.g.,
a phage on an E. coli host). Alternatively (or subsequently to amplification),
the cloned
DNA sequence is used to prepare a library for screening by transforming a
suitable
organism. Hosts known in the art are transformed by artificial introduction
of the
vectors containing the target nucleic acid by inoculation under conditions
conducive for
such transformation. One could transform with double stranded circular or
linear nucleic
acid or there may also be instances where one would transform with single
stranded
circular or linear nucleic acid sequences. By transform or transformation is
meant a
permanent or transient genetic change induced in a cell following
incorporation of new
DNA (i.e., DNA exogenous to the cell). Where the cell is a mammalian cell, a
permanent genetic change is generally achieved by introduction of the DNA into
the
genome of the cell. A transformed cell or host cell generally refers to a cell
(e.g.,
prokaryotic or eukaryotic) into which (or into an ancestor of which) has been
introduced,
by means of recombinant DNA techniques, a DNA molecule not normally present in
the
host organism.
A particular type of vector for use in the invention contains an f factor
origin of
replication. The f factor (or fertility factor) in E. coli is a plasmid which
effects high
frequency transfer of itself during conjugation and less frequent transfer of
the bacterial
chromosome itself. In a particular embodiment cloning vectors referred to as
"fosmids"
or bacterial artificial chromosome (BAC) vectors are used. These are derived
from E.
coli f factor which is able to stably integrate large segments of DNA. When
integrated
with DNA from an uncultured mixed population sample, this makes it
possible to
achieve large genomic fragments in the form of a stable "mixed population
nucleic acid
library."
The nucleic acids derived from a mixed population or sample may be inserted
into the vector by a variety of procedures. In general, the nucleic acid
sequence is
inserted into an appropriate restriction endonuclease site(s) by procedures
known in the
art. Such procedures and others are deemed to be within the scope of those
skilled in the
art. A typical cloning scenario may have the DNA "blunted" with an appropriate
nuclease (e.g., Mung Bean Nuclease), methylated with, for example, EcoR I
Methylase
and ligated to EcoR I linkers. The linkers are then digested with an EcoR I
Restriction
Endonuclease and the DNA size fractionated (e.g., using a sucrose gradient).
The
resulting size fractionated DNA is then ligated into a suitable vector for
sequencing,
screening or expression (e.g., a lambda vector and packaged using an in vitro
lambda
packaging extract).
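The linker digestion and size fractionation steps of the cloning scenario above can be sketched computationally. The sketch below assumes only the well-known EcoR I recognition sequence GAATTC (cut between G and AATTC); the function names, the example sequence and the length thresholds are hypothetical, and methylation is modeled simply as a set of protected site positions rather than as any particular laboratory protocol.

```python
ECORI_SITE = "GAATTC"  # EcoR I recognition sequence
CUT_OFFSET = 1         # the enzyme cuts between G and AATTC


def ecori_digest(seq, methylated=frozenset()):
    """Cut a linear sequence at each EcoR I site that is not methylated.

    `methylated` holds the start positions of protected sites (an
    illustrative stand-in for EcoR I Methylase treatment).
    """
    fragments = []
    start = 0
    pos = seq.find(ECORI_SITE)
    while pos != -1:
        if pos not in methylated:
            cut = pos + CUT_OFFSET
            fragments.append(seq[start:cut])
            start = cut
        pos = seq.find(ECORI_SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


def size_fractionate(fragments, min_len, max_len=None):
    """Keep fragments whose length lies within [min_len, max_len]."""
    return [
        f for f in fragments
        if len(f) >= min_len and (max_len is None or len(f) <= max_len)
    ]


# Hypothetical 22-nt sequence with EcoR I sites at positions 3 and 14.
dna = "AAAGAATTCCCCCCGAATTCTT"
all_cut = ecori_digest(dna)
protected = ecori_digest(dna, methylated={3})
selected = size_fractionate(all_cut, min_len=5)

print(all_cut)    # ['AAAG', 'AATTCCCCCCG', 'AATTCTT']
print(protected)  # ['AAAGAATTCCCCCCG', 'AATTCTT']
print(selected)   # ['AATTCCCCCCG', 'AATTCTT']
```

Protecting internal sites by methylation, as in the second call, is what allows the linkers to be digested without fragmenting the insert itself.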
Transformation of a host cell with recombinant DNA may be carried out by
conventional techniques as are well known to those skilled in the art. Where
the host is
prokaryotic, such as E. coli, competent cells which are capable of DNA uptake
can be
prepared from cells harvested after exponential growth phase and subsequently
treated
by the CaCl2 method by procedures well known in the art. Alternatively, MgCl2
or RbCl
can be used. Transformation can also be performed after forming a protoplast
of the host
cell or by electroporation. Transformation of Pseudomonas fluorescens and
yeast host
cells can be achieved by electroporation, using techniques described herein.
When the host is a eukaryote, methods of transfection or transformation with
DNA include conjugation, calcium phosphate co-precipitates, conventional
mechanical
procedures such as microinjection, electroporation, insertion of a plasmid
encased in
liposomes, or virus vectors, as well as others known in the art.
Eukaryotic
cells can also be cotransfected with a second foreign DNA molecule encoding a
selectable marker, such as the herpes simplex thymidine kinase gene. Another
method is
to use a eukaryotic viral vector, such as simian virus 40 (SV40) or bovine
papilloma
virus, to transiently infect or transform eukaryotic cells and express the
protein.
(Eukaryotic Viral Vectors, Cold Spring Harbor Laboratory, Gluzman ed., 1982).
The
eukaryotic cell may be a yeast cell (e.g., Saccharomyces cerevisiae), an
insect cell (e.g.,
Drosophila sp.) or may be a mammalian cell, including a human cell.
Eukaryotic systems, and mammalian expression systems, allow for post-
translational modifications of expressed mammalian proteins to occur.
Eukaryotic cells
which possess the cellular machinery for processing of the primary transcript,
glycosylation, phosphorylation, and, advantageously, secretion of the gene
product
should be used. Such host cell lines may include, but are not limited to, CHO,
VERO,
BHK, HeLa, COS, MDCK, Jurkat, HEK-293, and WI38.
After the gene libraries have been generated one can perform "biopanning" of
the libraries prior to expression screening. The "biopanning" procedure refers
to a
process for identifying clones having a specified biological activity by
screening for
sequence homology in the library of clones, using at least one probe DNA
comprising at
least a portion of a DNA sequence encoding a polypeptide having the specified
biological activity; and detecting interactions of the probe DNA with a
substantially
complementary sequence in a clone. Clones (either viable or non-viable) are
then
separated by an analyzer (e.g., a FACS apparatus or an apparatus that detects
non-optical
markers).
The probe DNA used to probe for the target DNA of interest contained in clones
prepared from polynucleotides in a mixed population of organisms can be a full-
length
coding region sequence or a partial coding region sequence of DNA for a known
bioactivity. The sequence of the probe can be generated by synthetic or
recombinant
means and can be based upon computer based sequencing programs or biological
sequences present in a clone. The DNA library can be probed using mixtures of
probes
comprising at least a portion of the DNA sequence encoding a known bioactivity
having
a desired activity. These probes or probe libraries are preferably single-
stranded. The
probes that are particularly suitable are those derived from DNA encoding
bioactivities
having an activity similar or identical to the specified bioactivity which is
to be screened.
In another embodiment, a nucleic acid library from a mixed population of
organisms is screened for a sequence of interest by transfecting a host cell
containing the
library with at least one labeled nucleic acid sequence which is all or a
portion of a DNA
sequence encoding a bioactivity having a desirable activity and separating the
library
clones containing the desirable sequence by optical- or non-optical-based
analysis.
In another embodiment, in vivo biopanning may be performed utilizing a
FACS-based machine. Complex gene libraries are constructed with vectors which
contain elements which stabilize transcribed RNA. For example, the inclusion
of
sequences which result in secondary structures such as hairpins which are
designed to
flank the transcribed regions of the RNA would serve to enhance their
stability, thus
increasing their half life within the cell. The probe molecules used in the
biopanning
process consist of oligonucleotides labeled with reporter molecules that only
fluoresce
upon binding of the probe to a target molecule. Various dyes or stains well
known in
the art, for example those described in "Practical Flow Cytometry", 1995 Wiley-
Liss,
Inc., Howard M. Shapiro, M.D., can be used to intercalate or associate with
nucleic
acid in order to "label" the oligonucleotides. These probes are introduced
into the
recombinant cells of the library using one of several transformation methods.
The
probe molecules interact or hybridize to the transcribed target mRNA or DNA
resulting in DNA/RNA heteroduplex molecules or DNA/DNA duplex molecules.
Binding of the probe to a target will yield a fluorescent signal which is
detected and
sorted by the FACS machine during the screening process.
The probe DNA should be at least about 10 bases and preferably at least 15
bases. Desirable size ranges for probe DNA are at least about 15 bases to
about 100
bases, at least about 100 bases to about 500 bases, at least about 500 bases
to about 1,000
bases, at least about 1,000 bases to about 5,000 bases and at least about
5,000 bases to
about 10,000 bases. In one embodiment, an entire coding region of one part of
a
pathway may be employed as a probe. Where the probe is hybridized to the
target DNA
in an in vitro system, conditions for the hybridization in which target DNA is
selectively
isolated by the use of at least one DNA probe will be designed to provide a
hybridization
stringency of at least about 50% sequence identity, more particularly a
stringency
providing for a sequence identity of at least about 70%. Hybridization
techniques for
probing a microbial DNA library to isolate target DNA of potential interest
are well
known in the art and any of those which are described in the literature are
suitable for
use herein. Prior to fluorescence sorting the clones may be viable or non-
viable. For
example, in one embodiment, the cells are fixed with paraformaldehyde prior to
sorting.
Once viable or non-viable clones containing a sequence substantially
complementary to the probe DNA are separated by a fluorescence analyzer,
polynucleotides present in the separated clones may be further manipulated.
In some
instances, it may be desirable to perform an amplification of the target DNA
that has
been isolated. In this embodiment, the target DNA is separated from the probe
DNA
after isolation. In one embodiment, the clone can be grown to expand the
clonal
population. Alternatively, the host cell is lysed and the target DNA
amplified. It is then
amplified before being used to transform a new host (e.g., subcloning). Long
PCR
(Barnes, W M, Proc. Natl. Acad. Sci. USA, Mar. 15, 1994) can be used to
amplify large
DNA fragments (e.g., 35 kb). Numerous amplification methodologies are now well
known in the art.
Where the target DNA is identified in vitro, the selected DNA is then used for
preparing a library for further processing and screening by transforming a
suitable
organism. Hosts, particularly those specifically identified herein as
preferred, are
transformed by artificial introduction of a vector containing a target DNA by
inoculation
under conditions conducive for such transformation.
The resultant libraries (enriched for a polynucleotide of interest) can then
be
screened for clones which display an activity of interest. Clones can be
shuttled in
alternative hosts for expression of active compounds, or screened using
methods
described herein.
Having prepared a multiplicity of clones from DNA selectively isolated via
hybridization technologies described herein, such clones are screened for a
specific
activity to identify clones having a specified characteristic.
The screening for activity may be effected on individual expression clones or
may be initially effected on a mixture of expression clones to ascertain
whether or not
the mixture has one or more specified activities. If the mixture has a
specified activity,
then the individual clones may be rescreened for such activity or for a more
specific
activity.
Prior to, subsequent to or as an alternative to the in vivo biopanning
described
above, an encapsulation technique such as GMDs may be employed to
localize
at least one clone in one location for growth or screening by a fluorescent
analyzer (e.g.
FACS). The separated at least one clone contained in the GMD may then be
cultured to
expand the number of clones or screened on a FACS machine to identify clones
containing a sequence of interest as described above, which can then be broken
out into
individual clones to be screened again on a FACS machine to identify positive
individual
clones. Screening in this manner using a FACS machine is described in patent
application Ser. No. 08/876,276, filed June 16, 1997, herein incorporated by
reference.
Thus, for example, if a clone has a desirable activity, then the individual
clones may be
recovered and rescreened utilizing a FACS machine to determine which of such
clones
has the specified desirable activity.
Further, it is possible to combine some or all of the above embodiments such
that a normalization step is performed prior to generation of the expression
library, the
expression library is then generated, the expression library so generated is
then
biopanned, and the biopanned expression library is then screened using a high
throughput cell sorting and screening instrument. Thus there are a variety of
options,
including: (i) generating the library and then screening it; (ii) normalizing
the target
DNA, generating the expression library and screening it; (iii) normalizing,
generating the
library, biopanning and screening; or (iv) generating, biopanning and screening
the library.
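The four options above differ only in whether the optional normalization and biopanning stages are included. A minimal sketch of that combination logic follows; the step labels and the function name `build_workflow` are hypothetical and chosen for illustration only.

```python
# Hypothetical labels for the stages named in the text.
NORMALIZE = "normalize target DNA"
GENERATE = "generate expression library"
BIOPAN = "biopan library"
SCREEN = "screen library"

# The four options (i) to (iv) listed in the text, as ordered step lists.
WORKFLOWS = {
    "i": [GENERATE, SCREEN],
    "ii": [NORMALIZE, GENERATE, SCREEN],
    "iii": [NORMALIZE, GENERATE, BIOPAN, SCREEN],
    "iv": [GENERATE, BIOPAN, SCREEN],
}


def build_workflow(normalize=False, biopan=False):
    """Assemble the ordered steps for a chosen combination of optional stages."""
    steps = []
    if normalize:
        steps.append(NORMALIZE)   # normalization precedes library generation
    steps.append(GENERATE)
    if biopan:
        steps.append(BIOPAN)      # biopanning follows library generation
    steps.append(SCREEN)          # screening is always the final stage
    return steps
```

Library generation and screening are always present; the two flags select which of the four listed orderings results.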
The library may, for example, be screened for a specified enzyme activity.
For example, the enzyme activity screened for may be one or more of the six
IUB
classes: oxidoreductases, transferases, hydrolases, lyases, isomerases and
ligases. The
recombinant enzymes which are determined to be positive for one or more of the
IUB
classes may then be rescreened for a more specific enzyme activity.
Alternatively, the library may be screened for a more specialized enzyme
activity. For example, instead of generically screening for hydrolase
activity, the
library may be screened for a more specialized activity, i.e. the type of bond
on which
the hydrolase acts. Thus, for example, the library may be screened to
ascertain those
hydrolases which act on one or more specified chemical functionalities, such
as: (a)
amide (peptide bonds), i.e. proteases; (b) ester bonds, i.e. esterases and
lipases; (c)
acetals, i.e., glycosidases etc.
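The two-tier scheme described above (a generic class-level screen followed by a more specific rescreen) can be sketched as a simple lookup. The numbering 1 to 6 follows the conventional top-level enzyme class numbers; the dictionary and function names are hypothetical, and only the hydrolase sub-screens named in the text are included.

```python
# The six IUB enzyme classes, keyed by their conventional top-level class number.
IUB_CLASSES = {
    1: "oxidoreductases",
    2: "transferases",
    3: "hydrolases",
    4: "lyases",
    5: "isomerases",
    6: "ligases",
}

# More specific hydrolase screens named in the text, keyed by the bond acted on.
HYDROLASE_SUBSCREENS = {
    "amide (peptide) bond": ["proteases"],
    "ester bond": ["esterases", "lipases"],
    "acetal": ["glycosidases"],
}


def rescreen_plan(positive_classes):
    """Map each class found positive in the generic screen to follow-up screens."""
    plan = {}
    for name in positive_classes:
        if name == "hydrolases":
            plan[name] = sorted(HYDROLASE_SUBSCREENS)
        else:
            plan[name] = []  # the text lists no sub-screens for other classes
    return plan
```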
As described with respect to one of the above aspects, the invention provides
a
process for activity screening of clones containing selected DNA derived from
a mixed
population of organisms or more than one organism.

Biopanning polynucleotides from a mixed population of organisms by separating
the clones or polynucleotides positive for sequence of interest with a
fluorescent
analyzer that detects fluorescence, to select polynucleotides or clones
containing
polynucleotides positive for a sequence of interest, and screening the
selected clones or
polynucleotides for specified bioactivity. In one embodiment, the
polynucleotides are
contained in clones having been prepared by recovering DNA of a microorganism,
which DNA is selected by hybridization to at least one DNA sequence which is
all or a
portion of a DNA sequence encoding a bioactivity having a desirable activity.
In another embodiment, a DNA library derived from a microorganism is
subjected to a selection procedure to select therefrom DNA which hybridizes to
one or
more probe DNA sequences which is all or a portion of a DNA sequence encoding
an
activity having a desirable activity by:
(a) contacting a DNA library with a fluorescent labeled DNA probe under
conditions permissive of hybridization so as to produce a double-stranded
complex of
probe and members of the DNA library.
The present invention offers the ability to screen for many types of
bioactivities.
For instance, the ability to select and combine desired components from a
library of
polyketides and postpolyketide biosynthesis genes for generation of novel
polyketides
for study is appealing. The method(s) of the present invention make possible,
and
facilitate, the cloning of novel polyketide synthase genes and/or gene
pathways, and other
relevant pathways or genes encoding commercially relevant secondary
metabolites,
since one can generate gene banks with clones containing large inserts
(especially when
using vectors which can accept large inserts, such as the F-factor based
vectors), which
facilitates cloning of gene clusters.
The biopanning approach described above can be used to create libraries
enriched with clones carrying sequences substantially homologous to a given
probe
sequence. Using this approach libraries containing clones with inserts of up
to 40 kbp or
larger can be enriched approximately 1,000 fold after each round of panning.
This
enables one to reduce the number of clones to be screened after 1 round of
biopanning
enrichment. This approach can be applied to create libraries enriched for
clones carrying
sequences of interest related to a bioactivity of interest, for example,
polyketide
sequences.
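The effect of an approximately 1,000-fold enrichment per round can be illustrated with simple arithmetic. The sketch below assumes, for illustration only, that enrichment multiplies the fraction of positive clones by a fixed factor each round and that the fraction cannot exceed 1; the function names are hypothetical.

```python
def enriched_fraction(initial_fraction, rounds, fold=1000.0):
    """Fraction of positive clones after a number of panning rounds.

    Assumes each round multiplies the positive fraction by `fold`,
    capped at 1.0 (a library cannot be more than 100% positive).
    """
    fraction = initial_fraction
    for _ in range(rounds):
        fraction = min(1.0, fraction * fold)
    return fraction


def expected_clones_per_hit(fraction):
    """Expected number of clones screened per positive clone found."""
    return 1.0 / fraction
```

Under these assumptions, a library in which one clone in a million is positive would contain roughly one positive per thousand clones after a single round, which is why far fewer clones need to be screened.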
Hybridization screening using high density filters or biopanning has proven an
efficient approach to detect homologues of pathways containing genes of
interest to
discover novel bioactive molecules that may have no known counterparts. Once a
polynucleotide of interest is enriched in a library of clones it may be
desirable to screen
for an activity. For example, it may be desirable to screen for the expression
of small
molecule ring structures or "backbones". Because the genes encoding these
polycyclic
structures can often be expressed in E. coli, the small molecule backbone can
be
manufactured, even if in an inactive form. Bioactivity is conferred upon
transferring the
molecule or pathway to an appropriate host that expresses the requisite
glycosylation and
methylation genes that can modify or "decorate" the structure to its active
form. Thus,
even inactive ring compounds recombinantly expressed in E. coli can be
detected to
identify clones, which are then shuttled to a metabolically rich host, such as
Streptomyces
(e.g., Streptomyces diversae or venezuelae), for subsequent production of the
bioactive
molecule. It should be understood that E. coli can produce active small
molecules and in
certain instances it may be desirable to shuttle clones to a metabolically
rich host for
"decoration" of the structure, but not required. The use of high throughput
robotic
systems allows the screening of hundreds of thousands of clones in multiplexed
arrays in
microtiter dishes.
One approach to detect and enrich for clones carrying these structures is to
use
FACS screening, a procedure described and exemplified in U.S. Ser. No.
08/876,276,
filed June 16, 1997. Polycyclic ring compounds typically have characteristic
fluorescent
spectra when excited by ultraviolet light. Thus, clones expressing these
structures can be
distinguished from background using a sufficiently sensitive detection method.
High
throughput FACS screening can be utilized to screen for small molecule
backbones in,
for example, E. coli libraries. Commercially available FACS machines are
capable of
screening up to 100,000 clones per second for UV active molecules. These
clones can be
sorted for further FACS screening or the resident plasmids can be extracted
and shuttled
to Streptomyces for activity screening.
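As a rough illustration of what a rate of up to 100,000 clones per second implies, the time needed to pass a library through the instrument can be estimated as below. The function name is hypothetical, and because the stated rate is an upper bound, the result is a lower bound on screening time.

```python
def screening_time_seconds(num_clones, clones_per_second=100_000):
    """Minimum time to screen a library at the stated maximum event rate."""
    if clones_per_second <= 0:
        raise ValueError("clones_per_second must be positive")
    return num_clones / clones_per_second
```

For example, `screening_time_seconds(1_000_000)` returns 10.0, i.e. a library of one million clones could in principle be screened in about ten seconds.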
In another embodiment, a bioactivity or biomolecule or compound is detected by
using various electromagnetic detection devices, including, for example,
optical,
magnetic and thermal detection associated with a flow cytometer.
Flow cytometers typically use an optical method of detection (fluorescence,
scatter, and the like) to discriminate individual cells or particles from
within a large
population. There are several non-optical technologies that could be used
alone or in
conjunction with the optical methods to enable new discrimination/screening
paradigms.
Magnetic field sensing is one such technique that can be used as an
alternative
or in conjunction with, for example, fluorescence based methods. Hall-effect
sensors
are one example of sensors that can be employed. Superconducting Quantum
Interference Devices ("SQUIDs") are the most sensitive sensors for magnetic
flux and
magnetic fields so far developed. A standard criterion for the sensitivity
of a SQUID
is its energy resolution. This is defined as the smallest change in energy
that the SQUID
can detect in one second (or in a bandwidth of 1 Hz). Typical values are
10^-33 J/Hz. The
utility of SQUIDs can be found in the presence of magnetosomes in certain
types of
bacteria that contain chains of permanent single magnetic domain particles of
magnetite
(Fe3O4) or greigite (Fe3S4). The magnetic field (or residual magnetic field) of
a cell that
contains a magnetosome is detected by positioning a SQUID in close proximity
to the
flow stream of a flow cytometer. Using this method cells or cells containing,
for
example, magnetic probes can be isolated based on their magnetic properties.
As
another example, changes in the synthetic pathway of magnetosome containing
bacteria
can be measured using a similar technique. Such techniques can be used to
identify
agents which modulate the synthetic pathway of magnetosomes.
Measuring dynamic charge properties is another technique that can be used as
an alternative or in conjunction with, for example, fluorescence based
methods.
Multipole Coupling Spectroscopy ("MCS") directly measures the dynamic charge
properties of systems without the need for labeling. Structural changes that
occur when
molecules interact result in representative changes in charge distribution,
and these
produce a dielectric based spectrum or "signature" that reveals the affinity,
specificity and
functionality of each interaction. Similar changes in charge distribution
occur in cellular
systems. By observing the changes in these signatures, the dynamics of
molecular
pathways and cellular function can be resolved in their native conditions. MCS
utilizes a
small microwave (500 MHz to 50 GHz) transceiver that could be positioned in
close
proximity to the flow stream of a flow cytometer. Because of the short
measurement
times (e.g., microseconds) required, a complete MCS signature for each cell
within the
stream of a flow cytometer can be generated and analyzed. Certain cells can
then be
sorted and/or isolated based on either spectral features that are known a
priori or based
on some statistical variation from a general population. Examples of uses for
this
technique include selection of expression mutants, small molecule pre-
screening, and the
like.
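Sorting on "some statistical variation from a general population" can be sketched as an outlier gate. The example below assumes, purely for illustration, that each cell's signature has been reduced to a single numeric feature and that a z-score cutoff is used; neither the feature nor the cutoff is specified in the text, and the function name is hypothetical.

```python
from statistics import mean, stdev


def outlier_events(values, z_threshold=3.0):
    """Return indices of events deviating from the population mean.

    `values` holds one numeric feature per cell (an assumed reduction of
    the full signature). An event is flagged when its absolute z-score,
    computed with the sample standard deviation, exceeds `z_threshold`.
    """
    if len(values) < 2:
        return []  # a standard deviation needs at least two events
    mu = mean(values)
    sigma = stdev(values)
    if sigma == 0:
        return []  # all events identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > z_threshold]
```

A gate based on spectral features known a priori would instead compare each signature against a stored reference rather than against the population statistics.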
In one screening approach, biomolecules from candidate clones can be tested
for
bioactivity by susceptibility screening against test organisms such as
Staphylococcus
aureus, Micrococcus luteus, E. coli, or Saccharomyces cerevisiae. FACS
screening can be
used in this approach by co-encapsulating clones with the test organism.
An alternative to the above-mentioned screening methods provided by the
present invention is an approach termed "mixed extract" screening. The
"mixed extract"
screening approach takes advantage of the fact that the accessory genes needed
to confer
activity upon the polycyclic backbones are expressed in metabolically rich
hosts, such as
Streptomyces, and that the enzymes can be extracted and combined with the
backbones
extracted from E. coli clones to produce the bioactive compound in vitro.
Enzyme
extract preparations from metabolically rich hosts, such as Streptomyces
strains, at
various growth stages are combined with pools of organic extracts from E. coli
libraries
and then evaluated for bioactivity. Another approach to detect activity in the
E. coli
clones is to screen for genes that can convert bioactive compounds to
different forms.
For example, a recombinant enzyme was recently discovered that can convert the
low
value daunomycin to the higher value doxorubicin. Similar enzyme pathways are
being
sought to convert penicillins to cephalosporins.

Screening may be carried out to detect a specified enzyme activity by
procedures
known in the art. For example, enzyme activity may be screened for one or more
of the
six IUB classes: oxidoreductases, transferases, hydrolases, lyases, isomerases
and
ligases. The recombinant enzymes which are determined to be positive for one
or more
of the IUB classes may then be rescreened for a more specific enzyme activity.
Alternatively, the library may be screened for a more specialized enzyme
activity. For
example, instead of generically screening for hydrolase activity, the library
may be
screened for a more specialized activity, i.e. the type of bond on which the
hydrolase
acts. Thus, for example, the library may be screened to ascertain those
hydrolases which
act on one or more specified chemical functionalities, such as: (a) amide
(peptide bonds),
i.e. proteases; (b) ester bonds, i.e. esterases and lipases; (c) acetals,
i.e., glycosidases.
FACS screening can also be used to detect expression of UV fluorescent
molecules in any host, including metabolically rich hosts, such as
Streptomyces. For
example, recombinant oxytetracycline retains its diagnostic red fluorescence
when
produced heterologously in S. lividans TK24. Pathway clones, which can be
sorted by
FACS, can thus be screened for polycyclic molecules in a high throughput
fashion.
Recombinant bioactive compounds can also be screened in vivo using "two-
hybrid" systems, which can detect enhancers and inhibitors of protein-protein
or other
interactions such as those between transcription factors and their activators,
or receptors
and their cognate targets. In this embodiment, both the small molecule pathway
and the
reporter construct are co-expressed. Clones altered in reporter expression can
then be
sorted by FACS and the pathway clone isolated for characterization.
As indicated, common approaches to drug discovery involve screening assays in
which disease targets (macromolecules implicated in causing a disease) are
exposed to
potential drug candidates which are tested for therapeutic activity. In other
approaches,
whole cells or organisms that are representative of the causative agent of the
disease,
such as bacteria or tumor cell lines, are exposed to the potential candidates
for screening
purposes. Any of these approaches can be employed with the present invention.

The present invention also allows for the transfer of cloned pathways derived
from uncultivated samples into metabolically rich hosts for heterologous
expression and
downstream screening for bioactive compounds of interest using a variety of
screening
approaches briefly described above.
Recovering Desirable Bioactivities
After viable or non-viable cells, each containing a different expression clone
from the gene library are screened, and positive clones are recovered, DNA can
be
isolated from positive clones utilizing techniques well known in the art. The
DNA can
then be amplified either in vivo or in vitro by utilizing any of the various
amplification
techniques known in the art. In vivo amplification would include
transformation of the
clone(s) or subclone(s) into a viable host, followed by growth of the host. In
vitro
amplification can be performed using techniques such as the polymerase chain
reaction.
Once amplified the identified sequences can be "evolved" or sequenced.
Evolution
One advantage afforded by the present invention is the ability to manipulate the
identified polynucleotides to generate and select for encoded variants with
altered
activity or specificity.
Clones found to have the bioactivity for which the screen was performed can be
subjected to directed mutagenesis to develop new bioactivities with desired
properties or
to develop modified bioactivities with particularly desired properties that
are absent or
less pronounced in the wild-type activity, such as stability to heat or
organic solvents.
Any of the known techniques for directed mutagenesis are applicable to the
invention.
For example, particularly preferred mutagenesis techniques for use in
accordance with
the invention include those described below.
Alternatively, it may be desirable to variegate a polynucleotide sequence
obtained, identified or cloned as described herein. Such variegation can
modify the
polynucleotide sequence in order to modify (e.g., increase or decrease) the
encoded
polypeptide's activity, specificity, affinity, function, etc. Such evolution
methods are
known in the art or described herein, such as shuffling, cassette
mutagenesis,
recursive ensemble mutagenesis, sexual PCR, directed evolution, exonuclease-
mediated reassembly, codon site-saturation mutagenesis, amino acid site-
saturation
mutagenesis, gene site saturation mutagenesis, introduction of mutations by
non-
stochastic polynucleotide reassembly methods, synthetic ligation
polynucleotide
reassembly, gene reassembly, oligonucleotide-directed saturation mutagenesis,
in vivo
reassortment of polynucleotide sequences having partial homology, naturally
occurring recombination processes which reduce sequence complexity, and any
combination thereof.
The clones enriched for a desired polynucleotide sequence, which are
identified as described above, may be sequenced to identify the DNA sequence(s)
present in the clone, which sequence information can be used to screen a
database for
similar sequences or functional characteristics. Thus, in accordance with the
present
invention it is possible to: (i) isolate and identify DNA having a sequence of
interest
(e.g., a sequence encoding an enzyme having a specified enzyme activity), (ii)
associate the sequence with known or unknown sequence in a database (e.g.,
database
sequence associated with an enzyme having an activity (including the amino
acid
sequence thereof)), and (iii) produce recombinant enzymes having such
activity.
Sequencing may be performed by high-throughput sequencing techniques. The
exact method of sequencing is not a limiting factor of the invention. Any
method useful
in identifying the sequence of a particular cloned DNA sequence can be used.
In
general, sequencing is an adaptation of the natural process of DNA
replication.
Therefore, a template (e.g., the vector) and primer sequences are used. One
general
template preparation and sequencing protocol begins with automated picking of
bacterial
colonies, each of which contains a separate DNA clone which will function as a
template
for the sequencing reaction. The selected clones are placed into media, and
grown
overnight. The DNA templates are then purified from the cells and suspended in
water.
After DNA quantification, high-throughput sequencing is performed using a
sequencer,
such as an Applied Biosystems, Inc. Prism 377 DNA Sequencer. The resulting
sequence
data can then be used in additional methods, including to search a database or
databases.

Database Searches and Alignment Algorithms
A number of source databases are available that contain either a nucleic acid
sequence and/or a deduced amino acid sequence for use with the invention in
identifying
or determining the activity encoded by a particular polynucleotide sequence.
All or a
representative portion of the sequences (e.g., about 100 individual clones) to
be tested
are used to search a sequence database (e.g., GenBank, PFAM or ProDom), either
simultaneously or individually. A number of different methods of performing
such
sequence searches are known in the art. The databases can be specific for a
particular
organism or a collection of organisms. For example, there are databases for
the C.
elegans, Arabidopsis sp., M. genitalium, M. jannaschii, E. coli, H. influenzae, S.
influenzae, S.
cerevisiae and others. The sequence data of the clone is then aligned to the
sequences in
the database or databases using algorithms designed to measure homology
between two
or more sequences.
Such sequence alignment methods include, for example, BLAST (Altschul et al.,
1990), BLITZ (MPsrch) (Sturrock & Collins, 1993), and FASTA (Pearson & Lipman,
1988). The probe sequence (e.g., the sequence data from the clone) can be any
length,
and will be recognized as homologous based upon a threshold homology value.
The
threshold value may be predetermined, although this is not required. The
threshold
value can be based upon the particular polynucleotide length. To align
sequences a
number of different procedures can be used. Typically, Smith-Waterman or
Needleman-Wunsch algorithms are used. However, as discussed, faster procedures
such as BLAST,
FASTA and PSI-BLAST can be used.
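As an illustration of the local alignment approach named above, the following is a minimal sketch of Smith-Waterman scoring with a linear gap penalty. The scoring values (match, mismatch, gap) are arbitrary illustrative choices and are not taken from this description; the function returns only the best local score, not the alignment itself.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Uses the standard dynamic-programming recurrence in which each cell
    is the maximum of zero, a diagonal (match or mismatch) move, and a
    gap in either sequence. Only two rows are kept in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best
```

The zero floor in the recurrence is what makes the alignment local: a poorly matching region resets the score instead of dragging down a good match elsewhere. Needleman-Wunsch (global alignment) omits that floor and initializes the borders with gap penalties.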
For example, optimal alignment of sequences for aligning a comparison window
may be conducted by the local homology algorithm of Smith (Smith and Waterman,
Adv Appl Math, 1981; Smith and Waterman, J Theor Biol, 1981; Smith and
Waterman, J
Mol Biol, 1981; Smith et al, J Mol Evol, 1981), by the homology alignment
algorithm of
Needleman (Needleman and Wunsch, 1970), by the search for similarity method of
Pearson (Pearson and Lipman, 1988), by computerized implementations of these
algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics
Software
Package Release 7.0, Genetics Computer Group, 575 Science Dr., Madison, WI, or
the
Sequence Analysis Software Package of the Genetics Computer Group, University
of
Wisconsin, Madison, WI), or by inspection, and the best alignment (i.e.,
resulting in the
highest percentage of homology over the comparison window) generated by the
various
methods is selected. The similarity of the two sequences (i.e., the probe
sequence and the
database sequence) can then be predicted.
Such software matches similar sequences by assigning degrees of homology to
various deletions, substitutions and other modifications. The terms "homology"
and
"identity" in the context of two or more nucleic acids or polypeptide
sequences, refer to
two or more sequences or subsequences that are the same or have a specified
percentage
of amino acid residues or nucleotides that are the same when compared and
aligned for
maximum correspondence over a comparison window or designated region as
measured
using any number of sequence comparison algorithms or by manual alignment and
visual inspection.
For sequence comparison, typically one sequence acts as a reference sequence,
to
which test sequences are compared. When using a sequence comparison algorithm,
test
and reference sequences are entered into a computer, subsequence coordinates
are
designated, if necessary, and sequence algorithm program parameters are
designated.
Default program parameters can be used, or alternative parameters can be
designated.
The sequence comparison algorithm then calculates the percent sequence
identities for
the test sequences relative to the reference sequence, based on the program
parameters.
A "comparison window", as used herein, includes reference to a segment of any
one of the number of contiguous positions selected from the group consisting
of from 20
to 600, usually about 50 to about 200, more usually about 100 to about 150 in
which a
sequence may be compared to a reference sequence of the same number of
contiguous
positions after the two sequences are optimally aligned.
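A percent identity over a comparison window can be computed as the number of matched positions divided by the window size, multiplied by 100. The sketch below assumes the two sequences have already been optimally aligned to equal length, with "-" marking gaps; the function name and the gap symbol are illustrative assumptions.

```python
def percent_identity(aligned_a, aligned_b, start=0, window=None):
    """Percent of identical positions within a window of two aligned sequences.

    Both inputs must be pre-aligned and of equal length. A position
    counts as matched only when both sequences carry the same residue
    and that residue is not a gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if window is None:
        window = len(aligned_a) - start
    seg_a = aligned_a[start:start + window]
    seg_b = aligned_b[start:start + window]
    if not seg_a:
        raise ValueError("comparison window is empty")
    matches = sum(1 for x, y in zip(seg_a, seg_b) if x == y and x != "-")
    return 100.0 * matches / len(seg_a)
```

For example, `percent_identity("ACGTACGT", "ACGTTCGT")` returns 87.5, since seven of the eight positions match.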
One example of a useful algorithm is the BLAST and BLAST 2.0 algorithms, which
are described in Altschul et al., J. Mol. Biol. 215:403-410 (1990) and
Altschul et al.,
Nuc. Acids Res. 25:3389-3402 (1997), respectively. Software for performing BLAST
analyses is publicly available through the National Center for Biotechnology
Information
(http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying
high scoring
sequence pairs (HSPs) by identifying short words of length W in the query
sequence,
which either match or satisfy some positive-valued threshold score T when
aligned with
a word of the same length in a database sequence. T is referred to as the
neighborhood
word score threshold (Altschul et al., supra). These initial neighborhood word
hits act as
seeds for initiating searches to find longer HSPs containing them. The word
hits are
extended in both directions along each sequence for as far as the cumulative
alignment
score can be increased. Cumulative scores are calculated using, for nucleotide
sequences, the parameters M (reward score for a pair of matching residues;
always >0) and N (penalty score for mismatching residues; always <0).
The BLAST algorithm parameters W, T, and X determine the sensitivity and speed
of
the alignment. The BLASTN program (for nucleotide sequences) uses as defaults
a
wordlength (W) of 11, an expectation (E) of 10, M=5, N=-4 and a comparison of
both
strands.
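The seeding and extension steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the BLAST implementation: it seeds only on exact word matches (the neighborhood threshold T is ignored), extends without gaps, and treats X as a drop-off threshold with an arbitrary default of 20. The function names `find_word_hits` and `extend_hit` are hypothetical; the defaults W=11, M=5 and N=-4 are the BLASTN values quoted in the text.

```python
def find_word_hits(query, subject, w=11):
    """Return (query_pos, subject_pos) pairs where a length-w word matches exactly."""
    index = {}
    for j in range(len(subject) - w + 1):
        index.setdefault(subject[j:j + w], []).append(j)
    hits = []
    for i in range(len(query) - w + 1):
        for j in index.get(query[i:i + w], []):
            hits.append((i, j))
    return hits


def extend_hit(query, subject, i, j, w=11, m=5, n=-4, x=20):
    """Extend a word hit in both directions without gaps.

    Returns (score, query_start, query_end), with query_end exclusive.
    Extension in one direction stops once the running score falls more
    than x below the best score seen in that direction (assumed meaning of X).
    """
    score = w * m  # every position inside the seed word is a match

    # Extend to the right of the seed.
    best_right, run, q_end = 0, 0, i + w
    qi, sj = i + w, j + w
    while qi < len(query) and sj < len(subject):
        run += m if query[qi] == subject[sj] else n
        if run > best_right:
            best_right, q_end = run, qi + 1
        elif best_right - run > x:
            break
        qi += 1
        sj += 1

    # Extend to the left of the seed.
    best_left, run, q_start = 0, 0, i
    qi, sj = i - 1, j - 1
    while qi >= 0 and sj >= 0:
        run += m if query[qi] == subject[sj] else n
        if run > best_left:
            best_left, q_start = run, qi
        elif best_left - run > x:
            break
        qi -= 1
        sj -= 1

    return score + best_right + best_left, q_start, q_end
```

A smaller word length makes seeding more sensitive but produces more hits to extend, which is the sensitivity and speed trade-off the text attributes to W, T and X.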
The BLAST algorithm also performs a statistical analysis of the similarity
between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci.
USA
90:5873 (1993)). One measure of similarity provided by the BLAST algorithm is the
smallest sum probability (P(N)), which provides an indication of the
probability by
which a match between two nucleotide sequences would occur by chance. For
example,
a nucleic acid is considered similar to a reference sequence if the smallest
sum
probability in a comparison of the test nucleic acid to the reference nucleic
acid is less
than about 0.2, more preferably less than about 0.01, and most preferably less
than about
0.001.
Sequence homology means that two polynucleotide sequences are homologous
(i.e., on a nucleotide-by-nucleotide basis) over the window of comparison. A
percentage
of sequence identity or homology is calculated by comparing two optimally
aligned
sequences over the window of comparison, determining the number of positions
at
which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in
both sequences
to yield the number of matched positions, dividing the number of matched
positions by
the total number of positions in the window of comparison (i.e., the window
size), and
multiplying the result by 100 to yield the percentage of sequence homology.
This
substantial homology denotes a characteristic of a polynucleotide sequence,
wherein the
polynucleotide comprises a sequence having at least 60 percent sequence
homology,
typically at least 70 percent homology, often 80 to 90 percent sequence
homology, and
most commonly at least 99 percent sequence homology as compared to a reference
sequence over a comparison window of at least 25-50 nucleotides, wherein the
percentage
of sequence homology is calculated by comparing the reference sequence to the
polynucleotide sequence which may include deletions or additions which total
20
percent or less of the reference sequence over the window of comparison.
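The percentage calculation described above can be written as a short Python function. This is a sketch that assumes the two sequences have already been optimally aligned and padded to the same length with a gap character; the function name `percent_identity` and the choice of "-" as the gap symbol are illustrative, not taken from the specification.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of window positions at which both sequences carry the same base.

    The inputs are two already-aligned strings of equal length; the window
    size is taken to be that common length. Gap positions never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("comparison window is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != gap
    )
    return 100.0 * matches / len(aligned_a)
```

The result can then be compared against the thresholds given in the text, for example at least 60 percent over a window of at least 25 to 50 nucleotides.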
Sequences having sufficient homology can then be further identified by any
annotations contained in the database, including, for example, species and
activity
information. Accordingly, in a typical mixed population sample, a plurality of
nucleic
acid sequences will be obtained, cloned, sequenced and corresponding
homologous
sequences from a database identified. This information provides a profile of
the
polynucleotides present in the sample, including one or more features
associated with the
polynucleotide including the organism and activity associated with that
sequence or any
polypeptide encoded by that sequence based on the database information. As
used herein
"fingerprint" or "profile" refers to the fact that each sample will have
associated with it a
set of polynucleotides characteristic of the sample and the environment from
which it
was derived. Such a profile can include the amount and type of sequences
present in the
sample, as well as information regarding the potential activities encoded by
the
polynucleotides and the organisms from which polynucleotides were derived.
This
unique pattern is each sample's profile or fingerprint.
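One way to picture such a profile is as a tally of the database annotations attached to each homologous sequence found in the sample. The Python sketch below assumes each database hit is represented as a dictionary with the hypothetical keys `organism` and `activity`; the specification does not prescribe any particular data structure.

```python
from collections import Counter


def build_profile(hits):
    """Summarize annotated database hits into a sample profile.

    `hits` is an iterable of dicts with the (hypothetical) keys
    'organism' and 'activity', one dict per sequenced clone.
    """
    hits = list(hits)
    return {
        "n_sequences": len(hits),
        "organisms": Counter(h["organism"] for h in hits),
        "activities": Counter(h["activity"] for h in hits),
    }


# Illustrative annotations only; these are not data from the specification.
example_hits = [
    {"organism": "Bacillus", "activity": "amylase"},
    {"organism": "Bacillus", "activity": "protease"},
    {"organism": "Aspergillus", "activity": "amylase"},
]
profile = build_profile(example_hits)
```

Two samples could then be compared by comparing their organism and activity counts.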
In some instances it may be desirable to express a particular cloned
polynucleotide sequence once its identity or activity is determined or a
suggested
identity or activity is associated with the polynucleotide. In such instances
the desired
clone, if not already cloned into an expression vector, is ligated downstream
of a
regulatory control element (e.g., a promoter or enhancer) and cloned into a
suitable host
cell. Expression vectors are commercially available along with corresponding
host cells
for use in the invention.
As representative examples of expression vectors which may be used there may
be mentioned viral particles, baculovirus, phage, plasmids, phagemids,
cosmids,
fosmids, bacterial artificial chromosomes, viral nucleic acid (e.g., vaccinia,
adenovirus,
fowl pox virus, pseudorabies and derivatives of SV40), P1-based artificial
chromosomes,
yeast plasmids, yeast artificial chromosomes, and any other vectors specific
for specific
hosts of interest (such as bacillus, aspergillus, yeast, etc.). Thus, for
example, the DNA
may be included in any one of a variety of expression vectors for expressing a
polypeptide. Such vectors include chromosomal, nonchromosomal and synthetic
DNA
sequences. Large numbers of suitable vectors are known to those of skill in
the art, and
are commercially available. The following vectors are provided by way of
example:
ZAP Express, Lambda ZAP-CMV, Lambda ZAP II, Lambda gt10, Lambda gt11,
pMyr, pSos, pCMV-Script, pCMV-Script XR, pBK Phagemid, pBK-CMV, pBK-RSV,
pBluescript II Phagemid, pBluescript II KS +, pBluescript II SK +,
pBluescript II SK -, Lambda FIX II, Lambda DASH II, Lambda EMBL3 and EMBL4,
EMBL3, EMBL4, SuperCos I and pWE15, pWE15, SuperCos I, pPCR-Script Amp,
pPCR-Script Cam, pCMV-Script, pBC KS +, pBC KS -, pBC SK +, pBC SK -, psiX174,
pNH8A, pNH16a, pNH18A, pNH46A (Stratagene); PT7BLUE, pSTBlue, pCITE, pET,
pTriEx, pForce (Novagen); pIND-E, pIND Vector, pIND/Hygro, pIND(SP1)/Hygro,
pIND/GFP, pIND(SP1)/GFP, pIND/V5-His and pIND(SP1)/V5-His Tag, pIND
TOPO TA, pShooterTM Targeting Vectors, pTracerTM GFP Reporter Vectors,
pcDNA Vector Collection, EBV Vectors, VoyagerTM VP22 Vectors, pVAX1 - DNA
vaccine vector, pcDNA4/His-Max, pBC1 Mouse Milk System (Invitrogen); pQE70,
pQE60, pQE-9, pQE-16, pQE-30/pQE-80, pQE-31/pQE-81, pQE-32/pQE-82,
pQE-40, pQE-100 Double Tag (Qiagen); pTRC99a, pKK223-3, pKK233-3,
pDR540, pRIT5, pWLNEO, pSV2CAT, pOG44, pXT1, pSG (Stratagene), pSVK3,
pBPV, pMSG, pSVL (Pharmacia). However, any other plasmid or vector may be used
as
long as they are replicable and viable in the host.
The nucleic acid sequence in the expression vector is operatively linked to an
appropriate expression control sequence(s) (promoter) to direct mRNA synthesis.
Particular named bacterial promoters include lacI, lacZ, T3, T7, gpt, lambda
PR, PL,
SP6, trp, lacUV5, PBAD, araBAD, araB, trc, proU, p-D-HSP, HSP, GAL4 UAS/E1b,
TK, GAL1, CMV/TetO2 Hybrid, EF-1a CMV, EF-1a CMV, EF-1a CMV, EF, EF-1a,
ubiquitin C, rsv-ltr, rsv, b-lactamase, nmt1, and gal10. Eukaryotic
promoters include
CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from
retrovirus, and mouse metallothionein-I. Selection of the appropriate vector
and
promoter is well within the level of ordinary skill in the art. The expression
vector also
contains a ribosome binding site for translation initiation and a
transcription terminator.
The vector may also include appropriate sequences for amplifying expression.
Promoter
regions can be selected from any desired gene using CAT (chloramphenicol
transferase)
vectors or other vectors with selectable markers.
In addition, the expression vectors preferably contain one or more selectable
marker genes to provide a phenotypic trait for selection of transformed host
cells such as
dihydrofolate reductase or neomycin resistance for eukaryotic cell culture, or
such as
tetracycline or ampicillin resistance in E. coli.
The nucleic acid sequence(s) selected, cloned and sequenced as hereinabove
described can additionally be introduced into a suitable host to prepare a
library which is
screened for the desired enzyme activity. The selected nucleic acid is
preferably already
in a vector which includes appropriate control sequences whereby a selected
nucleic acid
encoding an enzyme may be expressed, for detection of the desired activity.
The host
cell can be a higher eukaryotic cell, such as a mammalian cell, or a lower
eukaryotic cell,
such as a yeast cell, or the host cell can be a prokaryotic cell, such as a
bacterial cell. The
selection of an appropriate host is deemed to be within the scope of those
skilled in the
art from the teachings herein.
In some instances it may be desirable to perform an amplification of the
nucleic
acid sequence present in a sample or a particular clone that has been
isolated. In this
embodiment the nucleic acid sequence is amplified by PCR reaction or similar
reaction
known to those of skill in the art. Commercially available amplification kits
are
available to carry out such amplification reactions.
In addition, it is important to recognize that the alignment algorithms and
searchable
database can be implemented in computer hardware, software or a combination
thereof.
Accordingly, the isolation, processing and identification of nucleic acid
sequences and
the corresponding polypeptides encoded by those sequences can be implemented in
an
automated system.
Capillary-Based Screening
Figure 6A shows a capillary array (10) which includes a plurality of
individual
capillaries (20) having at least one outer wall (30) defining a lumen (40).
The outer wall
(30) of the capillary (20) can be one or more walls fused together. Similarly,
the wall
can define a lumen (40) that is cylindrical, square, hexagonal or any other
geometric
shape so long as the walls form a lumen for retention of a liquid or sample.
The
capillaries (20) of the capillary array (10) are held together in close
proximity to form a
planar structure. The capillaries (20) can be bound together, by being fused
(e.g., where
the capillaries are made of glass), glued, bonded, or clamped side-by-side.
The capillary
array (10) can be formed of any number of individual capillaries (20). In an
embodiment, the capillary array includes 100 to 4,000,000 capillaries (20). In
one
embodiment, the capillary array includes 100 to 500,000,000 capillaries (20).
In one
embodiment, the capillary array includes 100,000 capillaries (20). In one
specific
embodiment, the capillary array (10) can be formed to conform to a microtiter
plate
footprint, i.e. 127.76mm by 85.47mm, with tolerances. The capillary array (10)
can have
a density of 500 to more than 1,000 capillaries (20) per cm², or about 5
capillaries per
mm². For example, a microtiter plate size array of 5 µm capillaries would have
about
500 million capillaries.
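The order of magnitude of these counts can be checked with a short calculation. The Python sketch below assumes square packing with a center-to-center pitch equal to the stated capillary size (wall thickness and tolerances are neglected); the function name `capillaries_per_plate` is illustrative, and only the footprint dimensions come from the text.

```python
def capillaries_per_plate(pitch_um, length_mm=127.76, width_mm=85.47):
    """Estimate how many capillaries fit on a microtiter plate footprint.

    Assumes a square grid with the given center-to-center pitch in
    micrometers; this is a simplification, not a manufacturing figure.
    """
    if pitch_um <= 0:
        raise ValueError("pitch must be positive")
    pitch_mm = pitch_um / 1000.0
    area_mm2 = length_mm * width_mm
    return area_mm2 / pitch_mm ** 2


estimate_5um = capillaries_per_plate(5.0)
```

Under these assumptions a 5 µm pitch gives a few hundred million capillaries, consistent in order of magnitude with the figure of about 500 million quoted above.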
The capillaries (20) are preferably formed with an aspect ratio of 50:1. In
one
embodiment, each capillary (20) has a length of approximately 10 mm, and an
internal
diameter of the lumen (40) of approximately 200 µm. However, other aspect
ratios are
possible, and range from 10:1 to well over 1000:1. Accordingly, the thickness
of the
capillary array can vary from 0.5 mm to over 10 cm. Individual capillaries (20)
have an
inner diameter that ranges from 3-500 µm and 0-500 µm. A capillary (20)
having an
internal diameter of 200 µm and a length of 1 cm has a volume of approximately
0.3 µl.
The length and width of each capillary (20) is based on a desired volume and
other
characteristics discussed in more detail below, such as evaporation rate of
liquid from
within the capillary, and the like. Capillaries of the invention may include a
volume as
low as 250 nanoliters/well.
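The relationship between capillary dimensions, aspect ratio and volume can be verified with a few lines of Python. The sketch assumes a cylindrical lumen; the function names are illustrative and do not come from the specification.

```python
import math


def capillary_volume_ul(inner_diameter_um, length_mm):
    """Volume of a cylindrical lumen in microliters (1 mm^3 equals 1 µl)."""
    radius_mm = inner_diameter_um / 2000.0  # µm diameter -> mm radius
    return math.pi * radius_mm ** 2 * length_mm


def aspect_ratio(inner_diameter_um, length_mm):
    """Length divided by inner diameter, both expressed in micrometers."""
    return length_mm * 1000.0 / inner_diameter_um
```

For the 200 µm by 10 mm capillary described above, these functions give a volume of roughly 0.31 µl and an aspect ratio of 50, in line with the stated values of approximately 0.3 µl and 50:1.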
In accordance with one embodiment of the invention, one or more particles are
introduced into each capillary (20) for screening. Suitable particles include
cells, cell
clones, and other biological matter, chemical beads, or any other particulate
matter. The
capillaries (20) containing particles of interest can be introduced with
various types of
substances for causing an activity of interest. The introduced substance can
include a
liquid having a developer or nutrients, for example, which assists in cell
growth and
which results in the production of enzymes. Or, a chemical solution containing
new
particles can cause a combining event with other chemical beads already
introduced into
one or more capillaries (20). The particles and resulting activity of interest
are screened
and analyzed using the capillary array (10) according to the present
invention. In one
embodiment, the activity produces a change in properties of matter within the
capillary
(20), such as optical properties of the particles. Each capillary can act as a
waveguide
for guiding detectable light energy or property changes to an analyzer.
The capillaries (20) can be made according to various manufacturing
techniques. In one
particular embodiment, the capillaries (20) are manufactured using a hollow-
drawn
technique. A cylindrical, or other hollow shape, piece of glass is drawn out
to
continually longer lengths according to known techniques. The piece of glass
is
preferably formed of multiple layers. The drawn glass is then cut into
portions of a
specific length to form a relatively large capillary. The capillary portions
are next
bundled into an array of relatively large capillaries, and then drawn again to
increasingly
narrower diameters. During the drawing process, or when the capillaries are
formed to a
desired width, application of heat can fuse interstitial areas of adjacent
capillaries
together.
In an alternative embodiment, a glass etching process is used. Preferably, a
solid tube
of glass is drawn out to a particular width, cut into portions of a specific
length, and
drawn again. Then, each solid tube portion is center-etched with an acid or
other etchant
to form a hollow capillary. The tubes can be bound or fused together before or
after the
etch process.
A number of capillary arrays (10) can be connected together to form an array
of arrays
(12), as shown in Figure 6B. The capillary arrays (10) can be glued together.
Alternatively, the capillary arrays (10) can be fused together. According to
this
technique, the array of arrays (12) can have any desired size or footprint,
formed of any
number of high-precision capillary arrays (10).
A large number of materials can be suitably used to form a capillary array
according to
the invention and depending on the manufacturing technique used, including
without
limitation, glass, metal, semiconductors such as silicon, quartz, ceramics, or
various
polymers and plastics including, among others, polyethylene, polystyrene, and
polypropylene. The internal walls of the capillary array, or portions thereof,
may be
coated or silanized to modify their surface properties. For example, the
hydrophilicity or
hydrophobicity may be altered to promote or reduce wicking or capillary
action,
respectively. The coating material includes, for example, ligands such as
avidin,
streptavidin, antibodies, antigens, and other molecules having specific
binding affinity or
which can withstand thermal or chemical sterilization.
While the above-described manufacturing techniques and materials yield high
precision
micro-sized capillaries and capillary arrays, the size, spacing and alignment
of the
capillaries within an array may be non-uniform. In some instances, it is
desirable to have
two capillary arrays make contact in as close alignment as possible, such as,
for example,
to transfer liquid from capillaries in a first capillary array to capillaries
in a second
capillary array. One capillary array according to the invention may be cut
horizontally
along its thickness, and separated to form two capillary arrays. The two
resulting
capillary arrays will each include at least one surface having capillary
openings of
substantially identical size, spacing and alignment, and suitable for
contacting together
for transferring liquid from one resulting capillary array to the other.
Figure 7 shows a horizontal cross section of a portion of an array of
capillaries (20).
Capillary (20) is shown having a first cylindrical wall (30), a lumen (40), a
second
exterior wall (50), and interstitial material (60) separating the capillary
tubes in the array
(10). In this embodiment, the cylindrical wall (30) is comprised of a sleeve
glass, while
exterior wall (50) is comprised of an extra mural absorption (EMA) glass to
minimize
optical crosstalk among neighboring capillaries (20).
A capillary array may optionally include reference indicia (22) for providing
a
positional or alignment reference. The reference indicia (22) may be formed of
a pad of
glass extending from the surface of the capillary array, or embedded in the
interstitial
material (60). In one embodiment, the reference indicia (22) are provided at
one or
more corners of a microtiter plate formed by the capillary array. According to
the
embodiment, a corner of the plate or set of capillaries may be removed, and
replaced
with the reference indicia (22). The reference indicia (22) may also be formed
at spaced
intervals along a capillary array, to provide an indication of a subset of
capillaries (20).
Figure 8 depicts a vertical cross-section of a capillary of the invention. The
capillary
(20) includes a first wall (30) defining a lumen (40), and a second wall (50)
surrounding
the first wall (30). In one embodiment, the second wall (50) has a lower index
of
refraction than the first wall (30). In one embodiment, the first wall (30) is
sleeve glass
having a high index of refraction, forming a waveguide in which light from
excited
fluorophores travels. In the exemplary embodiment, the second wall (50) is
black 1JMA
glass, having a low index of refraction, forming a cladding around the first
wall (30)
against which light is refracted and directed along the first wall (30) for
total internal
reflection within the capillary (20). The second wall (50) can thus be made
with any
material that reduces the "cross-talk" or diffusion of light between adjacent
capillaries.
Alternatively, the inside surface of the first wall (30) can be coated with a
reflective
substance to form a mirror, or mirror-like structure, for specular reflection
within the
lumen (40).
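The waveguide condition described above follows from Snell's law: light traveling in the higher-index first wall is totally internally reflected at the boundary with the lower-index second wall when it strikes that boundary beyond the critical angle. The Python sketch below computes that angle, measured from the normal to the interface. The function name is illustrative, and the specification gives no numerical refractive indices, so any values passed in are assumptions.

```python
import math


def critical_angle_deg(n_core, n_cladding):
    """Critical angle for total internal reflection, in degrees from the normal.

    n_core is the refractive index of the first wall (30) and n_cladding
    that of the second wall (50); guiding requires n_core > n_cladding.
    """
    if n_cladding >= n_core:
        raise ValueError("cladding index must be lower than core index")
    return math.degrees(math.asin(n_cladding / n_core))
```

The larger the index difference between the two walls, the smaller the critical angle and the wider the range of ray angles that stay confined to the capillary.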
Many different materials can be used in forming the first and second walls,
creating
different indices of refraction for desired purposes. A filtering material can
be formed
around the lumen (40) to filter energy to and from the lumen (40) as depicted
in Figure
9. In one embodiment, the inner wall of the first wall (30) of each capillary
of the array,
or portion of the array, is coated with the filtering material. In another
embodiment, the
second wall (50) includes the filtering material. For instance, the second
wall (50) can
be formed of the filtering material, such as filter glass for example, or in
one exemplary
embodiment, the second wall (50) is EMA glass that is doped with an
appropriate
amount of filtering material. The filtering material can be formed of a color
other than
black and tuned for a desired excitation/emission filtering characteristic.
The filtering material allows transmission of excitation energy into the lumen
(40), and
blocks emission energy from the lumen (40) except through one or more openings
at
either end of the capillary (20). In Figure 9, excitation energy is
illustrated as a solid
line, while emission energy is indicated by a broken line. When the second
wall (50) is
formed with a filtering material as shown in Figure 9, certain wavelengths of
light
representing excitation energy are allowed through to the lumen (40), and
other
wavelengths of light representing emission energy are blocked from exiting,
except as
directed within and along the first wall (30). The entire capillary array, or
a portion
thereof, can be tuned to a specific individual wavelength or group of
wavelengths, for
filtering different bands of light in an excitation and detection process.
A particle (70) is depicted within the lumen (40). During use, an excitation
light is
light is
directed into the lumen (40) contacting the particle (70) and exciting a
reporter
fluorescent material causing emission of light. The emitted light travels the
length of the
capillary until it reaches a detector. One advantage of an embodiment of the
present
invention, where the second wall (50) is black EMA glass, is that the
emitted light
cannot cross contaminate adjacent capillary tubes in a capillary array. In
addition, the
black EMA glass refracts and directs the emitted light towards either end of
the capillary
tube thus increasing the signal detected by an optical detector (e.g., a CCD
camera and
the like).
In a detection process using a capillary array of the invention, an optical
detection
system is aligned with the array, which is then scanned for one or more bright
spots,
representing either a fluorescence or luminescence associated with a
"positive." The
term "positive" refers to the presence of an activity of interest. Again, the
activity can be
a chemical event, or a biological event.
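The scan for bright spots can be illustrated with a simple intensity threshold. The Python sketch below assumes the detector output has already been reduced to a two-dimensional grid with one intensity value per capillary, and that a fixed threshold separates positives from background; both are simplifying assumptions, and the function name `find_positives` is hypothetical.

```python
def find_positives(intensities, threshold):
    """Return (row, column) positions whose intensity exceeds the threshold.

    `intensities` is a list of rows, each row a list of numbers, with one
    value per capillary in the array.
    """
    return [
        (r, c)
        for r, row in enumerate(intensities)
        for c, value in enumerate(row)
        if value > threshold
    ]


# Illustrative readings only; not data from the specification.
readings = [
    [12, 15, 11],
    [14, 240, 13],
    [10, 12, 198],
]
positives = find_positives(readings, threshold=100)
```

In practice the threshold would be set relative to the measured background of the array rather than fixed in advance.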
Figure 10 depicts a general method of sample screening using a capillary array
(10)
according to the invention. In this depiction, capillary array (10) is
immersed or
contacted with a container (100) containing particles of interest. The
particles can be
cells, clones, molecules or compounds suspended in a liquid. The liquid is
wicked into
the capillary tubes by capillary action. The natural wicking that occurs as a
result of
capillary forces obviates the need for pumping equipment and liquid
dispensers. A
substrate for measuring biological activity (e.g., enzyme activity) can be
contacted with
the particles either before or after introduction of the particles into the
capillaries in the
capillary array. The substrate can include clones of a cell of interest, for
example. The
substrate can be introduced simultaneously into the capillaries by placing an
open end of
the capillaries in the container (100) containing a mixture of the particle-
bearing liquid
and the substrate. In some embodiments, it is a goal to achieve a certain
concentration of
particles of interest. A particular concentration of particles may also be
achieved by
dilution. Figures 13A-C show one such process, which is described below.
Alternatively, the particle-bearing liquid may be wicked a portion of the way
into the
capillaries, and then the substrate is wicked into a remaining portion of the
capillaries.
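When a particular concentration is reached by dilution, the number of particles that end up in each capillary can be estimated statistically. The Python sketch below assumes particles are distributed independently and at random in the liquid, so that the count per capillary follows a Poisson distribution; that model is an assumption introduced here for illustration and is not stated in the specification. The function names are hypothetical.

```python
import math


def mean_particles_per_capillary(concentration_per_ul, volume_ul):
    """Expected number of particles wicked into one capillary."""
    return concentration_per_ul * volume_ul


def poisson_probability(k, mean):
    """Probability that a capillary holds exactly k particles (Poisson model)."""
    if k < 0 or mean < 0:
        raise ValueError("k and mean must be non-negative")
    return math.exp(-mean) * mean ** k / math.factorial(k)


def fraction_occupied(mean):
    """Fraction of capillaries expected to hold at least one particle."""
    return 1.0 - poisson_probability(0, mean)
```

For example, with the approximately 0.3 µl capillary described earlier, a suspension diluted to about 3.3 particles per µl would give a mean of roughly one particle per capillary under this model.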
The mixture in the capillaries can then be incubated for producing a desired
activity.
The incubation can be for a specific period of time and at an appropriate
temperature
necessary for cell growth, for example, or to allow the substrate to
permeabilize the cell
membrane to produce an optically detectable signal, or for a period of time
and at a
temperature for optimum enzymatic activity. The incubation can be performed,
for
example, by placing the capillary array in a humidified incubator or in an
apparatus
containing a water source to ensure reduced evaporation within the capillary
tubes.
Evaporative loss may be reduced by increasing the relative humidity (e.g., by
placing the
capillary array in a humidified chamber). The evaporation rate can also be
reduced by
capping the capillaries with an oil, wax, membrane or the like. Alternatively,
a high
molecular weight fluid such as various alcohols, or molecules capable of
forming a
molecular monolayer, bilayers or other thin films (e.g., fatty acids), or
various oils (e.g.,
mineral oil) can be used to reduce evaporation.
Figure 11 illustrates a method for incubating a substrate solution containing
cells of
cells of
interest. While only a single capillary (20) is shown in Figure 11 for
simplicity, it should
be understood that the incubation method applies to a capillary array having a
plurality
of capillaries (20). In accordance with one embodiment, a first fluid is
wicked into the
capillary (20) according to methods described above. The capillary (20)
containing the
substrate solution and cells (32) is then introduced to a fluid bath (70)
containing a
second liquid (72). The second liquid may or may not be the same as the first.
For
instance, the first liquid may contain particles (32) from which an activity
is screened.
The particles (32) are suspended in liquid within the lumen (40), and
gradually migrate
toward the top of the lumen (40) in the direction of the flow of liquid
through the
capillary (20) due to evaporation. The width of the lumen (40) at the open end
of the
capillary (20) is sized to provide a particular surface area of liquid at the
top of the lumen
(40), for controlling the amount and rate of evaporation of the liquid
mixture. By
controlling the environment (68) near the non-submersed end of the capillary
(20), the
first liquid from within the capillary (20) will evaporate, and will be
replenished by the
second liquid (72) from the fluid bath (70).
The amount of evaporation is balanced against possible diffusion of the
contents of the
capillary (20) into the liquid (72), and against possible mechanical mixing of
the
capillary contents with the liquid (72) due to vibration and pressure changes.
The greater
the width of the lumen (40), the larger the amount of mechanical mixing.
Therefore, the
temperature and humidity level in the surrounding environment may be adjusted
to
produce the desired evaporative cycle, and the lumen (40) width is sized to
minimize
mechanical mixing, in addition to producing a desired evaporation rate. The
non-submersed open end of the capillary (20) may also be capped to create a vacuum
force
for holding the capillary contents within the capillary, and minimizing
mechanical
mixing and diffusion of the contents within the liquid (72). However, when
capped, the
capillary (20) will not experience evaporation.
The liquid (72) can be supplemented with nutrients (74) to support a greater
likelihood
or rate of activity of the particles (32). For example, oxygen can be added to
the liquid
to nourish cells or to optimize the incubation environment of the cells. In
another
example, the liquid (72) can contain a substrate or a recombinant clone, or a
developer
for the particles (32). The cells can be optimally cultured by controlling the
amount and
rate of evaporation. For instance, by decreasing relative humidity of the
environment
(68), evaporation from the lumen (40) is increased, thereby increasing a rate
of flow of
liquid (72) through the capillary (20). Another advantage of this method is
the ability to
control conditions within the capillary (20) and the environment (68) that are
not
otherwise possible.
A relatively high humidity level of the environment will slow the rate of
evaporation and
keep more liquid within the capillary (20). If a temperature differential
exists between a
capillary array (10) and its environment, however, condensation can form on or
near the
ends of tightly packed capillaries of the capillary array. Figure 12A shows a
portion of a
capillary array (10) of the invention, to depict a situation in which a
condensation bead
(80) forms on the outer edge surface of several capillary walls (30), creating
a potential
conduit or bridge for "crosstalk" of matter between adjacent capillary tubes
(20). The
outer edge surface of the capillary walls (30) is preferably a planar surface.
In an
embodiment in which the wall (30) of the capillary (20) is glass, the outer
edge surface
of the capillary wall (30) can be polished glass.
In order to minimize the effects of such condensation, a hydrophobic coating
(35) is
provided over the outer edge surface of the capillary walls (30), as depicted
in Figure
12B. The coating (35) reduces the tendency for water or other liquid to
accumulate near
the outer edge surface of the capillary wall (30). Condensation will form
either as
smaller beads (82), be repelled from the surface of the capillary array, or
form entirely
over an opening to the lumen (40). In the latter case, the condensation bead
(80) can
form a cap to the capillary (20). In one embodiment, the hydrophobic coating
(35) is
TEFLON. In one configuration, the coating (35) covers only the outer edge
surfaces of
the capillary walls (30). In another configuration, the coating (35) can be
formed over
both the interstitial material (60) and the outer edge surfaces of the
capillary walls (30).
Another advantage of a hydrophobic coating (35) over the outer edge surface of
the
capillary tubes is that, during the initial wicking process, some fluidic material
in the form of
in the form of
droplets will tend to stick to the surface in which the fluid is introduced.
Therefore, the
coating (35) minimizes extraneous fluid from forming on the surface of a
capillary array
(10), dispensing with a need to shake or knock the extraneous fluid from the
surface.
In some instances, it is necessary to have more than one component in a
capillary that are
not premixed, and which can be later combined by dilution or mixing. Figures
13A-C
show a dilution process that may be used to achieve a particular concentration
of
particles. In one embodiment employing dilution, a bolus of a first component
(82) is
wicked into a capillary (20) by capillary action until only a portion of the
capillary (20)
is filled. In one particular embodiment, pressure is applied at one end of the
capillary
(20) to prevent the first component from wicking into the entire capillary
(20). The end
(21) of the capillary may be completely or partially capped to provide the
pressure.
An amount of air (84) is then introduced into the capillary adjacent the first
component.
The air (84) can be introduced by any number of processes. One such process
includes
moving the first component (82) in one direction within the capillary until a
suitable
amount of the air (84) is introduced behind the first component (82). Further
movement
of the first component (82) by a pulling and/or pushing pressure causes a
piston-like
action by the first component (82) on the air.
The capillary (20) or capillary array is then contacted to a second component
(86). The
second component (86) is preferably pulled into the capillary (20) by the
piston-like
action created by movement of the first component (82), until a suitable
amount of the
second component (86) is provided in the capillary, separated from the first
component
by the air (84). One of the first or second components may contain one or more
particles
of interest, and the other of the components may be a developer of the
particles for
causing an activity of interest. The capillary or capillary array can then be
incubated for
a period of time to allow the first and second components to reach an optimal
temperature, or for a sufficient time to allow cell growth for example. The
air-bubble
separating the two components can be disrupted in order to mix the two
components together and initialize the desired activity. Pressure can be
applied to
collapse the bubble. In one example, the mixture of the first and second
components
starts an enzymatic activity to achieve a multi-component assay.
Paramagnetic beads contained within a capillary (20) can be used to disrupt
the air
bubble and/or mix the contents of the capillary (20) or capillary array (10).
For example,
Figure 14A and 9B depict an embodiment of the invention in which paramagnetic
beads
are magnetically moved from one location to another location. The paramagnetic
beads
are attracted by magnetic fields applied in proximity to the capillary or
capillary array.
By alternating or adjusting the location of the magnetic field with respect to
each
capillary, the paramagnetic beads will move within each capillary to mix the
liquid
therein. Mixing the liquid can improve cell growth by increasing aeration of
the cells.
The method also improves consistency and detectability of the liquid sample
among the
capillaries.
In another embodiment, a method of forming a multi-component assay includes
providing one or more capsules of a second component within a first component.
The
second component capsules can have an outer layer of a substance that melts or
dissolves
at a predetermined temperature, thereby releasing the second component into
the first
component and combining particles among the components. A thermally activated
enzyme may be used to dissolve the outer layer substance. Alternatively, a
"release on
command" mechanism that is configured to release the second component upon a
predetermined event or condition may also be used.
In another embodiment, recombinant clones containing a reporter construct or a
substrate
are wicked into the capillary tubes of the capillary array. In this
embodiment, it is not
necessary to add a substrate as the reporter construct or substrate contained
in the clone
can be readily detected using techniques known in the art. For example, a
clone
containing a reporter construct such as green fluorescent protein can be
detected by
exposing the clone or substrate within the clone to a wavelength of light that
induces
fluorescence. Such reporter constructs can be implemented to respond to
various culture
conditions or upon exposure to various physical stimuli (including light and
heat). In
addition, various compounds can be screened in a sample using similar
techniques. For
example, a compound detectably labeled with a fluorescent molecule can be
readily
detected within a capillary tube of a capillary array.
In yet another embodiment, instead of dilution, a fluorescence-activated cell
sorter
(FACS) is used to separate and isolate clones for delivery into the capillary
array. In
accordance with this embodiment, one or more clones per capillary tube can be
precisely
achieved. In yet another embodiment, cells within a capillary are subjected to
a lysis
process. A chemical is introduced within one of the components to cause a
lysis process
where the cells burst.
Some assays may require an exchange of media within the capillary. In a media
exchange process, a first liquid containing the particles is wicked into a
capillary. The
first liquid is removed, and replaced with a second liquid while the particles
remain
suspended within the capillary. Addition of the second liquid to the capillary
and contact
with the particles can initialize an activity, such as an assay, for example.
The media
exchange process may include a mechanism by which the particles in the
capillary are
physically maintained in the capillary while the first liquid is removed. In
one
embodiment, the inner walls of the capillary array are coated with antibodies
to which
cells bind. Then, the first liquid is removed, while the cells remain bound to
the
antibodies, and the second liquid is wicked into the capillary. The second
liquid could
be adapted to cause the cells to unbind if desirable. In an alternative
embodiment, one or
more walls of the capillary can be magnetized. The particles are also
magnetized and
attracted to the walls. In still another embodiment, magnetized particles are
attracted and
held against one side of the capillary upon application of a magnetic field
near that side.
The capillary array is analyzed for identification of capillaries having a
detectable signal,
such as an optical signal (e.g., fluorescence), by a detector capable of
detecting a change
in light production or light transmission, for example. Detection may be
performed
using an illumination source that provides fluorescence excitation to each of
the
capillaries in the array, and a photodetector that detects resulting emission
from the
fluorescence excitation. Suitable illumination sources include, without
limitation, a
laser, incandescent bulb, light emitting diode (LED), arc discharge, or
photomultiplier
tube. Suitable photodetectors include, without limitation, a photodiode array,
a charge-coupled
device (CCD), or charge injection device (CID).
In one embodiment, shown with reference to Figure 15, a detection system
includes a
laser source (82) that produces a laser beam (84). The laser beam (84) is
directed into a
beam expander (85) configured to produce a wider or less divergent beam (86)
for
exciting the array of capillaries (20). Suitable laser sources include argon
or ion lasers.
For this embodiment, a cooled CCD can be used.
The light generated by, for example, enzymatic activation of a fluorescent
substrate is
detected by an appropriate light detector or detectors positioned adjacent to
the apparatus
of the invention. The light detector may be, for example, film, a
photomultiplier tube,
photodiode, avalanche photo diode, CCD or other light detector or camera. The
light
detector may be a single detector to detect sequential emissions, such as a
scanning laser.
Or, the light detector may include a plurality of separate detectors to detect
and spatially
resolve simultaneous emissions at single or multiple wavelengths of emitted
light. The
light emitted and detected may be visible light or may be emitted as non-
visible radiation
such as infrared or ultraviolet radiation. A thermal detector may be used to
detect an
infrared emission. The detector or detectors may be stationary or movable.
Illumination can be channeled to particles of interest within the array by
means of lenses,
mirrors and fiber optic light guides or light conduits (single, multiple,
fixed, or
moveable) positioned on or adjacent to at least one surface of the capillary
array. A
detectable signal, such as emitted light or other radiation, may also be
channeled to the
detector or detectors by the use of such mechanisms.
The photodetector preferably comprises a CCD, CID or an array of photodiode
elements.
Detection of a position of one or more capillaries having an optical signal
can then be
determined from the optical input from each element. Alternatively, the array
may be
scanned by a scanning confocal or phase-contrast fluorescence microscope or
the like,
where the array is, for example, carried on a movable stage for movement in an
X-Y plane
as the capillaries in the array are successively aligned with the beam to
determine the
capillary array positions at which an optical signal is detected. A CCD camera
or the
like can be used in conjunction with the microscope. The detection system is
preferably
computer-automated for rapid screening and recovery. In a preferred
embodiment, the
system uses a telecentric lens for detection. The magnification of the lens
can be
adjusted to focus on a subset of capillaries in the capillary array. At one
extreme, for
instance, the detection system can have a 1:1 correlation of pixels to
capillaries. Upon
detecting a signal, the focus can be adjusted to determine other properties of
the signal.
Having more pixels per capillary allows for subsequent image processing of the
signal.
Where a chromogenic substrate is used, the change in the absorbance spectrum
can be
measured, such as by using a spectrophotometer or the like. Such measurements
are
usually difficult when dealing with a low-volume liquid because the optical
path length
is short. However, the capillary approach of the present invention permits
small
volumes of liquid to have long optical path lengths (e.g., longitudinally
along the
capillary tube), thereby providing the ability to measure absorbance changes
using
conventional techniques.
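The path-length advantage can be illustrated with the Beer-Lambert law, A = epsilon x c x l. The following is a minimal sketch only. The molar absorptivity and concentration are hypothetical values chosen for illustration, and the two path lengths assume a capillary 200 micrometers in diameter and 8 mm long, read across and along its axis respectively.

```python
def absorbance(epsilon_per_molar_cm, conc_molar, path_cm):
    """Beer-Lambert law: A = epsilon * c * l."""
    return epsilon_per_molar_cm * conc_molar * path_cm

EPSILON = 15000.0   # hypothetical molar absorptivity, M^-1 cm^-1
CONC = 10e-6        # hypothetical product concentration, 10 micromolar

across = absorbance(EPSILON, CONC, 0.02)   # 0.2 mm path, across the capillary
along = absorbance(EPSILON, CONC, 0.8)     # 8 mm path, along the capillary axis

print(f"across: A = {across:.4f}")
print(f"along:  A = {along:.4f}")
print(f"gain:   {along / across:.0f}x")
```

For the same liquid, the absorbance signal scales directly with the path length, so reading along the tube rather than across it increases the signal by the ratio of length to diameter.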
A fluid within a capillary will usually form a meniscus at each end. Any light
entering
the capillary will be deflected toward the wall, except for paraxial rays,
which enter the
meniscus curvature at its center. The paraxial rays create a small bright spot
in the middle of the
capillary, representing the small amount of light that makes it through.
Measurement of
the bright spot provides an opportunity to measure how much light is being
absorbed on
its way through. In a preferred embodiment, a detection system includes the
use of two
different wavelengths. A ratio between a first and a second wavelength
indicates how
much light is absorbed in the capillary. Alternatively, two images of the
capillary can be
taken, and a difference between them can be used to ascertain a differential
absorbance
of a chemical within the capillary.
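As an illustration of the two-wavelength approach, the sketch below computes a differential absorbance from bright-spot intensities measured at two wavelengths, one assumed to be absorbed by the chemical of interest and one assumed not to be. The function name, the optional blank-capillary correction and all intensity values are assumptions made for this example and are not part of the described system.

```python
import math

def differential_absorbance(i_signal, i_reference,
                            blank_signal=1.0, blank_reference=1.0):
    """Absorbance at the signal wavelength relative to the reference wavelength.

    The ratio i_signal / i_reference cancels losses common to both wavelengths
    (for example, light deflected by the meniscus). Dividing by the same ratio
    measured in a blank capillary removes source and detector differences.
    """
    ratio = (i_signal / i_reference) / (blank_signal / blank_reference)
    return -math.log10(ratio)

a = differential_absorbance(i_signal=420.0, i_reference=840.0,
                            blank_signal=800.0, blank_reference=800.0)
print(f"differential absorbance = {a:.3f}")
```

Because only the ratio enters the result, the measurement does not depend on how much light passes through the bright spot in absolute terms.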
In absorbance detection, only light in the center of the lumen can travel
through the
capillary. However, if at least one meniscus is flattened, the optical
efficiency is
improved. The meniscus can be kept flat under a number of circumstances, such
as
during a continuous cycle of evaporation, discussed above with reference to
Figure 11.
In that embodiment, the fluid bath can be contained in a clear, light-passing
container,
and the light source can be directed through the fluid bath into the
capillary.
In another embodiment, bioactivity or a biomolecule or compound is detected by
using
various electromagnetic detection devices, including, for example, optical,
magnetic and
thermal detection. In yet another embodiment, radioactivity can be detected
within a
capillary tube using detection methods known in the art. The radiation can be
detected at
either end of the capillary tube.
Other detection modes include, without limitation, luminescence, fluorescence
polarization, and time-resolved fluorescence. Luminescence detection includes
detecting
emitted light that is produced by a chemical or physiological process
associated with a
sample molecule or cell. Fluorescence polarization detection includes
excitation of the
contents of the lumen with polarized light. In such an environment, a
fluorophore emits
polarized light for a particular molecule. However, the emitting molecule can
be moving
and changing its angle of orientation, and the polarized light emission could
become
random.
Time-resolved fluorescence includes reading the fluorescence at a
predetermined time
after excitation. For a relatively long-life fluorophore, the molecule is
flashed with
excitation energy, which produces emissions from the fluorophore as well as
from other
particles within the substrate. Emissions from the other particles cause
background
fluorescence. The background fluorescence normally has a short lifetime
relative to the
long-life emission from the fluorophore. The emission is read after excitation
is
complete, at a time when the short-lifetime background fluorescence has
usually decayed, and
during a time in which the long-life fluorophores continue to fluoresce.
Time-resolved
fluorescence is therefore a technique for suppressing background fluorescent
activity.
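The effect of delaying the read can be sketched by modeling each emission as a single exponential decay. The lifetimes and the read delay below are hypothetical values chosen only to show the principle. They are not measurements from the described system.

```python
import math

def remaining_fraction(lifetime_s, delay_s):
    """Fraction of a single-exponential emission still present after a delay."""
    return math.exp(-delay_s / lifetime_s)

BACKGROUND_LIFETIME = 10e-9     # hypothetical short-lived background, 10 ns
FLUOROPHORE_LIFETIME = 500e-6   # hypothetical long-life fluorophore, 500 us
DELAY = 1e-6                    # hypothetical read delay after excitation, 1 us

background = remaining_fraction(BACKGROUND_LIFETIME, DELAY)
signal = remaining_fraction(FLUOROPHORE_LIFETIME, DELAY)

print(f"background remaining: {background:.2e}")
print(f"signal remaining:     {signal:.4f}")
```

With these assumed values the background has decayed to a negligible level by the time of the read, while nearly all of the long-life emission is still available for detection.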
Recovery of putative hits (cells or clones producing a detectable or optical
signal) can be
facilitated by using position feedback from the detection system to automate
positioning
of a recovery device (e.g., a needle, pipette tip or capillary tube). Figure 16
shows an
example of a recovery system (100) of the invention. In this example, a needle
(105) is
selected and connected to recovery mechanism (106). A support table (102)
supports a
capillary array (10) and a light source (104). The light source is used with a
camera
assembly (110) to find an X, Y and Z coordinate location of a needle (105)
connected to
the recovery mechanism (106). The support table is moved relative to the
capillary array
in the X and Y axes, in order to place the capillary array (10) underneath the
needle
(105), where the capillary array (10) contains a "hit." According to various
embodiments, each section of a recovery system can be moved or kept
stationary.
The recovery mechanism (106) then provides a needle (105) to a capillary
containing a
"hit" by overlapping the tip of the needle (105) with the capillary containing
the "hit," in
the Z direction, until the tip of the needle engages the capillary opening. In
order to
avoid damage to the capillary itself, the needle may be attached to a spring or
be of a
material that flexes. Once in contact with the opening of the capillary, the
sample can be
aspirated or expelled from the capillary. Alternatively, the capillary array
may be moved
relative to a stationary needle (105), or both moved.
In a specific exemplary embodiment of a recovery technique, a single camera is
used for
determining a location of a recovery tool, such as the tip of a needle, in the
Z-plane. The
Z-plane determination can be accomplished using an auto-focus algorithm, or
proximity
sensor used in conjunction with the camera. Once the proximity of the recovery
tool in
Z is known, an image processing function can be executed to determine a
precise
location of the recovery tool in X and Y. In one embodiment, the recovery tool
is back-
lit to aid the image processing. Once the X and Y coordinate locations are
known, the
capillary array can be moved in X and Y relative to the precise location of
the recovery
tool, which can be moved along the Z axis for coupling with a target
capillary.
In an alternative specific embodiment of a recovery technique, two or more
cameras are
used for determining a location of the recovery tool. For instance, a first
camera can
determine X and Z coordinate locations of the recovery tool, such as the X, Z
location of
a needle tip. A second camera can determine Y and Z coordinate locations of
the
recovery tool. The two sets of coordinates can then be multiplexed for a
complete X,Y,Z
coordinate location. Next, the movement of the capillary array relative to the
recovery
tool can be executed substantially as above.
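One way the two partial coordinate sets might be combined is sketched below. The function name, the tolerance value and the choice to average the two Z readings are assumptions made for illustration, since the description does not specify how the coordinate sets are merged.

```python
def combine_views(xz, yz, z_tolerance=0.05):
    """Merge an (x, z) fix from camera 1 and a (y, z) fix from camera 2.

    Both cameras observe Z, so the two Z readings are checked for agreement
    and averaged. Units are whatever the stage uses (millimeters here).
    """
    x, z1 = xz
    y, z2 = yz
    if abs(z1 - z2) > z_tolerance:
        raise ValueError("cameras disagree on Z; recalibrate or re-image")
    return (x, y, (z1 + z2) / 2.0)

# Hypothetical needle-tip readings from the two cameras.
tip = combine_views((12.40, 3.02), (7.85, 3.00))
print(tip)
```

The shared Z reading also acts as a consistency check: a large disagreement between the cameras suggests a calibration problem before any needle movement is attempted.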
The sample can be expelled by, for example, injecting a blast of inert gas or
fluid into the
capillary and collecting the ejected sample in a collection device at the
opposite end of
the capillary. The diameter of the collection device can be larger than or
equal to the
diameter of the capillary. The collected sample can then be further processed
by, for
example, extracting polynucleotides, proteins or by growing the clone in
culture.
In another embodiment, the sample is aspirated by use of a vacuum. In this
embodiment, the needle contacts, or nearly contacts, the capillary opening and
the
sample is "vacuumed" or aspirated from the capillary tube onto or into a
collection
device. The collection device may be a microfuge tube or a filter located
proximal to the
opening of the needle, as depicted in Figure 17A-D. Figure 17D shows further
processing of a sample collected onto a filter following aspiration of the
sample from the
capillary. The sample includes particles, such as cells, proteins, or nucleic
acids, which
when present on the filter, can be delivered into a collection device.
Suitable collection
devices include a microfuge tube, a capillary tube, microtiter plate, cell
culture plate, and
the like. The delivery of the sample can be accomplished by forcing another
media, air
or other fluid through the filter in the reverse direction.
The sample can also be expelled from a capillary by a sample ejector. In one
embodiment, the ejector is a jet system where sample fluid at one end of the
capillary
tube is subjected to a high temperature, causing fluid at the other end of the
capillary
tube to eject out. The heating of fluid can be accomplished mechanically, by
applying a
heated probe directly into one end of a capillary tube. The heated probe
preferably seals
the one end, heats fluid in contact with the probe, and expels fluid out the
other end of
the capillary tube. The heating and expulsion may also be accomplished
electronically.
For instance, in an embodiment of the jet system, at least one wall of a
capillary tube is
metalized. A heating element is placed in direct contact with one end of the
wall. The
heating element may completely close off the one end, or partially close the
one end.
The heating element charges up the metalized wall, which generates heat within
the
fluid. The heating element can be an electricity source, such as a voltage
source, or a
current source. In still yet another embodiment of a jet system, a laser
applies heat
pulses to the fluid at one end of the capillary tube.
Other systems for expelling fluid from a capillary tube of the invention are
possible. An
electric field may be created in or near the fluid to create an
electrophoretic reaction,
which causes the fluid to move according to electromotive force created by the
electric
field. An electromagnetic field may also be used. In one embodiment, one or
more
capillaries contain, in addition to the fluid, magnetically charged particles
to help move
the fluid or magnetized particles out of the capillary array.
Each capillary of an array of capillaries is individually addressable, i.e.
the contents of
each well can be ascertained during screening. In one embodiment, a
quantum-dot-tagged microbead method and arrangement is used. In such a method and
arrangement,
tens of thousands of unique fluorescent codes can be generated. The assay of
interest is
attached to a coded bead, and multi-spectral imaging is used to measure both
the assay
and the beads/codes simultaneously. There will always be some capillaries that
get
multiple beads and some that get none.
For an array which contains approximately 100,000 capillaries, one approach is
to fill the
100,000 capillaries of the array with a solution that contains 10 copies of
10,000
different coded beads (or 5 copies of 20,000 codes). Under normal conditions,
simple
statistical analysis can be used to determine which of the wells have single
beads and
maybe even the contents of every well. The chance of having any two beads
together in
a well more than 5 times on any one capillary array platform is negligibly
small.
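The bead occupancy implied by this loading scheme can be estimated with a simple model. The sketch below assumes that every bead ends up in a capillary and that beads distribute independently and uniformly, so that the number of beads per capillary is approximately Poisson distributed. These are modeling assumptions and are not statements from this description.

```python
import math

def poisson_pmf(k, mean):
    """Probability of exactly k beads in a capillary under a Poisson model."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

CAPILLARIES = 100_000
CODES = 10_000
COPIES_PER_CODE = 10

mean_beads = CODES * COPIES_PER_CODE / CAPILLARIES   # 1.0 bead per capillary
p_empty = poisson_pmf(0, mean_beads)
p_single = poisson_pmf(1, mean_beads)
p_multiple = 1.0 - p_empty - p_single

print(f"empty:    {p_empty:.3f}")
print(f"single:   {p_single:.3f}")
print(f"multiple: {p_multiple:.3f}")
```

Under these assumptions roughly a third of the capillaries hold exactly one bead, which is consistent with the observation that some capillaries always receive multiple beads and some receive none.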
An advantage of the quantum-dots method is that only a single excitation band
is
needed. This allows a lot of flexibility for the assay (i.e. it can use a
different excitation
band). Magnetic-coded beads may also be used to add another dimension to the
assay
detection. A multi-spectral imaging system can then be used. Alternatively, a
neural
network application can be utilized for spectral decomposition.
The myriad of microbes inhabiting this planet represent a tremendous
repository of
biomolecules for pharmaceutical, agricultural, industrial and chemical
applications. The
great majority of these microbes, estimated at near 99.5%, have remained
uncultured by
modern microbiological methods due in large part to the complex chemistries
and
environmental variables encountered in extreme or unusual biotopes. Taking
advantage
of enzymes catalyzing chemical reactions in novel pathways and evolved to
function
under environmental extremes is of great industrial significance. This
invention
provides technologies to extract, optimize and commercialize this robust
catalytic
diversity, within culture-independent, recombinant approaches for the
discovery of novel
enzymes and biosynthetic pathways by tapping into the biodiversity present in
nature.
Large, complex (>10^9 member) gene libraries are constructed by direct
isolation of
DNA from selected microenvironments around the world. These libraries are then
expressed in various host systems and subjected to high throughput screens
specific for
an activity of interest. Because in excess of 5000 different microbial genomes
may be
present in a single DNA library, ultra high throughput methods are required to
effectively screen this diversity and are crucial to the success of this
culture-
independent, recombinant strategy.
Conventional screening methods include liquid phase, microtiter plate based
assays. The
format for liquid phase assays is often robotically manipulated 96, 384, or
1536-well
microtiter plates. Although these microtiter plate based screening
technologies are being
used successfully, limitations do exist. The primary limitation is throughput
as these
techniques generally allow the screening of only about 10^5 to 10^6
clones/day/instrument.
For example, a typical screen of 100,000 wells on a microtiter-based HTS
system
requires 261 384-well microtiter plates and over 24 hours of equipment time.
However,
while 1536-well or greater plate formats are growing in popularity, the
majority of
companies involved in HTS continue to use 384-well plates, as this technology
is
reliable and standardized. While these throughputs may be more than sufficient
for
screening isolate and low-complexity libraries, it could take more than a year
to
thoroughly screen one complex gene library. Clearly, higher throughput
screening
technology is necessary.
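The throughput limitation can be checked with simple arithmetic. The sketch below computes the number of 384-well plates needed for 100,000 wells, and the instrument time needed to screen each member of a library once at the quoted rates of 10^5 to 10^6 clones per day. The library size of 10^9 members is taken as the order of magnitude given in this description for complex gene libraries. The function names are illustrative.

```python
import math

def plates_needed(wells, wells_per_plate):
    """Whole plates required to hold the given number of wells."""
    return math.ceil(wells / wells_per_plate)

def days_to_screen(library_size, clones_per_day):
    """Instrument-days needed to screen each library member once."""
    return library_size / clones_per_day

print(plates_needed(100_000, 384), "plates of 384 wells")
for rate in (1e5, 1e6):
    days = days_to_screen(1e9, rate)
    print(f"{rate:.0e} clones/day -> {days:,.0f} days ({days / 365:.1f} years)")
```

Even at the upper rate, a single instrument would need on the order of years for one pass through such a library.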
Other screening methods include growth selection (Snustad et al., 1988;
Lundberg et al.,
1993; Yano et al., 1998), colorimetric screening of bacterial colonies or
phage plaques
(Kuritz, 1999), in vitro expression cloning (King et al., 1997) and cell
surface or phage
display (Benhar, 2001). Each of these systems has limitations. Solid phase
colorimetric
plate screening of colonies or plaques is limited by relatively low
throughput. Even with
the use of microcolonies/plaques and automated imaging and clone recovery,
thorough
screening of complex libraries is impractical. Cell surface and/or phage
display
technologies suffer from structural limitations of the displayed molecule.
Often the size
and/or shape of the displayed molecule is restricted by the display
technology. One of
the highest throughput screening methods, growth selection, is also limited in
its scope
of usefulness. Assay conditions, temperature and pH, are limited by the growth
parameters of the host strain. Molecular interactions are often constrained by
the host
cell membranes and/or cell wall, as substrate must be presented to
intracellular enzymes.
In addition, "false positives" or a high level of "background" are a common
occurrence
in many selection assays. With respect to screening for improved variants in
GSSM or
GeneReassembly libraries, growth selection is seldom quantitative.
The invention provides screening platforms and methods for use with a
Fluorescence
Activated Cell Sorter (FACS). In FACS methodologies, cells are mixed with
substrates
and then streamed past a detector to screen for a positive molecular event.
This signal
could be a fluorescent signal resulting from the cleavage of an enzyme
substrate or a
specific binding event. The greatest advantage of the use of a FACS machine is
throughput; up to 10^9 clones can be screened/day. Unfortunately, FACS-based
screening also has limitations including cell wall permeability of enzymes and
substrates/products and incubation times and temperatures. In addition,
viability of host
cells post-sort and dependence on a single data point for each individual cell
further limit
such technologies.
The development of the capillary array overcomes many of these shortcomings.
Like
microtiter and solid phase screens, it combines the preservation of native
protein
conformation with increased signal strength of clonal amplification. The
throughput,
however, approaches that of selective assays and FACS-based assays. Moreover,
as
array plates are reusable, the amount of plastic waste generated is greatly
reduced.
Approximately 24 tons of plastic waste* is generated annually in screening
100,000
wells per day in a 96 well format (* Assuming 84g/plate x 1000 plates/day x
260
days/year). Further, a typical screen of 100,000 wells on a robotic high
throughput
screening system requires 261 384-well microtiter plates and over 24 hours of
equipment
time versus less than 10 minutes to process a single plate. The enhancement of
this
technology to densities of one million wells per plate is aimed at approaching
the
throughput of selective assays and FACS-based assays while retaining the
advantages of
a microtiter-based screen.
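The plastic waste estimate follows directly from the stated assumptions (84 g per plate, 1000 plates per day, 260 days per year). The sketch below reproduces the arithmetic. Reading "tons" as US short tons is an assumption made here, because that reading matches the quoted figure of approximately 24.

```python
GRAMS_PER_PLATE = 84
PLATES_PER_DAY = 1000
DAYS_PER_YEAR = 260

kilograms = GRAMS_PER_PLATE * PLATES_PER_DAY * DAYS_PER_YEAR / 1000
metric_tons = kilograms / 1000
short_tons = kilograms / 907.18474   # one US short ton is 2000 lb

print(f"{kilograms:,.0f} kg = {metric_tons:.1f} metric tons "
      f"= {short_tons:.1f} short tons")
```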
The first generation capillary array plates, which can be fabricated using
manufacturing
techniques originally developed for the fiber optics industry, currently
consist of 100,000
cylindrical compartments or wells contained within a 3.3" x 5" reusable plate,
the size of
an SBS (Society for Biomolecular Screening) standard 96 well microtiter plate.
These
wells are 200 µm in diameter (about the diameter of a human hair) and act as
discrete
250 nanoliter volume microenvironments in which isolated clones can be grown
and
screened.
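The 250 nanoliter figure can be checked by treating each well as a cylinder. The sketch below uses the 200 micrometer diameter given above together with the 8 mm well depth stated later in this description for the 100,000-well plates. The function name is illustrative.

```python
import math

def well_volume_nl(diameter_um, depth_mm):
    """Volume of a cylindrical well in nanoliters (1 mm^3 = 1000 nl)."""
    radius_mm = diameter_um / 1000.0 / 2.0
    return math.pi * radius_mm ** 2 * depth_mm * 1000.0

print(f"{well_volume_nl(200, 8):.0f} nl")
```

The same function can be applied to other well diameters and depths when comparing plate designs.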
The processes involved in array screening closely parallel those in microtiter
plate
screening, but with significant simplification in required instrumentation and
decrease in
plate storage capacity requirements and reagent costs. Briefly, the plates are
filled with
clones and reagents (e.g. fluorescent substrate, growth media, etc.) by
surface tension,
filling all 100,000 wells simultaneously within a few seconds without the need
for
complicated dispensing equipment. The number of clones per well, typically 1
to 10, is
adjusted by dilution of the cell culture. Once filled, the plates are then
incubated in a
humidity-controlled environment for 24 to 48 hours to allow for both clonal
amplification and enzymatic turnover.
After incubation in a humidified chamber, the plates are transferred to the
detection and
recovery station where fluorescence imaging is used to detect the expression
of bioactive
molecules. The automated detection and recovery system combines fluorescence
imaging and precision motion control technologies through the use of machine
vision
and image processing techniques. Images are generated by focusing light from a
broadband light source (e.g. metal halide arc lamp) onto the plate through a
set of
fluorescence excitation filters. The resulting fluorescence emission is
filtered then
imaged by a telecentric lens onto a high-resolution cooled CCD camera in an
epi-
fluorescent configuration. The plates are scanned to generate a total of 56
slightly
overlapping images in approximately one minute. The images are digitized and
processed on-the-fly to detect and locate positive wells or putative hits.
Putative hits
(clones that have converted the substrate to a fluorescent product) appear as
bright spots
on a dark background. They are distinguished from background fluorescence and
extraneous signals (typically due to dirt and dust) based on a variety of
feature
measurements such as their shape, size, and intensity profile.
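The kind of processing described here can be sketched as thresholding followed by grouping of bright pixels and rejection of implausible groups. The example below is a simplified illustration and is not the detection software of the described system. A size filter alone stands in for the shape, size and intensity-profile measurements, and the function name, threshold and size limits are assumptions.

```python
def find_hits(image, threshold, min_pixels=2, max_pixels=9):
    """Return (row, col, size, peak) for each bright blob of plausible size.

    image is a list of rows of intensities. Pixels above threshold are grouped
    into 4-connected blobs; blobs outside the size limits are rejected as
    noise (too small) or debris (too large).
    """
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    hits = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] <= threshold:
                continue
            # Flood fill one blob starting from this pixel.
            stack, blob = [(r, c)], []
            seen[r][c] = True
            while stack:
                y, x = stack.pop()
                blob.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and image[ny][nx] > threshold):
                        seen[ny][nx] = True
                        stack.append((ny, nx))
            if min_pixels <= len(blob) <= max_pixels:
                cy = sum(y for y, _ in blob) / len(blob)
                cx = sum(x for _, x in blob) / len(blob)
                peak = max(image[y][x] for y, x in blob)
                hits.append((cy, cx, len(blob), peak))
    return hits

# Hypothetical 4 x 6 frame: one 2 x 2 bright spot and one isolated bright pixel.
frame = [
    [5, 5, 5, 5, 5, 5],
    [5, 90, 95, 5, 5, 5],
    [5, 92, 97, 5, 5, 80],
    [5, 5, 5, 5, 5, 5],
]
print(find_hits(frame, threshold=50))
```

In this frame the 2 x 2 spot is reported with its centroid, size and peak intensity, while the isolated bright pixel is discarded as too small to be a well.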
Once detected and located, putative hits are recovered from the array plate
and
transferred to a standard microtiter plate for confirmation and secondary
screening. The
process of recovery consists of 1) mounting and locating a sterile recovery
needle
(typically a standard blunt end stainless steel needle commonly used for
dispensing
adhesives for mounting miniature surface mount electronic components), 2)
aligning the
recovery needle to the well containing the putative hit, 3) aspirating the
contents of the
well into the needle (which has an attached 0.22 micron filter to avoid upstream
contamination and losing the sample), 4) flushing the well contents into a
standard
microtiter plate with an appropriate media, and finally 5) stripping off the
recovery
needle in preparation for the next recovery. Closed loop positioning with
image-based
feedback provides the positional accuracy required to allow aspiration of
individual
wells without contamination from neighboring wells. Finally, after the clones
of interest
have been recovered, the used plates are cleaned, sterilized, and prepared for
re-use. The
array platform according to the invention will accelerate the discovery and
development
of commercial products as well as enable the development of products that
would
otherwise be unobtainable.
This invention is configured for use with a Fluorescence Activated Cell Sorter
(FACS).
In FACS methodologies, cells are mixed with substrates and then streamed past
a
detector to screen for a positive molecular event. This signal could be a
fluorescent
signal resulting from the cleavage of an enzyme substrate or a specific
binding event.
The greatest advantage of the use of a FACS machine is throughput; up to 10^9
clones
can be screened/day. Unfortunately, FACS-based screening also has limitations
including cell wall permeability of enzymes and substrates/products and
incubation
times and temperatures. In addition, viability of host cells post-sort and
dependence on a
single data point for each individual cell further limit such technologies.
The well diameter, plate thickness (well depth), and material optical
properties will be
specified prior to fabricating the new 1,000,000-well density matrices. Once
these
parameters are specified, high density matrices will be fabricated in
rectangular pieces
approximately 1 cm square. The process entails a low-risk modification to the
same
basic fabrication technique that is used to make the 100,000 well plates. The
array
density can be calculated by using the following formula:
$$\#\mathrm{WellsPerPlate} = \frac{2\,(\mathrm{PlateLength} \times \mathrm{PlateWidth})}{(\mathrm{WellDiameter} + \mathrm{WellSeparationWall})^{2}}$$
This calculation reveals that in order to achieve 1,000,000 wells in the
standard 3.3" x 5"
microtiter plate format, the new wells will need to have a diameter of
approximately 70
µm with 25 µm separating walls. Structures of this size/density and smaller
(down to
6 µm) are commonly manufactured for non-biological uses including micro-channel
faceplates for intensified CCD cameras, X-ray scintillation plates, optical
collimators, as
well as simple fluid filters.
There are some limitations to the depth of the wells due to the nature of the
fabrication
process. The current 100,000-well plates have 8 mm deep wells. Based on our
experience with structures of similar size, it is estimated that the depth of
the 70 µm wells
will be between 5 mm and 8 mm. This yields a well volume of approximately 25 nl
to
30 nl or approximately 1/10th of that of the 200 µm diameter wells. Evaporation
rate is a
function of the surface area to volume ratio rather than the total volume.
For this reason
it is anticipated that the 70 µm wells will experience comparable (if not less)
evaporation
than the 200 µm well due to a more favorable length to diameter (volume to
surface area)
ratio. Evaporation is currently not a problem with the 200 µm diameter wells.
Samples will be constructed from both transparent and opaque materials to
evaluate
illumination efficiencies, well-to-well optical crosstalk, surface-finish
effects, and
background fluorescence. The current 100,000-well plates use an opaque
material. The
use of transparent materials improves the efficiency of fluorescence
excitation at the
expense of increased well-to-well optical crosstalk. For assays with low hit
rates, the
tradeoff may favor the use of transparent materials to improve detection
sensitivity. We
estimate that the specification and manufacturing process will take two
months. A
special holder will also be fabricated to adapt the matrices to the capillary
array
hardware. Once the specified matrices are manufactured, they will be tested
for each of
the optical and mechanical properties detailed below:
Background Fluorescence - It is helpful from an imaging and processing
perspective,
but not critical, that the matrix have low background fluorescence for a broad
range of
excitation wavelengths to allow use with a variety of substrates. The
materials used in
the 200 µm plates were tested and selected to satisfy this requirement. In the
unlikely
event that different materials must be used to fabricate both transparent and
opaque
70 µm matrices, they will be tested for their fluorescent properties prior to
fabrication.
These tests are performed by measuring and comparing the fluorescence of the
material
to a reference standard at a range of excitation wavelengths.
Optical Efficiency - The 100,000-well plates are currently illuminated by a
roughly
collimated beam directly on the face of the plate. Light enters each well
through the
aperture formed by the wall around the well. Transparent materials are
expected to offer
illumination advantages over opaque materials with the current illumination
system by
transmitting additional excitation energy through the walls separating the
wells. The
optical efficiency of the 1,000,000-well density matrices will be evaluated by
determining the detectable concentration of a fluorescein solution. Typically,
liquid
phase enzyme discovery assays use 10-100 µM concentrations of fluorescent
substrate.
The current detection system can detect approximately 10 nM of fluorescein in
the
200 µm wells. The equivalent fluorescence of LB (our typical cell growth media)
is
approximately 25 nM. Hardware modifications described in Goal 3 may be required
in
the unlikely event that the detectable levels are less than 10 nM for the
new matrices.
Optical Crosstalk - While the use of transparent materials may improve the
efficiency of
fluorescence excitation as described above, it does so at the expense of
increased well-
to-well optical crosstalk. This optical crosstalk is due to fluorescence
emission that leaks
from one well into its neighbors. This is easily quantified by spotting a
fluorophore
onto the matrix, and then measuring the signal intensity vs. distance from a
fluorophore-filled well. The crosstalk could potentially mask the signal of a weak
positive well
resulting in a false negative or be detected as a false positive. In
applications where the
expected hit rate is low (which is commonly the case with enzyme discovery
from
environmental libraries) the probability of this occurring is generally
insignificant.
However, crosstalk can complicate the image processing required to
automatically locate
putative hits and therefore must be evaluated.
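The crosstalk measurement described above (signal intensity versus distance from a fluorophore-filled well) can be sketched as a simple distance-binned profile. This Python example is a minimal illustration, not the image processing actually used. It assumes per-well, background-subtracted intensities have already been extracted from the image and that the wells lie on a square grid with a known pitch. The function name, the grid layout and the sample readings are all hypothetical.

```python
import math
from collections import defaultdict


def crosstalk_profile(intensity, source, pitch_um):
    """Mean neighbor signal, as a fraction of the source well, vs. distance.

    intensity: dict mapping (row, col) -> background-subtracted signal
    source:    (row, col) of the fluorophore-filled well
    pitch_um:  center-to-center well spacing (square grid assumed)
    """
    source_signal = intensity[source]
    by_distance = defaultdict(list)
    for (row, col), signal in intensity.items():
        if (row, col) == source:
            continue
        distance_um = math.hypot(row - source[0], col - source[1]) * pitch_um
        by_distance[round(distance_um, 1)].append(signal / source_signal)
    # Average all wells that sit at the same distance from the source.
    return {d: sum(v) / len(v) for d, v in sorted(by_distance.items())}


if __name__ == "__main__":
    # Hypothetical 3 x 3 neighborhood around one filled well.
    readings = {(r, c): 10.0 for r in range(3) for c in range(3)}
    for edge in [(0, 1), (1, 0), (1, 2), (2, 1)]:
        readings[edge] = 50.0
    readings[(1, 1)] = 1000.0
    for distance, fraction in crosstalk_profile(readings, (1, 1), 100.0).items():
        print(f"{distance} um: {fraction:.3f} of source signal")
```

Comparing such profiles for transparent and opaque matrices would give one number per distance for the leakage, which could then be weighed against the gain in excitation efficiency.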
Surface Tension/Wicking Properties - The plates are filled by placing the
surface of the
plate in contact with the assay solution. Surface tension at the liquid/plate
interface
causes the assay components to be drawn or wick into all of the wells
simultaneously.
The surface preparation of the plate can have significant effects on the
wicking
properties of the matrix. Some surface polishing techniques have been found to
make
the glass face of the plate hydrophobic, thus preventing or significantly
slowing the
filling of the plate. Initially, the same surface finish currently used on the
100,000-well
plate will be tested. If necessary, matrices with different surface
preparations will be
placed into contact with a cell/media mixture and their wicking properties
quantified by
timing the filling process and weighing the matrices before and after filling.
In the event
that plate filling remains inadequate after testing available surface
preparations and
treatments, surfactants can be added to improve filling.
Resistance to Cleaning and Sterilization - It is desirable for the 1,000,000-
well plates to
be reusable. To validate this requirement, the matrices will be processed
through
multiple, rigorous cleaning and sterilization protocols. Currently, there is a
great deal of
latitude in both the cleaning and sterilization protocols. Cleaning can
consist of a
combination of flushing, soaking, and/or sonication in water, solvents and/or
soaps.
Likewise, due to the inherent ruggedness of the materials used, sterilization
can be
accomplished by autoclaving, bleach, ethanol, and/or acid washing. Cleanliness
is
verified by fluorescence imaging of the material at multiple excitation
wavelengths.
Sterilization is verified by overnight incubation of matrices filled with
sterile growth
media, followed by plating the contents onto agar and looking for colony
formation.
Only minimal modifications to the detection system hardware will be required
for the
1,000,000-well density matrices. Due to the reduced size of the wells, minor
modifications
to the optical system may need to be made to adjust the magnification to an
appropriate
level to determine screening feasibility. The optical system will likely need
further
modification as proposed in Phase II to enable automated hit recovery. A
commercially
available 2x extender can be added to the existing telecentric imaging lens
used for the
current 100,000-well plate. This modification will render the final image size
of each
well (relative to the camera) approximately 70% of the current size. Based on
our
experience, this should be more than adequate to visualize positive wells for
determining
feasibility.
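The 70% figure follows from the well diameters and the extender magnification. A minimal Python sketch of that arithmetic is given below. It assumes the image size of a well scales linearly with its diameter and with the added magnification, and the function name is hypothetical.

```python
def relative_image_size(new_well_um, current_well_um, extender_magnification):
    """Image size of the new well relative to the current well's image."""
    return new_well_um / current_well_um * extender_magnification


if __name__ == "__main__":
    # 70 µm wells imaged through a 2x extender, compared with 200 µm wells.
    fraction = relative_image_size(70.0, 200.0, 2.0)
    print(f"Relative image size: {fraction:.0%}")
```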

As mentioned above, the detection sensitivity of the new matrices is expected
to be
lower (especially for opaque matrices) than for the current plates using the
current
detection system hardware. In addition to the use of transparent matrices, a
number of
hardware enhancements could significantly improve sensitivity, including: a
higher
sensitivity cooled CCD camera; laser based illumination or other higher power
density
light source; and faster (possibly non-telecentric) imaging optics.
In order to fully take advantage of the throughput afforded by 1,000,000-well
plates, a
large number of unique clones must be generated. Two alternative methods for
preparing large numbers (10^7 to 10^9) of clones per day for screening can be
used with
the 100,000-well plates. They will both be tested for use with the 1,000,000-
well density
matrices and are described below. One effort will use Resorufin
β-D-galactopyranoside
(Molecular Probes #R-1159) as the fluorescent substrate and a positive
β-galactosidase
control clone (535-GL2) for both assay development and feasibility screening.
This
substrate and positive clone were well characterized and validated during the
development of the 100,000-well platform.
Method 1: Screening Lambda Phage Libraries for Enzymatic Activity - Gene
libraries
cloned into lambda-based vectors are first titered by plating dilutions on
soft agar in the
presence of an appropriate E. coli host strain according to standard
techniques. Using
this titer information, an adequate amount of the lambda library is allowed to
adsorb to
the host. After 15 minutes, a mixture of growth medium and fluorescent
substrate is
then added to produce a final suspension having the following characteristics:
[1] a
density of host cells that will allow both sufficient growth and an effective
multiplicity of
infection, [2] an optimal concentration of fluorescent substrate for detection
of the
enzymatic activity, and [3] a density of phage particles such that, when
loaded into a
1,000,000-well density matrix, each well will contain an average of 1-4
library clones.
(Densities of 5-10 clones per well will be attempted once the initial details
are worked
out.) A sample of this suspension is plated on soft agar to determine the
average seed
density of library clones (concomitant titer). The remainder of the suspension
is used to
load the wells of the matrices. The plates are incubated at 37°C for 16-24 hours
(protected from light and evaporative loss; see note on Incubation below) to
allow lytic
multiplication of bacteriophage in the wells prior to detection and recovery.
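The seeding density in item [3] of Method 1 can be related to the suspension titer with a short calculation. The Python sketch below is an illustration under stated assumptions: clones are taken to distribute into wells independently, so the number per well follows a Poisson distribution, and the well volume is taken as 30 nl from the estimate given earlier. The function names are hypothetical.

```python
import math


def required_titer_per_ml(mean_clones_per_well, well_volume_nl):
    """Clone concentration (per ml) giving the desired mean per well."""
    return mean_clones_per_well * 1.0e6 / well_volume_nl  # 1 ml = 1e6 nl


def poisson_probability(k, mean):
    """Probability that a well receives exactly k clones."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


if __name__ == "__main__":
    well_volume_nl = 30.0  # assumed volume of one well
    for mean in (1, 4):
        titer = required_titer_per_ml(mean, well_volume_nl)
        empty = poisson_probability(0, mean)
        print(f"mean {mean} clone(s)/well: {titer:.3g} clones/ml, "
              f"{empty:.1%} of wells expected empty")
```

With a mean of one clone per well roughly a third of the wells would be expected to stay empty, which is one reason higher seeding densities may be attractive once detection is established.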
Method 2: Screening Phagemid and Other Colony-Based Libraries for Enzymatic
Activity - Phagemid libraries are produced from parental bacteriophage
libraries using
an in vivo excision process (Short et al., 1988). Following initial titering,
these libraries
are used to infect an appropriate E. coli host strain. After the 15-minute
adsorption
period, cells are supplied with a small amount of medium and allowed to grow
at 30
degrees Celsius without antibiotic selection for 45 minutes to allow
expression of the
antibiotic resistance gene present on the phagemid. The suspension is then
plated onto
solid plates containing antibiotic and allowed to grow at 30 degrees Celsius
overnight.
Amplified clones from the resulting antibiotic-resistant colonies are
collected into a
pooled suspension. A mixture of antibiotic, fluorescent substrate and growth
medium is
then added to produce the final suspension used to load the high-density
matrices (with
characteristics analogous to [2] and [3] above). A sample of this suspension
is also
plated onto solid agar plates containing antibiotic to determine the average
seed density
of library clones (concomitant titer). The matrices are then incubated at 30-
37 degrees C
for 1-2 days (protected from light and evaporative loss; see note on
Incubation below) to
allow phagemid-containing host cells to multiply within the wells prior to
detection and
recovery.
Libraries created in other vectors (e.g. cosmid, fosmid, PAC, YAC, BAC, etc.)
are also
screened using this platform. Factors such as growth requirements,
transformation
modality, and transformation efficiency have to be taken into consideration
when
adapting a particular library vector to this technology. The use of a variety
of library
and vector types permits screening for small molecules and protein
therapeutics in
addition to novel enzymes.
The array plates are typically incubated in a humidified incubator at 90%
relative
humidity for 24 to 48 hours. The plates are stackable and designed such that
each plate
is contained within a humidity and temperature stable environment by the
plates above
and below it. Lids or extra plates filled with water are used at the top and
bottom of each
stack to seal the end plates. The incubation process requires validation of
cell growth,
evaporation, and condensation.
The growth of E. coli, which will be used as the enzyme screening host, has
been clearly
demonstrated in the 100,000 well array plate. Other types of cells including
Streptomyces, mammalian (Jurkat human leukemic T cells), and lambda phage have
also
been shown to grow in this format.
Cell growth in the 1,000,000-well density matrices will be verified by the
same
procedure used for the 100,000-well plates. The number of colonies formed
by
plating the initial cell solution (diluted to 1 to 10 clones/well) will be
compared to a
culture of equal volume aspirated from the matrix after incubation. Although
difficulties
in cell growth are not anticipated, there are alternative strategies to
mitigate these
difficulties. The surface area to volume ratio of the 1,000,000-well density
matrices is
less favorable for oxygen diffusion into the assay solution than in the
100,000-well
format. If oxygen diffusion appears to be limiting cell growth, we will
evaluate methods
for increasing oxygenation. Preliminary experiments have successfully
demonstrated
fluidic mixing in 200 µm diameter wells using paramagnetic beads in a
fluctuating
magnetic field and by agitation with sound pulses. Magnetic mixing has been
shown to
vastly improve the growth of Streptomyces in the 100,000-well format.
If necessary, these mixing methods could be employed to improve oxygen
diffusion and
cell growth. Other methods include oxygen saturation of the assay solution
prior to plate
filling, incubation in a high oxygen environment, and the addition of time-
released
oxygen generating compounds such as sodium percarbonate.
With a total assay volume of approximately 30n1, controlling evaporation from
the
1,000,000-well plates will be critical. However, as mentioned above, the
surface to
volume ratio is favorable for minimizing evaporation. Evaporation studies
conducted in
100,000-well plates indicate a 10% loss of media volume over 24 hours. This
loss is
reduced to 5% with the addition of 10% glycerol. Because the surface area to
volume
ratio of the 1,000,000-well plates will be similar to (if not more favorable
than) that of the 100,000-well plates, comparable losses are anticipated.
Evaporation in the higher density matrices will be measured by
filling the
plates with typical assay media and weighing them at several time points over
a 96-hour
period. If stricter evaporation control is required, glycerol can be added.
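The weighing procedure above can be reduced to a percent-loss figure per time point. The following Python sketch is illustrative. It assumes that all weight lost after filling is evaporated assay medium and that the empty (tare) weight of the plate is known. The function name and the sample weights are hypothetical.

```python
def percent_volume_lost(empty_plate_g, filled_weights_g):
    """Percent of the initial liquid lost at each time point.

    empty_plate_g:    weight of the dry, unfilled plate in grams
    filled_weights_g: dict mapping hours after filling -> plate weight in
                      grams; must include the 0-hour reading
    """
    initial_liquid_g = filled_weights_g[0] - empty_plate_g
    if initial_liquid_g <= 0:
        raise ValueError("filled plate must weigh more than the empty plate")
    return {
        hours: 100.0 * (filled_weights_g[0] - weight) / initial_liquid_g
        for hours, weight in sorted(filled_weights_g.items())
    }


if __name__ == "__main__":
    # Hypothetical readings over a 96-hour period.
    readings = {0: 130.0, 24: 127.0, 48: 125.5, 96: 124.0}
    for hours, loss in percent_volume_lost(100.0, readings).items():
        print(f"{hours:>2} h: {loss:.1f}% lost")
```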
The effects of condensation/moisture on the surface of the matrices are also
considered.
Because they are incubated in high-humidity environments, droplets on the
outer
surfaces of the matrices that remain after filling or condense during
incubation may not
evaporate and can cause well to well cross-contamination. These droplets can
lead to the
detection of false positives in wells neighboring a true positive as well as
cause a blotchy
appearance on the plate surface that obscures weak positives. Such problems
with
surface droplets remaining after filling the 100,000-well plates are avoided
by letting
them sit at room temperature until all of the surface moisture has evaporated.
Avoiding
condensation during incubation is accomplished by using strict temperature and
humidity control. This issue is addressed by placing the filled plates in a
programmable
humidified chamber that starts with low humidity and increases it to the
desired
incubation humidity only after the plates have warmed to the chamber
temperature.
Once warm, the stacked plates form a relatively stable thermal mass immune to
the small
temperature fluctuations in the chamber. Surface moisture control issues will
be similar
in the higher density plates. The matrices will be tested to see if these
methods
successfully control surface moisture.
Negative libraries spiked with the positive β-gal clone at a defined
frequency will be the
first subjects of a feasibility screen. The same screen will be performed in
parallel in a
conventional microtiter format for comparison. Once this is proven, screening
will
proceed (again in parallel with microtiter format) to libraries known to
contain positive
clones. A mixed population library was validated for this purpose during the
development of the 100,000-well platform and will be used for the 1,000,000-
well
feasibility screening. These experiments will be performed for both lambda-
based and
phagemid-based library screens since clonal amplification rates, and thus
signal
intensities, may differ between bacteriophage and whole cell assays.
Validation of the feasibility screens can be performed by simply comparing the
number
of positive wells in the fluorescence images of the 1,000,000-well matrices to
those in a
100,000-well array plate filled with the identical assay solution.

Further verification will be done in standard microtiter format. The number of
positive
wells is a function of the concentration of positive clones in the initial
assay solution and
the volume of the wells. Since the well volume of the 1,000,000-well matrices
is
approximately 1/10th that of the 100,000 well plates, the expected number of
positive
wells should also be about 1/10th when loading the same initial assay
solution.
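The dependence of positive wells on clone concentration and well volume can be sketched as follows. This Python example is an illustration under a Poisson loading assumption (positive clones distribute into wells independently); it compares the fraction of wells that are positive, so the 1/10th relationship applies when equal numbers of wells are compared. The function name, the concentration and the well volumes used are hypothetical.

```python
import math


def fraction_positive_wells(positive_clones_per_ml, well_volume_nl):
    """Fraction of wells expected to hold at least one positive clone."""
    mean_per_well = positive_clones_per_ml * well_volume_nl / 1.0e6
    # 1 - exp(-mean), computed with expm1 for accuracy at small means
    return -math.expm1(-mean_per_well)


if __name__ == "__main__":
    concentration = 10.0  # hypothetical positive clones per ml
    small = fraction_positive_wells(concentration, 25.0)   # smaller well
    large = fraction_positive_wells(concentration, 250.0)  # larger well
    print(f"25 nl wells:  {small:.6f}")
    print(f"250 nl wells: {large:.6f}")
    print(f"ratio: {small / large:.3f}")
```

At low concentrations of positive clones the fraction is nearly proportional to well volume, which is the basis for the 1/10th expectation. At higher concentrations the relationship flattens because wells begin to receive more than one positive clone.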
The array of capillaries can be arranged to fit within a footprint of a
microtiter plate, one
standard of which is a footprint of 3.3" x 5". Within that footprint, up to
1,000,000 or
more capillaries, or wells, can be provided in the array. A 1,000,000 well
platform for
screening gene libraries from mixed populations of organisms for novel
enzymatic
activities provides an ultra high-throughput screening platform in the 3.3" x
5" footprint
of a standard microtiter plate. In this format each well includes a capillary
having a
diameter of 200 µm, and which holds 250 nl. The array platform permits rapid
screening
of genes and gene pathways, and increases the productivity of discovery and
gene
optimization programs for products such as novel enzymes, protein
therapeutics,
compounds and small molecule drugs. Any number of novel enzymes of various
catalytic classes (e.g., amylases, proteases, secondary amidases) can be
discovered using
the array platform. The same proprietary cost effective process by which the
100,000-
well plates are made can be utilized to make the 1,000,000-well plates for
smaller, non-
biological applications.
The array screening platform greatly expands the amount of molecular diversity
that can
be screened to discover new products. Using 1,000,000-well plates, employing
over
12,000 wells per square centimeter, more than one billion clones per day can
be screened
using standard liquid phase fluorescent assays, while at the same time
reducing
equipment and operator time through massively parallel dispensing and reading
of
biological samples. Additionally, the 1,000,000-well plates, with wells each
about half
the diameter of a human hair, are reusable and require only minuscule
volumes of
reagents, making them highly cost effective and environmentally responsible.
Increasing the liquid phase screening density from 100,000 to 1,000,000 wells
per
microtiter plate footprint represents a 10x increase in density that
contributes to
accelerated discovery and development of commercial products, such as antibody
and
protein therapeutic programs that require rapid screening of very large
numbers of
antibody and protein variants created by evolution technologies. This
invention includes
the design and fabrication of 1 cm square matrices with 1,000,000 well/plate
density (i.e.
12,000 wells/cm2) using a process that is scalable to full microtiter plate
sized arrays.
The platform can be utilized to develop a novel liquid phase nitrilase assay
in
the 1,000,000-well format, as well as screening gene libraries from mixed
populations of organisms for chiral nitrilases for use in the manufacture of
chemical
intermediates for chiral therapeutic compounds.
Naked Biopanning involves the direct screening or enrichment for a gene or
gene cluster from environmental genomic DNA. The enrichment for or isolation
of
the desired genomic DNA is performed prior to any cloning, gene-specific PCR
or
any other procedure that may introduce unwanted bias affecting downstream
processing and applications due to toxicity or other issues. Several
methodologies
can be described for this type of sequence based discovery. These generally
include
the use of nucleic acid probe(s) that is (are) partially or completely
homologous to the
target sequence in conjunction with the binding of the probe-target complex to
a solid
phase support. The probe(s) may be polynucleotide or modified nucleic acid,
such as
peptide nucleic acid (PNA) and may be used with other facilitating elements
such as
proteins or additional nucleic acids in the capture of target DNA. An
amplification
step which does not introduce sequence bias may be used to ensure adequate
yield for
downstream applications.
An example of a Naked Biopanning approach can be found in the use
of RecA protein and a complement-stabilized D-loop (csD-loop) structure
(Jayasena
& Johnston, 1993; Sena and Zarling, 1993) to target genomic DNA of interest.
It
does not involve complete denaturation of the target DNA and therefore is of
particular interest when one is attempting to capture large genomic fragments.
The
following method incorporates the ClonCapture™ cDNA selection procedure
(CLONTECH Laboratories, Inc.), with some modification, to take advantage of
csD-
loop formation, a stable structure which may be used to capture genomic DNA
containing an internal target sequence:

Environmental genomic DNA is cleaved into fragments (fragment size
depends upon type of target and desired downstream insert size if making a pre-
enriched library) using mechanical shearing or restriction digest. Fragments
are size
selected according to desired length and purified. A biotinylated dsDNA probe
is
produced, based upon existing knowledge of conserved regions within the
target, by
PCR from a positive clone or by synthetic means. The probe can be internally
(e.g.,
incorporation of biotin 21-dCTP) or end labeled with biotin. It must be
purified to
remove any unincorporated biotin. The probe is heat denatured (5 min. at
95°C) and
placed immediately on ice. The denatured probe is then reacted with RecA and
an
ATP mix containing ATP and a nonhydrolyzable analog (15 min. at 37°C).
The target
DNA is added and incubated with the RecA/biotinylated probe nucleofilaments to
form the csD-Loop structure (20 min. at 37°C). The RecA is then removed
by
treatment with proteinase K and SDS. After inactivating the proteinase K
with
PMSF, washed and blocked (with sonicated salmon sperm DNA) streptavidin
paramagnetic beads are transferred to the reaction and incubated to bind the
csD-loop
complex to the support (rotate 30 min. at room temp.). The unbound DNA is
removed and may be saved for use as target for a different probe. The beads
are
thoroughly washed and the enriched population is eluted using an alkaline
buffer and
transferred off. The enriched DNA is then ethanol precipitated and is ready
for
ligation and pre-enriched library preparation.
Other stable complexes may be used instead of the RecA/csD-loop structure
for the capture of genomic DNA. For instance, PNAs may be used, either as
"openers" to allow insertion of a probe into dsDNA (Bukanov et al., 1998), or
as
tandem probes themselves (Lohse et al., 1999). In the first case, PNAs bind to
two
short tracts of homopurines that are in close proximity to each other. They
form P-
loop structures, which displace the unbound strand and make it available for
binding
by a probe, which can then be used to capture the target using an affinity
capture
method involving a solid phase. Likewise, PNAs may be used in a "double-duplex
invasion" to form a stable complex and allow target recovery.

Simpler methods may be used in the retrieval of targets from environmental
genomic DNA that involve complete denaturation of the DNA fragments. After
cutting genomic DNA into fragments of the desired length via mechanical
shearing or
through the use of restriction enzymes, the target DNA may be bound to a solid
phase
using a direct hybridization affinity capture scheme. A nucleic acid probe is
covalently bound to a solid phase such as a glass slide, paramagnetic bead, or
any
type of matrix in a column, and the denatured target DNA is allowed to
hybridize to
it. The unbound fraction may be collected and rehybridized to the same probe
to
ensure a more complete recovery, or to a host of different probes, as a part
of a
cascade scenario, where a population of environmental genomic DNA is
subsequently
panned for a number of different genes or gene clusters.
Linkers containing restriction sites and sites for common primers may be
added to the ends of the genomic fragments using sticky-ended or blunt-ended
ligations (depending upon the method used for cutting the genomic DNA). These
enable one to amplify the size-selected inserted fragment population by PCR
without
significant sequence bias. Thus, after using any of the abovementioned
techniques for
isolation or enrichment, one may help to ensure adequate recovery for
downstream
processing. Furthermore, the recovered population is ready for cutting and
ligation
into a suitable vector as well as containing the priming sites for sequencing
at any
time.
A variation of the above scheme involves including a tag from a combinatorial
synthesis of polynucleotide tags (Brenner et al., 1999) within the linker that
is
attached onto the ends of the genomic fragments. This allows each fragment
within
the starting population to have its own unique tag. Therefore, when amplified
with
common primers, each of these uniquely tagged fragments gives rise to a
multitude of
in vitro clones which are then bound to the paramagnetic bead containing
millions of
copies of the complementary, covalently bound anti-tag. A fluorescently
labeled,
target specific probe may be subsequently hybridized to the target-containing
beads.
The beads may be sorted using FACS, where the positives may be sequenced
directly
from the beads and the insert may be cut out and ligated into the desired
vector for
further processing. The negative population may be hybridized with other
probes and
resorted as part of the cascade scenario previously described.
Transposon technology may allow the insertion of environmental genomic
DNA into a host genome through the use of transposomes (Goryshin & Reznikoff,
1998) to avoid bias resulting from expression of toxic genes. The host cells
are then
cultured to provide more copies of target DNA for discovery, isolation, and
downstream processes.
Without further elaboration, it is believed that one skilled in the art can,
using
the preceding description, utilize the present invention to its fullest
extent. The
following examples are to be considered illustrative and thus are not limiting
of the
remainder of the disclosure in any way whatsoever.
Example 1
DNA Isolation and Library Construction
The following outlines the procedures used to generate a gene library from a
mixed population of organisms.
DNA isolation. DNA is isolated using the IsoQuick Procedure as per
manufacturer's instructions (Orca Research Inc., Bothell, WA). DNA can be
normalized according to Example 2 below. Upon isolation the DNA is sheared by
pushing and pulling the DNA through a 25G double-hub needle and 1-cc
syringes
about 500 times. A small amount is run on a 0.8% agarose gel to make sure the
majority of the DNA is in the desired size range (about 3-6 kb).
Blunt-ending DNA. The DNA is blunt-ended by mixing 45 ul of 10X Mung
Bean Buffer, 2.0 ul Mung Bean Nuclease (150 u/ul) and water to a final volume
of
405 ul. The mixture is incubated at 37°C for 15 minutes. The mixture is
phenol/chloroform extracted followed by an additional chloroform extraction.
One
ml of ice cold ethanol is added to the final extract to precipitate the DNA.
The DNA
is precipitated for 10 minutes on ice. The DNA is removed by centrifugation in
a
microcentrifuge for 30 minutes. The pellet is washed with 1 ml of 70% ethanol
and
repelleted in the microcentrifuge. Following centrifugation the DNA is dried
and
gently resuspended in 26 ul of TE buffer.
Methylation of DNA. The DNA is methylated by mixing 4 ul of 10X EcoR I
Methylase Buffer, 0.5 ul SAM (32 mM), 5.0 ul EcoR I Methylase (40 u/ul) and
incubating at 37°C, 1 hour. In order to ensure blunt ends, add to the
methylation
reaction: 5.0 ul of 100 mM MgCl2, 8.0 ul of dNTP mix (2.5 mM of each dGTP,
dATP, dTTP, dCTP), 4.0 ul of Klenow (5 u/ul) and incubate at 12°C for
30 minutes.
After 30 minutes add 450 ul 1X STE. The mixture is phenol/chloroform
extracted once followed by an additional chloroform extraction. One ml of ice
cold
ethanol is added to the final extract to precipitate the DNA. The DNA is
precipitated
for 10 minutes on ice. The DNA is removed by centrifugation in a
microcentrifuge
for 30 minutes. The pellet is washed with 1 ml of 70% ethanol, repelleted in
the
microcentrifuge and allowed to dry for 10 minutes.
Ligation. The DNA is ligated by gently resuspending the DNA in 8 ul EcoR I
adaptors (from Stratagene's cDNA Synthesis Kit), 1.0 ul of 10X
Ligation Buffer, 1.0
ul of 10 mM rATP, 1.0 ul of T4 DNA Ligase (4 Wu/ul) and incubating at
4°C for 2
days. The ligation reaction is terminated by heating for 30 minutes at
70°C.
Phosphorylation of adaptors. The adaptor ends are phosphorylated by mixing
the ligation reaction with 1.0 ul of 10X Ligation Buffer, 2.0 ul of 10 mM rATP,
6.0 ul
of H2O, 1.0 ul of polynucleotide kinase (PNK) and incubating at 37°C
for 30 minutes.
After 30 minutes 31 ul H2O and 5 ml 10X STE are added to the reaction and the
sample is size fractionated on a Sephacryl S-500 spin column. The pooled
fractions
(1-3) are phenol/chloroform extracted once followed by an additional
chloroform
extraction. The DNA is precipitated by the addition of ice cold ethanol on ice
for 10
minutes. The precipitate is pelleted by centrifugation in a microfuge at high
speed for
30 minutes. The resulting pellet is washed with 1 ml 70% ethanol, repelleted
by
centrifugation and allowed to dry for 10 minutes. The sample is resuspended in
10.5
ul TE buffer. Do not plate. Instead, ligate directly to lambda arms as above
except
use 2.5 ul of DNA and no water.
Sucrose Gradient (2.2 ml) Size Fractionation. Stop ligation by heating the
sample to 65°C for 10 minutes. Gently load sample on 2.2 ml sucrose
gradient and
centrifuge in mini-ultracentrifuge at 45K, 20°C for 4 hours (no brake).
Collect
fractions by puncturing the bottom of the gradient tube with a 20G needle and
allowing the sucrose to flow through the needle. Collect the first 20 drops in
a Falcon
2059 tube then collect 10 1-drop fractions (labeled 1-10). Each drop is about
60 ul in
volume. Run 5 ul of each fraction on a 0.8% agarose gel to check the size.
Pool
fractions 1-4 (about 10-1.5 kb) and, in a separate tube, pool fractions 5-7
(about 5-0.5
kb). Add 1 ml ice cold ethanol to precipitate and place on ice for 10 minutes.
Pellet
the precipitate by centrifugation in a microfuge at high speed for 30 minutes.
Wash
the pellets by resuspending them in 1 ml 70% ethanol and repelleting them by
centrifugation in a microfuge at high speed for 10 minutes and dry. Resuspend
each
pellet in 10 ul of TE buffer.
Test Ligation to Lambda Arms. Plate assay by spotting 0.5 ul of the sample
on agarose containing ethidium bromide along with standards (DNA samples of
known concentration) to get an approximate concentration. View the samples
using
UV light and estimate concentration compared to the standards. Fraction 1-4 =
>1.0
ug/ul. Fraction 5-7 = 500 ng/ul.
Prepare the following ligation reactions (5 ul reactions) and incubate
at 4°C, overnight:
Sample        H2O     10X Ligase  10 mM    Lambda arms  Insert  T4 DNA Ligase
                      Buffer      rATP     (ZAP)        DNA     (4 Wu/ul)
Fraction 1-4  0.5 ul  0.5 ul      0.5 ul   1.0 ul       2.0 ul  0.5 ul
Fraction 5-7  0.5 ul  0.5 ul      0.5 ul   1.0 ul       2.0 ul  0.5 ul
Test Package and Plate. Package the ligation reactions following
manufacturer's protocol. Stop packaging reactions with 500 ul SM buffer and
pool
packaging that came from the same ligation. Titer 1.0 ul of each pooled
reaction on
appropriate host (OD600 = 1.0) [XL1-Blue MRF']. Add 200 ul host (in 10 mM MgSO4) to
Falcon 2059 tubes, inoculate with 1 ul packaged phage and incubate at
37°C for 15
minutes. Add about 3 ml 48°C top agar [50 ml stock containing 150 ul
IPTG (0.5M)
and 300 ul X-GAL (350 mg/ml)] and plate on 100 mm plates. Incubate the plates
at
37°C, overnight.
Amplification of Libraries (5.0 x 10^5 recombinants from each library). Add
3.0 ml host cells (OD600 = 1.0) to two 50 ml conical tubes and inoculate with 2.5
x 10^5
pfu of phage per conical tube. Incubate at 37°C for 20 minutes. Add top
agar to each
tube to a final volume of 45 ml. Plate each tube across five 150 mm plates.
Incubate
the plates at 37°C for 6-8 hours or until plaques are about pin-head in
size. Overlay
the plates with 8-10 ml SM Buffer and place at 4°C overnight (with
gentle rocking if
possible).
Harvest Phage. Recover phage suspension by pouring the SM buffer off each
plate into a 50-ml conical tube. Add 3 ml of chloroform, shake vigorously and
incubate at room temperature for 15 minutes. Centrifuge the tubes at 2K rpm
for 10
minutes to remove cell debris. Pour supernatant into a sterile flask, add 500
ul
chloroform and store at 4°C.
Titer Amplified Library. Make serial dilutions of the harvested phage (for
example, 10^-3 = 1 µl amplified phage in 1 ml SM Buffer; 10^-6 = 1 µl of the
10^-3 dilution in 1 ml SM Buffer). Add 200 µl host (in 10 mM MgSO4) to two
tubes. Inoculate one tube with 10 µl of the 10^-6 dilution (10^-5). Inoculate
the other tube with 1 µl of the 10^-6 dilution (10^-6). Incubate at 37°C for 15
minutes. Add about 3 ml 48°C top agar [50 ml stock

containing 150 µl IPTG (0.5 M) and 375 µl X-GAL (350 mg/ml)] to each tube and
plate on 100 mm plates. Incubate the plates at 37°C, overnight. Excise the
ZAP II library to create the pBLUESCRIPT library according to the
manufacturer's protocols (Stratagene).
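The titering arithmetic above can be sketched as follows. This is a minimal illustration only; the function name, the parameter names, and the example plaque count are hypothetical and are not taken from the protocol.

```python
def phage_titer_pfu_per_ml(plaques, dilution, volume_plated_ul):
    """Estimate the titer (pfu/ml) of the undiluted phage stock.

    plaques          -- number of plaques counted on the plate
    dilution         -- dilution factor of the plated sample (e.g. 1e-6)
    volume_plated_ul -- volume of the diluted phage that was plated, in µl
    """
    if plaques < 0 or dilution <= 0 or volume_plated_ul <= 0:
        raise ValueError("plaques must be >= 0; dilution and volume must be > 0")
    volume_plated_ml = volume_plated_ul / 1000.0
    return plaques / (dilution * volume_plated_ml)


# Hypothetical count: 50 plaques from 1 µl of the 10^-6 dilution.
titer = phage_titer_pfu_per_ml(plaques=50, dilution=1e-6, volume_plated_ul=1.0)
print(f"{titer:.1e} pfu/ml")
```

Plating 10 µl of the 10^-6 dilution is equivalent to plating 1 µl of a 10^-5 dilution, which is why the two tubes are labeled 10^-5 and 10^-6.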
Example 2
Construction of a Stable, Large Insert Picoplankton Genomic DNA Library
Cell collection and preparation of DNA. Agarose plugs containing
concentrated picoplankton cells were prepared from samples collected on an
oceanographic cruise from Newport, Oregon to Honolulu, Hawaii. Seawater (30
liters) was collected in Niskin bottles, screened through 10 µm Nitex, and
concentrated by hollow fiber filtration (Amicon DC10) through 30,000 MW cutoff
polysulfone filters. The concentrated bacterioplankton cells were collected on
a 0.22 µm, 47 mm Durapore filter, and resuspended in 1 ml of 2X STE buffer
(1 M NaCl, 0.1 M EDTA, mM Tris, pH ~.0) to a final density of approximately
1 x 10^10 cells per ml. The cell suspension was mixed with one volume of 1%
molten Seaplaque LMP agarose (FMC) cooled to 40°C, and then immediately drawn
into a 1 ml syringe. The syringe was sealed with parafilm and placed on ice for
10 min. The cell-containing agarose plug was extruded into 10 ml of Lysis
Buffer (10 mM Tris pH 8.0, 50 mM NaCl, 0.1 M EDTA, 1% Sarkosyl, 0.2% sodium
deoxycholate, 1 mg/ml lysozyme) and incubated at 37°C for one hour. The agarose
plug was then transferred to 40 ml of ESP Buffer (1% Sarkosyl, 1 mg/ml
proteinase K, in 0.5 M EDTA), and incubated at 55°C for 16 hours. The solution
was decanted and replaced with fresh ESP Buffer, and incubated at 55°C for an
additional hour. The agarose plugs were then placed in 50 mM EDTA and stored at
4°C shipboard for the duration of the oceanographic cruise.
One slice of an agarose plug (72 µl) prepared from a sample collected off the
Oregon coast was dialyzed overnight at 4°C against 1 mL of buffer A (100 mM
NaCl, 10 mM Bis Tris Propane-HCl, 100 µg/ml acetylated BSA; pH 7.0 at 25°C) in
a 2 mL microcentrifuge tube. The solution was replaced with 250 µl of fresh
buffer A containing 10 mM MgCl2 and 1 mM DTT and incubated on a rocking
platform for 1 hr at room temperature. The solution was then changed to 250 µl
of the same buffer containing 4 U of Sau3AI (NEB), equilibrated to 37°C in a
water bath, and then

incubated on a rocking platform in a 37°C incubator for 45 min. The plug was
transferred to a 1.5 ml microcentrifuge tube and incubated at 68°C for 30 min
to inactivate the enzyme and to melt the agarose. The agarose was digested and
the DNA dephosphorylated using Gelase and HK-phosphatase (Epicentre),
respectively, according to the manufacturer's recommendations. Protein was
removed by gentle phenol/chloroform extraction and the DNA was ethanol
precipitated, pelleted, and then washed with 70% ethanol. This partially
digested DNA was resuspended in sterile H2O to a concentration of 2.5 ng/µl for
ligation to the pFOS1 vector.
PCR amplification results from several of the agarose plugs (data not shown)
indicated the presence of significant amounts of archaeal DNA. Quantitative
hybridization experiments using rRNA extracted from one sample, collected at
200 m
of depth off the Oregon Coast, indicated that planktonic archaea in this
assemblage
comprised approximately 4.7% of the total picoplankton biomass. This sample
corresponds to "PAC1"-200 m in Table 1 of DeLong et al. (DeLong, 1994),
which is
incorporated herein by reference. Results from archaeal-biased rDNA PCR
amplification performed on agarose plug lysates confirmed the presence of
relatively
large amounts of archaeal DNA in this sample. Agarose plugs prepared from this
picoplankton sample were chosen for subsequent fosmid library preparation.
Each 1
ml agarose plug from this site contained approximately 7.5 x 105 cells,
therefore
approximately 5.4 x 105 cells were present in the 72 µl slice used in the
preparation of
the partially digested DNA.
Vector arms were prepared from pFOS1 as described by Kim et al. (Kim,
1992). Briefly, the plasmid was completely digested with AatII,
dephosphorylated with HK phosphatase, and then digested with BamHI to generate
two arms, each of which contained a cos site in the proper orientation for
cloning and packaging ligated DNA between 35-45 kbp. The partially digested
picoplankton DNA was ligated overnight to the pFOS1 arms in a 15 µl ligation
reaction containing 25 ng each of vector and insert and 1 U of T4 DNA ligase
(Boehringer-Mannheim). The ligated DNA in four microliters of this reaction was
in vitro packaged using the Gigapack XL packaging system (Stratagene), the
fosmid particles transfected to E. coli strain

DH10B (BRL), and the cells spread onto LBcm15 plates. The resultant fosmid
clones were picked into 96-well microtiter dishes containing LBcm15
supplemented with 7% glycerol. Recombinant fosmids, each containing ca. 40 kb
of picoplankton DNA insert, yielded a library of 3,552 fosmid clones,
containing approximately 1.4 x 10^8 base pairs of cloned DNA. All of the clones
examined contained inserts ranging from 38 to 42 kbp. This library was stored
frozen at -80°C for later analysis.
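The stated library size follows directly from the clone count and the approximate insert size. A minimal sketch of that arithmetic is given below; the function name is hypothetical, and the 40 kb mean insert size is the approximate figure quoted above.

```python
def cloned_dna_bp(num_clones, mean_insert_kb):
    """Total cloned DNA in base pairs for a library of equal-sized inserts."""
    return num_clones * mean_insert_kb * 1000


total_bp = cloned_dna_bp(num_clones=3552, mean_insert_kb=40)
print(f"{total_bp:,} bp (about {total_bp / 1e8:.1f} x 10^8 bp)")
```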
Numerous modifications and variations of the present invention are possible in
light of the above teachings; therefore, within the scope of the claims, the
invention
may be practiced other than as particularly described.
Example 3
CsCl-Bisbenzimide Gradients
Gradient visualization by UV:
Visualize the gradient using the UV handlamp in the dark room and mark the
bands of the standard, which show the upper and lower limits of GC content.
Harvesting of the gradients:
1. Connect Pharmacia-pump LKB P1 with fraction collector (BIO-RAD model
2128).
2. Set program: rack 3, 5 drops (about 100 µl), all samples.
3. Rise 3 microtiter-dishes (Costar, 96 well cell culture cluster).
4. Push yellow needle into bottom of the centrifuge tube.
5. Start program and collect gradient. Don't collect first and last 1-2 ml
depending on where your markers are.
Dialysis
1. Follow microdialyzer instruction manual and use Spectra/Por CE Membrane
MWCO 25,000 (wash membrane with ddH2O before usage).
2. Transfer samples from the microtiterdish into microdialyzer (Spectra/Por
MicroDialyzer) with multipipette. (Fill dialyzer completely with TE, get rid of
any air bubbles, transfer samples very fast to avoid new air bubbles).
3. Dialyze against TE for 1 hr on a plate stirrer.
DNA estimation with PICOGREEN
1. Transfer samples (volume after dialysis should be increased 1.5 - 2 times)
with multipipette back into microtiterdish.
2. Transfer 100 µl of the sample into Polytektronix plates.

3. Add 100 µl Picogreen-solution (5 µl Picogreen-stock-solution + 995 µl TE
buffer) to each sample.
4. Use WPR-plate-reader.
5. Estimate DNA concentration.
Example 4
Bis-Benzimide Separation of Genomic DNA
A sample composed of genomic DNA from Clostridium perfringens (27% G+C),
Escherichia coli (49% G+C) and Micrococcus lysodeikticus (72% G+C) was purified
on a cesium-chloride gradient. The cesium chloride (Rf = 1.3980) solution was
filtered through a 0.2 µm filter and 15 ml were loaded into a 35 ml OptiSeal
tube (Beckman). The DNA was added and thoroughly mixed. Ten micrograms of
bis-benzimide (Sigma; Hoechst 33258) were added and mixed thoroughly. The tube
was then filled with the filtered cesium chloride solution and spun in a VTi50
rotor in a Beckman L8-70 Ultracentrifuge at 33,000 rpm for 72 hours. Following
centrifugation, a syringe pump and fractionator (Brandel Model 186) were used
to drive the gradient through an ISCO UA-5 UV absorbance detector set to
280 nm. Three peaks representing the DNA from the three organisms were
obtained. PCR amplification of DNA encoding rRNA from a 10-fold dilution of the
E. coli peak was performed with the following primers to amplify eubacterial
sequences:
Forward primer: (27F)
5'-AGAGTTTGATCCTGGCTCAG-3'
Reverse primer: (1492R)
5'-GGTTACCTTGTTACGACTT-3'
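Since this example separates DNA by G+C content, it can be useful to compute the same quantity for short sequences such as the two primers. The sketch below is illustrative only; the helper name gc_percent is hypothetical, and the primer sequences are those listed above.

```python
def gc_percent(seq):
    """Return the G+C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq or set(seq) - set("ACGT"):
        raise ValueError("sequence must be non-empty and contain only A, C, G, T")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


PRIMERS = {
    "27F": "AGAGTTTGATCCTGGCTCAG",
    "1492R": "GGTTACCTTGTTACGACTT",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, {gc_percent(seq):.1f}% G+C")
```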
Example 5
FACS Biopanning
Infection of library lysates into Exp503 E. coli strain. A 25 ml LB + Tet
culture of Exp503 was grown overnight at 37°C. The next day the culture was
centrifuged at 4000 rpm for 10 minutes and the supernatant decanted. 20 ml of
10 mM MgSO4 was added and the OD600 checked. Dilute to OD 1.0.

In order to obtain a good representation of the library, at least 2-fold (and
preferably 5-fold) of the library lysate titer was used. For example: Titer of
library lysate is 2 x 10^6 cfu/ml. Need to plate at least 4 x 10^6 cfu. Can
plate approx. 500,000 microcolonies/150 mm LB-Kan plate. Need 8 plates. Can
plate 1 ml of reaction/plate; need 8 ml of cells + lysate.
2-fold (e.g., 2 ml) of library lysate was mixed with the appropriate amount
(e.g., 6 ml) of OD 1.0 Exp503. The sample was incubated at 37°C for at least 1
hour. 1 ml of the reaction was plated on each of eight 150 mm LB-Kan plates and
incubated overnight at 30°C.
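The plating example above can be reproduced with a short calculation. The sketch below assumes, as in the example, that "2-fold" means 2 ml of lysate at the stated titer; the function and key names are hypothetical.

```python
import math


def plating_plan(titer_cfu_per_ml, fold, colonies_per_plate, ml_per_plate=1.0):
    """Return the plates and volumes needed to plate `fold` x the lysate titer."""
    lysate_ml = float(fold)  # `fold` ml of lysate carries fold x titer cfu
    cfu_to_plate = titer_cfu_per_ml * lysate_ml
    plates = math.ceil(cfu_to_plate / colonies_per_plate)
    total_ml = plates * ml_per_plate
    cells_ml = total_ml - lysate_ml  # remaining volume is made up with host cells
    if cells_ml < 0:
        raise ValueError("lysate volume exceeds the total plating volume")
    return {
        "cfu_to_plate": cfu_to_plate,
        "plates": plates,
        "total_ml": total_ml,
        "lysate_ml": lysate_ml,
        "cells_ml": cells_ml,
    }


plan = plating_plan(titer_cfu_per_ml=2_000_000, fold=2, colonies_per_plate=500_000)
print(plan)
```

With the figures from the example, this gives 8 plates, 8 ml total, 2 ml lysate and 6 ml of OD 1.0 cells.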
Harvesting, induction, and fixing of library in Exp503 cells. Scrape all cells
from plates into 20 ml LB using a rubber policeman. Dilute cells approx. 1:100
(200 µl cells/20 ml LB) and incubate at 37°C until culture is OD 0.3. Add 1:50
dilution of 20% sterile glucose and incubate at 37°C until culture is OD 1.0.
Add 1:100 dilution of 1 M MgSO4. Transfer 5 ml of culture to a fresh tube and
the remaining culture can be used as an uninduced control if desired or
discarded. Add MOI 5 of CE6 bacteriophage to the remaining 5 ml of culture.
(CE6 codes for T7 RNA Polymerase) (e.g., OD 1 = 8 x 10^8 cells/ml x 5 ml =
4 x 10^9 cells x MOI 5 = 2 x 10^10 bacteriophage needed). Incubate culture +
CE6 for 2 hr at 37°C. Cool on ice and centrifuge cells at 4000 rpm for 10 min.
Wash with 10 ml PBS. Fix cells in 600 µl PBS + 1.8 ml fresh, filtered 4%
paraformaldehyde. Incubate on ice for 2 hrs. (4% Paraformaldehyde: Heat 8.25 ml
PBS in flask at 65°C. Add 100 µl 1 M NaOH and 0.5 g paraformaldehyde (stored at
4°C). Mix until dissolved. Add 4.15 ml PBS. Cool to 0°C. Adjust pH to 7.2 with
0.5 M NaH2PO4. Cool to 0°C. Syringe filter. Use within 24 hrs). After fixing,
centrifuge at 4000 rpm for 10 min. Resuspend in 1.8 ml PBS and 200 µl 0.1%
NP40. Store at 4°C overnight.
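The MOI example in the paragraph above can be sketched as a small calculation. The function name is hypothetical; the conversion of OD 1 to 8 x 10^8 cells/ml is the figure given in the example.

```python
def phage_needed(od600, cells_per_ml_at_od1, culture_ml, moi):
    """Number of phage particles needed to infect a culture at a given MOI."""
    total_cells = od600 * cells_per_ml_at_od1 * culture_ml
    return total_cells * moi


# Values from the example: OD 1 culture, 8 x 10^8 cells/ml, 5 ml, MOI 5.
needed = phage_needed(od600=1.0, cells_per_ml_at_od1=8e8, culture_ml=5, moi=5)
print(f"{needed:.0e} CE6 bacteriophage")
```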
Hybridization of fixed cells. Centrifuge fixed cells at 4000 rpm for 10 min.
Resuspend in 1 ml 40 mM Tris pH 7.6/0.2% NP40. Transfer 100 µl fixed cells to
an eppendorf tube. Centrifuge for 1 min and remove supernatant. Resuspend each
reaction in 50 µl Hybridization buffer (0.9 M NaCl; 20 mM Tris pH 7.4; 0.01%
SDS;

25% formamide; can be made in advance and stored at -20°C). Add 0.5 nmol
fluorescein-labeled primer to the appropriate reactions. Incubate with rocking
at 46°C for 2 hr. (Hybridization temperature may depend on sequence of primer
and template.) Add 1 ml wash buffer to each reaction, rinse briefly and
centrifuge for 1 min. Discard supernatant. (Wash buffer: 0.9 M NaCl; 20 mM Tris
pH 7.4; 0.01% SDS). Add another 1 ml of wash buffer to each reaction, and
incubate at 45°C with rocking for 30 min. Centrifuge and remove supernatant.
Visualize cells under microscope using WIB filter.
FACS sorting. Dilute cells in 1 ml PBS. If cells are clumping, sonicate for 20
seconds at 1.5 power. FACS sort the most highly fluorescent single cells and
collect in 0.5 ml PCR strip tubes (approximately one 96-well plate/library).
PCR single cells with vector-specific primers to amplify the insert in each
cell. Electrophorese all samples on an agarose gel and select samples with
single inserts. These can be re-amplified with biotin-labeled primers,
hybridized to insert-specific primers, and examined in an ELISA assay. Positive
clones can then be sequenced. Alternatively, the selected samples can be
re-amplified with various combinations of insert-specific primers, or sequenced
directly.
Example 6
Large Insert FACS Biopanning Protocol
1. Encapsulate 1 vial of 3% home-made SeaPlaque gel. Each vial of gel can
make 10^6 GMD. Take 100 µl of thawed frozen fosmid pMF21/DH10B library,
OD600 = 0.4, to encapsulate, and centrifuge down to 10 µl. Melt agarose gel,
add 100 µl FBS (fetal bovine serum) and vortex. Place in 50°C water in a
beaker. Add 10 µl culture, vortex and add to 17 ml mineral oil. Shake about 30
times, place on the One Cell machine. Blend at 2600 rpm for 1 min at room
temperature and at 2600 rpm for 9 minutes on ice. Wash with PBS twice.
Resuspend in 10 ml LB + Apr50, shake at 37°C for 4 hours at 230 rpm. Check
microscopically to see the growth and size of microcolonies.

2. Centrifuge at 1500 rpm for 6 min. Resuspend GMDs in 5 ml of 2X SSC; they
can be saved at 4°C for several days. Take 200 µl GMD in 2X SSC for each
reaction.
3. Resuspend in 10 ml 2X SSC/5% SDS. Incubate 10 min at RT shaking or
rotating. Centrifuge.
4. Resuspend in 5 ml lysis solution containing proteinase K. Incubate 30 min
at 37°C shaking or rotating. Centrifuge.
Lysis Solution:
50 mM Tris pH 8            0.75 ml 1 M Tris
50 mM EDTA                 1.5 ml 0.5 M EDTA
100 mM NaCl                300 µl 5 M NaCl
1% Sarkosyl                0.75 ml 20% Sarkosyl
250 µg/ml Proteinase K     375 µl proteinase K stock (10 mg/ml)
                           11.325 ml dH2O
5. Resuspend in 5 ml denaturing solution. Incubate 30 min at RT shaking or
rotating. Centrifuge at 1500 rpm for 5 min.
Denaturing Solution:
0.5 M NaOH/1.5 M NaCl
6. Resuspend in 5 ml neutralizing solution. Incubate 30 min at RT shaking or
rotating. Centrifuge.
Neutralizing Solution:
0.5 M Tris pH 8/1.5 M NaCl
7. Wash in 2X SSC briefly.

8. Aliquot 200 µl/RXN into microcentrifuge tubes, microcentrifuge and take out
the 2X SSC. Add 130 µl "DIG EASY HYB" to prehyb for 45 minutes at 37°C.
Do prehyb and hyb in Personal Hyb Oven.
9. Aliquot oligo probe and denature at 85°C for 5 minutes, place on ice
immediately. Add appropriate amount of probe (0.5-1 nmol/RXN) and return
to rotating hyb. oven for O/N.
10. Prepare a 1% (10 mg/ml) solution of Blocking Reagent in PBS. Store at 4°C
for same-day use.
11. Wash GMDs with 0.8 ml of 2X SSC/0.1% SDS at RT for 15 min, rotating. In the
meantime, prewarm the next wash solution.
12. Wash GMDs with 0.8 ml of 0.5X SSC/0.1% SDS 2 x 15 min at appropriate temp,
rotating. If more stringency is required, the 2nd wash can be done in
0.1X SSC/0.1% SDS.
13. Wash with 0.8 ml/RXN 2X SSC briefly.
14. Block the reaction w/ 130 µl 1% Blocking Reagent in PBS at RT for 30
minutes.
15. Add 1.4 µl anti-DIG-POD (so 1:100) and incubate at RT for 3 hours.
16. Wash GMDs w/ 0.8 ml PBS/RXN 3 x 7 minutes at 37°C.
17. Prepare a tyramide working solution by diluting the tyramide stock
solution 1:85 in Amplification buffer/0.0015% H2O2. Apply 130 µl tyramide
working solution at RT and incubate in the dark at RT for 30 minutes.
18. Wash 3X for 7 min. in 0.8 ml PBS buffer at 37°C.
19. Visualize by microscope and FACS sort.
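The Lysis Solution recipe in step 4 is a standard stock-dilution calculation (C1 x V1 = C2 x V2). The sketch below recomputes the listed stock volumes as a check. The 15 ml final volume is an assumption inferred from the listed component volumes, which sum to 15 ml; the function and variable names are hypothetical.

```python
def stock_volume_ml(final_conc, stock_conc, final_volume_ml):
    """Volume of stock needed so that final_conc is reached in final_volume_ml.

    final_conc and stock_conc must be expressed in the same units.
    """
    if stock_conc <= 0 or final_conc > stock_conc:
        raise ValueError("stock must be positive and more concentrated than final")
    return final_conc * final_volume_ml / stock_conc


FINAL_VOLUME_ML = 15.0  # assumption: sum of the component volumes in the recipe

# (component, final concentration, stock concentration), same units per row
COMPONENTS = [
    ("Tris pH 8 (mM)", 50, 1000),
    ("EDTA (mM)", 50, 500),
    ("NaCl (mM)", 100, 5000),
    ("Sarkosyl (%)", 1, 20),
    ("Proteinase K (mg/ml)", 0.25, 10),
]

volumes = {
    name: stock_volume_ml(final, stock, FINAL_VOLUME_ML)
    for name, final, stock in COMPONENTS
}
water_ml = FINAL_VOLUME_ML - sum(volumes.values())

for name, vol in volumes.items():
    print(f"{name}: {vol:.3f} ml stock")
print(f"dH2O: {water_ml:.3f} ml")
```

The computed volumes match the recipe (0.75, 1.5, 0.3, 0.75 and 0.375 ml of stock, plus 11.325 ml dH2O).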

Example 7
Biopanning Protocol
Preparing Insert DNA from the Lambda DNA
PCR amplify inserts using vector specific primers CA98 and CA103.
CA98: ACTTCCGGCTCGTATATTGTGTGG
CA103: ACGACTCACTATAGGGCGAATTGGG
These primers match perfectly to lambda ZAP Express clones (pBKCMV).
Reagents: Lambda DNA prepared from the libraries to be panned (Librarians)
Roche Expand Long Template PCR System #1-759-060
Pharmacia dNTP mix #27-2094-01 or
Roche PCR Nucleotide Mix (10 mM) #1-581-295 or
Roche dNTP's - PCR grade #1-969-064
1. Make the insert amplification mix:
X µl dH2O (final 50 µl)
5 µl 10x Expand Buffer #2 (22.5 mM MgCl2)
0.5 or 0.625 µl dNTP mix (20 mM each dNTP)
ng (approx) lambda DNA per library (usually 1 µl or 1 µl 1:10 diln)
1-2 µl CA98 (100 ng/µl or 15 µM)
1-2 µl CA103 (100 ng/µl or 15 µM)
0.5 µl Expand Long polymerase mix
2. PCR amplify:
Robocycler
95°C 3 minutes x 1 cycle
95°C 1 minute
65°C 45 seconds x 30 cycles
68°C 8 minutes
68°C 8 minutes x 1 cycle
6°C ∞
3. Analyze 5 µl of reaction product on a gel.
Note: The reaction product should be a strong smear of products usually
ranging from
0.5-5 kb in size and centered around 1.5-2 kb.
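For planning purposes, the cycling program in step 2 can be written as data and its total programmed hold time estimated. This is a rough sketch only: it ignores ramp times and the final 6°C hold, and the PROGRAM structure and function name are hypothetical.

```python
# Each entry: (number of cycles, [(temperature in °C, hold time in seconds), ...])
PROGRAM = [
    (1, [(95, 180)]),                        # initial denaturation
    (30, [(95, 60), (65, 45), (68, 480)]),   # denature, anneal, extend
    (1, [(68, 480)]),                        # final extension
]


def total_hold_seconds(program):
    """Sum of all programmed hold times, excluding ramping and the final hold."""
    return sum(cycles * sum(sec for _, sec in steps) for cycles, steps in program)


seconds = total_hold_seconds(PROGRAM)
print(f"{seconds} s = {seconds / 60:.1f} min")
```

The program therefore runs for a little over five hours before ramping is taken into account.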

Prepare Biotinylated Hook
Reagents: PCR reagents
Biotin-14-dCTP (BRL #19518-018)
Individual dNTP stock solutions (Roche dNTP's #1-969-064)
Gene specific template and primers
PCR purification kit (Roche #1732668 or Qiagen Qiaquick #28106)
1. Make 10x biotin dNTP mix:
150 µl biotin-14-dCTP
3 µl 100 mM dATP
3 µl 100 mM dGTP
3 µl 100 mM dTTP
1.5 µl 100 mM dCTP
2. Make PCR mix:
74 µl water
µl 10x Expand Buffer #1
10 µl 10x biotin dNTP mix (step #1)
2 µl Primer #1 (100 ng/µl)
2 µl Primer #2 (100 ng/µl)
1 µl template (gene specific) (100 ng/µl)
1 µl Expand Long polymerase mix
3. PCR amplify:
Robocycler
95°C 3 minutes x 1 cycle
95°C 45 seconds
*°C 45 seconds x 30 cycles
68°C ** minutes
68°C 8 minutes x 1 cycle
6°C ∞
* Use an annealing temperature appropriate for your primers.
** Allow 1 minute/kb of target length.
4. Clean up the reaction product using a PCR purification kit. Elute in 50 µl
5T.1E or Qiagen's EB buffer (10 mM Tris pH 8.5).
5. Check 5 µl on an agarose gel.
Note: The product may be slightly larger than expected due to the
incorporation of
biotin.

Biopanning
Reagents: Streptavidin-conjugated paramagnetic beads (CPG MPG-Streptavidin
10 mg/ml #MSTR0502) (Dynal Dynabeads M-280 Streptavidin)
Sonicated, denatured salmon sperm DNA (heated to 95°C, 5 min)
(Stratagene #201190)
PCR reagents
dNTP mix
Magnetic particle separator
Topo-TA cloning kit with Top10F' comp cells (Invitrogen #K4550-40)
High Salt Buffer: 5 M NaCl, 10 mM EDTA, 10 mM Tris pH 7.3
1. Make the following reaction mix for each library/hook combination:
µg insert DNA (PCR amplified lambda DNA)
100 ng Biotinylated hook (100 ng total if using more than one hook)
4.5 µl 20x SSC for a 3x final concentration (or High Salt buffer)
X µl dH2O for a final volume of 30 µl
2. Denature by heating to 95°C for 10 min. (Robocycler works well for this
step).
3. Hybridize at 70°C for 90 min. (Robocycler)
4. Prepare 100 µl of MPG beads for each sample:
Wash 100 µl beads two times with 1 ml 3x SSC
Resuspend in: 50 µl 3x SSC (or High Salt buffer)
µl Sonicated, denatured salmon sperm DNA (10 mg/ml) to
block (or 100 ng total)
(Do not ice)
5. Add the hybridized DNA to the washed and blocked beads.
6. Incubate at room temp for 30 min, agitating gently in the hybridization
oven.
7. Wash twice at room temp with 1 ml 0.1x SSC/0.1% SDS (or high salt buffer)
using magnetic particle separator.
8. Wash twice at 42°C with 1 ml 0.1x SSC/0.1% SDS (or high salt buffer) for
10 min each. (magnet)
9. Wash once at room temp with 1 ml 3x SSC. (magnet)
10. Elute DNA by resuspending the beads in 50 µl dH2O and heating the beads to
70°C for 30 min or 85°C for 10 min in the hyb oven (or thermomixer at 500 rpm).
Separate using magnet, and discard the beads.

11. PCR amplify 1-5 µl of the panned DNA using the same protocol as Preparing
Insert DNA from the Lambda DNA above.
12. Check 5 µl on agarose gel.
Note: The reaction product should be a strong smear of products usually
ranging from 0.5-5 kb in size and centered around 1.5-2 kb.
13. Clone 1-4 µl into pCR2.1-TopoTA cloning vector.
14. Transform 2 x 3 µl into Top10F' chemically comp cells. Plate each
transformation on 2 x 150 mm LB-kan plates. Incubate at 30°C overnight.
(Ideal density is ~3000 colonies per plate).
Repeat transformation if necessary to get a representative number of colonies
per library. Archive the Biopanned DNA.
15. Transfer plates to Hybridization group, along with appropriate templates
and a single primer for run-off PCR 32P-labeling reactions.
Analysis of Results
1. Filter lifts from plates will be performed, and hybridized to the
appropriate probe. Resultant films will be given to the Biopanner.
2. Align films to original colony plates. Colonies corresponding to positive
"dots-on-film" should be toothpicked, patched onto an LB-Kan plate, and
inoculated in 4 ml TB-Kan. For automation, inoculate 1 ml TB-kan in a 96-well
plate and incubate 18 hrs. at 37°C.
3. Overnight cultures are mini-prepped (Biomek if possible). Digest with EcoRI
to determine insert size.
2 µl DNA
0.5 µl EcoRI
1 µl 10x EcoRI buffer
6.5 µl dH2O
Incubate at 37°C for 1 hr. Check insert size on agarose gel.
Large insert clones (>500 bp) are then PCR confirmed if possible with
gene-specific primers.
4. Putative positive clones are then sequenced.
5. Glycerol stocks should be made of all interesting clones (>500 bp).

Example 8
HIGH THROUGHPUT CULTIVATION OF MARINE MICROBES
FROM SEA SAMPLE
17. Preparation of cell suspension
Cells were obtained after filtering 110 L of surface water through a 0.22 µm
membrane. The cell pellet was then resuspended with seawater and a volume of
100 µL was used for cell encapsulation. This provided cell numbers of
approximately 10^7 cells per mL.
18. Cell encapsulation into GMDs
The following reagents were used: CelMix™ Emulsion Matrix and CelGel™
Encapsulation Matrix (One Cell Systems, Inc., Cambridge, MA), Pluronic F-68
solution and Dulbecco's Phosphate Buffered Saline (PBS, without Ca2+ and
Mg2+). Scintillation vials each containing 15 ml of CelMix™ emulsion matrix
were placed in a 40°C water bath and were equilibrated to 40°C for a minimum of
30 minutes. 30 µl of Pluronic Solution F-68 (10%) was added to each of 6 vials
of melted CelGel™ agarose. The agarose mixture was incubated at 40°C for a
minimum of 3 minutes. 100 µl of cells (resuspended in PBS) were added per 6
vials of the CelGel™ bottles and the resulting mixture was incubated at 40°C
for 3 minutes. Using a 1 ml pipette and avoiding air bubbles, the CelGel™-cell
mixture was added dropwise to the warmed CelMix™ in the scintillation vial.
This mixture was then emulsified using the CellSys100™ MicroDrop maker as
follows: 2200 rpm for 1 minute at room temperature (RT), then 2200 rpm for 1
minute on ice, then 1100 rpm for 6 minutes on ice, resulting in an
encapsulation mixture comprised of microdrops that were approximately 10-20
microns in diameter. The encapsulation mixture was then divided into two 15 ml
conical tubes and in each tube, the emulsion was overlaid with 5 ml of PBS. The
tubes were then centrifuged at 1800 rpm in a bench top centrifuge for 10
minutes at RT, resulting in a visible Gel MicroDrop (GMD) pellet. The oil phase
was then removed with a pipette and disposed of in an oil waste container. The
remaining aqueous supernatant was aspirated and each pellet was

resuspended in 2 ml of PBS. Each resuspended pellet was then overlaid with 10
ml of PBS. The GMD suspension was then centrifuged at 1500 rpm for 5 minutes at
RT. The overlaying process was repeated and the GMD suspension was centrifuged
again to remove all free-living bacteria. The supernatant was then removed and
the pellet was resuspended in 1 ml of seawater. 10 µl of the GMD suspension was
then examined under the microscope in order to check for uniform GMD size and
containment of the encapsulated organisms in the GMDs. This protocol resulted
in 1 to 4 cells encapsulated in each GMD.
19. Sorting of GMDs containing single cells for identification by 16S rRNA
gene
sequence
On the first day of cultivation we sorted occupied GMDs that contained one to
4 cells, although most had only single cells. The sorting was done in a Mo-Flo
instrument (Cytomation) by staining the cells inside the GMDs with Syto9 and
then selecting green fluorescence (from the stain) and side-scatter as
parameters for sorting gates. The staining was necessary since the cells are
much smaller than E. coli and therefore show very low light-scatter signals.
The target GMDs were sorted into a 96-well plate containing a PCR mixture and
ready to be amplified immediately after sorting. We used a Hotstart enzyme
(Qiagen) such that no reaction would occur before boiling for 15 min, which
allows working at room temperature before amplification. Before starting the
PCR it was necessary to irradiate the PCR mixture with a Stratalinker
(Stratagene) at full power for 14 min to cross-link any potential genomic DNA
present in the mixture before sorting. The primers used include the pairs 27F
and 1392R, and 27F and 1522R, according to the positions in the E. coli gene
sequence. The primers were obtained from mT-DNA Technologies and were purified
by HPLC. The primer concentration used in the reactions was 0.2 µM. We used a
"touchdown" program consisting of 3 stages: a) boiling 15 min, b) 15 cycles
decreasing the annealing temperature from 62 to 55°C by 0.5 degrees per cycle,
c) a series of cycles (20-40) increasing the annealing time 1 sec per cycle
starting with 30 sec but keeping the temperature constant at 55°C. All the
other stages of the PCR were as recommended by the manufacturer. This protocol
allowed the amplification of

the 16S rRNA gene from individual cells encapsulated or small consortia of
cells. The
PCR products were then cloned into TOPO-TA (Invitrogen) cloning vectors and
sequenced by dye-termination cycle sequencing (Perkin-Elmer ABI).
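The annealing steps of the "touchdown" program described above can be sketched as a schedule generator. This is an illustration under stated assumptions: the annealing time during the touchdown stage is not given in the text and is assumed here to be 30 sec, the number of cycles in the last stage is left as a parameter (20 to 40 in the text), and the function and parameter names are hypothetical.

```python
def touchdown_schedule(start_c=62.0, end_c=55.0, step_c=0.5,
                       extra_cycles=20, start_anneal_s=30, time_increment_s=1):
    """Return a list of (annealing temperature in °C, annealing time in s)."""
    # Stage b: lower the annealing temperature by step_c per cycle.
    n_touchdown = int(round((start_c - end_c) / step_c)) + 1
    schedule = [
        (start_c - i * step_c, start_anneal_s)  # assumed constant 30 s here
        for i in range(n_touchdown)
    ]
    # Stage c: constant temperature, annealing time grows each cycle.
    schedule += [
        (end_c, start_anneal_s + j * time_increment_s)
        for j in range(extra_cycles)
    ]
    return schedule


for cycle, (temp_c, anneal_s) in enumerate(touchdown_schedule(), start=1):
    print(f"cycle {cycle:2d}: anneal {temp_c:.1f} °C for {anneal_s} s")
```

Stepping from 62 to 55°C in 0.5 degree decrements yields exactly the 15 touchdown cycles stated in the text.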
Cell growth of encapsulated cells inside GMDs
The encapsulated GMDs were placed into chromatography columns that
allowed the flow of culture media providing nutrients for growth and also
washed out
waste products from cells. The experiment consisted of 4 treatments including
the use
of seawater, and amendments (inorganic nutrients including trace metals and
vitamins, amino acids including trace metals and vitamins, and diluted rich
organic
marine media). This different set of nutrients provided a gradient to bias
different
microbial populations. The seawater used as base for the media was filter
sterilized through 1000 kDa and 0.22 µm filter membranes prior to amendment and
introduction to the columns. The cells were then incubated for a period of 17
weeks
and cell growth was monitored by phase contrast microscopy. Cell
identification was
done by 16S rRNA gene sequence of grown colonies.
20. Sorting of GMDs containing colonies consisting of one or more cell types
To identify the diversity and the community composition of the different
treatments we performed a "bulk sorting" of the GMDs. This was done by taking a
subsample of the GMDs from each column and running them through the flow
cytometer. We selected forward- and side-scatter as gating criteria, as
occupied GMDs with a colony of 10 or more cells of individual cell sizes
ranging from 0.5 to 5 µm were easy to discriminate from empty GMDs. We verified
each time by phase contrast microscopy that we selected the correct gate for
sorting. We then sorted a total of 300 GMDs per individual PCR reaction
(prepared as above) and ran the reaction in a thermocycler for a total of 50 to
60 cycles to have enough PCR product to be visualized by gel electrophoresis.
The resulting PCR reactions from the same column were combined (2 to 4
replicates), cloned and sequenced as above to assess the phylogenetic diversity
from each column and observe the bias effect resulting from the use of
different nutrient regimes.

Gene sequencing and phylogenetic analyses
The gene sequences were aligned and compared to our 16S rRNA database
with the ARB phylogenetic program. Maximum Parsimony and neighbor joining
trees
were constructed using the amplified gene sequences (approximately 1400 bp).
Example 9
MICROEXTRACTION PROCEDURE
A single copy of Streptomyces containing clones from a mixed population are
FACS-
sorted onto agar, allowed to develop into individual colonies, and bioassayed
as
individual clones.
CONSTRUCTION OF A CLONE EXPRESSING A BIOACTIVE METABOLITE
A genomic library of Streptomyces murayamaensis is constructed in pJ0436
(Bierman et al., Gene 1991 116:43-49) vector and hybridized with probes for
polyketide synthase. A clone (1B) which hybridized was chosen and shuttled into
Streptomyces venezuelae ATCC 10712 strain. The vector pMF17 was also introduced
into S. diversa as a negative control. When bioassayed on solid media, clone 1B
expressed strong bioactivity towards Micrococcus luteus, suggesting that the
insert present in clone 1B encoded a bioactive polyketide molecule.
FACS-sorting of S. venezuelae clones
The S. venezuelae exconjugant spores containing clone 1B, as well as pJ0436
vector, are FACS-sorted in 48-well, 96-well, and 384-well format into
corresponding plates containing MYM agar + Apramycin 50 µg/ml. The single spore
clones were allowed to germinate, grow and sporulate for 4-5 days.

Natural product extraction procedure: After the clones were fully grown and
sporulated for 4-5 days, the following volumes of methanol were added to each
well containing the clones.
48 well format : 0.8 ml
96 well format : 0.100 ml
384 well format : 0.06 ml
The plates were incubated at room temperature overnight.
The next day, the following volumes were recovered from the wells containing
the
clones.
48 well format : 0.3 ml
96 well format : 0.060 ml
384 well format: 0.030 ml
The extracts were assayed from a single well, and after combining extracts
from 2, 4
and 10 wells.
The methanol extract was dried and resuspended in 40 µl of methanol:water,
20 µl of which was assayed against M. luteus as the indicator strain.
A single colony of S. venezuelae containing clone 1B produced enough bioactive
molecule, in 48-well, 96-well as well as 384-well format, to be extracted by
the
microextraction procedure and to be detected by bioassay.
Example 11
Expression of actinorhodin pathway in S. venezuelae 10712
When the Sau3A pIJ2303 library constructed in pJ0436 was introduced into S.
venezuelae, one exconjugant which appeared blue-grey in color was spotted. This
exconjugant showed blue pigment on R2-S agar, suggesting the successful
expression of a heterologous (actinorhodin) pathway in S. venezuelae.
Segregational stability of S. venezuelae 10712 (pJ0436::actinorhodin)
Since Streptomyces clones for small molecule production are grown in absence
of
antibiotic selection, it was important to determine how stable the S.
venezuelae

pJ0436 recombinant clones are. The S. venezuelae 10712 (pJ0436::actinorhodin)
clone was used as an example.
The act clone was grown in R2-S liquid cultures with and without apramycin and
total cell count was done by plating on R2-S agar with and without apramycin.
The
act clone gave 100% and 96% apramycin resistant colonies when grown with and
without apramycin, respectively. This suggests that S. venezuelae pJ0436
clones are quite stable segregationally.
Expression stability of S. venezuelae 10712 (pJ0436::actinorhodin)
We have shown successful expression of the actinorhodin gene cluster in S.
venezuelae 10712. However, when this clone was grown in liquid cultures it
failed to
produce actinorhodin, as determined by the absence of its blue color.
Nonetheless,
when mycelia from such cultures were plated on solid media, actinorhodin
producing
colonies were clearly evident. The majority of the colonies produced a faint
blue color
while a few colonies produced abundant actinorhodin. These colonies which
produce
actinorhodin abundantly have been named as HBC (hyper blue clones) clones.
These observations suggest that perhaps in HBC clones, a host mutation has
occurred which allows very efficient actinorhodin expression. Mutations which
could
lead to efficient actinorhodin expression could include a variety of targets
such as,
elimination of negative regulators like cutRS, overexpression of positive
regulators, or
efficient expression of pathways which provide precursors for actinorhodin.
The
hyper production of actinorhodin by the HBC clones thus strongly suggests that
it is
indeed possible for us to construct a strain which is more optimized for
heterologous
expression of small molecules, by random mutagenesis or by specific cutRS
knockout
mutagenesis.
Construction of a jadomycin blocked mutant of S. venezuelae
Orf1 of the jadomycin biosynthetic gene cluster was chosen as a target.
Primers were designed so as to amplify jad-L and jad-R fragments with proper
restriction sites for future subcloning. S. venezuelae is reasonably sensitive
to
hygromycin, and therefore the hygromycin resistance gene will be used to disrupt
the orf

1 gene. The strategy used for disrupting the jadomycin orf 1 is described in
the
attached figure. The hyg-disrupted copy of the orf 1 gene will then be placed
on
pKC1218 and used for gene replacement in the S. venezuelae 10712 and
VS153 chromosomes.
Expression of the yellow clone in S. venezuelae
The single arm rescue technique to recover the yellow clone insert from S.
lividans
clone 525Sm575 was described. The recovered clone #3 was mated into S.
venezuelae
10712 as well as VS153. Yellow color was evident after several days on both
10712
as well as VS153 plates but absent in the pJ0436 vector alone controls. Three
10712
yellow clones were grown in liquid R2-S medium and all three produced yellow
color
profusely. This experiment has validated S. venezuelae as a host and pJ0436 as
the
vector for heterologous expression for the second time, the first time being
with the
actinorhodin gene cluster. This yellow clone insert could now be used in
validation of
different strains in our strain improvement program.
3. Development of a mating protocol in a microtiter plate format.
In order to have the individual E. coli donor clones archived, we are
attempting to develop a mating protocol in a microtitre plate format.
According to this
protocol, we plan to sort the E. coli library into a 96-well microtitre plate.
The
matings with S. diversa would then be done on an R2-S agar plate in an array
format
corresponding to the 96-well microtitre plate containing the E. coli clones.
The
bioassays can be either conducted on the mating R2-S plate or the clones can
be first
replica plated on to another suitable agar plate and then bioassayed. This
approach
will allow us to go back to the E. coli clones once we detect a bioactive
clone among
the S. diversa exconjugant library. The E. coli clone can then be mated back
into S.
diversa for re-transformation and confirmation of the bioactivity.
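The array format described above depends on a fixed correspondence between each well of the 96-well archive plate and a spot on the R2-S mating plate. A minimal sketch of that bookkeeping is shown here, assuming the conventional 8 x 12 layout (rows A-H, columns 1-12); the function names and the clone identifiers are hypothetical.

```python
import string

ROWS = string.ascii_uppercase[:8]  # rows A-H of a standard 96-well plate
N_COLS = 12                        # columns 1-12


def well_name(index: int) -> str:
    """Convert a 0-based, row-major position (0..95) to a well label such as 'A1'."""
    if not 0 <= index < len(ROWS) * N_COLS:
        raise ValueError("index must be in the range 0..95")
    row, col = divmod(index, N_COLS)
    return f"{ROWS[row]}{col + 1}"


def well_index(name: str) -> int:
    """Convert a well label such as 'B3' back to its 0-based, row-major position."""
    row = ROWS.index(name[0].upper())  # raises ValueError for an unknown row letter
    col = int(name[1:])
    if not 1 <= col <= N_COLS:
        raise ValueError("column must be in the range 1..12")
    return row * N_COLS + (col - 1)


# Hypothetical archive: one donor clone identifier per well.
archive = {well_name(i): f"donor_clone_{i:03d}" for i in range(96)}

# A bioactive exconjugant spotted at array position 14 traces back to well B3.
hit_well = well_name(14)
print(hit_well, archive[hit_well])
```

Keeping the mapping row-major in both directions means a hit on the agar array can be traced to its E. coli donor well without any additional lookup table.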
In a preliminary experiment, matings were done by spotting S. diversa spores
together with E. coli donor cells on R2-S agar plate (rather than spreading).
After
about 8 hours the plate was overlaid as usual with apramycin and nalidixic
acid. The
exconjugants appeared only on those spots where the E. coli donor was added, but
not on

those spots containing S. diversa spores alone. These initial data are very
promising,
although some more standardization needs to be done to develop this technique
fully.
Example 12
Production of single cells or fragmented mycelia
In order to produce single cells or fragmented mycelia, 25 ml of MYM medium
(see recipe below) in a 250 ml baffled flask was inoculated with 100 µl of
Streptomyces 10712 spore suspension and incubated overnight at 30°C, 250 rpm.
After a 24 hour incubation, 10 ml was transferred to a 50 ml conical
polypropylene centrifuge tube and centrifuged at 4,000 rpm for 10 minutes at
25°C. The supernatant was decanted and the pellet was resuspended in 10 ml of
0.05 M TES buffer. The cells were sorted onto MYM agar plates (1 cell per
drop, 5 cells per drop, or 10 cells per drop) and the plates were incubated
at 30°C.
MYM media (Stuttard, 1982, J. Gen. Microbiol. 128:115-121) contains: 4 g
maltose, 10 g malt extract, 4 g yeast extract, 20 g agar, pH 7.3, water to 1 L.
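Because the recipe is stated per litre while the cultures above use much smaller volumes, the quantities have to be scaled. The following is a minimal sketch of that scaling; the function name is hypothetical, and the option to leave out agar for liquid cultures is an assumption, since the recipe as written lists agar without distinguishing liquid from solid medium.

```python
# Per-litre quantities (grams) taken from the MYM recipe above.
MYM_GRAMS_PER_LITRE = {
    "maltose": 4.0,
    "malt extract": 10.0,
    "yeast extract": 4.0,
    "agar": 20.0,
}


def scale_mym(volume_ml: float, include_agar: bool = True) -> dict[str, float]:
    """Scale the per-litre MYM recipe to the requested volume in millilitres."""
    if volume_ml <= 0:
        raise ValueError("volume_ml must be positive")
    factor = volume_ml / 1000.0
    return {
        name: grams * factor
        for name, grams in MYM_GRAMS_PER_LITRE.items()
        # Assumption: agar is left out when preparing liquid medium.
        if include_agar or name != "agar"
    }


# 25 ml of liquid medium (agar omitted by assumption) and 250 ml of agar medium.
for name, grams in scale_mym(25, include_agar=False).items():
    print(f"{name}: {grams:.3f} g")
print(scale_mym(250))
```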
Example 13
The following describes a method for the discovery of novel enzymes requiring
large
substrates (e.g., cellulases, amylases, xylanases) using the ultra high
throughput
capacity of the flow cytometer. As these substrates are too large to get into
a bacterial
cell, a strategy other than single intracellular detection must be employed in
order to
use the flow cytometer. For this purpose, we have adapted the gel microdrop
(GMD)
technology (One Cell Systems, Inc.). Specifically, the enzyme substrate is
captured
within the GMD and the enzyme allowed to hydrolyze the substrate within this
microenvironment. However, this method is not limited to any particular gel
microdrop technology. Any microdrop-forming material that can be derivatized
with
a capture molecule can be used. The basic experimental design is as follows:
Encapsulate individual bacteria containing DNA libraries within the GMDs and
allow
the bacteria to grow to a colony size containing hundreds to thousands of
cells each.
The GMDs are made with agarose derivatized with biotin, which is commercially

available (One Cell Systems). After appropriate colony growth, streptavidin is
added
to serve as a bridge between a biotinylated substrate and the biotin-labeled
agarose.
Finally, the biotinylated substrate will be added to the GMD and captured
within the
GMD through the biotin-streptavidin-biotin bridge. The bacterial cells will be
lysed
and the enzyme released from the cells. The enzyme will catalyze the
hydrolysis of
the substrate, thereby increasing the fluorescence of the substrate within the
GMD.
The fluorescent substrate will be retained within the GMD through the biotin-
streptavidin-biotin bridge and thus, will allow isolation of the GMD based on
fluorescence using the flow cytometer. The entire microdrop will be sorted and
the
DNA from the bacterial colony recovered using PCR techniques. This technique
can
be applied to the discovery of any enzyme that hydrolyzes a substrate with the
result
of an increased fluorescence. Examples include but are not limited to
glycosidases,
proteases, lipases, ferulic acid esterases, secondary amidases, and the like.
One system uses a biotin capture system to retain secreted antibodies within
the GMD. The system is designed to isolate hybridomas that secrete high levels
of a
desired antibody. This basic design is to form a biotin-streptavidin-biotin
sandwich
using the biotinylated agarose, streptavidin, and a biotinylated capture
antibody that
recognizes the secreted antibody. The "captured" antibody is detected by a
fluoresceinated reporter antibody. The flow cytometer is then used to isolate
the
microdrop based on increased fluorescence intensity. The potentially unique
aspect to
the method described here is the use of large fluorogenic substrates for the
determination of enzyme activity within the GMD. Additionally, this example
uses
bacterial cells containing DNA libraries instead of eukaryotic cells and is
not confined
to secreted proteins as the bacterial cells will be lysed to allow access to
the enzymes.
The fluorogenic substrates can be easily tailored to the particular enzyme of
interest. Described below is a specific example of the chemical synthesis of
an
esterase substrate. Additionally, two examples are given which describe the
different
possible chemical combinations that can be used to make a wide variety of
substrates.

Example of Reaction Sequence Leading to GMD-Attachable Substrate
[Reaction scheme: structure drawings not recoverable from the extracted text. Legible reagent label for the azide reduction: H2, Pd/C.]
In the first step, 1-amino-11-azido-3,6,9-trioxaundecane [Reference 3], an
asymmetric
spacer, is attached to the N-hydroxysuccinimide ester of 5-carboxyfluorescein
(Molecular Probes). After reduction of the azide functional group on the end
of the
attached spacer (step 2), activated biotin (Molecular Probes) is attached to
the amine
terminus (step 3), and the sequence is completed by esterification of phenolic
groups
of the fluorescein moiety (step 4). The resulting compound can be used as a
substrate
in screens for esterase activity.

Design of GMD-Attachable Fluorogenic Substrates
[Structure diagram: R1-C1-Fluor-C2-R2, with Fluor joined through C3-Spacer-C4 to the biotin moiety.]
Fluor - core fluorophore structure, capable of forming fluorogenic
derivatives, e.g.
coumarins, resorufins, xanthenes, and others.
Spacer - a chemically inert moiety providing connection between biotin moiety
and
the fluorophore. Examples include alkanes and oligoethyleneglycols. The choice
of
the type and length of the spacer will affect synthetic routes to the desired
products,
physical properties of the products (such as solubility in various solvents),
and the
ability of biotin to bind to deep pockets in avidin.
C1, C2, C3, C4 - connector units, providing covalent links between the core
fluorophore structure and other moieties. C1 and C2 affect the specificity of
the
substrates towards different enzymes. C3 and C4 determine stability of the
desired
product and synthetic routes to it. Examples include ether, amine, amide,
ester, urea,
thiourea, and other moieties.
R1 and R2 - functional groups, attachment of which provides for quenching of
fluorescence of the fluorophore. These groups determine the specificity of
substrates
towards different enzymes. Examples include straight and branched alkanes,
mono-
and oligosaccharides, unsaturated hydrocarbons and aromatic groups.

a. Design of GMD-Attachable Fluorescence Resonance Energy Transfer
Substrates
Fluor - A fluorophore. Examples include acridines, coumarins, fluorescein,
rhodamine, BODIPY, resorufin, porphyrins, etc.
Quencher - A moiety, which is capable of quenching fluorescence of the
fluorophore
when located at a close enough distance. Quencher can be the same moiety as
the
fluorophore or a different one.
Polymer - a moiety consisting of several blocks, a bond between which can be
cleaved by an enzyme. Examples include amines, ethers, esters, amides,
peptides, and oligosaccharides.
C1 and C2 are equivalent to C3 and C4 in the previous design.
Spacer is equivalent to Spacer in the previous design.
References:
[1] Gray, F., Kenney, J.S., Dunne, J.F. Secretion capture and report web: use
of affinity derivatized agarose microdroplets for the selection of hybridoma
cells. J. Immunol. Meth. 1995, 182, 155-163.

[2] Powell, K.T. and Weaver, J.C. Gel microdroplets and flow cytometry: Rapid
determination of antibody secretion by individual cells within a cell
population.
Biotechnology 1990, 8, 333-337.
[3] Schwabacher, A.W., Lane, J.W., Schiesher, M.W., Leigh, K.M., Johnson,
C.W. J. Org. Chem. 1998, 63, 1727-1729.
Example 14
The goal of this experiment is to develop an ultra high throughput screen
designed for
discovery of novel anticancer agents. In contrast to the traditional
combinatorial
chemistry or natural product extract approach, the method of Example 14 uses a
recombinant approach to the discovery of bioactive molecules. The examples use
complex DNA libraries from a mixed population of uncultured microorganisms
that
provide a vast source of natural products through recombinant expression from
whole
gene pathways. The two objectives of this Example include:
1) Engineering of mammalian cell lines as reporter cells for cancer targets to
be
used in ultra-high throughput assay system.
2) Detection of novel anticancer agents using an ultra high throughput FACS-
based screening format.
The present invention provides a new paradigm for screening technologies that
brings
the small molecule libraries and target together in a three dimensional ultra
high
throughput screen using the flow cytometer. In this format, it is possible to
achieve
screening rates of up to 10^8 per day. The feasibility of this system is tested
using
assays focused on the discovery of novel anti-cancer agents in the areas of
signal
transduction and apoptosis. Development of a validated assay should have a
profound
impact on the rate of discovery of novel lead compounds.
Experimental Design and Methods
1. Development of cell lines
The goal of this example is to develop an ultra high throughput screening
format that
can be used to discover novel chemotherapeutic agents active against a range
of
molecular targets known to be important in cancers. The feasibility of this
approach

will be tested using mammalian cell lines that respond to activation of the
epidermal
growth factor receptor (EGFR) with induction of expression of a reporter
protein.
The EGFR-responsive cells will be brought together with our microbial
expression
host within a microdrop (see Example 13 and co-pending U.S. patent 6,280,926,
and
U.S. application Serial No. 09/894,956, both herein incorporated by
reference). These
expression hosts will be Streptomyces or E. coli and will contain libraries
derived
from a mixed population of organisms, i.e. high molecular weight environmental
DNA (10-100kb fragments) cloned into the appropriate vectors and transferred
to the
host. These large DNA fragments will contain biosynthetic operons which
consist of
the genes necessary to produce a bioactive small molecule. A bioactive
molecule
from the microbial host will elicit a biological response in the mammalian
cell which
will induce expression of a fluorescent reporter. The entire microdrop will be
individually sorted on the flow cytometer based on fluorescence and the DNA
from
the host recovered. The mixed population libraries may contain from 10^4 to
10^10 clones,
including 10^5, 10^6, 10^7, 10^8, 10^9, or any multiple thereof.
An assay based on the EGF receptor was chosen because of its possible role in
the pathogenesis of several human cancers. The EGF-mediated signal
transduction pathway is very well characterized and several inhibitors of the
EGF receptor have been found from natural sources (21,22). The EGFR is
one of the early oncogenes discovered (erbB) from the avian erythroblastosis
retrovirus and due to a deletion of nearly all of the extracellular domain, is
constitutively active (23). Similar types of mutations have been found in 20-
30% of cases of glioblastoma multiforme, a major human brain tumor (24).
Overexpression of EGFR correlates with a poor prognosis in bladder cancer
(25), breast cancer (26,27), and glioblastoma multiforme (28). Most of these
cancers occur in an EGF-secreting background, which suggests an autocrine
growth mechanism in these cancers. Additionally, EGFR is overexpressed in
40-80% of non-small cell lung cancers and EGF is overexpressed in half of
primary lung cancers, with patient prognosis significantly reduced in cases
with concurrent expression of EGFR and EGF (29,30). For these reasons,

inhibitors of the EGF receptor are potentially useful as chemotherapeutic
agents for the treatment of these cancers.
The goal of this experiment is to create mammalian cell lines that serve as
reporter
cells for anticancer agents. HeLa cells endogenously express the EGFR as
confirmed
by FACS analysis using the anti-EGFR antibody, Ab-1 (Calbiochem). In contrast,
CHO cells have little or no expression of the EGFR. The gene encoding EGFR was
obtained from Dr. Gordon Gill (University of California, San Diego) and cloned
into the pcDNA3/hygro vector. The resulting vector was transfected into CHO
cells
and stable transformants selected with hygromycin. Enrichment of high EGFR-
expressing CHO cells was performed through two rounds of FACS sorting using
the
anti-EGFR antibody. For detection of the activated pathway, a parallel
approach is
being taken utilizing both the PathDetect system from Stratagene (San Diego,
CA)
and the Mercury Profiling system from Clontech (San Diego, CA). The PathDetect
system has been validated by researchers as a means of detecting mitogenic
stimuli
(31,32).
The EGFR is a tyrosine kinase receptor that functions through the MAP-kinase
pathway to activate the transcription factor Elk-1 (33). The PathDetect
product
includes a fusion trans-activator plasmid (pFA-Elk1) that encodes for
expression of a
fusion protein containing the activation domain of the Elk-1 transcription
activator
and the DNA binding domain of the yeast GAL4. A second plasmid contains a
synthetic promoter with five tandem repeats of the yeast GAL4 binding sites
that
control expression of the Photinus pyralis luciferase gene. The luciferase
gene was
removed and replaced with the gene encoding for the destabilized version of
the
enhanced green fluorescent protein (EGFP) (plasmid designated pFR-d2EGFP). The
two plasmids were transfected together into the EGFR/CHO and HeLa cells at a
ratio
of 10:1 (pFR-EGFP:pFA-Elk1) and stable transformants selected using the
neomycin
resistance gene located on the pFA-Elk1 plasmid. Thus, ligand binding to the
EGFR
will initiate a signal transduction cascade that results in activation of the
Elk1 portion
of the fusion protein, allowing the DNA binding domain of the yeast GAL4 to
bind to
its promoter and turn on expression of EGFP.

Stimulation in the presence of serum is not surprising as this signal
transduction pathway is common to most growth factors and it is likely that
many growth factors including EGF are present in the serum. After 24 hours
of significant serum starvation, this response is greatly reduced (Figure 2A).
The next step will be to selectively stimulate these cells with recombinant
EGF (Calbiochem) and isolate the highly responsive single clones using the
flow cytometer. These clones will be selected by sorting simultaneously for
high levels of GFP and the EGFR. The EGFR will be detected using an anti-
EGFR antibody with a secondary antibody labeled with phycoerythrin. This
system has the advantage that use of the yeast GAL4 promoter in these cells
should keep background or spurious induction of EGFP to a minimum.
The second group of cell lines uses the Mercury Profiling system to assay the
same EGFR pathway. This system responds to activation of the pathway with
an increase in the expression of human placental secreted alkaline phosphatase
(SEAP). A fluorescent signal will be obtained by the addition of the
phosphatase substrate ELF-97-phosphate (Molecular Probes), which yields a
bright fluorescent precipitate upon cleavage. The advantage of this approach
over the PathDetect system is the ability to amplify the signal through
enzyme
catalysis for low-level activation of the pathway. This parallel approach will
increase the probability of success in finding bioactive compounds. In the
Mercury Profiling system, a vector containing the cis-acting enhancer element
SRE and the TATA box from the thymidine kinase promoter is used to drive
expression of alkaline phosphatase (pTA-SEAP). This system relies on the
endogenous transactivators present in the cell, such as Elk-1, to bind the
SRE
element on the vector and drive expression of SEAP upon stimulation of
EGFR. The pTA-SEAP vector was transfected into the EGFR/CHO and HeLa
cells and stable transformants selected using neomycin. Again, stimulation of
the pathway occurred in the presence of serum factors in the media. Upon
serum starvation, this response was greatly reduced (Figure 2B). Single high
expressing clones will be isolated following stimulation with EGF and sorting
using a flow cytometer.

Development of ultra high throughput FACS assay
We have generated complex mixed population libraries (>10^6 primary
clones/library)
that provide access to the untapped biodiversity that exists in the >99%
uncultivable
microorganisms. These novel libraries require the development of ultra high
throughput screening methods to obtain complete coverage of the library. We
propose developing an assay using the flow cytometer that allows detection of
up to
10^9 clones/day.
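The phrase "complete coverage of the library" can be made concrete with a standard sampling estimate: if every clone is equally represented, the number of events N that must be screened to see a given clone with probability P in a library of L clones is N = ln(1 - P) / ln(1 - 1/L). The sketch below applies that formula; equal clone representation is an assumption not stated in the text, and the function name is hypothetical.

```python
import math


def clones_to_screen(library_size: int, probability: float) -> int:
    """Events to screen so that a given clone is sampled at least once with `probability`.

    Assumes every clone in the library is equally represented (an idealisation).
    """
    if library_size < 2:
        raise ValueError("library_size must be at least 2")
    if not 0.0 < probability < 1.0:
        raise ValueError("probability must lie strictly between 0 and 1")
    # log1p keeps precision when 1/library_size is very small.
    n = math.log1p(-probability) / math.log1p(-1.0 / library_size)
    return math.ceil(n)


# A library of 10^6 primary clones screened for 99% confidence of seeing any one clone.
print(clones_to_screen(10**6, 0.99))
```

Under this idealisation, 99% confidence requires screening roughly 4.6 times the library size, which is why throughput well above the nominal clone count matters.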
In this assay format (Figure 1), an expression host (Streptomyces, E. coli)
and a
mammalian reporter cell will be co-encapsulated together within a microdrop.
The
microdrop holds the cells in close proximity to each other and provides a
microenvironment that facilitates the exchange of biomolecules between the two
cell
types. The reporter cell will have a fluorescent readout and the entire
microdrop will
be run through the flow cytometer for clonal isolation. The DNA from the genes
or
pathway of interest will subsequently be recovered using in vitro molecular
techniques. This assay format will be validated for the discovery of both EGFR
inhibitors as well as for small molecules that induce apoptosis. With
validation of this
format, we will progress to the ultra high throughput screening phase designed
to
discover novel chemotherapeutic agents active against these important
molecular
mechanisms underlying tumorigenesis.
The feasibility of this approach will be analyzed initially using the
engineered cell
lines described above that respond to activation by EGF with increased
expression of
a reporter protein (i.e. EGFP or alkaline phosphatase). Additionally, this
initial study
will use an E. coli host that overexpresses human EGF as a secreted protein
directed
to the bacterial periplasm (34). This approach will allow us to validate the
assay
format prior to screening for inhibitors of the EGFR pathway using our E. coli
and
Streptomyces expression libraries. For this experiment, the engineered cell
lines will
be co-encapsulated together with the E. coli host at a ratio of one to one.
The EGF-
expressing bacteria will be allowed to grow and form a colony within the
microdrop.
Due to the vastly higher growth rate of bacteria, a colony of bacteria will
form prior to
any or minimal cell division of the eukaryotic cell. This colony will then
provide a
significantly increased concentration of the bioactive molecule. The bacterial
colony

will be selectively lysed using the antibiotic polymyxin at a concentration
that allows
cell survival (35). This antibiotic acts to perforate bacterial cell walls and
should
result in the release of EGF from these cells without affecting the eukaryotic
cell. In
the final discovery assays, this lysis treatment should not be necessary as
the small
molecule products will likely be able to freely diffuse out of the cell. The
EGF will
activate the signal transduction pathway in the eukaryotic cell and turn on
expression
of the reporter protein.
The microdrops will be run through the flow cytometer and those microdrops
exhibiting an increased fluorescence will be sorted. The DNA from the sorted
microdrops will be recovered using PCR amplification of the insert encoding
for
EGF. For the reporter cells expressing secreted alkaline phosphatase, a couple
of
additional steps are required to achieve a fluorescent readout. As the enzyme
is
secreted from the cell, it is possible to prevent the diffusion of the protein
from the
microdrop by selectively capturing it within the matrix of the microdrop. This
can be
accomplished by using microdrops made with agarose derivatized with biotin. By
forming a sandwich with streptavidin and a biotinylated anti-alkaline
phosphatase
antibody, it is possible to capture alkaline phosphatase where it can catalyze
the
conversion of the ELF-97 phosphate substrate within the microdrop (Figure 3A).
This technique was successfully developed by One Cell Systems for the
isolation of
high expressing hybridomas (36,37). In our hands, with the encapsulation of
the
SEAP expressing cells, we have shown that upon addition of the ELF-97
phosphatase
substrate, a fluorescent precipitate forms within the microdrop (Figure 3B&C).
Initial experiments demonstrate the feasibility of co-encapsulating E. coli
and
mammalian cells (e.g., CHO) within microdrops. Microdrops were formed using 3%
agarose dropped in oil and blended at 2600 rpm. The E. coli and CHO cells were
encapsulated at a ratio of 1:1 (Figure 4A). After 6 hours, the single
bacterial cell grew
into a colony containing thousands of cells (Figure 4B). The cells within the
microdrops were stained with propidium iodide to determine viability and
approximately 70-~S % of the CHO cells remained viable after 24 hours.
Subsequent
steps include determining the response of encapsulated clonal EGF-responsive

mammalian cells to varying concentrations of EGF in the presence and absence
of
EGFR inhibitors such as Tyrphostin A46 or Tyrphostin A48 (Calbiochem). In
addition, E. coli clones producing high levels of secreted EGF will be
isolated using
the Quantikine human EGF immunoassay (R&D Systems). Finally, these two cell
types will be brought together within the microdrop and a change in
fluorescence of
the eukaryotic cell will be analyzed on the flow cytometer in the presence and
absence
of the EGFR inhibitors. A positive result in this experiment would be an
increase in
fluorescence that can be blocked by the EGFR inhibitors.
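When two cell types are co-encapsulated at a nominal 1:1 ratio, not every microdrop receives exactly one cell of each type. A common way to estimate occupancy is to treat the loading of each cell type as an independent Poisson process; that model is an assumption introduced here for illustration and is not stated in the text. The function names and the mean occupancies are likewise hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k cells in a microdrop for a given mean occupancy."""
    if k < 0 or mean < 0:
        raise ValueError("k and mean must be non-negative")
    return math.exp(-mean) * mean**k / math.factorial(k)


def fraction_one_of_each(mean_bacteria: float, mean_mammalian: float) -> float:
    """Fraction of microdrops holding exactly one bacterium and one mammalian cell.

    Assumes the two cell types load independently (an idealisation).
    """
    return poisson_pmf(1, mean_bacteria) * poisson_pmf(1, mean_mammalian)


# Hypothetical mean occupancies of one cell of each type per microdrop.
print(f"{fraction_one_of_each(1.0, 1.0):.3f}")
```

Under these assumptions only a minority of microdrops carry the ideal one-plus-one pairing, so the number of microdrops analysed would need to exceed the number of clones of interest.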
The next step will be to mix the EGF-expressing E. coli with non-expressing
cells at
varying ratios from 1:1,000 to 1:1,000,000 to mimic the conditions of a mixed
population library discovery screen. The bacterial mixtures and the mammalian
cells
will be co-encapsulated as described above. The highly fluorescent microdrops
will
be individually sorted by the flow cytometer. To confirm a positive hit, the
DNA will
be recovered by PCR amplification using primers directed against the EGF gene.
To
improve the signal to noise ratio, it is likely that it will be necessary to
undergo
several rounds of enrichment before isolation of positive EGF-expressing
clones,
especially for the higher mixture ratios.
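The need for several rounds of enrichment at the higher mixture ratios can be illustrated with a simple model in which each sort multiplies the odds of a positive clone by a fixed factor. The sketch below counts the rounds required to reach a target purity; the fold enrichment per round is a hypothetical parameter, not a measured value, and a constant factor across rounds is an idealisation.

```python
def rounds_to_enrich(start_fraction: float, fold_per_round: float,
                     target_fraction: float = 0.5, max_rounds: int = 50) -> int:
    """Count sort rounds needed for positives to reach `target_fraction`.

    Each round multiplies the odds (positives : negatives) by `fold_per_round`.
    """
    if not 0.0 < start_fraction < 1.0 or not 0.0 < target_fraction < 1.0:
        raise ValueError("fractions must lie strictly between 0 and 1")
    if fold_per_round <= 1.0:
        raise ValueError("fold_per_round must be greater than 1")
    odds = start_fraction / (1.0 - start_fraction)
    target_odds = target_fraction / (1.0 - target_fraction)
    rounds = 0
    while odds < target_odds:
        if rounds >= max_rounds:
            raise RuntimeError("target not reached within max_rounds")
        odds *= fold_per_round
        rounds += 1
    return rounds


# Hypothetical 100-fold enrichment per sort, for the two extreme mixture ratios.
for ratio in (1e-3, 1e-6):
    print(f"start 1:{round(1 / ratio):,} -> {rounds_to_enrich(ratio, 100.0)} rounds")
```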
In this case, the microdrops will first be sorted in bulk, the microdrop
material
removed with GELase (Epicentre Technologies) and the bacteria allowed to grow.
The encapsulation protocol will be repeated with fresh eukaryotic cells until
a highly
enriched population is observed. At this point, single microdrops will be
isolated and
recovery of the EGF-expressing clone confirmed by PCR. With validation of this
assay, the goal will be to screen for inhibitors of the EGFR using our mixed
population libraries expressed in optimized E. coli and Streptomyces hosts.
This
assay will be done in the presence of EGF and the assay endpoint will be a
decrease in
fluorescence. This format is not limited to only EGFR inhibitors as any
protein
within this pathway could be inhibited and would appear positive in this
screen.
Likewise, this screen can also be adapted to the multitude of anti-cancer
targets that
are known to regulate gene expression. In fact, using this present system,
with the
addition of the appropriate receptors, it would be possible to screen for
inhibitors of
other growth factors such as PDGF and VEGF.

If an increase in fluorescence is not observed with co-encapsulation of the
EGF-
expressing cells and the mammalian reporter cell, there could be several
reasons.
First, it is possible that the EGF diffuses out of the cell too quickly to
elicit a
response. In this case, it will be necessary to modify the microdrops to limit
diffusion
and concentrate the bioactive molecule at the site of the reporter cell. It is
also
possible that in the specific case of the EGF assay, the cells will not
continue to
produce EGF after polymyxin treatment and thus, the incubation time of the
reporter
cells with EGF will be minimal. This is unlikely as the polymyxin treatment
used will
be at concentrations well below that which produces decreased cell viability.
However, if EGF is not continually expressed in this system, other
permeabilization
methods will be explored that do not significantly affect cell metabolism,
such as the
bacteriocin release protein (BRP) system (Display Systems Biotech). The BRP
opens
the inner and outer membranes of E. coli in a controlled manner enabling
protein
release into the culture medium. This system can be used for large-scale
protein
production in a continuous culture and thus should be compatible with cell
survival.
Apoptosis, or programmed cell death, is the process by which the cell
undergoes
genetically determined death in a predictable and reproducible sequence. This
process is associated with distinct morphological and biochemical changes that
distinguish apoptosis from necrosis. The malfunctioning of this essential
process can
often lead to cancer by allowing cells to proliferate when they should either
self
destruct or stop dividing. Thus, the mechanisms underlying apoptosis are
currently
under intense scrutiny from the research community and the search for agents
that
induce apoptosis is a very active area of discovery.
The present invention provides for the development of an assay for the discovery of
apoptotic
molecules using our ultra high throughput encapsulation technology. The source
of
these small molecules will come from our extremely complex mixed population
libraries expressed in Streptomyces and E. coli host strains. These host
strains will be
co-encapsulated together with a eukaryotic reporter cell, the small molecule
will be
produced in the bacterial strain, and will act on the mammalian reporter cell
which
will respond by induction of apoptosis. Apoptosis will be detected using a
fluorescent
marker, the entire microdrop sorted using the flow cytometer, and the DNA of
interest

recovered. The feasibility of this assay will be determined using our
optimized
Streptomyces host strain, S. diversa, co-encapsulated with the apoptotic
reporter cell
derived from human T cell leukemia (e.g., Jurkat cells). The pathway
controlling
production of the anti-tumor antibiotic, bleomycin, will be cloned into S.
diversa as
the source of an apoptosis-inducing agent. The readout for induction of
apoptosis in
Jurkat cells will be obtained using the fluorescent marker, Alexa 488-annexin
V.
The bleomycin group of compounds are anti-tumor antibiotics that are currently
being
used clinically in the treatment of several types of tumors, notably squamous
cell
carcinomas and malignant lymphomas. However, widespread use of bleomycin
congeners has been limited due to early drug resistance and the pulmonary
toxicity
that develops concurrent with administration of this drug. Thus, there is
continuing
effort to find novel small molecules with better clinical efficacy and lower
toxicity.
Bleomycin congeners are peptide/polyketide metabolites that function by
binding to
sequence selective regions of DNA and creating single and double stranded DNA
breaks. Several in vitro and in vivo assays have shown that bleomycin induces
apoptosis in eukaryotic cells (43-45). The biosynthetic gene cluster encoding
for the
production of bleomycin has recently been cloned from Streptomyces verticillus
and
is encoded on a contiguous 85 kb fragment (46). We propose to clone this
pathway
into a BAC vector to use as a source of apoptotic agents in eukaryotic cells.
A library
will be made from the S. verticillus ATCC15003 strain and cloned into the BAC
vector, pBlumate2. As the sequence for this pathway is known, probes will be
designed against sequences from the 5' and 3' ends of the pathway. The library
will
be introduced into E. coli and screened using colony hybridization with the
probe
generated against one end of the pathway. Positive clones will subsequently be
screened with the second probe to identify which clone contains the entire
pathway.
Clones containing the complete pathway will be transferred into our optimized
expression host S. diversa by mating. Expression of bleomycin will be detected
using
whole cell bioassays with Bacillus subtillis.
Jurkat cells are the classic human cell line used for studies of apoptosis. The
fluorescent Alexa 488 conjugate of annexin V (Molecular Probes) will be used as
the marker of apoptosis in these cells. Annexin V binds to phosphatidylserine
molecules, which are normally located on the inner leaflet of the membrane in
healthy cells. During early apoptosis, this molecule flips to the outer leaflet
of the membrane and can be detected on the cell surface using fluorescent
markers such as the annexin V conjugates. The bleomycin-induced apoptotic
response in Jurkat cells will initially be characterized by varying both the
concentration of the exogenously administered drug and the incubation time with
the drug. Alexa 488-annexin V will then be added to the cells and the level of
fluorescence analyzed on the flow cytometer. Necrotic cell death will be
determined using propidium iodide and the apoptotic population will be
normalized to this value.
Co-encapsulation of S. diversa with CHO cells within microdrops produced results
very similar to the E. coli co-encapsulation. S. diversa grew well in the
eukaryotic media and the CHO cell survival rate was high after 24 hours. In this
experiment, the S. diversa clone expressing bleomycin will be co-encapsulated
with the Jurkat cell line. S. diversa will be allowed to grow into a colony
within the microdrop and begin production of bleomycin. The microdrops will be
analyzed periodically for induction of apoptosis using the Alexa 488-annexin V
conjugate on the microscope and flow cytometer. After noting the time for
induction of apoptosis, a mixing experiment similar to that described for the
EGF experiment will be performed. Bleomycin-expressing and non-expressing cells
will be mixed together at ratios of 1:1000 to 1:1,000,000. Co-encapsulation of
the mixtures with Jurkat cells will be performed and the appropriate incubation
time maintained. These microdrops will then be stained with Alexa 488-annexin V
and sorted on the flow cytometer. Confirmation of a positive
bleomycin-expressing sorted clone will be performed by PCR amplification of a
portion of the pathway. Again, it is likely that enrichment of these mixtures
will be necessary using a few rounds of bulk sorting on the flow cytometer.
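The need for several rounds of bulk sorting at these mixing ratios can be
illustrated with a short calculation. The following Python sketch is only an
illustration under stated assumptions: the function name `enrich`, the recovery
of true positives per round, and the false-positive rate per round are
hypothetical values chosen for the example and are not measurements from this
work.

```python
def enrich(pos_fraction, recovery, false_positive_rate, rounds):
    """Return the fraction of positive microdrops after each sorting round.

    pos_fraction        -- starting fraction of positives (e.g. 1/1001 for 1:1000)
    recovery            -- assumed fraction of true positives kept per round
    false_positive_rate -- assumed fraction of negatives kept per round
    rounds              -- number of bulk sorting rounds
    """
    history = [pos_fraction]
    for _ in range(rounds):
        kept_pos = pos_fraction * recovery
        kept_neg = (1.0 - pos_fraction) * false_positive_rate
        pos_fraction = kept_pos / (kept_pos + kept_neg)
        history.append(pos_fraction)
    return history


if __name__ == "__main__":
    # Hypothetical sorter: keeps 80% of positives and 1% of negatives per round.
    for start in (1 / 1001, 1 / 1000001):
        fractions = enrich(start, recovery=0.8, false_positive_rate=0.01, rounds=4)
        print([f"{f:.3g}" for f in fractions])
```

Under these assumed rates each round multiplies the positive-to-negative odds
by the ratio of recovery to false-positive rate, so a 1:1,000,000 mixture needs
more rounds than a 1:1000 mixture before positives dominate the sorted
population.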
If no apoptosis is observed in the initial assay, confirmation of bleomycin
production will be performed by sorting the encapsulated S. diversa clone into
1536-well plates. After a predetermined incubation period, the supernatant will
be removed and spotted on filter disks for whole cell bioassays using the
susceptible strain B. subtilis. Use of the 1536-well plates is intended to
avoid significant dilution of the antibiotic in the media. As cloning of the
bleomycin pathway is quite recent, it has not yet been heterologously expressed
from the complete pathway. However, Du et al. demonstrated the heterologous
bioconversion of the inactive aglycones into active bleomycin congeners by
cloning a portion of the pathway into an S. lividans host (46). If bleomycin
expression is not detectable in our assay, we will employ a similar strategy
using our host strain S. diversa. If little bleomycin production is detected
under these conditions, it will be necessary to optimize the culture conditions
for S. diversa to induce pathway expression within the microdrop. On the other
hand, if bleomycin is produced but apoptosis is not observed, it is possible
that the molecule is diffusing away from the microdrop too quickly, and it will
be necessary to optimize the microdrop technology to concentrate the metabolite
at the site of the reporter cell.
Optimization of S. diversa secondary metabolite expression in microdrop
Induction of pathway expression is an issue that is not limited to the
bleomycin
example. Bioactive small molecules within microorganisms are often produced to
increase the host's ability to survive and proliferate. These compounds are
generally
thought to be nonessential for growth of the organism and are synthesized with
the aid
of genes involved in intermediary metabolism, hence the name "secondary
metabolites." Thus, the pathways controlling expression of these secondary
metabolites are often regulated under non-optimal conditions such as stress or
nutrient
limitation. As our system relies on use of the endogenous promoters and
regulators, it
might be necessary to optimize conditions for maximal pathway expression.
There are several methods that can be used to optimize for increased pathway
expression within the microdrops. For easy detection of maximal expression, we
will construct a transposon containing a promoter-less GFP. The enhanced GFP
optimized for eukaryotes will be used as it has a codon bias for high GC
organisms. Transposition into a known pathway (e.g., actinorhodin) will be done
in vitro and the vector containing the pathway purified. The transposants will
be introduced into an E. coli host, screened for clones that express GFP, and
positive clones isolated on the flow cytometer. With the transfer of the
promoter-less gene for GFP into the pathway, increased fluorescence within the
cells would suggest transcription of the pathway using the endogenous promoters
located within the pathway. This clone will be used as a tool for quick
detection of upregulation in pathway expression due to changes in the
experimental conditions.
The S. diversa clone containing GFP and the actinorhodin pathway will be
encapsulated in the microdrops and several different growth conditions will be
tested,
e.g., conditioned media, nutrient limiting media, known inducing factors,
varying
incubation times, etc. The microdrops will be analyzed under the microscope
and on
the flow cytometer to determine which conditions produce optimal expression of
the
pathway. These conditions will be verified for viability in eukaryotic cells
as well.
These optimized growth conditions will be confirmed using the bleomycin
pathway to
assess production of the secondary metabolite. Additionally, whole cell
optimization
of S. diversa is ongoing with production of strains that are missing different
pleiotropic regulators that often negatively impact secondary metabolite
production.
As these strains are developed, they will be analyzed in the microdrops for
enhanced
pathway expression.
The proximity of the two cell types within the microdrop should result in a
high
concentration of the bioactive molecule at the site of the reporting cell.
However, if
rapid diffusion of the molecule from the microdrop prevents detection of the
desired
signal, it will be necessary to optimize the microdrop protocol or develop a
new
encapsulation technology. Concentration of the molecule at the site of the
reporter
cell could be achieved by a reduction in the microdrop pore size. Pore size
reduction
can be accomplished by one or a combination of the following approaches: (i)
"plugging" the holes with particles of an appropriate size, which are held in
the pores
by non-covalent or covalent interactions; (ii) cross-linking of the microdrop-
forming
polymer with low molecular weight agents; (iii) creation of an external shell
around
the microdrop with pores of smaller size than those in the current microdrop.
(i) Plugging the pores can be accomplished using polydisperse latexes
with particles sized to fit within the pores of the microdrop. Latex
particles may be modified on their surface such that they are attracted
to the microdrop-forming polymer. For example, agarose-based
microdrops carry a negative electrostatic charge on the surface. Thus,
amidine-modified polystyrene latex particles (Interfacial Dynamics
Corporation) will be attracted to the microdrop surface and the latex
particles will effectively plug the microdrop pores provided that the
charge density on the latex particles and the microdrop surface is high
enough to sustain strong electrostatic bonds.
(ii) Cross-linking of agarose beads can be achieved by treating them with
various reagents according to known procedures (47). For our
purposes, the cross-linking needs to occur only on the surface of the
microdrop. Thus, it may be advantageous to use polymers carrying
reactive groups for cross-linking of agarose, such that permeation of
the cross-linking agent inside the microdrop is prevented.
(iii) Formation of classical (48) or polymerizable liposomes (49,50) around
microdrops would provide a shell that could be an effective barrier
even to small molecules. A wide variety of precursors for such
liposomes as well as methods for their preparation have been reported
(48-50) and most of them are applicable for our purposes. One of the
possible limitations in choice of precursors stems from the intended
use of microdrops for eventual screening by the flow cytometer. Thus,
the liposomes should not absorb in the visible part of the spectrum.
It might also be necessary to use alternative methods and materials for
preparation of the microdrops. Encapsulation of cells in polyacrylamide,
alginate, fibrin, and other gel-forming polymers has been described (51).
Another plausible candidate for encapsulation material is silica gel, which can
be formed under physiological conditions with the assistance of enzymes
(silicateins) (52) or enzyme mimetics (53). Additionally, various polymers may
be used as the material for microdrop construction. Microdrops may be formed
either upon polymerization of monomers (i.e., water-soluble acrylates or
methacrylates) or upon gelation and/or cross-linking of preformed polymers
(polyacrylates, polymethacrylates, polyvinyl alcohol). Since the formation of
microdrops occurs simultaneously with encapsulation of living cells, such
formation has to proceed under conditions compatible with cell survival. Thus,
the precursors for microdrops (monomers or non-gelated polymers) should be
soluble in aqueous media at physiological conditions and capable of
transformation into the microdrop material without any significant
participation and/or emission of toxic compounds.
Example 15
Identification of a Novel Bioactivity or Biomolecule of Interest by Mass
Spectroscopic Screening
An integrated method for the high throughput identification of novel
compounds derived from large insert libraries by Liquid Chromatography - Mass
Spectrometry was performed as described below.
A library from a mixed population of organisms was prepared. An extract of
the library was collected. Extracts from the libraries were either pooled or
kept separate. Control extracts, without a bioactivity or biomolecule of
interest, were also prepared.
Rapid chromatography was used with each extract, or combination of extracts,
to aid the ionization of the compounds in the extract. Mass spectra were
generated for the natural product expression host (e.g., S. venezuelae) and
vector alone (e.g., pJ0436) system. Mass spectra were also generated for the
host cells containing the library extracts, alone or pooled. The spectra
generated from multiple runs of either the background samples or the library
samples were combined within each set to create a composite spectrum. Composite
spectra may be generated by using a percentage occurrence of an average
intensity of each binned mass per time period or by using multiple aligned
single mass spectra over a time period. By using a redundant sampling method
where each sample was measured several times in the presence of other extracts,
the novel signals that consistently occurred within a sample extract but not
within the background spectra were determined.
The host-vector background spectrum was compared to the mass spectra obtained
from large insert library clone extracts. Extra peaks observed in the large
insert library clone extracts were considered novel compounds, and the cultures
responsible for the extracts were selected for scale-up culture so that the
compounds could be isolated and identified.
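The comparison of composite spectra described above can be sketched in a few
lines of Python. This is a minimal illustration and not the program used in
this example: the function names `occurrence` and `novel_bins`, the use of
unit-width mass bins obtained by rounding down, and the toy spectra are
assumptions made for the sketch. The 40 percent cutoff mirrors the
CutoffPercent value that appears in the Matlab listing later in this document.

```python
import math
from collections import Counter


def occurrence(runs, bin_width=1.0, min_intensity=0.0):
    """Percent of runs in which each mass bin contains a signal.

    runs is a list of spectra; each spectrum is a list of (m/z, intensity) pairs.
    """
    counts = Counter()
    for run in runs:
        # A bin is counted at most once per run, however many peaks fall in it.
        bins = {math.floor(mz / bin_width) for mz, inten in run if inten > min_intensity}
        counts.update(bins)
    return {b: 100.0 * n / len(runs) for b, n in counts.items()}


def novel_bins(sample_runs, background_runs, cutoff=40.0):
    """Mass bins consistently present in the sample but not in the background."""
    sample = occurrence(sample_runs)
    background = occurrence(background_runs)
    return sorted(
        b for b, pct in sample.items()
        if pct >= cutoff and background.get(b, 0.0) < cutoff
    )


if __name__ == "__main__":
    # Toy data: the signal near m/z 412 recurs in the sample runs only.
    background = [[(150.2, 10), (300.1, 5)], [(150.4, 12)]]
    sample = [[(150.3, 9), (412.7, 30)], [(412.2, 25), (300.0, 4)], [(412.9, 28)]]
    print(novel_bins(sample, background))
```

Counting occurrences per bin rather than averaging intensities makes the
comparison insensitive to run-to-run intensity variation, at the cost of
ignoring how strong a signal is once it exceeds the intensity threshold.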

Novel metabolite identification by mass spectroscopic screening.
An integrated method for the high throughput identification of novel compounds
derived from large insert libraries by LC-MS is described below. Liquid
chromatography-mass spectrometry is used to determine the background mass
spectra of the natural product expression host (e.g., S. diversa DS10 or DS4)
and vector alone (e.g., pmfl7) system. This host-vector background spectrum is
compared to the mass spectra obtained from large insert library clone extracts.
Extra peaks observed in the large insert library clone extracts are considered
novel compounds, and the cultures responsible for the extracts are selected for
scale-up culture so that the compounds can be isolated and identified.
In order to create the background and sample spectra, rapid chromatography is
used to aid the ionization of the compounds in the extract. The spectra
generated from multiple runs of either the background samples or the library
samples are combined within each set to create a composite spectrum. Composite
spectra may be generated by using a percentage occurrence of an average
intensity of each binned mass per time period or by using multiple aligned
single mass spectra over a time period. Using a redundant sampling method
whereby each sample is measured several times in the presence of other
extracts, the novel signals that consistently occur within a sample extract but
are not present in the background spectra can be determined. The purpose of
this invention is to identify novel compounds produced by recombinant genes
encoding biosynthetic pathways without relying on the compounds having
bioactivity. This detection method is expected to be more universal than
bioactivity for identifying novel compounds.
Currently there is a similar method of examining culture mixtures by LC-MS with
long chromatographic times (30-60 min) to bring compounds to a fairly high
level of purity. This method relies on molecular weight searches for
dereplication of known compounds. This slow method would also work to identify
novel compounds in S. diversa libraries; however, the throughput would be
inadequate for the number of samples we need to screen. There is a pair of
publications describing rapid direct infusion analysis of samples to identify
fermentation conditions which improve the biosynthetic productivity of strains.
This method does not identify specific compounds; it only correlates greater,
more complex production with different culture conditions.

Shown below are the following:
1. Chromatographic gradient and mass spec conditions
  • HPLC and MS setting for Mass Spec Screening.TXT
2. Pooling of samples sheet
  • Sampling Strategy.htm
3. Sample flow using average method
  • Mass Spec Screening Flow chart.doc
4. Matlab code for original average background
  • Mass Spec Screening Summary6 Matlab code.txt
5. Matlab code under development for new single aligned peaks background
determination for more accurate data analysis
  • Mass Spec Screening 2nd Data Analysis Program.txt
The method is best practiced with a set of control extracts and sample
extracts.
Mixing of the compounds in pools prior to analysis and deconvolution of the
mixed
extract pools will provide high throughput while maintaining the ability to
measure
each extract several times.
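One way to pool extracts so that each is measured more than once, and then to
deconvolute the pools, is a row-and-column grid. The Python sketch below is an
illustration only: the grid design, the function names `grid_pools` and
`deconvolute`, and the extract labels are assumptions made for this example,
and the pooling sheet referred to above may use a different scheme.

```python
def grid_pools(extracts, n_cols):
    """Assign each extract to one row pool and one column pool of a grid."""
    pools = {}
    for idx, name in enumerate(extracts):
        row, col = divmod(idx, n_cols)
        pools.setdefault(("row", row), []).append(name)
        pools.setdefault(("col", col), []).append(name)
    return pools


def deconvolute(pools, positive_pools):
    """Return extracts for which every pool containing them shows the signal."""
    positive = set(positive_pools)
    membership = {}
    for pool_id, members in pools.items():
        for name in members:
            membership.setdefault(name, set()).add(pool_id)
    return sorted(name for name, ids in membership.items() if ids <= positive)


if __name__ == "__main__":
    extracts = ["E1", "E2", "E3", "E4", "E5", "E6"]
    pools = grid_pools(extracts, n_cols=3)      # 2 row pools + 3 column pools
    # A signal seen only in row pool 1 and column pool 1 points to extract E5.
    print(deconvolute(pools, [("row", 1), ("col", 1)]))
```

With this layout six extracts are covered by five pooled injections and each
extract is still measured twice; the saving grows with the size of the grid.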
A secondary screen may be required to eliminate false positives.
This method is more specific for identifying potential novel compounds by
molecular ion than current methods. It uses a different data analysis strategy
than the dereplication methods for the identification of specific peaks for new
compounds in extracts. Because it uses the molecular ion as the signal to
collect on, this method may be coupled to mass-based collection methods for the
rapid isolation of compounds.
Related references:
"Rapid Method to Estimate the Presence of Secondary Metabolites in Microbial",
Higgs, R. E.; Zahn, J. A.; Gygi, J. D.; Hilton, M. D.; Appl. Environ.
Microbiol. 67:371-376.
"Use of direct-infusion electrospray mass spectrometry to guide empirical
development of improved conditions for expression of secondary metabolites from
Actinomycetes", Zahn, J. A.; Higgs, R. E.; Hilton, M. D.; Appl. Environ.
Microbiol. 67:377-386.
"A general method for the dereplication of flavonoid glycosides utilizing high
performance liquid chromatography mass spectrometric analysis", Constant, H.
L.; Slowing, K.; Graham, J. G.; Pezzuto, J. M.; Cordell, G. A.; Beecher, C. W.
W.; Phytochemical Analysis, 1997, 8:176-180.

Method Information
Gradient column analysis of crude extracts by positive ion mode.
Agilent 1100 Quaternary Pump 1
Control
  Column Flow: 1.000 ml/min
  Stoptime: 4.00 min
  Posttime: Off
Solvents
  Solvent A: 98.0 % (Water)
  Solvent B: 0.0 % (MeOH)
  Solvent C: 2.0 % (ACN)
  Solvent D: 0.0 % (iPrOH)
PressureLimits
  Minimum Pressure: 0 bar
  Maximum Pressure: 400 bar
Auxiliary
  Maximal Flow Ramp: 100.00 ml/min^2
  Primary Channel: Auto
  Compressibility: 100*10^-6/bar
  Minimal Stroke: Auto
Store Parameters
  Store Ratio A: Yes
  Store Ratio B: Yes
  Store Ratio C: Yes
  Store Ratio D: Yes
  Store Flow: Yes
  Store Pressure: Yes
Agilent 1100 Contacts Option
  Contact 1: Open
  Contact 2: Open

  Contact 3: Open
  Contact 4: Open
Timetable
  Time   Solv.B  Solv.C  Solv.D  Flow   Pressure
  0.00   0.0     2.0     0.0     1.000
  0.01   0.0     2.0     0.0
  0.30   0.0     95.0    0.0
  1.50   0.0     95.0    0.0
  1.60   0.0     2.0     0.0
  4.00   0.0     2.0     0.0
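The solvent composition at any point in the run can be read off the timetable
above. The Python sketch below is an illustration only; it assumes that the
pump ramps linearly between consecutive timetable rows and that solvent A makes
up the balance to 100 %. The names `TIMETABLE`, `percent_c`, and `percent_a`
are introduced for this example.

```python
# (time in min, % solvent C) rows taken from the pump timetable; B and D stay at 0 %.
TIMETABLE = [(0.00, 2.0), (0.01, 2.0), (0.30, 95.0),
             (1.50, 95.0), (1.60, 2.0), (4.00, 2.0)]


def percent_c(t):
    """Percent solvent C at time t (min), assuming linear ramps between rows."""
    if t <= TIMETABLE[0][0]:
        return TIMETABLE[0][1]
    for (t0, c0), (t1, c1) in zip(TIMETABLE, TIMETABLE[1:]):
        if t0 <= t <= t1:
            return c0 + (c1 - c0) * (t - t0) / (t1 - t0)
    return TIMETABLE[-1][1]          # after the last row the composition is held


def percent_a(t):
    """Percent solvent A (water), taken as the balance to 100 %."""
    return 100.0 - percent_c(t)


if __name__ == "__main__":
    for t in (0.0, 0.155, 1.0, 1.55, 3.0):
        print(f"t={t:5.3f} min  C={percent_c(t):5.1f} %  A={percent_a(t):5.1f} %")
```

Read this way, the method is a fast step gradient: the organic fraction rises
to 95 % within the first 0.3 min, is held until 1.5 min, and the remainder of
the 4 min run re-equilibrates the column at 2 %.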
Agilent 1100 Contacts Option Timetable
  Timetable is empty
Agilent 1100 Diode Array Detector 1
Signals
  Signal  Store  Signal,Bw  Reference,Bw [nm]
  A:      Yes    215  4     450  200
  B:      No     254  4     450  100
  C:      No     280  4     450  100
  D:      No     250  16    Off
  E:      No     280  16    Off
Spectrum
  Store Spectra: Apex + Baselines
  Range from: 190 nm
  Range to: 600 nm
  Range step: 2.00 nm
  Threshold: 1.00 mAU
Time
  Stoptime: As pump

  Posttime: Off
Required Lamps
  UV lamp required: Yes
  Vis lamp required: Yes
Autobalance
  Prerun balancing: Yes
  Postrun balancing: No
Margin for negative Absorbance: 100 mAU
Peakwidth: > 0.1 min
Slit: 4 nm
Analog Outputs
  Zero offset ana. out. 1: 5 %
  Zero offset ana. out. 2: 5 %
  Attenuation ana. out. 1: 1000 mAU
  Attenuation ana. out. 2: 1000 mAU
Mass Spectrometer Detector
General Information
  Use MSD: Enabled
  Ionization Mode: APCI
  Tune File: atunes.tun
  StopTime: asPump
  Time Filter: Enabled
  Data Storage: Condensed
  Peakwidth: 0.15 min
  Scan Speed Override: Disabled
Signals
[Signal 1]
  Polarity: Positive

  Fragmentor Ramp: Disabled
Scan Parameters
  Time (min)  Mass Range Low  High     Fragmentor  Gain EMV  Threshold  Stepsize
  0.00        110.00          1500.00  70          1.0       500        0.15
[Signal 2]
  Polarity: Positive
  Fragmentor Ramp: Disabled
Scan Parameters
  Time (min)  Mass Range Low  High     Fragmentor  Gain EMV  Threshold  Stepsize
  0.00        110.00          1500.00  110         1.0       500        0.15
[Signal 3]
  Not Active
[Signal 4]
  Not Active
Spray Chamber
[MSZones]
  Gas Temp: 350 C maximum 350 C
  Vaporizer: 375 C maximum 500 C
  Drying Gas: 3.0 l/min maximum 13.0 l/min
  Neb Pres: 60 psig maximum 60 psig

  VCap (Positive): 3000 V
  VCap (Negative): 3000 V
  Corona (Positive): 4.0 uA
  Corona (Negative): 15 uA
FIA Series
  FIA Series in this Method: Disabled
Time Setting
  Time between Injections: 1.00 min
Agilent 1100 Column Thermostat 1
Temperature settings
  Left temperature: 35.0°C
  Right temperature: Same as left
  Enable analysis: When Temp. is within setpoint +/- 0.8°C
  Store left temperature: Yes
  Store right temperature: No
Time
  Stoptime: As pump
  Posttime: Off
Column Switching Valve: Column 2
  Timetable is empty
During the process, create a background file by looking for a certain percentage
signal occurrence per mass unit. Use the Summary.m program to create this
background spectrum for use later in step 5 below.
1. Optional - Pool samples: Use attached pooling strategy.
2. Measure data: Use LC-MS to acquire data.
3. Extract data: Extract mass spectra into .csv file format.
4. Identify consistent signals in sample; deconvolute pools if sample pooling in
   step 1 was used: Compare same sample runs to each other using the Summary.m
   program; bin frequently/universally occurring signals.
5. Determine unique peaks in sample vs. background:
   1. Convert percent occurrence per mass into a new sample spectra file.
   2. Use Massieve to determine unique peaks in all voltages and
      chromatographic fractions compared to background.
   3. Create 'Unique Peaks' file for each voltage, chromatographic peak
      comparison.
6. Eliminate extra peaks by taking advantage of multiple MS detection channels
   and chromatographic conditions: Feed 'Unique Peak' file for each sample back
   into the Summary.m program; keep peaks that show up in more than one mass
   spectrometer channel or chromatographic peak.
7. Short list of novel compound signals.
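Step 6 of the procedure above, in which only peaks seen in more than one
detection channel are retained, can be sketched as follows. This Python
fragment is an illustration only: the function name `confirmed_peaks`, the
channel labels, and the mass values are hypothetical, and in the procedure
itself this filtering is carried out with the Summary.m program.

```python
def confirmed_peaks(unique_by_channel, min_channels=2):
    """Return masses flagged as unique in at least min_channels channels.

    unique_by_channel maps a channel label (for example a fragmentor voltage
    or a chromatographic fraction) to the list of unique-peak masses found in
    that channel.
    """
    counts = {}
    for peaks in unique_by_channel.values():
        for mass in set(peaks):          # count each mass once per channel
            counts[mass] = counts.get(mass, 0) + 1
    return sorted(m for m, n in counts.items() if n >= min_channels)


if __name__ == "__main__":
    # Hypothetical unique-peak lists for the two MS signals (fragmentor 70 and 110).
    unique = {
        "frag70": [412, 655, 203],
        "frag110": [412, 203, 871],
    }
    print(confirmed_peaks(unique))
```

Requiring agreement between channels trades some sensitivity for fewer false
positives, which is consistent with the secondary screen mentioned earlier.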
clear
dir
CompressCount=1;
TestFileData=[12 34 45 56 67]
% User-supplied directory containing other directories with files
MasterDir='C:\HPCHEM\1\DATA\MS20FEBA\IND4TST';
cd(MasterDir);
MasterDirFiles = dir % Load all files in master directory to one variable.
TotalFiles = size(MasterDirFiles)
Original_Files='Original Files';
X=990099
% Loop to create compressed directory listing containing only directories.

for ExtractDir=1:TotalFiles(1,1)
  % Look through master directory to find directories
  if MasterDirFiles(ExtractDir).isdir==1 % Test each dir item to see if it is a directory
    Is_Original_Files=strcmp(MasterDirFiles(ExtractDir).name, Original_Files);
    if not(Is_Original_Files)
      % Assign new directories.
      CompressedDirList(CompressCount).name = MasterDirFiles(ExtractDir).name;
      CompressCount=CompressCount+1; % Increment count of compressed directories
    end
  end
end
CompressCount
TotalDirectories=size(CompressedDirList);
CompressCount=1;
% Main loop for moving in and out of directories.
for CompressCount= 3:TotalDirectories(1,2)
  CurrentDirectory = CompressedDirList(CompressCount).name;
  cd(CurrentDirectory);
  FileNameStub=char(pwd)
  % Loop to replace backslash in directory names with dash so directory
  % names can be labels
  i=0;
  FileNameLength= size(FileNameStub)
  for i=1:FileNameLength(1,2)
    if FileNameStub(1,i)=='\'
      FileNameStub(1,i)='-'
    end
  end
  ListOfCsvFiles=dir('*.csv')
  PrintHistograms=0; % 1 means print histogram, 0 means no print.
  % Whether they are printed or not the files will be saved.
  spectra=[]; % Clear spectra
  mass=109.8 % Initial starting mass.
  CutoffPercent=40; % Cutoff percent to check if peak is consistently present

  spectra=dlmread(ListOfCsvFiles(1).name); % Loads first item in dir call into spectra
  sizespectra=size(spectra); % Determines size of first spectra loaded.
  master=[];d=1;SignalOne=[]; SignalTwo=[];
  endspectra=0;
  format compact % Output form for any variables displayed during run.
  BiggestSpectra=0; % Initialize the biggest spectra in batch
  BiggestObsMass=0; % Initialize the biggest observed mass in any spectra
  FileNameRoot=('-Names.csv');
  % Routine to sort filenames into alphabetical order - should correspond to
  % chronological order for individual mass spectra.
  % (The row assignment below requires all file names to have the same length.)
  SizeDirList = size(ListOfCsvFiles);
  for FileNameOrder = 1 : SizeDirList(1,1)
    DataFileName(FileNameOrder,:) = ListOfCsvFiles(FileNameOrder).name
  end
  SortedDataFileName = sortrows(DataFileName)
  % Routine to prepare NameFile.Csv file for writing
  FileNames=strcat(FileNameStub,FileNameRoot); % Create full filename as a variable.
  % Open file to record filenames used to create master matrix
  NameFile=fopen(FileNames,'a+')
  NameOut=char('Mass');
  fprintf(NameFile,'%s\n',NameOut); % Prints header line of name file
  % Loop to determine largest measured mass and to write filenames in output
  % files to allow matching filenames and columns from directory lists
  % imported into Summary1
  for testlength=1:SizeDirList(1,1)
    spectra=dlmread(SortedDataFileName(testlength,:));
    sizespectra=size(spectra);
    if sizespectra(1,1)>BiggestSpectra
      BiggestSpectra=sizespectra(1,1);
    end
    if spectra(sizespectra(1,1),1)>BiggestObsMass

      BiggestObsMass=spectra(sizespectra(1,1),1);
    end
    OddCol=((testlength*2)+1);
    EvenCol=testlength*2;
    Name(OddCol)=cellstr('X');
    Name(EvenCol)=cellstr(SortedDataFileName(testlength,:));
    NameOut=char(Name(EvenCol))
    Spacer=char(Name(OddCol))
    fprintf(NameFile,'%s\n',NameOut); % Writes even rows: filename, then linebreak.
    fprintf(NameFile,'%s\n',Spacer); % Writes odd rows: spacer, then linebreak.
  end
  fclose(NameFile); % Close the file with the file names.
  Name(1)=cellstr('Mass');
  for i=1:(BiggestObsMass - 100) % Loop to fill master matrix from 100 to high mass value
    master(i,1)=mass; % Fills in the first column of master with mass units
    mass=mass+1;
  end
  for d=1:SizeDirList(1,1) % Loop to bin spectral intensities into master matrix
    spectra=dlmread(SortedDataFileName(d,:)); % Reads current file into variable spectra
    mass=109.8; % Reinitialize starting point
    sizemaster=size(master);
    mcol=d*2;
    sizespectra=size(spectra);
    % Print current index and current filename being operated on
    d
    FileNameStub
    SortedDataFileName(d,:)
    PreviousMass=0;
    PreviousIntensity=0;

    % Set column intensities to zero so a comparison can be made.
    MaxColmIntensity(1,mcol)=0;
    MaxColmIntensity(1,mcol+1)=0;
    % Loop that goes through every row of master, adding columns as spectral
    % data is read
    for i=1:sizemaster(1,1)
      j=1;
      endspectra=0;
      % Loop that checks if there is a data point at a mass
      while spectra(j,1) < (mass+1) & endspectra==0
        intensity=spectra(j,2); % Mass signal intensity is in column 2 of Masstab files
        smass=spectra(j,1); % m/z value for each mass is in column 1 of Masstab files.
        % InBin = logical variable to determine if the current mass is in a bin
        InBin=((smass>=mass) & (smass < (mass+1)) & (intensity >0));
        % InSameBin = logical variable to determine if there is a second signal
        % at the same mass as the previous one
        InSameBin=(PreviousMass>=mass & PreviousMass < (mass+1)) & (PreviousIntensity>0);
        if InBin & ~InSameBin % See the mass for the first time - generates SignalOne
          master(i,mcol)=spectra(j,2);
          % Determine largest value per column and store it in
          % MaxColmIntensity for later use.
          if intensity > MaxColmIntensity(1,mcol)
            MaxColmIntensity(1,mcol)=intensity;
          end
        end
        if InSameBin & InBin % See the mass for the second time.
          % Assign intensity to master matrix in second signal column
          master(i,(mcol+1))=spectra(j,2);
          % Determine largest value per second signal column and store it in
          % MaxColmIntensity for later use.
          if intensity > MaxColmIntensity(1,mcol+1)
            MaxColmIntensity(1,mcol+1)=intensity;
          end
        end

        % This may not be working as I had hoped - should be comparing mass units.
        j=j+1;
        % Do not look for more masses once the end of the spectra has been reached
        if j>sizespectra(1,1)
          endspectra=1;
          j=j-2;
          if j==0 % Prevents j from being set to zero and putting spectra out of range
            j=1;
          end
        end
        PreviousMass=smass;
        PreviousIntensity=intensity;
      end
      mass=mass+1;
    end
  end
  mass
  OutputRoot=char('-output.csv');
  Output_File=strcat(FileNameStub,OutputRoot);
  dlmwrite(Output_File,master); % Write master matrix to file.
  sizemaster=size(master);
  SignalOne(1,1)=0;
  SignalTwo(1,1)=0;
  Even='Even';
  Odd='Odd';
  SignalOneNormalizedExists=0;
  SignalTwoNormalizedExists=0;
  % Loop to sort out the two signals into the SignalOne and SignalTwo matrices.
  % Will also create the relative intensity matrices SignalOnePercent and
  % SignalTwoPercent so that the signals can be analyzed on a relative
  % intensity basis.
  for d=1:sizemaster(1,2) % Go through full width of the master matrix.

    d;
    for i=1:(BiggestObsMass - 100) % Go through all the masses.
      i;
      Halfd=d/2;
      master(i,d);
      % Put in the mass labels down the first column of the separated signal files.
      SignalOne(i,1)=master(i,1);
      SignalTwo(i,1)=master(i,1);
      SignalOnePercent(i,1)=master(i,1);
      SignalTwoPercent(i,1)=master(i,1);
      if Halfd==round(Halfd) % Put the even columns in SignalOne
        Comprsd_even_d=(d/2)+1;
        SignalOne(i,Comprsd_even_d)=master(i,d);
        if MaxColmIntensity(1,d)~=0 % Determine relative intensities of first signal.
          SignalOnePercent(i,Comprsd_even_d)=master(i,d)/MaxColmIntensity(1,d)*100;
          SignalOneNormalizedExists=1; % Flag to prevent SignalOnePercent save if empty
        end
        %Even
      end
      if Halfd~=round(Halfd) % Put the odd columns in SignalTwo
        Comprsd_odd_d=round(Halfd);
        % size_signal_2=size(SignalTwo);
        % Prevents out of range in master because of missing signal 2 column
        if d <= sizemaster(1,2)
          SignalTwo(i,Comprsd_odd_d)=master(i,d);
          if MaxColmIntensity(1,d)~=0 % Determine relative intensities of second signal.
            SignalTwoPercent(i,Comprsd_odd_d)=master(i,d)/MaxColmIntensity(1,d)*100;
            SignalTwoNormalizedExists=1; % Flag to prevent SignalTwoPercent save if empty
          end
          %Odd
        end
      end
    end % i =
  end % d=
  Signal1Root=char('-SignalOne-output.csv');
  Signal_1_File=strcat(FileNameStub,Signal1Root);

  dlmwrite(Signal_1_File,SignalOne); % Write first signal data file.
  Signal2Root=char('-SignalTwo-output.csv');
  Signal_2_File=strcat(FileNameStub,Signal2Root);
  dlmwrite(Signal_2_File,SignalTwo); % Write second signal data file.
  if SignalOneNormalizedExists
    Normal1Root=char('-Normal-SignalOne-output.csv');
    Normal_1_File=strcat(FileNameStub,Normal1Root);
    % Write first signal relative (normalized) data file.
    dlmwrite(Normal_1_File,SignalOnePercent);
  end
  if SignalTwoNormalizedExists
    Normal2Root=char('-Normal-SignalTwo-output.csv')
    Normal_2_File=strcat(FileNameStub,Normal2Root);
    % Write second signal relative (normalized) data file.
    dlmwrite(Normal_2_File,SignalTwoPercent);
  end
  % Procedure to create percentage occurrence summaries and to send out
  % histograms of backgrounds.
  size_signal_1=size(SignalOne);
  size_signal_2=size(SignalTwo);
  ZeroPercent=0;
  TwoFivePercent=2.5;
  FivePercent=5;
  for row=1:size_signal_1(1,1) % Main loop to create counts at certain frequencies.
    row
    FileNameStub
    GreaterThanZero=0; % Initialize each counter per row.
    GreaterThanTwoFive=0;
    GreaterThanFive=0;
    for colm=2:size_signal_1(1,2)
      %colm
      % Count number of times a signal intensity occurs per mass unit.
      if SignalOnePercent(row,colm) > ZeroPercent

        GreaterThanZero=GreaterThanZero+1;
      end
      if SignalOnePercent(row,colm) > TwoFivePercent
        GreaterThanTwoFive=GreaterThanTwoFive+1;
      end
      if SignalOnePercent(row,colm) > FivePercent
        GreaterThanFive=GreaterThanFive+1;
      end
    end % end column for loop
    % Determine percent of times there is a signal per mass
    % First column of Summary = mass index,
    % Columns 2-4 of Summary = percent occurrence of intensity.
    % Columns 5-7 of Summary = flag set when percent occurrence of signals per
    % run reaches CutoffPercent.
    if SignalOneNormalizedExists
      Summary1(row,1)=master(row,1);
      Summary1(row,2)=GreaterThanZero/(size_signal_1(1,2)-1)*100;
      Summary1(row,3)=GreaterThanTwoFive/(size_signal_1(1,2)-1)*100;
      Summary1(row,4)=GreaterThanFive/(size_signal_1(1,2)-1)*100;
      TwoColSummary(row,1)=master(row,1);
      if Summary1(row,2)>=CutoffPercent
        Summary1(row,5)=1;
        TwoColSummary(row,2)=1;
      else
        Summary1(row,5)=0;
        TwoColSummary(row,2)=0.01;
      end
      if Summary1(row,3)>=CutoffPercent
        Summary1(row,6)=1;
      else
        Summary1(row,6)=0;
      end
      if Summary1(row,4)>=CutoffPercent

Summary1(row,7)=1;
else
Summary1(row,7)=0;
end
end % of if statement
end % end row for loop.
% Routine to write 6 col and 2 col summary file of peak occurrence.
if SignalOneNormalizedExists
SummaryRoot=char('-SignalOne-Summary.csv');
SummaryFile=strcat(FileNameStub,SummaryRoot);
dlmwrite(SummaryFile,Summary1);
TwoColSummaryRoot=char('-SignalOne-TwoColsummary.csv');
TwoColSummaryFile=strcat(FileNameStub,TwoColSummaryRoot);
% Use fprintf file save method to enter zeros into csv files.
TwoColSummaryFileOpen = fopen(TwoColSummaryFile, 'a+')
TwoColLength = size(TwoColSummary); i=0;
for i=1:TwoColLength(1,1)
fprintf(TwoColSummaryFileOpen,'%f %c %f\r',TwoColSummary(i,1),',',TwoColSummary(i,2));
end
%fprintf(TwoColSummaryFileOpen,'\n')
fclose(TwoColSummaryFileOpen);
dlmwrite(TwoColSummaryFile,TwoColSummary);
end
%Create histograms showing binning of percentage occurrence, in 5 percent divisions.
if SignalOneNormalizedExists
figure(1);hist(Summary1(:,2),20);
OverZero='Occurrence over 0% -- ';
FigureTitle=char('- 0% histogram');
TitleWord(1,:)=cellstr(OverZero);
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Percent Occurrence');
ylabel('Counts');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);

print('-djpeg','-r200',FileName)
figure(2);hist(Summary1(:,3),20);
OverTwoFive='Occurrence over 2.5% intensity -- ';
FigureTitle=char('- 2.5% histogram');
TitleWord(1,:)=cellstr(OverTwoFive)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Percent Occurrence');
ylabel('Counts');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
figure(3);hist(Summary1(:,4),20);
OverFive='Occurrence over 5% intensity -- ';
FigureTitle=char('- 5% histogram');
TitleWord(1,:)=cellstr(OverFive)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Percent Occurrence');
ylabel('Counts');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
% Create bar graphs showing positions observed more than 50% of the time vs mass.
figure(4);bar(Summary1(:,1),Summary1(:,5));
OverZero2='Greater than 50% occurrence of signal over 0% -- ';
FigureTitle=char('- 50% - 0% intensity');
TitleWord(1,:)=cellstr(OverZero2)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');
title(TitleWord);
if PrintHistograms==1
print

end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
figure(5);bar(Summary1(:,1),Summary1(:,6));
OverTwoFive2='Greater than 50% occurrence of signal over 2.5% -- ';
FigureTitle=char('- 50% - 2.5% intensity');
TitleWord(1,:)=cellstr(OverTwoFive2)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
figure(6);bar(Summary1(:,1),Summary1(:,7));
OverFive2='Greater than 50% occurrence of signal over 5% -- ';
FigureTitle=char('- 50% - 5% intensity');
TitleWord(1,:)=cellstr(OverFive2)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
% Create percent occurrence vs mass bar graph across all masses.
figure(7);bar(Summary1(:,1),Summary1(:,2));
OverZero3='Percentage occurrence of signal over 0% -- ';
FigureTitle=char('- occur per mass at 0 percent');
TitleWord(1,:)=cellstr(OverZero3)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');

title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
figure(8);bar(Summary1(:,1),Summary1(:,3));
OverTwoFive3='Percentage occurrence of signal over 2.5% -- ';
FigureTitle=char('- occur per mass at 2.5 percent');
TitleWord(1,:)=cellstr(OverTwoFive3)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
figure(9);bar(Summary1(:,1),Summary1(:,4));
OverFive3='Percentage occurrence of signal over 5% -- ';
FigureTitle=char('- occur per mass at 5 percent');
TitleWord(1,:)=cellstr(OverFive3)
TitleWord(2,:)=cellstr(FileNameStub);
xlabel('Mass');
ylabel('Percent Occurrence');
title(TitleWord);
if PrintHistograms==1
print
end
FileName=strcat(FileNameStub,FigureTitle);
print('-djpeg','-r200',FileName)
end % of if SignalOneNormalizedExists statement.
% Return to matlab directory
%cd C:\matlabr11\work
%to ds
%pwd

dlmwrite('FILE.txt',TestFileData)
cd ..;
X % prints after while
end % Main loop for moving in and out of directories.
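The counting and cutoff logic of the listing above can be illustrated compactly. The following is a minimal Python sketch and is not part of the original MATLAB program. The function name occurrence_summary, its argument layout (masses passed separately from the intensity rows) and the default cutoff of 50 percent are assumptions made for illustration; the listing takes CutoffPercent from elsewhere in the program.

```python
def occurrence_summary(masses, percent_rows,
                       thresholds=(0.0, 2.5, 5.0), cutoff_percent=50.0):
    """Build one summary row per mass.

    Each output row is [mass, pct_over_t0, pct_over_t1, pct_over_t2,
    flag_t0, flag_t1, flag_t2], mirroring columns 1-7 of Summary1.
    percent_rows[i] holds the normalized intensities of mass i across runs.
    """
    summary = []
    for mass, row in zip(masses, percent_rows):
        if not row:
            raise ValueError("each mass needs at least one run")
        n_runs = len(row)
        # Percent of runs whose intensity exceeds each threshold.
        percents = [100.0 * sum(1 for v in row if v > t) / n_runs
                    for t in thresholds]
        # 1 if the signal occurs in at least cutoff_percent of runs, else 0.
        flags = [1 if p >= cutoff_percent else 0 for p in percents]
        summary.append([mass] + percents + flags)
    return summary


# Hypothetical data: two masses measured in four runs each.
example = occurrence_summary([100, 101], [[0, 3, 6, 0], [1, 1, 1, 1]])
for line in example:
    print(line)
```

Keeping the thresholds in a tuple replaces the three separate counters of the listing, so a further intensity level could be added without new variables.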
% Aline1.m
% The program determines the average background value looking at the entire peak shape of the spectra.
% Will need another program to take the measured spectra of true samples and compare them to the average
% values of the average spectra determined here and then see if they fall within a certain percentage of the
% RMSD values to see if they are correct.
clear
dir
CompressCount=1;
TestFileData=[12 34 45 56 67] %Test data for file written as test of program - remove later
MasterDir='C:\MATLABR11\work\TestData'; % User inputted directory containing other directories with files
cd(MasterDir);
MasterDirFiles = dir % Load all files in master directory to one variable.
TotalFiles = size(MasterDirFiles)
Original_Files='Original Files';
X=99099 % Value used to show completion of loop.
% Loop to create compressed directory listing containing only directories.
for ExtractDir=1:TotalFiles(1,1) % Look through to find directories in master directory
if MasterDirFiles(ExtractDir).isdir==1 % Test each dir item to see if it is a directory
Is_Original_Files=strcmp(MasterDirFiles(ExtractDir).name, Original_Files);
if not(Is_Original_Files)
CompressedDirList(CompressCount).name = MasterDirFiles(ExtractDir).name; % assign new directories.
CompressCount=CompressCount+1; % Increment count compressed directories
end
end
end
TotalDirectories=size(CompressedDirList);

CompressCount=1;
for CompressCount= 3:TotalDirectories(1,2) % Main loop for moving in and out of directories.
CurrentDirectory = CompressedDirList(CompressCount).name;
cd(CurrentDirectory);
FileNameStub=char(pwd)
% Loop to replace backslash in directory names to dash so directory names can be labels
i=0;
FileNameLength= size(FileNameStub)
for i=1:FileNameLength(1,2)
if FileNameStub(1,i)=='\'
FileNameStub(1,i)='-'
end
end
ListOfCsvFiles=dir('*.csv')
Spectra=[]; % Clear Spectra
mass=109.8 % Initial starting mass.
Spectra=dlmread(ListOfCsvFiles(1).name); % Loads first item in dir call into Spectra
sizespectra=size(Spectra); % Determines size of first Spectra loaded.
master=[];d=1;SignalOne=[]; SignalTwo=[]; % Clear master, SignalOne, SignalTwo
endspectra=0;
format compact % Output form for any variables displayed during run.
BiggestSpectra=0; % Initialize the biggest spectra in batch
BiggestObsMass=0; % Initialize the biggest observed mass in any spectra
FileNameRoot=('-Names.csv');
% Routine to sort filenames into alphabetical order - should correspond to chronological order for
% individual mass spectra.
SizeDirList = size(ListOfCsvFiles);

for FileNameOrder = 1 : SizeDirList(1,1)
DataFileName(FileNameOrder,:) = ListOfCsvFiles(FileNameOrder).name
end
SortedDataFileName = sortrows(DataFileName)
% Routine to prepare NameFile.csv file for writing
FileNames=strcat(FileNameStub,FileNameRoot); % Create full filename as a variable.
NameFile=fopen(FileNames,'a+') % Open file to record filenames used to create master matrix
NameOut=char('Mass');
fprintf(NameFile,NameOut); fprintf(NameFile,'\n'); % Prints headerline of name file
% Loop to determine largest measured mass and to write filenames in output files
% to allow matching filenames and columns from directory lists imported into Aline
for testlength=1:SizeDirList(1,1)
Spectra=dlmread(SortedDataFileName(testlength,:));
sizespectra=size(Spectra);
if sizespectra(1,1)>BiggestSpectra
BiggestSpectra=sizespectra(1,1);
end
if Spectra(sizespectra(1,1),1)>BiggestObsMass
BiggestObsMass=Spectra(sizespectra(1,1),1);
end
OddCol=((testlength*2)+1);
EvenCol=testlength*2;
Name(OddCol)=cellstr('X');
Name(EvenCol)=cellstr(SortedDataFileName(testlength,:));
NameOut=char(Name(EvenCol))
Spacer=char(Name(OddCol))
fprintf(NameFile,NameOut); fprintf(NameFile,'\n'); % Writes even rows filenames, with linebreak between.
fprintf(NameFile,Spacer); fprintf(NameFile,'\n'); % Writes odd row with the spacer, with a linebreak between.
end
fclose(NameFile); % Close the file with the file names.
Name(1)=cellstr('Mass');
% Loop to fill first column of matrices from 100 to high mass value with the mass labels.
for i=1:(BiggestObsMass - 100)
MaxPositionMaster(i,1)=mass;

AverageMaxPos(i,1)=mass;
TruncAverageMaxPos(i,1)=mass;
MaxPosDifference(i,1)=mass;
MasterMeanShiftedSpectra(i,1) = mass;
MasterStDevShiftedSpectra(i,1)=mass;
mass=mass+1;
end
%%%%%%%%%%%%%%%%%%%%%% MAIN LOOP TO ORGANIZE ROWS OF MASSES FROM DIFFERENT FILES %%%%%%%%%%%%%%%%%%
% Main loop to:
% 1) Read data row by row into master matrix
% 2) Determine first maxima of each peak
% 3) Determine average max position for each mass
% 4) Determine amount to shift each spectra
% 5) Shift each spectra the appropriate amount to align the maxima
% 6) Determine the mean spectra by averaging intensity at each point.
% 7) Determine the standard deviation between the measured spectra and the average.
% 8) Record the row by row averages and RMSD's into a master matrix for saving to files at the end.
for MassPosition = 1:(BiggestObsMass-100)
%Loop to open each file and read values into MasterMassRowMatrix
%Item 1 above
for FileNumber = 1:SizeDirList(1,1)
Spectra=[]; % Clear spectra for new values from next file.
Spectra = dlmread(SortedDataFileName(FileNumber,:)); % Read spectra sequentially for MasterMassPerRow
% Need a line here to test that we are not past the end of the file - test at start with constant width files.
SizeCurrentSpectra = size(Spectra);
if MassPosition <= SizeCurrentSpectra(1,1)
MasterMassPerRow(FileNumber,:) = Spectra(MassPosition,2:SizeCurrentSpectra(1,2)); % transfer row to master matrix
else
MasterMassPerRow(FileNumber,:) = 0;
end % FileNumber else
end
%%%%%%%%%%%%%%%%%
%%%%% May have to insert a routine to generate a zerofilled rectangular matrix for later manipulations.

SizeMasterMassPerRow = size(MasterMassPerRow);
% Find position of first maxima in the current files.
% Item 2 of above
for CurrentFile = 1:SizeMasterMassPerRow(1,1) % go through rows one by one.
NoPeak = 1; % Set marker for no maxima
PosMarker = 2 % Start Current colm position after the mass labels.
% Item 1 from top of loop
while NoPeak % loop continues until the first max is found in each row
YesPeak = 0 % Set YesPeak to negative at start of scan.
CurrentPosValue = MasterMassPerRow(CurrentFile,PosMarker); % set the current position as the center value
if PosMarker > 2
PreviousPosValue = MasterMassPerRow(CurrentFile,PosMarker-1); % Get previous position value during scan.
else
PreviousPosValue = 0; % if at beginning of row let every signal start with a zero value
end % end if PosMarker >2
if PosMarker == SizeMasterMassPerRow(1,2)
NextPosValue = MasterMassPerRow(CurrentFile,PosMarker); % if at end of row set next value to current value
NoPeak=0; % Jump out if at the end of the row.
else
NextPosValue = MasterMassPerRow(CurrentFile,PosMarker+1);
end % End of if PosMarker at end
% Determine if these three points describe a peak.
% YesPeak = logical variable to see if CurrentPos is top of peak.
YesPeak = (PreviousPosValue < CurrentPosValue) & (CurrentPosValue > NextPosValue);
if YesPeak
% Record position of maximum in Master MaxPos Matrix
% Rows are masses; columns are FileNumber positions
% Offset CurrentFile by 1 b/c first col'm is the mass label.

MaxPositionMaster(MassPosition,CurrentFile+1) = PosMarker;
NoPeak = 0;
% Set NoPeak so while loop can end and can check next row.
end % of if YesPeak
PosMarker = PosMarker+1; % Increment PosMarker to next position.
if PosMarker > SizeMasterMassPerRow(1,2)
NoPeak = 0;
end % if PosMarker
end % While NoPeak.
end % CurrentFile for loop
% Item 3 - Determine the average position of maxima for each mass
SumMaxPos=0;
for AveIndex = 2:(SizeMasterMassPerRow(1,1)+1)
SumMaxPos = SumMaxPos+MaxPositionMaster(MassPosition,AveIndex);
end % for AveIndex
TruncAverageMaxPos(MassPosition,2)= fix(SumMaxPos/SizeMasterMassPerRow(1,1));
% Item 4 from top of the MassPosition loop
% If a peak is forward (smaller pos #) of the average maxima then the shift is positive,
% if the peak is behind the average maxima then the shift is negative.
for AveIndex = 2:(SizeMasterMassPerRow(1,1)+1)
MaxPosDifference(MassPosition,AveIndex)=MaxPositionMaster(MassPosition,AveIndex)-TruncAverageMaxPos(MassPosition,2);
end % for AveIndex 2nd time.
% Determine the largest positive and negative shift that needs to be made
% Continuation of item 4.
SizeMaxPositionMaster=size(MaxPositionMaster);
LargestPositiveShift=0;
LargestNegativeShift=0;
for i= 2:SizeMaxPositionMaster(1,2)
if MaxPosDifference(MassPosition,i) > LargestPositiveShift
LargestPositiveShift = MaxPosDifference(MassPosition,i)
end
if MaxPosDifference(MassPosition,i) < LargestNegativeShift
LargestNegativeShift = MaxPosDifference(MassPosition,i)
end
end % for i loop.

% Item 5 - Shift the spectra depending on the position of their maxima.
% Fill the ShiftedSpectra matrix with the appropriately shifted spectra from MasterMassPerRow.
ShiftedMatrixWidth = LargestPositiveShift+abs(LargestNegativeShift)+SizeMasterMassPerRow(1,2);
ShiftedSpectra = zeros(SizeMasterMassPerRow(1,1),ShiftedMatrixWidth); % zero fill new shifted spectra matrix
SizeMaxPosDifference= size(MaxPosDifference);
for Shift = 2:SizeMaxPosDifference(1,2);
StartIndex = 1+LargestPositiveShift-MaxPosDifference(MassPosition,Shift);
FinalPosition = StartIndex+SizeMasterMassPerRow(1,2)-1;
FileNumber=Shift-1;
MasterMassIndex = 1;
for Index = StartIndex:FinalPosition
ShiftedSpectra(FileNumber,Index)=MasterMassPerRow(FileNumber,MasterMassIndex);
MasterMassIndex=MasterMassIndex+1;
end % Index loop
end % Shift loop
% Item 6 - Create average intensity spectra for each row.
SizeShiftedSpectra=size(ShiftedSpectra);
MeanShiftedSpectra=mean(ShiftedSpectra);
% Item 7 - Determine Standard Deviation for each column of aligned spectra
StDevShiftedSpectra=std(ShiftedSpectra);
% Item 8 - Record the average shifted spectra per mass and the standard dev per position.
MasterDim = size(ShiftedSpectra);
MasterColWidth = MasterDim(1,2)+1;
MasterMeanShiftedSpectra(MassPosition,2:MasterColWidth)=MeanShiftedSpectra(1,:);
MasterStDevShiftedSpectra(MassPosition,2:MasterColWidth) = StDevShiftedSpectra(:,:);
dlmwrite('MasterMeanShiftedSpectra.csv',MasterMeanShiftedSpectra);
dlmwrite('MasterStDevShiftedSpectra.csv',MasterStDevShiftedSpectra);
end % MassPosition loop
dlmwrite('FILE.txt',TestFileData)
cd ..
X
end % Compress Count
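Items 2 through 7 of the main loop above (find the first maximum of each row, compute the truncated average position, shift each row so the maxima line up, then take the column-wise mean and standard deviation) can be sketched in a few lines. This is an illustrative Python sketch and is not the original MATLAB program. The function names are hypothetical, indices are 0-based, rows are assumed to have equal length, and the scan starts at the first element of each row rather than at PosMarker = 2.

```python
from statistics import mean, stdev


def first_local_max(row):
    """Index of the first point strictly above both neighbours, or None.

    The value before the row start is taken as 0, and the last point can
    never be a peak, as in the listing (next value set to current value).
    """
    for i in range(len(row) - 1):
        prev = row[i - 1] if i > 0 else 0
        if prev < row[i] > row[i + 1]:
            return i
    return None


def align_rows(rows):
    """Zero-pad and shift equal-length rows so their first maxima coincide."""
    peaks = [first_local_max(r) for r in rows]
    if any(p is None for p in peaks):
        raise ValueError("every row needs at least one local maximum")
    avg = int(sum(peaks) / len(peaks))        # truncated average position
    diffs = [p - avg for p in peaks]          # peak position minus average
    max_pos = max(0, max(diffs))              # largest positive shift
    max_neg = min(0, min(diffs))              # largest negative shift
    width = max_pos - max_neg + len(rows[0])  # width of the padded matrix
    aligned = []
    for row, d in zip(rows, diffs):
        start = max_pos - d                   # 0-based start column
        padded = [0.0] * width
        padded[start:start + len(row)] = row
        aligned.append(padded)
    return aligned


def column_stats(aligned):
    """Column-wise mean and sample standard deviation (N-1 normalization)."""
    cols = list(zip(*aligned))
    means = [mean(c) for c in cols]
    stds = [stdev(c) if len(c) > 1 else 0.0 for c in cols]
    return means, stds


# Hypothetical data: the same peak shape, offset by one position.
rows = [[0, 2, 6, 2, 0], [0, 0, 1, 4, 1]]
aligned = align_rows(rows)
means, stds = column_stats(aligned)
print(aligned)
print(means)
```

After the shift every first maximum sits in column max_pos + avg, which is why the start column is max_pos minus the row's own difference.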

Example 16
Plasmid DNA transformation protocol for Pseudomonas
a. Preparation of electroporation competent cells
1 ml of overnight culture is inoculated into 100 ml LB, and the bacteria are incubated in the 30°C shaker until the OD600 reading reaches 0.5-0.7. The bacteria are harvested by spinning at 3000 rpm for 10 minutes at 4°C.
The resulting cell pellet is washed with 100 ml ice-cold ddH2O and spun at 3000 rpm for 10 minutes at 4°C to collect the cells. The washing is repeated. The cells are then washed once with 50 ml ice-cold 10% glycerol (in ddH2O) and collected by spinning at 3000 rpm for 10 minutes at 4°C. The bacterial cells are resuspended in 2 ml ice-cold 10% glycerol (in ddH2O); 50 µl or 100 µl is aliquoted into each of the tubes and stored at -80°C.
b. Electroporation
1 µl plasmid DNA is mixed with 50 µl competent cells and kept on ice for 5 minutes. The mixture is transferred to a pre-chilled cuvette (0.2 cm gap, Bio-Rad). The DNA is transformed into the bacteria by electroporation with a Bio-Rad machine (settings: voltage 2.25 kV; time 5 ms; capacitance 25 µF).
300 µl SOC medium is added to the cell mixture and the bacteria are incubated in the 30°C shaker for one hour. A certain amount of culture is spread on LA plates with antibiotics, and the plates are incubated at 30°C.
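The quantities in Example 16 imply two simple figures: the nominal field strength across the cuvette gap and the number of aliquots obtainable from the 2 ml resuspension. The Python sketch below works both out. The function names are hypothetical, and the aliquot count assumes the full 2 ml is recoverable and ignores the volume of the cell pellet.

```python
def field_strength_kv_per_cm(voltage_kv, gap_cm):
    """Nominal field strength: applied voltage divided by the cuvette gap."""
    if gap_cm <= 0:
        raise ValueError("gap must be positive")
    return voltage_kv / gap_cm


def aliquot_count(total_volume_ul, aliquot_ul):
    """Number of whole aliquots that fit into the total volume."""
    if aliquot_ul <= 0:
        raise ValueError("aliquot volume must be positive")
    return int(total_volume_ul // aliquot_ul)


# Settings from Example 16: 2.25 kV across a 0.2 cm gap.
print(round(field_strength_kv_per_cm(2.25, 0.2), 2), "kV/cm")

# 2 ml of resuspended cells split into 50 µl or 100 µl aliquots.
print(aliquot_count(2000, 50), "aliquots of 50 µl")
print(aliquot_count(2000, 100), "aliquots of 100 µl")
```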

Example 17
Transformation of Yeast Cells by Electroporation
One day before the experiment, 10 ml of YPD medium is inoculated with a single yeast colony of the strain to be transformed. It is grown overnight to saturation at 30°C. On the day of competent cell preparation, the total volume of yeast overnight culture is transferred to a 2 L baffled flask containing 500 ml YPD medium. The culture is grown with vigorous shaking at 30°C to an OD600 of approximately 0.[illegible].

500 ml of culture is harvested by centrifuging at 4000 x g, 4°C, for 5 min in autoclaved bottles. The supernatant is subsequently discarded. The cell pellet is washed in 250 ml cold sterile water. Washing is repeated twice. The supernatant is discarded.
The pellet is resuspended in 30 ml of ice-cold 1 M sorbitol. The suspension is transferred into a sterile 50 ml conical tube. The mixture is centrifuged in a GP-[illegible] centrifuge at 2000 rpm, 4°C, for 10 min. The supernatant is discarded.
The pellet is resuspended in 50 µl of ice-cold 1 M sorbitol. The final volume of resuspended yeast should be 1.0 to 1.5 ml and the final OD600 should be 200.
In a sterile, ice-cold 1.5 ml microcentrifuge tube, 40 µl of concentrated yeast cells are mixed with 1 µg of DNA contained in ≤5 µl. The mixture is transferred to an ice-cold 0.2 cm gap disposable electroporation cuvette and pulsed at 1.5 kV, 25 µF, 200 Ω. It should be noted that the time constant reported by the Gene Pulser will vary from 4.2 to 4.9 msec. Times <4 msec or the presence of a current arc (evidenced by a spark and smoke) indicate that the conductance of the yeast/DNA mixture is too high.
400 µl ice-cold 1 M sorbitol is added to the cuvette and the yeast is recovered, with gentle mixing. 200 µl aliquots of the yeast suspension should be spread directly on sorbitol selection plates. Incubate 3 to 6 days at 30°C until colonies appear.
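The pulse settings in Example 17 can be related to the reported time constants with a simple RC calculation. The Python sketch below assumes a simplified circuit in which the 200 Ω setting acts as a resistor in parallel with the sample and no other resistance is present; that model, and the function names, are assumptions for illustration and are not taken from the instrument documentation. Under this model the nominal time constant is R × C = 5 ms, and a lower observed value corresponds to a more conductive sample.

```python
def nominal_time_constant_ms(resistance_ohm, capacitance_uf):
    """Time constant tau = R * C, returned in milliseconds."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def implied_sample_resistance_ohm(observed_tau_ms, parallel_resistance_ohm,
                                  capacitance_uf):
    """Sample resistance implied by an observed time constant.

    Assumes the sample sits in parallel with the selected resistor, so that
    1/R_effective = 1/R_parallel + 1/R_sample and tau = R_effective * C.
    """
    effective = (observed_tau_ms * 1e-3) / (capacitance_uf * 1e-6)
    if effective >= parallel_resistance_ohm:
        raise ValueError("observed tau exceeds the nominal R*C value")
    return 1.0 / (1.0 / effective - 1.0 / parallel_resistance_ohm)


# Settings from Example 17: 25 µF and 200 Ω.
print(nominal_time_constant_ms(200, 25), "ms nominal")

# Reported range of time constants: 4.2 to 4.9 msec.
for tau in (4.2, 4.9):
    r_sample = implied_sample_resistance_ohm(tau, 200, 25)
    print(tau, "ms implies a sample resistance of about", round(r_sample), "ohm")
```

Within this model, a time constant below 4 msec corresponds to a sample resistance under roughly 800 Ω, which is consistent with the statement that short time constants indicate excessive conductance.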

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
172
Literature Cited
1. Gibbs, J.B., Mechanism-Based Target Identification and Drug Discovery in
Cancer Research. Science 2000, 287, 1969-73
2. Garret, M.D., Workman, P. Discovering Novel Chemotherapeutic Drugs for the
Third Millennium. Eur. J. Cancer 1999, 35, 2010-30
3. Hanahan, D., Weinberg, R.A., The Hallmarks of Cancer. Cell 2000, 100, 57-70
4. Druker, B.J., Nicholas, B.L., Lessons learned from the development of an
Abl
tyrosine kinase inhibitor for chronic myelogenous leukemia. J. Clin. Invest.
2000,
105, 3-7
5. Sikic, B.L, New Approaches in cancer treatment. Ann. Onc. 1999, 10, 5149-
5153
6. Gibbs, J.B., Anticancer drug targets: growth factors and growth factor
signaling. J.
Clin. Invest. 2000, 105, 9-13
7. Drews, J., Drug Discovery: A historical perspective. Science 2000, 287,
1960-64
8. Harvey, A.L., Medicines from nature: are natural products still relevant to
drug
discovery? Trends Pharmacol. Sci. 1999, 20, 196-197
9. Cragg, G.M., Newman, D.J., Snader, K.M. Natural products in drug discovery
and
development. J. Nat. Prod. 1997, 60, 52-60
10. ~lerdine, G.L., The combinatorial chemistry of nature. Nature 1996, 384,
11-13
11. Demain, A.L., and J.E. Davies. Manual of industrial Microbiology and
biotechnology; ASM Press: Washington D.C., 1999
12. Mc Daniel, R., et al., Rational design of aromatic polyketide natural
products by
recombinant assembly of enzymatic subunits. Nature 1995, 375, 549-554
13. Jacobsen, J.R., D.E. Cane, and C. Khosla, Spontaneous priming of a
downstream
module in 6-deoxyerythronolide B synthase leads to polyketide biosynthesis.
Biochem. 1998, 37, 4928-4934
14. Donadio, S., McAlpine, J.B., Sheldon, P.J., Jackson, M., and Katz, L., An
erythromycin analog produced by reprogramming of polyketide synthesis.Proc.
Natl.
Acad. Sci. U.S.A. 1993, 90, 7119-23
15. Cortes, J. et al, Science, Repositioning of a domain in a modulax
polyketide
synthase to promote specific chain cleavage1995, 268, 1487-89
16. Amann, R.LL.W., Schleifer K.H., Phylogenetic identification and in situ
detection
of individual microbial cells without cultivation. Microbiol. Rev. 1995, 59,
143-169

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
173
17. Robertson, D.E., et al. The discovery of new biocatalysts from microbial
diversity.
SIM News 1996, 46, 3-8
18. Stein, J.L., et al., Characterization of uncultivated prokaryotes:
isolation and
analysis of a 40-kilobase-pair genome fragment from a planktonic marine
Archaeon.
J. Bacteriol. 1996, 178, 591-599
19. Short, J.M., Recombinant approaches for accessing biodiversity. Nat.
Biotechnol.
1997, 15, 1322-23
20. Sundberg, S.A., High-throughput and ultra-high-throughout screening:
solution-
and cell-based approaches. Curr. Opin. Biotech. 2000, 11, 47-53
21. Alvi, K.A., Pu, H., Asternquinones produced by Aspergillus candidus
inhibit
binding of the Grb-2 adapter to phosphorylated EGF receptor tyrosine kinase.
J.
Antibiotics 1999, 52, 215-223
22. Levitzki, A., Gazit, A., Tyrosine Kinase inhibition: an approach to drug
development. Science 1995, 267, 1782-88
23. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., and J.D. Watson,
Molecular
biology of the cell; Garland Publishing, Inc.: New York, 1994
24. Kolibaba, K.S., Druker, B.J., Protein tyrosine kinases and cancer. Biochim
Biophysica Acta 1997, 1333, F217-F248
25. Neal, D.E., Sharpies, L., Smith, K., Fennelly, J., Hall, R.R., Harris,
A.L., The
epidermal growth factor receptor and the prognosis of bladder cancer. Cancer
1990,
65, 1619-25
26. Nicholson, S., Richard, J., Sainsbury, C., Halcrow, P., Kelly, P., Angus,
B.,
Wright, C., Henry, J., Farndon, J., Harris, A., Epidermal growth factor
receptor
(EGFr) status associated with failure of primary endocrine therapy in elderly
postmenopausal patients with breast cancer. Br. J. Cancer 1991, 63, 146-150
27. Klijn, J.G.M., Berns, P.M.J.J., Schmitz, P.LM., Foekens, J.A., The
clinical
significance of epidermal growth factor receptor (EGF-R) in human breast
cancer: a
review on 5232 patients. Endocr. Rev. 1992, 12, 3-17
28. Hiesiger, E., Hayes, R., Pierz, D., Budzilovich, G., Prognostic relevance
of
epidermal growth factor receptor (EGF-R) and c-neu/erbB2 expression in
glioblastomas (GBMs). Neurooncol. 1993, 16, 93-104

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
174
29. Tateishi, M., Ishida, T., Mitsudomi, T., Kaneko, S., Sugimachi, K.,
Immunohistochemical evidence of autocrine growth factors in adenocarcinoma of
the
human lung Cancer Res. 1990, 50, 7077-80
30. Gorgoulis, V., Aninos, D., Mikou, P., Kanavaros, P., Karameris, A.,
Joardanoglu,
J., Rasidakis, A., Veslemes, M., Ozanne, B., Spandidos, D.A., Expression of
EGF,
TGF-alpha and EGFR in squamous cell lung carcinomas Anticancer Res. 1992, 12,
1183-87
31. Sharif, T.R., Sharif, M., A high throughput system for the evaluation of
protein
kinase C inhibitors based on Elkl transcriptional activation in human
astrocytoma
cells. Int. J. Onc. 1999, 14, 327-335 .
32. Li, ~., Vaingankar, S.M., Green, H.M., Green, M.M., Activation of the
9E3/cCAF
chemokine by phorbol esters occurs via multiple signal transduction pathways
that
converge to MEKl/ERK2 and activate the Elkl transcription factor. J Biol Chem
1999, 274, 15454
33. Treisman, R., Regulation of transcription by MAP kinase cascades. Curr.
Opin.
Cell Biol. 1996, 8, 205-215
34. Engler, D.A., Matsunami, R.K., Campion, S.R., Stringer, C.D., Stevens, A.,
Niyogi, S., Cloning of authentic human epidermal growth factor as a bacterial
secretory protein and its initial structure-function analysis by site-directed
mutagenesis. J. Biol. Chem. 1988, 263, 12384-390
35. Salinelin, C., Hovinen, J., Vilpo, J., Polymyxin permeabilization as a
tool to
investigate cytotoxicity of therapeutic aromatic alkylators in DNA repair-
deficient
Escherichia coli strains. Mut. Res. 2000, 467, 129-138
36. Gray, F., Kenney, J.S., Dunne, J.F., Secretion capture and report web: use
of
affinity derivatized agarose microdroplets for the selection of hybridoma
cells. J.
Irmnunol. Methods 1995, 182, 155-163
37. Powell, K.T., Weaver, J.C., Gel microdroplets and flow cytometry: rapid
determination of antibody secretion by individual cells within a cell
population.
Bio/Technology 1990, 8, 333-337
38. Jan van der Wal, F., Luirink, J., Oudega, B., Bacteriocin release
proteins: made of
action, structure, and biotechnological application. FEMS Biol. Rev 1995, 17,
381-
399

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
17S
39. Majno, G., Joris, L, Apoptosis, oncosis, and necrosis: an overview of cell
death.
Am. J. Pathol. 1995, 146, 3-15
40. Wyllie, A.H., Kerr, J.F.R., Currie, A.R., Cell death; the significance of
apoptosis.
Int. Rev. Cytol. 1980, 68, 251-356
41. Sikic, B.L, Rozencweig, M., Carter, S.K., Eds. Bleomycin chemotherapy;
Academic Press: Orlando, FL, 1985
42. Deng, JL., Newman, D.J., Hecht, S.M., Use of COMPARE analysis to discover
functional analogues of bleomycin. J. Nat. Prod. 2000, 63, 1269-72
43. Ortiz, L.A., Moroz, K., Liu, JY., Hoyle, G.W., Hammond, T., Hamilton, R.,
Holian, A., Banks, W., Brody, A.R., Friedman, M., Alveolar macrophage
apoptosis
and TNF-a, but not p53, expression correlate with marine, response to
bleomycin.
Am. J. Physiol. 1998, 275, L1208-L1218
44. Kumagai, T., Sugiyama, M., Protection of mammalian cells from the toxicity
of
bleomycin by expression of a bleomycin-binding protein gene from streptomyces
verticillus. J. Biochem. 1998, 124, 835-841
45. Benitez-Bribiesca, L., Sanchez-Suarez, P., Oxidative damage, bleomycin,
and
gamma radiation induce different types of DNA strand breaks in normal
lymphocytes
and thymocytes. Ann. NY Academy Sci. 1999, 887, 133-149
46. Du, L., Sanchez, C., Chen, M., Edwards, D.J., Shen, B., The biosynthetic
gene
cluster for the antitumor drug bleomycin from Streptomyces verticillus
ATCC15003
supporting functional interactions between nonribosomal peptide synthetases
and a
polyketide synthase. Chem. & Biol. 2000, 7, 623-642
49.Guiseley, K. B. US Patent 3,956,273, Modified Agarose and Agar and Methods
of
Making Same. May 11, 1976.
50. Phospholipids Handbook; Cevc, G., Ed.; Marcel Dekker: New York, 1993.
S1. Ringsdorf, H.; Schlarb, B.; Venzmer, J. Molecular Architecture and
Function of
Polymeric Oriented Systems: Models for Study of Organization, Surface
Recognition,
and Dynamics of Biomembranes. Angew. Chem., Int. Ed. Engl. 1988, 27, 113 - 158
and references cited therein.
52.0'Brien, D. F.; Ramaswami, V. Polymerized Vesicles. Encycl. Polym. Sci.
Eng.
1989, 17, 108 -135.

CA 02393374 2002-06-04
WO 02/31203 PCT/USO1/31806
176
53. Nilsson, K.; Brodelius, P.; Mosbach, K. Entrapment of Microbial and Plant
Cells
in Beaded Polymers. Methods in Emzymology, 1987, 135, 222 - 230 and references
cited therein.
54. Kroger, N.; Deutzmann, R.; Bumper, M. Polycationic Peptides from Diatom
Biosilica That Direct Silica Nanosphere Formation. Science 1999, 286, 1129-
1132.
55. Cha, J. N.; Stucky, G. D.; Morse, D. E.; Deming, T. J. Biomimetic
Synthesis of
Ordered Silica Structures Mediated by Block Copolypeptides. Nature 2000, 403,
289
- 292.
56. Bukanov, N. O., Demidov, V. V., Nielsen, P. E. & Frank-Kamenetskii, M. D.
(1998). PD-loop: A complex of duplex DNA with an oligonucleotide. PNAS, 95
(10), 5516-5520.
57. Brenner, S., Williams, S. R., Vermaas, E.H., Storck, T., Moon, K.,
McCollum,
C., Mao, J., Luo, S., Kirchner, J. J., Eletr, S., DuBridge, R. B., Burcham, T.
&
Albrecht, G. (1999). In vitro cloning of complex mixtures of DNA on
microbeads:
Physical separation of differentially expressed cDNAs. PNAS, 97 (4), 1665-
1670.
58. Goryshin, I. Y., & Reznikoff, W. S. (1998). Tn5 in vitro transposition. J.
Biol.
Chem., 273, 7367-7374.
59. Jayasena, V. K. & Johnston, B. H. (1993). Complement-stabilized D-loop:
RecA-catalyzed stable pairing of linear DNA molecules at internal sites. J.
Mol.
Biol., 230, 1015-1024.
60. Lohse, J., Dahl, O. & Nielsen, P. E. (1999). Double duplex invasion by
peptide
nucleic acid: A general principle for sequence-specific targeting of double-
stranded
DNA. PNAS, 96 (21), 11804-11808.
61. Sena, E. P. & Zarling, D. A. (1993). Targeting in lineax DNA duplexes with
two
complementary probe strands for hybrid stability. Nature Genetics
While the invention has been described in detail with reference to certain preferred embodiments thereof, it will be understood that modifications and variations are within the spirit and scope of that which is described and claimed.


Administrative Status


Event History

Description Date
Inactive: IPC expired 2018-01-01
Application Not Reinstated by Deadline 2006-10-10
Time Limit for Reversal Expired 2006-10-10
Inactive: IPRP received 2006-10-03
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2005-10-11
Letter Sent 2003-07-21
Letter Sent 2003-07-21
Letter Sent 2003-07-21
Letter Sent 2003-07-21
Letter Sent 2003-07-21
Inactive: Single transfer 2003-06-03
Inactive: Incomplete PCT application letter 2003-01-14
Inactive: Correspondence - Formalities 2002-12-05
Inactive: Courtesy letter - Evidence 2002-11-12
Inactive: Cover page published 2002-11-12
Inactive: Notice - National entry - No RFE 2002-11-07
Inactive: First IPC assigned 2002-11-07
Application Received - PCT 2002-08-27
National Entry Requirements Determined Compliant 2002-06-04
Application Published (Open to Public Inspection) 2002-04-18

Abandonment History

Abandonment Date Reason Reinstatement Date
2005-10-11

Maintenance Fee

The last payment was received on 2004-09-23


Fee History

Fee Type Anniversary Year Due Date Paid Date
Basic national fee - standard 2002-06-04
Registration of a document 2003-06-03
MF (application, 2nd anniv.) - standard 02 2003-10-10 2003-09-29
MF (application, 3rd anniv.) - standard 03 2004-10-11 2004-09-23
Owners on Record

Note: Records show the ownership history in alphabetical order.

Current Owners on Record
DIVERSA CORPORATION
Past Owners on Record
JAY M. SHORT
MARTIN KELLER
WILLIAM MICHAEL LAFFERTY
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description Date (yyyy-mm-dd) Number of pages Size of Image (KB)
Description 2002-12-04 181 9,133
Description 2002-06-03 176 9,034
Drawings 2002-06-03 27 1,819
Abstract 2002-06-03 1 67
Claims 2002-06-03 25 943
Cover Page 2002-11-11 1 39
Description 2002-06-05 176 9,009
Notice of National Entry 2002-11-06 1 192
Reminder of maintenance fee due 2003-06-10 1 106
Request for evidence or missing transfer 2003-06-04 1 101
Courtesy - Certificate of registration (related document(s)) 2003-07-20 1 105
Courtesy - Certificate of registration (related document(s)) 2003-07-20 1 105
Courtesy - Certificate of registration (related document(s)) 2003-07-20 1 105
Courtesy - Certificate of registration (related document(s)) 2003-07-20 1 105
Courtesy - Certificate of registration (related document(s)) 2003-07-20 1 105
Courtesy - Abandonment Letter (Maintenance Fee) 2005-12-05 1 174
Reminder - Request for Examination 2006-06-12 1 116
PCT 2002-06-03 1 33
Correspondence 2002-11-06 1 24
Correspondence 2003-01-06 2 35
Correspondence 2002-12-04 6 138
PCT 2002-06-04 19 1,096

Biological Sequence Listings

Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.
