Patent 3238472 Summary

(12) Patent Application: (11) CA 3238472
(54) English Title: ENRICHED PEPTIDE DETECTION BY SINGLE MOLECULE SEQUENCING
(54) French Title: DETECTION DE PEPTIDES ENRICHIS PAR SEQUENCAGE DE MOLECULE UNIQUE
Status: Compliant
Bibliographic Data
(51) International Patent Classification (IPC):
  • C40B 40/10 (2006.01)
  • C12Q 1/6874 (2018.01)
(72) Inventors :
  • ANDERSON, NORMAN LEIGH (United States of America)
  • RAZAVI, MORTEZA (Canada)
(73) Owners :
  • SISCAPA ASSAY TECHNOLOGIES, INC. (United States of America)
(71) Applicants :
  • SISCAPA ASSAY TECHNOLOGIES, INC. (United States of America)
(74) Agent: BERESKIN & PARR LLP/S.E.N.C.R.L.,S.R.L.
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2022-12-01
(87) Open to Public Inspection: 2023-06-08
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2022/080781
(87) International Publication Number: WO2023/102502
(85) National Entry: 2024-05-16

(30) Application Priority Data:
Application No. Country/Territory Date
63/284,990 United States of America 2021-12-01
63/288,987 United States of America 2021-12-13
63/296,196 United States of America 2022-01-04
63/303,417 United States of America 2022-01-26
63/313,760 United States of America 2022-02-25
63/340,001 United States of America 2022-05-10
63/348,213 United States of America 2022-06-02
63/352,925 United States of America 2022-06-16
63/373,875 United States of America 2022-08-30

Abstracts

English Abstract

The inventions herein, e.g., relate to quantitative measurement of proteins, and provide significant improvements, e.g., in the sensitivity, accuracy, throughput and cost of measuring clinically important proteins in biological samples such as blood. The inventions herein, e.g., also relate to peptide library preparation for quantitative single molecule analysis.


French Abstract

L'invention, par exemple, se rapporte à une mesure quantitative de protéines, et procure des améliorations significatives, par exemple, dans la sensibilité, la précision, le débit et le coût de mesure de protéines cliniquement importantes dans des échantillons biologiques tels que le sang. L'invention, par exemple, concerne également une préparation de banque de peptides pour une analyse quantitative de molécule unique.

Claims

Note: Claims are shown in the official language in which they were submitted.


What is claimed is:
1. A molecular construct and vehicle comprising:
(a) a molecular construct comprising a peptide comprising a target peptide
sequence
derived from proteolytic cleavage of a target protein and a molecular tag
defining the
source of said peptide,
and
(b) a vehicle capable of presenting the construct for analysis by a sequence-
sensitive
single molecule detector.
2. The molecular construct and vehicle of claim 1, wherein the molecular
tag is a target tag
that identifies the peptide as a peptide created by proteolytic digestion of a
biological
sample.
3. The molecular construct and vehicle of claim 1, wherein the peptide
comprises a
synthetic peptide and the molecular tag is a standard tag that identifies the
synthetic
peptide as an internal standard.
4. The molecular construct and vehicle of claim 2, further comprising a
sample barcode
identifying the sample of origin.
5. The molecular construct and vehicle of claim 1 further comprising a
binder barcode
identifying a binder to which the construct has been bound.
6. The molecular construct and vehicle of claim 4, wherein the barcode or
the tag is an
oligonucleotide.
7. A standardized sample digest derived from a proteolytic digest of a
biological sample,
comprising:
CA 03238472 2024- 5- 16

an amount of a molecular construct comprising a target tag and a target
peptide, said
construct being a target peptide construct and
an amount of a molecular construct comprising a standard tag and a peptide
whose
sequence is the same or similar to the sequence of said target peptide, said
construct being
a standard peptide construct,
wherein the target peptide is generated by proteolytic digestion of a target
protein in said
biological sample,
wherein said target and standard tags can be distinguished by a single
molecule detector
and comprise chemical or structural groups covalently joined to peptides in
their
respective constructs,
wherein said target tag is covalently attached to a plurality of the peptides
present in said
sample digest,
wherein said target peptide construct comprises more than 90% of the target
peptide
molecules present in said sample digest and
wherein said standard peptide construct is prepared separately and added to
said digest in
a known amount, or in a consistent relative amount across a multiplicity of
samples.
8. The standardized sample digest of claim 7, wherein the number of molecules
of the
standard peptide construct added to the sample digest differs by no more than
a factor of
100 from the number of molecules of the target peptide construct in said
sample digest.
9. The standardized sample digest of claim 7, further comprising one or more
additional
standard peptide constructs having a different standard tag from each other
and with each
construct at a different relative abundance.
10. The standardized sample digest of claim 7, wherein the target tag is
covalently attached to
a majority of the peptides generated by proteolytic digestion of said sample.
11. The standardized sample digest of claim 7 wherein said tags are
oligonucleotides.
12. An enriched standardized sample digest, comprising a bound fraction of the
standardized
sample digest of claim 7 bound by a binder, wherein said bound fraction
comprises a target
peptide construct and a standard peptide construct in a ratio equal within 2%,
5%, 10% or
20% to the ratio in which they are present in said standardized sample digest.
13. A stoichiometrically-flattened standardized sample, comprising a plurality
of pairs of
cognate standard and target peptide constructs enriched from a standardized
proteolytic
digest of a biological sample by binding to their respective cognate binders,
wherein
a pre-enrichment ratio calculated by dividing the number of molecules of a
first target
peptide construct that is the most numerous of said target peptide constructs
in the
standardized sample digest by the number of molecules of a second target
peptide
construct that is the least numerous of said target peptide constructs in the
standardized
sample digest is more than 10 times larger than a post-enrichment ratio
calculated by
dividing the number of molecules of said first target peptide construct by the
number of
molecules of said second target peptide construct in said enriched sample.
14. A method for measuring the amount of a selected target protein in a
biological sample,
comprising:
proteolytically digesting said sample,
modifying a plurality of peptides in the digested sample by adding a target
tag to form a
plurality of constructs comprising a selected target peptide derived from, and
proteotypic
of, said target protein, said plurality of constructs being target construct
molecules,
adding an amount that is known and/or consistent between a set of samples of a
prepared
standard peptide construct that is a cognate of said selected target peptide
construct and
comprises a standard tag, forming a standardized digest,
enriching said cognate target and standard peptide constructs by contacting
said
standardized digest with a cognate binder, forming bound constructs,
separating said bound constructs from unbound constructs to form enriched
constructs,
releasing said enriched constructs from said binder,
linking said enriched constructs to a vehicle capable of presenting said
enriched
constructs to a sequence-sensitive single molecule detector,
counting said enriched target construct molecules and said enriched standard
construct
molecules using a sequence-sensitive single molecule detector capable of
distinguishing
said target and standard tags and identifying said peptides,
calculating the amount of said protein in said sample.
15. The method of claim 14, wherein the calculating is performed by
multiplying the amount
of standard construct added by the ratio of the number of target construct
molecules
counted to the number of standard construct molecules counted by said
detector.
16. The method of claim 14, wherein said proteolytic digestion comprises at
least two
sequential steps resulting in peptide cleavage at different sites, and wherein
peptides are
covalently modified between two such steps (or wherein said first sequential
step cleaves
at lysine residues).
17. The method of claim 14, wherein said proteolytic digestion comprises at
least two
sequential steps resulting in peptide cleavage at different sites, and wherein
peptides
retain an unmodified n-terminal amino group when presented to said detector.
18. The method of claim 14, wherein a sample barcode is linked to said
constructs encoding
the identity, or relative position within a sample set, of said standardized
samples; a
plurality of said standardized samples is pooled; said sample barcodes
associated with
construct molecules are read using a sequence-sensitive single molecule
detector; and
the counts of target and standard construct molecules for each sample are
separated
based on said sample ID barcode identifying the sample from which they were
enriched,
and wherein said barcode may be an oligonucleotide.
19. The method of claim 14, wherein a binder barcode is linked to said
constructs
identifying the binder by which they were enriched, and wherein said barcode
may be an
oligonucleotide.
20. The method of claim 14, wherein said construct molecules are joined
together into
concatamers prior to presentation to said detector.

Description

Note: Descriptions are shown in the official language in which they were submitted.


WO 2023/102502
PCT/US2022/080781
ENRICHED PEPTIDE DETECTION BY SINGLE
MOLECULE SEQUENCING
1 BACKGROUND
The entire content of each patent document and publication referenced in this
application,
including but not limited to those listed below, is hereby incorporated by
reference herein in
its entirety.
1.1 BACKGROUND PATENTS
U.S. Provisional Patent Application No. 63/284,990, filed 12/1/21
U.S. Provisional Patent Application No. 63/288,987, filed 12/13/21
U.S. Provisional Patent Application No. 63/296,196, filed 1/4/22
U.S. Provisional Patent Application No. 63/303,417, filed 1/26/22
U.S. Provisional Patent Application No. 63/313,760, filed 2/25/22
U.S. Provisional Patent Application No. 63/348,213, filed 6/2/22
U.S. Provisional Patent Application No. 63/352,925, filed 6/16/22
U.S. Provisional Patent Application No. 63/373,875, filed 8/30/22
U.S. Provisional Patent Application No. 63/381,722, filed 10/31/22
US Patent No. 7,632,686 (application no. 10/676,005, entitled High Sensitivity
Quantitation of Peptides by Mass Spectrometry filed 2 October 2003)
International Application No. PCT/US11/028569 (entitled Improved Mass
Spectrometric Assays for Peptides filed 15 March 2011)
International Application No. PCT/US13/48384 (entitled Multipurpose Mass
Spectrometric Assay Panels for Peptides)
International Application No. PCT/US12/042,931 (entitled Magnetic Bead Trap
and
Mass Spectrometer Interface)
1.2 SINGLE MOLECULE SEQUENCING USING NANOPORES:
https://nanoporetech.com/
PCT/GB2020/053082
US11098355
US11168363
US20200239950A1
US 2017021955
US010814298
US20210147904
1.3 REVERSE TRANSLATION OF PEPTIDE SEQUENCE TO DNA (OR RNA, ETC.)
FOLLOWED BY DETECTION USING HIGH-THROUGHPUT NUCLEIC ACID
SEQUENCING PLATFORMS:
"Proteocode" technology developed by Encodia:
https://www.encodia.com/technology).
US20180201980A1
US20180328936A1
US20200348308A1
US20210254047A1
US20210302431A1
WO2017192633A1
"ProtSeq" technology developed by Google:
US 2021/0102248
US 2021/0079557
US 2021/0079398
US 2021/0171937
1.4 AFFINITY REAGENT IMAGING PLATFORMS
Nautilus: https://www.nautilus.bio
US 2020/0318101
US 2021/0358563
US 2021/0239705
US 2021/0101930
US 2020/0082914
US 10,948,488 B2
1.5 PEPTIDE DEGRADATION WITH OPTICAL DETECTION OF TERMINAL AMINO
ACIDS
Quantum-Si: https://www.quantum-si.com/products-and-technology/
and the following patent filings:
US 75227
US20160041095A1
US20200123593A1
US20200123594A1
US20200395099A1
US20210121875A1
US20210139973A1
US20210217800A1
US20210270740A1
US20210331170A1
US20210354134A1
WO2021086945A1
WO2021086954A1
WO2021146475A1
WO2021216763A1
1.6 FLUOROSEQUENCING AND FRET FINGERPRINTING:
US 9,625,469
US 10,545,153
US 2021/032536
US2018/0201980
US2021/03024
US2021/0254047
US20150087526A1
US20180328936A1
US20200018768A1
US20200123593A1
US20200123594A1
US20200124613A1
US20200231956A1
US20200348308A1
US20200400677A1
US20210221839A1
US20210331170A1
WO2016164530A1
WO2017192633A1
WO2019222527A1
WO2020014586A9
WO2021086908A1
WO2021216763A1
WO2021111125A1
2 FIELD OF THE INVENTION:
2.1 PROTEIN QUANTITATION.
The inventions herein relate to quantitative measurement of proteins, and
provide
significant improvements in the sensitivity, accuracy, throughput and cost of
measuring
clinically important proteins in biological samples such as blood. More than
100 different
proteins are currently measured by clinical diagnostic tests in blood (1),
each requiring a
separate test and a separate aliquot of sample. Such tests are typically
immunoassays, and
make use of indirect detection of protein targets by antibodies, opening the
door to a variety of
interferences and associated clinical errors (2). The cost and complexity of
this paradigm for
clinical laboratory testing severely constrains the health benefits obtainable
from measurement
of clinical biomarker proteins, and effectively precludes emerging
applications such as high
frequency longitudinal testing to establish personal biomarker baselines and
health models.
The inventions herein also relate to peptide library preparation for
quantitative single molecule
analysis.
2.2 EARLIER AFFINITY ENRICHMENT METHODS: SISCAPA-MS.
In the past, significant progress was made in improving the specificity,
multiplexability
and sensitivity of protein tests through the introduction of quantitative mass
spectrometric
protein assays, particularly those using specific proteolytic peptides as
quantitative surrogates
for their parent proteins and enriching those peptides using peptide-specific
enrichment
reagents such as anti-peptide antibodies (e.g., the SISCAPA technology, (3,
4)). These
advances have improved patient care (e.g., the SISCAPA test for the thyroid
cancer marker
thyroglobulin performed by leading clinical reference labs in the US and
Canada (5, 6), and
the recently introduced SISCAPA assay for SARS-CoV-2 NCAP protein (7)), and
are widely
used in pharmaceutical research and development to measure biomarkers,
therapeutic proteins
and drug targets with high precision.
However, the requirement for a mass spectrometer as the final detector in such
assays
represents a significant barrier to adoption due to capital cost (typically
~$500,000), operator
expertise required, limited throughput (typically 2-15 minutes per sample),
and unsuitability
for ultimate point-of-care use. In addition, the practical sensitivity of mass
spectrometers for
peptide detection is limited, with the best current instruments requiring at
least 10 amol (~6
million molecules) of a peptide for reliable detection, far more than would be
required in
principle if individual molecules could be counted reliably. It is an object
of the present
invention to overcome these barriers by enabling the use of single molecule
detection
techniques in quantitative protein assays.
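The "10 amol (~6 million molecules)" figure quoted above follows from Avogadro's number. A minimal Python sketch of the conversion is given here for illustration only; the function name is hypothetical and does not appear in the original text.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI definition)


def amol_to_molecules(amol: float) -> float:
    """Convert an amount in attomoles (1e-18 mol) to a number of molecules."""
    return amol * 1e-18 * AVOGADRO


# Detection limit quoted in the text for current mass spectrometers.
ms_limit_molecules = amol_to_molecules(10)  # about 6.0e6 molecules
```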
2.3 SINGLE MOLECULE METHODS.
Several technologies are being advanced for sequence-sensitive single molecule
peptide characterization and detection using concepts and methods initially
developed for
DNA and RNA sequencing. These include nanopore sequencing, "reverse
translation" of
peptide to DNA sequences, cyclical degradation with electronic detection of
terminal amino
acids, optical detection of single molecule epitopes, etc., as described
below. Nucleic acid
versions of such methods typically aim to provide enormous throughput (e.g.,
gigabases per
run) of sequence (i.e., digital) data - a feature required to address whole
genome, whole exome,
or RNASeq sequencing requirements - but are not focused on precise
quantitative (i.e., analog)
measurements of the amount of a particular type of molecule. Publications in
these fields rarely
(if ever) use statistical terms related to quantitative precision (such as
variance, coefficient of
variation (CV), or accuracy): in contrast, these terms are commonly used in
discussions of
protein quantitation methods where diagnostic accuracy often requires preset
precision (e.g., a
CV of 5%).
Several significant barriers impede the use of these technologies for the
analysis of
peptides and proteins. Peptides and proteins are made of 20 common amino
acids, as opposed
to only 4 common bases in either RNA or DNA, requiring a much greater degree
of analytical
discrimination to sequence peptides as opposed to DNA. An additional
consequence of the
greater variety of amino acids, compared to bases, is that peptides and
proteins have a very
wide variety of physical properties (e.g., number, polarity and localization
of electric charges,
solubility, chemical reactivity, inter-molecular interactions, etc.) as
compared to most nucleic
acids, which are uniformly negatively charged along their lengths, with
generally unreactive
bases attached. In addition, there is no biological equivalent of the
polymerase chain reaction
(PCR) for peptides and proteins, or of any enzymatic process capable of
copying a protein
molecule directly, thus eliminating many of the most powerful methods used in
nucleic acid
preparation for sequencing. Perhaps most importantly, without PCR or an
equivalent
amplification method for peptides, these methods have limited dynamic range.
Genomic DNA
has a very limited dynamic range (genes are generally present at near equal
stoichiometry).
While RNA or cell-free DNA can be present at widely varying abundances, PCR
can be used
to amplify low abundance molecules. In contrast, important protein-containing
samples such
as blood, plasma or tissues, have documented dynamic ranges in excess of 10^11
(difference in
molar amounts between the highest and lowest abundance proteins for which
there is a clinical
need to measure), with no practical method of amplifying the numbers of low
abundance
molecules. As a result, there are major gaps in the generality, specificity,
and robustness of
peptide methods derived from nucleic acid technologies, and thus significant
barriers to their
applicability to the problem of quantitative protein measurement.
2.4 THE PRESENT INVENTIONS.
To overcome these limitations and enable use of a "sequence-sensitive single
molecule
detector" instead of a mass spectrometer to detect and measure peptide
molecules, major
challenges in the preparation of peptide samples and their presentation to
such detectors must
be addressed. The present invention provides a general approach to the
preparation of peptide
libraries for quantitative single molecule analysis, and specific
implementations appropriate
for use with several alternative single molecule detectors (nanopores,
optical imaging systems,
and single molecule stepwise sequencing systems).
A key obstacle in formulating the invention has been the multi-dimensional
nature of
the problem, encompassing as it does the areas of protein and peptide
chemistry,
oligonucleotide chemistry and sequence design, antibody selection, single
molecule detection
by optical, chemical and electrical technologies, and requirements of specific
clinical
diagnostic assays. Key aspects of the invention involve adaptation of
technologies from each
of these areas in a novel combination.
The invention preserves the fundamental benefits of direct detection of
analyte
molecules (a strength of mass spectrometry in comparison with indirect
detection methods
such as immunoassays), while offering the potential for improved test
sensitivity, sequence
specificity, and lower cost - all of which improve commercial competitiveness
against legacy
immunoassay technologies and enable expanded use of protein biomarkers in
medicine and
pharmaceutical R&D.
Using reagents and methods of the present invention, a substantial improvement
in the
throughput, cost and sensitivity of protein analysis can be achieved. In the
case of nanopore
sequencing, taken as an example of sequence-sensitive single molecule
detection, we can, in
principle, estimate the performance of a system optimized for peptide
quantitation. Assuming
that 1) peptides can be delivered as oligonucleotide constructs of about 50
bases in length; 2)
nanopore sequencers process (read) oligonucleotides at a rate of approximately
400 bases/sec;
3) accurate measurement of the amount of a peptide requires detection and
counting of less
than 5,000 molecules; and 4) a commercially available nanopore cartridge
contains 3,000
simultaneously readable nanopores, it would be theoretically possible to
identify and precisely
measure 30 peptide targets (representing 30 distinct clinically-relevant
proteins) in a single
sample in approximately 6 seconds. Using DNA barcoding methods described
herein to
multiplex samples, 96 samples could be analyzed in 10 minutes using a single
cartridge,
compared to approximately 10 hr using liquid chromatography - mass
spectrometry (LC-MS):
an advantage of 60-fold in speed with less than 1/10th the equipment cost.
Relatively
inexpensive benchtop commercial devices exist capable of operating 48 such
cartridges
simultaneously, providing potential throughput for such a 30-plex biomarker
panel of 4,608
samples in 10 min, or more than 600,000 samples per 24-hr day (assuming that
sample
preparation could keep up with this throughput!).
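The throughput estimate above can be reproduced with a few lines of arithmetic. The Python sketch below uses only the four stated assumptions plus the 96-sample and 48-cartridge figures from the text. It additionally assumes ideal conditions that the text does not state (every pore continuously occupied, no dead time between reads, every read usable), so it is an upper bound rather than a performance prediction. All names are illustrative.

```python
def nanopore_panel_seconds(bases_per_construct=50, bases_per_sec=400,
                           molecules_per_target=5000, targets=30, pores=3000):
    """Idealized time to read one sample's panel on one cartridge."""
    reads_needed = molecules_per_target * targets            # 150,000 reads
    seconds_per_read = bases_per_construct / bases_per_sec   # 0.125 s
    return reads_needed * seconds_per_read / pores


per_sample_seconds = nanopore_panel_seconds()            # 6.25 s
plate_minutes = 96 * per_sample_seconds / 60             # 10.0 min for 96 samples
lcms_speedup = (10 * 60) / plate_minutes                 # 60-fold vs. ~10 hr LC-MS
samples_per_10_min = 48 * 96                             # 4,608 on 48 cartridges
samples_per_day = samples_per_10_min * (24 * 60 / plate_minutes)  # 663,552
```

Under these idealized assumptions the sketch gives 6.25 s per sample, which the text rounds to approximately 6 seconds, and 663,552 samples per day, consistent with "more than 600,000".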
Using a completely different single molecule detection method based on optical
imaging of molecules immobilized on a planar array, similar increases in
analytical throughput
can be obtained. An array capacity of 2,000,000,000 individual molecules would
allow
400,000 peptide abundance measurements, assuming that 5,000 molecules need to
be counted
for each measurement and that all the different molecules can be brought to
near equal
stoichiometry. A single run of such a system, requiring approximately 1 day
for optical
readout, could therefore measure 40 proteins in 10,000 samples per day.
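The array-capacity estimate above is a simple division, sketched here in Python using only the figures given in the text. Variable names are illustrative, and the calculation inherits the text's assumption that all molecules are present at near equal stoichiometry.

```python
array_capacity = 2_000_000_000        # individual molecules imaged per run
molecules_per_measurement = 5_000     # molecules counted per peptide measurement
proteins_per_panel = 40               # panel size used in the text's example

measurements_per_run = array_capacity // molecules_per_measurement  # 400,000
samples_per_run = measurements_per_run // proteins_per_panel        # 10,000
```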
Furthermore, the ability to recognize and count individual analyte molecules
can, at
least in theory, offer the maximum assay sensitivity possible by any method,
approximately
1,000 times as sensitive as mass spectrometry (~5,000 vs 6,000,000 molecules
required for
quantitative measurement, respectively). Such an improvement in sensitivity
would enable
precise measurement of almost all the 100+ clinically-established blood
protein biomarkers in
protein biomarkers in
much less than 1 microliter (1/20th of a drop) of blood.
These advances in throughput and sensitivity can translate into significant
improvements in the cost of measuring the current menu of approximately 115
blood protein
biomarkers. Current clinical laboratory analyzers measure one protein at a
time, typically by
means of specific immunoassays, at an average cost of $5-10 per protein per
sample, and
requiring 50-100 uL of sample per measurement (which explains why 5-10 mL of
blood is
typically drawn by venipuncture when blood tests are required). A significant
downside of
this paradigm is the limitation it places on use of protein panels: measured
one at a time, a
panel of proteins would cost $5-10 times the number of proteins - a strong
disincentive
impeding the application of panels despite their greater diagnostic
information content. Using
single molecule methods, it is estimated that a single run costing $10,000 and
yielding 400,000
protein measurements would lower the cost per individual clinical protein
measurement to
$0.025. This cost structure would revolutionize clinical diagnostics and
enable major advances
in disease detection and management.
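The cost comparison above can be checked with the following Python sketch. It uses the run cost, measurement count, and per-protein immunoassay cost range stated in the text. The 40-protein panel size is borrowed from the earlier optical-array example and is used here only as an illustrative assumption.

```python
run_cost_usd = 10_000
measurements_per_run = 400_000
cost_per_measurement = run_cost_usd / measurements_per_run  # $0.025

panel_size = 40                                  # illustrative panel size
immunoassay_cost_range = (5 * panel_size, 10 * panel_size)  # $200 to $400 per sample
single_molecule_panel_cost = panel_size * cost_per_measurement  # about $1 per sample
```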
However, this enormous advance in performance provided by sequence-sensitive
single molecule detectors is only realizable if several very challenging
problems can be solved
that currently impede their use to measure a wide range of clinically-important
proteins in
diagnostic samples such as plasma and whole blood. These problems relate to the
preparation
of peptide libraries for quantitative single molecule analysis, and are
successfully overcome
by the present invention, which provides improvements in quantitation (by
providing novel
internal standards), sensitivity (by enriching low-abundance targets), dynamic
range (by
enabling stoichiometric flattening), and analytical workflow (by providing
applicable
chemistries for sample preparation).
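The internal-standard quantitation referred to above reduces to a ratio of molecule counts: the amount of target equals the known amount of added standard multiplied by the ratio of target to standard molecules counted (the calculation recited in claim 15). The Python sketch below illustrates this under the assumption that each detector read has already been classified by tag. The `(tag, peptide)` read format and the function name are hypothetical, not taken from the original text.

```python
from collections import Counter


def quantify_target(reads, standard_amount):
    """Estimate the target amount from tagged single molecule reads.

    reads: iterable of (tag, peptide) pairs, where tag is "target" or
           "standard" (a hypothetical, already-classified read format).
    standard_amount: known amount of standard construct added, in any unit;
           the result is returned in the same unit.
    """
    counts = Counter(tag for tag, _peptide in reads)
    n_target, n_standard = counts["target"], counts["standard"]
    if n_standard == 0:
        raise ValueError("no standard molecules counted; cannot quantify")
    return standard_amount * n_target / n_standard


# Toy example: 1,500 target and 5,000 standard reads, 10 fmol standard added.
example_reads = [("target", "PEPTIDEK")] * 1500 + [("standard", "PEPTIDEK")] * 5000
target_fmol = quantify_target(example_reads, standard_amount=10.0)  # 3.0 fmol
```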
3 BACKGROUND OF THE INVENTION:
The present inventions address several major challenges in protein analysis,
making
use of a number of methods well-known in the art, in novel combinations and in
combination
with entirely novel concepts disclosed herein.
3.1 PROTEIN QUANTITATION CHALLENGES.
Quantitative measurements of protein biomarkers, drugs and drug targets are
important
in many areas of medical practice, pharmaceutical trials and biological
research. The
importance of improving such measurements in terms of sensitivity,
specificity, and generality
is nowhere more significant than in the context of blood, the primary clinical
specimen. Blood
represents the largest and deepest version of the human proteome present in
any sample: in
addition to the classical "plasma proteins" and cellular proteins of red
cells, white cells and
platelets, it contains all tissue proteins (as leakage markers) plus very
numerous distinct
immunoglobulin sequences (8). In addition to the large number of proteins
present, proteins
in plasma exhibit an extraordinary dynamic range in abundance: more than 10
orders of
magnitude in concentration separate albumin and the rarest proteins now
measured clinically.
Abundant scientific evidence, from proteomics and other disciplines, suggests
that among
these are proteins whose abundances and structures change in ways indicative
of many, if not
most, human diseases. Nevertheless, only about 100 proteins are currently used
in routine
clinical diagnosis (1), while the rate of introduction of new protein tests
approved by the US
FDA has paradoxically declined over the last two decades to about one or two
new protein
diagnostic markers approved per year. Furthermore, it appears that the
clinical value of most
such tests would be substantially improved if the results were interpreted in
terms of patient-specific
(i.e., personalized) baselines (rather than population reference
intervals) - an advance
that is currently inhibited by the cost and inconvenience of collecting a
series of baseline
samples from each patient before the emergence of major disease processes (9).
Major
advances in clinical diagnostics and pharmaceutical research are to be
expected if certain
technical problems in sample collection, preparation and analysis are solved.
Current methods of protein analysis, including those used in clinical
laboratories for
analysis of samples like blood, have limitations that significantly impact
their utility, both in
research and clinical practice. High-precision tests for clinical use are
expensive, limited to a
small menu of proteins, and require expensive equipment. Research methods,
e.g., those of
proteomics, measure many proteins, but with limited precision, low throughput
and high cost.
There is therefore a need for improvements in protein measurement.
3.2 PEPTIDES AS QUANTITATIVE SURROGATES FOR PROTEINS
In many applications aimed at protein quantitation, one can use a single
peptide as a
quantitative surrogate for the parent protein, provided that there is one (or
some other known
number) of copies of the peptide per protein molecule; i.e., that the peptide
molar amount (or
number of molecules) is equal to (or some known multiple of) the protein's
molar amount (or
number of molecules). Using the known sequence of a target protein and/or
experimental data,
one can select one or more proteolytically-derived peptide segments within it
as "target
peptides" (herein referred to as TARGET(s)) to be measured as surrogates for
their parent
proteins. A good target peptide for quantitation purposes is one that is a)
proteotypic for the
protein (i.e., occurs in no other protein of the species from which the sample
is derived); b)
occurs a known number of times (usually once) in the protein sequence,
allowing the peptide
to be used as a surrogate measure of the molar amount of the protein, c) is
efficiently detected
by a chosen detector; and d) behaves reliably in a practical sample
preparation workflow
appropriate to the assay objectives (which may include, for example, specific
binding and
enrichment compared to other un-selected peptides). Methods for selection of
TARGET
peptides from a wide range of target proteins for conventional mass
spectrometric detection are
well known in the art, but are not directly relevant to selection of optimal
peptides for single
molecule detection.
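Criteria (a) and (b) above can be screened computationally before any experimental work. The Python sketch below is one possible illustration, not the selection method of the invention. It assumes each protein has already been digested in silico into a list of peptide sequences, and it keeps peptides that occur exactly once in the target protein and in no other protein. The data structure and function name are hypothetical. Criteria (c) and (d) depend on the detector and workflow and are not addressed.

```python
from collections import Counter


def proteotypic_candidates(digests, target):
    """Return peptides unique to `target` that occur exactly once in it.

    digests: dict mapping protein name -> list of peptide sequences
             produced by digestion (hypothetical input format).
    """
    own_counts = Counter(digests[target])
    elsewhere = {pep
                 for name, peptides in digests.items() if name != target
                 for pep in peptides}
    return sorted(pep for pep, n in own_counts.items()
                  if n == 1 and pep not in elsewhere)


# Toy example: GGTR occurs twice in A, and SHAREDK also occurs in B.
toy_digests = {
    "A": ["LVNEK", "GGTR", "GGTR", "SHAREDK"],
    "B": ["SHAREDK", "QQPR"],
}
candidates = proteotypic_candidates(toy_digests, "A")  # ["LVNEK"]
```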
3.3 PROTEOLYTIC DIGESTION
Digestion of proteins to peptides serves to "simplify" the structure of a
protein sample,
by eliminating complicated protein shapes (and their associated unique
physical properties and
protein:protein interactions), at the expense of increasing the numbers of
molecules present.
In other words, the immense variety of folded protein structures present in a
biological sample
is transformed by digestion into a larger set of essentially unstructured
short, linear peptides.
Proteins exhibit a very wide range of physical properties, ranging from
soluble to insoluble,
compact to extended, positively to negatively charged, with half-lives of
seconds to months,
and thus each protein represents an individual challenge in terms of handling
and measurement.
However, proteolytic digestion of a given protein to peptides generally yields
a mixture of
peptide molecules from which an example can almost always be chosen that is
unique to a
given target protein (and thus can serve as a quantitative surrogate for it)
and has properties
compatible with a selected measurement method (encapsulated by the
aspirational phrase "in
every bad protein there is at least one good peptide"). For this reason,
peptide-level detection
is less susceptible to interferences, and more compatible with universal
sample preparation
methods, than protein-level detection. A typical human protein yields about 50
peptides upon
digestion with trypsin, and thus a sample containing, for example, 5,000
proteins is likely to
yield a tryptic digest containing 250,000 different peptides. Peptides of the
length of typical
tryptic peptides (5 to 25 amino acids in a typical tryptic digest) do not
generally exhibit stable
folded structures and thus do not generally interact with one another to form
stable multi-
peptide structures. This overall absence of stable interactions between digest
peptides
overcomes the major source of interference and error in technologies such as
conventional
immunoassays.
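As an illustration of how digestion converts a protein into a set of short linear peptides, the following minimal sketch applies the commonly used simplified trypsin rule (cleavage on the C-terminal side of lysine or arginine, except when the next residue is proline). The function name and the toy sequence are hypothetical and are not taken from this disclosure; real digests also contain missed cleavages, which this sketch ignores.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a sequence after K or R unless followed by P (simplified rule)."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # remaining C-terminal peptide
    return peptides


# Hypothetical toy sequence, not a real protein.
for peptide in tryptic_digest("MKWVTFRPSLLKAA"):
    print(peptide, len(peptide))
```

Applied to every protein in a sample, a function of this kind gives the in-silico peptide list from which candidate target peptides would be chosen.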
Proteolytic digestion is widely used in proteomics to fragment proteins for
analysis by
mass spectrometry (10) and other analytical methods. Digestion of a sample
such as plasma
is typically carried out by first denaturing the sample proteins (e.g., with
detergents such as
deoxycholate, organic solvents, urea or guanidine HCl), reducing the disulfide
bonds in the
proteins (e.g., with tris(2-carboxyethyl)phosphine (TCEP), dithiothreitol or
mercaptoethanol),
alkylating the cysteines to prevent re-formation of disulfides (e.g., by
addition of
iodoacetamide which reacts with the free -SH group of cysteine), quenching
excess
iodoacetamide by addition of more dithiothreitol or mercaptoethanol, and
finally (after
removal or dilution of the denaturant) addition of the selected proteolytic
enzyme (e.g. trypsin,
Lys-C, etc.), followed by incubation to allow digestion. Following incubation,
the action of
trypsin is terminated, either by addition of a chemical inhibitor (e.g., TLCK)
or by denaturation
(through heat or addition of denaturants, or both) or removal (if the trypsin
is on a solid support)
of the trypsin. Digestion destroys protein:protein interactions and thus
generally eliminates
interferences that occur in conventional immunoassays.
A very wide variety of proteolytic digestion protocols have been developed,
and some
have been shown to exhibit extremely high quantitative reproducibility when
implemented on
automated platforms (4). Most such protocols involve use of a single
proteolytic step with a
single enzyme (typically trypsin), while in a few cases two enzymes are used
together (e.g.,
Lys-C and trypsin) in order to improve efficiency: Lys-C is smaller than
trypsin and more
stable at elevated temperature and in the presence of denaturants, and
therefore able to cleave
proteins that are otherwise relatively resistant to trypsin attack. In some
cases this approach
makes use of sequential digestion by Lys-C followed by trypsin, with these two
steps carried
out at different temperatures or at different denaturant concentrations. In
contrast to
embodiments described herein, the sequential use of Lys-C and trypsin to
improve digestion
efficiency does not allow oriented construction of peptide-polymer constructs
as disclosed in
the present invention.
3.4 THE DYNAMIC RANGE PROBLEM
Quantitative detection of peptides by any method faces a major challenge in
the form
of the vast dynamic range of protein concentrations in samples of interest. In
the main clinical
specimen (blood serum or plasma) proteins of clinical interest span more than
10 orders of
magnitude (10 billion-fold) between the highest abundance proteins (albumin in
plasma, or
hemoglobin in whole blood) and low abundance proteins of interest (e.g.,
thyroglobulin (Tg)
in the blood of a thyroid cancer survivor). Thus calculations based on the
known amounts of
various proteins in human plasma show that for every molecule of a selected
peptide unique to
thyroglobulin in the tryptic digest of a given volume of plasma, there are ~40
billion molecules
of other peptides. Comparing the selected thyroglobulin peptide abundance with
that of a
selected peptide from albumin, there are more than 400,000,000 copies of the
albumin peptide
for each molecule of the Tg peptide. An example panel of clinically important
proteins is
shown in Figure 1, comprising a high-abundance protein (transferrin), and
lower abundance
proteins soluble transferrin receptor (sTfR) and hepcidin, as well as
thyroglobulin (present at
0.5 ng/ml - a level indicative of thyroid cancer recurrence in someone treated
for that cancer).
In a plasma digest, a transferrin peptide is expected to outnumber sTfR
peptides by almost
1,000 to 1; to outnumber hepcidin peptides by almost 5,000 to 1, and to
outnumber Tg peptides
by 28,000,000 to 1. These proteins are measured today in separate assays, each
optimized for
a different abundance level, and typically offering an assay dynamic range of
~1,000 (i.e., a
range in which most clinical specimen concentrations of that protein are
expected to fall).
Wider dynamic ranges have been achieved in detection systems reliant on
amplification
(e.g., PCR assays for nucleic acids and proximity ligation (11) or Somascan
(12) assays for
proteins). However, amplification-based assay systems sacrifice certainty
regarding analyte
identity, since the actual molecular targets are not themselves observed or
measured by the
ultimate detectors used in such assays (which detect binding reagents such as
antibodies
instead), with the result that unexpected interfering molecules can be
measured and genuine
analyte molecules can fail to generate signal. Confidence that the intended
analyte, and only
this analyte, is being measured requires direct analyte detection by a
detector capable of
discriminating the correct analyte from all others, as exemplified by sequence
sensitive single
molecule detectors used in the present invention.
3.5 LIMITATIONS OF MASS SPECTROMETRY AND SISCAPA
Using direct analyte detection (whether by mass spectrometry or single
molecule
detection) with samples such as blood plasma comprising a very wide dynamic
range of protein
targets represents a major technical challenge. A practical technology for
measuring the
clinical plasma proteome should be capable of accurately quantitating panels
of specific
biomarker proteins spanning all abundance levels in a single aliquot of
sample. To date, the
favored approach to this problem has been a combination of differential
enrichment of peptides
during sample preparation (bringing target peptides into more equal ratios)
with mass
spectrometry detection (providing a close approximation to sequence-based
analyte
identification), an approach termed SISCAPA. The SISCAPA method described in
previous
disclosures (US7632686) and publications (3, 9, 13-15) is a general approach
for protein
quantitation involving digesting proteins (e.g., with trypsin) into peptides
that can be enriched
by specific affinity capture and further fragmented in a mass spectrometer
(e.g., by LC-
MS/MS) to generate a sequence-based identification and a measure of amount by
comparison
to an internal standard. This approach combines the advantages of classical
immunoassays
(sensitivity, throughput) with those of mass spectrometry (specificity,
multiplexability, wide
linear dynamic range), while overcoming the limitations of each. Challenges
remain, however,
in sensitivity (mass spectrometry typically requires millions of molecules to
generate a
reasonably precise signal), throughput (MS-based assays typically require 5-30
minutes per
sample), cost (mass spectrometers are expensive, e.g., $500,000, and require
expert operators),
and robustness (MS-based systems are typically confined to large institutions
with
sophisticated infrastructure).
A major improvement would result from the replacement of mass spectrometry by
a
sequence-sensitive single molecule detector, coupled with sample preparation
technology
capable of delivering to the detector the small numbers (e.g., 100-10,000) of
purified analyte
molecules that, when counted by the detector, generate the required assay
precision (in this
case determined by counting statistics).
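The relationship between the number of molecules counted and the achievable precision can be sketched as follows. The sketch assumes that counting follows Poisson statistics, so that the coefficient of variation (CV) is 1/sqrt(N), and that no other source of error contributes; both are simplifying assumptions made here for illustration, and the function name is hypothetical.

```python
import math


def counting_cv(n_molecules: int) -> float:
    """Relative standard deviation of a Poisson count of n_molecules."""
    return 1.0 / math.sqrt(n_molecules)


for n in (100, 1_000, 10_000):
    print(f"{n:>6} molecules counted: CV of about {counting_cv(n):.1%}")
```

Under these assumptions, counting 100 molecules limits precision to a CV of about 10%, while counting 10,000 molecules allows a CV of about 1%.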
No mature, practically-implementable technology currently exists that is
capable of
preparing peptide libraries from complex sample digests for single molecule
analysis over such
a wide dynamic range.
3.6 PEPTIDE CHEMISTRY
A wide range of chemical modifications have been devised that can be of use in
the
preparation of peptides and digests for analytical applications of the
invention. In some
embodiments described herein a peptide is chemically modified, for example to
create a
linkage to another molecule during assembly of a novel multi-part construct.
Site specific
linkage chemistries are known for amino groups (e.g., the n-terminal amino
group, and the
epsilon amino group of lysine); for carboxyl groups (e.g., the c-terminal
carboxyl, and
sidechain carboxyls of aspartic and glutamic acids); sulfhydryl groups of
cysteine residues;
and a variety of other less frequently used chemistries. Of these, only the
amino and carboxyl
groups are available on almost all peptides, making them the preferred
attachment points for
general methods applicable to a wide variety of peptides. While many reagents
have been
identified that react preferentially with amino groups, the most common
chemistries are
N-hydroxysuccinimide (NHS) esters and their more soluble sulfo-NHS derivatives.
The n-
terminal amino group and epsilon amino group of lysine have significantly
different pK values,
offering the potential to modify one in preference to the other, though this
distinction is not
absolute. Nevertheless, protocols have been devised (16) that couple NHS-
derivative small
molecules (e.g., TMT labels used in mass spectrometric detection) to peptide
amino groups
with very high efficiency (>95%) using very little excess reagent (less than 2-
fold excess of
reagent over amino groups), illustrating the feasibility of quantitative
modification of amino
groups in complex peptide mixtures such as proteolytic digests. Carboxyl
groups can also be
modified using chemistries involving carbodiimides (e.g., EDC:
1-ethyl-3-(3-dimethylaminopropyl)carbodiimide); however, the lack of site specificity of
such reactions
(between c-terminal and amino acid side chain carboxyls) restricts their
usefulness for site-
specific approaches.
A variety of reagents exist that are capable of introducing "click" chemistry
functional
groups into peptides, often by reaction of an NHS-derivative of a click
functionality with a
peptide amino group, and oligonucleotides, often by incorporating an amino-
derivatized base
in a DNA sequence during synthesis and subsequently adding a click group by
reaction with
this amino group (17). Examples of effective click reagent pairs useful in
creation of constructs
according to the invention include i) reaction of an azide with an alkyne
functionality (some
requiring Cu(I) catalysis, which is less preferred in some embodiments); ii)
reaction of an azide
with a cyclooctyne such as DBCO (dibenzocyclooctyne, also called DIBO),
aza-dibenzocyclooctyne (ADIBO) or BCN (bicyclo[6.1.0]non-4-yne) by means of a
strain-promoted azide-alkyne cycloaddition (SPAAC) reaction without the need for a Cu
catalyst; and iii)
reaction of a tetrazine (Tz, such as methyltetrazine) with a trans-cyclooctene
(TCO), also
without the need for a Cu catalyst.
3.7 INTERNAL STANDARDIZATION
Use of an internal standard in an analytical assay is highly desirable as it
provides a
stable reference against which the desired analyte can be measured. In the
case of mass
spectrometric detection of peptides, a synthetic stable isotope labeled
version of a target
peptide can easily be made and used as an internal standard (the well-known
method of
"isotope dilution mass spectrometry"). The approach works well because the
labeled and
unlabeled peptides are chemically and structurally identical, and thus behave
the same through
any sample preparation protocol, yet can be distinguished reliably by
measuring their masses
in the final mass spectrometer detection step. Since the labeled peptide is
added at a known
concentration, the ratio between the amounts of the natural and isotopically
labeled forms
detected by the final MS analysis allows the concentration of the natural
peptide in the sample
mixture to be calculated. The approach can be multiplexed to cover multiple
peptides
measured in parallel, and can be automated through computer control to afford
a general
system for protein measurement (13).
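The ratio calculation underlying internal standardization can be sketched as follows. The sketch assumes that the analyte and the internal standard generate equal signal per molecule and are recovered equally through sample preparation; the function name, parameter names, and example numbers are hypothetical.

```python
def analyte_concentration(analyte_signal: float,
                          standard_signal: float,
                          standard_concentration: float) -> float:
    """Concentration of the natural peptide from its ratio to a spiked standard."""
    if standard_signal <= 0:
        raise ValueError("internal standard signal must be positive")
    ratio = analyte_signal / standard_signal
    return ratio * standard_concentration


# Hypothetical example: standard spiked at 40 fmol/uL; signals in arbitrary units.
print(analyte_concentration(2_500.0, 10_000.0, 40.0), "fmol/uL")
```

The same ratio arithmetic applies whether the two signals are peak areas from a mass spectrometer or counts of individual molecules, provided the two forms can be distinguished by the detector.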
Single molecule detectors are unable to measure peptide mass accurately enough
(or in
most cases at all) to use stable isotope versions as internal standards in
this manner. Hence
there is a need for an alternative peptide labeling strategy to create single
molecule internal
standards capable of a) behaving like the targeted peptide analyte during the
steps of sample
preparation, while b) being clearly distinguishable from the target by the
chosen single
molecule detection technology. Use of the term "standard" in this specific
sense is distinct
from other forms of "standards" that can be introduced into workflows for
quality control of
separations, monitoring of chemical reaction yields, etc., rather than
improving quantitation of
a single specific analyte.
3.8 PEPTIDE-SPECIFIC BINDERS
A variety of types of biologically derived antibodies (e.g., polyclonal,
monoclonal and
oligoclonal antibodies derived from mice, rabbits, humans, camelids and other
species),
molecules derived from antibodies by molecular biology techniques (e.g.,
antibodies selected
from libraries using phage display and other techniques), aptamers (based on
DNA, or RNA,
and including a variety of modified bases and backbones), and other molecular
constructs can
be created that are capable of specifically binding a peptide ("BINDERs"). For
example, a
TARGET peptide can be coupled to a carrier protein (e.g., keyhole limpet
hemocyanin: KLH)
and used to immunize an animal (such as a rabbit, mouse, chicken, goat,
camelid or sheep) by
one of the known protocols that efficiently generate anti-peptide antibodies.
Experience with
the SISCAPA technology (3, 9, 13-15) has shown that antibodies, preferably
monoclonal
antibodies, can be developed that bind and capture a specific low abundance
tryptic peptide
from the digest of a very complex sample such as human blood plasma (which may
contain
250,000 distinct peptides, some at very high abundance), and thereby enrich
the peptide
substantially (e.g., more than 10,000-fold). Discovery of such BINDERs (e.g.,
antibodies)
requires use of very specific screening processes to find reagents that do not
bind non-
TARGET peptides and retain the TARGET peptide long enough to wash non-binding
peptides
away (typically 10-15 minutes in many automated protocols). The screening
process does not
assess equivalence of binding TARGET and STANDARD (stable isotope-labeled
peptide)
since it is known there will be no difference (at least for 15N and 13C
isotopic labels).
If, however, a peptide TARGET and its cognate internal STANDARD molecule are
not chemically identical (as they are in the case of stable isotope labeled
standards), the very
specificity of effective BINDERs creates a major problem: if a BINDER binds a
STANDARD
more or less tightly (or with different kinetics) than its cognate TARGET
peptide, then binding
will impact the ratio of TARGET to STANDARD molecules and lead to an incorrect
assay
result. The selection of TARGETs, STANDARDs and BINDERs that successfully
preserve
the quantitative ratio is therefore critical for enablement of quantitative
internally-standardized
single molecule detection.
3.9 STRENGTHS AND LIMITATIONS OF SINGLE MOLECULE DETECTION
The ability to detect a single molecule of an analyte provides the maximum
theoretically possible analytical sensitivity. While only a few current
clinical assays are
sensitivity-limited, sensitivity is nevertheless a major barrier for many
emerging and
potentially clinically-useful protein biomarkers. Single molecule sensitivity
would also allow
measurement of the 100+ proteins of the current clinical laboratory menu in
much smaller
samples than the 10-100 µL of plasma currently required, and therefore reduce the
need for
phlebotomy by enabling use of tiny samples such as dried blood microsamples.
The few single
molecule methodologies so far developed for practical application to protein
analytes (e.g.,
Quanterix SIMOA technology: https://www.quanterix.com/simoa-technology/) make
use of
indirect detection (e.g., using antibodies to identify analyte molecules) and
are therefore
subject to the well-known range of immunoassay interferences, non-linear
responses, and
limited multiplexability.
The ability to determine the partial or full sequence of a biomolecule confers
a further
major advantage: improved confidence in its identity. For biopolymers such as
nucleic acids
and proteins, sequence information can identify the analyte unambiguously,
thus enabling
direct analyte detection. Methods that make use of antibodies to recognize
intact protein
analytes (e.g., immunoassays) cannot provide this level of certainty, and are
classified as
indirect detection methods.
It is thus highly desirable in protein analysis to employ single-molecule
methods that
provide sequence information (18, 19) or structural information closely tied
to sequence. A
number of such methods are being developed, in most cases making use of
technology
foundations created for use in nucleic acid (e.g., DNA and RNA) analysis,
where sequence
information is the primary deliverable.
The present invention provides reagents, methods, and kits for the preparation
of
peptide libraries suitable for quantitative analysis by a variety of sequence-
sensitive single
molecule detection technologies, including, but not limited to, the following
examples:
3.9.1 Single molecule sequencing using nanopores.
Biopolymers (nucleic acids and polypeptides) can pass through nanopores (both
biological and inorganic) in suitable membranes. Signals (e.g., through-pore
ion current, or
cross-pore tunneling current) recorded during transit of the analyte molecule
through the pore
reflect differences in blockade of ions flowing through the pore by the side
chains of the
biopolymer (i.e., bases or amino acids) in the pore's throat region. Nanopore
methods of DNA
and RNA sequencing have been developed and successfully commercialized (20,
21).
Nanopore analysis of peptides and proteins is advancing rapidly (19, 22-25),
but
discrimination of 20 different amino acids presents a far greater challenge
than discrimination
of 4 nucleic acid bases. The most mature approach for nanopore analysis of
peptides is one
involving in-line linkage of peptides and nucleic acids into a hybrid polymer,
allowing
some features of a successful commercially-available DNA sequencing platform
to be applied
to peptides (e.g., international publication WO 2021/111125 A1). Similar
methods are likely
to work with a variety of alternative platforms including, but not limited to,
alternative
biological nanopores (26, 27), inorganic nanopores (28, 29), DNA-origami
nanopores a,_21
and the like.
3.9.2 Single molecule characterization using an Affinity Reagent Imaging
Platform (ARIP)
in which fluorescent BINDERs recognize molecular features.
Single protein molecules can also be arrayed in a regular pattern on a planar
surface
and probed by a succession of "promiscuous" binding agents to build up a
pattern of epitope
occurrences in each molecule (30-32). Machine learning approaches can be used
to interpret
these epitope occurrence patterns to identify most proteins produced in a
given organism
despite the stochastic nature of individual binding events. In the context of
short peptides,
rather than whole proteins, this approach does not deliver direct peptide
sequence information.
3.9.3 "FRET" fingerprinting of peptides moving through a protease
A limited but sequence-specific fingerprint of a peptide can be accomplished
by
detecting the order of fluorophores coupled to specific amino acids on a
single TARGET
peptide molecule (33). Such a technology has been developed by functionalizing
a peptide
with one type of fluorophore (Cy3) at the N-terminal site and a second type of
fluorophore
(Cy5) on an internal cysteine residue. The method then monitored the order in
which the two
fluorophores passed through Alexa488-labeled ClpP14 protease, as detected
using the
separation-dependent Förster resonance energy transfer effect ("FRET"). This
approach
provides less information than is potentially available by nanopore
sequencing.
3.9.4 Single molecule degradative sequencing.
Several methods have been developed for recognizing, detecting and removing
one
amino acid at a time from a peptide and either recording (e.g., in DNA) or
detecting (e.g., by
optical readout methods) the result, providing a single molecule version of
classical Edman
peptide sequencing. These methods have in common the need to recognize
individual terminal
amino acids, a problem that has not been completely solved to date, with the
result that only
an "approximate" version of a peptide's sequence is obtained. As described
below, even this
approximate information may be sufficient to allow use of these detection
technologies with
the invention, since the invention requires discrimination among only a
relatively small number
of different TARGET peptides. The instrument platforms associated with these
technologies
typically provide for simultaneous "sequencing" of millions to billions of
peptide molecules,
with successive amino acids decoded in successive cycles of reagent
recognition and terminal
amino acid removal. Thus any of them can be used to generate sequence data (or
"approximate" sequence data) from the peptide libraries prepared using the
invention.
3.9.4.1 Single molecule degradative sequencing by reverse translation.
The concept of "reverse-translation" of a peptide sequence into a DNA sequence
is of
course a noteworthy contradiction of the central dogma of molecular biology
(DNA makes
RNA makes protein), and no such biological system is believed to exist.
However, by coupling
the recognition of a specific terminal amino acid on an immobilized peptide
(by a recognition
molecule specific for one or more terminal amino acids) with a transfer of a
DNA code from
the recognition molecule to a nearby DNA molecule, one position of the
peptide's sequence
can be converted into a DNA code. By then removing the terminal amino acid
(e.g., by
chemical Edman degradation, or limited enzymatic attack by an exoprotease, or
other
equivalent means) and repeating the recognition and transfer process, a DNA
sequence can be
progressively generated that encodes all or part of the peptide amino acid
sequence (the peptide
being destroyed in the process: i.e., the reading process is degradative). The
DNA molecule
can subsequently be read using any of the established DNA sequencing
methodologies. In the
event that some amino acids are not clearly discriminated, or that some
recognition molecules
only recognize a class of amino acids (e.g., those having positive charge, or
those with negative
charge, or those with uncharged hydrophilic side chains, etc.), the resulting
"approximate"
sequence information may nevertheless be sufficient to recognize one peptide
sequence among
a limited set of expected alternatives. A variety of methods can be used to
immobilize millions
of individual peptide molecules and adjacent DNA molecules so as to produce a
DNA library
encoding sequence information from the original peptide library. In this
technology, peptides
are typically linked to the solid support via the c-terminal carboxyl group,
leaving the n-
terminus free. Significant progress towards reverse translation of peptide
libraries has been
reported by several groups, e.g., the "Proteocode" technology
(https://www.encodia.com/technology/, US 2021/0208150) and the "ProtSeq"
technology
((34); patent publication US 2021/01022).
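The idea that "approximate" sequence information can suffice to pick one peptide out of a limited set of expected alternatives can be sketched as follows. The residue classes, the class symbols, and the candidate peptide sequences are all illustrative assumptions chosen for this sketch, not values taken from this disclosure or from any of the cited platforms.

```python
# Assumed residue classes: '+' positive, '-' negative,
# 'p' uncharged hydrophilic, 'o' every other residue.
RESIDUE_CLASS = {
    **dict.fromkeys("KRH", "+"),
    **dict.fromkeys("DE", "-"),
    **dict.fromkeys("STNQ", "p"),
}


def approximate(sequence: str) -> str:
    """Reduce an amino acid sequence to its string of residue classes."""
    return "".join(RESIDUE_CLASS.get(aa, "o") for aa in sequence)


def identify(class_read: str, targets: list[str]) -> list[str]:
    """Targets whose N-terminal class pattern matches a partial class read."""
    return [t for t in targets if approximate(t).startswith(class_read)]


# Arbitrary illustrative target panel.
TARGETS = ["LSEPAELTDAVK", "GDVAFVK", "EGYYGYTGAFR"]
print(identify("o-oo", TARGETS))
```

In this toy panel, four degradation cycles read only at the class level already match a single candidate; a larger panel would generally need more cycles or finer classes.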
3.9.4.2 Single molecule degradative sequencing using fluorescence detection.
Other degradative single peptide molecule methods have been reported that make
use
of optical detection of fluorescent labels. In one such method, the dynamics
of binding of
recognition reagents to terminal amino acids of single peptide molecules
located in individual
wells on a semiconductor chip are sensed and interpreted as peptide sequence data
(35). In this
technology, peptides are typically linked to the solid support via the c-
terminal carboxyl group
(PCT/US2021/028471).
Another method ("fluorosequencing", (36-38)) makes use of fluorescent labels
attached to specific amino acids (e.g., cysteine SH, lysine NH2, etc.) by
chemical methods,
and records the disappearance of these fluorescent signals when a labeled
amino acid is cleaved
off during a sequence of degradative (e.g., Edman) steps.
3.10 SAMPLE BARCODING
The ability to manipulate DNA sequences, particularly to synthesize, sequence,
and
splice together DNA sequences of various lengths, enables the attachment of
designed,
recognizable sequence tags (i.e., "barcodes") to the DNA recovered from
biological samples.
Once sample DNA is barcoded, then DNA from multiple samples can be combined
for further
processing, such as next generation sequencing ("NGS"), and afterwards
attributed to the
correct original sample (i.e., demultiplexed). A variety of DNA barcode systems
have been
developed with the object of reliable identification of the original source
sample in NGS
applications. Note: this use of the term barcode (meaning a designed tag used
for labeling) is
distinct from an alternative usage applied to endogenous DNA sequences found
to be
characteristic of biological species and used to identify presence of a
species in a sample
comprised of multiple organisms.
For high-throughput DNA sequencing applications, Xu et al. devised a library
of
240,000 orthogonal 25mer DNA barcodes in 2009 (39), and continuing work by a
number of
investigators has resulted in barcode libraries of improved reliability and
error-resistance (40-
42). In these applications, where the analytical system has the ability to
directly sequence the
barcode along with the sample DNA, the barcodes can be relatively short, and
sophisticated
mathematical approaches for error-detection and correction can be employed to
reduce the
likelihood of incorrect barcode assignments due to incorrect base calls,
insertions, deletions,
etc. Sophisticated software, including machine learning, has been developed
to improve
correct DNA barcode assignment in multiplexed nanopore DNA sequencing (e.g.,
(43)).
DNA barcodes are also used in other applications where the barcodes are "read"
by
hybridization of a complementary probe that can be detected by optical or
other detection
means (44) without sequencing (e.g., using a fluorescently labeled
complementary-sequence
probe reagent detected by single molecule microscopic imaging). Such methods
have been
successfully applied for single molecule fluorescence detection of up to 1,000
different mRNA
sequences in single cell images using 16 different 30-mer readout probes in a
16-bit modified
Hamming distance 4 code (45). Such coding methods enable efficient sample
barcoding and
demultiplexing in single molecule imaging platforms.
Potential errors in barcode identification resulting from sequencing errors,
hybridization failures, etc., have been dealt with by application of
techniques derived from
information theory. A particularly effective approach has been the use of
error-correcting
codes (ECC) used in digital information storage systems (e.g., computer memory), generally
memory), generally
stemming from the work of Hamming (46). By incorporation of extra parity bits
in an encoded
signal, Hamming codes can be designed to allow detection and repair of single-
bit errors, and
detection (and in some cases repair) of two-bit errors. Given the potential
for error in barcode
readout by nucleic acid sequencing (e.g., in single molecule nanopore methods)
or
hybridization (e.g., in optical imaging applications), error detection and
correction can be
critical when single molecules are being detected and counted to determine a
quantitative
result.
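Single-bit error correction by parity bits can be illustrated with the classic Hamming(7,4) code, sketched below. This is a textbook example offered for illustration only; it corrects one flipped bit per codeword and does not detect two-bit errors (which would require an additional overall parity bit). How bits would be mapped onto barcode bases or readout probes is not addressed here, and the function names are hypothetical.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits as the codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword: list[int]) -> list[int]:
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # 1-based; 0 means no error found
    if error_position:
        c[error_position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


word = hamming74_encode([1, 0, 1, 1])
word[4] ^= 1  # simulate a single read error
print(hamming74_decode(word))
```

The three syndrome bits together spell out the position of the flipped bit, which is why a single error can be repaired rather than merely detected.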
Yet other barcoding strategies have been created using DNA barcodes to label
synthetic
chemical libraries (47) and peptide barcodes detected by mass spectrometry
(48).
3.11 MACHINE LEARNING FOR SIGNAL CLASSIFICATION.
Machine learning methods have been successfully developed that allow the
identities
and/or sequences of individual molecules to be deduced from complex signal
patterns Nucleic
acid sequences can be derived from current traces measured as DNA or RNA
molecules pass
through nanopores using highly-trained neural networks to recognize and
interpret
conductivity transitions (49). Proteins can be recognized by machine learning
based on
optically-detected stochastic binding of multiple promiscuous affinity
reagents to single
molecules (31). In general, machine learning approaches make it possible to
improve the
recognition of molecules by all the above single molecule technologies by
building
mathematical models based on large numbers of reference examples, and
incorporating more
data for each example than is practical in human-designed programs.
3.12 LIMITATIONS OF EXISTING TECHNOLOGIES AND OBJECTIVES OF THE
PRESENT INVENTION
The current dominant methods for direct detection of peptide molecules by MS,
including SISCAPA and related methods, have significant limitations. These
include A)
sensitivity limited by the performance of available mass spectrometers
(currently limited to
10-100 amol of peptide, equivalent to 6 million to 60 million molecules of a
peptide); B) low
throughput (largely due to the limited speed of typical liquid chromatography
systems
employed); C) lack of robustness of the liquid chromatography systems used to
separate
peptides and introduce them into the MS; D) level of expertise required to
operate LC-MS
systems; E) high cost of LC-MS systems and the consequent limited adoption in
clinical
laboratories; and F) impracticality of use in low-technology environments. In
addition, there is
the fundamental limitation that MS typically resolves and identifies analytes
based on one or
a few parameters that are derived from the peptide sequence (typically its
mass and the masses
of one to three of its specific fragments), but it does not typically
determine the entire peptide
sequence and is therefore susceptible to various forms of identification
error.
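The equivalence between attomole amounts and molecule numbers quoted above follows directly from the Avogadro constant, as the following short sketch shows. The function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def attomoles_to_molecules(amol: float) -> float:
    """Convert an amount in attomoles (1e-18 mol) to a number of molecules."""
    return amol * 1e-18 * AVOGADRO


for amount in (10, 100):
    print(f"{amount} amol is about {attomoles_to_molecules(amount):.2e} molecules")
```

Ten attomoles thus corresponds to roughly 6 million molecules and 100 attomoles to roughly 60 million.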
Existing methods of single molecule detection likewise have significant
limitations
restricting their application to peptide quantitation. These include A)
absence of internal
standards that could provide a quantitative reference for comparison of
samples; B) limited
dynamic range (inability to count sufficient numbers of molecules to estimate
frequency of
very low abundance targets); C) lack of efficient sample preparation protocols
to deliver
peptides for single molecule detection; and D) limited ability to recognize
many of the amino
acids in a peptide sequence (i.e., limited specificity).
Recognizing these limitations, a recent analysis (50) of the limitations of
single
molecule methods in comparison with mass spectrometry for the general
characterization of
complex proteomes concluded that the above-referenced limitations, and
specifically the
dynamic range limitation, effectively prevent current single molecule methods
from achieving a general analysis of sample proteins.
It is an object of the present invention to transcend these limitations and
others. The
invention provides significant improvements in assay sensitivity by making use
of single
molecule counting technologies instead of mass spectrometry detection, with
the potential to
make quantitative measurements at the level of hundreds to thousands of
analyte molecules
(i.e., >1,000-fold improvement compared to MS methods, including SISCAPA-MS).
The
invention provides sequence-based assay specificity through direct detection
and counting of
analyte molecules without the use of liquid-chromatography or expensive mass
spectrometer
instruments. The invention makes use of certain technologies and platforms
that have been
extensively developed for nucleic acid applications (e.g., DNA and RNA
sequencing), some
of which have been implemented commercially as small, inexpensive instruments
capable of
generating accurate results in low-technology environments. A further object
of the invention
is to significantly lower the cost of making precise measurements of protein
biomarkers, drugs
and targets, and thereby to enable expanded use of quantitative protein tests
in diagnostics and
in longitudinal health monitoring.
The invention provides methods for improved protein quantitation by adapting a
novel
specific affinity enrichment strategy to allow detection of enriched peptides
by technologies
other than mass spectrometry, specifically technologies that enable counting
individual
peptide molecules in a sequence-specific manner. In adapting the specific
affinity enrichment
strategy to these alternative detection means, significant novel changes as
described herein are
required in the selection and treatment of peptides, in the generation of
suitable internal
standards as substitutes for stable isotope labeled versions, in the
generation of sequence-
specific binding reagents, in the preparation and delivery of peptides for
single molecule
detection, and in the analysis of resulting data.
The present invention will be described with respect to particular embodiments
and
with reference to certain drawings, but the invention is not limited thereto
but only by the
claims. It is to be understood that not necessarily all aspects or advantages
may be achieved in
accordance with any particular embodiment of the invention. Thus, for example,
those skilled
in the art will recognize that the invention may be embodied or carried out in
a manner that
achieves or optimizes one advantage or group of advantages as taught herein
without
necessarily achieving other aspects or advantages as may be taught or
suggested herein.
The invention, both as to organization and method of operation, together with
features
and advantages thereof, may best be understood by reference to the following
detailed
description illustrated in the accompanying drawings. The aspects and
advantages of the
invention will be apparent from and elucidated with reference to the
embodiment(s) described
hereinafter. Reference throughout this specification to "one embodiment" or
"an embodiment"
means that a particular feature, structure or characteristic described in
connection with the
embodiment is included in at least one embodiment of the present invention.
Thus, appearances
of the phrases "in one embodiment" or "in an embodiment" in various places
throughout this
specification are not necessarily all referring to the same embodiment, but
may. Similarly, it
should be appreciated that in the description of exemplary embodiments of the
invention,
various features of the invention are sometimes grouped together in a single
embodiment,
figure, or description thereof for the purpose of streamlining the disclosure
and aiding in the
understanding of one or more of the various inventive aspects. This method of
disclosure,
however, is not to be interpreted as reflecting an intention that the claimed
invention requires
more features than are expressly recited in each claim. Rather, as the
following claims reflect,
inventive aspects lie in less than all features of a single foregoing
disclosed embodiment.
It should be appreciated that "embodiments" of the disclosure can be
specifically
combined together unless the context indicates otherwise. The specific
combinations of all
disclosed embodiments (unless implied otherwise by the context) are further
disclosed
embodiments of the claimed invention.
4 DEFINITIONS OF TERMS
Key terms used frequently herein:
BINDER
    a peptide-specific binding agent capable of binding a TARGET and its
    cognate STANDARD with equal or near-equal affinity and kinetics, for
    example an anti-peptide antibody
Tag, Flag, or Barcode
    a chemical group connected to a peptide that can be read by a detector to
    provide contextual information about the peptide molecule (e.g., identity
    as a TARGET or STANDARD, identity of a source sample, BINDER identity,
    etc.)
Target protein
    a selected protein whose abundance is to be measured in a sample
Target peptide
    a selected target peptide created by proteolytic digestion of a target
    protein
TARGET
    a selected target peptide in combination with a molecular Tag identifying
    the TARGET construct as distinct from a cognate STANDARD construct
STANDARD
    a selected target peptide, or modified version thereof, in combination
    with a molecular Tag identifying the STANDARD construct as distinct from a
    cognate TARGET construct. A STANDARD is added in a known or reproducible
    amount to a sample digest and used as an internal standard for
    quantitation of the cognate TARGET
Standardized sample digest
    a proteolytic digest of a sample to which STANDARDs cognate to TARGETs
    have been added
Enriched standardized sample digest
    a sample containing TARGETs and cognate STANDARDs enriched from a
    standardized sample digest by use of BINDERs
Flattened enriched standardized sample digest
    an enriched standardized sample in which the proportion of some TARGETs
    enriched from a standardized sample digest is greater than others,
    leading to a decrease in the relative abundance differences among TARGETs
VEHICLE
    a molecule (for example a polymer such as an oligonucleotide), to which
    enriched TARGETs and STANDARDs can be linked in order to facilitate their
    detection by a single molecule detector (for example by transporting them
    through a nanopore or locating them on a planar array for analysis)
The term "amino acid" in the context of the present disclosure is used in its
broadest
sense and is meant to include organic compounds containing amine (NH2) and
carboxyl
(COOH) functional groups, along with a side chain (e.g., an R group) specific
to each amino
acid. In some embodiments, the amino acids refer to naturally occurring L
amino acids or
residues. The commonly used one and three letter abbreviations for naturally
occurring amino
acids are used herein: A=Ala; C=Cys; D=Asp; E=Glu; F=Phe; G=Gly; H=His; I=Ile;
K=Lys; L=Leu;
M=Met; N=Asn; P=Pro; Q=Gln; R=Arg; S=Ser; T=Thr; V=Val; W=Trp; and Y=Tyr
(Lehninger, A. L., (1975) Biochemistry, 2d ed., pp. 71-92, Worth Publishers,
New York). The
general term "amino acid" further includes D-amino acids, retro-inverso amino
acids as well
as chemically modified amino acids such as amino acid analogues, naturally
occurring amino
acids that are not usually incorporated into proteins such as norleucine, and
chemically
synthesized compounds having properties known in the art to be characteristic
of an amino
acid. For example, analogues or mimetics of phenylalanine or proline, which
allow the same
conformational restriction of the peptide compounds as do natural Phe or Pro,
are included
within the definition of amino acid. Such analogues and mimetics are referred
to herein as
"functional equivalents" of the respective amino acid. Other examples of amino
acids are listed
by Roberts and Vellaccio, The Peptides: Analysis, Synthesis, Biology, Gross
and Meienhofer,
eds., Vol. 5 p. 341, Academic Press, Inc., N.Y. 1983, which is incorporated
herein by reference.
The term "analyte" may refer to any of a variety of different molecules, or
components,
pieces, fragments or sections of different molecules that one desires to
measure or quantitate
in a sample.
The term "anti-peptide antibody" (a class of specific binding agent, or
BINDER) as
used herein means a macromolecule capable of non-covalently and reversibly
binding to a
peptide in a manner that is specific to all or a portion of the peptide's
sequence. The term
includes a variety of types of macromolecules as indicated in the definition
of "antibody"
below, and is not limited to the proteins conventionally considered
antibodies.
The term "barcode" includes any distinguishing physical, chemical or sequence
characteristic of a peptide construct capable of having multiple values that
can be determined
by a single molecule detection method. Nucleic acid sequences can be used as
barcodes, for
example by providing a set of distinguishable sequences, a different one of
which can be linked
to the peptides of each sample, identifying ("decoding") the source of these
peptides after the
peptides from multiple samples are pooled for efficient processing in a single
molecule
detection system. Other forms of molecular barcodes can be used as well,
including sets of
glycan structures (which can be decoded using various specific lectins, for
example), peptides
(when these can be linked to and sequenced with the TARGET peptides); non-
biological
polymers distinguishable by length or content of alternative polymer units;
and small
molecules including colored or fluorescent dyes.
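The decoding of sample barcodes after pooling can be illustrated with a minimal Python sketch. It assumes that a single molecule detector reports each molecule as a (barcode, peptide) pair; the barcode sequences, sample names, placeholder peptide and the function name demultiplex are all hypothetical and are not taken from this disclosure.

```python
from collections import Counter, defaultdict

# Hypothetical nucleic acid barcodes mapped to their samples of origin.
SAMPLE_BARCODES = {"ACGT": "sample_1", "TGCA": "sample_2"}


def demultiplex(reads):
    """Count peptide observations per sample from (barcode, peptide) reads.

    Reads carrying an unrecognized barcode are tallied under "unassigned".
    """
    counts = defaultdict(Counter)
    for barcode, peptide in reads:
        sample = SAMPLE_BARCODES.get(barcode, "unassigned")
        counts[sample][peptide] += 1
    return counts


# Pooled reads from two samples plus one read with an unknown barcode.
pooled_reads = [
    ("ACGT", "PEPTIDEK"),
    ("TGCA", "PEPTIDEK"),
    ("ACGT", "PEPTIDEK"),
    ("GGGG", "PEPTIDEK"),
]
per_sample = demultiplex(pooled_reads)
print({sample: dict(c) for sample, c in per_sample.items()})
```

Keeping unrecognized barcodes in a separate tally, rather than discarding them silently, is one possible design choice that makes decoding losses visible.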
The term "bind" includes any physical attachment or close association, which
may be
permanent or temporary. Generally, reversible binding includes aspects of
charge interactions,
hydrogen bonding, hydrophobic forces, van der Waals forces, etc., that
facilitate physical
attachment between the molecule of interest and the analyte being measured.
The "binding"
interaction may be brief as in the situation where binding causes a chemical
reaction to occur.
Reactions resulting from contact between the binding agent and the analyte are
also within the
definition of binding for the purposes of the present invention, provided they
can be later
reversed.
The terms "BINDER", "antibody", "anti-peptide affinity reagent", "specific
affinity
reagent", "specific binding reagent", "affinity capture reagent" and "anti-
peptide antibody" as
used herein mean a reagent having the ability to reversibly bind to a specific
TARGET peptide
(and its cognate STANDARD) in a manner that is specific to all or a portion of
the peptide's
sequence. Such a BINDER will typically bind a TARGET peptide with greater
affinity, greater
kinetic on-rate or lower kinetic off-rate than a majority of the other
peptides present in samples,
sample digests, or other sources of contamination. The terms include
antibodies and fragments
thereof as well as non-naturally occurring or synthetic antigen binding
molecules. Thus,
included are IgG antibodies (polyclonal, monoclonal, oligoclonal, etc.), and
other antibody
isotypes, fragments thereof, such as Fab fragments, murine, chimeric, and
other non-human or
not fully human antibodies and fragments thereof, synthetic (non-naturally
occurring) antigen
binding formats such as single chain antibodies and bispecific antibodies, as
well as aptamers
(including DNA, RNA and other polymeric aptamers) and binding proteins built
from non-
antibody structures (e.g., nanobodies).
The term "BINDER ID" means a molecular barcode identifying the BINDER to which
a molecule bound in an enrichment step.
The term "biologic" means a drug produced by a biological mechanism, such as a
protein; i.e., a protein therapeutic, or protein drug.
The term "biomolecules" refers to any molecule present in a biological system,
and
includes proteins, nucleic acids (specifically DNA and RNA in its various
forms, both
intracellular and extracellular), complex sugars (glycans and the like),
lipids, and a variety of
metabolites.
The term "denaturant" includes a range of chaotropic and other chemical agents
that
act to disrupt or loosen the 3-D structure of proteins without breaking
covalent bonds, thereby
rendering them more susceptible to proteolytic treatment. Examples include
urea, guanidine
hydrochloride, ammonium thiocyanate, trifluoroethanol and deoxycholate, as
well as solvents
such as acetonitrile, methanol and the like. The concept of denaturant
includes non-material
influences capable of causing perturbation to protein structures, such as
heat, microwave
irradiation, ultrasound, and pressure fluctuations.
The term "click chemistry" means the use of pairs of chemical groups that
react with
each other but not with other chemical groups commonly found in biomolecules:
i.e., they are
bio-orthogonal coupling mechanisms. Commonly used click chemical pairs
include, but are
not limited to, a 3' trans-cyclooctene (TCO) group reacting bio-orthogonally
with a tetrazine
group (e.g., methyltetrazine (Me-TZ)), and bicyclononyne (BCN) reacting bio-
orthogonally
with an azide group. In some instances copper (Cu) ions serve as catalysts for
a click reaction,
and in other instances, typically involving a strained cyclic alkyne, a
catalyst is not required.
The term "clonotypic" means uniquely characteristic of a clonal product,
typically
referring to a peptide sequence unique to a specific monoclonal antibody.
The term "cognate" as used herein means a relationship between molecules in
which
either 1) the molecules each contain a region that has the same structure as
the other, or 2) the
molecules can bind together by a specific interaction. In the case of peptides
(e.g., a TARGET
and STANDARD peptide pair), cognate peptides can share a region of identical
sequence,
which may be from 2 amino acids up to the full length. The difference between
cognate
peptides can be a difference in sequence, or a difference due to attachment or
removal of some
atom(s) or groups (including one or more entire amino acids), or the addition
to the peptide of
a chemical group of any size (including oligonucleotides, peptides, "handles"
such as biotin,
and reactive groups able to subsequently bond to other molecules).
The term "cognate BINDER" or "cognate affinity capture reagent" means a
specific
affinity reagent (e.g., a specific binding reagent, BINDER) that is capable of
specifically
binding a cognate TARGET peptide and/or cognate STANDARD, in the sense that
the cognate
affinity capture reagent is designed, generated or selected to have a specific
affinity for an
epitope comprising part or all of its cognate peptide sequence.
The term "degradative sequencing technology" means a technology in which
peptide
molecules are disassembled one amino acid at a time (or in some cases two
amino acids at a
time), typically from one end, and the terminal amino acid identified, e.g.,
as one of the 20
common amino acids found in proteins, or as one of a subset of amino acids. In
some cases,
the identification can be obtained directly by optical or electrical readout,
and in some cases
the amino acid identity is translated into another molecular form (e.g., DNA)
for later readout
using a different technology.
The terms "drug" and "therapeutic" mean a type of molecule that may, under
appropriate circumstances of dosing and timing, interact with components of a
subject's body
to modify biological processes, including disease processes, normal processes,
aging and the
like. A drug may be a small molecule such as aspirin, or a macromolecule such
as a protein
(e.g., insulin), a nucleic acid (such as an anti-sense drug), or carbohydrate
(such as heparin).
Drugs that comprise or are derived from monoclonal antibodies represent a
growing class of
therapeutic agents with particular advantages in terms of extreme specificity
for endogenous
protein and other targets involved in disease processes.
The term "electrospray ionization" (ESI) refers to a method for the transfer
of analyte
molecules in solution into the gas and ultimately vacuum phase through use of
a combination
of liquid delivery to a pointed exit and high local electric field.
The term "elution" means the release of a bound peptide or construct from a
BINDER.
The term "flag" is used herein as equivalent to Barcode, and may be any type
of
distinguishing molecular feature including, but not limited to, a polymer of
dissimilar subunits
encoding an identification relevant to sample analysis.
The terms "Förster resonance energy transfer" or FRET refer to energy transfer
between two light-sensitive molecules (chromophores, typically fluorescent
molecules). A
donor chromophore, initially in its electronic excited state, may transfer
energy to an acceptor
chromophore through nonradiative dipole-dipole coupling. The efficiency of
this energy
transfer is inversely proportional to the sixth power of the distance between
donor and acceptor,
making FRET extremely sensitive to small changes in distance, generally on
scales of 1 to 10
nm.
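The distance dependence described above is commonly written as E = 1 / (1 + (r/R0)^6), where R0 (the Förster radius, the distance at which efficiency is 50%) is a parameter not named in this disclosure. The following Python sketch assumes a hypothetical R0 of 5 nm purely for illustration; the function name fret_efficiency is likewise an assumption.

```python
def fret_efficiency(r_nm: float, r0_nm: float = 5.0) -> float:
    """FRET efficiency for a donor-acceptor distance r_nm.

    r0_nm is the Forster radius; the 5 nm default is an assumed,
    illustrative value and depends on the actual dye pair.
    """
    return 1.0 / (1.0 + (r_nm / r0_nm) ** 6)


# Efficiency falls steeply as the distance passes through R0.
for r in (2.5, 5.0, 7.5):
    print(f"r = {r} nm -> E = {fret_efficiency(r):.3f}")
```

With these assumed values the efficiency drops from above 0.98 at half the Förster radius to below 0.09 at 1.5 times the Förster radius, which illustrates the sensitivity to small distance changes on the 1 to 10 nm scale.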
The term "immobilized enzyme" means any form of enzyme that is fixed to the
matrix
of a support by covalent or non-covalent interaction such that the majority of
the enzyme
remains attached to the support of the membrane.
The term "ligation" as used herein means the joining of an end of a polymer
chain (such
as a nucleic acid) to an end of another polymer chain to form a combined
linear polymer. The
term includes joining by enzymatic means (such as that of a DNA ligase,
splicing means such
as CRISPR, and other well-known molecular biology techniques for joining and
splicing
nucleic acid sequences) and chemical means (such as the use of click
chemistry).
The term "Linkage" means a connection between originally separate molecules,
and
includes common covalent connections between units found in biopolymers and
man-made
polymers, as well as connections made using chemistries such as the well-known
"click"
chemistries, reactions such as those between amino groups and NHS esters, and
formation of
sugar-phosphate bonds when oligonucleotides are ligated together, as well as
strong but non-
covalent connections such as the interaction between biotin and streptavidin.
The related term
"Linker" means a segment of a molecule comprising an atomic configuration
capable of, or
arising from, formation of a linkage between two or more initially separate
molecules.
The term "MALDI" means Matrix Assisted Laser Desorption Ionization and related
techniques such as SELDI, and includes any technique that generates charged
analyte ions
from a solid analyte-containing material on a solid support under the
influence of a laser or
other means of imparting a short energy pulse.
The term "Mass spectrometer" (or "MS") means an instrument capable of
separating
molecules on the basis of their mass m, or m/z where z is molecular charge,
and then detecting
them. In one embodiment, mass spectrometers detect molecules quantitatively.
An MS may
use one, two, or more stages of mass selection. In the case of multistage
selection, some means
of fragmenting the molecules is typically used between stages, so that later
stages resolve
fragments of molecules selected in earlier stages. Use of multiple stages
typically affords
improved overall specificity compared to a single stage device. Often,
quantitation of
molecules is performed in a triple-quadrupole mass spectrometer using the
method referred to
as "Multiple Reaction Monitoring" or "MRM mass spectrometry", in which measured
molecules
are selected first by their intact mass and secondly, after fragmentation, by
the mass of a
specific expected molecular fragment. However, it will be understood herein
that a variety of
different MS configurations may be used to analyze the molecules described.
Possible
configurations include, but are not limited to, MALDI instruments including
MALDI-TOF,
MALDI-TOF/TOF, and MALDI-TQMS, and electrospray instruments including ESI-TQMS
and ESI-QTOF, in which TOF means time of flight, TQMS means triple quadrupole
MS, and
QTOF means quadrupole TOF.
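The two-stage selection described above (first by intact mass, then by fragment mass) can be sketched in Python. This is a simplified illustration, not a model of any particular instrument: it assumes positive ions formed by proton addition, and the function names, tolerance and example masses are hypothetical.

```python
PROTON_MASS = 1.007276  # Da, approximate mass of a proton


def mz(neutral_mass: float, charge: int) -> float:
    """m/z of a positive ion formed by adding `charge` protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (neutral_mass + charge * PROTON_MASS) / charge


def passes_transition(precursor_mz, fragment_mz, transition, tol=0.5):
    """True if an ion matches both stages of a (precursor, fragment) pair.

    `transition` is a tuple of expected precursor and fragment m/z values;
    `tol` is an assumed selection half-width in m/z units.
    """
    expected_precursor, expected_fragment = transition
    return (abs(precursor_mz - expected_precursor) <= tol
            and abs(fragment_mz - expected_fragment) <= tol)


# Hypothetical peptide of neutral mass 1000 Da, doubly charged,
# monitored with a hypothetical singly charged fragment of 600 Da.
transition = (mz(1000.0, 2), mz(600.0, 1))
print(passes_transition(501.0, 601.0, transition))
```

An ion is accepted only if both the precursor and the fragment fall within tolerance, which is why multistage selection typically affords better specificity than a single stage.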
The terms "molecular tag", "molecular flag", or "molecular feature" mean a
structural
component of a molecular construct that can be detected by a single molecule
detector and
assigned a significance in the interpretation of counted molecules (e.g.,
distinction between
TARGET and STANDARD tags, barcodes identifying samples, barcodes identifying
BINDERs, etc.)
The terms "particle" or "bead" mean any kind of particle in the size range
between
10 nm and 1 cm, and includes magnetic particles and beads.
The term "peptide library preparation" means a method used to convert the
proteins in
a biological sample into a collection of peptides modified so as to be
detectable, identifiable
and countable by a sequence sensitive single molecule detector.
The term "proteolytic treatment" or "proteolytic enzyme" may refer to any of a
large
number of different enzymes, including trypsin, chymotrypsin, LysC, ArgC,
AspN, GluC, V8
and the like, as well as chemicals, such as cyanogen bromide, that, in the
context of the methods
described herein, act to cleave peptide bonds in a protein or peptide in a
sequence-specific
manner, generating a collection of shorter peptides (a digest).
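Sequence-specific cleavage can be illustrated with a minimal in silico digest in Python. The sketch assumes the commonly used trypsin rule (cleavage on the C-terminal side of lysine or arginine, except when the next residue is proline) and ignores missed cleavages; the function name and the example sequence are hypothetical.

```python
def tryptic_digest(protein: str) -> list:
    """Split a protein sequence after K or R unless followed by P."""
    peptides = []
    start = 0
    for i, residue in enumerate(protein):
        next_residue = protein[i + 1] if i + 1 < len(protein) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        # Keep the C-terminal remainder that does not end in K or R.
        peptides.append(protein[start:])
    return peptides


print(tryptic_digest("AAKGGRPCCKDD"))
```

In this example the arginine followed by proline is not cleaved, so the digest contains three peptides rather than four.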
The term "proteotypic peptide" means a peptide whose sequence is unique to a
specific
protein in an organism, and therefore may be used as a stoichiometric
surrogate for the protein,
or at least for one or more forms of the protein in the case of a protein with
splice variants.
The term "sample" means any complex biologically-generated sample derived from
humans, other animals, plants or microorganisms, or any combinations of these
sources.
"Complex digest" means a proteolytic digest of any of these samples resulting
from use of a
proteolytic treatment.
The term "SAMPLE ID" means a molecular barcode identifying the sample from
which a molecule was obtained, i.e., its sample of origin. A sample barcode
present in a
construct identifies the sample of origin and allows this identity to be
recovered after constructs
from multiple samples have been pooled and analyzed together in a single
molecule detector.
The terms "ratchet mechanism", "protein nanomachine" or "molecular motor" mean
a
molecular-scale device capable of pulling, pushing, unzipping (in the case of
complementary
strands of nucleic acids), or otherwise regulating the motion of linear
molecules in discrete
steps.
The term "sequence-sensitive single molecule detection" or "SSSMD" means
detection
and counting of individual molecules using a method capable of differentiating
between
different linear biopolymer sequences occurring in the molecules. A "sequence-
sensitive
single molecule peptide detector" means a detector, instrument, technology,
chemistry, or
multi-component system that is able to achieve sequence-sensitive single
molecule detection
of peptides. Such a detector need not achieve 100% accuracy to accomplish the
objectives of
the invention, since the number of different peptide sequences that must be
distinguished from
one another and counted in the invention is a small number (e.g., 1, 1-5,
1-10, 5-20, 10-50, 25-100, 50-200, or more peptides) compared to the number of
peptides present in a
digest of a
complex biological sample (typically hundreds of thousands of peptides in the
digest of a
sample such as blood plasma). The term includes nanopore-based sequencing of
nucleic acids,
proteins and peptides; fluorescence-based methods such as fluorosequencing
(36-38)
including Edman methods; "reverse-translation" of peptide sequences into DNA
sequences
followed by DNA sequencing (the "Proteocode" technology developed by Encodia:
https://www.encodia.com/technology); "FRET" fingerprinting of peptides (36,
51); single
molecule imaging methods (31) and other related methods.
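The point that a detector need not be fully accurate when only a small panel of sequences must be distinguished can be illustrated with a hedged Python sketch. It assumes that a detector returns a read of the same length as the peptide, with unrecognized residues reported as "X"; the panel sequences, the mismatch threshold and the function names are hypothetical.

```python
# Hypothetical panel of expected peptides (TARGETs and STANDARDs).
PANEL = ["AAKGGR", "LLSYVK", "GDTFEK"]


def mismatches(read: str, candidate: str):
    """Count positions where a called residue disagrees with the candidate.

    Positions reported as "X" (unrecognized) are ignored. Returns None
    when the lengths differ, since this sketch compares position by position.
    """
    if len(read) != len(candidate):
        return None
    return sum(1 for r, c in zip(read, candidate) if r != "X" and r != c)


def assign(read: str, panel=PANEL, max_mismatches: int = 1):
    """Return the single panel peptide consistent with the read, else None."""
    hits = []
    for candidate in panel:
        m = mismatches(read, candidate)
        if m is not None and m <= max_mismatches:
            hits.append(candidate)
    return hits[0] if len(hits) == 1 else None


print(assign("LXSYXK"))  # two unrecognized residues, still unambiguous
print(assign("XXXXXK"))  # too little information, left unassigned
```

Reads that remain consistent with more than one panel member are left unassigned rather than guessed, so that ambiguous molecules do not distort the counts.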
The terms "sequencing nanopore" and "nanopore" as used herein refer to ion-
conductive pores capable of functioning in an ion-impermeable membrane or
vessel wall, and
through which linear polymers can pass. Typical nanopores are of biological
origin (e.g.,
MspA), comprising one or more protein molecules, or created by engineering
(e.g., versions
of biological nanopores modified by mutation, rearrangement or combination of
proteins; very
small holes etched or drilled in thin metallic or ceramic substrates; or DNA
assemblies). A
recording of the current flowing through a nanopore over time is referred to
as a "trace" or
"squiggle".
The term "sequential degradation" refers to a process in which amino acid
residues are
removed, in sequence order, from one terminus of a peptide. In the context of
the invention,
sequential degradation can be employed in a process in which a peptide's
terminal amino acid
is "recognized" (e.g., by binding of one of a series of affinity reagents
specific for the various
amino acids presented at the terminus) and its identity determined or recorded
for later
evaluation, after which the terminal amino acid can be cleaved off (e.g.,
using enzymes such
as exoproteases, classical Edman chemistry, or other chemistries capable of
removing a
terminal amino acid) and the process repeated to determine a sequence of amino
acids from
the peptide's terminus. Similarly, a process can employ recognition reagents
that report
information on two or more terminal amino acids at a time, and a cleavage
process can be
employed that removes two or more terminal amino acids per cycle. The process
need not
sequence all amino acids in a peptide to generate TARGET peptide or STANDARD
identifications and single molecule counts that are useful in the invention.
The term "SISCAPA" means the method described in US Patent No. 7,632,686, and
in
Mass Spectrometric Quantitation of Peptides and Proteins Using Stable Isotope
Standards and
Capture by Anti-Peptide Antibodies (SISCAPA) (Journal of Proteome Research 3:
235-44
(2004).)
The term "small molecule" or "metabolite" means a multi-atom molecule other
than
proteins, peptides and DNA; the term can include but is not limited to amino
acids, steroid and
other small hormones, metabolic intermediate compounds, drugs, drug
metabolites, toxicants
and their metabolites, and fragments of larger biomolecules.
The term "stable isotope" means an isotope of an element naturally occurring
or
capable of substitution in proteins or peptides that is stable (does not decay
by radioactive
mechanisms) over a period of a day or more. The primary examples of interest in
the context
of the methods described herein are C, N, H, and O, of which the most commonly
used are
13C and 15N.
The term "solubilized tissue sample" means a liquid sample generated from a
sample
of a solid biological tissue (e.g., liver, brain, skin, etc.) by a method that
results in a solution
containing tissue molecules. Depending on the method of solubilization, a
solubilized tissue
sample may contain one, a few, many or almost all tissue molecules in
solution. Tissue
solubilization can be achieved by a variety of methods including grinding,
pulverization,
ultrasonication, homogenization, and similar mechanical methods, as well as
exposure to liquid
solutions including detergents, solvents, protease inhibitors, salts, buffers,
and the like.
The terms "STANDARD", "internal standard peptide", "internal standard",
"labeled
TARGET", or "labeled TARGET peptide" may be any altered version of the
respective
TARGET fragment or TARGET peptide that is 1) bound by the appropriate BINDER
with an
affinity and kinetics very similar to that with which the cognate TARGET
fragment or
TARGET peptide is bound, and 2) differs from it in a manner that can be
distinguished from
the cognate TARGET peptide by a sequence-sensitive single molecule peptide
detector (e.g.,
by means of some sequence difference, amino acid modification, inclusion of a
non-natural
chemical group), or a mass spectrometer (either through direct measurement of
molecular mass
or through mass measurement of fragments, e.g., through MS/MS analysis), or by
another
equivalent means. In the case of a nanopore detector, for example, a suitable
TARGET peptide
and its STANDARD would produce distinguishable ion current signatures while
passing
through the nanopore.
The term "STANDARD tag" or "STANDARD flag" means a molecular tag or feature
within or attached to a STANDARD peptide enabling a single molecule detector
to distinguish
the STANDARD tag from a TARGET tag. The STANDARD tag may be the absence of any
TARGET tag. A STANDARD tag may consist of the absence of a feature present in
a cognate
TARGET tag, the presence in the STANDARD tag of a feature absent in the
cognate TARGET
tag, or the presence of different features in the STANDARD tag and TARGET tag.
Multiple
different STANDARD tags may be used, provided that the STANDARD tags are
distinguishable from any TARGET tags.
The term "standardized sample digest" or "standardized sample" means a protein
or
peptide sample to which one or more STANDARD version(s) of one or more TARGET
peptide
or protein analytes have been added in an amount that is a) known (in terms of
concentration,
mass, moles or other physical units) or b) consistent between samples
(allowing quantitative
comparison of TARGET peptide amounts between samples even if the absolute
amount of the
STANDARD added is not known). Once the sample is standardized with respect to
a given
TARGET peptide, then the ratio between TARGET peptide and STANDARD represents
and
preserves information concerning the amount of the TARGET peptide in the
sample, allowing
this information to be recovered by later quantitative analysis even if a
variable amount of the
TARGET peptide and STANDARD pair is recovered during a suitable enrichment
process
(i.e., a process that does not distinguish between the TARGET peptide and
STANDARD
peptides) prior to analysis.
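The way the TARGET-to-STANDARD ratio preserves quantitative information despite variable recovery can be sketched in Python. The counts, the amount of STANDARD added and the function name are hypothetical; the sketch assumes that enrichment recovers TARGET and STANDARD with equal efficiency, as the definition requires.

```python
def estimate_target_amount(target_count, standard_count, standard_added):
    """Estimate the TARGET amount from single molecule counts.

    The result is expressed in the same units as `standard_added`.
    """
    if standard_count <= 0:
        raise ValueError("no STANDARD molecules counted; ratio is undefined")
    return target_count / standard_count * standard_added


# Hypothetical sample in which 10 fmol of STANDARD was added and the
# TARGET is present at twice that amount.
standard_added_fmol = 10.0
high_recovery = estimate_target_amount(2000, 1000, standard_added_fmol)
low_recovery = estimate_target_amount(200, 100, standard_added_fmol)
print(high_recovery, low_recovery)
```

Both calls return the same estimate even though ten times fewer molecules were counted in the second case, because the ratio, not the absolute count, carries the quantitative information.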
The term "stoichiometric" refers to relationships between quantities of
different
molecules. In some chemical contexts, the word stoichiometry refers to
presence of different
elements or compounds in simple integral ratios, as prescribed by an equation
or formula. Thus
a TARGET peptide sequence that occurs once in the sequence of a parent protein
target has a
1:1 stoichiometric relationship with the target, and can therefore be used as
a quantitative
surrogate to measure amounts of the protein. In the more general sense used in
this disclosure,
stoichiometry means a ratio relationship between molecules (or elements) that
may have any
numerical value, including non-integer values. In biological samples, two
different proteins in
blood or in a cell can have a relative stoichiometry extending over a very
broad range, in
principle from one molecule (the lower limit) of a low abundance protein to
hundreds of
billions of molecules (or more) for a high abundance protein in the same
sample.
The terms "stoichiometric flattening", "normalization", "equalization" or
"differential
enrichment" refer to processes by which different molecules (e.g., peptides)
that are present in
a sample (e.g., a biological sample digest) at different concentrations (or in
different amounts
in mass or molar terms) are brought closer to equal concentrations or amounts.
An example
of such a process is an affinity enrichment method in which a larger relative
fraction of a low
abundance molecule is captured while a smaller relative fraction of a higher
abundance
molecule is captured (e.g., by adjusting the amounts of the corresponding
affinity reagents,
such as antibodies, used to accomplish this capture), the captured molecules
being then
separated from the sample and released from capture, resulting in a more
nearly equivalent
amount of the molecules in the processed sample. In order to preserve
information as to the
relative abundances of the molecules in the original sample, internal standard
versions of the
molecules (e.g., STANDARD versions of TARGET peptides) are added before this
enrichment
step, and both TARGET peptide and STANDARD are measured in the resulting enriched
sample.
Given an estimate of the amount of STANDARD added to the sample, the ratio of
TARGET
peptide to STANDARD can be used to calculate the TARGET peptide abundance in
the
original sample.
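The ratio calculation described above can be sketched as follows. This is a minimal illustration, assuming the detector output has already been reduced to two integer counts; the function name, the units (fmol) and the example numbers are hypothetical and do not come from the disclosure.

```python
def target_amount(standard_amount_added, target_count, standard_count):
    """Estimate the TARGET amount in the original sample.

    The result is in the same units as standard_amount_added.
    Assumes TARGET and STANDARD are recovered with equal efficiency.
    """
    if standard_count <= 0:
        raise ValueError("no STANDARD molecules counted; ratio is undefined")
    return standard_amount_added * target_count / standard_count


# Hypothetical example: 100 fmol of STANDARD added to the digest,
# 1200 TARGET and 400 STANDARD molecules counted after enrichment.
print(target_amount(100.0, 1200, 400))  # 300.0

# If only a tenth of each species is recovered, the ratio (and so the
# estimate) is unchanged, which is why variable recovery is tolerated.
print(target_amount(100.0, 120, 40))    # 300.0
```

The second call shows why the STANDARD must be added before enrichment: any loss that affects both species equally cancels in the ratio.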
The terms "Subject" or "Patient" mean a biological individual such as an
individual
human being or an animal.
The terms "TARGET" or "TARGET peptide" mean a peptide chosen as a TARGET
fragment of a protein or peptide. The TARGET may be any piece of a protein or
peptide which
can be produced by a reproducible fragmentation process (e.g., digestion using
a proteolytic
enzyme, or without a fragmentation if the TARGET fragment is the whole
analyte) and whose
abundance or concentration can be used as a surrogate for the abundance or
concentration of
the analyte.
The term "TARGET tag" or "TARGET flag" means a molecular tag or feature within
or attached to a TARGET peptide enabling a single molecule detector to
distinguish the
TARGET tag from a STANDARD tag. The TARGET tag may be the absence of any
STANDARD tag. A TARGET tag may consist of the absence of a feature present in
a cognate
STANDARD tag, the presence in the TARGET tag of a feature absent in the
cognate
STANDARD tag, or the presence of different features in the TARGET tag and
STANDARD
tag. Multiple different TARGET tags may be used, provided that the TARGET tags
are
distinguishable from any STANDARD tags.
The term "tag" is used herein as equivalent to Barcode, and may be any type of
distinguishing molecular feature including, but not limited to, a polymer of
dissimilar subunits
encoding an identification relevant to sample analysis.
The term "T/S tag" or "T/S flag" means either a TARGET or a STANDARD tag, or a
set of mixed TARGET and STANDARD tags, as will generally be present in a
standardized
sample.
The term "VEHICLE" means a molecule (for example a polymer such as an
oligonucleotide or a polyethylene glycol, a linker comprising chemically
reactive sites such as
NHS or click chemistry groups, or a macromolecular carrier such as a bead or a
"SNAP"
particle), to which TARGET and STANDARD peptides (together with their
associated
distinguishing tags) can be linked in order to facilitate single molecule
detection. A VEHICLE
can include barcodes identifying a sample of origin and barcodes identifying a
BINDER used to
enrich specific cognate TARGET and STANDARD peptides. VEHICLES can also
include
one or more additional molecular structures that facilitate the transport of
TARGET and
STANDARD peptides to a single molecule detector, their presentation to such a
detector, their
transport through a detector (such as a nanopore), or their immobilization to
a site or in a region
observed by such a detector.
The use of the singular herein in any instance (e.g., "a" construct or "a"
peptide), unless
otherwise indicated, is intended to mean one or more and is not intended to be
limited to only
one.
THE PRESENT INVENTION
The inventions herein provide improved quantitative measurement of proteins,
and
peptides derived from them, through improvements to previous methods including
the
replacement of mass spectrometric detection by other detection techniques
capable of
identifying and counting individual molecules.
In the descriptions that follow, quantitation of proteins, peptides and other
biomolecules is addressed in a general sense, and hence the invention
disclosed is in no way
limited to the analysis of blood, plasma and other body fluids.
5.1 BRIEF DESCRIPTION OF THE DRAWINGS:
Figure 1: Examples of abundances, TARGET peptide and STANDARDs for proteins
in human plasma
Figure 2: Adding Amino Acids to a TARGET to Create a STANDARD
Figure 3: Rope-Tow Constructs with STANDARD and TARGET Sequence Tags
Figure 4: Two-Step Rope-Tow Constructs with Double Tags
Figure 5: TARGET/STANDARD Barcoding Followed by Enrichment
Figure 6: Two-step Digestion for Differential Modification of Peptide Ends
Figure 7: Rope-Tow Constructs with Sequencing Adapters: Ligation of rope-tow
constructs with nanopore sequencing adapters
Figure 8: Multi-Epitope Binders
Figure 9: DNA Sample Barcodes (16-bit example)
Figure 10: Decoding Peptide-A and Sample Barcodes
Figure 11: Identifying Peptides Using Epitopes
Figure 12: Assembly of a Peptide:Oligo Construct for Nanopore Detection
Figure 13: Details of a Peptide:Oligo Construct for Nanopore Detection
Figure 14: Scheme for double ligation of tryptic peptides with c-terminal
lysine (i.e.,
two amino groups) using "Click" chemistry
Figure 15: Click Ligation of Tryptic (Lys) Peptides to Motor Assemblies
Figure 16: Peptide Loop Insertion with Enzymatic Cut: Insertion of TARGET
Peptide
Into Oligo VEHICLE as a Loop, Followed by Sequence-Specific Enzymatic Oligo
Cleavage
Figure 17: Peptide Loop Insertion with Chemical or UV Cleavage
Figure 18: Peptide Loop Preparation of TARGET and STANDARD
Figure 19: Parallel TARGET and STANDARD Constructs
Figure 20: Specific Affinity Capture of TARGET and STANDARD
Figure 21: Peptide Loop: Effect of Failure to Insert Fully
Figure 22: Nanopore sequencing of constructs
Figure 23: Rope-tow Constructs
Figure 24: Concatenation of constructs by hybridization and ligation
Figure 25: Constructs Prepared Using Bi-functional Supports
Figure 26: Example Rope-Tow Construct: Detailed example structure of a rope-
tow
peptide:oligo construct
Figure 27: Concatenation of Tryptic (Lys) Peptides Using Click Chemistry
Figure 28: Concatenated constructs
Figure 29: Rope-Tow Ligation with Splints
Figure 30: Ligation of Double-Tag Rope-Tow Constructs
Figure 31: Analysis of Detection Events in Affinity Imaging Detection
Figure 32: Identifying Peptides: Before and After Cleavage
Figure 33: Stoichiometric Flattening
Figure 34: Equalization: Stoichiometric Flattening
Figure 35: Multiplex Test Panel for SARS-CoV-2: A multiplex combination of
molecules detectable using nanopore sequencing, including a) peptides from the
SARS-CoV-
2 NCAP protein; b) SARS-CoV-2 Spike and NCAP protein linear epitopes whose
binding by
patient antibodies indicates vaccination or exposure to the virus; c)
proteotypic peptides of
three proteins used as plasma biomarkers of inflammation; and d) the RNA
genome of
SARS-CoV-2.
Figure 36: Design and Nanopore Analysis of Loop-Insertion Constructs
Figure 37: Nanopore Traces of Loop-Insertion Constructs
Figure 38: Preparation of Constructs for Reverse Translation Detection
Figure 39: Assembly of a Peptide:Oligo Construct for Reverse Translation
Detection
6 DETAILED DESCRIPTION OF THE INVENTION:
6.1 SUMMARY OF THE INVENTION.
1. A molecular construct and vehicle comprising:
(a) a molecular construct comprising a peptide comprising a target peptide
sequence
derived from proteolytic cleavage of a target protein and a molecular tag
defining the
source of said peptide,
and
(b) a vehicle capable of presenting the construct for analysis by a sequence-
sensitive
single molecule detector.
2. The molecular construct and vehicle of paragraph 1, wherein the
molecular tag is a target
tag that identifies the peptide as a peptide created by proteolytic digestion
of a biological
sample.
3. The molecular construct and vehicle of paragraph 1, wherein the peptide
comprises a
synthetic peptide and the molecular tag is a standard tag that identifies the
synthetic
peptide as an internal standard.
4. The molecular construct and vehicle of paragraph 1, wherein the vehicle
is capable of
binding said construct either to a support or to a soluble adapter capable of
presenting the
construct for analysis by a sequence-sensitive single molecule detector.
5. The molecular construct and vehicle of paragraph 2, wherein more than 90% of
the target molecules present in said sample digest are linked to target tags.
6. The molecular construct and vehicle of paragraph 2, further comprising a
SAMPLE
barcode identifying the sample of origin.
7. The molecular construct and vehicle of paragraph 1 further comprising a
BINDER
barcode identifying a binder to which the construct has been bound.
8. The molecular construct and vehicle of any of the preceding paragraphs
wherein the
barcode or the tag is an oligonucleotide.
9. The molecular construct and vehicle of any of the preceding paragraphs
wherein the
sequence-sensitive single molecule detector comprises a nanopore, a single
molecule
imaging system, or a single molecule degradative peptide sequencer.
10. A plurality of reagents, comprising:
the molecular construct and vehicle of paragraph 3,
a tag reagent capable of reacting with a target peptide in a proteolytic
digest of a
biological sample to create the molecular construct of paragraph 2, and
a binder that binds to said molecular constructs of paragraph 2 and paragraph
3 with
similar affinity and kinetics.
11. The plurality of reagents of paragraph 10, wherein the binder contacting a
standardized
sample digest comprising the molecular construct and vehicle of paragraph 2
and
paragraph 3 binds the molecular construct of paragraph 2 and paragraph 3 in a
ratio equal
within 2%, 5%, 10% or 20% to the ratio in which they are present in said
standardized
sample digest.
12. The plurality of reagents of paragraph 10, further comprising:
one or more reagents capable of proteolytic fragmentation of sample proteins,
and
one or more solid supports for binders, including magnetic beads, non-magnetic
beads,
porous supports usable in packed columns, or chemical reagents capable of
introducing
reactive groups into peptides.
13. A calibrator sample for peptide quantitation by a sequence-sensitive
single molecule
detector, comprising an amount of the molecular construct and vehicle of
paragraph 2 and
paragraph 3 in a known ratio.
14. The calibrator sample of paragraph 13, wherein at least one of the
constructs is present in
known amount or concentration.
15. A standardized sample digest derived from a proteolytic digest of a
biological sample,
comprising:
an amount of a molecular construct comprising a target tag and a target
peptide, said
construct being a target peptide construct and
an amount of a molecular construct comprising a standard tag and a peptide
whose
sequence is the same or similar to the sequence of said target peptide, said
construct being
a standard peptide construct,
wherein the target peptide is generated by proteolytic digestion of a target
protein in said
biological sample,
wherein said target and standard tags can be distinguished by a single
molecule detector
and comprise chemical or structural groups covalently joined to peptides in
their respective
constructs,
wherein said target tag is covalently attached to a plurality of the peptides
present in said
sample digest,
wherein said target peptide construct comprises more than 90% of the target
peptide
molecules present in said sample digest and
wherein said standard peptide construct is prepared separately and added to
said digest in
a known amount, or in a consistent relative amount across a multiplicity of
samples.
16. The standardized sample digest of paragraph 15, wherein the number of
molecules of the
standard peptide construct added to the sample digest differs by no more than
a factor of
100 from the number of molecules of the target peptide construct in said
sample digest.
17. The standardized sample digest of any one of paragraphs 15-16, further
comprising one or
more additional standard peptide constructs having a different standard tag
from each other
and with each construct at a different relative abundance.
18. The standardized sample digest of any one of paragraphs 15-17, wherein the
target tag is
covalently attached to a majority of the peptides generated by proteolytic
digestion of said
sample.
19. The standardized sample digest of any one of paragraphs 15-18, wherein
said tags are
oligonucleotides.
20. An enriched standardized sample digest, comprising a bound fraction of the
standardized
sample digest of any one of paragraphs 15-16 bound by a binder, wherein said
bound
fraction comprises a target peptide construct and a standard peptide construct
in a ratio
equal within 2%, 5%, 10% or 20% to the ratio in which they are present in said
standardized
sample digest.
21. A stoichiometrically-flattened standardized sample, comprising a plurality
of pairs of
cognate standard and target peptide constructs enriched from a standardized
proteolytic
digest of a biological sample by binding to their respective cognate binders,
wherein
a pre-enrichment ratio calculated by dividing the number of molecules of a
first target
peptide construct that is the most numerous of said target peptide constructs
in the
standardized sample digest by the number of molecules of a second target
peptide construct
that is the least numerous of said target peptide constructs in the
standardized sample digest
is more than 10 times larger than a post-enrichment ratio calculated by
dividing the number
of molecules of said first target peptide construct by the number of molecules
of said
second target peptide construct in said enriched sample.
22. A method for measuring the amount of a selected target protein in a
biological sample,
comprising:
proteolytically digesting said sample,
modifying a plurality of peptides in the digested sample by adding a target
tag to form a
plurality of constructs comprising a selected target peptide derived from, and
proteotypic
of, said target protein, said plurality of constructs being target construct
molecules,
adding an amount that is known and/or consistent between a set of samples of a
prepared
standard peptide construct that is a cognate of said selected target peptide
construct and
comprises a standard tag, forming a standardized digest,
enriching said cognate target and standard peptide constructs by contacting
said
standardized digest with a cognate binder, forming bound constructs,
separating said bound constructs from unbound constructs to form enriched
constructs,
releasing said enriched constructs from said binder,
linking said enriched constructs to a vehicle capable of presenting said
enriched
constructs to a sequence-sensitive single molecule detector,
counting said enriched target construct molecules and said enriched standard
construct
molecules using a sequence-sensitive single molecule detector capable of
distinguishing
said target and standard tags and identifying said peptides,
calculating the amount of said protein in said sample.
23. The method of claim 22, wherein the calculating is performed by
multiplying the amount
of standard construct added by the ratio of the number of target construct
molecules
counted to the number of standard construct molecules counted by said
detector.
24. The method of claim 22 or 23, wherein, independently or in any
combination:
• said binder is attached to a support,
• said linking of said vehicle to said constructs occurs while said constructs are
bound to said binder,
• said presenting of enriched constructs to said detector occurs while said constructs
are bound to said binder,
• peptides are bound to the binder, the binder washed, and the peptides eluted in
two or more successive cycles of enrichment,
• said proteolytic digestion comprises at least two sequential steps resulting in
peptide cleavage at different sites, and wherein peptides are covalently modified
between two such steps (or wherein said first sequential step cleaves at lysine
residues),
• said tags are added to said peptides by reaction with peptide amino groups,
• said tags are added to said peptides by chemical reaction at a single site in an
n-terminal amino acid,
• said tags are added to said peptides by chemical reaction at a c-terminal lysine
residue,
• said constructs comprise a non-peptidic component attached to an n-terminal
amino acid and a different non-peptidic component attached to a c-terminal lysine
residue,
• said proteolytic digestion comprises at least two sequential steps resulting in
peptide cleavage at different sites, and wherein peptides retain an unmodified
n-terminal amino group when presented to said detector,
• a sample barcode is linked to said constructs encoding the identity, or relative
position within a sample set, of said standardized samples; a plurality of said
standardized samples is pooled; said sample barcodes associated with construct
molecules are read using a sequence-sensitive single molecule detector; and the
counts of target and standard construct molecules for each sample are separated
based on said sample ID barcode identifying the sample from which they were
enriched, and wherein said barcode may be an oligonucleotide,
• a binder barcode is linked to said constructs identifying the binder by which they
were enriched, and wherein said barcode may be an oligonucleotide,
• said construct molecules are joined together into concatamers prior to presentation
to said detector,
• said detector determines all or part of the amino acid sequence of the peptide
components of said construct molecules by a stepwise degradative process (or
wherein the sequence of said target peptide is encoded in a nucleic acid
component linked to target tags, standard tags, sample tags, binder tags and
vehicles, and is read, and counted, by conventional DNA sequencing),
• said detector recognizes and decodes the target and standard constructs
comprising target peptides, tags and optional additional barcodes using time-
dependent variations in an electrical or optical parameter measured while the
construct molecules transit a nanopore (or wherein said nanopore is
biological,
e.g., the common protein nanopores occurring in nature or derivatives thereof,
or
nucleic acid constructs, or a hole in a solid state inorganic material, e.g.,
Si3N4,
SiO2, graphene, or MoS2) (or wherein said target and standard constructs
comprise non-peptide polymers including nucleic acids that engage with a
molecular motor, e.g., a polymerase or helicase to regulate the speed at which
the
constructs move through a nanopore) (or wherein the nanopore detection is
continued, and construct counts accumulated, until reaching pre-determined
threshold numbers of counts, which may be based on counts required for each
peptide sequence, e.g., to provide a pre-determined precision according to
counting statistics, or counts required to achieve a pre-determined precision
in a
ratio, e.g., target/standard counts),
• said constructs are located on a support and detected by sequential binding of a
plurality of binders comprising a detectable label, wherein independently or in
any combination:
o peptides in constructs are identified by using cognate binders labeled with
optically detectable moieties including fluorescent dyes or proteins
o multiple binders recognize distinct epitopes within a target peptide
o kinetic binding analysis of binder-construct interactions is used to improve
the specificity of detection
o detection of binding or lack of binding by binders is interspersed with
sequential removal of n-terminal amino acids or peptide segments
o one or more of the said target tags, standard tags, sample tags, binder tags,
or vehicles is an oligonucleotide and is detected by hybridization of an
optically-labeled complementary oligonucleotide.
25. A method for measuring the amounts of a plurality of selected target
proteins in a biological
sample, comprising independently or in any combination:
proteolytically digesting proteins in said sample to yield peptides,
modifying said peptides by covalent chemical addition of a target tag to form
a
plurality of constructs, including target construct molecules comprising
selected
proteotypic target peptides derived from said target proteins,
adding prepared standard constructs that are cognates of said target
constructs and
comprise a standard tag in amounts that are known and/or consistent between a
set of
samples, forming a standardized digest,
enriching said cognate target and standard construct pairs by contacting said
standardized digest with cognate binders, forming bound constructs,
separating said bound constructs from unbound constructs to form enriched
constructs,
releasing said enriched constructs from said binder,
linking said enriched constructs to a vehicle capable of presenting said
enriched
constructs to a sequence-sensitive single molecule detector,
recognizing and counting said enriched target construct molecules and said
enriched
standard construct molecules using a sequence-sensitive single molecule
detector
capable of distinguishing said target and standard tags and identifying
peptides,
calculating the amount of said proteins in said sample by multiplying the
amount of
each standard construct added by the ratio of the number of cognate target
construct
molecules counted to the number of cognate standard construct molecules
counted by
said detector,
wherein a pre-enrichment ratio calculated by dividing the number of molecules
of a
first target peptide construct that is the most numerous of said target
peptide constructs
in the standardized sample digest by the number of molecules of a second
target peptide
construct that is the least numerous of said target peptide constructs in the
standardized
sample digest is more than 10 times larger than a post-enrichment ratio
calculated by
dividing the number of molecules of said first target peptide construct by the
number
of molecules of said second target peptide construct in said enriched sample.
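The flattening criterion recited in paragraphs 21 and 25 (a pre-enrichment ratio more than 10 times larger than the post-enrichment ratio) can be checked with a short calculation. The sketch below assumes molecule numbers are available as dictionaries keyed by peptide; the function name, peptide labels and numbers are hypothetical.

```python
def flattening_factor(pre, post):
    """Return (pre-enrichment ratio) / (post-enrichment ratio).

    pre and post map a peptide label to a number of molecules. The "first"
    and "second" constructs are the most and least numerous in the
    standardized digest (pre), and the same two are compared in post.
    """
    most = max(pre, key=pre.get)
    least = min(pre, key=pre.get)
    pre_ratio = pre[most] / pre[least]
    post_ratio = post[most] / post[least]
    return pre_ratio / post_ratio


# Hypothetical molecule numbers before and after enrichment.
pre = {"pepA": 1_000_000, "pepB": 100}   # 10,000:1 in the digest
post = {"pepA": 5_000, "pepB": 50}       # 100:1 after enrichment

factor = flattening_factor(pre, post)
print(factor)        # 100.0
print(factor > 10)   # True: meets the "more than 10 times" criterion
```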
The inventions herein provide improved quantitative measurements of the
amounts of
proteins, in a way that is highly specific, extremely sensitive, multiplexable
with wide dynamic
range, capable of very high throughput with low cost per measurement, and
amenable to
implementation on compact, inexpensive equipment.
The invention combines a series of known and new processes in a novel
combination,
and provides novel advantages over existing protein measurement methods. In
its most basic
aspect, the invention comprises proteolytic fragmentation of target
protein(s), addition of
internal standard versions of one or more peptides in known amount (these
standard peptides
being detectably different from the sample digest peptides based on
incorporation of molecular
tags into either sample digest peptides, added standard peptides, or both),
enrichment of
selected sample peptides (TARGETs) and cognate internal standards (STANDARDs)
by
specific affinity selection on BINDERs, and single molecule identification and
counting of the
resulting enriched peptides. By counting individual peptide molecules, the
invention provides
the maximum sensitivity attainable by direct analyte detection (i.e.,
detection without
amplification). The following general features, and others described herein,
are included in
the invention:
Proteolytic digestion of proteins in a sample to yield a peptide digest.
Selection of TARGET peptides from among the candidate peptides produced by
digestion of
a target protein, based on theoretical (e.g., in silico) and/or experimentally
determined
features and performance as a quantitative surrogate of the protein analyte.
Design of internal standard (STANDARD) versions of TARGET peptides (which may
be
identical sequences).
Coupling TARGET molecular tags to TARGET peptides to form TARGET constructs,
coupling STANDARD molecular tags to STANDARD peptides to form STANDARD
constructs, or both.
Addition of known or reproducible amounts of STANDARD constructs to a digest
prior to
enrichment, creating a standardized sample digest.
Specific enrichment of TARGET and STANDARD peptide constructs, removal from
digest
matrix, washing, chemical modification as needed, and elution, creating an
enriched
standardized sample digest. This specific enrichment step may be carried out
with
amounts of BINDERs adjusted to achieve some degree of stoichiometric
flattening
among a series of TARGET peptides, creating an enriched and flattened
standardized
sample digest.
Optional addition of BINDER-identifying barcodes to TARGET and STANDARD
peptide
constructs in a sample digest.
Optional addition of sample-identifying barcodes to TARGET and STANDARD
peptide
constructs in a sample digest when samples are to be pooled prior to single
molecule
detection. Sample barcode addition can be carried out either before or after
specific
enrichment by BINDERs.
Presentation of enriched TARGET and STANDARD peptide constructs to a sequence-
sensitive single molecule detector in an appropriate chemical form or
structure,
typically by linkage to a VEHICLE.
Detection and counting of peptide TARGET and STANDARD molecules using the
sequence-
sensitive single molecule detector.
Estimation of the amount(s) of TARGET peptide(s) (and thus parent proteins) in
a sample
based on the ratios between respective cognate TARGET and STANDARD peptide
counts, with pooled samples decoded when necessary using sample barcodes.
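The last two features above (counting, then decoding pooled samples by barcode) can be sketched as a simple tally. This is an illustration only: it assumes each detection event has already been decoded into a (sample barcode, peptide, tag) record, with the tag written as "T" for TARGET or "S" for STANDARD. The record layout and all names are hypothetical.

```python
from collections import defaultdict


def count_events(events):
    """Tally TARGET ("T") and STANDARD ("S") events per (sample, peptide)."""
    counts = defaultdict(lambda: {"T": 0, "S": 0})
    for sample, peptide, tag in events:
        counts[(sample, peptide)][tag] += 1
    return counts


def target_standard_ratios(counts):
    """Return the T/S ratio for every (sample, peptide) with STANDARD counts."""
    return {key: c["T"] / c["S"] for key, c in counts.items() if c["S"] > 0}


# Hypothetical decoded events from a pool of two barcoded samples.
events = [
    ("S1", "pepA", "T"), ("S1", "pepA", "T"), ("S1", "pepA", "S"),
    ("S2", "pepA", "T"), ("S2", "pepA", "S"), ("S2", "pepA", "S"),
]

ratios = target_standard_ratios(count_events(events))
print(ratios[("S1", "pepA")])  # 2.0
print(ratios[("S2", "pepA")])  # 0.5
```

Multiplying each ratio by the amount of the cognate STANDARD added to that sample gives the estimated TARGET amount.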
Several technologies among those known in the art can be used for sequence-
specific
single molecule detection and counting of biological macromolecules. Some of
these have
been developed for peptide sequencing, while others have been developed
primarily for
sequencing nucleic acids, but show potential for application to peptides.
The present disclosure provides methods for preparation and analysis of
protein-
containing biological samples compatible with any of the sequence-sensitive
single molecule
detection methods.
In some embodiments the invention provides a means to measure the amount of a
peptide molecule (termed a "TARGET" peptide), typically a proteolytic fragment
of a sample
protein resulting from proteolytic digestion of a biological sample. In some
embodiments,
sample proteins are proteolytically digested and standardized by addition of
an internal
standard peptide or peptide construct (STANDARD) to create a standardized
sample, from
which the TARGET peptide and STANDARD are enriched and individually counted
using
sequence-sensitive single molecule detection (e.g., nanopore sequencing). The
resulting
counts of TARGET peptide and STANDARD molecules allow estimation of the
absolute or
relative amount or concentration of the target protein in the sample, given
knowledge of the
amount of STANDARD present in the digest.
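Because the estimate rests on counted molecules, its precision depends on how many are counted. The sketch below assumes the counts follow Poisson statistics (an assumption made for illustration, not stated in the disclosure), in which case the relative standard error of the TARGET/STANDARD ratio is approximately sqrt(1/T + 1/S). The function names are hypothetical.

```python
import math


def ratio_cv(target_count, standard_count):
    """Approximate relative standard error of T/S under Poisson counting."""
    if target_count <= 0 or standard_count <= 0:
        raise ValueError("both counts must be positive")
    return math.sqrt(1.0 / target_count + 1.0 / standard_count)


def counts_needed_each(cv):
    """Counts of each species (assumed equal) needed to reach a ratio CV."""
    return math.ceil(2.0 / (cv * cv))


# With 10,000 counts of each species the ratio is good to about 1.4%.
print(round(ratio_cv(10_000, 10_000), 4))  # 0.0141

# Counts of each species needed for roughly 5% precision in the ratio.
print(counts_needed_each(0.05))
```

A calculation of this kind could be used to set count thresholds at which detection is stopped once a desired precision in the ratio has been reached.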
6.2 COMPONENTS USED IN THE INVENTION
Four molecular components used in some embodiments of the invention are
described
below: a) one or more TARGET peptides (the analytes to be measured); b)
internal standard
(STANDARD) versions of TARGET peptides or peptide constructs used as internal
standards;
c) specific affinity reagents (BINDERS) for capture and enrichment of TARGET
peptides and
STANDARD prior to detection; and d) barcodes used to distinguish TARGET and
STANDARD constructs and/or to distinguish peptides on constructs derived from
different
samples. In general, these components are selected, prepared or optimized in
ways surprisingly
distinct from earlier work in which mass spectrometry has been used for
peptide detection.
6.3 TARGET PEPTIDE.
Using the known sequence of a target protein, one or more peptide segments
within it
are selected as "TARGET" peptides. Selection can be accomplished using an "in
silico"
approach (e.g., by "digesting" the sequence of a target protein, known for
example from the
genome sequence of the relevant species, using a computer to cut the sequence
at sites
predicted based on the known cleavage specificities of a selected protease or
chemical
fragmentation method), an experimental approach (e.g., from a list of peptide
fragments
actually observed in a digest of the protein or a sample containing it), or
both. A preferred
TARGET peptide can be defined in some embodiments by criteria selected from a
set intended
to identify peptides that optimize the performance of the assay. In some
embodiments it is
preferred that a TARGET peptide has one or more characteristics that improve
its performance
in an assay according to the invention, including, but not limited to,
peptides that are or have:
A length of about 4 to 50 amino acids, or more preferably 6 to 24 amino acids;
Efficient digestion: Produced rapidly and in high (ideally >90%) yield by
digestion of the target
protein through an efficient, inexpensive proteolytic treatment with an enzyme
(e.g., a
protease such as trypsin) or a high-yield chemical treatment (e.g., CNBr
cleavage);
Proteotypic sequence: A sequence that is unique to the target protein (unless
measurement of
a family of proteins is an object of the assay), i.e., that it is
"proteotypic" for the protein
and appears in no other proteins likely to be found in the intended sample (or
more
preferably, that it occurs in no other natural protein coded for by the genome
of the
species of interest), and that it occurs in the target protein sequence in a
known number
of locations (typically one location, but potentially more than one if the
peptide
sequence is repeated in the protein);
Few variants: A sequence that is relatively consistent across the population
of target protein
molecules occurring in a sample, i.e., occurs with relatively few sequence
variants
(unless measurement of such sequence variants is an object of the assay).
Few post-translational modifications: Peptides with relatively few post-translational
modifications (unless measurement of such modifications is an object of the
assay).
Favorable epitopes: A sequence containing structural features (e.g.,
"immunogenic" epitopes
in the case of antibody affinity reagents) that facilitate development of
specific affinity
reagents capable of binding the peptide with high affinity, and specifically a
slow off-
rate (e.g., monoclonal antibodies, aptamers, etc.).
Solubility: Favorable physico-chemical properties, including solubility in
aqueous solutions,
little or no binding to materials used in sample preparation and analysis
vessels and
devices (e.g., nanopores), and little or no tendency to aggregate.
Stability: Low rate of spontaneous chemical degradation (e.g., by methionine
or tryptophan
oxidation, asparagine or glutamine de-amidation, etc.).
Recognizable sequence: A sequence that has features making it easily
distinguishable from
other sequences using the chosen sequence-sensitive single molecule detection
means.
If, for example, the detection means is nanopore sequencing, then peptide
sequences
that produce distinctive current traces as molecules pass through a nanopore,
thus
allowing them to be distinguished from other peptides, are preferred. Typical
nanopores produce current signals that are reflective of a stretch of 3-6
contiguous
amino acids (a "kmer") inside the pore. Amino acids have a multiplicity of
different
side-chain volumes, and while these volumes do not always directly determine
the
nanopore "blockade currents", sequences with more variable patterns of side
chain
volume are preferred. If the detection means involves recognition and
recording of a
terminal amino acid and its removal in a cyclical process to expose a new
terminus
(e.g., using degradative sequencing technology), then sequences that include
amino
acids for which the terminal recognition is most accurate, and/or least
confusing, are
likely to be preferred. If the detection means involves binding of recognition
molecules
to peptide epitopes, then peptides with multiple distinct epitopes are
preferred.
Amino acid constraints: Presence or absence of specific amino acids that
impact assay
performance. In some embodiments cysteine is avoided due to its potential to
form
bridges between peptides, or alternatively a step is included in the sample
preparation
to block cysteines (e.g., via alkylation by iodoacetamide). In some
embodiments, one
or more cysteine residues present in a peptide are used as reactive sites for
introduction
of linkages to other molecules, or labels that can assist in peptide
recognition by a
sequence-sensitive single molecule detector.
Amino groups: Specific numbers and sites of amino groups (lysine side chain
and n-terminal).
As the most preferred sites for chemical linkage to other molecules, the
number and
position of amino groups is an important factor in the design of some
constructs
required for efficient presentation of peptides to a sequence-sensitive single
molecule
detector. For example, a tryptic peptide with c-terminal arginine and no
internal lysine
residues has a single amino group located at its n-terminus (i.e., it is
"single amino"),
and therefore has a unique site at which certain amino-reactive chemistries
can
establish a covalent linkage between the peptide and another molecule.
Alternatively,
a tryptic peptide having a c-terminal lysine and no internal lysines has two
amino
groups, one at the n-terminus and one at the epsilon amino group of the lysine
side
chain, thus facilitating methods in which a peptide is linked into a longer
polymer by
coupling at both ends.
Carboxyl groups: Specific numbers and sites of carboxyl groups (glutamic and
aspartic acid
side chains and at the c-terminus). As an alternative preferred site for
chemical linkage
to other molecules, the number and position of carboxyl groups is an important
factor
in the design of some constructs required for efficient presentation of
peptides to a
sequence-sensitive single molecule detector.
Electric charges: Specific numbers of charged amino acids (e.g., lysine,
arginine, histidine, and
the n-terminus with positive charges; and glutamic and aspartic acids and the
c-
terminus with negative charges) and the sum of these (i.e., the net charge of
the peptide
at the working pH of a sequence-sensitive single molecule detector). The total
charge
of a peptide can significantly affect its movement through a nanopore under
the
influence of an electric potential between cis and trans compartments, between
which
the nanopore serves as a conduit: in some embodiments a net negative peptide
charge
is preferred, such that the peptide is pulled through the pore from cis to
trans (i.e., in
the same direction a negatively charged oligonucleotide would be). In some
embodiments a net positive peptide charge is preferred, such that a peptide is
dragged
through the pore by another molecule to which it is attached (and which has a
net
negative charge). In some such embodiments, it is preferred that positive
charge(s) be
localized towards the end of the peptide that is last to enter the pore (i.e.,
the trailing
end of the peptide) so as to help maintain the peptide in an extended,
linearized form
as it passes into and through a nanopore.
These characteristics vary considerably, and in many cases independently,
across the
population of potentially selectable TARGET peptides to which the invention
may be applied,
as well as between single molecule detection technologies, and as a result the
selection of
optimal TARGET peptide sequences appropriate for some embodiments involves a
complex
weighting of these and other characteristics.
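Several of the characteristics listed above (numbers of amino and carboxyl groups, net charge, presence of cysteine) can be tallied directly from a candidate sequence. The following is a minimal sketch only, under stated assumptions: free, unmodified termini; one-letter amino acid codes; and a crude integer net charge that counts every lysine, arginine, histidine and the n-terminus as +1 and every aspartic acid, glutamic acid and the c-terminus as -1, without any correction for the working pH. The names PeptideFeatures and summarize are illustrative and do not appear in the specification.

```python
from dataclasses import dataclass

POSITIVE = "KRH"   # lysine, arginine, histidine side chains
NEGATIVE = "DE"    # aspartic and glutamic acid side chains


@dataclass(frozen=True)
class PeptideFeatures:
    amino_groups: int      # n-terminus plus lysine side chains
    carboxyl_groups: int   # c-terminus plus Asp/Glu side chains
    net_charge: int        # crude integer estimate, NOT pH-corrected
    has_cysteine: bool


def summarize(sequence: str) -> PeptideFeatures:
    """Tally linkage sites and charges for a peptide with free termini."""
    seq = sequence.upper()
    amino = 1 + seq.count("K")
    carboxyl = 1 + sum(seq.count(res) for res in NEGATIVE)
    positive = 1 + sum(seq.count(res) for res in POSITIVE)
    negative = carboxyl  # every carboxyl group is counted as one negative charge
    return PeptideFeatures(amino, carboxyl, positive - negative, "C" in seq)


if __name__ == "__main__":
    print(summarize("LLGPHVEGLK"))
```

Such a tally addresses only the countable criteria; properties such as solubility, stability and epitope quality would still require experimental assessment.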
6.3.1 Distinction from selection for mass spectrometry
It is useful to note that in many embodiments, commonly-considered features of

proteolytic peptides in conventional peptide quantitation methods are NOT
included among
peptide characteristics considered important or limiting in the present
invention: these include
i) performance of a peptide in chromatographic separations (e.g., favorable
elution behavior in
reversed-phase liquid chromatography, elution time separated from other
peptides used in
panel combinations, etc.); ii) ionization efficiency in a mass spectrometer
source (including
the preference for a c-terminal positive charge); iii) fragmentation behavior
in a mass
spectrometer (important for example when using triple quadrupole "MRM"
quantitation
methods), and iv) peptide size (i.e., as a limitation due to the preferred
mass range of a mass
spectrometer). In view of the inapplicability of these criteria, some
embodiments of the present
invention will make use of TARGET peptides different from those preferred in
mass
spectrometry-based detection systems (including peptides that would be
unusable in MS
detection).
6.3.2 Peptides with 1 or 2 amino groups
In some embodiments TARGET peptides are generated through cleavage of proteins

by trypsin (an inexpensive and well-understood protease that cuts polypeptide
chains
preferentially c-terminal to lysine and arginine residues). Tryptic TARGET
peptides can be
selected to contain either 1 ("Single amino") or 2 ("Double amino") amino
groups. Numerous
alternative proteases (e.g., Lys-C, Arg-C, pepsin, papain, chymotrypsin, etc.)
and chemical
cleavage reactions (cyanogen bromide (CNBr) cleaving at methionine (Met)
residues; BNPS-
skatole cleaving at tryptophan (Trp) residues; formic acid cleaving at
aspartic acid-proline
(Asp-Pro) peptide bonds; hydroxylamine cleaving at asparagine-glycine (Asn-
Gly) peptide
bonds, and 2-nitro-5-thiocyanobenzoic acid (NTCB) cleaving at cysteine (Cys)
residues) can
also generate peptides with characteristics allowing their use as TARGET
peptides.
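As an illustration of how candidate tryptic peptides might be enumerated from a protein sequence, the sketch below performs a simplified in-silico digest. It assumes complete cleavage c-terminal to every lysine and arginine, with an optional rule that a following proline blocks cleavage; real trypsin is only preferential, so missed and non-specific cleavages are ignored here. The function name tryptic_digest and the example sequence are hypothetical.

```python
from typing import List


def tryptic_digest(protein: str, block_before_proline: bool = True) -> List[str]:
    """Split a sequence after each K or R (simplified trypsin rule)."""
    seq = protein.upper()
    peptides: List[str] = []
    start = 0
    for i, residue in enumerate(seq):
        if residue not in "KR":
            continue
        next_is_proline = i + 1 < len(seq) and seq[i + 1] == "P"
        if block_before_proline and next_is_proline:
            continue  # assume no cleavage when proline follows K/R
        peptides.append(seq[start:i + 1])
        start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])  # c-terminal remainder of the protein
    return peptides


if __name__ == "__main__":
    # Hypothetical sequence chosen only to exercise the rules.
    print(tryptic_digest("AAKGGRPLLKDD"))
```

Other proteases or chemical cleavage reactions would need their own site rules; only the trypsin case is sketched.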
In some embodiments, peptides with a single amino group are preferred because
this
provides a unique and chemically convenient site that can be covalently
coupled to other
molecules used in enrichment or detection of peptides. Tryptic peptides ending
in c-terminal
arginine (R) and having no internal lysine(s) possess only one amino group (at
the n-terminus)
and are termed "Single amino" peptides. Reaction with this group provides a
geometrically
well-defined "handle" on one end of the peptide. Such peptides are used in
some embodiments
to create "rope-tow" constructs for nanopore sequencing through combination
with
oligonucleotides, as described below. In some embodiments, it is preferred
that the TARGET
peptide has either no net charge or a net positive charge in order to
facilitate peptide movement
within a nanopore. Peptides of the "Single amino" group may be selected to
contain no aspartic
or glutamic acids so as to minimize the contribution of negative charges on
the peptide (i.e.,
the peptide preferably has zero or net positive charge that, in some
embodiments, helps resist
being pulled through a nanopore by its attachment to a negatively charged
polymer construct),
and no cysteine (so as to avoid the necessity for a method step to block these
reactive groups).
Single amino peptides with no aspartic or glutamic acid residues also have a
single carboxyl
group, which is useful in some embodiments that rely on anchoring a peptide to
a support via
its c-terminus, leaving the unmodified n-terminus available for binding of
affinity reagents or
sequential degradation (e.g., by Edman chemistry).
In some embodiments requiring linkage sites on both ends of a peptide, it can
be
convenient to select TARGET peptides (e.g., "Double amino" peptides) with a
single lysine
residue present at or near the c-terminus, whose epsilon-amino group provides
a second
reactive amino group, in addition to the n-terminal amino group at the
opposite end of the
peptide molecule. Linkage through these two amino groups allows a peptide to
be coupled
"in-line" with preceding and succeeding polymers via amine-reactive chemistry
to form a
continuous thread. Peptides with an internal lysine, or multiple lysines, are
less preferred in
such embodiments due to the potential for multiple non-linear constructs.
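The "Single amino" and "Double amino" categories described above can be expressed as a simple classification rule. The sketch below is illustrative only and makes simplifying assumptions: the n-terminus is free, "Double amino" is restricted to a single lysine located exactly at the c-terminus (the text also allows a lysine near the c-terminus), and the optional filter mirrors the stated preference for no aspartic acid, glutamic acid or cysteine. The function names, and the sequences AVLGTR, AKLGTR and AVLETR, are hypothetical.

```python
def amino_class(peptide: str) -> str:
    """Classify a peptide by its number of amino groups (free n-terminus assumed)."""
    seq = peptide.upper()
    lysines = seq.count("K")
    if lysines == 0:
        return "Single amino"   # only the n-terminal amino group
    if lysines == 1 and seq.endswith("K"):
        return "Double amino"   # n-terminus plus c-terminal lysine epsilon amino
    return "other"              # internal or multiple lysines


def passes_optional_filters(peptide: str) -> bool:
    """Single amino, c-terminal arginine, and no Asp, Glu or Cys residues."""
    seq = peptide.upper()
    return (amino_class(seq) == "Single amino"
            and seq.endswith("R")
            and not any(res in seq for res in "DEC"))


if __name__ == "__main__":
    for candidate in ("LLGPHVEGLK", "AVLGTR", "AKLGTR", "AVLETR"):
        print(candidate, amino_class(candidate), passes_optional_filters(candidate))
```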
In some embodiments TARGET peptides are selected for other configurations of
reactive sites. In some embodiments it is preferred that a peptide have a
single carboxyl group
and this criterion can be met by peptides with no aspartic or glutamic acid
residues while
possessing a free c-terminal carboxyl. In some embodiments other specific
amino acids are
desired so as to facilitate labeling of peptide molecules with amino acid-
specific detection
reagents.
In some embodiments in which a net positive peptide charge is preferred,
aspartic or
glutamic acid carboxyl groups in the peptide can be converted to positively
charged sites by a
chemical modification, e.g., activation by a carbodiimide and reaction with a
reagent having
an amino group (that couples to the carboxyl) and a second positively charged
group.
In some embodiments, peptides having post-translational modifications are
selected.
For example, the n-terminal amino group of tryptic peptide VHLTPEEK from the
beta chain
of human hemoglobin (Hb) is modified by glycation in a fraction of molecules
in the blood as
a result of slow reaction with blood glucose (the modified Hb is referred to
as HbA1c, and is
used clinically as a measure of average blood glucose over time in a test for
diabetes). The
unmodified form of this peptide is thus "Double amino", while the modified
form is "Single
amino". In some embodiments designed to measure the fraction of Hb that is
modified by this
glycation, the n-terminal amino group of unmodified peptide is first blocked
by reaction under
conditions favoring reaction with the lower pK n-terminal amino group, and
subsequently both
modified and unmodified forms of the peptide are coupled to other molecules by
the single
remaining amino group of the c-terminal lysine.
6.4 STANDARD VERSION OF A TARGET PEPTIDE FUNCTIONING AS AN
INTERNAL STANDARD.
6.4.1 History in mass spectrometry
In mass spectrometry (MS) as used for peptide quantitation (52), an internal
standard
is typically a synthetic, same-sequence version of a TARGET peptide including
one or more
amino acids comprising stable isotope labels (typically referred to as a
Stable Isotope Standard
or SIS) that allow it to be distinguished from the sample-derived TARGET
peptide by mass
measurement in the MS instrument (i.e., the well-known method of "isotope
dilution"). Given
the chemically identical structures of TARGET peptides and stable isotope
labeled peptide
internal standards, there is no basis on which to suspect that specific
capture reagents (e.g.,
anti-peptide antibodies or other BINDERS) can distinguish between them, and
thus such
capture reagents will bind TARGET and stable isotope labeled peptides in
whatever ratio they
exist in the surrounding solution (e.g., the sample digest). A key feature of
this approach to
internal standardization is that the same method can be used to create the
standard in all cases:
for example, with tryptic peptides an effective standard can be made by
synthesizing the
TARGET sequence with a c-terminal amino acid (typically either lysine or
arginine)
containing stable isotopes (e.g., all 12C replaced by 13C and all 14N replaced
by 15N).
Therefore no advanced design or experimental testing and selection is required
in a particular
case: one approach works in all cases. This is not the case in the present
invention, in which
some embodiments require an involved selection and/or manufacturing strategy
to identify and
produce cognate TARGETS, STANDARDS and BINDERS that function properly together
in
the invention.
6.4.2 Limited use of internal standards in nucleic acid-based technologies
Internal standards that are truly identical, or almost identical (i.e.,
"cognate"), to the
target analyte sequence are not typically used in genomics and nucleic acid
technologies. Since
the genome has very flat stoichiometry to begin with (most genes present in
equal copy
numbers, or if not, present in small integer ratios), internal standardization
is typically not
required for copy number quantitation. In quantitative methods such as RNAseq,
where
quantitation is important and mRNA levels can vary widely, current approaches
typically
sequence deeply and use total counts of specific sequences without
differential enrichment,
while focusing on between-sample normalization methods to reduce the effect of
systematic
factors that influence the results, such as RNA length, GC content, structure
formation
potential, etc. Where nucleic acid quantitation is important (e.g., in
detection of one or more
specific sequences such as SARS-CoV-2 sequences), PCR and related technologies
are
typically employed that rely on amplification, and the result is expressed in
terms of the number
of amplification cycles required to achieve a certain detection threshold
(e.g., a "Ct value").
In none of these cases is a true internal standard (one chemically equivalent
to the analyte in
the assay but distinguishable by the detector) required or employed.
Since most of the emerging single molecule peptide technologies are evolutions
of
genomic methods, and/or are aimed at biomarker discovery rather than precise
quantitation
over a wide dynamic range, internal standardization with cognate STANDARDs has
not
previously been a significant objective of single molecule peptide analysis by
these methods.
It is thus a novel object of the present invention to provide effective
internal standardization
for single molecule peptide methods so as to enable accurate peptide
quantitation by single
molecule methods.
6.4.3 Need for novel internal standards for quantitative single molecule
detection
In the present invention, a single molecule detection technology is not
expected to be
able to reliably detect small mass differences (a few atomic mass units)
between the otherwise
identical chemical structures of a TARGET peptide and an isotopically-labeled
STANDARD
with the same sequence (i.e., same chemical structure), and therefore other
differences in
molecular character besides isotopic mass must be employed. In this case a
difference in
chemical structure is required, and it would be expected based on experience
with the extreme
specificity of high-affinity specific capture reagents that a structural
difference would in
general lead to a significant difference in binding by a specific capture
reagent, which in turn
would interfere with the ability of a chemically modified peptide to function
as an effective
internal standard by preserving the ratio of TARGET to STANDARD during
enrichment by
BINDER from the digest.
In some embodiments of the invention, a STANDARD is identified and prepared
for
each TARGET peptide and added to the sample in known or constant amount
before, during
or after digestion, but before enrichment, to act as a quantitative reference
at the detection step.
A sample digest to which STANDARDs corresponding to cognate TARGETs have been
added
in known or constant amount is referred to herein as a "standardized sample
digest". A sample
digest may be standardized with respect to a single TARGET, or with respect to
multiple
TARGETs. The amount of a TARGET peptide can be compared with the amount of
added
STANDARD, and thereby measured, by multiplying the amount of STANDARD by the
observed ratio of TARGET peptide to STANDARD in a sample. In some embodiments,
the
STANDARD is very similar to the TARGET peptide, i.e., as close as possible to
being
indistinguishable from it during steps of the workflow before the detection
step, while being
clearly distinguished from it at the detection step; in other words, a cognate
sequence peptide
standard, herein referred to as a STANDARD.
In some embodiments, the STANDARD serves as an internal standard against which

the TARGET peptide amount is compared, for example by comparing the number of
TARGET
peptide molecules to the number of STANDARD molecules, providing a ratio
measurement.
In some embodiments in which a known amount of STANDARD is added (e.g., a
known mass,
or molar amount, or known number of molecules), multiplication of the ratio by
this amount
yields the amount of TARGET peptide (or mass or the number of TARGET peptide
molecules)
in the sample digest. In the case where the same amount of STANDARD (although not
necessarily an amount
whose value in moles, grams or number of molecules, or whose concentration, is known)
is added
to multiple samples, the presence in each of a consistent amount of STANDARD
allows the
amounts of TARGET peptide to be compared between these samples (using the
shared amount
of STANDARD as basis) to provide relative quantitation within a sample set.
Samples can be
compared using this approach by addition of the same amount of STANDARD (i.e.,
the same
mass or the same volume of the same or equivalent solution), or using
different amounts in
different samples so long as the amounts added to different samples are known
in relative terms
(e.g., twice as much STANDARD added to sample 2 as to sample 1). Methods for
the use of
STANDARDs at various levels of monitor peptide quantitation and calibration
are described
in detail in Provisional patent application 63/213,371 - Calibration of
Analytical Results in
Dried Blood Samples, which is incorporated by reference herein in its
entirety.
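The ratio arithmetic described above can be sketched as follows. This is a minimal illustration, assuming that the detector reports molecule counts for TARGET and STANDARD and that these counts are proportional to the amounts present; the function names and the numbers in the usage example are hypothetical and are not taken from the specification.

```python
from typing import Dict, Optional, Tuple


def absolute_amount(target_count: int, standard_count: int,
                    standard_amount: float) -> float:
    """TARGET amount = STANDARD amount x (TARGET count / STANDARD count)."""
    if standard_count <= 0:
        raise ValueError("no STANDARD molecules counted; ratio is undefined")
    return standard_amount * target_count / standard_count


def relative_amounts(counts: Dict[str, Tuple[int, int]],
                     relative_standard: Optional[Dict[str, float]] = None
                     ) -> Dict[str, float]:
    """Relative quantitation across samples.

    counts maps sample name -> (TARGET count, STANDARD count).
    relative_standard maps sample name -> amount of STANDARD added, in
    arbitrary but consistent units; if omitted, the same amount is assumed
    for every sample.
    """
    result: Dict[str, float] = {}
    for sample, (target_count, standard_count) in counts.items():
        scale = 1.0 if relative_standard is None else relative_standard[sample]
        result[sample] = absolute_amount(target_count, standard_count, scale)
    return result


if __name__ == "__main__":
    # Hypothetical counts; 50.0 could be fmol, ng or molecules of STANDARD.
    print(absolute_amount(target_count=300, standard_count=100, standard_amount=50.0))
    print(relative_amounts({"sample 1": (200, 100), "sample 2": (200, 100)},
                           {"sample 1": 1.0, "sample 2": 2.0}))
```

In the second call the two samples show the same observed ratio, but sample 2 received twice as much STANDARD, so its inferred TARGET amount is twice that of sample 1.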
It is advantageous for a STANDARD construct and the STANDARD peptide it
comprises to be as similar as possible to the respective TARGET construct and
the TARGET
peptide it comprises, since this similarity minimizes the probability that the
ratio between them
(which encodes the desired quantitative result of the analytical process) will
be skewed or
altered by some physical or chemical process in any step of an analytical
workflow prior to
detection, including enrichment by a cognate BINDER. As noted above, since a
BINDER
selected to enrich cognate TARGET and STANDARD constructs (or the respective
peptides)
must be highly specific in order to bind these peptides and not the enormous
variety of other
peptides present in a digest of a biological sample, some embodiments of the
invention make
use of TARGET and STANDARD peptides that are identical (i.e., perfect
cognates).
Alternative approaches, in which limited modifications of peptide sequence or
structure
distinguish TARGET and STANDARD peptide components, are less ideal and less
general,
but in some cases may be practically useful.
Similarly, for TARGET and STANDARD constructs, where a structural or chemical
distinction is required in order that they be separately countable by a single
molecule detector,
the non-peptidic components of the TARGET and STANDARD constructs should also
be
cognates, though with relaxed similarity constraints. It is therefore
advantageous for the non-
peptidic components of the TARGET and STANDARD constructs to have similar
physical
properties such as mass, physical dimensions, shape, charge, hydrophobicity,
solubility, etc.
In some embodiments of the invention these constraints are addressed by the
use of
oligonucleotide TARGET and STANDARD tags, wherein the tags have the same
length and
may have the same base composition (implying the same molecular mass), but
different
sequences, allowing them to be distinguished by DNA sequencing or by specific
hybridization
to complementary probes. Linkage of such oligo tags to TARGET and STANDARD
peptides
can be accomplished using bifunctional linkers (for example including flexible
polymer
components such as polyethylene glycol between the oligo and peptide
attachment sites) that
reduce any steric hindrance the oligo may exert on the peptide that could
affect binding to a
BINDER. Such a level of similarity reduces the probability of skewing of the
TARGET to
STANDARD construct ratio because of differences in the diffusion, charge
repulsion, epitope-
masking, or solubility of the two constructs.
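The tag constraints described above (same length, optionally the same base composition, but different sequences) lend themselves to a simple check. The sketch below is illustrative only; the function name is_cognate_tag_pair and the example tag sequences are hypothetical, and no claim is made about the actual tags used in any embodiment.

```python
from collections import Counter


def is_cognate_tag_pair(target_tag: str, standard_tag: str,
                        require_same_composition: bool = True) -> bool:
    """Check that two oligonucleotide tags are matched but distinguishable."""
    a, b = target_tag.upper(), standard_tag.upper()
    if len(a) != len(b):
        return False          # tags must have the same length
    if a == b:
        return False          # tags must differ in sequence to be told apart
    if require_same_composition and Counter(a) != Counter(b):
        return False          # same base counts imply the same molecular mass
    return True


if __name__ == "__main__":
    # Hypothetical tags: same length and base composition, different order.
    print(is_cognate_tag_pair("ACGTAC", "CATGCA"))
```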
Suitably similar TARGET and STANDARD constructs form a cognate construct pair.

The TARGET and STANDARD constructs, together with a BINDER capable of binding
them
without skewing the ratio, form a set of cognate molecules. A STANDARD
construct, the
cognate BINDER, a TARGET tag and any linker required to link the TARGET tag to
TARGET
peptide molecules in a sample digest form a cognate reagent set useful for
specific
measurement of the TARGET peptide and its parent target protein in a sample
(i.e., they can
serve as a kit for measuring the protein).
Achieving the goal of construct cognate equivalence nevertheless remains
challenging
because of the interdependence of constraints governing cognate constructs and
BINDERs,
and the absence of successful attempts to solve this problem in the past.
6.4.4 Standards with altered TARGET amino acid sequence
In some embodiments, the STANDARD is created by replacement or alteration
(e.g.,
by chemical modification) of one or more amino acids in the TARGET peptide
sequence, or
by addition of amino acids or other chemical structures. In some embodiments
it is preferred
that the replacement, addition or alteration a) does not result in any
significant difference in
binding of the TARGET peptide and the cognate STANDARD to the cognate BINDER,
and
b) results in an easily detected change in the result from a sequence-
sensitive single molecule
detector (e.g., a different ion trace during transit of a nanopore compared to
the TARGET
peptide, or a different amino acid sequence detected by a degradative
sequencing process, or a
difference in the set of epitope-specific binders detected by a single
molecule imaging
platform).
In some embodiments, one or more amino acids or other chemical groups can be
added
to either the n-terminal or c-terminal end of the TARGET peptide to create a
STANDARD,
with the same constraints (e.g., an easily detected change in the result of a
sequence-sensitive
single molecule detector, but no significant difference in binding of the
TARGET peptide
and STANDARD to the cognate specific BINDER). In some embodiments these
replacements and/or modifications are made to residues outside the peptide
epitope to which a
selected BINDER binds; such epitopes are typically linear contiguous regions
of 4-8 amino
acids in the case of IgG antibody BINDERS, leaving numerous potential
modification sites
available outside this region in a TARGET peptide 8-25 amino acids long.
In some embodiments a single serine (S) residue may be added to either the n-
terminus
or c-terminus of the sequence of a TARGET peptide to create a cognate
STANDARD. Any
other amino acid, or sequence of amino acids, that is clearly recognized by a
sequence-sensitive
detector can in theory be used instead of serine, the choice of added amino
acid(s) being free,
constrained only by the requirements of STANDARDs generally (i.e., BINDER
binding
equivalent to that of the cognate TARGET peptide, etc.) and any sequence
constraints arising
from any chemistry required to present peptides for detection. In some
embodiments that make
use of coupling reactions at two linkage sites on the peptide provided by the
n-terminal amino
group and the epsilon amino group of a c-terminal lysine residue, addition of
a residue after
the lysine (a serine residue in this embodiment) provides a STANDARD that is
chemically
identical to the TARGET peptide along the entire chain of connected atoms
between the
peptide's two amino groups (the n-terminal amino and lysine epsilon amino
group) while
comprising an appended serine residue "side chain". In some embodiments, an
amino acid
such as serine can be added to the n-terminus. In other embodiments any amino
acid(s) or
chemically linkable group of atoms can be added to one or the other terminus,
to an internal
amino acid, or to both termini, to create a STANDARD version of a TARGET
peptide
sequence.
Figure 2 illustrates the challenge in practice of designing a simple addition
of amino
acids to the n-term or c-term of a peptide TARGET to create a cognate STANDARD
while
preserving equivalent binding to a BINDER. Here a series of versions of the
peptide
LLGPHVEGLK (proteotypic for human mesothelin) was synthesized with each of the
20
amino acids added to the n-terminus, and dipeptides added to the c-terminus
(in each case a
proline was added after the lysine and ahead of the variable amino acid in
order to prevent
removal of the added amino acid by trypsin cleavage, were the STANDARD added
to a sample
prior to digestion as is often the case). Each variant was mixed with a
similar amount of the
unmodified TARGET (LLGPHVEGLK), and the ratio of variant candidate STANDARD
and
TARGET signals measured by mass spectrometry before and after enrichment by a
rabbit
monoclonal antibody with specific affinity for this peptide. All n-terminal
additions result in
a dramatic decrease in binding of candidate STANDARDs compared to the TARGET:
none
are enriched to more than 4% of the level of the TARGET. The epitope
recognized by this
antibody thus probably includes the n-terminus and cannot accommodate an added
amino acid.
C-terminal additions are successfully enriched, with recoveries compared to
TARGET of 27%
(-PC) to 5386% (-PW); i.e., widely varying depending on the specific amino
acid added after
the proline. The antibody binding therefore appears to be affected by c-
terminal additions, and
in some cases (e.g., -PW) these c-terminal variants bind in preference to the
original TARGET
against which the antibody was made. Only 2 of the 39 variants examined bind
the TARGET
and candidate STANDARD at near-equivalence: c-terminal -PP and -PQ additions
bind with
approximately 99% and 102% recovery relative to the TARGET sequence. This
example
demonstrates that a large majority of modifications made by adding amino acids
to the n-term
or c-term of a 10 amino acid long TARGET peptide are unlikely to yield
STANDARDs that
bind equivalently to a given BINDER, preserving the TARGET/STANDARD ratio
present in the
standardized sample digest. In addition, those STANDARD versions that appear
to satisfy this
simple version of the equivalence requirement (-PP and -PQ in this case), must
pass further
tests of equivalence under varying solution conditions, workflow timelines,
and sample
matrices, further restricting the range of choices.
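The equivalence screen described for Figure 2 can be sketched as a comparison of the candidate STANDARD to TARGET signal ratio before and after enrichment. The sketch assumes that recovery is defined as the post-enrichment ratio divided by the pre-enrichment ratio, expressed as a percentage, and it applies a tolerance of 5 percentage points around 100%; both the definition and the tolerance are assumptions made for illustration. The four recovery values in the usage example are those quoted in the text above; the function names are hypothetical.

```python
from typing import Dict, List


def relative_recovery(ratio_before: float, ratio_after: float) -> float:
    """Percent recovery of candidate STANDARD relative to TARGET.

    Each ratio is (candidate STANDARD signal) / (TARGET signal).
    """
    if ratio_before <= 0:
        raise ValueError("pre-enrichment ratio must be positive")
    return 100.0 * ratio_after / ratio_before


def near_equivalent(recoveries: Dict[str, float],
                    tolerance_pct: float = 5.0) -> List[str]:
    """Names of variants whose recovery lies within tolerance of 100%."""
    return sorted(name for name, recovery in recoveries.items()
                  if abs(recovery - 100.0) <= tolerance_pct)


if __name__ == "__main__":
    # Recoveries (percent) quoted in the text for four c-terminal additions.
    quoted = {"-PC": 27.0, "-PP": 99.0, "-PQ": 102.0, "-PW": 5386.0}
    print(near_equivalent(quoted))
```

Variants passing this first screen would, as noted above, still need to be tested under varying solution conditions, workflow timelines and sample matrices.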
In some embodiments using nanopore sequencing, and in which peptides transit
the
pore from n-terminal to c-term, it can be advantageous to use a STANDARD with
an added
residue at the c-terminus (the end closest to a DNA motor on the cis-side of
the pore) so as to
ensure that the STANDARD variation is read, even if the peptide is longer than
the nanopore's
read depth and as a result some of the peptide's n-terminal residues are not
read during a period
of controlled movement of the peptide through the nanopore. Sets of STANDARDS
generated
by addition of a constant c-terminal residue or residue pair to form the
STANDARDs will in
general require accurately reading a minimal subsequence of 2 amino acids more
than the
minimum required to distinguish the TARGETs themselves. Given the likelihood
of imperfect
reads, and the potential contamination with other, un-selected peptides,
longer reads of 5, 6, 7,
8, or 10 amino acids, or the entirety of the peptide's sequence may be
required to identify the
TARGET peptide and STANDARD molecules with sufficient accuracy (e.g., 99.5%,
99%,
98%, or 95% accuracy) to enable use of the ratio TARGET-to-STANDARD molecule
counts
to calculate a precise estimate of TARGET peptide amount.
In some embodiments, for example degradative methods in which a peptide is
immobilized by the c-terminus and read by successively removing amino acids
from the n-
terminus, an added residue indicating STANDARD status is preferred at the n-
terminus so as
to ensure that the distinction between TARGET and STANDARD peptides is read at
the
beginning, and does not require sequencing to the end of the entire peptide.
Accurately reading
a minimal subsequence of 3 amino acids starting from the n-terminus is often
sufficient to
distinguish among a small set (e.g., 20) of TARGET peptides and their
respective
STANDARDs. Given the likelihood of imperfect reads, and the potential
contamination with
other, un-selected peptides, longer reads may be required to confidently
identify the TARGET
peptide and STANDARD molecules. In some embodiments, for example those that
employ a
sequential enzymatic (e.g., exoprotease) or chemical (e.g., Edman) process to
remove single
amino acid residues from one terminus of a peptide, the advantage of rapid
definitive
identification of TARGET and STANDARD sequences based on just a few terminal
residues
is substantial, since it could allow early termination of the cyclical read
process, thus leading
to a significant decrease in the number of cycles required and thus in
analysis time, with
associated decreases in cost and increased throughput.
In some embodiments, larger numbers of TARGET peptides and STANDARDs are
used and need to be discriminated: for example, 25, or 50, or 100, or 200, or
400, or 600, or
800, or 1,000 TARGET peptides and their cognate STANDARDs; in such cases,
based on an
analysis of the uniqueness of the sequences, it may be desirable or required
that a detector
determine more of the peptide sequence, up to a complete sequence of some or
all of the
peptides.
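One way to analyze the uniqueness of a set of TARGET and STANDARD sequences is to compute the shortest read, starting from one terminus, at which every sequence in the set is distinguishable from the others. The sketch below is a simplified illustration that assumes error-free reads and ignores unselected contaminating peptides, so practical read lengths would be longer. The function name is hypothetical; the example pairs the TARGET LLGPHVEGLK with its -PP c-terminal variant discussed above.

```python
from typing import List, Optional


def min_distinguishing_read(peptides: List[str],
                            from_c_terminus: bool = False) -> Optional[int]:
    """Smallest number of terminal residues that separates all peptides.

    Returns None if the list is empty or contains duplicate sequences.
    A peptide that ends before the read length is reached is treated as
    distinguishable from a longer peptide sharing the same residues.
    """
    seqs = [p.upper()[::-1] if from_c_terminus else p.upper() for p in peptides]
    if not seqs or len(set(seqs)) != len(seqs):
        return None
    longest = max(len(s) for s in seqs)
    for n in range(1, longest + 1):
        if len({s[:n] for s in seqs}) == len(seqs):
            return n
    return None


if __name__ == "__main__":
    pair = ["LLGPHVEGLK", "LLGPHVEGLKPP"]
    print(min_distinguishing_read(pair))                        # read from n-terminus
    print(min_distinguishing_read(pair, from_c_terminus=True))  # read from c-terminus
```

For this pair the read direction matters greatly: reading from the terminus that carries the added residues separates TARGET from STANDARD at once, which is the rationale given above for placing the distinguishing residue at the end that is read first.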
In some embodiments, initial studies can be undertaken in which the TARGET
peptides
are sequenced beyond 3 or 4 residues, up to complete sequences, in order to
detect the presence
of any interfering peptides (i.e., peptides that share short n-terminal or c-
terminal sequences
with TARGET or STANDARD sequences, or otherwise generate output that can be
confused
with the pre-selected TARGET and STANDARD sequences) likely to be present in a
given
sample type. If such interfering sequences are commonly detected, deeper
sequencing can be
applied to distinguish the interfering sequences from TARGET or STANDARD
sequences
(i.e., sequencing up to or beyond the amino acid residue where the interfering
peptide is no
longer identical to a TARGET or STANDARD sequence), an approach with a high
probability
of success given that TARGET peptides are typically selected to be proteotypic
in the species
of interest.
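The sequencing depth needed to separate an interfering peptide from a TARGET or STANDARD sequence is the position of the first residue at which the two differ. A minimal sketch follows, assuming error-free reads from the n-terminus; the function name and the interfering sequence LLGAHVEGLK are hypothetical.

```python
from itertools import zip_longest
from typing import Optional


def first_divergence(reference: str, interferer: str) -> Optional[int]:
    """1-based residue position at which two sequences first differ.

    Returns None when the sequences are identical. If one sequence is a
    prefix of the other, the position just past the shorter one is returned.
    """
    pairs = zip_longest(reference.upper(), interferer.upper())
    for position, (a, b) in enumerate(pairs, start=1):
        if a != b:
            return position
    return None


if __name__ == "__main__":
    # Hypothetical interfering peptide sharing the first three residues.
    print(first_divergence("LLGPHVEGLK", "LLGAHVEGLK"))
```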
STANDARDs generated by modification of a TARGET amino acid sequence face
several challenges that motivate exploration of alternative approaches. These
include rarity
(the low probability of finding a modified sequence that binds equivalently to
a cognate
BINDER); lack of generality (the fact that each TARGET and cognate BINDER
represent a
separate case that must be individually optimized); and the fact that only
some single molecule
detection technologies are likely to be able to detect such a sequence
difference reliably.
6.4.5 Chemically modified peptides as standards
In some embodiments one or more amino acid residues of a TARGET peptide may be
modified to generate a STANDARD. A large number of non-canonical amino acids
that are
known in the biochemical literature can be substituted for residues of the
TARGET peptide or
added to its sequence. Likewise, a large number of naturally occurring
chemical modifications
of amino acids are known and can be introduced into residues of the TARGET
peptide during
or after synthesis to form a STANDARD. Similarly, a large number of artificial
chemical
modifications can be made to amino acids of the TARGET peptide to form a
STANDARD.
Two examples of small but significant modifications are terminal blockages: 1)
acetylation of
an n-terminal amino group or 2) amidation of a c-terminal carboxyl group, both
of which can
be carried out easily during synthesis of a STANDARD peptide having the same
sequence as
a cognate TARGET, and both of which represent small alterations in the peptide
structure.
These small alterations can be "read" at a later stage of a single molecule
workflow by reaction
of peptides with a chemical reagent capable of efficiently combining with
exposed amino or
carboxyl groups (respectively). In some embodiments making use of amino groups
to link
peptides to oligonucleotides, blockage of a STANDARD's c-terminal carboxyl can
prevent
reaction with a reagent that nevertheless reacts with a TARGET's c-terminus:
if the result of
reaction with the reagent (which may for example add polymer or other
structures to the
TARGET's structure) is detectable by a single molecule detector such as a
nanopore, then the
distinction between TARGET and STANDARD required by the invention can be
provided.
Any of these modifications may be used to create a STANDARD, provided that it
meets the
criteria described above (equivalent binding to a specific enrichment reagent,
and equivalent
reactivity in any required chemical reactions involved in sample preparation).
In some embodiments, for example those that employ nanopore sequencing, TARGET
and STANDARD molecules may be "read" completely during passage through a
nanopore,
reducing the potential for confusion between expected TARGET and STANDARD
sequences,
or with potentially interfering sequences. In some nanopore sequencing
embodiments capable
of halting the reading of a peptide after reading a small number of amino
acids and ejecting the
peptide from the nanopore based on confidently identifying it as a specific
TARGET or
STANDARD sequence, the uniqueness of n-terminal or c-terminal sequences
remains
important and provides an opportunity to reduce time spent on unproductive
sequence reading
and therefore increase throughput of molecule counting.
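The early-termination idea can be sketched, purely as an illustration, as a streaming classifier that narrows the set of candidate panel sequences as each residue is read and stops once exactly one candidate remains. The function name, the panel labels, the second panel sequence and the `min_residues` safety margin are assumptions made for this example; real residue calls carry uncertainty that this sketch does not model.

```python
def classify_streaming(read, panel, min_residues=3):
    """Identify a read from its leading residues.

    read: string of residues in the order they are detected.
    panel: dict mapping a label to an expected peptide sequence.
    Returns (label, residues_read); label is None when the read matches
    no panel member or remains ambiguous at the end of the read.
    """
    prefix = ""
    for i, residue in enumerate(read, start=1):
        prefix += residue
        candidates = [name for name, seq in panel.items()
                      if seq.startswith(prefix)]
        if not candidates:
            return None, i          # not a panel member: reading can stop
        if len(candidates) == 1 and i >= min_residues:
            return candidates[0], i  # confident call: reading can stop
    return None, len(read)


# Hypothetical panel; labels are illustrative only.
panel = {"TARGET_1": "GFVEPDHYVVVGAQR", "TARGET_2": "LSEPAELTDAVK"}
print(classify_streaming("GFVEPDHYVVVGAQR", panel))
```

The number of residues returned alongside the label indicates how much reading was actually needed, which is the quantity that early ejection aims to reduce.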
6.4.6 Discrimination of STANDARDs and TARGETs using members of different
"click"
chemical pairs
In some embodiments, STANDARDs and cognate TARGETs share an identical amino
acid sequence but differ in an attached chemical group. For example, STANDARDs
can
comprise a peptide sequence linked to one member of a pair of "click"
chemistry groups (e.g.,
TCO, capable of reacting bio-orthogonally with molecules comprising a
tetrazine group, the
other member of the click pair, or vice versa), while cognate TARGETs comprise
the same (or
very similar) peptide sequence linked to one member of a different pair of
"click" chemistry
groups (e.g., BCN, capable of reacting bio-orthogonally with molecules
comprising an azide
group, the other member of that click pair, or vice versa). Because the
components of the two
click pairs generally react only with the other pair member, but not between
click pairs, such
click-activated TARGETs and STANDARDs are generally inert until they encounter
a
molecule comprising the opposite pair member, at which time they spontaneously
react
forming a covalent linkage. Such click-activated TARGETs and STANDARDs are
therefore
each capable of reacting specifically with different additional molecules
(e.g., oligonucleotides
comprising the appropriate different click groups) at a later stage of a
sample preparation
workflow. In some embodiments, TARGETs and STANDARDs comprise different
chemical
linkage groups (e.g., selected from the above-mentioned click pairs) connected
to the peptide
by similar or identical spacers (e.g., polyethylene glycol of length 1, 2, 3,
4, 5 or more polymer
units) thus reducing any potential impact of the difference in chemical linker
structures (e.g.,
TCO and BCN as mentioned above) on the relative binding of TARGETs and
STANDARDs
to a cognate BINDER.
6.4.7 Identification of STANDARDs by linkage to non-peptide flags or barcodes
In some embodiments a peptide is attached to another molecule to label it as a
TARGET
vs a STANDARD, to barcode it (e.g., to identify the sample from which it
came), to facilitate
or regulate its passage through a nanopore, or a variety of other purposes
useful in a single
molecule detection workflow. In some embodiments the distinction between
TARGETs and
STANDARDs is encoded in an attached, non-peptidic "tag" component rather than
in the
peptides' structures themselves or in chemical linkage groups (e.g., click
groups) they
comprise. In some embodiments this is accomplished by preparing the STANDARD
prior to
its addition to a sample digest in a form that is already attached to a
detectable tag (e.g., a
nucleic acid sequence tag) that specifically indicates its status as a
STANDARD. In an
example of such an embodiment shown in Figure 3A, an oligonucleotide VEHICLE
comprises
a 5' phosphate 52 (to facilitate ligation with other nucleic acid chains), a
preceding sequence
29, a residue 33 (indicated by X) capable of forming a linkage 34 with a
terminal residue of
peptide 52 (in this case a STANDARD peptide having the same sequence as a
cognate
TARGET), an abasic stretch 36 running alongside the peptide (forming a rope-
tow construct
as described herein), and a following sequence 30 comprising a tag sequence 54
(indicated by
a box) that identifies the construct as containing a STANDARD peptide. In the
example
shown, the peptide GFVEPDHYVVVGAQR is a member of the class of "single amino"
peptides, and thus comprises only a single amino group which is located at its
n-terminus.
Cognate TARGET peptide 53 (example shown in Figure 3B) in such an embodiment
is
attached to a VEHICLE of similar overall structure as the STANDARD construct,
but
comprising a different nucleic acid sequence tag 55 that indicates its status
as TARGET.
During single molecule detection, the VEHICLE nucleic acid sequences can be
read by a
nanopore and their location in relation to a peptide (e.g., preceding or
following with pre-
determined proximity) can be used to identify each peptide molecule as a
TARGET or
STANDARD molecule. The overall similarity of the VEHICLEs attached to the pre-
prepared
STANDARD (Figure 3A) and sample digest-derived TARGET peptides (Figure 3B)
minimizes any potential difference in binding of the peptide portions of the
constructs
(TARGET and STANDARD peptides 52 and 53 being structurally identical) to
cognate
peptide sequence-specific BINDERs. In some embodiments the sequence tags
distinguishing
the TARGET and STANDARD VEHICLEs are optimized for high sequence accuracy in a
given sequence-specific detection system (e.g., a nanopore reading system, or
an affinity
reagent imaging system).
The primary function of STANDARD tags and TARGET tags is to distinguish
peptide
constructs added to a sample as internal standards (STANDARD constructs) from
peptide
constructs that incorporate peptides created by proteolytic digestion of the
sample proteins
(TARGET constructs). In some embodiments the TARGET tag may be the absence of
any
STANDARD tag. In some embodiments the STANDARD tag may be the absence of any
TARGET tag. A STANDARD tag may consist of the absence of a feature present in
a cognate
TARGET tag, the presence in the STANDARD tag of a feature absent in the
cognate TARGET
tag, or the presence of different features in the STANDARD tag and TARGET tag.
In some
embodiments, such presence/absence features may include differences in the
sequence of
oligonucleotide tags. The importance of maintaining unbiased (unskewed) ratio
relationships
between TARGET and STANDARD constructs (i.e., preserving their cognate
character,
specifically in regard to interaction with a cognate BINDER) argues against
large structural
differences between the TARGET and STANDARD tags (e.g., presence vs absence of
a
sizable chemical group). Multiple different STANDARD tags may be used,
provided that the
STANDARD tags are distinguishable from any TARGET tags. Multiple different
TARGET
tags may be used, provided that the TARGET tags are distinguishable from any
STANDARD
tags.
In some embodiments each different peptide STANDARD is prepared attached to a
respective cognate VEHICLE that comprises a nucleic acid sequence tag that
specifically
identifies that STANDARD amino acid sequence and distinguishes it from a
plurality of other
STANDARDs that may be used in the same workflow. Use of different nucleic acid
sequence
tags for each STANDARD provides an orthogonal method for identifying these
peptides, and
this information makes it possible to assess the reliability of both methods
(i.e., the degree of
agreement between the peptide detection and tag detection), and to optimize
the respective
detection methods to improve accuracy.
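As a hedged illustration of assessing agreement between the two identification routes, the sketch below computes the fraction of molecules for which the peptide-based call and the tag-based call coincide, and lists the discordant molecules for follow-up. The function name, the call labels and the example data are hypothetical.

```python
def tag_peptide_agreement(calls):
    """Summarise agreement between two identification methods.

    calls: list of (peptide_call, tag_call) pairs, one per molecule.
    Returns (fraction_in_agreement, list_of_discordant_pairs).
    """
    if not calls:
        raise ValueError("at least one molecule is required")
    discordant = [(p, t) for p, t in calls if p != t]
    fraction = (len(calls) - len(discordant)) / len(calls)
    return fraction, discordant


# Hypothetical calls for four molecules.
calls = [("STD_A", "STD_A"), ("STD_B", "STD_B"),
         ("STD_A", "STD_B"), ("STD_C", "STD_C")]
fraction, discordant = tag_peptide_agreement(calls)
print(fraction, discordant)
```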
In some embodiments the oligo tag sequences used to identify and distinguish
cognate
STANDARDs and TARGETs in a cognate group are selected so as to be chemically
very
similar (e.g., same length and base composition) while being reliably
distinguishable (e.g.,
different base sequence). By being chemically similar, and not located in
close proximity to
the peptide, the tags are unlikely to have any differential effect on the
binding of cognate
TARGET and STANDARD molecules to the cognate BINDER, thus preserving the ratio
of
TARGET to STANDARD in the standardized digest. By having distinguishable
sequences,
as detected by any of the single molecule methods herein, the TARGET and
cognate
STANDARD molecules can be identified and counted reliably, thus providing an
accurate
value for the ratio of TARGET to STANDARD.
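The two selection criteria (chemical similarity and reliable distinguishability) can be checked with a minimal sketch such as the following, which compares length and base composition and counts the positions at which two tags differ. The tag sequences and function names are hypothetical and chosen only for illustration; a practical design would also consider the error profile of the detector in use.

```python
from collections import Counter


def same_length_and_composition(tag_a, tag_b):
    """True if both tags have the same length and the same count of each base."""
    return len(tag_a) == len(tag_b) and Counter(tag_a) == Counter(tag_b)


def hamming_distance(tag_a, tag_b):
    """Number of positions at which two equal-length tags differ."""
    if len(tag_a) != len(tag_b):
        raise ValueError("tags must have equal length")
    return sum(a != b for a, b in zip(tag_a, tag_b))


# Hypothetical STANDARD and TARGET tags.
standard_tag = "ACCGTTAG"
target_tag = "AGCTTCAG"
print(same_length_and_composition(standard_tag, target_tag))
print(hamming_distance(standard_tag, target_tag))
```

A larger distance between tags of matched composition gives more tolerance to read errors without introducing a chemical difference between the constructs.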
In some embodiments STANDARD-VEHICLE constructs (e.g., Figure 3A) are
prepared and added to the sample digest after sample digest peptides have been
incorporated
into similarly-structured TARGET constructs (e.g., Figure 3B). In these
embodiments, the
structure of the STANDARD peptide molecule may be identical to the TARGET
peptide
structure (e.g., it can be a synthetic version of a known cognate TARGET
peptide sequence),
while their respective STANDARD and TARGET VEHICLEs comprise distinct nucleic
acid
sequence tags (54 and 55 in Figure 3), thus ensuring that the cognate BINDER
will bind the
attached peptides equivalently, and thereby accurately preserve the TARGET-to-
STANDARD
ratio present in a standardized sample digest.
Given such an encoding scheme to identify STANDARDs and TARGETs, the method
requires that these peptides be joined to their respective tags prior to
enrichment and selection
using the cognate BINDERs. A specific advantage of this approach is the
identical structures
of cognate TARGET and STANDARD peptides, and thus the high likelihood that the
cognate
BINDER binds them with identical affinity and kinetics, thus preserving the
TARGET-to-
STANDARD ratio present in the sample digest. The potential disadvantage of
this approach
is the expense involved in using sufficient TARGET vehicles to incorporate all
the digest
peptides. A further simplification of this approach to address this issue is
described below.
Some embodiments make use of a further simplification of the VEHICLE-encoding
scheme shown in Figure 3 and described above. Figure 4 shows a method in which
a short
identifying oligo (STANDARD tag 62) is attached to a STANDARD peptide 52, in
this case
by an amine-reactive N-hydroxysuccinimide (NHS) group 61 attached by linker 34
to a
suitable DNA nucleotide of the oligo (for example an amino-modified C6 dT base
to which
NHS functionality has been added during manufacture). The 16 base long oligo
tag 62 has a
molecular weight of about 5,000 daltons, substantially less than the VEHICLEs
described in
the embodiment shown in Figure 3, and therefore less expensive and also able
to diffuse and
bind BINDERs more rapidly in solution. Those skilled in the art will be able
to design
oligonucleotide tag sequences of reduced (or longer) length capable of
specifically hybridizing
with complementary sequences as required in the steps of Figure 4 C and D.
TARGET and
STANDARD oligonucleotide tags (e.g., 62 or 63) may be provided of lengths
ranging from 4
to 30 bases, more preferably 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, or 16-30
bases. Oligo tag
sequences are designed according to well-known principles to maximize specific
binding to a
complementary sequence (e.g., 64 or 68) and dissociate at a reasonable
(melting) temperature,
while minimizing the potential to hybridize with other oligo sequences used in
the workflow,
or to form self-associations (e.g., intra-molecule hairpins or inter-molecular
hybrids).
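Some of these design checks can be sketched as follows. This is an illustrative example only: the melting temperature uses the simple Wallace rule of thumb (2 degrees C per A or T, 4 degrees C per G or C), which is a rough approximation for short oligos and is an assumption not taken from this description, and potential self-association or cross-hybridization is approximated by the longest run of one sequence that is perfectly complementary to a run in another. Function names and the example tag are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def length_ok(tag, shortest=4, longest=30):
    """Check the tag against the 4 to 30 base range given in the text."""
    return shortest <= len(tag) <= longest


def reverse_complement(seq):
    """Sequence that would hybridize antiparallel to seq."""
    return seq.translate(COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Rule-of-thumb melting temperature in degrees C (assumption, see lead-in)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def longest_common_substring(a, b):
    """Length of the longest contiguous run shared by a and b."""
    best = 0
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0] * (len(b) + 1)
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def longest_complementary_run(tag, other):
    """Longest run of tag that can base-pair with a run of other.

    Passing the same sequence twice gives a crude indicator of hairpin or
    self-dimer potential.
    """
    return longest_common_substring(tag, reverse_complement(other))


tag = "ACCGTTAG"  # hypothetical tag
print(length_ok(tag), reverse_complement(tag), wallace_tm(tag))
print(longest_complementary_run(tag, tag))
```

Candidate tags with long complementary runs against themselves or against other oligos in the workflow would be rejected in favor of alternatives.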
The product of the reaction of the oligo tag's NHS and the peptide's NH2
groups of
Figure 4A is shown in Figure 4B, while the equivalent reaction for TARGET
peptide 53 with
TARGET tag 66 is shown in Figure 4 E and F. The STANDARD construct of Figure
4B is
added to a sample digest (in this example a tryptic digest) whose sample
peptides (including
the TARGET) are prepared as TARGET-tag constructs (as in Figure 4F), thus
forming a
standardized sample digest. In order to preserve the quantitative relevance of
the TARGET
amount, it is desirable that the reaction of peptide 53 with VEHICLE 66 goes
to completion
(or if not, that the proportion of the TARGET peptide incorporated into
construct of Figure 4F
is consistent between samples).
At this point TARGET and STANDARD peptides are linked to, and thus identified
by,
oligo sequence tags of the respective TARGET and STANDARD tags (oligos 62 and
66 in
Figure 4). Enrichment of the TARGET and STANDARD peptides using the cognate
BINDER
isolates these two peptides and their attached respective oligo tags (the
peptides having
identical structures but derived from different sources), preserving the
TARGET-to-
STANDARD ratio in the standardized sample. Subsequent to the enrichment step,
after
removal of unbound sample peptide constructs, the enriched bound TARGET and
STANDARD constructs can be "completed" by hybridization and ligation to
respective
"secondary VEHICLEs" as needed for various single molecule detection methods.
Completed
constructs shown in Figures 4 C and D (for the STANDARD peptide) and Figure 4
G and H
(for the TARGET peptide) are particularly useful in the case of nanopore
sequencing. In
summary, short oligo tag 62 (which identifies peptide 52 as a STANDARD
molecule)
hybridizes with a complementary sequence 64 bringing the 3' residue of the
oligo 62 into
proximity with the phosphorylated 5' end of an oligo comprising an abasic
region 36 (abasic
backbone links are symbolized by "o") and a following sequence 63. In some
embodiments
the 5' terminus of the secondary VEHICLE comprises one or more bases before
the abasic
region (e.g., AA at the 5' end of secondary VEHICLE oligo 36 in Figure 4) that
are capable of
hybridizing with complementary bases in oligo 70 (TT in oligo segment 70 of
Figure 4D), and
these bases are different in STANDARD and TARGET secondary VEHICLEs so as to
minimize potential hybridization of a secondary STANDARD oligo (e.g., 36 + 63)
with a
TARGET complementary sequence 70, and vice versa. The number of such
hybridizing bases
at the 5' end of segments 36 may be selected to have an extended length that
is less than the
total length of linker 34 in order to avoid overlap of the peptide with these
hybridizing bases
when the complete construct passes through a nanopore. Similarly, the length
of abasic region
36 is chosen so as to avoid overlap with the peptide 52 as it transits a
nanopore, and oligo
sequence 63 is designed so as to engage with a DNA motor and regulate pore
transit of the
peptide, allowing measurement of its current trace ("squiggle"). To optionally
further
reinforce the identification of the peptide as a STANDARD, oligo sequence 63
may be a unique
sequence also indicating status as a STANDARD molecule (i.e., different from
the sequence
of oligo 67 in the TARGET VEHICLE construct) resulting in a "double-tag"
(i.e., redundantly
tagged) construct. An advantage of the double-tag approach is that each
peptide is identified
as STANDARD or TARGET by sequence information both before and after the
peptide.
In some embodiments, alternative linkage chemistries, including various click
chemistry linkages as described elsewhere herein, are used to create a linkage
between a
peptide and a nucleic acid tag. Figure 5 shows an example in which a tag oligo
prepared with
a 3' transcyclooctyne (TCO) group is reacted with a peptide derivatized at its
n-terminus with
methyltetrazine (Me-TZ) to yield an "in-line" construct (one in which the
oligo and peptide
sequences form a single continuous polymer). Figure 5A shows preparation of a
TARGET
construct using a tag sequence identifying the construct as a TARGET (labeled
OLIGO-
TARGET tag). In some embodiments, this modification (labeling peptides with
the OLIGO-
TARGET tag) is applied to all peptides in a sample digest (for example by
derivatizing all
digest peptide amino groups with NHS-tetrazine and reacting these with an
excess of the
reactive oligo comprising 3' TCO), thereby ensuring that all molecules of the
TARGET are
taken into account in subsequent steps of an analytical workflow. Figure 5B
shows preparation
of a STANDARD construct using a tag sequence identifying the construct as a
STANDARD
(i.e., the OLIGO-STANDARD tag). In some embodiments, the STANDARD construct is
prepared separately, for example using synthetic STANDARD peptide having the
same (or
generally similar) sequence as the TARGET, and is added to sample digests to
serve as an
internal quantitative standard (i.e., to standardize the sample digests). In
some embodiments
the physical properties of the TARGET and STANDARD oligo tags (e.g., length,
base
composition, and melting temperature) are selected to be similar so as to
minimize the impact
of their different sequences (necessary to distinguish them during peptide
single molecule
analysis) on relative binding to the cognate BINDER. Figure 5C shows TARGET
and
STANDARD constructs bound to cognate antibody BINDER molecules immobilized on
a
magnetic bead. Since the difference between TARGET and STANDARD is encoded by
the
oligo component, the peptide components can be chemically identical (same
sequence) capable
of identical interaction with the BINDER. In some embodiments, the peptide is
linked to the
oligo via its c-terminus (e.g., via a c-terminal carboxyl and the epsilon
amino group of a c-term
lysine) rather than the n-term as shown. In some embodiments the oligo is
linked to the peptide
via the 5' end rather than the 3' as shown. These and other functionally
equivalent physical
arrangements will be apparent to those skilled in the art.
In some embodiments, such as nanopore sequencing applications, that may
require
addition of molecular components such as oligonucleotides to both ends of a
peptide,
STANDARD constructs can be prepared in a stepwise process comprising 2
discrete
attachment steps, thereby allowing different molecular components to be added
at (or near) the
peptide N-terminus and C-terminus, and establishing a consistent orientation
of the peptide
with respect to the overall construct (thereby avoiding the need to recognize
a peptide in two
different polarities). In some embodiments the 2-step digestion process
described herein for
preparing sample peptide libraries is used to generate STANDARD constructs. In
this case, a
synthetic peptide comprising the cognate TARGET sequence ending in Lys (i.e.,
a 2-amino
peptide), and comprising an added n-terminal sequence ending in Arg (e.g.,
GSGR in the case
of trypsin second cleavage, or any suitable peptide ending in the amino acid
at whose c-
terminal position a second proteolytic enzyme cleaves), can be processed in a
series of steps
similar to those used to process sample peptides as shown in Figure 6. For
example, for a
generic TARGET peptide sequence XXXXXXK, a synthetic STANDARD precursor can be
generated with the sequence XXXRXXXXXXK. This peptide has both an n-terminal
and a
lysine epsilon amino group, both of which can be reacted with a suitable
linkage reagent such
as NHS-BCN, thus adding a click linker to both ends of the peptide, after
which excess NHS-
BCN can be removed. A suitably activated flag group (e.g., an oligonucleotide)
can be linked
to the peptide at this point using an appropriate click partner (e.g., azide
to react with BCN) in
order to identify the STANDARD (either generically using the same flag for all
STANDARDs,
or specifically, using a flag that identifies which STANDARD peptide sequence
is involved).
Cleavage with a second enzyme (e.g., trypsin) then reveals a fresh n-terminal
amino group (in
the peptide XXXXXXK) which is available to react with a second linkage
reagent such as
NHS-TCO, thereby activating the n-terminus for potential connection to a
second tetrazine-
activated oligo via the TCO-tetrazine click reaction. Use of different
mutually orthogonal click
groups at the two termini offers the ability to add distinct oligos to the two
ends via the distinct
click pair reactions, and to postpone one or both oligo additions until after
BINDER enrichment
of the TARGET and cognate STANDARD peptides. Likewise, one or both of the
oligos may
be coupled to the peptide prior to enrichment on a BINDER, provided that the
same process is
applied to the peptides of the digest (including TARGET peptides) prior to
enrichment by a
BINDER. It will be apparent to those skilled in the art that alternative
linkage chemistries,
including different click pairs or application of click pairs in alternate
order, can be employed;
that the steps of adding the first oligo for click linkage and second cleavage
can be carried out
in either order, and that the addition of the second linker group and/or
second oligo may be
delayed until after the BINDER enrichment step or omitted altogether when the
single
molecule sequencing and detection means requires an available n-terminal amino
group (e.g.,
for Edman degradation).
An important advantage of carrying out BINDER enrichment after assembly of
peptide-oligo constructs such as those shown in Figures 3, 4, 5 and 7 is that
constructs taken
forward for sequencing after enrichment are very likely to contain a peptide
(otherwise they
would not be bound by and recovered from the BINDER).
In some embodiments, a TARGET peptide, its cognate STANDARD and the cognate
specific affinity BINDER reagent(s) (i.e., those molecules used together in
the present
invention to quantitate the TARGET peptide) may be developed together (co-
evolved or co-
optimized) to achieve optimal performance; i.e., through an iterative process
comparing assay
performance of various combinations of versions of the reagents, through a
full-matrix
comparison of all available variants of each, or by molecular engineering
guided by chemical
knowledge and experimental results.
6.4.8 Multi-level standards
In some embodiments, multiple distinguishable STANDARDs are provided for a
single
TARGET, and added in different amounts so as to establish a standard curve
against which the
TARGET can be quantitated. In such embodiments, the multiple STANDARDs may be
distinguished by connection to distinct oligo sequence tags, or they may
contain different and
distinguishable structural modifications of the TARGET peptide (e.g.,
different amino acids
added to its sequence). The different STANDARDs may be added to a sample
digest in
different amounts to standardize it: for example, three STANDARD versions (A,
B, C) of a
given TARGET peptide may be added to a digest in 0.1 : 1.0 : 10 relative
amounts, thus
generating a 3-point calibration curve. Multi-level STANDARDs increase the
likelihood that
at least one STANDARD will be present in an amount (and thus number of
molecules) close
to the amount of a TARGET in an unknown sample. Likewise, the pre-established
ratio
between different STANDARD versions provides an internal check on the
quantitative
precision and linearity of a single molecule detection system.
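As a non-limiting sketch of how a 3-point calibration might be evaluated, the example below fits a straight line to log10(count) versus log10(amount) for three STANDARD versions added at 0.1 : 1.0 : 10 relative amounts, then reads the TARGET amount off that line. The molecule counts are invented for illustration, the function names are hypothetical, and a log-log least-squares fit is only one possible choice of calibration model.

```python
import math


def fit_log_log(amounts, counts):
    """Least-squares line through (log10 amount, log10 count) points.

    Returns (slope, intercept). A slope near 1 indicates a linear response.
    """
    xs = [math.log10(a) for a in amounts]
    ys = [math.log10(c) for c in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_from_count(count, slope, intercept):
    """Invert the fitted line to estimate an amount from a molecule count."""
    return 10 ** ((math.log10(count) - intercept) / slope)


# STANDARD versions A, B, C at 0.1 : 1.0 : 10 (relative amounts, from the text);
# the counts below are hypothetical.
standard_amounts = [0.1, 1.0, 10.0]
standard_counts = [100, 1000, 10000]
slope, intercept = fit_log_log(standard_amounts, standard_counts)
print(slope, amount_from_count(2500, slope, intercept))
```

The fitted slope also serves as the internal linearity check mentioned above: a marked departure from 1 suggests the counts are not proportional to the amounts added.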
6.5 SPECIFIC AFFINITY REAGENTS (BINDERS) TO CAPTURE AND ENRICH
PEPTIDE TARGETS AND STANDARDS.
One or more specific affinity reagents, capable of binding the peptide TARGET
and
STANDARD specifically (i.e., while not binding a potentially vast number of
other peptides
that can be present in a sample digest) are used in some embodiments of the
invention to
capture the TARGET peptide and STANDARD prior to single molecule detection. We
refer
to such a reagent generically as a BINDER, and include within that term not
only canonical
antibodies such as IgG, but also numerous types of proteins and other
macromolecules (e.g.,
aptamers) known in the art to be able to bind to particular peptide sequences
with specific
affinity.
Experience with the SISCAPA technology (3, 9, 13-15) has shown that
antibodies,
preferably monoclonal antibodies, can be developed that bind and capture a
specific low
abundance tryptic peptide from the digest of a very complex sample such as
human blood
plasma (which may contain 250,000 distinct peptides, some at very high
abundance), and
thereby enrich the peptide substantially (e.g., more than 10,000-fold). A
variety of types of
biologically derived antibodies (e.g., polyclonal, monoclonal and oligoclonal
antibodies
derived from mice, rabbits, humans, camelids and other species), molecules
derived from
antibodies by molecular biology techniques (e.g., antibodies selected from
libraries using
phage display and other techniques), aptamers, and other molecular constructs
can be created
to achieve the purpose of specific peptide binding; all of these are included
within the term
BINDER as used herein.
6.5.1 Antibody BINDERs
Many methods exist for generating antibodies to a peptide in animals. For
example, a
synthetic peptide having the TARGET peptide sequence can be coupled to a
carrier protein
(e.g., keyhole limpet hemocyanin: KLH) and used to immunize an animal (such as
a rabbit,
mouse, chicken, goat, camelid or sheep) by one of the known protocols that
efficiently generate
anti-peptide antibodies. For convenience, the peptide used for immunization
and antibody
purification may contain additional c-terminal or n-terminal residues (e.g.,
cysteine) added to
the TARGET peptide sequence. The resulting extended TARGET peptide can be
conveniently
coupled to carrier KLH that has been previously reacted with a
heterobifunctional reagent such
that multiple SH-reactive groups are attached to the carrier. In classical
immunization with the
peptide (now constituting a hapten on the carrier protein), a polyclonal
antiserum can be
produced containing antibodies directed to the peptide, to the carrier, and to
other non-specific
epitopes.
Specific polyclonal anti-peptide antibodies can be prepared from an immunized
animal's serum by affinity purification on a column containing tightly-bound
peptide. Such a
column can be easily prepared by reacting an aliquot of synthetic TARGET
peptide made with
a cysteine residue added to one end with a thiol-reactive solid support. Crude
antiserum can
be applied to this column, which is then washed and finally exposed to 10%
acetic acid (or
other elution buffer of low pH, high pH, or high chaotrope concentration) to
specifically elute
anti-peptide antibodies. These antibodies are neutralized or separated from
the elution buffer
(to prevent denaturation), and the column is recycled to physiological
conditions for
application of more antiserum if needed.
In some embodiments, one of a variety of methods known in the art (e.g., B-
cell
cloning, hybridoma generation, recombinant expression, etc.) is used to
generate candidate
clonal antibody proteins, genes or gene sequences (from the cells, genes, or
proteins of an
immunized animal, or from natural or artificial protein binder libraries such
as phage display
libraries) that can be screened to select a monoclonal antibody (or other
BINDER molecule)
with the ability to enrich the TARGET peptide and cognate STANDARD from a
complex
peptide mixture (e.g., a sample digest) under a specified set of solution
conditions. Such
screening can be carried out using the method of the invention (i.e.,
screening in the assay of
ultimate use), or by alternative methods such as the SISCAPA method with MS
detection.
Monoclonal antibodies, particularly those produced by recombinant methods,
have the
advantages of homogeneity, superior performance, scalable production and
longevity
compared to polyclonal mixtures.
In some embodiments a preferred method of selecting a monoclonal (homogeneous)
anti-peptide antibody BINDER for use in the invention includes testing whether
a candidate
antibody (e.g., product of a clone) binds the TARGET peptide and STANDARD
equally, and
selecting for use the antibody product of a clone or clones that bind these
peptides most equally
(i.e., with least bias towards one or the other, and thus capable of capturing
both from a mixture
without changing the ratio between them). Preserving the ratio of TARGET to
STANDARD
unchanged during capture and enrichment by the BINDER is desirable since the
invention
involves measuring this ratio (by counting TARGET peptide and STANDARD
molecules) to
calculate the amount of TARGET peptide (and hence target protein) in the
original sample. In
the event that BINDERs are not found that preserve the TARGET to STANDARD
ratio
precisely, some embodiments make use of BINDERs that exhibit differential, but
reproducible, binding, thus allowing correction of the ratio as described
above.
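The ratio-based calculation can be illustrated with a minimal sketch. It assumes that the amount of STANDARD added is known, that molecule counts are proportional to amounts, and that any reproducible binding bias can be expressed as a single factor (the ratio of TARGET to STANDARD capture efficiency); the function name, parameter names, units and numerical values are all hypothetical.

```python
def target_amount(n_target, n_standard, standard_amount, capture_bias=1.0):
    """Estimate TARGET amount from single molecule counts.

    n_target, n_standard: counted TARGET and STANDARD molecules.
    standard_amount: known amount of STANDARD added (any unit).
    capture_bias: TARGET-to-STANDARD capture efficiency ratio for the
        BINDER (1.0 means the ratio is preserved exactly).
    """
    if n_standard <= 0:
        raise ValueError("at least one STANDARD molecule must be counted")
    if capture_bias <= 0:
        raise ValueError("capture_bias must be positive")
    observed_ratio = n_target / n_standard
    corrected_ratio = observed_ratio / capture_bias
    return corrected_ratio * standard_amount


# Hypothetical counts with 50 units of STANDARD added.
print(target_amount(1200, 400, 50.0))
print(target_amount(1200, 400, 50.0, capture_bias=1.2))
```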
6.5.2 Multi-epitope BINDERs
In some embodiments multiple BINDERs are used to enrich a single peptide. In
some
embodiments, TARGET peptides are sufficiently long to include multiple
epitopes: a typical
antibody linear epitope is 4-6 amino acids long, while TARGET peptides may be
6-30 amino
acids in length; those of length 12-30 amino acids have a high likelihood of
comprising 2 or
more non-overlapping epitopes. In some embodiments, multiple BINDERs targeting
linear,
non-overlapping epitopes in a peptide can be generated and used to increase
enrichment
specificity. In some embodiments monoclonal antibodies to multiple linear, non-
overlapping
epitopes of a peptide are made using conventional hybridoma or other cloning
techniques to
select among clones created by immunization of an animal with peptide and/or
derivatives of
it. In some embodiments, antibodies to multiple linear, non-overlapping
epitopes of a peptide
are selected from libraries such as naive or immunized phage display libraries
of single-chain
antibodies. In some embodiments, aptamers to multiple linear, non-overlapping
epitopes of a
peptide are selected from libraries or evolved using iterative selection
approaches well-known
in the art. In some embodiments, multiple BINDERs of different types (e.g.,
polyclonal or
monoclonal antibodies, aptamers, etc.) targeting linear, non-overlapping
epitopes are used
together for increased affinity and/or specificity.
In many cases, the cost involved in creation of multiple monoclonal antibodies
or
aptamers for each TARGET peptide can present a practical barrier to such an
approach. Some
embodiments therefore make use of polyclonal antibodies purified from sera of
immunized
animals, or, alternatively, oligoclonal mixtures of antibody-like molecules
extracted from large
libraries (e.g., naive or immunized phage display libraries). In some
embodiments methods
well known in the art are used for affinity purification of multiple distinct
antibody specificities
from such polyclonal antisera using multiple affinity media, each comprising a
linear peptide
subsequence of a TARGET peptide sequence.
In some embodiments, multiple BINDERs with affinities for distinct, ideally
non-
overlapping, epitopes on a TARGET peptide are simultaneously affinity purified
by binding
to synthetic TARGET peptide. In this approach, a TARGET peptide with multiple
epitopes
binds multiple BINDERs from a mixture until each epitope is saturated with
BINDER, thereby
establishing a balanced mixture of BINDERs to the epitopes. Figure 8B shows 3
BINDERs
(I, II and III) bound to different linear epitopes (1, 2, 3) on a single
peptide molecule (the
peptide shown in Fig 8A). In some embodiments, this is achieved by affinity
purification of
BINDERs from a polyclonal antiserum (or pool of antisera) generated in
response to
immunization with the peptide, or multiple fragments of it. In some
embodiments, this is
achieved by capture of a mixture of BINDERs from a variety of sources selected
to bind to the
peptide, or fragments of it. By saturating the TARGET peptide epitopes with
BINDERs
competing to bind to various epitopes, a population of BINDERs is produced
that covers the
peptide (i.e., has a BINDER bound to all, or at least a substantial fraction,
of the epitopes
present on each peptide molecule). The BINDERs are expected to have varying
affinities and
specificities, some or most of which may be individually insufficient to
effectively capture the
peptide from a sample digest. However, in some embodiments, a combination of
these
BINDERs can be linked together covalently (e.g., using bi-functional chemical
crosslinkers or
click connectors) or non-covalently (e.g., by reaction of biotin-labeled
BINDERs with
multivalent streptavidin) into a multivalent (e.g., bi-specific, tri-specific,
or quadra-specific)
BINDER (Fig 8C), that can be eluted from the peptide and subsequently used for
affinity
enrichment of TARGET and cognate STANDARD from standardized sample digests.
This
novel approach makes use of a multi-epitope peptide as a "scaffold" or
"template" on which a
plurality of BINDERs to short epitopes are assembled and then linked to form a
larger multi-
epitope BINDER. In some embodiments a multi-epitope (or multi-valent) BINDER
is
immobilized (e.g., on magnetic beads, a column, a surface, etc.) and used to
capture and enrich
TARGET and STANDARD peptides and peptide constructs according to the
invention.
Such a multi-valent BINDER can bind the peptide with much higher affinity than
any
of the individual BINDERs. This effect is well known in the art as "avidity":
the increased
binding efficacy of a multivalent BINDER compared to a monovalent BINDER.
Natural
antibodies exploit this effect, comprising 2 binding sites in IgG and 5
binding sites in IgM. In
some embodiments, this avidity effect is exploited by crosslinking the
individual epitope
BINDERs while in proximity to one another (in situ in Figure 8C) to create a
multi-BINDER
construct as shown in Figure 8D. In some embodiments, a similar approach is
carried out using
individual monoclonal BINDERs, either crosslinked together or expressed in a
combined
recombinant product similar to well-known "bi-specific" or "tri-specific"
therapeutic antibody
constructs.
In some embodiments, multiple BINDERs with affinities for distinct, ideally
non-
overlapping, epitopes on a TARGET peptide are separately affinity purified by
binding one
after another to a series of immobilized synthetic TARGET peptides. In some
embodiments,
multiple BINDERs with affinities for distinct, ideally non-overlapping,
epitopes on a TARGET
peptide are used to sequentially affinity purify TARGETs and cognate STANDARDs
from a
standardized digest: TARGETs and cognate STANDARDs that bind to (and are
subsequently
eluted from) each successive BINDER specific to one of multiple TARGET
epitopes are
rendered purer than the set of TARGETs and cognate STANDARDs that bind to and
are eluted
from a BINDER to a single epitope.
In some embodiments, the higher peptide sequence specificity obtainable with
multi-
epitope BINDERs (through their interaction with a larger number of amino acids
in the peptide
portions of TARGETs and STANDARDs) provides an important addition to the
overall
specificity of a range of single molecule detection technologies. By
increasing the enrichment
discrimination between authentic pre-selected TARGET/STANDARD peptide
molecules and
the multitude of potential similar-sequence "off-target" peptides present in
digests of complex
samples, potential assay interferences (particularly false-positive detection
of TARGET-like
peptides) are reduced for all single molecule detection methods.
Multi-epitope BINDERs are described in more detail in U.S. Provisional Patent
Application No. 63/381,722, incorporated herein by reference in its entirety.
6.5.3 Enrichment of TARGETs and STANDARDs from standardized sample digests
The invention makes use of a specific enrichment step to enrich both TARGET
peptide
and cognate STANDARD from a sample digest, thereby creating an "enriched
standardized
digest sample" (or "enriched standardized sample digest"). In some embodiments
the
enrichment step is carried out by a cognate affinity capture reagent (BINDER)
to which the
two peptides bind equivalently, such that the ratio of their amounts after
enrichment is the same
or nearly the same as before enrichment. In some embodiments, the TARGET and
STANDARD peptides bind to the cognate affinity capture reagent with identical
affinity and
kinetics, preserving the ratio between them exactly. In some embodiments the
TARGET to
STANDARD ratio after enrichment is within 2%, within 5%, within 10%, within
20%, or
within 30% of the ratio before enrichment. In some embodiments, enrichment
using a
BINDER results in a change in the TARGET to STANDARD ratio, which change is
consistent
across a range of samples and assay replicates. In this case, prior knowledge
of the factor by
which the ratio is changed by enrichment (established by measurements of
TARGET to
STANDARD ratios in a sample or digest before and after enrichment) allows
correction of the
ratio observed after enrichment to yield the correct relative amounts of
TARGET and
STANDARD in the sample digest.
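The ratio correction described above can be illustrated with a minimal Python sketch, which is not part of the original disclosure. The function names, the calibration ratios (1.00 before and 0.80 after enrichment), the molecule counts and the 50 fmol STANDARD spike are hypothetical values chosen only for the example.

```python
def correction_factor(ratio_before: float, ratio_after: float) -> float:
    """Factor by which enrichment shifts the TARGET/STANDARD ratio.

    Assumed to be established beforehand from measurements made before
    and after enrichment, and to be reproducible across samples.
    """
    return ratio_after / ratio_before


def target_amount(target_count: int, standard_count: int,
                  standard_amount: float, factor: float = 1.0) -> float:
    """Amount of TARGET in the digest, in the units of standard_amount."""
    observed_ratio = target_count / standard_count
    corrected_ratio = observed_ratio / factor  # undo the enrichment bias
    return corrected_ratio * standard_amount


# Hypothetical calibration: the BINDER under-recovers TARGET by 20%.
factor = correction_factor(ratio_before=1.00, ratio_after=0.80)

# Hypothetical assay: 400 TARGET and 1000 STANDARD molecules counted,
# with 50 fmol of STANDARD added to the digest.
amount = target_amount(400, 1000, standard_amount=50.0, factor=factor)
print(f"corrected TARGET amount: {amount:.1f} fmol")
```

With a BINDER that preserves the ratio, the factor is 1.0 and the observed ratio is used directly.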
In some embodiments, a homogeneous cognate affinity capture reagent (BINDER,
described below) is selected from a plurality of alternative BINDERs for its
ability to bind the
TARGET and STANDARD peptides equivalently (or with a consistent ratio shift).
Alternatively, in some embodiments, TARGET and STANDARD peptides are selected
from
a plurality of alternatives to bind to a cognate BINDER equivalently. In some
embodiments,
TARGET peptide, STANDARD peptide, and cognate BINDER are each selected from a
plurality of alternatives so as to maximize the property of equivalent (or
consistent) TARGET
and STANDARD peptide binding.
Conservation (or correctability, as described above) of the ratio between
TARGET and
STANDARD peptides through enrichment means that a measurement of the TARGET
peptide
to STANDARD ratio after enrichment provides an accurate measure of TARGET
peptide
amount, even if only a fraction of the total TARGET and STANDARD is captured
in the
enrichment step. For example, the enrichment process might capture 100%, or
90% or 10% or
1% or 0.1% or 0.01% or 0.001% of the TARGET and STANDARD peptides present in a

sample digest, but in each case the ratio of TARGET peptide to STANDARD
molecules would
remain the same, and thus a measurement of the TARGET peptide to STANDARD
ratio would
provide the same answer, which would be equal to (or correctable to) the ratio
present in the
original standardized sample.
The proportion of a peptide captured by the enrichment step is therefore an
adjustable
feature of the invention, which can be used to capture more of one TARGET
peptide & its
cognate STANDARD than another. Such adjustments make it possible to use the
enrichment
step to capture most or all of a low abundance TARGET peptide, while capturing
only a small
fraction of the molecules of a high abundance TARGET peptide, with the result
that the
difference in absolute molar amounts of the two TARGET peptides can be
significantly
reduced by this differential enrichment. This use of the enrichment step to
bring multiple
TARGET peptides to similar abundances (while preserving the TARGET peptide to
STANDARD ratios that encode TARGET peptide amount in the original sample) is
referred
to as "stoichiometric flattening" or "equalization", and provides a means by
which amounts of
molecules with high and low abundances in the original sample can be measured
using a
measurement technology with limited dynamic range (e.g., single molecule
counting methods).
The flattening approach (differential enrichment of different TARGET/STANDARD
cognate
pairs) generates a "flattened enriched standardized sample digest sample".
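As a numerical illustration of stoichiometric flattening, the following Python sketch (an illustration only, with hypothetical molecule numbers and capture fractions not taken from the disclosure) applies a different capture fraction to each TARGET/STANDARD cognate pair. It assumes an ideal BINDER that captures TARGET and STANDARD with the same efficiency.

```python
import math

# Hypothetical molecule numbers in a standardized digest, with the
# fraction of each cognate pair that its BINDER is set up to capture.
pairs = {
    "high_abundance": {"target": 1e9, "standard": 5e8, "capture": 0.001},
    "low_abundance": {"target": 2e5, "standard": 1e6, "capture": 0.9},
}


def enrich(pair):
    """Return (target, standard) captured, assuming equal capture of both."""
    f = pair["capture"]
    return pair["target"] * f, pair["standard"] * f


enriched = {name: enrich(pair) for name, pair in pairs.items()}

for name, pair in pairs.items():
    t, s = enriched[name]
    print(name,
          "ratio before:", pair["target"] / pair["standard"],
          "ratio after:", round(t / s, 6),
          "TARGET molecules after:", round(t))

# Spread in TARGET abundance between the two peptides, before and after.
spread_before = pairs["high_abundance"]["target"] / pairs["low_abundance"]["target"]
spread_after = enriched["high_abundance"][0] / enriched["low_abundance"][0]
print("abundance spread before:", spread_before, "after:", round(spread_after, 2))
```

In this example the TARGET to STANDARD ratio of each pair is unchanged, while the difference in TARGET abundance between the two peptides shrinks by roughly three orders of magnitude.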
6.5.4 BINDER affinity
In some embodiments the affinity of the BINDERs most useful for enrichment is
in the
range of 0.01 to 10 nanomolar, more particularly with a preferred half-off-time
(inversely proportional to the off-rate) of at least several minutes, or more
preferably 10-15 minutes. Off-rate is
particularly important since it governs the length of time that unbound
materials, including
non-TARGET peptides, can be washed away (e.g., using conventional manual and
automated
workflow steps such as magnetic bead manipulation in 96-well plates) while
retaining the
TARGET and STANDARD peptides on the BINDER. Higher affinity BINDERS are
typically
required to enrich lower abundance TARGETs, i.e., to capture peptides present
in a digest at
low concentration.
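The dependence of peptide retention on off-rate can be sketched as follows. This is a simplified illustration that assumes first-order dissociation with no rebinding during the wash; the function name and the wash and half-off-time values are hypothetical.

```python
def fraction_retained(wash_minutes: float, half_off_time_minutes: float) -> float:
    """Fraction of bound peptide still on the BINDER after a wash.

    Assumes simple first-order dissociation and no rebinding, so half of
    the bound peptide is lost for every half-off-time that elapses.
    """
    return 0.5 ** (wash_minutes / half_off_time_minutes)


# Hypothetical 5-minute wash with BINDERs of different half-off-times.
for half_off_time in (1.0, 5.0, 15.0):
    kept = fraction_retained(5.0, half_off_time)
    print(f"half-off-time {half_off_time:>4} min: {kept:.1%} retained")
```

Under these assumptions a BINDER with a half-off-time much shorter than the wash duration loses most of its cargo, which is consistent with the preference stated above for half-off-times of several minutes or more.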
In some embodiments, specific solution conditions, or changes in solution
conditions,
are employed in a sample preparation workflow to preferentially dissociate
less-tightly-bound
peptides while retaining the correct BINDER cognate TARGETs and STANDARDs. In
some
embodiments bound peptides are exposed to increasing chaotrope or denaturant
concentrations, increasingly acidic or basic solution pH, increasing salt
concentrations, and/or
increasing temperature to dissociate less-tightly-bound peptides prior to
final elution of the
enriched TARGETs and STANDARDs.
In some embodiments (e.g., single molecule imaging technologies) in which
labeled
BINDERs are used to detect and identify specific TARGET peptide sequences,
BINDER
specificity may be more important than affinity, and BINDER selection methods
correspondingly adapted.
6.5.5 Dissociation from BINDER
In some embodiments, a preferred property of a BINDER is the ability to
release bound
peptides rapidly at a desired point in a workflow, as a result of a change in
solution conditions
- for example as a result of a change in pH (e.g., pH 3.0 or pH 9.5), addition
of a chaotrope
(e.g., ammonium thiocyanate or KCl) or organic solvent (e.g., 50% acetonitrile
in water),
increase in temperature, or the application of an electrical field (as in
electrophoresis).
In some embodiments, BINDERs are selected that tightly and specifically bind
the
cognate TARGET and STANDARD peptides from a digest, and then release these
peptides
only when in close proximity to a site at which sequence-sensitive single
molecule detection
can occur (e.g., a nanopore, or an immobilization site on a support), thus
maintaining the
peptides in concentrated form and reducing losses due to diffusion. Nanopore
sequencing (53)
typically relies on the presence of a concentrated salt solution (e.g., 0.4M
KCl) to provide
sufficient charge carriers to create a measurable open channel current through
the nanopore
(typically 20-200 pA). In some embodiments BINDER reagents are selected that
release their
TARGET and STANDARD peptide cargo when exposed to such salt conditions, i.e.,
when the
antibodies are placed in the solution present on the cis side of a nanopore
sequencing device,
or when exposed to a high concentration of salt in a salt gradient in which
salt concentration
increases closer to a nanopore. This feature of a selected BINDER (i.e.,
releasing bound
peptides in a high-salt environment) is advantageous in that peptides can be
retained on a
physically manipulatable solid support (e.g., magnetic beads) until they are
in a sequencing
chamber itself (or in a region of that chamber nearest to a nanopore), thereby
minimizing
potential losses and dilution that could occur if peptides were eluted
elsewhere and later
transported into the sequencing chamber.
In some embodiments, chaotropic anions such as SCN- are incorporated into the
solutions of a nanopore cis compartment (or both compartments) in addition to
or in place of
Cl- anions conventionally used, in order to facilitate release of peptides from
a BINDER. It
will be evident to those skilled in the art that a range of chaotropic anions
or cations can be
used to effect peptide release from BINDER over a range of concentrations
suitable for
optimization in particular device configurations.
In some embodiments using one or more alternative influences to release bound
peptides from a BINDER without a pH change, the potential deleterious effects
of acid elution
(as practiced in conventional affinity enrichment systems including SISCAPA,
e.g., pH 2.0-
3.5) on some protein nanopores, or on components of other single molecule
detection systems,
can be avoided. Acid elution is used in SISCAPA in order to avoid introduction
of salts, since
salts interfere significantly with detection by mass spectrometry. The use of
high salt in
nanopore sequencing at near-neutral pH contrasts with the use of acid elution
and low salt in
mass spectrometry-based detection systems, and this difference suggests that
different
BINDERs with different elution characteristics will be preferred in the
respective peptide
detection methods.
In some embodiments where enriched TARGET and STANDARD constructs are to be
immobilized on a support, the constructs are delivered into proximity with the
support while
they are bound to the cognate BINDER (e.g., on easily manipulatable magnetic
beads).
BINDER-bound constructs are present at a very high effective local
concentration, and may
be conveniently moved from one environment to another without loss. This
feature is an
important advantage of the BINDER enrichment step when applied to low
abundance peptides
and their detection by single molecule counting.
6.5.6 BINDER immobilization
In some embodiments, a BINDER with specific affinity for the TARGET peptide
and
STANDARD may be immobilized on a solid support in order to facilitate
separation of the
antibody and its bound peptide and/or peptide construct cargo from a complex
sample digest,
to wash away unbound molecules, to concentrate bound peptides, and to deliver
bound peptides
to a site where they are available for sequencing. Typical solid supports used
for this purpose
include magnetic beads (allowing collection of beads from a liquid suspension
by magnetic
force) or a porous column (e.g., an affinity column) through which liquids may
be pumped. In
some embodiments, the BINDER is immobilized on commercially available protein
G-
derivatized magnetic beads (Dynabeads G; Thermo Fisher) and optionally
crosslinked
covalently with dimethyl pimelimidate (DMP) according to the manufacturer's
instructions.
In an alternative preferred embodiment, the antibody is immobilized on tosyl-
activated
Dynabead magnetic beads. In a further alternative embodiment, the anti-peptide
antibody can
be immobilized on solid phase chromatography media (e.g., POROS G resin)
packed in a
column and crosslinked using DMP. Such a column can bind the TARGET peptide
specifically from a peptide mixture (e.g., a tryptic digest of serum or
plasma) and, following a
wash step, release the TARGET peptide under elution conditions.
6.5.7 BINDER homogeneity
In some embodiments, e.g., those using a homogeneous cognate affinity capture
reagent (e.g., a monoclonal antibody BINDER, wherein all or nearly all
molecules have the
same sequence), it is expected that the ratio of TARGET and STANDARD peptides
is not
affected by the degree of saturation of the BINDER binding sites by the
peptides at equilibrium
(particularly at low saturation). Inhomogeneous affinity capture reagents are
difficult to
characterize in detail, and can contain variants that bind one or the other of
the TARGET and
STANDARD peptides more strongly. Thus, saturation of one variant could be
followed by
binding to another, potentially lower affinity, variant that has different
relative affinities for
the TARGET and STANDARD peptides, resulting in a change in the bound ratio as
a function
of the amount bound. For this reason, homogeneous (typically clonal or
sequence-defined)
BINDERs are generally preferred: e.g., monoclonal antibodies or
sequence-defined aptamers.
6.5.8 Chemical modification of peptides while bound to BINDER
In some embodiments, chemical or enzymatic reactions for the purpose of
modifying a
TARGET (or STANDARD) peptide are carried out in solution, and in some
embodiments one
or more reactions are carried out while a peptide or peptide construct is
bound to a BINDER,
which may or may not itself be bound to a solid support. In some embodiments,
one or more
reactions are carried out while a peptide is bound to a BINDER linked to a
solid support, thus
allowing the peptide to be contacted with reagents, and removed from contact,
by physical
movement of the support between liquids (e.g., by removal of magnetic beads
carrying
BINDER and bound peptides from liquid in one vessel and deposition of the
beads in a
different vessel where they are exposed to a different reagent), or
equivalently by movement
of liquids in contact with the support (e.g., by pumping one reagent and then
a second reagent
over a porous column, or magnetic bead mass, to which BINDER and its peptide
cargo are
bound). In addition to contact with one or more reagents required for
execution of a sequence
of reactions, manipulation of peptides on a support allows the peptide to be
washed free of a
reagent by exposure to a wash solvent prior to contact with a subsequent
reagent. Movement
of peptides between liquids by movement of a BINDER or support to which they
are bound
reduces or eliminates the need for purification or concentration of
intermediate peptide forms
created during a sequence of one or more chemical reactions. In some
embodiments, peptides
are bound to a solid support by means other than interaction with a specific
BINDER, e.g., by
binding of peptides to a generic support such as a reversed phase support
(e.g., C18) or an ion
exchange support.
In some embodiments, use is made of amino groups present at peptide amino
termini
and on lysine side chains for chemical linkage of a peptide to other molecules
(e.g., oligo and
other polymers that, with peptides, form constructs amenable to sequence-
sensitive single
molecule detection) while a peptide is bound to a BINDER. In order to
eliminate competing
side reactions with amino groups present in specific affinity reagents (e.g.,
lysines and N-terminal
amino groups of anti-peptide antibody BINDERs) used in the invention,
the invention
provides for the optional elimination of some or all of these BINDER amino
groups by
chemical blockage (e.g., by reductive methylation, by PEGylation using
commercially-
available NHS-PEG or other reagents, conversion of lysine residues to
homoarginine by
treatment with O-methylisourea, or other chemical modifications known in the
art), by protein
engineering (e.g., by replacing some or all lysines in a recombinant antibody
sequence with
arginines or other amino acids), or various other means. Specific affinity
reagents to be used
in such embodiments may be selected so as not to contain any lysine residues
in the TARGET
peptide binding site, since these residues would likely be blocked along with
other lysines,
potentially leading to a loss of binding activity. Non-protein BINDERs, such
as DNA and RNA
aptamers and other similar molecules, may contain no amino groups to begin
with, eliminating
the need to block these prior to processes aimed at modifying peptide amino
groups. The
elimination of BINDER amino groups that could participate in side reactions
has the effect of
avoiding waste of expensive reagents used in amino group modifications of
TARGET and
STANDARD peptides, including use in creating concatenated constructs of these.
In some embodiments, blockage (e.g., by PEGylation) of many or all of the
amino
groups on BINDERs, and on any other proteins present on the capturing support
(e.g., Protein
A or Protein G used to guide antibody immobilization on solid supports such as
Dynabeads G
magnetic beads), can also have the advantageous effect of rendering the BINDER
more stable,
and thus less liable to degradation by heat, by proteases, or by exposure to
complex samples
and sample digests. In the case of a Protein A or G coated magnetic bead, for
example, it is
advantageous in some embodiments to first react the antibody BINDER with the
Protein A or
G on the bead, then chemically cross-link the BINDER to the Protein A or G on
the bead, then
PEGylate some or all of the remaining protein amino groups on the bead. Such
modifications
can also alter the net charge on proteins and on beads carrying them towards
greater negative
charge overall, since typically about half the positive charges on a protein
are arginine and half
lysine (the latter of which would be blocked by blockage of amino groups).
Since the amount
of negative charge (largely attributable to glutamic and aspartic acids) would
be unaffected by
amino group blockage, the overall decrease in positive charges by ~50% will
shift the net
charge on the BINDERs, and on a bead coated with BINDERs, towards the
negative.
Nanopore sequencing devices are typically operated with a negative electrode
in the cis
compartment (where the input molecules to be sequenced are added) and a
positive electrode
in the trans compartment: this polarity induces an oligo, which is strongly
negatively charged
on account of its sugar-phosphate backbone, to migrate towards and through the
pore to initiate
sequencing. In some embodiments this polarity also serves to move a negatively
charged bead
towards the pore, contributing towards the goal of delivering peptide-oligo
constructs in close
proximity to the pore.
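The charge shift described above can be approximated with a short Python sketch. It is a crude estimate under stated assumptions: charges are counted at near-neutral pH from lysine, arginine, aspartate and glutamate only (ignoring histidine, chain termini and pKa shifts), and the blocking chemistry is assumed to remove the lysine positive charge. The sequence used is hypothetical.

```python
def net_charge(sequence: str, amines_blocked: bool = False) -> int:
    """Approximate net charge from K, R, D and E residue counts."""
    positive = sequence.count("R")
    if not amines_blocked:
        positive += sequence.count("K")  # lysine charge lost when blocked
    negative = sequence.count("D") + sequence.count("E")
    return positive - negative


# Hypothetical protein fragment with equal numbers of lysine and arginine.
fragment = "MKRDEKLRAEKDGR"
print("net charge, unmodified:    ", net_charge(fragment))
print("net charge, amines blocked:", net_charge(fragment, amines_blocked=True))
```

For this fragment, blocking the lysines moves the estimated net charge from positive to negative, in line with the shift towards the negative described above.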
6.6 BARCODES
In some embodiments the methods used for single molecule detection have the
capability to detect very large numbers of molecules (e.g., 10^10 in (54)), far
exceeding the
requirements for quantitative measurement of a modest number of peptides in
one sample. In
order to make effective use of the analytical capacity of such platforms, and
the consequent
reduction in analytical cost, some embodiments connect sample-specific labels
("barcodes")
to the TARGET and STANDARD peptides present in a sample digest, or enriched
from a
sample digest by a BINDER, allowing the TARGET and STANDARD peptides from
multiple
samples to be combined prior to peptide detection (i.e., multiplexed), and
afterwards de-
multiplexed to associate them with the correct original samples. DNA provides
an ideal
medium for implementation of such barcodes since, as essentially a digital
medium, it is easy
to synthesize, cut, ligate, copy, and detect by both sequencing and
hybridization. Alternative
barcode polymers can be employed, such as peptides and synthetic chemical
polymers,
although these may be significantly more difficult to generate, manipulate and
detect than
oligonucleotides.
As described above, many sample barcoding systems have been developed using
sets
of distinct DNA barcodes to identify nucleic acid molecules derived from
different samples
prior to sequencing, or to facilitate optical readout of individual nucleic
acid molecules in
imaging systems. In some embodiments, sample barcodes with identical or very
similar base
composition but distinguishable sequences are preferred in order to minimize
differences in
physical properties between constructs on account of barcode properties.
In some embodiments the identity of samples from which single molecule
constructs
according to the invention are derived is encoded using nucleic acid (e.g.,
DNA or another
sequenceable polymer having multiple distinguishable subunits) barcodes.
In some
embodiments sample barcodes are appended or linked to TARGET and STANDARD
constructs prior to enrichment by cognate BINDERs. In some embodiments sample
barcodes
are appended or linked to TARGET and STANDARD constructs after enrichment by
cognate
BINDERs, in which case smaller amounts of the DNA barcodes are required.
Figures 9 and 10 illustrate schematically a 2-level encoding scheme used in
some
embodiments. In each sample digest, a specific peptide (here labeled Peptide-
A) is linked to a
DNA sequence tag (labeled OLIGO-TARGET) identifying it as a sample-derived
TARGET
molecule. The cognate internal standard formed by linkage of a synthetic
version of Peptide-
A with a distinct DNA sequence tag (labeled OLIGO-STANDARD) is added to the
digest,
creating a standardized digest (standardized with respect to Peptide-A).
Either before, or more
efficiently after, enrichment of Peptide-A TARGET and STANDARD constructs
using a
BINDER, sample barcodes comprising a plurality of modules (Codes) are linked
to these
constructs using conventional methods that may include ligation to the
TARGET/STANDARD tag, chemical linkage (e.g., using click chemistry), non-
covalent
means (e.g., biotin on one oligo and streptavidin on the other), or a variety
of other linkage
means known in the art. Alternatively, the sample barcodes can be linked to a
site on the
peptide different from the site at which the TARGET/STANDARD tag is connected.
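The use of the two encoding levels after detection can be sketched in Python as follows. The sketch assumes that each detected construct has already been decoded into a (sample barcode, peptide, tag) record; the record layout, the sample names and the counts are hypothetical and serve only to show de-multiplexing followed by ratio calculation.

```python
from collections import Counter


def target_to_standard_ratios(reads):
    """De-multiplex decoded reads and return TARGET/STANDARD count ratios.

    reads: iterable of (sample_barcode, peptide, tag) tuples, where tag is
    "TARGET" or "STANDARD" (an assumed representation of decoded constructs).
    """
    counts = Counter(reads)
    keys = {(sample, peptide) for sample, peptide, _tag in counts}
    ratios = {}
    for sample, peptide in sorted(keys):
        n_target = counts[(sample, peptide, "TARGET")]
        n_standard = counts[(sample, peptide, "STANDARD")]
        if n_standard == 0:
            continue  # ratio undefined without counted internal standard
        ratios[(sample, peptide)] = n_target / n_standard
    return ratios


# Hypothetical pooled reads from two barcoded samples.
reads = (
    [("S001", "Peptide-A", "TARGET")] * 30
    + [("S001", "Peptide-A", "STANDARD")] * 60
    + [("S002", "Peptide-A", "TARGET")] * 45
    + [("S002", "Peptide-A", "STANDARD")] * 15
)
ratios = target_to_standard_ratios(reads)
for key, ratio in ratios.items():
    print(key, ratio)
```

Each ratio can then be multiplied by the known amount of STANDARD added to the corresponding sample to obtain the TARGET amount.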
The scheme for sample barcoding shown in Figure 9 provides a construct
compatible
with a variety of single molecule detection methods, as described below. In
this example,
barcode modules at positions 1, 2, 3 and 4 are used to encode "bits" in a 10-
bit binary sample
code. In such a code, if all bits were readable at once, 10 bits could
conventionally encode 2'
= 1,024 samples. Given that each DNA base is 1 of 4 alternatives (2 bits of
information), 10
bits of information could theoretically be encoded in a short sequence of 5
bases. However,
all the methods envisioned for reading the sample barcode in a single molecule
detection
system are subject to error, and avoiding sample misassignment errors is a
high priority in
many applications (e.g., clinical). A preferred approach is therefore to add
redundancy to the
sample code. In some embodiments this is done by providing a unique sequence
module
comprising multiple bases (e.g., 4 to 30 bases depending on the preferred
readout method)
corresponding to each of the bits in the desired sample code space (number of
samples to be
identified). As a further measure against sample assignment errors, the error
detection and
correction methods of Hamming can be used, and in the case of a 10-bit code,
Hamming
extended parity error detection involves the addition of 4 parity bits to the
10-bit code, resulting
in a total of 14 bits of information. Such a 10+4 = 14-bit code is capable of
detecting and
correcting any 1-bit error, and detecting but not correcting 2-bit errors. To
implement such an
approach in a manner that economizes on the total length of DNA that must be
"read" to obtain
the sample code, the example of Figure 9 simplifies this coding scheme to use
only those code
values having 3 or 4 bits set to a value of 1, which reduces the sample coding
space to 105
samples that can be identified with very high accuracy, but reduces the total
number of DNA
modules that need to be in any one sample code. Thus 4 modules are included in
any sample
code, selected from among 14 different DNA sequences selected using
computational and
experimental methods well known in the art for minimal likelihood of confusion
during
readout. A mistaken read on any one of these modules can be corrected by the
coding scheme,
and mistakes in 2 modules can be detected (but not corrected). Those skilled
in the art will
recognize that many alternate coding schemes exist, with greater or lesser
numbers of bits, of
larger or smaller numbers of identifiable samples, and of greater or lesser
numbers of bases in
each DNA module.
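One way to realize a 14-position code of the kind described above is sketched below in Python. The construction is an assumption, since the text does not specify it: a Hamming code shortened to 14 positions, in which a set of present modules is a valid code when the bitwise XOR of their 1-based position numbers is zero. Under this assumption, enumerating the valid codes with 3 or 4 modules present yields 105 codes, the figure given above. All names are hypothetical.

```python
from functools import reduce
from itertools import combinations
from operator import xor

N_MODULES = 14  # assumed: module positions numbered 1..14


def syndrome(positions):
    """XOR of the position numbers present; 0 for a valid codeword."""
    return reduce(xor, positions, 0)


def valid_codes(weights=(3, 4)):
    """All valid codewords having 3 or 4 modules present."""
    return [
        frozenset(combo)
        for w in weights
        for combo in combinations(range(1, N_MODULES + 1), w)
        if syndrome(combo) == 0
    ]


VALID = set(valid_codes())


def decode(observed):
    """Correct at most one missed or spurious module, else return None."""
    candidate = frozenset(observed)
    s = syndrome(candidate)
    if s != 0:
        if s > N_MODULES:
            return None  # syndrome points outside the 14 positions
        candidate = candidate ^ {s}  # flip the position named by the syndrome
    return candidate if candidate in VALID else None


print("number of sample codes:", len(VALID))
print("module 3 missed:       ", sorted(decode({1, 2})))
print("spurious module 8 read:", sorted(decode({1, 2, 3, 8})))
print("two modules missed:    ", decode({1}))
```

In this sketch a single missed or spurious module is corrected through the syndrome, and a corrected result that does not have 3 or 4 modules is rejected. The sketch does not establish that every two-module error is detected.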
6.6.1 DNA sample barcodes used with platforms that incorporate DNA sequencing.
Several approaches to peptide single molecule detection include the ability to
read
nucleic acid sequences interspersed with peptide sequence (e.g., current DNA
sequencing
nanopore platforms) or else together with peptide sequence that has been
reverse-translated
into DNA (e.g., reverse translation platforms). In some embodiments, modules
of 2, 3, 4, 5, 6,
7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26,
27, 28, 29, 30 or more
bases are used.
6.6.1.1 Nanopore sequencing
DNA sample barcodes and TARGET vs STANDARD tags linked to peptides can be
read directly by passage through a suitable nanopore sequencing system. At the
current state
of the art represented by the commercial MinION device, the accuracy of
individual basecalls
can be greater than 99%, and therefore the accuracy with which one of a small
set of sequences
(designed to be distinct) can be recognized is high. In some embodiments, the
sample barcode
(e.g., a binary number identifying the sample, an alphanumeric code taken from
a physical
sample label, or any type of computer encodable sample identification) can be
encoded directly
(2-bits per base) or with redundancy in the form of multiple bases per code
bit, additional parity
bits (including error detection and correction), or any information
representation scheme that
can be encoded in DNA or another nanopore-readable polymer with 2 or more
distinguishable
units. The association of a code with an individual peptide molecule is
accomplished through
the covalent linkage between the two that is established in the peptide
library preparation
workflow of the invention.
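Direct encoding at 2 bits per base can be illustrated with the following Python sketch. The base-to-bits assignment (A=00, C=01, G=10, T=11) and the function names are assumptions made for the example; the text does not prescribe a particular mapping, and a practical barcode would normally add the redundancy discussed above.

```python
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def encode_sample_id(sample_id: int, n_bases: int = 5) -> str:
    """Encode an integer sample identifier as a DNA string, 2 bits per base."""
    if not 0 <= sample_id < 4 ** n_bases:
        raise ValueError("sample_id does not fit in the requested length")
    digits = []
    for _ in range(n_bases):
        digits.append(BASES[sample_id % 4])  # least significant base first
        sample_id //= 4
    return "".join(reversed(digits))


def decode_sample_id(sequence: str) -> int:
    """Inverse of encode_sample_id."""
    value = 0
    for base in sequence:
        value = value * 4 + BASES.index(base)
    return value


for sample_id in (0, 1, 1023):
    barcode = encode_sample_id(sample_id)
    print(sample_id, barcode, decode_sample_id(barcode))
```

Five bases cover the 1,024 values of a 10-bit sample code, as noted in the barcode discussion above.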
6.6.1.2 Reverse translation
Methods of single molecule sequencing by some form of reverse translation have
been
described (US 2021/0302431 and (34)), and these typically include the ability
to copy a DNA
sequence from an affinity reagent capable of recognizing a terminal amino acid
(or amino
acids) to a "root" oligo attached to, or in proximity to, the peptide being
reverse-translated,
which is extended with a DNA code identifying the amino acid at each
degradation cycle. In
some embodiments, such a root oligo linked to a peptide comprises a
TARGET/STANDARD
tag (identifying which version of a peptide it is attached to) and optionally
a sample barcode.
After the required number of decoding and degradation steps, the root oligo is
prepared for
conventional high-throughput DNA sequencing, generating a sequence comprising
the peptide
sequence (or a representation of it), its identity as a TARGET or STANDARD,
and its sample
of origin.
6.6.2 DNA sample barcodes used with affinity reagent imaging platforms
In some embodiments, e.g., those making use of single molecule imaging (30-32),
DNA sample barcodes can be detected by sequential hybridization with labeled
oligos
complementary to the sample barcodes. Complementary oligos can be labeled with
a variety
of fluorescent or colored dyes, with quantum dots or other optically
detectable nanoparticles,
with enzymes capable of generating a localized signal (e.g., luminescence), or
a variety of
other compositions known in the art for the generation of a spatially
localized externally-
detectable signal. In some embodiments, a set of barcode sequences is used
that are designed
to have high specificity (minimal cross-hybridization of one barcode with the
probes
complementary to the other barcodes). The lengths of the barcode sequence
modules generally
impact the specificity with which they are recognized by complementary probes,
the kinetics
with which they bind and the temperature at which they can be removed after
being read (i.e.,
analogous to the "melting temperature"). In some embodiments, modules of 8, 9,
10, 11, 12,
13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30 or more
bases are used.
Since imaging approaches typically do not reliably establish the order of
individual
barcodes present in a construct molecule (this being at or beyond the
resolution limits of
conventional imaging systems), probing a single molecule construct with a set
of probes
complementary to the set of barcodes yields a series of binary (i.e., the
probe binds or does not
bind) results that can be considered a binary code.
In some embodiments, a set of N distinct barcodes is used, where N is a number
required to encode at least the number of samples to be pooled (i.e., 2^N >
number of samples).
For example, 127 different samples could be uniquely identified by detecting
the presence or
absence of 7 different barcodes (2^7 = 128, yielding 127 unique barcode
combinations, excluding 0 - the
absence of all barcodes). To make use of this scheme would require that
peptide:oligo
constructs include up to all 7 of the barcodes.
In some embodiments, a larger set of barcodes is used, but constructs need
include only
a limited number of these. For example, if 11 distinct barcodes are used, but
only 4 or fewer
of these are included in any given construct, several hundred different
samples can be uniquely
identified. Use of fewer barcodes in a construct is advantageous since it
reduces the length
and cost of the sample barcoding oligos required: individual distinct barcodes
may be 20-30
bases long.
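The sample capacities quoted above can be checked with a short calculation. The Python sketch below counts the distinct non-empty barcode combinations available when every subset is allowed, and when each construct carries at most a fixed number of barcodes; the function names are illustrative only.

```python
from math import comb

def codes_all_subsets(n_barcodes: int) -> int:
    """Presence/absence codes over n barcodes, excluding the all-absent code."""
    return 2 ** n_barcodes - 1

def codes_limited(n_barcodes: int, max_per_construct: int) -> int:
    """Codes in which between 1 and max_per_construct barcodes are present."""
    return sum(comb(n_barcodes, k) for k in range(1, max_per_construct + 1))

print(codes_all_subsets(7))   # 127
print(codes_limited(11, 4))   # 561
```

With 11 barcodes and at most 4 per construct, the count is 11 + 55 + 165 + 330 = 561, consistent with "several hundred" samples.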
In some embodiments, a further improvement in sample barcoding is provided by
the
use of Hamming codes. For example, in a coding scheme with 10 data bits and 4
parity bits
(14 bits total, here corresponding to 14 distinct DNA barcodes), and using
only 3 or 4 of these
barcodes in any individual peptide construct, it is possible, using the
features of the Hamming
scheme, to identify and correct any single error in detection of any of the
individual barcodes.
Error detection is of great value in preventing mis-attribution of molecules
to the wrong
sample, which could result in erroneous quantitative results derived from
errors in the counts
of TARGET and STANDARD molecules in a sample digest.
Figure 10 illustrates the process of decoding an example construct in 16
cycles using a
Hamming code: in each cycle one of 16 oligo probes complementary to one of the
DNA codes
is applied, detected when present, and removed. In this example the first 2
cycles involve
probes that determine whether the peptide is a TARGET or a STANDARD (either
one probe
binds or the other). In the remaining 14 cycles, 14 probes complementary to
each of the 14
DNA sequence modules described above are successively applied, detected when
present, and
removed. The result is a 14-bit binary number that is capable of identifying
one of 105 different
samples with single (1-bit) error correction and double (2-bit) error
detection (referred to as
"SECDED" in the art). Separately, but as part of the same decoding process,
one or more
recognition reagents capable of characterizing the peptide are applied in
additional cycles to
establish the identity of the peptide.
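The figure of 105 samples is consistent with one possible construction, sketched below in Python: a Hamming code shortened to 14 positions, restricted to codewords in which exactly 3 or 4 barcodes are present. The numbering of positions 1 to 14 as parity-check columns is an assumption made for illustration; the disclosure does not specify the code construction, and the sketch covers only single-error correction.

```python
from functools import reduce
from itertools import combinations
from operator import xor

N_BITS = 14  # 10 data + 4 parity positions, numbered 1..14 (assumed layout)

def syndrome(positions):
    """XOR of the 1-based positions of detected barcodes; 0 for a valid codeword."""
    return reduce(xor, positions, 0)

def valid_sample_codes(weights=(3, 4)):
    """All codewords (as sets of positions) containing exactly 3 or 4 barcodes."""
    return [frozenset(c)
            for w in weights
            for c in combinations(range(1, N_BITS + 1), w)
            if syndrome(c) == 0]

def correct_single_error(detected):
    """Return the corrected set of positions, or None if not correctable."""
    detected = set(detected)
    s = syndrome(detected)
    if s == 0:
        return frozenset(detected)      # already a valid codeword
    if s > N_BITS:
        return None                     # syndrome points outside the 14 positions
    return frozenset(detected ^ {s})    # flip the position named by the syndrome

codes = valid_sample_codes()
print(len(codes))                             # 105
print(sorted(correct_single_error({1, 3})))   # missed barcode 2 -> [1, 2, 3]
```

Under these assumptions, a corrected result that is not a member of `codes` (for example, one with fewer than 3 or more than 4 barcodes) can be flagged as a probable multiple error rather than assigned to a sample.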
7 GENERAL METHOD OF ANALYSIS.
Some embodiments of the invention comprise a series of steps to transform a
protein-
containing sample into an enriched standardized digest sample, or a flattened
enriched
standardized digest sample, prior to sequence-sensitive single molecule
detection and
counting. The invention is equally applicable to protein samples from sources
such as blood,
blood plasma and blood serum, as well as other sources, such as tissue
homogenates, animal,
plant or microbial samples, other body fluids, environmental samples and the
like.
An important feature of the invention is its generality, allowing the design
of similar
protocols and using similar reagents and equipment to prepare peptide
libraries suitable for
analysis of a wide range of different proteins in a variety of single molecule
detection systems.
To accomplish this the invention makes use of specific features of peptides
generated by
particular enzyme cleavages; features provided by multiple click chemistry
pairs; requirements
for specific peptide and oligo orientation; multiple levels of barcoding; and
detailed control of
the capture and enrichment of peptide:oligo constructs by specific affinity
reagents.
In some embodiments, steps of the general method may be carried out in a
different
order than that outlined below. Nevertheless the invention requires that steps
of sample
digestion to peptides, generation of TARGET constructs from TARGET peptides,
and addition
of STANDARD constructs to the digest (thus creating a standardized digest)
must precede
enrichment of TARGET constructs and STANDARD constructs from the standardized
digest
using BINDERs.
7.1 DIGESTION OF SAMPLE.
In some embodiments a general approach for sequence-based protein quantitation
involves digesting sample proteins (e.g., with trypsin) into peptides. In
order to improve the
completeness of digestion, disulfide bonds may be broken and proteins
denatured to disrupt
secondary and tertiary structure. Samples can be any kind of protein-
containing sample
without limitation, including body fluids, tissues, tissue lysates, tissue
extracts, bacterial,
fungal, animal and plant samples, recombinant proteins including protein
drugs, food products,
and the like.
In some embodiments, preparation of proteolytic peptides from a complex sample
is
carried out by a series of reagent addition steps which may include:
denaturing a protein
sample (e.g., with detergents such as deoxycholate or CHAPS, organic solvents,
urea or
guanidine HCl), reducing the disulfide bonds in the proteins (e.g., with
tris(2-
carboxyethyl)phosphine (TCEP), dithiothreitol or mercaptoethanol), alkylating
the cysteines
(e.g., by addition of iodoacetamide, or iodoacetic acid, which react with the
free -SH group of
cysteine preventing reformation of disulfide bonds), quenching excess
iodoacetamide by
addition of more dithiothreitol or mercaptoethanol, and (after removal or
dilution of the
denaturant) addition of the selected proteolytic enzyme (e.g. trypsin),
followed by incubation
to allow digestion. Numerous variations of this process, some including
additional steps and
some eliminating individual steps, are known in the art. In some embodiments,
following
incubation, the action of trypsin can be terminated, either by addition of a
chemical inhibitor
(e.g., TLCK) or by denaturation (through heat or addition of denaturants, or
both) or removal
of the trypsin (if the trypsin is on a solid support). There are many specific
protocols available
for proteolytic digestion, including automated methods using only liquid
addition steps (14).
In some embodiments it has been shown that automated digestion of biological
samples can
be very reproducible, exhibiting minimal variations (e.g., CV <2%) between
replicate samples.
In some embodiments, a desired peptide can be liberated by proteolysis without
the
need for disulfide reduction and alkylation (e.g., peptides that do not
contain cysteine residues,
and are not sterically constrained by nearby disulfide bridges), and in some
cases without
denaturation (e.g., peptides exposed on the surfaces of a protein). A range of
alternative
proteolytic enzymes can be used instead of trypsin to produce peptides defined
by specific
cleavage sites (including GluC, Lys-C, Arg-C, chymotrypsin, papain, pepsin, V8
protease, and
the like), and chemical agents can also be used (e.g., CNBr cleavage at
methionine residues).
In some embodiments a simplified digestion protocol is used comprising
addition of
protease to a liquid protein-containing sample without prior denaturation,
reduction of
disulfides or blockage of resulting cysteines. In some embodiments, heat is
used to improve
protein digestion by partial denaturation of protein substrates before the
trypsin (or other
proteolytic enzyme) is denatured, often using a variable temperature profile
such as a ramp
from room temperature to a higher temperature (e.g., 70 °C for a plasma sample).
Such protocols
are not expected to result in complete digestion of most proteins, but they
can reproducibly
generate certain tryptic peptides from some regions (e.g., surface exposed
sequence segments)
of some proteins, and if these peptides satisfy the requirements of TARGET
peptides in a given
application, the abbreviated protocol allows substantial simplification of the
sample
preparation workflow.
In some embodiments digestion can be carried out by immobilized proteolytic
enzymes
such as trypsin. Trypsin has been immobilized in the art at very high
concentrations (e.g., on
derivatized porous nylon, PVDF or nitrocellulose membranes) and used to
perform very rapid
(e.g., < 1 minute) digestion of proteins.
Proteolytic digestion disrupts protein:protein interactions by largely if not
completely
eliminating tertiary structure when a large protein is reduced to short
peptides free to diffuse
apart. This conversion of a large complex protein molecule to a series of
short peptides offers
a significant improvement in protein quantitation, since it removes the
primary sources of assay
interferences observed with immunoassays (in which a protein:protein
interaction that blocks
an epitope used by an assay antibody can result in a false negative result,
while false positives
can result from bridging interactions involving protein components not
expected to be involved
in the assay). An example in which tryptic digestion overcomes such an
interference is the
SISCAPA assay for thyroglobulin (55).
An average-length human protein produces about 50 peptides upon tryptic
digestion,
from which an assay designer can choose one or more peptides suitable for
specific
applications. This feature expands the range of detection alternatives
compared to intact
protein detection. Since intact proteins are so diverse in their physical
properties, and therefore
difficult to measure in many circumstances, the ability to select a
proteotypic peptide from a
range of alternatives as a stoichiometric surrogate for the intact protein is
a major advantage
of the digestion approach. It is often observed that within every "bad"
protein there is at least
one "good" peptide for a given application.
7.1.1 Proteolytic production of peptides with either one or two amino groups.
In some embodiments it is advantageous for TARGET peptides to have a single n-
terminal amino group (i.e., "single amino peptides"), and in this case
digestion is preferably
carried out using an enzymatic protocol in which most peptides of an
appropriate length do not
contain lysine (which contains a side chain amino group), and thus have a
unique amino group
at the n-terminus of the peptide chain. While approximately half of the
peptides resulting from
tryptic digestion have a lysine at the c-terminus, addition of Lys-N or
similar enzymes that
cleave n-terminal to a lysine residue can in many cases remove the c-terminal
lysine from
tryptic peptides, resulting in a larger proportion of "single amino" peptides
of a useful length
(i.e., the sum of the c-terminal arginine peptides produced by tryptic
digestion and the set of
lysine peptides from which the c-terminal lysines have been removed by the
additional action
of Lys-N). Another approach to decrease the proportion of double-amino
peptides is to
chemically convert the lysine epsilon-amino groups to homoarginine in a
guanidination
reaction with methylisourea (56). Alternatively, in embodiments in which it
is preferred to
have amino groups at both ends of the peptide (i.e., "double amino" peptides)
digestion with
Lys-C in place of trypsin typically leads to generation of peptides having
both an n-terminal
amino group and a c-terminal lysine with its side chain amino group.
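The distinction between "single amino" and "double amino" peptides can be illustrated with a simplified model. The Python sketch below assumes idealized enzyme behavior (cleavage after every K or R for trypsin, no missed cleavages, no proline rule, and an unblocked n-terminus) and models the additional action of Lys-N simply as removal of a c-terminal lysine; the example sequence and function names are hypothetical.

```python
def tryptic_peptides(protein: str):
    """Cleave after every K or R (simplified model of tryptic digestion)."""
    peptides, start = [], 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])  # c-terminal remainder, if any
    return peptides

def lys_n_trim(peptide: str) -> str:
    """Model Lys-N removing a c-terminal lysine from a tryptic peptide."""
    return peptide[:-1] if len(peptide) > 1 and peptide.endswith("K") else peptide

def amino_groups(peptide: str) -> int:
    """One n-terminal amino group plus one side-chain amino group per lysine."""
    return 1 + peptide.count("K")

protein = "MAEKLLLRGSTVDAAEKAARTTTTTTTK"  # hypothetical sequence
before = tryptic_peptides(protein)
after = [lys_n_trim(p) for p in before]
print([amino_groups(p) for p in before])  # [2, 1, 2, 1, 2]
print([amino_groups(p) for p in after])   # [1, 1, 1, 1, 1]
```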
7.1.2 Sequential digestion steps and application to distinguish linkage sites
In some embodiments it is advantageous to link peptides to a specific type of
molecule
at or near the n-terminus, and link the peptide to a different type of
molecule at or near the c-terminus; such an approach can be used to generate a
construct in which the
peptide is "in-line" between preceding and following polymeric components (such as
oligonucleotides)
useful in detection and identification of the construct. While a variety of
chemical methods
referred to elsewhere herein can be used to selectively couple molecules to
the n-terminal
amino group, a c-terminal carboxyl group, a c-terminal lysine side chain amino
group, a
fortuitously positioned cysteine sulfhydryl group, etc., the limited
specificity and quantitative
yield of these reactions can make it difficult to quantitatively and
reproducibly couple the two
peptide termini to different molecular additions. In some embodiments the
invention provides
an improved alternative method making use of sequential proteolytic cleavages
and coupling
procedures comprising the following steps: 1) cleavage of sample proteins c-
terminal to lysine
residues (e.g., using the enzyme Lys-C) to produce "Lys peptides", a number of
which may
include internal Arg residues; 2) reaction of the lysine side chain amino
groups and the exposed
peptide n-terminal amino groups of these peptides, in one or more steps (which
may, for
example, include click chemistry ligations), with a first added molecule
(e.g., a first
oligonucleotide); 3) removal or depletion of any remaining uncoupled amount of
this first
added molecule (or intermediate chemical components); 4) cleavage of the Lys
peptides at
internal arginine residues when present in a second proteolytic step (e.g., by
addition of
enzymes such as trypsin, Arg-C, etc.), thereby exposing a fresh and unreacted
n-terminal amino
group in the c-terminal part of those peptides that have a c-terminal lysine
and such an internal
arginine residue (i.e., the part of the original Lys peptide that extends from
the amino acid
following the arginine to the c-terminus); and 5) reaction of these fresh n-
terminal amino
groups, in one or more steps, with a second added molecule (e.g., a second
oligonucleotide
using a click chemistry linkage). These steps are illustrated in Figure 6
using a hypothetical
protein sequence shown in Fig 6A containing arginine residues indicated by R,
lysine residues
indicated by K and other amino acids (all indicated here by X). Digestion with
Lys-C produces
a series of peptides shown in Fig 6B, each having an amino terminus and a
carboxy terminus.
Reaction of the free amino groups (the n-terminal amino group and the side
chain amino group
of lysine residues K) with an added group M1 produces a series of modified
peptides shown
in Fig 6C. Subsequent digestion with trypsin produces a set of peptides shown
in Fig 6D.
Reaction of the n-terminal amino groups exposed by this second digestion with
added group
M2 results in the peptides shown in Fig 6E. Two of these peptides (indicated
by boxes and
large asterisks) have the different groups M2 and M1 positioned, respectively,
at the n-terminus
and near the c-terminus (i.e., on the side chain amino group of the c-terminal
lysine), separated
by a sequence of amino acids long enough to have a high probability of being
proteotypic for
the target protein (e.g., unique to one protein in the human proteome): these
peptides represent
likely choices for use in the invention for quantitative measurement of the
target protein.
In the case of K (Lys-C) cleavage followed by R (trypsin) cleavage as
described above,
not all proteins of interest necessarily comprise an appropriate proteotypic
"R...K" sequence.
Surprisingly, however, our in silico calculations on the known protein-
encoding regions of the
human genome indicate that 17,482 of the approximately 20,000 proteins in the
human
proteome contain at least one such peptide with a length of 7-31 amino acids
(there being
approximately 97,330 such peptides in the human proteome overall). Of these,
approximately
13,470 peptides are 15-to-31 amino acids long, which is long enough to
encompass multiple
epitopes for BINDER recognition. In contrast to earlier peptide detection
methods using mass
spectrometry, in which shorter peptides are generally favored over longer
peptides due to their
greater MS signal strength and better separation by conventional reversed-
phase
chromatography, the single molecule methods used in some embodiments of the
present
invention are not typically biased against long peptides. As described herein,
recognition of
multiple epitopes in a peptide can provide increased specificity, as well as
the possibility of
leveraging avidity effects to improve BINDER capture efficiency.
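A search of the kind described for "R...K" peptides can be sketched in Python as follows. This is a minimal illustration, not the calculation actually performed: it assumes idealized cleavage after every K and R (no missed cleavages, no proline rule), does not test whether a peptide is proteotypic, and uses a hypothetical sequence. The 7-31 residue length window is taken from the text.

```python
def rk_peptides(protein: str, min_len: int = 7, max_len: int = 31):
    """Return peptides that end in K, contain no other K or R, and are
    immediately preceded in the protein sequence by R."""
    hits = []
    start = 0
    for i, residue in enumerate(protein):
        if residue in "KR":
            peptide = protein[start:i + 1]
            preceded_by_r = start > 0 and protein[start - 1] == "R"
            if residue == "K" and preceded_by_r and min_len <= len(peptide) <= max_len:
                hits.append(peptide)
            start = i + 1  # next peptide begins after this cleavage site
    return hits

protein = "MAEKLLLRGSTVDAAEKAARTTTTTTTK"  # hypothetical sequence
print(rk_peptides(protein))  # ['GSTVDAAEK', 'TTTTTTTK']
```

Each hit would, after the two-step procedure, carry the first added group on its c-terminal lysine side chain and the second on its freshly exposed n-terminus. Applying such a function across a proteome and filtering for sequences unique to one protein would approximate the kind of tally reported above.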
Each of the two sequential linkages to ends of the peptide may, in some
embodiments,
involve a sequence of reactions, for example an initial reaction with an amine
coupling reagent
such as an NHS or sulfo-NHS conjugate of a click reagent (e.g., NHS-BCN, NHS-
DBCO,
NHS-TCO, NHS-tetrazine, NHS-azide, an NHS-alkyne, or the like), followed by
reaction of
the click group thus introduced with a corresponding click chemistry partner
attached to the
molecule to be added (e.g., reaction of a BCN-modified peptide with an azide-
modified oligo
tag or barcode). Other schemes for linkage of molecules to peptide amino
groups are well-
known in the art (e.g., direct reaction with an NHS adduct of the molecule to
be added, etc.)
and can be used to create the linkages described. In some embodiments, the
same coupling
chemistry is used in the first and second linkage steps, since the first
linkage is complete and
any component reagents can be removed prior to the exposure of the amino group
used in the
second linkage step. Alternatively, in some embodiments, the first linkage
step is carried out
using one or more reactive groups different from those used in the second
step. For example,
in some embodiments the first linkage step makes use of an NHS-BCN reagent to
activate
peptide amino groups which are then reacted with an azide-activated oligo to
accomplish the
first linkage. Following removal or depletion of the reagents involved in this
first step and a
second proteolytic digestion, a second linkage is carried out by activating
the freshly-exposed
n-terminal amino group with NHS-tetrazine and subsequent reaction with a TCO-
activated
oligo. Because of the general orthogonality of BCN-azide and tetrazine-TCO
click reactions,
these two steps are unlikely to cross-react even if some amount of reagent
persists from the
first modification step. In some embodiments, the first step of the 2-step
procedure is carried
out to the point of activating the lysine amino group with a member of a first
click pair (e.g.,
BCN) but without linkage to the first oligo, and a second activation step,
following the second
proteolytic step, is accomplished using a member of a second orthogonal click
pair (e.g.,
tetrazine), after which both activated peptide groups can be reacted
simultaneously with the
respective orthogonal activated oligos (i.e., with an azide-activated oligo at
the lysine site and
a TCO-activated oligo at the n-terminus).
In some embodiments that make use of sequential reactions (e.g., after the 2
successive
proteolytic cleavages by Lys-C and trypsin as described above) to expose
different peptide
reactive sites (e.g., lysine and n-terminal amino groups), the overall yield
of correctly modified
products can be increased by removal of the modifying reagents (e.g., NHS-BCN
and an azide-
labeled oligo) used in the first step (e.g., coupling an oligo to the lysine
amino group via BCN-
azide click coupling) before executing the second cleavage to expose fresh
reactive groups
(e.g., n-terminal amino groups). This removal step decreases the probability
that the peptide
reactive groups exposed by the second cleavage (e.g., the amino terminal NH2
groups shown
among the peptides of Figure 6D) will be modified in the same way as the amino
groups
exposed after the first digestion (e.g., addition of group M1), and instead
react only with a
second reagent or reagents that introduce a different addition (labeled M2 in
Figure 6E). In
some embodiments, the reagents involved in adding group M1 to the peptides are
removed by
separation of peptides from the solution phase (e.g., by capturing the
peptides on a suitable
support such as a reversed phase or ion exchange support and then washing the
soluble reagents
away), by size exclusion separation to separate peptides from low molecular
weight reagents
(such as NHS-BCN, etc.), or by exposing the mixture to a solid support
comprising a
substantial content of free amino groups to which any un-reacted amino-
modifying reagents
can couple before removal of the support. A variety of magnetic beads, agarose
particles and
column packing materials having reactive amino groups are commercially
available and can
be used for this purpose.
Following such a 2-step linkage procedure, peptides whose sequences are
bounded by
a c-terminal lysine and an n-terminal amino acid that is immediately preceded
in the protein
sequence by an arginine will, with high likelihood, be modified with distinct
added groups on
the two termini, as desired. In the invention, specific peptides with these
characteristics (i.e.,
a preceding arginine and c-terminal lysine) can be selected as TARGETs and
efficiently
incorporated in a predetermined orientation (i.e., n-term to c-term or vice
versa) into constructs
amenable to single molecule detection and counting. In the method shown
schematically in
Figures 12 and 13, an in-line construct is assembled that comprises, in order,
an oligonucleotide
in 5' to 3' orientation, a peptide in C-term to N-term orientation (opposite
to the conventional
method of writing a peptide sequence) and a further oligonucleotide in 5' to
3' orientation.
When the final (3') oligo is omitted from such a construct, the peptide's n-
terminus is exposed
and available for sequential degradation by Edman (or alternative) chemistries
used to read
peptide sequence (e.g., Encodia or Quantum-Si technologies). Those
knowledgeable in the art
will recognize that the approach described allows the design and construction
of hybrid
molecules in which peptides, oligonucleotides and other polymers can be linked
in a specified
order and a specified orientation adapted to a variety of different single
molecule detection
technologies.
It is well-known in the art that improved proteolytic digestion can be
achieved using a
combination of Lys-C and trypsin (57), and that this combination can be used
to advantage in
sequence (58): addition of Lys-C first in a concentrated denaturant (e.g., 6M
urea), followed
by subsequent addition of trypsin after dilution of denaturant (e.g., dilution
to 1.5M urea).
However, the use of the two enzymes in sequence, with each enzyme cleavage
step followed,
respectively, by coupling of a different added molecule to available amino
groups, is novel and
provides an effective method for the assembly of oriented linear constructs.
Similarly, the use
of the two enzymes in sequence, with the first enzyme cleavage step followed
by coupling of
an added molecule to available amino groups, while leaving unmodified the
n-terminal amino group
created by the second enzyme cleavage step, is a novel and effective method for
the generation
of a peptide-oligo construct having a free, unmodified n-terminal amino group
available for
cyclical degradative sequencing.
In some embodiments useful for nanopore detection, a linear construct is
produced
comprising a leading oligonucleotide, a central peptide, and a trailing
oligonucleotide
according to the invention. The method of the invention allows each segment to
be assembled
in a specific orientation as required by a detector such as a nanopore
regulated by a DNA
motor; e.g., a leading oligo oriented 5'-to-3', followed by a peptide that
is oriented c-terminal-
to-n-terminal, followed by an oligo oriented 5'-to-3' (as shown in Figure
12). Likewise, for
detection technologies that require peptides to be immobilized via the c-
terminus while
retaining an unmodified amino group available for sequential degradation
(e.g., by Edman
chemistry), the second amino group modification can be omitted (i.e., just
using sequential
first digestion-modification-second digestion) to produce tethered peptides
with free n-termini
as required by some degradative sequencing single molecule detectors.
While a c-terminal lysine is a preferred in-line linkage site in multiple
embodiments of
the sequential cleavage method described here, other cleavage sites can be
used in place of Arg
cleavage (as expected above in a second cleavage using trypsin), since many
proteolytic
enzymes, as well as chemical agents such as CNBr, generate a fresh n-terminal
amino group
when they cleave a polypeptide. A wide range of such alternative cleavage
specificities are
available: for example, the enzymes AspN, GluC, chymotrypsin, elastase or even
relatively
non-specific proteinase K can be used in combination with Lys-C or equivalent
enzymes to
generate different sets of double-amino peptides and extend the applicability
of the approach
to proteins not well-covered by the R...K method.
A related alternative embodiment makes use of chemical linkages to peptide
carboxyl
groups instead of amino groups, and employs a series of steps to 1) cleave
sample proteins n-
terminal to Asp residues (e.g., using the enzyme Asp-N) to produce "Asp
peptides" having an
n-terminal Asp residue; 2) react the Asp side chain carboxyl groups and the
exposed peptide
c-terminal carboxyl groups of these peptides (and any internal Glu carboxyl
side-chains), in
one or more steps (which may, for example, include click chemistry ligations),
with a first
added molecule (e.g., a first oligonucleotide); 3) removal or depletion of any
remaining
uncoupled amount of this first added molecule (or intermediate chemical
components); 4)
cleavage of the Asp peptides resulting from the first cleavage at one or more
selected internal
residues (e.g., by addition of trypsin to cleave K or R, Lys-C to cleave Lys,
Glu-C to cleave at
Glu, etc.), thereby exposing a fresh and unreacted c-terminal carboxyl group
in those peptides
that have an n-terminal Asp and such an internal residue; and 5) reaction of
these fresh c-
terminal carboxyl groups, in one or more steps, with a second added molecule
(e.g., a second
oligonucleotide using a click chemistry linkage). In this case, it is
preferred to select as
TARGET peptides those that lack an internal Glu residue since this would
comprise a third
carboxyl group when AspN is used initially. A symmetrical situation would
obtain if an
enzyme with GluN specificity were used initially, and preferred TARGET
peptides would be
those lacking internal Asp residues.
In some embodiments, appropriate STANDARDs can be produced according to any of
the 2-step methods described above by carrying out a similar set of steps
using a synthetic
peptide with an extended n-terminal sequence providing the same second-step
cleavage site as
the process applied in processing samples. In some embodiments the cognate
STANDARD
and TARGET constructs are distinguished by sequences incorporated into one or
more oligos
linked to the peptide. In some embodiments, one or more of the linkage steps
in assembling
STANDARD constructs makes use of a different click chemistry pair than that
used in
assembling TARGET constructs.
In some embodiments, multiple different 2-step peptide modification processes
as
described above are carried out in parallel on a sample, and their results
combined to provide
a collection of TARGET peptide constructs providing improved protein coverage
or detection
performance compared to a single 2-step procedure (e.g., the R...K method
described initially).
7.2 GENERATION OF TARGET PEPTIDE CONSTRUCTS
TARGET peptide constructs are created by linkage of TARGET tags to TARGET
peptides in a sample digest. In some embodiments, TARGET tags are linked to
common
chemical features of peptides, for example peptide n-terminal and/or lysine
epsilon amino
groups. The common occurrence of such features implies that it would be
advantageous to
convert a large fraction, and potentially all, peptides in a digest into
constructs of the form of
TARGET constructs, irrespective of whether each such construct is to be
measured against a
STANDARD construct. The feasibility of such an approach is related to the
efficiency with
which TARGET tags can be coupled to a range of peptides and the cost of
reagents required
to modify more than a few specific peptides. Enabling the use of inexpensive,
efficient
reagents and methods to link TARGET tags to all digest peptides is therefore
one of the objects
of the invention.
7.3 ADDITION OF STANDARD.
A TARGET peptide in a sample or set of samples is "standardized" by addition
of a
known quantity of its respective STANDARD (the STANDARD based on the sequence
of the
TARGET peptide with modification as disclosed above). The resulting
"standardized sample
digest" may be so standardized with respect to one TARGET peptide, or to
multiple TARGETs
(requiring multiple cognate STANDARDs). Multiple STANDARDs may be added
together
at one time, or at different times (e.g., after part of a standardized sample
is analyzed, additional
STANDARDs may be added to permit subsequent analysis of additional TARGET
peptides).
In some embodiments the added quantity of a STANDARD is known in absolute
quantitative terms (e.g., in grams/sample or grams/liter, in moles/sample or
moles/liter, or
molecules per sample or molecules per liter), and in some embodiments the
amount of
STANDARD added is known to be the same as, or have a defined ratio to, the
amount of
STANDARD added to other samples (thus allowing multiple samples to be compared
on a
consistent scale, particularly useful when samples run in a batch are compared
with one
another, or in longitudinal studies measuring changes in amounts of biomarker
proteins
occurring between serial samples from an individual). A sample to which
STANDARDs have
been added corresponding to a set of TARGET peptides is considered a
"standardized sample"
with respect to those TARGET peptides. In some embodiments one or more
STANDARDs
are added to a protein sample before or during digestion of the sample
proteins to peptides. In
some embodiments, one or more STANDARDs are added after digestion but prior to

enrichment. In some embodiments additional STANDARDs are added to a digest
sample that
has been previously analyzed according to the invention for an earlier set of
TARGET peptides
and STANDARDs, enabling cycles of measurement for successive panels of
peptides in a
sample.
In some embodiments, the quantity of STANDARD added to a sample is chosen
based
on the amount of the respective TARGET peptide expected to be present in the
sample(s).
Specifically, the quantity of STANDARD may be based on the average or median
amount of
TARGET peptide observed or known to be present in similar samples, so that the
ratio of
TARGET peptide to STANDARD molecules falls in a range centered close to 1.0
(i.e., equal
amounts). Depending on the variation in TARGET peptide amount in the samples,
the ratio
may for example range from 0.5 to 2, or 0.2 to 5, or 0.1 to 10, or 0.01 to
100. The benefit of
arranging STANDARD amount based on TARGET peptide ranges is that it best
avoids
situations where the ratio is very large (e.g., 1,000:1). Extensive
investigations of the
population ranges of clinical protein analytes (59) have shown that the
observed range is
protein-specific. High-abundance blood proteins such as albumin or hemoglobin
usually vary
by small amounts (much less than 2-fold), while acute phase proteins such as C-
reactive protein
(CRP) or serum amyloid A (SAA) can increase by 1000-fold in a serious
infection.
Since the present invention utilizes single molecule counting for purposes of
quantitating the peptides (and therefore yields measurements whose precision
is expected to
depend on counting statistics in which precision is typically determined by
the square root of
the number of objects counted), a standardized sample with a 1,000:1 ratio of
TARGET peptide
to STANDARD (or vice versa) would require the detector to count 1,000 times as
many of the
higher abundance peptide molecules as lower abundance peptide molecules (i.e.,
a total of
1,000 + 1 = 1001 times the minimum acceptable counts in each peptide to
achieve desired
precision based on established results in counting statistics and experimental
data), thereby
wasting counting capacity compared to a situation in which the peptides are
present at closer
to equal abundance (1 + 1 = 2 times the minimum acceptable counts would need to
be counted).
It will be clear to one skilled in the art that the optimal situation for
efficient quantitation by
counting TARGET and STANDARD peptide molecules is one in which the ratio is as
close to
1:1 as practically possible. A person skilled in the art could design an assay
according to the
invention to measure hemoglobin using the population average value for the
TARGET
peptides as the STANDARD amount, while in CRP it could be preferable to set
the
STANDARD amount higher than the population average TARGET amount so as to
better
center the TARGET to STANDARD ratio for this highly inducible protein closer
to 1.0.
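The counting argument above can be illustrated numerically. The following is a minimal sketch, assuming Poisson counting statistics (the relative standard deviation of a count N is approximately 1/sqrt(N)) and a hypothetical minimum of 10,000 counts for the lower-abundance member of a pair; the function names and the threshold are illustrative assumptions and are not values taken from this specification.

```python
import math

def total_counts_required(ratio, min_counts):
    """Total molecules to count so that the lower-abundance member of a
    TARGET/STANDARD pair reaches min_counts, for a given abundance ratio."""
    fold = max(ratio, 1.0 / ratio)  # treat 1,000:1 and 1:1,000 identically
    return min_counts * (fold + 1.0)

def ratio_cv(target_counts, standard_counts):
    """Approximate coefficient of variation of the TARGET/STANDARD ratio,
    assuming independent Poisson-distributed counts."""
    return math.sqrt(1.0 / target_counts + 1.0 / standard_counts)

MIN_COUNTS = 10_000  # hypothetical minimum acceptable count per peptide

balanced = total_counts_required(1.0, MIN_COUNTS)    # (1 + 1) x MIN_COUNTS
skewed = total_counts_required(1000.0, MIN_COUNTS)   # (1,000 + 1) x MIN_COUNTS

print(balanced, skewed, skewed / balanced)
print(ratio_cv(MIN_COUNTS, MIN_COUNTS))
```

Under these assumptions the 1,000:1 sample consumes roughly 500 times the counting capacity of the 1:1 sample to reach the same minimum count on the lower-abundance peptide.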
For similar reasons, in some embodiments it is preferred that the amounts of
molecules
of different TARGET peptides measured together in a multiplex assay (each with
its cognate
STANDARD) should be as nearly equal as possible. This arrangement results in
optimal
precision achievable with a given total capacity for counting molecules
according to counting
statistics, and can be achieved by stoichiometric flattening during enrichment
as described
below.
In some embodiments two or more TARGET peptides are selected for a protein,
yielding independent measurements of the protein's amount that can be combined
to deliver
improved precision, or used to circumvent sequence (i.e., genetic) or post-
translational
variation in TARGET peptides in a population of samples. In general, unless a
TARGET
peptide sequence is repeated in a protein, different TARGET peptides from the
same protein
will be present in equal amounts after complete digestion (i.e., present in
molar amounts equal
to the molar amount of the parent protein). In some embodiments, multiple
TARGETS are
selected from a protein that exhibits highly variable amounts in relevant
samples, and their
respective cognate STANDARDS are added at different amounts so that the TARGET-
to-
STANDARD ratio of at least one of the TARGETS is close enough to 1:1 to be
efficiently
countable and thereby furnish an accurate ratio measurement. In some
embodiments, for
example, three TARGET peptides are selected and their respective STANDARDS
added to
the sample digest at 0.1x, 1.0x, and 10.0x the average amount of TARGET in an
average
sample, so that variation in the amount of target protein over a range of 10-
fold above and
below the expected amount (100-fold dynamic range) will be measured by at
least one of the
TARGET-to-STANDARD ratios close to 1:1. In some embodiments of this kind, the
amounts
of the respective BINDERs used to enrich the peptides are adjusted separately
to bring the
peptides into flat stoichiometry (near equivalence) before detection and
counting.
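The three-STANDARD arrangement described above can be sketched as a simple selection rule. This is an illustrative sketch only: it assumes that the three TARGET peptides are equimolar after complete digestion (as stated above), expresses all amounts in units of the average TARGET amount, and uses hypothetical peptide labels; picking the ratio closest to 1:1 on a logarithmic scale is one plausible criterion rather than a rule stated in this specification.

```python
import math

def best_standard(target_amount, standard_amounts):
    """Return (label, ratio) for the STANDARD whose TARGET-to-STANDARD
    ratio is closest to 1:1 on a logarithmic scale."""
    label, amount = min(
        standard_amounts.items(),
        key=lambda item: abs(math.log10(target_amount / item[1])),
    )
    return label, target_amount / amount

# STANDARD amounts in units of the average TARGET amount (hypothetical labels).
standards = {"peptide_A": 0.1, "peptide_B": 1.0, "peptide_C": 10.0}

for target in (0.1, 1.0, 3.0, 10.0):
    print(target, best_standard(target, standards))
```

With these inputs, a protein present at one tenth of its expected amount is read against the 0.1x STANDARD and a protein present at ten times its expected amount is read against the 10.0x STANDARD, in each case at a ratio of 1:1.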
7.3.1 Differences between MS and single molecule requirements
It is worthwhile to note that detection of peptides by single molecule
counting involves
significantly different tradeoffs as compared to quantitation by mass
spectrometry (MS). MS
typically produces analog measurements of the amount of a molecule that passes
through a set
of mass filters, or reaches a detector after some mass or size-based
separation process. Since
different molecules are typically detected at different times, the amount of
one peptide does
not usually affect the detection of a different peptide in a significant way,
apart from extreme
cases in which a detector is saturated, or a total aggregate ion capacity is
exceeded. The
dynamic range of modern triple-quadrupole MS instruments approaches 100,000-
fold for a
given analyte (e.g., peptide), and the MS (or LC-MS) system can accommodate
two different
peptide molecules that differ in abundance by 1 million-fold and still produce
some
quantitative information on both (e.g., by selecting a low-efficiency
detection mode such as an
infrequent MRM fragment mass for the high abundance molecule and a very
efficient detection
mode for the low abundance molecule). The detection process is typically
driven by a
chromatographic separation of 2 to 60 minutes prior to introduction of
separated peptides into
the MS, and is essentially insensitive to the total number of molecules in the
applied sample.
The analytical run will "consume" all the applied sample and will occupy the
same period of
time regardless of the number of molecules being analyzed.
The situation pertaining in single molecule detection and counting is very
different: the
number of molecules sequenced and counted depends directly on time. Thus, for
nanopores
the number of molecules analyzed is a direct function of the run time divided by
the time required to observe a typical
molecule's sequence (e.g., 0.25 sec for a 50 bp equivalent length oligo:peptide
construct at a
typical rate of 200 bp/sec through a single nanopore), multiplied by the number
of pores
operating in parallel: running the device twice as long will typically detect
twice as many
peptides. Hence nanopore detection methods are inherently limited by the
number of
molecules that can be sequenced per unit time. As discussed below, the precision of
ratios between
TARGET and STANDARD peptides is largely determined by counting statistics,
with more
counts yielding higher precision. As a result, it is highly desirable in the
present invention to
avoid wasting capacity (in peptides or time) in counting more than a specified
number of a
given peptide, since this capacity could instead be used to count more molecules
of a lower
abundance peptide to improve its precision. In some embodiments this principle
is applied to
generate enriched peptide samples in which a) the number of added STANDARD
molecules
is close to the average number of TARGET peptide molecules in typical samples,
and b) the
sum of TARGET + STANDARD molecules is approximately the same for all proteins
and
peptides being measured in a multiplex panel. This principle, referred to
herein as
"Stoichiometric Flattening", is described in greater detail below.
Similarly, for degradative peptide sequencing methods, a series of sequential
steps is
applied in parallel to a large number of immobilized peptide molecules, with
the number of
molecules being determined by the physical scale of the device used (number of
peptides that
can be immobilized and resolved by the detector) and the number of such runs.
Likewise, for
single molecule imaging detection methods, a large but limited number of
immobilized peptide
molecules is detected in a run, with the number of molecules being determined
by the physical
scale of the device used (number of peptides that can be immobilized and
resolved by the
detector) and the number of such runs. The overall throughput in terms of the
number of
molecules analyzed per unit of time is determined by the number of molecules
sequenced per
run, the duration of a run, and the number of runs (for a batch method).
Hence, as with
nanopore sequencing, efficiency is maximized by sequencing similar numbers of
each
TARGET and STANDARD peptide, instead of allowing one or more high abundance
peptides
to occupy a large fraction of the capacity, reducing the numbers of molecules
from lower
abundance TARGETS and thus the precision of their measurement.
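The time-limited throughput of single molecule detection discussed in this section can be illustrated for the nanopore case. The sketch below uses the example figures given above (a 50 bp equivalent construct read at 200 bp/sec) and ignores idle time between reads, so it gives an upper bound; the pore count of 512 and the function name are hypothetical values chosen for illustration.

```python
def molecules_per_run(construct_length_bp, rate_bp_per_s, n_pores, run_seconds):
    """Upper bound on the number of constructs read in one run,
    ignoring idle time between successive reads in a pore."""
    seconds_per_molecule = construct_length_bp / rate_bp_per_s
    return n_pores * run_seconds / seconds_per_molecule

# One pore for one hour: 3600 s / 0.25 s per molecule.
per_pore_hour = molecules_per_run(50, 200, n_pores=1, run_seconds=3600)

# Hypothetical device with 512 pores, run for one and for two hours.
one_hour = molecules_per_run(50, 200, n_pores=512, run_seconds=3600)
two_hours = molecules_per_run(50, 200, n_pores=512, run_seconds=7200)

print(per_pore_hour, one_hour, two_hours)
```

Because the total is fixed by pores and time, every molecule of an over-represented peptide that is read displaces a read that could have been spent on a lower-abundance peptide.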
7.3.2 Comparison between samples
In some embodiments, a fixed amount of each STANDARD (e.g., an equal volume
aliquot from the same STANDARD stock solution) is added to each of a
multiplicity of
samples, thereby establishing a shared reference basis that allows accurate
comparison of the
amounts of TARGET peptides between these samples. This approach enables
relative
comparison of protein amounts between samples, but does not directly provide
absolute
quantitative information (e.g., in mass or concentration units) without the
use of external
calibrators (see e.g., US provisional patent application 63/213,371 - entitled
Calibration of
Analytical Results in Dried Blood Samples, filed 6/22/21, incorporated herein
in its entirety).
In some embodiments, the amounts of STANDARD molecules added represent known
quantitative amounts (i.e., numbers of molecules, mass, or concentration), in
which case the
absolute amounts of TARGET peptides can be estimated by multiplying the
STANDARD
amounts by the observed TARGET peptide to STANDARD ratios.
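The calculation described above amounts to a single multiplication. A minimal sketch follows, assuming the TARGET-to-STANDARD ratio is taken directly as the ratio of molecule counts; the function name, the counts, and the 50 fmol STANDARD amount are hypothetical values for illustration.

```python
def estimate_target_amount(target_counts, standard_counts, standard_amount):
    """Estimate the TARGET amount as the STANDARD amount multiplied by the
    observed TARGET-to-STANDARD count ratio. The result is in the same
    units as standard_amount."""
    if standard_counts <= 0:
        raise ValueError("at least one STANDARD molecule must be counted")
    return standard_amount * target_counts / standard_counts

# Hypothetical run: 12,000 TARGET and 10,000 STANDARD molecules counted,
# with 50 fmol of STANDARD added to the sample.
amount_fmol = estimate_target_amount(12_000, 10_000, 50.0)
print(amount_fmol)
```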
In some embodiments, STANDARDs are generated from synthetic (e.g.,
recombinantly expressed) protein constructs whose digestion yields STANDARDs
in relative
ratios defined by their copy number in the construct's sequence (see for
example Patent
Application US 2006/0154318). In some embodiments the STANDARDs are provided
in
physical forms enabling simplified manipulation and addition (as described in
US9274124).
In some embodiments the STANDARDS are added as peptides in solution.
7.3.3 Amount of standard added
In some embodiments, the amounts of each STANDARD added can be determined
according to (or equal to) the average or baseline levels of corresponding
TARGET peptides
observed in a subject's prior samples, thus providing STANDARD levels that are
more or less
equal to the expected TARGET peptide levels for that subject. This approach
provides optimal
precision and efficiency for the measurement of longitudinal samples from that
subject, and
represents an ideal case of personalized protein measurement with the
potential to maximize
precise detection of small protein changes from baseline levels over time.
7.4 ENRICHMENT OF TARGET PEPTIDES AND STANDARDS.
Once a sample digest has been "standardized" by addition of one or more of the

appropriate STANDARDs in the desired amounts, the sample can be fractionated
or purified
to enrich analytes to a desired level, and/or to deplete unwanted components
(the analytical
"matrix" background). In some embodiments, peptide-oligo constructs according
to the
invention are purified by a process that reversibly captures peptides (e.g.,
reversed-phase
adsorbents such as C18 resins) with the result that oligonucleotides and other
non-peptide
components not part of peptide-oligo constructs can be washed away and thus
removed. In
some embodiments, peptide-oligo constructs according to the invention are
purified by a
process that reversibly captures oligonucleotides (e.g., adsorbents such as
Ampure resins) with
the result that free peptides, remaining proteins and other non-oligonucleotide components
not part of
peptide-oligo constructs can be washed away and thus removed. In some
embodiments
additional features of the peptide-oligo constructs are used to isolate them,
for example a biotin group engineered into the oligo, the peptide or the linkage
between them can be
captured by immobilized streptavidin as a means of cleaning up the desired
constructs.
After any such construct cleanup steps, enrichment using specific affinity
BINDERs is
an important aspect of the invention. In some embodiments BINDERs are used to
carry out
specific affinity enrichment of the respective cognate TARGET peptide and
STANDARD
pairs. The TARGET peptide, its STANDARD and the BINDER designed to bind them
collectively form a cognate set of molecules specialized for the measurement
of a specific
TARGET peptide and thus its parent protein.
7.4.1 Magnetic bead enrichment
In some embodiments, BINDERs are immobilized on magnetic beads and these beads

mixed with the standardized digest and incubated to allow binding of peptides
to the BINDER.
In some embodiments (e.g., using anti-peptide antibodies as BINDERs) the
BINDERs can be
bound to commercially-available Dynabeads G via Protein G's affinity for the
Fc domain of
IgG, and optionally covalently linked to the beads using DMP crosslinking or
other equivalent
crosslinking methods forming bonds between the magnetic bead and the BINDER.
In some
embodiments BINDERs are bound to other types of magnetic particles such as
Tosyl-activated
beads, or otherwise chemically bound to particles that can be manipulated to
allow exposure
to and removal from a sample. Due to their specific affinity for the TARGET
peptide
sequences, the BINDERs bind the respective TARGET peptides and STANDARDs when
placed in contact with them. In some embodiments, BINDERs are in contact with
a
standardized sample digest for a 30-minute incubation period with shaking to
keep the beads
suspended. In some embodiments, BINDERs are incubated with standardized digest
for
shorter periods (e.g., 1, 2, 5, 10, 15, 20 or 25 minutes) and in some
embodiments, BINDERs are
incubated with standardized digest for longer periods (e.g., 45, 60, 90, 120,
or 180 minutes, or
4, 5, 6, 9, 12, 18 or 24 hours). Depending on the kinetic properties of
specific BINDERs, the
abundances of their cognate TARGET and STANDARD peptides, and the presence or
amounts
of any competing sample peptides (e.g., peptides with different but similar
sequences to a
TARGET peptide), persons skilled in the art will understand how to perform
experiments to
evaluate and select a suitable incubation time. After binding, the beads with
attached
BINDERs and their bound peptide cargo are separated from the digest. To
achieve this
separation, the BINDER beads can be collected together using a magnet and
either removed
from the digest (for example using a Kingfisher device provided by
ThermoFisher); held in a
vessel (for example by magnetic attraction to the side of a well of a 96-well
plate) while the
digest solution is removed to another container by a pipetting device (e.g.,
an Agilent Bravo,
Beckman Coulter Biomek, Hamilton, Tecan or other liquid handling robot); held
magnetically
in a conventional pipette tip while surrounding liquid is expelled and
replaced (e.g., as in the
established "Magtration" technology); processed in a "magnetic bead trap" (60)
or processed
manually (e.g., by manipulation of vessels, magnets and handheld pipettes).
The standardized
digest remaining after separation from BINDERs may be preserved apart from the
BINDERs
and stored or subjected to additional processes to measure additional
constituents at a later
time.
In some embodiments, the beads with BINDERs and specifically bound peptide
cargo
are washed by addition to, and mixing with, aliquots of a wash solution, after
which the beads
are recollected and separated from the wash. In some embodiments the beads are
washed 1, 2,
3, or 4 times with volumes of 50 to 400uL of wash solution, which may include
buffers (e.g.,
PBS or Tris, typically at pH between 6.0 and 8.5 when antibody BINDERs are
used), gentle
detergents (e.g., CHAPS or deoxycholate), and/or low concentrations of an
organic solvent
(e.g., 5-20% acetonitrile added to help remove peptides bound to beads non-
specifically).
Persons skilled in the art will be able to evaluate and select wash solution
compositions that
are most effective at removing non-target digest peptides (and other
components) while
retaining TARGET peptides and STANDARDs on the BINDER. Since it is desired
that the
specifically-bound peptides remain attached to the beads during the wash
procedure, in some
embodiments the BINDERs are designed or selected to have half-off-times (the
time period
over which half the bound molecules become unbound, i.e., the dissociation
half-life) longer
than the time required to execute the sequence of wash steps (typically 10-15
minutes using
current laboratory automation systems). In some embodiments, for example those
in which
the BINDERs are present at high local concentrations at some point(s) during
the wash process,
TARGET and STANDARD peptides that leak from the BINDER can be re-bound by the
same
or other BINDER sites before being lost. It will be understood by those
skilled in the art that
experimentation with specific BINDERS, TARGETs and STANDARDs, specific wash
solution compositions and temperatures, specific wash volumes and vessel
geometries, and
specific sample digest matrices is required to optimize a) the enrichment of
the TARGETs and
STANDARDs and b) the removal of the other digest components. As a general
matter, the
purer the TARGETs and STANDARDs are after enrichment, the better the invention
will
function to measure them precisely.
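The relationship between half-off-time and wash duration mentioned above can be estimated with a simple decay model. The sketch below assumes first-order dissociation with no re-binding, which is an assumption of this illustration; since the text notes that leaked peptides may be re-bound, the result is best read as a conservative estimate. The half-off-times shown are hypothetical.

```python
def fraction_retained(wash_minutes, half_off_minutes):
    """Fraction of bound peptide still attached to the BINDER after a wash,
    assuming first-order dissociation and no re-binding."""
    return 0.5 ** (wash_minutes / half_off_minutes)

WASH_MINUTES = 15  # upper end of the typical 10-15 minute wash sequence

for half_off in (15, 60, 150):  # hypothetical half-off-times in minutes
    print(half_off, round(fraction_retained(WASH_MINUTES, half_off), 3))
```

Under this model a BINDER whose half-off-time equals the wash duration loses half of its cargo, while one with a tenfold longer half-off-time retains more than 90%.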
In some embodiments, the recovery of TARGETs and STANDARDs from a digest by
enrichment using BINDERs is evaluated by successively contacting 2 or more
separate
aliquots of BINDER with the digest, and comparing the amounts eluted from the
first and
second BINDER aliquots. In cases where the number of BINDER binding sites is
greater than
the number of TARGET or STANDARD molecules, an effective BINDER will typically

capture 80% or more of these peptides on the first capture step, and the
second (and possibly
subsequent) capture steps will capture successively smaller amounts of the
peptides. The amount
of TARGET and STANDARD captured in the first capture divided by the amount of
these
peptides in the sum of first and second captures provides a useful index of
the overall recovery.
In cases where the number of BINDER binding sites is less than the number of
TARGET and
STANDARD molecules, sequential BINDER captures will yield more nearly equal
amounts
of these peptides.
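The recovery index described above can be computed as follows. This is a minimal sketch; the function name and the example amounts (arbitrary units) are hypothetical, with the first example corresponding to a BINDER in excess that captures 80% of the available peptide at each step, and the second to a BINDER present in limiting amount.

```python
def recovery_index(first_capture, second_capture):
    """Amount eluted from the first BINDER aliquot divided by the sum of the
    amounts eluted from the first and second aliquots."""
    total = first_capture + second_capture
    if total <= 0:
        raise ValueError("no peptide was recovered in either capture")
    return first_capture / total

# BINDER in excess: 80% of 100 units captured, then 80% of the remaining 20.
excess = recovery_index(80.0, 16.0)

# BINDER limiting: both aliquots saturate and capture similar amounts.
limiting = recovery_index(40.0, 40.0)

print(round(excess, 3), limiting)
```

An index near 1 suggests efficient capture by the first aliquot, while an index near 0.5 suggests that the BINDER binding sites are saturated.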
In some embodiments, BINDERs immobilized on magnetic beads are contained
within
microfluidic systems capable of moving the beads between different liquid
volumes to effect
the steps of the invention. Microfluidic systems and technologies well known
in the art allow
use of reduced liquid volumes (and thereby less dilution of low-concentration
analytes such as
enriched peptides) and more complex, multi-step chemical processes with
reduced losses as
compared to conventional lab-scale (i.e., 5-500uL) liquid handling processes
(for example in
a magnetic bead trap device (60)).
7.4.2 Column format enrichment
In some embodiments, BINDERs are immobilized on columns through which sample
digest and wash solutions can be passed (3), typically using a liquid
chromatography system.
7.4.3 Post-peptide-capture immobilization
In some embodiments BINDERs are contacted with a standardized digest in
solution
(i.e., BINDERs free in solution, not bound to a support), thereby maximizing
freedom of
diffusion and potentially providing faster binding to TARGET and STANDARD
peptides or
respective VEHICLE constructs. After binding, the BINDERs can be themselves
captured on
magnetic beads (e.g., protein G coated beads in the case of antibody BINDERs,
streptavidin
coated beads in the case of BINDERs that have been previously biotinylated,
etc.) or on
columns functionalized with equivalent capture functionalities.
7.4.4 Multiple enrichment cycles
In some embodiments, peptides captured by and eluted from BINDERs are
subjected
to one or more additional cycles of BINDER enrichment. When bound peptides are
eluted
from BINDERs by a change to specific elution solution conditions (e.g., pH 2.5
in the case of
antibody BINDERs), reversal of these conditions (e.g., neutralization to pH
near 7.0) can allow
the peptides to bind to respective BINDERs once again. Similarly, if elution
is carried out by
exposure to a chaotropic salt, a detergent, or increased temperature, these
conditions can be
reversed (e.g., through dilution of a salt or detergent, dialysis, or
cooling), restoring conditions
in which binding to BINDER (typically fresh BINDER) can occur. After the
initial BINDER
enrichment process, the captured peptides are removed from the bulk sample
digest, and thus
the vast majority of non-target "matrix" peptides are no longer present. A
second BINDER
enrichment cycle thus begins with a much smaller amount of total peptide
material, in which
the TARGET and STANDARD peptides represent a much larger fraction of dissolved
material
(mainly peptides). In some embodiments, a second BINDER enrichment cycle is
carried out
using fresh BINDER (i.e., BINDER that has not previously been exposed to
complex digest),
and this additional cycle further depletes non-target matrix peptides, while
recovering a large
fraction of the TARGET and STANDARD peptides, and resulting in a purer sample
of the
TARGET and STANDARD peptides of interest. Increasing the fraction of peptides
that are
desired TARGET or STANDARD peptides directly improves the efficiency of single
molecule
detection aimed at measuring those peptides by decreasing the time and
resources spent
sequencing other peptides. In some embodiments, one or more additional
enrichment cycles
are carried out using a smaller amount of BINDER (e.g., a smaller volume of
beads) than used
in the initial enrichment cycle, resulting in an opportunity to reduce the
volume in which
peptide constructs are eluted and thereby increasing the concentration of
TARGET and
STANDARD peptide constructs introduced into a detector. The ability to deliver
TARGET
and STANDARD peptides in a small volume, or on a small number of beads, can
improve the
efficiency of single molecule detection. Providing peptides for single
molecule detection in as
concentrated a form as possible can be important in specific detection
methods, for example in
delivering peptides to the vicinity of a sequencing nanopore as described
below. In some
embodiments, a first BINDER capture step is carried out using BINDER
immobilized on a
large number of small (e.g., 1, or 2.8 or 5 micron diameter) magnetic beads,
thus maximizing
the dispersion of BINDER in the sample digest volume and decreasing the
diffusion distance
and time required for peptide capture, after which the captured peptides are
eluted and
recaptured by fresh BINDER immobilized on a smaller number of beads, with the
result that
the peptides captured in the second round are both purer (fewer non-TARGET
peptide
molecules) and more concentrated (e.g., when the beads are collected
magnetically into a mass
for removal from contact with the peptide-containing liquid). In some
embodiments, recovery
of TARGET and STANDARD peptides in a very small volume of beads allows the
bound
peptides to be exposed to very small volumes (equal to or slightly greater
than the included
volume of the bead mass) of reagents used in the preparation of peptides for
linkage to
VEHICLEs as described below. In some embodiments a second capture makes use of

BINDER bound to a small number of larger (e.g., 10-40 micron diameter) beads
each having
a greater BINDER capacity, such that each bead can be taken through the series
of chemical
steps to prepare it for and/or complete VEHICLE linkage to it in a separate
container, as
described below.
In some embodiments a first BINDER enrichment cycle is carried out to recover
and
purify the TARGET and STANDARD peptides from a mass of sample digest, and the
same or
similar peptide-specific binders are used to identify these TARGET and STANDARD
peptides in
a single molecule detection system. The first BINDER enrichment cycle can thus
serve to
remove non-TARGET peptides present in the digest (thus minimizing analytical
capacity
wasted on irrelevant peptides), and optionally to improve the stoichiometric
flatness of a series
of different TARGETs to be detected and counted (as described in detail
below). For example,
a fluorescently-labeled BINDER can be used to detect its cognate TARGET and
STANDARD
peptides in an optical imaging system of the kind used in high-throughput DNA
sequencing or
in similar protein detection systems (e.g., US 2021/02397), and thereby used to
count the
numbers of such molecules. In some embodiments a second class of BINDER that
specifically
recognizes a unique tag present in the STANDARD but not in the TARGET peptide
is used to
separately distinguish the STANDARD peptide molecules from the TARGET
molecules.
Using these two separate detection steps applied to a population of single
molecules (i.e.,
identification of the molecules in a TARGET + STANDARD pair, and separate
identification
of STANDARD peptides), allows separate identification and counting of the
molecules of
TARGET and STANDARD peptides, and thereby determination of the TARGET-to-
STANDARD ratio. In some embodiments, this approach is used to identify all
STANDARD
molecules in one recognition step (using a BINDER specific to the STANDARD
molecule
tag), and each different TARGET+ STANDARD pair is identified by a separate
detection step
using the TARGET-specific cognate BINDER. In some embodiments, the tag
distinguishing
the STANDARD molecules can be an added amino acid sequence on either amino or
carboxy
terminus of the TARGET peptide sequence (for example the well-known FLAG
peptide
sequence used in recovering expressed proteins), a chemical group bound to the
STANDARD
(such as biotin), or any of a variety of distinctive chemical structures
unlikely to be found in
the group of TARGET peptides. In some embodiments this approach can be applied
to count
whole protein molecules instead of proteolytic peptides using peptide-sequence-
specific
BINDERS to identify their cognate linear sequence epitopes in intact target
protein molecules.
In such embodiments aimed at detecting protein molecules, a first BINDER
enrichment cycle
can use BINDERS specific for an intact protein as well as BINDERS specific for
linear peptide
epitopes, and STANDARDS can be versions of the intact protein with any of a
variety of
unique tags as for peptides.
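The two-step identification described above for peptides (one detection step recognizing a TARGET + STANDARD pair, and a separate step recognizing the STANDARD tag) reduces to a simple classification of each observed molecule. The sketch below is illustrative only: it assumes each immobilized molecule yields one yes/no result per detection step, and the data layout and function name are hypothetical.

```python
def count_pair(observations):
    """Count TARGET and STANDARD molecules for one cognate pair.

    observations: iterable of (pair_positive, tag_positive) booleans, one per
    immobilized molecule, giving the results of the two detection steps.
    """
    target = 0
    standard = 0
    for pair_positive, tag_positive in observations:
        if not pair_positive:
            continue  # molecule belongs to another pair or to the matrix
        if tag_positive:
            standard += 1
        else:
            target += 1
    return target, standard

# Hypothetical field of six molecules.
observations = [
    (True, False), (True, True), (True, False),
    (False, True),   # STANDARD belonging to a different pair
    (False, False),  # non-TARGET matrix peptide
    (True, True),
]
target_count, standard_count = count_pair(observations)
print(target_count, standard_count, target_count / standard_count)
```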
7.4.5 Stoichiometric flattening
In some embodiments, a plurality of TARGET and STANDARD peptides is enriched
by the corresponding plurality of BINDERs, and the relative amounts, kinetic
properties or
solution conditions of the BINDER enrichment are selected or adjusted so as to
accomplish
some degree of stoichiometric flattening; i.e., to diminish differences in the
relative amounts
of different TARGET + STANDARD peptide pairs. As described above, to obtain
the benefit
of stoichiometric flattening, it is necessary to standardize measurement of
TARGET peptide
constructs by incorporating STANDARD constructs that can act as internal
standards before
BINDER enrichment (to preserve information on quantitation), and to effect
enrichment using
peptide-specific BINDERs.
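The effect of adjusting BINDER amounts can be sketched with an idealized capture model. This illustration assumes that a BINDER captures its cognate TARGET and STANDARD with equal efficiency, that capture is complete up to the number of available binding sites, and that a sub-saturating BINDER therefore captures both species in proportion to their abundance. The names and all amounts (arbitrary numbers of molecules) are hypothetical.

```python
def flatten(pairs, binder_sites):
    """Idealized BINDER capture.

    pairs: {name: (target_amount, standard_amount)} in the standardized digest.
    binder_sites: {name: number of binding sites of the cognate BINDER}.
    Returns {name: (captured_target, captured_standard)}.
    """
    captured = {}
    for name, (target, standard) in pairs.items():
        total = target + standard
        fraction = min(1.0, binder_sites[name] / total)
        captured[name] = (target * fraction, standard * fraction)
    return captured

pairs = {
    "high_abundance": (1_500_000.0, 1_000_000.0),
    "low_abundance": (800.0, 1_000.0),
}
binder_sites = {"high_abundance": 1_800.0, "low_abundance": 1_800.0}

captured = flatten(pairs, binder_sites)
for name, (t, s) in captured.items():
    print(name, round(t + s), round(t / s, 3))
```

In this example both pairs emerge with about 1,800 molecules in total, while the TARGET-to-STANDARD ratios (1.5 and 0.8) are unchanged, which is why the STANDARD must be added before enrichment.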
Stoichiometric flattening, described in greater detail below, is distinct from

conventional sample preparation enrichment methods. In genome sequencing, gene

stoichiometries are equal, or close to equal, to begin with and do not need to
be adjusted. In
studies of RNAs, which can be present at a range of relative abundances,
enrichment can be
used to focus on a specific set of RNAs (e.g., mRNAs), but sequence-specific
internal
standards are not used and thus differential enrichment of different sequences
destroys relative
abundance information. In proteomics studies aimed at broad proteome coverage,
internal
standards have found more use because of the practicality of using stable
isotope labeled
peptides and mass spectrometric detection; however, the requirement for
individual enrichment
BINDERs for a significant fraction of digest peptides remains an
insurmountable problem
given the number of proteins present in most samples (and the ~50-fold higher
number of peptides
therefore present in a sample digest). In the case of targeted methods such as
SISCAPA, the
use of mass spectrometric detection makes it possible to achieve some degree
of stoichiometric
flattening by choosing target peptides that have very different ionization
properties; i.e.,
choosing peptides with extremely high MS detection performance to represent
low abundance
proteins (where sensitivity is a major challenge) and choosing peptides with
much lower
performance to represent higher abundance molecules. Such peptide choices are
an important
component of stoichiometric flattening in MS methods, in addition to the
adjustment of relative
BINDER enrichments as described above.
Stoichiometric flattening by choosing peptides with different detection
efficiencies is
not useful in single molecule counting methods, which generally count all
molecules with
equal efficiency. In the present invention, focused on single molecule
detection and counting,
in most embodiments there is little or no difference expected between the
detection efficiencies
of different peptides (in contrast to the situation in mass spectrometry where
ionization
efficiency, ion transport, molecular fragmentation, and detection efficiency
are all peptide-
specific and highly variable). While the near-equivalence of peptides as far
as detection
efficiency in single molecule methods is an important advantage in expanding
peptide choice
for any given protein target, it removes one of the major avenues available
for stoichiometric
flattening and restricts the method to adjustments in the BINDER capture step
only.
7.4.6 Consequences of not flattening stoichiometry
A further important distinction between the methods based on mass spectrometry

versus single molecule sequence-sensitive detectors is the impact of failure
to flatten
stoichiometry. In typical mass spectrometry protocols using liquid
chromatography (LC-MS),
a sample is analyzed using a method involving a specified chromatographic
separation (with a
specified duration, usually in the range of 1-60 minutes), and the mass
spectrometer is
presented with whatever peptide ions emerge from the end of the column,
whether they are too
few to register as a signal, or too many to be accurately measured, or in
between. In other
words, the sample analysis typically takes a pre-specified length of time
whether the
stoichiometry of TARGET peptides in the sample has been flattened or not.
Using single
molecule methods, for example using nanopore sequencing, each peptide molecule
takes some
time to be analyzed - during this time one pore is occupied with one molecule,
or a fixed
number of molecules requires a given block of time to analyze. While the
throughput of such
a process can be increased by providing multiple pores or molecule
immobilization sites, or
decreasing read time, there remains a direct relationship between the number
of molecules
detected and the time devoted to the analysis of that sample. It is therefore
of paramount
importance when using single molecule methods for quantitation according to
the present
invention that the system not spend unnecessary time sequencing a) peptide
molecules that are
not TARGET or STANDARD peptides (i.e., do not contribute to the results
desired), or b)
peptides that have already been sequenced in sufficient numbers to provide a
TARGET-to-
STANDARD ratio with the desired precision (e.g., %CV based on counting
statistics).
Removing peptides defined by (a) above is a matter of enriching TARGET
peptides and
depletion of the rest. However, minimizing the counting of peptides that are
surplus to
statistical requirements (those defined by (b) above) is a key benefit of
stoichiometric
flattening. It is thus key to the practicality of single molecule detection
methods for
quantitation of selected TARGET peptide targets, and can reduce the total
number of molecules
that need to be counted in practical biomarker studies by large factors (e.g.,
16,000-fold
improvements in efficiency in measuring components of dried blood spots, as
described in
more detail below under "Stoichiometric Flattening").
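The cost of counting surplus molecules can be illustrated with a small calculation. The sketch below is illustrative only: the three-peptide panel, its relative abundances, and the minimum count of 400 are hypothetical, and the function `reads_needed` assumes that molecules are read in simple proportion to their abundance.

```python
def reads_needed(relative_abundance, min_count):
    """Total molecules that must be read so that the rarest peptide is
    expected to reach min_count, assuming reads are drawn in proportion
    to abundance (an assumption of this sketch)."""
    total = sum(relative_abundance.values())
    rarest = min(relative_abundance.values())
    return min_count * total / rarest

# Hypothetical three-peptide panel spanning a 10,000-fold abundance range.
unflattened = {"A": 10_000.0, "B": 100.0, "C": 1.0}
# The same panel after idealized flattening to equal abundance.
flattened = {"A": 1.0, "B": 1.0, "C": 1.0}

before = reads_needed(unflattened, min_count=400)
after = reads_needed(flattened, min_count=400)
print(before, after, before / after)
```

Because read time scales with the number of molecules read, the ratio printed last approximates the reduction in analysis time for this hypothetical panel.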
7.5 PREPARATION OF PEPTIDES FOR SINGLE MOLECULE SEQUENCE-SPECIFIC
DETECTION.
In some embodiments the detection of BINDER-enriched TARGET and STANDARD
peptides can be facilitated by certain chemical modifications, including
covalent linkage to
polymeric molecules on one or both ends (i.e., on or near peptide n- or c-
termini), or linkage
to a support, surface or bead, resulting in constructs with improved uptake
and sequence
readout by a sequence-sensitive detector, and/or incorporating additional
information beyond
the peptide itself in the form of detectable polymer sequences (e.g., DNA
sequence tags).
7.5.1 Chemical modification of peptides on BINDER
In some embodiments, peptides are chemically modified while bound non-
covalently
to a BINDER (e.g., a BINDER that is used to enrich them from the peptide
sample).
Modification of the peptides while thus non-covalently anchored to a BINDER
(which may
itself be bound to a solid support) facilitates exchange of reagents between
steps of a multi-
step series of chemical modifications, avoids the necessity for other more
cumbersome
purification methods between steps (e.g., to separate modified peptides from
reagents and
unmodified peptides), and allows the peptides to be concentrated when
necessary (e.g., by
gathering magnetic beads bearing the BINDER into a solid mass with minimal
included
liquid). These advantages are very useful when a single modification step is
required, and
progressively more valuable as more modification steps are needed.
In some embodiments a series of chemical steps are used to prepare peptides
for
analysis, and it may be necessary to remove the chemical reagents required for
one step, and
in some cases wash them away, before adding reagents for the next step. When
the off-rate of
the BINDER is low, with a half-off-time for example longer than 10-15 minutes
(as is typical
for antibody BINDERs developed for use in SISCAPA), one or more rapid chemical
reactions
can be carried out before peptide molecules dissociate from the BINDER. In
some
embodiments involving more time-consuming chemical modification processes
(e.g.,
requiring an incubation period of 1-60 minutes for a reaction to progress
towards completion),
BINDERs can be concentrated (e.g., by collecting magnetic beads into a mass or
small volume,
or using a column format bearing a high density of immobilized BINDERs) during
steps of the
process, and during these periods any peptide that dissociates from a BINDER
is likely to
quickly rebind to another BINDER site given the high local BINDER
concentration. This
kinetic effect effectively prolongs the time available for chemical
modification of BINDER-
bound peptides.
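As a rough guide to how much peptide remains on the BINDER during a modification step, the following sketch assumes simple first-order dissociation with no rebinding. The function name and the example times are illustrative assumptions; rebinding at high local BINDER concentration, as described above, would increase retention beyond what this estimate shows.

```python
def fraction_bound(elapsed_min, half_off_time_min):
    """Fraction of peptide still bound after elapsed_min minutes,
    assuming first-order dissociation and no rebinding."""
    return 0.5 ** (elapsed_min / half_off_time_min)

# Example: a BINDER with a 15-minute half-off-time.
for t in (1, 5, 15, 60):
    print(t, round(fraction_bound(t, half_off_time_min=15), 3))
```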
Certain limitations apply to this approach in some embodiments. For example,
if the
BINDER is an antibody, then solution conditions that would denature the
antibody (e.g., strong
detergents such as SDS at high concentrations, extremes of pH, or high
temperatures) or cause
the peptide to be released from the antibody (e.g., pH below 3.5 or presence
of 2M NH4SCN)
can be problematic while carrying out the desired peptide modification.
However, means are
known in the art for carrying out a wide range of desirable modifications to
TARGET and
STANDARD peptides under solution conditions compatible with retention of the
peptides on
a cognate BINDER.
In some embodiments the BINDERs themselves are modified prior to use in
capturing
TARGET and STANDARD peptides in order to prevent or diminish their reaction
with
reagents intended to react with the peptide cargo. Thus, in some embodiments,
some or all of
any free amino groups on an antibody BINDER can be blocked, for example by
PEGylation
using commercial NHS-PEG reagents. In some embodiments using DNA or RNA
aptamers
as BINDERs, modifications may be required to prevent hybridization between the
aptamers
and sequences being attached or ligated to peptides on a BINDER.
In some embodiments, the peptide modifications can be carried out while the
peptides
are bound to a support by a general but less-specific mechanism (e.g., to a
reversed-phase
support such as C18 particles), or free in solution.
7.5.2 Amino group linkage
Among the most useful reaction sites on a peptide is an amino group. For
peptides
generated by digestion with trypsin (the most commonly used proteolytic
enzyme), all
correctly processed peptides (except a protein's c-terminal peptide) should
have a c-terminal
lysine or arginine residue. Since lysine is the only amino acid with a side-
chain primary amino
group (reaction with which would allow two attachment sites on a peptide),
some embodiments
make use of TARGET peptides from the group of tryptic peptides with c-terminal
lysine when
preparing a construct with polymer additions at both ends (a "Double amino"
peptide), or
select TARGET peptides from the group of tryptic peptides with c-terminal
arginine (which
peptides will have a single n-terminal amino group available to react with a
linker; i.e., a
"Single amino" peptide) when an extension on only one end (the n-terminus) of
the peptide is
desired.
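The selection rule described above can be sketched as a simple classifier. This is a minimal illustration, not part of the disclosed method: the function name and the example sequences are hypothetical, and peptides containing an internal lysine (e.g., from a missed cleavage) are set aside as "other" because they carry an additional amino group.

```python
def amino_class(peptide):
    """Classify a tryptic peptide by its available amino groups."""
    seq = peptide.upper()
    if seq[:-1].count("K") > 0:
        return "other"          # internal lysine adds a further amino group
    if seq.endswith("K"):
        return "Double amino"   # n-terminal amine plus c-terminal lysine amine
    if seq.endswith("R"):
        return "Single amino"   # n-terminal amine only
    return "other"              # e.g., a protein's c-terminal peptide

# Hypothetical sequences for illustration.
for p in ("AEFVLSK", "GLTVAR", "AKGLSR", "GLSVA"):
    print(p, amino_class(p))
```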
In some embodiments, an advantage of linkage through amino groups,
particularly in
the case of c-terminal lysine peptides modified by a linkage chemistry that
results in a decrease
in the peptide's net positive charge, is that the peptide subsequently has
little or no positive
charge (e.g., if it contains no His residues), and at least one (the c-terminal
carboxyl) and perhaps
more negative charges (if the peptide contains Asp or Glu amino acids).
Peptides with a net
negative charge have the same charge polarity as nucleic acids (negative, on
account of the
phosphate groups), facilitating the movement of both types of polymers through
a pore using
the same polarity electric field.
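The charge argument above can be checked with a crude integer estimate. The sketch below makes several labeled assumptions: both the n-terminal and lysine amino groups have been converted to uncharged linkages, the solution is near neutral pH, Arg counts as +1, His counts as 0, and Asp, Glu and the c-terminal carboxyl each count as -1. The function name and sequences are hypothetical.

```python
def net_charge_after_amine_capping(peptide):
    """Crude integer net charge of a peptide whose n-terminal and lysine
    amino groups have been converted to uncharged linkages.
    Assumptions: Arg = +1, His = 0, Asp/Glu = -1, c-terminal carboxyl = -1."""
    seq = peptide.upper()
    positive = seq.count("R")
    negative = seq.count("D") + seq.count("E") + 1  # +1 for the c-terminal carboxyl
    return positive - negative

# Hypothetical c-terminal lysine peptides.
print(net_charge_after_amine_capping("GLSVAK"))   # only the c-terminal carboxyl
print(net_charge_after_amine_capping("AEFDLSK"))  # Glu, Asp and c-terminal carboxyl
```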
7.5.3 Carboxyl group linkage
In some embodiments linkage through peptide carboxyl groups can be used (61)
but
this approach has limited ability to distinguish between c-terminal carboxyls
and side chain
carboxyls of aspartic and glutamic acids, and thus could present additional
constraints on
peptide selection (i.e., de-selection of Asp and Glu containing TARGET
peptides) or give rise
to multiple constructs due to side reactions. In some embodiments, TARGET
peptides devoid
of aspartic and glutamic residues, and hence having a unique carboxyl group at
the peptide c-
terminus, are used with carboxyl coupling chemistries well-known in the art to
link peptides
through the c-terminus.
7.5.4 Other linkage sites
Other linkage sites and reaction chemistries can be employed. Linkage through
a
cysteine sulfhydryl group is frequently favored when a peptide's sequence can
be freely
designed; however, the occurrence of a cysteine residue at the n-terminus or c-terminus of a
proteolytic peptide in a natural protein is infrequent, thus representing a
limiting constraint on
TARGET peptide selection.
Chemistries are known in the art for specific chemical modification of, and/or
linkage
to, histidine, tyrosine, tryptophan and other amino acids, and these can also
be used in the
invention.
7.5.5 Peptide:Oligo in-line constructs using sequential amino linkages
Figure 14 illustrates an embodiment for modification of enriched TARGET and
STANDARD molecules (either of these being labeled a "Peptide" in the Figure)
by linkage of
an n-terminal amino group to a single-stranded oligonucleotide Leader (labeled
Oligo 1) and
an epsilon-amino group of a c-terminal lysine to a single-stranded
oligonucleotide Trailer
(labeled Oligo 2). Such a hybrid molecule can be considered an "in-line"
peptide-
oligonucleotide construct; i.e., one in which the construct forms a continuous
backbone with
"side-groups" consisting primarily of bases and amino acid side chains. In the
embodiment
shown in Figure 14, the linkages are carried out in a series of steps using
"click" chemistry, a
bio-orthogonal reaction chemistry (25, 62, 63). First, (Step 1 in Figure 14) a
peptide's n-
terminal amino group is reacted with azidoacetic anhydride (AAA) under
conditions of pH
(e.g., pH 5.5 to 7.0, preferably 6.7) that effectively prevent or reduce
reaction with a lysine's
epsilon-amino group (25, 64), resulting in the introduction of an azide
functionality at the
peptide's n-terminus (shown in Step 2 of Figure 14). After removal of the AAA
(e.g., by
washing the beads carrying the BINDER and their peptide cargo), this azide
group is then
allowed to react with an amount of oligonucleotide Oligo 1 (Step 3 in Figure
14), which has
been prepared with an aza-dibenzocyclooctyne (ADIBO) functionality on the 3'
end, forming
a "click" chemistry linkage of the Oligo 1 to the peptide n-terminus (25)
(Step 4 in Figure 14).
In a next step, after removal of unreacted Oligo 1, the peptide's c-terminal
lysine epsilon-amino
group is allowed to react with an azidoacetic acid NHS ester (Step 5 in Figure
14), thereby
introducing an azide functionality on the peptide's C-terminal lysine side
chain (Step 6 in
Figure 14). After removal of the unreacted NHS ester (e.g., by washing the
beads carrying the
BINDER and their peptide cargo), this azide group is then allowed to react
with an amount of
oligonucleotide Oligo 2 (Step 7 in Figure 14), which has been prepared with an
aza-
dibenzocyclooctyne (ADIBO) functionality on its 5' end, forming a "click"
chemistry linkage
of the Oligo 2 to the peptide's C-terminal amino acid (Step 8 in Figure 14).
The result is a
peptide-oligonucleotide construct comprising a peptide linked to an
oligonucleotide on each
end. Carrying out the additions step-wise as shown, using the two Oligos, each
prepared with
a "click" linkage site on one end, allows the construct to be generated with
specified oligo
polarities (i.e., 5' to 3' or vice versa) in each case, as may be necessary to
achieve the desired
interaction with DNA motors, pores, etc. The Oligos can each be single-
stranded, or they can
be rendered duplexes by hybridization with complementary sequences over part
or all of their
length(s).
In some embodiments the locations of the "click" reactive pairs can be
inverted (e.g.,
modifying one or both of the peptide ends with ADIBO, and the Oligos with
azide). Linkage
at the peptide c-terminus can alternatively be introduced through modification
of the peptide
c-terminal carboxyl group (e.g., of a c-terminal arginine, in which case the n-terminal amino
group is the only amino group in the peptide) instead of a lysine epsilon-amino group.
In some embodiments some or all of the steps of the double ligation scheme
shown in
Figure 15 are carried out with peptides bound to a BINDER on magnetic beads.
Using peptides
longer than the groove of a typical BINDER binding site (typically 4-8 amino
acids), and
a BINDER that binds an epitope that does not include either peptide terminus,
leaves both
peptide termini available for reaction as shown. Magnetic beads carrying the
BINDER and
bound peptides can be exposed stepwise to the sequence of reagents as shown by
moving the
beads from one reagent solution to the next, with optional wash steps in
between as required,
or alternatively the beads can be gathered to the side of a vessel using a
magnet and the
sequence of reagent solutions added and removed, or manipulated in a
microfluidic device.
Many alternative chemical reaction schemes are known in the art that can be
used in
place of the specific reagents and steps shown in Figure 14, while generating
a construct with
similarly useful features. A variety of reagents exist that are capable of
introducing "click"
chemistry functional groups into peptides and oligonucleotides (17). Examples
include azide
and tetrazine functionalities that are capable of specific, bio-orthogonal
reactions with alkyne
functionalities, some requiring Cu(I) catalysis (which is less preferred in
some embodiments),
or strain-promoted alkyne cycloaddition (SPAAC) reactions with bicyclononyne
(BCN) or
cyclooctyne and derivatives such as dibenzocyclooctyne (DIBO) or Aza-
dibenzocyclooctyne
(ADIBO shown in Figure 14).
In some embodiments one or both of the oligos linked to the peptide is part of
a
multimolecular construct designed to facilitate nanopore sequencing. Figure 15
illustrates a
modification of the embodiment of Figure 14 in which the oligo joined to the N-
terminal amino
group of a peptide (Oligo 1) is part of a duplex having a site at which a
"motor protein" can be
(or in some embodiments already is) bound. The overall construct shown in
Figure 15 provides
a leading oligo motor construct capable of controlling movement of the peptide
through the
pore (providing a stepwise ratchet motion as used, for example, in the
commercially-available
Oxford Nanopore sequencing platform "Y-adapter";
https://nanoporetech.com/sites/default/files/s3/literature/product-brochure.pdf), followed by a
peptide to be sequenced, and a trailing oligo 2. A tether molecule that
hybridizes to some
sequence in the construct may be used to associate the construct with the
membrane in which
the nanopore is located, thereby increasing the Leader's probability of
entering the nanopore.
In some embodiments, alternative chemistries can be used to join the peptide
and
oligos, such as use of an NHS-functionalized oligo that can react directly
with a lysine or n-
terminal amino group.
Similar peptide-oligonucleotide constructs have been described, for example in
WO
2021/111125, using different "click" reagent combinations and using
alternative means of
purifying desired products (e.g., purifying via the oligo part of the
construct using Agencourt
AMPure XP beads). The referenced disclosure showed that oligo-peptide-oligo
constructs
assembled by click chemistry can be processed by existing nanopores and DNA
motors to
generate reproducible ion current traces indicative of peptide sequence.
However, these
disclosed inventions do not include the use of BINDERs to carry and/or move
TARGET
peptides during one or more steps of oligo-peptide construct assembly, washing
and
purification, and do not include STANDARDs or use of BINDERs for enrichment or

stoichiometric flattening.
7.5.6 2-step digestion to distinguish amino groups
As described above and illustrated in Figure 6, a sequential enzymatic
approach is used
in some embodiments to derivatize peptides with two different added molecules,
resulting in
an ordered construct. Using this approach, a peptide can be linked to a
specific oligo on the n-terminus and a different oligo on the c-terminal lysine residue. This approach
is not completely
general since it requires a specific order relationship between an arginine
immediately
preceding the peptide sequence and the peptide's c-terminal lysine. Such
peptides are common
but not universal in proteins, and when added to a requirement such as
uniqueness in the human
proteome, it is probable that only a subset of proteins can be represented by
proteotypic
peptides having such a structure.
7.5.7 Delivery of peptides to a site of detection.
The potential of single molecule detection to improve assay sensitivity is
enormous
provided that the desired peptide molecules can be efficiently presented to
the sequencing
machinery, i.e., to occupy a large fraction of the available peptide
sequencing capacity. As is
well known in peptide analysis by mass spectrometry, peptides at very low
concentration (i.e.,
at low single molecule levels) easily become "lost" through escape into dilute
solutions,
binding to surfaces, etc., and therefore detection can become more challenging
as target
abundance or concentration is reduced.
Some embodiments therefore make use of the localization of TARGET and
STANDARD peptide constructs on BINDERs, which themselves can be bound to
particles
such as magnetic beads, to transport the peptides (e.g., as peptides, or as
part of peptide-oligo
constructs as disclosed above) into close proximity to the site of detection
(e.g., a nanopore, a
surface on which they may be imaged, or a well in which they may be subjected
to degradative
sequencing). Recent progress in this area is disclosed in patent applications
US 2020/0284783
and US 2021/0147904.
Localization of TARGET and STANDARD peptide constructs on BINDERs produces
a remarkable concentration effect. In some embodiments the BINDERs are
attached to
spherical magnetic particles, which can be gathered together into a compact
mass by magnetic
forces. In such a mass of spherical particles, the particles occupy about 74%
of the total
volume. Thus elution of constructs from binders on beads in such a compact
mass releases the
constructs into about 26% of the volume of the mass, and given the sub-
microliter volume of
this mass in many practical embodiments, the constructs can be present in
volumes of
interstitial liquid in the range of tens to hundreds of nanoliters.
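The interstitial volume follows directly from the packing fraction. The sketch below uses the 74% figure given above as a default; the function name and the 0.5 microliter example are illustrative assumptions, and a different packing fraction can be passed if the bead mass is packed less densely.

```python
def interstitial_volume_nl(bead_mass_volume_ul, packing_fraction=0.74):
    """Liquid volume (in nanoliters) between beads in a compact bead mass
    of the given total volume (in microliters)."""
    return bead_mass_volume_ul * (1.0 - packing_fraction) * 1000.0

# Example: a hypothetical 0.5 microliter bead mass.
print(round(interstitial_volume_nl(0.5), 1))
```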
In some embodiments, the immobilization of TARGET peptides, STANDARDs and
their derivatives and constructs by non-covalent binding to BINDERs that are
immobilized on
supports such as magnetic beads, provides improved methods for delivery of
these molecules
to sequencing machinery. In some embodiments, magnetic beads carrying BINDERs
and their
peptide construct cargoes (including peptide-oligonucleotide constructs) are
added directly to
the cis chamber of a nanopore sequencing device, where the beads sink under
the influence of
gravity and come to lie on the membrane in which the nanopore is located. As
described in
US 2021/0147904, this presentation of sequenceable polymers adjacent to the
membrane
improves capture by the pore by orders of magnitude. In some embodiments, the
salt solution
of the cis compartment acts to slowly release the peptides from the BINDERs,
for example
with an off-rate equivalent to elution over a period of 15 to 180 minutes.
Eluted peptide
constructs are then captured by the membrane through a hydrophobic (e.g.,
cholesterol) tether
as described in the commercial Oxford Nanopore device. Upon capture by the
membrane, the
peptide constructs diffuse in 2-dimensions on the membrane and are efficiently
presented to
the pore for capture and threading into the pore for sequencing.
In some embodiments, the BINDERs are attached to magnetic beads (or other
particles)
by a cleavable link such as a disulfide-containing linker. Exposure to a
disulfide reducer (e.g.,
TCEP or mercaptoethanol) can thus release the BINDERs and their cargo from the
beads into
solution. By further providing the BINDER with a hydrophobic tether on a
linker, the
BINDER can be captured by the membrane and then diffuse in 2-dimensions,
eventually
bringing bound peptide constructs into close proximity to the nanopore.
In some embodiments, the BINDERs are released free into solution and the
peptide
constructs are not eluted from the BINDERs under the conditions of the cis
chamber (e.g.,
0.4 M KCl), so that peptide constructs, still bound by BINDERs, are captured by
and threaded
into the nanopore. In some embodiments the force with which the electric field
acts on the
construct to pull it through the nanopore causes the construct to be pulled
free of the BINDER
while the speed of this motion is regulated by a DNA motor acting on the
construct's
oligonucleotide component. Use of the pore forces to strip peptide constructs
off of the cognate
BINDER allows the BINDER to be used as a "chaperone" for the peptide construct
throughout
the journey from the peptide's capture from a sample digest, through its
modification to create
a sequenceable construct, all the way to threading and delivery into a
nanopore for sequencing.
In some embodiments, tethered versions of the BINDERs are added into the cis
chamber and allowed to contact and be retained by the membrane, forming a
dense surface of
BINDER binding sites on the membrane within which the nanopore lies. This
surface,
comprising a dense plane of binding sites for TARGET and STANDARD peptide
molecules,
is able to capture these molecules from the contents of the cis compartment
and thus provide
an increased local concentration of constructs in the plane of the pore, and
with the ability to
diffuse in the membrane plane so as to deliver constructs for threading into
the pore.
In some embodiments, peptide-oligonucleotide constructs can be propelled back
and
forth through a nanopore by reversal of the transmembrane electric potential (a
process
described as "flossing") to repeatedly read and re-read a sequence to provide
greater accuracy
(65). In some embodiments of the present invention, this flossing approach is
used to read
selected peptides multiple times under computer control. This approach is
particularly useful
in confirming the sequences of peptides present at low abundance; i.e., when
few copies of the
peptide have been encountered and the potential for quantitative error is
large. Thus, if a low
frequency peptide is detected on its first pass through a nanopore, the
nanopore control system
can act on the observation of low frequency in real-time and implement a multi-
read flossing
protocol to verify the identity as a rare sequence. Achieving certain
identification of a low
frequency peptide sequence is more consequential than for high frequency
peptides.
7.5.8 Inclusion of separative steps in addition to specific affinity capture
and enrichment in
a workflow.
In some embodiments additional separative steps are added to the workflow to
improve
performance. In some embodiments proteolytic peptides in a sample digest are
captured on a
solid support (e.g., C18-coated magnetic beads), thus allowing non-peptide
sample
components to be removed, allowing the immobilized peptides to be reacted with
chemical
reagents (e.g., for click chemistry derivatization as described below), and
excess reagents to
be removed from the peptides prior to their elution (e.g., using 50%
acetonitrile) for use in
subsequent steps. Likewise a conventional magnetic bead-based DNA "cleanup"
(e.g., using
commercially-available Ampure beads (Beckman Coulter)) can be carried out
after assembly
of oligo:peptide constructs generated according to the invention in order to
remove any excess
reagents, short oligos, un-linked peptides, and/or enzymes prior to delivery
of peptide
constructs to a single molecule sequencer (e.g., a nanopore). Modern
laboratory robotics
provides means to automate workflows involving multiple such affinity
separative steps as
well as precise liquid additions.
7.6 USE OF THE INVENTION WITH SINGLE MOLECULE DETECTION BY
NANOPORES.
In some embodiments a TARGET or STANDARD peptide is linked to a polymer at
one or more sites (e.g., at one or both ends) by stable linkages (e.g., by
covalent bonds or very
stable non-covalent bonds). In some embodiments these linkages are made
between specific
sites at, or near, the peptide's terminus (or termini) and one or more polymer
molecules (e.g.,
nucleic acids including DNA or RNA, chemical variants of these including
phosphorothioate
backbones, "locked" nucleic acids ("LNA"), peptides including polyglutamic or
polyaspartic
acids, and the like). In many preferred embodiments, an object of the
invention is to cause a
peptide to pass through a nanopore in an extended conformation, allowing the
sequence of
amino acids to be "read" by measurement of current flowing through the pore or
other
equivalent means. In other embodiments, peptides of interest may be
immobilized and
subjected to a series of binding interactions or covalent modifications (e.g.,
stepwise
degradation), and in such embodiments the linkage of peptides to a surface by
a single unique
site (e.g., a unique amino group such as an n-terminal amino group) may be
preferred.
7.6.1 Peptide selection for pore signature and ligation properties.
Some embodiments make use of monitor peptides selected from among a target
protein's proteolytic fragments based on features including A) their ability
to generate a
distinct sequence or fingerprint in a single-molecule sequence-sensitive
detector (e.g., a
"squiggle" or ion current signature over time in a nanopore sequencing system)
compared to
other peptides (particularly other peptides that may be used with the selected
peptide in a
multiplex panel assay); B) their content of reactive groups (e.g., terminal (alpha) and
lysine side-chain (epsilon) amines, terminal and side-chain (aspartic and glutamic acid)
carboxyl groups, cysteine sulfhydryls, etc.) of potential use in labeling or
ligation reactions; C)
their uniqueness to a target protein (i.e., they are typically required to be
"proteotypic" so as to
act as a surrogate of the pre-specified target protein exclusively); and other
properties desirable
in a peptide analyte (e.g., solubility, chemical stability, etc.). Proteolysis
using trypsin (i.e.,
cleaving polypeptides at lysine and arginine) is common in peptide analysis
for several reasons
including low cost of the enzyme and its ability to generate peptides having a
positive charge
at the c-terminus (a useful feature in mass spectrometric analysis). In the
context of the present
invention, selection from a tryptic digest of TARGET peptides with a c-
terminal lysine ensures
the presence of amino groups at both peptide termini, while selection of c-
terminal arginine
peptides ensures the presence of only one amino group (at the n-terminus). In
the present
invention alternative enzymes can be used as well. The enzyme LysC cuts only
at lysines (not
arginine) and therefore on average generates longer peptides than trypsin, and
each of these
(apart from a protein's c-terminal peptide) will have a lysine (and its
epsilon-amino group) at
the c-terminus (features that can be exploited for linking purposes in the
invention). TARGET
peptide selection criteria in the present invention are therefore
significantly different from
selection criteria for mass spectrometry.
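The difference between trypsin and LysC cleavage can be sketched with a simplified in silico digest. The function below is an illustration under stated assumptions: it cleaves after every K or R (trypsin) or every K (LysC), ignores missed cleavages and the reduced cleavage before proline, and uses a made-up protein sequence.

```python
import re

def digest(protein, enzyme="trypsin"):
    """Split a protein sequence after each cleavage residue.
    Simplified: no missed cleavages, no proline rule."""
    sites = {"trypsin": "KR", "lysc": "K"}[enzyme]
    # Zero-width split immediately after each cleavage residue.
    return [p for p in re.split(f"(?<=[{sites}])", protein.upper()) if p]

protein = "MAGKTLRSEKVQ"  # hypothetical sequence
print(digest(protein, "trypsin"))
print(digest(protein, "lysc"))
```

In this example every LysC peptide except the protein's c-terminal peptide ends in lysine, whereas the tryptic digest also yields an arginine-terminated peptide.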
7.6.2 Double ligation strategy to attach Leader and Trailer.
Some embodiments apply a strategy to connect polymeric molecules (e.g.,
nucleic
acids, polypeptides, and other polymers) to either or both ends of a peptide
to form a
heteropolymer construct, including ligation of a polymer to the n-terminal
amino group and
another (or another molecule of the same) polymer to a site near the c-
terminus (e.g., through
the c-terminal carboxyl group or to the epsilon-amino group of a c-terminal
lysine residue).
One such ligated polymer (the Leader, which may be highly charged to
facilitate movement
by an electrophoretic process) can be used to initiate threading of the
construct into a
sequencing nanopore and another (the Trailer, at the opposite end of the
peptide) can be used
to assist in ratcheting the peptide through the ion current sensing region of
a nanopore. While
constructs joining a peptide with an oligonucleotide on one end are known in
the art, the
attachment of polymers to both ends is novel.
7.6.3 Concatenation of peptide analytes and other polymers in an alternating
pattern to
provide multi-peptide strings for analysis.
Some embodiments prepare macromolecules composed of multiple TARGET and/or STANDARD
peptides joined together with intervening polymers (e.g., DNA) that can
participate in
molecular ratchet mechanisms to control (e.g., slow down) movement of the
concatenated
construct through a nanopore and improve accuracy during the readout of
sequence
information from peptides. Intervening polymer segments (e.g., DNA) can
comprise
sequences that identify samples or other data associated with the peptides.
Concatamers can
be assembled using means including "click" chemistry or DNA or peptide
ligases, and can be
comprised of any mixture of peptides, oligonucleotides, or other polymers.
Since
concatenated peptides (with or without intervening linkers) can follow one
another rapidly
through a nanopore, without the need to wait for each new peptide construct
to anchor to a
membrane or thread through a pore, the rate at which peptide molecules can be
identified and
counted can be much higher than with individual single peptide constructs,
thereby increasing
the rate at which precise quantitative results can be obtained.
7.6.4 Simplified sequencing of peptide using a "Rope-tow" construct.
Some embodiments make use of a continuous polymer backbone to which a peptide
can be linked through a single site and "dragged" through a nanopore by
movement of the
backbone, which may include single or double-stranded regions that interact
with DNA motors
controlling this movement in a ratchet-like manner.
7.6.5 Adaptable precision by continued sequencing.
The invention makes use of real-time peptide sequence evaluation and counting
(e.g.,
as provided in commercial DNA nanopore sequencers) to update counts of
individual
TARGET peptides and their cognate STANDARD peptides, and thereby estimate the
precision
with which each has been measured up to this point based on counting
statistics and other
statistical methods. The availability of updated precision estimates allows
the analytical device
(e.g., a sequence-sensitive single molecule detector such as a nanopore
sequencer) to terminate
an analytical run when pre-defined precision criteria are met (e.g., when the
relative standard deviation of one or
more specific peptide counts, estimated for example as the square root of the
number of
peptides counted divided by the number of counts, is less than a target such
as 5%). This
approach allows the analytical system to avoid wasting time counting peptides
beyond the
number providing the required analytical precision by terminating a run, or to
continue
counting beyond the expected run duration when more counts are needed to
achieve a precision
target for one or more peptides. Precision targets can be different for
different TARGET
peptides. In a typical case, a minimum number of counts can be specified for a
given TARGET
peptide + STANDARD pair, such that the lower abundance of the pair must
achieve this
minimum number to satisfy a precision target for the ratio between the two
(which represents
the assay's quantitative result).
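As a non-limiting illustration of the stopping rule described above, the following Python sketch assumes simple Poisson counting statistics, so that the relative precision of a count N is approximately sqrt(N)/N. The function names (relative_precision, min_counts_for, pair_meets_target, run_can_stop) and the data layout are hypothetical and do not represent the software of any commercial sequencer.

```python
import math


def relative_precision(count: int) -> float:
    """Poisson estimate of relative precision: sqrt(N) / N."""
    if count <= 0:
        return math.inf
    return math.sqrt(count) / count


def min_counts_for(target_cv: float) -> int:
    """Smallest count N whose estimated relative precision is <= target_cv."""
    n = max(1, math.floor(1.0 / target_cv ** 2) - 1)
    while relative_precision(n) > target_cv:
        n += 1
    return n


def pair_meets_target(target_count: int, standard_count: int,
                      target_cv: float) -> bool:
    """The lower-abundance member of a TARGET/STANDARD pair sets the precision."""
    return relative_precision(min(target_count, standard_count)) <= target_cv


def run_can_stop(counts: dict, targets: dict) -> bool:
    """counts: pair name -> (TARGET count, STANDARD count).
    targets: pair name -> precision target (may differ per peptide)."""
    return all(
        pair_meets_target(*counts.get(name, (0, 0)), cv)
        for name, cv in targets.items()
    )
```

Under these assumptions, a 5% target corresponds to a minimum of 400 counts for the lower-abundance member of a pair, and a run continues until every listed pair satisfies its own target.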
7.6.6 Adaptable focus on some peptides by rejection of over-abundant leaders.
The invention also makes use of real-time peptide sequence evaluation and
counting
(e.g., as provided in commercial DNA nanopore sequencers) to stop sequencing
of peptides
whose precision targets have already been met and eject these molecules from a
sequencing
pore in order to allow entry of a different molecule that may contain peptides
whose precision
targets remain unmet. The analytical system is thereby able to focus its
attention on peptides
in need of more counts (e.g., low abundance peptides) at the expense of
peptides whose count
targets have already been met (e.g., higher abundance peptides). This increase
in efficiency
increases throughput and lowers cost of peptide counting.
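The selective rejection described above can be sketched, purely for illustration, as a decision made once a molecule has been identified early in its transit. The sketch below assumes each molecule carries a single identifiable peptide and that a per-peptide count target is known in advance; the function name triage_reads and its arguments are hypothetical.

```python
from collections import Counter


def triage_reads(early_ids, min_counts):
    """Count molecules still needed; eject those whose target is already met.

    early_ids: iterable of peptide identifiers, in the order molecules enter a pore.
    min_counts: peptide identifier -> number of counts required.
    Returns (counts, ejected), where ejected is the number of rejected molecules.
    """
    counts = Counter()
    ejected = 0
    for peptide_id in early_ids:
        target = min_counts.get(peptide_id)
        # Assumption: peptides with no listed target are not of interest.
        if target is None or counts[peptide_id] >= target:
            ejected += 1  # free the pore for a different molecule
            continue
        counts[peptide_id] += 1  # sequence to completion and count
    return counts, ejected
```

In this simplified model, an abundant peptide stops consuming pore time as soon as its target is reached, while a low-abundance peptide continues to accumulate counts.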
7.6.7 Peptide:Oligo loop insertion constructs using amino linkages
In some embodiments a peptide is inserted into a polymer chain by reacting two
chemical groups at or near the peptide termini with adjacent or nearby
residues of a polymer
such as a nucleic acid, followed by cleavage of the polymer between these
residues. In Figure
16, a peptide 71 having reactive groups (here amino groups in a double amino
peptide) on both
ends is activated by addition of groups 72 (for example BCN click groups,
indicated in the
Figure as X, that can be added to the peptide's amino groups through reaction
with an NHS
derivative of BCN) in Figure 16B. The resulting activated peptide is reacted
with oligo 62
(Figure 16C) having complementary click groups 74 (e.g., azide, indicated in
the Figure as Y)
attached to adjacent bases (here shown as a dinucleotide TT). In this example
T was selected
as the attachment base because of the commercial availability of synthetic
oligonucleotides
having an azide attached to one or more T residues (e.g., from Integrated DNA
Technologies,
Inc.); however alternative attachment means involving other bases or non-base
components
capable of being incorporated into synthetic oligos can be used. The two
attachment sites can
be adjacent bases as shown in Figure 16, or they can be separated by one or a
few intervening
bases (e.g., as shown in Figure 17). The result upon reaction between X and Y
groups to form
covalent linkages 75 is a peptide loop conjugate shown in Figure 16D.
7.6.7.1 Enzymatic cleavage for loop release
In some embodiments, the linkage sites on the oligo 62 are designed such that
the
surrounding sequence (when hybridized to the complementary strand 65) is
recognized by a
restriction endonuclease (e.g., PacI) capable of cutting the oligonucleotide
backbone between
the two bases to which the peptide is linked (the enzyme PacI cuts between
the second pair of
T residues 76 in the sequence TTAATTAA, as shown in Figure 16E). Following the
cutting
of both strands by the nuclease, the result is a nanopore-sequenceable
construct 77 comprising
the peptide with leading and trailing oligo segments, either or both of which
can comprise oligo
sequence tags identifying the peptide as either a STANDARD or TARGET. In this
simple
version of the "loop ligation" method, which uses identical coupling groups on
both ends of a
peptide for simplicity of preparation, peptides may be inserted in both
orientations, i.e., n-term
first (shown as 77 in Figure 16F), or c-term first (shown as 78 in Figure
16G). Suitable data
analysis algorithms are constructed to recognize a specific peptide's
squiggles in either
orientation. Reaction of the activated peptide with the oligo
bearing linkage sites can be
carried out before or after the oligo is hybridized with the complementary
strand to allow site-
specific cleavage between the adjacent peptide-linked bases. Linkage of the
peptide to oligo
can be carried out after peptides are enriched by BINDERs (provided that
STANDARD
peptides are identified by a structural difference from cognate TARGET
peptides, which
difference can be sensed by a nanopore), or before BINDER enrichment (in which
case the
oligo sequences 62 or 63 can be used as tags indicating whether a peptide is a
STANDARD or
a TARGET, and the BINDER is used to enrich assembled peptide-oligo
constructs). The loop
method can provide a simpler method of inserting a peptide into an oligo
sequence than other
methods requiring stepwise linkage of linear peptide and oligo molecules.
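As an illustrative aid to designing the linkage-bearing oligo, the following Python sketch locates a PacI recognition sequence (TTAATTAA, cut between the fifth and sixth bases on each strand) and reports the two bases flanking the cut, which in the loop method are the peptide-linked T residues. The function names and the example sequence are hypothetical; real designs would also need to account for the modified bases and enzyme tolerance of the attached peptide, which this sketch does not model.

```python
PACI_SITE = "TTAATTAA"
PACI_CUT_OFFSET = 5  # cut falls after the fifth base: TTAAT^TAA


def find_paci_cuts(seq: str) -> list:
    """Return 0-based cut positions (index of the first base after each cut)."""
    seq = seq.upper()
    cuts = []
    start = seq.find(PACI_SITE)
    while start != -1:
        cuts.append(start + PACI_CUT_OFFSET)
        start = seq.find(PACI_SITE, start + 1)
    return cuts


def linearized_segments(seq: str) -> tuple:
    """Split an oligo at its single PacI cut into 5' and 3' segments."""
    cuts = find_paci_cuts(seq)
    if len(cuts) != 1:
        raise ValueError("expected exactly one PacI site in the oligo")
    cut = cuts[0]
    return seq[:cut], seq[cut:]


if __name__ == "__main__":
    five_prime, three_prime = linearized_segments("ACGTTTAATTAACGGA")
    # The last base of the 5' segment and the first base of the 3' segment
    # are the two T residues to which the peptide ends would be linked.
    print(five_prime, three_prime)
```

Because the recognition sequence is palindromic, the same check applies to the complementary strand.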
7.6.7.2 Chemical or photolytic cleavage for loop release
In some embodiments, linearization of a peptide loop construct is accomplished
using
chemical rather than enzymatic means. Figure 17 shows an example similar to
that of Figure
16, with the difference that the nucleic acid bases providing oligo attachment
sites 74 (here
shown as T residues) are separated by an intervening chemical structure (shown
here as residue
labeled "Z"). In some embodiments, this intervening structure is a base linked
to either or both
adjacent T's by a phosphorothioate linkage. After linkage of the peptide to
the oligo (e.g., by
click chemistry as described above), the backbone of oligo 62 can be cleaved
at the Z position
using chemical means (e.g., using iodine, aqueous silver nitrate or mercuric
chloride (66) or
chloride assisted by myeloperoxidase (67)). In some embodiments the
intervening structure
comprises a photocleavable spacer or linker (e.g., the linker designated
/iSpPC/ available from
Integrated DNA Technologies, or linker 26-6888 available from GeneLink) that
can be cleaved
by exposure to near UV light (e.g., 300-350nm wavelength). Photochemical
cleavage provides
an extremely efficient way to effect the linearization of a peptide:oligo loop
construct. Those
skilled in the art will understand that a variety of specific chemical and
enzymatic cleavage
mechanisms can be used to selectively cleave an oligo VEHICLE after insertion
of a peptide
loop to yield a linear oligo:peptide:oligo construct. Those skilled in the art
will understand that
the spacing between two peptide attachment sites 74 can be as short as no
intervening bases,
or as long as 2, 3, 4, 5, 6 or up to 10, 20 or 30 bases. An advantage of a
short distance between
sites 74 is the increased rate of formation of the second linkage after the
first has formed:
formation of the first linkage transforms the reaction from a bi-molecular
(diffusion-limited)
reaction between the peptide and oligo into a uni-molecular reaction that is
likely to be more
rapid, thereby increasing the probability that both nearby linkage sites are
connected to
opposite ends of the same peptide molecule (rather than two separate peptide
molecules).
7.6.7.3 Loop assembly without a disruptible linkage
In some embodiments, a peptide loop construct is assembled by reacting a
peptide
having reactive groups on both ends (e.g., Figure 16B) with a double-stranded
oligo similar to
that shown in Figure 16C except that the adjacent modified T bases comprising
reactive groups
74 (indicated as "Y" groups) are not linked by a sugar-phosphate bond (i.e., the
DNA backbone
is interrupted between them) causing oligos 62 and 63 to be separate
molecules. Oligos 62 and
63 are held in place by their hybridization with the complementary continuous
oligo 64 + 65,
thereby holding the reactive sites 74 in close proximity, where they react
preferentially with
the activated groups 72 on either end of a peptide molecule, creating a
continuous oligo-
peptide-oligo construct amenable to single molecule detection using nanopores
or other
methods. A variety of alternative chemical linkages can be used to connect the
peptide and
oligos, including click chemical linkages as described elsewhere herein.
In some embodiments, for example those in which a 2-step digestion enables
placement
of 2 different, non-cross-reacting chemical groups at or near the ends of a
peptide (e.g.,
members of 2 families of click groups as described elsewhere herein), the
appropriate
corresponding reactive groups can be placed on oligos 62 and 63 to enable
peptide insertion
with defined polarity (e.g., peptide inserted n-term to c-term in a 5' to 3'
oligo VEHICLE).
For example, the peptide n-term amino and c-term lysine epsilon amino groups
can be labeled
with BCN and TCO respectively, and the 5' and 3' T residues of the adjacent
pair labeled
respectively with azide and tetrazine. In this case, the peptide n-term reacts
with the 5' T (BCN
+ azide) and the peptide c-term reacts with the 3'T (TCO + tetrazine) creating
a construct with
all oligo and peptide components in a unique defined order.
7.6.7.4 Oligonucleotide tags to identify STANDARDs
In some embodiments oligonucleotide sequences into which a peptide is inserted
comprise encoded information ("tags") identifying a peptide as a TARGET or a
STANDARD
(the STANDARD constructs being assembled in a separate process from the sample-
derived
TARGET constructs). In the embodiments of Figures 16 and 17, unique 16-base
tags are
provided, which can be incorporated either 5' to the peptide attachment site
(62) or 3' to it
(63), or both (as shown). The tags must be long enough to provide reliable
identification when
read by a nanopore or other single molecule reader; the examples of Figures
16, 17 show 16-
base tags on both 5' and 3' ends, while the examples of Figures 18, 19 and 20
show the use of
8-base tags on the 3' end to distinguish TARGET and STANDARD peptides (Figure
18). If
the accuracy of sequence recognition is very high, a short tag sequence of a
few bases can
suffice; however, for increased accuracy and reliability, a tag of at least 4,
or 5, 6, 7, 8, or more
bases can be used. In cases where the construct is sequenced by passage
through a nanopore
from 5' to 3', it can be advantageous to place an oligonucleotide sequence
identifying a peptide
as TARGET or STANDARD in the oligo on the 3' side of the peptide (i.e., the
portion of the
oligo that follows the peptide through the nanopore) since the oligo component
on the 5' side
may not be read effectively when the adjacent peptide slips through the DNA
motor without
ratcheting. Figure 18 shows successive steps in the assembly of TARGET
constructs (Figure
18A, B, C and D), and cognate STANDARDS (Figure 18 E, F, G, and H), using two
different
8-base oligo tag sequences 63 and 67 to distinguish them (in this case the 5'
oligo 62 is the
same for TARGET and STANDARD constructs, as it is unlikely to be readable
during 5'-to-
3' transit through a nanopore). Because the TARGET and STANDARD peptide
molecules are
distinguished by oligo tags 63 and 67, joined with the respective peptides in
separate processes
(the STANDARD constructs being assembled separately from sample digest
preparation), the
TARGET and STANDARD peptides are of identical chemical structure, thereby
ensuring their
equivalent binding by the cognate BINDER.
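The trade-off between tag length and reliable identification can be illustrated with a minimal tag classifier. The Python sketch below assigns a read tag to TARGET or STANDARD by Hamming distance and declines to call ambiguous or heavily corrupted tags. The two 8-base tag sequences shown are hypothetical placeholders, not the tags of the Figures, and the sketch considers substitution errors only (insertions and deletions, which also occur in nanopore reads, would require an alignment-based comparison).

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_tag(read_tag: str, tags: dict, max_mismatches: int = 1):
    """Return the label of the closest tag, or None if the call is unreliable."""
    ranked = sorted((hamming(read_tag, tag), label) for label, tag in tags.items())
    best_distance, best_label = ranked[0]
    tied = len(ranked) > 1 and ranked[1][0] == best_distance
    if best_distance > max_mismatches or tied:
        return None
    return best_label


# Hypothetical 8-base tags differing at every position.
EXAMPLE_TAGS = {"TARGET": "ACGTACGT", "STANDARD": "TGCATGCA"}
```

Tags that differ at many positions tolerate more read errors before a TARGET is mistaken for a STANDARD, which is one reason a longer tag can be preferred when read accuracy is limited.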
7.6.7.5 Stepwise assembly of peptide:oligo constructs
In some embodiments it is desirable to prepare peptide:oligo constructs of
small size
and then join these to larger oligos to provide sufficient oligo length
upstream (i.e., 3'-wards
in the case of a 5'-to-3' nanopore reading system) to allow the pore to "read"
a significant
length of the peptide. In embodiments that make use of an oligo tag to
distinguish TARGET
and STANDARD peptides, it is typically necessary to form peptide:oligo
constructs of all the
peptide molecules in a sample digest in order to ensure that sample peptides
are counted
accurately in comparison with known added numbers of STANDARD molecules. Short
oligos
are more economical for such bulk peptide derivatization applications. In
addition, short oligos
exhibit more rapid diffusion, thereby improving the reaction rate between
oligos and peptides.
In the examples of Figure 18, the oligo CCTGAACCTZTTATCCAGT has a molecular
weight
of approximately 5,700, and the complete oligo:peptide construct including the
peptide
ESDTSYVSLK is approximately 6,800 daltons. A construct of this size diffuses significantly
faster than a
construct comprising an oligo as shown in Figure 16 (12,700 daltons of oligo
and
approximately 1,100 daltons of peptide, for a total of 13,800 daltons).
However the shorter
oligos do not provide a sufficient number of bases 3' to the peptide to enable
reading both the
peptide and the tag 3' to it (the DNA motor must engage a sufficient number of
bases upstream
of the peptide and tag to provide effective ratcheting during the read). In
some embodiments,
the need for additional bases 3' to the peptide and tag is satisfied by
extension of the construct
with more bases. Figure 19 shows an embodiment to achieve this extension in
which
oligo:peptide:oligo constructs 91 (a TARGET with oligo tag 63 shown in Figure
19A) and 92
(a STANDARD with oligo tag 67 shown in Figure 19E) prepared according to the
steps of
Figure 18 are annealed with complementary strands 93 and 94 respectively
(Figure 19B and
F). Complementary oligos 93 and 94 also hybridize with extension oligo 95
(Figure 19C and
G), allowing oligos 95 to be ligated with oligos 91 and 92 to produce longer
oligos 96 (an
extended TARGET construct) and 97 (an extended STANDARD construct) shown in
Figure
19D and H. The end structure of these constructs (5' phosphate and 3'T
overhang on the upper
strand, 5' A overhang on the lower strand) enables ligation of the constructs
into longer oligos
and their linkage to sequencing Y-adapters by conventional enzymatic ligation
in preparation
for sequencing (Figure 19I).
In some embodiments peptide:oligo constructs are enriched by capture on
specific
BINDERS, and the kinetics of such capture can be improved by using relatively
small
constructs capable of rapid diffusion, and having less propensity to form
large aggregates. In
the example shown in Figure 20, constructs 91 (TARGET) and 92 (STANDARD) are
captured
by interaction of the peptide components with BINDERS 98 (which may be
sequence-specific
anti-peptide antibodies, for example) attached to magnetic beads 51. The
capture reaction can
be carried out either before (Figure 20A) or after (Figure 20B) cleavage of
the oligo to
"linearize" a peptide loop insertion construct (Figures 16, 17 and 18). The
capture can be
carried out with the sequenceable construct alone or hybridized to a
complementary oligo.
Those with skill in the art will understand that capture of smaller constructs
is likely to proceed
more quickly, with lowered potential for non-specific interactions. In some
embodiments,
extension of peptide:oligo constructs (as in Figure 19) and/or addition of
complementary
strands is carried out after the BINDER affinity enrichment step in order to
take advantage of
capturing smaller constructs. Nevertheless, extension of peptide:oligo
constructs (as in Figure
19) and/or addition of complementary strands can also be carried out prior to
the BINDER
enrichment step.
7.6.7.6 Reduction of loop insertion failures
In some embodiments the lengths of oligo segments in the constructs are
optimized to
minimize potential for failure to generate effective loop constructs. As shown
in Figure 21,
several scenarios can occur when a peptide connects with an oligo having two
attachment sites.
Figure 21A shows an effective assembly in which both ends of the peptide
connect to nearby
sites on the oligo (following the processes shown in Figures 16, 17 and 18)
generating a small
construct capable of hybridizing with a complementary strand (Figure 21D) and
thereafter
forming a full-length sequenceable construct (e.g., through the extension
steps shown in Figure
19). Figures 21B and C, in contrast, show cases in which only one end of a
peptide successfully
links to a site on the oligo, and in these cases the resulting constructs
contain only a 5' or a 3'
oligo segment, or in which two different peptide molecules react with nearby
oligo linkage
sites (Figure 21D). By limiting the length of these 5' and 3' segments, e.g.,
to 8 or 9 bases as
shown, complexes shown in Figure 21E, F and G have low melting temperatures
(e.g., below
20 °C, as in the examples shown), and therefore are unlikely to form in
appreciable quantities at
room temperature and above, compared to the construct in Figure 21E which is
much more
stable and thus capable of participating efficiently in the extension steps of
Figure 19. The
optimization of oligo segment lengths and sequences, and of temperature during
relevant stages
of a workflow, to disfavor the extension of incomplete constructs, while
enabling extension of
complete constructs, improves the efficiency of generation of sequenceable
constructs.
7.6.8 Reading peptide:oligo constructs in nanopores
In some embodiments, the polymers attached to the peptides can fulfill several
important functions in the process of sequence-sensitive peptide detection
according to the
invention, as exemplified in Figure 22, including: A) providing a highly
charged "guide thread"
that rapidly threads through a sequencing nanopore (ahead of the peptide) as a
result of a
voltage potential across the membrane in which a nanopore is embedded; B) working
together
with a protein nanomachine (e.g., a DNA motor such as a helicase) adjacent to
the sequencing
pore to provide a molecular "ratchet" that moves the peptide (or allows it to
move under the
influence of a cross-membrane electric potential) through the pore in discrete
steps at a
controlled rate (i.e., slow enough to allow accurate measurements of ion
current through the
pore for each step, despite being stochastic in nature); and C) providing a pore-
sequenceable
"sequence tag" or "barcode" whose nucleotide (or amino acid) sequence is read,
in whole or
in part, during transit through the sequencing pore (either before or after
the TARGET peptide)
and that identifies the attached peptide as having certain characteristics. In
some embodiments,
a sequence tag (e.g., comprising DNA) identifies a construct containing a
TARGET peptide
(and/or its associated STANDARD) as coming from a specific sample among a
multiplicity of
samples whose enriched peptides have been pooled ("multiplexed") together
after addition of
sample-specific tags and prior to sequencing together in one sequencing run.
Such a sample
"barcode" can be included in a conventional sequencing adapter or provided as
a separate
polymer module to be linked with all of the constructs derived from a specific
sample, as is
commonly done using commercially available kits (for example
https://store.nanoporetech.com/us/native-barcoding-expansion-1-12.html).
This use of a tag
for identification of a sample in a pool is well-known in the art as a means
for multiplexing
two or more samples to be combined in a pool for DNA sequencing: the tag
allows sequences
of the molecules from each sample to be separated after bulk sequencing. Sets
of such tags are
commercially available for the Oxford Nanopore system. Additional
characteristics that may
be encoded in the DNA Leader or Trailer tag include identification as a member
of a specific
TARGET or STANDARD peptide subset in cases in which subsets of TARGET peptides
are
separately extracted from a sample digest. Functions A and B above (guide
thread and motor
engagement) can potentially be fulfilled by a homopolymer (e.g., a DNA
homopolymer of one
of the 4 bases, or a peptide homopolymer of glutamic or aspartic acid), while
function C
requires a polymer (like DNA or peptide) made of multiple different monomers
that can be
distinguished by a nanopore sequencing system. In some embodiments sample
barcodes are
used together with TARGET vs STANDARD differentiating barcode tags (the former
required
in principle only once per sequenceable construct molecule, while the latter
are required in
association with each peptide molecule in a construct to identify its source).
Barcodes provide
an efficient way to record and recover information about the nature and source
of individual
peptide molecules in the invention, and thus exploit an advantageous feature
of technologies
such as nanopores that are capable of "reading" both oligonucleotide and
polypeptide
polymers.
In some embodiments shown in Figure 22A, Leader polymer 1 (labeled Oligo 1),
linked
to the n-terminal residue of peptide 3, threads through nanopore 5 in membrane
4 (shown in
cross-section) as a result of its negative charge (i.e., the oligo's
phosphate backbone) and the
application of an electric potential between the "cis" side 9 of the membrane
4 and the "trans"
side 10, in this case with the "trans" side positive. The peptide 3 follows
Leader 1 into the pore.
A second oligo 2 (Oligo 2), attached to the c-terminal residue of the peptide
3, is part of a
complex governing the movement of the peptide through the pore. As shown
schematically in
Figure 22, this complex includes a protein nanomachine 7 (e.g., a DNA motor)
interacting with
oligo 2, another oligo 6 with a sequence partially complementary to oligo 2,
and another oligo
8 forming a "tether" promoting an association between the membrane 4 and the
construct.
Tethering a construct close to the membrane (e.g., by providing a cholesterol
functionality that
adheres to the membrane while allowing free diffusion of the complex in the
plane of the
membrane while seeking an open pore) is known to increase the rate at which
construct
molecules thread into the nanopore by more than 1,000-fold.
In some embodiments, the current passing through nanopore 5 in membrane 4
between
the cis chamber 9 and the trans chamber 10 (Figure 22) is measured (typically
in picoamps
given an electrolyte concentration in chambers 9 and 10 of 100-500mM salt).
This current
changes as amino acids or nucleic acid bases transit the narrow throat of
nanopore 5, with the
speed of transit regulated by nanomachine 7. A variety of DNA motor
nanomachines 7 are
known in the art and can be used, including helicases, polymerases, etc. In
the embodiment
shown in Figure 22, the nanopore is protein MspA or a derivative thereof, and
nanomachine 7
is a processive enzyme motor protein such as a helicase, capable of regulating
the passage of
DNA through the nanopore 5. The motor is pre-positioned on adapter Oligo 2,
where
specialized bases prevent it from contacting the rest of the DNA before
loading into the
nanopore. This scheme is commercially available (Oxford Nanopore
Technologies).
The DNA motor that engages with the oligo and regulates its passage through
the
nanopore is offset from the region of the nanopore (the "throat") where the
bases and/or amino
acids modulate the through-pore current (i.e., where the polymer is "read").
This offset provides
the means by which peptides can be sequenced, by allowing the motor to engage
oligo bases
"above" the peptide while the peptide is in the throat being read. However,
since the DNA
motor cannot engage with peptide to achieve ratchet motion, regions "below"
the peptide by
the offset distance cannot typically be read, but instead move rapidly through
the nanopore
until bases are once again engaged by the DNA motor. This offset and its
effect on the
acquisition of sequence information from peptide-oligo constructs is
represented graphically
in Figure 22. Construct 81 (a typical rope-tow construct) produces sequence
information when
moving through a nanopore whose offset between motor and throat 82 is
approximately 8
bases. Thus sequence is obtained in this example for regions of the construct
about 8 bases 5' -
ward of each base in the oligonucleotide sections 89, but not for the non-base-
containing (e.g.,
abasic) sections 87. Thus oligonucleotide section 85 can be read, as can
peptide section 86,
but region 83 produces no sequence information due to lack of DNA motor
engagement. In
some embodiments using peptide oligo constructs, whether in-line or rope-tow,
this effect
limits the length of peptide sequence that is observable using nanopore
sequencing. For similar
reasons, in rope-tow constructs, this offset limits the readable portion of a
peptide linked near
the 5' end of an abasic stretch to a fixed length of amino acid chain measured
from the 3' end
of the abasic stretch (i.e., the beginning of the oligo sequence that can
engage the DNA motor).
In some embodiments it is therefore preferred that peptides have a constant
length (i.e.,
number of amino acids) that is the maximum length which will fit within the
abasic stretch
without overlapping base-containing stretches: this maximizes the portion of
peptide sequence
that falls within the offset 5'-ward of the following base-containing oligo
section. Those
skilled in the art will recognize that a section of peptide sequence of 8-12
amino acids should
be readable using current nanopore systems (in future extendable to greater
lengths by
providing longer pores), and that the length of an abasic region parallel to a
linked peptide
molecule can be optimized so as to match (and possibly slightly exceed) the
linear extended
length of a peptide of 8, or 9, 10, 11 or 12 amino acids. This range of
peptide lengths occurs
very commonly in tryptic peptides of human and other proteins, indicating the
likelihood that
a proteotypic peptide of pre-specified length can be found for almost all
target proteins.
A variety of alternative pores can be used in the invention. Nanopores such as
aerolysin, α-hemolysin, fragaceatoxin C (FraC), and MspA can be used. Motors
working on DNA
(e.g., Oligo 2 as shown) are known that can control movement of the construct
towards the
positively polarized trans side by "paying out" the oligo in steps while
consuming chemical
energy (e.g., ATP in the Oxford Nanopore system), as are motors such as phi29
DNAP (23) or
helicase Hel308 (65) that "pull" an Oligo up against the electrophoretic force
using ATP. A
variety of schemes for controlling the upward (cis-wards) or downward (trans-
wards) motion
of a polymer through a nanopore have been described (referred to colloquially
as "inny" and
"outy" methods by Oxford Nanopore). Non-enzymatic methods can also be
employed, such
as "unzipping" of a DNA duplex under the influence of an electrophoretic
force, and variation
of the cross-membrane potential to regulate transit speed (68). Peptides can
take the place of
nucleic acids (e.g., replacing Oligo 2), in which case the motor function
during readout as a
construct transits a nanopore can be implemented using an "unfoldase" (69), or
using ClpX on
the "trans" side of the membrane. An alternative nanopore technology using
"Field Effect
Nanopore Transistor" has been described in US 9,341,592 B2 assigned to
iNanoBio. A variety
of non-biological nanopore technologies have been disclosed, many using
semiconductor
technology to create thin inorganic membranes (e.g., of Si3N4, SiO2, graphene,
or MoS2) with
small holes that function as nanopores, and in some cases enabling use of
quantum mechanical
tunneling currents across the lumen of the hole in addition to measurements of
current through
the hole as signals indicative of the transiting polymer sequence.
In the embodiment shown in Figure 22B, the construct moves through the pore
generating sequence information (as the timeline of pore current) from the
peptide, and
optionally Trailer Oligo 2 (e.g., containing a sequence tag), until the end of
the construct is
reached, at which point the construct is released by protein nanomachine 7 and
the construct
completes its movement through the pore and floats free into "trans"
compartment 10. At this
point another construct molecule can enter the pore from the "cis" side for
sequencing,
repeating the cycle.
Because of the limited length of the channel in a typical nanopore 5, sequence-
related
information can only be obtained from about 10-25 amino acids of the peptide 3
nearest to the
linkage to the Trailer oligo 2 (i.e., at the c-terminal end of peptide 3 in
Figure 22A). In some
further embodiments, a second set of constructs is generated similar to those
described above,
but in which the peptide is linked in the opposite orientation by swapping the
linkages to the
two oligos in Figure 15. Such constructs (Figure 22C and D) allow sensing of
sequence-related
information from about 10-25 amino acids nearest to the n-terminal end of
peptide 3.
Combining results from the two reads, analogous to sequencing complementary
strands of
DNA, provides greater coverage of the peptide sequence as well as overlapping
reads of the
middle region of short peptides.
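The coverage gained by combining reads from the two orientations can be estimated with simple arithmetic. The Python sketch below assumes that a fixed number of residues is readable from whichever end of the peptide is linked to the Trailer oligo; the function name and the fixed-window assumption are illustrative only.

```python
def end_read_coverage(peptide_len: int, read_len: int) -> tuple:
    """Return (residues covered, residues read in both orientations)."""
    # Residues readable when the c-terminal end is linked to the Trailer.
    c_term_read = set(range(max(0, peptide_len - read_len), peptide_len))
    # Residues readable when the n-terminal end is linked to the Trailer.
    n_term_read = set(range(0, min(read_len, peptide_len)))
    covered = c_term_read | n_term_read
    overlap = c_term_read & n_term_read
    return len(covered), len(overlap)
```

For example, with a 10-residue readable window a 30-residue peptide would be covered at 20 positions with no overlap, while a 15-residue peptide would be fully covered with 5 central residues read in both orientations.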
In some embodiments, means known in the art are applied to extend the length
of a
nanopore (and lengthen Oligo 2 if necessary) so as to be able to read longer
amino acid
sequences. These include stacking a spacer protein above the entrance to a
nanopore as
described, for example in WO 2021/111125, or construction of multi-component
stacks (70).
It will be clear to those skilled in the art that numerous improvements in
both nanopores
and "motors" have been, and will be, devised, any of which could contribute to
improvements
in the performance of the present invention.
An example of the current state of the art of nanopore detection and its use
to
characterize peptides is described in patent application WO2021111125A1.
7.6.9 Parallel polymer "Rope-tow" constructs.
In some embodiments, a preferred alternative to the foregoing elaborate
chemistry
required to prepare "in-line" peptide-polymer constructs (e.g., with peptides
and oligo
segments alternating in a continuous linear polymer) is implemented by linking
peptides by
only one end to a continuous polymer VEHICLE (e.g., an oligonucleotide) having
a plurality
of available linkage sites, thus forming a long continuous polymer chain with
multiple peptide
branches. This VEHICLE construct is reminiscent of a rope-tow ski lift in
which skiers can
grasp any of a multiplicity of handles on a continuously moving rope to be
pulled uphill. In
this case the continuous polymer serves as the rope, and peptides are attached
to the rope via
linkers.
The motion of such a hybrid molecule traversing a nanopore can be regulated
(e.g., to
produce the desired ratchet motion) by interaction of the continuous polymer
chain with a
suitable motor (e.g., a continuous oligonucleotide interacting with a DNA
motor). Short
complementary oligos can be hybridized to the long continuous oligo as
necessary to facilitate
its interaction with a motor such as a DNA helicase. In the case of a typical
oligonucleotide
polymer, which has a uniform negative charge along its length due to the
phosphates in its
backbone, it is preferred that the attached peptides have a net positive or
zero charge (i.e.,
experience an electrical force opposite to that of the polymer, or else no
force, under the
influence of the electric potential across the membrane) so that the polymer
pulls them through
the pore by the peptide end attached to the polymer, thus reducing the chance
that the peptide's
free end will move forward and "bunch up" in the pore, potentially clogging
it.
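By way of illustration only, the charge preference described above can be
sketched as a simple screening calculation. The following Python sketch
estimates the net charge of a peptide near neutral pH by counting ionizable
groups. The function names, the integer charge model (Lys and Arg counted as
+1, Asp and Glu as -1, His ignored) and the flags indicating whether a
terminus remains free after linkage are assumptions introduced here for
illustration and are not part of the disclosure.

```python
def approx_net_charge(seq, n_term_free=True, c_term_free=True):
    """Crude integer net charge near pH 7 (assumed model, illustration only)."""
    positive = sum(seq.count(aa) for aa in "KR")   # Lys, Arg side chains
    negative = sum(seq.count(aa) for aa in "DE")   # Asp, Glu side chains
    charge = positive - negative
    if n_term_free:
        charge += 1   # free n-terminal amine
    if c_term_free:
        charge -= 1   # free c-terminal carboxyl
    return charge


def suits_rope_tow(seq, **kwargs):
    """True when the estimated net charge is zero or positive."""
    return approx_net_charge(seq, **kwargs) >= 0


# Hypothetical example sequences
print(approx_net_charge("GLGKR"))
print(suits_rope_tow("LVNEVTEFAK"))
```

A real selection would use pKa-based charge calculations and account for the
charge of the linker itself; the integer model above is only a first-pass
filter.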
Peptides pulled through the nanopore transit alongside the continuous polymer,
so that
the nanopore throat is occluded by both together. In order to maximize the
information content
of the nanopore current signal for recognizing the peptide sequence, it is
preferred that the
region of polymer lying alongside a peptide is "featureless" allowing the
peptide sequence
signal to be recognized with minimal interference from the polymer background.
This
featurelessness can be achieved through the use of a homopolymer stretch
alongside the
peptide, for example an "abasic" stretch completely devoid of bases in the
case of an
oligonucleotide polymer. In some embodiments, the unit-spacing of the extended
form oligo
backbone (i.e., base pair spacing) is different from the amino acid spacing of
the extended
peptide.
In some embodiments a rope-tow VEHICLE can be created comprising i) an abasic
stretch, ii) a reactive linker group within or adjacent to an abasic stretch
and which is capable
of making a covalent linkage to a peptide end, and iii) a stretch adjacent to
the abasic stretch
that is capable of engaging with a DNA motor (i.e., a stretch comprising
bases, either single or
double-stranded) to regulate movement of the oligo through a nanopore. Repeats
of this
configuration provide a long polymer VEHICLE with a plurality of peptide
attachment sites.
In some embodiments, rope-tow constructs are formed so as to be capable of
assembling into
longer concatamers, through enzymatic ligation (e.g., via DNA ligase),
transposase
recombination, CRISPR insertion, or chemical coupling (e.g., using click
chemistry).
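The repeat structure described above can be sketched, purely for
illustration, as a small data model. In the following Python sketch the class
name, the field names and the single-letter layout codes ("L" for the residue
carrying the linker group, "a" for an abasic position, "N" for a base-bearing
position able to engage a DNA motor) are hypothetical conventions chosen here,
and the placement of the linkage residue at the 5' side of the abasic stretch
is one of the arrangements contemplated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RopeTowUnit:
    abasic_len: int   # base-free backbone positions alongside the peptide
    motor_len: int    # base-bearing positions that can engage a DNA motor

    def layout(self) -> str:
        # Written 5' -> 3': linkage residue, abasic stretch, motor stretch
        return "L" + "a" * self.abasic_len + "N" * self.motor_len


def build_vehicle(unit: RopeTowUnit, repeats: int) -> str:
    """Concatenate identical repeat units into one VEHICLE layout string."""
    return unit.layout() * repeats


vehicle = build_vehicle(RopeTowUnit(abasic_len=6, motor_len=4), repeats=3)
print(vehicle)             # layout of the three-unit VEHICLE
print(vehicle.count("L"))  # number of peptide attachment sites
```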
In some embodiments, a rope-tow construct has the advantage that it requires
only a
single linkage to a peptide, usually at either the peptide's n- or c-terminus,
instead of two sites
as required when a peptide is inserted into an oligo "in-line" with the oligo
backbone (the
arrangement described in the prior art). In some embodiments, the oligo
backbone through the
abasic stretch provides a uniformly distributed negative charge (e.g., the
canonical sugar-
phosphate backbone of DNA) that largely masks the net charge and charge
distribution of an
attached peptide, and allows the electric potential applied between cis and
trans sides of a
nanopore to exert a near-uniform force on the construct irrespective of the
peptide's
composition.
In some embodiments the abasic stretch has a length equal to or greater than
the length
of selected TARGET and STANDARD peptides that can be linked to the linkage
site, such
that a peptide in an extended configuration can lie "alongside", and parallel
to, the extended
oligo backbone without overlapping nucleic acid bases (i.e., such that the
cross-sectional area
of the linked peptide:oligo construct along the peptide is that of the peptide
plus the backbone).
In some embodiments one or more additional dsDNA segments are included in the
oligo that
contain sequence information useful to identify and establish registration of
ionic signatures in
a nanopore, to identify a construct as associated with a particular sample in
a pool, or analyte
in a panel (e.g., by identifying specific STANDARDs), or for quality control.
In some
embodiments, the oligo comprises one or more regions to which a DNA motor can
bind, or
where one can be pre-loaded (which regions may not be limited to natural DNA
bases, abasic
structures, or to a conventional sugar:phosphate backbone). In some
embodiments the oligo
comprises regions of DNA or RNA that can interact with a DNA motor to regulate
passage of
the oligo through a nanopore in a ratchet motion.
In some embodiments, multiple peptides are linked to multiple abasic sites on
a
prepared rope-tow oligo to form a peptide:oligo concatamer. In some
embodiments, such an
extended rope-tow "template" oligo VEHICLE, having a plurality of empty
linkage sites and
abasic stretches, is present in the cis compartment of a nanopore and reacts
with TARGET and
STANDARD peptides (introduced into the compartment in solution, or bound to
BINDER
from which they dissociate under conditions prevailing in the compartment, or
on other solid
supports) to form rope-tow peptide:oligo constructs in the vicinity of a
nanopore. In some
embodiments, an empty extended "template" rope-tow oligo VEHICLE is bound to
the
nanopore's membrane by a tether. In some embodiments, the empty rope-tow oligo
continues
to react with and accumulate attached peptides after the commencement of a
sequencing run.
In some embodiments, a rope-tow oligo VEHICLE allows peptides functionalized
with
one member of a "click" reagent pair to react directly with a "click" site
(comprising the other
member of a "click" pair) on a pre-synthesized VEHICLE, which can be of any
convenient
length, and contain any number of abasic stretches, thus creating a continuous
polymer capable
of threading and being read continuously by a nanopore having a processive DNA
motor. The
peptides are attached to the VEHICLE backbone in an orientation such that they
are dragged
through the pore alongside an abasic stretch of sugar-phosphate backbone. The
nanopore
current trace describing the peptide sequence thus reflects the combined areas
and chemical
properties of backbone and amino acids passing through the reading region
(i.e., the "throat"
of the nanopore) as the two parallel polymer segments are ratcheted through
the pore by the
interaction of a DNA motor at the entrance to the pore with an oligo region
following the abasic
stretch. Those skilled in the art will understand that numerous permutations
of this concept
can be implemented with alternative oligonucleotide sequences, backbone
chemistries (e.g.,
PNA, etc.), base-free regions (e.g., positions with side groups smaller than
normal bases or
abasic) of various lengths designed to accommodate peptides of various
lengths, and chemical
connecting groups (e.g., various click combinations, amino-reactive groups,
etc.). In some
embodiments, tryptic peptides ending in arginine are preferred, as these have
a single amino
group (the n-terminal amine), which conveniently provides a single specific
site for attachment
of a click group for facile linkage to a "rope-tow" oligo. In some embodiments
tryptic peptides
ending in lysine are connected by one of the peptide's two amino groups
following blockage
of the other amino group (e.g., blockage of the n-terminal amino group by reaction
at near-neutral
pH), thus allowing a peptide to be coupled in the opposite orientation (c-term-
first into the
pore) compared to linkage via the n-terminal group, and providing the ability
to "read" peptide
sequences in both directions. In some embodiments, aspartic and glutamic acid-
free peptides
are linked to the oligo via the unique carboxyl group at the peptide's c-
terminus. In some
embodiments, peptides are attached to the oligo via a site near the 5' end of
an extended abasic
region so that as the oligo is drawn 5'-first through a nanopore, the peptide
(whether n-terminal
first, via an n-terminal linkage, or c-terminal first via a c-terminal
linkage) is pulled through
the pore lying alongside the oligo backbone.
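The attachment options described above depend only on which reactive groups
occur exactly once in a peptide, which can be sketched as a simple sequence
check. The following Python sketch is illustrative only: the function name and
the returned labels are hypothetical, one-letter amino acid codes are assumed,
and the guanidino group of arginine is not counted as an amino group,
consistent with the statement above that arginine-terminated tryptic peptides
have a single amino group.

```python
def attachment_options(seq):
    """List single-site attachment routes available for a peptide sequence."""
    options = []
    lysines = seq.count("K")
    acidic = seq.count("D") + seq.count("E")
    if lysines == 0:
        # The n-terminal amine is the only amino group (e.g., Arg-terminated)
        options.append("n-terminal amine")
    elif lysines == 1 and seq.endswith("K"):
        # Two amino groups: block the n-terminal one, link via the lysine
        options.append("c-terminal lysine amine (n-terminus blocked)")
    if acidic == 0:
        # The c-terminal carboxyl is the only carboxyl group
        options.append("c-terminal carboxyl")
    return options


# Hypothetical example sequences
print(attachment_options("LGGSPTSR"))
print(attachment_options("LVNEVTEFAK"))
```

Peptides with internal lysines, or with both lysine and acidic residues,
return fewer or no routes under this simple rule and would need a different
coupling strategy.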
In some embodiments, as illustrated in Figure 23A, a method of concatenating
peptides
for nanopore sequencing is provided in which a continuous polymer is prepared
comprising
single-stranded oligonucleotide segments 23 with stretches of abasic sites 21
(positions in
which there is no base attached to the continuous sugar-phosphate backbone)
and through
which the chain therefore has a diminished cross-sectional area on account of
the lack of bases.
In some embodiments, the length of the abasic stretch is designed to be longer
than any of the
peptides to be linked to it (including the length of any chemical linkers). In
some embodiments
a nucleic acid residue 22 preceding (e.g., 5' to) and adjacent to one or more
abasic regions 21
is provided (during synthesis or through subsequent modification) that
includes a reactive
chemical linking group (e.g., one of a pair of click-chemistry groups, or a
reactive group such
as an amino group where a click functional group can be installed), capable of
combining with
the other of the pair of click groups that is attached to one terminus (e.g.,
the n-terminus) of a
peptide molecule 3. The provision of abasic stretches provides a length of
backbone devoid of
bases, and whose diminished cross-sectional area allows a peptide chain
attached at the leading
(typically 5') end of the stretch to be pulled through a nanopore parallel to
the abasic backbone.
The nanopore occlusion in the area of the abasic region is thus due to the
peptide chain plus
the parallel oligonucleotide backbone, and through this stretch the amino acid
sequence can be
read from the changes in nanopore ion current during transit under control of
the DNA motor
interacting with the DNA sequence to the 3' end of the abasic region.
In some embodiments the polymer chain is synthesized chemically or ligated
together
from chemically-synthesized units, for example using oligonucleotide synthesis
and
incorporating DNA, RNA, modified DNA and abasic synthons (such abasic
sequences can be
obtained commercially from, e.g., Integrated DNA Technologies as dSpacer,
rSpacer or Abasic
II residues). In some embodiments, alternatives to the common DNA or RNA
backbones are
used, such as peptide nucleic acid or phosphorothioate backbones, or any of a
variety of linear
polymers that can be joined to oligonucleotide backbones to form a continuous
molecule. It is
understood by those skilled in the art that the polymer VEHICLE constructs
described herein
as comprising "oligos" or DNA can alternatively be formed of other polymers,
other backbones
and a variety of natural and modified bases or side-groups. It is likewise
understood by those
skilled in the art that the linkage groups described for coupling peptides to
rope-tow constructs
can be any of a variety of coupling chemistries including "click" chemistry,
amine-reactive
chemistries (e.g., NHS esters), carboxyl-reactive chemistries, etc. Reactive
sites may be
created in synthetic oligos by a variety of means, for example by including an
amino-modified
version of an internal base such as 5' Amino Modifier C6 dT or a 3' Amino
Modifier (both
available commercially e.g., through Integrated DNA Technologies, Inc. custom
DNA
synthesis service). These amino groups can be converted to NHS derivatives as
part of oligo
manufacture, and can be further converted to click chemistry groups where
required.
It is likewise understood by those skilled in the art that numerous
alternative ratcheting
nanomachines can be used to regulate movement of polymers, including oligos,
through a
nanopore in place of a DNA motor, and that numerous alternative biological
(protein and
DNA-based) and inorganic (e.g., solid-state) nanopores have been described
that are capable
of reading polymer sequences.
In some embodiments, the abasic stretches as described can be any structure
that
preserves the continuity of the backbone and preferably has a smaller cross-
sectional area than
canonical single-stranded DNA or RNA. In some embodiments the backbone in the
abasic
stretches comprises negative charges, mimicking the uniform negative charge
distribution of a
sugar-phosphate backbone.
The modified residues comprising linkage sites can be any of a variety of
residues
comprising a linker group, and the linker may be attached to a base, to a
sugar, or to a phosphate
group.
In some embodiments the linkage site is preceded (i.e., in the 5' direction)
by one or
more abasic sites. Such preceding abasic sites can be provided to generate a
high-current
(almost open-pore) start signal preceding the current profile attributable to
a linked peptide and
parallel polymer backbone.
In some embodiments, as shown in Figure 23A, a complementary oligonucleotide
24
is generated that hybridizes with oligo 23, except through the abasic regions
(where there are
no bases on 23 with which to hybridize), and in the region of the 5' terminus
of oligo 23 (where
the 5' region comprises a leader sequence, or non-oligo charged polymer, that
threads through
the nanopore initially and may comprise a site for binding a DNA motor 7). In
some
embodiments only a fraction of the corresponding residues hybridize. In some
embodiments,
the complementary oligo 24 is interrupted, comprising only segments that
hybridize with the
non-abasic ("natural") segments of oligo 23, leaving the abasic stretches
single-stranded. In
some embodiments the number of complementary strand bases aligned with abasic
stretches is
not the same as the number of abasic sugar-phosphates, such that one strand is
longer than the
other in the abasic region, leading to a kink in the duplex at abasic regions,
increased exposure
of the region to the environment and lesser steric hindrance in the reaction
of the oligo linkage
site with that of the peptide.
In some embodiments the length of abasic regions can be set to accommodate the
length
of peptides that are intended to be read (e.g., the TARGET and STANDARD
peptides), so that
the abasic regions are at least as long as the extended length of the peptides
plus any linking
groups, thereby ensuring that a full-sized nucleic acid base and an amino acid
do not transit the
throat of the nanopore together, potentially clogging it. Abasic regions are
typically 6 to 50
backbone (sugar-phosphate) units long.
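The sizing rule above (abasic region at least as long as the extended peptide
plus any linking groups) can be sketched as a short calculation. In the
following Python sketch the per-residue rise of an extended peptide (0.35 nm),
the per-unit length of an extended sugar-phosphate backbone (0.6 nm) and the
linker length (1.0 nm) are rough assumed values chosen for illustration; they
are not taken from the disclosure and would need to be replaced by measured or
modeled values for a particular construct.

```python
import math


def min_abasic_units(n_residues, linker_nm=1.0, residue_nm=0.35, unit_nm=0.6):
    """Smallest number of abasic backbone units spanning peptide plus linker.

    All length parameters are assumed, illustrative values.
    """
    needed_nm = n_residues * residue_nm + linker_nm
    return math.ceil(needed_nm / unit_nm)


def within_typical_range(units, low=6, high=50):
    """Check against the 6 to 50 unit range stated in the text."""
    return low <= units <= high


units = min_abasic_units(12)   # a hypothetical 12-residue tryptic peptide
print(units, within_typical_range(units))
```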
In some embodiments, nanopores with throats of larger dimensions can be used
to
accommodate oligo and attached peptides in parallel (i.e., roughly double the
throat area of
nanopores currently used for DNA sequencing), in which case the stretches of
abasic sites
alongside the peptides in Figure 23 can be replaced with canonical DNA with
backbone present
with normal nucleotides attached. In such a case it is preferred that the
stretch of oligo
alongside the peptide is a homopolymer, thus providing a consistent background
against which
the variations of nanopore current due to different amino acids of the peptide
can be detected.
In some embodiments, peptides are prepared for conjugation with rope-tow oligo
VEHICLEs by functionalizing the n-terminal amino group with a reactive moiety
such as a
click chemistry reagent suitable for joining to the modified oligo attachment
site. In some
embodiments, TARGET peptides ending in Arginine are preferred since they have
only a
single amino group (the n-terminus) and therefore are derivatized at only one
site by reaction
with an amino selective reagent.
In some embodiments, a peptide amino group is derivatized while the peptide is
localized on a BINDER, and later released into solution to react with a rope-
tow oligo
VEHICLE.
In some embodiments, a peptide amino group is derivatized while the peptide is
localized on a BINDER, and subsequently reacted with a solution of rope-tow
oligo molecules,
after which any unreacted rope-tow oligo is washed away prior to elution from
the BINDER.
This approach has the advantage that a large majority of the rope-tow
molecules will have
attached peptides (i.e., most or all attachment sites will be "loaded" with
peptides), and
subsequent concatenation of these loaded rope-tow oligos will generate a
fully loaded
construct capable of yielding a plurality of TARGET and STANDARD peptide
counts in one
nanopore read.
In some embodiments peptide amino groups are not modified but rather react
directly
with an oligo whose reactive sites can react directly with an amino group
(e.g., the oligo
VEHICLE has NHS linkage sites). In some embodiments peptides are eluted from
BINDERs
using a competing (i.e., "displacer") peptide of the same or similar sequence
to the TARGET
and STANDARD peptides and introduced, when elution is required, at higher
concentration
than the TARGET and STANDARD peptides or BINDER binding sites. After a
duration of
one or a few BINDER half-off-times, the BINDER is saturated with the displacer
peptide and
the TARGET and STANDARD peptides will be free in solution. Displacer peptides
can be
modified so as to be unable to participate in linkage reactions taking place
after elution of
TARGET and STANDARD peptides, e.g., by blocking the n-terminal amino group
and/or the
c-terminal carboxyl, and any lysine amino group. In some embodiments that
involve a linkage
reaction with the n-terminal amino group of bound peptides, a displacer
peptide of the same
sequence as the TARGET (or STANDARD), but with the n-terminal amino group
acetylated
during or after synthesis, can be used to displace and thus release the bound
peptides without
interfering in the amino group chemistry. An advantage of eluting TARGET and
STANDARD
peptides using a displacer peptide is that no other solution conditions need
be changed,
reducing the likelihood of eluting non-specifically bound materials from the
BINDERs, carrier
beads or other supports. In the case of nanopore sequencing, use of a
displacer peptide with a
net positive charge results in no displacer peptide migrating towards or into
the nanopore.
In some embodiments, not all reactive sites on the rope-tow oligo VEHICLE
react with
peptides, leaving empty peptide-accommodating sites. Empty sites are easily
recognized in
nanopore ion current traces by their short duration and lack of major current
modulation.
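The recognition criterion stated above (short duration and lack of major
current modulation) can be sketched as a two-threshold test on a segmented
current trace. The following Python sketch is illustrative only: the function
name, the units and both threshold values are hypothetical and would have to
be calibrated for a given nanopore, motor and buffer system.

```python
from statistics import pstdev


def is_empty_site(currents_pa, dwell_ms, max_dwell_ms=5.0, max_sd_pa=2.0):
    """Flag a trace segment as an unloaded abasic site.

    currents_pa: current samples (pA) recorded while the site transits.
    dwell_ms:    duration of the segment in milliseconds.
    Thresholds are assumed placeholder values, not from the disclosure.
    """
    short = dwell_ms <= max_dwell_ms
    flat = pstdev(currents_pa) <= max_sd_pa
    return short and flat


# Hypothetical segments: a flat, brief one and a long, strongly modulated one
print(is_empty_site([80.0, 80.5, 79.5, 80.0], dwell_ms=2.0))
print(is_empty_site([80.0, 55.0, 70.0, 40.0, 65.0], dwell_ms=40.0))
```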
In some embodiments, illustrated in Figure 23B, a rope-tow oligo (with
peptides
attached) and its complementary strand can be joined to a prepared adapter 25
comprising the
components required to present the oligo to a sequencing nanopore, start
threading into the
pore and regulate movement into the pore (e.g., using a DNA motor). In this
case the adapter
is modeled after a commercially available "Y-adapter" provided by Oxford
Nanopore, which
can be ligated to a DNA duplex (e.g., aligned by a T-overhang) via a simple
kit procedure
according to the manufacturer.
In some embodiments, DNA motors are loaded at periodic sites along the rope-
tow
oligo between the peptide attachment sites in order to provide means to
continue ratcheting the
rope-tow VEHICLE through a nanopore if a motor falls off when transiting a
peptide-loaded
or abasic site.
In some embodiments rope-tow constructs are prepared by joining a peptide to
an oligo
having a single abasic stretch (i.e., with a capacity of a single peptide),
and these short rope-
tow constructs are subsequently assembled into concatamers using linkage
methods described
above and illustrated in Figures 23 and 24 (e.g., click chemistry or enzymatic
ligation). In
some embodiments, such short, single attachment site oligos are reacted with
peptides while
the peptides remain on the BINDER, allowing unreacted oligos to be washed away
prior to
elution of peptides from the BINDER, and resulting in subsequent
concatamerization of only
"loaded" oligos, thus avoiding empty abasic sites and wasted sequence reads.
In some embodiments, double-stranded oligo-peptide rope-tow VEHICLE constructs
are introduced into prepared double-stranded sequenceable constructs (e.g.,
the products of
well-known commercially-available library preparation kits and methods used
with the Oxford
Nanopore system) by recombinant processes (e.g., by use of transposases,
CRISPR
mechanisms, "tagmentation", hybridization and repair, and the like) to form
sequenceable
concatamers.
In some embodiments, a bead functionalized with a single type of BINDER is
used to
capture and transport molecules of a single TARGET + STANDARD pair to the
vicinity of a
nanopore, where the peptides are eluted by one of the methods described above
and allowed
to combine (e.g., via "click" linkages as described above) with a VEHICLE
construct molecule
(e.g., a "rope-tow" construct) pre-positioned at (i.e., already threaded into
or directly available
to) the nanopore. In such an embodiment, an incubation period can optionally
be included to
allow eluted peptide molecules to couple with the VEHICLE prior to the start
of motion
through the nanopore. Motion of the construct through the pore can be
initiated by an increase
in the trans-membrane voltage pulling the typically negatively-charged
construct through the
nanopore, at which point a ratchet mechanism (e.g., a DNA motor) begins to
feed the construct
through the nanopore for reading. In some embodiments each nanopore is
prepared with a
single VEHICLE in place, and is used to sequence only that VEHICLE molecule;
in such an
embodiment it can be advantageous to use a long VEHICLE with a large number of
peptide
binding sites, for example a VEHICLE of 100kb equivalent length with abasic
stretches
comprising peptide linkage sites every 100b and therefore able to accommodate
1,000 peptide
molecules (a number sufficient to provide a precise TARGET-to-STANDARD ratio,
and thus
protein amount, when the TARGET and STANDARD are in approximately equal
amounts).
In some embodiments, such pre-positioned VEHICLEs associated to nanopores are
combined
with mixtures of different TARGET + STANDARD pairs.
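The capacity figure given above, and the reason that roughly 1,000 peptide
counts suffice for a precise ratio, can be illustrated with a short
calculation. In the following Python sketch the function names are
hypothetical, every site is assumed to be loaded and correctly read, and the
precision estimate assumes simple Poisson counting statistics for the TARGET
and STANDARD counts; none of these assumptions is stated in the disclosure.

```python
import math


def attachment_sites(vehicle_len_b, site_spacing_b):
    """Number of peptide linkage sites on a VEHICLE of the given length."""
    return vehicle_len_b // site_spacing_b


def ratio_cv(n_target, n_standard):
    """Approximate coefficient of variation of the TARGET/STANDARD ratio,
    assuming independent Poisson-distributed counts (an assumption)."""
    return math.sqrt(1.0 / n_target + 1.0 / n_standard)


sites = attachment_sites(100_000, 100)   # 100 kb VEHICLE, one site per 100 b
cv = ratio_cv(sites // 2, sites // 2)    # equal TARGET and STANDARD amounts
print(sites, round(cv, 3))
```

Under these assumptions a fully loaded VEHICLE read once gives a counting
imprecision of roughly 6% on the ratio; unequal TARGET and STANDARD amounts,
empty sites or misclassified reads would increase this figure.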
In some embodiments a bi-functional support is used in which a BINDER 52
specific
for a TARGET 48 + STANDARD 49 peptide pair is immobilized on a support (e.g.,
magnetic
beads 51) that also carries immobilized VEHICLE molecules comprising a tag
(e.g., a
sequenceable DNA tag 55) that is assigned to the TARGET. An example using
specific
sequences is shown in Figure 25A to illustrate the concepts, while not
limiting the scope of
sequences, binders and chemistries that can be used. An object of this
arrangement is to capture
a TARGET/STANDARD peptide pair via the cognate BINDER, separate these
peptides from
other peptides, and subsequently allow the captured peptide molecules to react
with the
VEHICLE comprising the TARGET's tag (which can be considered a VEHICLE
cognate to
the TARGET sequence), producing a construct whose sequenceable (e.g., DNA) tag
identifies
the TARGET and STANDARD peptides expected to be attached. This approach
provides
additional information (the identity of the TARGET) that improves the
reliability of the
identification of the TARGET and STANDARD peptides linked to the VEHICLE in
nanopore
current traces, essentially transmitting the identity of the BINDER to the
nanopore sequencer
via this "labeled construct". In some embodiments, a bi-functional support
(which may, for
example, be one or more magnetic particles) carries multiple molecules of a
BINDER and
multiple molecules of a VEHICLE (e.g., a rope-tow construct 45) incorporating
a sequence tag
55 indicative of the BINDER (and thus TARGET) identity. Figure 25A shows a
rope-tow
VEHICLE comprising a nanopore sequencing adapter 41 (including a DNA motor 50)
followed by two tandem copies of a rope-tow construct, each of which comprises
a linkage site
47 to which a peptide (48 or 49) is covalently linked by linkage 46. The
construct can include
tens, hundreds or thousands of repeats of a rope-tow construct, providing the
capability to
attach tens, hundreds or thousands of peptide molecules to a single VEHICLE
molecule.
Figure 25B shows a magnetic bead 51 to which is attached multiple copies of
the VEHICLE
construct 53 (e.g., the construct shown in Figure 25A) and multiple copies of
BINDER (52)
with bound TARGET (48) and STANDARD (49) peptides. This configuration results
from
exposure of the bifunctional bead to a standardized sample digest containing
TARGET and
STANDARD peptides, allowing the BINDERs to capture these peptides
specifically, and
subsequent removal of the beads from the digest and washing unbound peptides
away. Figure
25C shows the configuration of the bifunctional bead after the peptides are
eluted from the
BINDERs and allowed to react with the linkage sites 47 on the VEHICLE
constructs 53. The
VEHICLE constructs are subsequently released from the beads using any of a
variety of well-
known chemical (e.g., reduction of an S-S bond) or enzymatic (e.g., cleavage
at a restriction
endonuclease site) means and delivered to a sequencing nanopore.
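The way in which a sequenceable tag narrows the identification problem can be
sketched as a lookup: once the tag on a VEHICLE has been read, only the
cognate TARGET and STANDARD sequences need be considered when interpreting the
peptide portions of the current trace. In the following Python sketch the tag
sequences, target names and peptide strings are all hypothetical placeholders
introduced for illustration; they are not the sequences of Figure 25A.

```python
# Hypothetical tag-to-TARGET assignment (placeholder values only)
TAG_TO_TARGET = {
    "ACGTAC": "TARGET_A",
    "TTGCAA": "TARGET_B",
}

# Hypothetical peptide library: TARGET name -> (TARGET seq, STANDARD seq)
LIBRARY = {
    "TARGET_A": ("LGGSPTSR", "LGGSPTAR"),
    "TARGET_B": ("VTEFAGLK", "VTEFAGIK"),
}


def candidate_peptides(tag):
    """Return the peptide sequences expected on a VEHICLE bearing this tag."""
    target = TAG_TO_TARGET.get(tag)
    if target is None:
        return ()          # unknown tag: no prior expectation
    return LIBRARY[target]


print(candidate_peptides("ACGTAC"))
print(candidate_peptides("GGGGGG"))
```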
In some embodiments, including those shown in Figures 3, 4 and 26, the
peptides can
be chemically modified while bound to the BINDERs in order to introduce a
linkage group
capable of combining with the VEHICLE linkage sites (e.g., site 47). In some
embodiments
either the peptide linker group or the VEHICLE linker group, or both, are
present while
TARGET and STANDARD peptides are bound to the BINDERs in a form that is not
reactive
with the counterpart group, thereby reducing or eliminating premature reaction
of free reagents
with either of the linking groups. After the peptide and VEHICLE linking
groups are in place
and any reagents used to introduce them are removed (e.g., by washing the
beads), an activation
step is carried out to convert the inactive linker forms to active forms
capable of reacting with
counterpart groups to form linkages 46 between peptides and VEHICLEs. In some
embodiments the linkages of peptides to VEHICLEs result from reactions between
a pair of
"click" chemistry groups i.e., one on the peptide (e.g., on its n-terminal
amino group) and one
on the VEHICLE (e.g., at linker site 47), at least one of which was prepared
initially in an
unreactive form and converted afterwards to an active form able to react to
form linkage 46.
Introduction of "click" chemistry precursor groups into peptides and nucleic
acids, and their
subsequent conversion to reactive "click" groups is well known in the art, for
example using
chemistries described as "single-pot click chemistry" (71). Subsequent elution
of the peptides
(now having active linkers) from the BINDERs results in rapid reaction of the
peptides with
the VEHICLE at high efficiency due to the very close spatial proximity of the
BINDERs to the
VEHICLEs on a bead. In some embodiments the yield of peptides linked to
VEHICLEs is
further improved by carrying out the elution step (and subsequent reactions to
form linkages
46) after the bead has been placed in a nanowell, thus restricting diffusion
of eluted peptides
away from the bead and VEHICLEs to a very small volume. In some embodiments
the
VEHICLEs are released from the beads before, during or after elution of the
peptides from the
BINDERs.
In some embodiments in which reactive groups are introduced into peptides
while the
peptides are bound by BINDERs (e.g., by reaction with peptide amino groups),
the BINDERs
are either selected so as to contain no reactive amino groups (e.g., nucleic
acid aptamers contain
no free amino groups), or else the amino groups of the BINDERs are blocked to
avoid creation
of active linkage-capable groups on the BINDERs that could compete for
reaction with the
VEHICLE reactive sites.
In some embodiments, oligo:peptide constructs of the invention are purified
before
sequencing by binding to and elution from a support designed to capture and
enrich nucleic
acids from a sample (e.g., Agencourt AMPure XP beads (Beckman Coulter)).
7.6.10 Concatamer construct formats
In some embodiments, peptide constructs are concatenated to optimize detection
performance, specifically throughput. Presentation of peptides as concatamers
allows a
nanopore to continuously read molecules, and avoids delays that can arise if a
pore must wait
for each of a series of short constructs to approach and thread the nanopore
to be read. In some
embodiments, as shown in Figure 27, the ligation approach can be modified to
enable assembly
of long strings of covalently linked molecules, in this case with polymer
linkers (e.g., DNA)
between peptide molecules in an alternating pattern. In the case shown in
Figure 27, the DNA
linker is modified compared to the Oligos of Figure 15 by providing ADIBO
functionality on one end (the 3' end in this example) and an amine reactive
NHS functionality
on the other (5') end. When the individual peptides having a c-terminal lysine
residue (with a
free epsilon amino group) are modified to introduce azide functionality on the
n-terminus (as
shown in Step 3 of Figure 28) while on the BINDER, and then, after removal of
the unreacted
AAAH, released from the BINDER into solution and exposed to the modified
oligos (here
labeled "Linker"), the result is rapid formation of "click" links to form
extended chains of
peptide+linker repeats. The arrangement shown in Figure 27 preserves a 5' to
3' polarity for
the DNA linking segments as required in typical nanopore sequencing systems.
In the resulting
concatamers (Figure 28A), specific sites on the linkers interact with
molecular motors (Figure
28B; either pre-positioned on the oligos or added later in solution) to
provide a ratchet
movement of peptides through the pore. In the case of molecular motors like
nucleic acid
helicases, polymerases and the like, the motor might be expected to fall off
the concatamer
when it encounters the peptide segment (thus requiring a new motor assembly on
each DNA
linker as shown in Figure 28B) - however as disclosed in WO 2021/111125 there
are existing
motor proteins that can slide over a peptide and engage a subsequent DNA
segment.
In some further embodiments shown in Figure 28C and 28D, "click" chemistry is
employed to assemble concatamers without intervening DNA linkers. In this case
one end of
peptides (e.g., the n-terminus) is derivatized with one of a pair of "click"
reagents (e.g., azide
functionality introduced using azidoacetic anhydride), while the other end
(e.g., the tertiary
amino group of a c-terminal lysine residue) is derivatized with Aza-
dibenzocyclooctyne
(ADIBO). When the peptides are released from the BINDER into solution, the
peptides react
with one another to form extended concatamers. In this embodiment, alternative
molecular
motors capable of stepwise, ratchet-like processing of polypeptide chains are
used (70) instead
of the enzymes used with nucleic acids.
In some embodiments, constructs are concatenated by hybridization as shown in
Figure
24. Here the constructs of Figure 15 are joined in series by hybridization
with an oligo (labeled
Oligo 3) that has sequence regions complementary with both Oligo 1 and Oligo
2. After
hybridization has occurred, an enzymatic ligation is carried out (at site
indicated by the black
triangles) to covalently join the successive constructs into a continuous
single stranded
molecule comprising repeats of Oligo 1, peptide and Oligo 2. In some
embodiments Motor
proteins are pre-positioned on the constructs prior to ligation as shown. In
some embodiments
a small proportion of the Oligo 1 Linkers of Figure 28 are specialized as
"guide threads"
optimized to engage and enter sequencing pores most efficiently (e.g., by
having optimal
means for engaging tethers to bring the concatamer into contact with the
membrane and allow its
diffusion to a pore). Such Linkers have "click" linkage sites on only one end,
and so are
incorporated only at one end of a concatamer (typically the 5' end).
When the enrichment on BINDER is carried out to enrich one TARGET peptide and
its STANDARD, then the resulting extended chains will be comprised of this
TARGET
peptide and STANDARD molecules in the same ratio as present in the original
sample digest
(provided that the reactivities of the TARGET and STANDARD with the other
polymer
components of the concatamer are equal, or nearly so - this equivalence being
a criterion for
selection of the STANDARD structure with respect to the TARGET peptide).
However, when
multiple TARGET peptides and their respective STANDARDS (e.g., forming a
protein
biomarker panel) are enriched using the panel of respective BINDERs together,
and then eluted
into solution, the concatamers that form in solution comprising these peptides
will be
composed of random mixtures of the different TARGET peptides and their
STANDARDs, and
each nanopore read will include a variety of TARGET peptides (and cognate
STANDARDs).
7.6.10.1 Single TARGET concatamers
If each concatamer molecule is comprised of only one type of TARGET and its
STANDARD, then recognition and classification of the peptides using a nanopore
current trace
would be simplified, based on the a priori expectation that the peptide
sequences read would
all be variations of a single TARGET sequence. Thus in some embodiments, the
constructs
described above are concatenated so as to join into a given chain only
molecules of a given
TARGET peptide and its STANDARD (i.e., each chain being homogenous with
respect to the
TARGET to be read out), and will generate counts of only these two peptides.
In some
embodiments this objective is achieved by concatenating together constructs
bound to an
individual bead that is functionalized (or coated) with molecules of a single
type (specificity)
of BINDER. In a multiplex assay, this can be achieved by fixing each BINDER to
beads
separately (such that each bead has copies of only one BINDER on it) and
subsequently
pooling the beads to capture the various TARGET and STANDARD peptides. In some
embodiments, after peptides are captured by BINDER on beads (each bead
carrying bound
peptide molecules of only one TARGET peptide and STANDARD pair), the beads can
be
distributed into very small (e.g., femtoliter) containers (such as those used
in Illumina DNA
sequencing technology) or droplets (as commonly used in digital PCR and
associated
microfluidic methods), effectively isolating each bead in a separate
container. The addition of
linking groups to the peptides can be carried out prior to distribution of
beads into individual
containers (i.e., while the peptides remain on the BINDER on the beads), or
after the beads are
distributed to containers. In the containers, the peptides are eluted from the
BINDER (by
exposure to eluting conditions or reagents, including displacer peptides),
combined with the
derivatized Oligos, and allowed to concatenate, forming sequenceable
concatamer constructs
as described in the invention. Since the peptides in each tiny container arise
from one bead,
and this bead bore only a single specificity of BINDER, the container's
peptide contents can
be joined into concatamers containing only that TARGET peptide and its
STANDARD.
Concatamers comprising copies of only one TARGET peptide and its STANDARD are
advantageous for two reasons: a) the probability of correctly recognizing the
peptide sequence
can be increased because multiple copies of the same (or similar STANDARD)
sequence are
detected successively in the current trace ("squiggle") from a single nanopore
and these are jointly
used to form a consensus assignment (e.g., by machine learning algorithms well-
known in the
nanopore sequencing art) to one of a set of pre-selected TARGET peptide
sequences, and b) if
a peptide sequence can be determined early in the processing of a long
concatamer through the
nanopore (i.e., in the first few of a large number of concatenated peptides),
and that sequence
has already been detected enough times to achieve the required assay
sensitivity and/or
precision, the concatamer can be ejected from the nanopore to enable the
nanopore to begin
reading a different sequenceable construct (a concept referred to as
"computational enrichment
of target sequences").
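
By way of non-limiting illustration, the consensus assignment of advantage (a)
can be sketched as follows. The sketch assumes that each peptide read in a
concatamer has already been scored by some classifier as a set of per-sequence
probabilities; the function name consensus_call, the margin parameter and the
example labels are hypothetical and are not part of the disclosure.

```python
import math
from collections import defaultdict


def consensus_call(read_probs, min_margin=2.0):
    """Combine per-read class probabilities from one concatamer.

    read_probs: list of dicts, one per peptide read, mapping a candidate
    TARGET label to the probability a (hypothetical) classifier assigned.
    Returns the label with the highest summed log-probability, or None when
    it does not lead the runner-up by at least min_margin (natural-log units).
    """
    labels = set().union(*read_probs)
    totals = defaultdict(float)
    for probs in read_probs:
        for label in labels:
            # Floor the probability so a label missing from a read
            # does not produce log(0).
            totals[label] += math.log(max(probs.get(label, 0.0), 1e-6))
    if not totals:
        return None
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] - ranked[1][1] < min_margin:
        return None
    return ranked[0][0]
```

With these assumed inputs, a single ambiguous read (0.6 versus 0.4) yields no
call, whereas three successive reads that each weakly favor the same sequence
accumulate enough evidence for a confident assignment.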
In some embodiments the tiny container into which a single bead (with a single
specificity of BINDER) is directed also comprises one or more sequencing
nanopores, which
are thereby devoted to sequencing a single TARGET and STANDARD pair. In some
embodiments, microfluidic means are employed to control the movement of such
beads into
separate pore containers and optionally deliver one or more successive
reagents into the
container. In some embodiments each bead described above, bearing a single
BINDER
specificity and carrying bound molecules of a single TARGET peptide and
STANDARD pair,
is placed in the region of a single nanopore. The nanopore region can be a
container in which
one nanopore is present, and may be electrically isolated or have liquid or
electrical
connection to other nanopores, but has little or no diffusion between
nanopore regions. The
BINDER-bound peptides are eluted from the bead in the nanopore vicinity, and,
due to their
physical proximity to the nanopore and the nanopore's isolation from other
beads, the bead's
peptide cargo is detected, recognized as a specific TARGET peptide or
STANDARD, and
counted by passage of the constructs containing them through the nanopore.
In some embodiments an alternative approach can be implemented in which each
BINDER is used separately to extract its TARGET peptides and STANDARD from a
digest
(a process that can be implemented as a sequence of separate BINDER captures),
and then
these peptide cargoes are processed separately to form similar homogenous
concatamers. In
such embodiments, the BINDER can be immobilized on physically separate
supports (e.g., on
separate porous affinity supports including column chromatography beads, or as
separate zones
on a porous membrane such as nitrocellulose as used in conventional lateral
flow
immunoassays), or the different BINDER can be exposed to a sample digest
sequentially, one
at a time, to produce separate captures.
In some embodiments beads (e.g., PierceTM Protein A/G Magnetic Agarose Beads,
diameter 10-40 microns) having a BINDER capacity larger than typical magnetic
beads (e.g.,
Dynabeads, e.g., 2.8-micron diameter) are used in order to collect more
molecules for
presentation to a single nanopore. Methods for placing beads near nanopores to
increase the
rate and/or probability of sequenceable constructs entering the pore have been
described in the
art (72, 73) but they fail to encompass the current purpose of presenting one
(or a small subset)
of homologous constructs on one or a small number of beads to a given pore.
In some embodiments, microfluidic means known in the art are used to
distribute single
beads to individual nanopores.
It will be apparent to those skilled in the art that some optimization of
concatamer
assembly chemistry and conditions will be useful in order to minimize
production of circular
concatamers, which, because they have no free end to thread through a
nanopore, are unlikely
to be detected in a nanopore sequencer.
7.6.11 Rope-Tow ligation and concatenation using splint oligos.
Nanopore sequencing adapters can be ligated to one or a series of peptide-
oligonucleotide constructs in tandem using a commercial ligase (e.g., T4 DNA
ligase) capable
of joining a 5' phosphate of one oligo with a 3' hydroxyl of another. The
linkage can be
facilitated by providing a single base sticky end, for example the T/A
overhang at sites 45
shown in Figure 7B. Increased efficiency and specificity of ligation can be
achieved using
extended overlap regions in which "splint" oligos are provided having a 5'-end
complementary
to the 5' end of a VEHICLE or sequencing adapter, and a 3' end complementary
to the 3' end
of another VEHICLE (shown in Figure 29). Annealing of two peptide-oligo
constructs (such
as the rope-tow constructs of Figure 26) with a splint such as 56 in Figure 29C
allows formation
of a head-to-tail chain of peptide-oligo constructs amenable to ligation by T4
ligase into a
continuous long nanopore sequenceable molecule. Ligation of this chain with an
appropriate
sequencing adapter, optionally using a short splint 57 (Figure 29B) renders
this molecule ready
for entry into a nanopore.
7.6.12 Ligation of Double-Tag Rope-Tow Constructs
In some embodiments double-tag Rope-tow constructs according to the invention
(exemplified by Figure 26) are directly suitable for ligation to a sequencing
adapter (Figures
30A and B), and head-to-tail ligation in series (Figures 30C and D). In some
embodiments
each Rope-tow construct (the "right-hand" construct being ligated) has a 5'-
phosphate and is
hybridized to a complementary strand having a projecting 3' A that base pairs
with a projecting
3' T on the "left-hand" construct being ligated. Such a configuration can be
acted upon by a
ligase (e.g., T4 ligase) to form a standard phosphate linkage between the two
constructs.
7.6.13 Dynamic selection of concatamer molecules for detection at the nanopore
level.
In some embodiments, a multiplex panel of proteins can be measured in the
sample,
and the different TARGET peptides (and their STANDARDs) can be enriched
separately, for
example using different populations of magnetic beads for each different
TARGET peptide or
by placement of loaded beads in separate tiny containers, with the chemical
modifications and
reaction with linkers to form chains also carried out separately for each
TARGET peptide and
STANDARD, as described above. Then each concatenated chain will contain only
one type
of TARGET peptide sequence and STANDARD. Pooling these separately processed
concatamers will create a sample comprised of multiple peptides, but in which
each
concatamer molecule chain will contain only one predominant TARGET peptide and
its
STANDARD.
In some embodiments using this "individual enrichment and pool" approach, the
first
peptide sequences read by a nanopore from a concatamer molecule will identify
the type of
TARGET peptide (and/or STANDARD) comprising the whole molecule, making it
possible
for the sequencing system software to decide whether or not further counts of
that TARGET
peptide or its STANDARD are required, i.e., whether the minimum molecule
counts for the
peptide and respective STANDARD required to achieve the desired measurement
precision
have already been achieved or not (whether from this pore or others). If the
desired minimum
counts have already been collected, a reverse voltage can be applied to the
pore and the
concatamer ejected without spending further time reading, thus allowing
another different
concatamer to thread the pore and begin sequencing. This approach has been
referred to as
"computational enrichment of target sequences" (74).
If the desired minimum counts have not been achieved, then the current
concatamer
continues to be read, gradually adding counts towards the minimums. This
approach has been
implemented in devices for nanopore DNA sequencing, and shown to decrease
repeated re-sequencing of the same sequences and to improve coverage of rarer
sequences in a given total sequence output. In some embodiments of the present
invention, this ability to reject already-well-measured peptides improves
throughput substantially, and the longer the concatamers are (i.e., the more
peptides attached), the greater the benefit.
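
By way of non-limiting illustration, the eject-or-continue decision described
above can be sketched as follows. The sketch assumes that each peptide read has
already been classified by TARGET name and by kind (TARGET or STANDARD), and
that the caller applies the reverse voltage when "eject" is returned; the
function names and the data layout are hypothetical.

```python
KINDS = ("TARGET", "STANDARD")


def should_eject(target, counts, minimums):
    """True when both the TARGET and its STANDARD have reached their minimums.

    counts and minimums are dicts keyed by (target_name, kind).
    """
    return all(
        counts.get((target, kind), 0) >= minimums[(target, kind)]
        for kind in KINDS
    )


def process_read(target, kind, counts, minimums):
    """Handle one classified peptide read from a homogeneous concatamer.

    Returns "eject" if enough counts were already collected (the read is not
    counted); otherwise adds the read to counts and returns "continue".
    """
    if should_eject(target, counts, minimums):
        return "eject"
    counts[(target, kind)] = counts.get((target, kind), 0) + 1
    return "continue"
```

Because counts is shared, reads from any pore contribute to the same totals,
consistent with the minimums being reached "whether from this pore or others".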
In the case that peptides are prepared and delivered to a nanopore
individually (i.e., not
in concatamers) there is less motivation to eject a sequence previously
measured many times,
since the total transit time for a single peptide construct is likely to be
short. Hence
computational enrichment offers little benefit towards the goal of
stoichiometric flattening of
a panel of TARGETs unless long constructs each having multiple
molecules of only a
single type of TARGET and STANDARD are used.
7.6.14 Additional motive forces propelling peptide molecules through nanopores
In some embodiments, particularly those making use of long TARGET peptides, or
peptides without attached nucleic acids, peptide movement through a nanopore
can be
facilitated by addition of charged molecules that bind to, but do not
covalently react with,
peptides (e.g., sodium dodecylsulfate, and similar charged detergent
molecules).
7.6.15 Data analysis.
Numerous algorithms have been developed to detect transitions between
different
through-pore ion current levels as molecules transit a nanopore, taking
account of the variable
timing between these transitions, and methods have been developed to classify
sequences of
these current transitions (i.e., sequence-sensitive signatures) using machine
learning and other
advanced computational techniques (e.g., commercial offerings by Oxford
Nanopore). These
methods have proven to be very accurate (approaching 99% accuracy) when used
in "base
calling" oligonucleotide sequences, and the same or similar computational
methods can be
applied to the determination of peptide sequences as well.
Given that DNA sequence determination involves discriminating among 4 bases,
while
perfect protein sequencing requires discriminating 20 amino acids, perfect
peptide sequencing
by nanopore reads is far more difficult than DNA sequencing. This is
particularly evident
when one takes account of the fact that in existing nanopores more than one
residue typically
affects the overall pore current - in many cases nanopores are sensing a "k-
mer" of 3, 4, 5 or 6
consecutive residues at a time, which requires creation of extensively trained
machine learning
methods to de-convolve current signatures into residue sequences.
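
The scale of this difference can be illustrated with simple arithmetic. The
sketch below counts the distinct k-mers a pore could in principle have to
distinguish, under the simplifying assumption that every k-mer gives its own
current level; the function name kmer_states is hypothetical.

```python
def kmer_states(alphabet_size, k):
    """Number of distinct k-mers over an alphabet of the given size."""
    return alphabet_size ** k


if __name__ == "__main__":
    # Compare 4 bases (DNA) with 20 amino acids for the k values in the text.
    for k in (3, 4, 5, 6):
        print(k, kmer_states(4, k), kmer_states(20, k))
```

For k = 6, for example, this gives 4,096 possible DNA k-mers against
64,000,000 possible peptide k-mers.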
However, in the context of the present invention, which uses single molecule
methods
to count molecules of a limited variety (i.e., set of TARGET and STANDARD
sequences,
which in many cases will total 4 to 50 peptides) the primary requirement is to
accurately
classify a peptide's sequence as either i) confidently recognized as one of a
limited set whose
nanopore signatures have been extensively characterized before (i.e., TARGETs
and
STANDARDs), and for which a machine learning method has been optimized, or ii)
a
molecule whose sequence has not been confidently recognized. In some
embodiments, such
recognition and classification of ion current signatures is used to count the
confidently
recognized TARGET and STANDARD peptides and eliminate the signatures that are
not
confidently recognized. A similar approach has been used to distinguish
limited sets of DNA
barcodes used to tag DNA libraries from different samples that are then pooled
together for
analysis. In such strategies, DNA reads that are not assigned to a barcode
with sufficient
certainty can be discarded, improving the overall quality of results.
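
By way of non-limiting illustration, counting only confidently recognized
reads can be sketched as follows. The sketch assumes that a previously trained
classifier supplies, for each read, a score per known TARGET or STANDARD; the
function name, the threshold value and the labels are hypothetical.

```python
from collections import Counter


def count_confident_reads(scored_reads, threshold=0.9):
    """Count reads whose best classifier score meets the threshold.

    scored_reads: iterable of dicts mapping a TARGET or STANDARD label to a
    score between 0 and 1 from a (hypothetical) trained classifier.
    Returns (counts per label, number of reads discarded as unrecognized).
    """
    counts = Counter()
    rejected = 0
    for scores in scored_reads:
        label, best = max(scores.items(), key=lambda kv: kv[1])
        if best >= threshold:
            counts[label] += 1
        else:
            rejected += 1
    return counts, rejected
```

Raising the threshold discards more reads but reduces misassignment, the same
trade-off made when unassigned barcode reads are discarded.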
In some embodiments, a machine learning system is trained to recognize and
classify
the ion current signatures of a set of TARGET peptides and their STANDARDs
using large
numbers (e.g., thousands to millions) of "reads" (or "traces" or "signatures"
or "squiggles") of
known peptide sequences transiting sequencing nanopores. Recognition based
directly on
machine learning evaluation of the ion current traces (i.e., current
measurements over time,
typically generating 100-1,000 current measurements during transit of a
peptide) is generally
more reliable than recognition based on amino acid sequences deduced from the
traces, and
therefore represents the preferred method of peptide recognition. This
training can be
accomplished using libraries of nanopore current signatures generated by
constructs made from
pure synthetic peptides having the TARGET and STANDARD peptide sequences.
Large
training sets of pure TARGET and STANDARD peptide constructs are used to
select optimal
recognition algorithms (e.g., machine learning methods including convolutional
neural nets,
etc.) and iteratively improve the classification accuracy of these methods to
provide accurate
counts of the various peptide sequences.
Since the recognition accuracy of specific peptides can be dependent on the
type of
nanopore used, in some embodiments the type of pore used is selected based on
recognition
performance of machine learning systems trained with a specific set of TARGET
and
STANDARD peptides on the various candidate pores. In some embodiments multiple
types
of nanopores are used in a system, allowing recognition of specific TARGET and
STANDARD peptides by a type of nanopore best able to accurately recognize
them. In some
embodiments, novel nanopores are designed and tested to optimize performance
in recognizing
specific sets of TARGET and STANDARD peptides.
In some embodiments, the accuracy of counting TARGET and STANDARD peptides
is further improved by "counter-training" a machine learning system to reject
peptide
sequences other than TARGET and STANDARD peptides that may be present as low-
abundance contaminants after enrichment of the TARGET and STANDARD peptides
from
digests of complex biological samples. In some embodiments a library of
peptide sequences
coded for by the relevant genome and sharing partial sequence or specific
sequence motifs with
members of a set of TARGET and STANDARD peptides is created and used to
counter-train
a peptide recognition system to avoid mistaking these sequences for authentic
TARGET and
STANDARD peptides. The results of such training can be expected to improve
peptide
recognition with time and the accumulated learning from increasing sample
numbers,
providing the potential to retrospectively improve the precision of past assay
results by
reanalysis with updated software.
In some embodiments a plurality of candidate TARGET peptide sequences (derived
from one or more selected target proteins) are prepared as constructs for
nanopore sequencing,
and libraries of nanopore current reads collected using these molecules. This
data is used to
determine the accuracy with which specific peptide read signatures can be
distinguished from
other sequences, and this information used in the selection of a set of most
accurately
classifiable TARGET peptide sequences to represent the target proteins in
subsequent routine
analyses. Specific affinity reagents can then be generated to bind epitopes in
the middle region
of these sequences, providing optimal analytical performance.
In some embodiments, a plurality of TARGET peptide sequences derived from a
panel
of target proteins are prepared as constructs for nanopore sequencing, and
libraries of nanopore
current read signatures collected using these molecules. Classification
accuracy data derived
from these signatures is used to select a set of most accurately classifiable
TARGET peptide
sequences spanning the set of desired protein panel members.
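
By way of non-limiting illustration, selecting one most accurately
classifiable TARGET peptide per panel protein can be sketched as follows. The
sketch assumes that a classification accuracy has already been measured for
each candidate peptide from its library of current signatures; the function
name and the peptide and protein labels are hypothetical.

```python
def select_targets(accuracy, protein_of):
    """Pick the best-classified candidate peptide for each target protein.

    accuracy: dict mapping candidate peptide label -> measured accuracy.
    protein_of: dict mapping candidate peptide label -> parent protein.
    Returns a dict mapping each protein to its selected peptide label.
    """
    best = {}
    for peptide, acc in accuracy.items():
        protein = protein_of[peptide]
        if protein not in best or acc > accuracy[best[protein]]:
            best[protein] = peptide
    return best
```

A fuller selection would also weigh pairwise confusion between the chosen
peptides rather than each peptide's accuracy alone.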
In some embodiments, a plurality of candidate STANDARD sequences cognate to
one
or more TARGET peptide sequences is included in a set of constructs used to
generate libraries
of nanopore current signatures, and STANDARD sequences are selected for each
TARGET
peptide so as to provide a set of most accurately classifiable STANDARDs that
minimize
errors in classifying a TARGET peptide's STANDARD in relation to other TARGET
peptides
and STANDARDs.
In some embodiments, peptide:oligo constructs (e.g., rope-tow constructs) are
constructed with recognizable ion current signals (e.g., a high current
associated with an abasic
stretch) either before or after the peptide, or both before and after. The
presence of such
"punctuation" in an ion trace can substantially improve peptide sequence
classification.
Use of the methods described above for selection of most-accurately
classifiable
TARGET and STANDARD peptide sequences provides information about each selected
peptide and its likelihood of misclassification within a panel of TARGET and
STANDARD
peptides. In some embodiments additional signature and classification accuracy
data is
generated by analysis of sets of relevant biological samples (e.g., plasma or
dried blood spot
samples) and versions of these into which selected TARGET and STANDARD
peptides have
been spiked at known levels. STANDARD sequences are unlikely to exist among
proteolytic
fragments of naturally-occurring proteins (a supposition that is easily tested
by bioinformatics
analysis of the relevant genome sequences, allowing any naturally-occurring
sequences to be
rejected as STANDARD candidates) and therefore detection of apparent STANDARD
signatures in digests of natural samples that have not been spiked with
STANDARD provides
a direct estimate of the "false positive" detection rate for STANDARDs.
Comparisons of
molecule counts among a set of STANDARDs spiked into sample digests at the
same (or
different but known) levels provides a means of estimating STANDARD "false
negative"
detection rates (e.g., any STANDARD showing fewer counts than other STANDARDs
spiked
at the same level is likely to be affected by false negative detection
errors). Since TARGET
peptides may likely be detectable in digests of natural samples from the
relevant species, false
positive and negative detection rates can be estimated by comparing TARGET and
STANDARD peptide detection rates in samples spiked with equal amounts of
TARGET and
STANDARD peptides: any excess of TARGET peptide counts over STANDARD counts
provides an estimate of the TARGET peptide false positive rate, and any
deficit of TARGET
peptide counts compared to STANDARD counts provides an estimate of the TARGET
peptide
false negative rate (in each of these cases taking into account the independently
determined false
positive and negative detection rates of the STANDARDs).
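
By way of non-limiting illustration, the estimates described above can be
sketched as follows. The sketch assumes that molecule counts are already
available and, as a further assumption not stated in the text, expresses the
STANDARD false positive rate per total reads of an unspiked digest; all
function names are hypothetical.

```python
def standard_false_positive_rate(apparent_standard_reads, total_reads):
    """Fraction of reads called as a STANDARD in a digest with none added."""
    return apparent_standard_reads / total_reads


def relative_recovery(counts):
    """Counts of STANDARDs spiked at the same level, relative to the highest.

    Values well below 1.0 suggest false negative detection for that STANDARD.
    """
    top = max(counts.values())
    return {name: n / top for name, n in counts.items()}


def target_excess(target_count, standard_count):
    """Signed fractional difference of TARGET over STANDARD counts.

    For equal spiked amounts, a positive value suggests TARGET false
    positives and a negative value suggests TARGET false negatives.
    """
    return (target_count - standard_count) / standard_count
```

These raw figures would still need correction for the independently determined
STANDARD error rates, as the text notes.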
In some embodiments using chemical and/or affinity-based methods of single
molecule
sequencing and counting instead of nanopore sequencing, alternative indices of
sequence error
can be used, e.g., an experimentally determined confusion matrix among amino
acids, and/or
an experimentally determined confusion matrix among the selected TARGET and
STANDARD
peptides. In such systems, if two or more peptides are found to be confused
with one another,
the detection approach can be modified, e.g., by extending the sequence
acquisition to more
residues (e.g., when using sequential degradative readouts), by alteration of
the sequence or
modification of a STANDARD involved in a confusion uncertainty, by selection
of an alternate
TARGET sequence from a target protein, or by other means known in the art.
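
By way of non-limiting illustration, identifying confused peptides from an
experimentally determined confusion matrix can be sketched as follows. The
sketch assumes the matrix is held as nested counts of true label against
called label; the function name, the 1% limit and the labels are hypothetical.

```python
def confused_pairs(confusion, max_rate=0.01):
    """List (true, called, rate) for off-diagonal rates above max_rate.

    confusion[true_label][called_label] is the number of molecules of known
    identity true_label that the readout called as called_label.
    """
    flagged = []
    for true_label, row in confusion.items():
        total = sum(row.values())
        if total == 0:
            continue
        for called, n in row.items():
            if called != true_label and n / total > max_rate:
                flagged.append((true_label, called, n / total))
    return flagged
```

Each flagged pair is a candidate for the remedies listed above, such as
extending the read or altering the STANDARD.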
7.7 USE OF THE INVENTION WITH SINGLE MOLECULE IMAGING AND
COUNTING TECHNOLOGIES.
Advances in fluorescence microscopy make it possible to reliably detect single
molecules of a fluorescent dye and acquire images of large numbers of such
molecules and
their spatial distribution (38, 75). TARGET and STANDARD constructs prepared
according
to the invention can be immobilized (e.g., on glass or quartz slides) and,
after staining with
fluorescently labeled reagents (e.g., BINDERs for peptides and complementary
oligos for flags
and barcodes), imaged using this technology to count molecules.
7.7.1 Peptide detection by imaging BINDERs interacting with immobilized TARGET
or
STANDARD peptides and constructs immobilized on a surface
In some embodiments, peptide molecules are immobilized (e.g., on a surface)
and their
identities (e.g., as TARGET or STANDARD peptides) determined by optically
detecting the
binding (or lack of binding) of a series of one or more specific and/or
possibly promiscuous
affinity reagents with optically-detectable labels (e.g., BINDERS and
oligonucleotides
complementary to barcode sequences that are labeled with fluorescent dyes or
proteins) applied
to the surface one after another (e.g., in a flowcell) with the option of
removing each affinity
reagent before application of the next, and using recognition techniques
(including machine
learning) to decipher peptide identity based on the pattern of affinity
reagents that do, or do
not, bind detectably. Such a system, for example that described in US Patent
Application
16/659,132, can also be used to count TARGET and STANDARD peptide molecules of
the
invention (or to count intact target protein molecules in the event that
BINDERs bind to linear
epitopes).
A variety of linkage chemistries can be employed to connect a peptide
construct to an
imageable surface, including through direct reaction with peptide amino groups
(e.g., using
NHS esters), with carboxyl groups (e.g., using carbodiimide chemistries), with
cysteine
sulfhydryl groups, and with biotin, click chemistry and other groups that have
previously been
introduced into a peptide construct. Chemistries such as e.g., click
chemistry, involve
modification of a site or sites on the peptide as well as providing a
connecting site on the
surface. In some embodiments the required modification(s) of the peptide are
carried out while
the TARGET and STANDARD peptides are bound to the BINDER (e.g., during the
capture
stage of the enrichment process).
In some embodiments, for example those making use of 'structured nucleic acid
particles' (SNAPs; (54)) or similar methods as a means of densely arraying
single molecules
on a solid support, TARGET and STANDARD peptides or peptide:oligo constructs
comprising a click attachment group (such as methyltetrazine) can be eluted
from BINDERs
(after enrichment) in the presence of a concentrated suspension of SNAPs
comprising a TCO
click attachment group, resulting in the covalent coupling of one peptide to
each SNAP, after
which large numbers (e.g., 10 billion) of SNAPs can be arrayed for affinity
reagent imaging in a
suitable optical detection system. Since click chemical linkages are
relatively insensitive to
pH, the elution of peptides (and associated VEHICLEs) from BINDERs under
acidic
conditions (e.g., pH 3.0) can occur before, at the same time as, or after the
peptides couple to
the SNAPs. In some embodiments, this elution and coupling can take place in a
very small
volume, e.g., within the interstitial volume of a packed mass of magnetic
beads on which the
BINDERs are immobilized (i.e., in 0.1 to a few microliters of liquid). By thus
efficiently
transferring peptides and constructs from one immobilized BINDER to another
class of object
(e.g., a SNAP) that is designed to efficiently convey them into position for
reading, a high
proportion of TARGET molecules can be recovered, and the sensitivity of the
analytical
system maximized. In some embodiments, constructs bound to BINDERs on magnetic
beads
are reacted with SNAPs, and the magnetic beads carrying the SNAP.construct
complexes
moved into close proximity with an imageable surface before release (i.e.,
elution) of the
complexes from the BINDERs on the beads, after which they need only migrate a
very short
distance by diffusion to reach and bind to the imageable surface. This
approach significantly
diminishes losses of molecules in the workflow, and thereby maximizes
detection sensitivity.
In some embodiments peptide detection is accomplished using BINDERs modified
to
comprise detectable labels (e.g., fluorescent dyes or proteins such as GFP,
nanoparticles
comprising fluorescent dyes, enzymes that generate optically detectable
products, and the like)
to visualize TARGET and STANDARD peptides on a support. In some embodiments,
specific
nucleic acid sequences are detected by means of hybridizing complementary
probes
comprising optically detectable labels, for example labels like those used in
optical genome
mapping (76). Well-known methods of optical detection using microscopic
systems are able
to detect individual bound labels and associate the resulting optical signals
with discrete
locations on a surface, thereby allowing a sequence of binding events to be
constructed for
each bound analyte molecule (e.g., TARGETs and STANDARDs).
In such a system, a specific STANDARD label functionality (e.g., biotin, a
fluorescent
label, a unique short peptide segment, or a unique oligonucleotide sequence)
can be
incorporated into the structure of STANDARDs in order to facilitate their
identification and
discrimination from TARGET peptides using a reagent capable of specifically
binding to the
label or direct optical detection (e.g., of a fluorescent label).
The use of multiple detection stages, with removal of detection labels between
stages,
thus allows separate detection and identification of specific features of
single peptides or
peptide:oligo construct molecules, including: a) identification as STANDARDS
or TARGETS
(through detection of the STANDARD label); b) identification as a molecule
bound by a first
BINDER through detection of a BINDER ID code (and so forth for multiple
BINDERs); c)
identification as a molecule derived from processing of a first specific
sample (i.e., a sample
within a pool of samples) through detection of a sample barcode (and so forth
for multiple
samples); d) identification as members of a first TARGET + STANDARD cognate pair
through
detection of one or more peptide sequence-specific detection reagents, any of
which may be a
BINDER (and so forth for multiple distinct peptides).
Figure 31 schematically illustrates the use of such a multi-step detection
approach to
characterize a standardized sample digest, focusing on a region where 96
peptide:oligo
construct molecules prepared according to the invention and arrayed on a
surface are probed,
and the results decoded to provide a quantitative estimate of the amount of a
TARGET
molecule. In Fig 31A and B, two different BINDERs are used and optically
detected (shown
in green) where present. These signals establish the array sites having each
of the two
TARGET peptides. In Fig 31C and D, BINDERs (or oligos complementary to construct
DNA
sequences identifying TARGET and STANDARD molecules) are separately applied
and
imaged to determine which constructs are TARGET and STANDARD molecules (in
this case
irrespective of which peptide they represent). Finally, Fig 31E and F show
detection results of
separately applying oligos complementary to construct DNA barcode sequences
identifying
molecules recovered from two different sample digests (Samples 1 and 2). Using
the digital
information provided by the binary "optically detected or undetected" signals
recorded for each
arrayed molecule during these 6 detection cycles, the number of molecules of
each TARGET
and STANDARD version of each peptide in each sample can be directly tabulated,
and the
ratio of TARGET to STANDARD counts computed. This ratio, multiplied by the
known
amount (in relative or absolute terms) of the STANDARD added during
standardization of the
sample digest, provides a measure of TARGET abundance.
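
By way of non-limiting illustration, the tabulation of the six binary detection
cycles and the resulting abundance calculation can be sketched as follows. The
sketch assumes that each arrayed site is recorded as six true/false signals and
that sites with missing or conflicting signals are skipped; the key names,
function names and amounts are hypothetical.

```python
from collections import Counter


def tabulate(sites):
    """Count molecules by (sample, peptide, kind) from per-site signals.

    Each site is a dict of six booleans: binder_A, binder_B (cycles 1-2),
    target, standard (cycles 3-4), sample_1, sample_2 (cycles 5-6).
    Sites without exactly one signal in each pair are ignored.
    """
    counts = Counter()
    for s in sites:
        peptides = [p for p in ("A", "B") if s[f"binder_{p}"]]
        kinds = [k for k in ("TARGET", "STANDARD") if s[k.lower()]]
        samples = [n for n in (1, 2) if s[f"sample_{n}"]]
        if len(peptides) == len(kinds) == len(samples) == 1:
            counts[(samples[0], peptides[0], kinds[0])] += 1
    return counts


def target_abundance(counts, sample, peptide, standard_amount):
    """TARGET/STANDARD count ratio times the known amount of STANDARD added."""
    standard = counts[(sample, peptide, "STANDARD")]
    if standard == 0:
        raise ValueError("no STANDARD molecules counted for this peptide")
    return counts[(sample, peptide, "TARGET")] / standard * standard_amount
```

For example, 3 TARGET and 6 STANDARD molecules of peptide A in Sample 1, with
100 units of STANDARD added, would give a TARGET abundance of 50 units.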
Figure 11 shows schematically a series of 6 such sequential detection steps or
cycles,
each using different binders to identify, or help identify, a specific peptide
sequence. In cycle
1, peptide A is recognized first by an anti-peptide antibody BINDER specific
for an internal
peptide epitope. In cycle 2, two different BINDERS specific for short trimer
amino acid
sequences present in the peptide are used to support peptide identification.
In cycle 3, an
antibody specific for the c-terminal amino acid (or amino acids) is used to
further support
peptide identification. In cycles 4, 5 and 6, additional anti-peptide antibody
BINDERs specific
for other peptide sequences are used to identify these molecules coupled to a
support. Each of
the bound peptides is part of a larger peptide:oligo construct comprising DNA
sample barcodes
(Codes 1, 5, 7 and 11 in the Figure) and a DNA barcode identifying each molecule as
either a
TARGET or STANDARD (shown as the OLIGO-EITHER barcode).
In some embodiments, for example those using BINDERS as a means of identifying

immobilized TARGET (and cognate STANDARD) peptides (e.g., imaging methods),
there
may be a need to confirm these identities by generating additional information
on the peptides.
A variety of means may be employed to probe specific features of peptides,
making use of the
fact that their sequences are known a priori (i.e., established during initial
TARGET peptide
selection).
Figure 32 shows additional methods of confirming or improving the specificity
of
BINDER interactions with peptides by detecting the effect of a change in
peptide structure on
the binding. In cycles 1, 2, and 3, respectively, antibody BINDERS to an
internal epitope, to
2 short trimer epitopes, and to a c-terminal epitope are applied and read as
in the example of
Figure 11. A proteolytic enzyme capable of cleaving a specific site in some of
the peptides is
then applied to the immobilized peptide constructs, resulting in release of a
c-terminal fragment
from the peptide shown (Peptide-A). After Peptide-A is thus shortened, the
same sequence of
BINDERs is applied as in cycles 1-3, but the absence of some epitopes is
revealed by the
presence of antibody A binding (as before), and absence of one Anti-trimer
signal and the anti-
C-terminal BINDER signal. Such results confirm the presence of a specific
proteolytic enzyme
cleavage site within the peptide, and the binding of antibody A and one of the
2 anti-trimer
antibodies to sites in the n-terminal portion of the peptide molecule.
Similarly, if the peptide
is linked to the remainder of the construct by its c-terminal end (e.g.,
through linkage to the
amino group of a c-terminal lysine), then a cleavage within the peptide will
result in release
of an n-terminal fragment, and loss of epitopes involving this fragment. In
essence the addition
of the proteolytic step enables "mapping" epitope locations within the
peptide, further
strengthening the identification. In some embodiments peptide identification
is strengthened
by observing the effects of one or more peptide alterations, including peptide
cleavage (as
shown in Fig 32), chemical or enzymatic removal of one or two terminal amino
acids (e.g.,
using Edman degradation), enzymatic or chemical removal of a phosphate group
from Ser, Thr
or Tyr, chemical modification of one or more amino acids (e.g., alkylation of
a free cysteine,
etc.), or any of a very large repertoire of amino acid modifications known in
proteomics and
protein chemistry.
In some embodiments, the binding of specific BINDERs is further characterized
by
observation of the effects of altered solution conditions on the binding to
individual peptide
molecules. A change from near-neutral to acidic (or basic) pH can result in
the dissociation of
some binders (but not others) from some peptide epitopes. Similarly, a change
from near-
neutral to acidic (or basic) pH can result in the dissociation of some BINDERs
from less-
preferred peptide epitopes (e.g., sequences similar to but not the same as the
cognate TARGET
sequence) while remaining bound to the true cognate sequences. Likewise, the
introduction of
a chaotropic agent (such as NH4SCN), an organic solvent (e.g., acetonitrile),
or a detergent can
reduce binding of some BINDERs to some targets while allowing other, stronger
interactions
to persist. Likewise, a change in temperature can differentially affect
various BINDER
interactions. Similarly changes (particularly temperature) may be employed
that affect
interactions between oligonucleotide components of a construct and
complementary probes
used to read sample barcodes, BINDER barcodes, or TARGET/STANDARD codes. Any
of
the changes employed for detection of differential effects can be applied
stepwise (i.e., as an
abrupt change), or as a gradient of change over time, in which case the
degree of change
(determined as a function of time across a gradient) at which a BINDER
interaction is affected
can serve as a highly specific indicator of the affinity and/or specificity of
the interaction, and
hence its contribution to a correct identification.
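As an illustration of reading a gradient experiment, the sketch below converts the time at which a BINDER signal is lost into the solution condition at that moment. It assumes a linear gradient and a per-frame binary bound/unbound trace; both assumptions, and all names, are hypothetical.

```python
def condition_at_time(t, duration, start_value, end_value):
    """Value of a linear gradient (e.g., pH) at time t."""
    if not 0 <= t <= duration:
        raise ValueError("t must lie within the gradient")
    return start_value + (end_value - start_value) * t / duration

def dissociation_condition(trace, frame_interval, duration, start_value, end_value):
    """Gradient value at the first frame after which the signal stays absent.

    trace is a sequence of 0/1 flags (1 = BINDER signal present), one per frame.
    Returns None if the BINDER was never bound or was still bound at the end.
    """
    last_bound = max((i for i, bound in enumerate(trace) if bound), default=None)
    if last_bound is None or last_bound == len(trace) - 1:
        return None
    t = (last_bound + 1) * frame_interval
    return condition_at_time(t, duration, start_value, end_value)
```

Comparing this value between a cognate and a near-cognate epitope is one way the differential effect described above could be expressed numerically.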
In some embodiments, peptide molecules or their constructs are positioned at
points on
a predetermined lattice of locations on a planar support (e.g., like the
system described in Patent
Application 16/659,132). In some embodiments, peptide molecules or their
constructs are
positioned through hybridization of construct DNA sequences to complementary
sequences in
extended nucleic acid molecules produced by techniques such as "optical genome
mapping"
(OGM: e.g., US Patent 9,536,041). Such OGM implementations can use naturally-
occurring
DNA molecules, or DNA molecules designed to comprise tens, hundreds,
thousands, or tens
of thousands of repeating complementary sequences at appropriate intervals
(e.g., 0.1, 0.5, or
1.0 microns separation) along the length of the molecules. Long DNA molecules
linearized
by OGM methods can be transferred to, and immobilized on, a planar support
having
appropriate reactive groups, thereby creating a regular array of complementary
sites on a
surface within a flowcell for optical imaging during application and removal
of a series of
optically labeled peptide BINDERs and oligonucleotide probes useful in
characterizing bound
peptide:oligo constructs. In some embodiments, peptide molecules and their
constructs are
positioned randomly, but at spacings that are typically optically resolvable,
on a planar support
through binding to sites previously established on the support, e.g., by
coating the support with
BINDER molecules, by coating the support with molecules having an affinity for
some
chemical feature of a peptide construct (including oligonucleotides
complementary to
components of a peptide construct, biotin labels, or the like), or with
chemically reactive sites
such as click chemistry groups capable of reacting with click groups on
peptide constructs, etc.
In some embodiments comprising detection of molecules by binding of multiple
BINDERs labeled with different fluorophores, an optical detection system is
used capable of
simultaneously and separately detecting these labels based on differences in
their excitation
and/or emission wavelengths (i.e., multicolor imaging). The use of multiple
labels with
separate detection wavelengths allows BINDERs to be multiplexed, thereby
decreasing the
number of binding and elution cycles required to observe a given set of
BINDERs.
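The saving from multicolor imaging is simple arithmetic, sketched here under the assumption that every BINDER can be assigned to any color channel and that each cycle uses at most one BINDER per channel (the function name is hypothetical).

```python
import math

def cycles_required(n_binders, n_colors):
    """Binding/elution cycles needed to observe n_binders with n_colors labels."""
    if n_binders < 0 or n_colors < 1:
        raise ValueError("need n_binders >= 0 and n_colors >= 1")
    return math.ceil(n_binders / n_colors)
```

For example, twelve BINDERs read one at a time need twelve cycles, whereas four separately detectable labels reduce this to three.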
These and other means described below of identifying (e.g., by amino acid
sequence,
partial amino acid sequence, or presence of sequence-related features detected
by binding
reactions) individual TARGET or STANDARD peptide molecules (or derivatives of
these that
preserve their individual identities) can be used to recognize and count the
numbers of
TARGET and STANDARD peptides in a sample that is standardized (i.e., by
STANDARD
addition) and enriched according to the invention.
In some embodiments a plurality of BINDERs, each specific for a different
epitope on
a peptide, are used to increase recognition specificity by increasing the
number of amino acids
involved in interactions with (i.e., "recognized by") the BINDERs. This
effectively increases
the peptide sequence coverage of the BINDER(s). In some embodiments, 2 or more
such
single-epitope BINDERs are stably joined to form a single molecule, and the
well-known
"avidity effect" results in a much higher overall affinity for the peptide
than would be seen
with any of the BINDERs individually. In some embodiments, multiple single-
epitope
BINDERs with distinct optical (e.g., fluorescent) labels are used together,
and peptides that
bind the set of BINDERs cognate to the peptide's epitopes are identified as
those exhibiting
the correct label emissions. In some embodiments, multiple single-epitope
BINDERs are
labeled with distinct fluorophores such that one BINDER is labeled with a
fluorophore acting
as a FRET donor and another BINDER is labeled with a fluorophore acting as a
FRET
acceptor. When such BINDERs bind to adjacent epitopes on a peptide, the
proximity of the
donor and acceptor fluorophores enables detection of this inter-epitope
proximity relationship
through detection of emission by the acceptor when the donor is illuminated at
its excitation
wavelength (i.e., a FRET signal is generated).
In some embodiments, BINDERs are used whose binding to a cognate peptide
epitope
is characterized by a rapid off-rate (e.g., in the range of 20 msec to 60 sec
half-off-times). The
optical signal from such a BINDER will appear and disappear as it repeatedly
binds to,
dissociates from, and re-binds to (etc.) an immobilized peptide construct. The
number of
transitions between bound (localized fluorescent signal) and dissociated (no
localized signal)
states per unit time serves as a quantitative kinetic parameter of the
strength of binding (77)
which can be used to differentiate binding events to the correct cognate
epitope from binding
events to a similar but slightly different epitope. This fine level structural
recognition further
amplifies the specificity of peptide detection.
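The kinetic parameter described above (transitions per unit time) can be computed from a thresholded fluorescence trace. The sketch below assumes the trace has already been reduced to one 0/1 value per imaging frame; the function name and frame interval are hypothetical.

```python
def transition_rate(trace, frame_interval_s):
    """Transitions between bound (1) and dissociated (0) states per second."""
    if len(trace) < 2:
        raise ValueError("need at least two frames")
    # A transition is any change of state between consecutive frames.
    transitions = sum(1 for a, b in zip(trace, trace[1:]) if a != b)
    observed_time = (len(trace) - 1) * frame_interval_s
    return transitions / observed_time
```

A site whose rate differs markedly from that measured for the true cognate epitope could then be flagged as a likely binding event to a similar but non-identical epitope.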
In some embodiments, peptides are chemically modified before, during or after
a series
of imaging detection steps. In some embodiments these modifications alter the
detectability
of specific peptides, such that a method of detection (e.g., imaging of a
BINDER bound to an
epitope of the peptide) that produces a positive signal when used before the
modification does
not produce a signal after the modification has taken place (or vice versa).
For example, a
modification that perturbs, disrupts or cleaves the peptide in an epitope can
result in the failure
of a BINDER specific for the original intact epitope to bind to the peptide
after the modification
has taken place (or for the BINDER to exhibit altered binding kinetics as
discussed above).
By comparing the detection result (BINDER binds to the peptide) obtained
before the
modification with the result after the peptide has been modified to disrupt
the BINDER's
epitope (BINDER fails to bind to the peptide), it is possible to infer that
the peptide in question
did, in fact, include a site that was modified. Likewise, a specific peptide
modification can
introduce a structural feature that enables binding of a BINDER that does not
bind (or binds
weakly) to the original intact epitope.
In some embodiments, a sequence-specific proteolytic cleavage is used as a
modification; in this case the cleavage can result in release of the end
portion of the peptide
that is not immobilized on the support. Sequence-specific enzymes such as
trypsin, ArgN,
AspN, GluC, chymotrypsin, pepsin, papain and the like may be used to cleave
peptides at
specific sites; only peptides comprising such sites will be cleaved, and the
positions of the
sites in the cleaved peptide sequences, in relation to the BINDER epitopes,
determine whether
or not BINDER binding is affected.
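The relationship between cleavage-site position and epitope retention can be sketched as follows. This is a simplified illustration for trypsin only, assuming cleavage after every internal Lys or Arg (the usual exception before proline is ignored) and that everything beyond the cleavage site nearest the anchored end is released. The sequence used in the usage note is invented.

```python
def tryptic_sites(peptide):
    """Indices i where cleavage occurs between peptide[i] and peptide[i + 1]."""
    return [i for i, aa in enumerate(peptide[:-1]) if aa in "KR"]

def epitope_retained(peptide, epitope, anchored_end="N"):
    """True if the whole epitope remains on the support after cleavage."""
    start = peptide.find(epitope)
    if start < 0:
        return False  # epitope not present in this peptide
    end = start + len(epitope) - 1  # inclusive index of last epitope residue
    sites = tryptic_sites(peptide)
    if not sites:
        return True  # no cleavage site, peptide stays intact
    if anchored_end == "N":
        return end <= sites[0]   # retained portion is peptide[:sites[0] + 1]
    return start > sites[-1]     # retained portion is peptide[sites[-1] + 1:]
```

With the hypothetical peptide `AEDLKVTGSYR` anchored by its n-terminal end, an epitope `EDL` is retained after cleavage while `GSY` is lost; anchoring by the c-terminal end reverses the outcome.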
In some embodiments, specific amino acids within a peptide sequence are
modified.
For example, protein kinase enzymes may be used to add a phosphate group to
specific serine,
threonine or tyrosine residues within a sequence. Addition of a phosphate
group to an amino
acid within a binding epitope is likely to have a significant impact on BINDER
binding to the
epitope (typically diminishing binding). In some embodiments, BINDERs are used
that
specifically bind to a phosphorylated epitope but do not bind to the
unphosphorylated epitope,
and in this case the BINDER binds to the peptide only after the kinase
modification has taken
place.
In some embodiments cysteine SH groups are modified, e.g., by reaction with
iodoacetamide, acrylamide, or any of a variety of n-ethylmaleimide compounds.
For example,
a peptide containing cysteine in a BINDER's epitope can be kept unmodified
(including un-
oxidized) for recognition by the BINDER and subsequently modified covalently
by reaction
with iodoacetamide, after which reprobing with the same BINDER results in
weaker (or no)
binding.
In some embodiments a free n-terminal amino group (or similarly a free c-
terminal
carboxyl group) can be modified in a way that impacts the binding of a BINDER
whose epitope
included that terminal group (e.g., by acetylation of the amino group, by
removal of a terminal
group, or by enzymatic addition of a terminal amino acid).
In some embodiments, the first of a pair of FRET donor-acceptor fluorophores
is added
to a site on the peptide (e.g., the n-terminal amino group, a cysteine SH
group, a linker joined
to the peptide) and the second to a BINDER capable of binding to an epitope
near the site of
the first. The intensity of the resulting FRET fluorescence provides a
measurement of the
distance between the two fluorophores that can contribute to the
identification of the peptide.
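The distance estimate mentioned above follows from the standard Förster relation E = 1 / (1 + (r/R0)^6). The sketch below inverts that relation; it assumes the FRET efficiency E has already been derived from the measured intensities and that the Förster radius R0 of the chosen donor-acceptor pair is known. The value 5.0 nm used in the usage note is illustrative only.

```python
def fret_distance(efficiency, r0_nm):
    """Donor-acceptor distance (nm) from FRET efficiency and Forster radius."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    # Inversion of E = 1 / (1 + (r / R0)**6)
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)
```

For instance, an efficiency of 0.5 with R0 = 5.0 nm corresponds to a separation equal to R0, and higher efficiencies correspond to shorter distances.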
In some embodiments one member of a pair of FRET donor-acceptor fluorophores
is
added to each of 2 BINDERs specific for adjacent epitopes on a peptide.
Binding of the two
BINDERs in proximity to one another (i.e., to their adjacent epitopes) creates
the conditions
required for FRET detection, thus confirming correct binding to these
epitopes.
In some embodiments one or more BINDERs capable of distinguishing terminal
amino
acids (or the terminal pair of amino acids) are used to determine this feature
of the peptide
sequence, thus adding considerable specificity to the overall detection
scheme. BINDERs such
as those described above for use in peptide sequencing by cyclical
degradation/identification
can be used for this purpose. In some embodiments repeated cycles of Edman or
enzymatic
removal of one or two terminal amino acids allows identification of multiple
terminal amino
acids.
7.8 USE OF THE INVENTION WITH SINGLE MOLECULE DEGRADATIVE
SEQUENCING AND COUNTING TECHNOLOGIES
7.8.1 Peptide sequencing by cyclical degradation with "reverse translation" to
DNA
In some embodiments, peptide molecules can be "reverse-translated" into
nucleic acid
sequences using a cyclic procedure involving recognition of peptide n-terminal
amino acid
residues, or a pair of n-terminal residues, and using this recognition to add
or transfer an oligo
sequence tag specific for the detected amino acid (or pair) to a growing DNA
oligo, which is
subsequently sequenced to identify and count the reverse-translated TARGET and

STANDARD sequences. This technology, described in US Patent Application
16/760,028,
can also be used to identify and count TARGET and STANDARD peptide molecules
of the
invention.
In some embodiments making use of the detection schemes that reverse translate

peptide sequence into nucleic acid sequence information, additional
information comprising a
peptide's identity as a TARGET or STANDARD, sample identity (e.g., a sample
barcode),
identity of the BINDER that bound the peptide during an enrichment step (e.g.,
a BINDER
barcode), and other pertinent information can be added to a growing DNA oligo
using any of
the methods well-known in the art, including copying of a sequence (e.g., by a
polymerase),
ligation of an oligo onto the growing chain, insertion of a sequence using
CRISPR and related
technologies, etc. The information thus collected characterizes the peptide in
a variety of ways
useful in the interpretation of the peptide molecule's sequence and its
significance in an assay.
This information may be read out using any of the well-known nucleic acid
detection (e.g.,
PCR) or sequencing methodologies (nanopores, sequencing by synthesis, etc.).
This readout
can be accomplished either with or without first removing the peptide from the
nucleic acid
component of the construct.
7.8.2 Peptide sequencing by cyclical degradation with optical detection
In some embodiments, the TARGET and STANDARD peptide molecules of the
invention can be arrayed by binding to a surface, or by distribution in an
array of pre-formed
wells or zones on a surface, and the molecules can be observed individually by
a position-
sensitive detection means, e.g., optical detection means or electronic
detection means.
Appl. No. 16/686,028 describes such a method that can be used to decode the
sequence of
individual peptide molecules anchored in individual wells of an array of wells
in a
semiconductor chip. The method enables identification of individual molecules
by matching
to TARGET peptide or STANDARD sequences, and tabulation of the numbers of such

molecules occurring in the array of wells (described as millions of wells on a
semiconductor
chip substrate (35)). As described herein, it may suffice to sequence only 3,
or 4, or 5 or 6
amino acids from the end of a selected set of TARGET and STANDARD peptides in
order to
identify them (discussed elsewhere herein). A further alternative detection
means capable of
detecting the step-wise removal of fluorescent labels on selected amino acid
side chains of
peptides immobilized on a surface during cycles of Edman degradation has been
described (IS,
78). This approach can also be used to identify and count TARGET and STANDARD
peptide
molecules.
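Whether a short terminal read suffices for a given panel can be checked directly. The sketch below assumes reads proceed from the n-terminal end and that the panel is a list of distinct peptide sequences (TARGET and cognate STANDARD sharing a sequence are distinguished by other means); the sequences in the usage note are invented.

```python
def min_distinguishing_length(peptides, max_len=6):
    """Smallest number of terminal residues that uniquely identifies each peptide.

    Returns None if no length up to max_len separates the whole panel.
    """
    for n in range(1, max_len + 1):
        prefixes = {p[:n] for p in peptides}
        if len(prefixes) == len(peptides):
            return n
    return None
```

For the hypothetical panel `["AEDLK", "AEFTR", "GSYVK"]`, three residues are enough, since the first two peptides share only their first two residues.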
7.8.3 Peptide sequencing by cyclical degradation with electronic detection
In some embodiments, peptide molecules and/or VEHICLE constructs are
immobilized
on a surface and their identities (e.g., as TARGET or STANDARD peptides)
determined by
electronic detection of the presence of BINDERs recognizing peptide n-terminal
amino acid
residues (35). Analogous technological means can be used in the same or
similar platforms
to read DNA sequences present in peptide:oligo constructs (79).
7.9 CALIBRATORS AND CONTROLS TO SUPPORT ACCURATE PEPTIDE
QUANTITATION.
A majority of biomarker tests for proteins deliver a result based on quantity
(e.g., the
concentration of the target protein in a biological sample) rather than
reporting a sequence. In
order to make use of sequence sensitive single molecule detection for
quantitation of peptides
and proteins, it is important to consider the calibration of these detection
systems (e.g., using
"calibrators"), as well as the confirmation that calibration is reliable
(i.e., through analysis of
"controls").
Use of external calibrator and control samples, analyzed alongside
experimental
samples to be analyzed, is well known in the analytical art, and widely used
for specific assays
(e.g., immunoassays) in clinical diagnostics and in research. Typically, data
obtained by
analysis of a calibrator is used to determine one or more adjustable
parameters that bring the
system's analytical result into concordance with an established external
reference system. If a
measurement system is inherently linear, a single point calibration can be
used to provide a
calibration factor by which detector output is multiplied to yield standard
abundance or
concentration units. In non-linear detection systems, calibrators with
multiple levels of analyte
may be used to produce a non-linear "standard curve" to translate detector
output into an
accurate abundance or concentration value. Control samples are typically
provided to confirm
that calibration has been effective: values obtained by analysis of one or
more control samples
are compared, after calibration adjustments, to pre-assigned values as a test
of the calibration
validity (i.e., controls provide quality control for the assay and its
calibration). In some
embodiments, calibrators and controls of this type are provided to be analyzed
in the same
sequence sensitive single molecule detection workflow as experimental samples,
and thus
provide calibration of the entire workflow.
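The two calibration styles described above can be sketched as follows. The single-point case is a direct multiplication; for the multi-level case this sketch uses piecewise linear interpolation between calibrator levels, which is only one of many possible curve models and is an assumption of the example, as are all names and numbers.

```python
def single_point_factor(known_value, detector_output):
    """Calibration factor for an inherently linear system."""
    return known_value / detector_output

def standard_curve_lookup(curve, detector_output):
    """Interpolate a concentration from (detector_output, concentration) pairs."""
    points = sorted(curve)
    if len(points) < 2:
        raise ValueError("need at least two calibrator levels")
    if not points[0][0] <= detector_output <= points[-1][0]:
        raise ValueError("output lies outside the calibrated range")
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if x0 <= detector_output <= x1:
            return y0 + (y1 - y0) * (detector_output - x0) / (x1 - x0)
```

Values outside the calibrated range are rejected rather than extrapolated, a conservative choice made for this sketch.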
In some embodiments an additional level of calibration and control is provided
to
ensure optimal operation of the sequence sensitive single molecule detector
itself (i.e., focusing
on the detector alone, instead of the entire workflow that includes digestion,
any chemical
modifications, etc.). This is particularly relevant because sequence sensitive
single molecule
detectors can produce errors, specifically misidentification of nucleic acid
bases, amino acids,
or whole molecules. In the case of nanopore detection, these errors can arise
from several
sources, including errors in assembly of peptides into sequenceable
constructs, the movement
of molecules through the pore, defects in a nanopore itself, statistical
fluctuations in current
flowing through a nanopore, electronic noise in the device measuring through-
pore current,
and a variety of errors contributed by the complex mathematical algorithms,
including deep
multi-layer machine learning software systems used to interpret that current
traces. These
errors are inherently difficult to address by simple inspection because
individual nanopore
current traces are extremely difficult to interpret "manually" (i.e., by
visual or elementary
mathematical inspection), and essentially impossible to evaluate manually in
large numbers
(as required for useful quantitative applications). Software systems that are
capable of rapid
and accurate evaluation of nanopore current traces are typically complex,
multilayered
machine learning systems, which are well known in the art to be essentially
impossible to
understand in detail in terms of human-perceivable logic processes. It is
therefore desirable to
provide means for a) minimizing, and b) evaluating the sum of these errors so
as to provide
accurate peptide quantitation and a precise understanding of the magnitude
and nature of errors
remaining.
In some embodiments, calibrator TARGET and STANDARD constructs are provided
to address this issue, and these can perform either or both of at least two
functions: 1)
calibration of the relationship between the numbers of TARGET and STANDARD
molecules
reported by the detection system and the numbers expected based on prior
validated
measurement of the numbers present in the calibrator material, and 2) tuning
and assessment
of the accuracy with which TARGET and STANDARD molecules are classified.
In some embodiments a calibrator sample is provided that is capable of being
read by
a nanopore under conditions that are the same as, or similar to, those
pertaining when sample
peptides are read using workflows of the invention. In some embodiments, a
calibrator
comprises a polymer VEHICLE (which may comprise polymer segments capable of
threading
a nanopore and oligonucleotide segments capable of engaging an oligonucleotide
motor to
control movement through a nanopore), with TARGET and STANDARD peptide
molecules
attached or incorporated therein, and in which the ratio between the numbers
of TARGET and
STANDARD peptide molecules in the sample's population of calibrator constructs
is known.
Nanopore analysis and current trace interpretation of such a calibrator sample
will generate an
experimentally determined TARGET:STANDARD ratio, which may be compared to the
ratio
known a priori to be present in the calibrator. Any discrepancy can be used as
a basis for
calculating and applying a correction factor to the TARGET:STANDARD ratio
reported by
the analytical system on other samples. For example, if the known
TARGET:STANDARD
ratio in the calibrator is 1.0, and the nanopore result (TARGET:STANDARD ratio
calculated
from the counts of TARGET and STANDARD molecules by the analytical system) is
1.2, then
the measured ratios for other samples can be multiplied by 1/1.2 to provide a
calibrated result.
In some embodiments, a calibrator is used to tune the detection system itself. In this
In this
case, the calibrator comprises constructs of TARGET and STANDARD peptide
molecules on
a VEHICLE in a manner that identifies TARGET and STANDARD molecules to the
detection
system. For example, the calibrator can comprise a mixture of two constructs
consisting of a)
a plurality of TARGET peptides on a type of VEHICLE and b) a plurality of
STANDARD
peptides on the same or a different type of VEHICLE. In this case, each
construct comprises
multiple copies of either the TARGET or the STANDARD peptides, but not both. In
some
embodiments, TARGET and STANDARD peptides are coupled to different VEHICLES
that
provide independent identification of which peptide is present (e.g., by
incorporating different
DNA or other recognizable sequences). In some embodiments, TARGET and STANDARD
peptides are present in the construct in an order or arrangement (e.g.,
alternating order) that
allows the sequencing systems to accurately infer the identity of each
peptide. In analyzing
the calibrator, the detection system is able to recognize a separate set of
valid current traces for
each of the two types (TARGET and STANDARD). These sets of trace data,
correctly labeled
as to identity based on the calibrator's design, provide an opportunity to
test the accuracy of
the system's assignment resulting from processing the current traces,
resulting in a typical
confusion matrix composed of the correct and incorrect calls for each of the
two peptides.
In some embodiments a similar calibrator approach (providing sets of TARGET
and
STANDARD trace data, correctly labeled as to identity) is used to adjust
parameters of nanopore
current trace evaluating algorithms themselves. A number of studies have
suggested that direct
application of machine learning to the raw current traces can provide better
accuracy in
identifying a sequence among a set of candidates (e.g., a DNA or peptide
"barcode") than
evaluation through "base-calling" and comparisons of the resulting sequences.
It is clearly
demonstrated in the art that numerous alternative methods of machine learning,
algorithm
designs and parameter sets can be applied with varying levels of success.
Hence the accurately
labeled current trace data provided by a calibrator as described can be used
to create or refine
the algorithm used to identify TARGET and STANDARD peptides, as well as test
its accuracy
(as in the preceding embodiment).
In some embodiments collections of constructs each comprising only one or a
few
peptide molecules are provided, and in such cases each peptide's true
identity is determined
from highly reliable barcode labels in the constructs.
In some embodiments, a plurality of calibrator constructs is provided for the
calibration
and/or optimization of detection of a plurality of TARGETs and cognate
STANDARDs.
In some embodiments, a peptide digest prepared from a complex protein sample
(e.g.,
a plasma sample) can be processed according to the invention to create a large
collection of
different TARGET constructs (e.g., incorporating a TARGET code). A second
aliquot of the
same protein sample can be processed according to the invention to create a
large collection of
constructs labeled with a different code (e.g., incorporating a STANDARD
code). Any pair of
distinct codes can be used instead of TARGET and STANDARD codes for this
special
purpose. The two preparations (the same peptides labeled with different tags)
can be mixed in
a specified ratio (e.g., 1 part of the first mixture and 10 parts of the
second) to provide a
calibrator in which two construct versions (e.g., labeled with TARGET and
STANDARD tags)
of many different peptides can be detected. Observation of the expected ratio
(1:10 in this
example) for each detected peptide confirms the linearity of a single molecule
detection
system.
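The per-peptide ratio check in this example can be sketched as follows, assuming counts of the two construct versions are available for each detected peptide. The function name, the peptide identifiers, and the counts are hypothetical.

```python
def ratio_deviations(counts, expected_ratio):
    """Relative deviation of each peptide's observed ratio from the expected one.

    counts maps a peptide identifier to (n_first_version, n_second_version).
    Peptides with no counts for the second version are skipped.
    """
    return {
        peptide: (n_first / n_second) / expected_ratio - 1.0
        for peptide, (n_first, n_second) in counts.items()
        if n_second > 0
    }
```

For a 1:10 mixture (`expected_ratio=0.1`), a peptide counted 100 and 1000 times shows no deviation, while one counted 120 and 1000 times deviates by roughly +20%.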
In some embodiments, one or more calibrators are analyzed separately from
experimental samples (e.g., before or after a run of experimental samples),
and the results used
for the purposes described above.
In some embodiments, one or more calibrators are mixed with an experimental
sample
to provide calibration within a nanopore run. In this case it is advantageous
to definitively
identify the calibrator constructs by incorporating specific sequence barcode
labels in addition
to a set of homogenous peptide molecules.
In some embodiments, calibrator construct nanopore current traces are
evaluated for
individual nanopores among a plurality of available nanopores in a device and
used to
deactivate or otherwise suppress data from such nanopores. In some
embodiments, calibrator
construct current traces are used to optimize the machine learning algorithms
used to analyze
the data from each individual nanopore.
In some embodiments, algorithm parameters are adjusted based on evaluation of
calibrator traces to provide a specified level of certainty of peptide
assignment. For example,
when few copies of a TARGET peptide are detected, it can be preferable to
ensure that these
few copies are correctly identified and are not incorrectly assigned STANDARD
(or other)
molecules. The current trace interpretive algorithm can be modified to count
only high
confidence identifications while assigning lower confidence identifications to
an "unassigned"
category. This modification increases the certainty that these TARGET peptide
molecules are
correct identifications, at the cost of reducing the number of
identifications, and hence
increasing the CV of the measurement. It will be clear to those skilled in the
art that tradeoffs
between the accuracy of nanopore trace identification on the one hand and
overall
TARGET:STANDARD ratio precision on the other result from such adjustments,
and that
these must be taken into account in the overall optimization of assay
performance.
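By way of non-limiting illustration, the "unassigned" category described above can be sketched as a confidence threshold applied to per-molecule identifications. The function name, the (label, confidence) input format and the example values are assumptions made for illustration; the specification does not define a particular interpretive algorithm.

```python
from collections import Counter


def count_identifications(calls, min_confidence):
    """Count calls per label; low-confidence calls go to 'unassigned'.

    `calls` is an iterable of (label, confidence) pairs, with confidence
    expressed on a 0-1 scale (a hypothetical classifier output).
    """
    tally = Counter()
    for label, confidence in calls:
        key = label if confidence >= min_confidence else "unassigned"
        tally[key] += 1
    return tally


# Hypothetical identifications (illustrative values only).
calls = [("TARGET", 0.99), ("TARGET", 0.62), ("STANDARD", 0.97),
         ("STANDARD", 0.91), ("STANDARD", 0.55)]
strict = count_identifications(calls, 0.9)
lenient = count_identifications(calls, 0.5)
```

Raising the threshold moves molecules into "unassigned", so each retained identification is more certain but fewer molecules contribute to the count, which is the tradeoff against CV noted above.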
In some embodiments the false positive and negative detection rates of TARGET
peptides and STANDARDs are used in statistical calculations to provide
improved estimates
of the precision of the respective molecule counts and the precision of the
ratio between
TARGET and STANDARD counts. Those knowledgeable in the art will understand
that a
variety of advanced statistical methods exist for the incorporation of
multiple measures of
uncertainty and error into an overall estimate of precision. A fully
elaborated model of assay
precision is of considerable importance in establishing the clinical utility
of assays according
to the invention.
In some embodiments, one or more well-characterized samples similar to or
representative of the experimental samples to be analyzed are used as
"controls".
7.10 STOICHIOMETRIC FLATTENING AND ITS IMPACT ON THE EFFICIENCY OF
PEPTIDE QUANTITATION BY COUNTING.
In some embodiments, quantitative results provided by use of the invention
include the
ratio of the number of molecules identified and counted as TARGET peptide
constructs to the
number of molecules identified and counted as STANDARD constructs (STANDARD
being
present in known or at least consistent amount across a set of samples being
analyzed). The
precision afforded by counting molecules, where counts are distributed in an
approximately
Gaussian manner, is estimated to be governed by the ratio of the square root
of the number of
counts to the number of counts (a ratio often referred to as the Coefficient
of Variation, or CV).
In various applications, CVs of 20% or less (for research assays), of 5% or
less (for critical
diagnostic assays) or 2-3% or less (for sensitive longitudinal tracking of
biomarker levels) are
desired. In such a model the number of counts theoretically needed to achieve
a target CV is
(1/CV)^2: thus, CVs of 20%, 5% and 2% would require respectively 25, 400 and
2,500 molecule
counts for a single TARGET or STANDARD construct.
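By way of non-limiting illustration, the counting-statistics relationship above (CV = sqrt(N)/N, so N = (1/CV)^2) can be checked numerically. The function names are illustrative, and the CV is passed as a percentage only to keep the arithmetic exact for the values quoted above.

```python
import math


def required_counts(cv_percent):
    """Smallest whole number of counts N with sqrt(N)/N <= cv_percent."""
    return math.ceil((100 / cv_percent) ** 2)


def cv_percent_from_counts(n):
    """CV (in percent) expected from counting n molecules."""
    return 100 * math.sqrt(n) / n


# Target CVs named in the text, in percent.
needed = {cv: required_counts(cv) for cv in (20, 5, 2)}
```

Because the required count grows with the inverse square of the CV, halving the CV quadruples the number of molecules that must be counted.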
The CV of the ratio of the number of molecules identified and counted as
TARGET
peptide to the number of molecules identified and counted as STANDARD is more
complicated, but is dominated by the count with the higher CV (i.e., the
peptide with the fewer
molecules counted, and hence the lower precision). In quantitative
applications, e.g.,
measuring clinical protein biomarkers in blood, the amount of STANDARD added
to a sample
as internal standard may be set approximately equal to the average level of
the TARGET
peptide observed in a set of samples from a relevant human population (i.e.,
at the population
average level, such that the averaged TARGET:STANDARD ratio is 1.0). While
different
biomarkers exhibit different levels of quantitative variation among
individuals (59), in most
cases normal variation occurs within a range of 10-fold below and 10-fold
above the population
average (i.e., from 10 to 1,000 units for a biomarker whose average level is
100 units in a
relevant population; (59)), though some biomarkers show less variation (e.g.,
occur within a
range of 0.5-2.0-fold from the mean) and a few others can change by
>1,000-fold (e.g., CRP
in cases of extreme inflammation). In some samples, more TARGET peptide
molecules will
be counted than STANDARD, and in some samples the reverse. The CV of the ratio
will be
dominated by the CV of the variable with the fewer counts, since the variable
with more counts
will have a smaller CV. As a general estimate we can expect the CV of the
ratio to be no more
than 1.5 times the larger of the TARGET peptide and STANDARD CV's. For
example, a
TARGET peptide whose ratio to its STANDARD is to be measured with a CV of 3%
or better,
needs to produce 3%/1.5 = 2% CV for the lower of the two counts, which in turn
requires 2,500
molecules of the less frequent peptide version (i.e., TARGET or STANDARD) to
be counted
(an estimate based on counting statistics as discussed above). If the more
frequent version of
a peptide is 10x as abundant as the less frequent (at the edge of the intended
dynamic range of
the assay), then 25,000 molecules of the more frequent version would need to
be counted
before 2,500 molecules of the less frequent are detected. If the more frequent
version of a
peptide is 100x as abundant as the less frequent (at the edge of the intended
dynamic range of
the assay), then 250,000 molecules of the more frequent version would need to
be counted.
The total number of molecules of the TARGET peptide to be counted (TARGET +
STANDARD) would be a maximum of 27,500 molecules in the first case (10x range)
and
252,500 molecules in the second case (100x range). This requirement for
counting large
numbers of peptide molecules, and in particular larger numbers to achieve
better precision
(lower CV's) and wider dynamic range, provides strong motivation to optimize
the design of
assays using the stoichiometric flattening method of the invention.
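By way of non-limiting illustration, the statement that the ratio CV is dominated by the smaller count can be examined with a first-order (delta method) approximation for two independent counts, CV_ratio ≈ sqrt(1/N_TARGET + 1/N_STANDARD). This approximation is an assumption introduced here for illustration and is not a formula given in the specification; under it, the ratio CV is about 1.41 times the single-count CV when the two counts are equal and approaches 1.0 times as the counts diverge, consistent with the general estimate of no more than 1.5 times.

```python
import math


def single_cv(n):
    """CV of a single count under simple counting statistics."""
    return 1 / math.sqrt(n)


def ratio_cv(n_target, n_standard):
    """Approximate CV of a ratio of two independent counts."""
    return math.sqrt(1 / n_target + 1 / n_standard)


# Ratio CV relative to the CV of the smaller count (2,500 molecules),
# for abundance differences of 1x, 10x and 100x.
low = 2500
inflation = {fold: ratio_cv(low, low * fold) / single_cv(low)
             for fold in (1, 10, 100)}
```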
In clinically important sample types such as whole blood and blood plasma,
proteins
of diagnostic interest can vary in abundance by more than 10^10 (10 billion-fold
(/, 8)), a range
that significantly exceeds the practical dynamic range of available detection
technologies,
including mass spectrometry and molecule counting. However, given that peptide
quantitation
according to the invention makes use of counts of TARGET peptides compared to
counts of
STANDARD internal standard molecules (e.g., as a ratio between the two), it is
not necessary
to capture all, or even a large fraction, of the molecules of a high-abundance
peptide in order
to accurately measure its concentration in a sample. Thus in some embodiments
the invention
instead provides for adjustment of the amount of each peptide TARGET+STANDARD
pair
captured, e.g., by adjusting the amount of each peptide's specific enrichment
reagent (e.g.,
amount of cognate BINDER) or the circumstances of enrichment (e.g., duration
of binding and
washing steps, solution conditions, etc.) so as to capture only the amount of
the cognate
TARGET peptide and STANDARD pair that is needed to allow counting the minimum
required number of peptide molecules (specifically the minimum number required
to deliver
the desired measurement precision for the less abundant of the TARGET and
STANDARD
molecules: the more abundant of the two will by definition have more counts
and thus a better
precision, so that the ratio of TARGET and STANDARD measurements will have a
precision
similar to that of the less abundant molecule alone). By adjusting the amounts
of each peptide
recovered, stoichiometric differences between different TARGET peptides are
reduced,
effectively "flattening" the stoichiometry across the set of TARGET peptides.
This differential
adjustment of enrichment recoveries for different TARGET peptide constructs
requires co-enrichment of cognate STANDARDs; otherwise the desired
quantitative information on TARGET abundance is lost.
By flattening the stoichiometric differences (i.e., reducing the abundance
differences)
between peptides that are present at very different abundance levels in the
original sample
digest, the total number of molecules that need to be counted in the assay,
and hence the cost
and time involved, is minimized.
In some embodiments, stoichiometric flattening enriches low-abundance
peptides, and
specifically "de-enriches" or depletes selected high-abundance peptides to a
relative
abundance level specified in the assay design (typically much less than 100%
but greater than
0% of the initial amount), and is therefore distinct from the general concept
of "enriching"
TARGET peptides as a means of increasing assay sensitivity by capturing all of
a rare analyte
from a large sample. In conventional methodologies, low abundance analytes are
enriched as
efficiently as possible so as to allow measurement of all the analyte present
in a sample: i.e.,
the enrichment target is 100% recovery, and different assay targets are not
generally enriched
to different levels in a multiplex assay design. It is important to note that
the ability to
differentially enrich different TARGET peptides as practiced in the invention
relies on the
presence of STANDARDs to preserve information on the quantity of the TARGET
peptide in
the sample: i.e., the constancy of the ratio between TARGET and STANDARD
amounts
before, and after, enrichment. Stoichiometric flattening is not practicable in
situations where
an internal standard is not included for each analyte, nor in situations where
each analyte is not
specifically enriched by a different BINDER whose capture amount can be
adjusted.
In some embodiments of a multi-analyte panel of TARGET peptides and their
respective STANDARDs designed to measure multiple proteins, the amounts of the
respective
BINDERs are adjusted so as to deliver approximately equal numbers of TARGET
plus
STANDARD peptide molecules for each TARGET peptide, assuming that STANDARDs
are
added to the sample at levels approximately equal to the expected level of the
cognate
TARGET peptide. In some embodiments the process of adjusting BINDER amounts is
carried
out in a series of steps, beginning with a combination of the BINDERs in
certain amounts
(which may for convenience initially be equal amounts), measuring the numbers
of each
TARGET peptide (or STANDARD) molecule detected after enrichment and then
reducing the
amount of BINDERs for which a large number of peptides were detected and/or
increasing the
amount of BINDERs for which few peptide molecules were counted. Applied in an
iterative
approach, this empirical method allows progressive adjustments of the relative
amounts of the
BINDERs towards the goal of similar peptide counts for each TARGET peptide and

STANDARD pair. It will be evident to those skilled in the art that such a
process can be
terminated at any point where the numbers of peptides counted for the panel
components meet
the needs of a specific application, or it can be continued to progressively
improve performance
towards an optimum defined by any of a range of well-known statistical
measures of overall
precision and accuracy. This method of tuning is fully empirical and
independent of any prior
knowledge of the relative kinetic properties of the various BINDERs, of the
relative
abundances of the target proteins in samples of interest, or the details of
data analysis (e.g., the
probabilities of errors in identification of specific peptides by the
sequencing platform). Once
the relative amounts of the various BINDERs are established for a panel, the
recipe can be
locked down as a reproducible product until changes in one or more BINDERs
(e.g.,
development of different BINDER reagents), STANDARD amounts, or panel
composition are
required.
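By way of non-limiting illustration, the iterative tuning described above can be sketched as a loop that measures counts, then scales each BINDER amount down where counts are high and up where counts are low. The update rule, the damping exponent, the toy capture model (counts proportional to abundance times BINDER amount) and all numerical values are hypothetical and serve only to show the shape of the procedure; a real enrichment step would replace the `toy_measure` function.

```python
def tune_binders(measure, amounts, target_count, rounds=12, damping=0.5):
    """Iteratively rescale BINDER amounts toward a common count.

    `measure(amounts)` returns {binder_name: molecules_counted}.
    Each round multiplies an amount by (target/observed)**damping.
    """
    amounts = dict(amounts)
    for _ in range(rounds):
        counts = measure(amounts)
        for name, observed in counts.items():
            if observed > 0:
                amounts[name] *= (target_count / observed) ** damping
    return amounts


# Hypothetical relative peptide abundances (illustrative only).
ABUNDANCE = {"HbA": 1_000_000.0, "CRP": 5_000.0, "sTfR": 40.0}


def toy_measure(amounts):
    """Stand-in for an enrichment and counting run (toy linear model)."""
    return {name: ABUNDANCE[name] * amounts[name] for name in amounts}


start = {name: 1.0 for name in ABUNDANCE}  # equal amounts to begin with
tuned = tune_binders(toy_measure, start, target_count=1000)
final_counts = toy_measure(tuned)
```

The loop needs no knowledge of BINDER kinetics or sample abundances, only the measured counts, which mirrors the fully empirical character of the method; it can be stopped as soon as the counts meet the needs of the application.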
In some embodiments, two or more stages of BINDER capture are used: a first
capture
to collect TARGET and STANDARD peptides from a standardized sample digest
(i.e., having
hundreds of thousands of different peptides), and one or more secondary BINDER
capture
steps to further purify or concentrate these relatively pure peptides, or
transfer them to a
different immobilized format (e.g., a smaller number of larger beads). In some
embodiments,
the process of stoichiometric flattening as described is carried out by
adjustments of relative
amounts of different BINDER in the first capture stage, or else in the second
capture stage, or
in multiple capture stages. In some embodiments, a first BINDER capture stage
is used to
collect amounts of the TARGET and STANDARD peptides from a complex
standardized
digest, and may not, because of variations in the character of different
samples, yield the
desired level of stoichiometric flattening (i.e., roughly equal amounts of
all the peptides);
however, adjustments of BINDER amounts or properties in a second stage capture,
which
begins with a relatively pure peptide sample, can provide a much flatter
stoichiometry and thus
better detection efficiency.
7.10.1 Computing the benefit of stoichiometric flattening
Figure 33 illustrates the value of stoichiometric flattening in reducing the
number of
molecules that must be counted to ensure precise measurement of peptides
present in a sample
in widely disparate amounts. In a small sample of whole human blood (e.g., 10
uL), the most
abundant protein (hemoglobin derived from red blood cells; Hb) is typically
present at a level
of 33,000,000 fmol (~2e16 molecules), while soluble transferrin receptor
(sTfR) is present at
a level of 37 fmol (~2.2e10 molecules), a difference of almost 1,000,000-fold.
Such a sample,
which would be considered small in current clinical laboratories, contains
many more
molecules than must be counted to achieve precise quantitation (low CV's)
using single
molecule counting as disclosed in the invention. If an embodiment requires a
CV of ~5%, it
is necessary to count only ~400 molecules of the less abundant of the TARGET
and
STANDARD pair of peptides representing each of the protein targets (according
to simple
counting statistics; the square root of 400 divided by 400 being 20/400 = 5%).
In the case of
sTfR, the estimated maximum level expected in a sample is slightly more than
the population
average, and so the combined number of counts needed for sTfR TARGET peptide
and
STANDARD is estimated to be 913 total molecules. Without using stoichiometric
flattening,
~822,000,000 TARGET and STANDARD peptide molecules corresponding to HbA would
be
counted during the time it took to accumulate the 913 molecules of sTfR
peptide (almost a
million molecules for each sTfR molecule counted). In this case the single
molecule counter
is spending 99.9999% of its time counting HbA while waiting for sTfR counts to
accumulate,
as a result of the vast difference in their relative abundances in the sample.
This level of
inefficiency would render the use of single molecule counting for quantitating
these molecules
impractical.
However, using stoichiometric flattening as provided in the invention, only a
tiny
fraction of the HbA TARGET and STANDARD peptide molecules are captured by a
correspondingly tiny amount of the cognate BINDER, while a much larger
fraction of the sTfR
TARGET and STANDARD peptide molecules are captured by a correspondingly larger

amount of their cognate BINDER (or use of a much higher affinity BINDER). As a
result,
while 913 molecules of sTfR peptides are counted, only 844 molecules of HbA
are required (a
smaller number than sTfR because the variation in the amount of HbA across a
population is
less), for a total of 1,757 peptide molecules.
As a result, in this example stoichiometric flattening reduces the total
number of
molecules to be counted from ~822,000,913 to 1,757, a reduction of almost
500,000-fold while
delivering the same precision. This difference makes practical what would
otherwise be
entirely impractical.
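By way of non-limiting illustration, the arithmetic of the Hb and sTfR example can be reproduced as follows. Avogadro's number is the only value not taken from the example itself; variable names are illustrative.

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def fmol_to_molecules(fmol):
    """Convert femtomoles to a number of molecules."""
    return fmol * 1e-15 * AVOGADRO


hb_molecules = fmol_to_molecules(33_000_000)   # roughly 2e16
stfr_molecules = fmol_to_molecules(37)         # roughly 2.2e10
abundance_ratio = hb_molecules / stfr_molecules

# Molecules counted without and with stoichiometric flattening.
unflattened_total = 822_000_000 + 913
flattened_total = 913 + 844
fold_reduction = unflattened_total / flattened_total

# Share of counting effort spent on HbA without flattening.
hba_fraction = 822_000_000 / unflattened_total
```

The computed reduction is a little under 470,000-fold, in line with the "almost 500,000-fold" figure, and the HbA share of counted molecules without flattening is about 99.9999%.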
Figure 34 presents a more complete example in which stoichiometric flattening
is used
to improve measurement of a panel of 26 proteins measured in small human blood
samples.
As in the previous case, the lowest abundance protein is sTfR and the highest
is HbA, with a
series of clinically relevant protein biomarkers occurring at various
abundance levels in
between. Without stoichiometric flattening, a total of approximately
1,000,000,000 peptide
molecules must be counted while 913 counts of sTfR are accumulated (as in the
previous
example). However, using stoichiometric flattening to equalize the amounts of
the peptides
(e.g., by adjusting the amounts of their respective BINDERs used in the
enrichment step),
and taking into account the greater range of variation of some proteins
compared to others in
the panel, a total of only ~64,000 peptide molecules would need to be counted
to provide 5%
CV measurements of all, resulting in an efficiency improvement of ~16,000-fold
overall.
7.10.2 Stoichiometric flattening in nanopore sequencing
In some embodiments that employ nanopore sequencing for peptide detection (an
essentially serial technique), the increased efficiency provided by
stoichiometric flattening
translates directly into a dramatic reduction in the time and number of pores
required to analyze
a sample, which in this case is the time required to accumulate the required
numbers of counts.
In such systems multiple samples can be analyzed together (i.e., multiplexed)
using some form
of molecular barcoding technology (as used for example in genomic sequencing
on Oxford
Nanopore platforms), and given sufficient pore throughput capacity, this
enables more samples
in a given time and thus higher overall throughput.
In some embodiments, the capability of a nanopore reader to identify portions
of a
sequence early during the read operation, and eject a molecule whose sequence
is not of interest
(or is surplus to requirements in the current context) can be used to further
reduce the
stoichiometric differences between high and low abundance peptide reads. This
approach,
termed "computational enrichment of target sequences" or "Read Until" (74), can
provide
modest (e.g., max 10-fold) improvements in yield of target sequences in a DNA
context, but
its value depends on having long reads in order to have the opportunity of
"rejecting" a
significant amount of sequenceable material. In the context of the invention,
this approach
would yield little or no benefit for constructs carrying one or a few peptide
molecules.
However, in embodiments that employ lengthy concatenated constructs carrying
many copies
of the same TARGET and STANDARD peptide pair, the use of computational
enrichment can
allow rejection of large numbers of peptide molecules of which sufficient
numbers (defined by
the statistical needs of the assay) have already been read.
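By way of non-limiting illustration, the decision to reject molecules of which sufficient numbers have already been read can be sketched as a per-peptide quota. The class and method names below are hypothetical and do not correspond to the programming interface of any nanopore platform; early identification of the peptide is assumed to be provided by some upstream step.

```python
from collections import Counter


class QuotaFilter:
    """Decide whether to keep reading a molecule or eject it."""

    def __init__(self, quotas):
        # quotas: {peptide_id: number of reads required by the assay}
        self.quotas = dict(quotas)
        self.accepted = Counter()

    def decide(self, peptide_id):
        """Return 'read' or 'eject' for a molecule identified early."""
        quota = self.quotas.get(peptide_id)
        if quota is None:
            return "eject"  # sequence not of interest
        if self.accepted[peptide_id] >= quota:
            return "eject"  # surplus to statistical requirements
        self.accepted[peptide_id] += 1
        return "read"


# Hypothetical quota of two reads for one peptide (illustrative only).
flt = QuotaFilter({"HbA": 2})
decisions = [flt.decide("HbA") for _ in range(3)]
off_target = flt.decide("unknown")
```

As noted above, the benefit of such a filter depends on each rejected construct being long enough that ejecting it saves meaningful pore time.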
7.10.3 Stoichiometric flattening in degradative sequencing applications
In some embodiments employing peptide sequencing by stepwise peptide
disassembly
while recording amino acids one after another (i.e., degradative sequencing
methods), the
number of peptides counted may be determined by the capacity of the detection
system (e.g.,
millions of peptide sites on arrays used by Quantum-Si, Encodia or Google
platforms), and the
time required for analysis of an initially fixed number of molecules is
determined by the
number of amino acids that must be serially decoded to accurately identify the
TARGET and
STANDARD peptides for counting. In degradative sequencing methods that record
the
peptide sequence as DNA (Encodia or Google), the analysis further requires a
DNA
sequencing step: however current NGS sequencing platforms provide such an
enormous
capacity that this step is probably not a significant throughput limitation.
Example sets of
TARGET and STANDARD peptide sequences designed to measure a panel of proteins
can be
constructed so as to allow recognition of each peptide by sequencing only 3 or
4 amino acids
from either terminus. In some embodiments it will be advantageous to sequence
further (more
amino acids) in order to decrease potential for misidentification and/or
provide for recognition
of any unwanted peptides with sequences similar to, but different from, the
expected TARGET
and STANDARD sequences. In any case, using stepwise degradative sequencing
technology
the time required may be somewhat adjustable (e.g., by adjusting the number of
amino acids
required to be read) but the overall number of peptide molecules being
processed is determined
by the geometry of the detection system itself. For this reason, stoichiometric
flattening is key
to ensuring that there is sufficient number capacity to provide acceptable
precision in the
measurement of a series of target proteins.
7.10.4 Stoichiometric flattening in single molecule imaging detectors
Current single molecule optical imaging detection platforms have on the order
of 10^10
attachment sites (54), all of which can theoretically be loaded with single
molecules and
imaged. While 10^10 molecules might allow detection of a few copies of a
peptide from a low
abundance protein, that protein could not be quantitated accurately (too few
counts), while a
high abundance protein (like plasma albumin) might contribute 0.5x10^10 of the
total molecules
imaged (an enormous overburden of no measurement value). In embodiments
employing
stoichiometric flattening according to the invention, however, accurate
measurements can be
made with a few thousand molecules per TARGET, allowing tens of thousands of
samples to
be analyzed for a hundred or more proteins in a single run, with
correspondingly enormous
decreases in cost per measurement and capacity to run large sample sets.
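By way of non-limiting illustration, the capacity estimate above can be approximated as the number of attachment sites divided by the molecules needed per sample. The specific inputs (10^10 sites, 3,000 molecules per TARGET as one reading of "a few thousand", 100 proteins per sample, and full site occupancy) are assumptions chosen for illustration only.

```python
def samples_per_run(sites, molecules_per_target, targets_per_sample):
    """Whole samples that fit on a detector with `sites` usable sites."""
    return sites // (molecules_per_target * targets_per_sample)


# Assumed values: 10^10 sites, 3,000 molecules per TARGET, 100 proteins.
capacity = samples_per_run(10**10, 3_000, 100)
```

With these assumed inputs the result falls in the tens of thousands of samples per run; incomplete loading or misidentified molecules would lower it.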
7.11 MULTI-ANALYTE PANEL EMBODIMENTS.
Since the sequence-sensitive single molecule detection approach of the
invention can
distinguish between different peptide sequences, it can be used to measure
multiple different
TARGET peptides and their respective STANDARDs, potentially representing
multiple
different sample proteins, at the same time in the same sample. As has been
demonstrated
using the SISCAPA method (14), multiple specific affinity reagents (e.g.,
BINDERs) can be
used together (e.g., immobilized on magnetic beads) to enrich their cognate
peptide sequences
from a complex sample digest without significant interference between
peptides.
Figure 35 illustrates a multiplex panel embodiment in which 10 peptides, along
with
their respective STANDARD peptides designed according to the invention are
measured by
nanopore sequencing in the form of concatamers and counted to provide
quantitative
measurements of the presence in a clinical sample of SARS-CoV-2 NCAP protein,
antibodies
to SARS-CoV-2 NCAP and Spike proteins, levels of three host inflammation
markers (CRP,
LPSBP and Hp), and the RNA genome of SARS-CoV-2. This collection of analytes,
determined by a single nanopore sequencing run, provides broad coverage of
COVID-19
infection and patient response.
8 DISCUSSION OF SOME NOVEL FEATURES OF THE INVENTION.
8.1 DESIGN OF ORIENTED PEPTIDE-OLIGONUCLEOTIDE CONSTRUCTS
The invention provides novel components and workflows for modifying
proteolytic
peptides to create heterogeneous molecular constructs suitable for single
molecule detection
using several different detector technologies.
8.2 DESIGN OF INTERNAL STANDARDS FOR QUANTITATION: STANDARD
CONSTRUCTS
The invention provides novel internal standards (STANDARD constructs) and
methods of preparing these, as well as barcoding methods enabling multiplex
analysis of
multiple samples.
Some embodiments make use of an internal standard (STANDARD) for quantitation
in a single molecule sequencing system. The STANDARD is designed and selected
to allow
optimal differential single molecule detection compared to its TARGET peptide
(and other
peptides likely to be present in an enriched sample) while minimizing any
differences in
binding of TARGET and STANDARD by a specific affinity reagent used to enrich
these peptides
from a complex sample (e.g., a sample digest) or in subsequent reactions
involved in assembly
of detectable constructs. Use of a cognate-sequence internal standard is novel
for quantitation
in a sequence-sensitive detection system, and provides the means to achieve
precise
quantitation of a TARGET peptide in a sample digest.
8.3 ENRICHMENT OF CONSTRUCTS USING BINDERS
The invention provides novel internal standards (STANDARD constructs) and
methods of preparing these, as well as barcoding methods enabling multiplex
analysis of
multiple samples.
8.4 PEPTIDE MODIFICATION WHILE A PEPTIDE IS CAPTURED BY A BINDER.
Some embodiments make use of linkages of polymers or other functional groups
to
either n-terminus or c-terminus (or both) of a peptide, and may involve
chemical reactions with
peptides while the peptide is bound to a specific enrichment reagent (e.g., an
anti-peptide
antibody) which may in turn be bound to a solid support (e.g., a magnetic bead
or column).
The ability to carry out assembly of a heteropolymer or modified peptide
construct while the
peptide is held non-covalently on a solid support is novel and provides an
immense
simplification compared to conventional methods of construct assembly in
stages, which
usually require separative steps between stages.
8.5 STOICHIOMETRIC FLATTENING TO DRAMATICALLY IMPROVE SINGLE
MOLECULE DETECTION EFFICIENCY
Some embodiments make use of specific enrichment to recover a relatively
larger
proportion of a low abundance TARGET peptide and a relatively lower proportion
of a higher
abundance TARGET peptide, thereby reducing the abundance difference between
them, while
preserving information as to their respective abundances in a sample through
the inclusion of
internal standards (STANDARD). This stoichiometric flattening, which can be
achieved by
tuning the amounts of the specific affinity reagents used to capture different
TARGET peptides
at one or more stages, can be used to compress the dynamic range required for
peptide
detection, thereby enabling detection of TARGET peptides having very large
abundance differences in the sample using a detector with smaller dynamic
range (e.g., a
detector capable of counting molecules, and whose precision is thus governed
by counting
statistics requiring detection of some minimum number of molecules in the
lowest abundance
of a series of peptides to be detected).
8.6 POTENTIAL FOR ULTIMATE ASSAY SENSITIVITY.
Efficient enrichment of TARGET peptides from complex samples, combined with
direct detection of single molecules, offers a path to ultimate sensitivity
(i.e., detecting and
counting all the analyte molecules present) using small, inexpensive equipment
and without
sacrificing specificity.
9 EXAMPLES
9.1 EXAMPLE 1: ASSEMBLY OF AN IN-LINE NANOPORE-SEQUENCEABLE
OLIGO:PEPTIDE:OLIGO CONSTRUCT.
A sequenceable construct was assembled by insertion of a tryptic target
peptide
sequence derived from C-reactive protein (proteotypic peptide ESDTSYVSLK,
having a
nominal length of 4nm) into a nanopore sequenceable "loop insertion" VEHICLE
(Figure 36
and Figure 37).
The method makes use of three types of linkage reactions: 1) NHS reaction with
the
two peptide amino groups (n-terminal and lysine epsilon-amino) to introduce
BCN
(bicyclo[6.1.0]nonyne) click groups into the peptide; 2) click chemical
reaction between BCN
on peptide ends with azide (incorporated into the termini of synthetic oligos
A and B); and 3)
enzymatic ligation of oligos assembled using complementary template (oligos D,
E, F).
BCN click groups were added to the amino groups at both ends of the peptide by
mixing
27uL of 10mM synthetic peptide ESDTSYVSLK (Vivitide), 33uL of 1M HEPES buffer
(pH
8.5) and 20uL of 40mM BCN-NHS ester (BroadPharm) in acetonitrile, followed by
incubation
at room temperature for 1hr. This peptide conjugate (BCN-ESDTSYVSLK-BCN) was
diluted
to 100uM peptide in a final concentration of 12.5mM HEPES pH 8.5.
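By way of non-limiting illustration, the concentrations in the BCN labeling reaction can be worked out from the stated volumes and stock concentrations. The helper function is illustrative; the composition of the diluent used to reach 100uM peptide is not given above, so the HEPES value after dilution is computed on the assumption that the diluent contributes no additional HEPES.

```python
def mix_concentrations(components):
    """components: {name: (volume_uL, stock_mM)}.

    Returns (total_volume_uL, {name: final_mM}) after simple mixing.
    """
    total = sum(volume for volume, _ in components.values())
    final = {name: volume * stock / total
             for name, (volume, stock) in components.items()}
    return total, final


total_uL, conc_mM = mix_concentrations({
    "peptide": (27, 10),     # 27uL of 10mM ESDTSYVSLK
    "HEPES": (33, 1000),     # 33uL of 1M HEPES pH 8.5
    "BCN-NHS": (20, 40),     # 20uL of 40mM BCN-NHS ester
})

# Fold dilution needed to bring the peptide to 100uM (0.1mM).
dilution_fold = conc_mM["peptide"] / 0.1
# HEPES after that dilution, assuming an unbuffered diluent.
hepes_after_mM = conc_mM["HEPES"] / dilution_fold
# BCN-NHS ester per peptide amino group (two amino groups per peptide).
bcn_per_amine = conc_mM["BCN-NHS"] / (2 * conc_mM["peptide"])
```

Under these assumptions the reaction contains about 3.4mM peptide and 10mM BCN-NHS ester in 80uL (roughly 1.5 ester molecules per amino group), and the diluted conjugate carries about 12mM HEPES, close to the 12.5mM stated.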
A non-peptidic control insert (endo-BCN-PEG2-NHS ester, BroadPharm) was
dissolved in DMSO at a final concentration of 10mM.
A set of 6 oligonucleotides was designed and synthesized (Integrated DNA
Technologies), combined in equal amounts (16.7 uM final concentration), heated
for 5min at
70C and cooled to create a double-stranded VEHICLE design as shown in Figure
36. Oligo A
comprises a sequence serving as a TARGET tag. Oligos A and B comprise azide
click groups
at their 3' and 5' termini, respectively, and are aligned by hybridization to
complementary
template oligo E to create a gap of about 2.5nm between the azide groups.
These 3 oligos
comprise a structure into which BCN-conjugated double-amino peptides can be
inserted by
click reaction with the two nearby azide groups. The proximity of the azide
groups favors
reaction with the two BCN groups of one peptide molecule, instead of coupling
to two different
peptide molecules.
Oligo C, which comprises a sequence serving as a Sample ID tag, is aligned at
the 3'
end of oligo B by hybridization with complementary template oligo F, whose 5'
(right) end is
complementary to the Sample ID tag, and whose 3' (left) end is complementary
to the
universal construct oligo B. Complementary template oligo D hybridizes with
oligo A to
provide a dA overhang site compatible with a commercial Y-adapter for nanopore
sequencing.
Oligos A and C comprise 5' phosphate groups to facilitate ligation by T4
ligase.
Oligo  Name              Sequence
A      T/S               5'- /5Phos/CCTGAACCTATCCAGTGAGATAACACACAGGC/iAzide
B      Loop follower     5'- /iAzide/ACACAGGCAGCCTACTATGCACCTCATGGAAT
C      Sample ID         5'- /5Phos/CAGTTCCACCGTATAT
D      Y-adapter align   5'- ACTGGATAGGTTCAGGA
E      Loop template     5'- AGTAGGCTGCCTGTGTTTTTTTTTGCCTGTGTGTTATCTC
F      Sample ID linker  5'- ATATACGGTGGAACTGATTCCATGAGGTGCAT
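By way of non-limiting illustration, the hybridization relationships described for the six oligos can be checked by comparing reverse complements of the listed base sequences (modifications such as 5' phosphate and azide groups are omitted). The choice of 16-nucleotide arms reflects the lengths of the sequences as listed and is an observation made here, not a statement from the text above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


A = "CCTGAACCTATCCAGTGAGATAACACACAGGC"
B = "ACACAGGCAGCCTACTATGCACCTCATGGAAT"
C = "CAGTTCCACCGTATAT"
D = "ACTGGATAGGTTCAGGA"
E = "AGTAGGCTGCCTGTGTTTTTTTTTGCCTGTGTGTTATCTC"
F = "ATATACGGTGGAACTGATTCCATGAGGTGCAT"

checks = {
    # Template E holds the 3' end of A and the 5' end of B side by side.
    "E pairs with 3' end of A": revcomp(E).startswith(A[-16:]),
    "E pairs with 5' end of B": revcomp(E).endswith(B[:16]),
    # Linker F bridges the 3' end of B and the Sample ID oligo C.
    "F bridges B and C": revcomp(F) == B[-16:] + C,
    # D pairs with the 5' end of A, leaving one unpaired 3' dA.
    "D pairs with 5' end of A": revcomp(D)[1:] == A[:16],
    "D has a single 3' dA overhang": len(D) == 17 and D[-1] == "A",
}

# Unpaired bases in E between the two arms (the gap for the insert).
spacer_nt = len(E) - 2 * 16
```

All five checks hold for the sequences as listed, and the unpaired spacer in template E corresponds to the gap between the azide groups into which the insert is clicked.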
Insertion experiments were performed by combining 10uL of the double-stranded
oligo
construct with three test inserts (defined below), incubation for 90min at
room temperature,
and dilution to achieve a final construct concentration of 50fmol construct in
25uL:
Sample                     Test Insert
Sample135 - blank          1.7uL HEPES buffer (1M pH 8.5)
Sample144 - non-peptidic   1.7uL BCN-C2-BCN (10mM)
Sample146 - peptide        1.7uL BCN-ESDTSYVSLK-BCN (100uM)
Each construct sample (30uL) was mixed with 5 uL T4 ligase (New England
Biolabs),
12.5uL ligation buffer, and 2.5uL sequencing adapter mix (AMX-F, Oxford
Nanopore) and
allowed to react for 10min to ligate oligo A to the Y-adapter, and oligo B to
oligo C. These
samples were purified by binding to AMPure XP beads, washed twice in Short
Fragment
Buffer and eluted in 7uL of Elution buffer (Oxford Nanopore). Each sample was
finally mixed
with 15uL Sequencing buffer II and 10uL Loading beads II (Oxford Nanopore)
immediately
prior to loading on a separate Flongle chip for sequencing on a MinION device
(Oxford
Nanopore).
The construct design and experimental reads (basecalled sequences from
nanopore
traces) from the experiment are shown in Figure 36. In the design, the sequences of
oligos A, B
and E are underlined for clarity. Downward pointing arrows indicate sites of
oligo ligation.
In the experimental reads, regions that are identical with the design sequence
are underlined.
Sample 135, which was prepared without a BCN-activated insert, generated
sequence
covering the 3' end of the y-adapter and the 5' portion of oligo A, confirming
the effective
ligation of the adapter and oligo A, as expected. Having no insert to connect
the two azide
groups, oligos B and C are missing from the sequenced construct.
Sample 144, which was prepared with a non-peptidic BCN-C2-BCN polyethylene
glycol control insert, generated sequence covering the 3' end of the y-adapter
and the 5' portion
of oligo A, and extensive sequence covering oligos B and C, confirming the
effective ligation
of the adapter and oligo A, and oligo B with oligo C, as expected. The fact
that oligos B and
C are present in the construct demonstrated that the two azide groups were
linked by reaction
with the BCN-C2-BCN insert, used here at high concentration due to its very
limited solubility.
Sample 146, which was prepared with peptide insert BCN-ESDTSYVSLK-BCN,
generated sequence covering the 3' end of the Y-adapter and the 5' portion of
oligo A, and
extensive sequence covering oligos B and C, confirming the effective ligation
of the adapter
and oligo A, and oligo B with oligo C, as expected. The fact that oligos B and
C are present
in the construct demonstrated that the two azide groups were linked by
reaction with the BCN-ESDTSYVSLK-BCN insert, thereby confirming the derivatization
of the peptide with two BCN groups and its reaction with the oligo azide groups to
create a complete oligo-peptide construct capable of passing through a nanopore and
generating correct sequence data from all oligo components.
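By way of illustration only, the interpretation of basecalled reads described for Samples 135, 144 and 146 can be sketched as a check for which oligo segments appear in a read. The oligo sequences below are hypothetical placeholders (the actual design sequences appear in Figure 36), and the exact-substring matching and the `min_match` threshold are simplifying assumptions; real nanopore reads contain errors and would normally be aligned rather than searched for exact matches.

```python
# Hypothetical placeholder sequences; the real designs are those of Figure 36.
OLIGOS = {
    "A": "ACGTTGCAAGCT",
    "B": "GGATCCTTAGCA",
    "C": "TTCAGGCATCGA",
}

def segments_present(read, oligos=OLIGOS, min_match=8):
    """Return the names of oligos having at least `min_match` contiguous
    design bases present in the read (exact matching, an assumption)."""
    found = set()
    for name, seq in oligos.items():
        for start in range(len(seq) - min_match + 1):
            if seq[start:start + min_match] in read:
                found.add(name)
                break
    return found

def classify_read(read):
    """Label a read by the oligo segments it covers."""
    found = segments_present(read)
    if {"A", "B", "C"} <= found:
        # Oligos B and C follow oligo A only if an insert linked the two azides.
        return "complete"
    if found == {"A"}:
        # Adapter and oligo A only, as expected when no insert is present.
        return "oligo_A_only"
    return "unclassified"
```

Under these assumptions a read covering only oligo A corresponds to the no-insert case (Sample 135), while a read covering oligos A, B and C corresponds to a construct in which an insert bridged the two azide groups (Samples 144 and 146).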
Figure 37 presents nanopore traces obtained from single molecules present in
samples
135, 144 and 146. As is typical in nanopore current traces, distinctive
features of the traces
are recognizable but do not coincide perfectly in time (passage of molecules
through a pore
does not proceed perfectly smoothly). The trace produced by the molecule from
Sample 135
includes only the first half of the traces obtained from samples 144 and 146,
as expected since
only the adapter and oligo A are present in Sample 135. Samples 144 and 146
produce longer
traces that are almost identical (as expected since they cover the adapter,
and oligos A, B and
C). However, the traces differ significantly in the central region where the
insert is expected
to appear, and include, for the Sample 146 peptide insert, extreme current
swings indicative of
abrupt opening (current increasing) and closing (current decreasing) of the
nanopore associated
with passage of peptide ESDTSYVSLK.
These results confirm successful assembly of a nanopore-sequenceable oligonucleotide
VEHICLE incorporating an in-line peptide segment.
9.2 EXAMPLE 2: ASSEMBLY OF AN ORIENTED OLIGO-PEPTIDE-OLIGO
CONSTRUCT USING 2-STEP DIGESTION
A 2-step digestion method (shown in Figure 6) is used to discriminate between
peptide
n-terminal and c-terminal linkage of oligos useful in nanopore sequencing,
both to facilitate
transport of the peptide through a nanopore and to provide contextual
information on the
peptide molecule (its status as TARGET or STANDARD, and optionally what BINDER
captured it during enrichment, and what sample the peptide was derived from).
The steps of
the example are shown in Figure 12. The final construct produced in this
example passes
through a nanopore starting at the 5' end, producing nanopore current traces
(squiggles) that
are decoded to provide sequences of the Adapter, a Sample ID, a BINDER ID, the
TARGET/STANDARD identifying flag sequence, and finally a squiggle
characterizing a
peptide linked in-line with the oligos, with its c-terminus in the 5'
direction, and a following
DNA Trailer segment. Modifications of this workflow can omit the Sample ID if
multiple
samples are not being multiplexed together, and can omit the BINDER-ID if this
information
is not required.
Sample proteins are digested with the enzyme Lys-C, and the amino groups of
the
resulting peptides modified by reaction with NHS-BCN to introduce BCN click
groups at both
the n-terminus and the side chain amino group of peptides having c-terminal
lysine residues
(i.e., most peptides resulting from Lys-C digestion). After this reaction has
reached
completion, amine-modified magnetic beads are added in sufficient quantity to
react with and
sequester any excess NHS-BCN reagent, and subsequently removed from the
digest. Trypsin
is then added to cleave peptides with internal Arg residues, resulting in new
free amino groups
on the n-termini thus created.
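The effect of the 2-step digestion on peptide end groups can be illustrated with a simple in silico model. This is a sketch under stated assumptions, not a description of any actual software used: cleavage is modeled as occurring after every Lys (Lys-C) and then after every Arg (trypsin), proline effects and missed cleavages are ignored, BCN-modified lysines are assumed not to be cleaved by trypsin, and the function and field names are hypothetical.

```python
def cleave_after(sequence, residues):
    """Split a peptide sequence after each residue found in `residues`."""
    fragments, current = [], ""
    for aa in sequence:
        current += aa
        if aa in residues:
            fragments.append(current)
            current = ""
    if current:
        fragments.append(current)
    return fragments

def two_step_digest(protein):
    """Model Lys-C digestion, NHS-BCN labeling of all amino groups,
    then trypsin cleavage at internal Arg residues."""
    results = []
    for lysc_peptide in cleave_after(protein, "K"):
        # After labeling, BCN sits on the n-terminus and on each Lys side chain.
        fragments = cleave_after(lysc_peptide, "R")
        for i, frag in enumerate(fragments):
            results.append({
                "sequence": frag,
                # Only the first fragment keeps the BCN-labeled n-terminus;
                # later fragments carry a new free amine created by trypsin.
                "n_term": "BCN" if i == 0 else "free amine",
                "lys_side_chain_bcn": frag.count("K"),
                # Oriented: free n-terminus plus BCN on the c-terminal Lys.
                "oriented": i > 0 and frag.endswith("K"),
            })
    return results
```

For a hypothetical sequence such as `"AGSRLVEKTPDK"`, the model yields `AGSR` (BCN at the n-terminus only), `LVEK` (free n-terminus, BCN on the c-terminal lysine) and `TPDK` (BCN at both ends), showing that only peptides containing an internal Arg give rise to a fragment whose two ends can be addressed separately.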
As shown schematically in Figure 12A, sample peptides with c-terminal BCN
(attached
to the c-terminal lysine side chain amino group) are reacted with an
oligonucleotide
comprising a sequence defined as the label (or flag) for sample-derived
peptides ("T ID") and
having a 5' phosphate group and a 3' azide functionality. The BCN-peptide and
azide-oligo
then react to form the peptide:oligo construct shown in Figure 12B. A
previously made
synthetic STANDARD construct comprising the same peptide sequence joined to an
oligo
comprising a sequence defined as the label (or flag) for STANDARD peptides ("S
ID") is
added in known amount to act as an internal reference for quantitation of the
TARGET,
creating a standardized sample. The distinct T ID and S ID sequences comprise
a region of
identical sequence capable of hybridizing with a region of the Assembler
molecule shown
below. Cognate TARGET and STANDARD peptide:oligo constructs are enriched by
binding
to a cognate BINDER attached to magnetic beads (Figure 12C). The BINDER has an
attached
oligo sequence "B tag" identifying the specificity of the BINDER (i.e., which
peptide
sequence it captures). Unbound peptides are washed away. Multiple TARGET
peptides
together with their cognate STANDARDs are bound and enriched from the
standardized
sample by a combination of their respective cognate BINDERs.
Three partially double-stranded DNA constructs are added to the BINDER-captured
peptide:oligo constructs: 1) an "Assembler" bridge strand comprising a sequence
complementary to the BINDER B tag and a second sequence complementary to an oligo
having a sequence that identifies the BINDER specificity; 2) an oligo comprising a sequence
that identifies the sample from which the peptides are derived (to enable multiplexing of
samples for nanopore readout) and a short complementary sequence "S"; and 3) a
conventional nanopore sequencing Y-adapter comprising a DNA motor protein. The
Assembler hybridizes to i) the B tag of the BINDER having the complementary sequence; ii)
the sequence shared by T ID and S ID sequences; and iii) a universal subsequence shared by
the set of Sample ID sample barcodes. The Sample ID barcodes further comprise an A/T
overhang matching the requirements for ligation to the sequencing Y-adapter. Each of the
S ID, T ID, BINDER ID, and Sample ID oligos comprises a 5' phosphate group. Once the
oligos are assembled on the Assembler based on hybridization of complementary sequences
(Figure 12E), a DNA ligase (e.g., T4 ligase) is used to covalently link the Adapter, Sample ID,
BINDER ID, T/S ID and peptide molecules into a continuous linear construct (ligation is
indicated in the figure by >H<). In Figures 12E-G the presence of a mixture of
TARGET and
STANDARD (T ID and S ID) versions of a peptide present on the enriching BINDER
is
indicated as "T/S ID".
The remaining free amino group at the peptide n-terminus is reacted with NHS-
BCN,
followed by removal of excess NHS-BCN using amine-modified magnetic beads (as
carried
out previously), and finally linkage to a further "Trailer" oligo. The Trailer
functions with the
DNA motor to regulate passage of the preceding peptide through the nanopore
and allow
recording of a "squiggle" trace capable of identifying the peptide among the
set of expected
TARGETs.
Figure 13A shows the sequences of oligo components in a specific
implementation of
the workflow used in this example up to, but not including, ligation of the
nanopore Y-Adapter.
Figure 13B shows the construct after dissociation of the peptide from the
BINDER and the
B tag from the Assembler, at which point the BINDER beads are removed and the
construct
is ligated with a nanopore adapter using an A/T overhang (ligation indicated
in the figure by
>11).
9.3 EXAMPLE 3: PREPARATION OF AN ENRICHED STANDARDIZED PEPTIDE
LIBRARY FOR SINGLE MOLECULE IMAGING DETECTION
In this example, a simplified protocol is used to prepare TARGET peptides
EGYYGYTGAFR from serum transferrin (Tf) and GFVEPDHYVVVGAQR from soluble
transferrin receptor (sTfR) for detection by super-resolution single molecule microscopy. A
set of samples of 10uL human plasma (or equivalently 20uL of whole blood) were digested
with trypsin according to a published automated protocol (13).
The selected peptides have a c-terminal Arginine residue and thus have a
single amino
group at their n-terminus available for reaction with an NHS-ester reagent.
The tryptic sample
digest is reacted with a quantity of commercially-available endo-BCN-PEG3-NHS
ester
(BroadPharm) equal to a 1.5-fold molar excess over the peptide amino groups in
the digest.
This reagent was dissolved in acetonitrile to overcome its limited solubility
in aqueous solvents
and added to the digest. After completion of the NHS reaction (60min at room
temperature) a
2-fold excess of TARGET ID oligonucleotide tag comprising a 3' azide group and
a 5'
phosphate was added and allowed to react overnight. These reactions can be
executed
sequentially as described, or simultaneously as indicated in Figure 38A.
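The reagent quantities implied by a stated fold molar excess can be worked out with simple arithmetic, sketched below. The amine content of the digest and the stock concentration used in the usage note are hypothetical values chosen only for illustration; neither is given in this example.

```python
def reagent_nmol(amine_nmol, fold_excess):
    """Amount of reagent (nmol) giving the stated molar excess over amino groups."""
    return amine_nmol * fold_excess

def stock_volume_ul(nmol, stock_mM):
    """Volume of stock (uL) delivering `nmol` of reagent.
    A 1 mM solution contains 1 nmol per uL."""
    return nmol / stock_mM
```

For instance, assuming a digest containing 100 nmol of peptide amino groups, a 1.5-fold excess corresponds to `reagent_nmol(100, 1.5)`, i.e. 150 nmol of BCN-PEG3-NHS ester, which would be delivered by 3 uL of an assumed 50 mM stock in acetonitrile (`stock_volume_ul(150, 50)`).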
STANDARD constructs of the two target peptides are prepared separately using
synthetic peptides having the same sequences as the TARGETs, and have the same
structure
as their cognate TARGET constructs generated in the foregoing steps, with the
exception that
the TARGET ID (or T ID) flag is replaced by the STANDARD ID (S ID) flag oligo
sequence. In this example, the TARGET and STANDARD tags comprise a 5' region
in which
their sequences are the same, and a 3' region in which they have different
sequences capable
of distinguishing one from another. STANDARD constructs are added to the
sample digests
in molar amounts approximately equal to the amounts of the respective target
peptides in a
typical plasma sample (in this case the same volume of STANDARD is added to
each sample
digest). Following addition of the STANDARD constructs (Figure 38B), the
digest samples
constitute standardized digests (with respect to these two peptides).
Specific BINDERs (rabbit anti-peptide antibodies) were generated to each of these
peptides and covalently immobilized on magnetic beads by reaction with tosyl Dynabeads
(ThermoFisher) according to the manufacturer's instructions, at an approximate load of 1ug
antibody per 5uL of bead suspension. 5uL of beads with the sTfR peptide BINDER and 1uL
of beads with the Tf BINDER are added to the standardized digests and incubated at room
temperature for 30min, after which the beads are collected by magnets placed beside the digest
vessels, the digests are removed, and the beads are washed in buffer (PBS) twice using an
automated liquid handling system as described (4). The resulting samples of TARGET and
STANDARD constructs for each of the two peptides (Figure 38C) constitute enriched sample
digests.
Constructs present in the different sample digests are labeled by addition of
Sample ID
oligos having a 3' region of fixed sequence (the same sequence for all Sample
ID tags) and a
5' region of variable sequence that encodes the identity of each sample. In
this example the
Sample ID oligos additionally comprise a tetrazine (Tz) click group at the 5'
end.
An additional synthetic oligo is added (the "Assembler" in Figure 38C) that
has a 3'
region complementary to the 3' end of the Sample ID oligo, and a 5' region
complementary
to the 5' end present in both the TARGET and STANDARD tags (i.e., the pool of
T/S ID
tags). The Assembler hybridizes with both the T/S ID oligo tags and the Sample
ID oligos
present in each sample, aligning the Sample ID 3' end with the 5' phosphate at
the end of the
T/S ID components of the antibody-bound constructs. T4 DNA ligase (New England
Biolabs)
is used according to the manufacturer's instructions to ligate the Sample ID
oligos to the
T/S ID oligos, creating continuous covalent constructs each comprising a
peptide, a T/S ID
oligo flag and a Sample ID oligo flag. These additions and reactions are
carried out while the
initial peptide constructs remain bound to the BINDERs on the magnetic beads,
allowing the
use of small reaction volumes and efficient washing between steps.
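The relationship between the Sample ID oligo, the T/S ID tag and the Assembler can be sketched as a conventional splint design, in which the Assembler is the reverse complement of the junction formed by the 3' fixed region of the Sample ID oligo and the shared 5' region of the T/S ID tags. All sequences and function names below are hypothetical placeholders introduced for illustration; the actual oligo sequences are not given in this example.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]

def make_sample_id_oligo(variable_5p, fixed_3p):
    """Sample ID oligo: 5' variable region (sample code) + 3' fixed region."""
    return variable_5p + fixed_3p

def design_assembler(fixed_3p, ts_shared_5p):
    """Splint spanning the ligation junction. Its 5' region pairs with the
    shared 5' end of the T/S ID tags and its 3' region pairs with the fixed
    3' end of the Sample ID oligo."""
    junction = fixed_3p + ts_shared_5p  # 5'->3' on the ligated strand
    return reverse_complement(junction)

def ligate(sample_id_oligo, ts_id_oligo):
    """Join the Sample ID 3' end to the 5' phosphate of the T/S ID tag."""
    return sample_id_oligo + ts_id_oligo
```

Because the reverse complement of a concatenation reverses the order of its parts, the resulting Assembler reads, 5' to 3', as the complement of the T/S ID shared region followed by the complement of the Sample ID fixed region, consistent with the orientation described above.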
Following ligation, the beads are washed free of ligation reaction components,
and the
enriched constructs are eluted from the BINDERs on the beads using an acidic
eluent (0.5%
formic acid/ 0.03% CHAPS detergent). After removal of the beads from the
eluates, the eluates
are neutralized with Tris buffer, and the eluates (now labeled with Sample ID
tags) are
combined to create a pooled sample for analysis.
The pooled construct sample is diluted and loaded onto a clean glass slide
that was
previously derivatized with TCO-PEG3-triethoxysilane at low density (about 1-10 TCO
groups per square micron) and passivated to reduce non-specific binding (80). Following
reaction of the construct 5'-Tz click groups with the TCO click groups on the slide surface, the
slide is washed, assembled into a flow cell and positioned in the light path
of a fluorescence
microscope equipped with total internal reflection (TIRF) optics, laser
illumination and a high-
efficiency camera.
A series of recognition reagents are labeled with Cy5 fluorophore: the peptide
BINDERs (each capable of specifically recognizing one of the peptides), two
oligos
respectively complementary to the unique portions of the TARGET and STANDARD
tags
(distinguishing sample-derived TARGET peptides from added internal STANDARDs),
and
oligos complementary to the unique portions of the SAMPLE ID tags (identifying
the
respective samples from which each construct originated). Each labeled
recognition reagent
is passed over the immobilized constructs in sequence, and the resulting bound Cy5
signals are
recorded as an image. The slide is washed with acid eluent after each antibody
recognition
reagent and heated to 60C after each oligo recognition reagent in order to
remove all
fluorescence before addition of the following recognition reagent. The
sequential images are
aligned and interpreted as indicated schematically in Figure 31 to provide
counts of each
peptide and its internal standard in each sample. From these counts, the
ratios of TARGET
molecules to STANDARD molecules in each sample can be calculated, and
multiplied by the
amounts of STANDARDs added to each sample to arrive at a measure of the amount
of each
peptide (and its parent protein) in each sample.
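The quantitation step described above amounts to multiplying the TARGET:STANDARD count ratio by the known amount of STANDARD added. A minimal sketch follows, assuming each immobilized construct has already been decoded from the sequential images into a peptide identity, a T or S flag, and a sample identity; the record layout and function names are hypothetical.

```python
from collections import Counter

def count_constructs(spots):
    """Tally decoded constructs. Each spot is a dict with keys
    'sample', 'peptide' and 'flag' ('T' for TARGET, 'S' for STANDARD)."""
    counts = Counter()
    for spot in spots:
        counts[(spot["sample"], spot["peptide"], spot["flag"])] += 1
    return counts

def target_amount(counts, sample, peptide, standard_amount):
    """TARGET amount = (TARGET count / STANDARD count) x STANDARD amount added."""
    n_target = counts[(sample, peptide, "T")]
    n_standard = counts[(sample, peptide, "S")]
    if n_standard == 0:
        raise ValueError("no STANDARD molecules counted for this peptide")
    return n_target / n_standard * standard_amount
```

As a worked example with invented numbers, 30 TARGET and 20 STANDARD molecules counted for a peptide in a sample to which 100 fmol of STANDARD was added would give 30/20 x 100 = 150 fmol of the TARGET peptide.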
9.4 EXAMPLE 4: PREPARATION OF AN ENRICHED PEPTIDE LIBRARY FOR
DEGRADATIVE SINGLE MOLECULE DETECTION
The method of the invention was adapted (as shown in Figure 39) to provide
peptide:oligo constructs suitable for analysis in a detection platform in
which peptide sequence
is reverse translated into DNA by sequential recognition and removal of n-terminal
amino acids (81). In this application it is essential that the peptide n-terminus is
accessible for chemical
modification and recognition by specific affinity reagents, and an associated
DNA oligo with
a free 3' end is required to which short base sequences can be added recording
the successive
amino acids detected and removed from the peptide in a cyclical reverse
translation process.
This is achieved by modifying the structure of the TARGET/STANDARD identification code
oligos (T ID and S ID oligos) to provide for click attachment to an internal, rather than a 3'
terminal, azide-labeled base (Figure 39A and B). The peptide is thereby connected to a site 2
or more bases in from the 3' end of the oligo (leaving the 3' terminus accessible for addition
of bases, e.g., by DNA polymerase copying a strand associated with one of a series of
n-terminal amino acid recognition reagents), but not within the 5' end that hybridizes with the
Assembler.
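The positional constraint on the internal azide-labeled base can be expressed as a simple range check. The sketch below reflects one reading of that constraint, namely that at least two bases must lie on the 3' side of the attachment site and that the site must fall outside the 5' region that hybridizes with the Assembler; the function name, the 0-based indexing and the example lengths are assumptions made for illustration.

```python
def valid_attachment_sites(oligo_length, assembler_region_length, min_from_3p=2):
    """Return 0-based positions (counted from the 5' end) at which the
    azide-labeled base may be placed."""
    # First base lying 3' of the Assembler-hybridizing region.
    first = assembler_region_length
    # Last position leaving at least `min_from_3p` bases on its 3' side.
    last = oligo_length - 1 - min_from_3p
    return list(range(first, last + 1))
```

For a hypothetical 20-base tag whose first 10 bases hybridize with the Assembler, positions 10 through 17 would satisfy both conditions under this reading.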
The process of reverse translation adds bases to the 3' end of the oligo
construct (81)
in the process of reading out the identity of the n-terminal amino acid as
recognized by a
specific BINDER. When this cyclical process is stopped (e.g., when sufficient
amino acids
have been decoded and bases added to unambiguously identify the TARGET
sequence, or
determine that the peptide is not a relevant sequence), the extended oligo is
cleaved from the
substrate, and any necessary adapter sequences added to allow its introduction
into a
conventional DNA sequencer for analysis (e.g., using sequencing by synthesis
as developed
and commercialized by Illumina). The peptide may be removed from the construct
if it
presents a barrier to copying the oligo.
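The stopping criterion mentioned above (decoding only as many amino acids as are needed to identify the TARGET unambiguously) can be illustrated by computing, for a set of expected TARGET sequences, the shortest n-terminal prefix that distinguishes each from the others. This is a sketch assuming error-free residue recognition, which the cited platform (81) does not necessarily provide; the function name is hypothetical, and the peptide sequences are those named elsewhere in these examples.

```python
def residues_needed(targets):
    """For each target peptide, return the smallest number of n-terminal
    residues that distinguishes it from every other target in the set."""
    needed = {}
    for target in targets:
        others = [o for o in targets if o != target]
        k = 1
        # Extend the prefix until no other target shares it.
        while k < len(target) and any(o[:k] == target[:k] for o in others):
            k += 1
        needed[target] = k
    return needed
```

Applied to EGYYGYTGAFR, GFVEPDHYVVVGAQR and ESDTSYVSLK, the first and third peptides share an n-terminal E and therefore require two decoded residues, whereas the second is identified by its first residue alone.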
REFERENCES
1. N. L. Anderson, The Clinical Plasma Proteome: A Survey of Clinical Assays
for Proteins in
Plasma and Serum. Clin. Chem. 56, 177-185 (2010).
2. A. N. Hoofnagle, M. H. Wener, The fundamental flaws of immunoassays and
potential
solutions using tandem mass spectrometry. J. Immunol. Methods 347, 3-11
(2009).
3. N. L. Anderson, N. G. Anderson, L. R. Haines, D. B. Hardie, T. W. Pearson,
Mass
spectrometric quantitation of peptides and proteins using Stable Isotope
Standards and Capture
by Anti-Peptide Antibodies (SISCAPA). J. Proteome Res. 3, 235-244 (2004).
4. M. Razavi, N. Leigh Anderson, M. E. Pope, R. Yip, T. W. Pearson, High
precision
quantification of human plasma proteins using the automated SISCAPA Immuno-MS
workflow. New Biotechnol. 33, 494-502 (2016).
5. C. V. Cheng, DISCREPANCIES BETWEEN SISCAPA LC-MS/MS AND ROCHE
COBAS e601 THYROGLOBULIN REVEAL UNEXPECTEDLY HIGH RATE OF
HETEROPHILE ANTIBODY INTERFERENCE IN IMMUNOASSAY., 1.
6. A. N. Hoofnagle, M. Y. Roth, Improving the Measurement of Serum
Thyroglobulin With
Mass Spectrometry. J. Clin. Endocrinol. Metab. 98, 1343-1352 (2013).
7. K. K. Mangalaparthi, S. Chavan, A. K. Madugundu, S. Renuse, P. M.
Vanderboom, A. D.
Maus, J. Kemp, B. R. Kipp, S. K. Grebe, R. J. Singh, A. Pandey, A SISCAPA-based approach
for detection of SARS-CoV-2 viral antigens from clinical samples. Clin. Proteomics 18, 25
(2021).
8. N. L. Anderson, N. G. Anderson, The human plasma proteome: history,
character, and
diagnostic prospects. Mol. Cell. Proteomics MCP 1, 845-867 (2002).
9. L. Anderson, M. Razavi, M. E. Pope, R. Yip, L. Cameron, A. Bassini-Cameron,
T. W.
Pearson, Precision multiparameter tracking of inflammation on timescales of
hours to years
using serial dried blood spots. Bioanalysis 12, 937-955 (2020).
10. J. O. Becker, A. N. Hoofnagle, Replacing immunoassays with tryptic digestion-peptide
immunoaffinity enrichment and LC-MS/MS. Bioanalysis 4, 281-290 (2012).
11. S. Gustafsdottir, E. Schallmeiner, S. Fredriksson, M. Gullberg, O. Soderberg, M. Jarvius,
J. Jarvius, M. Howell, U. Landegren, Proximity ligation assays for sensitive
and specific
protein analyses. Anal. Biochem. 345, 2-9 (2005).
12. A. Joshi, M. Mayr, In Aptamers They Trust: Caveats of the SOMAscan
Biomarker
Discovery Platform From SomaLogic. Circulation 138, 2482-2485 (2018).
13. M. Razavi, N. Leigh Anderson, M. E. Pope, R. Yip, T. W. Pearson, High
precision
quantification of human plasma proteins using the automated SISCAPA Immuno-MS
workflow. New Biotechnol. 33, 494-502 (2016).
14. M. Razavi, N. L. Anderson, R. Yip, M. E. Pope, T. W. Pearson, Multiplexed
longitudinal
measurement of protein biomarkers in DBS using an automated SISCAPA workflow.
Bioanalysis 8, 1597-1609 (2016).
15. M. Razavi, L. E. Frick, W. A. Lamarr, M. E. Pope, C. A. Miller, N. L.
Anderson, T. W.
Pearson, High-Throughput SISCAPA Quantitation of Peptides from Human Plasma
Digests
by Ultrafast, Liquid Chromatography-Free Mass Spectrometry. J. Proteome Res.,
121119143208008-8 (2012).
16. J. Zecha, S. Satpathy, T. Kanashova, S. C. Avanessian, M. H. Kane, K. R.
Clauser, P.
Mertins, S. A. Carr, B. Kuster, TMT Labeling for the Masses: A Robust and Cost-efficient,
In-solution Labeling Approach. Mol. Cell. Proteomics 18, 1468-1478 (2019).
17. N. Z. Fantoni, A. H. El-Sagheer, T. Brown, A Hitchhiker's Guide to Click-
Chemistry with
Nucleic Acids. Chem. Rev. 121, 7122-7154 (2021).
18. J. Swaminathan, A. A. Boulgakov, E. M. Marcotte, D. B. Searls, Ed. A
Theoretical
Justification for Single Molecule Peptide Sequencing. PLOS Comput. Biol. 11,
e1004080
(2015).
19. J. A. Alfaro, P. Bohlander, M. Dai, M. Filius, C. J. Howard, X. F. van
Kooten, S. Ohayon,
A. Pomorski, S. Schmid, A. Aksimentiev, E. V. Anslyn, G. Bedran, C. Cao, M.
Chinappi, E.
Coyaud, C. Dekker, G. Dittmar, N. Drachman, R. Eelkema, D. Goodlett, S. Hentz,
U.
Kalathiya, N. L. Kelleher, R. T. Kelly, Z. Kelman, S. H. Kim, B. Kuster, D.
Rodriguez-Larrea,
S. Lindsay, G. Maglia, E. M. Marcotte, J. P. Marino, C. Masselon, M. Mayer, P.
Samaras, K.
Sarthak, L. Sepiashvili, D. Stein, M. Wanunu, M. Wilhelm, P. Yin, A. Meller,
C. Joo, The
emerging landscape of single-molecule protein sequencing technologies. Nat.
Methods 18,
604-617 (2021).
20. C. G. Brown, J. Clarke, Nanopore development at Oxford Nanopore. Nat.
Biotechnol. 34,
810-811 (2016).
21. M. Jain, H. E. Olsen, B. Paten, M. Akeson, The Oxford Nanopore MinION:
delivery of
nanopore sequencing to the genomics community. Genome Biol. 17, 239 (2016).
22. F. Hu, B. Angelov, S. Li, N. Li, X. Lin, A. Zou, Single-Molecule Study of
Peptides with
the Same Amino Acid Composition but Different Sequences by Using an Aerolysin
Nanopore.
ChemBioChem 21, 2467-2473 (2020).
23. S. Yan, J. Zhang, Y. Wang, W. Guo, S. Zhang, Y. Liu, J. Cao, Y. Wang, L.
Wang, F. Ma,
P. Zhang, H.-Y. Chen, S. Huang, Single Molecule Ratcheting Motion of Peptides
in a
Mycobacterium smegmatis Porin A (MspA) Nanopore. Nano Lett. 21, 6703-6710
(2021).
24. M. Miyagi, S. Takiguchi, K. Hakamada, M. Yohda, R. Kawano, Single
polypeptide
detection using a translocon EXP2 nanopore. PROTEOMICS , 2100070 (2021).
25. S. Biswas, W. Song, C. Borges, S. Lindsay, P. Zhang, Click Addition of a
DNA Thread to
the N-Termini of Peptides for Their Translocation through Solid-State
Nanopores. ACS Nano
9, 9652-9664 (2015).
26. M. A. V. Fahie, Ed., Nanopore Technology: Methods and Protocols (Springer
US, New
York, NY, 2021; http://link.springer.com/10.1007/978-1-0716-0806-7).
27. J. Zuo, N.-N. Song, J. Wang, X. Zhao, M.-Y. Cheng, Q. Wang, W. Tang, Z. Yang, K. Qiu,
Review: Single-Molecule Sensors Based on Protein Nanopores. J. Electrochem. Soc. 168,
126502 (2021).
28. T. Albrecht, Single-Molecule Analysis with Solid-State Nanopores. Annu.
Rev. Anal.
Chem. 12, 371-387 (2019).
29. L. Restrepo-Perez, S. John, A. Aksimentiev, C. Joo, C. Dekker, SDS-
assisted protein
transport through solid-state nanopores. Nanoscale 9, 11685-11693 (2017).
30. P. Mallick, High-density and scalable protein arrays for single-molecule
proteomic studies.
, 33.
31. J. D. Egertson, D. DiPasquo, A. Killeen, V. Lobanov, S. Patel, P. Mallick,
A theoretical
framework for proteome-scale single-molecule protein identification using
multi-affinity
protein binding reagents (Systems Biology, 2021;
http://biorxiv.org/lookup/doi/10.1101/2021.10.11.463967).
32. C. M. Stawicki, T. E. Rinker, M. Burns, S. S. Tonapi, R. P. Galimidi, D.
Anumala, J. K.
Robinson, J. S. Klein, P. Mallick, Modular fluorescent nanoparticle DNA probes
for detection
of peptides and proteins. Sci. Rep. 11, 19921 (2021).
33. J. van Ginkel, M. Filius, M. Szczepaniak, P. Tulinski, A. S. Meyer, C.
Joo, Single-molecule
peptide fingerprinting. Proc. Natl. Acad. Sci. 115, 3338-3343 (2018).
34. J. M. Hong, M. Gibbons, A. Bashir, D. Wu, S. Shao, Z. Cutts, M. Chavarha,
Y. Chen, L.
Schiff, M. Foster, V. A. Church, L. Ching, S. Ahadi, A. Hieu-Thao Le, A. Tran,
M. Dimon,
M. Coram, B. Williams, P. Jess, M. Berndl, A. Pawlosky, ProtSeq: Toward high-
throughput,
single-molecule protein sequencing via amino acid conversion into DNA
barcodes. iScience
25, 103586 (2022).
35. B. D. Reed, M. J. Meyer, V. Abramzon, O. Ad, P. Adcock, F. R. Ahmad, G.
Alppay, J. A.
Ball, J. Beach, D. Belhachemi, A. Bellofiore, M. Bellos, J. F. Beltran, A.
Betts, M. W. Bhuiya,
K. Blacklock, R. Boer, D. Boisvert, N. D. Brault, A. Buxbaum, S. Caprio, C.
Choi, T. D.
Christian, R. Clancy, J. Clark, T. Connolly, K. F. Croce, R. Cullen, M. Davey,
J. Davidson, M.
M. Elshenawy, M. Ferrigno, D. Frier, S. Gudipati, S. Hamill, Z. He, S. Hosali,
H. Huang, L.
Huang, A. Kabiri, G. Kriger, B. Lathrop, A. Li, P. Lim, S. Liu, F. Luo, C. Lv,
X. Ma, E.
McCormack, M. Millham, R. Nani, M. Pandey, J. Parillo, G. Patel, D. H. Pike,
K. Preston, A.
Pichard-Kostuch, K. Rearick, T. Rearick, M. Ribezzi-Crivellari, G. Schmid, J.
Schultz, X. Shi,
B. Singh, N. Srivastava, S. F. Stewman, T. R. Thurston, P. Trioli, J. Tullman,
X. Wang, Y.-C.
Wang, E. A. G. Webster, Z. Zhang, J. Zuniga, S. S. Patel, A. D. Griffiths, A.
M. van Oijen, M.
McKenna, M. D. Dyer, J. M. Rothberg, Real-time dynamic single-molecule protein
sequencing
on an integrated semiconductor device (Biophysics, 2022;
http://biorxiv.org/lookup/doi/10.1101/2022.01.04.475002).
36. Y. Yao, M. Docter, J. van Ginkel, D. de Ridder, C. Joo, Single-molecule
protein sequencing
through fingerprinting: computational assessment. Phys. Biol. 12, 055003
(2015).
37. E. T. Hernandez, J. Swaminathan, E. M. Marcotte, E. V. Anslyn, Solution-
phase and solid-
phase sequential, selective modification of side chains in KDYWEC and KDYWE as
models
for usage in single-molecule protein sequencing. New J. Chem. 41, 462-469
(2017).
38. X. Qu, D. Wu, L. Mets, N. F. Scherer, Nanometer-localized multiple single-
molecule
fluorescence microscopy. Proc. Natl. Acad. Sci. 101, 11298-11303 (2004).
39. Q. Xu, M. R. Schlabach, G. J. Hannon, S. J. Elledge, Design of 240,000
orthogonal 25mer
DNA barcode probes. Proc. Natl. Acad. Sci. 106, 2289-2294 (2009).
40. T. Buschmann, L. V. Bystrykh, Levenshtein error-correcting barcodes for
multiplexed
DNA sequencing. BMC Bioinformatics 14, 272 (2013).
41. J. A. Hawkins, S. K. Jones, I. J. Finkelstein, W. H. Press, Indel-correcting DNA barcodes
for high-throughput sequencing. Proc. Natl. Acad. Sci. 115 (2018),
doi:10.1073/pnas.1802640115.
42. P. Somervuo, P. Koskinen, P. Mei, L. Holm, P. Auvinen, L. Paulin,
BARCOSEL: a tool
for selecting an optimal barcode set for high-throughput sequencing. BMC
Bioinformatics 19,
257 (2018).
43. R. R. Wick, L. M. Judd, K. E. Holt, Performance of neural network
basecalling tools for
Oxford Nanopore sequencing. Genome Biol. 20, 129 (2019).
44. F. Schueder, E. M. Unterauer, M. Ganji, R. Jungmann, DNA-Barcoded
Fluorescence
Microscopy for Spatial Omics. PROTEOMICS 20, 1900368 (2020).
45. K. H. Chen, A. N. Boettiger, J. R. Moffitt, S. Wang, X. Zhuang, Spatially
resolved, highly
multiplexed RNA profiling in single cells. Science 348, aaa6090 (2015).
46. R. W. Hamming, Error Detecting and Error Correcting Codes. Bell Syst.
Tech. J. 29, 147-
160 (1950).
47. D. T. Flood, C. Kingston, J. C. Vantourout, P. E. Dawson, P. S. Baran, DNA
Encoded
Libraries: A Visitor's Guide. Isr. J. Chem. 60, 268-280 (2020).
48. K. Miyamoto, W. Aoki, Y. Ohtani, N. Miura, S. Aburaya, Y. Matsuzaki, K.
Kajiwara, Y.
Kitagawa, M. Ueda, M. Antopolsky, Ed. Peptide barcoding for establishment of
new types of
genotype-phenotype linkages. PLOS ONE 14, e0215993 (2019).
49. A. Magi, R. Semeraro, A. Mingrino, B. Giusti, R. D'Aurizio, Nanopore
sequencing data
analysis: state of the art, applications and challenges. Brief Bioinform.
(2017),
doi:10.1093/bib/bbx062.
50. M. J. MacCoss, J. Alfaro, M. Wanunu, D. A. Faivre, N. Slavov, Sampling the
proteome by
emerging single-molecule and mass-spectrometry methods. , is.
51. E. Lerner, A. Barth, J. Hendrix, B. Ambrose, V. Birkedal, S. C. Blanchard,
R. Börner, H.
Sung Chung, T. Cordes, T. D. Craggs, A. A. Deniz, J. Diao, J. Fei, R. L.
Gonzalez, I. V.
Gopich, T. Ha, C. A. Hanke, G. Haran, N. S. Hatzakis, S. Hohng, S.-C. Hong, T.
Hugel, A.
Ingargiola, C. Joo, A. N. Kapanidis, H. D. Kim, T. Laurence, N. K. Lee, T.-H.
Lee, E. A.
Lemke, E. Margeat, J. Michaelis, X. Michalet, S. Myong, D. Nettels, T.-O.
Peulen, E. Ploetz,
Y. Razvag, N. C. Robb, B. Schuler, H. Soleimaninejad, C. Tang, R. Vafabakhsh,
D. C. Lamb,
C. A. Seidel, S. Weiss, FRET-based dynamic structural biology: Challenges,
perspectives and
an appeal for open-science practices. eLife 10, e60416 (2021).
52. N. L. Anderson, C. L. Hunter, Quantitative mass spectrometric multiple
reaction
monitoring assays for major plasma proteins. Mol. Cell. Proteomics MCP 5, 573-
588 (2006).
53. D. Branton, D. W. Deamer, Nanopore sequencing: an introduction (World
Scientific, New
Jersey, 2018).
54. T. Aksel, H. Qian, P. Hao, P. F. Indermuhle, C. Inman, S. Paul, K. Chen,
R. Seghers, J. K.
Robinson, M. De Garate, B. Nortman, J. Tan, S. Hendricks, S. Sankar, P.
Mallick, High-density
and scalable protein arrays for single-molecule proteomic studies
(Bioengineering, 2022;
http://biorxiv.org/lookup/doi/10.1101/2022.05.02.490328).
55. A. N. Hoofnagle, J. O. Becker, M. H. Wener, J. W. Heinecke, Quantification of
thyroglobulin, a low-abundance serum protein, by immunoaffinity peptide enrichment and
tandem mass spectrometry. Clin. Chem. 54, 1796-1804 (2008).
56. R. Beardsley, J. Karty, Enhancing the intensities of lysine-terminated tryptic peptide ions
in matrix-assisted laser desorption/ionization mass spectrometry. Rapid Commun. Mass Spectrom.
(2000) (available at
http://onlinelibrary.wiley.com/doi/10.1002/1097-0231(20001215)14:23%3C2147::AID-RCM145%3E3.0.CO;2-M/full).
57. S. Saveliev, M. Bratz, R. Zubarev, M. Szapacs, H. Budamgunta, M. Urh,
Trypsin/Lys-C
protease mix for enhanced protein mass spectrometry analysis. Nat. Methods 10,
i¨ii (2013).
58. Q. Li, Y. Feng, M.-J. Tan, L.-H. Zhai, Evaluation of Endoproteinase Lys-C/Trypsin
Sequential Digestion Used in Proteomics Sample Preparation. Chin. J. Anal. Chem. 45,
316-321 (2017).
59. C. Ricos, V. Alvarez, F. Cava, Current databases on biological variation:
pros, cons and
progress. Scand. J. Clin. Lab. Invest. 59, 491-500 (1999).
60. N. L. Anderson, A. Jackson, D. Smith, D. Hardie, C. Borchers, T. W.
Pearson, SISCAPA
peptide enrichment on magnetic beads using an in-line bead trap device. 8, 995-1005 (2009).
61. G. Xu, S. B. Y. Shin, S. R. Jaffrey, Chemoenzymatic Labeling of Protein C-
Termini for
Positive Selection of C-Terminal Peptides. ACS Chem. Biol. 6, 1015-1020
(2011).
62. S. D. Brown, D. Graham, Conjugation of an oligonucleotide to Tat, a cell-
penetrating
peptide, via click chemistry. Tetrahedron Lett. 51, 5032-5034 (2010).
63. F. J. Flett, J. G. A. Walton, C. L. Mackay, H. Interthal, Click Chemistry
Generated Model
DNA-Peptide Heteroconjugates as Tools for Mass Spectrometry. Anal. Chem. 87,
9595-9599
(2015).
64. N. Inoue, A. Onoda, T. Hayashi, Site-Specific Modification of Proteins
through N-
Terminal Azide Labeling and a Chelation-Assisted CuAAC Reaction. Bioconjug.
Chem. 30,
2427-2434 (2019).
65. H. Brinkerhoff, A. S. W. Kang, J. Liu, A. Aksimentiev, C. Dekker, Multiple
rereads of
single proteins at single-amino acid resolution using nanopores. Science ,
eabl4381 (2021).
66. M. Mag, S. Lüking, J. W. Engels, Synthesis and selective cleavage of an
oligodeoxynucleotide containing a bridged internucleotide 5'-phosphorothioate
linkage.
Nucleic Acids Res. 19, 1437-1441 (1991).
67. Z. Hu, J. Yang, F. Xu, G. Sun, X. Pan, M. Xia, S. Zhang, X. Zhang, Site-
Specific Scissors
Based on Myeloperoxidase for Phosphorothioate DNA. J. Am. Chem. Soc. 143,
12361-12368
(2021).
68. M. T. Noakes, H. Brinkerhoff, A. H. Laszlo, I. M. Derrington, K. W.
Langford, J. W.
Mount, J. L. Bowman, K. S. Baker, K. M. Doering, B. I. Tickman, J. H.
Gundlach, Increasing
the accuracy of nanopore DNA sequencing using a time-varying cross membrane
voltage. Nat.
Biotechnol. 37, 651-656 (2019).
69. J. Nivala, D. B. Marks, M. Akeson, Unfoldase-mediated protein
translocation through an
α-hemolysin nanopore. Nat. Biotechnol. 31, 247-250 (2013).
70. S. Zhang, G. Huang, R. C. A. Versloot, B. M. H. Bruininks, P. C. T. de
Souza, S.-J.
Marrink, G. Maglia, Bottom-up fabrication of a proteasome–nanopore that
unravels and
processes single proteins. Nat. Chem. (2021), doi:10.1038/s41557-021-00824-w.
71. B. Albada, J. F. Keijzer, H. Zuilhof, F. van Delft, Oxidation-Induced "One-
Pot" Click
Chemistry. Chem. Rev. 121, 7032-7058 (2021).
72. METHODS FOR DELIVERING AN ANALYTE TO TRANSMEMBRANE PORES:
US20210147904A1 (available at
https://patentimages.storage.googleapis.com/45/5b/04/e35513c92f0382/US20210147904A1.pdf).
73. SYSTEMS AND METHODS OF DELIVERING TARGET MOLECULES TO A
NANOPORE: US20200284783A1 (available at
https://patentimages.storage.googleapis.com/6f/5a/e5/85ccf2aa15f897/US20200284783A1.pdf).
74. Y. Bao, J. Wadden, J. R. Erb-Downward, P. Ranjan, W. Zhou, T. L. McDonald,
R. E. Mills,
A. P. Boyle, R. P. Dickson, D. Blaauw, J. D. Welch, SquiggleNet: real-time,
direct
classification of nanopore signals. Genome Biol. 22, 298 (2021).
75. M. Lelek, M. T. Gyparaki, G. Beliu, F. Schueder, J. Griffie, S. Manley, R.
Jungmann, M.
Sauer, M. Lakadamyali, C. Zimmer, Single-molecule localization microscopy.
Nat. Rev.
Methods Primers 1, 39 (2021).
76. H. Yang, G. Garcia-Manero, K. Sasaki, G. Montalban-Bravo, Z. Tang, Y. Wei,
T. Kadia,
K. Chien, D. Rush, H. Nguyen, A. Kalia, M. Nimmakayalu, C. Bueso-Ramos, H.
Kantarjian,
L. J. Medeiros, R. Luthra, R. Kanagal-Shamanna, High-resolution structural
variant profiling
of myelodysplastic syndromes by optical genome mapping uncovers cryptic
aberrations of
prognostic and therapeutic significance. Leukemia (2022), doi:10.1038/s41375-
022-01652-8.
77. T. Chatterjee, A. Knappik, E. Sandford, M. Tewari, S. W. Choi, W. B.
Strong, E. P. Thrush,
K. J. Oh, N. Liu, N. G. Walter, A. Johnson-Buck, Direct kinetic fingerprinting
and digital
counting of single protein molecules. Proc. Natl. Acad. Sci. 117, 22815-22822
(2020).
78. J. Swaminathan, A. A. Boulgakov, E. T. Hernandez, A. M. Bardo, J. L.
Bachman, J.
Marotta, A. M. Johnson, E. V. Anslyn, E. M. Marcotte, Highly parallel single-
molecule
identification of proteins in zeptomole-scale mixtures. Nat. Biotechnol. 36,
1076-1082 (2018).
79. J. M. Rothberg, W. Hinz, T. M. Rearick, J. Schultz, W. Mileski, M. Davey,
J. H. Leamon,
K. Johnson, M. J. Milgrew, M. Edwards, J. Hoon, J. F. Simons, D. Marran, J. W.
Myers, J. F.
Davidson, A. Branting, J. R. Nobile, B. P. Puc, D. Light, T. A. Clark, M.
Huber, J. T.
Branciforte, I. B. Stoner, S. E. Cawley, M. Lyons, Y. Fu, N. Homer, M. Sedova,
X. Miao, B.
Reed, J. Sabina, E. Feierstein, M. Schorn, M. Alanjary, E. Dimalanta, D.
Dressman, R.
Kasinskas, T. Sokolsky, J. A. Fidanza, E. Namsaraev, K. J. McKernan, A.
Williams, G. T.
Roth, J. Bustillo, An integrated semiconductor device enabling non-optical
genome
sequencing. Nature 475, 348-352 (2011).
80. R. Roy, S. Hohng, T. Ha, A practical guide to single-molecule FRET. Nat.
Methods 5, 507-516 (2008).
81. M. S. Chee, S. Diego, D. A. Routenberg, S. Diego, US2018/0201980,47.

Administrative Status

For a clearer understanding of the status of the application/patent presented on this page, the site Disclaimer, as well as the definitions for Patent, Administrative Status, Maintenance Fee and Payment History, should be consulted.

Administrative Status

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2022-12-01
(87) PCT Publication Date 2023-06-08
(85) National Entry 2024-05-16

Abandonment History

There is no abandonment history.

Maintenance Fee


Upcoming maintenance fee amounts

Description Date Amount
Next Payment if standard fee 2024-12-02 $125.00
Next Payment if small entity fee 2024-12-02 $50.00

Note: If the full payment has not been received on or before the date indicated, a further fee may be required, which may be one of the following:

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee $555.00 2024-05-16
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
SISCAPA ASSAY TECHNOLOGIES, INC.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD.

If you have any difficulty accessing content, you can call the Client Service Centre at 1-866-997-1936 or send them an e-mail at CIPO Client Service Centre.


Document Description Date (yyyy-mm-dd) Number of pages Size of Image (KB)
Declaration of Entitlement 2024-05-16 1 21
Patent Cooperation Treaty (PCT) 2024-05-16 1 54
Description 2024-05-16 207 10,697
Claims 2024-05-16 5 161
Drawings 2024-05-16 60 1,723
Patent Cooperation Treaty (PCT) 2024-05-16 1 72
International Search Report 2024-05-16 1 50
Amendment - Claims 2024-05-16 7 350
Correspondence 2024-05-16 2 53
National Entry Request 2024-05-16 10 290
Abstract 2024-05-16 1 9
Cover Page 2024-05-24 1 33
Abstract 2024-05-22 1 9
Claims 2024-05-22 5 161
Drawings 2024-05-22 60 1,723
Description 2024-05-22 207 10,697