Patent 2593916 Summary

(12) Patent Application:	(11) CA 2593916
(54) English Title:	PROBES, LIBRARIES AND KITS FOR ANALYSIS OF MIXTURES OF NUCLEIC ACIDS AND METHODS FOR CONSTRUCTING THE SAME
(54) French Title:	SONDES, BIBLIOTHEQUES ET TROUSSES POUR L'ANALYSE DE MELANGES D'ACIDES NUCLEIQUES ET LEURS PROCEDES DE CONSTRUCTION
Status:	Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication

Bibliographic Data

(51) International Patent Classification (IPC):	C07H 21/00 (2006.01) C07C 22/34 (2006.01) C07C 50/18 (2006.01) C12P 19/34 (2006.01) C40B 30/04 (2006.01) C40B 40/06 (2006.01) C40B 70/00 (2006.01)
(72) Inventors :	RAMSING, NIELS BIRGER (Denmark) MOURITZEN, PETER (Denmark) ECHWALD, SOREN MORGENTHALER (Denmark) TOLSTRUP, NIELS (Denmark)
(73) Owners :	EXIQON A/S
(71) Applicants :	EXIQON A/S (Denmark)
(74) Agent:	MARKS & CLERK
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date:	2005-12-21
(87) Open to Public Inspection:	2006-06-29
Availability of licence:	N/A
Dedicated to the Public:	N/A
(25) Language of filing:	English

Patent Cooperation Treaty (PCT):	Yes
(86) PCT Filing Number:	PCT/DK2005/000815
(87) International Publication Number:	WO 2006/066592
(85) National Entry:	2007-06-21

(30) Application Priority Data:

Application No.	Country/Territory	Date
60/637,857	(United States of America)	2004-12-22
PA 2004 01987	(Denmark)	2004-12-22
PA 2004 02012	(Denmark)	2004-12-28

Abstracts

English Abstract

The invention relates to nucleic acid probes, nucleic acid probe libraries,
and kits for detecting, classifying, or quantifying components in a complex
mixture of nucleic acids, such as a transcriptome, and methods of using the
same. The invention also relates to methods of identifying nucleic acid probes
useful in the probe libraries and to methods of identifying a means for
detection of a given nucleic acid.

French Abstract

L'invention concerne des sondes d'acides nucléiques, des bibliothèques de sondes d'acides nucléiques et des trousses permettant de détecter, de classer ou de quantifier des composants dans un mélange complexe d'acides nucléiques, par exemple un transcriptome, et leurs procédés d'utilisation. L'invention concerne également des procédés permettant d'identifier des sondes d'acides nucléiques utiles dans les bibliothèques de sondes ainsi que des procédés permettant d'identifier un moyen de détecter un acide nucléique donné.

Claims

Note: Claims are shown in the official language in which they were submitted.

CLAIMS
1. A library of oligonucleotide probes wherein each probe in the library consists of a recognition sequence tag and a detection moiety wherein at least one monomer in each oligonucleotide probe is a modified monomer analogue, increasing the binding affinity for the complementary target sequence relative to the corresponding unmodified oligonucleotide, such that the library probes have sufficient stability for sequence-specific binding and detection of a substantial fraction of a target nucleic acid in any given target population and wherein the number of different recognition sequences comprises less than 10% of all possible sequence tags of a given length(s), and wherein
each probe contains a fluorophore-quencher pair for detection where the quencher has formula (I)
<IMG>
wherein one or two of R1, R4, R5 and R8 independently is/are a bond or selected from a substituted or non-substituted amino group, which constitute(s) the linker(s) to the remainder of the oligonucleotide probe, and wherein the remaining R1 to R8 groups are each, independently hydrogen or substituted or non-substituted hydroxy, amino, alkyl, aryl, arylalkyl or alkoxy, and/or wherein
less than 20% of the oligonucleotide probes of said library have a guanidyl (G) residue in the 5' and/or 3' position.
2. The library according to claim 1, wherein the quencher is selected from 1,4-bis-(3-hydroxy-propylamino)-anthraquinone, 1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-anthraquinone, 1,5-bis-(3-hydroxy-propylamino)-anthraquinone, 1-(3-hydroxypropylamino)-5-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone, 1,4-bis-(4-(2-hydroxyethyl)phenylamino)-anthraquinone, 1-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-4-(4-(2-hydroethyl)phenylamino)-anthraquinone, 1,8-bis-(3-hydroxy-propylamino)-anthraquinone, 1,4-bis(3-hydroxypropylamino)-6-methylanthraquinone, 1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-6(7)-methyl-anthraquinone, 1,4-bis(4-(2-hydroethyl)phenylamino)-6-methyl-anthraquinone, 1,4-bis(4-methyl-phenylamino)-6-carboxy-anthraquinone, 1,4-bis(4-methyl-phenylamino)-6-(N-(6,7-dihydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone, 1,4-bis(4-methyl-phenylamino)-6-(N-(7-dimethoxytrityloxy-6-hydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone, 1,4-bis(propylamino)-6-carboxy-anthraquinone, 1,4-bis(propylamino)-6-(N-(6,7-dihydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone, 1,4-bis(propylamino)-6-(N-(7-dimethoxytrityloxy-6-hydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone, 1,5-bis(4-(2-hydroethyl)phenylamino)-anthraquinone, 1-(4-(2-hydroethyl)phenylamino)-5-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-anthraquinone, 1,8-bis(3-hydroxypropylamino)-anthraquinone, 1-(3-hydroxypropylamino)-8-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone, 1,8-bis(4-(2-hydroethyl)phenylamino)-anthraquinone, and 1-(4-(2-hydroethyl)phenylamino)-8-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-anthraquinone.
3. The library according to claim 1, wherein the quencher is 1,4-Bis(2-hydroxy-ethylamino)-6-methylanthraquinone.
4. The library according to any of the preceding claims, wherein less than 10% of the oligonucleotide probes have a G in the 5' end, such as less than 5%.
5. The library according to claim 4, wherein none of the oligonucleotides in the library have a G in the 5' end.
6. A library of oligonucleotide probes according to any one of the preceding claims, wherein the recognition sequence tag segment of the probes in the library have been modified in at least one of the following ways:
i) substitution with at least one non-naturally occurring nucleotide
ii) substitution with at least one chemical moiety to increase the stability of the probe.
7. A library of oligonucleotide probes according to wherein the recognition sequence tag has a length of 6 to 12 nucleotides.
8. A library of oligonucleotide probes according to claim 7, wherein the recognition sequence tag has a length of 8 or 9 nucleotides.
9. A library of oligonucleotide probes according to claim 8, wherein the recognition sequence tags are substituted with LNA nucleotides.
10. A library of oligonucleotide probes according to any one of the preceding claims, wherein more than 90% of the oligonucleotide probes can bind and detect at least two target sequences in a nucleic acid population.

11. A library according to claim 10, wherein the recognition sequence tag is complementary to at least two target sequences in the nucleic acid population.
12. A library of oligonucleotide probes of 8 and 9 nucleotides in length comprising a mixture of subsets of oligonucleotide probes defined in any one of claims 1-11.
13. A library of oligonucleotide probes of any one of the preceding claims, wherein the number of different target sequences in a nucleic acid population is at least 100.
14. A library of oligonucleotide probes according to any one of the preceding claims, wherein at least one nucleotide in each oligonucleotide probe is substituted with a non-naturally occurring nucleotide analogue, a deoxyribose or ribose analogue, or an internucleotide linkage other than a phosphodiester linkage.
15. A library of oligonucleotide probes according to any one of the preceding claims, wherein the detection moiety is a covalently or non-covalently bound minor groove binder or an intercalator selected from the group comprising asymmetric cyanine dyes, DAPI, SYBR Green I, SYBR Green II, SYBR Gold, PicoGreen, thiazole orange, Hoechst 33342, Ethidium Bromide, 1-O-(1-pyrenylmethyl)glycerol, and Hoechst 33258.
16. The library of oligonucleotide probes according to claim 14 or 15, wherein the internucleotide linkage other than phosphodiester linkage is a non-phosphate internucleotide linkage.
17. The library of oligonucleotide probes according to claim 16, wherein the internucleotide linkage is selected from the group consisting of alkyl phosphonate, phosphoramidite, alkyl-phosphotriester, phosphorothioate, and phosphorodithioate linkages.
18. The library of oligonucleotide probes according to any one of the preceding claims, wherein said oligonucleotide probes contain non-naturally occurring nucleotides, such as 2'-O-methyl, diamine purine, 2-thio uracil, 5-nitroindole, universal or degenerate bases, intercalating nucleic acids or minor-groove-binders, to enhance their binding to a complementary nucleic acid sequence.
19. The library according to claim 18, wherein all oligonucleotide probes contain at least one 5-nitroindole residue.

20. The library of oligonucleotide probes according to any one of the preceding claims, wherein said different recognition sequences comprise less than 1% of all possible oligonucleotides of a given length.
21. The library of oligonucleotide probes according to any one of the preceding claims, wherein each probe can be detected using a dual label by the molecular beacon assay principle.
22. The library of oligonucleotide probes according to any one of claims 1-20, wherein each probe can be detected using a dual label by the 5' nuclease assay principle.
23. The library according to any one of the preceding claims, wherein each probe contains a single detection moiety that can be detected by the molecular beacon assay principle.
24. The library of oligonucleotide probes according to any one of the preceding claims, wherein the target nucleic acid population is an mRNA sample, a cDNA sample or a genomic DNA sample.
25. The library of oligonucleotide probes according to claim 24, wherein said target mRNA or target cDNA population originates from the transcriptomes of human, mouse, rat, Arabidopsis thaliana, Drosophila melanogaster, Chimpanzee or Caenorhabditis elegans.
26. The library of oligonucleotide probes according to any one of the preceding claims, wherein said probe target sequences occur at least once within more than 4% of different target nucleic acids in a target nucleic acid population.
27. The library of oligonucleotide probes according to any one of the preceding claims, wherein self-complementary probe sequences have been omitted from the said library.
28. The library of oligonucleotide probes according to claim 27, wherein said self-complementary sequences have been de-selected.
29. The library of oligonucleotide probes according to claim 27, wherein said self-complementary sequences have been eliminated by sequence-specific modifications, such as non-standard nucleotides, nucleotides with SBC nucleobases, 2'-O-methyl, diamine purine, 2-thio uracil, universal or degenerate bases or minor-groove-binders.

30. The library of oligonucleotide probes according to any one of the preceding claims, wherein the melting temperature (Tm) of each probe is adjusted to be suitable for PCR-based assays by substitution with non-occurring modifications, such as LNA, optionally modified with SBC nucleobases, 2'-O-methyl, diamine purine, 2-thio uracil, 5-nitroindole, universal or degenerate bases, intercalating nucleic acids or minor-groove-binders, to enhance their binding to a complementary nucleic acid sequence.
31. The library of oligonucleotide probes according to any one of the preceding claims, wherein the melting temperature (Tm) of each probe is at least 50°C.
32. The library of oligonucleotide probes according to any one of the preceding claims, wherein each probe has a DNA nucleotide at the 5'-end and/or has a DNA nucleotide at the 3'-end.
33. The library of oligonucleotide probes according to any one of the preceding claims, wherein each probe can be detected by the molecular beacon principle.
34. The library of oligonucleotide probes according to any one of the preceding claims, wherein the target population is the human transcriptome.
35. The library of oligonucleotide probes according to any one of the preceding claims, wherein each oligonucleotide probe detects the largest possible number of different target nucleic acids resulting in maximum coverage for a given target nucleic acid population by the said library.
36. The library of oligonucleotide probes according to any one of the preceding claims, wherein the oligonucleotide probes are selected to have as many target sequences or binding sites as possible within the target population of nucleic acids in order to obtain a maximum degree of detection.
37. The library of oligonucleotide probes according to any one of the preceding claims, wherein the oligonucleotide probes are selected to have at least one target sequence in as many target nucleic acids as possible within the target population of nucleic acids in order to obtain a maximum degree of detection.
38. The library of oligonucleotide probes in TABLE 1 or TABLE 1a or Fig. 13 or Fig. 14 capable of detecting the complementary sequences in any given nucleic acid population.

39. The library according to any one of the preceding claims, which comprises probes each having a recognition element listed in TABLE 1 or TABLE 1a in the specification and/or which comprises probes each having a recognition element complementary to the recognition elements listed in said TABLE 1.
40. An oligonucleotide probe comprising a quencher of formula I and a 5'-nitroindole residue.
41. The oligonucleotide probe of claim 40, which is free from a 5' guanidyl residue.
42. The oligonucleotide probe of claim 40 or 41, which is as defined in any one of claims 1-9, 14-18, 21-23, and 31-1.
43. The oligonucleotide probe according to any one of claims 40-42, said probe being selected from probes complementary to or identical with the sequences set forth in Table 1, Table 1A, Fig. 13, or Fig. 14.
44. The oligonucleotide probe according to any one of claims 40-43, which has an exact nucleotide sequence selected from Table 1 or Table 1A.
45. A method of selecting oligonucleotide sequences useful in the library according to any one of the preceding claims, comprising
a) providing a first list of all possible oligonucleotides of a predefined number of nucleotides, N, said oligonucleotides having a melting temperature, Tm, of at least 50°C,
b) providing a second list of target nucleic acid sequences,
c) identifying and storing for each member of said first list, the number of members from said second list, which include a sequence complementary to said each member,
d) selecting a member of said first list, which in the identification in step c matches the maximum number, identified in step c, of members from said second list,
e) adding the member selected in step d to a third list consisting of the selected oligonucleotides useful in the library according to any one of the preceding claims,
f) subtracting the member selected in step d from said first list to provide a revised first list,
m) repeating steps d through f until said third list consists of members which together will be complementary to at least 30% of the members on the list of target nucleic acid sequences from step b, wherein
said method has a bias against including a member in the third list that has a 5' guanidyl (G) and/or a bias against including members in the third list that have a 3' guanidyl (G).

46. The method according to claim 45, wherein guanidyl is avoided as the 5' residue in all oligonucleotide sequences in said third list.
47. The method according to claim 46, wherein the avoidance of guanidyl as the 5' residue is achieved by i) reducing the list of step a to include only those that do not include a 5' guanidyl residue, and/or ii) avoiding selection in step d of those sequences which include a 5' guanidyl residue, and/or iii) omitting step e for those sequences that include a 5' guanidyl residue.
48. The method according to any one of claims 45-47, wherein Tm is at least 60°C.
49. The method according to any one of claims 45-48, wherein the first list of oligonucleotides only includes oligonucleotides incapable of self-hybridization.
50. The method according to any one of claims 45-49, which after step f and before step m comprises the following steps:
g) subtracting all members from said second list which include a sequence complementary to the member selected in step d to obtain a revised second list,
h) identifying and storing for each member of said revised first list, the number of members from said revised second list, which include a sequence complementary to said each member,
i) selecting a member of said first list, which in the identification in step h matches the maximum number, identified in step h, of members from said second list, or selecting a member of said first list that provides the maximum number obtained by multiplying the number identified in step h with the number identified in step c,
j) adding the member selected in step i to said third list,
k) subtracting the member selected in step i from said revised first list, and
l) subtracting all members from said revised second list which include a sequence complementary to the member selected in step i.
51. The method according to claim 50 insofar as it depends on claim 46, wherein the avoidance of guanidyl as the 5' residue is achieved by avoiding selection in step i of those sequences which include a 5' guanidyl residue, and/or omitting step j for those sequences that include a 5' guanidyl residue.
52. The method according to any one of claims 45-51, wherein repetition in step m is continued until said third list consists of members which together will be complementary to at least 85% of the members on the list of target nucleic acid sequences from step b.

53. The method according to any one of claims 45-52, wherein, after selection of the first member of said third list, the selection in step d after step c is preceded by identification of those members of said first list which hybridize to more than a selected percentage of the maximum number of members from said second list so that only those members so identified are subjected to the selection in step d.
54. The method according to claim 53, wherein the selected percentage is 80%.
55. The method according to any one of claims 45-54, wherein it is ensured that members are not entered on the third list if such members have previously failed qualitatively as useful probes.
56. The method according to claim 55, wherein oligonucleotide sequences that have previously failed qualitatively are not included in the third list by i) reducing the list of step a to include only those that have not previously failed qualitatively, and/or ii) avoiding selection in step d or i of those sequences that have not previously failed qualitatively, and/or iii) omitting step e or j for those sequences that have not previously failed qualitatively.
57. The method according to any one of claims 45-56, wherein N is an integer selected from 6, 7, 8, 9, 10, 11, and 12.
58. The method according to claim 57, wherein N is 8 or 9.
59. The method according to any one of claims 45-58, wherein said second list of step b comprises target nucleic acid sequences as defined in claim 24 or 25.
60. The method according to any one of claims 45-59, essentially performed as set forth in Fig. 2.
61. The method according to any one of claims 45-60, wherein said first, second and third lists are stored in the memory of a computer system, preferably in a database.
62. A computer program product providing instructions for implementing the method according to any one of claims 45-61, embedded in a computer-readable medium.
63. A system comprising a database of target sequences and an application program for executing the computer program of claim 62.

64. A method for identifying a specific means for detection of a target nucleic acid, the method comprising
A) inputting, into a computer system, data that uniquely identifies the nucleic acid sequence of said target nucleic acid, wherein said computer system comprises a database holding information of the composition of at least one library of nucleic acid probes according to any one of claims 1-39, and wherein the computer system further comprises a database of target nucleic acid sequences for each probe of said at least one library and/or further comprises means for acquiring and comparing nucleic acid sequence data,
B) identifying, in the computer system, a probe from the at least one library, wherein the sequence of the probe exists in the target nucleic acid sequence or a sequence complementary to the target nucleic acid sequence,
C) identifying, in the computer system, primers that will amplify the target nucleic acid sequence, and
D) providing, as identification of the specific means for detection, an output that points out the probe identified in step B and the sequences of the primers identified in step C.
65. The method according to claim 64, wherein step A also comprises inputting, into the computer system, data that identifies the at least one library of nucleic acids from which it is desired to select a member for use in the specific means for detection.
66. The method according to claim 65, wherein the data that identifies the composition of the at least one library is a product code.
67. The method according to any one of claims 64-66, wherein inputting in step A is performed via an internet web interface.
68. The method according to any one of claims 64-66, wherein the primers identified in step C are chosen so as to minimize the chance of amplifying genomic nucleic acids in a PCR reaction.
69. The method according to claim 68, wherein at least one of the primers is selected so as to include a nucleotide sequence which in genomic DNA is interrupted by an intron.
70. The method according to any one of claims 64-69, wherein the primers selected in step C are chosen so as to minimize length of amplicons obtained from PCR performed on the target nucleic acid sequence.
71. The method according to any one of claims 64-70, wherein the primers selected in step C are chosen so as to optimize the GC content for performing PCR.

72. A computer program product providing instructions for implementing the method according to any one of claims 64-71 embedded in a computer-readable medium.
73. A system comprising a database of nucleic acid probes as defined in any one of claims 1-39 and an application program for executing the computer program of claim 72.
74. A method for profiling a plurality of target sequences comprising contacting a sample of target sequences with a library according to any one of claims 1-39 and detecting, characterizing or quantifying the probe sequences which bind to the target sequences.
75. The method according to claim 74, providing detection of a nucleic acid sequence which is present in less than 10% of the plurality of sequences which are bound by the multi-probe sequences.
76. The method according to claim 75, wherein the target mRNA sequences or cDNA sequences comprise a transcriptome.
77. The method according to claim 76, wherein the transcriptome is a human transcriptome.
78. The method according to any one of claims 74-77, wherein the library of probes are covalently coupled to a solid support.
79. The method according to claim 78, wherein the solid support comprises a microtiter plate and each well of the microtiter plate comprises a different library probe.
80. The method according to any one of claims 74-79, wherein the step of detecting is performed by amplifying a target nucleic acid sequence containing a recognition sequence complementary to a library probe.
81. The method of claim 80, wherein target nucleic acid amplification is carried out by using a pair of oligonucleotide primers flanking the recognition sequence complementary to a library probe.
82. The method of any one of claims 74-81, wherein the presence or expression level of one or more target nucleic acid sequences is correlated with a species' phenotype.
83. The method of claim 82, wherein the phenotype is a disease.

84. A method of analysing a mixture of nucleic acids using a library according to any one of claims 1-39 comprising the steps of
(a) contacting a target oligonucleotide with a library of labelled oligonucleotide probes, each of said oligonucleotide probes having a known sequence and being attached to a solid support at a known position, to hybridize said target oligonucleotide to at least one member of said library of probes, thereby forming a hybridized library;
(b) contacting said hybridized library with a nuclease capable of cleaving double-stranded oligonucleotides to release from said hybridized library a portion of said labelled oligonucleotide probes or fragments thereof; and
(c) identifying said positions of said hybridized library from which labelled probes or fragments thereof have been removed, to determine the sequence of said unlabelled target oligonucleotide.
85. A method of analysing a mixture of nucleic acids using a library of any one of claims 1-39 comprising the steps of
(a) contacting a target oligonucleotide with a library of labelled oligonucleotide probes, each of said oligonucleotide probes having a known sequence and being attached to a solid support at a known position, to hybridize said target oligonucleotide to at least one member of said library of probes, thereby forming a hybridized library;
(b) identifying said positions of said hybridized library at which labelled probes or fragments thereof have hybridized, to determine the sequence of said target oligonucleotide; and
(c) identifying said positions of said hybridized library from which labelled probes or fragments thereof have been removed, to determine the sequence of said unlabelled target oligonucleotide.
86. A method for quantitatively or qualitatively determining the presence of a target nucleic acid in a sample, the method comprising
i) identifying, by means of the method according to any one of claims 64-71, a specific means for detection of the target nucleic acid, where the specific means for detection comprises an oligonucleotide probe and a set of primers,
ii) obtaining the primers and the oligonucleotide probe identified in step i),
iii) subjecting the sample to a molecular amplification procedure in the presence of the primers and the oligonucleotide probe from step ii), and
iv) determining the presence of the target nucleic acid based on the outcome of step iii).
87. The method according to claim 86, wherein the primers obtained in step ii) are obtained by synthesis.

88. The method according to claim 86 or 87, wherein the oligonucleotide probe is obtained from a library according to any one of claims 1-39.
89. The method according to any one of claims 86-88, wherein the procedure in step iii) is a PCR or a NASBA procedure.
90. The method according to claim 89, wherein the PCR procedure is a qPCR.
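
Illustrative note (not part of the claims as filed): the selection loop of claims 45 to 50 and the probe lookup of claim 64, step B, can be read as a greedy coverage procedure. The Python sketch below is only one possible reading, under stated assumptions. The function names (`select_probes`, `probes_for_target`), the exact-palindrome test used as a stand-in for self-hybridization, and the plain-string representation of sequences are all hypothetical. The Tm filter of step a, the product weighting offered as an alternative in step i, and the 3' G bias are not modelled.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def select_probes(targets, n=8, min_coverage=0.30, avoid_5prime_g=True):
    """Greedy probe selection, loosely following steps a) to m)."""
    # Index every n-mer occurring in the targets (the second list, step b).
    sites = {}
    for idx, target in enumerate(targets):
        for pos in range(len(target) - n + 1):
            sites.setdefault(target[pos:pos + n], set()).add(idx)

    # Step a: all possible n-mers. The Tm filter is NOT modelled in this sketch.
    hits = {}
    for letters in product("ACGT", repeat=n):
        probe = "".join(letters)
        if probe == reverse_complement(probe):
            continue  # crude proxy for claim 49: skip exact palindromes only
        if avoid_5prime_g and probe.startswith("G"):
            continue  # claim 47 i): drop candidates with a 5' G
        # Step c: targets containing a sequence complementary to the probe.
        hits[probe] = sites.get(reverse_complement(probe), set())

    uncovered = set(range(len(targets)))
    library = []  # the "third list"
    while hits and len(targets) - len(uncovered) < min_coverage * len(targets):
        # Steps d / i: pick the probe matching the most still-uncovered targets.
        best = max(hits, key=lambda p: len(hits[p] & uncovered))
        if not hits[best] & uncovered:
            break  # no remaining candidate adds coverage
        library.append(best)         # steps e / j
        uncovered -= hits.pop(best)  # steps f, g, k and l
    return library


def probes_for_target(library, target):
    """Claim 64, step B: library probes found in the target or its complement."""
    rc = reverse_complement(target)
    return [p for p in library if p in target or p in rc]
```

For example, `select_probes(targets, n=9, min_coverage=0.85)` would correspond to the 9-mer case of claim 58 with the 85% stopping criterion of claim 52. Ties between equally good candidates are broken here by alphabetical order, which is an arbitrary choice of this sketch.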

Description

Note: Descriptions are shown in the official language in which they were submitted.

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 80
NOTE: For additional volumes, please contact the Canadian Patent Office

CA 02593916 2007-06-21
WO 2006/066592 PCT/DK2005/000815
PROBES, LIBRARIES AND KITS FOR ANALYSIS OF MIXTURES OF NUCLEIC ACIDS AND
METHODS FOR CONSTRUCTING THE SAME
FIELD OF THE INVENTION
The invention relates to nucleic acid probes, nucleic acid probe libraries, and kits for detecting, classifying, or quantifying components in a complex mixture of nucleic acids, such as a transcriptome, and methods of using the same.
BACKGROUND OF THE INVENTION
With the advent of microarrays for profiling the expression of thousands of genes, such as GeneChip™ arrays (Affymetrix, Inc., Santa Clara, CA), correlations between expressed genes and cellular phenotypes may be identified at a fraction of the cost and labour necessary for traditional methods, such as Northern- or dot-blot analysis. Microarrays permit the development of multiple parallel assays for identifying and validating biomarkers of disease and drug targets which can be used in diagnosis and treatment. Gene expression profiles can also be used to estimate and predict metabolic and toxicological consequences of exposure to an agent (e.g., a drug, a potential toxin or carcinogen, etc.) or a condition (e.g., temperature, pH, etc.).
Microarray experiments often yield redundant data, only a fraction of which has value for the experimenter. Additionally, because of the highly parallel format of microarray-based assays, conditions may not be optimal for individual capture probes. For these reasons, microarray experiments are most often followed up by, or sequentially replaced by, confirmatory studies using single-gene homogeneous assays. These are most often quantitative PCR-based methods such as the 5' nuclease assay or other types of dual labelled probe quantitative assays. However, these assays are still time-consuming, single-reaction assays that are hampered by high costs and time-consuming probe design procedures. Further, 5' nuclease assay probes are relatively large (e.g., 15-30 nucleotides). Thus, the limitations in homogeneous assay systems currently known create a bottleneck in the validation of microarray findings, and in focused target validation procedures.
An approach to avoid this bottleneck is to omit the expensive dual-labelled indicator probes used in 5' nuclease assay procedures and molecular beacons and instead use non-sequence-specific DNA intercalating dyes such as SYBR Green that fluoresce upon binding to double-stranded but not single-stranded DNA. Using such dyes, it is possible to universally detect any amplified sequence in real-time. However, this technology is hampered by several problems. For example, non-specific priming during the PCR amplification process can generate unintentional non-target amplicons that will contribute to the quantification process. Further, interactions between PCR primers in the reaction to form "primer-dimers" are common. Due to the high concentration of primers typically used in a PCR reaction, this can lead to significant amounts of short double-stranded non-target amplicons that also bind intercalating dyes. Therefore, the preferred method of quantifying mRNA by real-time PCR uses sequence-specific detection probes.
One approach for avoiding the problem of random amplification and the formation of
primer-dimers is to use generic detection probes that can detect a large number of
different types of nucleic acid molecules while retaining some sequence specificity.
This approach has been described by Simeonov, et al. (Nucleic Acids Research 30(17): 91,
2002; U.S. Patent Publication 20020197630) and involves the use of a library of probes
comprising more than 10% of all possible sequences of a given length (or lengths). The
library can include various non-natural nucleobases and other modifications to stabilize
binding of probes/primers in the library to a target sequence. Even so, a minimal length
of at least 8 bases is required for most sequences to attain a degree of stability that
is compatible with most assay conditions relevant for applications such as real-time PCR.
Because a universal library of all possible 8-mers contains 65,536 different sequences,
even the smallest library previously considered by Simeonov, et al. contains more than
10% of all possibilities, i.e. at least 6554 sequences, which is impractical to handle
and vastly expensive to construct.
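The library sizes quoted above follow from simple counting over the four nucleobases. The short sketch below reproduces the arithmetic; the function names are illustrative only and do not appear in the original specification.

```python
def universal_library_size(length):
    """Number of distinct sequences of the given length over A, C, G and T."""
    return 4 ** length


def minimum_library_size(length, percent):
    """Smallest whole number of probes that reaches the given percentage
    of the universal library (integer ceiling division)."""
    return (universal_library_size(length) * percent + 99) // 100


all_8mers = universal_library_size(8)       # every possible 8-mer
ten_percent = minimum_library_size(8, 10)   # 10% of all 8-mers, rounded up
print(all_8mers, ten_percent)
```

By the same count, a library of about 100 probes is a very small fraction of the 8-mer sequence space.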
From a practical point of view, several factors limit the ease of use and accessibility
of contemporary homogeneous assay applications. The problems encountered by users of
conventional assay technologies include:
- Prohibitively high costs when attempting to detect many different genes in a few
samples, because the price to purchase a probe for each transcript is high.
- The synthesis of labelled probes is time-consuming, and the time from order to receipt
from the manufacturer is often more than 1 week.
- User-designed kits may not work the first time, and validated kits are expensive per
assay.
- It is difficult to quickly test for a new target or iteratively improve probe design.
- The exact probe sequence of commercial validated probes may be unknown to the customer,
resulting in problems with evaluation of results and suitability for scientific
publication.
- When assay conditions or components are obscure, it may be impossible to order reagents
from an alternative source.
The described invention addresses these practical problems and aims to ensure rapid and
inexpensive development of accurate and specific assays for quantification of gene
transcripts.
SUMMARY OF THE INVENTION
It is desirable to be able to quantify the expression of most genes (e.g.,
>98%) in e.g. the
human transcriptome using a limited number of oligonucleotide detection probes
in a homo-
geneous assay system. The present invention solves the problems faced by
contemporary
approaches to homogeneous assays outlined above. This is done by providing a
method for
construction of generic multi-probes with sufficient sequence specificity - so
that they are
unlikely to detect a randomly amplified sequence fragment or primer-dimers -
but are still
capable of detecting many different target sequences each. Such probes are usable in
different assays and may be combined in small probe libraries (50 to 500 probes) that,
when combined with a target-specific primer set, can be used to detect and/or quantify
individual components in complex mixtures composed of thousands of different nucleic
acids (e.g. detecting individual transcripts in the human transcriptome composed of
>30,000 different nucleic acids).
Each multi-probe comprises two elements: 1) a detection element or detection
moiety con-
sisting of one or more labels to detect the binding of the probe to the
target; and 2) a recog-
nition element or recognition sequence tag ensuring the binding to the
specific target(s) of
interest. The detection element can be any of a variety of detection
principles used in homo-
geneous assays. The detection of binding is either direct by a measurable
change in the
properties of one or more of the labels following binding to the target (e.g.
a molecular bea-
con type assay with or without stem structure) or indirect by a subsequent
reaction following
binding (e.g. cleavage by the 5' nuclease activity of the DNA polymerase in 5'
nuclease as-
says).
Each detection element may include a quencher selected from the quenchers
disclosed in
European patent applications 04078170 and 03759288. In that context, all
disclosures
relating to the quenchers disclosed in these two patent applications relate
mutatis mutandis
to quenchers forming part of oligonucleotide probes that are part of the
libraries of the
present invention and both disclosures are therefore incorporated by reference
herein.
The quencher preferably has formula I
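[Formula (I): structural drawing of an anthraquinone core with two ring carbonyl oxygens
and substituents R1 to R8 at ring positions 1 to 8]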
wherein one or two of R1, R4, R5 and R8 independently is/are a bond or selected from a
substituted or non-substituted amino group, which constitute(s) the linker(s) to the
remainder of the oligonucleotide probe, and wherein the remaining R1 to R8 groups are
each, independently, hydrogen or substituted or non-substituted hydroxy, amino, alkyl,
aryl, arylalkyl or alkoxy. The substitution of the amino group can be with an alkyl,
alkylaryl or aryl group.
The term "alkyl" is used herein in the context of formula I to refer to a
branched or
unbranched, saturated or unsaturated, monovalent hydrocarbon radical,
generally having
from about 1-30 carbons and preferably, from 1-6 carbons. Suitable alkyl
radicals include, for
example, structures containing one or more methylene, methine and/or methyne
groups.
Branched structures have a branching motif similar to iso-propyl, t-butyl, i-
butyl, 2-
ethylpropyl, etc. As used herein, the term encompasses "substituted alkyls"
and "cyclic
alkyl". "Substituted alkyl" refers to alkyl as just described including one or
more substituents
such as, for example, C1-C6-alkyl, aryl, acyl, halogen (i.e. alkylhalos, e.g.,
CF3), hydroxy,
amino, alkoxy, alkylamino, acylamino, thioamido, acyloxy, aryloxy,
aryloxyalkyl, mercapto,
thia, aza, oxo, both saturated and unsaturated cyclic hydrocarbons,
heterocycles and the like.
These groups may be attached to any carbon or substituent of the alkyl moiety.
Additionally,
these groups may be pendent from, or integral to, the alkyl chain.
The term "alkylaryl" in this context means a radical obtained by combining an
alkyl and an
aryl group. Typical alkylaryl groups include phenethyl, ethyl phenyl and the
like.
The term "alkylamino" in this context means amino substituted with alkyl. In a
preferred
embodiment, the amino group is attached to the anthraquinone structure.
The term "alkylarylamino" in this context means amino substituted with
alkylaryl. In a
preferred embodiment, the amino group is attached to the anthraquinone
structure.
The term "arylamino" in this context means amino substituted with aryl. In a
preferred
embodiment, the amino group is attached to the anthraquinone structure.
Especially preferred examples of quenchers used in the invention include
1,4-bis-(3-hydroxy-propylamino)-anthraquinone,
1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-anthraquinone,
1,5-bis-(3-hydroxy-propylamino)-anthraquinone,
1-(3-hydroxypropylamino)-5-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone,
1,4-bis-(4-(2-hydroxyethyl)phenylamino)-anthraquinone,
1-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-4-(4-(2-hydroxyethyl)phenylamino)-anthraquinone,
1,8-bis-(3-hydroxy-propylamino)-anthraquinone,
1,4-bis(3-hydroxypropylamino)-6-methylanthraquinone,
1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-6(7)-methyl-anthraquinone,
1,4-bis(4-(2-hydroxyethyl)phenylamino)-6-methyl-anthraquinone,
1,4-bis(4-methyl-phenylamino)-6-carboxy-anthraquinone,
1,4-bis(4-methyl-phenylamino)-6-(N-(6,7-dihydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone,
1,4-bis(4-methyl-phenylamino)-6-(N-(7-dimethoxytrityloxy-6-hydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone,
1,4-bis(propylamino)-6-carboxy-anthraquinone,
1,4-bis(propylamino)-6-(N-(6,7-dihydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone,
1,4-bis(propylamino)-6-(N-(7-dimethoxytrityloxy-6-hydroxy-4-oxo-heptane-1-yl))carboxamido-anthraquinone,
1,5-bis(4-(2-hydroxyethyl)phenylamino)-anthraquinone,
1-(4-(2-hydroxyethyl)phenylamino)-5-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-anthraquinone,
1,8-bis(3-hydroxypropylamino)-anthraquinone,
1-(3-hydroxypropylamino)-8-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone,
1,8-bis(4-(2-hydroxyethyl)phenylamino)-anthraquinone, and
1-(4-(2-hydroxyethyl)phenylamino)-8-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-anthraquinone.
One especially preferred quencher is compound 11 of Example 21, i.e. 1,4-Bis(2-
hydroxyethylamino)-6-methylanthraquinone.
The recognition element also contributes to the novelty of the present
invention. It comprises
a short oligonucleotide moiety whose sequence has been selected to enable
detection of a
large subset of target nucleotides in a given complex sample mixture. The
novel probes
designed to detect many different target molecules each are referred to as
multi-probes. The
concept of designing a probe for multiple targets and exploiting the recurrence of a
short recognition sequence by selecting the most frequently encountered sequences is
novel and contrary to the conventional approach, in which probes are designed to be as
specific as possible for a single target sequence. The surrounding primers and the
choice of probe sequence in
combination
subsequently ensure the specificity of the multi-probes. The novel design
principles arising
from attempts to address the largest number of targets with the smallest
number of probes
are likewise part of the invention. This is enabled by the discovery that very
short 8-9 mer
LNA containing oligonucleotide probes are compatible with PCR based assays. In
one aspect
of the present invention modified or analogue nucleobases, nucleosidic bases
or nucleotides
are incorporated in the recognition element, possibly together with minor
groove binders and
other modifications, that all aim to stabilize the duplex formed between the
probe and the
target molecule so that the shortest possible probe sequence with the widest
range of targets
can be used. In a preferred aspect of the invention the modifications are
incorporation of LNA
residues to reduce the length of the recognition element to 8 or 9 nucleotides
while
maintaining sufficient stability of the formed duplex to be detectable under
ordinary assay
conditions. Typically, less than 20% of the oligonucleotide probes of said
library have a
guanidyl (G) residue in the 5' and/or 3' position of the recognition element,
but it is preferred
that less than 10% of the oligonucleotide probes have a G in the 5' end of the
recognition
element, such as less than 5%. Especially preferred are libraries where the
recognition
elements do not have a G in the 5' end.
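The 5' guanine criterion above can be checked for any candidate library with a one-line count. The following is a minimal sketch; the function name and the four example recognition sequences are hypothetical and are not taken from the specification.

```python
def fraction_with_5prime_g(recognition_sequences):
    """Fraction of recognition elements (written 5' to 3') that start with G."""
    sequences = [s.upper() for s in recognition_sequences]
    if not sequences:
        return 0.0
    return sum(s.startswith("G") for s in sequences) / len(sequences)


# Hypothetical recognition elements, for illustration only.
example_library = ["CTCCTCCT", "GCTGGAGC", "TGGTGGAA", "CAGCCTCC"]
fraction = fraction_with_5prime_g(example_library)
print(fraction)
```

A library would meet the "typical" threshold if this fraction is below 0.20, and the especially preferred case corresponds to a fraction of zero.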
Preferably, the multi-probes are modified in order to increase the binding
affinity of the probe
for a target sequence by at least two-fold compared to a probe of the same
sequence without
the modification, under the same conditions for detection, e.g., such as PCR
conditions, or
stringent hybridization conditions. The preferred modifications include, but are not
limited to, inclusion of nucleobases, nucleosidic bases or nucleotides that have been
modified by a chemical moiety or replaced by an analogue (e.g. including a ribose or
deoxyribose analogue), or use of internucleotide linkages other than phosphodiester
linkages (such as non-phosphate internucleotide linkages), all to increase the binding
affinity. The preferred modifications may also include attachment of duplex-stabilizing
agents, such as minor-groove-binders (MGB) or intercalating nucleic acids (INA).
Additionally, the preferred modifications may include addition of non-discriminatory
bases, such as 5-nitroindole, which are capable of stabilizing duplex formation
regardless of the nucleobase at the opposing position on the target strand. Indeed, a
preferred embodiment entails that all probes in the inventive library include at least
one 5-nitroindole residue (and most preferred: all probes include one single
5-nitroindole residue). Finally, multi-probes composed of a non-sugar-phosphate
backbone, such as PNA, that are capable of binding sequence-specifically to a target
sequence are also considered modified. All the different binding-affinity-increasing
modifications mentioned above will in the following be referred to as "the stabilizing
modification(s)", and the ensuing multi-probe will in the following also be referred to
as "modified oligonucleotide".
More preferably the binding affinity of the modified oligonucleotide is at
least about 3-fold, 4-
fold, 5-fold, or 20-fold higher than the binding of a probe of the same
sequence but without
the stabilizing modification(s).
Most preferably, the stabilizing modification(s) is inclusion of one or more
LNA nucleotide
analogs. Probes of from 6 to 12 nucleotides according to the invention may
comprise from 1
to 8 stabilizing nucleotides, such as LNA nucleotides. When at least two LNA
nucleotides are
included, these may be consecutive or separated by one or more non-LNA
nucleotides. In
one aspect, LNA nucleotides are alpha and/or xylo LNA nucleotides.
The invention also provides an oligomer multi-probe library useful under conditions
used in NASBA-based assays.
NASBA is a specific, isothermal method of nucleic acid amplification suited
for the amplifica-
tion of RNA. Nucleic acid isolation is achieved via lysis with guanidine
thiocyanate plus Triton
X-100 and ending with purified nucleic acid being eluted from silicon dioxide
particles. Ampli-
fication by NASBA involves the coordinated activities of three enzymes, AMV
Reverse Tran-
scriptase, RNase H, and T7 RNA Polymerase. Quantitative detection is achieved
by way of
internal calibrators, added at isolation, which are co-amplified and
subsequently identified
along with the wild type of RNA using electrochemiluminescence.
The invention also provides an oligomer multi-probe library comprising multi-
probes compri-
sing at least one with stabilizing modifications as defined above. Preferably,
the probes are
less than about 20 nucleotides in length and more preferably less than 12
nucleotides, and
most preferably about 8 or 9 nucleotides. Also, preferably, the library
comprises less than
about 3000 probes and more preferably the library comprises less than 500
probes and most
preferably about 100 probes. The libraries containing labelled multi-probes
may be used in a
variety of applications depending on the type of detection element attached to
the recogni-
tion element. These applications include, but are not limited to, dual or
single labelled assays
such as 5' nuclease assay, molecular beacon applications (see, e.g., Tyagi and
Kramer Nat.
Biotechnol. 14: 303-308, 1996) and other FRET-based assays.
In one aspect of the invention, the multi-probes described above are designed
together to
complement each other as a predefined subset of all possible sequences of the
given lengths
selected to be able to detect/characterize/quantify the largest number of
nucleic acids in a
complex mixture using the smallest number of multi-probe sequences. These
predesigned
small subsets of all possible sequences constitute a multi-probe library. The
multi-probe libraries described by the present invention attain this functionality at a
greatly reduced complexity by deliberately selecting the most commonly occurring
oligomers of a given length or lengths while attempting to diversify the selection to
get the best possible coverage of the
complex nucleic acid target population. In one preferred aspect, probes of the
library hybri-
dize with more than about 60% of a target population of nucleic acids, such as
a population
of human mRNAs. More preferably, the probes hybridize with greater than 70%,
greater
than 80%, greater than 90%, greater than 95% and even greater than 98% of all
target nu-
cleic acid molecules in a population of target molecules (see, e.g., Fig. 1).
In a most preferred aspect of the invention, a probe library (e.g. about 100
multi-probes) comprising about 0.1% of all possible sequences of the selected probe
length(s) is capable of detecting, classifying, and/or quantifying more than 98% of mRNA
transcripts in the transcriptome of any specific species, particularly mammals and more
particularly humans (i.e., > 35,000 different mRNA sequences). In fact, it is preferred
that at least 85% of all target nucleic acids in a target population are covered by a
multi-probe library of the invention.
The problems with existing homogeneous assays mentioned above are addressed by
the use
of a multi-probe library according to the invention consisting of a minimal
set of short detec-
tion probes selected so as to recognize or detect a majority of all expressed
genes in a given
cell type from a given organism. In one aspect, the library comprises probes
that detect
each transcript in a transcriptome of greater than about 10,000 genes, greater
than about
15,000 genes, greater than about 20,000 genes, greater than about 25,000
genes, greater
than about 30,000 genes or greater than about 35,000 genes or equivalent
numbers of different mRNA transcripts. In one preferred aspect, the library comprises
probes that detect mammalian transcript sequences, such as mouse, rat, rabbit, monkey,
or human sequences.
By providing a cost efficient multi-probe set useful for rapid development of
quantitative real-
time and end-point PCR assays, the present invention overcomes the limitations
discussed
above for contemporary homogeneous assays. The detection element of the multi-probes
according to the invention may be singly or doubly labelled (e.g. by comprising a label
at each end of the probe, or at an internal position). Thus, probes according to the
invention can
be adapted for use in 5' nuclease assays, molecular beacon assays, FRET
assays, and other
similar assays. In one aspect, the detection multi-probe comprises two labels
capable of in-
teracting with each other to produce a signal or to modify a signal, such that
a signal or a
change in a signal may be detected when the probe hybridizes to a target
sequence. A parti-
cular aspect is when the two labels comprise a quencher and a reporter
molecule.
In another aspect, the probe comprises a target-specific recognition segment
capable of spe-
cifically hybridizing to a plurality of different nucleic acid molecules
comprising the comple-
mentary recognition sequence. A particular detection aspect of the invention
referred to as a
"molecular beacon with a stem region" is when the recognition segment is
flanked by first
and second complementary hairpin-forming sequences which may anneal to form a
hairpin.
A reporter label is attached to the end of one complementary sequence and a
quenching
moiety is attached to the end of the other complementary sequence. The stem
formed when
the first and second complementary sequences are hybridized (i.e., when the
probe recogni-
tion segment is not hybridized to its target) keeps these two labels in close
proximity to each
other, causing a signal produced by the reporter to be quenched by
fluorescence resonance
energy transfer (FRET). The proximity of the two labels is reduced when the
probe is hybri-
dized to a target sequence and the change in proximity produces a change in
the interaction
between the labels. Hybridization of the probe thus results in a signal (e.g.
fluorescence)
being produced by the reporter molecule, which can be detected and/or
quantified.
In another aspect, the multi-probe comprises a reporter and a quencher
molecule at oppo-
sing ends of the short recognition sequence, so that these moieties are in
sufficient proximity
to each other, that the quencher substantially reduces the signal produced by
the reporter
molecule. This is the case both when the probe is free in solution and when it is bound
to the target nucleic acid. A particular detection aspect of the invention, referred to
as a "5' nuclease assay", is when the multi-probe may be susceptible to cleavage by the
5' nuclease activity of the DNA polymerase. This reaction may result in separation of
the quencher molecule from the reporter molecule and the production of a detectable
signal.
Thus, such probes can be used in amplification-based assays to detect and/or
quantify the
amplification process for a target nucleic acid.
In a first aspect, the present invention relates to libraries of multi-probes
as discussed above.
In such a library of oligonucleotide probes, each probe comprises a detection
element and a
recognition segment having a length of about 8-9 nucleotides, where some or
all of the
nucleobases in said oligonucleotides are substituted by non-natural bases
having the effect of
increasing binding affinity compared to natural nucleobases, and/or some or
all of the nucleo-
tide units of the oligonucleotide probe are modified with a chemical moiety to
increase bin-
ding affinity, and/or where said oligonucleotides are modified with a chemical
moiety to in-
crease binding affinity, such that the probe has sufficient stability for
binding to the target
sequence under conditions suitable for detection, and wherein the number of
different recog-
nition segments comprises less than 10% of all possible segments of the given
length, and
wherein more than 90% of the probes can detect more than one complementary
target in a
target population of nucleic acids such that the library of oligonucleotide
probes can detect a
substantial fraction of all target sequences in a target population of nucleic
acids.
The invention therefore relates to a library of oligonucleotide probes wherein
each probe in
the library consists of a recognition sequence tag and a detection moiety
wherein at least one
monomer in each oligonucleotide probe is a modified monomer analogue,
increasing the
binding affinity for the complementary target sequence relative to the
corresponding unmo-
dified oligonucleotide (which may e.g. be an unmodified
oligodeoxyribonucleotide or
oligoribonucleotide), such that the library probes have sufficient stability
for sequence-
specific binding and detection of a substantial fraction of a target nucleic
acid in any given
target population and wherein the number of different recognition sequences
comprises less
than 10% of all possible sequence tags of a given length(s).
The invention further relates to a library of oligonucleotide probes wherein the
recognition sequence tag segment of the probes in the library has been modified in at
least one of the following ways:
i) substitution with at least one non-naturally occurring nucleotide; and
ii) substitution with at least one chemical moiety to increase the stability of the
probe.
Further, the invention relates to a library of oligonucleotide probes wherein the
recognition sequence tag has a length of 6 to 12 nucleotides (i.e. 6, 7, 8, 9, 10, 11 or
12), and wherein the preferred length is 8 or 9 nucleotides.
Further, the invention relates to recognition sequence tags that are substituted with
LNA nucleotides.
Also part of the invention is an oligonucleotide probe comprising a quencher of formula
I and a 5-nitroindole residue. It is believed that such useful multi-probes are
inventive in their own right. Preferred such probes are free from a 5' guanidyl residue,
and in general such inventive probes are disclosed in the present specification and
claims. Especially preferred probes are those set forth in Table 1, Table 1A, Fig. 13,
or Fig. 14.
Moreover, the invention relates to libraries of the invention where more than 90% of the
oligonucleotide probes can bind and detect at least two target sequences in a nucleic
acid population, preferably because the bound target sequences are complementary to the
recognition sequence of the probes.
Also preferably, the probe is capable of detecting more than one target in a
target population
of nucleic acids, e.g., the probe is capable of hybridizing to a plurality of
different nucleic acid
molecules contained within the target population of nucleic acids.
The invention also provides a method, system and computer program embedded in a computer
readable medium ("a computer program product") for designing multi-probes comprising at
least one stabilizing nucleobase. The method comprises querying a database of target
sequences (e.g. a database of expressed sequences) and designing a small set of probes
(e.g. 50, 100, 200, 300 or 500) which: i) have sufficient binding stability to bind
their respective target sequences under PCR conditions, ii) have limited propensity to
form duplex structures with themselves, and iii) are capable of binding to and
detecting/quantifying at least about 60%, at least about 70%, at least about 80%, at
least about 90% or at least about 95% of all the sequences in the given database of
sequences, such as a database of expressed sequences.
Probes are designed in silico, starting from all possible combinations of nucleotides of
a given length, which form a database of virtual candidate probes. These virtual probes
are queried against the database of target sequences to identify the probes able to
detect the largest number of different target sequences in the database ("optimal
probes"). Optimal probes so identified are removed from the virtual probe database.
Additionally, target nucleic acids which were identified by the previous set of optimal
probes are subtracted from the target nucleic acid database. The remaining probes are
then queried against the remaining target sequences to identify a second set of optimal
probes. The process is repeated until a set of probes is identified which can provide
the desired coverage of the target sequence database. The set may be stored in a
database as a source of sequences for transcriptome analysis. Multi-probes having
recognition sequences that correspond to those in the database may be synthesized to
generate a library of multi-probes.
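The iterative selection described above resembles a greedy covering procedure. The following is a minimal sketch under simplifying assumptions that are not part of the specification: a probe is taken to "detect" a target when its recognition sequence occurs as an exact substring of the target, one optimal probe is chosen per round, ties are broken alphabetically, and the function names and toy targets are hypothetical. It is not the algorithm of Fig. 2.

```python
from itertools import product


def all_candidate_probes(length):
    """Enumerate every sequence of the given length: the virtual probe database."""
    return {"".join(bases) for bases in product("ACGT", repeat=length)}


def greedy_probe_selection(targets, length, desired_coverage):
    """Pick probes one at a time until the desired fraction of targets is covered.

    targets: dict mapping a target name to its sequence (hypothetical input format).
    Returns (probes in order of selection, fraction of targets covered).
    """
    candidates = all_candidate_probes(length)
    remaining = dict(targets)
    total = len(targets)
    selected = []
    while remaining and (total - len(remaining)) / total < desired_coverage:
        # Query the remaining candidates against the remaining targets.
        hits = {}
        for name, seq in remaining.items():
            for i in range(len(seq) - length + 1):
                word = seq[i:i + length].upper()
                if word in candidates:
                    hits.setdefault(word, set()).add(name)
        if not hits:
            break  # no remaining candidate detects any remaining target
        # The optimal probe of this round detects the most remaining targets.
        best = max(sorted(hits), key=lambda word: len(hits[word]))
        selected.append(best)
        candidates.discard(best)   # remove it from the virtual probe database
        for name in hits[best]:
            del remaining[name]    # subtract the targets it detects
    coverage = (total - len(remaining)) / total if total else 0.0
    return selected, coverage


# Toy target database with 3-mer probes, for illustration only.
toy_targets = {"t1": "AAACGT", "t2": "TTACGA", "t3": "GGACGC", "t4": "TTTTTT"}
probes, coverage = greedy_probe_selection(toy_targets, length=3, desired_coverage=1.0)
print(probes, coverage)
```

In the toy database, the word shared by three targets is chosen first and the fourth target is picked up in the second round. A realistic run would use 8-mers or 9-mers against a database of expressed sequences and would apply the stability and self-duplex filters before the first query.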
In one preferred aspect, the target sequence database comprises nucleic acid
sequences
corresponding to human mRNA (e.g., mRNA molecules, cDNAs, and the like).
In another aspect, the method further comprises calculating stability based on
the assump-
tion that the recognition sequence comprises at least one stabilizing
nucleotide, such as an
LNA molecule. In one preferred aspect the calculated stability is used to
eliminate probe re-
cognition sequences with inadequate stability from the database of virtual
candidate probes
prior to the initial query against the database of target sequence to initiate
the identification
of optimal probe recognition sequences.
In another aspect, the method further comprises calculating the propensity for
a given probe
recognition sequence to form a duplex structure with itself based on the
assumption that the
recognition sequence comprises at least one stabilizing nucleotide, such as an
LNA molecule.
In one preferred aspect the calculated propensity is used to eliminate probe
recognition se-
quences that are likely to form probe duplexes from the database of virtual
candidate probes
prior to the initial query against the database of target sequence to initiate
the determination
of optimal probe recognition sequences.
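The specification does not state how the self-duplex propensity is calculated. As a crude stand-in, the sketch below measures the longest stretch of a candidate that is the reverse complement of another stretch of the same sequence, and discards candidates above a threshold. The scoring rule, the function names and the default threshold are assumptions made for illustration; the sketch ignores thermodynamics and the stabilizing effect of LNA residues.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_self_complementary_run(seq):
    """Length of the longest stretch of seq that also occurs in its own
    reverse complement, i.e. that could pair with a second copy of the probe."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    for size in range(len(seq), 0, -1):
        for start in range(len(seq) - size + 1):
            if seq[start:start + size] in rc:
                return size
    return 0


def remove_self_duplex_prone(candidates, max_run=4):
    """Keep only candidates whose longest self-complementary run is at most
    max_run bases (max_run is a hypothetical cut-off)."""
    return [c for c in candidates if longest_self_complementary_run(c) <= max_run]


kept = remove_self_duplex_prone(["ACGTACGT", "CTCCTCCT"])
print(kept)
```

A fully palindromic candidate such as ACGTACGT is removed, while a candidate with no self-complementary stretch is kept.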
In another aspect, the method further comprises evaluating the general
applicability of a
given candidate probe recognition sequence for inclusion in the growing set of
optimal probe
candidates by both a query against the remaining target sequences as well as a
query
against the original set of target sequences. In one preferred aspect, only probe
recognition sequences that are frequently found in both the remaining target sequences
and the original target sequences are added to the growing set of optimal probe
recognition sequences. In a most preferred aspect, this is accomplished by calculating
the product of the scores from these two queries and selecting the probe recognition
sequence with the highest product that is still among the 20% best-scoring probe
recognition sequences in the query against the current targets.
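The product rule above can be sketched as follows, assuming that a "score" is simply the number of targets in which a recognition sequence is found. That definition, the function name, the tie-breaking and all numbers in the example are hypothetical.

```python
def pick_next_probe(current_hits, original_hits, top_percent=20):
    """Choose the next probe recognition sequence.

    current_hits:  dict of sequence -> hits among the remaining (current) targets
    original_hits: dict of sequence -> hits among the original targets
    Only sequences within the best top_percent by current score are eligible;
    among those, the highest product of the two scores wins.
    """
    if not current_hits:
        return None
    ranked = sorted(current_hits, key=lambda p: (-current_hits[p], p))
    keep = max(1, (len(ranked) * top_percent + 99) // 100)  # ceiling, at least one
    shortlist = ranked[:keep]
    return max(shortlist, key=lambda p: current_hits[p] * original_hits.get(p, 0))


# Invented scores for ten hypothetical 8-mers.
current = {"CTGGAGGA": 900, "CAGCCTCC": 850, "TTCCTGCT": 400, "AGGAGCTG": 300,
           "CTTCCTCC": 250, "TGGTGATG": 200, "CCAGGCTG": 150, "GCTGGAGA": 100,
           "TCTGCTGG": 50, "AATTCCGG": 10}
original = {"CTGGAGGA": 1000, "CAGCCTCC": 4000, "TTCCTGCT": 9000, "AGGAGCTG": 500,
            "CTTCCTCC": 400, "TGGTGATG": 300, "CCAGGCTG": 200, "GCTGGAGA": 150,
            "TCTGCTGG": 80, "AATTCCGG": 20}
chosen = pick_next_probe(current, original)
print(chosen)
```

In this example the third-ranked sequence has the largest product overall but falls outside the best 20% against the current targets, so the choice is made between the top two only.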
The invention also provides a computer program embedded in a computer readable
medium
comprising instructions for searching a database comprising a plurality of
different target
sequences and for identifying a set of probe recognition sequences capable of
hybridizing to at least about 60%, about 70%, about 80%, about 90% or about 95% of the
sequences
within the database. In one aspect, the program provides instructions for
executing the
method described above. In another aspect, the program provides instructions
for imple-
menting an algorithm as shown in Fig. 2. The invention further provides a
system wherein
the system comprises a memory for storing a database comprising sequence
information for
a plurality of different target sequences and also comprises an application
program for exe-
cuting the program instructions for searching the database for a set of probe
recognition sequences which is capable of hybridizing to at least about 60%, about 70%,
about 80%, about 90% or about 95% of the sequences within the database.
Another aspect of the invention relates to an oligonucleotide probe comprising
a detection
element and a recognition segment each independently having a length of about
1 to 8 or 9
nucleotides, wherein some or all of the nucleotides in the oligonucleotides
are substituted by
non-natural bases or base analogues having the effect of increasing binding
affinity compared
to natural nucleobases and/or some or all of the nucleotide units of the
oligonucleotide probe
are modified with a chemical moiety or replaced by an analogue to increase
binding affinity,
and/or where said oligonucleotides are modified with a chemical moiety or is
an oligonucleo-
tide analogue to increase binding affinity, such that the probe has sufficient
stability for
binding to the target sequence under conditions suitable for detection, and
wherein the probe
is capable of detecting more than one complementary target in a target
population of nucleic
acids.
A preferred embodiment of the invention is a kit for the characterization or
detection or
quantification of target nucleic acids comprising samples of a library of
multi-probes. In one
aspect, the kit comprises in silico protocols for their use. In another
aspect, the kit com-
prises information relating to suggestions for obtaining inexpensive DNA
primers. The probes
contained within these kits may have any or all of the characteristics
described above. In
one preferred aspect, a plurality of probes comprises at least one stabilizing
nucleotide, such
as an LNA nucleotide. In another aspect, the plurality of probes comprises a
nucleotide cou-
pled to or stably associated with at least one chemical moiety for increasing
the stability of
binding of the probe. In a further preferred aspect, the kit comprises about
100 different
probes. The kits according to the invention allow a user to quickly and
efficiently develop an
assay for thousands of different nucleic acid targets.
The invention further provides a multi-probe comprising one or more LNA
nucleotides, which
has a reduced length of about 8 or 9 nucleotides. By selecting commonly occurring 8- and
9-mers as targets it is possible to detect many different genes with the same
probe. Each 8 or
9-mer probe can be used to detect more than 7000 different human mRNA
sequences. The
necessary specificity is then ensured by the combined effect of inexpensive
DNA primers for
the target gene and by the 8 or 9-mer probe sequence targeting the amplified
DNA (Fig. 1).
In a preferred embodiment the present invention relates to an oligonucleotide
multi-probe
library comprising LNA-substituted octamers and nonamers of less than about
1000 sequen-
ces, preferably less than about 500 sequences, or more preferably less than
about 200 se-
quences, such as consisting of about 100 different sequences selected so that
the library is
able to recognize more than about 90%, more preferably more than about 95% and most
preferably more than about 98% of mRNA sequences of a target organism or target organ.
Positive control samples:
A recurring problem in designing real-time PCR detection assays for multiple
genes is that the
success-rate of these de-novo designs is less than 100%. Troubleshooting a non-
functional
assay can be cumbersome since ideally, a target specific template is needed
for each probe,
to test the functionality of the detection probe. Furthermore, a target
specific template can
be useful as a positive control if it is unknown whether the target is
available in the test sam-
ple. When operating with a limited number of detection probes in a probe
library kit as de-
scribed in the present invention (e.g. 90), it is feasible to also provide
positive control targets
in the form of PCR-amplifiable templates containing all possible targets for
the limited num-
ber of probes (e.g. 90). This feature allows users to evaluate the function of
each probe, and
is not feasible for non-recurring probe-based assays, and thus constitutes a
further beneficial
feature of the invention. For the suggested preferred probe recognition
sequences listed in Fig. 13, we have designed concatamers of control sequences
for all probes, containing a PCR-amplifiable target for each of the first 40
probes.
Probe sequence selection
An important aspect of the present invention is the selection of optimal probe
target sequences in order to target as many targets with as few probes as
possible, given a target selection criterion. This may be achieved by
deliberately selecting target sequences that occur more frequently than would
be expected from a random distribution.
The invention therefore relates in one aspect to a method of selecting
oligonucleotide se-
quences useful in a multi-probe library of the invention, the method
comprising

a) providing a first list of all possible oligonucleotides of a predefined
number of nucleotides, N (typically an integer selected from 6, 7, 8, 9, 10,
11, and 12, preferably 8 or 9), said oligonucleotides having a melting
temperature, Tm, of at least 50 °C (preferably at least 60 °C such as at
least 62 °C),
b) providing a second list of target nucleic acid sequences (such as a list of
a target nucleic
acid population discussed herein),
c) identifying and storing for each member of said first list, the number of
members from
said second list, which include a sequence complementary to said each member,
d) selecting a member of said first list, which in the identification in step
c matches the
maximum number, identified in step c, of members from said second list,
e) adding the member selected in step d to a third list consisting of the
selected oligonucleo-
tides useful in the library according to the invention,
f) subtracting the member selected in step d from said first list to provide a
revised first list,
m) repeating steps d through f until said third list consists of members which
together will be complementary to at least 30% of the members on the list of
target nucleic acid sequences from step b (normally the percentage will be
higher, such as at least 40%, at least 50%, at least 60%, at least 70%, at
least 75%, at least 80%, at least 85%, at least 90%, at least 95%, or even
higher such as at least 97%, at least 98% and even as high as at least 99%).
As a further feature, the method has a bias against including members in the
third list that have a 5' guanidyl (G) and/or a bias against including members
in the third list that have a 3' guanidyl (G). This is the consequence of the
surprising finding that the probes of the present invention are far more
effective in assays when they are free from a 5' guanidyl residue; it has
also been shown that omission of a 3' guanidyl provides advantages under
assay conditions. It is therefore preferred that guanidyl is avoided as the 5'
residue in all oligonucleotide sequences in said third list.
It is preferred that the first list only includes oligonucleotides incapable
of self-hybridization
in order to render a subsequent use of the probes less prone to false
positives.
The selection method may include a number of steps after step f, but before
step m:
g) subtraction of all members from said second list which include a sequence
complementary to the member selected in step d to obtain a revised second list,
h) identification and storing of, for each member of said revised first list,
the number of members from said revised second list, which include a sequence
complementary to said each member,
i) selecting a member of said first list, which in the identification in step
h matches the maximum number, identified in step h, of members from said
second list, or selecting a member of said first list which provides the
maximum number obtained by multiplying the number identified in step h with
the number identified in step c,
j) addition of the member selected in step i to said third list,
k) subtraction of the member selected in step i from said revised first list,
and
l) subtraction of all members from said revised second list which include a
sequence complementary to the member selected in step i.
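The selection steps a through m, including the optional steps g through l, amount to a greedy covering procedure. The following is a minimal Python sketch of that idea, not the program of Fig. 17. It assumes that targets are plain lowercase DNA strings, that a probe matches a target when the target contains the reverse complement of the probe, and that the Tm and self-hybridization filters of step a are applied elsewhere. The function names, the default coverage of 90% and the alphabetical tie-breaking are illustrative assumptions. The alternative criterion of step i (the product of the counts from steps h and c) is not implemented.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def index_targets(targets, n):
    """Step c: map each n-mer probe to the set of targets containing its complement."""
    hits = {}
    for idx, target in enumerate(targets):
        for i in range(len(target) - n + 1):
            probe = reverse_complement(target[i:i + n])
            hits.setdefault(probe, set()).add(idx)
    return hits


def select_probes(targets, n=8, min_coverage=0.9, avoid_5prime_g=True):
    """Greedily pick probes until min_coverage of the targets is matched."""
    hits = index_targets(targets, n)  # only n-mers that match at least one target
    if avoid_5prime_g:
        hits = {p: t for p, t in hits.items() if not p.startswith("g")}
    uncovered = set(range(len(targets)))  # revised second list
    selected = []                         # third list
    while hits and len(targets) - len(uncovered) < min_coverage * len(targets):
        # Steps d and i: candidate matching the most still-uncovered targets
        # (ties broken alphabetically so the result is deterministic).
        best = max(sorted(hits), key=lambda p: len(hits[p] & uncovered))
        newly_covered = hits[best] & uncovered
        if not newly_covered:
            break                         # no remaining candidate adds coverage
        selected.append(best)             # steps e and j
        del hits[best]                    # steps f and k
        uncovered -= newly_covered        # steps g and l
    coverage = 1 - len(uncovered) / len(targets) if targets else 0.0
    return selected, coverage
```

Indexing only the n-mers that actually occur in the targets avoids enumerating all 4^N candidates; candidates that match no target could never be selected in step d anyway.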
The above-mentioned avoidance of guanidyl as the 5' residue is preferably
achieved by i)
reducing the list of step a to include only those that do not include a 5'
guanidyl residue,
and/or ii) avoiding selection in step d and/or i of those sequences which
include a 5' guanidyl
residue, and/or iii) omitting step e and/or j for those sequences that include
a 5' guanidyl
residue.
The selection in step d after step c is conveniently preceded by
identification of those members of said first list which hybridize to more
than a selected percentage (60% or higher, such as the preferred 80%) of the
maximum number of members from said second list, so that only those members so
identified are subjected to the selection in step d.
The method of the invention can also include the feature that it is ensured
that members are not entered on the third list if such members have previously
failed qualitatively as useful probes. Or, in simpler terms, after design of a
library, the individual members are tested for their usefulness, and probes
which are found to behave suboptimally in a relevant assay are included in a
'negative list' which is checked when later designing new probes and probe
libraries. To avoid inclusion in the third list of oligonucleotide sequences
that have previously failed qualitatively, it is possible to i) reduce the
list of step a to include only those that have not previously failed
qualitatively, and/or ii) avoid selection in step d or i of those sequences
that have previously failed qualitatively, and/or iii) omit step e or j for
those sequences that have previously failed qualitatively.
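The candidate restrictions described above (avoidance of a 5' or 3' guanidyl, exclusion of self-hybridizing oligonucleotides, and the negative list) can be combined into one filter applied to the first list. The sketch below is an illustration under stated assumptions: the self-hybridization test is a simplified stand-in that measures the longest stretch of the oligonucleotide whose reverse complement also occurs in the same oligonucleotide, and the threshold `max_self_match=3` is a hypothetical value, not one taken from the text or from the "self" column of the tables.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_self_match(seq):
    """Length of the longest substring whose reverse complement also occurs in seq."""
    rc = reverse_complement(seq)
    best = 0
    for i in range(len(seq)):
        for j in range(i + 1, len(seq) + 1):
            # seq[i:j] occurs in rc exactly when its reverse complement occurs in seq
            if j - i > best and seq[i:j] in rc:
                best = j - i
    return best


def passes_filters(oligo, negative_list=frozenset(), max_self_match=3,
                   avoid_5prime_g=True, avoid_3prime_g=False):
    """Return True if the oligo may stay on the first list."""
    oligo = oligo.lower()
    if avoid_5prime_g and oligo.startswith("g"):
        return False                      # 5' guanidyl residue
    if avoid_3prime_g and oligo.endswith("g"):
        return False                      # optional bias against 3' guanidyl
    if oligo in negative_list:
        return False                      # previously failed in an assay
    return longest_self_match(oligo) <= max_self_match
```

Applying the filter once to the list of step a corresponds to option i) in both paragraphs above; the same predicate could instead be checked at step d or step e.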
In the practical implementation of the selection method, said first, second
and third lists are
stored in the memory of a computer system, preferably in a database. The
memory (also
termed "computer readable medium") can be both volatile and non-volatile, i.e.
any memory
device conventionally used in computer systems: a random access memory (RAM),
a read-
only memory (ROM), a data storage device such as a hard disk, a CD-ROM, DVD-
ROM, and
any other known memory device.
The invention also provides a computer program product providing instructions
for imple-
menting the selection method, embedded in a computer-readable medium (defined
as
above). That is, the computer program may be compiled and loaded in an active
computer
memory, or it may be loaded on a non-volatile storage device (optionally in a
compressed
format) from where it can be executed. Consequently, the invention also
includes a system comprising a database of target sequences and an
application program for executing the computer program. Source code for such
a computer program is set forth in Fig. 17.
In a randomly distributed nucleic acid population, the occurrence of selected
sequences of a given length will follow a statistical distribution defined by:
$N_1$ = the complete length of the given nucleic acid population (e.g.
76,002,917 base pairs as in the June 30, 2003 release of RefSeq).
$N_2$ = the number of fragments comprising the nucleic acid population (e.g.
38,556 genes in the June 30, 2003 release of RefSeq).
$N_3$ = the length of the recognition sequence (e.g. 9 base pairs)
$N_4$ = the occurrence frequency
$$N_4 = \frac{N_1 - (N_3 - 1) \times 2 \times N_2}{4^{N_3}}$$
E.g.
$$\frac{76{,}002{,}917 - 8 \times 2 \times 38{,}556}{4^{9}} \approx 287 \text{ occurrences of 9-mer sequences}$$
or
$$\frac{76{,}002{,}917 - 7 \times 2 \times 38{,}556}{4^{8}} \approx 1{,}151 \text{ occurrences of 8-mer sequences}$$
Hence, as described in the example given above, a random 8-mer and 9-mer
sequence would
on average occur 1,151 and 287 times, respectively, in a random population of
the described
38,556 mRNA sequences.
In the example above, the 76,002,917 base pairs originating from 38,556 genes
would correspond to an average transcript length of 1971 bp, each containing
1971-16 or 1955 9-mer target sequences. Thus, as a statistical minimum,
38,556/(1955/287), or approximately 5671 when the unrounded occurrence
frequency is used, 9-mer probes would be needed for one probe to target each
gene.
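The arithmetic of this example can be reproduced with a few lines of Python. This is only a worked check of the figures quoted above; the function name is hypothetical, and the grouping 38,556/(1955/N4) in the last step is an assumption chosen because it reproduces the stated figure of 5671.

```python
def expected_occurrences(total_length, n_fragments, k):
    """N4 = (N1 - (N3 - 1) * 2 * N2) / 4**N3 for a recognition sequence of length k."""
    return (total_length - (k - 1) * 2 * n_fragments) / 4 ** k


N1, N2 = 76_002_917, 38_556                # RefSeq figures quoted in the text

n9 = expected_occurrences(N1, N2, 9)       # about 287.6
n8 = expected_occurrences(N1, N2, 8)       # about 1151.5
avg_len = N1 / N2                          # average transcript length, about 1971.2
sites_per_transcript = round(avg_len) - 16 # 1955, as in the text
probes_needed = N2 / (sites_per_transcript / n9)

print(int(n9), int(n8), round(avg_len), sites_per_transcript, int(probes_needed))
# prints: 287 1151 1971 1955 5671
```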
However, the occurrence of 9-mer sequences is not randomly distributed. In
fact, a small subset of sequences occurs at surprisingly high prevalence, up
to over 30 times the prevalence anticipated from a random distribution. In a
specific target population selected according to preferred criteria,
preferably the most common sequences should be selected to increase the
coverage of a selected library of probe target sequences. As described
previously, selection should be step-wise, such that the selection of the most
common target sequences is evaluated both in the starting target population
and in the population remaining after each selection step.
In a preferred embodiment of the invention the targets for the probe library
are the entire
expressed transcriptome.
Because the success rate of the reverse transcriptase reaction diminishes with
the distance
from the RT-primer used, and since using a poly-T primer targeting the poly-A
tract in
mRNAs is common, the above-mentioned target can further be restricted to only
include the
1000 most proximal bases in each mRNA. This may result in the selection of
another set of
optimal probe target sequences for optimal coverage.
Likewise, the above-mentioned target may be restricted to include only the 50
bp of coding region sequence flanking the introns of a gene, to ensure assays
that preferably only monitor mRNA and not genomic DNA; or to include only
regions not containing di-, tri- or tetra-repeat sequences, to avoid
repetitive binding of probes or primers; or only regions not containing known
allelic variation, to avoid primer or probe mis-annealing due to sequence
variations in target sequences; or to exclude regions of extremely high GC
content, to avoid inhibition of PCR amplification.
The optimal set of probes may vary with each target selection, depending on
the prevalence of target sequences in that selection.
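Two of the target restrictions described above, the 3'-proximal window and the exclusion of regions with very high GC content, can be sketched as simple preprocessing of the target list before probe selection. The function names are hypothetical, the `skip` parameter is an added convenience for leaving out bases at the very 3' end, and the cut-off `max_gc=0.8` is an illustrative assumption since the text gives no numerical threshold.

```python
def three_prime_region(mrna, length=1000, skip=0):
    """Return up to `length` bases ending `skip` bases before the 3' end."""
    end = max(0, len(mrna) - skip)
    start = max(0, end - length)
    return mrna[start:end]


def gc_content(seq):
    """Fraction of g and c bases in seq (0.0 for an empty sequence)."""
    seq = seq.lower()
    return (seq.count("g") + seq.count("c")) / len(seq) if seq else 0.0


def restrict_targets(mrnas, length=1000, skip=0, max_gc=0.8):
    """Keep the 3'-proximal region of each mRNA, dropping regions above max_gc."""
    regions = (three_prime_region(m, length, skip) for m in mrnas)
    return [r for r in regions if r and gc_content(r) <= max_gc]
```

The restricted list would then take the place of the second list in the selection method, which is why a different restriction can lead to a different optimal probe set.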
Examples of probe libraries
Human genomic: A set of genomic sequences can be extracted from a genome, for
example the human genome, by dividing the genomic sequence into pieces of 500
nucleotides in length. Such
a Probe Library can be used to measure any genomic sequence, including
regulatory
sequences, introns, repetitive sequences and other genomic sequences. The
following library
has been identified by means of the methods disclosed herein, cf. Fig. 17.
Table of oligos that are suitable for the human genome.
# no dnaID n nmer newhit cover sum p tm sc self lnaID ok oligo
1 18805 8 cagcctcc 9059 9059 9059 15 69 60 36 3365869 1 cAGCCTCC
2 21671 8 cccaggct 3786 8143 12845 22 66 56 38 2543023 1 ccCAGGCT
3 23888 8 cctcccaa 2446 8442 15291 26 63 56 8 3660644 1 cCTCCCAA
4 54564 8 tcccagca 1858 7179 17149 30 68 58 28 7788972 1 tCCCAGCA
5 55191 8 tcctgcct 1729 7024 18878 33 68 58 28 7798127 1 tCCTGCCT
6 30615 8 ctctgcct 1744 4737 20622 36 65 56 28 4128111 1 cTCTGCCT
7 64852 8 tttcccca 1820 2853 22442 39 63 54 8 8379244 1 tTTCCCCA
8 63383 8 ttctgcct 1603 2969 24045 42 62 54 28 8322415 1 tTCTGCCT
9 244667 9 tgtgtgtgt 1647 2570 25692 45 66 59 32 64978423 1 tGTGTGTGT
10 21781 8 ccccaccc 1457 2710 27149 47 68 60 0 2546029 1 ccCCACCC
11 54741 8 tccctccc 1142 2618 28291 49 63 60 0 7788397 1 tCCCtCCC
12 20964 8 ccactgca 933 6626 29224 51 65 54 38 3563432 1 cCACTGCa
13 32117 8 cttcctcc 1046 2428 30270 53 63 56 0 4185069 1 cTTCCTCC
14 55157 8 tcctctcc 1084 2175 31354 55 64 58 0 7797741 1 tCCTCTCC
15 24029 8 cctctctc 911 2335 32265 56 62 56 0 3661693 1 cCTCTCTC
16 57172 8 tcttccca 908 2163 33173 58 62 54 8 7863148 1 tCTTCCCA
17 57255 8 tcttggct 697 3146 33870 59 65 54 36 7863727 1 tCTTGGCT
18 65365 8 ttttcccc 708 2604 34578 60 62 54 0 8387437 1 tTTTCCCC
19 18807 8 cagcctct 628 2511 35206 61 64 56 36 3365871 1 cAGCCTCT
20 59351 8 tgcttcct 712 2128 35918 63 62 54 28 8060783 1 tGCTTCCT
21 63380 8 ttctgcca 730 1955 36648 64 63 54 36 8322412 1 tTCTGCCA
22 24407 8 ccttccct 621 2226 37269 65 65 56 0 3668847 1 cCTTCCCT
23 56696 8 tctcctga 530 2944 37799 66 63 54 33 7855092 1 tCTCCTGA
24 57239 8 tcttgcct 636 2062 38435 67 63 54 28 7863663 1 tCTTGCCT
25 32084 8 cttcccca 593 2028 39028 68 65 56 8 4184940 1 cTTCCCCA
26 62951 8 ttcctgct 577 2011 39605 69 62 54 28 8314799 1 tTCCTGCT
27 59895 8 tggcttct 577 1892 40182 70 64 54 36 8085487 1 tGGCTTCT
28 30167 8 ctcctcct 458 2258 40640 71 62 56 0 4120431 1 cTCCTCCT
29 65108 8 tttgccca 525 1846 41165 72 65 54 33 8383340 1 tTTGCCCA
30 31639 8 ctgtgcct 452 2046 41617 73 66 56 36 4160879 1 cTGTGCCT
31 55252 8 tccttcca 457 1910 42074 74 62 54 8 7798636 1 tCCTTCCA
32 62792 8 ttcccaga 454 1831 42528 74 62 54 30 8313652 1 tTCCCAGA
33 58516 8 tgcagcca 399 1993 42927 75 65 54 38 6999404 1 tgCAGCCA
34 59323 8 tgctgtgt 396 1916 43323 76 62 54 32 8060407 1 tGCTGTGT
35 58871 8 tgccttct 359 2052 43682 76 62 54 28 8052719 1 tGCCTTCT
36 62840 8 ttccctga 398 1776 44080 77 64 54 30 8313844 1 tTCCCTGA
37 65195 8 tttggggt 421 1613 44501 78 69 54 20 8383927 1 tTTGGGGT
38 260055 9 tttcttcct 371 1733 44872 79 62 55 0 67043183 1 tTTCTTCCT
39 30551 8 ctctccct 288 2391 45160 79 62 56 0 4127599 1 cTCTCCCT
40 14715 8 atgcctgt 275 4214 45435 79 63 54 28 2055159 1 aTGCCTGT
41 56660 8 tctcccca 287 1963 45722 80 68 58 8 7854956 1 tCTCCCCA
42 59381 8 tgctttcc 324 1689 46046 81 63 54 28 8060909 1 tGCTTTCC
43 229239 9 tctttctct 300 1731 46346 81 62 55 0 62913519 1 tCTTTCTCT
44 59348 8 tgcttcca 296 1711 46642 82 64 54 28 8060780 1 tGCTTCCA
45 59892 8 tggcttca 286 1703 46928 82 66 54 36 8085484 1 tGGCTTCA
46 59320 8 tgctgtga 287 1603 47215 83 64 54 32 8060404 1 tGCTGTGA
47 30021 8 ctcccacc 216 3033 47431 83 67 60 8 4119341 1 cTCCCACC
48 30887 8 ctgaggct 217 1972 47648 83 66 56 36 4148655 1 cTGAGGCT
49 55176 8 tcctgaga 243 1668 47891 84 64 54 36 7798068 1 tCCTGAGA
50 15083 8 atggtggt 196 2182 48087 84 65 54 10 2060215 1 aTGGTGGT
51 57063 8 tctgtgct 238 1644 48325 85 63 54 36 7860143 1 tCTGTGCT
52 63399 8 ttctggct 214 1766 48539 85 62 54 36 8322479 1 tTCTGGCT
53 54655 8 tccccttt 204 1753 48743 85 63 54 0 7789567 1 tCCCCTTT
54 31368 8 ctgggaga 172 2023 48915 86 65 54 22 3108148 1 ctGGGAGA
55 55289 8 tcctttgc 190 1750 49105 86 64 54 28 7798773 1 tCCTTTGC
56 259575 9 tttccttct 199 1627 49304 86 62 55 0 67035119 1 tTTCCTTCT
57 57317 8 tctttgcc 196 1600 49500 87 64 54 28 7864237 1 tCTTTGCC
58 30612 8 ctctgcca 164 1806 49664 87 66 56 36 4128108 1 cTCTGCCA
59 61087 8 tgtggctt 180 1569 49844 87 65 54 36 8121727 1 tGTGGCTT
60 53855 8 tcagcctt 155 1798 49999 88 62 54 36 7760767 1 tCAGCCTT
61 58877 8 tgcctttc 155 1692 50154 88 63 54 28 8052733 1 tGCCTTTC
62 30164 8 ctcctcca 146 1760 50300 88 63 56 8 4120428 1 cTCCTCCA
63 244479 9 tgtggtttt 166 1450 50466 88 67 55 16 64974847 1 tGTGGTTTT
64 58751 8 tgcccttt 151 1472 50617 89 64 54 28 8051711 1 tGCCCTTT
65 261495 9 ttttcctct 164 1261 50781 89 62 55 0 67099631 1 tTTTCCTCT
66 260085 9 tttctttcc 143 1379 50924 89 62 55 0 67043309 1 tTTCTTTCC
67 259935 9 tttctcctt 140 1356 51064 89 62 55 0 67042175 1 tTTCTCCTT
68 251901 9 ttccttttc 145 1239 51209 90 62 55 0 66519037 1 tTCCTTTTC
69 65191 8 tttgggct 136 1289 51345 90 68 54 36 8383919 1 tTTGGGCT
70 58868 8 tgccttca 123 1578 51468 90 64 54 28 8052716 1 tGCCTTCA
71 4583 8 acactgct 122 1495 51590 90 63 54 36 1466287 1 aCACTGCT
72 227199 9 tctctcttt 116 1652 51706 91 62 55 0 62847999 1 tCTCTCTTT
73 31300 8 ctggcaca 113 1487 51819 91 65 54 38 4156200 1 cTGGCACa
74 59901 8 tggctttc 113 1456 51932 91 64 54 36 8085501 1 tGGCTTTC
75 19796 8 catcccca 110 1496 52042 91 64 56 16 3398508 1 cATCCCCA
76 24039 8 cctctgct 100 1949 52142 91 64 56 28 3661743 1 cCTCTGCT
77 10199 8 agcttcct 95 1717 52237 91 62 54 38 1769327 1 aGCTTCCT
78 61112 8 tgtggtga 99 1540 52336 92 66 54 12 8121844 1 tGTGGTGA
79 58543 8 tgcaggtt 106 1381 52442 92 64 54 38 8048063 1 tGCAGGTT
80 22493 8 cccttctc 90 1719 52532 92 63 56 0 3604349 1 cCCTTCTC
81 61397 8 tgtttccc 92 1538 52624 92 62 54 14 8126317 1 tGTTTCCC
82 59256 8 tgctctga 95 1423 52719 92 64 54 36 8059892 1 tGCTCTGA
83 7911 8 actgtgct 93 1413 52812 92 64 54 36 1568687 1 aCTGTGCT
84 10196 8 agcttcca 91 1426 52903 93 63 54 38 1769324 1 aGCTTCCA
85 251895 9 ttcctttct 82 1411 52985 93 62 55 0 66519023 1 tTCCTTTCT
86 63867 8 ttgcctgt 81 1506 53066 93 62 54 28 8346615 1 tTGCCTGT
87 7655 8 actctgct 86 1260 53152 93 63 54 28 1564591 1 aCTCTGCT
88 234487 9 tgcatttct 84 1242 53236 93 62 55 38 64389103 1 tGCATTTCT
89 64119 8 ttggctct 75 1425 53311 93 62 54 36 8350703 1 tTGGCTCT
90 59284 8 tgctgcca 71 1512 53382 93 67 54 38 7011692 1 tgCTGCCA
Bacteria: 199 bacterial and archaeal genomes can be downloaded from NCBI
(ftp.ncbi.nih.gov). The genomes can be classified according to their use of
nucleotides. Nucleotide use is even if every nucleotide (a, c, g, t) is used
25% of the time. Deviation from even usage can, for example, be taken as any
usage that differs by more than 3%. Following this criterion, the 199 genomes
divide into: 91 AT rich, 44 GC rich, 28 with no >3% skewness, 21 A rich, and
15 in other categories.
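A classification of this kind could be sketched as follows. The text defines only the 25% baseline and the 3% deviation; the exact rules that assign a genome to "AT rich", "GC rich" or "A rich" are not given, so the rules in this sketch (which nucleotides exceed 28%) are assumptions, as is the function name.

```python
from collections import Counter


def classify_genome(seq, threshold=0.03):
    """Classify nucleotide usage relative to an even 25% use of a, c, g and t."""
    counts = Counter(b for b in seq.lower() if b in "acgt")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no a, c, g or t")
    freq = {b: counts[b] / total for b in "acgt"}
    over = {b for b in "acgt" if freq[b] - 0.25 > threshold}
    under = {b for b in "acgt" if 0.25 - freq[b] > threshold}
    if not over and not under:
        return "even"                     # no >3% skewness
    if over == {"a", "t"}:
        return "AT rich"                  # assumed rule: both a and t above 28%
    if over == {"c", "g"}:
        return "GC rich"                  # assumed rule: both c and g above 28%
    if len(over) == 1:
        return next(iter(over)).upper() + " rich"
    return "other"
```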
Bacteria can be highly AT rich. This explains why probes from a human probe
library do not
give a good coverage. Designing probes for an AT rich organism is a challenge
because of the
low melting temperature. The probes must be longer to achieve the melting
temperature, but
this lowers the coverage. A Probe library for mainly AT rich genomes is given
in the following
"bacteria table" (also identified by means of the program set forth in Fig.
17).
# no dnaID n nmer newhit cover sum p tm sc self lnaID ok oligo
1 64235 8 ttggtggt 15138 15138 15138 5 64 54 12 8351671 1 tTGGTGGT
2 63976 8 ttgctgga 12289 13631 27427 10 68 54 36 8347572 1 tTGCTGGA
3 228852 9 tcttcttca 11067 12888 38494 14 63 55 8 62906348 1 tCTTCTTCA
4 64099 8 ttggcgat 10164 13063 48658 18 63 54 38 8350631 1 tTGGCGAT
5 64232 8 ttggtgga 9220 13163 57878 22 69 54 12 8351668 1 tTGGTGGA
6 63721 8 ttgatggc 8466 12948 66344 25 64 54 28 8343477 1 tTGATGGC
7 237565 9 tgctttttc 8295 12487 74639 28 66 55 28 64487421 1 tGCTTTTTC
8 62951 8 ttcctgct 7481 12549 82120 31 62 54 28 8314799 1 tTCCTGCT
9 63956 8 ttgctcca 6847 12608 88967 34 63 54 30 8347500 1 tTGCTCCA
10 228855 9 tcttcttct 6418 12133 95385 36 62 55 0 62906351 1 tCTTCTTCT
11 65369 8 ttttccgc 6217 11950 101602 38 62 54 28 8387445 1 tTTTCCGC
12 253945 9 ttcttttgc 5716 11886 107318 41 65 55 28 66584565 1 tTCTTTTGC
13 16057 8 attggtgc 5223 12364 112541 43 66 54 36 2092533 1 aTTGGTGC
14 63843 8 ttgccgat 5032 11970 117573 45 62 54 38 8346535 1 tTGCCGAT
15 53833 8 tcagcagc 4631 12189 122204 46 62 54 38 7744309 1 tCAgCAGC
16 57321 8 tctttggc 4344 12242 126548 48 66 54 28 7864245 1 tCTTTGGC
17 63380 8 ttctgcca 4173 11996 130721 50 63 54 36 8322412 1 tTCTGCCA
18 55679 8 tcgccttt 3935 11760 134656 51 62 54 28 7822335 1 tCGCCTTT
19 261961 9 tttttcagc 3809 11550 138465 53 63 55 28 67107637 1 tTTTTCAGC
20 15689 8 attccagc 3267 12463 141732 54 62 54 28 2087733 1 aTTCCAGC
21 57317 8 tctttgcc 3366 11301 145098 55 64 54 28 7864237 1 tCTTTGCC
22 64916 8 tttcgcca 3161 11512 148259 56 63 54 28 8379756 1 tTTCGCCA
23 58249 8 tgatgagc 3063 11204 151322 57 62 54 28 8027445 1 tGATGAGC
24 63717 8 ttgatgcc 2792 11450 154114 59 62 54 28 8343469 1 tTGATGCC
25 57172 8 tcttccca 2957 10260 157071 60 62 54 8 7863148 1 tCTTCCCA
26 5759 8 accgcttt 2572 11074 159643 61 65 54 28 1502207 1 aCCGCTTT
27 65209 8 tttggtgc 2413 11267 162056 62 63 54 36 8383989 1 tTTGGTGC
28 57236 8 tcttgcca 2393 10890 164449 62 65 54 36 7863660 1 tCTTGCCA
29 55796 8 tcgcttca 2299 10806 166748 63 62 54 28 7823340 1 tCGCTTCA
30 61332 8 tgttgcca 2138 11233 168886 64 64 54 36 8125804 1 tGTTGCCA
31 98292 9 cctttttca 2135 10703 171021 65 65 53 8 29360108 1 cCTTTTTCA
32 237439 9 tgcttcttt 2102 10423 173123 66 65 55 28 64486399 1 tGCTTCTTT
33 97791 9 ccttctttt 2143 9728 175266 67 64 53 0 29351935 1 cCTTCTTTT
34 65429 8 ttttgccc 1855 10845 177121 67 64 54 28 8387949 1 tTTTGCCC
35 59348 8 tgcttcca 1844 10290 178965 68 64 54 28 8060780 1 tGCTTCCA
36 98295 9 cctttttct 1911 9610 180876 69 64 53 0 29360111 1 cCTTTTTCT
37 59325 8 tgctgttc 1687 10619 182563 69 62 54 28 8060413 1 tGCTGTTC
38 63855 8 ttgccgtt 1597 10785 184160 70 62 54 28 8346559 1 tTGCCGTT
39 63959 8 ttgctcct 1691 9861 185851 71 62 54 28 8347503 1 tTGCTCCT
40 14973 8 atggcttc 1439 10673 187290 71 65 54 36 2059261 1 aTGGCTTC
41 55935 8 tcggcttt 1432 10401 188722 72 65 54 36 7826431 1 tCGGCTTT
42 15083 8 atggtggt 1394 10337 190116 72 65 54 10 2060215 1 aTGGTGGT
43 261501 9 ttttccttc 1531 9094 191647 73 62 55 0 67099645 1 tTTTCCTTC
44 58345 8 tgattggc 1286 10495 192933 73 65 54 28 8028085 1 tGATTGGC
45 40831 9 agcttcttt 1366 9482 194299 74 65 55 38 14154751 1 aGCTTCTTT
46 60409 8 tggtttgc 1221 10407 195520 74 65 54 28 8093685 1 tGGTTTGC
47 65365 8 ttttcccc 1329 9259 196849 75 62 54 0 8387437 1 tTTTCCCC
48 64932 8 tttcggca 1152 10181 198001 75 64 54 36 8379820 1 tTTCGGCA
49 32244 9 acttcttca 1206 9405 199207 76 65 55 8 12574700 1 aCTTCTTCA
50 54911 8 tccgcttt 1024 10796 200231 76 62 54 28 7793663 1 tCCGCTTT
51 64125 8 ttggcttc 1005 10701 201236 77 64 54 36 8350717 1 tTGGCTTC
52 55805 8 tcgctttc 1084 9724 202320 77 62 54 28 7823357 1 tCGCTTTC
53 57305 8 tctttcgc 958 10624 203278 77 62 54 28 7864181 1 tCTTTCGC
54 261621 9 ttttcttcc 1086 8914 204364 78 62 55 0 67100653 1 tTTTCTTCC
55 60047 8 tggggatt 1010 9349 205374 78 68 54 24 8088895 1 tGGGGATT
56 6047 8 acctgctt 922 10045 206296 78 65 54 28 1506687 1 aCCTGCTT
57 56953 8 tctgctgc 847 10447 207143 79 64 54 38 7842805 1 tCTgCTGC
58 14565 8 atgatgcc 854 10029 207997 79 63 54 28 2052013 1 aTGATGCC
59 32247 9 acttcttct 891 9333 208888 79 64 55 5 12574703 1 aCTTCTTCT
60 63969 8 ttgctgac 802 10101 209690 80 62 54 28 8347557 1 tTGCTGAC
61 253941 9 ttcttttcc 841 9306 210531 80 62 55 0 66584557 1 tTCTTTTCC
62 63465 8 ttcttggc 788 9701 211319 80 64 54 28 8322997 1 tTCTTGGC
63 65001 8 tttctggc 738 10120 212057 81 64 54 36 8380341 1 tTTCTGGC
64 131028 9 ctttttcca 776 9397 212833 81 64 53 8 33554284 1 cTTTTTCCA
65 59371 8 tgcttggt 681 10310 213514 81 65 54 36 8060855 1 tGCTTGGT
66 7805 8 actgcttc 673 10218 214187 82 64 54 28 1567741 1 aCTGCTTC
67 59856 8 tggctcaa 658 10072 214845 82 63 54 38 8085348 1 tGGCTCAA
68 86004 9 ccattttca 739 8750 215584 82 62 53 16 28573676 1 cCATTTTCA
69 63869 8 ttgccttc 626 9973 216210 82 62 54 28 8346621 1 tTGCCTTC
70 1695 8 aacggctt 637 9529 216847 83 63 54 36 1240447 1 aACGGCTT
71 59901 8 tggctttc 623 9558 217470 83 64 54 36 8085501 1 tGGCTTTC
72 65161 8 tttggagc 629 9307 218099 83 65 54 28 8383797 1 tTTGGAGC
73 8057 8 acttctgc 592 9719 218691 83 64 54 28 1571829 1 aCTTCTGC
74 65449 8 ttttgggc 643 8621 219334 83 68 54 28 8388021 1 tTTTGGGC
75 228861 9 tcttctttc 632 8621 219966 84 63 55 0 62906365 1 tCTTCTTTC
76 262005 9 tttttctcc 652 8108 220618 84 62 55 0 67107821 1 tTTTTCTCC
77 5369 8 accattgc 523 9894 221141 84 63 54 36 1495029 1 aCCATTGC
78 60395 8 tggttggt 511 9801 221652 84 66 54 24 8093623 1 tGGTTGGT
79 62969 8 ttccttgc 577 8431 222229 85 62 54 28 8314869 1 tTCCTTGC
80 58341 8 tgattgcc 485 9841 222714 85 63 54 28 8028077 1 tGATTGCC
81 8009 8 acttcagc 483 9585 223197 85 62 54 33 1571637 1 aCTTCAGC
82 61341 8 tgttgctc 475 9458 223672 85 62 54 28 8125821 1 tGTTGCTC
83 55289 8 tcctttgc 519 8560 224191 85 64 54 28 7798773 1 tCCTTTGC
84 61413 8 tgtttgcc 481 8906 224672 86 63 54 28 8126381 1 tGTTTGCC
85 261757 9 ttttgcttc 455 9306 225127 86 65 55 28 67103741 1 tTTTGCTTC
86 65179 8 tttggcgt 428 9562 225555 86 64 54 36 8383863 1 tTTGGCGT
87 122877 9 ctctttttc 479 8379 226034 86 62 53 0 33030141 1 cTCTTTTTC
88 59381 8 tgctttcc 462 8539 226496 86 63 54 28 8060909 1 tGCTTTCC
89 257917 9 ttgttcttc 471 8098 226967 86 62 55 17 66845693 1 tTGTTCTTC
90 60392 8 tggttgga 429 8764 227396 87 70 54 20 8093620 1 tGGTTGGA
Selection of detection means and identification of single nucleic acids
Another part of the invention relates to identification of a means for
detection of a target nu-
cleic acid, the method comprising
A) inputting, into a computer system, data that uniquely identifies the
nucleic acid sequence
of said target nucleic acid, wherein said computer system comprises a database
holding in-
formation of the composition of at least one library of nucleic acid probes of
the invention,
and wherein the computer system further comprises a database of target nucleic
acid se-
quences for each probe of said at least one library and/or further comprises
means for ac-
quiring and comparing nucleic acid sequence data,
B) identifying, in the computer system, a probe from the at least one library,
wherein the
sequence of the probe exists in the target nucleic acid sequence or a sequence
complemen-
tary to the target nucleic acid sequence,
C) identifying, in the computer system, a primer that will amplify the target
nucleic acid se-
quence, and
D) providing, as identification of the specific means for detection, an output
that points out
the probe identified in step B and the sequences of the primers identified in
step C.
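Step B above is essentially a substring search of each library probe against the target and its reverse complement. A minimal Python sketch follows, assuming the library is held as a plain list of lowercase probe sequences (here the first three entries of the human genomic table); the function names and the returned tuple layout are illustrative assumptions, and primer design (step C) is outside the scope of the sketch.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_probes(target, library):
    """Step B: return (probe, strand, position) for each library probe found.

    Strand "+" means the probe sequence occurs in the target itself; strand "-"
    means it occurs in the reverse complement, and the position then refers to
    the reverse-complemented sequence.
    """
    target = target.lower()
    rc_target = reverse_complement(target)
    matches = []
    for probe in library:
        p = probe.lower()
        pos = target.find(p)
        if pos != -1:
            matches.append((probe, "+", pos))
            continue
        pos = rc_target.find(p)
        if pos != -1:
            matches.append((probe, "-", pos))
    return matches


library = ["cagcctcc", "cccaggct", "cctcccaa"]
print(find_probes("ttttcagcctccaaaa", library))  # [('cagcctcc', '+', 4)]
```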
The above-outlined method has several advantages in the event it is desired to
rapidly and
specifically identify a particular nucleic acid. If the researcher already has
acquired a suitable
multi-probe library of the invention, the method makes it possible within
seconds to acquire
information relating to which of the probes in the library one should use for
a subsequent
assay, and of the primers one should synthesize. The time factor is important,
since synthe-
sis of a primer pair can be accomplished overnight, whereas synthesis of the
probe would
normally be quite time-consuming and cumbersome.
To facilitate use of the method, the probe library can be identified (e.g. by
means of a pro-
duct code which essentially tells the computer system how the probe library is
composed).
Step A then comprises inputting, into the computer system, data that
identifies the at least
one library of nucleic acids from which it is desired to select a member for
use in the specific
means for detection.

The preferred inputting interface is an internet-based web-interface, because
the method is
conveniently stored on a web server to allow access from users who have
acquired a probe
library of the present invention. However, the method also would be useful as
part of an
installable computer application, which could be installed on a single
computer or on a local
area network.
In preferred embodiments of this method, the primers identified in step C are
chosen so as to
minimize the chance of amplifying genomic nucleic acids in a PCR reaction.
This is of course
only relevant where the sample is likely to contain genomic material. One
simple way to
minimize the chance of amplification of genomic nucleic acids is to include,
in at least one of
the primers, a nucleotide sequence which in genomic DNA is interrupted by an
intron. In this
way, the primer will only prime amplification of transcripts where the intron
has been spliced
out.
Alternatively, one can choose primer pairs that cannot amplify genomic DNA or
other
transcripts. Such primers can be identified by doing a computerized search
with the primers
against the genome and transcriptome, i.e. an in silico PCR. Such a search
must find and
filter primer pairs where the left and right primer can match the DNA within
the distance of a
typical amplicon length, which can be 600 nucleotides or several thousand
nucleotides. The
left and right primer can match in four different ways: 1: The left primer and
the reverse
complement of the right primer. 2: The left primer and the reverse complement
of the left
primer. 3: The right primer and the reverse complement of the left primer. 4:
The right
primer and the reverse complement of the right primer.
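The four primer combinations listed above can be checked with a simple exact-match search. The sketch below is a simplified in silico PCR under stated assumptions: only perfect matches on a single template string are considered, the primer sites must not overlap, and the default maximum amplicon length of 600 nucleotides is taken from the example in the text. The function names and the returned tuple layout are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, primer):
    """Return all start positions of primer in seq."""
    positions = []
    start = seq.find(primer)
    while start != -1:
        positions.append(start)
        start = seq.find(primer, start + 1)
    return positions


def in_silico_pcr(template, left, right, max_amplicon=600):
    """Return (forward, reverse, start, end) for every predicted product."""
    template = template.lower()
    primers = (left.lower(), right.lower())
    products = []
    # The two nested loops enumerate the four combinations: left/right,
    # left/left, right/left and right/right.
    for fwd in primers:
        for rev in primers:
            rev_site = reverse_complement(rev)
            for start in find_sites(template, fwd):
                for pos in find_sites(template, rev_site):
                    end = pos + len(rev_site)
                    # Reverse site must lie downstream and within amplicon range.
                    if pos >= start + len(fwd) and end - start <= max_amplicon:
                        products.append((fwd, rev, start, end))
    return products
```

A primer pair would pass the filter for a given genome or transcriptome if the only product returned across all sequences is the intended amplicon.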
A further optimization of the method is to choose the primers in step C so as
to minimize the
length of amplicons obtained from PCR performed on the target nucleic acid
sequence and it
is further also preferred to select the primers so as to optimize the GC
content for performing
a subsequent PCR.
As for the probe selection method, the selection method for detection means
can be provided
to the end-user as a computer program product providing instructions for
implementing the
method, embedded in a computer-readable medium. Consequently, the invention
also pro-
vides for a system comprising a database of nucleic acid probes of the
invention and an ap-
plication program for executing this computer program.
The method and the computer programs and system allow for quantitative or
qualitative determination of the presence of a target nucleic acid in a
sample, comprising
i) identifying, by means of the detection means selection method of the
invention, a specific means for detection of the target nucleic acid, where
the specific means for detection comprises an oligonucleotide probe and a set
of primers,
ii) obtaining the primers and the oligonucleotide probe identified in step i),
iii) subjecting the sample to a molecular amplification procedure in the
presence of the pri-
mers and the oligonucleotide probe from step ii), and
iv) determining the presence of the target nucleic acid based on the outcome
of step iii).
Conveniently, primers obtained in step ii) are obtained by synthesis and it is
preferred that
the oligonucleotide probe is obtained from a library of the present invention.
The molecular amplification method is typically a PCR or a NASBA procedure,
but any in vitro
method for specific amplification (and, possibly, detection) of a nucleic acid
is useful. The
preferred PCR procedure is a qPCR (also known as real-time reverse
transcription PCR or ki-
netic RT-PCR).
Other aspects of the invention are discussed infra.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates the use of conventional long probes in panel (A) as well as
the properties
and use of short multi-probes (B) from a library constructed according to the
invention. The
short multi-probes comprise a recognition segment chosen so that each probe
sequence may
be used to detect and/or quantify several different target sequences
comprising the comple-
mentary recognition sequence. Fig. 1A shows a method according to the prior
art. Fig. 1B
shows a method according to one aspect of the invention.
Fig. 2 is a flow chart showing a method for designing multi-probe sequences
for a library ac-
cording to one aspect of the invention. The method can be implemented by
executing in-
structions provided by a computer program embedded in a computer readable
medium. In
one aspect, the program instructions are executed by a system, which comprises
a database
of sequences such as expressed sequences.
Fig. 3 is a graph illustrating the redundancy of probes targeting each gene
within a 100-
probe library according to one aspect of the invention. The y-axis shows the
number of genes
in the human transcriptome that are targeted by different numbers of probes in
the library. It
is apparent that a majority of all genes are targeted by several probes. The
average number
of probes per gene is 17.4.

Fig. 4 shows the theoretical coverage of the human transcriptome by a selection of
hyper-abundant oligonucleotides of a given length. The graphs show the percentage of
approximately 38,000 human mRNA sequences that can be detected by an increasing number of
well-chosen short multi-probes of different lengths. The graph illustrates the theoretical
coverage of the human transcriptome by optimally chosen (i.e. hyper-abundant,
non-self-complementary and thermally stable) short multi-probes of different lengths. The
Homo sapiens transcriptome sequence was obtained from the European Bioinformatics Institute
(EMBL-EBI). A region of 1000 nt proximal to the 3' end of each mRNA sequence was used for
the analysis (from 50 nt to 1050 nt upstream from the 3' end). Because each sequence is
amplified by PCR, both strands of the amplified duplex were considered valid targets for
multi-probes in the probe library. Probe sequences that have inadequate Tm even with LNA
substitutions, as well as self-complementary probe sequences, are excluded.
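The selection described for Fig. 4 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the program of Fig. 17: the function names (`target_region`, `kmer_index`, `greedy_library`) are hypothetical, the greedy "most newly covered transcripts first" rule is one plausible way to pick hyper-abundant sequences, the Tm filter is omitted, and self-complementarity is reduced to exact palindromes (which only occur for even lengths).

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def target_region(mrna: str, skip: int = 50, length: int = 1000) -> str:
    """Region from `skip` nt to `skip + length` nt upstream of the 3' end."""
    end = max(len(mrna) - skip, 0)
    start = max(end - length, 0)
    return mrna[start:end]


def kmer_index(transcripts: dict[str, str], k: int) -> dict[str, set[str]]:
    """Map each k-mer to the transcripts whose amplified duplex contains it."""
    index = defaultdict(set)
    for name, mrna in transcripts.items():
        region = target_region(mrna)
        # Both strands of the PCR product count as valid targets.
        for strand in (region, reverse_complement(region)):
            for i in range(len(strand) - k + 1):
                kmer = strand[i:i + k]
                # Simplified exclusion: drop exact palindromic k-mers only.
                if kmer != reverse_complement(kmer):
                    index[kmer].add(name)
    return index


def greedy_library(transcripts: dict[str, str], k: int, n_probes: int) -> list[tuple[str, int]]:
    """Pick up to n_probes k-mers; each pick covers the most uncovered transcripts.

    Returns (k-mer, cumulative number of covered transcripts) pairs.
    """
    index = kmer_index(transcripts, k)
    covered: set[str] = set()
    library = []
    for _ in range(n_probes):
        # Ties are broken alphabetically so the result is deterministic.
        best = max(index, key=lambda kmer: (len(index[kmer] - covered), kmer), default=None)
        if best is None or not index[best] - covered:
            break
        covered |= index[best]
        library.append((best, len(covered)))
    return library
```

Dividing the cumulative count by the number of transcripts gives a coverage percentage of the kind plotted in Fig. 4 for each library size.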
Fig. 5 shows the MALDI-MS spectrum of the oligonucleotide probe EQ13992, showing
[M-H]- = 4121.3 Da.
Fig. 6 shows representative real time PCR curves for 9-mer multi-probes
detecting target
sequences in a dual labelled probe assay. Results are from real time PCR
reactions with 9 nt
long LNA enhanced dual labelled probes targeting different 9-mer sequences
within the same
gene. Each of the three different dual labelled probes was analysed in PCRs generating the
469, the 570 or the 671 SSA4 amplicons (each between 81 and 95 nt long). Dual labelled
probes 469, 570, and 671 are shown in panels a, b, and c, respectively. Each probe only detects
the amplicon it was designed to detect. The Ct values were 23.7, 23.2, and
23.4 for the dual
labelled probes 469, 570, and 671, respectively. 2 x 10' copies of the SSA4
cDNA were
added as template. The high similarity between results, despite differences in both probe
sequences and their individual primer pairs, indicates that the assays are very robust.
Fig. 7 shows examples of real time PCR curves for Molecular Beacons with a 9-
mer and a 10-
mer recognition site. Panel (A): Molecular beacon probe with a 10-mer
recognition site de-
tecting the 469 SSA4 amplicon. Signal was only obtained in the sample where
SSA4 cDNA
was added (2 x 10' copies). A Ct value of 24.0 was obtained. A similar
experiment with a
molecular beacon having a 9-mer recognition site detecting the 570 SSA4
amplicon is shown
in panel (B). Signal was only obtained when SSA4 cDNA was added (2 x 10'
copies).
Fig. 8 shows an example of a real time PCR curve for a SYBR-probe with a 9-mer
recognition
site targeting the 570 SSA4 amplicon. Signal was only obtained in the sample
where SSA4
cDNA was added (2 x 10' copies), whereas no signal was detected without
addition of tem-
plate.

Fig. 9 shows a calibration curve for three different 9-mer multi-probes using
a dual labelled
probe assay principle. Detection of different copy number levels of the SSA4
cDNA by the
three dual labelled probes. The threshold cycle number is the cycle number at which signal
was first detected for the respective PCR. The slopes (a) and correlation coefficients (R²)
of the three linear regression lines are: a = -3.456 and R² = 0.9999 (Dual-labelled-469),
a = -3.468 and R² = 0.9981 (Dual-labelled-570), and a = -3.499 and R² = 0.9993
(Dual-labelled-671).
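The slopes reported for Fig. 9 can be converted into per-cycle amplification efficiencies with the standard relation E = 10^(-1/slope) - 1. The sketch below assumes that the calibration curves plot threshold cycle against log10 of the template copy number, which the text does not state explicitly; the function name is illustrative only.

```python
def amplification_efficiency(slope: float) -> float:
    """Per-cycle efficiency from a Ct versus log10(copies) slope.

    E = 10 ** (-1 / slope) - 1, so perfect doubling (slope of about -3.32)
    gives E = 1.0, i.e. 100 %.
    """
    if slope >= 0:
        raise ValueError("a standard-curve slope must be negative")
    return 10 ** (-1.0 / slope) - 1.0


# Slopes as reported for the three dual labelled probes in Fig. 9.
slopes = {
    "Dual-labelled-469": -3.456,
    "Dual-labelled-570": -3.468,
    "Dual-labelled-671": -3.499,
}
for probe, slope in slopes.items():
    print(f"{probe}: {amplification_efficiency(slope):.1%}")
```

Under that assumption all three slopes correspond to efficiencies a little below 100 %, which is in line with the statement that the assays behave similarly.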
Fig. 10 shows the use of 9-mer dual labelled multi-probes to quantify a heat
shock protein
before and after exposure to heat shock in a wild type yeast strain as well as a mutant strain
where the corresponding gene has been deleted. Real time detection of SSA4 transcript levels
in wild type (wt) yeast and in the SSA4 knockout mutant with the Dual-labelled-570 probe is
shown. The different strains were either cultured at 30 °C till harvest (- HS) or they were
exposed to 40 °C for 30 minutes prior to harvest (+ HS). The transcript was only detected in
the wild type strain, where it was most abundant in the + HS culture. Ct values were 26.1 and
30.3 for the + HS and the - HS culture, respectively.
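The two Ct values given for Fig. 10 can be turned into an approximate fold difference in transcript abundance. The sketch below assumes perfect doubling per cycle (efficiency of 1.0) and equal input in both reactions, and it applies no normalisation to a reference gene; none of these conditions is stated in the text, and the function name is illustrative.

```python
def fold_difference(ct_reference: float, ct_sample: float, efficiency: float = 1.0) -> float:
    """Relative abundance of sample versus reference.

    Computed as (1 + efficiency) ** (Ct_reference - Ct_sample); a lower Ct in
    the sample means more starting template.
    """
    return (1.0 + efficiency) ** (ct_reference - ct_sample)


# Ct values reported for the wild type strain: 30.3 (- HS) and 26.1 (+ HS).
print(round(fold_difference(ct_reference=30.3, ct_sample=26.1), 1))
```

With these assumptions the heat-shocked culture contains roughly 18 times more SSA4 transcript than the culture kept at 30 °C.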
Fig. 11 shows an example of how more than one gene can be detected by the same
9-mer
probe while nucleic acid molecules without the probe target sequence (i.e.
complementary to
the recognition sequence) will not be detected. In (a) Dual-labelled-469
detects both the
SSA4 (469 amplicon) and the POL5 transcript with Ct values of 29.7 and 30.1,
respectively.
No signal was detected from the APG9 and HSP82 transcripts. In (b) Dual-labelled-570
detects both the SSA4 (570 amplicon) and the APG9 transcript with Ct values of
31.3 and 29.2
respectively. No signal is detected from the POL5 and HSP82 transcripts. In
(c) probe Dual-
labelled-671 detected both the SSA4 (671 amplicon) and the HSP82 transcript
with Ct values
of 29.8 and 25.6 respectively. No signal was detected from the POL5 and APG9
transcripts.
The amplicon produced in the different PCRs is indicated in the legend. The
same amount of
cDNA was used as in the experiments depicted in Figure 10. Only cDNA from non-
heat
shocked wild type yeast was used.
Fig. 12 shows agarose gel electrophoresis of a fraction of the amplicons
generated in the PCR
reactions shown in the example of Fig. 11, demonstrating that the probes are
specific for
target sequences comprising the recognition sequence but do not hybridize to
nucleic acid
molecules which do not comprise the target sequence. Lane 1 contains the SSA4-469 amplicon
(81 bp), lane 2 contains the POL5 amplicon (94 bp), lane 3 contains the APG9 amplicon
(97 bp) and lane 4 contains the HSP82 amplicon (88 bp). Lane M contains a 50 bp ladder
as size indicator. It is clear that a product was formed in all four cases; however, only
amplificates containing the correct multi-probe target sequence (i.e. SSA4-469 and POL5)
were detected by the dual labelled probe 469. That two different amplificates were indeed
produced and detected is evident from the size difference in the detected fragments from
lane 1 and 2.
Fig. 13: Preferred target sequences.
Fig. 14: Further Preferred target sequences.
Fig. 15: Longmers (positive controls). The sequences are set forth in SEQ ID
NOs. 32-46.
Fig. 16: Procedure for the selection of probes and the designing of primers
for qPCR.
Fig. 17: Source code for the program used in the calculation of a multi-probe
dataset.
Fig. 18: The result from performing real time PCR with a probe carrying the Q4
quencher
together with the fluorescein dye.
Fig. 19: The result from performing real time PCR with a dual labelled probe carrying a
3'-Nitroindole.
Fig. 20: The result from performing real time PCR with a probe having a perfect match or a
single mismatch relative to the amplified target sequence. As control, a PCR without addition
of template was included in the experiment.
DETAILED DESCRIPTION
The present invention relates to short oligonucleotide probes or multi-probes,
chosen and
designed to detect, classify or characterize, and/or quantify many different
target nucleic acid
molecules. These multi-probes comprise at least one non-natural modification
(e.g. an LNA nucleotide) for increasing the binding affinity of the probes for a
recognition sequence,
which is a subsequence of the target nucleic acid molecules. The target
nucleic acid mole-
cules are otherwise different outside of the recognition sequence.
In one aspect, the multi-probes comprise at least one nucleotide modified with
a chemical
moiety for increasing binding affinity of the probes for a recognition
sequence, which is a
subsequence of the target nucleic acid sequence. In another aspect, the probes
comprise
both at least one non-natural nucleotide and at least one nucleotide modified
with a chemical
moiety. In a further aspect, the at least one non-natural nucleotide is
modified by the

chemical moiety. The invention also provides kits, libraries and other
compositions compri-
sing the probes.
The invention further provides i) methods for choosing and designing suitable
oligonucleotide
probes for a given mixture of target sequences, ii) individual probes with
these abilities, and
iii) libraries of such probes chosen and designed to be able to detect,
classify, and/or quantify
the largest number of target nucleotides with the smallest number of probe
sequences. Each
probe according to the invention is thus able to bind many different targets,
but may be used
to create a specific assay when combined with a set of specific primers in PCR
assays.
Preferred oligonucleotides of the invention are comprised of about 8 to 9
nucleotide units, a
substantial portion of which comprises stabilizing nucleotides, such as LNA
nucleotides. A
preferred library contains approximately 100 of these probes chosen and
designed to cha-
racterize a specific pool of nucleic acids, such as mRNA, cDNA or genomic DNA.
Such a library
may be used in a wide variety of applications, e.g., gene expression analyses,
SNP detection,
and the like. (See, e.g., Fig. 1).
Definitions
The following definitions are provided for specific terms, which are used in
the disclosure of
the present invention:
As used herein, the singular forms "a", "an" and "the" include plural
references unless the
context clearly dictates otherwise. For example, the term "a cell" includes a
plurality of cells,
including mixtures thereof. The term "a nucleic acid molecule" includes a
plurality of nucleic
acid molecules.
As used herein, the term "transcriptome" refers to the complete collection of transcribed
elements of the genome of any species. In addition to mRNAs, it also represents non-coding
RNAs which are used for structural and regulatory purposes.
As used herein, the term "amplicon" refers to small, replicating DNA fragments.
As used herein, a "sample" refers to a sample of tissue or fluid isolated from
an organism or
organisms, including but not limited to, for example, skin, plasma, serum, spinal fluid,
lymph fluid, synovial fluid, urine, tears, blood cells, organs, tumours, and also to
samples of in vitro
cell culture constituents (including but not limited to conditioned medium
resulting from the
growth of cells in cell culture medium, recombinant cells and cell
components).
As used herein, an "organism" refers to a living entity, including but not
limited to, for exam-
ple, human, mouse, rat, Drosophila (e.g. D. melanogaster), C. elegans, yeast,
Arabidopsis
(e.g. A. thaliana), zebra fish, primates (e.g. chimpanzees), domestic animals,
etc.
By the term "SBC nucleobases" is meant "Selective Binding Complementary"
nucleobases,
i.e. modified nucleobases that can make stable hydrogen bonds to their
complementary nu-
cleobases, but are unable to make stable hydrogen bonds to other SBC
nucleobases. As an
example, the SBC nucleobase A', can make a stable hydrogen bonded pair with
its comple-
mentary unmodified nucleobase, T. Likewise, the SBC nucleobase T' can make a
stable hy-
drogen bonded pair with its complementary unmodified nucleobase, A. However,
the SBC
nucleobases A' and T' will form an unstable hydrogen bonded pair as compared
to the base-
pairs A'-T and A-T'. Likewise, a SBC nucleobase of C is designated C' and can
make a stable
hydrogen bonded pair with its complementary unmodified nucleobase G, and a SBC
nucleo-
base of G is designated G' and can make a stable hydrogen bonded pair with its
comple-
mentary unmodified nucleobase C, yet C' and G' will form an unstable hydrogen
bonded pair
as compared to the basepairs C'-G and C-G'. A stable hydrogen bonded pair is
obtained when
2 or more hydrogen bonds are formed e.g. the pair between A' and T, A and T',
C and G', and
C' and G. An unstable hydrogen bonded pair is obtained when 1 or no hydrogen
bonds is
formed e.g. the pair between A' and T', and C' and G'.
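The pairing rules for SBC nucleobases given above can be summarised as a small lookup. The sketch below is illustrative only: it writes an SBC nucleobase as the base letter followed by a prime (for example "A'"), and the function names and the three result labels are assumptions introduced for this example.

```python
# Watson-Crick partners of the unmodified bases.
PARTNER = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_of(symbol: str) -> str:
    """Strip the SBC prime mark, e.g. "A'" -> "A"."""
    return symbol.rstrip("'")


def is_sbc(symbol: str) -> bool:
    """True if the symbol denotes an SBC nucleobase (written with a prime)."""
    return symbol.endswith("'")


def pair_stability(x: str, y: str) -> str:
    """Classify a base pair as 'stable', 'unstable' or 'non-complementary'."""
    if PARTNER.get(base_of(x)) != base_of(y):
        return "non-complementary"
    # Two SBC nucleobases facing each other form one or no hydrogen bonds.
    if is_sbc(x) and is_sbc(y):
        return "unstable"
    # An SBC nucleobase opposite its unmodified complement (or two unmodified
    # complementary bases) forms 2 or more hydrogen bonds.
    return "stable"


for pair in [("A'", "T"), ("A", "T'"), ("A'", "T'"), ("C'", "G"), ("C'", "G'")]:
    print(pair, pair_stability(*pair))
```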
Especially interesting SBC nucleobases are 2,6-diaminopurine (A', also called D) together
with 2-thio-uracil (U', also called 2SU)(2-thio-4-oxo-pyrimidine) and 2-thio-thymine
(T', also called 2ST)(2-thio-4-oxo-5-methyl-pyrimidine). Fig. 4 illustrates that the pairs
A-2ST and D-T have 2 or more than 2 hydrogen bonds whereas the D-2ST pair forms a single
(unstable) hydrogen bond. Likewise the SBC nucleobases pyrrolo-[2,3-d]pyrimidine-2(3H)-one
(C', also called PyrroloPyr) and hypoxanthine (G', also called I)(6-oxo-purine) are shown
in Fig. 9 where the pairs PyrroloPyr-G and C-I have 2 hydrogen bonds each whereas the
PyrroloPyr-I pair forms a single hydrogen bond.
By "SBC LNA oligomer" is meant a "LNA oligomer" containing at least one "LNA
unit" where
the nucleobase is a "SBC nucleobase". By "LNA unit with an SBC nucleobase" is
meant a
"SBC LNA monomer". Generally speaking SBC LNA oligomers include oligomers that
besides
the SBC LNA monomer(s) contain other modified or naturally-occurring
nucleotides or nucleo-
sides. By "SBC monomer" is meant a non-LNA monomer with a SBC nucleobase. By
"isose-
quential oligonucleotide" is meant an oligonucleotide with the same sequence
in a Watson-
Crick sense as the corresponding modified oligonucleotide e.g. the sequences
agTtcATg is equal to agTscD 2SUg where s is equal to the SBC DNA monomer 2-thio-t or
2-thio-u, D is equal to the SBC LNA monomer LNA-D and 2SU is equal to the SBC LNA monomer
LNA-2SU.
As used herein, the terms "nucleic acid", "polynucleotide" and
"oligonucleotide" refer to pri-
mers, probes, oligomer fragments to be detected, oligomer controls and
unlabelled blocking
oligomers and shall be generic to polydeoxyribonucleotides (containing 2-deoxy-
D-ribose), to
polyribonucleotides (containing D-ribose), and to any other type of
polynucleotide which is an
N-glycoside of a purine or pyrimidine base, or modified purine or pyrimidine
bases. There is
no intended distinction in length between the term "nucleic acid",
"polynucleotide" and "oli-
gonucleotide", and these terms will be used interchangeably. These terms refer
only to the
primary structure of the molecule. Thus, these terms include double- and
single-stranded
DNA, as well as double- and single stranded RNA. The oligonucleotide is
comprised of a se-
quence of approximately at least 3 nucleotides, preferably at least about 6
nucleotides, and
more preferably at least about 8 - 30 nucleotides corresponding to a region of
the designated
nucleotide sequence. "Corresponding" means identical to or complementary to
the designated
sequence.
The oligonucleotide is not necessarily physically derived from any existing or
natural sequen-
ce but may be generated in any manner, including chemical synthesis, DNA
replication, re-
verse transcription or a combination thereof. The terms "oligonucleotide" or
"nucleic acid"
intend a polynucleotide of genomic DNA or RNA, cDNA, semi synthetic, or
synthetic origin
which, by virtue of its origin or manipulation: (1) is not associated with all
or a portion of the
polynucleotide with which it is associated in nature; and/or (2) is linked to
a polynucleotide
other than that to which it is linked in nature; and (3) is not found in
nature.
Because mononucleotides are reacted to make oligonucleotides in a manner such that the 5'
phosphate of one mononucleotide pentose ring is attached to the 3' oxygen of its neighbour
in one direction via a phosphodiester linkage, an end of an oligonucleotide is referred to as
the "5' end" if its 5' phosphate is not linked to the 3' oxygen of a mononucleotide pentose
ring and as the "3' end" if its 3' oxygen is not linked to a 5' phosphate of a subsequent
mononucleotide pentose ring. As used herein, a nucleic acid sequence, even if internal to a
larger oligonucleotide, also may be said to have 5' and 3' ends.
When two different, non-overlapping oligonucleotides anneal to different
regions of the same
linear complementary nucleic acid sequence, the 3' end of one oligonucleotide
points toward
the 5' end of the other; the former may be called the "upstream"
oligonucleotide and the
latter the "downstream" oligonucleotide.

The term "primer" may refer to more than one primer and refers to an
oligonucleotide,
whether occurring naturally, as in a purified restriction digest, or produced
synthetically,
which is capable of acting as a point of initiation of synthesis along a
complementary strand
when placed under conditions in which synthesis of a primer extension product which is
complementary to a nucleic acid strand is catalyzed. Such conditions include the presence of four
different deoxyribonucleoside triphosphates and a polymerization-inducing
agent such as
DNA polymerase or reverse transcriptase, in a suitable buffer ("buffer"
includes substituents
which are cofactors, or which affect pH, ionic strength, etc.), and at a
suitable temperature.
The primer is preferably single-stranded for maximum efficiency in
amplification.
As used herein, the terms "PCR reaction", "PCR amplification", "PCR" and "real-time PCR" are
interchangeable terms used to signify use of a nucleic acid amplification system, which
multiplies the target nucleic acids being detected. Examples of such systems include the
polymerase chain reaction (PCR) system and the ligase chain reaction (LCR) system. Other
methods recently described and known to the person of skill in the art are the nucleic acid
sequence based amplification (NASBA™, Cangene, Mississauga, Ontario) and Q Beta Replicase
systems. The products formed by said amplification reaction may or may not be monitored in
real time or only after the reaction as an end point measurement.
The complement of a nucleic acid sequence as used herein refers to an oligonucleotide which,
when aligned with the nucleic acid sequence such that the 5' end of one sequence is paired
with the 3' end of the other, is in "antiparallel association." Bases not commonly found in
natural nucleic acids may be included in the nucleic acids of the present invention and
include, for example, inosine and 7-deazaguanine. Complementarity may not be perfect; stable
duplexes may contain mismatched base pairs or unmatched bases. Those skilled in the art of
nucleic acid technology can determine duplex stability empirically considering a number of
variables including, for example, the length of the oligonucleotide, percent concentration of
cytosine and guanine bases in the oligonucleotide, ionic strength, and incidence of
mismatched base pairs.
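Two of the variables named above, the percentage of cytosine and guanine bases and the incidence of mismatched base pairs, are simple to compute for a pair of strands in antiparallel association. The following Python sketch is an illustration under stated assumptions: it handles only the four unmodified DNA bases and equal-length strands, and the function names are hypothetical. It does not predict duplex stability or Tm.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def gc_percent(oligo: str) -> float:
    """Percentage of G and C bases in an oligonucleotide."""
    oligo = oligo.upper()
    return 100.0 * sum(base in "GC" for base in oligo) / len(oligo)


def mismatches(strand_5to3: str, other_5to3: str) -> int:
    """Count mismatched positions when two strands are aligned antiparallel.

    Both strands are written 5' to 3'; the 5' end of one is paired with the
    3' end of the other.
    """
    a, b = strand_5to3.upper(), other_5to3.upper()
    if len(a) != len(b):
        raise ValueError("this sketch only compares equal-length strands")
    # Position i of one strand faces position -(i + 1) of the other.
    return sum(COMPLEMENT[x] != y for x, y in zip(a, reversed(b)))


print(gc_percent("AGTTCATG"))              # share of G and C in an 8-mer
print(mismatches("AGTTCATG", "CATGAACT"))  # perfect complement
print(mismatches("AGTTCATG", "CATGAACA"))  # one mismatched position
```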
Stability of a nucleic acid duplex is measured by the melting temperature, or
"Tm". The Tm of
a particular nucleic acid duplex under specified conditions is the temperature
at which half of
the base pairs have disassociated.
As used herein, the term "probe" refers to a labelled oligonucleotide, which
forms a duplex
structure with a sequence in the target nucleic acid, due to complementarity
of at least one
sequence in the probe with a sequence in the target region. The probe,
preferably, does not
contain a sequence complementary to sequence(s) used to prime the polymerase
chain reac-
tion. Generally the 3' terminus of the probe will be "blocked" to prohibit
incorporation of the

probe into a primer extension product. "Blocking" may be achieved by using non-
comple-
mentary bases or by adding a chemical moiety such as biotin or even a
phosphate group to
the 3' hydroxyl of the last nucleotide, which may, depending upon the selected moiety,
serve a dual purpose by also acting as a label.
The term "label" as used herein refers to any atom or molecule which can be
used to provide
a detectable (preferably quantifiable) signal, and which can be attached to a
nucleic acid or
protein. Labels may provide signals detectable by fluorescence, radioactivity,
colorimetry, X-ray diffraction or absorption, magnetism, enzymatic activity, and the like.
As defined herein, "5'→3' nuclease activity" or "5' to 3' nuclease activity" refers to that
activity of a template-specific nucleic acid polymerase including either a 5'→3' exonuclease
activity traditionally associated with some DNA polymerases whereby nucleotides are removed
from the 5' end of an oligonucleotide in a sequential manner, (i.e., E. coli DNA polymerase I
has this activity whereas the Klenow fragment does not), or a 5'→3' endonuclease activity
wherein cleavage occurs more than one nucleotide from the 5' end, or both.
As used herein, the term "thermo stable nucleic acid polymerase" refers to an
enzyme which
is relatively stable to heat when compared, for example, to nucleotide
polymerases from E.
coli and which catalyzes the polymerization of nucleosides. Generally, the
enzyme will initiate
synthesis at the 3'-end of the primer annealed to the target sequence, and
will proceed in the
5'-direction along the template, and if possessing a 5' to 3' nuclease
activity, hydrolyzing or
displacing intervening, annealed probe to release both labelled and unlabelled probe
fragments or intact probe, until synthesis terminates. A representative thermo stable enzyme
isolated from Thermus aquaticus (Taq) is described in U.S. Pat. No. 4,889,818
and a method
for using it in conventional PCR is described in Saiki et al., (1988), Science
239:487.
The term "nucleobase" covers the naturally occurring nucleobases adenine (A), guanine (G),
cytosine (C), thymine (T) and uracil (U) as well as non-naturally occurring nucleobases such
as xanthine, diaminopurine, 8-oxo-N6-methyladenine, 7-deazaxanthine, 7-deazaguanine,
N4,N4-ethanocytosin, N6,N6-ethano-2,6-diaminopurine, 5-methylcytosine,
5-(C3-C6)-alkynyl-cytosine, 5-fluorouracil, 5-bromouracil, pseudoisocytosine,
2-hydroxy-5-methyl-4-triazolopyridin, isocytosine, isoguanine, inosine and the
"non-naturally occurring" nucleobases described in Benner et al., U.S. Patent No. 5,432,272
and Susan M. Freier and Karl-Heinz Altmann, Nucleic Acids Research, 25: 4429-4443, 1997.
The term "nucleobase" thus includes not only the known purine and pyrimidine heterocycles,
but also heterocyclic analogues and tautomers thereof. Further naturally and non naturally
occurring nucleobases include those disclosed in U.S. Patent No. 3,687,808; in chapter 15 by
Sanghvi, in Antisense Research and Application, Ed. S. T. Crooke and B. Lebleu, CRC Press,
1993; in Englisch, et al., Angewandte Chemie, International Edition, 30: 613-722, 1991 (see,
especially pages 622 and 623); in the Concise Encyclopedia of Polymer Science and
Engineering, J. I. Kroschwitz Ed., John Wiley & Sons, pages 858-859, 1990; and in Cook,
Anti-Cancer Drug Design 6: 585-607, 1991, each of which is hereby incorporated by reference
in its entirety.
The term "nucleosidic base" or "nucleobase analogue" is further intended to
include hetero-
cyclic compounds that can serve as nucleosidic bases including certain
"universal bases" that
are not nucleosidic bases in the most classical sense but serve as nucleosidic
bases. Especially mentioned as universal bases are 3-nitropyrrole and 5-nitroindole.
Other preferred
compounds include pyrene and pyridyloxazole derivatives, pyrenyl,
pyrenylmethylglycerol
derivatives and the like. Other preferred universal bases include, pyrrole,
diazole or triazole
derivatives, including those universal bases known in the art.
By "universal base" is meant a naturally-occurring or desirably a non-
naturally occurring
compound or moiety that can pair with a natural base (e.g., adenine, guanine,
cytosine,
uracil, and/or thymine), and that has a Tm differential of 15, 12, 10, 8, 6,
4, or 2 °C or less as
described herein.
By "oligonucleotide," "oligomer," or "oligo" is meant a successive chain of monomers (e.g.,
glycosides of heterocyclic bases) connected via internucleoside linkages. The linkage
between two successive monomers in the oligo consists of 2 to 4, desirably 3, groups/atoms
selected from -CH2-, -O-, -S-, -NR^H-, >C=O, >C=NR^H, >C=S, -Si(R'')2-, -SO-, -S(O)2-,
-P(O)2-, -PO(BH3)-, -P(O,S)-, -P(S)2-, -PO(R'')-, -PO(OCH3)-, and -PO(NHR^H)-, where R^H is
selected from hydrogen and C1-4-alkyl, and R'' is selected from C1-6-alkyl and phenyl.
Illustrative examples of such linkages are -CH2-CH2-CH2-, -CH2-CO-CH2-, -CH2-CHOH-CH2-,
-O-CH2-O-, -O-CH2-CH2-, -O-CH2-CH= (including R5 when used as a linkage to a succeeding
monomer), -CH2-CH2-O-, -NR^H-CH2-CH2-, -CH2-CH2-NR^H-, -CH2-NR^H-CH2-, -O-CH2-CH2-NR^H-,
-NR^H-CO-O-, -NR^H-CO-NR^H-, -NR^H-CS-NR^H-, -NR^H-C(=NR^H)-NR^H-, -NR^H-CO-CH2-NR^H-,
-O-CO-O-, -O-CO-CH2-O-, -O-CH2-CO-O-, -CH2-CO-NR^H-, -O-CO-NR^H-, -NR^H-CO-CH2-,
-O-CH2-CO-NR^H-, -O-CH2-CH2-NR^H-, -CH=N-O-, -CH2-NR^H-O-, -CH2-O-N= (including R5 when
used as a linkage to a succeeding monomer), -CH2-O-NR^H-, -CO-NR^H-CH2-, -CH2-NR^H-O-,
-CH2-NR^H-CO-, -O-NR^H-CH2-, -O-NR^H-, -O-CH2-S-, -S-CH2-O-, -CH2-CH2-S-, -O-CH2-CH2-S-,
-S-CH2-CH= (including R5 when used as a linkage to a succeeding monomer), -S-CH2-CH2-,
-S-CH2-CH2-O-, -S-CH2-CH2-S-, -CH2-S-CH2-, -CH2-SO-CH2-, -CH2-SO2-CH2-, -O-SO-O-,
-O-S(O)2-O-, -O-S(O)2-CH2-, -O-S(O)2-NR^H-, -NR^H-S(O)2-CH2-, -O-S(O)2-CH2-, -O-P(O)2-O-,
-O-P(O,S)-O-, -O-P(S)2-O-, -S-P(O)2-O-, -S-P(O,S)-O-, -S-P(S)2-O-, -O-P(O)2-S-,
-O-P(O,S)-S-, -O-P(S)2-S-, -S-P(O)2-S-, -S-P(O,S)-S-, -S-P(S)2-S-, -O-PO(R'')-O-,
-O-PO(OCH3)-O-, -O-PO(OCH2CH3)-O-, -O-PO(OCH2CH2S-R)-O-, -O-PO(BH3)-O-, -O-PO(NHR^H)-O-,
-O-P(O)2-NR^H-, -NR^H-P(O)2-O-, -O-P(O,NR^H)-O-, -CH2-P(O)2-O-, -O-P(O)2-CH2-, and
-O-Si(R'')2-O-; among which -CH2-CO-NR^H-, -CH2-NR^H-O-, -S-CH2-O-, -O-P(O)2-O-,
-O-P(O,S)-O-, -O-P(S)2-O-, -NR^H-P(O)2-O-, -O-P(O,NR^H)-O-, -O-PO(R'')-O-, -O-PO(CH3)-O-,
and -O-PO(NHR^H)-O-, where R^H is selected from hydrogen and C1-4-alkyl, and R'' is selected
from C1-6-alkyl and phenyl, are especially desirable. Further illustrative examples are
given in Mesmaeker et al., Current Opinion in Structural Biology 1995, 5, 343-355 and
Susan M. Freier and Karl-Heinz Altmann, Nucleic Acids Research, 1997, vol 25, pp 4429-4443.
The left-hand side of the internucleoside linkage is bound to the 5-membered ring as
substituent P* at the 3'-position, whereas the right-hand side is bound to the 5'-position
of a preceding monomer.
By "LNA unit" is meant an individual LNA monomer (e.g., an LNA nucleoside or
LNA nucleo-
tide) or an oligomer (e.g., an oligonucleotide or nucleic acid) that includes
at least one LNA
monomer. LNA units as disclosed in WO 99/14226 are in general particularly
desirable modi-
fied nucleic acids for incorporation into an oligonucleotide of the invention.
Additionally, the
nucleic acids may be modified at either the 3' and/or 5' end by any type of
modification
known in the art. For example, either or both ends may be capped with a
protecting group,
attached to a flexible linking group, attached to a reactive group to aid in
attachment to the
substrate surface, etc. Desirable LNA units and their method of synthesis also
are disclosed
in WO 00/47599, US 6,043,060, US 6,268,490, PCT/JP98/00945, WO 0107455, WO
0100641, WO 9839352, WO 0056746, WO 0056748, WO 0066604, Morita et al.,
Bioorg.
Med. Chem. Lett. 12(1):73-76, 2002; Hakansson et al., Bioorg. Med. Chem. Lett.
11(7):935-
938, 2001; Koshkin et al., J. Org. Chem. 66(25):8504-8512, 2001; Kvaerno et
al., J. Org.
Chem. 66(16):5498-5503, 2001; Hakansson et al., J. Org. Chem. 65(17):5161-
5166, 2000;
Kvaerno et al., J. Org. Chem. 65(17):5167-5176, 2000; Pfundheller et al.,
Nucleosides
Nucleotides 18(9):2017-2030, 1999; and Kumar et al., Bioorg. Med. Chem. Lett.
8(16):2219-2222, 1998.
Preferred LNA monomers, also referred to as "oxy-LNA" are LNA monomers which include
bicyclic compounds as disclosed in PCT Publication WO 03/020739 wherein the bridge between
R4* and R2* as shown in formula (I) below together designate -CH2-O- (methyloxy LNA)
or -CH2-CH2-O- (ethyloxy LNA, also designated ENA).
Further preferred LNA monomers are designated "thio-LNA" or "amino-LNA" including bicyclic
structures as disclosed in WO 99/14226, wherein the heteroatom in the bridge between R4*
and R2* as shown in formula (I) below together designate -CH2-S-, -CH2-CH2-S-, -CH2-NH- or
-CH2-CH2-NH-.
By "LNA modified oligonucleotide" is meant an oligonucleotide comprising at
least one LNA
monomeric unit of formula (I), described infra, having the below described
illustrative exam-
ples of modifications:

[Formula (I): structural diagram of the LNA monomeric unit, showing ring position X, group B,
linkage position P and the R substituents]
wherein X is selected from -O-, -S-, -N(R^N*)-, -C(R6R6*)-, -O-C(R7R7*)-, -C(R6R6*)-O-,
-S-C(R7R7*)-, -C(R6R6*)-S-, -N(R^N*)-C(R7R7*)-, -C(R6R6*)-N(R^N*)-, and
-C(R6R6*)-C(R7R7*)-.
B is selected from a modified base as discussed above e.g. an optionally substituted
carbocyclic aryl such as optionally substituted pyrene or optionally substituted
pyrenylmethylglycerol, or an optionally substituted heteroalicyclic or optionally substituted
heteroaromatic such as optionally substituted pyridyloxazole, optionally substituted pyrrole,
optionally substituted diazole or optionally substituted triazole moieties; hydrogen,
hydroxy, optionally substituted C1-4-alkoxy, optionally substituted C1-4-alkyl, optionally
substituted C1-4-acyloxy, nucleobases, DNA intercalators, photochemically active groups,
thermochemically active groups, chelating groups, reporter groups, and ligands.
P designates the radical position for an internucleoside linkage to a succeeding monomer, or
a 5'-terminal group, such internucleoside linkage or 5'-terminal group optionally including
the substituent R5. One of the substituents R2, R2*, R3, and R3* is a group P* which
designates an internucleoside linkage to a preceding monomer, or a 2'/3'-terminal group. The
substituents of R1*, R4*, R5, R5*, R6, R6*, R7, R7*, R^N*, and the ones of R2, R2*, R3, and
R3* not designating P* each designates a biradical comprising about 1-8 groups/atoms
selected from -C(RaRb)-, -C(Ra)=C(Ra)-, -C(Ra)=N-, -C(Ra)-O-, -O-, -Si(Ra)2-, -C(Ra)-S-,
-S-, -SO2-, -C(Ra)-N(Rb)-, -N(Ra)-, and >C=Q, wherein Q is selected from -O-, -S-, and
-N(Ra)-, and Ra and Rb each is independently selected from hydrogen, optionally substituted
C1-12-alkyl, optionally substituted C2-12-alkenyl, optionally substituted C2-12-alkynyl,
hydroxy, C1-12-alkoxy, C2-12-alkenyloxy, carboxy, C1-12-alkoxycarbonyl,
C1-12-alkylcarbonyl, formyl, aryl, aryloxy-carbonyl, aryloxy, arylcarbonyl, heteroaryl,
hetero-aryloxy-carbonyl, heteroaryloxy, heteroarylcarbonyl, amino, mono- and
di(C1-6-alkyl)amino, carbamoyl, mono- and di(C1-6-alkyl)-amino-carbonyl,
amino-C1-6-alkyl-aminocarbonyl, mono- and di(C1-6-alkyl)amino-C1-6-alkyl-aminocarbonyl,
C1-6-alkyl-carbonylamino, carbamido, C1-6-alkanoyloxy, sulphono, C1-6-alkylsulphonyloxy,
nitro, azido, sulphanyl, C1-6-alkylthio, halogen, DNA intercalators, photochemically active
groups, thermochemically active groups, chelating groups, reporter groups, and ligands,
where aryl and heteroaryl may be optionally substituted, and where two geminal substituents
Ra and Rb together may designate optionally substituted methylene (=CH2), and wherein two
non-geminal or geminal substituents selected from Ra, Rb, and any of the substituents R1*,
R2, R2*, R3, R3*, R4*, R5, R5*, R6 and R6*, R7, and R7* which are present and not involved
in P, P* or the biradical(s) together may form an associated biradical selected from
biradicals of the same kind as defined before; the pair(s) of non-geminal substituents
thereby forming a mono- or bicyclic entity together with (i) the atoms to which said
non-geminal substituents are bound and (ii) any intervening atoms.
Each of the substituents R1*, R2, R2*, R3, R4*, R5, R5*, R6 and R6*, R7, and
R7* which are present and not involved in P, P* or the biradical(s), is
independently selected from hydrogen, optionally substituted C1-12-alkyl,
optionally substituted C2-12-alkenyl, optionally substituted C2-12-alkynyl,
hydroxy, C1-12-alkoxy, C2-12-alkenyloxy, carboxy, C1-12-alkoxycarbonyl,
C1-12-alkylcarbonyl, formyl, aryl, aryloxy-carbonyl, aryloxy, arylcarbonyl,
heteroaryl, heteroaryloxy-carbonyl, heteroaryloxy, heteroarylcarbonyl,
amino, mono- and di(C1-6-alkyl)amino, carbamoyl, mono- and
di(C1-6-alkyl)-amino-carbonyl, amino-C1-6-alkyl-aminocarbonyl, mono- and
di(C1-6-alkyl)amino-C1-6-alkyl-aminocarbonyl, C1-6-alkyl-carbonylamino,
carbamido, C1-6-alkanoyloxy, sulphono, C1-6-alkylsulphonyloxy, nitro, azido,
sulphanyl, C1-6-alkylthio, halogen, DNA intercalators, photochemically active
groups, thermochemically active groups, chelating groups, reporter groups, and
ligands, where aryl and heteroaryl may be optionally substituted, and where
two geminal substituents together may designate oxo, thioxo, imino, or
optionally substituted methylene, or together may form a spiro biradical
consisting of a 1-5 carbon atom(s) alkylene chain which is optionally
interrupted and/or terminated by one or more heteroatoms/groups selected from
-O-, -S-, and -(NRN)- where RN is selected from hydrogen and C1-4-alkyl, and
where two adjacent (non-geminal) substituents may designate an additional bond
resulting in a double bond; and RN*, when present and not involved in a
biradical, is selected from hydrogen and C1-4-alkyl; and basic salts and acid
addition salts thereof.
Exemplary 5', 3', and/or 2' terminal groups include -H, -OH, halo (e.g.,
chloro, fluoro, iodo, or bromo), optionally substituted aryl (e.g., phenyl or
benzyl), alkyl (e.g., methyl or ethyl), alkoxy (e.g., methoxy), acyl (e.g.,
acetyl or benzoyl), aroyl, aralkyl, hydroxy, hydroxyalkyl, alkoxy, aryloxy,
aralkoxy, nitro, cyano, carboxy, alkoxycarbonyl, aryloxycarbonyl,
aralkoxycarbonyl, acylamino, aroylamino, alkylsulfonyl, arylsulfonyl,
heteroarylsulfonyl, alkylsulfinyl, arylsulfinyl, heteroarylsulfinyl,
alkylthio, arylthio, heteroarylthio, aralkylthio, heteroaralkylthio, amidino,
amino, carbamoyl, sulfamoyl, alkene, alkyne, protecting groups (e.g., silyl,
4,4'-dimethoxytrityl, monomethoxytrityl, or trityl (triphenylmethyl)),
linkers (e.g., a linker containing an amine, ethylene glycol, quinone such as
anthraquinone), detectable labels (e.g., radiolabels or fluorescent labels),
and biotin.
It is understood that references herein to a nucleic acid unit, nucleic acid
residue, LNA unit,
or similar term are inclusive of both individual nucleoside units and
nucleotide units and
nucleoside units and nucleotide units within an oligonucleotide.

A "modified base" or other similar term refers to a composition (e.g., a
non-naturally occurring nucleobase or nucleosidic base), which can pair with a
natural base (e.g., adenine, guanine, cytosine, uracil, and/or thymine) and/or
can pair with a non-naturally occurring nucleobase or nucleosidic base.
Desirably, the modified base provides a Tm differential of 15, 12, 10, 8, 6,
4, or 2 °C or less as described herein. Exemplary modified bases are described
in EP 1 072 679 and WO 97/12896.
The term "chemical moiety" refers to a part of a molecule. "Modified by a
chemical moiety" thus refers to a modification of the standard molecular
structure by inclusion of an unusual chemical structure. The attachment of
said structure can be covalent or non-covalent.
The term "inclusion of a chemical moiety" in an oligonucleotide probe thus
refers to attachment of a molecular structure. Such chemical moieties include
but are not limited to covalently and/or non-covalently bound minor groove
binders (MGB) and/or intercalating nucleic acids (INA) selected from a group
consisting of asymmetric cyanine dyes, DAPI, SYBR Green I, SYBR Green II, SYBR
Gold, PicoGreen, thiazole orange, Hoechst 33342, Ethidium Bromide,
1-O-(1-pyrenylmethyl)glycerol and Hoechst 33258. Other chemical moieties
include the modified nucleobases, nucleosidic bases or LNA modified
oligonucleotides.
The term "Dual labelled probe" refers to an oligonucleotide with two attached
labels. In one
aspect, one label is attached to the 5' end of the probe molecule, whereas the
other label is
attached to the 3' end of the molecule. A particular aspect of the invention
contains a fluorescent molecule attached to one end and a molecule, which is
attached to the other end and which is able to quench the fluorophore by
Fluorescence Resonance Energy Transfer (FRET).
5' nuclease assay probes and some Molecular Beacons are examples of Dual
labelled probes.
The term "5' nuclease assay probe" refers to a dual labelled probe which may
be hydrolyzed by the 5'-3' exonuclease activity of a DNA polymerase. A 5'
nuclease assay probe is not necessarily hydrolyzed by the 5'-3' exonuclease
activity of a DNA polymerase under the conditions employed in the particular
PCR assay. The name "5' nuclease assay" is used regardless of the degree of
hydrolysis observed and does not indicate any expectation on behalf of the
experimenter. The terms "5' nuclease assay probe" and "5' nuclease assay"
merely refer to assays where no particular care has been taken to avoid
hydrolysis of the involved probe. "5' nuclease assay probes" are often
referred to as "TaqMan assay probes", and the "5' nuclease assay" as "TaqMan
assay". These names are used interchangeably in this application.
The term "oligonucleotide analogue" refers to a nucleic acid binding molecule
capable of re-
cognizing a particular target nucleotide sequence. A particular
oligonucleotide analogue is
peptide nucleic acid (PNA) in which the sugar phosphate backbone of an
oligonucleotide is

replaced by a protein-like backbone. In PNA, nucleobases are attached to the
uncharged
polyamide backbone yielding a chimeric pseudopeptide-nucleic acid structure,
which is
homomorphous to nucleic acid forms.
The term "Molecular Beacon" refers to a single or dual labelled probe which is
not likely to be
affected by the 5'-3' exonuclease activity of a DNA polymerase. Special
modifications to the
probe, polymerase or assay conditions have been made to avoid separation of
the labels or
constituent nucleotides by the 5'-3' exonuclease activity of a DNA polymerase.
The detection
principle thus relies on a detectable difference in label-elicited signal upon
binding of the mole-
cular beacon to its target sequence. In one aspect of the invention the
oligonucleotide probe
forms an intramolecular hairpin structure at the chosen assay temperature
mediated by com-
plementary sequences at the 5'- and the 3'-end of the oligonucleotide. The
oligonucleotide
may have a fluorescent molecule attached to one end and a molecule attached to
the other,
which is able to quench the fluorophore when brought into close proximity of
each other in
the hairpin structure. In another aspect of the invention, a hairpin structure
is not formed based on complementary structure at the ends of the probe
sequence; instead, the detected signal change upon binding may result from
interaction between one or both of the labels with the formed duplex
structure, from a general change of spatial conformation of the probe upon
binding, or from a reduced interaction between the labels after binding. A
particular aspect of the molecular beacon contains a number of LNA residues to
inhibit hydrolysis by the 5'-3' exonuclease activity of a DNA polymerase.
The term "multi-probe" as used herein refers to a probe which comprises a
recognition seg-
ment which is a probe sequence sufficiently complementary to a recognition
sequence in a
target nucleic acid molecule to bind to the sequence under moderately
stringent conditions
and/or under conditions suitable for PCR, 5' nuclease assay and/or Molecular
Beacon analysis
(or generally any FRET-based method). Such conditions are well known to those
of skill in
the art. Preferably, the recognition sequence is found in a plurality of
sequences being
evaluated, e.g., a transcriptome. A multi-probe according to the invention may
comprise a non-natural nucleotide ("a stabilizing nucleotide") and may have a
higher binding affinity for the recognition sequence than a probe comprising
an identical sequence but without the stabilizing modification. Preferably, at
least one nucleotide of a multi-probe is modified by a chemical moiety (e.g.,
covalently or otherwise stably associated with the probe during at least
hybridization stages of a PCR reaction) for increasing the binding affinity of
the recognition segment for the recognition sequence.
As used herein, a multi-probe with an increased "binding affinity" for a
recognition sequence relative to a probe which comprises the same sequence but
which does not comprise a stabilizing nucleotide refers to a probe for which
the association constant (Ka) of the probe recognition

segment is higher than the association constant of the complementary strands
of a double-
stranded molecule. In another preferred embodiment, the association constant
of the probe
recognition segment is higher than the dissociation constant (Kd) of the
complementary
strand of the recognition sequence in the target sequence in a double stranded
molecule.
A "multi-probe library" or "library of multi-probes" comprises a plurality of
multi-probes, such that the probes in the library together are able to
recognise a major proportion of a transcriptome, including the most abundant
sequences, such that about 60%, about 70%, about 80%, about 85%, more
preferably about 90%, and still more preferably 95%, of the target nucleic
acids in the transcriptome are detected by the probes.
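The coverage figure in this definition can be illustrated with a short sketch.
It is only an illustration under stated assumptions: recognition is reduced to
an exact substring match between a probe's recognition sequence and a target
sequence written on the same strand, and the function name and toy sequences
are hypothetical rather than taken from the application.

```python
def library_coverage(recognition_sequences, targets):
    """Fraction of targets that contain at least one recognition sequence.

    Assumption: "recognised" means an exact substring match; real
    hybridization behaviour (mismatches, Tm, strand) is not modelled.
    """
    if not targets:
        return 0.0
    covered = sum(
        1 for target in targets
        if any(site in target for site in recognition_sequences)
    )
    return covered / len(targets)


# Toy data, invented for illustration only.
toy_targets = ["ATGGCTGCCTTA", "CCCTGCCAGGAA", "TTTTAAAACCCC", "GGGCTGCCTTTT"]
toy_library = ["CTGCC", "TGCCA"]
print(library_coverage(toy_library, toy_targets))
```

A library would meet the "about 60%" level of the definition when this
fraction is roughly 0.6 or higher for the transcriptome in question.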
Monomers are referred to as being "complementary" if they contain nucleobases
that can
form hydrogen bonds according to Watson-Crick base-pairing rules (e.g. G with
C, A with T or
A with U) or other hydrogen bonding motifs such as for example diaminopurine
with T,
inosine with C, pseudoisocytosine with G, etc.
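The Watson-Crick pairing rules named above can be sketched in a few lines. The
extension from single monomers to whole antiparallel strands is an assumption
made for illustration, the alternative motifs (diaminopurine, inosine,
pseudoisocytosine) are deliberately left out, and all names are hypothetical.

```python
# Watson-Crick pairs from the definition: G with C, A with T, A with U.
WATSON_CRICK_PAIRS = {
    ("G", "C"), ("C", "G"),
    ("A", "T"), ("T", "A"),
    ("A", "U"), ("U", "A"),
}


def monomers_complementary(base_a, base_b):
    """True if the two nucleobases form a Watson-Crick pair."""
    return (base_a, base_b) in WATSON_CRICK_PAIRS


def strands_complementary(strand_a, strand_b):
    """True if two strands, both written 5'->3', pair over their full length.

    Assumption: strands pair antiparallel, so strand_b is read in reverse.
    """
    if len(strand_a) != len(strand_b):
        return False
    return all(
        monomers_complementary(a, b)
        for a, b in zip(strand_a, reversed(strand_b))
    )


print(strands_complementary("GATTACA", "TGTAATC"))
```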
The term "succeeding monomer" relates to the neighbouring monomer in the 5'-
terminal di-
rection and the "preceding monomer" relates to the neighbouring monomer in the
3'-terminal
direction.
As used herein, the term "target population" refers to a plurality of
different sequences of
nucleic acids, for example the genome or other nucleic acids from a particular
species inclu-
ding the transcriptome of the genome, wherein the transcriptome refers to the
complete col-
lection of transcribed elements of the genome of any species. Normally, the
number of diffe-
rent target sequences in a nucleic acid population is at least 100, but as
will be clear the
number is often much higher (more than 200, 500, 1000, and 10000 - in the case
where the
target population is a eukaryotic transcriptome).
As used herein, the term "target nucleic acid" refers to any relevant nucleic
acid of a single specific sequence, e.g., a biological nucleic acid, e.g.,
derived from a patient, an animal (a human or non-human animal), a plant, a
bacterium, a fungus, an archaeon, a cell, a tissue, an organism, etc. For
example, where the target nucleic acid is derived from a bacterium, archaeon,
plant, non-human animal, cell, fungus, or non-human organism, the method
optionally further comprises selecting the bacterium, archaeon, plant,
non-human animal, cell, fungus, or non-human organism based upon detection of
the target nucleic acid. In one embodiment, the target nucleic acid is derived
from a patient, e.g., a human patient. In this embodiment, the invention
optionally further includes selecting a treatment, diagnosing a disease, or
diagnosing a genetic predisposition to a disease, based upon detection of the
target nucleic acid.

As used herein, the term "target sequence" refers to a specific nucleic acid
sequence within
any target nucleic acid.
The term "stringent conditions", as used herein, is the "stringency" which
occurs within a range from about Tm-5 °C (5 °C below the melting temperature
(Tm) of the probe) to about 20 °C to 25 °C below Tm. As will be understood by
those skilled in the art, the stringency of hybridization may be altered in
order to identify or detect identical or related polynucleotide sequences.
Hybridization techniques are generally described in Nucleic Acid
Hybridization, A Practical Approach, Ed. Hames, B. D. and Higgins, S. J., IRL
Press, 1985; Gall and Pardue, Proc. Natl. Acad. Sci. USA 63: 378-383, 1969;
and John, et al., Nature 223: 582-587, 1969.
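As a small worked illustration of the stringency range defined above, the
sketch below turns a probe Tm into the corresponding temperature window. The
function name is hypothetical, and taking Tm-25 °C as the lower bound is one
reading of "about 20 °C to 25 °C below Tm".

```python
def stringency_window(tm_celsius, upper_offset=5.0, lower_offset=25.0):
    """Return (lowest, highest) hybridization temperature in degrees C.

    Assumption: the window runs from Tm - 25 (least stringent end)
    to Tm - 5 (most stringent end).
    """
    return (tm_celsius - lower_offset, tm_celsius - upper_offset)


# A probe with a Tm of 65 degrees C gives a window of 40 to 60 degrees C.
print(stringency_window(65.0))
```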
Multi-probes
Referring now to Fig. 1B, a multi-probe according to the invention is
preferably a short sequence probe which binds to a recognition sequence found
in a plurality of different target nucleic acids, such that the multi-probe
specifically hybridizes to the target nucleic acid but does not hybridize to
any detectable level to nucleic acid molecules which do not comprise the
recognition sequence. Preferably, a collection of multi-probes, or multi-probe
library, is able to recognize a major proportion of a transcriptome, including
the most abundant sequences, such that about 60%, about 70%, about 80%, about
85%, more preferably about 90%, and still more preferably 95%, of the target
nucleic acids in the transcriptome are detected by the probes. A multi-probe
according to the invention comprises a "stabilizing modification", e.g., a
non-natural nucleotide ("a stabilizing nucleotide"), and has higher binding
affinity for the recognition sequence than a probe comprising an identical
sequence but without the stabilizing modification. Preferably, at least one
nucleotide of a multi-probe is modified by a chemical moiety (e.g., covalently
or otherwise stably associated with the probe during at least hybridization
stages of a PCR reaction) for increasing the binding affinity of the
recognition segment for the recognition sequence.
In one aspect, a multi-probe of from 6 to 12 nucleotides comprises from 1 to 6
or even up to
12 stabilizing nucleotides, such as LNA nucleotides. An LNA enhanced probe
library contains
short probes that recognize a short recognition sequence (e.g., 8-9
nucleotides). LNA nucleobases
can comprise α-LNA molecules (see, e.g., WO 00/66604) or xylo-LNA molecules
molecules
(see, e.g., WO 00/56748).
In one aspect, it is preferred that the Tm of the multi-probe when bound to
its recognition sequence is between about 55 °C and about 70 °C.

In another aspect, the multi-probes comprise one or more modified nucleobases.
Modified base units may comprise a cyclic unit (e.g. a carbocyclic unit such
as pyrenyl) that is joined to a nucleic unit, such as a 1'-position of
furanosyl ring through a linker, such as a straight or branched chain alkylene
or alkenylene group. Alkylene groups suitably have from 1 (i.e., -CH2-) to
about 12 carbon atoms, more typically 1 to about 8 carbon atoms, still more
typically 1 to about 6 carbon atoms. Alkenylene groups suitably have one, two
or three carbon-carbon double bonds and from 2 to about 12 carbon atoms, more
typically 2 to about 8 carbon atoms, still more typically 2 to about 6 carbon
atoms.
Multi-probes according to the invention are ideal for performing such assays
as real-time PCR as the probes according to the invention are preferably less
than about 25 nucleotides, less than about 15 nucleotides, less than about 10
nucleotides, e.g., 8 or 9 nucleotides. Preferably, a multi-probe can
specifically hybridize with a recognition sequence within a target sequence
under PCR conditions and preferably the recognition sequence is found in at
least about 50, at least about 100, at least about 200, at least about 500
different target nucleic acid molecules. A library of multi-probes according
to the invention will comprise multi-probes, which comprise non-identical
recognition sequences, such that any two multi-probes hybridize to different
sets of target nucleic acid molecules. In one aspect, the sets of target
nucleic acid molecules comprise some identical target nucleic acid molecules,
i.e., a target nucleic acid molecule comprising a gene sequence of interest
may be bound by more than one multi-probe. Such a target nucleic acid molecule
will contain at least two different recognition sequences which may overlap by
one or more, but less than x nucleotides of a recognition sequence comprising
x nucleotides.
In one aspect, a multi-probe library comprises a plurality of different
multi-probes, each different probe localized at a discrete location on a solid
substrate. As used herein, "localize" refers to being limited or addressed at
the location such that a hybridization event detected at the location can be
traced to a probe of known sequence identity. A localized probe may or may not
be stably associated with the substrate. For example, the probe could be in
solution in the well of a microtiter plate and thus localized or addressed to
the well. Alternatively, or additionally, the probe could be stably associated
with the substrate such that it remains at a defined location on the substrate
after one or more washes of the substrate with a buffer. For example, the
probe may be chemically associated with the substrate, either directly or
through a linker molecule, which may be a nucleic acid sequence, a peptide or
other type of molecule, which has an affinity for molecules on the substrate.
Alternatively, the target nucleic acid molecules may be localized on a
substrate (e.g., as a cell or cell lysate or nucleic acids dotted onto the
substrate).

Once the appropriate sequences are determined, multi-LNA probes are preferably
chemically
synthesized using commercially available methods and equipment as described in
the art
(Tetrahedron 54: 3607-30, 1998). For example, the solid phase phosphoramidite
method
can be used to produce short LNA probes (Caruthers, et al., Cold Spring Harbor
Symp. Quant. Biol. 47: 411-418, 1982; Adams, et al., J. Am. Chem. Soc. 105:
661, 1983).
The determination of the extent of hybridization of multi-probes from a multi-
probe library to
one or more target sequences (preferably to a plurality of target sequences)
may be carried
out by any of the methods well known in the art. If there is no detectable
hybridization, the
extent of hybridization is thus 0. Typically, labelled signal nucleic acids
are used to detect
hybridization. Complementary nucleic acids or signal nucleic acids may be
labelled by any
one of several methods typically used to detect the presence of hybridized
polynucleotides.
The most common method of detection is the use of ligands, which bind to
labelled antibo-
dies, fluorophores or chemiluminescent agents. Other labels include
antibodies, which can
serve as specific binding pair members for a labelled ligand. The choice of
label depends on
sensitivity required, ease of conjugation with the probe, stability
requirements, and available
instrumentation.
LNA-containing-probes are typically labelled during synthesis. The flexibility
of the phos-
phoramidite synthesis approach furthermore facilitates the easy production of
LNAs carrying
all commercially available linkers, fluorophores and labelling-molecules
available for this
standard chemistry. LNA may also be labelled by enzymatic reactions e.g. by
kinasing.
Multi-probes according to the invention can comprise single labels or a
plurality of labels. In
one aspect, the plurality of labels comprises a pair of labels which interact
with each other
either to produce a signal or to produce a change in a signal when
hybridization of the multi-
probe to a target sequence occurs.
In another aspect, the multi-probe comprises a fluorophore moiety and a
quencher moiety,
positioned in such a way that the hybridized state of the probe can be
distinguished from the
unhybridized state of the probe by an increase in the fluorescent signal from
the nucleotide.
In one aspect, the multi-probe comprises, in addition to the recognition
element, first and
second complementary sequences, which specifically hybridize to each other,
when the probe
is not hybridized to a recognition sequence in a target molecule, bringing the
quencher mole-
cule in sufficient proximity to said reporter molecule to quench fluorescence
of the reporter
molecule. Hybridization of the target molecule distances the quencher from the
reporter
molecule and results in a signal, which is proportional to the amount of
hybridization.

In another aspect, polymerization of strands of nucleic acids can be
detected using a
detected using a
polymerase with 5' nuclease activity. Fluorophore and quencher molecules are
incorporated
into the probe in sufficient proximity such that the quencher quenches the
signal of the
fluorophore molecule when the probe is hybridized to its recognition sequence.
Cleavage of
the probe by the polymerase with 5' nuclease activity results in separation of
the quencher
and fluorophore molecule, and the presence in increasing amounts of signal as
nucleic acid
sequences
In the present context, the term "label" means a reporter group, which is
detectable either
by itself or as a part of a detection series. Examples of functional parts of
reporter groups
are biotin, digoxigenin, fluorescent groups (groups which are able to absorb
electromagnetic
radiation, e.g. light or X-rays, of a certain wavelength, and which
subsequently re-emit the
energy absorbed as radiation of longer wavelength; illustrative examples are
DANSYL (5-dimethylamino-1-naphthalenesulfonyl), DOXYL
(N-oxyl-4,4-dimethyloxazolidine),
PROXYL (N-
oxyl-2,2,5,5-tetramethylpyrrolidine), TEMPO (N-oxyl-2,2,6,6-
tetramethylpiperidine), dinitro-
phenyl, acridines, coumarins, Cy3 and Cy5 (trademarks for Biological Detection
Systems,
Inc.), erythrosine, coumaric acid, umbelliferone, Texas red, rhodamine,
tetramethyl rhodamine, Rox, 7-nitrobenzo-2-oxa-1-diazole (NBD), pyrene,
fluorescein, Europium,
Ruthenium,
Samarium, and other rare earth metals), radio isotopic labels,
chemiluminescence labels (la-
bels that are detectable via the emission of light during a chemical
reaction), spin labels (a
free radical (e.g. substituted organic nitroxides) or other paramagnetic
probes (e.g. Cu2+,
Mg2+) bound to a biological molecule being detectable by the use of electron
spin resonance
spectroscopy). Especially interesting examples are biotin, fluorescein, Texas
Red, rhodamine,
dinitrophenyl, digoxigenin, Ruthenium, Europium, Cy5, Cy3, etc.
Suitable samples of target nucleic acid molecule may comprise a wide range of
eukaryotic
and prokaryotic cells, including protoplasts; or other biological materials,
which may harbour
target nucleic acids. The methods are thus applicable to tissue culture animal
cells, animal
cells (e.g., blood, serum, plasma, reticulocytes, lymphocytes, urine, bone
marrow tissue,
cerebrospinal fluid or any product prepared from blood or lymph) or any type
of tissue biopsy
(e.g. a muscle biopsy, a liver biopsy, a kidney biopsy, a bladder biopsy, a
bone biopsy, a car-
tilage biopsy, a skin biopsy, a pancreas biopsy, a biopsy of the intestinal
tract, a thymus bi-
opsy, a mammae biopsy, a uterus biopsy, a testicular biopsy, an eye biopsy or
a brain bi-
opsy, e.g., homogenized in lysis buffer), archival tissue nucleic acids, plant
cells or other cells
sensitive to osmotic shock and cells of bacteria, yeasts, viruses,
mycoplasmas, protozoa,
rickettsia, fungi and other small microbial cells and the like.
Target nucleic acids which are recognized by a plurality of multi-probes can
be assayed to
detect sequences which are present in less than 10% in a population of target
nucleic acid

molecules, less than about 5%, less than about 1%, less than about 0.1%, and
less than
about 0.01% (e.g., such as specific gene sequences). The type of assay used to
detect such
sequences is a non-limiting feature of the invention and may comprise PCR or
some other
suitable assay as is known in the art or developed to detect recognition
sequences which are
found in less than 10% of a population of target nucleic acid molecules.
In one aspect, the assay to detect the less abundant recognition sequences
comprises hybri-
dizing at least one primer capable of specifically hybridizing to the
recognition sequence but
substantially incapable of hybridizing to more than about 50, more than about
25, more than
about 10, more than about 5, more than about 2 target nucleic acid molecules
(e.g., the
probe recognizes both copies of a homozygous gene sequence), or more than one
target nu-
cleic acid in a population (e.g., such as an allele of a single copy
heterozygous gene sequence
present in a sample). In one preferred aspect a pair of such primers is
provided and flank
the recognition sequence identified by the multi-probe, i.e., are within an
amplifiable distance
of the recognition sequence such that amplicons of about 40-5000 bases can be
produced,
and preferably, 50-500 or more preferably 60-100 base amplicons are produced.
One or
more of the primers may be labelled.
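The amplicon size ranges given above can be checked with a short sketch. The
coordinate convention (0-based, end-exclusive positions on the target strand),
the tier labels and the function name are assumptions introduced here for
illustration; only the numeric ranges come from the text.

```python
def amplicon_tier(fwd_start, rev_end, site_start, site_end):
    """Classify a primer pair by the amplicon length it would produce.

    fwd_start/rev_end: outer ends of the forward and reverse primers.
    site_start/site_end: position of the multi-probe recognition sequence.
    All positions are 0-based and end-exclusive (an assumed convention).
    """
    if not (fwd_start <= site_start and site_end <= rev_end):
        return "primers do not flank site"
    length = rev_end - fwd_start
    if 60 <= length <= 100:
        return "more preferred"   # 60-100 bases
    if 50 <= length <= 500:
        return "preferred"        # 50-500 bases
    if 40 <= length <= 5000:
        return "acceptable"       # about 40-5000 bases
    return "out of range"


# An 80-base amplicon spanning an 8-mer recognition site at positions 130-138.
print(amplicon_tier(100, 180, 130, 138))
```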
Various amplifying reactions are well known to one of ordinary skill in the
art and include, but
are not limited to PCR, RT-PCR, LCR, in vitro transcription, rolling circle
PCR, OLA and the
like. Multiple primers can also be used in multiplex PCR for detecting a set
of specific target
molecules.
The invention further provides a method for designing multi-probe sequences
for use in methods and kits according to the invention. A flow chart outlining
the steps of the method is shown in Fig. 2.
In one aspect, a plurality of n-mers of n nucleotides is generated in silico,
containing all possible
n-mers. A subset of n-mers with a Tm > 60 °C is selected. In
another aspect,
a subset of these probes is selected which do not self-hybridize to provide a
list or database
of candidate n-mers. The sequence of each n-mer is used to query a database
comprising a
plurality of target sequences. Preferably, the target sequence database
comprises expressed
sequences, such as human mRNA sequences.
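The candidate-generation step described above might be sketched as follows.
This is not the software used in the application. Two stand-ins are assumed:
the Tm estimate uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C),
which does not model LNA stabilization, so the 60 °C threshold from the text
would need an LNA-aware Tm model; and "self-hybridize" is reduced to the
probe being identical to its own reverse complement. All names are
hypothetical.

```python
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def wallace_tm(seq):
    """Crude Tm estimate (Wallace rule); a placeholder, not an LNA model."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def is_self_complementary(seq):
    """Simplified self-hybridization test: perfect palindromic duplex."""
    return seq == reverse_complement(seq)


def candidate_nmers(n, tm_func, tm_min):
    """Yield all n-mers passing the Tm and self-complementarity filters."""
    for letters in product("ACGT", repeat=n):
        seq = "".join(letters)
        if tm_func(seq) > tm_min and not is_self_complementary(seq):
            yield seq


# Toy run with 4-mers and a threshold scaled down to suit the Wallace rule.
candidates = list(candidate_nmers(4, wallace_tm, 12))
print(len(candidates), candidates[:3])
```

Passing the Tm function as a parameter keeps the enumeration separate from
the thermodynamic model, so a more realistic model can be substituted without
changing the filter logic.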
From the list of candidate n-mers used to query the database, n-mers are
selected that iden-
tify a maximum number of target sequences (e.g., n-mers which comprise
recognition seg-
ments which are complementary to subsequences of a maximal number of target
sequences
in the target database) to generate an n-mer/target sequence matrix. Sequences
of n-mers,
which bind to a maximum number of target sequences, are stored in a database
of optimal

probe sequences and these are subtracted from the candidate n-mer database.
Target se-
quences that are identified by the first set of optimal probes are removed
from the target
sequence database. The process is then repeated for the remaining candidate
probes until a
set of multi-probes is identified comprising n-mers which cover more than
about 60%, more
than about 80%, more than about 90% and more than about 95% of target
sequences. The
optimal sequences identified at each step may be used to generate a database
of virtual
multi-probe sequences. Multi-probes may then be synthesized which comprise
sequences
from the multi-probe database.
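The iterative selection described above resembles a greedy covering
procedure, which can be sketched as follows. This is an illustration under
stated assumptions rather than the program of Fig. 2: one probe is chosen per
round, a probe "identifies" a target when its recognition sequence occurs as
an exact substring, ties are broken alphabetically, and all names and data
are hypothetical.

```python
def greedy_probe_selection(candidates, targets, min_coverage=0.9):
    """Pick probes one at a time until min_coverage of targets is reached.

    Returns (chosen_probes, achieved_coverage).
    """
    total = len(targets)
    remaining = set(range(total))        # targets not yet identified
    pool = sorted(set(candidates))       # candidate n-mer database
    chosen = []                          # growing set of optimal probes
    while pool and remaining and (total - len(remaining)) / total < min_coverage:
        # n-mer/target matrix restricted to the remaining targets.
        hits = {p: {i for i in remaining if p in targets[i]} for p in pool}
        best = max(pool, key=lambda p: len(hits[p]))
        if not hits[best]:
            break                        # no candidate covers anything new
        chosen.append(best)
        pool.remove(best)                # subtract from the candidate database
        remaining -= hits[best]          # remove the identified targets
    coverage = (total - len(remaining)) / total if total else 0.0
    return chosen, coverage


# Toy data, invented for illustration only.
toy_targets = ["AAACCC", "CCCGGG", "GGGTTT", "AAATTT"]
toy_candidates = ["AAA", "CCC", "GGG", "TTT"]
print(greedy_probe_selection(toy_candidates, toy_targets))
```

In this toy case two probes are enough to identify all four targets, which
mirrors the idea that a small library of frequently occurring n-mers can cover
a large target set.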
In another aspect, the method further comprises evaluating the general
applicability of a
given candidate probe recognition sequence for inclusion in the growing set of
optimal probe
candidates by both a query against the remaining target sequences as well as a
query
against the original set of target sequences. In one preferred aspect only
probe recognition sequences that are frequently found in both the remaining
target sequences and in the original target sequences are added to the growing
set of optimal probe recognition sequences. In a most preferred aspect this is
accomplished by calculating the product of the scores from these queries and
selecting the probe recognition sequence with the highest product that still
is among the probe recognition sequences with the 20% best scores in the query
against the current targets.
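The product-of-scores rule described above can be sketched as follows. The
score is assumed here to be the number of target sequences containing the
recognition sequence as an exact substring, and the shortlist size is rounded
up so that at least one candidate always remains; both choices, like the
function name and the toy data, are assumptions made for illustration.

```python
import math


def pick_by_product(candidates, current_targets, original_targets,
                    top_fraction=0.2):
    """Select one recognition sequence using the product-of-scores rule."""
    def score(seq, targets):
        return sum(1 for t in targets if seq in t)

    current = {c: score(c, current_targets) for c in candidates}
    original = {c: score(c, original_targets) for c in candidates}

    # Shortlist: best top_fraction of candidates by score on current targets.
    ranked = sorted(candidates, key=lambda c: (-current[c], c))
    keep = max(1, math.ceil(len(ranked) * top_fraction))
    shortlist = ranked[:keep]

    # Within the shortlist, take the highest product of the two scores.
    return max(shortlist, key=lambda c: current[c] * original[c])


# Toy data; top_fraction is raised to 0.4 only because the toy set is tiny.
toy_original = ["AAGG", "AACC", "AATT", "GGCC", "GGTT"]
toy_current = ["GGCC", "GGTT", "AATT"]
toy_candidates = ["AA", "GG", "CC", "TT", "AT"]
print(pick_by_product(toy_candidates, toy_current, toy_original, 0.4))
```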
The invention also provides computer program products for facilitating the
method described
above (see, e.g., Fig. 2). In one aspect, the computer program product
comprises program
instructions, which can be executed by a computer or a user device connectable
to a network
in communication with a memory.
The invention further provides a system comprising a computer memory
comprising a data-
base of target sequences and an application system for executing instructions
provided by
the computer program product.
Kits Comprising Multi-Probes
A preferred embodiment of the invention is a kit for the characterisation or
detection or
quantification of target nucleic acids comprising samples of a library of
multi-probes. In one
aspect, the kit comprises in silico protocols for their use. In another
aspect, the kit compri-
ses information relating to suggestions for obtaining inexpensive DNA primers.
The probes
contained within these kits may have any or all of the characteristics
described above. In
one preferred aspect, a plurality of probes comprises at least one stabilizing
nucleobase, such
as an LNA nucleobase.

In another aspect, the plurality of probes comprises a nucleotide coupled or
stably associated with at least one chemical moiety for increasing the
stability of binding of the probe. In a further preferred aspect, the kit
comprises a number of different probes for covering at least 60% of a
population of different target sequences such as a transcriptome. In one
preferred aspect, the transcriptome is a human transcriptome.
In another aspect, the kit comprises at least one probe labelled with one or
more labels. In
still another aspect, one or more probes comprise labels capable of
interacting with each
other in a FRET-based assay, i.e., the probes may be designed to perform in 5'
nuclease or
Molecular Beacon-based assays.
The kits according to the invention allow a user to quickly and efficiently
develop assays for many different nucleic acid targets. The kit may
additionally comprise one or more reagents for performing an amplification
reaction, such as PCR.
EXAMPLES
The invention will now be further illustrated with reference to the following
examples. It will
be appreciated that what follows is by way of example only and that
modifications to detail
modifications to detail
may be made while still falling within the scope of the invention.
In the following Examples probe reference numbers designate the LNA-
oligonucleotide se-
quences shown in the synthesis examples below.
EXAMPLE 1
Source of transcriptome data
The human transcriptome mRNA sequences were obtained from ENSEMBL. ENSEMBL is
a joint project between EMBL-EBI and the Sanger Institute to develop a
software system which produces and maintains automatic annotation on
eukaryotic genomes (see, e.g., Butler, Nature 406 (6794): 333, 2000). ENSEMBL
is primarily funded by the Wellcome Trust. It is noted that sequence data can
be obtained from any type of database comprising expressed sequences; however,
ENSEMBL is particularly attractive because it presents up-to-date sequence
data and the best possible annotation for metazoan genomes. The file
"Homo_sapiens.cdna.fa" was downloaded from the ENSEMBL ftp site:
ftp://ftp.ensembl.org/pub/current_human/data/ on May 14, 2003. The file
contains all EN-

CA 02593916 2007-06-21
WO 2006/066592 PCT/DK2005/000815
46
SEMBL transcript predictions (i.e., 37347 different sequences). From each
sequence the re-
gion starting at 50 nucleotides upstream from the 3' end to 1050 nucleotides
upstream of the
3' end was extracted. The chosen set of probe sequences (see best mode below)
was further
evaluated against the human mRNA sequences in the Reference Sequence (RefSeq)
collection
from NCBI. RefSeq standards serve as the basis for medical, functional, and
diversity studies;
they provide a stable reference for gene identification and characterization,
mutation analy-
sis, expression studies, polymorphism discovery, and comparative analyses. The
RefSeq col-
lection aims to provide a comprehensive, integrated, non-redundant set of
sequences, inclu-
ding genomic DNA, transcript (RNA), and protein products, for major research
organisms.
Similar coverage was found for both the 37347 sequences from ENSEMBL and the
19567
sequences in the RefSeq collection, i.e., demonstrating that the type of
database is a non-
limiting feature of the invention.
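By way of illustration only, the extraction of the region lying 50 to 1050 nucleotides upstream of the 3' end may be sketched as follows. The function names, the FASTA reader and the handling of transcripts shorter than 1050 nucleotides are assumptions made for this sketch; they are not taken from the getcover program.

```python
def three_prime_window(seq, skip=50, length=1000):
    """Return the stretch from `skip` to `skip + length` nt upstream of the 3' end.

    Assumption: transcripts shorter than `skip + length` are truncated at their
    5' end; transcripts no longer than `skip` yield an empty string.
    """
    end = len(seq) - skip
    if end <= 0:
        return ""
    start = max(0, end - length)
    return seq[start:end]


def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)
```

A file such as "Homo_sapiens.cdna.fa" could then be processed by passing the open file object to read_fasta and applying three_prime_window to each sequence.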
EXAMPLE 2
Calculation of a multi-probe dataset (Alfa library)
Special software running on UNIX computers was designed to calculate the
optimal set of
probes in a library. The algorithm is illustrated in the flow chart shown in
Fig. 2.
The optimal coverage of a transcriptome is found in two steps. In the first
step a sparse
matrix of n_mers and genes is determined, so that the number of genes that
contain a given
n_mer can be found easily. This is done by running the getcover program with
the -p option
and a sequence file in FASTA format as input.
The second step is to determine the optimal cover with an algorithm, based on
the matrix
determined in the first step. For this purpose a program such as the getcover
program is run
with the matrix as input. However, programs performing similar functions and
for executing
similar steps may be readily designed by those of skill in the art.
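As a minimal sketch of the first step, a sparse n-mer/gene matrix may be held as a mapping from each n-mer to the set of genes containing it. The name build_nmer_index and the optional `allowed` filter (standing in for the list of n-mers with acceptable physical properties) are hypothetical and are not part of the getcover program.

```python
from collections import defaultdict


def build_nmer_index(genes, lengths=(8, 9), allowed=None):
    """Map each n-mer to the set of gene ids that contain it at least once.

    genes   -- dict of gene id -> nucleotide sequence
    lengths -- n-mer lengths to enumerate (8 and 9 in the example library)
    allowed -- optional set of acceptable n-mers; None keeps every n-mer
    """
    index = defaultdict(set)
    for gene_id, seq in genes.items():
        seq = seq.lower()
        for n in lengths:
            for i in range(len(seq) - n + 1):
                nmer = seq[i:i + n]
                if allowed is None or nmer in allowed:
                    index[nmer].add(gene_id)
    return dict(index)
```

The number of genes containing a given n-mer is then simply the size of the corresponding set.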
Obtaining good oligonucleotide cover of the transcriptome:
1. All 4^n n-mers are generated and the expected melting temperature is calculated.
n-mers with a melting temperature below 60 °C or with high self-hybridisation energy are
removed from the set. This gives a list of n-mers that have acceptable physical
properties.
2. A list of gene sequences representing the human transcriptome is extracted from the
ENSEMBL database.

3. Start of the main loop: Given the n-mer and gene list, a sparse matrix of n-mers
versus genes is generated by identifying all n-mers in a given gene and storing the
result in a matrix.
4. If this is the first iteration, a copy of the matrix is put aside and named the
"total n-mer/gene matrix".
5. The n-mer that covers most genes is identified and the number of genes it covers is
stored as "max_gene".
6. The coverage of the remaining n-mers in the matrix is determined, and n-mers with
coverage of at least 80% of max_gene are stored in the "n-mer list with good coverage".
7. The optimal n-mer is the one in this list for which the product of its current
coverage and its total coverage is maximal.
8. The optimal n-mer is deleted from the n-mer list (step 1).
9. The genes covered by this n-mer are deleted from the gene list (step 2).
10. The n-mer is added to the optimal n-mer list, and the process is continued from step
3 until no more n-mers can be found.
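The main loop of steps 3 to 10 may be sketched as below. This is an illustrative reading of the steps, not the getcover source: the function name, the `max_probes` cap and the alphabetical tie-breaking are assumptions, and the input is taken to be a mapping from each acceptable n-mer to the set of genes containing it.

```python
def select_probes(index, max_probes=None, good_fraction=0.8):
    """Greedy selection using the 80% rule and the product rule.

    index -- dict of n-mer -> set of gene ids containing that n-mer
    """
    # Step 4: the "total" coverage is fixed from the first iteration.
    total = {nmer: len(genes) for nmer, genes in index.items()}
    remaining = set().union(*index.values()) if index else set()
    candidates = set(index)
    chosen = []
    while remaining and candidates:
        if max_probes is not None and len(chosen) >= max_probes:
            break
        # Steps 3 and 5: current coverage over genes not yet covered.
        current = {n: len(index[n] & remaining) for n in candidates}
        max_gene = max(current.values())
        if max_gene == 0:
            break
        # Step 6: n-mers within 80% of the best current coverage.
        good = [n for n in candidates if current[n] >= good_fraction * max_gene]
        # Step 7: product of current and total coverage; ties broken alphabetically.
        best = max(sorted(good), key=lambda n: current[n] * total[n])
        # Steps 8 to 10: record the n-mer, drop it and the genes it covers.
        chosen.append(best)
        candidates.remove(best)
        remaining -= index[best]
    return chosen
```

The product rule favours n-mers that are common in the transcriptome as a whole, so that the final library retains a high degree of redundancy rather than only closing the remaining gaps.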
The program code ("getcover" version 1.0 by Niels Tolstrup 2003) for calculation of a
multi-probe dataset is listed in Fig. 17. It consists of three proprietary modules:
getcover.c, dyp.c, dyp.h
The program also incorporates four modules covered by the GNU Lesser General Public
Licence:
getopt.c, getopt.h, getopt1.c, getopt_init.c
/* Copyright (C) 1987,88,89,90,91,92,93,94,95,96,98,99,2000,2001
Free Software Foundation, Inc.
These files are part of the GNU C Library. The GNU C Library is free software; you can
redistribute it and/or modify it under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation */
The software was compiled with aap. The main.aap file used to make the program is
likewise listed in Fig. 17.
To run the compiled program the following commands are used:
getcover -1 8,9 -b bad.lst -p -f < h_sap_cdna_50_1050.fasta > h_sap_cdna_50_1050_I9.stat
getcover -1 8,9 -b bad.lst -s < h_sap_cdna_50_1050_I9.stat > h_sap_cdna_50_1050_I9.cover
The computer program was used with instructions for implementing the algorithm described
above to analyze the human transcriptome with the following parameter settings:
L89: probe length = 8 or 9 nucleotides
ii: inclusion fraction = 100%
d15: delta Tm required for target duplex against self duplex = 15 °C
t62: minimum Tm for target duplex = 62 °C
c: complementary target sequence used as well
m80: optimal probes selected among the most general probes addressing the remaining
targets with the product rule and the 80% rule
n: LNA nucleotides were preferably included in the central part of the recognition segment
b: bad.lst is a list of oligos that are known experimentally to be bad and must be
deselected;
and resulted in the identification of a database of multi-probe target sequences.
Target sequences in this database are exemplary optimal targets for a multi-probe
library. These optimal multi-probes are listed in TABLE 1 below and comprise 5'
fluorescein fluorophores and 3' Eclipse or other quenchers (see below).
TABLE 1 Dual label oligonucleotide probes
cagcctcc cagagcca agctgtga aggaggga
aggaggag ctggaagc cagagagc tgtggaga
cccaggag cagccaga tgaggaga ctggggaa
ctccagcc cttctggg acagtgga ctcctgca
ctcctcca ttctgcca acagccat tgaggtgg
ctgctgcc aggagaga tttctcca aaggcagc
ctccagca ttcctgca cagtggtg ctgtggca
ctgctggg tttgggga aaagggga agaagggc
cttcctgg caggcaga tgtgggaa tggatgga
acagcagc ctgtgcca actgggaa ttctggca
cagctcca ttccctgg tcacagga cagaaggc
ccccaccc aaccccat ttcctccc atcccaga
tggtggtg ctgcccag aggtggaa caggtgct
ttcctcca ctgaggca tgtggaca ctgtctcc
ctgctcca ctgctggt tggaggcc tgctgtga
tggagaga cagtgcca atggtgaa agctggat
aaggcaga atggggaa ctggaagg tggagagc

cagccagg agggagag caggcagc cttggtgg
cagcagga ctctgcca tcaggagc caccttgg
ctgtgctg ctgctgag acacacac cagccacc
agaggaga ccctccca catcttca ctgtgacc
ctgtggct aggaggca cacctgca agggggaa
cagtggct cactgcca ccagggcc tgggacca
ttctccca ctgtgtgg cagaggca acagggaa
cctggagc ttcccagt ctgggact ctgggcaa
cccagcag tccagtgt ctgcctgt ctggagga
ttctcctg ctcctccc tggaaggc tccactgc
cttcctgc cttcccca ctgtgcct ctgccacc
ccacctcc ctctgcca ctgtgctc acagcctca
ttcctctg cagcaggt ctgtgagc ctgtggtc
tggtgatg ctccatcc tcctcctc cttcaggc
tgtggctg tgctgtcc ctcagcca tctgggtc
cttctccc tcctctcc ctcttccc cttggagc
ctgcctcc ctctgcct ctgggcac ccaggctc
ctccttcc ctggctgc tgggcatc tctctggt
tcctgctc ccgccgcc ctctggct cttgggct
catcctcc ctcctcct tgctgggc ctgccatc
aggagctg cagcctgg ctgctctc cactggga
tcctgctg cagcagcc ctggagtc tgccctga
ctcctcca tgctggag cttcagcc ttggtggt
ccagccag cttcctcc cttccagc ttgggact
cagcccag ttcctggc tccaggtc ctgctgga
ctccacca tcctcagc cagcatcc caggagct
ctccagcc aggagcag cagaggct ctcagcct
tggctctg ccaggagg ctgccttc ttctggct
caggcagc cagcctcc ctgggaga ctgtctgc
ctgcctct agctggag cccagccc ctgtccca
cttctgcc ctgctgcc cagctccc tctgccca
ctgctccc tggctgtg ccagccgc ctggacac
tggtggaa cctggaga cctcagcc ttgccatc
agctggga ccagggcc tcctcttct cttcccct
ctgcttcc ccaccacc ctggctcc cttgggca
cagcaggc tctgctgc ccagggca ttctggtc
tctggagc cagccacc ctccacct ccgccgcc
catccagc cagaggag ctgcccca cttcttctc

atggctgc ctctcctc tgggcagc ttccctcc
ctcctgcc caggagcc ctggtctc ttcctcaga
tggtggcc tctggtcc ctggggcc tccaaggc
ctggggct ctgtctcc cagtggca ttggggtc
ttgccatc cttcccct cttgggca ttctggtc
cttcttctc ttccctcc ttcctcaga tccaaggc
ttggggtc
These hyper-abundant 9-mer and 8-mer sequences fulfil the selection criteria in Fig. 2,
i.e.,
- each probe target occurs in at least 6% of the sequences in the human transcriptome
(i.e., more than 2200 target sequences each, more than 800 sequences targeted within
1000 nt proximal to the 3' end of the transcript).
- they are not self complementary (i.e. unlikely to form probe duplexes). Self score is
at least 10 below Tm estimate for the duplex formed with the target.
- the formed duplex with their target sequence has a Tm at or above 60 °C.
They cover > 98 % of the mRNAs in the human transcriptome when combined.
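The self-complementarity criterion is evaluated in the source by a thermodynamic self score, which is not reproduced here. As a crude, purely illustrative stand-in, one may measure the longest stretch of a probe that could pair with a second copy of the same probe, i.e. the longest substring that also occurs in its reverse complement. The function names below are hypothetical.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence (returned in lower case)."""
    return seq.lower().translate(COMPLEMENT)[::-1]


def longest_self_pairing(probe):
    """Length of the longest substring of `probe` also found in its reverse complement.

    A value close to len(probe) suggests the probe could form probe-probe
    duplexes; this is a sequence-only heuristic, not a Tm-based self score.
    """
    probe = probe.lower()
    rc = reverse_complement(probe)
    best = 0
    for i in range(len(probe)):
        for j in range(i + 1, len(probe) + 1):
            if j - i > best and probe[i:j] in rc:
                best = j - i
    return best
```

A fully palindromic 8-mer such as acgtacgt scores 8 under this measure, whereas the first probe of TABLE 1, cagcctcc, scores only 2.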
Especially preferred versions of the multi-probes of Table 1 are presented in the
following Table 1a:
TABLE 1a LNA substituted oligonucleotides
cAgCCTCc cAGAGCCa aGCTGTGa aGGAGGGa
aGGAGGAg cTGGAAGc cAGAGAGc tGTGGAGa
ccCAGGAg cAGCCAGa tGAGGAGa ctGGGGAa
cTCCAgCc cTTCTGGg aCAGTGGa cTCCtGCa
cTCCTCCa tTCTGCCa aCAGCCAt tGAGGtGg
cTgCTGCc aGGAGAGa tTTCTCCa aAGGCAGc
cTCCAGCa tTCCTGCa cAGTGGTg ctGTGGCa
cTGCTGgg tTTGGGGa aAAGGGGa aGAAGGGc
cTTCCTGg cAGGCAGa tGTGGGAa tGGATGGa
aCAGCAGc ctGTGCCa aCTGGGAa tTCTGGCa
caGCTCCa tTCCCTGg tCACAGGa cAGAAGGc
cCCCACCc aACCCCAt tTCCTCCc aTCCCAGa
tGGTGGTg ctGCCCag aGGTGGAa cAGGtGCt

tTCCTCCa cTGAGGCa tGTGGACa cTGTCTCc
cTGCTCCa cTGCtGGt tGGAGgCc tGCTGTGa
tGGAGAGa cAGtGCCa atGGTGAA aGCTGGAt
aAGGCAGa aTGGGGAa cTGGAAGg tGGAGAGc
cAGCcAGg aGGGAGAg cAGGcAGc cTTGGTGg
cAGCAGGa cTCtGCCa tCAGGaGc cACCTTGg
cTGTGCTg cTGCTGAg aCACACAC cAgCCACc
aGAGGAGa cCCtCCCa cATCTTCA cTGTGACc
ctGTGGCt aGGAGGca cACCtGCa aGGGGGAa
caGTGGCt cACtGCCa cCAGgGcc tGgGACCa
tTCTCCCa cTGTGTGg cAGAGGCa aCAGGGAa
cTGgcTGC cAGCAGGC cAGCATCC tCTGCCCA
ccGCCgCC cTGCCTCT cAGAGGCT cTGGACAC
cTCCTCCT cTCCACCT cATCCTCC tCAgCAGC
cTGGAGGA cTCCTCCC cTCTGCCT tTCTTGGC
caGCcTGG cTTCCCCA cAGTGGCA cggCGGCA
cAGcAGCC cTTCAGCC cAGCACCC cTGGTGGT
cTTCCTCC cTCTGCCA cTCTCCTC cCTTCTCC
ccAGGAGG cTTCTGCC tCTGgTCC cCTCTTCC
cAGCcTCC cAGCAGGT cAGGAGCC tGTTGCCA
aGcTGGAG tcTGGAGC cTGTCTCC tGGaTGGC
cTGcTGcC cTGCCCCA cTGGGACT cCAGCATC
tGGcTGTG cATCCAGC cTGCCTGT tCTTCTTCT
cCTGGAGa aTGGcTGC tGGaAGGC tcgCCGCC
cCAGGGcC cTCCTGCC cTGTGCCT tGCTGTTC
cCACCACC cTGGGGcc cTGTGCTC tCAAGGGC
acAGCCTCA cTCCATCC cTGTGAGC tgCTGCTC
cAGAGGAG cTGGGCAA cTCTTCCC tcGCCGTC
tGcTGGAG cCAGCCGC cTGGGCAC tTGATGCC
aGGAGcAG tGGTGGcc tGGGCATC cCTTCAGC
aGGaGCTG cTGGGGCT tCCTCCTC aTTCCAGC
tCCTGCTG cTGCTCCC cTCTGGCT tTGATGGC
cCTGGAGC tGCTGTCC tgcTGGGC cCAGTTCC
cTCCTCCA tCCTCTCC cTCAGCCA tTGGCTTC
cCAGCCAG tGGTGGAA cTGCTCTC tTGCCTTC
cCCAGCAG aGCTGGGA cTGGAGTC aTGGCTTC
tTCTCCTG cTGGTCTC cTGTGGTC cACCCGCT
cAGCCCAG tTCCCAGT cTTCAGGC tCTTTGCC
cTTCCTGC tCCTCTTCT tCTGGGTC cTGGTTGC

cTCCACCA tCCAGTGT cTTGGAGC tGGACACC
cTTCCAGC tGGGcAGC cCAGGCTC tcGTCGCC
cCCAGCCC cCAGGGCA tCTCTGGT cCATCAGC
cTGCCTTC cTGGCTCC CTTGGGCT tGGTGGAT
cTCCAGCC tCTGcTGC cTGCCATC aTGGTGGT
cCACCTCC cAGCCACC cACTGGGA cCtGGTGC
tTCCTCTG tTCcTGGC tGCCCTGa tCCTCGTC
tGGCTCTG tCCTCAGC tTGGTGGT tTCTTGCC
tGGTGATG cTCCTTCC tTGGGACT tGGgCTTC
tGTGGcTG cTGGGAGA cTGCTGGA tGATGAGC
cTTCTCCC tCCTGCTC cAGGaGCT tCCTggCC
cTGCCTCC cAGGcAGC cTCAGCCT cCTCCTTC
cAGCTCCC tCCACTGC tTCTGGCT tGCTGGAG
cTGCTTCC cTGCCACC cTGTCTGC
ccTCAGCC tCcAGGTC cTGTCCCA
- wherein small letters designate deoxyribonucleotides and capital letters designate LNA
nucleotides.
> 95.0 % of the mRNA sequences are targeted within the 1000 nt near their 3' terminus
(position 50 to 1050 from the 3' end), and > 95% of the mRNAs contain the target sequence
for more than one probe in the library. More than 650,000 target sites for these 100
multi-probes were identified in the human transcriptome containing 37,347 nucleic acid
sequences. The average number of multi-probes addressing each transcript in the
transcriptome is 17.4, and the median transcript contains target sites for 14 different
probes.
The sequences noted above are also an excellent choice of probes for other
transcriptomes, though they were not selected to be optimized for those particular
organisms. We have thus evaluated the coverage of the above listed library for the mouse
and rat genomes, despite the fact that the above probes were designed to
detect/characterize/quantify the transcripts in the human transcriptome only; see, e.g.,
Table 2.
TABLE 2
                                    Transcriptome
Human probe library                 Human     Mouse     Rat
no. of mRNA sequences               37347     32911     28904
Coverage of full length mRNAs       96.7%     94.6%     93.5%
Coverage 1000 nt near the 3'-end    91.0%     -         -
At least covered by two probes      89.8%     80.2%     77.0%
nt - nucleotides.
EXAMPLE 3
Expected coverage of human transcriptome by frequently occurring 9-mer
oligonucleotides
Experimental pilot data (similar to Fig. 6) indicated that it is possible to reduce the
length of the recognition sequence of a dual-labelled probe for real-time PCR assays to
8 or 9 nucleotides depending on the sequence, if the probe is enhanced with LNA. The
unique duplex stabilizing properties of LNA are necessary to ensure an adequate stability
for such a short duplex (i.e. Tm > 60 °C). The functional real-time PCR probe will be
almost pure LNA with 6 to LNA nucleotides in the recognition sequence. However, the short
recognition sequence makes it possible to use the same LNA probe to detect and quantify
the abundance of many different genes. By proper selection of the best (i.e. most common)
8 or 9-mer recognition sequences according to the algorithm depicted in Fig. 2 it is
possible to get a coverage of the human transcriptome containing about 37347 mRNAs
(Fig. 3).
Fig. 3 shows the expected coverage as percentage of the total number of mRNA sequences in
the human transcriptome that are detectable within a 1000 nt long stretch near the 3' end
of the respective sequences (i.e. the sequence from 50 nt to 1050 nt from the 3' end) by
optimized probes of different lengths. The probes are required to be sufficiently stable
(Tm > 60 °C) and to have a low propensity for forming self duplexes, which eliminates
many 9-mers and even more 8-mer probe sequences.
If all probe sequences of a given length could be used as probes we would obviously get
the best coverage of the transcriptome by the shortest possible probe sequences. This is
indeed the case when only a limited number of probes (< 55) are included in the library
(Fig. 4). However, because many short probes with a low GC content have an inadequate
thermal stability, they were omitted from the library. The limited diversity of
acceptable 8-mer probes is less efficient at detecting low GC content genes, and a
library composed of 100 different 9-mer probes consequently has a better coverage of the
transcriptome than a similar library of 8-mers. However, the best choice is a mixed
library composed of sequences of different lengths such as the proposed best mode library
listed above. The coverage of this library is not shown in Fig. 4.

The designed probe library containing 100 of the most commonly occurring 9-mers and
8-mers, i.e., the "Human mRNA probe library", can be handled in a convenient box or
microtiter plate format.
The initial set of 100 probes for human mRNAs can be modified to generate
similar library
kits for transcriptomes from other organisms (mouse, rat, Drosophila, C.
elegans, yeast,
Arabidopsis, zebra fish, primates, domestic animals, etc.). Construction of
these new probe
libraries will require little effort, as most of the human mRNA probes may be
re-used in the
novel library kits (TABLE 2).
EXAMPLE 4
Number of probes in the library that target each gene
Not only does the limited number of probes in the proposed libraries target a large
fraction (> 98%) of the human transcriptome, but there is also a large degree of
redundancy in that most of the genes (almost 95%) may be detected by more than one probe.
More than 650,000 target sites have been identified in the human transcriptome (37347
genes) for the 100 probes in the best mode library shown above. This gives an average
number of target sites per probe of 6782 (i.e. 18 % of the transcriptome) ranging from
2527 to 12066 sequences per probe. The average number of probes capable of detecting a
particular gene is 17.4, and the median value is 14. Within the library of only 100
probes we thus have at least 14 probes for more than 50% of all human mRNA sequences.
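The summary figures quoted above can be derived from a probe-to-gene mapping along the following lines. This is a sketch under stated assumptions: library_stats is a hypothetical helper, and it counts each gene at most once per probe, whereas the text counts individual target sites.

```python
from statistics import mean, median


def library_stats(index, n_genes):
    """Summarise a probe library.

    index   -- dict of probe -> set of gene ids it can detect
    n_genes -- total number of genes in the transcriptome
    """
    per_probe = [len(genes) for genes in index.values()]
    per_gene = {}
    for genes in index.values():
        for gene in genes:
            per_gene[gene] = per_gene.get(gene, 0) + 1
    # Genes detected by no probe count as zero.
    counts = list(per_gene.values()) + [0] * (n_genes - len(per_gene))
    return {
        "mean_targets_per_probe": mean(per_probe),
        "coverage": len(per_gene) / n_genes,
        "mean_probes_per_gene": mean(counts),
        "median_probes_per_gene": median(counts),
    }


# Arithmetic check on the figures in the text: 6782 of 37347 sequences.
fraction_per_probe = 6782 / 37347
```

With the numbers given, 6782 target sequences per probe out of 37347 corresponds to about 18 % of the transcriptome, in agreement with the stated value.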
The number of genes that are targeted by a given number of probes in the
library is depicted
in Fig. 4.
EXAMPLE 5
Design of 9-mer probes to demonstrate feasibility
The SSA4 gene from yeast (Saccharomyces cerevisiae) was selected for the expression
assays because the gene transcription level can be induced by heat shock and mutants are
available where expression is knocked out. Three different 9-mer sequences were selected
amongst commonly occurring 9-mer sequences within the human transcriptome (Table 3). The
sequences were present near the 3' terminal end of 1.8 to 6.4 % of all mRNA sequences
within the human transcriptome. Further selection criteria were a moderate level of
self-complementarity and a Tm of 60 °C or above. All three sequences were present within
the terminal 1000 bases of the SSA4 ORF. Three 5' nuclease assay probes were constructed
by synthesizing the three sequences with a FITC fluorophore in the 5'-end and an Eclipse
quencher (Epoch Biosciences) in the 3'-end. The probes were named according to their
position within the ORF YER103W (SSA4) where position 1201 was set to be position 1.
Three sets of primer pairs were designed to produce three non-overlapping amplicons,
each of which contained one of the three probe sequences. Amplicons were named according
to the probe sequence they encompassed.
Table 3. Designed 5' nuclease assay probes and primers
Sequence   Name of probe      Forward primer sequence                    Reverse primer sequence                  Amplicon length
aaGGAGAAG  Dual-labelled-469  cgcgtttactttgaaaaattctg (SEQ ID NO: 1)     gcttccaatttcctggcatc (SEQ ID NO: 2)      81 bp
cAAGGAAAg  Dual-labelled-570  gcccaagatgctataaattggttag (SEQ ID NO: 3)   gggtttgcaacaccttctagttc (SEQ ID NO: 4)   95 bp
ctGGAGCaG  Dual-labelled-671  tacggagctgcaggtggt (SEQ ID NO: 5)          gttgggccgttgtctggt (SEQ ID NO: 6)        86 bp
bp - base pairs
Two Molecular Beacons were also designed to detect the SSA4 469 and the SSA4 570
sequence and named Beacon-469 and Beacon-570, respectively. The sequence of the SSA4 469
beacon was CAAGGAGAAGTTG (SEQ ID NO: 7, 10-mer recognition site) which should enable
this oligonucleotide to form the intramolecular beacon structure with a stem formed by
the LNA-LNA interactions between the 5'-CAA and the TTG-3'. The sequence of the SSA4 570
beacon was CAAGGAAAGttG (9-mer recognition site) where the intramolecular beacon
structure may form between the 5'-CAA and the ttG-3'. Both the sequences were synthesized
with a fluorescein fluorophore in the 5'-end and a Dabcyl quencher in the 3'-end.
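The stem design described above can be checked with a few lines of code. The sketch below assumes a three-nucleotide arm at each end and assumes that the recognition site is everything except the 3' arm, which is consistent with the stated 10-mer and 9-mer recognition sites; the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def stem_arms_pair(beacon, arm_length=3):
    """True if the 5' arm is the reverse complement of the 3' arm."""
    seq = beacon.upper()  # LNA (capital) and DNA (small) letters pair alike here
    five_prime_arm = seq[:arm_length]
    three_prime_arm = seq[-arm_length:]
    return five_prime_arm.translate(COMPLEMENT)[::-1] == three_prime_arm


def recognition_site(beacon, arm_length=3):
    """Assumed recognition site: the beacon without its 3' stem arm."""
    return beacon[:-arm_length]
```

For both beacons the 5'-CAA arm is the reverse complement of the 3' arm (TTG or ttG), so the ends can close into a stem in the absence of target.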
One SYBR Green labelled probe was also designed to detect the SSA4 570 sequence and
named SYBR-Probe-570. The sequence of this probe was CAAGGAAaG. This probe was
synthesized with an amino-C6 linker on the 5'-end on which the fluorophore SYBR Green 101
(Molecular Probes) was attached according to the manufacturer's instructions. Upon
hybridization to the target sequence, the linker attached fluorophore should intercalate
in the generated LNA-DNA duplex region causing increased fluorescence from the SYBR
Green 101.

TABLE 4: SEQUENCES
EQ Number  Name               Type                     Sequence                                                            Position in gene
13992      Dual-labelled-469  5' nuclease assay probe  5'-Fluor-aaGGAGAAG-Eclipse-3'                                       469-477
13994      Dual-labelled-570  5' nuclease assay probe  5'-Fluor-cAAGGAAAg-Eclipse-3'                                       570-578
13996      Dual-labelled-671  5' nuclease assay probe  5'-Fluor-ctGGAGCaG-Eclipse-3'                                       671-679
13997      Beacon-469         Molecular Beacon         5'-Fluor-CAAGGAGAAGTTG-Dabcyl-3' (5'-Fluor-SEQ ID NO: 8-Dabcyl-3')
14148      Beacon-570         Molecular Beacon         5'-Fluor-CAAGGAAAGttG-Dabcyl-3' (5'-Fluor-SEQ ID NO: 9-Dabcyl-3')
14165      SYBR-Probe-570     SYBR-Probe               5'-SYBR101-NH2C6-cAAGGAAAg-3'
14012      SSA4-469-F         Primer                   cgcgtttactttgaaaaattctg (SEQ ID NO: 10)
14013      SSA4-469-R         Primer                   gcttccaatttcctggcatc (SEQ ID NO: 11)
14014      SSA4-570-F         Primer                   gcccaagatgctataaattggttag (SEQ ID NO: 12)
14015      SSA4-570-R         Primer                   gggtttgcaacaccttctagttc (SEQ ID NO: 13)
14016      SSA4-671-F         Primer                   tacggagctgcaggtggt (SEQ ID NO: 14)
14017      SSA4-671-R         Primer                   gttgggccgttgtctggt (SEQ ID NO: 15)
14115      POL5-469-F         Primer                   gcgagagaaaacaagcaagg (SEQ ID NO: 16)
14116      POL5-469-R         Primer                   attcgtcttcactggcatca (SEQ ID NO: 17)
14117      APG9-570-F         Primer                   cagctaaaaatgatgacaataatgg (SEQ ID NO: 18)
14118      APG9-570-R         Primer                   attacatcatgattagggaatgc (SEQ ID NO: 19)
14119      HSP82-671-F        Primer                   gggtttgaacattgatgagga (SEQ ID NO: 20)
14120      HSP82-671-R        Primer                   ggtgtcagctggaacctctt (SEQ ID NO: 21)
EXAMPLE 6
Synthesis, deprotection and purification of dual labelled oligonucleotides

The dual labelled oligonucleotides EQ13992 to EQ14148 (Table 4) were prepared on an
automated DNA synthesizer (Expedite 8909 DNA synthesizer, PerSeptive Biosystems, 0.2 µmol
scale) using the phosphoramidite approach (Beaucage and Caruthers, Tetrahedron Lett. 22:
1859-1862, 1981) with 2-cyanoethyl protected LNA and DNA phosphoramidites (Sinha, et al.,
Tetrahedron Lett. 24: 5843-5846, 1983). CPG solid supports were derivatized with either
Eclipse quencher (EQ13992-EQ13996) or dabcyl (EQ13997-EQ14148) and 5'-fluorescein
phosphoramidite (GLEN Research, Sterling, Virginia, USA). The synthesis cycle was
modified for LNA phosphoramidites (250 s coupling time) compared to DNA phosphoramidites.
1H-tetrazole or 4,5-dicyanoimidazole (Proligo, Hamburg, Germany) was used as activator in
the coupling step.
The oligonucleotides were deprotected using 32% aqueous ammonia (1 h at room temperature,
then 2 hours at 60 °C) and purified by HPLC (Shimadzu-SpectraChrom series; Xterra™ RP18
column, 10 µm, 7.8 x 150 mm (Waters). Buffers: A: 0.05 M triethylammonium acetate pH 7.4.
B: 50% acetonitrile in water. Eluent: 0-25 min: 10-80% B; 25-30 min: 80% B). The
composition and purity of the oligonucleotides were verified by MALDI-MS (PerSeptive
Biosystem, Voyager DE-PRO) analysis, see Table 5. Fig. 5 is the MALDI-MS spectrum of
EQ13992 showing [M-H]- = 4121,3 Da. This is a typical MALDI-MS spectrum for the 9-mer
probes of the invention.
TABLE 5:
EQ#    Sequences                                                            MW (Calc.)  MW (Found)
13992  5'-Fitc-aaGGAGAAG-EQL-3'                                             4091,8 Da.  4091,6 Da.
13994  5'-Fitc-cAAGGAAAg-EQL-3'                                             4051,9 Da.  4049,3 Da.
13996  5'-Fitc-ctGGAGmCaG-EQL-3'                                            4020,8 Da.  4021,6 Da.
13997  5'-Fitc-mCAAGGAGAAGTTG-dabcyl-3' (5'-Fitc-SEQ ID NO: 22-dabcyl-3')   5426,3 Da.  5421,2 Da.
Capitals designate LNA monomers (A, G, mC, T), where mC is LNA methyl cytosine. Small
letters designate DNA monomers (a, g, c, t). Fitc = Fluorescein; EQL = Eclipse quencher;
Dabcyl = Dabcyl quencher. MW = Molecular weight.

EXAMPLE 7
Production of cDNA standards of SSA4 for detection with 9-mer probes
The functionality of the constructed 9-mer probes was analysed in PCR assays in which the
probes' ability to detect different SSA4 PCR amplicons was tested. Template for the PCR
reaction was cDNA obtained from reverse transcription of cRNA produced from in vitro
transcription of a downstream region of the SSA4 gene in the expression vector pTRIamp18
(Ambion). The downstream region of the SSA4 gene was cloned as follows:
PCR amplification
Amplification of the partial yeast gene was done by standard PCR using yeast genomic DNA
as template. Genomic DNA was prepared from a wild type standard laboratory strain of
Saccharomyces cerevisiae using the Nucleon MiY DNA extraction kit (Amersham Biosciences)
according to supplier's instructions. In the first step of PCR amplification, a forward
primer containing a restriction enzyme site and a reverse primer containing a universal
linker sequence were used. In this step 20 bp was added to the 3'-end of the amplicon,
next to the stop codon. In the second step of amplification, the reverse primer was
exchanged with a nested primer containing a poly-T20 tail and a restriction enzyme site.
The SSA4 amplicon contains 729 bp of the SSA4 ORF plus a 20 bp universal linker sequence
and a poly-A20 tail.
The PCR primers used were:
YER103W-For-SacI: acgtgagctcattgaaactgcaggtggtattatga (SEQ ID NO: 23)
YER103W-Rev-Uni: gatccccgggaattgccatgctaatcaacctcttcaaccgttgg (SEQ ID NO: 24)
Uni-polyT-BamHI: acgtggatccttttttttttttttttttttgatccccgggaattgccatg (SEQ ID NO: 25).
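The two-step primer design can be sanity-checked as sketched below: the nested primer should end in the universal linker carried by the first-step reverse primer, and the restriction sites suggested by the primer names should be present. The primer sequences are copied from the list above; the recognition sequences GAGCTC (SacI) and GGATCC (BamHI) are the standard ones; the assumption that the linker is the first 20 nucleotides of the reverse primer follows from the stated 20 bp linker length.

```python
FORWARD = "acgtgagctcattgaaactgcaggtggtattatga"                    # YER103W-For-SacI
REVERSE_UNI = "gatccccgggaattgccatgctaatcaacctcttcaaccgttgg"       # YER103W-Rev-Uni
NESTED = "acgtggatccttttttttttttttttttttgatccccgggaattgccatg"      # Uni-polyT-BamHI

SITES = {"SacI": "gagctc", "BamHI": "ggatcc"}
LINKER_LENGTH = 20  # assumed: linker = 5' end of the first-step reverse primer


def universal_linker(reverse_primer, length=LINKER_LENGTH):
    """5' portion of the reverse primer that does not anneal to the gene."""
    return reverse_primer[:length]


def poly_t_run(nested_primer, site, linker_seq):
    """Bases lying between the restriction site and the 3' linker portion."""
    start = nested_primer.index(site) + len(site)
    end = len(nested_primer) - len(linker_seq)
    return nested_primer[start:end]


linker_seq = universal_linker(REVERSE_UNI)
tail = poly_t_run(NESTED, SITES["BamHI"], linker_seq)
print("linker:", linker_seq, "| poly-T length:", len(tail))
```

Because the nested primer anneals only to the linker added in the first step, the poly-T stretch and the second restriction site are appended without the nested primer needing any homology to the SSA4 gene itself.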
Plasmid DNA constructs
The PCR amplicon was cut with the restriction enzymes EcoRI + BamHI. The DNA fragment
was ligated into the pTRIamp18 vector (Ambion) using the Quick Ligation Kit (New England
Biolabs) according to the supplier's instructions and transformed into E. coli DH-5 by
standard methods.

DNA sequencing
To verify the cloning of the PCR amplicon, plasmid DNA was sequenced using M13
forward
and M13 reverse primers and analysed on an ABI 377.
In vitro transcription
SSA4 cRNA was obtained by performing in vitro transcription with the Megascript T7 kit
(Ambion) according to the supplier's instructions.
Reverse transcription
Reverse transcription was performed with 1 µg of cRNA and 0.2 U of the reverse
transcriptase Superscript II RT (Invitrogen) according to the supplier's instructions
except that 20 U Superase-In (RNAse inhibitor - Ambion) was added. The produced cDNA was
purified on a QiaQuick PCR purification column (Qiagen) according to the supplier's
instructions using the supplied EB-buffer for elution. The DNA concentration of the
eluted cDNA was measured, and the cDNA was diluted to a concentration corresponding to
2 x 10' SSA4 cDNA copies per µL.
EXAMPLE 8
Protocol for dual label probe assays
Reagents for the dual label probe PCRs were mixed according to the following scheme
(Table 6):
Table 6
Reagents                     Final Concentration
H2O
GeneAmp 10x PCR buffer II    1x
Mg2+                         5.5 mM
dNTP                         0.2 mM
Dual Label Probe             0.1 or 0.3 µM*
Template                     1 µL
Forward primer               0.2 µM
Reverse primer               0.2 µM
AmpliTaq Gold                2.5 U
Total                        50 µL
*) Final concentration of 5' nuclease assay probe 0.1 µM and Beacon/SYBR-probe 0.3 µM.
In the present experiments 2 x 10' copies of the SSA4 cDNA were added as template. Assays
were performed in a DNA Engine Opticon (MJ Research) using the following PCR cycle
protocols:
Table 7
5' nuclease assays          Beacon & SYBR-probe Assays
95 °C for 7 minutes         95 °C for 7 minutes
&                           &
40 cycles of:               40 cycles of:
94 °C for 20 seconds        94 °C for 30 seconds
60 °C for 1 minute          52 °C for 1 minute*
Fluorescence detection      Fluorescence detection
                            72 °C for 30 seconds
* For the Beacon-570 with 9-mer recognition site the annealing temperature was reduced
to 44 °C
The composition of the PCR reactions shown in Table 6 together with PCR cycle protocols
listed in Table 7 will be referred to as standard 5' nuclease assay or standard Beacon
assay conditions.
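For bench use, the final concentrations of Table 6 can be converted into pipetting volumes with the relation C1 x V1 = C2 x V2. The sketch below reads the probe and primer concentrations of Table 6 as micromolar; all stock concentrations (25 mM Mg2+, 10 mM dNTP, 10 µM oligonucleotides, 5 U/µL polymerase) are hypothetical values chosen for illustration and are not given in the source. Only the 10x buffer stock follows from the table itself.

```python
def volume_for(final_conc, stock_conc, total_volume_ul=50.0):
    """Stock volume in µL needed to reach `final_conc` in `total_volume_ul`."""
    return final_conc * total_volume_ul / stock_conc


# name: (final concentration from Table 6, ASSUMED stock concentration), same unit
RECIPE = {
    "PCR buffer II (x)": (1.0, 10.0),
    "Mg2+ (mM)": (5.5, 25.0),
    "dNTP (mM)": (0.2, 10.0),
    "dual label probe (µM)": (0.1, 10.0),
    "forward primer (µM)": (0.2, 10.0),
    "reverse primer (µM)": (0.2, 10.0),
}


def master_mix(recipe, template_ul=1.0, enzyme_ul=0.5, total_volume_ul=50.0):
    """Per-reaction volumes in µL; water makes up the remainder.

    enzyme_ul=0.5 corresponds to 2.5 U at an assumed stock of 5 U/µL.
    """
    volumes = {
        name: volume_for(final, stock, total_volume_ul)
        for name, (final, stock) in recipe.items()
    }
    volumes["template"] = template_ul
    volumes["polymerase"] = enzyme_ul
    volumes["H2O"] = total_volume_ul - sum(volumes.values())
    return volumes
```

For the Beacon and SYBR-probe assays the probe entry would be changed to a final concentration of 0.3 µM, with the water volume adjusting accordingly.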
EXAMPLE 9
Specificity of 9-mer 5' nuclease assay probes
The specificity of the 5' nuclease assay probes was demonstrated in assays where each of
the probes was added to 3 different PCR reactions each generating a different SSA4 PCR
amplicon. As shown in Fig. 6, each probe only produces a fluorescent signal together with
the amplicon it was designed to detect (see also Figs. 10, 11 and 12). Importantly, the
different probes had very similar cycle threshold (Ct) values (from 23.2 to 23.7),
showing that the assays and probes have very similar efficiencies. Furthermore, it
indicates that the assays should detect similar expression levels when used in real
expression assays. This is an important finding, because variability in performance of
different probes is undesirable.
EXAMPLE 10
Specificity of 9 and 10-mer Molecular Beacon probes
The ability to detect newly generated PCR amplicons in real time was also demonstrated
for the molecular beacon design concept. The Molecular Beacon designed against the 469
amplicon with a 10-mer recognition sequence produced a clear signal when the SSA4 cDNA
template and primers for generating the 469 amplicon were present in the PCR, Fig. 7A.
The observed Ct value was 24.0 and very similar to the ones obtained with the 5' nuclease
assay probes, again indicating a very similar sensitivity of the different probes. No
signal was produced when the SSA4 template was not added. A similar result was produced
by the Molecular Beacon designed against the 570 amplicon with a 9-mer recognition
sequence, Fig. 7B.
EXAMPLE 11.
Specificity of 9-mer SYBR-probes.
The ability to detect newly generated PCR amplicons was also demonstrated for the
SYBR-probe design concept. The 9-mer SYBR-probe designed against the 570 amplicon of the
SSA4 cDNA produced a clear signal when the SSA4 cDNA template and primers for generating
the 570 amplicon were present in the PCR, Fig. 8. No signal was produced when the SSA4
template was not added.
EXAMPLE 12
Quantification of transcript copy number
The ability to detect different levels of gene transcripts is an essential requirement
for a probe to perform in a true expression assay. The fulfilment of the requirement was
shown by the three 5' nuclease assay probes in an assay where different levels of the
expression vector derived SSA4 cDNA were added to different PCR reactions together with
one of the 5' nuclease assay probes (Fig. 9). Composition and cycle conditions were
according to standard 5' nuclease assay conditions.

The cDNA copy number in the PCR before start of cycling is reflected in the cycle
threshold value Ct, i.e., the cycle number at which signal is first detected.
Fluorescence is here only counted as signal if it is five times above the standard
deviation of the fluorescence detected in PCR cycles 3 to 10. The results show an overall
good correlation between the logarithm of the initial cDNA copy number and the Ct value
(Fig. 9). The correlation appears as a straight line with slope between -3.456 and -3.499
depending on the probe and correlation coefficients between 0.9981 and 0.9999. The slope
of the curves reflects the efficiency of the PCRs, with a 100% efficiency corresponding
to a slope of -3.322 assuming a doubling of amplicon in each PCR cycle. The slopes of the
present PCRs indicate PCR efficiencies between 94% and 100%. The correlation coefficients
and the PCR efficiencies are as high as or higher than the values obtained with DNA 5'
nuclease assay probes 17 to 26 nucleotides long in detection assays of the same SSA4 cDNA
levels (results not shown). Therefore these results show that the three 9-mer 5' nuclease
assay probes meet the requirements for true expression probes, indicating that the probes
should perform in expression profiling assays.
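The relation between the slope of the standard curve and the PCR efficiency, and the threshold rule for calling Ct, may be sketched as follows. The efficiency formula E = 10^(-1/slope) - 1 is the conventional one and reproduces the stated correspondence between a slope of -3.322 and 100% efficiency. The Ct function is one possible reading of the threshold rule; subtracting the mean of the baseline cycles before comparison is an assumption of this sketch.

```python
from statistics import mean, stdev


def pcr_efficiency(slope):
    """Efficiency from the slope of Ct versus log10(initial copy number).

    A slope of -3.322 gives about 1.0, i.e. a doubling per cycle.
    """
    return 10 ** (-1.0 / slope) - 1.0


def cycle_threshold(fluorescence, baseline=(3, 10), factor=5.0):
    """First cycle (1-based) after the baseline window whose baseline-corrected
    fluorescence exceeds `factor` times the standard deviation of the baseline.

    Returns None if no cycle qualifies.
    """
    first, last = baseline
    window = fluorescence[first - 1:last]
    background = mean(window)
    noise = stdev(window)
    for cycle, value in enumerate(fluorescence, start=1):
        if cycle > last and value - background > factor * noise:
            return cycle
    return None
```

Applying pcr_efficiency to the two extreme slopes reported above allows the stated efficiency range to be compared directly with the fitted curves.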
EXAMPLE 13
Detection of SSA4 transcription levels in yeast
Expression levels of the SSA4 transcript were detected in different yeast strains grown
at different culture conditions (with or without heat shock). A standard laboratory
strain of Saccharomyces cerevisiae was used as wild type yeast in the experiments
described here. A SSA4 knockout mutant was obtained from EUROSCARF (accession number
Y06101). This strain is here referred to as the SSA4 mutant. Both yeast strains were
grown in YPD medium at 30 °C to an OD600 of 0.8 A. Yeast cultures that were to be heat
shocked were transferred to 40 °C for 30 minutes, after which the cells were harvested by
centrifugation and the pellet frozen at -80 °C. Non-heat shocked cells were in the
meantime left growing at 30 °C for 30 minutes and then harvested as above.
RNA was isolated from the harvested yeast using the FastRNA Kit (Bio 101) and
the FastPrep
machine according to the supplier's instructions.
Reverse transcription was performed with 5 pg of anchored oligo(dT) primer to
prime the
reaction on lpg of total RNA, and 0.2 U of the reverse transcriptase
Superscript II RT (Invi-
trogen) according to the suppliers instructions except that 20 U Superase-In
(RNAse inhibitor
- Ambion) was added. After two-hours of incubation, enzyme inactivation was
performed at
70 for 5 minutes. The cDNA reactions were diluted 5 times in 10 mM Tris
buffer pH 8.5 and
oligonucleotides and enzymes were removed by purification on a MicroSpinT" S-
400 HR

CA 02593916 2007-06-21
WO 2006/066592 PCT/DK2005/000815
63
column (Amersham Pharmacia Biotech). Prior to performing the expression assay
the cDNA
was diluted 20 times. The expression assay was performed with the Dual-
labelled-570 probe
using standard 5' nuclease assay conditions except 2 pL of template was added.
The
template was a 100 times dilution of the original reverse transcription
reactions. The four
different cDNA templates used were derived from wild type or mutant with or
without heat
shock. The assay produced the expected results (Fig. 10) showing increased
levels of the
SSA4 transcript in heat shocked wild type yeast (Ct =26.1) compared to the
wild type yeast
that was not submitted to elevated temperature (Ct =30.3). No transcripts were
detected in
the mutant yeast irrespective of culture conditions. The difference in Ct
values of 4.2
corresponds to a 17-fold induction in the expression level of the heat shocked
versus the non-heat shocked wild type yeast, and this value is close to the
values of around 19 reported in the literature (Causton, et al. 2001). These
values were obtained by using the standard curve obtained for the
Dual-labelled-570 probe in the quantification experiments with known amounts of
the SSA4 transcript (see Fig. 9). The experiments demonstrate that the 9-mer
probes are capable of detecting expression levels that are in good accordance
with published results.
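The conversion from a Ct difference to a fold induction via a standard curve
can be sketched as below. This is an illustration under stated assumptions: the
function name is hypothetical, and the slope is an assumed value from the range
quoted in Example 12, because the exact slope of the Dual-labelled-570 standard
curve is not given in the text.

```python
def fold_change(ct_reference, ct_sample, slope):
    """Relative template abundance of sample versus reference.

    On a standard curve Ct = slope * log10(copies) + intercept, a Ct
    difference maps to a copy-number ratio of 10 ** (delta_ct / -slope).
    """
    delta_ct = ct_reference - ct_sample
    return 10 ** (delta_ct / -slope)


# Ct values quoted in the text for wild type yeast (Fig. 10).
ct_no_heat_shock = 30.3
ct_heat_shock = 26.1

# Assumed slope: the text only gives a range of -3.456 to -3.499
# across the three probes.
assumed_slope = -3.456

induction = fold_change(ct_no_heat_shock, ct_heat_shock, assumed_slope)
print(f"Estimated induction: {induction:.1f}-fold")
```

With a slope of -3.322 (perfect doubling) the same function reduces to the
familiar 2^(delta Ct) rule.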
EXAMPLE 14
Multiple transcript detection with individual 9-mer probes
To demonstrate the ability of the three 5' nuclease assay probes to detect
expression levels of other genes as well, three different yeast genes were
selected in which one of the probe sequences was present. Primers were designed
to amplify a 60-100 base pair region around the probe sequence. The three
selected yeast genes and the corresponding primers are shown in Table 8.
TABLE 8
Design of alternative expression assays
Sequence/Name  Matching probe     Forward primer sequence                    Reverse primer sequence                  Amplicon length
YEL055C/POL5   Dual-labelled-469  gcgagagaaaacaagcaagg (SEQ ID NO: 26)       attcgtcttcactggcatca (SEQ ID NO: 27)     94 bp
YDL149W_APG9   Dual-labelled-570  cagctaaaaatgatgacaataatgg (SEQ ID NO: 28)  attacatcatgattagggaatgc (SEQ ID NO: 29)  97 bp
YPL240C_HSP82  Dual-labelled-671  gggtttgaacattgatgagga (SEQ ID NO: 30)      ggtgtcagctggaacctctt (SEQ ID NO: 31)     88 bp
Total cDNA derived from non-heat shocked wild type yeast was used as template
for the expression assay, which was performed using standard 5' nuclease assay
conditions, except that 2 µL of template was added. As shown in Fig. 11, all
three probes could detect expression of the genes according to the assay design
outlined in Table 8. Expression was not detected with any combination of probe
and primers other than the ones outlined in Table 8. Expression data are
available in the literature for SSA4, POL5, HSP82, and APG9 (Holstege, et al.
1998). For non-heat shocked yeast, these data describe similar expression levels
for SSA4 (0.8 transcript copies per cell), POL5 (0.8 transcript copies per cell)
and HSP82 (1.3 transcript copies per cell), whereas APG9 transcript levels are
somewhat lower (0.1 transcript copies per cell).
These data are in good correspondence with the results obtained here, since all
these genes showed similar Ct values except HSP82, which had a Ct value of 25.6.
This suggests that the HSP82 transcript was more abundant in the strain used in
these experiments than is indicated by the literature. Agarose gel
electrophoresis was performed with the PCRs shown in Fig. 11a for the
Dual-labelled-469 probe. The agarose gel (Fig. 12) shows that PCR product was
indeed generated in reactions where no signal was obtained, and therefore the
lack of fluorescent signal from these reactions was not caused by failure of the
PCR. Furthermore, the different lengths of the amplicons produced in expression
assays for different genes indicate that the signal produced in each expression
assay is indeed specific for the gene in question.
EXAMPLE 15
Selection of targets
Using the EnsMart software release 16.1 from http://www.ensembl.org/EnsMart,
the 50 bases from each end of all exons from the Homo sapiens NCBI 33 dbSNP115
Ensembl Genes were extracted to form a Human Exon50 target set. Using the
GetCover program (cf. Fig. 17), the occurrence of all probe target sequences was
calculated, and probe target sequences not passing selection criteria such as
excess self-complementarity, excessive GC content etc. were eliminated. Among
the remaining sequences, the most abundant probe target sequence was selected
(No. 1, covering 3200 targets), and subsequently all the probe targets having a
prevalence above 0.8 times the prevalence of the most abundant (3200 x 0.8),
i.e. above 2560 targets, were selected. From the remaining sample the number of
new hits for each probe was computed, and the product of the number of new hits
per probe target compared to the existing selection and the total prevalence of
the same probe target was computed and used to select the next most abundant
probe target sequence by selecting the highest product number. The probe target
length (n), sequence (nmer) and occurrence in the total target (cover), as well
as the number of new hits per probe target selection (Newhit), the product of
Newhit and cover (newhit x cover) and the number of accumulated hits in the
target population from all accumulated probes (sum) are exemplified in the table
below.

No n nmer Newhit Cover newhit x cover sum
1 8 ctcctcct 3200 3200 10240000 3200
2 8 ctggagga 2587 3056 7905872 5787
3 8 aggagctg 2132 3074 6553768 7919
4 8 cagcctgg 2062 2812 5798344 9981
5 8 cagcagcc 1774 2809 4983166 11755
6 8 tgctggag 1473 2864 4218672 13228
7 8 agctggag 1293 2863 3701859 14521
8 8 ctgctgcc 1277 2608 3330416 15798
9 8 aggagcag 1179 2636 3107844 16977
10 8 ccaggagg 1044 2567 2679948 18021
11 8 tcctgctg 945 2538 2398410 18966
12 8 cttcctcc 894 2477 2214438 19860
13 8 ccgccgcc 1017 2003 2037051 20877
14 8 cctggagc 781 2439 1904859 21658
15 8 cagcctcc 794 2325 1846050 22452
16 8 tggctgtg 805 2122 1708210 23257
17 8 cctggaga 692 2306 1595752 23949
18 8 ccagccag 661 2205 1457505 24610
19 8 ccagggcc 578 2318 1339804 25188
20 8 cccagcag 544 2373 1290912 25732
21 8 ccaccacc 641 1916 1228156 26373
22 8 ctcctcca 459 3010 1381590 26832
23 8 ttctcctg 534 1894 1011396 27366
24 8 cagcccag 471 2033 957543 27837
25 8 ctggctgc 419 2173 910487 28256
26 8 ctccacca 426 2097 893322 28682
27 8 cttcctgc 437 1972 861764 29119
28 8 cttccagc 415 1883 781445 29534
29 8 ccacctcc 366 2018 738588 29900
30 8 ttcctctg 435 1666 724710 30335
31 8 cccagccc 354 1948 689592 30689
32 8 tggtgatg 398 1675 666650 31087
33 8 tggctctg 358 1767 632586 31445
34 8 ctgccttc 396 1557 616572 31841
35 8 ctccagcc 294 2378 699132 32135
36 8 tgtggctg 304 1930 586720 32439
37 8 cagaggag 302 1845 557190 32741
38 8 cagctccc 275 1914 526350 33016
39 8 ctgcctcc 262 1977 517974 33278
40 8 tctgctgc 267 1912 510504 33545
41 8 ctgcttcc 280 1777 497560 33825
42 8 cttctccc 291 1663 483933 34116
43 8 cctcagcc 232 1863 432216 34348
44 8 ctccttcc 236 1762 415832 34584
45 8 cagcaggc 217 1868 405356 34801
46 8 ctgcctct 251 1575 395325 35052
47 8 ctccacct 215 1706 366790 35267
48 8 ctcctccc 205 1701 348705 35472
49 8 cttcccca 224 1537 344288 35696
50 8 cttcagcc 203 1650 334950 35899
51 8 ctctgcca 201 1628 327228 36100
52 8 ctgggaga 192 1606 308352 36292
53 8 cttctgcc 195 1533 298935 36487
54 8 cagcaggt 170 1711 290870 36657
55 8 tctggagc 206 1328 273568 36863
56 8 tcctgctc 159 1864 296376 37022
57 8 ctggggcc 159 1659 263781 37181
58 8 ctcctgcc 155 1733 268615 37336
59 8 ctgggcaa 185 1374 254190 37521
60 8 ctggggct 149 1819 271031 37670
61 8 tggtggcc 145 1731 250995 37815
62 8 ccagggca 147 1613 237111 37962
63 8 ctgctccc 146 1582 230972 38108
64 8 tgggcagc 135 1821 245835 38243
65 8 ctccatcc 161 1389 223629 38404
66 8 ctgcccca 143 1498 214214 38547
67 8 ttcctggc 155 1351 209405 38702
68 8 atggctgc 157 1285 201745 38859
69 8 tggtggaa 155 1263 195765 39014
70 8 tgctgtcc 135 1424 192240 39149
71 8 ccagccgc 159 1203 191277 39308
72 8 catccagc 122 1590 193980 39430
73 8 tcctctcc 118 1545 182310 39548
74 8 agctggga 121 1398 169158 39669
75 8 ctggtctc 128 1151 147328 39797
76 8 ttcccagt 142 1023 145266 39939
77 8 caggcagc 108 1819 196452 40047
78 8 tcctcagc 105 1654 173670 40152
79 8 ctggctcc 103 1607 165521 40255
80 9 tcctcttct 127 1006 127762 40382
81 8 tccagtgt 123 968 119064 40505
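The selection procedure described in this example can be sketched as a greedy
loop. This is a simplified illustration and not the GetCover program: the
function name and the toy target sets are hypothetical, the initial filtering
steps (self-complementarity, GC content, the 0.8 x prevalence rule) are
omitted, and ties are broken alphabetically as an arbitrary choice.

```python
def select_probes(cover_sets, max_probes):
    """Greedy probe selection.

    cover_sets maps each candidate probe sequence to the set of target
    IDs that contain it. At every step the candidate with the highest
    (new hits x total cover) product is chosen, where new hits are the
    targets not yet covered by earlier selections.

    Returns rows of (nmer, newhit, cover, sum) as in the table above.
    """
    covered = set()
    selection = []
    candidates = dict(cover_sets)
    while candidates and len(selection) < max_probes:
        def score(probe):
            new_hits = len(candidates[probe] - covered)
            return new_hits * len(candidates[probe])

        # sorted() makes tie-breaking deterministic (alphabetical).
        best = max(sorted(candidates), key=score)
        new_hits = len(candidates[best] - covered)
        if new_hits == 0:
            break  # no remaining candidate adds coverage
        targets = candidates.pop(best)
        covered |= targets
        selection.append((best, new_hits, len(targets), len(covered)))
    return selection


# Toy data: target IDs are arbitrary integers, not real exon identifiers.
toy_cover = {
    "ctcctcct": {1, 2, 3, 4, 5},
    "ctggagga": {4, 5, 6, 7},
    "aggagctg": {1, 2, 8},
}
for row in select_probes(toy_cover, max_probes=3):
    print(row)
```

Weighting new hits by total cover favours probes that are both novel and
prevalent, which is the criterion the text describes for picking each
subsequent probe target.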
EXAMPLE 16
qPCR for Human Genes
Use of the probe library is coupled to the use of real-time PCR design software
which can:
- recognise an input sequence via a unique identifier or by registering a
  submitted nucleic acid sequence
- identify all probes which can target the nucleic acid
- sort probes according to target sequence selection criteria such as proximity
  to the 3' end or proximity to intron-exon boundaries
- if possible, design PCR primers that flank probes targeting the nucleic acid
  sequence according to PCR design rules
- suggest available real-time PCR assays based on the above procedures.
The design of an efficient and reliable qPCR assay for a human gene is carried
out via the software found at www.probelibrary.com.
The ProbeFinder software designs optimal qPCR probes and primers fast and
reliably for a given human gene.
The design comprises the following steps:

1) Determination of the intron positions
Noise from chromosomal DNA is eliminated by selecting intron-spanning qPCRs.
Introns are determined by a BLAST search against the human genome. Regions found
in the genomic DNA but not in the transcript are considered to be introns.
2) Match of the Probe Library to the gene
Virtually all human transcripts are covered by at least one of the 90 probes;
the high coverage is made possible by LNA modifications of the recognition
sequence tags.
3) Design of primers and selection of optimal qPCR assay
Primers are designed with 'Primer3' (Whitehead Inst. for Biomedical Research,
S. Rozen and H.J. Skaletsky). Finally the probes are ranked according to
selected rules ensuring the best possible qPCR. The rules favour intron-spanning
amplicons to remove false signals from DNA contamination, amplicons that will
not amplify off-target genomic sequence or other transcripts as found by an in
silico PCR search, small amplicon size for reproducible and comparable assays,
and a GC content optimized for PCR.
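Step 2 above, together with the ranking by proximity to the 3' end, can be
sketched as follows. This is not the ProbeFinder implementation: the function
names, probe names and the toy transcript are hypothetical, and intron mapping
by BLAST and primer design with Primer3 are deliberately left out.

```python
COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Reverse complement of a lower-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_probe_sites(transcript, probes):
    """Return (probe name, start, strand, distance to 3' end) for every
    occurrence of a probe recognition sequence on either strand,
    sorted so that sites closest to the 3' end come first."""
    transcript = transcript.lower()
    hits = []
    for name, seq in probes.items():
        seq = seq.lower()
        # A palindromic probe equals its reverse complement; setdefault
        # keeps it from being reported twice.
        patterns = {seq: "+"}
        patterns.setdefault(reverse_complement(seq), "-")
        for pattern, strand in patterns.items():
            start = transcript.find(pattern)
            while start != -1:
                distance = len(transcript) - (start + len(pattern))
                hits.append((name, start, strand, distance))
                start = transcript.find(pattern, start + 1)
    return sorted(hits, key=lambda hit: hit[3])


# Toy input: two 8-mers taken from the table in Example 15 and an
# invented 28-base transcript that contains one site for each.
toy_probes = {"P1": "ctcctcct", "P2": "cagcctgg"}
toy_transcript = "acgtctcctcctacgtccaggctgacgt"
for hit in find_probe_sites(toy_transcript, toy_probes):
    print(hit)
```

A fuller tool would then pass each candidate site to a primer design step and
discard sites for which no intron-spanning amplicon can be found.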
EXAMPLE 17
Preparation of ENA monomers and oligomers
ENA-T monomers are prepared and used for the preparation of dual labelled
probes of the invention.
In the following sequences, X denotes a 2'-O,4'-C-ethylene-5-methyluridine
(ENA-T). The synthesis of this monomer is described in WO 00/47599. The reaction
conditions for incorporation of a
5'-O-dimethoxytrityl-2'-O,4'-C-ethylene-5-methyluridine-3'-O-(2-cyanoethyl-N,N-diisopropyl)phosphoramidite
correspond to the reaction conditions for the preparation of LNA oligomers as
described in EXAMPLE 6.
The following three dual labelled probes are prepared:
EQ#    Sequences                     MW (Calc.)  MW (Found)
16533  5'-Fitc-ctGmCXmCmCAg-EQL-3'   4002 Da     4001 Da
16534  5'-Fitc-cXGmCXmCmCA-EQL-3'    3715 Da     3716 Da
16535  5'-Fitc-tGGmCGAXXX-EQL-3'     4128 Da     4130 Da

X designates ENA-T monomer. Small letters designate DNA monomers (a, g, c, t).
Fitc = Fluorescein; EQL = Eclipse quencher; Dabcyl = Dabcyl quencher;
MW = Molecular weight.
Capital letters other than 'X' designate methyloxy LNA nucleotides.
EXAMPLE 18
Protocol for dual label probe assays
Reagents for the Real Time dual label probe PCRs were mixed according to the
following scheme (Table 9):
Table 9
Reagents                     Final Concentration
H2O
GeneAmp 10x PCR buffer II    1x
Mg2+                         5.5 mM
dATP, dGTP, dCTP             0.2 mM
dUTP                         0.6 mM
17302 Q4 Dual Label Probe    0.1 µM
15319 Oligo Template         4 pM
15321 Forward primer         0.2 µM
15322 Reverse primer         0.2 µM
Uracil DNA Glycosylase       0.5 U
AmpliTaq Gold                2.5 U
Total                        50 µL
The following primers, probes, and Oligo Templates in Table 10 were included in
the above mentioned PCR mix from Table 9:
Table 10
Name                       Sequence                                         Quencher
15321 Forward Primer       gactcacggtcgcacca (SEQ ID NO: 47)                -
15322 Reverse Primer       ccgcgttccacggtta (SEQ ID NO: 48)                 -
17302 Q4 Dual Label Probe  5' 6-Fitc-tTmCmCTmCTG#Q4z 3'                     Q4
15319 Oligo Template       attgactcacggtcgcaccaaattcctctgccttcctgctctgctgg  -
                           gagaaggaggtggtgatgtggctggaaggaggcagctccagg
                           agaaaataaccgtggaacgcggtcat (SEQ ID NO: 49)
LNA nucleotides are in capital letters;
6-Fitc: Fluorescein 6-isothiocyanate;
#Q4: 1,4-Bis(2-hydroxyethylamino)-6-methylanthraquinone, cf. Example 21, which
also shows preparation of a 2-cyanoethyl protected phosphoramidite version of
this molecule for use in the general method in Example 6, i.e. of
1-(4-(2-(2-cyanoethoxy(diisopropylamino)phosphinoxy)ethyl)phenylamino)-4-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-6(7)-methyl-anthraquinone;
z: 2'-deoxy-5-nitroindole-ribofuranosyl;
mC: 5-methylcytosine.
The 17302 Q4 dual label probe is prepared as generally described in Example 6.
Assays were performed in a DNA Engine Opticon (MJ Research) using the following
PCR cycle protocol (Table 11):
Table 11
37 °C for 10 minutes
95 °C for 7 minutes
40 cycles of: 94 °C for 20 seconds
              60 °C for 1 minute
              Fluorescence detection
Results from the Real Time PCR are illustrated in Fig. 18, which shows that the
dual labelled probe with the quencher Q4 is fully functional as a real time PCR
probe.

EXAMPLE 19
Dual labelled probe functionality in real time PCR
Protocol for dual label probe assays
Reagents for the Real Time dual label probe PCRs were mixed according to the
following scheme (Table 12):
Table 12
Reagents                     Final Concentration
H2O
GeneAmp 10x PCR buffer II    1x
Mg2+                         5.5 mM
dATP, dGTP, dCTP             0.2 mM
dUTP                         0.6 mM
15305 Q1 Dual Label Probe    0.1 µM
15319 Oligo Template         4 pM
15321 Forward primer         0.2 µM
15322 Reverse primer         0.2 µM
Uracil DNA Glycosylase       0.5 U
AmpliTaq Gold                2.5 U
Total                        50 µL
The following primers, probes, and Oligo Templates in Table 13 were included in
the above mentioned PCR mix from Table 12.
Table 13
Name                       Sequence                                         Quencher
15321 Forward Primer       gactcacggtcgcacca (SEQ ID NO: 47)                -
15322 Reverse Primer       ccgcgttccacggtta (SEQ ID NO: 48)                 -
15305 Q1 Dual Label Probe  5' 6-Fitc-tTmCmCTmCTG#Q1z 3'                     Q1
15319 Oligo Template       attgactcacggtcgcaccaaattcctctgccttcctgctctgctgg  -
                           gagaaggaggtggtgatgtggctggaaggaggcagctccagg
                           agaaaataaccgtggaacgcggtcat (SEQ ID NO: 49)
LNA nucleotides are in capital letters;
6-Fitc: Fluorescein 6-isothiocyanate;
#Q1: 1,4-Bis(3-hydroxypropylamino)-anthraquinone, cf. Example 20, which also
shows preparation of a 2-cyanoethyl protected phosphoramidite version of this
molecule
(1-(3-(2-cyanoethoxy(diisopropylamino)phosphinoxy)propylamino)-4-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone)
for use in the general method in Example 6;
z: 2'-deoxy-5-nitroindole-ribofuranosyl;
mC: 5-methylcytosine.
The 15305 Q1 dual label probe is prepared as described in Example 6.
Assays were performed in a DNA Engine Opticon (MJ Research) using the following
PCR cycle protocol (Table 14):
Table 14
37 °C for 10 minutes
95 °C for 7 minutes
40 cycles of: 94 °C for 20 seconds
              60 °C for 1 minute
              Fluorescence detection
Results from the Real Time PCR are illustrated in Figure 19, which shows that
the dual labelled probe with a 3'-nitroindole is fully functional as a real time
PCR probe.
EXAMPLE 20
Preparation of
1-(3-(2-cyanoethoxy(diisopropylamino)phosphinoxy)propylamino)-4-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone (3)

[Reaction scheme: synthesis of compounds 1, 2 and 3]
1,4-Bis(3-hydroxypropylamino)-anthraquinone (1)
Leucoquinizarin (9.9 g; 0.04 mol) is mixed with 3-amino-1-propanol (10 mL) and
ethanol (200 mL) and heated to reflux for 6 hours. The mixture is cooled to room
temperature and stirred overnight under atmospheric conditions. The mixture is
poured into water (500 mL) and the precipitate is filtered off, washed with
water (200 mL) and dried. The solid is boiled in ethyl acetate (300 mL), cooled
to room temperature, and the solid is collected by filtration.
Yield: 8.2 g (56%)
1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-anthraquinone (2)
1,4-Bis(3-hydroxypropylamino)-anthraquinone (7.08 g; 0.02 mol) is dissolved in
a mixture of dry N,N-dimethylformamide (150 mL) and dry pyridine (50 mL).
Dimethoxytritylchloride (3.4 g; 0.01 mol) is added and the mixture is stirred
for 2 hours. Additional dimethoxytritylchloride (3.4 g; 0.01 mol) is added and
the mixture is stirred for 3 hours. The mixture is concentrated under vacuum and
the residue is re-dissolved in dichloromethane (400 mL), washed with water
(2 x 200 mL) and dried (Na2SO4). The solution is filtered through a silica gel
pad (ø 10 cm; h 10 cm) and eluted with dichloromethane until the
mono-DMT-anthraquinone product begins to elute, whereafter the solvent is
changed to 2% methanol in dichloromethane. The pure fractions are combined and
concentrated, resulting in a blue foam.
Yield: 7.1 g (54%)
1H-NMR (CDCl3): 10.8 (2H, 2xt, J= 5.3 Hz, NH), 8.31 (2H, m, AqH), 7.67 (2H, dt,
J= 3.8 and 9.4, AqH), 7.4-7.1 (9H, m, ArH + AqH), 6.76 (4H, m, ArH), 3.86 (2H,
q, J= 5.5 Hz, CH2OH), 3.71 (6H, s, CH3), 3.54 (4H, m, NCH2), 3.26 (2H, t,
J= 5.7 Hz, CH2ODMT), 2.05 (4H, m, CCH2C), 1.74 (1H, t, J= 5 Hz, OH).
1-(3-(2-cyanoethoxy(diisopropylamino)phosphinoxy)propylamino)-4-(3-(4,4'-dimethoxy-trityloxy)propylamino)-anthraquinone (3)
1-(3-(4,4'-dimethoxy-trityloxy)propylamino)-4-(3-hydroxypropylamino)-anthraquinone
(0.66 g; 1.0 mmol) is dissolved in dry dichloromethane (100 mL) and 3 Å
molecular sieves are added. The mixture is stirred for 3 hours, and then
2-cyanoethyl-N,N,N',N'-tetraisopropylphosphordiamidite (335 mg; 1.1 mmol) and
4,5-dicyanoimidazole (105 mg; 0.9 mmol) are added. The mixture is stirred for 5
hours, then sat. NaHCO3 (50 mL) is added and the mixture is stirred for 10
minutes. The phases are separated and the organic phase is washed with sat.
NaHCO3 (50 mL) and brine (50 mL) and dried (Na2SO4). After concentration the
phosphoramidite is obtained as a blue foam and is used in oligonucleotide
synthesis without further purification.
Yield: 705 mg (82%)
31P-NMR (CDCl3): 150.0
1H-NMR (CDCl3): 10.8 (2H, 2xt, J= 5.3 Hz, NH), 8.32 (2H, m, AqH), 7.67 (2H, m,
AqH), 7.5-7.1 (9H, m, ArH + AqH), 6.77 (4H, m, ArH), 3.9-3.75 (4H, m), 3.71 (6H,
s, OCH3), 3.64-3.52 (6H, m), 3.26 (2H, t, J= 5.8 Hz, CH2ODMT), 2.63 (2H, t,
J= 6.4 Hz, CH2CN), 2.05 (4H, m, CCH2C), 1.18 (12H, dd, J= 3.1 Hz, CCH3).
EXAMPLE 21
Preparation of
1-(4-(2-(2-cyanoethoxy(diisopropylamino)phosphinoxy)ethyl)phenylamino)-4-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-6(7)-methyl-anthraquinone (13)

[Reaction scheme: synthesis of compounds 10, 11, 12 and 13]
6-methyl-Quinizarin (10)
4-methyl-phthalic anhydride (10 g, 62 mmol), p-chlorophenol (3.6 g, 28 mmol)
and boric acid (1.6 g) were dissolved in concentrated H2SO4 (34 mL) and the
mixture was stirred at 200 °C for 6 hours in a flask covered with a glass plate.
After completion of the reaction, the mixture was allowed to cool and then
poured into water (160 mL), and the precipitate was collected by filtration. The
solid was suspended in boiling water (320 mL) and boiled for 5 min, whereupon
the solid was collected by filtration. The product was obtained as a dark red
solid (5 g, 19.7 mmol) after drying. MALDI-MS: m/z 255.7 (M+H).
1,4-Bis(4-(2-hydroxyethyl)phenylamino)-6-methyl-anthraquinone (11)
6-methyl-quinizarin (10, 2.5 g) is suspended in acetic acid (30 mL), Zn-dust
(2 g) is added and the mixture is stirred at 90 °C for 1 h. The mixture is then
filtered through a pad of celite and cooled to room temperature, water (90 mL)
is added, and the reduced anthraquinone derivative is collected by filtration.
The solid is then mixed with boric acid (1.9 g; 0.03 mol) and ethanol (100 mL)
and refluxed for 1 hour. The mixture is cooled to room temperature,
4-aminophenethyl alcohol (4.1 g; 0.03 mol) is added, and the mixture is heated
to reflux for 3 days. The mixture is concentrated, redissolved in
dichloromethane (300 mL), washed with water (3 x 100 mL), dried (Na2SO4) and
concentrated. The residue is purified on a silica gel column with
MeOH/dichloromethane. Yield: 1.5 g (30%).

1-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-4-(4-(2-hydroxyethyl)phenylamino)-6(7)-methyl-anthraquinone (12)
1,4-Bis(4-(2-hydroxyethyl)phenylamino)-6-methyl-anthraquinone (0.95 g; 1.9
mmol) is dissolved in dry pyridine (30 mL). Dimethoxytritylchloride (0.34 g;
1 mmol) is added and the mixture is stirred for 2 hours. Additional
dimethoxytritylchloride (0.34 g; 1 mmol) is added and the mixture is stirred for
4 hours. The mixture is concentrated under vacuum and the residue is redissolved
in dichloromethane (200 mL), washed with water (2 x 100 mL) and dried (Na2SO4).
The product is purified by column chromatography (toluene/EtOAc). Yield:
0.81 g (54%).
1-(4-(2-(2-cyanoethoxy(diisopropylamino)phosphinoxy)ethyl)phenylamino)-4-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-6(7)-methyl-anthraquinone (13)
1-(4-(2-(4,4'-dimethoxy-trityloxy)ethyl)phenylamino)-4-(4-(2-hydroxyethyl)phenylamino)-6(7)-methyl-anthraquinone
(0.50 g; 0.63 mmol) is dissolved in dry dichloromethane (50 mL) and 3 Å
molecular sieves are added. The mixture is stirred for 3 hours, and then
2-cyanoethyl-N,N,N',N'-tetraisopropylphosphordiamidite (215 mg; 0.72 mmol) and
4,5-dicyanoimidazole (64 mg; 0.55 mmol) are added. The mixture is stirred for 4
hours, then sat. NaHCO3 (25 mL) is added and the mixture is stirred for 10
minutes. The phases are separated and the organic phase is washed with sat.
NaHCO3 (25 mL) and brine (25 mL) and dried (Na2SO4). The phosphoramidite is then
evaporated to dryness and used in oligonucleotide synthesis without further
purification. Yield: 0.59 g (94%).
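As a worked check of how the percentage yields in Examples 20 and 21 relate to
the quoted masses, the short sketch below recomputes two of them. It is an
illustration only: the function name is hypothetical, and the molar masses are
back-calculated from mass/amount pairs stated in the text rather than from
molecular formulae, so they carry the rounding of those figures.

```python
def percent_yield(product_mass_g, product_molar_mass, limiting_mol):
    """Isolated yield in percent relative to the limiting reagent."""
    return 100.0 * (product_mass_g / product_molar_mass) / limiting_mol


# Molar masses (g/mol) derived from quantities quoted in the text.
molar_mass_2 = 0.66 / 1.0e-3    # compound (2): 0.66 g corresponds to 1.0 mmol
molar_mass_12 = 0.50 / 0.63e-3  # compound (12): 0.50 g corresponds to 0.63 mmol

# Compound (2): 7.1 g isolated from 0.02 mol of compound (1).
yield_2 = percent_yield(7.1, molar_mass_2, 0.02)
# Compound (12): 0.81 g isolated from 1.9 mmol of compound (11).
yield_12 = percent_yield(0.81, molar_mass_12, 1.9e-3)

# Both values round to 54, matching the yields stated in the text.
print(round(yield_2), round(yield_12))
```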
EXAMPLE 22
SNP Detection Using a Library of Probes
Single nucleotide polymorphisms (SNPs) are the most common type of genetic
variant in the human and other genomes. Detection of SNPs using dual labelled
probes can be done by simultaneously using 2 differently labelled probes, each
of which hybridizes specifically to one SNP allele. The result of the real time
PCR will hence indicate the presence of one or the other or both alleles in the
sample. Either genomic DNA or RNA can be used as sample.
SNPs occur almost randomly, and it is expected that almost any sequence context
can exist in many permutations as a result of SNPs; currently over 2 million
SNPs are known. Hence, to have all relevant probes in stock for supplying or
generating SNP detection assays, millions of probes would be needed.

Relevant for the present invention, due to the short probes enabled by the use
of LNA, this number can be reduced by using LNA-containing 8 or 9-mer probes.
Theoretically, 4^9 or 262144 possible 9-mers and 4^8 or 65536 8-mers can exist
and would be necessary to cover any possible SNP sequence. A further advantage
of LNA-containing oligos is an increased specificity, allowing the SNP position
to be placed at any position in the probe. Hence, each probe can cover 9
different SNP positions, which would reduce the need for 8-mer sequences from
65536 to 65536/9 = 7281. Detection can also occur on both strands, hence only
7281/2 = 3640 probes are needed.
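The counting argument above can be reproduced in a few lines. The sketch simply
follows the arithmetic as stated in the text (division by 9 positions, then by 2
strands, rounding down at each step); the function name is hypothetical.

```python
def n_mers(length, alphabet_size=4):
    """Number of distinct sequences of a given length over a, c, g, t."""
    return alphabet_size ** length


nine_mers = n_mers(9)    # 4**9
eight_mers = n_mers(8)   # 4**8

# Reduction steps as stated in the text: each probe covers 9 SNP
# positions, and detection on either strand halves the requirement.
per_position = eight_mers // 9
both_strands = per_position // 2

print(nine_mers, eight_mers, per_position, both_strands)
# prints: 262144 65536 7281 3640
```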
EXAMPLE 23
SNP discrimination example - demonstrating single mismatch discrimination by
dual labelled probe in real time PCR.
Protocol for dual label probe assays
Reagents for the Real Time dual label probe PCRs were mixed according to the
following scheme (Table 15):
Table 15
Reagents                          Final Concentration
H2O
GeneAmp 10x PCR buffer II         1x
Mg2+                              5.5 mM
dATP, dGTP, dCTP                  0.2 mM
dUTP                              0.6 mM
13996 Dual Label Probe            0.1 µM
Oligo Template (14229 or 14226)   40 fM
14117 Forward primer              0.2 µM
14118 Reverse primer              0.2 µM
Uracil DNA Glycosylase            0.5 U
AmpliTaq Gold                     2.5 U
Total                             50 µL
The following primers, probes, and Oligo Templates were included in the above
mentioned PCR mix (Table 15).

Table 16
Name                                  Sequence
14117 Forward Primer                  cagctaaaaatgatgacaataatgg
14118 Reverse Primer                  attacatcatgattagggaatgc
13996 Dual Label Probe                5' 6-Fitc-ctGGAGmCaG-EQL 3'
14229 Single Mismatch Oligo Template  cagctaaaaatgatgacaataatgggctaacggagaa
                                      gcgggagcagatcggcattccctaatcatgatgtaat
14226 Perfect Match Oligo Template    cagctaaaaatgatgacaataatgggctaaaggagaa
                                      gctggagcagatcggcattccctaatcatgatgtaat
LNAs are in capital letters; 6-Fitc: Fluorescein 6-isothiocyanate; EQL: Eclipse™
Dark Quencher (Epoch Biosciences); mC: 5-methylcytosine.
Assays were performed in a DNA Engine Opticon (MJ Research) using the following
PCR cycle protocol (Table 17):
Table 17
37 °C for 10 minutes
95 °C for 7 minutes
40 cycles of: 94 °C for 20 seconds
              60 °C for 1 minute
              Fluorescence detection
Results from the Real Time PCR are illustrated in Figure 20, which shows that
the dual labelled probe is able to discriminate between a perfectly matching
target and a target having a single mismatch relative to the probe.
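The difference between the two oligo templates at the probe site can be checked
in silico with a simple sliding-window comparison. This sketch only counts
base mismatches; it is not a model of hybridization, which also depends on
thermodynamics and on the LNA modifications. The function name is hypothetical,
and the probe is written without its modification marks.

```python
def min_mismatches(probe, template):
    """Smallest Hamming distance between the probe and any window of the
    template of the same length, returned as (distance, start index).
    The first window reaching the minimum is reported."""
    probe = probe.lower()
    template = template.lower()
    best = None
    for start in range(len(template) - len(probe) + 1):
        window = template[start:start + len(probe)]
        distance = sum(a != b for a, b in zip(probe, window))
        if best is None or distance < best[0]:
            best = (distance, start)
    return best


# Probe 13996 with modification marks removed (ctGGAGmCaG -> ctggagcag).
PROBE = "ctggagcag"
PERFECT_MATCH_14226 = (
    "cagctaaaaatgatgacaataatgggctaaaggagaa"
    "gctggagcagatcggcattccctaatcatgatgtaat"
)
SINGLE_MISMATCH_14229 = (
    "cagctaaaaatgatgacaataatgggctaacggagaa"
    "gcgggagcagatcggcattccctaatcatgatgtaat"
)

print(min_mismatches(PROBE, PERFECT_MATCH_14226))
print(min_mismatches(PROBE, SINGLE_MISMATCH_14229))
```

The probe sequence is compared in the same orientation as the templates are
written in Table 16; in the PCR the probe anneals to the complementary strand
produced during amplification.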

REFERENCES AND NOTES
1. Helen C. Causton, Bing Ren, Sang Seok Koh, Christopher T. Harbison, Elenita
Kanin, Ezra G. Jennings, Tong Ihn Lee, Heather L. True, Eric S. Lander, and
Richard A. Young (2001). Remodelling of Yeast Genome Expression in Response to
Environmental Changes. Mol. Biol. Cell 12:323-337.
2. Frank C. P. Holstege, Ezra G. Jennings, John J. Wyrick, Tong Ihn Lee,
Christoph J. Hengartner, Michael R. Green, Todd R. Golub, Eric S. Lander, and
Richard A. Young (1998). Dissecting the Regulatory Circuitry of a Eukaryotic
Genome. Cell 95:717-728.
3. Anton Simeonov and Theo T. Nikiforov (2002). Single nucleotide polymorphism
genotyping using short, fluorescently labelled locked nucleic acid (LNA) probes
and fluorescence polarization detection. Nucleic Acids Research 30(17):e91.
Variations, modifications, and other implementations of what is described
herein will occur to those skilled in the art without departing from the spirit
and scope of the invention as described and claimed herein, and such variations,
modifications, and implementations are encompassed within the scope of the
invention.
The references, patents, patent applications, and international applications
disclosed above are incorporated by reference herein in their entireties.

JUMBO APPLICATIONS/PATENTS
THIS SECTION OF THE APPLICATION/PATENT CONTAINS MORE THAN ONE
VOLUME
THIS IS VOLUME 1 OF 2
CONTAINING PAGES 1 TO 80
NOTE: For additional volumes, please contact the Canadian Patent Office


Administrative Status


Event History

Description	Date
Inactive: IPC expired	2019-01-01
Inactive: IPC expired	2018-01-01
Application Not Reinstated by Deadline	2011-12-21
Time Limit for Reversal Expired	2011-12-21
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice	2010-12-21
Inactive: Abandon-RFE+Late fee unpaid-Correspondence sent	2010-12-21
Inactive: Sequence listing - Amendment	2010-02-22
Inactive: Office letter - Examination Support	2010-01-26
Letter Sent	2010-01-26
Inactive: Adhoc Request Documented	2009-12-23
Inactive: Delete abandonment	2009-12-23
Reinstatement Request Received	2009-12-08
Reinstatement Requirements Deemed Compliant for All Abandonment Reasons	2009-12-08
Inactive: Sequence listing - Amendment	2009-12-08
Inactive: Abandoned - No reply to Office letter	2009-08-26
Inactive: Abandoned - No reply to Office letter	2009-08-26
Inactive: Office letter - Examination Support	2009-05-26
Inactive: Office letter	2009-05-26
Inactive: Sequence listing - Amendment	2009-05-06
Inactive: Office letter	2009-03-11
Inactive: Correspondence - PCT	2008-10-29
Letter Sent	2008-07-02
Inactive: Single transfer	2008-05-05
Inactive: Sequence listing - Amendment	2008-01-23
Inactive: IPC assigned	2007-11-16
Inactive: IPC assigned	2007-11-16
Inactive: IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: First IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: IPC assigned	2007-10-24
Inactive: IPC removed	2007-10-24
Inactive: Cover page published	2007-09-17
Inactive: Notice - National entry - No RFE	2007-09-12
Inactive: Declaration of entitlement - Formalities	2007-08-21
Inactive: First IPC assigned	2007-08-15
Application Received - PCT	2007-08-14
National Entry Requirements Determined Compliant	2007-06-21
Application Published (Open to Public Inspection)	2006-06-29

Abandonment History

Abandonment Date	Reason	Reinstatement Date
2010-12-21
2009-12-08

Maintenance Fee

The last payment was received on 2009-11-18


Fee History

Fee Type	Anniversary Year	Due Date	Paid Date
Basic national fee - standard			2007-06-21
MF (application, 2nd anniv.) - standard	02	2007-12-21	2007-06-21
Registration of a document			2008-05-05
MF (application, 3rd anniv.) - standard	03	2008-12-22	2008-12-03
MF (application, 4th anniv.) - standard	04	2009-12-21	2009-11-18
Reinstatement			2009-12-08

Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
EXIQON A/S

Past Owners on Record
NIELS BIRGER RAMSING
NIELS TOLSTRUP
PETER MOURITZEN
SOREN MORGENTHALER ECHWALD

Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.

Documents

Document Description	Date (yyyy-mm-dd)	Number of pages	Size of Image (KB)
Description	2007-06-20	82	4,068
Drawings	2007-06-20	102	2,723
Claims	2007-06-20	12	520
Abstract	2007-06-20	1	60
Description	2007-06-20	14	243
Description	2010-02-21	80	4,038
Notice of National Entry	2007-09-11	1	207
Courtesy - Certificate of registration (related document(s))	2008-07-01	1	104
Courtesy - Abandonment Letter (Office letter)	2010-01-11	1	164
Notice of Reinstatement	2010-01-25	1	171
Reminder - Request for Examination	2010-08-23	1	121
Courtesy - Abandonment Letter (Maintenance Fee)	2011-02-14	1	173
Courtesy - Abandonment Letter (Request for Examination)	2011-03-28	1	164
PCT	2007-06-20	3	89
Correspondence	2007-08-20	2	78
Correspondence	2008-10-28	3	106
Correspondence	2009-03-10	2	61
Correspondence	2009-05-25	1	35
Correspondence	2010-01-25	1	35

Biological Sequence Listings

Please note that files with extensions .pep and .seq that were created by CIPO as working files might be incomplete and are not to be considered official communication.

BSL Files

File Name	Received On	Size (bytes)
OURREF.TXT	2008-01-23	11,706
OURREF.SEQ	2009-12-08	11,351
OURREF.TXT	2009-12-08	11,652
OURREF.TXT	2009-04-16	11,724
OURREF.SEQ	2009-04-16	11,461
