Note: Descriptions are shown in the official language in which they were submitted.
CA 02371128 2001-10-01
WO 00/60113 PCT/CA00/00351
- 1 -
A NOVEL TYPE OF TRANSPOSON-BASED GENETIC MARKER
BACKGROUND OF THE INVENTION
(a) Field of the Invention
The invention relates to a method for
genotyping a nucleic acid sequence using amplification
with a primer pair comprising a first primer having a
DNA sequence homologous to a miniature inverted-repeat
transposable element (MITE) and a second primer,
identical to or different from the first primer. The
invention generally relates to the use of MITE primers
in fingerprinting or linkage studies.
(b) Description of Prior Art
After the discovery of the transposable element
system Ac/Ds by McClintock (McClintock B. 1946. Maize
genetics. Carnegie Inst. Wash. Yearbook 45: 176-186;
and McClintock B. 1947. Cytogenetic studies of maize
and Neurospora. Carnegie Inst. Wash. Yearbook 46: 146-
152.), genetic identification of new transposable
element systems (families) became a popular area of
genetic studies in plants (Peterson P. A. 1986. Mobile
elements in maize. Plant Breeding Reviews 4: 3-122.)
as well as in other organisms. This was followed by
the molecular characterization of transposable elements
and exploitation of these elements as gene
identification and isolation tools, especially after
the cloning of the white locus with the copia
retrotransposon in Drosophila (Bingham P. M., R. Lewis
and G. M. Rubin 1981. Cloning of DNA sequences from
the white locus of D. melanogaster by a novel and
general method. Cell 25: 693-704.), and molecular
characterization of the maize transposable element Ac
(Pohlman R. F., N. V. Fedoroff and J. Messing 1984.
The nucleotide sequence of the maize controlling
element Activator. Cell 37: 635-643.) and En/Spm
(Pereira A., Zs. Schwarz-Sommer, A. Gierl, I. Bertram,
P. A. Peterson and H. Saedler 1985. Genetic and
molecular analysis of the Enhancer (En) transposable
element system of Zea mays. EMBO J. 4: 17-25.). Since
then, transposable element-related studies have become
a major focus in biological sciences.
As in other areas of biological research, the
identification of transposable elements has been
accelerated by modern computer technologies. Bureau et
al. (Bureau T. E., P. C. Ronald, and S. R. Wessler
1996. A computer-based systematic survey reveals the
predominance of small inverted-repeat elements in wild-
type rice genes. Proc. Natl. Acad. Sci. 93: 8524-
8529.) adopted this approach to identify numerous
members of a new family of transposable elements.
These elements resemble the traditional DNA-mediated
transposable elements (as opposed to retroelements
which transpose via RNA intermediates, Boeke J. D., D.
J. Garfinkel, C. A. Styles and G. R. Fink 1985. Ty
elements transpose through an RNA intermediate. Cell
40: 491-500.) in that they possess terminal inverted
repeats (TIRs). However, unlike the classical
genetically characterized transposable elements, these
elements are small in size, and show no apparent coding
capacity. These elements have been referred to as
miniature inverted-repeat transposable elements, or
MITEs (Bureau et al. supra).
Since the introduction of the restriction
fragment length polymorphism (RFLP) technique (Botstein
D., R. White, M. Skolnick and R. W. Davis 1980.
Construction of a genetic linkage map in man using
restriction fragment length polymorphism. Am. J. Hum.
Genet. 32: 314-331.) as a molecular mapping tool,
genome mapping and fingerprinting technologies have
been advanced substantially as evidenced by the
development of other new techniques such as random
amplified polymorphic DNA (RAPD, Welsh J. and M.
McClelland 1990. Fingerprinting genomes using PCR with
arbitrary primers. Nucleic Acids Res. 18: 7213-7218;
Williams J. G. K., A. R. Kubelik, K. J. Livak, J. A.
Rafalski and S. V. Tingey 1990. DNA polymorphisms
amplified by arbitrary primers are useful as genetic
markers. Nucleic Acids Res. 18: 6531-6535.), and
amplified fragment length polymorphism (AFLP, Vos P.,
R. Hogers, M. Bleeker, M. Reijans, T. van de Lee, M.
Hornes, A. Frijters, J. Pot, J. Peleman, M. Kuiper and
M. Zabeau 1995. AFLP: a new technique for DNA
fingerprinting. Nucleic Acids Res. 23: 4407-4414.).
The recently adopted techniques using retroelements
(Sinnett D., J.-M. Deragon, L. R. Simard and D. Labuda
1990. Alumorphs-human DNA polymorphisms detected by
polymerase chain reaction using Alu-specific primers.
Genomics 7: 331-334; and Nelson D. L., S. A. Ledbetter,
L. Corbo, M. F. Victoria, R. Ramirez-Solis, T. D.
Webster, D. H. Ledbetter and C. T. Caskey 1989. Alu
polymerase chain reaction: A method for rapid isolation
of human-specific DNA sequences from complex DNA
sources. Proc. Natl. Acad. Sci. 86: 6686-6690.) and
simple sequence repeats (SSRs) (Litt M. and J. A. Luty
1989. A hypervariable microsatellite revealed by in
vitro amplification of a dinucleotide repeat within the
cardiac muscle actin gene. Am. J. Hum. Genet. 44: 397-
401; Tautz D. 1989. Hypervariability of simple
sequences as a general source for polymorphic DNA
markers. Nucleic Acids Res. 17: 6463-6471; and Weber
J. L. and P. E. May 1989. Abundant class of human DNA
polymorphisms which can be typed using the polymerase
chain reaction. Am. J. Hum. Genet. 44: 388-396.) have
set the stage for a new generation of genome mapping
and fingerprinting tools.
Izvak et al. disclose repetitive elements,
called Angel, that have the potential to form a
hairpin-like structure. Their small size and potential
secondary structure formation are the basis on which the
authors define Angel as a MITE. However, Angel does not
fit either the specific or the general definition of a MITE as
defined herein since there is no indication that it is
flanked by a target site duplication (TSD) of any kind.
As TSDs are hallmark features of not only MITEs but of
virtually all known transposable elements, it is clear
that Angel should be termed neither a MITE nor even a
transposon.
Sinnett et al. disclose a technique involving a
very different transposable element called Alu.
Transposons in general can be divided into two large
classes. Class I elements encompass endogenous
retroviruses, LTR-retrotransposons, LINEs (Long
Interspersed Nuclear Elements), SINEs (Short
Interspersed Nuclear Elements) and processed
pseudogenes. Alu is a SINE. Class II elements include
MITEs and other transposons with terminal inverted
repeats. Alu does not have terminal inverted repeats.
Class I elements move through an RNA intermediate and the
action of reverse transcriptase whereas Class II elements move
directly in a DNA form via an element-encoded
transposase. MITEs are found in many eukaryotes and
prokaryotes. Alu is found only in primates. Clearly,
both Alu and MITEs are repetitive, are distributed throughout
their host genomes and can be associated with genes.
Alu-PCR involves the designing of primers (more
specifically two) based on their terminal sequences.
The 5' and 3' terminal sequences are different and, as
such, the primers are different in sequence. MITE to
MITE based PCR involves a primer designed to their
terminal inverted repeats. Therefore, only one primer
is necessary.
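Why a single primer can suffice may be illustrated with a minimal in-silico sketch. It is not part of the disclosed method: it assumes exact primer matching, a linear template given as its top strand, and an arbitrary 3000 bp limit on product size, and the function names and toy sequences are hypothetical. Because the primer is designed to an inverted repeat, the same oligonucleotide can anneal in opposite orientations on the two strands, so one primer delimits both ends of a product.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_all(text, pattern):
    """Start positions of every (possibly overlapping) occurrence of pattern."""
    hits, pos = [], text.find(pattern)
    while pos != -1:
        hits.append(pos)
        pos = text.find(pattern, pos + 1)
    return hits

def single_primer_products(template, primer, max_len=3000):
    """Sizes of products one primer could give on template (top strand, 5'->3').

    A product needs the primer sequence on the top strand (extension to the
    right) and, further right, its reverse complement on the top strand
    (the same primer annealing to extend to the left).
    """
    forward = find_all(template, primer)
    reverse = find_all(template, revcomp(primer))
    sizes = []
    for f in forward:
        for r in reverse:
            size = r + len(primer) - f
            if r > f and size <= max_len:
                sizes.append(size)
    return sorted(sizes)

# Hypothetical template: two inverted copies of an 8-base site, 20 bases apart.
primer = "GGAGGGAG"
template = "TT" + primer + "A" * 20 + revcomp(primer) + "TT"
print(single_primer_products(template, primer))  # [36]
```

With two primers of different sequence, as in Alu-PCR, the same search would have to be run with one primer for the forward sites and the other for the reverse sites.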
Restriction Fragment Length Polymorphism (RFLP)
marker methodology consists of digesting genomic DNA
with a restriction enzyme, separating the DNA fragments
by electrophoresis, and transferring the separated DNA
fragments to a solid support consisting of a nylon
membrane in order to obtain an image of the gel on a
support that can be used for hybridization experiments
with known DNA sequences. The known DNA sequence can
be a cloned genomic or cDNA sequence or a specific PCR
product. This DNA sequence (the probing sequence) is
labeled with radioactive, fluorescent or colored
nucleotides. Results of hybridization are seen by
exposing the solid support to an x-ray sensitive
film, or can be seen directly on the support when
colored nucleotides are used to label the probe. One
or a few DNA bands are often observed depending on the
origin of the probing sequence. Restriction fragment
length polymorphisms are visualized as differences
between the banding patterns of different genotypes and
reflect differences in the distribution of the cutting
sites of a given restriction enzyme.
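How a change in a restriction site alters the banding pattern can be shown with a small illustrative sketch. The function name, the toy sequences and the choice of the EcoRI recognition sequence GAATTC are assumptions made for the example only, and the cut is simplified to fall at the first base of the site.

```python
def digest_lengths(seq, site="GAATTC"):
    """Fragment lengths after cutting seq at every occurrence of site.

    Simplification (assumption): the cut is placed at the first base of the
    recognition site rather than at the enzyme's true offset within it.
    """
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length fragments (a site at the very start of the sequence).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]

# Two hypothetical genotypes differing by one base inside the second site.
genotype_a = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAATTC" + "TTTT"
genotype_b = "AAAA" + "GAATTC" + "CCCCCCCC" + "GAGTTC" + "TTTT"

print(digest_lengths(genotype_a))  # [4, 14, 10]
print(digest_lengths(genotype_b))  # [4, 24]
```

The single base change removes one cutting site, so two fragments of the first genotype appear as one longer fragment in the second.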
Random Amplified Polymorphic DNA (RAPD) marker
methodology uses short DNA sequences of 10
nucleotides as primers to drive a PCR
reaction using total genomic DNA as template. The
nucleotide composition of the oligonucleotide primers
is chosen arbitrarily without any reference to existing
DNA sequence. PCR products are visualized directly
after agarose gel electrophoresis. Generally, one to
15 amplified DNA fragments can be seen as amplification
products of a eukaryotic genome. Polymorphisms are
detected directly on an agarose gel after staining as
differences in amplification patterns between genotypes
and reflect single nucleotide changes in the primer
binding sites and insertions/deletions.
Amplified Fragment Length Polymorphism (AFLP)
marker technology consists of digesting genomic DNA
with a restriction enzyme, ligating the resulting
genomic DNA fragments with an adapter sequence (a short
double strand DNA sequence which has at one end the
same sequence site as the one generated by the
restriction enzyme used to digest the genomic DNA) and
performing a PCR reaction using, as primer, an
oligonucleotide homologous to the adapter sequence.
Amplification results are visualized directly on an
acrylamide gel after staining as several (up to 60) DNA
fragments. Polymorphisms are seen as differences in
the presence/absence of specific amplified DNA
fragments in different genotypes and reflect, like
RFLP, differences in the distribution of a given
restriction enzyme cutting site but with a subset of
the genomic DNA.
Simple Sequence Repeat (SSR) marker methodology
consists of using a simple DNA sequence repeat (such as
(TA)n, (GAGA)n, (GA)n, etc., "n" generally varying
between 5 and 18) as probes to identify genomic clones
from a gene library of an organism carrying these
simple sequence motifs. The clones that are isolated
are then sequenced and a pair of DNA primers
surrounding the SSR are designed for PCR amplification
of the SSR and the surrounding DNA sequences.
Polymorphisms are seen as one or very few amplified DNA
fragments varying by one or a few nucleotide
differences in different genotypes and reflect
differences in the number of repeats ("n") of the
simple sequence.
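The relation between repeat number and amplified fragment size can be made concrete with a short sketch. The flanking sequences, allele names and function name below are hypothetical and serve only to illustrate the principle.

```python
import re

def longest_repeat_run(seq, motif):
    """Largest number n of tandem copies of motif found in seq (0 if none)."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)

# Hypothetical flanking DNA to which the primer pair would be designed.
flank_left, flank_right = "CCGTACGG", "TTGCAGCA"
allele_1 = flank_left + "GA" * 8 + flank_right
allele_2 = flank_left + "GA" * 11 + flank_right

n1 = longest_repeat_run(allele_1, "GA")
n2 = longest_repeat_run(allele_2, "GA")
size_difference = len(allele_2) - len(allele_1)
print(n1, n2, size_difference)  # 8 11 6
```

Three additional copies of a dinucleotide motif lengthen the amplified fragment by six nucleotides, which is the kind of small size difference an SSR marker has to resolve.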
DNA markers based on retroelements and other
large repeated elements are obtained by designing primers
surrounding the element; polymorphisms are found
when the element is present or absent in different
genotypes.
Other types of DNA markers exist but they are a
combination of the types of DNA markers described
above. For example, CAPS (cleaved amplified polymorphic
sequences) are markers in which a PCR product is digested by
restriction enzymes after PCR amplification. Primer pairs can
also be designed from a repeated element and an AFLP primer,
or from different repeated elements.
It would be highly desirable to be provided
with a new pervasive nucleic acid sequence for use in
linkage studies and in fingerprinting studies.
It would also be highly desirable to be
provided with a method for detecting polymorphisms in
eukaryotes using this new pervasive nucleic acid
sequence.
SUMMARY OF THE INVENTION
One aim of the present invention is to provide
a new pervasive nucleic acid sequence for use in
linkage studies and in fingerprinting studies.
Another aim of the present invention is to
provide a method for detecting polymorphisms in
eukaryotes using this new pervasive nucleic acid
sequence.
In accordance with the present invention there
is provided a method for detecting polymorphisms of a
nucleic acid sequence of interest. The method
comprises the steps of:
a) amplifying said nucleic acid
sequence of interest with a first primer
homologous to a miniature inverted-repeat
transposable element (MITE), a fragment
thereof or a derivative thereof, and a
second primer wherein said first primer
anneals with said MITE when present in said
nucleic acid sequence of interest and said
second primer is identical or not to the
first primer, and homologous or not to a
MITE sequence;
b) separating fragments of the nucleic
acid sequence of interest amplified in step
a); and
c) analyzing the fragments obtained in
step b) in relation to reference fragments
obtained from amplification of a nucleic
acid sequence with the at least one primer
for determining a difference in nucleic
acid sequence between the fragments
obtained in step b) and the reference
fragments, whereby a difference is
indicative of a polymorphism in the nucleic
acid of interest.
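Step c) can be sketched as a comparison of two lists of fragment sizes. This is only one possible reading of the comparison: the function name, the band sizes and the size tolerance (a stand-in for gel resolution) are assumptions, not values taken from the specification.

```python
def polymorphic_bands(sample_bands, reference_bands, tolerance=0):
    """Return (only_in_sample, only_in_reference) fragment sizes in bp.

    Two bands are treated as the same fragment when their sizes differ by
    at most tolerance bp.
    """
    def unmatched(bands, others):
        return sorted(b for b in bands
                      if not any(abs(b - o) <= tolerance for o in others))
    return (unmatched(sample_bands, reference_bands),
            unmatched(reference_bands, sample_bands))

# Hypothetical band sizes read from the separated amplification products.
sample = [210, 355, 480, 902]
reference = [212, 355, 640, 902]
print(polymorphic_bands(sample, reference, tolerance=3))  # ([480], [640])
```

Any band left unmatched on either side marks a candidate polymorphism between the sequence of interest and the reference.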
Also in accordance with the present invention,
there is provided a method for genotyping a eukaryote.
The method comprises the steps of:
a) amplifying a nucleic acid sequence
of said eukaryote with a first primer
homologous to a MITE, a fragment thereof or
a derivative thereof, and a second primer,
wherein said first primer anneals with said
MITE when present in said nucleic acid
sequence of said eukaryote, and said second
primer is identical or not to the first
primer, and homologous or not to a MITE
sequence;
b) separating fragments obtained from
amplifying the nucleic acid sequence of
step a); and
c) comparing the fragments obtained
from step b) with fragments of a reference
nucleic acid sequence from said eukaryote,
whereby identity of the fragments of step
b) with the fragments of the reference
nucleic acid sequence is indicative of said
eukaryote having said nucleic acid
sequence.
Further in accordance with the present
invention, there is provided a method for
fingerprinting a eukaryotic organism. The method
comprises the steps of:
a) amplifying a nucleic acid sequence
of a eukaryotic organism with a first
primer homologous to a MITE, a fragment
thereof or a derivative thereof, and a
second primer, wherein said first primer is
specific for a MITE sequence and said
second primer is identical or not to the
first primer, and homologous or not to the
MITE sequence; and
b) separating fragments obtained from
amplifying the nucleic acid sequence of
step a), whereby the fragments so-separated
are representative of the eukaryotic
organism.
Preferably the step of amplifying is effected
by PCR procedures. The first primer is derived from a
consensus sequence from a MITE element. More
preferably, the first primer has a nucleic acid
sequence derived from a consensus sequence from
Tourist, Stowaway, Barfly, or Mariner.
Most preferably, the first primer has a nucleic
acid sequence selected from the group consisting of
SEQ ID NO:1, SEQ ID NO:2, SEQ ID NO:3, SEQ ID NO:4,
SEQ ID NO:5, SEQ ID NO:6, SEQ ID NO:7, SEQ ID NO:8, SEQ
ID NO:9, SEQ ID NO:10, SEQ ID NO:11, SEQ ID NO:12, SEQ
ID NO:13, SEQ ID NO:14, SEQ ID NO:15, SEQ ID NO:16, SEQ
ID NO:17, SEQ ID NO:18, SEQ ID NO:19, SEQ ID NO:20, SEQ
ID NO:21, SEQ ID NO:22, SEQ ID NO:23, SEQ ID NO:24, SEQ
ID NO:25, SEQ ID NO:26, SEQ ID NO:27, SEQ ID NO:28, SEQ
ID NO:29, SEQ ID NO:30, SEQ ID NO:31, SEQ ID NO:32, SEQ
ID NO:33, SEQ ID NO:34, and SEQ ID NO:35.
The second primer optionally is a primer
selected from the group consisting of a MITE specific
primer, a primer based on a SSR sequence, a primer
based on a retroelement sequence, a primer based on a
sequence of a cloned nucleic acid detecting a RFLP, a
primer based on a random genomic sequence, a primer
based on a vector sequence and a primer based on a gene
sequence.
Also in accordance with the present invention,
there is provided the use of a polymorphism detected with the
method of the present invention for tracing progeny of
a eukaryotic organism, for determining hybridity of a
eukaryotic organism, for identifying a variation of a
linked phenotypic trait in a eukaryotic organism, for
identifying individual progenies from a cross wherein
said progenies have a desired genetic contribution from
a parental donor and/or recipient parent, or as genetic
markers for constructing genetic maps.
The method of the present invention may be used
for isolating genomic DNA sequence surrounding a gene-
coding or non-coding DNA sequence. The genomic DNA
sequence surrounding the gene-coding DNA sequence is
preferably a promoter or a regulatory sequence.
Further in accordance with the present
invention, there is provided a nucleic acid fragment or
a derivative thereof, obtained by amplifying a nucleic
acid sequence of a eukaryotic organism with at least
one primer homologous to a MITE for use as a probe on
nucleic acid sequences.
The nucleic acid fragment or the derivative
thereof may be used for marker-assisted selection
(MAS), map-based cloning, hybrid certification,
fingerprinting, genotyping, and as an allele-specific marker.
The eukaryote or eukaryotic organism is
preferably a plant, an animal or a fungus.
Still in accordance with the present invention,
there is provided a method for genome mapping, which
comprises the steps of:
a) fractionating the genome of a eukaryotic
organism;
b) cloning the genome so-fractionated into a
vector;
c) testing the vectors so-cloned by amplifying
DNA in the vectors so-cloned using a first
primer homologous to a miniature inverted-
repeat transposable element (MITE), and a
second primer, the first primer being
capable of hybridizing to a miniature
inverted-repeat transposable element (MITE)
in the DNA, and the second primer is
identical or not to the first primer, and
homologous or not to a MITE sequence;
d) separating extension products of the
amplification step by size;
e) measuring the pattern of extension
products; and
f) reconstructing the genome from the
overlapping patterns.
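Steps e) and f) rely on clones from overlapping genomic regions sharing part of their amplification pattern. A minimal sketch of that idea follows; the clone names, band sizes and the threshold of two shared bands are assumptions for illustration, and a real assembly would need to order the clones and handle chance co-migration of unrelated bands.

```python
from itertools import combinations

def overlapping_pairs(clone_bands, min_shared=2):
    """Pairs of clones whose patterns share at least min_shared band sizes."""
    pairs = []
    for (a, bands_a), (b, bands_b) in combinations(sorted(clone_bands.items()), 2):
        shared = set(bands_a) & set(bands_b)
        if len(shared) >= min_shared:
            pairs.append((a, b, sorted(shared)))
    return pairs

# Hypothetical extension-product sizes (bp) measured for three clones.
clones = {
    "clone_A": {150, 310, 420},
    "clone_B": {310, 420, 555, 700},
    "clone_C": {555, 700, 880},
}
for pair in overlapping_pairs(clones):
    print(pair)
```

In this toy data set clone_A overlaps clone_B and clone_B overlaps clone_C, which suggests the order A, B, C along the genome.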
Also in accordance with the present invention,
there is provided a method for mapping a polymorphic
genetic marker, which comprises:
a) providing a mixture of restriction enzyme-
digested nucleic acid sequences from a
biological sample from a eukaryotic
organism;
b) amplifying the mixture of restriction
enzyme-digested nucleic acid sequences
using a first primer homologous to a
miniature inverted-repeat transposable
element (MITE), a fragment thereof or a
derivative thereof, and a second primer,
wherein the first primer is specific for a
MITE, and the second primer is identical or
not to the first primer, and homologous or
not to a MITE sequence;
c) identifying a set of differentially
amplified nucleic acid sequences in the
mixture; and
d) mapping at least one of the differentially
amplified nucleic acid sequences to a
unique genetic polymorphism, thereby
providing a marker for the polymorphism.
The MITE-based marker system of the present
invention is different from any of the approaches of
the prior art, is much simpler, and is more highly
informative and repeatable.
For the purpose of the present invention the
following terms are defined below.
The term "MITE" is intended to mean a miniature
inverted-repeat transposable element. In fact, MITEs
are a superfamily of transposable elements. These
elements are less than 3 kilobases long, contain
perfect or degenerate terminal inverted-repeats, are
flanked by a target site duplication of less than, or
equal to 10 base pairs, and are moderately to highly
abundant in the genome.
MITEs are preferably less than one kilobase
long, have perfect or degenerate terminal inverted
repeats, are flanked by a TA or TAA target site
duplication and are moderately to highly abundant in
the genome.
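The general and preferred definitions above can be restated as a simple check. The sketch is illustrative only: the function and parameter names are hypothetical, and because the text gives no numerical threshold for "moderately to highly abundant", abundance is passed in as a yes/no judgement rather than computed.

```python
def fits_mite_definition(length_bp, has_tir, tsd, abundant, preferred=False):
    """Check an element against the criteria stated in the definitions.

    tsd is the target site duplication as a string (e.g. "TA"); an empty
    string means no target site duplication was observed.
    """
    if not (has_tir and abundant and tsd):
        return False
    if preferred:
        # Preferred definition: under one kilobase, TA or TAA duplication.
        return length_bp < 1000 and tsd in ("TA", "TAA")
    # General definition: under 3 kilobases, duplication of at most 10 bp.
    return length_bp < 3000 and len(tsd) <= 10

print(fits_mite_definition(250, True, "TA", True, preferred=True))      # True
print(fits_mite_definition(2500, True, "ACGTAC", True))                 # True
print(fits_mite_definition(2500, True, "ACGTAC", True, preferred=True)) # False
```

An element lacking terminal inverted repeats, or lacking any target site duplication, fails the check under either definition.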
The term "MITE-based primer" is intended to
include a primer comprising a MITE or a fragment
thereof, and a primer derived from a MITE and that
recognizes a MITE, hybridizing or annealing thereto.
The term "MITE-based genetic marker" (MGM) is
intended to mean a marker hybridizing to a MITE
element, or a marker produced by the PCR amplification
of a nucleic acid sequence using at least one MITE
primer and optionally another MITE primer or a primer
based on a SSR sequence, a retroelement sequence, a
RFLP sequence or a gene sequence.
The term "inter-MITE polymorphism" (IMP)
relates to a subset of MGM and is intended to mean a
marker obtained by PCR amplification of a nucleic acid
sequence using one MITE primer or two different MITE
primers.
The term "eukaryote" or "eukaryotic organism"
is intended to refer to plants, animals and fungi.
The term "homologous" is intended to mean, in the
context of a homologous nucleic acid sequence, a
nucleic acid sequence which would hybridize under
stringent conditions to a complement of the nucleic
acid sequence it is homologous with.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 illustrates PCR products of primer
combination TEM-4/-10 or TEM-10 alone on an agarose
gel;
Fig. 2 illustrates a section of the PCR results
of IRD700™ fluorescence dye-labeled TEM-1 primer,
visualized on a 6% acrylamide gel with the LI-COR
automated system 4200 in accordance with a preferred
embodiment of the invention, in which P1 is parent H.
vulgare, Lina (P1), P2 is parent H. spontaneum, Canada
Park (P2), and the segregating individuals are from a
cross between the Lina and Canada Park DH (Doubled
Haploid) population;
Fig. 3 illustrates PCR results of TEM-3/-10
with longer extension time of 1 minute and 15 seconds
on agarose gel;
Figs. 4A and 4B illustrate PCR results on
agarose gel of TEM-1/-4 showing different products with
a 60-second extension time and a 75-second extension
time;
Fig. 5 illustrates a linkage map of the H.
vulgare cv. Lina x H. spontaneum Canada Park population
showing the distribution of IMP loci detected with the
TEM-1 and TEM-10 primers;
Fig. 6 illustrates a fingerprinting of the 27
Hordeum lines on agarose gel;
Fig. 7 illustrates a section of the
fingerprinting result of 27 Hordeum lines with IRD700™
fluorescence dye-labeled TEM-1 primer;
Fig. 8 illustrates a dendrogram resulting from
the UPGMA clustering of the genetic similarity matrix
of 27 cultivars, based on the TEM-1 and TEM-10 banding
patterns;
Figs. 9A, 9B, 9C and 9D illustrate the
universal use of the MITE-based markers in different
eukaryotes, showing PCR-amplified profiles of eleven
different sources of DNA using Master primer TEM-12
(Fig. 9A); Master primer TEM-1 (Fig. 9B); Master primer
TEM-10 (Fig. 9C) and Master primer TEM-11 (Fig. 9D); and
Figs. 10A, 10B, 10C, 10D and 10E illustrate an
example of the results obtained with the Master primer
(TEM-1) and its corresponding anchored primer.
DETAILED DESCRIPTION OF THE INVENTION
The present invention provides a new genetic
marker referred to herein as MITE-based genetic marker
(MGM). In this new method using PCR, polymorphisms are
revealed with primers designed from the abundant
transposable elements, MITEs. The usefulness of these
transposable element-based primers was determined by
studying segregation patterns in a barley doubled-
haploid mapping population and in genotyping 26
cultivars of Hordeum vulgare and one line of Hordeum
spontaneum. In accordance with the present invention,
there is provided a novel type of DNA marker, referred
to herein as MITE-based genetic markers, as well as the
chromosomal localization of these markers, their
universality and versatility and the fingerprinting
results. Finally, we discuss the feasibility and the
generalization of the MGM and IMP approaches of the
present invention.
Advantages and Improvements over Existing Technology
As mentioned above, MITE members are frequently
found to be associated with genes, and thus, are not
confined to repetitive regions. This pervasiveness of
MITEs is of enormous value. It indicates that
virtually any region of the genome is prone to IMP
amplifications in most eukaryotic organisms.
A total of 50-100 scorable bands were amplified
with every single primer, indicating that MITEs are
present in the genome in high copy numbers. With
several primers and 50-100 loci per primer, the whole
genome can be covered readily in the screening.
The MITE primer can be combined with other
types of primers such as primers specific for SSRs,
retroelements, sequenced RFLPs, random genomic
sequences, vector sequences, and genes. This will
certainly increase the capacity of the MGM method of
the present invention.
The method of the present invention, combined
with high resolution LI-COR automated fluorescence
genotyping system, provides enormous power in DNA
mapping and fingerprinting techniques. Its power and
resolution over RAPD and RFLP are obvious as many more
loci could be detected in a single reaction. MGM and
IMP analyses are easy, fast and cost effective. In
contrast to RAPD analysis, significantly fewer primers
are needed. Unlike the AFLP and RFLP techniques, MGM
and IMP do not require digestion with restriction
enzymes or adapter ligation.
Technical Description
i) Plant materials
The mapping population used consists of 88
doubled-haploid individuals from a cross between
Hordeum vulgare cultivar Lina and H. spontaneum
cultivar Canada Park. This population has been used to
construct a linkage map based mostly on RFLP markers.
A total of 27 cultivars (see Table 1) were used
in the fingerprinting experiments including 26 H.
vulgare entries and one H. spontaneum entry, Canada
Park, which was used together with Lina as parents to
generate the mapping population. The collection
included two-row and six-row types. Among the two-row
types, both spring and winter cultivars were included.
All 27 cultivars were previously used in an RFLP
genotyping study and therefore, the RFLP-based genetic
relationships among these cultivars were known.
Table 1
Cultivars used in the fingerprinting study
Identification Number   Cultivar used
1 Lina #0568
2 Canada Park
3 Alexis
4 Angora
5 Ariel
6 Azhul
7 Ellice
8 Express
9 Fillipa
10 Goldie
11 Golf
12 High amylose glacier
13 Igri
14 Ingrid
15 Kinnan
16 Maud
17 Meltan
18 Mentor
19 Mette
20 Mona
21 Roland
22 Saxo
23 Svani
24 Tellus
25 Tofta
26 Trebon
27 Vixen
ii) PCR detection systems
Two detection systems were used to compare the
resolution and efficiency in polymorphism
identifications. The first was the regular agarose
detection system. In this system, PCRs were performed
with regular primers (non-labeled). PCR products were
visualized in 2% agarose gels, with or without Nusieve
agarose (2/3 Nusieve : 1/3 regular agarose). The
second was the LI-COR automated DNA
sequencing/genotyping system. Primers for this system
were labeled with IRD700™ fluorescent dye (LI-COR,
Inc., Lincoln, Nebraska). PCR products were visualized
on a 6% denaturing acrylamide gel using 41-cm-long glass
plates. The gel electrophoresis was run
with the LI-COR 4200 system.
iii) PCRs
Seven master primers (Table 2) and their 3'-
anchored derivatives (Table 3) were designed and
evaluated in this study. Six of the master primers
were MITE primers (TEM-1, TEM-2, TEM-3, TEM-10, TEM-11
and TEM-12) and TEM-4 was a segment of the conserved
sequences of the reverse transcriptase (RT) domain of
several Ty1/copia-like retrotransposons (Hirochika H.
and R. Hirochika 1993. Ty1-copia group
retrotransposons as ubiquitous components of plant
genomes. Jpn. J. Genet. 68: 35-46.). The master
primers were degenerate, as more than one nucleotide was
possible at certain positions. The anchored primers were
the master primers with one additional nucleotide added
at the 3' end of the master primer (Table 3). MITE
primers were designed from the consensus sequences in
the terminal inverted repeat (TIR) regions of MITEs
from each category. Both TIRs were used to design the
primers. TEM-4 was used only in combinations with
other primers. The primers were used on both the
agarose gel detection system and LI-COR automated
detection system except that primers for the latter
were labeled with a fluorescent dye.
Table 2
Master Primers: Sources and Sequences
Primer   Transposon               Host species        No. of MITE TIRs   Sequence
TEM-1    Stowaway                 Hordeum vulgare     44                 (AG)TATTT(TA)GGAACGGAGGGAG (SEQ ID NO:1)
TEM-3    Tourist                  Triticum aestivum   2                  TT(TG)CCCAAAAGAACTGGCCC (SEQ ID NO:2)
TEM-10   Barfly                   H. vulgare          7                  TCCCCA(CT)T(AG)TGACCA(CGT)CC (SEQ ID NO:3)
TEM-4    Ty1/copia                Conserved RT        NA                 GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TG (SEQ ID NO:4)
TEM-11   Barfly                   H. vulgare          8                  TC(CT)CCATTG(CT)G(AG)CCAGCCTA (SEQ ID NO:5)
TEM-2    Tourist                  H. vulgare          4                  CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCC (SEQ ID NO:6)
TEM-12   HsMar1 (Mariner)/MADE1   Homo sapiens        58                 AATT(CA)(CT)TTTTGCACCAACCT (SEQ ID NO:7)
Hirochika and Hirochika 1993.
Table 3
Master primers and the corresponding anchored primers
TEM-1     (AG)TATTT(TA)GGAACGGAGGGAG            SEQ ID NO:1
TEM-1A    (AG)TATTT(TA)GGAACGGAGGGAGA           SEQ ID NO:8
TEM-1C    (AG)TATTT(TA)GGAACGGAGGGAGC           SEQ ID NO:9
TEM-1G    (AG)TATTT(TA)GGAACGGAGGGAGG           SEQ ID NO:10
TEM-1T    (AG)TATTT(TA)GGAACGGAGGGAGT           SEQ ID NO:11
TEM-2     CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCC    SEQ ID NO:6
TEM-2A    CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCCA   SEQ ID NO:12
TEM-2C    CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCCC   SEQ ID NO:13
TEM-2G    CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCCG   SEQ ID NO:14
TEM-2T    CCTT(CT)TAA(AC)(ACGT)GAACAA(CG)CCCT   SEQ ID NO:15
TEM-3     TT(TG)CCCAAAAGAACTGGCCC               SEQ ID NO:2
TEM-3A    TT(TG)CCCAAAAGAACTGGCCCA              SEQ ID NO:16
TEM-3C    TT(TG)CCCAAAAGAACTGGCCCC              SEQ ID NO:17
TEM-3G    TT(TG)CCCAAAAGAACTGGCCCG              SEQ ID NO:18
TEM-3T    TT(TG)CCCAAAAGAACTGGCCCT              SEQ ID NO:19
TEM-4     GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TG       SEQ ID NO:4
TEM-4A    GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TGA      SEQ ID NO:20
TEM-4C    GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TGC      SEQ ID NO:21
TEM-4G    GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TGG      SEQ ID NO:22
TEM-4T    GT(TC)TT(ACGT)AC(GA)TCCAT(TC)TGT      SEQ ID NO:23
TEM-10    TCCCCA(CT)T(AG)TGACCA(CGT)CC          SEQ ID NO:3
TEM-10A   TCCCCA(CT)T(AG)TGACCA(CGT)CCA         SEQ ID NO:24
TEM-10C   TCCCCA(CT)T(AG)TGACCA(CGT)CCC         SEQ ID NO:25
TEM-10G   TCCCCA(CT)T(AG)TGACCA(CGT)CCG         SEQ ID NO:26
TEM-10T   TCCCCA(CT)T(AG)TGACCA(CGT)CCT         SEQ ID NO:27
TEM-11    TC(CT)CCATTG(CT)G(AG)CCAGCCTA         SEQ ID NO:5
TEM-11A   TC(CT)CCATTG(CT)G(AG)CCAGCCTAA        SEQ ID NO:28
TEM-11C   TC(CT)CCATTG(CT)G(AG)CCAGCCTAC        SEQ ID NO:29
TEM-11G   TC(CT)CCATTG(CT)G(AG)CCAGCCTAG        SEQ ID NO:30
TEM-11T   TC(CT)CCATTG(CT)G(AG)CCAGCCTAT        SEQ ID NO:31
TEM-12    AATT(CA)(CT)TTTTGCACCAACCT            SEQ ID NO:7
TEM-12A   AATT(CA)(CT)TTTTGCACCAACCTA           SEQ ID NO:32
TEM-12C   AATT(CA)(CT)TTTTGCACCAACCTC           SEQ ID NO:33
TEM-12G   AATT(CA)(CT)TTTTGCACCAACCTG           SEQ ID NO:34
TEM-12T   AATT(CA)(CT)TTTTGCACCAACCTT           SEQ ID NO:35
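The parenthesised notation in Tables 2 and 3 can be expanded mechanically, which may help when checking how many distinct oligonucleotides a degenerate primer represents. The following is a minimal sketch; the function names are hypothetical and the parser assumes only the notation used in the tables (plain bases plus alternatives in parentheses).

```python
import re
from itertools import product

def expand_degenerate(primer):
    """List every concrete sequence encoded by a primer written with
    parenthesised alternatives, e.g. "(AG)TATTT(TA)GG"."""
    tokens = re.findall(r"\(([ACGT]+)\)|([ACGT])", primer)
    choices = [alternatives if alternatives else base
               for alternatives, base in tokens]
    return ["".join(combo) for combo in product(*choices)]

def anchored_primers(master):
    """Map each 3' anchor base to the master primer extended by that base."""
    return {base: master + base for base in "ACGT"}

tem1 = "(AG)TATTT(TA)GGAACGGAGGGAG"    # TEM-1, SEQ ID NO:1
tem10 = "TCCCCA(CT)T(AG)TGACCA(CGT)CC"  # TEM-10, SEQ ID NO:3

print(len(expand_degenerate(tem1)))    # 4
print(len(expand_degenerate(tem10)))   # 12
print(anchored_primers(tem1)["A"])     # the TEM-1A entry of Table 3
```

Each degenerate position multiplies the number of concrete sequences, so TEM-1 stands for four 20-mers and TEM-10 for twelve.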
PCR amplifications for the agarose detection
system were performed in a 25 µl volume containing 2.5
mM MgCl2, 0.4 mM dNTP, 1 µM of each primer and 0.625
unit of AmpliTaq™ DNA polymerase (Perkin-Elmer). The
following profile was used: an initial denaturation
step of 1 min 30 sec at 94°C; followed by 35 cycles of
30 sec at 94°C, 45 sec at 58°C and 1 min at 72°C; and a
final extension of 5 min at 72°C. This profile was
used unless otherwise indicated. An annealing
temperature of 60°C was used whenever TEM-1 was
included.
PCR amplifications for the LI-COR detection
system were performed with the same conditions as in
the regular agarose system, except that a total reaction
volume of 20 µl and 0.5 unit of AmpliTaq DNA polymerase
(Perkin-Elmer) were used. The same general profile was
used (without temperature change for TEM-1). PCR
amplifications were done in two steps. The first step
was a preamplification with non-labeled primers for 35
cycles. An aliquot of 3 µl of the preamplification mix
was used for the second step of amplification. A 0.1
µM concentration of the labeled primer was used in the
second round of amplification (compared with 1 µM of
non-labeled primer in the first step).
iv) Data collection and statistical analyses
a) Fingerprinting and genetic similarity
analyses
Polymorphic as well as common bands were scored
as presence (1), absence (0), or missing data (9) for
each individual. The resulting raw data matrices were
used to generate relative genetic similarity (GS)
matrices using Nei and Li's (Nei M. and W. Li 1979.
Mathematical models for studying genetic variation in
terms of restriction endonucleases. Proc. Natl. Acad.
Sci. 76: 5269-5273.) measurement, GS = 2n_xy/(n_x + n_y),
where n_x and n_y are the numbers of bands in lines x
and y, respectively, and n_xy is the number of bands
shared by both lines. Both polymorphic and common
bands are used to calculate the GS values.
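As an illustration of the Nei and Li measurement described above, the following minimal sketch computes GS for two lines scored with the 1/0/9 coding. How missing data were handled is not stated in the text; skipping any band scored 9 in either line is an assumption made here, and the function name is hypothetical.

```python
def nei_li_similarity(x, y, missing=9):
    """Nei and Li similarity, GS = 2*n_xy / (n_x + n_y).

    x and y are band scores (1 = present, 0 = absent, 9 = missing).
    Assumption: a band scored as missing in either line is ignored.
    """
    n_x = n_y = n_xy = 0
    for a, b in zip(x, y):
        if a == missing or b == missing:
            continue
        n_x += a          # bands present in line x
        n_y += b          # bands present in line y
        n_xy += a * b     # bands shared by both lines
    if n_x + n_y == 0:
        return float("nan")  # no informative bands to compare
    return 2 * n_xy / (n_x + n_y)


# Hypothetical scores for two lines over five bands.
line_x = [1, 1, 0, 1, 9]
line_y = [1, 0, 0, 1, 1]
print(nei_li_similarity(line_x, line_y))
```

Applying the function to every pair of lines fills the GS matrix used for the dendrograms.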
Dendrograms were generated based on the GS
matrices using the unweighted pair-group method with
arithmetic averages (UPGMA). A combined dendrogram
resulting from analyses with two MITE primers (TEM-1
and TEM-10) was generated. The normalized Mantel
statistic (Mantel N. A. 1967. The detection of disease
clustering and a generalized regression approach.
Cancer Res. 27: 209-220) was used to compare the
genetic similarity matrix based on the MITE-based
genetic markers with a genetic similarity matrix of the
same cultivars based on 313 polymorphic RFLP marker
bands. The test of significance was performed by
comparing the observed Z-value with the distribution of
1000 random permutations of the matrices. All
statistical analyses were performed with the NTSYS-pc
software (Rohlf F. J. 1994. NTSYS-pc numerical
taxonomy and multivariate analysis system, version
1.80, Exeter Software, N. Y.).
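The matrix comparison described above can be sketched as a permutation test. The code below is a minimal illustration, not the NTSYS-pc implementation: it takes the normalized Mantel statistic to be the Pearson correlation between the off-diagonal elements of two symmetric matrices, and it estimates a one-tailed probability by jointly permuting the rows and columns of one matrix. All function names and the toy matrices are assumptions.

```python
import random
from statistics import mean


def upper_triangle(matrix, order=None):
    """Off-diagonal upper-triangle values, optionally after relabeling rows/columns."""
    n = len(matrix)
    if order is None:
        order = list(range(n))
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(u, v):
    """Pearson product-moment correlation of two equal-length lists."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def mantel(m1, m2, permutations=1000, seed=0):
    """Return (observed correlation, one-tailed permutation probability)."""
    rng = random.Random(seed)
    v2 = upper_triangle(m2)
    observed = pearson(upper_triangle(m1), v2)
    order = list(range(len(m1)))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)  # relabel the lines of m1 at random
        if pearson(upper_triangle(m1, order), v2) >= observed:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Hypothetical 4 x 4 similarity matrices for the same four lines.
gs_a = [[1.0, 0.9, 0.4, 0.3], [0.9, 1.0, 0.5, 0.2], [0.4, 0.5, 1.0, 0.8], [0.3, 0.2, 0.8, 1.0]]
gs_b = [[1.0, 0.8, 0.5, 0.4], [0.8, 1.0, 0.4, 0.3], [0.5, 0.4, 1.0, 0.7], [0.4, 0.3, 0.7, 1.0]]
print(mantel(gs_a, gs_b))
```

Permuting rows and columns together keeps each matrix internally consistent, which is why the test shuffles line labels rather than individual cells.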
b) Genetic mapping
The localization of the MITE-based genetic
markers generated with the TEM-1 and TEM-10 primers was
performed by mapping these within a framework of 71
RFLP markers that had been used previously to construct
a map of the Hordeum vulgare cultivar Lina x H.
spontaneum Canada Park population. A subset of 88
doubled haploid individuals of this population was used
for the mapping. Segregation ratios were analyzed
using χ² analysis. Mapping was performed using the
computer program MAPMAKER (Lander E. S., P. Green, J.
Abrahamson, A. Barlow, M. J. Daly, S. E. Lincoln and L.
Newburg 1987. MAPMAKER: An interactive computer
package for constructing primary genetic linkage maps
of experimental and natural populations. Genomics 1:
174-181.). The MITE-based genetic markers were
assigned to linkage groups using two-point analysis at
a LOD threshold of 4 (with the exception of group 7H,
which formed two groups at this threshold that were
linked based on the published locations of RFLP
markers). Multipoint analysis with a LOD threshold of
2 was used to place the markers within the linkage
groups.
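The two statistics used in this mapping step can be illustrated with a short sketch. It assumes a doubled-haploid population in which each marker is expected to segregate 1:1 and in which a two-point LOD score compares the likelihood at the estimated recombination fraction with that at 0.5. This is a textbook simplification and not a description of MAPMAKER's internals; the function names and the example counts are hypothetical.

```python
import math


def chi_square_1to1(n_a, n_b):
    """Chi-square statistic (1 degree of freedom) for a 1:1 segregation ratio."""
    expected = (n_a + n_b) / 2
    return ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected


def two_point_lod(recombinants, total):
    """LOD score for linkage between two markers in a doubled-haploid population."""
    theta = recombinants / total
    if theta >= 0.5:
        return 0.0  # no evidence of linkage
    log_lik = 0.0
    if recombinants:
        log_lik += recombinants * math.log10(theta)
    if total - recombinants:
        log_lik += (total - recombinants) * math.log10(1 - theta)
    return log_lik - total * math.log10(0.5)


# Hypothetical counts for 88 doubled-haploid individuals.
print(chi_square_1to1(54, 34))   # exceeds 3.84, the 5% critical value for 1 d.f.
print(two_point_lod(10, 88))     # well above a LOD threshold of 4
```

A marker pair would be assigned to the same linkage group when its LOD score reaches the chosen threshold.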
RESULTS
To evaluate the MITE sequences in the PCR-based
method of the present invention, primers were designed
from the terminal inverted repeat (TIR) regions, with
all primers being directed outward from the TIRs. In
this way, any sequences amplified by these primers are
expected to lie between two adjacent MITEs within
amplifiable distances. These primers were used alone
or in combinations in a segregation analysis using a
doubled-haploid population of 88 individuals from a
cross between Hordeum vulgare cultivar Lina and H.
spontaneum cultivar Canada Park.
a) Single primer amplifications
On agarose gels, each of the MITE primers
generated around 10 scorable bands, with 2-5 being
polymorphic. These polymorphisms were clearly detected
between the H. vulgare parent Lina and the H.
spontaneum parent Canada Park and mostly showed the
expected 1:1 Mendelian segregation in the
doubled-haploid population. Fig. 1 shows an example of
the segregation patterns.
M identifies the λ PstI marker. Lane 1 contains PCR
products of H. vulgare Lina. Lane 2 contains PCR
products of H. spontaneum Canada Park. Lanes 3-28
contain PCR products of individuals in the mapping
population.
Primer TEM-1 showed a high background with some
very weak to almost invisible bands, probably due to
many closely related sequences, e.g., those resulting
from variations in the TIR regions. In this case, 2%
formamide was added to the reaction mixes, since
formamide has been reported to reduce PCR background
and enhance specificity (Nagaoka T. and Y. Ogihara
1997. Applicability of inter-simple sequence repeat
polymorphisms in wheat and their use as DNA markers in
comparison to RFLP and RAPD markers. Theor. Appl.
Genet. 94: 597-602).
Approximately 100 scorable bands were detected
on the LI-COR sequencing gel with primer TEM-1,
between 60 and 70 with primer TEM-10 and between 30
and 40 with primer TEM-3. A section of the acrylamide
gel electrophoresis with TEM-1 is shown in Fig. 2. As
in the agarose detection system, the polymorphisms
were clearly detected between the H. vulgare parent
Lina and the H. spontaneum parent Canada Park and
mostly showed 1:1 Mendelian segregation in the
doubled-haploid population.
Lane 1 contains PCR products from parent H.
vulgare, Lina. Lane 2 contains PCR products from
parent H. spontaneum, Canada Park. Lanes 3-45 contain
PCR products from individuals of the population
resulting from the cross Lina x Canada Park.
b) Primer combinations
Primer combination tests were only carried out
with the agarose detection system. Several situations
were encountered when these primers were used in
different combinations. Whereas the combination
TEM-4/TEM-10 yielded the same pattern as TEM-10 alone,
the combination TEM-1/TEM-10 produced a different
result, in which the majority of bands from TEM-10
alone were inhibited, bands from TEM-1 alone were also
less visible, and bands of smaller sizes appeared.
The combination TEM-3/TEM-10, with a longer extension
time (1 min 15 sec), yielded a different pattern with
three clearly visible segregating bands different from
those of either primer alone (Fig. 3). Lanes 1-16 are
individuals in the mapping population. No parents are
shown.
Similar situations were seen with primer TEM-1.
Whereas the banding pattern did not change when TEM-1
was combined with TEM-4 (Fig. 4A), the pattern did
change when this primer was combined with TEM-3 or
TEM-10. Moreover, with a longer extension time (1 min
15 sec), the combination TEM-1/TEM-4 yielded a larger
segregating band with some other bands suppressed (Fig.
4B). Interestingly, the larger band (referred to as
T1-4AA after the primer combination) segregated almost
the same as band T4-10A (Fig. 1). T1-4AA and T4-10A
were not the same product of the common primer TEM-4,
since TEM-10 alone also amplified band T4-10A. Also,
T1-4AA was approximately 300 bp larger than T4-10A.
This indicates that the two primer combinations
amplified tightly linked regions of DNA. This is not
unexpected because these transposable elements are
predicted to be present in high copy numbers.
Legends in Figs. 4A and 4B are the same as in
Fig. 1. Lane numbers correspond to each other in Figs.
4A and 4B.
Some primer combinations yielded inconsistent
results. Possible explanations are:
- Each primer alone amplified more than 10 bands
and therefore, combinations of these primers
could either yield too many bands to be clearly
visualized or could yield band patterns that
fluctuate with minor changes in reaction
conditions;
- Different annealing temperatures (as with
TEM-4, which has a much lower annealing
temperature) may be an important factor in
determining the pattern produced; and
- Different affinities of primers may result in
the predominance of certain bands, as was the
case with TEM-1 and TEM-10.
Nevertheless, it is likely that, as with the
single primer reactions, using the fluorescence
labeling detection system, some of these problems will
be resolved and that primer combinations will
significantly increase the number of detectable loci.
c) Chromosome localization of MITE-based genetic
markers
Using the agarose detection system, the three
MITE primers and the Ty1/copia retrotransposon primer
generated a total of 15 detectable polymorphic markers
on the mapping population. All except two segregated
in the expected 1:1 ratio. Thirteen of these markers
could be placed on the map. The other two markers
remained unlinked. These were the markers exhibiting
significant deviation from the expected segregation
ratio, and each is likely to consist of two bands of
similar size that could not be separated on agarose.
In Fig. 5, the MITE-based genetic markers are
shown in a larger font and in bold characters. Only the
loci detected on acrylamide gel with the fluorescently
labeled TEM-1 and TEM-10 primers are shown. Loci in
parentheses are those that could not be placed with a
LOD score greater than or equal to 2. Approximately
120 and 90 clear bands were detected on a LI-COR
sequencing gel with primers TEM-1 and TEM-10,
respectively. The size range of the bands detected was
approximately 100 bp to 1 kb. Part of the
amplification result with TEM-1 as visualized by
polyacrylamide gel electrophoresis is shown in Figure
2.
Seventy-five and 19 polymorphic bands were generated
with the TEM-1 and TEM-10 primers, respectively. Some
pairs of bands exhibited co-dominant behavior (loci
T1-0.2 on 1H, T1-4 and T1-16 on 2H, T10-6 on 3H, T1-8 on
group 4H and T1-36 on 7H, Figure 5), but the remaining
bands exhibited a presence/absence pattern with exactly
41 coming from the Lina parent and 41 from the H.
spontaneum parent. Of the 70 mapped TEM-1 loci, 24
significantly deviated from the expected 1:1
segregation ratio. All 24 loci except one (T1-19 on
7H) mapped to areas where RFLP markers also exhibited
distorted segregation ratios in this mapping
population. Two of the 18 TEM-10 loci significantly
deviated from the expected segregation ratio and these
were again located in areas where RFLP loci also
deviated from the expected 1:1 ratio.
In total, 88 loci were mapped. These loci covered all
seven linkage groups (Figure 5). Furthermore, the
distribution of the loci showed no significant
clustering other than that which would be expected
around centromeric regions where recombination is
typically reduced (e.g., groups 1H, 3H and 7H, Figure
5). In fact, the distribution is similar to that found
with cDNAs detecting RFLPs (L. S. O'Donoughue,
unpublished). This suggests that MITEs are located in
areas of the genome containing coding sequences and
that it will be possible to cover the entire genome
with a limited set of MITE-based primers.
d) Fingerprinting
A total of 27 cultivars, which included the H.
vulgare parent Lina, the H. spontaneum parent Canada
Park and 25 H. vulgare cultivars (Table 1), were used to
assess the usefulness of these MITE primers in
fingerprinting. On agarose, the primers and primer
combinations TEM-3, TEM-10, TEM-1/-3 and TEM-1/-4 were
found to be useful in distinguishing these cultivars.
One to three polymorphic bands were seen with each of
these primers and combinations. An example of the
fingerprinting experiments on agarose is shown in Fig.
6.
The three segregating bands in Fig. 6 indicated
that the 27 cultivars separated into 7 groups. M
represents the molecular weight marker λ PstI. The
numbers correspond to those in Table 1.
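Grouping cultivars by their banding pattern can be sketched as follows. Three presence/absence bands can distinguish at most 2^3 = 8 patterns, which is consistent with the 7 groups reported for Fig. 6. The scores below are hypothetical placeholders, not the data of Fig. 6, and the function name is an assumption.

```python
from collections import defaultdict


def group_by_pattern(scores):
    """Group cultivars that share the same presence/absence band pattern."""
    groups = defaultdict(list)
    for cultivar, pattern in scores.items():
        groups[tuple(pattern)].append(cultivar)
    return dict(groups)


# Hypothetical scores for three polymorphic bands (1 = present, 0 = absent).
scores = {
    "cultivar_1": (1, 0, 1),
    "cultivar_2": (1, 0, 1),
    "cultivar_3": (0, 0, 1),
    "cultivar_4": (1, 1, 0),
}
for pattern, members in group_by_pattern(scores).items():
    print(pattern, members)
```

Cultivars that fall in the same group cannot be told apart with these bands alone and would need additional primers or primer combinations.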
Two MITE primers, TEM-1 and TEM-10, were studied
in the fingerprinting analysis with the fluorescence
labeling detection system. A total of 62 bands were
scored for TEM-1, 37 of which were polymorphic, and the
remaining 22 were the same across all 27 cultivars. A
section of this electrophoresis is shown in Fig. 7. A
total of 60 bands were scored with TEM-10, 34 of which
were polymorphic and the remaining 26 were the same
across all 27 cultivars. Identification of the lines of
Fig. 7 can be found in Table 1.
Lanes 1-27 present the results from Lina,
Canada Park, Alexis, Angora, Ariel, Azhul, Ellice,
Express, Fillipa, Goldie, Golf, High amylose glacier,
Igri, Ingrid, Kinnan, Maud, Meltan, Mentor, Mette,
Mona, Roland, Saxo, Svani, Tellus, Tofta, Trebon and
Vixen, respectively. Dashes indicate markers that
distinguished at least one cultivar from others.
GS matrices were generated with TEM-1, TEM-10
as well as the combined data of both primers, using Nei
and Li's coefficient (Nei and Li, supra). Dendrograms
were generated with the same sets of data. The
dendrogram of the combined data of TEM-1 and TEM-10 is
shown in Fig. 8. The dendrogram clearly separates the
H. spontaneum line from the H. vulgare cultivars. With
the exception of Azhul (six-row type), the spring
two-row types clustered together and separated from the 4
winter types (Angora, Express, Igri and Vixen) included
in the present study. The High Amylose Glacier
line clustering with the winter two-rows is a six-row
type. A comparison of the GS matrix with the one
obtained earlier with an RFLP analysis showed a good
correlation between the two, with a Mantel statistic of
Z = 0.69475. This positive correlation was highly
significant, with a probability of P = 0.0020 that this
value of Z would be obtained by chance alone.
e) Universality of the primers
To demonstrate the universality of the primers,
the animal-derived MITE master primers and the
plant-derived MITE master primers were used on plant,
insect and human genomic DNA.
Figs. 9A, 9B, 9C and 9D show a typical result
of PCR-amplified profiles of eleven different sources
of DNA using Master primer TEM-12 (Fig. 9A); Master
primer TEM-1 (Fig. 9B); Master primer TEM-10 (Fig. 9C)
and Master primer TEM-11 (Fig. 9D), as listed in
Table 2. The sources of DNA (listed above each lane)
are:
1) Normal human DNA, male.
2) Normal human DNA, female.
3) Human DNA, male with albinism.
4) Human DNA, female with albinism.
5) Insect: Trichogramma.
6) Legume: Soya.
7) Legume: alfalfa.
8) Crucifer: Canola.
9) Cereal: wheat.
10) Cereal: oat.
11) Cereal: barley.
These results clearly demonstrate that MITE-based
markers can be used in a broad range of species.
f) Versatility of the Master derived sequences
To demonstrate the versatility of the MITE-
based marker system, the Master primer sequences were
modified by adding an additional nucleotide at their 3'
end. This has the effect of increasing the specificity
of the amplified product and is especially useful when
the amplification profile generated by the Master
sequence is too complex to interpret as with the TEM-1
primer derived from Stowaway.
Figs. 10A, 10B, 10C, 10D and 10E show an
example of the results obtained with the Master primer
TEM-1 in a preamplification step and its corresponding
anchored primers listed in Table 3 in the amplification.
The figures show polymerase chain reaction (PCR)-
amplified profiles of cereal DNA (barley) comparing the
profile obtained with Master primer TEM-1 alone (Fig.
10A); anchored primer TEM-1A, anchored with an
additional "A" at its 3' end (Fig. 10B); anchored
primer TEM-1C, anchored with "C" (Fig. 10C);
anchored primer TEM-1G, anchored with "G" (Fig. 10D);
and anchored primer TEM-1T, anchored with "T" (Fig.
10E).
It is clear from these results that a simpler
amplification pattern is obtained when TEM-1 is
anchored at its 3' end with either an A, C, T, or G.
It is also clear that different and complementary
amplification patterns are obtained with the different
3' end anchors.
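The effect of a 3' anchor can be illustrated with a minimal sketch. It assumes a non-degenerate primer matched exactly against a single template strand, which is a strong simplification of real annealing; the short "master" sequence and the template are hypothetical and serve only to show that the four anchored primers split the master primer's sites into complementary subsets.

```python
def anchored_primers(master):
    """Return the four primers obtained by adding one nucleotide at the 3' end."""
    return {base: master + base for base in "ACGT"}


def priming_sites(primer, template):
    """Start positions where the primer matches the template exactly."""
    return [i for i in range(len(template) - len(primer) + 1)
            if template.startswith(primer, i)]


# Hypothetical template containing four copies of a hypothetical master site.
template = "TTGGAGACCGGAGCAAGGAGTTTGGAGA"
master = "GGAG"

print("master:", priming_sites(master, template))
for base, primer in anchored_primers(master).items():
    print(base, priming_sites(primer, template))
```

Each anchored primer is expected to retain only the sites followed by its anchor base, so the individual profiles are simpler and, taken together, cover the sites of the master primer.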
General Purposes and Commercial Applications
Various studies have shown transposable
elements to be present in virtually every species
studied to date. Retrotransposons are present in plant
genomes in high copy numbers. The Alu family was
estimated at 5 x 10^5 copies per haploid human genome,
which translates to one Alu element in every 5 kb of
DNA. This element alone accounts for 5% of the genome
in primates (Berg D. E. and M. M. Howe 1989. Mobile
DNA. Washington, American Society of Microbiology).
Ty1/copia group elements can accumulate up to 10^6
copies per genome in Vicia species, making up >2% of
the genome, although wide variations were seen across
species (Pearce S. R., H. Gill, D. Li, J. S. Heslop-
Harrison, A. Kumar and A. J. Flavell 1996. The Ty1-
copia group retrotransposons in Vicia species: copy
number, sequence heterogeneity and chromosome
localisation. Mol. Gen. Genet. 250: 305-315). The
BARE-1 retrotransposon has a copy number of 3 x 10^4 and
makes up to 6.7% of the barley genome (Suoniemi A., K.
Anamthawat-Jonsson, T. Arna and A. H. Schulman 1996.
Retrotransposon BARE-1 is a major, dispersed component
of the barley (Hordeum vulgare L.) genome. Plant
Molecular Biology 30: 1321-1329). In the study by
SanMiguel et al. (SanMiguel P., A. Tikhonov, Y.-K. Jin,
N. Motchoulskaia, D. Zakharov, A. Melake-Berhan, P. S.
Springer, K. J. Edwards, M. Lee, Z. Avramova and J. L.
Bennetzen 1996. Nested retrotransposons in the
intergenic regions of the maize genome. Science 274:
765-768), sequencing of a contiguous 280-kb region
flanking the maize Adh1-F gene isolated on a yeast
artificial chromosome (YAC) clone revealed 37 classes
of nested retrotransposon repeats that accounted for
>60% of the clone.
The Tourist and Stowaway elements (Bureau T. E.
and S. R. Wessler 1992. Tourist: A large family of
small inverted repeat elements frequently associated
with maize genes. Plant Cell 4: 1283-1294; and Bureau
T. E. and S. R. Wessler 1994. Stowaway: A new family
of inverted repeat elements associated with the genes
of both monocotyledonous and dicotyledonous plants.
Plant Cell 6: 907-916) are members of the TIR class of
transposable elements, although they differ
significantly from the traditional TIR transposable
element families like Ac and En/Spm. Barfly, a new
member of the TIR transposable elements like Tourist
and Stowaway, is found to be associated with the barley
xylose isomerase gene. These elements, together with
some other elements of the type, collectively referred
to as MITEs (Bureau T. E., P. C. Ronald, and S. R.
Wessler 1996. A computer-based systematic survey
reveals the predominance of small inverted-repeat
elements in wild-type rice genes. Proc. Natl. Acad.
Sci. 93: 8524-8529), were found in a great number of
plant species studied so far. MITEs are also expected
to be present in high copy numbers in eukaryotic
genomes.
The ubiquity and dispersion throughout the
genome of transposable elements suggest that they can
be exploited as PCR-based mapping tools. Indeed,
Sinnet et al. (Sinnet D., J.-M. Deragon, L. R. Simard
and D. Labuda 1990. Alumorphs--human DNA polymorphisms
detected by polymerase chain reaction using
Alu-specific primers. Genomics 7: 331-334) used
Alu-specific primers in search of polymorphisms among
different human DNA samples. These investigators
clearly demonstrated the feasibility of using these
polymorphisms (termed alumorphs) as a genome analysis
tool (Sinnet et al., supra) and successfully used these
alumorphs to detect the linkage of one alumorph to a
human disease (Zietkiewicz E., M. Labuda, D. Sinnet, F.
H. Glorieux and D. Labuda 1992. Linkage mapping by
simultaneous screening of multiple polymorphic loci
using Alu oligonucleotide-directed PCR. Proc. Natl.
Acad. Sci. 89: 8448-8451). A copia-like
retrotransposon, PDR1, was also successfully used to
study polymorphisms and, in combination with other
specific primers, to diagnose different lines in Pisum
(Lee D., T. H. N. Ellis, L. Turner, R. P. Hellens and
W. G. Cleary 1990. A copia-like element in Pisum
demonstrates the uses of dispersed repeated sequences
in genetic analysis. Plant Molecular Biology 15:
707-722).
In the present invention, members of the TIR class
of transposable elements, the MITEs, are used as
mapping and fingerprinting tools in barley. In both
the regular agarose system and the LI-COR automated DNA
analysis system, they succeeded in detecting
polymorphisms, localizing the resulting MGMs on an
existing genetic linkage map and fingerprinting
cultivars within the H. vulgare species.
In the regular agarose detection system, we
showed that with three MITE primers and one
retrotransposon primer, 15 clearly scorable
polymorphisms were detected and 13 of the 15 were
mapped to four linkage groups of barley. Each MITE
primer or primer combination generated more than 10
scorable bands with 2-5 being polymorphic. In the
LI-COR automated genotyping system, each of the two
MITE primers shown generated close to 100 scorable
bands with up to 75 being polymorphic. Markers mapping
to all seven barley linkage groups were obtained using
these two primers. This demonstrates the random
distribution of the MITE-based markers in genomes.
New MITEs are constantly being uncovered by
computer-based sequence similarity searches. As the
number of MITEs increases, detailed linkage maps of
virtually any species with high copy numbers of MITEs
can be readily constructed based solely on MGMs.
Linkage studies of important genes with MGMs can also
be readily carried out. The high level of variation
detected with these MITE-based primers among cultivars
within the same species demonstrates the practical
value of these primers.
Some transposable elements, such as Alu (Berg
and Howe, supra; Makalowski W., G. A. Mitchell and D.
Labuda 1994. Alu sequences in the coding regions of
mRNA: a source of protein variability. Trends Genet.
10: 188-193), and the mouse B2 element (Clemens M. J.
1987. A potential role for RNA transcribed from B2
repeats in the regulation of mRNA stability. Cell 49:
157-158), have been found to be frequently associated
with genes. MITE members were also frequently
identified within plant and other eukaryotic genes.
Stowaway was first discovered as the cause of a
mutation at the wx locus of maize (Bureau and Wessler,
supra). More than 100 genes were found to harbor MITEs
in their coding or non-coding regions (Bureau et al.,
supra). The close association of retroelements with
animal and plant genes, and of MITEs with genes in
agronomic crops and other plants, has opened a new way
of characterizing genes or gene sequences. Indeed,
studies have been done in isolating gene sequences
(Nelson D. L., S. A.
Ledbetter, L. Corbo, M. F. Victoria, R. Ramirez-Solis,
T. D. Webster, D. H. Ledbetter and C. T. Caskey 1989.
Alu polymerase chain reaction: A method for rapid
isolation of human-specific DNA sequences from complex
DNA sources. Proc. Natl. Acad. Sci. 86: 6686-6690;
Souer E., F. Quattrocchio, N. de Vetten, J. Mol and R.
Koes 1995. A general method to isolate genes tagged by
a high copy number transposable element. Plant Journal
7: 677-685), in genome analysis (Hirochika H. 1997.
Retrotransposons of rice: their regulation and use for
genome analysis. Plant Molecular Biology 35: 231-240;
Lee D., T. H. N. Ellis, L. Turner, R. P. Hellens and W.
G. Cleary 1990. A copia-like element in Pisum
demonstrates the uses of dispersed repeated sequences
in genetic analysis. Plant Molecular Biology 15:
707-722), and in analysis of gene structure and
expression
(White S. E., L. F. Habera and S. R. Wessler 1994.
Retrotransposons in the flanking regions of normal
plant genes: A role for copia-like elements in the
evolution of gene structure and expression. Proc. Natl.
Acad. Sci. 91: 11792-11796) using other types of
transposable elements.
The applications of the method of the present
invention are manifold.
1) In Linkage Studies
a) As described in the present application,
linkage maps can be constructed with MITE markers.
This requires a segregating population and the parents.
Linkage maps are constructed based on the segregation.
b) Linkage to a phenotypic trait or a gene can
also be carried out. This can be accomplished in
conjunction with bulked segregant analysis to expedite
the investigation. In this case, two parents and the
pools that are phenotypically (with a trait) or
genetically (with a gene) distinct are to be used in
PCR amplification with MITE primers to identify
polymorphic markers and therefore putative linkages.
c) By the same principle, the association of MGM
or IMP with Quantitative Trait Loci (QTL) controlling
traits under complex genetic control can be detected
using various statistical analyses such as single-point
ANOVAs, Interval Mapping and Composite Interval
Mapping.
d) Once linkages of markers with traits of
agronomical importance are known, these markers can be
used in marker assisted selection (MAS) to expedite
breeding programs.
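The screening logic of bulked segregant analysis mentioned in item b) can be sketched as follows. The selection rule used here (a band must differ between the two parents and each bulk must match its corresponding parent) is one common reading of the approach and is an assumption, as are the function name and the example scores.

```python
def bsa_candidates(parent_a, parent_b, bulk_a, bulk_b):
    """Indices of bands that are candidate markers for the trait.

    A band qualifies when it is polymorphic between the parents and each
    bulk (pool) shows the same state as its corresponding parent.
    """
    candidates = []
    for i, (pa, pb, ba, bb) in enumerate(zip(parent_a, parent_b, bulk_a, bulk_b)):
        if pa != pb and ba == pa and bb == pb:
            candidates.append(i)
    return candidates


# Hypothetical band scores (1 = present, 0 = absent) for four bands.
parent_a = [1, 0, 1, 1]
parent_b = [0, 0, 0, 1]
bulk_a = [1, 0, 1, 1]
bulk_b = [0, 0, 1, 1]
print(bsa_candidates(parent_a, parent_b, bulk_a, bulk_b))
```

Bands flagged in this way indicate only putative linkage; they would then be scored on individual members of the segregating population.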
2) In Fingerprinting Studies
The MGM and IMP approaches can be used to
assist construction of large insert libraries such as
YACs (yeast artificial chromosomes) and BACs (bacterial
artificial chromosomes), to assist in cultivar
identifications and to assist in gene isolation as well
as for marker conversion.
a) The MGM and IMP markers generated can serve
as landmarks in aligning contigs and in chromosome
walking.
b) The MGM and IMP approaches can be readily
explored in fingerprinting cultivars and breeding lines
to determine their pedigrees and genetic relationships,
to determine the degree of contribution of a parent to
progeny lines, and in certification of new lines and
cultivars.
The MGM approach can be used to assist in gene
isolations and subcloning genomic sequences.
a) When a gene is tagged with a transposable
element, MGM can be exploited by virtue of its
pervasiveness in the genome. A MITE primer can be used
in conjunction with a primer designed from the tagging
transposon. The flanking sequence can be amplified and
then used to isolate the wild type gene. This
approach can save one round of DNA library screening
compared to regular cloning of a transposon-tagged
gene.
b) With a similar scenario to gene isolation,
MGM can be exploited to isolate genome sequences
flanking known gene sequences. A MITE primer and a
primer designed from the known gene sequence can be
used in PCRs to amplify the flanking sequences.
c) Amplification using a primer from a DNA
clone detecting an RFLP used in combination with a MITE
primer may be used to convert an RFLP marker to a
PCR-based marker.
While the invention has been described in
connection with specific embodiments thereof, it will be
understood that it is capable of further modifications
and this application is intended to cover any
variations, uses, or adaptations of the invention
following, in general, the principles of the invention
and including such departures from the present
disclosure as come within known or customary practice
within the art to which the invention pertains and as
may be applied to the essential features hereinbefore
set forth, and as follows in the scope of the appended
claims.
SEQUENCE LISTING
<110> MCGILL UNIVERSITY
DNA LANDMARKS INC.
BUREAU, Thomas
Chang, Ruying
LANDRY, Benoit
O'DONOUGHUE, Louisa
<120> A NOVEL TYPE OF TRANSPOSON-BASED GENETIC
MARKER
<130> 1770-222PCT
<150> 60/127,460
<151> 1999-01-04
<160> 35
<170> FastSEQ for Windows Version 3.0
<210> 1
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 1
rtatttwgga acggagggag 20
<210> 2
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 2
ttkcccaaaa gaactggccc 20
<210> 3
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 3
tccccaytrt gaccabcc 18
<210> 4
<211> 17
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 4
gtyttnacrt ccatytg 17
<210> 5
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 5
tcyccattgy grccagccta 20
<210> 6
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 6
ccttytaamn gaacaasccc 20
<210> 7
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 7
aattmytttt gcaccaacct 20
<210> 8
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 8
rtatttwgga acggagggag a 21
<210> 9
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 9
rtatttwgga acggagggag c 21
<210> 10
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 10
rtatttwgga acggagggag g 21
<210> 11
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 11
rtatttwgga acggagggag t 21
<210> 12
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 12
ccttytaamn gaacaasccc a 21
<210> 13
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 13
ccttytaamn gaacaasccc c 21
<210> 14
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 14
ccttytaamn gaacaasccc g 21
<210> 15
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 15
ccttytaamn gaacaasccc t 21
<210> 16
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 16
ttkcccaaaa gaactggccc a 21
<210> 17
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 17
ttkcccaaaa gaactggccc c 21
<210> 18
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 18
ttkcccaaaa gaactggccc g 21
<210> 19
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 19
ttkcccaaaa gaactggccc t 21
<210> 20
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 20
gtyttnacrt ccatytga 18
<210> 21
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 21
gtyttnacrt ccatytgc 18
<210> 22
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 22
gtyttnacrt ccatytgg 18
<210> 23
<211> 18
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 23
gtyttnacrt ccatytgt 18
<210> 24
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 24
tccccaytrt gaccabcca 19
<210> 25
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 25
tccccaytrt gaccabccc 19
<210> 26
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 26
tccccaytrt gaccabccg 19
<210> 27
<211> 19
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 27
tccccaytrt gaccabcct 19
<210> 28
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 28
tcyccattgy grccagccta a 21
<210> 29
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 29
tcyccattgy grccagccta c 21
<210> 30
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 30
tcyccattgy grccagccta g 21
<210> 31
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 31
tcyccattgy grccagccta t 21
<210> 32
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 32
aattmytttt gcaccaacct a 21
<210> 33
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 33
aattmytttt gcaccaacct c 21
<210> 34
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 34
aattmytttt gcaccaacct g 21
<210> 35
<211> 21
<212> DNA
<213> Artificial Sequence
<220>
<223> Artificial Primer
<400> 35
aattmytttt gcaccaacct t 21