Note: Descriptions are shown in the official language in which they were submitted.
WO 93/11262 2123789 PCT/FR92/01141
Method of selecting at least one mutation screen, its
application to a method for the rapid identification of
alleles of polymorphous systems and device for
implementation thereof.
The present invention relates to a method of
selecting at least one mutation screen from a set of
allelic sequences of a polymorphous gene, to a method for
the rapid identification of allelic variations (alleles
or allelic sequences) of the sequences of polymorphous
genes, to nucleotide probes obtained from the said
mutation screens, especially placed in the form of data
banks, as well as to a device for implementing the said
methods.
The present invention also relates to a kit for
the identification of the alleles of polymorphous genes.
At present, it is very difficult and very tedious
to identify the different alleles of the same gene,
differing by mutation of at least one base in their
nucleotide sequence, especially in the case of naturally
polyallelic systems, such as the major histocompatibility
system (HLA) whose genes can exist in numerous allelic
forms, as well as in any other form of polymorphism,
especially those due to somatic mutations such as those
of immunoglobulins and T cell receptors or alternatively
those encountered in systems equivalent to a polyallelic
system, which are more particularly observed in certain
multiple-mutation genetic diseases such as cystic
fibrosis or Duchenne's muscular dystrophy.
The major histocompatibility complex (HLA
complex) genes, for example, are closely linked on the
short arm of chromosome 6 and extend over about 5000 kb;
they encode three types of proteins, the class I, II and
III proteins; a major characteristic of the HLA system is
its vast polymorphism. The polymorphism of this system
results from the number of genes and the number of
different alleles which are possible for each of these
genes, the polymorphism being further increased if the
fact that an individual may have received the same allele
from both its parents (homozygous state) or may have
received two different alleles (heterozygous state) is
taken into account.
Furthermore, if it is considered that for the HLA
complex, there may be from 10 to 100 alleles per gene and
that 15 to 20 genes encoding the proteins of the HLA
complex have currently been characterized, it is
practically impossible to carry out a complete typing (or
identification) of this complex with the methods
currently available, whereas the latter may prove
crucial, especially in transplantation.
Indeed, the typing of the different polymorphous
systems can be currently performed either by
immunochemical methods or by DNA/DNA hybridization
techniques; however, these techniques have the
disadvantage:
- of not being sufficiently discriminatory, and
therefore of not permitting the differentiation of
alleles of very similar structures and
- of necessitating the use of a large number of
oligonucleotide probes (for example: about 50-60 probes
in the case of the DRβ gene of the HLA system (see
especially the nomenclature of the factors of the HLA
system, published in 1990 in Immunogenetics, 31, 131-
140), which comprises 56 alleles), and this, insofar as
in the molecular biology-based conventional typing
methods of the prior art, it is effectively necessary to
provide of the order of one probe per allele in order to
be able to interpret the result.
Now, this identification is often necessary
either for preventive reasons, or for curative reasons
(especially therapy, surgery, transplants); more
particularly in the case of the HLA complex, the control of
a reliable typing system is made necessary for a
preventive purpose by the existence of a correlation
between the susceptibility to certain diseases and the
frequency of certain HLA alleles; and for a curative
purpose by the necessity to have an HLA compatibility
between donor and recipient, in the case of a transplant,
as specified above and for the purpose of identifying
individuals (especially criminology and search for
paternity).
Consequently, the Applicant set itself the
objective of providing a method for rapid and reliable
identification of alleles, which has the advantage of
permitting the identification of the complete allelic map
of a subject, and this without necessitating the use of a
large number of oligonucleotide probes (difficulty of
production and high cost of the said probes).
The subject of the present invention is a method
for the selection, from a series of allelic sequences of
a polymorphous gene, of at least one mutation screen
intended to specify at least one nucleotide probe
suitable for use in differentiating all the alleles,
characterized in that it comprises the following steps:
(a) selecting all or part of a known consensus
sequence of the said polymorphous gene;
(b) creating a mutation matrix for the
corresponding sequences of known alleles;
(c) identifying indiscernible sequences by
comparison in twos (alleles having the same mutation
profile in the sequence selected in (a)) and excluding
one of the members of the said pairs;
(e) identifying and counting the obligatory
mutations or so-called allele marker mutations, that is
to say those which are necessary and adequate for
distinguishing between two alleles which are otherwise
identical (set O of obligatory mutations); and
(f) obtaining the said minimum mutation
screen(s), comprising at least the obligatory mutations
of step (e).
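Steps (b), (c) and (e) can be sketched as follows. This is only an illustration under stated assumptions: the consensus, the allele names `X*01` to `X*04` and their sequences are hypothetical toy data, and the function names are not taken from the description.

```python
from itertools import combinations

# Hypothetical toy data (not from the description): a consensus sequence
# and four allele sequences of the same length.
consensus = "ACGTACGT"
alleles = {
    "X*01": "ACGTACGT",
    "X*02": "ACGAACGT",
    "X*03": "ACGAACCT",
    "X*04": "ACGAACCT",  # same mutation profile as X*03
}

def mutation_matrix(consensus, alleles):
    """Step (b): for each allele, record {position: base} wherever it
    differs from the consensus (positions are 1-based)."""
    return {
        name: {i + 1: b for i, (b, c) in enumerate(zip(seq, consensus)) if b != c}
        for name, seq in alleles.items()
    }

def drop_indiscernible(matrix):
    """Step (c): keep only the first allele of each identical profile."""
    kept, seen = {}, set()
    for name, profile in matrix.items():
        key = tuple(sorted(profile.items()))
        if key not in seen:
            seen.add(key)
            kept[name] = profile
    return kept

def obligatory_mutations(matrix):
    """Step (e): positions that are the sole difference between two alleles."""
    positions = sorted({p for profile in matrix.values() for p in profile})
    obligatory = set()
    for a, b in combinations(matrix.values(), 2):
        diff = [p for p in positions if a.get(p) != b.get(p)]
        if len(diff) == 1:
            obligatory.add(diff[0])
    return obligatory

matrix = drop_indiscernible(mutation_matrix(consensus, alleles))
obligatory = obligatory_mutations(matrix)
print(sorted(matrix), sorted(obligatory))
```

In this toy case the allele X*04 is excluded as indiscernible from X*03, and both remaining mutated positions turn out to be obligatory.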
According to an advantageous embodiment of the
said method, prior to step (e) for identification and
counting of the obligatory mutations, the said method
comprises:
(d) identifying similar mutations in each of the
said sequences of alleles of step (b), so as to treat in
the next steps only the mutations which are non-redundant
and which constitute the set U of useful mutations; which
step (d) is followed by the steps (e) and (f) modified as
follows:
(e) identifying and counting the obligatory
mutations or so-called allele marker mutations, among the
useful mutations of the set U, that is to say those which
are necessary and adequate for distinguishing between two
alleles which are otherwise identical (set O' of
obligatory mutations); and
(f) if the obligatory mutations of step (e) do
not permit mutation screens suitable for the univocal
differentiation of all the alleles to be directly
obtained, a minimum number of useful mutations of step
(d) (subset U1 derived from the set U of useful
mutations) is selected, which mutations, associated with
the obligatory mutations of step (e), form the mutation
screen(s) suitable for the univocal differentiation of
all the alleles.
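A minimal sketch of steps (d), (e) and (f) of this embodiment is given below. It rests on assumptions that the description does not state: two mutations are treated as "similar" when they split the alleles into the same groups, and the minimum screens are found by exhaustive search over increasing screen sizes. The matrix, the allele names and the function names are hypothetical.

```python
from itertools import combinations

# Hypothetical mutation matrix: one row per allele, one column per mutated
# position; "-" means "same base as the consensus".
positions = [2, 4, 6, 7, 9]
matrix = {
    "X*01": "-----",
    "X*02": "T----",
    "X*03": "-GC-T",
    "X*04": "-G-AT",
    "X*05": "--CA-",
}

def signature(row, screen):
    """Bases of one allele at the positions of a screen."""
    return tuple(row[positions.index(p)] for p in screen)

def useful_mutations():
    """Step (d): keep one position per group of similar mutations (set U)."""
    seen, useful = set(), []
    for j, pos in enumerate(positions):
        labels = {}
        # Canonical relabelling: columns that group alleles alike compare equal.
        key = tuple(labels.setdefault(row[j], len(labels)) for row in matrix.values())
        if key not in seen:
            seen.add(key)
            useful.append(pos)
    return useful

def obligatory_mutations(useful):
    """Step (e): useful positions that alone separate some pair of alleles."""
    found = set()
    for a, b in combinations(matrix.values(), 2):
        diff = [p for p, x, y in zip(useful, signature(a, useful), signature(b, useful)) if x != y]
        if len(diff) == 1:
            found.add(diff[0])
    return found

def minimal_screens(useful, required):
    """Step (f): smallest screens containing `required` that separate all alleles."""
    extra = [p for p in useful if p not in required]
    for k in range(len(extra) + 1):
        hits = []
        for combo in combinations(extra, k):
            screen = sorted(required | set(combo))
            if len({signature(r, screen) for r in matrix.values()}) == len(matrix):
                hits.append(screen)
        if hits:
            return hits
    return []

useful = useful_mutations()
required = obligatory_mutations(useful)
screens = minimal_screens(useful, required)
print(useful, required, screens)
```

Here position 9 is dropped as redundant with position 4, a single obligatory mutation is found, and two further useful mutations are needed to obtain a screen, which gives three alternative screens of equal size.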
According to an advantageous arrangement of this
embodiment, prior to step (f), the said method comprises
a step (x) for selecting useful mutations of step (d)
(subset U2 derived from the set U of useful mutations),
in order to form a group of useful mutations most
suitable for preparing oligonucleotide probes suitable
for use in differentiating all the alleles; which step
(x) is followed by the step (f) modified as follows:
(f) if the obligatory mutations do not permit the
direct selection of mutation screens suitable for the
univocal differentiation of all the alleles, a minimum
number of useful mutations of step (x) is selected, which
mutations, associated with the obligatory mutations of
step (e), form the mutation screen(s) suitable for the
univocal differentiation of all the alleles.
Such mutation screens are particularly advan-
tageous for the selection and preparation of a limited
number of oligonucleotide probes suitable for use in
differentiating all the alleles of a polymorphous gene.
The subject of the present invention is also a
method for the identification of alleles (or allelic
sequences) of a polymorphous gene, characterized in that
it comprises the following steps:
I - selecting at least one mutation screen pre-
pared from a series of allelic sequences of a poly-
morphous gene in the steps:
. (a) to (f) of the method of selecting at least
one mutation screen as defined above (including the
different variants); then
. (g) choosing, from the screens selected in step
(f) of the said mutation screen selection process, the
most suitable mutation screen for selecting and preparing
oligonucleotide probes suitable for use in
differentiating all the alleles;
II - actual typing of an allele X to be
identified by:
(h) appropriately hybridizing the said allele X
with the oligonucleotide probes selected from the
mutation screen(s) obtained in steps (a) to (g); and
(i) identifying the allele X by detection of the
said hybrid(s) which may have been formed in step (h).
Advantageously, when the mutation screen selection
process comprises step (d) as defined above, the
said step (d) has the advantage of bringing about a first
reduction in the mutations to be considered in the
subsequent steps, by eliminating a first subset of
mutations (redundant mutations) and therefore of con-
stituting a set U of mutations useful for the charac-
terization of an allele.
The steps (e) to (g) have the advantage:
- of permitting the selection of a subset of
obligatory mutations, among the useful mutations of the
set U which, optionally in association with:
- either a subset U1, derived from the set U of
useful mutations (the set U1 corresponding to a minimum
number of useful mutations which, in association with the
obligatory mutations, form mutation screens suitable for
the univocal differentiation of all the alleles)
- or a subset U2, derived from the set U of
useful mutations and selected in order to form a group of
useful mutations more suitable for preparing appropriate
oligonucleotide probes, form mutation screens suitable
for the univocal differentiation of all the alleles; and
- of permitting, because of the selection of the
specific oligonucleotide probes, a rapid identification
of the unknown allele.
Indeed, the method conforming to the invention
permits, in addition to the selection of a limited number
of oligonucleotide probes, the selection of probes having
the following advantageous characteristics:
- maximum pairing with the consensus sequence;
- absence of sequences giving rise
to the formation of non-specific homo- or heterodimers;
- high content of GC bases; and
- absence of polypurine or polypyrimidine repeti-
tive sequences.
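The criteria listed above can be turned into simple measurements on a candidate probe. The sketch below is only an illustration: the description gives no numerical thresholds, so the functions merely compute the quantities, and the 3'-end self-pairing test is a deliberately crude stand-in for a real dimer check. All function names and the example sequence are hypothetical.

```python
import re

def gc_fraction(probe):
    """Fraction of G and C bases in the probe."""
    return (probe.count("G") + probe.count("C")) / len(probe)

def longest_run(probe, alphabet):
    """Length of the longest stretch made only of bases from `alphabet`
    ("AG" for polypurine runs, "CT" for polypyrimidine runs)."""
    runs = re.findall(f"[{alphabet}]+", probe)
    return max((len(r) for r in runs), default=0)

def reverse_complement(seq):
    """Reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def three_prime_self_pairing(probe, n=4):
    """Crude homodimer indicator: True if the last n bases of the probe
    could pair with some stretch of another copy of the same probe."""
    return reverse_complement(probe[-n:]) in probe

candidate = "GGCCATGCAAGG"  # hypothetical probe fragment
print(gc_fraction(candidate))
print(longest_run(candidate, "AG"), longest_run(candidate, "CT"))
print(three_prime_self_pairing(candidate))
```

Candidate probes derived from different screens could then be ranked on these values, with the cut-offs left to the practitioner.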
Furthermore, the method conforming to the
invention permits the direct identification of homozygous
doublets and their differentiation from heterozygous
doublets.
In this latter case, in order to obtain, in fine,
the mutation screen, the same method as described above
is used comprising the analysis of each sequence of the
doublet at each position; it is therefore the doublets of
alleles which are compared with all the other doublets of
alleles.
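The comparison of doublets can be sketched as follows, assuming that what is observed at each screened position is the unordered set of bases carried by the two chromosomes. The three alleles and their bases are hypothetical toy data, and the function names are illustrative.

```python
from itertools import combinations_with_replacement

# Hypothetical alleles, each described by its bases at three screened positions.
alleles = {"X*01": "CCC", "X*02": "CTC", "X*03": "CCG"}

def doublet_signature(seq_a, seq_b):
    """Per position, the unordered set of bases carried by the two alleles."""
    return tuple(frozenset(pair) for pair in zip(seq_a, seq_b))

def is_homozygous(sig):
    """A doublet reads as homozygous when every position shows one base."""
    return all(len(bases) == 1 for bases in sig)

def doublet_table(alleles):
    """Group all doublets (homozygous and heterozygous) by signature."""
    table = {}
    for a, b in combinations_with_replacement(sorted(alleles), 2):
        table.setdefault(doublet_signature(alleles[a], alleles[b]), []).append((a, b))
    return table

table = doublet_table(alleles)
ambiguous = [doublets for doublets in table.values() if len(doublets) > 1]
homozygous = [d[0] for sig, d in table.items() if is_homozygous(sig)]
print(len(table), ambiguous, homozygous)
```

A screen is suitable for doublets when no signature is shared by two doublets, that is, when the list of ambiguous groups is empty.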
The preventive and curative implications of the
precise knowledge of the alleles carried by a given
subject are important; the method conforming to the
invention makes it possible, in a very short time, to
solve this problem.
The subject of the present invention is also the
application of the method for the selection of at least
one mutation screen from a series of allelic sequences of
a polymorphous gene, to the preparation of a data bank
consisting of a series of mutation screens obtained by
the above method and intended for the preparation of
oligonucleotide probes suitable for use in
differentiating all the alleles.
The subject of the present invention is also
oligonucleotide probes, characterized in that they are
constructed for the use of at least one mutation screen
derived from the method of selection of at least one
mutation screen from a series of allelic sequences of a
polymorphous gene or from the data bank as defined above,
in that they comprise between 15 and 50 bases and in that
they are the most suitable for hybridizing with an
allelic sequence for the identification of alleles of a
polymorphous gene.
Such probes can be optionally labelled by means
of a marker such as a radioactive isotope, an appropriate
enzyme, a fluorochrome, an antibody or a base analogue;
such probes can also be constructed for use in the method
for detecting and/or identifying a specific nucleotide
base present in a nucleic acid sequence (mutation)
described in European Patent Application 412 883, in the
Applicant's name.
According to an advantageous embodiment of the
said probes, they comprise a sequence derived from the
selected consensus sequence and whose nucleotide base
situated at the 3' end corresponds to a base upstream of
one of the mutant bases of the selected mutation screen.
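The construction of such a probe can be sketched as a simple slice of the consensus sequence. This assumes that the consensus is written 5' to 3' on the strand to be probed and that positions are numbered from 1; the function name, the default length of 20 and the example sequence are hypothetical, while the 15 to 50 base range is the one stated above.

```python
def probe_upstream_of(consensus, position, length=20):
    """Return the `length` consensus bases whose 3' end is the base
    immediately upstream of `position` (1-based)."""
    if not 15 <= length <= 50:
        raise ValueError("probe length must be between 15 and 50 bases")
    start = position - 1 - length
    if start < 0:
        raise ValueError("not enough sequence upstream of this position")
    return consensus[start:position - 1]

# Hypothetical 40-base consensus, used only to show the indexing.
consensus = "ACGT" * 10
probe = probe_upstream_of(consensus, 36)
print(probe, len(probe))
```

The last base of the returned probe is base 35 of the consensus, so that an extension by one base reads position 36.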
The subject of the present invention is also a
kit for the identification of alleles of a polymorphous
gene, characterized in that it comprises at least:
- appropriate quantities of a collection of
oligonucleotide probes conforming to the invention;
optionally associated with:
- appropriate quantities of a reagent for
detection of the hybrids possibly formed between the
probes and the sequence to be identified; and/or with
- a table for interpretation of the result of the
hybridizations obtained, as a function of the selected
mutation screen.
According to an advantageous embodiment of the
said kit, the said probes comprise a sequence derived
from the selected consensus sequence and whose nucleotide
base situated at the 3' end corresponds to a base
upstream of one of the mutant bases of the selected
mutation screen.
According to another advantageous embodiment of
the said kit, it additionally comprises:
- appropriate quantities of four nucleotide bases
modified so as to be incorporable into the product of
extension of the said probes used as primers, while
blocking the elongation of the said extension product.
Such an embodiment permits the use of the method
described in European Patent Application 412 883 in the
Applicant's name.
The subject of the present invention is, in
addition, a device for implementing the method conforming
to the invention, characterized in that it comprises at
least:
- means for input of data,
- means for programmed calculation in order to
generate the mutation screen(s),
- means for storing the said screens, and
- means capable of permitting the identification
of the alleles from the stored screens.
In addition to the preceding arrangements, the
invention also comprises other arrangements which will
become apparent from the following description, which
refers to exemplary embodiments of the method which is
the subject of the present invention as well as to the
accompanying drawing, in which:
- Figure 1 illustrates an embodiment of the
method of selecting a mutation screen in which the said
screen is directly obtained from the set O of obligatory
mutations;
- Figure 2 illustrates another embodiment in
which the said screen is obtained from a set O' of
obligatory mutations derived from a set U of useful
mutations, which set O' is optionally associated with a
subset U1 or with a subset U2 of useful mutations, as
defined above;
- Figure 3 illustrates a device for implementing
the methods conforming to the invention (creation phase
and exploitation phase);
- Figure 4 illustrates a mutation matrix for a
sequence of 7 alleles, called All;
- Figure 5 illustrates the set U (useful
mutations) for identifying a pair of homozygous alleles
of the All sequence;
- Figure 6 illustrates the mutation screens
suitable for the univocal identification of all the pairs
of homozygous alleles of All;
- Figure 7 illustrates the set U (useful
mutations) for identifying a pair of heterozygous alleles
of the All gene;
- Figure 8 illustrates the mutation screen
suitable for the univocal identification of all the
doublets of heterozygous alleles of All;
- Figure 9 illustrates the set U (useful
mutations) for identifying a pair of homozygous alleles
of the DQβ 1 gene; and
- Figure 10 illustrates the set of mutation
screens suitable for the identification of all the pairs
of homozygous alleles of the DQβ 1 gene.
It should be understood, however, that these
examples are given solely by way of illustration of the
subject of the invention and do not constitute in any
manner a limitation thereof.
A device conforming to the invention permits the
implementation of the methods of selection and
identification as defined above both in the creation
phase (constitution of the screens) and in the exploit-
ation phase (identification of an allele).
In the creation phase, the mutation matrix for
alleles is introduced in (1) into an appropriate
microprocessor (A) and generates in (4), by means of the
method for selecting at least one mutation screen
conforming to the invention, a set of screens, which are
stored in (3, 3') in a data bank.
In the exploitation phase, a sequence to be
identified is hybridized with a collection of suitable
probes, constructed for the use of at least one mutation
screen; from the hybrids obtained, the sequence is
identified (experimental data introduced in (2)); the
result obtained is compared with the screen in (5), which
makes it possible to specify the allele in question.
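The two phases can be illustrated with a very small data bank held as JSON text. This is a sketch under assumptions, not the device of Figure 3: the gene name, the alleles, the positions, the storage format and the function names are all hypothetical.

```python
import json

def build_bank(gene, screen, alleles):
    """Creation phase: store a screen together with the signature of each
    allele at the screened positions (interpretation table)."""
    table = {"".join(profile[p] for p in screen): name
             for name, profile in alleles.items()}
    if len(table) != len(alleles):
        raise ValueError("this screen does not separate all the alleles")
    return json.dumps({"gene": gene, "screen": screen, "table": table})

def identify(bank_json, observed):
    """Exploitation phase: compare the bases read at the screened
    positions with the stored table; None means no known allele matches."""
    bank = json.loads(bank_json)
    signature = "".join(observed[p] for p in bank["screen"])
    return bank["table"].get(signature)

# Hypothetical alleles described by their base at two screened positions.
alleles = {
    "X*01": {4: "T", 7: "G"},
    "X*02": {4: "A", 7: "G"},
    "X*03": {4: "A", 7: "C"},
}
bank = build_bank("X", [4, 7], alleles)
print(identify(bank, {4: "A", 7: "C"}))
print(identify(bank, {4: "T", 7: "C"}))
```

Keeping the table next to the screen means that the exploitation phase needs only the stored bank and the experimental readings.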
EXAMPLE 1: Constitution of mutation screens for the
homozygous alleles of the All gene.
. the sequence A11*0501 is selected as consensus
sequence as seen in Figure 4, in which the first sequence
is considered as consensus sequence; in the other
sequences, only the mutations with respect to the said
consensus sequence are indicated.
. the alleles are compared in twos and the
mutations useful for differentiating each pair of alleles
are identified: the useful mutations found with the
method conforming to the invention are 9 in number:
5, 8, 14, 19, 20, 21, 36, 48, 49
in conformity with Figure 5.
In this example, a pair of alleles is indiscernible (pair
A11*0201 and A11*0202); the allele A11*0202 is
consequently suppressed for the rest of the analysis.
. the search for the obligatory mutations is then
carried out:
A11*0401 and A11*0501 differ by only 36.
It emerges from this search that only one
obligatory position exists among the useful mutations, it
is position 36.
. in the present case, the only obligatory
mutation does not permit all the possible pairs of
alleles to be differentiated; no "solutions" exist, that
is to say screens which permit the univocal
identification of all the alleles considered, with a
number of mutations of less than 3 (that is to say 2
additional mutations). All the possible mutation screens
with 3 mutations are:
1) 36, 5, 20
2) 36, 8, 20
3) 36, 14, 20,
in conformity with Figures 6.1 to 6.3 and show that it is
possible to identify an allele of the All gene with the
aid of any one of these mutation screens.
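The figures of this example can be cross-checked with a few lines of code. Only the numbers quoted in the text are used; the mutation matrix of Figure 4 is not reproduced here, so the sketch checks the consistency of the result and the size of the search, not the discrimination itself. The variable names are illustrative.

```python
from math import comb

# Values quoted in Example 1.
useful = [5, 8, 14, 19, 20, 21, 36, 48, 49]
obligatory = {36}
screens = [(36, 5, 20), (36, 8, 20), (36, 14, 20)]

# Every screen holds the obligatory mutation plus two other useful ones.
consistent = all(
    len(set(s)) == 3 and obligatory <= set(s) and set(s) <= set(useful)
    for s in screens
)

# Number of 3-mutation candidates to examine once position 36 is imposed.
candidates = comb(len(useful) - len(obligatory), 2)

# Positions shared by all the screens found.
shared = set.intersection(*(set(s) for s in screens))
print(consistent, candidates, sorted(shared))
```

Under these figures, 28 candidate screens of three mutations exist and three of them are retained; position 20 appears in all three in addition to the obligatory position 36.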
EXAMPLE 2: Constitution of mutation screens for the
heterozygous alleles of the All gene.
In this example, after execution of the steps as
described in Example 1, and which result in the
identification of the useful mutations as seen in Figure
7, the search is carried out for the obligatory mutations
which permit differentiation of all the doublets of
alleles:
A11*0401, A11*0401 and A11*0501, A11*0401 differ by only
36;
A11*0401, A11*0401 and A11*0501, A11*0501 differ by only
36;
A11*0302, A11*0401 and A11*0501, A11*0302 differ by only
36;
A11*0301, A11*0401 and A11*0501, A11*0301 differ by only
36;
A11*0201, A11*0401 and A11*0501, A11*0201 differ by only
36;
A11*0502, A11*0401 and A11*0501, A11*0502 differ by only
36;
A11*0501, A11*0401 and A11*0501, A11*0501 differ by only
36.
It emerges from this search that only one
obligatory position exists among the useful mutations; it
is position 36.
In this example, this single obligatory mutation
does not permit differentiation of all the doublets of
alleles. No "solutions" exist with a number less than 3,
that is to say 2 additional positions. One of the
mutation screens which permits differentiation of all the
doublets of alleles is: 36, 5, 20, in conformity with
Figure 8.
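The size of the doublet comparison can be estimated with a short count. This assumes that the six allele names appearing in the list above are the alleles retained for the analysis (A11*0202 having been set aside in Example 1); the count is illustrative and is not a figure given in the description.

```python
from itertools import combinations_with_replacement
from math import comb

# Allele names as they appear in the doublets listed in Example 2.
alleles = ["A11*0201", "A11*0301", "A11*0302",
           "A11*0401", "A11*0501", "A11*0502"]

# A doublet is an unordered pair of alleles, identical (homozygous) or not.
doublets = list(combinations_with_replacement(alleles, 2))
homozygous = [d for d in doublets if d[0] == d[1]]
heterozygous = [d for d in doublets if d[0] != d[1]]

# Number of doublet-versus-doublet comparisons the screen must resolve.
comparisons = comb(len(doublets), 2)
print(len(homozygous), len(heterozygous), comparisons)
```

With six alleles there are 6 homozygous and 15 heterozygous doublets, hence 210 pairs of doublets to tell apart.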
EXAMPLE 3: Typing of an All heterozygous individual.
The mutation screen of Example 2 is chosen in
order to identify alleles, because it is the most
suitable for the preparation of probes which correspond
to the selection criteria defined above (maximum pairing
with the consensus sequence, absence of sequences giving
rise to the formation of homo- or heterodimers, high
content of GC bases and absence of polypurine or
polypyrimidine repetitive sequences).
Probes of 20 nucleotides are synthesized
such that position 3' of the said probes corresponds to a
base situated just upstream of one of the positions of
the above screens, such that when the hybridization and
extension under the conditions of the abovementioned
European Patent Application are carried out, it is
possible to verify which of the base(s) hybridize(s). The
use of such a panel of probes makes it possible to
identify
. in individual 1, the sequence CC CT CC which,
with reference to the chosen screen, makes it possible to
identify the pair of alleles A11*0201, A11*0302, and
. in the tested individual 2, the sequence CC CT
CG which, with reference to the chosen screen, makes it
possible to identify the pair of alleles A11*0502,
A11*0201.
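The interpretation of such readings can be sketched as a table lookup. The two entries below are the two results quoted in this example; the rest of the interpretation table (Figure 8) is not reproduced, and the assumption that each group of two letters gives the two bases read at one screened position, in no particular order, is an interpretation. The function names are illustrative.

```python
# The two readings reported in Example 3 and the pairs of alleles they identify.
reported = {
    "CC CT CC": ("A11*0201", "A11*0302"),
    "CC CT CG": ("A11*0502", "A11*0201"),
}

def normalise(reading):
    """Turn a reading into per-position unordered sets of bases, so that
    "CT" and "TC" are treated as the same observation."""
    return tuple(frozenset(group) for group in reading.split())

lookup = {normalise(reading): pair for reading, pair in reported.items()}

def heterozygous_groups(reading):
    """1-based indices of the groups at which two different bases are read."""
    return [i for i, g in enumerate(normalise(reading), start=1) if len(g) == 2]

print(lookup[normalise("CC TC CC")])
print(heterozygous_groups("CC CT CG"))
```

Normalising each group makes the lookup independent of the order in which the two bases happen to be written down.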
EXAMPLE 4: Constitution of mutation screens for the
homozygous alleles of the HLA-DQβ 1 gene.
The nomenclature of the factors of the HLA system
was published in 1990 in Immunogenetics, 31, 131-140 and
the following example illustrates the constitution of a
mutation screen for the alleles of the HLA-DQβ 1 gene, as
defined in this article.
. the sequence DQβ 1*0501 (position 1 to position
300) is selected as consensus sequence;
. the positions of similar mutations are identified
so that they are considered only once; the
following result is obtained:
* mutation 25 is similar to 7;
* mutation 140 is similar to 110;
* mutation 186 is similar to 167;
* mutation 266 is similar to 250;
* mutation 269 is similar to 259;
* mutation 280 is similar to 277;
consequently, mutations 25, 140, 186, 266, 269 and 280
are ignored in the next step of the method (the numbers
correspond to the positions of the mutations on the
sequence).
. the alleles are compared in twos and the
mutations useful for differentiating each pair of alleles
are identified: the useful mutations found with the
method conforming to the invention are 54 in number:
7 26 38 40 57 63 68 75 76 77 81 83 88 89 105 109 110 113
114 134 137 141 144 147 153 155 158 164 167 169 170 171
198 199 208 209 211 212 213 216 220 221 223 230 231 234
250 253 257 259 260 265 271 277, in conformity with
Figure 9.
In this example, all the pairs of alleles can be
differentiated.
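The list of useful mutations and the list of similar mutations quoted above can be checked against each other. The sketch uses only the positions given in the text; the variable and function names are illustrative.

```python
# Similar mutations quoted in Example 4: ignored position -> position kept.
similar_to = {25: 7, 140: 110, 186: 167, 266: 250, 269: 259, 280: 277}

# The useful mutations (set U) as listed in the text.
useful = [
    7, 26, 38, 40, 57, 63, 68, 75, 76, 77, 81, 83, 88, 89, 105, 109, 110, 113,
    114, 134, 137, 141, 144, 147, 153, 155, 158, 164, 167, 169, 170, 171,
    198, 199, 208, 209, 211, 212, 213, 216, 220, 221, 223, 230, 231, 234,
    250, 253, 257, 259, 260, 265, 271, 277,
]

def representative(position):
    """Map an ignored position to the useful position that stands for it."""
    return similar_to.get(position, position)

# No ignored position is listed as useful, and every kept position is.
ignored_absent = not (set(similar_to) & set(useful))
kept_present = set(similar_to.values()) <= set(useful)
print(len(useful), ignored_absent, kept_present, representative(25))
```

The list contains 54 distinct positions, as stated, and each ignored mutation is represented in the set U by the mutation to which it is similar.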
. the search for the obligatory mutations is then
carried out:
DQβ 1*0402 and DQβ 1*0401 differ by only 68,
DQβ 1*03032 and DQβ 1*03031 differ by only 63,
DQβ 1*03032 and DQβ 1*0302 differ by only 170.
It emerges from this search that the obligatory
positions among the useful mutations are 63, 68 and 170.
. in the present case, the three obligatory
mutations do not permit differentiation of all the
possible pairs of alleles; no solutions exist with a
number of mutations of less than 7 (that is to say 4
additional mutations). All the possible mutation screens
with 7 mutations are:
1- 63, 68, 170, 7, 76, 88, 171
2- 63, 68, 170, 7, 77, 88, 171
3- 63, 68, 170, 26, 76, 88, 171,
4- 63, 68, 170, 26, 76, 88, 231,
5- 63, 68, 170, 26, 77, 88, 171,
6- 63, 68, 170, 26, 77, 88, 231,
7- 63, 68, 170, 57, 76, 88, 171,
8- 63, 68, 170, 57, 77, 88, 171,
9- 63, 68, 170, 76, 88, 109, 171,
10- 63, 68, 170, 76, 88, 113, 171,
11- 63, 68, 170, 76, 88, 114, 171,
12- 63, 68, 170, 76, 88, 114, 231,
13- 63, 68, 170, 76, 88, 134, 171,
14- 63, 68, 170, 76, 88, 141, 171,
15- 63, 68, 170, 76, 88, 141, 231,
16- 63, 68, 170, 76, 88, 153, 171,
17- 63, 68, 170, 76, 88, 158, 171,
18- 63, 68, 170, 76, 88, 158, 231,
19- 63, 68, 170, 76, 88, 164, 171,
and
20- 63, 68, 170, 76, 88, 164, 231,
in conformity with Figures 10.1 to 10.20 (in which the
allele DQβ 1 is represented by DQB1) and show that it is
possible to identify an allele of the DQβ 1 gene with the
aid of any one of these mutation screens.
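The twenty screens listed above can be checked for internal consistency, and the size of the search can be estimated. The sketch uses only the numbers quoted in this example (54 useful mutations, 3 obligatory mutations, 7 mutations per screen); it does not reproduce the matrix of Figure 9, so it does not re-derive the screens. Variable names are illustrative.

```python
from math import comb

obligatory = {63, 68, 170}

# The four additional mutations of each of the 20 screens listed in Example 4.
additional = [
    (7, 76, 88, 171), (7, 77, 88, 171), (26, 76, 88, 171), (26, 76, 88, 231),
    (26, 77, 88, 171), (26, 77, 88, 231), (57, 76, 88, 171), (57, 77, 88, 171),
    (76, 88, 109, 171), (76, 88, 113, 171), (76, 88, 114, 171), (76, 88, 114, 231),
    (76, 88, 134, 171), (76, 88, 141, 171), (76, 88, 141, 231), (76, 88, 153, 171),
    (76, 88, 158, 171), (76, 88, 158, 231), (76, 88, 164, 171), (76, 88, 164, 231),
]
screens = [obligatory | set(extra) for extra in additional]

# Each screen has 7 distinct mutations and the 20 screens are all different.
all_seven = all(len(s) == 7 for s in screens)
distinct = len({frozenset(s) for s in screens}) == len(screens)

# Positions present in every screen.
shared = set.intersection(*screens)

# Candidate screens of 7 mutations once the 3 obligatory ones are imposed.
candidates = comb(54 - len(obligatory), 4)
print(all_seven, distinct, sorted(shared), candidates)
```

Position 88 appears in all twenty screens alongside the three obligatory positions, and the twenty solutions are drawn from 249900 candidate combinations of four additional mutations.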
As evident from the above, the invention is not
in any way limited to those of its embodiments,
implementations and applications which have just been
described more explicitly; it embraces, on the contrary,
all the variants which may occur to the specialist in
this field, without departing from the framework or the
scope of the present invention.