Patent 2358509 Summary

(12) Patent Application: (11) CA 2358509
(54) English Title: MOLECULAR PROFILING FOR HETEROSIS SELECTION
(54) French Title: DETERMINATION DE PROFILS MOLECULAIRES POUR LA SELECTION DE L'HETEROSIS
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • A01H 1/04 (2006.01)
  • C12Q 1/68 (2018.01)
  • A01H 5/00 (2006.01)
  • C12Q 1/68 (2006.01)
  • G06F 19/00 (2006.01)
(72) Inventors :
  • BOWEN, BEN (United States of America)
  • GUO, MEI (United States of America)
  • SMITH, OSCAR (United States of America)
(73) Owners :
  • PIONEER HI-BRED INTERNATIONAL, INC. (United States of America)
(71) Applicants :
  • PIONEER HI-BRED INTERNATIONAL, INC. (United States of America)
(74) Agent: SMART & BIGGAR IP AGENCY CO.
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2000-01-19
(87) Open to Public Inspection: 2000-07-27
Examination requested: 2001-08-22
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2000/001422
(87) International Publication Number: WO2000/042838
(85) National Entry: 2000-07-03

(30) Application Priority Data:
Application No. Country/Territory Date
60/116,617 United States of America 1999-01-21
60/166,368 United States of America 1999-11-17

Abstracts

English Abstract




Methods of correlating molecular profile information and heterosis are
provided. Selection for dominant, additive, or under/overdominant markers
provides for improved heterosis. Selection for the number of expression
products in an expression profile provides for improved heterosis. Methods of
identifying and cloning nucleic acids linked to heterotic traits are provided.
Methods of identifying parentage by consideration of expression profiles are
provided.


French Abstract

L'invention concerne des procédés permettant de corréler des informations relatives au profil moléculaire et à l'hétérosis. La sélection de marqueurs dominants, additifs, ou sous/surdominants permet d'améliorer l'hétérosis. La sélection d'un certain nombre de produits d'expression dans un profil d'expression permet d'améliorer l'hétérosis. L'invention traite également de procédés d'identification et de clonage d'acides nucléiques liés aux caractéristiques de l'hétérosis. L'invention concerne aussi des procédés permettant d'identifier l'ascendance en tenant compte des profils d'expression.

Claims

Note: Claims are shown in the official language in which they were submitted.




WHAT IS CLAIMED IS:
1. A method of screening for heterosis in plants, comprising:
(i) profiling expression of a first representative sample of first expression
products from a first progeny plant to quantify the expression products
produced in the first progeny plant, wherein the number of first expression
products produced in the first progeny plant is correlated with a measure of
heterosis in the first progeny plant; or,
(ii) profiling expression of a second representative sample of second
expression products from the first progeny plant to quantify or identify the
dominant expression products in the second representative sample, wherein the
number of dominant expression products is correlated with a measure of
heterosis in the progeny plant.

2. The method of claim 1, further comprising:
selecting the progeny plant profiled in (i) or (ii), based upon the number of
first expression products in the first representative sample, or based upon
the number of second expression products in the second representative sample
that exhibit a dominant expression pattern.

3. A plant selected by the method of claim 2.

4. The method of claim 1, wherein the first and second expression products
are independently selected from: mRNAs and proteins.

5. The method of claim 1, wherein the first or second representative sample
corresponds to between about 1,000 and about 20,000 gene products.

6. The method of claim 1, wherein expression of at least about 50% of the
first or second expression products produced in a selected tissue are
detected.

7. The method of claim 1, wherein expression is profiled in step (i) or step
(ii) using one or more technique selected from: hybridization of expressed or
amplified nucleic acids to a nucleic acid array, hybridization to a protein
array, hybridization to an antibody array, subtractive hybridization, and
differential display.
8. The method of claim 1, further comprising:
selecting the first progeny plant for one or more characteristics selected
from: a
selected number of dominant expression products, a selected ratio of dominant
expression
products to total expression products, expression of a dominant expression
product exhibiting
an allelic sequence polymorphism, a desired number of over- or under-dominant
expression
products, a selected ratio of over- or under-dominant expression products to
total expression
products, a selected number of additive expression products, and a selected
ratio of additive
expression products to total expression products.
9. The method of claim 1, further comprising:
identifying which expression products from the first or second representative
sample
show a dominant, additive, under-dominant, or over-dominant expression pattern
for at least
a portion of the representative sample.
10. The method of claim 1, further comprising:
selecting the first progeny plant to maximize the number of dominant
expression
products or to maximize the number of additive expression products, or to
express a
dominant expression product exhibiting an allelic sequence polymorphism, or to
minimize
the number of over- or under-dominant expression products.
11. The method of claim 1, further comprising:
cloning at least one nucleic acid encoding an expression product selected
from: an
additive gene product, a dominant gene product, which dominant gene product
optionally has
an allelic sequence polymorphism, an over-dominant gene product, an under-
dominant gene
product, and the product of a transgene derived from the first or second
parental plant or the
first progeny plant.




12. The method of claim 11, further comprising transducing the at least one
nucleic acid into a target plant, resulting in an increase in the number of
additive or dominant
gene products expressed in the target plant.
13. The method of claim 1, further comprising:
crossing a first parent plant with a second parent plant to produce the first
progeny
plant.
14. The method of claim 13, further comprising: profiling parental expression
products from either the first or second parent plant.
15. The method of claim 13, further comprising:
profiling expression of parental representative samples of gene products from
the first
and second parent plant; and,
comparing the resulting parental expression profiles of the first and second
parent
plants with an expression profile of the first progeny plant.
16. The method of claim 13, wherein the first parent plant is a female plant
and the second plant is a male plant, the method further comprising:
crossing at least a third male plant to the first female plant to produce at
least a second
progeny plant;
comparing an expression profile of the first progeny plant and an expression
profile of
the second progeny plant to an expression profile of the first female plant;
and,
selecting the first or second progeny plant based upon similarity to the
expression
profile of the first female plant.
17. The method of claim 13, further comprising: identifying genes which are
silenced in the first parent plant, the second parent plant, or the first
progeny plant.




18. The method of claim 13, further comprising: cloning a nucleic acid
encoded by a gene silenced in the first parent plant, the second parent plant
or in the first
progeny plant.
19. The method of claim 13, further comprising: introducing a heterologous
nucleic acid into the first parent plant, the second parent plant, the first
progeny plant, or a
subsequent progeny plant derived from one or more of: the first parent plant,
the second
parent plant, or the first progeny plant, which heterologous nucleic acid
results in increased
expression of an expression product from a silenced gene.
20. The method of claim 13, further comprising: determining a ratio between
the sum of expressed gene products that differ from the progeny plant in each
of the first and
second parent plants and the number of expressed gene products that differ
between the first
and the second parent plant.
21. The method of claim 13, further comprising: crossing 1 or more additional
plants with the first or second parent plant to produce at least one
additional progeny plant.
22. The method of claim 13, further comprising:
crossing 1 or more additional plants with the first or second parent plant to
produce
one or more additional progeny plant;
profiling expression of a representative sample of gene products from the one
or more
additional progeny plant; and,
comparing the resulting expression profile of the one or more additional
progeny plant
with an expression profile of the first progeny plant.
23. The method of claim 22, further comprising: selecting a heterotic progeny
plant from a group of progeny plants comprising the first progeny plant and
the one or more
additional progeny plant.




24. The method of claim 23, wherein the heterotic progeny plant is selected
based upon one or more selectable property selected from: an elevated number
of expressed
RNAs relative to one or more parental plant; an elevated number of expressed
RNAs relative
to other progeny plants in the group of progeny plants; an elevated number of
RNAs showing
a dominant expression pattern relative to one or more parental plant; an
elevated number of
gene products showing a dominant expression pattern relative to other progeny
plants in the
group of progeny plants; an RNA showing an allelic sequence polymorphism
relative to one
or more parental plant, an RNA showing an allelic sequence polymorphism
relative to other
progeny plants in the group of progeny plants, a decreased number of gene
products showing
an over or underdominant gene expression pattern as compared to one or more
parental plant;
and, a decreased number of gene products showing an over or underdominant
expression
pattern as compared to other progeny plants in the group of progeny plants.
25. The method of claim 13, wherein the first parent plant, second parent
plant, and first progeny plant are independently selected from: an inbred
plant, and a hybrid
plant.
26. The method of claim 13, wherein the first parent plant is a first inbred
plant, the second parent plant is a second inbred plant and the progeny plant
is a hybrid plant.
27. The method of claim 13, wherein the first or second parent plant is an
inbred or hybrid plant, and the progeny plant is a hybrid plant, the method
further comprising crossing a plurality of first additional plants of the
same strain as the first parent plant with a plurality of second additional
plants of the same strain as the second parent plant, to produce a plurality
of progeny hybrid plants.
28. The method of claim 27, further comprising topcrossing at least one of the
plurality of progeny hybrid plants with a plurality of inbred plants to
provide a plurality of
topcross plants.




29. The method of claim 28, further comprising topcrossing the topcross
plants to an inbred plant to produce a topcross progeny plant, and,
optionally, profiling
expression of a representative sample of RNA from the topcross plant or from
the topcross
progeny plant.
30. The method of claim 28, further comprising:
selfing a test plant selected from: the first parent plant, the second parent
plant, the
first progeny plant, one of the plurality of progeny hybrid plants, one of the
plurality of
topcross plants, and one of the plurality of topcross progeny plants; or
crossing one or more test plants selected from: the first parent plant, the
second parent
plant, the first progeny plant, one of the plurality of progeny hybrid plants,
one of the plurality
of topcross plants, and one of the plurality of topcross progeny plants.
31. The method of claim 30, further comprising: profiling expression of the
test plant.
32. The method of claim 30, further comprising: profiling expression of an
immature tissue from the test plant.
33. The method of claim 28, the method further comprising:
profiling expression of a representative number of expression products from
one or
more of the plurality of hybrid progeny plants, or progeny thereof; and
additionally
performing at least one of:
(i) determining the number of expression products in the representative sample
from
the plurality of hybrid progeny plants, or progeny thereof, wherein the number
of expression
products in the plurality of hybrid progeny plants, or progeny thereof, is
correlated with a
measure of heterosis in the plurality of hybrid progeny plants, or progeny
thereof;
(ii) determining the number of expression products in the representative
sample of expression products from the plurality of hybrid progeny plants, or
progeny thereof, wherein the number of expression products exhibiting a
dominant expression pattern in the plurality of hybrid progeny plants, or
progeny thereof is correlated with a measure of heterosis in the plurality of
hybrid progeny plants, or progeny thereof; and,
(iii) selecting the plurality of hybrid progeny plants, or progeny thereof for
plants
which display a selected number of expression products, or a selected number
of dominant
expression products, thereby selecting for an increase in a measure of
heterosis.
34. The method of claim 13, wherein the first and second parent plants are
monocots.
35. The method of claim 13, wherein the first and second parent plant are
selected from the families Gramineae, Compositae, and Leguminosae.
36. The method of claim 13, wherein the first and second parent plant are
selected from: Zea mays, rice, soybean, sorghum, wheat, oats, barley, millet,
sunflower, and
canola.
37. The method of claim 13, further comprising selecting the first and second
parent plant to produce the first progeny plant with a selected number of
expression products
which are dominant, over-dominant, under-dominant or additive.
38. The method of claim 37, wherein the parents are selected to produce the
first progeny plant by selecting for complementary expression of dominant or
additive
expression products between the parents.
39. The method of claim 1, further comprising:
(iii) comparing a set of first expression products in the first progeny plant
to a set of
second plant expression products from a second plant; or,
(iv) comparing a set of expression products exhibiting a dominant expression
pattern
in the first progeny plant to a set of expression products exhibiting a
dominant expression
pattern in a second plant.




40. The method of claim 39, wherein step (iii) or step (iv) is performed using
a computer.
41. The method of claim 39, wherein step (iv) or step (v) is performed using a
computer, wherein the second number of expressed gene products or the second
number of gene products exhibiting a dominant expression pattern is present in
a database in the computer.
42. The method of claim 1, wherein the steps of profiling expression are
performed in an integrated system comprising a microprocessor with software
for
determining one or more of: how many genes are expressed; whether expressed
genes are
dominant; whether expressed genes are additive; whether expressed genes are
over-dominant;
and, whether expressed genes are under-dominant.
43. The method of claim 1, further comprising inputting a resulting expression
profile for the first progeny plant into a database of expression profiles.
44. The method of claim 43, wherein the database is in an integrated system
comprising a computer.
45. A database produced by the method of claim 43.
46. The database of claim 45, wherein the database is present in a computer.
47. The computer database of claim 46, wherein the database comprises
expression product profiles of a representative sample of expression products
for hybrid
progeny plants resulting from at least 10 separate inbred plant crosses.
48. The method of claim 43, further comprising selecting an expression
profile from the database, which profile provides a unique subset of
expression products.




49. The method of claim 48, further comprising:
cloning a nucleic acid which expresses at least one expression product in the
unique
subset of expression products; or,
cloning a nucleic acid which expresses at least one expression product in the
unique
subset of expression products and transducing the nucleic acid into a
heterologous plant; or,
crossing a first selected plant which expresses the unique subset of
expression
products with a second selected plant which does not express the unique subset
of expression
products.
50. The method of claim 1, further comprising: selfing the first progeny
plant.
51. The method of claim 1, further comprising: selfing the first progeny plant
and detecting silencing of dominant expression products in subsequent progeny
plants which
are derived from selfing the first progeny plant.
52. The method of claim 51, further comprising: cloning a silenced nucleic
acid encoding a dominant expression product.
53. The method of claim 51, further comprising: introducing a heterologous
nucleic acid that results in expression of dominant expression products from
silenced genes.
54. The method of claim 53, wherein the heterologous nucleic acid encodes
one or more of: a transcription factor which activates a promoter from a
silenced gene; a nucleic acid encoded by the silenced gene under the control
of a heterologous promoter; and, a nucleic acid homologous to the silenced
gene with at least one [region of difference] with the silenced gene, which
homologous nucleic acid can recombine with the silenced gene to produce a
modified gene.
55. The method of claim 1, further comprising:
testing the first progeny plant or a subsequent progeny plant thereof for a
desired trait.




56. The method of claim 1, further comprising:
testing the first progeny plant, or a subsequent progeny plant thereof, for a
desired
phenotypic trait;
comparing the phenotypic trait between the first progeny plant, or the
subsequent
progeny plant, to a selected hybrid plant;
comparing an expression profile of the selected hybrid plant to an expression
profile
of the first progeny plant, or the subsequent progeny plant; and,
cloning at least one nucleic acid which is differentially expressed between
the selected
hybrid plant and the first progeny plant, or the subsequent progeny plant.
57. The method of claim 56, further comprising transducing the at least one
nucleic acid into a selected plant to produce a transgenic plant.
58. The method of claim 1, wherein the first and second representative
samples are from an immature tissue of the first progeny plant.
59. The method of claim 58, wherein the immature tissue is an immature ear
of the plant, or a seedling plant.
60. A method of identifying plant crosses with an increase in probability for
heterosis in progeny plants, comprising:
(i) comparing expression profiles for a plurality of plants; and
(ii) determining, by pair-wise comparisons of the expression profiles, which
crosses
will produce at least one of the following:
(a) progeny with a selected or optimal number of expression products; or,
(b) progeny with a selected number or type of expression products that display
a dominant, additive, overdominant or underdominant expression pattern.
61. The method of claim 60, further comprising making identified plant
crosses to produce progeny plants.




62. The method of claim 60, further comprising making identified plant
crosses to produce progeny plants, which progeny plants are tested for one or
more desired
trait.
63. The method of claim 60, wherein crosses are identified which maximize
the number of expression products in potential progeny, or which maximize the
number of dominant expression products in potential progeny, or which maximize
the number of additive expression products in potential progeny, or which
minimize the number of over-dominant expression products in potential progeny,
or which minimize the number of under-dominant expression products in
potential progeny.
64. The method of claim 60, wherein:
the plants are inbred plants, hybrid plants, or transgenic plants; and,
the plants are selected from: plants in the families Gramineae, Compositae,
and
Leguminosae; or,
the plants are selected from: Zea mays, rice, soybean, sorghum, wheat, oats,
barley,
millet, sunflower, and canola.
65. The method of claim 60, wherein the expression profiles are compiled in a
database.
66. The method of claim 60, wherein a matrix of possible pair-wise
expression profile combinations for the plants is generated.
67. The method of claim 60, wherein the expression profiles are compiled in a
database in a computer and a matrix of possible pair-wise expression profile
combinations for
the plants is considered using an integrated system comprising a computer.
68. The method of claim 60, further comprising: selecting a subset of
potential crosses from all of the possible pair-wise comparisons which exhibit
a maximal
number of expression profile differences.




69. The method of claim 60, further comprising: selecting a subset of
potential crosses from all of the possible pair-wise comparisons which exhibit
a maximal
number of expression profile differences, wherein at least a plurality of the
possible pair-wise
comparisons are for plants from different heterotic groups.
70. The method of claim 60, wherein the pair-wise comparisons are
considered to identify crosses from the same heterotic group.
71. The method of claim 60, further comprising:
(iii) identifying crosses where:
the sum of:
(a) expression products produced in a first plant from a first heterotic group
(A_j) which are not expressed in a second plant from the first heterotic group
(A) to which the first plant is crossed (A_k), and which are not expressed in
a selected third plant from a second heterotic group (B); plus
(b) the expression products produced in A_k which are not produced in A_j and
which are not produced in B;
is optimized.
72. The method of claim 71, further comprising making a cross identified in
(iii).
73. The method of claim 71, wherein optimization is made by:
determining all possible pair-wise combinations from the first heterotic group
and identifying the cross which results in the largest sum of expression
products; or
determining all possible pair-wise combinations from the first heterotic group
and identifying crosses which result in a hybrid progeny (A_i x A_j) with a
maximal number of differences as compared to B; or,
determining all possible pair-wise combinations from the first heterotic group
and identifying crosses which result in the hybrid progeny (A_i x A_j) having
a greater number of differences with B than the number of differences between
B and A_i or B and A_j.
74. The method of claim 73, further comprising selecting self- or
back-crossed progeny derived from the A_i x A_j hybrid that:
retain a set of expression products defined by the sum of expression products
expressed in A_i (but not A_j or B) and A_j (but not A_i or B); or
which show a larger number of expression products expressed in a topcross with
B than does either A_i or A_j when topcrossed with B.
75. A method of identifying a source of a test plant, comprising:
profiling expression of a representative sample of expression products from
the test
plant; and,
comparing the resulting test expression profile to a database of known
expression
profiles for plants from known inbred or hybrid strains.
76. The method of claim 75, wherein the expression profile is for a selected
tissue and the database of expression profiles comprises expression profiles
for the same
tissue from the known inbred or hybrid strains.
77. The method of claim 75, wherein the database of expression profiles is
used to provide a matrix of pair-wise comparisons for potential progeny from
the expression
profiles in the database, which matrix of pair-wise comparisons is compared to
the test
expression profile.
78. The method of claim 75, wherein the source identified is a sub-portion of
the total expression profile, which subportion corresponds to a unique marker
for a specific
parental strain.
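
The pair-wise cross comparison recited in claims 60, 66, 71 and 73 can be illustrated with a minimal Python sketch. It assumes each plant's expression profile is reduced to a set of expressed product identifiers, and it reads "optimized" as "largest sum," which is only the first of the options listed in claim 73. The function names, plant labels and product identifiers are hypothetical and do not come from the application.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple


def unique_product_sum(first: Set[str], second: Set[str], tester: Set[str]) -> int:
    """Sum of products expressed in only one of the two same-group plants
    and absent from the tester plant of the other heterotic group."""
    only_first = first - second - tester
    only_second = second - first - tester
    return len(only_first) + len(only_second)


def rank_crosses(group: Dict[str, Set[str]],
                 tester: Set[str]) -> List[Tuple[Tuple[str, str], int]]:
    """Score every pair-wise combination within one heterotic group and
    return the pairs sorted from largest to smallest sum."""
    scored = [
        ((name_1, name_2), unique_product_sum(group[name_1], group[name_2], tester))
        for name_1, name_2 in combinations(sorted(group), 2)
    ]
    # sorted() is stable, so tied pairs keep their alphabetical order.
    return sorted(scored, key=lambda item: -item[1])


# Hypothetical toy profiles (sets of expressed product identifiers).
group_a = {
    "A1": {"g1", "g2", "g3"},
    "A2": {"g2", "g4"},
    "A3": {"g1", "g2"},
}
tester_b = {"g3", "g5"}

ranking = rank_crosses(group_a, tester_b)
```

Here `ranking[0]` holds the candidate cross with the largest sum. Real profiles would carry quantitative expression levels, so reducing them to presence or absence is a simplification made for this sketch.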

Description

Note: Descriptions are shown in the official language in which they were submitted.



CA 02358509 2001-07-12
WO 00/42838 PCT/US00/01422
MOLECULAR PROFILING FOR HETEROSIS SELECTION
FIELD OF THE INVENTION
The invention relates to new methods of improving crop selection and
selecting for heterosis using molecular and computer modeling techniques.
CROSS-REFERENCE TO RELATED APPLICATIONS
This application is a non-provisional filing of and claims priority to
"MOLECULAR PROFILING FOR HETEROSIS" by Ben Bowen et al., USSN 60/116,617 filed
January 21, 1999 and "MOLECULAR PROFILING FOR HETEROSIS" by Ben Bowen et al.,
USSN 60/166,368 filed November 17, 1999.
BACKGROUND OF THE INVENTION
Hybrid offspring often outperform their parents by a variety of different
measures, including yield, adaptability to environmental changes, disease
resistance, pest
resistance, and the like. The improved properties for the hybrid as compared
to the parents
are collectively referred to as "hybrid vigor," or "heterosis." Hybridization
between parents
of dissimilar genetic stock has been used in animal husbandry and especially
for improving
major plant crops, such as corn, sugarbeet and sunflower.
Indeed, for some crops, such as corn (Zea mays), most of the crop which is
grown is hybrid offspring. Because crossing these hybrid offspring results in
a loss of vigor
and lack of uniformity, the production of seed of these crops for planting is
complex, utilizing
inbred strains that are crossed to produce hybrid seed with uniform
characteristics.
For example, the development of a maize hybrid typically involves three steps:
(1) the selection of plants from various germplasm pools for initial breeding
crosses; (2) the
selfing of the selected plants from the breeding crosses for several
generations to produce a
series of inbred lines, which, although different from each other, breed true
and are highly
uniform; and (3) crossing the selected inbred lines with different inbred
lines to produce
hybrid progeny (sometimes referred to as "F1" hybrids). During the inbreeding
process in
maize, the vigor of the lines decreases. Vigor is restored when two different
inbred lines are
crossed to produce hybrid progeny. A consequence of the homozygosity and
homogeneity of the inbred lines is that hybrids produced by crossing a defined
pair of inbreds are uniform and predictable. Once the inbreds that give a
superior hybrid have been identified, the hybrid seed can be reproduced for as
long as the homogeneity of the inbred parents is maintained.
Despite many years of research and the considerable commercial importance
of generating hybrids with desirable traits, the molecular basis for heterosis
is still essentially
unknown. In a few cases, the loss of vigor due to inbreeding can be traced
directly to a
combination of undesirable genes (e.g., lethal or sublethal recessives).
However, the simple
genetic combination of such genes is not at all sufficient to explain the
phenomenon of
heterosis. Even when crosses are optimized to eliminate such problematic
genes, the
resulting offspring still show a decrease in vigor when inbred. Furthermore,
many phenotypic
traits, such as yield, are the result of several interacting genes and it is
unclear why combining
parents with different genetic backgrounds results in an increase in yield.
Indeed, it is not
even clear whether heterosis is the result of one or a few general genetic
mechanisms, or
whether it is the result of many simultaneously interacting processes.
Because of the lack of understanding of the molecular basis for heterosis,
crop
development has relied upon empirical observations of heterosis for hybrids
which result
from crossing selected inbred crop strains (or resulting from second order
crosses, e.g., in
which two inbreds are crossed to produce a hybrid which is then crossed with
an inbred or
hybrid strain to produce a subsequent 3-4 way heterotic hybrid). This
laborious process has
been conducted on a large scale, resulting in increases in desirable measures
of heterosis,
such as yield, of several percent per year.
Empirical methods based on quantitative genetics theory have resulted in a
tripling of hybrid corn yield over the last 70 years. This has been essential
for food security
and a major contribution to the U.S. and world economy. By 2020, the World
Bank and other groups predict that it will be necessary to double maize
production and increase rice and wheat production by 50% to support projected
population growth. Such an increase cannot
be accomplished by increasing acreage in production (there is not enough
additional acreage
available). It is doubtful that simple empirical approaches will be sufficient
to increase yield
fast enough to meet projected demand.
Molecular methods have been used to a limited extent to supplement crop
breeding programs to select desirable inbreds and hybrids. In general, these
procedures have
been used to identify genetic markers corresponding to desirable or
undesirable loci (e.g., "quantitative trait loci" or QTLs) in plants under
analysis. Genetic markers represent (mark
the location of) specific loci in the genome of a species or closely related
species, and
sampling of different genotypes at these marker loci reveals genetic
variation. The genetic
variation at marker loci can then be described and applied to genetic studies,
commercial
breeding, diagnostics, cladistic analysis of variance, or genotyping of
samples. Because
molecular methods are amenable to high throughput analysis and because they do
not require
yield testing, they can be used to speed the process of crop development.
However, although
these techniques are of considerable use, and can and do enhance the
efficiency of crop
breeding programs, they are not currently used, or useful, as a predictor for
the more general
phenomenon of heterosis.
Accordingly, there is a need in the art to determine how molecular, or other
high-throughput methods, or models, can be applied to predict heterosis in
individual
organisms and in populations. The present invention provides a number of
fundamental
discoveries which make it possible to correlate molecular methods and the
phenomenon of
heterosis, as well as a variety of additional aspects which will be apparent
upon complete review.
SUMMARY OF THE INVENTION
It is discovered that the number of gene products expressed at optimum levels
in an organism such as a plant correlates with the degree of heterosis the
organism displays.
Thus, by profiling the expression of RNA or protein in a tissue of a plant, it
is possible to
predict the level of heterosis the plant will display if tested for a
heterotic trait such as yield.
Use of this correlation permits initial selection of organisms, such as
commercial crops,
without actual field testing. Because of the high throughput nature of
molecular methods
which can be used to profile expression, this initial selection dramatically
speeds the process
of increasing desirable traits (and decreasing undesirable traits), resulting
in an increase in the
rate, e.g., of crop improvement.
It is additionally discovered that there is a correlation between the number
of
dominant and additive expression products and the heterosis an organism such
as a plant
displays. As above, determination of the number (and/or ratio) of dominant and/or
additive
expression products permits selection of plants for heterosis without field
testing. In all cases,
profiling methods are used to determine the number, and/or relative ratio of
any or all of
additive, dominant, or under- or over-dominant expression products, thereby
providing
methods of selecting plants for increased heterosis based upon observed
expression profiles.
In addition, modeling methods for predicting which crosses from a panel of
potential crosses
are most likely to result in increases in the number of expressed genes, or
the number or ratio
of additive or dominant genes, or which minimize the ratio of under- or over-dominant genes,
are provided. New selection methods for obtaining desirable plants, and plants
obtained by
these methods are provided.
It is additionally discovered that gene silencing plays a role in heterosis.
Thus,
by monitoring silencing of genes, it is possible to identify which genes are
responsible for
heterosis. Thus, in one aspect, a heterologous nucleic acid that results in
expression of
expression products from silenced genes (e.g., dominant or additive products)
is introduced
into a target plant. Examples of appropriate heterologous nucleic acids
include one or more
of: a transcription factor which activates a promoter from a silenced gene, a
nucleic acid
encoded by the silenced gene under the control of a heterologous promoter, and
a nucleic acid
homologous to the silenced gene with at least one region of difference with
the silenced gene,
which homologous nucleic acid can recombine with the silenced gene to produce
a modified
gene. Any of these nucleic acids can be cloned under the control of
heterologous promoters
and placed into target plants to increase heterosis of the target plants.
In desirable implementations of the methods herein, integrated systems
comprising computer databases having expression profile information can be
used to select
which parental crosses are most likely to result in an increase in the number
of expression
products (or an optimization of expression products of a selected class, i.e.,
dominant, under-dominant, over-dominant, additive, or the like) in offspring. Thus,
consideration of
expression profile information provides not only a basis for selecting hybrids
from crosses,
but, using the methods herein, also identifies desirable crosses to be made.
Production and
automated consideration of expression profile databases also provides a
mechanism for
identifying the genetic source of particular expression products, thereby
indicating the likely
parentage of given hybrids.
The invention additionally provides methods of cloning and transducing target
plants or animals with dominant, additive, under-dominant and over-dominant
genes
identified by comparative examination of expression profiles.
BRIEF DESCRIPTION OF THE FIGURES
Figure 1 is a scatter plot showing the correlation between the degree of
heterosis and relationship.
Figure 2 is a set of bar graphs showing classification of gene expression
patterns in hybrid vs. inbred parents.
Figure 3 is a line graph showing the correlation between the pattern of gene [...]
Figure 4 is a set of bar graphs showing dominant, additive and
over-/under-dominant RNA expression.
Figure 5 is a scatter graph showing the correlation between parental effects on
gene expression and heterosis.
Figure 6a-c is a set of schematic illustrations showing polymorphic dominant
products and sequences (SEQ ID NOS: 1-23, respectively).
DEFINITIONS
An "expression profile" is the result of detecting a representative sample of
expression products from a cell, tissue or whole organism, or a representation
(picture, graph, data table, database, etc.) thereof. For example, many RNA
expression products of a cell or tissue can simultaneously be detected on a
nucleic acid array, or by the technique of differential display or modification
thereof such as Curagen's "GeneCalling™" technology. Similarly, protein
expression products can be tested by various protein detection methods, such as
hybridization to peptide or antibody arrays, or by screening phage display
libraries. A "portion" or "subportion" of an expression profile, or a "partial
profile" is a subset of the data provided by a complete profile, such as the
information provided by a subset of the total number of detected expression
products.
An "expression product" is any product transcribed in a cell from a DNA (e.g.,
from a gene) or translated from an RNA (e.g., a protein). Example expression
products include RNAs and proteins.
A "representative sample" of expression products, e.g., from a particular cell,
tissue, or whole organism is a sufficiently large number of expression products
that a statistical comparison of the actual number and/or type of expression
products between different cells, tissues, or whole organisms can be made.
Ideally, at least about 50%, and typically 60%,
70%, 80%, 90%, 95% or 100% of the total expression products which are
detectable by a
given technique constitute the "representative sample." The representative
sample will
typically include a large number of expression products, as cells, tissues and
organisms
typically produce a fairly large number of expression products. For example, a
typical
representative sample of expression products includes between about 100 and
20,000 or more
expression products, e.g., about 100-500, 1,000, 1,500, 2,000, 2,500, 3,000,
3,500, 4,000,
4,500, 5,000, 5,500, 6,000, 6,500, 7,000, 7,500, 8,000, 8,500, 9,000, 9,500,
10,000, 10,500,
11,000, 11,500, 12,000, 12,500, 13,000, 13,500, 14,000, 14,500, 15,000,
15,500, 16,000,
16,500, 17,000, 17,500, 18,000, 18,500, 19,000, 19,500, 20,000, or 30,000
expression
products, or the like.
The term "correlation," unless indicated otherwise, is used herein to indicate
that a "statistical association" exists between, e.g., an expression product
and the degree of
heterosis.
"Dominant" expression for an expression product refers to the situation where
expression of the product in a progeny differs from one parent, and not the
other for the
expression product. "Additive" expression for an expression product refers to
the situation
where expression of the product in a progeny falls within the range of the two
parents (and
may or may not differ from both parents). "Over-dominant" or "under-dominant"
expression
for an expression product refers to the situation where expression of an
expression product in
a progeny differs from both parents and falls outside of the range of the two
parents, either
over the higher parent value, or under the lower parent value, respectively
(Figure 2). Further,
the term "differ" when referring to values is dependent on the technologies
being utilized.
For example, when using Curagen's "GeneCalling™" technology, a value that is
less than approximately 1.5- to 2.0-fold different from that of a given parent is
considered not to differ.
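The definitions of dominant, additive, over-dominant and under-dominant expression given above can be sketched in code. The following is a minimal illustration only, not the applicants' implementation: the function names, the default 1.5-fold threshold, the treatment of an undetected (zero) value, and the "unchanged" label for products that do not differ anywhere are all assumptions introduced here.

```python
def differs(a, b, fold=1.5):
    """Return True if two expression values differ by at least `fold`-fold."""
    hi, lo = max(a, b), min(a, b)
    if lo <= 0:
        # Assumption: an undetected (zero) value differs from any detected value.
        return hi > 0
    return hi / lo >= fold


def classify_expression(parent1, parent2, progeny, fold=1.5):
    """Classify one expression product from its levels in two parents and a progeny."""
    d1 = differs(progeny, parent1, fold)
    d2 = differs(progeny, parent2, fold)
    if d1 and d2:
        # Progeny differs from both parents: outside the parental range, or inside it.
        if progeny > max(parent1, parent2):
            return "over-dominant"
        if progeny < min(parent1, parent2):
            return "under-dominant"
        return "additive"
    if d1 != d2:
        # Progeny differs from exactly one parent.
        return "dominant"
    # Progeny differs from neither parent; it then necessarily lies within their range.
    if differs(parent1, parent2, fold):
        return "additive"
    return "unchanged"  # hypothetical label, not a class named in the text
```

For instance, with hypothetical levels of 10 and 40 in the parents, a progeny level of 38 is classified as dominant and a progeny level of 20 as additive.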
A "biological sample" is a portion of material isolated from a biological
source such as a plant, isolated plant tissue, or plant cell, or a portion of
material made from
such a source, such as a cell extract or the like.
A "promoter" is an array of nucleic acid control sequences which direct
transcription of a nucleic acid. As used herein, a promoter includes necessary
nucleic acid
sequences near the start site of transcription, such as, in the case of a
polymerase II type
promoter, a TATA element. A promoter also optionally includes distal enhancer
or repressor
elements which can be located as much as several thousand base pairs from the
start site of
transcription. A "constitutive" promoter is a promoter which is active in a
selected organism
under most environmental and developmental conditions. An "inducible" promoter
is a
promoter which is under environmental or developmental regulation in a
selected organism.
The phrase "hybrid plants" refers to plants which result from a cross between
genetically different individuals.
The phrase "sexually crossed" or "sexual reproduction" in the context of seed
crop plants refers to the fusion of gametes to produce, e.g., seed by
pollination. A "sexual
cross" is pollination of one plant by another. "Selfing" is the production of,
e.g., seed by self-
pollination, i.e., where the pollen and the ovule are from the same plant.
The phrase "tester parent" refers to a parent that is genetically different
from a
set of lines to which it is crossed. The cross is for purposes of evaluating
differences among
the lines in topcross combination. Using a tester parent in a sexual cross
allows one of skill
to determine the genetic differences between the tested lines on the
phenotypic trait with
expression of quantitative trait loci in a hybrid combination.
The phrases "topcross combination" and "hybrid combination" refer to the
processes of crossing a single tester parent to multiple lines. The purpose
of producing such
crosses is to evaluate the ability of the lines to produce desirable
phenotypes in hybrid
progeny derived from the line by the tester cross.
The phrase "transgenic plant" refers to a plant into which exogenous
polynucleotides have been introduced by any process other than sexual cross or
selfing.
Examples of processes by which this can be accomplished are described below,
and include
Agrobacterium-mediated transformation, biolistic methods, electroporation, in
planta
techniques, and the like. Such a plant containing the exogenous
polynucleotides is referred to
here as an R1 generation transgenic plant. Transgenic plants may also arise
from sexual cross
or by selfing of transgenic plants into which exogenous polynucleotides have
been
introduced.
DETAILED DESCRIPTION
OVERVIEW OF SELECTION FOR HETEROSIS
Crop improvement relies extensively on the phenomenon of heterosis. Inbreds
and/or hybrids are crossed to produce heterotic hybrids with desirable traits
such as high
yield, disease resistance, resistance to heat, cold, salinity, insects, fungi,
herbicides,
pesticides, etc. Secondary desirable traits such as a particular size or shape
of ears, solids
content, sugar content, oil content, water content, etc., can also be affected
by heterosis. The
present invention establishes several correlations between the expression of
gene products
and heterosis, e.g., with respect to yield. These include a statistical
association between the
number of gene products and the degree of heterosis displayed; a statistical
association
between the number of gene products with a dominant expression pattern and the
degree of
heterosis displayed; and a statistical association between the number of gene
products with an
additive expression pattern and the degree of heterosis displayed. In
addition, it is discovered
that genes are silenced during inbreeding in plants.
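One common way to quantify a statistical association of the kind described above is a correlation coefficient. The sketch below computes a Pearson coefficient between the number of expression products detected in each hybrid and a heterosis score. The choice of Pearson's r is an assumption (the text does not name a statistic), and the numbers are placeholder values for illustration, not measurements from this application.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Placeholder values for illustration only (hypothetical, one entry per hybrid).
expressed_counts = [4100, 4350, 4600, 4800, 5050]
heterosis_scores = [12.0, 15.5, 17.0, 21.0, 24.5]
r = pearson_r(expressed_counts, heterosis_scores)
```

A value of r near +1 would indicate that hybrids with more detected expression products tend to show more heterosis; the same function can be applied to counts of dominant or additive products.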
These correlations provide new methods of selecting heterotic hybrids, without
the necessity of field testing every hybrid to monitor heterotic traits. In
the methods,
expression of a first representative sample of first expression products
(e.g., RNAs or
proteins) is profiled from a first progeny plant (e.g., a hybrid
resulting from crossing two
or more parental lines). The expression products produced in the first progeny
plant are
quantified and/or monitored for the type of expression product (additive,
dominant, under-
dominant, over-dominant, etc.). As noted above, the number of first expression
products
produced in the first progeny plant is statistically associated with a measure
of heterosis in the
first progeny plant, as is the number of dominant, additive, under-dominant or
over-dominant,
or silenced expression products. The plant is then selected (e.g., against
similar measures for
a second progeny plant, or a population of progeny plants, or against the
parental stock) for
further testing based upon the number or type of expression products detected.
Thus, the
plant can be selected for one or more characteristics, including: a selected
number of
expression products, a selected number of dominant expression products, a
selected ratio of
dominant expression products to total expression products, a desired number of
over- or
under-dominant expression products, a selected ratio of over- or under-
dominant expression
products to total expression products, a selected number of additive
expression products, and
a selected ratio of additive expression products to total expression products.
Typically, the
first progeny plant is selected to maximize the number of dominant expression
products
and/or to maximize the number of additive expression products, and/or to
minimize the
number of over- or under-dominant expression products. Crosses can also be
selected to
minimize silencing in the progeny plant.
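The selection criteria listed above (numbers and ratios of dominant, additive, and over- or under-dominant expression products) can be tabulated per progeny plant and used to rank candidates. The following sketch assumes each expression product has already been labeled with its class; the function names and the simple score (dominant plus additive minus over-/under-dominant) are hypothetical choices made for illustration, not a scoring rule stated in the text.

```python
from collections import Counter


def summarize(labels):
    """Count expression-product classes and their ratios to the total."""
    counts = Counter(labels)
    total = len(labels)
    out_of_range = counts["over-dominant"] + counts["under-dominant"]

    def ratio(n):
        return n / total if total else 0.0

    return {
        "total": total,
        "dominant": counts["dominant"],
        "additive": counts["additive"],
        "over_or_under": out_of_range,
        "dominant_ratio": ratio(counts["dominant"]),
        "additive_ratio": ratio(counts["additive"]),
        "over_or_under_ratio": ratio(out_of_range),
    }


def rank_progeny(profiles):
    """Rank progeny names, best first, by an assumed illustrative score."""
    def score(name):
        s = summarize(profiles[name])
        # Favor dominant and additive products; penalize over-/under-dominant ones.
        return s["dominant"] + s["additive"] - s["over_or_under"]

    return sorted(profiles, key=score, reverse=True)


# Hypothetical labeled profiles for two progeny plants.
profiles = {
    "hybrid_A": ["dominant", "additive", "dominant", "over-dominant"],
    "hybrid_B": ["over-dominant", "under-dominant", "additive", "dominant"],
}
ranking = rank_progeny(profiles)
```

In practice the weights given to each class would be chosen from the observed associations with heterosis rather than fixed at one, as they are here.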
The parental plants used to produce the first progeny can also be profiled.
Resulting parental expression profiles serve any of a variety of purposes. The
parental
expression profiles can be compared to the first progeny profile to aid in
determining whether
the progeny show an increase in the number of expression products as compared
to parental
stocks (thereby indicating that the progeny is likely to be heterotic). In
addition, comparison
between the parental expression profiles and the progeny profile is used to
determine whether
the individual expression products represented in the profile are dominant,
additive, under-
dominant, over-dominant, or the like. The parental expression profiles can
also be placed
into a database to aid in determining which crosses are most likely to produce
heterotic
hybrids. Potentially desirable crosses among members of the database are
selected by
identifying plants likely to produce progeny plants with a selected number of
expression
products which are dominant, over-dominant, under-dominant or additive. For
example,
parents are selected to produce the first progeny plant by selecting for
complementary
expression of dominant or additive expression products between the parents, or
by selecting
against expression of over-dominant or under-dominant expression products in
the parents.
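Selecting crosses from a database of parental profiles for complementary expression can be sketched as a pairwise comparison. The example below assumes each parental profile is stored as a set of detected expression product identifiers and, as a deliberate simplification, predicts that a progeny could express any product detected in either parent. The identifiers, names, and the prediction rule are assumptions for illustration.

```python
from itertools import combinations


def rank_crosses(parent_profiles):
    """Rank all pairwise crosses by the predicted number of expressed products.

    parent_profiles maps a parent name to the set of product IDs detected in it.
    Returns (parent_a, parent_b, predicted_count) tuples, best first.
    """
    scored = []
    for a, b in combinations(sorted(parent_profiles), 2):
        # Complementary parents contribute products the other parent lacks.
        predicted = parent_profiles[a] | parent_profiles[b]
        scored.append((len(predicted), a, b))
    # Highest predicted count first; ties broken alphabetically for stable output.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return [(a, b, n) for n, a, b in scored]


# Hypothetical database of three inbred parental profiles.
database = {
    "inbred_1": {"g1", "g2", "g3"},
    "inbred_2": {"g3", "g4"},
    "inbred_3": {"g1", "g2"},
}
crosses = rank_crosses(database)
```

Here the cross of inbred_1 with inbred_3 ranks last because inbred_3 adds no product that inbred_1 lacks.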
An additional statistical association relates to the relationship between
parental
and progeny plants. It is discovered that plants which exhibit an expression
profile that is
more similar to the maternal plant than to the paternal plant may be more
heterotic.
Accordingly, comparison of the maternal, paternal and progeny expression
profiles can be
used to monitor this relationship. In addition, multiple crosses to a single
female type can be
made (or the results predicted by comparison in a database) and the progeny
screened (or
predicted) for similarity to the female type.
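The comparison of a progeny profile with the maternal and paternal profiles can be illustrated with a simple similarity measure. The sketch below scores similarity as the fraction of shared expression products whose progeny level is within a fold threshold of the parent's level. This measure, the 1.5-fold default, and all names and values are assumptions; other distance measures could equally be used.

```python
def similarity(progeny, parent, fold=1.5):
    """Fraction of shared products whose progeny level is within `fold`-fold of the parent's."""
    shared = [product for product in progeny if product in parent]
    if not shared:
        return 0.0
    close = 0
    for product in shared:
        hi = max(progeny[product], parent[product])
        lo = min(progeny[product], parent[product])
        # Both undetected counts as similar; one undetected counts as different.
        if hi == 0 or (lo > 0 and hi / lo < fold):
            close += 1
    return close / len(shared)


def maternal_bias(progeny, maternal, paternal, fold=1.5):
    """Positive when the progeny profile resembles the maternal profile more closely."""
    return similarity(progeny, maternal, fold) - similarity(progeny, paternal, fold)


# Hypothetical expression levels keyed by product ID.
maternal = {"g1": 10, "g2": 40, "g3": 5}
paternal = {"g1": 30, "g2": 10, "g3": 5}
progeny = {"g1": 11, "g2": 38, "g3": 5}
bias = maternal_bias(progeny, maternal, paternal)
```

Progeny from several crosses to the same female type could then be ordered by this bias value.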
As noted above, silencing was determined to play a significant role in the
loss
of heterosis due to inbreeding. Accordingly, by comparing parental and progeny
plants it is
possible to determine which genes are silenced. These genes can be rescued,
e.g., by cloning
the silenced genes and placing them under the control of heterologous
promoters, or other
strategies noted herein, and transducing the genes back into target plants
(e.g., the parental
lines, the hybrids, or any other plant). In addition, by compiling database
information for
which genes are silenced in inbreds, it is possible to decrease silencing in
hybrids by selecting
crosses where parents have complementary patterns. It is also possible to use
these methods
to increase the performance (e.g., grain yield, standability, etc.) of the
inbred lines
themselves.
The first progeny plant selected by any of the methods herein, or a subsequent
progeny plant, or a transgenic plant as described above can be subjected to
any of the field
tests appropriate for monitoring one or more desired traits. Thus, the first
progeny plant, or a
subsequent progeny plant thereof, can be tested for a desired phenotypic
trait. The
phenotypic trait can be compared between the first progeny plant, or a
subsequent progeny
plant, and a selected hybrid or inbred plant. The expression profile of the
selected hybrid or
inbred plant can be compared to an expression profile of the first progeny
plant, or the
subsequent progeny plant. Nucleic acids differentially expressed between the
selected hybrid
or inbred plant and the first progeny plant, or the subsequent progeny plant
are identified as
targets for cloning. Similarly, genes that are expressed in high yielding
hybrids that are not
expressed in low yielding hybrids can be determined by comparisons of the
expression
profiles for the high and low yielding hybrids. Nucleic acids from (or
corresponding to) the
differentially expressed genes are cloned for introduction into target nucleic
acids. After
identifying which expression products from the representative sample show an
additive, dominant, under-dominant, or over-dominant expression pattern for at least a
portion of the representative sample, an identified expression product, or a nucleic acid
corresponding to the expression product, can be
cloned. The cloned nucleic acid can then be transduced into target plants to
test whether the
nucleic acid encodes a useful trait, or to improve traits in the target plant.
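Identifying products expressed in high yielding hybrids but not in low yielding hybrids amounts to a comparison of two profiles. A minimal sketch follows; the profile format (product ID mapped to expression level), the optional fold threshold for products detected in both, and all names and values are assumptions introduced for illustration.

```python
def differentially_expressed(high, low, fold=2.0):
    """Product IDs detected in `high` that are absent, or at least `fold`-fold lower, in `low`."""
    candidates = []
    for product, level in high.items():
        if level <= 0:
            continue  # not detected in the high yielding profile
        other = low.get(product, 0.0)
        if other <= 0 or level / other >= fold:
            candidates.append(product)
    return sorted(candidates)


# Hypothetical profiles for a high yielding and a low yielding hybrid.
high_yield = {"g1": 50, "g2": 20, "g3": 8}
low_yield = {"g1": 10, "g2": 18}
targets = differentially_expressed(high_yield, low_yield)
```

The resulting list corresponds to the candidate cloning targets described in the preceding paragraph.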
Further details on expression profiling, cloning of nucleic acids, selection
of
hybrids, integrated systems, screening methods and the like are set forth
below.
EXPRESSION PROFILING
As set forth below, a variety of tissues can be profiled, with immature
tissues
being preferentially profiled. Immature tissues are preferred because profiling them
increases the rate at
which crops can be screened, as a plant does not have to be grown to maturity.
However,
essentially any tissue, or whole plant, can be profiled. A variety of
profiling methods are
available, including hybridization of expressed or amplified nucleic acids to
a nucleic acid
array, hybridization of expressed polypeptides to a protein array,
hybridization of peptides or
nucleic acids to an antibody array, subtractive hybridization, differential
display and others.
CROPS TO BE PROFILED
The parental or progeny plants can be inbreds or hybrids. Most commonly, the
progeny plant is a hybrid, produced by crossing two different inbred lines, or
crossing an
inbred line and a hybrid line, or crossing two hybrid lines (which are the
result of crossing
inbred or hybrid lines), or crossing of more than two lines (e.g., to generate
polyploid or
recombinant plants) in a single cross. Once a desirable heterotic hybrid is
identified, it can be
treated as such hybrids typically are in breeding schemes, e.g., it can be
produced in quantity as
produced in quantity as
seed; it can be top crossed to inbred lines to produce a 3-way hybrid plant;
it can be selfed to
produce more inbred lines, or the like.
Most, if not all, plants and animals show hybrid vigor. Much of the discussion
herein relates to commercially valuable crops, as these are an important
target of the methods
of the invention. However, the methods are general and can be applied to non-
commercial
crop plants, fungi, and to the production of animals, including poultry,
cattle, sheep, pigs, and
the like.
Important commercial crops include both monocots and dicots. Monocots
include plants in the grass family (Gramineae), such as plants in the subfamilies Festucoideae
and Poacoideae, which together include several hundred genera including plants
in the genera
Agrostis, Phleum, Dactylis, Sorghum, Setaria, Zea (e.g., corn), Oryza (e.g.,
rice), Triticum
(e.g., wheat), Secale (e.g., rye), Avena (e.g., oats), Hordeum (e.g., barley),
Saccharum, Poa,
Festuca, Stenotaphrum, Cynodon, Coix, the Olyreae, Phareae and many others.
Plants in the
family Gramineae are particularly preferred target plants for the methods of
the invention.
Additional preferred targets include other commercially important crops, e.g.,
from the
families Compositae (the largest family of vascular plants, including at least
1,000 genera,
including important commercial crops such as sunflower), and Leguminosae or
"pea family,"
which includes several hundred genera, including many commercially valuable
crops such as
pea, beans, lentil, peanut, yam bean, cowpeas, velvet beans, soybean, clover,
alfalfa, lupine,
vetch, lotus, sweet clover, wisteria, and sweetpea. Common crops applicable to
the methods
of the invention include Zea mays, rice, soybean, sorghum, wheat, oats,
barley, millet,
sunflower, and canola.
TISSUES TO BE PROFILED
As noted above, one advantage of the present invention is that the methods can
be performed without the necessity of field testing progeny (field testing
can, of course, be
used as a part of, or an adjunct to the other methods herein). An extension of
this advantage
is that immature tissues can be profiled from a test plant, which speeds the
testing process.
Thus, although expression profiles can be performed from any tissue or whole
organism, in
one preferred embodiment, the representative samples are from immature tissues
or immature
plants. For example, an immature ear of the plant, or a whole seedling plant
(or any tissue
thereof), can be profiled. It will be appreciated that when comparisons are
performed, they
are typically performed between expression profiles obtained from the same
tissue and
developmental stage (and environmental conditions) for the plants which are
compared.
RNA PROFILING
In one preferred embodiment, the expression products which are detected in
the methods of the invention are RNAs, e.g., mRNAs expressed from genes within
a cell of
the plant or tissue profiled.
A number of techniques are available for detecting RNAs. For example,
northern blot hybridization is widely used for RNA detection, and is generally
taught in a
variety of standard texts on molecular biology, including: Berger and Kimmel,
Guide to Molecular Cloning Techniques, Methods in Enzymology volume 152, Academic Press, Inc.,
San Diego, CA ("Berger"); Sambrook et al., Molecular Cloning - A Laboratory Manual (2nd
Ed.), Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989
("Sambrook"); and Current Protocols in Molecular Biology, F.M. Ausubel et al., eds., Current
Protocols, a joint venture between Greene Publishing Associates, Inc. and John Wiley &
Sons, Inc. (supplemented through 1998) ("Ausubel").
Furthermore, one of skill will appreciate that essentially any RNA can be
converted into a double stranded DNA using a reverse transcriptase enzyme and
a
polymerase. See, Ausubel, Sambrook and Berger, id. Thus, detection of mRNAs
can be
performed by converting, e.g., mRNAs into DNAs, which are subsequently
detected in, e.g., a
standard "Southern blot" format.
Furthermore, DNAs can be amplified to aid in the detection of rare molecules
by any of a number of well known techniques, including: the polymerase chain reaction
(PCR), the ligase chain reaction (LCR), Qβ-replicase amplification and other RNA
polymerase mediated techniques (e.g., NASBA). Examples of these techniques are found in
Berger, Sambrook, and Ausubel, id., as well as in Mullis et al., (1987) U.S. Patent No.
4,683,202; PCR Protocols A Guide to Methods and Applications (Innis et al. eds) Academic
Press Inc. San Diego, CA (1990) (Innis); Arnheim & Levinson (October 1, 1990) C&EN 36-47;
The Journal of NIH Research (1991) 3, 81-94; Kwoh et al. (1989) Proc. Natl. Acad. Sci.
USA 86, 1173; Guatelli et al. (1990) Proc. Natl. Acad. Sci. USA 87, 1874; Lomell et al.
(1989) J. Clin. Chem 35, 1826; Landegren et al., (1988) Science 241, 1077-1080; Van Brunt
(1990) Biotechnology 8, 291-294; Wu and Wallace, (1989) Gene 4, 560; Barringer et al.
(1990) Gene 89, 117, and Sooknanan and Malek (1995) Biotechnology 13: 563-564.
Improved methods of cloning in vitro amplified nucleic acids are described in
Wallace et al.,
U.S. Pat. No. 5,426,039. Improved methods of amplifying large nucleic acids by
PCR are
summarized in Cheng et al. (1994) Nature 369: 684-685 and the references
therein, in which
PCR amplicons of up to 40kb are generated. One of skill will appreciate that
essentially any
RNA can be converted into a double stranded DNA suitable for restriction
digestion, PCR
expansion and sequencing using reverse transcriptase and a polymerase. See,
Ausubel,
Sambrook and Berger, all supra.
These general methods can be used for expression profiling. For example,
arrays of probes can be spotted onto a surface and expression products (or in
vitro amplified
nucleic acids corresponding to expression products) can be labeled and
hybridized with the
array. For convenience, it may be helpful to use several arrays
simultaneously. It is expected
that one of skill is familiar with nucleic acid hybridization. General methods
of hybridization
are found in Berger, Sambrook and Ausubel, supra, and further in Tijssen
(1993) Laboratory
Techniques in Biochemistry and Molecular Biology - Hybridization with Nucleic
Acid
Probes, e.g., part I chapter 2 "Overview of principles of hybridization and
the strategy of
nucleic acid probe assays," Elsevier, New York.
In one useful variation of these methods, solid phase arrays are adapted for
the
rapid and specific detection of multiple polymorphic nucleotides. Typically, a
nucleic acid
probe is chemically linked to a solid support and a target nucleic acid (e.g.,
an RNA or
corresponding amplified DNA) is hybridized to the probe. Either the probe, or
the target, or
both, can be labeled, typically with a fluorophore. Where the target is
labeled, hybridization
is detected by detecting bound fluorescence. Where the probe is labeled,
hybridization is
typically detected by quenching of the label by the bound nucleic acid. Where
both the probe
and the target are labeled, detection of hybridization is typically performed
by monitoring a
signal shift such as a change in color, fluorescent quenching, or the like,
resulting from
proximity of the two bound labels.
In one embodiment of this concept, an array of probes is synthesized on a
solid support. Using chip masking technologies and photoprotective chemistry,
it is possible
to generate ordered arrays of nucleic acid probes with large numbers of
probes. These arrays,
which are known, e.g., as "DNA chips," or as very large scale immobilized
polymer arrays
("VLSIPS™" arrays) can include millions of defined probe regions on a
substrate having an
area of about 1 cm² to several cm². In addition to photomasking technologies,
arrays of
chemicals, nucleic acids, proteins or the like can also be printed on a solid
substrate using
printing technologies.
The construction and use of solid phase nucleic acid arrays to detect target
nucleic acids is well described in the literature. See, Fodor, et al. Science
251:767 (1991);
Sheldon, et al. Clin. Chem. 39(4):718 (1993); Kozal, et al. Nature Medicine
2(7):753 (1996)
and Hubbell, U.S. Pat. No. 5,571,639. In brief, a combinatorial strategy
allows for the
synthesis of arrays containing a large number of probes using a minimal number
of synthetic
steps. For instance, it is possible to synthesize and attach all possible DNA
8-mer
oligonucleotides (4^8, or 65,536 possible combinations) using only 32 chemical
synthetic
steps. In general, these procedures provide a method of producing 4^n
different
oligonucleotide probes on an array using only 4n synthetic steps.
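The arithmetic of the combinatorial strategy (4^n probes from 4n coupling steps) can be checked with a small simulation. The sketch below is a toy model, not a description of any manufacturer's process: each array site is assigned one target sequence, and at each of the n positions one coupling step is performed per base, with a "mask" exposing only the sites whose next base is that base. The function name and the model itself are assumptions for illustration.

```python
from itertools import product

BASES = "ACGT"


def synthesize_all(n):
    """Simulate masked synthesis of every n-mer; return (probes, coupling_steps)."""
    targets = ["".join(p) for p in product(BASES, repeat=n)]
    grown = [""] * len(targets)
    steps = 0
    for position in range(n):
        for base in BASES:
            # One coupling step: the mask exposes only sites needing `base` next.
            for site, target in enumerate(targets):
                if target[position] == base:
                    grown[site] += base
            steps += 1
    return grown, steps


probes, steps = synthesize_all(8)
```

For n = 8 the simulation yields 65,536 distinct probes after 32 coupling steps, matching the figures given in the text.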
Light-directed combinatorial synthesis of oligonucleotide arrays on a glass
surface is performed with automated phosphoramidite chemistry and chip masking
techniques
similar to photo resist technologies in the computer chip industry. Typically,
a glass surface
is derivatized with a silane reagent containing a functional group, e.g., a
hydroxyl (for nucleic
acid arrays) or amine group (for peptide or peptide nucleic acid arrays)
blocked by a
photolabile protecting group. Photolysis through a photolithographic mask is
used selectively
to expose functional groups which are then ready to react with incoming 5'-photoprotected
nucleoside phosphoramidites. The phosphoramidites react only with those sites
which are
illuminated (and thus exposed by removal of the photolabile blocking group).
Thus, the
phosphoramidites only add to those areas selectively exposed from the
preceding step. These
steps are repeated until the desired array of sequences has been synthesized
on the solid
surface. Combinatorial synthesis of different oligonucleotide analogues at
different locations
on the array is determined by the pattern of illumination during synthesis and
the order of
addition of coupling reagents. Monitoring of hybridization of target nucleic
acids to the array
is typically performed with fluorescence microscopes or laser scanning
microscopes.
In addition to being able to design, build and use probe arrays using available
techniques, one of skill is also able to order custom-made arrays and array-reading devices
from manufacturers specializing in array manufacture. For example, Affymetrix
Corp. in
Santa Clara, CA manufactures nucleic acid arrays.
It will be appreciated that probe design is influenced by the intended
application. For example, where several allele-specific probe-target
interactions are to be
detected in a single assay, e.g., on a single nucleic acid chip, it is
desirable to have similar
melting temperatures for all of the probes. Accordingly, the lengths of the
probes are adjusted
so that the melting temperatures for all of the probes on the array are
closely similar (it will
be appreciated that different lengths for different probes may be needed to
achieve a
particular Tm where different probes have different GC contents). Although
melting
temperature is a primary consideration in probe design, other factors are also
optionally used
to further adjust probe construction, such as elimination of self-
complementarity in the probe
(which can inhibit hybridization of a target nucleotide). Techniques for
designing and using
sets of probes for screening many nucleic acids, such as expression products,
simultaneously,
and for monitoring expression on nucleic acid arrays are described in EP 0799
897 A1.
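The adjustment of probe length to equalize melting temperatures can be sketched as follows. The text does not specify how Tm is estimated; this example swaps in the Wallace rule of thumb for short oligonucleotides (Tm of roughly 2 °C per A or T plus 4 °C per G or C), which is an assumption made here. The function names, the minimum length, and the example sequences are likewise hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def wallace_tm(seq):
    """Estimate Tm in degrees C with the Wallace rule: 2*(A+T) + 4*(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def pick_probe(region, target_tm, min_len=8):
    """Shortest prefix of `region` (at least min_len long) whose estimated Tm reaches target_tm."""
    for length in range(min_len, len(region) + 1):
        probe = region[:length]
        if wallace_tm(probe) >= target_tm:
            return probe
    return None  # the region is too short to reach the target Tm


def is_self_complementary(seq):
    """True if the probe equals its own reverse complement (a perfect palindrome)."""
    seq = seq.upper()
    return seq == seq.translate(_COMPLEMENT)[::-1]


# Hypothetical GC-rich and AT-rich target regions brought to the same estimated Tm.
gc_probe = pick_probe("GCGCGCGCGCGCGCGC", target_tm=40)
at_probe = pick_probe("ATATATATATATATATATATATAT", target_tm=40)
```

As the text notes, the AT-rich probe must be longer than the GC-rich probe to reach the same estimated Tm; the palindrome check is one simple screen for the self-complementarity mentioned above.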
One way to compare expression products between two cell populations is to
identify mRNA species which are differentially expressed between the cell
populations (i.e.,
present at different abundances between the cell populations). In addition to
the array
techniques noted above, another preferred method is to use subtractive
hybridization (Lee et
al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88:2825) or differential display
employing arbitrary
primer polymerase chain reaction (PCR) (Liang and Pardee (1992) Science
257:967). Each
of these methods has been used by various investigators to identify
differentially expressed
mRNA species (see Salesiotis et al. (1995) Cancer Lett. 91:47; Jiang et al.
(1995) Oncogene 10:1855; Blok et al. (1995) Prostate 26:213; Shinoura et al.
(1995) Cancer Lett. 89:215; Murphy et al. (1993) Cell Growth Differ. 4:715;
Austruy et al. (1993) Cancer Res. 53:2888; Zhang et al. (1993) Mol. Carcinog.
8:123; and Liang et al. (1992) Cancer Res. 52:6966). The methods have also been
used to identify mRNA species which are induced or repressed, e.g., by drugs or
certain nutrients (Fisicaro et al. (1995) Mol. Immunol. 32:565; Chapman et al.
(1995) Mol. Cell. Endocrinol. 108:108; Douglass et al. (1995) J. Neurosci.
15:2471; Aiello et al. (1994) Proc. Natl. Acad. Sci. (U.S.A.) 91:6231; Ace et
al. (1994) Endocrinology 134:1305).
For the technique of differential display, Liang and Pardee (1992), supra,
provide theoretical calculations for the selection of 5' and 3' arbitrary
primers. Correlation of
observed results to the theory is also provided. In practice, 5' primers of
less than about 9
nucleotides may not provide adequate specificity (slightly shorter primers of
about 8 to 10
nucleotides have been used in PCR methods for analysis of DNA polymorphisms;
see also Williams et al. (1991) Nucleic Acids Research 18:6531). The primer(s)
optionally comprise
5'-terminal sequences which serve to anchor other PCR primers (distal primers)
and/or which
comprise a restriction site or half site or other ligatable end. Where a
restriction site or
amplification template for a second primer is incorporated, the primers are
optionally longer
than those described above by the length of the restriction site, or
amplification template site.
Standard restriction enzyme sites include 4 base sites, 5 base sites, 6 base
sites, 7 base sites,
and 8 base sites. An amplification template site for a second primer can be of
essentially any
length, for example, the site can be about 15-25 nucleotides in length.
The amplified products are optionally labeled and are typically resolved by
electrophoresis on a polyacrylamide gel; the location(s) where label is present
are excised and
the labeled product species is/are recovered from the gel portion, typically
by elution. The
resultant recovered product species can be subcloned into a replicable vector
with or without
attachment of linkers, amplified further, and/or detected, or even sequenced
directly.
Sequencing methods are described in Berger, Sambrook and Ausubel, supra.
Direct
sequencing of PCR generated amplicons by selectively incorporating boronated
nuclease
resistant nucleotides into the amplicons during PCR and digestion of the
amplicons with a
nuclease to produce sized template fragments has also been proposed (Porter et
al. (1997)
Nucleic Acids Research 25(8):1611).
It is expected that one of skill can use, e.g., differential display for
expression
profiling. In addition, companies such as CuraGen Corp. (New Haven, CT) provide
robust
expression profiling based upon modified differential display techniques. See,
e.g., WO
97/15690 by Rothberg et al. Accordingly, one of skill can have expression
profiling
performed by companies which specialize in such techniques.
PROTEIN PROFILING
In addition to profiling RNAs (or corresponding cDNAs) as described above,
it is also possible to profile proteins. In particular, various strategies are
available for
detecting many proteins simultaneously. As applied to the present invention,
detected
proteins, corresponding to expression products, can be derived from one of at
least two
sources. First, the proteins which are detected can be either directly
isolated from a cell or
tissue to be profiled, providing direct detection (and, optionally,
quantification) of proteins
present in a cell. Second, mRNAs can be reverse transcribed into cDNA sequences, cloned
and
expressed. This increases the ability to detect rare RNAs, and makes it
possible to
immediately associate a detected protein with its coding sequence. For
purposes of the
present invention, it is not necessary even to express nucleic acids in the
proper reading
frame, as it is typically the presence or absence of an expression product
that is, initially, at
issue. Even an out of frame peptide is an indicator for the presence of a
corresponding RNA.
A variety of hybridization techniques, including western blotting, ELISA
assays, and the like are available for detection of specific proteins. See,
Ausubel, Sambrook
and Berger, supra. See also, Antibodies: A Laboratory Manual (1988) E.
Harlow and D.
Lane, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY. Non-hybridization
based
techniques such as two-dimensional electrophoresis can also be used to
simultaneously and
specifically detect large numbers of proteins.
One typical technology for detecting specific proteins involves making
antibodies to the proteins. By specifically detecting binding of an antibody
and a given
protein, the presence of the protein can be detected. In addition to available
antibodies, one
of skill can easily make antibodies using existing techniques, or modify those
antibodies
which are commercially or publicly available. In addition to the art
referenced above, general
methods of producing polyclonal and monoclonal antibodies are known to those of
skill in the art. See, e.g., Paul (ed.) (1998) Fundamental Immunology, Fourth
Edition, Raven Press, Ltd., New York; Coligan (1991) Current Protocols in
Immunology, Wiley/Greene, NY; Harlow and Lane (1989) Antibodies: A Laboratory
Manual, Cold Spring Harbor Press, NY; Stites et al. (eds.) Basic and Clinical
Immunology (4th ed.), Lange Medical Publications, Los Altos, CA, and references
cited therein; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d
ed.), Academic Press, New York, NY; and Kohler and Milstein (1975) Nature
256:495-497. Other suitable techniques for antibody preparation include
selection of libraries of recombinant antibodies in phage or similar vectors.
See, Huse et al. (1989) Science 246:1275-1281; and Ward et al. (1989) Nature
341:544-546. Specific monoclonal and polyclonal antibodies and antisera will
usually bind with a KD of at least about 0.1 µM, preferably at least about
0.01 µM or better, and most typically and preferably, 0.001 µM or better.
As used herein, an "antibody" refers to a protein consisting of one or more
polypeptides substantially or partially encoded by immunoglobulin genes or
fragments of
immunoglobulin genes. The recognized immunoglobulin genes include the kappa,
lambda,
alpha, gamma, delta, epsilon and mu constant region genes, as well as myriad
immunoglobulin variable region genes. Light chains are classified as either
kappa or lambda.
Heavy chains are classified as gamma, mu, alpha, delta, or epsilon, which in
turn define the
immunoglobulin classes, IgG, IgM, IgA, IgD and IgE, respectively. A typical
immunoglobulin (antibody) structural unit is known to comprise a tetramer.
Each tetramer is
composed of two identical pairs of polypeptide chains, each pair having one
"light" (about 25
kD) and one "heavy" chain (about 50-70 kD). The N-terminus of each chain
defines a
variable region of about 100 to 110 or more amino acids primarily responsible
for antigen
recognition. The terms variable light chain (VL) and variable heavy chain (VH)
refer to these
light and heavy chains respectively. Antibodies exist as intact
immunoglobulins or as a
number of well characterized fragments produced by digestion with various
peptidases.
Thus, for example, pepsin digests an antibody below the disulfide linkages in
the hinge region
to produce F(ab)'2, a dimer of Fab which itself is a light chain joined to
VH-CH1 by a disulfide
bond. The F(ab)'2 may be reduced under mild conditions to break the disulfide
linkage in the
hinge region thereby converting the (Fab')2 dimer into an Fab' monomer. The
Fab' monomer
is essentially an Fab with part of the hinge region (see, Fundamental
Immunology, W.E. Paul,
ed., Raven Press, N.Y. (1993), for a more detailed description of other
antibody fragments).
While various antibody fragments are defined in terms of the digestion of an
intact antibody,
one of skill will appreciate that such Fab' fragments may be synthesized de
novo either
chemically or by utilizing recombinant DNA methodology. Thus, the term
antibody, as used
herein also includes antibody fragments either produced by the modification of
whole
antibodies or synthesized de novo using recombinant DNA methodologies.
Antibodies
include single chain antibodies, including single chain Fv (sFv) antibodies in
which a variable
heavy and a variable light chain are joined together (directly or through a
peptide linker) to
form a continuous polypeptide.
For purposes of the present invention, antibodies or antibody fragments can be
arrayed, e.g., by coupling to an amine moiety fixed to a solid phase array, in
a manner similar
to that described above for construction of nucleic acid arrays. As above for
nucleic acid
probes, the antibodies can be labeled, or proteins corresponding to expression
products can be
labeled. In this manner, it is possible to couple hundreds, or even thousands,
of different
antibodies to an array.
In one embodiment, a bacteriophage antibody display library is screened with
a polypeptide encoded by a cell, or obtained by expression of mRNAs,
differential display,
subtractive hybridization or the like. Combinatorial libraries of antibodies
have been
generated in bacteriophage lambda expression systems which are screened as
bacteriophage plaques or as colonies of lysogens (Huse et al. (1989) Science
246:1275; Caton and Koprowski (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87:6450;
Mullinax et al. (1990) Proc. Natl. Acad. Sci. (U.S.A.) 87:8095; Persson et al.
(1991) Proc. Natl. Acad. Sci. (U.S.A.) 88:2432). Various embodiments of
bacteriophage antibody display libraries and lambda phage expression libraries
have been described (Kang et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88:4363;
Clackson et al. (1991) Nature 352:624; McCafferty et al. (1990) Nature 348:552;
Burton et al. (1991) Proc. Natl. Acad. Sci. (U.S.A.) 88:10134; Hoogenboom et al.
(1991) Nucleic Acids Res. 19:4133; Chang et al. (1991) J. Immunol. 147:3610;
Breitling et al. (1991) Gene 104:147; Marks et al. (1991) J. Mol. Biol. 222:581;
Barbas et al. (1992) Proc. Natl. Acad. Sci. (U.S.A.) 89:4457; Hawkins and Winter
(1992) J. Immunol.
22:867; Marks et
al. (1992) Biotechnology 10:779; Marks et al. (1992) J. Biol. Chem. 267:16007;
Lowman et
al. (1991) Biochemistry 30:10832; Lerner et al. (1992) Science 258:1313).
The patterns of hybridization which are detected provide an indication of the
presence or absence of protein sequences. As long as the library or array
against which a
population of proteins is to be screened can be correlated from one
experiment to the next
(e.g., by noting the x-y coordinates of the library or array member), no
sequence information
is required to compare expression profiles from one representative sample to
another. In
particular, the mere presence or absence (or degree) of label provides the
ability to determine
differences. One advantage of using libraries of antibodies for protein
detection is that the
individual libraries can be uncharacterized. As long as library members
have a set spatial
relationship, e.g., gridded on a plate, duplicate plates can be made and label
patterns to the set
spatial relationship determined.
More generally, peptide and nucleic acid hybridization to arrays or libraries
(or even simple two dimensional gels) can be treated in a manner analogous to a
bar code label. Any diverse library or array can be used to screen for the
presence or absence of complementary molecules, whether RNA, DNA, protein, or a
combination thereof. By measuring corresponding signal information between
different sources of test material (e.g., different hybrid or inbred plants, or
different tissues, or the like), it is possible to determine differences in
expression products for the different source materials. This process is
facilitated by various high throughput integrated systems, as set
forth below.
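The bar code analogy can be sketched in Python as follows. This is a minimal
illustration, assuming each sample is recorded as a mapping from array x-y
coordinates to label intensity and that a single intensity threshold separates
presence from absence; the function names, threshold and readings are
hypothetical and no sequence information is used.

```python
from typing import Dict, List, Set, Tuple

Coord = Tuple[int, int]


def to_barcode(signals: Dict[Coord, float], threshold: float) -> Set[Coord]:
    """Reduce intensity readings to the set of positions scored as 'present'."""
    return {xy for xy, intensity in signals.items() if intensity >= threshold}


def compare_barcodes(a: Set[Coord], b: Set[Coord]) -> Dict[str, List[Coord]]:
    """List positions seen only in a, only in b, or in both samples."""
    return {
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "shared": sorted(a & b),
    }


# Hypothetical readings from duplicate 2 x 2 arrays probed with two samples.
hybrid = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.7, (1, 1): 0.8}
inbred = {(0, 0): 0.8, (0, 1): 0.6, (1, 0): 0.05, (1, 1): 0.9}

result = compare_barcodes(to_barcode(hybrid, 0.5), to_barcode(inbred, 0.5))
print(result)
```

Because only positions are compared, the same code applies whether the array
members are nucleic acid probes, antibodies or spots on a gel image.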
In addition to array based approaches, mass spectrometry is in use for
identification of large sets of proteins in samples, and is suitable for
identification of many
proteins in a sequential or parallel fashion. For example, Hutchens et al.,
U.S. Pat. 5,719,060, describe methods and apparatus for desorption and
ionization of analytes for subsequent analysis by mass spectroscopy and/or
biosensors. Sample presenting means with probe elements with "Surfaces Enhanced
for Laser Desorption/Ionization" (SELDI) described in the '060 patent is
particularly useful in the context of the present invention; however, other
approaches described in the '060 patent are also generally applicable to the
present invention.
Two and three dimensional gel based approaches can also be used for the
specific and simultaneous identification and quantification of large numbers
of proteins from
biological samples. Multi-dimensional gel technology is well-known and
described e.g., in
Ausubel, supra, Volume 2, Chapter 10. Image analysis of multi-dimensional
protein
separation gels provides an indication of the proteins that are expressed
e.g., in a cell or tissue
type. It is worth noting that identification of particular proteins is not
necessary; instead,
positional and pattern information e.g., of protein staining or fluorescing
patterns is sufficient
to identify sets of protein expression products.
In addition to identifying expression products, such as proteins or RNA, it is
also possible to screen for large numbers of metabolites in cell or tissue
samples. The
presence, absence or level of a metabolite can be treated as a character for
comparison
purposes in the same way that nucleic acids or proteins are discussed herein.
Metabolites can
be monitored by any currently available method, including chromatography,
uni- or multi-dimensional gel separations, hybridization to complementary molecules, or the
like.
The invention provides methods of identifying plant crosses with an increase
in probability for heterosis in progeny plants. For example, in a preferred
method, the
expression profiles for a plurality of plants are compared, and the expression
profiles are
considered by pair-wise comparison. Desirable crosses produce progeny with a
selected or
optimal number of expression products, or progeny with a selected number or
type of
expression products that display a dominant, additive, over-dominant or
under-dominant
expression pattern. Desirably, these comparisons are performed in an
integrated system
which includes a computer.
The generation and use of databases of expression profile information for
performing a variety of comparisons is a feature of the invention. Because of
the large
number of comparisons between expression profiles (which, as noted above,
comprise e.g.,
detection information from about 1,000 to about 20,000 or more expression
products), the
most practical way of performing the comparisons is by entering the
information into one or
more databases and using a computer to make the comparisons.
A variety of comparative methods can be performed in an integrated system,
e.g., to determine the heterosis (or likely heterosis) of a cross. For
example, one simple
measure that can be compared across different actual or potential crosses to
determine the
desirability of a particular cross is to determine the sum of the expressed
gene products that
differ from a progeny plant in each of a first and second parental plant and
the number of


CA 02358509 2001-07-12
WO 00/42838 PCT/US00/01422
22
expressed gene products that differ between the first and second parental
plant. The larger
this sum, typically, the more desirable the cross.
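One reading of this measure can be sketched in Python, assuming each expression
profile is reduced to the set of expression products detected in a plant and
that "differ" means present in one profile but not the other. Both assumptions,
and the names used, are illustrative rather than taken from the text.

```python
def cross_score(parent1: set, parent2: set, progeny: set) -> int:
    """Sum of progeny-vs-parent differences plus parent-vs-parent differences."""
    progeny_vs_p1 = len(progeny ^ parent1)  # products detected in exactly one of the two
    progeny_vs_p2 = len(progeny ^ parent2)
    p1_vs_p2 = len(parent1 ^ parent2)
    return progeny_vs_p1 + progeny_vs_p2 + p1_vs_p2


# Hypothetical profiles keyed by arbitrary product identifiers.
p1 = {"g1", "g2", "g3"}
p2 = {"g2", "g4"}
f1 = {"g1", "g2", "g4", "g5"}

score = cross_score(p1, p2, f1)
print(score)  # 3 + 2 + 3 = 8
```

Scores computed this way can be compared across actual or potential crosses,
with larger values typically indicating the more desirable cross.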
In the integrated systems herein, it is also possible to predict the likely
outcomes of crosses between parental plants. In these methods, matrices of
possible expression profile combinations for plants are generated. For example,
the expression profiles are compiled in a database in a computer and a matrix of
possible pair-wise expression profile combinations for the plants is generated
and queried using an integrated system comprising a computer with software for
generating and comparing matrices. Subsets
of potential crosses from all of the possible pair-wise comparisons which
exhibit a maximal
number of expression profile differences represent one preferred cross. Useful
software aids
in determining how many genes are expressed, or whether expressed genes are
additive,
dominant, over-dominant or under-dominant.
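A minimal Python sketch of such a pair-wise matrix follows, assuming profiles
are stored as sets of detected expression products keyed by plant name and that
a "difference" is a product present in exactly one member of the pair. The
names and data are hypothetical.

```python
from itertools import combinations


def pairwise_difference_matrix(profiles: dict) -> dict:
    """Map each unordered pair of plants to its number of profile differences."""
    return {
        (a, b): len(profiles[a] ^ profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }


def best_crosses(profiles: dict, top_n: int = 1) -> list:
    """Return the top_n pairs with the most differences (ties broken by name)."""
    matrix = pairwise_difference_matrix(profiles)
    return sorted(matrix.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Hypothetical database of three inbred profiles.
profiles = {
    "inbred_A": {1, 2, 3},
    "inbred_B": {3, 4},
    "inbred_C": {1, 2, 3, 4},
}

matrix = pairwise_difference_matrix(profiles)
top = best_crosses(profiles)
print(top)  # [(('inbred_A', 'inbred_B'), 3)]
```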
Which plants to select as possible crosses is up to the discretion of the
user. It
is possible simply to test all possible first order crosses in a database.
However, it is not
possible to test all possible subsequent crosses, as the set size for such a
procedure is
theoretically infinite. That is, after generating a progeny matrix of
expression products for all
possible pair-wise parental crosses, the progeny matrix can be used to
generate a possible
theoretical set of crosses between the hypothesized progeny represented by the
progeny
matrix and/or the original database of parental expression profiles. A
resulting expression
profile matrix can be generated for hypothesized subsequent progeny, which can
again be
compared to any of the preceding expression profile information. In theory,
this process can
be repeated ad infinitum.
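The growth in the number of candidate crosses can be illustrated with a short
Python sketch. It assumes, as a simple upper-bound model not stated in the text,
that every round pairs all profiles in the current pool (parents plus all
previously hypothesized progeny) and adds one hypothesized progeny per pair.

```python
from math import comb


def candidate_crosses_per_round(n_parents: int, rounds: int) -> list:
    """Count pair-wise crosses per round when each round's progeny join the pool."""
    pool = n_parents
    counts = []
    for _ in range(rounds):
        crosses = comb(pool, 2)  # all unordered pairs in the current pool
        counts.append(crosses)
        pool += crosses          # one hypothesized progeny profile per cross
    return counts


counts = candidate_crosses_per_round(10, 3)
print(counts)  # [45, 1485, 1185030]
```

Starting from only ten parental profiles, the third round already exceeds one
million candidate crosses under this model, which is why only one or a few
rounds are normally worth evaluating.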
More practically, certain rules can be implemented to reduce the total number
of calculations to be performed. For example, matrix information can be
limited to possible
pair-wise crosses for plants from different heterotic groups, or from the same
heterotic group.
In addition, the fidelity of predicted expression profile information
increasingly varies as
subsequent cross information is considered, and of course, the number of
possible crosses
increases. Accordingly, typically only one or a few rounds of potential
crosses are considered
at one time. In any case, selection of a subset of potential crosses from all
of the possible
pair-wise comparisons which exhibit a maximal number of expression profile
differences is
desirable.
A variety of rules for performing the basic comparisons can be used. In one
desirable implementation, crosses are identified in which the sum of: (i)
expression products produced in a first plant from a first heterotic group (A_i)
which are not expressed in a second plant from the first heterotic group (A) to
which the first plant is crossed (A_j), and which are not expressed in a
selected third plant from a second heterotic group (B), plus (ii) the expression
products produced in A_j which are not produced in A_i and which are not
produced in B, is optimized. This optimization results in crosses which achieve
elevated numbers of expression products expressed in heterotic hybrid progeny,
and also in an optimization of the number of dominant products expressed.
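The sum described in (i) and (ii) can be sketched in Python with set operations.
The sketch assumes that profiles are sets of detected expression products and
that "optimized" means maximized; the function names and example data are
hypothetical.

```python
from itertools import combinations


def unique_product_sum(a_i: set, a_j: set, b: set) -> int:
    """(i) products in A_i but in neither A_j nor B, plus (ii) the converse for A_j."""
    return len(a_i - (a_j | b)) + len(a_j - (a_i | b))


def rank_group_a_crosses(group_a: dict, tester_b: set) -> list:
    """Score every pair within heterotic group A against tester B, highest first."""
    scored = [
        ((i, j), unique_product_sum(group_a[i], group_a[j], tester_b))
        for i, j in combinations(sorted(group_a), 2)
    ]
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical profiles for three group A plants and one group B tester.
group_a = {
    "A1": {"p1", "p2", "p3"},
    "A2": {"p3", "p4"},
    "A3": {"p1", "p5", "p6"},
}
tester_b = {"p3", "p6"}

ranking = rank_group_a_crosses(group_a, tester_b)
print(ranking)
```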
In another optimization protocol, optimization is achieved by determining all
possible pair-wise combinations from the first heterotic group and identifying
the cross which results in the largest sum of expression products, or by
determining all possible pair-wise combinations from the first heterotic group
and identifying crosses which result in a hybrid progeny (A_i x A_j) with a
maximal number of differences as compared to B, or by determining all possible
pair-wise combinations from the first heterotic group and identifying crosses
which result in the hybrid progeny (A_i x A_j) having a greater number of
differences with B than the number of differences between B and A_i or B and
A_j. As above, this
optimization
results in crosses which achieve elevated numbers of expression products
expressed in
heterotic hybrid progeny, and also in an optimization of the number of
dominant products
expressed.
Such implementations can also be used to improve selection methods per se.
For example, in one method, self or back-crossed progeny derived from the A_i x
A_j hybrid are selected which either retain a set of expression products defined
by the sum of expression products expressed in A_i (but not A_j or B) and A_j
(but not A_i or B), or which show a larger number of expression products
expressed in a topcross with B than does either A_i or A_j when topcrossed with
B.
One approach for comparing profiles is a nested analysis in which expression
profiles are successively grouped together, and the many gene expression
differences seen in
individual pair-wise comparisons can be ranked hierarchically in a filtering
process. This
method is useful for identifying genes expressed in one set of genotypes vs.
another, e.g.
hybrids vs. inbreds or bulked segregants from the two ends of a quantitative
phenotypic
distribution.
In any case, the methods of the invention can include inputting an expression
profile for progeny or parental plants into a database of expression profiles.
This can be
performed manually, but is more typically performed in an automated system.
Computer databases of expression profile information can be quite large, with
from a few up to several thousand profiles in the database. Typically, the
database will have
expression product profiles of a representative sample of expression products
for hybrid
progeny plants resulting from at least 10 separate inbred plant crosses, or at
least 10 inbred
plant expression product profiles.
The phrase "computer system" or "integrated system" in the context of this
invention refers to a system in which data entering a computer corresponds to
physical
objects or processes external to the computer, e.g., nucleic acid
hybridization or protein
binding data and a process that, within a computer, causes a physical
transformation of the
input signals to different output signals. In other words, the input data,
e.g., hybridization of
expression products on a specific array, is transformed to output data, e.g.,
the identification
or counting of the sequence hybridized, comparison to similar arrays with
different test
materials, counting and categorization of expression products or the like. The
process within
the computer is a program by which positive (or negative) hybridization
signals are
recognized by the computer system and attributed to a region of an array, or
other expression
profile format (e.g., simple counting of array signals). The program then
determines which
region of the array the hybridized expression products are located on and,
optionally, the
specific corresponding sequences which the probe is based on (as noted above,
no sequence
information is required for making or assessing expression profiles).
The invention provides integrated systems for plant or plant cell manipulation
and hybridization analysis. Typical systems include a digital computer with
high-throughput
liquid control software, image analysis software, and data interpretation
software. A robotic
liquid control armature for transferring solutions (e.g., plant cell extracts)
from a source to a
destination, is typically operably linked to the digital computer. An input
device for entering
data to the digital computer to control high throughput liquid transfer by the
robotic liquid
control armature and, optionally, to control transfer by the pinning armature
to the solid
support is commonly a feature of the integrated system, as is an image scanner
for digitizing label signals from labeled probe hybridized to the DNA on the
solid support operably linked to the digital computer. The image scanner
interfaces with the image analysis software to provide a measurement of probe
label intensity, where the probe label intensity measurement is interpreted by
the data interpretation software to show whether, and to what degree, the
labeled probe hybridizes to the DNA on the solid support.
A number of well known robotic systems have also been developed for
solution phase chemistries. These systems include automated workstations like
the automated
synthesis apparatus developed by Takeda Chemical Industries, LTD. (Osaka,
Japan) and
many robotic systems utilizing robotic arms (Zymate II, Zymark Corporation,
Hopkinton,
Mass.; Orca, Hewlett-Packard, Palo Alto, Calif.) which mimic the manual
synthetic
operations performed by a scientist. Any of the above devices are suitable for
use with the
present invention. The nature and implementation of modifications to these
devices (if any)
so that they can operate as discussed herein with reference to the integrated
system will be
apparent to persons skilled in the relevant art.
High throughput screening systems are commercially available (see, e.g.,
Zymark Corp., Hopkinton, MA; Air Technical Industries, Mentor, OH; Beckman
Instruments, Inc., Fullerton, CA; Precision Systems, Inc., Natick, MA, etc.).
These systems typically automate entire procedures including all sample and
reagent pipetting, liquid dispensing, timed incubations, and final readings of
the microplate in detector(s) appropriate for the assay. These configurable
systems provide high throughput and rapid start up as well as a high degree of
flexibility and customization. For example, the currently available commercial
software package, BioWorks 1.4, provided by Beckman Instruments, Inc. to control
and operate their Biomek 2000 robotics liquid handler supports a scripting
capability based on the publicly available Tool Command Language (TCL). Beckman
has incorporated a TCL interpreter into the Biomek 2000 and has included TCL
extensions (Bioscript) to allow direct motor control and other instrument
functionality. A 16-bit (to run under Microsoft Windows 3.1 and Microsoft
Windows 95) application to generate the TCL Bioscript code can be created, e.g.,
in Microsoft Visual Basic 4.0.
The manufacturers of such systems provide detailed protocols for the various high
throughput systems. Thus, for example, Zymark Corp. provides technical bulletins
describing
screening systems for detecting the modulation of gene transcription, ligand
binding, and the
like. More recently, microfluidic approaches to reagent manipulation have been
developed,
e.g., by Caliper Technologies (Palo Alto, CA).
Optical images viewed (and, optionally, recorded) by a camera or other
recording device (e.g., a photodiode and data storage device) are optionally
further processed-
in any of the embodiments herein, e.g., by digitizing the image and/or storing
and analyzing
the image on a computer. A variety of commercially available peripheral
equipment and
software is available for digitizing, storing and analyzing a digitized video
or digitized optical
image, e.g., using PC (Intel x86 or Pentium chip-compatible DOS™, OS2™,
WINDOWS™, WINDOWS NT™ or WINDOWS95™ based machines), MACINTOSH™, or UNIX based
(e.g., SUN™ work station) computers.
One conventional system carries light from the specimen field to a cooled
charge-coupled device (CCD) camera, in common use in the art. A CCD camera
includes an
array of picture elements (pixels). The light from the specimen is imaged on
the CCD.
Particular pixels corresponding to regions of the specimen (e.g., individual
hybridization sites
on an array of biological polymers) are sampled to obtain light intensity
readings for each
position. Multiple pixels are processed in parallel to increase speed. The
apparatus and
methods of the invention are easily used for viewing any sample, e.g., by
fluorescent or dark
field microscopic techniques.
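The sampling of pixels for each hybridization site can be sketched in Python as
follows. This is an illustration only, assuming the CCD image is available as a
row-major grid of intensity values and that each site is a square block of
pixels identified by its top-left corner; the names and values are hypothetical.

```python
def region_mean(image, top, left, height, width):
    """Average the pixel intensities inside one rectangular region."""
    pixels = [
        image[r][c]
        for r in range(top, top + height)
        for c in range(left, left + width)
    ]
    return sum(pixels) / len(pixels)


def read_sites(image, sites, spot_size):
    """Return one mean intensity reading per named hybridization site."""
    return {
        name: region_mean(image, top, left, spot_size, spot_size)
        for name, (top, left) in sites.items()
    }


# Hypothetical 4 x 4 pixel image covering four 2 x 2 hybridization sites.
image = [
    [10, 12, 200, 210],
    [11, 13, 205, 215],
    [90, 95, 0, 2],
    [100, 105, 1, 3],
]
sites = {"site_a": (0, 0), "site_b": (0, 2), "site_c": (2, 0), "site_d": (2, 2)}

readings = read_sites(image, sites, spot_size=2)
print(readings)
```

In a real system the per-site readings would be computed in parallel and passed
to the data interpretation software; the loop here is sequential for clarity.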
Integrated systems for hybridization analysis of the present invention
typically
include a digital computer with high-throughput liquid control software, image
analysis
software, data interpretation software, a robotic liquid control armature for
transferring
solutions from a source to a destination operably linked to the digital
computer, an input
device (e.g., a computer keyboard) for entering data to the digital computer
to control high
throughput liquid transfer by the robotic liquid control armature and,
optionally, an image
scanner for digitizing label signals from labeled probe hybridized to
expression products, e.g.,
on a solid support operably linked to the digital computer. The image scanner
interfaces with
the image analysis software to provide a measurement of probe label intensity.
Typically, the
probe label intensity measurement is interpreted by the data interpretation
software to show
whether the labeled probe hybridizes to the DNA on the solid support.
Software to support sample processing can be divided into 4 functional
categories: 1) liquid transfer control software, 2) image analysis software,
3) data
management software, and 4) data interpretation software.
Conveniently, applications can share information through data files which the
applications can read and create. For flexibility and ease of use, files can
be formatted as
simple text files and/or in Microsoft Excel or other worksheet format. This
allows viewing and editing of the files through the use of commercially
available software such as Microsoft Excel. Those of skill in the art will
recognize that this approach is only one possible set of systems that could be
used in the support and facilitation of the process of the present invention.
Other systems can easily be designed to fit the particular needs of the user in
the practice of the invention. By way of example, and not limitation, a
Microsoft Windows user interface can be developed for most applications using
Microsoft Visual Basic 4.0. Most applications can be developed for a 32-bit
environment to run under Microsoft Windows 95 or 98. 16-bit applications such as
image analysis software developed by
Optimas Corporation, Optimas 5.0, can also be useful components of the
integrated system.
CLONING OF EXPRESSION PRODUCTS
Any nucleic acid encoding an expression product identified as being of
interest
by the expression profiling techniques noted herein, including dominant,
additive and over or
under dominant expression products can be cloned. It is expected that many
such nucleic
acids, particularly dominant and additive nucleic acids will be encoded by
loci responsible for
desirable quantitative traits ("QTL"; see Edwards et al. (1987) Genetics
115:113). QTL
include genes that control, to some degree, numerically quantifiable
phenotypic traits such as
disease resistance, crop yield, resistance to environmental extremes, etc. In
addition to the
methods herein, other experimental paradigms can be used to identify, analyze
and select for
QTL. One paradigm involves crossing two inbred lines and genotyping multiple
marker loci
and evaluating one to several quantitative phenotypic traits among the progeny
of the cross.
QTL are then identified and ultimately selected for based on significant
statistical
associations between the genotypic values determined by genetic marker
technology and the
phenotypic variability among the segregating progeny.
As applied to the present invention, particular nucleic
acids which encode dominant, additive or under or over dominant expression
products, or
which encode silenced expression products, are potential products of QTLs or
other genes or
loci of interest. Accordingly, it is desirable to clone nucleic acids which
are genetically
linked to DNAs encoding these expression products for transduction into cells
(e.g., coding
sequences for expression products, or genetically linked coding or non-coding
sequences),
especially to make transgenic plants. The cloned sequences are also useful as
molecular tags
for selected plant strains, e.g., to identify parentage, and are further
useful for encoding
expression products, including nucleic acids and polypeptides. Often,
expression products
which are differentially expressed between heterotic and non-heterotic plants
are encoded by
QTL and are responsible for the phenotypic effects of the QTL.
A DNA linked to a locus encoding an expression product is introduced into
plant cells, either in culture or in organs of a plant, e.g., leaves, stems,
fruit, seed, etc. The
expression of natural or synthetic nucleic acids encoded by nucleic acids
linked to expression
product coding nucleic acids can be achieved by operably linking a cloned
nucleic acid of
interest, such as an expression product or a genetically linked nucleic acid,
to a promoter,
incorporating the construct into an expression vector and introducing the
vector into a
suitable host cell. Alternatively, an endogenous promoter linked to the
nucleic acids can be
used.
Cloning of Expression Product Sequences into Bacterial Hosts
There are several well-known methods of introducing expression product
nucleic acids into bacterial cells, any of which may be used in the present
invention. These
include: fusion of the recipient cells with bacterial protoplasts containing
the DNA,
electroporation, projectile bombardment, and infection with viral vectors,
etc. Bacterial cells
are often used to amplify (i.e., increase the number of) plasmids containing DNA
constructs of this
invention. The bacteria are grown to log phase and the plasmids within the
bacteria can be
isolated by a variety of methods known in the art (see, for instance,
Sambrook). In addition, a
plethora of kits are commercially available for the purification of plasmids
from bacteria. For
their proper use, follow the manufacturer's instructions (see, for example,
EasyPrep™,
FlexiPrep™, both from Pharmacia Biotech; StrataClean™, from Stratagene;
and, QIAexpress
Expression System™ from Qiagen). The isolated and purified plasmids are then
further
manipulated to produce other plasmids, used to transfect plant cells or
incorporated into
Agrobacterium tumefaciens related vectors to infect plants. Typical vectors
contain


transcription and translation terminators, transcription and translation
initiation sequences,
and promoters useful for regulation of the expression of the particular
nucleic acid. The
vectors optionally comprise generic expression cassettes containing at least
one independent
terminator sequence, sequences permitting replication of the cassette in
eukaryotes, or
prokaryotes, or both, (e.g., shuttle vectors) and selection markers for both
prokaryotic and
eukaryotic systems. Vectors are suitable for replication and integration in
prokaryotes,
eukaryotes, or preferably both. See, Giliman & Smith, Gene 8:81 (1979);
Roberts, et al.,
Nature, 328:731 (1987); Schneider, B., et al., Protein Expr. Purif. 6435:10
(1995); Berger,
Sambrook, Ausubel (all supra). A catalogue of Bacteria and Bacteriophages
useful for
cloning is provided, e.g., by the ATCC, e.g., The ATCC Catalogue of Bacteria
and
Bacteriophage (1992) Gherna et al. (eds) published by the ATCC. Additional
basic
procedures for sequencing, cloning and other aspects of molecular biology and
underlying
theoretical considerations are also found in Watson et al. (1992) Recombinant
DNA Second
Edition Scientific American Books, NY.
Transfecting and Manipulating Plant Cells
Methods of transducing plant cells with nucleic acids are generally available.
In addition to Berger, Ausubel and Sambrook, useful general references for
plant cell cloning,
culture and regeneration include Payne et al. (1992) Plant Cell and Tissue
Culture in Liquid
Systems John Wiley & Sons, Inc. New York, NY (Payne); and Gamborg and Phillips
(eds)
(1995) Plant Cell, Tissue and Organ Culture; Fundamental Methods Springer Lab
Manual,
Springer-Verlag (Berlin Heidelberg New York) (Gamborg). A variety of cell
culture media
are described in Atlas and Parks (eds) The Handbook of Microbiological Media
(1993) CRC
Press, Boca Raton, FL (Atlas). Additional information for plant cell culture
is found in
available commercial literature such as the Life Science Research Cell Culture
Catalogue
(1998) from Sigma-Aldrich, Inc (St Louis, MO) (Sigma-LSRCCC) and, e.g., the
Plant
Culture Catalogue and supplement (1997) also from Sigma-Aldrich, Inc (St
Louis, MO)
(Sigma-PCCS).
The nucleic acid constructs of the invention are introduced into plant cells,
either in culture or in the organs of a plant by a variety of conventional
techniques. For
example, the DNA construct can be introduced directly into the genomic DNA of
the plant
cell using techniques such as electroporation and microinjection of plant cell
protoplasts, or


the DNA constructs can be introduced directly to plant cells using ballistic
methods, such as
DNA particle bombardment. Alternatively, the DNA constructs are combined with
suitable
T-DNA flanking regions and introduced into a conventional Agrobacterium
tumefaciens host
vector. The virulence functions of the Agrobacterium tumefaciens host direct
the insertion
of the construct and adjacent marker into the plant cell DNA when the cell
is infected by the
bacteria.
Microinjection techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using
polyethylene
glycol precipitation is described in Paszkowski, et al., EMBO J. 3:2717
(1984).
Electroporation techniques are described in Fromm, et al., Proc. Nat'l.
Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein, et
al., Nature
327:70-73 (1987).
Agrobacterium tumefaciens-mediated transformation techniques, including
disarming and use of binary vectors, are also well described in the scientific
literature. See,
for example Horsch, et al., Science 233:496-498 (1984), and Fraley, et al.,
Proc. Nat'l. Acad.
Sci. USA 80:4803 (1983). Agrobacterium-mediated transformation is a preferred
method of
transformation of dicots.
To use isolated sequences corresponding to or linked to expression products in
the above techniques, recombinant DNA vectors suitable for transformation of
plant cells are
prepared. A DNA sequence coding for the desired mRNA, polypeptide, or
non-expressed
sequence is transduced into the plant. Where the sequence is expressed, the
sequence is
optionally combined with transcriptional and translational initiation
regulatory sequences
which will direct the transcription of the sequence from the gene in the
intended tissues of the
transformed plant.
Promoters, in nucleic acids linked to loci identified by detecting
expression
products, are identified, e.g., by analyzing the 5' sequences upstream of a
coding sequence in
linkage disequilibrium with the loci. Optionally, such promoters will be
associated with a
QTL. Sequences characteristic of promoter sequences can be used to identify
the promoter.
Sequences controlling eukaryotic gene expression have been extensively
studied. For
instance, promoter sequence elements include the TATA box consensus
sequence
(TATAAT), which is usually 20 to 30 base pairs upstream of a transcription
start site. In


most instances the TATA box aids in accurate transcription initiation. In
plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a
promoter element
with a series of adenines surrounding the trinucleotide G (or T) N G. See,
e.g., J. Messing, et
al., in Genetic Engineering in Plants, pp. 221-227 (Kosage, Meredith and
Hollaender, eds.
(1983)). A number of methods are known to those of skill in the art for
identifying and
characterizing promoter regions in plant genomic DNA. See, e.g., Jordano, et
al., Plant Cell
1:855-866 (1989); Bustos, et al., Plant Cell 1:839-854 (1989); Green, et al.,
EMBO J.
7:4035-4044 (1988); Meier, et al., Plant Cell 3:309-316 (1991); and Zhang, et
al., Plant
Physiology 110:1069-1079 (1996).
In construction of recombinant expression cassettes of the invention, a plant
promoter fragment is optionally employed which directs expression of a nucleic
acid in any or
all tissues of a regenerated plant. Examples of constitutive promoters include
the cauliflower
mosaic virus (CaMV) 35S transcription initiation region, the 1'- or 2'-
promoter derived from
T-DNA of Agrobacterium tumefaciens, and other transcription initiation
regions from various
plant genes known to those of skill. Alternatively, the plant promoter may
direct expression
of the polynucleotide of the invention in a specific tissue (tissue-specific
promoters) or may
be otherwise under more precise environmental control (inducible promoters).
Examples of
tissue-specific promoters under developmental control include promoters that
initiate
transcription only in certain tissues, such as fruit, seeds, or flowers.
Any of a number of promoters which direct transcription in plant cells can be
suitable. The promoter can be either constitutive or inducible. In addition to
the promoters
noted above, promoters of bacterial origin which operate in plants include the
octopine
synthase promoter, the nopaline synthase promoter and other promoters derived
from native
Ti plasmids. See, Herrera-Estrella et al. (1983), Nature, 303:209-213. Viral
promoters
include the 35S and 19S RNA promoters of cauliflower mosaic virus. See, Odell
et al.
(1985) Nature, 313:810-812. Other plant promoters include the
ribulose-1,5-bisphosphate
carboxylase small subunit promoter and the phaseolin promoter. The promoter
sequence
from the E8 gene and other genes may also be used. The isolation and sequence
of the E8
promoter are described in detail in Deikman and Fischer, (1988) EMBO J.
7:3315-3327.


If polypeptide expression is desired, a polyadenylation region at the 3'-end
of
the coding region is typically included. The polyadenylation region can be
derived from the
natural gene, from a variety of other plant genes, or from T-DNA.
The vector comprising the sequences (e.g., promoters or coding regions) from
genes encoding expression products of the invention will typically comprise a
nucleic acid subsequence, such as a marker gene, which confers a selectable
phenotype on plant cells. For example, the marker may encode biocide
tolerance, particularly antibiotic tolerance, such as tolerance to kanamycin,
G418, bleomycin, hygromycin, or herbicide tolerance, such as tolerance to
chlorsulfuron, or phosphinothricin (the active ingredient in the herbicides
bialaphos and Basta). For example, crop selectivity to specific herbicides can
be conferred by engineering genes into crops which encode appropriate
herbicide metabolizing enzymes from other organisms, such as microbes. See,
Padgette et al. (1996) "New weed control opportunities: Development of
soybeans with a Roundup Ready™ gene" In: Herbicide-Resistant Crops (Duke,
ed.), pp 53-84, CRC Lewis Publishers, Boca Raton ("Padgette, 1996"); and Vasil
(1996) "Phosphinothricin-resistant crops" In: Herbicide-Resistant Crops (Duke,
ed.), pp 85-91, CRC Lewis Publishers, Boca Raton ("Vasil, 1996").
Transgenic plants have been engineered to express a variety of herbicide
tolerance/metabolizing genes, from a variety of organisms. For example,
acetohydroxy acid
synthase, which has been found to make plants which express this enzyme
resistant to
multiple types of herbicides, has been cloned into a variety of plants (see,
e.g., Hattori, J., et
al. (1995) Mol. Gen. Genet. 246(4):419). Other genes that confer tolerance to
herbicides
include: a gene encoding a chimeric protein of rat cytochrome P4501A1 and
yeast NADPH-cytochrome P450 oxidoreductase (Shiota, et al. (1994) Plant
Physiol. 106(1):17), genes for glutathione reductase and superoxide dismutase
(Aono, et al. (1995) Plant Cell Physiol. 36(8):1687), and genes for various
phosphotransferases (Datta, et al. (1992) Plant Mol. Biol. 20(4):619).
Similarly, crop selectivity can be conferred by altering the gene coding for a
herbicide target site so that the altered protein is no longer inhibited by
the herbicide (Padgette, 1996). Several such crops have been engineered with
specific microbial enzymes to confer selectivity to specific herbicides
(Vasil, 1996).


Further, nucleic acids which can be cloned and introduced into plants to
modify or complement expression of a gene, including a silenced gene, a
dominant gene, an additive gene or the like, can be any of a variety of
constructs, depending on the particular application. Thus, a nucleic acid
encoding a cDNA expressed from an identified gene can be expressed in a plant
under the control of a heterologous promoter. Similarly, a nucleic acid
encoding a transcription factor that regulates a target identified by the
methods herein, or that encodes any other moiety affecting transcription, can
be cloned and transduced into a plant. Methods of identifying such factors are
replete throughout the literature. For a basic introduction to genetic
regulation, see, Lewin (1995) Genes V Oxford University Press Inc., NY
(Lewin), and the references cited therein.
Regeneration of Transgenic Plants
Transformed plant cells which are derived by any of the above transformation
techniques can be cultured to regenerate a whole plant which possesses the
transformed
genotype and thus the desired phenotype. Such regeneration techniques rely on
manipulation
of certain phytohormones in a tissue culture growth medium, typically relying
on a biocide
and/or herbicide marker which has been introduced together with the desired
nucleotide
sequences. Plant regeneration from cultured protoplasts is described in Evans,
et al.,
Protoplasts Isolation and Culture, Handbook of Plant Cell Culture, pp. 124-176,
Macmillan
Publishing Company, New York, (1983); and Binding, Regeneration of Plants,
Plant
Protoplasts, pp. 21-73, CRC Press, Boca Raton, (1985). Regeneration can also
be obtained
from plant callus, explants, somatic embryos (Dandekar, et al., J. Tissue
Cult. Meth. 12:145
(1989); McGranahan, et al., Plant Cell Rep. 8:512 (1990)), organs, or parts
thereof. Such
regeneration techniques are described generally in Klee, et al., Ann. Rev. of
Plant Phys.
38:467-486 (1987).
One of skill will recognize that after the expression cassette is stably
incorporated in transgenic plants and confirmed to be operable, it can be
introduced into other
plants by sexual crossing. Any of a number of standard breeding techniques can
be used,
depending upon the species to be crossed.
GENE SILENCING AND HETEROSIS
It is discovered that gene silencing and epigenetic effects play a role in
inbreeding depression. As demonstrated herein, the number of genes in hybrids
with a


dominant pattern of gene expression is correlated with hybrid yield, a
component of which is
found to be relief from inbreeding depression. Another way of considering
genes in this
class is to classify them as genes that are expressed at lower levels in one
inbred parent than
the other. When one copy of a gene that is expressed at low levels in one
inbred is combined
with a copy from another inbred, a frequent outcome in the hybrid is an
equivalent level of
expression to that seen with two copies of the gene in one or other of the
parental inbreds
(most often the more highly expressing parent).
The number of genes in the dominant class was considered as a function of
the number of hybrids that share those genes, and the frequency distribution
indicated that the overlap between sets of genes contributing to dominant
patterns of gene expression in hybrids is essentially random. This suggests
that, during the process of inbreeding, expression of a subset of genes may
always be altered (and usually reduced), and that different random subsets of
genes are silenced in different inbreds.
These results agree well with the classical complementation concepts of
metabolic balance and physiological bottlenecks (Hageman et al. 1967 "A
biochemical
approach to corn breeding" Advan. Agron. 19:45; Schrader, L.E. 1985
"Selection for
metabolic balance in maize" pp79-89 in Exploitation of physiological and
genetic variability
to enhance crop productivity. Harper J.E.(ed). Waverly Press, Baltimore; and
Mangelsdorf,
A.J. 1952 "Gene interaction in Heterosis" pp321-329 in Heterosis, Gowen, J. (ed)
Iowa State
College Press, Ames) to explain heterosis. This hypothesis proposes that maize
inbred lines
have unbalanced metabolic systems with some enzymes at optimum level and some
at rate
limiting levels, or bottlenecks. Hybrids from inbred lines that have different
rate limiting
systems can overcome the bottlenecks by complementation. Depending on the gene
product,
a favorable allele can become an unfavorable allele in a different
developmental stage; and
vice-versa. Complementation, therefore, results not only from quantitative
aspects, i.e.,
variation in the level of expression, but also from qualitative aspects, e.g.,
variation in
function due to sequence polymorphisms.
Closely related crosses are less heterotic because, firstly, there are fewer
band
differences, either in level of expression or in sequence polymorphism,
therefore fewer
heterozygous loci providing potential opportunities for complementation.
Secondly, loci
from closely related crosses are more susceptible to gene silencing. In more
distantly related


crosses, the inbred parents have a higher number of differential bands, and
the resulting
hybrid tends to express both alleles providing better complementation of
unfavorable parental
alleles. Such complementation allows for better responses to differing
environments or
during different developmental stages.
Without being bound to a particular theory, epigenetics provides a simple and
elegant explanation for these effects. In Drosophila and other organisms,
allelic (and non-allelic) effects have been described where expression in
heterozygotes is normal, but in homozygotes trans-inactivation (or silencing)
of both alleles occurs. These effects are mediated by cis-acting regulatory
sequences that need to be present at more than one copy (e.g. on different
chromosome homologs) to mediate the cooperative assembly of multimeric
protein complexes responsible for gene silencing (e.g., Polycomb proteins in
Drosophila or SIR proteins in yeast). In maize, sequences responsible for
these effects most likely occur in intergenic regions outside of the chromatin
loops flanked by MARs that contain genes. About 80% of the sequences in these
regions are derived from retroelements that may be transcriptionally silenced
through natural selection. However, the intergenic regions are also where
maize exhibits most DNA sequence polymorphism. Thus, homozygosity of certain
intergenic regions in inbreds could lead to adjacent gene silencing, whereas
in hybrids fewer intergenic regions will be homozygous for sites that can
assemble silencing complexes, so more genes will be derepressed. As new
inbreds are created from hybrid crosses, recombination randomizes the
intergenic regions across the genome, thereby resulting in a new subset of
genes that are silenced when those regions that can assemble silencing
complexes are made homozygous. This model explains why inbreds express fewer
genes than hybrids (which accounts for their lower yield) and why the number
of genes that exhibit a dominant pattern of gene expression in hybrids
increases as the percent relationship between inbreds decreases. It also can
easily accommodate potential explanations for the existence of heterotic
pools, and the higher level of heterosis seen in maize as compared to other
cereals (e.g. rice), which have a very different genome organization and level
of sequence polymorphism. Finally, it is also possible that in maize, where
natural inbreeding occurs infrequently because of its floral characteristics,
natural selection may not have acted to eliminate gene silencing at the same
rate as in self-fertilizing species.


MOLECULAR SECURITY; IDENTIFICATION OF PARENTAL SOURCES BY
COMPARISON OF EXPRESSION PROFILES
One general concern in the agricultural industry is that proprietary plant
stocks
or other sources of germ plasm can sometimes be inadvertently, or even
deliberately,
misappropriated. Because the germ plasm may be recombined with other sources
of germ
plasm before producing a product such as a hybrid seed, it is not always
possible to tell that
the product is improperly derived from proprietary parental plants, clones, or
the like.
The present invention provides methods of identifying unique expression
products and/or unique profiles (or partial profiles). This ability to
identify unique expression
products provides one way of ascertaining parentage, which, in turn, provides
the ability to
determine whether a hybrid comprises proprietary material.
In the methods, a source or the sources of a test plant such as a hybrid can
be
identified. In the methods, a representative sample of expression products
from the test plant
is profiled and the resulting test expression profile is compared to a
database of known
expression profiles for plants from known inbred or hybrid strains (methods of
making such
databases are described above). For example, the expression profiles for a
selected tissue can
be entered into a database for any or every proprietary plant (or clone, or
any other source of
germ plasm) that a corporation owns.
By profiling a number of plants, it is possible to detect unique expression
products and/or expression patterns within the expression profile of specific
plants. It is also
possible to generate likely expression profiles for hybrid products of members
of the
database. Any of these expression profiles can be compared to an actual
expression profile
for a test plant suspected of being derived from one or more proprietary
plants. For example,
a matrix of pair-wise comparisons for potential progeny from the expression
profiles in the
database can be compared to the test expression profile. Either the entire
expression profile
or a sub portion of the expression profile (i.e., a plurality of characters
corresponding to
expression products found in the overall profile) comprising at least one
unique expression
marker can be evaluated.
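By way of illustration only, the database comparison described above can be sketched in Python. The sketch assumes that each expression profile is reduced to a set of band identifiers, that the predicted profile of a hybrid is simply the union of its parents' bands, and that similarity is scored with a Jaccard index. None of these choices, nor the names `unique_markers`, `rank_candidate_crosses` or the example lines and bands, come from the present disclosure; they are hypothetical.

```python
from itertools import combinations


def jaccard(p, q):
    """Jaccard similarity of two band sets (1.0 when both are empty)."""
    union = p | q
    return len(p & q) / len(union) if union else 1.0


def unique_markers(database):
    """For each line, return the bands found in that line and in no other."""
    result = {}
    for name, bands in database.items():
        others = set().union(*(b for n, b in database.items() if n != name))
        result[name] = bands - others
    return result


def rank_candidate_crosses(database, test_profile):
    """Score every pair-wise cross in the database against a test profile.

    Assumption: the predicted hybrid profile is the union of parental bands.
    Returns (score, parent_1, parent_2) tuples, best match first.
    """
    scores = []
    for n1, n2 in combinations(sorted(database), 2):
        predicted = database[n1] | database[n2]
        scores.append((jaccard(predicted, test_profile), n1, n2))
    return sorted(scores, reverse=True)


# Hypothetical database of proprietary lines and a hypothetical test plant.
database = {
    "line_A": {"b1", "b2", "b3"},
    "line_B": {"b3", "b4"},
    "line_C": {"b5", "b6"},
}
test_profile = {"b1", "b2", "b3", "b4"}

markers = unique_markers(database)
ranking = rank_candidate_crosses(database, test_profile)
best_score, parent_1, parent_2 = ranking[0]
print(parent_1, parent_2, best_score)
```

A real comparison would more likely weight bands by expression level and restrict scoring to a sub portion of the profile containing at least one unique marker; the set-based version is used here only because it is easy to follow.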
EXAMPLES
The following examples are offered by way of illustration, and are not
intended to be limiting. One of skill will immediately recognize a variety of
alternate


procedures, compositions, reagents and the like which can be substituted for
those
exemplified below.
EXAMPLE 1: DIFFERENCES IN RNA EXPRESSION PROFILES CORRELATE
WITH HETEROSIS
Heterosis is a term used to describe the increased vigor of hybrid progeny in
comparison to their parents. Although heterosis has been widely used in plant
breeding for
many decades, the molecular mechanisms underlying the phenomenon were
previously
unknown. In this example, heterosis was studied as a phenotype using CuraGen
(CuraGen
Corp., New Haven CT) RNA profiling technology to examine differences in RNA
expression
between hybrids and their inbred parents. Using this approach, it was
possible to sort out
cDNA fragments into different categories, depending on their relative levels
of expression in
a given hybrid and its two parents. Data indicated a difference in the number
of genes in each
category (dominant, under-dominant, over-dominant, additive) between heterotic
and non-
heterotic hybrids. The results also suggested the ability of this approach to
explain the
molecular basis of heterosis and the application of the information obtained
to plant breeding
methods.
The degree of heterosis varies tremendously among hybrids from different
parental combinations. In current breeding practice, selection for parent
combinations which
give a high degree of heterosis depends on top-cross yield tests. In this
disclosure, new
methods of monitoring heterosis by identifying genes and gene expression
patterns associated
with heterosis expression are provided. Specific gene expression patterns
associated with
heterosis are identified prior to yield testing. This allows screening of
larger numbers of top-
crosses without having to yield test all combinations. Similarly, non-
optimally expressed
genes in existing commercial hybrids can be identified and improved by
transgenic
manipulation or gene-expression profile assisted selection.
"PAR" names herein are arbitrary predesignations of commercial and
proprietary strain names. Because the invention is applicable to any crop
strain, the particular
strains used are not critical, or even relevant, to the claimed invention.
Accordingly, actual
crop strain names are not provided.
The PAR, series of hybrids used for RNA profiling are listed in Table 1.


In Table 1, these hybrids range from a highly heterotic commercial hybrid
(PAR, = PAR,/PAR~) to sibling crosses (e.g. PAR,/PAR,7) which have much less
heterosis.
Each hybrid is derived from the same female parent (PAR,) and a male parent
with a different
percentage pedigree relationship. The correlation between heterosis and
pedigree relationship
is given in Table 1. Figure 1 graphically represents the correlation between
degree of
heterosis and % relationship: % relationship is designated on the X axis; F1-
MP heterosis in
bu/LCR is given on the Y axis. Data were obtained from 4 locations in JH97.
Table 1

Hybrid       Inbred parents   % Relationship   F1-MP Heterosis (bu/LCR)
PAR /PAR,    PAR  PAR,          0.8            74.65
PAR /PAR     PAR  PAR           1.1            67.15
PAR /PAR     PAR  PAR           1.5            69.75
PAR /PAR     PAR  PAR,          2.4            80.20
PAR /PAR     PAR  PAR           3.1            53.55
PAR /PAR     PAR  PAR           4.5            71.65
PAR /PAR     PAR  PAR          20.7            50.40
PAR /PAR     PAR  PAR          45.9            12.75
PAR /PAR     PAR  PAR          47.3            34.80
PAR /PAR     PAR, PAR          48.1            55.20
PAR /PAR     PAR  PAR          48.8            43.95
PAR /PAR     PAR  PAR          60.5            50.05
PAR /PAR     PAR  PAR          62.4            37.50
PAR /PAR     PAR  PAR          63.9            43.10
PAR /PAR     PAR  PAR          71.0            48.50
PAR /PAR     PAR  PAR          85.5            25.10


In seedlings and immature ears, 90-95% of RNAs in each F1 hybrid were
expressed at the same levels as in both parental inbreds. Genetically
distantly related inbreds,
e.g., the parents of commercial hybrids, had less than 6% of the mRNAs
differentially
expressed. The number of differentially expressed RNA bands between two inbred
parents


was positively correlated with the corresponding hybrid yield, demonstrating
that either gene
expression differences and/or DNA sequence polymorphism between inbred parents
are
important for heterosis.
The level of RNA expression in the hybrid can differ from one inbred parent
or the other (dominant), or both (additive or over-/under-dominant). Figure 2
depicts the classification of gene expression patterns in F1 hybrids relative
to the inbred parents. RNA levels are provided on the vertical axis. Bands in
each class exhibited the following expression patterns: (A) Over/under-dominant
class: the level of expression in the F1 hybrid is at least two-fold higher or
lower than both parents, which have either equal or different levels of
expression. In the (B) additive and (C) dominant classes, the mRNA levels of
the inbred parents are different. Additive class: the F1's expression level
falls within the range of the two parents. Dominant class: the level of
expression in the F1 hybrid is equal to one parent but different from the
other. The majority of RNA expression level differences in both tissues of all
hybrids analyzed were in the additive and dominant classes. Two-thirds of the
differences observed exhibited additive expression, and the rest of the
differences demonstrated a dominant expression pattern. Furthermore, the
number of dominant and additive RNA fragments correlated with the degree of
heterosis. Initial studies of both seedlings and immature ears demonstrated
correlations between the number of RNA fragments in the over-/under-dominant
class and the % relationship between the inbred parents. Five commercial
hybrids (PAR,-PARZ~) selected for high yield all had high numbers of dominant
and additive bands and a lower number of under/over dominant bands.
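The class definitions given for Figure 2 can be expressed, purely as an illustrative sketch, as a small Python function. The two-fold threshold for the over/under-dominant class is taken from the text; the relative tolerance `tol` used to decide that two levels are "equal", the `UNCHANGED` and `UNCLASSIFIED` outcomes, and all names are assumptions introduced here and are not part of the disclosed method.

```python
from enum import Enum


class Pattern(Enum):
    UNCHANGED = "unchanged"
    ADDITIVE = "additive"
    DOMINANT = "dominant"
    OVER_UNDER_DOMINANT = "over/under-dominant"
    UNCLASSIFIED = "unclassified"


def classify_band(parent_a, parent_b, f1, fold=2.0, tol=0.25):
    """Classify one band from relative RNA levels in parents A, B and the F1.

    fold: factor by which the F1 must exceed the higher parent, or fall below
          the lower parent, to count as over/under-dominant (two-fold in text).
    tol:  assumed relative tolerance within which two levels count as equal.
    """
    if min(parent_a, parent_b, f1) <= 0:
        raise ValueError("expression levels must be positive")

    def same(x, y):
        return abs(x - y) <= tol * max(x, y)

    high, low = max(parent_a, parent_b), min(parent_a, parent_b)
    # F1 at least `fold` times above the higher or below the lower parent.
    if f1 >= fold * high or f1 * fold <= low:
        return Pattern.OVER_UNDER_DOMINANT
    f1_matches_a_parent = same(f1, parent_a) or same(f1, parent_b)
    if same(parent_a, parent_b):
        # Parents equal: no additive or dominant call is possible.
        return Pattern.UNCHANGED if f1_matches_a_parent else Pattern.UNCLASSIFIED
    if f1_matches_a_parent:
        return Pattern.DOMINANT   # F1 equals one parent, differs from the other
    if low < f1 < high:
        return Pattern.ADDITIVE   # F1 falls within the parental range
    return Pattern.UNCLASSIFIED


print(classify_band(1.0, 3.0, 2.0).value)
print(classify_band(1.0, 3.0, 3.1).value)
print(classify_band(2.0, 2.0, 5.0).value)
```

Applying such a function to every band of a hybrid and tallying the results would give per-class counts of the kind reported in the tables; the choice of tolerance strongly affects the additive/dominant split and would need to be calibrated against replicate variation.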
A new metric that measures the genetic distance between the two parents and
the frequency of non-additively expressed RNAs in the over-/under-dominant
class was developed. This was defined as the ratio between the sum of the RNA
fragment numbers that differ from the hybrid in each of the two parents and
the number of RNAs that differ between the two parents [i.e.,
((A-F1)+(B-F1))/(A-B)]. High yielding commercial hybrids between distantly
related parents and with fewer over-/under-dominant RNAs give a lower ratio
close to 1.0 (Tables 2, 3 & 4). Figure 3 illustrates the correlation of gene
expression patterns with hybrid yield. Hybrid yield in bu/LCR is given on the
X axis, while % of bands in each expression class is given on the Y axis (% of
bands different: dotted line; % of additive bands: dashed line; and % of
dominant bands: solid line).
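As a worked check of the ratio just defined, the following short Python sketch computes it from band-difference counts. The function name `distance_ratio` is hypothetical; the two sets of counts are taken from the first and fourth comparisons listed in Table 4.

```python
def distance_ratio(a_vs_b, a_vs_f1, b_vs_f1):
    """Ratio ((A-F1) + (B-F1)) / (A-B) computed from band-difference counts.

    a_vs_b:  number of bands that differ between inbred parents A and B
    a_vs_f1: number of bands that differ between parent A and the F1 hybrid
    b_vs_f1: number of bands that differ between parent B and the F1 hybrid
    """
    if a_vs_b <= 0:
        raise ValueError("parents must differ in at least one band")
    return (a_vs_f1 + b_vs_f1) / a_vs_b


# Counts from Table 4: (A-B, A-F1, B-F1)
high_heterosis = distance_ratio(1309, 602, 647)   # heterosis 59.4
low_heterosis = distance_ratio(505, 745, 748)     # heterosis 37.4

print(round(high_heterosis, 2))  # 0.95
print(round(low_heterosis, 2))   # 2.96
```

The grouping of the numerator matters: only when the two hybrid-versus-parent counts are summed before dividing do the values reproduce the ratios tabulated in Table 4.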


[Table 2: rotated table; contents not recoverable from the source text.]
The number of RNA fragments in the additive class was higher in all heterotic
hybrids which include five commercial hybrids (PAR,/PARz (+PAR,9], PARzo,
PAR,"
PAR~2, PAR", PAR,/PAR~). The same trend was also found in seedling tissues of
selected
hybrids analyzed. There was also a strong correlation between the number of
dominant RNA
bands and Quo of yield heterosis (Figure 3).
Table 3. Gene expression patterns of hybrids in relation to heterosis. Total
no. of bands assayed is approximately 14,000 for all genotypes. 1% is about
140 bands. % of bands different (A-B): % of bands differentially expressed
when comparing the inbred parents of corresponding hybrids. % of bands
additive or dominant: % of bands where F1 had an additive or dominant
expression pattern, respectively. Expression data based on 3 sample
replicates.

Hybrid       % of pedigree   Hybrid yield   % of bands        % of bands   % of bands
             relationship    (Bu/acre)      different (A-B)   additive     dominant
PAR,/PARE     0.8            125.7          5.6               3.7          1.9
PAR~/PAR3     1.1            123.4          5.6               3.6          2.0
PAR,/PAR,6   71.0             99.0          3.6               2.4          1.2
PARt/PAR9    46.0             68.6          0.8               0.5          0.3
PAR,/PAR"    85.5             67.1          0.5               0.3          0.2
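The relationship between the Table 3 columns can be examined with a short Python sketch. The values below are copied from Table 3; the helper names `pearson` and `pct_to_bands`, and the use of a Pearson coefficient as the measure of correlation, are assumptions made for illustration and are not stated in the disclosure.

```python
from math import sqrt

TOTAL_BANDS = 14000  # approximate number of bands assayed per genotype

# Columns of Table 3, one entry per hybrid, in table order.
hybrid_yield = [125.7, 123.4, 99.0, 68.6, 67.1]
pct_different = [5.6, 5.6, 3.6, 0.8, 0.5]
pct_additive = [3.7, 3.6, 2.4, 0.5, 0.3]
pct_dominant = [1.9, 2.0, 1.2, 0.3, 0.2]


def pct_to_bands(pct):
    """Convert a percentage of bands into an approximate band count."""
    return round(pct / 100 * TOTAL_BANDS)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r_dominant = pearson(hybrid_yield, pct_dominant)
r_additive = pearson(hybrid_yield, pct_additive)
print(pct_to_bands(5.6), r_dominant > 0, r_additive > 0)
```

In each row the additive and dominant percentages sum to the percentage of bands that differ between the parents, so the three columns are not independent measurements; with only five hybrids, any coefficient computed this way is indicative at most.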


One way to interpret these data is to assume that for every gene, there is an
optimal level of expression. Different inbreds may have subsets of genes that are
expressed either below or above the optimum, thus contributing to their poor vigor.
In hybrids, many genes expressed in parent A, but not in parent B, may be expressed
at the same level as in parent A and vice versa. Thus, hybrids will have more genes
expressed at an optimum level than either parent A or parent B, and the genes
expressed at optimum level in A and B will complement those expressed at sub- or
supra-optimal level in the other inbred. This arrangement is represented graphically
in Figure 4. (Panel A illustrates the dominant class, panels B and C illustrate the
additive and over-/under-dominant classes, respectively. H: "optimum" level of mRNA
expression). Thus, high heterosis is associated with an increase in the dominant and
additive classes and a decrease in the over-/under-dominant class. In


crosses between related inbreds, the additive class may disappear as more of these
genes are likely to be iso-allelic.
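
The three expression classes discussed above can be illustrated with a small sketch. This is only one possible reading of the classes, not the procedure used in the study: the function name `classify_band` and the relative tolerance `tol` are hypothetical, and "additive" is taken here to mean any F1 level strictly between the two parents.

```python
def classify_band(a, b, f1, tol=0.1):
    """Classify one RNA band from the expression levels of inbred parents
    A and B and their hybrid F1.

    tol is a hypothetical relative tolerance for treating two levels as equal.
    """
    def same(x, y):
        return abs(x - y) <= tol * max(abs(x), abs(y))

    parents_equal = same(a, b)
    if same(f1, a) or same(f1, b):
        # F1 matches a parent: dominant if the parents themselves differ.
        return "not differential" if parents_equal else "dominant"
    if min(a, b) < f1 < max(a, b):
        # F1 lies between the two parents.
        return "additive"
    # F1 lies outside the parental range.
    return "over-/under-dominant"


print(classify_band(10, 2, 10))   # F1 equals parent A
print(classify_band(10, 2, 6))    # F1 between the parents
print(classify_band(10, 2, 15))   # F1 above both parents
```

In practice the tolerance would have to reflect the replicate-to-replicate noise of the profiling method.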
Table 4.

Comparison of RNA profiles     No. of        No. of     Ratio of              Heterosis
(inbreds denoted by A or B,    differences   dominant   (A-F1)+(B-F1)/(A-B)   (% of F1)
hybrids by F1)                               bands

PAR -PAR (A-B)                 1309          1227       0.95                  59.4
PAR -PAR (A-F1)                 602
PAR -PAR (B-F1)                 647

PAR -PAR (A-B)                 1404          1297       0.93                  54.4
PAR -PAR xPAR (A-F1)            706
PAR -PAR xPAR (B-F1)            604

PAR -PAR (A-B)                  952           870       1.12                  48.4
PAR -PAR xPAR (A-F1)            601
PAR -PAR xPAR (B-F1)            464

PAR -PAR (A-B)                  505           252       2.96                  37.4
PAR -PAR xPAR (A-F1)            745
PAR -PAR xPAR (B-F1)            748

PAR -PAR (A-B)                  359           276       2.47                  18.6
PAR -PAR xPAR (A-F1)            510
PAR -PAR xPAR (B-F1)            376

PAR -PAR (A-B)                 1487          1327       0.99                  > 50
PAR -PAR (A-F1)                 762
PAR -PAR (B-F1)                 704

PAR -PAR (A-B)                 1565          1425       0.81                  > 50
PAR -PAR (A-F1)                 625
PAR -PAR (B-F1)                 640

PAR -PAR (A-B)                 1403          1258       0.93                  > 50
PAR -PAR (A-F1)                 724
PAR -PAR (B-F1)                 574
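
The ratio column of Table 4 can be recomputed from the three difference counts. The sketch below is illustrative only; the function name `profile_ratio` is hypothetical, and the input numbers are the first five row groups of Table 4.

```python
def profile_ratio(a_vs_f1, b_vs_f1, a_vs_b):
    """Ratio (A-F1)+(B-F1)/(A-B): parent-to-hybrid differences relative to
    the difference between the two inbred parents."""
    return (a_vs_f1 + b_vs_f1) / a_vs_b


# (A-F1, B-F1, A-B, heterosis as % of F1), taken from Table 4
table4 = [
    (602, 647, 1309, 59.4),
    (706, 604, 1404, 54.4),
    (601, 464, 952, 48.4),
    (745, 748, 505, 37.4),
    (510, 376, 359, 18.6),
]

ratios = [round(profile_ratio(af, bf, ab), 2) for af, bf, ab, _ in table4]
for (af, bf, ab, het), r in zip(table4, ratios):
    print(f"heterosis {het:5.1f}%  ratio {r:.2f}")
```

For these rows the ratio stays near 1 in the more heterotic hybrids and rises well above 2 in the two least heterotic ones.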




The poor correlation between the number of genes in the over-/under-dominant class
and the degree of heterosis is surprising. The data suggest that when breeders
select for highly heterotic hybrids, they may also be selecting against genes that
fall into this class. The logical extension to this argument would be that if
derivatives, e.g., of PAR19 are selected or engineered that have fewer or no genes
that fall into this class, they will have a higher yield than PAR19 itself.
EXAMPLE 2: PREDICTING HETEROSIS FROM ANALYSIS OF SHARED
ADDITIVE BANDS; IDENTIFICATION OF GENES INVOLVED IN HETEROSIS
Immature ear mRNA was profiled from 10 hybrids and their respective inbred parents.
The genotypes profiled included a number of commercial hybrids and a set from the
"PAR27 series," in which PAR27 was used as a common female with a series of males
that differed in percent relationship. Differentially expressed bands among hybrids
and inbred parents were categorized according to whether they were additive,
non-additive [= over-/under-dominant] or dominant. Analysis of this set of data from
profiles of all 10 hybrids showed the following.
First, there was an inverse correlation between heterosis and the number of
non-additively expressed sequences. Second, the number of RNA fragments in the
additive class was higher in all heterotic hybrids analyzed, which include five
commercial hybrids (PAR19, PAR20, PAR21, PAR22, PAR23) and PAR24/PAR25 (PAR26
cross). Third, the data also indicated a strong correlation between the number of
dominant RNA bands and the degree of heterosis.
Table 5: # of Additive Bands

# of Additive Bands   # of Hybrids sharing
  26                  5 or more
  94                  4 or more
 262                  3 or more
 612                  2 or more
1635                  1 or more



Identifying and cloning genes in common to the additive and dominant classes amongst
a series of highly heterotic hybrids that share little relationship to each other by
pedigree is of value. In comparing the additive class of all 10 hybrids, the


additive bands occurring in one or more of the 10 hybrids were considered. The
results are shown in Table 5.
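
A cumulative tally of the kind shown in Table 5 can be sketched as follows. The band and hybrid names are made-up placeholders, and `shared_band_counts` is a hypothetical helper, not part of the described method.

```python
from collections import Counter


def shared_band_counts(additive_bands_by_hybrid):
    """For each k, count the additive bands that occur in k or more hybrids."""
    occurrences = Counter(
        band
        for bands in additive_bands_by_hybrid.values()
        for band in set(bands)
    )
    max_k = max(occurrences.values(), default=0)
    return {
        k: sum(1 for n in occurrences.values() if n >= k)
        for k in range(1, max_k + 1)
    }


# Hypothetical additive-band sets for three hybrids
example = {
    "hybrid_1": {"band_a", "band_b", "band_c"},
    "hybrid_2": {"band_b", "band_c"},
    "hybrid_3": {"band_c", "band_d"},
}
counts = shared_band_counts(example)
print(counts)
```

Each entry corresponds to one row of a table laid out like Table 5 ("k or more" hybrids sharing a band).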
The maximum number of hybrids an additive band occurred in was seven out of 10. By
analyzing the frequency of bands that occurred in each group mentioned above, a
constant pattern in the bands shared between heterotic hybrids was detected in all
groups (see Table 6). Heterosis prediction based on this data gave the following
rank: 10>8>2>7>6>1>3>9>4>5. The corresponding hybrids are: PAR23 > PAR22 >
PAR27/PAR25 > PAR21 > PAR20 > PAR19 > PAR27/PAR43 > PAR24/PAR25 > PAR2~lPAR3, >
PAR~.~'AR~,. Compared with the actual yield heterosis data in Table 6, the ranking
is very close to the yield.
Table 6: Heterosis/Corresponding Hybrid Information

Hybrid   Corresponding           Heterosis           Frequency    Frequency    Frequency     Frequency
         hybrid                  (% of F1)           (26 bands)   (94 bands)   (262 bands)   (612 bands) *
 1       PARz,/PAR,,, (PAR19)    59.4                15           52           103           167
 2       PAR27/PAR25             54.4                20           57           137           212
 3       PAR27/PAR43             48.5                13           36            63            94
 4       PAR=r/PAR,7             25.1                 6            8            12            15
 5       PAR=.,/PAR"             12.8                 1            2             4             6
 6       PAR20                   >50                 15           48           107           214
 7       PAR21                   >50                 17           52           124           252
 8       PAR22                   >50                 21           62           142           253
10       PAR23                   Commercial hybrid   24           67           139           253
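
The predicted rank quoted above appears consistent with simply ordering the hybrids by their frequency in the 26-band group of Table 6. That reading is an inference from the table values, not a stated method, so the sketch below should be taken as illustrative. Hybrid 9 is omitted because its row is not present in the table as reproduced here, and hybrids 1 and 6 tie at 15, so their relative order is not determined by this column alone.

```python
# Frequency in the 26-band group, keyed by hybrid number (values from Table 6)
freq_26 = {1: 15, 2: 20, 3: 13, 4: 6, 5: 1, 6: 15, 7: 17, 8: 21, 10: 24}

# Highest frequency first; ties keep their original order.
ranked = sorted(freq_26, key=freq_26.get, reverse=True)
print(ranked)
```

Apart from the tie and the missing hybrid 9, the resulting order matches the rank 10>8>2>7>6>1>3>9>4>5 given in the text.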


The expression patterns of the two SS x SS hybrids, PAR46/PAR48 and PAR46/PAR4~,
were also informative. The pedigree relationships of the two hybrids are similar
(23% and 27%, respectively); however, the heterosis levels differ significantly:
2.3% for the former and 36.6% for the latter. The difference in the number of
additive bands between the two is striking (1 and 60).
EXAMPLE 3: ANALYSIS OF DOMINANT GENE EXPRESSION CLASS
An additional observation from further data analysis was that there is a difference
in the number of dominant bands contributed by the male vs. female parent, i.e.,
whether the expression level in F1 is the same as that of the male or female parent.
The number of dominant bands contributed by the male parent is consistently higher
across all hybrids analyzed, regardless of the degree of heterosis. However, there
is a better correlation between the yield heterosis and the number of dominant bands
contributed by female parents than by male parents, especially with the PAR27
series.
When the dominant bands were grouped according to whether they are up- or
down-regulated in the hybrid, that is, whether the hybrid is the same as either the
higher or


lower parent, F1 tends to have an expression level closer to the parent with higher
expression. This is especially true with the RNA bands that are similar between an
F1 hybrid and its male parent.
The consistent association of higher numbers of dominant and additive RNA bands with
heterotic hybrids, regardless of genetic backgrounds or developmental stages, and
the tendency of up-regulated gene expression of dominant bands in hybrids, suggested
that genes in hybrids are in a more active phase than in inbreds. Secondly, more
genes are in such an active condition in heterotic hybrids than in poor hybrids.
Thus, genes are mostly silenced or inactivated when their regulatory elements are in
a homozygous condition, e.g., in inbreds, but re-activated when in a heterozygous
condition. Hybrids derived from two inbreds that have optimal complementation to
each other, giving rise to a heterozygous condition for most of these regulatory
elements, had a maximal number of genes "re-activated" and were, therefore,
heterotic. Crosses of closely related inbreds or inbred lines that did not have such
"optimal complementation" had fewer genes re-activated and produced low heterotic
hybrids.
Table 7. Difference in the numbers of dominant bands contributed by male vs. female
parent

Genotypes       % REL   Heterosis   Total no. of      Total no. of   No. of dominant   No. of dominant   Ratio of
                        (% of F1)   bands different   dominant       bands by          bands by          male to
                                    (A-B)             bands          male parent       female parent     female
PAR_; PAR_,     0.08    59.4        1309              605            346               259               1.34:1
PART; PAR_s     0.11    54.4        1404              646            419               227               1.85:1
PARi,-PAR"      71      48.4         952              462            299               163               1.83:1
PARl,-PAR"      86      37.4         505              241            133               108               1.23:1
PAR"-PAR"       45      18.6         359              180            134                46               2.91:1
PAR,,-PAR",     0.04    com         1487              739            415               324               1.28:1
PARro PAR"      0.04    com         1565              751            417               333               1.25:1
PAR,_-PAR"      0.01    com         1403              650            370               280               1.32:1
PARE,-PAR=f     0.21    PARE        1484              681            444               237               1.87:1
PAR,; PAR,f     0.03    com         1297              568            364               204               1.78:1
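
The male-to-female ratio in Table 7, and the percentage split used later in the text, both follow from the two dominant-band counts. A minimal sketch, with the hypothetical helper name `parental_split` and counts taken from two rows of Table 7:

```python
def parental_split(male, female):
    """Summarise dominant bands matching the male vs. the female parent."""
    total = male + female
    return {
        "ratio": male / female,              # male : female, as x:1
        "male_pct": 100 * male / total,      # share of dominant bands, male
        "female_pct": 100 * female / total,  # share of dominant bands, female
    }


high = parental_split(419, 227)   # row with 54.4% heterosis
low = parental_split(134, 46)     # row with 18.6% heterosis
print(f"{high['ratio']:.2f}:1, {high['male_pct']:.0f}% male")
print(f"{low['ratio']:.2f}:1, {low['male_pct']:.0f}% male")
```

Expressing the split as percentages makes hybrids with different total numbers of dominant bands directly comparable.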


Another correlation (statistical association) from this data set is that there is a
difference in the number of dominant bands contributed by the male vs. female
parent, i.e.,


whether the expression level in F1 is the same as that of the male or female parent.
Figure 5 illustrates parental effects on gene expression of heterotic and
non-heterotic hybrids. The total number of dominant bands was taken as 100% for each
hybrid. (F1=male or female: dominant bands where the F1 hybrid has the same level of
expression as the male or female parent, respectively). The number of dominant bands
contributed by the male parent is consistently higher across all hybrids analyzed,
regardless of whether they are heterotic or non-heterotic (Table 7).
Also, there is a better correlation between the yield heterosis and the number of
dominant bands contributed by female parents than by male parents, especially with
the PAR27 series (Table 7). For example, the least heterotic hybrid, PAR1/PAR17, a
sib cross, had 96% male dominant bands and 4% female dominant bands, whereas the
hybrid exhibiting the highest degree of heterosis, PAR1/PAR2, had 60% and 40% male
and female dominant bands, respectively. When these dominant bands were grouped
according to whether they are up- or down-regulated in the F1, that is, whether the
F1 is the same as either the higher or lower parent, F1 tends to have an expression
level closer to the parent with higher expression. This is especially true with the
RNA bands dominant by the male parents (Table 8).
Table 8. Number of RNA bands where F1 is up or down regulated.

                Male Parent                                           Female Parent
Hybrids         Total       Up Regulated   Down Regulated  Ratio      Total         Up Regulated   Down Regulated  Ratio
                (F1 =       (F1 = higher   (F1 = lower     (up:down)  (F1 =         (F1 = higher   (F1 = lower     (up:down)
                male)       parent)        parent)                    female)       parent)        parent)
PAR:.,/f'ARa    344         258             86             3:1        259           157            102             1.5:1
PAR_.~PARa      419         304            115             2.6:1      227           154             73             2.1:1
PARr,/PAR"      299         206             93             2.2:1      163            96             67             1.4:1
PAR_.~l'AR"     133          97             36             2.7:1      108            51             57             0.9:1
PAR_~PAR"       134         106             28             3.8:1       46            18             28             0.6:1
PARp,/PAR~,     415         274            141             1.9:1      324           152            172             0.9:1
PAR,/PAR"       417         285            132             2.6:1      333           189            144             1.3:1
PARr./PAR"      370         251            119             2.1:1      280           195             85             2.3:1
PAR'/PAR.~      444         270            174             1.6:1      237           165             72             2.3:1
PAR,,~/PAR"     363         196            167             1.2:1      204           144             60             2.4:1




EXAMPLE 4: GENES SPECIFICALLY EXPRESSED IN HIGH YIELDING
COMMERCIAL HYBRIDS
Most of the analyses so far with the RNA profile data described in Example 1 are
based on the expression patterns of F1 hybrids relative to their inbred parents,
such as additive vs. non-additive classifications and the differences of these
categories between heterotic and non-heterotic hybrids. While the results so far
were informative, another way of analyzing this data set is to compare the levels of
RNA expression of poor hybrids with those of heterotic hybrids without any
involvement of their parents. In comparing all 10 hybrids, which include 3 breeding
crosses and 7 commercial hybrids, a list of bands that have a similar expression
level among heterotic hybrids but differ from the non-heterotic hybrids (breeding
crosses) was determined.
EXAMPLE 5: EXPRESSION PROFILING USING DIFFERENT TISSUES FROM
HYBRIDS AND PARENTS
RNA profiling data from hybrid sets (hybrids and their respective parents)
were obtained in maize. Five other sets utilized kernel tissue at 13 days
after pollination
("DAP"). A total of 14 hybrid sets for the immature ear (V 19), five for the
kernel (R2) and
three for the seedling tissue (V3) were profiled.
For immature ear tissue, the 14 hybrid sets analyzed included seven from the PAR27
series, which covers a spectrum of heterosis levels ranging from commercial hybrids
to low heterotic hybrids of sibling crosses; four commercial hybrids from
diversified genetic backgrounds other than the PAR27 series; and three crosses
between inbreds of the same heterotic group, typical of those that would be useful
for breeding new inbreds.
The five hybrid sets where kernel tissue was analyzed and the three hybrid sets
where seedling tissue was analyzed were from the PAR27 series.
Profiling data of all these hybrids from all three tissues analyzed gave similar
expression patterns. However, the immature ear tissue was more informative than
seedling tissue and less complicated than the kernel tissue, which is compounded
with other effects due to pollen


Table 9. N-fold differences in the expression of each hybrid from PAR19.
(Columns 1-9: PAR19 vs. each of nine other hybrids; the last three columns are
non-heterotic hybrids.)

Band ID         1        2       3       4       5       6       7       8       9
d010-163.5      0        0       0       0       0       -2.66   -2.19   -6.49   -4.31
dIlv0-123.2     0        0       -2.66   -2.06   0       0       3.34    7.09    5.14
d0v0-172.4      7.8      0       0       0       0       8.91    18.48   18.44   29.45
d0v0-104.4      0        0       0       0       0       0       2.36    2.55    2.45
g0m0-339.8      0        0       0       0       0       0       4.97    2.82    2.09
gln0-389.3      0        0       0       2.69    -4.03   0       -2.73   -2.57   -2.98
h0c0-173.1      0        0       0       0       0       0       2.43    2.83    2.16
h0c0-285.4      2.07     2.64    0       2.04    0       0       -2.4?   -2.1    -2.88
M1~0-131.4      0        0       0       0       0       3.85    4.83    4.81    5.16
i0e0-45.6       0        0       0       0       0       0       -2.49   -2.26   -2.53
i0a0-237.8      0        0       0       0       0       0       2.59    3.3     2.81
i0a0-242.8      0        0       0       0       0       0       -2.R4   -4.76   -2.87
i0a0-252.1      0        0       0       -2.74   2.72    0       69.21   29.62   69.23
i0c0-95.6       0        0       -2.5    0       0       0       -3.02   -22     -2.76
i0c0-203.4      0        0       0       0       0       0       -2.26   -2.87   -2.04
i0c0-312.1      0        0       0       0       -R.74   0       6.11    7.11    7.9
i0m0-271.5      0        0       0       0       0       0       -2.92   2.39    2.63
I0n0-140.4      0        0       0       0       -2.03   0       2.55    2.1     2.45
I0n0-210.5      -2.13    0       -2.21   0       0       12.09   13.43   8.82    16.27
mla0-89.3       0        0       0       0       0       0       3.41    3.74    2.4
mla0-239.6      -2.54    0       0       0       0       0       -2.8    -4.05   -4.94
mla0-241.6      0        0       0       0       0       0       -4      -3.14   -2.64
mla0-425.1      0        0       -2.34   0       0       0       4.59    6.64    6.1
r0k0-190.7      0        0       0       0       2.45    0       -2.85   -4.4    -9.01
w9c0-128.3      0        0       4.54    0       0       0       2.53    3.63    3.14
w0c0-230.2      0        0       6.83    0       0       0       -2.72   -3.14   -3.13
w0c0-267.2      -15.18   0       3.09    0       -5.12   2.6     6.62    6.35    10.73
w0c0-381.3      0        0       0       0       0       0       17.45   39.08   44.17
wuno-xsl.z      0        0       0       0       0       0       2.64    2.56    2.5
w0h0-406.6      0        0       0       0       0       0       7.34    4.21    7
w0i0-J54.4      0        0       -2.04   0       0       0       3.73    3.65    3.8
w0i0-265.3      0        0       0       0       0       0       -2.51   -2.73   -2.28
y0i0-118.1      0        0       0       0       -2.15   0       4.19    6.57    6.ti5
y0i0-254.4      0        0       0       0       0       4.21    2.6     6.95    5.32


sources, such as xenia, maternal effects, etc. Profile analysis of all the samples
consistently showed similar correlations between profile information and heterosis
to those described in Examples 1, 2 and 3.
The dominant bands from all 14 hybrids, grouped by the number of hybrids sharing
them, are presented in Figure 2. The number of dominant bands shared by one or more
hybrids is normalized to 100%. The data show that the dominant bands shared by two
or more hybrids range from 60-80%; bands shared by three or more hybrids are about
40-50%, and so on. Although the total number of dominant bands was important to make
a heterotic hybrid, the dominant bands shared by a higher number of hybrids may not
necessarily contribute to the heterosis expression.
Since seedlings show the same trend as immature ears, albeit with different genes
involved, it is possible to select at the seedling stage individual hybrid
combinations that express the highest number of genes with a dominant expression
pattern and that have fewest genes in the over-/under-dominant class. Thus, much
larger numbers of F2 top-crosses are screened using this procedure as a first cut
than could be screened by multi-location yield tests alone.
In addition to the analyses above, another way of analyzing the profile data can be
used. In this approach, the levels of RNA expression of poor hybrids and heterotic
hybrids are compared without any involvement of their parents. This approach
examines whether the absolute level of expression of a subset of genes is important
for heterosis, in addition to the additive vs. non-additive expression patterns we
already found. In the dominantly expressed bands, the F1 hybrids tend to have the
same expression levels as the higher parent, i.e., showing overall an up-regulation
of gene expression (Table 10). In


comparing all hybrids with PAR19, 34 bands that have a similar expression level
among heterotic hybrids but differ from the non-heterotic hybrids were identified
(Table 9; the last three columns are non-heterotic hybrids). For these 34 bands, the
3 poor hybrids show either higher or lower expression than PAR19, whereas all other
hybrids, which are heterotic, show no or little difference in expression relative to
PAR19.
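
A band filter in the spirit of this comparison can be sketched as follows. The 2-fold cut-off (`min_fold`), the strict requirement that every heterotic column fall below it, and the function name `marker_bands` are all assumptions made for illustration; the text only says "no or little" difference. The three example rows are taken from Table 9, where the last three columns are the non-heterotic hybrids.

```python
def marker_bands(fold_changes, n_poor=3, min_fold=2.0):
    """Return bands that differ from the reference hybrid only in poor hybrids.

    fold_changes maps band ID -> list of N-fold differences versus the
    reference hybrid, with the poor hybrids in the last n_poor columns.
    """
    selected = []
    for band, values in fold_changes.items():
        heterotic, poor = values[:-n_poor], values[-n_poor:]
        poor_differ = all(abs(v) >= min_fold for v in poor)
        heterotic_similar = all(abs(v) < min_fold for v in heterotic)
        if poor_differ and heterotic_similar:
            selected.append(band)
    return selected


rows = {
    "g0m0-339.8": [0, 0, 0, 0, 0, 0, 4.97, 2.82, 2.09],
    "h0c0-173.1": [0, 0, 0, 0, 0, 0, 2.43, 2.83, 2.16],
    "d0v0-172.4": [7.8, 0, 0, 0, 0, 8.91, 18.48, 18.44, 29.45],
}
selected = marker_bands(rows)
print(selected)
```

The third band is rejected by this strict rule because two heterotic columns also exceed the cut-off; a looser rule (for example, allowing one or two exceptions) would be needed to reproduce a list that tolerates "little" difference.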
Table 10. Predominance of up-regulated bands in the hybrids vs. their parents

             Dominant Bands
Genotype     Total   Up-   Dn-
PARz,-       656     472   184
PARZO-       672     447   225
PAR3z-       629     446   183
PARZ,~       670     435   235
PAR~~-       621     431   190
PAR28-       711     422   289
PAR"-        558     342   216
PAR2~-       588     319   269
PAR46        441     313   128
PAR2,-       549     311   238
PARZ,-       459     304   155
PAR46-       346     249    97
PARZ~-       269     175    94
PAR,,-       191     137    54


EXAMPLE 6: CORRELATIONS TO MALE VERSUS FEMALE PARENTS
As indicated previously, a preponderance of male dominant bands was observed when
immature ear mRNA was profiled from hybrids and their respective inbred parents
(Figure 5). Selected male dominant bands were screened for allelic sequence
polymorphism between inbred parents such that male and female alleles were
identified. Several bands exhibited an allelic polymorphism between the two parental
alleles, and these were further tested for mono- or bi-allelic expression in the F1
hybrids. PCR primers were designed based on the sequence information and used to
amplify cDNAs derived from mRNAs of F1 hybrids. More than 20 cDNA clones derived
from F1 mRNA derived from a single locus were randomly picked and sequenced. All
cDNAs expressed in the F1 were identical to the allele expressed in the male parent
and none were identical to that expressed in the female parent. These results are
illustrated in Figure 6a-c, which show an allelic


expression test of a male dominant band (w0h0) cloned from CuraGen. (A) schematic
representation of polymorphic amplification products; B) sequences of 9 random cDNAs
from 50% PAR1 + 50% PAR2 mRNA used as a control for allelic discrimination in PCR
cloning; C) sequences of 10 random cDNAs from PAR1/PAR2 mRNA are all the same as the
PAR2 allele). This result is consistent with expression of only the male-derived
allele and silencing of the female-derived allele. To ensure that preferential
amplification did not explain the differential amplification results, equal amounts
of mRNA from each parent genotype were mixed and amplified by PCR. Of nine cDNA
clones sequenced from the control reaction, five were from the male parental allele,
and four were from the female parental allele, demonstrating that no discrimination
between the alleles occurred during amplification.
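
As a rough illustration of why the clone counts above are persuasive, one can ask how likely they would be if both alleles were equally represented in the F1 mRNA. The binomial model below (independent clones, each male or female with probability 0.5) is an assumption added here, not an analysis from the source, and `prob_at_least` is a hypothetical helper.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more male-allele clones among n independent clones,
    if each clone carries the male allele with probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


p_10_of_10 = prob_at_least(10, 10)   # 10 of 10 clones male (Figure 6C)
p_20_of_20 = prob_at_least(20, 20)   # all of 20 clones male
p_control = prob_at_least(5, 9)      # control: 5 of 9 clones male

print(f"10/10 male under equal expression: {p_10_of_10:.2e}")
print(f"20/20 male under equal expression: {p_20_of_20:.2e}")
print(f"5 or more of 9 male in the control: {p_control:.2f}")
```

Under this simple model the all-male outcomes in the F1 are very unlikely to arise from equal bi-allelic expression, while the 5:4 split in the control is entirely unremarkable.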
Accordingly, the disclosures and descriptions herein are intended to be
illustrative, but not limiting, of the scope of the invention which is set forth in
the following claims. One of skill will recognize many modifications which fall
within the scope of the following claims. For example, all of the methods and
compositions herein may be used in different combinations to achieve results
selected by one of skill. All publications and patent applications cited herein are
incorporated by reference in their entirety for all purposes, as if each were
specifically indicated to be incorporated by reference.

Administrative Status

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2000-01-19
(85) National Entry 2000-07-03
(87) PCT Publication Date 2000-07-27
Examination Requested 2001-08-22
Dead Application 2005-05-20

Abandonment History

Abandonment Date Reason Reinstatement Date
2002-07-04 R30(2) - Failure to Respond 2003-06-27
2004-05-20 R30(2) - Failure to Respond
2004-05-20 R29 - Failure to Respond
2005-01-19 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Registration of a document - section 124 $100.00 2001-07-12
Application Fee $300.00 2001-07-12
Advance an application for a patent out of its routine order $100.00 2001-08-22
Request for Examination $400.00 2001-08-22
Maintenance Fee - Application - New Act 2 2002-01-21 $100.00 2002-01-04
Maintenance Fee - Application - New Act 3 2003-01-20 $100.00 2003-01-16
Reinstatement - failure to respond to examiners report $200.00 2003-06-27
Maintenance Fee - Application - New Act 4 2004-01-19 $100.00 2004-01-06
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
PIONEER HI-BRED INTERNATIONAL, INC.
Past Owners on Record
BOWEN, BEN
GUO, MEI
SMITH, OSCAR
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

Document Description   Date (yyyy-mm-dd)   Number of pages   Size of Image (KB)
Drawings 2001-07-12 8 145
Abstract 2001-07-12 1 53
Claims 2001-07-12 13 503
Description 2003-06-27 55 2,834
Claims 2003-06-27 13 620
Description 2001-07-12 51 2,770
Description 2001-07-13 55 2,840
Cover Page 2001-11-21 1 31
PCT 2001-07-12 12 514
Assignment 2001-07-12 10 303
Prosecution-Amendment 2001-08-22 2 48
Prosecution-Amendment 2001-07-12 5 102
Prosecution-Amendment 2001-10-30 1 14
Prosecution-Amendment 2002-01-04 3 130
Prosecution-Amendment 2003-06-27 56 2,365
Prosecution-Amendment 2003-11-20 5 252
