Note: Descriptions are shown in the official language in which they were submitted.
CA 02704163 2012-08-10
METHODS, SYSTEMS AND SOFTWARE FOR GENERATING SENTENCES,
AND VISUAL AND AUDIO COMPOSITIONS REPRESENTING SAID SENTENCES
ABSTRACT OF THE DISCLOSURE
The present invention provides methods, systems and software for: rule-based
transforming
of an initial thesaurus-like organized input lexicon into a rule-based lexical
ontology containing
inflectional forms of words represented by means of syntactical-complements-
joining sets the
elements of which are words that act as hidden semantic intermediaries
mediating the syntactic
pairing of words in a sentence; rule-based generating of sentences from
sequences of interdependent
instructions for syntactic pairing of grammatically correct inflectional forms
of words having at least
one semantic intermediary common to specified by said instructions pairs of
syntactical-
complements-joining sets representing said pairs of grammatically correct
inflectional forms in said
rule-based ontology; and, generating of visual and audio compositions
representing said generated
sentences.
SPECIFICATION
This invention relates to the field of computational creativity in the areas
of natural-language
generation (NLG) and the arts.
The goal of computational creativity is to design computer programs that
model, simulate or
replicate human creativity understood as an ability to create novel
representations of pre-existing
ideas or objects, said novel representations arising through transformations
of initial and intermediate representations that are rule-based, statistically
based, or inspired by the functioning of biological systems.
Patented prior works related to the area of natural-language generation
include: US Pat.
7603267 (Issue date 13 Oct 2009); US Pat. 7496621 (Issue date Feb 24, 2009);
US Pat. 7010479
(Issue date Mar 7, 2006); and, US Pat. 7275033 (Issue date Sep 25, 2007).
Examples of non-patented NLG systems developed to date include FUF/Surge,
RealPro,
Penman/KPML, Nitrogen, Amalgam, and Fergus. Considered broadly, presently
known or used
NLG systems differ in the way they represent linguistic knowledge. In respect
to linguistic
knowledge representation, a lexical ontology has become the defining term for
the part of a
language modeling that excludes the instances, yet describes what they can be.
In general terms, a
lexical ontology defines the set of representational primitives with which to
model the domain of
CA 02704163 2010-05-21
linguistic knowledge. Currently popular classes of models for linguistic
knowledge representation
include semantic network models (Quillian), the KL-ONE class of knowledge
representations (used by
Penman/KPML) and neural network models. Overall, current NLG systems are
split between
two traditions: stochastic or Markov language models (LM) that are often
criticized because they
have difficulties explaining future linguistic behavior and creativity; and
formal symbolic (rule-
based) LM based on the Chomsky tradition that are often criticized because
they trivialize the
problem of lexical choice.
The present invention has two components: a natural language generating (NLG)
component dealing with the generation of English ontology and the generation
of English sentences
in an improved natural language generation system of the type commonly called
formal symbolic
(rule-based) systems; and, a composition generating (CGS) component dealing
with the generation of
visual and audio compositions in a system belonging to the general type of
abstract art generating
systems inspired by language and the functioning of biological systems.
The NLG component of the invention includes methods, systems and software for:
transforming an initial input set of 'thesaurus-like organized' words (L)
wherein said words are
represented by means of subsets of synonyms and antonyms to their part of
speech specific senses
into a rule-based ontology set (Lo) wherein inflection-complement-joining
tagged (Jki) forms of
said words are represented by means of syntactical-complement-joining tagged
(PJk) sets
comprising semantic intermediaries that mediate the syntactic pairing of said
words; and rule-based
transforming of a sequence (Sn) of interdependent pairing instructions (kgGig)
into an output sentence (e), each of said instructions encoding a specific
syntactic rule for PJk-based pairing of
grammatically correct inflectional forms (Jkii) of words belonging to said
rule-based ontology set.
The methods provided by the present invention are the result of an improved
rule-based model of language (LM), based on analyzing the connectivity between
syntactically compatible words belonging to the initial input set of words L
and inspired by patterns of cortical organization. Unlike other
rule-based LM, wherein the choice of the words filling in the abstract
framework of a sentence is
reduced to a simple look-up method secondary to grammar, according to the LM
provided herein said
abstract framework is: (i) represented as a sequence of interdependent pairing
instructions for
joining together grammatically correct inflectional forms of words represented
by means of
syntactical-complement-joining sets of identified semantic intermediaries (of
the type commonly
called 'semantic universals'); and, (ii) sequentially filled in with
grammatically correct inflectional
forms of pairs of words that have at least one semantic intermediary in
common.
In respect to its composition generating component (CGS) the present invention
provides
novel means for producing a potentially infinite number of different visual
and audio compositions
representing sentences produced from the finite rule-based ontology set
provided by the present
invention. More particularly, the CGS component of the invention includes
software for: converting
an input sentence and/or a sequence of sentences into an intermediate output
sequence of fragments;
developing and operating visual and audio palettes (COP, ShP, SiP, MP, SP);
converting an input
sequence of fragments (Sf) into an intermediate output visual (Vm) or audio
(Am) motif; converting
an input motif into an output visual (Vc) or audio (Ac) composition
representing a sentence or a
sequence of sentences generated by NLG.
The methods, systems and software provided by the present invention can be
applied in
the areas of natural language generation and the arts for the purpose of
artistic activities such as the
creation of audio-visual installations comprising a potentially infinite number
of different audio-visual
compositions generated and displayed in real time, and/or the creation of a
potentially infinite number
of different visual patterns or original prints.
The invention has been tested, and it can be assumed that the software
component of the
invention (which is written in the Java programming language) is capable of
transforming: the
initial thesaurus-like organized input into a finite rule-based ontology set;
and, inputs from the finite
rule-based ontology set provided herein into sentences, audio compositions,
and a potentially infinite
number of different visual compositions.
While the invention is susceptible to various modifications and alternative
forms, specific
embodiments thereof are shown by way of examples and in the figures and will
herein be described
in detail. It should be noted that in most of the cases arbitrary symbols and
uncommon terms and
abbreviations have been used for the purpose of describing the present
invention.
The drawings which form a part of this specification and wherein like
reference symbols
indicate like elements are briefly described as follows:
Fig. 1 contains a schematic representation illustrating the invention.
Fig. 2 contains schematic representations illustrating the language modeling
provided by the
present invention.
Fig. 3 contains schematic representations of rules for transforming: (A) part
of speech
ordering tag of a source word into a part of speech dominance tag; (B) initial
input sets of synants
into 'causality' output sets; (C) input causality sets into
'multiple-causality' output sets; and, (D)
input multiple-causality sets into 'derivative' output sets.
Fig. 4 contains schematic representations illustrating: (A) types of
inflection-complement
joining tags for words and types of syntactical-complement-joining tags for
sets of semantic
intermediaries; (B) an example of a binary structure composed of two
identification structures; and
(C) the connectivity between a set comprising words that can be used as
semantic intermediaries and
all words in a lexical ontology set.
Fig. 5 contains schematic representations illustrating: (A) types of
prototypes for identifying
semantic intermediaries; (B) an example prototype for identifying semantic
intermediaries; and, (C)
examples of instructions for pairing of inflectional forms of transitive verbs
and objects in a
sentence.
Fig. 6 contains schematic representations illustrating: (A) classes of pairing
instructions; and,
(B) rule-based transforming of examples of sequences of interdependent pairing
instructions into
sentences.
With reference to Fig. 1, an exemplary system for implementing the invention
includes a
natural language generating system (NLG) comprising: a generator of rule-based
ontology set
Lo{w} (unit 06); and, a generator of sentences (unit 07) from words belonging
to Lo{w}. Underlying
said NLG system is an improved rule-based model of language (LM) the result of
analyzing: (a) the
connectivity between two words said connectivity schematically exemplified in
Fig. 2 section A
wherein w1 and w2 are words, xPS symbolizes sets of synonyms and antonyms
(synants) of the primary
part-of-speech specific sense definition(s) of w1 and w2, xNS symbolizes sets
of synants (if any) of
the non-primary part-of-speech specific sense definitions of w1 and w2, SY11
is a set composed of
synants of w1, SY12 is a set composed of synants of the words in SY11, SY1j
is a set composed of
synants of the words in the preceding set SY1(j-1), SY21 is a set composed of
synants of w2, SY22 is a
set composed of the synants of the words in SY21, SY2j is a set composed of
synants of the words in
the preceding set SY2(j-1), and P is a median set composed of synants of words
in both SY1j and
SY2j; and, (b) type(s) of sentence(s) schematically exemplified in Fig. 2
section G and in Fig. 6
section B wherein S1a indicates a subject/verb sentence, S1 indicates an
adjective/subject/verb
sentence, S3b indicates an adjective/subject/verb/object sentence, and so on.
More particularly, it has been assumed that: (i) since the number of sets
(SYij) separating w1
and w2 varies, as exemplified in Fig. 2 section B, wherein SY13 indicates a
set of words (if any) that
are synants of words in both set SY11 and set SY12, SY15 indicates a set of
words (if any) that are
synants of words in both SY13 and SY14, SY23 indicates a set of words (if any)
that are synants of
words in both SY21 and SY22, and SY25 indicates a set of words (if any) that
are synants of words in
both SY23 and SY24; (ii) and since, as exemplified in Fig. 2 section C, the
connectivity between any
pair of syntactically compatible words w1 and w2 can be represented by means
of two sequences
converging in a base set Pc = SY1j ∩ SY2j (referred to herein as a
set of common semantic
intermediaries); (iii) and since, as exemplified in Fig. 2 section D,
different sequences of different
sets built from different synants (xT1 and xT2) can be used to represent an
assumed connectivity
(C1pC2p) between a primary cause-word (w1) and a secondary cause-word (w2),
and/or an assumed
connectivity (C1pE2p) between a cause-word (w1) and an effect-word (w2),
and/or an assumed
connectivity (E1pC2p) between an effect-word (w1) and a cause-word (w2),
and/or an assumed
connectivity (E1pE2p) between a primary effect-word (w1) and a secondary
effect-word (w2); (iv) and
since different syntactical rules (for pairing words in a sentence) can be
represented by means of
different sequences, as exemplified in Fig. 2 section E, wherein xRiRdc are
different sets comprising
selected xT1 or xT2 type synants of w1 and w2, PJ1 and PJ2 are sets of
semantic intermediaries
mediating the syntactic pairing of w1 used as a subject with w2 used as a verb
in a sentence, and PJ4
and PJ3 are sets of semantic intermediaries mediating the syntactic pairing of
w1 used as an object
and w2 used as a verb; therefore:
(i) in a rule-based ontology inflection-complement-joining tagged forms of
words can be
represented by means of syntactical-complement-joining tagged sets of
identified semantic
intermediaries, as exemplified in Fig. 2 section F, wherein PJ1, PJ2, PJ3, PJ4
and PJ5 (for pairing a
subject and an object) are sets of identified semantic intermediaries and Pc
denotes a set of at least
one semantic intermediary common to PJ1 and PJ2;
(ii) and, types of sentences can be represented by means of sequences of
interdependent pairing
instructions, as exemplified in Fig. 2 section G, wherein 11GigI2Gigl3Gigl
represents a sequence of
instructions for generating the sentence (ws1 wvt2 wo1) on the condition that
PJ1 ∩ PJ2 ≠ ∅, PJ4 ∩ PJ3 ≠ ∅ and PJ5 ∩ PJ5 ≠ ∅.
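The pairing condition stated above can be illustrated by a minimal Java sketch. It is not the software of the invention: the class name, the method names and the representation of a word as a map from a PJk tag to a set of semantic intermediaries are assumptions made for illustration only. The sketch checks the three conditions for a subject/verb/object sentence, namely that the subject's PJ1 set and the verb's PJ2 set, the object's PJ4 set and the verb's PJ3 set, and the PJ5 sets of subject and object each share at least one semantic intermediary.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only; names and data layout are hypothetical.
final class PairingConditionSketch {

    /** True when two PJk-tagged sets share at least one semantic intermediary. */
    static boolean shareIntermediary(Set<String> left, Set<String> right) {
        return !Collections.disjoint(left, right);
    }

    /** Looks up a PJk-tagged set of a word; a missing tag is treated as an empty set. */
    static Set<String> pj(Map<String, Set<String>> word, String tag) {
        return word.getOrDefault(tag, Set.of());
    }

    /** Checks PJ1 ∩ PJ2, PJ4 ∩ PJ3 and PJ5 ∩ PJ5 for non-emptiness. */
    static boolean canFormSubjectVerbObject(Map<String, Set<String>> subject,
                                            Map<String, Set<String>> verb,
                                            Map<String, Set<String>> object) {
        return shareIntermediary(pj(subject, "PJ1"), pj(verb, "PJ2"))
            && shareIntermediary(pj(object, "PJ4"), pj(verb, "PJ3"))
            && shareIntermediary(pj(subject, "PJ5"), pj(object, "PJ5"));
    }
}
```

In this sketch a failed condition simply rejects the candidate triple; how the actual system selects alternative words is not shown here.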
It should be understood that other features of LM underlying the NLG system
provided
herein will become apparent from the detailed description of the methods for:
transforming an input
part of speech tag of a source word into a composite part-of-speech tag; the
sequential transforming
of input sets of synants representing part of speech senses of a source word
into 'causality',
'multiple-causality' and derivative sets; transforming a composite part of
speech tag into syntactical-
complement-joining tags for sets of semantic intermediaries; identifying the
part of speech tags of
the words that can be used as semantic intermediaries; identifying of the
elements of syntactical-
complement-joining tagged sets of a source word using trial and error
techniques; and, the
transforming of sequences of interdependent pairing instructions into
sentences.
Referring back to Fig. 1 section A, another exemplary system for implementing
the invention
is a rule-based ontology set generating system comprising: generators of
intermediate temporary
outputs (units 01-04); and a generator of sets of identified semantic
intermediaries (units 05-06).
The rule-based ontology set generating system is now explained with reference
to an initial
input set of words L {w} wherein said words are represented by means of:
(i) part of speech ordering patterns (POSo) - the term POSo as used herein
refers to the order in
which the part of speech usage(s) of a given word are listed in the public
domain Webster's
Dictionary from 1913;
(ii) synonyms and antonyms (referred herein as synants) to their part of
speech specific primary
sense definition (xPS) which is the definition listed as first (for a given
part of speech usage)
in the public domain Webster's Dictionary from 1913; and, if a word has more
than one part-of-speech specific sense, by means of synants to their
non-primary sense
definition(s) (xNS)
which is any definition listed as second, third, and so on (for a given part
of speech usage) in
the public domain Webster's Dictionary from 1913.
Unit L {w} shown in Fig. 1 section A also relates to the input set of words,
stored on a computer-readable
medium, L {w} = {w | w has POSo, w has xPS, w has xNS} where
x = (a,
n, vi, or vt), and where xPS and xNS are proper subsets of L {w}. It should
be understood
that L {w} provided herein is not a dictionary, and that said L {w} has been
compiled solely
for the purpose of the present invention.
Referring back to Fig. 1, unit 01 shown in section A relates to method-010
and to computer-executable
instructions for causing a computing system to carry out the steps
of method-010, said
method comprising rules for: (1) POSo-based generating of
part-of-speech-ordered tags (PoT); (2)
PoT-based generating of alphabetically-ordered part-of-speech tags (PaT); (3)
PoT-based generating
of the part-of-speech-dominance tags (PdT) provided by the present invention;
and (4) the generating
of composite part-of-speech tags (PcT).
(1) POSo of w ∈ L is formally represented by means of a PoT-sequence
P1, P2, ... Pp (p
corresponds to the total number of parts of speech) in which P1 is always
manifested while
P2, ... and Pp can be either manifested or unmanifested (um) and wherein: P1
corresponds to
the part of speech listed first in the POSo of a source word w; P2 corresponds
to the part of
speech (if any) listed second in POSo of said w; and Pp corresponds to a part
of speech (if
any) listed last in POSo of said w. More particularly, the POSo-based
generating of PoT is
explained by way of examples of words that can be used as adjectives, nouns,
and/or
transitive and intransitive verbs as follows:
POSo = (n) of any word w ∈ L is represented by means of PoT = (n, um, um, um);
POSo = (n, vt) of any word w ∈ L is represented by means of PoT = (n, vt, um, um);
POSo = (vt, n, a) of any word w ∈ L is represented by means of PoT = (vt, n, a, um);
POSo = (a, n, vi, vt) of any word w ∈ L is represented by means of PoT = (a, n, vi, vt).
The above exemplified rules are translated into computer-executable
instructions and L
{w} = {w | w has POSo, w has xPS, and w has xNS} is transformed into temporary
output
set L010 {w} = {w | w has PoT, w has xPS, w has xNS}.
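The POSo-to-PoT rules exemplified above amount to padding the ordered list of parts of speech with 'um' up to four positions. The following Java sketch shows one possible way to express this; the class and method names are hypothetical and the use of plain strings for part-of-speech symbols is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names are hypothetical.
final class PoTSketch {
    static final String UM = "um";  // unmanifested position
    static final int SLOTS = 4;     // a, n, vi, vt

    /** Pads a POSo list such as (vt, n, a) to a four-position PoT such as (vt, n, a, um). */
    static List<String> toPoT(List<String> poso) {
        if (poso.isEmpty() || poso.size() > SLOTS) {
            throw new IllegalArgumentException("POSo must list 1 to 4 parts of speech");
        }
        List<String> pot = new ArrayList<>(poso);
        while (pot.size() < SLOTS) {
            pot.add(UM);
        }
        return pot;
    }

    public static void main(String[] args) {
        System.out.println(toPoT(List.of("vt", "n", "a"))); // prints [vt, n, a, um]
    }
}
```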
(2) PaT represents the part(s) of speech of all w ∈ L010 in the same
arbitrarily chosen
(alphabetical) order (a, n, vi, vt) as sequences of 1 and/or 0 in which: 1
denotes that w
belongs to a particular part of speech class; and, 0 denotes that w does not
belong to the class.
More particularly, the PoT-based generating of PaT is explained by way of a
temporary tag
(PaTt) and the following examples:
If w has 'a' as an element of PoT (a ∈ PoT) then PaTt = 1 n vi vt, else PaTt = 0 n vi vt;
If PaTt = 1 n vi vt and n ∈ PoT then PaTt = 11 vi vt, else PaTt = 10 vi vt;
If PaTt = 0 n vi vt and n ∈ PoT then PaTt = 01 vi vt, else PaTt = 00 vi vt;
If PaTt = 11 vi vt and vi ∈ PoT then PaTt = 111 vt, else PaTt = 110 vt;
If PaTt = 10 vi vt and vi ∈ PoT then PaTt = 101 vt, else PaTt = 100 vt;
If PaTt = 01 vi vt and vi ∈ PoT then PaTt = 011 vt, else PaTt = 010 vt;
If PaTt = 00 vi vt and vi ∈ PoT then PaTt = 001 vt, else PaTt = 000 vt;
If PaTt = 111 vt and vt ∈ PoT then PaT = 1111, else PaT = 1110;
If PaTt = 110 vt and vt ∈ PoT then PaT = 1101, else PaT = 1100;
If PaTt = 101 vt and vt ∈ PoT then PaT = 1011, else PaT = 1010;
If PaTt = 100 vt and vt ∈ PoT then PaT = 1001, else PaT = 1000;
If PaTt = 011 vt and vt ∈ PoT then PaT = 0111, else PaT = 0110;
If PaTt = 010 vt and vt ∈ PoT then PaT = 0101, else PaT = 0100;
If PaTt = 001 vt and vt ∈ PoT then PaT = 0011, else PaT = 0010;
If PaTt = 000 vt and vt ∈ PoT then PaT = 0001, else w ∉ L011.
The above exemplified rules are translated into computer-executable
instructions and
L010 {w} is transformed into temporary output L011 {w} = {w | w has PoT, w has
PaT, w
has xPS, w has xNS} wherein PaT is 1111, 1011, 0011, 0001, 1001, 0111, 0101,
1101, 1000,
0100, 0010, 1010, 0110, 1110, or 1100, and PaT ≠ 0000.
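The cascade of PaTt rules above can be read as setting one binary digit per part of speech in the fixed order (a, n, vi, vt). The Java sketch below collapses the cascade into a single loop; this is a simplification offered for illustration, not the rule-by-rule form given in the text. The class and method names are hypothetical, and returning an empty Optional for 0000 is an assumed way of modelling the exclusion of such words from L011.

```java
import java.util.List;
import java.util.Optional;

// Illustrative sketch only; names are hypothetical.
final class PaTSketch {
    /** The fixed alphabetical order of part-of-speech classes used by PaT. */
    static final List<String> ORDER = List.of("a", "n", "vi", "vt");

    /** Returns the four-digit PaT for a PoT, or empty when no class is manifested (0000). */
    static Optional<String> toPaT(List<String> pot) {
        StringBuilder digits = new StringBuilder();
        for (String pos : ORDER) {
            digits.append(pot.contains(pos) ? '1' : '0');
        }
        String pat = digits.toString();
        return pat.equals("0000") ? Optional.empty() : Optional.of(pat);
    }

    public static void main(String[] args) {
        System.out.println(toPaT(List.of("vt", "n", "a", "um"))); // prints Optional[1101]
    }
}
```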
(3) PoT of any word w ∈ L011 is formally represented by means of the part of
speech dominance tag (PdT) provided by the
present invention, each component (PdTc, 1 ≤ c ≤ 4) of
which expresses an assumed relative dominance relationship between two
different part of
speech usages of w, more particularly, an assumed relative dominance
relationship between
pair(s) of manifested part of speech usages of w and between a manifested and
a non-manifested (um) part of speech usage of w. As schematically exemplified
in Fig. 3 section A,
the PoT-based generation of PdT = (PdT1, PdT2, PdT3, PdT4) is explained by way
of an
example type of words having P1 = a as follows:
If P1 = a and P2 = um then PdT1 = an, PdT2 = avi, PdT3 = avt and PdT = (an,
av) based on
the assumption that avi = av and avt = av;
If P1 = a, P2 = n, and P3 = um, then PdT1 = an, PdT2 = av, PdT3 = nv and PdT =
(an, av, nv)
based on the assumption that nvi = nv and nvt = nv;
If P1 = a, P2 = vi, and P3 = um then PdT = (av, an, vn, vivt);
If P1 = a, P2 = vt, and P3 = um then PdT = (av, an, vn, vtvi);
If P1 = a, P2 = n, P3 = vi, and P4 = um then PdT = (an, av, nv, vivt);
If P1 = a, P2 = n, P3 = vt, and P4 = um then PdT = (an, av, nv, vtvi);
If P1 = a, P2 = vi, P3 = n, and P4 = um then PdT = (av, an, vn, vivt);
If P1 = a, P2 = vi, P3 = vt, and P4 = um then PdT = (av, an, vn, vivt);
If P1 = a, P2 = vt, P3 = vi, and P4 = um then PdT = (av, an, vn, vtvi);
If P1 = a, P2 = vt, P3 = n, and P4 = um then PdT = (av, an, vn, vtvi);
If P1 = a, P2 = n, P3 = vi, and P4 = vt then PdT = (av, an, nv, vivt);
If P1 = a, P2 = n, P3 = vt, and P4 = vi then PdT = (av, an, nv, vtvi);
If P1 = a, P2 = vi, P3 = n, and P4 = um then PdT = (av, an, vn, vivt);
If P1 = a, P2 = vi, P3 = vt, and P4 = n then PdT = (av, an, vivt, vn);
If P1 = a, P2 = vt, P3 = vi, and P4 = n then PdT = (av, an, vtvi, vn);
If P1 = a, P2 = vt, P3 = n, and P4 = um then PdT = (av, an, vn, vtvi).
The above exemplified rules for PoT-based generation of PdT are translated
into computer-executable
instructions and L011 {w} is transformed into temporary output set
L012 {w} = {w | w has PoT, w has PaT, w has PdT, w has xPS, w has xNS}. It
should be
noted that for the purpose of the present invention rules for determining
'same' (sPdTc)
and 'inverted' (iPdTc) part of speech dominance components of PdT have been
introduced.
For example, according to said rules, if w1 has PdT = (av, an) and w2 has PdT =
(av, na, nv)
then: w1 and w2 are said to have 'av' as sPdTc; and w1 and w2 are said to have
'an' and 'na'
as iPdTc.
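The 'same' and 'inverted' comparison can be sketched in Java as follows. The sketch assumes that a PdT component is a string made of two part-of-speech symbols (a, n, v, vi or vt) and that an inverted component is the same pair in the opposite order; the class and method names are hypothetical and the source does not prescribe this representation.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and string encoding are hypothetical.
final class DominanceComponentSketch {

    /** Splits a component such as "an" or "vivt" into its two part-of-speech symbols. */
    static List<String> split(String component) {
        List<String> parts = new ArrayList<>();
        int i = 0;
        while (i < component.length()) {
            int len = (component.startsWith("vi", i) || component.startsWith("vt", i)) ? 2 : 1;
            parts.add(component.substring(i, i + len));
            i += len;
        }
        if (parts.size() != 2) {
            throw new IllegalArgumentException("not a pair of symbols: " + component);
        }
        return parts;
    }

    /** Swaps the two symbols of a component, e.g. "an" becomes "na". */
    static String invert(String component) {
        List<String> p = split(component);
        return p.get(1) + p.get(0);
    }

    /** Components of pdt1 that also occur unchanged in pdt2 (sPdTc). */
    static List<String> same(List<String> pdt1, List<String> pdt2) {
        List<String> out = new ArrayList<>();
        for (String c : pdt1) {
            if (pdt2.contains(c)) out.add(c);
        }
        return out;
    }

    /** Components of pdt1 whose inversion occurs in pdt2 (iPdTc, reported from w1's side). */
    static List<String> inverted(List<String> pdt1, List<String> pdt2) {
        List<String> out = new ArrayList<>();
        for (String c : pdt1) {
            if (pdt2.contains(invert(c))) out.add(c);
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> w1 = List.of("av", "an");
        List<String> w2 = List.of("av", "na", "nv");
        System.out.println(same(w1, w2));     // prints [av]
        System.out.println(inverted(w1, w2)); // prints [an]
    }
}
```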
(4) The generating of composite part-of-speech tags (PcT = (PoT, PaT, PdT))
is explained by way
of the following example:
If w has PoT = (n, um, um, um), and w has PaT (0100), and w has PdT = (na, nv)
then w has PcT = (PoT (n, um, um, um), PaT (0100), PdT (nv, na)).
Accordingly, the rules for generating PcT are translated into
computer-executable
instructions and L012 {w} is transformed into output set L013 {w} = {w | w has
PcT, w has
xPS, w has xNS}. Sets L010-L013 may be further developed by adding to the
initial input L
words such as adverbs (adv).
Referring back to Fig. 1, unit 02 shown in section A relates to method-020
and to computer-executable
instructions for causing a computing system to carry out the steps
of method-020 for
rule-based transforming of xPS and xNS into sets provided by the present
invention and referred to herein
as 'causality sets' (CS), which are symbolically represented in the figures as
xT1 and xT2 (x is the a, n, vi
or vt part of speech) and the formation of which is schematically exemplified
in Fig. 3 section B.
(i) With regard to particulars, xT1 (x = a, n, vi, or vt) of a source word w
is formed from:
synonyms of xPS of w that have sPdTc as the word w;
words that have w as a synonym of their xPS and have sPdTc as the word w;
antonyms of xNS of w that have sPdTc as the word w;
words that have w as an antonym of their xNS and have sPdTc as the word w;
synonyms of xNS of w that have iPdTc relative to the word w;
words that have w as a synonym of their xNS and have iPdTc as the word w;
antonyms of xPS of w that have iPdTc as the word w;
words that have w as an antonym of their xPS and have iPdTc as the word w.
(ii) And, with regard to particulars, xT2 (x = a, n, vi, or vt) of the word
w is formed from:
synonyms of xPS of w that have iPdTc as the word w;
words that have w as a synonym of their xPS and have iPdTc as the word w;
antonyms of xNS of w that have iPdTc as the word w;
words that have w as an antonym of any of their xNS and have iPdTc as the word w;
synonyms of any xNS of w that have sPdTc as the word w;
words that have w as a synonym of any of their xNS and have sPdTc as the word w;
antonyms of xPS of w that have sPdTc as the word w;
words that have w as an antonym of their xPS and have sPdTc as the word w.
The above exemplified rules are translated into computer-executable
instructions for
causing a computing system to carry out the steps of method-020 for
transforming the input
set L012 {w} into intermediate output set L020 {w} = {w | w has PcT and w has
CS}, where
CS ≠ ∅ and CS = {aT1, aT2, nT1, nT2, viT1, viT2, vtT1, vtT2}, and where aT1,
aT2, nT1,
nT2, viT1, viT2, vtT1, and vtT2 are proper subsets of L020.
Referring back to Fig. 1, unit 03 shown in section A relates to method-030
and to computer-executable
instructions for causing a computing system to carry out the steps of
method-030
schematically exemplified in Fig. 3 section C and explained as follows:
(1) Each w ∈ L020 is represented by means of four types of intermediate
subsets (xIi) so that:
xI1 = (xT1 ∩ xB1) ≠ ∅, else xI1 = xT1, wherein xB1 (x ≠ um) is composed of the
xT1-synants of the xT1-synants of w;
xI2 = (xT1 ∩ xB2) ≠ ∅, else xI2 = xT1, wherein xB2 (x ≠ um) is composed of the
xT2-synants of the xT1-synants of w;
xI3 = (xT2 ∩ xB3) ≠ ∅, else xI3 = xT2, wherein xB3 (x ≠ um) is composed of the
xT2-synants of the xT2-synants of w;
xI4 = (xT2 ∩ xB4) ≠ ∅, else xI4 = xT2, wherein xB4 (x ≠ um) is composed of the
xT1-synants of the xT2-synants of w.
The above exemplified rules are translated into computer-executable
instructions and
input set L020 {w} is transformed into a temporary output set:
L030 {w} = {w | w has PcT, and w has xIi, 1 ≤ i ≤ 4}.
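Each of the four rules above follows one pattern: take the intersection of a causality set with a second-order set of synants and, if that intersection is empty, fall back to the causality set itself. A minimal Java sketch of that pattern is given below, assuming words are plain strings and reading the garbled operator in the source as set intersection; the class and method names are hypothetical.

```java
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; names are hypothetical.
final class IntermediateSubsetSketch {

    /**
     * Returns base ∩ candidates when that intersection is non-empty,
     * otherwise a copy of base (for example xI1 from xT1 and xB1).
     */
    static Set<String> intersectOrFallback(Set<String> base, Set<String> candidates) {
        Set<String> common = new TreeSet<>(base);
        common.retainAll(candidates);
        return common.isEmpty() ? new TreeSet<>(base) : common;
    }
}
```

For example, with the invented sets xT1 = {calm, quiet} and xB1 = {quiet, loud} the sketch yields {quiet}, while with xB1 = {loud} it yields xT1 unchanged.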
(2) Each w ∈ L030 is further represented by means of 8 types of subsets
(xLi) and respectively
in terms of 8 types of median subsets (xMi) so that:
xL1 (x ≠ um) is composed of the xI1-synants of the xT1-synants of w
and xM1 = xL1 ∩ xT1 ≠ ∅, else xM1 = xT1;
xL2 (x ≠ um) is composed of the xI2-synants of the xT1-synants of w
and xM2 = xL2 ∩ xT1 ≠ ∅, else xM2 = xT1;
xL3 (x ≠ um) is composed of the xI3-synants of the xT1-synants of w
and xM3 = xL3 ∩ xT1 ≠ ∅, else xM3 = xT1;
xL4 (x ≠ um) is composed of the xI4-synants of the xT1-synants of w
and xM4 = xL4 ∩ xT1 ≠ ∅, else xM4 = xT1;
xL5 (x ≠ um) is composed of the xI1-synants of the xT2-synants of w
and xM5 = xL5 ∩ xT2 ≠ ∅, else xM5 = xT2;
xL6 (x ≠ um) is composed of the xI2-synants of the xT2-synants of w
and xM6 = xL6 ∩ xT2 ≠ ∅, else xM6 = xT2;
xL7 (x ≠ um) is composed of the xI3-synants of the xT2-synants of w
and xM7 = xL7 ∩ xT2 ≠ ∅, else xM7 = xT2;
xL8 (x ≠ um) is composed of the xI4-synants of the xT2-synants of w
and xM8 = xL8 ∩ xT2 ≠ ∅, else xM8 = xT2.
And the above exemplified rules are translated into computer-executable
instructions and
input set L030 is transformed into a temporary output set:
L031 {w} = {w | w has PcT, and w has xMi, 1 ≤ i ≤ 8}.
(3) Each w ∈ L031 is further represented by means of temporary sets (x1R and
x2R) so that:
x1R = xM1 ∪ xM2 ∪ xM3 ∪ xM4;
x2R = xM5 ∪ xM6 ∪ xM7 ∪ xM8.
And the above exemplified rules are translated into computer-executable
instructions and
input set L031 is transformed into a temporary output set:
L032 {w} = {w | w has PcT, and w has xiR, 1 ≤ i ≤ 2}.
(4) Each w ∈ L032 is further represented by means of sets xR1, xR2 and xR3
(referred to herein as
multiple-causality sets) so that:
xR1 = x1R - x2R;
xR2 = x1R ∪ x2R;
xR3 = x2R - x1R.
And the above exemplified rules are translated into computer-executable
instructions and
input set L032 is transformed into a temporary output set:
L033 {w} = {w | w has PcT, and w has xRiR, 1 ≤ iR ≤ 3}.
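Steps (3) and (4) above can be sketched with ordinary set operations. The following Java sketch assumes that the operator in the definitions of xR1 and xR3 is set difference and that the operator in xR2 is set union, as the extracted text suggests; both readings are assumptions, as are the class and method names.

```java
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; names are hypothetical.
final class MultipleCausalitySketch {

    /** Union of any number of sets. */
    static Set<String> union(List<Set<String>> sets) {
        Set<String> out = new TreeSet<>();
        for (Set<String> s : sets) {
            out.addAll(s);
        }
        return out;
    }

    /** Elements of a that are not in b. */
    static Set<String> minus(Set<String> a, Set<String> b) {
        Set<String> out = new TreeSet<>(a);
        out.removeAll(b);
        return out;
    }

    /** Returns [xR1, xR2, xR3] computed from the eight median subsets xM1..xM8, in order. */
    static List<Set<String>> multipleCausality(List<Set<String>> medians) {
        if (medians.size() != 8) {
            throw new IllegalArgumentException("expected xM1..xM8");
        }
        Set<String> x1R = union(medians.subList(0, 4)); // xM1 ∪ xM2 ∪ xM3 ∪ xM4
        Set<String> x2R = union(medians.subList(4, 8)); // xM5 ∪ xM6 ∪ xM7 ∪ xM8
        return List.of(minus(x1R, x2R), union(List.of(x1R, x2R)), minus(x2R, x1R));
    }
}
```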
(5) A plurality of derivative sets (xRiRdc) is generated from xRiR
according to rules explained by
way of the following examples and in Fig. 3 section D:
xR12dc (x is a, n, vi, or vt) type sets (such as nR12 and viR12) are generated
by the union of
sets xR1 and xR2;
xR32dc type sets (such as nR32 and viR32) are generated by the union of sets
xR3 and xR2;
xRivi type sets (such as nR1vi, and nR2vi) are generated by selecting all w
belonging to xRi
that have intransitive verb part of speech (vi) as a component of their PcT;
xRiva type sets (such as nR1va, and nR2va) are generated by selecting all w
belonging to
xRi that have va as a component of their PdT;
xRiav type sets (such as nR1av, and nR2av) are generated by selecting all w
belonging to
xRi that have av as a component of their PdT;
xRiavInv type sets, are generated by the union of sets xRiav and xRinv;
xRivtviInviva type sets are generated by the union of xRivtvi, xRiva and xRinv
and so on.
And the above exemplified rules are translated into computer-executable
instructions for
generating xRiRdc sets.
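The two kinds of derivative-set rules exemplified above, union of multiple-causality sets and selection of words by a PdT component, can be sketched in Java as follows. The map from each word to its PdT components is a hypothetical data layout chosen for illustration, as are the class and method names.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Illustrative sketch only; names and data layout are hypothetical.
final class DerivativeSetSketch {

    /** Union of two multiple-causality sets, as for xR12dc = xR1 ∪ xR2. */
    static Set<String> union(Set<String> a, Set<String> b) {
        Set<String> out = new TreeSet<>(a);
        out.addAll(b);
        return out;
    }

    /** Selects the words of xRi whose PdT contains the given component, as for xRiva or xRiav. */
    static Set<String> selectByPdtComponent(Set<String> xRi,
                                            Map<String, List<String>> pdtByWord,
                                            String component) {
        Set<String> out = new TreeSet<>();
        for (String w : xRi) {
            if (pdtByWord.getOrDefault(w, List.of()).contains(component)) {
                out.add(w);
            }
        }
        return out;
    }
}
```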
Referring back to Fig. 1, unit 04 shown in section A relates to
computer-executable instructions
for causing a computing system to carry out the steps of tagging rules
(schematically
exemplified in Fig. 4 section A) explained by means of the following examples:
1. (i) PJ1, PJ4, PJ5, and PJ6 tags are assigned to all w ∈ L033 having
one of the following
PaT: 0100, 1100, 0110, 0101, 0111, 1101, 1110, 1111, 1000, 1010, or 1001;
(ii) PJ2 tags are assigned to all w ∈ L033 having one of the following PaT:
0010, 0011,
1010, 0110, 0111, 1110, or 1111;
(iii) PJ3 tags are assigned to all w ∈ L033 having one of the following PaT:
0001, 0011,
1001, 0101, 0111, 1101, or 1111.
2. (i) J11, J41, J51, and J61 tags are assigned to the plural form of
nouns without an article;
J12, J42, and J52 tags are assigned to their plural forms with a definite
article;
J13, J43, and J53 tags are assigned to their singular forms with an
indefinite article;
indefinite article;
J14, J44, and J54 tags are assigned to their singular forms with a definite
article;
J15, J45, J55, and J65 tags are assigned to their singular forms without an
article;
(ii) J21 tag is assigned to the present-simple-plural forms of all verbs;
J22 tag is assigned to their present-simple-third-person-singular forms;
J23 tag is assigned to their negatory-present-tense-plural forms;
(iii) J31 tag is assigned to the present-simple-plural forms of all transitive
verb-words;
J32 tag is assigned to their present-simple-third-person-singular forms;
J33 tag is assigned to their negatory-present-tense-plural forms;
(iv) J62 tag is assigned to the positive with a definite article form of all
adjectives;
J63 is assigned to their positive with an indefinite article forms;
J64 is assigned to their comparative without-article forms.
In accordance with the above exemplified tagging rules L033 {w} is transformed
into
intermediate output set L040 {w} = {w | w has PcT, w has xRiR, w has Jkij, w
has PJk}
wherein, for example, the form 'birds' has J11, J41, J51, and J61 tags and PJ1,
PJ4, PJ5, and
PJ6 tags, while the form 'do not fly' has J23 and J33 tags and PJ2 and PJ3 tags,
and the form
'do not vanish' has J23 and PJ2. L040 {w} may be further developed by
introducing more
Jkij-tagged inflections and PJk-tags reflecting additional syntactic pairing
rules.
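The PaT-based assignment of PJk tags in rule 1 above can be sketched in Java as a simple table look-up. The three PaT lists are copied from rule 1; the class and method names are hypothetical, and the sketch covers only the PJk tags, not the Jkij inflection tags of rule 2.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; names are hypothetical.
final class PjTagSketch {
    // PaT values listed in rule 1 (i) for PJ1, PJ4, PJ5 and PJ6.
    static final Set<String> PJ1456 = Set.of("0100", "1100", "0110", "0101", "0111",
            "1101", "1110", "1111", "1000", "1010", "1001");
    // PaT values listed in rule 1 (ii) for PJ2.
    static final Set<String> PJ2 = Set.of("0010", "0011", "1010", "0110", "0111", "1110", "1111");
    // PaT values listed in rule 1 (iii) for PJ3.
    static final Set<String> PJ3 = Set.of("0001", "0011", "1001", "0101", "0111", "1101", "1111");

    /** Returns the PJk tags assigned to a word with the given PaT. */
    static List<String> pjTags(String pat) {
        List<String> tags = new ArrayList<>();
        if (PJ1456.contains(pat)) tags.addAll(List.of("PJ1", "PJ4", "PJ5", "PJ6"));
        if (PJ2.contains(pat)) tags.add("PJ2");
        if (PJ3.contains(pat)) tags.add("PJ3");
        return tags;
    }

    public static void main(String[] args) {
        System.out.println(pjTags("0100")); // prints [PJ1, PJ4, PJ5, PJ6]
    }
}
```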
Referring back to Fig. 1, unit 05 shown in section A, relates to method-050
said method
comprising: (1) a stage 1 relating to the forming of a binary structure to
represent the connectivity
between pairs of syntactically compatible words, said structure comprising two
component
identification structures as required by LM provided herein; (2) a stage 2
relating to a trial and error
(T&E) procedure for transforming said identification structures into
prototypes to be used for
identifying the elements (semantic intermediaries) of the PJk-tagged sets as
further required by LM.
1. Included in stage 1 of method-050 are: (A) assumptions (i-v); (B)
requirements (RI-RV); and
(C) the forming of identification structures that comply with the
requirements.
(A) Assumptions (i-v) are inspired by patterns of cortical organization
and are described as
follows:
(i) As schematically exemplified in Fig. 4 section B, the connectivity
between two
syntactically compatible words can be represented by means of a binary
structure (BS)
composed of 'layers' (Lal -L8), the number of which is at most 8, said number 8
corresponding to cortical layers 1, 2, 3, 4a, 4b, 4c, 5 and 6 (as described by
Gordon Shepherd
in Neurobiology, Oxford University Press, 1983).
(ii) As further exemplified in section B in Fig. 4, said BS is composed of two
identification structures (gISm) comprising at least one layer of synants and at
least one layer of semantic intermediaries, and that said two gISm(s) are
correlated in such a way so
that the sum of the number of layers for the first gISm1 and the number of layers
for the second gISm2 does not exceed 8.
(iii) As further exemplified in section B in Fig. 4, each of the two gISm(s)
composing BS
is unfolding alongside two parallel pathways (DPi and IPi) analogous to the
direct (DP) and
the indirect (IP) pathways for the processing of visual information, each step
of the said
pathway corresponding to a layer, said DPi and IPi allowed to unfold alongside
alternative
routes so that overall number of layers forming the DPi and IPi of both
gISm(s) is max. 8.
(iv) As exemplified in section A in Fig. 5, separate BS represent the
connectivity between a subject-word and a verb-word; a verb-word and an
object-word; an object-word and a subject-word; an adjective-modifier and a
subject or an object word. Furthermore, since double part of speech words are
richer in connectivity than single part of speech words, different gISm(s)
represent single, double and triple part of speech words. For example, a BS
representing the connectivity between a single part of speech subject-word and a
double part of speech verb-word is built from one gISm representing single part
of speech subject-words and one gISm representing double part of speech
verb-words.
(B) The requirements (RI-RV) that comply with assumptions (i-iv) as exemplified
in section B in Fig. 4 are described as follows:
(RI) Requiring each individual gISm to be a structure of minimum 3 and maximum 5
layers converging alongside two main pathways (DPi and IPi) in a base of maximum
2 layers composed of words that can be used as semantic intermediaries (PLi), and
that the number of routes (i) within a pathway be maximum 3.
More specifically:
For the forming of DP1 the number of layers composed of words that are not used
as semantic intermediaries (NLi) is 1;
For the forming of IP1 the number of NLi is 2;
For the forming of DP2 the number of NLi is 2;
For the forming of IP2 the number of NLi is 3;
For the forming of DP3 the number of NLi is 3;
And, for the forming of all of the above described cases, the number of layers
composed of semantic intermediaries (PLi) is 2.
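The layer counts listed under RI can be checked with a short Python sketch. The
sketch is illustrative only: the constant and function names are hypothetical,
and the depth of a single route is used as a stand-in for the depth of a gISm
when testing the limit of 8 layers per BS stated in assumption (ii).

```python
# Number of NLi layers per route, as listed under RI.
NL_LAYERS = {"DP1": 1, "IP1": 2, "DP2": 2, "IP2": 3, "DP3": 3}
PL_LAYERS = 2                    # every route ends in 2 layers of intermediaries

MIN_LAYERS, MAX_LAYERS = 3, 5    # limits per gISm (RI)
MAX_BS_LAYERS = 8                # limit per binary structure (assumption ii)


def route_depth(route):
    """Total number of layers of a route: its NLi layers plus its PLi layers."""
    return NL_LAYERS[route] + PL_LAYERS


def satisfies_ri(route):
    """True if the route stays within the 3 to 5 layer range required by RI."""
    return MIN_LAYERS <= route_depth(route) <= MAX_LAYERS


def fits_in_bs(route_1, route_2):
    """True if two routes, one per gISm, together use at most 8 layers."""
    return route_depth(route_1) + route_depth(route_2) <= MAX_BS_LAYERS
```

Under these counts every listed route has between 3 and 5 layers, and a pairing
such as DP1 with IP2 uses exactly 8 layers, whereas IP2 with DP3 would exceed
that limit.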
(RII) Requiring the first layer to be composed of xRide sets representing a
sample of minimum three words (w1, w2, and w3), for example words that are used
as nouns only, and each consecutive layer to be composed of xRide sets of a
sample of minimum three
words forming the preceding layer. The requirement for the forming of NLi can be
expressed as NLi = RD1(w1, w2, w3) ∪ RD2(w1, w2, w3) ∪ RD3(w1, w2, w3), where:
if xRide(w1) ≠ ∅, RD1(w1) = xRide(w1); else if xRide(w2) ≠ ∅, RD1(w2) = xRide(w2);
else if xRide(w3) ≠ ∅, RD1(w3) = xRide(w3);
if xRide(w1) ≠ ∅, RD2(w1) = xRide(w1); else if xRide(w2) ≠ ∅, RD2(w2) = xRide(w2);
else if xRide(w3) ≠ ∅, RD2(w3) = xRide(w3);
if xRide(w1) ≠ ∅, RD3(w1) = xRide(w1); else if xRide(w2) ≠ ∅, RD3(w2) = xRide(w2);
else if xRide(w3) ≠ ∅, RD3(w3) = xRide(w3).
The requirements for the forming of PLi are similar to the above described
requirements for
the forming of NLi.
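One possible reading of the expression for NLi given under RII is sketched below
in Python. It assumes that each of RD1, RD2 and RD3 uses its own kind of
R-derivative set and takes that set from the first sample word for which it is
not empty. The function names and the toy word lists are hypothetical and are
not taken from the disclosure.

```python
def first_non_empty(derive, sample):
    """Return derive(w) for the first word w in sample whose set is not empty."""
    for word in sample:
        derived = derive(word)
        if derived:
            return derived
    return set()


def form_layer(sample, derivations):
    """NLi as the union RD1 ∪ RD2 ∪ RD3, one derivation function per RDk."""
    layer = set()
    for derive in derivations:
        layer |= first_non_empty(derive, sample)
    return layer


# Hypothetical toy R-derivative sets used only to exercise the sketch.
SYNONYMS = {"bird": {"fowl"}, "flight": {"soaring"}}
ANTONYMS = {"flight": {"landing"}}
RELATED = {}

derivations = [
    lambda w: SYNONYMS.get(w, set()),
    lambda w: ANTONYMS.get(w, set()),
    lambda w: RELATED.get(w, set()),
]
layer = form_layer(["bird", "flight", "wing"], derivations)
```

In this toy run the first derivation is satisfied by 'bird', the second falls
through to 'flight', and the third yields nothing for any sample word.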
(RIII) Requiring that the number of sets (layers) separating the set P composed
of words (semantic intermediaries) and L be maximum three, as exemplified in
Fig. 4 section C, wherein: P is the set of words that can be used as semantic
intermediaries; SY1 is a set composed of the synonyms and antonyms of all words
in P; SY2 is a set composed of the synonyms and antonyms of all words in SY1 and
P; SY3 is a set composed of the synonyms and antonyms of all words in SY2, SY1
and P; and L is a set composed of the synonyms and antonyms of all words in SY3,
SY2, SY1 and P, and is also the set composed of all words in L040 {w}.
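The distance limit of RIII can be illustrated by a breadth-first expansion over
synonym and antonym links. The Python sketch below is a minimal illustration
under the assumption that the links are available as a mapping from a word to
its synants; the function names are hypothetical.

```python
def synant_layers(P, synants, max_layers=3):
    """Expand P through SY1..SY3 and one further step to L.

    synants maps a word to the set of its synonyms and antonyms.
    Returns ([SY1, SY2, SY3], L), each SYk holding only newly reached words.
    """
    reached = set(P)
    frontier = set(P)
    layers = []
    for _ in range(max_layers + 1):      # SY1, SY2, SY3, then the step to L
        newly_reached = set()
        for word in frontier:
            newly_reached |= synants.get(word, set())
        newly_reached -= reached
        layers.append(newly_reached)
        reached |= newly_reached
        frontier = newly_reached
    return layers[:-1], reached


def complies_with_riii(P, synants, lexicon):
    """True if every word of the lexicon is reached within the allowed distance."""
    _, L = synant_layers(P, synants)
    return set(lexicon) <= L
```

A candidate set P complies when the expansion covers the whole lexicon; a word
lying one link further away than L makes the check fail.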
(RIV) Requiring, as exemplified in Fig. 5 section A, the forming of:
(a) BS1 comprising gISm for identifying PJ1-set of semantic intermediaries
mediating
the syntactic pairing of a subject-word with a verb-word and gISm for
identifying PJ2-set of
semantic intermediaries mediating the syntactic pairing of a verb-word with a
subject-word;
(b) BS2 comprising gISm for identifying PJ3-set of semantic intermediaries
mediating
the syntactic pairing of a verb-word with an object-word and gISm for
identifying PJ4-set of
semantic intermediaries mediating the syntactic pairing of an object-word with
a verb-word;
(c) BS3 comprising gISm for identifying PJ5-set of semantic intermediaries
mediating
the syntactic pairing of a subject-word with an object-word and gISm for
identifying PJ5-set
of semantic intermediaries mediating the syntactic pairing of an object-word
with a subject-
word;
(d) BS4 comprising gISm for identifying PJ6-set of semantic intermediaries
mediating the syntactic pairing of a subject and/or object word with an
adjective-word and gISm for
identifying PJ6-set of semantic intermediaries mediating the syntactic pairing of
an adjective-word with a subject and/or object word.
(RV) Requiring (as further exemplified in Fig. 5 section A) the building of
separate gISm
for identifying:
(a) PJ1 set of semantic intermediaries of all single part of speech subject-
words, PJ1 set
of semantic intermediaries of all double part of speech subject-words, and PJ1
set of
semantic intermediaries of all triple part of speech subject-words;
(b) PJ2 set of semantic intermediaries of all single part of speech verb-
words, PJ2 set of
semantic intermediaries of all double part of speech verb-words, and PJ2 set
of semantic
intermediaries of all triple part of speech verb-words;
(c) PJ3 set of semantic intermediaries of all single part of speech verb-
words, PJ3 set of
semantic intermediaries of all double part of speech verb-words, and PJ3 set
of semantic
intermediaries of all triple part of speech verb-words;
(d) PJ4 set of semantic intermediaries of all single part of speech object-
words, PJ4 set
of semantic intermediaries of all double part of speech object-words, and PJ4
set of semantic
intermediaries of all triple part of speech object-words;
(e) PJ5 set of semantic intermediaries of all single part of speech subject
and object
words, PJ5 set of semantic intermediaries of all double part of speech subject
and object
words, PJ5 set of semantic intermediaries of all triple part of speech subject
and object
words;
(f) PJ6 set of semantic intermediaries of all single part of speech
adjectives, and
subject and object words, PJ6 set of semantic intermediaries of all double
part of speech
adjectives, and subject and object words, and PJ6 set of semantic
intermediaries of all triple
part of speech adjectives, and subject and object words.
(C) The forming of gISm(s) that comply with the requirements (RI-RV) is
explained by way of
the example BS shown in Fig. 4 section B. It should be understood, that,
although DP2 and
IP1 or DP3 and IP2 (shown in section B in Fig. 4), have the same number of NLi
they may
differ because of the use of different R-derivative sets.
2. Stage 2 of method-050 includes the following trial and error (T&E) procedure:
2.1. T&E for identifying the part of speech tags of the words that can be used as
semantic intermediaries (i-iv):
(i) limiting to three, as exemplified in Fig. 4 section C, the distance between
the words belonging to P and the words belonging to L (L040) to comply with RIII;
(ii) picking a trial set of words having the same part of speech tags to form a
test pool of semantic intermediaries (TPp ⊆ L040);
(iii) generating trial test output (TTo);
(iv) analyzing the merits of TPp by comparing the number of words in TTo to the
number of words in L040. If the difference between said numbers is smaller than a
predetermined threshold, treating the words in said TPp as eligible p; otherwise
repeating steps (ii-iv).
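Steps (ii) to (iv) of 2.1 amount to a loop over candidate pools. The Python
sketch below shows one way such a loop could be arranged. The way the trial test
output is produced is not specified at this point of the description, so it is
passed in as a hypothetical callback (generate_output); the remaining names are
likewise illustrative.

```python
def select_intermediary_pool(lexicon, candidate_tags, generate_output, threshold):
    """Try pools of same-tagged words until the test output nearly covers L040.

    lexicon: dict mapping each word to its part of speech tag (stands in for L040).
    generate_output: callable taking a pool and returning the set of words
        in the trial test output (TTo).
    Returns the first eligible (tag, pool), or None if every trial fails.
    """
    for tag in candidate_tags:                               # step (ii)
        pool = {word for word, t in lexicon.items() if t == tag}
        output = generate_output(pool)                       # step (iii)
        if len(lexicon) - len(output) < threshold:           # step (iv)
            return tag, pool
    return None
```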
2.2. T&E for a semi-automatic transforming of gISm(s) into ISm-prototypes to
be used for
identifying the elements (p) of PJk-sets by means of which according to LM
provided herein
words should be represented in a lexical ontology set, said T&E comprising:
(i) Creating two trial pools of syntactically compatible words (TP1 ⊆ L040 and
TP2 ⊆ L040), wherein TP1 comprises words having one of the two
syntactical-complement-joining tags that encode a particular syntactic rule, and
TP2 comprises words having the other of the two syntactical-complement-joining
tags encoding said particular syntactic rule. More particularly: if words having
J1ij are picked as elements of TP1 then words having J2ij are picked as elements
of TP2; if words having J3ij are picked for TP1 then words having J4ij are picked
for TP2; if words having J5ij are picked for TP1 then words having J5ij are
picked for TP2; if words having J6ij are picked for TP1 then words having J6ij
are picked for TP2. In addition, each pool should include words that are
single-part-of-speech only, or double-part-of-speech only, or
triple-part-of-speech only, and at least one word having xRiR consisting of a
large number of synants, at least one word having xRiR consisting of an average
number of synants, and at least one word having xRiR consisting of a small number
of synants.
(ii) Forming DPi of words belonging to TPi: (a) selecting xRide and forming NL1
of DP1 to comply with RII; (b) forming PL1 by selecting and generating xRide of
w ∈ NL1 that are p; (c) forming PL2 by selecting and generating xRide of w ∈ PL1
that are p; (d) if the number of p in PL1 and PL2 meets an established threshold,
forming DP1 and moving to (iii); (e) otherwise selecting and generating different
xRide to form NL1, and repeating steps (b-d) a specified number of times; (f)
otherwise selecting and generating different xRide to form NL1 and NL2 of DP2;
(g) forming PL1 and PL2 from xRide of words that are p; (h) if the number of p
meets an
established threshold, forming DP2 and moving to (iii); (i) otherwise selecting
and generating different xRide to form NL1 and NL2, and repeating steps (g-i) a
specified number of times; (j) otherwise selecting and generating different xRide
to form NL1, NL2 and NL3 of DP3 to comply with RI and RII; (k) forming PL1 and
PL2; (l) if the number of p meets an established threshold, forming DP3 and
moving to (iii); (m) otherwise selecting and generating different xRide to form
NL1, NL2 and NL3, and repeating the steps (k-l) a specified number of times.
(iii) Forming IPi of words belonging to TPi: (a) selecting and generating xRide
and forming NL1 and NL2 of IP1 to comply with RII; (b) forming PL1 and PL2 by
selecting and generating xRide of words that are p; (c) if the number of semantic
intermediaries meets an established threshold, forming IP1 and moving to (iv);
(d) otherwise selecting and generating different xRide to form NL1 and NL2, and
repeating steps (b-d) a specified number of times; (e) otherwise generating all
selected xRide to form NL1, NL2 and NL3 of IP2 to comply with RI and RII; (f)
generating PL1 and PL2; (g) if the number of semantic intermediaries meets an
established threshold, forming IP2 and moving to (iv); (h) otherwise selecting
and generating different xRide to form NL1, NL2 and NL3, and repeating the
procedural steps (f-h) a specified number of times.
(iv) Forming gISm from DPi and IPi and repeating steps (ii-iv) for all gISm(s)
to comply
with RV.
(v) Analyzing the merits of gISm when applied to other words having the same PaT
and adding them to the trial pools TP1 and TP2. If gISm performs well relative to
the new trial pools, moving to (vi); otherwise repeating sub-procedures (ii-v).
(vi) Forming BSi (BS1, BS2, BS3, BS4) from gISm(s) as schematically
exemplified in Fig.
section A.
(vii) Defining predetermined test outputs (DTO) said DTO comprising commonly
used
two-word phrases as well as two-word phrases rarely used (in poetry and
literature)
formed from words belonging to TP1 and TP2.
(viii) Analyzing the merits of BSi based on how well it performs relative to DTO
and, if BSi performs well relative to DTO, preselecting the gISm forming said BSi
and moving to (ix); otherwise repeating sub-procedures (ii-vi and viii).
(ix) Creating a new trial pool of words (TP3 ⊆ L040).
(x) Establishing a threshold for the number of semantic intermediaries to be
generated by said preselected gISm; applying preselected gISm to words w ∈ TP3;
analyzing the merits of said gISm by comparing the number of generated semantic
intermediaries to the predetermined threshold; if the number is equal to or
greater than the predetermined threshold, designating said gISm as ISm-prototype
for identification of semantic intermediaries; otherwise repeating sub-procedures
(ii-vi, viii and x).
(xi) The sub-procedures (i-x) disclosed in 2.2 are repeated for each of the
following types
of ISm-prototypes:
IS10 for identifying the semantic intermediaries (p), mediating the pairing of a
subject-word w with any verb-word, in PJ1 of all J1ij-tagged forms of w having
PaT = 1000, 1100, 1010, 1001, 1110, 1011, 1111; IS11 for identifying p in PJ1 of
all J1ij-tagged forms of w having PaT = 1000, 1100, 1010, 1001, 1110, 1011, 1111;
IS12 for identifying p in PJ1 of all J1ij-tagged forms of w having PaT = 0100;
IS20 for identifying the semantic intermediaries (p), mediating the pairing of a
verb-word w with any subject-word, in PJ2 of all J2ij-tagged forms of w having
PaT = 0110, 0101, 0111; IS21 for identifying p in PJ2 of all J2ij-tagged forms of
w having PaT = 0010, 0001, 0011; IS22 for identifying p in PJ2 of all J2ij-tagged
forms of w having PaT = 1010, 1001, 1011, 1110, 1101, 1111;
IS30 for identifying the semantic intermediaries (p), mediating the pairing of a
verb-word w with any object-word, in PJ3 of all J3ij-tagged forms of w having
PaT = 0101, 1101, 0111, 1111; IS31 for identifying p in PJ3 of all J3ij-tagged
forms of w having PaT = 0001, 0011; IS32 for identifying p in PJ3 of all
J3ij-tagged forms of w having PaT = 1001, 1011;
IS40 for identifying the semantic intermediaries (p), mediating the pairing of an
object-word w with any verb-word, in PJ4 of all J4ij-tagged forms of w having
PaT = 1000, 1100, 1010, 1001, 1110, 1011, 1111; IS41 for identifying p in PJ4 of
all J4ij-tagged forms of w having PaT = 0100; IS42 for identifying p in PJ4 of
all J4ij-tagged forms of w having PaT = 0110, 0101, 0111;
IS50 for identifying the semantic intermediaries (p), mediating the pairing of a
subject-word w with any object-word (and vice versa), in PJ5 of all J5ij-tagged
forms of w having PaT = 1010, 1001, 1110, 1011, 1111, 0110, 0101, 0111; IS51 for
identifying p in PJ5 of all J5ij-tagged forms of w having PaT = 0100; and IS52
for identifying p in PJ5 of all J5ij-tagged forms of w having PaT = 1000, 1100;
IS60 for identifying the semantic intermediaries (p), mediating the pairing of an
adjective-word w with any subject or object word (and vice versa), in PJ6 of all
J6ij-tagged forms of w having PaT = 1010, 1001, 1110, 1011, 1111, 0110, 0101,
0111; IS61 for identifying p in PJ6 of all J6ij-tagged forms of w having
PaT = 0100; and IS62 for identifying p in PJ6 of all J6ij-tagged forms of w
having PaT = 1000, 1100.
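The escalation from DP1 to DP2 to DP3 described in sub-procedure (ii) of 2.2 may
be sketched in Python as a bounded search. This is a simplified illustration
only: the selection and generation of xRide sets is represented by a
hypothetical callback (candidate_layers), and the argument names, the retry
count and the threshold are assumptions introduced for the sketch.

```python
def build_direct_pathway(candidate_layers, derive_pl, is_p, threshold, retries):
    """Try DP1, then DP2, then DP3, re-picking the NL layers up to `retries` times.

    candidate_layers(depth, attempt): returns the words of the last NL layer
        for a route with `depth` NL layers (hypothetical callback).
    derive_pl(words): returns the set of words derived from `words`.
    is_p(word): True if the word can be used as a semantic intermediary.
    Returns (route_name, PL1, PL2), or None when every trial fails.
    """
    for depth in (1, 2, 3):                    # NL1; NL1-NL2; NL1-NL3
        for attempt in range(retries):
            last_nl = candidate_layers(depth, attempt)
            pl1 = {w for w in derive_pl(last_nl) if is_p(w)}
            pl2 = {w for w in derive_pl(pl1) if is_p(w)}
            if len(pl1 | pl2) >= threshold:    # enough intermediaries found
                return f"DP{depth}", pl1, pl2
    return None
```

The same loop shape would apply to the indirect pathway of sub-procedure (iii),
starting from a depth of two NL layers instead of one.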
Referring back to Fig. 1, unit 06 (ADP) relates to executable by computer
instructions for causing a computing system to carry out the steps of applying
designated ISm-prototypes. An example of ISm-prototype is shown in Fig. 5
section B. More particularly, the rules for executing the expression
(vtR1n | vtR12n | vR3vt | ∪ | vtR2 | vtR12n | vtR12v | ∪ | vR3a | vtR12) shown in
NL1 of DP1 in said figure are briefly explained as follows:
(i) generate (according to rules explained in 5 of method-30) set vtR1n; if
vtR1n ≠ ∅ go to (ii), else generate vtR12n; if vtR12n ≠ ∅ go to (ii), else
generate vR3vt and go to (ii);
(ii) generate set vtR2; if vtR2 ≠ ∅ go to (iii), else generate vtR12n; if
vtR12n ≠ ∅ go to (iii), else generate vtR12v and go to (iii);
(iii) generate set vR3a; if vR3a ≠ ∅ go to (iv), else generate vtR12 and go to
(iv);
(iv) generate the elements of the set formed from the union of the sets generated
at steps i, ii and iii, and if the set is not empty go to PL1, else go to NL1 of
DP2.
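The rules (i) to (iv) above describe a union of fallback chains: within each
group the first non-empty set is used, and the groups are then united. A minimal
Python sketch of that evaluation order follows. The function names are
hypothetical, and each set generator is represented by a zero-argument callable
because the generation rules themselves are defined elsewhere in the
description.

```python
def first_non_empty(generators):
    """Call the generators in order and return the first non-empty set.

    If every generator yields an empty set, the last (empty) result is returned.
    """
    result = set()
    for generate in generators:
        result = generate()
        if result:
            break
    return result


def evaluate_nl1(groups):
    """Unite the results of the fallback chains, one chain per group."""
    layer = set()
    for group in groups:
        layer |= first_non_empty(group)
    return layer
```

An empty result of evaluate_nl1 corresponds to the case in which processing
continues at NL1 of DP2 rather than at PL1.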
Accordingly, by applying all ISm-prototypes available to the computing system,
input set L040 {w} is transformed into a rule-based ontology set
Lo {w} = {w | w has Jkij and w has PJk}, wherein PJk is a set of identified
semantic intermediaries representing all Jkij-forms of w available to the
computing system (unit 06). To reflect additional features of the English
language, Lo {w} may be further developed by: adding to the initial input set L
words such as adverbs; and representing each of said adverbs by means of a PJ7
set of identified semantic intermediaries mediating their syntactic pairing with
verbs represented by means of PJ8 sets of identified semantic intermediaries, as
schematically exemplified in Fig. 6 section A.
Referring back to Fig. 1, unit 07 shown in section B, relates to method-070
and executable
by computer instructions for causing a computing system to carry out the steps
of method-070 for
producing output sentences from words belonging to Lo {w}. Method-070
satisfies the requirements
of LM provided by the present invention and is described relative to the
following: (1) classes of
pairing instructions for generating pairs of grammatically correct
inflectional forms; and, (2) types
of sentences.
(1) 1Gig, 3Gig, 5Gig, and 6Gig shown in Fig. 6 section A represent classes of
pairing instructions (kgGig) for syntactic pairing of the Jki1-tagged
inflectional form of any word that has PJk1 and the Jki2-form of any w that has
PJk2, on the condition that PJk1 ∩ PJk2 ≠ ∅. With regard to particulars:
(i) The class of 1Gig represents any instruction for syntactic pairing of a
J1ij-form of a subject-word (ws) represented by means of PJ1 (set of identified
semantic intermediaries) and a J2ij-form of a verb-word (wv) represented by means
of PJ2, on the condition that PJ1 ∩ PJ2 ≠ ∅.
More particularly:
1G1 is instruction for pairing J11-form of ws and J21-form of wv if PJ1 ∩ PJ2 ≠ ∅;
1G2 is instruction for pairing J12-form of ws and J21-form of wv if PJ1 ∩ PJ2 ≠ ∅;
1G3 is instruction for pairing J13-form of ws and J22-form of wv if PJ1 ∩ PJ2 ≠ ∅;
1G4 is instruction for pairing J14-form of ws and J22-form of wv if PJ1 ∩ PJ2 ≠ ∅.
For example, 1G1 represents the instruction for pairing 'birds' and 'fly' while
1G3 represents the instruction for pairing 'a bird' and 'flies'.
Furthermore: 1G5 represents the instruction for pairing ws in a plural form with
a definite article provided by the present invention (the birds) and wv in a
negatory present tense plural form provided by the present invention (do not
fly); 1G6 represents the instruction for pairing ws in a singular form with a
definite article (the bird) and wv in a negatory present tense third person
singular form (does not fly); 1G7 represents the instruction for pairing ws in
plural forms with or without a definite article (birds, the birds) and wv in a
'that clause' present tense plural form provided by the present invention (that
fly); 1G8 represents the instruction for pairing ws in singular forms with an
indefinite or a definite article (a bird, the bird) and wv in a 'that clause'
present tense third person singular form (that flies); 1G9 represents the
instruction for pairing ws in plural forms with or without a definite article
(birds, the birds) and wv in an 'and clause' (coordinating conjunction clause)
present tense plural form provided by the present invention (and fly); and so on.
(ii) The class of 3Gig (as schematically exemplified in Fig. 5 section C)
represents any instruction for syntactic pairing of any J3ij-form of a verb-word
(wv) represented by means of
PJ3 and any J4ij-form of an object-word (wo) represented by means of PJ4, on the
condition that PJ3 ∩ PJ4 ≠ ∅.
More particularly:
3G1 is instruction for pairing J31-form of wv and J41-form of wo, or J42-form of
wo, or J43-form of wo if PJ3 ∩ PJ4 ≠ ∅;
3G2 is instruction for pairing J32-form of wv and J41-form of wo, or J42-form of
wo, or J43-form of wo, or J44-form of wo if PJ3 ∩ PJ4 ≠ ∅;
3G3 is instruction for pairing J33-form of wv and J41-form of wo, or J42-form of
wo, or J43-form of wo, or J44-form of wo if PJ3 ∩ PJ4 ≠ ∅.
For example: 3G1 represents the instruction for pairing 'have' with 'an idea',
'the idea', 'the ideas' and 'ideas'; 3G2 represents the instruction for pairing
'has' with 'an idea', 'the idea', 'the ideas' and 'ideas'; and 3G3 represents the
instruction for pairing 'do not have' with 'an idea', 'the idea', 'the ideas' and
'ideas'.
(iii) The class of 5Gig represents any instruction for syntactic pairing of any
J5ij-form of a subject-word (ws) represented by means of PJ5 and any J5ij-form of
an object-word (wo) represented by means of PJ5, on the condition that
PJ5 of ws ∩ PJ5 of wo ≠ ∅.
More particularly:
5G1 is instruction for pairing J51-form of ws and J51-form of wo, or J52-form of
wo, or J53-form of wo, or J54-form of wo if PJ5 of ws ∩ PJ5 of wo ≠ ∅;
5G2 is instruction for pairing J52-form of ws and J51-form of wo, or J52-form of
wo, or J53-form of wo, or J54-form of wo if PJ5 of ws ∩ PJ5 of wo ≠ ∅;
5G3 is instruction for pairing J53-form of ws and J51-form of wo, or J52-form of
wo, or J53-form of wo, or J54-form of wo if PJ5 of ws ∩ PJ5 of wo ≠ ∅;
5G4 is instruction for pairing J54-form of ws and J51-form of wo, or J52-form of
wo, or J53-form of wo, or J54-form of wo if PJ5 of ws ∩ PJ5 of wo ≠ ∅.
For example, 5G1 represents the instruction for pairing 'artists' with 'an idea',
'the idea', 'the ideas' and 'ideas' while 5G4 represents the instruction for
pairing 'the artist' with 'an idea', 'the idea', 'the ideas' and 'ideas'.
(iv) The class of 6Gig represents any instruction for syntactic pairing of any
J6ij-form of an adjective modifier-word (wa) represented by means of PJ6 and any
J6ij-form of a subject-word (ws) and/or J6ij-form of an object-word (wo)
represented by means of PJ6, on the condition that
PJ6 of wa ∩ PJ6 of ws/o ≠ ∅.
More particularly:
6G1 is instruction for pairing J61-form of wa and J64-form of ws or wo
if PJ6 of wa ∩ PJ6 of ws/o ≠ ∅;
6G2 is instruction for pairing J61-form of wa and J63-form of ws or wo
if PJ6 of wa ∩ PJ6 of ws/o ≠ ∅;
6G3 is instruction for pairing J62-form of wa and J65-form of ws or wo
if PJ6 of wa ∩ PJ6 of ws/o ≠ ∅;
6G4 is instruction for pairing J63-form of wa and J64-form of ws or wo
if PJ6 of wa ∩ PJ6 of ws/o ≠ ∅.
For example: 6G1 represents the instruction for pairing 'the great' with 'ideas'
while 6G2 represents the instruction for pairing 'the great' with 'idea'.
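The condition shared by all classes of pairing instructions, namely that the two
PJk sets have at least one semantic intermediary in common, can be illustrated
for the 1Gig class with the Python sketch below. The table of tag pairs follows
the definitions of 1G1 to 1G4 given above; the dictionary layout of a word and
the intermediaries used in any example data are hypothetical.

```python
# Tag pairs named by each 1Gig instruction: (J1ij of the subject, J2ij of the verb).
INSTRUCTIONS_1G = {
    "1G1": ("J11", "J21"),
    "1G2": ("J12", "J21"),
    "1G3": ("J13", "J22"),
    "1G4": ("J14", "J22"),
}


def pair(instruction, subject, verb):
    """Pair the forms named by a 1Gig instruction if PJ1 ∩ PJ2 is not empty.

    subject and verb are dicts with keys 'forms' (tag -> inflected form)
    and 'PJ' (k -> set of semantic intermediaries). Returns None when the
    two words share no semantic intermediary.
    """
    subject_tag, verb_tag = INSTRUCTIONS_1G[instruction]
    if not (subject["PJ"][1] & verb["PJ"][2]):
        return None
    return f'{subject["forms"][subject_tag]} {verb["forms"][verb_tag]}'
```

The 3Gig, 5Gig and 6Gig classes would follow the same pattern with their own tag
tables and with the PJ3/PJ4, PJ5/PJ5 and PJ6/PJ6 sets respectively.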
(2) The rules of the steps of method-070 for transforming sequences
(S1, S2, ..., Sn) of interdependent pairing instructions (kGig) into sentences
(e) are exemplified in Fig. 6 and explained by means of examples (I-V).
(I) S1 = |6Gig|1Gig| generates e = wa ws wv by:
(i) Performing intersection operation on the sets PJ6 of a selected J6ij-form of
ws and PJ6 of a selected (as indicated by the ig components of 6Gig) J6ij-form of
wa, and if PJ6 of ws ∩ PJ6 of wa = ∅, repeating step (i) for another wa, else
moving to (ii);
(ii) Performing intersection operation on the sets PJ1 of said selected J1ij-form
of ws and PJ2 of a selected (as indicated by the ig components of 1Gig) J2ij-form
of wv, and if PJ1 of ws ∩ PJ2 of wv = ∅, repeating (ii) for another wv, else
moving to (iii);
(iii) Generating e by pairing Jkij-forms as indicated by the ig components of
6Gig and 1Gig.
For example:
S1 = |6G1|1G1| generates sentences such as 'the old men speak' and 'the wild
birds migrate';
S1 = |6G2|1G3| generates sentences such as 'the old dog sleeps' and 'the little
girl sings';
S1 = |6G4|1G1| generates sentences such as 'contemporary artists imagine' as
exemplified in Fig. 6 section B.
In addition:
S1a = |1Gig| generates e = ws wv by applying the rules of steps (ii) and (iii)
of S1.
For example:
S1a = |1G5| generates sentences such as 'the boys do not wait' and 'the girls do
not sing';
S1a = |1G6| generates sentences such as 'the boy does not play' and 'the dog does
not fly';
S1a = |1G1| generates sentences such as 'artists imagine' as exemplified in
Fig. 6 section B.
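Steps (i) to (iii) of S1 may be sketched in Python as follows. The sketch
assumes that each word is held as a dictionary of inflected forms and PJk sets;
that layout, the function name and the intermediaries in the toy data are
hypothetical and serve only to make the order of the steps concrete.

```python
def generate_s1(subject, adjectives, verbs, six_g, one_g):
    """Sketch of S1 = |6Gig|1Gig| producing e = wa ws wv, or None on failure.

    six_g: (adjective tag, subject tag) named by the chosen 6Gig instruction.
    one_g: (subject tag, verb tag) named by the chosen 1Gig instruction.
    """
    adj_tag, noun_tag = six_g
    _, verb_tag = one_g
    # Step (i): first adjective sharing a PJ6 intermediary with the subject.
    wa = next((a for a in adjectives if a["PJ6"] & subject["PJ6"]), None)
    if wa is None:
        return None
    # Step (ii): first verb whose PJ2 shares an intermediary with the subject's PJ1.
    wv = next((v for v in verbs if v["PJ2"] & subject["PJ1"]), None)
    if wv is None:
        return None
    # Step (iii): pair the forms named by the instructions.
    return " ".join([wa["forms"][adj_tag], subject["forms"][noun_tag],
                     wv["forms"][verb_tag]])


# Hypothetical toy data; the intermediaries are invented for illustration.
men = {"forms": {"J11": "men", "J64": "men"}, "PJ1": {"voice"}, "PJ6": {"age"}}
adjectives = [
    {"forms": {"J61": "the liquid"}, "PJ6": {"fluid"}},
    {"forms": {"J61": "the old"}, "PJ6": {"age", "time"}},
]
verbs = [
    {"forms": {"J21": "evaporate"}, "PJ2": {"fluid"}},
    {"forms": {"J21": "speak"}, "PJ2": {"voice"}},
]
sentence = generate_s1(men, adjectives, verbs, ("J61", "J64"), ("J11", "J21"))
```

With the toy data the adjective 'the liquid' and the verb 'evaporate' are
rejected because they share no intermediary with the subject.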
(II) S2 = |3Gig|6Gig| generates e = wv wa wo by:
(i) Performing intersection operation on the sets PJ4 of a selected J4ij-form of
wo and PJ3 of a selected (as indicated by the ig components of 3Gig) J3ij-form of
wv; if PJ4 of wo ∩ PJ3 of wv = ∅, repeating step (i) for another wo, else moving
to (ii);
(ii) Performing intersection operation on the sets PJ6 of said wo and PJ6 of a
selected (as indicated by the ig components of 6Gig) J6ij-form of wa; if
PJ6 of wo ∩ PJ6 of wa = ∅, repeating step (ii) for another wa, else moving to
(iii);
(iii) Generating e by pairing Jkij-forms as indicated by the ig components of
3Gig and 6Gig.
For example:
S2 = |6G1|1G1| generates sentences such as 'share old dreams!' and 'paint new
pictures!';
S2 = |6G2|1G3| generates sentences such as 'share the old dream!' and 'change the
new picture!'.
In addition:
S2a = |3Gig| generates e = wv wo by applying the rules of steps (i) and (iii)
of S2.
For example:
S2a = |3G3| generates sentences such as 'do not share dreams!' and 'do not paint
pictures!';
S2a = |3G2| generates sentences such as 'take the dog!' and 'change the
picture!';
S2a = |3G1| generates sentences such as 'imagine forms!' as exemplified in
Fig. 6 section B.
(III) S3 = |6Gig*|1Gig|3Gig|5Gig|6Gig**| generates e = wa ws wv wa wo by:
(i) Performing intersection operation on the sets PJ6 of a selected J6ij-form of
ws and PJ6 of a selected (as indicated by the ig components of 6Gig*) J6ij-form
of wa; if PJ6 of ws ∩ PJ6 of wa = ∅, repeating step (i) for another wa, else
moving to (ii);
(ii) Performing intersection operation on the sets PJ1 of said ws and PJ2 of a
selected (as indicated by the ig components of 1Gig) J2ij-form of wv; if
PJ1 of ws ∩ PJ2 of wv = ∅, repeating step (ii) for another wv, else moving to
(iii);
(iii) Performing intersection operation on the sets PJ3 of said wv and PJ4 of a
selected (as indicated by the ig components of 3Gig) J4ij-form of wo; if
PJ3 of wv ∩ PJ4 of wo = ∅, repeating step (iii) for another wo, else moving to
(iv);
(iv) Performing intersection operation on the sets PJ5 of said ws and PJ5 of said
wo; if PJ5 of ws ∩ PJ5 of wo = ∅, repeating steps (iii-iv) for another wo, else
moving to (v);
(v) Performing intersection operation on the sets PJ6 of said wo and PJ6 of a
selected (as indicated by the ig components of 6Gig**) J6ij-form of wa; if
PJ6 of wo ∩ PJ6 of wa = ∅, repeating step (v) for another wa, else moving to
(vi);
(vi) Generating e by pairing Jkij-forms as indicated by the ig components of
6Gig*, 1Gig, 3Gig, 5Gig, and 6Gig**.
For example:
S3 = |6G1|1G1|3G1|5G1|6G4| generates sentences such as 'contemporary artists
imagine new forms' and 'medical students learn new techniques';
S3 = |6G1|1G5|3G3|5G1|6G4| generates sentences such as 'old dogs do not chase
black cats'.
In addition:
S3a = |1Gig|3Gig|5Gig|6Gig| generates e = ws wv wa wo by applying the rules of
steps (ii), (iii), (iv), (v), and (vi) of S3.
For example:
S3a = |1G1|3G1|5G1|6G4| generates sentences such as 'artists explore new ideas'
and 'students learn new techniques';
S3a = |1G5|3G3|5G1|6G4| generates sentences such as 'dogs do not chase old cats'.
Furthermore:
S3b = |6Gig|1Gig|3Gig|5Gig| generates e = wa ws wv wo by applying the rules of
steps (i), (ii), (iii), (iv) and (vi) of S3;
S3c = |1Gig|3Gig|5Gig| generates e = ws wv wo by applying the rules of steps
(ii), (iii), (iv) and (vi) of S3.
For example:
S3b = |6G1|1G1|3G1|5G1| generates sentences such as 'contemporary artists explore
ideas' and 'medical students learn techniques';
S3b = |6G1|1G5|3G3|5G1| generates sentences such as 'old dogs do not chase cats';
S3c = |1G1|3G1|5G1| generates sentences such as 'artists explore ideas' and
'students learn techniques';
S3c = |1G5|3G3|5G1| generates sentences such as 'dogs do not chase cats'.
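The interdependence of the instructions in S3c = |1Gig|3Gig|5Gig|, where an
object-word must be compatible with both the verb-word and the subject-word, may
be sketched in Python as a small search. The word layout and the intermediaries
in any example data are hypothetical. Trying another verb-word once all
object-words have been rejected is an assumption added here; the steps above
only state that another wo is selected.

```python
def generate_s3c(subject, verbs, objects):
    """Return (ws, wv, wo) meeting the 1Gig, 3Gig and 5Gig conditions, or None."""
    for verb in verbs:
        # Step (ii): PJ1 of ws must share an intermediary with PJ2 of wv.
        if not (subject["PJ1"] & verb["PJ2"]):
            continue
        for obj in objects:
            # Step (iii): PJ3 of wv must share an intermediary with PJ4 of wo.
            if not (verb["PJ3"] & obj["PJ4"]):
                continue
            # Step (iv): PJ5 of ws must share an intermediary with PJ5 of wo.
            if not (subject["PJ5"] & obj["PJ5"]):
                continue
            return subject["word"], verb["word"], obj["word"]
    return None
```

The adjective steps (i) and (v) of S3 would add two further filters of the same
form on the PJ6 sets.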
(IV) S4 = |1Gig|3Gig*|5Gig*|1G7|3Gig**|5Gig**| generates e = ws wv1 wo1 wv2 wo2
by:
(i) Performing intersection operation on the sets PJ1 of a selected J1ij-form of
ws and PJ2 of a selected (as indicated by the ig components of 1Gig) J2ij-form of
wv1; if PJ1 of ws ∩ PJ2 of wv1 = ∅, repeating step (i) for another wv1, else
moving to (ii);
(ii) Performing intersection operation on the sets PJ3 of said wv1 and PJ4 of a
selected (as indicated by the ig components of 3Gig*) J4ij-form of wo1; if
PJ3 of wv1 ∩ PJ4 of wo1 = ∅, repeating step (ii) for another wo1, else moving to
(iii);
(iii) Performing intersection operation on the sets PJ5 of said ws and PJ5 of
said wo1; if PJ5 of ws ∩ PJ5 of wo1 = ∅, repeating steps (ii-iii) for another
wo1, else moving to (iv);
(iv) Performing intersection operation on the sets PJ1 of said wo1 and PJ2 of a
selected (as indicated by the ig components of 1Gig) J2ij-form of wv2; if
PJ1 of wo1 ∩ PJ2 of wv2 = ∅, repeating step (iv) for another wv2, else moving to
(v);
(v) Performing intersection operation on the sets PJ3 of said wv2 and PJ4 of a
selected (as indicated by the ig components of 3Gig**) J4ij-form of wo2; if
PJ3 of wv2 ∩ PJ4 of wo2 = ∅, repeating step (v) for another wo2, else moving to
(vi);
(vi) Performing intersection operation on the sets PJ5 of said wo1 and PJ5 of
said selected (indicated by the ig components of 5Gig**) J5ij-form of wo2; if
PJ5 of wo1 ∩ PJ5 of wo2 = ∅, repeating steps (v-vi) for another wo2, else moving
to (vii);
(vii) Generating e by pairing Jkij-forms as indicated by the ig components of
1Gig, 3Gig*, 5Gig*, 1G7, 3Gig**, and 5Gig**.
For example:
S4 = |1G1|3G1|5G1|1G7|3G1|5G1| generates sentences such as 'artists imagine forms
that change colors';
S4 = |1G5|3G3|5G1|1G7|3G1|5G1| generates sentences such as 'dogs do not chase
cats that pursue birds.'
In addition:
S4a = |6Gig|1Gig|3Gig|5Gig|1G7| generates e = wa ws wv1 wo1 wv2 by applying the
rules of steps (ii), (iii), (iv) and (vii) of S4.
S4b = |6Gig|1Gig|3Gig|5Gig|1G7|3Gig|5Gig| generates e = wa ws wv1 wo1 wv2 wo2 by
applying the rule of step (i) of S1 and all rules of S4.
For example:
S4a = |6G1|1G1|3G1|5G1|1G7| generates sentences such as 'contemporary artists
imagine forms that change';
S4a = |6G1|1G5|3G3|5G1|1G7| generates sentences such as 'smart dogs do not chase
birds that fly.'
S4b = 16G411 G 1 113 G 115 G 111 G713 G 115G 11 generates sentences such
'contemporary artists
imagine forms that change colors' as exemplified in Fig. 6 section B.
(V) S5 = 11 Gig13Gig 15Gig 11 G913Gig 15Gig 1 generates e = ws wv1 wo1
wv2 wo2 by:
(i) Performing intersection operation on the sets PJ1 of a selected J1ij-form
of ws and PJ2 of a selected (as indicated by the ig components of 1Gig)
J2ij-form of wv1; if PJ1 of ws ∩ PJ2 of wv1 = ∅, repeating step (i) for
another wv1, else moving to (ii);
(ii) Performing intersection operation on the sets PJ3 of said wv1 and PJ4 of
a selected (as indicated by the ig components of 3Gig*) J4ij-form of wo1; if
PJ3 of wv1 ∩ PJ4 of wo1 = ∅, repeating step (ii) for another wo1, else moving
to (iii);
(iii) Performing intersection operation on the sets PJ5 of said ws and PJ5 of
said wo1; if PJ5 of ws ∩ PJ5 of wo1 = ∅, repeating steps (ii) and (iii) for
another wo1, else moving to (iv);
(iv) Performing intersection operation on the sets PJ1 of said ws and PJ2 of a
selected (as indicated by 1G9) wv2; if PJ1 of ws ∩ PJ2 of wv2 = ∅, repeating
step (iv) for another wv2, else moving to (v);
(v) Performing intersection operation on the sets PJ3 of said wv2 and PJ4 of a
selected (as indicated by 3Gig**) wo2; if PJ3 of wv2 ∩ PJ4 of wo2 = ∅,
repeating step (v) for another wo2, else moving to (vi);
(vi) Performing intersection operation on the sets PJ5 of said ws and PJ5 of
said wo2; if PJ5 of ws ∩ PJ5 of wo2 = ∅, repeating steps (v) and (vi) for
another wo2, else moving to (vii);
(vii) Generating e by pairing Jkij-forms as indicated by the ig components of
1Gig, 3Gig*, 5Gig*, 1G9, 3Gig**, and 5Gig**.
For example:
S5 = 11 G113G115G111G913G115G11 generates sentences such as 'artists imagine
forms and
change ideas';
S5 = 11 G313G215G411 G913G215G41 generates sentences such as 'the artist
imagines forms and
pursues dreams.'
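The checks of steps (i) to (vi) of S5 can be read as a nested search in which a
candidate word is kept only when the relevant sets of semantic intermediaries
intersect. The Python sketch below illustrates that reading under stated
assumptions: the dictionary layout (keys "form", "PJ1" to "PJ5"), the toy
lexicon, the function names and the insertion of 'and' between the two clauses
are hypothetical and are not taken from the specification. Each entry stands
for an already selected inflectional form, and the sketch enumerates every
valid combination instead of stopping at the first one.

```python
def shares_intermediary(a, b):
    """True when two sets of semantic intermediaries intersect."""
    return bool(a & b)


def generate_s5(subjects, verbs, objects):
    """Yield (ws, wv1, wo1, wv2, wo2) tuples that pass checks (i) to (vi)."""
    for ws in subjects:
        for wv1 in verbs:
            if not shares_intermediary(ws["PJ1"], wv1["PJ2"]):      # step (i)
                continue
            for wo1 in objects:
                if not shares_intermediary(wv1["PJ3"], wo1["PJ4"]):  # step (ii)
                    continue
                if not shares_intermediary(ws["PJ5"], wo1["PJ5"]):   # step (iii)
                    continue
                for wv2 in verbs:
                    if not shares_intermediary(ws["PJ1"], wv2["PJ2"]):  # step (iv)
                        continue
                    for wo2 in objects:
                        if not shares_intermediary(wv2["PJ3"], wo2["PJ4"]):  # step (v)
                            continue
                        if not shares_intermediary(ws["PJ5"], wo2["PJ5"]):   # step (vi)
                            continue
                        yield ws, wv1, wo1, wv2, wo2


def realize(words):
    """Step (vii), simplified: join the five forms with 'and' between clauses."""
    ws, wv1, wo1, wv2, wo2 = (w["form"] for w in words)
    return f"{ws} {wv1} {wo1} and {wv2} {wo2}"


# Toy lexicon (hypothetical intermediaries, for illustration only).
subjects = [{"form": "artists", "PJ1": {"create"}, "PJ5": {"art"}}]
verbs = [
    {"form": "imagine", "PJ2": {"create"}, "PJ3": {"shape"}},
    {"form": "change", "PJ2": {"create"}, "PJ3": {"concept"}},
    {"form": "bark", "PJ2": {"animal"}, "PJ3": {"sound"}},
]
objects = [
    {"form": "forms", "PJ4": {"shape"}, "PJ5": {"art"}},
    {"form": "ideas", "PJ4": {"concept"}, "PJ5": {"art"}},
]

sentences = [realize(t) for t in generate_s5(subjects, verbs, objects)]
```

With this toy lexicon, 'bark' is rejected at steps (i) and (iv) because its PJ2
set shares no element with PJ1 of 'artists'. The steps as written do not
exclude wv1 = wv2, so the sketch does not exclude it either.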
To allow the representation and the generation of other types of sentences,
said method-070 may be further developed by introducing kgGig representing
additional correlations between the elements of complex English sentences,
such as correlations between adverbs and verbs, passive transformations,
possessives of nouns, etc.
Referring back to Fig. 1 (CGS), another exemplary system for implementing the
invention includes a generator of sentences (unit 07) from words belonging to
Lc, {w} (in unit 06) and a generator of visual and audio compositions
comprising: an Expression Fragmentizer (EF); a Visual Motif Builder (VMB); an
Audio Motif Builder (AMB); a Visual Composition Builder (VCB); an Audio
Composition Builder (ACB); a Text-to-Speech Unit (TTSU); a Display Unit (DU);
an Audio Unit (AU); and an Integrated Multimedia Interface System (IMIS).
Unit EF shown in Fig. 1 relates to computer-executable instructions for causing
a computing system to carry out steps for receiving and fragmentizing an input
sentence (generated by unit 07) into a sequence of fragments (Sf). More
particularly, EF is configured to allow the preservation, the reduction, or the
expansion of the number (N) of input symbols (Yi, i = 1, 2, ...) relative to
the number (M) of output fragments (Fj, j = 1, 2, ...). Accordingly: N is
preserved when one Yi of the input is represented by one Fj, hence N = M; N is
reduced when two or more Yi of the input are represented by one Fj, hence
N > M; and N is expanded when one Yi is represented by more than one Fj, hence
N < M.
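The three relations between N and M can be illustrated with a short Python
sketch. It is a minimal illustration, not the EF unit itself: the function
name `fragmentize`, the `mode` and `factor` parameters, the tuple
representation of a fragment and the choice of one character as one input
symbol are all assumptions made for the example.

```python
def fragmentize(symbols, mode="preserve", factor=2):
    """Map N input symbols (Yi) to M output fragments (Fj).

    preserve: one symbol per fragment            (N = M)
    reduce:   `factor` symbols share a fragment  (N > M)
    expand:   `factor` fragments per symbol      (N < M)
    """
    symbols = list(symbols)
    if factor < 2:
        raise ValueError("factor must be at least 2")
    if mode == "preserve":
        return [(s,) for s in symbols]
    if mode == "reduce":
        # Group consecutive symbols; the last group may be shorter.
        return [tuple(symbols[i:i + factor])
                for i in range(0, len(symbols), factor)]
    if mode == "expand":
        # Each symbol yields `factor` fragments, tagged with a part index.
        return [(s, part) for s in symbols for part in range(factor)]
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical input: each character is treated as one symbol Yi.
word = "forms"
preserved = fragmentize(word, "preserve")
reduced = fragmentize(word, "reduce")
expanded = fragmentize(word, "expand")
```

Here N = 5, and the three calls return 5, 3 and 10 fragments respectively.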
Unit VMB shown in Fig. 1 relates to computer-executable instructions for
causing a computing system to carry out the steps for converting an input
sequence of fragments (Sf) generated by EF into visual motifs (Vm). More
particularly, VMB builds a visual motif by assigning to each input fragment: a
color from a Color Palette (CoP); a shape from a Shape Palette (ShP); and a
size from a Size Palette (SiP). The term CoP as used herein refers to the
colors available to the unit. The term ShP as used herein refers to the
different manually created and/or computer-generated two- or three-dimensional
'shapes' available to the unit. The term SiP as used herein refers to the
sizes of the shapes available to the unit.
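A minimal Python sketch of the palette assignment follows. The specification
does not state how a palette entry is chosen for a given fragment, so the rule
used here (the sum of the fragment's character codes, taken modulo the palette
length) is an assumption, as are the palette contents and the names
`MotifElement` and `build_visual_motif`.

```python
from dataclasses import dataclass

# Hypothetical palettes standing in for CoP, ShP and SiP.
COLOR_PALETTE = ["red", "green", "blue", "yellow"]
SHAPE_PALETTE = ["circle", "square", "triangle"]
SIZE_PALETTE = [10, 20, 40]


@dataclass(frozen=True)
class MotifElement:
    fragment: str
    color: str
    shape: str
    size: int


def pick(palette, fragment):
    """Assumed selection rule: character-code sum modulo the palette length."""
    key = sum(ord(ch) for ch in fragment)
    return palette[key % len(palette)]


def build_visual_motif(fragments):
    """Assign a color, a shape and a size to every input fragment."""
    return [
        MotifElement(
            fragment=f,
            color=pick(COLOR_PALETTE, f),
            shape=pick(SHAPE_PALETTE, f),
            size=pick(SIZE_PALETTE, f),
        )
        for f in fragments
    ]


motif = build_visual_motif(["a", "b", "a"])
```

Because the assumed rule is deterministic, equal fragments always receive the
same color, shape and size.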
Unit VCB shown in Fig. 1 relates to computer-executable instructions for
causing a computing system to carry out steps for building a visual
composition (Vc) from an input motif (Vm) generated by VMB. More particularly,
VCB builds a visual composition by selecting a panel size from a Panel Size
Palette (PSP), a panel background color from a Panel background Color Palette
(PCP), and a motif replication space from a motif Replication Space Palette
(RSP). The term PSP as used herein refers to the panel sizes available to the
unit. The term PCP as used herein refers to the background colors of the panel
available to the unit. The term RSP as used herein refers to manually created
and/or computer-generated two- or three-dimensional forms within the
boundaries of which the motif is replicated. Fig. 7 contains gray scale
examples of visual compositions generated by the visual composition builder
from one and the same input sequence of fragments.
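The replication of a motif inside a selected space can be sketched in Python
as follows. The sketch assumes a two-dimensional panel, a rectangular
replication space given as (x0, y0, x1, y1) and a regular grid with a fixed
step; these choices, the palette contents and the function name
`build_visual_composition` are hypothetical simplifications of the PSP, PCP
and RSP selections described above.

```python
# Hypothetical palettes standing in for PSP, PCP and RSP.
PANEL_SIZE_PALETTE = [(100, 100), (200, 100)]
PANEL_COLOR_PALETTE = ["white", "black"]
REPLICATION_SPACE_PALETTE = [(0, 0, 40, 20), (10, 10, 90, 90)]


def build_visual_composition(motif, panel_size, background, space, step):
    """Repeat the motif elements on a grid inside the replication space."""
    if not motif:
        raise ValueError("motif must not be empty")
    width, height = panel_size
    x0, y0, x1, y1 = space
    if not (0 <= x0 < x1 <= width and 0 <= y0 < y1 <= height):
        raise ValueError("replication space must lie inside the panel")
    placements = []
    index = 0
    for y in range(y0, y1, step):
        for x in range(x0, x1, step):
            # Cycle through the motif elements in order.
            placements.append((x, y, motif[index % len(motif)]))
            index += 1
    return {
        "panel_size": panel_size,
        "background": background,
        "placements": placements,
    }


composition = build_visual_composition(
    motif=["A", "B"],
    panel_size=PANEL_SIZE_PALETTE[0],
    background=PANEL_COLOR_PALETTE[0],
    space=REPLICATION_SPACE_PALETTE[0],
    step=10,
)
```

Choosing a different panel size, background color or replication space for
the same motif yields a different composition, which is the kind of variation
Fig. 7 is said to show for a single input sequence of fragments.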
Unit DU shown in Fig. 1 relates to computer-controlled imaging (output)
devices, such as displays and projectors, which allow the visualization of the
compositions generated by VCB. Unit DU also relates to any printing computer
peripheral (2D and 3D printers) that puts the sentences and compositions
generated by the present invention on paper or on another medium, such as
fabric or plastic.
Unit AMB shown in Fig. 1 relates to computer-executable instructions for
causing a computing system to carry out steps for converting an input sequence
of fragments (Sf) generated by EF into audio motifs (Am). More particularly,
AMB builds an audio motif by assigning directly to each of the fragments a
sampled sound or a short sample of music from a Sound Palette (SP), or by using
the fragments to create MIDI-type motifs from a MIDI Palette (MP). The terms SP
and MP as used herein refer to the prerecorded and/or computer-generated audio
effects available to the unit.
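The MIDI-type variant can be sketched in Python as a mapping from fragments to
note numbers. The palette below holds the MIDI note numbers of the C major
scale starting at middle C (60); the selection rule (character-code sum modulo
the palette length), the fixed duration and the name `build_audio_motif` are
assumptions made for illustration, since the specification does not define
how fragments are mapped to palette entries.

```python
# Hypothetical MIDI Palette (MP): C major scale from middle C (note 60).
MIDI_PALETTE = [60, 62, 64, 65, 67, 69, 71]


def build_audio_motif(fragments, palette=MIDI_PALETTE, beats=0.5):
    """Assign one palette note to each fragment (assumed selection rule)."""
    motif = []
    for fragment in fragments:
        key = sum(ord(ch) for ch in fragment)
        motif.append({
            "fragment": fragment,
            "note": palette[key % len(palette)],
            "beats": beats,
        })
    return motif


audio_motif = build_audio_motif(["a", "b"])
notes = [event["note"] for event in audio_motif]
```

The same structure would apply to a Sound Palette by replacing the note
numbers with references to prerecorded samples.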
Unit ACB shown in Fig. 1 relates to computer-executable instructions for
causing a computing system to carry out steps for building an audio
composition from an input motif (Am) or a sequence of input motifs generated by
AMB. More particularly, ACB builds an audio composition (Ac) by selecting the
audio composition control parameters from an audio composition control
parameters palette (ARP). The control parameters available to the unit include
a motif's repetition number, motifs' overlapping, volume, tempo, and panning.
The ARP may be configured to allow rules on music composition to be applied.
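How the listed control parameters could shape a composition is sketched below
in Python. The sketch only schedules note events in time; it produces no
sound. The `CompositionParams` fields, their default values, the event layout
and the interpretation of overlapping as a number of beats shared by
consecutive repetitions are assumptions, not definitions from the
specification.

```python
from dataclasses import dataclass


@dataclass
class CompositionParams:
    repetitions: int = 2        # motif repetition number
    overlap_beats: float = 0.0  # beats shared by consecutive repetitions
    volume: float = 0.8         # 0.0 (silent) to 1.0 (full)
    tempo_bpm: float = 120.0    # beats per minute
    pan: float = 0.0            # -1.0 (left) to 1.0 (right)


def build_audio_composition(motif, params):
    """Schedule a motif, given as (note, beats) pairs, as timed events."""
    seconds_per_beat = 60.0 / params.tempo_bpm
    motif_beats = sum(beats for _, beats in motif)
    stride = motif_beats - params.overlap_beats
    if stride <= 0:
        raise ValueError("overlap must be shorter than the motif")
    events = []
    for repetition in range(params.repetitions):
        beat = repetition * stride
        for note, beats in motif:
            events.append({
                "note": note,
                "start": beat * seconds_per_beat,
                "duration": beats * seconds_per_beat,
                "volume": params.volume,
                "pan": params.pan,
            })
            beat += beats
    return events


params = CompositionParams(repetitions=2, overlap_beats=1.0, tempo_bpm=120.0)
events = build_audio_composition([(60, 1), (64, 1)], params)
starts = [event["start"] for event in events]
```

With a one-beat overlap, the second repetition starts while the last note of
the first repetition is still sounding, so two events share the start time
0.5 s.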
Unit TTSU shown in Fig. 1 relates to computer-executable instructions for
causing a computing system to carry out steps for converting input generated
by the NLG component of the present invention to speech using text-to-speech
engines available to the unit, such as FreeTTS.
Unit AU shown in Fig. 1 relates to a computer-controlled audio output device on
which the audio output generated by ACB and TTSU can be reproduced or recorded.
The IMIS unit shown in Fig. 1 relates to an integrated multimedia interface
system, in which the visual and audio compositions can be presented
simultaneously.
A number of features of the present invention have been described by way of
examples, and not limitations.
All of the disclosed methods may be embodied in any information carrier medium
that, when placed in operable relation to a computer, provides the computer
with software for generating sentences. The software component of the present
invention enables the generation and the display of sentences and visual and
audio compositions with minimal computational effort and is operational with
any general-purpose or special-purpose computing system environments or
configurations. Examples of well-known computing systems, environments, and/or
configurations that may be suitable for use with the invention include, but are
not limited to, personal computers, multiprocessor systems, hand-held or laptop
devices, embedded systems, microprocessor-based systems, programmable consumer
electronics (mobile devices), network PCs, minicomputers, distributed computing
environments that include any of the above systems or devices, and the like.
While the foregoing is directed to embodiments of the present invention, other
and further
embodiments of the invention may be devised without departing from the basic
scope thereof, and
the scope thereof is determined by the claims.