Note: Claims are shown in the official language in which they were submitted.
Claims:
1. A device comprising:
an obtaining module, configured to obtain sample corpora of various
conversation
scenarios in a plurality of conversation scenarios;
a generating module, configured to generate a scenario feature of the
conversation
scenario based on a sample corpus of the conversation scenario, wherein the
scenario
feature includes feature words of the conversation scenario and a feature word
sequence
obtained by mapping transformation of the feature words; and
a configuring module, configured to configure an intelligent voice robot based
on a preset
word vector space model and scenario features of the various conversation
scenarios,
wherein the word vector space model is for the intelligent voice robot to perform word vector similarity calculation on a user's conversation and the scenario features of the various conversation scenarios, to recognize an intended scenario of the user's conversation.
2. The device of claim 1, wherein the generating module further comprises:
a representing unit, configured to obtain discrete representation of the
sample corpus of
the conversation scenario based on a preset domain dictionary;
a screening unit, configured to apply a feature selection algorithm to extract
feature
words of the conversation scenario based on the discrete representation of the
sample
corpus of the conversation scenario; and
a generating unit, configured to:
map and transform the feature words of the conversation scenario to a corresponding dictionary index; and
generate the feature word sequence of the conversation scenario, wherein the feature selection algorithm is a chi-square statistic feature selection algorithm.
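By way of illustration only, the following Python sketch shows how a generating module of this kind might extract feature words with a chi-square statistic and map them to dictionary indices; the toy corpus, labels, and scikit-learn usage are assumptions of this sketch, not part of the claims.

```python
# Hypothetical sketch of the generating module in claim 2 (not the claimed implementation).
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Toy sample corpora labeled by conversation scenario (invented data).
corpus = ["repay my loan today", "loan is overdue", "set up a payment plan", "payment plan options"]
labels = [0, 0, 1, 1]  # 0 = "collection" scenario, 1 = "installment" scenario

# Discrete (bag-of-words) representation over a domain dictionary.
vectorizer = CountVectorizer()
X = vectorizer.fit_transform(corpus)
dictionary = vectorizer.vocabulary_  # term -> dictionary index

# Chi-square statistic between each term and the scenario labels.
scores, _ = chi2(X, labels)
terms = vectorizer.get_feature_names_out()

# Keep the top-k terms as the scenario's feature words, then map them
# to their dictionary indices to form the feature word sequence.
top_k = np.argsort(scores)[::-1][:3]
feature_words = [terms[i] for i in top_k]
feature_word_sequence = [dictionary[w] for w in feature_words]
print(feature_words, feature_word_sequence)
```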
3. The device of claim 1, further comprising:
a storing module, configured to store the various conversation scenarios and the scenario features of the conversation scenarios in a scenario feature relation table.
4. The device of claim 3, further comprising:
a receiving module, configured to receive a configuration feature word input
with respect
to any conversation scenario; and
a maintaining module, configured to maintain the scenario feature of the
conversation
scenario in the scenario feature relation table based on the configuration
feature word of
the conversation scenario and a configuration feature word sequence obtained
by
mapping transformation of the configuration feature word.
5. The device of claim 4, wherein the receiving module is configured to
receive a
configuration feature word input by a user having feature configuration
permission with
respect to the conversation scenario.
6. The device of claim 4, wherein the maintaining module is configured to:
merge the configuration feature word of the conversation scenario into the feature words of the conversation scenario in the scenario feature relation table; and
add the configuration feature word sequence of the merged configuration feature word to the feature word sequence of the conversation scenario.
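A minimal sketch of the merge-and-append behavior of claim 6, assuming a hypothetical in-memory relation table and domain dictionary (the dictionary check mirrors claim 35, under which out-of-dictionary words are ignored):

```python
# Hypothetical sketch of the maintaining module in claim 6; the table layout is assumed.
scenario_feature_table = {
    "collection": {"feature_words": ["loan", "overdue"], "feature_word_sequence": [12, 47]},
}
dictionary = {"loan": 12, "overdue": 47, "repay": 63}

def maintain(scenario: str, config_word: str) -> None:
    """Merge a configuration feature word and its mapped index into the table."""
    entry = scenario_feature_table[scenario]
    if config_word in dictionary and config_word not in entry["feature_words"]:
        entry["feature_words"].append(config_word)                       # merge the word
        entry["feature_word_sequence"].append(dictionary[config_word])   # append its index

maintain("collection", "repay")
print(scenario_feature_table["collection"])
```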
7. The device of claim 1, further comprising a training module configured to use domain corpora of a domain to which the various conversation scenarios pertain to train a pre-trained bidirectional encoder representations from transformers (BERT) word vector space, and obtain the word vector space model.
8. The device of claim 1, further comprising a state machine configuring module configured to:
receive a state transition graph input by a first user for the conversation scenario and receive supplementary information input by a second user for the state transition graph, to generate a state transition matrix of the conversation scenario; and
generate a script file containing a state transition logical relation based on the state transition matrix of the conversation scenario and generate a finite state machine (FSM) based on the script file, to return a corresponding pattern when the intended scenario of the user's conversation is recognized.
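As a non-limiting sketch of claim 8, a state transition matrix could be serialized to a JSON script and replayed by a minimal finite state machine; the states, events, and response patterns below are invented for illustration.

```python
# Hypothetical sketch of claim 8's script file and FSM; states/events are invented.
import json

# State transition matrix flattened to (state, event) -> (next_state, pattern).
transitions = {
    ("START", "greeting"): ("ASK_INTENT", "Hello, how can I help you?"),
    ("ASK_INTENT", "repayment"): ("END", "Your repayment has been scheduled."),
}

# Serialize the transition logic to a JSON script file (claim 46 names the JSON format).
script = [{"state": s, "event": e, "next": n, "pattern": p}
          for (s, e), (n, p) in transitions.items()]
with open("fsm_script.json", "w") as f:
    json.dump(script, f)

class FSM:
    """Minimal FSM that replays a JSON script (cf. claim 47)."""
    def __init__(self, path: str):
        with open(path) as f:
            rows = json.load(f)
        self.table = {(r["state"], r["event"]): (r["next"], r["pattern"]) for r in rows}
        self.state = "START"

    def fire(self, event: str) -> str:
        # Advance the state and return the response pattern for the recognized scenario.
        self.state, pattern = self.table[(self.state, event)]
        return pattern

robot = FSM("fsm_script.json")
print(robot.fire("greeting"))
```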
9. The device of claim 1, further comprising an intended scenario recognizing module comprising:
an obtaining unit, configured to:
preprocess the user's conversation to obtain a plurality of segmented terms in the user's conversation;
perform mapping transformation on the plural segmented terms; and
obtain a feature word sequence of the user's conversation;
a constructing unit, configured to implement the word vector space model to construct a feature vector of the user's conversation and scenario feature vectors of the various conversation scenarios based on the feature word sequence of the user's conversation and feature word sequences of the various conversation scenarios; and
a matching unit, configured to:
perform similarity calculation on the feature vector of the user's conversation and the scenario feature vectors of the various conversation scenarios; and
recognize an intention of the user's conversation based on a similarity calculation result, to return a pattern to which the intention corresponds.
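For orientation only, the units of claim 9 might be orchestrated as follows; segment, to_feature_sequence, and embed are assumed helper functions, not claimed components.

```python
# Hypothetical orchestration of claim 9's units; helper functions are assumed.
import numpy as np

def recognize(utterance: str, scenario_vectors: dict[str, np.ndarray],
              segment, to_feature_sequence, embed) -> str:
    terms = segment(utterance)                 # obtaining unit: preprocess into terms
    sequence = to_feature_sequence(terms)      # mapping transformation to indices
    user_vec = embed(sequence)                 # constructing unit: feature vector

    # Matching unit: cosine similarity against each scenario feature vector.
    def cos(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    return max(scenario_vectors, key=lambda s: cos(user_vec, scenario_vectors[s]))
```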
10. The device of any one of claims 1 to 9, wherein the plurality of conversation scenarios are contained in a preset conversation scenario list, wherein the conversation scenario list records one or more conversation scenarios of a specific business field.
11. The device of any one of claims 1 to 10, wherein the sample corpora of the
various
conversation scenarios are obtained by classifying and marking specific domain
corpora
according to classes of the conversation scenarios.
12. The device of any one of claims 1 to 11, wherein the conversation
scenarios are obtained by
performing scenario abstraction on the specific domain corpora, and the
scenario abstraction
is a process from data to information and then to knowledge.
13. The device of any one of claims 1 to 12, wherein each conversation scenario is abstracted as a conversation state, and a dialogue process is abstracted as a transition between conversation states.
14. The device of any one of claims 1 to 13, wherein each conversation state is taken as a node, a directed line between conversation states represents a process in which one state transitions to another state, and the entire dialogue process is abstracted as a graph consisting of nodes and directed lines.
15. The device of any one of claims 1 to 14, wherein content transformation based on a bag-of-words (BOW) model is performed on the sample corpus of each conversation scenario, wherein the discrete representation of the sample corpus of each conversation scenario is obtained, and wherein the feature selection algorithm extracts feature words of each conversation scenario on the basis of the discrete representation of the sample corpus of each conversation scenario.
16. The device of any one of claims 1 to 15, wherein the BOW model divides a corpus text into separate terms, and it is imagined that all terms are placed in a bag, with such elements as their word order, grammar, and syntax ignored, wherein the bag is regarded only as a collection of plural vocabularies, and wherein each term appears in the corpus text independently.
17. The device of any one of claims 1 to 16, wherein the BOW model includes one-hot, TF-IDF, and N-gram models.
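A hedged illustration of the bag-of-words variants named in claim 17, using scikit-learn (binary presence vectors stand in for the one-hot model; the toy corpus is invented):

```python
# Hypothetical illustration of the BOW representations named in claim 17.
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = ["loan overdue today", "repay loan plan"]  # toy corpus

counts = CountVectorizer(binary=True).fit_transform(corpus)          # one-hot-style presence
tfidf = TfidfVectorizer().fit_transform(corpus)                      # TF-IDF weights
bigrams = CountVectorizer(ngram_range=(2, 2)).fit_transform(corpus)  # N-gram (N=2)

print(counts.toarray())  # word order, grammar, and syntax are discarded
```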
18. The device of any one of claims 1 to 17, wherein the name, the feature words, and the feature word sequence of each conversation scenario are correspondingly stored in the scenario feature relation table.
19. The device of any one of claims 1 to 18, wherein the scenario feature
relation table stores
correspondence relations between conversation scenarios and scenario features,
including
feature words and feature word sequences.
20. The device of any one of claims 1 to 19, wherein the scenario feature relation table is stored in a server, maintained offline by back-end algorithm technical personnel based on periodical text data mining operations, and isolated from front-end business personnel.
21. The device of any one of claims 1 to 20, wherein, when the intelligent voice robot is conversing with the user, the user's conversation text is obtained by recognizing and transforming the user's conversation voice via an automatic speech recognition (ASR) technique, and feature information is extracted from the user's conversation text.
22. The device of any one of claims 1 to 21, wherein a pre-trained embedding is obtained by introducing a large-scale BERT word vector space pre-trained with Google BERT serving.
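One publicly available realization of such BERT serving is the open-source bert-as-service project; the sketch below assumes that client and a running BERT-Base (768-dimensional) server, which the claims do not mandate.

```python
# One possible realization of the "BERT serving" in claim 22 (an assumption of this
# sketch): the open-source bert-as-service client against a BERT-Base model server.
from bert_serving.client import BertClient

bc = BertClient(ip="localhost")            # server address is an assumption
vectors = bc.encode(["loan", "overdue"])   # returns an array of shape (2, 768)
print(vectors.shape)
```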
23. The device of any one of claims 1 to 22, wherein the BERT word vector space is retrained by bringing in the business's own customer service corpora, and calibration of the BERT word vectors is realized, to conform to specific business scenarios.
24. The device of any one of claims 1 to 23, wherein total segmented corpora from which stop-words have been removed are used to construct the preset domain dictionary, wherein the preset domain dictionary includes all effective vocabularies appearing in the corpora, and the preset domain dictionary is used to perform content transformation on all sample corpora of a target conversation scenario based on the BOW model to obtain the discrete representation.
25. The device of any one of claims 1 to 24, wherein the chi-square statistic (CHI) technique is used to extract feature words of the target conversation scenario.
26. The device of any one of claims 1 to 25, wherein the CHI calculation formula comprises:
χ²(t, c) = N(AD − BC)² / [(A + B)(A + C)(B + D)(C + D)]
wherein c is a certain class, namely a "conversation scenario", t is a certain term, N is a total number of texts in the training corpora, A is the number of texts that belong to class c and contain term t, B is the number of texts that do not belong to class c but contain term t, C is the number of texts that belong to class c but do not contain term t, and D is the number of texts that neither belong to class c nor contain term t.
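A small numeric check of this formula, reading A, B, C, and D as the cells of the standard 2x2 contingency table (the counts are invented):

```python
# Toy check of the CHI formula in claim 26 with invented counts.
def chi_square(A: int, B: int, C: int, D: int) -> float:
    """chi2(t, c) = N(AD - BC)^2 / [(A+B)(A+C)(B+D)(C+D)]"""
    N = A + B + C + D
    return N * (A * D - B * C) ** 2 / ((A + B) * (A + C) * (B + D) * (C + D))

# Term appears in 40 of 50 texts of class c, and in only 10 of 150 other texts.
print(chi_square(A=40, B=10, C=10, D=140))  # a large value for a strongly class-bound term
```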
27. The device of claim 26, wherein the χ² statistic is used for chi-square hypothesis testing in statistics, to judge the uniformity or goodness of fit between an actual distribution and a theoretical distribution, wherein the null hypothesis H0 is that there is "no marked difference between the observed frequency and the expected frequency".
28. The device of any one of claims 26 to 27, wherein the smaller the chi-square statistic is, the closer the observed frequency is to the expected frequency, and the higher the relevancy between them.
29. The device of any one of claims 26 to 28, wherein χ² is a measure of the distance between an observed object and an expected object, wherein the smaller the distance is, the higher the relevancy between them.
30. The device of any one of claims 26 to 29, wherein the observed object is the term, and the expected object is the conversation scenario, wherein, if the term and the conversation scenario are highly relevant, the statistical distributions of the two are close to each other over the entire samples.
31. The device of any one of claims 26 to 30, wherein, through the χ² statistic, relevancies between all vocabularies in the domain dictionary and the various classes are calculated quickly and accurately based on large quantities of corpora, and a preset number of terms is selected according to the χ² relevancy sorting result to serve as a feature set of the conversation scenarios, to complete the feature mapping between the various scenarios and the various classes in the conversation scenario list.
32. The device of any one of claims 1 to 31, wherein the configuration feature word is input by the user with respect to the target conversation scenario through a system frontend.
33. The device of any one of claims 1 to 32, wherein the system frontend
provides business
personnel with a feature relation extension function to maintain the business
field.
34. The device of any one of claims 1 to 33, wherein, on receiving the configuration feature word, the device updates it into an externally input feature set of the target conversation scenario.
35. The device of any one of claims 1 to 34, wherein, if the externally input configuration feature word is not contained in the domain dictionary, the configuration feature word is ignored.
36. The device of any one of claims 1 to 35, wherein the first user is business personnel, and the second user is algorithm development technical personnel.
37. The device of any one of claims 1 to 36, wherein the FSM is a mathematical model that expresses a finite number of states and such behaviors as transitions and actions amongst these states.
38. The device of any one of claims 1 to 37, wherein the FSM describes the sequence of states that an object experiences within the object's life cycle, and how the object responds to various external events by transitioning between the states.
39. The device of any one of claims 1 to 38, wherein the FSM includes a current state, a condition, an action, and a next state.
40. The device of any one of claims 1 to 39, wherein the current state is the state the FSM is currently in, wherein the condition is also referred to as an event, and wherein, when the condition is satisfied, an action is triggered or one round of state transition is performed.
41. The device of any one of claims 1 to 40, wherein the action indicates the action executed after the condition has been satisfied, and wherein, after the action has been executed to completion, a new state is transitioned to or the original state is maintained.
42. The device of any one of claims 1 to 41, wherein the action is not indispensable; after the condition has been satisfied, it is possible to execute no action and directly transition to the new state.
43. The device of any one of claims 1 to 42, wherein the next state is the new state to be transitioned to after the condition has been satisfied.
44. The device of any one of claims 1 to 43, wherein the next state is relative to the current state; once the next state is activated, it becomes the new current state.
45. The device of any one of claims 1 to 44, wherein the FSM model is abstracted as: START --event--> END, that is, a transition from a START state to an END state triggered by an event.
46. The device of any one of claims 1 to 45, wherein the FSM state transition matrix, once completely filled in, is automatically translated into a script file in JSON format.
47. The device of any one of claims 1 to 46, wherein the JSON script file is read and input by the program into a state machine object when a finite state machine instance is generated, to validate the logic.
48. The device of any one of claims 1 to 47, wherein the finite state machine instance as generated is stored in Redis with a UUID transferred by the frontend as the index value, to facilitate subsequent program access when the interactive voice response (IVR) service starts.
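A hedged sketch of such storage with the redis-py client; the key scheme, serialization, and one-hour TTL (mirroring the preset clearing period of claim 50) are assumptions:

```python
# Hypothetical Redis storage of an FSM instance (claims 48 and 50); key scheme assumed.
import json
import uuid
import redis

r = redis.Redis(host="localhost", port=6379)

fsm_state = {"state": "START", "script": "fsm_script.json"}  # toy serialized instance
task_id = str(uuid.uuid4())                                  # UUID transferred by the frontend

# Store under the UUID; for a single task, expire after a preset period (claim 50).
r.setex(f"fsm:{task_id}", 3600, json.dumps(fsm_state))

# Later, when the IVR service starts, the program fetches the instance by UUID.
restored = json.loads(r.get(f"fsm:{task_id}"))
```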
49. The device of any one of claims 1 to 48, wherein the user performs a persistence operation on the FSM.
50. The device of any one of claims 1 to 49, wherein, when the user selects the task type as a single task at the frontend, the finite state machine instance of the single task as stored in Redis is cleared away as an invalidated object within a preset time period after the IVR marketing service has been triggered.
51. The device of any one of claims 1 to 50, wherein the intelligent voice robot converses with the user, wherein the user's conversation is text content recognized and transcribed from the user's conversation speech through ASR technology.
52. The device of any one of claims 1 to 51, wherein the text content is word-segmented to obtain the plurality of segmented terms, and the word-segmentation process includes character purification, rectification, word segmentation, and removal of stop-words.
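Purely as an illustration of such a pipeline, assuming Chinese input text and the jieba segmenter, with an invented stop-word list (the rectification step is omitted for brevity):

```python
# Hypothetical word-segmentation pipeline for claim 52; jieba and the stop-words are assumptions.
import re
import jieba

STOP_WORDS = {"的", "了", "吗"}  # toy stop-word list

def preprocess(text: str) -> list[str]:
    text = re.sub(r"[^\w\u4e00-\u9fff]", "", text)    # character purification
    terms = jieba.lcut(text)                          # word segmentation
    return [t for t in terms if t not in STOP_WORDS]  # stop-word removal

print(preprocess("我想了解一下分期还款的方案"))
```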
53. The device of any one of claims 1 to 52, wherein the plural segmented terms are mapped and transformed, through the index of the domain dictionary, into the form of expression of the conversation scenario in the scenario feature relation table.
54. The device of any one of claims 1 to 53, wherein each element in the feature word sequence of the user's conversation is mapped into the BERT word vector space to obtain a 768-dimensional feature vector, and all elements are summed and averaged to obtain a 1x768 vector to serve as the feature expression of the user's conversation input.
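A minimal numpy sketch of this mean-pooling step, with a random lookup table standing in for the BERT word vector space:

```python
# Mean-pooling of 768-dim word vectors as in claim 54; the embedding table is a stand-in.
import numpy as np

rng = np.random.default_rng(0)
bert_space = {idx: rng.normal(size=768) for idx in range(100)}  # toy BERT word vector space

feature_word_sequence = [12, 47, 63]  # dictionary indices from the user's conversation
vectors = np.stack([bert_space[i] for i in feature_word_sequence])  # shape (3, 768)
user_vector = vectors.mean(axis=0, keepdims=True)                   # shape (1, 768)
print(user_vector.shape)
```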
55. The device of any one of claims 1 to 54, wherein the feature word sequences of the various conversation scenarios in the scenario feature relation table are likewise converted into 1x768 feature vectors.
56. The device of any one of claims 1 to 55, wherein a cosine similarity between the feature vector input by the user's conversation and the scenario feature vector of the conversation scenario is calculated, wherein the greater the cosine similarity calculation result is, the higher the similarity and the higher the relevancy between the user's conversation and the conversation scenario.
57. The device of any one of claims 1 to 56, wherein, by arranging all the conversation scenarios in descending order according to the cosine similarity calculation results, the conversation scenario with the highest cosine similarity calculation result is returned to serve as the judgement result of the intended scenario of the user's current input, and a corresponding response pattern under the intended scenario is returned according to the state in which the FSM currently is.
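To make the ranking of claims 56 and 57 concrete, a hedged sketch with invented scenario vectors; in practice the FSM of claim 8 would then return the response pattern for the winning scenario:

```python
# Hypothetical ranking step for claims 56-57; vectors and scenario names are invented.
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

scenario_vectors = {"collection": np.ones(768), "installment": -np.ones(768)}
user_vector = np.ones(768)

ranked = sorted(scenario_vectors.items(),
                key=lambda kv: cosine(user_vector, kv[1]), reverse=True)
intended_scenario = ranked[0][0]  # highest cosine similarity wins
print(intended_scenario)          # the FSM then returns the matching response pattern
```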
58. The device of any one of claims 1 to 57, wherein the response pattern is transformed into speech content through Text-To-Speech (TTS) technology and broadcast to the user.
59. A system comprising:
an obtaining module, configured to obtain sample corpora of various
conversation
scenarios in a plurality of conversation scenarios;
a generating module, configured to generate a scenario feature of the
conversation
scenario based on a sample corpus of the conversation scenario, wherein the
scenario
feature includes feature words of the conversation scenario and a feature word
sequence
obtained by mapping transformation of the feature words; and
a configuring module, configured to configure an intelligent voice robot based
on a preset
word vector space model and scenario features of the various conversation
scenarios,
wherein the word vector space model is for the intelligent voice robot to perform word vector similarity calculation on a user's conversation and the scenario features of the various conversation scenarios, to recognize an intended scenario of the user's conversation.
60. The system of claim 59, wherein the generating module further comprises:
a representing unit, configured to obtain discrete representation of the
sample corpus of
the conversation scenario based on a preset domain dictionary;
a screening unit, configured to apply a feature selection algorithm to extract
feature
words of the conversation scenario based on the discrete representation of the
sample
corpus of the conversation scenario; and
a generating unit, configured to:
map and transform the feature words of the conversation scenario to a corresponding dictionary index; and
generate the feature word sequence of the conversation scenario, wherein the feature selection algorithm is a chi-square statistic feature selection algorithm.
61. The system of claim 59, further comprising:
a storing module, configured to store the various conversation scenarios and the scenario features of the conversation scenarios in a scenario feature relation table.
62. The system of claim 61, further comprising:
a receiving module, configured to receive a configuration feature word input
with respect
to any conversation scenario; and
a maintaining module, configured to maintain the scenario feature of the
conversation
scenario in the scenario feature relation table based on the configuration
feature word of
the conversation scenario and a configuration feature word sequence obtained
by
mapping transformation of the configuration feature word.
63. The system of claim 62, wherein the receiving module is configured to
receive a
configuration feature word input by a user having feature configuration
permission with
respect to the conversation scenario.
64. The system of claim 62, wherein the maintaining module is configured to:
merge the configuration feature word of the conversation scenario into the feature words of the conversation scenario in the scenario feature relation table; and
add the configuration feature word sequence of the merged configuration feature word to the feature word sequence of the conversation scenario.
65. The system of claim 59, further comprising a training module configured to use domain corpora of a domain to which the various conversation scenarios pertain to train a pre-trained bidirectional encoder representations from transformers (BERT) word vector space, and obtain the word vector space model.
66. The system of claim 59, further comprising a state machine configuring module configured to:
receive a state transition graph input by a first user for the conversation scenario and receive supplementary information input by a second user for the state transition graph, to generate a state transition matrix of the conversation scenario; and
generate a script file containing a state transition logical relation based on the state transition matrix of the conversation scenario and generate a finite state machine (FSM) based on the script file, to return a corresponding pattern when the intended scenario of the user's conversation is recognized.
67. The system of claim 59, further comprising an intended scenario recognizing module comprising:
an obtaining unit, configured to:
preprocess the user's conversation to obtain a plurality of segmented terms in the user's conversation;
perform mapping transformation on the plural segmented terms; and
obtain a feature word sequence of the user's conversation;
a constructing unit, configured to implement the word vector space model to construct a feature vector of the user's conversation and scenario feature vectors of the various conversation scenarios based on the feature word sequence of the user's conversation and feature word sequences of the various conversation scenarios; and
a matching unit, configured to:
perform similarity calculation on the feature vector of the user's conversation and the scenario feature vectors of the various conversation scenarios; and
recognize an intention of the user's conversation based on a similarity calculation result, to return a pattern to which the intention corresponds.
68. The system of any one of claims 59 to 67, wherein the plurality of conversation scenarios are contained in a preset conversation scenario list, wherein the conversation scenario list records one or more conversation scenarios of a specific business field.
69. The system of any one of claims 59 to 68, wherein the sample corpora of
the various
conversation scenarios are obtained by classifying and marking specific domain
corpora
according to classes of the conversation scenarios.
70. The system of any one of claims 59 to 69, wherein the conversation
scenarios are obtained
by performing scenario abstraction on the specific domain corpora, and the
scenario
abstraction is a process from data to information and then to knowledge.
71. The system of any one of claims 59 to 70, wherein each conversation scenario is abstracted as a conversation state, and a dialogue process is abstracted as a transition between conversation states.
72. The system of any one of claims 59 to 71, wherein each conversation state is taken as a node, a directed line between conversation states represents a process in which one state transitions to another state, and the entire dialogue process is abstracted as a graph consisting of nodes and directed lines.
73. The system of any one of claims 59 to 72, wherein content transformation based on a bag-of-words (BOW) model is performed on the sample corpus of each conversation scenario, wherein the discrete representation of the sample corpus of each conversation scenario is obtained, and wherein the feature selection algorithm extracts feature words of each conversation scenario on the basis of the discrete representation of the sample corpus of each conversation scenario.
74. The system of any one of claims 59 to 73, wherein the BOW model divides a corpus text into separate terms, and it is imagined that all terms are placed in a bag, with such elements as their word order, grammar, and syntax ignored, wherein the bag is regarded only as a collection of plural vocabularies, and wherein each term appears in the corpus text independently.
75. The system of any one of claims 59 to 74, wherein the BOW model includes one-hot, TF-IDF, and N-gram models.
76. The system of any one of claims 59 to 75, wherein the name, the feature words, and the feature word sequence of each conversation scenario are correspondingly stored in the scenario feature relation table.
77. The system of any one of claims 59 to 76, wherein the scenario feature
relation table stores
correspondence relations between conversation scenarios and scenario features,
including
feature words and feature word sequences.
78. The system of any one of claims 59 to 77, wherein the scenario feature relation table is stored in a server, maintained offline by back-end algorithm technical personnel based on periodical text data mining operations, and isolated from front-end business personnel.
79. The system of any one of claims 59 to 78, wherein, when the intelligent voice robot is conversing with the user, the user's conversation text is obtained by recognizing and transforming the user's conversation voice via an automatic speech recognition (ASR) technique, and feature information is extracted from the user's conversation text.
80. The system of any one of claims 59 to 79, wherein a pre-trained embedding is obtained by introducing a large-scale BERT word vector space pre-trained with Google BERT serving.
81. The system of any one of claims 59 to 80, wherein the BERT word vector space is retrained by bringing in the business's own customer service corpora, and calibration of the BERT word vectors is realized, to conform to specific business scenarios.
82. The system of any one of claims 59 to 81, wherein total segmented corpora from which stop-words have been removed are used to construct the preset domain dictionary, wherein the preset domain dictionary includes all effective vocabularies appearing in the corpora, and the preset domain dictionary is used to perform content transformation on all sample corpora of a target conversation scenario based on the BOW model to obtain the discrete representation.
83. The system of any one of claims 59 to 82, wherein the chi-square statistic (CHI) technique is used to extract feature words of the target conversation scenario.
84. The system of any one of claims 59 to 83, wherein the CHI calculation formula comprises:
χ²(t, c) = N(AD − BC)² / [(A + B)(A + C)(B + D)(C + D)]
wherein c is a certain class, namely a "conversation scenario", t is a certain term, N is a total number of texts in the training corpora, A is the number of texts that belong to class c and contain term t, B is the number of texts that do not belong to class c but contain term t, C is the number of texts that belong to class c but do not contain term t, and D is the number of texts that neither belong to class c nor contain term t.
85. The system of claim 84, wherein the χ² statistic is used for chi-square hypothesis testing in statistics, to judge the uniformity or goodness of fit between an actual distribution and a theoretical distribution, wherein the null hypothesis H0 is that there is "no marked difference between the observed frequency and the expected frequency".
86. The system of any one of claims 84 to 85, wherein the smaller the chi-square statistic is, the closer the observed frequency is to the expected frequency, and the higher the relevancy between them.
87. The system of any one of claims 84 to 86, wherein χ² is a measure of the distance between an observed object and an expected object, wherein the smaller the distance is, the higher the relevancy between them.
88. The system of any one of claims 84 to 87, wherein the observed object is the term, and the expected object is the conversation scenario, wherein, if the term and the conversation scenario are highly relevant, the statistical distributions of the two are close to each other over the entire samples.
89. The system of any one of claims 84 to 88, wherein, through the χ² statistic, relevancies between all vocabularies in the domain dictionary and the various classes are calculated quickly and accurately based on large quantities of corpora, and a preset number of terms is selected according to the χ² relevancy sorting result to serve as a feature set of the conversation scenarios, to complete the feature mapping between the various scenarios and the various classes in the conversation scenario list.
90. The system of any one of claims 59 to 89, wherein the configuration feature word is input by the user with respect to the target conversation scenario through a system frontend.
91. The system of any one of claims 59 to 90, wherein the system frontend
provides business
personnel with a feature relation extension function to maintain the business
field.
92. The system of any one of claims 59 to 91, wherein, on receiving the configuration feature word, the system updates it into an externally input feature set of the target conversation scenario.
93. The system of any one of claims 59 to 92, wherein, if the externally input configuration feature word is not contained in the domain dictionary, the configuration feature word is ignored.
94. The system of any one of claims 59 to 93, wherein the first user is business personnel, and the second user is algorithm development technical personnel.
95. The system of any one of claims 59 to 94, wherein the FSM is a mathematical model that expresses a finite number of states and such behaviors as transitions and actions amongst these states.
96. The system of any one of claims 59 to 95, wherein the FSM describes the sequence of states that an object experiences within the object's life cycle, and how the object responds to various external events by transitioning between the states.
97. The system of any one of claims 59 to 96, wherein the FSM includes a current state, a condition, an action, and a next state.
98. The system of any one of claims 59 to 97, wherein the current state is the state the FSM is currently in, wherein the condition is also referred to as an event, and wherein, when the condition is satisfied, an action is triggered or one round of state transition is performed.
99. The system of any one of claims 59 to 98, wherein the action indicates the action executed after the condition has been satisfied, and wherein, after the action has been executed to completion, a new state is transitioned to or the original state is maintained.
100. The system of any one of claims 59 to 99, wherein the action is not indispensable; after the condition has been satisfied, it is possible to execute no action and directly transition to the new state.
101. The system of any one of claims 59 to 100, wherein the next state is the new state to be transitioned to after the condition has been satisfied.
102. The system of any one of claims 59 to 101, wherein the next state is relative to the current state; once the next state is activated, it becomes the new current state.
103. The system of any one of claims 59 to 102, wherein the FSM model is abstracted as: START --event--> END, that is, a transition from a START state to an END state triggered by an event.
104. The system of any one of claims 59 to 103, wherein the FSM state transition matrix, once completely filled in, is automatically translated into a script file in JSON format.
105. The system of any one of claims 59 to 104, wherein the JSON script file is read and input by the program into a state machine object when a finite state machine instance is generated, to validate the logic.
106. The system of any one of claims 59 to 105, wherein the finite state machine instance as generated is stored in Redis with a UUID transferred by the frontend as the index value, to facilitate subsequent program access when the interactive voice response (IVR) service starts.
107. The system of any one of claims 59 to 106, wherein the user performs a persistence operation on the FSM.
108. The system of any one of claims 59 to 107, wherein, when the user selects the task type as a single task at the frontend, the finite state machine instance of the single task as stored in Redis is cleared away as an invalidated object within a preset time period after the IVR marketing service has been triggered.
109. The system of any one of claims 59 to 108, wherein the intelligent voice robot converses with the user, wherein the user's conversation is text content recognized and transcribed from the user's conversation speech through ASR technology.
110. The system of any one of claims 59 to 109, wherein the text content is word-segmented to obtain the plurality of segmented terms, and the word-segmentation process includes character purification, rectification, word segmentation, and removal of stop-words.
111. The system of any one of claims 59 to 110, wherein the plural segmented terms are mapped and transformed, through the index of the domain dictionary, into the form of expression of the conversation scenario in the scenario feature relation table.
112. The system of any one of claims 59 to 111, wherein each element in the feature word sequence of the user's conversation is mapped into the BERT word vector space to obtain a 768-dimensional feature vector, and all elements are summed and averaged to obtain a 1x768 vector to serve as the feature expression of the user's conversation input.
113. The system of any one of claims 59 to 112, wherein the feature word sequences of the various conversation scenarios in the scenario feature relation table are likewise converted into 1x768 feature vectors.
114. The system of any one of claims 59 to 113, wherein a cosine similarity between the feature vector input by the user's conversation and the scenario feature vector of the conversation scenario is calculated, wherein the greater the cosine similarity calculation result is, the higher the similarity and the higher the relevancy between the user's conversation and the conversation scenario.
115. The system of any one of claims 59 to 114, wherein, by arranging all the conversation scenarios in descending order according to the cosine similarity calculation results, the conversation scenario with the highest cosine similarity calculation result is returned to serve as the judgement result of the intended scenario of the user's current input, and a corresponding response pattern under the intended scenario is returned according to the state in which the FSM currently is.
116. The system of any one of claims 59 to 115, wherein the response pattern is transformed into speech content through Text-To-Speech (TTS) technology and broadcast to the user.
117. A method comprising:
obtaining sample corpora of various conversation scenarios in a plurality of
conversation
scenarios;
generating a scenario feature of the conversation scenario based on a sample
corpus of
the conversation scenario, wherein the scenario feature includes feature words
of the
conversation scenario and a feature word sequence obtained by mapping
transformation
of the feature words; and
configuring an intelligent voice robot with a preset word vector space model
and the
scenario features of the various conversation scenarios, wherein the word
vector space
model is for the intelligent voice robot to perform word vector similarity
calculation on a
user's conversation and the scenario features of the various conversation
scenarios, to
recognize an intended scenario of the user's conversation.
118. The method of claim 117, wherein generating the scenario feature of the
conversation
scenario based on the sample corpus of the conversation scenario comprises:
obtaining discrete representation of the sample corpus of the conversation
scenario based
on a preset domain dictionary;
employing a feature selection algorithm to extract feature words of the
conversation
scenario based on the discrete representation of the sample corpus of the
conversation
scenario;
mapping and transforming the feature words of the conversation scenario to a
corresponding dictionary index; and
generating the feature word sequence of the conversation scenario, wherein the
feature
selection algorithm is a chi-square statistic feature selection algorithm.
119. The method of claim 117, further comprising:
storing the various conversation scenarios and the scenario features of the
conversation
scenarios in a scenario feature relation table.
120. The method of claim 117, further comprising:
receiving a configuration feature word input with respect to the conversation
scenario;
and
maintaining the scenario feature of the conversation scenario in the scenario
feature
relation table based on the configuration feature word of the conversation
scenario and a
configuration feature word sequence obtained by mapping transformation of the
configuration feature word.
121. The method of claim 120, further comprises receiving the configuration
feature word input
by the user having feature configuration permission with respect to the
conversation
scenario.
122. The method of claim 120, wherein maintaining the scenario feature of the
conversation
scenario in the scenario feature relation table based on the configuration
feature word of the
conversation scenario and the configuration feature word sequence obtained by
mapping
transformation of the configuration feature word comprises:
merging the configuration feature word of the conversation scenario to the
feature words
of the conversation scenario in the scenario feature relation table; and
adding the configuration feature word sequence of a merged configuration
feature word
to the feature word sequence of the conversation scenario.
123. The method of claim 117, wherein training and obtaining the word vector space model comprises:
applying domain corpora of a domain to which the various conversation scenarios pertain to train a pre-trained bidirectional encoder representations from transformers (BERT) word vector space; and
obtaining the word vector space model.
124. The method of claim 117, further comprising:
receiving a state transition graph input by a first user for the conversation scenario and receiving supplementary information input by a second user for the state transition graph, to generate a state transition matrix of the conversation scenario; and
generating a script file containing a state transition logical relation based on the state transition matrix of the conversation scenario and generating a finite state machine (FSM) based on the script file, to return a corresponding pattern when the intended scenario of the user's conversation is recognized.
125. The method of any one of claims 117 to 124, further comprising:
preprocessing the user's conversation to obtain a plurality of segmented terms in the user's conversation;
performing mapping transformation on the segmented terms;
obtaining a feature word sequence of the user's conversation;
using the word vector space model to construct a feature vector of the user's conversation and scenario feature vectors of the various conversation scenarios based on the feature word sequence of the user's conversation and feature word sequences of the various conversation scenarios;
performing similarity calculation on the feature vector of the user's conversation and the scenario feature vectors of the various conversation scenarios; and
recognizing an intention of the user's conversation based on a similarity calculation result, to return a pattern to which the intention corresponds.
126. The method of any one of claims 117 to 125, wherein the plurality of conversation scenarios are contained in a preset conversation scenario list, wherein the conversation scenario list records one or more conversation scenarios of a specific business field.
127. The method of any one of claims 117 to 126, wherein the sample corpora of
the various
conversation scenarios are obtained by classifying and marking specific domain
corpora
according to classes of the conversation scenarios.
128. The method of any one of claims 117 to 127, wherein the conversation scenarios are obtained by performing scenario abstraction on the specific domain corpora, and the scenario abstraction is a process from data to information and then to knowledge.
129. The method of any one of claims 117 to 128, wherein each conversation scenario is abstracted as a conversation state, and a dialogue process is abstracted as a transition between conversation states.
130. The method of any one of claims 117 to 129, wherein each conversation state is taken as a node, a directed line between conversation states represents a process in which one state transitions to another state, and the entire dialogue process is abstracted as a graph consisting of nodes and directed lines.
131. The method of any one of claims 117 to 130, wherein content transformation based on a bag-of-words (BOW) model is performed on the sample corpus of each conversation scenario, wherein the discrete representation of the sample corpus of each conversation scenario is obtained, and wherein the feature selection algorithm extracts feature words of each conversation scenario on the basis of the discrete representation of the sample corpus of each conversation scenario.
132. The method of any one of claims 117 to 131, wherein the BOW model divides a corpus text into separate terms, and it is imagined that all terms are placed in a bag, with such elements as their word order, grammar, and syntax ignored, wherein the bag is regarded only as a collection of plural vocabularies, and wherein each term appears in the corpus text independently.
133. The method of any one of claims 117 to 132, wherein the BOW model includes one-hot, TF-IDF, and N-gram models.
134. The method of any one of claims 117 to 133, wherein the name, the feature words, and the feature word sequence of each conversation scenario are correspondingly stored in the scenario feature relation table.
135. The method of any one of claims 117 to 134, wherein the scenario feature
relation table
stores correspondence relations between conversation scenarios and scenario
features,
including feature words and feature word sequences.
136. The method of any one of claims 117 to 135, wherein the scenario feature relation table is stored in a server, maintained offline by back-end algorithm technical personnel based on periodical text data mining operations, and isolated from front-end business personnel.
137. The method of any one of claims 117 to 136, wherein, when the intelligent voice robot is conversing with the user, the user's conversation text is obtained by recognizing and transforming the user's conversation voice via an automatic speech recognition (ASR) technique, and feature information is extracted from the user's conversation text.
138. The method of any one of claims 117 to 137, wherein a pre-trained embedding is obtained by introducing a large-scale BERT word vector space pre-trained with Google BERT serving.
139. The method of any one of claims 117 to 138, wherein the BERT word vector space is retrained by bringing in the business's own customer service corpora, and calibration of the BERT word vectors is realized, to conform to specific business scenarios.
140. The method of any one of claims 117 to 139, wherein total segmented corpora from which stop-words have been removed are used to construct the preset domain dictionary, wherein the preset domain dictionary includes all effective vocabularies appearing in the corpora, and the preset domain dictionary is used to perform content transformation on all sample corpora of a target conversation scenario based on the BOW model to obtain the discrete representation.
141. The method of any one of claims 117 to 140, wherein the chi-square statistic (CHI) technique is used to extract feature words of the target conversation scenario.
142. The method of any one of claims 117 to 141, wherein the CHI calculation formula comprises:
χ²(t, c) = N(AD − BC)² / [(A + B)(A + C)(B + D)(C + D)]
wherein c is a certain class, namely a "conversation scenario", t is a certain term, N is a total number of texts in the training corpora, A is the number of texts that belong to class c and contain term t, B is the number of texts that do not belong to class c but contain term t, C is the number of texts that belong to class c but do not contain term t, and D is the number of texts that neither belong to class c nor contain term t.
143. The method of claim 142, wherein the χ² statistic is used for chi-square hypothesis testing in statistics, to judge the uniformity or goodness of fit between an actual distribution and a theoretical distribution, wherein the null hypothesis H0 is that there is "no marked difference between the observed frequency and the expected frequency".
144. The method of any one of claims 141 to 143, wherein the smaller the chi-square statistic is, the closer the observed frequency is to the expected frequency, and the higher the relevancy between them.
145. The method of any one of claims 141 to 144, wherein χ² is a measure of the distance between an observed object and an expected object, wherein the smaller the distance is, the higher the relevancy between them.
146. The method of any one of claims 141 to 143, wherein the observed object is the term, and the expected object is the conversation scenario, wherein, if the term and the conversation scenario are highly relevant, the statistical distributions of the two are close to each other over the entire samples.
147. The method of any one of claims 141 to 143, wherein, through the χ² statistic, relevancies between all vocabularies in the domain dictionary and the various classes are calculated quickly and accurately based on large quantities of corpora, and a preset number of terms is selected according to the χ² relevancy sorting result to serve as a feature set of the conversation scenarios, to complete the feature mapping between the various scenarios and the various classes in the conversation scenario list.
148. The method of any one of claims 117 to 147, wherein the configuration feature word is input by the user with respect to the target conversation scenario through a system frontend.
149. The method of any one of claims 117 to 148, wherein the system frontend
provides business
personnel with a feature relation extension function to maintain the business
field.
150. The method of any one of claims 117 to 149, wherein, on receiving the configuration feature word, the configuration feature word is updated into an externally input feature set of the target conversation scenario.
151. The method of any one of claims 117 to 150, wherein, if the externally input configuration feature word is not contained in the domain dictionary, the configuration feature word is ignored.
152. The method of any one of claims 117 to 151, wherein the first user is business personnel, and the second user is algorithm development technical personnel.
153. The method of any one of claims 117 to 152, wherein the FSM is a mathematical model that expresses a finite number of states and such behaviors as transitions and actions amongst these states.
154. The method of any one of claims 117 to 153, wherein the FSM describes the sequence of states that an object experiences within the object's life cycle, and how the object responds to various external events by transitioning between the states.
155. The method of any one of claims 117 to 154, wherein the FSM includes a current state, a condition, an action, and a next state.
156. The method of any one of claims 117 to 155, wherein the current state is the state the FSM is currently in, wherein the condition is also referred to as an event, and wherein, when the condition is satisfied, an action is triggered or one round of state transition is performed.
157. The method of any one of claims 117 to 156, wherein the action indicates the action executed after the condition has been satisfied, and wherein, after the action has been executed to completion, a new state is transitioned to or the original state is maintained.
158. The method of any one of claims 117 to 157, wherein the action is not indispensable; after the condition has been satisfied, it is possible to execute no action and directly transition to the new state.
159. The method of any one of claims 117 to 158, wherein the next state is the new state to be transitioned to after the condition has been satisfied.
160. The method of any one of claims 117 to 159, wherein the next state is relative to the current state; once the next state is activated, it becomes the new current state.
161. The method of any one of claims 117 to 160, wherein the FSM model is abstracted as: START --event--> END, that is, a transition from a START state to an END state triggered by an event.
162. The method of any one of claims 117 to 161, wherein the FSM state transition matrix, once completely filled in, is automatically translated into a script file in JSON format.
163. The method of any one of claims 117 to 162, wherein the JSON script file is read and input by the program into a state machine object when a finite state machine instance is generated, to validate the logic.
164. The method of any one of claims 117 to 163, wherein the finite state machine instance as generated is stored in Redis with a UUID transferred by the frontend as the index value, to facilitate subsequent program access when the interactive voice response (IVR) service starts.
165. The method of any one of claims 117 to 164, wherein the user performs a persistence operation on the FSM.
166. The method of any one of claims 117 to 165, wherein, when the user selects the task type as a single task at the frontend, the finite state machine instance of the single task as stored in Redis is cleared away as an invalidated object within a preset time period after the IVR marketing service has been triggered.
167. The method of any one of claims 117 to 166, wherein the intelligent voice robot converses with the user, wherein the user's conversation is text content recognized and transcribed from the user's conversation speech through ASR technology.
168. The method of any one of claims 117 to 167, wherein the text content is word-segmented to obtain the plurality of segmented terms, and the word-segmentation process includes character purification, rectification, word segmentation, and removal of stop-words.
169. The method of any one of claims 117 to 168, wherein the plural segmented terms are mapped and transformed, through the index of the domain dictionary, into the form of expression of the conversation scenario in the scenario feature relation table.
170. The method of any one of claims 117 to 169, wherein each element in the feature word sequence of the user's conversation is mapped into the BERT word vector space to obtain a 768-dimensional feature vector, and all elements are summed and averaged to obtain a 1x768 vector to serve as the feature expression of the user's conversation input.
171. The method of any one of claims 117 to 170, wherein the feature word sequences of the various conversation scenarios in the scenario feature relation table are likewise converted into 1x768 feature vectors.
172. The method of any one of claims 117 to 171, wherein a cosine similarity between the feature vector input by the user's conversation and the scenario feature vector of the conversation scenario is calculated, wherein the greater the cosine similarity calculation result is, the higher the similarity and the higher the relevancy between the user's conversation and the conversation scenario.
173. The method of any one of claims 117 to 172, wherein, by arranging all the conversation scenarios in descending order according to the cosine similarity calculation results, the conversation scenario with the highest cosine similarity calculation result is returned to serve as the judgement result of the intended scenario of the user's current input, and a corresponding response pattern under the intended scenario is returned according to the state in which the FSM currently is.
174. The method of any one of claims 117 to 173, wherein the response pattern is transformed into speech content through Text-To-Speech (TTS) technology and broadcast to the user.
175. A computer equipment comprising:
a memory, including a nonvolatile storage medium and an internal memory, wherein the nonvolatile storage medium stores therein an operating system and a computer program, and wherein the internal memory provides an environment for running the operating system and the computer program in the nonvolatile storage medium;
a processor, configured to provide computing and controlling capabilities;
a network interface, connected via a system bus, configured to connect to other equipment via a network for communication; and
a computer program stored on the memory and operable on the processor, wherein
the
processor executes the computer program configured to:
obtain sample corpora of various conversation scenarios in a plurality of
conversation scenarios;
generate a scenario feature of the conversation scenario based on a sample
corpus of the conversation scenario, wherein the scenario feature includes
feature words of the conversation scenario and a feature word sequence
obtained by mapping transformation of the feature words; and
configure an intelligent voice robot with a preset word vector space model and
the scenario features of the various conversation scenarios, wherein the word
vector space model is for the intelligent voice robot to perform word vector
similarity calculation on a user's conversation and the scenario features of
the
various conversation scenarios, to recognize an intended scenario of the
user's
conversation.
176. The equipment of claim 175, wherein generating the scenario feature of
the conversation
scenario based on the sample corpus of the conversation scenario comprises:
obtaining discrete representation of the sample corpus of the conversation
scenario based
on a preset domain dictionary;
employing a feature selection algorithm to extract feature words of the
conversation
scenario based on the discrete representation of the sample corpus of the
conversation
scenario;
mapping and transforming the feature words of the conversation scenario to a
corresponding dictionary index; and
generating the feature word sequence of the conversation scenario, wherein the
feature
selection algorithm is a chi-square statistic feature selection algorithm.
177. The equipment of claim 175, further comprising:
storing the various conversation scenarios and the scenario features of the
conversation
scenarios in a scenario feature relation table.
178. The equipment of claim 175, further comprising:
receiving a configuration feature word input with respect to the conversation
scenario;
and
maintaining the scenario feature of the conversation scenario in the scenario
feature
relation table based on the configuration feature word of the conversation
scenario and a
configuration feature word sequence obtained by mapping transformation of the
configuration feature word.
179. The equipment of claim 178, further comprises receiving the configuration
feature word
input by the user having feature configuration permission with respect to the
conversation
scenario.
180. The equipment of claim 178, wherein maintaining the scenario feature of
the conversation
scenario in the scenario feature relation table based on the configuration
feature word of the
conversation scenario and the configuration feature word sequence obtained by
mapping
transformation of the configuration feature word comprises:
merging the configuration feature word of the conversation scenario to the
feature words
of the conversation scenario in the scenario feature relation table; and
adding the configuration feature word sequence of a merged configuration
feature word
to the feature word sequence of the conversation scenario.
181. The equipment of claim 175, wherein training and obtaining the word vector space model comprises:
applying domain corpora of a domain to which the various conversation scenarios pertain to train a pre-trained bidirectional encoder representations from transformers (BERT) word vector space; and
obtaining the word vector space model.
182. The equipment of claim 175, further comprising:
receiving a state transition graph input by a first user for the conversation scenario and receiving supplementary information input by a second user for the state transition graph, to generate a state transition matrix of the conversation scenario; and
generating a script file containing a state transition logical relation based on the state transition matrix of the conversation scenario and generating a finite state machine (FSM) based on the script file, to return a corresponding pattern when the intended scenario of the user's conversation is recognized.
183. The equipment of any one of claims 175 to 182, further comprising:
preprocessing the user's conversation to obtain a plurality of segmented terms in the user's conversation;
performing mapping transformation on the segmented terms;
obtaining a feature word sequence of the user's conversation;
using the word vector space model to construct a feature vector of the user's conversation and scenario feature vectors of the various conversation scenarios based on the feature word sequence of the user's conversation and feature word sequences of the various conversation scenarios;
performing similarity calculation on the feature vector of the user's conversation and the scenario feature vectors of the various conversation scenarios; and
recognizing an intention of the user's conversation based on a similarity calculation result, to return a pattern to which the intention corresponds.
184. The equipment of any one of claims 175 to 183, wherein the plurality of conversation scenarios are contained in a preset conversation scenario list, wherein the conversation scenario list records one or more conversation scenarios of a specific business field.
185. The equipment of any one of claims 175 to 184, wherein the sample corpora
of the various
conversation scenarios are obtained by classifying and marking specific domain
corpora
according to classes of the conversation scenarios.
186. The equipment of any one of claims 175 to 185, wherein the conversation
scenarios are
obtained by performing scenario abstraction on the specific domain corpora,
and the
scenario abstraction is a process from data to information and then to
knowledge.
187. The equipment of any one of claims 175 to 186, wherein each conversation
scenario is
abstracted as a conversation state, and dialogue process is abstracted as a
transition between
conversation states.
188. The equipment of any one of claims 175 to 187, wherein the conversation
state is taken as a
node, an oriented line between the conversation states is a process in which
one state is
transferred to another state, wherein the entire dialogue process is abstracted as
a graph
consisting of nodes and oriented lines.
189. The equipment of any one of claims 175 to 188, wherein content
transformation based on a
bag-of-words (WOB) model is performed on the sample corpus of each conversation
scenario, wherein discrete representation of the sample corpus of each
conversation scenario
is obtained, wherein the feature selection algorithm is to extract feature
words of each
conversation scenario on the basis of the discrete representation of the
sample corpus of
each conversation scenario.
190. The equipment of any one of claims 175 to 189, wherein the WOB model
divides a corpus
text into separate terms, and it is imagined that all terms are placed in a bag, while such
elements as their word orders, grammars and syntaxes are ignored, wherein the bag is only
regarded as a
collection of plural vocabularies, wherein each term as it appears in the
corpus text is
independent.
191. The equipment of any one of claims 175 to 190, wherein the WOB model
includes one-hot, TF-IDF and N-gram models.
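The three discrete representations named in claim 191 can be illustrated with scikit-learn vectorizers; the toy corpus and parameter choices are assumptions of this sketch.

    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

    corpus = ["card lost please help", "report lost card", "check card balance"]

    one_hot = CountVectorizer(binary=True).fit_transform(corpus)         # one-hot presence
    tf_idf  = TfidfVectorizer().fit_transform(corpus)                    # TF-IDF weights
    ngrams  = CountVectorizer(ngram_range=(1, 2)).fit_transform(corpus)  # uni- and bi-grams
    print(one_hot.shape, tf_idf.shape, ngrams.shape)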
192. The equipment of any one of claims 175 to 191, wherein the name, the feature
words and the
feature word sequence of each conversation scenario are correspondingly stored
in the
scenario feature relation table.
193. The equipment of any one of claims 175 to 192, wherein the scenario
feature relation table
stores correspondence relations between conversation scenarios and scenario
features,
including feature words and feature word sequences.
194. The equipment of any one of claims 175 to 193, wherein the scenario
feature relation table
is stored in a server, maintained offline by backstage algorithm technical
personnel based on
periodical text data mining operations, and isolated from frontstage business
personnel.
195. The equipment of any one of claims 175 to 194, wherein when the
intelligent voice robot is
conversing with the user, the user's conversation text is obtained by recognizing and
transforming the user's conversation voice via an automatic speech recognition (ASR)
technique, and feature information is extracted out of the user's conversation text.
196. The equipment of any one of claims 175 to 195, wherein pre-trained
embedding is obtained
by introducing a large-scale BERT word vector space pre-trained in Google BERT
serving.
197. The equipment of any one of claims 175 to 196, wherein the BERT word
vector space is
retrained by bringing in the business's own customer service corpora, and calibration of
the BERT word vectors is realized, so as to conform to specific business scenarios.
198. The equipment of any one of claims 175 to 197, wherein the total segmented corpora,
with stop-words removed, are used to construct the preset domain dictionary, wherein
the preset domain dictionary includes all effective vocabularies appearing in the
corpora, and the preset domain dictionary is used to perform content transformation on
all sample corpora of a target conversation scenario based on the WOB model to obtain
the discrete representation.
199. The equipment of any one of claims 175 to 198, wherein the chi-square
statistic (CHI)
technique is to extract feature words of the target conversation scenario.
200. The equipment of any one of claims 175 to 199, wherein CHI calculation
formula
comprises:
χ²(t, c) = N × (AD − BC)² / ((A+B)(A+C)(B+D)(C+D))
wherein c is a certain class, namely a "conversation scenario", t is a certain term, N is
a total number of texts in a training corpora, A is a number of texts that belong to class
c and contain term t, B is a number of texts that do not belong to class c but contain
term t, C is a number of texts that belong to class c but do not contain term t, and D is
a number of texts that neither belong to class c nor contain term t.
201. The equipment of claim 200, wherein the χ² statistic is for chi-square hypothesis
verification in statistics, to judge uniformity or goodness of fit between an actual
distribution and a theoretical distribution, wherein a null hypothesis H₀ is "no marked
difference between observed frequency and desired frequency".
202. The equipment of any one of claims 200 to 201, wherein the less the chi-square
statistic is, the closer the observed frequency is to the desired frequency, and the higher
the relevancy between them.
203. The equipment of any one of claims 200 to 202, wherein χ² is a measure of distance
between an observed object and a desired object, wherein the less the distance is, the
higher the relevancy between them.
204. The equipment of any one of claims 200 to 203, wherein the observed
object is the term,
and the desired object is the conversation scenario, wherein if the term and
the conversation
scenario are highly relevant, statistic distributions of the two are close to
each other in the
entire samples.
205. The equipment of any one of claims 200 to 204, wherein through the χ² statistic,
relevancies between all vocabularies in the domain dictionary and the various classes
are calculated quickly and accurately based on quantities of corpora, and a preset
number of terms are selected according to a χ² relevancy sorting result to serve as a
feature set of the conversation scenarios, to complete feature mapping between the
various scenarios and the various classes in the conversation scenario list.
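The formula of claim 200 can be computed directly from the A, B, C and D counts defined there; the counts in the usage line below are invented for illustration.

    def chi_square(A, B, C, D):
        """chi2(t, c) = N * (A*D - B*C)**2 / ((A+B)(A+C)(B+D)(C+D))"""
        N = A + B + C + D
        denom = (A + B) * (A + C) * (B + D) * (C + D)
        return N * (A * D - B * C) ** 2 / denom if denom else 0.0

    # Term in 30 of 40 texts of the class and in 5 of 60 texts outside it:
    print(chi_square(A=30, B=5, C=10, D=55))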
206. The equipment of any one of claims 175 to 205, wherein the configuration feature
word is input by the user with respect to the target conversation scenario through a
system frontend.
207. The equipment of any one of claims 175 to 206, wherein the system
frontend provides
business personnel with a feature relation extension function to maintain the
business field.
208. The equipment of any one of claims 175 to 207, wherein on receiving the
configuration feature word, the equipment updates it into an externally input feature set
of the target conversation scenario.
209. The equipment of any one of claims 175 to 208, wherein if the externally input
configuration feature word is not contained in the domain dictionary, the configuration
feature word is ignored.
210. The equipment of any one of claims 175 to 209, wherein the first user is
business personnel,
and the second user is algorithm-developing technical personnel.
211. The equipment of any one of claims 175 to 210, wherein the FSM is a
mathematical model
that expresses a finite number of states and behaviors as transitions and
actions amongst
these states.
212. The equipment of any one of claims 175 to 211, wherein the FSM is to describe state
sequences experienced by an object within the object's life cycle, and how to respond
to various events coming from outside to transition between the states.
213. The equipment of any one of claims 175 to 212, wherein the FSM includes
current state,
condition, action, and next state.
214. The equipment of any one of claims 175 to 213, wherein the current state is the state
currently being in, wherein the condition is also referred to as an event, and wherein
when the condition is satisfied, an action is triggered or one round of state transition is
performed.
215. The equipment of any one of claims 175 to 214, wherein the action indicates the
action executed after the condition has been satisfied, wherein after the action has been
executed to completion, a new state is transitioned to, or the original state is maintained.
216. The equipment of any one of claims 175 to 215, wherein the action is not
indispensable; after the condition has been satisfied, it is possible to execute no action
and directly transition to the new state.
217. The equipment of any one of claims 175 to 216, wherein the next state is
the new state to be
transitioned after the condition has been satisfied.
218. The equipment of any one of claims 175 to 217, wherein the next state is relative to
the current state; once the next state is activated, it becomes the new current state.
219. The equipment of any one of claims 175 to 218, wherein the FSM model is abstracted
as a mapping from the current state and the condition to the action and the next state:
(current state, condition) → (action, next state)
220. The equipment of any one of claims 175 to 219, wherein the FSM state transition
matrix, once completed, is automatically translated into the script file of JSON format.
221. The equipment of any one of claims 175 to 220, wherein the JSON script file is read
and input by the program to a state machine object when a finite state machine instance
is generated, to validate logic.
222. The equipment of any one of claims 175 to 221, wherein the finite state machine
instance as generated is stored in Redis, with a UUID transferred by the frontend as an
index value, to facilitate subsequent program access when the interactive voice
response (IVR) service starts.
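A sketch of the persistence of claim 222, assuming the redis-py client against a locally running Redis server; the key scheme fsm:<uuid> is an illustrative choice rather than claim language.

    import json
    import uuid
    import redis

    r = redis.Redis(host="localhost", port=6379)

    def store_fsm_instance(script):
        """Persist a generated FSM script under a frontend-supplied UUID."""
        key = f"fsm:{uuid.uuid4()}"
        r.set(key, json.dumps(script))
        return key

    def load_fsm_instance(key):
        """Fetch the script again when the IVR service starts."""
        return json.loads(r.get(key))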
223. The equipment of any one of claims 175 to 222, wherein the user performs a
persistence operation on the FSM.
224. The equipment of any one of claims 175 to 223, wherein when the user selects a
task type as a
single task at the frontend, the finite state machine instance of the single
task as stored in
Redis is cleared away as an invalidated object within a preset time period
after the IVR
marketing service has been triggered.
225. The equipment of any one of claims 175 to 224, wherein the intelligent
voice robot
converses with the user, wherein the user's conversation is text content
recognized and
transcribed from the user's conversation speech through ASR technology.
226. The equipment of any one of claims 175 to 225, wherein the text content
is word-segmented
to obtain the plurality of segmented terms, and the word-segmentation process
includes
character purification, rectification, word segmentation, and removal of stop-
words.
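The preprocessing of claim 226 might look as follows, assuming the jieba segmenter for Chinese text; the purification pattern and stop-word list are toy placeholders, and the rectification step is omitted.

    import re
    import jieba

    STOP_WORDS = {"的", "了", "吗"}

    def preprocess(text):
        text = re.sub(r"[^\w]", "", text)                 # character purification
        terms = jieba.lcut(text)                          # word segmentation
        return [t for t in terms if t not in STOP_WORDS]  # stop-word removal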
227. The equipment of any one of claims 175 to 226, wherein the plural segmented terms
are mapped and transformed through an index of the domain dictionary to the form of
expression of the conversation scenario in the scenario feature relation table.
228. The equipment of any one of claims 175 to 227, wherein each element in
the feature word
sequence of the user's conversation is mapped to the BERT word vector space to obtain
a feature vector of 768 dimensions, and all elements are summed and averaged to obtain
a 1x768 vector to serve as the feature expression of the user's conversation input.
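The averaging of claim 228 as a NumPy sketch; embedding_table, a mapping from dictionary indices to precomputed 768-dimensional vectors, is an assumption standing in for the BERT word vector space.

    import numpy as np

    def conversation_vector(feature_seq, embedding_table):
        """embedding_table: dict index -> np.ndarray of shape (768,).
        Returns the 1x768 mean vector of the whole sequence."""
        vecs = np.stack([embedding_table[i] for i in feature_seq])  # (n, 768)
        return vecs.mean(axis=0, keepdims=True)                     # (1, 768)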
229. The equipment of any one of claims 175 to 228, wherein the feature word sequences
of the various conversation scenarios in the scenario feature relation table are likewise
converted to 1x768 feature vectors.
230. The equipment of any one of claims 175 to 229, wherein a cosine similarity between
the feature vector input by the user's conversation and the scenario feature vector of the
conversation scenario is calculated, wherein the greater the cosine similarity calculation
result is, the higher the similarity, and the cosine similarity calculation result represents
the relevancy between the user's conversation and the conversation scenario.
231. The equipment of any one of claims 175 to 230, wherein by arranging all the
conversation scenarios in a descending order according to the cosine similarity
calculation result, the conversation scenario with the highest cosine similarity
calculation result is returned to serve as a judgement result of the intended scenario of
the current input by the user, and a corresponding response pattern under the intended
scenario is returned according to the state in which the FSM currently is.
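Claims 230 and 231 reduce to a cosine ranking over the 1x768 vectors; a NumPy-only sketch follows, with the scenario_vecs mapping assumed.

    import numpy as np

    def cosine(u, v):
        return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

    def best_scenario(user_vec, scenario_vecs):
        """scenario_vecs: dict name -> (768,) vector. Returns the name and
        score of the highest-similarity scenario, i.e. the intended scenario."""
        ranked = sorted(((cosine(user_vec.ravel(), v), name)
                         for name, v in scenario_vecs.items()), reverse=True)
        return ranked[0][1], ranked[0][0]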
232. The equipment of any one of claims 175 to 231, wherein the response pattern is
transformed into speech content through Text-To-Speech (TTS) technology and
broadcast to the user.
233. A computer readable physical memory having stored thereon a computer
program executed
by a computer configured to:
obtain sample corpora of various conversation scenarios in a plurality of
conversation
scenarios;
generate a scenario feature of the conversation scenario based on a sample
corpus of the
conversation scenario, wherein the scenario feature includes feature words of
the
conversation scenario and a feature word sequence obtained by mapping
transformation
of the feature words; and
configure an intelligent voice robot with a preset word vector space model and
the
scenario features of the various conversation scenarios, wherein the word
vector space
model is for the intelligent voice robot to perform word vector similarity
calculation on a
user's conversation and the scenario features of the various conversation
scenarios, to
recognize an intended scenario of the user's conversation.
234. The memory of claim 233, wherein generating the scenario feature of the
conversation
scenario based on the sample corpus of the conversation scenario comprises:
obtaining discrete representation of the sample corpus of the conversation
scenario based
on a preset domain dictionary;
employing a feature selection algorithm to extract feature words of the
conversation
scenario based on the discrete representation of the sample corpus of the
conversation
scenario;
mapping and transforming the feature words of the conversation scenario to a
corresponding dictionary index; and
generating the feature word sequence of the conversation scenario, wherein the
feature
selection algorithm is a chi-square statistic feature selection algorithm.
235. The memory of claim 233, further comprises:
storing the various conversation scenarios and the scenario features of the
conversation
scenarios in a scenario feature relation table.
236. The memory of claim 233, further comprises:
receiving a configuration feature word input with respect to the conversation
scenario;
and
maintaining the scenario feature of the conversation scenario in the scenario
feature
relation table based on the configuration feature word of the conversation
scenario and a
configuration feature word sequence obtained by mapping transformation of the
configuration feature word.
237. The memory of claim 236, further comprises receiving the configuration
feature word input
by the user having feature configuration permission with respect to the
conversation
scenario.
238. The memory of claim 236, wherein maintaining the scenario feature of the
conversation
scenario in the scenario feature relation table based on the configuration
feature word of the
conversation scenario and the configuration feature word sequence obtained by
mapping
transformation of the configuration feature word comprises:
merging the configuration feature word of the conversation scenario to the
feature words
of the conversation scenario in the scenario feature relation table; and
adding the configuration feature word sequence of the merged configuration
feature word
to the feature word sequence of the conversation scenario.
239. The memory of claim 233, wherein the word vector space model is trained and
obtained by:
using a domain corpora of a domain to which the various conversation scenarios
pertain to train a pre-trained bidirectional encoder representations from transformers
(BERT) word vector space; and
obtaining the word vector space model.
240. The memory of claim 233, further comprises:
receiving a state transition graph input by a first user for the conversation scenario and
receiving supplementary information input by a second user for the state transition
graph, to generate a state transition matrix of the conversation scenario; and
generating a script file containing a state transition logical relation based on the state
transition matrix of the conversation scenario and generating a finite state machine
(FSM) based on the script file, to return a corresponding pattern when the intended
scenario of the user's conversation is recognized.
241. The memory of any one of claims 233 to 240, further comprises:
preprocessing the user's conversation to obtain a plurality of segmented terms
in the
user's conversation;
performing mapping transformation on the segmented terms to obtain a feature word
sequence of the user's conversation;
using the word vector space model to construct a feature vector of the user's
conversation
and scenario feature vectors of the various conversation scenarios based on
the feature
word sequence of the user's conversation and feature word sequences of the
various
conversation scenarios;
performing similarity calculation on the feature vector of the user's
conversation and the
scenario feature vectors of the various conversation scenarios; and
recognizing intention of the user's conversation based on a similarity
calculation result,
to return a pattern to which the intention corresponds.
242. The memory of any one of claims 233 to 241, wherein the plurality of
conversation
scenarios are contained in a preset conversation scenario list, wherein the
conversation
scenario list is to record one or more conversation scenarios of a specific
business field.
243. The memory of any one of claims 233 to 242, wherein the sample corpora of
the various
conversation scenarios are obtained by classifying and marking specific domain
corpora
according to classes of the conversation scenarios.
244. The memory of any one of claims 233 to 243, wherein the conversation
scenarios are
obtained by performing scenario abstraction on the specific domain corpora, and the
and the
scenario abstraction is a process from data to information and then to
knowledge.
245. The memory of any one of claims 233 to 244, wherein each conversation
scenario is
abstracted as a conversation state, and dialogue process is abstracted as a
transition between
conversation states.
246. The memory of any one of claims 233 to 245, wherein the conversation
state is taken as a
node, an oriented line between the conversation states is a process in which
one state is
transferred to another state, wherein the entire dialogue process is abstracted as
a graph
consisting of nodes and oriented lines.
247. The memory of any one of claims 233 to 246, wherein content
transformation based on a
bag-of-words (WOB) model is performed on the sample corpus of each conversation
scenario, wherein discrete representation of the sample corpus of each
conversation scenario
is obtained, wherein the feature selection algorithm is to extract feature
words of each
conversation scenario on the basis of the discrete representation of the
sample corpus of
each conversation scenario.
248. The memory of any one of claims 233 to 247, wherein the WOB model divides
a corpus text
into separate terms, and it is imagined that all terms are placed in a bag, while such
elements as their
word orders, grammars and syntaxes are ignored, wherein the bag is only
regarded as a
collection of plural vocabularies, wherein each term as it appears in the
corpus text is
independent.
249. The memory of any one of claims 233 to 248, wherein the WOB model
includes one-hot, TF-IDF and N-gram models.
250. The memory of any one of claims 233 to 249, wherein the name, the feature
words and the
feature word sequence of each conversation scenario are correspondingly stored
in the
scenario feature relation table.
251. The memory of any one of claims 233 to 250, wherein the scenario feature
relation table
stores correspondence relations between conversation scenarios and scenario
features,
including feature words and feature word sequences.
252. The memory of any one of claims 233 to 251, wherein the scenario feature
relation table is
stored in a server, maintained offline by backstage algorithm technical
personnel based on
periodical text data mining operations, and isolated from frontstage business
personnel.
253. The memory of any one of claims 233 to 252, wherein when the intelligent
voice robot is
conversing with the user, the user's conversation text is obtained by recognizing and
transforming the user's conversation voice via an automatic speech recognition (ASR)
technique, and feature information is extracted out of the user's conversation text.
254. The memory of any one of claims 233 to 253, wherein pre-trained embedding
is obtained by
introducing a large-scale BERT word vector space pre-trained in Google BERT serving.
255. The memory of any one of claims 233 to 254, wherein the BERT word vector
space is
retrained by bringing in the business's own customer service corpora, and calibration of
the BERT word vectors is realized, so as to conform to specific business scenarios.
256. The memory of any one of claims 233 to 255, wherein the total segmented corpora,
with stop-words removed, are used to construct the preset domain dictionary, wherein
the preset domain dictionary includes all effective vocabularies appearing in the
corpora, and the preset domain dictionary is used to perform content transformation on
all sample corpora of a target conversation scenario based on the WOB model to obtain
the discrete representation.
257. The memory of any one of claims 233 to 256, wherein the chi-square
statistic (CHI)
technique is to extract feature words of the target conversation scenario.
258. The memory of any one of claims 233 to 257, wherein CHI calculation
formula comprises:
χ²(t, c) = N × (AD − BC)² / ((A+B)(A+C)(B+D)(C+D))
wherein c is a certain class, namely a "conversation scenario", t is a certain term, N is
a total number of texts in a training corpora, A is a number of texts that belong to class
c and contain term t, B is a number of texts that do not belong to class c but contain
term t, C is a number of texts that belong to class c but do not contain term t, and D is
a number of texts that neither belong to class c nor contain term t.
259. The memory of claim 258, wherein the χ² statistic is for chi-square hypothesis
verification in statistics, to judge uniformity or goodness of fit between an actual
distribution and a theoretical distribution, wherein a null hypothesis H₀ is "no marked
difference between observed frequency and desired frequency".
260. The memory of any one of claims 258 to 259, wherein the less the chi-square statistic
is, the closer the observed frequency is to the desired frequency, and the higher the
relevancy between them.
261. The memory of any one of claims 258 to 260, wherein χ² is a measure of distance
between an observed object and a desired object, wherein the less the distance is, the
higher the relevancy between them.
262. The memory of any one of claims 258 to 261, wherein the observed object
is the term, and
the desired object is the conversation scenario, wherein if the term and the
conversation
scenario are highly relevant, statistic distributions of the two are close to
each other in the
entire samples.
263. The memory of any one of claims 258 to 262, wherein through the χ² statistic,
relevancies between all vocabularies in the domain dictionary and the various classes
are calculated quickly and accurately based on quantities of corpora, and a preset
number of terms are selected according to a χ² relevancy sorting result to serve as a
feature set of the conversation scenarios, to complete feature mapping between the
various scenarios and the various classes in the conversation scenario list.
264. The memory of any one of claims 233 to 263, wherein the configuration feature word
is input by the user with respect to the target conversation scenario through a system
frontend.
265. The memory of any one of claims 233 to 264, wherein the system frontend
provides
business personnel with a feature relation extension function to maintain the
business field.
266. The memory of any one of claims 233 to 265, wherein on receiving the configuration
feature word, the computer updates it into an externally input feature set of the target
conversation scenario.
267. The memory of any one of claims 233 to 266, wherein if the externally input
configuration feature word is not contained in the domain dictionary, the configuration
feature word is ignored.
268. The memory of any one of claims 233 to 267, wherein the first user is
business personnel,
and the second user is algorithm-developing technical personnel.
269. The memory of any one of claims 233 to 268, wherein the FSM is a
mathematical model
that expresses a finite number of states and behaviors as transitions and
actions amongst
these states.
270. The memory of any one of claims 233 to 269, wherein the FSM is to describe state
sequences experienced by an object within the object's life cycle, and how to respond
to various events coming from outside to transition between the states.
271. The memory of any one of claims 233 to 270, wherein the FSM includes
current state,
condition, action, and next state.
272. The memory of any one of claims 233 to 271, wherein the current state is the state
currently being in, wherein the condition is also referred to as an event, and wherein
when the condition is satisfied, an action is triggered or one round of state transition is
performed.
273. The memory of any one of claims 233 to 272, wherein the action indicates the action
executed after the condition has been satisfied, wherein after the action has been
executed to completion, a new state is transitioned to, or the original state is maintained.
274. The memory of any one of claims 233 to 273, wherein the action is not indispensable;
after the condition has been satisfied, it is possible to execute no action and directly
transition to the new state.
275. The memory of any one of claims 233 to 274, wherein the next state is the
new state to be
transitioned after the condition has been satisfied.
276. The memory of any one of claims 233 to 275, wherein the next state is relative to the
current state; once the next state is activated, it becomes the new current state.
277. The memory of any one of claims 233 to 276, wherein the FSM model is abstracted
as a mapping from the current state and the condition to the action and the next state:
(current state, condition) → (action, next state)
278. The memory of any one of claims 233 to 277, wherein the FSM state transition
matrix, once completed, is automatically translated into the script file of JSON format.
279. The memory of any one of claims 233 to 278, wherein the JSON script file is read and input
is read and input
by the program to a state machine object when a finite state machine instance
is generated,
to validate logic.
280. The memory of any one of claims 233 to 279, wherein the finite state machine
instance as generated is stored in Redis, with a UUID transferred by the frontend as an
index value, to facilitate subsequent program access when the interactive voice
response (IVR) service starts.
281. The memory of any one of claims 233 to 280, wherein the user performs a persistence
operation on the FSM.
282. The memory of any one of claims 233 to 281, wherein when the user selects a
task type as a single
task at the frontend, the finite state machine instance of the single task as
stored in Redis is
cleared away as an invalidated object within a preset time period after the
IVR marketing
service has been triggered.
283. The memory of any one of claims 233 to 282, wherein the intelligent voice
robot converses
with the user, wherein the user's conversation is text content recognized and
transcribed
from the user's conversation speech through ASR technology.
284. The memory of any one of claims 233 to 283, wherein the text content is
word-segmented to
obtain the plurality of segmented terms, and the word-segmentation process
includes character
purification, rectification, word segmentation, and removal of stop-words.
285. The memory of any one of claims 233 to 284, wherein the plural segmented terms
are mapped and transformed through an index of the domain dictionary to the form of
expression of the conversation scenario in the scenario feature relation table.
286. The memory of any one of claims 233 to 285, wherein each element in the
feature word
sequence of the user's conversation is mapped to the BERT word vector space to obtain
a feature vector of 768 dimensions, and all elements are summed and averaged to obtain
a 1x768 vector to serve as the feature expression of the user's conversation input.
287. The memory of any one of claims 233 to 286, wherein the feature word sequences of
the various conversation scenarios in the scenario feature relation table are likewise
converted to 1x768 feature vectors.
288. The memory of any one of claims 233 to 287, wherein a cosine similarity between
the feature vector input by the user's conversation and the scenario feature vector of the
conversation scenario is calculated, wherein the greater the cosine similarity calculation
result is, the higher the similarity, and the cosine similarity calculation result represents
the relevancy between the user's conversation and the conversation scenario.
289. The memory of any one of claims 233 to 288, wherein by arranging all the
conversation scenarios in a descending order according to the cosine similarity
calculation result, the conversation scenario with the highest cosine similarity
calculation result is returned to serve as a judgement result of the intended scenario of
the current input by the user, and a corresponding response pattern under the intended
scenario is returned according to the state in which the FSM currently is.
290. The memory of any one of claims 233 to 289, wherein the response pattern is
transformed into speech content through Text-To-Speech (TTS) technology and
broadcast to the user.