Patent 3138730 Summary

(12) Patent: (11) CA 3138730
(54) English Title: PUBLIC-OPINION ANALYSIS METHOD AND SYSTEM FOR PROVIDING EARLY WARNING OF ENTERPRISE RISKS
(54) French Title: METHODE ET SYSTEME D'ANALYSE DE L'OPINION PUBLIQUE POUR FOURNIR UN AVERTISSEMENT PRECOCE DES RISQUES D'ENTREPRISE
Status: Granted
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06F 40/20 (2020.01)
  • G06F 40/279 (2020.01)
  • G06F 40/30 (2020.01)
  • G06Q 10/06 (2012.01)
(72) Inventors :
  • LI, JIAQING (China)
(73) Owners :
  • 10353744 CANADA LTD. (Canada)
(71) Applicants :
  • 10353744 CANADA LTD. (Canada)
(74) Agent: HINTON, JAMES W.
(74) Associate agent:
(45) Issued: 2023-08-01
(22) Filed Date: 2021-11-12
(41) Open to Public Inspection: 2022-05-12
Examination requested: 2022-04-28
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): No

(30) Application Priority Data:
Application No. Country/Territory Date
202011264306.X China 2020-11-12

Abstracts

English Abstract

The present invention discloses a method and a system of analyzing public opinion for providing early warning of enterprise risks. The method involves: collecting public-opinion text data from any designated website, and constructing a data-source sequence for website sources of the public-opinion text data; matching risk labels of the public-opinion text data based on a preset risk-label set to construct a risk-label sequence; performing classification of sentiment polarities of the public-opinion text data using a sentiment classification model so as to construct a sentiment-polarity sequence, and identifying associated enterprise entity names in the public-opinion text data so as to construct an enterprise-association sequence; and according to the data-source sequence, the risk-label sequence, the sentiment-polarity sequence, and the enterprise-association sequence corresponding to the public-opinion text data, computing and outputting a public opinion analysis result.


French Abstract

Il est décrit une méthode et un système d'analyse de l'opinion publique pour fournir un avertissement précoce des risques d'entreprise. La méthode comprend le recueil de données texte sur l'opinion publique à partir de n'importe quel site Web désigné, et la construction d'une séquence de source de données pour des sources de site Web des données texte sur l'opinion publique; le jumelage d'étiquettes de risque des données texte sur l'opinion publique d'après un ensemble d'étiquettes de risque préétabli pour construire une séquence d'étiquettes de risque; la classification de polarités de sentiments des données texte sur l'opinion publique à l'aide d'un modèle de classification de sentiments de manière à construire une séquence de polarités de sentiments; et l'identification de noms d'entités d'entreprise associés dans les données texte sur l'opinion publique de manière à construire une séquence d'association d'entreprise; et, selon la séquence de source de données, la séquence d'étiquettes de risque, la séquence de polarités de sentiments, et la séquence d'association d'entreprise correspondant aux données texte sur l'opinion publique, au calcul et à la production d'un résultat d'analyse sur l'opinion publique.

Claims

Note: Claims are shown in the official language in which they were submitted.


Claims:
1. A method comprising:
collecting public-opinion text data from any designated website;
constructing a data-source sequence for website sources of the public-opinion text data, wherein a credit weight is assigned for each designated website;
matching risk labels of the public-opinion text data based on a preset risk-label set to construct a risk-label sequence, wherein the preset risk-label set includes risk keywords configured in a risk-label class, wherein each risk-label class has a risk weight;
performing classification of sentiment polarities of the public-opinion text data using a sentiment classification model to construct a sentiment-polarity sequence;
identifying associated enterprise entity names in the public-opinion text data to construct an enterprise-association sequence; and
computing and outputting a public opinion analysis result according to the data-source sequence, the risk-label sequence, the sentiment-polarity sequence and the enterprise-association sequence corresponding to the public-opinion text data.
2. The method of claim 1, wherein constructing the data-source sequence for
the website
sources of the public-opinion text data comprises:
summing up a total number of the designated websites;
configuring the credit weight for each designated website, to construct a data-
source
sequence set dimensionally consistent with the total number;
identifying a location of the source website in the data-source sequence set;
constructing the corresponding data-source sequence; and
matching a corresponding credit weight.
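The steps of claim 2 can be illustrated with a minimal Python sketch. It assumes the data-source sequence is a one-hot vector over the designated websites, which is one plausible reading of "dimensionally consistent with the total number". The site names, the weight values and the function name `build_data_source_sequence` are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Tuple

# Hypothetical designated websites with user-configured credit weights.
# Dictionary insertion order fixes each website's position in the sequence set.
CREDIT_WEIGHTS: Dict[str, float] = {
    "news.example.com": 0.9,
    "gov.example.org": 1.0,
    "forum.example.net": 0.4,
}


def build_data_source_sequence(
    source: str, weights: Dict[str, float]
) -> Tuple[List[int], float]:
    """Return the assumed one-hot data-source sequence and the matched credit weight."""
    sites = list(weights)  # k = len(sites) designated websites
    if source not in weights:
        raise ValueError(f"{source!r} is not a designated website")
    position = sites.index(source)  # location of the source website
    sequence = [1 if i == position else 0 for i in range(len(sites))]
    return sequence, weights[source]


S, W_matched = build_data_source_sequence("gov.example.org", CREDIT_WEIGHTS)
print(S, W_matched)
```

Keeping the weights and the ordering in one mapping guarantees that the sequence and the weight vector always have the same dimension.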
3. The method of claim 1, further comprising:
constructing the risk-label set in advance, wherein the risk-label set
includes plural risk-
label classes, and each risk-label class corresponds to at least one risk
keyword; and
configuring the risk weight for each risk-label class in the risk-label set.
4. The method of claim 3, wherein matching risk labels of the public-opinion
text data based on
the preset risk-label set to construct the risk-label sequence comprises:
matching the risk keywords to the public-opinion text data by means of text
keyword
matching;
searching for corresponding risk-label class according to matching results;
and
based on locations of the risk-label classes in the risk-label set,
constructing the risk-label
sequence.
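As a rough illustration of claims 3 and 4, the sketch below builds a 0/1 risk-label sequence by plain substring matching. The label classes, the keywords and the function name `build_risk_label_sequence` are hypothetical, and case-insensitive substring search is an assumption standing in for whatever "text keyword matching" the patentee actually uses.

```python
from typing import Dict, List

# Hypothetical "label - keyword dictionary". Insertion order fixes the
# position of each risk-label class in the risk-label set.
RISK_LABELS: Dict[str, List[str]] = {
    "litigation": ["lawsuit", "sued", "court"],
    "product_quality": ["recall", "defect"],
    "financial": ["default", "bankruptcy"],
}


def build_risk_label_sequence(text: str, labels: Dict[str, List[str]]) -> List[int]:
    """Mark a class with 1 when any of its keywords occurs in the text, else 0."""
    lowered = text.lower()
    return [
        1 if any(keyword in lowered for keyword in keywords) else 0
        for keywords in labels.values()
    ]


L_seq = build_risk_label_sequence(
    "Regulator orders recall after customers sued the maker", RISK_LABELS
)
print(L_seq)
```

Substring matching can over-match short keywords, so a production system would probably tokenize the text first.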
5. The method of claim 1, wherein training of the sentiment classification
model comprises:
extracting public opinion corpora of various sentiment polarities respectively
from
acquired public opinion corpora, to construct a tag-corpus set;
training the sentiment classification model based on the tag-corpus set using
a Long
short-term memory (LSTM) or convolutional neural network for text (TextCNN)
model
structure; and
wherein classifications of the sentiment polarities include one or more of
positive
sentiment, neutral sentiment, and negative sentiment, and the sentiment-
polarity sequence
is a sequence representation of one of the three sentiment polarities.
6. The method of claim 5, further comprising configuring a corresponding polarity weight for every kind of sentiment polarity.
7. The method of claim 1, wherein identifying the associated enterprise entity names in the public-opinion text data to construct the enterprise-association sequence comprises:
constructing a monitored-enterprise list consisting of plural enterprise entities in advance;
identifying the enterprise entity name associated with the public-opinion text data by means of keyword matching with a Chinese word segmentation tool and a named-entity recognition (NER) tool; and
based on a location of the enterprise entity name in the monitored-enterprise list, constructing the enterprise-association sequence.
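The construction in claim 7 can be sketched as follows. The claim names a Chinese word segmentation tool and an NER tool; this sketch deliberately swaps in plain alias substring matching so that it stays self-contained. The enterprise names, the aliases and the function name `build_enterprise_association_sequence` are hypothetical.

```python
from typing import Dict, List

# Hypothetical monitored-enterprise list: each entry maps a monitored
# enterprise to its full name, short names and aliases.
MONITORED: Dict[str, List[str]] = {
    "Acme Corporation": ["Acme Corporation", "Acme Corp", "Acme"],
    "Globex Inc.": ["Globex Inc.", "Globex"],
    "Initech LLC": ["Initech LLC", "Initech"],
}


def build_enterprise_association_sequence(
    text: str, monitored: Dict[str, List[str]]
) -> List[int]:
    """Return a 0/1 entry per monitored enterprise, in list order."""
    return [
        1 if any(alias in text for alias in aliases) else 0
        for aliases in monitored.values()
    ]


E_seq = build_enterprise_association_sequence("Globex faces a new complaint.", MONITORED)
print(E_seq)
```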
8. The method of claim 1, further comprising:
presetting plural kinds of risk-early-warning levels; and
defining boundary intervals of each kind of risk-early-warning levels.
9. The method of claim 8, wherein according to the data-source sequence, the
risk-label
sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result comprises:
using a public-opinion-risk-early-warning equation $z = \sum_{i=0}^{n} R_i L_i + \sum_{i=0}^{k} W_i S_i + \sum_{i=0}^{p} Q_i T_i$ to compute a risk value of the public-opinion text data;
computing an early-warning value corresponding to the public-opinion text data in view of the enterprise-association sequence;
outputting the risk-early-warning level based on the boundary interval to which the early-warning value belongs;
wherein Ri denotes a risk weight of a corresponding risk-label class, Li denotes the risk-label sequence, n denotes a total number of the risk-label classes in the risk-label set, Wi denotes a credit weight of the designated website, Si denotes the data-source sequence, k denotes the total number of the designated websites, Qi denotes a polarity weight, Ti denotes the sentiment-polarity sequence, and p denotes a total number of the sentiment polarities.
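The risk value of claim 9 is a sum of three weighted sums, which the following sketch computes. All weights and sequences are made-up illustrative values, the function names `inner` and `risk_value` are hypothetical, and the code simply iterates over each sequence from its first to its last element rather than reproducing the claim's index limits literally.

```python
from typing import Sequence


def inner(weights: Sequence[float], indicators: Sequence[int]) -> float:
    """Weighted sum of a 0/1 sequence; both inputs must have the same length."""
    if len(weights) != len(indicators):
        raise ValueError("weights and sequence must have the same length")
    return sum(w * x for w, x in zip(weights, indicators))


def risk_value(R, L, W, S, Q, T) -> float:
    """z = sum(R_i * L_i) + sum(W_i * S_i) + sum(Q_i * T_i)."""
    return inner(R, L) + inner(W, S) + inner(Q, T)


# Hypothetical values: three risk-label classes, three websites, three polarities.
R, L = [3.0, 2.0, 5.0], [1, 1, 0]  # risk weights, risk-label sequence
W, S = [0.9, 1.0, 0.4], [0, 1, 0]  # credit weights, data-source sequence
Q, T = [0.0, 1.0, 4.0], [0, 0, 1]  # polarity weights, sentiment-polarity sequence

z = risk_value(R, L, W, S, Q, T)
print(z)  # 5.0 + 1.0 + 4.0
```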
10. The method of any one of claims 1 to 9, wherein the public-opinion text data are collected from any designated website and are processed to construct the website sources, wherein the sources of public opinions about enterprises include one or more of news websites, government websites, forums, micro blogging, and websites receiving complaints, wherein the source sequence is S = {S1, S2, ..., Sk}, wherein different credit weights Wi are assigned to the sources of public opinions, wherein the credit weights are configured by users, wherein addresses, site sections, data-collecting frequencies, and keywords of the public-opinion data sources are set, and wherein an Internet-based data collecting tool is used to acquire the public-opinion text data.
11. The method of any one of claims 1 to 10, wherein collecting the public-
opinion text data
comprises:
using a Python-based or Java-based html processing tool to denoise webpages,
clean data,
and extract fields, so that data of public-opinion webpage data is extracted
in a structured
manner by fields including titles, sources, links, releasing date, text,
summaries, and
authors;
storing the extracted structured text data.
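The extraction step of claim 11 might look like the sketch below, which uses Python's standard `html.parser` module as one possible "Python-based html processing tool". The patent does not name a specific tool, so the class `ArticleExtractor`, the function `extract_fields`, the sample page and the choice of fields are all illustrative assumptions.

```python
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collects <title> and <p> text and drops <script>/<style> noise."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self._current = None  # tag whose text is being captured
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1
        elif tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._skip_depth:
            return
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def extract_fields(html: str, source: str, link: str) -> dict:
    """Return a structured record; whitespace runs are collapsed as basic cleaning."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(" ".join(p.split()) for p in parser.paragraphs if p.strip())
    return {"title": parser.title.strip(), "source": source, "link": link, "text": text}


SAMPLE = """<html><head><title> Acme recall widens </title>
<style>p { color: red; }</style></head>
<body><script>track();</script>
<p>Acme Corp recalled   more units.</p><p>Customers <b>complained</b> online.</p>
</body></html>"""

record = extract_fields(SAMPLE, source="news.example.com", link="https://news.example.com/a/1")
print(record)
```

Fields such as releasing date, summaries and authors usually depend on site-specific markup, so they are left out of this sketch.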
12. The method of any one of claims 1 to 10, wherein the risk-label set is
constructed for classes
of risk events that are commonly seen in public opinions about enterprises and
classes of risk
events the users care about, wherein every risk label is assigned with a
corresponding risk
weight Rj, wherein the risk weight may alternatively be configured by the
users, wherein a
keyword set is developed for each of the risk labels, to form a "label -
keyword dictionary",
wherein the risk-label sequence is L = {L1, L2, ..., Ln} of the public
opinions, wherein n is the
total number of the risk labels, Li corresponds to the 0/1 identification
corresponding to the
risk label, wherein 1 denotes that there is the ith label in the public
opinions, and 0 denotes
that there is not the ith label in the public opinions.

13. The method of any one of claims 1 to 12, wherein the sentiment-polarity and entity-name identifying module extracts public-opinion data sets of three polarities, including positive, neutral, and negative sentiment, from the acquired public opinion corpora according to a pre-defined positive and negative sentiment dictionary for a certain enterprise to form the tag-corpus set.
14. The method of any one of claims 1 to 13, wherein for every sentiment
polarity, a
corresponding polarity weight Qi is set, wherein Qi = {Q1, Q2, Q3}, wherein
the sentiment-
polarity sequence Ti = {T1, T2,T3}, T1 denotes positive sentiment, T2 denotes
neutral
sentiment, and T3 denotes negative sentiment, wherein Q1 denotes the polarity
weight
corresponding to the positive sentiment, Q2 denotes the polarity weight
corresponding to the
neutral sentiment, and Q3 denotes the polarity weight corresponding to the
negative
sentiment.
15. The method of any one of claims 1 to 14, wherein a public-opinion
processing platform is
used to identify enterprise entities from the collected public-opinion text
data, to extract the
risk labels from the data and to analyze the sentiment polarities of the data,
wherein a
personalized configuring service provides standardized application
configuration interface.
16. The method of any one of claims 1 to 15, wherein the enterprise entities
associated with the
public-opinion text data are identified based on a dictionary of full names,
short names, and
aliases of monitored enterprises through a list of enterprises monitored,
wherein the public-opinion text data are associated with the enterprise entities to form an enterprise-association sequence E = {E1, E2, ..., Em}, where m is the number of all the monitored enterprises, Ei
is the 0/1 label, wherein 1 denotes the public opinion is associated with the
ith enterprise, and
0 denotes not associated, wherein synchronization of the monitored-enterprise
list, updating
of a sentiment polarity dictionary, and setting of the public opinion sources
and risk label
weights is supported.
17. The method of any one of claims 1 to 16, wherein a risk early warning
score is computed
according to early warning labels and list of enterprises monitored J = {J1, J2, ..., Jm},
wherein Ji is a 0/1 label, subscribed by the user and according to data-source
sequences,
credit weights, risk-label sequences, risk weights, sentiment-polarity
sequences, polarity
weights, and the enterprise-association sequence of public-opinion text data,
early warning
level is determined according to a risk threshold value, wherein enterprise
public opinion
information that satisfies requirements is pushed to the user as early
warning.
18. The method of any one of claims 1 to 17, wherein the risk-early-warning level A = {no early warning, normal, important, serious}, and the boundary intervals corresponding to the risk-early-warning levels are H = {H1, H2, H3}, wherein when the score is smaller than H1, the corresponding risk-early-warning level is no early warning, wherein when the score is greater than H1 and smaller than H2, the corresponding risk-early-warning level is normal, wherein when the score is greater than H2 and smaller than H3, the corresponding risk-early-warning level is important, wherein when the score is greater than H3, the corresponding risk-early-warning level is serious, wherein the score corresponding to sentiment polarity is Q = {Q1, Q2, Q3}, and the sentiment-polarity sequence corresponding to the public-opinion text data is T = {T1, T2, T3}.
19. The method of any one of claims 1 to 18, wherein a risk early warning score of an entry of public-opinion text data is computed by:
[Image: equation not reproduced in the source text]
wherein, with a vector inner product represented by $\langle x, y \rangle$, the equation is:
$z = \langle R, L \rangle + \langle W, S \rangle + \langle Q, T \rangle$;
wherein, combined with the sequence information of the associated enterprise, the equation is:
$z' = z \cdot \varepsilon(\langle E, J \rangle)$;
wherein $\varepsilon(x)$ is a unit step function:
[Image: definition of the unit step function not reproduced in the source text]
wherein, when the enterprise entity name shown in this entry of public-opinion text data exists in the monitored-enterprise list, the value of $\varepsilon(x)$ is 1 and the risk early warning score is computed, and wherein, when the enterprise entity name mentioned in the entry of public-opinion text data does not exist in the list of enterprises monitored, the value of $\varepsilon(x)$ is 0 and no more computation for the risk early warning score is conducted.
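The gating described in claim 19 can be sketched in a few lines. Because the definition of the unit step function is only an image in the source, the convention used here (1 for a positive argument, 0 otherwise) is an assumption chosen to agree with the prose of the claim. The sample score, the sequences and the function names `unit_step` and `gated_score` are hypothetical.

```python
from typing import Sequence


def unit_step(x: float) -> int:
    """Assumed convention: 1 when x > 0, otherwise 0."""
    return 1 if x > 0 else 0


def gated_score(z: float, E: Sequence[int], J: Sequence[int]) -> float:
    """Keep z only if the entry mentions at least one subscribed enterprise."""
    if len(E) != len(J):
        raise ValueError("E and J must cover the same monitored enterprises")
    overlap = sum(e * j for e, j in zip(E, J))  # inner product of E and J
    return z * unit_step(overlap)


z = 10.0                 # hypothetical risk value of one entry
E = [0, 1, 0]            # the entry is associated with the second enterprise
J_subscribed = [1, 1, 0] # this user subscribes to enterprises 1 and 2
J_other = [1, 0, 1]      # this user does not subscribe to enterprise 2

print(gated_score(z, E, J_subscribed), gated_score(z, E, J_other))
```

Multiplying by the step function zeroes the score for entries that no subscribed enterprise appears in, so they never reach the threshold comparison.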
20. The method of any one of claims 1 to 19, wherein the early warning mark is $\mathrm{Output}(z') = \langle Y(z'), A \rangle$, wherein $Y(x) = \{y_1(x), y_2(x), y_3(x), y_4(x)\}$, and the value of each two-value function $y_1(x)$, $y_2(x)$, $y_3(x)$, $y_4(x)$ is True or False, 1 or 0, wherein:
$y_1(x) = (0 \le x < H_1)$;
$y_2(x) = (H_1 \le x < H_2)$;
$y_3(x) = (H_2 \le x < H_3)$;
$y_4(x) = (x \ge H_3)$; and
wherein $\mathrm{Output}(z')$ is output as the early warning mark: no early warning, normal, important, or serious.
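A small sketch of the mapping in claim 20 follows. It assumes half-open intervals with inclusive lower bounds, that scores are non-negative, and that exactly one indicator is set, so the "inner product" with the level set reduces to selecting a label. The threshold values and the function names `indicators` and `early_warning_mark` are hypothetical.

```python
from typing import List, Sequence

LEVELS = ["no early warning", "normal", "important", "serious"]


def indicators(x: float, H: Sequence[float]) -> List[int]:
    """Evaluate the four two-value functions y1..y4 for thresholds H1 < H2 < H3."""
    h1, h2, h3 = H
    return [
        int(0 <= x < h1),
        int(h1 <= x < h2),
        int(h2 <= x < h3),
        int(x >= h3),
    ]


def early_warning_mark(z_prime: float, H: Sequence[float]) -> str:
    """Select the level whose indicator is 1; fall back to the lowest level."""
    Y = indicators(z_prime, H)
    return next((level for y, level in zip(Y, LEVELS) if y), LEVELS[0])


H = (3.0, 6.0, 9.0)  # hypothetical boundary values
for score in (0.0, 4.5, 6.0, 10.0):
    print(score, early_warning_mark(score, H))
```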
21. A system comprising:
a public-opinion-collecting module, configured to:
collect public-opinion text data from any designated website;
construct a data-source sequence for website sources of the public-opinion text data, wherein a credit weight is assigned for each designated website;
a risk label module, for matching risk labels of the public-opinion text data based on a preset risk-label set to construct a risk-label sequence, wherein the preset risk-label set includes risk keywords configured in a risk-label class, wherein each risk-label class has a risk weight;
a sentiment-polarity and entity-name identifying module, configured to:
perform classification of sentiment polarities of the public-opinion text data using a sentiment classification model to construct a sentiment-polarity sequence;
identify associated enterprise entity names in the public-opinion text data to construct an enterprise-association sequence; and
an early warning outputting module, for computing and outputting a public opinion analysis result according to the data-source sequence, the risk-label sequence, the sentiment-polarity sequence and the enterprise-association sequence corresponding to the public-opinion text data.
22. The system of claim 21, wherein constructing the data-source sequence for
the website
sources of the public-opinion text data comprises:
summing up a total number of the designated websites;
configuring the credit weight for each designated website, to construct a data-source sequence set dimensionally consistent with the total number;
identifying a location of the source website in the data-source sequence set;
constructing the corresponding data-source sequence; and
matching a corresponding credit weight.
23. The system of claim 21, further comprising:
constructing the risk-label set in advance, wherein the risk-label set
includes plural risk-
label classes, and each risk-label class corresponds to at least one risk
keyword; and
configuring the risk weight for each risk-label class in the risk-label set.
24. The system of claim 23, wherein matching risk labels of the public-opinion text data based on the preset risk-label set to construct the risk-label sequence comprises:
matching the risk keywords to the public-opinion text data by means of text
keyword
matching;
searching for corresponding risk-label class according to matching results;
and
based on locations of the risk-label classes in the risk-label set,
constructing the risk-label
sequence.
25. The system of claim 21, wherein training of the sentiment classification
model comprises:
extracting public opinion corpora of various sentiment polarities respectively
from
acquired public opinion corpora, to construct a tag-corpus set;
training the sentiment classification model based on the tag-corpus set using
a Long
short-term memory (LSTM) or convolutional neural network for text (TextCNN)
model
structure; and
wherein classifications of the sentiment polarities include one or more of
positive
sentiment, neutral sentiment, and negative sentiment, and the sentiment-
polarity sequence
is a sequence representation of one of the three sentiment polarities.
26. The system of claim 25, further comprising configuring a corresponding polarity weight for every kind of sentiment polarity.
27. The system of claim 21, wherein identifying the associated enterprise
entity names in the
public-opinion text data to construct the enterprise-association sequence
comprises:
constructing a monitored-enterprise list consisting of plural enterprise
entities in advance;
identifying the enterprise entity name associated with the public-opinion text data by means of keyword matching with a Chinese word segmentation tool and a named-entity recognition (NER) tool; and
based on a location of the enterprise entity name in the monitored-enterprise
list,
constructing the enterprise-association sequence.

28. The system of claim 21, further comprising:
presetting plural kinds of risk-early-warning levels; and
defining boundary intervals of each kind of risk-early-warning levels.
29. The system of claim 28, wherein according to the data-source sequence, the
risk-label
sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result comprises:
using a public-opinion-risk-early-warning equation $z = \sum_{i=0}^{n} R_i L_i + \sum_{i=0}^{k} W_i S_i + \sum_{i=0}^{p} Q_i T_i$ to compute a risk value of the public-opinion text data;
computing an early-warning value corresponding to the public-opinion text data
in view
of the enterprise-association sequence;
outputting the risk-early-warning level based on the boundary interval to
which the early-
warning value belongs;
wherein Ri denotes a risk weight of a corresponding risk-label class, Li
denotes the risk-
label sequence, n denotes a total number of the risk-label classes in the risk-
label set, Wi
denotes a credit weight of the designated website, Si denotes the data-source
sequence, k
denotes the total number of the designated websites, Qi denotes a polarity
weight, Ti
denotes the sentiment-polarity sequence, and p denotes a total number of the sentiment polarities.
30. The system of any one of claims 21 to 29, wherein the public-opinion text data are collected from any designated website and are processed to construct the website sources, wherein the sources of public opinions about enterprises include one or more of news websites, government websites, forums, micro blogging, and websites receiving complaints, wherein the source sequence is S = {S1, S2, ..., Sk}, wherein different credit weights Wi are assigned to the sources of public opinions, wherein the credit weights are configured by users, wherein addresses, site sections, data-collecting frequencies, and keywords of the public-opinion data sources are set, and wherein an Internet-based data collecting tool is used to acquire the public-opinion text data.
31. The system of any one of claims 21 to 30, wherein collecting the public-
opinion text data
comprises:
using a Python-based or Java-based html processing tool to denoise webpages,
clean data,
and extract fields, so that data of public-opinion webpage data is extracted
in a structured
manner by fields including titles, sources, links, releasing date, text,
summaries, and
authors;
storing the extracted structured text data.
32. The system of any one of claims 21 to 31, wherein the risk-label set is
constructed for classes
of risk events that are commonly seen in public opinions about enterprises and
classes of risk
events the users care about, wherein every risk label is assigned with a
corresponding risk
weight Rj, wherein the risk weight may alternatively be configured by the
users, wherein a
keyword set is developed for each of the risk labels, to form a "label -
keyword dictionary",
wherein the risk-label sequence is L = {L1, L2, ..., Ln} of the public
opinions, wherein n is the
total number of the risk labels, Li corresponds to the 0/1 identification
corresponding to the
risk label, wherein 1 denotes that there is the ith label in the public
opinions, and 0 denotes
that there is not the ith label in the public opinions.
33. The system of any one of claims 21 to 32, wherein the sentiment-polarity and entity-name identifying module extracts public-opinion data sets of three polarities, including positive, neutral, and negative sentiment, from the acquired public opinion corpora according to a pre-defined positive and negative sentiment dictionary for a certain enterprise to form the tag-corpus set.
34. The system of any one of claims 21 to 33, wherein for every sentiment
polarity, a
corresponding polarity weight Qi is set, wherein Qi = {Q1, Q2, Q3}, wherein
the sentiment-
polarity sequence Ti = {T1, T2,T3}, T1 denotes positive sentiment, T2 denotes
neutral
sentiment, and T3 denotes negative sentiment, wherein Q1 denotes the polarity
weight
corresponding to the positive sentiment, Q2 denotes the polarity weight
corresponding to the
neutral sentiment, and Q3 denotes the polarity weight corresponding to the
negative
sentiment.
35. The system of any one of claims 21 to 34, wherein a public-opinion
processing platform is
used to identify enterprise entities from the collected public-opinion text
data, to extract the
risk labels from the data and to analyze the sentiment polarities of the data,
wherein a
personalized configuring service provides standardized application
configuration interface.
36. The system of any one of claims 21 to 35, wherein the enterprise entities
associated with the
public-opinion text data are identified based on a dictionary of full names,
short names, and
aliases of monitored enterprises through a list of enterprises monitored,
wherein the public-opinion text data are associated with the enterprise entities to form an enterprise-association sequence E = {E1, E2, ..., Em}, where m is the number of all the monitored enterprises, Ei
is the 0/1 label, wherein 1 denotes the public opinion is associated with the
ith enterprise, and
0 denotes not associated, wherein synchronization of the monitored-enterprise
list, updating
of a sentiment polarity dictionary, and setting of the public opinion sources
and risk label
weights is supported.
37. The system of any one of claims 21 to 36, wherein a risk early warning
score is computed
according to early warning labels and list of enterprises monitored J = {J1, J2, ..., Jm},
wherein Ji is a 0/1 label, subscribed by the user and according to data-source
sequences,
credit weights, risk-label sequences, risk weights, sentiment-polarity
sequences, polarity
weights, and the enterprise-association sequence of public-opinion text data,
early warning
level is determined according to a risk threshold value, wherein enterprise
public opinion
information that satisfies requirements is pushed to the user as early
warning.
38. The system of any one of claims 21 to 37, wherein the risk-early-warning level A = {no early warning, normal, important, serious}, and the boundary intervals corresponding to the risk-early-warning levels are H = {H1, H2, H3}, wherein when the score is smaller than H1, the corresponding risk-early-warning level is no early warning, wherein when the score is greater than H1 and smaller than H2, the corresponding risk-early-warning level is normal, wherein when the score is greater than H2 and smaller than H3, the corresponding risk-early-warning level is important, wherein when the score is greater than H3, the corresponding risk-early-warning level is serious, wherein the score corresponding to sentiment polarity is Q = {Q1, Q2, Q3}, and the sentiment-polarity sequence corresponding to the public-opinion text data is T = {T1, T2, T3}.
39. The system of any one of claims 21 to 38, wherein a risk early warning score of an entry of public-opinion text data is computed by:
[Image: equation not reproduced in the source text]
wherein, with a vector inner product represented by $\langle x, y \rangle$, the equation is:
$z = \langle R, L \rangle + \langle W, S \rangle + \langle Q, T \rangle$;
wherein, combined with the sequence information of the associated enterprise, the equation is:
$z' = z \cdot \varepsilon(\langle E, J \rangle)$;
wherein $\varepsilon(x)$ is a unit step function:
[Image: definition of the unit step function not reproduced in the source text]
wherein, when the enterprise entity name shown in this entry of public-opinion text data exists in the monitored-enterprise list, the value of $\varepsilon(x)$ is 1 and the risk early warning score is computed, and wherein, when the enterprise entity name mentioned in the entry of public-opinion text data does not exist in the list of enterprises monitored, the value of $\varepsilon(x)$ is 0 and no more computation for the risk early warning score is conducted.
40. The system of any one of claims 21 to 39, wherein the early warning mark is $\mathrm{Output}(z') = \langle Y(z'), A \rangle$, wherein $Y(x) = \{y_1(x), y_2(x), y_3(x), y_4(x)\}$, and the value of each two-value function $y_1(x)$, $y_2(x)$, $y_3(x)$, $y_4(x)$ is True or False, 1 or 0, wherein:
$y_1(x) = (0 \le x < H_1)$;
$y_2(x) = (H_1 \le x < H_2)$;
$y_3(x) = (H_2 \le x < H_3)$;
$y_4(x) = (x \ge H_3)$; and
wherein $\mathrm{Output}(z')$ is output as the early warning mark: no early warning, normal, important, or serious.
41. A computer readable storage medium, storing thereon a computer program which, when executed by a processor, causes the processor to:
collect public-opinion text data from any designated website;
construct a data-source sequence for website sources of the public-opinion text data, wherein a credit weight is assigned for each designated website;
match risk labels of the public-opinion text data based on a preset risk-label set to construct a risk-label sequence, wherein the preset risk-label set includes risk keywords configured in a risk-label class, wherein each risk-label class has a risk weight;
perform classification of sentiment polarities of the public-opinion text data using a sentiment classification model to construct a sentiment-polarity sequence;
identify associated enterprise entity names in the public-opinion text data to construct an enterprise-association sequence; and
compute and output a public opinion analysis result according to the data-source sequence, the risk-label sequence, the sentiment-polarity sequence and the enterprise-association sequence corresponding to the public-opinion text data.
42. The storage medium of claim 41, wherein constructing the data-source
sequence for the
website sources of the public-opinion text data comprises:
summing up a total number of the designated websites;
configuring the credit weight for each designated website, to construct a data-source sequence set dimensionally consistent with the total number;
identifying a location of the source website in the data-source sequence set;
constructing the corresponding data-source sequence; and
matching a corresponding credit weight.
43. The storage medium of claim 41, further comprising:
constructing the risk-label set in advance, wherein the risk-label set
includes plural risk-
label classes, and each risk-label class corresponds to at least one risk
keyword; and
configuring the risk weight for each risk-label class in the risk-label set.
44. The storage medium of claim 43, wherein matching risk labels of the public-
opinion text data
based on the preset risk-label set to construct the risk-label sequence
comprises:
matching the risk keywords to the public-opinion text data by means of text
keyword
matching;
searching for corresponding risk-label class according to matching results;
and
based on locations of the risk-label classes in the risk-label set,
constructing the risk-label
sequence.
45. The storage medium of claim 41, wherein training of the sentiment
classification model
comprises:
extracting public opinion corpora of various sentiment polarities respectively
from
acquired public opinion corpora, to construct a tag-corpus set;
training the sentiment classification model based on the tag-corpus set using
a Long
short-term memory (LSTM) or convolutional neural network for text (TextCNN)
model
structure; and
wherein classifications of the sentiment polarities include one or more of
positive
sentiment, neutral sentiment, and negative sentiment, and the sentiment-
polarity sequence
is a sequence representation of one of the three sentiment polarities.
46. The storage medium of claim 45, further comprising configuring a corresponding polarity weight for every kind of sentiment polarity.
47. The storage medium of claim 41, wherein identifying the associated
enterprise entity names
in the public-opinion text data to construct the enterprise-association
sequence comprises:
constructing a monitored-enterprise list consisting of plural enterprise
entities in advance;
identifying the enterprise entity name associated with the public-opinion text data by means of keyword matching with a Chinese word segmentation tool and a named-entity recognition (NER) tool; and
based on a location of the enterprise entity name in the monitored-enterprise
list,
constructing the enterprise-association sequence.
48. The storage medium of claim 41, further comprising:
presetting plural kinds of risk-early-warning levels; and
defining boundary intervals of each kind of risk-early-warning levels.
49. The storage medium of claim 48, wherein according to the data-source
sequence, the risk-
label sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result comprises:
using a public-opinion-risk-early-warning equation
$z = \sum_{i=0}^{n} R_i L_i + \sum_{i=0}^{k} W_i S_i + \sum_{i=0}^{p} Q_i T_i$
to compute a risk value of the public-opinion text data;
computing an early-warning value corresponding to the public-opinion text data
in view
of the enterprise-association sequence;
outputting the risk-early-warning level based on the boundary interval to
which the early-
warning value belongs;
wherein Ri denotes a risk weight of a corresponding risk-label class, Li
denotes the risk-
label sequence, n denotes a total number of the risk-label classes in the risk-
label set, Wi
denotes a credit weight of the designated website, Si denotes the data-source
sequence, k
denotes the total number of the designated websites, Qi denotes a polarity
weight, Ti
denotes the sentiment-polarity sequence, and p denotes a total number of the
sentiment
polarities.
50. The storage medium of any one of claims 41 to 49, wherein the public-
opinion text data are
collected from any designated website, and are processed to construct the
website sources,
wherein the sources of public opinions about enterprises include one or more
of news
websites, government websites, forums, micro blogging, and websites receiving
complaints,
the source sequence is S = {S1, S2, ..., Sk}, wherein, according to the sources of public opinions,
different
credit weights Wi are assigned, wherein the credit weights are configured by
users, wherein
setting addresses, site sections, data-collecting frequencies, keywords of the
public-opinion
data sources are performed, wherein an Internet-based data collecting tool is
used to acquire
the public-opinion text data.
51. The storage medium of any one of claims 41 to 50, wherein collecting the
public-opinion text
data comprises:
using a Python-based or Java-based html processing tool to denoise webpages,
clean data,
and extract fields, so that data of public-opinion webpage data is extracted
in a structured
manner by fields including titles, sources, links, releasing date, text,
summaries, and
authors;
storing the extracted structured text data.
52. The storage medium of any one of claims 41 to 51, wherein the risk-label
set is constructed
for classes of risk events that are commonly seen in public opinions about
enterprises and
classes of risk events the users care about, wherein every risk label is
assigned with a
corresponding risk weight Rj, wherein the risk weight may alternatively be
configured by the
users, wherein a keyword set is developed for each of the risk labels, to form
a "label -
keyword dictionary", wherein the risk-label sequence is L = {L1, L2, ...,Ln}
of the public
opinions, wherein n is the total number of the risk labels, Li corresponds to
the 0/1
identification corresponding to the risk label, wherein 1 denotes that there
is the ith label in
the public opinions, and 0 denotes that there is not the ith label in the
public opinions.
53. The storage medium of any one of claims 41 to 52, wherein the sentiment-
polarity and entity-
name identifying module extract public-opinion data sets of three polarities,
including
positive, neutral, and negative sentiment from the acquired public opinion
corpora according
to pre-defined positive and negative sentiment dictionary for a certain
enterprise to form the
tag-corpus set.
54. The storage medium of any one of claims 41 to 53, wherein for every
sentiment polarity, a
corresponding polarity weight Qi is set, wherein Qi = {Q1, Q2, Q3}, wherein
the sentiment-
polarity sequence Ti = {T1, T2, T3}, T1 denotes positive sentiment, T2 denotes
neutral
sentiment, and T3 denotes negative sentiment, wherein Q1 denotes the polarity
weight
corresponding to the positive sentiment, Q2 denotes the polarity weight
corresponding to the
neutral sentiment, and Q3 denotes the polarity weight corresponding to the
negative
sentiment.
55. The storage medium of any one of claims 41 to 54, wherein a public-opinion
processing
platform is used to identify enterprise entities from the collected public-
opinion text data, to
extract the risk labels from the data and to analyze the sentiment polarities
of the data,
wherein a personalized configuring service provides standardized application
configuration
interface.
56. The storage medium of any one of claims 41 to 55, wherein the enterprise
entities associated
with the public-opinion text data are identified based on a dictionary of full
names, short
names, and aliases of monitored enterprises through a list of enterprises
monitored, wherein
the public-opinion text data are associated with the enterprise entities to
form an enterprise-association sequence E = {E1, E2, ..., Em}, where m is the number of all the
monitored
enterprises, Ei is the 0/1 label, wherein 1 denotes the public opinion is
associated with the
enterprise, and 0 denotes not associated, wherein synchronization of the
monitored-enterprise
list, updating of a sentiment polarity dictionary, and setting of the public
opinion sources and
risk label weights is supported.

57. The storage medium of any one of claims 41 to 56, wherein a risk early
warning score is
computed according to early warning labels and a list of enterprises monitored
J = {J1, J2, ..., Jm}, wherein Ji is a 0/1 label subscribed by the user, and according to
data-source
data-source
sequences, credit weights, risk-label sequences, risk weights, sentiment-
polarity sequences,
polarity weights, and the enterprise-association sequence of public-opinion
text data, early
warning level is determined according to a risk threshold value, wherein
enterprise public
opinion information that satisfies requirements is pushed to the user as early
warning.
58. The storage medium of any one of claims 41 to 57, wherein the risk-early-
warning level A=
{no early warning, normal, important, serious}, boundary intervals
corresponding to every
risk-early-warning level is: H = {H1, H2, H3}, wherein score is smaller than
H1, the
corresponding risk-early-warning level is not to give early warning, wherein
the score is
greater than H1 and smaller than H2, the corresponding risk-early-warning
level is normal,
wherein the score is greater than H2 and smaller than H3, the corresponding
risk-early-
warning level is important, wherein the score is greater than H3, the
corresponding risk-early-
warning level is serious, wherein the score corresponding to sentiment
polarity is Q =
{Q1, Q2, Q3}, and the sentiment-polarity sequence corresponding to the public-
opinion text
data is T = {T1, T2, T3}.
59. The storage medium of any one of claims 41 to 58, wherein risk early
warning score of entry
of public-opinion text data is computed by:
$z = \sum_{i=0}^{n} R_i L_i + \sum_{i=0}^{k} W_i S_i + \sum_{i=0}^{p} Q_i T_i$;
wherein, with a vector inner product represented by $\langle x, y \rangle$, the equation is:
$z = \langle R, L \rangle + \langle W, S \rangle + \langle Q, T \rangle$;
wherein the equation combined with the sequence information of the associated enterprise is:
$z' = z \cdot \varepsilon(\langle E, J \rangle)$;
wherein $\varepsilon(x)$ is a unit step function:
$\varepsilon(x) = 1$ for $x > 0$, and $\varepsilon(x) = 0$ for $x \le 0$;
wherein, when the enterprise entity name shown in this entry of public-opinion text
data exists in the monitored-enterprise list, the value of $\varepsilon(x)$ is 1 and the
risk early warning score is computed, and when the enterprise entity name mentioned in
the entry of public-opinion text data does not exist in the list of enterprises
monitored, the value of $\varepsilon(x)$ is 0 and no further computation of the risk
early warning score is conducted.
60. The storage medium of any one of claims 41 to 59, wherein early warning
mark is:
$\mathrm{Output}(z') = \langle Y(z'), A \rangle$, wherein $Y(x) = \{y_1(x), y_2(x), y_3(x), y_4(x)\}$, and
the value of each two-value function $y_1(x)$, $y_2(x)$, $y_3(x)$, $y_4(x)$ is True or False (1 or 0),
wherein:
$y_1(x) = (0 \le x < H_1)$;
$y_2(x) = (H_1 \le x < H_2)$;
$y_3(x) = (H_2 \le x < H_3)$;
$y_4(x) = (x \ge H_3)$; and
wherein Output (z') is output as the early warning mark: no early warning,
normal,
important, or serious.
Description

Note: Descriptions are shown in the official language in which they were submitted.


PUBLIC-OPINION ANALYSIS METHOD AND SYSTEM FOR PROVIDING EARLY
WARNING OF ENTERPRISE RISKS
BACKGROUND OF THE INVENTION
Technical Field
[0001] The present invention relates to the technical field of the Internet,
and more particularly
to a public-opinion analyzing method and a system thereof for providing early
warning
of enterprise risks.
Description of Related Art
[0002] Currently, practices of enterprise risk early warning increasingly
depend on and benefit
from applications of technologies like artificial intelligence and natural
language
processing. With the emergence of a great deal of net-based public opinions,
negative
public opinions to or risk events of enterprises have become critical to
identification and
early warning of enterprise risks.
[0003] For users having to pay special attention to enterprise risks, such as
loan approval
managers or risk control managers, it is a significant task to pay close
attention to risk
events of enterprises, thereby acquiring sufficient information about these
risk events and
in turn knowing the risk status of these enterprises. However, this task is
quite labor-
consuming and thus costly. When the number of monitored enterprises is large,
it is
difficult to collect comprehensive information through manual works.
Particularly, when
used to process the massive public-opinion information about enterprises of
interest
circulating over the Internet, manual read can take too much time to give risk
early
warning to relevant enterprises accurately.
SUMMARY OF THE INVENTION
[0004] One objective of the present invention is to provide a method of public-
opinion analysis
for providing early warning of enterprise risks, which can provide a relevant
enterprise
Date Regue/Date Received 2023-01-30

with public-opinion analysis service and early warning service accurately and
efficiently
with reduced human workloads.
[0005] To achieve the foregoing objective, the present invention in a first
aspect provides a
method of public-opinion analysis for providing early warning of enterprise
risks. The
method comprises:
[0006] collecting public-opinion text data from any designated website, and
constructing a data-
source sequence for website sources of the public-opinion text data;
[0007] matching risk labels of the public-opinion text data based on a preset
risk-label set to
construct a risk-label sequence;
[0008] performing classification of sentiment polarities of the public-opinion
text data using a
sentiment classification model so as to construct a sentiment-polarity
sequence, and
identifying associated enterprise entity names in the public-opinion text data
so as to
construct an enterprise-association sequence; and
[0009] according to the data-source sequence, the risk-label sequence, the
sentiment-polarity
sequence and the enterprise-association sequence corresponding to the public-
opinion
text data, computing and outputting a public opinion analysis result.
[0010] Preferably, the step of constructing a data-source sequence for website
sources of the
public-opinion text data comprises:
[0011] summing up a total number of the designated websites and configuring a
credit weight
for each said designated website, so as to construct a data-source sequence
set
dimensionally consistent with the total number; and
[0012] identifying a location of the source website in the data-source
sequence set, constructing
the corresponding data-source sequence, and matching a corresponding said
credit weight
at the same time.
[0013] Preferably, before the step of matching risk labels of the public-
opinion text data with a
preset risk-label set, the method further comprises:
Date Recue/Date Received 2022-01-12

[0014] constructing the risk-label set in advance, wherein the risk-label set
includes plural risk-
label classes, and each said risk-label class corresponds to at least one risk
keyword; and
[0015] configuring a risk weight for each said risk-label class in the risk-
label set.
[0016] More preferably, the step of matching risk labels of the public-opinion
text data based on
a preset risk-label set to construct a risk-label sequence comprises:
[0017] performing matching of the risk keywords to the public-opinion text
data by means of
text keyword matching, and searching for corresponding said risk-label class
according
to matching results; and
[0018] based on locations of the risk-label classes in the risk-label set,
constructing the risk-label
sequence.
[0019] Preferably, training of the sentiment classification model comprises:
[0020] extracting public opinion corpora of various sentiment polarities
respectively from
acquired public opinion corpora, so as to construct a tag-corpus set; and
[0021] training the sentiment classification model based on the tag-corpus set
using an LSTM or
TextCNN model structure;
[0022] classifications of the sentiment polarities include positive sentiment,
neutral sentiment,
and negative sentiment, and the sentiment-polarity sequence is a sequence
representation
of one of the three sentiment polarities.
[0023] More preferably, after the step of performing classification of
sentiment polarities of the
public-opinion text data using a sentiment classification model so as to
construct a
sentiment-polarity sequence, the method further comprises:
[0024] configuring a corresponding polarity weight for every said kind of
sentiment polarity.
[0025] Preferably, the step of identifying associated enterprise entity names
in the public-opinion
text data so as to construct an enterprise-association sequence comprises:
[0026] constructing a monitored-enterprise list consisting of plural
enterprise entities in advance;
[0027] identifying the enterprise entity name associated with the public-
opinion text data by
means of keyword matching with a Chinese word segmentation tool and/or a
named-entity recognition (NER) tool; and
[0028] based on a location of the enterprise entity name in the monitored-
enterprise list,
constructing the enterprise-association sequence.
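The construction of the enterprise-association sequence can be sketched as follows. This is a minimal illustration only: the enterprise names and the alias dictionary are hypothetical, and plain case-insensitive substring matching stands in for the Chinese word segmentation and NER tools named above.

```python
from typing import Dict, List

# Hypothetical monitored-enterprise list; its order fixes each position in E.
MONITORED: List[str] = ["Acme Holdings Ltd.", "Globex Corporation", "Initech Inc."]

# Hypothetical alias dictionary: full name -> names that count as a mention.
ALIASES: Dict[str, List[str]] = {
    "Acme Holdings Ltd.": ["Acme Holdings Ltd.", "Acme Holdings", "Acme"],
    "Globex Corporation": ["Globex Corporation", "Globex"],
    "Initech Inc.": ["Initech Inc.", "Initech"],
}


def enterprise_association_sequence(text: str) -> List[int]:
    """Return E, where E[i] is 1 if the i-th monitored enterprise is mentioned."""
    lowered = text.lower()
    return [
        1 if any(alias.lower() in lowered for alias in ALIASES[name]) else 0
        for name in MONITORED
    ]


if __name__ == "__main__":
    print(enterprise_association_sequence("Regulators fined Globex over late filings."))
```

A production system would likely replace the substring test with proper tokenization, since short aliases can otherwise match inside unrelated words.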
[0029] Preferably, before the step of according to the data-source sequence,
the risk-label
sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result, the method further comprises:
[0030] presetting plural kinds of risk-early-warning levels, and defining
boundary intervals of
each kind of individual risk-early-warning levels.
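As an illustration of preset levels and boundary intervals, the sketch below maps an early-warning value to one of the four levels recited in the claims. The threshold values H1, H2, H3 are hypothetical, and the intervals are assumed to be left-closed (a value equal to a boundary falls into the higher level).

```python
from bisect import bisect_right
from typing import List

# Level names as recited in the claims, ordered from lowest to highest.
LEVELS: List[str] = ["no early warning", "normal", "important", "serious"]

# Hypothetical boundary values H1 < H2 < H3; real values would be configured.
BOUNDARIES: List[float] = [10.0, 15.0, 20.0]


def warning_level(value: float, boundaries: List[float] = BOUNDARIES) -> str:
    """Return the level whose boundary interval contains the given value."""
    # bisect_right counts how many boundaries are <= value, which is the level index.
    return LEVELS[bisect_right(boundaries, value)]


if __name__ == "__main__":
    for v in (3.0, 10.0, 17.5, 25.0):
        print(v, warning_level(v))
```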
[0031] More preferably, the step of according to the data-source sequence, the
risk-label
sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result comprises:
[0032] using a public-opinion-risk-early-warning equation
$z = \sum_{i=0}^{n} R_i L_i + \sum_{i=0}^{k} W_i S_i + \sum_{i=0}^{p} Q_i T_i$
to compute a risk value of the public-opinion text data; and
[0033] computing an early-warning value corresponding to the public-opinion
text data in view
of the enterprise-association sequence, and outputting the risk-early-warning
level based
on the boundary interval to which the early-warning value belongs;
[0034] wherein Ri denotes the risk weight of the corresponding risk-label
class, Li denotes the
risk-label sequence, n denotes a total number of the risk-label classes in the
risk-label
set, Wi denotes the credit weight of the designated website, Si denotes the
data-source
sequence, k denotes the total number of the designated websites, Qi denotes
the polarity
weight, Ti denotes sentiment-polarity sequence, and p denotes a total number
of the
sentiment polarities.
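The risk value is a sum of three inner products, each pairing a weight vector with a 0/1 sequence. The sketch below works through that arithmetic with small hypothetical weights and sequences. The gating step assumes that the early-warning value equals the risk value when the entry is associated with at least one enterprise the user subscribes to, and zero otherwise; the subscription vector J is treated here as an illustrative input.

```python
from typing import Sequence


def inner(weights: Sequence[float], indicators: Sequence[int]) -> float:
    """Inner product of a weight vector and a 0/1 sequence of equal length."""
    if len(weights) != len(indicators):
        raise ValueError("weights and indicators must have the same length")
    return float(sum(w * x for w, x in zip(weights, indicators)))


def risk_value(R, L, W, S, Q, T) -> float:
    """z = <R, L> + <W, S> + <Q, T>."""
    return inner(R, L) + inner(W, S) + inner(Q, T)


def early_warning_value(z: float, E: Sequence[int], J: Sequence[int]) -> float:
    """Keep z only if the entry concerns a subscribed enterprise (unit-step gate)."""
    return z if inner(E, J) > 0 else 0.0


if __name__ == "__main__":
    # Hypothetical values: three risk labels, two websites, three polarities.
    R, L = [10, 5, 7], [0, 0, 1]   # third risk label matched   -> 7
    W, S = [5, 3], [1, 0]          # entry came from website 1  -> 5
    Q, T = [1, 2, 5], [0, 0, 1]    # negative sentiment         -> 5
    z = risk_value(R, L, W, S, Q, T)
    print(z, early_warning_value(z, E=[0, 1], J=[0, 1]))
```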
[0035] As compared to the prior art, the method of public-opinion analysis for
providing early
warning of enterprise risks provided by the present invention has the
following beneficial
effects:
[0036] in the method of public-opinion analysis for providing early warning of
enterprise risks
of the present invention, public-opinion text data are collected from any
designated
website, and are processed to construct website sources. The risk labels for
the public-
opinion text data are matched based on a preset risk-label set for
constructing a risk-label
sequence. Sentiment polarities of the public-opinion text data are classified
using a
sentiment classification model so as to construct a sentiment-polarity
sequence. The
entity names of enterprises associated with the public-opinion text data are
identified and
used to construct an enterprise-association sequence. At last, a public
opinion analysis
result is computed according to the data-source sequence, the risk-label
sequence, the
sentiment-polarity sequence and the enterprise-association sequence
corresponding to the
public-opinion text data, and then outputted.
[0037] It is thus clear that the present invention mines potential risk information about
enterprises in depth through multi-dimensional data processing, so as to form a public-opinion
analyzing process, thereby realizing smart early warning of potential risks
for enterprises
and helping risk business personnel to conduct enterprise risk control and
assessment
more efficiently.
[0038] In a second aspect, the present invention provides a system of public-
opinion analysis for
providing early warning of enterprise risks, which is applied to the method of
public-
opinion analysis for providing early warning of enterprise risks as described
in the
foregoing technical scheme. The system comprises:
[0039] a public-opinion-collecting module, for collecting public-opinion text
data from any
designated website, and constructing a data-source sequence for website
sources of the
public-opinion text data;
[0040] a risk label module, for matching risk labels of the public-opinion
text data based on a
preset risk-label set to construct a risk-label sequence;
[0041] a sentiment-polarity and entity-name identifying module, for performing
classification of
sentiment polarities of the public-opinion text data using a sentiment
classification model
so as to construct a sentiment-polarity sequence, and identifying associated
enterprise
entity names in the public-opinion text data so as to construct an enterprise-
association
sequence; and
[0042] an early warning outputting module, for according to the data-source
sequence, the risk-
label sequence, the sentiment-polarity sequence and the enterprise-association
sequence
corresponding to the public-opinion text data, computing and outputting a
public opinion
analysis result.
[0043] As compared to the prior art, the disclosed public-opinion analyzing
apparatus for
providing early warning of enterprise risks provides beneficial effects that
are similar to
those provided by the method of public-opinion analysis for providing early
warning of
enterprise risks as enumerated above, and thus no repetitions are made herein.
[0044] The present invention in a third aspect provides a computer readable
storage medium,
storing thereon a computer program. When the computer program is executed by a

processor, it implements the steps of the method of public-opinion analysis
for providing
early warning of enterprise risks as described previously.
[0045] As compared to the prior art, the disclosed computer-readable storage
medium provides
beneficial effects that are similar to those provided by the method of public-
opinion
analysis for providing early warning of enterprise risks as enumerated above,
and thus no
repetitions are made herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0046] The accompanying drawings are provided herein for better understanding
of the present
invention and form a part of this disclosure. The illustrative embodiments and
their
descriptions are for explaining the present invention and by no means form any
improper
limitation to the present invention, wherein:
[0047] FIG. 1 is a schematic flowchart of a method of public-opinion analysis
for providing early
warning of enterprise risks according to one embodiment of the present
invention; and
[0048] FIG. 2 is another schematic flowchart of a method of public-opinion
analysis for
providing early warning of enterprise risks according to one embodiment of the
present
invention.
DETAILED DESCRIPTION OF THE INVENTION
[0049] To make the foregoing objectives, features, and advantages of the
present invention
clearer and more understandable, the following description will be directed to
some
embodiments as depicted in the accompanying drawings to detail the technical
schemes
disclosed in these embodiments. It is, however, to be understood that the
embodiments
referred herein are only a part of all possible embodiments and thus not
exhaustive. Based
on the embodiments of the present invention, all the other embodiments can be
conceived
without creative labor by people of ordinary skill in the art, and all these
and other
embodiments shall be embraced in the scope of the present invention.
[0050] Embodiment 1
[0051] Referring to FIG. 1 and FIG. 2, the present embodiment provides a
method of public-opinion analysis for providing early warning of enterprise risks, comprising:
[0052] collecting public-opinion text data from any designated website, and
constructing a data-
source sequence for website sources of the public-opinion text data; matching
risk labels
of the public-opinion text data based on a preset risk-label set to construct
a risk-label
sequence; performing classification of sentiment polarities of the public-
opinion text data
using a sentiment classification model so as to construct a sentiment-polarity
sequence,
and identifying associated enterprise entity names in the public-opinion text
data so as to
construct an enterprise-association sequence; according to the data-source
sequence, the
risk-label sequence, the sentiment-polarity sequence and the enterprise-
association
sequence corresponding to the public-opinion text data, computing and
outputting a
public opinion analysis result.
[0053] In the method of public-opinion analysis for providing early warning of
enterprise risks
of the present invention, public-opinion text data are collected from any
designated
website, and are processed to construct website sources. The risk labels for
the public-
opinion text data are matched with a preset risk-label set for constructing a
risk-label
sequence. Sentiment polarities of the public-opinion text data are classified
using a
sentiment classification model so as to construct a sentiment-polarity
sequence. The
entity names of enterprises associated with the public-opinion text data are
identified and
used to construct an enterprise-association sequence. At last, a public
opinion analysis
result is computed according to the data-source sequence, the risk-label
sequence, the
sentiment-polarity sequence and the enterprise-association sequence
corresponding to the
public-opinion text data, and then outputted.
[0054] It is thus clear that the present invention mines potential risk information about
enterprises in depth through multi-dimensional data processing, so as to form a public-opinion
analyzing process, thereby realizing smart early warning of potential risks
for enterprises
and helping risk business personnel to conduct enterprise risk control and
assessment
more efficiently.
[0055] In the embodiment described above, the step of constructing a data-
source sequence
according to website sources of the public-opinion text data comprises:
[0056] summing up a total number of the designated websites and configuring a
credit weight for
credit weight for
each said designated website, so as to construct a data-source sequence set
dimensionally
consistent with the total number; and identifying a location of the source
website in the
data-source sequence set, constructing the corresponding data-source sequence,
and
matching a corresponding said credit weight.
[0057] In particular implementations, the public-opinion-collecting module
serves to collect
public-opinion text data of enterprises and perform structured data
extraction. The first
step is to set and configure the public-opinion data sources. Sources of
public opinions
about enterprises primarily include news websites, government websites,
forums,
WEIBO, and websites receiving complaints. The source sequence is
S = {S1, S2, ..., Sk}.
Different credit weights Wi are assigned according to the sources of public
opinions.
The credit weights may alternatively be configured by users. The step further
includes
setting addresses, site sections, data-collecting frequencies, and keywords of
the public-opinion data sources. An Internet-based data collecting tool is then
used to acquire public-opinion text data. Afterward, a Python-based or
Java-based HTML processing tool is used
to denoise the webpages, clean data, and extract fields, so that
public-opinion
webpage data can be extracted in a structured manner by fields such as titles,
sources, links,
release dates, text, summaries, and authors.
[0058] Exemplarily, collecting the public-opinion text data is realized
through the following
steps:
[0059] Step 1: using the Python-based or Java-based HTML processing tool to
denoise the
webpages, clean data, and extract fields, so that public-opinion webpage data
can
be extracted in a structured manner by fields such as titles, sources, links,
release dates,
text, summaries, and authors. In an example, the set list of designated
websites is
["website 1", "website 2", "website 3", "website 4", "website 5", "website 6",
"website 7",
"website 8", "website 9"], and the credit weights assigned to the designated
websites
(ranging from 1 to 5) are: [5, 5, 3, 5, 3, 3, 5, 5, 4].
[0060] Step 2: the extracted structured text data are stored in the form of:
{
"title": "Fake products doing great harm, how to rule special formula milk
powder products in a targeted way",
"content": "The powdered protein beverage event in XXXX is about falsely
claiming that powdered protein beverage is a kind of special formula milk
powder, and led to severe dysplasia among infants and babies. In this event,
a series of violating operations including illegal propaganda, sales
malpractice, and consumer fraud caused health damage to infants and
babies ....",
[0061] "datetime": "2020-06-08 09:40:31",
"source": "certain social media platform",
"url": "http://food.china.com.cn/2020-06/08/content76137776.htm",
"author": "Wang XX",
"summary": ""
}.
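Using the example website list and credit weights given in Step 1, the data-source sequence for a collected entry can be sketched as a one-hot vector over the designated websites. The function names below are illustrative and not taken from the source.

```python
from typing import List

# Example values from Step 1: nine designated websites and their credit weights.
SITES: List[str] = [f"website {i}" for i in range(1, 10)]
CREDIT_WEIGHTS: List[int] = [5, 5, 3, 5, 3, 3, 5, 5, 4]


def data_source_sequence(source: str) -> List[int]:
    """Return S, a 0/1 sequence with a single 1 at the source website's position."""
    if source not in SITES:
        raise ValueError(f"{source!r} is not a designated website")
    return [1 if site == source else 0 for site in SITES]


def credit_weight(source: str) -> int:
    """Look up the credit weight matched to the source website."""
    return CREDIT_WEIGHTS[SITES.index(source)]


if __name__ == "__main__":
    print(data_source_sequence("website 3"), credit_weight("website 3"))
```

Because S is one-hot, the inner product of the credit weights with S simply selects the credit weight of the entry's source website.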
[0062] In the embodiment described above, before the step of matching risk
labels of the public-
opinion text data based on a preset risk-label set, the method further
comprises:
[0063] constructing the risk-label set in advance, wherein the risk-label set
includes plural risk-
label classes, and each said risk-label class corresponds to at least one risk
keyword; and
configuring a risk weight for each said risk-label class in the risk-label
set.
[0064] The step of matching risk labels of the public-opinion text data based
on a preset risk-
label set to construct a risk-label sequence comprises:
[0065] performing matching of the risk keywords to the public-opinion text
data by means of
text keyword matching, and searching for corresponding said risk-label class
according
to matching results; and based on locations of the risk-label classes in the
risk-label set,
constructing the risk-label sequence.
[0066] The risk label module mainly serves to extract risk labels in public
opinions by
matching risk keywords according to a risk-label set created in advance.
First, a risk-label
set is constructed for classes of risk events that are commonly seen in public
opinions
about enterprises and classes of risk events the users care about. Every risk
label is assigned
a corresponding risk weight Ri. The risk weight may alternatively be
configured by
the users. A keyword set is developed for each of the risk labels, so as to
form a "label - keyword dictionary". Then the public opinion text is matched
with the risk keywords by
means of text keyword matching, and tagging is performed according to the
matching results, so
as to generate a risk-label sequence L = {L1, L2, ..., Ln} of the public
opinions, where n
is the total number of the risk labels and Li is the 0/1 identification
corresponding to the ith risk label: 1 denotes that the ith label is present
in the public opinions,
and 0 denotes that it is not.
[0067] Exemplarily, the risk label matching process performed on the public-
opinion text data is
achieved through the following steps:
[0068] Step 1, a risk-label set is created by defining and organizing labels
for public
opinion risk classes in view of business requirements from the risk
management
field, e.g.:
[0069] ["bankruptcy and insolvency", "mortgage and pledge", "loss", "equity
change", "default
and thunder", "Illegal fundraising", "infringement and plagiarism", "contract
dispute",
"violation of regulations or laws", "falsity and fraud", "tax evasion",
"security events"];
wherein the risk weights (ranging from 1 to 10) corresponding to the risk-
label classes
are set as: [10, 5, 7, 10, 4, 3, 2, 2, 5, 3, 3].
[0070] Step 2, the risk keyword sets corresponding to the risk-label classes
are organized to form a
"label - keyword dictionary"; for example:
[0071] {
[0072] bankruptcy and insolvency: bankruptcy and insolvency, bankruptcy,
frozen, business
closed, business suspend, suspend business for rectification, seized, revoked,
detained,
non-standard opinion;
[0073] mortgage and pledge: debt collateralizing, collateralizing debt, asset
value less than
issued debt, asset mortgage, security for loan, pledge of equity;
[0074] loss: loss, aggravation, arrears, performance increase, sales decrease;
[0075] equity change: equity change, pledge of equity, changes in equity,
increase holdings,
decrease holdings, capital reduction, split-up, merged;
[0076] default and thunder: debt default, thunder, runaway, overdue, dishonest
person,
uncertainty of cashing, arrears in contribution, P2P, blacklist, executed,
risk;
[0077] contract dispute: contract dispute, contract cancellation, labor
dispute, labor lawsuit;
[0078] falsity and fraud: financial fraud, suspected fraud, financial scandal,
fraud;
[0079] Illegal fundraising: Illegal fundraising, fundraising fraud;
[0080] tax evasion: tax dodging, tax fraud, tax avoiding;
[0081] infringement and plagiarism: infringement, plagiarism;
[0082] security events: incident, information leakage, private data, data
leakage, production
incident;
[0083] violation of regulations or laws: violation of law, violation of
regulation, complaint, right
protection, MLM, economic investigation intervention, arbitration, commission,
loan
shark, criminal case, prosecuted, involved in gangs or vices, official
investigation;
[0084] }.
[0085] Step 3, through keyword matching, the public opinion text is matched
with the risk
keywords, and according to the matching results, tagging is made with the
labels, so as
to obtain a risk-label sequence.
[0086] Assuming that one collected entry of public-opinion text data is "A
series of incidents
happened in constructions undertaken by XXXX and the company is now forbidden
from
managing new projects by the Housing and Construction Office due to violation
",
and the word "violation" in the public-opinion text data matches a risk
keyword in the
risk label of "violation of regulations or laws", the risk label matching the
public-
opinion text data is "violation of regulations or laws". Because the other
risk labels are
all unmatched, "1" is only used to mark the risk-label sequence at the location
of the
of the
element corresponding to "violation of regulations or laws", and the locations
of the other
elements in the risk-label sequence are marked with "0". As a result, the risk-
label
sequence corresponding to the foregoing public-opinion text data is [0, 0, 0,
0, 0, 0, 0, 0,
12
Date Recue/Date Received 2022-01-12

0, 0, 1].
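The keyword matching of Step 3 can be sketched in Python as follows. This is only an illustration under stated assumptions: the label set is cut down to three of the labels listed above, the keyword lists are abbreviated, the bare keyword "violation" is assumed to be part of the last label's keyword list, and the function name risk_label_sequence is hypothetical.

```python
# Illustrative subset of the preset risk-label set. The keyword lists are
# abbreviated, and the single keyword "violation" is an assumption.
RISK_LABELS = [
    ("contract dispute", ["contract dispute", "contract cancellation", "labor dispute"]),
    ("tax evasion", ["tax dodging", "tax fraud", "tax avoiding"]),
    ("violation of regulations or laws", ["violation", "complaint", "arbitration"]),
]


def risk_label_sequence(text, labels=RISK_LABELS):
    """Return a 0/1 list with one element per risk label, in label order."""
    lowered = text.lower()
    return [
        1 if any(keyword in lowered for keyword in keywords) else 0
        for _, keywords in labels
    ]


text = ("A series of incidents happened in constructions undertaken by XXXX "
        "and the company is now forbidden from managing new projects by the "
        "Housing and Construction Office due to violation")
print(risk_label_sequence(text))  # [0, 0, 1]
```

Plain substring matching is used here for brevity; for Chinese text, matching would more plausibly run on the segmented words produced by pre-processing.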
[0087] In the embodiment described above, training of the sentiment
classification model
comprises:
[0088] extracting public opinion corpora of various sentiment polarities from
acquired public
opinion corpora, so as to construct a tag-corpus set; and training the
sentiment
classification model based on the tag-corpus set using an LSTM or TextCNN
model
structure; in which the sentiment polarities include positive sentiment,
neutral sentiment,
and negative sentiment, and the sentiment-polarity sequence is a sequence
representation
of one of the three sentiment polarities.
[0089] In particular implementations, the sentiment-polarity and entity-name identifying
module extracts public-opinion data sets of three polarities (positive sentiment, neutral
sentiment, and negative sentiment) from the acquired public opinion corpora according to
pre-defined positive and negative sentiment dictionaries for a certain enterprise, so as
to form a tag-corpus set. For example:
[0090] [
[0091] The public opinion corpora of "negative sentiment":
[0092] A loss as high as 1.7 billion CNY, with power stations devalued; Is the
case of X tech-
company a common suffering of the industry;
[0093] New movies scheduled for February are halted again, 90% film and
television stocks hit
the limit down and cinema stocks enter the "Glacier Era";
[0094] Takkyubin accused: a network technology company presumed to increase
pricing and
graft price differences;
[0095] A courier company in Shanghai is so inefficient that couriers quit for
other careers;
[0096] ....
[0097] The public opinion corpora of "positive sentiment":
[0098] Challenging "vaccine leader" XXXX! First domestic vaccines launched;
[0099] With a burst of bullish news in the tera-scale plate blasted another
harden of hundred-
billion leading stocks;
[0100] Bullish news continuously come in the hydrogen energy industry and two
sectors are expecting long-term growth;
[0101] certain video platform is still "solid";
[0102] ....
[0103] The public opinion corpora of "neutral sentiment":
[0104] What exactly the "long-termism" advocated by A, B, and C is;
[0105] An image to the quotations in 2020;
[0106] Say goodbye to the getting-ready 2019 and enter the deep transformation
in 2020;
[0107] Why Central Bank of certain country decided to cut the requirement
reserve ratio in early
January? For providing the market with liquidity;
[0108] ....
[0109] ]
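The dictionary-based construction of the tag-corpus set might look like the minimal Python sketch below. The two word sets are hypothetical stand-ins for the pre-defined sentiment dictionaries, and the rule of comparing hit counts is an assumption, since the description does not state how dictionary hits are turned into a polarity.

```python
# Hypothetical sentiment dictionaries; real ones would be curated per enterprise.
POSITIVE_WORDS = {"bullish", "growth", "launched", "solid"}
NEGATIVE_WORDS = {"loss", "devalued", "accused", "inefficient", "halted"}


def tag_polarity(headline):
    """Label a headline by comparing positive and negative dictionary hits."""
    words = {w.strip('.,;:!?"').lower() for w in headline.split()}
    pos = len(words & POSITIVE_WORDS)
    neg = len(words & NEGATIVE_WORDS)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"


def build_tag_corpus(headlines):
    """Group headlines into a tag-corpus set keyed by sentiment polarity."""
    corpus = {"positive": [], "neutral": [], "negative": []}
    for headline in headlines:
        corpus[tag_polarity(headline)].append(headline)
    return corpus


corpus = build_tag_corpus([
    "A loss as high as 1.7 billion CNY, with power stations devalued",
    "Bullish news continuously come in the hydrogen energy industry",
    "An image to the quotations in 2020",
])
```

The resulting groups would then serve as labelled training data for the LSTM or TextCNN classifier.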
[0110] After text pre-processing is performed on the public opinion corpora, a word embedding
model that has been trained with a large quantity of public opinion text about enterprises
of interest is used to produce the text vector representations for model training.
Afterward, the sentiment classification model is trained based on LSTM/TextCNN. As training
sentiment classification models is known in the art, no detailed description is given and
only the results are discussed herein. As demonstrated by the statistics, the
sentiment classification model according to the present embodiment, when based on 100
thousand entries of data, provided an accuracy rate of 87%, satisfying expectation.
[0111] In the embodiment described above, after the step of classifying
sentiment polarities of
the public-opinion text data using a sentiment classification model so as to
construct a
sentiment-polarity sequence, the method further comprises:
[0112] For every sentiment polarity, a corresponding polarity weight $Q_i$ is set, wherein
$Q = (Q_1, Q_2, Q_3)$. In the sentiment-polarity sequence $T = (T_1, T_2, T_3)$, $T_1$ denotes
positive sentiment, $T_2$ denotes neutral sentiment, and $T_3$ denotes negative sentiment.
$Q_1$ denotes the polarity weight corresponding to the positive sentiment, $Q_2$ denotes the
polarity weight corresponding to the neutral sentiment, and $Q_3$ denotes the polarity
weight corresponding to the negative sentiment.
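As a small illustration, the sentiment-polarity sequence can be treated as a one-hot vector and combined with the polarity weights by an inner product. The function names below are hypothetical, and the weights (1, 2, 3) are example values only.

```python
POLARITIES = ("positive", "neutral", "negative")


def polarity_sequence(label):
    """One-hot sequence T = (T1, T2, T3) for a predicted polarity label."""
    if label not in POLARITIES:
        raise ValueError(f"unknown polarity: {label}")
    return [1 if label == p else 0 for p in POLARITIES]


def polarity_score(weights, sequence):
    """Inner product of the polarity weights Q and the one-hot sequence T."""
    return sum(q * t for q, t in zip(weights, sequence))


Q = (1, 2, 3)  # example polarity weights (Q1, Q2, Q3)
T = polarity_sequence("negative")
print(T, polarity_score(Q, T))  # [0, 0, 1] 3
```

Because exactly one element of T is 1, the inner product simply selects the weight of the predicted polarity.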
[0113] In the embodiment described above, the step of identifying associated
enterprise entity
names in the public-opinion text data so as to construct an enterprise-
association sequence
comprises:
[0114] constructing a monitored-enterprise list consisting of plural enterprise entities in
advance; using a Chinese word segmentation tool and/or a named-entity recognition (NER)
tool to identify the enterprise entity name associated with the public-opinion text data
by means of keyword matching; and based on a location of the enterprise entity name in
the monitored-enterprise list, constructing the enterprise-association sequence.
[0115] In particular implementations, a public-opinion processing platform is used to identify
enterprise entities from the collected public-opinion text data, to extract risk labels
from the data, and to analyze sentiment polarities of the data. Meanwhile, a personalized
configuring service provides a standardized application configuration interface.
[0116] First, the public-opinion input module performs text pre-processing on
titles, content text,
and summary text of public-opinion text data collected in a real-time manner
from public-
opinion data sources according to subscription, so as to remove undesired stop
words and
conduct Chinese word segmentation. The second step is to process public-
opinion labels
and classify sentiment polarities. The pre-processed public-opinion text data
are entered
into a risk label module to generate risk-label sequences and are entered into
the sentiment
polarity analyzing module to generate sentiment polarity labels, such as
positive
sentiment, neutral sentiment or negative sentiment. At the third step, the enterprise
entities associated with the public-opinion text data are identified using the combination
of the Chinese word segmentation tool and the named-entity recognition (NER) tool as
well as keyword matching, based on the dictionary of full names, short names, and aliases
of the monitored enterprises in the monitored-enterprise list. The public-opinion
text data are associated with the enterprise entities to form an enterprise-association
sequence $E = (E_1, E_2, \ldots, E_m)$, where $m$ is the number of all the monitored
enterprises and $E_i$ is a 0/1 label, in which 1 denotes that the public opinion is
associated with the $i$-th enterprise, and 0 denotes that it is not. The personalized
configuring module
of the
platform supports synchronization of the monitored-enterprise list, updating
of the
sentiment polarity dictionary, and setting of the public opinion sources and
the risk label
weights.
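The construction of the enterprise-association sequence can be sketched as below. The enterprise names and aliases are invented for illustration, and plain case-insensitive substring matching stands in for the word segmentation and NER tools, which are not reproduced here.

```python
# Hypothetical monitored-enterprise list: (full name, [short names and aliases]).
MONITORED = [
    ("Alpha Construction Group Co., Ltd.", ["Alpha Construction", "Alpha Group"]),
    ("Beta Logistics Co., Ltd.", ["Beta Logistics", "Beta Express"]),
    ("Gamma Video Technology Co., Ltd.", ["Gamma Video"]),
]


def enterprise_association_sequence(text, monitored=MONITORED):
    """Return E = (E1, ..., Em): 1 where the text mentions the i-th enterprise."""
    lowered = text.lower()
    sequence = []
    for full_name, aliases in monitored:
        names = [full_name, *aliases]
        sequence.append(1 if any(n.lower() in lowered for n in names) else 0)
    return sequence


print(enterprise_association_sequence(
    "A series of incidents happened in constructions undertaken by Alpha Construction"
))  # [1, 0, 0]
```

The position of each 1 follows the order of the monitored-enterprise list, so the sequence can later be compared element by element with a user's subscription list of the same length.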
[0117] Exemplarily, for public-opinion text data saying "A series of incidents
happened in
constructions undertaken by XXXX and the company is now forbidden from
managing
new projects by the Housing and Construction Office due to violation ..... ",
this entry of
data is classified by its sentiment polarity to confirm that the public
opinion sentiment
label is negative sentiment. Through extraction of the associated enterprise
entities, the
monitored enterprise list sequence corresponding to the public opinion
association
enterprises is generated as
[0118] [ ..., 0, 1, 0, ... ]
[0119] In the embodiment described above, before the step of computing and outputting a
public opinion analysis result according to the data-source sequence, the risk-label
sequence, the sentiment-polarity sequence and the enterprise-association sequence
corresponding to the public-opinion text data, the method further comprises:
[0120] presetting plural kinds of risk-early-warning levels, and defining
boundary intervals of
each kind of risk-early-warning level.
[0121] In particular implementations, a risk early warning score is computed according to the
early warning labels and the list of monitored enterprises $J = (J_1, J_2, \ldots, J_m)$
(where $J_i$ is a 0/1 label) subscribed by the user, and according to the data-source
sequences, credit weights, risk-label sequences, risk weights, sentiment-polarity
sequences, polarity weights, and enterprise-association sequence of the public-opinion
text data. The early warning level is then determined according to a risk threshold value,
and enterprise public opinion information that satisfies the requirements is pushed to the
user as early warning.
[0122] Exemplarily, for the risk-early-warning levels $A$ = {no early warning, normal,
important, serious}, the boundary intervals corresponding to the risk-early-warning levels
are defined by $H = (H_1, H_2, H_3)$. In other words, when the score is smaller than $H_1$,
the corresponding risk-early-warning level is not to give early warning. When the score is
greater than $H_1$ and smaller than $H_2$, the corresponding risk-early-warning level is
normal. When the score is greater than $H_2$ and smaller than $H_3$, the corresponding
risk-early-warning level is important. When the score is greater than $H_3$, the
corresponding risk-early-warning level is serious. The score corresponding to the
sentiment polarity is $Q = (Q_1, Q_2, Q_3)$, and the sentiment-polarity sequence
corresponding to the public-opinion text data is $T = (T_1, T_2, T_3)$, where exactly one
$T_i$ is 1 and the other two are 0.
[0123] The step of computing and outputting a public opinion analysis result according to the
data-source sequence, the risk-label sequence, the sentiment-polarity sequence and the
enterprise-association sequence corresponding to the public-opinion text data comprises:
[0124] using a public-opinion-risk-early-warning equation
$z = \sum_{i=1}^{n} R_i L_i + \sum_{i=1}^{k} W_i S_i + \sum_{i=1}^{p} Q_i T_i$
to compute a risk value of the public-opinion text data; computing an early-warning value
corresponding to the public-opinion text data in view of the enterprise-association
sequence; and outputting the risk-early-warning level based on the boundary interval to
which the early-warning value belongs; where $R_i$ denotes the risk weight of the
corresponding risk-label class, $L_i$ denotes the risk-label sequence, $n$ denotes a total
number of the risk-label classes in the risk-label set, $W_i$ denotes the credit weight of
the designated website, $S_i$ denotes the data-source sequence, $k$ denotes the total number
of the designated websites, $Q_i$ denotes the polarity weight, $T_i$ denotes the
sentiment-polarity sequence, and $p$ denotes a total number of the sentiment polarities.
[0125] In particular implementations, for a given user, the risk early warning score of an
entry of public-opinion text data can be computed using the equation below:
[0126] $z = \sum_{i=1}^{n} R_i L_i + \sum_{i=1}^{k} W_i S_i + \sum_{i=1}^{p} Q_i T_i$
[0127] With the vector inner product represented by $\langle x, y \rangle$, the equation above
can be rewritten as
[0128] $z = \langle R, L \rangle + \langle W, S \rangle + \langle Q, T \rangle$
[0129] With the sequence information of the associated enterprise combined, it is obtained that
[0130] $z' = z \cdot \varepsilon(\langle E, J \rangle)$
[0131] where $\varepsilon(x)$ is a unit step function,
[0132] $\varepsilon(x) = \begin{cases} 1, & x > 0 \\ 0, & x \le 0 \end{cases}$
[0133] It is understandable that when the enterprise entity name shown in this entry of
public-opinion text data exists in the monitored-enterprise list, the value of
$\varepsilon(x)$ is 1, and the risk early warning score is retained. When the enterprise
entity name mentioned in the entry of public-opinion text data does not exist in the
monitored-enterprise list, the value of $\varepsilon(x)$ is 0. In this case, no more
computation of the risk early warning score is conducted for that entry.
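The score computation and the enterprise gating can be sketched as follows. The function names are hypothetical, and all weight values in R and W are assumptions chosen only so that the three inner products come out as 5, 3, and 3; the step function is taken to return 0 at x = 0, so that an entry with no monitored enterprise is gated out.

```python
def inner(a, b):
    """Inner product of two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x * y for x, y in zip(a, b))


def unit_step(x):
    """Unit step: 1 for x > 0, otherwise 0 (assumed to be 0 at x = 0)."""
    return 1 if x > 0 else 0


def risk_value(R, L, W, S, Q, T):
    """z = <R, L> + <W, S> + <Q, T>."""
    return inner(R, L) + inner(W, S) + inner(Q, T)


def early_warning_value(z, E, J):
    """z' = z * step(<E, J>): keep z only if a subscribed enterprise is hit."""
    return z * unit_step(inner(E, J))


# Assumed weights (illustrative only) and one entry's sequences.
R = [1] * 10 + [5]                    # risk weights, one per risk label
L = [0] * 10 + [1]                    # risk-label sequence
W = [1, 1, 1, 3, 1, 1, 1, 1, 1]       # credit weights, one per website
S = [0, 0, 0, 1, 0, 0, 0, 0, 0]       # data-source sequence
Q = [1, 2, 3]                         # polarity weights
T = [0, 0, 1]                         # sentiment-polarity sequence

z = risk_value(R, L, W, S, Q, T)
print(z)                                             # 11
print(early_warning_value(z, [0, 1, 0], [1, 1, 0]))  # 11
print(early_warning_value(z, [0, 1, 0], [1, 0, 1]))  # 0
```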
[0134] Further, the early warning mark is $\mathrm{Output}(z') = \langle Y(z'), A \rangle$,
where $Y(x) = (y_1(x), y_2(x), y_3(x), y_4(x))$, and the value of each of the two-valued
functions $y_1(x), y_2(x), y_3(x), y_4(x)$ is True or False (i.e., 1 or 0):
[0135] $y_1(x) = (0 \le x < H_1)$
[0136] $y_2(x) = (H_1 \le x < H_2)$
[0137] $y_3(x) = (H_2 \le x < H_3)$
[0138] $y_4(x) = (x \ge H_3)$
[0139] $\mathrm{Output}(z')$ is output as the early warning mark: no early warning, normal,
important, or serious.
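A minimal Python sketch of the mapping from an early-warning value to a level is given below. The function names are hypothetical, the thresholds (5, 10, 30) are example values, and the treatment of the interval boundaries as closed on the left is an assumption.

```python
LEVELS = ("no early warning", "normal", "important", "serious")


def indicator_vector(x, thresholds):
    """Y(x) = (y1, y2, y3, y4): exactly one element is 1 for any x >= 0."""
    h1, h2, h3 = thresholds
    if x < 0:
        raise ValueError("the early-warning value is assumed to be non-negative")
    return [
        int(0 <= x < h1),
        int(h1 <= x < h2),
        int(h2 <= x < h3),
        int(x >= h3),
    ]


def output_level(x, thresholds, levels=LEVELS):
    """Select the level whose indicator is 1 (the 'inner product' of Y and A)."""
    y = indicator_vector(x, thresholds)
    return next(level for flag, level in zip(y, levels) if flag)


H = (5, 10, 30)  # example thresholds (H1, H2, H3)
print(indicator_vector(11, H))  # [0, 0, 1, 0]
print(output_level(11, H))      # important
```

A gated-out entry with an early-warning value of 0 falls into the first interval and therefore produces no early warning.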
[0140] For example, the risk-early-warning levels are $A$ = {no early warning, normal,
important, serious}, with the corresponding threshold values
$H = (H_1 = 5, H_2 = 10, H_3 = 30)$.
[0141] The score corresponding to the sentiment polarities (positive sentiment, neutral
sentiment, and negative sentiment) is $Q = (1, 2, 3)$, and the sentiment-polarity sequence
corresponding to this entry of the public-opinion text data is $T = (0, 0, 1)$.
[0142] Take as an example the input public-opinion text data "A series of incidents happened
in constructions undertaken by XXXX and the company is now forbidden from managing
new projects by the Housing and Construction Office due to violation .....". The
public-opinion text data came from NetEase, and the corresponding data-source
sequence vector is [0, 0, 0, 1, 0, 0, 0, 0, 0]. The early warning label subscribed by the
user is "security incident", and the list of monitored enterprises includes XXXX.
[0143] According to the equation below, the risk early warning score is:
[0144] $z = \langle R, L \rangle + \langle W, S \rangle + \langle Q, T \rangle = 5 + 3 + 3 = 11$
[0145] Since the public-opinion text data contains an associated enterprise (XXXX) that is one
of the enterprises monitored by the user, $\langle E, J \rangle > 0$, so
$\varepsilon(\langle E, J \rangle) = 1$, thereby obtaining that
$z' = z \cdot \varepsilon(\langle E, J \rangle) = z = 11$.
[0146] Further, because $H_2 < z' < H_3$, $Y(z') = (0, 0, 1, 0)$, and therefore the resulting
early warning mark is $\mathrm{Output}(z') = \langle Y(z'), A \rangle$ = "important". The
early warning outputting module thus outputs the public opinion "A series of incidents
happened in constructions undertaken by XXXX and the company is now forbidden from
managing new projects by the Housing and Construction Office due to violation ....." to
the user as an "important" early warning.
[0147] To sum up, the schemes of the present embodiment are intended to mine potential risk
information about enterprises of interest, and provide automated and personalized
configuration, so as to form a public-opinion analyzing process and give smart early
warning of potential risks to relevant enterprises, thereby helping risk business personnel
to conduct enterprise risk control and assessment more efficiently.
[0148] Embodiment 2
[0149] The present embodiment provides a system of public-opinion analysis for
providing early
warning of enterprise risks. The system comprises:
[0150] a public-opinion-collecting module, for collecting public-opinion text
data from any
designated website, and constructing a data-source sequence according to
website sources
of the public-opinion text data;
[0151] a risk label module, for matching risk labels of the public-opinion
text data based on a
preset risk-label set to construct a risk-label sequence;
[0152] a sentiment-polarity and entity-name identifying module, for performing
classification of
sentiment polarities of the public-opinion text data using a sentiment
classification model
so as to construct a sentiment-polarity sequence, and identifying associated
enterprise
entity names in the public-opinion text data so as to construct an enterprise-
association
sequence; and
[0153] an early warning outputting module, for computing and outputting a public opinion
analysis result according to the data-source sequence, the risk-label sequence, the
sentiment-polarity sequence and the enterprise-association sequence corresponding to the
public-opinion text data.
[0154] As compared to the prior art, the system of public-opinion analysis for
providing early
warning of enterprise risks of the present embodiment provides beneficial
effects that are
similar to those provided by the method of public-opinion analysis for
providing early
warning of enterprise risks as enumerated in the previous embodiment, and thus
no
repetitions are made herein.
[0155] Embodiment 3
[0156] The present embodiment provides a computer-readable storage medium,
storing thereon
a computer program. When the computer program is executed by a processor, it
implements the steps of the method of public-opinion analysis for providing
early
warning of enterprise risks as described previously.
[0157] As compared to the prior art, the computer-readable storage medium of
the present
embodiment provides beneficial effects that are similar to those provided by
the method
of public-opinion analysis for providing early warning of enterprise risks as
enumerated
in the previous embodiment, and thus no repetitions are made herein.
[0158] As will be appreciated by people of ordinary skill in the art,
implementation of all or a
part of the steps of the method of the present invention as described
previously may be
realized by having a program instruct related hardware components. The program may
be stored in a computer-readable storage medium and, when executed, performs
the individual steps of the methods described in the foregoing embodiments. The storage
medium may be a ROM/RAM, a hard drive, an optical disk, a memory card or the like.
[0159] The present invention has been described with reference to the
preferred embodiments
and it is understood that the embodiments are not intended to limit the scope
of the present
invention. Moreover, as the contents disclosed herein should be readily
understood and
can be implemented by a person skilled in the art, all equivalent changes or
modifications
which do not depart from the concept of the present invention should be
encompassed by
the appended claims. Hence, the scope of the present invention shall only be
defined by
the appended claims.
Representative Drawing
A single figure which represents the drawing illustrating the invention.
Administrative Status


Title Date
Forecasted Issue Date 2023-08-01
(22) Filed 2021-11-12
Examination Requested 2022-04-28
(41) Open to Public Inspection 2022-05-12
(45) Issued 2023-08-01

Abandonment History

There is no abandonment history.

Maintenance Fee

Last Payment of $100.00 was received on 2023-12-15


 Upcoming maintenance fee amounts

Description Date Amount
Next Payment if small entity fee 2025-11-12 $50.00
Next Payment if standard fee 2025-11-12 $125.00

Note : If the full payment has not been received on or before the date indicated, a further fee may be required which may be one of the following

  • the reinstatement fee;
  • the late payment fee; or
  • additional fee to reverse deemed expiry.

Patent fees are adjusted on the 1st of January every year. The amounts above are the current amounts if received by December 31 of the current year.
Please refer to the CIPO Patent Fees web page to see all current fee amounts.

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Application Fee 2021-11-12 $408.00 2021-11-02
Advance an application for a patent out of its routine order 2022-04-28 $508.98 2022-04-28
Request for Examination 2025-11-12 $814.37 2022-04-28
Final Fee 2021-11-02 $306.00 2023-06-02
Maintenance Fee - Application - New Act 2 2023-11-14 $100.00 2023-06-15
Maintenance Fee - Patent - New Act 3 2024-11-12 $100.00 2023-12-15
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
10353744 CANADA LTD.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

List of published and non-published patent-specific documents on the CPD .


Document Description    Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
New Application 2021-11-12 6 213
Amendment 2021-11-12 4 112
Translation of Description Requested 2022-01-04 2 199
Description 2022-01-12 21 944
Claims 2022-01-12 4 157
Abstract 2022-01-12 1 24
Drawings 2022-01-12 2 118
Representative Drawing 2022-04-08 1 27
Cover Page 2022-04-08 1 58
Request for Examination / Special Order / Amendment 2022-04-28 26 1,014
Claims 2022-04-28 21 816
Acknowledgement of Grant of Special Order 2022-05-30 1 181
Examiner Requisition 2022-09-29 8 400
Amendment 2023-01-30 52 2,141
Claims 2023-01-30 21 1,176
Description 2023-01-30 21 1,289
Final Fee 2023-06-02 3 64
Representative Drawing 2023-07-12 1 27
Cover Page 2023-07-12 1 59
Electronic Grant Certificate 2023-08-01 1 2,527