Patent 2757771 Summary

Third-party information liability

Some of the information on this Web page has been provided by external sources. The Government of Canada is not responsible for the accuracy, reliability or currency of the information supplied by external sources. Users wishing to rely upon this information should consult directly with the source of the information. Content provided by external sources is not subject to official languages, privacy and accessibility requirements.

Claims and Abstract availability

Any discrepancies in the text and image of the Claims and Abstract are due to differing posting times. Text of the Claims and Abstract are posted:

  • At the time the application is open to public inspection;
  • At the time of issue of the patent (grant).
(12) Patent Application: (11) CA 2757771
(54) English Title: SIMILARITY-BASED FEATURE SET SUPPLEMENTATION FOR CLASSIFICATION
(54) French Title: AUGMENTATION D'UN ENSEMBLE DE CARACTERISTIQUES BASEES SUR LA SIMILARITE POUR UNE CLASSIFICATION
Status: Dead
Bibliographic Data
(51) International Patent Classification (IPC):
  • G06F 17/30 (2006.01)
(72) Inventors :
  • HE, YU (United States of America)
  • STOUTAMIRE, DAVID PETRIE (United States of America)
(73) Owners :
  • GOOGLE INC. (United States of America)
(71) Applicants :
  • GOOGLE INC. (United States of America)
(74) Agent: SMART & BIGGAR
(74) Associate agent:
(45) Issued:
(86) PCT Filing Date: 2010-03-17
(87) Open to Public Inspection: 2010-10-14
Availability of licence: N/A
(25) Language of filing: English

Patent Cooperation Treaty (PCT): Yes
(86) PCT Filing Number: PCT/US2010/027709
(87) International Publication Number: WO2010/117581
(85) National Entry: 2011-10-05

(30) Application Priority Data:
Application No. Country/Territory Date
61/167,825 United States of America 2009-04-08

Abstracts

English Abstract




A set of neighbor items associated with a first item is identified based, in part, on a first feature set associated with the first item, wherein each neighbor item of the set of neighbor items is associated with a feature set. A supplemented feature set is generated for the first item based on the identified set of neighbor items responsive to combining the first feature set and the feature sets associated with the set of neighbor items. A set of classification scores associated with the first item is generated based on the supplemented feature set, each classification score of the set of classification scores indicating a likelihood that the first item belongs to a class of items.


French Abstract

Un ensemble d'éléments voisins associés à un premier élément est identifié sur la base, en partie, d'un premier ensemble de caractéristiques associé au premier élément, chaque élément voisin de l'ensemble d'éléments voisins est associé à un ensemble de caractéristiques. Un ensemble de caractéristiques augmenté est généré pour le premier élément sur la base de l'ensemble identifié d'éléments voisins en réponse à la combinaison du premier ensemble de caractéristiques et des ensembles de caractéristiques associés à l'ensemble d'éléments voisins. Un ensemble de résultats de classification associé au premier élément est généré sur la base de l'ensemble de caractéristiques augmenté, chaque résultat de classification de l'ensemble de résultats de classification indiquant une probabilité que le premier élément appartienne à une classe des éléments.

Claims

Note: Claims are shown in the official language in which they were submitted.




CLAIMS

1. A computer-implemented method of generating a set of class labels for labeling an item, the method comprising:
identifying a set of neighbor items associated with a first item based, in part, on a first feature set associated with the first item, wherein each neighbor item of the set of neighbor items is associated with a feature set;
generating a supplemented feature set for the first item based on the identified set of neighbor items responsive to combining the first feature set and the feature sets associated with the set of neighbor items; and
generating a set of classification scores associated with the first item based on the supplemented feature set, each classification score of the set of classification scores indicating a likelihood that the first item belongs to a class of items.


2. The method of claim 1, wherein the first item is a media content item and the neighbor items are media content items.


3. The method of claim 2, wherein the first feature set and the feature sets associated with the neighbor items include viewing statistics derived from providing the media content item to viewers.


4. The method of claim 2, wherein the first feature set and the feature sets associated with the neighbor items include information describing the media content item specified by a user of a media host system.


5. The method of claim 2, wherein the first feature set and the feature sets associated with the neighbor items include information generated from the media content items.


6. The method of claim 1, wherein identifying the set of neighbor items associated with a first item comprises:
determining a set of distance metrics based on the first feature set and the feature sets associated with the set of neighbor items; and
identifying the set of neighbor items based, in part, on the set of distance metrics.




7. The method of claim 6, wherein determining a set of distance metrics based on the first feature set and the feature sets associated with the set of neighbor items comprises:
identifying a set of items associated with feature sets;
identifying at least a first undesirable item of the set of items based on the feature sets associated with the set of items, wherein the feature set associated with the at least a first undesirable item comprises a feature that is specified by an administrator to indicate that the item is an undesirable item;
generating a filtered set of items responsive to removing the at least a first undesirable item from the set of items; and
determining a set of distance metrics based on the filtered set of items.


8. The method of claim 6, wherein identifying the set of neighbor items based, in part, on the set of distance metrics comprises:
generating a similarity graph based on the set of distance metrics, wherein the similarity graph is comprised of nodes representing the items; and
identifying the set of neighbor items based on the similarity graph.


9. The method of claim 8, wherein the item is a media content item, the neighbor items are media content items, and identifying the set of neighbor items based, in part, on the similarity graph comprises:
identifying at least a first node in the similarity graph representing a media content item associated with a feature set indicating that the media content item is associated with one or more viewing statistics beneath a threshold value;
generating a pruned similarity graph responsive to removing the at least a first node; and
identifying the set of neighbor items based on the pruned similarity graph.


10. The method of claim 1, wherein combining the first feature set with the feature sets associated with the neighbor items comprises:
aggregating the first feature set with the feature sets associated with the neighbor items.


11. The method of claim 1, further comprising:
generating a first set of class labels associated with the first item responsive to one or more classification scores of the set of classification scores exceeding a threshold value.


12. The method of claim 11, further comprising:
identifying one or more sets of class labels associated with the set of neighbor items, wherein each neighbor item is associated with a set of class labels;
generating a plurality of class consensus scores based on the first set of class labels associated with the first item and the one or more sets of class labels associated with the set of neighbor items, wherein each class consensus score indicates a correspondence between a class label associated with the first item and the neighbor items;
generating a refined set of class labels associated with the first item responsive to removing at least one class label from the set of class labels associated with the first item based on a class consensus score associated with the at least one class label; and
storing the set of refined class labels.



Description

Note: Descriptions are shown in the official language in which they were submitted.



CA 02757771 2011-10-05
WO 2010/117581 PCT/US2010/027709
SIMILARITY-BASED FEATURE SET SUPPLEMENTATION FOR CLASSIFICATION
BACKGROUND
Field of the Invention

[0001] The invention relates generally to classifying items that are associated with sparse or unknown data. In particular, embodiments of the invention are directed toward classifying media content items associated with sparse or unknown feature datasets using feature datasets associated with related media content items.

Description of Background Art

[0002] Media hosting services allow users to upload media content where it can be shared with others for public viewing. Media content provided by the users may include, for example, textual content (e.g. blogs), video content, audio content and image content. Media hosting services can host millions of media content items. Typically, users uploading content provide labels or tags to describe the media content by associating the media content with one or more categories. Other users may browse or search for media content by providing keywords to search the information describing the media content such as the title, a summary of the media content, as well as the labels and tags. However, information provided by the users to describe the media content is often sparse, inconsistent and/or inaccurate. In particular, user-provided labels are often inconsistent as they are provided by different users and are subject to user opinion on what the media content is about. For instance, one user may provide a label indicating that a news video discussing the rising cost of gasoline relates to the "Environment," while another user may provide a label indicating that the same news video relates to "Politics."

[0003] The use of statistical classification techniques provides one method of standardizing assignment of labels indicating classes. In statistical classification techniques, a statistical model or "classifier" is computationally generated. The classifier specifies a set of features and their associated relevance in determining whether an item belongs to a class of items. This classifier is applied to feature data associated with an item to determine whether the item has a correspondence to the class of items. Although statistical classification provides an efficient and standardized method of assigning class labels to items, this technique is most effective in instances when the items are associated with large amounts of accurate feature data. As described above, user-provided media content is often associated with sparse or inconsistent feature data. As a result, existing statistical classification methods do not provide an effective means for classifying content based on its labels and descriptive information.

SUMMARY OF THE INVENTION

[0004] Embodiments of the present invention enable the generation of a set of class labels for labeling a media content item.

[0005] An embodiment of a method according to the present invention comprises a computer-implemented method for generating a set of class labels for labeling an item. A set of neighbor items associated with a first item is identified based, in part, on a first feature set associated with the first item, wherein each neighbor item of the set of neighbor items is associated with a feature set. A supplemented feature set is generated for the first item based on the identified set of neighbor items responsive to combining the first feature set and the feature sets associated with the set of neighbor items. A set of classification scores associated with the first item is generated based on the supplemented feature set, each classification score of the set of classification scores indicating a likelihood that the first item belongs to a class of items.
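
The method summarized above can be read as a three-stage pipeline: find neighbors, supplement the sparse feature set, then classify on the supplemented set. A minimal Python sketch follows; all names (`Item`, `find_neighbors`, the shared-feature dot product, the linear scorer) are illustrative assumptions, not the patent's actual implementation.

```python
# Illustrative sketch only: the similarity measure and classifier here are
# assumptions standing in for whatever the embodiment actually uses.
from dataclasses import dataclass

@dataclass
class Item:
    features: dict  # sparse feature set: feature name -> numeric value

def find_neighbors(target, corpus, k=5):
    """Rank corpus items by a simple shared-feature dot product."""
    def similarity(other):
        shared = set(target.features) & set(other.features)
        return sum(target.features[f] * other.features[f] for f in shared)
    return sorted(corpus, key=similarity, reverse=True)[:k]

def supplement(target, neighbors):
    """Combine the target's feature set with its neighbors' feature sets."""
    combined = dict(target.features)
    for n in neighbors:
        for f, v in n.features.items():
            combined[f] = combined.get(f, 0.0) + v
    return combined

def classify(features, class_weights):
    """Score each class as a dot product of features and class weights."""
    return {c: sum(features.get(f, 0.0) * w for f, w in ws.items())
            for c, ws in class_weights.items()}
```

The essential point is that `classify` operates on the supplemented feature set rather than the original sparse one; any similarity measure or classifier could be substituted.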

[0006] The features and advantages described in this summary and the following detailed description are not all-inclusive. Many additional features and advantages will be apparent to one of ordinary skill in the art in view of the drawings, specification, and claims hereof.

BRIEF DESCRIPTION OF THE DRAWINGS

[0007] Fig. 1 is a high-level block diagram of a system environment according to one embodiment.

[0008] Fig. 2 is a screenshot illustrating an interface for browsing media content associated with categories according to one embodiment.

[0009] Fig. 3 is a high-level block diagram illustrating a detailed view of the media host service according to one embodiment.

[0010] Fig. 4 is a flow-chart illustrating steps performed by the media host server to generate a similarity graph according to one embodiment.

[0011] Fig. 5 is a flow-chart illustrating steps performed by the media host server to generate a set of class labels for a media content item according to one embodiment.

[0012] Fig. 6 is a flow-chart illustrating steps performed by the media host server to refine a set of class labels associated with a media content item according to one embodiment.

[0013] The figures depict preferred embodiments of the present invention for purposes of illustration only. One skilled in the art will readily recognize from the following discussion that alternative embodiments of the structures and methods illustrated herein may be employed without departing from the principles of the invention described herein.

DETAILED DESCRIPTION

[0014] FIG. 1 illustrates a system environment 100 comprising a media host service 104, a plurality of content providers 102 and a plurality of content viewers 106 connected by a network 114. Only three content viewers 106 are shown in FIG. 1 in order to simplify and clarify the description. Embodiments of the system environment 100 can have thousands or millions of content viewers 106 and/or content providers 102 connected to the network 114. The media host service 104 communicates with content viewers 106 over the network 114. The media host service 104 receives uploaded media content from content providers 102 and allows content to be viewed by content viewers 106. Media content may be uploaded to the media host service 104 via the Internet from a personal computer, through a cellular network from a telephone or PDA, or by other means for transferring data over the network 114. Media content may be downloaded from the media host service 104 in a similar manner; in one embodiment media content is provided as a file download to a content viewer 106; in an alternative embodiment, media content is streamed to the content viewer. The means by which media content is received by the media host service 104 need not match the means by which it is delivered to a content viewer 106. For example, the content provider 102 may upload a video via a browser on a personal computer, whereas the content viewer 106 may view that video as a stream sent to a PDA. Note also that the media host service 104 may itself serve as the content provider 102.

[0015] A content provider 102 can also provide media content to the media host service 104. Examples of media content include audio, video, image and text content; other forms of content available for consumption may also be provided. The media content may have been created by the content provider 102, but need not have been.

[0016] Content viewers 106 view media content provided by the media host service 104 via a user interface. Typically, a content viewer 106 runs a web browser such as Microsoft Internet Explorer or Mozilla Firefox. A media host service 104 includes a web server such as Microsoft Internet Information Services. Using a browser, a content viewer 106 browses and searches for content provided by the media host service 104 and views content of interest, including video content. In some embodiments, the content viewer 106 uses other types of software applications to view, browse and search media content from the media host service 104. As described further below, the content viewer 106 also provides viewing metrics to the media host service 104.

[0017] The media host service 104 further functions to generate class labels for media content items. For a given media content item (a "target item"), the media host service 104 identifies other media content items, herein referred to as "neighbor media content items," that are similar to the target item, based on feature data associated with the target and neighbor media content items. Feature data associated with the media content items may include: user-provided data associated with the media content items, viewing information associated with the media content items, and content data generated from the media content items. The media host service 104 generates a supplemented feature set for the target item by combining the feature data associated with the target media content item and the feature data associated with the neighbor media content items. The media host service 104 classifies the target media content item based on the supplemented feature set to generate a set of class labels for the target media content item. The media host service 104 further refines the set of class labels for the target media content item based on class labels associated with the neighbor media content items. By identifying neighbor media content items and generating supplemented feature sets therefrom, the media host service 104 leverages existing similarities in media content items to supplement feature data sets that are sparse, inconsistent and/or uncertain using feature data associated with neighbor media content items. This improves the classification of media content items that would not otherwise have a large enough feature set for classification.
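
The refinement step mentioned above (keeping only class labels that neighbor items tend to share) might look like the sketch below. The consensus formula and the threshold are illustrative assumptions; the document itself only requires that labels with low consensus scores be removed.

```python
# Hedged sketch of label refinement: a label survives only if at least
# min_consensus of the neighbors also carry it. Threshold is an assumption.
def refine_labels(target_labels, neighbor_labels, min_consensus=0.3):
    """Keep a target label only if enough neighbor label sets contain it."""
    if not neighbor_labels:
        return set(target_labels)
    refined = set()
    for label in target_labels:
        consensus = sum(label in labels for labels in neighbor_labels) / len(neighbor_labels)
        if consensus >= min_consensus:
            refined.add(label)
    return refined
```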

[0018] The media host service 104 further leverages feature data from multiple independent sources to identify neighbor media content items and classify media content items. By integrating data from different sources, the media host service 104 compensates for sparseness and/or uncertainty associated with one source of feature data using feature data from an independent source of data. For example, sparseness associated with user-provided data (e.g. a lack of subject, title, or tags in the user-provided data) may be compensated for based on viewing information such as viewing statistics (e.g. the number of times media content items are requested by the same user) to identify similarity between two media content items. Likewise, uncertainty in similarity between two media content items based on viewing statistics (i.e. uncertainty that videos that are watched by the same viewers have the same subject or category) may be compensated for using content data generated for the video (e.g. data generated from face recognition that indicates the same individual is shown in two videos provided from separate sources).

[0019] Combining feature data associated with a target media content item and neighbor media content items to generate a supplemented data set provides a pre-processing step in classification. The addition of this step allows for "fine tuning" of the classifier accuracy by the manipulation of the different ways that feature data can be combined. An administrator of the media host service 104 may optimize classifier accuracy by choosing different algorithms and weighting techniques to combine feature data based on the types of features, the accuracy of the classifier, the degree of sparseness in the feature data sets, the degree of inaccuracy in the feature data sets and other factors.
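
As a hedged illustration of the tunable combination step just described, the sketch below blends a target's features with the mean of its neighbors' features under a single administrator-chosen weight. Real deployments could use per-source or per-feature weights; the single `neighbor_weight` knob and the blending formula are assumptions.

```python
# Illustrative weighting scheme: neighbor contributions are averaged and
# scaled by neighbor_weight before being added to the target's own values.
def combine_weighted(target, neighbor_sets, neighbor_weight=0.5):
    """Blend the target's features with the mean of its neighbors' features."""
    combined = dict(target)
    if not neighbor_sets:
        return combined
    for feats in neighbor_sets:
        for f, v in feats.items():
            combined[f] = combined.get(f, 0.0) + neighbor_weight * v / len(neighbor_sets)
    return combined
```

Lowering `neighbor_weight` trusts the (possibly sparse) target data more; raising it leans on the neighbors, which is the trade-off the paragraph above leaves to the administrator.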

[0020] FIG. 2 illustrates a screenshot of a graphical user interface 200 for browsing media content items provided by the media host service 104 according to an embodiment. In the embodiment illustrated, the media content is video content. In other embodiments, the media host service 104 may provide a graphical user interface for browsing other types of media content including songs, images, and text content.

[0021] The graphical user interface 200 includes a display window 215 for displaying a video and an information window 240 for displaying information describing the video. The information window 240 displays a set of class labels 244 that describe a set of categories or classes the video is associated with. Using the set of class labels, a user may browse other videos that belong to the same categories or classes. For instance, a user may select the class label 'debate' to retrieve a set of videos associated with the label 'debate'.

[0022] The graphical user interface 200 further includes a related videos window 230 containing a set of videos that are related to the video. In the embodiment illustrated, the related videos are based in part on the class labels associated with the related videos. The related videos are displayed with associated class labels 235 that overlap with the set of class labels 244 associated with the displayed video.

[0023] FIG. 3 is a high-level block diagram illustrating a detailed view of the media host service 104 according to one embodiment. As shown in FIG. 3, the media host service 104 includes several modules and servers. Those of skill in the art will recognize that other embodiments can have different modules and/or servers than the ones described here, and that the functionalities can be distributed among the modules and/or servers in a different manner. In addition, the functions ascribed to the media host service 104 can be performed by multiple servers.

[0024] In alternate embodiments, the media upload server 306, the media content database 330 and/or the feature set database 350 may be hosted at one or more separate servers by different entities, with the media host service 104 acting as a third-party server to generate class labels for media received by the media upload server 306 and stored in the media content database 330.

[0025] The media upload server 306 receives media content uploaded by the content providers 102. The media upload server 306 stores uploaded media content in the media content database 330. The media upload server 306 further receives information derived from providing the media content to the content viewers 106, such as ratings associated with the media content and uploaded comments about the media content.

[0026] The media content database 330 stores received media content in association with unique identifiers for that media content. The media content database 330 further stores user-provided information describing the media content such as an author of the media content, the date the media content was received by the media host service 104, the subject of the media content, tags or labels associated with the media content and comments provided by an author of the media content. The media content database 330 further stores viewing information derived from providing the media content to content viewers 106 such as ratings of the media content provided by users, comments provided by the users, and the frequency at which the media content is viewed by the users. The media content database 330 also stores viewing information that is specific to a media content item such as a set of media content items that are commonly viewed in association with the media content item.

[0027] The media content server 310 provides information and media content to the users. The media content server 310 retrieves media content from the media content database 330. The media content server 310 provides the retrieved media content to the content viewers 106. The media content server 310 further functions to retrieve and provide information and media content responsive to search queries received from the content viewers 106. The search queries may include criteria including search terms, class labels, etc. The media content server 310 further retrieves items of related media content based in part on the class labels associated with a selected media content item and provides the related media content items to the content viewer 106. The media content server 310 further monitors viewing statistics and other viewing information associated with the media content, such as the frequency at which the media content is viewed, and stores the viewing information to the media content database 330.

[0028] The content feature engine 312 generates content features based on the media content. Content features are metadata generated from the media content that can be used to characterize the media content. The content feature engine 312 generates content features specific to the media type of the media content. For still image content, content features may include: pixel intensity, luminosity, data derived from shape detection algorithms and other data derived from still images. For audio content, content features may include: pitch, tone, mel-frequency cepstral coefficients (MFCCs), and other data derived from audio content. For video content, content features may include data derived from shot detection algorithms, face detection algorithms, edge detection algorithms, and other data derived from video content, such as color, luminosity, texture and other features. The content feature engine 312 stores the generated content features in the feature set database 350.

[0029] The text feature engine 308 generates text features based on the user-provided information describing the media content. The text feature engine 308 generates text features which comprise one or more tokens and a numeric value associated with the token, such as a frequency value. In one embodiment, the text feature engine 308 generates the text features by tokenizing the user-provided information and determining the frequency of the tokens contained therein. According to the embodiment, the text feature engine 308 may also stem the tokens or use lexicons to identify synonymous tokens prior to enumerating the frequency of the tokens. In some embodiments, the text feature engine 308 generates text features comprised of phrases such as noun phrases or verb phrases. The frequency information for the tokens can be raw frequency information, or normalized, such as TF-IDF or similar frequency measures.
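
A minimal sketch of the tokenize-and-count behavior described above, with a simple TF-IDF normalization. The regex tokenizer and the add-one smoothing in the IDF term are illustrative assumptions; stemming and lexicon lookups are omitted.

```python
# Hedged sketch: raw term frequencies plus a smoothed TF-IDF weighting.
import math
import re
from collections import Counter

def text_features(text):
    """Tokenize user-provided text and count raw token frequencies."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return dict(Counter(tokens))

def tf_idf(tf, doc_freq, n_docs):
    """Normalize raw counts with TF-IDF (add-one smoothing on doc_freq)."""
    return {t: f * math.log(n_docs / (1 + doc_freq.get(t, 0)))
            for t, f in tf.items()}
```

Tokens appearing in nearly every document (e.g. "video") are driven toward zero weight, which matches the motivation for normalized frequency measures mentioned above.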

[0030] In most embodiments, the text feature engine 308 generates the text features based on information describing the media content such as the title and summary associated with the media content. In other embodiments, the text feature engine 308 generates the text features from comments associated with the media content (e.g., as provided by users who view the media content item) and/or other sources of textual data referenced by the information describing the media content (e.g. web pages referenced in the summary associated with the media content). The text feature engine 308 further generates text features from video or image content using techniques such as speech recognition as applied to an audio track of a media content item, and optical character recognition (OCR) as applied to the images contained in a media content item.

[0031] The feature set database 350 stores feature sets for media content items in association with unique identifiers for the media content items. The feature sets include the text features generated by the text feature engine 308 and the content features generated by the content feature engine 312. The feature sets further include viewing statistics and other viewing information stored in the media content database 330, such as the frequency a media content item is viewed by users and a set of frequencies specifying the number of times other media items are viewed in association with the media content item, herein referred to as "co-watch metrics." These frequencies can be raw or normalized, as determined by the system administrator.
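
Co-watch metrics as described above could be computed from viewing sessions roughly as follows. The session-list data layout (one list of item IDs per viewing session) is an assumption; the document only specifies that the metric counts how often items are viewed in association with one another.

```python
# Hedged sketch: count co-occurrences of item pairs within viewing sessions.
from collections import Counter
from itertools import combinations

def co_watch_metrics(sessions):
    """Count how often each pair of items is viewed in the same session."""
    pairs = Counter()
    for session in sessions:
        # Deduplicate and sort so (a, b) and (b, a) map to the same key.
        for a, b in combinations(sorted(set(session)), 2):
            pairs[(a, b)] += 1
    return pairs
```

The raw counts could then be normalized (e.g. by each item's total view count) as the paragraph above leaves to the administrator.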

[0032] The similarity graph module 309 identifies neighbor media content items for media content items based on the feature sets. The similarity graph module 309 first generates a set of distance metrics which specify a measure of similarity between two media content items. Based on the distance metrics, the similarity graph module 309 identifies neighbor media content items.

[0033] The similarity graph module 309 generates a set of distance metrics based on the feature sets stored in association with the media content items in the feature set database 350. For each pair of feature sets in the feature set database 350 associated with a respective first and second content item, the similarity graph module 309 generates a distance metric which indicates the similarity between the pair of feature sets for the items. In one embodiment, the distance metric can be a Euclidean distance metric generated based on corresponding features in the two feature sets. In other embodiments, the distance metric may be a correlation coefficient between the corresponding features. The similarity graph module 309 may generate the distance metrics based on all of the features in the feature sets or a sub-portion of the features in the feature sets. In one embodiment, the similarity graph module 309 may generate the distance metric based on a specific type of feature in the feature sets. For instance, the similarity graph module 309 may generate the distance metric based only on viewing information such as co-watch metrics. The similarity graph module 309 stores the distance metrics in association with the feature sets and media content items in the media content database 330.
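
The two distance metrics named above (a Euclidean distance and a correlation coefficient over corresponding features) can be sketched over sparse feature dictionaries, treating absent features as zero. The dictionary representation is an assumption.

```python
# Hedged sketch of the two metrics mentioned in the embodiment.
import math

def euclidean_distance(a, b):
    """Euclidean distance over the union of features (missing features = 0)."""
    keys = set(a) | set(b)
    return math.sqrt(sum((a.get(k, 0.0) - b.get(k, 0.0)) ** 2 for k in keys))

def pearson_correlation(a, b):
    """Correlation coefficient between corresponding feature values."""
    keys = sorted(set(a) | set(b))
    xs = [a.get(k, 0.0) for k in keys]
    ys = [b.get(k, 0.0) for k in keys]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = math.sqrt(sum((x - mx) ** 2 for x in xs)
                    * sum((y - my) ** 2 for y in ys))
    return num / den if den else 0.0
```

Note the two run in opposite directions: smaller Euclidean distance means more similar, while a correlation coefficient nearer 1 means more similar.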

[0034] In some embodiments, the similarity graph module 309 filters the media content items in the feature set database 350, for example by removing the items from the feature set database 350 or flagging the media content items in the feature set database 350, prior to generating the set of distance metrics. In these embodiments, the similarity graph module 309 filters the media content items according to a set of specified features that indicate a media content item is to be filtered. In most embodiments, the set of specified features are features that indicate that the media content item is an item of undesirable content. In these embodiments, features that indicate that the media content item is an item of undesirable content are specified by an administrator of the media host service 104 and can include features that indicate that the media content item comprises spam, adult content, or hate speech.
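
The filtering step might be sketched as follows. The flag names come from the examples above (spam, adult content, hate speech), but the data layout (item ID mapped to a set of feature names) is an assumption.

```python
# Hedged sketch: drop items carrying any administrator-specified flag
# before distance metrics are computed.
UNDESIRABLE_FLAGS = {"spam", "adult", "hate_speech"}

def filter_items(items, flags=frozenset(UNDESIRABLE_FLAGS)):
    """Return only the items whose feature set contains no undesirable flag."""
    return {item_id: feats for item_id, feats in items.items()
            if not (set(feats) & flags)}
```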

[0035] The similarity graph module 309 identifies neighbor media content items based on the distance metrics associated with the media content items. For each target item, the similarity graph module 309 selects a set of neighbor items based on the distance metrics which have some measure of similarity to the target item. Suitable methods for selecting a set of neighbor media content items may include clustering the distance metrics.

[0036] In one embodiment, the similarity graph module 309 selects a set of
neighbor
media content items by generating a similarity graph of the content items
based on the
distance metrics. The similarity graph module 309 generates a similarity graph
containing a
set of nodes, each of which represents a media content item in the feature set
database 350.
The similarity graph module 309 selects a node representing a media content
item as the target node. The similarity graph module 309 attempts to assign
some number N of edges in the graph to connect the target node to N (e.g.,
3<=N<=10) nodes representing identified media content items with distance
metrics indicating at least a minimum degree of similarity to the media
content item represented by the target node. For example, in one specific
embodiment, the similarity graph module 309 connects the target node to the
five (5) most similar media content items, based on their respective distance
metrics. The similarity graph module 309 repeats this process by selecting
each node in turn as the target node and assigning edges between the target
node and a set of nodes.

[0037] If the similarity graph module 309 is unable to identify N media content items
with
distance metrics indicating the minimum degree of similarity to the media
content item
represented by the target node, the similarity graph module 309 connects the
target node to
the maximum number of media items with distance metrics indicating at least a
minimum
degree of similarity to the media content item represented by the target node.
If the
similarity graph module is unable to find any media content items with
distance metrics
indicating the minimum degree of similarity to the media content item
represented by the
target node, the similarity graph module 309 connects the target node
representing the
media content item to the media item with the distance metric indicating the
maximum similarity of all the distance metrics associated with the target node.
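The graph construction of paragraphs [0036] and [0037], including the fallback to the single most similar item, can be sketched as follows. This is a simplified reading under stated assumptions: distances are given as a nested dict, and the `max_distance` cutoff is a hypothetical stand-in for the "minimum degree of similarity".

```python
def build_similarity_graph(distances, n=5, max_distance=0.8):
    """For each target item, connect up to n edges to the most similar
    items whose distance is within max_distance. If no item qualifies,
    fall back to the single most similar item, as in [0037]."""
    graph = {}
    for target, metrics in distances.items():
        # Rank candidate neighbors by distance, closest first.
        ranked = sorted(metrics.items(), key=lambda kv: kv[1])
        close = [item for item, d in ranked if d <= max_distance]
        if close:
            graph[target] = close[:n]        # up to n qualifying neighbors
        elif ranked:
            graph[target] = [ranked[0][0]]   # fallback: most similar item
        else:
            graph[target] = []
    return graph
```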

[0038] In some embodiments, the similarity graph module 309 prunes the
similarity
graph after the similarity graph is constructed using one or more pruning
criteria. In these
embodiments, the similarity graph module 309 may remove media content items
based on
viewing statistics that indicate which media content items in the graph are
not actively viewed by content viewers 106. In these embodiments, the viewing
statistics that indicate that a media content item is not actively viewed by
content viewers 106 are



specified by an administrator of the media host service 104 and may include
statistics such as the views, ratings, or comments associated with the media
content items; these
statistics can
include raw or normalized counts (e.g. number of views of the media content
item), rates
(e.g., weekly rate of comment postings), trends (e.g., average weekly percent
change in
number of views), velocities (e.g., number of unique viewers in the last hour), or
distributions
(e.g., number or percentage of users giving each level of rating value), or
the like. In
addition, items can be pruned based on their co-watch metrics. For a given
target item, the neighbor content items with the least significant (e.g.,
lowest valued) co-watch metrics can be pruned. The above pruning criteria are
applied to the neighbor content items for each target node until all the
nodes have been examined according to the criteria. These pruning
criteria can be applied in any order desired by the system administrator.
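The pruning of paragraph [0038] might be sketched as below, combining a view-count criterion with a co-watch criterion. The `min_views` threshold and the `keep` count are hypothetical administrator-specified values, not taken from the application.

```python
def prune_graph(graph, views, cowatch, min_views=100, keep=5):
    """Prune each target node's neighbor list: first drop neighbors
    that are not actively viewed (below min_views total views), then
    keep only the `keep` neighbors with the most significant
    (highest-valued) co-watch metrics."""
    pruned = {}
    for target, neighbors in graph.items():
        active = [n for n in neighbors if views.get(n, 0) >= min_views]
        # Most significant co-watch metrics first; missing metrics count as 0.
        active.sort(key=lambda n: cowatch.get((target, n), 0), reverse=True)
        pruned[target] = active[:keep]
    return pruned
```

As the document notes, the two criteria could equally be applied in the other order; this sketch fixes one order for concreteness.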

[0039] The similarity graph module 309 identifies a set of neighbor media
content items
for each media content item based on the similarity graph. In most
embodiments, the
similarity graph module 309 identifies a set of neighbor media content items
comprising a
specified M number of media content items (e.g. 3<=M<=10 media content items).
In most
instances, the M number of media content items is equal to the N number of
edges
connecting each target node to items of media content. If a target node is
connected to M or
greater nodes, the similarity graph module 309 selects the media content items
with distance
metrics indicating the highest similarity to the target media item as the set
of neighbor
media items.

[0040] The similarity graph module 309 traverses the similarity graph to
identify neighbor media content items. The similarity graph module 309
traverses the similarity graph by selecting a neighbor node representing a
neighbor media content item with a distance metric indicating the highest
similarity to the target media content item. The similarity graph module 309
then selects a node connected to the neighbor node with a distance metric
that indicates the highest similarity to the neighbor node as a neighbor
media content item. The similarity graph module 309 continues this process
until the number of neighbor media content items equals the specified number
of neighbor media content items.
[0041] In summary, the foregoing processes of identifying neighbor content
items,
generating a similarity graph, and pruning the similarity graph provide a
robust set of
content items that have a very high likelihood of being meaningfully related
to each other, based

both on their intrinsic features as well as the extrinsic behaviors of the
users who view the
items. Using the pruning criteria to prune the similarity graph leverages
behavioral
information from the user community as to which content items are of
sufficient interest and
are sufficiently related to each other.

[0042] The data aggregation module 314 generates supplemented feature sets for
each
media content item based on the set of neighbor media content items associated
with the
media content item. The data aggregation module 314 combines the feature sets
to generate
a supplemented feature set. The data aggregation module 314 generates the
supplemented
feature sets based on all (or a portion of) the data in the feature sets of
all (or a selected
subset of) the neighbor items. In one embodiment, the supplemented feature
sets are
generated based only on text features associated with the media content items.

[0043] The data aggregation module 314 combines the feature sets associated
with the
target media content item and neighbor media content items to generate a
supplemented
feature set. In one embodiment, the data aggregation module 314 combines the
feature sets
by simply merging the feature sets to generate a combined, unordered, and un-
weighted
feature set containing all of the features of all of the neighbor items.
Alternatively, the data
aggregation module 314 merges features which occur in both data sets by
adding,
averaging, or otherwise mathematically combining the values associated with
the features,
such that for each type of feature, there is a single value (or set of values)
as appropriate for
the feature. For example, for a color feature, the data aggregation module 314
may produce
an average color histogram from the color histograms of the neighbor content
items (hence a
set of frequency counts for color bins), whereas for a ratings feature, the
data aggregation
module 314 may produce a single average rating from the neighbor content
items. In other
embodiments, the data aggregation module 314 combines the feature sets
associated with
the neighbor media content items by weighting the feature sets based on the
similarity
values of the neighbor media content items.
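One of the combination options in paragraph [0043], averaging the values of features that occur in more than one feature set, can be sketched as follows; the flat `{feature: value}` representation is an assumption for illustration.

```python
def combine_feature_sets(feature_sets):
    """Merge a list of {feature: numeric value} dicts by averaging the
    values of each feature across the sets in which it occurs, so each
    feature ends up with a single combined value."""
    totals, counts = {}, {}
    for feats in feature_sets:
        for name, value in feats.items():
            totals[name] = totals.get(name, 0.0) + value
            counts[name] = counts.get(name, 0) + 1
    return {name: totals[name] / counts[name] for name in totals}
```

For a similarity-weighted variant, the values could instead be multiplied by each neighbor's similarity before summing, matching the alternative described above.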

[0044] In some embodiments, the data aggregation module 314 uses consensus
methods
to identify features in the feature sets associated with the neighbor media
content items to
add to the supplemented feature set associated with the target media content
item. In these
embodiments, the data aggregation module 314 identifies features which have a
narrow range of values in a majority or specified percentage of the feature
sets associated with the
neighbor media

content items. For example, the data aggregation module 314 can identify that
a feature
'average volume' has a narrow range of values (e.g. each value is 9 or 10 on a
scale of 1-10)
in more than 80% of the feature sets associated with neighbor items of media
content. The
data aggregation module 314 may determine to add an average value of the
feature "average volume" (i.e., 9.5 out of 10) to the supplemented feature set.
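The consensus test of paragraph [0044] might be sketched as below; the `max_range` and `min_fraction` parameters are hypothetical stand-ins for "narrow range" and "more than 80%".

```python
def consensus_feature(feature_sets, name, max_range=1.0, min_fraction=0.8):
    """Return the average value of feature `name` if it occurs in at
    least min_fraction of the neighbor feature sets and its values fall
    within a narrow range (at most max_range wide); otherwise None,
    meaning no consensus value is added to the supplemented set."""
    values = [fs[name] for fs in feature_sets if name in fs]
    if not feature_sets or len(values) / len(feature_sets) < min_fraction:
        return None
    if max(values) - min(values) > max_range:
        return None
    return sum(values) / len(values)
```

With the document's example (values of 9 or 10 in at least 80% of the neighbor sets), the function returns the average 9.5.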

[0045] The classification engine 312 classifies each media content item based
on the
supplemented feature set associated with the media content item. The
classification engine
312 applies one or more classifiers 322 to the supplemented features sets to
generate a set of
classification scores indicating the likelihood that the media content item
belongs to a class
or category of media content items. The classification engine 312 assigns a
set of one or more class labels to the media content item based on the
classification score indicating the likelihood that the media content item
belongs to a class or category of media content items exceeding a defined
threshold value. For example, a media content item may be assigned a label of
"soccer" responsive to a classification score that the media content item
belongs to the class "soccer" being greater than 90%. The classification
engine 312 stores the set of class
labels in association with the media content item in the classified media
corpus 380.
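The thresholded label assignment of paragraph [0045] reduces to a simple filter over the classifier's scores; the 90% threshold follows the document's "soccer" example, while the score representation is an assumption.

```python
def assign_labels(scores, threshold=0.9):
    """Assign every class label whose classification score exceeds the
    defined threshold value (the document's example uses 90%)."""
    return {label for label, score in scores.items() if score > threshold}
```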

[0046] Depending on the embodiment, the classifier 322 may be generated by the
classification engine 312 or received from another source. In one embodiment,
the classifier 322 is a single multi-class classifier which is trained on a
corpus of content
items which are
classified according to a hierarchical system of classification. In a specific
embodiment, the
single multi-class classifier is trained on a corpus of content items which
are classified
according to the hierarchical system of classifications used by the Open
Directory Project
(ODP). In this embodiment, the training set of media content items is manually
classified, so
that each training media content item has one or more labels from the ODP. The
training set
of media content items is then processed for their features and viewing
statistics and the
classifier 322 is constructed and validated using the training set of media
content items and a
corresponding validation set. In alternate embodiments, the classifiers 322
may be binary
classifiers and the classes may be non-hierarchical. In some embodiments, the
classifier 322
is re-trained using the classified media corpus 380.

[0047] In one embodiment, after the initial classification of a content item
by the
classifier 322, a second stage of classification is provided by the class
label engine 315. More
specifically, the class label engine 315 refines the set of class labels
associated with a target
media content item based on the class labels of its neighbor media content
items. The class
label engine 315 obtains the set of class labels associated with the target
media content item
in the classified media corpus 380. For each class label associated with the
media content
item, the class label engine 315 determines a class consensus value specifying
the number or
percentage of the neighbor media content items that are also associated with
the class label
in the classified media corpus 380. If the class consensus value is below a
threshold value,
the class label engine 315 removes the class label from the set of labels
associated with the
target media content item in the classified media corpus 380.

[0048] For example, for a class label "soccer" associated with a target media
content item,
the class label engine 315 can identify that the class label "soccer" is
associated with 5 out of
6 neighbor media content items of the target media content item in the
classified media
corpus 380. The class label engine 315 can then determine that the class
consensus value of,
for example 83%, associated with the label is greater than a threshold value
of 33% and
retain the class label "soccer" in association with the target media content
item. Conversely,
if the class label engine 315 identifies that 0 out of 6 of the neighbor media
items are
associated with a class label "religious", then the class label engine 315 can
determine that
the corresponding class consensus value of 0% is less than a threshold value
of 33% and
remove the class label "religious" from the set of class labels associated
with the target
media content item.
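The second-stage refinement of paragraphs [0047] and [0048] can be sketched as below; the 33% threshold mirrors the document's example, and the set-of-labels representation is an assumption.

```python
def refine_labels(target_labels, neighbor_labels, threshold=0.33):
    """Keep each class label of the target item only if its class
    consensus value -- the fraction of neighbor items also carrying
    the label -- is not below the threshold."""
    kept = set()
    for label in target_labels:
        consensus = sum(label in nl for nl in neighbor_labels) / len(neighbor_labels)
        if consensus >= threshold:
            kept.add(label)
    return kept
```

With the document's numbers, "soccer" (5 of 6 neighbors, 83%) is retained and "religious" (0 of 6, 0%) is removed.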

[0049] Depending on the embodiment, the threshold consensus value may be
specified,
for example by an administrator of the media host service 104, or determined
by the class
label engine 315. The class label engine 315 may determine the threshold value
based on
many factors. In some embodiments, the threshold value is dependent on a
level of specificity associated with the class label defined by the
hierarchical classification scheme.
For instance, the threshold value for a label which specifies "soccer" may be
a smaller value
than the threshold value for a label which specifies "sports". In some
embodiments, the
threshold value is dependent on the relative frequency with which a label
occurs in a corpus.
For instance, based on a corpus in which the frequency of the label "soccer"
is 5 times
greater than the frequency of the label "lawn bowling", the threshold value
for a label which
specifies "soccer" and the threshold value for a label which specifies "lawn
bowling" may be

proportional to their frequency in the corpus with the threshold value for
"soccer" being 5
times greater than the threshold value for "lawn bowling".
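The frequency-proportional thresholds of paragraph [0049] might be computed as below; the base label and base threshold are hypothetical inputs, with "lawn bowling" and "soccer" taken from the document's example.

```python
def frequency_thresholds(label_counts, base_label, base_threshold):
    """Scale each label's consensus threshold in proportion to its
    frequency in the corpus, relative to a chosen base label. With
    "soccer" five times as frequent as "lawn bowling", soccer's
    threshold comes out five times larger."""
    base = label_counts[base_label]
    return {label: base_threshold * count / base
            for label, count in label_counts.items()}
```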

[0050] FIG. 4 is a flowchart illustrating steps performed by the media host
service 104 to
identify a set of neighbor media content items for a media content item in
accordance with
an embodiment of the present invention. Other embodiments perform the
illustrated steps
in different orders, and/or perform different or additional steps. Moreover,
some of the
steps can be performed by engines or modules other than the media host service
104.

[0051] The media host service 104 identifies 404 a set of feature sets
associated with a set
of media items. The media host service 104 filters 406 the set of feature sets
associated with
the set of media items. The media host service 104 generates 408 a set of distance
metrics, each
distance metric specifying the similarity of the feature sets associated with
pairs of media
items. The media host service 104 generates 410 a similarity graph based on
the set of
distance metrics. The media host service 104 prunes 412 the similarity graph
to remove
media content items. The media host service 104 identifies 414 a set of
neighbor media
content items for each media content item in the similarity graph.

[0052] FIG. 5 is a flowchart illustrating steps performed by the media host
service 104 to
classify a media content item in accordance with an embodiment of the present
invention.
Other embodiments perform the illustrated steps in different orders, and/or
perform
different or additional steps. Moreover, some of the steps can be performed by
engines or
modules other than the media host service 104.

[0053] The media host service 104 identifies 512 a set of neighbor media
content items
for the target media content item. The media host service 104 generates 514 a
supplemented
feature set for the target media content item based on the sets of feature
data associated with
the set of neighbor media content items. The media host service 104 generates
516 a set of
class labels associated with the target media content item.

[0054] FIG. 6 is a flowchart illustrating steps performed by the media host
service 104 to
refine a set of class labels associated with a target node in accordance
with an
embodiment of the present invention. Other embodiments perform the illustrated
steps in
different orders, and/or perform different or additional steps. Moreover, some
of the steps
can be performed by engines or modules other than the media host service 104.



[0055] The media host service 104 identifies 610 a set of class labels
associated with a
target node. The media host service 104 determines 612 a class consensus value
for each
class label based on the class labels associated with the neighbor media
content items for the
target node. The media host service 104 removes 614 the class labels having a
class
consensus value beneath a threshold value from the set of class labels
associated with the
target node.

[0056] The present invention has been described in particular detail with
respect to a
limited number of embodiments. Those of skill in the art will appreciate that
the invention
may additionally be practiced in other embodiments. First, the particular
naming of the
components, capitalization of terms, the attributes, data structures, or any
other
programming or structural aspect is not mandatory or significant, and the
mechanisms that
implement the invention or its features may have different names, formats, or
protocols.
Further, the system may be implemented via a combination of hardware and
software, as
described, or entirely in hardware elements. Also, the particular division of
functionality
between the various system components described herein is merely exemplary,
and not
mandatory; functions performed by a single system component may instead be
performed
by multiple components, and functions performed by multiple components may
instead be performed by a single component. For example, the particular
functions of the media host service may be provided in one or more modules.

[0057] Some portions of the above description present the features of the
present
invention in terms of algorithms and symbolic representations of operations on
information.
These algorithmic descriptions and representations are the means used by those
skilled in
the art to most effectively convey the substance of their work to others
skilled in the art.
These operations, while described functionally or logically, are understood to
be
implemented by computer programs. Furthermore, it has also proven convenient
at times,
to refer to these arrangements of operations as modules or code devices,
without loss of
generality.

[0058] It should be borne in mind, however, that all of these and similar
terms are to be
associated with the appropriate physical quantities and are merely convenient
labels applied
to these quantities. Unless specifically stated otherwise as apparent from the
present

discussion, it is appreciated that throughout the description, discussions
utilizing terms such
as "processing" or "computing" or "calculating" or "determining" or
"displaying" or the
like, refer to the action and processes of a computer system, or similar
electronic computing
device, that manipulates and transforms data represented as physical
(electronic) quantities
within the computer system memories or registers or other such information
storage,
transmission or display devices.

[0059] Certain aspects of the present invention include process steps and
instructions
described herein in the form of an algorithm. All such process steps,
instructions or
algorithms are executed by computing devices that include some form of
processing unit
(e.g., a microprocessor, microcontroller, dedicated logic circuit, or the like)
as well as a
memory (RAM, ROM, or the like), and input/output devices as appropriate for
receiving or
providing data.

[0060] The present invention also relates to an apparatus for performing the
operations
herein. This apparatus may be specially constructed for the required purposes,
or it may
comprise a general-purpose computer selectively activated or reconfigured by a
computer
program stored in the computer, in which event the general-purpose computer is
structurally and functionally equivalent to a specific computer dedicated to
performing the
functions and operations described herein. A computer program that embodies
computer
executable data (e.g. program code and data) is stored in a tangible computer
readable
storage medium, such as, but not limited to, any type of disk including
floppy disks,
optical disks, CD-ROMs, magnetic-optical disks, read-only memories (ROMs),
random
access memories (RAMs), EPROMs, EEPROMs, magnetic or optical cards,
application
specific integrated circuits (ASICs), or any type of media suitable for
persistently storing
electronically coded instructions. It should be further noted that such
computer programs
by nature of their existence as data stored in a physical medium by
alterations of such
medium, such as alterations or variations in the physical structure and/or
properties (e.g.,
electrical, optical, mechanical, magnetic, chemical properties) of the medium,
are not
abstract ideas or concepts or representations per se, but instead are physical
artifacts
produced by physical processes that transform a physical medium from one state
to another
state (e.g., a change in the electrical charge, or a change in magnetic
polarity) in order to
persistently store the computer program in the medium. Furthermore, the
computers

referred to in the specification may include a single processor or may be
architectures
employing multiple processor designs for increased computing capability.

[0061] Finally, it should be noted that the language used in the specification
has been
principally selected for readability and instructional purposes, and may not
have been
selected to delineate or circumscribe the inventive subject matter.
Accordingly, the
disclosure of the present invention is intended to be illustrative, but not
limiting, of the
scope of the invention.


Administrative Status

Title Date
Forecasted Issue Date Unavailable
(86) PCT Filing Date 2010-03-17
(87) PCT Publication Date 2010-10-14
(85) National Entry 2011-10-05
Dead Application 2016-03-17

Abandonment History

Abandonment Date Reason Reinstatement Date
2015-03-17 FAILURE TO REQUEST EXAMINATION
2015-03-17 FAILURE TO PAY APPLICATION MAINTENANCE FEE

Payment History

Fee Type Anniversary Year Due Date Amount Paid Paid Date
Registration of a document - section 124 $100.00 2011-10-05
Application Fee $400.00 2011-10-05
Maintenance Fee - Application - New Act 2 2012-03-19 $100.00 2012-03-02
Maintenance Fee - Application - New Act 3 2013-03-18 $100.00 2013-03-05
Maintenance Fee - Application - New Act 4 2014-03-17 $100.00 2014-03-06
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
GOOGLE INC.
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents

Document Description    Date (yyyy-mm-dd)    Number of pages    Size of Image (KB)
Abstract 2011-10-05 1 60
Claims 2011-10-05 3 108
Drawings 2011-10-05 6 243
Description 2011-10-05 18 958
Representative Drawing 2011-10-05 1 6
Cover Page 2011-12-09 1 37
PCT 2011-10-05 6 289
Assignment 2011-10-05 7 231