Note: Descriptions are shown in the official language in which they were submitted.
CA 02668306 2009-06-08
METHOD AND SYSTEM FOR APPLYING METADATA TO DATA SETS OF
FILE OBJECTS
FIELD OF THE INVENTION
The present invention generally relates to methods and systems for
developing and assigning descriptive information relating to the contents of a
file (i.e., metadata).
BACKGROUND OF THE INVENTION
Metadata is broadly defined as "data about data", i.e. a label. Thus, a given
item of metadata may be used to describe an individual datum, or a content
item, and additionally a collection of data which can include a plurality of
content items.
The fundamental role of metadata is to facilitate or aid in the
understanding, use and management of data. The metadata required for efficient
data management is dependent on, and varies with, the type of data and the
context of use of this data. Using as an example a library, the data is the
content of the titles stocked, and the metadata about a title would typically
include a description of the content, and any other information relevant for
whatever purposes, for example the publication date, author, location in the
library, etc.
For photographic images, metadata typically labels the date the
photograph was taken, whether day or evening, the camera settings, and
information related to copyright control, such as the name of the
photographer, and the owner and date of copyright. In cases in which the data
is describing the content of computer files, metadata about an individual data
item could include, but is not limited to, the name of the field and its
length. Thus metadata about a collection of data items in a computer file
might typically include the name of the file, the type of file, etc.
Novice computer users have access to "giga computers" (gigabyte
storage, gigahertz power) which can overwhelm their ability to access their
stored data. Digital photos, video clips, and music files are easy and
inexpensive to create, but hard to identify programmatically. Users are having
to cope with
many types of digital assets beyond conventional searchable text: video clips,
web pages, music / audio files, word documents, spreadsheets, and various
vertical applications thereof (pop music vs. classical music, personal photo
library vs. professional photography, etc.).
Various vendors have attempted to integrate increasingly sophisticated
search technologies into the operating system or main operating interface of
desktop computers. Unfortunately, such search technologies lack the breadth of
knowledge of the cultural and emotional significance of subject matter that is
required to identify appropriate results.
To give an example, music evokes different responses in individuals, but
persons of specific cultural backgrounds may respond similarly to certain
pieces of music, due to cultural significance etc., while persons of other
cultural backgrounds will respond differently. What is needed is the ability
to create a signature that can uniquely identify any piece of digital music,
regardless of the encoding algorithm, and for persons from various cultures to
apply metadata with consideration of the cultural significance of the tags.
(There are patents addressing methods for creating such signatures.
Effectively, what they amount to is determining the musical instruments /
tonal qualities of a fragment of the music and associating that with a
specific recording.) When such tags are associated with an identifiable audio
fragment, and the tags are shared and accessible worldwide, a sample dataset
supporting the development of identification technologies is established.
In the intervening time, users often desire that the metadata associated
with third-party files is accurate. Microsoft has recognized that publicly
available metadata libraries are full of mistakes, and assigns 'trust' levels
to them.
The value of a search result is NOT how MANY results it returns, but how
FEW, and how ACCURATE those few results are: that is, if a search result
includes every file on a user's computer, it has no value. These technologies
do not succeed at accurately identifying the contents of digital assets: the
search result is inaccurate, returning either far too few results or far too
many, neither of which is acceptable in most situations. Any data file that is
not
accessed has little value: once stored, if never accessed, what is the value
of a digital photo or music file? Consumers who purchase digital cameras or
media playback devices become dissatisfied with the technology when they
realise the amount of effort required to engage with the data: figuring out
the correct subset of files to digest is labour intensive, because there is no
automated search mechanism capable of identifying the contents of digital
media files.
That is where metadata comes in: the user cannot rely on pattern
detection algorithms to identify and return correct results to a user-directed
search. That leaves the user with two alternatives: 1) continue to struggle
with manual file management techniques until the pattern detection algorithms
are sophisticated enough to return accurate results; or 2) assign
computer-processable metadata to the digital media files such that accurate
search results can be returned.
Attempts to provide metadata by manual means are tiring: they demand
expertise from the user from a mental point of view, and also a degree of
endurance and physical dexterity. So, the user will accept metadata determined
by automated technologies, or from third parties via internet-based sources,
but may still have to correct the metadata for accuracy (the original tagging
was incorrect) or just to suit their own metadata naming conventions.
Thus, with respect to the two alternatives mentioned above, alternative 1)
is not particularly desirable. Desirably, alternative 2) above, regarding
assigning computer-processable metadata to the digital media files, should:
1) allow users of moderate skill to create limited metadata
vocabularies using specific terms and phrases that are easily
understood by themselves and their associates, therefore having
greater value than terms and phrases chosen by third parties and
adapted for other uses;
2) guide users to create metadata vocabularies that employ "best
practices" in creating same, so the metadata vocabulary is sound
both structurally and semantically;
3) provide tools that allow for collaborative creation of metadata
vocabularies;
4) provide integrated documentation for every field and value in the
metadata vocabulary;
5) provide integrated guidance for collaborators to extend the
metadata vocabulary in a compatible, approved way;
6) provide revision control and distribution mechanisms for
metadata vocabularies;
7) provide mechanisms to establish "synonym" relationships
between two or more tags within a given metadata vocabulary,
and between a specific tag or tags in two or more metadata
vocabularies;
8) allow individual users to restructure the metadata presentation to
better meet their particular requirements, without altering the
general semantic context of the metadata vocabulary;
9) provide users with efficient and streamlined interfaces for
application of metadata to files;
10) store the metadata in the files so wherever the file is moved or
copied to, machine processable metadata travels with it;
11) provide multiple, intuitive search interfaces to leverage metadata:
highly accessible "simple text" search and highly accurate
"context-sensitive" search.
"State of the art" metadata tools lack most of these capabilities by design:
tools that focus on file editing and playback offer only rudimentary
interfaces for metadata management: it is a small subset of the applications'
overall feature set. Mostly, other applications do not attempt to overcome the
following problems with user-assigned metadata.
Other tagging tools do not address the fact that discovered metadata
(found embedded in files imported to the user's file library from a
third-party source) needs to be managed. When one receives photos from someone
else, other tools provide no means to establish provenance or to decide what
to do with the incoming metadata. Decisions made by the user with regard to
the correct
position in a structured tag hierarchy are not remembered for the next time
that same tag is encountered: the same tag will appear in the Microsoft
Windows Vista Photo Gallery tag tree at the top level again and will need to
be moved manually to the correct position to update the embedded tags. The
invention includes capabilities to remember and associate the tag discovered
in an unstructured form with the tag filed in the appropriate branch of the
structured vocabulary.
Adobe Photoshop Elements offers a number of features that appear to
support structured tag vocabularies, but these are limited in a number of
ways: 1) they only allow the tags to be processed as structured tags within
the Adobe Photoshop Elements application (the tags embedded in files are not
structured at all); 2) due to the problems related to item 1), the use of
homoglyphs is forbidden: the same exact word or phrase can only appear once in
the entire tag hierarchy, undermining the value of structured metadata; 3)
Photoshop Elements allows one to save out one's metadata vocabulary and share
it in a standards-based XML (eXtensible Markup Language) format file, but
there is no integrated way for the originator to associate documentation with
it: there is no value added at the source. In order for the recipient of the
tag vocabulary to make proper use of the tags, the originator must devise
their own documentation scheme and the recipient user must be able to accept
that documentation in the scheme provided. The inventions herein include
specifications for integrated metadata documentation.
Furthermore, other tools do not guide the user to enter keywords or labels
in a way that is usable for search. For example, the user may be encouraged to
provide a natural-language caption, but such natural-language phrases are not
easily discovered by mainstream search technologies. Keyword search
integrated at the operating-system level includes file names / path fragments
as part of the source data that is searched, which is subject to
misinterpretation out of the context of the designated metadata set, so it is
likely that irrelevant files will be returned in the result, therefore
diluting the sanctity of 'keywords'. The user is not guided to create metadata
in a sensible, manageable, 'future proof' way.
In a sense, desktop search tools help one find things that have been
carelessly managed: the inventions herein are about ensuring files are
properly managed in the first place, and assisting the user in maintaining the
integrity of the aggregate metadata over time.
Current metadata systems do not support multi-language 'synonyms' or
'translations'. Claims have been made that "unicode" is used so other
codepage-based operating systems will not fail to render the characters
properly, but that is simply a mechanism allowing for any language to be used
when keying in the terms. It does nothing to associate semantically identical
terms from different languages or with different spellings with one another
properly.
Almost every software application that bills itself as a "digital asset
manager" or "media file manager" offers some sort of metadata entry. The scope
of the metadata supported is limited to a specific set of proprietary fields.
The data entry mechanism is manual (typing) and other tools do little or
nothing to optimize the tagging activity. Innovation on the part of the
vendors of such tools comes in the selection of standard fields made available
for use, the way fields are arranged on the forms, or the combination of user
interface controls used. Little has been done to provide users with tools that
address the metadata workflow and the integrity of the metadata library over
the long term, in due consideration of what motivates users to enter metadata
(the value of tagging), nor to articulate a workflow in support of fixing
metadata that is incorrect, a task that shares some requirements with adding
metadata to files from scratch but has other, distinct requirements.
First, pre-configured metadata vocabularies: the user must gain a
sufficient understanding of the semantic meaning of every field and, if field
values are restricted, of every possible value. Novice users will not
understand the potential value of the investment in learning a pre-configured
metadata vocabulary. In fact, only by learning about the vocabulary may the
user discover it is inappropriate for their use, which is a 100% wasted
effort. If field values are not restricted, novice users who have not
developed the insights to properly plan and establish a
vocabulary for their own use will be subject to problems that arise with
inconsistent and / or incomplete tagging.
The inventions herein include methods for expressing metadata
vocabularies which include
a) the descriptions of tags so possible adopters will know the exact
purpose of the tags in the tag vocabulary
b) guidance to the creation of tags to supplement the existing tags
should the tag vocabulary be amenable to addition of tags and
c) guidance for the application of a specific tag in the broader
context of the tag vocabulary. Beyond the general description of
the tag, this type of guidance has particular value while the user
is tagging files. This guidance pertains to the specific
characteristics of the file the user should observe to best
determine the correct subtag or parameter value to use. A novice
user may be overwhelmed by the variety of content they must
consider to apply a single tag. This type of guidance helps the
user focus on particular characteristics relevant to a specific tag.
Conversely, a user may be provided with guidance as to which
characteristics of the file to IGNORE to determine which subtags
or parameter values to assign to the file, with guidance that
certain characteristics that may be relevant to the current tag are
more relevant to another tag. The invention provides a platform
for standardization of such guidance.
Secondly, simple-text keywords and description assignment: the user
simply types in any word or phrase that occurs to them. Interfaces do not
present a catalog of previously-used tags, so the user does not get the
benefit of a standardized metadata vocabulary which can be employed
consistently. Additionally, such flat 'keyword' methods provide no 'context'
for keywords: the only search method available to the user will be a simple
text search, which will inevitably return false positives and fail to return
synonym matches. There exist complex natural-language search engines and
sophisticated search-algorithm
composition tools, but most are far beyond the capabilities of most
non-technical people.
Embodiments of the present invention provide metadata management
wherein all the features in the application and user interface serve the task
of
creation and assignment of metadata, and returning accurate search results.
SUMMARY OF THE INVENTION
The present invention provides a method and system for applying
metadata to file objects. The file objects may be photographs in digital form
or digitized photos, music files, video clips, text documents, interactive
programs, web pages, 3D model files, blueprints, flowcharts, invoices,
database reports, video game assets, sound samples, transaction log files and
the like.
Broadly speaking, the method of the present invention involves
• auditing the file system to identify compatible files
• assessing the metadata associated with those files
• presenting the metadata to the user for examination
• providing a variety of tools for the user to examine and change
metadata
• providing tools for the user to add metadata where none exists
• providing tools that allow the user to work with standardized tag
vocabularies (also known as 'controlled vocabularies')
• providing tools that allow the user to create and maintain custom
metadata vocabularies
• providing mechanisms for the user to make subsets of tag
vocabularies to optimize the tagging process related to specific file
sets
• providing tools for the user to collaborate with other users on
metadata vocabularies
• providing mechanisms that integrate metadata with the files so the
metadata is carried with the file throughout the workflow / lifecycle
• providing tools that let the user create queries against the metadata
in the fileset, and present the results
• providing tools to allow the user to export the files that result from
the search for processing outside the application
• providing documentation to the user community for creation of
third-party applications that establish critical mass for the tag vocabulary
and encoding scheme
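By way of a non-limiting illustration, the first step listed above (auditing
the file system to identify compatible files) might be sketched in Python as
follows. The set of extensions and the function name are assumptions made for
this sketch only; the description does not enumerate the compatible file
types.

```python
import os

# Hypothetical set of "compatible" extensions; not taken from the description.
COMPATIBLE_EXTENSIONS = {".jpg", ".jpeg", ".mp3", ".pdf"}


def audit_file_system(root):
    """Walk the tree under root and return the sorted paths of compatible files."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Compare extensions case-insensitively (".JPG" matches ".jpg").
            if os.path.splitext(name)[1].lower() in COMPATIBLE_EXTENSIONS:
                found.append(os.path.join(dirpath, name))
    return sorted(found)
```

The returned list would then feed the later steps (assessing and presenting
the metadata found in each file), which are not shown here.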
Generally speaking, files will have metadata that is relevant to a number of
characteristics of the file and the overall file set.
1. the file's technical aspects (format, bytes used, date of creation)
2. the workflow in which the file participates (creator, owner, publisher,
date of publication, copyright information, etc)
3. the subject matter of the file (the nature of the sound of an audio
file, be it music or a sound-effect, the subject of a photograph or
video clip, the abstract of a lengthy text document, excerpted
particulars of invoices or other data-interchange format files).
Furthermore, data files can themselves be metadata for a real world
object: the photograph of a collectible (the characteristics applied to the
photo do not relate to the photo itself, but to the subject of the photo) or
the sound of a musical instrument (the sound file is representative of the
musical instrument, and is not itself a valuable data file). All of these
types of metadata will need to be managed and, to date, no comprehensive tool
set exists that supports these diverse metadata applications.
SwapNeat Metadata Studio is designed to serve all of the above metadata
applications, and more to come.
A further understanding of the functional and advantageous aspects of the
invention can be realized by reference to the following detailed description
and
drawings contained in Appendix 1.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
As used herein, the term "metadata" means data used to label in any
fashion a file set or data object.
Appendix 2 contains examples of pseudocode that match those
referenced in this description.
Concepts
The SwapNeat invention develops several concepts that are then woven into the
infrastructure and supported in the software application.
human factors issues
• the semantic meaning of keywords used in metadata vocabularies can be
abstruse
• some semantic metadata keywords are applicable to many files in a
dataset simultaneously; others only to single files
• application of metadata can be tedious: software supporting this task
should engage the user where possible
• users are most enthusiastic about the tagging effort where they personally
value the content being tagged
metadata
Information in a file is known as 'data'.
For instance, the data in a digital photograph is all the information needed
to display the pixels.
Additional information about the picture, other than the pixel data, can be
associated with or embedded in a file: this is called 'metadata'. Examples
include the date and time the picture was taken, and the focal length or ISO
settings of the camera.
Other examples from digital photograph files include the size of the file, the
number of bits per pixel, or the orientation of the digitization device
relative to gravity at the time of digitization. For other file types, the
date of creation, creator identity, title, and lists of related information
are examples of metadata.
Some information in a digital photograph file, such as the image dimensions,
crosses the line between data and metadata, since it is needed to display the
pixels but is not the pixel data itself.
XML metadata is not just a standard text encoding mechanism. The structure of
XML metadata enables the explicit increase of specificity relative to the
nesting position: a high-level (outer element) XML node is more general than a
deeply nested one. The high-level nodes can be used alone (ignoring nested
content) where applicable, even though more specific, deeper-nested nodes
exist.
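The relationship between nesting depth and specificity can be illustrated
with a short Python sketch using the standard xml.etree.ElementTree module.
The element names in SAMPLE and both function names are hypothetical and are
chosen only for this illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary fragment: outer elements are more general.
SAMPLE = "<animal><mammal><dog><terrier/></dog></mammal></animal>"


def specificity_paths(xml_text):
    """Return every tag path, from the outermost node down to each nested node."""
    root = ET.fromstring(xml_text)
    paths = []

    def walk(node, prefix):
        path = prefix + [node.tag]
        paths.append("/".join(path))
        for child in node:
            walk(child, path)

    walk(root, [])
    return paths


def matches(xml_text, general_tag):
    """True if general_tag appears at any nesting level of the metadata."""
    root = ET.fromstring(xml_text)
    return any(element.tag == general_tag for element in root.iter())
```

In this sketch a search for the general term "mammal" matches a file tagged
down to "terrier", because the outer nodes remain usable on their own while
the nested content is ignored.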
XML is used because many readily-available XML processing tools and
technologies exist. These tools are standards-based and are free of
proprietary technology / patent encumbrances.
The invention supports the file-specific association of XML-based metadata by
embedding the metadata in the file itself. The metadata is free of dependence
on a file-database association, or on a particular processing application, to
associate the two distinct types of data (in the case of a photo file, the two
distinct types of data are 1: the metadata and 2: the pixel information).
metadata
Information about the metadata includes the data format of a given keyword,
the specification for the format of the metadata when embedded, translation to
other languages of the names used for the metadata field names, and
human-readable information regarding the applicability of the information
fields. These contribute to the process of applying and verifying the metadata
in a file. Essentially, this is metadata about the metadata itself.
Swapneat integrates management of metadata metadata in the infrastructure for
managing the metadata in files.
The facility for inclusion in the tag vocabulary of metadata that describes
how to
enhance and extend the metadata vocabulary might be referred to as metadata
metadata metadata.
As mentioned above, field values describe the data in a data file: this is
metadata.
Describing what the fields and their values mean is metadata metadata.
The invention further provides a method that allows inclusion of instructions
guiding users on the addition to and extension of a tag vocabulary (in
essence, adding fields that offer synergistic extension of the fields
currently available in the tag vocabulary). This is referred to as "creation
guidance".
SwapNeat creation guidance allows the metadata metadata (sntv files) to
contain integrated information that can guide the process of augmenting the
vocabulary. This is very useful when more than one contributor can add to the
metadata specification.
Describing how to create additional fields and providing the facility to
describe those created fields is metadata for metadata for metadata.
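The three layers described above can be pictured as fields of a single
vocabulary entry. The following Python sketch is illustrative only: the class
and attribute names are hypothetical, the actual sntv file format is not
reproduced, and treating an empty creation guidance as "closed to extension"
is an assumption of this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class TagDefinition:
    name: str                       # the tag itself (used as metadata in files)
    description: str                # what the tag means: metadata metadata
    application_guidance: str = ""  # what to observe or ignore while tagging
    creation_guidance: str = ""     # how to extend: metadata metadata metadata
    children: list = field(default_factory=list)

    def add_child(self, child):
        """Add a sub-tag, but only where the vocabulary invites extension."""
        # Assumption: no creation guidance means the tag is closed to extension.
        if not self.creation_guidance:
            raise ValueError(f"'{self.name}' gives no guidance for extension")
        self.children.append(child)
        return child
```

A collaborator extending a "colours" tag would thus see its creation guidance
at the moment of adding a sub-tag, rather than consulting separate
documentation.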
discovered metadata
The XMP specification allows for use of unknown vocabularies, provided they
adhere to certain rules enforced by the limits on the structure of XMP
metadata. The data types and allowed choices for those vocabularies will be
unknown to
the receiver of the information unless they can access the specification and
write tailored code to understand the information.
Therefore, XMP is basically just a container for such metadata, and conducting
searches or inferring data formats is beyond the scope of the XMP
specification.
The SwapNeat Metadata Studio will infer basic data types and nesting relations
in discovered metadata, in order to systematize what is discovered.
The subsequent introduction of the 'real' specification for a metadata
vocabulary will reinforce and in some cases expand the list of available
choices, or further restrict the data type for certain fields. Assuming that
the originally encountered metadata conforms to the (unknown at the time)
specification, subsequent parsing of the specification will allow adjustments
to the stored inferred vocabulary in a compatible way.
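A minimal Python sketch of this inference, under stated assumptions, is given
below. It covers basic data types and observed value lists only (nesting
relations are omitted); the three type names, the widening order and all
function names are assumptions of the sketch rather than part of the
description.

```python
def infer_type(value):
    """Guess a basic data type for one discovered string value."""
    for caster, name in ((int, "integer"), (float, "decimal")):
        try:
            caster(value)
            return name
        except ValueError:
            pass
    return "text"


# Widening order: a field seen with both integers and text is stored as text.
_RANK = {"integer": 0, "decimal": 1, "text": 2}


def infer_vocabulary(records):
    """Build {field: {"type", "values"}} from discovered (field, value) pairs."""
    vocab = {}
    for field_name, value in records:
        entry = vocab.setdefault(field_name, {"type": "integer", "values": set()})
        seen = infer_type(value)
        if _RANK[seen] > _RANK[entry["type"]]:
            entry["type"] = seen
        entry["values"].add(value)
    return vocab


def apply_specification(vocab, spec):
    """Adjust an inferred vocabulary once the 'real' specification is known."""
    for field_name, declared in spec.items():
        entry = vocab.setdefault(field_name, {"type": declared["type"], "values": set()})
        entry["type"] = declared["type"]                  # may restrict the type
        entry["values"] |= set(declared.get("values", ()))  # may expand choices
    return vocab
```

Because the specification only replaces the inferred type and adds to the
list of choices, values already discovered in conformant files remain valid
after the adjustment.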
XMP supports the extension of the metadata compartment via third-party defined
metadata vocabulary specifications.
There is no requirement for the specification to be available to the consumer
of the metadata; fields are discovered, and their field names and the values
therein are meant to be descriptive enough for the reader to intuit the
semantic purpose of the field and value, but that can only be accomplished in
the context of the content of the data file, by comparison to other fields and
associated values in the rest of the metadata vocabulary. Embedded metadata in
an image file lacks this context and is therefore difficult to process
externally.
The syntax and semantics for the metadata fields can be accessed via the
internet, in order to support XML-compliant processing. It is also possible to
obtain an inferred vocabulary after the discovery process, which can aid in
the creation of an official vocabulary in cases where a rigorous vocabulary
specification has never existed.
equivalences
XML is ideal as an intermediate format to use for translation of semantically
identical metadata values between different fields in different metadata
standards.
An equivalence is a user-designated association of two different identifiers,
such that a relation is understood to exist between them. An equivalence can
be used to correct a spelling error, for instance in a name. If some items
have been interactively tagged with certain element names, and it is
subsequently discovered that there was a spelling error, then rather than
gathering up all the existing files that use the name, the specification of
the vocabulary can be augmented with an equivalence, and then published.
Subsequent processing of or access to the files, even by third parties, can
access the specification, apply the equivalence relations, and carry on as if
the original file had been corrected before it was sent.
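The spelling-correction case can be sketched in Python as a lookup applied at
read time, leaving the files themselves untouched. The mapping structure and
the function name are assumptions of this sketch; the published form of an
equivalence within a vocabulary specification is not shown.

```python
def apply_equivalences(tags, equivalences):
    """Return tags with each name replaced by its published equivalent, if any."""
    resolved = []
    for tag in tags:
        seen = {tag}
        # Follow chains (a -> b -> c), but stop if an equivalence loops back.
        while tag in equivalences and equivalences[tag] not in seen:
            tag = equivalences[tag]
            seen.add(tag)
        resolved.append(tag)
    return resolved
```

For example, with the equivalence {"Jon Smiht": "John Smith"} published, a
file still carrying the misspelled name is processed as if it carried the
corrected one.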
sharable synonym lists
Equivalences are part of tag vocabularies: they point to semantically
identical tags in other tag vocabularies. Either tag vocabulary can point to
the semantically identical tag in the other. The result is a symmetrical
equivalence relation.
Symmetry is then broken by specifying one of the several equivalent nodes to
be dominant (visible) and all the other equivalent nodes to be hidden. With
respect to the nodes within one equivalence set, when metadata is displayed or
embedded by the Metadata Studio, the visible nodes are used, and the
processing is as if the file contained the visible equivalent nodes, whether
it did or whether it contained a hidden equivalent. If the metadata is
re-embedded into the file after some editing process, the visible nodes will
be re-embedded, regardless of the original content.
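One possible way to model an equivalence set with a single visible node is
sketched below in Python. The class, its method names and the prefixed node
names used in the usage note are hypothetical and serve only to illustrate the
behaviour described above.

```python
class EquivalenceSets:
    """Groups of equivalent nodes, each with one visible (dominant) member."""

    def __init__(self):
        self._visible = {}  # node name -> visible node of its equivalence set

    def add(self, visible, *hidden):
        """Declare `visible` dominant over the given hidden equivalents."""
        for node in (visible, *hidden):
            self._visible[node] = visible

    def display(self, node):
        """Node name to show or re-embed; unknown nodes pass through unchanged."""
        return self._visible.get(node, node)

    def equivalent(self, a, b):
        """Symmetric test: both nodes resolve to the same visible node."""
        return self.display(a) == self.display(b)

    def normalise(self, embedded_nodes):
        """Replace hidden nodes by visible ones, dropping duplicates in order."""
        return list(dict.fromkeys(self.display(n) for n in embedded_nodes))
```

After `add("photo:colour", "img:color")`, a file containing only the hidden
node "img:color" is displayed, searched and re-embedded as "photo:colour".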
references to appropriate qualifiers
Certain tags can be useful as stand-alone keywords, but are more useful when
they are used as 'qualifiers'.
For example, colours: you can have all sorts of colour names under a specific
"colours" tag vocabulary. When an object in a photo is a particular colour,
you can add that colour to the photo. Then, you can search for photos
containing an object of a particular colour.
To use an example of tagged photos: this technique does not indicate which
object in the photo is of that colour. You could have a vase of flowers
sitting on a table; the table may be red and the flowers may be yellow. The
colours "red" and "yellow" can be tagged into the file, but the user cannot
search for "red flowers" or "yellow flowers": they can only search by colour.
So a search for "red" will return a photo with a red table (or any other red
object) rather than photos specifically of "red flowers".
Using the "colours" tag vocabulary nodes as qualifiers, you are able to apply
a colour to a particular object. In the example above, you would assign a
"yellow" colour node as a qualifier (a type of child node) to the "flowers"
object node, and a "red" colour node as a qualifier of the "table" object
node. Now it is not only possible to search for "yellow" things or "red"
things, but also to search for a "red table" or "yellow flowers".
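The difference between a flat colour search and a qualified search can be
sketched in Python as follows. The file names and the dictionary layout
(object tag mapped to its set of qualifier tags) are assumptions made for this
illustration, not the embedded format.

```python
# Each photo maps object tags to the set of qualifier tags attached to them.
PHOTOS = {
    "vase.jpg": {"flowers": {"yellow"}, "table": {"red"}},
    "garden.jpg": {"flowers": {"red"}},
}


def search_flat(photos, colour):
    """Flat search: any photo in which some object carries the colour."""
    return sorted(name for name, objects in photos.items()
                  if any(colour in quals for quals in objects.values()))


def search_qualified(photos, colour, obj):
    """Qualified search: the colour must qualify the named object."""
    return sorted(name for name, objects in photos.items()
                  if colour in objects.get(obj, set()))
```

Here `search_flat(PHOTOS, "red")` returns both photos, including the one with
the red table, whereas `search_qualified(PHOTOS, "red", "flowers")` returns
only "garden.jpg".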
redisplay of information in a more friendly tree form
The SwapNeat Metadata Studio includes a metadata parsing library, which
converts a variety of structured and flat tag vocabulary formats into a
consistent, consolidated tree structure. Only one control and user interface
paradigm is required for the user to examine all tags from all embedded tag
compartments in the files, regardless of origin and datatype.
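As a rough Python sketch of such consolidation, flat keywords and
separator-delimited hierarchical tags can be merged into one nested
structure. The "|" separator, the function names and the nested-dictionary
representation are assumptions of the sketch; the parsing library's actual
input formats are not reproduced here.

```python
def build_tree(flat_keywords=(), hierarchical_tags=(), separator="|"):
    """Merge flat keywords and delimited hierarchical tags into one nested dict."""
    tree = {}
    for keyword in flat_keywords:
        tree.setdefault(keyword, {})      # flat keywords sit at the top level
    for tag in hierarchical_tags:
        node = tree
        for part in tag.split(separator):
            node = node.setdefault(part, {})
    return tree


def render(tree, depth=0):
    """Return indented lines, one per node, for display in a single tree control."""
    lines = []
    for name in sorted(tree):
        lines.append("  " * depth + name)
        lines.extend(render(tree[name], depth + 1))
    return lines
```

Whatever the origin of a tag, the user interface then only has to present the
one tree returned by `build_tree`.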
upgrade from one term to a new term or new vocabulary
To implement an upgrade of one term to another, an equivalence is created, and
the old term is designated as hidden. The term sn:upgradesa can be used to
create
an equivalence and hide the obsolete element at the same time. This is useful
where only one of the vocabularies is owned by the user. Re-rendering and
re-arrangement is possible based on information in the dominant vocabulary.
The converse is also possible. A vocabulary author can choose to defer to an
'authority' (a potentially more popular existing vocabulary) and remap a node
in his vocabulary to a synonym elsewhere. This creates an equivalence and
makes his referring node the hidden node. Users with files containing the
referring nodes who access the latest vocabulary specification will then see
element names in the authority vocabulary instead of the embedded references,
and if the metadata is updated, the authority element names will be used.
provision of context to context-free metadata such as flat keywords
An equivalence can be created from a flat keyword, such as one found in the
dc:subject or pdf:keywords lists of a file, to a deeply nested element
elsewhere. This will allow more meaning and specificity to be given to the
flat keyword. This is useful only when the creator of the link knows about the
meaning and context of the original keyword by other means.
interactive re-filing of discovered information
The use of equivalences can enable the automatic re-filing of flat keywords
when they are seen again. Mere transformations on the metadata XML are used
when a non-permanent remapping is required.
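The automatic re-filing described above amounts to remembering the user's
earlier filing decision and reusing it. A minimal Python sketch, with a
hypothetical class name and an in-memory store standing in for whatever
persistence the application uses, is:

```python
class Refiler:
    """Remembers where the user filed a flat keyword and reuses that decision."""

    def __init__(self):
        self._filed = {}  # flat keyword -> path in the structured vocabulary

    def remember(self, keyword, path):
        """Record that `keyword` belongs at `path` (outermost node first)."""
        self._filed[keyword] = tuple(path)

    def refile(self, keywords):
        """Split keywords into (filed, unfiled); filed maps keyword -> path."""
        filed = {k: self._filed[k] for k in keywords if k in self._filed}
        unfiled = [k for k in keywords if k not in self._filed]
        return filed, unfiled
```

Only the keywords returned as unfiled need the user's attention when a newly
imported file is examined.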
dstrings
dstrings stands for 'disposable strings'.
dstrings are parameter values used to supply metadata parameter values for
specific fields in specific files. They are useful for sharing possible
choices of metadata information during the time interval in which the mp3 file
exists but its metadata is not known to be correct.
A dstring specifies a music signature, a field name, and a value as a tuple.
The field name is a reference to one of the possible metadata holders in the
file, such as 'title' or 'cd track number'. dstrings are saved in the database
of an individual user's Metadata Studio, so that they are available if he
re-inspects the referenced file at a later time, and dstrings are embedded
into m2g files, which are stored on the internet-based server for SwapNeat
metadata and can be served to other users with the same music signature.
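The tuple described above can be sketched in Python as a named tuple. The
attribute names, the placeholder signatures in the usage note and the helper
function are hypothetical; the m2g file format and the computation of the
music signature itself are not shown.

```python
from typing import NamedTuple


class DString(NamedTuple):
    signature: str  # music signature identifying one rendition
    field: str      # metadata holder in the file, e.g. 'title'
    value: str      # proposed value for that field


def candidates(dstrings, signature, field):
    """Distinct values proposed for one field of one rendition, first seen first."""
    return list(dict.fromkeys(
        d.value for d in dstrings
        if d.signature == signature and d.field == field))
```

Given dstrings from several users for the placeholder signature "sig-1",
`candidates(store, "sig-1", "title")` lists the title choices that could be
offered to another user holding the same rendition.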
rendition dependent metadata choices
Music signatures are specific down to sampling rate and encoding tool, and can
be used to distinguish not only among different mp3 titles, but also among the
individual compressions of a particular mp3 file. It is also possible to
distinguish between different renditions of a music track found on different
CD albums.
Metadata choices might reflect the quality of the music when compressed to the
settings used. For instance, a piece compressed at 16 kbits per second might
sound muddled or off-key, and this can be reflected in the dstrings for that
rendition.
On the server, using m2g graphics as a back-up check, renditions with
essentially the same metadata can be mapped together to increase the breadth
of metadata choices and the depth of their application statistics. This has to
be used with care, since some metadata makes reference to time-points within
the music.
voting based on frequency of use
Certain tags are subjective to individual users: over a large user population,
there will be statistical preferences. By anonymous aggregation of the choices
made by users for specific renditions of specific mp3 files, for instance,
recommendations can be communicated to individual users:
Thus, the meaning of the information is to convey the sentence: "Of the users
who entered a value for this tag, the most popular value is indicated with an
icon
on the button representing the tag value."
vote collation procedures
dstring votes are tallied at intervals on the server, and the m2g files are
updated
to contain statistics about the latest preferences for certain metadata fields
for
which information is available. To a certain extent, it's not even important
what
the true value for a metadata item is, as long as everyone who searches uses
the
same tag and parameter value.
user data anonymity
All user data for dstring votes is strictly anonymous. This is both to
encourage users to be truthful in their application of tags and to protect
fair-use users from harassment if their tagging preferences were made known to
copyright authorities.
protection of information about prevalence of copies on the net
By quoting only percentages for preferences in m2g files, no information about
the size of the database is available to an agent inspecting the m2g file.
Once the
number of votes exceeds a certain threshold, older votes can be randomly
discarded before the new votes are applied, in order that the size of the
sample
won't be deducible from the effects of recent crafted voting attacks.
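The discarding step can be sketched as below. This is a minimal sketch under stated assumptions: votes are held as a flat list of values, the threshold is a fixed pool size, and the function name and parameters are hypothetical. The text does not say how many older votes are dropped; here just enough are dropped to keep the pool at the threshold.

```python
import random


def apply_new_votes(stored, new_votes, threshold, rng=random):
    """Add new votes; if the pool would exceed `threshold`, randomly
    discard older votes before the new ones are applied."""
    stored = list(stored)
    overflow = len(stored) + len(new_votes) - threshold
    if overflow > 0:
        drop = min(overflow, len(stored))
        # Choose which older votes survive, keeping their original order.
        keep = sorted(rng.sample(range(len(stored)), len(stored) - drop))
        stored = [stored[i] for i in keep]
    return stored + list(new_votes)


old = ["rock"] * 8
new = ["pop"] * 4
pool = apply_new_votes(old, new, threshold=10, rng=random.Random(0))
```

Because the pool size stays constant once the threshold is reached, a burst of crafted votes shifts the percentages by an amount that does not reveal how many votes were ever cast.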
use of percentages compared to actual counts
Until a sufficient volume of anonymous data has been gathered, users will not
be
shown any suggested "most popular" choice.
After such time as a significant number of users have entered a value and
SwapNeat Inc. has captured those values, the 'most popular' values will be
indicated to users, as percentages of the overall data set.
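A rough sketch of this reporting rule follows. The function names and the `min_votes` cut-off are assumptions for illustration; the text does not state what counts as a sufficient volume of data.

```python
from collections import Counter


def tally_percentages(votes):
    """Map each voted value to its share of all votes, as a percentage."""
    counts = Counter(votes)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {value: 100.0 * n / total for value, n in counts.items()}


def most_popular(votes, min_votes=20):
    """Return the most popular value, or None until enough votes exist."""
    if len(votes) < min_votes:
        return None
    return Counter(votes).most_common(1)[0][0]


votes = ["rock", "rock", "rock", "pop"]
shares = tally_percentages(votes)
```

Only the percentages would be written to the m2g file, so the raw counts never leave the server.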
storage in a m2g file
dstring values, along with their percentages are stored in m2g files (which
also
contain the music graphic). The resulting file is a compact way to convey all
that
is known about a certain music signature.
internal m2g format spec
a dstring is packed into xml
teflon tabs
Certain metadata fields are suited to storage and reuse of the values used.
For
example, pictures in a user's personal photo collection will likely include
many
pictures of the same person(s). Having the tag that identifies a unique
individual
stored for future use makes sense.
Other types of metadata may be unique to a particular data file: for example,
in a
music collection, the name of a specific song may be used only once in an
entire
music file collection: therefore, it is likely that hundreds of unique values
will be
used and to retain all of these values would be overwhelming to the user, and
redundant, as they are not likely to be used again.
Certainly, the value of storing them for possible future reuse in tagging is
far
outweighed by the cost of managing and finding them over time, if they are
infrequently used, or used only once.
Note that once embedded in a file, even though they are not retained in a way
that presents them to the user for use to tag files, they are available for
use in
satisfying search queries. Since these parameter values are in general plain
text,
the search queries are based on single words and flat keyword searches among
the text values. Structured information is not contained in parameter
values; it is found using the tools of the SNMD structured metadata.
m2g display
m2g stands for music-to-graphics
An m2g image is a 2-dimensional representation of sound intensity and
frequency components, such that major transitions in instrumentation, lyrics
or tempo can be discerned.
The sound file is divided up into a few thousand time intervals, and each
interval is processed via FFT algorithms to yield information about frequency
content.
The minimum graphical unit of a m2g is a column of 4 pixels, containing a
white
dividing line, and 3 RGB pixels, which encode the energy content for 9
frequency
bands of the digitized sound for the corresponding particular interval.
The positioning of the divisions between frequency bands is configurable, to
provide a good distinction between different parts of the music. However, once
decided, the division should be used by all programs that can refer to that
file.
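The pixel layout described above can be sketched as follows. This assumes the nine band energies have already been computed and normalized to the range 0.0 to 1.0, that the white dividing pixel comes first in the column, and that bands are assigned to the R, G and B channels in ascending order; none of these details are fixed by the text, and the function name is hypothetical.

```python
WHITE = (255, 255, 255)


def pack_column(band_energies):
    """Pack 9 band energies (each 0.0..1.0) into a 4-pixel column.

    The first pixel is the white dividing line; the next three are RGB
    pixels, each carrying three bands in its R, G and B channels.
    """
    if len(band_energies) != 9:
        raise ValueError("expected exactly 9 frequency bands")
    # Scale each energy to an 8-bit channel value, clamping out-of-range input.
    levels = [max(0, min(255, round(e * 255))) for e in band_energies]
    pixels = [tuple(levels[i:i + 3]) for i in range(0, 9, 3)]
    return [WHITE] + pixels


column = pack_column([0.0, 0.2, 1.0] * 3)
```

Three pixels of three channels each give exactly the nine bands mentioned, which is why the column is four pixels tall including the divider.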
whack methodology
the 'whack interface' is the primary portal for user-applied metadata
It is important to establish the fact that with regard to structured metadata,
every
node can act as both a category (a "field") and a tag (a "value").
Effectively, the
deepest-nested node in a given FILE is a TAG, whether or not that TAG exists
as
a CATEGORY in another file. Likewise, a node that is at the deepest nesting
level, having no descendant nodes, can at any point have more-specific
descendant nodes added. This does not diminish the value of the metadata
previously applied that lacked these extra descendant nodes: rather, files
tagged
in the future are RICHER than those previously tagged.
Before applying metadata to a set of files, the user selects a subset of
values
available from all structured metadata vocabularies, on the principle that for
a
given set of files, a specific subset of tags will be applicable.
Once this subset is compiled, the subset can be saved for future reuse.
The process is more efficient because the user does not have to select from an
overwhelming list of tags, or re-invent the tags for each file in an ad-hoc
way.
Further efficiency is realized by automatically navigating for the user
between
metadata categories within this subset. The mechanism consists of standard
"Button" controls, and "Tab" controls.
Each category of keyword is represented by a Tab control. Each metadata value
is represented as a button on the parent category's tab. The tab provides
context
information, and the button allows a specific choice to be made.
At the time the image is first presented for tagging, the user begins on a
tab,
makes a choice, and the choice is queued for application to the metadata in
the
current file(s). As soon as one or more buttons are pressed by the user, the
next
category (tab) is automatically presented (by changing to the next tab in the
progression, and displaying the buttons for it), from which the user will
select
applicable tags, if any. This process continues until all the categories
selected in
the subset have been visited for the file. The queued metadata is applied and
the
next file is presented and the process repeats.
The user does not consciously have to navigate through the tag subset. The
navigation is linear and automatic. The user need just select appropriate
tags, or
skip the category if it is inapplicable to the file being tagged.
Rather than having the user visit and consider all tabs in the progression,
it's possible to specify buttons on earlier tabs which will result in visiting
later tabs.
This reduces the number of decisions required, while modestly increasing the
number of buttons pressed. It also greatly reduces the number of tabs which
have to be actively skipped. A "Buttontab" is a combination of a button and a
category, but that category is only 'activated' for a given file if the
associated
button is clicked from the more general enclosing category for that tag.
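The automatic progression, including buttontab activation, might be modelled roughly as below. The dictionary layout, the function name and the sample tabs are all hypothetical, and button names are assumed to be unique within the tab set so that a pressed button can be looked up by name.

```python
def whack_progression(tabs, choose):
    """Visit tabs in order and return (visited tabs, queued tags) for one file.

    `tabs` is an ordered list of dicts with keys:
      "name":         the category shown on the tab
      "buttons":      the tag values offered on that tab
      "activated_by": None for a full tab, or the name of the button
                      (on an earlier tab) that activates this buttontab
    `choose(tab)` returns the buttons pressed on that tab; an empty list
    means the category was skipped.
    """
    queued, pressed, visited = [], set(), []
    for tab in tabs:
        trigger = tab["activated_by"]
        if trigger is not None and trigger not in pressed:
            continue  # buttontab destination not activated for this file
        visited.append(tab["name"])
        for button in choose(tab):
            pressed.add(button)
            queued.append((tab["name"], button))
    return visited, queued


tabs = [
    {"name": "People", "buttons": ["Family", "Friends", "Strangers"], "activated_by": None},
    {"name": "Family", "buttons": ["Mom", "Dad"], "activated_by": "Family"},
    {"name": "Friends", "buttons": ["Alex"], "activated_by": "Friends"},
    {"name": "Mood", "buttons": ["Happy"], "activated_by": None},
]
choices = {"People": ["Family"], "Family": ["Mom"], "Mood": []}
visited, queued = whack_progression(tabs, lambda tab: choices.get(tab["name"], []))
```

In this run the "Friends" tab is never shown, because its activating button was not pressed, while the "Mood" tab is visited and skipped without queuing a tag.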
creating tabs and collections
This activity also usually only occurs when a new tag vocabulary is being
created
by the user, which is also not an extremely common activity.
Tabs serve as top-level nodes in a given tag vocabulary: very broad categories
into which other tags will be sorted. "Well designed" tag vocabularies will
have
few top level nodes (likely less than 10) that will act as "tabs" in the Whack-
A-Tag
interface: many more nodes will be buttontabs or buttons, and will be far more
likely to be created 'on the fly' as the user examines the data file to
determine
what metadata is needed.
adding tags to the tag vocabulary via the whack interface
The user can approach the requirement from two points of view, depending on
their particular thought processes, relevant to the data being tagged:
1) they can type the tag, then, after typing, decide if it is a category
(button object hosted on the current tab that is also linked to a new tab
object) or a keyword (button object hosted on the current tab). Depending on
their decision, they press either the "Tab" or "Enter" key, respectively.
2) they can click a toolbar button to create either a Tab, ButtonTab or Button
(for 'top-level category', 'category' and 'keyword', respectively) then type
to apply text to the control. When the "Enter" key is pressed, creation of the
preselected type of object (indicated by the button pressed on the toolbar) is
finalized.
start typing
In order to create a required tag while examining a given data file, the user
need
only begin to type the text.
If the user anticipates a requirement to create further sub-tags (additional
specificity) under the tag they are creating, they will finalize the button
creation
operation by pressing the "tab" key to create a subtab (rather than just a
button).
Once the Tab key is pressed, the corresponding tab is created, and gets
foregrounded.
If the user does NOT want to make further subitems under the item being
created, they instead press the "Enter" key to complete the creation of the
button.
This does not activate ("Whack") the button, until the "Enter" key is pressed
again. In addition to using the enter key, an actively growing button can also
be
finalized by clicking it with the mouse.
During the process of creating a new button, once a few characters have been
typed, the TagBag will display buttons which have related names, in case the
user might prefer one of those. Mouseover will show a tooltip giving the
absolute
path of the button in the vocabulary.
use a toolbar button
Instead of using the keystrokes after typing, the user can instead click a
toolbar
button to create an object in the Whack interface that will receive subsequent
typed characters.
press a button in the tagbag
The inventions associated with the TagBag will gather keywords from various
sources considered by proprietary algorithms as 'suggestions' to the user, in
the
current context. These buttons may or may not represent keywords that have
already established positions in the user's overall available tag library: the
ones
that do NOT will be created as new items in the structured tag vocabulary.
Depending on whether or not there are relevant sub-items, the new node will be
created either as a 'category' or a simple 'keyword'.
adding tags to the tag vocabulary via tree interaction
Context menus accessible by a secondary-mouse-button click, are presented to
the user, for the nodes in the tree that are selected. For creating a node as
a
descendant node of the node clicked, the menu will allow selection of the node
type. Nodes are presented in the user interface as buttons or tabs in the
Whack interface, or as nodes in tree structures. Some operations apply to the
entire selection, and some only to the actual clicked node.
rclick a button or tab node
A secondary-mouse button click on a Tab or Button control in the Whack
interface presents a context menu from which the user can choose to "Create
New...".
Should they choose to create a new object as a descendant of an object that
currently has no descendants, the nature of that object will change from a
button
to a buttontab in the Whack-A-Tag interface.
rclick a tree node
A secondary-mouse button click on a node in the All Tags tree will present a
context menu from which the user can choose from a number of "Create New..."
menu items.
If a tag vocabulary root node was clicked, the node created will take a
"Whack-A-Tab" default of being a "Tab" node. The user must click the checkbox
next to the node to add the node to the TagSet tree and thereby to the Whack
interface.
If any other node was clicked, the node created will take the "Whack-A-Tab"
default of a button node. Adding further descendants to a button node changes
the parent node's default "Whack-A-Tab" behaviour to a "Buttontab".
The user can also create a "collection" node, that will have as its default
Whack-A-Tab behaviour a Tab if at the top level of a tag vocabulary, or a
Buttontab node if below the top level. A collection node does not cause
metadata to be created, but exists only for organization of the buttons and
tabs and their progression.
Since the collection nodes do not create XML, they can be rearranged at any
time, to balance the number of buttons visible with the number of clicks to
get to
the desired button. Some examples of collection nodes might be to organize
some keywords into letters of the alphabet, or states by region of the
country.
The XML is complete just naming the state, and does not benefit from naming
the region of the country where it was. But the tab gets significantly less
clutter.
Another option is to put less frequently used buttons into a collection called
'rarely used' which will also unclutter the display. At any time the user can
navigate to the rarely used button and press it, getting display of the
contained
buttons. Also, things can be dragged in and out of such nodes at any time. The
context of the name of a collection is the nearest ancestor node that makes
XML
or is a root node. All names within that context, including names of
collections,
must be unique.
interactive help
Existing tagging tools do not offer users a glossary pertinent to the tag
vocabulary being used. SwapNeat integrates three distinct types of interactive
context-sensitive help for tag vocabularies, and provides user interfaces for
creation of same by users for their custom shared tag vocabularies.
creation guidance
Some tag vocabularies are complete unto themselves and are designed in such
a way that the user should make use of the existing tags, and not extend the
tag
vocabulary on their own. They are managed under revision control and
'moderated' by a single user.
In cases where end users are ENCOURAGED to create their own nodes,
'creation guidance' gives the user tips on the nature of the nodes to be
created:
rather than try to anticipate all the nodes that will be required, the
moderator of
the tag vocabulary can provide 'creation guidance' that will inform the user
"if you
want to create a node in this location, it should be of the following
nature:". The
moderator will describe that nature, and the user can create nodes freely.
Furthermore, as SwapNeat Metadata Studio supports hierarchical tag
vocabularies, the creation guidance can be created in advance to guide users
to
create deeply nested structures, one level at a time. For example, the
'creation
guidance' node under a State of the United States may say "Create a node for
the name of a municipality". Then once that node is created, the creation
guidance for the newly created 'municipality' node will suggest "Create a node
for
the name of a neighbourhood." down to the street, number of the address of a
building, unit number within a multi-dwelling building, to the room in that
dwelling.
None of the nodes exist in advance: the guidance on what structure to create
is given for each individual structure, rather than a lengthy verbose
explanation of the deep structure suggested.
whack prompt
When considering application of a given tag from a given category in a given
tag
vocabulary, the 'whack prompt' suggests to the user applying the tags what they
should consider about the file being tagged when determining which tag to
apply
(or whether a new tag is needed). This is similar information to the creation
guidance, from the other point of view: where creation guidance is the nature
of
tags you create in the tag vocabulary at a given position, whack prompt is the
type of content to consider in the file, from which you will choose the most
appropriate tag from the foregrounded tab.
detailed description
General documentation about the semantic purpose of the tag itself. The Creation
Guidance and Whack Prompt are relevant to the CHILD nodes of a tag, while the
detailed description is relevant to the tag itself.
tutorials
Tutorials are presented to the user in the context of the operational mode they are
currently in, and the task they are performing. These are presented as a
simple
list, although it will be possible in future versions to associate specific
actions in
the user interface with specific tutorials.
tagbag
the tagbag is a special area with buttons considered appropriate to the
current
task.
there are several kinds of information that can be presented and accessed via
the tagbag
following is a list of the kinds of information, its source and its use when a
tagbag button is clicked
snmd
SNMD is a generalization of XMP, allowing an expanded range of nesting and
element types to be constructed into a coherent nested metadata description.
SNMD format is specified as well formed xml, having a single top level element
sn:snmd where sn: is the standard swapneat prefix with URI '...sn.sntv'.
in addition to being well formed, there are further requirements for valid
SNMD:
all metadata must conform to a sntv file that exists or could exist
attributes on nodes are allowed only from a limited set of sn: attributes
any element may contain nested elements unless it contains a parameter value
(content)
any nested element must either be a designated valid child element of its
parent, or a top-level element from any other sntv file. Nesting of top level
elements from the same sntv as the parent element is not supported.
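Two of the requirements above (attributes limited to a set of sn: attributes, and no element holding both a parameter value and nested elements) can be checked with a standard XML parser, as in the sketch below. The namespace URI `urn:example:sn` and the attribute name `id` are placeholders, since the real swapneat URI and attribute set are not given here; conformance to a sntv file is not checked.

```python
import xml.etree.ElementTree as ET

SN_NS = "urn:example:sn"            # placeholder; not the real swapneat URI
ALLOWED_ATTRS = {f"{{{SN_NS}}}id"}  # hypothetical "limited set" of sn: attributes


def check_snmd(xml_text):
    """Return a list of rule violations found in an SNMD document."""
    root = ET.fromstring(xml_text)  # raises ParseError if not well formed
    problems = []
    if root.tag != f"{{{SN_NS}}}snmd":
        problems.append("top-level element is not sn:snmd")
    for elem in root.iter():
        has_value = bool(elem.text and elem.text.strip())
        if has_value and len(elem) > 0:
            problems.append(f"{elem.tag}: has both a parameter value and nested elements")
        for attr in elem.attrib:
            if attr not in ALLOWED_ATTRS:
                problems.append(f"{elem.tag}: attribute {attr} is not allowed")
    return problems


good = '<sn:snmd xmlns:sn="urn:example:sn"><place><city>Toronto</city></place></sn:snmd>'
bad = ('<sn:snmd xmlns:sn="urn:example:sn">'
       '<place color="red">Canada<city>Toronto</city></place></sn:snmd>')
```

Running `check_snmd(bad)` reports two problems for the `place` element: it mixes a value with a nested element, and it carries an attribute outside the allowed set.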
qualifiers
qualifiers are a specific case of nested content
a qualifier node is intended to 'qualify' the parent element, much as
attributes
qualify xml elements
qualifiers may optionally be inserted as attributes on nodes
tag vocabularies
tag vocabularies specify the element nesting rules, parameter types, and
provenance of xml vocabularies used for swapneat metadata
a swapneat tag vocabulary is stored in a file with a sntv extension
a sntv is more powerful than a dtd because it can restrict sub-elements from a
vocabulary to specific nodes, and distinguish between identical elements
appearing at alternate points in a tree.
optionally, the sntv file can be configured so that elements with different
nesting
locations cannot have the same name, to support unambiguous identification of
nodes throughout a specific tag vocabulary, and to support enforcement of
content model validation using standard XML processing techniques (DTD / XSD
and validating XML parser)
the spec for the sntv file format is here
snsi files
snsi files are searchable items that encode metadata about metadata
in such a way that the latter can be efficiently located using the same kind
of
searches as are used to find files with structured metadata in them.
a snsi file is treated like a file object, but it points to a tag vocabulary
elsewhere,
perhaps even on the internet. provisions to access the vocabulary when
selecting
an snsi file are available
the specification for the format of a snsi file is here
snsi files are generated at the time of publishing of the tag vocabulary. they
can
be stored locally or uploaded to swapneat.net for public distribution. in the
case
of public distribution, a compendium of the metadata in the most recent snsi
files
uploaded is made available to users when they check for updates. this
information is downloaded in compressed form and augments the database of
searchable items. owing to the richness of metadata in a snsi file, the user
is
sufficiently informed to be able to select among them to get the best
vocabulary
that would fit in the current situation
The snsi files get updated when the owner updates the tag vocabulary. users
have the option of accepting this published information and the option of
having
this information displayed in the tagbag so that supported extensions to their
vocabularies can be easily chosen and used
it is possible to do a search and restrict it to snsi files only, in order to
locate a tag
vocabulary relevant to a certain tourist destination or to search for a
friend's name
among vocabularies shared among family members, for instance.
tabsets
the specification of tabs and their buttons obeys certain rules
not all buttons and tabs from a tag vocabulary need to be present in the whack
interface
the subset that is present, and the valid qualifiers accessible to each
metadata
item are contained in a snts file
the specification for a snts file is here
loading a snts file into the GUI results in a reconfiguration of the buttons
and tabs
of the whack interface
multiple snts files can be overlaid to provide a logical union of the tag
references in all files.
tabs
Within a given tag vocabulary, the nodes immediately nested within the root
node
typically become "Tabs", if present in the Whack-A-Tag interface: Since there
is
no parent "Tab" in the Whack-A-Tag interface, there is no way for them to be
"buttontabs" (dependent on a button on an ancestor tab to be pressed).
buttontabs
Buttontabs provide an elegant way to 'branch' in the Whack-A-Tag process: if
there are to be a number of branches used, it may be impractical to offer the
user
the same set of tabs containing buttons (tags) for every file... the fact is,
that the
user may be tagging hundreds of files that have common tags across many files,
but it may be inefficient to present the user with tags for particular
branches of
the tag vocabulary for every file.
Buttontabs give the user the opportunity to choose which branches are relevant
to the particular file being tagged; in a sense it's a meta-layer on top of
the
TagSet (which is itself a subset of the Tag Vocabulary).
For example, consider the task of tagging a collection of photos of a party
where members of one's family, some close friends, some mere acquaintances,
and some strangers were in attendance. The TagSet in use would present the
"People" node as a Tab, and present three of the four classes of people:
"Family", "Friends", and "Neighbours", as ButtonTabs, and present a leaf node
tag for "Strangers" as a button. When the user is presented with a particular
photo that only has two persons as subjects, and both persons are members of
their immediate family, a click on the "Family" button would cause the
"Family" tab to be foregrounded at the appropriate time, but the user would
not be presented with tabs for "Friends" or "Neighbours". Along those lines,
suppose the photo contained three neighbours known by specific names, and a
couple of strangers. The user would hold the CTRL key to click both the
"Neighbours" and "Strangers" buttons. Strangers, just being a button, is
complete as is, but Neighbours, as a buttontab, would present the destination
"Neighbours" tab to the user, from which they would choose the specific
neighbours.
Suppose the neighbours did not already have tags present in the 'Neighbours'
branch of the tag vocabulary. Depending on whether or not the user wanted to
classify their neighbours by surname or firstname, the user could type the
tags
quite quickly as follows:
If they wanted to create tags for a particular Surname first, they would type
the surname, and press the Tab key. Pressing the Tab key after typing a tag
name would create that node as a Tab in the Whack-A-Tag interface, as an
immediate descendant of the node represented by the current tab (in this case,
the
"Neighbours" node). Suppose the name they type is "Smith". After releasing the
Tab key, the "Smith" button is created on the "Neighbours" tab, and the user
is
immediately taken to the destination tab, where they can create the button for
each individual from that family. After typing the first name of one of the
persons pictured (suppose the name is "Margie"), pressing the "Enter" key
would commit the text to a button, rather than a buttontab.
buttons
Buttons are analogous to leaf nodes in the structured tag vocabulary. They are
the most 'specific' data point relative to a particular branch of the tag
vocabulary.
At any time the user can apply further detail by adding any number of
descendant nodes under that particular node, but this makes the node
unsuitable for use as a "button", since there would be no convenient way using
the Whack-A-Tag method to access the detail provided by the descendant nodes.
Conversely, sometimes a level of detail is unnecessary or undesirable. If the
user
is in a rush to get a large number of files tagged with unique but correct
combinations of tags, they may choose to use a TagSet where nodes having any
number of descendants are rendered in the Whack-A-Tag interface as simple
buttons: they may see no value to further detailed tagging and therefore just
have
buttons instead of buttontabs.
Any node can be rendered as a button in the Whack-A-Tag interface, but so
rendered, no descendant nodes are available via the Whack-A-Tag interface
using that TagSet.
bypass
The conflicting desires of maintaining the rich contextuality provided by
structured tag vocabularies, while simultaneously making the tagging process
as efficient as possible, are served by the ability to bypass tags in the
Whack-A-Tag process.
Any tag that has a bypassed parent will appear on the nearest not-bypassed
ancestor tab as a button.
Suppose you have the rich tag vocabulary structure:
Places
  Canada
    Ontario
      Toronto
        Downtown
          Eaton Center
            -2 Level
              Food Court
and are tagging a number of photos. Because your photos do not include food
courts of many distinct shopping centers, it is sensible to bypass many of the
tags. For a specific TagSet, being used to tag pictures from a visit to
Toronto, it
would make sense if the following structure was represented in the Whack-A-Tag
process:
Places
  Toronto
    Eaton Center
      Food Court
The "Places" node would appear as a Tab in the Whack-A-Tag interface. When
tagging, instead of having to click through both "Canada" and "Ontario" to get
to
Toronto, both the "Canada" and "Ontario" nodes would be "bypassed". The
"Toronto" node would be present in the Whack-A-Tag interface as a ButtonTab,
where the button appears directly on the 'Places' tab. Likewise, "Downtown" isn't
necessary (so it is bypassed) but "Eaton Center" is: "Eaton Center" appears on
the "Toronto" destination tab. The "-2 Level" is not relevant (so it is
bypassed),
but the "Food Court" is.
If, for example, the only "Food Court" visited on that vacation was the one in
the
Toronto Eaton Center, and the TagSet was developed specifically to tag photos
of that vacation, all the intervening nodes could be bypassed, and the "Food
Court" button would appear directly on the "Places" tab as a button.
The user tags the photo to 8 levels of hierarchical detail, without having to
interact with or navigate down 8 levels of tags.
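The effect of bypassing can be sketched as a simple filter over the path from the root to the tag. The function names are hypothetical, and representing the vocabulary branch as a flat list of node names is an assumption made to keep the example short.

```python
FULL_PATH = ["Places", "Canada", "Ontario", "Toronto", "Downtown",
             "Eaton Center", "-2 Level", "Food Court"]


def visible_path(full_path, bypassed):
    """Return the nodes the user actually clicks through to reach the tag."""
    return [node for node in full_path if node not in bypassed]


def host_tab(full_path, bypassed):
    """Return the nearest not-bypassed ancestor of the last node, on whose
    tab that node appears as a button (None if every ancestor is bypassed)."""
    for node in reversed(full_path[:-1]):
        if node not in bypassed:
            return node
    return None


some = {"Canada", "Ontario", "Downtown", "-2 Level"}
everything_between = set(FULL_PATH[1:-1])
clicks = visible_path(FULL_PATH, some)
```

The metadata written to the file still uses all of `FULL_PATH`; only the number of interface steps changes with the set of bypassed nodes.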
mode-specific progression
Each tagging mode ("Batch", "Selective" and "Single") presents the same set of
tags for use with a given dataset.
During the whack progression through the categories (foregrounding the next
tab
after a tag on the current tab is pressed) certain tag categories are
pertinent to
ALL the files in the Thumbstrip, others may be to many but not all, and still
others
are best considered one file at a time.
There is a balance to be struck between tagging completeness and tagging
accuracy: associating an incorrect tag with a file can be worse than not
associating any tag for a given category with that file. Doing a search for
music with a "calming" mood and having that search result include a song that
evokes an "aggressive" mood, because "calming" was erroneously tagged into a
file, would be more disruptive to the purpose of the search result than if
that file was not included in the search result at all due to the absence of
any "mood" category tag.
pinning / same auto
A tag can be "pinned" to facilitate automatic application of a tag without the
user
having to proactively apply that tag for every file. The user interface
consists of a
button on the toolbar that turns the current mouse pointer to resemble a "push
pin", and when a button is tagged, that button is 'pinned' down: that button
will be
depressed for every subsequent file, until such time as the cursor is placed
in
"pin" mode. When the cursor moves over a pinned button, it becomes the 'pluck'
cursor (a hand with pinching fingers, as if to grasp and pull out the pin).
A similar function from a different point of view: the user can press the CTRL
key
to click a number of buttons on a single tab that are applicable to the
current and
many subsequent files: before releasing the CTRL key, they also press the
"SAME AUTO" button, which will effectively pin all the clicked buttons
simultaneously.
This only differs from the "Pin" cursor operation in that one is proactive
(user clicks "PIN" then clicks one or more buttons to pin) and the other
reactive (user is applying one or more tags from the foregrounded tab, then
realizes that these same tags are applicable for subsequent files, so they
click "SAME AUTO", either while still holding control, or by revisiting the
tab). Fewer mouse clicks are required, and both proactive and reactive thought
processes are supported.
skip auto
The opposite of "same auto" is "skip auto" where the user realizes that a
particular category is not applicable to the files in the current file set, so
they click
the "SKIP AUTO" button.
Subsequently, the tab will be automatically skipped by operations that seek
the
next tab. The user can still navigate directly to the tab by clicking its tab
symbol in
the whack interface, but does not need to dismiss the tab for each image.
The information about which tabs to auto-skip is retained in a mode-selective
way. Different tabs can be skipped in batch mode than are skipped in selective
mode, or single mode. As a result, a single selection of tabs can be tuned to
work
well in all 3 modes.
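The mode-selective retention of auto-skip information could be held in a structure like the one below. The class and method names are hypothetical, and tabs are identified by name only for brevity.

```python
MODES = ("batch", "selective", "single")


class SkipAutoState:
    """Remembers, separately for each tagging mode, which tabs to auto-skip."""

    def __init__(self):
        self.skipped = {mode: set() for mode in MODES}

    def skip_auto(self, mode, tab):
        """Record that `tab` should be skipped automatically in `mode`."""
        self.skipped[mode].add(tab)

    def next_tab(self, mode, tabs, current):
        """Return the first tab after `current` that is not auto-skipped, or None."""
        start = tabs.index(current) + 1
        for tab in tabs[start:]:
            if tab not in self.skipped[mode]:
                return tab
        return None


tabs = ["People", "Places", "Mood"]
state = SkipAutoState()
state.skip_auto("batch", "Places")
```

Here "Places" is passed over when seeking the next tab in batch mode, but is still offered in single mode, so one tab selection serves all three modes.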
A skipped tab can be shown as having an icon on it that indicates it will be
skipped, or just no icon. When a tab has no icon on it, it's either a
buttontab
destination which has not been activated for the current image, or it's a
skipped
full tab.
saved searches
a saved search is a snsq file. snsq stands for Swapneat Search Query.
Saved searches can be combined in order to add high level meanings to
different
components of the search. Options like certain people or activities can be
saved
into partial search queries, and later combined to make more complex searches.
the specification for snsq files is here
A snsq file is an XML formatted file containing elements which refer to
keyword elements and specify whether the element is AND, OR, or NOT in that
component of the search. When combining snsq files, all the OR terms are
aggregated together; likewise, the AND terms are collected, and the NOT terms
are collected, to produce a full OR/AND/NOT search request.
When a snsq file is loaded and made active, the search information for the gui
is
reconfigured to reflect the various AND, OR, and NOT elements in the snsq.
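The aggregation of partial queries can be sketched as below. Queries are represented as plain dictionaries rather than parsed snsq XML, and the matching rule (all AND terms present, no NOT term present, and at least one OR term present when any are given) is an assumption about how the combined request is evaluated. The tag names are made up for the example.

```python
def combine_queries(*queries):
    """Merge partial queries by collecting their OR, AND and NOT terms."""
    combined = {"or": set(), "and": set(), "not": set()}
    for query in queries:
        for kind in combined:
            combined[kind] |= set(query.get(kind, ()))
    return combined


def matches(file_tags, query):
    """Evaluate a combined OR/AND/NOT query against one file's tags."""
    tags = set(file_tags)
    if query["not"] & tags:
        return False  # an excluded term is present
    if not query["and"] <= tags:
        return False  # a required term is missing
    return not query["or"] or bool(query["or"] & tags)


people = {"or": ["Margie", "Smith"]}
activity = {"and": ["Party"], "not": ["Blurry"]}
query = combine_queries(people, activity)
```

Saving "people" and "activity" as separate partial queries lets each be reused in other combinations without being rebuilt.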
episodes
The term "Episode" refers to a short period of time in which a relatively
large
number of photos were recorded. The idea is that if a number of photos are
captured in a short period of time, with relatively few photos captured in the
time
before and after, then the group of photos in the interval is likely to
require some common metadata, and it makes sense to consider the photos
together in order to aid the process of both applying common data to the
group, and distinguishing subtleties among the photos, in order that more
accurate tagging can be accomplished.
Episodes are displayed in tree form and assist in batch application of metadata
to groups of related photos with fewer user operations required. Facilities
exist in the user interface to identify the episode containing a photo, or to
show other photos from the same episode.
An episode is defined as a time interval in which the number of photos exceeds,
by a factor, the number of photos taken in time intervals of related size before
and after the episode interval (the guard regions). The definition is tolerant
of a large range of picture-taking rates, since the time period from the first
to the last picture is used to determine the size of the guard regions. An
episode is considered valid if it has M times more pictures in the episode than
in the guard region before and in the guard region after. The factor M is
usually set to a number like 5, the episode minimum is best set to a number like
4, and the guard interval ratio is best set to a factor like 1.0.
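The validity test above can be sketched as follows, under stated assumptions:
timestamps are plain numbers (for example seconds), the guard regions are the
half-open intervals immediately before and after the candidate interval, and
"M times more" is read as "at least M times as many". The function name
is_valid_episode and its parameter names are hypothetical.

```python
from bisect import bisect_left, bisect_right


def is_valid_episode(timestamps, first, last, m=5, episode_min=4, guard_ratio=1.0):
    """Check a candidate episode spanning [first, last] against its guard regions."""
    ts = sorted(timestamps)
    # The guard size scales with the duration of the candidate episode.
    guard = guard_ratio * (last - first)
    # Photos inside the episode interval [first, last].
    inside = bisect_right(ts, last) - bisect_left(ts, first)
    # Photos in the guard region before: [first - guard, first).
    before = bisect_left(ts, first) - bisect_left(ts, first - guard)
    # Photos in the guard region after: (last, last + guard].
    after = bisect_right(ts, last + guard) - bisect_right(ts, last)
    return inside >= episode_min and inside >= m * before and inside >= m * after


burst = [1000, 1010, 1020, 1030, 1040]
isolated = is_valid_episode([0] + burst + [5000], 1000, 1040)
crowded = is_valid_episode([980, 990] + burst, 1000, 1040)
too_small = is_valid_episode([1000, 1010, 1020], 1000, 1020)
```

Because the guard size is derived from the episode duration, the same test
applies to a burst lasting seconds and to a holiday lasting days.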
Episodes can have sub-episodes. A sub-episode is a secondary structure which
takes into account the average rate of photos in the parent episode. The average
rate is compensated, and any residual clumping can be used to determine
sub-episodes. Once the time interval for a sub-episode has been determined, all
the photos from that interval are considered members of the sub-episode.
The algorithm for determining whether a supplied episode has any sub-episodes
is, loosely put, as follows: an episode can have sub-episodes if it has at least
episode_min images, if after removing the average rate of photos the remaining
number of images exceeds episode_min, and if at least 2 sub-episodes will be
created within the time interval of the parent episode. Although it is also
possible to determine a single sub-episode within an episode, and call the rest
leftovers, the nesting depth that results and the lack of precision in
determining the episode boundaries make this less useful.
pipelined workflow
The swapneat metadata studio is designed to be a tool that assists in the
efficient application and storage of metadata. Operations such as pre-fetch and
post-processing are performed while the user is concentrating on the tagging
task for the current image, so that the user is not delayed by these steps in
the computation. This helps keep the process flowing by eliminating times of
boredom or distraction while the images are being saved or reloaded. Also, the
process of applying metadata is done in a way that does not require
re-compression of the image data, so that it is a lossless process. This removes
any concerns about picture quality degradation that could cause distraction if
tagging within an image editor program.
There is a trade-off between pre-fetch size and efficiency. If proceeding in
direct sequence is common, then a small pre-fetch is sufficient to always be
ready. However, since the 'same as last' selection takes very little time to
apply, it's good to have a few extra images preloaded to help keep up with
short-term speed mismatches between the tagging operation and the data load and
store operations. Since the swapneat metadata studio processes and displays all
of the image metadata, not just a single keyword field, the amount of processing
that needs to be done is significant enough that pipelined operations make
sense.
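One way such a pre-fetch pipeline could be arranged is sketched below. It is
an illustration only and not the application's implementation: the function
name pipelined, the depth parameter and the use of a thread pool are
assumptions. While the caller works on the current item, up to depth later
items are being loaded in the background.

```python
from collections import deque
from concurrent.futures import ThreadPoolExecutor


def pipelined(paths, load, depth=3):
    """Yield (path, loaded) in order while up to `depth` later loads run ahead."""
    with ThreadPoolExecutor(max_workers=depth) as pool:
        pending = deque()
        for path in paths:
            # Start loading this item in the background.
            pending.append((path, pool.submit(load, path)))
            if len(pending) > depth:
                # Hand over the oldest item; `depth` newer loads stay in flight.
                ready_path, future = pending.popleft()
                yield ready_path, future.result()
        # Drain whatever is still queued once the input is exhausted.
        while pending:
            ready_path, future = pending.popleft()
            yield ready_path, future.result()


# Stand-in for an expensive load: upper-casing the name.
results = list(pipelined(["a.jpg", "b.jpg", "c.jpg", "d.jpg"], str.upper, depth=2))
```

A larger depth absorbs longer bursts of fast 'same as last' tagging at the
cost of memory held by images that have been loaded but not yet viewed.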
metadata redecoration after editing
The metadata can be copied out of a file, and the file optionally purged of
metadata, allowing the use of a third-party editor such as Gimp; the metadata
can then be re-applied to the file after the user indicates that the editing
process is complete.
Optionally, the metadata from the original file can be applied to the result of
an editing process.
Information about image size, bit depth, etc. is not re-applied, since the
editor might have changed these. Only information that would continue to apply
to an edited file is re-applied.
A list of metadata items considered to be integral to the file, and not
re-applied or stripped, is here.
Reference to exiftool as a redecoration aid.
The fundamental use of metadata is to facilitate accurate search: anything that
the list of files returned by a search can be used for is an application of
metadata.
Once the file list has been returned, the files can be used in other
applications, including extraction of metadata embedded in the file, or, if the
other application does not support embedded metadata, the aggregate embedded
metadata for the file set can be provided in an external ("sidecar") file or
files.
General Usage
Integrating SwapNeat User Interface technology into third party software
Any application that supports tagging of files of any kind can benefit from the
SwapNeat technology (as described elsewhere).
The file format need not be able to accept integrated / embedded tags (sidecar
files can be saved, or tags can be saved to a database), nor does the
application have to be able to embed tags into files (SwapNeat provides
technology that can perform this task when called from the command line).
Any client software for editing files can be augmented by the addition of the
Whack Interface to enter keywords, even if the keywords are not structured. To
facilitate this, components such as the Whack Interface and the TagBag, and
certain aspects of the All Tags tree and TagSet tree, could be integrated as
'plug-ins' for any editing application.
Provided the user has a need to tag files, the Whack Interface can be integrated
for guided, efficient tagging.
Sharing Tagged Files
Embedding structured tags in media files such as photos, music and video opens
new horizons for a number of applications and uses. Keywords that describe
non-text files and travel with the data allow sharing of media files in far
more powerful ways than previously known. Prior to this application, either the
filename was used to describe the file or a database had to accompany the file.
Applications that rely on an external database are not portable from one
application to another, or they are too heavy to share.
IPTC keywords do address this issue; however, IPTC keywords do not support
true structure. Pseudo-structure can be made possible by separating keywords
with a separator character such as ".", "/", ";" etc. The problem with this
technique is that the separator character cannot be used in the keyword and
there is no standard for which separators to use. Most current applications
treat the entire string as a single keyword.
Sharing files with embedded SwapNeat tags provides users with powerful
information.
example
tag: http://www.swapneat.net/xml/s/w/apneat/swapneat/generic_draft_01_001.snt
v:/Places/Canada/Ontario/Toronto
The SwapNeat Tags Provide:
= a reference to the Namespace the tag belongs to. The Namespace
indicates the author and revision, and provides a means of acquiring the
Namespace.
= a structured tag, which means each level of the tag is a searchable
keyword itself. For example, a tag such as "/Places/Australia/NT/Brisbane"
allows the file to be searched using any of the keywords
"Places", "Australia", "NT", "Brisbane", "/Places/Australia/NT" ...
= unambiguous searching. Structured tags prevent finding the wrong results.
For instance, the two Toronto tags "/Places/Canada/Ontario/Toronto" and
"/Places/United States/Illinois/Toronto" will not be confused.
= shared, community-moderated Tag Vocabularies, which allow files to be tagged
with standardized keywords. This allows the meaning of keywords to be
globalized rather than having thousands of definitions for the same
keyword.
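The way one structured tag yields several searchable keywords can be sketched
as follows. The function name searchable_keys is hypothetical, and the sketch
assumes that "/" is the only level separator and that both the individual
levels and every leading path are searchable.

```python
def searchable_keys(tag):
    """Return every keyword and leading path under which a structured tag is found."""
    levels = [level for level in tag.split("/") if level]
    keys = set(levels)  # each level is a keyword in its own right
    for depth in range(1, len(levels) + 1):
        # Every leading path, e.g. "/Places/Canada", is searchable too.
        keys.add("/" + "/".join(levels[:depth]))
    return keys


ontario = searchable_keys("/Places/Canada/Ontario/Toronto")
illinois = searchable_keys("/Places/United States/Illinois/Toronto")
```

A search on the bare keyword "Toronto" finds both files, while a search on the
full path "/Places/Canada/Ontario/Toronto" finds only the first, which is the
disambiguation described above.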
General Applications
caterer's menus
Tag photos of different dishes with the "course" (e.g. appetizer, dessert, etc.),
which is useful for searching.
This works in recipe books, too.
assist in generating and maintaining web pages
In the making and maintaining of websites, there is a need to have images that
have alternate text which appears when the image has not been loaded yet. This
text is also useful in some screen reading programs to describe the image.
Maintaining the alternate text descriptions for images is an overhead task which
can be simplified if the process of adding the image to a web page pulls the
alternate text from the metadata already in the image. Using SNMD metadata
(or XMP, for that matter), captions and other information can also be stored
with multiple alternate language choices, so that, depending on the language
settings in the website, the appropriate text can be displayed.
Metadata can be displayed as descriptive captions for files. This description
can accompany a thumbnail version or icon representing the file, or be displayed
alongside the file itself as it is consumed (viewed in the case of photos, or
played in the case of audio or video or other 'synchronized' multimedia data).
thematic content presentations
Slideshows, playlists of music and video files, chosen to suit a particular
mood or occasion, or selected to be of a consistent theme or tempo.
mood migration
Songs and photos tagged in a way that indicates the mood they evoke in the
viewer or listener can be ranked on a progressive scale: a song can evoke
'happy' memories, or outright 'glee'. Likewise, photos or music can evoke a sad
or melancholy feeling. If a content consumer wishes to alter their current
mental state from one mood to another, they can do so through consumption of
media that evokes feelings similar to their current mental state, but
progresses over time by presenting media that trends away from their current
mental state toward a desired mental state, slowly and non-jarringly.
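A gradual progression of this kind can be sketched as an ordering problem.
The numeric mood scale, the function name mood_migration_playlist and the
sample scores below are all assumptions made for illustration; the text above
only requires that moods can be ranked on a progressive scale.

```python
def mood_migration_playlist(items, current, target):
    """Order media from the current mood score toward the target mood score.

    `items` is a list of (title, mood_score) pairs on one assumed numeric scale.
    """
    low, high = min(current, target), max(current, target)
    # Keep only media whose mood lies between the two states.
    in_range = [(title, score) for title, score in items if low <= score <= high]
    # Ascend when the target is higher than the current mood, else descend.
    in_range.sort(key=lambda pair: pair[1], reverse=target < current)
    return [title for title, _ in in_range]


library = [("dirge", -2), ("ballad", 0), ("anthem", 3), ("folk", 1), ("party", 5)]
uplift = mood_migration_playlist(library, current=-2, target=3)
wind_down = mood_migration_playlist(library, current=3, target=-2)
```

Small steps between neighbouring scores are what keep the progression from
being jarring; a real vocabulary would need mood tags fine-grained enough to
provide them.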
search for items in order to create a presentation
Embedded keywords and the database that keeps track of them can be used when
searching for a collection of photos. The resulting set of photos then has a
common theme and is appropriate for use in slideshows, etc.
keep track of provenance
Embedded metadata is the best way to keep track of the original source of an
image, and the copyright status of an image.
assist in rendering of items for presentation
Embedded metadata can encode things like the best 3:5 crop location so that,
when printing photos, the machine can be guided to reflect the user's wishes.
This information is then available no matter who prints the file.
aspect ratio dependent crop information
A file of a given aspect ratio can include in the file the ideal crop rectangle
for full-client consumption of the file. For example, popular computer screens
have aspect ratios of 4:3, 5:4 and 16:9. Most digital cameras encode their data
with a 4:3 ratio: in order to display a 4:3 image file full screen on a 5:4
screen, the sides of the image will be cropped off. In order to display a 4:3
image full screen on a 16:9 screen, the top and bottom of the image will be
cropped off. The file can include the main subject location, so that the edges
are cropped in such a way that the main subject is closer to the center of the
screen. Or arbitrary rectangles can be drawn so that the most significant area
having a "5:4" aspect ratio and, likewise, the most significant area having a
"16:9" aspect ratio are indicated.
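The cropping arithmetic above can be sketched as follows. The function name
crop_rect, the pixel-based coordinates and the choice of the largest possible
crop are assumptions for illustration; an embedded subject location is
modelled simply as an optional (x, y) point.

```python
def crop_rect(width, height, target_w, target_h, subject=None):
    """Return (left, top, crop_w, crop_h): the largest target_w:target_h crop,
    placed as close to centred on `subject` as the image bounds allow."""
    if width * target_h > height * target_w:
        # Image is wider than the target shape: crop the sides.
        crop_w, crop_h = height * target_w // target_h, height
    else:
        # Image is taller than (or equal to) the target shape: crop top and bottom.
        crop_w, crop_h = width, width * target_h // target_w
    centre_x, centre_y = subject if subject is not None else (width / 2, height / 2)
    # Clamp so that the rectangle never leaves the image.
    left = min(max(round(centre_x - crop_w / 2), 0), width - crop_w)
    top = min(max(round(centre_y - crop_h / 2), 0), height - crop_h)
    return left, top, crop_w, crop_h


on_5_4 = crop_rect(1600, 1200, 5, 4)
on_16_9 = crop_rect(1600, 1200, 16, 9)
off_centre = crop_rect(1600, 1200, 5, 4, subject=(100, 100))
```

For a 1600 x 1200 (4:3) image, the 5:4 crop trims 50 pixels from each side and
the 16:9 crop trims 150 pixels from the top and from the bottom, matching the
description above.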
collectables
It can be impractical to physically mark collectable objects with information
that identifies the object ownership or pedigree. Photos of the collectable
object can be tagged using a tag vocabulary that features tag categories that
are important in that particular class of collectable: for example, photos of
coins can include close-up photos of fine details that are not easily visible to
the naked eye, and the history of the coin can be encoded.
Pop-culture collectibles have a very different classification scheme than do
stamps, jewellery, memorabilia, insects, plants, etc., and tags from associated
tag vocabularies embedded into photos of an object can be very useful in keeping
track of important characteristics and information.
Furthermore, as some collectibles change over time, a record of such changes
can be kept with photos, and the photos can be linked as a progression over
time for a given object (plants and other living things in particular).
Personal / End-user Applications
Personal Photos Collections
With the popularity of digital photography growing each year, the need for a
more proficient personal photo management solution grows with it. The number of
personal photos archived on the average computer will continue to reach
unprecedented size. SwapNeat Metadata Studio is a large part of the solution.
By embedding structured cross compartment keywords (snmd, exif, microsoft is)
into photos, the user has the flexibility of using the MetaData Studio
application, their current photo editing software or the basic Microsoft Windows
Explorer to find and enjoy their photo collection. Using shared
community-moderated Tag Vocabularies alleviates the work required for users to
create their own vocabularies. Modular shared vocabularies allow a user to
download just the pieces they need, e.g. towns in the State of New York.
Providing a means of embedding many keywords into each file, and encouraging
it, promotes greater manageability.
Group sharing of personal photos is also a growing trend. An issue with the
current method of photo sharing is that guests need to view all photos the
publisher has shared. Generally, they are shared as albums. With shared
moderated tag vocabularies, an application can be developed that allows guests
to search out only the photos that they are interested in. An example would be a
children's baseball league photo gallery or forum. Each parent could be a member
of the group and share photos with other parents. If they all tagged the photos
with a common tag vocabulary created for that particular league, then any parent
could easily perform queries specific to their team or child.
Personal Music Collection
Creating playlists of music files is key to enjoying a portable music player,
especially when the player holds tens of thousands of songs. Currently, the
technique would be to look at ID3, MP3 tags such as song title, band, album and
genre. Unfortunately, the data in these fields are often incorrect. SwapNeat
Metadata Studio provides an efficient mechanism for correction using the Whack
Interface. The TagBag (see description above) is capable of providing
potentially useful tags, based on filename, history and internet voting.
However, even with these few fields correct, it is still difficult to generate
creative playlists. By embedding an additional variety of tags that describe
attributes such as mood, key vocals, occasion, topic, theme, etc. that are not
part of the standard metadata (ID3, MP3) compartments, music files can be
described in more detail.
Personal Video Clips
Most point-and-shoot digital cameras come with a video recorder. Cellular
phones, PDAs and even portable music players are also equipped with
camcorders. Internet websites for sharing video clips are also a growing trend.
Large collections of video clips are going to reach the same level of
unmanageability as digital photo collections have. Embedding structured shared
tags will be critical in the long term, for all the same reasons as for digital
photos.
Commercial Applications
There are countless industries that depend on digital media to do business.
Currently an Asset Management solution is required to bring coherency to the
multitudes of information and files. Unfortunately, Asset Management solutions
are expensive, large and complicated. They are also not portable. The files
cannot simply be moved from one vendor's application to another. Database
records are also linked to filenames, which makes asset management solutions
vulnerable to disk changes or disasters. Standardized structured embedded tags
are key to making asset management applications stronger or, for smaller
companies, potentially unnecessary.
copyright notice
Use "batch tag" to embed copyright information that will be carried with the
file wherever it is used.
capital equipment asset management
Take photos of equipment and any unique identifiers (serial numbers,
distinguishing physical characteristics), then embed textual metadata in the
photo of that object. That data can include original date of purchase, point of
purchase, warranty information, ownership information, proprietary item
identification number, etc.
cataloguing of contractor work
A contractor can use digital photos of a job site throughout a project to track
issues and make note of original condition prior to the start of work. The
contractor can tag these photos with the physical location where the work was
done, the type of work done, the subcontractors involved, parts and labour
invoice references, etc.
Clip Art Libraries
Clip art libraries are generally managed by some database, or the clip art is
organized by folder location and filenames; taken by itself, however, a single
clip art image needs to be viewed to have an idea of its content. Embedding tags
that describe the clip art is a far better approach.
The HVAC Industry
It is customary in the HVAC (Heating, Ventilation and Air Conditioning) industry
to take photos of installations, heating units, equipment, duct work and other
components. It is important to document installation locations, model numbers,
serial numbers and part numbers to provide warranties and aid in maintenance.
Often the photos are taken by the installer but then lost in the sea of photos
once they are deposited back at the office. If there were a tagging policy in
place that used an industry standard tag vocabulary, these photos would always
retain their context and value. Furthermore, since the keywords are
standardized, when maintenance contracts are transferred to another company, the
photos can be transferred also.
SNMD (SwapNeat Metadata) supports qualifiers and parameters which are
perfect for keywords such as Model Number and Serial Number where the
keyword is a value rather than a fixed string.
A possible tag vocabulary for this industry would be:
Hospital Equipment Inventory
Hospitals are unique institutions which require studious management of a wide
variety of medical and computer equipment. Knowledge of what equipment is in
which room is necessary in order to ensure equipment is up to date and in
working order. One technique is to photograph the equipment and its bar codes
and store the photographs in a database. Embedding information such as Model
Number, Serial Number, Purchase Date, Room Number, Equipment Name, Equipment
Purpose, Appraisal Value, last maintenance date, firmware version and software
version in a photograph of a piece of equipment can be extremely valuable.
The single file can contain all the necessary data without risk of database
corruption.
Another very useful feature would be to take a photograph of an entire hospital
room and region tag the various equipment in the photo with the same equipment
information listed above. Region tagging is associating certain tags with a
rectangular section of the photo. Hovering the mouse over the region provides a
tooltip of the tags in that region.
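The tooltip lookup for region tags can be sketched as a simple hit test. The
Region class, the tags_at function and the sample tag strings are hypothetical
and serve only to illustrate associating tags with rectangles in pixel
coordinates.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A rectangular section of a photo and the tags attached to it."""
    left: int
    top: int
    width: int
    height: int
    tags: tuple


def tags_at(regions, x, y):
    """Collect the tags of every region that contains the point (x, y)."""
    found = []
    for region in regions:
        inside_x = region.left <= x < region.left + region.width
        inside_y = region.top <= y < region.top + region.height
        if inside_x and inside_y:
            found.extend(region.tags)
    return found


room = [
    Region(100, 50, 200, 300, ("Equipment Name: Infusion pump", "Room Number: 12")),
    Region(250, 200, 400, 250, ("Equipment Name: Monitor",)),
]
over_pump = tags_at(room, 120, 60)
over_both = tags_at(room, 280, 250)
over_nothing = tags_at(room, 10, 10)
```

Overlapping regions return the tags of all regions under the pointer, which
suits equipment that partly hides other equipment in the photo.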
Crime Scene Photos
An enormous number of photos are taken at crime scenes around the world. The
management of the digital photos would be greatly aided by embedded
keywords. Having data embedded in the photos would allow sharing of photos
between police forces without extraneous paperwork or accompanying files.
Global searching of similar crimes via standardized keywords is another benefit.
A compelling use of subregion tagging would be the tagging of the investigator's
findings at the area of interest. As each new investigator studies the case,
they can easily extract the previous investigators' findings.
House Insurance Inventory
If a household tag vocabulary was standardized by the Insurance industry, home
owners could make digital photographs of their property and embed data such as
= Original Purchase Price
= Date of Purchase
= Retailer
= Item Name, model number, serial number
Potentially the photos can be used as a form of proof of purchase.
Categorizing Collectibles
Examples of collectibles are:
Stamps, Comic Books, Trading Cards, Memorabilia, Action Figures, Autographs,
rocks, gems.
Instead of purchasing a customized application to manage collectibles or
creating a spreadsheet, pictures can be taken of each item and categorized by
embedding tags such as:
Item Title, Item Description, Original Purchase Price, Purchase Date, MSRP,
Print Date, Current Appraisal, Condition of item ....
Searches can be performed on keywords such as Appraisal Value and
spreadsheets can be generated from the output. Others can perform searches on
archives for trading or purchasing purposes.
If items are tagged with a standardized controlled vocabulary, others can find
the photos more readily.
The items do not have to be physically taken out to be viewed and enjoyed.
Cataloguing collectibles
Users can contribute tagged photos to form the basis of a catalogue, built
around a standard moderated tag vocabulary: as users contribute items, they are
effectively building a catalogue. Individual users will take photos of their OWN
version of these items, tagged with standard tags from a moderated tag
vocabulary, but also tagged from a tag vocabulary that identifies the user
uniquely, and the object uniquely (quality, pedigree, etc).
In addition to standardized tags, certain tags could have standard / uniform
values: for example, the market value of a particular object could be included
in the Tag Vocabulary, and when a more recent version of the tag vocabulary is
downloaded, photos of that particular object would be automatically updated with
the new 'market value' for that object.
Each object's photo could carry a GUID that uniquely identifies that item.
Also, the owner of the photo can distribute a "metadata addendum" that could be
imported into an app, which would find the photo by its GUID and add / change
tags in a photo that is already "in the wild".
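Importing a metadata addendum could look roughly like the sketch below. The
addendum layout (a mapping from photo GUID to tags to add and tags to remove),
the function name apply_addendum and the sample GUID strings are assumptions
for illustration; no addendum format is defined in this description.

```python
def apply_addendum(photos, addendum):
    """Apply tag changes to the photos that the addendum names by GUID.

    `photos` maps GUID -> set of tags; `addendum` maps GUID -> a dict with
    optional "add" and "remove" lists. Returns the GUIDs actually updated.
    """
    updated = []
    for guid, changes in addendum.items():
        tags = photos.get(guid)
        if tags is None:
            continue  # this photo is not in the local collection
        tags.difference_update(changes.get("remove", ()))
        tags.update(changes.get("add", ()))
        updated.append(guid)
    return updated


collection = {"guid-001": {"Condition: Mint", "Market Value: 40"}}
addendum = {
    "guid-001": {"remove": ["Market Value: 40"], "add": ["Market Value: 55"]},
    "guid-999": {"add": ["Condition: Good"]},
}
changed = apply_addendum(collection, addendum)
```

Entries whose GUID is not found locally are skipped, so one addendum can be
distributed to many recipients who each hold only some of the photos.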
Real Estate Databases and Online Real Estate Websites
Photos taken by real estate agents of clients' houses could be tagged with an
industry-approved structured Tag Vocabulary. Photos of the exterior, each
bedroom, bathroom and kitchen could be decorated with tags. Each photo could
even be region tagged to describe unique features of a room.
A Tag Vocabulary might have tags such as: Address, Asking Price, Listing Date,
BedRoom Number, Square Footage, Room Footage....
With information embedded into the photos themselves, a website for a new house
listing could be generated dynamically using scripts and a database rather than
crafted manually.
Archiving photos would also be made easier.
Filing of Medical Records
Acquiring digital media such as X-rays, MRIs and retinal photographs is an
important part of providing good patient care.
Archiving of Paper Records
There are a number of industries where the employees are still documenting
on physical paper. The paperwork is extremely valuable to the company and in
some cases must be put into storage for future reference. If all paper documents
could be scanned and then tagged with a controlled vocabulary, these documents
could be saved and retrieved in a far more effective and efficient manner.
There are countless industries that still retain documents from before the
switch to computers. These historical records could also be scanned and tagged.
In medical or research laboratories, previous articles or documents of
experiments often need to be referenced. Rummaging through crates of old
documents is not efficient. Having these documents online and searchable is far
more effective.
The judicial system could also benefit from this type of digitization of
historical documents.
Accounting System support
Scan an invoice and whack the particulars into the image file to build an audit
trail.
The account descriptions in accounting packages use a nested hierarchy of
accounts (supplies, services, capital, etc.), which is well suited to the SNMD
ability to search on nodes or parent nodes. The items on the invoice could also
be photographed and associated with the invoice file.
When payment is made, scan the cheque that was issued to pay for it.
Invoices are well suited to hierarchical data: line items having fields, etc.
Enhance this with a report generator: such a report generator would have other
applications as well, like generating catalogues of items from embedded
metadata (rather than a separate parts database).
Whack FILENAMES of EXISTING IMAGES (like an invoice file name) into the
photo of the objects and the photo of the cheque that paid the invoice, linking
photos together via embedded metadata (rather than a separate document or
database).
Support of financial transactions
Send a digital photo of a financial document: the transactors agree upon a
standardized tag vocabulary, the document is digitized, and the embedded
metadata supports the transaction and the audit trail.
Commercial Music / Song Files
Sound Track Libraries
Sound Effects Libraries
Example tags for a sound effect file
= duration
= 'attack ratio'
= pitch
= noise content
= humour
= realistic-ness
= pitch range
= 'closed endedness (does it have an ending)'
= text description (just for completeness)
= royalty status
= cost
= licensing issues
The SwapNeat Metadata Studio Application
Generally, the application provides features that allow the user to access and
manage
= files
= metadata
= metadata applications
Appearance and Innovations in Metadata Presentation
[Please refer to Figure 1 in Appendix 1]
The SwapNeat Metadata Studio application is designed to provide the user with
a workspace that they can customize to maximize efficiency for embedding
metadata into digital media.
The dockable, resizable and closable panes allow the user to adjust the
application to their needs, objectives and preferences. The various panes
provide file system access and present metadata in different views, each
emphasizing different aspects of the data and facilitating various operations.
The sections that follow will describe the various panes and how they make file
management via metadata embedding and viewing more efficient:
The features of the application are grouped into panes. The panes are named:
= import
= library
= thumbstrip
= preview pane
= all tags tree
= properties
= statistics
= guidance
= metadata chart
= metadata tree
= tagset tree
= tagset chooser
= whack-a-tag
= tagbag
= search pane
= organize
One of the design principles of the SwapNeat Metadata Studio was to reduce the
need for mouse movement and extraneous clicking. Buttons on tabs are the
prevalent control in the application. When possible, prefetching of files and
automatic navigation of tabs and thumbnails are performed.
Context-sensitive mouse pointer icons
The icon will change when the object underneath the mouse pointer will operate
differently when the primary mouse button is clicked: the CURSOR indicates
what the click will do.
The button icons show the origin or other information, distinct from the 'what
happens when clicked' action.
When the button icons change to indicate what the click event will do, the mouse
pointer icon need not change... at that point, the object property indicated by
the OLD button icon is irrelevant to the click action.
Basic
= [Please refer to Figure 2 in Appendix 1 ] - Can be used in place of the
default OS pointer
= [Please refer to Figure 3 in Appendix 1 ] - A mouse click or drag operation
is not allowed where the pointer is currently located
= [Please refer to Figure 4 in Appendix 1] - A mouse click on this object is
not allowed in the free or trial version
Whack
= [Please refer to Figure 5 in Appendix 1] - Whack this tag
= [Please refer to Figure 6 in Appendix 1] - Whack out this tag
= [Please refer to Figure 7 in Appendix 1] - Pin this tag
= [Please refer to Figure 8 in Appendix 1] - Unpin this tag
Recent Addition
= [Please refer to Figure 9 in Appendix 1] - The tag under the mouse
pointer was recently added to the Whack interface.
Navigate
= [Please refer to Figure 10 in Appendix 1] - This button represents a tag but
is the last step of a series of buttontabs. Clicking this button will take the
user one step back in the series, AS WELL as unwhacking the button.
= [Please refer to Figure 11 in Appendix 1] - This button represents a Tab,
on which another button is whacked. Clicking the object will navigate to
the tab, and not just unwhack the button.
Image Preview
= [Please refer to Figure 12 in Appendix 1] - When in "colour whack" mode,
and a photo file is displayed in the preview pane, this cursor is used to
click an object to sample the colour in a specific region of the image, and
that colour can be applied to another tag as a qualifier.
= [Please refer to Figure 13 in Appendix 1] - When a photo file is displayed
in the preview pane at "100% view", it may be much larger than the area
of the preview pane: the mouse cursor changes to the Hand to indicate
that the photo can be dragged around with the mouse.
= [Please refer to Figure 14 in Appendix 1] - Image drag icon.
Attractor mover
= [Please refer to Figure 15 in Appendix 1 ] - An attractor node is being
dragged and moved to a new position in a tree.
Panes
Import
[Please refer to Figure 16 in Appendix 1 ]
Users require the ability to specify which types of files the application should
manage, and where on the file system those files are located. Additionally, they
require a facility to bring newly created files under application control.
The Import pane provides this basic capability.
Users generally store files of similar types together in a single branch of the
file system.
Users can specify for each type of file managed (photos, music, video) where the
common ancestor node ("Library Root") for that file type is. Common locations
are the "My Photos", "My Music", and "My Videos" folders in the user's "My
Documents" folder. Setting the root folders for each distinct file type
supported is done in the "Options" dialog (described later).
Once folders are under management of the application, the folders are
represented in the Library pane.
As new files are created or downloaded, they will be moved (or copied) from
locations that are external to the user's "Library" through the features
presented in the "Import Pane".
Files that are on external media are first copied to a temporary folder on the
local hard drive known as the 'Import Tray'. The user has the option of deleting
the files from the source location once copied.
When the copy action is completed, the files in the Import Tray are represented
as thumbnails in the Thumbstrip Pane (described later).
By selecting individual tiles in the Thumbstrip, the user can import the
selected files to a specific folder in the Library, and repeat the process for
as many different destination folders as desired, clicking on the "Import
Selected Items" button for each file set. Or, they can import all files to the
final destination folder in the Library with a click on the "Import All" button,
or manually select individual files in the Thumbstrip.
The Library Pane
The Library pane is a multi-view representation of the folders and files managed
by the SwapNeat Metadata Studio application. The archive of folders and files
can be displayed in three different modes: the Folder Mode, Episode Mode and
FileSystem Mode.
= [Please refer to Figure 17 in Appendix 1] - Add additional folders to the list
of managed folders. Clicking this toolbar button will bring up a standard folder
chooser dialog.
= [Please refer to Figure 18 in Appendix 1 ] - Show each folder as an
individual line item in the library pane, with multiple columns of properties.
= [Please refer to Figure 19 in Appendix 1] - Show files organized by
"Episodes" (discrete time intervals) regardless of their position in the file
system
= [Please refer to Figure 20 in Appendix 1 ] - Show a hierarchical file system
folder tree with bolded text on folders that are managed by the application
= [Please refer to Figure 21 in Appendix 1] - Refresh the three distinct
Folder views manually: this is useful when actions outside the application
affect the contents of monitored folders in a gross way that the application
does not detect.
Folder Mode
[Please refer to Figure 22 in Appendix 1]
The Folder mode is a multi-column list of all the folders within the SwapNeat
database. File system hierarchy is not represented in this mode because, in many
cases, the folder's name provides enough information for the user to navigate
the folder list successfully. This mode also allows for arbitrary sorting of
folders based on the name of the folder, the number of files contained in each
folder, or the average number of tags per file in the folder, without the
overhead of trying to maintain the position in the hierarchy.
Each folder that contains media files is listed separately.
The headings for the columns are:
= "Folder Name"
= "File Count"
= "Untagged File Count"
= "Average Tags in a File"
= "Unique Tags in the Folder"
= "Date of Newest File"
= "Date of Last Modified File"
= "Date of Last Created File"
= "Full Folder Pathname".
Each column can be sorted, and the order of the columns from left to right can
be
re-arranged via drag and drop of the column heading labels.
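As an illustration of how several of the column values above could be derived
and sorted, the following is a minimal sketch. The FolderRow type, the
summarize_folder function and the sample folders are hypothetical; the sketch
assumes only that the set of tags for each file in a folder is available.

```python
from dataclasses import dataclass

@dataclass
class FolderRow:
    name: str
    file_count: int
    untagged_count: int
    average_tags: float
    unique_tags: int

def summarize_folder(name, tags_by_file):
    """Build one Folder-mode row from a {filename: set_of_tags} mapping."""
    tag_sets = list(tags_by_file.values())
    total = sum(len(t) for t in tag_sets)
    unique = set().union(*tag_sets)
    return FolderRow(
        name=name,
        file_count=len(tag_sets),
        untagged_count=sum(1 for t in tag_sets if not t),
        average_tags=total / len(tag_sets) if tag_sets else 0.0,
        unique_tags=len(unique),
    )

rows = [
    summarize_folder("Vacation", {"a.jpg": {"beach", "sun"}, "b.jpg": set()}),
    summarize_folder("Birthday", {"c.jpg": {"cake"}, "d.jpg": {"cake", "kids"},
                                  "e.jpg": {"kids"}}),
]
# Sort by any column, here "Untagged File Count", highest first.
rows_by_untagged = sorted(rows, key=lambda r: r.untagged_count, reverse=True)
```

Because each row is a flat record, sorting by any column is a single key
function, which is consistent with the stated benefit of not maintaining the
position in the hierarchy.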
The "Folder Name" column also displays an icon indicating what types of files
are contained within it, i.e., music, photos, video, or any combination thereof.
= [Please refer to Figure 23 in Appendix 1] - Folder contains files of PHOTO
types, but none of MUSIC nor VIDEO
= [Please refer to Figure 24 in Appendix 1 ] - Folder contains files of VIDEO
types, but none of MUSIC nor PHOTO
= [Please refer to Figure 25 in Appendix 1] - Folder contains files of MUSIC
types, but none of PHOTO nor VIDEO
= [Please refer to Figure 26 in Appendix 1] - Folder contains files of PHOTO
and VIDEO types, but none of MUSIC
= [Please refer to Figure 27 in Appendix 1] - Folder contains files of MUSIC
and VIDEO types, but none of PHOTO
= [Please refer to Figure 28 in Appendix 1] - Folder contains files of PHOTO
and MUSIC types, but none of VIDEO
= [Please refer to Figure 29 in Appendix 1 ] - Folder contains files of PHOTO,
VIDEO, and MUSIC types
When the "View Filter" is set to NOT show files of a particular type, the icon
in
the "Folder Name" will ALWAYS indicate the combination of file types contained
therein, even though the counts in other columns will NOT reflect suppressed
file
types.
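The distinction between the icon (which always reflects every file type
present) and the counts (which respect the View Filter) can be sketched as
follows. The MediaType flag and both function names are hypothetical and are
used for illustration only.

```python
from enum import Flag, auto

class MediaType(Flag):
    PHOTO = auto()
    MUSIC = auto()
    VIDEO = auto()

def folder_icon_types(file_types):
    """Combine the types of every file in the folder, ignoring the View Filter."""
    combined = MediaType(0)
    for t in file_types:
        combined |= t
    return combined

def visible_count(file_types, view_filter):
    """Count only the files whose type is enabled in the View Filter."""
    return sum(1 for t in file_types if t & view_filter)

# A folder with three files of each type, viewed with only Photos enabled.
types = [MediaType.PHOTO] * 3 + [MediaType.MUSIC] * 3 + [MediaType.VIDEO] * 3
icon = folder_icon_types(types)
count = visible_count(types, MediaType.PHOTO)
```

In this example the icon still represents all three types, while the count
reported for the folder is 3.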
Selecting a row in the Folder List loads the thumbstrip with files.
A thread is then launched to monitor changes that occur to files in the folder
being displayed in the Thumbstrip. The application will detect when files are
added to or deleted from the folder, and make appropriate changes in the
thumbstrip. In addition, changes caused by external applications will also be
detected. If a third-party application, such as a specialized image
manipulation or
audio/video editing software, is used to modify an existing file then it is
possible
that the embedded metadata will be intentionally or unintentionally removed
when the file is written back out in the external application's save
operation.
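One possible way to detect such changes is to compare periodic snapshots of the
folder. The sketch below is an assumption about how this could be done, not a
description of the application's monitoring thread; the names snapshot and
diff_snapshots are hypothetical.

```python
import os

def snapshot(folder):
    """Map each file name in the folder to its (size, modified-time) pair."""
    result = {}
    with os.scandir(folder) as entries:
        for entry in entries:
            if entry.is_file():
                info = entry.stat()
                result[entry.name] = (info.st_size, info.st_mtime_ns)
    return result

def diff_snapshots(old, new):
    """Return the names added, deleted, or modified between two snapshots."""
    added = sorted(new.keys() - old.keys())
    deleted = sorted(old.keys() - new.keys())
    modified = sorted(n for n in old.keys() & new.keys() if old[n] != new[n])
    return added, deleted, modified
```

A monitoring thread could call snapshot at intervals and pass consecutive
results to diff_snapshots; a file reported as modified would be a candidate for
checking whether its embedded metadata was removed by an external application.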
If users adhere to a process where they tag their photos in SwapNeat, then
launch third-party editing applications from within SwapNeat Metadata Studio
to
operate on managed files, SwapNeat will re-embed the metadata when the file is
written out, maintaining the integrity of embedded metadata over time.
Clearly, a close tie between SwapNeat's metadata parsing engine and the
operating system is advantageous, but is not an integrated feature of Version
1.0.
Episode Mode
The principle behind "episode" functionality is simple: files that have been
created within a specific timeframe are likely to share common
characteristics.
The Episode mode is a tree style view of the files in the SwapNeat database.
The
files are organized chronologically and grouped into episodes. The file-system
hierarchy is not considered in this mode.
The top nodes of the tree represent the year in which the file was originally
created, as recorded in the embedded metadata. Below the years are the months,
and below the months are the days of the month. A node exists only if there are
photos within it. Within each day upon which files were created, there will be
one or more episodes.
[Please refer to Figure 30 in Appendix 1 ]
[Please refer to Figure 31 in Appendix 1]
The contents of an episode are computed by comparing the file creation times.
For example, suppose in a given day, ten photos are taken between 10 and 11
AM, and no photos are taken between 11 AM and 7 PM, then starting at 7 PM
ten more photos are taken within a half-hour time frame.
Those files would appear as two episodes in a single day. The ten photos taken
between 10 and 11 AM are more likely to share common characteristics than the
entire set of twenty photos taken on a single day, but those twenty photos on
a
single day are more likely to share common characteristics than files taken on
two separate days.
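The example above can be reproduced with a simple gap-based grouping. This is
only a sketch: the two-hour threshold, the function name group_into_episodes
and the sample date are assumptions, and the sketch does not split episodes at
day boundaries.

```python
from datetime import datetime, timedelta

def group_into_episodes(creation_times, max_gap=timedelta(hours=2)):
    """Split timestamps into episodes wherever consecutive files are more than max_gap apart."""
    episodes = []
    for t in sorted(creation_times):
        if episodes and t - episodes[-1][-1] <= max_gap:
            episodes[-1].append(t)      # close enough to the previous file
        else:
            episodes.append([t])        # gap too large: start a new episode
    return episodes

day = datetime(2009, 6, 8)                                            # arbitrary date
morning = [day.replace(hour=10, minute=6 * i) for i in range(10)]     # 10:00 to 10:54
evening = [day.replace(hour=19, minute=3 * i) for i in range(10)]     # 19:00 to 19:27
episodes = group_into_episodes(morning + evening)
```

With these inputs the twenty files fall into two episodes of ten files each,
because the gap between the last morning file and the first evening file
exceeds the assumed threshold.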
The purpose of grouping files into episodes is to provide the user with a set
of
similar files which can be batch processed easily.
Generally, files within the same episode will exhibit similar content, or
share
similar workflow requirements.
By batch tagging files via episodes, the user is able to process far more
files in
less time than if the files were not organized by episode. A detailed
explanation
of how episodes are calculated can be found in the section on Algorithms.
Each node in the tree contains a thumbnail of the first file in that episode
when
the file types are photos or video clips. This gives the user an idea of which
files
are contained in the episode. When an episode or a parent node is selected,
all
the files within it, along with all descendant files in sub-episodes below it
are
loaded into the thumbstrip.
Various default episode properties are user-alterable. When the user double-
clicks an episode node, a dialog is presented, allowing metadata to be
assigned
to the episode itself: specifically, assigning a descriptive title to the
episode.
Episode titles are stored in the SwapNeat database and are associated with the
first file of the episode. The episode title can also be inserted into the
files' exif/tiff/xmp/mp3 caption or comment tag.
Adding episode titles is a convenient method for batch inserting captions into
photos.
Another use of the episode title is to use it as a prefix for renaming the
files that
belong to it.
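A minimal sketch of such a rename follows. The function name prefixed_name and
the choice of underscores as separators are assumptions made for illustration.

```python
from pathlib import PurePath

def prefixed_name(filename, episode_title):
    """Return the file name with the episode title prepended, keeping the extension."""
    path = PurePath(filename)
    safe_title = "_".join(episode_title.split())   # collapse whitespace to underscores
    return f"{safe_title}_{path.stem}{path.suffix}"
```

For example, prefixed_name("PICT0001.JPG", "Beach Day") yields
"Beach_Day_PICT0001.JPG".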
A search can be conducted based on the embedded episode title. Episode titles
are also available in the Advanced (Tree) Search mode.
Filesystem Mode
The Filesystem mode view in the Library pane is a conventional tree-style view
of
the file system hierarchy. The user can choose to see only folders that are
managed by the application or they can choose to see all folders in the
system.
Folders that are managed by SwapNeat Metadata Studio are highlighted by bold
text and an icon indicating the type(s) of files in the folder, i.e., photos,
video or music, or a combination of those file types.
The text labels on ancestor folders of a monitored folder are in bold face to
indicate to the user that one or more of their descendants are monitored.
Clicking a folder that is NOT in bold face text (i.e., not monitored) will not
affect the contents of the Thumbstrip. Selecting any node that has bold text
will load the contents of that folder AND ITS DESCENDANT FOLDERS in the
thumbstrip. Said another way, when a folder that is monitored by SwapNeat is
clicked, a recursive scan of that folder and all descendant folders is
performed, creating an aggregate file list that is then displayed in the
thumbstrip.
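The recursive scan can be sketched with the standard directory walker. The
function name aggregate_file_list is hypothetical, and the sketch does not
filter by file type or by monitoring status.

```python
import os

def aggregate_file_list(root):
    """Recursively collect every file under root, including all descendant folders."""
    files = []
    for folder, _subfolders, names in os.walk(root):
        for name in sorted(names):
            files.append(os.path.join(folder, name))
    return files
```

The resulting list is the aggregate that would then be handed to the
thumbstrip for display.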
The usefulness of this managed view is not restricted to folder and file
management use: it is also a convenient mechanism for navigating through the
library of monitored files to identify folders full of files that have not yet
been
tagged.
[Please refer to Figure 32 in Appendix 1]
This view of the library provides the user with a map of where managed files
are
stored on the system, which folders are not managed and the file density
distribution. Appended to the folder name on each node is a number indicating
the number of files in that folder, reflecting the state of the "Viewable
Files filter"
toolbar buttons. If the folder contains nine files, three each of Music, Video
and
Photo files, and display of Video and Music files is suppressed (only the
Photos
button on the Viewable Files Filter toolbar is pressed) then the number would
be "3".
Context menus are available by right-clicking on a node, depending on the
"monitoring status" of the folder represented by the node. Commands include:
= "Add the selected Folder to the SwapNeat Library" - for folders that are not
already being monitored
= "Removing a Folder from the SwapNeat Library" - for nodes that are
presently being monitored
= "Sorting by Name, Create Date, Modified Date and Newest File Date" - to
change the order of the folders in the tree
A significant problem with digital photos is that across the entire file
system
hierarchy, exact file name duplicates may occur. Most digital cameras use a
common naming convention: sequential names, starting at "PICT0001.JPG" for
the first file written, incrementing by one for each subsequent photo taken.
If the
camera loses power, or the user has configured their camera to name files in
certain ways, the user may end up with many files named "PICT0001.JPG"
scattered throughout their file system. File name collision is very common
when
the library consists of photos taken from numerous digital cameras. Each
camera
would tend to use the same file naming scheme. This would be a common
problem in industries that require their employees to take pictures of job
sites or
equipment, each employee equipped with their own camera.
While it is true that the full path to the photo is not ambiguous, the last
piece, the
filename itself, is not unique.
SwapNeat supports filename disambiguation in a number of ways.
1) An icon on a node representing a folder containing a file name that is not
unique.
2) A file name disambiguation feature available per-file from the context menu.
3) Full-library file name disambiguation, where every file in the managed
folder list is scanned and filenames are automatically disambiguated.
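A full-library scan of this kind might be sketched as follows. The function
names and the renaming scheme (prefixing the parent folder name) are
assumptions for illustration; the sketch does not handle the case where two
colliding files also share a parent folder name.

```python
from collections import defaultdict
from pathlib import PurePosixPath

def find_name_collisions(paths):
    """Group full paths by bare file name and keep only names that occur more than once."""
    by_name = defaultdict(list)
    for p in paths:
        by_name[PurePosixPath(p).name.lower()].append(p)
    return {name: ps for name, ps in by_name.items() if len(ps) > 1}

def disambiguate(paths):
    """Propose a unique file name for every colliding path by adding the parent folder name."""
    proposals = {}
    for colliding in find_name_collisions(paths).values():
        for p in colliding:
            path = PurePosixPath(p)
            proposals[p] = f"{path.parent.name}_{path.name}"
    return proposals

library = [
    "/lib/siteA/PICT0001.JPG",
    "/lib/siteB/PICT0001.JPG",
    "/lib/siteA/PICT0002.JPG",
]
proposals = disambiguate(library)
```

Only the two files named "PICT0001.JPG" receive proposed names; the file whose
name is already unique is left alone.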
The uniqueness of this mode is that it provides a topological view of the
entire
system with respect to which folders are managed by the application.
The Thumbstrip Pane
[Please refer to Figure 33 in Appendix 1 ]
The primary purpose of the Thumbstrip pane is to display thumbnail
representations of the files that are being operated on by the user. The list
can
be the contents of a directory, an episode of photos, a list of dropped files
or a
result of a keyword search.
Each set of files that populates the thumbstrip is remembered and can be
navigated in a manner similar to a Web browser's "Back" and "Forward" buttons.
This way, if a user takes an action that would change the contents of the
thumbstrip (like clicking a folder in the Library pane or performing a search)
they can go back to previous file sets.
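The back and forward behaviour can be sketched as a list of file sets with a
current position. The class ThumbstripHistory and its method names are
hypothetical; the sketch assumes, as in a Web browser, that showing a new file
set discards any "forward" entries.

```python
class ThumbstripHistory:
    """Browser-style history of the file sets shown in the thumbstrip."""

    def __init__(self):
        self._sets = []
        self._pos = -1

    def show(self, file_set):
        """Display a new file set, discarding any 'forward' entries."""
        del self._sets[self._pos + 1:]
        self._sets.append(list(file_set))
        self._pos += 1

    def back(self):
        """Step to the previous file set, if there is one."""
        if self._pos > 0:
            self._pos -= 1
        return self.current()

    def forward(self):
        """Step to the next file set, if the user has previously gone back."""
        if self._pos < len(self._sets) - 1:
            self._pos += 1
        return self.current()

    def current(self):
        return self._sets[self._pos] if self._pos >= 0 else []
```

In this sketch, forward has no effect unless back has been used first.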
Files can be "locked" in the thumbstrip so they will not be removed due to any
action of this nature (Thumbstrip item locking is described below).
The tile background color in the thumbstrip is used to indicate whether the
item is
not selected, selected, or selected and has focus.
The tile background can be further modified with a striping to indicate that
the file
is "locked" in the thumbstrip (and cannot be removed from the thumbstrip until
it is unlocked).
Also, one or more overlay icons can appear on the tile to indicate the nature
of
the metadata embedded in that file.
The toolbar at the top of the thumbstrip pane offers the following features:
= [Please refer to Figure 34 in Appendix 1] Back in history to previous
thumbstrip file list
= [Please refer to Figure 35 in Appendix 1] Forward in history to next
thumbstrip file list
= [Please refer to Figure 36 in Appendix 1] Copy tags in selected thumbstrip
files to clipboard
= [Please refer to Figure 37 in Appendix 1] Paste tags from clipboard into
selected thumbstrip files
= [Please refer to Figure 38 in Appendix 1 ] Lock selected files in thumbstrip
= [Please refer to Figure 39 in Appendix 1] Toggle thumbstrip display
between "details" and "tiles" views
ThumbStrip Navigation
[Please refer to Figure 40 in Appendix 1 ]
The Thumbstrip is a special thumbnail view that is synchronized with, and
supports, the "Whack-A-Tag" process.
It behaves in a special way when it is only one thumbnail wide. When in
multi-column mode, it acts like any other 'tile' view window; the user navigates
with the scrollbars and clicks on one or more tiles.
When the thumbstrip is a single column, the user interaction changes: when a
tile is clicked (given focus), the thumbstrip autoscrolls so that the specific
tile is centered vertically in the thumbstrip, as opposed to the view position
staying wherever it was when the tile was clicked. If the tile is the first in
the set, there is blank space above the tile; if it is the last in the file set,
there is blank space below the tile.
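The autoscroll position can be expressed as a simple calculation. The function
name and the assumption that all tiles have the same height are illustrative
only.

```python
def centered_scroll_offset(tile_index, tile_height, view_height):
    """Return the scroll offset (in pixels) that vertically centers the given tile.

    The offset may be negative (blank space above the first tiles) or extend
    past the end of the list (blank space below the last tiles).
    """
    tile_center = tile_index * tile_height + tile_height / 2
    return tile_center - view_height / 2
```

Each step to an adjacent tile changes the offset by exactly one tile height,
so the tile immediately below the focused one always lands under the same
screen position after the scroll.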
The user can navigate easily through the files by positioning the mouse over
the
tile below the focused one (for navigating forward) or over the tile above the
current one (for navigating backward). When the user clicks the tile below the
focused one, the thumbstrip scrolls upward so the clicked tile moves into the
vertically centered position. The mouse does not move. Clicking again without
moving the mouse is a click on the next tile, and so on. In this way, the user
can
keep their eyes on the full-size preview and navigate through all the files in
the
thumbstrip with a single click. No mouse movement required.
Likewise, positioning the mouse pointer over the tile immediately above the
tile
that has focus navigates backward.
Dragging the scroll bar thumb allows the user to navigate to anywhere in the
thumbstrip; whichever tile they click at that point is vertically centered.
While using the Whack-a-Tag pane, pressing the "Done" button automatically
moves to the next file by design. The pane is kept in sync with the thumbstrip:
the focused tile moves to the vertical center position in the thumbstrip.
In the future, special tiles may be added to the top and bottom of the
thumbstrip,
providing the means for the user to navigate to the next and previous folders
in
the file system. A tile above the tile representing the first file in the
thumbstrip
would provide the ability to navigate back to the previous folder in the
folder sort
order, or the previous episode in chronological order, or to the result of the
previous search conducted, or reload whatever the contents of the thumbstrip
were before the current contents were loaded.
Likewise, a tile below the tile representing the last file in the thumbstrip
would
provide access to the next folder or episode. Logically, one can not navigate
to
the 'next' search result, unless one has previously navigated back from the
most-
recent one.
Thumbstrip Rendering
The thumbstrip has the potential to contain thousands or hundreds of thousands
of items. If a basic technique of loading an image list with thumbnails was
used,
the system would be exhausted of memory and the application would grind to a
halt or crash. In order to reduce the memory footprint, special rendering and
caching code was developed.
Thumbnail Vault:
As files are imported into the application, thumbnails are generated in a
background thread and stored in the thumbnail vault. The thumbnails are then
requested from the vault for display in the thumbstrip. For space efficiency,
the
vault stores the files in JPG format.
Thumbnail LRU:
A JPG cache is maintained of the last N requested thumbnails. This is useful
when the thumbstrip contains many files and is scrolled back and forth.
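A least-recently-used cache of this kind can be sketched with an ordered
dictionary. The class name ThumbnailLRU and the load_from_vault callable are
hypothetical stand-ins for the application's vault access, and the capacity is
assumed to be at least one.

```python
from collections import OrderedDict

class ThumbnailLRU:
    """Keep the JPG bytes of the last `capacity` requested thumbnails in memory."""

    def __init__(self, capacity, load_from_vault):
        self._capacity = capacity
        self._load = load_from_vault      # callable: file id -> JPG bytes
        self._cache = OrderedDict()

    def get(self, file_id):
        if file_id in self._cache:
            self._cache.move_to_end(file_id)        # mark as most recently used
            return self._cache[file_id]
        data = self._load(file_id)                  # cache miss: go to the vault
        self._cache[file_id] = data
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)         # evict the least recently used
        return data
```

Scrolling back and forth over the same region then requests thumbnails that are
mostly still in the cache, avoiding repeated reads from the vault.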
Bitmap rendering and readahead:
Instead of dynamically rendering each thumbnail to the thumbstrip as it is
scrolled, the thumbnail rendering code prepares a bitmap the size of the display
area of the thumbstrip. Each visible thumbnail, with its background and
overlays, is rendered onto the bitmap in the appropriate location. The entire
bitmap is then rendered to the display. This technique provides a performance
boost of an order of magnitude. As an additional performance improvement,
real-time readahead bitmaps are prepared depending on the direction of
scrolling. Generally, the page before and the page after are pre-created when
the thumbstrip is scrolled. Invalidation of the bitmaps is required when any
given thumbnail within them is rotated, cropped, tagged or subjected to any
other operation which may alter its display. The invalidation, and therefore the
refresh, is aided by the Thumbnail LRU, resulting in quick redraws with little
or no noticeable latency. All three techniques are required for smooth
operation.
Thumbnail Locking
[Please refer to Figure 41 in Appendix 1]
When the user navigates away from either the current "Episode" or "Folder" in
the Library, or performs a search, the normal behavior is that the thumbstrip is
cleared and the new folder's or search results' contents are displayed in the
Thumbstrip.
When putting together a set of files for use in a slideshow or playlist, it may
not be possible, with a single search or file-folder click, to get exactly the
right file list displayed in the Thumbstrip. For example, the user may want to
do a simple text search on a person's name, then keep selected files in the
Thumbstrip, and continue and perform a second search on another person, keeping
selected files in the thumbstrip, etc.
Or they may want a few files from a particular file-system folder, and
selected
others from additional folders.
In order to keep files available after performing multiple folder navigation
actions
or searches, the "Lock Tile" feature in the Thumbstrip was devised.
Any tile can be locked by clicking on the "Lock" toolbar button, and then
clicking
the thumbstrip tile that represents the file the user does not want a
subsequent
action to remove from the thumbstrip.
In this way, the user can perform complex searches where the aggregate result
would not be possible to attain with a single search (or without methodical
manual removal of specific tiles from the thumbstrip).
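The effect of locking on a subsequent navigation or search can be sketched as
follows. The function name is hypothetical, and placing the locked tiles ahead
of the new results is an assumption made for illustration.

```python
def replace_thumbstrip_contents(current, locked, new_files):
    """Return the new thumbstrip list: locked tiles stay, everything else is replaced."""
    kept = [f for f in current if f in locked]
    added = [f for f in new_files if f not in locked]   # avoid duplicating locked files
    return kept + added
```

Repeating this step after each search accumulates an aggregate file list that
no single search would produce.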
Locked thumbnails are indicated by a striped background. Locked items can be
unlocked with a right click context menu, or a second click on the tile. Note
that
even if the current unlocked tiles are the result of a search, and the newly-
unlocked tile does not meet the search criteria, it will remain in place in
the
Thumbstrip until another action that would alter the Thumbstrip contents is
taken.
Should the user accidentally unlock an item they do not want to unlock, they
can
click the item again to lock it again.
It was proposed in early designs of the software that there be two panes that
offer similar functionality to the current thumbstrip. One would hold the "real"
file set to be operated on, and the other would be a "temporary" one, holding
the results of the most recent search or the contents of the folder clicked in
the Library. This design was rejected as confusing and resource-intensive, in
favor of offering the ability to "lock" items so they would not be affected by
subsequent actions.
Thumbstrip Tile Background Colours
[Please refer to Figure 42 in Appendix 1]
The surrounding background of the thumbnail indicates various states a file
can
be in.
= The light blue background is the neutral (not selected nor focused) state.
= Dark blue indicates the item is selected.
= Green indicates the file is currently the focused file (the focused file is
always considered to be 'selected')
The preview pane also loads the focused file.
If a file is "locked" in the thumbstrip, a yellow diagonal stripe is overlaid
on the
thumbstrip tile background. See explanation above.
Each tile also shows the file's name (without the full path).
Image Tooltips
[Please refer to Figure 43 in Appendix 1 ]
Tooltips show a subset of embedded metadata when the mouse pointer hovers
over an image thumbnail. The tooltip will also be shown when the pointer
hovers
over the image in the Preview pane.
The subset of tags that appears in the tooltip over thumbstrip tiles and the
preview pane image is configurable through the Metadata Chart Pane, and is saved
and shared using an snmc (SwapNeat Metadata Chart) file.
Users can customize the tooltips for specific purposes, for instance Asset
Management, Collectibles, or Personal Photo Libraries. Industry trade groups can
define their own SwapNeat configurations, including snmc files, so all tooltip
information is always relevant. Each file type (music, photo and video) can be
configured independently.
This file can also be used to show whether important metadata is missing. If a
tag that is required for a specific workflow is NOT present in the embedded
metadata, then the tooltip can indicate the LACK of a particular tag.
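A tooltip of this kind could be assembled as in the sketch below. The function
name, the "MISSING" wording and the tag names in the example are hypothetical
and do not reflect the actual contents of an snmc file.

```python
def build_tooltip(embedded, shown_tags, required_tags=()):
    """Return tooltip lines: configured tags that are present, then missing required tags."""
    lines = [f"{tag}: {embedded[tag]}" for tag in shown_tags if tag in embedded]
    lines += [f"MISSING: {tag}" for tag in required_tags if tag not in embedded]
    return lines

embedded = {"xmp/Title": "Site visit", "exif/DateTimeOriginal": "2009:06:08 10:00:00"}
tooltip = build_tooltip(embedded, ["xmp/Title", "iptc/Caption"], ["iptc/Copyright"])
```

Here the tooltip shows the title, silently skips the absent caption, and flags
the required copyright tag as missing.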
Thumbstrip Tile Metadata Indicator Overlays
Small icons are overlaid on the Thumbstrip tile, down the right edge of each
tile,
to indicate the types of metadata embedded in that file, by metadata
compartment.
These compartments and their icons are shown below:
= [Please refer to Figure 44 in Appendix 1] - snmd
= [Please refer to Figure 45 in Appendix 1] - iptc
= [Please refer to Figure 46 in Appendix 1] - xmp
= [Please refer to Figure 47 in Appendix 1] - exif
= [Please refer to Figure 48 in Appendix 1] - photoshop
= [Please refer to Figure 49 in Appendix 1 ] - tiff
= [Please refer to Figure 50 in Appendix 1] - mp3
= [Please refer to Figure 51 in Appendix 1 ] - id3
= [Please refer to Figure 52 in Appendix 1] - dublincore
= [Please refer to Figure 53 in Appendix 1 ] - prism
= [Please refer to Figure 54 in Appendix 1] - rdf
The last 3 items in the list are specific vocabularies and collections of
other
items, and are generally found in the XMP compartment. The overlays provide a
quick and easy way for users to determine what general types of metadata are
in
each file.
This feature may be extended to allow sub-types within compartments. For
example, specific SNMD-based Tag Vocabularies might have unique overlay
icons embedded in the XML of the "SNTV" file (as binhex data) that, when the
Tag Vocabulary is loaded into the SwapNeat Metadata Studio application, will
be
displayed in addition to the more general SNMD compartment indicator icon.
If "bad" or "invalid" metadata is detected in a file, a "Red tag" icon [Please
refer to Figure 55 in Appendix 1] is displayed as an overlay. This tells the
user that another tool has been used to insert embedded tags into the file, and
one or more of the tags therein is not compliant
with the standards. The user will then review the embedded metadata in the
Metadata Tree, which will show red tag icons on any node that has a descendant
node that exhibits an error. By expanding the branches that have the red tags,
the user will discover the bad tag, and can operate on that specific tag in
the
Metadata tree, without having to retag the entire image.
File Busy Indicator
If a file is currently being modified, i.e., rotated, cropped or having
metadata actively embedded, then the thumbnail will have a semi-transparent haze
over the thumbnail image to indicate the file is in a busy state. If a large
number of files are operated on simultaneously (batch tagging a large number of
files, for example) the user will see the 'haze' disappear as each file is
written out to the file system with modifications.
= [Please refer to Figure 56 in Appendix 1] - The focused file before the
metadata is written into the file
= [Please refer to Figure 57 in Appendix 1] - While the file is being written
out, it appears 'hazy'.
Thumbstrip Item Context Menu
A context menu is also available, which provides easy access to features such as
copy metadata to clipboard, paste metadata from clipboard, load the episode the
file belongs to, search for other files taken on the same day, locate on disk,
delete the file, rename the file, etc.
Auto Rotation of Photos
Modern cameras have orientation sensors built in, and a tag known as
'tiff/orientation' is embedded in the file, with a code indicating whether the
camera
was oriented in "portrait" orientation, or "landscape" orientation.
When a file is loaded into the thumbstrip it is interrogated for the
tiff/orientation value. If the orientation value indicates that a rotation of
the image is needed, the thumbnail is automatically rotated appropriately.
Furthermore, a thread is launched to permanently rotate the file so the natural
orientation of the image on screen will be correct, and the tiff/orientation
value is re-embedded in the file, set to 1 (indicating that the image data
origin point is the top-left corner of the image).
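The decision of how far to rotate can be sketched as a lookup on the
orientation code. The table below covers only the standard rotation-only
orientation codes (1, 3, 6 and 8); the names are hypothetical, and leaving
mirrored or unknown codes untouched is an assumption of this sketch.

```python
# Clockwise rotation (in degrees) needed to display the image upright, for the
# orientation codes that involve rotation only. Mirrored codes (2, 4, 5, 7)
# are deliberately omitted.
ROTATION_FOR_ORIENTATION = {1: 0, 3: 180, 6: 90, 8: 270}

def plan_auto_rotation(orientation):
    """Return (degrees_clockwise, new_orientation_code) for a given orientation value."""
    degrees = ROTATION_FOR_ORIENTATION.get(orientation)
    if degrees is None:
        return 0, orientation    # unknown or mirrored: leave the file untouched
    return degrees, 1            # after rotating the pixels, the code becomes 1
```

Once the pixel data has been rotated and the code reset, tools that ignore the
orientation tag will still display the image correctly.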
All rotated or cropped files are backed up before they are modified.
The thumbstrip context menu provides a mechanism for restoring the file to the
original orientation, as saved by the camera.
This feature can be disabled altogether, but since modern cameras support
this,
it is a great convenience to the majority of users.
Preview Pane
The preview pane provides a detailed view of the file represented by the
focused
thumbstrip tile.
There are different views presented depending on the file type.
Music
[Please refer to Figure 58 in Appendix 1 ]
When the tile representing an audio file has focus in the thumbstrip, a
special
graphic called an "M2G" (music to graphics) file is displayed. Various
frequencies
of the file are computed into values that are rendered as pixels.
The graphic is visibly segmented such that individual rows of the graphic
represent a specific amount of time in the file. Note that the number of
seconds
represented by each vertical segment varies with the overall length of the
file
being displayed.
When the file is played, an overlay travels across the surface of the graphic,
indicating the play position.
The user can click anywhere in the area of the graphic to move the playback
position.
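Translating a click on the graphic into a playback position might look like the
following sketch. It assumes that rows are read top to bottom and that time
runs left to right within a row; the function and parameter names are
hypothetical.

```python
def click_to_position(x, y, width, row_height, seconds_per_row, duration):
    """Map a click at pixel (x, y) on the graphic to a playback position in seconds."""
    row = y // row_height                            # which row was clicked
    seconds = (row + x / width) * seconds_per_row    # whole rows plus fraction of this row
    return min(max(seconds, 0.0), duration)          # clamp to the length of the file
```

For a graphic 400 pixels wide with 20-pixel rows of 30 seconds each, a click in
the middle of the second row corresponds to 45 seconds into the file.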
Metadata items which refer to specific points in the music can cause graphic
symbols to decorate the m2g, allowing similar function to tagged subregions
and
points in images. The ability to precisely position metadata references on an
m2g
is enhanced both by the 2d nature of it (providing a longer overall timeline
due to
it being in many rows) and the ability to discern changes in the music from
the
colour changes in the m2g graphic.
A playback control toolbar is available for previewing of media files:
= [Please refer to Figure 59 in Appendix 1] - scan backwards at 2 x speed
= [Please refer to Figure 60 in Appendix 1 ] - stop playback
= [Please refer to Figure 61 in Appendix 1 ] - start playback
= [Please refer to Figure 62 in Appendix 1 ] - scan forwards at 2 x speed
= [Please refer to Figure 63 in Appendix 1 ] - mute audio
= [Please refer to Figure 64 in Appendix 1 ] - delete current file from file
system (prompt for confirmation)
Photos
[Please refer to Figure 65 in Appendix 1]
The Photo is displayed at a size that will show the full image in the Pane, no
matter how large the pane is.
A toolbar incorporated directly into the Preview pane deals specifically with
Photo
files.
= [Please refer to Figure 66 in Appendix 1] - View the image full screen
(against a black background, scaled to fit the entire image on the screen
without altering the image's original aspect ratio)
= [Please refer to Figure 67 in Appendix 1] - View the image full size in the
preview pane: the user can click and drag the mouse to see different parts
of the image. In Version 1.0, there is no ability to zoom in or out to scale
the image arbitrarily.
= [Please refer to Figure 68 in Appendix 1] - rotate the image 90 degrees
clockwise. When the metadata is applied, the file is written out with the
image in this native orientation. That is to say, there is no need to refer to
an orientation code in the image in order to transform the data before
displaying it, since the data will have been rearranged to show the image
in the desired
orientation and the orientation code will have been set to '1' which means no
transformation necessary. The orientation of '1' is the most common
orientation
and many tools which don't refer to it assume that the image is in this
orientation.
= [Please refer to Figure 69 in Appendix 1] - rotate the image 90 degrees
counterclockwise. When the metadata is applied, the file is written out with
the image in this native orientation.
The following icons deal with adjustments to the crop / aspect ratio of the
image.
= [Please refer to Figure 70 in Appendix 1 ] - go into "subregion whack"
mode, described below. The shape and size of the area depend on the
buttons described below.
= [Please refer to Figure 71 in Appendix 1] - The crop / subregion area
should be in a Landscape orientation (wider than tall)
= [Please refer to Figure 72 in Appendix 1] - The crop / subregion area
should be in a Portrait orientation (taller than wide)
The following icons offer one-click access to common aspect ratios: whichever
button is depressed, the rectangular border will keep that aspect ratio, no
matter how the size of the rectangle is adjusted.
= [Please refer to Figure 73 in Appendix 1] - Set crop / subregion area to a
proportionate size for computer screens: common aspect ratios are 4:3,
5:4, 16:9, and 2.35:1
= [Please refer to Figure 74 in Appendix 1] - Set crop / subregion area to a
proportionate size for traditional photographic prints: common aspect
ratios are 4x6, 5x7, 8x10
= [Please refer to Figure 75 in Appendix 1] - Set crop / subregion area to a
proportionate size for printing at home on a conventional printer: common
aspect ratios are A4 Sized Paper, and Letter Sized Paper
= [Please refer to Figure 76 in Appendix 1] - The user can control the aspect
ratio height and width completely independently, or, conversely, create an
area that is a perfect square and will scale in that proportion only.
= [Please refer to Figure 77 in Appendix 1] - Permanently delete the file
currently being viewed in the Preview Pane from the file system. A
confirmation prompt will be displayed.
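Holding the crop / subregion rectangle to a fixed aspect ratio while it is
resized can be sketched as below. The function name is hypothetical, and
shrinking the rectangle to fit (rather than growing it) is an assumption of
this sketch.

```python
def constrain_to_aspect(width, height, ratio_w, ratio_h):
    """Shrink a dragged rectangle so that width:height equals ratio_w:ratio_h."""
    if width * ratio_h > height * ratio_w:
        width = height * ratio_w / ratio_h     # too wide: reduce the width
    else:
        height = width * ratio_h / ratio_w     # too tall (or exact): reduce the height
    return width, height
```

For example, with the 4:3 ratio selected, a rectangle dragged to 800 by 800
would be held to 800 by 600.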
In a subsequent release, special support is planned for assigning a colour to
a
particular object: a special image preview toolbar button will be added:
= [Please refer to Figure 78 in Appendix 1] - Colour Whack
This toolbar button will change the mouse pointer icon to that of an
eyedropper. The user will enter "colour whack" mode by clicking this toolbar
button, then click a tag in the Summary tab of the Whack interface. Then, when
the mouse pointer moves over the image, the user will click on an area of the
image represented by the selected tag. This will sample the colour space of the
pixels in an area around the mouse pointer hot spot and put the value for the
colour, in the HSB colour model, as a qualifier of the selected tag. The user
can repeat this process, clicking a tag, then clicking on the image, until all
the objects that they wish to be colour tagged are done.
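Since this feature is described as planned rather than implemented, the
following is only a sketch of how the sampling step could work. The function
name, the square sampling area and the averaging of RGB values before
conversion are all assumptions.

```python
import colorsys

def sample_hsb(pixels, cx, cy, radius=1):
    """Average the RGB pixels in a square around (cx, cy) and return (hue, sat, brightness).

    `pixels` is a list of rows of (r, g, b) tuples with components in 0..255.
    Each returned component is in the range 0..1.
    """
    rows = range(max(cy - radius, 0), min(cy + radius + 1, len(pixels)))
    cols = range(max(cx - radius, 0), min(cx + radius + 1, len(pixels[0])))
    area = [pixels[y][x] for y in rows for x in cols]
    r, g, b = (sum(p[i] for p in area) / (255 * len(area)) for i in range(3))
    return colorsys.rgb_to_hsv(r, g, b)    # HSV is another name for the HSB model
```

The resulting triple could then be stored as the qualifier of the selected tag.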
SubRegion Tagging
[Please refer to Figure 79 in Appendix 1 ]
By pressing the "SubRegion Tagging" toolbar button, a region similar to the
"Crop" region is overlaid on the image. The user can resize the rectangle. Any
tags applied to the image while in SubRegion Tagging mode will be associated
with a specific rectangular area in the photo.
When the image is subsequently displayed in the Preview Pane, and the mouse
hovers over this specific region of the photo, a tooltip is displayed for the tags
specific to that rectangle. A semi-transparent rectangle is shown over the area
outside the region having subregion tags.
The tags are encoded in the file with a straightforward coordinate system and
orientation indicator. If the file is later rotated in SwapNeat Metadata Studio,
these values are adjusted. If the file properties change outside the control of
SwapNeat Metadata Studio (an external crop or rotate, for example), then the next
time the file is previewed, SwapNeat Metadata Studio will detect the change and
the region will be flagged as 'suspect'. The user will then be prompted to update
the region coordinates / size and re-embed the metadata.
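As a sketch of how region values might be adjusted on rotation, the Python
below assumes a region is stored as fractions of the image with the origin at
the upper left corner, and that a change made outside the application is
detected by comparing the pixel size recorded at embed time with the current
size. The Region class, both function names and the size-comparison test are
assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Rectangle in fractions of the image; (0, 0) is the upper left corner."""
    x: float
    y: float
    w: float
    h: float


def rotate_cw(region):
    """Return the region as it appears after a 90 degree clockwise rotation."""
    # A point (x, y) moves to (1 - y, x); width and height swap.
    return Region(1.0 - region.y - region.h, region.x, region.h, region.w)


def is_suspect(stored_size, current_size):
    """Flag the region when the recorded pixel size no longer matches.

    Hypothetical check: sizes are (width, height) tuples in pixels.
    """
    return stored_size != current_size
```

Four successive clockwise rotations return the original region, which is a
convenient sanity check for the coordinate mapping.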
Three toolbar buttons for SubRegion whacking are on the "Preview" toolbar. Two
affect the Whack progression. One is depressed to show all the regions on the
currently focused file.
The two distinct types of subregion whacking both affect the selected files.
= whack multiple tags into a single region: after the region is drawn, the
normal whack progression will proceed, and all tags applied will be
associated with that region. Useful for tagging multiple characteristics of
an object.
= whack tags from the foregrounded tab into multiple regions in this file:
after the region is drawn, and the current tab is Whacked, instead of
progressing to the next Whack tab, the user can draw another region, and
tag into that region a tag from the same tab: useful coupled with a TagSet
that gathers a variety of "People" leaf nodes onto the "People" tab... the
user can easily tag the faces of multiple persons in the same photo: draw
a region, click a tag, draw another region, click another tag, etc.
If a rectangle is 'selected', the mouse should have a special cursor, the
image should have a colour shading to indicate that area whacking is activated,
and the status bar should indicate the state too, with some symbol.
The tagbag will show the buttons in the selected rectangle with an extra icon,
so you know which ones are in there.
The metadata chart will show only those icons directed at the rectangle.
Right-most toolbar button: show areas that are on the photo. XOR outline boxes
appear for all known areas. Use SnDb_GetListOfAreas(mCtrl) to get the areas to
show, using the metadata tree.
In this mode, we will support moving (click inside a box, and drag) and resizing
(click on the edge of an area when the mouse pointer changes to arrows, and
drag). Clicking and dragging away from an existing rectangle is how a new
rectangle is drawn. A sufficient threshold distance away from existing boxes is
required to make the user's intention to draw a new bounding box unambiguous.
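The move / resize / new-box decision can be sketched as a small hit test. The
pixel thresholds, the function name and the "ignore" outcome for presses that
are close to a box but not on its edge are assumptions; the description only
states that a sufficient threshold distance is required.

```python
def classify_press(boxes, px, py, edge=4, new_box_gap=10):
    """Return (action, box_index) for a mouse press at pixel (px, py).

    boxes is a list of (left, top, right, bottom) tuples in pixels.
    edge and new_box_gap are hypothetical thresholds in pixels.
    """
    nearest = None
    for index, (left, top, right, bottom) in enumerate(boxes):
        # Distance from the press to the box; 0 when the press is inside.
        dx = max(left - px, 0, px - right)
        dy = max(top - py, 0, py - bottom)
        outside = max(dx, dy)
        if outside == 0:
            # Inside the box: near an edge means resize, otherwise move.
            inset = min(px - left, right - px, py - top, bottom - py)
            return ("resize", index) if inset <= edge else ("move", index)
        if outside <= edge:
            return ("resize", index)
        if nearest is None or outside < nearest:
            nearest = outside
    # Far enough from every existing box: the user is drawing a new one.
    if nearest is None or nearest >= new_box_gap:
        return ("new", None)
    return ("ignore", None)
```

Treating near misses as "ignore" is one way to keep the intention to draw a
new bounding box unambiguous; other policies are possible.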
There is the concept of a selected rectangle. If you click in a rectangle, it gets
'selected', and mousing out of it no longer changes the metadata chart. In
that case, while you do any whacking or unwhacking you are editing the
metadata in the selected rectangle. The mouse pointer should show a rectangle
whacking overlay all the while.
Behavior of the mouse when viewing the "full size" image prioritizes the "click
and drag" action to reposition the viewable image area.
leftmost toolbar button: convenient naming mode (this only works in normal
mode, not in full size mode, because there's no way to reposition the image)
= click and drag on the image to mark a rectangular region around a face
= whack a single leaf-node (not just a buttontab)
= after whacking, selection goes away from area, and a click and drag
action on the Preview image draws another area
= control returns to the tab you were on when you pressed the convenient
naming mode toolbar button
Press the toolbar button again to unpress it and leave the mode.
You can whack and navigate any tabs and buttons, and all the data will be
assigned to the image with an area code on them.
You can edit the area code to refine it before pressing done.
When you press done, the area will be embedded as a child element of all the
leaf-level tags you just whacked.
In both modes, when areas are whacked, the simple rule is that they apply to the
lowest-level tag before qualifiers;
i.e., the area is a child of <stephen>, not <people>.
Mousing over a rectangle will combine all the metadata from each rectangle the
mouse is in. The metadata chart will show all metadata in the appropriate
rectangle. If the mouse is in no rectangle, then all metadata shows in the chart.
Use SnDb_GetListOfMetadata(areaItem);
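The mouse-over rule can be sketched as follows. This is an illustrative
stand-in, not the SnDb_GetListOfMetadata call itself: the function name, the
data layout (rectangles in fractions of the image paired with tag lists) and
the order-preserving de-duplication are all assumptions.

```python
def metadata_for_pointer(areas, all_metadata, mx, my):
    """Return the tags to show in the metadata chart for a pointer position.

    areas is a list of ((left, top, right, bottom), tags) pairs, with the
    rectangle expressed in fractions of the image (hypothetical layout).
    """
    hits = [tags for (left, top, right, bottom), tags in areas
            if left <= mx <= right and top <= my <= bottom]
    if not hits:
        # The pointer is in no rectangle: show all metadata for the file.
        return list(all_metadata)
    combined = []
    for tags in hits:
        for tag in tags:
            if tag not in combined:
                combined.append(tag)
    return combined
```

With overlapping rectangles, the chart receives the union of the tags from
every rectangle under the pointer.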
Rectangles in the metadata tree
A <sn:subregion> tag will be a child of the located node. It will have two sub
elements:
a param-taking <sn:subregion_id> element, which can contain a GUID, and a
<sn:subregion_location> element having a string containing the coordinates.
Another item is a <sn:point> tag, which has <sn:point_id> and
<sn:point_location> sub elements. Again, the point_id is a GUID and the
point location is in fractions of the photo, assuming that the photo has square
pixels, so 0,0 is the upper left corner of the upper left pixel.
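For illustration, the element structure described above could be produced
with Python's standard ElementTree module. The namespace URI, the
comma-separated format of the location string and the helper name are
assumptions; the description does not specify them.

```python
import uuid
import xml.etree.ElementTree as ET

SN_NS = "http://example.com/sn"  # hypothetical URI for the "sn" prefix
ET.register_namespace("sn", SN_NS)


def add_subregion(located_node, left, top, right, bottom):
    """Attach an sn:subregion element to the lowest-level located tag."""
    region = ET.SubElement(located_node, f"{{{SN_NS}}}subregion")
    # The id is a GUID, as described in the text.
    ET.SubElement(region, f"{{{SN_NS}}}subregion_id").text = str(uuid.uuid4())
    # Assumed format: four fractions of the photo, comma separated.
    location = f"{left},{top},{right},{bottom}"
    ET.SubElement(region, f"{{{SN_NS}}}subregion_location").text = location
    return region


people = ET.Element("people")
stephen = ET.SubElement(people, "stephen")
add_subregion(stephen, 0.25, 0.1, 0.5, 0.4)
xml_text = ET.tostring(people, encoding="unicode")
```

Here the subregion is a child of the stephen element rather than of people,
following the rule that areas apply to the lowest-level tag.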
Video
[Please refer to Figure 80 in Appendix 1] - Preview Pane showing Video
All Tags Tree Pane
[Please refer to Figure 81 in Appendix 1]
The All Tags Tree pane is the foundation for all other trees in the
application, and provides access to all classes of metadata and individual tags
available to the user in the application.
Through the All Tags Tree Pane users can create new tag vocabularies and
manage tags therein.
The All Tags Tree pane toolbar has the following functions:
= [Please refer to Figure 82 in Appendix 1 ] - Create a new tag vocabulary
= [Please refer to Figure 83 in Appendix 1] - Import a tag vocabulary from a
file
= [Please refer to Figure 84 in Appendix 1 ] - Save a tag vocabulary to a file
(for sharing or backup)
= [Please refer to Figure 85 in Appendix 1 ] - Publish a tag vocabulary to the
SwapNeat Community
= [Please refer to Figure 86 in Appendix 1] - Create a new tag: its Whack
Default property will be set automatically based on context
= [Please refer to Figure 87 in Appendix 1] - Create a new collection: its
Whack Default property will be set automatically to "Buttontab"
= [Please refer to Figure 88 in Appendix 1] - Perform a simple text find on
strings on nodes in the metadata tree
= [Please refer to Figure 89 in Appendix 1] - Find next occurrence
The five rightmost buttons on the toolbar change the node icons in the tree to
indicate different types of information.
= [Please refer to Figure 90 in Appendix 1] - Change node icons to indicate
the default Whack Interface presentation for all nodes
= [Please refer to Figure 91 in Appendix 1] - Change node icons to indicate
the ownership of all nodes
= [Please refer to Figure 92 in Appendix 1 ] - Change node icons to indicate
the guidance completion status for all nodes
= [Please refer to Figure 93 in Appendix 1] - Change node icons to indicate
the data compartment (tag vocabulary) for all nodes
= [Please refer to Figure 94 in Appendix 1] - Change node icons to indicate
the data type for all nodes
Tree Node Icons
The All Tags Tree node icons will change to one of the icons in each group,
depending on which of the five rightmost toolbar buttons is depressed. One of the
five buttons is always pressed, and no more than one can be pressed at any time.
Default Whack Interface Presentation
= [Please refer to Figure 95 in Appendix 1] - When the tag represented by
this node is added to the TagSet, it will appear in the Whack Interface as a
Tab
= [Please refer to Figure 96 in Appendix 1] - When the tag represented by
this node is added to the TagSet, it will appear in the Whack Interface as a
Buttontab
= [Please refer to Figure 97 in Appendix 1] - When the tag represented by
this node is added to the TagSet, it will appear in the Whack Interface as a
Button
= [Please refer to Figure 98 in Appendix 1] - When the tag represented by
this node is added to the TagSet, it will NOT appear in the Whack
Interface: descendant nodes will appear on the tab or buttontab that
represents the nearest non-bypassed ancestor node
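The bypass rule in the last item can be sketched as a walk up the tree. The
TagNode class, its field names and the presentation strings are hypothetical
and serve only to illustrate how descendants of a bypassed node find the tab
or buttontab that hosts them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TagNode:
    name: str
    presentation: str  # "tab", "buttontab", "button" or "bypass"
    parent: Optional["TagNode"] = None


def host_for(node):
    """Return the nearest non-bypassed ancestor, or None at the top level."""
    ancestor = node.parent
    while ancestor is not None and ancestor.presentation == "bypass":
        ancestor = ancestor.parent
    return ancestor
```

For a tree people (tab) > family (bypass) > stephen (button), host_for returns
the people node for stephen, so the stephen button appears on the people tab.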
It is important to know if a node is a collection. The following four icons indicate,
for any object in the Whack interface, whether it is also a collection node:
= [Please refer to Figure 99 in Appendix 1] - the node represented by this
tab is a collection node
= [Please refer to Figure 100 in Appendix 1] - the node represented by this
buttontab is a collection node
= [Please refer to Figure 101 in Appendix 1 ] - the node represented by this
button is a collection node
= [Please refer to Figure 102 in Appendix 1] - this bypassed node is a
collection node
Note that collection nodes are usually buttontabs or are bypassed: it is not
logical to have a collection node in the Whack interface that does not have
children, as collection nodes are not 'XML-making'. They just help organize the
All Tags tree and TagSet tree (and thereby the Whack interface).
Ownership
The colours of the icons indicate general ownership status:
= green: the currently logged-in user is the owner / moderator of this node
and all descendant nodes.
= blue: another user is the owner / moderator of the tag vocabulary, and the
currently logged-in user has accepted this tag vocabulary for use in their
app. All standard moderated tag vocabularies (i.e. ID3, XMP, DublinCore,
etc.) fall into this category.
= orange: descendant nodes are owned by another user, but the currently
logged-in user has NOT yet accepted this node for general use in their
app.
= red: another user is the owner / moderator of the tag vocabulary, and this
node contains nodes that the user has REJECTED for general use in
their app. They will be suppressed throughout the app, except when performing
searches.
= black: this node has descendants that are "renegade" nodes. Renegade
nodes are indicated where a previous version of a tag vocabulary
contradicts the latest version of the tag vocabulary
The ownership indicators at the node level are important because, through the
use of "Collection" nodes, the user can arbitrarily reorganize their All Tags Tree
for various reasons, and a node's natural context can be obscured by such
reorganizations.
Additionally, there are different types of nodes:
= the 8 spoked 'asterisks' are tag vocabulary root nodes: some of the spokes
are missing to indicate a partial tag vocabulary.
= the "building block" icons indicate the top-level node of a tag vocabulary
module
= the "bracketed buttons" icons indicate that a node can accept particular
qualifiers
= the "tag" nodes are regular tags within a tag vocabulary
Tag Vocabulary Root Nodes
= [Please refer to Figure 103 in Appendix 1 ] - This is a root node of a tag
vocabulary that is owned by the user currently logged into the SwapNeat
Metadata Studio application
= [Please refer to Figure 104 in Appendix 1] - This node can only be
displayed when the logged-in user has previously set up a tag vocabulary
and is now running the application under a different Windows Logon ID, or
somehow not under the management of this installation of SwapNeat
Metadata Studio. The user's ID is included in the discovered tags, and is
known to be the same as that of the currently logged-in user.
= [Please refer to Figure 105 in Appendix 1] - This is a root node of a tag
vocabulary that has been "discovered": tags from this tag vocabulary have
been found in files, but the full tag vocabulary is not available within
SwapNeat Metadata Studio application. Without accessing the SwapNeat
Community to retrieve the entire tag vocabulary file, it is impossible to
know how much of the tag vocabulary has been discovered, so the icon
indicates that it is a "partial" tag vocabulary. The user has not yet
accepted nor rejected these nodes for general use in their tag library: if
they accept, the entire tag vocabulary will be retrieved from the SwapNeat
Community Website, and the node colour will change to "blue" to indicate
a full third-party tag vocabulary.
= [Please refer to Figure 106 in Appendix 1 ] - This is a root node of a tag
vocabulary that is complete (i.e. not "discovered") but is NOT owned by
the user currently logged into the SwapNeat Metadata Studio application.
= [Please refer to Figure 107 in Appendix 1] - The user has accepted the
discovered nodes, but has not elected to download the entire tag
vocabulary from the SwapNeat Community. In future releases, it may be
that to "accept" discovered tags will mean non-optional acceptance of the
entire tag vocabulary.
= [Please refer to Figure 108 in Appendix 1] - This is a root node of a tag
vocabulary that has been "discovered", after which time the user chose to
download the entire tag vocabulary from the SwapNeat Community, but
then the user subsequently decided to "reject" the tag vocabulary: the user
does NOT want the tags to be visible in the application generally, even
though due to the presence of the metadata in a file under management of
the application, the tag is known to the application. The user can SEARCH
on the tags present in rejected tag vocabularies, but they will not be
exposed to these tags elsewhere in the application.
= [Please refer to Figure 109 in Appendix 1] - This is a root node of a tag
vocabulary that has been "discovered", but the user did not elect to
download the entire tag vocabulary (leaving all nodes in that tag
vocabulary as "discovered") but the user has subsequently decided to
"reject" the tag vocabulary: the user does NOT want the tags to be visible
in the application generally, even though due to the presence of the
metadata in a file under management of the application, the tag is known
to the application. The user can SEARCH on the tags present in rejected
tag vocabularies, but they will not be exposed to these tags elsewhere in
the application.
= [Please refer to Figure 110 in Appendix 1] - Indicates that the tag
vocabulary has been completely downloaded from the SwapNeat
Community, but that it contains nodes that are incompatible / missing from
the most current version of the tag vocabulary.
= [Please refer to Figure 111 in Appendix 1] - This icon indicates that the tag
vocabulary has NOT been completely downloaded from the SwapNeat
Community (all tags in this tag vocabulary are "discovered"), but more
recent versions of "discovered" tags contradict the structure / are
incompatible / missing from the most current version of the discovered
tags.
Module Root Nodes
= [Please refer to Figure 112 in Appendix 1] - Root node of a module
created by the currently logged-in user.
= [Please refer to Figure 113 in Appendix 1] - Root node of a module NOT
created by the currently logged-in user, that has been accepted for use
within the application by the user.
= [Please refer to Figure 114 in Appendix 1] - Root node of a module NOT
created by the currently logged-in user, that has been discovered, and not
yet accepted for use within the application by the user.
= [Please refer to Figure 115 in Appendix 1] - Root node of a module NOT
created by the currently logged-in user, that has been rejected for use
within the application by the user.
= [Please refer to Figure 116 in Appendix 1] - Root node of a module that
has "renegade" nodes, or is rooted at a spot that is itself a renegade node.
Qualifier Nodes
= [Please refer to Figure 117 in Appendix 1] - Qualifier node in a tag
vocabulary owned by the current user
= [Please refer to Figure 118 in Appendix 1 ] - Qualifier node in an accepted
third-party tag vocabulary
= [Please refer to Figure 119 in Appendix 1] - Discovered Qualifier node in a
third-party tag vocabulary
= [Please refer to Figure 120 in Appendix 1] - Qualifier node in a rejected
discovered third-party tag vocabulary
= [Please refer to Figure 121 in Appendix 1] - Qualifier node that is itself a
renegade (its host tag vocabulary doesn't contain this node in a later
version) or the node is a descendant of a renegade node
Tag Nodes
= [Please refer to Figure 122 in Appendix 1] - A tag that is in a tag
vocabulary that is owned by the currently logged-in user.
= [Please refer to Figure 123 in Appendix 1] - A tag that is in a tag
vocabulary that is not owned by the currently logged in user. This tag
could be described as a "third-party tag"
= [Please refer to Figure 124 in Appendix 1] - A tag that is in a tag
vocabulary that has been 'discovered'. This could be described as a
"discovered tag".
= [Please refer to Figure 125 in Appendix 1] - A tag that is in a tag
vocabulary that has been rejected. An individual tag can not be rejected
without rejecting the entire tag vocabulary.
= [Please refer to Figure 126 in Appendix 1] - A tag that is a renegade,
regardless of the ownership of the ancestor nodes.
Guidance Completeness Status
These icons help the user know what types of help are missing, with creation of
new Lang Packs in mind.
There are 5 items to be filled:
= Friendly Name
= Tooltip
= Description
= Whack Prompt
= Creation Guidance
Friendly Name is the most significant thing to have entered for distributing a new
Lang Pack: translations of the other four types of tag help are useful, too, but not
crucial to using the lang pack, as Friendly Name is.
They are listed in order of "importance". That is, it is important for the user to
know what's missing, not what's present.
Missing Everything:
= [Please refer to Figure 127 in Appendix 1] - no friendly name nor help of
any kind: nodes that have this icon have been separated from their default
lang pack and from the SwapNeat Metadata Studio application; the icon also
appears on discovered tags. The app has nothing but the actual XML keyword,
either from the source tag vocabulary file or from a node that was discovered
embedded in a data file.
Has some stuff, but NOT the friendly name
= [Please refer to Figure 128 in Appendix 1] - it is important when creating a
Lang Pack that all nodes have Friendly name: Discovered nodes do NOT
have the Friendly Name filled in, so this will inform the user that the
friendly name is still required, regardless of how complete other help text
strings are.
Only has Friendly Name
= [Please refer to Figure 129 in Appendix 1] - Only has Friendly Name in the
default language: this is most common for end users creating tag
vocabularies for their own use: it is unlikely they will enter extensive
documentation for the tag vocabulary, until they decide to share it. There
may be some notes to themselves as a reminder of certain characteristics
of the tag, but until they want to share the tag vocabulary, Friendly Name
is most likely the only language pack content in the tag vocabulary.
Friendly Name + at least one other in default language
= [Please refer to Figure 130 in Appendix 1] - friendly, and partially
populated help strings: the user has begun to enter additional help strings,
but has not completed them all.
The rest of the icons are used to indicate that only one type of help is missing,
but the friendly name IS filled in: very useful for putting the finishing touches on
an otherwise complete language pack. For cases where certain types of
guidance are not necessary, it is still recommended that a default text value be
entered as confirmation that no other descriptive value is needed (something like
"not applicable" or "see Tag Description" etc.):
= [Please refer to Figure 131 in Appendix 1] - lacks brief descriptive text
designated for use in 'tooltips' when the mouse pointer hovers over a
representation of the tag, either in the Whack interface button or a tree
node that represents the tag.
= [Please refer to Figure 132 in Appendix 1] - lacks description (often all that
is available as 'description' of a tag in third-party tag vocabularies, like
XMP, IPTC, DublinCore etc.)
= [Please refer to Figure 133 in Appendix 1] - needs whack prompt: most
useful for novice taggers. Whack prompt is useful when associated with a
tag that has child tags: when a tab is foregrounded, it is presented to the
user to advise them what to look for, so the tagger will choose
properly from available buttons on a tab.
= [Please refer to Figure 134 in Appendix 1] - needs creation guidance: most
important if sharing tag vocabulary, or if the tag vocabulary is designed to
be extended by the addition of modules. See description of "Creation
Guidance" in the Tag Properties pane.
Full Help has been entered for the currently selected default language:
= [Please refer to Figure 135 in Appendix 1] - fully populated with friendly
name and help strings
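The icon selection rules in this section can be summarised in a short sketch.
The field keys and the returned status names are hypothetical labels chosen
for illustration; only the five guidance items and the grouping of the icons
come from the description.

```python
HELP_FIELDS = ("tooltip", "description", "whack_prompt", "creation_guidance")


def guidance_status(strings):
    """Classify a node by which of its five guidance strings are filled in.

    strings maps "friendly_name" and the HELP_FIELDS keys to text or None.
    """
    filled = {name for name, text in strings.items() if text and text.strip()}
    others = [field for field in HELP_FIELDS if field in filled]
    if "friendly_name" not in filled:
        return "missing_everything" if not others else "missing_friendly_name"
    if not others:
        return "friendly_name_only"
    missing = [field for field in HELP_FIELDS if field not in filled]
    if not missing:
        return "complete"
    if len(missing) == 1:
        # One of the four "only one type of help is missing" icons.
        return "missing_" + missing[0]
    return "partial"
```

A node with a friendly name and every help string except the tooltip would be
classified as "missing_tooltip" under these assumed labels.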
Compartment
= [Please refer to Figure 136 in Appendix 1] - SwapNeat Metadata
= [Please refer to Figure 137 in Appendix 1] - Dublin Core
= [Please refer to Figure 138 in Appendix 1] - Exchangeable Image File
= [Please refer to Figure 139 in Appendix 1] - ID3 (embedded MP3 file
metadata)
= [Please refer to Figure 140 in Appendix 1 ] - MP3 (statistics related to
technical properties of MP3 files)
= [Please refer to Figure 141 in Appendix 1] - International Press
Telecommunications Council
= [Please refer to Figure 142 in Appendix 1] - Resource Description
Framework
= [Please refer to Figure 143 in Appendix 1] - Tagged Image File Format
= [Please refer to Figure 144 in Appendix 1] - eXtensible Metadata Platform
Datatype
= [Please refer to Figure 145 in Appendix 1] - this node is a root node: it is
not colour-coded to indicate ownership or anything else: it's grey.
= [Please refer to Figure 146 in Appendix 1] - this node is a 'collection' node:
a node that is used only to organise the All Tags tree, and does not get
embedded.
= [Please refer to Figure 147 in Appendix 1 ] - This node is a param-taking
node, where there is a set list of possible values
= [Please refer to Figure 148 in Appendix 1 ] - a standard "tag" from any tag
vocabulary / compartment
= [Please refer to Figure 149 in Appendix 1 ] - this is a creation guidance
node
Param taking nodes sometimes have specific datatypes
= [Please refer to Figure 150 in Appendix 1] - This param taking node
accepts a numeric value.
= [Please refer to Figure 151 in Appendix 1] - This param taking node
accepts a string value.
= [Please refer to Figure 152 in Appendix 1] - This param taking node
accepts an uncommon string that is not saved for future reuse.
= [Please refer to Figure 153 in Appendix 1] - This param taking node
accepts a binary object that has been binhexed.
= [Please refer to Figure 154 in Appendix 1] - This param taking node
accepts a special string value that is a URI / URL
= [Please refer to Figure 155 in Appendix 1 ] - This param taking node
accepts a "date" value
= [Please refer to Figure 156 in Appendix 1 ] - This param taking node
accepts the portion of a date that is the time of day only (not the full date)
= [Please refer to Figure 157 in Appendix 1] - This param taking node
accepts a special string value that is designed to be read by humans, but
is not necessarily easily searched or parsed by a computer.
For all other types of param-taking nodes
= [Please refer to Figure 158 in Appendix 1] - This node is a param taking
node: the value will be shown as a child node in the tree. The "values" for the
param are represented with the "tag" icon, above.
= [Please refer to Figure 159 in Appendix 1] - this node is an attribute /
attribute value
In the Metadata tree, problematic embedded metadata will be indicated:
= [Please refer to Figure 160 in Appendix 1] - This node or one of its
descendant nodes has bad metadata. Expand the branches until you get
to the deepest-nested node that has this icon and correct it.
The Metadata tree can also display the location of the file as nested path
segments.
= [Please refer to Figure 161 in Appendix 1] - This node represents a portion
of the path to the file on the file system.
Here is a list of all the 'param-taking' data types:
= PT_MASK 0xf
= PT_RATIONAL_3_1 0x5
= PT_RATIONAL_3_1S 0x6
= PT_RATIONAL_2_2 0x7
= PT_RATIONAL_1_3 0x8
= PT_RATIONAL_4_0 0xf // assume a denom of 256
= PT_RATIONAL_4_0S 0x10 // assume a denom of 256
= PT_RATIONAL_4_0SS 0x11 // assume a denom of 65536
= PT_UNKNOWN 0x0
= PT_DATE 0x1
= PT_S_NUMBER 0x2
= PT_US_NUMBER 0x3
= PT_STRING 0x4
= PT_RATIONAL_3_1 0x5
= PT_RATIONAL_3_1S 0x6
= PT_RATIONAL_2_2 0x7
= PT_RATIONAL_1_3 0x8
= PT_RATIONAL 0x9 /* generic */
= PT_FLOAT 0xa /* have not seen any yet */
= PT_TREE_ROOT 0x0b
= PT_OPAQUE 0x0b
= PT_CHOICE 0x0c
= PT_POLYGON 0x0d // has an int, then pairs of floats.
= PT_GPS 0x0e // has a sequence of doubles. time, lat, long, alt for camera
loc, center of photo, corners of photo etc.
= PT_RATIONAL_4_0 0xf // assume a denom of 256
= PT_RATIONAL_4_0S 0x10 // assume a denom of 256
= PT_RATIONAL_4_0SS 0x11 // assume a denom of 65536
= PT_LAST 0x11+1 // for looping.
= PT_LAST PT_GPS+1 // for looping.
= pt_metadataTreeItem pt_metadataTreeItemArray[use_metadataTreeItem]
= pt_metadataTreeItemCompartment
pt_metadataTreeItemCompartmentArray[use_metadataTreeItem]
= PT_POLYGON 0x0d // has an int, then pairs of floats.
= PT_GPS 0x0e // has a sequence of
Tag Properties Pane
The Tag Properties pane is where the user can examine and modify all the
properties of a tag vocabulary, a specific tag, or a non-XML node (a collection or
pointer). This pane hosts sections in a Rollup format: like a window shade, each can
be rolled up and down. (This "rollup" style control is used elsewhere in the app: in
the Search pane and the Metadata Chart pane.)
The contents track the currently focused tag. The focus is usually synchronized
across numerous panes: for instance, if the user is currently viewing the "Events"
tab in the Whack Interface, the properties pane will be filled with all the details
regarding the "Event" tag, and the tag will be foregrounded / indicated in the
various trees (the Metadata, All Tags, and TagSet trees) in their respective
panes. The Tag Properties pane content will also change if the user clicks nodes
in the Metadata Tree, or clicks an item in the Metadata chart.
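One common way to keep a focus synchronized across several panes is a shared
model that notifies subscribers. The sketch below is an assumption about how
this could be arranged, not a description of the application's internals; the
FocusModel and Pane classes and their members are hypothetical.

```python
class FocusModel:
    """Holds the currently focused tag and notifies subscribed panes."""

    def __init__(self):
        self._listeners = []
        self.focused_tag = None

    def subscribe(self, callback):
        self._listeners.append(callback)

    def set_focus(self, tag):
        if tag == self.focused_tag:
            return  # nothing changed, so no pane needs to refresh
        self.focused_tag = tag
        for callback in self._listeners:
            callback(tag)


class Pane:
    """Minimal pane that records which tag it is currently showing."""

    def __init__(self, name, focus):
        self.name = name
        self.shown = None
        focus.subscribe(self.on_focus)

    def on_focus(self, tag):
        self.shown = tag
```

Clicking a tab, a tree node or a chart item would each call set_focus, and
every subscribed pane would then show the same tag.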
Almost all aspects of a keyword or Tag Vocabulary can be managed from this
convenient pane without having to navigate the main menu or calling up a
properties dialog.
Whenever a non-root node is selected, the properties for the tag vocabulary that
hosts the node will be displayed in the two lowest rollups in the Tag Properties
pane: it is often important to be able to examine the overall tag vocabulary
properties while examining the properties of a specific tag.
General Properties
[Please refer to Figure 162 in Appendix 1 ]
This section of the Properties pane allows the user to enter a Friendly Name
for the tag in various languages, view or change the Tag Type, and edit the
available / allowed parameter values if the tag is a parameter type.
The tag can also be added to the Metadata Chart.
The default way the tag participates in the Whack interface can be adjusted
here,
including bypassing the tag.
Guidance
[Please refer to Figure 163 in Appendix 1 ]
SwapNeat tags are not just structured. Each tag has its own metadata
associated with it.
Note that for each type of help listed below, text can be associated with other
languages, creating what is known as a "lang pack" for a tag vocabulary: users
will be able to create, share, and import these lang packs so an international
multi-lingual community can use tags with the exact same position in the specific
tag vocabulary 'namespace', even though they are presented in a user-selectable
language.
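The lang pack idea can be sketched as a lookup keyed by language, tag position
and help type. The nested dictionary layout, the key names and the fallback to
a default language when a translation is missing are assumptions made for this
illustration.

```python
def guidance_text(lang_packs, tag_path, kind, language, default_language="en"):
    """Return the help text for a tag in the requested language.

    lang_packs maps language code -> tag path -> help type -> text.
    Falls back to default_language (an assumed policy) and then to None.
    """
    for lang in (language, default_language):
        text = lang_packs.get(lang, {}).get(tag_path, {}).get(kind)
        if text:
            return text
    return None


packs = {
    "en": {"sn:places": {"friendly_name": "Places",
                         "tooltip": "Where the file was made"}},
    "fr": {"sn:places": {"friendly_name": "Lieux"}},
}
```

The tag keeps the same position in the vocabulary regardless of language; only
the text presented to the user changes.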
= ToolTip Help: This text is displayed when the mouse is hovering over the
button or tab. Generally it would be a brief description of the tag definition,
or an enticement to examine the "Guidance" pane.
= Tag Description: This is the detailed definition of the tag, often supplied as
'documentation' for each tag in a standardized tag vocabulary. Note this
only defines what the tag itself means, and possibly a highly-general
description of the type of data contained therein.
= Whack Prompt: This is text that appears in the Guidance pane when it is
in "Whack Prompt" mode. The "Whack Prompt" text generally suggests to
the user what specific types of things to examine in the file being tagged:
for example, suggesting to look for objects or persons in motion on the
'Actions' tab, or ways that the user can determine the name of the
album on which a specific music file was originally released (as
opposed to a 'greatest hits' package).
= Creation Guidance: This is text that appears on the Guidance pane when
the user is in Tag Management mode, but is also useful in Whack mode
(where, while tagging files, it is often necessary to create new tags). It is
an extremely powerful mechanism for a tag vocabulary moderator to craft
a 'framework' that individual users will populate with their own tags, but
those tags should fit into a general guideline. For example, when creating
new tags on the "Places" tab, it is suggested to start with a Country. Then,
under that country, it is suggested to use the next jurisdictional region (in
the United States, the "State" would be entered). The guidance can be
structured prompting the user down many levels of hierarchy to create
tags of a specific nature, so an inexperienced user will not be confused or
craft a sloppy, disorganized tag vocabulary. For more information see the
"Guidance Pane".
Data Storage
[Please refer to Figure 164 in Appendix 1]
The "Compartment" field is shown for every tag, even though it is actually a
property of the tag vocabulary to which the tag belongs. This is important to know
per-tag, because a tag can be repositioned in the All Tags tree, meaning it can
reside out of context of its natural root node.
The Data Type is one of a selected number of standard data types, including:
The Data Format is a sub-format of the data type
The "Allow Text Searches..." checkbox enables searching of the non-indexed
parameter tag string values, but requires significant memory to be allocated to
load the strings into memory.
Statistics
[Please refer to Figure 165 in Appendix 1]
This rollup shows some interesting usage statistics, including the number of uses,
when the tag was originally created, and when it was last used to tag a file.
Tag Vocabulary
[Please refer to Figure 166 in Appendix 1]
The "Tag Vocabulary" rollup in the Tag Properties pane provides important
information about the tag vocabulary to which the focused tag belongs.
URI, Moderator ID, and Prefix are all important means to uniquely identify the tag
throughout the SwapNeat Community.
There are two buttons in this area: one allows you to visit the tag's parent
vocabulary's forum on the SwapNeat Community Website; the other, "Update from
Web", will retrieve the latest version of this tag's parent vocabulary.
The "Installed Version" is known to the application from a downloaded Tag
Vocabulary file.
"Discovered Version" is populated only if tags have been discovered in files, and
the metadata indicates that those tags are from this tag's parent vocabulary, but
have a LATER version number.
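The rule for populating "Discovered Version" can be sketched as a version
comparison. The dotted-integer version format and both function names are
assumptions; the description does not state how version numbers are written.

```python
def parse_version(text):
    """Split an assumed dotted version string such as "1.10" into integers."""
    return tuple(int(part) for part in text.split("."))


def discovered_version(installed, found_in_files):
    """Return the newest version found in files if it is later than installed.

    Returns None when no discovered tag carries a later version, in which
    case the "Discovered Version" field would stay empty.
    """
    newer = [version for version in found_in_files
             if parse_version(version) > parse_version(installed)]
    return max(newer, key=parse_version) if newer else None
```

Comparing integer tuples rather than strings avoids ranking "1.10" below
"1.3".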
Optionally, the user can retrieve "SwapNeat Search Index" files from the
SwapNeat Community: after doing so, the application will be aware of the latest
version of ALL tag vocabularies available from the SwapNeat Community
Website, and present the latest available version number here.
Tag Vocabulary Publishing
[Please refer to Figure 167 in Appendix 1 ]
The user is able to publish a tag vocabulary from the Properties dialog.
Before publishing, the user may want to examine the various tags in the tag
vocabulary to ensure they are ready for publishing. When their examination is
complete, the "Tag Vocabulary Publishing" rollup in the Tag Properties pane is
available for this purpose.
The user has a number of options at publish time to specify how broadly the
published tag vocabulary will be available:
= My Use Only (not published) - The tag vocabulary will be saved to the
local file system as an SNTV file. The user can save the file to a
removable disk or USB memory key, or email the file for use by other
SwapNeat Metadata Studio users, or by themselves on another computer
or Windows XP user logon identity.
= My Use Only (published for remote access) - this allows the user to store
the tag vocabulary and lang packs on the SwapNeat Community Website
for easy retrieval on another computer, using the same SwapNeat ID as
they used to create and publish the tag vocabulary on their primary
computer.
= Invitation Only - publishes the tag vocabulary to their "group" forum,
access to which is controlled by them: they will invite other users to
access all of their 'shared' tag vocabularies, of which the tag vocabulary
being published is one.
= Public (for all users) - if the user wishes to make the tag vocabulary
available to all SwapNeat Metadata Studio users, they would choose this
option, which requires they also choose a Category into which the tag
vocabulary will be published, as well as an existing discussion forum if this
tag vocabulary is a replacement for or contribution to another tag
vocabulary. This user's ID must have access privileges for the destination
forum, or the publish action will fail.
New Vocabulary
[Please refer to Figure 168 in Appendix 1]
At any time, a user may create a new Tag Vocabulary, which appears in the All
Tags tree as a lone "root" node. All tag vocabularies created this way are in
the
SNMD Compartment.
= The user should specify a friendly name that will be the common name in
the user's default language.
= The Prefix will be automatically generated from an abbreviation of the
Friendly Name, but the user can overtype what is generated automatically
if they want to designate their own prefix.
= The URI will also be automatically generated based on the user's
SwapNeat ID and the prefix generated or entered by the user.
= The user must also select a general category for the Tag Vocabulary (note
this is not necessarily the same category into which the tag vocabulary
can later be published.)
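The automatic generation of the Prefix and URI described above might look like the following sketch. The abbreviation rule (lowercase initials of each word), the function names, and the URI layout are all assumptions made for illustration; the text states only that the prefix is an abbreviation of the Friendly Name and that the URI is based on the SwapNeat ID and the prefix.

```python
import re


def make_prefix(friendly_name: str, override: str = "") -> str:
    """Return the user's own prefix if one was typed, else lowercase initials.

    Using initials as the abbreviation rule is an assumption.
    """
    if override.strip():
        return override.strip()
    words = re.findall(r"[A-Za-z0-9]+", friendly_name)
    return "".join(word[0] for word in words).lower()


def make_uri(swapneat_id: str, prefix: str,
             base: str = "http://www.swapneat.net/xml") -> str:
    """Combine the SwapNeat ID and the prefix into a URI (hypothetical layout)."""
    return f"{base}/{swapneat_id}/{prefix}"


prefix = make_prefix("Major League Baseball")
uri = make_uri("jsmith", prefix)  # "jsmith" is a made-up SwapNeat ID
```

Because the URI is derived from the prefix that is finally chosen, a prefix overtyped by the user flows through to the URI without further input.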
The user has some additional options:
= They can designate the tag vocabulary as a Module, and choose the root
anchor point for the module in another tag vocabulary
= they can migrate a branch of an existing tag vocabulary to become a new
tag vocabulary, or as a module or qualifier for other tag vocabularies or
tags respectively.
= they can designate the tag vocabulary to be a qualifier for other nodes.
When all the fields are filled, the user clicks the "Create Vocabulary Now"
button,
and a new root node will appear in the All Tags Tree as a sibling to the
"Discovered Tag Vocabulary" attractor node.
The last two buttons are used in general tag vocabulary maintenance.
= Update SNSI for Vocabulary - republishes the keyword index to the
SwapNeat Community for access by those with access privileges to the
overall tag vocabulary.
= Increment Major Revision - the tag vocabulary moderator can, at their
discretion, decide that a significant number of tags have been created or
undergone changes such that a new "major version" number is required.
Note that minor revision numbers are automatically generated each time
the tag vocabulary is published.
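The two revision rules above can be sketched as a small counter. The class name, the starting value of 1.0, and the reset of the minor number when the major number is incremented are assumptions; the text states only that minor revisions are generated on every publish and that major revisions are incremented at the moderator's discretion.

```python
from dataclasses import dataclass


@dataclass
class VocabularyRevision:
    """Hypothetical major/minor revision counter for a tag vocabulary."""
    major: int = 1
    minor: int = 0

    def publish(self) -> str:
        """Publishing always generates a new minor revision number."""
        self.minor += 1
        return str(self)

    def increment_major(self) -> str:
        """Moderator's decision; resetting minor to 0 is an assumption."""
        self.major += 1
        self.minor = 0
        return str(self)

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}"
```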
The three buttons at the bottom of the Properties Pane allow the user to
perform
routine management tasks for the tag vocabulary:
Guidance Pane
[Please refer to Figure 169 in Appendix 1 ]
The Guidance Pane is a fundamental enabler for metadata tagging: it explains
to
the user not only what the tags mean, but what to consider when using the
tags,
and how to create new tags that fit with the overall purpose of the tag
vocabulary.
The guidance pane also serves as a general metadata usage tutorial area,
offering both step-by-step instructions in support of specific tagging tasks,
and
access to usage tutorial videos.
The content displayed in the Guidance pane is derived from the "Help" area of
the Tag Properties dialog (with the exception of the Tooltip, which is
displayed as a tooltip when the mouse pointer hovers over a user interface
component that represents a tag).
The Guidance Pane has a toolbar that allows the user to select which type of
guidance they desire.
The three buttons to the left of the separator are for tag-specific guidance
and
help. When the user clicks any object in the application that represents a Tag
in a
particular tag vocabulary, the contents of the pane are updated to reflect a
specific type of information for that tag. The contents of the Guidance Pane
in these cases are not affected by changes to operational mode.
The two buttons to the right of the separator provide application usage
guidance
based on the current mode of operation (based on the state of the buttons on
the
"Modes" toolbar). When the user changes modes, the contents of this pane
change to provide tutorials or videos relevant to this mode of operation: the
contents of the guidance pane in these cases are not affected by changes of
focus to different tags.
Tag Description
[Please refer to Figure 169 in Appendix 1 ]
The first button on the Guidance Pane toolbar when pressed puts the Guidance
pane into a mode where it shows the general "Tag Description" for whichever
tag
has focus.
Creation Guidance
[Please refer to Figure 170 in Appendix 1 ]
The second button on the Guidance Pane toolbar when pressed puts the
Guidance pane into a mode where it shows "creation guidance". This text
informs
the user of the nature of tags they should create as immediate children of the
focused tag.
Creation Guidance can be developed to anticipate deeply nested structures: the
user will not only get creation guidance for children of the tag represented
by the
focused user interface object, but also guidance as they create deeply nested
descendant branches, as deep as the original tag vocabulary moderator
provided.
For example, creation guidance would instruct the user that, when creating
tags on the "Places" tab, they should use the name of a country; then, under
the country node they just created, the next governmental structure, like a
province or a state; and then the name of a municipality or county, whichever
is most suitable for the country.
Creation guidance can also branch: for example, the user could be instructed
that if the node created under "province" was a "county", the NEXT node deeper
that they create should be a municipality, but if it was a municipality, the
next
node down should be a neighbourhood within that municipality, and so on.
Creation Guidance does not provide the user with specific tags they should
create, but rather "guides" them to create new tags within a general metadata
structure that makes sense within the context of the rest of the tag
vocabulary.
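Nested and branching creation guidance of the kind described in this section could be modelled as a tree of guidance texts. The sketch below is illustrative only: the GuidanceNode structure, the "*" wildcard key, and the guidance wording are assumptions, and the "Places" content simply restates the example given above.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Sequence


@dataclass
class GuidanceNode:
    """Guidance text for the children of a tag, plus guidance for deeper levels."""
    text: str
    # Keyed by the kind of node the user just created; "*" matches any kind.
    branches: Dict[str, "GuidanceNode"] = field(default_factory=dict)


def guidance_for(root: GuidanceNode, kinds: Sequence[str]) -> Optional[str]:
    """Follow the kinds of nodes created so far and return the next guidance."""
    node = root
    for kind in kinds:
        nxt = node.branches.get(kind) or node.branches.get("*")
        if nxt is None:
            return None  # the moderator provided no guidance this deep
        node = nxt
    return node.text


places = GuidanceNode(
    "Use the name of a country.",
    {"*": GuidanceNode(
        "Use the next governmental structure, like a province or a state.",
        {"*": GuidanceNode(
            "Use the name of a municipality or county.",
            {
                "county": GuidanceNode("Use the name of a municipality."),
                "municipality": GuidanceNode("Use the name of a neighbourhood."),
            },
        )},
    )},
)
```

Calling guidance_for(places, ["country", "province", "county"]) follows the "county" branch, while a path ending in "municipality" follows the other branch. A path deeper than the moderator anticipated returns None.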
Whack Prompt
[Please refer to Figure 171 in Appendix 1]
The Whack prompt is similar to the Creation Guidance, but serves a different
purpose: it is text that instructs the user what they should consider when
using tags that are immediate children of the tag represented by the focused
user interface control. For example, if the user is tagging photos of people,
and the "People" node has immediate children for "Family", "Friends", and
"Neighbours", the Whack prompt would instruct the user thusly: "Consider the
general nature of the people in the photo, and their relationship to you."
Usage Guidance
[Please refer to Figure 172 in Appendix 1]
Usage guidance is designed to help users that are learning about metadata
generally and the SwapNeat Metadata Studio application specifically to learn
how to be most productive in accomplishing tasks with the application.
The Usage Guidance is combined into the "Guidance" pane along with the Tag
Guidance because the tasks related to USAGE questions do not usually require
direct access to a tag's specific help text, and "usage guidance" is not
helpful unless it is at the user's fingertips: the user is likely to keep the
Guidance pane docked and visible because of the value the pane has in
delivering tag-specific help, then click the "Usage Guidance" toolbar button
when they want to accomplish a task relevant to the operational mode.
The "Usage Guidance" consists of static HTML pages that use principles of
"Dynamic HTML" to present a step-by-step task workflow using only standard
HTML / JavaScript / CSS.
Video Tutorials
[Please refer to Figure 173 in Appendix 1 ]
Video tutorials provide more in-depth usage tutorials for the most challenging
or
difficult-to-grasp concepts of metadata, and usage of the SwapNeat Metadata
Studio application. Delivered dynamically from the SwapNeat website, this pane
provides an opportunity for SwapNeat Inc. to support users with new tutorials
should any widespread usage difficulties arise after the product is in the
hands of
customers.
Statistics
The Statistics pane shows users how various tags have been used.
Metadata Chart Pane
[Please refer to Figure 174 in Appendix 1 ]
The Metadata Chart contains two rollup sections. (A rollup is a GUI control
that can hide its content (rolled up) or display it (unrolled). This feature
allows either section to be visible or rolled up, which is useful if the pane
is long and thin.)
Each section displays one of either structured or non-structured metadata for
the focused file.
= A user-configurable selection of non-structured keywords is displayed in
the rollup with the heading "Photo Keywords".
= The structured SNMD metadata is displayed in the rollup with the heading
"Structured Metadata".
In contrast to other tag vocabulary compartments that operate mostly as flat
field
/ value pairs, SwapNeat Metadata (SNMD) is richly structured. The ideal way to
review structured metadata is in a tree control (which we provide via the
"Metadata Tree" pane, described later) but for at-a-glance, compact viewing of
embedded, structured metadata, the Metadata Chart provides such a facility.
General Information Section
The "General Information" rollup, by default, displays values from fielded
metadata compartments appropriate for the focused file's type.
For Photos, fields such as "EXIF/Original Date and Time", "EXIF/Exposure
Time", "TIFF/Model" and "TIFF/Orientation" are listed, with an "=" character
separating the field name from the value. If the selected file is an MP3 file,
metadata such as "ID3/Lead artist" and "MP3/bit rate" are shown. For Video,
the
encoding scheme and aspect ratio are shown.
The list of fields displayed can be modified, by adding, removing, or changing
the
order in the list. The user can click a value to obtain more detailed info or
get
directions to where the keyword exists within the overall metadata tag
library.
This list of metadata is completely customizable by the user. Anywhere within
the app, the user can select a piece of metadata from the standard
compartments (i.e., EXIF, TIFF, ID3) and have it shown in the "General
Information" section. Furthermore, the list can be reordered to the user's
preference. The same list is used for displaying tooltips in the Thumbstrip
and Preview panes. The list itself is stored within the SwapNeat database,
which is unique for each user on the computer system.
Another unique feature of this control is that the list can be saved by the
user into a file with the .snmc extension and then shared with other users.
The SwapNeat Metadata Studio application has registered the .snmc extension,
such that when this type of file is activated, by double-clicking or other
means, the SwapNeat application is started and subsequently loads the .snmc
file into the "General Information" rollup.
A few example uses of this feature are:
1) Professional DJs may be more interested in music tags that are related to
the audio characteristics, such as Beats per Minute, Tempo, PeakValue, Average
Level and mood, where other applications for DJs may be more about the theme
of the music or appropriateness for a specific occasion, rather than a
consistent tempo. By providing industry-specific .snmc files for sharing, the
application can be easily customized or packaged for specific professions.
2) Printed media experts may be more interested in metadata tags such as
Headline, Credit, GPS location, Subject Location, Subject Distance,
EditorialUpdate, Number of Inks, Artist, Copyright, etc., while an amateur
photographer looking at personal snapshots may be more interested in the
particular details of a special occasion, rather than detailed information
about the
lifecycle of the photo as a managed data file.
3) Industries such as HVAC (Heating, Ventilation and Air Conditioning),
Insurance, Architecture, and Hospitals can use this feature to distribute a
list of
important / relevant metadata fields that should be reviewed, as an aid to the
management and cataloging of equipment. The Metadata Chart configuration file,
coupled with a specific Tag Vocabulary, and TagSets, all contribute to making
the tagging process support very specific user requirements. Certain industry
Tag Vocabularies may have global fields that the user may always want
displayed in the "General Information" rollup section. Example keywords might
be the UPC code, Equipment ID, Room #, Catalog Number, Retail Price or
Insurance Item ID.
The primary distinguishing features of this control are:
1) it can be managed by the user (added, removed, ordered, value modified,
properties);
2) the configuration can be saved and shared (.snmc files);
3) the contents can be from any compartment;
4) the contents can be specialized and packaged for various industries.
An example of the file format is listed below. Each entry contains the full
path of the namespace and the version the keyword belongs to. This allows the
namespace to be downloaded if it is not available locally when a user receives
a .snmc file.
Format of metadata chart configuration file (.snmc)
0
18
=1 /"http://ns.adobe.com/exif/1.0/":"/DateTimeOriginal"
=1 /"http://www.swapneat.net/xml/sn/stat.snty":"/mtime"
=1 /"http://www.swapneat.net/xml/sn/stat.snty":"/size"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageWidth"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageLength"
=1 /"http://ns.adobe.com/exif/1.0/":"/ISOSpeedRatings"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/ExposureTime"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/FNumber"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Model"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Orientation"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/bitrate"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/duration"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/frequency"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TIT2"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TALB"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TPE1"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TCOM"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TRCK"
17
=1 /"http://ns.adobe.com/exif/1.0/":"/DateTimeOriginal"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TPE1"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TIT2"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TALB"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TCOM"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TRCK"
=1 /"http://www.swapneat.net/xml/sn/stat.snty":"/size"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageWidth"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageLength"
=1 /"http://ns.adobe.com/exif/1.0/":"/ISOSpeedRatings"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/ExposureTime"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/FNumber"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Model"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Orientation"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/bitrate"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/duration"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/frequency"
18
=1 /"http://ns.adobe.com/exif/1.0/":"/DateTimeOriginal"
=1 /"http://www.swapneat.net/xml/sn/stat.snty":"/mtime"
=1 /"http://www.swapneat.net/xml/sn/stat.snty":"/size"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageWidth"
=1 /"http://ns.adobe.com/tiff/1.0/":"/ImageLength"
=1 /"http://ns.adobe.com/exif/1.0/":"/ISOSpeedRatings"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/ExposureTime"
=1 /"http://www.swapneat.net/xml/sn/exif_01_001.snty":"/FNumber"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Model"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Orientation"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/bitrate"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/duration"
=1 /"http://www.swapneat.net/xml/sn/mp3.snty":"/frequency"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TIT2"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TALB"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TPE1"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TCOM"
=1 /"http://www.swapneat.net/xml/sn/id3.snty":"/TRCK"
00000000002
=2 /"http://ns.microsoft.com/photo/1.0":"/Rating"
=3 /"http://ns.adobe.com/xap/1.0/":"/Rating"
-1
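As an illustration of how the entry lines in such a file might be read, the following sketch extracts the namespace and keyword path from each line of the form shown in the listing. It is not the application's parser: the names ChartField and parse_entries are hypothetical, the meaning of the number after "=" is not stated in the text, and bare numeric lines are simply skipped because their role is not specified.

```python
import re
from typing import List, NamedTuple


class ChartField(NamedTuple):
    number: int      # the digits after '='; their meaning is not stated in the text
    namespace: str   # full namespace URI of the keyword
    path: str        # keyword path within the namespace, e.g. "/Model"


# One entry line: =N /"namespace":"/path"
ENTRY = re.compile(r'^=(\d+)\s+/"([^"]+)":"(/[^"]+)"$')


def parse_entries(text: str) -> List[ChartField]:
    """Collect the entry lines of a .snmc listing, skipping all other lines."""
    fields = []
    for line in text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            number, namespace, path = match.groups()
            fields.append(ChartField(int(number), namespace, path))
    return fields


sample = '''3
=1 /"http://ns.adobe.com/exif/1.0/":"/DateTimeOriginal"
=1 /"http://ns.adobe.com/tiff/1.0/":"/Model"
=2 /"http://ns.microsoft.com/photo/1.0":"/Rating"
-1'''

fields = parse_entries(sample)
```

Keeping the namespace separate from the path is what would let a receiving application check whether that namespace is installed locally before displaying the field.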
Photo/Music/Video Tags Section
This rollup displays all SNMD keywords embedded in a file. For simplicity, only
the root node and the leaf node are displayed for each keyword. For instance, a
keyword such as "Places/Canada/Ontario/Toronto/CN Tower" would be shown
as "Places:CN Tower". If the mouse is hovered over the entry, a tooltip will
show the full path, "Places/Canada/Ontario/Toronto/CN Tower".
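The root-and-leaf abbreviation described above can be expressed in a few lines. The function name compact_label is hypothetical, and the handling of a single-segment keyword (shown unchanged) is an assumption.

```python
def compact_label(keyword_path: str) -> str:
    """Abbreviate a '/'-separated keyword path to 'root:leaf'."""
    parts = [part for part in keyword_path.split("/") if part]
    if not parts:
        return ""
    if len(parts) == 1:
        return parts[0]  # assumption: a lone root is shown as-is
    return f"{parts[0]}:{parts[-1]}"


label = compact_label("Places/Canada/Ontario/Toronto/CN Tower")
```

The full path remains available for the tooltip because the abbreviation is computed for display only and does not alter the stored keyword.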
A context menu is available on each item. Choosing "Delete" removes the tag
from the embedded metadata as though it were "whacked out". More information
about the keyword is also easy to reach: clicking the entry shows which tab in
the WhackTab the tag belongs to, or the detailed properties of the tag.
Furthermore, any entry can be selected as the source for a search.
Clicks in the Whack interface are reflected immediately in the Metadata Chart,
even though the file has not yet been written out, embedding newly-whacked
tags into the file.
This area of the interface is also a convenient place to strip any SNMD tags
from a file, using the "Delete All Tags" context menu item.
Metadata Tree Pane
[Please refer to Figure 175 in Appendix 1]
The Metadata Tree pane serves the purpose of displaying the structured
metadata exactly as it is embedded in the selected files. Contrast this with
the
display of metadata in the All Tags tree (which may have been arbitrarily
rearranged for compactness or to ease examination of the entire metadata
library) and that of the TagSet Tree (which does not show the tags that are
directly embedded in a file, but rather, is a subset of the All Tags tree that
includes only the tags relevant to the current set of data files being tagged).
In SwapNeat Metadata Studio, if in the Thumbstrip any of the tiles that
represent
files are selected for tagging, one of those files will be considered to "have
focus". The Focused file is the file that will be previewed in the Preview
Pane.
All metadata shown in the Metadata Tree are for the selected files (the
metadata
shown in the Metadata Chart is always that of the focused file, which is the
only
file displayed in the Preview pane).
The metadata in the focused file is indicated with multi-coloured tree nodes,
while
metadata embedded in the other selected files (and not having focus) is
indicated
with similar icon shapes, except all the icons are blue.
When using the Metadata Tree to examine the embedded metadata, what is
displayed is that which is actually embedded in the file, not altered in any
way by
other user-interface-specific settings, nor "collection" nodes that are merely
used
to group tags for organizational purposes. See "Collections" for more
information.
The icons on each node communicate a number of different properties. By
pressing buttons on the pane toolbar, different types of information are
exposed:
1. whether the tag is in a tag vocabulary that is owned by the currently
logged-in user
2. the metadata vocabulary "compartment" the tag is associated with
3. the datatype of the node (date, number, string, etc.)
Different colours for specific icon designs are used to signify the ownership
of the
tag, as well as whether or not the tag pertains to the currently focused file.
If only one file is selected, all the tags displayed in the Metadata tree are
embedded in the focused file. Tags that have a "Blue" node are members of tag
vocabularies owned by another user, while tags that are "green" are in tag
vocabularies administered (moderated, or "owned") by the currently logged-in
user.
When multiple files are selected, the above colours only apply to the
currently
focused file. All other nodes are indicated in "blue", meaning that they are
embedded in a selected file that is NOT the focused file.
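The colour rules in the preceding paragraphs can be summarised in a short function. This is a sketch under stated assumptions: the function name and parameters are hypothetical, and it covers only the "green" and "blue" cases named in the text.

```python
def node_colour(in_focused_file: bool, vocabulary_owner: str, current_user: str) -> str:
    """Pick the node colour for the ownership view of the Metadata Tree.

    Nodes outside the focused file are always blue; within the focused file,
    green marks vocabularies owned by the logged-in user and blue marks
    vocabularies owned by another user.
    """
    if not in_focused_file:
        return "blue"
    return "green" if vocabulary_owner == current_user else "blue"
```

Note that "blue" therefore carries two meanings when multiple files are selected, which is why the rule is evaluated for the focused file first.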
The Metadata Tree Pane has a toolbar providing direct access to the following
features:
= [Please refer to Figure 176 in Appendix 1 ] - Create a new tagset based on
the current metadata tree contents
= [Please refer to Figure 177 in Appendix 1] - Add the selected tag to the
current tagset
= [Please refer to Figure 88 in Appendix 1] - Perform a simple text find on
strings on nodes in the metadata tree
= [Please refer to Figure 89 in Appendix 1] - Find next occurrence
= [Please refer to Figure 178 in Appendix 1] - Locate this tag in the Whack
interface, if present
The following tree node icon indicators are identical to those described in
the coverage of the All Tags Tree Pane, except for the first, which indicates
for each item in the metadata tree the origin of the metadata state: whether
the tag was embedded in the file when it was loaded, or whether the user has
tagged it into the file and the file has not yet been written out, etc.
= [Please refer to Figure 179 in Appendix 1] - Indicate in the Metadata Tree
what the current Whack status is for all nodes
= [Please refer to Figure 93 in Appendix 1] - Indicate in the Metadata Tree
what the data compartment (tag vocabulary) is for all nodes
= [Please refer to Figure 91 in Appendix 1] - Indicate in the Metadata Tree
the ownership of all nodes
= [Please refer to Figure 94 in Appendix 1] - Indicate in the Metadata Tree
what the data type is for all nodes
Tree Node Icons
Where the All Tags Tree pane is designed to represent the overall tag library,
the
Metadata Tree Pane is designed to represent the metadata currently embedded
in the file(s) represented by tiles in the Thumbstrip.
There are common requirements with the All Tags tree w.r.t. rendering
information as icons on tree nodes: the significant difference is that for the
All Tags
tree, the user wants to know how the node will behave when used in the Whack
interface, but for the Metadata tree, which is a tree rendering of the
EMBEDDED
metadata, the user needs to know the current status of the node w.r.t. its
being
embedded in the current file, or another file in the active file set, whether
the user
has assigned it recently, and whether any changes to the metadata have been
written out to the files.
When the toolbar button fourth from the right is clicked, the node icons
change to
indicate the "Current Whack Status" of the node w.r.t. embedded metadata, as
follows:
= [Please refer to Figure 180 in Appendix 1] - the tag represented by the
node was pressed in the Whack interface, but the change to the
embedded metadata has not been written out yet.
= [Please refer to Figure 181 in Appendix 1 ] - the tag represented by the
node was embedded in the file when it was loaded for processing.
= [Please refer to Figure 182 in Appendix 1 ] - the tag represented by the
node was embedded in the file when it was loaded for processing, but the
user has subsequently "whacked it out", meaning that they want the tag to
not be embedded when the file is written out.
= [Please refer to Figure 183 in Appendix 1] - the tag represented by the
node is present in another file in the file set, either recently assigned or
previously embedded.
= [Please refer to Figure 184 in Appendix 1] - the tag represented by the
node was pressed in the Whack interface, but the user changed their mind
and unwhacked the button. This indicator usually appears on param
values for tags to which the user has made a change to the value to be
associated with the param.
= [Please refer to Figure 185 in Appendix 1] - the tag represented by the
node was embedded in the file when it was loaded for processing, but the
tag is associated with a version of its host tag vocabulary lesser than that
currently known to SwapNeat Metadata Studio, and the current version of
the tag vocabulary does not host the embedded tag.
The following icons indicate the same characteristics of the metadata state as
above, but do so for tags that are associated with a subregion of an image.
= [Please refer to Figure 186 in Appendix 1] - the tag represented by the
node was pressed in the Whack interface, but the change to the
embedded metadata has not been written out yet.
= [Please refer to Figure 187 in Appendix 1 ] - the tag represented by the
node was embedded in the file when it was loaded for processing.
= [Please refer to Figure 188 in Appendix 1] - the tag represented by the
node was embedded in the file when it was loaded for processing, but the
user has subsequently "whacked it out", meaning that they want the tag to
not be embedded when the file is written out.
= [Please refer to Figure 189 in Appendix 1 ] - the tag represented by the
node is present in another file in the file set, either recently assigned or
previously embedded.
= [Please refer to Figure 190 in Appendix 1 ] - the tag represented by the
node was pressed in the Whack interface, but the user changed their mind
and unwhacked the button. This indicator usually appears on param
values for tags to which the user has made a change to the value to be
associated with the param.
= [Please refer to Figure 191 in Appendix 1 ] - the tag represented by the
node was embedded in the file when it was loaded for processing, but the
tag is associated with a version of its host tag vocabulary lesser than that
currently known to SwapNeat Metadata Studio, and the current version of
the tag vocabulary does not host the embedded tag.
In the Metadata Tree, it is not important for the user to know that a node is
a
qualifier or from a module: it is only important to know what the whack state
is,
and whether or not the tag applies to the entire file, or just to a subregion
of the
file.
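The six "Current Whack Status" indicators, each with a whole-file and a subregion variant, could be derived from a handful of flags as sketched below. The flag names, the TagState structure, and the order in which the conditions are checked are assumptions made for illustration; the text defines only the meaning of each indicator.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class WhackStatus(Enum):
    PENDING = "pressed in the Whack interface, not yet written out"
    EMBEDDED = "embedded in the file when it was loaded"
    WHACKED_OUT = "embedded when loaded, then whacked out"
    OTHER_FILE = "present in another file in the file set"
    UNWHACKED = "pressed, then unwhacked"
    OBSOLETE = "embedded, but not hosted by the known vocabulary version"


@dataclass
class TagState:
    """Hypothetical per-node flags for the focused file."""
    embedded_on_load: bool
    pressed_now: bool                     # the tag's button is currently pressed
    pressed_earlier: bool = False         # pressed at some point in this session
    in_other_file: bool = False
    hosted_by_known_version: bool = True
    subregion: bool = False               # tag applies to a subregion of an image


def whack_status(state: TagState) -> Optional[WhackStatus]:
    """Classify a node; the precedence of the checks is an assumption."""
    if state.embedded_on_load:
        if not state.hosted_by_known_version:
            return WhackStatus.OBSOLETE
        return WhackStatus.EMBEDDED if state.pressed_now else WhackStatus.WHACKED_OUT
    if state.pressed_now:
        return WhackStatus.PENDING
    if state.pressed_earlier:
        return WhackStatus.UNWHACKED
    if state.in_other_file:
        return WhackStatus.OTHER_FILE
    return None


def icon_key(state: TagState) -> Optional[Tuple[WhackStatus, str]]:
    """Pair the status with the scope that selects the icon variant."""
    status = whack_status(state)
    if status is None:
        return None
    return status, ("subregion" if state.subregion else "whole file")
```

Treating the scope as a second key, rather than doubling the number of states, mirrors the way the two icon lists above repeat the same six meanings.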
TagSet Tree Pane
[Please refer to Figure 192 in Appendix 1]
There is a toolbar for creating and managing the contents and properties of
the
TagSet:
= [Please refer to Figure 193 in Appendix 1] - Create a new blank TagSet:
All TagSets are created as files on the file system: the tagset properties
pane will appear and prompt the user for file name and other TagSet
properties.
= [Please refer to Figure 194 in Appendix 1] - Load a TagSet file (.snts) from
the file system
= [Please refer to Figure 195 in Appendix 1] - Save the current TagSet
= [Please refer to Figure 196 in Appendix 1] - Save the current TagSet with
a new file name
= [Please refer to Figure 197 in Appendix 1 ] - Create a new tag in the
TagSet with a "button" presentation
= [Please refer to Figure 198 in Appendix 1 ] - Create a new tag in the
TagSet with a "tab" presentation
= [Please refer to Figure 199 in Appendix 1 ] - Create a new tag in the
TagSet with a "buttontab" presentation
= [Please refer to Figure 200 in Appendix 1 ] - Create a new collection tab:
by default, collections are buttontabs
= [Please refer to Figure 201 in Appendix 1 ] - Move the node and its
descendant nodes sooner in the Whack progression
= [Please refer to Figure 202 in Appendix 1] - Move the node and its
descendant nodes later in the Whack progression
= [Please refer to Figure 203 in Appendix 1] - Remove the node and its
descendant nodes from the TagSet (the tags are not deleted from the All
Tags tree; they are only deleted from the current TagSet)
[Please refer to Figure 204 in Appendix 1 ]
A TagSet is a configuration file describing the rendering of tags in the Whack
Interface.
The TagSet contains information about which tags from installed Tag
Vocabularies will be displayed in the Whack interface, and how they will be
rendered, i.e., as a button, buttontab, or tab. Additionally, the three tagging
modes, i.e., "Batch", "Selective" and "Single", are supported in TagSets by
allowing specific TAB objects in the Whack interface to be skipped depending
on the mode, and that information is also stored in the TagSet.
Icons on the tree nodes indicate the way the node is represented.
Right-clicking on a node displays a context menu from which the node's
rendering can be changed.
There are four types of icons.
= [Please refer to Figure 95 in Appendix 1 ] - Tab: tabs are always visited
(unless the "mode bypass" is enabled for that tab)
= [Please refer to Figure 96 in Appendix 1 ] - Buttontab: a button hosted on
another tab that represents another tab; the tab will only be displayed
= [Please refer to Figure 97 in Appendix 1] - button
= [Please refer to Figure 98 in Appendix 1 ] - bypassed
Each item can be either a full tag node, or a collection node (that is only
used for
organizational purposes): note the small "collection" indicator overlay
= [Please refer to Figure 99 in Appendix 1] - This node represents a tab that
is a collection
= [Please refer to Figure 100 in Appendix 1] - This node represents a
buttontab that is a collection
= [Please refer to Figure 101 in Appendix 1] - This node represents a button
that is a collection. Collection nodes, being nodes that do not themselves
represent XML tags that will be embedded, are of no practical value used
as buttons.
= [Please refer to Figure 102 in Appendix 1] - This collection node is
bypassed.
In the screenshot, observe that the "Spain" node is bypassed. In the Whack
interface the result is illustrated below:
The "Grenada" buttontab button appears on the "Places" tab, and there is no
"Spain" tab or button in the interface.
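The effect of bypassing the "Spain" node, where its child appears directly on the "Places" tab, can be modelled by promoting the children of bypassed nodes to the nearest ancestor that is not bypassed. The sketch below is one possible reading of that behaviour; the TagSetNode structure and the function visible_children are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TagSetNode:
    name: str
    presentation: str = "button"   # "tab", "buttontab" or "button"
    bypassed: bool = False
    children: List["TagSetNode"] = field(default_factory=list)


def visible_children(node: TagSetNode) -> List[TagSetNode]:
    """Children to render for a node, promoting children of bypassed nodes."""
    shown: List[TagSetNode] = []
    for child in node.children:
        if child.bypassed:
            shown.extend(visible_children(child))  # skip the node, keep its children
        else:
            shown.append(child)
    return shown


places = TagSetNode("Places", "tab", children=[
    TagSetNode("Spain", "tab", bypassed=True, children=[
        TagSetNode("Grenada", "buttontab"),
    ]),
])

rendered = [n.name for n in visible_children(places)]
```

With this model the "Places" tab renders only the "Grenada" buttontab, and no control is produced for "Spain".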
TagSet Chooser Pane
[Please refer to Figure 205 in Appendix 1 ]
The TagSet Chooser Pane facilitates the selection and management of TagSets.
Its primary purpose is to allow the user to quickly switch between TagSets
depending on the user's general understanding of the content of the files to
be tagged.
For example, if the user is tagging photos of their family, they may have
their personal TagSet loaded that includes the names of their family members.
After completing those files, they proceed to a new folder containing photos
taken at a recent sporting event: a major league baseball game.
The user can quickly use this pane to switch to the "MLB" TagSet, which
would load a completely new set of tabs and buttons related to the sport of
baseball in the Whack Interface.
The TagSet Pane is a multi tab control, where each tab holds TagSets of a
particular category. Some example categories are "Personal", "Hobbies",
"Music", "Printed Media", "Manufacturing", "Photography".
Only categories that contain TagSets are visible.
The first tab is always the "All" tab, which contains every TagSet that is
available
to the user, in alphabetical order.
The TagSets themselves appear as buttons, just like the Whack Interface. More
than one TagSet can be selected at any given time by holding down the "Shift"
or
"Ctrl" key while clicking. Multi selected TagSets allow the user to
superimpose
TagSets on top of each other: this allows the user to craft "tagset modules"
that,
when combined, provide useful Whack interfaces, without having to store each
individual combination of tags as separate tagset files.
For example, the user might CTRL-Click their personal tag vocabulary AND the
"MLB" tag vocabulary to tag photos of a family outing to a baseball game:
photos may contain members of the sports team, or specific activities related
to baseball, or the stadium, as well as members of the user's family taking
part in activities that are unique to attendance at a major league baseball
game.
A new tagset file can be saved, and at any time, additional tagset buttons can
be
pressed, adding those tags to the user interface.
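The superimposition of TagSets described above can be pictured as a merge of
tag trees. The following is a minimal sketch only, assuming each TagSet is held
as a nested dictionary that maps a tag name to its child tags. The function
names `superimpose` and `_merge_into`, and the sample vocabularies, are
hypothetical and are not taken from the described system.

```python
from copy import deepcopy


def superimpose(*tagsets):
    """Merge several tag trees into one composite tree.

    Each tagset is assumed (for illustration only) to be a nested dict
    mapping a tag name to a dict of its child tags.
    """
    composite = {}
    for tagset in tagsets:
        _merge_into(composite, tagset)
    return composite


def _merge_into(target, source):
    # Tags present in both trees are merged recursively, so a shared
    # category such as "People" receives the children of both tagsets.
    for name, children in source.items():
        if name in target:
            _merge_into(target[name], children)
        else:
            target[name] = deepcopy(children)


# Hypothetical sample vocabularies.
personal = {"People": {"Family": {"Alice": {}, "Bob": {}}}}
mlb = {"People": {"Players": {}}, "Places": {"Stadium": {}}}

composite = superimpose(personal, mlb)
print(sorted(composite))            # top-level nodes of the composite
print(sorted(composite["People"]))  # children contributed by both tagsets
```

Because the composite is built in memory from its modules, no separate tagset
file is needed for each combination, which is consistent with the "tagset
modules" idea above.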
= [Please refer to Figure 193 in Appendix 1 ] - Create a new blank TagSet to
which tags can be added (from the All Tags tree pane)
= [Please refer to Figure 206 in Appendix 1] - Import a TagSet from a file.
= [Please refer to Figure 207 in Appendix 1] - Save current TagSet: disabled
if current tagset is a composite tagset
= [Please refer to Figure 208 in Appendix 1] - Save current TagSet under a
new file name.
= [Please refer to Figure 209 in Appendix 1 ] - Edit TagSet properties
= [Please refer to Figure 210 in Appendix 1] - Publish TagSet to the
SwapNeat Community Website
= [Please refer to Figure 211 in Appendix 1] - Search for TagSets
Whack Interface Pane
[Please refer to Figure 212 in Appendix 1]
The Whack Interface is the primary control for creating, assigning, and
reviewing
keyword tags. It is an ingenious and unique method for rendering and creating
hierarchical structured tags.
The Whack interface appears as a separate pane, but this functionality has
applications for metadata generally, whether structured or unstructured, and
can
be integrated in the user interfaces of various digital devices for immediate
point-
and-click application of metadata at the time the file is created: for example
a
digital camera with integrated whack interface could be operated using either
a
touch-sensitive screen or a four-way "joystick" type of control common to
digital
cameras: moving the 'joystick' around would allow the user to select a button:
clicking the "Enter" button on the camera would apply the tag.
In a computer application, touch screens could also be used, as could a mouse:
in the user interfaces described below, the use of a mouse is assumed,
although
any means that allows the user to indicate a specific point on the screen
could be
used, including touch-screen or light-sensitive controller that can be pointed
directly at the screen.
A toolbar provides access to the basic navigational and management features of
the Whack interface.
= [Please refer to Figure 197 in Appendix 1] - create a new button on the
foregrounded tab: disabled if there is no tagset loaded
= [Please refer to Figure 198 in Appendix 1 ] - create a new tab in the Whack
interface. This will be created as a top-level node in the currently active
tag vocabulary. Effectively creates a new unnamed TagSet if no TagSet is
open.
= [Please refer to Figure 199 in Appendix 1] - create a new buttontab button
on the foregrounded tab: disabled if there is no tagset loaded.
= [Please refer to Figure 200 in Appendix 1] - create a new collection node.
If there is no tagset currently loaded, the node will be created as a top
level node in the currently active tag vocabulary, and rendered as a tab in
the Whack interface. If there is a TagSet loaded, the node will be created
as a child node of the node represented by the foregrounded tab, and
rendered in the Whack interface as a buttontab button on the
foregrounded tab. Effectively creates a new unnamed TagSet if no TagSet
is open.
= [Please refer to Figure 7 in Appendix 1] - Changes mouse click on buttons
from a simple "whack" to a "pin" operation, which locks the button down
until the button is unpinned: pinned tags are applied to all subsequent files
until the button is unpinned.
= [Please refer to Figure 213 in Appendix 1] - Applies all the tags that were
applied to the previous image to the current image, and either jumps to the
Summary Tab (if the summary tab is optionally present in the TagSet) or
leaves the current foregrounded tab foregrounded, allowing the user to
either click on individual Whack interface tabs to examine the result /
augment / change the result, or they can click the "Done" button to apply
the tags.
= [Please refer to Figure 214 in Appendix 1] - Move to the first tab in the
Whack progression.
= [Please refer to Figure 215 in Appendix 1 ] - Move to the previously visited
tab in the Whack progression.
= [Please refer to Figure 216 in Appendix 1 ] - Move to the next tab that will
be visited in the Whack progression.
= [Please refer to Figure 217 in Appendix 1 ] - Move to the Summary tab, if
visible in the current TagSet. This button is disabled if the Summary tab is
not present in the TagSet.
= [Please refer to Figure 218 in Appendix 1 ] - apply all changes to the
selected thumbstrip files.
Tabs, ButtonTabs and Buttons
The technique for rendering the Whack interface is essentially a translation
of a
hierarchical tree into a set of Tabs and Buttons. Instead of a direct one-to-
one
relationship, additional features of the Whack interface support the reality
that in
a given dataset, not all categories of tags are applicable to every file so,
for
efficiency, various mechanisms are available to streamline the process while
still
making the rarely-used tags available conveniently. Contrast this with
conventional tagging tools that present the entire metadata vocabulary as a
whole, and for every file the entirety of the available metadata must be
navigated
to find appropriate tags.
Consider also that a node in a hierarchical metadata collection can be a
"category", a "keyword", or both. "People" is a category of keywords, but
"People" can itself be used as a keyword to identify that the main subject(s)
in a photograph are people.
Sometimes it is required to apply keywords very generally. In those cases, a
category may be sufficient to use as a keyword. Other times, it is important
to be
very specific when describing the data in a file.
The Whack interface provides the means to configure an efficient tagging
interface for either application and all points in between those two extremes.
Consider the case where very general categories are required: photos where it
is only necessary to identify whether the photo contains "people", "plants", or
"machines".
Consider that "People" is itself a top-level category, but "Plants" and
"Machines" are subcategories of "Objects".
The optimal rendering of a Whack interface for the required task is a single
tab, with three buttons, one each for "People", "Plants" and "Machines".
This is easily achieved in the Whack interface.
Every tag vocabulary has a single root node. This root node can participate in
the Whack interface ONLY as a tab: it is not a keyword itself. For this
example, the "People" category will be rendered on that tab as a button, even
though it contains descendant hierarchical metadata that is not required for
this example.
Likewise, Objects is a category, but not needed explicitly: the requirement
does
NOT include having to tag photos as ONLY containing "Objects" generally: the
requirement is to specify whether there are particular types of objects:
plants or
machines.
So, to avoid the unnecessary presence of "Objects" in the Whack interface,
instead of rendering the "Objects" category as a tab on which the "Plants" and
"Machines" buttons would be rendered, we will 'bypass' the "Objects" node. This
will cause the "Plants" and "Machines" buttons to be rendered on the "Root Tab"
alongside the "People" button, as two additional buttons.
The following screenshot shows the hierarchical nature of the tag vocabulary
in
this example:
SCREENSHOT OF ALL TAGS TREE VIEW OF TAG VOCABULARY
CONTAINING ONLY ROOT NODE, TWO TOP LEVEL NODES: PEOPLE and
OBJECTS: under PEOPLE, a smattering of Categories for FAMILY, FRIENDS,
ETC. UNDER OBJECTS, PLANTS and MACHINES. A FEW TYPES OF NODES
UNDER EACH.
To add a node to the Whack interface observe the icon adjacent to the node.
The
icon indicates the default behaviour when that node is added to the Tag Set
tree
(and in turn, the Whack interface). Observe in the Tag Set tree the result of
clicking on the "People" node. The default Whack interface type for the
"People"
node is a "Tab", and the default Whack interface type for the tag vocabulary
root
node is "Bypassed" (not present in the Whack interface).
Right-click on the Root node and point at the "Change To >" flyout menu item
on
the context menu. Choose "Change to Tab" from the flyout menu, and the Root
node will appear in the Whack interface as a tab.
To make the People node appear on the Root tab, right-click the People node
and follow the same procedure, but instead of choosing "Change to Tab" (it
already is a tab) choose "Change to Button". Now, the People node icon in the
TagSet tree will change from that of a tab to that of a button, and a button
will appear on the Root Tab in the Whack interface.
For both "Plants" and "Machines" under Objects, we do not want their parent
node (Objects) to appear in the Whack interface at all.
In the All Tags tree, check each of the "Plants" and "Machines" nodes, and
their common ancestor, Objects, will also be checked. Since the default
behaviour for the Objects node is to be a tab, and the default behaviour for
the Plants and Machines nodes is to be buttontabs, those properties must be
changed.
First, right-click the Objects node in the TagSet tree to Bypass the objects
node.
Bypassing causes descendant nodes to 'gather up' onto the nearest common
ancestor that is present in the Whack interface: in this case, the Root Tab.
But the Plants and Machines nodes are still present in the Whack Interface as
"buttontabs", rather than just "Buttons". Repeat the right-click process to
change
these two buttontab nodes into buttons.
Now the Whack interface consists of a single tab, with three buttons. Now, the
user need only glance at the photo, and click one of the three buttons to
accomplish their task.
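The bypass behaviour in this walkthrough can be sketched as a tree traversal.
The sketch below is illustrative only and assumes a simple in-memory tree. The
names `Mode`, `Node` and `render` are hypothetical, and the "Family" child of
"People" is an invented sample node. The sketch covers only the static layout,
so a buttontab is treated like a plain button.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    TAB = "tab"
    BUTTON = "button"
    BUTTONTAB = "buttontab"
    BYPASSED = "bypassed"


@dataclass
class Node:
    name: str
    mode: Mode
    children: list = field(default_factory=list)


def render(root):
    """Return {tab name: [button labels]} for the statically visible tabs."""
    tabs = {}

    def visit(node, current_tab):
        if node.mode is Mode.TAB:
            tabs.setdefault(node.name, [])
            current_tab = node.name
        elif node.mode is Mode.BYPASSED:
            # Not rendered: children gather up onto the nearest ancestor tab.
            pass
        else:
            # BUTTON or BUTTONTAB: shown as a button on the nearest tab.
            if current_tab is not None:
                tabs[current_tab].append(node.name)
            # Descendants are not shown in this static view.
            return
        for child in node.children:
            visit(child, current_tab)

    visit(root, None)
    return tabs


# The example vocabulary from the text, after the "Change To" steps.
vocabulary = Node("Root", Mode.TAB, [
    Node("People", Mode.BUTTON, [Node("Family", Mode.BUTTON)]),
    Node("Objects", Mode.BYPASSED, [
        Node("Plants", Mode.BUTTON),
        Node("Machines", Mode.BUTTON),
    ]),
])

layout = render(vocabulary)
print(layout)
```

With the modes set as in the walkthrough, the layout contains a single "Root"
tab carrying the "People", "Plants" and "Machines" buttons, and "Objects" does
not appear.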
For every file that is presented to the user, they can click the appropriate
category. As soon as they click a button (without holding the CTRL key while
clicking) the next photo automatically loads, and the focus of the file in the
thumbstrip (if visible) changes.
Should more than one of the tags represented by these buttons be applicable to
a single photo, the user can hold the CTRL key before clicking the first
button, and multiple tags will be applied, but only when they release the CTRL
key. Until the CTRL key is released, the user can un-click any buttons clicked
in error. When they release the CTRL key, whichever buttons are depressed will
be tagged into the file, and the next file will automatically load.
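The difference between a single whack and a CTRL multi-select can be modelled
as a small state machine. This is a sketch under stated assumptions, not the
described implementation: the class `WhackSession`, its methods, and the file
names are hypothetical, and "loading the next file" is reduced to advancing an
index.

```python
class WhackSession:
    """Toy model of single-click versus CTRL multi-select whacking."""

    def __init__(self, files):
        self.files = list(files)
        self.index = 0           # position of the focused file
        self.pending = set()     # buttons currently depressed
        self.applied = {}        # file name -> tags written to that file

    def click(self, tag, ctrl_held=False):
        if ctrl_held:
            # Toggle: a second click un-clicks a button pressed in error.
            self.pending ^= {tag}
        else:
            # A plain click applies the tag at once and moves on.
            self.pending = {tag}
            self._commit()

    def release_ctrl(self):
        # Depressed buttons are written only when CTRL is released.
        if self.pending:
            self._commit()

    def _commit(self):
        if self.index < len(self.files):
            current = self.files[self.index]
            self.applied.setdefault(current, set()).update(self.pending)
            self.index += 1      # the next file loads automatically
        self.pending = set()


session = WhackSession(["a.jpg", "b.jpg"])
session.click("People")                     # single whack on a.jpg
session.click("Plants", ctrl_held=True)     # multi-select on b.jpg
session.click("Machines", ctrl_held=True)
session.click("Machines", ctrl_held=True)   # clicked in error, so un-clicked
session.click("People", ctrl_held=True)
session.release_ctrl()
print(session.applied)
```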
An important thing to note is that the embedded metadata will still retain its
full hierarchy. Files tagged by pressing the 'Plants' or 'Machines' button
would STILL have metadata embedded as "Objects/Plants" or "Objects/Machines".
In each case, TWO 'keywords' are embedded into the photo with a single
button-click.
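As an illustration of how a flattened button can still yield the full
hierarchy, the sketch below walks from a tag up to its top-level category. It
assumes a child-to-parent lookup table. The table `PARENT` and the functions
`hierarchy_path` and `embedded_value` are hypothetical names, and the "/"
separator simply mirrors the notation used in the text rather than any
particular file format.

```python
# Hypothetical child -> parent map for the example vocabulary. The root node
# is left out because it is not itself a keyword.
PARENT = {
    "People": None,
    "Objects": None,
    "Plants": "Objects",
    "Machines": "Objects",
}


def hierarchy_path(tag, parent=PARENT):
    """Return the keywords from the top-level category down to `tag`."""
    chain = []
    while tag is not None:
        chain.append(tag)
        tag = parent[tag]
    return list(reversed(chain))


def embedded_value(tag, parent=PARENT):
    """Join the hierarchy path with "/" as in "Objects/Plants"."""
    return "/".join(hierarchy_path(tag, parent))


print(hierarchy_path("Plants"))
print(embedded_value("Machines"))
```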
Much more diversity and complexity is allowed by a hybrid concept between a
"tab" and a "button" called "buttontabs".
A buttontab is a combination of a button and a tab: the tab only appears when
its associated button is pressed.
Suppose that, in the above scenario, when the "People" button is pressed it
becomes important to identify the occupation of the person represented in the
photo (construction worker, police officer, firefighter, doctor, etc.).
The "People" node instead of being a simple "button" would be configured to be
rendered as a "buttontab". A button labelled "People" would still appear on
the
"Root Tab", but, when the People button was pressed, an additional Tab would
appear, having one button each for the occupations represented by the people
in
the photo.
In this case, where the photo does include a representation of a person, the
user
need not see all the keywords representing occupations unless they first
indicate
that the photo is that of a person. Depending on the number of tags present in
the Whack interface, the user may elect to have the 'People' tab rendered 'on
standby', but only visited when the "People" button is pressed.
This is discussed further in the next section.
Whack Interface Progression
For this example, the tabs present in the Whack interface are "Places",
"Events",
"Objects", "People", in that order.
When a new photo is selected from the Thumbstrip, the Whack Interface is
initialized, bringing the leftmost tab in the tab control list to the
foreground, which
in our example it would be "Places".
All direct nodes below "Places" are shown as buttons on the "Places" tab.
Examples are "Canada", "US", "China" and "Africa".
Each of these nodes has been configured to be a "ButtonTab". If the user
selects a button, such as "Canada", a "Canada" tab will be inserted
immediately after the current tab (the "Places" tab) and before the next tab
(the "Events" tab).
The "Canada" tab is then foregrounded, and its child nodes are displayed as
buttons on the tab. Suppose the "Canada" node had direct child nodes "Toronto"
and "Vancouver". There would be buttons on the "Canada" tab labelled "Toronto"
and "Vancouver". Should the user click the "Toronto" button, that would assign
the "Toronto" keyword to the file, and automatically foreground the next tab,
"Events". By automatically progressing to the next tab after a button has been
clicked, the user progresses through all the tag categories with few mouse
moves or clicks. This is more efficient than having the user manually choose
the next category from which to choose the next tag, and more efficient
because only a subset of the overall tag hierarchy is presented to the user.
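The progression just described can be sketched as a list of tabs with a
foreground index. This is a simplified model under stated assumptions: the
class `Progression` and its attributes are hypothetical, pressing a buttontab
button is assumed not to record a keyword by itself, and skip rules and the
Summary tab are ignored.

```python
class Progression:
    """Toy model of buttontab insertion and automatic tab advance."""

    def __init__(self, tabs, buttontab_children):
        self.tabs = list(tabs)               # left-to-right tab order
        self.children = buttontab_children   # buttontab name -> child labels
        self.current = 0                     # leftmost tab starts foregrounded
        self.assigned = []                   # keywords assigned so far

    @property
    def foreground(self):
        return self.tabs[self.current]

    def click(self, label):
        if label in self.children:
            # Buttontab: insert its tab right after the current tab and
            # bring that tab to the foreground.
            if label not in self.tabs:
                self.tabs.insert(self.current + 1, label)
            self.current = self.tabs.index(label)
        else:
            # Plain button: assign the keyword, then advance one tab.
            self.assigned.append(label)
            if self.current < len(self.tabs) - 1:
                self.current += 1


progression = Progression(
    ["Places", "Events", "Objects", "People"],
    {"Canada": ["Toronto", "Vancouver"]},
)
progression.click("Canada")
after_canada = (list(progression.tabs), progression.foreground)
progression.click("Toronto")
print(after_canada, progression.foreground, progression.assigned)
```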
With only clicks of the mouse in the tab-area of the Whack interface,
rendering a
deeply hierarchical tag vocabulary 'flattened' for efficiency, the user can
tag a
photo with very specific details about the "Places", "Event", "Actions",
"Objects",
and "People" depicted in a photograph, and quickly apply a "Rating" as a
subjective quality metric in mere seconds per file.
The more tags a file contains the more thorough the description, and therefore
the more unique the file becomes relative to other files. This promotes and
facilitates refined searches which allows the user to find exactly what they
are
looking for.
For tags where there is a continuum of values from which to choose, buttons
can
be sensitive to where on the button face the click event occurs: closer to the
left
may be a lower value: closer to the right, a higher value. This is useful for
tagging
things like a rating or temperature or similar "quantity" tags. Additionally,
the
button face could change colour depending on how far to the left or right the
mouse cursor is positioned. Also, there could be "snap" points for when the
mouse moves around over the button: the mouse would jump between set points
rather than move smoothly across the face of the button. This method is
especially useful when only a coarse approximation of the value is needed.
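A position-sensitive button of this kind amounts to a linear mapping from the
horizontal click offset to a value range, optionally followed by snapping. The
sketch below assumes pixel offsets measured from the left edge of the button.
The function `value_from_click` and its parameters are hypothetical names
chosen for illustration.

```python
def value_from_click(x, width, low, high, snap_points=None):
    """Map a click at offset `x` on a button `width` wide to a value.

    The left edge maps to `low` and the right edge to `high`. If
    `snap_points` is given, the result jumps to the nearest set point.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    fraction = min(max(x / width, 0.0), 1.0)   # clamp to the button face
    value = low + fraction * (high - low)
    if snap_points:
        value = min(snap_points, key=lambda point: abs(point - value))
    return value


middle = value_from_click(50, 100, 0, 10)
rating = value_from_click(62, 100, 1, 5, snap_points=[1, 2, 3, 4, 5])
print(middle, rating)
```

Snapping to a short list of set points suits a coarse tag such as a one to
five rating, while omitting `snap_points` gives a continuous value such as a
temperature.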
Dragging and Dropping Tabs and Buttons
[Please refer to Figure 219 in Appendix 1]
[Please refer to Figure 220 in Appendix 1]
By default the tabs in the Whack interface are organized as they are ordered
vertically in the All Tags tree. On each tab, the buttons are sorted
alphabetically.
However, if the user prefers to have control over the order of tabs and
buttons,
the Whack Interface supports dragging and dropping of tabs and buttons to
reorder them.
The user may wish to put the most commonly used tabs at the beginning,
allowing them to skip the less-often used ones, and move on to the next file,
for increased efficiency. Likewise, the most commonly used buttons may be put
at the top of the tab, or buttons that are conceptually similar may be
positioned together. Also, once the user has become accustomed to a particular
Whack interface, keeping the buttons in a particular place makes the use of
those buttons more reflexive and faster.
When dragging a Tab, the insertion point is indicated by a blue vertical bar
between the tabs where the dropped-tab would be positioned. In the screenshot
below, note the position of the "Drag" cursor (the grabbing hand) and the blue
vertical indicator.
When dragging a button, arrow indicators show where the button is going to be
inserted. In the screenshot below, note the outline of the button, the "Drag"
cursor (the grabbing hand) and the two small arrows at the right and left
edges (respectively) of the buttons to the left and to the right of the
button's new position.
The button and tab orders are saved and rendered properly on application
restart.
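Reordering by drag and drop reduces to removing an item from an ordered list
and reinserting it at the indicated insertion point. The following sketch
assumes the order is kept as a plain list of labels. The function `move_item`
is a hypothetical name, and how the resulting order is persisted between
sessions is not modelled here.

```python
def move_item(order, item, insert_before):
    """Return a new order with `item` dropped in front of `insert_before`.

    Passing insert_before=None drops the item at the end of the order.
    """
    if item not in order:
        raise ValueError(f"{item!r} is not in the current order")
    if insert_before == item:
        return list(order)                  # dropped onto itself: no change
    remaining = [name for name in order if name != item]
    if insert_before is None:
        return remaining + [item]
    position = remaining.index(insert_before)
    return remaining[:position] + [item] + remaining[position:]


tabs = ["Places", "Events", "Objects", "People"]
people_first = move_item(tabs, "People", "Places")
places_last = move_item(tabs, "Places", None)
print(people_first, places_last)
```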
Before saving out the new location of buttons, a popup dialog is presented to
confirm the action. If the action is ambiguous, the dialog presents the various
options that the dropping of the button or tab can mean, e.g. 1) make an
equivalence if the dropped button already exists in the tab it is dropped in.
Whack Interface Navigation
The Whack Interface can be fully navigated using the mouse or the keyboard.
Tags can be selected by using the keyboard to spell the first few characters
of
the tag. If the typed characters match the first characters of a button, the
button
is automatically focused, such that, striking the "Enter" key would select the
button. If the typed characters do not match any buttons on the current tab,
creation of a new button is assumed. The up, down, left and right arrow keys
can
also be used to navigate tabs and buttons. Striking "Enter" on the focused
button
or tab is equivalent to clicking the button control with the mouse.
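The type-to-select behaviour can be sketched as a prefix match over the button
labels on the current tab. This sketch makes two assumptions that the text
does not state: matching is case-insensitive, and the first matching button in
display order is the one that receives focus. The function `match_typed` is a
hypothetical name.

```python
def match_typed(buttons, typed):
    """Decide what typed characters mean on the current tab.

    Returns ("focus", label) for the first button whose label starts with
    `typed`, or ("create", typed) when no button on the tab matches.
    """
    prefix = typed.casefold()
    for label in buttons:
        if label.casefold().startswith(prefix):
            return ("focus", label)
    return ("create", typed)


places = ["Africa", "Canada", "China", "US"]
print(match_typed(places, "ch"))
print(match_typed(places, "Brazil"))
```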
Whack Interface Icon Indicators
A variety of icons are used in the Whack interface to guide the user's actions,
and help them anticipate the result of a button click or automatic tab
progression.
Tabs, ButtonTab buttons and ButtonTab tabs, and Buttons, all have a different
set of possible icons.
Tabs
A single icon can be displayed on each tab.
A "gold star" [Please refer to Figure 221 in Appendix 1 ] is used only on
'real' Tab
items, as opposed to a ButtonTab Tab. This icon indicates that the tab WILL
always be visited in the current tagging mode, and that the node represented
by
this tab was NOT already present in the focused file, nor recently whacked.
A Tab having NO icon indicates both that the node represented by the tab was
not present in the focused file when it was loaded, and that it will be
skipped in
the tab progression.
If a button is whacked on a tab, the Tab icon will change to reflect the state
of
one button on that tab, in the following order:
= [Please refer to Figure 180 in Appendix 1] - indicates that one or more
buttons on this tab has been whacked and not yet embedded into the
focused file.
= [Please refer to Figure 182 in Appendix 1] - indicates that one or more
nodes represented by buttons on this tab was present in the file when the
file was loaded, and none of the above conditions are also met.
= [Please refer to Figure 185 in Appendix 1] - indicates that one or more
nodes represented by buttons on this tab was embedded in the file when it
was loaded for processing, but the tag is associated with a version of its
host tag vocabulary lesser than that currently known to SwapNeat
Metadata Studio, and the current version of the tag vocabulary does not
host the embedded tag.
= [Please refer to Figure 181 in Appendix 1] - indicates that one or more
buttons on this tab was present in the file when the file was loaded, and
has subsequently been "whacked out", and that change has not yet been
saved in the file... and the above condition is not met.
= [Please refer to Figure 183 in Appendix 1] - indicates that one or more
nodes represented by buttons on this tab was pressed in the Whack
interface, but the user changed their mind and unwhacked the button. This
indicator usually appears on param values for tags to which the user has
made a change to the value to be associated with the param.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 184 in Appendix 1] - indicates that this tab
contains
a button that was used (whacked) into a recent file. This is informative so
the user knows, should they be tagging files with similar characteristics,
that this tab might deserve attention.
Specifically related to a feature relevant to Photos known as "subregion
whacking" (possibly relevant to other file types also), icons are used to
indicate if a node represented by a button on the tab has been used on a
subregion:
= [Please refer to Figure 186 in Appendix 1] - indicates that one or more
buttons on this tab has been whacked onto a subregion of the photo and
not yet embedded into the focused file.
= [Please refer to Figure 188 in Appendix 1] - indicates that one or more
nodes represented by buttons on this tab was present in the file and
relevant to a subregion of the photo when the file was loaded, and none of
the above conditions are also met.
= [Please refer to Figure 191 in Appendix 1] - indicates that one or more
buttons on this tab was embedded in the file when it was loaded for
processing, but the tag is associated with a version of its host tag
vocabulary lesser than that currently known to SwapNeat Metadata
Studio, and the current version of the tag vocabulary does not host the
embedded tag.
= [Please refer to Figure 187 in Appendix 1 ] - indicates that one or more
buttons on this tab was present and relevant to a subregion of the photo,
when the file was loaded, and has subsequently been "whacked out", and
that change has not yet been saved in the file... and the above condition is
not met.
= [Please refer to Figure 189 in Appendix 1 ] - indicates that one or more
buttons on this tab was pressed in the Whack interface, but the user
changed their mind and unwhacked the button. This indicator usually
appears on param values for tags to which the user has made a change to
the value to be associated with the param.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 190 in Appendix 1] - indicates that this tab
contains
a button that was used (whacked) into a subregion of a recent photo. This
is informative so the user knows, should they be tagging files with similar
characteristics, that this tab might deserve attention.
While tagging, the user may need temporary access to a bypassed node.
Bypassed nodes have descendants (if a node has no descendants, it is either
present in or absent from the TagSet; it is not "bypassed"). When a node is
temporarily unbypassed, it will be rendered as a Tab in the Whack interface,
and the Tab will have the following icon: [Please refer to Figure 222 in
Appendix 1]
ButtonTabs
A tab that is designated as a ButtonTab will also have NO icon on the tab
unless
its associated button is pressed, or the tag represented by the Tab is already
present in the focused file.
= [Please refer to Figure 223 in Appendix 1] - indicates that this tab is a
buttontab, and one or more buttons on this tab has been whacked into the
file and is not yet embedded.
= [Please refer to Figure 224 in Appendix 1] - indicates that this tab is a
buttontab, and one or more nodes represented by buttons on this tab was
present in the file when the file was loaded, and none of the above
conditions are also met.
= [Please refer to Figure 225 in Appendix 1] - indicates that this tab is a
buttontab, and one or more of the nodes represented by buttons on this
tab was present in the file when the file was loaded, and has subsequently
been "whacked out", and that change has not yet been saved in the file...
and the above condition is not met.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 226 in Appendix 1] - indicates that this tab is a
buttontab, and one or more of the buttons on this tab was used (whacked)
into a recently tagged file. This is informative so the user knows, should
they be tagging files with similar characteristics, that this tab might
deserve attention.
No icon indicates the tab IS connected to a button and will NOT be visited
unless that tab's associated button is pressed on a tab prior in the
progression. When such a button is pressed, the icons on both the button and
the associated tab change to indicate both that the tab WILL be visited, and
that the visit is the result of a button being pressed.
If the tab hosts a button that represents a keyword already embedded in the
focused file when it was loaded, the tab label will have an orange-coloured
icon
with concentric rings, like a target. The button on that tab that represents
the
keyword will also have that icon. Similarly, if a tab hosts a button that was
pressed after the focused file was loaded, it will have a green icon of
similar
design. Optionally, the user may choose to be reminded of tags that have been
used recently. If that option is selected, a tab hosting a button that was
used
recently will have a similar icon that is blue in colour.
If a button represents a keyword that was in the file when the file was loaded
(the
icon would be an orange target), and the user wants to delete that keyword
from
the file, the user can click that button, and the orange target will be
replaced with
a red "X" (the standard symbol for "delete"). If the user should change their
mind,
and reconsider the deletion, they can click the button again, and it will
become an
"orange target" icon.
If the user clicks a button that has a blue target icon (indicating a keyword
used
recently in another file) that icon will become a green target (indicating
that the
button has been pressed since the file was loaded and was not already present
in the file when it was loaded). Should they reconsider, clicking the same
button
again would return it to the "blue target" icon.
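The icon changes described in the last two paragraphs form a small toggle
cycle. The sketch below is an illustrative model only. The names `Icon` and
`TagButton` are hypothetical, and the behaviour for a button that starts with
no icon (it turns green when clicked and reverts when clicked again) is an
assumption extrapolated from the blue target case.

```python
from enum import Enum


class Icon(Enum):
    NONE = "none"
    ORANGE_TARGET = "orange target"   # keyword was in the file when loaded
    RED_X = "red X"                   # keyword marked for deletion
    BLUE_TARGET = "blue target"       # keyword used recently in another file
    GREEN_TARGET = "green target"     # keyword pressed since the file loaded


class TagButton:
    def __init__(self, label, initial=Icon.NONE):
        self.label = label
        self.initial = initial        # icon shown when the file was loaded
        self.icon = initial

    def click(self):
        if self.icon is Icon.ORANGE_TARGET:
            # Clicking an embedded keyword marks it for deletion.
            self.icon = Icon.RED_X
        elif self.icon in (Icon.RED_X, Icon.GREEN_TARGET):
            # A second click undoes the change and restores the loaded state.
            self.icon = self.initial
        else:
            # Blue target or no icon: the keyword is newly pressed.
            self.icon = Icon.GREEN_TARGET
        return self.icon


embedded = TagButton("Toronto", Icon.ORANGE_TARGET)
recent = TagButton("Vancouver", Icon.BLUE_TARGET)
embedded_cycle = [embedded.click(), embedded.click()]
recent_cycle = [recent.click(), recent.click()]
print(embedded_cycle, recent_cycle)
```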
In this way, the user can not only add tags to the file, but see which tags
were
present in the file when it was loaded. Note that the "TagBag" feature also
serves
this purpose to some degree, and will be explained later.
Buttons
There are three special 'navigation' buttons that appear on every Whack
interface
tab (based on checkboxes being activated in the Whack options tab)
= [Please refer to Figure 227 in Appendix 1] - apply on this tab the same
tags as were used in the previous file
= [Please refer to Figure 228 in Appendix 1] - Skip this tab and follow normal
Whack progression rules
= [Please refer to Figure 229 in Appendix 1 ] - Skip this tab in this tagging
mode. Note the text on the button will reflect the mode. This button only
affects full "Tab" nodes, not buttontabs. The button is 'depressed' if the tab
will be skipped. To deactivate "This Mode Skip", navigate to the tab using
the Windows standard Tab navigation controls, and "whack out" this
button.
The rest of the buttons reflect each tag's status with respect to embedded
metadata.
= [Please refer to Figure 180 in Appendix 1] - indicates that this button has
been whacked and not yet embedded into the focused file.
= [Please refer to Figure 182 in Appendix 1 ] - indicates that the tag
represented by this button was present in the file when the file was
loaded, and has subsequently been "whacked out", and that change has
not yet been saved in the file... and the above condition is not met.
= [Please refer to Figure 181 in Appendix 1] - indicates that the tag
represented by this button was present in the file when the file was
loaded, and none of the above conditions are also met.
= [Please refer to Figure 184 in Appendix 1 ] - indicates that the tag
represented by this button was pressed in the Whack interface, but the
user changed their mind and unwhacked the button. This indicator usually
appears on param values for tags to which the user has made a change to
the value to be associated with the param.
= [Please refer to Figure 185 in Appendix 1 ] - indicates that the tag
represented by this button was embedded in the file when it was loaded
for processing, but the tag is associated with a version of its host tag
vocabulary lesser than that currently known to SwapNeat Metadata
Studio, and the current version of the tag vocabulary does not host the
embedded tag.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 183 in Appendix 1] - indicates that the tag
represented by this button was used (whacked) into a recent file. This is
informative so the user knows, should they be tagging files with similar
characteristics, that this tab might deserve attention.
Specifically related to a feature relevant to Photos known as "subregion
whacking" (possibly relevant to other file types also), icons are used to
indicate if a node represented by a button on the tab has been used on a
subregion:
= [Please refer to Figure 186 in Appendix 1] - indicates that the tag
represented by this button has been whacked onto a subregion of the
photo and not yet embedded into the focused file.
= [Please refer to Figure 188 in Appendix 1 ] - indicates that the tag
represented by this button was present and relevant to a subregion of the
photo, when the file was loaded, and has subsequently been "whacked
out", and that change has not yet been saved in the file... and the above
condition is not met.
= [Please refer to Figure 187 in Appendix 1] - indicates that the tag
represented by this button was present in the file and relevant to a
subregion of the photo when the file was loaded, and none of the above
conditions are also met.
= [Please refer to Figure 190 in Appendix 1 ] - indicates that the tag
represented by this button was pressed in the Whack interface, but the
user changed their mind and unwhacked the button. This indicator usually
appears on param values for tags to which the user has made a change to
the value to be associated with the param.
= [Please refer to Figure 191 in Appendix 1] - indicates that the tag
represented by this button was embedded in the file when it was loaded
for processing, but the tag is associated with a version of its host tag
vocabulary lesser than that currently known to SwapNeat Metadata
Studio, and the current version of the tag vocabulary does not host the
embedded tag.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 189 in Appendix 1] - indicates that the tag
represented by this button was used (whacked) into a subregion of a
recent photo. This is informative so the user knows, should they be
tagging files with similar characteristics, that this tab might deserve
attention.
= [Please refer to Figure 223 in Appendix 1] - indicates that this is a
buttontab button, and one or more buttons on the associated tab has been
whacked into the file and is not yet embedded.
= [Please refer to Figure 224 in Appendix 1] - indicates that this is a
buttontab button, and one or more buttons on the associated tag was
present in the file when the file was loaded, and none of the above
conditions are also met.
= [Please refer to Figure 225 in Appendix 1] - indicates that this is a
buttontab button, and one or more buttons on the associated tab were
present in the file when the file was loaded, and have subsequently been
"whacked out", and that change has not yet been saved in the file... and
the above condition is not met.
Optionally, the user can configure SwapNeat Metadata Studio to "remind" them
of recently used tags:
= [Please refer to Figure 226 in Appendix 1] - indicates that this is a
buttontab button, and one or more buttons on the associated tab were used
(whacked) into a recently tagged file. This is informative so the user knows,
should they be tagging files with similar characteristics, that this tab might
deserve attention.
The "Pin" feature explained with the rest of the "Whack Toolbar" buttons has
associated icons for use on buttons and tabs.
= [Please refer to Figure 230 in Appendix 1] - This button is pinned, and this
tag was newly added to this file: this button will remain pressed and
therefore tagged into subsequent files until it is whacked out.
= [Please refer to Figure 231 in Appendix 1] - This button is pinned, but this
tag was already embedded in the currently-loaded file when the file was
loaded.
= [Please refer to Figure 232 in Appendix 1] - This button is pinned, the tag was
not whacked in the current file, nor was it already embedded: because it is
pinned, it WILL be embedded in the current file when the file is saved.
If the button is a Qualifier, it is indicated with the following icons:
= [Please refer to Figure 233 in Appendix 1] - This is a qualifier, but is not
whacked.
= [Please refer to Figure 117 in Appendix 1] - This is a qualifier, newly
whacked into this file.
= [Please refer to Figure 118 in Appendix 1] - This is a qualifier, already
whacked into this file when the file was loaded.
= [Please refer to Figure 119 in Appendix 1] - This is a qualifier that was
embedded in the file when it was loaded, but has been whacked out
= [Please refer to Figure 120 in Appendix 1] - This is a qualifier that was
embedded in the file when it was loaded, but has been whacked out
= [Please refer to Figure 121 in Appendix 1 ] - This is a qualifier that was
embedded in the file when it was loaded, but it is a renegade: the tag
vocabulary hosting this qualifier is a later version than is embedded in the
file, and the qualifier is absent from that later tag vocabulary version.
In the TagBag, the application presents tags from various sources for the user to
add to their TagSet and use. Two icons indicate that a button is or was added by
TagBag interaction:
= [Please refer to Figure 234 in Appendix 1] - When a tag is created by
concatenating text strings from more than one TagBag button (by holding
CTRL and clicking multiple buttons) this icon appears on the resulting
button until the CTRL key is released.
= [Please refer to Figure 235 in Appendix 1 ] - After a button is added to the
Whack interface from the TagBag, this icon is used to indicate to the user
which button was recently added.
Whack Interface Tag Creation
At any point while tagging files, the user may need to create new tags in
categories present in the Whack interface.
Two icons appear on newly-created Whack interface objects: one while the
object is being created, the other on the most-recently-created object. Both serve
to draw the user's attention to the user interface objects that have been recently
created, should the user's eye move away from the Whack interface.
= [Please refer to Figure 236 in Appendix 1] - This whack interface object is
being created: this icon is displayed only until the user commits the text to
the object, after which time the following icon is displayed.
= [Please refer to Figure 237 in Appendix 1 ] - This is the most recently-
created object in the Whack interface.
Creation begins when the user starts typing a keyword. As soon as they start
typing, a blank "button" appears on the foregrounded tab, and each character
appears on the button as typed. An icon on the button that looks like a computer
keyboard key indicates that this is the button being created by the typing
(should the user look away from the Whack interface to examine the file being
tagged, they may not recall which specific button is being created without this
icon indicator).
When they have entered all the text needed, they can choose to set the item as
a
"category" node or a "keyword" node.
If the user wants to create a "keyword" node (that is not meant to contain any
descendant nodes) they press the "Enter" key. The icon on the button changes
immediately from the "typing" icon to the "new button" icon. A second press of the
"Enter" key is equivalent to clicking the newly created button with the mouse. If
they do NOT press Enter a second time, but instead start typing again, another
button will be created as above. In this way, the user can create many tags very
quickly, by typing text, pressing "Enter" once, typing again, pressing "Enter"
once, etc. until all the keywords they want to enter have been created.
Then the user can even assign many of the tags just created by holding the
CTRL-Key and clicking multiple buttons with the mouse pointer, with a process
called "Multi-whack", that is described later in this document.
If the phrase is completed by the user pressing the "Tab" key, the tag is
saved as
a "ButtonTab". Pressing the Tab key a second time is equivalent to clicking
the
button, which, following the Whack interface tab progression rules, would
foreground the associated tab, on which the user can create additional nested
keywords. If the user wanted to make multiple "categories", they could type,
press "Tab" once, type again, press "Tab" once, again, etc. until all the
categories have been created. They could "Multi-Whack" many categories, then
visit each tab in turn, adding further nested categories or keywords.
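The keystroke flow described above can be read as a small state machine. The following is a minimal sketch only, assuming a hypothetical `NewTagEntry` class and field names that do not appear in the application itself: the first "Enter" or "Tab" commits the typed text as a keyword or ButtonTab, and a second consecutive press of the same key stands in for clicking the newly created button.

```python
from dataclasses import dataclass, field


@dataclass
class NewTagEntry:
    """Hypothetical model of typed tag creation on the foregrounded tab."""

    created: list = field(default_factory=list)  # (text, kind) pairs, in creation order
    clicked: list = field(default_factory=list)  # buttons "clicked" by a second key press
    _buffer: str = ""
    _last_key: str = ""

    def type_text(self, text: str) -> None:
        # Typing starts (or continues) a blank button showing the "typing" icon.
        self._buffer += text
        self._last_key = ""

    def press(self, key: str) -> None:
        kind = {"Enter": "keyword", "Tab": "buttontab"}[key]
        if self._buffer:
            # First Enter/Tab commits the text as a keyword or a ButtonTab.
            self.created.append((self._buffer, kind))
            self._buffer = ""
            self._last_key = key
        elif self._last_key == key and self.created:
            # A second consecutive press is equivalent to clicking the new button.
            self.clicked.append(self.created[-1][0])
            self._last_key = ""
```

Typing "Running", pressing Enter, then typing "Jumping" and pressing Enter twice would, in this sketch, create two keywords and click only "Jumping".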
Whack Interface Text Auto Complete Selection
If the user types characters that exactly match the first characters of an
existing
keyword, the button associated with the keyword is focused, and the matching
characters are highlighted in green while they are typed.
By striking the "Enter" key the button is selected. If at any point while typing the
exact string of characters does not match any existing keyword, then a new
keyword is assumed and a new button is created with the typed text.
If the user wishes to create a new keyword for which the entire keyword is the
exact initial subset of an existing keyword, they must first click one of the "New..."
buttons on the toolbar. This creates a new button, and whatever they type
becomes the text of a new tag. This is because it is much more common to
accidentally attempt to create a duplicate of an existing word than it is to
create pure subsets of existing words as new words.
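The matching rule above can be sketched as a prefix test. This is an illustrative sketch, not the application's code: the function name `match_typed_text` and the `force_new` flag (standing in for the "New..." toolbar buttons) are assumptions, and the comparison is case-sensitive because the text speaks of an exact match.

```python
def match_typed_text(typed, existing_keywords, force_new=False):
    """Return ("focus", keyword) on a prefix match, otherwise ("new", typed)."""
    if not force_new:
        for keyword in existing_keywords:
            # The typed characters must exactly match the first characters.
            if keyword.startswith(typed):
                return ("focus", keyword)
    # No existing keyword starts with the typed text (or "New..." was clicked):
    # a new keyword is assumed.
    return ("new", typed)
```

With `force_new=True`, typing "Run" creates a new keyword even though "Running" exists, which is the pure-subset case the "New..." buttons are described as covering.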
This feature allows tagging without forcing the user to remove their hands
from the keyboard to use the mouse.
Another important user interface goal is supported: the goal to reduce or
eliminate duplicate / synonym keywords in a given tag vocabulary. The software
has explicit support for synonyms, but ideally, synonyms are most valuable
when
searching: that is, a search term should find existing keywords that are
synonyms
to the search term entered: it should not be a requirement to tag files with
every
known synonym in order for the search to return desired results, and
therefore,
synonyms should not exist as separate keywords in either the All Tags tab
library, nor embedded in files. This facility will use the TagBag pane to present
similar keywords that are located in the All Tags keyword library hierarchy in a
direct ancestor or a direct descendant of the foregrounded tab. The user's attention
will be drawn to similar keywords that may be present in the All Tags tree, but not
present in the subset that is used to render the Whack interface.
The TagBag is described later in this document.
Collections in the Whack Interface
For the purposes of controlling the hierarchy of both the Whack-A-Tag interface
and the All Tags tree, it is possible to create "Collection" nodes. Unlike "real"
nodes in the tag vocabulary, collection nodes do NOT affect the structure of
the
XML embedded in the file. They are for presentation purposes in the SwapNeat
Metadata Studio application.
They can be created in the Whack interface by clicking the "New Collection"
toolbar button [Please refer to Figure 200 in Appendix 1] on the Whack
interface
pane, and typing in the Collection name. When created in the Whack interface,
the Collection node is rendered as a buttontab.
Typically, collection nodes are presented in the Whack-A-Tag interface either as
Buttontabs, or they are bypassed.
Collection nodes support two organizational requirements:
1. Group similar nodes together, independent of their 'natural' position in
the
overall tag library: nodes from various tag vocabularies that all contribute
to 'workflow management' can be grouped together under a single
"collection" node, regardless of their compartment, datatype, or source tag
vocabulary. This collection node being a child of another real node can
then be bypassed: it serves the purpose of bringing nodes from various
branches of the overall All Tags tree to a single tab.
2. Disperse a large number of sibling nodes to subcategories, without
altering their hierarchy: if you have a large number of nodes at a single
level, collection nodes can be created (for example, the letter the tags
start with can be used to group items A-K, L-R, S-Z). These types of
collection nodes appear as buttontabs: using this example, if you want to
access a leaf node that begins with the letter "C" you would click the 'A-K'
buttontab button, which would take the user to the tab hosting the tags
that begin with letters A-K. The alternative may be dozens of buttons on a
single tab, which would require scrolling down a long list to find a
particular tag. Over the course of dozens or hundreds of files, this
introduces extra seconds-per-file, that can add up to dozens of extra
minutes to complete the tagging task.
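The second use of collection nodes, dispersing many siblings into letter ranges, can be sketched as follows. The function name `bucket_by_initial` and the returned dictionary layout are assumptions made for illustration; only the A-K, L-R, S-Z grouping comes from the text. The tags themselves are not altered, consistent with collection nodes being for presentation only.

```python
def bucket_by_initial(tags, ranges=(("A", "K"), ("L", "R"), ("S", "Z"))):
    """Group sibling tags under collection labels such as "A-K" by first letter."""
    buckets = {f"{low}-{high}": [] for low, high in ranges}
    for tag in sorted(tags, key=str.lower):
        initial = tag[:1].upper()
        for low, high in ranges:
            if low <= initial <= high:
                buckets[f"{low}-{high}"].append(tag)
                break
    return buckets
```

A tag beginning with "C" lands under the "A-K" label, matching the example of clicking the 'A-K' buttontab to reach it.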
The Same as Last command
After completely tagging a file the user can press the "Same as Last" button
or
use the shortcut key "Alt+F2" to apply the same tags to the next file. The
Whack
Interface automatically moves to the Summary Tab which contains all the tags
that will be applied.
From the summary tab or the TagBag, tags can be added or removed.
This is especially useful for photos that are taken within moments of one another,
or songs that are all from the same album by the same artist: the user then need
only assign any specific differences, and the rest will be complete.
Multiple thumbnails can also be selected and tagged with the "Same as Last"
command, within "Selective" tagging mode, rather than switching to "Batch"
mode.
Pinning Buttons
Any "Button" or "ButtonTab" can be pinned down. A pinned button means the tag
associated with the button will automatically be applied to every file that is
processed by the Whack Interface. This feature is very useful if all files in a folder
require the same tag, such as a specific person or place. The pinned button's
icon and cursor change to indicate the pinned state. Pinning is a toolbar state
button, meaning that while the Toolbar "Pin" button is pressed, every click on a
button pins that button. The cursor changes to indicate the pinning action.
Compare this feature to the "Same as Last" feature: this feature is most
useful
when a few tags are common, but most tags are different; "Same as Last" is
most useful when most tags are common, and a few or none are unique to
specific files.
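The effect of pinning on what is embedded can be sketched with plain sets. This is a sketch under assumptions: the function name `tags_to_embed` and its parameters are hypothetical, and it only models the rule that pinned tags are applied to each file until they are whacked out.

```python
def tags_to_embed(whacked, pinned, whacked_out=frozenset()):
    """Tags embedded on save: whacked tags plus pinned tags, minus any whacked out."""
    return (set(whacked) | set(pinned)) - set(whacked_out)
```

For a folder where every file needs the tag "Erin", pinning "Erin" once means each later file only needs its own distinct tags whacked.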
Skipping Tabs when in Certain Modes
The SwapNeat Metadata Studio Application has three primary embedding
modes: "Batch", "Selective" and "Single". The objective of the three modes is to
promote a funnel or pipeline technique when tagging a set of photos. For more
details concerning the modes see the section on "Tagging Modes".
The "Mode Skip" button in the Whack Interface is used to configure which
"Tabs" are visited during the various tagging modes. Even if a "Tab" is not
automatically visited in a particular mode, the tab still exists in the Whack
Interface. It can be manually selected by the user. The state for the tabs is
stored in the TagSet files, such that the state can be shared along with the
TagSet.
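The per-mode skipping described above amounts to filtering the tab progression. In this minimal sketch the function name `tabs_to_visit` and the `skip_modes` mapping (tab name to the set of modes in which it is skipped) are assumptions; how the state is actually stored in TagSet files is not specified here.

```python
def tabs_to_visit(tabs, skip_modes, mode):
    """Return the tabs visited automatically in `mode`.

    Skipped tabs are only left out of the automatic progression; they remain
    in `tabs` and could still be selected manually.
    """
    return [tab for tab in tabs if mode not in skip_modes.get(tab, set())]
```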
Same and Auto Same
The "Same As Last" button [Please refer to Figure 227 in Appendix 1] is like
"Pin", but rather than pinning the button down for all future files, this
feature
reaches back to the previous file, and Whacks all buttons on the foregrounded
tab that were whacked on the most recent file. This is useful when the current
file
is very similar to the previous file, but the user does not anticipate a long
run of
files requiring similar tags.
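The behaviour of "Same As Last" on the foregrounded tab can be sketched as copying one tab's tags forward from the previous file. The function name `same_as_last` and the representation of a file's tags as a mapping from tab name to a set of tags are assumptions for illustration.

```python
def same_as_last(previous_file_tags, current_file_tags, tab):
    """Whack into the current file every tag whacked on `tab` for the previous file."""
    updated = {name: set(tags) for name, tags in current_file_tags.items()}
    # Only the foregrounded tab is affected; other tabs keep their own tags.
    updated.setdefault(tab, set()).update(previous_file_tags.get(tab, set()))
    return updated
```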
For instance if the previous file was tagged with "Running" in the "Actions"
tab,
then clicking the "Same" button while on the "Actions" tab would result in
"Running" being inserted in the current file.
Another way to provide access to the "Pin" functionality may be to add another
button to the Whack tab, similar to "Same as Last", but providing the feature
"Same as this from now on" [Please refer to Figure 238 in Appendix 1 ]. This
feature will pin all the buttons that were used on the foregrounded tab for
the
previous file, rather than manually clicking each one with the "Pin" cursor.
The user would press and hold the CTRL key, then whack a number of tags on
the foregrounded tab, then also whack the "Same as last from now on" button,
effectively pinning them all.
This feature has been suppressed because it will be rarely used and was
deemed to be redundant by the availability and ease of access and use of the
"Pin" function: the user will first click the "pin" button on the Whack
interface
toolbar, then CTRL-CLICK all the buttons they want pinned: the result is the
same: the only difference is that you either proactively select the "Pin" mode
from
the Whack interface toolbar, or you retroactively click the "Same as this from
now
on" button, and all whacked buttons on that tab are pinned.
The Whack Summary Tab
The furthest right tab, or the last tab visited while tagging a photo, is called the
"Summary Tab". It can be turned on or off using the options dialog. It contains all
the tags that are to be embedded into the currently selected file. It also displays
any tags that were taken out. The state of each tag is shown through the
button icon or the pressed state of the button. The "Summary Tab" is a
convenient location to double check the tags before embedding them into the
files. It is also a convenient location to make changes, e.g. to remove unwanted
tags.
Tagbag Pane
[Please refer to Figure 239 in Appendix 1]
The TagBag is where, for various reasons, potentially useful Tags that are not part
of the current TagSet are presented to the user.
The TagBag supports two distinct tasks:
1. migration of unstructured tags (IPTC, XMP, file system folder names, etc.)
to a structured tag vocabulary
2. further increased tagging efficiency, beyond even that of the Whack
interface alone
It does so by presenting various text strings on buttons.
= Some are actual tags in the user's All Tags tree (possibly relevant but not
already present in the TagSet)
= Others are from third-party tag vocabularies and determined to be
(possibly) relevant to the user's current file set. Others are flat keywords
discovered (embedded) in files. Others are segments of the path to the files in
the fileset.
The TagBag toolbar lets the user toggle the presence of certain classes of
buttons, and enable or disable features of the TagBag.
= [Please refer to Figure 240 in Appendix 1] - Toggle visibility of icons on
TagBag buttons. When pressed, icons will be visible.
= [Please refer to Figure 241 in Appendix 1] - Toggle presence of buttons
showing non-structured tags.
= [Please refer to Figure 242 in Appendix 1 ] - Toggle presence of buttons
showing path segments for the current file. From the dropdown next to the
button you can select whether to show the path segments for all the files
in the thumbstrip, or just those of the current file in the Preview pane.
= [Please refer to Figure 243 in Appendix 1] - Toggle presence of buttons
showing embedded IPTC keywords. From the dropdown next to the button
you can select whether to interpret specific characters in the IPTC
keywords as structure delimiters according to the settings in the
"Import Structure" tab of the "Options" dialog, or to keep the
text of single IPTC keywords on a single button.
= [Please refer to Figure 244 in Appendix 1] - Toggle presence of buttons
showing ID3 fields. From the dropdown next to the button you can select
whether to show the ID3 frame values for all the frames in the current file,
or just that of the foregrounded Tab.
= [Please refer to Figure 245 in Appendix 1] - Toggle presence of buttons
relevant to ALL files in the thumbstrip, or just those for the focused file
= [Please refer to Figure 246 in Appendix 1 ] - Auto-shatter: unstructured
keywords and folder names will be shattered into individual words on
buttons, which can then be rejoined in any order by CTRL+Clicking.
= [Please refer to Figure 247 in Appendix 1] - Suggest tags from the All
Tags tree that are immediate children of this tab. Whacking these buttons
in the TagBag will add the button to the TagSet.
= [Please refer to Figure 248 in Appendix 1] - Suggest tags that are
available in related third-party tag vocabularies. These are in tag
vocabularies that have been published to the SwapNeat Community, but
those tag vocabularies have not yet been downloaded into your All Tags
tree.
= [Please refer to Figure 249 in Appendix 1] - Show in the TagBag the tags
that are on the Whack "Summary" tab. This lets you have an up-to-date
status of the tags in the file without navigating to the Summary tab
itself.
= [Please refer to Figure 250 in Appendix 1 ] - Show in the TagBag tags that
have been recently used: tags suggested in this category are in the
branch relevant to the foregrounded Whack tab.
= [Please refer to Figure 218 in Appendix 1] - Because it is possible to use
the "Recently Used" TagBag buttons to Whack a file, the
"Done" button is present on the TagBag toolbar for convenience. It
also appears on the Whack pane toolbar and the "External Applications"
toolbar.
The TagBag has buttons similar to those in the Whack interface that present
tags
that by a number of means are determined to be contextually useful to the
user.
The TagBag contents change for each file that is presented for tagging, and
for each tab on the Whack interface.
Button Icons
Even though the TagBag presents basically text strings to the user, and the
user
is not likely to care from where the text string is derived (only that it is
useful in
the context it is presented), each TagBag button can have an icon to indicate
its
type and where it came from.
Increased Tagging Efficiency
The TagBag can provide easy access to structured tags that are not present in
the TagSet (and therefore not present in the Whack interface).
History
= [Please refer to Figure 251 in Appendix 1] - Tags recently applied to files
by clicks in the Whack interface are shown in the TagBag with this icon.
For the foregrounded tab, the TagBag shows the N most recently used tags. For
example, the TagBag will show the last 10 buttons clicked on the "People" tab when
the "People" tab is focused in the Whack Interface.
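A per-tab history of the last N clicked buttons can be sketched with bounded queues. The class name `TagHistory`, the default limit of 10 (taken from the example above), and the choice to move a repeated tag to the front rather than list it twice are all assumptions of this sketch.

```python
from collections import defaultdict, deque


class TagHistory:
    """Hypothetical per-tab record of the most recently clicked tag buttons."""

    def __init__(self, limit=10):
        # One bounded queue per tab; the oldest entry drops off when full.
        self._recent = defaultdict(lambda: deque(maxlen=limit))

    def record(self, tab, tag):
        recent = self._recent[tab]
        if tag in recent:
            recent.remove(tag)  # a repeat click moves the tag to most recent
        recent.append(tag)

    def recent(self, tab):
        # Most recently used first, as it might be shown in the TagBag.
        return list(reversed(self._recent[tab]))
```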
Existing SNMD tags
= [Please refer to Figure 252 in Appendix 1] - Tag is in the same tag
vocabulary as the foregrounded Whack tab, but the tags are not included
in the current TagSet.
= [Please refer to Figure 253 in Appendix 1] - This tag is in the same branch
as the foregrounded Tab, and might be relevant.
= [Please refer to Figure 254 in Appendix 1] - This tag is in a module that is
rooted to integrate at the foregrounded tab.
= [Please refer to Figure 255 in Appendix 1] - This tag is not a direct
ancestor, sibling, or descendant of
the foregrounded tab, but is in a branch that is parallel to the branch
containing the foregrounded tab.
Suggested Tags
Based on the "Tab" the user is on and its name, SwapNeat queries the global
internet database for likely useful vocabularies and presents the vocabularies as
buttons.
= [Please refer to Figure 256 in Appendix 1] - The foregrounded tab
matches a simple text search in the tag vocabulary represented by this
button.
= [Please refer to Figure 257 in Appendix 1] - The SwapNeat community has
record that the ID3 field represented by the foregrounded tab has the
value represented by this button as a popular value.
Migration of Flat (unstructured) tags into a structured tag vocabulary
Other Metadata Compartments with existing keywords
If a file already contains keywords, embedded in another compartment such as
IPTC, the keywords appear as buttons in the TagBag. Any existing captions are
also added to the TagBag.
= [Please refer to Figure 137 in Appendix 1] - This unstructured keyword
originated in the Dublin Core compartment
= [Please refer to Figure 138 in Appendix 1 ] - This unstructured keyword
originated in the EXIF compartment
= [Please refer to Figure 139 in Appendix 1 ] - This unstructured keyword
originated in the ID3 compartment
= [Please refer to Figure 141 in Appendix 1] - This unstructured keyword
originated in the IPTC compartment
= [Please refer to Figure 143 in Appendix 1] - This unstructured keyword
originated in the TIFF compartment
= [Please refer to Figure 144 in Appendix 1 ] - This unstructured keyword
originated in the XMP compartment
File path name
= [Please refer to Figure 258 in Appendix 1] - The text on this button comes
from a segment in the file path.
The segments that make up the path of the current file are presented as buttons,
because the path segments might themselves be useful as tags. This is particularly
useful when the user first starts using SwapNeat Metadata Studio: after a time, any
reusable information from folder paths will have been migrated to real structured
metadata tags. Therefore, there is a button on the TagBag toolbar to toggle the
presence of such path name fragments off.
For instance if the current focused file was
c:\Library\Photos\Family\Erin\Playing Ball.jpg, the following buttons would appear
on the TagBag: "Library", "Photos", "Family", "Erin", "Playing Ball". Using the
"Shatter" feature, "Playing Ball" could be separated into "Playing" and "Ball".
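The path example above can be reproduced with a short sketch. The function names `path_segments` and `shatter` are hypothetical, and splitting on whitespace is an assumed reading of the "Shatter" feature; the drive letter and file extension are dropped so that only the folder names and the file's base name become button text.

```python
from pathlib import PureWindowsPath


def path_segments(file_path):
    """Return the folder names and the file's base name as flat button strings."""
    path = PureWindowsPath(file_path)
    # Drop the drive/root anchor (e.g. "c:\\"), keep every remaining part.
    parts = [part for part in path.parts if part != path.anchor]
    # Replace the final file name with its stem (no extension).
    return parts[:-1] + [path.stem]


def shatter(text):
    """Break a multi-word string into individual words, one per button."""
    return text.split()
```

No relationship between the segments is inferred: the result is a flat list, in keeping with the strings being presented "flat" for the user to interpret.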
There is no attempt by the SwapNeat Metadata Studio application to interpret
these strings nor present context to the user: the fact that the paths are
nested
folders on the file system does not illustrate the relationship between the
strings,
so they are presented "flat" on buttons in the TagBag for the user to
interpret.
They persist in the TagBag while the user advances through tabs in the Whack
progression: at any time, the user can refer to the TagBag and click one of these
buttons, which will have the effect of 'moving' the tag to the foregrounded Whack
tab, and thereby assigning context.
Dragging and Dropping TagBag buttons onto Whack interface buttons is
supported for making a file system folder "equivalent" to a structured,
contextual
SNMD tag.
Example: When the "People" tab is foregrounded, it may already host a buttontab
button for the category "Family". If the user were to drag the button from the
TagBag that represents the "Family" file system folder and drop it on the "Family"
button on the "People" tab, that would make them equivalent: it tells the
application that, from now on, the "Family" path keyword is a direct synonym of
the "Family" node under "People" and need not be presented again (depending
on other user configured options).
When the "Actions" tab is presented, there may not be a tag for "Playing
Ball".
Clicking on the TagBag button "Playing Ball" when the "Actions" tab is
foregrounded is equivalent to performing multiple actions:
1) Manually keying "Playing Ball" as a new tag on the Actions tab
2) Dragging the "Playing Ball" TagBag button onto the "Playing Ball" button on
the Whack interface to make them equivalent.
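The two actions combined in a TagBag click can be sketched as one function. Everything named here is an assumption for illustration: `click_tagbag_button`, the `tab_buttons` mapping from tab name to button texts, the `equivalences` mapping from a flat string to a structured node, and the "Tab/Text" node path format.

```python
def click_tagbag_button(text, tab, tab_buttons, equivalences):
    """Model clicking a TagBag button while `tab` is foregrounded."""
    buttons = tab_buttons.setdefault(tab, [])
    if text not in buttons:
        # Action 1: equivalent to keying the text as a new tag on the tab.
        buttons.append(text)
    # Action 2: equivalent to dragging the TagBag button onto the new button,
    # recording the flat string as a synonym of the structured node.
    equivalences[text] = f"{tab}/{text}"
    return equivalences[text]
```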
Interaction
Some TagBag buttons are keywords embedded in the file, but lack context: the
TagBag can be used to categorize / add context to previously flat keywords.
Also, in support of the reality that sometimes tags are misspelled, in the
wrong
field, or just not in-line with the user's personal preferences w.r.t.
capitalization,
word order etc., a few features available in the TagBag allow the user to make
use of incorrect tags in an efficient way.
= [Please refer to Figure 259 in Appendix 1 ] - When the order of the words
in a tag is wrong, the user can break the individual words into individual
buttons in the TagBag by holding. Then the user can CTRL-Click
individual buttons to "meld" them together into a single multi-word string
that will become a new button on the foregrounded Whack tab.
= [Please refer to Figure 260 in Appendix 1] - When the user wants to
recombine the text that has been shattered onto individual buttons,
clicking any button that has a component word of the original string will
'unshatter' the buttons back into a single string.
= [Please refer to Figure 234 in Appendix 1] - This button is being created by
CTRL-CLICKING other buttons.
Search Pane
SwapNeat Metadata Studio provides three different modes of searching. Each
mode of searching addresses the user's level of sophistication and need for
specificity.
= simple
= whack
= advanced (tree based)
Simple search finds files by plain text: it is the type of search that most users
will already be familiar with. Finding results that match a single keyword takes
many more clicks in either the Whack search mode or the Advanced tree-based
search than simply keying in a text string.
Contrast that with the complexity of the simple text search string that users
would have to concoct (combined with locking files in the ThumbStrip) to get
complex results like those possible with the Advanced tree-based search.
Structured metadata is not just nested information: it implies levels of specificity,
so it makes sense to be able to search for structured items in a way different
from merely having folders holding keywords starting with a-d and then f-j, etc.
There is meaning even when only the parent folder has been specified.
The toolbar will adjust depending on the search mode:
= [Please refer to Figure 261 in Appendix 1 ] - Simple search: type text in a
text box and files having tags that match solely on text (not on structure)
will be returned
= [Please refer to Figure 262 in Appendix 1 ] - Use a "Whack" interface to
conduct the search
= [Please refer to Figure 263 in Appendix 1] - Search by clicking nodes in a
tree: much more complex search results are possible.
After a search has been performed:
= [Please refer to Figure 264 in Appendix 1] - The search can be done again
in case the files on the filesystem have changed
= [Please refer to Figure 265 in Appendix 1] - The search parameters will be
cleared, but the thumbstrip will not be altered
In "Simple Search" mode only the preceding 5 buttons are available on the
Search pane toolbar.
In Advanced (tree-based) search mode, the following three buttons are
available
to locate specific nodes in the tree:
= [Please refer to Figure 88 in Appendix 1] - Find a tag in the search tree
= [Please refer to Figure 89 in Appendix 1] - Find the next occurrence of a
tag in the search tree
= [Please refer to Figure 266 in Appendix 1] - Highlight all the nodes that are
affecting the search result (including expanding any collapsed nodes that
hide selected descendant nodes)
In Whack search mode, three buttons provide features for navigating between
tabs
= [Please refer to Figure 267 in Appendix 1] - return to the first tab in the
Whack search tab progression
= [Please refer to Figure 268 in Appendix 1] - go to the previous tab in the
Whack search tab progression
= [Please refer to Figure 269 in Appendix 1] - go to the next tab in the Whack
search tab progression
The next three buttons are used in both Whack searches and Advanced
searches. The three toolbar buttons act as radio buttons: they indicate how
clicked tags (buttons on tabs in the Whack search, and nodes in the Advanced
Tree search) participate in the search:
= [Please refer to Figure 270 in Appendix 1] - files must contain all the tags
marked to be returned in the search result.
= [Please refer to Figure 271 in Appendix 1] - files must contain at least one
of the tags marked to be returned in the search result.
= [Please refer to Figure 272 in Appendix 1] - files must not contain any
of the tags marked to be returned in the search result.
These three types of search operators, used in conjunction with the "Locking"
functionality in the Thumbstrip, allow for the resultant file list to be
equivalent to a
very complex aggregate search query.
= [Please refer to Figure 273 in Appendix 1] - Use this node as a qualifier
for
another node when searching.
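The ALL, ANY and NOT operators above can be sketched as set tests over each file's tags. This is a minimal sketch: the function name `search`, its parameters, and the representation of files as a mapping from file name to a set of tag paths are assumptions, and the Qualify operator and Thumbstrip locking are left out.

```python
def search(files, all_tags=(), any_tags=(), not_tags=()):
    """Return names of files that satisfy the ALL, ANY and NOT tag selections."""
    results = []
    for name, tags in files.items():
        if not set(all_tags) <= tags:
            continue  # ALL: every marked tag must be present
        if any_tags and not (set(any_tags) & tags):
            continue  # ANY: at least one marked tag must be present
        if set(not_tags) & tags:
            continue  # NOT: none of the marked tags may be present
        results.append(name)
    return results
```

Marking two tags as ALL and one as NOT expresses a query such as photos containing both of the first two tags but not the third.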
Simple Search
[Please refer to Figure 274 in Appendix 1 ]
The interface for the Simple search is very similar to a web search engine. The
user types an arbitrary keyword or keyword expression. Syntax such as +, -, and
"keyword with spaces" is supported to allow for more complicated and detailed
searches. The simple search also takes into account synonyms of nodes with the
names given, so that spelling errors which have been corrected by use of
equivalences can also be found properly.
Each search term is considered a group, and any match in the group of strings
that match the search term, is considered equivalent.
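One way to read the simple search syntax is sketched below. The meanings given to "+" (required) and "-" (excluded) follow common web search convention and are an assumption, as are the function names `parse_query` and `expand` and the representation of synonyms as sets of equivalent strings.

```python
import shlex


def parse_query(query):
    """Split a query into (required, excluded, optional) term lists."""
    required, excluded, optional = [], [], []
    # shlex keeps a double-quoted "keyword with spaces" together as one term.
    for token in shlex.split(query):
        if token.startswith("+") and len(token) > 1:
            required.append(token[1:])
        elif token.startswith("-") and len(token) > 1:
            excluded.append(token[1:])
        else:
            optional.append(token)
    return required, excluded, optional


def expand(term, synonym_groups):
    """Return the group of strings treated as equivalent to `term`."""
    for group in synonym_groups:
        if term in group:
            return set(group)
    return {term}
```

Each parsed term would then be expanded into its group, so that a file tagged with a corrected spelling is still found by the misspelled term.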
Whack Searching
[Please refer to Figure 275 in Appendix 1 ]
The Whack search, as its name suggests, is very similar in appearance to the
Whack Interface: whack search buttons are drawn on tabs. Root node buttons
are presented on the first tab. As the user selects buttons, they traverse deeper
into the tag vocabulary hierarchy and therefore narrow down the search results.
The pressed button can be in any of four states: "ALL", "ANY", "NOT" and
"Qualify". This mode helps reduce the screen real-estate required for the whack
search process, which would otherwise require up to 4 independent whack
interfaces.
After each button is pressed a search result is returned, thus providing feedback
to the user. Complicated expressions such as "all photos of Tommy playing ball
without Jill" can be performed easily by pressing the "People/Family/Tommy" and
"Actions/Body Actions/Playing Ball" buttons in the ALL state, then selecting the NOT
toolbar button and pressing "People/Family/Jill". Pressed buttons are displayed in
either the ALL, ANY, NOT
or Qualify sections, so the search expression can be easily deciphered.
Pressed
buttons can also be taken away easily to modify the search results.
The primary advantage of the Whack Search interface is that the user is
presented with what possible keywords are available to be used for creating
search expressions. Often in other products or applications the user is left
to
guess keywords, not knowing whether they even exist in any photos.
The counts of how many photos would be added or removed are displayed on
each button. This provides further information into how many results the user
should expect from each click.
Structured Tree Search
[Please refer to Figure 276 in Appendix 1]
The Structured Tree Search mode is a tree representation of all the possible
metadata that can be searched. It displays all the standard metadata
compartments such as exif, xmp, id3, iptc, mp3 and the various snmd tag
vocabularies.
The advantage of this mode is that the structure or hierarchy is plainly
visible as the user traverses trees and selects tags for the search. Similar
to the Whack Search, each node can be in any of five states: "Not Selected",
"ALL", "ANY", "NOT" and "Qualify". Node states are changed by clicking the
icon. For each state change the search results are returned. Counts are also
available to indicate the effect the node will have on the search results.
Saved Search Pane
[Please refer to Figure 277 in Appendix 1]
The Saved Search pane is a control for the management of Saved Search files
(.snsq). Its presentation is very similar to the TagSet pane. It is a Tab
control with buttons on each tab. Each tab represents a category such as
"Personal", "Music", "Manufacturing", "Hobbies", etc. A Saved Search is a
recorded SwapNeat search expression stored as XML. The file also contains a
header which holds a Title, Author, Category and Description. The contents of
the header are used to display the tooltip for the saved search, which
appears as a button.
The list of available categories is retrieved from the SwapNeat website. The
list is dynamically generated based on user contributions and creation of new
categories by the SwapNeat community.
Clicking a button in the Saved Search pane loads the search expression
contained in the snsq file associated with the button. The results of the
search are shown in the Thumbstrip. Holding the "shift" or "ctrl" key allows
multiple Saved Searches to be superimposed on each other to create more
detailed search expressions.
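A Saved Search file of this kind could be read and superimposed roughly as
follows. The XML element and attribute names used here (savedsearch, header,
expression, term, state) are hypothetical; the description states only that
the file is XML with a header holding a Title, Author, Category and
Description.

```python
import xml.etree.ElementTree as ET

# Hypothetical .snsq content; the real schema is not given in the text.
SAMPLE = """<savedsearch>
  <header>
    <title>Family at play</title>
    <author>Example User</author>
    <category>Personal</category>
    <description>Photos of family members playing ball</description>
  </header>
  <expression>
    <term state="ALL">People/Family</term>
    <term state="NOT">Privacy/Private</term>
  </expression>
</savedsearch>"""

def load_saved_search(xml_text):
    """Return (header dict, list of (state, tag path)) from a saved search."""
    root = ET.fromstring(xml_text)
    header = {child.tag: (child.text or "").strip()
              for child in root.find("header")}
    terms = [(t.get("state"), (t.text or "").strip())
             for t in root.iter("term")]
    return header, terms

def tooltip(header):
    """Build the tooltip text shown for the saved search button."""
    return "{title} by {author} [{category}]: {description}".format(**header)

def superimpose(*term_lists):
    """Overlay several saved searches, dropping duplicate terms."""
    combined = []
    for terms in term_lists:
        for term in terms:
            if term not in combined:
                combined.append(term)
    return combined

header, terms = load_saved_search(SAMPLE)
overlay = superimpose(terms, [("NOT", "Privacy/Private"), ("ANY", "Places/Home")])
```

Here superimposing simply merges term lists, which is one plausible reading
of overlaying searches with the "shift" or "ctrl" key.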
By allowing multiple partial searches to be saved and overlaid, various
combinations of searches become accessible to novice users without them
having to understand the intricacies of structured metadata and equivalences.
These Saved Searches (.snsq) are sharable and can be exchanged among
friends or communities to provide guidance on the kinds of photos desired. For
instance, the receiving user can then augment the search with additional saved
searches which have the effect of filtering out private images or restricting
the camera that was used, so that the images are the ones the user owns and
not those he has gathered in the past.
Also, saved searches can be tailored to allow the overlay of certain copyright
requirements onto a search.
Organize Files
rename files based on metadata
move files based on metadata
Modes: Task-centric Pane Arrangements
• [Please refer to Figure 278 in Appendix 1] - Import mode is used to bring
  files under the control of SwapNeat: photos saved from email messages or
  from your digital camera, and music files ripped from CD by Windows Media
  Player or iTunes.
• [Please refer to Figure 279 in Appendix 1] - Batch Tag mode presents
  keywords that are suitable for tagging many files at once. For example, to
  tag many photos taken at one Event, or to tag the Artist and Album
  information for songs ripped from a CD.
• [Please refer to Figure 280 in Appendix 1] - Selective Tag mode is used
  for tagging individual files; you'll be presented with different tags to
  choose from. For those Event photos, you'll tag each individual photo with
  the names of the People in the file, or the particular Activity shown. For
  music files, the Song Title and Genre are more suited to selective file
  tagging, rather than Batch tagging.
• [Please refer to Figure 281 in Appendix 1] - Single Tag mode is for tagging
  details into a file that require closer consideration of the file; for
  example, with a photo with lots of people, you may want a closer look at
  the file, and choose names of more people than usual. For music, if you
  want to tag each track with the composer or soloist or mood of the song,
  you'll want more direct access to all the tags available, and a close-up
  view of the file.
• [Please refer to Figure 282 in Appendix 1] - Organize mode is used to
  arrange the files on your hard drive so you can locate them without having
  to do a search, either from within SwapNeat or from Windows Explorer.
  Depending on the type of file or the number of files you have like it,
  you'll want to organize files differently. For example, you probably want
  to store a photo of your friend that you received in an email message
  alongside the photos of that person that you took yourself. Using the tags
  in the file, your photos can be stored in folders that make sense to you.
• [Please refer to Figure 283 in Appendix 1] - Search mode is used to find
  files in large numbers with specific criteria. If your files are properly
  organised on your hard drive, sometimes it's easier to just go right to the
  folder using Windows Explorer. Other times, you want to find specific files
  that have a certain combination of tags. Using the Search function, you
  can create slideshows, music playlists, and lists of files to back up onto
  CD or DVD.
• [Please refer to Figure 284 in Appendix 1] - Manage tags mode is the
  control center for all your tags. You can make new tag vocabularies suited
  to particular topics for photos, or types of information about music files.
  You can make TagSets that simplify your tagging process. You can share
  your carefully-crafted tag vocabulary file with friends, or publish it to
  the SwapNeat Community website for everyone to use. If you accept
  contributions to your tag vocabulary from SwapNeat Community
  members, you can examine their ideas and merge them into your tag
  vocabulary.
• [Please refer to Figure 285 in Appendix 1] - In addition to the 7 built-in
  modes, you can create your own pane arrangements and save them for
  later use. The Custom Mode is also used to assign a pane arrangement
  that you have loaded from a file to a particular mode. Load a pane
  arrangement from a file as a "custom" pane arrangement, then assign the
  current pane arrangement to a specific mode.
Tagging Modes
The SwapNeat Metadata Studio Application has three primary tagging modes:
"Batch", "Selective" and "Single". Each mode is comprised of a subset of the
14
panes available in SwapNeat Metadata Studio, a subset of the 7 toolbars
available, and the orientation, position, and size of each of these objects in
the
overall user interface.
The objective of having three distinct modes is to promote a funnel effect
technique when tagging a set of files.
It is likely that a new set of imported files consists of files exhibiting
similar metadata requirements.
For example, photos taken with a digital camera often feature subjects
captured at a single event and / or a single location. It would be efficient
to "Batch" tag all files with the same "Event" and "Places" tag.
With digital music, all the files may be copied from the same compact disc or
be
of the same 'genre'.
A large subset (but not all) of the files may share other common
characteristics. It is most efficient to review the thumbnails in the
Thumbstrip and multi-select files that share similar metadata requirements,
and tag the selected files simultaneously with identical tag(s). This mode is
known as "Selective" mode, in that you select many files simultaneously from a
larger set.
Photos may contain people engaged in a particular activity, or the same
persons
may appear in multiple photos, or at a given event, the same objects may be
pictured.
Music files may share common traits like the name of the songwriter(s), the
genre of the music, the artist, etc.
Lastly, certain files or metadata requirements are best suited to close
scrutiny of
the file, and careful consideration on a per-file basis. This is known as
"Single"
mode, as each file is examined and tagged individually, a single file at a
time.
For photos, the user might want to rate each photo for image quality and
esthetic
appeal. Music files typically have unique titles that need to be entered
individually.
Each mode alters the layout and visibility of the various panes to accommodate
the operation.
Batch Tagging
[Please refer to Figure 286 in Appendix 1]
In Batch mode the thumbstrip is wide with multiple columns, allowing the user
to better interact with and assess the files as a set. The thumbnails are
smaller than in other modes, to accommodate more thumbnails in the ThumbStrip.
Entering the "Batch" mode automatically selects ALL thumbnails in the
thumbstrip, such that any tagging action will affect all files simultaneously.
Certain panes, like the Preview and Metadata Chart pane, which are only
relevant to the single 'focused' file, are hidden in Batch mode.
It is possible to deselect all files by clicking in the ThumbStrip on an area
that is
not a ThumbStrip tile. The user is given the option to continue with the
deselect
all operation, or cancel, and keep all the files selected as it is the purpose
of
Batch mode to operate on all files in the thumbstrip simultaneously.
"Selective"
and "Single" modes are easily activated with only 1 button-click on the Modes
Toolbar, rather than the relatively cumbersome task of rearranging the panes
in
Batch mode to better suit the respective "Selective" or "Single" file tagging
requirement.
Selective Tagging
[Please refer to Figure 287 in Appendix 1]
In Selective mode the thumbstrip is narrower than it is in Batch mode, and the
thumbnails are larger to allow the user to see more detail in each file. The
preview and metadata chart are visible, as the user will want to ensure with a
closer view that the files they have selected are suited to the specific tags
they
intend to apply to the files.
Single Tagging
[Please refer to Figure 288 in Appendix 1]
In "Single" mode the ThumbStrip is not visible, and an additional toolbar for
navigating between the files in the file list is displayed. As with normal
"whack"
tagging progression, when the metadata is applied to the focused file, the
next
file is loaded automatically, so the user need not actively pick and activate
the
next file to be tagged.
The preview of the focused item is as large as possible to allow the user to
see
fine details, and perform certain tasks that are only applicable to a single
file at a
time, such as subregion whacking, or picking a colour out of a specific area
of an
image, or identifying the range of time in a song that comprises the "chorus"
or
"solo".
User customizable modes
The various modes in SwapNeat Metadata Studio come with predetermined
layouts and configuration; however, each mode can be customized by the user.
Any changes to the toolbars or visibility of the panes are saved for each mode
and reloaded at application startup. The defaults can also be easily restored.
External Keyword Migration
Migration is the means by which existing unstructured metadata is accepted
into the SwapNeat database and made available for processing in a structured
metadata environment.
Migration is a brief operation without lasting effects on the interpretation
of new information: if the same non-structured, non-fielded metadata value, or
"keyword", is encountered again, lacking structure or context, it is not safe
to assume that the keyword has the same semantic meaning as that previously
encountered. It is for the user to decide whether or not it is the same.
The user can decide, at the time a non-structured 'keyword' is first
encountered, whether they want all subsequent occurrences of the same keyword
discovered in other non-structured data compartments to be treated as an exact
synonym: this is only recommended when the user has specific knowledge of the
general non-structured metadata previously used on the file set.
batch methods
The TagBag presents to the user all non-structured keywords embedded in the
files in the ThumbStrip file set.
The Whack interface presents structured categories into which non-structured
keywords can be sorted.
The Creation Guidance pane advises the user on the nature of the tags on the
foregrounded tab.
CTRL-Clicking on tags in the TagBag combines individual words on buttons
together, to form a single keyword: the mechanism to sort tags from the TagBag
into structured metadata vocabularies is to simply click them when the
appropriate category tab is foregrounded in the Whack interface.
For example, in a collection of photos that has embedded non-structured tags
that represent people, places, objects, events, and activities, the tags would
all appear in the TagBag simultaneously (as without context, there is no
accurate method to distinguish the classes of metadata from one another).
If the first tab in the Whack interface is the "People" category, the user
would observe in the Creation Guidance pane a suggestion that "categories" of
people should be created as immediate children of the "People" tab: for
example, "Family", "Friends", "Neighbours", etc. If such tags appear on
buttons in the TagBag, the user would click them individually, and those
buttons would be removed from the TagBag, and appear on the 'People' tab in
the Whack interface. If what the Creation Guidance recommends does not appear,
the user could choose to ignore the recommendation, or manually create the
categories by typing (described under the 'Whack Interface' section of the
preferred embodiment).
The process is repeated until all the buttons in the TagBag have been moved to
appropriate tabs in the Whack interface.
mapping tables from flat to structured data
If the user is familiar with the nature of the non-structured metadata that
they wish to import into a structured tag vocabulary, it is possible to
pre-configure a translation table that can operate on many files at once. For
example, in a set of photos, the individual keywords could be mined into a
single list, and their structured counterparts identified: a non-structured
keyword label representing a person's spouse (for example, "Pat") could be
automatically filed in the structured tag vocabulary under the node at
"People / Family". Subsequently, all files containing the non-structured
keyword "Pat" could be automatically updated to include the structure
"People / Family / Pat" (properly encoded both as SNMD and in a common
delimited form using a character such as a "/" character, as shown).
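A translation table of this kind can be sketched as a dictionary from flat
keyword to parent node path. The function name and data layout below are
hypothetical, and the sketch only produces the delimited form; encoding as
SNMD is not shown.

```python
def apply_mapping(file_keywords, mapping, delimiter=" / "):
    """Translate flat keywords into delimited structured paths.

    file_keywords: dict of filename -> list of flat keywords found in the file
    mapping: dict of flat keyword -> list of parent node names
    Keywords without a mapping entry are left for the user to file manually.
    """
    structured = {}
    for name, keywords in file_keywords.items():
        paths = []
        for keyword in keywords:
            if keyword in mapping:
                paths.append(delimiter.join(mapping[keyword] + [keyword]))
        structured[name] = paths
    return structured

table = {"Pat": ["People", "Family"]}
result = apply_mapping({"p1.jpg": ["Pat", "Beach"], "p2.jpg": ["Beach"]}, table)
```

In this example "Beach" has no table entry, so it produces no structured
path and would remain in the TagBag for interactive filing.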
automatically map pathed information into specific vocabularies
Workarounds to impose structure in otherwise non-structured metadata
compartments include using a delimiter character to indicate structural
divisions: for example "People / Family / Pat" or "People. Family. Pat".
The SwapNeat Metadata Studio application can be configured to interpret such
tags with specific delimiter characters as indicative of "structure". The user
need only indicate a specific Tag Vocabulary (and optionally, a specific node
therein) where such pseudo-structured metadata should be rooted.
If a photo contains a keyword "Cars / Ford / Mustang", and the user has an
"Objects" node in their target tag vocabulary, the user would indicate that
the "/" character is a delimiter character, and that the destination node for
pseudo-structured metadata is the "Objects" node, resulting in new structures
being generated in their structured tag vocabulary: "Objects / Cars / Ford /
Mustang".
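The "Cars / Ford / Mustang" example can be sketched with a tag vocabulary
held as nested dictionaries. The function name and the nested-dictionary
representation are assumptions for illustration only.

```python
def root_keyword(vocabulary, keyword, delimiter, destination):
    """File a delimited keyword under the destination node.

    vocabulary is a nested dict of node name -> child nodes. Missing nodes
    are created, and the full structured path is returned.
    """
    parts = [p.strip() for p in keyword.split(delimiter) if p.strip()]
    node = vocabulary.setdefault(destination, {})
    for part in parts:
        node = node.setdefault(part, {})
    return " / ".join([destination] + parts)

vocab = {"Objects": {}}
path = root_keyword(vocab, "Cars / Ford / Mustang", "/", "Objects")
```

Because existing nodes are reused, filing a second keyword such as
"Cars / Ford / Focus" would add only the new leaf under the same branch.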
interactive method
A user having flat tags has to categorize them: SwapNeat Metadata Studio
includes various hierarchical tag frameworks as guidance to file them
properly.
A user having non-SNMD structured tags can see them file by file, or a set of
files at a time.
The user can click a single tag and say "find all files containing this tag".
This loads all the files containing a specific tag into the thumbstrip. With
a goal being to disambiguate flat keywords where the same text has different
meanings, it is possible that the files presented in the thumbstrip will
exhibit multiple semantic meanings for the same keyword. The user can select
only files of a certain class.
This method allows the user to pick files from the Thumbstrip that represent
one of many semantic meanings for the same keyword and remove them from the
list. This is useful in the case of a number of photos having a "Me" tag,
where many people have used that tag for themselves. The user would say "find
all files having this tag" then remove files that didn't match what they were
trying to do. So for all the files remaining in the thumbstrip, they would
translate the ambiguous tag unambiguously, then repeat the process until they
have resolved the ambiguity for that tag in all files.
Structured non-SNMD metadata can come from a few sources:
• cleverly delimited IPTC (iMatch)
• Windows Vista's Windows Photo Gallery
• Microsoft Digital Image Suite (Picture It!)
• Adobe Photoshop Elements 4+ (requires extra effort on the user's part to
  get tags into files)
The user probably won't know for sure if the structured metadata they have is
100% their own, unless they know for certain that tagged photos received from
third parties NEVER end up in their main library. If such photos do end up
there, they risk 'corrupting' their Tag Vocabulary if such tags are introduced
to SNMDS.
The user would turn off auto file
The Media Detector
The Media Detector is a background process that is launched during the
operating system startup. The program sets triggers for file changes on all of
the folders within the SwapNeat library. By doing this it can detect file
additions, deletions and modifications on files within its database, even when
the main application is not running. The full filenames and the type of
operation are recorded in a journal and then passed to the main SwapNeat
Metadata Studio application if it is running at the time. If the main program
is not running, the journal is appended to until the main application is
started. By logging file changes the application can keep its database
consistent, and it can re-embed snmd tags that might have been stripped out by
third party tools.
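The journal kept by such a process might look like the sketch below. Note
that this substitutes periodic snapshot comparison for the operating system
change triggers described above, purely so the example stays within the
standard library. The journal format (one JSON object per line) and all
function names are assumptions.

```python
import json
import os

def snapshot(folders):
    """Record the modification time of every file under the given folders."""
    state = {}
    for folder in folders:
        for dirpath, _dirnames, filenames in os.walk(folder):
            for filename in filenames:
                path = os.path.join(dirpath, filename)
                state[path] = os.path.getmtime(path)
    return state

def diff(old, new):
    """Compare two snapshots and list additions, deletions, modifications."""
    entries = []
    for path in sorted(new.keys() - old.keys()):
        entries.append({"path": path, "op": "added"})
    for path in sorted(old.keys() - new.keys()):
        entries.append({"path": path, "op": "deleted"})
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            entries.append({"path": path, "op": "modified"})
    return entries

def append_journal(journal_path, entries):
    """Append entries to the journal until the main application reads it."""
    with open(journal_path, "a", encoding="utf-8") as handle:
        for entry in entries:
            handle.write(json.dumps(entry) + "\n")
```

The main application would read and clear the journal at startup, then
reconcile its database against the listed paths.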
Workflow Examples
here are some worked examples of how the studio can be used for performing
certain common tasks
Usage Phases
adoption phase
the product is being evaluated for compatibility with existing tools and
historical
image or music library management methods particular to the user
early usage phase
in this case, the user has decided to try to use the program but has not
mastered
all of the tools they need for efficient usage... they also may have various
kinds of
tagged data that they are trying to standardize
long term use
these tools are designed to be applied on demand after the user has
established
the database and has some tagged files
user expertise
novice / undisciplined / fatigued
metadata familiarity / discipline
scenarios showing workflow related to organization and pre-planning
Reflect in this section the realities of metadata in the broad 'usability'
sense
preventing metadata entropy
recovery / cleanup / maintenance mechanisms
Tools
SwapNeat Metadata Studio collects and presents several useful routines as
'tools' that can be applied to subsets of the collection
These generally useful operations are split into 2 groups, batch tools and
interactive tools
Batch tools can be applied to all the items in the database, to a search
result, or
just to selected files in the Thumbstrip, which could in turn have been
populated
by a search
The whack interface is an example of an interactive tool
Integrated tools exist for making presentations such as slideshows, the
mainstay
of photo organizer software
The result of a search can also be packaged up into an XML description for
import into third party tools that can be used to create presentations or
other renderings of the relevant files. By packaging the metadata in XML
format, the task of the post-processing application is simplified and
encapsulated.
The presentation could even be computed by a remote item, into a batch file
that would gather the appropriate files on the user's machine and process them
when run at a later time.
batch tools
DTD generation for standard XML-parser validation
disambiguation
resize files for keeping a smaller one and archiving the full resolution ones
web page maker
split files into chunks for backup
digital signatures on files
interactive tools
filing discovered keywords
sharing
SwapNeat Metadata Studio facilitates sharing of user-created information,
including information that configures and customizes the metadata studio
itself.
This shared information can assist in the maintenance of standard metadata
libraries and best practices.
qualifier namespaces
sub namespaces
personal vocabularies
general vocabularies
incident descriptions for events like hockey
description of industry standard items
• HVAC documentation
music
Although many of the examples in this preferred embodiment refer to photos and
their metadata, music files also have metadata and can be processed in similar
ways.
Music files have a different kind of metadata, in that most of it is parameter
based, and there's usually a 'right' answer, as determined by reference to
original materials available when the music recording was made.
Following is a list of the music-specific aspects of the SwapNeat Metadata
Studio.
technical details
storage formats
the specification for recognizing SNMD in a file is here
database items
a list of information stored in the database is here
The information is stored to increase the speed of searches.
No information about files is stored that could not in principle be
re-acquired by examination of the data file, with the exception of the file's
location and the list of folders that contain files of interest.
algorithms STEPHEN
to compute the digital signature of a music file for use in the m2g algorithm
The m2g signature is a 64 bit number and a duration in seconds, computed from
the contents of an audio track. It's designed to be independent of the
metadata in
a music file, but dependent upon the duration and content of the file.
Although it's possible for 2 audio tracks to have the same duration and begin
with
the same audio samples, it's unlikely this will occur in practice since the
MPG
recompression will likely result in slightly different sample values after an
editing
operation which combines several music intervals or replaces audio data part
way through. Since the purpose of this algorithm is to aid unambiguous
distinction between common commercial audio tracks, it's unlikely that
collisions
due to the above reasons will occur.
to compute a signature for an audio file
minimum requirements for a music signature
The purpose of a music signature as defined for metadata purposes is to ensure
that a given rendition of a song can be identified when an exact copy of it
occurs
elsewhere. This allows existing library metadata in a database to be applied
to
the new instance of the song.
The first requirement for a music signature is that it provide a compact
signature
value which does not change if the metadata in the file is changed.
Since a given MP3 may have a certain sample rate, and a certain compression
ratio, and be a rendering of a certain original digital music composition,
from a
certain CD, this information will be relevant to the metadata.
Since changing any one of these could technically change the content or
quality
of the music, it is not appropriate in all cases to assume that metadata
should be
equivalent for the different instances.
For this reason, the second requirement of a signature is for the signature to
distinguish between different sample rates, compression, duration, and
original
CD source. Effectively, anything that changes the sound samples should be
considered a different signature.
The third requirement is that the value be different for different music
compositions. It must produce a sufficiently large range of signatures so that
confusion among unrelated audio track signatures, and renditions of audio
tracks, is minimized.
Since there are about 10^7 different audio tracks and perhaps 10^3 different
ways that they have been rendered, it's important that the signature algorithm
produce enough bits to approximately deal with the square of this number. That
reduces the aggregate probability that 2 unrelated files will ever share the
same signature to a very small amount, under 50%.
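The "square of this number" reasoning can be checked with the standard
birthday approximation. This is a worked sketch, assuming 10^7 tracks times
10^3 renditions gives N = 10^10 distinct files and that signatures are
spread uniformly over a space of N^2 values. The function name is
hypothetical.

```python
import math

def collision_probability(n_items, space_size):
    """Birthday approximation: chance that any two of n_items share a value."""
    return 1.0 - math.exp(-n_items * (n_items - 1) / (2.0 * space_size))

n = 10**7 * 10**3                      # tracks times renditions, per the text
p = collision_probability(n, n**2)     # signature space sized as N squared
bits = math.ceil(math.log2(n**2))      # bits needed to cover that space
```

Under these assumptions p is about 0.39, consistent with the "under 50%"
figure, and a space of N^2 values corresponds to roughly 67 bits. Whether
the 64-bit value combined with the separately stored duration reaches that
range is not stated in the text.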
Much effort in the patent literature has been put into ensuring that the
signature
is invariant with respect to sample rate, noise, and even the sampling
position in
the music. This has commercial impact on music licensing. The SwapNeat
signature is not designed to address these issues and does not need to
consider
them.
music signature algorithm
In consideration of the 3 requirements, a signature which meets the
requirements
is computed as follows:
1. use a standard reference decompression routine to create a
decompressed set of samples, starting at the beginning of the audio track.
Decompress to a standard sample rate, 44100 samples per second. Each
sample will be 16 bits stereo. The number of bytes needed to express 512
samples is 2048 bytes. In the case of stereo, the 2 tracks are added (in a
way that increases the number of bits sufficiently to prevent overflow),
before computation begins. The resulting Mono samples are then grouped
and manipulated.
2. process the data sequentially, considering it in non-overlapping sequential
chunks of 512 samples. Detect the first 512-sample chunk that exceeds
the silence threshold.
3. (optionally) process an additional number of 512-sample blocks to help
reject noise and to get to a more distinctive part of the audio track. Select
the block that will be used for signature.
4. compute a SHA-1 signature on a subset of the sample data from the audio
track block.
5. truncate the SHA-1 checksum to contain the least significant 64 bits, and
use that as a signature.
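The five steps above can be sketched as follows, taking already decompressed
16-bit stereo samples as input. Several details are assumptions rather than
part of the description: the silence threshold value and its definition as a
peak amplitude, hashing the whole selected chunk rather than a subset,
packing the summed samples as 32-bit little-endian integers, and the
function name.

```python
import hashlib
import struct

CHUNK = 512                # samples per chunk, as in step 2
SILENCE_THRESHOLD = 256    # hypothetical peak amplitude counted as non-silent

def audio_signature(stereo_samples, duration_seconds, skip_chunks=0):
    """Return (64-bit value, duration) or None if no non-silent chunk exists.

    stereo_samples is a list of (left, right) 16-bit integer pairs.
    """
    # Step 1: add the two channels; Python integers cannot overflow.
    mono = [left + right for left, right in stereo_samples]
    # Step 2: non-overlapping chunks, then find the first non-silent one.
    chunks = [mono[i:i + CHUNK]
              for i in range(0, len(mono) - CHUNK + 1, CHUNK)]
    first = next((i for i, chunk in enumerate(chunks)
                  if max(abs(s) for s in chunk) > SILENCE_THRESHOLD), None)
    if first is None:
        return None
    # Step 3 (optional): move further into the track, staying in range.
    index = min(first + skip_chunks, len(chunks) - 1)
    # Step 4: SHA-1 over the selected chunk.
    data = struct.pack("<%di" % CHUNK, *chunks[index])
    digest = hashlib.sha1(data).digest()
    # Step 5: keep the least significant 64 bits of the 160-bit digest.
    value = int.from_bytes(digest, "big") & 0xFFFFFFFFFFFFFFFF
    return value, duration_seconds
```

Because only decoded samples are hashed, rewriting the embedded metadata in
the file leaves the result unchanged, which is the first requirement stated
above.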
Note that the above algorithm can break down if only one chunk contains any
non-silent audio. In this case, the first chunk exceeding the silence
threshold is
used, and the duration of the entire music file also contributes to the
signature as
usual.
In the case of 'ogg vorbis' compression, the actual number of uncompressed
samples in the music file is present and available in the file header. This
can be
used to distinguish accurately between many different CD tracks, and help
compute a signature that would be invariant across the same CD track encoded
in ogg vorbis at different bit rates and (to a lesser extent) sample rates...
however
the 'ogg vorbis' format is not widely used, and any signature algorithm has to
be
able to handle the existing library of MP3 files.
In the case of ogg vorbis compression being widely used, since each song has a
different random number of samples near 3 minutes, there are about 1 million
different sample counts. Since the total number of compositions is under 100
million, it's possible that all that would be needed to recognize the same
track
again would be a hint at the metadata and the exact number of samples used in
its encoding.
Although this reliance on sample count would result in a signature that could
be
computed much more quickly, the applicability and resilience of the signature
would be less, so it's not considered viable.
Once the signature has been computed, a given piece of music can be
recognized when it occurs again, independent of embedded metadata in the
music file.
An M2G graphic takes all the audio in a file into account, and generates a
deterministic graphic image from it. In addition to its value as a navigation
tool, it
also supports a recognition capability. It considers the audio content, and
the
length. It is less dependent on the compression style and bit-rate in the
music,
since these factors are controlled for when the music is converted to a
standard
uncompressed WAV file before analysis. As a result, different renditions of a
music track, compressed by different sources, and with differing amounts of
embedded noise, will all produce a substantially similar M2G graphic image.
These can be used as a kind of visual comparison to consider equivalence of
two
renditions. This process will aid the manual equivalence determination of two
renditions. Computerized image recognition software may also be applied to
determine substantial equivalence.
Since the M2G samples nine frequency bands in each sample, it actually
produces a 9-byte value for each sample in the music. These technically carry
sufficient information that a single pixel triplet could specify the
composition.
However, considering the sensitivity of the algorithm to the exact values from
the
audio decompressor, and the fact that lossy compression is commonly used on
the M2G graphic images, it is not likely that exact matches would be found for
different renditions (audio compressed by different tools, at different sample
rates, etc), so this method is not sufficient to use in recognizing an excerpt
from a
music file, where the start alignment and scaling factors are not known. Even
if
the data were stored without compression loss, the difficulty of cataloging
and
searching several trillion 9-byte signatures would be prohibitive, even if
alignment
to the original samples, and access to scaling factors for the entire piece
were
somehow overcome.
to compute a M2G
M2G stands for 'Music to Graphics' and is an image based on an interval of
audio data.
To create an M2G image, the contents of the entire available track are
considered. This preferred embodiment will be for the case of a music file,
but can be generalized to cover any audio sample.
The M2G algorithm accepts a number of parameters:
• The desired M2G graphic size in rows and columns, where each minimal
  element of the image will be four pixels high and one pixel wide.
• The size in bytes of the Fast Fourier Transform (FFT) sample intervals to
  be used. 512 bytes is the default size and seems to work well.
• The JPEG compression ratio desired. This determines the resulting image
  detail and size.
• A list of frequency interval indices (specified as edges of frequency
  bands, in units of 44100/512 Hz) to be used in the computation of
  energy content per interval.
The list of interval boundary indices used in the standard M2G algorithm has
been arbitrarily set to 0, 3, 6, 10, 30, 45, 90, 120, 200. A three-element
cross-fade is computed for energy content at interval edges to minimize the
effect of music that has energy in frequencies that straddle the edge of a
band.
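The boundary indices can be converted to frequencies and used to sum energy
per band as sketched below. Two points are assumptions: the final band is
taken to run from index 200 up to the Nyquist index 256, giving nine bands
in total, and the three-element cross-fade is left out because its weights
are not given here. The function names are hypothetical.

```python
SAMPLE_RATE = 44100
FFT_SIZE = 512
BOUNDARY_INDICES = [0, 3, 6, 10, 30, 45, 90, 120, 200]

def boundaries_in_hz(indices, sample_rate=SAMPLE_RATE, fft_size=FFT_SIZE):
    """Convert FFT bin indices to Hz, one index being 44100/512 Hz."""
    unit = sample_rate / fft_size
    return [index * unit for index in indices]

def band_energy(magnitudes, indices=BOUNDARY_INDICES, fft_size=FFT_SIZE):
    """Sum squared magnitudes per band, with no cross-fade at the edges.

    magnitudes holds one value per FFT bin from 0 up to fft_size // 2.
    Assumption: the last band extends to the Nyquist index.
    """
    edges = list(indices) + [fft_size // 2]
    return [sum(m * m for m in magnitudes[low:high])
            for low, high in zip(edges, edges[1:])]

edges_hz = boundaries_in_hz(BOUNDARY_INDICES)
flat = band_energy([1.0] * 256)
```

With these values one index is about 86.13 Hz, so the highest listed
boundary, index 200, lies at about 17.2 kHz.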
The product of the number of rows and columns gives the total number of M2G
elements that will be generated and displayed. In general, this will be less
than
the number of 512-byte sample time intervals in the original music sample. If
the
number of 512-byte sample time intervals in the original music file is
insufficient
to provide at least one interval per element of the M2G, then elements are
duplicated such that the duplicates are spread evenly throughout the image, in
a
manner similar to the program that draws a diagonal line on a graphics screen,
called Bresenham's Algorithm. This algorithm allows an efficient computation
of
rational interpolation without use of floating point math or integer
divisions.
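The even spreading of duplicated elements can be sketched with an integer
error accumulator in the style of Bresenham's line algorithm. The function
name is hypothetical, and the sketch uses only addition, subtraction and
comparison inside the loop.

```python
def spread_indices(source_count, element_count):
    """Choose a source interval index for each output element.

    When element_count exceeds source_count, indices repeat so that the
    duplicates are distributed evenly. No division or floating point is used.
    """
    indices = []
    source = 0
    error = 0
    for _ in range(element_count):
        indices.append(source)
        error += source_count
        while error >= element_count:
            error -= element_count
            source += 1
    return indices

stretched = spread_indices(3, 6)
uneven = spread_indices(2, 5)
reduced = spread_indices(6, 3)
```

The same loop also handles the more common case where there are more source
intervals than elements, as the last call shows.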
The M2G can be created with a fixed time per row, perhaps rounded to the
nearest higher second or quarter-second, to simplify the computation of the
visual intervals. In that case there will probably be an interval of black
after the end of the music, on the last row. Alternatively, it can be computed
such that the rows are all full from the start of the first row to the end of
the last row, in which case there is likely a non-integer number of seconds
per row.
In the case of a completely full M2G, other renditions of the M2G will look
substantially similar, even if they have been compressed in time or encoded at
a
different sample rate. They will also look similar in a way that is tolerant
of the
amount of JPEG compression used, just as other images degrade gracefully as
JPEG compression ratios are increased.
If not already encoded as uncompressed audio 16-bit stereo (or mono) samples,
the audio is first converted into 16-bit digital samples, using a reference
program
that converts MPG data into audio samples.
Step 0
Input is a music file, and
a list of B-1 inter-bin boundary positions, related to frequencies in an FFT
(where B is 3 times the number of information-bearing pixels to show per
time unit), and
the required number of rows R and columns C in the resulting graphic.
B is the number of bins per group. B is normally 9.
Each music sample will be represented by (B/3)+1 pixels in a vertical column
within the cell. In the preferred case, this is a 1 wide by 4 high
arrangement. One of the pixels in the group is always set to a constant
colour. To help reduce JPEG colour bleed, the preferred colour is black,
but white or any other colour can also be used.
The actual dimensions of the resulting pixel array will be R*((B/3)+1) high
by C pixels wide.
Step 1
Convert the input music into a list of paired stereo samples, 16 bits per
sample.
If the input source is in stereo, combine left and right channels to create a
mono
stream,
by simple averaging, for instance.
The process continues with a mono stream of 16 bit samples.
Start the sample starting position at the very beginning of the stream of
samples.
Step 2
For sample starting positions that increase by 256
Step 3
Copy the 512 samples starting at the starting position and ending 511 samples
later.
If insufficient data remains, then skip out of the loop
Step 4
In accordance with best practices for computing Fourier transforms on audio
data, apply a windowing function to the buffer so that the discontinuity at
the start and end of the buffer is minimized. A Hamming window can be used.
Compute an FFT on the copied buffer.
Step 5
Apply a squaring function to the data, in order to get a list of positive
numbers
representing
the energy at each frequency.
Bin the data into 3*P bins (9 bins when P, the number of information-bearing
pixels, is 3) according to the bin boundaries supplied as arguments to the
routine. At each bin boundary, implement a 25%, 50%, 75% crossover blending
of the energy numbers.
Step 6
Advance the start position by half the size of a sample group... 256 samples.
This is consistent with best practices in FFT analysis and allows the
windowing
function to consider all the waveform data symmetrically.
Repeat until all the music data has been processed into bins:
Return to step 3 until all the input waveform has been processed.
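Steps 2 to 6 can be sketched in pure Python as follows. This is a minimal illustration, not the reference program: the names fft, frame_energy, band_energies and analyse are introduced here, and the exact form of the cross-fade (the FADE table, which gives the bin just below, at, and just above a boundary a 25%, 50% and 75% share in the upper band) is an assumption, since the description only gives the three percentages.

```python
import cmath
import math

FRAME = 512   # samples per FFT frame (Step 3)
HOP = 256     # the start position advances by half a frame (Steps 2 and 6)
# Interior band boundaries in FFT-bin units; 0 and FRAME // 2 are the implied outer edges.
BOUNDARIES = [3, 6, 10, 30, 45, 90, 120, 200]
# Assumed cross-fade: share of a bin's energy given to the band ABOVE a
# boundary, keyed by (bin index - boundary index).
FADE = {-1: 0.25, 0: 0.50, 1: 0.75}


def fft(values):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def frame_energy(frame):
    """Apply a Hamming window, transform, and square (Steps 4 and 5)."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(c) ** 2 for c in fft(windowed)[: n // 2]]


def band_energies(energy, boundaries=BOUNDARIES):
    """Sum per-bin energy into len(boundaries) + 1 bands with a cross-fade."""
    bands = [0.0] * (len(boundaries) + 1)
    for k, e in enumerate(energy):
        for lower, edge in enumerate(boundaries):
            if k - edge in FADE:               # bin lies in a cross-fade zone
                bands[lower + 1] += e * FADE[k - edge]
                bands[lower] += e * (1.0 - FADE[k - edge])
                break
        else:                                  # bin belongs to one band only
            bands[sum(1 for edge in boundaries if k >= edge)] += e
    return bands


def analyse(samples):
    """Return one list of band energies per half-overlapping frame."""
    groups = []
    start = 0
    while start + FRAME <= len(samples):       # skip out when data runs short
        groups.append(band_energies(frame_energy(samples[start:start + FRAME])))
        start += HOP
    return groups
```

The cross-fade shares at each boundary sum to 100%, so the total energy of a frame is preserved by the binning.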
Step 7
The dimension of the bins will be an array R*C by B
Search through the list of binned data for each sample group for the maximum
for each bin
The maximum bin 1 value seen in all the (R*C) groups becomes a scale divisor
for normalization.
Likewise for the other B-1 bins.
Step 8
Normalize all the binned data so that it ranges between 0 and 1 by dividing
each
bin value by the global
maximum for that bin
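Steps 7 and 8 amount to a per-bin normalization, which might look like the sketch below. The function name normalize_bands is an assumption introduced for illustration, as is the choice to map an all-zero bin to 0.

```python
def normalize_bands(groups):
    """Scale each bin to the range 0..1 by its maximum over all sample groups.

    `groups` is a list of sample groups, each a list of B bin energies.
    """
    if not groups:
        return []
    n_bins = len(groups[0])
    # Step 7: one scale divisor per bin, taken over every sample group.
    maxima = [max(group[b] for group in groups) for b in range(n_bins)]
    # Step 8: divide each value by the global maximum for its bin.
    return [
        [group[b] / maxima[b] if maxima[b] > 0 else 0.0 for b in range(n_bins)]
        for group in groups
    ]
```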
Step 9
Compare R*C with the number of sample groups that are available in the list.
R*C is the number of samples that will be represented graphically in the M2G
image.
Step 10
If more places (R*C) are available than sample groups, expand the sample group
list by uniformly distributed duplications until the numbers match.
If there are fewer places (R*C) than sample groups, then the binned data of
neighboring sample groups will be averaged until the number of remaining
sample groups matches the number of places to draw pixels.
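One way to sketch Step 10 is a single routine that either duplicates or averages, depending on whether there are too few or too many sample groups. The name fit_groups is hypothetical, and integer division is used here for brevity rather than the division-free accumulator mentioned earlier.

```python
def fit_groups(groups, places):
    """Return exactly `places` sample groups.

    When there are fewer groups than places, groups are duplicated evenly;
    when there are more, neighboring groups are averaged bin by bin.
    """
    n = len(groups)
    if n == 0 or places <= 0:
        raise ValueError("need at least one group and one place")
    fitted = []
    for p in range(places):
        lo = p * n // places
        hi = max(lo + 1, (p + 1) * n // places)   # always take at least one group
        chunk = groups[lo:hi]
        fitted.append([sum(column) / len(chunk) for column in zip(*chunk)])
    return fitted
```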
Step 11
For all the places (R*C), compute the 3-pixel sample representation as
follows.
The bottom pixel in the row will be black. The next pixel up holds bins
1, 4, 7 as R, G, B, appropriately scaled so the maximum is 255.
The next pixel up will hold bins 2, 5, 8 as R, G, B respectively.
The top pixel will hold bins 3, 6, 9 as R, G, B respectively.
Thus if the music has predominately low frequency energy, it will appear as
red
pixels.
Mid-range energy domination will result in greenish pixels,
and High frequency energy dominance will result in bluish pixels.
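The mapping in Step 11 from nine normalized bins to a four-pixel column can be sketched as below. The function name element_pixels and the clamping of inputs to 0..1 are assumptions made for this illustration.

```python
def element_pixels(bins, filler=(0, 0, 0)):
    """Return the four (R, G, B) pixels of one M2G element, bottom to top.

    `bins` holds nine values in 0..1, lowest frequency first. `filler` is the
    constant-colour pixel (black in the preferred case).
    """
    if len(bins) != 9:
        raise ValueError("expected 9 bins")
    scaled = [round(255 * min(max(v, 0.0), 1.0)) for v in bins]
    pixels = [filler]
    for p in range(3):
        # Pixel p (counting up from the filler) holds bins p+1, p+4, p+7
        # (1-based) as its red, green and blue components.
        pixels.append((scaled[p], scaled[p + 3], scaled[p + 6]))
    return pixels
```

Energy confined to the lowest bin therefore shows up as pure red in the pixel just above the filler pixel, and energy confined to the highest bin as pure blue in the top pixel.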
Step 12
Return the resulting bitmap in a buffer for further processing,
such as conversion to a jpg image, to the calling program.
The steps are:
1. Divide the audio into 512-byte sample time intervals, aligned with the
start
of the music, non-overlapping.
2. Compute an FFT on each 512-byte sample section separately, using a
Hamming window. Since the nominal sample rate is 44100 samples per
second, and the number of samples in an FFT section is 512, the
frequencies are in units of 44100/512 Hz.
3. Compute the square of the FFT values to convert from frequency content
to energy content. This eliminates the effect of any components which
have negative magnitudes due to relative phase in the interval.
4. Consider the bands from lowest to highest frequency, and apply energy
band analysis to the resulting FFTs to determine the energy content in
each frequency band. Use a three element cross fade at band boundaries,
with 25%, 50% and 75% weightings to help desensitize the algorithm to
exact frequency content, since a frequency at the band boundary will
contribute equally to the band below and above it.
5. Compute the per-band global maximum values for each band and use the
per-band maximum values to normalize the data so that each band's
energy level values range from some minimum, not necessarily 0, to the
maximum, scaled to be 255. For instance, if the energy computation
resulted in numbers ranging from 2300.0 to 1000000.0 then the scale
factor of 255.0/1000000.0 would be multiplied by each energy value for all
the sample time intervals, in that band.
6. Use the normalized values to compute the RGB values for pixels, and
map the nine available values, one from each band to three independent
RGB pixels. Use the three lowest frequency values for red (R), ranging
from the bottom to the top of the three pixels in the M2G element, then
green (G) then blue (B).
7. Assemble the pixel RGB triples into vertical columns (lowest pixel holding
the lowest frequency band color in its red component), with a fourth
always-white pixel placed at the top. When the elements are placed
adjacent one another the white pixels create a horizontal stripe. Assemble
several stripes, one per row, into the entire graphic. For each pixel bottom
to top in a column of three, use red first, then green then blue. That is, if
the nine bands are numbered B1 to B9, then the RGB RGB RGB will be
B1B4B7 B2B5B8 B3B6B9. Thus something with a lot of low frequency
content will be more red.
8. Interpolate linearly using a version of Bresenham's algorithm to map the
oversupply (or undersupply) of M2G element data to the available M2G
elements in the graphic. Whenever more than one 512-byte sample
interval's data has to be combined to create a single M2G element,
compute arithmetic means of each pixel color, and use the resulting mean
for the RGB RGB RGB to show in the element position of the M2G.
9. Compress the M2G graphic pixels using JPEG or PNG to create an
image.
10. Apply SNMD to the resulting JPEG or PNG image. Use metadata from the
original audio track, as well as any additional alternative (possibly
misspelled) metadata, titles, and other choices that are available.
Thus the M2G will be a standalone file containing not just a visual
rendition of the audio content, but also a library of the true, and
potentially true but possibly mistaken, metadata choices for the music file.
11. Name the file after its music signature, described elsewhere, so that it
can be stored and retrieved from a server according to the assigned
signature-based name. Since the music signature has 16 hex digits, the name
is, in general,
d15d14/d13d12/d11d10/d9d8d7d6d5d4d3d2d1d0.m2g where d15 .. d0 are the
digits of the music signature, MSB first.
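The naming scheme in the final step can be sketched as a small helper. The function name m2g_path is hypothetical, and the signature is assumed to arrive as a 16-character hex string with the most significant digit first.

```python
import string


def m2g_path(signature):
    """Build the signature-based file name d15d14/d13d12/d11d10/d9...d0.m2g."""
    if len(signature) != 16 or any(c not in string.hexdigits for c in signature):
        raise ValueError("expected 16 hex digits")
    # Three two-digit directory levels, then the remaining ten digits.
    parts = [signature[0:2], signature[2:4], signature[4:6], signature[6:]]
    return "/".join(parts) + ".m2g"
```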
to filter some XML in order to apply a digital signature algorithm on it
In order to produce a checksum that is more immune to XML formatting issues,
the XML is first filtered to replace all occurrences of required white space
with a
single 'space' character. In addition, all snncs: attributes are removed.
The resulting compacted and standardized XML is then fed to a SHA1 generator
to compute a SHA1 checksum.
The embedded signature, if any, is not included in the checksum, mainly
because it will be a snncs:authenticator attribute of the sn:namespace node or
other top-level node to which the checksum applies.
At file creation time, it is ensured that SNTV files have the ordering of the
element names strictly alphabetical, within each parent element. The
user-specified order of the nodes is stored in snncs:index attributes of the
elements.
snncs:index attributes are not considered when the checksum is computed.
This allows trivial rearrangements of an SNTV file, and reformatting of it for
readability to not affect its checksum.
A further idea would be to allow the creation of 'pool' elements which do not
affect the checksum. Since pool elements do not affect the embedded XML, but
only the presentation, it would allow a user to rearrange the information in a
sntv
file without changing its checksum.
However this would be computationally expensive since the elements within the
pools would no longer be in global alphabetical order, so this feature is not
yet
implemented. To implement it, it would be best to apply pool data in a
separate
overlay section, which is not subject to the checksum process. A parser able
to
preserve checksums in this case would have to first re-arrange the xml to have
no pools, and to have all sibling elements arranged in alphabetical order.
This
would defeat the purpose of checksums, since one goal is to allow protection
of
the structure and the friendly names of the node tree. If some elements are
not
considered for the checksum, spam could be inserted into those non-considered
elements. Thus it is best to cause the checksum to consider almost all the
data,
other than trivial rearrangements.
The XML checksum maker is currently an integral part of the XML parser. This
is
so that the parts that get a checksum can be selected from the XML data
stream.
Technically a more separate checksum process could be implemented,
according to the above description, where a filtering process is applied to
the
XML to create a new XML stream, and the result is simply passed to a SHA1
program. Since snncs: attributes are removed, the parser that compactifies the
XML will have to be able to detect them no matter what URI they are associated
with.
{INSERT pseudocode for XMLchecksum filter HERE}
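As a rough sketch of the filtering idea described above (and not the missing pseudocode itself), the fragment below removes snncs: attributes, normalizes whitespace, and feeds the result to SHA1. It makes several simplifying assumptions: attributes are matched by their literal prefix with a regular expression, whereas the description requires detecting them by namespace URI; whitespace between tags is dropped entirely; and the names filter_xml and xml_checksum are introduced here.

```python
import hashlib
import re


def filter_xml(xml_text, prefix="snncs"):
    """Return a compacted form of xml_text for checksum purposes."""
    # Remove attributes such as snncs:index="3" (prefix match only).
    attr = r"\s+" + re.escape(prefix) + r":[\w.-]+\s*=\s*(?:\"[^\"]*\"|'[^']*')"
    text = re.sub(attr, "", xml_text)
    # Assumption: whitespace between adjacent tags is formatting only.
    text = re.sub(r">\s+<", "><", text)
    # Replace every remaining run of whitespace with a single space.
    return re.sub(r"\s+", " ", text).strip()


def xml_checksum(xml_text):
    """SHA1 hex digest of the filtered XML."""
    return hashlib.sha1(filter_xml(xml_text).encode("utf-8")).hexdigest()
```

Two differently formatted copies of the same data, one of them carrying an snncs:index attribute, then produce the same checksum.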
to compute a checksum for XML
The XML checksum is a SHA1 sum over the XML character data in standard
form. All whitespace is normalized and snncs: attributes are removed according
to the XML filtering algorithm also included in this Preferred Embodiment.
The SHA1 checksum of the character data is then combined with the publisher
ID and sent to the server. The server encrypts the data with its private key,
provided that the logged-in username matches the requested publisher name in
the string.
{INSERT pseudocode for XML checksum maker HERE}
to create a digital signature
After an XML checksum for a string of character data has been computed, the
checksum is combined with the publisher ID to create a
plain_authenticator. The plain_authenticator string is sent to the SwapNeat
digital signature server. The server verifies that the logged-in username
requesting the
signing operation matches the publisher name in the request; if not, the
request
to create an authenticator is denied. Assuming the request is authenticated,
the
server creates a string containing the SHA1 checksum, the publisher ID and a
timestamp based upon the GMT time of signing at the server, and encrypts the
string with its PKI private key.
The resulting encrypted digital signature data (128 bytes) is returned to the
client computer and used as an authenticator in the file to be signed. To
embed the signature, the binary data (128 bytes) is first converted to base64
notation, using + and / as the 0x3e and 0x3f character codes, and embedded as
an snncs:authenticator attribute within the sn:namespace element or the
sn:lang_namespace element. In general, a signature attribute is embedded into
the first or second element of the supplied XML; if the first element is just
<xml...> , the second element is used.
This digital signature algorithm can be used to create an authenticator for
any
XML data. The authenticator certifies that the logged-in user was in
possession
of the XML data at the indicated date and time. Since the XML filtering
process is
applied first, the resulting XML can be reformatted without invalidating its
signature.
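The embedding step can be illustrated with Python's standard base64 module, whose default alphabet uses + and / for the values 0x3e and 0x3f. This is only a sketch of the encoding; the helper name authenticator_attribute is an assumption, and the server-side encryption is not shown.

```python
import base64


def authenticator_attribute(signature_bytes):
    """Format 128 bytes of signature data as an snncs:authenticator attribute."""
    if len(signature_bytes) != 128:
        raise ValueError("expected 128 bytes of signature data")
    encoded = base64.b64encode(signature_bytes).decode("ascii")
    return 'snncs:authenticator="' + encoded + '"'
```

A 128-byte signature always encodes to 172 base64 characters, including padding.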
{INSERT pseudocode for digital signature maker HERE}
to verify a digital signature
There are 2 kinds of digital signatures that can be applied by the studio to
files.
One kind is the local authenticator. It uses the machine ID, as determined by
the
routine that licenses SwapNeat, to create an authenticator. The local
authenticator is crafted entirely by software running on the user's machine.
These authenticators can be forged, because they do not use SwapNeat's
digital signature server, but they are resistant to casual errors.
A local authenticator will be accepted by any other instance of SwapNeat on
the same machine (having the same hardware-dependent machine signature), or by
the named user (username in the authenticator and/or vocabulary URI) on any
other machine, provided that the user is logged in on the other machine. The
named user means the user whose name is integral to the URI in the
authenticated item, in the case where the item is an SNTV file. In the case
where
the item is a tag vocabulary, search or other item, the authenticator will be
accepted if there is a match between the logged in user and the name in the
authenticator. In this case, there is a potential for forgery if the machine
ID in the
authenticator is not also checked by the user.
Because searches and tag vocabularies do not, by themselves, make permanent
changes to the database, it's not as important to verify their authenticity.
The purpose of this local authenticator is that the content is deemed to be
from
the user. The user can share it among other users on the same machine without
external authentication. If the user wants it to be generally distributed,
then a
secure authenticator must be created for the item. The exception is that he
can
take a locally signed file and read it in to a system where he is logged on.
There's
no loss of security because he could just as easily create the content anew on
the other machine, and then sign it. However a locally signed forgery might be
crafted which would enable him to input harmful information.
It is not clear how to counteract this threat.
Since the Sheriff machine IDs are not distributed with the files, it is
difficult for an impostor to craft a targeted file to send to the studio
that will be accepted by it.
He would have to have access to the sheriff signature of the machine, which
requires electronic access to the machine itself. Once this is achieved, any
form
of hacking would be possible, including using the GUI to sign things locally,
so
there's no loss of security there either.
{INSERT pseudocode for digital signature checker HERE}
to compute a sub-episode
An episode is loosely defined as a clump of photos in time. The concept of
episodes is most useful when attempting to present a rendering of the photos
to
the user in a manner that corresponds to the time line in which they were
taken.
Consider a simple example consisting of 6 photos taken at times (in minutes
after noon on a certain day) of 1, 10, 11, 12, 13, and 15.
If the photos taken at 10-13 are considered, then there are 4 photos in 3
minutes, with no photos in the preceding 3 minutes and 1 photo in the
following 3 minutes. If the episode parameters are set to time-minifier=1 and
significance minifier=4 then the episode will be considered valid.
The rigorous definition is that an episode is a group of photos, at least
episode-min in count, having a total duration from first to last photo
(called episode_duration) such that the interval of size
episode_duration * interval_minifier has fewer photos than
significance_minifier * num_photos_in_episode.
This has to apply to the interval before and after the episode interval.
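A literal reading of the rigorous definition can be sketched as follows. The function name is_valid_episode and the default for episode_min are assumptions, and the code takes the wording at face value: the neighboring intervals must each hold fewer photos than significance_minifier times the episode's photo count. If the parameter is in practice meant as a fraction, the same comparison applies with a value below 1.

```python
def is_valid_episode(times, first, last, episode_min=2,
                     interval_minifier=1, significance_minifier=4):
    """Check the episode definition for photos taken between first and last."""
    inside = [t for t in times if first <= t <= last]
    if len(inside) < episode_min:
        return False
    # Size of the comparison interval on each side of the episode.
    span = (last - first) * interval_minifier
    before = sum(1 for t in times if first - span <= t < first)
    after = sum(1 for t in times if last < t <= last + span)
    limit = significance_minifier * len(inside)
    return before < limit and after < limit
```

For the example above (photos at 1, 10, 11, 12, 13 and 15, episode from 10 to 13), the preceding interval holds no photos and the following interval holds one, so the episode is accepted with the stated parameters.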
In the case of a sub-episode, only photos within the boundaries of the current
episode are considered, not those in regular time in other episodes beyond the
boundaries.
Further, in order to compute sub-episodes, the average photo rate is computed,
and photos are selectively eliminated from consideration, up to one per
average
interval. Where no photo is present in an average interval, none is
eliminated,
and where 2 or more are present in an interval, only one is eliminated. The
resulting distribution of times is then considered for sub-episodes. This
allows
bursts of picture-taking during an otherwise uniformly busy photo session to
be
rendered as sub-episodes.
Once the interval of significance for a sub-episode is determined, the above-
eliminated photos are restored before the sub-episode is populated, so it
contains all the photos in the interval; not just those that remained after
the
elimination step.
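The elimination step might be sketched as below. Several details are assumptions made for illustration: the average interval is taken as the total duration divided by the number of gaps between photos, the intervals are aligned to the first photo, and the photo eliminated from each occupied interval is the earliest one. The function name thin_for_sub_episodes is hypothetical.

```python
def thin_for_sub_episodes(times):
    """Return the photo times that remain after removing one per average interval."""
    ordered = sorted(times)
    if len(ordered) < 2 or ordered[-1] == ordered[0]:
        return []
    span = ordered[-1] - ordered[0]
    gaps = len(ordered) - 1            # span / gaps is the average interval
    seen_slots = set()
    remaining = []
    for t in ordered:
        slot = (t - ordered[0]) * gaps // span   # index of the average interval
        if slot in seen_slots:
            remaining.append(t)        # interval already gave up one photo
        else:
            seen_slots.add(slot)       # eliminate the first photo in the interval
    return remaining
```

A uniformly paced session leaves nothing behind, while a burst inside an otherwise steady session leaves the extra photos of the burst as sub-episode candidates.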
{INSERT pseudocode for sub-episode maker HERE}
to decide how to display music dstring metadata
{INSERT pseudocode for dstring display decision HERE}
to combine new vote data with existing dstring data
The requirements for the vote information are...
There are several metadata destinations: title, year, etc. For each
destination, the percentages listed in the m2g information will add to 100%.
When new votes are received, the secret counts being maintained in the server
are augmented by the number of votes received for each destination, for each
string. The new numbers are saved in the server, and the m2gmd file is updated
to have new percentages.
In order that it be non-trivial to determine the number of votes that have
been received for a given destination in a given file (and so to determine how
many SwapNeat users might be in possession of that file), an initial count of
100 votes is used for any new item in the list. Once the total number of votes
exceeds 1000, the percentages are adjusted to reflect the actual number of
votes.
That way, a few votes won't sway the percentages too much, and an attacker
cannot simply supply a small number of votes and then re-inspect the
percentages after the next update to determine the effect that number of votes
had, and thereby the number of votes total in play.
Each vote will change the percentages by less than one percent. Also, the
update period for new percentages will be random, so that it won't be possible
to
see the effect of a vote immediately.
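One possible reading of the vote-combining rule is sketched below. It assumes that each candidate string is seeded with 100 votes to which real votes are added, and that the seed is dropped once the real total for the destination exceeds 1000. The constants and the function name displayed_percentages are introduced here, and the randomized update period is not modeled.

```python
SEED_VOTES = 100          # assumed starting count for every new string
REVEAL_THRESHOLD = 1000   # real-vote total above which actual counts are used


def displayed_percentages(real_counts):
    """Map each candidate string of one destination to its published percentage.

    `real_counts` holds the secret per-string vote counts kept on the server.
    """
    if not real_counts:
        return {}
    if sum(real_counts.values()) > REVEAL_THRESHOLD:
        shown = dict(real_counts)
    else:
        shown = {text: SEED_VOTES + n for text, n in real_counts.items()}
    total = sum(shown.values())
    return {text: 100.0 * n / total for text, n in shown.items()}
```

Under this reading a single vote moves a seeded percentage by well under one percent, and the published values for a destination always add to 100%.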
{INSERT pseudocode for vote data combining HERE}
to determine the next tab to visit in a sequence in the whack interface
The full tabs of the whack interface are visited in sequence. If any buttontab
is
selected on a tab, that buttontab will be inserted following the tab, and
visited
next.
If control is held when a buttontab is pressed, then a sub-tab for the
buttontab is
inserted beyond the tab, but the tab remains visible until the control key is
released. While the control key remains pressed, other buttontabs can be
pressed. Pressing other buttontabs while control remains held causes
additional
sub-tabs to get inserted after the first-pressed buttontab's sub-tab.
In the case where all the tabs and sub-tabs are always visible (a user
selectable
mode) then instead of inserting subtabs, the icon is changed and a bit is set
that
will allow the software to detect the next tab or sub-tab to visit, by
incrementing a
search pointer from the current tab. In the case of using the control key to
select
multiple buttontabs, this process commences when the control key is released.
Since the sub-tabs are already present, their order need not be shuffled. In
some
cases a user may come to expect a certain order and appearance of tabs, and
benefit from having them always in the same configuration.
Any given tab in the whack interface will contain some combination of zero or
more buttons and buttontabs. A button is a window inside the tab view that can
be selected. Each button is decorated with a literal string (a.k.a. tag) and
an icon
that indicates the button's selection state. A buttontab is a button that
represents
a shortcut to a tab that is currently not part of the displayed tab set. A
buttontab is
decorated with a literal string (a.k.a. tag) that describes the hidden tab and
an
icon that identifies the button as a buttontab.
The order that each tab, and optionally each buttontab, within the whack
interface is visited is dependent on the selection or selections made in
preceding
tabs.
The simple case is where there have been no buttontabs selected either on a
previous tab or on the current tab; in this case the next tab is the tab
directly
beside the current tab.
The next case to consider assumes that no buttontab has been selected on any
previous tab and one or more buttontabs are selected on this tab. When the
user
is finished with the current tab the whack interface will be adjusted by
inserting a
new tab directly following the current tab for each buttontab that was
selected.
The order of the inserted tabs is sometimes based upon the selection order and
sometimes on the order in which the buttontabs appear on the tab. If the
interface is in the mode where sub-tabs are not displayed unless selected, use
the selection order; if sub-tabs are set to be always displayed, just change
the icon and use the order in which the buttontabs appear on the tab or
sub-tab, which mirrors the order in which the sub-tabs will appear.
{INSERT pseudocode for choose next tab HERE}
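The ordering rule for the tabs that follow the current one can be sketched as below. This simplified illustration is not the missing pseudocode: it models tabs as plain strings, ignores the control-key handling, and returns the visiting sequence in both modes even though no tabs are physically inserted when sub-tabs are always visible. The function and parameter names are assumptions.

```python
def visit_order_after(tabs, current, selected, layout, always_visible=False):
    """Return the tab visiting sequence once the user leaves `current`.

    tabs      -- full tabs in display order
    selected  -- buttontabs pressed on the current tab, in selection order
    layout    -- all buttontabs on the current tab, in on-screen order
    """
    if always_visible:
        # Sub-tabs are already present: follow the on-screen order.
        ordered = [b for b in layout if b in selected]
    else:
        # Sub-tabs are inserted on demand: follow the selection order.
        ordered = list(selected)
    i = tabs.index(current)
    return tabs[: i + 1] + ordered + tabs[i + 1:]
```

The next tab to visit is then simply the entry that follows the current tab in the returned list.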
to determine what icon to put on a button
Buttons have optional icons which can convey meaning about what the button
will do if pressed, where it came from, and whether there is metadata in the
current file related to the button.
The priority for displaying this information is determined by a priority
list. The priority list is not secret, but is set by the programmers of the
metadata studio, and is not under user control at this time. Examples of the
important factors about a node are whether it is new, whether it makes a
sub-tab, whether it is the most recently created new button, and whether it
holds a parameter value.
This allows the most important attribute of the button to be displayed
prominently,
and avoids having to generate a very complex overlay system to show all the
attributes at once.
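A priority list of this kind reduces to picking the first applicable attribute. In the sketch below the attribute names and their order in ICON_PRIORITY are purely hypothetical, since the description does not give the actual list.

```python
# Hypothetical priority order, highest priority first.
ICON_PRIORITY = ["newest", "new", "makes_subtab", "has_parameter"]


def choose_icon(attributes, priority=ICON_PRIORITY):
    """Return the highest-priority attribute the button has, or None."""
    for name in priority:
        if name in attributes:
            return name
    return None
```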
{INSERT pseudocode for button icon computation HERE}
to make a searchable web page
{INSERT pseudocode for searchable webpage maker HERE}
to compute the numbers to display on search buttons, and on nodes in the
advanced search tree
The number displayed on a search button (the displayed delta) will be prefixed
by + or -. These numbers can also be displayed on nodes in the advanced
search tree. There's no provision for displayed deltas to be associated with
the simple search.
Some definitions: OR, AND, NOT correspond to the 3 modes of the search
interface. Depending on the mode, the selected button will become a condition
of the search expression:
OR (all search result files must contain at least one of the OR terms),
AND (all search result files must contain all of the AND keywords),
NOT (all search results contain none of the NOT keywords).
The condition is the total set of search criteria applied during the search.
The
condition needs to be separately and simultaneously satisfied for OR, AND, NOT
in order for the considered file to be part of the search result. (To be
satisfied for
NOT, none of the listed conditions for NOT can be present in the file
metadata)
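The combined condition can be sketched as a single predicate over a file's keywords. The function name matches is an assumption, as is the handling of an empty OR list: following the special case described later in this section, an empty OR list is treated as satisfied only when at least one AND term is selected.

```python
def matches(keywords, or_terms=(), and_terms=(), not_terms=()):
    """Return True if a file with these keywords belongs in the search result."""
    present = set(keywords)
    if or_terms:
        or_ok = any(term in present for term in or_terms)
    else:
        # Assumed: a dummy OR term exists only once an AND term is selected.
        or_ok = bool(and_terms)
    and_ok = all(term in present for term in and_terms)
    not_ok = not any(term in present for term in not_terms)
    return or_ok and and_ok and not_ok
```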
Because of the way equivalences are made, 2 or more equivalent nodes are
considered one node for purposes of computing search results. They must be
selected and deselected as a group, and the count applies to the group, not
just
the visible button.
A button that is pressed is 'in' the set of conditions for the search; the
count will
reflect the result of the button becoming 'not in' (see next definition). A
button that
is unpressed is 'not in'. The count on the button will reflect the result of
it being
pressed and becoming 'in'.
A given button can only be out, or in as OR, AND, NOT. It can never be in for
more than one of OR, AND, NOT at the same time. Also, it is not logical to
display
a count on the AND version of a button already present in an OR or NOT list
for
the condition.
OR count, AND count, NOT count. 3 separate counters assigned to each
keyword to keep track of the results of the search analysis process.
When a number will appear on a button/node
OR.
Not in yet.
A +count will appear if selecting the OR term will affect the number of search
results.
In already:
A -count will appear if deselecting the OR term will reduce the number of
search
results.
AND.
Not in yet.
A -count will appear if selecting the AND term will reduce the number of
search
results.
In already:
A +count will appear if deselecting the AND term will increase the number of
search results.
NOT.
Not in yet.
A -count will appear if selecting the NOT term will reduce the number of
search
results.
In already:
A +count will appear if deselecting the NOT term will increase the number of
search results.
How the number is computed: definition
If a change in the status of this keyword would result in a change in the
number
of search results, then the delta should be noted and displayed. The delta for
removing a keyword from the search is dependent on which mode the keyword
was in.
A delta for adding a keyword to the search must be computed for each of the
possible modes it could be added in, and the appropriate number is displayed
depending on the search mode. In the case of a whack search, where buttons
are displayed,
On a pressed OR button, a -number will indicate how many files will be removed
from the search result if the button is unpressed.
On an unpressed OR button, a + number will indicate how many more files will
be added to the search result if the button is pressed.
On an unpressed AND button, a -number will indicate how many files will be
removed from the search as a result of pressing the button.
On a pressed AND button, a + number will indicate how many more files will be
added to the search result as a result of unpressing the button.
On an unpressed NOT button, a -number will indicate how many files will be
removed from the search result if the button is pressed.
On a pressed NOT button, a +number will indicate how many files will be added
to the search result if the button is unpressed.
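The meaning of a displayed delta can be checked with a brute-force sketch that simply re-runs the search with one term toggled. This is not the per-file counting algorithm of the description, only a reference for what the numbers should be. The names and the dictionary layout of the condition are assumptions, and an empty OR list is treated as satisfied only when an AND term is present.

```python
def matches(keywords, cond):
    """cond maps 'or', 'and', 'not' to sets of terms."""
    present = set(keywords)
    if cond["or"]:
        or_ok = any(t in present for t in cond["or"])
    else:
        or_ok = bool(cond["and"])      # assumed dummy-OR behaviour
    and_ok = all(t in present for t in cond["and"])
    not_ok = not any(t in present for t in cond["not"])
    return or_ok and and_ok and not_ok


def count_results(files, cond):
    """files maps a file name to its set of keywords."""
    return sum(1 for keywords in files.values() if matches(keywords, cond))


def displayed_delta(files, cond, term, mode):
    """Signed change in result count if `term` is toggled in `mode`."""
    toggled = {m: set(terms) for m, terms in cond.items()}
    toggled[mode] ^= {term}            # press if unpressed, unpress if pressed
    return count_results(files, toggled) - count_results(files, cond)
```

The value can be rendered with its sign using `format(delta, "+d")`.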
How the displayed delta is computed: algorithm
Consider each file in the database separately. There is a list of 'unique'
keywords
maintained for each file. After the determination is made whether a file is to
be
included in the search result or not, then the unique list is processed. For
each
item in the list of unique keywords, increment the counts based on the
following.
Or None: In this case, the file is not in the search result yet. If AND and
NOT conditions are met, then any non-AND and non-NOT keyword in the unique
list should have its OR count incremented. If AND or NOT conditions are not
currently being met, then none of the OR numbers should be affected.
Or Minimal: In this case, removing the OR keyword that was matched in this
image will cause the image to be removed from the search result. Add 1 to the
count for that particular keyword. None of the other OR counts for the other
keywords will be modified.
Or not minimal: None of the OR counts should be modified, since redundant
conditions are being satisfied with respect to this file.
And unsatisfied: In this case, if OR and NOT conditions are being met, and
there is only one AND condition unmet for this file in the search, then
removing that AND condition would result in the file being included in the
search result, so the count for that keyword should be incremented. No other
AND keywords should get an AND increment.
And Satisfied: In this case, all the terms in the AND list should have their
counts incremented.
Not unsatisfied: In this case some NOT term is preventing the file from being
included in the search result. If there is only one NOT term in the list, then
increment the NOT count for that keyword. Otherwise add nothing to the NOT
count for that keyword.
Not Satisfied: In this case, the NOT terms that are present are not in the
file of interest. So take the list of all keywords that are in the file, and
increment their NOT counts, since NOT'ing any of them would cause the search
result to decrease.
If a node appears more than once in the metadata for a file, then the counts
should be incremented only once, so that the search counts remain accurate.
If there are no nodes selected in the search interface, it's a special case.
Instead
of showing every node, no nodes are shown. As soon as an AND node is
selected, a dummy OR term is generated which is present in each file of the
database. As a result, the OR becomes satisfied. If any other OR term is
added,
then the dummy OR term is not needed and it's no longer displayed.
to conduct a search
In order to conduct a search, the algorithm scans through the metadata of all
the
stored images, and selects those that satisfy the search criteria.
The most innovative aspect of the searching is the ability to find qualifiers.
As the metadata for a given file is examined, if a node is found to be
specified as
a searchable qualifier in the current search, then an additional check is made
to
ensure that the node is contained in a standard search item. Thus, a search
for
blue car will find the qualifier blue only within a node car.
The routine is somewhat general in that it does not specify which node
contains
the qualifier. This generalization makes the outcome more understandable and
the specification of searches more tractable.
Another aspect of the search routine is to compute the significance of the
various
search terms. Basically, if this term were not present, would the outcome of
the
search for this image be different. If so, then the count is incremented.
These
counts allow the interactive search display to be able to predict, based on
the
previous search conducted, the probable outcome of changes to the search term
choices.
{INSERT pseudocode for doing a search HERE}
to search for a qualifier
A qualifier for the purposes of this discussion is any xml element having a
URI
different from that of its parent element, and the parent element not being in
the
URI sn: or <xml>. Thus the nodes immediately under the sn:snmd node in SNMD
are not considered qualifiers. They are 'regular metadata elements'.
Qualifiers can be from any URI and serve to decorate 'regular metadata
elements' and other 'qualifier elements' with further metadata.
The instructions for searching in a qualifier-aware way are given here.
Consider the metadata from each file in turn: call it the 'current file'.
Process the list of contained keyword elements for the current file.
For every element in the current file's metadata, if the element in this
position in
the list is also in the list of qualifiers being searched for, then check the
parent
elements (from the metadata) which contain this qualifier element.
If any parent element of this qualifier element is in a standard AND or OR
group,
then consider this qualifier as being present in the metadata. If none of the
parents of this qualifier element are in AND or OR groups, then consider this
qualifier not present. However, the qualifier may still end up considered
present if
it is found again further along in the metadata under a different parent which
turns out to be in an AND or OR group. For those qualifiers that survive the
above process of determining valid-parentage, put them in a list.
Once a list of the valid qualifier elements for the image has been created,
apply
the standard AND OR and NOT to their presence and absence, to determine if
the image is considered a target of the search.
{INSERT pseudocode for qualifier search HERE}
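The parentage check described above could look like the sketch below. It rests on several assumptions that are not in the original text: elements are identified by plain tag names rather than namespace URIs, "any parent" is read as "any ancestor", and the function names are invented for the illustration.

```python
import xml.etree.ElementTree as ET


def valid_qualifiers(root, qualifiers, standard_terms):
    """Return the searched-for qualifiers that sit under a standard search term."""
    parent_of = {child: parent for parent in root.iter() for child in parent}
    found = set()
    for element in root.iter():
        if element.tag not in qualifiers:
            continue
        # Walk up the ancestors looking for a standard AND/OR search item.
        ancestor = parent_of.get(element)
        while ancestor is not None:
            if ancestor.tag in standard_terms:
                found.add(element.tag)
                break
            ancestor = parent_of.get(ancestor)
    return found


def file_matches(xml_text, and_terms, or_terms, not_terms, qualifiers):
    """Apply AND, OR and NOT after qualifiers have been validated."""
    root = ET.fromstring(xml_text)
    standard = (and_terms | or_terms) - qualifiers
    # Regular elements are present wherever they occur; qualifiers only
    # count when their parentage is valid.
    present = {e.tag for e in root.iter() if e.tag not in qualifiers}
    present |= valid_qualifiers(root, qualifiers, standard)
    if not and_terms <= present:
        return False
    if or_terms and not (or_terms & present):
        return False
    return not (not_terms & present)
```

With this sketch, a search for blue car accepts a file whose metadata nests blue inside car, and rejects a file where blue only appears under some unrelated node such as sky.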
to insert a node into the active tree given its location in the all tags tree
Given a starting point for a node, check if the node is already in the active
tree.
If so, the algorithm is considered to have finished successfully.
If not, then determine the visible parent node of the desired node, and
recurse to higher ancestor nodes to put the desired node into the active tree.
Visibility is a global attribute of every node and applies whether the parent is
actually shown in a tree or currently hidden due to non-expanded nodes in the
tree view.
When a parent node is found to be present in the active tree, then return to
the
outer levels (containing more deeply nested element requests) and add them to
the active tree under the parent.
Rephrased, to make the definition of 'visible' clearer:
Compute the id of the visible parent of the requested node (i.e., if this node
is a hidden equivalence, determine the visible equivalent of the node's parent).
If the parent node is not already in the active tree, recurse to get it into
the tree.
Then insert the requested node into the active tree under the found or newly
inserted parent node.
{INSERT pseudocode for active tree node insertion HERE}
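The recursive insertion can be illustrated with a short sketch. The TagNode record, the dictionary used for the all tags tree, and the representation of the active tree as a map from node id to parent id are assumptions chosen for brevity, not structures named in the text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TagNode:
    node_id: str
    parent_id: Optional[str]          # None for the root of the all tags tree
    visible_id: Optional[str] = None  # set when this node is a hidden equivalence


def visible(all_tags, node_id):
    """Resolve a node id to the id of its visible equivalent."""
    node = all_tags[node_id]
    return node.visible_id if node.visible_id is not None else node_id


def insert_into_active(all_tags, active, node_id):
    """Ensure the node and its visible ancestors are in the active tree.

    `active` maps each active node id to the id of its active parent.
    Returns the id under which the node is present in the active tree.
    """
    node_id = visible(all_tags, node_id)
    if node_id in active:
        return node_id                # already present: finished successfully
    parent_id = all_tags[node_id].parent_id
    if parent_id is None:
        active[node_id] = None        # the root has no parent
        return node_id
    # Recurse so the (visible) parent is present before the child is added.
    active[node_id] = insert_into_active(all_tags, active, parent_id)
    return node_id
```

In this sketch a node whose parent is a hidden equivalence ends up attached under the parent's visible equivalent, which matches the rephrased description above.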
to convert the active tree into a whack interface button and tab list for
display
Starting at the root of the active tree, (a.k.a. the tagset tree) recurse into
each
child node, looking for nodes that make tabs.
When a tab node is found via the recursion (depth first), place the tab into
the
whack interface following the order that they were located.
For every tab placed into the whack interface, immediately compute the list of
buttons that will be on the tab, taking note of any buttontabs that will also
be on
the tab. The portion of the active tree starting at a buttontab must also be
recursed as any other tab to populate the whack interface because a buttontab
is
both a button and a tab.
The resulting ordering should be that any given tab's buttontabs have their
tabs
placed into the progression before any of the later siblings of the tab. Sort
of like
British royalty, where the oldest son of the heir to the throne is ahead of
any
younger brothers of the heir. In this analogy, the parent node is like the
heir, and
its first child node is like the oldest son.
Tabs that are associated with buttontabs are, by default, initially hidden in
the
whack interface.
{INSERT pseudocode for treetotabs HERE}
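The depth-first ordering described above, in which a tab's buttontabs are placed before the tab's later siblings, might be sketched as follows. The Node and Tab classes and the kind strings are hypothetical; the text only names the concepts of tab, button and buttontab.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    kind: str = "group"   # "tab", "button", "buttontab", or a non-display "group"
    children: list = field(default_factory=list)


@dataclass
class Tab:
    name: str
    buttons: list
    hidden: bool


def tree_to_tabs(root):
    """Flatten the active tree into an ordered list of tabs with their buttons."""
    tabs = []

    def place_tab(node, hidden):
        # Compute the button list immediately; buttontabs count as buttons too.
        buttons = [c.name for c in node.children
                   if c.kind in ("button", "buttontab")]
        tabs.append(Tab(node.name, buttons, hidden))
        # Recursing now puts nested buttontabs ahead of this tab's later siblings.
        for child in node.children:
            walk(child)

    def walk(node):
        if node.kind == "tab":
            place_tab(node, hidden=False)
        elif node.kind == "buttontab":
            place_tab(node, hidden=True)   # buttontab tabs start hidden
        else:
            for child in node.children:
                walk(child)

    walk(root)
    return tabs
```

For a tree with tabs People and Places, where People contains a buttontab Family, the resulting order is People, Family, Places, which mirrors the line-of-succession analogy.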
{INSERT pseudocode for metadata as XML extractor HERE}
Embedded metadata in computer-readable file objects consists of parameters
with values. In the case of keywords, sometimes the value is considered the
parameter. Metadata in a file is considered to be isolated to 'metadata
compartments' where the data in a given compartment is stored and processed
as a unit. Other compartments may be considered but can be left unchanged by
certain operations. XMP metadata is designated to be in its own compartment,
as
is SNMD. A list of the compartments currently supported is EXIF, IPTC, XMP,
SNMD, ID3, MP3, STAT, and PHOTOSHOP.
This document describes a way to extract the metadata information and
represent it as XML, for instance so that tools that process XML can operate
on
the extracted metadata.
In order to read the metadata from a file object and store it as XML, a
namespace mapping the elements of the metadata to ASCII text element names
must be created. This is then used to generate XML elements containing a
rendered version of the binary data that is the parameter of the metadata
item.
The encoding for the binary data must conform to the rules of XML syntax. For
instance, the NULL character is not allowed, even when escaped as &#0;. So,
a binary to hex64 conversion can be used. This maps a binary buffer of data into
a string of characters using only 0-9, A-Z, a-z, and +/; the = character is a
kind of terminator for the 64 symbols required. Non-listed characters may
intrude into the sequence without changing the translated binary value.
Another method of encoding binary data which is particularly useful in some
kinds of image files is gzip encoding. The binary data is first compressed
with
gzip, and then the resulting binary compressed data is converted to hex64
notation for output in the XML.
When the data is a simple number or string of characters, it can be
represented
in the XML as the plain text.
In order to aid readability, recognized ASCII strings embedded in binary encoded
data can be listed inside XML comments, which start with <!-- and end with -->.
Internally, the program can convert all the strings into a single standard
notation
for manipulation and storage. Since the internal xml processing is a closed
system, the 0x00 char and other character encodings can be used without
concern for reformatting and character substitutions.
Conversion programs can be written for handling EXIF, TIFF, IPTC, ID3 and
other forms of metadata, by simply examining the specification for the
metadata
and representing each element that can store metadata with a string, and
listing
the contents when it's time to extract. The inverse process will be obvious to
those skilled in the art of algorithmic analysis.
Extracted XML is kept separate on a per-compartment basis (compartments such
as EXIF, IPTC) so that if the information in one compartment is somehow
discovered to be corrupt, the information in the other compartments can still
be
processed. Also, having the compartments separate allows meta-rules to be
enforced in the program for precedence of item contents in the case of
conflicting
or missing information in the metadata.
to display xml metadata in a tree structure
{INSERT pseudocode for xmltotree HERE}
to convert binary data into a form compatible with insertion in XML
Arbitrary binary data can be embedded in XML via binhex encoding. In the case
where binary data needs to be embedded into a binary compartment, such as
TIFF, the data is still passed as XML, but the data is encoded as binhex and
there is a sn:encoding='binhex' attribute added to the enclosing element. This
allows arbitrary information to be passed as XML without requiring special
characters and NULLs to be encoded into the XML stream. Such references,
even though the stream contains no actual NULLs (&#0; is used for a NULL,
for instance), are still considered illegal syntax for well-formed XML, probably
because of a conversion step that is performed on the data before it is parsed,
where character references such as the above NULL example are converted into
real character codes, to simplify lower level parsing.
Regardless of the reason that &#0; is illegal in XML, it's not a problem because
binhex notation can handle the data without recourse to the binary code
representations. It's also significantly more dense in terms of characters per
encoded byte.
In some cases, it may be advantageous to use gzip compression on the binary
data as well. In this case, the compression is applied first, and then the
resulting compressed memory buffer is binhex notated and inserted into the XML.
The attribute used in that case is sn:encoding='gzip'.
The image metadata library interprets the encoding values before constructing
the binary memory buffer which will contain the data for insertion into the
file.
Likewise, when information that would require not-well-formed or bulky binary
representations is encountered, binhex encoding can be used by the image
metadata library to generate well formed text content for transfer and
manipulation.
When binhex notation is used, as an added aid to XML readability, an XML
comment element is inserted into the stream, containing any discovered ASCII or
UTF-16 strings that are present in the binary data, where the minimum string
size
can be set to 4 characters to reduce the likelihood of junk data showing up in
the
comment section.
(INSERT pseudocode for binary data in xml HERE)
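A minimal sketch of the encoding step follows. It assumes that the 'binhex' notation corresponds to standard Base64, which the 64-symbol alphabet described earlier suggests but the text does not state. The function names are invented for the example, the XML is assembled as a plain string rather than by the image metadata library, and only ASCII strings (not UTF-16) are collected for the readability comment.

```python
import base64
import gzip
import re


def printable_strings(data, min_len=4):
    """Find runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


def encode_element(tag, data, compress=False):
    """Render binary data as an XML element with an sn:encoding attribute."""
    payload = gzip.compress(data) if compress else data   # compress first
    encoding = "gzip" if compress else "binhex"
    text = base64.b64encode(payload).decode("ascii")      # then notate
    # '--' is not allowed inside an XML comment, so break it up.
    hints = " ".join(printable_strings(data)).replace("--", "- -")
    comment = f"<!-- {hints} -->" if hints else ""
    return f'<{tag} sn:encoding="{encoding}">{comment}{text}</{tag}>'


def decode_text(text, encoding):
    """Inverse step: turn element text back into the original bytes."""
    payload = base64.b64decode(text)
    return gzip.decompress(payload) if encoding == "gzip" else payload
```

The minimum string length of 4 follows the suggestion in the text for keeping junk data out of the comment.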
process all the pointers and equivalences in the database and make the
appropriate associations
Each element in the database has associated with it a list of pointers. Each
pointer in the list points to other elements, some of which may or may not
happen
to exist.
To process the pointers, a procedure is performed for every keyword item in
each vocabulary in the database.
For each keyword item, for each pointer in its list, compute the destination
item to
which the pointer is pointing. If the destination exists, a reference to it is
added to
the equivalence set for the element.
Equivalence is transitive. Once all the pointers have been processed, some
nodes will be members of equivalence sets.
In order for the trees to make sense, only one element in a given list of
equivalent nodes is considered 'visible'. The visible node is the one drawn in
trees, regardless of which node actually was named in the XML being processed.
The flags associated with each pointer in the pointer list can direct the
decision making process which selects the visible node for the set. Either the
referring node becomes invisible, and the destination of the pointer becomes the
visible node, or vice versa. A flag is set for each node indicating whether it
is hidden, in which case the software should examine the equivalence set to find
the visible node, or visible, in which case the software can use it directly.
At all times, there is exactly one visible node in each equivalence set.
One of the benefits of equivalences is that transformations can be done to the
equivalence structure without having to modify the metadata of all files that
have
metadata referring to now-invisible nodes. Also, "Pool" nodes can be created
and
used to gather and rearrange nodes in the trees, to make the trees easier to
inspect while maintaining details of the data elsewhere in the tree.
A Pool node can contain any node without changing its effective parentage for
purposes of final metadata rendering. This is to say that a Pool node does not
make XML, although it has a significant effect on the display in trees.
The software will ensure that a Pool node cannot be converted into an XML-
making node (button, tab, buttontab). The opposite transformation is also
prevented.
Thus, XML-making metadata elements can be nested arbitrarily within Pool
elements without altering their meaning. This makes them a lot easier to
organize.
This method of using Pools is analogous to the way Photoshop Elements (PSE)
stores keywords in image files. PSE allows only the leaf node (most deeply
nested element) to be embedded in the file, and keeps the outer nodes as
elements which group the nodes and allow display rendering without affecting
the
embedding.
Since Pools do not change the effective parent of a node, it is necessary that
all
the non-Pool leaf elements, and any nested Pool intermediate elements in a
vocabulary tree that is rooted at an XML-making node, have unique element
names.
{INSERT pseudocode for processpointers HERE}
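Because equivalence is transitive, the pointer processing can be illustrated with a union-find structure. This is a sketch under stated assumptions: the pointer flags are reduced to a single boolean saying whether the destination becomes the visible node, elements are plain string ids, and the function name is hypothetical.

```python
def process_pointers(elements, pointers):
    """Build equivalence sets and pick one visible node per set.

    elements: list of element ids present in the database.
    pointers: list of (source, dest, dest_is_visible) tuples.
    Returns a dict mapping every element id to its visible equivalent.
    """
    parent = {e: e for e in elements}    # union-find forest
    visible = {e: e for e in elements}   # set root -> visible member of the set

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    for source, dest, dest_is_visible in pointers:
        if source not in parent or dest not in parent:
            continue                        # the destination may not exist
        rs, rd = find(source), find(dest)
        if rs == rd:
            continue                        # already equivalent
        chosen = visible[rd] if dest_is_visible else visible[rs]
        parent[rs] = rd                     # merge the two sets
        visible[rd] = chosen                # exactly one visible node per set

    return {e: visible[find(e)] for e in parent}
```

An element that maps to itself is visible; any other element is hidden and resolves to the visible node of its set.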
select the node to use for creation guidance
Some notation useful in describing the algorithm:
Let Parent(X) be the parent of a node X in the vocabulary tree. Let Child(X) be
one of the children of node X in the vocabulary tree. Let ChildList(X) be the
list of all the child nodes of X. Let Name(X) be the element name for the node X
in the vocabulary tree. Let Suffix(X) be that part of Name(X) following
"Creation_guidance_" if Name(X) begins with "Creation_guidance_"; otherwise,
Suffix(X) is null.
For a given node T in the vocabulary which defines a tab, select the creation
guidance node C appropriate for T according to the following algorithm.
For all the nodes N in ChildList(T) if exactly one node has a non-null
Suffix(N)
then that is the creation guidance node.
If more than one node in ChildList(T) has a non-null Suffix() then the node T
has
branched creation guidance, and there is no guidance provided at this level.
For every parent node, in progression, from T, call them P(i) where P(0) is T.
Construct the list of parent nodes P(i) by letting P(i) = Parent(P(i-1)), with
P(0) = T.
For each i, increasing, starting at 1, examine ChildList(P(i)). If no node N
in
ChildList(P(i)) has a non-NULL Suffix(N) then increment i and try again. If
the list
of P(i) is exhausted, then there is no creation guidance node and the search
terminates.
Create a new empty vector of nodes C() and populate the position i in it with N.
Now proceed to populate lower indices in C() according to the following method.
Examine each node M in the ChildList(C(i)) and count the number with a non-NULL
Suffix(M). Create the list L(j) containing these child nodes, indexed from 1.
If the list L(j) contains no elements, the search terminates and no guidance is
found. If the list L(j) contains one element, then assign C(i-1) from L(1) and
then decrement i. If the list L(j) contains more than one element, then loop
through the list L(j): for each member M(k) of list L(j), try to find an M(k)
node where Suffix(M) matches the Name(P(i-1)). If a match is found, assign
C(i-1) from M(k). If no match is found, and the list of L(j) is exhausted, and
there is exactly one node in the M(k) list, then assign C(i-1) from L(1). If no
match is found, then if there is only one creation guidance node at P(i) recurse
into it. Otherwise the search terminates with no match.
Once the i v (INSERT pseudocode for creation guidance node HERE) selection
The thumbnail cache fronts for the thumbnail vault, a file which contains JPEG
thumbnail binary data and a bit of header information for each file in the
database.
The most recently accessed 1000 files are retained in the thumbnail cache so
that redrawing the thumbstrip when it is scrolled or when it needs to be
redrawn
for other reasons, can be done without requiring disk access.
When a file is placed into the thumbnail vault file, it is not put into the
LRU cache,
since the process of computing thumbnails for all image files would then
constantly overwrite the contents of the cache.
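The cache behaviour described above can be sketched with an ordered dictionary. The class name, the dictionary standing in for the vault file, and the read counter are assumptions for the illustration; only the capacity of 1000 and the rule that vault writes bypass the cache come from the text.

```python
from collections import OrderedDict


class ThumbnailCache:
    """LRU cache in front of a thumbnail vault (sketch)."""

    def __init__(self, vault, capacity=1000):
        self.vault = vault          # stand-in for the vault file: path -> JPEG bytes
        self.capacity = capacity
        self._lru = OrderedDict()   # least recently used entry first
        self.vault_reads = 0        # counts simulated disk accesses

    def store(self, path, jpeg_bytes):
        """Write to the vault only, so bulk thumbnailing does not flush the cache."""
        self.vault[path] = jpeg_bytes

    def get(self, path):
        """Return a thumbnail, touching the vault only on a cache miss."""
        if path in self._lru:
            self._lru.move_to_end(path)     # mark as most recently used
            return self._lru[path]
        self.vault_reads += 1
        data = self.vault[path]
        self._lru[path] = data
        if len(self._lru) > self.capacity:
            self._lru.popitem(last=False)   # evict the least recently used
        return data
```

Redrawing the thumbstrip would then call get() for each visible file, and repeated redraws of the same region would be served from memory.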
embedding
The SwapNeat metadata infrastructure requires that all metadata passing in and
out of the system be encoded as XML. Metadata encoded this way is subject to a
few constraints over and above it being well-formed XML.
1. it must not contain sub-elements and element content within any single
element. Either element content (text data) or sub-elements may occur,
but not both.
2. the order of sibling elements contained by a parent element is, in general,
arbitrary, and any rearrangement of the order is defined to be equivalent
metadata (the exception being rdf:Seq elements).
3. two or more sibling elements having the same element name can be
combined into a single element having the same name, and the union of
sub-elements of the original elements. (some exceptions are supported,
such that sibling elements with different attributes may be considered
different elements and maintain their independence during processing).
There are some metadata formats such as METS that make use of multiple
similarly named elements as siblings within a parent element. This
generalization
is not currently supported by the SwapNeat metadata infrastructure.
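The three constraints above imply that two metadata documents can be compared after reduction to a canonical form. The sketch below is one way to do that, with several simplifications that are assumptions rather than part of the description: attributes are ignored, the rdf:Seq ordering exception is not handled, and same-named text elements are merged into a set of their text values.

```python
import xml.etree.ElementTree as ET


def merge(tag, elements):
    """Merge same-named sibling elements into one canonical, hashable form."""
    texts = {(e.text or "").strip() for e in elements} - {""}
    groups = {}
    for element in elements:
        for child in element:
            groups.setdefault(child.tag, []).append(child)
    if texts and groups:
        # Constraint 1: text content and sub-elements may not be mixed.
        raise ValueError(f"mixed content in <{tag}>")
    if texts:
        return (tag, frozenset(texts))
    # Constraints 2 and 3: sibling order is irrelevant and same-named
    # siblings combine, so children are grouped by name into a frozenset.
    return (tag, frozenset(merge(name, group) for name, group in groups.items()))


def canonical(xml_text):
    """Canonical form of a metadata document, suitable for equality tests."""
    root = ET.fromstring(xml_text)
    return merge(root.tag, [root])
```

Two documents that differ only in sibling order, or in whether same-named siblings are written separately or combined, produce equal canonical forms.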
The image metadata library has binary image files as its object but XML as its
data transfer encoding.
This allows XML-processing tools to create XML for embedding without
necessarily relying on the Swapneat Metadata Studio to generate the XML.
The image metadata library extracts binary and textual information from files,
and
re-formats the information into a collection of XML strings, which are
considered
the primary reference metadata. One well-formed XML string is produced for
each 'Compartment' of metadata in the original file.
With the exception of XMP, the compartments have a defined and fairly static
vocabulary that is used to define the meanings of element names and in some
cases, controlled vocabulary contents of elements, such as the IPTC newscode
elements.
The SNMD compartment is embedded without modification or parsing by the
image metadata library. Of course, the image metadata library does insert
appropriate headers and padding along with the SNMD according to a method
similar to the way the XMP compartment is embedded.
Some data formats have as part of their specification the means to hold non-
XML
metadata within the files.
Many data formats support arbitrary embedding of ASCII text in a
'compartment'.
Metadata is embedded into the files in ways that do not interfere with
standards-compliant processing of the file data.
The use of XML as an intermediate format, along with the ability to binhex and
gzip encode binary data allows a full set of data encoding capabilities, in a
structured but not rigid format.
embeddable compartments
In any given file format, there may be several logically independent
compartments where metadata is commonly stored. For instance, in images,
there are industry standard places for EXIF/TIFF metadata, Photoshop metadata,
and XMP metadata.
Image formats including PNG, JPEG and TIFF have their data organized in a
linked-list format that allows arbitrary text (in some cases, of limited total
length) to be embedded.
The opportunity to embed XML generalizes the metadata compartment and
allows for endless variety of proprietary metadata in support of specific
workflows. The internal formatting options of XML can be handled without
update
to the basic tools that insert and extract the metadata.
Each compartment is effectively independent of other compartments; a parsing
error reading one compartment need not preclude access to the data found in
other compartments, but conversely, the metadata in one compartment might get
updated by certain software tools without corresponding metadata in other
compartments being updated.
Some existing metadata tools ignore information in certain compartments either
because the tools are old (and the compartments are newer) or because the tools
are so new they don't care about old compartments. In those cases, they may
inadvertently preserve stale metadata, or may erase metadata they don't
understand without notifying the user.
For maximum compatibility, metadata should be put into all compartments where
it belongs, such that any tool can find it.
This is not to say that needless duplication of metadata should take place. If
there is a modern place that a certain metadata item belongs, then it's not
logical
to create a new item with a new location and put it there too. The SwapNeat
paradigm is to embed metadata where it belongs, and in all equivalent industry-
standard places.
SNMD differs from XMP in an important way. SNMD is not held to a certain
parsing model, and is not held to RDF constructs for internal organization.
SNMD is primarily designed to allow arbitrary metadata XML to be created,
managed, embedded and processed as well as searched. Although originally
targeted at nested keyword notation, SNMD allows nodes which take text content
and effectively hold parameter values.
A corollary is that SNMD XML can be arbitrarily deeply nested, and the
specification for what is allowed, and where, can be verified by industry
standard
tools which check the legality of nesting certain elements within other
elements.
In addition to a rigid enforceable structure of nesting rules, SNMD allows two
additional key features:
The first key feature is the ability to infer compatible revisions of
vocabularies,
based on the URI assigned to the prefixes used in the XML.
The tools which are revision aware (SwapNeat metadata infrastructure) can
inter-
operate with revision-non-aware tools such as SAXON. Making use of
compatible revisions allows 'discovery' of the newer metadata constructs in a
later revision of a vocabulary without having to stop everything in order to
access
the reference for the vocabulary. Thus, a newly arrived file may contain SNMD
metadata to a standard newer than the one known to the SwapNeat metadata
infrastructure. The available older-revision vocabulary's rules can be
enforced,
(and data types defined there parsed and used), and newer constructs can be
considered 'discovered'.
This is not revolutionary... The XMP standard required tolerance of discovered
namespaces, although it limits their nesting depth and complexity a great
deal.
Such discovered metadata should be faithfully re-embedded into the image file
after editing of other known XMP constructs, with minimal damage. To a certain
extent XMP libraries are able to comply with this requirement.
The second key feature of SNMD is the ability to legally insert qualifiers
into any
nodes in order to add more specificity to items in the metadata.
The selection of these qualifiers (nested elements) is limited only by the
imagination and cooperation of the people using them. A qualifier is different
from
a sub-element because it is permitted wherever elements can be found, and
must be tolerated by any processing software. It can define arbitrary
complexity
at a contextual location in the parent metadata, without having to resort to
XML id
attribute cross correlation, and other cumbersome programming-intensive ways
to add structure.
It is a natural kind of added structure, and it is the primary reason why SNMD
needs to be used for truly rich metadata structure.
There are two ways to specify that it is legal to have a qualifier within a
certain
node. Either all XML elements in a vocabulary tolerate any qualifier (standard
SNMD) or XML elements tolerate only a limited list of defined sub-elements. In
this case, one of the tolerated sub-elements can be the <sn:qualifiers> sub-
element which then is defined to allow any well formed XML consistent with the
parsing rules of the specifications of the referenced vocabularies.
The advantage of the second method might be more tools can handle the
parsing of the files. SNMD can be saved in either format, and is equivalently
treated at input by the Swapneat Metadata Studio without regard to
<sn:qualifiers> elements.
Because it is possible to convert an sntv file to an xsd file and therefore
employ third-party XML processing to operate on SNMD XML, the use of the
<sn:qualifiers> elements can make those scripts tractable.
EXIF/TIFF
Exchangeable Image File Format or EXIF is a standard created by the Japan
Electronics and Information Technology Industries Association (JEITA) that
provides metadata for digital still camera images. EXIF is a metadata
compartment extension of Adobe's Tagged Image File Format (TIFF). TIFF is
a complete image format specification that deals with both the image's data and
metadata. For TIFF images SNMD metadata is encoded as an Image File
Directory (IFD) and identified by the TIFF assigned Tag ID (0xC6C0).
IPTC
The International Press Telecommunications Council (IPTC) and the Newspaper
Association of America (NAA) developed the IPTC-NAA Information Interchange
Model. This model is designed to embed metadata into all kinds of data
including
text, photos and graphics so that the data and metadata can be communicated
and shared as one. SNMD can be encoded into the dataset tag id 2:25.
XMP
XMP was an attempt to create a structured but easily maintained collection of
metadata for embedding into file objects. Due to its lack of deep nesting
capability, it had to be worked around in order to embed things such as nested
keywords and other items.
Due to the nature of XML, which XMP uses, there is no provision for embedding
binary information into XMP, so some important metadata cannot be embedded
at all.
MP3
MPEG-1 Audio Layer 3 (MP3) is a popular audio encoding format. The
specification focuses on the data format and compression of audio data. An MP3
file contains one or more frames, each of which consists of an MP3 header and
MP3 data. The MP3 header contains the metadata of the frame and describes the
encoding of the MP3 data including such characteristics as bitrate, frequency,
MPEG version, etc.
Currently, SwapNeat supports MP3 metadata in read-only mode.
ID3
ID3 provides more extensive metadata for MP3 files; this includes information
about the title, artist, track, lyrics, etc. There are over 84 different frames
of metadata in the ID3v2 specification.
SNMD is embedded into MP3 audio files by placing the SNMD into an ID3
comment frame with the description set to SwapNeat XML and the text being the
SNMD XML.
SNMD
There is a need for a more flexible specification for embedding metadata which
can support both strict control of nesting and free discovery of nesting
patterns.
This is SNMD.
searching
Many search algorithms exist, and some are innovative and patented.
Since the collection of files is usually limited in photo collections to a few
tens of thousands, there is no need to do pre-indexing and maintain extra
information about the files.
The entire database will fit easily in RAM on the machine performing the search,
and is searched sequentially. As such, there is nothing especially innovative
about the method of indexing the data in order to provide a faster search.
However, the information that specifies what to look for, and the method for
discovering whether, in a given file, the information is present, is a part of
the invention, and is reviewed here.
Since all the metadata in the database is also in the file, any industry
standard database may be used to attempt to index the contained metadata in any
way it supports, so that faster searches are possible. Such algorithms, uses and
indexing of metadata are outside the scope of this patent.
adding and changing information in the vocabulary
The SwapNeat sntv files are the 'vocabulary' that defines the available metadata
element names that can be applied.
Several vocabularies can be referenced by the whack interface, but all
references to metadata by the whack interface must be from vocabularies.
discovered vocabularies
submissions from contributors
discovered element names
submitted translation files (lang packs)
A word may have many meanings in one language, and each particular meaning may
be expressed by an entirely different word in another language.
That is why Lang Packs are important: it is not enough to just translate the
words; you have to get the right word or phrase that has the same meaning in
the other languages.
Lang Packs are disambiguators. In one language a word may have multiple
meanings; once that word is paired with its translation in another language,
the ambiguity is reduced to the definition the two words have in common. The
more words added, the less the possible ambiguity.
rules for community involvement
there is an official version of the sntv file
the moderator also has a working version of the vocabulary containing his
additional notes and work in progress
duties of the moderator
holds the secret password for the vocabulary and has exclusive privileges to
sign
and upload new versions of the vocabulary
release official versions
accept and decide what to do with contributions
duties of the contributor
methods for third parties to use the moderated data
information in the tag vocabulary can be manipulated and rearranged using
tools
in the gui
there are convenient tools to allow contributions and recent additions to be
located, and deprecated, accepted, or hidden.
a list of the operations that a moderator can perform on a node or on a node
and
all its descendents follows
accept
deprecate
permanently
temporarily
delete
hide
operations that enable the creation of new content based on existing non-
owned content
take ownership
any swapneat user can import a vocabulary from another user, and use the
element names and parameter types, and equivalence references to create a
part of his own vocabulary.
the process of taking ownership does not create references back to the
original, it
just makes a copy of the original that can be edited by the new owner
migrate tags
it is possible to take ownership and also provide a link to the source files,
so that
the new vocabulary is a substitute for the original source. in that case the
swapneat metadata studio will conduct searches as if the new vocabulary had
been present in affected files all along. sharing the sntv thus produced by
migrate-tags will allow others to also benefit from this association.
refer to an authority vocabulary
it is possible to designate elements in a certain user-owned vocabulary to be
equivalent to elements in another vocabulary such that when the sntv for the
first
vocabulary is shared, the equivalences can be rendered by metadata studio of a
third party, even on files that have been previously distributed. It is not
necessary to get the files back and modify their metadata. Sending the new sntv
file will allow searches conducted on third-party machines to run as if the
files were already modified and had the authority data in them all along.
Brainstorming
= digitally signed photos
= web service from your machine to a standard browser, after authentication
via the server.
= shell extensions decrypt encrypted files via retrieval of key from
swapneat.net
Peer to Peer
No Central Server Technique
Objectives:
No central internet server
Provider does not need to publish to a central repository
Provider can determine which users have access to which files (ACL)
Provider can tag files so requesters can perform queries
Requesters can perform tag searches on the providers to filter out unwanted
files
Transfer is p2p rather than through a central server.
Requester and Provider need to be online at the same time.
The same gui is used to perform local searches and remote searches.
Technique:
TCP/IP is used to communicate between the requester and provider
The requester must be given access to the provider to perform a query
A search consists of a SwapNeat query similar to the protocol used for local
queries.
The query is sent over the connection.
The provider runs the query locally and generates an XML response which is
followed by a list of thumbnails.
The requester's SwapNeat gui thumbstrip is then filled with the returned
thumbnails.
The objective is to reuse the existing graphical interface. Aside from choosing
which server to perform the search on, the GUI and its behaviour are identical.
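The provider side of the technique above might be sketched as follows. This is only an illustration under stated assumptions: the XML element names, the in-memory library and ACL structures, and the function name `handle_query` are hypothetical, and the TCP/IP transport is left out so that the example stays self-contained.

```python
import xml.etree.ElementTree as ET


def handle_query(requester, wanted_tags, library, acl):
    """Run a tag query locally and build an XML response plus thumbnails.

    Only files the requester may access (per the ACL) and that carry every
    wanted tag are returned. The response layout is an assumption.
    """
    root = ET.Element("response")
    thumbnails = []
    for name, entry in sorted(library.items()):
        if requester not in acl.get(name, set()):
            continue  # requester has not been given access to this file
        if not set(wanted_tags) <= entry["tags"]:
            continue  # file does not match the query
        item = ET.SubElement(root, "file", name=name)
        for tag in sorted(entry["tags"]):
            ET.SubElement(item, "tag").text = tag
        thumbnails.append(entry["thumbnail"])
    return ET.tostring(root, encoding="unicode"), thumbnails


# Hypothetical provider library and access control list.
library = {
    "beach.jpg": {"tags": {"beach", "2008"}, "thumbnail": b"thumb-beach"},
    "party.jpg": {"tags": {"party", "2008"}, "thumbnail": b"thumb-party"},
    "dunes.jpg": {"tags": {"beach"}, "thumbnail": b"thumb-dunes"},
}
acl = {
    "beach.jpg": {"alice"},
    "party.jpg": {"alice", "bob"},
    "dunes.jpg": {"bob"},
}
```

In a full implementation the XML string and the thumbnail bytes would be written to the connection in that order, and the requester would fill its thumbstrip from the returned thumbnails.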
Central Server Technique
Objectives:
Provider publishes files to central internet server.
Provider can determine the resolution at which files are published.
Provider can determine which users have access to which files (ACL, based on
forum groups?)
Provider can tag files so requesters can perform queries
Requesters can perform tag searches on central server to filter out unwanted
files
Requester can request original-size files. If the file is available on the central
server, the server will provide it. If not, a request is made to the provider to
transfer the file via p2p or send it via e-mail.
Requester does not need to fish through conventional albums but can instead
perform a search to access all files in a provider's uploaded library.
Technique:
The provider tags and selects the files to be published.
The provider enters an album name, selects the resolution and starts the upload
to the central repository, which automatically creates a new online album and
parses the metadata of each file transferred.
The provider selects the groups or users who can view the album. Each member
of the group receives an email notification with a hyper link to the album.
Any invitee can browse to the album page and view the photos and tags. The
invitee can also do further queries based on tags, for other photos.
The invitee can request original-size images from the server. If the original size
is available, it will be offered to the invitee. If the original file is not available,
the request is logged for the next time the provider logs in.
When the provider is notified of requests, the provider will attempt to send the
file via p2p to the invitee. If the invitee is not online, the file can be emailed.
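The original-size request flow described above can be sketched as two small steps. The function names, the in-memory pending list and the route labels are assumptions made for illustration. No actual transfer or e-mail is performed.

```python
def request_original(server_files, pending, invitee, name):
    """Invitee asks the server for an original; log the request if it is absent."""
    if name in server_files:
        return "offered"
    pending.append((invitee, name))
    return "logged"


def deliver_pending(pending, online_invitees):
    """At provider login, choose a delivery route for each logged request.

    p2p is attempted when the invitee is online, otherwise e-mail is used.
    """
    deliveries = []
    for invitee, name in pending:
        route = "p2p" if invitee in online_invitees else "email"
        deliveries.append((invitee, name, route))
    pending.clear()
    return deliveries


# Hypothetical server state: only a.jpg was uploaded at original size.
server_files = {"a.jpg"}
pending = []
```

Keeping the pending list on the server lets requests accumulate while the provider is offline, which matches the requirement that they be logged for the provider's next login.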
Community Website
Supplementary Features
Conversion of third-party tag vocabulary files for use in SNMDS
Adobe Photoshop Elements can export the entire 'tag tree' as a very simple XML
file, retaining the hierarchy of the tree.
A logged-in user can upload an exported Adobe Photoshop Elements tagset
XML file to the server to be processed and transformed into a SwapNeat Tag
Vocabulary, which can then be imported into SwapNeat Metadata Studio as a
new SwapNeat tag vocabulary (a '.sntv' file).
The user would first log in to the SwapNeat community Website.
A PHP script would accept an Adobe XML file upload, ensure it was benign (not
malware / virus / hack attempt), then launch a server-side XSLT processor, using
an XSLT script hosted on the swapneat.net community Web server that will
transform the XML file to an sntv file.
Then additional PHP scripts would digitally sign the file (in the same manner that
all published tag vocabularies are so signed). The user could choose at that time
to share the tag vocabulary publicly etc. or keep it private. Whichever access
options the user chose, the sntv file would be sent to the user's browser.
The user would save the file to their local file system, then import the file into
SwapNeat Metadata Studio. All the tags they are familiar with in the Adobe
Photoshop Elements hierarchical tag tree are available in SwapNeat, and will be
structured in the embedded metadata.
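The transformation step might look like the following sketch. It is written with Python's ElementTree rather than the PHP and XSLT pipeline named above, and both XML layouts are assumptions: the nested `tag` elements stand in for the exported tag tree, and the `vocabulary` / `term` elements stand in for the sntv structure, neither of which is specified here. Upload validation and digital signing are not shown.

```python
import xml.etree.ElementTree as ET

# Hypothetical exported tag tree; the real export format may differ.
SAMPLE_EXPORT = """
<tags>
  <tag name="People">
    <tag name="Family"/>
    <tag name="Friends"/>
  </tag>
  <tag name="Places"/>
</tags>
"""


def convert(source_xml, vocabulary_name):
    """Copy a nested tag tree into a nested vocabulary tree, keeping hierarchy."""
    src = ET.fromstring(source_xml)
    out = ET.Element("vocabulary", name=vocabulary_name)

    def copy_children(src_node, out_node):
        for child in src_node.findall("tag"):
            term = ET.SubElement(out_node, "term", name=child.get("name", ""))
            copy_children(child, term)

    copy_children(src, out)
    return out


def term_paths(node, prefix=""):
    """List every term as a slash-separated path, parents before children."""
    paths = []
    for term in node.findall("term"):
        path = prefix + term.get("name")
        paths.append(path)
        paths.extend(term_paths(term, path + "/"))
    return paths
```

Calling `term_paths(convert(SAMPLE_EXPORT, "Imported tags"))` lists the converted hierarchy, which is a quick way to confirm that the parent and child relationships of the original tree survived the conversion.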
As used herein, the terms "comprises", "comprising", "including" and
"includes" are to be construed as being inclusive and open-ended. Specifically,
when used in this document, the terms "comprises", "comprising", "including",
"includes" and variations thereof, mean the specified features, steps or
components are included in the described invention. These terms are not to be
interpreted to exclude the presence of other features, steps, or components.