published: April, 2002
Adding Value to Large Multimedia Collections Through Annotation Technologies and Tools: Serving Communities of Interest
Paul Shabajee, Libby Miller, Institute for Learning and Research Technology (ILRT), University of Bristol, UK, Andy Dingley, Codesmiths, UK
A group of research projects based at HP-Labs Bristol, the University of Bristol and ARKive (a new, large, UK-based multimedia database project focused on the world's biodiversity) are working to develop a flexible model for the indexing of multimedia collections that allows users to annotate content utilizing extensible controlled vocabularies. As part of the educationally focused ARKive-ERA project, a series of models for user annotation has been developed.
The need for these types of user support and tools was identified while conducting pre-design user studies with specialist user groups. The needs center around the limitations of current on-line museum and library systems that do not provide support for users to annotate or tag multimedia objects of relevance to their particular community of interest or with specialized indexing terms. Tagging would enable specialized resource discovery and knowledge sharing with other members of their communities.
One example is that of University Lecturers and Researchers studying a particular type of animal behavior. They may wish to identify all relevant images or video of that particular behavior and annotate them as good illustrations of aspects of that behavior. However, significant issues arise over, for example, the validation of information, access control and the use of such annotations by the resource discovery tools. The paper explores these and other issues and problems involved, and explains how the various models can help provide solutions to key problems and thus meet the needs of a diverse range of communities of interest, thereby adding significant value to on-line multimedia collections.
Key Words: community annotation, flexible publishing, semantic web, ontologies, collaboration
The ARKive-ERA project is focused on investigating how best to design the underlying technological infrastructures to enable large multimedia database systems to maximize the educational potential of their multimedia assets, for users from a very diverse range of backgrounds and in a wide variety of contexts. The focus for the research has been the ARKive project (http://www.wildscreen.org.uk/arkive/), a large multimedia Web-based database system under development, containing diverse data related to endangered animal, plant and fungus species and their habitats as well as more common UK species.
ARKive is characteristic of many large digitization projects; during its initial phase of development it will contain data profiling some 2000 species and their habitats. This will take the form of approximately 9,000 minutes of digitized video and 30,000 still images along with hours of audio, maps, textual information and other supporting media and educational materials. These assets are donated by a diverse range of commercial and non-profit organizations as well as by individuals.
Essentially ARKive is a community project insofar as it is part of and relies on a community of organizations and individuals who have an interest in sharing access to rich multimedia resources focused on biodiversity.
ARKive type projects are designed to serve the needs of their diverse potential users by providing tools for individuals and communities of users to annotate the content of the database so as to make the content more valuable to others with similar interests.
It is important to note that we define annotation as metadata (see below) created after the creation of the content. It is this post hoc nature ('a note added to anything written', Oxford English Dictionary, 1998) that represents a considerable expansion of its usefulness as a means of adding value to content, because it allows people other than the original content author to add metadata descriptions.
As part of early work to identify the key requirements of ARKive and similar projects, it became clear that the diverse range of potential users includes school children and their teachers, media researchers, conservation scientists, customs officers, university lecturers and students and the very many people with personal rather than professional or educational interests in wildlife, to name but a few.
Each of the groups listed above is itself diverse with respect to the particular needs or desires of ARKive or similar systems.
We conducted a small-scale interview survey with university lecturers about their likely uses of ARKive and their needs with respect to supporting their teaching activities. As a result we identified that a key need of this group of users was to be able to search for multimedia resources to illustrate concepts when presenting to and supporting their students. Specific examples included infanticide, drug-induced behavior, life strategies of plants, inter-specific competition, identifying and classifying organisms, trophic levels, echolocation, binocular vision, and harvesting theory. Lecturers from different sub-domains (e.g. behavioral biology and ecology) suggested different terms.
This small example shows that even within sub-groups of one relatively well-defined group of users the resource discovery needs alone were complex. Indeed it quickly became clear that all of these sub-groups or communities of interest have their own specialized vocabularies and concepts that they would like the resources indexed under so as to support their resource discovery needs.
Not only did they want to be able to search for assets using specialist vocabularies, but ideally they also wanted to be able to find good examples of assets that illustrated a particular aspect of a particular concept.
These are not unreasonable requests as clearly it is not practical to browse 9,000 minutes of video, or even a small sub-set, in the hope of finding a good example to illustrate aspects of, for example, harvesting theory. However for ARKive it is simply not feasible to index every asset with the terms relevant to all possible communities of interest.
The problems can be expressed more clearly by using basic concepts from Information Retrieval (IR) literature (e.g. Chowdhury 1999).
If we use the example above, the university lecturers (not unreasonably) want to use their specialist vocabularies to search ARKive's multimedia archive and to obtain high precision and recall from the system based on those terms and the queries constructed from them.
When actually doing the indexing, ARKive has to balance the levels of specificity and exhaustiveness of their indexing to make the task tractable within the limits of available resources and time.
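To make the two measures concrete, the following minimal sketch computes precision and recall for a single query. The asset identifiers and the relevance judgements are hypothetical and are not drawn from ARKive.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query.

    precision = relevant items retrieved / all items retrieved
    recall    = relevant items retrieved / all relevant items
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical query for "echolocation": the system returns four assets,
# while a lecturer judges five assets in the collection to be relevant.
retrieved = ["vid-012", "vid-044", "img-310", "img-977"]
relevant = ["vid-012", "vid-044", "vid-150", "img-311", "img-412"]

precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f} recall={recall:.2f}")
# prints: precision=0.50 recall=0.40
```

In this invented case half of what was returned is useful, but three of the five relevant assets are missed because they were never indexed under a term the query could match.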
It is useful here to refer to some well-defined 'specialist vocabularies' to get some insight into the scale of the challenge.
While any particular collection of multimedia will not necessarily contain objects which are appropriately indexed under many of the terms from each and every available specialist vocabulary, it is nonetheless possible that some terms from all of these will be applicable to some objects.
These issues are relevant to many other types of projects, not least those involved in the development of cross-searching of multiple databases which have been indexed using different indexing schemas (e.g. Clark, 2001). Much work is going on with respect to providing ways of robustly mapping between different schemas. These issues are discussed again below.
Indexing, Metadata, Interoperability and Ontologies
It is beyond the scope of this paper to review the extensive literature on the indexing and applications of 'metadata' (see below) to multimedia objects, the issues of interoperability and related technologies. See Gill and Miller (2002) for an overview of the key issues with regard to digital cultural content, and below for some examples of relevant projects. However a brief overview is necessary and useful in providing additional background to the remainder of the paper.
Metadata is broadly defined as 'data about data' (Gilliland-Swetland, 1998). The traditional library catalogue index card is a classic example of metadata. The publication date, author, title, publisher and Dewey Decimal code are 'metadata elements' within a clearly defined metadata schema and scheme (a list of metadata elements, the allowed states of those elements and the relationships between them).
As can be seen from this example, there are different types of metadata. Gilliland-Swetland (1998) distinguishes between 5 types:
Descriptive metadata is of most relevance to the challenges outlined above, but in principle the issues apply to all the types.
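As a rough illustration of elements and their allowed states, the sketch below models the catalogue card as a small schema and checks one record against it. The element names, the value checks and the classification code shown are assumptions made for the example, not part of any published cataloguing standard.

```python
# Hypothetical schema: element name -> check on the allowed values.
CARD_SCHEMA = {
    "title": lambda v: isinstance(v, str) and v != "",
    "author": lambda v: isinstance(v, str) and v != "",
    "publisher": lambda v: isinstance(v, str),
    "publication_date": lambda v: isinstance(v, int) and 1400 <= v <= 2100,
    # Digits with at most one decimal point, e.g. "025.04".
    "dewey_decimal": lambda v: isinstance(v, str)
    and v.replace(".", "", 1).isdigit(),
}


def validate(record, schema=CARD_SCHEMA):
    """Return a list of problems; an empty list means the record fits."""
    problems = [f"unknown element: {k}" for k in record if k not in schema]
    problems += [f"missing element: {k}" for k in schema if k not in record]
    problems += [
        f"bad value for {k}: {record[k]!r}"
        for k in schema
        if k in record and not schema[k](record[k])
    ]
    return problems


card = {
    "title": "Introduction to Modern Information Retrieval",
    "author": "Chowdhury, G. G.",
    "publisher": "Library Association Publishing",
    "publication_date": 1999,
    "dewey_decimal": "025.04",  # illustrative value only
}
print(validate(card))  # prints: []
```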
The interoperability of metadata, i.e. the ability of different information systems to inter-operate or be compatible with each other's vocabularies, is seen as a fundamentally important issue in the development of Web-based information systems (Gill and Miller, 2002). This is because it is valuable if two (or more) systems holding data on similar things can be reliably cross-searched and/or share data. Many standards, initiatives and projects are in place to develop systems that will be able to interoperate at a vocabulary and semantic (meaning) level (e.g. W3C 2002b, Miller 2001, see also below).
Part of the development of interoperable Web-based systems includes the creation of systems that utilize semantically interoperable ways of describing things, characteristics of things, and the relationships between them. These 'ontologies' (e.g. Ontologies W3C initiative, W3C 2002b) take the form of structured, machine-readable representations of knowledge.
Just like people need to have agreement on the meanings of the words they employ in their communication, computers need mechanisms for agreeing on the meanings of terms in order to communicate effectively. Formal descriptions of terms in a certain area (shopping or manufacturing, for example) are called ontologies and are a necessary part of the Semantic Web. RDF [Resource Description Framework], ontologies, and the representation of meaning so that computers can help people do work are all topics of the Semantic Web Activity.
These developments form what can be seen as part of a larger movement in Web technology development towards a more semantically interoperable Web (W3C 2001a, Berners-Lee et al 2001) in which information is globally interoperable.
There are significant difficulties with building ontologies and applying them to bodies of information. Ontology creation and application is a very specialized and time-consuming activity. Even more difficult is mapping between ontologies, especially those written by different communities of interest. An ontology provides a machine processable hierarchy of terms, but not all of the intentions of the ontology creator are encoded into the description of the ontology. Therefore mapping between them is prone to errors of interpretation.
Part of a Solution: Community of Interest/Expertise Annotation
The development of more semantically interoperable Web-based technologies seems to promise the ability to solve part of the challenge outlined above; namely, that of enabling machines (computers) to relate terms from different specialist vocabularies about what are essentially the same thing or concept and thus being able to map existing terms to the specialist vocabularies (including other languages).
However they do not provide a solution to the problem that members of specialist interest groups will want to describe (apply metadata to) data in ways relating to totally different concepts.
A simple example makes the issue clearer:
Imagine that there is a database of 100,000 images of people in a wide variety of different settings, say developed for a news agency. The database may be indexed using terms relating to identification of people (name, age, ...) and event (time, place, ...) as well as administrative, preservation and technical metadata. This is because those are the important characteristics to those who originally set up the database.
Now milliners might see great potential for studying how people use hats. The database is likely to be a very useful resource: they could search to see how the style of hats has changed over time, or what types of hats are most popular; they could answer many more specific questions, e.g. what percentage of women have bows on their hats, or wear a particular type of hat at a particular time of year? However, the database is not indexed using the concept of 'hat' and so it is not possible to interrogate it to find the answers to these questions.
Imagine now that someone else, a landscape architect, comes across the database and sees that it could be used to study how public seating is used in urban settings.
In each of these examples the collection could be of very great value to the user, but the existing indexing was not originally designed with these uses in mind, and so that value cannot be realized. It is in these cases that community annotation of a collection could offer the key to meeting these needs and thus greatly extend the scope and value of an on-line collection.
In the example above the individuals are from communities of 'milliners' and 'landscape architects'. They could annotate the images with specialist indexing terms used by their communities, ideally from ontologies developed to facilitate a semantically consistent representation. However it might be that they simply want to add 'notes' to particular images for others from their communities to find or make a hyper-link to a page of more detailed complementary information or case studies of that kind of example.
Models of Community Annotation
This section outlines a number of models of community annotation that we have identified, and that we believe help meet the diverse needs of different user communities and the database system developers such as ARKive.
Figure 1 shows via a much simplified Venn diagram some of the communities of interest of an ARKive type project. There are the more traditional target users, in ARKive's case, those with interests in biodiversity and wildlife media and their sub-communities A1, A2, A3 (e.g. different sub-disciplines, phases of education, ...), and ARKive's own staff. However there are other communities of interest (B, C, D and E) that lie outside or may have some degree of overlap with the original target community or communities.
Whilst the discussion here is focused on the use of community annotation to apply specialist indexing to objects, the majority of the issues discussed are the same for other kinds of annotation. Examples include case studies in the use of a particular image or media type, or notes relating to the object (e.g. what it shows, interesting facts, controversies).
Possibly the most critical issues related to community annotation for the organizations behind ARKive-type Web sites are related to the quality, accuracy and relevance of any annotation. One of ARKive's fundamental values is ensuring that it provides scientifically accurate and up-to-date information.
If users are given tools that enable them to add/link metadata to multimedia assets, many issues arise about how that annotation can, or should be, made available to other users of the system. Below we have outlined four models of annotation that we have developed to illustrate the issues.
Many Web-based projects and sites use some form of annotation to 'add value' to their data. Each example below utilizes one of the approaches above:
More generally the W3C are looking to develop annotation standards under the Annotea project (W3C, 2002a) to allow users to collaboratively annotate Web pages. In line with these developments, the standards which support the use of metadata descriptions are expanding; e.g. the MPEG-7 standard (Martínez, 2001) for the content description of multimedia includes a comprehensive Description Definition Language which allows complex description of multimedia objects.
These projects all provide tools that enable users of different types to add value to the 'collections' by adding annotation.
Implementation of Community Annotation Issues
Use of and Access to Annotation data
As can be seen above, the use of and access to any annotation is an issue inseparable from that of the underlying model of annotation. Probably the most fundamental issues are deciding who has access to any annotations and how any annotations (explicit, e.g. case studies, or implicit, e.g. search terms) are used and their use signaled. Some approaches follow.
Consistency and Quality Control
Another fundamental issue with respect to using community annotation to assist with indexing metadata is ensuring that there is consistency in both the terms used and the application of those terms, which relate back to precision and recall (see above).
The main solution to the issue of consistency has historically been to utilize a controlled vocabulary with clear instructions about what those terms relate to; e.g. library catalogue systems. In particular, controlled vocabularies appear to improve consistency where indexing is being conducted by a number of indexers (Markey, 1984). In the case of ARKive there is already a controlled (bespoke) vocabulary for a number of aspects of the data and metadata used to describe the multimedia objects.
In order to annotate objects with relevant terms/concepts, it seems necessary to provide not only a controlled vocabulary but also a highly structured conceptual framework on which the vocabulary is based. This is because of the very large range of concepts that are covered by ARKive content, including bio- and bio-geographic sciences, wildlife film making, conservation and sustainable development, and educational uses.
These are broadly quality control issues; key questions for any system will relate to the degree of quality control required for any annotations. This may extend from a check that annotations are not obscene and do not contravene legal requirements (e.g. libel laws) all the way to full multilevel verification by appropriate experts with formal sign-off of any new annotations.
The degree to which this is appropriate must depend on the nature of the system; e.g. a closed community in which members of the community are the only users who can access the annotation may require no formal quality control (other than legal issues) from the collection/Web site owners. However trusted community annotation that is accessible to all users may require significant quality control.
Tracking Annotation & Access Control Annotation Metadata
If community annotation is used, there must be systems in place to manage the new data, that is, to track and maintain the annotations. This requires metadata about the annotations themselves. Cross et al (2001) show how this kind of data can be created and maintained. The potential value of this data is significant, as it enables users (internal to the organization or external) to query the system with requests such as 'show me all the annotations to the collection (or a subset) made by person x or by members of community y'. This forms the basis of providing controlled access to the annotation data.
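A minimal sketch of such annotation metadata follows, assuming a plain in-memory record rather than the RDF approach described by Cross et al (2001). The field names, the sample records and the visibility rule are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Annotation:
    asset_id: str    # the multimedia object being annotated
    term: str        # indexing term or short note
    creator: str     # person who made the annotation
    community: str   # community of interest the creator belongs to
    created: date    # when the annotation was made
    reviewed: bool   # whether it has passed quality control


ANNOTATIONS = [
    Annotation("vid-012", "echolocation", "alice",
               "behavioral-biology", date(2002, 1, 14), True),
    Annotation("vid-012", "good teaching example", "bob",
               "ecology", date(2002, 2, 3), False),
    Annotation("img-310", "infanticide", "alice",
               "behavioral-biology", date(2002, 2, 20), False),
]


def annotations_by(annotations, creator=None, community=None):
    """Return annotations made by a given person and/or community."""
    return [
        a for a in annotations
        if (creator is None or a.creator == creator)
        and (community is None or a.community == community)
    ]


def visible_to(annotations, viewer_community):
    """Assumed access rule: reviewed annotations are public; unreviewed
    ones are visible only inside the creator's own community."""
    return [
        a for a in annotations
        if a.reviewed or a.community == viewer_community
    ]


print([a.term for a in annotations_by(ANNOTATIONS, creator="alice")])
# prints: ['echolocation', 'infanticide']
print([a.term for a in visible_to(ANNOTATIONS, "ecology")])
# prints: ['echolocation', 'good teaching example']
```

Because the creator and community are recorded with each annotation, the same records support both the "who annotated what" query and a simple access rule.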
A further requirement of any system of annotation focused on adding value to a collection by indexing is that it be extensible; i.e. that new terms can be added in a coherent and meaningful way.
For example, when a new term is added, it must be done in such a way that the concepts of which it is a sub-element retain conceptual integrity (e.g. pecking might be a sub-set of feeding or of defensive behaviors). Either the term pecking should not be applied to an object that lacks a related parent concept (e.g. feeding), or the parent concept must henceforth be made to apply to the object as well. Hence there may be a need to create new non-overlapping sub-categories of pecking; e.g. feeding:pecking and defensive-behavior:pecking.
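The option in which the parent concept is made to apply to the object as well might be sketched as follows. This is an illustration under simplifying assumptions (each term has at most one parent, and the class and function names are hypothetical), not a description of ARKive's vocabulary system.

```python
class Vocabulary:
    """A tiny controlled vocabulary; every non-root term has one parent."""

    def __init__(self):
        self.parent = {}  # term -> parent term (None for a root term)

    def add_term(self, term, parent=None):
        """Add a term, refusing duplicates and unknown parent concepts."""
        if term in self.parent:
            raise ValueError(f"term already defined: {term}")
        if parent is not None and parent not in self.parent:
            raise ValueError(f"unknown parent concept: {parent}")
        self.parent[term] = parent

    def ancestors(self, term):
        """Return the chain of parent concepts, nearest first."""
        chain = []
        current = self.parent[term]
        while current is not None:
            chain.append(current)
            current = self.parent[current]
        return chain


def tag(asset_terms, vocab, term):
    """Apply a term to an asset together with all of its parent concepts."""
    asset_terms.add(term)
    asset_terms.update(vocab.ancestors(term))
    return asset_terms


vocab = Vocabulary()
vocab.add_term("feeding")
vocab.add_term("defensive-behavior")
vocab.add_term("feeding:pecking", parent="feeding")
vocab.add_term("defensive-behavior:pecking", parent="defensive-behavior")

terms = tag(set(), vocab, "feeding:pecking")
print(sorted(terms))  # prints: ['feeding', 'feeding:pecking']
```

Rejecting a term whose parent is unknown keeps the hierarchy coherent as it grows, at the cost of forcing annotators to decide where each new term belongs.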
This simple example shows that the creation of any conceptual representation will be very problematic. However the current authors believe that without such a framework, effective use of annotation would be very problematic if not impossible to manage and monitor. Heflin and Hendler (2000) explore the complexities of making changes to formal ontologies and some of the many associated problems.
Other issues include how to deal with and represent 'controversial knowledge', 'fallacies', 'old knowledge' and other forms of 'inconsistencies' in any knowledge base. There are no simple answers to these problems. Once again the most appropriate solution will depend on the particular situation; e.g. in the case of relatively open annotation such as gimp-savvy.com (http://www.gimp-savvy.com/) it may be appropriate for any user to be able to add indexing terms (given legal considerations are dealt with, see above) whilst in the case of trusted community annotation, changes to the ontology or vocabulary used might require a formal meeting of some form of expert panel.
However whatever form it takes, we argue that there must be some system(s) to facilitate such extension of the available terms and concepts if the overall systems are to be effective and sustainable.
A further fundamental problem is that community annotation using extensible annotation vocabularies and schemes is post-hoc and thus, unless every object is systematically annotated, it is very likely that some objects will not be tagged with a new type of annotation, e.g. a particular indexing term, when they 'should' be. Thus the annotation or indexing becomes inconsistent across the collection. There seems to be no simple solution to this problem other than systematic annotation. However as outlined in the next section, the use of 'semantically aware' tools may provide a means of optimizing the completeness of annotations across a collection where (as in most cases) there are time and resource constraints.
One very interesting requirement that we have identified for all of the models described above is semantic bootstrapping; that is, when a collection has been indexed from one perspective (i.e. for its primary target use or user group) using one set of vocabularies, it is necessary to have or create some kind of 'semantic hook(s)' in the data to allow users to begin the process of indexing the collection from the new perspective, using the new vocabulary.
This is a form of semantic bootstrapping, conceptually related to the idea developed by Pinker (1984) to refer to his postulated process by which children semantically bootstrap or learn syntax from some form(s) of built in semantic categories and contexts.
Ontology-based tools (see above) could allow existing ontologies to be linked to the already-present vocabularies or ontologies via concepts common to both existing and new domains.
Concept extraction tools include the Non Zero Match tool developed at the University of Bristol (http://nzm.dig.bris.ac.uk/index.html). The tool allows users to auto-index text-based documents using concepts defined by a list of words/phrases with positive and negative weights. For example, the concept 'car' might be defined via the occurrence of a set of words or phrases such as 'registration number', 'steering wheel', 'make', 'model', etc.; the parser then processes the whole corpus of documents, indexing the documents under the appropriate concepts. Thus by using existing text or indexing/markup it would be possible to create new concepts to help bootstrap the new indexing.
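The general idea of weighted phrase lists can be sketched as below. This is not the Non Zero Match tool's actual algorithm, which is not described here; the weights, the threshold, the naive substring matching and the sample documents are all assumptions made for illustration.

```python
def concept_score(text, weighted_phrases):
    """Sum the weights of the phrases that occur in the text.

    Matching is a naive case-insensitive substring test (an assumption),
    so "make" would also match inside "maker".
    """
    lowered = text.lower()
    return sum(
        weight
        for phrase, weight in weighted_phrases.items()
        if phrase in lowered
    )


def index_corpus(corpus, concepts, threshold=1.0):
    """Index each document under every concept whose score meets the threshold."""
    index = {name: [] for name in concepts}
    for doc_id, text in corpus.items():
        for name, phrases in concepts.items():
            if concept_score(text, phrases) >= threshold:
                index[name].append(doc_id)
    return index


# Hypothetical concept definition with positive and negative weights.
CONCEPTS = {
    "car": {
        "registration number": 1.0,
        "steering wheel": 1.0,
        "make": 0.25,
        "model": 0.25,
        "toy": -1.0,  # negative evidence
    },
}

CORPUS = {
    "doc-1": "The make and model were noted with the registration number.",
    "doc-2": "A toy with a tiny steering wheel.",
    "doc-3": "Fashion model on the catwalk.",
}

print(index_corpus(CORPUS, CONCEPTS))  # prints: {'car': ['doc-1']}
```

The negative weight keeps the second document out of the index even though it contains a strongly weighted phrase, and the weakly weighted word alone is not enough to index the third.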
Another example is described by Bobrovnikoff (2000), using the DIPRE (Dual Iterative Pattern Relation Expansion) algorithm to recognize patterns in existing data. Auto-indexing of still and moving images also provides the potential to extract and index new concepts; e.g. in the example above of looking for hats in the database of images of people. See Campbell et al (1997) and Lew (2000) for examples of this approach.
There are various forms that semantic bootstrapping could take, with various levels of automation. It could be a time-consuming and highly skilled manual task, effectively re-cataloguing the database by hand, either by placing the images within an ontology used by a community of interest or by using a controlled vocabulary. At the opposite end of the spectrum, the images could be auto-classified using specialist tools for pattern recognition.
Somewhere in between is a stored search by a subject expert. For example, when people search Google for a particular topic, they use their knowledge of their subject area and their common sense coupled with their experience of the content of the Google database itself to choose search terms that will accurately retrieve the information they require. For example someone looking for 'flying things' would use more specific search terms like 'bird', 'helicopter', 'parrot'.
An annotated stored query of a database by someone with knowledge of the indexing terms used in the database and the specialist subject knowledge of the community of interest would enhance the value of the database to that community. Such an annotation would provide fast approximate information for that community. An example might be a search of a database for photos of people dressed for a 'formal event' to get pictures of hats.
If there were time, the user could go through the images found in this way to check if the retrieved pictures were in fact pictures with hats, discarding those that were not. However, even a quick annotated search could provide added value.
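A stored query of this kind might be represented as in the sketch below, assuming the existing index is a simple mapping from each asset to its set of index terms. The class, its fields and the sample data are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StoredQuery:
    label: str           # the community's own concept, e.g. "hats"
    community: str       # community of interest the query serves
    author: str          # subject expert who wrote the query
    required_terms: set  # existing index terms that must all be present
    note: str = ""       # free-text annotation explaining the query
    rejected: set = field(default_factory=set)  # assets discarded on checking

    def run(self, index):
        """Return matching asset ids; index maps asset id -> set of terms."""
        return sorted(
            asset
            for asset, terms in index.items()
            if self.required_terms <= terms and asset not in self.rejected
        )


# Hypothetical existing indexing of a news image database.
INDEX = {
    "img-001": {"formal event", "wedding", "outdoors"},
    "img-002": {"press conference", "indoors"},
    "img-003": {"formal event", "state opening", "indoors"},
}

hats = StoredQuery(
    label="hats",
    community="milliners",
    author="expert-1",
    required_terms={"formal event"},
    note="Approximate: people at formal events often wear hats.",
)

print(hats.run(INDEX))  # prints: ['img-001', 'img-003']

# Optional manual check: discard a retrieved image that shows no hat.
hats.rejected.add("img-003")
print(hats.run(INDEX))  # prints: ['img-001']
```

The unchecked query already gives the community a fast, approximate view of the collection; the rejected set records any manual refinement without changing the original indexing.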
Some form of semantic bootstrapping will be essential in making annotation work effectively for communities. Different types of semantic bootstrapping tools are likely to assist with different types of problem, and hence it is likely that what is needed is a suite of tools rather than one single tool or approach.
We are working to formalize the models outlined above and are developing more detailed technical requirements for the implementation of the models. The ideal is that we design an approach that can allow ARKive type organizations to implement any and all of the models of annotation outlined above.
In parallel we are investigating the advantages and disadvantages of the different models in different contexts in order to help developers make decisions about which ones are the most appropriate for their particular needs and contexts. Parallel investigation into different approaches to semantic bootstrapping and development of appropriate tool sets will continue.
Community annotation offers the developers of large multimedia database systems the ability to support specialist communities of interest and thus enhance the value of their data. There are many technologies available and under development that would support this approach; some projects are already utilizing them.
This paper has dealt primarily with the annotation of multimedia objects with specialist indexing/resource discovery terms and the associated technologies; however, the issues are similar for more generic types of annotation.
The four models of community annotation outlined in the paper provide a framework for the development of community-based approaches to enhance the value of Web-based museum and multimedia collections for specialist communities of interest.
There are many implementation issues that remain highly problematic, in particular the coherent and consistent extensibility of vocabularies and the development of semantic bootstrapping tools.
However, it will likely be possible, in the short to medium term, to find solutions by assessing needs and matching solutions in each specific case. In the longer term, we hope that the on-going development of a more semantically interoperable Web and associated technologies will lead to the creation of sets of approaches and tools to make the implementation of community-based annotation relatively simple and effective.
The ARKive-ERA project is funded by HP-Labs, Bristol. The authors wish to thank Dave Reynolds of HP Labs for discussing, exploring and expanding the ideas related in this paper.
Berners-Lee, T., Hendler, J. and Lassila, O. (2001) The Semantic Web, Scientific American, May 2001.
Buckingham Shum, S., E. Motta, et al. (2000). International Journal on Digital Libraries 3(3): 237-248.
Chowdhury, G. G. (1999) Introduction to Modern Information Retrieval, Library Association Publishing, London.
Clark, J. (2001). "Subject portals." Ariadne(29). Available on-line: http://www.ariadne.ac.uk/issue29/clark/
Cross, P., Miller, L., and Palmer, S. (2001). Using RDF to Annotate the (Semantic) Web. K-CAP Workshop Knowledge Markup & Semantic Annotation, Victoria B.C., Canada.
DLESE (2001). Digital Library for Earth System Education (DLESE) Web site. http://www.dlese.org/
Eakins, J. and M. Graham (1999). Content-based Image Retrieval, JTAP (Joint Technology Applications Programme).
European Environment Agency (2002). GEneral Multilingual Environmental Thesaurus (GEMET): The GEMET 2.0 Approach. Available on-line: http://www.mu.niedersachsen.de/cds/etc-cds_neu/library/Gemet.pdf
Gill, T. and P. Miller (2002). "Re-inventing the Wheel? Standards, Interoperability and Digital Cultural Content." D-Lib Magazine 8(1).
Gilliland-Swetland, Anne J. "Setting the Stage: Defining Metadata" in Introduction to Metadata: Pathways to Digital Information, Murtha Baca, ed. (Los Angeles: Getty Information Institute, 1998) Available on-line: http://www.getty.edu/research/institute/standards/intrometadata/2_articles/index.html
Heflin, J. and J. Hendler (2000). Dynamic ontologies on the Web. Seventeenth National Conference on Artificial Intelligence (AAAI-2000).
Lew, Michael (2000) Next-Generation Web Searches for Visual Content, IEEE Computer, 33(11) p46-53, November, 2000
Markey, K. (1984) "Interindexer Consistency Tests: A Literature Review and Report of a Test of Consistency in Indexing Visual Materials.", Library and Information Science Research, 6, 155-177.
Martínez, J. M. (2001). Overview of the MPEG-7 Standard (version 6.0), MPEG (Moving Picture Experts Group). Available on-line http://mpeg.telecomitalialab.com/standards/mpeg-7/mpeg-7.htm
Miller, P. (2001) Interoperability Focus homepage. Available on-line http://www.ukoln.ac.uk/interop-focus/.
Campbell, N. W., Mackeown, W. P. J., Thomas, B. T. and Troscianko, T. (1997). Interpreting Image Databases by Region Classification, Pattern Recognition (Special Edition on Image Databases), 30(4):555-563, April 1997.
Qualifications and Curriculum Authority (2002). National Curriculum Online Metadata Standard Overview, Available on-line http://www.nc.uk.net/metadata/
Schreiber, A. T., B. Dubbeldam, et al. (2001). "Ontology-Based Photo Annotation." IEEE Intelligent Systems May/June 2001: 2-10.
W3C. (2001a) Semantic Web. Available online: http://www.w3.org/2001/sw/.
W3C (2001b), XML-in-10-points, Available online: http://www.w3.org/XML/1999/XML-in-10-points/
W3C. (2002a) Annotea Project Homepage. W3C. Available: http://www.w3.org/2001/Annotea/.
W3C. (2002b) Web-Ontology (WebOnt) Working Group Homepage. W3C. Available: http://www.w3.org/2001/sw/WebOnt/.