published: April, 2002
Exhibits on Demand Project Goals and Approach
Joan C. Nordbotten, University of Bergen, Norway
Museums world-wide are deploying both virtual exhibits and multimedia collections for use by researchers, educators and the general public. With today's technology, users searching for thematic information from multiple autonomous sites must perform a series of separate processes to locate a reference list to relevant sites, search each site for relevant information, extract relevant data, and construct a local collection for off-line development of an integrated presentation. The user's problem can be summarized as a need for methods and tools to assist in locating, accessing, and extracting relevant information from multimedia, multi-database systems developed and maintained by autonomous museums.
Two principal problems hinder support for location and access to multiple data sources. First, there is a lack of agreement on how semantically consistent metadata for description of data collections should be created. Second, a user-friendly query language and processing system must be developed to support the formulation of search criteria, search in a multi-database space, and integration and presentation of the search results.
This paper presents the motivation, goals, and approach taken for a newly started project that aims to develop methods and tools to address these problems by integrating and extending existing methods and tools developed separately for metadata, multimedia, and multi-database management. The primary goal is to develop a system to support dynamic generation of specialized collections as the result of an easy-to-use query language to multiple museum collections, i.e. a system to support exhibits-on-demand.
Keywords: virtual exhibits, multimedia database management, metadata, information retrieval, query processing
1. Museums on the Web - an informal report from the perspective of information retrieval
Museums world-wide have been deploying virtual exhibits onto the Web since the mid 1990s. The basic structure used for virtual exhibits is a set of Web pages consisting primarily of text, images, and links from either the text or images to similar pages or image enlargements, respectively. Virtual exhibits are frequently large, often much more than 100 pages. Usage studies have shown that the average viewer of an in-house PC-exhibit selects fewer than 20 pages (Yamada, 1995 and Shneiderman, 1989), and a study reported by Nordbotten (2000) found that visitors to a Web-based exhibit selected, on average, fewer than 10 pages. Most virtual exhibits therefore contain far more information than an individual visitor will select.
Most virtual exhibits are self-contained in the sense that a viewer can select only the predefined exhibit pages using predefined links. In some exhibits, for example the Bhutan Exhibit at www.bhutan.at, a keyword search facility has been included to help the viewer find specific information within the exhibit. If we assume that the purpose of implementing a virtual exhibit is both to inform and to arouse curiosity about the topic, then the self-contained exhibit may not support the latter goal, since there is generally no support for retrieving additional information from outside the exhibit page set.
Many museums, in addition to deploying virtual exhibits, have also made electronic versions of (some of) their collections accessible from their Web site. This makes large quantities of information available to researchers, educators, students, and the general public. However, the electronic collections are not commonly accessible from the virtual exhibits unless the Web site navigation bar contains a link to the collections and is constantly available to the virtual exhibit visitor, as done in the Web site of The State Hermitage Museum at http://www.hermitagemuseum.org/html_En/index.html.
A few examples of electronic museum collections, showing diversity of content themes, include:
The State Hermitage Museum, St. Petersburg, Russia, gives an extensive presentation of the museum and its collections at http://www.hermitagemuseum.org/html_En/index.html.
Various search tools have been made available for these collections, including some combination of:
a) View similar search, based on matching catalogue descriptions to those of a previously retrieved object.
b) Refine search, for modifying a previous search to expand or restrict the result set.
The results from a collection search are frequently listed without an apparent order. At least, the viewer is frequently not informed of either the list's total length or the relevance criteria used to order the result set. For example, a search for Indian Child* through the search facility of the Smithsonian American Art Museum at http://americanart.si.edu/study/ gave a result set in which the first 3 items were:
If the collection consists of images, the results commonly contain a list of thumbnail images, linked to a larger version of the image, plus an annotation giving the artist/creator, date, title, and perhaps location. If the collection consists of documents, the titles are listed and linked to the full text. Associations/relationships between the objects in the result set are not given, even though it is likely that there is a relationship between objects and documents describing them, particularly if the object has been used in an exhibit.
1.1 Information retrieval problems
The search strategies outlined above have a number of well-known problems based on the nature of communication and the mismatch between the knowledge level and language of the viewer and that of the creator of the collection. Another problem area lies in the (lack of) sophistication of the software search engines. For example:
These problems are well documented in the information retrieval literature, as presented by Kowalski & Maybury (2000) and Baeza-Yates & Ribeiro-Neto (1999). There are also strategies for improving information retrieval from document collections that should be adapted to the more complex problem of information retrieval from multimedia museum collections.
1.2 Result presentation
Presentation of the results of an information search as an unordered list indicates only that each retrieved object has some relation to the selection criteria. This may be sufficient if the user simply wants a list of all objects in a particular category, for example by a given artist. However, more information about the result collection is available and could be utilized in presentation of the result set. Simple orderings could be by artist, chronologically by date of the objects, or by subject, style, or material. In addition, combining ordering criteria may give the viewer even more information.
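As a minimal sketch of such orderings, the following Python fragment sorts a small result set chronologically and by a combination of artist and date. The field names and the sample records are invented for illustration and do not come from any of the collections discussed.

```python
from operator import itemgetter

# Invented result set; the field names are assumptions.
results = [
    {"artist": "Artist B", "year": 1893, "title": "Work 2"},
    {"artist": "Artist A", "year": 1890, "title": "Work 3"},
    {"artist": "Artist B", "year": 1885, "title": "Work 1"},
]

# Simple ordering: chronologically by date of the object.
chronological = sorted(results, key=itemgetter("year"))

# Combined ordering: by artist first, then by date within each artist.
by_artist_then_year = sorted(results, key=itemgetter("artist", "year"))
```

With the combined key, the works of one artist appear together and in date order, which the purely chronological list does not show.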
Again, from information retrieval research and practice, the results can be ordered by relevance to the search criteria, assuming that multiple criteria are given and/or there is an importance/relevance distinction between matches in the title, keywords, and description of the objects.
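A minimal sketch of relevance ordering of this kind follows. It scores each catalogue record by counting query-term matches in its title, keywords, and description, with a different weight per field. The weights, the field names, and the sample records are assumptions made for illustration only; they are not taken from any of the museum systems observed.

```python
# Hypothetical field weights: a title match counts more than a description match.
FIELD_WEIGHTS = {"title": 3.0, "keywords": 2.0, "description": 1.0}


def relevance(record, query_terms):
    """Add the field weight for every query term found as a word in that field."""
    score = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        words = record.get(field, "").lower().split()
        for term in query_terms:
            if term.lower() in words:
                score += weight
    return score


def rank(records, query_terms):
    """Return the matching records, most relevant first; drop records with no match."""
    scored = [(relevance(r, query_terms), r) for r in records]
    scored = [(s, r) for s, r in scored if s > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for _, r in scored]


# Invented sample records.
records = [
    {"id": "a", "title": "Portrait of a child", "keywords": "portrait oil",
     "description": "A seated child"},
    {"id": "b", "title": "Harbour view", "keywords": "ships sea",
     "description": "A child watches the ships"},
    {"id": "c", "title": "Still life", "keywords": "fruit",
     "description": "Apples and pears"},
]

ranked = rank(records, ["child"])
```

Reporting the length of `ranked` and the weights used alongside the list would also give the viewer the total list length and the ordering criteria.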
Finally, the observed sites do not provide cross-referencing between collections, for example from an image collection to a document collection or to presentation within a virtual exhibit. Providing these links could significantly increase the information about the objects in the result set.
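The cross-referencing described here can be sketched as a simple join, under the assumption that the image and document collections share an object identifier. That identifier, the field names, and the sample entries below are hypothetical.

```python
from collections import defaultdict

# Invented entries from two separate collections of the same museum.
images = [
    {"object_id": "obj-1", "image": "img-001"},
    {"object_id": "obj-2", "image": "img-002"},
]
documents = [
    {"object_id": "obj-1", "doc": "doc-17"},
    {"object_id": "obj-1", "doc": "doc-42"},
]


def cross_reference(images, documents):
    """Map each image to the documents that describe the same object."""
    docs_by_object = defaultdict(list)
    for entry in documents:
        docs_by_object[entry["object_id"]].append(entry["doc"])
    return {
        img["image"]: docs_by_object.get(img["object_id"], [])
        for img in images
    }


links = cross_reference(images, documents)
```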
2. Designing virtual exhibits on demand: applying a database management perspective
Museum exhibits, and their virtual counterparts, are crafted by hand on a case-by-case basis and include only a small portion of the museum's real and electronic collections. Development of an exhibit takes months and several man-years of effort. The resulting exhibit is static in the sense that users cannot extend or tailor the content for their own special needs; for example, to form a specialized exhibit as an element in an educational context. Supporting user-developed exhibits, or specialized user-defined collections, requires user access to the underlying data collections, possibly from multiple museums, as well as tools for search, retrieval, and presentation.
Museums maintain, and have made available on the Internet, an increasingly diverse set of electronic multimedia databases. As high quality recording and scanning equipment becomes affordable, document and image collections are being supplemented with audio, video, film, and 3D collections. Parallel digitalization activities have led to the development of sets of separate but related electronic data collections or databases. Given that museums house overlapping collections, the result is a large set of inter-related, Internet accessible multimedia, multi-database systems.
Two principal problems hinder support for location and access to multiple data sources.
Our project aims to develop methods and tools to address these problems by integrating and extending existing methods and tools developed separately for metadata, multimedia, and multi-database management. Our interest lies in developing methods and tools for assisting development of virtual exhibits from a collection of underlying multimedia databases.
2.1 A user scenario
A student or teacher wishes to locate information on
the use of precious metals and gems in royal jewelry in Europe during the middle ages.
Relevant information is located in national museums, as well as national archives, libraries, and collections maintained for the royal families.
Using current Internet search engines, our user must perform the following tasks:
Known problem points in the above scenario:
The problems can be summarized as a need for methods and tools to aid users in locating, accessing, extracting, and presenting relevant information from multimedia, multi-database systems developed and maintained by autonomous museums.
2.2 Current status in multimedia, multi-database management
There are a number of theories, methods, tools, and IT systems that can be extended to provide a solution for accessing autonomous multimedia, multi-database systems.
Resource Location & Metadata development
Location of relevant data and information requires that metadata describing the semantic content of each collection be made available in a standardized format that supports semantic integration [Bearman and Trant 1998]. Numerous metadata standardization activities are in progress; perhaps the best known are the development efforts behind Dublin Core, in which there are 13 working groups, including one with museum representatives. The Dublin Core effort began from the requirements of the Digital Library community, and proposals for extension to other cultural application areas have been made, for example in [Bearman et al. 1999]. Other metadata proposals developed for museum collection description include CIDOC [Doerr 1998] and the Warwick Framework [Lagoze 1996]. Proposals for description of the semantic content of general multimedia can be found in [Lu 1999, Marcus 1996, Subrahmanian 1998, and Wu et al. 2000].
The Dublin Core proposal (1999) consists of 15 basic metadata elements for describing aspects of media objects or resources. Controlled vocabularies are recommended for certain descriptive values; for example, for subject and type. However, it is well known that the use of controlled vocabularies for developing collection metadata/indexes requires trained users, since the vocabulary seldom matches that of general public users [Hillman 2001]. This problem is also well known in the information retrieval community, where much research has gone into linguistic-based methods for selecting descriptive document terms and establishing thesauri to aid the match between user query terms and the collection index terms [Baeza-Yates 1999, Kowalski 2000]. The problem is also well known in the multi-database research community, where structural analysis of database schemas has been the dominant approach for synonym resolution [Elmagarmid et al. 1999].
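To illustrate the vocabulary mismatch, the sketch below describes one object with a few Dublin Core elements (title, creator, subject, type, date) and uses a small thesaurus to map everyday query terms onto controlled subject terms. The element names are those of Dublin Core; the thesaurus entries, the record values, and the function names are invented for illustration.

```python
# Hypothetical thesaurus: everyday user terms mapped to controlled index terms.
THESAURUS = {
    "jewelry": "regalia",
    "jewellery": "regalia",
    "crown": "regalia",
    "goldwork": "goldsmithing",
}

# One resource described with a subset of Dublin Core elements (values invented).
record = {
    "title": "Ceremonial crown",
    "creator": "Unknown",
    "subject": ["regalia", "goldsmithing"],
    "type": "Image",
    "date": "1450",
}


def index_terms(user_term):
    """Return the user's term together with its controlled equivalent, if any."""
    term = user_term.strip().lower()
    return {term, THESAURUS.get(term, term)}


def matches(record, user_term):
    """True if any controlled subject term corresponds to the user's term."""
    wanted = index_terms(user_term)
    return any(subject.lower() in wanted for subject in record["subject"])
```

Without the thesaurus, a query for "jewelry" would miss this record, because the indexer chose the controlled term "regalia".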
Data access and retrieval
As noted above, Internet search engines retrieve lists of Web site URLs after matching the user's keyword request with site indexes constructed from the metadata available to the search engine via crawler activity. No actual data is returned to the user, who must then continue the search using the URL list and the established links on the referenced Web sites.
Multimedia database management systems, including document retrieval systems, also use a keyword-based search, but can return actual resources (multimedia or document objects). These systems can also search for resources similar to a given resource, perhaps from the response to an initial keyword search [Baeza-Yates 1999, Kowalski 2000, Lu 1999, and Wu et al. 2000]. The problem with current approaches is that the search engine is specialized to one data type (text, image, video, audio, or spatial), thus requiring multiple queries, one to each data collection. Current object-relational DB management systems, for example IBM's DB2, can search for multiple multimedia data types within one database by combining access methods. However, these systems do not address the autonomous multi-database problem.
Current multi-database systems can search and retrieve data from multiple source databases, but only structured databases managed by relational or object-oriented DB management systems.
In both the multimedia and multi-database approaches, extensions to SQL3 are proposed and used, perhaps with a form interface. The major problem here is that SQL3 is not a user-friendly language: it requires user knowledge of the structure of the underlying database to be searched, an impractical requirement in an environment with a large number of multimedia data collections.
Query results are generally given as a list of titles and/or thumbnail images linked to the objects/resources that are considered relevant to the search query. The list may be ordered by a relevance match between the query terms and the object descriptors. If the result is to be formed into an exhibit, the user must construct the exhibit manually.
2.3 A database perspective of exhibit construction
Figure 1 shows the main components for a planned ICT (Information and Communication Technology) system for construction of a virtual exhibit from the results of a multi-database query. The lower section of the figure illustrates the multimedia multi-database set, i.e. the environment of interest for retrieving information for an exhibit. Note that the figure shows only the databases for one museum/organization. The databases are assumed to be managed by an object-relational database system (for example Oracle or DB2) containing SQL3 as its access language and search functions for document, image, video, audio, and spatial/3D media objects. A catalog describes the objects in the database set.
The upper part of Figure 1 shows the main components of a user interface for retrieving information from the database collection and presentation of the results as a virtual exhibit.
Resource Location & Metadata development
The interface is based on a semantic model of the object catalogs so that similar objects can be identified in the separate source databases. The ICT implementation of the model is termed the semantic schema and contains a union of the local collection metadata as well as term thesauri for the collections.
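One way to picture such a semantic schema is as a union of local catalogue descriptions, keyed by shared concepts. The sketch below is only an illustration of that idea and not the project's design; the database, table, and column names are hypothetical.

```python
# Hypothetical local catalogues: each maps shared concepts to its own column names.
local_catalogs = {
    "image_db": {
        "table": "images",
        "fields": {"creator": "artist_name", "date": "year_made"},
    },
    "document_db": {
        "table": "documents",
        "fields": {"creator": "author", "title": "doc_title"},
    },
}


def build_semantic_schema(catalogs):
    """Map each concept to every (database, table, column) that holds it."""
    schema = {}
    for db_name, catalog in catalogs.items():
        for concept, column in catalog["fields"].items():
            schema.setdefault(concept, []).append(
                (db_name, catalog["table"], column)
            )
    return schema


semantic_schema = build_semantic_schema(local_catalogs)
```

Looking up a concept such as "creator" then identifies which source databases hold comparable information, and under which local names.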
Data access and retrieval
The SemQL query language is an extension of SQL3. Based on information from the semantic schema, SemQL is able to construct local queries to those databases that have data relevant to the user query. SemQL also has a user interface that is similar to those used in the advanced search functions of Web search engines with the addition of an interactive dialog system to assist with early refinement of the user query.
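The construction of local queries from a semantic schema can be sketched as follows. This is not SemQL itself, whose syntax is not given here; it is a minimal illustration that uses plain SQL and Python's built-in sqlite3 module with two in-memory databases standing in for autonomous collections. All schema entries, table names, and rows are invented.

```python
import sqlite3

# Hypothetical semantic schema: concept -> {database: (table, column)}.
SEMANTIC_SCHEMA = {
    "creator": {
        "image_db": ("images", "artist_name"),
        "document_db": ("documents", "author"),
    },
    "date": {"image_db": ("images", "year_made")},
}


def local_queries(concept, value):
    """Build one parameterised SQL query per database that holds the concept."""
    queries = {}
    for db_name, (table, column) in SEMANTIC_SCHEMA.get(concept, {}).items():
        # Table and column names come from the trusted schema above;
        # the search value is bound as a parameter.
        queries[db_name] = (f"SELECT * FROM {table} WHERE {column} = ?", (value,))
    return queries


def make_db(create_sql, insert_sql, rows):
    """Create an in-memory database standing in for one source collection."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    conn.executemany(insert_sql, rows)
    return conn


databases = {
    "image_db": make_db(
        "CREATE TABLE images (id TEXT, artist_name TEXT, year_made INTEGER)",
        "INSERT INTO images VALUES (?, ?, ?)",
        [("img-1", "Artist A", 1850), ("img-2", "Artist B", 1900)],
    ),
    "document_db": make_db(
        "CREATE TABLE documents (id TEXT, author TEXT)",
        "INSERT INTO documents VALUES (?, ?)",
        [("doc-1", "Artist A")],
    ),
}


def run(concept, value):
    """Send the local queries only to the databases relevant to the concept."""
    results = {}
    for db_name, (sql, params) in local_queries(concept, value).items():
        results[db_name] = databases[db_name].execute(sql, params).fetchall()
    return results


hits = run("creator", "Artist A")
```

A query on "date" reaches only the image database, since the hypothetical schema records no date column for the document database.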
A virtual exhibit construction module presents the results of queries to the system. The presentation module can construct a set of Web pages, a topic exhibit, for presentation of the query response. The resulting exhibit can be further modified by refining and resubmitting the SemQL query.
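As a rough sketch of what a presentation module of this kind might do, the fragment below turns a list of query results into a single Web page. The function name, the result fields, and the example.org URLs are placeholders chosen for illustration, not part of the planned system.

```python
from html import escape


def exhibit_page(topic, items):
    """Render query results as a simple HTML page with one link per object."""
    rows = "\n".join(
        f'<li><a href="{escape(item["url"])}">{escape(item["title"])}</a></li>'
        for item in items
    )
    heading = escape(topic)
    return (
        f"<html><head><title>{heading}</title></head>\n"
        f"<body><h1>{heading}</h1>\n<ul>\n{rows}\n</ul></body></html>"
    )


# Invented query results.
items = [
    {"title": "Gold & gems", "url": "http://example.org/objects/1"},
    {"title": "Royal seal", "url": "http://example.org/objects/2"},
]

page = exhibit_page("Royal jewelry", items)
```

Resubmitting a refined query and rendering again would produce a modified exhibit without hand-editing the pages.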
2.4 The virtual exhibit project
Bergen Museum, in Bergen, Norway, has embarked on a project, described briefly by Ramirez (2001), to establish and publish a number of different multimedia databases implemented by theme and media type. The project has funding from a grant from the Norwegian National Research Foundation (NFR), and the available databases can be found at http://museum.uib.no. The collection of databases currently includes 3D databases for sculpture and insects, a video database of centipedes, a film database from social anthropology projects, plus various text-document and image databases. At this time, the databases are not inter-connected and are not connected to virtual exhibits. When the first phase of the database set has been implemented, the system is to be integrated with similar systems in other Nordic museums.
At the Department of Information Science, University of Bergen, we have begun a project in cooperation with the Bergen Museum project, described in Nordbotten (2001), to develop tools that can utilize their databases for development of virtual exhibits. The framework for the project is given in Figure 1 above. This project is also supported by NFR and has funding for 6 researchers and graduate students.
The primary project goal is to develop methods and tools for creating on-demand virtual exhibits from multiple multimedia systems, as illustrated in Figure 1 above. The new methods and tools are to be an integration and extension of appropriate methods developed separately for metadata, multimedia, and multi-database management.
Sub tasks for construction of a system for virtual exhibits on demand
The following methods and techniques will be developed in the coming 2 years:
Finally, a prototype system will be constructed and tested to demonstrate the feasibility and utility of the above techniques in an educational application.
The prototype will consist of 3 basic sub-systems:
Test and Evaluation
Three phases of evaluation are planned:
3. Project status
The Virtual Exhibits on Demand project has 3-year funding from 2002 and consequently is just formally beginning. We have a team of 9 graduate students working together to explore aspects of the design and implementation of the system outlined above. We expect that a number of master's-level theses will be developed, in addition to a working prototype that can serve as a demonstration system for the concepts and functions described.
We would appreciate any comments and suggestions that the reader can give.
References
Baeza-Yates, R. & Ribeiro-Neto, B. (1999). Modern Information Retrieval. Addison Wesley.
Bearman, D. and Trant, J. (1998). Unifying Cultural Memory. Information Landscapes for a Learning Society, 1998. And presentation at UK Office of Library Networking Conference, July 1998. Also at www.archimuse.com/papers/ukoln98paper/index.html
Bearman, D., Miller, E., Rust, G., Trant, J., and Weibel, S. (1999). A Common Model to Support Interoperable Metadata, Progress report on reconciling metadata requirements from the Dublin Core and INDECS/DOI Communities. D-Lib Magazine, Volume 5 Number 1, Also at http://www.dlib.org/dlib/january99/bearman/01bearman.html
Doerr, M. and Dionissiadou, I. (1998). Data Example of the CIDOC Reference Model. Epitaphios GE34604 Benaki Museum, Athens Greece. Also at http://www.geneva-city.ch:80/musinfo/cidoc/oomodel/epitaphios.htm
Dublin Core home site at http://dublincore.org
Dublin Core Metadata Element Set, V1.1 at http://dublincore.org/documents/1999/07/02/dces .
Elmagarmid, A., Rusinkiewicz, M., and Sheth, A. (1999). Management of Heterogeneous and Autonomous Database Systems. Morgan Kaufmann.
Hillman, D. (2001) Using Dublin Core. http://dublincore.org/documents/usageguide/
Kowalski, G.J. & Maybury, M.T. (2000). Information Storage and Retrieval Systems Theory and Implementation, 2nd ed. Kluwer Academic Publishers.
Lagoze, C. (1996). The Warwick Framework - A Container Architecture for Diverse Sets of Metadata. D-Lib Magazine, July/August 1996. ISSN 1082-9873.
Lu, G. (1999) Multimedia Database Management Systems. Artech House, London.
Marcus, S. & Subrahmanian, V.S. (1996). Towards a Theory of Multimedia Database Systems. In Subrahmanian & Jajodia, ed. Multimedia Database Systems. Springer-Verlag, 1996. pp. 1-35.
Nordbotten, J. (2000). Entering Through the Side Door - a Usage Analysis of a Web Presentation. Proc. Int'l Confr. Museums and the Web 2000. Minneapolis, MN, USA. April 17-19. Archives & Museum Informatics, 2000. p.145-151. Also at http://www.archimuse.com/mw2000/papers/nordbotten/nordbotten.html
Nordbotten, J. (2001). Virtual Exhibits - Theory, methods, and tools for development of virtual exhibits on demand. Project description. At http://www.ifi.uib.no/staff/joan/VM-project/project_description.htm.
Ramirez, E.A. (2001) Structuralizing Multimedia Data in Museums. The Use of Internet and Video and Scanned 3D Objects for Our Natural History and Science Museums. In Proc. of the ICOM International Conference in Barcelona, July 3, at http://www.lib.mq.edu.au/mcm/world/icom2001/ramirez.html.
Shneiderman, B., et al. (1989). Evaluating Three Museum Installations of a Hypertext System. Journal of the American Society for Information Science, 40(3), 172-182.
Subrahmanian, V.S. (1998). Principles of Multimedia Database Systems. Morgan Kaufmann.
Wu, J.K., Kankanhalli, M.S., Lim, J., and Hong, D. (2000). Perspectives on Content-Based Multimedia Systems. Kluwer Academic Publ.
Yamada, S., et al. (1995). Development and evaluation of hypermedia for museum education: validation of metrics. ACM Trans. on Computer-Human Interaction, 2(4), 284-307.
Museum sites referenced
Bergen Museum, Bergen, Norway, at http://mediabase.uib.no/Imagelib/
The National Library of Australia Picture Australia at http://www.pictureaustralia.org/.
The New York Botanical Garden The Virtual Herbarium at http://www.nybg.org/bsci/cass/
The Sculpture Center's Ohio Outdoor Sculpture Inventory, at
The Smithsonian American Art Museum in Washington DC, at http://americanart.si.edu/study/.
The State Hermitage Museum, St. Petersburg, Russia at http://www.hermitagemuseum.org/html_En/index.html.
TAMH: Tayside A Maritime History, Scotland, at http://www.tamh.org/.