A Spatial Approach for the Access, Manipulation and Publication of Digital Library Artifacts
Dion H. Goh and John J. Leggett, Texas A&M University, USA
Archives & Museum Informatics
Published: March 1999.
Introduction
We are living in the "late age of print" - information in the future will be produced, transmitted, and consumed in electronic form. The printed book will be largely replaced by the electronic book and today's static, paper-based library with its archaic indexing schemes will give way to dynamic digital libraries with flexible and efficient mechanisms for locating, organizing, and personalizing vast amounts of multimedia information. Increasingly, scholarly work involves a collaboration of geographically dispersed researchers, teachers, and students. Scholarly work in a digital library will be accomplished through coordinated access to shared information spaces via networks (such as the Internet). Users will organize their own private digital libraries, collaborate with colleagues through shared digital libraries, and have access to huge amounts of multimedia information in global, public digital libraries.
The George Bush Digital Library Project
The George Bush Presidential Library, located at Texas A&M University, presents a tremendous opportunity to develop the above ideas. The Center for the Study of Digital Libraries, together with the Bush Presidential Library and the Center for Presidential Studies (part of the Bush School of Government) at the university, has undertaken the task of establishing the first digital presidential library that will become a model for other presidential libraries. The presidential library contains an enormous amount of data in the form of nearly 40 million pages of documents, 1.5 million photographs, and 6000 hours of audio and video.
To fully exploit the potential of these resources, the George Bush Digital Library Project seeks to create a research and educational facility capable of serving thousands of researchers and students throughout the world via the Internet. Specifically, the digital library will:
The George Bush Digital Video Library Project
As part of this larger project, the George Bush Digital Video Library is currently being developed to implement and test many of the concepts, tools, and facilities envisioned for the entire digital library. The digital video library will eventually contain 6000 hours of digitized speeches given by President Bush together with their associated textual transcripts, as well as a set of network-based interactive tools that extend beyond search and retrieval operations. Put succinctly, the George Bush Digital Video Library Project aims to address the question: "Given the existence of a speech-based digital video library, which interactive tools would be most efficient and effective in enhancing the educational value and usability of the collection?" The basic premise of this project is that digital libraries must offer more than advanced collection maintenance and retrieval services since the ultimate goal of a library, whether physical or digital, is to serve the needs of its patrons whose objectives are often not solely the retrieval of information artifacts. Patrons instead seek these artifacts in order to manipulate and combine them to produce new information artifacts. For example, a study of library use by information analysts revealed that the retrieval of information is not an end in itself but rather the first step in an analyst's task (Levy & Marshall, 1995). Following the retrieval phase, analysts iteratively annotate and develop organizational structures over the information, and finally disseminate the results.
Thus, in addition to being a repository of information artifacts, a digital library should be an environment that supports the manipulation of these artifacts, the authoring of new artifacts, and the incorporation of new artifacts. Such a library may be viewed as a patron-augmented digital library, one in which both librarians and library patrons contribute to the evolution of the collection: librarians provide the seed material to form an initial collection while patrons augment the library with knowledge artifacts (semantically-derived organizations and annotations) over the existing collection.
The George Bush Digital Video Library project adopts this patron-augmented approach with the goal of extending the functionality of the digital library to allow researchers, educators, and students to peruse, compose, and publish knowledge artifacts in the collection to meet their informational needs. For the purposes of this project, knowledge artifacts will initially constitute synchronized mixed text and video hypermedia presentations and annotations. Using a web-based hypermedia authoring, presentation and publication system, patrons are able to search the library for video clips and textual transcripts, integrate the desired information artifacts together with any annotations to dynamically form a synchronized mixed text and video hypermedia presentation through the material, and finally publish the presentation back into the digital library if desired.
Scenario of Use
The following scenario illustrates how users might use the digital video library. Consider an educator preparing a lesson about the Bush presidency and the Soviet Union for her political science class. As a resource for her students, she decides to prepare a hypermedia presentation consisting of selected video clips and textual transcripts of speeches and press conferences given by George Bush on the subject. Pointing her web browser to the digital library's web site, she locates and launches an interactive tool that will help her with the task. She first performs a search for relevant material, the results of which are returned as links to the textual transcripts. By simply clicking on these links, the tool transparently connects with document and video servers and displays the materials in separate synchronized windows.
The educator's next step is to author the presentation. Using the same interactive tool, she uses familiar drag-and-drop operations to assemble the materials into a coherent organization and annotates as necessary to put them into the context of her lesson. At any point during authoring, the educator can view her presentation with a button click. The tool will then connect with document and video servers to obtain the necessary resources and begin the presentation. The educator can also search for more material at any time if the need arises. When she finishes authoring the presentation, she completes her task by publishing it into the digital library using the same tool. In the background, the tool connects with a publication server to register and store the presentation.
Once this process is complete, the educator informs her students about the presentation and sends them a URL, allowing them to access, view and interact with the presentation through the web. The students may use the described presentation authoring tool to personalize the presentation by adding annotations. This personalized knowledge artifact can be stored in their digital library space or turned in as part of a homework assignment.
User Interface Requirements for the Digital Video Library
The user interface is crucial to the realization of an interactive tool that supports the retrieval, manipulation and publication of information/knowledge artifacts in a digital video library. In the course of developing prototypes of such a tool, several user interface requirements were identified and will be outlined here.
1. Accessibility. Access to the digital video library via the World-Wide Web allows it to be used by almost anyone with minimal knowledge, on any computer (at home and at the office) equipped with an appropriate web browser. As the Internet grows in popularity, it is expected that a digital video library on the web will provide the greatest level of accessibility to users.
Synchrony
Synchrony is a web-based interactive tool written in Java that allows users to retrieve, manipulate and publish information/knowledge artifacts in the George Bush Digital Video Library. The interface is patterned on a spatial metaphor and represents a large, 2½-dimensional workspace in which users manipulate and organize library objects of different types such as text, video, and presentations.
The Synchrony interface is depicted in Figure 1 and consists of two major entities. The white background represents the workspace, much like a physical desktop on which items are placed. Library objects, that is, the information and knowledge artifacts in use by the patron, are positioned on this workspace. The direct manipulation paradigm allows these objects to be arranged and visually altered (through size and color) by the user to create information structures suitable to the current task. In addition, because the workspace is larger than the screen (essentially infinite in the X- and Y-axes), panning is supported to allow users to view different portions of the workspace by dragging on it with the mouse.
Figure 1. The Synchrony interface shown here with a set of library objects on the workspace.
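The panning behavior described above can be illustrated with a minimal sketch. It assumes the workspace is addressed in its own coordinate system and that the screen shows a fixed-size window onto it; the class and method names (Viewport, dragBy, isVisible) are hypothetical and are not taken from the Synchrony source.

```java
// Hypothetical sketch of panning over a workspace larger than the screen.
// None of these names come from Synchrony itself.
final class Viewport {
    private long originX;   // workspace x-coordinate shown at the screen's left edge
    private long originY;   // workspace y-coordinate shown at the screen's top edge
    final int width;        // visible width in pixels
    final int height;       // visible height in pixels

    Viewport(int width, int height) {
        this.width = width;
        this.height = height;
    }

    // Dragging the background by (dx, dy) pixels moves the content with the
    // mouse, so the visible origin shifts in the opposite direction.
    void dragBy(int dx, int dy) {
        originX -= dx;
        originY -= dy;
    }

    // Convert workspace coordinates to screen coordinates.
    long toScreenX(long workspaceX) { return workspaceX - originX; }
    long toScreenY(long workspaceY) { return workspaceY - originY; }

    // True if an object of size w x h at workspace position (x, y)
    // overlaps the visible area and therefore needs to be drawn.
    boolean isVisible(long x, long y, int w, int h) {
        long sx = toScreenX(x);
        long sy = toScreenY(y);
        return sx + w > 0 && sx < width && sy + h > 0 && sy < height;
    }
}
```

Under this assumption, library objects keep fixed workspace coordinates and only the viewport origin changes while the user pans.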
Synchrony Objects
Synchrony objects fall into four basic categories: queries, documents, presentations and containers.
1. Query objects represent the results of searches, with each query object representing one result set. In the current version of Synchrony, queries are performed against speeches stored in the digital library with results represented in a three-level hierarchy, such that the first level of the hierarchy contains information about the query itself, the second contains matches to entire speeches, and the third level indicates the paragraphs within each speech matching the query. Figure 2 depicts the results of the Boolean query "soviet union". Here, the query object, which is shown as a window on the workspace, indicates that there are 2 documents and 14 paragraphs matching the query. Documents in the query object are represented by their titles, and by clicking on them, their matching paragraphs (each represented by an initial number of characters) are displayed. Figure 2 also shows the document entitled "Inauguration Speech", revealing 1 matching paragraph.
Figure 2. A query object representing the search "Soviet Union".
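One way to picture the three-level hierarchy of a query object is as a small tree: the query at the root, matching speeches below it, and matching paragraphs below each speech. The following is a sketch under that assumption only; the class names (QueryResult, SpeechMatch) and the preview length parameter are hypothetical, and modern Java collections are used for brevity rather than the Java 1.2 API available to the original applet.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a query object; names are illustrative only.
final class QueryResult {
    final String queryText;                                 // level 1: the query itself
    final List<SpeechMatch> speeches = new ArrayList<>();   // level 2: matching speeches

    QueryResult(String queryText) {
        this.queryText = queryText;
    }

    // Register a matching speech and return it so paragraphs can be attached.
    SpeechMatch addSpeech(String title) {
        SpeechMatch match = new SpeechMatch(title);
        speeches.add(match);
        return match;
    }

    // Total number of matching paragraphs across all matching speeches.
    int paragraphCount() {
        int total = 0;
        for (SpeechMatch s : speeches) {
            total += s.paragraphs.size();
        }
        return total;
    }

    // Summary line of the kind a query window could display.
    String summary() {
        return speeches.size() + " documents, " + paragraphCount() + " paragraphs";
    }
}

final class SpeechMatch {
    final String title;                                     // speeches are shown by title
    final List<String> paragraphs = new ArrayList<>();      // level 3: matching paragraphs

    SpeechMatch(String title) {
        this.title = title;
    }

    SpeechMatch addParagraph(String text) {
        paragraphs.add(text);
        return this;
    }

    // A paragraph is represented by its first maxChars characters.
    static String preview(String paragraph, int maxChars) {
        return paragraph.length() <= maxChars
                ? paragraph
                : paragraph.substring(0, maxChars);
    }
}
```

Clicking a speech title in the query window would then correspond to listing the previews of that speech's paragraphs.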
Figure 3. A document object containing an information artifact.
Figure 4. A presentation object containing 2 sequences.
Figure 5. A container with a query object and 2 document objects.
Example
Returning to the scenario presented above, this section provides an example that shows how that scenario might be accomplished. To begin a Synchrony session, a user first points her web browser to the appropriate URL and logs in. An empty workspace is then presented as depicted in Figure 6.
Figure 6. An empty Synchrony workspace.
Figure 7. Querying and viewing documents.
Returning to the example, the educator organizes her document objects linearly to form a presentation in a top-to-bottom sequence as depicted in Figure 8. (Synchrony also supports a left-to-right sequence.) She may also view the presentation at any time by using her mouse to select these objects, whereupon Synchrony assembles them into a SMIL (Hoschka, 1998) presentation that is viewable through a RealNetworks G2 player (RealNetworks, 1998) as shown in Figure 9. If the educator realizes that more information is necessary, she can issue further queries and/or create new annotations, incorporating these into the presentation by the drag-and-drop operations described above. This process of querying, organizing and viewing is repeated as many times as necessary until the educator has all the material required. When authoring is complete, the educator can then formalize the presentation by creating a presentation object for it. The presentation object can then be published and retrieved for later use.
Figure 8. Authoring a presentation by organizing document objects in a top-to-bottom sequence.
Figure 9. A RealNetworks G2 player displaying a hypermedia presentation.
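The step in which selected document objects become an ordered presentation can be sketched as a sort on workspace position. This is an assumption about how a spatial arrangement might be linearized, not a description of Synchrony's actual algorithm; PlacedObject and SequenceBuilder are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical document object with a position on the workspace.
final class PlacedObject {
    final String label;
    final int x;
    final int y;

    PlacedObject(String label, int x, int y) {
        this.label = label;
        this.x = x;
        this.y = y;
    }
}

final class SequenceBuilder {
    // Return the selected objects in presentation order: by y-coordinate for
    // a top-to-bottom sequence, by x-coordinate for a left-to-right one.
    // The input list is left unmodified.
    static List<PlacedObject> order(List<PlacedObject> selected, boolean topToBottom) {
        Comparator<PlacedObject> byAxis;
        if (topToBottom) {
            byAxis = Comparator.comparingInt((PlacedObject o) -> o.y);
        } else {
            byAxis = Comparator.comparingInt((PlacedObject o) -> o.x);
        }
        List<PlacedObject> ordered = new ArrayList<>(selected);
        ordered.sort(byAxis);
        return ordered;
    }
}
```

A design along these lines keeps authoring informal: the user only arranges objects, and the order is derived when the presentation is viewed or formalized.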
Implementation
Synchrony is part of a suite of client-server tools that comprise the George Bush Digital Video Library. The Synchrony client described in this paper was developed as a Java applet and is accessible from Netscape web browsers capable of executing Java 1.2 code. The client works in conjunction with both the web browser and the Synchrony server to perform various tasks. For example, query and publication tasks are performed by sending requests to the server, while presentation-creation tasks are performed by first communicating with the server to assemble the media objects and then invoking the web browser to display the SMIL presentation with the RealNetworks G2 player. For storage and retrieval of the textual content of speeches, MG (Witten, Moffat & Bell, 1994), a public domain full-text indexing and retrieval system, is currently being used. Video segments of speeches, on the other hand, are delivered using the RealVideo server (RealNetworks, 1998). Note that while other browsers such as Internet Explorer are Java-enabled, Netscape is the browser of choice for this project because of its support for applet-to-browser communication.
Conclusion and Future Work
Digital libraries must offer more than advanced collection maintenance and retrieval services. Patrons do not retrieve information artifacts solely for their own sake; instead, they seek these artifacts in order to manipulate and combine them to produce new artifacts. Synchrony was designed as a response to this observation, and is an integrated, direct manipulation workspace supporting the access, manipulation and publication of information/knowledge artifacts in the George Bush Digital Video Library. Employing a spatial hypertext approach, Synchrony allows users to iteratively query for and select documents in the digital library, create annotations, assemble hypermedia presentations simply by organizing documents on the workspace, and finally publish these annotations and hypermedia presentations back into the digital library for future personal and, if desired, community use.
The next step in the evolution of Synchrony is to perform a pilot test of the system. This will determine if the interface supports the various tasks required in the authoring of annotations and hypermedia presentations, and identify any deficiencies and opportunities to be addressed in future versions of the software. Another area of work deals with video annotations. Currently, because of limited support for video in Java, Synchrony only allows playback of video segments. However, it is expected that with the release of the Java Media Framework 2.0 (Javasoft, 1998) and its support for video processing and playback within Java applets/applications, video will be a first-class object in Synchrony, allowing users to annotate video just as they are now able to annotate textual information.
A third area concerns the customization of hypermedia presentations. As shown in Figure 9, presentations currently consist of 3 regions per presentation sequence - a video window for presenting the video segment associated with a speech, an information artifact window for displaying the current speech segment, and an annotation window for displaying any associated annotations. The first two regions must be present while the third (the annotation window) is optional. Later versions of Synchrony will support full customization of these regions as well as other aspects of a presentation through modifications of SMIL playback parameters.
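The three-region layout described above maps naturally onto SMIL 1.0 layout regions. The sketch below generates such a document, with the annotation region emitted only when at least one sequence has an annotation. It is an illustration under stated assumptions rather than Synchrony's actual output: the region ids, the pixel geometry, the use of the SMIL `text` element for transcripts and annotations, and the class names Segment and SmilWriter are all hypothetical.

```java
import java.util.List;

// One presentation sequence: a video segment, its transcript, and an
// optional annotation. Field names are hypothetical.
final class Segment {
    final String videoSrc;
    final String textSrc;
    final String annotationSrc;   // null when the sequence has no annotation

    Segment(String videoSrc, String textSrc, String annotationSrc) {
        this.videoSrc = videoSrc;
        this.textSrc = textSrc;
        this.annotationSrc = annotationSrc;
    }
}

final class SmilWriter {
    // Escape characters that are unsafe inside a double-quoted XML attribute.
    static String attr(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace("\"", "&quot;");
    }

    // Build a SMIL 1.0 document that plays the segments one after another,
    // showing each segment's media in parallel in their own regions.
    static String write(List<Segment> segments) {
        boolean needsAnnotationRegion = false;
        for (Segment s : segments) {
            if (s.annotationSrc != null) {
                needsAnnotationRegion = true;
            }
        }
        StringBuilder out = new StringBuilder();
        out.append("<smil>\n  <head>\n    <layout>\n");
        // Region geometry below is an arbitrary illustrative choice.
        out.append("      <root-layout width=\"640\" height=\"480\"/>\n");
        out.append("      <region id=\"video\" left=\"0\" top=\"0\" width=\"320\" height=\"240\"/>\n");
        out.append("      <region id=\"text\" left=\"320\" top=\"0\" width=\"320\" height=\"480\"/>\n");
        if (needsAnnotationRegion) {
            out.append("      <region id=\"annotation\" left=\"0\" top=\"240\" width=\"320\" height=\"240\"/>\n");
        }
        out.append("    </layout>\n  </head>\n  <body>\n    <seq>\n");
        for (Segment s : segments) {
            out.append("      <par>\n");
            out.append("        <video src=\"").append(attr(s.videoSrc)).append("\" region=\"video\"/>\n");
            out.append("        <text src=\"").append(attr(s.textSrc)).append("\" region=\"text\"/>\n");
            if (s.annotationSrc != null) {
                out.append("        <text src=\"").append(attr(s.annotationSrc)).append("\" region=\"annotation\"/>\n");
            }
            out.append("      </par>\n");
        }
        out.append("    </seq>\n  </body>\n</smil>\n");
        return out.toString();
    }
}
```

In a scheme like this, customizing a presentation would amount to changing the region attributes or the timing elements before the document is handed to the player.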
References
Buchanan, M., and Zellweger, P. (1992). Specifying temporal behavior in hypermedia documents. ECHT '92. Proceedings of the ACM Conference on Hypertext, 262-271.
Cousins, S., Paepcke, A., Winograd, T., Bier, E., and Pier, K. (1997). The digital library integrated task environment (DLITE). Digital Libraries ’97 Proceedings, 142-151.
Furnas, G., and Rauch, S. (1998). Considerations for information environments and the NaviQue workspace. Digital Libraries ’98 Proceedings, 79-88.
Hoschka, P. (1998). Synchronized Multimedia Integration Language. Consulted January 5, 1999. Available: http://www.w3.org/TR/REC-smil/.
Javasoft. (1998). Java Media Framework API. Last updated December 22, 1998. Consulted January 5, 1999. Available: http://www.javasoft.com/products/java-media/jmf/index.html
Levy, D., and Marshall, C. (1995). Going digital: a look at assumptions underlying digital libraries. Communications of the ACM, 38, 4, 77-84.
Marshall, C., and Rogers, R. (1992). Two years before the mist: experiences with Aquanet. Hypertext ’92 Proceedings, 53-62.
Marshall, C., and Shipman, F. (1993). Searching for the missing link: discovering implicit structure in spatial hypertext. Hypertext ’93 Proceedings, 217-230.
Marshall, C., Shipman, F., and Coombs, J. (1994). VIKI: spatial hypertext supporting emergent structure. Hypertext ’94 Proceedings, 13-23.
RealNetworks. (1998). RealNetworks, the Home of RealAudio, RealVideo and RealFlash. Consulted January 5, 1999. Available: http://www.real.com/
Shipman, F., Furuta, R., Brenner, D., Chung, C., and Hsieh, H. (1998). Using paths in the classroom: experiences and adaptations. Hypertext ’98 Proceedings, 267-276.
Shipman, F., and McCall, R. (1994). Supporting knowledge-based evolution with incremental formalization. CHI 94 Proceedings, 285-291.
Van Rossum, G., Jansen, J., Mullender, K., and Bulterman, D. (1993). CMIFed: a presentation environment for portable hypermedia documents. Proceedings of the Conference on Multimedia '93, 183-188.
Witten, I., Moffat, A., and Bell, T. (1994). Managing Gigabytes. New York: Van Nostrand Reinhold.