Published: March 1999.
Using Primary Data to Design Web Sites for Public and Scientific Audiences
Peter Siegel and Nina Grigoryeva, American Museum of Natural History, USA
Introduction
Collections Management Systems have traditionally been designed for researchers, registrars and collections management staff. Re-examining digital material cataloged in these types of systems for the purpose of reaching broader audiences is the subject of this paper. It will encompass the design mechanisms that The American Museum of Natural History's Anthropology Department uses to deliver primary source material. Traditionally, curators would package a small subset of their collections for public consumption in the form of temporary and permanent exhibits. Electronic delivery of the same collection gives researchers and the general public the ability to create their own museum experience and opens new interpretations of the collections. To create a public-access web site, we began by identifying our audiences as well as the best way to represent electronic collection data. With the extensive possibilities of interface design and database robustness, re-purposing source material becomes possible. Using the same criteria for designing both the research and public web sites was not practical, because the research site requires considerable expertise in the subject matter. In the pages ahead we will explain the methodologies used to create these sites.
Content Creation
The content of this site has been created through projects supported by the NEH to develop a collections management system with a digital image database. These projects are part of a 25-year plan to modernize The American Museum of Natural History's Anthropology storage facilities, preserve the artifacts, and provide better access to the collections. At the project's inception there was no World Wide Web, no high-resolution digital cameras that were even remotely affordable, and image databases were uncommon. Through years of dedicated work and many learning experiences, the collections management system and image database have become fully integrated, from image capture and archiving of image metadata to the tracking of artifacts removed from the storage facilities.
When the 25-year plan was conceived, it was clear that a computerized collections management system was necessary. Today, Anthropology's system is one of the most advanced in the country given the size of the collection. The collections database is modular in design, allowing components to be added when necessary. This design flexibility allows the project to use the best possible technologies while safeguarding existing work and information. The collections management system contains the catalog, image, and accessions databases.
Designing a Delivery System
Before we started this project there were many innovative ideas germinating; however, the issue that kept recurring was: who was our audience? We knew we wanted an interface that was easy to navigate and not frustrating.
Our research web database requires that the user have specific research goals in mind before accessing the underlying data. The researcher may want to answer very specific questions pertaining to the ethnographic holdings at the AMNH, and the academic interface is needed to enable sophisticated searches. One might need to find out what material of Chinese origin, and how much of it, is in the country of Japan, or do a comparative study of basketry from the Northern Peoples of Russia to the South East Asian peoples of Indonesia. The research web database allows the researcher to discover what types of material culture scientists were bringing back to the AMNH during different periods of the Museum's collection history. The researcher can look for materials, such as silk, across various countries to possibly aid in the discovery of trade routes. It should be noted that submitting a query and getting a negative result back can be informative for the academic. For example, the researcher may be seeking evidence of transmigration or cultural trade by comparing various combinations of cultures and localities.
For the general public, however, getting no results from a query is discouraging, to say the least. To use the public web database, previous knowledge of the subject is not necessary: no matter what users select, they will receive a positive result. The structure of the public web database is close to the organization of objects displayed within the museum exhibition halls. Both the public web database and the Hall of Asian Peoples are laid out geographically. The site expands the visitor's experience because the hall displays only two to three percent of the collections.
After deciding on both web interfaces we compared the similarities and differences between them. Both sites use the same database and image files, which provides completeness and uniform quality of the presented information and familiarizes all users with research-level data. Both sites have similar layouts, making it easier for the user to eventually migrate to the expert search system. The main benefit of reusing the same data is the economy of time. Rebuilding a database and images with over 38,000 catalog entries and image scans would take years of work. There was no substantive budget for this project, so economy of human effort and hardware was essential.
Methodology
Our plan was simple: create a web database that can satisfy the academic researcher and the educated general public using the same source material. We felt the public interface should offer an easy point-and-click approach to getting specific information from the collections. After designing the two targeted interfaces, we prepared the original catalog database and created the public interface. The main tool we used to create this interface was Cold Fusion Studio. The research web database interface is designed with the researcher in mind; it allows the user to query by country, donor, culture, accession date, and/or free text search. The interface for the general public had to be simpler; a geographic reference and an object type are the criteria for accessing material culture. A contemporary political map of Asia was used for commonality. Creating a map based on collection date or culture would require a time line that could dynamically update geopolitical boundaries. While this would be a fascinating approach, it would be too complicated for the casual user, not to mention the developers.
Once these prototypes were linked to the database, we asked the exhibition department's Media Interface Evaluator to give the web site a test drive. We then took a sample of museum visitors and evaluated their responses to the public web interface. Then we asked museum researchers to test the research interface.
The basic difference between the two sites is that the public site has only two search conditions: country and object type.
…and can be sorted by:
Data Preparation
In order to make both sites easier to use, the database needed some house cleaning, especially for the general public web site. Data uniformity, especially in object naming, was the crucial element that needed work. Since there was no standardized vocabulary for ethnographic nomenclature available, one had to be developed. The Asian curator developed a prototype thesaurus from over 1,000 object names and associations from the original catalog nomenclature. After reviewing these associations, a database thesaurus was built. An important design consideration in this process was assuring that the thesaurus would not alter the original catalog data and that it could be discarded and/or exported at a later date if necessary (see Aside). It should be noted that this prototype thesaurus is not designed to be compliant with other institutions, but rather to make ethnographic object descriptions easier to understand.
(Aside) The Anthropology Department is part of a consortium to develop a comprehensive ethnographic thesaurus. This project will be designed to serve the scientific and museum/library community and be incorporated into the Collections Management System.
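One way to keep a thesaurus from touching the original catalog data is to hold it in a separate lookup table that is joined in only at query time. The sketch below illustrates that idea with Python's built-in sqlite3 module. The table names, column names, catalog numbers, and sample terms are all hypothetical; the paper does not describe the actual schema.

```python
import sqlite3

# Hypothetical schema and made-up sample rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE catalog (catalog_no TEXT PRIMARY KEY, object_name TEXT, country TEXT);
    CREATE TABLE thesaurus (original_name TEXT PRIMARY KEY, preferred_term TEXT);
""")
conn.executemany("INSERT INTO catalog VALUES (?, ?, ?)", [
    ("70/1", "Basket, w/ lid", "Japan"),
    ("70/2", "basket (carrying)", "Japan"),
    ("70/3", "Robe, silk", "China"),
])
# The thesaurus maps each original catalog name to a simpler public term.
conn.executemany("INSERT INTO thesaurus VALUES (?, ?)", [
    ("Basket, w/ lid", "Basket"),
    ("basket (carrying)", "Basket"),
    ("Robe, silk", "Clothing"),
])

def objects_by_term(conn, term):
    """Return (catalog_no, original object_name) pairs for a preferred term."""
    rows = conn.execute(
        "SELECT c.catalog_no, c.object_name FROM catalog c "
        "JOIN thesaurus t ON t.original_name = c.object_name "
        "WHERE t.preferred_term = ? ORDER BY c.catalog_no", (term,))
    return rows.fetchall()

baskets = objects_by_term(conn, "Basket")

# Discarding the thesaurus leaves the catalog rows exactly as they were.
conn.execute("DROP TABLE thesaurus")
catalog_after = conn.execute(
    "SELECT object_name FROM catalog ORDER BY catalog_no").fetchall()
```

Because the mapping lives in its own table, it can be exported or dropped without rewriting any catalog record.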
Searching the public web site is a two-step process. The user selects a country by clicking on a map of Asia. The image map passes the country name to the database; the database in turn populates a pick list with object names. The user then selects "all" or a specific object type. This country/object pair is then processed as a query when the user clicks on the submit button. To resolve the problem of the user getting back thousands of hits, the interface delivers the results in thumbnail mode (the default), nine objects at a time. After the search results are given, the user can change the view mode to "detailed" to see all textual information and larger images.
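The two-step search and the nine-at-a-time paging described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual site used Cold Fusion templates against the catalog database, and the record layout, function names, and sample data here are hypothetical.

```python
PAGE_SIZE = 9  # the text states nine thumbnails per page

# Made-up sample records standing in for the catalog:
# (country, object_type, catalog_no)
RECORDS = (
    [("Japan", "Basket", f"J-{i}") for i in range(20)]
    + [("Japan", "Mask", f"M-{i}") for i in range(3)]
    + [("China", "Robe", f"C-{i}") for i in range(5)]
)

def pick_list(records, country):
    """Step 1: object types that actually exist for the clicked country."""
    types = sorted({obj_type for c, obj_type, _ in records if c == country})
    return ["all"] + types

def search(records, country, object_type, page=0):
    """Step 2: run the country/object query and return one page of hits."""
    hits = [r for r in records
            if r[0] == country and (object_type == "all" or r[1] == object_type)]
    start = page * PAGE_SIZE
    return hits[start:start + PAGE_SIZE], len(hits)

choices = pick_list(RECORDS, "Japan")
first_page, total = search(RECORDS, "Japan", "all", page=0)
last_page, _ = search(RECORDS, "Japan", "all", page=2)
```

Building the pick list from the data itself means that every choice offered for a country with holdings leads to at least one result, which matches the goal of never returning an empty page to a casual visitor.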
The research web site was easier to develop than the public site. What is essential to the research web site is ease of use and performance. The research site could be designed with multiple selection options for any of the search criteria fields; however, there are prices to pay for more functionality. The interface design starts to get cluttered with information, buttons, and boxes, making it clumsier for the user to work with. Another problem with more features is that performance suffers. Before any change to the interface was made, we spent a lot of time evaluating the potential advantages and disadvantages.
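As an illustration of how the research interface's optional criteria (country, donor, culture, accession date, free text) might translate into a single database query, the following Python sketch builds a parameterized SQL statement from whichever fields the user filled in. The table name, column names, and the use of an accession year in place of a full date are assumptions made for the example, not details taken from the museum's system.

```python
def build_query(country=None, donor=None, culture=None,
                accession_year=None, text=None):
    """Build a parameterized SELECT from whichever criteria were supplied.

    Column and table names are hypothetical.
    """
    clauses, params = [], []
    exact_fields = (
        ("country", country),
        ("donor", donor),
        ("culture", culture),
        ("accession_year", accession_year),
    )
    for column, value in exact_fields:
        if value is not None:
            clauses.append(f"{column} = ?")
            params.append(value)
    if text:
        # Free text search is approximated here with a substring match.
        clauses.append("description LIKE ?")
        params.append(f"%{text}%")
    sql = "SELECT catalog_no FROM catalog"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params

sql, params = build_query(country="Japan", text="silk")
```

Each additional search field adds one more clause and one more form control, which is the trade-off between functionality and clutter discussed above.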
After the sites were up and running, we analyzed the public's response and went back to polish up the interfaces. Overall the system worked very well, but we know further development is necessary. We would like to use a more standardized nomenclature for our web database. Accomplishing these goals would extend far beyond the scope of this paper and would require considerably more time. However, we decided to use a geographical thesaurus from the Getty Institute and to further develop a true G.I.S.-driven approach to organizing and presenting data. A language module is another component not yet integrated into this system. As the internal system gets more sophisticated, so does the web system. In short, the work created is never lost, because the web system is still a derivative of our internal collections management system (primary source material) that can be updated easily.
To provide high-quality images without sacrificing performance, we created three derivatives of each image. The derivatives are separate resolutions selected according to the view the user chooses: thumbnail (120x90), detail (280x234), or full (756x512). One downside is the high maintenance burden of keeping multiple image files. Another possibility would be to use a full-resolution image and run a C.G.I. (Common Gateway Interface) applet or other script to scale down images on demand for the user; however, the performance hit would be too costly. Since storage space is so inexpensive, it was decided to create derivatives of the primary image. Yet another possibility would be to use a hierarchical file format that allows automatic scaling, similar to the Flashpix file format. Unfortunately, this format cannot be used on either our internal collections management system or the web databases without third-party software. This file format is not an ISO standard, and it requires the web server and web browsers to run applications and plug-in software. Presently we are testing AT&T's DjVu file format for delivering documents with graphics on the web.
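The arithmetic behind pre-computing derivatives can be sketched as follows. The three sizes come from the text; treating them as bounding boxes into which the source image is scaled with its aspect ratio preserved is an assumption, as are the function names and the 3000x2000 source size used in the example.

```python
# View sizes as given in the text (width, height in pixels),
# treated here as bounding boxes (an assumption).
DERIVATIVES = {
    "thumbnail": (120, 90),
    "detail": (280, 234),
    "full": (756, 512),
}

def fit_within(src_w, src_h, box_w, box_h):
    """Scale (src_w, src_h) to fit inside the box, preserving aspect ratio."""
    scale = min(box_w / src_w, box_h / src_h, 1.0)  # never upscale
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))

def derivative_sizes(src_w, src_h):
    """Pixel dimensions of every derivative for one source image."""
    return {name: fit_within(src_w, src_h, *box)
            for name, box in DERIVATIVES.items()}

# Hypothetical 3000x2000 source scan.
sizes = derivative_sizes(3000, 2000)
```

Computing and storing these files once, rather than scaling on each request, trades disk space and upkeep for response time, which is the choice the text describes.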
Technical Notes
This project uses Phase One Power Phase cameras to create images. The cameras are connected to Apple G3-class computers. The decision to use this equipment was based on extensive testing to ensure both the highest quality image and system integration. The Phase One image capture software provides a complete set of tonal, color, sharpening, and cropping tools, allowing full pictorial control. The Phase One system is ICC (International Color Consortium) compliant, ensuring the color quality of the scanned images. The images are saved in TIFF format, then archived to CD-ROM using the ISO 9660 standard.
The Anthropology collections management system is modular in design, which allows us to use the best equipment combined with interchangeable components. The system follows a client-server model that uses SQL (Structured Query Language) along with the Centura SQLBase7 engine, running on a Windows NT 4.x Intel platform. The client program for the collections management system is developed in the SQLWindows development environment. The images are converted to Kodak's PCD file format and archived onto CD recordable media in the ISO 9660 file format. The online images are stored in the JPEG file format. The modular design allows us to re-purpose the textual and image data for our web site projects.
The Web server runs on a separate computer and uses a copy of the collections database and image files. This redundant design was created to secure the museum's collections data. The Web server engine is Netscape Enterprise Server, which processes requests from clients. When the client requests a CFM file, the Cold Fusion Application server calls the Centura database through an ODBC (Open Database Connectivity) call. The Cold Fusion server dynamically creates the query results as HTML files. The Web and database servers reside on a Windows NT 4.x server with service pack 4… the buggy one.