Museums and the Web

An annual conference exploring the social, cultural, design, technological, economic, and organizational issues of culture, science and heritage on-line.

The Transition to Online Scholarly Catalogues

Nik Honeysett, J. Paul Getty Museum, USA

http://www.getty.edu

Abstract

The scholarly catalogue has long been a critical part of a museum's mission, providing authoritative information about collection objects for scholars, students, and the general public. Often based on years of painstaking research and richly illustrated, print catalogues form one of the building blocks of art history. The catalogue's print form, however, is arguably the very component that prevents it from realizing even greater potential. High cost and relatively small print runs limit its accessibility, and printed books cannot easily change to reflect new acquisitions or new scholarly knowledge. While the online environment holds much promise for making collection catalogues more current, interactive, and widely available, museums still face significant financial and organizational challenges in making the transition online. This paper describes the challenges and successes of nine institutions as they step through a project to deliver a scholarly catalogue to their online environment.

Keywords: online scholarly catalogues, electronic publishing, publishing, museum catalogues

Introduction

The Getty Foundation is currently funding the Online Scholarly Catalogue Initiative (OSCI), a five-year initiative to explore the potential for scholarly collection catalogues in an online environment, determine the institutional resources needed, and support the creation of replicable models: specifically, whether the online delivery of scholarly catalogues can:

  1. Offer a more dynamic relationship between a catalogue's research, publication, and re-publication phases
  2. Directly link a wide array of primary and secondary resources to the record of a work of art (resources ranging from archival and conservation documentation to audio and video interviews)
  3. Make greater use of comparative images

Eight museums have received support: SFMOMA, Art Institute of Chicago, Seattle Art Museum, Tate, Walker Art Center, the Freer Sackler Gallery, LACMA, and the National Gallery. All have completed planning and are now entering the implementation phase. Specifics can be found on the Getty’s website (http://www.getty.edu/foundation/funding/access/current/osci_fact_sheet.html).

There may be some existing models within the museum environment that institutions can draw upon as a starting point. For example, exhibition modules are discrete resources reflecting not only a physical installation but also some degree of scholarship around it, and may take the place of an exhibition catalogue where none exists. However, the major difference is that these exhibition modules are aimed at a broad audience and have a PR incentive; true scholarly work is not the primary focus.

What’s the Big Deal? 

There’s no question that museums are attuned to the process of bringing much of their (traditionally printed) content to the online environment – collections, curricula, exhibitions, etc. – so the principle is well established. So, how hard can it be to take a well-established scholarly editorial process that results in a print publication and make it available on the Web? Do the three goals that the Foundation’s initiative seeks to address really represent a transition? As the project participants are finding out, things are never that easy, for a number of reasons:

  • Paradigm Shift
  • Opportunity
  • Cost
  • Digital Rights and Permissions
  • Published or Data Driven?
  • Maintenance & Support

Paradigm Shift

What may be obvious to those of us who work in “new media” and Web departments in museums may not be obvious to scholars, curators and conservators. The opportunities that arise from moving a scholarly catalogue online centre on creating a rich and engaging digital resource. The tradition of a printed volume that has that ‘thump factor’ when it is dropped on a desktop as a manifestation of one’s work is significant. Curators are challenged with how an online work that may not be obvious as a discrete endeavour translates to their scholarly world. How do they refer to it in their resume? How do they send a comp copy to a fellow curator or Trustee? How does one gauge the acceptance of the publication within the curatorial community when ‘copies sold’ is no longer a measure? How do online publications support the road to tenure? If it can be updated when a re-attribution occurs, how do they know when the publication is finished? Is it ever finished? There are great benefits to an online catalogue, but they rely on taking advantage of the opportunities that the new medium provides: making the editorial process easy; fitting the system into the existing architecture; having the appropriate resources to support it; and ensuring that comprehensive access is planned for through services like Google Scholar and WorldCat.

The issue of curatorial buy-in is critical. One of the catalysts for OSCI was a meeting of the Association of Art Museum Curators in 2008, where this subject was discussed, but initiating the conversation within an institution outside of this project may present a challenge because it does represent a paradigm shift. What makes it a challenge is communicating its value, because the measures of success that curators have traditionally used are no longer there. What does success look like? What academic needs can this initiative service? What is the compelling reason to transition? The issue of transitioning to an online catalogue is part of the broader issue of museums coming to terms with what it means to be a cultural organization in the 21st Century, elegantly summed up in the 2010 Horizon Report, Museum Edition:

…the future of the museum may be rooted in the buildings they occupy but it will address audiences across the world - a place where people across the world will have a conversation. Those institutions which take up this notion fastest and furthest will be the ones which have the authority in the future. … the growing challenge is to ... encourage curatorial teams to work in the online world as much as they do in the galleries.
(Johnson, 2010)

Opportunity

Obviously an institution could simply choose to use its existing print production process, stop at the point of transmittal to the press, and post a PDF version of the catalogue to the website. This raises issues of accessibility and acceptance. How will an audience consume the catalogue? How will they find it? How will they use a PDF? While search engines are improving the ways they index PDFs, it is a constraining format, and there is little added value beyond the ability to search, and copy and paste the text electronically. However, the opportunity presented by transitioning to an online version is to create a much more engaging, useful, usable and citable version of its predecessor. To put it bluntly, why do it if there’s no added value? What does the Web environment offer that makes online catalogues more useful? How are researchers going to use them? What tools can be added to assist the scholar and the scholarly research process? It is at this point that an online scholarly catalogue becomes a much bigger deal, because to support this “version 2.0” requires revisiting the entire editorial and production process and identifying the new skills and tools that are required to support it. Furthermore, there is a greater expectation that the catalogue itself becomes less a snapshot in time, but much more a dynamic document that can reflect the institution’s scholarly research, and that of the field.

The writing and editorial process for a printed catalogue is well established and (relatively) simple. It is likely based on the exchange of a word processing document between editor and author until it is final, at which point it goes to a layout process and to press. The complexities that an online catalogue adds, assuming one takes advantage of the opportunities offered in the online world, can greatly increase the production effort. Providing front-end functionality such as bibliographic references that can pop-up, tombstone metrics that can be converted on the fly, and metadata that can be added to improve discoverability all require additional staff expertise and a mechanism to manage and create the product. All this immediately introduces additional production complexity.
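As a rough illustration of what "converted on the fly" implies for tombstone metrics, the sketch below stores dimensions once, in centimetres, and converts only at display time. The `Dimensions` class, the `format_dimensions` function and the one-decimal rounding are assumptions made for this example; they do not describe any participant's system.

```python
from dataclasses import dataclass

CM_PER_INCH = 2.54  # exact by definition


@dataclass(frozen=True)
class Dimensions:
    """Height and width of a work, stored once in centimetres (hypothetical model)."""
    height_cm: float
    width_cm: float


def format_dimensions(dims: Dimensions, unit: str = "cm") -> str:
    """Render the stored dimensions in the unit the reader asked for."""
    if unit == "cm":
        return f"{dims.height_cm:.1f} x {dims.width_cm:.1f} cm"
    if unit == "in":
        height_in = dims.height_cm / CM_PER_INCH
        width_in = dims.width_cm / CM_PER_INCH
        return f"{height_in:.1f} x {width_in:.1f} in"
    raise ValueError(f"unsupported unit: {unit}")
```

Because only one set of measurements is stored, a corrected measurement propagates to every unit the front end offers.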

Other opportunities exist in this medium which don’t exist in the print world – opportunities that dovetail into the larger online architecture that exists for an institution. For example, by including a ready-formatted citation within the publication, the institution can directly track citations to it. Opportunities like this hopefully add to the argument for securing curatorial support.

Cost

There is a belief that taking something that was once printed or physical and making it digital somehow eliminates much of the cost. This belief is pervasive at the museum executive level but is certainly not the case when transitioning to an online catalogue. Certainly, removing the print piece from the process would eliminate printing costs, but it does not eliminate the costs of writing, editing, photography/imaging, rights and graphic design. Transitioning to online adds additional costs that are not applicable to print and may include one or all of these: software programming, multimedia authoring, interaction/web design, and database administration. Cost is a significant factor for an institution, and this transition should not be taken on with cost-reduction as a key driver. Cost reductions may not be seen until processes and architectures are properly in place and an ongoing program of digital publishing is established.

Rights

While an institution may own the rights to its catalogue images, comparative images present a significant problem. A lending institution may be perfectly fine licensing image use for print publication: it is defined in scope, controlled in presentation and distribution, and attributable. In the online world, these constraints and controls do not apply. Most institutions have not yet come to terms with a medium that has no specific print run. The print licensing model is one where the number of people viewing an image is (largely) directly related to the fee, and where an additional print run means additional licensing fees. In the online world, to use another institution’s images increasingly requires a term fee - Web rights for 5 years, for example - but this is unsustainable because at the end of the term it likely means either a re-editing process or a degraded product. Indefinite online rights are unusual and/or expensive.

Published or Data Driven?

Printed catalogues may be pump-primed with the output from a database, but that is where dynamic data’s role stops: the printed catalogue represents a snapshot in time of the tombstone information of the works contained in it. Any institution with an online collections presence is likely using some kind of collections management system with an application (at the very least, some kind of Web publishing tool) to deploy the content to their website, so the opportunity exists to use that mechanism to publish the catalogue. The question then becomes: should the catalogue be published as a snapshot like its predecessor, or presented dynamically directly from data? Either option is possible, so, technology architecture aside, it becomes a policy or curatorial decision. To deliver dynamically presents the opportunity to make updates in real time, but likely the scholarly prose refers to the tombstone information, so there is a danger of introducing inconsistencies into the catalogue unless the institution commits to ensuring that any re-attributions, for example, are mirrored in the text.  A crucial factor in the decision-making process of transitioning online is the question of update frequency. It has a substantial impact not only on production methodology and architecture but also on ongoing support of the catalogue post launch.
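One modest safeguard against the inconsistency described above is to check the scholarly prose whenever a tombstone value changes. The sketch below is only an illustration under simple assumptions: essays are held as plain text keyed by a hypothetical entry identifier, and a case-insensitive substring match is enough to flag an entry for human review.

```python
from typing import Dict, List


def entries_needing_review(essays: Dict[str, str], old_value: str) -> List[str]:
    """Return the ids of entries whose prose still mentions a superseded tombstone value.

    `essays` maps a hypothetical entry id to its essay text; `old_value` is the
    value that was just replaced in the database (a former attribution, say).
    """
    needle = old_value.casefold()
    return sorted(
        entry_id
        for entry_id, text in essays.items()
        if needle in text.casefold()
    )
```

A check like this does not rewrite anything; it simply produces a list for an editor, leaving the decision about how the text should change with the curator.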

Maintenance & Support

A traditional print catalogue is printed and distributed. With the exception of storing inventory and ordering another print run when stocks are low, there isn't much else that is required. Not so for an online catalogue, which has much more of a life-after-publish and may bring with it additional costs, such as hosting fees, database licenses, etc., which may be small and subsumed within other operational costs, but are nonetheless a consideration. Again, the question of data integrity and timeliness arises. If a re-attribution occurs, should the catalogue be refreshed? There are a number of institutional policy-level decisions that arise for this digital transition. This is a great example of one of the challenges of technology deployment in museums. Too many technology-based projects are funded based on getting the product to launch, while not enough emphasis is placed on the consequences of maintaining that product for whatever its lifetime is determined to be.


So this is a big deal, bigger than one might assume at the outset. This paper attempts to point out the challenges, obstacles and solutions reported by the project participants as the initiative progresses, and to offer broad and generic recommendations under the guise of what works rather than what is good or best practice. Many participating institutions are retro-fitting the process into their existing architecture, perpetuating the organic nature by which museum technology architectures grow. The following sections represent topics that the group discussed at its latest convening in November 2010.

Technology Architecture

The project participants represent a cross-section of mid-sized and large museums. The existing architectures that these museums use to deliver their current online presence are as different as the museums themselves, often having grown organically over time. However, by and large they all follow a basic best-practices technology principle by adhering to a tiered approach; that is, the separation of data, application (or logic) and presentation. This modular approach offers a great deal of flexibility and scalability at the macro level of managing data – if data is separated from how it is presented, it can be presented on many different platforms and channels by applying different logic to it. For example, wrapping data in HTML allows presenting it in a Web browser; wrapping it in a PDF allows for a more controlled delivery.
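The separation of data from presentation can be sketched in a few lines. In the example below, one tombstone record (the data tier) is passed to two interchangeable renderers (the presentation tier). The record fields and the function names `to_html` and `to_text` are placeholders invented for illustration, not any participant's schema.

```python
from html import escape
from typing import Dict

# Data tier: a placeholder record, as it might be exported from a collections database.
RECORD: Dict[str, str] = {
    "title": "Untitled (Study)",
    "maker": "Unknown maker",
    "date": "n.d.",
}


def to_html(record: Dict[str, str]) -> str:
    """Presentation for a Web browser; values are escaped so they are safe in markup."""
    return (
        f"<article><h1>{escape(record['title'])}</h1>"
        f"<p>{escape(record['maker'])}, {escape(record['date'])}</p></article>"
    )


def to_text(record: Dict[str, str]) -> str:
    """Presentation for a plain-text channel, such as a label or an export file."""
    return f"{record['title']}\n{record['maker']}, {record['date']}"
```

Adding a further channel then means adding another renderer, while the record itself stays untouched.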

For all participants, their Collections Management System (CMS) is central to how they intend to deliver their online catalogue. The majority of participants have TMS (or EmbARK), though several have a custom-developed system. The ease with which the online catalogue process can be inserted into this environment is dependent on how the catalogue is to be presented, the frequency of its update, and the extent to which the existing architecture is open and flexible. An overriding challenge for all participants is that CMSs are designed to manage data; they were not designed to publish nor to provide an elegant editing environment, particularly one that supports rich text, symbols, hyperlinks, layering, embedded media, and some kind of presentation previewing. Institutions are attempting to solve this problem in a variety of ways, largely involving some kind of authoring tool layered over their CMS. In several cases, Drupal is being looked at as an authoring tool that deploys into a database rather than publishing to a website; in other instances, institutions are supporting the curator’s desire to continue to use a word processing application as they have traditionally done. Ingesting these documents into a database requires some kind of transformation, either manual or using a parsing application. This approach is fundamentally not that different from a tool like Drupal, but it brings with it document management complexity and raises the question of where the master content resides – is it the documents themselves or their representation in data? In one case, SharePoint is being used to satisfy the document management component.

For the institutions that have a custom-developed CMS, integrating the publication of an online catalogue into their architecture is presumably straightforward, assuming that in-house support and expertise is available. In theory, the benefit of a custom-developed CMS is that one can build the functionality one needs more easily than one can find a vendor-provided system that offers it. However, that presupposes a number of things: that the application was built in a modular way; that the application code is properly maintained and well-documented; that there is enough institutional memory surrounding its architecture; that there are still programmers around who know the language it was written in. The realities of custom-developed solutions, particularly in the museum world, often are that they have been built by a now-departed employee or, as in one participant’s case, by a single-employee vendor who is winding down the company. Situations like this make it quite a challenge to retrofit functionality and clearly demonstrate the need for institutions to select technology solutions where the viability of the vendor is a contributing factor or the implications of an in-house solution are well understood.

Having a vendor-based CMS that doesn’t support the desired functions leaves the institution with one of three options: determine a workaround; talk to the vendor about adding functionality (this will likely incur a development cost); or document the requirements and select a product that will integrate and provide the functionality. All of these options are being pursued by the participants, largely around TMS. TMS has a number of benefits in regard to this project: the next version release will provide much better support for rich text within descriptive fields, and it has the publishing component eMuseum, which provides a mechanism to tailor the catalogue presentation. However, the challenge of getting content in and providing the supporting editorial environment to create a rich and engaging online catalogue remains.

Image management is another potentially significant challenge. Not all the participants have a Digital Asset Management System, but the extent of the challenge is based on scale. For a catalogue with dozens of images, the requirement to manage images automatically is not great. By following simple rules about file organization and transformation workflow and adhering to filenaming conventions, one could quite easily provide manual image support for the catalogue. For larger scale productions in the hundreds of images and up, the most productive approach should be a digital asset management tool that will guarantee consistency for the image production and deployment tasks. For some institutions, their DAM is integrated with their CMS. While this is very convenient, it is not a requirement as long as one follows a filenaming convention that is generated in the CMS and based on unique and attributable metadata for the image: the accession number, for example.
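A filenaming convention of this kind might look like the following sketch, which derives an image filename from the accession number plus a view label and a sequence number. The pattern itself (lower-case, underscores, two-digit sequence, `.tif` default) and the function name `image_filename` are assumptions for illustration; each institution would define its own rule.

```python
import re


def image_filename(accession_number: str, view: str, sequence: int, ext: str = "tif") -> str:
    """Build a predictable filename from an accession number (hypothetical convention).

    Punctuation in the accession number is collapsed to underscores so the
    result is safe on any file system.
    """
    stem = re.sub(r"[^A-Za-z0-9]+", "_", accession_number).strip("_").lower()
    if not stem:
        raise ValueError("accession number has no usable characters")
    return f"{stem}_{view.lower()}_{sequence:02d}.{ext}"
```

For example, `image_filename("1990.12.3", "recto", 1)` yields `1990_12_3_recto_01.tif`. One caveat of this particular rule is that accession numbers differing only in punctuation would collide, so it suits collections where the numbering scheme is already consistent.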

Summarising from a macro component point-of-view, the technology architecture that is required to support an online catalogue comprises the following:

  • an authoring environment that will support creating the required online catalogue content and all its intricacies;
  • a database, more often than not a CMS;
  • a publishing tool that presents the required online catalogue;
  • an image management tool that may or may not be built into a CMS.

Process and Workflow

Given the potential complexity of producing an engaging online catalogue, a smart approach is to use an existing printed catalogue as a starting point. A number of the participants are taking this approach, although there may not be a huge advantage to be gained, depending on how complex the catalogue is planned to be. But, if one is planning to create a rich, layered catalogue that takes advantage of the interactivity and non-linear engagement that the Web can support, much more support is needed for the content creation process. As well as providing the tools to author a non-linear resource, it might have to allow the creation of citations and provide a mechanism for document management and referencing.

To create a sustainable, reproducible mechanism as opposed to a one-off publication, the catalogue needs to be created as data rather than as a document; i.e. the scholarly prose, tombstone information, citations, bibliography, etc – all items that need to be linked and referenced – need to be managed as data. Therefore, a database in the form of a CMS is central to the process. Starting with already-written text is a plus, but much work still needs to be done to organize the content for its delivery online.

The workflow needed to create an online catalogue is not that different from a traditional print publication, but the support tools, the technology environment and the skills are different. Key acceptance challenges will be on the curatorial side, where there is likely minimal experience with Web authoring tools. The approaches that participants have taken during the planning phase are varied and depend on a number of factors, including expertise in developing complex Web content and access to appropriate human and technology resources.

By far, the most crucial question to resolve is what the catalogue will look like and how it will function. This suggests that one of the first experts required for the project is a Web designer or user experience designer. Addressing this question informs the complexity of the editorial environment and the process. For example, a decision to support pop-up citations requires a totally different approach to how they must be managed: as discrete data objects coded via a mechanism for assigning them within the copy, much like the rich text editors available in blogging software.
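To make the idea of citations as discrete data objects concrete, the sketch below keeps each citation in a lookup table and lets the author place a short marker in the copy; the renderer then expands each marker into markup that a front end could turn into a pop-up. The `[[cite:key]]` marker syntax, the `render_copy` function and the sample citation are all invented for this illustration.

```python
import re
from html import escape
from typing import Dict

# Citations managed as data, keyed by a short identifier (placeholder content).
CITATIONS: Dict[str, str] = {
    "ref-001": "Author A., Example Monograph (Hypothetical Press, 2001), 45.",
}

# Assumed marker syntax an author would type in the copy: [[cite:ref-001]]
MARKER = re.compile(r"\[\[cite:([A-Za-z0-9_-]+)\]\]")


def render_copy(copy: str, citations: Dict[str, str]) -> str:
    """Escape the prose, then expand each citation marker into pop-up-ready markup."""

    def expand(match: "re.Match[str]") -> str:
        key = match.group(1)
        full_reference = citations[key]  # KeyError exposes a marker with no citation
        return (
            f'<span class="citation" data-ref="{escape(key)}" '
            f'title="{escape(full_reference)}">[{escape(key)}]</span>'
        )

    # Escaping first is safe because the marker syntax contains no HTML-special characters.
    return MARKER.sub(expand, escape(copy))
```

Because the reference text lives in one place, correcting a citation updates every entry that points to it, which is the practical difference from typing footnotes into a word processing document.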

The workflow to generate the content may require multiple levels of review and editorial work; building this into a system is complex. Advanced Web Content Management Systems (WCMS) may have workflow engines that can be employed to marshal content, but providing this capability in a CMS is problematic. If one is planning on using a CMS to develop the content, status flags assigned to fields to denote where each is in the process can be used in place of a rigorous workflow engine.
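A status-flag approach of this kind can be sketched very simply. In the example below, each field of a catalogue entry carries one flag, a field moves forward one stage at a time, and an entry is ready when nothing is outstanding. The four stages and the function names are assumptions for illustration; an institution would substitute its own review steps.

```python
from enum import IntEnum
from typing import Dict, List


class Status(IntEnum):
    """Hypothetical editorial stages, in the order a field passes through them."""
    DRAFT = 1
    CURATOR_REVIEW = 2
    COPY_EDITED = 3
    APPROVED = 4


def advance(flags: Dict[str, Status], field: str) -> Dict[str, Status]:
    """Return a copy of the flags with one field moved to its next stage."""
    updated = dict(flags)
    current = flags[field]
    if current is not Status.APPROVED:
        updated[field] = Status(current + 1)
    return updated


def outstanding(flags: Dict[str, Status]) -> List[str]:
    """List the fields that still block publication of the entry."""
    return sorted(name for name, status in flags.items() if status is not Status.APPROVED)
```

This gives no routing, notifications or audit trail, which is what a true workflow engine adds, but it does let a simple report show what remains to be done on each entry.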

Another crucial question that greatly impacts process and workflow is the frequency of update. This decision informs how the catalogue will be delivered (published or data driven), and how much effort should be assigned to engineering automatic tasks as opposed to manual ones. If an institutional decision is made for an annual or bi-annual update, it is probably worth the extra development effort of implementing some automatic processes; a commitment to update every five years, however, greatly diminishes the benefit of automation.

In an ideal world the process to produce an online catalogue has to start with a clear definition of the end result: what success looks like. While there’s no question that curators can start their research and writing, or indeed, start with an already completed manuscript, the complexity of the process dictates that the scope of the final product must be understood to inform the staffing, resources and infrastructure that will be required.

Staffing and Resourcing

As the participants have found, this project involves many departments and functions, including Curatorial, Conservation, Collections Information, Information Technology, Web/New Media, Design, Registrar, Publications, Imaging and Legal. For the most part, the projects seem to center around the Web/New Media or Collections Information departments: interesting given that this is a publication, but probably due to the fact that the project centers around the institution’s CMS.

One crucial decision for institutions is where ultimate responsibility for the project lies. The decision is really an institutional one and is based on what the catalogue means to the institution. It is very similar to the question of where a museum’s Web department should sit – it depends on what the museum wants its website to be. If the website is a communication tool, the Web department should report to Communications & PR; if it is an educational tool, it should report to the Education department. Obviously, the Museum Director is ultimately responsible, but discipline-based choices for oversight of the project will greatly dictate the final product. For the participants, oversight of their projects rests variously with directors, chief curators, communications and audience engagement, and collections information, with some institutions establishing advisory boards.

While publications departments are involved in the projects, none appears to be playing the central role that it would for the traditional printed catalogue. This is an interesting observation, indicative of the latency with which museum publications departments are addressing the issue of digital publication. One might assume from this that the institutions have somehow decided that this is not a publication: ironic, because publications departments are expert at complex project management.

Challenges and Hindsight

As the project progresses, the participants come together periodically to report on progress and discuss challenges and solutions. These convenings are a great opportunity for mutual support and advice, and the following two sets of points summarise a session at the most recent convening dealing with challenges and hindsight. There were many consistent threads in the challenges that participants documented; in particular, the need to maintain consistent staffing for the life of a medium-term project was of great concern. The following challenges, in no particular order, offer a compact summary of the things to watch out for, and while some are specific to the online catalogue process, many resonate with any online project:

  1. Coping with staff attrition – remaining staff must juggle competing priorities and new responsibilities
  2. Finding ways to best represent information within our current database structure and/or augment with a separate content management tool
  3. Determining how to have online catalogue entries peacefully coexist or integrate with other existing online object content; e.g. online collections and Provenance data, in a way that is easily understandable in an end-user experience
  4. Accommodating the project within current workload
  5. Maximizing the potential of an online catalogue
  6. Identifying related images and negotiating rights
  7. Completing all digitization, including management of metadata inclusion for a large number of images
  8. Producing a sample Web template with one artist or related group of work
  9. The contract hire of a scholarly content facilitator to marshal the editorial process
  10. Front-end navigation – integrating the user experience to allow for different engagement levels, consistency and project identity
  11. Evaluation – planning scholarly and practical evaluation of the site and its contents. Examining potential use by different audiences, assessing if fit for purpose
  12. Editorial & publishing resources – how to secure in-house editorial and Web publishing resources to support future research projects?
  13. Dealing with the new pressure this creates on existing systems and applications while avoiding the Rube Goldberg machine
  14. Balancing different collections imperatives and institutional funders (e.g. general educational vs. scholarly outreach)
  15. Presenting flexible, overarching narratives in an asset-driven project with fairly structured data
  16. Corralling staff expertise in curatorial and conservation for a long-term research project not directly tied to an exhibition
  17. Ensuring strategic integration of the online catalogue development with other digital publishing initiatives; i.e., transforming our collection management system into a publishing powerhouse to deliver to different targets such as gallery interactives, exhibition Web sites, and mobile publications
  18. Adequately resourcing the publication process with staff who possess the emerging skill sets needed for producing multifaceted digital publication
  19. Dealing with shortage of time and human resources
  20. Managing scholarly references and bibliography already existing in print format
  21. Working by definition within the constraints of a print format
  22. Ensuring that the catalogue meets audience needs and carries the appropriate scholarly weight
  23. Understanding the distinction between an online archive and an online publication
  24. Project management and coordinating different aspects of the project such as content, software and interface development
  25. Understanding the institutional impacts of the project on resources and workflow
  26. Understanding where the catalogue fits in the current strategic thinking
  27. Creating innovative design that is adaptable, with an easy-to-use authoring tool that is affordable
  28. Determining technology to use for gathering and analyzing feedback about content
  29. Defining the scope of the catalogue and prioritizing objects
  30. Editing translations of foreign language inscriptions
  31. Settling on the Web designer
  32. Dealing with the fact that the research and writing process has led to new discoveries with innovative arguments which make it hard to define limitations to the project and other initiatives
  33. Realizing that our front-end audience research took longer because responses were richer and more complex and helpful than we anticipated
  34. Finding that our photography took much longer than anticipated because we searched files for existing photos and discovered earlier conservation shots

The second list comprises suggestions, the products of hindsight: the decisions the participants would have made differently had they known at the outset what they know now. Again, these “woulda”, “coulda”, “shoulda” nuggets of advice are not only critical for an institution considering the transition to online for their scholarly catalogue, but are also pertinent to any online project:

  1. Start Small: define realistic content goals based on anticipated resources and workload
  2. Be mindful of sustainability: Initial technical implementation may be overly complex and require narrow skill sets
  3. Remember that this is a “real” publication, with real schedules, and should be treated as such
  4. Photography for the publication should commence as soon as the scope is fully defined.
  5. Involve the Collections Database Administrator as a key team member from the outset
  6. Contract contributors sooner
  7. Have a full, inclusive discussion on look-and-feel for an ideal website
  8. Research and resolve metadata/image integration systems early in the process
  9. Build more administrative and editorial support into the project plans and budget
  10. Increase the amount of born-digital material in the project
  11. Assess website capabilities and audience needs early
  12. Involve a large internal community early on
  13. Split apart development of content and capture procedures from interface development
  14. Adopt publishing model early, as it relates to content development (e.g. “chapter” concept introduced midway into planning phase).
  15. Involve software developer in early visual designs
  16. Develop & test database-driven tracking of assets early
  17. Analyze new middleware solutions early
  18. Select case studies to demonstrate the range of potential issues
  19. Apply best practice for exhibition catalogue production to this “publication”
  20. Invest in early design to promote discussion and consideration of the content implications

Summary & Conclusion

This paper reviews the OSCI project’s work in progress, with the Foundation’s goal to inform the field through an interim and final publication. To start the process, an institution has to ask the fundamental question, “What is an online scholarly catalogue?” In some ways the answer is going to be institution-specific, but hopefully the work that these participant museums are doing will help define broader answers to that question. In its printed form there is clarity on the authorship of a scholarly catalogue, but that may not be so apparent in the online world. It will arguably be seen as a product of the institution rather than the individual, and consequently, authority will come from the institution rather than the individual. This may be a hard curatorial pill to swallow, and the institution needs to do its utmost to maintain that individual curatorial authority. However, the potential benefit is greater, with wider and easier access to curators’ work (with the understanding that they are providing more of a scholarly “service”); citations will certainly increase. Initially this may be a hard conversation to have, but it is much better to have that conversation upfront. The key may be to find one curator to focus on, and hope the “me-too” effect takes hold.

Furthermore, the conversation about how the catalogue looks and how it functions has to happen across an institution: it cannot be confined purely to the curatorial ranks because the expertise that can decide these questions lies elsewhere. How it looks and how it functions will play key roles in determining how successful it is, but this will be difficult to measure since the metrics for measuring success differ from those used for print. How will you measure whether it is outperforming print? How will page views measure against copies sold? If reaching an untapped audience is a measure of success, how will you know that has happened? A definition of what success looks like is key for an institution to identify at the outset.

These projects are unquestionably collaborations and involve many disciplines within the museum field. Scholarly research and writing aside, two key points to understand are that this project involves the transformation of a repository into a publishing tool, and it also requires an authoring environment. An institution has to consider whether it has the resources to support this, either internally or through contract work; there is no doubt that it requires a learning curve and an investment. The decision to transition cannot be based on financial motivations and must be made for the long term. This last point addresses sustainability and highlights one overriding observation that any interested institution has to understand. It is implicit in all the discussions and strategizing that the project participants are having amongst themselves and in their institutions – this is not a “technology project”.

Acknowledgements

Thanks are due to the OSCI project participants, too numerous to mention individually, who provided the raw content for this paper, and to Brenda Podemski, Principal Project Specialist in the Collection Information and Access department at the Getty Museum, for her critical and editorial eye.

References

Johnson, L., H. Witchey, R. Smith, A. Levine and K. Haywood (2010). The 2010 Horizon Report: Museum Edition. Austin, Texas: The New Media Consortium. Consulted January 30, 2011. http://www.nmc.org/publications/2010-horizon-museum-report

Cite as:

Honeysett, N., The Transition to Online Scholarly Catalogues. In J. Trant and D. Bearman (eds). Museums and the Web 2011: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2011. Consulted http://conference.archimuse.com/mw2011/papers/transition_to_online_scholarly_catalogues