Museums and the Web

An annual conference exploring the social, cultural, design, technological, economic, and organizational issues of culture, science and heritage on-line.

Automatic Heritage Metadata Enrichment with Historic Events

Marieke van Erp, Department of Computer Science, VU University; Johan Oomen, Netherlands Institute for Sound and Vision; Roxane Segers, Department of Computer Science, VU University; Chiel van den Akker, Department of History, VU University; Lora Aroyo, Department of Computer Science, VU University; Geertje Jacobs, Rijksmuseum Amsterdam; Susan Legêne, Department of History, VU University; Lourens van der Meij, Jacco van Ossenbruggen; and Guus Schreiber, Department of Computer Science, VU University, Amsterdam, The Netherlands

http://agora.cs.vu.nl

Abstract

Metadata enrichment of cultural heritage collections by means of structured vocabularies has been shown to improve the work of professionals and to increase accessibility of the collections for end-users. Still, collection metadata lack rich context information, such as references to events, which can be valuable for a more effective navigation in cultural heritage collections. In this paper, we present an approach that extends existing metadata enrichment processes with a method to discover historical events. The events are structured in a historical event thesaurus to enrich object metadata. As such, the event thesaurus is used as a bridge between objects in different collections. The results of this work allow for topic-based and event-centered browsing, searching and navigating in integrated collections.

Keywords: collection metadata enrichment, collection access, historical events, thesauri, linking collections, linking objects, search and browse

1. Introduction

Most digitized objects made available online by GLAMs (Galleries, Libraries, Archives, Museums) can be browsed through a predefined set of formal metadata, such as creator, year of creation, and type of material. Standards for metadata management and exchange enable intra-collection search and exploration, and are also the main drivers behind supporting domain and cross-boundary access to collections. However, these formal metadata often give only limited access to information pertaining to the content of the object, such as its topic or what it depicts. In many cases, institutions have compiled vocabularies containing terms and concepts to categorize their collections. However, these vocabularies are often a) not standardized, and thus not usable across collections, and b) compiled ad hoc, and thus biased towards the collection and/or incomplete (Gilchrest, 2001). Often, additional information about the object and its topic is given through textual descriptions. These descriptions can be quite elaborate and rich in information, but they are mainly accessible only through keyword search. Keyword search is limited in that it supports neither sorting nor the retrieval of objects whose descriptions contain terms that are synonymous or otherwise related to the search term.

We showcase an approach to enrich collection metadata with event information through linking it to an automatically built historical event thesaurus. Historical events often have a central place in the description of collection objects; e.g. they may depict an event, or an object with a relevance to an event, or they could have been commissioned to commemorate an event. Furthermore, events are multidimensional, requiring who, what, where and when. Therefore, collection objects can be linked through each of these dimensions, greatly increasing the number of relationships between objects and collections. We use, among others, techniques from text mining to achieve this.

The Agora project is a multi-disciplinary collaboration between the History and Computer Science departments at the VU University Amsterdam, the Rijksmuseum Amsterdam (RMA) and the Netherlands Institute for Sound and Vision (S&V). The aim of this four-year project is to develop a platform for interactive exploration of heterogeneous heritage collections, one that supports a range of social interactions among its users, in which museum objects can be placed into an explicit (art)historic context. Through this context, objects from highly diverse museum collections can be related, resulting in a more complete and illustrated description of historical events. Several user groups will benefit from such platforms, from researchers in the humanities working on a monograph, to the general public with an interest in history. In Agora, users are encouraged to create their own personal narratives that could lead to theoretical reflections on the meaning of digitally-mediated public history in contemporary society. In short, Agora aims to exploit formal metadata and other contextual knowledge to offer new ways to explore cultural heritage collections. This is a big departure from the search interfaces available to end-users presently.

This paper is structured as follows. In Section 2, we describe related work. In Section 3, we describe the data sets we used. In Section 4, we describe the process of enriching collection metadata with event information through an automatically-built event thesaurus. In Section 5, the first version of our event-based collection-browsing demonstrator is described.

2. Related Work

We first look at current collection access methods in cultural heritage institutions and visualizations of historical data and discuss where Agora aims to improve this.

Many publicly available search interfaces support a combination of searches based on full-text, formal metadata, and filter options based on so-called facets. The latter is an effective way to ‘drill-down’ in a set of search results. However, these search interfaces have several shortcomings:

  1. they do not take into account external information structures that can be exploited to add extra layers of knowledge
  2. they do not facilitate active contributions from end-users
  3. they often present results only as ranked lists.

In recent years cultural heritage institutions and computer scientists have teamed up to address these issues. In the remainder of this subsection, we discuss some key projects that aim to resolve these issues.

Exploiting external knowledge structures

The Europeana semantic search engine (http://www.europeana.eu/) uses reference knowledge, such as thesauri, subject heading lists, classification schemes, authority lists for persons’ names, and place names, to support more semantics-intensive exploration (Hildebrand et al., 2010). These resources have been converted into RDF format using the SKOS model (Miles & Bechhofer, 2009). This allows a uniform representation of the concepts present in each vocabulary; it also paves the way for the alignment of these resources (Isaac, 2010).

Fig 1: Screenshot of the Europeana semantic search engine

Within the Europeana semantic search engine, this reference knowledge is exploited for disambiguation of search terms, query refinement, multilingual access, and clustering of results. Figure 1 shows an example of a search result for the query ‘zee’ (sea) that distinguishes between works entitled ‘Zee’ and works depicting ‘Zee’. Another example of a semantic search engine used in the cultural heritage domain is CultureSampo, which provides access to Finnish heritage collections (Hyvönen et al., 2009).

Agora builds on Europeana technology by adopting its approach to enriching object metadata and aligning collection schemas. Europeana uses only existing knowledge sources, such as thesauri and vocabularies for traditional collection annotation, whereas Agora aims to fill the gaps in these sources, in particular with respect to historic events, by automatically building new sources where necessary. Agora is also the first project to define historic events as a semantic dimension.

Active User Contributions

GLAMs have started to take advantage of new technologies that allow visitors to create new meanings and contexts for the objects in their collections. A major motivation for these initiatives is the ability to offer services that go beyond search and retrieval. In other words, as Clay Shirky states: "Public reuse produces a kind of value that doesn’t just come from publication. It comes from republication and reuse" (Shirky, 2009). Social tagging, for example, was introduced by heritage institutions as one of the first ways to explore the possibilities created by Web 2.0 functionality. Examples include the Brooklyn Museum TAG! You’re it! and the Steve Museum project (Trant, 2009). These user-contributed tags can be used to enrich the vocabularies used by institutions and as new sources of information that help users explore the collections based on terminology supplied by non-professionals. But adding tags is just one model for active user engagement that is currently being explored. GLAMs are also encouraging users to add their personal stories or contribute actual objects to the collections (Oomen, 2010). A considerable amount of research is currently being conducted to tackle critical challenges that will define the success of these collaborations between amateurs and professionals. The main challenge remains finding enough knowledgeable and loyal users to maintain a reasonable level of quality (Simon, 2010).

Agora invites users to contribute to the digitally-mediated public history platform by providing different perspectives on collection objects and to compile their own historical narratives. As such, Agora facilitates an active discussion between the experts, the general public and the cultural heritage institutions about the meaning of their collections.

New methods for presenting results

A number of factors, such as the emergence of rich Web applications (e.g., Flash, Java), open APIs, and the increased availability of digitized objects, have influenced how collections can be visualized; for example, by plotting objects on maps or timelines. Historypin (http://www.historypin.com/), for instance, uses a map interface to show historical photographs, combined with the possibility to zoom in on a specific date range. UK Sound Map (http://sounds.bl.uk/uksoundmap/index.aspx) and 1001 Stories Denmark (http://www.kulturarv.dk/1001fortaellinger/en_GB) provide comparable functionality for sounds and stories respectively. Timeline visualizations are closely related to the way Agora is aiming to present objects and their relationships. The British Library launched Timelines, Sources from History (http://www.bl.uk/learning/timeline/index.html) in 2010; it presents items in the collection chronologically from medieval times to the present day. Next to more topical representations, it allows users to look at key events in history. In the example shown in Figure 2, the key event “End of the Second World War” is shown alongside major advances in Medicine, Science and Technology around that date. Such visualizations offer a means to represent an object in its context.

Fig 2: Timelines, Sources from History of the World

The BBC and the British Museum collaborated on a project called “A History of the World” (http://www.bbc.co.uk/ahistoryoftheworld/explorerflash/) that also uses a timeline as a means of navigating through the object space. As a notable additional feature, they invite users to contribute their own objects to the timeline.

Agora utilizes best practices from these projects to visualize relations between objects along different dimensions in an intuitive and user-friendly way. Besides the temporal and spatial dimensions, Agora also presents other contextual relations between objects, such as sub-event and super-event.

The Simple Event Model (SEM)

Events are multidimensional objects. Besides having an identifier (the event name, such as “Police Action”), events also have associated actors (e.g., “Dutch Government”, “Republic of Indonesia”), locations (e.g., “Java”) and times (e.g., “25 March, 1947”). These elements provide meaningful contextual information related to items in a collection, and can be used as starting points for new queries. As all of the objects associated with events are multidimensional themselves and are related to the events in different ways, it is important to structure event information so as to make optimal use of its expressivity. Various event models have been proposed (Lagoze & Hunter, 2001; Raimond & Abdallah, 2007; Scherp et al., 2009), but we chose to use the Simple Event Model (SEM) (Van Hage et al., 2009).

We chose SEM because it was created to model events in various domains without making assumptions about domain-specific vocabularies. Note that although we are working in the heritage domain, the resources we aim to include in our project (geographical, bibliographical data) are not always from this domain. SEM is designed with a minimum of semantic commitment to guarantee maximal interoperability. It restricts itself to representing the core of events, whereas other models, such as CIDOC-CRM (Crofts et al., 2006), aim to describe the domain and information about actors, locations and times. At the same time, SEM is compatible with these other models: one can relate them to SEM through the Type classes in SEM, for example by linking the “sem:Place Java” object to the GeoNames atlas.

Fig 3: Example of collection object represented with its associated SEM classes
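To make the event structure described above more concrete, the following is a minimal sketch of an event with its actor, place and time slots, flattened into subject-predicate-object triples. It is an illustration only: the `SemEvent` class, the `ex:` identifier scheme and the prefixed property names (`sem:hasActor`, `sem:hasPlace`, `sem:hasTime`) are assumptions made for this example, not the project's actual data model or code.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SemEvent:
    """Core of an event: a name plus optional actor, place and time (hypothetical class)."""
    name: str
    actor: Optional[str] = None
    place: Optional[str] = None
    time: Optional[str] = None

    def triples(self) -> List[Tuple[str, str, str]]:
        """Return (subject, predicate, object) triples for the slots that are filled."""
        # The "ex:" identifier scheme is an assumption made for this sketch.
        subject = "ex:" + self.name.replace(" ", "_")
        result = [(subject, "rdf:type", "sem:Event")]
        slots = (("sem:hasActor", self.actor),
                 ("sem:hasPlace", self.place),
                 ("sem:hasTime", self.time))
        for predicate, value in slots:
            if value is not None:  # unfilled slots produce no triple
                result.append((subject, predicate, value))
        return result


event = SemEvent("Police Action", actor="Dutch Government", place="Java")
for triple in event.triples():
    print(triple)
```

Leaving a slot empty simply omits the corresponding triple, which matches the idea that an event description may be incomplete.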

3. Data

In this section, we describe the data sets we enriched and reveal which additional data sets we used to do so.

3.1. Rijksmuseum Amsterdam collection data

For managing its objects, Rijksmuseum Amsterdam (RMA) uses the curator-targeted Adlib database, based on SPECTRUM, the UK collections management standard. Adlib Museum software also incorporates other international standards, such as the CIDOC guidelines and Getty Object ID. Currently, 60,000 museum objects are registered in this system. A separate information layer was built to connect the Adlib database, image repositories, and educational content; it provides access to an extensive collection of online presentations. Search is an integral part of the public RMA website (http://www.rijksmuseum.nl/zoeken/?focus=assets&lang=nl). It handles hundreds of thousands of hits a day, providing access to museum descriptions and links to exhibition information, quizzes, timelines, Web pages, detailed descriptions, stories, interactive maps, and online games. Figure 4 shows a typical search result from the online collection.

Fig 4: Screenshot of Rijksmuseum search interface

The RMA data contains metadata of all collection objects plus a number of SKOS thesauri (Miles & Bechhofer, 2009). Included are a subject thesaurus (53,880 concepts), a geographical thesaurus (11,485 concepts), a person thesaurus (6,500 concepts), a separate event thesaurus (1,693 concepts), and 28,889 object descriptions. These vocabularies and metadata were processed and converted into RDF in earlier projects, e.g., e-Culture (http://e-culture.multimedian.nl/) and CHIP (http://chip-project.org).

3.2. The Netherlands Institute for Sound and Vision collection data

Sound and Vision (S&V) maintains and provides access to 70 per cent of the Dutch audio-visual heritage, comprising approximately 700,000 hours of television, radio, music and film, making Sound and Vision one of the largest audiovisual archives in Europe. It developed the iMMix multimedia catalogue system to preserve and manage the ever-growing collection of archive material. Material is ingested in the digital storage facility from current television production and from large-scale digitisation efforts. The iMMix system follows the IFLA-FRBR model as a basis for its object-oriented data structure that models various audiovisual resources, as well as online archive functionalities, within a professional broadcast production environment. Metadata is stored in an Oracle database and indexed by a search engine from Autonomy. Figure 5 shows the results of the search “feyenoord”. On the left side, specific filter options are available for the user to select.

Fig 5: Screenshot of Sound and Vision Search interface

For the object descriptions, we used publicly accessible data from the Open Images platform operated by S&V (http://www.openimages.com). It contains 998 video descriptions, the videos of which are accessible through the Web.

The metadata of the S&V collections is linked to the GTAA thesaurus (Common Thesaurus Audiovisual Archives). The concepts are divided into five schemes: genre (112 concepts), geographical concepts (14,030 concepts), persons (20,070 concepts), names (27,646 concepts) and general keywords (3,800).

3.3. Wikipedia

In order to build a historical event thesaurus that is not biased towards either the RMA or the S&V collection, we chose a subset of the Dutch Wikipedia to extract events from. This subset consists of 3,724 Wikipedia articles pertaining to Dutch history of the 17th, 19th and 20th centuries, and articles about the history of Indonesia. We chose this particular time-frame because both the RMA and S&V collections contain many objects that are related to the topics in this historical window. Also, by focusing on only a subset of all possible time-frames and locations, the data set becomes more manageable for evaluating our results.

4. Enriching Metadata with Events

To add events to the collection data, we first built a historical event thesaurus using state-of-the-art information extraction techniques to find events in external text. We then linked the events from our event thesaurus to the collections. We split our approach into four consecutive steps:

  1. find event names,
  2. find event actors, locations and times,
  3. relate event names, actors, locations and times (intra-event relations), and
  4. link events to collections.

Below we detail each of the steps in our collection metadata enrichment approach.

4.1. Finding event names

Events can be described in various ways, ranging from proper names and fixed expressions such as Eighty Years’ War and Assassination of William the Silent to more phrasal expressions such as ‘the Spanish invaded the city in August’. In this paper, we focus on detecting event names that behave as proper names. Phrasal references to events are more difficult to detect automatically because (1) many expressions for the same event exist, and (2) phrasal descriptions of historical events cannot easily be distinguished from non-domain-specific events such as ‘locusts invaded the field this summer’. By focusing on events that have a proper name or a fixed expression, we can detect those events that represent important milestones in reality and form the backbone of the historical narrative.

We used a pattern-based technique that relies on a so-called seed-list of positive examples and searches for recurrent strings of words in the context of these examples that indicate the presence of an event. We created a varied seed-list of one hundred high-frequency events, partly based on the RMA Event list. We then issued queries to the Yahoo search engine to retrieve text snippets containing the events, as well as text before or after the event, as the textual context in which an event occurs may indicate its presence. Next, we selected only those contexts that occur before or after the events with a minimal frequency of five. As a result, we obtained a list of 4,539 contexts or patterns that co-occur in the presence of events.
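The context-harvesting step described above can be sketched roughly as follows. This is a simplified illustration rather than the implementation used in the project: the function name `harvest_patterns`, the two-word context window and the in-memory list of snippets are assumptions; only the minimal frequency of five comes from the text.

```python
from collections import Counter
from typing import Iterable, List


def harvest_patterns(snippets: Iterable[str], seed_events: List[str],
                     window: int = 2, min_freq: int = 5) -> Counter:
    """Count the word contexts directly before and after each seed event.

    `window` (number of context words) is an assumed parameter;
    `min_freq` mirrors the minimal frequency of five mentioned in the text.
    """
    counts: Counter = Counter()
    for snippet in snippets:
        for event in seed_events:
            start = snippet.find(event)  # only the first occurrence is used
            if start == -1:
                continue
            left = snippet[:start].split()[-window:]
            right = snippet[start + len(event):].split()[:window]
            if left:
                counts[("before", " ".join(left).lower())] += 1
            if right:
                counts[("after", " ".join(right).lower())] += 1
    # Keep only contexts that occur often enough to count as a pattern.
    return Counter({p: n for p, n in counts.items() if n >= min_freq})


# Hypothetical snippets standing in for search engine results.
snippets = ["Many cities changed hands during the Eighty Years' War in Holland."] * 5
patterns = harvest_patterns(snippets, ["Eighty Years' War"])
print(patterns)
```

In this toy run, the context "during the" is kept as a pattern because it precedes the seed event five times.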

This set of patterns indicates the presence of events; however, a pattern is not always preceded or followed by an event. A pattern such as “during the” is a good indicator of the presence of an event, but it also occurs in front of time periods (e.g., “during the 17th century”). We therefore added a score to each pattern that estimates its reliability for extracting correct event candidates. This score is based on (1) the number of unique events that were retrieved when querying for a specific pattern on Yahoo and (2) the ratio between the total occurrence of the pattern and the frequency of the pattern on Yahoo. As such, patterns that are very reliable have a high score, while patterns that co-occur with many non-events have a low score. Then, we ran the patterns over our subset of Wikipedia. The pattern scores were used to determine which extracted events are likely to be good event candidates. After automatically filtering the results from the patterns (e.g., an event name should contain at least one capital letter), we found 9,806 unique event name candidates.
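As an illustration of how scored patterns might be applied to text, consider the sketch below. The exact scoring formula is not given in this paper, so `pattern_score` is an assumed combination of the two ingredients mentioned (number of unique events and a hit ratio). The regular expression, the `min_score` threshold and all function names are likewise hypothetical.

```python
import re
from typing import Dict, List, Tuple


def pattern_score(unique_events: int, event_hits: int, total_hits: int) -> float:
    """Assumed reliability score: unique events weighted by the share of
    pattern occurrences that were actually followed by an event."""
    if total_hits == 0:
        return 0.0
    return unique_events * (event_hits / total_hits)


def extract_candidates(text: str, scored_patterns: Dict[str, float],
                       min_score: float = 1.0) -> List[Tuple[str, float]]:
    """Return (candidate name, pattern score) pairs found after reliable patterns."""
    candidates = []
    for pattern, score in scored_patterns.items():
        if score < min_score:
            continue  # skip patterns considered unreliable
        # Capture the run of capitalised words directly after the pattern.
        regex = re.compile(re.escape(pattern) + r"\s+((?:[A-Z][\w'-]*\s?)+)")
        for match in regex.finditer(text):
            name = match.group(1).strip()
            # Filter from the text: a name needs at least one capital letter.
            if any(ch.isupper() for ch in name):
                candidates.append((name, score))
    return candidates


text = ("Trade collapsed during the Eighty Years' War "
        "and recovered during the 17th century.")
scores = {"during the": pattern_score(unique_events=10, event_hits=50, total_hits=200)}
print(extract_candidates(text, scores))
```

Here "during the 17th century" yields no candidate because the words after the pattern do not start with a capital letter, which mirrors the kind of filtering described above.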

4.2. Finding actors, locations and times

The task of finding actors, locations and times in text is treated as a regular Named Entity Recognition (NER) task. NER is a well-researched field within information extraction, with state-of-the-art systems reaching F-scores of 89% for English (Tjong Kim Sang & De Meulder, 2003) and 77% for Dutch (Tjong Kim Sang, 2002). However, some modifications were necessary for our domain. Most NER systems are based on Machine Learning, and work by virtue of being fed (i.e., ‘trained on’) a large number of positive and negative examples of names in text. The algorithm in the NER system will internalize these examples and try to classify a new example as being a name or not, based on how similar it is to the examples stored in memory, for which it knows whether or not they are names. Most NER systems are trained on newspaper articles. This means that they are geared towards recognizing names in that type of text. The main difference we found between newspaper text and text dealing with historical events is that historical text contains longer names (e.g., a nobleman’s name such as “Jonkheer Mr. Dr. A.W.L. Tjarda van Starkenborch Stachouwer”). An off-the-shelf named entity recognizer has difficulties dealing with such names, because it does not have them as examples in its memory. We remedied this problem by providing the named entity recognizer with extra information about the grammatical constituents present in the text. We ran the NER system on our subset of Wikipedia to extract named entities for our historical event thesaurus. This resulted in 18,623 candidates for actors, 7,023 locations, and 7,981 dates. In the next step, we determined which actors, locations, and dates belong to which event.

4.3. Finding intra-event relations

A large body of work exists on relation finding in text (Hearst, 1992; Agichtein & Gravano, 2000). A majority of this work is limited to finding relations between two entities within the same sentence through grammatical analysis of the sentence. In our case, it is unlikely that the different elements (i.e., event name, actor, location and time) all occur within the same sentence, as event descriptions are often scattered over one or more paragraphs. Therefore, another approach had to be used to find which elements and events belong together. One advantage of using an encyclopedia is that its text ought to be to the point, and thus related information should be found in the vicinity of events.

For each event, we checked which persons, locations and dates occurred with it in the same paragraphs. To fill our SEM instances, we took a conservative approach and only took the one actor, one location and one date that co-occurred most frequently with an event. Although some other elements also co-occurred quite frequently with a particular event, the scores were too close to come to an automatic cut-off point for events with several actors, locations or times. We find for example the following event with associated elements:

Fig 6: Example of automatically filled SEM instance


As Figure 6 shows, not all SEM classes are always filled, because sometimes there is not enough information present to find the correct value. Currently, we prefer precision over recall to avoid polluting the event thesaurus. The set of 1,250 SEM instances that contain at least an event and an actor, location or date forms our historical event thesaurus.
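The conservative co-occurrence approach from this section can be sketched as follows. The sketch assumes that each paragraph has already been annotated with the event names and named entities it contains; the dictionary layout and the function name `fill_event` are hypothetical and chosen for illustration only.

```python
from collections import Counter
from typing import Dict, List, Optional


def fill_event(event: str,
               paragraphs: List[Dict[str, List[str]]]) -> Dict[str, Optional[str]]:
    """Pick the single actor, location and date that share the most
    paragraphs with `event`; a slot stays None when nothing co-occurs."""
    counts = {"actors": Counter(), "locations": Counter(), "dates": Counter()}
    for para in paragraphs:
        if event not in para.get("events", []):
            continue
        for slot in counts:
            # Count each entity at most once per paragraph.
            counts[slot].update(set(para.get(slot, [])))
    instance: Dict[str, Optional[str]] = {"event": event}
    for slot, counter in counts.items():
        top = counter.most_common(1)
        instance[slot[:-1]] = top[0][0] if top else None  # "actors" -> "actor"
    return instance


# Hypothetical annotated paragraphs.
paragraphs = [
    {"events": ["Police Action"],
     "actors": ["Dutch Government", "Republic of Indonesia"],
     "locations": ["Java"], "dates": []},
    {"events": ["Police Action"],
     "actors": ["Dutch Government"],
     "locations": ["Java", "Sumatra"], "dates": []},
]
print(fill_event("Police Action", paragraphs))
```

Because only the top-ranked element per slot is kept, the date slot remains empty in this example, which corresponds to the partially filled instances discussed above.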

4.4. Linking events to collections

Various thesauri have been aligned within the context of Agora and related projects. On the one hand, collection metadata schemes have been aligned in order to facilitate search in the integrated collection. On the other hand, the collection metadata has been aligned to the SEM event model scheme. In this context, the S&V GTAA thesauri have been lexically aligned to the RMA thesauri.
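The details of the lexical alignment are not given here, but a minimal sketch of one common form of lexical alignment, exact matching on normalised labels, could look like this. The normalisation rules (lower-casing, removing diacritics, collapsing whitespace), the concept identifiers and the function names are assumptions made for illustration.

```python
import unicodedata
from typing import Dict, List, Tuple


def normalize(label: str) -> str:
    """Lower-case a label, strip diacritics and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", label)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().split())


def lexical_align(source: Dict[str, str],
                  target: Dict[str, str]) -> List[Tuple[str, str]]:
    """Map concept ids whose normalised labels are identical."""
    index: Dict[str, List[str]] = {}
    for concept_id, label in target.items():
        index.setdefault(normalize(label), []).append(concept_id)
    return [(source_id, target_id)
            for source_id, label in source.items()
            for target_id in index.get(normalize(label), [])]


# Hypothetical concept ids and labels.
rma_locations = {"rma:1": "Java", "rma:2": "Curaçao"}
gtaa_locations = {"gtaa:9": "java ", "gtaa:10": "Curacao", "gtaa:11": "Bali"}
print(lexical_align(rma_locations, gtaa_locations))
```

Exact matching favours precision; labels that differ in more than case, accents or spacing are not mapped by this sketch.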

In Table 1 we present the number of mappings we find between the different thesauri. In the top row, the GTAA thesauri are listed; in the first column the RMA thesauri are listed.

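| | GTAA genre | GTAA locations | GTAA subjects | GTAA people | GTAA names | GTAA maker |
|---|---|---|---|---|---|---|
| RMA concepts | 13 | 142 | 570 | 9 | 394 | - |
| RMA events | 1 | - | 11 | - | 62 | - |
| RMA locations | - | 3135 | - | 7 | 312 | - |
| RMA people | - | - | 1 | 1896 | 554 | 311 |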

Table 1: Number of direct mappings between RMA and GTAA thesauri

If we use our historical event thesaurus as an intermediary between the RMA and GTAA thesauri, we can find the number of mappings presented in Table 2.

| Thesaurus 1 | Thesaurus 2 | # mappings |
|---|---|---|
| RMA events | GTAA locations | 20 |
| RMA events | GTAA people | 15 |
| RMA people | GTAA locations | 300 |
| RMA locations | GTAA people | 297 |
| RMA events | RMA locations | 488 |
| RMA events | RMA people | 395 |

Table 2: Number of mappings between GTAA and RMA thesauri via Historical Event Thesaurus

The mappings found via the historical event thesaurus fill the gap in mappings between the GTAA persons and locations and the RMA events and locations. The event thesaurus can even help find mappings between the RMA events and the RMA people and locations, as the RMA events thesaurus is mainly a flat list of event names, with some links to people and locations.
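The idea of using the event thesaurus as an intermediary can be illustrated with a small sketch: an RMA event and a GTAA location are mapped to each other when an entry in the event thesaurus carries both the event name and the location. The matching on lower-cased labels, the identifiers and the function name `bridge` are assumptions for this example and not a description of the actual procedure.

```python
from typing import Dict, List, Optional, Tuple


def bridge(rma_events: Dict[str, str], gtaa_locations: Dict[str, str],
           event_thesaurus: List[Dict[str, Optional[str]]]) -> List[Tuple[str, str]]:
    """Relate RMA events to GTAA locations via shared event thesaurus entries."""
    rma_by_label = {label.lower(): cid for cid, label in rma_events.items()}
    gtaa_by_label = {label.lower(): cid for cid, label in gtaa_locations.items()}
    mappings = []
    for entry in event_thesaurus:
        # A slot may be missing or None when it was never filled.
        rma_id = rma_by_label.get((entry.get("event") or "").lower())
        gtaa_id = gtaa_by_label.get((entry.get("location") or "").lower())
        if rma_id and gtaa_id:
            mappings.append((rma_id, gtaa_id))
    return mappings


# Hypothetical identifiers and thesaurus entries.
rma_events = {"rma:e1": "Police Action"}
gtaa_locations = {"gtaa:l1": "Java"}
thesaurus = [{"event": "Police Action", "location": "Java"},
             {"event": "Police Action", "location": None}]
print(bridge(rma_events, gtaa_locations, thesaurus))
```

The same pattern applies to the other pairs in Table 2 by swapping in the actor slot or a different pair of thesauri.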

5. Agora demonstrator: presenting the enriched collections

The Agora Historical Event Browser provides an integrated access route to museum objects and audio-visual material from RMA and S&V respectively. It is a platform for researching Social Web aspects of cultural heritage and online history, and it has the added value of using historical events and narratives for the exploration of integrated collections. The interface offers possibilities for (1) browsing events and collection objects in the context of a selected historical theme, and (2) creating, saving and sharing historical narratives. The Agora demonstrator is built in SWI-Prolog using its Semantic Web and HTTP libraries, as well as the ClioPatria platform for the thesaurus-based search. The integration approach of the RMA and S&V collections is based on Hildebrand et al. (2010). The collection-specific metadata has been mapped onto the VRA model properties (http://www.loc.gov/standards/vracore/). The RMA Event thesaurus concepts in the subject metadata of RMA artworks have been automatically linked to their corresponding SEM Event concepts. For the S&V videos, we manually added the links to the RMA events. Below, we describe the main functionalities of the Agora demonstrator.

Fig 7: Event browsing page in the Agora demo

Event and object browsing

Central to the navigation in the Agora demonstrator are events and their related objects. For each event and object there is an automatically generated page (see Figures 7 and 8) that shows (1) all associated objects, e.g. museum and audio-visual objects; (2) all associated events and the type of their relationship, e.g. previous-in-time event, sub-event; (3a) the event descriptive metadata, e.g. actors, place, period; or (3b) the object descriptive metadata organized in three groups, i.e. biographical, material and semiotic dimensions; and finally (4) the user’s navigation path. The event metadata elements are used as filters for the presentation of the associated objects. For example, the user can select only objects from the same period as the selected historical event. The hyperlinks to each associated object or event, and to each field of the event and object metadata, allow for continuous browsing through the collections, thus generating elements in the navigation path.
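Using an event metadata element as a filter can be pictured with a short sketch such as the one below, which keeps only objects dated within an event's period. The demonstrator itself is built in SWI-Prolog; this Python fragment, its `CollectionObject` class, its field names and its sample objects are purely illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class CollectionObject:
    """Hypothetical, simplified view of a collection object."""
    title: str
    year: int
    source: str  # e.g. "RMA" or "S&V"


def filter_by_period(objects: List[CollectionObject],
                     period: Tuple[int, int]) -> List[CollectionObject]:
    """Keep only objects whose year lies within the (start, end) period."""
    start, end = period
    return [obj for obj in objects if start <= obj.year <= end]


# Hypothetical objects and event period.
objects = [
    CollectionObject("Painting A", 1947, "RMA"),
    CollectionObject("Newsreel B", 1950, "S&V"),
    CollectionObject("Print C", 1648, "RMA"),
]
for obj in filter_by_period(objects, (1945, 1949)):
    print(obj.title, obj.source)
```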

Fig 8: Object browsing page in Agora demo

User’s Navigation Path: Narratives

One of the aims of the Agora demonstrator is to explore the notion of historical narratives. By explicitly maintaining the history of the browse path, a simple narrative of events and groups of objects linked by user-defined content can, in principle, be built. The Navigation Path Details (Figures 7 and 8) show the list of objects and events in the current user’s path. In Figure 9 we show an example of possible navigation paths between objects, events and their metadata elements.

Fig 9: Example of Event and Object Navigation
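A navigation path of this kind can be thought of as an ordered list of visited events and objects. The sketch below shows one possible minimal representation; the `NavigationPath` class, its methods and the sample labels are hypothetical and do not reflect the demonstrator's SWI-Prolog implementation.

```python
from typing import List, Tuple


class NavigationPath:
    """Ordered record of the events and objects a user has visited (hypothetical)."""

    def __init__(self) -> None:
        self.steps: List[Tuple[str, str]] = []

    def visit(self, kind: str, label: str) -> None:
        """Append a step; an immediate repeat of the same page is ignored."""
        if kind not in ("event", "object"):
            raise ValueError("kind must be 'event' or 'object'")
        step = (kind, label)
        if not self.steps or self.steps[-1] != step:
            self.steps.append(step)

    def as_narrative(self) -> str:
        """Render the path as a simple linear narrative."""
        return " -> ".join(f"{label} ({kind})" for kind, label in self.steps)


# Hypothetical browsing session.
path = NavigationPath()
path.visit("event", "Police Action")
path.visit("object", "Newsreel B")
path.visit("object", "Newsreel B")  # reloading the same page adds nothing
path.visit("event", "Eighty Years' War")
print(path.as_narrative())
```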

6. Future Work

In future work, we plan to provide functionality for (1) explicitly relating the events and objects, (2) selecting and organizing objects and events into a more elaborate narrative, (3) adding textual description to the narrative, and (4) saving narratives and sharing them with other users in the context of event and object browsing.

7. Acknowledgments

The work reported here was funded by NWO grant 640.004.801 within the Continuous Access to Cultural Heritage (CATCH) programme.

8. References

Agichtein, E. and L. Gravano (2000). “Snowball: extracting relations from large plain-text collections”. Proceedings of the fifth ACM conference on Digital libraries (DL’00). San Antonio, TX, US.

Crofts, M., T. Doerr, S. Gill, M. Stead (2006). Definition of the CIDOC Conceptual Reference Model. Consulted December 2010. Available: http://www.cidoc.ics.forth.gr/docs/cidoc_crm_version_4.2.1.pdf

Gilchrest, A. (2001). Factors Affecting Controlled Vocabulary Usage in Art Museum Information Systems. MSc Thesis. University of North Carolina at Chapel Hill. Chapel Hill, NC, US.

Hage, W.R. van, V. Malaisé, G. de Vries, G. Schreiber and M. van Someren (2009). “Combining Ship Trajectories and Semantics with the Simple Event Model (SEM)”. In: Proceedings of the 1st ACM International Workshop on Events in Multimedia (ACM-EiMM), 73-80.

Hildebrand, M., J. van Ossenbruggen, L. Hardman, J. Wielemaker and G. Schreiber (2010). Searching In Semantically Rich Linked Data: A Case Study In Cultural Heritage. CWI technical report INS-1001, Amsterdam.

Hearst, M. A. (1992). Automatic acquisition of hyponyms from large text corpora. COLING '92. Proceedings of the 14th conference on Computational linguistics. Nantes, France.

Hyvönen, E., E. Mäkelä, T. Kauppinen, O. Alm and J. Kurki (2009). “CultureSampo: A National Publication System of Cultural Heritage on the Semantic Web 2.0”. Lecture Notes in Computer Science, Volume 5554: The Semantic Web: Research and Applications, 851-856.

Lagoze, C. and J. Hunter (2001). “The ABC Ontology and Model”. In: Proceedings of the International Conference on Dublin Core and Metadata Applications (DMCI 2001), National Institute of Informatics, Tokyo, Japan.

Miles, A. and S. Bechhofer (eds.) (2009). SKOS Simple Knowledge Organization System Reference. W3C Recommendation. Latest version available at: http://www.w3.org/TR/skos-reference

Oomen, J., L. Baltussen, S. Limonard, A. van Ees, M. Brinkerink, L. Aroyo, J. Vervaart, K. Asaf, R. Gligorov (2010). “Emerging Practices in the Cultural Heritage Domain - Social Tagging of Audiovisual Heritage”. In: Proceedings of the WebSci10: Extending the Frontiers of Society On-Line. Raleigh, NC, US.

Raimond, Y. and S. Abdallah (2007). The event ontology. Available: http://purl.org/NET/c4dm/event.owl

Scherp, A., T. Franz, C. Saathoff, S. Staab (2009). “F- A Model of Events Based on the Foundational Ontology DOLCE+DnS Ultralight”. In International Conference on Knowledge Capturing (K-CAP), Proceedings. Redondo Beach, CA, USA.

Shirky, C. (2009). Let a thousand flowers bloom to replace newspapers; don’t build a paywall around a public good. September 23, 2009. Available: http://www.niemanlab.org/2009/09/clay-shirky-let-a-thousand-flowers-bloom-to-replace-newspapers-dont-build-a-paywall-around-a-public-good/

Simon, N. (2010). The participatory museum. Santa Cruz, California: Museum 2.0.

Tjong Kim Sang, E. and F. de Meulder (2003). “Introduction to the CoNLL-2003 Shared Task: Language-Independent Named Entity Recognition”. In Proceedings of CoNLL-2003. Edmonton, Canada, 142-147.

Tjong Kim Sang, E. (2002). “Introduction to the CoNLL-2002 Shared Task: Language-Independent Named Entity Recognition”. In: Proceedings of CoNLL-2002, Taipei, Taiwan, 2002, 155-158.

Trant, J. (2009). “Tagging, Folksonomy and Art Museums: Results of steve.museum’s Research”. Archives & Museum Informatics, Toronto, Canada. Available: http://museumsandtheweb.com/files/trantSteveResearchReport2008.pdf

Wielemaker, J., M. Hildebrand and J. van Ossenbruggen (2007). “Prolog as the Fundament for Applications on the Semantic Web”. In: Proceedings of the ICLP'07 Workshop on Applications of Logic Programming to the Web, Semantic Web and Semantic Web Services (ALPSWS2007). Porto, Portugal.

Cite as:

van Erp, Marieke et al, Automatic Heritage Metadata Enrichment with Historic Events. In J. Trant and D. Bearman (eds). Museums and the Web 2011: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2011. Consulted http://conference.archimuse.com/mw2011/papers/automatic_heritage_metadata_enrichment_with_historic_events