Levelling up: Towards best practice in evaluating museum games
Games, and game-like experiences, are becoming increasingly important to museums seeking to engage new audiences and to provide deeper engagement with existing audiences. Demonstrating the value of commissioning and producing games as a core activity of museums is imperative. At the same time, the sector is encountering a new, cold wave of realism in evaluating its online presences. Old canards about new audiences are being thrown out the window, and a more rigorous focus is being applied to the business of gaining and measuring audience attention.
There is a risk, however, that these engagement metrics, oriented towards page-based content and measurable proxies for interaction in social media, are poorly suited to the kind of immersive, deep engagement that good games can offer. This paper will identify and discuss best practice in the field of museum games evaluation in order that such evaluation might also move towards standardisation and wide acceptance, with the end result of improving games in our field.
Most museum games are web-based, and have an explicit educational intent. Examples from our own experience include High Tea, Memory (Wellcome Collection), Launchball, Rizk and Thingdom (Science Museum). These types of games will be the focus of this paper, though we expect the methodology to have applications across other gaming activities.
Drawing on our own institutional practices across the cultural heritage and informal learning sectors, and looking at the area of ‘persuasive’ games in general, we will focus on initial objective setting and three areas of evaluation:
1. Quantitative evaluation: What are the best tools and methods to measure overall interactions, in-game interactions, and profile game achievement against audience segments?
2. Qualitative evaluation: How can we use interviews, surveys, and focus groups to gain a deeper understanding of player motivation and experience, and are they sufficient? Can we find richer and more useful data ‘in the wild’, where people freely discuss games? If we choose that route, what are the best approaches to gathering and analysing that data?
3. Pre- vs post-release evaluation: How reliable or valid are expert reviews and user testing of games prior to release? How do these results compare with those from summative evaluations?
In each of these areas, we will also consider aligning evaluation approaches for different types of games: autonomous games, and gamified or gameful experiences. Technical appendices will describe both quantitative and qualitative methods.
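To make the quantitative strand concrete, the following is a minimal sketch of profiling game achievement against audience segments. It is an illustration only: the session records, the segment labels, and the `profile_by_segment` helper are all hypothetical and are not taken from any of the games named above.

```python
from collections import defaultdict
from statistics import median

# Hypothetical session records (assumed for illustration only):
# (audience_segment, levels_completed, finished_game)
sessions = [
    ("teen", 5, True),
    ("teen", 2, False),
    ("adult", 3, False),
    ("adult", 5, True),
    ("adult", 4, False),
]


def profile_by_segment(records):
    """Summarise game achievement for each audience segment."""
    grouped = defaultdict(list)
    for segment, levels, finished in records:
        grouped[segment].append((levels, finished))

    summary = {}
    for segment, rows in grouped.items():
        finished_count = sum(1 for _, finished in rows if finished)
        summary[segment] = {
            "sessions": len(rows),
            "median_levels": median(levels for levels, _ in rows),
            "completion_rate": finished_count / len(rows),
        }
    return summary


if __name__ == "__main__":
    for segment, stats in profile_by_segment(sessions).items():
        print(segment, stats)
```

In practice the records would come from whatever in-game logging a given title supports, and the achievement measures would follow from the objectives set for that game at the outset.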