April 15-18, 2009
Indianapolis, Indiana, USA

The Future of Mobile Interpretation

Koven J. Smith, The Metropolitan Museum of Art, USA


The last several years have seen museums carefully moving away from outmoded audio technology towards richer multimedia devices. However, while there have been a handful of successful museum installations of multimedia guides, these devices still have yet to take hold in museums in the same way that audio guides have. This may have less to do with the technology itself, and more with the mindset that produces content for the technology. This paper discusses the means by which museums might break through these old ways of working and begin producing truly next-generation mobile content.

Keywords: mobiles, handhelds, audio guides, interpretation, multimedia


The last several years have seen museums carefully moving away from outmoded audio technology towards richer multimedia devices. However, while there have been a handful of successful museum installations of multimedia guides, these devices still have yet to take hold in museums in the same way that audio guides have. The failure of the majority of handheld projects to date has been blamed on their trying to do too much, using technology that is too complex, too expensive, or "not ready for prime time." The resulting best practices, as witnessed in the recent symposium on handheld devices at Tate Modern, have emphasized simplifying handheld applications and devices, in effect bringing them into line with traditional audio tours but adding a few visuals. Although a few of these devices may have individually failed as a result of poorly executed complexity, simplification as a broad solution is not the answer. If anything, these devices have failed to find a voice in museums because museums are, by and large, not taking full advantage of the capabilities of this new generation of multimedia devices.

Multimedia devices represent a break, a sea change, in both content and platform, from audio guides. That is to say, if one thinks of the evolution of mobile interpretive devices as a straight line from AM/FM devices through personal cassette players to the now-ubiquitous random-access mp3 players, multimedia guides do not represent the logical endpoint of that evolution, but rather a parallel and altogether different development. Multimedia guides bring with them a suite of opportunities and difficulties that only occasionally overlap with the opportunities and difficulties associated with audio guides. Although the technology has changed, the mindset that produces content for the technology has not.

It is therefore becoming increasingly apparent that museums need to move away from an approach in which the device itself drives the content that is created, and towards one in which the mobile platform is merely an endpoint of a given content development effort. Doing this requires first re-analyzing and re-thinking assumptions about mobile interpretation that museums have long taken for granted, then using that analysis to take advantage of existing or emergent possibilities, and then settling on a development framework that ensures continuous evolution.

Question Assumptions

"A market response to inefficient distribution"

One pervasive notion that has largely been taken for granted is that a "tour" model (selected "stops" with narrative content, accessed either randomly or in sequence) is the appropriate one for a mobile interpretive device. This framework evolved naturally from the traditional docent-led tour, but the methods used themselves evolved not out of preference, but rather out of necessity – the medium determined the approach. Audio guides were originally created to replace (or at least make more readily available) the kind of content that was at that time being delivered to visitors via docents leading tours in galleries.

In the earliest days of mobile interpretation, audio was the only medium that could reliably deliver that kind of narrative content in a small, portable package. Most museums did not have this content readily available in aural form, meaning that it had to be produced from scratch, involving either a significant investment in production personnel and equipment or the engagement of an outside vendor. All the content was, in effect, hand-made; for each object or exhibition, new content had to be written, edited, recorded, and transferred to the given device. In the end, that high per-object cost, combined with limited storage capacity on a given device, forced museums to be highly selective about which objects (or exhibitions) would be included on a given device. Thus the emphasis was heavy on special exhibitions (in which content development is often funded by an exhibition budget) and carefully-selected "greatest hits" from the permanent collection.

The limitations placed on mobile interpretation both by the medium and the high-cost production chain meant that certain practices became ingrained in museums as being inseparable from the very idea of mobile interpretation. These practices include:

  • Content is developed specifically for the mobile device;
  • Content is typically tied to specific stops within the physical space (usually objects or architectural features);
  • Objects from the permanent collections are under-represented, in favor of special exhibition features;
  • Contextual material, beyond gallery introductions, is largely absent.

Museums have taken what was originally a practical response – audio – to a very real problem – how to provide mobile interpretation – and have come to assume that the content model that grew out of that practical response is what suits visitors best. To quote Chris Anderson in The Long Tail: "Many of our assumptions about popular taste are actually artifacts of poor supply-and-demand matching – a market response to inefficient distribution" (Anderson, 2004). One sees this already in the few multimedia guides that have been put together. Although the applications developed for these devices are admittedly significantly more sophisticated than even the top-of-the-line audio player, they still preserve its model. The device forces the visitor to consume content via "stops" on a "tour." At each stop, the visitor is provided narrative content (typically still audio, often now accompanied by text, images, or video). A user of the device is reduced to being a consumer of information. The device does not react to choices the user makes, nor does it respond to the user's input. Because most of the content is still made by hand, the user is limited to listening to or viewing content predominantly from special exhibitions and some "greatest hits" from the permanent collection.

"Don't be stingy"

This is not to say that there is anything inherently wrong with the tour model; there is a portion of the museum-going public who will probably always crave this led-by-the-hand, explicitly curated approach. The problem, however, is that the tour model appeals only to this relatively small segment of the museum-going public. The remainder, who might crave the ability to do more than passively consume information, is out of luck. With each new generation of museum-goers able to consume and filter greater quantities of information more quickly and efficiently than the last, the greatest hits model starts to look quite, in the words of Colleen Macklin, "stingy" (Macklin, 2009). Audio guides remove the ability to skim large quantities of information. An entire stop must be consumed, from start to finish, or not at all. The audio is either on, or it's off. The net effect is that museums are forcing people who are accustomed to digesting a lot, and quickly, to digest very little at a snail's pace.

Significantly, the greatest hits approach preserves a way of looking at collections that has essentially been discarded in most Web presentations. On the Web, museums are moving away from the "curated highlights" approach towards a model in which the entire collection is available for searching, browsing, and filtering. The curatorial facilitator is no longer the sole means by which a visitor might experience an institution – museums now encourage users to curate their own groupings from an entire museum's collection. In fact, a user browsing a museum's collection on the Web has a whole list of possibilities that are not available in the physical space, not least among which is the ability to acquire depth of context (whether via translations, maps, or encyclopedias). In the galleries, by contrast, the visitor carrying a handheld guide is far more constrained. Instead of multiple sources of information, the visitor has access only to a single curated 'voice' (even when multiple narrators are used). Instead of viewing information about the entire collection, the visitor must get by with a small slice of information about a few highlighted objects. Instead of slicing-and-dicing an entire collection in multiple ways, visitors must stick to the physical layout of the museum – American Decorative Arts, Asian Art, 19th Century Paintings, etc. In short, anyone hoping to carve out an experience in the galleries as information-rich as the one on the museum's Web site will leave the building hopelessly frustrated.

Know your audience

All of this speaks to one of the most important unaddressed problems facing museums developing mobile interpretive platforms: the audience for multimedia guides has never been properly defined. In hundreds of pilot multimedia projects conducted in museums over the last ten years, this basic question has not been answered: is the target audience for multimedia devices the same as the target audience for traditional audio guides? If it is, then best practices developed to this point still apply. If it is not, then museums must focus further research and development on understanding this new (or at least different) audience. Museums need to begin asking themselves tough questions. For whom are these devices intended? What are museums hoping to learn by pursuing pilot projects involving mobile interpretation? Are these pilot projects pushing the development of mobile media in such a way that specific hypotheses are proved or disproved? As more and more museums begin pursuing these pilot projects, new directions and new means of enhancing the visit must be explored. But what specifically can be done to break museums out of the audio tour mold?

Do More With What You Have

Make the entire collection available

First things first. A mobile interpretive device should have some kind of searchable content available for every single object on display (and preferably even for those objects that aren't). Generally, the core of this content would most likely come from a museum's collections management system. Because this content is often already being used for presentation on the Web and is structured, the threshold for usage is low. Rather than having works in special exhibitions and highlights represent the entirety of content on the device, these objects can be called out as a subset of the larger data pool. The implication here is that textual collections data now represents the core of content on a given platform. This does not mean that handmade audio or video content needs to be purged from the handheld device; it is simply that that content would be augmentative instead of core. When available, that content would be displayed. Indeed, in a 2005 study of the "Tate Collections Guide" pilot, it was concluded that visitors viewed audio/video content when available, but found text to be satisfactory otherwise (personal correspondence between the author and Nancy Proctor, 2009).

Having textual data available for every single object opens up the possibility to search, filter, and group objects. Our users have come to expect this ability on the Web; now give them that same ability in the physical space. Ad hoc grouping means that visitors are no longer restricted to highlights constructed by museum personnel – visitors can, in effect, create their own highlights, based on criteria they set. If a visitor wants to see every object in your collection created in 1892, they can do that. If a visitor wants to see every object in your collection donated by a particular benefactor, they can do that, too. The grouping could be arbitrary, as well – a visitor could construct their own group by hand as a result of performing multiple search or browse operations.
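As a rough illustration of the filtering and ad hoc grouping described above, the following minimal sketch treats each object as a structured record. The field names and sample records are hypothetical placeholders, not the schema of any particular collections management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CollectionObject:
    # Hypothetical fields; a real collections management system will differ.
    object_id: str
    title: str
    artist: str
    year: int
    donor: str
    gallery: str


def filter_objects(objects, **criteria):
    """Return the objects whose fields equal every given criterion."""
    return [
        obj for obj in objects
        if all(getattr(obj, name) == value for name, value in criteria.items())
    ]


# Placeholder records for illustration only.
SAMPLE = [
    CollectionObject("1", "River View", "Artist A", 1892, "Donor X", "Gallery 1"),
    CollectionObject("2", "Still Life", "Artist B", 1892, "Donor Y", "Gallery 2"),
    CollectionObject("3", "Portrait", "Artist A", 1850, "Donor X", "Gallery 1"),
]

made_in_1892 = filter_objects(SAMPLE, year=1892)
from_donor_x = filter_objects(SAMPLE, donor="Donor X")

# A hand-built group: the visitor merges the results of several searches,
# keyed by object_id so that an object found twice appears only once.
my_group = {obj.object_id: obj for obj in made_in_1892 + from_donor_x}
```

Because the criteria are passed as keyword arguments, the same function serves a search by year, by benefactor, or by any other field the record happens to carry.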

Even this simple step of taking content that already exists and making it available to a handheld device fundamentally transforms the nature of the handheld experience. With the ability to search, group, and filter every object, the device becomes a digital surrogate, an assistant, rather than a tour guide. The device has transformed from merely a content-delivery system to a means of helping to turn a visitor's preferences into action.


To truly translate the Web-like experience of discovery into the physical space, however, the handheld device must provide appropriate wayfinding. If the visitor is able to discover objects potentially of interest, but not successfully locate those objects in the gallery space, the handheld device has failed. Object locations should be explicitly mapped, appropriate travel vectors within the gallery space could be defined, and the devices themselves should be location-aware. Currently, wayfinding, whether via maps or text directions, is a problem that might be best tackled by several museums working together. As mapping technology evolves, lightweight (and yet sophisticated) means of solving this problem may present themselves.
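One lightweight way to think about "travel vectors" is to model the building as a graph of connected galleries and search it for a route. The sketch below assumes a hypothetical floor plan and uses a plain breadth-first search, which finds the route passing through the fewest doorways; it is an illustration of the idea rather than a proposal for how any museum's wayfinding should be built.

```python
from collections import deque

# Hypothetical floor plan: each gallery lists the galleries it opens onto.
GALLERY_LINKS = {
    "Entrance": ["Gallery 1", "Gallery 2"],
    "Gallery 1": ["Entrance", "Gallery 3"],
    "Gallery 2": ["Entrance", "Gallery 3"],
    "Gallery 3": ["Gallery 1", "Gallery 2", "Gallery 4"],
    "Gallery 4": ["Gallery 3"],
}


def route(links, start, goal):
    """Breadth-first search for the route with the fewest doorways.

    Returns the list of galleries from start to goal, or None when the
    goal cannot be reached from the start.
    """
    previous = {start: None}  # gallery -> gallery we arrived from
    queue = deque([start])
    while queue:
        room = queue.popleft()
        if room == goal:
            path = []
            while room is not None:
                path.append(room)
                room = previous[room]
            return path[::-1]
        for neighbor in links.get(room, []):
            if neighbor not in previous:
                previous[neighbor] = room
                queue.append(neighbor)
    return None


directions = route(GALLERY_LINKS, "Entrance", "Gallery 4")
```

A location-aware device would supply the starting gallery itself, and the explicitly mapped object locations would supply the goal.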


Once the visitor has successfully navigated his or her way to an object, it is important to ensure that that object doesn't become a navigational dead end. When a user is viewing an object in the physical space, the device should always suggest additional (possibly related) objects that may be of interest. Doing this means that the end result of a given search is actually the beginning of another potential browsing path. The device must therefore incorporate a recommendation engine. Ideally, this engine would work at the intersection of three vectors: content, location, and preference.

Content recommendations would primarily be based on content contained within a museum's collections management system or in related curatorial scholarship. Content recommendations would involve the clustering together of objects based on similar characteristics. In this scenario, objects could be clustered algorithmically based on co-occurrence of terms/phrases in those objects' collection records, or clustered manually based on objects' inclusion in known groups (such as the Hudson River School, for instance). Potentially, content recommendations could also be made via an Amazon-like "others who viewed this object also viewed" scenario.
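The algorithmic clustering mentioned above could take many forms. As one simple stand-in, the sketch below scores pairs of objects by the overlap of terms in their record text (Jaccard similarity) and recommends the closest matches. The record texts are invented placeholders, and a production system would likely want stop-word removal, phrase handling, and controlled vocabularies on top of this.

```python
import re


def terms(text):
    """Set of lowercase words appearing in one record's descriptive text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Shared terms divided by all terms across both records (0.0 to 1.0)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend_by_content(records, object_id, top_n=2):
    """Rank the other objects by term overlap with the given object."""
    target = terms(records[object_id])
    scored = [
        (jaccard(target, terms(text)), other)
        for other, text in records.items()
        if other != object_id
    ]
    # Highest score first; ties broken by identifier for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for score, other in scored[:top_n] if score > 0]


# Placeholder record text for illustration only.
RECORDS = {
    "A": "landscape oil painting river valley",
    "B": "landscape oil painting mountain",
    "C": "silver teapot decorative arts",
    "D": "river landscape watercolor",
}

suggestions = recommend_by_content(RECORDS, "A")  # ["B", "D"]
```

Manually curated groups, such as membership in a named school, could be folded in by adding the group name to each member's term set.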

The location-aware nature of modern handheld devices allows location to also be a factor in making successful recommendations in several ways. A museum spread out over a large area (such as a sculpture park) might elect to limit additional recommendations to objects that are within comfortable walking distance of the object being currently viewed. A traditional art museum might elect to do exactly the opposite, in that a visitor already in a given gallery might not need to have objects from that same gallery recommended to him or her. If a visitor is traveling along a pre-selected or pre-determined route, the recommendations might be limited to objects with a proximity relation to that route. Additionally, a museum may wish to feature certain content in its gallery space, and indicate a preference to recommend that content when the visitor is nearby.
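The two opposite location policies described above can each be expressed as a small filter over candidate recommendations. In this sketch the coordinates, gallery names, and distance threshold are all hypothetical, and straight-line distance stands in for real walking distance.

```python
import math

# Hypothetical candidates: object id -> ((x, y) position in meters, gallery).
CANDIDATES = {
    "A": ((0.0, 0.0), "Gallery 1"),
    "B": ((3.0, 4.0), "Gallery 1"),
    "C": ((30.0, 40.0), "Gallery 2"),
}


def within_walking_distance(candidates, here, max_distance):
    """Sculpture-park policy: keep only objects close to the visitor."""
    return [
        object_id
        for object_id, (position, _gallery) in candidates.items()
        if math.dist(position, here) <= max_distance
    ]


def outside_current_gallery(candidates, current_gallery):
    """Art-museum policy: skip objects the visitor can already see."""
    return [
        object_id
        for object_id, (_position, gallery) in candidates.items()
        if gallery != current_gallery
    ]


close_by = within_walking_distance(CANDIDATES, here=(0.0, 0.0), max_distance=10.0)
elsewhere = outside_current_gallery(CANDIDATES, "Gallery 1")
```

Restricting recommendations to objects near a pre-selected route would follow the same pattern, measuring distance to the nearest point on the route instead of to the visitor.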

Preference-based recommendations would involve more active user participation. Internet radio sites such as Pandora derive much of their value from understanding a user's preferences over time and suggesting new content based on that understanding. Similarly, a museum handheld device could track content viewed by a given visitor and recommend additional content based on the cumulative understanding gleaned from that information. The visitor could aid in this process by indicating whether the object, content, or location is relevant to them or not (similar to Pandora's "thumbs up/thumbs down" approach). Recommendations could also be based on a stated preference by the user at any time during the visit. Perhaps an initial search on the handheld device brings up a list of results, but also a list of preference options (which might be derived from a museum's own faceted categorization schemas). For example, a search for "Thomas Cole" might bring up a list of objects on display created by Thomas Cole, but also a list of questions such as, "Are you interested in work by artists of the Hudson River School?" or "Are you interested in works created between 1800 and 1850?" These preferences could then be used to determine what types of recommendations are made later in the user's visit.
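A minimal sketch of how such feedback might accumulate follows, assuming each object carries a set of facet labels drawn from the museum's own categorization. The class name, facet labels, and the simple plus-one/minus-one weighting are illustrative assumptions, not a description of how Pandora or any existing guide works.

```python
from collections import Counter


class PreferenceProfile:
    """Running tally of one visitor's thumbs-up and thumbs-down feedback."""

    def __init__(self):
        self.weights = Counter()  # facet label -> accumulated weight

    def rate(self, facets, liked):
        """Raise or lower the weight of every facet on a rated object."""
        delta = 1 if liked else -1
        for facet in facets:
            self.weights[facet] += delta

    def score(self, facets):
        """Sum of the visitor's weights for an object's facets."""
        return sum(self.weights[facet] for facet in facets)

    def rank(self, candidates):
        """Order candidate object ids from best to worst match."""
        return sorted(
            candidates,
            key=lambda object_id: (-self.score(candidates[object_id]), object_id),
        )


profile = PreferenceProfile()
# The visitor answers "yes" to the two questions and dislikes a portrait.
profile.rate({"Hudson River School", "1800-1850"}, liked=True)
profile.rate({"portrait"}, liked=False)

# Hypothetical candidates: object id -> facet labels.
candidates = {
    "X": {"Hudson River School"},
    "Y": {"portrait"},
    "Z": {"still life"},
}
ordered = profile.rank(candidates)  # ["X", "Z", "Y"]
```

Stated preferences and passive viewing history can feed the same tally, with the museum free to weight an explicit answer more heavily than a view.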


Beyond this three-pronged recommendation engine, the mobile platform should provide valuable context to the visitor as well. An obvious way to do this would be to give the visitor the ability to compare and contrast objects. The nature of museums is to physically locate objects together by time period, style, or other common themes, removing the ability to see how these objects might compare to other objects within the building, in the same way that two objects might be contrasted in a print publication. It is unlikely that you would see two depictions of the same scene painted 100 years apart on different continents in the same gallery, but you could easily place these two works side-by-side in the device for comparison.

The point here is that once the visitor finds a subject of interest, he or she should be able to know as much as the museum knows about that subject while still inside the building and even in front of the artwork that sparked the interest. A visitor may stumble on that work by Thomas Cole, but find that more so than the individual work itself, he or she is interested in the Hudson River School and artists related to that movement. If he or she so desires, the visitor should then be able to find publication excerpts, artist letters, newspaper articles, and bibliographic citations, then group works by related artists together and map where they are in the building. The platform becomes a portal into the museum's knowledge base.

Provide A Multitude of Unique Experiences

Two of the complaints often lodged against traditional audio guides are that the headphones (or phone receivers) cut off the user from interactions with those around them, and that audio guides force the user to digest information at the guide's pace, rather than the user's. Both of these complaints tie into a much larger problem, which is that the traditional audio guide (and its multimedia descendants) allows for only one type of experience, and that experience is, by and large, dictated by the institution with little to no true interaction on the part of the visitor. That experience should remain a part of any interpretive device for those visitors who desire it, but the devices of the future should also allow for a multitude of other experiences to occur alongside it.

User-generated content

One way in which museums have attempted to encourage a less passive experience via multimedia guides is through the use of so-called "user-generated content." In this scenario, content is actively solicited from the visitor by the museum, usually in the form of comments or responses. The benefit to the user for submitting this type of content generally falls into one of three categories:

  • The benefit to the user is delayed – visitors can bookmark favorite objects as they move through the galleries, accessing further information about those objects post-visit via a Web link;
  • The benefit to the user is abstract – visitors can "tag" artworks with terms or add comments or reactions, but are unable to do anything specific with this content once it's been entered;
  • The benefit to the user is nonexistent – visitors enter reactions into a guestbook-style application, but these reactions are never seen again.

Unfortunately, much of this content tends to benefit the institution much more than the visitor. Very little of it fundamentally alters the nature of the visit in any way. Handhelds of the future have to make allowances for content coming from users that truly benefits the users themselves, during the visit itself.

Content streams

Facebook, Twitter, FriendFeed, and the like have made the concept of micro-updates in a public forum commonplace. It is not a stretch to imagine incorporating that experience into the gallery environment. Users could enter their reactions, thoughts, and responses into the device via a texting-style interface with a defined character limit. These entries could then be posted to a public stream that everyone in the museum could see. As with most microblogging clients, public replies, direct messages, and pictures taken by the handheld devices themselves could all be incorporated into the stream. This stream could potentially become a valuable discovery tool, particularly when coupled with the location-aware features of the device. As the visitor wanders around the museum, someone posts a photo of a work he or she is interested in. Viewing the stream overlaid onto a map of the museum (à la twittervision) enables the visitor to figure out in which gallery that photo was taken. The device then provides the visitor with directions to that gallery from the visitor’s current location.

The stream could be used in a number of other ways, as well. First, the aggregate stream could be parsed to group posts into content or emotional areas. For instance, the institution could encourage the use of hashtags to identify posts about certain types of items, which would separate those posts into a usable stream, as with the Twitter stream of all posts tagged with #mw2009. An institution could encourage the use of tags like #portrait, #landscape, #still-life or similar to help other visitors locate certain types of work in the galleries. Additionally, the stream could be parsed for certain emotional key phrases, as with twistori. Tate Britain has already begun creating emotionally-themed tours (e.g., a tour of objects if you're feeling blue, happy, depressed, etc.) that are distributed as paper brochures; parsing the stream in this way could potentially be a means for creating these kinds of tours on-the-fly. As a further extension, the emotionally-themed posts could be grouped on a map as well, allowing a visitor to gauge the "emotional health" of the institution at any given moment (e.g., "it certainly appears that there are a lot of angry people in the Rothko exhibition right now!"). The public microblog stream could also be a way for individual staff voices to interact with visitors in real time. This could take the form of announcements such as posts from docents announcing the start of a tour or an educational program, but it could also encourage interaction with staff at a deeper level. For instance, a conservator could post a message indicating that he or she is about to deinstall artwork in a particular gallery, and will be available to answer questions.
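Parsing the stream in the two ways described above can be sketched with nothing more than pattern matching. The sample posts and the emotion phrase lists below are invented for illustration, and real posts would call for far more careful language handling than substring checks.

```python
import re
from collections import defaultdict

# Matches tags such as #portrait or #still-life (letters, digits, hyphens).
HASHTAG = re.compile(r"#([\w-]+)")

# Hypothetical key phrases; an institution would choose its own.
EMOTION_PHRASES = {
    "love": ("i love",),
    "anger": ("i hate", "angry"),
}


def group_by_hashtag(posts):
    """Map each hashtag to the posts that carry it."""
    groups = defaultdict(list)
    for post in posts:
        for tag in sorted(set(HASHTAG.findall(post.lower()))):
            groups[tag].append(post)
    return dict(groups)


def group_by_emotion(posts):
    """Map each emotion to the posts containing one of its key phrases."""
    groups = defaultdict(list)
    for post in posts:
        text = post.lower()
        for emotion, phrases in EMOTION_PHRASES.items():
            if any(phrase in text for phrase in phrases):
                groups[emotion].append(post)
    return dict(groups)


# Invented sample posts.
POSTS = [
    "I love this #portrait",
    "Quiet morning light #landscape",
    "I hate the crowds near this #portrait",
    "Lovely #still-life in gallery 4",
]

by_tag = group_by_hashtag(POSTS)
by_emotion = group_by_emotion(POSTS)
```

If each post also carried the gallery in which it was written, the same groups could be counted per gallery to draw the "emotional health" map.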

Community-based discovery

Beyond actively posting content, users could create other kinds of content that would be helpful for other users as well. We have already witnessed the utility of social networks for resource discovery on the Web, via social tagging sites like Delicious or feed-sharing services like Google Reader. Moving this type of utility to a mobile device is an obvious next step. Once a visitor has mapped out a self-selected tour or grouping of objects or locations, he or she could publish that collection to the mobile platform for use by other visitors. Lists of these user-generated collections could be voted upon by users as well, giving users the ability to later select from "most traveled" user-generated tours or "most highly rated" object groupings.
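The "most traveled" and "most highly rated" lists amount to two different sort orders over the same set of published collections. The sketch below assumes a hypothetical record for each shared tour, with a usage counter and a list of one-to-five ratings; the names and numbers are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class SharedTour:
    # Hypothetical record for a visitor-published tour or object grouping.
    name: str
    object_ids: list
    times_traveled: int = 0
    ratings: list = field(default_factory=list)  # e.g. 1 to 5 per vote

    def average_rating(self):
        """Mean rating, or 0.0 when nobody has voted yet."""
        return sum(self.ratings) / len(self.ratings) if self.ratings else 0.0


def most_traveled(tours, n=3):
    """Tours followed by the largest number of visitors."""
    return sorted(tours, key=lambda t: t.times_traveled, reverse=True)[:n]


def most_highly_rated(tours, n=3):
    """Tours with the highest average rating."""
    return sorted(tours, key=lambda t: t.average_rating(), reverse=True)[:n]


# Placeholder tours for illustration only.
TOURS = [
    SharedTour("Rivers", ["1", "4"], times_traveled=40, ratings=[3, 4]),
    SharedTour("Silver", ["2"], times_traveled=5, ratings=[5, 5]),
    SharedTour("New today", ["3"], times_traveled=12),
]

popular = [t.name for t in most_traveled(TOURS, n=2)]
best = [t.name for t in most_highly_rated(TOURS, n=2)]
```

A plain average favors tours with very few votes, so an institution might prefer to require a minimum number of ratings before a tour can appear in the rated list.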

These types of user-generated content allow the device to go beyond simply being an information delivery vessel, and become a platform for all types of experiences, many of which may yet be unforeseen by the institution that provides the device. The cumulative effect of all of this posting, picture-taking, interacting, and talking is both a more social experience, and one which cannot ever be repeated in the same way twice. The visitor has left the building feeling that he or she has participated in something that both could have only occurred inside the institution (as opposed to on the Web) and could have only occurred on that particular day, at that particular time. The visit has been personalized, and made unique.

Build For the Future

As can be seen from the small number of suggestions posed here, implementing the interpretive device of the future will require that museums challenge their preconceived notions about what these devices should (or can) do, while simultaneously understanding that their visitors are becoming more savvy about technology, not less. Museums will need to build for the future by cycling through a series of steps:

  1. Audience research. Museums should doggedly focus on understanding how to improve device adoption rates among their visitors. It is simply no longer tenable for a museum to not know why a majority of its audience doesn't use interpretive technology. Results of audience research should be targeted towards specific development goals on the device itself.
  2. Application-centered design. With older handheld technologies, there was always the problem that a museum was introducing a device that was almost inevitably going to be unfamiliar to most visitors. This led to an approach to application design in which the emphasis was on simplicity and making a single "foolproof" application. iPhones and their ilk, however, have led users to expect a complex device with simple applications – most iPhone applications do only one thing, in the most straightforward way possible. Museums should take advantage of this by separating functions into different applications that work in concert.
  3. Pilot projects. To date, most pilot projects involving mobile multimedia devices have focused on determining, basically, whether or not the technology is viable in the museum environment. The next wave of pilot projects must move past this and begin to determine what kinds of applications and frameworks work best for museums. Application-centered design means, for instance, that a museum could have stable "object finder" applications running on its devices while piloting "recommender" applications at the same time, on the same devices. Because the application is being piloted on devices that already have a user base, the collated user statistics will better aid audience research efforts.

Often these steps might be concurrent. In fact, a stable device could introduce a new application as a pilot project, and gauge user feedback and statistics as a form of audience research. Understanding the audience, creating separate applications, and running better pilot projects also mean that museums no longer need to develop mobile experiences in isolation. One institution might develop a good wayfinding application, another a recommendation engine, still another a microblogging client. All of these applications could be made available to all institutions – the community benefits from the community.

In any case, it is clear that museums must begin to push harder to develop mobile experiences that challenge traditional notions of interpretation if they don't wish to be left designing for yesterday's audience. Doing so means that museums must be willing to discard outmoded approaches when appropriate, incorporate new ideas and content when available, and recognize that the only steady state is that of constant improvement. Adopting that framework will ensure that mobile interpretation continues to be a vital and important component of the museum experience well into the future.


References

Anderson, Chris (2004). The Long Tail. Wired, 12.10.

Macklin, Colleen (2009). Closing Plenary. In Smithsonian 2.0.

Cite as:

Smith, K., The Future of Mobile Interpretation. In J. Trant and D. Bearman (eds). Museums and the Web 2009: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2009. Consulted