Spatial awareness, the ability to relate where objects are in space, is an important part of our daily lives. We are constantly creating spatial relationships, whether it is as simple as packing many items into the car or remembering where you parked the car at the shopping centre. Visualisation of locations on maps is well understood by the community; maps themselves are a familiar part of our lives and a familiar interface.
Museums deal with large digital collections categorised and described under strict guidelines. Typical searches are based on these categories and the use of keywords. Similarly, large collections can be searched spatially by their location on the Earth, and keyword searches for locations used. However, the true power of spatial relationships is demonstrated when you visualise the information by its location interactively across the entire globe. This experience allows the users to navigate their world, drilling into more information as they zoom in, and exploring new detailed information for a region as they pan.
The effective visualisation of large sets of information poses several problems. In non-spatial searches, the sheer volume of data (the number of records multiplied by the size of each record) is typically handled by paging through the results. This concept does not work spatially; instead, we must use other techniques to reduce transmission size. Spatially we can deal with the current viewport of the users, that is, the section of the Earth they can currently see on screen; loading only information for that area can vastly reduce the data transmitted. Clustering data, a technique of grouping closely-located data based on the size of the display and elevation, is effective at summarising dense areas. On the interactive map, as you zoom into a dense cluster, more space becomes available on your screen for that area to show, and the cluster separates. Eventually you reach single locations as you zoom right into the detailed region.
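The two techniques described above, viewport filtering and clustering, can be sketched as follows. This is a minimal illustration and not the method used on the Prints site: it assumes a simple grid-based clustering in which the visible area is divided into a fixed number of cells, and the function names, parameters and grid size are hypothetical.

```python
from collections import defaultdict


def in_viewport(lat, lon, south, west, north, east):
    """True if the point lies inside the bounding box.

    Assumption: the viewport does not cross the 180 degree meridian.
    """
    return south <= lat <= north and west <= lon <= east


def cluster_points(points, south, west, north, east, grid_cols=8, grid_rows=6):
    """Group the points visible in the viewport into grid cells.

    Returns a list of (mean_lat, mean_lon, count) tuples, one per
    non-empty cell. The grid is tied to the viewport, so zooming in
    makes each cell cover less ground and dense clusters separate.
    """
    cell_w = (east - west) / grid_cols
    cell_h = (north - south) / grid_rows
    cells = defaultdict(list)
    for lat, lon in points:
        if not in_viewport(lat, lon, south, west, north, east):
            continue  # outside the visible area: never transmitted
        col = min(int((lon - west) / cell_w), grid_cols - 1)
        row = min(int((lat - south) / cell_h), grid_rows - 1)
        cells[(row, col)].append((lat, lon))
    clusters = []
    for members in cells.values():
        n = len(members)
        mean_lat = sum(p[0] for p in members) / n
        mean_lon = sum(p[1] for p in members) / n
        clusters.append((mean_lat, mean_lon, n))
    return clusters
```

Calling `cluster_points` with a viewport covering all of Australia merges two nearby Sydney points into one cluster, while a viewport covering only Sydney returns them as separate single locations.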
Prints and printmaking Australia Asia Pacific provides a gateway for information on printed images from Australia and the Asia Pacific region. The focus of the site is prints and printmaking by artists from Australia, Aboriginal Australia, the Torres Strait Islands, Papua New Guinea, Maori and Pakeha Aotearoa New Zealand, and the Pacific region, including New Caledonia, Niue, Samoa, Kiribati, and the Solomon Islands. The site also includes references to prints and printmaking in China, India, Indonesia, Korea, Malaysia, Philippines, Singapore, Taiwan, Thailand and Vietnam.
Based on the collection at the National Gallery of Australia, the site provides free on-line access to 17,000 images. The databases can be searched by artist, subject or print techniques such as etching, woodcut, wood-engraving, linocut, lithograph, screenprint, monotype, and other print-related processes such as posters and artists’ books. An index to on-line information on printmakers, print workshops, print publishers, print galleries and public and private collections is provided.
This service is an initiative of Roger Butler, Senior Curator of Australian Prints and Drawings, National Gallery of Australia. This access initiative is generously supported by the Gordon Darling Australia Pacific Print Fund. (http://www.printsandprintmaking.gov.au)
What Is Spatial Data?
Spatial data, also known as geospatial data or geographic information, is the data or information that identifies the geographic location of features and boundaries on Earth, such as natural or constructed features, oceans, and more (http://www.webopedia.com).
Within our digital collection we often have location information. This may be a place of birth, a gallery address, a location where the work was made, or the place the work depicts. These locations will vary in quality and accuracy. Not all records will necessarily contain a value.
The Prints database contains a number of location fields. Within the Creator / Artist catalogue we find high quality data for Birth Place, Death Place and the Movements of the creator. The printed works themselves have a Place Made field, and the Subject of the work is sometimes a location. These locations are textual and are at best a city or town with corresponding state and country. Galleries, on the other hand, have a real world address, typically with an actual street name and number.
Additionally, we can derive location information from related catalogues. Works are exhibited at Galleries, allowing us to provide not only a set of locations for exhibitions but also an interesting set of locations for the works themselves. Similarly, we can derive locations from the artist of a work, but we must ask ourselves what the relevance of visualising printed works by some of these related fields is. For example, the effectiveness of showing works by the death place of the artist is very low. Some derived locations make sense and add significant value, while others certainly do not.
What turns traditional collections into spatial data?
Geocoding is the activity of defining the position of geographical objects relative to a standard reference (http://www.sli.unimelb.edu.au/gisweb/glossary.htm). When visualising locations spatially, we use numerical co-ordinates, not textual locations, to allow for fast plotting on the map and also mathematical comparisons and calculations.
In practical terms, this geocoding is the process of converting a textual location of varying accuracy and quality into a Latitude and Longitude pair that uniquely identifies a location on Earth. We observe five main categories of textual locations:
- No data. There is nothing we can do here. We cannot plot this value on the map. A common mistake is to plot at (0,0); this is not correct as this is a real-world location on the equator near Africa.
- General Area. This may simply be a country, state, city or Zip/Post code. A common technique is to plot the location at the centroid or middle of the area. This alone can be very misleading. Again, the actual co-ordinate will be an exact location on the earth: the middle of a city may very well be over a house or building. We need to supplement the location with an accuracy value or “tolerance” to recognise this. Our visualisation techniques can then use this value to provide more meaning within the interface.
- Exact Street Address. Depending on its quality, the geocoder will either approximate the position of the street number as a proportion of the length of the road segment, called “street interpolation”, or provide an accurate location, called “Rooftop Geocoding”.
- Ambiguous location. Take, for example, 1 Margaret Street, Queensland, Australia. There are half a dozen possible matches. At this point a decision needs to be made. For large automated geocodes, this record will have to fall back to the general area of Queensland, Australia. It would be up to a subject matter expert to review other relevant information or a set of more advanced rules. For example, the lack of city or town may indicate it refers to Brisbane, the capital city.
- Not found. The address may not be found due to a number of factors. The textual location may contain spelling errors, or use a non-standard address representation. Over time the spelling of locations can change: a valid address from the 1800s may no longer exist today. Conflicting values, for example the wrong Zip/Post code, can lead to no results. Additionally, the database from the geocoder may not contain that location. Again the location may fall back to the subject matter expert or a general area.
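The five categories above can be carried through to the map by storing an explicit outcome with each record. The sketch below is one possible representation, not the Prints database schema: the enum values, field names and the use of kilometres for the tolerance are assumptions made for illustration. The key point is that a missing location is stored as "no value" rather than as (0,0).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class GeocodeOutcome(Enum):
    """Hypothetical codes for the five categories of textual location."""
    NO_DATA = "no_data"
    GENERAL_AREA = "general_area"
    EXACT_ADDRESS = "exact_address"
    AMBIGUOUS = "ambiguous"
    NOT_FOUND = "not_found"


@dataclass
class GeocodeResult:
    outcome: GeocodeOutcome
    latitude: Optional[float] = None      # None means "unknown", never 0.0
    longitude: Optional[float] = None
    tolerance_km: Optional[float] = None  # assumed unit for the accuracy value

    def is_plottable(self) -> bool:
        # Compare against None rather than testing truthiness, so that a
        # genuine location at latitude 0 or longitude 0 is still plotted.
        return self.latitude is not None and self.longitude is not None
```

A record with no data is then simply skipped by the map layer, while a general-area match keeps its tolerance for the interface to display.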
The Prints database contains all of these possible options. Due to the size of the database it is not practical to manually review each record. By reporting on the confidence of each level of detail within the location for each field, we can find a suitable level. The levels in order of most to least accurate are:
- Street Address
- Zip/Post code
- Town / City
The geocoding for the Prints site was completed using Microsoft’s MapPoint Web Service (MWS). This geocoder offers extensive coverage of much of the populated world. The process is performed in bulk and separately from the Web site logic. As our data is unlikely to change, we perform this operation once and store the Latitude, Longitude and tolerance against the record. Within our database a “GeoCodeStatus” allows us to identify new or edited records that need to be updated, and those that failed to find a result.
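The fall-back through levels of detail can be sketched as a small bulk-processing helper. This is an illustration only: the `geocoder` argument is a hypothetical stand-in for whatever service is used (it is not the MapPoint Web Service API), and the record field names are assumptions.

```python
from typing import Callable, Dict, Optional, Tuple

Coordinate = Tuple[float, float]
# Hypothetical geocoder: takes a textual location, returns (lat, lon) or None.
Geocoder = Callable[[str], Optional[Coordinate]]

# Levels from most to least accurate, with the (assumed) record fields
# used to build the query at each level.
LEVEL_FIELDS = [
    ("street", ["street", "town", "state", "postcode", "country"]),
    ("postcode", ["postcode", "state", "country"]),
    ("town", ["town", "state", "country"]),
]


def geocode_with_fallback(
    record: Dict[str, str], geocoder: Geocoder
) -> Optional[Tuple[float, float, str]]:
    """Try each level in turn; return (lat, lon, level) for the first hit.

    Returns None when no level produces a result, so the caller can mark
    the record as failed instead of storing a made-up coordinate.
    """
    for level, fields in LEVEL_FIELDS:
        if not record.get(level):
            continue  # the record has no value at this level of detail
        query = ", ".join(record[f] for f in fields if record.get(f))
        hit = geocoder(query)
        if hit is not None:
            return hit[0], hit[1], level
    return None
```

The returned level can be stored alongside the Latitude and Longitude, so the tolerance shown on the map reflects how precise the match actually was.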
As a Web service it is relatively easy to set up this process. The most complex part is creating logic to handle the various quality and accuracy of data. The accuracy of the results from the geocoder varied from country to country. Currently only the US supports the highest quality of Rooftop geocoding results.
No information is available for the physical size of the cities, countries, etc. so the implementation of the tolerance was based on the eight detail level codes above. This allows us to tailor the visual representation of this tolerance independently of what was stored. Currently this is applied as a circular polygon to show there is no clear point.
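A circular polygon of this kind can be generated from a centre point and a radius. The sketch below uses the standard great-circle destination formula on a spherical Earth; the radius chosen for each detail level is purely an assumed example, since the actual values used on the Prints site are not given here.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation

# Assumed example radii per detail level (hypothetical values).
TOLERANCE_RADIUS_KM = {"street": 0.1, "postcode": 5.0, "town": 15.0}


def circle_polygon(lat, lon, radius_km, num_points=36):
    """Return num_points (lat, lon) vertices on a circle around the centre.

    Vertices start due north of the centre and proceed clockwise.
    """
    lat1 = math.radians(lat)
    lon1 = math.radians(lon)
    d = radius_km / EARTH_RADIUS_KM  # angular distance in radians
    vertices = []
    for i in range(num_points):
        bearing = 2.0 * math.pi * i / num_points
        lat2 = math.asin(
            math.sin(lat1) * math.cos(d)
            + math.cos(lat1) * math.sin(d) * math.cos(bearing)
        )
        lon2 = lon1 + math.atan2(
            math.sin(bearing) * math.sin(d) * math.cos(lat1),
            math.cos(d) - math.sin(lat1) * math.sin(lat2),
        )
        # Normalise longitude into the range [-180, 180).
        lon_deg = (math.degrees(lon2) + 540.0) % 360.0 - 180.0
        vertices.append((math.degrees(lat2), lon_deg))
    return vertices
```

For example, `circle_polygon(-35.28, 149.13, TOLERANCE_RADIUS_KM["town"])` yields a 36-sided outline that can be passed to the mapping platform as a polygon. Because only the detail level is stored, the radius table can be adjusted later without re-geocoding.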
How do we use this location within our existing Web site?
Alternative Search Results View
Within the collection search of an existing Web site, the simplest integration is as an alternative search results view. Typically a list of results is presented in textual form; retrofitting a tab or navigation button to View results spatially can be integrated into the design. This will load the complete set of results on an interactive map. This simple approach makes use of the powerful search already implemented in the site. The interactive map allows the user to essentially filter the results spatially and navigate to the more detailed view.
For example, a search on intaglio prints would return 13,654 records. Textually we can sort alphabetically, by date, by catalogue raisonné, or filter on images only. For Web performance, only small sets of results are ever returned: the user must explore page by page. Spatially, we can show all these results on the map, clustering data to show the density of results around the world. Immediately this gives the user visual insight into the geographical distribution. If the user is interested in prints from a region of New Zealand, a simple interaction of zooming and panning gets to the region of interest. Essentially we add a dynamic filter to results and present them in a new visual interactive interface.
What do we do with those entries that don’t have a location?
The biggest issue we face is incomplete data and data accuracy. Unlike a date search where we can create an unknown category, we simply can’t show results on the map without a location. A common mistake is to plot these results at (0,0), a location near Africa in the ocean. This is misleading and confusing to the user. An alternative is to provide informational text to notify the user that only X of Y results have been plotted overall on the map for the given criteria.
A more powerful section to the site is a dedicated Spatial Browse for the collection. This section provides an unrestricted view of the entire collection spatially. The addition of complementary filters can dynamically modify what is on the map in real time.
The most effective filter that complemented the map was a timeline. The ability to filter the data visualised on the map by periods in history provides amazing insights. Additionally, filters that provide a selection from a known set of values, for example the print type, are quite effective.
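A timeline filter combined with a known-value filter can be sketched as a single function applied before the records are sent to the map. The field names `year` and `print_type` are hypothetical, chosen only for this illustration.

```python
def filter_records(records, year_from=None, year_to=None, print_types=None):
    """Return the records matching the timeline range and print types.

    A filter left as None is not applied. Records with no year are
    excluded whenever a year bound is active, because they cannot be
    placed on the timeline.
    """
    selected = []
    for rec in records:
        year = rec.get("year")
        if year_from is not None and (year is None or year < year_from):
            continue
        if year_to is not None and (year is None or year > year_to):
            continue
        if print_types is not None and rec.get("print_type") not in print_types:
            continue
        selected.append(rec)
    return selected
```

Re-running the filter as the user drags the timeline, and redrawing only the returned records, gives the effect of the map changing in real time.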
What did we find interesting as we explored spatially?
It is only early in this process, but this style of interface immediately allows you to view the entire collection, something impossible as a giant text list of results. Clearly the view is summarised, and navigating to a pre-determined record by anything apart from location is difficult, bordering on impossible, with such a large dataset. This will not replace the text search. What you do get is immediate visualisation of the size and geographic extent of the collection, something that is not apparent from a piece of text saying “1-50 of 45,000”. The cliché of “a picture tells a thousand words” is multiplied several times here.
Major densities are initially interesting, but to go a step further and explore how these change and build over a timeline is amazing. Also, distribution across the world over a timeline is equally interesting.
Lastly, outliers, single points at strange locations which you did not expect, are quite interesting to explore. This style of browsing for these records would be impossible any other way.
Leading Web-Based Mapping Technologies
Microsoft Virtual Earth
Virtual Earth is a Web 2.0 browser-based mapping platform. It uses AJAX technology without the requirement of any browser plug-ins to provide an incredible amount of visual mapping data. The entire world is broken into small tiles; these are reconstructed on demand in your browser to display road networks and aerial photography. This imagery is then enhanced by high resolution “Birdseye” images giving a unique 45 degree angle view of all sides of buildings and features.
A 3D mode is available as a downloadable plug-in. To support 3D, the software must access your computer's 3D hardware, which is not available to HTML; hence the plug-in. Within this 3D environment users can more fluidly navigate, rotate and tilt the planet. The world becomes a true sphere without the distortions present in the flattened 2D mode. In 2D mode, the map uses the Mercator projection to flatten the earth. The combination of a full terrain elevation model and 3D buildings brings cities and locations to life.
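To illustrate how a Mercator-projected world can be broken into tiles, the sketch below converts a Latitude and Longitude into tile coordinates using the commonly used Web Mercator scheme, in which zoom level z has 2^z by 2^z tiles. That Virtual Earth follows exactly this scheme is an assumption here; the quadkey naming shown is the one published for Microsoft's tile system, and the function names are our own.

```python
import math

MAX_LATITUDE = 85.05112878  # Web Mercator cuts off near the poles


def tile_xy(lat, lon, zoom):
    """Return the (x, y) tile containing the point at the given zoom level.

    x counts tiles eastwards from 180 degrees west; y counts tiles
    southwards from the top of the map.
    """
    lat = max(-MAX_LATITUDE, min(MAX_LATITUDE, lat))
    n = 2 ** zoom  # tiles per side at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so that points on the eastern or southern edge stay in range.
    return max(0, min(n - 1, x)), max(0, min(n - 1, y))


def quadkey(x, y, zoom):
    """Encode tile coordinates as a base-4 string, one digit per zoom level."""
    digits = []
    for i in range(zoom, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if x & mask:
            digit += 1
        if y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)
```

The same arithmetic explains the distortion mentioned above: equal steps in tile rows cover less and less ground towards the poles, which is why the flattened 2D map stretches high-latitude regions.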
Combined with both of these are rich mapping functions, including the finding of locations or business types and step-by-step driving directions. User-generated content can be overlaid in several ways.
The first is as interactive points, lines and polygons. The points are typically called “Pushpins” and represent a single location on the planet. Lines are great for marking a path while polygons can represent an area. These objects have an “infobox” or popup that can contain any HTML content you desire. This is a perfect place for some summary data for the record and a thumbnail image.
Second is the support for an overlay of tiles. Using the same high performing and scalable tile system, the overlay can provide visual information or improved imagery. Typically this layer is just for a small region on the earth and is pre-generated. The layer can be masked and a transparency applied. The layer can also be dynamically created and is often used in this way for a heat map providing semi-transparent coloured areas representing some relevant information.
Google Maps / Google Earth
Google Maps is very similar to Virtual Earth 2D, in that it is based on the same AJAX concepts and has the same tile system. However, Google offers different features, data and imagery. Although coding uses the same concepts and language, the syntax is very different. The power of competition is great for consumers; Google Maps outperforms Virtual Earth in some areas and underperforms in others. Both companies update their software regularly to compete.
Google Earth brought 3D mapping to consumers. A very innovative product, this application is installed on your computer and uses your Internet connection to stream in the images and information on demand. This differs from Virtual Earth’s 3D “in browser” experience. A more evolved product, Google Earth offers strong support for user-generated content, including 3D models.
Sharing And Aggregating Collections Spatially
Both mapping technologies allow the rendering of data in a proprietary format for complete control or from a Standard XML format. GeoRSS, an extension on RSS, where additional nodes for location have been added, is useful for providing a location for items in a newsfeed. KML (Keyhole Markup Language), the format behind the original Google Earth product, is more complex and defines many complex structures and features. These files are a complete set of data. This is not ideal for very large databases. Currently there is no clear solution for how to provide a large collection to be aggregated into another system.
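As an illustration of how little is needed to add a location to a newsfeed, the sketch below builds a small RSS 2.0 feed in which each item carries a GeoRSS point (a latitude and longitude separated by a space). The function name and the item fields are hypothetical; only the `georss:point` element and its namespace come from the GeoRSS format itself.

```python
import xml.etree.ElementTree as ET

GEORSS_NS = "http://www.georss.org/georss"
ET.register_namespace("georss", GEORSS_NS)


def build_georss_feed(title, link, description, items):
    """Return an RSS 2.0 document as a string, one <item> per located record.

    Each item is a dict with 'title', 'link', 'lat' and 'lon' keys
    (assumed field names). Items without coordinates should be left
    out by the caller rather than given a placeholder location.
    """
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description
    for item in items:
        node = ET.SubElement(channel, "item")
        ET.SubElement(node, "title").text = item["title"]
        ET.SubElement(node, "link").text = item["link"]
        point = ET.SubElement(node, "{%s}point" % GEORSS_NS)
        point.text = "%s %s" % (item["lat"], item["lon"])
    return ET.tostring(rss, encoding="unicode")
```

Because the whole feed is generated as one file, this approach shares the limitation noted above: it suits a newsfeed or a small selection, not a collection of tens of thousands of records.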
Both Google and Microsoft offer a personalised mapping experience with http://maps.google.com ‘my maps’ and http://maps.live.com ‘collections’. Both of these allow you to create your own content on the map or import data, save a new collection, and publish this in a shared format. Again this is limited to small dataset sizes.
Early Adopters Of These Technologies
Real Estate has been the fastest by far to adopt this technology into their Web sites. In their case, the houses are typically easy to Geocode. Where there is an issue, the Estate agent can correct the location and even enhance the listing by drawing the boundary of the property. The different implementations give insights into how to use this technology effectively. When showing just a single house it is useful to identify where the house is, but doing so doesn’t add a significant amount of value, whereas seeing all the houses for sale in an area, filtering by features and price and exploring their proximity to shops, schools and transport is far more valuable.
Photo sites for consumers and professionals have all integrated some sort of mapping interface for viewing photos spatially. Most of these sites actually use the map as the interface to locate the photo or rely on the supply of a GPS position. This removes the need for Geocoding and places the responsibility to accurately locate the photo’s position on the user.
Emergency response planning and asset tracking industries are using the rich interface and ability to overlay both interactive shapes and new tile layers to provide real time, on-line interfaces to their rich data. These solutions are typically more complex as they require data from remote devices (most often GPS’s) to be integrated with user input in real time. These systems rely on logic to trigger alerts and statuses.
Finally, major retail chains are using mapping, often enhanced by current events and driving directions, to provide store locators.
Integration Into Portable Devices
Sales of personal navigation devices are booming. These come in various flavours, including stand alone devices, factory-installed devices in vehicles, and software for smart phones and PDAs. The devices use the GPS to pinpoint your location on the earth and can tell and show you how to get to your destination.
But the devices do not stop there. Aside from music, photos and videos, most current devices will integrate with your telephone to bring hands-free voice calls. This data connection opens up the ability to access the Internet to retrieve content, specifically spatial content for a current location. Currently this is being used for live traffic integration where, unlike one-way, broadcast-only transmissions, this technology allows the unit to provide speed and location data back to the service to further enhance traffic data.
This opens up opportunities to provide other location-specific information to the user. Combined with user profile information, dynamic tour guides can be created, drawing on spatial data from various sources.
An opportunity exists for Museums to be content providers for these devices. This does not necessarily mean you have to develop custom applications. If your collection is made available spatially on the Web, applications can utilise the content for many diverse purposes.
Heavy investment into spatial technology by the leaders in Web search does indicate the likelihood of searching spatially for more and more data in the near future. It may not be long before your on-line collection goes beyond today’s textual and image searches and returns results for locations both textual and from GPS.
Geolocation, the process of determining users’ locations from their Internet addresses, can be used to show collection items more relevant to those users. This works by referencing a lookup database of IP addresses or wifi points to determine from users’ connections their approximate locations on the Earth. Within the Prints site, for example, this technology could allow the site to focus on content within close proximity to a user, perhaps specific printmakers from Sydney or just from the country Australia.
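Once an approximate user location is known, selecting nearby content is a distance calculation. The sketch below assumes the lookup from IP address to Latitude and Longitude has already been done by some external service, which is not shown; it uses the haversine formula on a spherical Earth, and the item field names are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean radius; spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2.0) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2.0) ** 2
    )
    return 2.0 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearby_items(user_lat, user_lon, items, max_km):
    """Return the items within max_km of the user, nearest first.

    Each item is a dict with 'lat' and 'lon' keys (assumed names).
    """
    scored = []
    for item in items:
        dist = haversine_km(user_lat, user_lon, item["lat"], item["lon"])
        if dist <= max_km:
            scored.append((dist, item))
    scored.sort(key=lambda pair: pair[0])
    return [item for _, item in scored]
```

Since IP-based locations are only approximate, a generous `max_km` (city or country scale rather than street scale) is the safer choice.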
Digital photography is rapidly introducing location information for images. The Exchangeable image file format (Exif), the standard for metadata for photos, has fields for location. New digital cameras are increasingly becoming equipped or optioned with GPS. Leading photo sites already allow for photos to be displayed on Google Maps and Virtual Earth based on these tags.
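Exif stores a GPS position as degrees, minutes and seconds together with a hemisphere reference (N/S for latitude, E/W for longitude), whereas mapping platforms expect signed decimal degrees. The conversion can be sketched as below; reading the tags out of the image file is left to an Exif library and is not shown, and the function name is our own.

```python
def dms_to_decimal(degrees, minutes, seconds, ref):
    """Convert a degrees/minutes/seconds value to signed decimal degrees.

    ref is the Exif hemisphere reference: 'N' or 'S' for latitude,
    'E' or 'W' for longitude. South and west become negative.
    """
    if ref not in ("N", "S", "E", "W"):
        raise ValueError("unknown hemisphere reference: %r" % (ref,))
    value = degrees + minutes / 60.0 + seconds / 3600.0
    return -value if ref in ("S", "W") else value
```

For example, a photo tagged 35 degrees 18 minutes South, 149 degrees 7 minutes 48 seconds East converts to approximately (-35.3, 149.13), ready to be plotted as a pushpin.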
Looking further ahead, Photosynth is an amazing new technology from Microsoft Live Labs; it will change forever the way you think about digital photos. The software takes a large collection of photos of a place or an object, analyses them for similarities, and displays them in a reconstructed three-dimensional space (http://labs.live.com/photosynth/).
Photosynth offers some amazing possibilities. The most interesting are the aggregation of images from many sources and the power of enriching images with metadata from other sources. Essentially the technology can match images based on features. This builds relationships between images from different sources. The ability to drill into the wealth of information from those sources is very interesting. Take, for example, a statue at a Museum. The Museum would have a wealth of information about the artists involved, its original location or where it was made, how it was made, etc. A visitor to the Museum would only need to take a photo of the statue and find the match within Photosynth to access that wealth of information about that statue for their photograph.
The same concept can be applied to general spatial data. New relationships can be formed based on location.
These combined technologies could allow for a number of interesting experiences in the future. For example, a visitor walking around Pompeii in Italy could be viewing on their mobile phone documented museum pieces originating from Pompeii but housed in various Museum Collections around the world. The visitor would be accessing the rich content, in real time and on demand, directly from the Museums. The key is the GPS location.
As with any project that relies on a large set of data, a significant amount of time is required to analyse and cleanse it. The powerful Web-based mapping technologies made accessible by the large software vendors give us the best opportunity to get this content out on the Web for a rich and interactive experience. These rich features and imagery are not something we would want to re-invent.
The Prints Web site will continue to improve; more entries will receive locations; and existing entries will become more accurate. Once a process is in place, new entries can be added with ease. For such a large dataset, it is unlikely that all records will ever be complete. This is certainly not a reason to delay release of this information. A reasonable judgment has to be made as to what is complete and accurate enough to be live.
The innovative interface has provided another way to visualise and browse our search results; as well, it has become an interface for browsing and filtering in its own right. With the hard work of geocoding in place, the collection is ready to be searched and aggregated into new innovative technologies.
Now is a great time to start looking at how your collection could be explored spatially. The technology exists to effectively visualise the collection on-line, and the opportunities for exposing such information to other innovative ideas are just around the corner.