Rory and I have spent this week melting in the summer heat of Madrid at Open Repositories 2010. This is the third OR event I’ve attended (see my reports on OR09 and OR08) and the first that Rory has been able to make. As usual, a great opportunity to find out what’s going on in the world of repositories, for both developers and repository managers, and catch up with friends and colleagues working in the field.

We’ve brought along a smashing poster of our work for SOAS, which attracted a considerable amount of interest for its distinctive and attractive Web user interface.

We’ll write more in due course about the conference themes we found most interesting.

But we also decided it was about time we entered the annual Repository Developer Challenge. And at the conference dinner on Wednesday we learned that our entry was awarded first prize. La Roja weren’t the only ones with something to celebrate; and we’re very pleased to have our names on the same (virtual) cup as such giants of the reposphere as Tim Donohue, David Tarrant, Ben O’Steen and Tim Brody!

The challenge, set by the JISC-funded DevCSI project, managed by UKOLN, was to “Create a functioning repository user-interface, presenting a single metadata record which includes as many automatically created, useful links to related external content as possible.”

Our idea began by simply imagining a typical repository abstract page overloaded with functional, contextual popup menus, just like a Google Docs page, for example. We could add a menu to each metadata element, and populate the menu with links pertinent to the element.

In order to do that we needed:

  • first, to be able to automatically identify the metadata values (title, author and the like) in the page;
  • second, to manage a list of the web services that are appropriate to each metadata element; and
  • third, some clever scripts that put all this stuff together.

The result would automatically generate, on each item’s abstract page, links such as “Find this title at Amazon” or “Find this author at the BL”.

The first task was achieved by editing the EPrints templates to ensure each metadata value was wrapped in RDFa semantic tags (spans with a ‘property’ class) that we could identify programmatically. The semantic schemas used are formally declared in the page header.
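To make the idea concrete, here is a minimal sketch in Python of how tagged values might be picked out of a page. It is not the code we wrote for the challenge. It assumes the values sit in spans carrying a `property` attribute, as RDFa defines it; if your templates mark them with a class instead, the check in `handle_starttag` would need to change. The class name, function name, sample markup and specimen values are all made up for illustration, and nested spans are not handled.

```python
from html.parser import HTMLParser


class PropertyExtractor(HTMLParser):
    """Collect (property, text) pairs from <span property="..."> elements."""

    def __init__(self):
        super().__init__()
        self.pairs = []        # finished (property, text) pairs
        self._current = None   # property name of the span being read
        self._text = []        # text fragments seen inside that span

    def handle_starttag(self, tag, attrs):
        prop = dict(attrs).get("property")
        if tag == "span" and prop and self._current is None:
            self._current = prop
            self._text = []

    def handle_data(self, data):
        if self._current is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        # Assumption: tagged spans are not nested inside one another.
        if tag == "span" and self._current is not None:
            self.pairs.append((self._current, "".join(self._text).strip()))
            self._current = None


def extract_properties(html):
    """Return every tagged metadata value found in the page, in order."""
    parser = PropertyExtractor()
    parser.feed(html)
    parser.close()
    return parser.pairs


# Illustrative abstract-page fragment, not real Linnean Online markup.
SAMPLE_PAGE = """
<h1><span property="dc:title">Rosa canina specimen</span></h1>
<p>Genus: <span property="dwc:genus">Rosa</span></p>
"""

for name, value in extract_properties(SAMPLE_PAGE):
    print(name, "=", value)
```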

The second aim was achieved by identifying appropriate target services, and the exact URL required to activate them. As a simple example, we can easily create a link to search Google for VALUE using the URL http://www.google.com/#q=VALUE. By way of a data source for these services, I created a spreadsheet in which each row linked a metadata element (e.g. dc:title) to a service (e.g. Google).
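As a rough illustration of the services table, the sketch below treats the spreadsheet as CSV text and fills in the VALUE placeholder. The column names (`element`, `service`, `url_template`) and the function names are assumptions for this example, not the layout of the actual spreadsheet. Only the Google URL pattern comes from this post.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical export of the services spreadsheet: one row per element/service pair.
SERVICES_CSV = """element,service,url_template
dc:title,Google,http://www.google.com/#q=VALUE
dwc:genus,Google,http://www.google.com/#q=VALUE
"""


def load_services(text):
    """Map each metadata element to a list of (service name, URL template)."""
    services = {}
    for row in csv.DictReader(io.StringIO(text)):
        entry = (row["service"], row["url_template"])
        services.setdefault(row["element"], []).append(entry)
    return services


def build_url(template, value):
    """Substitute a percent-encoded metadata value for the VALUE placeholder."""
    return template.replace("VALUE", quote(value))


services = load_services(SERVICES_CSV)
print(build_url(services["dc:title"][0][1], "Rosa canina"))
```

Keeping the table as plain data means a repository manager can add or remove a service by editing a row, without touching any code.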

The third part was the tricky bit that Rory deftly dispatched. He created scripts that scour the Abstract page looking for RDFa tags, then look them up in the services data table, and dynamically create links as appropriate.
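The lookup-and-link step might look something like the following Python sketch. Rory’s actual scripts run in the page itself and are not reproduced here. This version assumes the tagged values have already been collected as (element, value) pairs and that the services table is a plain dictionary; the function name, the link wording and the sample data are all illustrative.

```python
from html import escape
from urllib.parse import quote


def make_links(pairs, services):
    """Build one HTML link per (metadata value, matching service) combination."""
    links = []
    for prop, value in pairs:
        # Elements with no entry in the services table simply produce no links.
        for name, template in services.get(prop, []):
            url = template.replace("VALUE", quote(value))
            # Use the part after the prefix ("title" in "dc:title") for the label.
            label = f"Find this {prop.split(':')[-1]} at {name}"
            links.append(f'<a href="{escape(url)}">{escape(label)}</a>')
    return links


# Illustrative inputs: values found on the page, and a one-row services table.
example_pairs = [("dc:title", "Rosa canina"), ("dc:creator", "Linnaeus")]
example_services = {"dc:title": [("Google", "http://www.google.com/#q=VALUE")]}

for link in make_links(example_pairs, example_services):
    print(link)
```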

We chose to create an example using the test server that we maintain for the Linnean Online collection. Linnaeus’s botanical specimens make a nice change from the usual ETDR fare of most repositories. This also allowed us to demonstrate using a metadata schema from a domain other than the ubiquitous Dublin Core: the life sciences have developed a schema called Darwin Core, which defines the metadata needed in that domain, such as Genus and Species. What’s more, there is a wide range of resources in the field that might usefully be linked to, such as the collections at Kew or the Natural History Museum, the International Plant Names Index and the Encyclopedia of Life.

The results were pretty much as we’d hoped. It’s worth noting that, while we implemented it in EPrints, the technique could be applied to any template-based repository platform, or, for that matter, virtually any web application. Once the RDFa markup and the scripts are in place in the templates, it is only necessary to edit the table of data services in order to add or remove links. With only a bit more polish than we had time for during the conference, we think that this could be a useful addition to the toolkit of any repository developer or repository manager. We’ll keep you posted!

UKOLN filmed the proceedings so you can see me presenting our entry at the conference if you want.

Winner of the Developer Challenge at OR10 (Madrid) – Richard Davis and Rory McNicoll from UKOLN on Vimeo.
