<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ulcc da blog &#187; EAD</title>
	<atom:link href="http://dablog.ulcc.ac.uk/tag/ead/feed/" rel="self" type="application/rss+xml" />
	<link>http://dablog.ulcc.ac.uk</link>
	<description>blogging about digital archives &#38; repositories since 2007</description>
	<lastBuildDate>Tue, 14 May 2013 10:42:32 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
		<item>
		<title>Open Repositories 2011 (Part 1)</title>
		<link>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-1/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-1/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 10:00:46 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[dspace]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[eprints]]></category>
		<category><![CDATA[Open Repositories]]></category>
		<category><![CDATA[OR11]]></category>
		<category><![CDATA[repositories]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1487</guid>
		<description><![CDATA[Rory and I had a fun, productive and informative time at Open Repositories 2011 in Austin: everyone involved agreed that this year&#8217;s OR conference at the University of Texas was a great success. The conference kicked off with a keynote from Jim Jagielski of the Apache Software Foundation, describing the history and organisation behind Apache [...]]]></description>
			<content:encoded><![CDATA[<p>Rory and I had a fun, productive and informative time at Open Repositories 2011 in Austin: everyone involved agreed that this year&#8217;s OR conference at the University of Texas was a great success.</p>
<div id="attachment_1490" class="wp-caption alignright" style="width: 205px"><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/IMAG0559.jpg"><img class="size-medium wp-image-1490 " title="Chris Awre, William Nixon, Rory McNicholl at the Longhorns stadium" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/IMAG0559-217x300.jpg" alt="" width="195" height="270" /></a><p class="wp-caption-text">Chris Awre, William Nixon, Rory McNicholl at the Texas Longhorns stadium</p></div>
<p>The conference kicked off with a keynote from Jim Jagielski of the <a href="http://www.apache.org/">Apache Software Foundation</a>, describing the history and organisation behind Apache and its projects. It was observed by some in the Twitter backchannel that the talk could as easily have been from 2001 as 2011, but for all that it was a worthwhile reminder that, in all our efforts, we stand on the shoulders of the giants who created and maintain the infrastructure of the Web and the Internet. And also that many of our endeavours benefit from a little more dedication and commitment than you can usually squeeze between 9-to-5.</p>
<p>The closing keynote was by repositories stalwart Clifford Lynch, who managed to touch on so many perennial repository themes, I won&#8217;t attempt to summarise them. There is a handy <a href="http://storify.com/datag/clifford-lynch-keynote-at-open-repositories-2011/">anthology of tweets about his talk on Storify</a>.</p>
<p>In between were plenty of presentations and opportunities to meet friends old and new from the United States of Repoland &#8211; some we have worked with, some we would like to work with, and many with challenging ideas and insights into the many facets of working with repositories.</p>
<p><span id="more-1487"></span>The OR conference hops back and forth across the Atlantic (I&#8217;ve previously attended <a href="http://dablog.ulcc.ac.uk/2008/04/02/open-repositories-2008-in-southampton/">OR08 in Southampton</a>, <a href="http://dablog.ulcc.ac.uk/2009/06/10/open-repositories-2009/">OR09 in Atlanta</a> and <a href="http://dablog.ulcc.ac.uk/2010/07/09/open-repositories-2010-in-madrid/">OR10 in Madrid</a>). Unfortunately, when the conference is held Stateside, the representation of the EPrints community tends to be noticeably smaller. Not that there aren&#8217;t EPrints users in the USA (we were particularly pleased to meet the team from <a href="http://library.caltech.edu/">Caltech Library</a>, very happy users and advocates of EPrints), but the distribution of software platforms is significantly different from Europe in general, and the UK in particular (if you are interested in such things, you can check out the statistics at <a href="http://www.opendoar.org/find.php?format=charts">OpenDOAR</a>). And of course travel logistics (and costs) are non-trivial. Luckily Rory and I had been saving our prize money from <a href="http://devcsi.ukoln.ac.uk/blog/2010/07/13/we-have-a-winner-developer-challenge-at-open-repositories-2010-madrid/">last year&#8217;s Developer Challenge</a>!</p>
<p>While it eluded me in previous years, I think at last I am starting to grasp at least some of the salient points of the thing they call <a href="http://www.duraspace.org/">Duraspace</a> (launched, if I recall, in Atlanta)! I&#8217;m certainly hoping to find time to take my <a href="http://duracloud.org/trial_account">free Duracloud trial</a>. However, other aspects still remain opaque to me. At one panel discussion about the prospects for implementing DSpace over Fedora (or Fedora under DSpace, depending on which way up you look at it), I was surprised to hear a description of ongoing DSpace-Fedora alignment efforts as &#8220;more about the journey than the destination&#8221;. An enviable luxury: for the time being we need tangible outcomes for our repositories and customers, and that&#8217;s one reason why we&#8217;ll be sticking with EPrints for the foreseeable future.</p>
<p>Personal highlights for me are described elsewhere: the <strong><a href="http://dablog.ulcc.ac.uk/2011/06/14/open-repositories-2011-part-2-the-developer-challenge/">Developer Challenge</a></strong>, which we enjoyed immensely, and <strong><a href="http://dablog.ulcc.ac.uk/?p=1499">Changing Platforms</a></strong>, the talk that I presented with Imma Subirats, of the UN Food &amp; Agriculture Organization, where we discussed migrating between repository platforms. Rory also had a chance to meet developers from Yale, who had worked on the other end of the <a href="http://dablog.ulcc.ac.uk/2011/02/21/synergies-abound/">SOAS-Yale Islamic Manuscripts</a> collaboration, and show off some of his work for the <a href="http://digital.info.soas.ac.uk/cgi/c">SOAS repository</a>. We were also hugely appreciative of the generosity of the <a href="http://www.eprints.org/">EPrints</a> team, who kept us generally amused and amazed, and kindly included us in their group dinner on the last evening.</p>
<div id="attachment_1502" class="wp-caption alignright" style="width: 280px"><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/241367_860021243981_61013483_44532206_7367009_o.jpg"><img class="size-medium wp-image-1502 " title="Ade Stevenson on stage at the Blue Moon" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/241367_860021243981_61013483_44532206_7367009_o-300x225.jpg" alt="Ade Stevenson on stage at the Blue Moon" width="270" height="203" /></a><p class="wp-caption-text">Adrian Stevenson&#39;s got them all-night late bar open repository blues...</p></div>
<p>The facilities at UT&#8217;s AT&amp;T Conference Centre were outstanding, as was the surrounding campus generally, including the Longhorns football stadium (with its insanely massive west stand) where the conference dinner was held. Austin has far more attractions than we could see in such a short time, and it is an impressive and vibrant city, from the spectacular grandeur of the Texas state capitol, to the noisy entertainment on 6th Street, where virtually every bar has some kind of rock or blues band playing. We were most impressed by UKOLN&#8217;s Adrian Stevenson who jammed on a borrowed guitar with the blues band in the Blue Moon bar at 2am. As if that wasn&#8217;t enough, our visit also coincided with the massive Republic Of Texas biker rally &#8211; an insanely noisy procession of up to 50,000 bikers through the main streets of the city. Our ears won&#8217;t forget OR11 in a hurry.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-1/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Repositories 2011 (Part 3): Changing Platforms</title>
		<link>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-3-changing-platforms/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-3-changing-platforms/#comments</comments>
		<pubDate>Wed, 22 Jun 2011 09:38:22 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[dspace]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[eprints]]></category>
		<category><![CDATA[Open Repositories]]></category>
		<category><![CDATA[OR11]]></category>
		<category><![CDATA[repositories]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1499</guid>
		<description><![CDATA[To OR11 I took a presentation, jointly with Imma Subirats, from UN FAO in Rome, which we called Changing Platforms. The aim of the presentation was to discuss the subject of migrating repositories between different software platforms. In addition to her work at FAO, Imma is Chief Executive for the E-LIS repository, a major international [...]]]></description>
			<content:encoded><![CDATA[<p>To OR11 I took a presentation, jointly with Imma Subirats, from UN FAO in Rome, which we called <em>Changing Platforms</em>. The aim of the presentation was to discuss the subject of migrating repositories between different software platforms.</p>
<p>In addition to her work at <a href="http://www.fao.org/">FAO</a>, Imma is Chief Executive for the <a href="http://eprints.rclis.org/">E-LIS</a> repository, a major international and multi-lingual repository of articles about Library and Information Science. E-LIS has operated since 2003 on EPrints, but last year migrated to DSpace, because <a href="http://www.cilea.it/">CILEA</a> in Italy, which generously donates support and hosting, now focuses exclusively on working with DSpace. The E-LIS migration has been largely successful; however, a number of EPrints features on which the E-LIS editors and users depended have been difficult to replicate in DSpace, or have had to be put on ice. This is no reflection on the specialists at CILEA, but is perhaps indicative of more profound differences between EPrints and DSpace that aren&#8217;t always reflected in the usual comparisons of repository platforms, such as the otherwise informative <a href="http://www.rsp.ac.uk/start/software-survey/results-2010/">JISC RSP Repository Software survey</a>.</p>
<p>ULCC of course has just completed a repository migration from DSpace to EPrints for the School of Advanced Study. Our motivation was in many respects the same as that of CILEA &#8211; our expertise lies firmly in the EPrints camp. But I think the outcomes for our end-user community are more demonstrably positive: in fact I don&#8217;t think there&#8217;s a single feature of the new SAS-Space-on-EPrints that isn&#8217;t a major improvement over its previous incarnation.</p>
<p>Migration of metadata and data (at least from DSpace to EPrints) presented few issues (that weren&#8217;t of my own making!) &#8211; export, transform, import. Here the similarities between the models of the two platforms were extremely valuable. But we did encounter other significant differences, some of which are set out in more detail below.</p>
<div id="attachment_1510" class="wp-caption aligncenter" style="width: 562px"><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/241364_859817033221_61013483_44527715_7459153_o.jpg"><img class="size-large wp-image-1510" title="Richard presenting Changing Platforms at OR11" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/241364_859817033221_61013483_44527715_7459153_o-1024x288.jpg" alt="Richard presenting Changing Platforms at OR11" width="552" height="155" /></a><p class="wp-caption-text">Richard presenting Changing Platforms at OR11</p></div>
<p><strong id="eprints"><span id="more-1499"></span>Issues in EPrints</strong></p>
<p>Perhaps the most significant issue we encountered with re-implementing SAS-Space on EPrints was the absence of built-in support for Handle persistent identifiers. Handle support comes out-of-the-box with DSpace, but not with EPrints, so the choice we faced was between re-implementing Handle support, or dropping it. We chose the latter, since the benefits of Handles to a relatively small IR like SAS-Space were not obvious, and so it was hard to justify the extra cost and effort. By <a href="http://www.slideshare.net/bezbozhnik/changing-platforms/20">ensuring that items kept the same ID</a> when migrated from DSpace to EPrints, and implementing a simple rewrite rule, we have ensured that Handle URIs created while DSpace was operational continue to point to the same item &#8211; but for items added since EPrints went live, no new Handle URIs are coined.</p>
<p>(Shortly after we returned from OR11, an extended discussion broke out on Twitter, amongst several well-respected gentlemen in our field, about the benefits of using Handles. A considerable amount of scepticism was expressed about their usefulness.)</p>
<p><strong>Issues in DSpace</strong></p>
<p>Imma described some workflow issues encountered with the new implementation of her repository. The E-LIS team is accustomed to a very flexible EPrints-based workflow that allows items to have their workflow status changed quite freely. DSpace, by contrast, has a unidirectional workflow model, so that items cannot (for example) be reverted from Live to Pending, if some kind of error is spotted, but effectively need to be deleted and resubmitted. This is obviously a significant divergence between the superficially similar repository platforms.</p>
<p>Another example Imma gave of a perplexing feature of the default DSpace UI is the button on each abstract page that says &#8220;View Full Item Record&#8221;. It leads to a rather intimidating web page displaying the item metadata as Qualified Dublin Core. It&#8217;s not a very attractive display, nor is it actually a &#8220;data&#8221; rendering of the metadata (as you would get by explicitly choosing to Export As XML, or from some new-fangled Linked Data features). It&#8217;s not clear why this view would be of interest to general users of the repository: why is it there?</p>
<p>At OR11 I talked to several people working with DSpace, and all agreed that there&#8217;s room for improvement in the default Web UI. In some cases they have completely reimplemented the web templates. It&#8217;s also worth noting that the page layout in the default JSP UI is entirely implemented using HTML tables, and doesn&#8217;t pass W3C validation. For a Web application that&#8217;s nearly 10 years old, this is disappointing. (The alternative Manakin XML UI implements an attractive vision of UI abstraction using XSLT, but reports suggest that configuring/maintaining it is not for the faint-hearted.)</p>
<p>Quite a few Web design infelicities are perpetrated in the default Community, Collection and Abstract page templates. (During the conference, many of us enjoyed and applauded Simeon Warner&#8217;s timely rant, &#8220;Don&#8217;t <strong>bold</strong> the field name&#8221;.) Of course we can change them &#8211; it&#8217;s Open Source, isn&#8217;t it? &#8211; but is it unreasonable to expect default Web templates that are at least potentially usable as is? Of course the natural and reasonable response of the DSpace community is to ask that we report the issue as a bug or feature-request to the development team. Or fix it ourselves and share the fix. But where an absent feature is really important to a user (by which I probably mean a repository manager), then the choice faced is between &#8220;getting by&#8221; until it&#8217;s implemented in the core distribution, or doing it themselves (which probably means hiring a specialist developer to implement it for them).</p>
<object type='application/x-shockwave-flash' wmode='opaque' data='http://static.slideshare.net/swf/ssplayer2.swf?id=8305736&doc=or11changingplatformsslides-110614112334-phpapp01' width='630' height='516'><param name='movie' value='http://static.slideshare.net/swf/ssplayer2.swf?id=8305736&doc=or11changingplatformsslides-110614112334-phpapp01' /><param name='allowFullScreen' value='true' /></object>
<p><strong>Out-of-the-box</strong></p>
<p>At the <a href="http://www.aepic.it/conf/DSUG2007/viewabstract8587.html?id=331&amp;cf=11">DSpace User Group meeting</a> in 2007, I described how we considered that, back in 2005, DSpace offered a better &#8220;out-of-the-box&#8221; experience than EPrints. I never thought it was anything to write home about &#8211; in fact I remember being disappointed by the very UI issues I&#8217;ve described above &#8211; but to my untrained eye it did seem better than EPrints, at the time. But, as I&#8217;ve <a href="http://dablog.ulcc.ac.uk/2009/12/21/our-new-eprints-repository-is-not-just-for-christmas/">mentioned elsewhere</a>, EPrints has improved remarkably since.</p>
<p>Of course a lot of people we admire have proved that you can create impressive repository systems using DSpace. It performs and provides a lot of essential repository functionality. Its Lucene search engine is certainly better than anything EPrints currently offers. But I&#8217;m still surprised how much more work seems to be necessary to make a DSpace installation as readily useful and usable as EPrints, and this seems to represent considerable additional cost in setting up DSpace.</p>
<p>I&#8217;ve heard it sometimes argued &#8211; in both EPrints and DSpace camps &#8211; that Repository setup shouldn&#8217;t <em>be</em> too easy, lest repository managers get in a mess and endanger the integrity of their system. In my opinion, as developers and solution providers, our job is to provide as many features and tools as possible to enable Repository Managers to manage their collections effectively and easily &#8211; not to act as gatekeepers to their systems and data.</p>
<p>By way of contrast, we have recently supported the Institute of Education (IOE) in setting up an EPrints repository of UK government publications, and we were pleased to see the repository manager called on us very little, other than to answer some questions and apply a few small configuration changes. The experience with SAS-Space has also confirmed to me that EPrints now has strong out-of-the-box appeal, and a rich set of features available through the Web UI, that enable a reasonably confident repository manager to get to work without needing to initiate a major technical project.</p>
<p>In the current climate of straitened library budgets, this could make a considerable difference to the viability of a repository startup project. For a growing number of libraries and information services &#8211; not least at smaller research institutions, or in developing countries &#8211; that could be the difference between having a repository, or not.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/06/22/open-repositories-2011-part-3-changing-platforms/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Open Repositories 2011 (Part 2): The Developer Challenge</title>
		<link>http://dablog.ulcc.ac.uk/2011/06/14/open-repositories-2011-part-2-the-developer-challenge/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/06/14/open-repositories-2011-part-2-the-developer-challenge/#comments</comments>
		<pubDate>Tue, 14 Jun 2011 10:50:06 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[challenge]]></category>
		<category><![CDATA[Challenges]]></category>
		<category><![CDATA[DevCSI]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[MERLIN]]></category>
		<category><![CDATA[OR11]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[SHERPA-LEAP]]></category>
		<category><![CDATA[University of London]]></category>
		<category><![CDATA[University of Texas]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1463</guid>
		<description><![CDATA[An event that asked developers to demonstrate the Future of Repositories can only be considered a great success when it receives entries that include: Multiple real-time examples of using &#8220;Repositories As A Service (RaaS)&#8221;, not only exchanging data but also sharing sophisticated functionality between EPrints and DSpace &#8211; and even including an Android application A [...]]]></description>
			<content:encoded><![CDATA[<div id="attachment_1465" class="wp-caption alignright" style="width: 310px"><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/4d849b5ea9d545a146cb5119f3cb07af08202f38_wmeg_00001.jpg"><img class="size-medium wp-image-1465 " title="OR11 Developer Challenge" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/4d849b5ea9d545a146cb5119f3cb07af08202f38_wmeg_00001-300x224.jpg" alt="" width="300" height="224" /></a><p class="wp-caption-text">Excitement at the OR11 Developer Challenge Show-and-Tell (Photo by @sparrowbarley)</p></div>
<p>An event that asked developers to demonstrate the Future of Repositories can only be considered a great success when it receives entries that include:</p>
<ul>
<li>Multiple real-time examples of using &#8220;Repositories As A Service (RaaS)&#8221;, not only exchanging data but also sharing sophisticated functionality between EPrints and DSpace &#8211; and even including an Android application</li>
<li>A tool for bundling and depositing a whole raft of research related outputs from the Web via RDF</li>
<li>A tactile repository search interface with dynamic search suggestions, specifically designed for tablets and smartphones</li>
<li>A complete gesture and voice-driven system for depositing and searching in repositories</li>
</ul>
<p>All these &#8211; and other great entries too &#8211; were achieved in a couple of days&#8217; work during the course of the conference, for the annual OR Developer Challenge, and presented at a packed Show-and-Tell session on Thursday afternoon (true, there was free beer).</p>
<p>Stuart Lewis&#8217;s team were worthy winners with their RaaS project, particularly as they showed a genuine commitment to a cross-platform approach &#8211; something which, sensibly, backgrounds the individual software platforms, that often receive too much attention, and focuses on the Repository as an application and entity in its own right.</p>
<p>We were also really pleased to see a prize go to Patrick McSweeney and Matt Taylor. And we enjoyed seeing Dave Tarrant stealing the show (again) with his live demonstration of using a Microsoft Xbox <a href="http://www.xbox.com/en-US/Kinect/GetStarted">Kinect</a> to submit items to a repository.</p>
<p><a href="http://is.gd/texasslides">Our own entry</a> may not have won, but several people liked it, and you may see more of it in future. For the second year running, the Developer Challenge was a great opportunity for Rory and me to concentrate on an idea that we&#8217;ve been kicking around, without having found a home for it in existing work (yet). This was also true of the Semantic Metadata popup tools with which we <a href="http://devcsi.ukoln.ac.uk/blog/2010/07/13/we-have-a-winner-developer-challenge-at-open-repositories-2010-madrid/" target="_blank">won the challenge</a> last year.</p>
<p><span id="more-1463"></span>This year we set about achieving our longstanding desire to take the very tactile and dynamic <a href="http://lasso.ucl.ac.uk/merlin-ui/" target="_blank">search interface that Rory created for the MERLIN project</a>, and turn it into a touchscreen app, for smartphones and tablets. The results were pretty effective.</p>
<div id="attachment_1464" class="wp-caption alignleft" style="width: 190px"><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/texas-app-screenshot1.png"><img class="size-medium wp-image-1464" title="MERLIN Mobile interface (&quot;TEXAS&quot;)" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/06/texas-app-screenshot1-180x300.png" alt="" width="180" height="300" /></a><p class="wp-caption-text">The MERLIN Mobile interface (&quot;TEXAS&quot;) as demonstrated at OR11</p></div>
<p>The MERLIN interface on LASSO is quite complex, but at the heart of it is the tag cloud of related terms that Termine text-mining suggests. This always looked like it might be good on a touchscreen, so we stripped it down, rearranged and tweaked it to make it viable on an iPad screen. If you&#8217;ve got an iPad, you can try it out by pointing your Safari browser at <a href="http://is.gd/texasweb" target="_blank">http://is.gd/texasweb</a>. (It will work on desktop browsers too, but it looks best on a portrait oriented screen.)</p>
<p>If you&#8217;ve got an Android device, you can even download an app-based version of it from <a href="http://is.gd/texasapp" target="_blank">http://is.gd/texasapp</a>. (It&#8217;s a bit cramped in a smartphone display, but still essentially working.)</p>
<p>If you&#8217;ve got neither, we created <a href="http://is.gd/texasdemo" target="_blank">this page</a> to give you a rough idea what it looks like.</p>
<p>It&#8217;s worth mentioning that this is a <em>real, live, working application</em>: enter a search term (&#8220;logical positivism&#8221;, &#8220;climate change&#8221;, &#8220;Jeremy Bentham&#8221;, &#8230;) and it searches over the full-text corpus of all articles in University of London Open Access repositories (<a href="http://www.sherpa-leap.ac.uk">SHERPA-LEAP</a> consortium members), and makes suggestions about additional or alternative search terms, based on the results of the text-mining analysis of the articles.</p>
<p>The hackathon approach of working closely together to create something quickly worked well again: Rory hacks, and I test and review each time he hits &#8216;Save&#8217;! All very agile and iterative.</p>
<p>In honor of our hosts in Austin, we decided to call the new interface <em>Touchscreen Enhanced Cross-Search with Augmented Serendipity</em> &#8211; or TEXAS for short.</p>
<p>Kudos to Mahendra Mahey, <a href="http://ptsefton.com/" target="_blank">Peter Sefton</a> and the <a href="http://devcsi.ukoln.ac.uk/" target="_blank">DevCSI</a> project for putting (and keeping) it all together, and to the awesome panel of judges (even though they didn&#8217;t pick us)!</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/06/14/open-repositories-2011-part-2-the-developer-challenge/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Statistically relevant</title>
		<link>http://dablog.ulcc.ac.uk/2011/04/27/statistically-relevant/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/04/27/statistically-relevant/#comments</comments>
		<pubDate>Wed, 27 Apr 2011 11:47:28 +0000</pubDate>
		<dc:creator>Rory McNicholl</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[Ars Technica]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[eprints]]></category>
		<category><![CDATA[IRStats]]></category>
		<category><![CDATA[linked data]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[SHERPA-LEAP]]></category>

		<guid isPermaLink="false">http://www.sherpa-leap.ac.uk/?p=277</guid>
		<description><![CDATA[Over the last year or so we&#8217;ve installed and configured (in some cases reconfigured) the IRStats package for several of the LEAP repositories, including those hosted by ULCC. It seemed a good moment to share a few thoughts about the process of getting &#8220;all statted up&#8221; with EPrints. By default, and without any further action, [...] ]]></description>
			<content:encoded><![CDATA[<p><i>From the <a href="http://www.sherpa-leap.ac.uk/">SHERPA-LEAP</a> blog.</i></p>
<p>Over the last year or so we&#8217;ve installed and configured (in some cases reconfigured) the IRStats package for several of the LEAP repositories, including those hosted by ULCC. It seemed a good moment to share a few thoughts about the process of getting &#8220;all statted up&#8221; with EPrints.</p>
<p>By default, and without any further action, IRStats provides a kind of smorgasbord control panel, demonstrating the many optional graphs, charts and lists available. You can see <a href="http://pubs.ulcc.ac.uk/cgi/irstats.cgi">an example</a> on our own ULCC Publications repository.</p>
<p>More recently we&#8217;ve seen growing demand among repository managers to share data on downloads with both their depositors and users at large. It&#8217;s really important for repository managers to select carefully which statistics views they actually want or need to display &#8211; we can only suggest things we think might work. Once you&#8217;ve decided on the views you want, we can look at the most effective ways to display them: and this is why I&#8217;ve been having fun souping up some of the displays already offered by IRStats.</p>
<p>The first display we&#8217;ve been working on is the Statistics digest. These are common enough, and we&#8217;ve used the example of the <a href="http://discovery.ucl.ac.uk/past-statistics.html">UCL Discovery</a> repository as the basis of work for both SAS-Space and the SOAS institutional repository.</p>
<p>The second approach has been to re-style the IRStats &#8220;dashboard&#8221; view to lay the graphs on top of each other and then use some JavaScript to handle the tabbed navigation. This seemed a more elegant approach than inserting lots of charts in the abstract page itself (as, for example, at <a href="http://eprints.ecs.soton.ac.uk/18493/">ECS EPrints</a>). I&#8217;ve used this display technique to display statistics for individual eprints for the School of Pharmacy, as well as SAS and SOAS.</p>
<p><a href="http://www.sherpa-leap.ac.uk/wp-content/uploads/2011/04/irstats-pharmacy-1.png"><img class="aligncenter size-medium wp-image-278" src="http://www.sherpa-leap.ac.uk/wp-content/uploads/2011/04/irstats-pharmacy-1-300x212.png" alt="IRStats on School of Pharmacy EPrints" width="300" height="212" /></a></p>
<p>The tabbed display of graphs and tables was also combined with a &#8216;modal box&#8217; display that keeps the height of the page the same (for example on <a href="http://eprints.soas.ac.uk/3316/">this Abstract page</a> at SOAS). At the bottom of the Abstract page I&#8217;ve added a statistics section showing the number of full-text downloads, and a link that displays detailed stats in an overlaid box.</p>
<p>This method doesn&#8217;t just work for individual items, but can be used on other datasets too. For example, on SAS-Space we have added it to the bottom of their <a href="http://sas-space.sas.ac.uk/view/collections/ialsac.html">Collection browse pages</a>, so that at the bottom of each Collection view there is an opportunity to view download statistics for that collection as a whole.</p>
<p>Additionally in SAS-Space, since it is a repository for a number of discrete institutes, there was a requirement for institutional editors to have access to their own institute&#8217;s statistics. To achieve this, I allowed access to a constrained version of the IRStats control panel for editor-users who had the appropriate editorial permissions for the institute in question. (Unless you are a SAS-Space editor, you won&#8217;t be able to access this.)</p>
<p>Which statistics views to insert as tabs is the decision of the repository manager. Views we&#8217;ve used include:</p>
<ul>
<li>Monthly downloads</li>
<li>Daily downloads</li>
<li>Unique visitors</li>
<li>Referrers</li>
<li>Search Engines</li>
<li>Top 10 items downloaded (only for a Collection, Repository or Division)</li>
<li>Top 10 search terms</li>
</ul>
<p>From a technical point-of-view, we will have to review these configurations when we upgrade to EPrints version 3.3, possibly later in the year (if it&#8217;s released!!), in conjunction with our VM infrastructure migration, and start doing things with EPStats rather than IRStats. But we now have an effective framework for adding statistics quickly to any EPrints installation.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/04/27/statistically-relevant/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SHERPA-LEAP: Handy Hints: MIME-Types</title>
		<link>http://dablog.ulcc.ac.uk/2011/03/11/handy-hints-mime-types/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/03/11/handy-hints-mime-types/#comments</comments>
		<pubDate>Fri, 11 Mar 2011 14:21:40 +0000</pubDate>
		<dc:creator>Rory McNicholl</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[Ars Technica]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[SHERPA-LEAP]]></category>

		<guid isPermaLink="false">http://www.sherpa-leap.ac.uk/?p=254</guid>
		<description><![CDATA[Some repositories have reported issues with Microsoft &#8220;DOCX&#8221; files, which IE8 in particular may treat as a ZIP file. This is a potential problem with all the current slew of MS file types. The solution is to add the following entries to your web server configuration. Extension MIME Type .xlsx application/vnd.openxmlformats-officedocument.spreadsheetml.sheet .xltx application/vnd.openxmlformats-officedocument.spreadsheetml.template .potx application/vnd.openxmlformats-officedocument.presentationml.template [...] ]]></description>
			<content:encoded><![CDATA[<p><i>From the <a href="http://www.sherpa-leap.ac.uk/">SHERPA-LEAP</a> blog.</i></p>
<p>Some repositories have reported issues with Microsoft &#8220;DOCX&#8221; files, which IE8 in particular may treat as a ZIP file. This is a potential problem with all the current slew of MS file types. The solution is to add the following entries to your web server configuration.</p>
<table border="1" cellspacing="0">
<tbody>
<tr>
<td valign="top"><strong>Extension</strong></td>
<td valign="top"><strong>MIME Type</strong></td>
</tr>
<tr>
<td valign="top">.xlsx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.spreadsheetml.sheet</td>
</tr>
<tr>
<td valign="top">.xltx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.spreadsheetml.template</td>
</tr>
<tr>
<td valign="top">.potx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.presentationml.template</td>
</tr>
<tr>
<td valign="top">.ppsx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.presentationml.slideshow</td>
</tr>
<tr>
<td valign="top">.pptx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.presentationml.presentation</td>
</tr>
<tr>
<td valign="top">.sldx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.presentationml.slide</td>
</tr>
<tr>
<td valign="top">.docx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.wordprocessingml.document</td>
</tr>
<tr>
<td valign="top">.dotx</td>
<td valign="top">application/vnd.openxmlformats-officedocument.wordprocessingml.template</td>
</tr>
<tr>
<td valign="top">.xlam</td>
<td valign="top">application/vnd.ms-excel.addin.macroEnabled.12</td>
</tr>
<tr>
<td valign="top">.xlsb</td>
<td valign="top">application/vnd.ms-excel.sheet.binary.macroEnabled.12</td>
</tr>
</tbody>
</table>
<p>Exactly how you (or more likely your system manager) achieve this depends on your Web platform (e.g. Apache, Tomcat, IIS) but whoever runs it should be able to make the necessary changes, and once the Web server is restarted, the new types should be picked up. (We&#8217;ve just done this for the ULCC-hosted repositories.)</p>
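<p>On Apache, for instance, the table above maps onto <code>AddType</code> directives from mod_mime. The following is a minimal sketch rather than the exact configuration we used: it assumes the lines go in the main server configuration (or in an .htaccess file where overrides are allowed), and the remaining rows of the table follow the same pattern.</p>
<pre><code># Map Office Open XML extensions to their MIME types (Apache mod_mime)
AddType application/vnd.openxmlformats-officedocument.wordprocessingml.document .docx
AddType application/vnd.openxmlformats-officedocument.spreadsheetml.sheet .xlsx
AddType application/vnd.openxmlformats-officedocument.presentationml.presentation .pptx
</code></pre>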
<p>&#8220;<a href="http://en.wikipedia.org/wiki/Internet_media_type">MIME-Types</a>&#8221; have a long and chequered history as a way of identifying file types to internet applications. To some extent IE8 is correct to infer (in the absence of better information from the Web server) that .docx files are ZIP files, because MS Office Open XML files are packaged as ZIP archives. But in general what one really wants the browser to do is pass the file to an Office application, not WinZip.</p>
<p>Ironically, it seems other browsers do correctly infer MS OOXML file types.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/03/11/handy-hints-mime-types/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Synergies abound</title>
		<link>http://dablog.ulcc.ac.uk/2011/02/21/synergies-abound/</link>
		<comments>http://dablog.ulcc.ac.uk/2011/02/21/synergies-abound/#comments</comments>
		<pubDate>Mon, 21 Feb 2011 16:55:16 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[AIDA]]></category>
		<category><![CDATA[digitisation]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[eprints]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[LEAP]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[SOAS]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1206</guid>
		<description><![CDATA[Some days it all seems worthwhile, and last Friday was such a day. I spent most of it at SOAS listening to accounts of the many digitisation projects of the Centre for Digital Africa, Asia and the Middle East (CeDAAME), including the Fürer-Haimendorf photographic collection, Islamic manuscripts (in partnership with Yale) and other justly named [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/02/FxCam_1298304702885.jpg"><img class="alignright size-medium wp-image-4398" title="Yale/SOAS Islamic Manuscripts Gallery (postcard)" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2011/02/FxCam_1298304702885-300x234.jpg" alt="Yale/SOAS Islamic Manuscripts Gallery (postcard)" width="300" height="234" /></a> Some days it all seems worthwhile, and last Friday was such a day. I spent most of it at SOAS listening to accounts of the many digitisation projects of the Centre for Digital Africa, Asia and the Middle East (<a href="http://www.soas.ac.uk/cedaame/">CeDAAME</a>), including the Fürer-Haimendorf photographic collection, Islamic manuscripts (in partnership with Yale) and other justly named &#8220;Treasures of SOAS&#8221;. What Malcolm, Stuart, Julie and the rest of the SOAS team have achieved is extremely impressive. And of course I was also there to admire the fantastic work Rory has done making an <a href="http://digital.info.soas.ac.uk/cgi/c">attractive and accessible online showcase</a> for them out of EPrints. (There are some rough edges still to polish, but by-Friday was a tough deadline! <img src='http://dablog.ulcc.ac.uk/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' />  )</p>
<p>Friday&#8217;s CeDAAME dissemination event was also an opportunity to be reminded that ULCC&#8217;s Digital Archives team has contributed in other ways to the success of SOAS&#8217;s team, directly and indirectly. Julie Makinson described how SOAS used the <a href="http://aida.jiscinvolve.org/wp/">AIDA digital asset assessment toolkit</a> in developing their strategic approach; and many of the SOAS team are alumni of the <a title="Digital Preservation Training Programme" href="http://www.dptp.org/">DPTP</a>: so Ed and Patricia have also had their part to play in supporting SOAS&#8217;s digitisation efforts.</p>
<p>The presentations at SOAS were extremely interesting, describing the full range of activities of a multi-faceted digitisation programme, from the development of the strategy (using the aforementioned AIDA) to the many challenges of digitising Islamic manuscripts and related materials.</p>
<p>How, for example, do you reliably OCR pages of centuries-old text with mixtures of Arabic and Latin/English/French? The answer is that sometimes rekeying is unavoidable. We learned, too, that Yale used UKOLN&#8217;s <a href="http://www.ukoln.ac.uk/metadata/dcdot/">DC Dot</a> Dublin Core editor to create their metadata for Islamic collections (and then converted it to TEI). Thanks to the native DC and Unicode support in EPrints, SOAS metadata (in English and Arabic) was created and managed directly in the repository. Metadata exchange between Yale&#8217;s Fedora-based system and SOAS&#8217;s EPrints system seems to have been achieved effectively &#8211; I know Rory worked closely with SOAS and Yale on this.</p>
<p>And I sensed genuine excitement in the room when the page-turning interfaces for viewing the books online were unveiled: both very impressive. (For SOAS Rory has been working long and hard on adapting the open-source book viewer used by the Internet Archive, and ensuring that the right-to-left reading and page-turning functionality works smoothly.) We also learned about a variety of different approaches to the issues of managing and funding digitisation and cataloguing activities: with my work on the MediaWiki-based <a href="http://www.ucl.ac.uk/transcribe-bentham/">Transcribe Bentham</a> project in mind, it was particularly interesting to hear about the University of Michigan&#8217;s <a href="http://www.lib.umich.edu/special-collections-library/clir-islamic-manuscripts-project">Collaborative Cataloguing</a> initiative.</p>
<p>All in all an exciting day, and particularly satisfying to see close-up the kind of synergies that exist across all of the activities of ULCC&#8217;s Digital Archives and Repositories Team. In addition to further enhancing the SOAS Digital Archives system, we are also looking forward to working with them on their JISC-funded <a href="http://digitisation.jiscinvolve.org/wp/2011/02/">Engaging Overseas Communities</a> project, which is going to involve hooking EPrints up to mobile phones in Africa and Asia.</p>
<p>As if that wasn&#8217;t enough, at lunchtime I also dashed over to the School of Pharmacy, where Jean, Neroli and Michelle had kindly organised a lunchtime meeting for the University of London repository managers in the LEAP consortium. It was an opportunity for me to unveil a preview of the new SHERPA-LEAP website (with added social networking goodness, courtesy of WordPress/BuddyPress) that we expect to launch very shortly.</p>
<p>It was a nice way to round off a week in which the Team also achieved significant milestones in our work on preservation metadata for the <a href="http://www.parliament.uk/business/publications/parliamentary-archives/">Parliamentary Archives</a> and strategic development for <a href="http://www.londonmet.ac.uk/thewomenslibrary/">The Women&#8217;s Library</a>, began planning for the next <a title="Digital Preservation Training Programme" href="http://www.dptp.org/">DPTP</a> course, and we received news that the FP7 <a href="http://blogforever.eu/">BlogForever project</a>, which will see us collaborating with Warwick, HATII, CERN and others until 2013, has received its final sign-off from the European Commission.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2011/02/21/synergies-abound/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Doing It Differently In Sheffield Cathedral!</title>
		<link>http://dablog.ulcc.ac.uk/2010/11/04/doing-it-differently-in-sheffield-cathedral/</link>
		<comments>http://dablog.ulcc.ac.uk/2010/11/04/doing-it-differently-in-sheffield-cathedral/#comments</comments>
		<pubDate>Thu, 04 Nov 2010 09:30:23 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[eprints]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[JISC]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[RSP]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1104</guid>
		<description><![CDATA[It was great to take part in last week&#8217;s Repositories Support Project event at Sheffield Cathedral. The theme of the day, organised by Jackie Wickham and the RSP team, was &#8220;Doing It Differently&#8221; and it covered a wide range of repository-related themes. I took along an updated and expanded version of the presentation I made [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignright size-medium wp-image-1105" title="183191782" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2010/11/183191782-225x300.jpg" alt="183191782" width="225" height="300" /></p>
<p>It was great to take part in last week&#8217;s Repositories Support Project event at Sheffield Cathedral. The theme of the day, organised by Jackie Wickham and the RSP team, was <a href="http://www.rsp.ac.uk/events/index.php?page=DID1010/index.php">&#8220;Doing It Differently&#8221;</a> and it covered a wide range of repository-related themes. I took along an updated and expanded version of the presentation I made to SHERPA-LEAP repository managers. I covered the same topics, but in preparing the presentation, I was amazed how many more things there were to talk about a year on.</p>
<p>Stephanie Taylor gave an excellent overview of the repository scene, and I hope I followed it up with useful ideas about making repositories more user-friendly or just generally useful to users. Other talks went off into less well trodden areas, though no less interesting: Pat Lockley impressed again with his enthusiastic description of Xpert; Joss Winn described his further adventures in WordPress land; and Stephanie Meece described the challenges of non-textual repositories at UAL. My ears pricked up when Jason Hoyt of Mendeley mentioned that an imminent upgrade to Mendeley will be able to identify OA sources for papers, which might signal it&#8217;s time for me to finally catch up with Mendeley (dissertation starts next year!). I didn&#8217;t catch the final speakers as I had to catch my train, but I commend to you Vicki McGarvey&#8217;s <a href="http://www.ntushare.org/2010/11/rsp-event-doing-it-differently/">post on the SHARE project blog</a> at Nottingham Trent University.</p>
<p>I tried to keep things simple by steering clear of all the complicated issues in repository management &#8211; OA, OAI-PMH, copyright, advocacy, REF, RIM, etc &#8211; and just focus on simple UI enhancements that might improve a user&#8217;s experience of the repository, and effective use of features like RSS feeds and statistics, with examples from all over the world of institutional and specialist repositories. Which features a repository manager might choose, if any, is up to them and their own circumstances, but my aim was to ensure they are at least aware of what&#8217;s possible &#8211; as evidenced by what&#8217;s been done in many repositories around the country.</p>
<div style="width:425px" id="__ss_5583694"><strong style="display:block;margin:12px 0 4px"><a href="http://www.slideshare.net/bezbozhnik/beyond-sneep-ideas-for-creative-repository-management" title="Beyond SNEEP: Ideas for Creative Repository Management">Beyond SNEEP: Ideas for Creative Repository Management</a></strong><object id="__sse5583694" width="425" height="355"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rsp-did-20101027-davis-101027104922-phpapp02&#038;stripped_title=beyond-sneep-ideas-for-creative-repository-management&#038;userName=bezbozhnik" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed name="__sse5583694" src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rsp-did-20101027-davis-101027104922-phpapp02&#038;stripped_title=beyond-sneep-ideas-for-creative-repository-management&#038;userName=bezbozhnik" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"></embed></object>
<div style="padding:5px 0 12px">View more <a href="http://www.slideshare.net/">presentations</a> from <a href="http://www.slideshare.net/bezbozhnik">Richard Davis</a>.</div>
</div>
<p>Although I focused on EPrints installations, I think nearly everything I demonstrated ought to be feasible in other platforms. Overloading an abstract page with features like &#8220;Share this on Facebook/Twitter&#8221;, QR Codes, or metadata export in RSS/JSON/CSV and more should be a very easy way to enhance the user experience of repositories. As I suggested, adding buttons to support &#8220;the latest thing&#8221; that users may be finding useful is generally not difficult. A &#8220;Send This Paper To My Kindle&#8221; button, for example, seems so trivial I might even try it myself.</p>
<p>I had a long list of ideas/examples to show: for anyone who didn&#8217;t have time to copy down the small print, they were:</p>
<ul>
<li><span><a href="http://wiki.eprints.org/w/Sneep">SNEEP </a></span></li>
<li><span><a href="http://eprints.lincoln.ac.uk/">Lincoln EPrints </a></span></li>
<li><span><a href="http://languagebox.ac.uk/">Language Box </a></span></li>
<li><span><a href="http://wiki.eprints.org/w/MePrints">MePrints </a></span></li>
<li><span><a href="http://humbox.ac.uk/">Hum Box </a></span></li>
<li><span><a href="http://pubs.ulcc.ac.uk/">ULCC Publications Archive</a></span></li>
<li><span><a href="http://eprints.ucl.ac.uk/">UCL EPrints </a></span></li>
<li><span><a href="http://wiki.eprints.org/w/IRStats">IRStats </a></span></li>
<li><span><a href="http://goo.gl/Bp1N">Repository Stats using Google Analytics (presentation by Graham Triggs at OR10) </a></span></li>
<li><span><a href="http://e-space.openrepository.com/">E-Space at Manchester Metropolitan University </a></span></li>
<li><span><a href="http://code.google.com/p/flism/">Framework for Linking Inline Semantic Metadata </a></span></li>
<li><span><a href="http://ora.ouls.ox.ac.uk/">Oxford University Research Archive </a></span></li>
<li><span><a href="http://lasso.ucl.ac.uk/merlin-ui/">MERLIN </a></span></li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2010/11/04/doing-it-differently-in-sheffield-cathedral/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Open Repositories 2010 in Madrid</title>
		<link>http://dablog.ulcc.ac.uk/2010/07/09/open-repositories-2010-in-madrid/</link>
		<comments>http://dablog.ulcc.ac.uk/2010/07/09/open-repositories-2010-in-madrid/#comments</comments>
		<pubDate>Fri, 09 Jul 2010 09:07:23 +0000</pubDate>
		<dc:creator>Richard M. Davis</dc:creator>
				<category><![CDATA[Repositories]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[News]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=1030</guid>
		<description><![CDATA[Rory and I have spent this week melting in the summer heat of Madrid at Open Repositories 2010. This is the third OR event I&#8217;ve attended (see my reports on OR09 and OR08) and the first that Rory has been able to make. As usual, a great opportunity to find out what&#8217;s going on in [...]]]></description>
			<content:encoded><![CDATA[<p>Rory and I have spent this week melting in the summer heat of Madrid at Open Repositories 2010. This is the third OR event I&#8217;ve attended (see my reports on <a href="/2009/06/10/open-repositories-2009/">OR09</a> and <a href="http://dablog.ulcc.ac.uk/2008/04/02/open-repositories-2008-in-southampton/">OR08</a>) and the first that Rory has been able to make. As usual, a great opportunity to find out what&#8217;s going on in the world of repositories, for both developers and repository managers, and catch up with friends and colleagues working in the field.</p>
<p><a href="http://pubs.ulcc.ac.uk/124/1/OR10_poster.pdf"><img class="alignright" title="ULCC/SOAS Poster for OR10" src="http://farm5.static.flickr.com/4102/4745515454_b0e251517b.jpg" alt="" width="210" height="290" /></a>We&#8217;ve brought along a <a href="http://pubs.ulcc.ac.uk/124/">smashing poster</a> of our work for SOAS, which attracted a considerable amount of interest for its distinctive and attractive Web user interface.</p>
<p>More in due course on the conference themes we found most interesting.</p>
<p>But we also decided it was about time we entered the annual Repository Developer Challenge. And at the conference dinner on Wednesday we learned that our entry was awarded first prize. La Roja weren&#8217;t the only ones with something to celebrate; and we&#8217;re very pleased to have our names on the same (virtual) cup as such giants of the reposphere as Tim Donohue, David Tarrant, Ben O&#8217;Steen and Tim Brody!</p>
<p>The challenge set by the JISC-funded <a href="http://devcsi.ukoln.ac.uk/">DevCSI</a> project, managed by UKOLN, was to &#8220;Create a functioning repository user-interface, presenting a single metadata record which includes as many automatically created, useful links to related external content as possible.&#8221;</p>
<p>Our idea began by simply imagining a typical repository abstract page overloaded with functional, contextual popup menus, just like a Google Docs page, for example. We could add a menu to each metadata element, and populate the menu with links pertinent to the element.</p>
<p>In order to do that we needed:</p>
<ul>
<li>first, to be able to automatically identify the metadata values (title, author and the like) in the page;</li>
<li>second, to manage a list of the web services that are appropriate to each metadata element, so that each item&#8217;s abstract page could automatically carry links such as &#8220;Find this title at Amazon&#8221; or &#8220;Find this author at the BL&#8221;;</li>
<li>third, some clever scripts to put all this stuff together.</li>
</ul>
<p>The first task was achieved by editing the EPrints templates to ensure each metadata value was wrapped in RDFa semantic tags (spans with a &#8216;property&#8217; class) that we could identify programmatically. The semantic schemas used are formally declared in the page header.</p>
<p>The second aim was achieved by identifying appropriate target services, and the exact URL required to activate them. As a simple example, we can easily create a link to search Google for VALUE using the URL http://www.google.com/#q=VALUE. By way of a data source for these services, I created a <a href="http://spreadsheets0.google.com/ccc?key=t_gMe1nKEG2tKUQ5OhTplPg&amp;hl=en_GB#gid=2">spreadsheet</a> in which each row linked a metadata element (e.g. dc:title) to a service (e.g. Google).</p>
<p>The third part was the tricky bit that Rory deftly dispatched. He created scripts that scour the Abstract page looking for RDFa tags, then look them up in the services data table, and dynamically create links as appropriate.</p>
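<p>The lookup step can be sketched roughly as follows. This is an illustrative reconstruction, not Rory&#8217;s actual script: the function name <code>buildLinks</code>, the <code>SERVICES</code> table and its second row are assumptions made for the example, and only the Google URL pattern comes from the description above. Finding the RDFa-tagged values in the page is left out so that the sketch runs on its own.</p>
<pre><code>// Hypothetical services table: each row links a metadata property
// to a service URL template containing the placeholder VALUE.
var SERVICES = [
  { property: "dc:title", label: "Search Google for this title", template: "http://www.google.com/#q=VALUE" },
  { property: "dc:creator", label: "Search Google for this author", template: "http://www.google.com/#q=VALUE" }
];

// Build the links for one metadata value found on the abstract page.
function buildLinks(property, value, services) {
  return services
    // keep only the rows registered for this metadata property
    .filter(function (row) { return row.property === property; })
    .map(function (row) {
      return {
        label: row.label,
        // substitute the URL-encoded value for every VALUE placeholder
        url: row.template.split("VALUE").join(encodeURIComponent(value))
      };
    });
}

// Example: one link for a title value
console.log(buildLinks("dc:title", "Species Plantarum", SERVICES)[0].url);
// prints http://www.google.com/#q=Species%20Plantarum
</code></pre>
<p>Keeping the table separate from the code is what makes it possible to add or remove a service by editing a row of data rather than a script.</p>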
<p>We chose to create an example using the test server that we maintain for the Linnean Online collection. Linnaeus&#8217;s botanical specimens make a nice change from the usual ETDR fare of most repositories. This also allowed us to demonstrate using a metadata schema from a domain other than the ubiquitous Dublin Core: life sciences have developed a schema called Darwin Core, which defines necessary metadata in that domain, such as Genus and Species. What&#8217;s more, there is a wide range of resources in the field that might usefully be linked to, such as the collections at Kew or the Natural History Museum, the International Plant Names Index and the Encyclopedia of Life.</p>
<p style="text-align: center;"><a href="http://lh5.ggpht.com/_EPSfM2OiyFs/TDZwMRBpNsI/AAAAAAAAAvo/G1bXD6YEmkg/s800/or10-dev-challenge-ss0.png"><img class="aligncenter" src="http://lh5.ggpht.com/_EPSfM2OiyFs/TDZwMRBpNsI/AAAAAAAAAvo/G1bXD6YEmkg/s800/or10-dev-challenge-ss0.png" alt="" width="377" height="213" /></a></p>
<p>The results were pretty much as we&#8217;d hoped. It&#8217;s worth noting that, while we implemented it in EPrints, the technique could be applied to any template-based repository platform, or, for that matter, virtually any web application. Once the RDFa tags and code are in place in the templates, it is only necessary to edit the table of data services in order to add or remove links. With only a bit more polish than we had time for during the conference, we think that this could be a useful addition to the toolkit of any repository developer or repository manager. We&#8217;ll keep you posted!</p>
<p>UKOLN filmed the proceedings so you can see me presenting our entry at the conference if you want.</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="400" height="225" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowfullscreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://vimeo.com/moogaloop.swf?clip_id=13172548&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" /><embed type="application/x-shockwave-flash" width="400" height="225" src="http://vimeo.com/moogaloop.swf?clip_id=13172548&amp;server=vimeo.com&amp;show_title=1&amp;show_byline=1&amp;show_portrait=0&amp;color=&amp;fullscreen=1" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><a href="http://vimeo.com/13172548">Winner of the Developer Challenge at OR10 (Madrid) &#8211; Richard Davis and Rory McNicoll</a> from <a href="http://vimeo.com/ukoln">UKOLN</a> on <a href="http://vimeo.com">Vimeo</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2010/07/09/open-repositories-2010-in-madrid/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>AIDA and repositories</title>
		<link>http://dablog.ulcc.ac.uk/2010/02/11/aida-and-repositories/</link>
		<comments>http://dablog.ulcc.ac.uk/2010/02/11/aida-and-repositories/#comments</comments>
		<pubDate>Thu, 11 Feb 2010 16:38:49 +0000</pubDate>
		<dc:creator>Edward Pinsent</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[AIDA]]></category>
		<category><![CDATA[EAD]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=891</guid>
		<description><![CDATA[The AIDA project (Assessing Institutional Digital Assets) has completed its official, funded phase, but it&#8217;s gratifying to see interest emerging in the toolkit. We possibly could have done more at ULCC to publicise and sell our work, but our ongoing partnership with the DCC on the current Research Data Management project for the JISC gives [...]]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft size-medium wp-image-905" title="aidalogo10" src="http://dablog.ulcc.ac.uk/wp-content/uploads/2010/02/aidalogo10-300x300.jpg" alt="aidalogo10" width="210" height="210" /></p>
<p>The <a href="http://aida.jiscinvolve.org/">AIDA</a> project (Assessing Institutional Digital Assets) has completed its official, funded phase, but it&#8217;s gratifying to see interest emerging in the toolkit. We possibly could have done more at ULCC to publicise and sell our work, but our ongoing partnership with the DCC on the current <a href="http://researchdata.jiscinvolve.org/">Research Data Management project</a> for the JISC gives us an opportunity to make up for that. One of the planned outcomes of the RDMP work will be an <em>integrated</em> planning tool for use by data owners or repository managers (or indeed anyone who has a digital collection to curate) that will offer the best of <a href="http://www.data-audit.eu/">DAF</a>, <a href="http://www.repositoryaudit.eu/">DRAMBORA</a>, <a href="http://www.life.ac.uk/2/">LIFE2</a> and AIDA without requiring an Institution to compile the same profile information four times over. We have already massaged the toolkit into a proof-of-concept <a href="http://aida.da.ulcc.ac.uk/wiki">online version of AIDA</a>, using MediaWiki, and this clearly signals the way forward for this kind of assessment tool.</p>
<p>I was recently invited to contribute a module about AIDA to Steve Hitchcock&#8217;s <a href="http://blogs.ecs.soton.ac.uk/keepit/">Keep-It programme</a> in Southampton &#8211; encouragingly, he is someone looking into the detail of how repositories could be used to manage digital preservation, and wanted input from as many current toolkits as he could get his hands on. My experiences of the day have <a href="http://blogs.ecs.soton.ac.uk/keepit/2010/01/28/aida-and-institutional-wobbliness/">already been blogged</a>. I thought I would add two other little incidents from the day that I found interesting.</p>
<p>The first was the repository manager whose perception was that assessment of the Institution&#8217;s workings at the highest level (for example, its technology infrastructure, business management planning process and implementation of centralised policies) was not really part of her job. So why work with AIDA at all? The main purpose of AIDA is largely to assess the Institution&#8217;s overall preparedness to do asset management, and the task of assessment can take an individual staff member (repository manager, records manager, librarian) to parts of the organisation they didn&#8217;t know about before. I try and make this sound positive when I encouragingly suggest that an AIDA assessment has to be a collaborative team effort within an organisation. But our friend at Southampton reminded me that people do have these sensitivities and that very often, merely having a repository in place at all represents a hard-won struggle.</p>
<p>The second incident relates to my AIDA exercise, where I asked teams to apply sections of the toolkit to their own organisation. The response fed back by <a href="http://nectar.northampton.ac.uk/">Miggie Pickton</a> was memorable &#8211; her team had elected to analyse three separate organisations, applying one AIDA leg (Organisation, Technology and Resource) to each. My initial feeling was that this makes a complete mockery of AIDA, subjective and unvalidated as it might be; what better way to cheat a good score than by cherry-picking the best results across three institutions? However, Miggie&#8217;s observations were in fact very useful &#8211; and the scores <em>still</em> resulted in a wobbly three-legged stool. It seems that even if they collaborated, HFE Institutions still would not be able to achieve that stability that is the foundation for good asset management.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2010/02/11/aida-and-repositories/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>A repository for pi(es)</title>
		<link>http://dablog.ulcc.ac.uk/2010/01/07/a-repository-for-pies/</link>
		<comments>http://dablog.ulcc.ac.uk/2010/01/07/a-repository-for-pies/#comments</comments>
		<pubDate>Thu, 07 Jan 2010 16:40:47 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[costs]]></category>
		<category><![CDATA[curation]]></category>
		<category><![CDATA[data curation]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[storage]]></category>
		<category><![CDATA[Technical]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=849</guid>
		<description><![CDATA[As you may have read recently, Fabrice Bellard has announced the computation of &#960; to almost 2.7 trillion decimal places using a faster algorithm that allows desktop technology to be used, rather than the supercomputers that are usually used to break this particular record. Bellard is an extremely talented programmer who has made a useful [...]]]></description>
			<content:encoded><![CDATA[<p>As you may have <a href="http://news.bbc.co.uk/1/hi/technology/8442255.stm">read</a> recently, <a href="http://bellard.org/">Fabrice Bellard</a> has announced the computation of &pi; to almost <a href="http://bellard.org/pi/pi2700e9/announce.html">2.7 trillion decimal places</a> using a faster algorithm that allows desktop technology to be used, rather than the supercomputers that are usually used to break this particular record. Bellard is an extremely talented programmer who has made a useful contribution to one area of digital preservation with his emulation and virtualisation system <a href="http://www.nongnu.org/qemu/">QEMU</a>. But it&#8217;s a <a href="http://twitter.com/lescarr/status/7472981654">comment</a> by <a href="http://users.ecs.soton.ac.uk/lac">Les Carr</a> that set me thinking about costs, research data and repositories. </p>
<p>&#8220;Would you want to put that in your repository?&#8221; asked Les. And this is a particularly extreme example where we can do some calculations to give us a fairly good answer. Scientific data centres and the researchers that <a href="http://www.flickr.com/photos/maitri/2333509032/"><img src="http://dablog.ulcc.ac.uk/wp-content/uploads/2010/01/PiPie.jpg" alt="Pi Pie - CC-BY-NC-SA by Maitri@flickr" title="PiPie" width="240" height="160" class="alignright" style="margin: 4px;" /></a> use them have been considering this question for many years, and one way of looking at it is to see if the cost of recomputation exceeds the cost of storage over a particular time period. We&#8217;re assuming here that the initial question &#8211; <a href="http://scienceblogs.com/bookoftrogool/2010/01/chris_rusbridge_settles_the_qu.php">is this worth keeping at all</a> &#8211; has been answered at least vaguely positively.</p>
<p>Let&#8217;s look first at the cost of recomputation. Fabrice says the equipment used for this task cost no more than €2000. If we assume that it has a life of 3 years, that gives us a cost per day of €1.83. I&#8217;m avoiding the usual accounting practice of allowing for inflation, or lost interest on capital, in calculating the true depreciation value of the asset &#8211; there are a number of different schemes and they all give similar results. I&#8217;ve just divided the capital cost by the number of days of use we&#8217;ll get. But computers use electricity, and that costs money as well. Let&#8217;s assume this is a power-hungry beast that draws 400W and that power costs us 13.5&cent; per kWh (which is what my domestic tariff is if we assume a euro/sterling rate of €1.10 = £1 and 5% VAT). At 9.6 kWh a day, that adds €1.30/day to the cost of running the system, for a total cost of €3.13/day.</p>
<p>Fabrice&#8217;s announcement says that it took 131 days of system time to calculate and verify his results, which gives a computational cost of €410.03 &#8211; which I&#8217;ll round to €410 since I&#8217;ve only been using 3 significant figures so far in the computations, and because there&#8217;s a good deal of hand-waving involved in many of these figures. So, we know how much it would cost to recompute this result given the software, machine and instructions. (And the computational cost is likely to decline over time in the short term.)</p>
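<p>For anyone who wants to check the arithmetic, here is a minimal sketch of the recomputation cost. The variable and function names are my own and purely illustrative; the figures are the ones quoted above, and the sketch rounds to the cent at each step, as the text does. Without the intermediate rounding the total comes out about a euro lower, which is well inside the hand-waving.</p>

```python
# Figures quoted in the post; all names here are illustrative assumptions.
HARDWARE_COST_EUR = 2000.0    # upper bound quoted for the equipment
HARDWARE_LIFE_DAYS = 3 * 365  # assumed three-year life, no inflation or interest
POWER_DRAW_KW = 0.4           # assumed 400 W draw
TARIFF_EUR_PER_KWH = 0.135    # 13.5 cents per kWh
RUN_DAYS = 131                # system time to compute and verify


def daily_cost(hardware_cost, life_days, power_kw, tariff):
    """Return (depreciation, electricity) per day, each rounded to the cent."""
    depreciation = round(hardware_cost / life_days, 2)
    electricity = round(power_kw * 24 * tariff, 2)
    return depreciation, electricity


depreciation, electricity = daily_cost(
    HARDWARE_COST_EUR, HARDWARE_LIFE_DAYS, POWER_DRAW_KW, TARIFF_EUR_PER_KWH
)
per_day = round(depreciation + electricity, 2)
recompute_cost = round(per_day * RUN_DAYS, 2)

print(f"depreciation: {depreciation:.2f} EUR/day")
print(f"electricity:  {electricity:.2f} EUR/day")
print(f"total:        {per_day:.2f} EUR/day")
print(f"recompute:    {recompute_cost:.2f} EUR over {RUN_DAYS} days")
```

<p>Changing the assumed hardware life or power draw is a quick way to see how sensitive the total is to each guess.</p>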
<p>The answer needs a terabyte of storage. What will it cost to keep that in a repository? That&#8217;s a slightly more difficult question to answer, but we can give a number of figures that provide upper and lower bounds. <a href="http://www.sdsc.edu/services/StorageBackup.html">SDSC quote</a> $390/TB/year for archival tape storage (dual copies), excluding setup costs and assuming no retrieval. <a href="http://chronopolis.sdsc.edu/assets/docs/dt_cost.pdf">Moore et al</a> quote $500/year as a raw figure, obtained by dividing total system costs by usable storage within it. At current rates of $1 = €0.67, that gives us costs of €261/year and €335/year respectively. SDSC are likely to be at the cheap end of the scale. ULCC&#8217;s costs, given our lower total volumes, would be closer to €1500/year for a similar service (dual archival tape copies on separate sites) although that does include retrieval costs. <a href="http://aws.amazon.com/s3/#pricing">Amazon&#8217;s AWS</a> would be about €100/year for a single copy. You would want two copies, so it&#8217;s twice that, and the cost of transferring the data in would be about 25% more than the storage cost. Since I haven&#8217;t factored in ingest costs for any of the other models, I&#8217;ll ignore it for AWS as well. (And yes, AWS isn&#8217;t a repository, and there&#8217;s no metadata, and&#8230; This is a back-of-the-envelope calculation. It&#8217;s a small envelope.)</p>
<p>Which means, at a very rough level and ignoring many pertinent factors, that after about two years of storage in the repository, we would have been better off recalculating the data rather than storing it. There are a lot of assumptions hidden there, however. For one, we&#8217;re assuming that this data will rarely, if ever, be required. If many people want it, the recalculation cost rapidly becomes prohibitive (and so does the 131-day wait for their request to be satisfied!)</p>
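<p>The break-even point can be sketched the same way. This is only an illustration: the names are mine, the annual figures are the rough ones quoted above, and it assumes the data is recomputed exactly once. On those assumptions every option breaks even within roughly two years, with the two-copy AWS figure taking the longest.</p>

```python
RECOMPUTE_COST_EUR = 410.0  # rounded recomputation cost from the post
USD_TO_EUR = 0.67           # exchange rate used in the post

# Annual cost in euros of keeping one terabyte, using the quoted figures.
# The labels are illustrative, not official service names.
storage_eur_per_year = {
    "SDSC tape, dual copies": 390 * USD_TO_EUR,
    "Moore et al raw figure": 500 * USD_TO_EUR,
    "ULCC estimate": 1500.0,
    "AWS, two copies": 2 * 100.0,
}


def break_even_years(recompute_cost, annual_storage_cost):
    """Years of storage after which one recomputation would have been cheaper."""
    return recompute_cost / annual_storage_cost


break_even = {
    name: break_even_years(RECOMPUTE_COST_EUR, cost)
    for name, cost in storage_eur_per_year.items()
}

for name, years in break_even.items():
    print(f"{name:24s} {storage_eur_per_year[name]:7.0f} EUR/year  "
          f"break-even after {years:.1f} years")
```

<p>If the result were requested more than once, the recomputation cost would be multiplied by the number of requests, which pushes the break-even point out accordingly.</p>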
<p>One of the other problems is more subtle. I said that, in the short term, recalculation costs would be likely to fall as computational power becomes cheaper. The energy costs involved will rise, of course, but there&#8217;s still a significant downward trend. But after a sufficient period of time, it becomes non-trivial to reconstruct the software and the environment it needs in order to allow the computation to happen. Imagine trying to recalculate something now where the original software is a <a href="http://en.wikipedia.org/wiki/PL/I">PL/I</a> program designed to run under OS/360. It&#8217;s not impossible by any means, but the cost involved and expertise required are non-trivial. At least with our example we won&#8217;t have any doubts about whether the right answer has been produced &#8211; the computation of &pi; produces an exact, if never-ending, answer. Most scientific software doesn&#8217;t do this and the exact answers produced can depend on the compiler, the floating-point hardware, mathematical libraries and the operating system. Over time, it becomes harder and harder to recreate these faithfully, and we often don&#8217;t have any means of checking whether or not we have succeeded. (Keeping the original outputs would help in this, of course, but that&#8217;s exactly what we&#8217;re trying to avoid.) That&#8217;s part of the problem that Brian Matthews and his colleagues examine in the <a href="http://sigsoft.dcc.rl.ac.uk/twiki/bin/view/Main/AboutSigSoft">SigSoft</a> project and there&#8217;s still a great deal of work to be done there.</p>
<p>So have we answered Les&#8217;s question? My feeling is that in this case we have &#8211; there&#8217;s a fair amount of evidence that suggests that keeping this particular data set isn&#8217;t cost-effective. But in general, the question is far harder to answer. Yet we must strive harder for more general answers as the cost of not doing so is not trivial. Even if money did grow on trees, it still wouldn&#8217;t be free and at present we need to be very careful how we use it.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2010/01/07/a-repository-for-pies/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>
