<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>ulcc da blog &#187; conference</title>
	<atom:link href="http://dablog.ulcc.ac.uk/tag/conference/feed/" rel="self" type="application/rss+xml" />
	<link>http://dablog.ulcc.ac.uk</link>
	<description>blogging about digital archives &#38; repositories since 2007</description>
	<lastBuildDate>Tue, 14 May 2013 10:42:32 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.4.2</generator>
		<item>
		<title>¡La varita mágica!</title>
		<link>http://dablog.ulcc.ac.uk/2009/06/09/la-varita-magica/</link>
		<comments>http://dablog.ulcc.ac.uk/2009/06/09/la-varita-magica/#comments</comments>
		<pubDate>Tue, 09 Jun 2009 09:23:33 +0000</pubDate>
		<dc:creator>Patricia Sleeman</dc:creator>
				<category><![CDATA[Digital Archives]]></category>
		<category><![CDATA[archives]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[digital preservation]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[presentations]]></category>
		<category><![CDATA[spain]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=641</guid>
		<description><![CDATA[Translation: the digital preservation silver bullet which people keep looking for. Well, as many of us know it doesn&#8217;t exist! This was part of my opening speech for the XV Jornadas de la Conferencia de Archiveros de las Universidades Españolas. This is the annual meeting of all Spanish university archivists. I spoke about &#8220;El perfil [...]]]></description>
			<content:encoded><![CDATA[<p><div id="attachment_652" class="wp-caption alignleft" style="width: 260px"><img src="http://dablog.ulcc.ac.uk/wp-content/uploads/2009/06/canonfire-225x300.jpg" alt="Holes made by French cannon fire in a building in Alicante" title="canonfire" width="225" height="300" style="margin-right: 2ex;" class="size-medium wp-image-652" /><p class="wp-caption-text">The magic bullet holes the French made in Alicante</p></div>Translation: <em>the digital preservation silver bullet which people keep looking for</em>. </p>
<p>Well, as many of us know, it doesn&#8217;t exist! This was part of my opening speech for the <a href="http://web.ua.es/en/jornadas-cau/programa_pdf.pdf.">XV Jornadas de la Conferencia de Archiveros de las Universidades Españolas</a>, the annual meeting of all Spanish university archivists. I spoke about &#8220;El perfil del archivero en el entorno digital&#8221;, or &#8220;the profile of the archivist in the digital world&#8221;. Most universities in Spain have an archivist, who also performs the role of records manager; records management doesn&#8217;t exist as a profession in Spain. Repositories existed in almost all universities, but were on the whole in the libraries and not within the archives. There is also a big divide between librarians and archivists in Spain, so not a great deal of exchange goes on between these sectors. Many questions concerned costs as well as approaches to preservation. An excellent book about digital preservation has been written in Spanish by Jordi Serra of the University of Barcelona: &#8220;Gestión de los documentos digitales: estrategias para su conservación&#8221; (&#8220;Electronic records management: strategies for long-term preservation&#8221;).</p>
<p>A lot of discussion revolved around the struggle to convince superiors in the organization of the importance of digital preservation, around how access drives so much of what is going on, and around the big problem of engaging our techy friends in digital preservation and making them aware of the issue. The biggest universities, such as <a href="http://www.ucm.es/">Complutense</a> and <a href="http://www.uned.es/">UNED</a> (<em>Universidad Nacional de Educación a Distancia</em>), were of course represented. UNED, which is based in Madrid and has branches all over the world, is the Spanish equivalent of the UK&#8217;s Open University. A lot of discussion also took place about the involvement of non-Anglo-Saxon countries in international fora such as ISO panels and ICA. While all these endeavours are important, a certain amount of frustration is evident in relation to decision making and power!</p>
<p>There was much talk and walking through the historical centre of Alicante and, in the interests of professional scholarship and in keeping with traditional Spanish culture, I stayed awake until 3am, catching the plane home to rainy London the next day.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2009/06/09/la-varita-magica/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>rpmeet &#8211; the JISC Repositories and Preservation Programme Meeting</title>
		<link>http://dablog.ulcc.ac.uk/2009/05/10/rpmeet/</link>
		<comments>http://dablog.ulcc.ac.uk/2009/05/10/rpmeet/#comments</comments>
		<pubDate>Sun, 10 May 2009 20:32:31 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[AIDA]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[JISC]]></category>
		<category><![CDATA[JiSC-PoWR]]></category>
		<category><![CDATA[preservation]]></category>
		<category><![CDATA[PRIMO]]></category>
		<category><![CDATA[repositories]]></category>
		<category><![CDATA[rpmeet]]></category>
		<category><![CDATA[SNEEP]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=575</guid>
		<description><![CDATA[Some of us at ULCC, and over 100 other people from around the UK, spent a couple of days this week at the Aston Business School reviewing the outcomes of JISC&#8217;s repositories and preservation programme and looking forward to what comes next. It was a useful and stimulating couple of days &#8211; the best programme [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.jisc.ac.uk/whatwedo/programmes/reppres.aspx"><img align="left" width="320" height="247" src="http://www.jisc.ac.uk/whatwedo/programmes/~/media/JISC/programmes/reppres/rpprog_structure_smaller3.ashx" alt="Diagram of programme elements" /></a></p>
<p>Some of us at ULCC, and over 100 other people from around the UK, spent a couple of days this week at the Aston Business School reviewing the outcomes of JISC&#8217;s <a href="http://www.jisc.ac.uk/whatwedo/programmes/reppres.aspx">repositories and preservation programme</a> and looking forward to what comes next. It was a useful and stimulating couple of days &#8211; the best programme meeting I&#8217;ve attended so far. The few projects that weren&#8217;t represented at the meeting missed out in a lot of ways. If you&#8217;re involved in a JISC project, make sure you, your project manager, or both of you go to a programme meeting when you are invited. You&#8217;ll learn a lot, make some useful contacts, save some time, get some useful ideas and possibly lay the groundwork for future projects or collaborations.</p>
<p>I began the day by chairing the final meeting of <a href="http://www.jisc.ac.uk/aboutus/committees/workinggroups/repositoriespreservation.aspx">RPAG</a> (the Repositories and Preservation Advisory Group). <span id="more-575"></span>We had a short meeting, mainly to follow up on discussions we had been having on how the group had operated and how JISC might make use of advisory bodies in future. Those who expressed an opinion all felt it had been useful to them, but we all had concerns about how our time, and the JISC Executive&#8217;s time, might have been used more effectively. Future advisory groups may try to split responsibility for some areas into smaller working groups. All were agreed that the face-to-face meetings were invaluable, but we weren&#8217;t all agreed on which online technology would be best to use in between times. Enthusiasts for tools like ideascale were matched by those who found them unusable.</p>
<div style="width:425px;text-align:left" id="__ss_1399859"><object style="margin:0px" width="300" height="250"><param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rpplenary200905-090507081547-phpapp01&#038;rel=0&#038;stripped_title=jisc-repositories-and-preservation-programme-plenary-presentation-2009" /><param name="allowFullScreen" value="true"/><param name="allowScriptAccess" value="always"/><embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=rpplenary200905-090507081547-phpapp01&#038;rel=0&#038;stripped_title=jisc-repositories-and-preservation-programme-plenary-presentation-2009" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="300" height="250"></embed></object>
<div style="font-size:11px;font-family:tahoma,arial;height:26px;padding-top:2px;">View more <a style="text-decoration:underline;" href="http://www.slideshare.net/">presentations</a> from <a style="text-decoration:underline;" href="http://www.slideshare.net/kevinashley">kevinashley</a>.</div>
</div>
<p>The meeting proper opened with some background and perspective from Rachel Bruce and Neil Grindley of JISC and myself. I tried &#8211; partly seriously, but without much expectation of accuracy &#8211; to give a one-line summary of what each project set out to do. But there were two things I meant to say and failed to. One was to look forward to the theme of day 2 (Value) and stress that repositories are not ends in themselves, but need to be thought of in terms of value, impact and benefits to someone. The second was to remind us that, for innovation projects, failure in one sense can still mean success, as long as we understand the nature of the failure and are able to use it to improve and adapt future work. Not achieving what you set out to do is disappointing. Analysing the reasons for that and making sure others are aware of them can be of great value. </p>
<p>But it was the rest of the event that provided greatest interest: the discussion sessions on text mining, research data, teaching and learning repositories and more; presentations from projects from stakeholder, developer and other perspectives; posters and demos from many of the projects; and the fever of activities in the ideas room, which deployed technology ranging from post-it notes upwards to catalyse, capture and refine ideas from the attendees. These activities gave the event much more of a participatory feel &#8211; everyone became a contributor rather than a consumer.</p>
<p><a href="http://www.wordle.net/gallery/wrdl/830199/AIDA_project_proposal" title="Wordle: AIDA project proposal"><img src="http://www.wordle.net/thumb/wrdl/830199/AIDA_project_proposal" alt="Wordle: AIDA project proposal" style="padding:4px;border:1px solid #ddd" align="right" /></a> I learned a few things over the course of a day or two, most of them unexpected. David Flanders (via Chris Rusbridge) passed on the neat idea of feeding funding proposals through Wordle before marking them. That&#8217;s what ULCC&#8217;s <a href="http://aida.jiscinvolve.org/">AIDA</a> project looked like. Perhaps you ought to try the same with your proposals prior to submitting them?</p>
<p>I learned that talking unprepared and unscripted to a video camera doesn&#8217;t produce great results unless you&#8217;ve had practice or training &#8211; neither of which I&#8217;ve had. I knew that in an abstract sense and now have the unfortunate experience to back it up. But Andy McGregor and Dave Flanders did capture some other people talking far more sense than I did and far more clearly, and you can see the results on the <a href="http://www.youtube.com/user/dev8D">dev8D youtube channel</a>.</p>
<p>Andrew Prescott&#8217;s overview of the Welsh Repository Network provided us with the surprising finding that smaller institutions are more, not less, likely to want to run their own repository rather than contract it out to someone else.</p>
<p>And via a serendipitous typo, we all contemplated whether working in a repositoire might not be an altogether more rarified and sophisticated career option than working with a repository.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2009/05/10/rpmeet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>International Repositories Infrastructure Workshop: public wiki now open</title>
		<link>http://dablog.ulcc.ac.uk/2009/04/15/international-repositories-infrastructure-workshop-public-wiki-now-open/</link>
		<comments>http://dablog.ulcc.ac.uk/2009/04/15/international-repositories-infrastructure-workshop-public-wiki-now-open/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 15:13:18 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Repositories]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[EAD]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[infrastructure]]></category>
		<category><![CDATA[international]]></category>
		<category><![CDATA[JISC]]></category>
		<category><![CDATA[repinf09]]></category>
		<category><![CDATA[repositories]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=485</guid>
		<description><![CDATA[About a month ago (March 15-17) I attended an invitation-only event entitled &#8220;An International Repositories Infrastructure Workshop&#8221; in Amsterdam. Others have already blogged more contemporaneously about this event, including Chris Rusbridge, Amanda Hill and Jeremy Frumkin. They all provide a good summary of some of what took place, the activities which led up to the [...]]]></description>
			<content:encoded><![CDATA[<p>About a month ago (March 15-17) I attended an invitation-only event entitled &#8220;An International Repositories Infrastructure Workshop&#8221; in Amsterdam. Others have already blogged more contemporaneously about this event, including <a href="http://digitalcuration.blogspot.com/2009/03/international-repositories.html">Chris Rusbridge</a>, <a href="http://namesproject.wordpress.com/2009/03/17/repository-infrastructures/">Amanda Hill</a> and <a href="http://digitallibrarian.org/?p=44">Jeremy Frumkin</a>. They all provide a good summary of some of what took place, the activities which led up to the workshop and some sources of other information.</p>
<p>What&#8217;s prompted me to write about it now is the news that the outputs from that workshop are now visible, and the ongoing process of revising and amending them is taking place in a far more public forum on pbwiki. <a href="http://repinf.pbwiki.com/">repinf.pbwiki.com</a> is somewhere you should visit if you are, in the words of its homepage:</p>
<blockquote>
<p>&#8230;. interested in:</p>
<p>1. developing coordinated action plans for specific areas of repository development</p>
<p><span id="more-485"></span></p>
<p>2. pursuing those plans</p>
<p>3. coordinating that activity internationally</p>
<p>An international workshop in March 2009 kicked off this process, which is now open to anyone willing to contribute.</p>
</blockquote>
<p>If you have an opinion on things like interoperable identifiers, citation services, streamlining deposit workflows or (most contentiously) international repository organisations, you need to take a look at these materials. The workshop that produced them was a curious and mixed event, but it certainly had some positive features. It brought together interested experts from around the world to consider things that need doing with repositories that can only happen through joined-up international action. We did our best to focus on things that could be done in a reasonable timescale and that would produce clear benefits. At the end a group of funders &#8211; many of whom clearly weren&#8217;t quite sure what was expected of them &#8211; spent an hour or two considering the ideas that came out of the four workshop groups and voiced their own opinions about them. Some of the ideas drew widespread support from public and commercial organisations, whilst others were not yet clearly developed enough, or were still too parochial. Generally, there was a clear willingness to take action, but some of the plans needed more work before funders could act. The wiki is where that work will be done (with the contentious exception noted above).</p>
<p>The idea of getting joined-up thinking between doers, thinkers and funders has succeeded. Anyone can read the material on the repinf wiki site, and anyone can edit it once they ask for a userid. <a href="http://www.youtube.com/watch?v=NMIcCCkIl-w">Do it.</a> <a href="http://www.youtube.com/watch?v=AW94AEmzFhQ">Do it now.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2009/04/15/international-repositories-infrastructure-workshop-public-wiki-now-open/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>4th International Digital Curation Conference part 3 (idcc4 rides again)</title>
		<link>http://dablog.ulcc.ac.uk/2008/12/22/4th-international-digital-curation-conference-part-3-idcc4-rides-again/</link>
		<comments>http://dablog.ulcc.ac.uk/2008/12/22/4th-international-digital-curation-conference-part-3-idcc4-rides-again/#comments</comments>
		<pubDate>Mon, 22 Dec 2008 17:28:59 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Digital Archives]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[data curation]]></category>
		<category><![CDATA[DCC]]></category>
		<category><![CDATA[digital curation]]></category>
		<category><![CDATA[edinburgh]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[idcc4]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/?p=246</guid>
		<description><![CDATA[This is the third and final post of mine summing up my notes from the 4th international digital curation conference (now a few weeks ago.) These notes cover the bulk of the second day, which consisted of submitted (as opposed to invited) papers. Most of these were given in parallel tracks, so I&#8217;m only able [...]]]></description>
			<content:encoded><![CDATA[<p>This is the third and final post of mine summing up my notes from the 4th International Digital Curation Conference (now a few weeks ago). These notes cover the bulk of the second day, which consisted of submitted (as opposed to invited) papers. Most of these were given in parallel tracks, so I&#8217;m only able to cover half of them. As luck would have it, Chris Rusbridge seems to have <a href="http://digitalcuration.blogspot.com/2008/12/strand-b1-research-papers-at-idcc4.html">covered many of the others.</a></p>
<p>The opening paper, however, was in its own session as it won the prize for best paper of the conference: Manjula Patel and Alexander Ball&#8217;s study on the issues surrounding the preservation of engineering CAD models was a useful guide to the <a href="http://www.ukoln.ac.uk/projects/grand-challenge/">problems and to possible solutions</a>. One of the things that they opened my eyes to is that CAD files aren&#8217;t just about geometry and spatial representation &#8211; there&#8217;s lots of other information possibly embedded in them or attached to them, from engineering tolerances to feedback from the manufacturing process or field maintenance.</p>
<p>Aaron Griffiths then spoke about the RIN report looking at attitudes to data publishing across different disciplines (<em><a href="http://www.rin.ac.uk/data-publication">To Share or Not To Share: Publication and quality assurance of research data outputs</a></em>). <span id="more-246"></span> It examined culture, infrastructure barriers, the effects of policy and overall propensity for sharing across a number of disciplines via interviews with researchers: astronomy was generally high (that is, well disposed towards sharing), social and health sciences low, for instance. Motivations included altruism and opportunities for collaboration; constraints included time, legal and ethical issues, competition, a sense of limited rewards, and having nowhere to put stuff. The report says we need incentives: evidence of benefits, standard workable mechanisms for citation, and more explicit rewards that affect career progression. It also looked at issues around discovery, access and usability, and then asked researchers about processes relating to quality assurance. QA didn&#8217;t get support from the researchers they spoke to, but I feel we can&#8217;t have greater rewards for data publishing without a willingness to have QA. His presentation made it clear to me that I ought to read the report.</p>
<p>Amanda Spencer then spoke about TNA&#8217;s web continuity project, which aims to ensure (eventually) that 404 errors on UK government websites will be a thing of the past. One detail that was new to me (we&#8217;ve covered much of this in the PoWR project) was that the TNA team followed up their original study of link persistence in Hansard answers (which found that 60% of the 4,000 URLs given in answers to parliamentary questions over the last 10 years no longer worked) with a more wide-ranging study of government sites using Google Webmaster Tools. They wanted to see whether PQ answers were in some way a special case: they weren&#8217;t. They are promoting the use of XML sitemaps to help in capture, but these also help search engines and therefore current use of the sites.</p>
<p>Ronald Jantz then spoke of ideas around institutional support for authentic digital objects, starting with the assumption that &#8220;digital scholarship requires authentic digital objects&#8221;. An example he gave in support of this (attempts to recreate a cold fusion experiment failed because the experimental design &#8211; including the location of the thermometer &#8211; wasn’t available) was interesting, but seemed relevant to questions of selection or completeness rather than of authenticity. He reminded us of the useful distinction made in archival diplomatics between authenticity and reliability, but ended up proposing that the solution to the authenticity question depended on institutionally-supported key signing, together with some process like TRAC to authenticate the institution and/or its repository. I and others weren&#8217;t convinced by this argument, nor of its novelty. The issue of keys and what they attest was being dealt with at the time of the second DLM forum in 1999 by the Swedish National Archives, amongst others. Using keys to place the equivalent of wax seals on documents isn&#8217;t difficult; the complex problem is whether the trust networks exist to make the keys useful to anyone.</p>
<p>A presentation about the curation of weather data at NCAR provided lots of numbers and some interesting new terminology. &#8216;Enriched staff&#8217; are those who have subject knowledge of the data they are curating. NCAR storage requirements double every 2.5 years, and are now about 6 petabytes. A new phrase for what I think of as media refreshing &#8211; moving data between storage media without necessarily changing the format of the data &#8211; was &#8216;tape oozing&#8217;. This is still something that requires significant planning and coordination for large data centres, to ensure that data can be moved before media become obsolete and without interfering with day-to-day access needs. NCAR&#8217;s observation is that poor curation causes more incidents of data loss than equipment failure or media failure. &#8216;Poor curation&#8217; can mean both bad practice and the failure to follow established good practice. (This is true of many areas of activity, from healthcare to manufacturing, and quality management systems like ISO9000 are designed to minimise both causes of error, but they will never eliminate it.)</p>
<p>Mackenzie Smith then gave the day&#8217;s second paper on CAD, specifically in architecture. A fascinating collaboration with MIT&#8217;s architecture school looked at well-known architects like Gehry, Thom Mayne and <a href="http://www.msafdie.com/">Moshe Safdie</a>. The data is a 3D CAD model (known as a BIM &#8211; building information model) which moves from architects to builders to owners. Target audiences are practitioners, historians, instructors and the public. The practitioners already know they have a problem managing the data, so are incentivised. They have tens of thousands of files, 100+ file formats, many gigabytes and almost no metadata: just filesystems, sometimes with spreadsheets in them that are like partial catalogues. The MIT project built a GUI to help automate the assignment of 5 properties to every file: when, where, how, why, what. &#8220;How&#8221; is generic type, such as presentation; &#8220;what&#8221; is specific format; &#8220;when&#8221; is to do with building phase rather than absolute time. Important stuff gets extra tags: models tell you what was built, but not why. MIT are archiving by creating 3 derivatives (STEP, desiccated shape, and web display), all of which lack something. One problem is that geometry is not authentic, but practitioners say much parametric stuff is tool artefact, not design intent. CAD vendors are very resistant to releasing format information. The project is exploring escrow solutions for this, which sounds like a pragmatic way forward.</p>
<p>In the afternoon, Stephen Abrams began with the statement that “Preservation is not a place” and went on to a wide-ranging reflection on the nature and role of repositories in a way that has relevance for many other institutions and services. Starting from first principles, Abrams&#8217; group worked from values to services, reimagining the repository as a process, not a place (still a work in progress). They went back to <a href="http://en.wikipedia.org/wiki/Five_laws_of_library_science">Ranganathan&#8217;s 5 laws</a> and also looked at archival science, particularly concerns over provenance and its importance in supporting authenticity. They identified 10 properties which they think apply to all objects (identity, viability and visibility are some). But curation depends on a lot of human things: competency, decision making, analysis. They identify 13 minimal services that support the human endeavours: these include identity services (ark and noid), storage (pairtree), fixity/replication (ACE), catalog (relational or rdf/xml database), characterization (jhove2), transformation (at ingest, preservation desiccation/migration, and use copy generation), ingest (bagit/grabit), request &#8211; metadata-based search and browse, search (lucene), publication (sitemap, rss/atom, oai-pmh), annotation (social tagging, oai-ore). Any one of those would have made an interesting paper in itself, and some of them were the subject of separate posters. He ended by asking how simple a curation environment can be and still be effective. He also took the concept of LOCKSS a little bit further: lots of description keeps stuff meaningful, services keep stuff useful, use keeps stuff valuable.</p>
<p>Finally, Malcolm Atkinson gave a stirring speech which began with thoughts of scientists who still go out once a month to count birds or moss species &#8211; trained volunteers are important as well as industrialised scientists. Data can have uses not envisaged when it was created, nor even when it was first preserved: the original purpose of ships&#8217; logs was safety. They were collected by archives to understand trade, and are now helping climate modelling because of their detailed records of such things as temperature and wind. Soon power will cost more than hardware: 1PB consumes 2.2MWh in 5 years. He talked of ‘data huggers’ who avoid competition and don&#8217;t describe, don’t expose and don’t share their data. There are also good data sharers, but at the moment incentives and disincentives aren’t well balanced. We need research on how to enable better research, and data must be fundamental to this.</p>
<p>IDCC has been over-subscribed for 3 of its years and always offers lots of inspiration and food for thought. The linkup with CODATA, whose events cover a similar subject area, can only be a good thing. I&#8217;m already looking forward to IDCC5, expected to be in London in 2009.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2008/12/22/4th-international-digital-curation-conference-part-3-idcc4-rides-again/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>4th International Digital Curation Conference part 2 (return of idcc4)</title>
		<link>http://dablog.ulcc.ac.uk/2008/12/07/4th-international-digital-curation-conference-part-2-return-of-idcc4/</link>
		<comments>http://dablog.ulcc.ac.uk/2008/12/07/4th-international-digital-curation-conference-part-2-return-of-idcc4/#comments</comments>
		<pubDate>Sun, 07 Dec 2008 19:54:45 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Digital Archives]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[data curation]]></category>
		<category><![CDATA[DCC]]></category>
		<category><![CDATA[digital curation]]></category>
		<category><![CDATA[edinburgh]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[idcc4]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/2008/12/07/4th-international-digital-curation-conference-part-2-return-of-idcc4/</guid>
		<description><![CDATA[My last post from IDCC4 ended with me being unable to report on John Wilbanks&#8217; closing keynote on day 1, so I&#8217;ll rectify that now with the benefit of handwritten notes and a little more time for reflection. John is working on the Science Commons initiative which, through projects and advocacy, is taking the Creative [...]]]></description>
			<content:encoded><![CDATA[<p>My <a href="http://dablog.ulcc.ac.uk/2008/12/03/the-4th-international-digital-curation-conference-idcc4/">last post</a> from IDCC4 ended with me being unable to report on John Wilbanks&#8217; closing keynote on day 1, so I&#8217;ll rectify that now with the benefit of handwritten notes and a little more time for reflection.</p>
<p>John is working on the <a href="http://sciencecommons.org/">Science Commons</a> initiative which, through projects and advocacy, is taking the Creative Commons concepts and applying them to the doing of science, as well as the publishing of its outputs. (The DCC were <a href="http://forum.dcc.ac.uk/viewtopic.php?t=106">drawing our attention</a> to the initiative over 3 years ago.) He began with a view[1] that science is not unlike Wikipedia: both are about publishing, in the sense of disclosure; advances are made by individual action and proceed by small, discrete steps; and trust ratings accrue from peer review. He also commented on the &#8220;tyranny of the crowd&#8221; effect that general search tools like Google suffer from: someone searching for information about spears (their manufacture, use, or &#8220;spears, the carrying and chucking of&#8221;) will be somewhat overwhelmed by the number of results relating to Britney. And from this he moved to a view that science, to advance further, requires a disruptive change to its practices, one that it is inherently resistant to. One thing that needs to change is the notion that science is communicated through periodic papers (itself an outdated metaphor), &#8220;units of knowledge which are adverts for years of work.&#8221; He observed that, even if we still want papers, we really want them with embedded (ideally semantic) linking and tagging. Yet, although we have the technology to do this in a semi-automated way, the licences that apply to many e-journals explicitly prevent us from doing so.</p>
<p>He then moved to considering ways to improve openness of journals and their content: by giving incentives such as better statistics to those who publish in open journals, and through simple but effective tools such as the <a href="http://scholars.sciencecommons.org/">scholar&#8217;s copyright and addendum tool</a>, <span id="more-245"></span> a really simple idea which impressed me: something which automatically adds text to a publisher&#8217;s licence from a closed journal which gives the author the necessary rights to self-publish (and hence link and annotate) their work. He demonstrated the effect of linking and the power of the semantic web through a Google search on particular receptors in brain chemistry and the genes which relate to them. A Google search, despite highly specific language, returned 88,400 results, most of which were papers about the receptors. But what the researcher probably wants is a list of genes and evidence for the nature of their relationship (as encoders, regulators, etc.). Their semantic web tools give exactly this, and allow the resultant RDF query to be turned into a simple (if unwieldy) hyperlink. What&#8217;s more, they were able to use Google Maps for brain data in a way that allows it to be annotated; one of those clever, simple ideas that makes you wonder why no one else did it before. As John made clear, one of the reasons is that the data isn&#8217;t sufficiently open.</p>
<p>He argued strongly that, for open science data to realise its potential, we must abandon the notion of requiring attribution and/or citation because it places too great a burden on those combining data from multiple sources. He&#8217;s doing pilot work on open data in Science Commons with charities funding work on <a href="http://neurocommons.org/">rare brain diseases</a>: the sort of thing in which the ability to link scattered data, often from other areas of research, hugely amplifies the value of the original research funding. Working with Google, they&#8217;ve come to the realisation that typical data mashups or search results might involve 40,000 individual citations if all the data sources are taken into account. For Google that&#8217;s a burden they&#8217;re not willing to deal with. So John is arguing strongly that we should abandon the desire to have databases, or cells in databases, attributable. It was a powerful argument, although I&#8217;m concerned that it shouldn&#8217;t be forgotten that we often still need the ability to determine data provenance, sometimes at the level of an individual value in a database cell, to ensure that we know we&#8217;re comparing like with like, or applying an appropriate statistical technique. And the provenance information is often tied up with the citation information. Still, John&#8217;s argument, and that of the Science Commons, seems very persuasive: huge benefits can be gained from making data completely open (as in public domain) and we will not realise those benefits if we cling to attribution or citation. He also made the point that although data growth is exponential, our brain capacity remains constant, and that the only human factor increasing along with data is population. People-driven annotation and sharing therefore helps us process increasing volumes of information (I think.)</p>
<p>Cameron Neylon has also <a href="http://blog.openwetware.org/scienceintheopen/2008/12/02/quick-update-from-international-digital-curation-conference/">written about John&#8217;s talk and that crucial distinction between attribution and citation.</a> (His blog post also contains a link to <a href="http://www.slideshare.net/CameronNeylon/radical-sharing-transforming-science-presentation">his presentation from Tuesday morning</a>, which makes a great case for the need to curate networks of science, not digital objects per se. I&#8217;m sorry I missed the talk.) John points out that many people confuse the need to attribute with the need to cite; attribution is a legal requirement, bound up with copyright and licences (even in the open world of Creative Commons) and failure to attribute material puts you legally in the wrong. Citation is merely a social convention or an ethical obligation, something we <em>ought</em> to do; failure to cite leads to the disapproval of your peers and is called plagiarism, at least when you attempt to pass off the ideas of others as your own. And that leads to the conclusion that, for an academic, the consequences of not meeting that social obligation are potentially much worse than the consequences of falling foul of copyright law. The latter may, at worst, lead to a financial penalty which is unpleasant but survivable, whereas plagiarism can lead to the loss of reputation, job and career &#8211; despite no laws having been broken.</p>
<p><a href="http://www.flickr.com/photos/asifch/356920779/sizes/l/"><img src="http://dablog.ulcc.ac.uk/wp-content/uploads/2008/12/356920779_c0f4d50b91_m.jpg" alt="Edinburgh Castle in the mist: asifch@flickr.com, CC-BY-ND-NC licence" title="Edinburgh Castle in the mist: asifch@flickr.com, CC-BY-ND-NC licence" class="float-right" style="border: 1px solid #dddddd; padding: 4px" /></a>But that was a minor corollary of the Science Commons thesis, which appears to be that we have to be willing to really let go of data, in a way that we haven&#8217;t done before, to allow science to proceed in a way that permits disruptive, rather than incremental, change.</p>
<p>Tuesday morning saw Martin Lewis, university librarian at Sheffield, look at the role libraries need to play in supporting research. His talk had two threads: one a reflection on how libraries are still agile, capable of change and sensitive to the needs of scholars and learners, and one a summary of the findings and consequences of the <a href="http://www.ukrds.ac.uk/">UKRDS</a> report, which at present is still in draft. He used new library buildings at Sheffield and Glasgow Caledonian to illustrate that libraries now create very different sorts of spaces for learners and as a result still attract students to congregate there. Capping of collection size is now commonplace &#8211; they recognise that most collections cannot continue to consume more space indefinitely. He reflected that UK university libraries had been devoting too much attention to learning and teaching and had, as a result, neglected the needs of researchers. And thus he moved to his assertion that university library services were best placed to meet the needs that the UKRDS feasibility study would identify. Martin says that they can raise awareness of data issues, lead policy on data management, provide advice to researchers early in the data lifecycle, and work with IT services to develop appropriate local capacity.</p>
<p>Martin&#8217;s talk was entertaining, informative and well-argued in equal measure, but I am not entirely persuaded by it. University libraries certainly have a part to play here, and in some institutions they may well need to adopt a central role. But just because they can do all of the things that Martin describes does not mean that they are the only people who can, nor that they are the best-placed to do so in every case. His argument put up a number of straw men, one of which seemed to see future RDS provision as a choice between a library-led service and a computing services-led one. It was a cheap dig (&#8220;How many university computing services have a &#8216;friends&#8217; organisation?&#8221;) and sets up a false dichotomy. This won&#8217;t be a simple either/or choice and there are many other routes to the intended end of a federated shared service of some sort. He emphasised that the UKRDS study was primarily a political act to keep the issue of research data management high on the agenda and that it should be seen in that light. I&#8217;m in complete agreement with that, and that&#8217;s the very reason I don&#8217;t see that it&#8217;s useful to engage in a turf war about which part of the university takes this forward.</p>
<p>That takes us to just over halfway at IDCC4, and it&#8217;s enough for one blog post. The rest will follow in a day or two, and there&#8217;s lots more of interest to write about.</p>
<p>&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8212;&#8211;</p>
<p>[1] &#8211; which he credited to someone else that sounded like Jean-Pierre Galdon, but wasn&#8217;t</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2008/12/07/4th-international-digital-curation-conference-part-2-return-of-idcc4/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>The 4th International Digital Curation Conference: idcc4</title>
		<link>http://dablog.ulcc.ac.uk/2008/12/03/the-4th-international-digital-curation-conference-idcc4/</link>
		<comments>http://dablog.ulcc.ac.uk/2008/12/03/the-4th-international-digital-curation-conference-idcc4/#comments</comments>
		<pubDate>Wed, 03 Dec 2008 00:54:14 +0000</pubDate>
		<dc:creator>Kevin Ashley</dc:creator>
				<category><![CDATA[DA Blog]]></category>
		<category><![CDATA[Digital Archives]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[data curation]]></category>
		<category><![CDATA[DCC]]></category>
		<category><![CDATA[digital curation]]></category>
		<category><![CDATA[edinburgh]]></category>
		<category><![CDATA[Events]]></category>
		<category><![CDATA[idcc4]]></category>

		<guid isPermaLink="false">http://dablog.ulcc.ac.uk/2008/12/03/the-4th-international-digital-curation-conference-idcc4/</guid>
		<description><![CDATA[This is a brief and undigested report from day 1 of the DCC&#8217;s international digital curation conference taking place in Edinburgh. After a welcome from Chris Rusbridge (DCC director) and Professor Peter Clarke (NeSC director) we had a keynote from Professor David Porteous, a professor of human molecular genetics and medicine and a key player [...]]]></description>
			<content:encoded><![CDATA[<p>This is a brief and undigested report from day 1 of the DCC&#8217;s international digital curation conference taking place in Edinburgh. After a welcome from Chris Rusbridge (<a href="http://www.dcc.ac.uk/">DCC</a> director) and Professor Peter Clarke (<a href="http://www.nesc.ac.uk/">NeSC</a> director) we had a keynote from Professor David Porteous, a professor of human molecular genetics and medicine and a key player in <a href="http://generationscotland.org/">Generation Scotland</a>.</p>
<p>He began by illustrating the changes in health, disease and knowledge of causes that lie behind some of his work. Changes in Scottish demography illustrate this: in 1911 everyone is young, and numbers decline with age in a smooth curve. After WWII, in 1951, there is a flat bulge from ages 10 to 50 with a decline thereafter, whereas 2001 and 2031 see a bulge in pensioners. There is a consequent rise in chronic disease: disease that treatments of today are not very effective for, unlike the killers of the past, where effective treatments contributed to the changes in age profile in the population that we are now seeing. He illustrated this with reference to the <a href="http://www.sasi.group.shef.ac.uk/publications/reaper.html">Grim Reaper&#8217;s Road Map</a>: an atlas of mortality in the UK (and, as he said, a fine book title.) It showed cancer, heart and lung disease unequally distributed over the UK. Glasgow is particularly bad. Why? Nature and nurture both play a part, but other than smoking, we have very little evidence for the real effects underlying nurture causes such as diet variation. So his research concentrates on the nature aspect: what difference our genetic inheritance makes.</p>
<p>He then looked at changes in sequencing costs. From 1990-2003 the Human Genome Project spent $3bn to sequence the first genome with machines spread over aircraft hangars; now one machine can do a genome for $500k, and in the next year $5k. <a href="http://www.completehenomics.com">completegenomics</a> plans 20,000 genomes at $5k each in 2010 using 60,000 processors and 30PB of storage. <span id="more-244"></span>One goal of all this is personalised medicine &#8211; drugs that work for your genetic makeup. In the USA, adverse drug reaction is the 4th leading cause of hospital mortality (but I&#8217;m thinking that only some of this is genome-related; some of it must surely be because some drugs are just downright toxic, with prescription involving balanced risk that sometimes doesn&#8217;t go the way we want, and because prescribing errors are still all too frequent.) Bringing together mass genomics and automated drug screening is key, and involves two big sets of data. Generation Scotland plans to do this: a competitive advantage is an unhealthy, stable population. Large-scale family-based studies are possible, and there is a supportive attitude in Scotland to doctors and medical schools. Expertise in health informatics and ethical, legal and social science is essential. It&#8217;s all volunteer-based (striking contrast to Iceland and UK genome bank); grandmothers are key influencers. Recruits have blood, serum and urine samples and tests of lung, bone, etc. and mental health status, so it&#8217;s more than just aggregating health records. Mental health is one area where drugs generally don&#8217;t work and interaction between nature, nurture, aetiology and drugs needs to be explored in much more depth. The system is linked to medical records; subjects have the right to withdraw.</p>
<p>There were 10 years of consultation before it started. But then there&#8217;s Google Health and Google health trends &#8211; both ways of using large amounts of data to gain knowledge which work in different ways.</p>
<p>I missed the next morning sessions because I had to attend another meeting, and rejoined to see the minute-madness presentations for the posters, about which I hope to write later this week.</p>
<p>After lunch we had Dr Bryan Lawrence of <a href="http://www.scitech.ac.uk/">STFC</a> talk about big science data curation at the STFC environmental data archive. There were lots of numbers in this presentation and I only captured some of them: petabytes of data overall, 50TB from the Met Office, 4000 years&#8217; work to &#8216;look&#8217; at it all. That&#8217;s why you need metadata, because you can&#8217;t examine it all at ingest. 2 minutes to find and do something simple: 60,000 images/year/person. So we need to automate metadata creation and extraction. Google needs this metadata to help; it can&#8217;t deal with non-cited data directly. Most data mining processes text, not image data. We find data with discovery and ontology metadata; then look at context, character and discipline stuff; then also archival metadata. He mentions ISO 19115, should be derived from browse metadata. (There was a much more formal classification of metadata in his presentation which I haven&#8217;t captured in these notes.)</p>
<p>Data scientists can&#8217;t do their job unless the scientist has done theirs. They can choose <strong>not</strong> to take stuff, though, because the scientist hasn&#8217;t done their job. But even not taking something consumes resources to make the assessment and decision. Makes point that you can automate streams, but you can&#8217;t automate jobs away (10 things still need to be done, even if they are automated, so there&#8217;s still a linear resource relationship to the number of objects.)</p>
<p>They charge for 3 years&#8217; storage up-front at ingest time, which means that, if volumes continue to grow, historical data storage lives on the margins from current business. Core budget pays for management and access systems, data management, network access, etc. then per data stream costs are charged to projects. Core covers some projects already; with 25 FTE they can support 10 new types of data per year, 100 things of a few hours&#8217; work, 1000 things of a few minutes&#8217; work, and beyond that it must be automated. The next IGPCC requirements change the scale and thus the cost models. Interesting problems are in browse ontology and extra metadata space. Preserving metadata now presents its own challenges; real data publication is the way forward. In questions we determine that the deluge is necessary for the cost model, as is storage cost reduction. And at the moment it isn&#8217;t worth bothering about things to throw away, but that could change if those cost assumptions aren&#8217;t true (such as if storage costs plateau at some point.)</p>
<p>Neal Beagrie then speaks about research data costs. 4 case studies, 12 interviews, a literature review and a detailed look at 2 cost models against OAIS and UKHE TRAC led to <a href="http://www.jisc.ac.uk/publications/publications/keepingresearchdatasafe.aspx">the report</a>. It produced a 3-part activity model which supports a cost framework (pre-archive, archive, support services.) There are key cost variables and a resource template, and it separates economic adjustments from service adjustments. He contrasts repository costs for publications with costs for data repositories, and cites data from elsewhere about the much bigger cost of repairing metadata later on as opposed to doing it right at creation. It looks at efficiency curve effects, economies of scale and problems with first mover costs. He mentions the ADS 20-year rule (that all-time costs of preservation are essentially accounted for in the first 20 years), but points out there are a number of assumptions behind that. He points out the usefulness of the NSB/NSF distinction between research, community and reference collections. The study is new in using FEC (full economic costs), a UK HE funding model which comes from TRAC &#8211; the transparent approach to costing (not trustworthy archive certification!)</p>
<p>The study is not just about DIY; it can account for partial or full outsourcing. OSI study shows that 1.4% or 1.5% of research funding goes on data preservation and access.</p>
<p>Brian Lavoie talks about economic sustainability, from the <a href="http://brtf.sdsc.edu/">Blue Ribbon Task Force</a>. Resources aren&#8217;t just &#8216;available&#8217;: meaningful engagement is necessary. They need to be comprehensive (or at least a critical mass), actionable and sustainable (hence persistent.) Sustainability is economic, technical and social. The task force is supported by NSF, Mellon, LoC, JISC, CLIR, NARA. Its mission is to frame digital preservation as a sustainable economic activity. Need to articulate benefits and incentives for decision makers (parallels with PoWR and digital preservation policy study.) First gives a willingness to pay, second a willingness to provide.</p>
<p>Need selection and efficiency, and reliable predictability. Then need to choose organisational form and governance. Org may be no interest (3rd party provider), private interest (university library/archive), statutory/mandate interest (national library/archive). Issues: separating costs of access now vs access in the future. Monetizing public good. He mentions &#8220;spend now for future value&#8221; which appears to resonate with the DELOS/NSF <a href="http://eprints.erpanet.org/48/">&#8220;Invest to Save&#8221;</a> message (in which I must declare an interest.) First report due this month.</p>
<p>At this point my laptop battery gave up the ghost. The final presentation from John Wilbanks was a real highlight in many ways, but it will take until tomorrow for me to transcribe my handwritten notes.</p>
<p>Day 1 ended with a conference dinner in the splendid setting of Edinburgh Castle. The harpist was a particularly fine touch.</p>
]]></content:encoded>
			<wfw:commentRss>http://dablog.ulcc.ac.uk/2008/12/03/the-4th-international-digital-curation-conference-idcc4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
