In September this year Dave Thompson of the Wellcome Library asked a question on Twitter, one which is highly relevant to digital preservation practice and to learning skills. Addressing digital archivists and librarians, he asked: “Do we need to be able to do all ourselves, or know how to ask for what is required?”

My answer is “we need to do both”…and I would add a third thing to Dave’s list. We also need to understand enough of what is happening when we get what we ask for, whether it’s a system, tool, application, storage interface, or whatever.

Personally, I’ve got several interests here. I’m a traditional archivist (got my diploma in 1992 or thereabouts) with a strong interest in digital preservation, since about 2004. I’m also a tutor on the Digital Preservation Training Programme.

As an archivist wedded to paper and analogue methods, for some years I was fiercely proud of my lack of IT knowledge. Whenever forced to use IT, I found I was always happier when I could open an application, see it working on the screen, and experiment with it until it did what I wanted it to do. On this basis, for example, I loved playing around with the File Information Tool Set (FITS).

When I first managed to get some output from FITS, it was like I was seeing the inside of a file format for the first time. I could see tags and values of a TIFF file, some of which I was able to recognise as those elusive “significant properties” you hear so much about. So this is what they look like! From my limited understanding of XML – which is what FITS outputs into – I knew that XML was structured and could be stored in a database. That meant I’d be able to store those significant properties as fields in a database, and interrogate them. This would give me the intellectual control that I used to relish with my old card catalogues in the late 1980s. I could see from this how it would be possible to have “domain” over a digital object.
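To make that idea a little more concrete, here is a minimal sketch in Python of pulling tag and value pairs out of a piece of XML and storing them as rows in a database that can then be queried. The XML below is a made-up, simplified stand-in and not real FITS output (FITS has its own schema and element names), and the table layout, file name and function names are assumptions chosen purely for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Simplified, invented stand-in for characterisation output. Real FITS XML
# uses its own schema and namespace, so these element names are assumptions.
SAMPLE_XML = """
<report file="scan_0001.tif">
  <property name="format">TIFF</property>
  <property name="imageWidth">2480</property>
  <property name="imageHeight">3508</property>
  <property name="bitsPerSample">8</property>
  <property name="compression">Uncompressed</property>
</report>
"""


def load_properties(xml_text, conn):
    """Parse the XML and store each tag/value pair as one database row."""
    root = ET.fromstring(xml_text)
    filename = root.get("file")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS properties (file TEXT, name TEXT, value TEXT)"
    )
    rows = [(filename, p.get("name"), p.text) for p in root.findall("property")]
    conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def lookup(conn, filename, name):
    """Return the stored value for one property of one file, or None."""
    row = conn.execute(
        "SELECT value FROM properties WHERE file = ? AND name = ?",
        (filename, name),
    ).fetchone()
    return row[0] if row else None


# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
count = load_properties(SAMPLE_XML, conn)
print(count, "properties stored")
print("Width:", lookup(conn, "scan_0001.tif", "imageWidth"))
```

Once the properties are rows in a table, questions like “which of my files are uncompressed?” become ordinary database queries, which is much the same kind of control a card catalogue used to offer.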

There’s a huge gap, I know, between me messing around on my desktop and the full functionality of a preservation system like Preservica. But with exercises like the above, I feel closer to the goal of being able to “ask for what is required”, and more to the point, I could interpret the outputs of this functionality to some degree. I certainly couldn’t do everything myself, but I want to feel that I know enough about what’s happening in those multiple “black boxes” to give me the confidence I need as an archivist that my resources are being preserved correctly.

With my DPTP tutor hat on, I would like to think it’s possible to equip archivists, librarians and data managers with the same degree of confidence; teaching them “just enough” of what is happening in these complex processes, at the same time translating machine code into concrete metaphors that an information professional can grasp and understand. In short, I believe these things are knowable, and archivists should know them. Of course it’s important that the next step is to open a meaningful discussion with the developer, data centre manager, or database engineer (i.e. “ask for what is required”), but it’s also important to keep that dialogue open, to go on asking, to continue understanding what these tools and systems are doing. There is a school of thought that progress in digital preservation can only be made when information professionals and IT experts collaborate more closely, and I would align myself with that.


It’s Open Access Week (hashtag #OAWeek2014) and around the world everyone is talking about the importance of sharing, of re-use and of people having free access to content. Although it started as a movement focused on scholarly publications, Open Access as a concept has made big waves. The move from paper to online has made the possibility of much greater openness attainable. Since the first Open Access Week took place in 2009, the movement has developed to promote the benefits of sharing in academia far beyond scholarly publications, to include research data and teaching and learning resources.

So what role, in all this excitement of sharing and re-use and collaboration, does digital preservation play? A very central one, we would say. Peter Suber’s definition is a good place to start -

“Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions

OA removes price barriers (subscriptions, licensing fees, pay-per-view fees) and permission barriers (most copyright and licensing restrictions)”

- but to keep something digital and online, that something needs to be part of a well-managed digital preservation programme. Putting it out there is only half of the job. Deciding what content stays available, for how long, and how digital content will continue to be accessible over time is fundamental to the ongoing success of the OA movement. Without digital preservation, content can become inaccessible as file formats change, as the hardware needed to view it becomes obsolete, or for any number of other reasons that damage content over time. So, digital preservation has a role in keeping OA content in an open and accessible state after the initial publication.

Digital preservation also has an important role to play before content is published in an OA way. Content is created, and that content needs to be preserved so that it can become open and accessible. If a researcher, for example, has created research data as part of a research project, then written a research paper based upon that data, intending to share their entire research output under Open Access, there is usually a period of time before both are ‘live’ and publicly published. Making sure that all research outputs are managed well from a digital preservation perspective is crucial. Without digital preservation taking place, digital objects can and do become inaccessible. To be able to open up content as Open Access, that content needs, by definition, to be accessible. A desire to share will not overcome such issues as bit rot, file corruption, content that can now only be viewed with software that is no longer available, or any of the many other ways that digital objects can become inaccessible or degrade over time.

The theme of OA Week for 2014 is Generation Open. So this seems like the perfect year to raise awareness of digital preservation and how it supports and underpins the aspirations of the Open Access movement. If you’d like to know more about digital preservation, there are some useful resources online. We’ve compiled a short list of some key resources, below, which you might find useful.

This blog is a good place to start, and we also run training courses in digital preservation, catering for the beginner with our ‘Introduction to Digital Preservation’ course and for the more experienced practitioner with our ‘Practice of Digital Preservation’ course, running in November and December 2014 respectively.

The Digital Preservation Coalition (DPC) is a membership organisation that supports digital preservation. Their site holds a wealth of information on all things digital preservation, including Tech Watch Reports, news, training and even jobs (if you get carried away!). This is a great starting point. Although UK-based, they have members from all over the world.

The Open Preservation Foundation (OPF) is another international organisation. They support an open community around digital preservation and have useful information on tools, training, software and community events. Most useful when you have some basic knowledge of the subject.

The SPRUCE Project was a collaboration between the University of Leeds, the British Library, the Digital Preservation Coalition, the London School of Economics, and the Open Preservation Foundation, co-funded by Jisc. The aim was to bring together a community to support digital preservation in the UK. Although the project ended in November 2013, a live wiki brings together the top project outputs (all open, of course), including a Digital Preservation Business Case Toolkit and a community-owned Digital Preservation Tool Registry.

The Digital Curation Centre (DCC) is a centre of expertise in the curation of digital information. This is the go-to place for all your research data preservation needs, with useful case studies, how-to guides and training courses in this area.

For some tips and information on how the ‘big guys’ manage digital preservation, check out the British Library’s digital preservation strategy, which includes some useful links as well as the strategy itself. ‘Preserving Digital Collections’ from The National Archives also has lots of good information on digital preservation, including FAQs.

Enjoy Open Access Week 2014, and remember that sharing starts and ends with good digital preservation!

Today saw the inaugural meeting at ULCC of what we hope will become an ongoing series, intended to complement the successful EPrints UK User Group meeting.

The pow-wow brought together developers from universities around the UK to learn about the next generation of features and functionality offered by the EPrints repository platform.

The event gave developers a chance to look “under the hood” of EPrints and better understand how to implement and deploy new features effectively at their own institutions. Developers also discussed how they can actively contribute to the platform by feeding back changes and enhancements to the EPrints GitHub repository.




Team Linter receiving their prizes. Photo by RepoFringe used under CC

Rory McNicholl, Tim Miles-Board and Steph Taylor attended the Repository Fringe conference in Edinburgh, 30-31 July. Steph gave a presentation on how to turn a repository into a digital archive. The talk used the ART team’s knowledge of both repositories and digital preservation to give UK repository managers some useful guidelines and tips on how they could start to engage with digital preservation. Tim, already an established member of the UK repository community, made the most of the excellent networking opportunities to bring back many interesting leads and contacts for the team.

Rory, meanwhile, was busy not only networking but also creating the winning entry in the Developer Challenge that ran for the two days of the conference. With a brief ‘to do something cool with repositories within the Open Access realm or even better, something aiding Open Access compliance’, the race was on among participating teams.

Rory worked with two other delegates, Paul Mucur of Altmetric and Richard Wincewicz of EDINA, as part of ‘Team Linter’ to build a tool that not only checked the completeness of repository records but then filled in the gaps. The tool first identified any missing metadata within a record and then used existing services such as SHERPA and CrossRef to suggest information to fill those gaps. It was a great time-saving tool, and allowed for a knowing human eye to check the suggestions as they were made. The information was added in a cascade style, with each new piece of information then being used to search for more in-depth information, so the record became more detailed and accurate as it progressed through the search. The demo showed how a very detailed record could be created from a journal article title alone. The code can be found on GitHub.
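For the curious, the cascade idea can be sketched in a few lines of Python. This is not Team Linter’s code: the lookup functions, the required field list and the sample data (including the DOI and ISSN) are all invented for illustration, and simple dictionaries stand in for the external services so the example runs without any network access.

```python
# Hypothetical stand-ins for external lookup services. The real tool queried
# services such as SHERPA and CrossRef; these dictionaries, field names and
# function names are invented for illustration only.
TITLE_INDEX = {
    "an example article": {
        "doi": "10.9999/example.1",
        "journal": "Journal of Examples",
    },
}
JOURNAL_INDEX = {
    "Journal of Examples": {"issn": "0000-0000", "publisher": "Example Press"},
}

# Assumed definition of a "complete" record for this sketch.
REQUIRED_FIELDS = ["title", "doi", "journal", "issn", "publisher"]


def lookup_by_title(record):
    """Suggest fields that can be found from the article title."""
    title = record.get("title")
    return TITLE_INDEX.get(title.lower(), {}) if title else {}


def lookup_by_journal(record):
    """Suggest fields that can be found once the journal name is known."""
    return JOURNAL_INDEX.get(record.get("journal"), {})


LOOKUPS = [lookup_by_title, lookup_by_journal]


def missing_fields(record):
    """List the required fields that are empty or absent."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]


def enrich(record, approve=lambda field, value: True):
    """Fill gaps in a cascade: each accepted value can unlock further lookups.

    `approve` plays the part of the human eye checking each suggestion.
    """
    record = dict(record)  # work on a copy, leave the original untouched
    changed = True
    while changed and missing_fields(record):
        changed = False
        for lookup in LOOKUPS:
            for field, value in lookup(record).items():
                if not record.get(field) and approve(field, value):
                    record[field] = value
                    changed = True
    return record


result = enrich({"title": "An Example Article"})
print(result)
```

Starting from the title alone, the first lookup supplies the journal name, which in turn lets the second lookup supply the ISSN and publisher. Passing a different `approve` function is one way a person could accept or reject each suggestion as it is made.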

The team won despite tough competition from team ‘Are We There Yettt?’ and their tool to alert repository managers when a paper that was deposited before publication is actually published. But the repository managers loved Team Linter just a little bit more, and showed their appreciation by giving them the loudest applause and cheers. It was great to see one of our developers get such public recognition for their skills and knowledge, and a great way to end a very productive conference.


Photo by Todd.vision used under CC

In June, Richard Davis and Steph Taylor attended the Open Repositories conference in Helsinki. Between them, they gave three presentations on various aspects of the work of the ART team. Richard and Steph wrote a joint presentation on the evolution of the repository landscape, illustrated with the many bespoke developments carried out by the ART team’s EPrints developers. Repositories showcased in the presentation included the Linnean Society Herbarium, the Atlantic Archive repository and the SAS Open Journals.

Richard also spoke about the ULCC partnership with Arkivum, and how the ART team developers are linking up Arkivum and EPrints to create a repository that is also a digital archive.

And finally, carrying on the digital preservation theme, Steph gave a short ‘repository rant’ presentation in which she was able to point out (in a rather firm way!) why a repository is not a digital archive. The conference provided a great opportunity to network with repository people from around the world, to learn about their work and to share what we are doing at ULCC.


On 21-22 July, we launched our new course, ‘An Introduction to Digital Preservation’. This course is part of our ongoing development of the DPTP (Digital Preservation Training Programme). Aimed at the complete beginner to digital preservation, the course focusses on the basics, setting firm foundations for participants to begin to develop digital preservation skills and strategies that suit their own organisations.

The launch was very successful, with participants coming from HEIs, archives, financial and commercial companies, national memory institutions and funding bodies. It’s always nerve-racking to run a new course for the first time. Months of hard work lie behind each course, as we research, write, edit and endlessly test the content to make sure it’s doing the job we need it to do, which is to help people learn. So it was wonderful to have such a great group of people to teach on our first ‘Intro’ course. Everyone was engaged, enthusiastic, interested and interesting, and most importantly, keen to take part in whatever we threw at them! We couldn’t have wished for a better experience, and the two days of the course flew by. Judging from the feedback forms, our own experience was clearly shared by the participants.

As you might expect, we are very keen to teach this course again! We have already had many people wanting to come on the next ‘Intro’ course, and we plan to run it again in September and November. We hope to confirm dates in the next few weeks. If you’re interested in the dates for the next courses, details of how to email in to receive updates and advance notice are available on our home page.