Last week we launched our digital preservation advent tweets, a series of 24 tweets throughout the run-up to Christmas. If you missed any, you can catch up via our Storify of the first 7 days. We hope you’re enjoying this small celebration of all things digital preservation, and that you will share the links on Twitter if you find something useful. Thanks to all the people and organisations who have inspired our tweets so far, and here’s to the next 17 days!

Image from the British Library on Flickr

Photo by James Jordan https://www.flickr.com/photos/jamesjordan/

This one-day event on 31 October 2014 was organised by the DPC. The day concluded with a roundtable discussion, featuring a panel of the speakers and taking questions from the floor. The level of engagement from delegates throughout the event was clearly shown in the interesting questions posed to the panel, the thoughtful responses and the buzz of general discussion in this session. Among many interesting topics covered, three stand out as typical of the breadth of knowledge and interest shown at the event.

First, a fundamental question about the explosion of digital content and how it will impact on our work. How can we keep all of this stuff, where will we put it, and how much will it really cost? Sarah Middleton urged us to attend the upcoming 4C Conference in London to hear discussion of cutting-edge ideas about large-scale storage approaches. Catherine Hardman reminded us of one of the most obvious archival skills, which we sometimes tend to forget: selection. We do not have to keep “everything”, and a well-formulated selection policy continues to be an effective way to target the preservation of the most meaningful digital resources.

Next, a question on copyright and IPR as it applies to archives and archivists, and hence to digital preservation, quickly spun out into the audience and back to different panel members in a lively discussion. The general inability of the current legislation, formed in a world of print, to deal with the digital reality of today was quickly identified as an obstacle both to those engaged in digital preservation and to users seeking access to digital resources.

The Hargreaves report was mentioned (by Ed Pinsent of ULCC) and given an approving nod for the sensible approach it took to bringing legislation into the 21st century. However, the slow speed with which any change has actually been implemented was of concern for all, and was felt to be damaging to the need to preserve material. The issues around copyright and IPR were knowledgeably discussed from a wide variety of perspectives, including the cultural heritage sector, specialist collections, archaeological data and resources and, equally important among delegates, the inability to fully open up collections to users in order to comply with the law as it stands.

Some hope was found, though, in the recent (and ongoing) Free Our History campaign. Using the national and international awareness of various exhibitions, broadcasts and events to mark the anniversary of the First World War, the campaign has focussed on the WW1 content that museums, libraries and archives are unable to display because of current copyright law. Led by the National Library of Scotland, memory institutions and many other cultural heritage institutions have joined the CILIP campaign by prominently exhibiting a blank piece of paper. The blank page represents the many items which cannot be publicly displayed. The visual impact of such displays has caught attention, and the accompanying petition is currently being addressed by the UK government.

The third issue raised during this session was the suggestion for more community activity, for example more networking and exchange of experience opportunities. Given the high rate of networking during lunchtime and breaks, not to mention the lively discussions and questions, this was greeted with enthusiasm. Kurt Helfrich from RIBA explained his idea for an informal group to organise site visits and exchange of experience sessions among themselves, perhaps based in London to start off with. Judging by the level of interest among delegates to share their own work and learn from others during this day, this would be really useful to many. Leaving the event with positive plans for practical action felt a very fitting way to end an event around making progress in digital preservation.

The above authored mostly by Steph Taylor, ULCC

Download the slides from this event

Photo by Ard Hesselink https://www.flickr.com/photos/docman/

This one-day event on 31 October 2014 was organised by the DPC and hosted at the futuristic, spacious offices of HSBC, where the presentation facilities and the catering were excellent. All those attending were given plenty of mental exercises by William Kilbride. He said he wanted to build on his “Getting Started in Digital Preservation” events and help everyone move further along the path towards a steady state, where digital preservation starts to become “business as usual”. The very first exercise he proposed was a brief sharing-discussion exercise where people shared things they have tried, and what worked and didn’t work.

Kurt Helfrich from the RIBA Library said his organisation had a large number of staff administering a historic archive; various databases, created at different times for different needs, would be better if connected. He was keen to collaborate with other RIBA teams and link “silos” in his agency.

Lindsay Ould from King’s College London said “starting small worked for us”. They’ve built a standalone virtual machine, using locally-owned kit, and are using it for “manual” preservation; when they’ve got the process right, they could automate it and bring in network help from IT.

When asked about “barriers to success”, over a dozen hands in the room went up. Common themes: getting the momentum to get preservation going in the first place; extracting a long-term commitment from Executives who lose interest when they see it’s not going to be finished in 12 months. There’s a need to do advocacy regularly, not just once; and a need to convince depositors to co-operate. IT departments, especially in the commercial sector, are slow to see the point of digital preservation if its “business purpose” – a euphemism for “income stream”, I would say – is not immediately apparent. Steph Taylor of ULCC pointed out that many case studies and tools in our profession are mostly geared to the needs of large memory institutions, not the dozens of county archives and small organisations who were in the room.

Ed Pinsent (i.e. me) delivered a talk on conducting a preservation assessment survey, paying particular attention to the Digital Preservation Capability Maturity Model and other tools and standards. If done properly, this could tell you useful things about your capability to support digital preservation; you could even use the evidence from the survey to build a business case for investment or funding. The tricky thing is choosing the model that’s right for you; there are about a dozen available, with varying degrees of credibility as to their fundamental basis.

Catherine Hardman from the Archaeological Data Service (ADS) is one who is very much aware of “income streams”, since the profession of archaeology has become commercialised and somewhat profit-driven. She now has to engage with many depositors as paying customers. To that end, she’s devised a superb interface called ADS Easy that allows them to upload their own deposits, and add suitable metadata through a series of web forms. This process also incorporates a costing calculator, so that the real costs of archiving (based on file size) can be estimated; it even acts as a billing system, creating and sending out invoices. Putting this much onus on depositors is, in fact, a proven effective way of engaging with your users. In the same vein, ADS have published good practice guidance on things to consider when using CAD files, and advice on metadata to add to a Submission Package. Does she ever receive non-preferred formats in a transfer? Yes, and their response is to send them back – the ADS has had interesting experiences with “experimental” archaeologists in the field. Kurt Helfrich opened up the discussion here, speaking of the lengthy process before deposit that is sometimes needed; he memorably described it as a “pre-custodial intervention”. Later in the day, William Kilbride picked up this theme: maybe “starting early”, while good practice, is not ambitious enough. Maybe we have to begin our curation activities before the digital object is even created!

Catherine also perceived an interesting shift in user expectations; they want more from digital content, and leaps in technology make them impatient for speedy delivery. As part of meeting this need, ADS have embraced the OAI-PMH protocol, which enables them to reuse their collections metadata and enhance their services to multiple external stakeholders.
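For readers unfamiliar with OAI-PMH, a harvest is simply an HTTP request carrying a "verb" and a few arguments. The sketch below only builds such a request URL; it does not contact any server. The base URL and the helper name `list_records_url` are hypothetical and are not ADS's actual endpoint or code. `ListRecords`, `metadataPrefix`, `set` and `from` are standard OAI-PMH names, and `oai_dc` (simple Dublin Core) is the metadata format every repository is required to support.

```python
from urllib.parse import urlencode

def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    # verb and metadataPrefix are required for a ListRecords request
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    # Optional selective-harvesting arguments
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)

# Hypothetical repository address, used for illustration only
url = list_records_url("https://archive.example.org/oai", from_date="2014-01-01")
print(url)
```

A harvester would fetch this URL and receive an XML response containing the catalogue records, which is what lets the same collections metadata be reused by several external services.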

There is no doubt that having a proper preservation policy in place would go some way to helping address issues like this. When Kirsty Lee from the University of Edinburgh asked how many of us already had a signed-off policy document, the response level was not high. She then shared with us the methodology that she’s using to build a policy at Edinburgh, and it’s a thought-through meticulous process indeed. Her flowcharts show her constructing a complex “matrix” of separate policy elements, all drawn from a number of reports and sources, which tend to say similar things but in different ways; her triumph has been to distil this array of information and, equally importantly, arrange the elements in a meaningful order.

Kirsty is upbeat and optimistic about the value of a preservation policy. It can be a statement of intent; a mandate for the archive to support digital records and archives. It provides authority and can be leverage for a business case; it helps get senior management buy-in. To help us understand, she gave us an excellent handout which listed some two dozen elements; the exercise was to pick only the ones that suit our organisation, and to put them in order of priority. The tough part was coming up with a “single sentence that defines the purpose of your policy” – I think we all got stumped by this!

Download the slides from this event

Report continues in part two

In September this year Dave Thompson of the Wellcome Library asked a question on Twitter, one which is highly relevant to digital preservation practice and learning skills. Addressing digital archivists and librarians, he asked: “Do we need to be able to do all ourselves, or know how to ask for what is required?”

My answer is “we need to do both”…and I would add a third thing to Dave’s list. We also need to understand enough of what is happening when we get what we ask for, whether it’s a system, tool, application, storage interface, or whatever.

Personally, I’ve got several interests here. I’m a traditional archivist (got my diploma in 1992 or thereabouts) with a strong interest in digital preservation, since about 2004. I’m also a tutor on the Digital Preservation Training Programme.

As an archivist wedded to paper and analogue methods, for some years I was fiercely proud of my lack of IT knowledge. Whenever forced to use IT, I found I was always happier when I could open an application, see it working on the screen, and experiment with it until it does what I want it to do. On this basis, for example, I loved playing around with the File Information Tool Set (FITS).

When I first managed to get some output from FITS, it was like I was seeing the inside of a file format for the first time. I could see tags and values of a TIFF file, some of which I was able to recognise as those elusive “significant properties” you hear so much about. So this is what they look like! From my limited understanding of XML – which is what FITS outputs into – I knew that XML was structured and could be stored in a database. That meant I’d be able to store those significant properties as fields in a database, and interrogate them. This would give me the intellectual control that I used to relish with my old card catalogues in the late 1980s. I could see from this how it would be possible to have “domain” over a digital object.
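To make that idea concrete, here is a minimal sketch of reading properties out of XML and storing them as rows in a database that can then be queried. The XML below is a simplified, made-up stand-in and is not real FITS output, which has its own schema and namespaces. The table layout and the function names are likewise assumptions chosen for illustration.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical, simplified XML; real FITS output is structured differently.
SAMPLE_XML = """<file name="scan001.tif">
  <property name="imageWidth">2480</property>
  <property name="imageHeight">3508</property>
  <property name="bitsPerSample">8</property>
  <property name="compression">Uncompressed</property>
</file>"""

def load_properties(xml_text, conn):
    """Parse the XML and store each property as one database row."""
    root = ET.fromstring(xml_text)
    filename = root.get("name")
    rows = [(filename, p.get("name"), p.text) for p in root.findall("property")]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS properties (filename TEXT, name TEXT, value TEXT)"
    )
    conn.executemany("INSERT INTO properties VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

def lookup(conn, filename, name):
    """Return the stored value of one property, or None if it is absent."""
    row = conn.execute(
        "SELECT value FROM properties WHERE filename = ? AND name = ?",
        (filename, name),
    ).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
count = load_properties(SAMPLE_XML, conn)
print(count, lookup(conn, "scan001.tif", "imageWidth"))
```

Once the properties sit in a table like this, "interrogating" them is an ordinary query, much like looking up a field on a catalogue card.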

There’s a huge gap, I know, between me messing around on my desktop and the full functionality of a preservation system like Preservica. But with exercises like the above, I feel closer to the goal of being able to “ask for what is required”, and more to the point, I could interpret the outputs of this functionality to some degree. I certainly couldn’t do everything myself, but I want to feel that I know enough about what’s happening in those multiple “black boxes” to give me the confidence I need as an archivist that my resources are being preserved correctly.

With my DPTP tutor hat on, I would like to think it’s possible to equip archivists, librarians and data managers with the same degree of confidence; teaching them “just enough” of what is happening in these complex processes, at the same time translating machine code into concrete metaphors that an information professional can grasp and understand. In short, I believe these things are knowable, and archivists should know them. Of course it’s important that the next step is to open a meaningful discussion with the developer, data centre manager, or database engineer (i.e. “ask for what is required”), but it’s also important to keep that dialogue open, to go on asking, to continue understanding what these tools and systems are doing. There is a school of thought that progress in digital preservation can only be made when information professionals and IT experts collaborate more closely, and I would align myself with that.

It’s Open Access Week ( hashtag – #OAWeek2014 ) and around the world everyone is talking about the importance of sharing, of re-use and of people having free access to content. Although it started as a movement focused on scholarly publications, Open Access as a concept has made big waves. The move from paper to online has made the possibility of much greater openness attainable. Since the first Open Access Week took place in 2009, the movement has developed to promote the benefits of sharing in academia far beyond scholarly publications, to include research data and teaching and learning resources.

So what role, in all this excitement of sharing and re-use and collaboration, does digital preservation play? A very central one, we would say. Peter Suber’s definition is a good place to start -

“Open-access (OA) literature is digital, online, free of charge, and free of most copyright and licensing restrictions

OA removes price barriers (subscriptions, licensing fees, pay-per-view fees) and permission barriers (most copyright and licensing restrictions)”

- but to keep something digital and online, that something needs to be part of a well-managed digital preservation programme. Putting it out there is only half of the job. Deciding what content stays available, and for how long, and how digital content will continue to be accessible over time is fundamental to the ongoing success of the OA movement. Without digital preservation taking place, content can become inaccessible as file formats change, as hardware needed to view the content becomes obsolete, or for any number of other reasons that can damage content over time. So, digital preservation has a role in keeping OA content in an open and accessible state after the initial publication.

Digital preservation also has an important role to play before content is published in an OA way. Content is created, and that content needs to be preserved so that it can become open and accessible. If a researcher, for example, has created research data as part of a research project, then written a research paper based upon that data, intending to share their entire research output under Open Access, there is usually a period of time before both are ‘live’ and publicly published. Making sure that all research outputs are managed well from a digital preservation perspective is crucial. Without digital preservation taking place, digital objects can and do become inaccessible. To be able to open up content as Open Access, that content needs, by definition, to be accessible. A desire to share will not overcome such issues as bit rot, file corruption, content that can now only be viewed with unavailable software, or any of the many other ways that digital objects can become inaccessible and/or degenerate over time.
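One common safeguard against bit rot and silent corruption is a fixity check: record a checksum when a file is deposited, then recompute it later and compare. The sketch below is only an illustration under stated assumptions. The file name, its contents and the helper names are made up, and the "corruption" is simulated by overwriting a single byte.

```python
import hashlib
import os
import tempfile

def sha256_of(path, chunk_size=65536):
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def is_intact(path, recorded_checksum):
    """True if the file still matches the checksum recorded earlier."""
    return sha256_of(path) == recorded_checksum

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "dataset.csv")  # hypothetical research data file
    with open(path, "wb") as f:
        f.write(b"id,value\n1,42\n")

    recorded = sha256_of(path)            # checksum stored at deposit time
    ok_before = is_intact(path, recorded)

    with open(path, "r+b") as f:          # simulate corruption of one byte
        f.seek(3)
        f.write(b"X")

    ok_after = is_intact(path, recorded)

print(ok_before, ok_after)
```

A check like this only detects damage. Repairing it still depends on having another good copy to restore from, which is one reason preservation needs to be in place before content is opened up.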

The theme of OA Week for 2014 is Generation Open. So this seems like the perfect year to raise awareness of digital preservation and how it supports and underpins the aspirations of the Open Access movement. If you’d like to know more about digital preservation, there are some useful resources online. We’ve compiled a short list of some key resources, below, which you might find useful.

This blog is a good place to start, and we also run training courses in digital preservation, catering for the beginner with our ‘Introduction to Digital Preservation’ course and to the more experienced practitioner with our ’Practice of Digital Preservation’ course, running in November and December 2014 respectively.

The Digital Preservation Coalition (DPC) is a membership organisation that supports digital preservation. Their site offers a wealth of information on all things digital preservation, including Tech Watch Reports, news, training and even jobs (if you get carried away!), and is a great starting point. UK-based, they have members from all over the world.

The Open Preservation Foundation (OPF) is another international organisation. They support an open community around digital preservation and have useful information on tools, training, software and community events. Most useful when you have some basic knowledge of the subject.

The SPRUCE Project was a collaboration between the University of Leeds, the British Library, the Digital Preservation Coalition, the London School of Economics, and the Open Preservation Foundation, co-funded by Jisc. The aim was to bring together a community to support digital preservation in the UK. Although the project ended in November 2013, a live wiki brings together the top project outputs (all open, of course), including a Digital Preservation Business Case Toolkit and a community-owned Digital Preservation Tool Registry.

The Digital Curation Centre (DCC) is a centre of expertise in the curation of digital information. This is the go-to place for all your research data preservation needs, with useful case studies, how-to guides and training courses in this area.

For some tips and information on how the ‘big guys’ manage digital preservation, check out the British Library’s digital preservation strategy, which includes some useful links as well as the strategy itself, and ‘Preserving Digital Collections’ from The National Archives has lots of good information on digital preservation, including FAQs.

Enjoy Open Access Week 2014, and remember that sharing starts and ends with good digital preservation!

Watching Melissa Terras’s inaugural lecture on a Decade of Digital Humanities at UCL this evening made me think about the first time I engaged with academic endeavours in the field. Courtesy of the Web Archive, here is a report I wrote after attending my first academic (or para-academic) conference, Digital Resources in the Humanities, at Glasgow University, September 1998. It was originally published online in the National Digital Archive of Datasets (NDAD) Newsletter No. 4, November 1998.

On September 9th I travelled with Ruth Vyse, the University Archivist, and John Ralph, ULCC’s Computing Services manager, to Glasgow to attend DRH98, the third annual conference on Digital Resources in the Humanities. The conference was hosted by the Humanities Advanced Technology and Information Institute at Glasgow University, and ran from Thursday 10th to Saturday 12th September. The conference focused on the use of digital technology to preserve our cultural heritage, and as such featured a wide variety of presentations about work going on in, and on behalf of, schools and colleges, museums and libraries, publishers and research organisations, mainly in the fields of the Arts and Social Sciences.

We were particularly interested to learn about developments in cataloguing data collections and providing access to computerized catalogues, and to hear what approaches and standards were being used in other large data storage systems.

A number of presentations were given by the Arts and Humanities Data Service (AHDS), including a reception to launch their new web site on the Thursday evening. Of particular interest were the presentation of the recent AHDS report, Creating and Preserving Digital Collections, and presentations on the work of the History Data Service and the UK Data Archive at Essex University.

Also of interest was a TV film, Into the Future: On the Presentation of Knowledge in the Electronic Age, made for the US Public Broadcasting Service by Terry Saunders. It succinctly presented many important issues surrounding the preservation of digital data (but, perhaps inevitably, it was less forthcoming with answers to the problems). In one example, the film showed how the condition of magnetic tapes containing data from NASA’s Viking Mars Lander missions of the 70s and 80s had deteriorated to the point where many were unreadable. In the following discussion, Neal Beagrie from AHDS emphasised that the fragility of computer media and the speed of technological change made early intervention essential for the preservation of digital records. Our work with the PRO and government departments has made the NDAD project team all too well aware of this issue.

It was encouraging to note that a number of well-supported standards and effective techniques are emerging for digital archives: in some cases this means that multiple catalogues on diverse systems can be searched with a single query. Most presentations concerned systems that were accessible, completely or in part, via the World Wide Web, indicating that the Web has quickly become a preferred medium of access to such resources. An ever growing array of digital resources, including databases, text, images, audio and video, is readily accessible by users at every level, from school-children to statisticians: the challenge for designers of such systems is to provide access tools and methods appropriate to their target audience.

Although NDAD did not make a presentation at DRH98, reference was made to other work that NDAD staff have been directly involved in, including Project Earl (networking UK public libraries) and the British Library’s Electronic Beowulf, which Charles Henry of Rice University spoke warmly of in his capstone lecture The Fire In Grendel’s Eye. We hope to make a presentation on aspects of the NDAD system at next year’s conference, DRH99, which will be hosted by King’s College London.

The conference organisation was superb, and delegates were impressed with the facilities of the Gilmorehill Centre and the ancient University of Glasgow. The Welcome Reception took place in the University’s Hunterian Museum, amongst impressive relics of Scotland’s past, including Roman milestones and the death mask of Bonnie Prince Charlie. On the final night, a civic reception by the Lord Provost of Glasgow was followed by a meal of traditional Scottish fare (Scotch broth, haggis and salmon) and a ceilidh, all in the magnificent surroundings of the city’s Kelvingrove Museum. In our few spare moments we also took the opportunity to visit the Hunterian Art Gallery, with its reconstruction of Charles Rennie Mackintosh’s house and large collection of Whistler paintings, and enjoyed the chance to travel on the “clockwork orange”, Glasgow’s underground railway. In all aspects of the Conference, the Glasgow organising committee set a very high standard: King’s College undoubtedly has a hard act to follow.