The second day of the Digital Preservation Training Programme (DPTP) was homework day. Delegates had prepared short presentations about a wide variety of digital preservation (DP) tools. With each delegate assigned a tool on day one, we were able to cover 23 tools. Day two also saw us continuing to explore the OAIS model and looking at rights management – the legal part of DP, which often requires managing all kinds of rights relating to both the producers and the users of content.
Fortunately, no dogs ate any homework, and we had a lively session learning about DP tools. Tools are not just technical systems or applications. At their most basic level, they are anything that helps with the task of digital preservation: models like OAIS as well as more obvious applications that handle checksums and other DP tasks. There is even a board game, to help ease the pain of meticulous DP planning! The session started with ‘pin the tool on the OAIS model’, where delegates used post-it notes to show exactly where the tools they had researched would best fit. It was great to hear the feedback from individual delegates who had assessed their tools in great detail. Some useful guidelines grew out of the session which could be applied to any DP tool you might consider using.
Our collective top five guiding questions were:
- How much does the tool cost?
- If there is a cost, is this a one-off payment or does it involve regular renewal payments?
- Is the tool active and in use, with an active user community around it? (particularly important for free tools)
- Would the tool save time and/or money?
- Would this tool be compatible with existing systems being used?
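Checksums, mentioned above as one of the more obvious DP tasks, are worth making concrete: fixity checking boils down to computing a digest of a file and comparing it against a stored value. A minimal sketch in Python (the file path is hypothetical, and real tools add logging, scheduling and reporting on top of this core):

```python
import hashlib

def sha256_checksum(path, chunk_size=8192):
    """Compute the SHA-256 checksum of a file, reading in chunks
    so that large files don't have to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_fixity(path, expected_checksum):
    """Compare a file's current checksum against the stored value;
    a mismatch signals corruption or undocumented change."""
    return sha256_checksum(path) == expected_checksum
```

In practice the expected checksum is recorded at ingest and the comparison is re-run periodically; any mismatch flags the file for investigation.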
Although I’ve been preparing for my first Digital Preservation Training Programme (DPTP) for quite a while, it’s only after taking part in the first full day that the depth and breadth of the material covered really hit me. As a new trainer on the course, I am teaching only two of the many modules. This time around, most of the work is being done by my colleague Ed Pinsent of Academic & Research Technologies, ULCC, with help from Sharon McMeekin of the Digital Preservation Coalition on Day Three. The course covers pretty much everything you need to know to start or develop Digital Preservation (DP), from tools to models by way of standards, file formats, metadata and much more. Check the course outline for details of the individual modules we teach and an idea of what is covered. The course is very rooted in the practical, with lots of hands-on exercises and, despite a packed schedule, lots of time factored in for questions and discussions.
The mix of people on the course is one of its great strengths, with delegates encouraged to share their own experiences and current projects with each other. For this course, we have a really interesting group: archivists from a traditional background who are moving into the digital area, and people with a library and/or digital library background who are taking on digital preservation as part of their current role. The theme of expanding roles continued among the delegates, with a number of records managers and people working in research data management who now find themselves needing to include digital preservation for both digitised and born-digital content in their day-to-day work.
This year I collaborated with Chris Fryer of Northumberland Estates on a project under the auspices of Jisc’s SPRUCE funding. It ended up as a case study: an assessment of available digital preservation solutions. The main aim was to build outputs that would have value to smaller organisations that intend to implement digital preservation on a limited budget; Chris in particular wanted something aligned very closely with his own business case and local practices.
We believe that the methodology we used on this project, if not the actual deliverables, will have some reuse value for other small organisations. There are four useful outputs in our toolkit:
- A requirements shopping list – a specification of what the chosen system would have to do
- An assessment form – the same shopping list, expressed as a scored checklist to assess a system
- Example(s) of assessments of real-world solutions
- A very simple self-assessment form for scoring organisational preparedness for digital preservation, based on ISO 16363.
The Requirements Deliverable is essentially a “shopping list” of what the chosen system has to do to perform digital preservation. It was built from a combination of:
1. The OAIS standard (somewhat selectively)
2. US National Library of Medicine 2007 specification
3. Suggestions sent by Jen Mitcham (Digital Archivist at the University of York), QA supplier to the project
We wanted to keep the specification concise, manageable and realistic so that it would meet the immediate business needs of Northumberland Estates, while also adhering to best practice. The project team agreed that it was not necessary to adhere to every last detail of OAIS compliance. This approach might horrify purists, but it worked in this context.
The Assessment Form deliverable is a recasting of the requirements document into a form that can be used to assess a preservation solution. We added a simple scoring range and a weighted scoring methodology that gives extra weight to the “essential” requirements.
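The mechanics of a weighted checklist like this are simple to sketch. A toy illustration in Python, where the requirement names, weights and scores are invented for the example (our actual form and weightings differ):

```python
# Each requirement gets a raw score (e.g. 0-3 for how well the
# solution meets it) and a weight; "essential" requirements carry
# a higher weight than "desirable" ones, so they count for more.
requirements = [
    {"name": "Generates and verifies checksums", "weight": 3, "score": 3},  # essential
    {"name": "Records preservation metadata",    "weight": 3, "score": 2},  # essential
    {"name": "Reporting dashboard",              "weight": 1, "score": 1},  # desirable
]

def weighted_total(reqs):
    """Sum of score x weight across all requirements."""
    return sum(r["score"] * r["weight"] for r in reqs)

def max_possible(reqs, top_score=3):
    """Best achievable total, useful for expressing the result
    as a proportion when comparing candidate systems."""
    return sum(top_score * r["weight"] for r in reqs)

print(f"{weighted_total(requirements)}/{max_possible(requirements)}")  # prints 16/21
```

Comparing the weighted totals (or the proportion of the maximum achieved) across candidate systems makes the trade-offs between them easier to see at a glance.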
From the KCL/AHRC Language of Access project blog
Participants in the LOA project might be interested to listen back to today’s instalment of The Life Scientific on BBC Radio 4. It featured Professor Dame Wendy Hall, one of the pioneers of the World Wide Web and hypermedia, talking about a career spent at the forefront of web and digital media developments.
In particular Dame Wendy explains the significance of the growing “Web of Data” which increasingly exists in parallel with the more familiar “Web of Documents”. By using techniques such as linked open data, we can enable web searches and applications that make intelligent and useful links between sets of data. Mobile phone applications, for example, use the web to relate a person’s exact location (from GPS) to any number of datasets, including street maps, public transport timetables and historical events.
No matter how small or specialised a dataset may be – even a ‘simple’ bibliography – taking care to represent it as data, using increasingly accessible tools and techniques, ensures that it can contribute to the “Web of Data” and participate in who-knows-what future connections and queries. (It’s also worth mentioning that the data generated from social networking activities is already a rich part of the Web of Data.)
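To make the “Web of Data” idea concrete: at its core, linked data expresses statements as subject–predicate–object triples, which lets separate datasets connect through shared identifiers. A toy sketch in Python, with entirely invented identifiers and example data (real applications would use full URIs and an RDF library):

```python
# A 'simple' bibliography entry expressed as subject-predicate-object
# triples. The prefixed names (ex:, dc:, foaf:) stand in for full URIs.
triples = [
    ("ex:book1",   "dc:title",   "Example Bibliography Entry"),
    ("ex:book1",   "dc:creator", "ex:person1"),
    ("ex:person1", "foaf:name",  "Jane Author"),
]

def objects(subject, predicate):
    """Return all objects matching a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]

# Follow a link between statements: book -> creator -> name.
author = objects("ex:book1", "dc:creator")[0]
print(objects(author, "foaf:name"))  # prints ['Jane Author']
```

The point of the shared identifier (`ex:person1`) is that a completely different dataset describing the same person can be queried in the same hop, which is exactly the kind of cross-dataset connection the Web of Data enables.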
Recently, I delivered a one-day training course on digitisation to Digital Humanities postgraduates in Salford. Elinor Taylor of Salford University won an AHRC grant for a Research Skills Enrichment project called Issues in the Digital Humanities: A Key Skills Package for Postgraduate Researchers, and one of the strands was about improving digitisation skills; more specifically, how best to manage a digitisation project.
Elinor was unable to find anyone who could deliver the course they wanted, and commissioned ULCC to create a bespoke one. They approached us through our Digital Preservation Training Programme, which recently won an award for training and communication. Elinor at first thought a workshop or hands-on event might be best, where a digitisation workflow could be aligned with a real-world case: processing papers from the Working Class Movement Library, which they were scanning. In the end we agreed that an overview of management principles would be better. I was asked not to dwell on scanners and cameras, since the audience for the course would mostly be outsourcing their origination work to commercial providers. Audio-visual conversion was also out of scope.
A conference is a conference, but one can’t help being inspired to think a little bigger as a guest of an organisation that has engineered a 27km vacuum, has consistently made mind-boggling breakthroughs in particle physics over the past 30 years… oh, and invented the world wide web as a side project.
Tubes and stuff…
I’m by no means clever enough to understand most of what goes on at CERN – despite a very personable physicist doing his best to explain what matter is during a short bus journey – but the CERN workshop on innovations in scholarly communication (aka OAI8) covered familiar territory.
The conference, which ran 19th–21st June, was held at, and in partnership with, the University of Geneva. The plenaries covered “technical” issues, metrics, data and document semantics, and research data – plus sessions on arts, humanities and social sciences, and Gold OA infrastructure. In a nutshell? Keeping track of our stuff, this side of the Gutenberg Parenthesis.