We delivered our first DPTP workshop in London on 28 June 2010, on the subject of archiving websites. I delivered most of the training myself, working from my experience with archiving JISC project websites, writing the PoWR Handbook, and my sense of how the work should fit into a traditional archiving continuum. Accordingly, I tried to structure the day to reflect a start-to-finish approach to the job, as follows:
- Consider your organisational requirements, your drivers, and the level at which you want to archive, and create a selection policy that matches these. Consider the legal framework.
- Understand the technology that copies websites (harvesters) and how websites themselves behave. I talked about aspects of the dynamic web that sometimes trip up a harvester – CMS, wiki, databases.
- Consider how (and indeed whether) you want to offer access to the collection, and whether metadata is needed.
- Build a programme for web archiving, adapting existing methodologies as needed – e.g. Institutional vs Individual. What other services exist, and can they do it for you?
In the middle, we had an excellent case study from Dave Thompson at the Wellcome Trust, whose experiences strongly reflected many of the themes of the course. Like many organisations, they don’t have one single reason for collecting web archives, and the future value of these collections is something we can’t yet see (because they are still so close to the live web).
We were all impressed by the people attending the course, who came from a variety of backgrounds and projects, with widely different expectations of how they would be managing their web content. National libraries and business archives were represented, but also the arts; the Tate Gallery are doing interesting work in time-based media and specialist works of art that manifest themselves over the web. How do you capture that content, and make it perform in the future?
The DPTP recognises the value of participation and sharing experiences, which we can all learn from. When I was holding forth on the concept of three possible points of capture for web content, I was very pleased to hear a proposal for a fourth possible method from our Swiss delegate Daniel Spichty. There were also numerous questions about exactly what it is that Content Management Systems do, which suggested to me I need to learn more about the inner workings and preservation implications of such systems.
I’m also pleased we were able to offer a printed copy of the PoWR Handbook to all those attending – in advance of the official launch of the book, which will take place at IWMW 2010 on 12 July.

