Writing Under Victoria:
The Whistler and Darwin Correspondence Projects

Content written on 1st March 2001 by Alastair Dunning.
The role of the Oxford Text Archive has since been taken up by AHDS Literature, Languages and Linguistics.

Introduction

Besides their chosen professions of naturalist and painter, two great figures of the Victorian era, Charles Darwin and James McNeill Whistler, were both prolific letter writers. Around 14,000 letters, either written or received by the naturalist, are known to be in existence, and the number is at least 9,000 for Whistler. Identifying, cataloguing and publishing such a mountain of literary output would be a colossal task for any team of academics, librarians and archivists, but this is the goal that two separate projects, the Darwin Correspondence Project and the Whistler Correspondence Project, are pursuing.

The Darwin Correspondence Project, based at Cambridge University, with partnerships at North American institutions, began in the 1970s with a single objective - to create printed volumes of all known Darwin letters. Within a few years of commencing, this project incorporated a secondary aim, to develop an electronic archive of the same letters. The members of the Darwin Project quickly surmised that having a digital version of the letters would enhance the principal hard-copy project on two fronts. Firstly, it would facilitate the process of editing the letters, and secondly, it would further persuade funding bodies of the wide-ranging ambitions of the project. As a long-term aim, the Darwin Project believes that the digital archive also has the potential to be developed into a resource that could be widely disseminated as a powerful research tool.

The Whistler Correspondence Project had a similar inception. The plan to catalogue Whistler's correspondence began over thirty years ago, but rapid advances in technology allowed the project to envision a digital archive of the painter's letters, which would also include some 3,000 additional letters related to Whistler. Like the Darwin Project, the Whistler Project foresaw advantages in digital creation - facilitating administration and extending access. This project was initiated by the Centre for Whistler Studies at the University of Glasgow, again with partner institutions in the USA.

Both projects are still underway: the Whistler Project intends to conclude in 2003, and the Darwin Project ten years later. In the context of these two projects, this case study aims to illustrate the benefits that can be derived from developing a database in an electronic format.

Benefits for the Custodian

Although the Darwin Project began as a project to produce hard copy, the rapid accumulation of records, notes, entry slips and various other addenda meant that having a version of the data in electronic form became a necessity simply for administrative reasons. From that point, transforming the digitised data into a presentable database was not an overly difficult task, and in the 1990s the Darwin Project duly decided to investigate how to do this. The Whistler Project also found that creating a digital archive using the available technology afforded it many advantages. Sophisticated text-editing software facilitated the handling of the large numbers of letters involved. Some images of Whistler's paintings, which, it is hoped, will also be incorporated into the completed resource, were also more easily handled once they were held in electronic format. Additionally, computerisation allowed editing staff on both projects to access and edit the data simultaneously from a single server. Having the complete files on a single drive also helped maintain security, and simplified the creation of back-up copies.

When publishers were approached, both projects found little resistance to the idea of electronic editions of the correspondence. This reinforced the notion that printed and electronic versions would not compete against one another, but would fulfil complementary functions. Publication of the Darwin Correspondence might in fact never have taken place without the ability to produce camera-ready copy from the electronic version of the letters. This reduced both the costs to the eventual publisher and the amount of work involved. Crucially, the ability to produce camera-ready copy also allowed the Darwin Project to retain control over complex material all the way through the proofing and typesetting stages.

Benefits for the User

Digital resources greatly improve the opportunities for access. For example, students working in the same location can consult electronic archives simultaneously, thus enhancing the possibilities of the archive as a teaching tool. Likewise, the publication of the letters in digital form increases the number of researchers who can investigate the correspondence. Responding to this opportunity was considered an important factor by both projects, because the multi-disciplinary nature of the letters meant they were likely to have a wide constituency of use. Amongst others, academics, scientists, historians and philosophers would have an interest in the Darwin collection, while the Whistler team noted that beyond art historians, historians and literary critics would have an interest in the resource. With access either on the Internet or via CD-ROMs, however, potential use spreads beyond higher education. The Darwin Project observed that other interested parties could include amateur historians and secondary schools, or simply the curious reader.

The Flexibility of the Digital Archive

Digital archives aid the process of searching for and finding specific material. Both the Darwin and the Whistler teams realised that electronic publication would increase the potential for searching the whole or particular parts of the database. This not only would be useful for the end-user, but has been and continues to be a most useful tool in the creation of the editorial material in the archive.

The Darwin Project has installed an 'Online Calendar' on its website. The Calendar catalogues each Darwin letter, with a brief summary of its contents. These summaries range from the particular - in one letter of 1866 Darwin discusses with his son "what the chances are against squinting and non-squinting children coming alternately in a family of ten" - to the more philosophical, as in 1870 when a letter from a professor at the University of Jena considers the "applicability of evolutionary theory to the question of human origins". The Project is still developing its own search tools for the Calendar, but in the meantime users can use their own browser's search capabilities to discover which letters concern topics appropriate to their research theme. The Calendar includes all Darwin letters known to date. There is an accompanying list of correspondents, arranged in alphabetical order. Among those appearing in this list are the astronomer William Herschel, the Parliamentary reformer and one-time Prime Minister, Lord John Russell, and the more mundane Bromley Savings Bank. Each correspondent is given a brief description, together with a record of the letters they either sent to or received from Darwin.
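The kind of keyword search over Calendar summaries described above can be sketched in a few lines of Python. This is only an illustration and not the Project's own search tool: the CalendarEntry structure and the search function are hypothetical, and the two sample entries simply reuse the summaries quoted in the paragraph above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CalendarEntry:
    """Hypothetical, simplified stand-in for one Calendar record."""
    year: int
    summary: str


# The only data used here are the two summaries quoted in the text above.
ENTRIES = [
    CalendarEntry(
        1866,
        "what the chances are against squinting and non-squinting "
        "children coming alternately in a family of ten",
    ),
    CalendarEntry(
        1870,
        "applicability of evolutionary theory to the question of human origins",
    ),
]


def search(entries, *keywords):
    """Return the entries whose summary contains every keyword (case-insensitive)."""
    needles = [k.lower() for k in keywords]
    return [e for e in entries if all(n in e.summary.lower() for n in needles)]


if __name__ == "__main__":
    for entry in search(ENTRIES, "evolutionary"):
        print(entry.year, entry.summary)
```

A dedicated search of this kind differs from a browser's page search mainly in that it can combine several keywords and return whole records rather than highlighted strings.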

In order to create its electronic collection, a segment of which will go online this year, the Whistler Project adopted a relational database structure, with inter-linked databases for the relevant letters, personal names, works of art and exhibitions. With this construction, determining connections between related events, people and items becomes a much simpler task for those using the resource. Because of its obvious connection to the arts, the Whistler Project is particularly keen to extend the project in this fashion by linking some images by Whistler to their citation in the correspondence. Reproducing these images alongside the letters in a printed edition would be expensive, for both producer and buyer, but fewer such problems arise when they are placed beside one another in electronic format. The database can also create hyperlinks from citations of specific events to general chronological tables and from mentions of particular personalities to longer biographical descriptions. Such detailed connections allow the user to have a greater understanding of the context in which the letters were produced. As with the Darwin Project, a catalogue of the letters contained in the database has now been established online and users can freely access it.
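As a rough illustration of the inter-linked structure described above, the following Python sketch builds a tiny in-memory SQLite database with tables for people, works of art and letters, plus a linking table that records which letters cite which works. It is not the Whistler Project's actual schema: every table name, column name and sample row is a hypothetical placeholder, chosen only to show how such links let a user move from a work of art to the letters that mention it.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with hypothetical, simplified tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE person  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE artwork (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
        CREATE TABLE letter (
            id           INTEGER PRIMARY KEY,
            sender_id    INTEGER REFERENCES person(id),
            recipient_id INTEGER REFERENCES person(id),
            written      TEXT,   -- ISO date; placeholder values only
            summary      TEXT
        );
        -- Linking table: which letters cite which works of art.
        CREATE TABLE letter_artwork (
            letter_id  INTEGER REFERENCES letter(id),
            artwork_id INTEGER REFERENCES artwork(id),
            PRIMARY KEY (letter_id, artwork_id)
        );
    """)
    # All rows below are invented placeholders, not real catalogue entries.
    conn.executemany(
        "INSERT INTO person VALUES (?, ?)",
        [(1, "Painter A"), (2, "Correspondent B"), (3, "Correspondent C")],
    )
    conn.executemany(
        "INSERT INTO artwork VALUES (?, ?)",
        [(1, "Painting X"), (2, "Painting Y")],
    )
    conn.executemany(
        "INSERT INTO letter VALUES (?, ?, ?, ?, ?)",
        [
            (1, 1, 2, "1880-01-05", "Discusses Painting X"),
            (2, 3, 1, "1881-03-12", "Asks about an exhibition"),
            (3, 1, 3, "1881-04-02", "Mentions Painting X and Painting Y"),
        ],
    )
    conn.executemany(
        "INSERT INTO letter_artwork VALUES (?, ?)",
        [(1, 1), (3, 1), (3, 2)],
    )
    return conn


def letters_citing(conn, artwork_title):
    """Return (letter id, sender, recipient) for letters citing the given work."""
    rows = conn.execute(
        """
        SELECT l.id, s.name, r.name
        FROM letter l
        JOIN letter_artwork la ON la.letter_id = l.id
        JOIN artwork a ON a.id = la.artwork_id
        JOIN person s ON s.id = l.sender_id
        JOIN person r ON r.id = l.recipient_id
        WHERE a.title = ?
        ORDER BY l.id
        """,
        (artwork_title,),
    )
    return rows.fetchall()


if __name__ == "__main__":
    db = build_demo_db()
    print(letters_citing(db, "Painting X"))
```

Keeping the letter-to-artwork links in their own table means a letter can cite any number of works, and a work can be cited in any number of letters, without duplicating either record. An exhibitions table could be linked in the same way.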

Digital copies of the letters give the opportunity for a more integrated scholarly apparatus. The Whistler Project is creating mutual links between specific footnotes and their reference points in the text itself. Both projects are also keen that the electronic text should be accompanied by the full editorial apparatus supplied in the printed volumes. They believe it is important for digital resources to have a scholarly reputation equal to that of hard copy publications.

Digital archiving also has the advantage of creating a flexible product - the archives can be updated continuously. Because of the nature of both projects this is an important feature. New letters by or to both Whistler and Darwin keep on appearing (around 60 new Darwin letters are unearthed each year), and need to be added if the projects wish to be comprehensive.

Solving Problems

The digitisation of texts is not without its difficulties, especially in its technical aspects. This was a worry for both projects, as they lacked the in-house experience necessary to make decisions regarding the choice of hardware and software, and on issues such as copyright or the preservation of data. Both admitted that securing consistent assistance from Information Technology staff was difficult, although the Darwin Project acknowledged help over many years from Cambridge University's Literary and Linguistic Computing Centre, and from the University Library's Automation Department. Difficulties were also encountered when appealing for support from their libraries and home universities. Gaining technological advice was rather a case of identifying particular experts, not necessarily connected to their own institution. The Whistler Project, for instance, employed an external computer adviser to deal with text-encoding problems and to refine the database. Others consulted were the University of Sheffield Humanities Research Institute and the Oxford Text Archive, which provided the Whistler Project with help on suitable choices of software, and on methods to create databases that link to large pieces of text, such as the letters. Both projects commented that useful information was gathered by consulting with their partners from the USA, and also with those working on similar digitisation projects.

Preservation

For more complex questions, especially those relating to the preservation of digitised letters in the long term, help was sought from various organisations, including the aforementioned Oxford Text Archive. Preparing for the long-term preservation of digital archives is vital, as the difficulties experienced by the Darwin Project demonstrate. The project's longevity has meant that those working on Darwin's letters have suffered many upheavals during changes of technology - moving from mainframes to personal computers, for example, potentially endangers access to data. Getting good advice on preserving data is essential in order to minimise these difficulties. The Whistler Project, which made the switch to digitisation at a later date, has made use of the widely-accepted Standard Generalised Markup Language (SGML) to ensure the safety of data in the long term, and also to facilitate the process of creating a versatile database. While the Darwin Project has so far stored its data using traditional ASCII code, it is considering converting to SGML so as to have the advantage of a format that offers both flexibility and long-term security. The question of the physical preservation of the data must also be considered. Although the Darwin Project's electronic files are currently stored on the Cambridge University Library mainframe, there is no guarantee that the Library will be able to preserve them perpetually in an accessible form. It is likely that, for long-term preservation, a copy of the files will be deposited with the Oxford Text Archive.
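The practical difference between plain ASCII text and descriptive markup can be shown with a short Python sketch. It uses XML, a later and stricter relative of SGML, purely as a stand-in because Python's standard library can parse it. The element names, attribute names and letter content below are all invented for illustration and do not reflect either project's encoding scheme.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: every element, attribute and value here is invented.
MARKED_UP = """
<letter id="demo-1">
  <sender>Painter A</sender>
  <recipient>Correspondent B</recipient>
  <date when="1880-01-05">5 January 1880</date>
  <body>I have finished <artwork ref="x">Painting X</artwork> and will send it
  to <person ref="c">Correspondent C</person> next week.</body>
</letter>
"""


def describe(xml_text):
    """Extract structured fields that plain ASCII text would leave implicit."""
    root = ET.fromstring(xml_text)
    body = root.find("body")
    return {
        "id": root.get("id"),
        "sender": root.findtext("sender"),
        "date": root.find("date").get("when"),
        "artworks": [e.text for e in body.iter("artwork")],
        "people": [e.text for e in body.iter("person")],
        # The readable text is still recoverable by discarding the tags.
        "plain_text": " ".join("".join(body.itertext()).split()),
    }


if __name__ == "__main__":
    for key, value in describe(MARKED_UP).items():
        print(f"{key}: {value}")
```

Because the markup is itself plain text describing the content rather than a particular program's file format, the same file can feed a database, a printed edition or a web page, which is the flexibility and longevity argument made above.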

Copyright

Copyright is another tricky area. Even the best prepared of digitisation programmes can come unstuck when faced with a welter of copyright issues. The projects were fortunate in that the copyright for the letters written by Darwin and Whistler was held by those who were already involved - a descendant of Darwin in the first case, and the University of Glasgow in the second. Organising copyright clearance for the letters written to either Whistler or Darwin meant consulting a huge number of the correspondents' descendants, a time-consuming piece of work. For both projects, many of the copyright holders could not be contacted. Thankfully, however, current copyright law exempts one from securing copyright clearance if reasonable attempts have been made to contact the rights holder.

Funding

As with most large-scale academic projects, it is important to cast a wide net in order to appeal to as many sources of financial support as possible. While these projects have both had a principal source of income, their longevity means that they have had to search for other means of financial assistance. A recurring British Academy grant has been the mainstay of the Whistler Project, but additional funding from the John Sloan Memorial Trust and the Getty Grant Program is helping to complete the work. The University of Glasgow has also been very helpful in paying some team-members' salaries and providing small research grants. The Darwin team has had funding from sources in the United States since its inception. In Britain, grants have been received from The British Academy, The Royal Society, The Pilgrim Trust, the Isaac Newton Trust, and the Natural Environment Research Council, and a major grant from the Wellcome Trust has allowed the editorial team at Cambridge to be considerably expanded. Both projects are aware, however, of the need to continue looking for funding in order to see their tasks through to completion.

Neither the Whistler nor the Darwin Correspondence Project is complete at present, although the latter has produced several hard-copy volumes of Darwin's letters. This, however, is far from unusual. While creating an electronic archive does hold plenty of advantages - facilitating administrative matters and widening access, for example - it also demands attention to additional matters. Creating an electronic archive is not something that can be done in an instant. Project managers need to be aware of issues such as preservation, additional copyright restrictions, and various technical issues, while still maintaining the high scholarly standards that would be assumed to accompany any hard-copy edition. As the Whistler and Darwin Correspondence Projects have recognised from the very outset, paying attention to these extra issues is vital if one hopes to exploit fully the possibilities that arrive with the development of an electronic archive.

Many thanks to Nigel Thorp of the Whistler Correspondence Project
and Alison Pearn of the Darwin Correspondence Project
for their help in the preparation of this case study.