Spreading the Word: The Oxford Theology Faculty digital library project
----------------------------------------------------------------------------
Introduction

A common problem for university librarians is the sudden demand for a number of texts when those texts are filed under 'essential reading' on a first-year reading list. Mrs. Susan Lake, of the Theology Faculty Library at the University of Oxford, faces such a problem on a regular basis. For example, books for the first-year course examining The Gospel according to Mark are in great demand during the first term. The library does not have the resources to purchase a suitable number of books (nor does it have sufficient space to house them), and many students face an uncomfortable wait before they can get hold of the texts they need to read.

Digitisation offers a solution to this dilemma. Electronic copies of the texts, posted on the Internet, can be printed out by the students at leisure. A copy on the Internet can be read by any number of students at once, so student demand is catered for instantly. For those studying theology, digitisation is especially useful: a significant proportion of the subject's texts are difficult to access physically (because they are out of print or housed in libraries outside the academic mainstream), so having them available in electronic format offers significant advantages.

Mrs. Lake, aided by the University of Oxford Humanities Computing Unit, therefore initiated a pilot programme of digitisation (called the Theology Faculty digital library project) to ease the strain within her own library, and also to act as a model for others preparing electronic materials to be added to a student reading list.

The Texts

With one or two exceptions (where entire books were being digitised), the material to be digitised consisted of approximately 25 excerpts taken from monographs, or articles taken from journals or anthologies. All related to the study of the Gospel of Mark, and together they totalled 625 pages. Each of the texts was of recent production, i.e. written within the past 30 years.
Once the material had been digitised, the plan was to place the files within a 'virtual faculty' shell on the university website. The theology digital library would be the first part of a larger set of resources which students studying theology could access via the departmental homepage. Other resources available within the faculty shell included a seminar entitled The Bible: Its use and influence, and also a Greek language course.

Copyright

Digitising recently produced articles immediately raises the problem of copyright. There is a strong desire within Higher Education to digitise primary and secondary material (Susan Lake discovered that most of the lecturers within the theology department responded positively to her project), but the lack of swift mechanisms for arranging copyright permission means that digitisation projects are more complex and time-consuming than they initially appear.

For each text that the theology library wanted to include in its digital library, it was necessary to obtain the right to do so. The circumstances behind each text were very different. Where the copyright holder (sometimes the author, but more frequently the publisher) was related to the University of Oxford, it tended to be a fairly painless process. Some other publishers, for example Christian Aid (who published two of the books from which excerpts had been taken), were happy to give permission, and asked only for acknowledgement of the author, the publisher and the details of the copyright. More often, publishers of journals, such as Interpretation, would ask for a small fee ($10 an article), small enough to be covered by the budget of the digital library project.

Many publishers were still suspicious of the Internet, however, and did not have a coherent policy for allowing digital publication. In the majority of these cases, Mrs.
Lake found it best to contact publishers with a set of rights agreements already in mind, and most publishers were happy to accept the reasonable contract laid out by the theology team.

The greatest organisational difficulties arose when seeking permission for articles appearing in anthologies. Articles within anthologies had often been published beforehand, meaning the project had to locate and contact the original publisher for permission. If the article had been translated, or the editor of the anthology had added footnotes or commentary to the original article, then copyright permission had to be sought from the translator or editor as well. Thus for some articles, permission had to be sought from three different sources.

As is often the case when approaching copyright, especially for a project with limited time and funding, pragmatic choices had to be made. The team asked for rights for a limited time only. While this improved the chances of getting permission, and also lowered costs, it means that Susan Lake will have to re-apply for rights when the initial agreement expires after two years.

To smooth the procedure of gaining permission further, the Theology Faculty digital library project stipulated that the texts would only be available from computers with Oxford University IP addresses. Obviously, this was not the ideal solution. Not only could non-Oxford students not access the digitised texts, but neither could Oxford students working from their home PCs. However, it was necessary to offer publishers (many of whom were, and still are, apprehensive that the Internet is stealing their share of the market) such reassurances for the pilot project to get off the ground.

Digitisation

Many project managers starting out on a course of digitisation are under the impression that the technical aspect of the programme, that is, the conversion of the primary material into electronic form, will be the trickiest part of the entire project.
However, it is the organisational matters such as copyright that really cause problems. While digitisation itself is a repetitive task, the technology and the know-how exist to execute the procedure.

Needless to say, this procedure can be carried out in many different ways, and those running a digitisation project need to think clearly about the best way of achieving it. Advice will be required from technical staff and from organisations, such as the AHDS, with experience of digitisation. Will it be more efficient to type in the texts manually, or to use a scanner and OCR (Optical Character Recognition) to identify the characters of the text? Indeed, will it produce a better result if the final electronic resource is visual or textual? Each digitisation project comes up with its own answers to these questions, even though the factors involved (the need to prepare the electronic documents reasonably quickly and cheaply while at the same time producing a high-quality resource for the students) are recurrent.

The theology team at Oxford settled on a mixed programme of digitisation, using both text and image. Ideally, they would have created fully edited text versions of each article, but this would have entailed scanning or keying in all the texts, and then going through the laborious task of proof-reading. The printed texts were instead first scanned as images, so when a student now requests a finished text, she is presented with an image of that text (see Figure 1 for an example). This image file can either be viewed online as a .gif file, or be downloaded in .pdf format and read with Adobe Acrobat Reader.

Figure 1 - A sample image from the Theology Faculty digital library

The theology team then created textual versions of the texts to lie behind the images, to help the student search for particular words in the text. As stated, creating edited texts was too laborious a procedure for the theology project.
Consequently, the project's text versions were not proof-read for errors or lapses in the scanning, but were simply placed on the web in unedited form. When the student performs a search, the computer uses this version to find the relevant word and then displays the relevant excerpt of the image file.

Obviously, an uncorrected text file has its drawbacks. No student can be sure of finding every usage of a word or concept in an article, because a small percentage of the words will have been misread by the scanner. Equally, the fragments of foreign languages using non-Western typefaces have not been edited, so no complex linguistic searching is possible. Nevertheless, searching is still much easier than if the texts had remained in printed form.

The actual digitisation was done in house. Frequently, project managers can contact the Higher Education Digitisation Service (HEDS) to provide extensive assistance. However, in this case the texts could not leave the Oxford library to be outsourced for digitisation, as HEDS requires. Susan Lake, aided by technical help from the Faculty IT officer, therefore drew up a plan for the team to digitise the texts themselves. Specific instructions for digitising were developed, and graduate students were employed to carry the job out, using a scanner purchased by the project team.

A database was also created to store all the relevant bibliographic information on the texts (or metadata, as it is called in the context of electronic data), as well as some additional administrative information that is reserved for those involved in the design of the project. Database details, such as authors, publishers and translators, are presented to users when they select a text to download.

Developing the Project

The Theology Faculty digital library was in place for the start of the 1999-2000 term.
By the first month of 2000, it was receiving a healthy 3500 hits a month, taking much of the strain off the more traditionally organised short loan collection within the theology library.

Seeing the enthusiasm amongst lecturers, librarians and students, Mrs. Lake is planning a further round of digitisation to add to the 'virtual faculty'. Again, the available funding will not be large, and limits will have to be set on the scope of the project. There has been talk of marking up the texts with XML (the eXtensible Markup Language), which provides digitisers with a more coherent way of presenting their texts on screen, and can offer users much greater sophistication in their searching. However, editing and marking up texts is a time-consuming business, and the theology team has had to shelve such plans for the moment. Likewise, the texts will remain available only to those working from University of Oxford computers.

Nevertheless, the primary aim of the project has been achieved: theology students have been provided with the texts they need, which supports the claim that the Internet has begun to permit much easier access to much-needed information.

Further details on the project, including various appendices on funding, copyright and the instructions for digitisation, are available from [http://www.oucs.ox.ac.uk/hcdt/theology/]
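
The search arrangement described under Digitisation, in which an unedited text version lies behind each page image, can be illustrated with a short sketch. This is not the project's own software, which is not described in this case study; it is a minimal Python illustration, and the `Page` record, the `find_pages` function, the file names and the sample sentences are all invented for the purpose. It also shows the drawback noted above: a word misread by the OCR software is simply not found.

```python
import re
from dataclasses import dataclass


@dataclass
class Page:
    """One scanned page: the image shown to the reader and its raw OCR text."""
    image_file: str  # hypothetical file name of the page image
    ocr_text: str    # uncorrected OCR output, used only for searching


def find_pages(pages, word):
    """Return the image files of the pages whose OCR text contains `word`.

    Matching is case-insensitive and on whole words only.
    """
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [page.image_file for page in pages if pattern.search(page.ocr_text)]


# Invented sample data. On the second page the OCR software has misread
# the letter "l" as the digit "1", so "parable" appears as "parab1e".
sample_pages = [
    Page("page_001.gif", "The parable of the sower opens the chapter."),
    Page("page_002.gif", "Mark places the parab1e of the seed next."),
]

if __name__ == "__main__":
    # Only the first page is found; the misread word on the second is missed.
    print(find_pages(sample_pages, "parable"))
```

The reader is only ever shown the page image, so OCR errors do not affect what she reads; they affect only which pages a search returns.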
Many thanks to Susan Lake for her help in composing this case study.