Curating a Digital-First Collection: Prof. Kerby Miller's Collection of Irish Emigrant Letters
In 2023, the University of Galway Library digitised the research collection donated by Professor Kerby Miller (Professor Emeritus of History, University of Missouri). His collection offers a deep reading of the lives of Irish emigrants to North America and the development of Irish diaspora identities across more than 250 years. Phase 1 of making this important collection available to the public online is underway, with the aim of publishing all collected Irish emigrant letters on a dedicated online portal in early 2024. The digital curation workflow involves processing, selecting and describing the letters as unique items. The curation methodology is built on an interdisciplinary approach, and this article summarises some of the strategies and advancements made to date.
Process
Digitisation of the collection concluded in March 2023, with 150,000+ pages scanned to high-resolution TIF format, according to international digital preservation standards. In anticipation of the workflow's item selection and description stages, the entire TIF collection was converted to a JPEG 2000 (JPF) version (for publication online) and a JPEG version (for day-to-day access and reference). This conversion process took 3-4 weeks using Adobe Photoshop to run conversion actions in batches. Digital file and folder names were also revised to align with the Library’s naming conventions.
A simultaneous review of the collection’s box list determined which areas of the collection are IN and OUT of scope for selecting emigrant letters. The collection is arranged across 125 boxes, 1,491 subseries and 1,729 digital folders. The subseries are arranged around an individual, a family or another discernible grouping relating to a common association, such as location or research topic. Each folder holds a mix of material types, ranging from emigrant correspondence, memoirs, poems and newspaper clippings to research notes, researcher/donor correspondence, genealogical records and more.
Through the box list review, 817 of 1,491 subseries were determined to be IN-scope, which supported planning and goal setting for the letter item selection and description stages. The digital extent of the collection IN-scope totals 976 folders and 77,087 pages.
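For planning purposes, the figures above can be turned into rough proportions. The following is a minimal Python sketch that uses only the counts quoted in this article; the helper name `share` is introduced here for illustration and is not part of the Library's workflow.

```python
def share(part: int, whole: int) -> float:
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)

# Counts quoted in the article.
subseries_in_scope, subseries_total = 817, 1491
folders_in_scope, folders_total = 976, 1729
pages_in_scope = 77087

subseries_share = share(subseries_in_scope, subseries_total)  # share of subseries IN-scope
folder_share = share(folders_in_scope, folders_total)         # share of digital folders IN-scope
pages_per_folder = round(pages_in_scope / folders_in_scope)   # average pages per IN-scope folder

print(subseries_share, folder_share, pages_per_folder)
```

On these figures, a little over half of the subseries and digital folders are IN-scope, at an average of roughly 79 pages per IN-scope folder.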
Select
The letters that Miller collected stem from library and archives holdings and private collections. While a precise number is currently unknown, the selection stage will quantify the number of unique letter items. It is estimated that Miller collected 8,000-10,000 letters over five decades.
a) Naming
The workflow's selection stage centres on identifying unique letter items. The letters are named using a file naming convention that supports bulk ingestion to the Library’s Digital Asset Management (DAM) system. Any file that is not a letter page is also removed from the working folder of digital assets destined for publication.
In the example provided below, the folder on the left shows letter items named by inserting an identifier (_d00x) into the file name. The folder on the right shows the complete folder as it was digitised, holding a mix of material types. The last digit in the file name represents the page number in sequence – this unit does not change when identifying letter items. This ensures that the intellectual link between the arrangement of the physical collection and the curated digital collection is maintained.
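Separately from the folder illustration, the renaming step can be sketched in Python. This is a sketch under stated assumptions, not the Library's actual convention: the sample file name `box001_folder002_0007.jpg`, the underscore separator, the position of the identifier directly before the page number, and the function name `add_letter_id` are all hypothetical. What it keeps from the description above is that a `_d00x` identifier is inserted and the trailing page-number unit is left unchanged.

```python
from pathlib import PurePath


def add_letter_id(file_name: str, letter_no: int) -> str:
    """Insert a letter identifier (_d001, _d002, ...) before the trailing page number.

    Assumes (hypothetically) that names end in "_<page number>.<extension>".
    """
    path = PurePath(file_name)
    # Split the stem into everything before the last underscore and the page unit.
    prefix, sep, page = path.stem.rpartition("_")
    if not sep or not page.isdigit():
        raise ValueError(f"no trailing page number in {file_name!r}")
    # The page unit is carried over untouched, preserving the original sequence.
    return f"{prefix}_d{letter_no:03d}_{page}{path.suffix}"


# Hypothetical example: page 7 of a folder is identified as part of letter 3.
renamed = add_letter_id("box001_folder002_0007.jpg", 3)
print(renamed)
```

Because the page number is never rewritten, a renamed file can still be traced back to its position in the folder as digitised.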
b) Letter Type
To guide the letter selection stage, two primary letter types have been determined: typed transcript and reproduction. These terms are used according to the definitions provided by the Dictionary of Archives Terminology:
- Transcript: A handwritten or typed copy of a document.
- Reproduction: A duplicate made from an original; a copy.
Letter items curated for online publication include a typed transcript version of a letter, a reproduction (photocopy) of an original letter, or a combination of both types, where available. One of the primary curation challenges is the existence of duplicate letters across the collection. Duplicated transcripts show revisions or corrections by the Millers (Kerby and Patricia) and research assistants. There are also duplicate reproductions that may have been collected from different sources or reproduced again to improve legibility.
Combining a typed transcript and a reproduction version of a letter into a single letter item is the preferred standard for publication. Typed transcript versions are essential for Optical Character Recognition (OCR) scanning to extract keywords for search and filtering. This will allow users to access and navigate the letters without restriction, supporting diverse areas of research interest. By contrast, the reproduction versions connect users to the records' materiality. Encountering the handwritten letters, as penned by the authors, gives readers an emotive insight into the past.
The letters will be published according to the typical standards of the University of Galway Library digital collections. Each letter item will be displayed as digital images in the IIIF document viewer, arranged according to individual author or family group. Below is an example of a digitised letter, shown as both a reproduction (left) and a typed transcript (right):
c) Letter Quality
- Excerpt: transcripts that include only part of the letter’s contents.
- Run on: transcripts that share start and/or end pages with other transcripts.
- Annotations: transcripts that include handwritten explanatory notes, commentary or corrections to the text.
- Handwritten: transcripts that are not typed; written by hand.
- Publication: transcripts that are photocopied from a publication, such as a book, journal or newspaper.
Describe
- Sender (author) first name
- Sender (author) last name
- Sender (author) gender
- Recipient first name
- Recipient last name
- Recipient gender
- Sender (author) location
- Recipient location
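The fields listed above could be held in a simple record structure while letters are being described. The following Python sketch is illustrative only: the class name `LetterDescription`, the attribute names, the use of plain strings for every field, and the `missing_fields` helper are assumptions made here, not the schema of the Library's DAM system.

```python
from dataclasses import dataclass, fields


@dataclass
class LetterDescription:
    """One descriptive record per letter item (hypothetical field names)."""
    sender_first_name: str = ""
    sender_last_name: str = ""
    sender_gender: str = ""
    recipient_first_name: str = ""
    recipient_last_name: str = ""
    recipient_gender: str = ""
    sender_location: str = ""
    recipient_location: str = ""


def missing_fields(record: LetterDescription) -> list[str]:
    """List the fields that are still blank, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Placeholder values only; these do not describe a real letter in the collection.
draft = LetterDescription(sender_first_name="A.", sender_last_name="Example")
todo = missing_fields(draft)
print(todo)
```

A check of this kind makes it easy to see which descriptive fields remain to be completed for a given letter item.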
Geographic Metadata
- Fort Warren, Boston, Massachusetts, United States
- Fallagh (townland), Kilmacthomas, Waterford (county), Ireland
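Place names written in the form shown above run from the most specific place to the country, with an optional qualifier in brackets. As a hedged sketch, such a string could be split into its levels in Python as follows; the function name `parse_place` and the assumption that levels are always separated by commas are introduced here for illustration.

```python
import re
from typing import Optional

# A place level is a name, optionally followed by a bracketed qualifier,
# e.g. "Waterford (county)".
_LEVEL = re.compile(r"(.+?)\s*(?:\(([^)]+)\))?")


def parse_place(place: str) -> list[tuple[str, Optional[str]]]:
    """Split a comma-separated place string into (name, qualifier) pairs."""
    levels = []
    for part in place.split(","):
        match = _LEVEL.fullmatch(part.strip())
        if match is None:  # only happens for an empty level
            raise ValueError(f"empty place level in {place!r}")
        levels.append((match.group(1), match.group(2)))
    return levels


# Both strings are the examples given in the list above.
irish = parse_place("Fallagh (townland), Kilmacthomas, Waterford (county), Ireland")
american = parse_place("Fort Warren, Boston, Massachusetts, United States")
print(irish)
print(american)
```

Keeping the qualifier separate from the name allows a townland and a county of the same name to be told apart when filtering.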