2020 Goals: Genealogical Record Sets

Some staffing changes at my local library mean that the amount of time I have available off the reference desk has shrunk considerably, a change that has made me rethink how I can continue digitizing resources when I am not able to work sustainably with the bulk of our digitization equipment. To address this change in a fashion that allows me to continue my work, I’m choosing to focus this year on genealogical records, which is turning out to be a successful plan for a variety of reasons.

Record Sets

After reviewing our manuscript collections and card indices, three primary categories of materials made themselves apparent as good targets for digitization: organizational records, cemetery records, and veterans records.

For the organizational records, identifying suitable record sets was luckily a straightforward process. I began by looking through our finding aids for collections devoted to a single organization (e.g., International Union of Operating Engineers, Local 313 (Local 18) Collection, Athena Art Society Collection), a category of organizations (Hospitals and Medical Organizations of Greater Toledo Collection), and, failing that, our catch-all Clubs and Organizations of Greater Toledo Collection. While perusing these, I noted any mention of a membership list, directory, or yearbook.

We already have some corporate records from the Lloyd Brothers Walker Monument Company available online as a cemetery records set, but it has always bugged me that this set hasn’t had another cemetery record set to join it. After some combing through our catalog and finding aids, I eventually turned up the Northwestern Ohio Genealogical Society (NWOGS) Cemetery Project Collection. In the 1970s this genealogical society sallied forth into Northwest Ohio, parts of Michigan, and even one cemetery in Canada to individually transcribe every headstone in every cemetery they could find. We have the original manuscript cards they filled out by hand, as well as (crucially) the typescripts they generated afterwards. Scoped to only those cemeteries in Lucas County, this collection is a perfect candidate for digitization.

Finally, for veterans records we have extensive card file indices of clippings from area newspapers documenting soldiers in World War II and the Korean War, as well as a recently rediscovered Women of World War II index. These three card files will form the starting point for our veterans record sets. They are also by far the most extensive collections, with several dozen drawers each containing well over 500 cards. Scanning of the card files will in all probability extend well into 2021.

Equipment and specifications

These records are all typed text, generally of good quality and clarity, with minimal images for consideration. My goal in digitizing these records is to scan them at sufficient quality for successful OCR, allowing them to be searched across Ohio Memory, DPLA, and our own Federated Search Tool. As such, scanning these at a resolution of 400 PPI and in 48-bit color should enable successful OCR without making the process so time-consuming as to be (functionally) impossible. I’m the primary person working on this project, with an intern from a local university assisting with segments of it.
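To give a rough sense of what those scan settings mean for storage, here is a minimal Python sketch that estimates the uncompressed size of a single scan. The function name and the 3 x 5 inch card size are my own illustrative assumptions (the actual card and page dimensions will vary), and real files saved with compression will be smaller.

```python
def uncompressed_scan_bytes(width_in, height_in, ppi=400, bits_per_pixel=48):
    """Estimate the uncompressed size, in bytes, of one scanned image.

    width_in / height_in are the physical dimensions in inches.
    ppi and bits_per_pixel default to the settings described above.
    """
    width_px = round(width_in * ppi)
    height_px = round(height_in * ppi)
    # Each pixel takes bits_per_pixel bits; divide by 8 to get bytes.
    return width_px * height_px * bits_per_pixel // 8


# Assumption for illustration only: a 3 x 5 inch index card.
card_bytes = uncompressed_scan_bytes(3, 5)
print(f"{card_bytes / 1_000_000:.1f} MB per card at 48-bit color")

# The same card at 24-bit color, for comparison.
card_bytes_24 = uncompressed_scan_bytes(3, 5, bits_per_pixel=24)
print(f"{card_bytes_24 / 1_000_000:.1f} MB per card at 24-bit color")
```

Multiplying the per-card figure by the number of cards in a drawer gives a ballpark for how much raw storage a drawer might need before any compression.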

Our reference desk couldn’t accommodate our Epson 11000XL large format flatbed scanner, much less our Kirtas book scanner or I2S planetary scanner, but it is spacious enough to fit one of our smaller flatbeds: the Epson V600 photo scanner. This flatbed produces remarkably good images considering its $200 price point, and we happened to have a couple that had sort of made their way around our department for various projects. Having one moved out to our reference desk was a straightforward way to open up scanning capacity there with no need to spend any funds.

Copyright considerations

Contemplating the copyright status of these works was interesting because this turned out to be a situation I don’t often find myself in. Frequently as a digitization librarian my projects center around materials in the public domain, materials for which we’ve been given permission by the copyright holder to share, or projects where we feel the digital distribution constitutes a fair use of the work. This project, however, falls into another category: materials that are factual in nature and thus not eligible for copyright protection in the first place.

The organizational yearbooks/directories that I started this project with list committee makeup, membership rosters, and schedules of events, all information that is simply documenting and reflecting what occurred throughout the year. None of the information is particularly creative in nature. The same has been true of the cemetery transcriptions; they list who transcribed the tombstones, who generated the typescript, where the cemetery is located, when all this occurred, and then the tombstone transcriptions themselves. Again, simple facts, no creative endeavor engaged in.

In those cases where sections move into more substantially creative territory, I’ve omitted them from public access to ensure that intellectual property is being respected. For instance, the Athena Art Society yearbooks were frequently handmade chapbooks during the run that has been digitized, with individually created artistic covers. All of these covers were excluded from upload. Similarly, the Women’s Auxiliary to the Academy of Medicine of Toledo and Lucas County included their constitution and governance information; as this would certainly warrant copyright protection, it was also excluded.

The big exception here is the card files of clippings providing information on those involved with World War II and the Korean War. The newspaper articles are all protected by copyright, and none of them have entered the public domain to date. For this a fair-use argument was needed, particularly highlighting that the amount shared is a small fraction of the original work, that there is no available means for creating a licensed version of this resource, that creation of the index is a highly transformative use of the copyrighted material, and that the resource is meant for research purposes. The Columbia Fair-Use Checklist (created by Kenneth D. Crews and Dwayne K. Buttler) was used to document the assessment. Ultimately I ended up with 14 factors in favor of fair use and one against, which certainly doesn’t guarantee a fair-use ruling but left me confident enough to proceed.


An important factor for consideration here is obviously privacy; the entire point of this enterprise is to make it easier to find information about people. A firm cut-off point of 1970 is being used, which amounts to 50 years at the time of writing. This provides a robust buffer of time and should ensure that most names and addresses will no longer be current.
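For anyone tracking items in a spreadsheet or script, the cut-off can be applied as a simple filter. The sketch below is only an illustration: the function name and sample records are hypothetical, and both treating 1970 itself as included and holding back undated items are assumptions on my part rather than settled policy.

```python
CUTOFF_YEAR = 1970  # the cut-off described above


def is_publishable(record_year, cutoff_year=CUTOFF_YEAR):
    """Return True if a record is old enough to share publicly.

    Assumptions: the cut-off year itself is included, and records
    with an unknown year (None) are held back for manual review.
    """
    return record_year is not None and record_year <= cutoff_year


# Hypothetical sample records: (title, year)
records = [
    ("Membership directory", 1952),
    ("Yearbook", 1970),
    ("Yearbook", 1984),
    ("Undated roster", None),
]

for title, year in records:
    status = "publish" if is_publishable(year) else "hold back"
    print(f"{title} ({year}): {status}")
```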