Written in 2006, Cohen and Rosenzweig’s guide to digitization, “Becoming Digital,” certainly captures both the excitement and the concurrent hesitations over digitizing the past that have come to characterize the burgeoning field of digital humanities at the turn of the 21st century. The authors begin their introduction by noting that the explosion of interest in the internet and the movement of information and archives online “has created an implicit bias that favors digitization over a more conservative maintenance of analog historical artifacts.” What I think this article does well is that, while it discusses the benefits of digitization and the access it enables, it also asks readers to consider critically why they would want or need to pursue digitization projects and whether such projects are necessary to the form and content of their research. Because it is relatively easy to get caught up in the excitement and ‘newness’ of the digital age, one might want to digitize materials and move content online simply for the sake of digitization itself. However, as the authors stress, anyone considering a project that potentially involves digitization should weigh its costs, not only the monetary expense but also the inevitable loss of information as original, physical artifacts are transfigured to accommodate a digital form.
A compelling point the article brings up is that “even in the best of circumstances, the move from analog to digital generally entails a loss of information, although the significance of this loss is subject to continuing debate.” One way to picture this kind of loss is to compare the quality and inherent value of vinyl records with CDs or, more recently, with music streamed entirely online. What is lost when tangible artifacts are presented on flat surfaces as digital images? Or, what kind of value is lost when physical written sources are digitally formatted into visual representations of the text? I wonder whether such implicit qualitative value can even be accurately quantified.
As a historian of medieval Europe, I expect the bulk of the primary sources I deal with in graduate school to be centuries-old manuscripts. The thought of interfacing only with digitized typefaces of manuscript texts is honestly a bit scandalizing. Besides the actual written content of medieval manuscripts, much of the value of these sources lies in features that would be difficult or impossible to represent with digitized text, such as marginalia, annotations, or visible scrapings of the vellum, not to mention the intricately detailed calligraphy and artwork present in many documents of this type. Furthermore, faithful digitization of text is extremely difficult with manuscripts to begin with, and Cohen and Rosenzweig address this issue, saying, “medieval manuscripts present much thornier difficulties, including different forms of the same letters and a plethora of superscripting, subscripting, and other hard-to-reproduce written formats.”
The incompatibility of text digitization software, such as OCR, with handwritten documents was made clear to me during Thursday’s lab. In class, I chose to scan a handwritten letter, but when I tried to lift the text using OCR, the result was scrambled nonsense: numbers and characters that appeared nowhere in the letter, and not one coherent word recognized. Now, this was not even the scrawled or flourishing medieval script that is often illegible to everyone but philologists. It was relatively neat cursive, and it was absolutely discombobulated when run through digitization software. This experience in the lab led me to question whether text digitization is a useful endeavor when working with large quantities of medieval manuscripts; the answer is probably not. Yet it also caused me to consider the benefits of digitized text that would be lost if I chose to scan and upload full-page images of manuscripts online instead, namely the ability to search and manipulate texts and to readily copy and paste select passages.
I had much more success in scanning a modern painting of a medieval abbey than I believe I ever will in digitizing medieval text. The image I chose to scan was an oil painting of Bath Abbey in Bath, England, by the artist Peter Brown. Having lived in Bath and worked as a tower tour guide at the abbey, I have very fond associations with the courtyard space depicted in the painting, and especially with the grey, shimmery wetness that characterizes the scene. Considering how easily and effectively this image scanned, I believe that scanning page images of medieval sources is the type of digitization best suited to my field. In addition, after briefly working with Tropy, I realize that tagging page images and searching them by those tags is feasible if I ever find myself needing to search key attributes within a large corpus of digitized manuscript images.
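To make the tagging idea concrete for myself, here is a minimal sketch in Python of how hand-entered tags could make a set of page images searchable. It is only an illustration of the general approach and does not use or reproduce Tropy; the `PageImage` class, the `search` and `build_tag_index` helpers, and every filename and tag below are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class PageImage:
    """One scanned manuscript page plus descriptive tags entered by hand."""
    filename: str
    tags: set = field(default_factory=set)


def build_tag_index(pages):
    """Map each lowercased tag to the filenames of the pages that carry it."""
    index = {}
    for page in pages:
        for tag in page.tags:
            index.setdefault(tag.lower(), []).append(page.filename)
    return index


def search(pages, *required_tags):
    """Return filenames of pages carrying ALL of the given tags (case-insensitive)."""
    wanted = {tag.lower() for tag in required_tags}
    return [
        page.filename
        for page in pages
        if wanted <= {tag.lower() for tag in page.tags}
    ]


# Hypothetical sample records: filenames and tags are made up for illustration.
pages = [
    PageImage("folio_001r.jpg", {"marginalia", "illuminated initial"}),
    PageImage("folio_001v.jpg", {"marginalia", "erasure"}),
    PageImage("folio_002r.jpg", {"illuminated initial"}),
]

if __name__ == "__main__":
    print(search(pages, "marginalia"))
    print(search(pages, "Marginalia", "erasure"))
```

The point of the sketch is that the searchable part is the description attached to each image, not the handwriting itself, so features that OCR cannot read (marginalia, erasures, decorated initials) can still be found as long as someone has tagged them.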
Daniel J. Cohen and Roy Rosenzweig. “Becoming Digital.” Digital History: A Guide to Gathering, Preserving, and Presenting the Past on the Web, 2006. http://chnm.gmu.edu/digitalhistory/digitizing
Peter Brown. Pigeons in the Rain, Abbey Courtyard. 2016. Oil on Canvas.