Of course we have technology in our galleries and classrooms and information on the Web; of course we are exploiting social media to reach and grow our audiences. . . . But we aren’t conducting art historical research differently. We aren’t working collaboratively and experimentally.
James Cuno, Daily Dot Article
I start with this quote from Modernizing Art History because of its relevance to clean data. Many of the things Mr. Cuno mentions here are technologies that don’t require massive amounts of labor spent cleaning data. Technology in galleries and classrooms often takes the form of iPads, touch screens, or laptops, which work out of the box with little effort. Putting information on the web can take time when it comes to gathering the research, but the process of actually putting the information up is not far off from writing an article or any other piece of writing. Social media is much the same: while effort generally goes into putting together a content calendar, the tools themselves are relatively straightforward. And then there’s cleaning data for research, which is a different beast entirely.
While visualizing data can yield fascinating results, the work that it takes to get to the point of actually analyzing the data can be incredibly time consuming. Cleaning data may seem straightforward, but often people will start cleaning data only to realize they want to do something another way. For example, maybe at first while cleaning data you decide to insert the date one way, but then midway through you change your mind. You then have to go back and make sure all of the data follows the same convention. This issue with data cleaning also points to why someone may shy away from collaborative work. Often, people have different ideas of how to best work at cleaning data, and unless the conventions are set up from the very beginning, you risk having to do work two and three times to make sure everything is consistent.
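The date example above can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the list of input formats is hypothetical, slash dates are assumed to be month-first, and ISO 8601 (YYYY-MM-DD) is simply the convention I picked as the target.

```python
from datetime import datetime

# Hypothetical list of formats collaborators might have used (an assumption).
# Slash dates are assumed to be month-first, as in US usage.
INPUT_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%d %B %Y", "%B %d, %Y"]


def normalize_date(raw):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    cleaned = raw.strip()
    for fmt in INPUT_FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).date().isoformat()
        except ValueError:
            continue  # this format did not match, so try the next one
    return None  # leave unparseable values for a human to review


if __name__ == "__main__":
    for value in ["10/8/1960", "8 October 1960", "October 8, 1960", "sometime in 1960"]:
        print(repr(value), "->", normalize_date(value))
```

Returning None instead of guessing keeps the ambiguous entries visible, so collaborators can agree on how to handle them rather than each fixing them a different way.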
I ran into this while working on the Dictionary of Art Historians. Having started as a card catalogue in the 80s, the dictionary has gone through many iterations and has had several collaborators. This has resulted in several different forms of the data inside the dictionary. For example, as we saw in class, someone’s birthplace may be entered as just a country, as a city and country, or as a city and country in the native language; there are many forms the data can take. At one point, I went through the data and cleaned up the field for home country (so one could use a faceted search that filtered only the Italian art historians or only the Germans). The problem with this field was that everyone had entered the country differently: Italy vs. Italia vs. Kingdom of Napoli. Typos creep in as well, so that “france” comes out as “frace.” This is one reason to advocate for controlled vocabularies that can be integrated into a database, so that someone entering data picks from a drop-down list rather than typing the value themselves.
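As a rough sketch of what cleaning a country field against a controlled vocabulary might look like, here is a small Python example using only the standard library. The vocabulary, the variant table, and the decision to map “Kingdom of Napoli” to Italy are all my own assumptions for illustration, not the scheme actually used in the Dictionary of Art Historians.

```python
import difflib

# Hypothetical controlled vocabulary (an assumption, not the real list).
CONTROLLED_COUNTRIES = ["Italy", "France", "Germany"]

# Hypothetical table of known variants. Mapping a historical state to a
# modern country is an editorial decision a real project would need to make.
VARIANTS = {
    "italia": "Italy",
    "kingdom of napoli": "Italy",
}


def normalize_country(raw):
    """Map a free-text country entry onto the controlled vocabulary.

    Returns None when nothing matches, so the entry can be reviewed by hand.
    """
    key = raw.strip().lower()
    if key in VARIANTS:
        return VARIANTS[key]
    by_lower = {name.lower(): name for name in CONTROLLED_COUNTRIES}
    if key in by_lower:
        return by_lower[key]
    # Catch simple typos such as "frace" by looking for a close spelling.
    close = difflib.get_close_matches(key, list(by_lower), n=1, cutoff=0.8)
    return by_lower[close[0]] if close else None


if __name__ == "__main__":
    for entry in ["Italy", "Italia", "Kingdom of Napoli", "frace", "Atlantis"]:
        print(repr(entry), "->", normalize_country(entry))
```

A drop-down list prevents these problems at entry time; a script like this is only useful for repairing data that was typed in freely before the vocabulary existed.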
One kind of data I’m thinking about working with is the text of different versions of the original Diana and Actaeon myth, which I would compare using a tool like Voyant or by venturing into Python’s NLTK. In order to do this, it would be important to put the texts into plain text files. Then it would be important to make sure the text in each file is just the story I want, as many of the versions may have other stories before or after it. While Voyant will take stopwords out, with a tool like NLTK the text has to go through several rounds of processing before you can get reliable data. This involves taking out stopwords, truncating, and tokenizing, techniques that are also used in search engines like Google. As we learned in class, stopwords are words that occur frequently but carry little meaning. These lists can be very short or very long depending on the text, and of course it’s important to remember that each language has its own stopword list. Truncating (often called stemming) refers to treating words that share the same stem as a single term. You may notice this in a Google search if you search for something like “computer science” but get results involving the word “computing.” The final technique is tokenizing, which involves breaking a sentence down into individual words that the computer is then able to read.
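To make the three steps concrete, here is a minimal Python sketch that uses only the standard library rather than NLTK itself. Everything in it is a stand-in I made up for illustration: the stopword list is tiny, the suffix list is a crude substitute for a real stemmer, and the sample sentence is my own rather than a line from any version of the myth. Note that in code the tokenizing step has to run first, since the other two steps operate on individual words.

```python
import re

# Tiny illustrative stopword list (an assumption; real lists are much longer
# and differ by language).
STOPWORDS = {"the", "a", "an", "and", "of", "in", "into", "to", "was", "by"}

# Hypothetical suffixes for a naive truncation step; a real stemmer is far
# more careful than this.
SUFFIXES = ("ing", "ed", "er", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into runs of letters."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stopwords(tokens):
    """Drop frequent words that carry little meaning."""
    return [t for t in tokens if t not in STOPWORDS]


def truncate(token):
    """Strip the first matching suffix, keeping at least three letters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stopwords, then truncate each remaining word."""
    return [truncate(t) for t in remove_stopwords(tokenize(text))]


if __name__ == "__main__":
    sample = "Actaeon was hunting in the woods and stumbled into the grove of Diana."
    print(preprocess(sample))
    # "computer" and "computing" collapse to the same truncated form.
    print(truncate("computer"), truncate("computing"))
```

The truncated forms are not always real words (“stumbled” becomes “stumbl”), which is normal for this kind of processing: the goal is for related words to be counted together, not to produce readable text.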
As the Prown article we read makes clear, art historians are interested in the types of research questions that can be generated by “big data,” but there is still resistance even today to using data in a formalized manner. Maybe it’s a silly thought, but I do wonder if part of the reason art historians may be reluctant to use data is the prep involved. Many of us work in a manner where we look at the object first and then do research later. You don’t have to do a lot of work to prep your eyes to look at the object; in a sense, visual analysis gives a sort of instant gratification. While art historians are clearly detail oriented, data input and cleaning take a sort of monotonous attention to detail that we’re not used to. All that being said, I think one can get better at data cleaning with practice and can become comfortable with inputting data in such a manner.
Maybe one day our computers will become advanced enough to read messy data, but until then we have to deal with keeping our data tidy.
When I was at the Duke Wired! Lab this summer working on the Dictionary of Art Historians, I was able to learn a lot about the Visualizing Venice project, which launched several years ago and has gone through multiple iterations in its approach to spatial art history. These have included the Venetian Ghettos, the Accademia, and, several years ago, the exhibit A Portrait of Venice, based on the 1500 map by Jacopo de’ Barbari. While I don’t have as much familiarity with the first two iterations of the project, I was able to see A Portrait of Venice at the Nasher in 2017 and was blown away by what the Wired! Lab was able to do. This project went beyond what was described in our readings this week on spatial history and turned that spatial history into a truly digital and interactive project. To view a map on the computer and to be able to manipulate the data is incredibly helpful for thinking about how important space is in art history, whether that be the space the artist is depicting or the space the artist is living in. To be able to play with a map at large scale and to (almost) experience it is something completely different. This project allowed the viewer to use a touch screen to decide where to go in the map, and it included sounds of what the city may have sounded like as well as additional images. The only thing missing was smell (although considering the way that Venice smells now, I don’t know that I would want to smell it). It created a completely different learning context for the map. Seeing it on the wall would be incredible on its own (the map itself is 5 feet by 10 feet), but you wouldn’t be able to see detail unless the museum decided to include detail shots or you got really close up, and we all know how museums feel about people pressing their noses up against the glass. For example, this view would probably be impossible without the help of digital technology:
The image of the gondola with 4 little stick figures and the larger ship next to it corresponds to the small red box on the larger image above. In the exhibition one was able to zoom into the gigapixel image and explore the map with the annotations made by the team, very similar, on a much larger scale, to the StoryMapJS I put together of the same map.
Additionally, the map itself is from the Minneapolis Institute of Art, which has linked to the gigapixel image on its website and has asked patrons to search the image, find one of the 103 bell towers used to create the image, and then link to information on the bell tower. This of course brings up the idea of crowdsourcing in digital projects and the fascinating power that could be unlocked by using it properly (as websites like Zooniverse do!). This project, like many others of its kind, brings up Johanna Drucker’s question of digitized art history vs. digital art history and recalls what Pamela Fletcher and Anne Helmreich, with David Israel and Seth Erickson, address in “Local/Global: Mapping Nineteenth-Century London’s Art Market”:
In other words, our field has established models for online access and distribution, but lacks robust examples of scholarly interpretation predicated on new modes of analysis made possible by the innovations of the digital age.
Clearly, StoryMapJS and the other digital tools we’ve looked at aid in the organization and learning of specific information. I think, just like annotation, this ability to kinesthetically engage with an object, or to think about an object within the context of a specific space and then actually see that space, activates a completely different part of the brain. Does that qualify as a new methodology? I don’t know, but I do think it allows people to see what may have been available for years in a completely different light.
Then, on the other hand, you have scholars doing incredible projects with AR/VR, digitally rebuilding lost buildings or tracing how buildings changed based on the plans that survive. The video below explores the Accademia of Venice using digital tools.
And this video is an interview with a PhD student at UNC who is working in religious studies and has built VR models of synagogues in order to explore how the buildings were designed in relation to the sky and thus played a very important role in Jewish liturgical activities:
This realm of mapping and spatial history is something that truly could change the way we think about art history. In my own research, I could see how being able to reconstruct the original context of a painting, or perhaps to think about the artist’s studio space, could bring out information that wouldn’t have been available before. While some types of projects are still only accessible to scholars with advanced technical skills, or with the resources to obtain such skills or to work with someone who has them, tools like StoryMapJS point scholars in the right direction. (Speaking of which, if anyone has interest in the rest of the Knight Lab tools, my colleague at the DIL wrote excellent tutorials for all of them.)
In my mind, interactivity and annotations are at the core of using digital humanities to change the way one does scholarship. As someone who has ADHD, I find interactivity, annotations, and frankly any media beyond pure print to be a godsend. While I enjoy reading, and I can get very into some topics, a lot of print media is not quite engaging enough for my brain. As a child, my favorite part of reading was when we were supposed to write on the text, when we were doing “active reading.” Now, as a Master’s student, after almost 7 years of higher education, I have finally determined the best way for me to take in my readings: on an iPad. For a while I tried to print all of my readings out so I could do the same “active reading” I did as a sixth grader, but eventually you run out of printing money, you lose the pages, the pages get wet, you feel terrible about cutting down trees, or you simply forget to print the reading out beforehand. So inevitably I would have to read on my laptop, and while you can add text annotations and highlights in PDF viewers, it doesn’t engage your brain in the same way. I’ve finally (again, after almost 7 years of higher education) determined that the best way for me to do readings for class and retain a lot of what I read is to be able to draw on the page, to add emojis of my reactions, to circle and highlight, and to ask questions in the margins. In experimenting with annotating images and videos this week, I’ve determined that this “active reading” of an image or video is vital for my understanding.
I think we’ve all been on the couch at one point watching a movie or a TV show and been distracted by the steady stream of information that comes from our phones. As we’ve annotated videos this week, I’ve noticed that I’m much more engaged with a video if I know I’m waiting for that annotation. At the end of the movie Music and Lyrics with Drew Barrymore and Hugh Grant, they play an “old” music video with pop-up annotations every so often giving updates or behind-the-scenes information on the song, a riff on VH1’s “Pop-Up Video,” such as this one for “My Heart Will Go On” from Titanic:
While the pop-up sound can get mildly annoying, ultimately your brain is being stimulated by the extra information that is coming up, and it often helps you to understand what you’re seeing and retain it longer. While the video I made below was silly, it was interesting to see how captions can be used to enhance a video (particularly when it involves animals).
Finally, in terms of annotating images, I believe ThingLink works in a similar way to Tropy or my particular favorite, NVivo. While it doesn’t have the same capabilities in terms of comparing metadata or sections of images, it does allow the user to mark areas of an image that are important so that information is not lost. It works in the same way as old photographs that are annotated on the back with things like “Lola Lee and Richard Wedding, 10/8/1960 Colesville, MD.”
These annotations preserve the memory of the physical photo, just as metadata can for a digital image. For images that are not family photos but works of art like the ones we’ve looked at, annotations offer an interesting way to preserve your research. Here, I’ve linked the ThingLink I made with 5 images of art pieces that I’m looking at for my Master’s thesis. The annotations allow me to ask questions of the image that I can think about and continue to ask myself as I go through them, and they let me link out to other research or media. Essentially, they allow you to preserve the memory of your research thoughts, rather than viewing an image, thinking that something is interesting, and then writing it down but not remembering what it means the next time you look at it, or worse, not writing it down at all and forgetting everything you had thought. Furthermore, annotations allow students to actively engage with an image. In a previous post I mentioned a digital project on Bosch’s Garden of Earthly Delights that is essentially a high-res image that has been annotated. When I brought this project into my classroom this spring, it made my students engage with the image more actively than they would have if I had lectured about it. At the end of the class, several mentioned that it was the piece they remembered best because of that active engagement. Annotations and interactivity allow for stimulation in ways that weren’t possible before and allow viewers to more fully absorb the media they’re consuming.
In standard computer programming, iteration is one of the first things you learn and is essential for building a successful program. While Digital Humanities projects (including Digital Collections) may not be your standard program, it is still incredibly important to implement the iterative process. This is clear from Paige Morgan’s advice on how to get a Digital Humanities project off the ground, as much of the advice she gives involves working through your project multiple times. She writes, “Talk to people about your project. Write about it. Apply to give lightning talks and/or conference papers about what you want to do.” This technology-free advice amounts to iterating through your idea multiple times in different formats, getting closer and closer to the end product you’re looking for. (The same is true of an iterative loop in programming: your computer will execute a line of code as many times as it needs to until it reaches the condition you specify.) She also writes, “When you’re building these small prototype versions, be easy on yourself,” and “Know that the platform or tool with which you build your project may change. Don’t commit to one right away. Experiment.” Both of these pieces of advice involve building your project multiple times, experimenting with the platform and the feel of it, and maybe the methodology.
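For anyone who has not seen one, the kind of iterative loop mentioned above might look like the following minimal Python sketch. The task (refining a guess at a square root by repeated averaging) and all of the names are my own choices for illustration; the point is only that each pass starts from the previous result and the loop stops once the answer is close enough.

```python
def refine_until_close(target, guess=1.0, tolerance=1e-6, max_rounds=100):
    """Repeatedly improve a guess at the square root of `target`.

    Returns the final guess and the number of rounds it took.
    """
    rounds = 0
    # Keep going until the guess is good enough (or we hit the round limit).
    while abs(guess * guess - target) > tolerance and rounds < max_rounds:
        # Each round builds on the previous guess rather than starting over.
        guess = (guess + target / guess) / 2
        rounds += 1
    return guess, rounds


if __name__ == "__main__":
    answer, rounds = refine_until_close(2.0)
    print(f"Reached {answer:.6f} after {rounds} rounds")
```

The round limit is there for the same reason a project needs a deadline: without it, a loop whose stopping condition is never met would keep running forever.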
The advice that Morgan gives in her blog is also important to remember when building smaller projects and using out-of-the-box platforms or software. It certainly is true for Omeka and Scalar. From my point of view, Omeka seems more straightforward in terms of learning the tool, but may require more iterations through the actual data or items to determine what it is that you want to portray with your Omeka site. What in your data will be an item? What will be a collection and what will be an exhibit? This summer I worked on the Humanities Moments Project at the National Humanities Center, an exciting project that runs on Omeka. The center describes the project and its mission as follows:
By illustrating the importance of the humanities for people from all walks of life, the project seeks to reimagine the way we think and talk about the humanities.
By highlighting their transformative power, the Humanities Moments project illuminates how our encounters with the humanities fuel the process of discovery, encourage us to think and feel more deeply, and provide the means to solve problems as individuals and as a society.
In the case of this project, the individual moment is categorized as an item. Some of the items are then put into collections that indicate where they came from, for example NHC Staff. The collection is readily visible from the backend but less so from the front: if one clicks on a moment that is in a collection, the collection appears at the bottom near the tags and can be clicked on to see the other moments in it. The exhibits are made up of items and represent a theme such as Teachers, but the items do not necessarily come from the same place (and thus are not a collection). This could have been done a completely different way, and it’s entirely possible that near the beginning of the project it was. It was only through iterating through possibilities and keeping the mission quoted above in mind that they came to this schema.
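One way to picture the item, collection, and exhibit relationship described above is as a small data model. The Python sketch below is purely illustrative: the class names, fields, and sample moments are invented for the example, and it is not Omeka’s actual data model or API.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    """A single moment. It belongs to at most one collection (its source)."""
    title: str
    collection: Optional[str] = None
    tags: List[str] = field(default_factory=list)


@dataclass
class Exhibit:
    """A themed grouping that can draw items from any collection."""
    theme: str
    items: List[Item] = field(default_factory=list)


def items_in_collection(items, name):
    """Return every item that came in through the named collection."""
    return [item for item in items if item.collection == name]


# Made-up sample moments for illustration only.
moments = [
    Item("A teacher who changed how I read", collection="NHC Staff", tags=["teachers"]),
    Item("First visit to a museum", tags=["art"]),
    Item("Learning Latin in school", collection="Public Submissions", tags=["teachers"]),
]

# An exhibit is built by theme, so it cuts across collections.
teachers = Exhibit("Teachers", [m for m in moments if "teachers" in m.tags])

if __name__ == "__main__":
    print([m.title for m in teachers.items])
    print([m.title for m in items_in_collection(moments, "NHC Staff")])
```

Writing the schema out this way makes the design question explicit: a collection records where an item came from, while an exhibit records what it is about, and a different project could reasonably have drawn that line elsewhere.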
The same iterative process applies to Scalar, although in my mind it applies more to learning the platform than it does to the project. Scalar’s interface is slightly more difficult to figure out, and thus, unless you want to read all of the documentation, it is easier to learn the platform by experimenting: try to implement part of a collection in the site, and then continue to work through all of the features that are available to transform that collection.
I don’t feel that I can make a judgement on which of the two platforms we’ve looked at, Omeka and Scalar, is better for a digital collection. I believe it’s truly vital to a digital humanities project, as Paige Morgan says, to iterate through multiple platforms and to always be trying something new to get your project as close as possible to the end goal. Maybe it’ll get to the perfect project, maybe it won’t; ultimately, one can only find out by continuing to iterate.