Digital History

The definition of digital history is fuzzy. The debate is less about its meaning than about the future of the specialization. One side of the dispute sees digital history as something that will become universal among historians, given the growing importance of computers in their daily life as scholars. The natural consequence of this prediction is that every historian will be a digital one, and therefore a special category of scholars will be meaningless. On the opposite side are those who, much more critically, see digital historians as the current iteration of cliometrics, an old attempt to introduce complex statistical analysis into the interpretation of the past. Many historians welcomed the demise of this failed attempt to make historical analysis more scientific. They often still look warily at digital historians, perceiving them as the resurrection of an old enemy.

While I toast the failure of the past cultists of cliometrics, I also appreciate it when historians make the effort to show tables of data, maps, and infographics. These tools never disappeared, thank goodness, and they often illustrate the good scholar's striving for detail and clarity. Indeed, every decade or so a new generation of historians concocts new methods of interpretation or a new focus. The trend then follows its natural cycle: it appears brilliant and innovative, then becomes mainstream, and eventually proves ineffective or even detrimental to the progress of the discipline.

As I see it, digital history is neither the basic education of every future historian nor the latest ephemeral trend. I cannot imagine that many future historians will be proficient programmers or will use complex statistical analyses. Too many students escape to the humanities because of their complete aversion to numbers. At the same time, just as few scholars still cling stubbornly to pen and paper now that word processors are their main tools, the increasing importance of digital data will influence the historian’s craft more and more. Online archives and databases, together with the powerful new tool of artificial intelligence, will force at least a basic education in how to deal critically with the new sources: a refinement of hermeneutics for digital data.

During the research for my PhD I had a perfect practical example of how urgently a new understanding of digital archives is needed. The website of the Imperial War Museum (IWM), the main source of my research that was not an online database, had serious issues. I was looking for the oral interviews of First World War veterans, and the detailed descriptions of the content of each reel were my sole salvation from the impossible task of listening to all the recordings. Some of these were many hours long, and the interviews numbered around seven hundred. This answered my original question of why I could not find research that used these oral sources extensively. The problem with the IWM’s website was that if I searched for a word such as “knife,” the only results were the items with the word in certain fields, such as the title or the type, but not in the description of the reel as it appeared on the item's page. This hampered my research and required an efficient solution.

Python programming was the answer. I ran a search for all the items of the type “oral interview” that were relevant to the Great War, saved the few pages of results, and extracted with a script all seven hundred links to the actual items. With the list of all the relevant items, I could then write another script that downloaded the HTML page of one item every few seconds and saved it in a local folder. Pacing the downloads and keeping local copies prevented an abuse of the server's resources, because from then on I could work directly on my own data. The third and final step was to write another script that analyzed all the pages, extracted all the data that I wanted, and organized it in a nicely formatted txt file. Once I had the file I was ready to roll: I could search the txt file directly, find the specific reel of an interview that discussed what I was looking for, open the item's page on the IWM website, and skim through the audio to the point where my subject was discussed. This new tool allowed me to use a source that was previously impenetrable, and to address the content of the database in a completely new way that integrated well with my writing process. If I was writing and needed to find a source, I could take a ten-minute pause, do my search, transcribe the conversation, and jump back into writing.
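
The three steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the original scripts: it uses only the Python standard library, it assumes that item links can be recognized by a hypothetical "/item/" marker in their URL, and, because the layout of the item pages is not described here, it simply keeps all the visible text of each page instead of picking out specific fields. All function names, file names, and the five-second delay are invented for the example.

```python
import re
import time
import urllib.request
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import urljoin

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag whose URL contains a marker."""
    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url, self.marker, self.links = base_url, marker, []

    def handle_starttag(self, tag, attrs):
        href = dict(attrs).get("href")
        if tag == "a" and href and self.marker in href:
            url = urljoin(self.base_url, href)
            if url not in self.links:  # keep page order, drop duplicates
                self.links.append(url)

def extract_item_links(results_html, base_url, marker="/item/"):
    """Step 1: pull the item links out of a saved results page.
    The "/item/" marker is a hypothetical URL pattern."""
    collector = LinkCollector(base_url, marker)
    collector.feed(results_html)
    return collector.links

def fetch_page(url):
    """Download one page as text (plain urllib, no retries)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")

def download_items(links, out_dir, delay=5.0, fetch=fetch_page):
    """Step 2: save each item page locally, pausing between requests."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    for number, url in enumerate(links, start=1):
        target = out_dir / f"item_{number:04d}.html"
        if target.exists():  # already saved: do not hit the server again
            continue
        # The first line records where the page came from.
        target.write_text(f"<!-- {url} -->\n" + fetch(url), encoding="utf-8")
        time.sleep(delay)

class TextExtractor(HTMLParser):
    """Gather visible text, skipping the contents of script and style."""
    def __init__(self):
        super().__init__()
        self.chunks, self._skip = [], 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(" ".join(data.split()))

def build_text_file(in_dir, txt_path):
    """Step 3: merge the saved pages into one searchable text file."""
    with open(txt_path, "w", encoding="utf-8") as out:
        for page in sorted(Path(in_dir).glob("item_*.html")):
            first, _, html = page.read_text(encoding="utf-8").partition("\n")
            url = first.replace("<!--", "").replace("-->", "").strip()
            extractor = TextExtractor()
            extractor.feed(html)
            out.write(f"URL: {url}\n" + "\n".join(extractor.chunks) + "\n\n")

def search(txt_path, word):
    """Return the URLs of the records whose text contains the whole word."""
    pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
    hits = []
    for record in Path(txt_path).read_text(encoding="utf-8").split("\n\n"):
        header, _, body = record.partition("\n")
        if pattern.search(body):
            hits.append(header[len("URL: "):])
    return hits
```

In use, one would call extract_item_links on each saved results page, pass the combined list to download_items once, run build_text_file, and from then on query only the local file, for example with search("items.txt", "knife"). Because pages that already exist locally are skipped, an interrupted download can be resumed without requesting the same page twice.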

Looking at the incredible number of sources online today, and imagining an even greater number in the future, it is easy to understand that historians will need some basic grasp of how to use them efficiently and critically. However, this does not mean that every historian will be a programmer. We can already see how programs such as Tropy and Zotero are starting to meet the needs of researchers. They are tools that give the scholar better organization for exploring both physical and online archives. Still, some historians will be programmers, or at least will approach their research in a more technical manner. Some will use their skills to create maps and infographics; others will use databases and numerical analysis. These specialists will probably remain a subset eyed with suspicion by the somewhat Luddite common historian, but I doubt that they will disappear like a passing trend. Therefore, in the debate on the future of digital history, I think that both analyses are somewhat correct.