In my new job as Digital Scholarship Fellow in Quantitative Text Analysis, I’m starting to work with students and faculty in the Liberal Arts on the hows and ways of counting words in lots of texts. This is a lot of fun, especially because I get to hear a lot about what excites different scholars from across different disciplines and what they think is fascinating about the data they work with (or want to work with).
One thing that is kind of strange about my job – and there are several aspects of my job that have required some adjustment – is that my background is broadly in corpus linguistics and English literature, so I don’t always think of the work I do as being explicitly “DH”. These distinctions are quite frankly boring unless you are knee-deep in the digital humanities, and even then I am not convinced it is an interesting discussion to have. Ultimately, people have lots of preconceived notions about what DH is and why it matters. I suspect that different disciplines within the Humanities writ large have different ideas about this too – certainly the major disciplines I cut across (English, linguistics, history, computer science, sociology) have very different perspectives on the value and experience of digital scholarship. And of course, doing “digital” work in the humanities is kind of redundant anyway: we don’t talk about “computer physics” or “test tube chemistry”, as Mike Witmore and others have pointed out.
Being mindful of this, I have acquired a few rules for doing digital scholarship over the years, and I find myself saying them a lot these days. They are as follows:
1. “Can you do X” is not a research question.
The answer to “can you do X” is almost always “yes”. Should you do X? That’s another story. Can you observe every time Dickens uses the word ‘poor’? Of course you can. But what does it tell you about poverty in Dickens’ novels? Without more detail, this just tells you that Dickens uses the word ‘poor’ in his books about the working class in 19th century Britain — and you almost certainly didn’t need a computer to tell you that. But should you observe every time Dickens uses the word ‘poor’? Maybe, if it means he uses this over other synonyms for the same concept, or if it tells us something about how characters construct themselves as working-class, or if it tells us how higher status characters understand lower-status individuals, or whatever else. These are all research questions which require further investigation, and tell us something new about Dickens’ writing.
2. Programming and other computational approaches are not findings.
So you have learned to execute a bunch of scripts (or write your own scripts) to identify something about your object of study! That’s great. Especially if you are in the humanities, this requires a certain kind of mind-bending that requires you to think about logic structures, understand how computers process information we provide, and in some cases overcome the deeply irregular rules which make your computer language of choice work. This is hard to do! You deserve a lot of commendation for having figured out how to do all of this, especially if your background is not in STEM. But – and this is hard to hear – having done this is not specifically a scholarly endeavour. This is a methodological approach. It is a means to an end, not a fact about the object(s) under investigation, and most importantly, it is not a finding. This is intrinsically tied to point #1: Should you use this package or program or script to do something? Maybe, but then you have to be ready to explain why this matters to someone who does not care about programming or computers, but cares very deeply about whatever the object of investigation is.
3. Get your digital work in front of people who want to use its findings.
Digitally inflected people already know you can use computers to do whatever it is you’re doing. It may be very exciting to them to learn that this particular software package or quantitative metric exists and should be used for this exact task, but unless they also care about your specific research question, there is a limited benefit for them beyond “oh, we should use that too”. However, if you tell a bunch of people in your specific field something very new that they couldn’t have seen without your work, that is very exciting! And that encourages new scholarship, exploring these new issues to those people your findings matter most to. You can tell all the other digital people about the work you’ve done as much as you want, but if your disciplinary base isn’t aware of it, they can’t cite it, they can’t expand on your research, and the discipline as a whole can’t move forward with this fact. Why wouldn’t you want that?