I was recently at a conference where someone said to me (perhaps facetiously; I’m still not sure), “Heather doesn’t accept ignorance of the digital humanities”, at which I was a little bit outraged: we had just spent two days with digital and non-digital scholars discussing ways forward with the EEBO project, half of which was spent on ways that we could continue to integrate digital tools into the teaching and use of EEBO. Yes, I had a handle on what some of these tools are – but that certainly doesn’t make me an expert. There are a lot of things I don’t know about.
This was apparently unconvincing. Upon further reflection, I guess I shouldn’t have been so indignant, because I’ve been using digital tools for long enough to be over the initial wow-factor of them. I come to the digital humanities from a background in corpus linguistics, which has been conducting large-scale projects on text analysis (or text mining, or whatever you want to call it) for quite a long time. So, no, I’m not really floored by the sorts of things you can do with a collection of textual samples and a computer. There’s a lot more you can do with text and computers, but for now I’m going to stick to text mining, as it’s what I know best.
Corpus linguistics comes with its own guidelines for good practice and its own debate of theory vs methodology, which might sound familiar to anyone following the digital humanities debates at the moment. But that takes me away from my original point: I’m no expert in digital methods. I’m just informed because of my background in corpus studies. Part of becoming informed comes from following the “right” people on Twitter and attending the “right” conferences, I suppose. I can produce an exhaustive list of Interesting People on Twitter Talking About Digital Approaches To Things. But as with any other theory or methodology (depending on where you’re approaching digital humanities and/or corpus linguistics from), you need to be well-versed in how the approach works. Let me put it this way: I wouldn’t go to a neuropsychology conference and talk about brain scans simply because I’ve had an MRI done once. I understand what my MRI did (took pictures of my brain) but I have no idea how to interpret the results (This is my brain! Cool!).
A lot of people tell me, when we start talking about digital tools or digital humanities or textual analysis, that they’re “way over their heads here”. Most recently @ProfessMoravec and I have been having some really productive conversations about getting started in the digital humanities as an absolute beginner. She currently maintains an excellent blog which is chronicling her explorations in topic modeling and text analysis, and I’m reminded of what my friend Kat has to say about interdisciplinary scholarship – namely that you need to be aware of your field(s) and allow it to shape your research in ways you hadn’t expected.
As I see it, there are two ways to do this, and both are equally important.
1. Read up on the field.
As scholars we are trained to research and inform ourselves. When I was getting started in corpus linguistics, I spent a lot of time reading about corpus studies. Digital humanities isn’t always corpus linguistics and vice versa, but I strongly suggest that anyone interested in text mining read Corpus, Concordance, Collocation and Trust the Text: Language, Corpus and Discourse, both by John McHardy Sinclair, as well as A Companion to Digital Literary Studies (Siemens and Schreibman 2008) and A Companion to Digital Humanities (Schreibman, Siemens and Unsworth 2004). You will feel out of your element and over your head – but that’s probably how you felt when you first started reading Butler or Derrida, and they’re second nature to you now, right?
2. Try some tools and work out whether or not they’re useful for what you want to do.
As Ben Schmidt has pointed out, when you have a MALLET, everything looks like a nail. He’s right, of course: you can get a digital tool to do almost anything. The point here is that there are loads of tools out there – even for something as simple as concordancing I can think of about six different tools off the top of my head, and not all of them are going to be suitable for your purposes.
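To give a sense of what a concordancer does at its very simplest, here is a minimal keyword-in-context sketch in Python. The function name `kwic`, the crude regular-expression tokenizer and the sample sentence are all assumptions made for illustration; this is not how any particular concordancing tool works, and real tools handle spelling variation, markup and much else.

```python
import re


def kwic(text, keyword, width=3):
    """Return keyword-in-context lines for every hit of `keyword`.

    `width` is the number of tokens shown on each side of the hit.
    Tokenization is deliberately naive: runs of letters and apostrophes.
    """
    tokens = re.findall(r"[A-Za-z']+", text)
    lines = []
    for i, token in enumerate(tokens):
        # Case-insensitive match on whole tokens only.
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}")
    return lines


# Illustrative sample text, not taken from any particular corpus.
sample = "To be, or not to be, that is the question."
for line in kwic(sample, "be"):
    print(line)
```

Even a toy like this forces decisions (what counts as a word? how much context?) that shape what you see, which is one reason different tools can give different answers to the same question.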
A digital tool is just that: it’s a specific interpretation of a set of data. Expect it to not do what you think it’s going to do, expect it to be “wrong” and try to figure out why it’s doing what it is. Most importantly, expect using it to be much more difficult than you had originally anticipated! None of us woke up one morning knowing innately how to use these tools: it took trial and error and lots of frustration.
Should you use digital tools for something you can easily do in more “traditional” literary scholarship? Maybe. Distant or scalable reading (as the digital humanities people call what is essentially a form of corpus studies) is great when you have something too big to see up close at the level of linear reading. One of the most difficult things about digital studies of anything is accepting limitations: the sky’s the limit, right? It’s tempting to just get everything and hope for the best. I’ve found it’s helpful to start with something you do know quite well rather than blindly dumping all of Early Modern London plays into a computer and hoping to understand the results, which is why I’ve been starting with Shakespeare and branching out to all of Early Modern drama. Can I read and analyze 36 plays? Sure. Can I read 400 plays and compare them to each other? Realistically, no.
If the process of using digital tools will help you uncover something you wouldn’t have noticed in the first place, then, yes, it is worthwhile. In linear reading, it’s hard to find things that aren’t there, and digital tools can certainly help us uncover them. They’re not here to make our lives easier, but to guide us to more nuanced questions that we might not have been able to answer before.