Author: heatherfroehlich

@heatherfro // https://hfroehli.ch/

Counting things in Early Modern Plays So You Don’t Have To: Type/Token Ratios

If you’re just joining me, I’ve been working on word frequencies of six highly-prototypical lexical items in a corpus of slightly less than 400 Early Modern London plays. I recommend starting with my research notes and then looking at some quick & dirty results.

As I noted in my quick & dirty results, these numbers hadn’t been normalized in any way: it was all raw data. In an effort to move beyond just raw data, I compiled the total number of words in each play in the corpus. I initially was interested in how play length might vary over time in my corpus, so I graphed that. The bulk of my plays are from the early 1600s, as you can see:

play length

Overall, plays do seem to get longer until about 1600, at which point they start to get shorter again. 1662 looks to be an outlier here, as the plays in a straight line on the far right-hand side are mostly by Margaret Cavendish. (I am currently trying to figure out how to color my graphs by author, so if you have advice on that, please let me know: I’m rather haphazardly teaching myself to graph in R as I go.)

OK, so I have the total number of tokens in each text. What if I treated every instance of my prototypical lexical items as a specific type, and plotted them as type/token ratios? Type/token ratios have a bit of a messy history in corpus linguistics, as they’re mostly used to calculate vocabulary density (Type/Token Ratios: what do they really tell us?, Richards 1987 [pdf]), but this would show a ratio of the raw frequency of each lexical item of interest in each play compared to the length of each play, which would normalize my data a bit.
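For readers who want to try this kind of normalization themselves, here is a minimal sketch in Python (the graphs in this post were made in R). The function name and the example numbers are invented for illustration and are not figures from the corpus; scaling to hits per 1,000 words is one common choice, not necessarily the one used for the graphs here.

```python
def relative_frequency(hits, total_tokens, per=1000):
    """Return how often a term occurs per `per` tokens of a play."""
    if total_tokens <= 0:
        raise ValueError("total_tokens must be positive")
    return hits * per / total_tokens

# Hypothetical play: 150 hits for "lord" in a 20,000-word text.
print(relative_frequency(150, 20000))
```

Dividing by play length means a long play and a short play can be compared on the same scale.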


type/token ratios

First of all, it’s notable that the lexical-frequency-to-play-length ratios make some pretty clear bell-curve shapes; I haven’t tried to calculate standard deviations of play length. (I suppose I could do that next.) The average length of an Early Modern London play in my corpus was 22086.5 words.
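Calculating the mean and standard deviation of play length takes only a few lines. This is a sketch using Python’s standard library; the word counts below are made up for illustration and are not taken from the corpus.

```python
from statistics import mean, stdev

def summarize_lengths(lengths):
    """Return (mean, sample standard deviation) of a list of word counts."""
    return mean(lengths), stdev(lengths)

# Made-up word counts for four plays, for illustration only.
example_lengths = [18000, 21000, 23000, 26000]
avg, sd = summarize_lengths(example_lengths)
print(f"mean = {avg:.1f}, sd = {sd:.1f}")
```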

It seems that as plays get longer, they’re more likely to use man (and, to some extent, wom*n) in ways that are not true for lord/lady and knave/wench. It’s also worth looking at scales here: there are nearly twice as many lords as ladies, although man/woman and knave/wench are more comparable. Also, there are way fewer instances of knave and wench in my corpus overall, which suggests that maybe these words are not nearly as popular as we might like to think.

Counting things in Early Modern Plays So You Don’t Have To: Some Quick & Dirty Results

I was given a corpus of 400 plays for my PhD on gender in Early Modern London plays. Up to this point I had been focusing largely on Shakespeare, but I have recently been moving into the larger corpus. So what does one do with 400 plays? My solution was “get to know them a little bit.” I counted the raw frequencies for lord/lady, man/wom*n, and knave/wench in the entire corpus using AntConc, manually recording them and then transcribing this data into a spreadsheet. I had selected these terms on the basis that I had recently spent a lot of time looking at likely collocates for these terms, as these binaries represent a high-, neutral-, and low-formality distinction.

Several of my twitter followers asked why I was just looking at wom*n and not also m*n, and the answer is that without a regular expression I was going to get a fair quantity of noise from m*n (including but certainly not limited to man, men, mean, moon, maiden, maintain, morn, mutton…). Wom*n, I had found, was a highly successful use of a wildcard, only picking up woman and women in the corpus. While this category remains somewhat imbalanced, it presents a pretty clear scope of the quantities for more neutral forms. Now that I have a better sense of what my corpus is like beyond “those files in that folder on my computer”, I can always go back and get other information pretty easily.
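The difference between the noisy wildcard and a regular expression can be sketched in Python. This assumes that the * wildcard stands for “zero or more characters”, which is how the searches above appear to behave; the helper name wildcard_to_regex is made up for this example and is not part of AntConc.

```python
import re

def wildcard_to_regex(pattern):
    """Turn a simple wildcard pattern into a whole-word regex,
    assuming * stands for zero or more characters."""
    parts = (re.escape(p) for p in pattern.split("*"))
    return re.compile("^" + ".*".join(parts) + "$")

words = ["man", "men", "mean", "moon", "maiden", "maintain",
         "morn", "mutton", "woman", "women", "wench"]

loose = wildcard_to_regex("m*n")    # the noisy wildcard search
strict = re.compile(r"^m[ae]n$")    # a regular expression for man/men only
wom = wildcard_to_regex("wom*n")    # the well-behaved wildcard

print([w for w in words if loose.match(w)])
print([w for w in words if strict.match(w)])
print([w for w in words if wom.match(w)])
```

The first search picks up everything from mean to mutton, while the character class [ae] restricts the match to man and men.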

What can we learn from a corpus of 400 plays?
For starters, there’s not actually 400 plays in the 400-play-corpus, but 325 plays. I knew when I started this project that this corpus was less than 400, and that it did not cover everything. It is a representative corpus, but I was a bit surprised at how much less than 400 plays I actually had. These 325 plays cover 53 individual authors from the years 1514-1662,* which looks like this:

dates

Each dot represents a year of publication. You will note that some authors are more represented than others (Shirley, for example, has 33 plays in the corpus, spanning a number of years, whereas someone like Beza has only one play in the corpus). The average year of publication was 1613, and an overwhelming majority of these plays were published from the late 1500s into the first half of the 1600s.

Once I had the raw frequencies for everything, I was curious to see how these terms performed diachronically. For ease I’m going to keep calling it the 400-play corpus, and as you’re reading, remember that this is very quick & dirty. There’s a lot more to say & do with this data, but I think talking about raw data is a useful endeavor in that it speaks volumes about the sample itself.

lady lord (diachronic)-1

man woman diachronic-1

These graphs suggest that lady and wom*n look more frequent in the corpus from the late 1500s onwards (they’re both almost in a parabola shape), whereas the use of lord and man begins to decline around 1600, creating more of a bell curve effect.

And what about knave and wench? We see there’s a distinct decrease in usage for both just after the early 1600s, though knave was more frequent earlier in the corpus:
knave wench diachronically-1

Two of these three sets of binaries show very similar graphs, but that’s because this is raw data: there’s simply more instances of plays occurring around the late 1530s onwards.

This was my first time using R for any graphing ever, so I’m going to dive back in and see what I can do with a more normalized corpus next.


Additionally, I owe a great debt to the following people, who were very selfless and helpful:
Sarah Werner, Julia Flanders, Shawn Moore, Douglas Clark, and Simon Davies. Thank you.

Counting gender-specific nouns in 400 plays so you don’t have to: research notes

Those of you following me on Twitter will have noticed I’ve been tweeting bite-sized facts about gender in Early Modern London plays. Here are some research notes on what I’ve been doing.

The Corpus.
I have a corpus of ~400 Early Modern London plays, culled from EEBO by someone who is not me, spanning from 1514 to 1662. This almost certainly does not cover every play written in that time, nor does it cover variant editions of these plays. This is meant to be a largely representative corpus: I have all major playwrights, a number of minor ones, and most (but importantly not all) plays written by them; one edition per play. These files have been labeled by canonical generic description (eg comedy, tragedy, history, tragicomedy), year of publication, abbreviated author surname, and a truncated version of the title. All of this metadata has been collected from EEBO, again by that same someone who is not me.

The files themselves have had everything but the words said by characters stripped out. There are no headers (no scene/act denotations) and no character markers. Each word is on its own line, and all spelling has been modernized. Here is a sample, from Kyd’s The Spanish Tragedy:
The Spanish Tragedy
This, you will note, is not ideal for reading by human eyes. But computers can do some wonderful things with this format.
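To give a sense of what a computer can do with the one-word-per-line format, here is a minimal Python sketch that counts the total tokens and the hits for each search term. The sample text is invented, not a quotation from any play in the corpus, and matching exact word forms only (so lord but not lords) is an assumption of this example.

```python
from collections import Counter

TERMS = ("lord", "lady", "man", "woman", "women", "knave", "wench")

def count_terms(text, terms=TERMS):
    """Count total tokens and hits for each term in a one-word-per-line text."""
    tokens = [line.strip().lower() for line in text.splitlines() if line.strip()]
    counts = Counter(tokens)
    return len(tokens), {term: counts[term] for term in terms}

# Invented sample, not a quotation from any play in the corpus.
sample = "my\nlord\nthe\nlady\nand\nthe\nlord\nare\nhere\n"
total, hits = count_terms(sample)
print(total, hits)
```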

I’ve been sorting these files into separate folders by author, to get a sense of how many and which plays by which authors I have in my corpus. This is, quite simply, a little more manageable than a running list of plays sorted by genre & date. It also gives me a larger sense of when these authors are working, what generic kinds of plays I have for them, and allows me to have the flexibility to group them in a variety of ways (playhouses associated with specific playwrights, authors who are contemporaries, etc) later on.

My present goal is to get counts of how many times the words lord/lady, man/wom*n, and knave/wench appear in each play in my corpus. Part of the reason I’ve chosen these terms is that they represent a shift from high – neutral – low formality while retaining gender-specific contexts. I could have chosen other ones: I’ve been looking at collocational patterns in Shakespeare using these terms (here are the relevant slides, .pdf) and wanted to get a sense of how these terms are represented in the larger corpus before I do anything else.

I consider this “getting to know my plays” because I’ve been reading as many of these plays as I possibly can, but I have several disadvantages here:
1. I can remember what many of these plays are about, but not the fine level of detail the computer can pull out for me.

2. Some of these plays are very hard to find in print (and, as I’ve shown, they are not in an ideal format for reading). My university no longer subscribes to EEBO, so I don’t actually have access to the original full-text files.

Getting Data on 400 Plays and What To Do With It.
I’ve been running the plays through a concordance program called AntConc to get a visualization of where and how many of these terms appear in each author’s subcorpus. Here’s what Dekker looks like in AntConc’s Concordance Plot viewer:
Screen shot 2013-04-28 at 2.22.44
Each black line represents one instance of the search term, and is visualized in a linear way (so, from the beginning to end of each play). This is useful in that the software will give me a number of hits in each play AND shows me where these words appear in the play-texts. For The Honest Whore, Part 1, there’s a few instances of “lady” all at once, at the beginning of the play, a few scattered in the middle, another small clump (probably representing a conversation) in the middle, and a few sparse other instances toward the end of the play. I’m doing this mostly to get a sense of where these highly salient words appear and don’t appear in ways that are very hard to keep track of when you’re reading 400 plays in a traditional, linear fashion. These are words you’d (presumably) expect to find in Early Modern plays, so you’re not really paying much attention to them as a reader.
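The idea behind a plot like this can be sketched in a few lines of Python: each hit is placed at its relative position between the start (0.0) and the end (1.0) of the text. This is only an illustration of the principle, not a description of how AntConc itself computes its plot; the function name and the toy token list are made up.

```python
def hit_positions(tokens, term):
    """Return each hit's relative position in the text, from 0.0 (start) to 1.0 (end)."""
    last = max(len(tokens) - 1, 1)
    return [i / last for i, tok in enumerate(tokens) if tok.lower() == term]

# Toy text with "lady" at the start, the middle, and the end.
tokens = "lady a b c d lady e f g h lady".split()
print(hit_positions(tokens, "lady"))
```

Hits that bunch together around similar values correspond to the clumps of black lines in the plot.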

I record this data by hand in a notebook by author and then manually copy the information into a csv file. While it would be great to essentially have a spreadsheet of all of this information automatically produced, spreadsheets are also not particularly well-designed for human eyes to read. Eventually this will turn into a very nice graph, I’m sure, but in this format, it’s hard to make much sense of it all:
Screen shot 2013-04-28 at 2.38.55

This is admittedly a little easier:
Scan 4

There is an easier way to do this for my entire corpus at once in R and – presumably – Python, but quite frankly that would become information overload very quickly. So while some of you more computational people may be wondering why I’m moving at such a seemingly glacial pace, the answer is “because I want to be comfortable with the data and familiar in a way that allows me to think and reflect on it as it comes”, rather than having it all at once. I want to get to know my corpus a little bit more first. Eventually, I’ll be moving into R with this data – but not yet.
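For the more computational readers, the all-at-once version might look something like the following Python sketch. It assumes one play per .txt file in a single folder, one word per line, and exact matching of word forms; the function names, the folder layout, and the csv column names are all hypothetical.

```python
import csv
from collections import Counter
from pathlib import Path

TERMS = ["lord", "lady", "man", "woman", "women", "knave", "wench"]

def corpus_counts(folder, terms=TERMS):
    """Yield one row of counts per .txt file in `folder` (one word per line)."""
    for path in sorted(Path(folder).glob("*.txt")):
        lines = path.read_text(encoding="utf-8").splitlines()
        tokens = [t.strip().lower() for t in lines if t.strip()]
        counts = Counter(tokens)
        yield {"file": path.name, "tokens": len(tokens),
               **{t: counts[t] for t in terms}}

def write_counts(folder, out_csv, terms=TERMS):
    """Write the per-play counts to a csv file and return the number of rows."""
    rows = list(corpus_counts(folder, terms))
    with open(out_csv, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=["file", "tokens", *terms])
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Calling write_counts on a folder of plays would produce one row per play, which is the spreadsheet described above without the notebook stage in between.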

When I’m done I will be making the csv file available, and will hopefully be posting a write-up here. Thanks for your patience. In the meantime, here’s the csv file for all of Shakespeare (from the Globe Shakespeare, 1841) organized by genre (comedy, history, late plays, tragedies).

Does Shakespeare pass the Bechdel Test?

The Bechdel Test is a measure of how male and female characters are portrayed in cinema and other media. A piece passes the Bechdel test if it:

a) has at least two women in it
b) who talk to each other about something besides a man.

That’s it. Pretty simple, right? Not a lot of contemporary media passes the Bechdel test, rather alarmingly. While I was working out proportions of male and female characters in Shakespeare, I got a number of questions about whether or not Shakespeare will pass. I went looking to see if anyone else had approached this question before. Someone has, but at the time of writing this, their website is down for maintenance.

I have already shown that all of Shakespeare’s plays have 2 or more female characters. But what about “talking to each other about something other than a man”?

I began by searching in WordHoard for all examples of characters with the gender of female who use the lemma form she. In essence I am doing this analysis backwards: I’m asking if there are female characters who talk about something other than a man, then seeing if plays which pass this aspect of the test also feature a female character talking to another female character. If a male character was referred to within a window of 7 words to the left or right, in a way indisputably linking the discussion about the female character to the male character, the play has failed this part of the test.

WordHoard highlights the place in the play where each instance of the lemma she appears; these examples can be cross-referenced by clicking each individual example to call them up in the context of the play by act and scene.
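The window check can be roughly approximated in Python. This sketch only flags whether a name from a list of male characters falls within 7 tokens of a given word; deciding whether that mention “indisputably” links the discussion to the male character still takes a human reader. The function name and the list of male terms are assumptions made for this example.

```python
def male_reference_near(tokens, index, male_terms, window=7):
    """Return True if any term in `male_terms` falls within `window` tokens
    to the left or right of the token at `index`."""
    start = max(0, index - window)
    context = tokens[start:index] + tokens[index + 1:index + window + 1]
    return any(tok.lower().strip(".,;:?!") in male_terms for tok in context)

line = "Why should she write to Edmund ?".split()
she_at = line.index("she")
print(male_reference_near(line, she_at, {"edmund"}))
```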

King Lear, for example, fails, with “Why should she write to Edmund?” (IV.v.19)
Screen shot 2013-04-03 at 8.45.57

Titus Andronicus might pass the first part of the test, though:
Screen shot 2013-04-03 at 9.14.22
These examples do not show any female character talking about another female character in explicit reference to a man. Male characters (lords) are alluded to, but I read them as not being directly linked to the newborn baby the Nurse speaks to Aaron about – though you may disagree.

The first cull – do female characters in Shakespeare talk about something other than a man? – left me with the following plays:
Winter’s Tale, Pericles, Macbeth, 2 Henry 6, King John, 2 Henry 4, 1 Henry 6, Tempest, Henry 5, and I’m going to include Titus Andronicus.

1 Henry 4, Richard 2 and Julius Caesar had no examples of the lemma form she, so I will address them here as well.

The next question is “do female characters talk to other female characters in the play?”
Open Source Shakespeare allows you to isolate characters’ speeches by name – and gives you the option to show cue speeches and the ability to see these speeches in the context of the play. They have been linked where appropriate.

The Winter’s Tale does not pass the test. Although Emilia and Paulina are talking to each other, they are talking about the king in Act 2 Scene 2.

Pericles does not pass the test: Leonine and Marina are talking to each other, but about Marina’s father (scroll up just slightly from where this link will take you) in Act 4 Scene 1.

Macbeth does not pass the test either, as The Gentlewoman talks about Lady Macbeth, but to the Doctor, who is presumably male, in Act 5 Scene 1.

2 Henry 6 does not pass the test, as the female characters do not talk to each other.

King John does not pass, because of an interchange between Constance and Queen Elinor in Act 2 Scene 1, in which they discuss John, Elinor’s son.

2 Henry 4 also does not pass, for two reasons: one, this interchange between Lady Northumberland and Lady Percy has them talking about the King in Act 2 Scene 3, and two, because of this interchange between Doll Tearsheet and Hostess Quickly from Act 2 Scene 4, in reference to Pistol.

1 Henry 6 does not pass the test because the female characters do not talk to each other.

The Tempest also does not pass the test because the female characters do not talk to each other. (I am considering Ariel a female character here; this is still very much up for debate, and this may automatically disqualify The Tempest overall.) Miranda and Ariel are not in conversation.

Henry V does pass the Bechdel Test, due to this discussion (in French) between Katherine and Alice from Act 3 Scene 4.

Titus Andronicus ultimately does not pass the test due to this conversation between Tamora, Lavinia and Bassianus in Act 2 Scene 3.

1 Henry 4 does not pass because the female characters do not talk to each other.

Richard 2 passes because the Queen and her ladies “are carefully not talking about Richard” as @angevin2 kindly points out; they are instead talking about garden sports in Act 3 Scene 4.

Julius Caesar does not pass because the female characters do not talk to each other.

By and large, Shakespeare does not pass the Bechdel test: but two plays do – and it’s not the plays I ever would have expected. However, I should point out I might be wrong here: like I said above, I did this backwards, by finding plays that had female characters talking without mentioning male characters, then checking to see if these plays did show two female characters in conversation. If you have a better solution for finding out if Shakespeare passes the Bechdel test, I am all ears!

EDIT (18 June 2015)

Some recommended further reading:
Selisker, Scott. (2014) “Literary Data and the Bechdel Test“, from the What Is Data in Literary Studies? colloquy, Modern Language Association annual meeting, Chicago, IL.

Mariani, Daniel. (2013) “Visualizing The Bechdel Test“. Ten Chocolate Sundaes blog post, 24 June 2013.

Agarwal et al (2015) “Key Female Characters in Film Have More to Talk About Besides Men: Automating the Bechdel Test“. Human Language Technologies: The 2015 Annual Conference of the North American Chapter of the ACL, pp. 830–840, Denver, Colorado, May 31 – June 5, 2015.

How much do female characters in Shakespeare actually say?

Recently I suggested there might be 147 female characters in Shakespeare. If we are to trust that, how do they break down by play? I used the Open Source Shakespeare genre distinctions to categorize each play and the female-character categorizations from WordHoard to produce the following:
Screen shot 2013-02-17 at 9.16.43
In this graph, green represents comedy, black represents history, and red represents tragedy. As you will recall from my previous post, The Winter’s Tale has the most female characters, and 1H4, Julius Caesar, and Tempest have the fewest female characters.

17 out of 37 plays have four female characters. This makes sense, as the Early Modern theatre could hire two boys to cover all female roles, although this would obviously limit the characters who could then speak to each other. More female characters required either more boys, or for each boy-actor to take on more parts (which would again limit the amount these characters could speak to each other).

But how much do these characters talk? Or, in other words, how much of each play is made up of words said by female characters? To do that, I’d first have to find how many words were in each play, and how many of those words were said by female characters. I already had made note of how many words were said by female characters in each play from my previous post, but I didn’t have the total number of words in each play.

I returned to WordHoard’s find words function to get a word-count according to the software’s own encoded edition of each play:
Screen shot 2013-02-19 at 2.12.20
With this information, I was now able to produce the following graph. Again, green represents comedy, black represents history, and red represents tragedy; the shape of each mark on the graph represents how many female characters are in each play:

Screen shot 2013-02-19 at 3.34.54

Female characters in As You Like It say the most out of all the female characters in Shakespeare (but that number includes Rosalind/Ganymede) with 8,643 words spoken out of 21,298 total words in the play. Female characters in Timon of Athens say the least, with 61 words out of 17,744 total words in the play. On the whole, while there may be slightly more female characters in comedies, the number of words they actually speak is highly variable, whereas the histories seem to show the least amount of variation. I had also taken the average of all female characters in each genre and found that comedies had an average of 4.07 female characters; histories, an average of 4.083 female characters; and tragedies had an average of 3.72 female characters – suggesting that the history plays may be the most stable out of the three categories for female characters, which is interesting. If you are interested in which female characters say the most words, please click here for the relevant image.
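The proportions behind these two extremes are simple to work out. As a small Python sketch using only the word counts quoted above (the function name is made up for this example):

```python
def share_of_words(female_words, total_words):
    """Return the percentage of a play's words spoken by female characters."""
    return 100 * female_words / total_words

# Word counts quoted above for the two extremes.
plays = {
    "As You Like It": (8643, 21298),
    "Timon of Athens": (61, 17744),
}
for title, (female, total) in plays.items():
    print(f"{title}: {share_of_words(female, total):.1f}%")
```

On these figures, female characters speak roughly two fifths of As You Like It and well under one percent of Timon of Athens.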

A number of people have asked me if Shakespeare passes the Bechdel test: I’m working on it! Stay tuned…

How many female characters are there in Shakespeare?

This was a fairly straightforward question I found myself asking recently for a footnote. Easy, I thought. I’ll go find a list of characters, count up the female ones, subtract them from the total number of characters, and I’ll have my answer. Though I could have picked up my Complete Works of Shakespeare and started counting from the dramatis personae for each play, I didn’t – because I knew that this information had been encoded before. Gender of characters is something that is often encoded in metadata (there’s a TEI category for gender), and character lists are easy to obtain.

I started with Open Source Shakespeare’s list of characters, which lists 1222 total characters in 37 plays. This list included variations of “all” from many plays:

Screen shot 2013-02-08 at 5.33.00

So, these instances of “all” aren’t really individual characters. However, the rest of this list contained every single character in all the plays, and that was something I could work with. If there are 1222 total “characters”, minus 31 instances of “alls”, there are 1191 individual characters. From there I could either put each of the 1191 individual characters in a box labeled “male”, “female” or “unknown, ambiguous or mixed”, or I could ask another program to do it for me.

I opened WordHoard and asked it to Find Words by Speaker Gender, which would account for those three categories. WordHoard covers all of the same plays as Open Source Shakespeare.
Screen shot 2013-02-08 at 5.25.05
Intuition tells me that it will be an easier task for a computer to isolate female characters than it will be to isolate male characters, so I select “female”, and click “find”. A few minutes later, WordHoard produces the total words spoken by all female characters in each play – and I add the criterion to show “words by speaker name”. My screen looks like this (click to make bigger):
Screen shot 2013-02-08 at 5.48.45
Counting each character, I reach a total of 147 female characters in all of Shakespeare, which of our 1191 characters amounts to about 12% of all the characters in Shakespeare. Winter’s Tale has the most female characters (8); Tempest and Henry IV part 1 have the fewest (2). But that depends on whether or not Ferdinand counts as a female character; if he does not, Tempest only has one female character. The Young Son in Richard III is deemed female. Macbeth has 7 female characters, but that includes the Witches:
Screen shot 2013-02-08 at 5.55.58

I don’t particularly think that the Witches count as female: I would have been happier to see them as “unknown, mixed, or ambiguous”. How do we know if a character is really female? I could give the Open Source Shakespeare list to any Shakespeare scholar and they could come up with a different count by gender. WordHoard, though, categorizes Rosalind, Viola, Ferdinand, and the Witches as female characters and treats them universally throughout its system as being female. The benefit of this is that they cannot ever suddenly change categories within the structure of the program, though you may not necessarily agree with the way it has categorized them.

According to my numbers, I had 1044 characters left, covering “male” and “ambiguous”. I was curious as to what counts as “unknown, mixed, or ambiguous” according to WordHoard. (again, click to make bigger):

Screen shot 2013-02-08 at 6.07.34

Interestingly, characters who count as “gender-ambiguous”, according to WordHoard, include the fairies Mustardseed, Peaseblossom, Cobweb and Moth from A Midsummer Night’s Dream. I disagreed with this distinction: if they are ambiguous, surely the Witches should be as well? A number of examples here include the aforementioned “alls” and a number of ghosts or apparitions (“Ghosts of Others Murdered By Richard III” was my personal favorite). This raises more questions: Should apparitions and spirits get their own gender category? Are they gendered? What counts as “gendered”?

Ultimately I counted and removed all the “all”s – which here totals 17, and is in disagreement with the Open Source Shakespeare count. Had I been doing this by hand, I might have counted instances of two or more characters speaking together as “alls”, but WordHoard isn’t counting this information – WordHoard is merely counting the total number of words for each character, here marked as “all”, whereas if two characters say something at the same time they may not be marked as “all”.

This left 46 total ambiguous characters, covering characters such as servants, attendants, various apparitions, and the fairies from A Midsummer Night’s Dream, and accounts for about 4% of the characters in the Shakespeare corpus. The 17 Alls accounted for about 1% of the corpus, leaving 998 male characters or about 83% of the corpus.
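The percentages can be checked with a short Python sketch that uses only the counts reported in this post, measured against the 1191 individual characters; the variable and function names are made up for this example.

```python
TOTAL = 1191  # individual characters after removing Open Source Shakespeare's "alls"

# Counts as reported in this post.
categories = {"female": 147, "ambiguous": 46, "all": 17, "male": 998}

def percent(count, total=TOTAL):
    """Return `count` as a percentage of `total`, rounded to one decimal place."""
    return round(100 * count / total, 1)

for label, count in categories.items():
    print(f"{label}: {percent(count)}%")
```

The four counts add up to 1208 rather than 1191, so the percentages come to slightly more than 100 and are best read as approximate.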

So, in review: how many female characters are there in Shakespeare? It’s hard to say, but one answer is 147.

Choosing tools, or why your computer is an abacus

If you wanted to put your small basil plant in your garden, would you find a backhoe to do it? Probably not, as you could dig a reasonably sized hole pretty quickly with a small shovel. Just like you don’t really need to bring out heavy machinery to do a simple gardening task, you probably don’t need complex tools to do small bits of text analysis. Is it impressive? Sure. Is it really necessary? Um, no, probably not.

When it comes to choosing and using digital tools for text analysis it’s a bit like gardening: you want to choose the right tool for the task. Some projects, like planting our basil, don’t really require anything complex, and you might be better off doing something the “old-fashioned” way of reading than overcomplicating with things that aren’t actually adding anything to your analysis.

Computers are very good at counting things. There are a nearly-endless number of tools which will help you count a variety of things in texts; that link probably doesn’t cover all of them. How do you know if you’re using the right one? Digital projects can be great, and digital analysis can be really useful, but if you can see it with your own eyes you probably don’t need a computer to tell it to you. Thematic elements often come out as being specific when comparing texts against one another. In The Tempest, words like ‘drown’ ‘island’ ‘isle’ ‘fish’ and ‘sea’ are more likely to appear – but you really don’t need a computer or complex statistics to tell you that, as Jonathan Hope points out. Digital tools that count things are much better suited to projects which are larger and when you’re looking for something much less thematic and much more specific.

So how do you know if you’re using the right tool for the task at hand? Well, you don’t always. Currently I have at least six tools for straight-up text analysis installed on my computer, and I can access more than a few others from my web browser. I’m compiling one myself. Do I really need all of these? In a word: yes. One is not better than the others. One might be more robustly informative than the others, depending on what I’m looking for.

In my recent research on the Shakespeare corpus I’ve found myself cross-slicing between a concordance program (AntConc), a statistical analysis tool (WordHoard), and the texts themselves (Open Source Shakespeare), and I will pull in others as they’re useful. It’s not that these tools individually aren’t doing enough, it’s that between the three of them, I can get a much clearer picture of what’s actually happening in my texts. Professor Alan Bryman has an excellent paper on triangulation from 2004 (pdf), where he argues for a three-check system “to enhance credibility and persuasiveness of a research account” (2004: 4). In other words: if you can find it once, that’s exciting; if you can find it twice, even better; but if you can find it three times, it’s a truth. Justifiably, it’s even more exciting when someone using entirely different tools and asking an entirely different question can arrive at the same conclusion that you did, albeit on a much larger scale. Of course, I have the unspoken benefit of working on Shakespeare, who is widely digitized: but I’d return to the texts regardless of who I’m working on – I just might have to change my approach a bit.

When it comes to choosing tools for text analysis, “it was there so I used it” is not an acceptable answer. You should know what your tool can and cannot do; its benefits and its limitations, and you should be able to account for them. A tool is just an interpretation of data, as I said previously, and what you can see in one tool might not be enough to justify your claim. Trying a variety of approaches might show you something that you missed the first, second, third time around: a small detail can lead to much bigger and better questions than simply accepting the first thing you try. A KWIC concordance might not be showing you enough of your data; a log-likelihood analysis might be telling you too much, and your wordcloud might not be showing you anything useful at all. Like anything else, I have my favorite tools and I’m likely to turn to them first and recommend them above other text analysis tools. Are they right for your project? In all honesty: I don’t know.

But all of this shouldn’t stop you from using digital tools. I occasionally use KWIC tools as a search engine for a specific corpus, and I will introduce friends and colleagues to them for that purpose, which is probably poor scholarship. But much more interesting things can happen when you break the rules of what the tool should do, which is another blogpost in and of itself.

How to Choose a Postgraduate Degree Abroad

In 2010 I graduated from the University of New Hampshire with degrees in English and Linguistics, and moved to Glasgow to undertake a Masters of Research at Strathclyde for 2010-2011. I’m still here working on my PhD, and every few months, I get a Facebook message from an acquaintance saying something along the lines of “I’m thinking about going to graduate school abroad! You did it, right? Can you tell me about it?” And every time I think, here we go. Time to dig out my response from the last time I answered this question! So, for posterity, here’s my answer. Keep in mind that details on applying for postgraduate education are going to vary from country to country. This took me a little bit by surprise at first, because I had expected all education systems to be like the US’s. They are not. It is easy to forget this! As a result, keep one eye on all deadlines, because they might come sneaking up on you much sooner than you think (or take much longer than you would have anticipated).

Admittedly, my experience has been limited to a humanities track, so there might be some variation between humanities and science or engineering, for instance. This is meant to be a resource for others considering graduate school abroad from the US, but I think it generally works for applying to graduate school anywhere.

First things first.
Great, you’re thinking about doing a graduate degree! What, specifically, do you want to study in your field? What are you interested in? I came to Scotland specifically to work with my supervisors because I was interested in the intersection of literature and linguistics. In undergrad, I would meet with my teachers, explaining that I wanted to write my essays about linguistics in a literature class and vice versa. I was tired of having to explain and sell my ideas every time. I decided that I wanted to do a graduate degree, but only if I could do something on the intersection of literature and linguistics. I didn’t want to keep having to explain it, I wanted to just be able to do it. (Obviously, I still have to explain it, but for different reasons now.)

My best advice about all of this is to figure out what you’re most interested in. A postgraduate degree is a little more specialized than an undergrad degree. You will want to be in a department with people who have research which is similar to the sorts of things you would want to do. Google any possible combination of what you want to be studying and ask mentors or advisers if they have any ideas. I wanted to study literary linguistics, and it turns out that Strathclyde has been heavily involved in literary linguistics and stylistics. These two fields never really caught on in the US, but they’re doing OK here in Europe, which works in my favor.

I hope this is not discouraging, but if you happen to be particularly interested in African American slave narratives along the Mississippi River – the people you will probably want to work with will probably not be in the UK. I’m not saying that you can’t or shouldn’t – but really think about what you want to do. Before you even start looking at schools, look for people you might want to work with. In the end, you’re not coming here for the school name but the expertise that school can offer you over all other institutions.

Once you have made a list of a few people you’d be interested in working with, contact them! Tell them that you’re interested in their program, in what they’re doing, and ask questions. The first ones which spring to mind for me are: What will the course be like? Will my work abroad be transferable back to my home country? Find out if these people are even considering taking on more graduate students – it would be a bit like structuring your entire undergrad career around taking one class you’ve waited to take in your last semester of your last year only to find out that the lecturer is on research leave. It’ll be heartbreaking to hear now, but way less so than finding out after you move specifically to work with someone. You surely will have better, more relevant questions than I will about your field – ask them. Find out if the place you want to go is a good fit for you.

Though I applied to three universities in the UK, and contacted one person at each institution, I actually only contacted one of my now-supervisors initially and had no idea that my now-primary supervisor (or his project) even existed. This has turned out to be quite the happy accident for me; I certainly can’t guarantee this success rate for everyone else. But I wouldn’t have known at all if I hadn’t contacted Nigel and asked questions in the first place.

Applications.
So! You have a department (or two or three) you want to apply to. Now what? Ask more questions. I had to decode the British university application without much help (“Qualifications? What do you mean by that?”). What degree are you applying for? I did a Masters of Research, which is different than an MPhil and an MA. Again, if you’re not sure: email someone and ask. I don’t think I can stress this enough. I distinctly remember asking a postgraduate coordinator about GREs and getting an email response back of “I don’t know what those are, so I would advise you not to worry about them.” Not Having To Take the GREs was definitely a plus in favor of graduate school abroad, I won’t lie, but trying to figure out how my GPA fit into the UK degree ranking system was a nightmare. In the end, it’s the responsibility of the universities to figure out all these conversions – most transcripts should have an explanation of the grading system.

I still feel guilty about the GRE thing when I talk to my friends who are in graduate school in the US, for what it’s worth.

Funding.
This is the big one! You are trying to figure out how, exactly, you are going to pay for this. Currently I think postgraduate student tuition in the UK is around £9000. This does not sound like a lot, but when you do the conversions (depending on the day, position of the sun, stock markets, whether or not the moon is rising in Aries, etc.) it ends up being about $18k. This is the good news, because that is a LOT less expensive than any US program.*

* When you factor in the cost of moving abroad, this number will go up quite a bit. (Cost of living, flights to/from the US, and getting settled are an entirely different story.)

Ask someone who you’d want to work with what funding is like in your field and how it might apply to you, especially as a foreign student. The bad news is that as a non-UK/EU citizen you will not be eligible for much funding here (sorry, this is truly the bad news). We have a number of research councils here (the one that would cover me is the AHRC. Google ‘UK Research Councils’ for this information). They often offer a variety of full studentships (scholarships) at different universities. They also fund big projects, so if someone is looking for a Masters student to join them as part of an ESRC-funded project on eye tracking in reading, for instance – again, making these up off the top of my head – they might mention that, and you’ll have to ask how that will work.

The US often sponsors students going abroad (see Rhodes & Fulbright scholarships, among others); there will probably be specific ones for the country you’re headed to. I think the Erasmus Mundus program also sponsors scholars going abroad, but I’m not sure of the details of it. Apply for these before you leave the US, as some of them are not open to people who have already lived in the country where they plan to do their research. This turned out to be a problem for me, as I missed the deadline for Fulbright scholarships the day I applied to Strathclyde. Because I’ve lived in the UK for over a year consecutively now, I can’t apply to the Fulbright grant system. I’m still kicking myself over this, for the record. That does not mean there aren’t other grants for you. My officemate funded 3 years of his PhD through a series of small, private grants from various organizations.

Good luck! If your university has any sort of international or study abroad office, get in touch with them, too. Take all the help you can get, seriously – you’ll be glad you did.