This was a fairly straightforward question I found myself asking recently for a footnote. Easy, I thought. I’ll go find a list of characters, count up the female ones, subtract them from the total number of characters, and I’ll have my answer. Though I could have picked up my Complete Works of Shakespeare and started counting from the dramatis personae for each play, I didn’t – because I knew that this information had been encoded before. Gender of characters is something that is often encoded in metadata (there’s a TEI category for gender), and character lists are easy to obtain.
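For anyone curious what that kind of encoding can look like in practice, here is a minimal Python sketch that tallies a TEI-style cast list by the `sex` attribute on `<person>`. The sample cast list, its ids and its attribute values are all hypothetical, made up for illustration; real editions differ in which values they use and in whether they record the attribute at all.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# TEI elements live in this XML namespace.
TEI_NS = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical cast list: ids and sex values are illustrative only,
# not taken from any real encoded edition.
SAMPLE = """<listPerson xmlns="http://www.tei-c.org/ns/1.0">
  <person xml:id="macbeth" sex="M"/>
  <person xml:id="lady-macbeth" sex="F"/>
  <person xml:id="first-witch" sex="U"/>
  <person xml:id="all"/>
</listPerson>"""


def count_by_sex(tei_xml: str) -> Counter:
    """Tally <person> elements by their sex attribute."""
    root = ET.fromstring(tei_xml)
    return Counter(
        # Entries with no sex attribute are kept, under their own label.
        person.get("sex", "unspecified")
        for person in root.iter(f"{TEI_NS}person")
    )


print(count_by_sex(SAMPLE))
```

Note that the tally only reports whatever the encoder decided; it cannot tell you whether a Witch should be "F" or "U".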
I started with Open Source Shakespeare’s list of characters, which gives 1222 total characters in 37 plays. The list includes variations of “all” from many plays.
So, these instances of “all” aren’t really individual characters. However, the rest of this list contained every single character in all the plays, and that was something I could work with. If there are 1222 total “characters”, minus 31 instances of “alls”, there are 1191 individual characters. From there I could either put each of the 1191 individual characters in a box labeled “male”, “female” or “unknown, ambiguous or mixed”, or I could ask another program to do it for me.
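The weeding-out step can be sketched in a few lines of Python. The speaker labels below are a made-up sample, and the rule for spotting an “all” (a label that starts with the word “all”) is my own assumption; the two totals at the end are the ones from the Open Source Shakespeare list.

```python
import re

# Hypothetical sample of speaker labels; the real list has 1222 entries.
speakers = ["Hamlet", "All", "Ophelia", "All Citizens", "Alonso", "First Witch"]

# Assumption: a collective label begins with the whole word "all",
# so "All Citizens" matches but "Alonso" does not.
ALL_PATTERN = re.compile(r"^all\b", re.IGNORECASE)


def split_collective(names):
    """Separate individual speakers from collective "all" labels."""
    individuals = [n for n in names if not ALL_PATTERN.match(n)]
    collective = [n for n in names if ALL_PATTERN.match(n)]
    return individuals, collective


individuals, collective = split_collective(speakers)
print(individuals)
print(collective)

# The same subtraction with the counts reported above.
total_listed = 1222
all_entries = 31
individual_characters = total_listed - all_entries
print(individual_characters)
```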
I opened WordHoard and asked it to Find Words by Speaker Gender, which would account for those three categories. WordHoard covers all of the same plays as Open Source Shakespeare.
Intuition tells me that it will be an easier task for a computer to isolate female characters than male characters, so I select “female” and click “find”. A few minutes later, WordHoard produces the total words spoken by all female characters in each play, and I add the criterion to show “words by speaker name”. My screen looks like this:
Counting each character, I reach a total of 147 female characters in all of Shakespeare, which out of our 1191 characters amounts to about 12% of the total. The Winter’s Tale has the most female characters (8); The Tempest and Henry IV, Part 1 have the fewest (2). But that depends on whether or not Ferdinand counts as a female character; if he does not, The Tempest has only one. The Young Son in Richard III is deemed female. Macbeth has 7 female characters, but that includes the Witches.
I don’t particularly think that the Witches count as female; I would have been happier to see them as “unknown, mixed, or ambiguous”. How do we know if a character is really female? I could give the Open Source Shakespeare list to any Shakespeare scholar and they could come up with a different count by gender. WordHoard, though, classes Rosalind, Viola, Ferdinand, and the Witches as female characters and treats them as female throughout its system. The benefit of this is that they can never suddenly change categories within the program, though you may not agree with the way it has categorized them.
According to my numbers, subtracting the 147 female characters from 1191 left me with 1044 characters, covering “male” and “ambiguous”. I was curious as to what counts as “unknown, mixed, or ambiguous” according to WordHoard.
Interestingly, characters who count as “gender-ambiguous”, according to WordHoard, include the fairies Mustardseed, Peaseblossom, Cobweb and Moth from A Midsummer Night’s Dream. I disagreed with this distinction: if they are ambiguous, surely the Witches should be as well? The ambiguous category also includes the aforementioned “alls” and a number of ghosts or apparitions (“Ghosts of Others Murdered By Richard III” was my personal favorite). This raises more questions: Should apparitions and spirits get their own gender category? Are they gendered? What counts as “gendered”?
Ultimately I counted and removed all the “all”s, which here total 17, in disagreement with the Open Source Shakespeare count of 31. Had I been doing this by hand, I might have counted every instance of two or more characters speaking together as an “all”, but WordHoard isn’t counting that. It is merely counting the total number of words for each speaker, including speakers marked as “all”; if two characters say something at the same time, they may not be marked as “all”.
This left 46 total ambiguous characters, covering characters such as servants, attendants, various apparitions, and the fairies from A Midsummer Night’s Dream, which accounts for about 4% of the characters in the Shakespeare corpus. The 17 “alls” accounted for about 1% of the corpus, leaving 998 male characters, or about 83% of the corpus.
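As a rough check on the arithmetic, here is a minimal Python sketch that adds up the tallies and works out each share. The counts are the ones reported in this post. The denominator is an assumption on my part: I take each share against the sum of all four tallies, including the 17 “alls”, because that reproduces the rounded percentages above.

```python
# Counts as reported in this post (WordHoard's categories, counted by hand).
tallies = {"female": 147, "ambiguous": 46, "all": 17, "male": 998}

# The three gendered categories together give the 1191 individual characters.
gendered = tallies["female"] + tallies["ambiguous"] + tallies["male"]

# Assumed denominator: every tally, including the "alls" (1208 in total).
grand_total = sum(tallies.values())


def shares(counts):
    """Return each count as a percentage of the sum, to one decimal place."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


percentages = shares(tallies)
for label, pct in percentages.items():
    print(f"{label:>9}: {pct}%")
```

Measured against the 1191 individual characters instead, the female share comes out at about 12.3%, so the headline figure of roughly 12% holds either way.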
So, in review: how many female characters are there in Shakespeare? It’s hard to say, but one answer is 147.