10 Things You Can Do with EEBO-TCP Phase I

The following is a list of resources I presented at Yale University (New Haven, CT, USA) on 4 May 2016 as part of my visit to the Yale Digital Humanities Lab. Thank you again for having me! This resource list includes work by colleagues of mine from the Visualising English Print project at the University of Strathclyde. You can read more about their work on our website, or read my summary blog post at this link.

You can download the corresponding slides at this link and the corresponding worksheet at this link.

EEBO and the TCP initiative
Official page http://www.textcreationpartnership.org/tcp-eebo/
The History of Early English Books Online http://folgerpedia.folger.edu/History_of_Early_English_Books_Online
Transcription guidelines & other documentation http://www.textcreationpartnership.org/docs/
EEBO-TCP Tagging Cheatsheet: Alphabetical list of tags with brief descriptions http://www.textcreationpartnership.org/docs/dox/cheat.html
Text Creation Partnership Character Entity List http://www.textcreationpartnership.org/docs/code/charmap.htm

Access the EEBOTCP-1 corpus
#1 – Download the XML files https://umich.app.box.com/s/nfdp6hz228qtbl2hwhhb/
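Once a file is downloaded, a first step is often to pull the running text out of the markup. The following is a minimal sketch, not an official TCP tool: it assumes a TEI-style file with a header followed by a text element, and the sample string is a made-up stand-in for a real transcription. Element names in the actual files may differ, so check the tagging cheatsheet listed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for one downloaded file. Real EEBO-TCP files are far
# larger, and their element names may differ (assumption: TEI-style markup).
SAMPLE = """<TEI>
  <teiHeader><fileDesc><titleStmt><title>A sample title</title></titleStmt></fileDesc></teiHeader>
  <text><body>
    <div type="poem"><head>To the Reader</head>
      <l>Goe little booke,</l>
      <l>thy selfe present.</l>
    </div>
  </body></text>
</TEI>"""


def local_name(tag):
    """Strip an XML namespace prefix, e.g. '{http://...}text' -> 'text'."""
    return tag.rsplit("}", 1)[-1]


def body_text(xml_string):
    """Return the transcription text with the header dropped and whitespace collapsed."""
    root = ET.fromstring(xml_string)
    for element in root.iter():
        # Take the first element called "text" (case-insensitive), skipping the header.
        if local_name(element.tag).lower() == "text":
            return " ".join("".join(element.itertext()).split())
    return ""


plain = body_text(SAMPLE)
print(plain)
```

To run this on a file from disk, read the file into a string first, or swap `ET.fromstring(xml_string)` for `ET.parse(path).getroot()`.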
#2 – Search the full text transcription repository online
http://quod.lib.umich.edu/e/eebogroup/ *
or http://eebo.odl.ox.ac.uk/e/eebo/ *
or http://ota.ox.ac.uk/tcp/
* these are mirrors of each other

#3 – Find a specific transcription
STC number vs ESTC number vs TCPID number
STC = specific book a transcription is from
ESTC number = “English Short Title Catalogue” (see http://estc.bl.uk/F/?func=file&file_name=login-bl-estc)
TCPID = specific transcription (A00091)
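If you are working from the downloaded XML, the identifiers recorded in a file's header can help you move between these numbering systems. The sketch below is illustrative only: the `idno` element name, the `type` labels and the "0000" values are placeholders I have assumed for the example, so inspect a real header before relying on any of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical header fragment. The element name, the type labels and the
# "0000" values are placeholders for illustration, not real catalogue data.
SAMPLE_HEADER = """<teiHeader>
  <idno type="DLPS">A00091</idno>
  <idno type="STC">STC 0000</idno>
  <idno type="STC">ESTC S000000</idno>
</teiHeader>"""


def collect_identifiers(xml_string):
    """Group the text of every idno-like element by its type attribute."""
    identifiers = {}
    for element in ET.fromstring(xml_string).iter():
        # Ignore any namespace prefix and compare the element name in lowercase.
        if element.tag.rsplit("}", 1)[-1].lower() != "idno":
            continue
        # Attribute names may differ in case, so normalise them before lookup.
        attributes = {key.lower(): value for key, value in element.attrib.items()}
        label = attributes.get("type", "unknown")
        identifiers.setdefault(label, []).append((element.text or "").strip())
    return identifiers


ids = collect_identifiers(SAMPLE_HEADER)
print(ids)
```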

#4 – Search the full text corpus of EEBO-TCP*
(*May cover EEBO-TCP Phase I only, or Phase I and Phase II; read the documentation carefully)

EEBO-TCP Ngram reader, concordancer & text counts http://earlyprint.wustl.edu/ (big picture)
CQPWeb EEBO-TCP, phase I https://cqpweb.lancs.ac.uk/eebov3
#5 – Identify variant spellings
BYU Corpora front end to EEBO-TCP (*not completely full text but will be soon*) http://corpus.byu.edu/eebo (potential variant spellings; see also the EEBO-TCP Ngram reader, above)
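To get a feel for what "potential variant spellings" means in practice, here is a rough sketch that groups word forms sharing a crude normalised key. The rules (u/v, i/j, y for i, optional final -e, doubled letters) are my own simplification for illustration; this is not how the BYU interface or the Ngram reader works, and it will both over-merge and miss real variants.

```python
from collections import defaultdict


def spelling_key(word):
    """Collapse a word to a rough key so that likely variants share the same key."""
    key = word.lower()
    key = key.replace("v", "u").replace("j", "i")  # treat u/v and i/j as the same letter
    key = key.replace("y", "i")                    # treat y as i
    if key.endswith("e"):
        key = key[:-1]                             # drop an optional final -e
    collapsed = []
    for ch in key:                                 # collapse doubled letters
        if not collapsed or collapsed[-1] != ch:
            collapsed.append(ch)
    return "".join(collapsed)


def group_variants(words):
    """Return only the keys that are shared by more than one distinct form."""
    groups = defaultdict(set)
    for word in words:
        groups[spelling_key(word)].add(word.lower())
    return {key: sorted(forms) for key, forms in groups.items() if len(forms) > 1}


words = ["loue", "love", "vniuersall", "universal", "ioy", "joy", "ioye", "booke"]
variants = group_variants(words)
for key, forms in variants.items():
    print(key, forms)
```

Treat the groups as candidates to check against the corpus tools above, not as confirmed variants.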

Find specific information in the TCP texts…
#6 – Find a specific language
e.g. Welsh: *ddg*
#7 – Find a specific term or concept using the Historical Thesaurus of the OED (http://historicalthesaurus.arts.gla.ac.uk/), then trace it with the resources listed above

#8 – Curate corpora
Alan Hogarth’s Super Science corpus uses the metadata provided with EEBO-TCP, plus disciplinary knowledge, to curate texts about scientific writing

#9 – Clean up transcriptions (Shota Kikuchi’s work)
Using the VARD spelling moderniser + PoS tagging (Stanford PoS tagger and TregEx → improvements for tagging accuracy, syntactic parsing)

#10 – Teach with it!
Language of Shakespeare’s plays web resource led by Rebecca Russell, undergraduate student (University of Strathclyde Vertically Integrated Project, English Studies / Computer Science)
