PRIMA FACIE

Online Early Modern Texts

Putting Early Modern Literature Online

An early programmable microcomputer

This page traces the development of online resources and surveys what is available today for research into the Early Modern period.

LION (Literature Online)
  • Dates: Launched 1997 (Chadwyck-Healey); migrated to ProQuest 2019.
  • What it contains: 350,000+ works of poetry, drama & prose in English, 8th century to present. Canonical and semi-canonical authors: Shakespeare, Spenser, Jonson etc. Scholarly journals and ABELL index also included.
  • Text type: Re-keyed full text. Transcribed from first editions or scholarly editions; 99.995%+ accuracy claimed.
  • Organisation & access: Subscription (institutional). Searchable via ProQuest platform. Boolean & proximity search. Browsable by author, genre, and period.
  • Key limitations: Selective canonical coverage only. No POS tags or linguistic metadata. Commercial subscription required. Not bulk-downloadable for text mining.

EEBO (Early English Books Online)
  • Dates: Microfilming from 1938; online 1998 (UMI/Chadwyck-Healey); ProQuest from c.2003.
  • What it contains: 146,000+ titles, 1473–1700. Covers STC I (Pollard & Redgrave), STC II (Wing), Thomason Tracts, Tract Supplement. 17 million+ pages. Virtually all surviving print in English to 1700.
  • Text type: Page images. Bitonal scans from microfilm (greyscale from 2012). PDF & TIFF. Images only: no searchable text unless a TCP transcription exists for that title.
  • Organisation & access: Subscription (ProQuest). Browse and search by ESTC metadata (author, title, date, STC number). Full-text search only available for TCP-transcribed subset. Images not freely reusable.
  • Key limitations: Images are not text. Microfilm artefacts (bleed-through, damage). Black-and-white scanning distorts typeface detail. Coverage approximately 92% complete. Cannot be computationally processed without the TCP layer.

TCP Phase 1 (EEBO-TCP Phase I)
  • Dates: Transcription 2000–2009; public release 1 January 2015.
  • What it contains: 25,368 texts selected from EEBO. Selection biased towards New Cambridge Bibliography authors, then thematic and format batches. Coverage c.1475–1700.
  • Text type: Hand-keyed XML. TEI P5 XML with structural markup (headings, verse, notes, figures). No POS tags or lemmatisation. Spelling is original and unregularised.
  • Organisation & access: Freely downloadable from TCP GitHub, Michigan, and Oxford Text Archive. Bulk XML. Searchable via ProQuest EEBO or Michigan interface. No POS query syntax.
  • Key limitations: Canonical selection bias. Original spelling makes linguistic search inconsistent across the corpus. No POS or lemma data. Some transcription errors. Fixed snapshot, not updated.

TCP Phase 2 (EEBO-TCP Phase II)
  • Dates: Transcription 2009 onwards; public release January–August 2020.
  • What it contains: ~35,000 additional texts from EEBO (combined total with Phase 1: ~60,000 texts). Broader coverage with more emphasis on English-language text. Completes the TCP transcription project.
  • Text type: Hand-keyed XML. Same TEI P5 XML format as Phase 1. No POS tags, no lemmatisation, original spelling throughout.
  • Organisation & access: Now fully public. Bulk download from TCP GitHub. Integrated into ProQuest EEBO interface alongside Phase 1. Also accessible via EarlyPrint and CQPweb corpora.
  • Key limitations: Same limitations as Phase 1. Approximately 85,000 EEBO titles remain without any transcription. Ongoing corrections are community-driven.

EEBO V3 / CQPweb (Lancaster University / UCREL annotated corpus)
  • Dates: Built on TCP Phases 1 & 2; annotated version mounted on CQPweb; current version c.2015–ongoing.
  • What it contains: 44,422 texts; 1.2 billion running tokens. Both TCP phases processed through Lancaster’s UCREL annotation pipeline. Spelling regularisation applied first, then POS tagging and lemmatisation. Available on the same CQPweb server as Lancaster’s hand-keyed and linguistically detailed Shakespeare corpus, which can be filtered by play, character, scene, and experimental filters including the social class of the speaking character.
  • Text type: POS-tagged corpus with lemmatisation. Eight annotation fields per token: original form, regularised spelling, lemma, POS tag, and further linguistic metadata. CQP (Corpus Query Processor) syntax enables grammatical pattern search across the full corpus.
  • Organisation & access: Accessed via Lancaster CQPweb (cqpweb.lancs.ac.uk/eebov3). Free but requires account registration. CQP query language. KWIC concordance, frequency breakdown by date, and collocation tools.
  • Key limitations: Fixed corpus, not updated in real time. POS tagger trained on modern English; accuracy is reduced for early modern syntax and morphology. No metadata filtering by author or title within the CQP interface. Fixed snapshot. Spelling regularisation introduces editorial decisions.
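
The TCP limitation noted above, that original spelling makes search inconsistent, can be illustrated with a short Python sketch. It is only a rough illustration under stated assumptions: the TEI fragment is invented for the example and is not taken from any TCP file, the helper names verse_lines and count_word are hypothetical, and real TCP files are far larger and carry a full header. The sketch reads verse lines from TEI P5 markup and compares a search for the modern spelling "love" with one that also allows the u/v variant "loue".

```python
import re
import xml.etree.ElementTree as ET

# Standard TEI P5 namespace.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Invented fragment for illustration only; not taken from any TCP file.
SAMPLE_TEI = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <div type="poem">
        <head>A Sample Heading</head>
        <l>My loue is loue that will not fade</l>
        <l>Yet love will finde a way</l>
      </div>
    </body>
  </text>
</TEI>"""


def verse_lines(tei_xml: str) -> list[str]:
    """Return the text of every <l> (verse line) element in a TEI document."""
    root = ET.fromstring(tei_xml)
    return [
        "".join(line.itertext()).strip()
        for line in root.iterfind(".//tei:l", TEI_NS)
    ]


def count_word(lines: list[str], pattern: str) -> int:
    """Count whole-word, case-insensitive matches of a regex across lines."""
    regex = re.compile(rf"\b(?:{pattern})\b", re.IGNORECASE)
    return sum(len(regex.findall(line)) for line in lines)


lines = verse_lines(SAMPLE_TEI)
exact = count_word(lines, "love")        # modern spelling only
variants = count_word(lines, "lo[uv]e")  # also matches the variant "loue"
print(exact, variants)  # prints: 1 3
```

A search on the modern spelling finds one of the three occurrences in this toy fragment. Hand-written variant patterns of this kind do not scale to a whole corpus, which is the gap that the spelling regularisation and lemmatisation in the annotated CQPweb version are meant to close.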