Computer Assisted Attribution
Oxfordian adventures with advanced stylometry
The early days of computer stylometry were hopeful ones for Oxfordians, keen to show that technology could validate the presence of the Earl’s hand in Shakespeare’s work. The search for literary connective tissue linking Oxford to Shakespeare stylistically was the lodestar of Looney’s research, which set out to show that Oxford’s verse “contained the germ of Shakespeare’s”. He doesn’t seem to have gone much further than Palgrave’s Golden Treasury, an anthology most English schoolchildren encountered by the age of 11, but his confidence that something probative would turn up was as unshakeable as Mr Micawber’s. Needless to say, nothing has.1
Metrical tests that trace a developing style, and so chronicle the order of an artist’s work, existed long before computers. The principles were well established, and the dates that Edmund Malone proposed for Shakespeare’s work, within small date ranges, have changed surprisingly little in the 250 years of scholarship that followed his pioneering work. Modern chronologies are mostly based on E K Chambers’s seminal William Shakespeare: A Study of Facts and Problems.2
Early computerised tests at The Shakespeare Clinic3 run by Ward Elliott and Robert Valenza were initially equivocal and didn’t really help anyone. Computerised stylometry was new and everyone was learning. All sides were sceptical about its usefulness and many academics still are. There was a scale problem in these early days. Hard disks on personal computers in the 80s were barely big enough to hold a single modern digital photograph (much less replace the eyebrows in it).
Dealing with large data on small machines meant computerised stylometry began by shrinking datasets and creating test banks, carrying out tests in partial corpora, looking for features that could be categorised, standardised and counted. As computers became cheaper and faster, Elliott and Valenza’s tests became more sophisticated and more accurate. However, what was taken for granted by most from the outset then turned into disaster for the Oxfordian argument. As computers got bigger, tests got better but the chances of an alternative author, specifically Oxford’s chances, virtually disappeared. Along with all the other runners and riders they looked at, Elliott and Valenza finally eliminated the Earl from The Alternative Shakespeare Stakes.4
“Judging from their surviving writing, Shakespeare was not just 100 times better than Oxford, he was also 80 times more productive. Shakespeare wrote about 3,500 lines of verse a year for twenty years, most of them immortal; Oxford, in the Shahan-Whalen scenario, wrote about 40 lines of woebegone juvenilia a year for ten years, then, for fifteen years, wrote nothing at all that he or anyone else could be bothered to save––but then, at forty-three, supposedly burst from his cocoon to become a literary supernova overnight.” Elliott & Valenza
Oxfordians who didn’t like E&V’s elimination of Oxford from the authorship stakes nevertheless enjoyed airily criticising the data, the method, attempting to redo the maths and dismiss almost everything they ever did. All their work, in fact, apart from one single instance in which everything unaccountably went in their favour. E&V did not agree that Hand D was obviously Shakespeare’s. Their analysis says not, at least not in the 1590s from when the earliest parts of the manuscript can be dated—but analysis has moved on. And Shakespeare’s contribution wasn’t written in the 1590s. The plot has thickened nicely (or diluted to the point of non-existence if your theory depends on it being written by someone else).
Data Vandalism

Figure 1 tracks feminine endings following the chronology that history and scholarship demand. The alternative Oxfordian chronologies devised by Hess5, Clark6 or Gilvary7 use the same data but redate eleven plays back into the lifetime of De Vere. These authors may disagree about individual plays but claim that their researches independently arrived at the convenient conclusion that no plays were written after the death of the Earl of Oxford in 1604.
Looking at E&V’s carefully calculated metrics, the absolute stylometric chaos this causes should be plain to every student.8
This is data vandalism, not data analysis. The fact that all four metrics show the same pattern of chaos after the Oxfordian chronology shuffle confirms that the rearrangement is purposeful, non-random and designed to misdirect.
Oxford died before a third of the work was written and this unwarranted revision of the record is completely worthless. Clark almost turns the accepted chronology upside down. Imagine doing that with Beethoven or the Beatles. But what else can they do but rewrite history?
| Play Title | Riverside (Standard) | New Oxford (MLE) | Hess (Oxfordian) | Clark (Oxfordian) | Feminine Endings | Weak Endings |
|---|---|---|---|---|---|---|
| H6 I | 1 | 1 | | 1587 | 3 | |
| H6 III | 2 | 2 | 2 | 1580 | 12 | 3 |
| H6 II | 3 | 3 | 1 | 1579 | 15 | 2 |
| Richard III | 4 | 6 | 3 | 1581 | 16 | 4 |
| Titus | 5 | 5 | 8 | 1577 | 10 | 5 |
| Comedy of Errors | 6 | 4 | 14 | 1577 | 7 | 0 |
| Two Gent Verona | 7 | 8 | 4 | 1579 | 16 | 0 |
| Taming Shrew | 8 | 7 | 5 | 1579 | 25 | 1 |
| Love’s Labour’s Lost | 9 | 9 | 6 | 1582 | 14 | 4 |
| Richard II | 10 | 10 | 7 | 1579 | 15 | 3 |
| King John | 11 | 11 | 17 | 1581 | 8 | 7 |
| Romeo Juliet | 12 | 13 | 19 | 1582 | 6 | 6 |
| MSNDream | 13 | 14 | 9 | 1581 | 14 | 0 |
| Henry IV i | 14 | 12 | 21 | 1584 | 6 | 5 |
| Merry Wives | 15 | 18 | 10 | 1585 | 17 | 1 |
| Merchant Venice | 16 | 20 | 11 | 1579 | 13 | 6 |
| Henry IV ii | 17 | 15 | 12 | 1585 | 15 | 1 |
| Henry V | 18 | 16 | 22 | 1586 | 8 | 2 |
| As You Like It | 19 | 21 | 25 | 1582 | 7 | 2 |
| Julius Caesar | 20 | 17 | 15 | 1583 | 12 | 10 |
| Much Ado | 21 | 19 | 16 | 1583 | 13 | |
| Hamlet | 22 | 25 | 27 | 1585 | 8 | 8 |
| Twelfth Night | 23 | 23 | 32 | 1580 | 2 | 3 |
| Troilus Cressida | 24 | 22 | 18 | 1583 | 13 | 6 |
| All’s Well | 25 | 31 | 20 | 1579 | 32 | 11 |
| Measure for Measure | 26 | 24 | 23 | 1581 | 12 | 7 |
| Othello | 27 | 26 | 24 | 1583 | 12 | 2 |
| Lear | 28 | 27 | 26 | 1589 | 13 | 5 |
| Macbeth | 29 | 32 | 33 | 1589 | 11 | 21 |
| Antony Cleo | 30 | 33 | 28 | 1579 | 13 | 71 |
| Timon | 31 | 28 | | 1576 | 22 | 5 |
| Coriolanus | 32 | 34 | 29 | 1580 | 27 | 60 |
| Pericles | 33 | 29 | 35 | 1577 | 5 | 15 |
| Cymbeline | 34 | 35 | 30 | 1578 | 13 | 78 |
| Winter’s Tale | 35 | 37 | 31 | 1586 | 12 | 57 |
| Tempest | 36 | 36 | 34 | 1583 | 11 | 42 |
| Henry VIII | 37 | 30 | | 1601 | 0 | 45 |
| Two Noble Kinsmen | 38 | 38 | | | 21 | 50 |
A sortable table demonstrates the utter nonsense in Oxfordian chronology. It is based on E&V’s data but allows you, by scrolling around and sorting the columns, to see what a preposterous pig’s breakfast they have made of the entire Bankside zeitgeist. Hess’s chronology, the most popular, has simply shoved everything back a decade in time. Clark’s offers no sensible reasons for shoving plays back even further than Hess, other than smoother alignment to the Earl’s biography. Coriolanus turns up in 1580, almost 30 years before Jonson, Armin and Fletcher all mention it. The Tempest turns up 28 years before its first performance. The Merry Wives of Windsor, in 1579, gives us Falstaff returning by public demand 17 years before he first appeared.
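One rough way to put a number on the disorder is a rank correlation between a chronology and a metric that is expected to drift across a career. The sketch below is only an illustration, not E&V's method. It takes eight rows from the table above, hand-picked because they have no blank cells and no tied values, uses the last column (the one the table labels "Weak Endings") and compares the Riverside order with Clark's dates using Spearman's rho. The function names are hypothetical, and the simple ranking assumes no ties, so the full table would need more careful handling.

```python
# Rows copied from the table above:
# (play, Riverside order, Clark year, last metric column).
ROWS = [
    ("Comedy of Errors", 6, 1577, 0),
    ("King John", 11, 1581, 7),
    ("Hamlet", 22, 1585, 8),
    ("Macbeth", 29, 1589, 21),
    ("Antony Cleo", 30, 1579, 71),
    ("Coriolanus", 32, 1580, 60),
    ("Cymbeline", 34, 1578, 78),
    ("Tempest", 36, 1583, 42),
]


def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """Spearman rank correlation for two equal-length, tie-free sequences."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


metric = [row[3] for row in ROWS]
rho_riverside = spearman_rho([row[1] for row in ROWS], metric)
rho_clark = spearman_rho([row[2] for row in ROWS], metric)

print(f"Riverside order vs metric: {rho_riverside:.2f}")
print(f"Clark dates vs metric:     {rho_clark:.2f}")
```

On this small sample the standard order tracks the metric closely, while Clark's dates show essentially no relationship to it. Eight rows prove nothing on their own; the point is only that the comparison is easy to make.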
Stylometry grows up
The charts above were the result of months of manual data entry and analysis by teams working with books and basic digital texts. Early digital texts cannot tell a feminine ending from a masculine one. In the intervening years, computing has moved from its Stone Age to what was thought science fiction in the early 90s. Addressing very large data arrays is now routine. The application of Principal Component Analysis has long been able to predict what an Amazon user might like to buy next (and offer them something they bought the day before). Today, more storage and processing power than existed on the whole planet in 1980 is fitted into a single car which can drive its owner around unassisted while they read the newspaper. Storage and processing power are now practically limitless.
This technology can be deployed to analyse whole corpora, digesting a writer’s style algorithmically and comparing it to everything in print. Everything. In 2009 Hugh Craig and Arthur Kinney published a guide to algorithmic analysis of Shakespeare’s work, which included a rather frightening chapter on the use of Principal Component Analysis (PCA) to identify stylistic features in the plays.9
| The amount of data addressed by Craig & Kinney would have been unmanageably large 10 years ago. | |
|---|---|
| Early Modern English Plays | 165 |
| Words of dialogue | 3,250,000 |
| Plays from 1590-1619 | 138 |
| Plays of undisputed single authorship | 112 |
| Complete works | WS, TM, CM, JW, BJ |
| Four or more complete plays | JL, RG, GP, TD, TH, JF, JF |
| Three plays | RW, GC, JS |
| Usage dictionary | Online OED |
Craig & Kinney’s data sets. Once the work of months, plays can be grouped and built into sets in minutes now.
The removal of capacity constraints has opened the door to algorithmic techniques capable of identifying complex patterns in data, reducing the dimensionality of large datasets to the point where even the most complicated queries can be answered in minutes, seconds even. This enables researchers to refine their answers by asking hundreds of questions demanding hundreds of answers. Sorting the answers, rather than waiting for them, becomes the larger problem. By applying PCA to the text of Shakespeare’s plays, you can remove a random 1200 word section from a play, treat it as anonymous, then test a method’s validity by its ability to return the section to its correct place, attributed to the correct playwright and at the correct approximate point in Bankside chronology, with above 98% accuracy. Investigating collaborative work has now become a top academic priority. The First Folio now even has a companion edition of collaborative work.10
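The hold-out test described above can be sketched in miniature. What follows is an assumption-laden toy and not Craig and Kinney's procedure: it swaps PCA for a much simpler nearest-profile comparison of function-word frequencies, and the word list, the two "authors" and their texts are all invented for illustration.

```python
from collections import Counter

# Hypothetical marker list; real studies use far longer, tested lists.
FUNCTION_WORDS = ["the", "and", "of", "to", "a", "in", "that", "you", "my", "not"]


def profile(text):
    """Relative frequency of each function word in a text."""
    words = text.lower().split()
    counts = Counter(words)
    total = len(words) or 1
    return [counts[w] / total for w in FUNCTION_WORDS]


def distance(p, q):
    """Manhattan distance between two frequency profiles."""
    return sum(abs(a - b) for a, b in zip(p, q))


def attribute(segment, candidates):
    """Return the candidate whose profile is closest to the held-out segment."""
    seg = profile(segment)
    return min(candidates, key=lambda name: distance(seg, profile(candidates[name])))


# Invented stand-ins for two authors' remaining work.
candidates = {
    "Author A": "the lord of the manor and the lady of the house",
    "Author B": "you do not know my mind and you shall not have my heart",
}
held_out = "the king of the land and the sea"

print(attribute(held_out, candidates))
```

A serious test would repeat this for many randomly chosen segments and report how often each one returns to its true author, which is where an accuracy figure like the one quoted above comes from.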

The next chart was created in response to a claim that the word “equivocation” was in everyday use throughout Oxford’s lifetime, making the six occurrences in Macbeth unremarkable. It demonstrates that usage hit its zenith around the time of the first performance of Macbeth,11 supporting the idea that the play was written when “equivocation” was a buzzword. Father Henry Garnet, the Jesuit Superior in England and Gunpowder Plot conspirator, was hanged, drawn, and quartered with his fellows for saying one thing with another “secret meaning reserved in … mind”. Synchronising the appearance of the word in popular parlance with its use in the Scottish Play, showing that both occurred after the death of the Earl, creates a huge problem for Oxfordians. But it also catches one identical usage in Hamlet, performed three years before the Earl died, ironically leaving Oxfordians a straw to clutch at. It’s a new science. The digital panorama of Bankside theatre is still very much a work in progress.
The point here, however, is that the second chart took only five minutes from start to finish. Extracting every instance of the word in every printed English book from EEBO V3 needed just a single query in simple English. The return provided the data for the chart as a text file with every instance, its author and date, triggering a Python script which turned the output into a spreadsheet with the chart format ready to use. Questions have almost instant answers today, so you can ask hundreds of them, homing in on your targets iteratively.
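The script itself is not reproduced here, but the step it performs is simple enough to sketch. The example below assumes, purely for illustration, that the query output is a tab-separated text file with the year first, then the author, then the context. The column layout, the function names and the sample rows are all hypothetical placeholders, not real query results.

```python
import csv
import io
from collections import Counter


def hits_per_decade(lines):
    """Count hits per decade from 'year<TAB>author<TAB>context' lines."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        year_text = line.split("\t")[0]
        if not year_text.isdigit():
            continue  # skip a header row or an undated hit
        counts[int(year_text) // 10 * 10] += 1
    return dict(sorted(counts.items()))


def to_csv(decade_counts):
    """Render the counts as CSV text that a spreadsheet can chart directly."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["decade", "hits"])
    for decade, hits in decade_counts.items():
        writer.writerow([decade, hits])
    return buffer.getvalue()


# Placeholder rows standing in for the downloaded query results.
sample = [
    "year\tauthor\tcontext",
    "1599\tAuthor A\t... equivocation ...",
    "1606\tAuthor B\t... equivocation ...",
    "1606\tAuthor C\t... equivocation ...",
    "1611\tAuthor D\t... equivocation ...",
]
decades = hits_per_decade(sample)
print(to_csv(decades))
```

Grouping by decade rather than by year is a design choice that smooths out sparse years; changing the divisor gives any other bin width.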
Chronology crystallises
Stylometrically, it’s all been downhill for Doubters since Elliott & Valenza. Nothing has gone their way, so they tend not to accept that any form of stylometry is any use at all. Unable to shake the consensus on chronology now it is supported by maths, they synthesise their creative narratives, arguing that dating the plays is impossible since all of Bankside theatre was produced in a writer’s room led by their choice of alternative candidate, constantly revised by the addition of topical references. They have no evidence to suggest any of this actually occurred, because it quite definitely didn’t. Narrative synthesis and unfalsifiable contentions have become their entire world.
Doubters with dating problems will cheerfully accept the impossible idea that Coriolanus was written in 1580 and that plays written after the Earl’s death were actually kept in a drawer and spoon-fed to the King’s Men at a rate of two a year until there were no more left. Academic consensus places Coriolanus around 1607-1608, where it fits everything we know about Bankside theatre development, Shakespeare’s developing style, audience taste, and trends in verse, language and stagecraft. It has major topical relevance to food riots which took place in the Midlands in 1607 and affected the work of many of Shakespeare’s contemporaries12. It did nothing for the Earl of Oxford, dead for three years. It did no more to enhance the case for any aristocratic aspirant. William Hazlitt, a brilliant 19c observer and critic, anticipates the interest of socialists like Marx, Engels and Brecht:
“Any one who studies it may save himself the trouble of reading Burke’s Reflections, or Paine’s Rights of Man, or the Debates in both Houses of Parliament since the French Revolution or our own. The arguments for and against aristocracy or democracy, on the privileges of the few and the claims of the many, on liberty and slavery, power and the abuse of it, peace and war, are here very ably handled with the spirit of a poet and the acuteness of a philosopher.”13
It’s hard to imagine the Earl of Oxford being discussed during the drafting of the Communist Manifesto.
The argument over another embarrassingly firm date in the chronology, this time that of The Tempest, raged for decades and even produced books. J. Thomas Looney, the founder of the Oxfordian movement, chose to throw it out of the canon rather than try to account for it. Throwing out data that disagrees with one’s premise isn’t at all unusual in Doubter statistics.
Dating work using analysis of style, however, is instinctive. In the 16 years between Steven Spielberg’s Close Encounters of the Third Kind (1977), with its off-screen aliens, and Jurassic Park (1993), with its very much on-screen dinosaurs, computers became big enough and powerful enough to be used for origination. Now AI can turn out a 90-minute thriller from a few sentences of instruction that might keep a four-year-old engaged for five minutes. Better is coming but Shakespeare looks safe for the moment.
The professional theatre, in the 25 years Shakespeare was writing for it, changed at almost as rapid a pace as music, TV and film have between the 1980s and today. And for the same reasons: rapid innovation and improvement. On Bankside, over the course of Shakespeare’s career, music changed, instruments changed, theatres changed, distribution changed and earnings changed dramatically. You can make a science of dating any kind of art but when rapid development is happening around it, you don’t always need much science to get things in order. Even at the most elementary level, dating doesn’t have to be guesswork. You can’t push Strawberry Fields by The Beatles back in time any further than the invention of the Mellotron which features in its introduction. Shakespeare can’t collaborate with Fletcher until he shows up on Bankside.
Early English Books Online
The appearance of a database with every English book printed from 1400-1700 with open internet access caught the interest of Doubters disappointed by test banks. “What if we could catch Oxford using a unique expression that only Shakespeare uses?” The answers to questions like this are readily available when you can produce the correct query and are honest with your data. Compared to algorithmically examining tens of thousands of printed books with queries based on whole oeuvres, looking for single expressions is bound to be hit and miss. If you don’t know what you’re doing then you can miss targets and leap to false conclusions.
The data source. EEBO initially was split into two phases bringing every book printed in English until 1700 online, scanned, OCR’d and accessible to internet-based digital query.
| Resource | Dates | What it contains | Text type | Organisation & access | Key limitations |
|---|---|---|---|---|---|
| LION (Literature Online) | Launched 1997 (Chadwyck-Healey); migrated to ProQuest 2019 | 350,000+ works of poetry, drama & prose in English, 8th century to present. Canonical and semi-canonical authors. Shakespeare, Spenser, Jonson etc. Scholarly journals and ABELL index also included. | Re-keyed full text. Transcribed from first editions or scholarly editions; 99.995%+ accuracy claimed. | Subscription (institutional). Searchable via ProQuest platform. Boolean & proximity search. Browsable by author, genre, and period. | Selective canonical coverage only. No POS tags or linguistic metadata. Commercial subscription required. Not bulk-downloadable for text mining. |
| EEBO (Early English Books Online) | Microfilming from 1938; online 1998 (UMI/Chadwyck-Healey); ProQuest from c.2003 | 146,000+ titles, 1473–1700. Covers STC I (Pollard & Redgrave), STC II (Wing), Thomason Tracts, Tract Supplement. 17 million+ pages. Virtually all surviving print in English to 1700. | Page images. Bitonal scans from microfilm (greyscale from 2012). PDF & TIFF. Images only — no searchable text unless a TCP transcription exists for that title. | Subscription (ProQuest). Browse and search by ESTC metadata (author, title, date, STC number). Full-text search only available for TCP-transcribed subset. Images not freely reusable. | Images are not text. Microfilm artefacts (bleed-through, damage). Black-and-white scanning distorts typeface detail. Coverage approximately 92% complete. Cannot be computationally processed without the TCP layer. |
| TCP Phase 1 (EEBO-TCP Phase I) | Transcription 2000–2009; public release 1 January 2015 | 25,368 texts selected from EEBO. Selection biased towards New Cambridge Bibliography authors, then thematic and format batches. Coverage c.1475–1700. | Hand-keyed XML. TEI P5 XML with structural markup (headings, verse, notes, figures). No POS tags or lemmatisation. Spelling is original and unregularised. | Freely downloadable from TCP GitHub, Michigan, and Oxford Text Archive. Bulk XML. Searchable via ProQuest EEBO or Michigan interface. No POS query syntax. | Canonical selection bias. Original spelling makes linguistic search inconsistent across the corpus. No POS or lemma data. Some transcription errors. Fixed snapshot — not updated. |
| TCP Phase 2 (EEBO-TCP Phase II) | Transcription 2009 onwards; public release January–August 2020 | ~35,000 additional texts from EEBO (combined total with Phase 1: ~60,000 texts). Broader coverage with more emphasis on English-language text. Completes the TCP transcription project. | Hand-keyed XML. Same TEI P5 XML format as Phase 1. No POS tags, no lemmatisation, original spelling throughout. | Now fully public. Bulk download from TCP GitHub. Integrated into ProQuest EEBO interface alongside Phase 1. Also accessible via EarlyPrint and CQPweb corpora. | Same limitations as Phase 1. Approximately 85,000 EEBO titles remain without any transcription. Ongoing corrections are community-driven. |
| EEBO V3 / CQPweb (Lancaster University / UCREL annotated corpus) | Built on TCP Phases 1 & 2; annotated version mounted on CQPweb; current version c.2015–ongoing | 44,422 texts; 1.2 billion running tokens. Both TCP phases processed through Lancaster’s UCREL annotation pipeline. Spelling regularisation applied first, then POS tagging and lemmatisation. Available on the same CQPweb server as Lancaster’s hand-keyed and linguistically detailed Shakespeare corpus, which can be filtered by play, character, scene, and experimental filters including the social class of the speaking character. | POS-tagged corpus with lemmatisation. Eight annotation fields per token: original form, regularised spelling, lemma, POS tag, and further linguistic metadata. CQP (Corpus Query Processor) syntax enables grammatical pattern search across the full corpus. | Accessed via Lancaster CQPweb (cqpweb.lancs.ac.uk/eebov3). Free but requires account registration. CQP query language. KWIC concordance, frequency breakdown by date, and collocation tools. Fixed corpus — not updated in real time. | POS tagger trained on modern English; accuracy is reduced for early modern syntax and morphology. No metadata filtering by author or title within the CQP interface. Fixed snapshot. Spelling regularisation introduces editorial decisions. |
The resources available for research and analysis of the transitions in Early Modern English are enormous and straightforwardly accessible, even if some can be expensive.
EEBO has gone through four complicated phases of development. In his book-length quest to find rare collocations that tie De Vere to Shakespeare14, Roger Stritmatter used only TCP-1 as his source data, ignoring TCP-2 for reasons understood neither by us nor, we suspect, by the Professor (contemporary review of V1 here). He identified an image using the phrase “haggard hawks” and thought it to be a unique connection. Searching the full database now shows how commonplace haggard hawks were. EEBO V3 returns 48 matches in 33 different texts.
The University of Lancaster has gone further and produced a smooth and clean Shakespeare canon festooned with filters and metadata for scholars. A further comparative corpus of other Bankside plays delivers the same level of detail for comparative research.
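The check that undoes a "unique" collocation is nothing more than counting matches and the number of distinct texts they occur in. The sketch below shows that count over a tiny stand-in corpus; it does not reproduce the CQPweb query. Text A is the Kyd line quoted in this article, the other two texts are invented, and the function name is hypothetical.

```python
import re


def phrase_spread(corpus, pattern):
    """Return (total matches, number of texts with at least one match)."""
    regex = re.compile(pattern, re.IGNORECASE)
    per_text = {title: len(regex.findall(text)) for title, text in corpus.items()}
    per_text = {title: n for title, n in per_text.items() if n}
    return sum(per_text.values()), len(per_text)


# Stand-in corpus: Text A quotes Kyd; Text B and Text C are made up.
toy_corpus = {
    "Text A": "In time all haggard Hawks will stoop to lure",
    "Text B": "A haggard hawk is hard to man; haggard hawks mislike an empty hand.",
    "Text C": "The falcon and the dove.",
}

matches, texts = phrase_spread(toy_corpus, r"\bhaggard\s+hawks?\b")
print(f"{matches} matches in {texts} different texts")
```

A phrase is only a candidate link between two writers if the second number stays at two when the whole corpus is searched, which is why leaving out half the transcribed texts matters so much.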
Here are the hawks in the comparative plays database. Kyd cites haggard hawks as examples of achieving things by steady progress. “In time small wedges cleave the hardest”. That’s also true when arguing with conspiracy theorists. This small detail we are using here demonstrates the difference between academic research and dilettante wish fulfilment.
| # | Playwright | Text ID | Concordance line |
|---|---|---|---|
| 1 | Jonson | Bartholomew Fair | often . And i’your singing , you must use your hawks eye nimbly , and fly the purse to a mark |
| 2 | Greene | Friar Bacon and Friar Bungay | my reparrell . Then here ’s good game for the hawk , for here ’s the master fool , and a |
| 3 | Beaumont | Philaster | may eat , use exercise , And keep a sparrow hawk , you can shoot in a Tiller , But of |
| 4 | Middleton | The Roaring Girl | do listen ) a cut purse thrusts and leers With hawk eyes for his prey : I need not show him |
| 5 | Middleton | The Roaring Girl | boy , there boy , what dost thou go a hawking after me with a red clout on thy finger . |
| 6 | Middleton | The Roaring Girl | is the generation of a feriant , how his eye hawks for venery . Come are you ready sir . Ready |
| 7 | Porter | Two Angry Women of Abingdon | it must needs fall , And like a well lur’de hawk , she knows her call . Whist brother whist , |
| 8 | Drue | The Duchess of Suffolk | heard of Wolves , A harmless Dove amongst a thousand Hawk , If she returned , what providence can save , |
| 9 | Heywood | A Woman killed with kindness | meet me tomorrow At Cheuy-chase , I ’ll fly my Hawk with yours . For what ? for what ? Why |
| 10 | Heywood | A Woman killed with kindness | ’ll make them good a hundred pound tomorrow Upon my Hawks wing . T is a match , t is done |
| 11 | Heywood | A Woman killed with kindness | . A match . Ten Angels on sir Francis Actons Hawk : As much upon his Dogs . I am for |
| 12 | Heywood | A Woman killed with kindness | am for Sir Charles Mountford , I have seen His hawk and Dog both tried ? What clap you hands ? |
| 13 | Heywood | A Woman killed with kindness | her Jesses , and her bells . Away ? My Hawk killed to . I , but t was at the |
| 14 | Heywood | A Woman killed with kindness | sound too full , And spoil the mounting of your Hawk . T is lost . I grant it not : |
| 15 | Heywood | A Woman killed with kindness | but she brake away , Come , come , your Hawk is but a rifler . How ? I , and |
| 16 | Heywood | A Woman killed with kindness | good hound in all your kennel , Nor one good Hawk upon your Perch . How Knight ? So Knight ? |
| 17 | Heywood | A Woman killed with kindness | Sir Charles Mountford . True : with their Hounds and Hawks ? The matches were both plaid . Ha : and |
| 18 | Webster | The Duchess of Malfi | that hoped For a pleadon : There are rewards for hawks , and dogs , and When they have done us |
| 19 | Marston | The Malcontent | with the cough of the lungs still ? doos he hawk a nights still , he will not bite . No |
| 20 | Kyd | The Spanish Tragedy | savage Bull sustaines the yoke , In time all haggard Hawks will stoop to lure , In time small wedges cleave |
| 21 | Webster | The White Devil | second , what say you ? Do not like young hawks fetch a course about Your game flies fair and for |
| 22 | Webster | The White Devil | fear it : I ’ll answer you in your own hawking phrase , Some Eagles that should gaze upon the Sun |
| 23 | Webster | The White Devil | vndegestable words , Come up like stones we use give Hawk for physic . Why this is welsh to Latin . |
| 24 | Webster | The White Devil | bells And let you fly to the devil . Ware hawk , my Lord . Florence ! This is some treacherous |
| 25 | Webster | The White Devil | run of the lees for it . Your dog or hawk should be rewarded better Then I have been . I |
Here is another EEBO result, a more complex comparison of the idea of young hawks shedding their bonds and flying free, with new unfettered horizons. Churchyard, ironically, was a loyal servant of Oxford’s. Near the end of his life, he experienced the opposite, fettered at the hands of De Vere who cynically allowed him to be imprisoned for underwriting a rental payment of £25, a debt on which his Lordship had welched.15 Unlike the Earl of Oxford’s verse, the poet here is trying to trigger the imagination of the reader, not make them feel sorry for the writer. It contains imagination. Without EEBO V3, I would never have come across it.
When hard in arm’s, new comers are embraste,
Farewell old friend, go play you where you will:
The hawk hath prayed, the Haggards gorgs is full.
Love stays not long, it is but one years bird,
A foolish fit, that mak ’s wild wits go mad:
A gallant Colt, that ronneth for a gird,
A lime rode fine, to catch a lusty lad.
A youthful prank, that mak ’s age look full sad,
A merry mate, so long as money lasts:
Good for a flight, then of her bells she casts.
Thomas Churchyard
Thanks to the herculean efforts, hand keying parts of speech on every word in the canon and Bankside theatre, algorithmic analysis is no longer bound to comparisons of lexical and function words. Shakespeare’s text in the digital canon of University of Lancaster now looks like this.
and_CC his_APPGE means_NN :_YCOL Note_VV0 if_CSW your_APPGE Lady_NN1 strain_VV0 his_APPGE Entertainment_NN1 With_IW any_DD strong_JJ ,_YCOM or_CC vehement_JJ importunity_NN1 ,_YCOM Much_DA1 will_VM be_VBI seen_VVN in_II that_DD1 :_YCOL In_II the_AT meantime_NNT1 ,_YCOM Let_VV0 me_PPIO1 be_VBI thought_VVN too_RG busy_JJ in_II my_APPGE fears_NN2 ,_YCOM
It’s actually worse because alongside part of speech metadata, the words can be extracted with their position in the canon, the play, the act, the line number, the character speaking, the genre, the day of the first performance, date, year and so on. 46 comparative plays have been given the same careful treatment.
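To see what the tags make possible, the fragment above can be split back into word and tag pairs and queried by grammatical pattern. This is a minimal sketch with hypothetical function names. It assumes every token has the form `word_TAG` and matches tags by prefix, so `NN` also catches `NN1` and `NN2`.

```python
# The tagged fragment quoted above, unchanged.
TAGGED = (
    "and_CC his_APPGE means_NN :_YCOL Note_VV0 if_CSW your_APPGE Lady_NN1 "
    "strain_VV0 his_APPGE Entertainment_NN1 With_IW any_DD strong_JJ ,_YCOM "
    "or_CC vehement_JJ importunity_NN1 ,_YCOM Much_DA1 will_VM be_VBI seen_VVN "
    "in_II that_DD1 :_YCOL In_II the_AT meantime_NNT1 ,_YCOM Let_VV0 me_PPIO1 "
    "be_VBI thought_VVN too_RG busy_JJ in_II my_APPGE fears_NN2 ,_YCOM"
)


def parse(tagged_text):
    """Split 'word_TAG' tokens into (word, tag) pairs at the last underscore."""
    return [tuple(token.rsplit("_", 1)) for token in tagged_text.split()]


def count_tag_pairs(tokens, first_prefix, second_prefix):
    """Count adjacent tokens whose tags start with the two given prefixes."""
    return sum(
        1
        for (_, tag_a), (_, tag_b) in zip(tokens, tokens[1:])
        if tag_a.startswith(first_prefix) and tag_b.startswith(second_prefix)
    )


tokens = parse(TAGGED)
print("adjective + noun:", count_tag_pairs(tokens, "JJ", "NN"))
print("possessive + noun:", count_tag_pairs(tokens, "APPGE", "NN"))
```

Counts like these, taken over whole plays and normalised by length, are the kind of grammatical metric that plain word lists cannot supply.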
Algorithmic analysis can now address a writer’s penchant for adverbs following transitive verbs or a fondness for multiple adjectives with proper nouns, using a whole new range of metrics to categorise style. Always a losing battle for Deniers, stylometry as a weapon is now lost altogether. The last attempt by Professor Stritmatter to bring the early work of Shakespeare closer to Oxford using stylometric frequency analysis could hardly have gone worse. Reviews varied from incredulous to complete demolition.
Algorithms take over
The newest algorithmic methods do not require test banks. They do not hunt down individual word pairs and form conclusions on small sets of results. Because storage is cheap and limitless, and processing power can be too, the size of datasets and the size of query tasks are no longer a concern. Numbers that measure capacity today are meaninglessly huge. One whole oeuvre, however large, can easily be compared to vast online databases in hundreds of languages. All of Shakespeare can be compared to all of Marlowe, to all of Bankside theatre or to the whole of Elizabethan or Jacobean literature. Thanks to lemmatisation, spelling variants and particl… This enables algorithms to identify where it belongs and to whose orthography it most closely corresponds. Query based on Principal Component Analysis is what is used to tell you what a billion users are going to buy next on Amazon. The same methods can be used to lift Coriolanus and Hengist out of their positions and then answer questions about authorship, dating and genre by treating them as anonymous texts and asking where they would fit when they are put back.
This is the type of analysis that is extracting collaborative contributions by Marlowe to Shakespeare’s Henriad and looking for Shakespeare’s collaborations in the revision to Kyd’s Spanish Tragedy. The books on the subject are deep dives into statistical mathematics and not available at airport bookstores. One of the easiest to read is the revised edition of Hugh Craig and Arthur Kinney’s work on its application to Shakespeare, which explains the procedure carefully enough to understand the process without the crushingly complicated detail required to repeat their experiments.16
Today if you ask people who know how to analyse a quadrillion lines of browsing history on voter trends in Western Europe, “will AI ever be able to write something that we cannot tell from Shakespeare?”, they will unhesitatingly, confidently reply “Yes”. And they may be right however much we wish they weren’t. It will rely on algorithms that don’t currently exist, on shortcuts on which no one has yet started work, but what they can do now, in the Stone Age of Artificial Intelligence, is already beyond comprehension.
A CQP/EEBO V3 data query and its instant result, here demonstrating the random 16th and 17th century spelling varieties of “Shakespeare” in 330 different printed books, dispensing with the idea that a hyphenated name indicates a pseudonym. A twofer, in fact, as the idea that different spellings are significant is defenestrated through the same sash window. A hyphenated name may occasionally suggest a character trait in a play or novel (Doll Tear-sheet) but in the 1,165 instances here, none are being used to clue readers into the existence of a hidden author. In many books (and documents not in the EEBO database because they are unprinted) more than one variant appears. The Parnassus Plays, written by the Jacobean version of Cambridge Footlights, spell his name three different ways. Spelling was non-standard, often phonetic, sometimes idiosyncratic, throughout the period. But never needlessly cryptic.
| Num | Search result | Occurrences | Percent |
|---|---|---|---|
| 1 | Shakespeare | 850 | 72.96% |
| 2 | Shakespear | 133 | 11.42% |
| 3 | Shakespears | 37 | 3.18% |
| 4 | Shakspear | 29 | 2.49% |
| 5 | Shake-speare | 22 | 1.89% |
| 6 | Shakspeare | 20 | 1.72% |
| 7 | Shakespeares | 19 | 1.63% |
| 8 | Shake-spear | 6 | 0.52% |
| 9 | Shake-speares | 6 | 0.52% |
| 10 | Shakespere | 6 | 0.52% |
| 11 | Shackspear | 5 | 0.43% |
| 12 | Shakespeer | 4 | 0.34% |
| 13 | shakspears | 4 | 0.34% |
| 14 | Shackspeers | 3 | 0.26% |
| 15 | Shak-speare | 3 | 0.26% |
| 16 | Shakespeere | 3 | 0.26% |
| 17 | Shake-spears | 2 | 0.17% |
| 18 | Shakspeares | 2 | 0.17% |
| 19 | Shack-spear | 1 | 0.09% |
| 20 | Shackespeers | 1 | 0.09% |
| 21 | Shackspeare | 1 | 0.09% |
| 22 | Shackspears | 1 | 0.09% |
| 23 | Shackspeer | 1 | 0.09% |
| 24 | Shak-spear | 1 | 0.09% |
| 25 | Shakespea | 1 | 0.09% |
| 26 | Shakespeaks | 1 | 0.09% |
| 27 | Shakesperum | 1 | 0.09% |
| 28 | Shaksper | 1 | 0.09% |
| 29 | Shaksperus | 1 | 0.09% |
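
The arithmetic behind the table can be checked with a few lines of Python. This is only a sketch: the counts are copied by hand from the table above, it is not the CQP query itself, and the names `counts`, `hyphenated` and `percent` are illustrative. It recomputes the total and the Percent column and adds up the hyphenated variants, to show how small a share of the whole they make.

```python
# Occurrence counts copied from the table above (variant -> occurrences).
# This only re-checks the arithmetic; it does not query EEBO.
counts = {
    "Shakespeare": 850, "Shakespear": 133, "Shakespears": 37,
    "Shakspear": 29, "Shake-speare": 22, "Shakspeare": 20,
    "Shakespeares": 19, "Shake-spear": 6, "Shake-speares": 6,
    "Shakespere": 6, "Shackspear": 5, "Shakespeer": 4,
    "shakspears": 4, "Shackspeers": 3, "Shak-speare": 3,
    "Shakespeere": 3, "Shake-spears": 2, "Shakspeares": 2,
    "Shack-spear": 1, "Shackespeers": 1, "Shackspeare": 1,
    "Shackspears": 1, "Shackspeer": 1, "Shak-spear": 1,
    "Shakespea": 1, "Shakespeaks": 1, "Shakesperum": 1,
    "Shaksper": 1, "Shaksperus": 1,
}


def percent(n, whole):
    """Share of `whole` represented by `n`, rounded to two decimals."""
    return round(100 * n / whole, 2)


total = sum(counts.values())

# Variants that contain a hyphen, and how many instances they account for.
hyphenated = {variant: n for variant, n in counts.items() if "-" in variant}
hyphenated_total = sum(hyphenated.values())

# Reprint the table with a recomputed Percent column.
for variant, n in counts.items():
    print(f"{variant:<15}{n:>5}{percent(n, total):>8.2f}%")

print(f"Total instances: {total}")
print(f"Hyphenated instances: {hyphenated_total} "
      f"({percent(hyphenated_total, total)}% of the total)")
```

On these figures the hyphenated forms are spread across seven different spellings, and together they are a small minority of the instances counted.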
Footnotes
J. Thomas Looney, "Shakespeare" identified in Edward De Vere, the seventeenth earl of Oxford, (Cecil Palmer: London, March 1920).↩︎
E. K. Chambers, William Shakespeare, vol. I: A study of facts and problems, (Clarendon Press, 1930).↩︎
Donald W. Foster, “The Claremont Shakespeare Authorship Clinic: How Severe Are the Problems?” Springer, Computers and the Humanities, Vol. 32, no. 6, 1998, pp. 491–510.↩︎
Ward E. Y. Elliott and Robert J. Valenza, “And then there were none: Winnowing the Shakespeare claimants,” Springer Science and Business Media LLC, Computers and the Humanities, Vol. 30, no. 3, 1996, pp. 191–245.↩︎
W. Ron Hess, “Shakespeare’s Dates: Their Effects on Statistical Analysis,” The Oxfordian. The Annual journal of the Shakespeare Oxford Society, Vol. II, 1999.↩︎
Eva Turner Clark, Hidden allusions to Shakespeare’s Plays, (1931).↩︎
Kevin Gilvary, “Towards an Oxfordian Dating of Shakespeare’s plays,” De Vere Society, de Vere Society Newsletter, 2019.↩︎
Professor Dean Simonton came to the same conclusion as E&V with an analytical approach of stylistic changes over time based on psychological principles. “I devised an objective and quantitative method to test alternative chronologies by assessing which of the rival datings established the strongest correspondence between conspicuous political events and thematic content dealing with the same or similar political events (Simonton, 2004p). Besides testing two alternative Oxfordian chronologies, I evaluated the Stratfordian chronology with different temporal shifts. To my surprise, the traditional chronology shifted just two years earlier provided the best fit, suggesting that it took an average of two years for the initial event-inspired idea to result in a finished play (subsequent revisions presumably making minimal impressions on the linkages). Neither Oxfordian chronology provided any correspondences even when shifted forward and backward. The Oxfordian chronologies also did much worse than the Stratfordian in accounting for stylistic changes in the plays. (Simonton, 2004a).”↩︎
Hugh Craig and Arthur F. Kinney (eds.), Shakespeare, Computers, and the Mystery of Authorship, (Cambridge University Press: Cambridge (GB), 2009).↩︎
William Shakespeare, William Shakespeare and others: Collaborative plays, (Palgrave Macmillan: Houndmills, Basingstoke, Hampshire, 2013).↩︎
Macbeth is closely connected to King James, and scholars are in general agreement that it is unlikely to have been written prior to his accession in 1603. He considered Banquo his direct ancestor, and eight Stuart kings preceded James, just as Banquo is depicted at the end of “a show of eight kings” (4.1.126.1–2). A medal struck in 1605 commemorates the king’s escape from The Gunpowder Plot. It shows a serpent lurking among lilies and roses, with the legend, Detectus qui latuit (that which was hidden is disclosed). As Lady Macbeth says to her husband, “Look like the innocent flower, but be the serpent under’t.”↩︎
Jonathan Goldberg, James I and the politics of literature: Jonson, Shakespeare, Donne, and their contemporaries, (Johns Hopkins Univ. Press: Baltimore, Md. [u.a.], 1983).↩︎
“William Hazlitt - Characters of Shakespear’s Plays (1817).djvu/99 - Wikisource, the free online library,” https://en.wikisource.org/wiki/Page:William_Hazlitt_-_Characters_of_Shakespear%27s_Plays_(1817).djvu/99 p. 69.↩︎
Roger A. Stritmatter PhD and Bryan H. Wildenthal J.D, Poems of Edward de Vere, 17th Earl of Oxford, Volume I: He that Takes the Pain to Pen the Book, (Independently published).↩︎
Alan H. Nelson, Monstrous adversary: The life of Edward de Vere, 17th Earl of Oxford, (Liverpool University Press: Liverpool, 2003). p. 328.↩︎
Craig and Kinney (eds.), Shakespeare, Computers, and the Mystery of Authorship.↩︎