Frequency Analysis
New ways to deconstruct De Vere digitally depriving Doubters of dispositive data
In the same way that they welcomed digital stylometry, Doubters were keenly interested in the arrival of large, organised corpora of digital texts, which brought new research tools and new ways to count things and calculate probabilities. Counting function words (the words writers use unconsciously) could be used as a recipe for analysing style, but once again large data turned out not to be helpful in promoting one writer’s claim to the corpus of another.
Counting the usage of individual words and searching for tell-tale expressions, however, became much easier once a single search could extract results across an entire corpus. This also went wrong when Roger Stritmatter wrote a whole book attempting to connect De Vere and Shakespeare by counting unusual or rare expressions — miscounting, in fact.
EEBO V3 is the latest collection of Early English Books Online. V3 differs from previous versions in that you can count parts of speech as well as identify function words. You can judge the usefulness of this for yourself on the site maintained by Lancaster University and see just how quickly you can separate one poet from another.
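As a rough illustration of what counting function words involves, here is a minimal Python sketch. It is not the method used by EEBO or the Lancaster site: the word list, the tokenising rule and the name function_word_rates are all assumptions made for the example. It reports how often each listed word occurs per 1,000 words of a text, which lets samples of different lengths be compared.

```python
import re
from collections import Counter

# Assumption: a tiny illustrative list. Real stylometric studies use far
# longer, carefully chosen sets of function words.
FUNCTION_WORDS = ("the", "and", "of", "to", "in", "that", "but", "with")

# Assumption: a word is a run of letters, optionally with internal
# apostrophes (straight or curly), as in "ling'ring".
TOKEN = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def function_word_rates(text, words=FUNCTION_WORDS, per=1000):
    """Return the rate of each function word per `per` tokens of text."""
    tokens = TOKEN.findall(text.lower())
    if not tokens:
        return {w: 0.0 for w in words}
    counts = Counter(tokens)
    return {w: per * counts[w] / len(tokens) for w in words}


if __name__ == "__main__":
    sample = "The cat and the dog"
    for word, rate in function_word_rates(sample).items():
        print(f"{word:5s} {rate:7.1f}")
```

Comparing two writers would mean running the same function over a sample from each and looking at where the rates diverge. Spelling variation in early modern texts is ignored here and would need handling before the numbers meant much.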
If you want to work with Oxford, our resources section contains a De Vere corpus we created, which can be uploaded for comparison with the Bankside corpora, all of which have been hand-keyed and inspected by the team at Lancaster University. Our De Vere corpus, while not exactly hand-keyed, contains the 20 poems claimed as canonical by the De Vere Society and his 45,000-word letter repository, which you can browse on this site. It too has every word tagged with metadata.
The nature of stylistic enquiry has completely changed. We used EEBO to extract haggard hawks, but the new version lets you search for haggard as both noun and adjective and pair it with other birds or other nouns, restricting your search (in the case of Shakespeare) to plays, genres, even individual characters or characters grouped by their age or social status. You can now analyse the difference between aristocrats, the middle class and the groundlings to see how Shakespeare varied vocabulary and parts of speech between them.
This is a new field of study and so unlikely to assist Doubters with anything other than final enlightenment.
Alliteration
If you’ve read Oxford’s poetry you will have been struck by his use of clangorous alliterative lists of nouns and adjectives.
My life, through ling’ring long, is lodg’d in lair of loathsome ways;
Now this really doesn’t sound like Shakespeare at all, except when he is mocking over-reaching artistic pretensions. There are more instances, however, of Shakespeare alliterating multiple words than you might at first think. The three tables below cover his long poems and sonnets, the First Folio plays, and a selection of contemporary dramatists for comparison. Oxford comes last — his entire surviving poetic output, machine-tagged from a single corpus. Spend two minutes with the Shakespeare tables first.
The tables can be sorted and filtered — compare the alliteration in Macbeth to that in Venus and Adonis, or see how Marlowe and Jonson handle the same device.
Shakespeare’s poems and sonnets
Shakespeare’s plays (First Folio)
Contemporary dramatists
Edward de Vere, 17th Earl of Oxford
The table below covers de Vere’s entire surviving poetic output — the 20 poems accepted as canonical by the De Vere Society. Unlike the Shakespeare tables above, which are drawn from hand-keyed, editorially verified corpora, the Oxford corpus was processed by treetagging software applied to the poems as a single undifferentiated text, so individual poem boundaries are not identified. The difference in volume is not an artefact of the method.
79 instances in the poems and sonnets | 901 in the First Folio | 1037 across 21 contemporary dramatists | 30 in the complete De Vere poetic canon
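Any count of this kind depends on how an alliterative instance is defined, and that definition is not given here. The Python sketch below shows one possible rule, purely as an assumption: three or more consecutive content words in a line that share an initial letter, with a short list of small words skipped rather than allowed to break the run. The skip list, the threshold and the name alliterative_runs are hypothetical, and the rule compares letters, not sounds.

```python
import re

# Assumption: small words are skipped so that "lodg'd in lair of loathsome"
# still counts as one run. The list is illustrative only.
SKIP = {"a", "an", "and", "the", "of", "in", "is", "to", "my", "through",
        "with", "for", "by", "on", "at", "as", "but", "or"}

# Letters with optional internal apostrophes (straight or curly).
WORD = re.compile(r"[a-z]+(?:['’][a-z]+)*")


def alliterative_runs(line, min_len=3):
    """Return (letter, words) for each run of min_len or more content words
    sharing the same initial letter. Letters, not sounds, are compared, so
    'knight' and 'night' would not match."""
    content = [w for w in WORD.findall(line.lower()) if w not in SKIP]
    runs, current = [], []
    for word in content:
        if current and word[0] == current[-1][0]:
            current.append(word)
        else:
            if len(current) >= min_len:
                runs.append((current[0][0], current))
            current = [word]
    if len(current) >= min_len:
        runs.append((current[0][0], current))
    return runs


if __name__ == "__main__":
    oxford = "My life, through ling’ring long, is lodg’d in lair of loathsome ways;"
    for letter, words in alliterative_runs(oxford):
        print(letter, len(words), " ".join(words))
```

Run on the Oxford line quoted earlier, this rule picks out a single run of six words beginning with l. A different skip list or threshold would give different totals, so counts are only comparable when every corpus is measured by the same rule.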