Poe's Prodigious Prose: A Study in Type-Token Ratio
Run The Fall of the House of Usher through Prose Parser and one metric immediately jumps off the page: Edgar Allan Poe's Type-Token Ratio (TTR) of 0.30 is nearly three times that of the next-highest work in our Classics Library.
But what does this actually mean? And more importantly, what doesn't it mean?
What is Type-Token Ratio?
Type-Token Ratio is one of the oldest and most intuitive measures of vocabulary richness. The formula is simple:
TTR = Unique Words (Types) / Total Words (Tokens)
A TTR of 1.0 would mean every word in the text is unique—no repetition whatsoever. A TTR approaching 0 would indicate extreme repetition of a small vocabulary set.
In The Fall of the House of Usher, Poe uses 2,134 unique words across 7,101 total words, yielding a TTR of 0.30. On average, three of every ten words he writes are words he has not used earlier in the story.
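To make the formula concrete, here is a minimal Python sketch of the calculation. The `tokenize` helper and its rules (lowercasing, treating runs of letters with internal straight apostrophes as words) are assumptions made for illustration; Prose Parser's own tokenization may differ, and different rules will give slightly different counts.

```python
import re

def tokenize(text):
    # Assumed rule: lowercase everything, then treat runs of letters
    # (with optional internal straight apostrophes) as words.
    return re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())

def type_token_ratio(tokens):
    # TTR = unique words (types) / total words (tokens)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

sample = "The cat sat on the mat and the dog sat too"
tokens = tokenize(sample)
print(len(set(tokens)), len(tokens))          # 8 11
print(round(type_token_ratio(tokens), 3))     # 0.727

# The same arithmetic with the counts reported for Usher above:
print(round(2134 / 7101, 2))                  # 0.3
```

Note how high the TTR of the eleven-word sample is: very short texts barely have room to repeat themselves.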
Poe vs. the Classics
Here's how Poe stacks up against other works in the library:
| Work | Author | TTR | Word Count |
|---|---|---|---|
| The Fall of the House of Usher | Edgar Allan Poe | 0.300 | 7,101 |
| Alice's Adventures in Wonderland | Lewis Carroll | 0.105 | 26,463 |
| A Scanner Darkly | Philip K. Dick | 0.103 | 85,180 |
| Frankenstein | Mary Shelley | 0.095 | 75,166 |
| The Picture of Dorian Gray | Oscar Wilde | 0.090 | 79,199 |
| Moby-Dick | Herman Melville | 0.088 | 214,532 |
| The Sun Also Rises | Ernest Hemingway | 0.075 | 67,897 |
| Dune | Frank Herbert | 0.068 | 200,461 |
| Dracula | Bram Stoker | 0.065 | 160,973 |
| Great Expectations | Charles Dickens | 0.062 | 185,545 |
| Sense and Sensibility | Jane Austen | 0.055 | 119,728 |
Poe's number looks extraordinary—but notice that column on the right? That's where things get complicated.
The Elephant in the Room: Text Length
Here's the inconvenient truth about TTR: it's heavily dependent on text length.
As a text grows longer, words inevitably repeat. There are only so many ways to say "the," "said," or "was." The longer you write, the more your TTR will decline, regardless of your actual vocabulary.
Poe's story is 7,101 words. Melville's Moby-Dick is over 214,000. If we extracted a 7,000-word sample from Moby-Dick, its TTR would almost certainly be far higher than the 0.088 we measure for the whole novel.
This is a fundamental limitation of TTR, and it's why linguists have developed alternative metrics.
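One common way to soften the length effect is to measure TTR over equal-sized chunks and average the results, often called mean segmental TTR. The sketch below illustrates that idea under stated assumptions: the function names and the segment size are hypothetical, the input is an already-tokenized list of words, and nothing here is meant to describe how Prose Parser computes its figures.

```python
def ttr(tokens):
    # Plain type-token ratio of a token list.
    return len(set(tokens)) / len(tokens) if tokens else 0.0

def mean_segmental_ttr(tokens, segment_size=1000):
    # Average the TTR of consecutive, non-overlapping segments of
    # exactly `segment_size` tokens; a shorter final remainder is dropped.
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    ratios = [
        ttr(tokens[start:start + segment_size])
        for start in range(0, len(tokens) - segment_size + 1, segment_size)
    ]
    # None signals that the text is shorter than one segment.
    return sum(ratios) / len(ratios) if ratios else None

# Toy text: a 50-word vocabulary cycled 20 times (1,000 tokens).
vocabulary = [f"w{i}" for i in range(50)]
long_text = vocabulary * 20

print(ttr(long_text))                       # 0.05
print(ttr(long_text[:100]))                 # 0.5
print(mean_segmental_ttr(long_text, 100))   # 0.5
```

The vocabulary is identical in all three measurements; only the amount of text changes. The whole-text TTR collapses to 0.05, while the segment-based figure stays at 0.5 no matter how many times the cycle repeats.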
Beyond TTR: The Hapax Ratio
One metric that provides additional insight is the hapax ratio—the percentage of unique words that appear only once in the text (called hapax legomena, Greek for "said only once").
| Work | Hapax Ratio |
|---|---|
| The Fall of the House of Usher | 70.6% |
| A Scanner Darkly | 57.1% |
| Dracula | 55.7% |
| The Picture of Dorian Gray | 55.6% |
| Moby-Dick | 54.2% |
| Alice's Adventures in Wonderland | 54.0% |
| The Sun Also Rises | 52.9% |
| Dune | 52.4% |
| Great Expectations | 52.4% |
| Frankenstein | 51.2% |
| The Wizard of Oz | 47.6% |
| Sense and Sensibility | 45.8% |
Poe leads again. The hapax ratio is somewhat less sensitive to text length than TTR, though it is not immune: a longer text gives every rare word more chances to recur. Over 70% of Poe's unique words appear exactly once. He uses a word, then moves on to another. This is consistent with the idea that Poe deliberately avoided repetition in his vocabulary choices.
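The hapax ratio as defined above (hapax legomena divided by unique words) can be sketched in a few lines of Python. The function name is hypothetical and the input is assumed to be an already-tokenized, lowercased list of words; how Prose Parser tokenizes is not covered here.

```python
from collections import Counter

def hapax_ratio(tokens):
    # Share of distinct words (types) that occur exactly once.
    counts = Counter(tokens)
    if not counts:
        return 0.0
    hapaxes = sum(1 for count in counts.values() if count == 1)
    return hapaxes / len(counts)

tokens = "the cat sat on the mat and the dog sat too".split()
# 8 distinct words; "the" and "sat" repeat, the other 6 appear once.
print(hapax_ratio(tokens))   # 0.75
```

The denominator is the number of unique words, not the total word count, which is why this figure reads so differently from TTR.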
What This Tells Us About Poe
Poe was famously meticulous about word choice. In his essay "The Philosophy of Composition," he described constructing "The Raven" with mathematical precision, selecting each word for maximum effect.
The data suggests this wasn't just talk. Poe's writing exhibits:
- Extreme vocabulary diversity: A TTR of 0.30 means roughly 30% of the words in the story are first appearances of a distinct word
- Minimal word recycling: Over 70% of his vocabulary appears only once
- Dense, demanding prose: His Flesch readability score of 48.2 places him at "Difficult (College)" level
This aligns with Poe's gothic aesthetic. His ornate, deliberately unusual word choices create the atmosphere of creeping dread that defines his work. Words like "insufferable," "pestilent," "arabesque," and "phantasmagoric" appear once, do their work, and vanish—leaving an impression without becoming repetitive.
The Honest Conclusion
Is Poe's vocabulary genuinely richer than Melville's or Dickens's? The data can't definitively answer that question. TTR's length dependency means we're not comparing apples to apples.
What we can say is that within the confines of a 7,000-word short story, Poe demonstrates remarkable vocabulary diversity. His hapax ratio suggests deliberate avoidance of repetition. And his "Difficult" readability score is consistent with word choices that lean toward the unusual and elevated.
For writers, the lesson isn't to chase a high TTR. The lesson is to recognize that word choice creates atmosphere. Poe's vocabulary isn't just extensive; it's purposeful. Every unusual word serves the mood of decay, dread, and psychological dissolution.
Try It Yourself
Want to see how your vocabulary measures up? Run your own writing through Prose Parser to discover your Type-Token Ratio, hapax legomena, and other vocabulary metrics.
Explore the full Usher analysis →
Or analyze your own writing to discover your verbal fingerprints.