Analysis
Sense and Sensibility
by Jane Austen
Unique Words
—
Type-Token Ratio
—
Hapax Legomena
—
Zipf Exponent
—
Lexical Diversity
Type-Token Ratio (TTR)
—
Ratio of unique words to total words. Higher values indicate greater vocabulary diversity.
Hapax Ratio
—
Percentage of words that appear only once. Higher values suggest more varied vocabulary.
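The two ratios above can be sketched in a few lines. This is a minimal illustration, not the page's own implementation: the tokenizer (lowercased alphabetic words, with internal apostrophes and hyphens kept) and the choice to express the hapax ratio relative to unique words rather than total words are both assumptions, and the function names are hypothetical.

```python
import re
from collections import Counter

def tokenize(text):
    # Assumption: a word is a run of letters, optionally joined by
    # internal apostrophes or hyphens; everything is lowercased.
    return re.findall(r"[a-z]+(?:['-][a-z]+)*", text.lower())

def lexical_diversity(text):
    tokens = tokenize(text)
    counts = Counter(tokens)
    n_tokens = len(tokens)          # total words
    n_types = len(counts)           # unique words
    hapax = sum(1 for c in counts.values() if c == 1)
    return {
        "tokens": n_tokens,
        "types": n_types,
        # Type-Token Ratio: unique words divided by total words.
        "ttr": n_types / n_tokens if n_tokens else 0.0,
        # Assumption: hapax ratio is taken over unique words.
        "hapax_ratio": hapax / n_types if n_types else 0.0,
    }

sample = "the cat sat on the mat and the dog sat too"
stats = lexical_diversity(sample)
print(stats)
```

TTR falls as a text gets longer, because common words keep recurring, so values are only comparable between texts of similar length.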
—
Yule's K
Vocabulary concentration
—
Simpson's D
Word repetition probability
—
Dis Legomena
Words appearing twice
—
Top 10 Coverage
Percentage of the text accounted for by the 10 most frequent words
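The four measures above can be computed from a single frequency table. The sketch below uses the common textbook definitions, which may differ from what this page computes: Yule's K as 10^4 × (Σf² − N) / N² and Simpson's D as Σf(f − 1) / (N(N − 1)), where f is a word's count and N is the total number of words. The function name and the returned keys are hypothetical.

```python
from collections import Counter

def concentration_metrics(tokens):
    counts = Counter(tokens)
    n = len(tokens)
    sum_sq = sum(c * c for c in counts.values())
    # Yule's K (assumed definition): higher means more repetition.
    yules_k = 1e4 * (sum_sq - n) / (n * n) if n else 0.0
    # Simpson's D (assumed definition): probability that two words drawn
    # at random without replacement are the same word.
    pairs = sum(c * (c - 1) for c in counts.values())
    simpsons_d = pairs / (n * (n - 1)) if n > 1 else 0.0
    # Dis legomena: words that appear exactly twice.
    dis_legomena = sum(1 for c in counts.values() if c == 2)
    # Share of the text covered by the 10 most frequent words.
    top10 = sum(c for _, c in counts.most_common(10))
    top10_coverage = 100.0 * top10 / n if n else 0.0
    return {
        "yules_k": yules_k,
        "simpsons_d": simpsons_d,
        "dis_legomena": dis_legomena,
        "top10_coverage": top10_coverage,
    }

metrics = concentration_metrics("a a a b b c".split())
print(metrics)
```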
Word Frequency Distribution
Log-log plot of rank against frequency for the top 100 words, showing how closely the distribution follows Zipf's law (α ≈ 1.0 is typical of natural language)
Zipf Exponent (α): —
Fit (R²): —
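One plausible way to obtain the exponent and the fit quality is an ordinary least-squares line through log(frequency) against log(rank) for the top 100 words. The page does not say how it fits the line, so treat this as a sketch under that assumption; `zipf_fit` is a hypothetical name.

```python
import math
from collections import Counter

def zipf_fit(tokens, top_n=100):
    # Frequencies of the top_n words, in descending order (rank 1 first).
    freqs = [count for _, count in Counter(tokens).most_common(top_n)]
    if len(freqs) < 2:
        raise ValueError("need at least two distinct words to fit a line")
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    # R² of a simple linear regression; reported as 0.0 when all
    # frequencies are identical and the quantity is undefined.
    r_squared = (sxy * sxy) / (sxx * syy) if syy > 0 else 0.0
    # Zipf's law is f(r) ∝ r^(-α), so α is the negated slope.
    return -slope, r_squared

# Synthetic counts that follow f(r) = 120 / r exactly.
tokens = []
for rank, word in enumerate(["the", "to", "of", "and"], start=1):
    tokens.extend([word] * (120 // rank))
alpha, r_squared = zipf_fit(tokens)
print(alpha, r_squared)
```

A least-squares fit on log-log axes is simple but gives every rank equal weight; maximum-likelihood estimators are often preferred when the exponent itself is the quantity of interest.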
Top 20 Words by Rank
Most frequent words and their usage count
Word Frequency Categories
—
Hapax (1x)
Words appearing once
—
Rare (2-5x)
Infrequent words
—
Common (6-20x)
Regular words
—
Frequent (21+)
High-use words
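The four bands can be tallied by counting unique words per frequency range. The sketch assumes the "Frequent" band starts at 21 occurrences so that it does not overlap the 6-20 band; the function and key names are hypothetical.

```python
from collections import Counter

def frequency_categories(tokens):
    bands = {"hapax": 0, "rare": 0, "common": 0, "frequent": 0}
    for count in Counter(tokens).values():
        if count == 1:
            bands["hapax"] += 1      # appears once
        elif count <= 5:
            bands["rare"] += 1       # 2-5 occurrences
        elif count <= 20:
            bands["common"] += 1     # 6-20 occurrences
        else:
            bands["frequent"] += 1   # assumption: 21 or more
    return bands

tokens = ["a"] + ["b"] * 3 + ["c"] * 20 + ["d"] * 21 + ["e"]
bands = frequency_categories(tokens)
print(bands)
```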
Rare & Unusual Words
Content words (5+ letters) that appear infrequently; these may be specialized or uncommon vocabulary.
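A filter of this kind might look like the sketch below. The page does not define "content word" or "infrequently", so the stopword set (a tiny illustrative sample) and the cutoff of at most two occurrences are assumptions, as is the function name.

```python
from collections import Counter

# Assumption: a small illustrative stopword set; a real list would be longer.
STOPWORDS = {"about", "after", "again", "could", "other", "should",
             "their", "there", "these", "those", "where", "which",
             "while", "would"}

def rare_content_words(tokens, min_len=5, max_count=2):
    counts = Counter(tokens)
    rare = [
        (word, count)
        for word, count in counts.items()
        if len(word) >= min_len          # 5+ letters
        and count <= max_count           # assumed meaning of "infrequently"
        and word not in STOPWORDS        # keep content words only
    ]
    # Rarest first, then alphabetical for a stable order.
    return sorted(rare, key=lambda wc: (wc[1], wc[0]))

text = ("their sensibility was strong but their sense was stronger "
        "sensibility sensibility")
rare = rare_content_words(text.split())
print(rare)
```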
Longest Words
All Words
Without Hyphens
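The two views of the longest-words list, with and without hyphenated compounds, can be sketched as a single function with a switch. The function name, the default of ten results, and the alphabetical tie-break are assumptions for illustration.

```python
def longest_words(tokens, n=10, allow_hyphens=True):
    vocab = set(tokens)  # each unique word is considered once
    if not allow_hyphens:
        # "Without Hyphens" view: drop hyphenated compounds entirely.
        vocab = {word for word in vocab if "-" not in word}
    # Longest first; ties are broken alphabetically.
    return sorted(vocab, key=lambda word: (-len(word), word))[:n]

tokens = ["good-natured", "acknowledgment", "sense", "well-disposed", "a"]
all_words = longest_words(tokens, n=2)
no_hyphens = longest_words(tokens, n=2, allow_hyphens=False)
print(all_words, no_hyphens)
```

Separating the two views is useful because hyphenated compounds tend to crowd out long single words in 19th-century prose.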
Hapax Legomena
Words that appear exactly once in the text