There are plenty of word frequency lists, but the plural, adjective, and adverb forms of the same word end up in different positions in these lists.
I’m looking for a dataset, or a way to create one, that has all forms of one word clumped together, so it’s less about raw frequency and more about how familiar the word (and its different forms) is, if that makes sense.
For instance, I have a list where the word “have” is in 25th place, “has” is 39th, and “had” is 105th. Clearly, anyone who knows one of these words would know the other two as well.
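To make the idea concrete, here is a minimal Python sketch of the grouping I have in mind: sum the counts of every form under one base word (its lemma) and rank by the total. The counts and the form-to-lemma map below are made up for illustration; a real version would presumably take the map from a lemmatizer or a morphological lexicon, and `group_by_lemma` is just a name I picked.

```python
from collections import defaultdict

# Made-up counts for illustration; a real list would come from a corpus.
form_counts = {"have": 5000, "has": 3200, "had": 1100, "cat": 400, "cats": 250}

# Made-up form-to-lemma map; a lemmatizer would normally supply this.
lemma_of = {"has": "have", "had": "have", "cats": "cat"}


def group_by_lemma(counts, lemma_map):
    """Sum counts per lemma and return (lemma, total, forms), highest total first."""
    totals = defaultdict(int)
    forms = defaultdict(list)
    for form, count in counts.items():
        # Forms missing from the map are treated as their own lemma.
        lemma = lemma_map.get(form, form)
        totals[lemma] += count
        forms[lemma].append(form)
    ranked = sorted(totals, key=totals.get, reverse=True)
    return [(lemma, totals[lemma], sorted(forms[lemma])) for lemma in ranked]


ranked_lemmas = group_by_lemma(form_counts, lemma_of)
for lemma, total, word_forms in ranked_lemmas:
    print(lemma, total, word_forms)
```

With this, “have”, “has”, and “had” would share a single rank based on their combined count instead of sitting at three separate positions.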
Apologies if I did not get my point across clearly. Any help is appreciated. Thanks!
submitted by /u/haskpro1995