When somebody asks which words are used most often (be they nouns or articles), you may start to think it over, but the question soon turns out to be rather vague. How can we know that? Is there a person who analyzes all of our literary heritage and everyday conversations and then counts every English word? Otherwise, it is impossible to tell for sure how often we use this or that word, isn’t it? Actually, it is possible, and here we shall discuss what a language corpus is and how interesting it can be.
Let us start with some lexicological statistics. In case you didn’t know, according to the Oxford English Dictionary (2nd ed.), the English language has:
- 171,476 words in current use
- 47,156 obsolete words
- 9,500 derivatives
Over half of them are nouns, about a quarter are adjectives, and about a seventh are verbs. The rest is made up of interjections, prepositions, conjunctions, etc. But when we count all compounds (age + less = ageless), inflected forms (like running and runs), blends (gigantic + enormous = ginormous), clipped words (gymnasium – gym), and slang words, we get 1,025,109.8 words, as Global Language Monitor estimates. Besides, Shakespeare himself is credited with inventing over 1,700 words, so you can try too. As you see, you still have a lot to learn. But don’t worry: to speak English fluently and understand others, you need only about 2,000 of them.
All those words are found in collections of authentic spoken and written language, “world texts” accessed via the internet, that together make up a corpus. A corpus, therefore, is a systematic, computerized collection of naturally occurring language samples used for linguistic analysis.
The analysis itself deals with the frequency of the phenomenon under investigation and, of course, is performed with special computer software. But okay, how exactly does it work?
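In its simplest form, counting word frequency can be sketched in a few lines of Python. This is only a toy illustration, not how any real corpus tool works: the three-sentence `sample_corpus` and the `word_frequencies` helper are made up for the example, and a real corpus holds millions of words and needs far more careful handling of spelling, punctuation and word forms.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count how often each word occurs across a collection of texts."""
    counts = Counter()
    for text in texts:
        # Lowercase the text and treat runs of letters/apostrophes as words.
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(words)
    return counts


# A made-up miniature "corpus" used only for illustration.
sample_corpus = [
    "The cat sat on the mat.",
    "The dog chased the cat.",
    "A cat is a friend of the dog.",
]

frequencies = word_frequencies(sample_corpus)

# The two most frequent words together with their counts.
top_two = frequencies.most_common(2)
print(top_two)
```

Even in this tiny sample the article “the” comes out on top, which hints at why it leads the frequency lists built from real corpora.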
Global Language Monitor, which we mentioned earlier, is a company that analyzes and tracks language usage trends worldwide. GLM’s main technology is called Narrative Tracker; it is based on global discourse on the Internet, print and electronic global media, the blogosphere, and social media sources. It provides a real-time picture of the current language situation. Although the most popular word appears to be the article “the”, and there’s nothing strange about that, we invite you to take a look at the following set of words (from other parts of speech) that are the most frequent at the moment:
All these words are listed in the General Service List and its updated version, the New General Service List, which gathers approximately 2,000 of the most “popular” words. The top 5 are easy to memorize: the, be, of, a, to. You may find the complete list here.