Once learners have gained control of the first 2,000 or so most common words in English – without which efficient communication and independent use of the language are very difficult – the vocabulary learning task becomes more challenging. The first 2,000 words are ubiquitous and do most of the heavy lifting; they support all the rest of the vocabulary and form the core of the language. Understandably, most language courses and suites of vocabulary learning materials focus on getting learners across the 2,000-word threshold as quickly and as painlessly as possible.
But what happens after that? One thing is for sure: you will never be able to teach your students enough vocabulary, never be able to say, “That’s it, job done!” It’s not like grammar teaching, where you can push learners to a level where they have encountered and practised all but the most obscure structures and patterns. In the case of vocabulary, after the first 2,000, many new words will be encountered just once, which is hardly enough to learn all you need to know about them. So, normally, we consider the next step in vocabulary learning to be the shift from ‘breadth’ (how many words you know) to developing ‘depth’ (what you know about those words and how to use them). Students will still learn new words, but they’ll also start to focus more on areas such as collocation (how words combine with each other), register, connotations, style and so on.
Learners typically develop a knowledge of collocation through extensive reading and listening, and steady practice of speaking and writing with good feedback. There are also materials specifically for the teaching of collocations. However, learner corpora show us that stubborn problems in the use of collocations can often remain, even when students reach the C-levels of the CEFR.
There will always be problems with remembering individual collocations, but one thing that a corpus gives us is the ability to see patterns of use and patterns of error, features that occur regularly in the performances of hundreds or even thousands of learners across the world. Let’s look at three such patterns of difficulty that occur across the CEFR levels, as observed in the multi-million-word Cambridge Learner Corpus.
The first pattern is collocation errors with delexical verbs. Delexical verbs are common, everyday verbs such as take, get, have, make and do, which combine with nouns to describe activities (e.g. make dinner, get a ticket, take a photo). The verbs are called delexical because we establish their meaning from the words they combine with: compare the different types of ‘having’ in have dinner, have a baby, have a car.
These collocations are often unpredictable (e.g. we make an effort but we do our duty and we take a photo) and hard to recall when needed. So even at higher CEFR levels we find students saying She got [had] a baby last year and He made [did] his best (the correct verb is shown in square brackets). It’s not always easy to see a pattern that could be useful in teaching, but one pattern that emerges is students using make instead of do for physical activities (e.g. with words such as exercise, shopping, housework). This is an area we can focus on in teaching to build awareness.
A second pattern that persists across the levels is getting the wrong word order with what are called binomial expressions. These are typically two items connected by and (e.g. fish and chips, safe and sound, back and forth). The problem here is remembering which comes first. In some pairs of words, the order is flexible (it doesn’t really matter if we say girls and boys or boys and girls), but in others it is fixed (we don’t say sound and safe or forth and back). Even at C2 level we find students writing about the cons and pros of something and about white and black photographs of their grandparents. This would seem to be a problem of memory, or awareness, or both. First you need to be aware of which expressions are indeed fixed, and then you need to remember the order. It can help to record binomial expressions in a vocabulary notebook, with a warning note wherever the order is completely fixed.
A final pattern is rather more unusual and difficult to explain. Students often collocate two words which are (near-)synonyms. We find learners at all CEFR levels writing about urban cities or a stench smell or a quiet silent place – doubling up the meaning! Why does this happen? It may be because students think you can intensify a word by using a similar word; it may be because they want to display their vocabulary knowledge to the full; or it may be that when teachers give two possible answers to an exercise (“Here you could say stench or smell”) the student hears “stench smell” and thinks it’s a collocation. Or there may be other reasons. There is no obvious remedy for this; teachers just have to keep an eye out for it.
One thing that is true is that collocation remains a challenge in learning any language and it is probably one of the last things our learners perfect. Using a corpus to tell us how learners use collocations and what problems they typically have is a good first step to helping them achieve collocational awareness and accuracy as quickly and efficiently as possible.