

William H. Calvin and Derek Bickerton, Lingua ex Machina: Reconciling Darwin and Chomsky with the Human Brain (MIT Press, 2000), chapter 2.

copyright ©2000 by William H. Calvin and Derek Bickerton

Yes, reading it on the web is still a drag.  The nonvirtual 300 pp. book is available direct from MIT Press.

William H. Calvin

University of Washington
Seattle WA 98195-1800 USA



What Are Words?

      Okay, I'll start off with what's a word and with why primate utterances aren't really words, because they can't be combined with others for a new meaning.





Were someone asked what a sentence was, the reply would almost certainly include something about words: that a sentence consisted of words strung together, or something of that sort. But when you think of words, what are they exactly? The word "word" seems to have some sort of intermediate existence between, on the one hand, very concrete terms like "table" and "chair" and, on the other, very abstract ones like "chore" and "nothing."

On the one hand, "word" differs from "table" and "chair" in that we can say, "This chair is made of wood, that one of metal, that table of plastic," and so on. We can't say the same about any of the various referents to which "word" may be applied. Any word may appear in a variety of guises: as soundwaves between a mouth and an ear; as marks on a page; and (in some sense yet to be defined) as things we have in our brains. We can remember words, forget them, confuse them with one another; in short, perform on them any of the operations we can perform on any of the things in our memory store.

On the other hand, we are able to say that words consist of something. They're not like chores; anything can be a chore, depending on how you look at it. They're not like nothing, which can't be like anything. And there's a still more dramatic difference between "word" and any other word. The word "chair" is not a chair, and the word "chore" is not a chore, but the word "word" is a word. So what on earth can we mean when we talk about words? Of course, everybody knows what a word is, but again, it's like knowing what a sentence is. We know one when we see one, but when it comes to saying what it actually is, the problems begin.

For, of course, the whole evolutionary purpose of gathering, storing and filing such impressions is to be able to identify things. If we identify an orange as an orange, we then know that we can eat it without harm. If we identified an orange as a deadly nightshade berry, we wouldn't eat it, and would be deprived of its nutritive value. If we identified a deadly nightshade berry as an orange, we might eat it, and might die as a consequence. So it's very clear that the correct identification of things in the world (correct in terms of the consequences we predict from them, rather than in any sense of absolute truth) is adaptive, in the evolutionary sense of the term.

That is, if you identify things correctly, you survive and (hopefully) reproduce, breeding descendants able to identify at least as well. If you don't identify correctly, you are slightly more likely to die before you reach reproductive age, and your misidentifying genes will not make it as far into the future. However slight an edge you get from correct identification, it's all you need to ensure that in a few hundreds or thousands of generations, most members of your species will identify things at least as well as you did, while most who were worse at it will be long gone.

Of course, the example I gave is absurdly simple; most creatures can distinguish between things that resemble one another far more than do oranges and deadly nightshade berries. And because of their evolutionary value, these processes of identification, these fine discriminations in terms of stored sensory impressions, began very early in evolution, long before mammals walked the earth, or dinosaurs, indeed long before the first sea-creature, balancing perilously on its fins, dared to trespass on a still barren and desolate dry land. In us, those processes may seem to have reached a higher pitch of refinement (though organisms as diverse as bats, pit-vipers, and electric eels have highly developed senses that we do not possess even in rudimentary form). But they are in no way different in type from the processes that operate in other species, including species we might fancifully suppose to be considerably "lower" than we are.

So let's try a slightly different approach. Let me state the minimal conditions that a neurological model of how words are represented has to meet in order to be plausible, in light of what we currently know about language. Those conditions will be, so far as I can make them, neutral with respect to whatever theory of language one may hold (Chomskian, functionalist, whatever). (There are a few things that linguists can agree on, though you might not think so if you heard them arguing.)

A word is, as one might expect after all this, something multifaceted. In order for a word to function, it has to trigger off a concept in the hearer's mind. If the speaker says "orange," this has to activate some kind of concept of orange in the hearer's mind. Otherwise it would be as if I said "naranja" and you didn't know Spanish.

Two problems here. The first is general, but I want to sidestep it, for the moment, at least: that is, what a word actually represents. A naive view is that it represents an object: "orange" represents an orange, or oranges. But then what do words like "absence" or "nothing" represent? Ferdinand de Saussure said, no, words represent concepts. But since we still can't be sure what a concept is, that doesn't help too much. For the moment, let's just say that words represent something, somehow. They serve to focus your mind on some aspect of reality, or rather, I should say, of the picture of reality you carry about with you in your brain.

The second problem is more specific to the example given, though it affects a surprising number of words in any language. Take the two sentences, "She ate an orange" and "She wore an orange sweatshirt." It should be clear now why "orange" can't just mean an orange; it can mean a color. (Yet an orange doesn't have to be orange to be an orange; unripe oranges are green.) In other words, when you hear the string of sounds that make up the word "orange," you can't tell simply from these if it's the fruit or the color that the speaker intended to evoke. You have to determine what role the word plays in the sentence, whether it stands alone as a noun or modifies some other noun, like "sweatshirt."

Another way of putting this would be to say that words have properties. Properties are things like what word-class a given word belongs to (adjective, noun, verb, and so forth), whether it has to have a complement and if so what kind (for instance, prepositions must have a noun-phrase as complement), whether it has any agreement features ("she" for instance is singular, third-person, and feminine, as well as nominative case) that have to match up with others in the utterance, and so on. The brain's representation of a word has to include all of these things somehow, as well as more obvious things like meaning. To date I don't think we have much idea about precisely how this is done. It looks as if where a word is stored in the brain might be determined by word properties, but for most features we don't even have that much of a clue. However, it's clear the brain must represent words somehow, or we couldn't talk. It's reasonable to suppose that both "oranges" have separate representations even if the sound-pattern that they share is the same.

So let's consider for a moment just "orange" as a noun. When you hear the word "orange," this may suggest to you just some vague picture ("kind of fruit") or it may evoke the taste of an orange, or its color (ripe or unripe), or its smell, or the texture of its skin, or, if you happen to be a fruit-grower here in Italy, probably also the soft thud that an overripe orange makes when it falls and hits the ground, as well as probably lots of other things that might seem obvious to Italian fruit-growers but lie wholly outside the knowledge of you and me.

Now let's get rid of a pseudo-problem that worries lots of people. How can words work if they evoke different things in different people? How can people ever understand one another? Well, in most cases at least, however little a word evokes in my mind, that little will be a subset, however weird or limited, of the set of things that the same word evokes in the minds of those who are expert in the relevant field. If it isn't (if the word "orange" evoked in me some of the properties of bananas), we're in real trouble. But this seldom happens, and if it does, we conclude there's something wrong in the brain of the person concerned.

Back then to "orange" and the things it may evoke. These things are basically sensory impressions, but not of one particular object on one particular occasion. Rather they are generalized impressions derived from various occasions of seeing oranges, tasting them, and so on. If such impressions were not accurately filed in the brain somehow, we might see an orange on one occasion, note and observe it, see another, note and observe that, and fail to realize that they belonged to the same category.

Each of these sensory impressions will be linked to a particular sense. However, if one receives a combination of sense impressions with any degree of frequency, or with less frequency but in a life-threatening manner (the sound of a charging lion's roar, for example, coupled with the sight of the animal getting rapidly larger), it's hard to see how any member of the combination could subsequently occur without potentially triggering the others. In other words, in addition to representations in terms of a single sense, you get representations in terms of several sensory modalities: cross-modal representations.

The importance of this, for our purposes, is that not so long ago some believed that the reason animals didn't have language was that they didn't have cross-modal representations. Obviously, if you have words you have to have cross-modal representations. The word "lion" wouldn't be of much use to you if it evoked only the smell of a lion and not its appearance, or only its appearance and not the sound it made. Not that it has to evoke all of these at once; simply that it has to be able to, if needed, if you are going to get all the mileage out of words that you would like to get.

WHC: No problem with multimodality representations, Derek. Many of the neurons of association cortex, and some of the ones in the primary sensory cortices, respond to several major modalities of sensory input. For example, neurons in somatosensory cortex may also respond to light. But there's some point in emphasizing the difficulty of multimodality linkups, because of the problem of doing them on-the-fly, when dealing with some combination that you've never encountered before (and so couldn't have established any specialist connections for the combination). Language tasks are full of novel combinations.

While there aren't objects in the brain, like those in the compartments of the left-luggage office, there are ensembles of neurons that effectively represent objects, analogies, and the other bricolage of our mental life. Yes, a person is only a collection of molecules, but their pattern of organization is everything; it's the well-functioning organization which is the difference between a living person and a cadaver. My mental representation of "apple" is only a collection of neurons, all of which are also used for other purposes on occasion. Still, they form an organization that functions pretty well for recognizing apples, eating apples, pronouncing "apple," and so forth.

It's hard to talk about representations in the brain, how we memorize something and later make use of it, because of the lack of ready analogies in the technological world. Our memory isn't much like a computer's memory (though it does have some functional equivalents of keyboard buffers, short-term RAM, and long-term hard drives). Ours doesn't have any empty slots because it's a distributed, overlapped type of storage where the new stuff has to fit in, amidst the redundant resonances for a lot of old stuff. To appreciate how memory works requires talking about nearly a dozen different levels of organization (most areas of science only have to deal with several). All those levels (molecules and their receptors and channels, membranes, synapses, neurons, minicolumns, macrocolumns, areas, and larger brain regions) exhibit self-organization and emergent properties; they are all involved in any explanation.

For now, suffice it to say that concepts with fuzzy edges are what you'd expect from the sensory neurophysiology. This will not make lawyers happy, nor others that like to dissect issues into smaller and smaller well-delineated fragments, but nature seems to like fuzzy edges, at least at the cellular level of organization. Precision is accomplished with large committees redundantly trying to do the same task; precision is often an emergent property of enough imprecise neurons. I suspect that there's a strong link between the neural process that makes syntax possible and that which makes our speculative, beyond-the-animals consciousness possible: namely, that they are both founded on the Darwinian cloning competitions of cerebral cortex. More later.

Some people use "thought" to mean "mental image," but most mental images are pretty abstract, which is what makes cartoon sketches so successful. I'd use "thought" in a broader sense, allowing relationships such as analogies. Relationships are far more abstract than objects themselves, and there are often layer upon layer of abstractions in our metaphors, undoubtedly aided by syntax's structuring. Thoughts also follow themes, such as searching for cause and effect: I often approach various problems with what, overall, might be considered a Darwinian template, looking for signs of a spread of variants, some of which survive and reproduce better than others.

Note that to represent a word, a cross-modal representation must also have at least two other characteristics. It must not be an association that is automatically triggered whenever what it refers to appears; or rather, if it is triggered, it must be possible to inhibit it from triggering its spoken representation, otherwise every time we saw a dog we would be obliged to say "dog." And the association must not trigger an automatic response, or be limited to a single kind of response. Whenever someone says, "Pass the salt," we don't want to have to choose merely between passing the salt and not doing anything. We might want to throw the salt at them, if it was the last in a long chain of similar requests, or we might want to say, "Get it yourself," or make any other of a potentially infinite number of responses. In other words, coming or going, words have to be decoupled from the world of action in ways that animal calls are not. For instance, vervet monkeys can, on sighting a martial eagle, either give the martial eagle warning or keep quiet, but if the warning is given, the vervets seem to have no other choice than to do nothing or to run up a tree. Maybe they could do other things, but the evidence makes it look as if the action of running up a tree is preferentially linked to that kind of warning. Words cannot have this property if they are to work as words should. True, a word in a particular context might be so linked (if someone shouts "Fire!" in a crowded theater, we are more likely than anything to head for the door), but if we ran out of the room every time the word "fire" occurred in casual conversation, we would be regarded as weird indeed.

The representation of a word has to be hooked up with things other than preferential responses. It has to link with all the different sensory representations of whatever it refers to. It has to link with memory in such a way that any relevant remembered item can trigger it. It has to be linked, potentially, with representations of other words, so that longer utterances can be formed. It has to be linked preferentially with whatever sounds give it a phonetic realization. But it mustn't be linked with particular responses, or indeed any responses. Words may seem on occasion to precipitate action, but in fact are only part of the evidence on which choices of action are made. If someone tells us to go, we may or may not go; if we do, a whole set of other considerations will have conspired to drive us. That's certainly one of the most crucial differences between words and animal calls.

WHC: Most of the animal calls are analogous to our exclamations; they're usually emotion-laden utterances. Chimps in the wild have about three dozen characteristic vocalizations, all in this category; some easily translate into "Whoopee!" or "Weird!" or "Get away." They have some signals, such as maintaining eye contact (between gorillas, this is a threat; between bonobos, a sexual invitation). Carrying a stick or waving leaves may be used to invite playful romps. There are many expressive body postures and movements, some of which carry directional information, as when a chimpanzee drags a branch down a path that he wants others to follow ("Let's go this way!") or swings it behind the stragglers to herd them.

Some vocalizations may be repeated to intensify the meaning, but otherwise combinations of calls and cries have no additional meaning, in the manner of combinations of our elementary vocalizations, the phonemes. Indeed, one of the evolutionary puzzles is how our ancestors made the transition from a few dozen vocalizations, each with an assigned meaning, to our present system of meaningless phonemes (about forty in English) that have meaning only in combination with each other. Even novel combinations (never-before-seen words like "bumbleberrism") can be easily handled on the first pass.

One-word (or one-stock-phrase) utterances are often the only things that aphasics can speak after their left lateral language areas have been damaged in a stroke. Total mutism usually requires damage to the supplemental motor area, just above the corpus callosum in the brain's midline, an area implicated in monkey vocalizations. So we may want to think of standard exclamations, and most primate calls, as involving an older, more primitive system, located far away from those left lateral brain areas that seem to be important in our kind of syntactic language.

The primitive exclamation speech area may not even be the cortical system where the first words (meaningful units that are recombinable for additional meaning) were invented; cortical areas near the Sylvian fissure seem more likely to have housed the first words. It makes you think in terms of a second language system, operating in parallel with an older one, and not necessarily an intensification of the first system. The second system could have its origins in something like face recognition and social relationships, rather than producing vocalizations.

Right. The amazing thing is that some people still believe that language must have developed out of some kind of hominid call system. In that case it would be strange indeed that the hominid call system (screams, crying, laughter, finger-pointing, fist-shaking and the like) still continues to exist alongside language.

Moreover, since this section concentrates on the unit word to the exclusion of all larger units, we're not saying much here about a still more sharply distinguishing feature of words. What would be the use of a language that was limited to single-word utterances? Words must be potentially able to combine with one another, at least in the minimal subject-predicate mode: you use the first word to focus the hearer's attention on a class or class-member and the second to make some kind of comment on that class or class-member (dogs bark, John left . . .). You can't do this with calls, because each simply triggers readiness for a certain behavior, and each requires a different behavior. There's no way two calls can ever be linked with one another in the way that words can, so that the second call would say something about the first.

But a question we maybe should ask at this stage is whether the representations of words are simply cross-modal sites, places where the different sensory impressions can come together (something like what I think Damasio means by "convergence zones"), or whether they require more abstract representations as well. The more abstract representation would serve as a further buffer between sensory input and motor output. I think in order to answer this we would probably need to know more about how both human and other primate brains work. Other primates can have cross-modal associations, but an association isn't a representation, per se. Maybe you can't get from cross-modal association to cross-modal representation without having a word or sign, some kind of representation of a symbolic object, to focus and fix cross-modal representations. If that's so, then you don't need a more abstract representation; the cross-modal representation is abstract enough.

But these are questions that lie squarely within your territory, Bill, and I'd rather hear what you have to say about them.

WHC: The visual attributes of an apple are likely to reside near visual cortex, its auditory template is likely to be near auditory cortex, and the vocalization motor program needed to pronounce "apple" is likely to be in the rear of the frontal lobe. (That's the tentative conclusion from studying strokes, as when the color of an apple can be lost without the patient losing its characteristic shape or taste.) So the full-fledged concept of an apple is not stored in some particular location; it's more like a distributed database where a multifaceted report can be pulled together when needed.

There are some major improvements that the human brain may have made, concerning how fast and flexibly the multimodality linkups can be made. Let me save this issue until I've explained something about cortical circuitry.

Okay, we'll get back to it. But before we leave words and get on to sentences I'd like to comment on a recent suggestion that the rubicon between our species and others falls at the symbolic rather than the syntactic level. In other words, it's words, not sentences, that dramatically distinguish our species from others. Anyone who makes this kind of claim has to explain how it is that Sherman, Kanzi, and other trained apes have acquired symbolic representation to the (quite considerable) extent that they have. True, this level was attained only under human instruction, but because there are so many things we absolutely can't teach apes to do at all, we may reasonably conclude that no animal can learn things that fall outside its biological capacity, even if most animals can learn some things that their species doesn't usually do. So there remains the possibility that evolution will enlarge the behavioral envelopes of other species, and that any of a number of advanced animals might, millions of years hence, spontaneously acquire symbolic representations, just as human ancestors once did. The fact of our current uniqueness by no means entails that we shall always be unique.

In fact, as was apparent nearly two decades ago, the real rubicon, unpalatable though this may be to the philosophically minded, is syntax, not symbols.

So what is a word, finally? A word is the combination of a mental representation of something, which may or may not exist in the real world, with a mental representation of a set of symbols (phonetic, orthographic, manual). What you utter are not words, but only the phonological representations of words. What you write are not words, only the orthographic representations of words. What you sign, if you know one of the sign languages of the deaf, are not words but only signed representations of words. It's a convenient shorthand to speak of "the words I spoke," or "the words you wrote," one that in practice we would find it impossible to do without. But, in fact, words are much more abstract than that.

If all you did was to link these representations, all you would have would be a language of isolated words: Bread. Life. Oak tree. Silence. There would be meaning, but not a lot of it. To get anywhere serious, words have to be put together.


WHC: The other problem that symbols have faced is that the categories to which they refer are rather fuzzy. Endless tales from animal behavior illustrate that categories may not be any more precise than they need to be (indeed, they're sometimes so crude that major mistakes can be made, as when some birds attack their own offspring that have strayed beyond the guano ring and attempt to return to their nest). Categories can be pretty ad hoc, often formed around a prototype of the class (the robin is a prototype bird; the penguin is an outlier, something that you can argue about).

Categories of one, such as proper names, are easy for us, but that's because our brains have some specializations for them in the front end of the temporal lobes, just in front of where the specializations for facial recognition are located. While social species need to remember individuals for dominance and reciprocal altruism reasons, human group size is much larger than in the other great ape species.

Which reminds me, Derek, even in a protolanguage sentence, the words come with some intrinsic information about possible roles. That's because of where nouns and verbs tend to be located in the brain. The temporal lobe is quite specialized for concepts (more later) used as nouns and adjectives, while the frontal lobe is probably the natural home for verbs and the relative orientation words such as "left," "before," "above," and so forth. And that's probably true for our pre-protolanguage ancestors as well: in all mammals, the frontal lobe is used to move and prepare for movements, so it isn't surprising to find verbs there, at least verbs for when you're the actor. But were you to stick your head into a brain scanner and try to find verbs that would go with a noun spoken to you (I say "bike," you reply, "Ride?"), much of the area above your left temple would probably light up (meaning that the inferior frontal lobe was requesting more blood flow because it was working harder).

Try to put together the simplest noun and verb for the first time, and you're probably invoking a long-distance circuit in the brain, a linkup between temporal and frontal lobes. Though, while looking down at the exposed brain surface during neurosurgery, they might seem within a few centimeters of one another, the route between them is actually more like the quickest land route between Spain and Morocco (via Israel!). Frontal and temporal lobes are connected by a very long loop through a white matter bundle called the arcuate fasciculus that detours around the huge infolding known as the insula. Just think of the temporal lobe as North Africa.

But because of this primitive, pre-protolanguage tagging of nouns and verbs by lobe of origin, you're not likely to mistake "William" for a verb, much as I've always aspired to write a memoir entitled My Life As an Active Verb. It does show you an aspect of language that segregates, though it's nothing at all like the performance-competence segregation that some expected from brain mapping.

