Language Origins Society talk by William H. Calvin (July 1996)


William H. Calvin
UNIVERSITY OF WASHINGTON
SEATTLE, WASHINGTON 98195-1800 USA
http://weber.u.washington.edu/~wcalvin/LOS96.html

Corticocortical Coherence and Universal Grammar

Whenever the literary German dives into a sentence, that is the last you are going to see of him till he emerges on the other side of the Atlantic with his verb in his mouth.

Mark Twain

Outline of talk

Big step up from protolanguage to UG
Corticocortical connections, gross and microscopic
What's a coherent pathway? Example from Picasso
Corticocortical circuitry for coherent choruses
A backpath for resolving ambiguity (syntax's audit trail)
The resulting physiological view of deep, surface structures
Evolving UG and intelligence since Homo erectus days

Protolanguage
     I am going to suggest a neurophysiological candidate for the considerable step up from the simplest forms of language, known as protolanguage, to our full-fledged syntactic language with its nested embedding (I think I saw him leave to go home).
     Protolanguage is what is produced by children under two, some of the mentally retarded and agrammatic aphasics, the most accomplished apes, speakers of pidgins — and by American professors trying to communicate with Hungarian shopkeepers. They can, of course, often do better at comprehension than production.
     Linguistics researchers such as Derek Bickerton have come to doubt that intermediate forms exist, at least on the production side. Protolanguage has little structure, relying mostly on simple contextual associations between a few words to convey the message. And associations have their limitations.

Associations and their limitations

Associations for the tall blond man with one black shoe permuting into a blond black man with one tall shoe.
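The permutation problem can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that an association-only message is an unordered bag of words; the function name `associations` is hypothetical. Under that assumption the two phrases are indistinguishable, because nothing records which adjective goes with which noun.

```python
from collections import Counter

def associations(phrase):
    """Reduce a phrase to an unordered bag of words (illustrative assumption)."""
    return Counter(phrase.lower().split())

intended = "tall blond man with one black shoe"
scrambled = "blond black man with one tall shoe"

# Same words, different bindings: the two bags compare equal.
same_bag = associations(intended) == associations(scrambled)
print(same_bag)   # True
```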

Adjacency is associations in sequence. Prepositions always refer to the first noun that follows. But with verbs, sometimes functionally related words become widely separated, as in Mark Twain’s famous complaint about long-range dependencies in German word order:

Whenever the literary German dives into a sentence, that is the last you are going to see of him till he emerges on the other side of the Atlantic with his verb in his mouth. [A Connecticut Yankee in King Arthur’s Court]

But binding of pronouns to their referents is an even longer-range linkage — back into preceding sentences — and it’s no strain at all.

Such binding requires longer-than-local links; recursive embedding, moreover, requires structuring a hierarchy of them: I think I saw him leave to go home. Such four-or-more-sentences-in-one nesting is considered essential for Universal Grammar. And, if Bickerton is right, there aren’t intermediate forms such as nesting one- or two-deep. There’s a big jump in abilities to explain mechanistically.

When I think of diagraming a sentence with my neurophysiologist’s imagination, I think of those lines as actual fiber-optic bundles looping from one cortical area to another, just as the Lincoln and Holland tunnels collect things in Manhattan that then resurface over in Jersey. So let’s take a look at the interconnection conduits of cerebral cortex.

Structures for nesting

You’ve all seen pictures of the corpus callosum, the best-known corticocortical pathway. Here are some of the pathways within the left hemisphere, which are far more important for language. Arcuate fasciculus. U-fibers and longer.

Nonadjacent areas of cerebral cortex are likely to be involved in many attempted associations, given what we know about comb’s visual connotations being stored near visual cortex, its auditory aspects near auditory areas, and so forth.

UG requires a lot of linking and particularly treelike structures, leading one to imagine cortical processing involving a number of different areas, talking with one another via those U-shaped corticocortical pathways.

Fiber optic bundles come in two kinds, coherent and incoherent. Corticocorticals are surely incoherent, from everything we know about neuroanatomy. Call this the Picasso Problem, so let me use a portrait of Pablo himself to illustrate; just remember that by the time it gets through six synapses to cortex, it isn’t going to look like what was imaged on the retina. It is probably as arbitrary as a bar code on a product package, though I’ll use familiar images as stand-ins for what’s probably a characteristic firing pattern involving a few thousand neurons.

In a coherent fiber optic bundle, like the one in the endoscope used to inspect your ulcers, there is point-to-point mapping. A pixel here illuminates the corresponding pixel there. But in corticocorticals (CCs), each axon fans out over a mm or so at the destination, equivalent to optical blur (blurred portrait).

The longer fiber optic bundles, such as the ethernet ones, become jumbled — like crossing your fingers rather than keeping them all parallel. For imaging uses, incoherents are discarded but long-haul uses can tolerate incoherents. Unfortunately, from all we know about CCs, they’re likely to be jumbled too.

[unveil Picasso eye in forehead, keep bottom covered]

Now you see why I picked a Picasso: I thought jumbling his 1907 self-portrait was somehow fitting. One can recognize Pablo from the blurred version, with enough practice. CCs surely do this all the time; they can surely tune up to the jumbled version as well. They can (neural network theory tells us) tune up to any arbitrary pattern, given enough experience.

[unveil last row, double jumble]

The pattern is distorted again (blur and jumble) when echoed back from B to A, but patch A can also presumably learn arbitrary patterns and so eventually come to recognize the doubly blurred, doubly jumbled pattern of Pablo. Then it has to create an equivalence to the original, but it can presumably do that too, with a little more practice.
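A toy version of blur and jumble, as a hedged Python sketch rather than a model of real anatomy. It assumes a one-dimensional pattern, treats jumble as a fixed random permutation of positions and blur as averaging each position with its two neighbours; all names (`make_jumble`, `jumble`, `blur`, `incoherent_send`) are hypothetical.

```python
import random

def make_jumble(n, seed):
    """A fixed wiring scramble: output position i reads input position perm[i]."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm

def jumble(pattern, perm):
    """Rearrange the pattern according to the fixed wiring."""
    return [pattern[j] for j in perm]

def blur(pattern, radius=1):
    """Each output averages an input with its neighbours (wrapping at the ends)."""
    n = len(pattern)
    width = 2 * radius + 1
    return [
        sum(pattern[(i + k) % n] for k in range(-radius, radius + 1)) / width
        for i in range(n)
    ]

def incoherent_send(pattern, perm):
    """One pass through an incoherent bundle: jumble, then blur."""
    return blur(jumble(pattern, perm))

pattern_a = [1, 0, 0, 1, 1, 0, 1, 0, 0, 0, 1, 0]
a_to_b = make_jumble(len(pattern_a), seed=1)
b_to_a = make_jumble(len(pattern_a), seed=2)

at_b = incoherent_send(pattern_a, a_to_b)    # blurred and jumbled once
back_at_a = incoherent_send(at_b, b_to_a)    # doubly blurred, doubly jumbled

coherent_round_trip = list(pattern_a)        # point-to-point mapping changes nothing
print(back_at_a == pattern_a, coherent_round_trip == pattern_a)   # False True
```

The echoed pattern still carries the same total activity, but Area A would have to learn it as a new arbitrary pattern before treating it as equivalent to the original.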

But what if Area B passes things along to Area C, and eventually back to Area A? Probably a lot like that parlor game that demonstrates the spread of a rumor, the successive distortions by the time it’s been whispered from one person to another around a circle. Coherent systems, like the email packets we all use, don’t have that problem: if you send an email to me and I send it back with comments, your text doesn’t come back jumbled and blurred.

Six out of seven pairs of cortical areas have reverse projections in the monkey. A degenerate code, where lots of patterns mean the same thing, slows down processing and requires a lot of learning; it’d be much simpler if CC transmission were coherent, with no blur or jumble.

Still, degenerate codes and enough time for learning ought to allow the conveyance of well-practiced special cases, analogous to the mariners’ signal flags - though perhaps only a few at a time, thereby limiting the possible novel associations that could be conveyed between cortical areas. Embedding would be restricted to stock phrases. The limitations of incoherent corticocorticals sound a lot like the limitations of protolanguage.
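The "well-practiced special cases" limitation can be sketched as a lookup table. This is only an analogy, written in Python; the bar-code-like strings and the names `learned_meanings` and `recognize` are invented for illustration.

```python
# A degenerate (many-to-one) code: several distorted arrivals have each been
# learned, one at a time, as standing for the same item.
learned_meanings = {
    "x7#q": "man",
    "q#7x": "man",    # a differently jumbled arrival of the same item
    "k2@z": "shoe",
}

def recognize(arrival):
    """Return the learned meaning, or None for an unpracticed pattern."""
    return learned_meanings.get(arrival)

print(recognize("q#7x"))   # man
print(recognize("z@2k"))   # None
```

Anything not already in the table, such as a novel combination, simply fails to register until it has been practiced.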

Why Coherent Corticocorticals are Handy

The problem isn’t tuning up to an increasingly arbitrary pattern — it’s dealing with novel stuff, things you’ve never communicated before. Where learning won’t suffice, in the short run. And particularly in the case of fancy language — when combinations are novel, like that one black shoe. You need to deal with novel combinations immediately if you are to use language to describe nonroutine situations.

My proposed solution is temporarily converting incoherent anatomy into coherent physiology. My explanation here requires a certain amount of hand waving in the interests of time. Fortunately the full version is available in my book The Cerebral Code.

[Figure: hand clones compete]

One important bit of the conceptual framework from the book: there are various theoretical reasons to suppose that association cortex clones certain firing patterns, e.g., that the movement command spatiotemporal pattern for pointing a finger creates a lot of copies of itself before being sent downstream to the muscles. Alternative courses of action are doing the same thing, so that deciding which movement to make is perhaps a matter of competition for territory. Not just movements but perception and memory recalls are all (in my theory) utilizing cloning of firing patterns.

This copying makes cortex able to run a darwinian process, just as do the immune response and species evolution. Cloning patterns, with the occasional mistake or superposition producing a variant which itself clones, have big implications for how you disambiguate a sentence you’ve heard or read. That is a whole book by itself (THE CEREBRAL CODE); it’s how I stumbled into coherent corticocorticals and UG.

Corticocortical circuitry for coherent transmission

This is not point-to-area mapping like blur, but point to adjacent cells and to a ring, like a flashlight beam. Actually it is more like a bull’s-eye, since the ring repeats at integer multiples. We are starting to see astronomers using such image processing techniques for imaging planets around stars.

Error-correction code via this nice crystallization tendency. What’s copied was the clue to DNA triplets being genetic code; memes are what’s copied at more cultural level. Here, a 0.5 mm hexagon of several hundred minicolumn units is having its activity cloned.

I think that this spatiotemporal pattern is The Cerebral Code for a word of our vocabulary, a face we recall, even an idea such as a sentence we might later speak aloud. I think bird is a tune, robin a different tune, and particular parrots named Polly also have their characteristic melody. Unfortunately, unlike the genetic code, I doubt this one is universal — your code for parrot is probably different from mine, just accidents of upbringing.

Now let’s move this pattern over a CC bundle and reconstitute it in Area B.

[Figure: local neighbor superposition and error correction]

Start with a triangular array of synced points. There is fanout on the end of a long CC axon too and, if it works the same way as locally, we ought to see triangular arrays building up in Area B. But the arriving spatiotemporal pattern (blurred and jumbled, of course) isn’t being copied. Rather, the cells in Area B are detecting the occasional coincidences and ignoring the uncorrelated inputs. It might only take 3 out of 7 inputs to work right.

Even with jumble, the sync requirement means that multiple jumbled inputs must conspire to get 3 to overlap, or else they will simply be ignored. So one can reconstitute the triangular array pattern, and do so for each active array in the original hexagons. We have temporarily converted an incoherent path into a coherent one by using triangular-array redundancy. In The Cerebral Code, I call it a faux fax.
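The coincidence-detection step alone can be sketched in Python. This covers only the thresholding, not the cortical geometry or the way the arrays rebuild themselves. It assumes discrete time slots and one target cell with seven inputs; the spike times and the function name `coincidence_detector` are invented for illustration.

```python
from collections import Counter

def coincidence_detector(spike_times_per_input, threshold=3):
    """Return the time slots in which at least `threshold` inputs fire together."""
    counts = Counter()
    for times in spike_times_per_input:
        counts.update(set(times))    # each input counts once per time slot
    return sorted(t for t, c in counts.items() if c >= threshold)

# Seven inputs converge on one Area B cell (time slots are arbitrary units).
# The first three come from a synchronized triangular array and share slots
# 5 and 12; the other four carry uncorrelated activity.
inputs = [
    [5, 12],
    [5, 12],
    [5, 12, 20],
    [1, 9],
    [3, 20],
    [7, 15],
    [2, 18],
]

fired = coincidence_detector(inputs)
print(fired)   # [5, 12]
```

The stray pairing at slot 20 involves only two inputs, so it falls below the 3-of-7 threshold and is ignored. That is the sense in which uncorrelated arrivals drop out.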

The error-correction mechanism offers the possibility of sending arbitrary spatiotemporal patterns down the corticocortical bundle - and succeeding on the first try, so that one is no longer limited to the spatially and temporally distorted patterns that have been recognized by the target cortex as meaningful special cases. Such corticocortical coherence would mean that novel associations like one black shoe could be easily conveyed.

With coherent corticocortical tunnels connecting areas, you can maintain a chorus above a critical size (they are, presumably, always adapting and thereby falling silent). Back projections using the same code mean that you can have a distributed choir, distant chorus members contributing to keeping its membership above a critical size. It would be like missing choir practice but participating via a conference phone call.

Second, you can superimpose many patterns, not just the two you get at boundaries between competing codes. If S2 is "he left" and is embedded into S1 ("I saw S2"), and S1 is similarly embedded into S0 ("I think S1"), you get an S0 with all six words of "I think I saw him leave."

This faux fax would seem, at first glimpse, to produce an even more ambiguous superposition, of the kind that yielded the blond black man with one tall shoe. But bi-directional corticocortical links allow you to have your cake and eat it too, to disambiguate the morass by interrogating the subsidiary choruses during surface structure readout.
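To make the ambiguity concrete, here is a small Python sketch. It assumes that a chorus can be represented as a dictionary holding its own words plus a link to one embedded subchorus, and it uses the surface forms "him leave" for S2. The names `clause` and `superposition` are hypothetical.

```python
from collections import Counter

def clause(words, embedded=None):
    """A chorus: its own words plus an optional link back to a subchorus."""
    return {"words": words, "embedded": embedded}

def superposition(chorus):
    """Everything sung at the top level, with the links thrown away."""
    bag = Counter(chorus["words"])
    if chorus["embedded"] is not None:
        bag += superposition(chorus["embedded"])
    return bag

s2 = clause(["him", "leave"])
s1 = clause(["I", "saw"], embedded=s2)
s0 = clause(["I", "think"], embedded=s1)

# A different nesting built from exactly the same words.
alt = clause(["I", "saw"], embedded=clause(["I", "think"], embedded=s2))

print(sum(superposition(s0).values()))           # 6
print(superposition(s0) == superposition(alt))   # True
print(s0 == alt)                                 # False
```

The superposition by itself cannot tell the two nestings apart; only the retained links can, which is the job the bi-directional pathway is being asked to do.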

The backpath and resolving ambiguity via an audit trail

Since the same spatiotemporal firing pattern would now be shared by both the source and the target area, the target cortex could send it back with similar error correction and have it automatically recognized in the source cortex, with no need to tune up to a doubly-distorted version and then construct an equivalence to the original spatiotemporal firing pattern.

A backprojected spatiotemporal pattern might not need to be fully featured, nor fully synchronized, to help out with the peripheral site’s chorus. It could be more like that sing-along technique called "lining out" where a single voice prompts the next line in a monotone and the chorus repeats it with melodic elaboration; some singing at a fifth or an octave above the others, some with a delay, and so forth. The backpath could include more code than the subchorus sings, just as choirmasters and folk singers manage to include exhortations as they sing along.

The Heart of the Matter

While choir-practice-via-conference-call might have been how distributed singing got started, it may have a far more important function: structuring complex sentences. Back projections can provide an audit trail that can resolve the ambiguity of excessive superposition. ("Who said X? Sing it again, the whole thing!") That’s the way you get the referent of a pronoun, for example.

WYSIWYG
But, most importantly, with links that can maintain sentence structure, embedding becomes possible: no longer is there a danger that the mental model of the amalgamation WYSIWYG will be scrambled, like that blond black man with one tall shoe. When converting deep structured amalgamation into surface structure, you need to read out each of the voices separately in an appropriate order.

My guess is that you have to do this when associating more than two items. No-man’s-land superpositions are easy; triples require staging somehow (such as by CC overlays). The sentence is many-voiced like a symphony.

[Figure: orchestral voices as NP-PP, symphony-as-sentence]

CCs and UG

The linguistic categories could map to the physiology, in this model, something like this:

The "meaning of the sentence" is an abstract cerebral code (those extensive symphonic superpositions) whose S0 hexagons compete for territory with those that suggest alternative interpretations (a darwinian copying competition: see The Cerebral Code).
Phrase structure is presumably a matter of the coherent corticocortical links to contributing territories, having their own competitions and tendencies to die out if not reinforced by backprojecting codes. Binary tree structure would reflect the active corticocortical links on which back tracking can occur.
Argument structure (as when give requires three nouns that can play the roles of actor, recipient, and object given) could arise at the level of both the subchoruses and the top-level one; multilobed attractors, idiosyncratic to the verb or preposition, might implement it. This "what goes together" is the equivalent of harmony. Instead of just a major scale (7 notes out of 12 that "go together" nicely) and a minor scale (a different 7-note subset), we have a harmonious set of roles for each verb — a set of roles like Agent, Patient or Theme, Goal, Source, Instrument, Beneficiary, Time and Place — that must, may, or can’t go with a particular verb.
Surface structure, needed to actually speak one word after another, is often a matter of unpacking the contributors into a conventional ordering, such as subject-verb-object, conventions about prepositional structure, and so forth. If a NP sent up to S0 has a verb or preposition in it, that’s a sign that interrogation of the S1 subchorus is needed during surface structuring readout. And if S1 has PPs or NPs with embedded verbs or prepositions, you’ll have to prompt a sub-subchorus to sing as well.
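The must/may/can’t idea for argument structure can be sketched as a table of role frames. This is a Python illustration, not a claim about how attractors implement it. The frame for "give" follows the three required roles mentioned above, mapped here (as an assumption) onto Agent, Theme and Goal; the optional roles and the "sleep" frame are invented for contrast, and `ROLE_FRAMES` and `check_roles` are hypothetical names.

```python
# Illustrative role frames: which roles a verb must have and which it may have.
# Any role in neither set is treated as one the verb can't take.
ROLE_FRAMES = {
    "give": {
        "must": {"Agent", "Theme", "Goal"},
        "may": {"Time", "Place", "Instrument", "Beneficiary", "Source"},
    },
    "sleep": {
        "must": {"Agent"},
        "may": {"Time", "Place"},
    },
}

def check_roles(verb, roles):
    """Return (missing required roles, roles the verb can't take)."""
    frame = ROLE_FRAMES[verb]
    supplied = set(roles)
    missing = frame["must"] - supplied
    forbidden = supplied - frame["must"] - frame["may"]
    return sorted(missing), sorted(forbidden)

print(check_roles("give", {"Agent": "she", "Theme": "the book", "Goal": "him"}))
# ([], [])
print(check_roles("give", {"Agent": "she", "Theme": "the book"}))
# (['Goal'], [])
print(check_roles("sleep", {"Agent": "he", "Goal": "him"}))
# ([], ['Goal'])
```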

Rather than a staging buffer, I imagine a selective attention spotlight superimposed by another coherent corticocortical connection on the winning hexagonal mosaic. It would, like the conductor of Benjamin Britten’s The Young Person’s Guide to the Orchestra, enhance the contributing subchoruses one by one and thereby read out the sentence in a form which will aid the recipient, who must guess the underlying binary tree and use it to reconstruct who did what to whom.
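The one-by-one readout resembles a recursive walk over the embedding hierarchy. A minimal Python sketch, assuming each chorus is a list whose items are either words or nested lists standing for subchoruses (the function name `read_out` is hypothetical):

```python
def read_out(chorus):
    """Return the words in speaking order, prompting each subchorus as it is met."""
    words = []
    for item in chorus:
        if isinstance(item, list):     # an embedded clause: interrogate its subchorus
            words.extend(read_out(item))
        else:
            words.append(item)
    return words

s2 = ["him", "leave"]
s1 = ["I", "saw", s2]
s0 = ["I", "think", s1]

print(" ".join(read_out(s0)))   # I think I saw him leave
```

The listener has the inverse problem: given only the flat word string, guess the nesting that produced it.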

Language Origins: Evolving UG and creativity

What might degrade such a nice flexibly-structured system into a protolanguage? Incoherence will do. Were the corticocortical’s error correction not well tuned, as in infants, linkages would be restricted to well-practiced special cases, perhaps only the spatiotemporal firing patterns for a limited number of vocabulary items, perhaps a few schemas and scripts. Were connections fogged by seizures, or if there were an overlay of background noise from some pathology, the triangular array scheme for constructing coherence would fail and the operation would revert back to incoherent CC capabilities, where a cerebral code for an item was no longer the same, here and there.

The back path would also be slow and chancy, and it’s what allows a subchorus to be maintained and permits an audit trail to resolve ambiguities. Structure wouldn’t work any more; embedding would probably be restricted to stock phrases. Relating who did what to whom would take a long time — just as it does in "protolanguage."

Corticocortical coherence is thus one candidate for what converted protolanguage into Language Itself; while I have used my own triangular array example for how to achieve that coherence, other yet-to-be-discovered coherence-enhancing schemes should similarly improve novel associations and nesting. Indeed, the transition from special case to arbitrary code conveyance could have implemented several major innovations of Universal Grammar - recursive embedding and long-range links - in one step.

Given that improved corticocortical coherence is likely to occur in the context of a darwinian process at each end that can bootstrap quality, a substantial improvement in mental abilities for dealing with novelty might be expected to accompany the transition from protolanguage to Universal Grammar.

So we’ve got two candidates for what might have stimulated the infrequently-innovating Homo erectus cultures to evolve into the constantly-changing cultures of Homo sapiens about a quarter-million years ago. Both candidate mechanisms make use of the triangular arrays among the superficial pyramidal neurons of neocortex:

Darwinian copying competitions might have come into frequent use in cortical areas that previously were over-committed to specialization, allowing higher-quality guessing.
Corticocortical coherence might have improved to the point that a common cerebral code for each word or movement became possible, one able to be passed through the embedding hierarchy in a way that associations along the way remained properly linked.

Either would have substantial benefits for language, intelligence, and our plan-ahead abilities that require evolving quality schemes before acting. Either improvement might have freed our ancestors from the rut in which Homo erectus was seemingly stuck.

Suggested reading: The corticocortical coherence chapter of THE CEREBRAL CODE and the syntax chapter of HOW BRAINS THINK (BasicBooks 1996) may be of interest. Language cortex physiology is addressed in various chapters of CONVERSATIONS WITH NEIL'S BRAIN (Addison-Wesley 1994). Universal grammar is nicely addressed in Ray Jackendoff's book PATTERNS IN THE MIND (BasicBooks 1993), and protolanguage is similarly addressed in Derek Bickerton's LANGUAGE AND SPECIES (University of Chicago Press 1990). For more related reading, see the Calvin Bookshelf.