THE CEREBRAL CODE by William H. Calvin (Chapter 1)


A book by
William H. Calvin
UNIVERSITY OF WASHINGTON
SEATTLE, WASHINGTON 98195-1800 USA

THE CEREBRAL CODE
Thinking a Thought in the Mosaics of the Mind
Available from MIT Press
copyright 1996 by William H. Calvin

1
The Representation Problem
and the Copying Solution

Even in the small world of brain science [in the 1860s], two camps were beginning to form. One held that psychological functions such as language or memory could never be traced to a particular region of the brain. If one had to accept, reluctantly, that the brain did produce the mind, it did so as a whole and not as a collection of parts with special functions. The other camp held that, on the contrary, the brain did have specialized parts and those parts generated separate mind functions. The rift between the two camps was not merely indicative of the infancy of brain research; the argument endured for another century and, to a certain extent, is still with us today.

Antonio R. Damasio, 1995

One cell, one memory may not be the way things work, but it seems to be the first way that people think about the problem of locating memories in cells. Even if you aren't familiar with how computers store data, the take-home message of most introductions to the brain is that there are pigeonhole memories -- highly specialized interneurons, the firing of which might constitute an item's memory evocation. On the perceptual side of neurophysiology, we call it the grandmother's face cell (a neuron that may fire only once a year, at Christmas dinner). On the movement side, if a single interneuron (that's an "insider neuron," neither sensory neuron nor motor neuron) effectively triggers a particular response, it gets called a command neuron. In the simplest of arrangements, both would be the same neuron.

Indeed, the Mauthner cells that trigger the escape reflex of the fish are exactly such neurons. If the fish is attacked from one side, the appropriate Mauthner cell fires and a massive tail flip results, carrying the fish away from the nibbles of its predator. Fortunately these cells already had a proper name, so we were spared the nibble-detector tail-flip cell.

But we know better than to generalize these special cases to the whole brain -- it can't be one cell, one concept. Yet the reasoning that follows isn't as easily recalled as those pigeonhole memory examples that inadvertently become the take-home message from most introductions to the subject. A singular neuron for each concept is rendered implausible in most vertebrates by the neurophysiological evidence that has accumulated since 1928, when the first recordings from sensory nerves revealed a broad range of sensitivity. There were multiple types, with the sensitivity range of one type overlapping that of other types. This overlap, without pure specialties, had been suspected for a long time, at least by the physiologically inclined. Thomas Young formulated his trichromatic theory of colors in 1801; after Hermann von Helmholtz extended the theory in 1865, it was pretty obvious that each special color must be a particular pattern of response that was achievable in various ways, not a singular entity. More recently, taste has turned out the same way: bitter is just a pattern of strong and weak responses in four types of taste buds, not the action of a particular type.

This isn't to say that a particular interneuron might not come to specialize in some unique combination -- but it's so hard to find narrow specialists, insensitive to all else, that we talk of the expectation of finding one as the "Grandmother's face cell fallacy." The "command neuron" usually comes with scare quotes, too, as the Mauthner cell arrangement isn't a common one. While we seek out the specialized neurons in the hopes of finding a tractable experimental model, we usually recognize that committees are likely the irreducible basis of representations -- certainly the more abstract ones we call schemas.

Because the unit of memory is likely to be closely related to sensory and motor schemas, pigeonhole schemes such as one-cell-one-memory had to be questioned. After Karl Lashley got through with his rat cortical lesions and found no crucial neocortical sites for maze memory traces, we had to suspect that a particular "memory trace" was a widespread patterning of some sort, one with considerable redundancy. You're left trying to imagine how a unit of memory could be spatially distributed in a redundant manner, overlapping with other memories.

One technological analogy is the hologram, but the brain seems unlikely to utilize phase information in the same way. A simpler and more familiar example of an ensemble representation is the pattern of lights on a message board. Individually, each light signifies nothing. Only in combination with other lights is there a meaning. More modern examples are the pixels of a computer screen or dot-matrix printer. Back in the 1940s, the physiological psychologist Donald Hebb postulated such an ensemble (which he called a cell-assembly) as the unit of perception -- and therefore memory. I'll discuss the interesting history of the cell-assembly in the Intermission Notes but for now, just think of one committee-one concept, and that any one cell can serve on multiple committees.

Note that it is not merely the lights which are lit that contain the concept's characteristic pattern: it is just as important that other lights are off, those that might "fog" the desired pattern were they turned on. Fortunately, most neurons of association cortex fire so infrequently that we often take the shortcut of talking only about "activating cells"; in other parts of the nervous system (especially the retina), there are background levels of activity that can be decreased as well as increased (just as gray backgrounds allow the textbook illustrator to use both black and white type) in an analog manner. But, as we shall see, neocortex also has some "digital" aspects.
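The message-board idea can be sketched in a few lines. This is only an illustration under stated assumptions: the two patterns, their names, and the light numbers are invented for the example, and real cortical ensembles are certainly not this tidy. It shows one light serving on two "committees," and a single extra lit light fogging an otherwise correct pattern.

```python
# Hypothetical message board: each pattern is the set of lights that must be ON.
# Every light not listed must be OFF for the pattern to count as present.
PATTERNS = {
    "apple": frozenset({0, 3, 4, 7}),
    "banana": frozenset({3, 5, 7, 10}),  # shares lights 3 and 7 with "apple"
}

def committees_of(light):
    """Return the names of every pattern that a given light takes part in."""
    return sorted(name for name, lit in PATTERNS.items() if light in lit)

def recognize(lit_now):
    """Return the pattern whose ON lights and OFF lights both match, else None."""
    for name, lit in PATTERNS.items():
        if frozenset(lit_now) == lit:
            return name
    return None
```

Here `recognize({0, 3, 4, 7})` finds "apple", while `recognize({0, 3, 4, 7, 9})` finds nothing, because light 9 should have stayed off.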

A minor generalization to Hebb's cell-assembly would be moveable patterns, as when a message board scrolls: the pattern's the thing, irrespective of which cells are used to implement it. I cannot think of any cerebral examples equivalent to the moving patterns of Conway's Game of Life, such as flashers and gliders, but it is well to keep the free-floating patterns of automata in mind.
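For readers who have not met the Game of Life, here is a minimal sketch of a glider, using the standard rules (a cell is born with exactly three live neighbors and survives with two or three). The helper names `step` and `shifted` are my own for this illustration. After four generations the same five-cell shape reappears one cell down and one cell to the right: the pattern persists even though none of the original cells need be the ones implementing it.

```python
from collections import Counter

def step(live):
    """Advance a set of live (row, col) cells by one Game of Life generation."""
    neighbor_counts = Counter(
        (r + dr, c + dc)
        for r, c in live
        for dr in (-1, 0, 1)
        for dc in (-1, 0, 1)
        if (dr, dc) != (0, 0)
    )
    # Alive next turn with exactly 3 neighbors, or with 2 if already alive.
    return {cell for cell, n in neighbor_counts.items()
            if n == 3 or (n == 2 and cell in live)}

def shifted(cells, dr, dc):
    """Return the same shape moved by dr rows and dc columns."""
    return {(r + dr, c + dc) for r, c in cells}

# The classic five-cell glider on an unbounded grid.
GLIDER = {(0, 1), (1, 2), (2, 0), (2, 1), (2, 2)}

after_four = GLIDER
for _ in range(4):
    after_four = step(after_four)
```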

The important augmentation of the message board analogy is a pattern of twinkling lights: the possibility that the relevant memory pattern is a spatiotemporal one, not merely a spatial one. In looking for spatiotemporal patterns and trying to discern components, we are going to have the same problems as the child looking at a large Christmas tree, trying to see the independently flashing strings of lights that have been interwoven.

In the long run, however, a memory pattern cannot be a spatiotemporal one: long-term memories survive all sorts of temporary shutdowns in the brain's electricity, such as coma; they persist despite all sorts of fogging, such as those occurring with concussions and seizures. Hebb's dual trace memory said that there had to be major representational differences between long-term memory and the more current "working memories," which can be a distinctive pattern of neuron firings. As Hebb put it:

If some way can be found of supposing that a reverberatory [memory] trace might cooperate with the structural change, and carry the memory until the growth change is made, we should be able to recognize the theoretical value of the trace which is an activity only, without having to ascribe all memory to it.

    We are familiar with this archival-versus-current, passive-versus-active distinction from phonograph records, where a spatial-only pattern holds the information in cold storage and a spatiotemporal pattern is recreated on occasion, a pattern almost identical to that which originally produced the spatial pattern. A sheet of music or a roll for a player piano also allows a spatial-only pattern to be converted to a spatiotemporal one. I will typically use musical performance as my spatiotemporal pattern analogy and sheet music as my analogy to a spatial-only underpinning.
    At first glimpse, there appear to be some spatial-only sensations, say, those produced by my wristwatch on my skin (it's not really static because I have the usual physiological tremor, and a radial pulse, to interact with its weight). But most of our sensations are more obviously spatiotemporal, as when we finger the corner of the page in preparation for turning it. Even if the input appears static, as when we stare at the center of a checkerboard, some jitter is often introduced, as by the small micronystagmus of the eyeball (as I discuss further in the middle of my Intermission Notes, the nervous system gets a spatiotemporal pattern from the photoreceptors sweeping back and forth under the image). Whether timeless like a drawing of a comb or changing with time as when feeling a comb running through your hair, the active "working" representation is likely to be spatiotemporal, something like the light sequence in a pinball machine or those winking light strings on the Christmas tree.
    Certainly, all of our movements involve spatiotemporal patterns of muscle activation. Even in a static-seeming posture, physiological tremor moves things. In general, the implementation is a spatiotemporal pattern involving many motor neuron pools. Sometimes, as in the case of the fish's tail flip, the command for this starts at one point in time and space, but usually even the initiation of the movement schema is spatiotemporal, smeared out in both time and space.
    The sensation need not funnel down to a point and then expand outwards to recruit the appropriate response sequence; rather, the spatiotemporal pattern of the sensation could create the appropriate spatiotemporal pattern for the response without ever having a locus. Spread out in both time and space, such ephemeral (and perhaps relocatable) ensembles are difficult to summarize in flow charts or metaphors. Think, perhaps, of two voices, one of which (the sensory code) starts the song and is answered by the other voice (movement code); the voices are then intertwined for a while (and the movement eventually gets underway), and then the second voice finishes the song.
    To my mind, the representation problem is which spatiotemporal pattern represents a mental object: surely recalling a memory is not a matter of recreating the firing patterns of every cell in the brain, so that they all mimic the activity at the time of input. Some subset must suffice. How big is it? Is it a synchronized ensemble like a chord, as some cortical theories would have it? Or is it more like a single note melody? Or with some chords mixed in? Does it repeat over and over, or does one repetition suffice for a while?

Those questions were in the air, for the most part, even back in my undergraduate days of the late 1950s, when I first met Hebb after reading his then-decade-old book, The Organization of Behavior. Hebb, amazingly, guessed a solution in 1945, even before the first single neuron recordings from mammalian cerebral cortex (glass microelectrodes weren't invented until 1950). Although our data have grown magnificently in recent decades, we haven't improved much on Hebb's statement of the problem, or on his educated guess about where the solution is likely to be found.
    Multiple microelectrode techniques now allow the sampling of several dozen neurons in a neighborhood spanning a few square millimeters. In motor cortex, even a randomly sampled ensemble can predict which movement, from a standard repertoire, a trained monkey is about to make. For monkeys forced to wait before acting on a behavioral choice, sustained cell firing during the long hold is mostly up in premotor and prefrontal areas. In premotor and prefrontal cortex, some of the spatiotemporal patterns sampled by multiple microelectrodes are surprisingly precise and task-specific. With the fuzzier imaging techniques, we have recently seen some examples of where working memory patterns might be located: for humans trying to remember telephone numbers long enough to dial them, it's the classical Broca and Wernicke language areas that light up.
    Because recall is so much more difficult than mere recognition (you can recognize an old phone number, even when you can't voluntarily recall it), we may need to distinguish between different representations for the same thing. The cryptographers make a similar distinction between a document and a hashed summary of that document (something like a checksum but capable of detecting even transposed letters). Such a 100-byte "message digest" is capable of recognizing a unique, multipage document ("I've seen that one before") but doesn't contain enough information to actually reconstruct it. So, too, we may have to distinguish between simple Hebbian cell-assemblies -- ones that suffice for recognition -- and the more detailed ones needed for abstracts and for complete recall.
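The recognition-versus-recall distinction can be made concrete with a small sketch. The assumptions are mine: SHA-256 from Python's standard `hashlib` stands in for the message digest (it yields 32 bytes rather than the 100 mentioned above), and the sample document and the names `additive_checksum`, `digest`, and `recognize` are invented for the illustration. A naive additive checksum misses a transposed pair of letters; the digest catches it, yet remains far too small to rebuild the document.

```python
import hashlib

def additive_checksum(text):
    """Naive checksum: sum of byte values, blind to the order of letters."""
    return sum(text.encode("utf-8")) % 65536

def digest(text):
    """SHA-256 message digest: 32 bytes no matter how long the document is."""
    return hashlib.sha256(text.encode("utf-8")).digest()

# A hypothetical multipage document and a copy with two letters transposed.
document = "the cat sat on the mat " * 200
transposed = document.replace("cat", "act", 1)

# Only the digest is kept in "memory," not the document itself.
seen_before = {digest(document)}

def recognize(text):
    """Report whether this exact text has been seen, using the digest alone."""
    return digest(text) in seen_before
```

`recognize(document)` succeeds and `recognize(transposed)` fails, but nothing stored in `seen_before` would let you recite the document.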
    Hebb's formulation imposes an important constraint on any possible explanation for the cerebral representation: it's got to explain both spatial-only and spatiotemporal patterns, their interconversions, their redundancy and spatial extent, their imperfect nature (and characteristic errors therefrom), and the links of associative memory (including how distortions of old memories are caused by new links). No present technology provides an analogy to help us think about the problem.

The role of similar constraints on theorizing can be seen in how Kepler's three "laws" about planetary orbits posed the gravity problem that Newton went on to solve. Only a half century ago, molecular genetics had a similar all-important constraint that set the stage for a solution. Biologists knew that, whatever the genetic material was, it had to fit inside the cell, be chemically stable -- and, most significantly, it had to be capable of making very good copies of itself during cell "division." That posed the problem in a solvable way, as it turned out.
    Most people thought that the gene would turn out to be a protein, its three-dimensional nooks and crannies serving as a template for another such giant molecule. The reason Crick and Watson's DNA helical-zipper model caused such excitement in 1953 was that it fit with the copying constraint. It wasn't until a few years later that it became obvious how triplets of the 4-letter DNA code were translated into strings from the 20-letter amino acid alphabet, and so created enzymes and other proteins.
    Looking for molecular copying ability led to the solution of the puzzle of how genes were decoded. Might looking for a neural copying mechanism provide an analogous way of approaching the cerebral code puzzle?

...memes are not strung out along linear chromosomes, and it is not clear that they occupy and compete for discrete 'loci', or that they have identifiable 'alleles' .... The copying process is probably much less precise than in the case of genes.... Memes may partially blend with each other in a way that genes do not.

Richard Dawkins, 1982

Memes are those things that are copied from mind to mind. Richard Dawkins formulated this concept in 1976 in his book, The Selfish Gene. Cell division may copy genes, but minds mimic everything from words to dances. The cultural analog to the gene is the meme (as in mime or mimic); it's the unit of copying. An advertising jingle is a meme. The spread of a rumor is cloning a pattern from one mind to another, the metastasis of a representation.
    Might, however, such cloning be seen inside one brain and not just between brains? Might seeing what was cloned lead us to the representation, the cerebral code? Copying of an ensemble pattern hasn't been observed yet, but there are reasons to expect it in any brain -- at least, in any brain large enough to have a long-distance communications problem.
    If the pattern's the thing, how is it transmitted from the left side of the brain to the right side? Or from front to back? We can't send it like a mail parcel, so consider the problems of telecopying, of making a distant copy of a local pattern. Is there a NeuroFax Principle at work?
    When tracing techniques were crude, at a millimeter level of resolution, it seemed as if there were point-to-point mappings, an orderly topography for the major sensory pathways such that neighbors remained next to one another. One could imagine that those long corticocortical axon bundles were like fiber optic bundles that convey an image by thousands of little light pipes. But with finer resolution, topographic mappings turn out to be only approximately point-to-point; instead, an axon breaks up into clumps of endings. For the corticocortical axon terminations of the "interoffice mail," this fanout spans macrocolumnar dimensions and sometimes many millimeters. Exact point-to-point mapping doesn't occur.
    So, at first glimpse, it appears that corticocortical bundles are considerably worse than those incoherent fiber optic bundles that are factory rejects -- unless, of course, something else is going on. Perhaps it doesn't matter that the local spatiotemporal pattern is severely distorted at the far end; if codes are arbitrary, why should it matter that there are different codes for Apple in different parts of the brain? Just as there are two equally valid roots to a quadratic equation, just as isotopes have identical chemical properties despite different weights, so degenerate codes are quite common. For example, there are six different DNA triplets that all result in leucine being tacked on to a growing peptide.
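The leucine example of a degenerate code can be written out directly. The six triplets below are the standard genetic code's leucine codons (given as DNA coding-strand letters); the rest of the table is deliberately partial, with only methionine and tryptophan added so that a short string can be translated, and the function name `translate` is my own. Treat it as a sketch of degeneracy, not a usable translation tool.

```python
# The six DNA triplets that the standard genetic code assigns to leucine.
LEUCINE_CODONS = {"TTA", "TTG", "CTT", "CTC", "CTA", "CTG"}

# Deliberately partial codon table: leucine plus two single-codon amino acids.
CODON_TABLE = {codon: "Leu" for codon in LEUCINE_CODONS}
CODON_TABLE.update({"ATG": "Met", "TGG": "Trp"})

def translate(dna):
    """Translate a DNA string three letters at a time using the partial table."""
    usable = len(dna) - len(dna) % 3  # ignore any trailing incomplete triplet
    return [CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3)]

# Four letters taken three at a time give 64 triplets for only 20 amino acids,
# so several triplets must share a meaning.
possible_triplets = 4 ** 3
```

`translate("ATGCTTTGG")` and `translate("ATGTTATGG")` give the same peptide: two different codes, one meaning.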
    The main drawback to a degenerate cortical code is that most corticocortical projections are reciprocal: six out of seven interareal pathways have a matching back projection. It might undo the distortion of the forward projection, in the manner of inverse transforms, but that's demanding a lot of careful tuning and regular recalibration. And it isn't simply a matter of each local region having two local codes for Apple, one for sending, the other for receiving. Each region has multiple projection targets and thus many possible feedback codes that mean Apple.
    There might, of course, be some sort of error-correction code that allows a single characteristic spatiotemporal pattern for Apple. It would have to remove any distortions caused by the spatial wanderings, plus those associated with temporal dispersions of corticocortical transmission. It would need, furthermore, to operate in both the forward and return paths. I originally dismissed this possibility, assuming that an error-correcting mechanism was too fancy for cerebral circuitry. But, as will become apparent by the end of the following chapter, such error correction is easier than it sounds, thanks to that fanout of the corticocortical axon's terminals contributing to standardization of a spatiotemporal pattern.

Copying for a faux fax is going to be needed for cerebral cortex, even if simpler nervous systems, without a long-distance problem, can operate without copying. Copying might also be handy for promoting redundancy. But there is a third reason why copying might have proved useful in a fancy brain: darwinism.
    Perhaps it is only a matter of our impoverished knowledge of complex systems, but creativity seems to be a shaping-up process. During the evolution of new species and during the immune response's production of better and better antibodies, successive generations are shaped up, not especially the individual. Yes, the individual is plastic and it learns, but this modification during life is not typically incorporated into the genes that are passed on (learning and experience only change the chances of passing on the genes with which one was born -- the propensity for learning such things, rather than the things themselves). Yes, culture itself passes along imitations, but memes are easily distorted and easily lost, compared to genuine genes.
    Reproduction involves the copying of patterns, sometimes with small chance variations. Creativity may not always be a matter of copying errors and recombination, but it is reasonable to expect that the brain is going to make some use of this elementary darwinian mechanism for editing out the nonsense and emphasizing variations on the better-fitting ones in a next generation.

Natural selection alone isn't sufficient for evolution, and neither is copying alone -- not even copying with selection will suffice. I can identify six essential aspects of the creative darwinian process that bootstraps quality.

    1. There must be a reasonably complex pattern involved.
    2. The pattern must be copied somehow (indeed, that which is copied may serve to define the pattern).
    3. Variant patterns must sometimes be produced by chance.
    4. The pattern and its variant must compete with one another for occupation of a limited work space. For example, bluegrass and crab grass compete for back yards.
    5. The competition is biased by a multifaceted environment, for example, how often the grass is watered, cut, fertilized, and frozen, giving one pattern more of the lawn than another. That's natural selection.
    6. There is a skewed survival to reproductive maturity (environmental selection is mostly juvenile mortality) or a skewed distribution of those adults who successfully mate (sexual selection), so new variants always preferentially occur around the more successful of the current patterns.
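The six essentials can be put together in a toy loop. Everything specific here is an assumption made for illustration: the patterns are bit strings, the "environment" is a target string, fitness is the number of matching bits, and the names `fitness`, `mutate`, and `evolve` are invented. It is a sketch in the spirit of a simple genetic algorithm, not a model of cortex or of any particular species.

```python
import random

def fitness(pattern, environment):
    """Essential 5: the environment biases the competition (count of matching bits)."""
    return sum(p == e for p, e in zip(pattern, environment))

def mutate(pattern, rng, rate=0.05):
    """Essentials 2 and 3: copy the pattern, occasionally flipping a bit by chance."""
    return [1 - bit if rng.random() < rate else bit for bit in pattern]

def evolve(environment, workspace=20, generations=40, seed=0):
    """Run the loop and return the best fitness seen at each generation."""
    rng = random.Random(seed)
    # Essential 1: a reasonably complex pattern (here, a string of bits).
    population = [[rng.randint(0, 1) for _ in environment] for _ in range(workspace)]
    best_history = []
    for _ in range(generations):
        population.sort(key=lambda p: fitness(p, environment), reverse=True)
        best_history.append(fitness(population[0], environment))
        # Essential 4: a limited work space, so only half the patterns keep their place.
        survivors = population[: workspace // 2]
        # Essential 6: new variants arise around the more successful current patterns.
        offspring = [mutate(rng.choice(survivors), rng)
                     for _ in range(workspace - len(survivors))]
        population = survivors + offspring
    return best_history

history = evolve([1] * 32)
```

Because the survivors are carried forward unchanged, the best fitness in `history` never falls from one generation to the next; that is the ratchet.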

With only a few of the six essentials, one gets the more widespread "selective survival" process (which popular usage tends to call darwinian). You may get some changes (evolution, but only in the weakest sense of the word) but things soon settle, running out of steam without the full process to turn the darwinian ratchet.
Indeed, many things called darwinian turn out to have no copying process at all, such as the selective survival of some synaptic connections in the brain during pre- and postnatal development of a single individual. Selective survival, moreover, doesn't even require biology. For example, a shingle beach is one where the waves have carried away the smaller rocks and sand, much as a carving reflects the selective removal of some material to create a pattern. The copying-mutation-selection loop utilized by the small-molecule chemists as they try to demonstrate the power of RNA-based evolution captures most of darwinism, as do "genetic" algorithms of computer science.
Not all of the essentials have to be at the same level of organization. Pattern, copying, and variation involve the genes, but selection is based on the bodies (the phenotypes that carry the genes) and their environment; inheritance, however, is back at the genotype level. In RNA-based evolution, the two levels are combined into one (the RNA serves as a catalyst in a way that affects its survival -- but it is also what is copied).

Because neural versions of the six essentials are going to play such a large role in the rest of this book, let me comment on the better-known versions for a moment.
    The gene is a string of DNA base-pairs that, in turn, instructs the rest of the cell about how to make a protein, perhaps an enzyme that regulates the rate of tissue growth. We'll be looking back from neural implementations, such as movement commands, and trying to see what patterns could have served as the cerebral code to get them going. Larger genetic patterns, such as whole chromosomes, are seldom copied exactly. So, too, we will have to delve below the larger movements to see what the smaller units might be.
    While the biological variations seem random, unguided variation isn't really required for a darwinian process to operate. We tend to emphasize randomness for several reasons. First, randomness is the default assumption against which we test claims of guidance. And second, the process will work fine without guidance, without any foreknowledge of a desired result. That said, it might work faster, and in some restricted sense better, with some hints that bias the general direction of the variants; this need not involve anything as fancy as artificial selection. We will see neural versions of random copying errors and recombination, including (in the last chapter) some discussion about how a slow darwinian process might guide a faster one by biasing the general direction in which its variations are done.
    Competition between variants depends on some limitation in resources (space in association cortex, in my upcoming examples) or carrying capacity. During a wide-open population explosion, competition is minor because the space hasn't filled up yet.
    For competition to be interesting, it must be based on a complex, multifaceted environment. Rather than the environment of grass, we'll be dealing with biases from sensation, feedback from our own movements, and even our moods. Most interestingly, there are both current versions of these environmental factors and memories of past ones.
    Many of the offspring have variations that are "worse" than the successful parent pattern but a minority may possess a variant that is an even better fit to the particular multifaceted environment. This tendency to base most new variations on the more successful of the old ones is what Darwin called the principle of inheritance, his great insight and the founding principle of what became population biology.
    It means that the darwinian process, as a whole loop, isn't truly random. Rather, it involves repeated exploratory steps where small chance variations are done on well-tested-by-the-environment versions. It's an enormously conservative process, because variations propagate from the base of the most successful adults -- not the base of the population as born. Without this proviso, the process doesn't accumulate wisdom about what worked in the past. The neural version also needs exactly the same characteristic, where slight variations are done from an advanced position, not from the original center of the population.

At least five other factors are known to be important to the evolution of species. The creative darwinian process will run without them, but they affect the stability of its outcome, or the rate of evolution, and will be important for my model of cognitive functions. Just like the catalysts and enzymes that speed chemical reactions without being consumed, they may make improbable outcomes into commonplace ones.

    7. Stability may occur, as in getting stuck in a rut (a local peak or basin in the adaptational landscape). Variants occur but they backslide easily. Only particularly large variations can ever escape from a rut, but they are few, and even more likely to produce nonsense (phenotypes that fail to develop properly, and so die young).
    8. Systematic recombination generates many more variants than do copying errors and the far-rarer cosmic-ray mutations. Recombination usually occurs once during meiosis (the grandparent chromosomes are shuffled as haploid sperm and ova are made) and again at fertilization (as the haploid parent genomes are combined into a diploid genome once again). Sex, in the sense of gamete dimorphism (going to the extremes of expensive ova and cheap sperm), was invented several billion years ago and greatly accelerated species evolution over the rate promoted by errors, bacterial conjugation, and retroviruses.
    9. Fluctuating environments (seasons, climate changes, diseases) change the name of the game, shaping up more complex patterns capable of doing well in several environments. For such jack-of-all-trades selection to occur, the environment must change much faster than efficiency adaptations can track it, or "lean mean machine" specialists will dominate the expensive generalists.
    10. Parcellation, as when rising sea level converts the hilltops of one large island into an archipelago of small islands, typically speeds evolution. This is, in part, because more individuals then live on the margins of the habitat where selection pressure is greater. Also, there is no large central population to buffer change. When rising sea level converted part of the coastline of France into the island of Jersey, the red deer trapped there in the last interglaciation underwent a considerable dwarfing within only a few thousand years.
    11. Local extinctions, as when an island population becomes too small to sustain itself, speed evolution because they create empty niches. When subsequent pioneers rediscover the unused resources, their descendants go through a series of generations where there is enough food -- even for the more extreme variations that arise, the ones that would ordinarily lose out in the competition with the more optimally endowed, such as the survivors of a resident population. When the environment again changes, some of those more extreme variants may be able to cope better with the third environment than the narrower range of variants that would reach reproductive age under the regime of a long-occupied niche.
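Getting stuck in a rut (item 7) is easy to demonstrate with a toy landscape. The numbers below are invented, and the greedy climber is a deliberate simplification: it ignores the point that large variations are rare and usually produce nonsense, and shows only that small steps cannot leave a local peak while a sufficiently large jump can.

```python
# Hypothetical one-dimensional adaptational landscape:
# index = variant, value = how well that variant fits the environment.
LANDSCAPE = [0, 1, 2, 3, 2, 1, 0, 1, 3, 5, 7, 6]

def climb(start, max_jump):
    """Greedy climbing: move to the best variant within max_jump, stop when none is better."""
    position = start
    while True:
        lo = max(0, position - max_jump)
        hi = min(len(LANDSCAPE) - 1, position + max_jump)
        best = max(range(lo, hi + 1), key=lambda i: LANDSCAPE[i])
        if LANDSCAPE[best] <= LANDSCAPE[position]:
            return position  # no reachable variant is an improvement
        position = best
```

With `max_jump=1` the climber starting at 0 halts on the local peak at index 3; with `max_jump=6` it reaches the higher peak at index 10.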
Sexual selection also has the reputation of speeding evolution, and there are "catalysts" acting at several removes, as in Darwin's example of what introducing cats to an English village would do to enhance the bee-dependent flowers, via reducing the rodent populations that disrupt bee hives.
    An example of how these catalysts work together is island biogeography, as in the differentiation of Darwin's finches unbuffered by large continental gene pools. Archipelagos allow for many parallel evolutionary experiments. Episodes that recombine the islands (as when sea level falls during an ice age) create winner-take-most tournaments. Most evolutionary change may occur in such isolation, in remote valleys or offshore islands, with major continental populations serving as slowly changing reservoirs that provide pioneers to the chancy periphery.

Although the creative darwinian process will run without these catalysts, using darwinian creativity in a behavioral setting requires some optimization for speed, so that quality is achieved within the time span of thought and action. Accelerating factors are the problem in what the French call avoir l'esprit de l'escalier -- finally thinking of a witty reply, but only after leaving the party. I will not be surprised if some accelerating factors are almost essential in mental darwinism, simply because of the time windows created by fleeting opportunities.