There are several reasons, however, why a skill or an area of knowledge that has been relearned using a new area of the neocortex to replace one that has been damaged will not necessarily be as good as the original. First, because it took an entire lifetime to learn and perfect a given skill, relearning it in another area of the neocortex will not immediately generate the same results. More important, that new area of the neocortex has not just been sitting around waiting as a standby for an injured region. It too has been carrying out vital functions, and will therefore be hesitant to give up its neocortical patterns to compensate for the damaged region. It can start by releasing some of the redundant copies of its patterns, but doing so will subtly degrade its existing skills and does not free up as much cortical space as the skills being relearned had used originally.
There is a third reason why plasticity has its limits. Since in most people particular types of patterns will flow through specific regions (such as faces being processed by the fusiform gyrus), these regions have become optimized (by biological evolution) for those types of patterns. As I report in chapter 7, we found the same result in our digital neocortical developments. We could recognize speech with our character recognition systems and vice versa, but the speech systems were optimized for speech and similarly the character recognition systems were optimized for printed characters, so there would be some reduction in performance if we substituted one for the other. We actually used evolutionary (genetic) algorithms to accomplish this optimization, a simulation of what biology does naturally. Given that faces have been flowing through the fusiform gyrus for most people for hundreds of thousands of years (or more), biological evolution has had time to evolve a favorable ability to process such patterns in that region. It uses the same basic algorithm, but it is oriented toward faces. As Dutch neuroscientist Randal Koene wrote, “The [neo]cortex is very uniform, each column or minicolumn can in principle do what each other one can do.”13
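For readers who think in code, here is a minimal sketch of the kind of evolutionary (genetic) algorithm described above. Everything in it is illustrative rather than a description of our actual systems: the "genome" is a hypothetical list of recognizer parameters, and the fitness function is a stand-in for recognition accuracy on speech or printed characters.

```python
import random

# Sketch of a genetic algorithm optimizing hypothetical recognizer parameters.
# The fitness function is a toy stand-in for "accuracy": closeness to an
# assumed ideal parameter setting.

def fitness(genome, target):
    return -sum((g - t) ** 2 for g, t in zip(genome, target))

def evolve(target, genome_len=8, pop_size=50, generations=200, mutation_rate=0.1):
    population = [[random.uniform(0, 1) for _ in range(genome_len)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda g: fitness(g, target), reverse=True)
        survivors = population[: pop_size // 2]          # selection
        children = []
        while len(survivors) + len(children) < pop_size:
            a, b = random.sample(survivors, 2)
            cut = random.randrange(1, genome_len)        # crossover
            child = a[:cut] + b[cut:]
            for i in range(genome_len):                  # mutation
                if random.random() < mutation_rate:
                    child[i] = random.uniform(0, 1)
            children.append(child)
        population = survivors + children
    return max(population, key=lambda g: fitness(g, target))

# Example: evolve parameters toward a hypothetical "speech-optimized" setting.
best = evolve(target=[0.2, 0.9, 0.5, 0.7, 0.1, 0.4, 0.8, 0.3])
```

The point of the sketch is only the shape of the process: selection, crossover, and mutation over many generations, which is what both our engineered systems and biological evolution rely on to tune a region for a particular class of patterns.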
Substantial recent research supports the observation that the pattern recognition modules wire themselves based on the patterns to which they are exposed. For example, neuroscientist Yi Zuo and her colleagues watched as new “dendritic spines” formed connections between nerve cells as mice learned a new skill (reaching through a slot to grab a seed).14 Researchers at the Salk Institute have discovered that this critical self-wiring of the neocortex modules is apparently controlled by only a handful of genes. These genes and this method of self-wiring are also uniform across the neocortex.15
Many other studies document these attributes of the neocortex, but let’s summarize what we can observe from the neuroscience literature and from our own thought experiments. The basic unit of the neocortex is a module of neurons, which I estimate contains on the order of a hundred neurons. These are woven together into each neocortical column so that each module is not visibly distinct. The pattern of connections and synaptic strengths within each module is relatively stable. It is the connections and synaptic strengths between modules that represent learning.
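The division of labor can be expressed in a few lines of code. This is a sketch with hypothetical names, not an anatomical model: the wiring inside a module is treated as fixed, while the weighted links between modules are what change with learning.

```python
from dataclasses import dataclass, field

@dataclass
class Module:
    module_id: int
    neuron_count: int = 100                 # roughly a hundred neurons per module
    # Learned, mutable connections to other modules: target id -> strength.
    connections: dict = field(default_factory=dict)

    def learn_connection(self, target_id, strength):
        # Learning alters connections and synaptic strengths *between* modules,
        # not the relatively stable wiring inside a module.
        self.connections[target_id] = strength

m1, m2 = Module(1), Module(2)
m1.learn_connection(m2.module_id, strength=0.8)
```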
There are on the order of a quadrillion (10^15) connections in the neocortex, yet only about 25 million bytes of design information in the genome (after lossless compression),16 so the connections themselves cannot possibly be predetermined genetically. It is possible that some of this learning is the product of the neocortex’s interrogating the old brain, but that still would necessarily represent only a relatively small amount of information. The connections between modules are created on the whole from experience (nurture rather than nature).
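The arithmetic behind this claim is worth making explicit; the figures below are just the two estimates already cited.

```python
# Back-of-the-envelope arithmetic for the claim above.
connections = 1e15                 # ~a quadrillion neocortical connections
genome_bytes = 25e6                # ~25 million bytes of compressed design info
genome_bits = genome_bytes * 8     # = 2e8 bits

# Roughly five million connections per bit of genetic design information --
# far too many for the genome to specify each connection individually.
connections_per_bit = connections / genome_bits
print(f"{connections_per_bit:.0f} connections per bit")   # ~5,000,000
```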
The brain is not so flexible that each neocortical pattern recognition module can simply link to any other module (as we can easily program in our computers or on the Web)—an actual physical connection must be made, composed of an axon connecting to a dendrite. We each start out with a vast stockpile of possible neural connections. As the Wedeen study shows, these connections are organized in a very repetitive and orderly manner. Terminal connection to these axons-in-waiting takes place based on the patterns that each neocortical pattern recognizer has recognized. Unused connections are ultimately pruned away. These connections are built hierarchically, reflecting the natural hierarchical order of reality. That is the key strength of the neocortex.
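A toy sketch of that wiring sequence, again with purely illustrative names: candidate links (the axons-in-waiting) exist in advance, links that carry recognized patterns are kept and strengthened, and unused ones are eventually pruned.

```python
# Candidate connections that exist before any learning has taken place.
candidate_links = {("A", "B"): 0.0, ("A", "C"): 0.0, ("B", "C"): 0.0}

def reinforce(link, usage):
    # A terminal connection forms and strengthens with use.
    candidate_links[link] += usage

reinforce(("A", "B"), 1.0)
reinforce(("A", "B"), 1.0)
reinforce(("B", "C"), 0.5)

# Pruning: candidate connections that were never used are removed.
wired = {link: w for link, w in candidate_links.items() if w > 0}
```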
The basic algorithm of the neocortical pattern recognition modules is equivalent across the neocortex from “low-level” modules, which deal with the most basic sensory patterns, to “high-level” modules, which recognize the most abstract concepts. The vast evidence of plasticity and the interchangeability of neocortical regions is testament to this important observation. There is some optimization of regions that deal with particular types of patterns, but this is a second-order effect—the fundamental algorithm is universal.
Signals go up and down the conceptual hierarchy. A signal going up means, “I’ve detected a pattern.” A signal going down means, “I’m expecting your pattern to occur,” and is essentially a prediction. Both upward and downward signals can be either excitatory or inhibitory.
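The two message types can be made concrete in a few lines. This is a sketch with hypothetical names rather than a model of the actual neural signaling: an upward "detected" message and a downward "expected" prediction, each of which can excite or inhibit its target.

```python
def signal_up(recognizer_id, excitatory=True):
    return {"from": recognizer_id, "direction": "up",
            "meaning": "I've detected my pattern",
            "effect": +1 if excitatory else -1}

def signal_down(recognizer_id, expected_child, excitatory=True):
    return {"from": recognizer_id, "direction": "down",
            "meaning": f"I'm expecting pattern {expected_child} to occur",
            "effect": +1 if excitatory else -1}

# A higher-level recognizer that has seen part of a word can lower the
# threshold of a lower-level recognizer by sending an excitatory prediction:
msg = signal_down("word:APPLE", expected_child="letter:E")
```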
The elements of each pattern occur in a particular order and are not readily reversed. Even if a pattern appears to have multidimensional aspects, it is represented by a one-dimensional sequence of lower-level patterns. A pattern is an ordered sequence of other patterns, so each recognizer is inherently recursive. There can be many levels of hierarchy.
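That recursion maps directly onto a simple data structure: a pattern is an ordered (one-dimensional) sequence whose elements are themselves patterns. The names below are illustrative only.

```python
class Pattern:
    def __init__(self, name, sequence=None):
        self.name = name
        self.sequence = sequence or []   # ordered list of lower-level Patterns

    def depth(self):
        # The hierarchy can be many levels deep; the recursion mirrors it.
        return 1 + max((p.depth() for p in self.sequence), default=0)

# "APPLE" as an ordered sequence of letter patterns, each of which is in
# turn a sequence of stroke-level patterns (elided here).
apple = Pattern("APPLE", [Pattern(ch) for ch in "APPLE"])
```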
There is a great deal of redundancy in the patterns we learn, especially the important ones. The recognition of patterns (such as common objects and faces) uses the same mechanism as our memories, which are just patterns we have learned. They are also stored as sequences of patterns—they are basically stories. That mechanism is also used for learning and carrying out physical movement in the world. The redundancy of patterns is what enables us to recognize objects, people, and ideas even when they have variations and occur in different contexts.

The size and size variability parameters also allow the neocortex to encode variation in magnitude along different dimensions (duration in the case of sound). One way that these magnitude parameters could be encoded is simply through multiple patterns with different numbers of repeated inputs. So, for example, there could be patterns for the spoken word “steep” with different numbers of the long vowel [E] repeated, each with the importance parameter set to a moderate level indicating that the repetition of [E] is variable. This approach is not mathematically equivalent to having the explicit size parameters and does not work nearly as well in practice, but is one approach to encoding magnitude. The strongest evidence we have for these parameters is that they are needed in our AI systems to get accuracy levels that are near human levels.
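The two encodings can be contrasted in code. The field names here are hypothetical: in the first encoding each input carries explicit importance, size, and size-variability parameters; in the second, magnitude is approximated by storing several variants that differ only in how many times an input is repeated.

```python
from dataclasses import dataclass

@dataclass
class Input:
    pattern: str             # lower-level pattern, e.g. a phoneme
    importance: float        # how essential this input is to recognition
    size: float              # expected magnitude (e.g., duration of a sound)
    size_variability: float  # how much that magnitude can vary

# Explicit-parameter encoding of the spoken word "steep":
steep = [
    Input("s", importance=1.0, size=1.0, size_variability=0.2),
    Input("t", importance=1.0, size=0.5, size_variability=0.2),
    Input("E", importance=1.0, size=2.0, size_variability=1.5),  # long, variable vowel
    Input("p", importance=1.0, size=0.5, size_variability=0.2),
]

# Alternative (less effective) encoding: multiple patterns differing only in
# how many times the long [E] is repeated, each repetition given moderate
# importance to indicate that it is variable.
steep_variants = [["s", "t"] + ["E"] * n + ["p"] for n in range(1, 4)]
```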
The summary above constitutes the conclusions we can draw from the research results I have sampled here, as well as from the thought experiments I discussed earlier. I maintain that the model I have presented is the only possible model that satisfies all of the constraints that the research and our thought experiments have established.
Finally, there is one more piece of corroborating evidence. The techniques that we have evolved over the past several decades in the field of artificial intelligence to recognize and intelligently process real-world phenomena (such as human speech and written language) and to understand natural-language documents turn out to be mathematically similar to the model I have presented above. They are also examples of the PRTM. The AI field was not explicitly trying to copy the brain, but it nonetheless arrived at essentially equivalent techniques.