The question then becomes: How weak can the assumptions be and still allow all relevant knowledge to be derived from finite data? Notice the word relevant: we’re only interested in knowledge about our world, not about worlds that don’t exist. So inventing a universal learner boils down to discovering the deepest regularities in our universe, those that all phenomena share, and then figuring out a computationally efficient way to combine them with data. This requirement of computational efficiency precludes just using the laws of physics as the regularities, as we’ll see. It does not, however, imply that the universal learner has to be as efficient as more specialized ones. As so often happens in computer science, we’re willing to sacrifice efficiency for generality. This also applies to the amount of data required to learn a given target knowledge: a universal learner will generally need more data than a specialized one, but that’s OK provided we have the necessary amount-and the bigger data gets, the more likely this will be the case.
Here, then, is the central hypothesis of this book:
All knowledge-past, present, and future-can be derived from data by a single, universal learning algorithm.
I call this learner the Master Algorithm. If such an algorithm is possible, inventing it would be one of the greatest scientific achievements of all time. In fact, the Master Algorithm is the last thing we’ll ever have to invent because, once we let it loose, it will go on to invent everything else that can be invented. All we need to do is provide it with enough of the right kind of data, and it will discover the corresponding knowledge. Give it a video stream, and it learns to see. Give it a library, and it learns to read. Give it the results of physics experiments, and it discovers the laws of physics. Give it DNA crystallography data, and it discovers the structure of DNA.
This may sound far-fetched: How could one algorithm possibly learn so many different things and such difficult ones? But in fact many lines of evidence point to the existence of a Master Algorithm. Let’s see what they are.
The argument from neuroscience
In April 2000, a team of neuroscientists from MIT reported in Nature the results of an extraordinary experiment. They rewired the brain of a ferret, rerouting the connections from the eyes to the auditory cortex (the part of the brain responsible for processing sounds) and rerouting the connections from the ears to the visual cortex. You’d think the result would be a severely disabled ferret, but no: the auditory cortex learned to see, the visual cortex learned to hear, and the ferret was fine. In normal mammals, the visual cortex contains a map of the retina: neurons connected to nearby regions of the retina are close to each other in the cortex. Instead, the rewired ferrets developed a map of the retina in the auditory cortex. If the visual input is redirected instead to the somatosensory cortex, responsible for touch perception, it too learns to see. Other mammals also have this ability.
In congenitally blind people, the visual cortex can take over other brain functions. In deaf ones, the auditory cortex does the same. Blind people can learn to “see” with their tongues by sending video images from a head-mounted camera to an array of electrodes placed on the tongue, with high voltages corresponding to bright pixels and low voltages to dark ones. Ben Underwood was a blind kid who taught himself to use echolocation to navigate, like bats do. By clicking his tongue and listening to the echoes, he could walk around without bumping into obstacles, ride a skateboard, and even play basketball. All of this is evidence that the brain uses the same learning algorithm throughout, with the areas dedicated to the different senses distinguished only by the different inputs they are connected to (e.g., eyes, ears, nose). In turn, the associative areas acquire their function by being connected to multiple sensory regions, and the “executive” areas acquire theirs by connecting the associative areas and motor output.
Examining the cortex under a microscope leads to the same conclusion. The same wiring pattern is repeated everywhere. The cortex is organized into columns with six distinct layers, feedback loops running to another brain structure called the thalamus, and a recurring pattern of short-range inhibitory connections and longer-range excitatory ones. A certain amount of variation is present, but it looks more like different parameters or settings of the same algorithm than different algorithms. Low-level sensory areas have more noticeable differences, but as the rewiring experiments show, these are not crucial. The cerebellum, the evolutionarily older part of the brain responsible for low-level motor control, has a clearly different and very regular architecture, built out of much smaller neurons, so it would seem that at least motor learning uses a different algorithm. If someone’s cerebellum is injured, however, the cortex takes over its function. Thus it seems that evolution kept the cerebellum around not because it does something the cortex can’t, but just because it’s more efficient.
The computations taking place within the brain’s architecture are also similar throughout. All information in the brain is represented in the same way, via the electrical firing patterns of neurons. The learning mechanism is also the same: memories are formed by strengthening the connections between neurons that fire together, using a biochemical process known as long-term potentiation. All this is not just true of humans: different animals have similar brains. Ours is unusually large, but seems to be built along the same principles as other animals’.
Another line of argument for the unity of the cortex comes from what might be called the poverty of the genome. The number of connections in your brain is over a million times the number of letters in your genome, so it’s not physically possible for the genome to specify in detail how the brain is wired.
The most important argument for the brain being the Master Algorithm, however, is that it’s responsible for everything we can perceive and imagine. If something exists but the brain can’t learn it, we don’t know it exists. We may just not see it or think it’s random. Either way, if we implement the brain in a computer, that algorithm can learn everything we can. Thus one route-arguably the most popular one-to inventing the Master Algorithm is to reverse engineer the brain. Jeff Hawkins took a stab at this in his book On Intelligence. Ray Kurzweil pins his hopes for the Singularity-the rise of artificial intelligence that greatly exceeds the human variety-on doing just that and takes a stab at it himself in his book How to Create a Mind. Nevertheless, this is only one of several possible approaches, as we’ll see. It’s not even necessarily the most promising one, because the brain is phenomenally complex, and we’re still in the very early stages of deciphering it. On the other hand, if we can’t figure out the Master Algorithm, the Singularity won’t happen any time soon.
Not all neuroscientists believe in the unity of the cortex; we need to learn more before we can be sure. The question of just what the brain can and can’t learn is also hotly debated. But if there’s something we know but the brain can’t learn, it must have been learned by evolution.
The argument from evolution
Life’s infinite variety is the result of a single mechanism: natural selection. Even more remarkable, this mechanism is of a type very familiar to computer scientists: iterative search, where we solve a problem by trying many candidate solutions, selecting and modifying the best ones, and repeating these steps as many times as necessary. Evolution is an algorithm. Paraphrasing Charles Babbage, the Victorian-era computer pioneer, God created not species but the algorithm for creating species. The “endless forms most beautiful” Darwin spoke of in the conclusion of The Origin of Species belie a most beautiful unity: all of those forms are encoded in strings of DNA, and all of them come about by modifying and combining those strings. Who would have guessed, given only a description of this algorithm, that it could produce you and me? If evolution can learn us, it can conceivably also learn everything that can be learned, provided we implement it on a powerful enough computer. Indeed, evolving programs by simulating natural selection is a popular endeavor in machine learning. Evolution, then, is another promising path to the Master Algorithm.