Читать онлайн "The Master Algorithm: How the Quest for the Ultimate Learning Machine Will Remake Our World" - Domingos Pedro - RuLit

The most important curve in the world

As far as its neighbors are concerned, a neuron can only be in one of two states: firing or not firing. This misses an important subtlety, however. Action potentials are short lived; the voltage spikes for a small fraction of a second and immediately goes back to its resting state. And a single spike barely registers in the receiving neuron; it takes a train of spikes closely on each other’s heels to wake it up. A typical neuron spikes occasionally in the absence of stimulation, spikes more and more frequently as stimulation builds up, and saturates at the fastest spiking rate it can muster, beyond which increased stimulation has no effect. Rather than a logic gate, a neuron is more like a voltage-to-frequency converter. The curve of frequency as a function of voltage looks like this:

This curve, which looks like an elongated S, is variously known as the logistic, sigmoid, or S curve. Peruse it closely, because it’s the most important curve in the world. At first the output increases slowly with the input, so slowly it seems constant. Then it starts to change faster, then very fast, then slower and slower until it becomes almost constant again. The transfer curve of a transistor, which relates its input and output voltages, is also an S curve. So both computers and the brain are filled with S curves. But it doesn’t end there. The S curve is the shape of phase transitions of all kinds: the probability of an electron flipping its spin as a function of the applied field, the magnetization of iron, the writing of a bit of memory to a hard disk, an ion channel opening in a cell, ice melting, water evaporating, the inflationary expansion of the early universe, punctuated equilibria in evolution, paradigm shifts in science, the spread of new technologies, white flight from multiethnic neighborhoods, rumors, epidemics, revolutions, the fall of empires, and much more. The Tipping Point could equally well (if less appealingly) be entitled The S Curve. An earthquake is a phase transition in the relative position of two adjacent tectonic plates. A bump in the night is just the sound of the microscopic tectonic plates in your house’s walls shifting, so don’t be scared. Joseph Schumpeter said that the economy evolves by cracks and leaps: S curves are the shape of creative destruction. The effect of financial gains and losses on your happiness follows an S curve, so don’t sweat the big stuff. The probability that a random logical formula is satisfiable-the quintessential NP-complete problem-undergoes a phase transition from almost 1 to almost 0 as the formula’s length increases. Statistical physicists spend their lives studying phase transitions.

In Hemingway’s The Sun Also Rises, when Mike Campbell is asked how he went bankrupt, he replies, “Two ways. Gradually and then suddenly.” The same could be said of Lehman Brothers. That’s the essence of an S curve. One of the futurist Paul Saffo’s rules of forecasting is: look for the S curves. When you can’t get the temperature in the shower just right-first it’s too cold, and then it quickly shifts to too hot-blame the S curve. When you make popcorn, watch the S curve’s progress: at first nothing happens, then a few kernels pop, then a bunch more, then the bulk of them in a sudden burst of fireworks, then a few more, and then it’s ready to eat. Every motion of your muscles follows an S curve: slow, then fast, then slow again. Cartoons gained a new naturalness when the animators at Disney figured this out and started copying it. Your eyes move in S curves, fixating on one thing and then another, along with your consciousness. Mood swings are phase transitions. So are birth, adolescence, falling in love, getting married, getting pregnant, getting a job, losing it, moving to a new town, getting promoted, retiring, and dying. The universe is a vast symphony of phase transitions, from the cosmic to the microscopic, from the mundane to the life changing.

The S curve is not just important as a model in its own right; it’s also the jack-of-all-trades of mathematics. If you zoom in on its midsection, it approximates a straight line. Many phenomena we think of as linear are in fact S curves, because nothing can grow without limit. Because of relativity, and contra Newton, acceleration does not increase linearly with force, but follows an S curve centered at zero. So does electric current as a function of voltage in the resistors found in electronic circuits, or in a light bulb (until the filament melts, which is itself another phase transition). If you zoom out from an S curve, it approximates a step function, with the output suddenly changing from zero to one at the threshold. So depending on the input voltages, the same curve represents the workings of a transistor in both digital computers and analog devices like amplifiers and radio tuners. The early part of an S curve is effectively an exponential, and near the saturation point it approximates exponential decay. When someone talks about exponential growth, ask yourself: How soon will it turn into an S curve? When will the population bomb peter out, Moore’s law lose steam, or the singularity fail to happen? Differentiate an S curve and you get a bell curve: slow, fast, slow becomes low, high, low. Add a succession of staggered upward and downward S curves, and you get something close to a sine wave. In fact, every function can be closely approximated by a sum of S curves: when the function goes up, you add an S curve; when it goes down, you subtract one. Children’s learning is not a steady improvement but an accumulation of S curves. So is technological change. Squint at the New York City skyline and you can see a sum of S curves unfolding across the horizon, each as sharp as a skyscraper’s corner.

Most importantly for us, S curves lead to a new solution to the credit-assignment problem. If the universe is a symphony of phase transitions, let’s model it with one. That’s what the brain does: it tunes the system of phase transitions inside to the one outside. So let’s replace the perceptron’s step function with an S curve and see what happens.

Climbing mountains in hyperspace

In the perceptron algorithm, the error signal is all or none: you got it either right or wrong. That’s not much to go on, particularly if you have a network of many neurons. You may know that the output neuron is wrong (oops, that wasn’t your grandmother), but what about some neuron deep inside the brain? What does it even mean for such a neuron to be right or wrong? If the neurons’ output is continuous instead of binary, the picture changes. For starters, we now know how much the output neuron is wrong by: the difference between it and the desired output. If the neuron should be firing away (“Oh hi, Grandma!”) and is firing a little, that’s better than if it’s not firing at all. More importantly, we can now propagate that error to the hidden neurons: if the output neuron should fire more and neuron A connects to it, then the more A is firing, the more we should strengthen their connection; but if A is inhibited by another neuron B, then B should fire less, and so on. Based on the feedback from all the neurons it’s connected to, each neuron decides how much more or less to fire. Based on that and the activity of its input neurons, it strengthens or weakens its connections to them. I need to fire more, and neuron B is inhibiting me? Lower its weight. And neuron C is firing away, but its connection to me is weak? Strengthen it. My “customer” neurons, downstream in the network, will tell me how well I’m doing in the next round.