∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞

How much does a typical person know?

Everyone knows a good deal about many objects, topics, words, and ideas—and one might suppose that a typical person knows an enormous amount. However, the following argument seems to suggest that the total extent of a person’s commonsense knowledge might not be so vast. Of course, it is hard to measure this, but we can start by observing that every person knows thousands of words, and that each of those must be linked in our minds to as many as a thousand other such items. Also a typical person knows hundreds of uses and properties of thousands of different common objects. Similarly, in the social realm, one may know thousands of things about tens of people, hundreds of things about hundreds of people, and tens of useful items about as many as a thousand people.

This suggests that in each important realm, one might know perhaps a million things. But while it is easy to think of a dozen such realms, it is hard to think of a hundred of them. This suggests that a machine that does humanlike reasoning might only need a few dozen millions of items of knowledge.[98]
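To see how these round numbers combine, here is a back-of-the-envelope sketch in Python; every figure in it is an illustrative assumption drawn from the ranges above, not a measurement.

```python
# Rough tally of the estimate above; all figures are illustrative assumptions.

word_items   = 5_000 * 200                      # words known x links per word
object_items = 5_000 * 200                      # common objects x uses/properties each
social_items = 10*1_000 + 100*100 + 1_000*10    # people known at decreasing depth

per_realm = max(word_items, object_items, social_items)  # roughly a million items
realms    = 30    # easy to name a dozen realms, hard to name a hundred
total     = per_realm * realms

print(f"about {total:,} items")   # a few tens of millions of knowledge items
```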

Citizen: Perhaps so, but I have heard of phenomenal feats of memory. What about persons with photographic memories, who can recollect all the words of a book after only a single reading of it? Could it be that we all remember, to some extent, everything that happens to us?

We all have heard such anecdotes, but whenever we try to investigate one, we usually fail to uncover the source, or find that someone was fooled by a magician's trick. Many a person has memorized an entire book of substantial size (usually a religious tract), but no one has ever been shown to have memorized a hundred such books. Here is what one psychologist said about a person who appeared to him to possess a prodigious memory:

Alexander R. Luria: “For almost thirty years the author had an opportunity systematically to observe a man whose remarkable memory ... for all practical purposes was inexhaustible” (p. 3). ... “It was of no consequence to him whether the series I gave him contained meaningful words or nonsense syllables, numbers or sounds; whether they were presented orally or in writing. All that he required was that there be a three-to-four-second pause between each element in the series. ... And he could manage, also, to repeat the performance fifteen years later, from memory.”[99]

This may seem remarkable, but it might not be truly exceptional, because, in 1986, Thomas Landauer concluded that, during any extended interval, none of his subjects could learn at a rate of more than about 2 bits per second, whether the realm be visual, verbal, musical, or whatever. So, if Luria’s subject required four seconds per word, he was well within Landauer’s estimate.[100] And even if that individual were to continue this over the course of a typical lifetime, this rate of memorization would produce no more than 4000 million bits—a database that would easily fit on the surface of a Compact Disk.
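The arithmetic is easy to verify with a minimal sketch in Python, where the 70-year span and round-the-clock learning are deliberately generous assumptions and only the 2-bits-per-second rate comes from Landauer:

```python
# Landauer's ceiling applied over an entire lifetime (generous assumptions).

bits_per_second = 2                      # Landauer's upper estimate
seconds = 70 * 365 * 24 * 3600           # ~70 years, learning without pause
total_bits = bits_per_second * seconds   # ~4.4e9 bits, i.e. ~4000 million

print(f"{total_bits/1e9:.1f} billion bits")
print(f"{total_bits/8/1e6:.0f} megabytes")   # ~550 MB, within a ~650 MB CD
```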

Student: I’m uncomfortable with this argument. I agree that it might apply to our higher-level kinds of knowledge. But our sensory and motor skills might be based on much larger amounts of information.

We don’t have a good way to measure such things, and making such estimates raises hard questions about how those fragments of knowledge are stored and connected. Still, we have no solid evidence that any person has ever surpassed the limits that Landauer’s research suggests.[101]

Chapter §7 will speculate about how we organize knowledge so that, whenever one of our processes fails, we can usually find an alternative. But here we’ll change the subject to ask how we could endow a machine with the kinds of knowledge that people have.

∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞∞

Could we build a Baby-Machine?

Alan Turing: “We cannot expect to find a good child machine at the first attempt. One must experiment with teaching one such machine and see how well it learns. One can then try another and see if it is better or worse [but] survival of the fittest is a slow method for measuring advantages. The experimenter, by the exercise of intelligence, should be able to speed it up [because] if he can trace a cause for some weakness he can probably think of the kind of mutation which will improve it.”[102]

To equip a machine with something like the knowledge we find in a typical person, we would want it to know about books and strings; about floors, ceilings, windows, and walls; about eating, sleeping, and going to work. And it wouldn’t be very useful to us unless it knew about typical human ideals and goals.

Programmer: Then, why not build a ‘baby-machine’ that learns what it needs from experience? Equip a robot with sensors and motors, and program it so that it can learn by interacting with the real world—the way that a human infant does. It could start with simple If-Then schemes, and then later invent more elaborate ones.

This is an old and popular dream: to build a machine that starts by learning in simple ways and then later develops more powerful methods—until it becomes intelligent. In fact several actual projects have had this goal, and each such system made progress at first but eventually stopped extending itself.[103] I suspect that this usually happened because those programs failed to develop good new ways to represent knowledge.
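To make those “simple If-Then schemes” concrete, here is a minimal sketch in Python of the kind of rule-learning loop such projects typically begin with; the toy world and every name in it are hypothetical, not drawn from any of the systems cited.

```python
import random

class BabyMachine:
    """Toy If-Then rule learner (all names and details are hypothetical)."""

    def __init__(self, actions):
        self.actions = actions
        self.rules = {}    # (situation, action) -> remembered average reward

    def act(self, situation):
        # If remembered rules fit this situation, usually pick the best one;
        # otherwise (or occasionally, to explore) try something at random.
        known = {a: r for (s, a), r in self.rules.items() if s == situation}
        if known and random.random() > 0.2:
            return max(known, key=known.get)
        return random.choice(self.actions)

    def learn(self, situation, action, reward):
        # Blend the new outcome into the rule's remembered value.
        old = self.rules.get((situation, action), 0.0)
        self.rules[(situation, action)] = 0.9 * old + 0.1 * reward

# Toy use: the machine soon learns "If hungry, Then eat."
baby = BabyMachine(actions=["eat", "sleep", "play"])
for _ in range(200):
    situation = random.choice(["hungry", "tired"])
    action = baby.act(situation)
    reward = 1.0 if (situation, action) in {("hungry", "eat"), ("tired", "sleep")} else 0.0
    baby.learn(situation, action, reward)
```

Notice that this learner's representation, a flat table of situation-action pairs, is fixed in advance: it can fill the table in, but it can never invent a better structure, which is precisely the limitation described above.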

Inventing good new ways to represent knowledge is a major goal of computer science. However, even when such representations are discovered, they are rarely adopted quickly and widely, because one must also develop good skills for working with them efficiently. And since such skills take time to grow, you will have to make yourself tolerate periods in which your performance becomes not better, but worse.

The Investment Principle: It is hard to appreciate the virtues of a new technique because, until you become proficient with it, it will not produce results as good as you’ll get from the methods that you are familiar with.

No one has yet made a baby-machine that developed effective new kinds of representations for itself. Chapter §10 will argue that human brains are born equipped with machinery that eventually provides them with several different ways to represent various types of knowledge.

Here is another problem with “baby-machines.” It is easy to program computers to learn fairly simple new If-Then rules; however, if a system does this too recklessly, it is likely to deteriorate from accumulating too much irrelevant information. Chapter §8 will argue that unless learning is done selectively, by making appropriate “Credit Assignments,” a machine will fail to learn the right things from most of its experiences.
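As a hedged illustration of what such selective learning might involve, here is a small sketch in Python; the backward-decaying credit scheme and all names are my own assumptions for illustration, not the mechanism Chapter §8 will actually propose.

```python
def assign_credit(episode, reward, decay=0.5):
    """Credit the steps that plausibly led to a delayed reward.

    episode: list of (situation, action) steps, oldest first.
    Recent steps receive the most credit; earlier ones exponentially less,
    so the machine does not reinforce everything it happened to do.
    """
    credit = {}
    weight = 1.0
    for step in reversed(episode):    # walk backward from the outcome
        credit[step] = credit.get(step, 0.0) + reward * weight
        weight *= decay
    return credit

episode = [("see-box", "push"), ("see-stool", "climb"), ("see-banana", "grab")]
print(assign_credit(episode, reward=1.0))
# {('see-banana', 'grab'): 1.0, ('see-stool', 'climb'): 0.5, ('see-box', 'push'): 0.25}
```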

Entrepreneur: Instead of trying to build a system that learns by itself, why not make one that searches the Web to extract knowledge from those millions of pages of content-rich text?

That certainly is a tempting idea, for the World Wide Web must contain more knowledge than any one person could comprehend. However, it does not explicitly include the knowledge that one would have to use to understand what all those texts mean. Consider the kind of story we find in a typical young child’s reading book.

[98] This discussion is adapted from my introduction to Semantic Information Processing, MIT Press, 1969.

[99] Alexander R. Luria, The Mind of a Mnemonist, Cambridge: Harvard University Press, 1968.

[100] Landauer, Thomas K. (1986), “How much do people remember? Some estimates of the quantity of learned information in long-term memory,” Cognitive Science, 10, 477-493. See also Ralph Merkle’s description of this at http://www.merkle.com/humanMemory.html. Furthermore, according to Ronald Rosenfeld, the information in typical text is close to about 6 bits per word; see Rosenfeld, Ronald, “A maximum entropy approach to adaptive statistical language modeling,” Computer Speech and Language, 10, 1996, also at http://www.cs.cmu.edu/afs/cs/user/roni/WWW/me-csl-revised.ps. In these studies, the term ‘bit’ is meant in the technical sense of C. E. Shannon; see http://cm.bell-labs.com/cm/ms/what/shannonday/paper.html.

[101] My impression is that this also applies to the results reported by R. N. Haber in Behavioral and Brain Sciences, 2, 583-629, 1979.

[102] A. M. Turing, “Computing Machinery and Intelligence,” Mind, 59 (1950), pp. 433-460, at www.cs.swarthmore.edu/~dylan/Turing.html.

[103] See several essays about self-organizing learning systems: Gary Drescher, Made-Up Minds, MIT Press, 1991, ISBN 0262041200; Lenat’s 1983 “AM” system, at http://web.media.mit.edu/~haase/thesis/node52.html; Kenneth Haase’s thesis at http://web.media.mit.edu/~haase/thesis/; Pivar, M. and Finkelstein, M. (1964), in The Programming Language LISP, MIT Press, 1966; Solomonoff, R. J., “A formal theory of inductive inference,” Information and Control, 7 (1964), pp. 1-22; Solomonoff, R. J., “An Inductive Inference Machine,” IRE Convention Record, Section on Information Theory, Part 2, pp. 56-62, 1957. Also see his essay at http://world.std.com/~rjs/barc97.html. In recent years this work has led to a field of research named ‘Genetic Programming.’

[104] Technically, if a system has already been optimized, then any change is likely to make it worse until one finds a higher peak, some distance away in the “fitness space.”