A number of scholars have been highly critical of that radical idea. Steven Pinker and the linguist Ray Jackendoff have argued that recursion might actually be found in other aspects of the mind (such as the process by which we recognize complex objects as being composed of recognizable subparts). The primatologist David Premack, meanwhile, has suggested that although recursion is a hallmark of human language, it is scarcely the only thing separating human language from other forms of communication. As Premack has noted, it’s not as if chimpanzees can speak an otherwise humanlike language that lacks recursion (which might consist of language minus complexities such as embedded clauses).[34] I’d like to go even further, though, and take what we’ve learned about the nature of evolution and humans to turn the whole argument on its head.
The sticking point is what linguists call syntactic trees, diagrams like this:
<…>
Small elements can be combined to form larger elements, which in turn can be combined into still larger elements. There’s no problem in principle with building such things — computers use trees, for example, in representing the directory, or “folder” structures, on a hard drive.
But, as we have seen time and again, what is natural for computers isn’t always natural for the human brain: building a tree would require a precision in memory that humans just don’t appear to have. Building a tree structure with postal-code memory is trivial, something that the world’s computer programmers do many times a day. But building a tree structure out of contextual memory is a totally different story, a kluge that kind of works and kind of doesn’t.
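To give a rough sense of how little effort this takes on a machine whose memory is organized by location, here is a minimal Python sketch. The `Node` class, the labels `S`, `NP`, and `VP`, and the example sentence are all assumptions chosen for illustration, not a notation taken from the text.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One element of a tree: a label plus zero or more children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the words at the bottom of the tree, left to right."""
    if not node.children:
        return [node.label]
    words: List[str] = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def depth(node: Node) -> int:
    """Number of levels from this node down to its deepest leaf."""
    if not node.children:
        return 1
    return 1 + max(depth(child) for child in node.children)


# Small elements combine into larger ones, which combine again.
noun_phrase = Node("NP", [Node("the"), Node("farmers")])
verb_phrase = Node("VP", [Node("slept")])
sentence = Node("S", [noun_phrase, verb_phrase])

print(leaves(sentence))  # ['the', 'farmers', 'slept']
print(depth(sentence))   # 3
```

Every child sits in a known slot of its parent's list, which is exactly the kind of precise, address-like bookkeeping that contextual memory lacks.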
Working with simple sentences, we’re usually fine, but our capacity to understand sentences can easily be compromised. Take, for example, this short sentence I mentioned in the opening chapter:
People people left left.
Here’s a slightly easier variant:
Farmers monkeys fear slept.
Four words each, but enough to boggle most people’s minds. Yet both sentences are perfectly grammatical. The first means that some set of people who were abandoned by a second group of people themselves departed; the second one means, roughly, “There is a set of farmers that the monkeys fear, and that set of farmers slept; the farmers that the monkeys were afraid of slept.” These kinds of sentences, known in the trade as “center embeddings” (because they bury one clause directly in the middle of another), are difficult, I submit, precisely because evolution never stumbled on proper tree structure.[35]
Here’s the thing: in order to interpret sentences like these and fully represent recursion (another classic is The rat the cat the mouse chased bit died), we would need to keep track of each noun and each verb, and at the same time hold in mind the connections between them and the clauses they form. Which is just what grammatical trees are supposed to do.
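For a machine, this bookkeeping is conventionally handled with a stack. The sketch below is a toy under stated assumptions: the `NOUNS` and `VERBS` word lists and the function `pair_subjects_with_verbs` are hypothetical names invented for this one example, and the approach handles only simple center embeddings, not English in general.

```python
from typing import List, Tuple

# Toy lexicon for this one example; a real parser would need far more.
NOUNS = {"rat", "cat", "mouse"}
VERBS = {"chased", "bit", "died"}


def pair_subjects_with_verbs(sentence: str) -> List[Tuple[str, str]]:
    """Match each verb to its subject in a center-embedded sentence.

    Nouns are pushed onto a stack as they arrive; each verb pops the most
    recently pushed noun that is still waiting for its verb.
    """
    stack: List[str] = []
    pairs: List[Tuple[str, str]] = []
    for word in sentence.lower().rstrip(".").split():
        if word in NOUNS:
            stack.append(word)
        elif word in VERBS:
            if not stack:
                raise ValueError(f"verb {word!r} has no subject waiting")
            pairs.append((stack.pop(), word))
        # Other words (here, "the") are ignored in this sketch.
    if stack:
        raise ValueError(f"nouns left without a verb: {stack}")
    return pairs


print(pair_subjects_with_verbs("The rat the cat the mouse chased bit died."))
# [('mouse', 'chased'), ('cat', 'bit'), ('rat', 'died')]
```

Each noun waits on the stack until its verb arrives, so the machine has to remember exactly which nouns are still pending, and in what order.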
The trouble is, to do that would require an exact memory for the structures and words that have just been said (or read). And that’s something our postal-code-free memories just aren’t set up to do. If I were to read this book aloud and suddenly, without notice, stop and ask you to repeat the last sentence you heard — you probably couldn’t. You’d likely remember the gist of what I had said, but the exact wording would almost surely elude you.[36]
As a result, efforts to keep track of the structure of sentences become a bit like efforts to reconstruct the chronology of a long-ago sequence of events: clumsy, unreliable, but better than nothing. Consider, for example, a sentence like It was the banker that praised the barber that alienated his wife that climbed the mountain. Now, quick: was the mountain climbed by the banker, the barber, or his wife? A computer-based parser would have no trouble answering this question; each noun and each verb would be slotted into its proper place in a tree. But many human listeners end up confused. Lacking any hint of memory organized by location, the best we can do is approximate trees, clumsily kluging them together out of contextual memory. If we receive enough distinctive clues, it’s not a problem, but when the individual components of sentences are similar enough to confuse, the whole edifice comes tumbling down.
Perhaps the biggest problem with grammar is not the trouble we have in constructing trees, but the trouble we have in producing sentences that are certain to be parsed as we intend them to be. Since our sentences are clear to us, we assume they are clear to our listeners. But often they’re not; as engineers discovered when they started trying to build machines to understand language, a significant fraction of what we say is quietly ambiguous.[37]
Take, for example, this seemingly benign sentence: Put the block in the box on the table. An ordinary sentence, but it can actually mean two things: a request to put a particular block that happens to be in a box onto the table, or a request to take some block and put it into a particular box that happens to be on the table. Add another clause, and we wind up with four possibilities:
Put the block [(in the box on the table) in the kitchen].
Put the block [in the box (on the table in the kitchen)].
Put [the block (in the box) on the table] in the kitchen.
Put (the block in the box) (on the table in the kitchen).
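The simpler two-way case can be made concrete with a small Python sketch. It writes the two readings of Put the block in the box on the table as nested groupings and checks that they contain the same words in the same order while differing in structure. The tuple encoding and the names `words`, `block_already_in_box`, and `box_already_on_table` are assumptions made for illustration, and the groupings are deliberately coarse.

```python
from typing import Iterator, Tuple, Union

Tree = Union[str, Tuple["Tree", ...]]


def words(tree: Tree) -> Iterator[str]:
    """Yield the words of a nested grouping, left to right."""
    if isinstance(tree, str):
        yield from tree.split()
    else:
        for part in tree:
            yield from words(part)


# Reading 1: the block that is in the box goes onto the table.
block_already_in_box = ("put", ("the block", "in the box"), "on the table")

# Reading 2: some block goes into the box that is on the table.
box_already_on_table = ("put", "the block", ("in the box", "on the table"))

same_words = list(words(block_already_in_box)) == list(words(box_already_on_table))
same_grouping = block_already_in_box == box_already_on_table

print(same_words)     # True
print(same_grouping)  # False
```

The listener receives only the flat string of words; the grouping, which is where the two meanings part ways, has to be guessed.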
Most of the time, our brain shields us from the complexity, automatically doing its best to reason its way through the possibilities. If we hear Put the block in the box on the table, and there’s just one block, we don’t even notice the fact that the sentence could have meant something else. Language alone doesn’t tell us that, but we are clever enough to connect what we hear with what it might mean. (Speakers also use a range of “paralinguistic” techniques, like pointing and gesturing, to supplement language; they can also look to their listeners to see if they appear to understand.)
But such tricks can take us only so far. When we are stuck with inadequate clues, communication becomes harder, which is one reason that emails and phone calls are more prone to misunderstandings than face-to-face communication. And even when we speak directly to an audience, if we use ambiguous sentences, people may just not notice; they may think they’ve understood even when they haven’t really. One eye-opening study recently asked college students to read aloud a series of grammatically ambiguous[38] sentences like Angela shot the man with the gun (in which the gun might have been either Angela’s murder weapon or a firearm the victim happened to be carrying). They were warned in advance that the sentences were ambiguous and permitted to use as much stress (emphasis) on individual words as they liked; the question was whether they could tell when they successfully put their meaning across. It turns out that most speakers were lousy at the task and had no clue about how bad they were. In almost half the cases in which subjects thought that they had successfully conveyed a given sentence’s meaning, they were actually misunderstood by their listeners! (The listeners weren’t much better, frequently assuming they’d understood when they hadn’t.)
34
In a hypothetical recursion-free language, you might, for example, be able to say “Give me the fruit” and “The fruit is on the tree,” but not the more complex expression “Give me the fruit that is hanging on the tree that is missing a branch.” The words “that is hanging on the tree that is missing a branch” represent an embedded clause itself containing an embedded clause.
35
Recursion can actually be divided into two forms, one that requires a stack and one that doesn’t. The one that doesn’t is easy. For example, we have no trouble with sentences like
36
Perhaps the most extreme version of remembering only the gist was Woody Allen’s five-word summary of
The problem with trees is much the same as the problem with keeping track of our goals. You may recall, from the chapter on memory, the example of what sometimes happens when we plan to stop at the grocery store after work (and instead “autopilot” our way home, sans groceries). In a computer, both types of problems (tracking goals and tracking trees) are typically solved by using a “stack,” in which recent elements temporarily take priority over stored ones; but when it comes to humans, our lack of postal-code memory leads to problems in both cases.
As it happens, there are actually two separate types of recursion, one that requires stacks and one that doesn’t. It is precisely the type that requires stacks that ties us in knots.
37
According to legend, the first machine translation program was given the sentence “The flesh is weak, but the spirit is willing.” The translation (into Russian) was then translated back into English, yielding, “The meat is spoiled, but the vodka is good.”
38
Ambiguity comes in two forms, lexical and syntactic. Lexical ambiguity is about the meanings of individual words; I tell you to go have a ball, and you don’t know whether I mean a good time, an elaborate party, or an object for playing tennis. Syntactic (or grammatical) ambiguity, in contrast, is about sentences like
In a perfect language, in an organism with properly implemented trees, this sort of inadvertent ambiguity wouldn’t be a problem; instead we’d have the option of using what mathematicians use: parentheses, which are basically symbols that tell us how to group things. (2 × 3) + 2 = 8, while 2 × (3 + 2) = 10. We’d be able to easily articulate the difference between