Tongue-twisters emerge as a consequence of the complicated dance that the articulators perform. It’s not enough to close our mouth or move our tongue in a basic set of movements; we have to coordinate each one in precisely timed ways. Two words can be made up of exactly the same physical motions performed in a slightly different sequence. Mad and ban, for example, each require the same four crucial movements — the velum (soft palate) widens, the tongue tip moves toward alveolar closure, the tongue body widens in the pharynx, and the lips close — but one of those gestures is produced early in one word (mad) and late in another (ban). Problems occur as speech speeds up — it gets harder and harder to get the timing right. Instead of building a separate timer (a clock) for each gesture, nature forces one timer into double (or triple, or quadruple) duty.

And that timer, which evolved long before language, is really good at only very simple rhythms: keeping things either exactly in phase (clapping) or exactly out of phase (alternating steps in walking, alternating strokes in swimming, and so forth). All that is fine for walking or running, but not if you need to perform an action with a more complex rhythm. Try, for example, to tap your right hand at twice the rate of your left. If you start out slow, this should be easy. But now gradually increase the tempo. Sooner or later you will find that the rhythm of your tapping will break down (the technical term is devolve) from a ratio of 2:1 to a ratio of 1:1.

Which returns us to tongue-twisters. Saying the words she sells properly involves a challenging coordination of movements, very much akin to tapping at the 2:1 ratio. If you first say the words she and sells aloud, slowly and separately, you’ll realize that the /s/ and /sh/ sounds have something in common — a tongue-tip movement — but only /sh/ also includes a tongue-body gesture. Saying she sells properly thus requires coordinating two tongue-tip gestures with one tongue-body gesture. When you say the words slowly, everything is okay, but say them fast, and you’ll stress the internal clock. The ratio eventually devolves to 1:1, and you wind up sticking in a tongue-body gesture for every tongue-tip gesture, rather than every other one. Voilà, she sells has become she shells. What “twists” your tongue, in short, is not a muscle but a limitation in an ancestral timing mechanism.

The peculiar nature of our articulatory system, and how it evolved, leads to one more consequence: the relation between sound waves and phonemes (the smallest distinct speech sounds, such as /s/ and /a/) is far more complicated than it needs to be. Just as our pronunciation of a given sequence of letters depends on its linguistic context (think of how you say ough when reading the title of Dr. Seuss’s book The Tough Coughs As He Ploughs the Dough), the way in which we produce a particular linguistic element depends on the sounds that come before it and after it. For example, the sound /s/ is pronounced in one way in the word see (with spread lips) but in another in the word sue (with rounded lips). This makes learning to talk a lot more work than it might otherwise be. (It’s also part of what makes computerized voice-recognition a difficult problem.)

Why such a complex system? Here again, evolution is to blame; once it locked us into producing sounds by articulatory choreography, the only way to keep up the speed of communication was to cut corners. Rather than produce every phoneme as a separate, distinct element (as a simple computer modem would), our speech system starts preparing sound number two while it’s still working on sound number one. Thus, before I start uttering the h in happy, my tongue is already scrambling into position in anticipation of the a. When I’m working on a, my lips are already getting ready for the pp, and when I’m on pp, I’m moving my tongue in preparation for the y.

This dance keeps the speed up, but it requires a lot of practice and can complicate the interpretation of the message.[31] What’s good for muscle control isn’t necessarily good for a listener. If you should mishear John Fogerty’s “There’s a bad moon on the rise” as “There’s a bathroom on the right,” so be it. From the perspective of evolution, the speech system, which works most of the time, is good enough, and that’s all that matters.

Curmudgeons of every generation think that their children and grandchildren don’t speak properly. Ogden Nash put it this way in 1962, in “Laments for a Dying Language”:

Coin brassy words at will, debase the coinage;
We’re in an if-you-cannot-lick-them-join age,
A slovenliness provides its own excuse age,
Where usage overnight condones misusage.
Farewell, farewell to my beloved language,
Once English, now a vile orangutanguage.

Words in computer languages are fixed in meaning, but words in human languages change constantly; one generation’s bad means “bad,” and the next generation’s bad means “good.” Why is it that languages can change so quickly over time?

Part of the answer stems from how our prelinguistic ancestors evolved to think about the world: not as philosophers or mathematicians, brimming with precision, but as animals perpetually in a hurry, frequently settling for solutions that are “good enough” rather than definitive.

Take, for example, what might happen if you were walking through the Redwood Forest and saw a tree trunk; odds are, you would conclude that you were looking at a tree, even if that trunk happened to be so tall that you couldn’t make out any leaves above. This habit of making snap judgments based on incomplete evidence (no leaves, no roots, just a trunk, and still we conclude we’ve seen a tree) is something we might call a logic of “partial matching.”

The logical antithesis, of course, would be to wait until we’d seen the whole thing; call that a logic of “full matching.” As you can imagine, he who waits until he’s seen the whole tree would never be wrong, but also risks missing a lot of bona fide foliage. Evolution rewarded those who were swift to decide, not those who were too persnickety to act.

For better or worse, language inherited this system wholesale. You might think of a chair, for instance, as something with four legs, a back, and a horizontal surface for sitting. But as the philosopher Ludwig Wittgenstein (1889-1951) realized, few concepts are really defined with such precision. Beanbag chairs, for example, are still considered chairs, even though they have neither an articulated back nor any sort of legs.

I call my cup of water a glass even though it’s made of plastic; I call my boss the chair of my department even though so far as I can tell she merely sits in one. A linguist or phylogenist uses the word tree to refer to a diagram on a page simply because it has branching structures, not because it grows, reproduces, or photosynthesizes. A head is the topside of a penny, the tail the bottom, even though the top has no more than a picture of a head, the bottom not a fiber of a wagging tail. Even the slightest fiber of connection suffices, precisely because words are governed by an inherited, ancestral logic of partial matches.[32]

[31]

Co-articulation did not evolve exclusively for use in speech; we see the same principle at work in skilled pianists (who prepare for thumb-played notes about two notes before they play them), skilled typists, and major league baseball pitchers (who prepare the release of the ball well before it occurs).

The same goes for mishearing Jimi Hendrix’s “Excuse me while I kiss the sky” as “Excuse me while I kiss this guy.” If you, like me, get a kick out of these examples, search Google for the term mondegreen and find oodles more.

[32]

Is that good or bad? That depends on your point of view. The logic of partial matches is what makes languages sloppy, and, for better or worse, keeps poets, stand-up comedians, and linguistic curmudgeons gainfully employed. (“Didja ever notice that a near-miss isn’t a miss at all?”)