Rosa froze the image. “There is no known human language detectable in this signal. And yet we can detect structure…”

She told us, somewhat to my surprise, that there was a flourishing discipline in the study of nonhuman languages.

It had originated in questions about animal communication. The songs of whales and whistles of dolphins were obvious case studies, but so were the hoots and screeches of chimps and monkeys, the stamping of elephants — even the dull chemical signaling of one plant to another. But how much information was contained in these messages? Even if you couldn’t translate the language, even if you didn’t know what the whales sang about, were there ways of determining if there was any information in there at all — and if so, how much, how dense? This was a discipline that in latter years had been useful in helping us figure out the sometimes cryptic utterances of our more enigmatic artificial intelligences — and, I thought, it might be useful someday if we ever encountered extraterrestrial intelligences.

Rosa waved a hand, and the air filled with graphs. It was all to do with information theory, she said, the mathematics of sequences of symbols — binary digits, DNA bases, letters, phonemes. “The first thing is to see if there is any information in your signal. And to do that you construct a Zipf graph…” This was named after a Harvard linguist of the 1940s. You broke up your signal into its elements — bases, letters, words — and then made a bar graph of their frequency of use. She showed us examples based on the English alphabet, presenting us with a kind of staircase, with the usage of the most commonly used letters — e, t, s — to the left, and lesser usages represented by more bars descending to the right. “That downward slope is a giveaway that information-rich structure is present. Think about it. If you have meaningless noise, a random sequence of letters, each one is liable to come up as often as any other.”

“So the graph would be flat,” Sonia said.

“Yes. On the other hand if you had a signal with structure but no information content — say just a long sequence of e, e, e, like a pure tone — you’d have a vertical line. Signals containing meaningful information come somewhere between those two extremes. And you can tell something about the degree of information contained by the slope of the graph.”
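In crude terms, the check Rosa described could be sketched in a few lines of code. This is only a toy illustration: the sample sentences and the uniform random noise below are stand-ins, and letter frequencies give a shallower Zipf slope than whole words would, but the contrast between sloping and flat comes through.

```python
import collections
import math
import random
import string

def zipf_slope(elements):
    """Least-squares slope of log(frequency) against log(rank)."""
    counts = collections.Counter(elements)
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# A few sentences of ordinary English, stripped down to a stream of letters.
text = (
    "there is no known human language detectable in this signal and yet "
    "we can detect structure the first thing is to see if there is any "
    "information in your signal"
).replace(" ", "")

# Meaningless noise: a long stream in which every letter is equally likely.
noise = "".join(random.choice(string.ascii_lowercase) for _ in range(100_000))

print("english letters:", round(zipf_slope(text), 2))   # clearly negative slope
print("random letters: ", round(zipf_slope(noise), 2))  # near zero: the flat graph
```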

Sonia asked, “What about the dolphins?” She glanced apologetically at Tom. “I know it’s nothing to do with your mother. I’d just like to know.”

Rosa smiled. “Actually the analysis is a little trickier in that case. With human languages, it’s easy to see the breakdown into natural units, letters, words, sentences: you can see what you must count. With nonhuman languages, like dolphin whistles, it’s harder to see the breaks between linguistic units. But you can use trial and error. Even dolphin whistles have gaps, so that’s a place to start, and then you can expand the way you decompose your signal, looking for other trial break markers, until you find the breakdown that gives you the strongest Zipf result.”
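The trial-and-error search Rosa mentioned might look something like the sketch below, reusing zipf_slope from the earlier snippet. The break rules and the toy whistle stream are invented for illustration, not taken from any real bioacoustics pipeline; the idea is only to score each candidate segmentation and keep the one with the steepest slope.

```python
def segment(stream, rule):
    """Break a raw stream into trial units according to a candidate rule."""
    if rule == "on_gaps":                         # split at pause markers
        return [unit for unit in stream.split("|") if unit]
    if rule.startswith("fixed_"):                 # fixed-length chunks, e.g. fixed_3
        size = int(rule.split("_")[1])
        return [stream[i:i + size] for i in range(0, len(stream), size)]
    raise ValueError(f"unknown rule: {rule}")

def score_segmentations(stream, rules):
    """Zipf slope for each candidate rule, most steeply negative first."""
    return sorted((zipf_slope(segment(stream, rule)), rule) for rule in rules)

whistles = "ab|abc|ab|a|abc|ab|abcd|ab|a|abc|" * 20   # toy stream, '|' marks gaps
for slope, rule in score_segmentations(whistles, ["on_gaps", "fixed_2", "fixed_4"]):
    print(rule, round(slope, 2))
```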

Sonia asked, “And the answer?”

Rosa waved a hand, like a magician. A new line on the graph appeared, below the first and parallel to it. “Dolphin whistles, and whale songs and a number of other animal signals, contain information — in fact they all show signs of optimal coding. Of course knowing there is information in there isn’t the same as having a translation. We know the dolphins are talking, but we still don’t know what they are talking about.”

“We may never know,” said Sonia, her voice tight. “Now that the oceans are empty.”

Gea rolled back and forth, friction sparks flying. You wouldn’t think a tin robot could look so judgmental.

Rosa said brightly, “As far as Morag is concerned we aren’t done yet. There is a second stage of analysis which allows us to squeeze even more data out of these signals.”

As I’d half-expected, she began to talk about entropy. The Zipf analysis showed us whether a signal contained information at all, Rosa said. The entropy analysis she presented now was going to show us how complex that information was. It makes sense that information theoreticians talk about entropy, if you think about it. Entropy comes from thermodynamics, the science of molecular motion, and is a measure of disorder — precise, quantified. So it is a kind of inverse measure of information.

Rosa showed us a new series of graphs, which plotted “Shannon entropy value” against “entropy order.” It took me a while to figure this out. The zero-order-entropy number was easiest to understand; that was just a count of the number of elements in your system, the diversity of your repertoire — in written English, that could be the twenty-six letters of the alphabet plus a few punctuation marks. First-order entropy measured how often each element came up in the language — how many times you used e versus t or s. Second-order and higher entropies were trickier. They were to do with correlations between the elements of your signal.

Rosa said, “If I give you a letter, what’s your chance of predicting the next in the signal? Q is usually followed by u, for instance. That’s second-order entropy. Third-order means, if I give you two letters, what are your chances of predicting the third? And so on. The longer the chain of entropy values, the more structure there is in your signal.”
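The ladder of entropy orders can be made concrete with the usual block-entropy estimate: zero order is just the log of the repertoire size, first order uses single-symbol frequencies, and order n measures how predictable a symbol is given the n minus one symbols before it. A minimal sketch, assuming that standard estimate; the toy sentence is a placeholder, and a real analysis needs far more data before the higher orders mean anything.

```python
import collections
import math

def block_entropy(seq, n):
    """Shannon entropy (bits) of the n-symbol blocks in the sequence."""
    blocks = [tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)]
    counts = collections.Counter(blocks)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def order_entropy(seq, order):
    """0 = repertoire size, 1 = single-symbol frequencies,
    n >= 2 = predictability of a symbol given the n-1 symbols before it."""
    if order == 0:
        return math.log2(len(set(seq)))
    if order == 1:
        return block_entropy(seq, 1)
    # conditional entropy H(next symbol | previous order-1 symbols)
    return block_entropy(seq, order) - block_entropy(seq, order - 1)

text = "the quick brown fox jumps over the lazy dog and the dog barks back"
for k in range(4):
    print(k, round(order_entropy(text, k), 3))
```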

The most primitive communications we knew of were chemical signaling between plants. Here you couldn’t go beyond first-order Shannon entropy: given a signal, you couldn’t guess what the next would be. Human languages showed eighth- or ninth-order entropy.

We talked around the meaning of this. The Shannon entropy order has something to do with the complexity of the language. There is a limit to how far you can spin out a paragraph, or even an individual sentence, if you want to keep it comprehensible — though a more advanced mind could presumably unravel a lot more complexity.

Sonia asked, “And the dolphins?”

Sadly, the dolphins’ whistles showed no more than third- or fourth-order Shannon entropy. They beat out most primates, but not by much.

“I guess they were too busy having fun after all,” Sonia said wistfully.

Tom had glowered all the way through this. Now he asked, “And the signal from the mother-thing? What does your analysis tell us about that?”

“It passes the Zipf test,” Rosa said. “And as for entropy—”

She laid a new line on her graphical display of plant, chimp, dolphin, human languages. Sloping shallowly, it tailed away into the distance of the graph’s right-hand side, far beyond the human.

“The analysis is uncertain,” Rosa said. “As you can imagine we’ve never actually encountered a signal like this before. Human languages, remember, reach Shannon order eight or nine. This signal, Morag’s speech, appears to be at least order thirty. We have to accept, I think, that Morag’s speech does contain information, of a sort. But it is couched in a fantastically abstruse form. As if it contains layers of nested clauses, overlapping tense changes, double, triple, quadruple negatives, all crammed into each sentence—”

“Jeez,” Shelley said. “No wonder we can’t figure it out.” She sounded daunted, even humbled.

It wasn’t a comfortable thought for me either. The bright new artificial minds, such as Gea, would surely have scored more highly than us on a scale like this — but at least we made them. This was different; this was outside humanity’s scope altogether. Suddenly we were going to have to get used to sharing the universe with a different order of intelligence than us.