Stuck, Brandenburg isolated samples of “lonely” voices. The first was a recording of a difficult German dialect that had plagued audio engineers for years. The second was a snippet of Suzanne Vega singing the opening bars of “Tom’s Diner,” her 1987 radio hit. Perhaps you remember the a cappella intro to “Tom’s Diner.” It goes like this:
Dut dut duh dut
Dut dut duh dut
Dut dut duh dut
Dut dut duh dut
Vega had a beautiful voice, but on the early stereo encodings it sounded as if there were rats scratching at the tape.
In 1989, Brandenburg defended his thesis and was awarded his PhD. He then took the voice samples with him on a fellowship to AT&T’s Bell Labs in Murray Hill, New Jersey. There, he worked with James Johnston, a specialist in voice encoding. Johnston was the Newton to Brandenburg’s Leibniz—independently, he had hit upon an identical mathematical approach to psychoacoustic modeling, at almost exactly the same time. After an initial period spent marking territory, the two decided to cooperate. Throughout 1989, listening tests continued in parallel in Erlangen and Murray Hill, but the American test subjects proved less patient than the Germans. After listening to the same rat-eaten, four-second sample of “Tom’s Diner” several hundred times, the volunteers at Bell Labs revolted, and Brandenburg was forced to finish the experiment on his own. He was there in New Jersey, listening to Suzanne Vega, when the Berlin Wall came down.
Johnston was impressed by Brandenburg. He’d spent his life around academic researchers and was accustomed to brilliance, but he’d never seen anybody work so hard. Their collaboration spurred several breakthroughs, and soon the scratching rats were banished. In early 1990, Brandenburg returned to Germany with a nearly finished product in hand. Many compressed samples now revealed a state of perfect “transparency”: even to a discriminating listener like Grill, using the best equipment, they were indistinguishable from the original compact discs.
Impressed, AT&T officially graced the technology with its imprimatur and a modicum of corporate funding. Thomson, a French consumer electronics concern, also began to provide money and technical support. Both firms were seeking an edge in psychoacoustics, as this long-ignored academic discipline was suddenly white hot. Research teams from Europe, Japan, and the United States had been working on the same problem, and other large corporations were jockeying for position. Many had thrown their weight behind Fraunhofer’s better-established competitors. Seeking to mediate, the Moving Picture Experts Group (MPEG)—the standards committee that even today decides which technology makes it to the consumer marketplace—convened a contest in Stockholm in June 1990 to conduct formalized listening tests for the competing methods.
As the ’90s opened, MPEG was preparing for a decade of disruption, shaping technological standards for near-future technologies like high-definition television and the digital video disc. Being moving picture experts, the committee had first focused exclusively on video quality. Audio encoding problems were an afterthought, one they’d tackled only after Brandenburg pointed out that there was no longer much of a market for silent movies. (This was the sort of joke that Brandenburg liked to make.)
An MPEG endorsement might mean a fortune in licensing fees, but Brandenburg knew it would be tough to get. The Stockholm contest was to be graded against ten audio benchmarks: an Ornette Coleman solo, the Tracy Chapman song “Fast Car,” a trumpet solo, a glockenspiel, a recording of fireworks, two separate bass solos, a ten-second castanet sample, a snippet of a newscast, and a recording of Suzanne Vega performing “Tom’s Diner.” (The last was suggested by Fraunhofer.) The judges were neutral participants, selected from a group of Swedish graduate students. And, as MPEG needed undamaged ears that could still hear high-pitched frequencies, the evaluators skewed young.
Fourteen different groups submitted entries to the MPEG trials—the high-stakes version of a middle school science fair. On the eve of the contest, the competing groups conducted informal demonstrations. Brandenburg was confident his group would win. He felt that access to Zwicker’s seminal research, still untranslated from German, gave him an insurmountable edge.
The next day a room full of fair-haired, clear-eared Scandinavian virgins spent the morning listening to “Fast Car” ripped 14 different ways. The listeners scored the results for sound quality on a five-point scale. After tabulating the answers, MPEG announced the results—it was a tie! At the top was Fraunhofer, locked in a statistical dead heat with a rival group called MUSICAM. No one else was close.
Fraunhofer’s strong showing in the contest was unexpected. They were a dark horse candidate from a research institution, a bunch of graduate students competing against established corporate players. MUSICAM was more representative of the typical MPEG contest winner—a well-funded consortium of inventors from four different European universities, with deep ties to the Dutch corporation Philips, which held the patents on the compact disc. MUSICAM also had several German researchers on staff, and Brandenburg suspected this was not a coincidence. They’d had access to Zwicker’s untranslated research, too.
MPEG had not anticipated a tie, and had not made provisions to break one. Fraunhofer’s approach provided better audio quality with less data, but MUSICAM’s required less processing power. Brandenburg felt this disparity worked in his favor, as computer processing speed improved with each new chip cycle, and doubled every 24 months or so. Improving bandwidth was more difficult, as it required digging up city streets and replacing thousands of miles of cable. Thus, Brandenburg felt, MPEG should look to conserve bandwidth rather than processing cycles, and he repeatedly made this argument to the audio committee. But he felt he was being ignored.
After Stockholm the team waited for months for a ruling from MPEG. In October 1990, Germany was reunified, and Grill kept himself busy by applying Brandenburg’s algorithm to his new favorite song: the Scorpions’ “Wind of Change.” In November, Eberhard Zwicker, hearing researcher and table tennis enthusiast, passed away at the age of 66. In January 1991, the Fraunhofer team rolled out its first commercial product, a 25-pound hardware rack for broadcast transmission. It made an early sale to the bus shelters of a reunified Berlin.
Finally, MPEG approached Fraunhofer with a compromise. The committee would make multiple endorsements. Fraunhofer would be included, but only if they agreed to play by certain rules, dictated by MUSICAM. In particular, they would have to adopt a gangrenous piece of proprietary technology called a “polyphase quadrature filter bank.” Four uglier words did not exist. Some kind of filter bank was necessary: this was the technology that split sound into its component frequencies, the way a prism splits light. But the Fraunhofer team already had its own filter bank, which worked fine. Adding another would double the complexity of the algorithm, with no increase in sound quality. Worse, Philips had a patent on the code, which meant giving an economic stake in Fraunhofer’s project to its primary competitor. After a long and heated internal debate, Brandenburg finally agreed to this compromise, as he didn’t see a way forward without MPEG’s endorsement. But to others on the project, it looked like Fraunhofer had been fleeced.
In April 1991, MPEG made its endorsements public. Of the 14 original contenders, three methods would survive. The first was termed Moving Picture Experts Group, Audio Layer I, a compression method optimized for digital cassette tape that was obsolete practically the moment the press release was distributed. Then, with a naming scheme that could only have come from a committee of engineers, MPEG announced the other two methods: MUSICAM’s method, which would henceforth be known as the Moving Picture Experts Group, Audio Layer II—better known today as the mp2—and Brandenburg’s method, which would henceforth be known as the Moving Picture Experts Group, Audio Layer III—better known today as the mp3.