In its natural state, Cas9 is rather uninterested in DNA, essentially colliding randomly and bouncing off. But once Cas9, which has a hand-shaped structure, clasps a guide RNA, a subtle reconfiguration of the protein’s structure primes it to react with DNA as it goes in search of its matching target. According to Blake Wiedenheft, a professor at Montana State University, the Cas protein complexes “patrol the entire intracellular environment, find and bind this foreign [viral] DNA, and mark that foreign DNA for destruction in a matter of minutes… that’s a pretty remarkable task.”17
The task of finding and binding the target sequence is a two-step process. First, Cas9 seeks out and interacts with a short motif in the DNA called the PAM (protospacer adjacent motif) sequence, a beacon that provides the enzyme with a cue to briefly caress the DNA. “That ephemeral interaction results in a distortion of the DNA,” explains Wiedenheft. By bending the DNA, Cas9 unzips the double helix to allow the guide RNA to slip into the resulting crevice (forming a so-called R-loop).18 The guide conducts a quick sequence check against the target DNA. If a perfect match is found along all twenty or so bases, this marks the DNA sequence for destruction. Cas9 severs both strands of the DNA as cleanly as a kitchen knife, creating a double-strand break (DSB) just a few bases away from the PAM sequence.19
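The two-step search can be mimicked in a few lines of Python. This is only a toy sketch, not a model of the real biophysics: it assumes the PAM is the three-letter motif NGG (the preference of the commonly used Cas9 from Streptococcus pyogenes), that the cut falls three bases before the PAM, and that only one strand is scanned. The function name and the test sequences are invented for illustration.

```python
import re
from typing import Optional

# Assumption: PAM is 5'-NGG-3'. The lookahead lets overlapping sites be found.
PAM_PATTERN = re.compile(r"(?=([ACGT]GG))")
# Assumption: the break falls 3 bases upstream of the PAM.
CUT_OFFSET = 3


def find_cut_site(dna: str, guide: str) -> Optional[int]:
    """Return the index at which the strand would be cut, or None.

    `guide` is the ~20-letter targeting sequence (RNA or DNA alphabet).
    Only the given strand is scanned, and only perfect matches count.
    """
    dna = dna.upper()
    guide = guide.upper().replace("U", "T")
    n = len(guide)
    for match in PAM_PATTERN.finditer(dna):
        pam_start = match.start(1)
        if pam_start < n:
            continue  # no room for a full-length target before this PAM
        # Step 1 found a PAM; step 2 checks the bases just before it.
        if dna[pam_start - n:pam_start] == guide:
            return pam_start - CUT_OFFSET
    return None


dna = "TTTT" + "GATTACAGATTACAGATTAC" + "AGG" + "CCCC"
cut = find_cut_site(dna, "GAUUACAGAUUACAGAUUAC")
if cut is not None:
    left_fragment, right_fragment = dna[:cut], dna[cut:]
```

A guide that differs from the target by even one letter returns None here, which is stricter than the real enzyme: Cas9 is known to tolerate some mismatches.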
This remarkable process was captured in a stunning video shot by University of Tokyo researchers Hiroshi Nishimasu and Osamu Nureki in 2017. Using a technique called high-speed atomic force microscopy, they were able to zoom in at the precise moment that Cas9 grasps the DNA. In the film, Cas9 looks like a gold-colored rock as it pauses over a strand of DNA for several seconds before guillotining the DNA in half.20 The clip went viral after Nishimasu posted it on his Twitter account and it was shown on Japanese television.
But repurposing Cas9 to seek out a specific unique sequence in the human genome is literally a million times more complicated than cutting viral DNA. As the Cas9 complex enters the alien surroundings of a cell nucleus, it is confronted by a maze of DNA—twenty-three pairs of chromosomes, six billion letters of DNA—compared to a typical phage genome of just a few thousand bases. Once in the nucleus, each Cas9 molecule scours the densely packed coils of DNA to identify PAM sites, which occur on average once every full 360° rotation of the double helix. In principle, the enzyme has to interrogate 300–400 million bases to identify its precise target.
Johan Elf, a biophysicist at Uppsala University in Sweden, calculates that Cas9 normally takes about six hours to search through every PAM sequence in the bacterial genome, pausing at each prospective site for a mere twenty milliseconds to peer into the double helix to see if it has found the correct target.21 But the packaging of DNA in a eukaryotic cell nucleus is far more complex than in bacteria. During lectures to his students at the University of Edinburgh, Andrew Wood shows a diagram of a bacterial cell alongside a winding, looping mammalian DNA fiber. “Cas9 didn’t evolve to work in the environment in which we now put it,” he says. “It’s mind-boggling that it is possible to interrogate hundreds of millions of nucleotides in a matter of hours.”22
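The figures quoted above invite a back-of-envelope calculation. The sketch below uses only the numbers in the text (six hours, twenty milliseconds per stop, 300–400 million sites) and adds some loud assumptions of its own: that every stop takes the same time, that travel between sites is free, that the bacterial dwell time carries over to a human nucleus, and that the 300–400 million figure can be read as the number of stops. The function names are hypothetical.

```python
MS_PER_HOUR = 3_600_000


def sites_visited(search_hours: int, dwell_ms: int) -> int:
    """Stops one Cas9 could make if it spent dwell_ms at each site."""
    return search_hours * MS_PER_HOUR // dwell_ms


def serial_search_days(n_sites: int, dwell_ms: int, n_searchers: int = 1) -> float:
    """Days needed to visit n_sites, split evenly among n_searchers."""
    total_ms = n_sites * dwell_ms / n_searchers
    return total_ms / (24 * MS_PER_HOUR)


# Bacterial case: six hours at 20 ms per stop.
bacterial_stops = sites_visited(6, 20)  # 1,080,000 stops

# Human case, one molecule working alone: roughly 69 to 93 days.
low = serial_search_days(300_000_000, 20)
high = serial_search_days(400_000_000, 20)

# Hypothetical: 100 molecules sharing the search, which brings it under a day.
shared = serial_search_days(300_000_000, 20, n_searchers=100)
```

On these assumptions a lone enzyme would need months, so a search that finishes in hours implies many Cas9 molecules working in parallel, shorter stops, or both. The figure of 100 searchers is purely illustrative.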
Once Cas9 has cut the DNA, the cell’s DNA repair enzymes reseal the break. Experts marvel that it works as well as it does.23 Cas9 even surpasses the previously developed ZFN and TALEN gene-editing platforms. “They both evolved to regulate eukaryotic DNA and yet Cas9 seems to outperform them,” Wood says.
Let’s pause to note that the PAM sequence has a critical role: by searching for a short PAM sequence rather than having to unzip and check essentially the entire genome, the task of Cas9 to latch onto its target sequence is greatly simplified. The PAM also answers the riddle of how Cas9 doesn’t accidentally carve up the matching sequences stored in the bacterium’s own CRISPR array. That’s because when viral fragments are initially added to the bacterial CRISPR array, the PAM sequence is clipped off. Genome engineers refuse to be limited by the natural list of PAM sequences, so they are modifying the original Cas9 and Cas enzymes from other species to expand their PAM preferences.
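The self-versus-foreign logic can be shown with a toy check. In this sketch the same twenty-letter sequence appears twice: once in a viral genome, where it is followed by a PAM, and once in the bacterium's own array, where it is followed by repeat sequence instead. The NGG motif is an assumption (the preference of one common Cas9), and the sequences, including the repeat fragment, are made up for illustration.

```python
def is_targetable(context: str, stored: str, pam_len: int = 3) -> bool:
    """True if `stored` occurs in `context` immediately followed by NGG."""
    start = context.find(stored)
    while start != -1:
        end = start + len(stored)
        pam = context[end:end + pam_len]
        # Assumption: a valid PAM is any base followed by "GG".
        if len(pam) == pam_len and pam[1:] == "GG":
            return True
        start = context.find(stored, start + 1)
    return False


stored = "GATTACAGATTACAGATTAC"   # fragment copied from a virus
repeat = "GTTTTAGAGC"             # illustrative repeat fragment

viral_dna = "CC" + stored + "TGG" + "AA"    # PAM present
crispr_array = repeat + stored + repeat     # PAM clipped off

virus_is_cut = is_targetable(viral_dna, stored)      # True
array_is_cut = is_targetable(crispr_array, stored)   # False
```

The stored copy matches the guide perfectly, yet it is never cut, because the check for a neighboring PAM fails first.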
With such an effective security system, one might reasonably ask: why aren’t all viruses extinct? Viruses have sneakily evolved a multitude of escape mechanisms—a group of proteins that are able to disable the Cas nucleases, known as anti-CRISPR proteins. Bacteria and their viruses are like prey and predators locked in a perpetual battle that rages on after hundreds of millions of years.24 CRISPR is found in 40 percent of bacterial genomes, and almost all archaeal genomes, but surprisingly not at all in the genomes of higher organisms. Although Cas9 is by far the most popular enzyme used in CRISPR applications—and subject to a bitter patent dispute I’ll discuss later—this enzyme represents a blip in the diverse CRISPR systems seen in nature. A huge effort is underway to mine the biological diversity on earth to uncover new Cas family proteins with novel functions to expand the CRISPR toolbox.25
Once a researcher has identified the gene sequence they wish to target, they can go to any number of websites, key in the desired matching sequence, and order that custom short guide RNA sequence. If CRISPR is a molecular word processor, then the RNA acts as the “CTRL-F” function, targeting the gene sequence of interest. Cas9 acts as the “CTRL-X” keystroke. But genome editing isn’t just about pointing the cursor to highlight and remove a typo. It’s about deciding and managing what happens next—how to correct the typo.
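What a guide-design website does behind the scenes can be sketched roughly as follows. This is a minimal illustration, not any real tool's method: it assumes an NGG PAM and a twenty-letter guide, scans both strands of the target, and ignores everything a real designer would also weigh, such as off-target matches elsewhere in the genome. The function names are hypothetical.

```python
from typing import List, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the opposite strand, read in its own 5'-to-3' direction."""
    return dna.translate(COMPLEMENT)[::-1]


def candidate_guides(target: str, guide_len: int = 20) -> List[Tuple[str, int, str]]:
    """List (strand, start, guide RNA) for every site followed by NGG.

    `start` is counted along the strand being read, so positions on the
    "-" strand refer to the reverse-complemented sequence.
    """
    target = target.upper()
    found = []
    for strand, seq in (("+", target), ("-", reverse_complement(target))):
        # i is where a PAM would begin; it needs guide_len bases before it.
        for i in range(guide_len, len(seq) - 2):
            if seq[i + 1:i + 3] == "GG":
                dna_site = seq[i - guide_len:i]
                # The guide is RNA, so T is written as U.
                found.append((strand, i - guide_len, dna_site.replace("T", "U")))
    return found


guides = candidate_guides("GATTACAGATTACAGATTACAGGT")
# -> [("+", 0, "GAUUACAGAUUACAGAUUAC")]
```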
CRISPR Cutting of DNA. 1. Scanning: The Cas9 nuclease is bound to a guide RNA in a ribonucleoprotein complex. The guide consists of CRISPR RNA (crRNA) and the tracrRNA. The Cas9 complex scans the DNA in search of a PAM sequence, which is the cue to check for a sequence match. 2. Locking In: Cas9 binds to the DNA and unzips the double helix, allowing the crRNA to align to the single-stranded DNA. 3. Cutting: If there is a perfect DNA:RNA match, Cas9 undergoes a conformational change resulting in both DNA strands being cut in the same position. (Adapted from ref. 23.)
Cells possess multiple molecular pathways to repair breaks and other mutations in DNA; if they didn’t, we wouldn’t be alive. The two most common repair pathways are called non-homologous end joining (NHEJ) and homology-directed repair (HDR). NHEJ sloppily stitches the broken ends of DNA back together, but frequently results in small insertions or deletions at the repair site. This is ideal for investigators using CRISPR to deliberately disrupt the function of a gene by breaking it and introducing various random insertions and deletions. The other pathway, HDR, makes a faithful repair if a suitable template is available. In normal circumstances, the template is the corresponding gene on the sister chromosome. The beauty of CRISPR genome editing is that the investigator can supply a suitable template containing the desired sequence to patch into the Cas9-induced break, thereby resulting in the desired edit.26
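The contrast between the two pathways can be caricatured in code. The sketch below is a toy model under stated assumptions: its NHEJ always introduces an insertion or deletion of one to three letters (real NHEJ often rejoins the ends without error), and its HDR simply finds the template's two flanking "homology arms" in the broken DNA and pastes the template between them. The function names, arm length, and sequences are invented for illustration.

```python
import random
from typing import Optional


def nhej_repair(dna: str, cut: int, rng: random.Random, max_indel: int = 3) -> str:
    """Rejoin the ends sloppily, adding or losing 1..max_indel letters."""
    left, right = dna[:cut], dna[cut:]
    size = rng.randint(1, max_indel)
    if rng.random() < 0.5:
        inserted = "".join(rng.choice("ACGT") for _ in range(size))
        return left + inserted + right
    return left[:-size] + right  # deletion chews back the left end


def hdr_repair(dna: str, cut: int, template: str, arm: int = 5) -> Optional[str]:
    """Patch in `template` if its two arms match the DNA flanking the cut."""
    left_arm, right_arm = template[:arm], template[-arm:]
    i = dna.rfind(left_arm, 0, cut)   # nearest match ending before the cut
    j = dna.find(right_arm, cut)      # nearest match starting after the cut
    if i == -1 or j == -1:
        return None                   # no homology, no templated repair
    return dna[:i] + template + dna[j + arm:]


dna = "AAAAACCCCCGGGGGTTTTT"
scarred = nhej_repair(dna, cut=10, rng=random.Random(0))
edited = hdr_repair(dna, cut=10, template="CCCCC" + "A" + "GGGGG")
# edited == "AAAAACCCCCAGGGGGTTTTT": one chosen letter added at the break
```

The NHEJ result differs from run to run unless the random seed is fixed, which is the point: it disrupts the sequence but the investigator does not choose how. The HDR result is fully determined by the template supplied.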
In January 2020, about five hundred scientists flocked to Banff, a ski resort in the Canadian Rockies, for the first big CRISPR conference of the year. (It also turned out to be the last, as the COVID-19 pandemic shut down all conference travel.) The organizers invited Doudna to deliver the opening keynote address on a Sunday morning, a role to which she has grown accustomed. She began with a heartfelt apology for not being able to stay and mingle for the next several days, but she had to get back to Berkeley to give a Monday morning lecture to six hundred undergraduates.
Doudna’s lecture, delivered with the same humility and wonder at the march of science that she had at the start of her career more than three decades ago, was the biotech equivalent of a State of the Union address. Her opening summation—a tribute not only to her own work but also that of legions of researchers over the previous quarter century—was simple: