
It is becoming clearer that this transcription from genes that don’t code for protein is actually critically important for cellular function. Oddly, however, we remain caught in a linguistic trap of our own making. The RNA that is produced from these regions, the RNA that was previously under our radar, is still called non-coding RNA (ncRNA). It’s a sloppy shorthand, because what we really mean is non-protein-coding RNA. The ncRNA does, in fact, code for something – it codes for itself, a functional RNA molecule. Unlike mature mRNA, which is an RNA means to a protein end, ncRNAs are themselves the end-points.

Re-defining rubbish

This is the paradigm shift. For at least 40 years molecular biologists and geneticists have focused almost exclusively on the genes that code for proteins, and the proteins themselves. There have been exceptions, but we’ve just treated these as the odd bits of rubble on the top of the shed. But non-coding RNAs are finally starting to stand firmly alongside proteins as fully functional molecules. Different but equal.

These ncRNAs are found all over the genome. Some come from introns. Originally it was assumed that the bits of RNA spliced out of introns were simply degraded by cells. It now seems much more likely that at least some, if not most or all, are actually processed to act as functional ncRNAs in their own right. Others overlap genes, and are frequently transcribed from the opposite strand to the protein-coding mRNA. Yet others are found in regions where there are no protein-coding genes at all.

We met two ncRNAs in the last chapter. These were Xist and Tsix, the ncRNAs that are required for X inactivation. These are both very long ncRNAs, many thousands of bases in length. When Xist was first identified, it was only the second known ncRNA. Current estimates suggest there are thousands of such molecules in the cells of higher mammals, with over 30,000 ‘long’ ncRNAs (defined as having a length greater than 200 bases) reported in mice[132]. Long ncRNAs may actually out-number protein-coding mRNAs.

In addition to X inactivation, long ncRNAs also appear to play a critical role in imprinting. Many imprinted regions contain a section that encodes a long ncRNA, which silences the expression of surrounding genes. This is similar to the effect of Xist. The protein-coding mRNAs are silenced on the copy of the chromosome which expresses the long ncRNA. For example, there is an ncRNA called Air, expressed in the placenta, exclusively from the paternally inherited mouse chromosome 17. Expression of Air ncRNA represses the nearby Igf2r gene, but only on the same chromosome[133]. This mechanism ensures that Igf2r is only expressed from the maternally inherited chromosome.

The Air ncRNA gave scientists important insights into how these long ncRNAs repress gene expression. The ncRNA remained localised to a specific region in the cluster of imprinted genes, and acted as a magnet for an epigenetic enzyme called G9a. G9a puts a repressive mark on the histone H3 proteins in the nucleosomes deposited on this region of DNA. This histone modification creates a repressive chromatin environment, which switches off the genes.

This finding was particularly important as it provided some of the first insights into a question that had been puzzling epigeneticists. How do histone modifying enzymes, which put on or remove epigenetic marks, get localised to specific regions of the genome? Histone modifying enzymes can’t recognise specific DNA sequences directly, so how do they end up in the right part of the genome?

The patterns of histone modifications are localised to different genes in different cell types, leading to exquisitely well-regulated gene expression. For example, the enzyme known as EZH2 methylates the amino acid called lysine at position 27 on histone H3, but it targets different histone H3 molecules in different cell types. To put it simply, it may methylate histone H3 proteins positioned on gene A in white blood cells but not in neurons. Alternatively, it may methylate histone H3 proteins positioned on gene B in neurons, but not in white blood cells. It’s the same enzyme in both cells, but it’s being targeted differently.

There is increasing evidence that at least some of the targeting of epigenetic modifications can be explained by interactions with long ncRNAs. Jeannie Lee and her colleagues have recently investigated long ncRNAs that bind to a complex of proteins. The complex is called PRC2 and it generates repressive modifications on histones. PRC2 contains a number of proteins, and the one that interacts with the long ncRNAs is probably EZH2. The researchers found that the PRC2 complex bound to literally thousands of different long ncRNA molecules in embryonic stem cells from mice[134]. These long ncRNAs may act as bait. They can stay tethered to the specific region of the genome where they are produced, and then attract repressive enzymes to shut off gene expression. This happens because the repressive enzyme complexes contain proteins like EZH2 that are capable of binding to RNA.

Scientists love to build theories, and in some ways a nice one was shaping up around long ncRNAs. It seemed that they bind to the region from which they are transcribed, and repress gene expression on that same chromosome. But if we go back to our analogy from the start of this chapter, we’d have to say that it’s now becoming clear we have built a pretty small shed and already cemented quite a bit of rubble to the roof.

There’s an amazing family of genes, called HOX genes. When they’re mutated in fruit flies (Drosophila melanogaster) the results are incredible phenotypes, such as legs growing out of the head[135]. There’s a long ncRNA known as HOTAIR, which regulates a region of genes called the HOX-D cluster. Just like the long ncRNAs investigated by Jeannie Lee, HOTAIR binds the PRC2 complex and creates a chromatin region which is marked with repressive histone modifications. But HOTAIR is not transcribed from the HOX-D position on chromosome 2. Instead it is encoded at a different cluster of genes called HOX-C on chromosome 12[136]. No-one knows how or why HOTAIR binds at the HOX-D position.

There’s a related mystery around the best studied of all long ncRNAs, Xist. Xist ncRNA spreads out along almost the entire inactive X chromosome but we really don’t know how. Chromosomes don’t normally become smothered with RNA molecules. There’s no obvious reason why Xist RNA should be able to bind like this, but we know it’s nothing to do with the sequence of the chromosome. The experiments described in the last chapter, where Xist could inactivate an entire autosome as long as it contained an X inactivation centre, showed that Xist just keeps on travelling once it’s on a chromosome. Scientists are basically still completely baffled about these fundamental characteristics of this best-studied of all ncRNAs.

Here’s another surprising thing. Until very recently, all long ncRNAs were thought to repress gene expression. In 2010, Professor Ramin Shiekhattar at the Wistar Institute in Philadelphia identified over 3,000 long ncRNAs in a number of human cell types. These long ncRNAs showed different expression patterns in different human cell types, suggesting they had specific roles. Professor Shiekhattar and his colleagues tested a small number of the long ncRNAs to try to determine their functions. They used well-established experimental methods to knock down expression of their test ncRNAs and then analysed expression of their neighbouring genes. The predicted outcome, and the actual results, are shown in Figure 10.2.

132. Carninci et al. (2005), Science 309: 1559–1563.

133. Nagano et al. (2008), Science 322: 1717–1720.

134. Zhao et al. (2010), Molecular Cell 40: 939–953.

135. Garber et al. (1983), EMBO J. 2: 2027–2036.

136. Rinn et al. (2007), Cell 129: 1311–1323.