It’s early days but current data suggest that the extremists on both sides should probably relax a little because the reality is likely to lie somewhere between their two positions. The only way we can really test the hypothesis that long non-coding RNAs have functions in the cell is to test each one, in the correct cell type. Although perfectly sensible as an approach, this isn’t as straightforward as it sounds. Partly this is down to sheer numbers. If we detect hundreds or thousands of different long non-coding RNAs in a cell or tissue, we have to make a decision about which one we want to test. But to do that, we already need to have developed a hypothesis about what that specific long non-coding RNA might do in the cell. Without that hypothesis, we won’t know what effects we should be looking for if we interfere with the expression or function of that molecule.
Another complication is that many of the long non-coding RNAs are found in the same region as classical protein-coding genes. Sometimes they may be in exactly the same position, but encoded on the opposite strand, just as we saw for Xist and Tsix in Chapter 7. Others may be found within the stretches of junk that lie between two amino acid-coding regions in a single gene, which we first encountered in Friedreich’s ataxia in Chapter 2 (see page 18). There are lots of ways in which the long non-coding RNAs may be co-located in the same region as protein-coding genes and this creates substantial experimental difficulties if trying to investigate function.
Usually the functions of genes are tested by mutating them. There are all sorts of mutations that can be introduced but the most commonly used will either switch the gene off or will lead to it being expressed at a higher level than normal. But because so many of the long non-coding RNAs overlap with protein-coding genes, it’s hard to mutate one without mutating the other at the same time. We then face the problem of knowing whether the effects we see are due to the change in the long non-coding RNA or in the protein-coding gene.
A frivolous analogy may help to visualise this problem. A PhD student was investigating how frogs hear. He had developed an experimental system where he surgically removed certain parts of a frog and then monitored whether it could hear a loud noise, in this case a gunshot. One day he rushed into his supervisor’s office, yelling that he had worked out how frogs hear. ‘They hear with their legs!’ he told his bemused supervisor. When she asked how he could be so sure he said, ‘It’s simple. Normally if I fire the gun, the frog hears it and jumps in fright. But when I remove the frog’s legs it doesn’t jump anymore when I fire the gun, so it must hear through its legs.’[17]
Theoretically, of course, it’s also possible that some of the unexpected effects sometimes encountered when we mutate protein-coding genes have been due to unrecognised changes in co-located long non-coding RNAs which we hadn’t even realised were present at the time the experiment was carried out.
Because of this potential collateral damage to protein-coding genes, many researchers are focusing their efforts on a subset of long non-coding RNAs which don’t overlap these regions. There’s plenty of choice, as there are at least 3,500 long non-coding RNAs in this category. There is a tendency in the literature to refer to these more distant long non-coding RNAs as a special class, and they have been given a separate name.[18],{144} But it’s worth remembering that if we do this, we are classifying these molecules by what they are not, i.e. they aren’t co-located with protein-coding genes. This could mean that we lump together large numbers of long non-coding RNAs in one class when really they may turn out to be functionally quite distinct from each other.
The rush to create categories and nomenclature has been, and continues to be, a real problem in the whole field of genome analysis because it tends to lock us in to definitions before we really have enough biological understanding to create relevant categories. Imagine if you had never seen a movie, and then you were treated to a week of films. Let’s imagine you see Top Hat; Singin’ in the Rain; The Good, the Bad and the Ugly; High Noon; The Sound of Music; The Magnificent Seven; Cabaret; True Grit; Unforgiven and West Side Story. If asked to categorise movies, you would say they come in two flavours: musicals and westerns. That’s fine, but what happens in the following week if you are shown Bridget Jones’s Diary and Gravity? Or Paint Your Wagon, Seven Brides for Seven Brothers and Calamity Jane, all of which are song-and-dance films involving cowboys? You’ll be stuck trying to shoehorn movies into genre definitions you developed before you understood the cinematic landscape. For a similar reason, we’ll try to avoid too many definitions of individual classes of long non-coding RNAs and just focus on what we really know experimentally.
Appropriate control of gene expression is required throughout life, but it’s critically important in very early development, because even the slightest shift in events during the first few cell divisions can have dramatic effects. This is particularly true in the zygote, the single cell formed from the fusion of an egg and a sperm. The zygote, and the first few cells generated by division from this progenitor, are known as totipotent. They are able to create all the cells of the embryo and placenta. Researchers would love to work with these cells, but they are very few in number. Instead, most research is carried out in embryonic stem cells, also known as ES cells. These were originally derived from embryos, many years ago, but we don’t need to access embryos any more to get them, as they can be grown in cell culture. ES cells are from a slightly later stage in development and aren’t quite as unconstrained as the zygote. They are known as pluripotent, as they have the potential to form any cell type in the body, but not placental cells.
In the correct, carefully controlled culture conditions, ES cells divide to generate yet more pluripotent stem cells. But relatively minor changes to the culture conditions lead to a loss of pluripotency. The ES cells begin to differentiate into more specialised cell types. One of the most dramatic changes is when ES cells differentiate into heart cells, which beat spontaneously and in synchrony in a Petri dish. But essentially the ES cells can move down many different development routes, depending on the ways that they are treated.
Researchers manipulated ES cells in culture by knocking down the expression of nearly 150 of the long non-coding RNAs that are located far from any known protein-coding genes. They knocked down the expression of just one long non-coding RNA in each experiment. They found that in dozens of cases, knockdown of just one long non-coding RNA was enough to change the ES cells from being pluripotent to starting to differentiate into other cells. The authors also analysed which genes were expressed before and after they knocked down the long non-coding RNAs. They found that over 90 per cent of the long non-coding RNAs controlled expression of protein-coding genes either directly or indirectly. In many cases, the expression of hundreds of protein-coding genes was affected. These were nearly always genes that were far away on the genome, not the ones that were closest to the long non-coding RNAs that they had knocked down.
The scientists also performed the reciprocal experiment. They treated ES cells with a chemical that is known to cause them to differentiate and then analysed the expression of the specific long non-coding RNA class in which they were interested. They found that expression of about 75 per cent of the long non-coding RNAs dropped as the cells moved from being pluripotent to being committed to a development pathway. The two sets of data are consistent with the idea that the levels of expression of certain long non-coding RNAs act as gatekeepers to maintain ES cells in a pluripotent state.{145} This created confidence that these non-protein-coding RNAs do have a function in the cell, at least during early development.
[17] This is a famous thought experiment. No actual frogs were harmed in the creation of this anecdote.