Human disorders don’t just occur because of defects in the splicing machinery. They can also arise because protein-coding genes themselves have mutations in sites that are important for the control of splicing of the RNA from that single gene. Some authors have claimed that up to 10 per cent of human inherited disorders may be caused by mutations at the splice sites, those two-base sequences shown in Figure 17.5.{348}
One example of this mechanism was a family in which two young siblings developed intractable diarrhoea within a few days of birth. Medical staff managed to stabilise the children, but the diarrhoea persisted for many months and one of the two affected children died at seventeen months of age. When the genomes of the children were sequenced, the researchers found a mutation in a splice site in a gene, changing one of the GU sequences shown in Figure 17.5. This resulted in the splicing machinery skipping over an amino acid-coding region inappropriately. Essentially, an amino acid-coding region was left out of the protein, and as a consequence the protein could no longer do its job.{349}
A second example involves Kaposi’s sarcoma, a cancer that first came to public attention when it was found at high levels in people with AIDS. AIDS is caused by the human immunodeficiency virus (HIV) and the effect of the HIV infection is to suppress the immune system. Kaposi’s sarcoma is caused by a different virus called HHV-8. Normally our immune systems control this virus but if the immune system is seriously below par, HHV-8 can become established and trigger Kaposi’s sarcoma.
HHV-8 is present in a high percentage of people in the Mediterranean basin, but Kaposi’s sarcoma is rare in this population, and almost never found in small children. So medics were very surprised when a Turkish family brought in their two-year-old daughter who had a classic lesion characteristic of this cancer on her lip. The cancer spread rapidly and aggressively and the little girl died just four months after she was first diagnosed.
The child was negative in all tests for HIV. Her parents were related to each other, a first-cousin marriage. Researchers looked for genetic reasons why the daughter might have an impaired immune response to HHV-8.
By sequencing DNA obtained from samples that had been taken from the deceased girl, scientists identified a mutation in a splice site of a specific gene. The mutation changed an AG to an AA, which meant the spliceosome could no longer recognise where it was meant to cut the RNA molecule. The result was that a junk region that should have been removed was retained in the messenger RNA molecule. This messed up the sequence, creating a stop signal much too early in the messenger RNA. This prevented the ribosome from making the full-length protein. Because the protein is one that is required for mounting a good immune response to viruses such as HHV-8, the child with the mutation was very susceptible to Kaposi’s sarcoma.{350}
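The chain of events in this case can be sketched as a toy model. Everything in the sketch is hypothetical: the sequences are invented for illustration and are not the real gene, and the `splice` function is a deliberately crude stand-in for the spliceosome that only checks for GT at the start of the junk region and AG at its end. DNA letters are used throughout, so T stands in for U.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def splice(exon1, intron, exon2):
    """Toy spliceosome: remove the junk region only if it starts with GT and ends with AG."""
    if intron.startswith("GT") and intron.endswith("AG"):
        return exon1 + exon2
    # Signal not recognised: the junk region stays in the message.
    return exon1 + intron + exon2

def codons_before_stop(message):
    """Count the three-letter blocks read before the first in-frame stop signal."""
    for i in range(0, len(message) - 2, 3):
        if message[i:i + 3] in STOP_CODONS:
            return i // 3
    return len(message) // 3

# Invented sequences, chosen only so the effect is easy to see.
EXON1 = "ATGGCC"
INTRON = "GTAAGTTGACAG"
EXON2 = "GGCCACGGCCACTAA"

normal = splice(EXON1, INTRON, EXON2)
mutant = splice(EXON1, INTRON[:-1] + "A", EXON2)  # final AG becomes AA

print(codons_before_stop(normal))  # amino acids in the normal toy protein
print(codons_before_stop(mutant))  # fewer: a stop signal inside the retained junk
```

In this toy the retained junk region happens to contain a TGA stop signal in the reading frame, so the ribosome would halt before it ever reached the second amino acid-coding region.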
Although splice site mutations are relatively common, genetic diseases are more often caused by mutations in the amino acid-coding regions of genes. Some of these cause problems because they introduce stop signals that prevent the ribosomes from making full-length proteins from messenger RNA templates. Other mutations may change the code from one amino acid to another. For example, CAC codes for the amino acid histidine whereas CAG codes for glutamine, a different amino acid. But researchers have speculated that up to 25 per cent of the mutations that change the amino acid in this way also influence the splicing of nearby regions in the messenger RNA. In some cases the disease may be due not to the single amino acid alteration per se, but to the variation that the nucleotide change creates in the way a messenger RNA is spliced.
The problem is that it is very difficult to demonstrate that this is the case in most situations. Even if we can show that the change in the RNA leads to both an altered splicing pattern and an amino acid change, how can we tell which effect causes the disease symptoms? Are these due to a protein with one altered amino acid, or because the messenger RNA has also been spliced in an unusual pattern?
Nature has actually provided us with proof that sometimes a mutation in a coding region can cause a disease by influencing splicing, rather than by changing an amino acid. There is an extraordinary disorder called Hutchinson-Gilford Progeria, named after the two scientists who first identified it. Progeria means early ageing and this particular form is incredibly dramatic. It is also extremely rare, affecting about one in 4 million children.{351}
Affected babies seem perfectly healthy at first but within a year their growth rate slows dramatically, and they remain underweight and short for the rest of their lives. The children begin developing many symptoms of old age, including thinning hair, stiffness and baldness. Although there are some ageing conditions that they don’t develop, such as Alzheimer’s disease (and the children also don’t have learning disabilities), the affected individuals do develop severe cardiovascular disease. This is usually what causes death by the early teens, as a consequence of heart attacks or major strokes.
In 2003 researchers identified the gene mutation that causes Hutchinson-Gilford Progeria. Every patient they tested had a de novo mutation, meaning one that arose spontaneously in a parent’s egg or sperm rather than being inherited from earlier generations. Incredibly, in eighteen unrelated patients (out of twenty who were assessed) the mutation was exactly the same.{352}
A sequence that should read GGC in a particular gene had mutated and now read as GGT. This mutation was in the amino acid-coding part of the gene. This might seem like a straightforward case of a mutation changing an amino acid in a protein, so of course the first thing to do is to look at the genetic code and see what these two sequences code for. GGC, the normal sequence, codes for a simple amino acid called glycine. But the mutated sequence, GGT, codes for — wait for it — glycine. Yep, same amino acid.
This is because our genetic code has a level of redundancy. Our genome is composed of four letters: A, C, G and T (or U in RNA). Blocks of three letters are used to code for an amino acid. There are 4 × 4 × 4 = 64 possible three-letter blocks. Three of these combinations are stop signals, telling the ribosomes not to add any more amino acids to a protein chain. This leaves 61 combinations to code for amino acids. But our proteins only contain 20 different amino acids. So some amino acids can be coded for by different three-letter combinations. Glycine, for example, is coded for by GGA, GGC, GGG and GGT(U), and a few amino acids have as many as six combinations. At the other extreme, the amino acid methionine is only coded for by the combination AT(U)G.
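The arithmetic of this redundancy is easy to check for yourself. The sketch below builds the standard genetic code from a compact 64-letter string, a common way of writing the table with the bases in the order T, C, A, G and `*` marking a stop signal. The variable names are just for illustration, and DNA letters are used, so T stands in for U.

```python
from collections import defaultdict
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per three-letter block, in TCAG order.
# '*' marks a stop signal.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(letters): amino_acid
    for letters, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group the three-letter blocks by the amino acid (or stop signal) they specify.
blocks_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    blocks_for[amino_acid].append(codon)

stop_signals = blocks_for.pop("*")

print(len(CODON_TABLE))                         # all possible three-letter blocks
print(len(stop_signals))                        # stop signals
print(len(blocks_for))                          # different amino acids
print(sorted(blocks_for["G"]))                  # glycine
print(blocks_for["M"])                          # methionine
print(CODON_TABLE["GGC"] == CODON_TABLE["GGT"])  # the progeria change is silent
```

The last line is the point of the exercise: swapping GGC for GGT leaves the amino acid untouched, so any harm the mutation does has to come from somewhere other than the protein code itself.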
But if the amino acid sequence encoded by the mutated gene doesn’t change in Hutchinson-Gilford Progeria, what causes the dramatic phenotype in this condition? Look again at Figure 17.5. The two-base sequence at the beginning of each intervening junk region within a gene is GT. In the patients where the normal GGC changes to GGT, the amino acid region gains an inappropriate extra splice signal. In the context of all the other splicing signals in that genomic region, this inappropriately positioned GT acts strongly. The spliceosome cuts the messenger RNA in the amino acid-coding region rather than in the junk region. The amino acid-coding regions join up badly and the end result is a loss of about 50 amino acids from near the end of the protein. This in turn means that the protein itself isn’t processed properly, and it begins to wreak havoc in the cells. We still don’t know exactly how this leads to the extraordinary ageing we see in these children, but our best guess at the moment is that the cell nucleus isn’t maintained properly. This may lead to changes in gene expression and nuclear breakdown. Some genes and some cell types may be more sensitive to this than others.
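A small sketch shows how a single letter change can create a GT where there was none before. The sequence here is invented for illustration and is not the real gene, and the function simply lists every GT it finds. In a real cell, a GT on its own is not enough: as noted above, it only acts as a splice signal in the right context of surrounding signals.

```python
def gt_positions(dna):
    """Return the starting position of every GT pair in a DNA string."""
    return [i for i in range(len(dna) - 1) if dna[i:i + 2] == "GT"]

# Hypothetical stretch of amino acid-coding sequence: ATG GCA GGC TCA GAA
NORMAL_EXON = "ATGGCAGGCTCAGAA"

# Change the C of GGC (position 8, counting from 0) to T, giving GGT.
MUTANT_EXON = NORMAL_EXON[:8] + "T" + NORMAL_EXON[9:]

new_sites = sorted(set(gt_positions(MUTANT_EXON)) - set(gt_positions(NORMAL_EXON)))

print(gt_positions(NORMAL_EXON))  # no GT anywhere in the normal toy sequence
print(new_sites)                  # the mutation has created one
```

The three-letter block still spells glycine, but the message now carries a candidate splice signal in the middle of a coding region.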