With the recent explosion of genetic data banks, there is a great deal of interest in compressing genetic data. Recent work on applying standard data compression algorithms to genetic data indicates that reducing the data by 90 percent (for bit-perfect compression) is feasible: Hisahiko Sato et al., “DNA Data Compression in the Post Genome Era,” Genome Informatics 12 (2001): 512–14, http://www.jsbi.org/journal/GIW01/GIW01P130.pdf.
Thus we can compress the genome to about 80 million bytes without loss of information (meaning we can perfectly reconstruct the full 800-million-byte uncompressed genome).
Now consider that more than 98 percent of the genome does not code for proteins. Even after standard data compression (which eliminates redundancies and uses a dictionary lookup for common sequences), the algorithmic content of the noncoding regions appears to be rather low, meaning that it is likely that we could code an algorithm that would perform the same function with fewer bits. However, since we are still early in the process of reverse-engineering the genome, we cannot make a reliable estimate of this further decrease based on a functionally equivalent algorithm. I am using, therefore, a range of 30 to 100 million bytes of compressed information in the genome. The top part of this range assumes only data compression and no algorithmic simplification.
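As a rough illustration, the arithmetic behind these figures can be sketched in a few lines of Python; the 90 percent lossless-compression ratio is the figure cited above, and the byte counts simply follow from encoding each DNA base in 2 bits.

```python
# Sketch of the compression arithmetic above. The 90 percent lossless-compression
# figure comes from the cited study; the base count and the 2-bits-per-base
# encoding are the assumptions used in this note.
genome_bases = 3.2e9                      # approximate DNA bases in the human genome
raw_bytes = genome_bases * 2 / 8          # 2 bits per base -> ~800 million bytes
compressed_bytes = raw_bytes * 0.10       # 90 percent reduction -> ~80 million bytes

print(f"uncompressed genome: ~{raw_bytes / 1e6:.0f} million bytes")
print(f"losslessly compressed: ~{compressed_bytes / 1e6:.0f} million bytes")
```

The lower end of the 30-to-100-million-byte range cannot be computed this way; it reflects the further algorithmic simplification assumed in the text, not a measured compression ratio.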
Only a portion (although the majority) of this information characterizes the design of the brain.
2. Another line of reasoning is as follows. Though the human genome contains around 3 billion bases, only a small percentage, as mentioned above, codes for proteins. By current estimates, there are 26,000 genes that code for proteins. If we assume those genes average 3,000 bases of useful data, those equal only approximately 78 million bases. A base of DNA requires only 2 bits, which translates to about 20 million bytes (78 million bases divided by four). In the protein-coding sequence of a gene, each “word” (codon) of three DNA bases translates into one amino acid. There are, therefore, 4³ (64) possible codon codes, each consisting of three DNA bases. There are, however, only 20 amino acids used plus a stop codon (null amino acid) out of the 64. The rest of the 64 codes are used as synonyms of the 21 useful ones. Whereas 6 bits are required to code for 64 possible combinations, only about 4.4 (log₂ 21) bits are required to code for 21 possibilities, a savings of 1.6 out of 6 bits (about 27 percent), bringing us down to about 15 million bytes. In addition, some standard compression based on repeating sequences is feasible here, although much less compression is possible on this protein-coding portion of the DNA than in the so-called junk DNA, which has massive redundancies. So this will probably bring the figure below 12 million bytes. However, we now have to add information for the noncoding portion of the DNA that controls gene expression. Although this portion of the DNA constitutes the bulk of the genome, it appears to have a low level of information content and is replete with massive redundancies. Estimating that it matches the approximately 12 million bytes of protein-coding DNA, we again come to approximately 24 million bytes. From this perspective, an estimate of 30 to 100 million bytes is conservatively high.
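The chain of estimates in this note can likewise be sketched as a short calculation. All of the inputs (the gene count, the average useful gene length, the sub-12-million-byte compression step, and the assumption that the regulatory noncoding DNA carries roughly as much information as the coding DNA) are taken from the note itself rather than measured independently.

```python
# Sketch of note 2's estimate; every input below is an assumption stated in the note.
import math

coding_bases = 26_000 * 3_000                 # ~78 million protein-coding bases
coding_bytes = coding_bases * 2 / 8           # 2 bits per base -> ~20 million bytes

# 64 codons encode only 20 amino acids plus a stop codon (21 symbols),
# so about log2(21) ~ 4.4 bits suffice where 6 bits are spent.
codon_bytes = coding_bytes * math.log2(21) / 6    # roughly 14-15 million bytes

repeat_compressed = 12e6                      # note's estimate after repeat-sequence compression
regulatory = repeat_compressed                # assumed comparable to the coding portion
total = repeat_compressed + regulatory        # ~24 million bytes

print(f"coding DNA at 2 bits/base: ~{coding_bytes / 1e6:.1f} million bytes")
print(f"after codon redundancy:    ~{codon_bytes / 1e6:.1f} million bytes")
print(f"total estimate:            ~{total / 1e6:.0f} million bytes")
```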
8. Dharmendra S. Modha et al., “Cognitive Computing,” Communications of the ACM 54, no. 8 (2011): 62–71, http://cacm.acm.org/magazines/2011/8/114944-cognitive-computing/fulltext.
9. Kurzweil, The Singularity Is Near, chapter 9, section titled “The Criticism from Ontology: Can a Computer Be Conscious?” (pp. 458–69).
10. Michael Denton, “Organism and Machine: The Flawed Analogy,” in Are We Spiritual Machines? Ray Kurzweil vs. the Critics of Strong AI (Seattle: Discovery Institute, 2002).
11. Hans Moravec, Mind Children (Cambridge, MA: Harvard University Press, 1988).
Epilogue
1. “In U.S., Optimism about Future for Youth Reaches All-Time Low,” Gallup Politics, May 2, 2011, http://www.gallup.com/poll/147350/optimism-future-youth-reaches-time-low.aspx.
2. James C. Riley, Rising Life Expectancy: A Global History (Cambridge: Cambridge University Press, 2001).
3. J. Bradford DeLong, “Estimating World GDP, One Million B.C.—Present,” May 24, 1998, http://econ161.berkeley.edu/TCEH/1998_Draft/World_GDP/Estimating_World_GDP.xhtml, and http://futurist.typepad.com/my_weblog/2007/07/economic-growth.xhtml. See also Peter H. Diamandis and Steven Kotler, Abundance: The Future Is Better Than You Think (New York: Free Press, 2012).
4. Martine Rothblatt, Transgender to Transhuman (privately printed, 2011). She explains how a similarly rapid trajectory of acceptance is most likely to occur for “transhumans,” for example, nonbiological but convincingly conscious minds as discussed in chapter 9.
5. The following excerpt from The Singularity Is Near, chapter 3 (pp. 133–35), by Ray Kurzweil (New York: Viking, 2005), discusses the limits of computation based on the laws of physics:
The ultimate limits of computers are profoundly high. Building on work by University of California at Berkeley Professor Hans Bremermann and nanotechnology theorist Robert Freitas, MIT Professor Seth Lloyd has estimated the maximum computational capacity, according to the known laws of physics, of a computer weighing one kilogram and occupying one liter of volume—about the size and weight of a small laptop computer—what he calls the “ultimate laptop.”
[Note: Seth Lloyd, “Ultimate Physical Limits to Computation,” Nature 406 (2000): 1047–54.
[Early work on the limits of computation was done by Hans J. Bremermann in 1962: Hans J. Bremermann, “Optimization Through Evolution and Recombination,” in M. C. Yovits, C. T. Jacobi, and C. D. Goldstein, eds., Self-Organizing Systems (Washington, D.C.: Spartan Books, 1962), pp. 93–106.
[In 1984 Robert A. Freitas Jr. built on Bremermann’s work in Robert A. Freitas Jr., “Xenopsychology,” Analog 104 (April 1984): 41–53, http://www.rfreitas.com/Astro/Xenopsychology.htm#SentienceQuotient.]
The potential amount of computation rises with the available energy. We can understand the link between energy and computational capacity as follows. The energy in a quantity of matter is the energy associated with each atom (and subatomic particle). So the more atoms, the more energy. As discussed above, each atom can potentially be used for computation. So the more atoms, the more computation. The energy of each atom or particle grows with the frequency of its movement: the more movement, the more energy. The same relationship exists for potential computation: the higher the frequency of movement, the more computation each component (which can be an atom) can perform. (We see this in contemporary chips: the higher the frequency of the chip, the greater its computational speed.)
So there is a direct proportional relationship between the energy of an object and its potential to perform computation. The potential energy in a kilogram of matter is very large, as we know from Einstein’s equation E = mc². The speed of light squared is a very large number: approximately 10¹⁷ meter²/second². The potential of matter to compute is also governed by a very small number, Planck’s constant: 6.6 × 10⁻³⁴ joule-seconds (a joule is a measure of energy). This is the smallest scale at which we can apply energy for computation. We obtain the theoretical limit of an object to perform computation by dividing the total energy (the average energy of each atom or particle times the number of such particles) by Planck’s constant.
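To make the order of magnitude concrete, here is a rough sketch of that division for a one-kilogram “ultimate laptop.” The naive ratio E/h shown below illustrates the scaling; Lloyd’s published bound includes an additional constant factor, but the result remains on the order of 10⁵⁰ operations per second.

```python
# Order-of-magnitude sketch of the limit described above: the total energy of
# 1 kg of matter (E = mc^2) divided by Planck's constant. Lloyd's published bound
# differs by a constant factor, but the ~10^50 operations-per-second scale holds.
c = 3.0e8        # speed of light, meters per second
h = 6.6e-34      # Planck's constant, joule-seconds
mass = 1.0       # kilograms

energy = mass * c ** 2          # ~9 x 10^16 joules
ops_per_sec = energy / h        # ~1.4 x 10^50

print(f"energy in 1 kg of matter: {energy:.1e} joules")
print(f"naive computational limit: {ops_per_sec:.1e} operations per second")
```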