Genome size ranges (in base pairs) of various life forms

Genome size is the total amount of DNA contained within one copy of a single complete genome. It is typically measured in terms of mass in picograms (trillionths (10⁻¹²) of a gram, abbreviated pg) or, less frequently, in daltons, or as the total number of nucleotide base pairs, usually in megabases (millions of base pairs, abbreviated Mb or Mbp). One picogram is equal to 978 megabases.[1] In diploid organisms, genome size is often used interchangeably with the term C-value.

An organism's complexity is not directly proportional to its genome size; total DNA content is widely variable between biological taxa. Some single-celled organisms have much more DNA than humans, for reasons that remain unclear (see non-coding DNA and C-value enigma).

Origin of the term

Tree of life with genome sizes as outer bars

The term "genome size" is often erroneously attributed to a 1976 paper by Ralph Hinegardner,[2] even in discussions dealing specifically with terminology in this area of research (e.g., Greilhuber 2005[3]). Notably, Hinegardner[2] used the term only once: in the title. The term actually seems to have first appeared in 1968, when Hinegardner wondered, in the last paragraph of another article, whether "cellular DNA content does, in fact, reflect genome size".[4] In this context, "genome size" was being used in the sense of genotype to mean the number of genes.

In a paper submitted only two months later, Wolf et al. (1969)[5] used the term "genome size" throughout and in its present usage; therefore these authors should probably be credited with originating the term in its modern sense. By the early 1970s, "genome size" was in common usage with its present definition, probably as a result of its inclusion in Susumu Ohno's influential book Evolution by Gene Duplication, published in 1970.[6]

Variation in genome size and gene content

With the emergence of various molecular techniques in the past 50 years, the genome sizes of thousands of eukaryotes have been analyzed, and these data are available in online databases for animals, plants, and fungi (see external links). Nuclear genome size is typically measured in eukaryotes using either densitometric measurements of Feulgen-stained nuclei (previously using specialized densitometers, now more commonly using computerized image analysis[7]) or flow cytometry. In prokaryotes, pulsed field gel electrophoresis and complete genome sequencing are the predominant methods of genome size determination.

Nuclear genome sizes are well known to vary enormously among eukaryotic species. In animals they range more than 3,300-fold, and in land plants they differ by a factor of about 1,000.[8][9] Protist genomes have been reported to vary more than 300,000-fold in size, but the high end of this range (Amoeba) has been called into question.[by whom?] In eukaryotes (but not prokaryotes), genome size is not proportional to the number of genes present in the genome, an observation that was deemed wholly counter-intuitive before the discovery of non-coding DNA and that became known as the "C-value paradox". Although the discrepancy between genome size and gene number is no longer paradoxical, the term remains in common usage. For conceptual clarity, one author has suggested that the puzzles that remain regarding genome size variation are more accurately described as an enigma (the so-called "C-value enigma").

Genome size correlates with a range of measurable characteristics at the cell and organism levels, including cell size, cell division rate, and, depending on the taxon, body size, metabolic rate, developmental rate, organ complexity, geographical distribution, or extinction risk.[8][9] Based on completely sequenced genome data available as of April 2009, log-transformed gene number forms a linear correlation with log-transformed genome size in bacteria, archaea, viruses, and organelles combined, whereas a nonlinear (semi-natural logarithm) correlation is seen for eukaryotes.[10] The latter contrasts with the previous view that no correlation exists for eukaryotes; the observed nonlinearity may reflect non-coding DNA increasing disproportionately fast in increasingly large eukaryotic genomes. Sequenced genome data are biased toward small genomes, which may compromise the accuracy of the empirically derived correlation, and confirming it will require sequencing some of the largest eukaryotic genomes. Nevertheless, current data do not seem to rule out a possible correlation.
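The log-log relationship described above for non-eukaryotic genomes can be illustrated with a minimal least-squares sketch. The function name `fit_loglog` and the data points are hypothetical: the points are synthetic values generated from an arbitrary power law purely for demonstration, not measurements from the cited study or from any database.

```python
import math

def fit_loglog(genome_sizes_bp, gene_counts):
    """Ordinary least-squares fit of log10(gene count) against log10(genome size).

    Returns (slope, intercept) of the fitted line. Hypothetical helper for illustration.
    """
    xs = [math.log10(s) for s in genome_sizes_bp]
    ys = [math.log10(g) for g in gene_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic data (NOT real genomes): gene count = 2 * size**0.9, an arbitrary power law.
sizes = [1e4, 1e5, 1e6, 1e7]
genes = [2 * s ** 0.9 for s in sizes]

slope, intercept = fit_loglog(sizes, genes)
print(f"slope = {slope:.3f}, intercept = {intercept:.3f}")
```

A straight line in log-log space corresponds to a power law between gene number and genome size; with real data, departures from that line would be what distinguishes the nonlinear eukaryotic pattern.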

Genome reduction

Genome size compared to number of genes. Log-log plot of the total number of annotated proteins in genomes submitted to GenBank as a function of genome size. Based on data from NCBI genome reports.

Genome reduction, also known as genome degradation, is the process by which an organism's genome shrinks relative to that of its ancestors. Genomes fluctuate in size regularly, and genome size reduction is most significant in bacteria.

The most evolutionarily significant cases of genome reduction may be observed in the eukaryotic organelles known to be derived from bacteria: mitochondria and plastids. These organelles are descended from primordial endosymbionts, which were capable of surviving within the host cell and which the host cell likewise needed for survival. Many present-day mitochondria have fewer than 20 genes in their entire genome, whereas a modern free-living bacterium generally has at least 1,000 genes. Many genes have apparently been transferred to the host nucleus, while others have simply been lost and their function replaced by host processes.

Other bacteria have become endosymbionts or obligate intracellular pathogens and experienced extensive genome reduction as a result. This process seems to be dominated by genetic drift resulting from small population size, low recombination rates, and high mutation rates, as opposed to selection for smaller genomes.[citation needed] Some free-living marine bacterioplankton also show signs of genome reduction, which is hypothesized to be driven by natural selection.[11][12][13]

In obligate endosymbiotic species

Obligate endosymbiotic species are characterized by a complete inability to survive external to their host environment. These species have become a considerable threat to human health, as they are often capable of evading human immune systems and manipulating the host environment to acquire nutrients. A common explanation for these manipulative abilities is their consistently compact and efficient genomic structure. These small genomes are the result of massive losses of extraneous DNA, an occurrence that is exclusively associated with the loss of a free-living stage. As much as 90% of the genetic material can be lost when a species makes the evolutionary transition from a free-living to an obligate intracellular lifestyle. Common examples of species with reduced genomes include Buchnera aphidicola, Rickettsia prowazekii, and Mycobacterium leprae. One obligate endosymbiont of leafhoppers, Nasuia deltocephalinicola, has the smallest genome currently known among cellular organisms at 112 kb.[14] Despite the pathogenicity of most endosymbionts, some obligate intracellular species have positive fitness effects on their hosts.

The reductive evolution model has been proposed as an effort to define the genomic commonalities seen in all obligate endosymbionts.[15] This model illustrates four general features of reduced genomes and obligate intracellular species:

  1. "genome streamlining" resulting from relaxed selection on genes that are superfluous in the intracellular environment;
  2. a bias towards deletions (rather than insertions), which heavily affects genes that have been disrupted by accumulation of mutations (pseudogenes);[16]
  3. very little or no capability for acquiring new DNA; and
  4. considerable reduction of effective population size in endosymbiotic populations, particularly in species that rely on vertical transmission of genetic material.

Based on this model, it is clear that endosymbionts face different adaptive challenges than free-living species.

Conversion from picograms (pg) to base pairs (bp)


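$$\text{number of base pairs} = \text{mass in pg} \times 9.78 \times 10^{8}$$

or simply:

$$1\ \text{pg} = 978\ \text{Mbp}$$[1]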
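As a rough illustration, the conversion can be sketched in a few lines of Python using the factor of 978 megabases per picogram given in the introduction. The function names and the 3.5 pg input are hypothetical, chosen only to show the arithmetic.

```python
MB_PER_PG = 978          # megabases per picogram, the factor stated in this article
BP_PER_MB = 1_000_000    # base pairs per megabase

def pg_to_bp(mass_pg: float) -> float:
    """Convert a DNA mass in picograms to a number of base pairs."""
    return mass_pg * MB_PER_PG * BP_PER_MB

def bp_to_pg(n_bp: float) -> float:
    """Convert a number of base pairs to a DNA mass in picograms."""
    return n_bp / (MB_PER_PG * BP_PER_MB)

# Illustrative input only (not a measured genome size).
mass = 3.5
print(f"{mass} pg is about {pg_to_bp(mass) / BP_PER_MB:,.0f} Mb")
```

Keeping the factor in a single named constant makes it easy to substitute a different value if another source's conversion factor is preferred.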


Drake's rule

In 1991, John W. Drake proposed a general rule: that the mutation rate within a genome and its size are inversely correlated.[17] This rule has been found to be approximately correct for simple genomes such as those in DNA viruses and unicellular organisms. Its basis is unknown.
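One simple way to read the inverse relationship is to assume that the mutation rate per genome per replication stays roughly constant, so that the per-base-pair rate falls as genome size grows. The sketch below works through that arithmetic; the helper name and the per-genome value of 0.003 are assumptions made for illustration and are not figures taken from this article.

```python
def per_site_mutation_rate(genome_size_bp: float, per_genome_rate: float) -> float:
    """Per-base-pair mutation rate implied by a fixed per-genome rate.

    Hypothetical helper: assumes rate_per_bp * genome_size is constant.
    """
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return per_genome_rate / genome_size_bp

# Illustrative per-genome rate per replication (an assumption, not from this article).
U = 0.003

for size_bp in (1e4, 1e5, 1e6, 1e7):
    rate = per_site_mutation_rate(size_bp, U)
    print(f"{size_bp:>12,.0f} bp -> {rate:.2e} mutations per bp per replication")
```

Under this assumption, each tenfold increase in genome size corresponds to a tenfold decrease in the per-site rate.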

It has been proposed that the small size of RNA viruses is locked into a three-part relation between replication fidelity, genome size, and genetic complexity. The majority of RNA viruses lack an RNA proofreading facility, which limits their replication fidelity and hence their genome size. This has also been described as the "Eigen paradox".[18] An exception to the rule of small genome sizes in RNA viruses is found in the Nidoviruses. These viruses appear to have acquired a 3′-to-5′ exoribonuclease (ExoN) which has allowed for an increase in genome size.[19]

References


  1. Doležel J, Bartoš J, Voglmayr H, Greilhuber J (2003). "Nuclear DNA content and genome size of trout and human". Cytometry Part A. 51 (2): 127–128. doi:10.1002/cyto.a.10013. PMID 12541287.
  2. Hinegardner R (1976). "Evolution of genome size". In F.J. Ayala (ed.). Molecular Evolution. Sinauer Associates, Inc., Sunderland. pp. 179–199.
  3. Greilhuber J, Doležel J, Lysák M, Bennett MD (2005). "The origin, evolution and proposed stabilization of the terms 'genome size' and 'C-value' to describe nuclear DNA contents". Annals of Botany. 95 (1): 255–260. doi:10.1093/aob/mci019. PMC 4246724. PMID 15596473.
  4. Hinegardner R (1968). "Evolution of cellular DNA content in teleost fishes". American Naturalist. 102 (928): 517–523. doi:10.1086/282564.
  5. Wolf U, Ritter H, Atkin NB, Ohno S (1969). "Polyploidization in the fish family Cyprinidae, Order Cypriniformes. I. DNA-content and chromosome sets in various species of Cyprinidae". Humangenetik. 7 (3): 240–244. doi:10.1007/BF00273173. PMID 5800705.
  6. Ohno S (1970). Evolution by Gene Duplication. New York: Springer-Verlag. ISBN 0-04-575015-7.
  7. Hardie DC, Gregory TR, Hebert PD (2002). "From pixels to picograms: a beginners' guide to genome quantification by Feulgen image analysis densitometry". Journal of Histochemistry and Cytochemistry. 50 (6): 735–749. doi:10.1177/002215540205000601. PMID 12019291.
  8. Bennett MD, Leitch IJ (2005). "Genome size evolution in plants". In T.R. Gregory (ed.). The Evolution of the Genome. San Diego: Elsevier. pp. 89–162.
  9. Gregory TR (2005). "Genome size evolution in animals". In T.R. Gregory (ed.). The Evolution of the Genome. San Diego: Elsevier. pp. 3–87.
  10. Hou Y, Lin S (2009). Redfield RJ (ed.). "Distinct Gene Number-Genome Size Relationships for Eukaryotes and Non-Eukaryotes: Gene Content Estimation for Dinoflagellate Genomes". PLoS ONE. 4 (9): e6978. doi:10.1371/journal.pone.0006978. PMC 2737104. PMID 19750009.
  11. Dufresne A, Garczarek L, Partensky F (2005). "Accelerated evolution associated with genome reduction in a free-living prokaryote". Genome Biol. 6 (2): R14. doi:10.1186/gb-2005-6-2-r14. PMC 551534. PMID 15693943.
  12. Giovannoni SJ; et al. (2005). "Genome streamlining in a cosmopolitan oceanic bacterium". Science. 309 (5738): 1242–1245. doi:10.1126/science.1114057. PMID 16109880.
  13. Giovannoni SJ; et al. (2008). "The small genome of an abundant coastal ocean methylotroph". Environmental Microbiology. 10 (7): 1771–1782. doi:10.1111/j.1462-2920.2008.01598.x. PMID 18393994.
  14. And the Genomes Keep Shrinking…
  15. Wernegreen J (2005). "For better or worse: Genomic consequences of genomic mutualism and parasitism" (PDF). Current Opinion in Genetics & Development. 15 (6): 1–12. doi:10.1016/j.gde.2005.09.013. PMID 16230003. Archived from the original (PDF) on 2011-07-22.
  16. Moran NA, Plague GR (2004). "Genomic changes following host restriction in bacteria". Current Opinion in Genetics & Development. 14 (6): 627–633. doi:10.1016/j.gde.2004.09.003.
  17. Drake JW (1991). "A constant rate of spontaneous mutation in DNA-based microbes". Proc Natl Acad Sci USA. 88: 7160–7164. doi:10.1073/pnas.88.16.7160. PMC 52253. PMID 1831267.
  18. Kun A, Santos M, Szathmary E (2005). "Real ribozymes suggest a relaxed error threshold". Nat Genet. 37: 1008–1011. doi:10.1038/ng1621. PMID 16127452.
  19. Lauber C, Goeman JJ, Parquet Mdel C, Thi Nga P, Snijder EJ, Morita K, Gorbalenya AE (Jul 2013). "The footprint of genome architecture in the largest genome expansion in RNA viruses". PLoS Pathog. 9 (7): e1003500. doi:10.1371/journal.ppat.1003500.

Further reading

External links