genomes

Illustration of DNA molecule

Illustration 7074323 / Dna © Paul Fleet | Dreamstime.com

The size of a species's genome is described by the number of base pairs in its DNA. (A base pair is a pair of complementary nucleotides, one on each strand of the DNA.) Because the range of sizes is huge, the SI prefixes kilo-, mega-, and giga-, together with their symbols, have been adopted. So a kilobase (kb) is 1000 nucleotide pairs in double-stranded DNA, or 1000 nucleotides in single-stranded DNA. Similarly, a megabase (Mb) represents a million nucleotides or nucleotide pairs, and a gigabase (Gb) 10⁹. There is no need for terabase, because no genome that big has been discovered.
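The prefix scheme described above can be illustrated with a short Python sketch. The function name `format_genome_size` and the two-decimal rounding are assumptions made for this example, not part of any standard; the sample values are genome sizes quoted elsewhere on this page.

```python
def format_genome_size(base_pairs: int) -> str:
    """Express a base-pair count using the largest fitting prefix (kb, Mb, Gb)."""
    # Prefix factors as defined in the text: kilo = 10^3, mega = 10^6, giga = 10^9.
    units = [(10**9, "Gb"), (10**6, "Mb"), (10**3, "kb")]
    for factor, symbol in units:
        if base_pairs >= factor:
            return f"{base_pairs / factor:.2f} {symbol}"
    # Anything under 1000 is reported in plain base pairs (bp).
    return f"{base_pairs} bp"


print(format_genome_size(491_000))        # Nanoarchaeum equitans
print(format_genome_size(3_120_000_000))  # human
```

No terabase branch is included, in keeping with the remark that no genome that large is known.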

Genome size in megabases can be converted to picograms by using the factor 1 picogram = 978 Mb.³

3. J. Doležel, J. Bartoš, H. Voglmayr and J. Greilhuber.
Nuclear DNA content and genome size of trout and human.
Cytometry Part A 51A:127–128 (2003)
doi: 10.1002/cyto.a.10013

As the authors note, the same factor was published by T. Cavalier-Smith in
The Evolution of Genome Size.
New York: John Wiley and Sons, 1985.
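As a rough illustration of the picogram conversion, the following Python sketch applies the factor 1 picogram = 978 Mb in both directions. The function and constant names are made up for this example; only the factor itself comes from the text.

```python
MB_PER_PICOGRAM = 978  # conversion factor quoted in the text


def megabases_to_picograms(megabases: float) -> float:
    """Convert a genome size in megabases to a DNA mass in picograms."""
    return megabases / MB_PER_PICOGRAM


def picograms_to_megabases(picograms: float) -> float:
    """Convert a DNA mass in picograms to a genome size in megabases."""
    return picograms * MB_PER_PICOGRAM


# Example: the human genome size of 3120 Mb given in the table on this page.
print(round(megabases_to_picograms(3120), 2))  # 3.19
```

The conversion is useful when comparing sequencing results with databases that report genome size as a mass.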

Smallest genome

"Candidatus Tremblaya princeps"

In 2002, the organism with the smallest known genome (491 kilobase pairs) was Nanoarchaeum equitans,¹ a microbe discovered living on the surface of another microbe, Ignicoccus. Both are members of the domain Archaea. They were found in gravel taken from the ocean floor north of Iceland, in an area heated by volcanic activity. N. equitans is extremely small, roughly a sphere 400 nanometers in diameter. As of this writing, the microbe's genome had not been mapped completely, but it was estimated at a few hundred genes, well below the size of the previous record holder.

The previous record holder (from 1995) was the bacterium Mycoplasma genitalium, with about 580 kb.² From this and other studies two researchers³ estimated the smallest possible genome at about 256 genes.

1. Harald Huber, Michael J. Hohn, Reinhard Rachel, Tanja Fuchs, Verna C. Wimmer and Karl O. Stetter.
A new phylum of Archaea represented by a nanosized hyperthermophilic symbiont.
Nature, volume 417, pages 27-28, 2 May 2002.
A letter to the editor.

2. Claire M. Fraser, Jeannine D. Gocayne, Owen White, Mark D. Adams, Rebecca A. Clayton, Robert D. Fleischmann, Carol J. Bult, Anthony R. Kerlavage, Granger Sutton, Jenny M. Kelley, Janice L. Fritchman, Janice F. Weidman, Keith V. Small, Mina Sandusky, Joyce Fuhrmann, David Nguyen, Teresa R. Utterback, Deborah M. Saudek, Cheryl A. Phillips, Joseph M. Merrick, Jean-Francois Tomb, Brian A. Dougherty, Kenneth F. Bott, Ping-Chuan Hu, and Thomas S. Lucier.
The minimal gene complement of Mycoplasma genitalium.
Science, volume 270, pages 397-404 (20 October 1995).

3. Arcady R. Mushegian and Eugene V. Koonin.
A minimal gene set for cellular life derived by comparison of bacterial genomes.
Proceedings of the National Academy of Sciences, volume 93, no. 19, pages 10268–10273 (September 17 1996).
For the genomes, see www.ncbi.nlm.nih.gov/Complete_Genomes/.

Jack Maniloff.
The minimal cell genome: 'On being the right size.'
Proceedings of the National Academy of Sciences, volume 93, no. 19, pages 10004–10006 (September 17 1996).

Smallest vertebrate genome

Fugu rubripes, a Japanese puffer fish. Although its genome has about 30,000 genes, it is small because it includes very little "junk". See:

www.lbl.gov/Science-Articles/Archive/fugu-decoded.html

www.lbl.gov/Science-Articles/Archive/fugu-facts.html

Largest vertebrate genome

Amphiuma means; the Australian lungfish (Neoceratodus forsteri), 43 billion base pairs.

Previous record holder: the Mexican axolotl.

Axel Meyer, Siegfried Schloissnig, Paolo Franchini, et al.
Giant lungfish genome elucidates the conquest of land by vertebrates.
Nature, 590, pages 284-289 (18 January 2021).
doi.org/10.1038/s41586-021-03198-8

How many genes do organisms have?

Organism; number of base pairs (in millions); number of genes; when first sequenced; by whom, notes
Haemophilus influenzae
(bacteria)
1.8 1,740 1995

Fleischmann R. D., Adams M. D., White O., Clayton R. A., Kirkness E. F., Kerlavage A. R., Bult C. J., Tomb J. F., Dougherty B. A., Merrick J. M., et al.
Whole-genome random sequencing and assembly of Haemophilus influenzae Rd.
Science, vol. 269, pages 496-512.

Saccharomyces cerevisiae
(yeast)
12.1 6,034 1996

www.yeastgenome.org/

Caenorhabditis elegans
(roundworm)
97 19,099 1998

First animal genome to be sequenced.
Genome sequence of the nematode C. elegans: A platform for investigating biology.
Science, vol. 282, page 2012. (11 December 1998)

Arabidopsis thaliana
(thale cress)
125 25,500 2000 First plant genome to be sequenced.

The Arabidopsis Genome Initiative.
Analysis of the genome sequence of the flowering plant Arabidopsis thaliana.
Nature, vol. 408, page 796. (14 December 2000)

Other articles in the same issue treat the sequences of chromosomes 1, 3 and 5.

Drosophila melanogaster
(fruit fly)
185 13,061 2000 Adams, M. D. et al.
The genome sequence of Drosophila melanogaster.
Science, vol. 287, page 2185. (24 March 2000)
Mus musculus
(laboratory mouse)
3000 50,000    
Homo sapiens
(human)
3120 30,000 2001 Human Genome Project and Celera

Venter, J. C. et al.
The sequence of the human genome.
Science, vol. 291, page 1304. (16 February 2001)

International Human Genome Sequencing Consortium.
Initial sequencing and analysis of the human genome.
Nature, vol. 409, page 860. (15 February 2001)


https://genome.ucsc.edu
https://www.celera.com

    2013 Genome Reference Consortium
3.055 billion base pairs (bp)   2021

Nurk, S., et al. (the Telomere-to-Telomere Consortium)
The complete sequence of a human genome.
bioRxiv preprint server, 27 May 2021.
doi.org/10.1101/2021.05.26.445798

Agrobacterium tumefaciens
(crown gall bacterium)
5.67 ? 2001 www.agrobacterium.org
Oryza sativa
(rice)
389 37,544 2005

Second plant genome to be sequenced.
International Rice Genome Sequencing Project.
The map-based sequence of the rice genome.
Nature, vol. 436, page 793 (11 August 2005).
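One way to read the table is to divide the number of genes by the genome size, which gives a rough gene density. The Python sketch below does this for the rows that list both figures (Agrobacterium tumefaciens is omitted because its gene count is given as "?"). The helper name `genes_per_megabase` is an assumption for illustration, and the figures are simply those in the table, not current estimates.

```python
# (organism, base pairs in millions, number of genes), copied from the table.
GENOMES = [
    ("Haemophilus influenzae", 1.8, 1_740),
    ("Saccharomyces cerevisiae", 12.1, 6_034),
    ("Caenorhabditis elegans", 97, 19_099),
    ("Arabidopsis thaliana", 125, 25_500),
    ("Drosophila melanogaster", 185, 13_061),
    ("Mus musculus", 3000, 50_000),
    ("Homo sapiens", 3120, 30_000),
    ("Oryza sativa", 389, 37_544),
]


def genes_per_megabase(genes: int, megabases: float) -> float:
    """Return the average number of genes per million base pairs."""
    return genes / megabases


# Rank organisms from the most to the least gene-dense genome.
ranked = sorted(
    ((name, genes_per_megabase(genes, mb)) for name, mb, genes in GENOMES),
    key=lambda item: item[1],
    reverse=True,
)

for name, density in ranked:
    print(f"{name}: {density:.1f} genes per Mb")
```

On these figures the bacterium has by far the densest genome and the two mammals the sparsest, which shows how little genome size alone says about gene count.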

for further reading

The National Center for Biotechnology Information maintains a fascinating, though advanced, repository of information at www.ncbi.nlm.nih.gov/Complete_Genomes/.

www.genomesize.com. An Animal Genome Size Database, maintained by T. Ryan Gregory at the University of Guelph. It reports genome size in picograms.

https://www.alliancegenome.org/