A Suggested Genome for “Mitochondrial Eve”
Abstract
The “Out of
Address for correspondence: ilbg18230@btinternet.com
Received:
Introduction
It is now 20 years since Rebecca Cann, Mark Stoneking and Allan
Wilson (Cann, 1987) presented their famous article
“Mitochondrial
The original work was based on the partial
sequencing of the mitochondrial
As part of the “Out of Africa” theory it is
necessary to consider that there was a single matrilineal ancestor for the
whole of mankind who lived in
Mitochondrial
Mitochondria
are found in all the nucleated cells of the body and are concerned with the
production and transfer of energy within cells and the production of RNA that
is involved in the process of making proteins.
Inside every mitochondrion there are circular rings of deoxyribonucleic
acid (the mtDNA) and each ring is made up of about 16,569 nucleotide bases. These bases are four in number: Adenine,
Cytosine, Guanine and Thymine, and for simplicity they are normally represented
by their initial letters—A, C, G, T. The actual sequence of mtDNA in the human was first determined in
The human mitochondrial genome coding region
contains genes for 13 proteins (enzyme subunits), two ribosomal RNAs (rRNAs)
and 22 different transfer RNAs (tRNAs),
and some small regions concerned with the replication of the mitochondria. There is also a large non-coding region known
as the “Control Region” or the “Hypervariable Regions.” For this discussion the control region will
simply be termed “HVR1,” for the part at locations 16024-16569, and “HVR2,” for
the part at 1-576.
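The region boundaries used in this discussion can be written down as a small lookup. The following is a minimal sketch in Python; the function name and the label "outside the control region" are illustrative choices, and the genome length of 16,569 bases is taken from the figure quoted above.

```python
MTDNA_LENGTH = 16569  # approximate length quoted in the text

def region_of(position: int) -> str:
    """Classify a 1-based mtDNA location using this paper's HVR1/HVR2 ranges."""
    if not 1 <= position <= MTDNA_LENGTH:
        raise ValueError(f"location {position} is outside 1-{MTDNA_LENGTH}")
    if 16024 <= position <= 16569:
        return "HVR1"
    if position <= 576:
        return "HVR2"
    # Everything else: genes plus the small replication-related regions.
    return "outside the control region"

for location in (73, 263, 750, 16223):
    print(location, region_of(location))
```

With these ranges, locations 73 and 263 fall in HVR2, 16223 falls in HVR1, and 750 lies outside the control region.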
Mutations
A nucleotide base in the mtDNA may on rare
occasions undergo a mutation; that is, the base at a particular location
can change. For example, a location that
is initially occupied by an adenine may come to be occupied by a guanine; such a
change at location 263 will here be represented by “A263G.” It is also possible for extra bases
to be inserted or for bases to be deleted. An inserted base is written with the preceding
location and a decimal index, as in “315.1C.” Insertions and deletions are much less
common than substitutions, but arguably more interesting.
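The labels used in this paper, both substitutions such as “A263G” and the insertion labels such as “309.1C” that appear later, can be read mechanically. The sketch below is one possible parser in Python; the `Mutation` type and the function name are hypothetical, and deletions are not handled because no deletion label appears in the text.

```python
import re
from typing import NamedTuple, Optional

class Mutation(NamedTuple):
    kind: str                 # "substitution" or "insertion"
    position: int             # 1-based location
    old_base: Optional[str]   # base before the change (None for an insertion)
    new_base: str             # base after the change, or the inserted base
    ins_index: int = 0        # 1, 2, ... for the n-th base inserted after `position`

_SUBSTITUTION = re.compile(r"([ACGT])(\d+)([ACGT])")
_INSERTION = re.compile(r"(\d+)\.(\d+)([ACGT])")

def parse_mutation(label: str) -> Mutation:
    """Parse a label such as 'A263G' or '309.1C'."""
    m = _SUBSTITUTION.fullmatch(label)
    if m:
        return Mutation("substitution", int(m.group(2)), m.group(1), m.group(3))
    m = _INSERTION.fullmatch(label)
    if m:
        return Mutation("insertion", int(m.group(1)), None, m.group(3), int(m.group(2)))
    raise ValueError(f"unrecognised mutation label: {label!r}")

print(parse_mutation("A263G"))
print(parse_mutation("309.1C"))
```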
Although the non-coding
regions make up less than a tenth of the mtDNA, over the past 200,000 years these regions have shown a disproportionate
number of mutations. This is thought to
have occurred because there is no selective pressure on mutations in the
non-coding regions, by which is meant that mutations in the non-coding regions
are taken to be functionless and harmless to the cell and the person whose
mitochondria show such a mutation.
However, in the coding regions, mutations will be selected against and
will not persist if their effects are harmful to the cell and the person.
Mutations that are found in the coding regions and have persisted usually do
not affect the amino-acid sequence of the gene product and, in the tRNAs, appear not to compromise
the incorporation of the corresponding amino acid during protein
synthesis. The effect of a mutation in
the coding regions for the ribosomal RNAs is still largely unknown.
The Phylogenetic Tree
When the mutations found in human mtDNA
genomes are studied it is possible to draw a tree based on the common
occurrence of the mutations. This tree is known as a
phylogenetic tree. The first
phylogenetic tree was presented in “Mitochondrial
Since then the different branches of the
phylogenetic tree have been considered as denoting “haplogroups” and given
labels, from A-Z. For example, the
Over the past 20 years, the phylogenetic tree has
been greatly expanded and is now very complicated. The gradual change in its development can be
followed by looking through the published papers from Maca-Meyer
(2001), Herrnstadt (2002), Mishmar
(2003), Kivisild (2006), Torroni
(2006) and Ruiz-Pesini (2007).
However, despite trees of ever increasing size being produced, there does not
appear to be any single tree that includes all the mutations that have occurred
since Mitochondrial Eve along the line leading to the
For the
purposes of this paper a simplified phylogenetic tree is shown in Figure 1.

Figure 1. A Simplified Phylogenetic Tree. The largest haplogroups
are L1, L2, J, T, U, K, H, D, G, and C.
Present-day mtDNA genomes show, on average, about 50
mutations that have occurred in the 200,000 years since Mitochondrial
Eve. This paper discusses the mutations
that have occurred in the
Methods
The main source for mtDNA genomes is the National
Center for Biotechnology Information (NCBI), where over 3,700 complete human
mitochondrial genomes are now available for study at the “Entrez
Nucleotide” website: http://www.ncbi.nih.gov/entrez/query.fcgi
Each mtDNA genome can be viewed by entering the
appropriate accession number; for example “EU157923” gives the latest genome to
be made available (as of September 2007).
Information on the structure of parts of the
phylogenetic tree has been taken from various papers, in particular Herrnstadt (2002), Mishmar
(2003), Kivisild (2006), Torroni
(2006) and Ruiz-Pesini (2007).
However, in most of these papers the example trees have been drawn using only the
mutations found in the coding region.
For the purpose of this paper the author has used his own computer
programs to determine all the mutations present in the 3,700 genomes that are
presently available and has thereby been able to build a phylogenetic tree
based on mutations from both the coding and non-coding regions of the
mitochondrial
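The author's programs are not reproduced in this paper. As a rough illustration of the kind of comparison involved, the following Python sketch lists the substitutions between a reference sequence and a sample. It assumes the two sequences have already been aligned to the same length, so insertions and deletions are not detected; the function name and the toy sequences are hypothetical.

```python
def list_substitutions(reference: str, sample: str) -> list[str]:
    """Return labels such as 'A5G' for each location where two aligned sequences differ."""
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    labels = []
    # Locations are numbered from 1, as in the mutation labels used in this paper.
    for location, (ref_base, new_base) in enumerate(
        zip(reference.upper(), sample.upper()), start=1
    ):
        # Skip ambiguous or gap characters; only clear base changes are reported.
        if ref_base != new_base and ref_base in "ACGT" and new_base in "ACGT":
            labels.append(f"{ref_base}{location}{new_base}")
    return labels

# Toy sequences, not real mtDNA.
print(list_substitutions("ACGTACGT", "ACGTGCGT"))  # ['A5G']
```

A real comparison of complete genomes would also need an alignment step to place insertions and deletions, particularly in the variable Poly-C areas.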
Results and Discussion
A suggested mutation list back to Mitochondrial Eve
The Cambridge Reference Sequence (
For convenience, the mutations are considered here
in two parts: firstly the mutations that have occurred in approximately the
last 60,000 years, and secondly, the mutations which
occurred in the previous 140,000 years.
Mutations occurring in the last 60,000 years
Twenty mutations appear to have occurred in
approximately the last 60,000 years. These mutations are now well accepted and
are detailed on many of the published phylogenetic trees. They can be considered as being
those mutations that have occurred since Homo sapiens
first left
In the following discussion, we will start at
The most recent eight mutations on the line leading
to
A750G 315.1C A4769G A1438G
A15326G A8860G 309.1C A263G
These mutations all occur within Haplogroup H. The
two insertions 309.1C and 315.1C mean that the
The area of the genome from 303-315 is largely made
up of C’s, and is termed a Poly-C area. This area is very variable and can even
be different between relatives.
There is a further variable area, at 514-523, which
in the
Separating Haplogroups H and V from the “R” node
are 5 mutations (see Figure 1):
C7028T A2706G C14766T
G11719A A73G
The two mutations:
C16223T C12705T
come between the “N” node and the “R” node (see Figure
1), effectively separating the major European haplogroups from the
remainder of the tree.
It is interesting to note that despite being in the
HVR1 area, the mutation at 16223 appears very stable and is therefore extremely
useful in determining if a genome belongs above the “N” node, or elsewhere in
the tree.
The last five mutations that are encountered on
this “walk” back to 60,000 years before the present are:
G15301A T10873C A10398G
T9540C A8701G
and they come between the major forking to the Asian
haplogroups and the “N” node.
The presence of so many mutations in just this
small area of the phylogenetic tree indicates a significant bottleneck
in the spread of mankind and implies that the population at this period of time
outside
The 20 mutations that have occurred in the line
leading to
Table 1.
The Mutation List Covering the Last 60,000 Years
Mutation   Function              Position in the “Phylogenetic Tree”
A750G      12S-rRNA*             mutation used to define Haplogroup H2a
315.1C     HVR2                  mutation within Haplogroup H2
A4769G     ND2 (Met > Met)       mutation within Haplogroup H2
A1438G     12S-rRNA*             mutation defines Haplogroup H2
A15326G    CytB (Thr >
A8860G     ATP6 (Thr >
309.1C     HVR2                  mutation within Haplogroup H
A263G      HVR2                  mutation within Haplogroup H
C7028T
A2706G     16S-rRNA              mutation used to define Haplogroup H
C14766T    CytB (Ile > Thr) *    mutation used to define Haplogroups H and V
G11719A    ND4 (Gly > Gly)       mutation used to define Haplogroup pre-HV
A73G       HVR2                  mutation used to define Haplogroup pre-HV
C16223T    HVR1                  between the N and R nodes
C12705T    ND5 (Ile > Ile)       between the N and R nodes
G15301A    CytB (Leu > Leu)      between “L” haplogroups and N node
T10873C    ND4 (Pro > Pro)       between “L” haplogroups and N node
A10398G    ND4 (Thr >
T9540C
A8701G     ATP6 (Thr >
Note: All mutations in the HVR1 and HVR2 areas are considered functionless,
but the effects of the mutations marked * are unknown.
Mutations occurring in the previous 140,000 years
It is fairly easy to give a firm list of mutations
that have occurred over the last 60,000 years because there are many genomes
available for the European and Asian haplogroups. However, for the period of 140,000 years
closer to Mitochondrial Eve, it is not possible to be so
confident, as there are far fewer published genomes. Indeed, the list suggested here may need to be
revised as new genomes are published.
Thirty-two mutations appear to have occurred in the
approximate period of 140,000 years closer to Mitochondrial Eve.
The most recent mutations we encounter are:
G1018A G769A
between the branches leading to Haplogroups L4 and L7 and below the series of L3 haplogroups.
The next five mutations come between the branch
leading to Haplogroup L6 and the branches leading to Haplogroups L4 and L7:
C16278T C13650T C7256T C3594T T152C
Between the branches leading to Haplogroups L2 and L6 are two
mutations:
G7521A A4104G
Between the branches to Haplogroup L5 and
Haplogroup L2 there are 12 mutations:
T16519C T16311C T16189C C16187T A15301G C13506T
A13105G T10810C G10688A C8655T T825A G247A
Once again, the high number of mutations suggests
there was a significant bottleneck in human evolution at the time,
perhaps around 120,000 years ago, which might have lasted for many thousands of
years.
Note that this is the second time a mutation at
location 15301 has occurred, which means that genomes beyond this point have
the
Between the branches leading to Haplogroups L1 and
L5 there are a further four mutations:
C8468T A7146G T2885C G2758A
The last seven mutations come between Mitochondrial
Eve and the branch to Haplogroup L0:
A16230G G12007A G11914A
G9755A T6185C C4312T
C1048T
These differences from the
The 32 mutations from the period 200,000-60,000
years before present are reviewed in Table 2, together with comments
on their location and function.
Table 2.
The Mutation List from 60,000-200,000 Years Ago
Mutation   Function              Position in the Phylogenetic Tree
G1018A     12S-rRNA              between Haplogroup L4 and the L3 series
G769A      12S-rRNA              between Haplogroup L4 and the L3 series
C16278T    HVR1                  between Haplogroup L6 and Haplogroups L4 and L7
C13650T    ND5 (Pro > Pro)       between Haplogroup L6 and Haplogroups L4 and L7
C7256T
C3594T     ND1 (Val > Val)       between Haplogroup L6 and Haplogroups L4 and L7
T152C      HVR2                  between Haplogroup L6 and Haplogroups L4 and L7
G7521A     tRNA Asp              between Haplogroup L2 and Haplogroup L6
A4104G     ND1 (Leu > Leu)       between Haplogroup L2 and Haplogroup L6
T16519C    HVR1                  between Haplogroup L5 and Haplogroup L2
T16311C    HVR1                  between Haplogroup L5 and Haplogroup L2
T16189C    HVR1                  between Haplogroup L5 and Haplogroup L2
C16187T    HVR1                  between Haplogroup L5 and Haplogroup L2
A15301G    CytB (Leu > Leu) (
C13506T    ND5 (Tyr > Tyr)       between Haplogroup L5 and Haplogroup L2
A13105G    ND5 (Ile > Val)       between Haplogroup L5 and Haplogroup L2
T10810C    ND4 (Leu > Leu)       between Haplogroup L5 and Haplogroup L2
G10688A    ND4 (Val > Val)       between Haplogroup L5 and Haplogroup L2
C8655T     ATP6 (Ile > Ile)      between Haplogroup L5 and Haplogroup L2
T825A      12S-rRNA              between Haplogroup L5 and Haplogroup L2
G247A      HVR2                  between Haplogroup L5 and Haplogroup L2
C8468T     ATP8 (Leu > Leu)      between Haplogroup L1 and Haplogroup L5
A7146G
T2885C     16S-rRNA              between Haplogroup L1 and Haplogroup L5
G2758A     16S-rRNA              between Haplogroup L1 and Haplogroup L5
A16230G    HVR1                  present in Haplogroup L0 and the chimpanzee
G12007A    ND4 (Trp > Trp)       present in Haplogroup L0 and the chimpanzee
G11914A    ND4 (Thr > Thr)       present in Haplogroup L0 and the chimpanzee
G9755A
T6185C
C4312T     tRNA Ile              present in Haplogroup L0 and the chimpanzee
C1048T     12S-rRNA              present in Haplogroup L0 and the chimpanzee
Note: The last seven mutations are all found in
Haplogroup L0. As they also occur
in chimpanzee (Pan troglodytes) mtDNA, this
suggests that they are mutations on the main ancestral line. Haplogroup L0 also
carries other mutations that are not found in chimpanzee mtDNA.
Conclusion
A suggested mitochondrial genome for Mitochondrial Eve is therefore the Cambridge
Reference Sequence (
A73G T152C G247A A263G 309.1C 315.1C A750G G769A
T825A G1018A C1048T A1438G A2706G G2758A T2885C C3594T
A4104G C4312T A4769G T6185C C7028T A7146G C7256T G7521A
C8468T C8655T A8701G A8860G T9540C G9755A A10398G G10688A
T10810C T10873C G11719A G11914A G12007A C12705T A13105G C13506T
C13650T C14766T A15326G C16187T T16189C C16223T A16230G C16278T
T16311C T16519C
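The list above has 50 entries, although 52 mutation events are described in this paper (20 in the last 60,000 years and 32 in the previous 140,000 years). The difference arises because location 15301 changes twice (G15301A, then A15301G) and so returns to its starting base. The following Python sketch checks this bookkeeping; the function and variable names are illustrative, and the events are copied from the mutation lists given earlier in the order of the walk back.

```python
import re

LAST_60K = ("A750G 315.1C A4769G A1438G A15326G A8860G 309.1C A263G "
            "C7028T A2706G C14766T G11719A A73G C16223T C12705T "
            "G15301A T10873C A10398G T9540C A8701G").split()
PREVIOUS_140K = ("G1018A G769A C16278T C13650T C7256T C3594T T152C "
                 "G7521A A4104G "
                 "T16519C T16311C T16189C C16187T A15301G C13506T "
                 "A13105G T10810C G10688A C8655T T825A G247A "
                 "C8468T A7146G T2885C G2758A "
                 "A16230G G12007A G11914A G9755A T6185C C4312T C1048T").split()

_LABEL = re.compile(r"(?:([ACGT])(\d+)|(\d+)\.(\d+))([ACGT])")

def net_differences(events):
    """Collapse an ordered list of events into net differences, sorted by location."""
    sites = {}  # (location, insertion index) -> (starting base, final base)
    for label in events:
        m = _LABEL.fullmatch(label)
        if m is None:
            raise ValueError(f"unrecognised mutation label: {label!r}")
        old, sub_pos, ins_pos, ins_idx, new = m.groups()
        if old is not None:                      # substitution, e.g. A263G
            key = (int(sub_pos), 0)
            start = sites[key][0] if key in sites else old
        else:                                    # insertion, e.g. 309.1C
            key = (int(ins_pos), int(ins_idx))
            start = ""
        sites[key] = (start, new)
    labels = []
    for (pos, idx), (start, end) in sorted(sites.items()):
        if idx:
            labels.append(f"{pos}.{idx}{end}")
        elif start != end:                       # drop sites that changed back
            labels.append(f"{start}{pos}{end}")
    return labels

net = net_differences(LAST_60K + PREVIOUS_140K)
print(len(LAST_60K) + len(PREVIOUS_140K), "events ->", len(net), "net differences")
```

Run as written, this reports 52 events and 50 net differences, with no entry at location 15301.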
The actual sequence is available in a supplementary
text data file which accompanies this paper (Editor’s Note:
The referenced supplementary text file contains one long string of bases
without any internal reference points.
For a tabular version comparing
References
Cann RL, Stoneking M, Wilson AC (1987) Mitochondrial DNA
and human evolution. Nature, 325:31-36.