Mitochondrial
William R. Hurst
Abstract
Long neglected by scientists and
mostly excluded from their phylogenetic trees, the variants at positions 00514-00524
in mitochondrial
Address for correspondence: wrhurst_17@msn.com. W. R. Hurst is the Administrator of the
Haplogroup K Project.
Received:
Introduction
This
study began as an investigation of the variants at mitochondrial
The goal
of the present study is to rectify the past neglect of these interesting mutations
by (1) studying the added resolution that they bring to one mtDNA
haplogroup, Haplogroup K, (2) reviewing the few scientific papers that focused
on them, (3) examining their role in the mtDNA tree in general, and (4)
summarizing what has been learned.
Suggestions for future research will follow.
Nomenclature
The first
large hurdle that must be dealt with is nomenclature. The
Family
Tree
Another
nomenclature factor is that the sequence 514-524 is part of the original HVR3 (aka HVS-
References
will be made below to sequences in FTDNA’s MitoSearch
database, the mtDNA Haplogroup K Project (which included 321 high-resolution
HVR1+HVR2 sequences as of July 23, 2007), and the federal GenBank
database. In MitoSearch, sequences are
always labeled as just K, while in the K Project about 10% of the total
sequences or 14% of the high-resolution sequences have confirmed subclade (also
called subhaplogroup) designations based on
full-sequence tests. GenBank sequences
vary in how they are labeled, based on their origin. Most subclade designations are those from
Behar et al. (2006, Fig. 1), referred to below as the “Behar K tree.” Subclade
designations of sequences not confirmed by full-sequence tests are as predicted
by the author. Additional provisional
subclade designations used in this article are those of the author and may
change when a new authoritative K tree is published.
Definitions
Mutations
Mitochondrial
Heteroplasmy
The
phenomenon of different mtDNA variants being found in different mitochondria or
in different cells in the same person is known as heteroplasmy. Point heteroplasmy (or structural heteroplasmy) is the term used when different SNP
variants are found in a cell. Length heteroplasmy is the occurrence of
any mixture of a
Strictly
speaking, when the term heteroplasmic
mutation is used, or when heteroplasmy is used as a noun, what is usually
meant is a situation where two or more variants for the same position are
detected by an mtDNA test. Where the
heteroplasmy is due to SNP variants, there is a set of IUPAC (International
Union of Pure and Applied Chemistry) codes; 16093Y, for example, would mean
that both the mutated version 16093C and the
Heteroplasmy
and heteroplasmic mutation are often used more loosely to explain why certain
mutations, SNPs or indels, occur by the inheritance of different variants
between generations. Apparently, even if
the mutated variant is not detected in the mother, by the normal ra
Perhaps
there was an intermediate step where, using the strict definition of a
heteroplasmic mutation, both variants were detectable. The key word here is “detectable,” since
those heteroplasmies that are not detectable by the direct sequencing method
commonly used by testing companies – which would require perhaps 20% for the
minority variant to be observed – may be
detectable at 5% by other methods (Tully et al. 2000). In fact, detection of heteroplasmies as low
as 1-2% has a special name: microheteroplasmy (Smigrodzki and Khan, 2005).
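The notation and detection levels described above can be illustrated with a minimal Python sketch. It assumes the standard IUPAC two-base ambiguity codes and treats the approximate percentages quoted in the text (about 20% for direct sequencing, about 5% for other methods, 1-2% for microheteroplasmy) as fixed cut-offs, which real assays would not have. The function names, the constant names, and the exact band boundaries are hypothetical and are not taken from any testing company or cited paper.

```python
# Standard IUPAC ambiguity codes for mixtures of two bases.
IUPAC_TWO_BASE = {
    "R": {"A", "G"},
    "Y": {"C", "T"},
    "S": {"C", "G"},
    "W": {"A", "T"},
    "K": {"G", "T"},
    "M": {"A", "C"},
}

# Approximate levels quoted in the text; treated here as hard
# cut-offs purely for illustration (an assumption).
DIRECT_SEQUENCING_MIN = 0.20
OTHER_METHODS_MIN = 0.05
MICRO_MIN = 0.01


def expand_point_heteroplasmy(call):
    """Split a call such as '16093Y' into (position, sorted list of bases)."""
    position = int(call[:-1])
    code = call[-1].upper()
    return position, sorted(IUPAC_TWO_BASE[code])


def detectability(minority_fraction):
    """Label a minority-variant fraction (0.0 to 0.5) by the bands above."""
    if minority_fraction >= DIRECT_SEQUENCING_MIN:
        return "direct sequencing"
    if minority_fraction >= OTHER_METHODS_MIN:
        return "more sensitive methods"
    if minority_fraction >= MICRO_MIN:
        # Assumed band: the text names 1-2% as microheteroplasmy and
        # says nothing specific about the range between 2% and 5%.
        return "microheteroplasmy range"
    return "below the levels discussed"


print(expand_point_heteroplasmy("16093Y"))
print(detectability(0.10))
```

Under these assumptions a 10% minority variant would be missed by direct sequencing yet could still be found by a more sensitive method, which is the situation the word "detectable" is meant to capture.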
The Behar
K tree demonstrates the problem that undetectable heteroplasmy causes
for trees created with software such as Fluxus-Engineering’s
Network program. To prevent
reticulations caused by heteroplasmic and other recurrent mutations, Behar
excluded our 524 insertions as well as the insertions at positions 309 and 315 and
certain other HVR and coding-region mutations.
And yet, there are patterns involving position 524 in the K
subclades. The 524 insertions are found
in certain subclades but not in others, and the same holds for the deletions. These patterns will be discussed in detail
for each subclade below. Even adding
them back to the data used for the Fluxus diagram
does not always explain the appearances of the 524 indels. Turner (2006) expressed the situation well in
the title of an article in this Journal: “Now You See It, Now You Don’t:
Heteroplasmy in Mitochondrial
We see
that a mutation reported for a person may have occurred in two general ways:
(1) by a de novo mutation similar to
a nuclear
In the
context of heteroplasmy, the term “fixed” means that only one heteroplasmic
variant is inherited by the founder of a subclade. If a different variant appears later in that
subclade or a lower subclade, it may be assumed that there has been a de novo mutation. “Fixed out” means that a particular variant
is missing from the group of inherited variants. If that variant later appears in that
subclade or one of its descendant subclades, it again may be assumed that there
has been a de novo mutation. Tully et al. (2000) discuss
the term “fixed.” A related term is “resolved.”
If a woman with a strict heteroplasmy (two or more variants detectable)
has a descendant with only one variant detectable, the position is said to be
resolved at that variant. A progression
over many generations might be (1) a woman with only the T or
Haplogroup
Notation
For any
mtDNA haplogroup, there are often several levels of subclades or
subhaplogroups. For this article, the
major or high-level K subclades are K1, K1a, K1b, K1c and
Points of Conundrum
For this
article the term points of conundrum will
be used for certain branching points on the K phylogenetic tree that are
clearly defined by coding-region or HVR mutations, but that may appear to originate the
length heteroplasmic variants at position 524, or to pass them on between generations and nodes on
the tree, through a heteroplasmic system that is only occasionally visible. The reason for using the new term is not that
a new method of heredity has been discovered, just that the effects of
undetected heteroplasmic mutations have not been widely discussed. Typically, a subclade that has haplotypes
with more than one variant divides into two or more lower subclades with
different combinations of the variants. Table
1 shows the percentages of each type of variant (deletions,
In Table 1, the percentages of the
position 524 variants for the members of the mtDNA Haplogroup K Project are
those of the Family Tree
Table 1. 524 Variants in Haplogroup K
| | Deletions % | | Insertions % |
|---|---|---|---|
| K Project | 2.2 | 68.4 | 29.4 |
| SMGF | 16.8 | 76.8 | 6.4 |
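As a quick arithmetic check on Table 1, the following Python sketch confirms that the three percentages in each row account for the whole sample, and compares how often insertions occur relative to deletions in the two data sets. The dictionary and function names are hypothetical, and the middle column is carried through as an unnamed value because its header is blank in the table.

```python
# Percentages copied from Table 1: (deletions, middle column, insertions).
TABLE_1 = {
    "K Project": (2.2, 68.4, 29.4),
    "SMGF": (16.8, 76.8, 6.4),
}


def row_total(row):
    """Sum the three percentages, rounded to one decimal place."""
    return round(sum(row), 1)


def insertion_to_deletion_ratio(row):
    """Ratio of the insertion percentage to the deletion percentage."""
    deletions, _middle, insertions = row
    return insertions / deletions


for source, row in TABLE_1.items():
    print(source, row_total(row), round(insertion_to_deletion_ratio(row), 2))
```

Both rows sum to 100.0, so the three categories are exhaustive for each data set. The ratio is only a rough summary of how differently the two sources report 524 indels and says nothing about why they differ.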
Table 2 illustrates the percentages of
each variant in most K subclades. The
subclades listed include those from the Behar K tree which have examples in the
K Project confirmed by full-sequence tests or known examples in GenBank, plus
provisional subclades used by the author: K1a10, K1a11, Pre-K1a9 and
Pre-K1a10. Those with plus signs (K1a+,
K1b+, K1c+ and K2+) include not only samples which have been assigned
high-level subclade designations after full-sequence tests, but also samples
from the K Project that have not been tested adequately to determine their
possible membership in a lower subclade.
These may eventually move into one of the more specific lower subclades
listed. The Counts column lists the
number of examples of each subclade from the K Project and GenBank. The GenBank examples include the 121
full sequences used in the Behar K tree except for those marked “H” (for Herrnstadt) which, until recently, were not in
GenBank. Even now the published Herrnstadt sequences do not include HVR mutations. Added are several other K examples listed on
Table 2. Percentages of Position 524 Heteroplasmic Variants in Haplogroup K Subclades
| Subclade | Counts | 522-,523- % | | 524.1,524.2 % | 524.3,524.4 % | 524.5,524.6 % | 524.7,524.8 % | 524 Total Inserts % |
|---|---|---|---|---|---|---|---|---|
| Repeats | | 4 | 5 | 6 | 7 | 8 | 9 | |
| | 11-KP | | 100 | | | | | 0 |
| K2a | 34-KP,19-GB | 6 | 94 | | | | | 0 |
| K2a1a | 1-GB | | 100 | | | | | 0 |
| K2a2 | 1-GB | | 100 | | | | | 0 |
| K2a2a | 8-KP,2-GB | | 100 | | | | | 0 |
| K2a3 | 2-GB | | 100 | | | | | 0 |
| K2a4 | 1-GB | | 100 | | | | | 0 |
| K2c | 1-GB | | 100 | | | | | 0 |
| K1 | 1-KP | | 100 | | | | | 0 |
| K1c+ | 14-KP | 21 | 79 | | | | | 0 |
| K1c1 | 1-KP,8-GB | | 100 | | | | | 0 |
| K1c1a | 1-GB | 100 | | | | | | 0 |
| K1c1b | 4-GB | | 100 | | | | | 0 |
| K1c2 | 26-KP,1-GB | | 96 | 4 | | | | 4 |
| K1a+ | 67-KP | 1 | 70 | 24 | 4 | 1 | | 29 |
| K1a1 | 1-KP,1-GB | | 100 | | | | | 0 |
| K1a1a | 1-KP | | 100 | | | | | 0 |
| K1a1b | 1-KP,1-GB | | 100 | | | | | 0 |
| K1a1b1 | 2-KP,1-GB | 33 | 67 | | | | | 0 |
| K1a1b1a | 30-KP,7-GB | | 97 | 3 | | | | 3 |
| K1a6 | 2-GB | | 100 | | | | | 0 |
| K1a7 | 1-GB | | 100 | | | | | 0 |
| K1a8 | 3-GB | | 100 | | | | | 0 |
| K1a11 | 8-KP | | 100 | | | | | 0 |
| K1a3 | 1-GB | | 100 | | | | | 0 |
|
|