Phylogenetic Networks for the Human mtDNA Haplogroup T
David A. Pike
Department of Mathematics and Statistics,
Memorial University of Newfoundland,
St. John's, Newfoundland,
Abstract: We develop phylogenetic networks for mtDNA haplogroup T, based on information stored in the MitoSearch database at www.mitosearch.org. Analysing the structure of the resulting networks, we note that nucleotide 16296 appears to be unstable throughout the haplogroup. We also observe a cluster that does not fall within one of the established subgroups of haplogroup T and so we propose some revision to the haplogroup hierarchy in order to encompass this cluster.
E-mail Address: email@example.com
The human mitochondrial DNA molecule was first fully sequenced in 1981 by Anderson et al.;
this sequence of 16,569
nucleotide base pairs has since become known as the Cambridge Reference Sequence and is often referred to by the
acronym CRS. Numerous subsequent studies have revealed that mutations within the mtDNA genome are an effective tool
with which to delve into aspects of population genetics and human migrations. In this regard, a phylogenetic tree of
major mtDNA haplogroups has been developed, whereby each haplogroup is characterised by a particular set of mutational
differences as compared to the CRS.
A version of this tree that relies on coding region mutations appears in a manuscript by
Herrnstadt et al. (2002).
Here we focus our attention on haplogroup T, which was first described as "group 2B" by
Richards et al. (1996),
who observed that the haplogroup was characterised by a pair
of mutations (at nucleotide positions 16126 and 16294)
within the first hypervariable region (HVR1) of the noncoding control region of the mtDNA genome.
Torroni et al. (1996)
associated the haplogroup with
polymorphic restriction sites within the coding region,
but also observed a correlation with HVR1 positions 16294 and 16296
(mutations at 16126 were observed in several samples, but were also lacking in a few others).
Haplogroup T is now generally associated
with a number of polymorphisms, at nucleotide positions 16126 and 16294
within the noncoding region
of the mtDNA genome and the following positions within the coding region:
709, 1888, 4216, 4917, 8697, 10463, 11251, 13368, 14905, 15452, 15607,
(Torroni et al. 1996; Macaulay et al. 1999; Finnilä and Majamaa 2001).
Additional mutations, such as those at positions 73 and 16519, are
common within haplogroup T,
but are also found in several other haplogroups
(Wilkinson-Herbots et al. 1996; Helgason et al. 2000).
Our inquiry into the structure of the phylogenetic network for haplogroup T stems from the growing number
of individuals who are participating in
genetic genealogy studies and find themselves to be members
of the haplogroup. Searches for information about the haplogroup will, with some effort,
reveal that it originated in the Near East approximately 46,500 years ago but is now most
common in Europe, where it is found to occur in up to 10% of some subpopulations
(Richards et al. 1998; Helgason et al. 2001).
In the overall phylogenetic tree, haplogroup T is closest to haplogroup J, which is characterised by the
HVR1 motif 16069-16126
(Torroni et al. 1994; Richards et al. 1996)
as well as coding region mutations at 4216, 10398, 11251, 12612, 13708, and 15452
(Torroni et al. 1994; Macaulay et al. 1999; Finnilä and Majamaa 2001).
Hence the parent haplogroup JT has the motif 4216-11251-15452-16126, with 16126 being the defining HVR1 mutation.
When considering HVR1 mutations, it is therefore the additional mutation at 16294 that defines haplogroup T,
whereas haplogroup J is distinguished by the mutation at 16069.
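As a minimal sketch of the HVR1 rule just described, the following Python function labels a set of HVR1 mutation positions as T, J, or unresolved JT. The function name and the handling of ambiguous inputs are illustrative assumptions; in practice haplogroup assignment also draws on coding region data, and the rule below does not allow for samples that have lost the 16126 mutation.

```python
# Illustrative only: HVR1 positions taken from the motifs quoted in the text.
JT_MARKER = 16126  # defining HVR1 mutation of the parent haplogroup JT
T_MARKER = 16294   # additional mutation that defines haplogroup T
J_MARKER = 16069   # additional mutation that distinguishes haplogroup J


def classify_hvr1(mutations):
    """Return 'T', 'J', 'JT' or None for a collection of HVR1 positions.

    'JT' means the 16126 marker is present but the sample cannot be placed
    in exactly one of T or J by this rule (an assumption of this sketch).
    """
    positions = set(mutations)
    if JT_MARKER not in positions:
        return None
    has_t = T_MARKER in positions
    has_j = J_MARKER in positions
    if has_t and not has_j:
        return "T"
    if has_j and not has_t:
        return "J"
    return "JT"


print(classify_hvr1([16126, 16294, 16519]))  # T
print(classify_hvr1([16069, 16126]))         # J
```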
The scientific literature also contains a
number of papers from the medical research community, in which
attempts to correlate pathological conditions with haplogroup
membership are made.
For instance, a study conducted in Spain observed a greater rate of
occurrence of reduced sperm motility among men in haplogroup T
than was found with men in other haplogroups
(Ruiz-Pesini et al. 2000). However, a more recent study found
the haplogroup association not to be sound, and noted that care must be
taken when attempting to draw conclusions about haplogroups
when considering only a regional sampling of data
(Pereira et al. 2005).
Elsewhere it has been reported that membership in haplogroup T may
offer some protection against Alzheimer Disease
(Chagnon et al. 1999; Herrnstadt et al. 2002)
and also Parkinson's Disease (Pyle et al. 2005),
but the cautionary words of Pereira et al. suggest that further
studies may be necessary before reaching firm conclusions.
Searches for information about the
haplogroup will also reveal that
Russian Tsar Nicholas II was a member of haplogroup T, and that he
and his brother, the Grand Duke George Alexandrovitch Romanov,
both exhibited heteroplasmy at nucleotide position 16169
(Ivanov et al. 1996).
This information, while interesting, may not
satisfy those whose primary interest is genetic
genealogy and who are seeking some sense of place within the haplogroup. In this paper we construct a phylogenetic network
based on information stored in the
MitoSearch database at www.mitosearch.org,
a public database designed to assist
in the pursuit of genetic genealogy; individuals can enter their own mtDNA haplotype into the database
in the hope of making contact with others who share their genetic
signature (i.e., with potential relatives).
In particular, we extract the data pertaining to haplogroup T and then construct
a map based on this data set, so that individuals may determine their place within the haplogroup T family.
Subsequent to building phylogenetic networks, we conduct some analysis of the haplogroup and its subgroups.
In so doing, we propose a revision to the haplogroup T subgroup hierarchy.
Table 1
Subgroup    Associated HVR1 Mutations
T1          16163-16186-16189
T2          16304
T3          16292
T4          16324
T5          16153

Subgroup    T     T*    T1    T2    T3    T4    T5
Samples     47    61    76    144   17    7     15

Subgroup    Associated HVR1 Mutations
T1          16189
T1a         16163-16186-16189
T1b         16163-16189-16243
T1c         16182-16183-16189-16298
Methodology

The source for the data we use is, as mentioned above, the MitoSearch database found at www.mitosearch.org. For each sample, it contains the results of genetic analysis of the nucleotides in the interval 16001 to 16569, which encompasses the first hypervariable region (HVR1). Several of the database entries also report the results of genetic analysis for nucleotide positions 1 to 574 (this interval includes HVR2).

As of November 15, 2005, the MitoSearch database contained a total of 367 samples that had been classified as belonging to haplogroup T or one of its subgroups. As the majority of these samples had only been tested for mutations within the interval 16001 to 16569, we chose to limit our consideration to this interval alone. Alternatively, we could have opted to work with the minority of samples that had been fully tested for both HVR1 and HVR2, but such a choice would have been contrary to our motivational goal of presenting a map that could be consulted by genetic genealogists, many of whom only have information for the HVR1 portion of their mtDNA genome.

The MitoSearch database does not store information pertaining to the coding region of the mtDNA genome. Not having coding region data and not utilising HVR2 data may partially inhibit our ability to construct phylogenetic networks in the sense that it is possible that some genetic branching may not be observed if its only evidence is located among mutations in these regions. Fortunately the part of the human mtDNA molecule that we are using (the HVR1 portion) is known to have the highest rate of variation of any part of the mtDNA genome (Greenberg et al. 1983; Kocher and Wilson 1991).

To date, five major subgroups of haplogroup T have been identified, and each is associated with a particular set of HVR1 mutations (Richards et al. 1998; Richards et al. 2000). These motifs, which are in addition to the HVR1 motif 16126-16294 that defines haplogroup T, are listed in Table 1.
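To illustrate how the subgroup motifs can be applied to an individual haplotype, the sketch below checks a set of HVR1 mutation positions against the five motifs (T1: 16163-16186-16189, T2: 16304, T3: 16292, T4: 16324, T5: 16153). The function name is hypothetical, and the use of the label T* for a haplogroup T sample that matches no subgroup motif is an assumption of this sketch rather than a definition taken from the database.

```python
# Illustrative only: motifs as listed for the five established subgroups.
T_MOTIF = {16126, 16294}
SUBGROUP_MOTIFS = {
    "T1": {16163, 16186, 16189},
    "T2": {16304},
    "T3": {16292},
    "T4": {16324},
    "T5": {16153},
}


def assign_subgroups(mutations):
    """Return the subgroup labels whose full motif occurs in the haplotype.

    An empty list means the haplogroup T motif itself is absent. A list is
    returned because a haplotype could, in principle, match several motifs.
    """
    positions = set(mutations)
    if not T_MOTIF <= positions:
        return []
    matches = [name for name, motif in SUBGROUP_MOTIFS.items()
               if motif <= positions]
    return matches or ["T*"]  # assumed label for "no subgroup motif"


print(assign_subgroups([16126, 16294, 16304]))  # ['T2']
print(assign_subgroups([16126, 16294, 16519]))  # ['T*']
```

Requiring the full motif is a deliberately strict choice: a sample carrying only part of a motif is left unassigned here, whereas a network analysis may place such samples near the corresponding subgroup.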
Analysis and Discussion

The overall phylogenetic network, based on the 361 samples and the corresponding 121 haplotypes, is shown in Figure 1. Each node and solid edge has been labelled, but even with a small font the labels may detract from the overall presentation. Hence in Figure 2 we present the corresponding unlabelled network diagram. The node representing the haplotype 16126-16294-16519 has been drawn with a double circle to emphasise that it is the point at which the network connects to the greater human mtDNA phylogenetic network.
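For readers who wish to experiment with their own data, the following sketch shows one simple way to represent haplotypes as nodes and to link pairs that differ at exactly one nucleotide position, labelling each edge with that position. This one-step rule is an assumption made for illustration; it is not claimed to be the procedure used to construct the networks in the figures, and the three haplotypes in the example are toy inputs rather than records drawn from the MitoSearch data.

```python
from itertools import combinations


def one_step_edges(haplotypes):
    """Link every pair of distinct haplotypes differing at one position.

    Each haplotype is a collection of HVR1 mutation positions. The result
    is a list of (haplotype_a, haplotype_b, position) tuples, where the
    haplotypes are sorted tuples of positions.
    """
    nodes = sorted({tuple(sorted(h)) for h in haplotypes})
    edges = []
    for a, b in combinations(nodes, 2):
        diff = set(a) ^ set(b)  # positions present in only one of the two
        if len(diff) == 1:
            edges.append((a, b, diff.pop()))
    return edges


# Toy haplotypes (hypothetical), starting from the motif 16126-16294-16519.
toy = [
    [16126, 16294, 16519],
    [16126, 16294, 16304, 16519],
    [16126, 16294, 16296, 16304, 16519],
]
for a, b, position in one_step_edges(toy):
    print(a, "--", position, "--", b)
```

With real data, haplotypes separated by two or more mutations would remain unconnected under this rule, so a fuller treatment would need to introduce intermediate nodes or use an established network-construction method.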
Acknowledgements

Credit goes to Family Tree DNA for creating and managing the MitoSearch database. The diagrams in this document were drawn with the assistance of Pajek (Batagelj and Mrvar), a software program for large network analysis. The three anonymous reviewers who refereed this paper are also thanked for several helpful comments. Research support from NSERC is also acknowledged.
Source Data

The raw data extracted from the MitoSearch database and used in the construction of the networks presented in this paper are available online at www.jogg.info/21/T-data.txt.

Anderson S, Bankier AT, Barrell BG, de Bruijn MHL, Coulson AR, Drouin J, Eperon IC, Nierlich DP, Roe BA, Sanger F, Schreier PH, Smith AJH, Staden R, Young IG (1981) Sequence and organization of the human mitochondrial genome. Nature 290:457-465.
Batagelj V, Mrvar A. Pajek - Program for Large Network Analysis. Home page: http://vlado.fmf.uni-lj.si/pub/networks/pajek.
Chagnon P, Gee M, Filion M, Robitaille Y, Belouchi M, Gauvreau D (1999) Phylogenetic analysis of the mitochondrial genome indicates significant differences between patients with Alzheimer disease and controls in a French-Canadian founder population. Am. J. Med. Genet. 85:20-30.
Finnilä S, Majamaa K (2001) Phylogenetic analysis of mtDNA haplogroup TJ in a Finnish population. J. Hum. Genet. 46:64-69.
Greenberg BD, Newbold JE, Sugino A (1983) Intraspecific nucleotide sequence variability surrounding the origin of replication in human mitochondrial DNA. Gene 21:33-49.
Helgason A, Hickey E, Goodacre S, Bosnes V, Stefánsson K, Ward R, Sykes B (2001) mtDNA and the Islands of the North Atlantic: Estimating the Proportions of Norse and Gaelic Ancestry. Am. J. Hum. Genet. 68:723-737.
Helgason A, Sigurðardóttir S, Gulcher JR, Ward R, Stefánsson K (2000) mtDNA and the Origin of the Icelanders: Deciphering Signals of Recent Population History. Am. J. Hum. Genet. 66:999-1016.
Herrnstadt C, Elson JL, Fahy E, Preston G, Turnbull DM, Anderson C, Ghosh SS, Olefsky JM, Beal MF, Davis RE, Howell N (2002) Reduced-Median-Network Analysis of Complete Mitochondrial DNA Coding-Region Sequences for the Major African, Asian, and European Haplogroups. Am. J. Hum. Genet. 70:1152-1171.
Howell N, Smejkal CB (2000) Persistent Heteroplasmy of a Mutation in the Human mtDNA Control Region: Hypermutation as an Apparent Consequence of Simple-Repeat Expansion/Contraction. Am. J. Hum. Genet. 66:1589-1598.
Ivanov PL, Wadhams MJ, Roby RK, Holland MM, Weedn VW, Parsons TJ (1996) Mitochondrial DNA sequence heteroplasmy in the Grand Duke of Russia Georgij Romanov establishes the authenticity of the remains of Tsar Nicholas II. Nat. Genet. 12:417-420.
Kivisild T, Reidla M, Metspalu E, Rosa A, Brehm A, Pennarun E, Parik J, Geberhiwot T, Usanga E, Villems R (2004) Ethiopian Mitochondrial DNA Heritage: Tracking Gene Flow Across and Around the Gate of Tears. Am. J. Hum. Genet. 75:752-770.
Kocher TD, Wilson AC (1991) Sequence Evolution of Mitochondrial DNA in Humans and Chimpanzees: Control Region and a Protein-Coding Region. In S. Osawa and T. Honjo (Eds.), Evolution of Life: Fossils, Molecules, and Culture, pp. 391-413. Springer-Verlag, Tokyo.
Macaulay V, Richards M, Hickey E, Vega E, Cruciani F, Guida V, Scozzari R, Bonné-Tamir B, Sykes B, Torroni A (1999) The Emerging Tree of West Eurasian mtDNAs: A Synthesis of Control-Region Sequences and RFLPs. Am. J. Hum. Genet. 64:232-249.
Malyarchuk BA, Derenko MV (1999) Molecular instability of the mitochondrial haplogroup T sequences at nucleotide positions 16292 and 16296. Ann. Hum. Genet. 63:489-497.
Palanichamy M, Sun C, Agrawal S, Bandelt H-J, Kong Q-P, Khan F, Wang C-Y, Chaudhuri TK, Palla V, Zhang Y-P (2004) Phylogeny of Mitochondrial DNA Macrohaplogroup N in India, Based on Complete Sequencing: Implications for the Peopling of South Asia. Am. J. Hum. Genet. 75:966-978.
Pereira L, Gonçalves J, Goios A, Rocha T, Amorim A (2005) Human mtDNA haplogroups and reduced male fertility: real association or hidden population substructuring. Int. J. Androl. 28:241-247.
Pyle A, Foltynie T, Tiangyou W, Lambert C, Keers SM, Allcock LM, Davison J, Lewis SJ, Perry RH, Barker R, Burn DJ, Chinnery PF (2005) Mitochondrial DNA haplogroup cluster UKJT reduces the risk of PD. Ann. Neurol. 57:564-567.
Richards M, Côrte-Real M, Forster P, Macaulay V, Wilkinson-Herbots H, Demaine A, Papiha S, Hedges R, Bandelt H-J, Sykes B (1996) Paleolithic and Neolithic Lineages in the European Mitochondrial Gene Pool. Am. J. Hum. Genet. 59:185-203.
Richards M, Macaulay V, Hickey E, Vega E, Sykes B, Guida V, Rengo C, Sellitto D, Cruciani F, Kivisild T, Villems R, Thomas M, Rychkov S, Rychkov O, Rychkov Y, Gölge M, Dimitrov D, Hill E, Bradley D, Romano V, Calì F, Vona G, Demaine A, Papiha S, Triantaphyllidis C, Stefanescu G, Hatina J, Belledi M, Rienzo AD, Oppenheim A, Nørby S, Al-Zaheri N, Santachiara-Benerecetti S, Scozzari R, Torroni A, Bandelt H-J (2000) Tracing European Founder Lineages in the Near Eastern mtDNA Pool. Am. J. Hum. Genet. 67:1251-1276.
Richards MB, Macaulay VA, Bandelt H-J, Sykes BC (1998) Phylogeography of mitochondrial DNA in western Europe. Ann. Hum. Genet. 62:241-260.
Ruiz-Pesini E, Lapeña A-C, Díez-Sánchez C, Pérez-Martos A, Montoya J, Alvarez E, Díaz M, Urriés A, Montoro L, López-Pérez MJ, Enríquez JA (2000) Human mtDNA Haplogroups Associated with High or Reduced Spermatozoa Motility. Am. J. Hum. Genet. 67:682-696.
Torroni A, Huoponen K, Francalacci P, Petrozzi M, Morelli L, Scozzari R, Obinu D, Savontaus ML, Wallace DC (1996) Classification of European mtDNAs From an Analysis of Three European Populations. Genetics 144:1835-1850.
Torroni A, Lott MT, Cabell MF, Chen Y-S, Lavergne L, Wallace DC (1994) mtDNA and the Origin of Caucasians: Identification of Ancient Caucasian-specific Haplogroups, One of Which is Prone to a Recurrent Somatic Duplication in the D-Loop Region. Am. J. Hum. Genet. 55:760-776.
Wilkinson-Herbots HM, Richards MB, Forster P, Sykes BC (1996) Site 73 in hypervariable region II of the human mitochondrial genome and the origin of European populations. Ann. Hum. Genet. 60:499-508.