SNPs on Chips:
A New Source of Data for Y-Chromosome Studies
‘Satiable Curiosity is a column dedicated to the proposition that genetic genealogists are an untapped resource for resolving questions about
Single Nucleotide Polymorphisms (SNPs, pronounced snips) are mutations that substitute one base
(A, C, G or T) for another, thus creating two possible versions of a
recently, tests for Y SNPs used a targeted approach, selecting a limited number
of the best candidates for analysis. For instance, Family Tree
Now several companies are offering scans for hundreds of thousands of SNPs scattered across the entire genome. Tests for these SNPs are assembled on a platform commonly called a chip, because the original manufacturing process was similar to the one used to make computer chips. Although the emphasis is on the 22 pairs of autosomal (non-sex) chromosomes, the mass-produced genome chips also contain hundreds of Y SNPs, more than have ever been tested simultaneously.
At the time of this writing, a smattering of individuals have received results from two of the genome companies, 23andMe and deCODEme. The two companies have taken different approaches to SNP selection.
23andMe has added a number of custom SNPs to the off-the-shelf chip from Illumina. These were specifically chosen to cover much of
the phylogenetic tree, in collaboration with
In contrast, deCODEme does not assign such derived haplogroups, yet it includes more Y SNPs in the raw data, a total of 858. These SNPs are all in dbSNP and come from a large variety of sources. They have simply been observed to occur in some setting or another, and they are not necessarily vetted for their placement in a tree structure.
That immediately makes the curious genetic genealogist ask the question “CAN these SNPs be placed on the phylogenetic tree? In fact, would the SNPs revise the tree, uniting some branches or adding some twigs at the tips?” The genome tests are expensive (in the $1000 range), but if a few pioneering individuals share their raw data, all may benefit from the insights gained by comparing even a few people.
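As a rough illustration of how shared raw data could be compared, the sketch below looks for SNPs whose called base differs among contributors and lists which haplogroups carry each base. The data layout (a dictionary of contributor to haplogroup and calls), the function name, the marker IDs, and the haplogroup labels are all invented for this example; neither company's actual file format is assumed.

```python
from collections import defaultdict

VALID_BASES = {"A", "C", "G", "T"}


def informative_snps(samples):
    """Find SNPs that show more than one base among the contributors.

    `samples` is a hypothetical structure:
        {contributor_id: (haplogroup_label, {snp_id: base})}
    Anything that is not a single A, C, G or T (for example a no-call)
    is ignored.
    Returns {snp_id: {base: sorted list of haplogroups carrying it}}.
    """
    bases_by_snp = defaultdict(lambda: defaultdict(set))
    for haplogroup, calls in samples.values():
        for snp_id, base in calls.items():
            if base in VALID_BASES:
                bases_by_snp[snp_id][base].add(haplogroup)

    # Keep only SNPs where at least two different bases were observed.
    return {
        snp_id: {base: sorted(groups) for base, groups in by_base.items()}
        for snp_id, by_base in bases_by_snp.items()
        if len(by_base) > 1
    }


if __name__ == "__main__":
    # Entirely made-up contributors, markers, and calls.
    example = {
        "contributor1": ("R1b1c-branchX", {"snp1": "C", "snp2": "G"}),
        "contributor2": ("R1b1c-branchY", {"snp1": "T", "snp2": "G"}),
        "contributor3": ("I", {"snp1": "T", "snp2": "G"}),
    }
    for snp_id, by_base in informative_snps(example).items():
        print(snp_id, by_base)
```

A SNP that splits the contributors in this way is only a candidate for a branch-defining marker. Deciding which base is ancestral, and where the marker sits on the tree, still requires comparison with the established phylogeny.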
this cooperative endeavor is taking place informally right now. Since R1b1c is
very common, the first interesting discovery involved a SNP in that haplogroup,
rs34276300, which was found to be ancestral in one branch of R1b1c and derived
in several other branches. Two companies, EthnoAncestry
and Family Tree
systematic effort to collect genotype data, from a broader variety of
haplogroups, would u
to this collaborative effort may write to me for further instructions on
extracting their genotypes from the complete genome scan. No medical
implications of the Y-SNP portion are known at the present time, although
fertility problems might be evidenced by several consecutive no-calls in the
region of genes responsible for sperm production. Contributors may remain anonymous, but if
they are willing to be contacted for further information (such as
Although 23andMe customers may feel they already have all the answers, there are other points of interest, and input is solicited from them as well. For instance, there is the question of how unique UEPs (unique event polymorphisms) actually are: how often do parallel and reverse mutations occur? The targeted SNP tests skip over large numbers of SNPs, and perhaps one will show up when more “irrelevant” markers are included. Preliminary results have also revealed a curious phenomenon: one marker has been heterozygous (exhibiting two alleles) in several R1b1c individuals, but homozygous (one allele) in haplogroup I. The reason for this is not known.
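A pattern like the heterozygous marker can be tallied mechanically once several people have pooled their results. The following is a minimal sketch, assuming each genotype is stored as a two-letter string such as "AG"; the data layout, function name, and marker IDs are hypothetical and do not reflect any company's export format.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def heterozygous_counts(samples):
    """Count heterozygous calls per (marker, haplogroup) pair.

    `samples` is a hypothetical structure:
        {contributor_id: (haplogroup_label, {marker_id: genotype})}
    where a genotype is assumed to be a two-letter string. A call is
    treated as heterozygous when both letters are valid bases and they
    differ; no-calls and other codes are skipped.
    """
    counts = Counter()
    for haplogroup, calls in samples.values():
        for marker_id, genotype in calls.items():
            if (
                len(genotype) == 2
                and set(genotype) <= VALID_BASES
                and genotype[0] != genotype[1]
            ):
                counts[(marker_id, haplogroup)] += 1
    return counts


if __name__ == "__main__":
    # Made-up data: one marker heterozygous in two R1b1c contributors.
    example = {
        "contributor1": ("R1b1c", {"markerA": "AG", "markerB": "CC"}),
        "contributor2": ("R1b1c", {"markerA": "AG", "markerB": "CC"}),
        "contributor3": ("I", {"markerA": "AA", "markerB": "CC"}),
    }
    for (marker_id, haplogroup), n in heterozygous_counts(example).items():
        print(marker_id, haplogroup, n)
```

The tally only shows where the pattern occurs. It says nothing about why a Y-chromosome marker would show two alleles.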
The sheer quantity of raw data is unprecedented, and genetic genealogists are in a position to help interpret it.
Y-SNPs on Chips Database
Karafet TM, Mendez FL, Meilerman MB, Underhill PA, Zegura SL, Hammer MF (2008) New binary polymorphisms reshape and increase the resolution of the human Y chromosomal haplogroup tree. Genome Res, 18:830-838.
 The current version of the ISOGG Y Phylogenetic Tree can be found at http://www.isogg.org/tree.
 Ordering information is
available only on the personal results page of people who have obtained
 Occasionally, some SNPs cannot be “called,” or assigned a value, and the assignment may not be quite as deep, e.g., R1b1c*.
 Not all markers in this tree have rs numbers.
 A large-scale deletion, approximately 4 million bases roughly between
positions 23,000,000 and 27,000,000, was described by King et al. (2005). This region contains a commonly tested Y-