‘Satiable Curiosity
SNPs on Chips:
A New Source of Data for Y-Chromosome Studies
‘Satiable Curiosity is a column dedicated to the proposition that genetic genealogists are an untapped resource for resolving questions about
Single Nucleotide Polymorphisms (SNPs, pronounced snips) are mutations that substitute one base
(A, C, G or T) for another, thus creating two possible versions of a
Until recently, tests for Y SNPs used a targeted approach, selecting a limited number
of the best candidates for analysis. For instance, Family Tree
Now several companies[6] are offering scans for hundreds of thousands of SNPs
scattered across the entire genome. Tests for these SNPs are assembled on a
platform commonly called a chip, because the original manufacturing process was
similar to the one used to make computer chips. Although the emphasis is on the
22 pairs of autosomal (non-sex) chromosomes, the mass-produced genome chips also
contain hundreds of Y SNPs, more than have ever been tested simultaneously.
At the time of this writing, a smattering of individuals have received results
from two of the genome companies, 23andMe and deCODEme. The two companies have
taken different approaches to SNP selection.
23andMe has added a number of custom SNPs to the off-the-shelf chip from Illumina. These were specifically chosen to cover much of
the phylogenetic tree, in collaboration with
In contrast, deCODEme does not assign such derived
haplogroups, yet it includes more Y SNPs in the raw data, a total of 858. These
SNPs are all in dbSNP and come from a large variety
of sources. They have simply been observed to occur in some setting or another,
and they are not necessarily vetted for their placement in a tree structure.
That immediately makes the curious genetic genealogist ask the question: “CAN
these SNPs be placed on the phylogenetic tree? In fact, would the SNPs revise
the tree, uniting some branches or adding some twigs at the tips?” The genome
tests are expensive (in the $1,000 range), but if a few pioneering individuals
share their raw data, everyone may benefit from the insights gained by comparing
even a few people.
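The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data layout (one dictionary of rs number to allele per person), the `--` no-call placeholder, the function name `variable_snps`, and the rs numbers and alleles in the sample are all hypothetical and do not reflect any company's actual download format or any real results.

```python
from collections import defaultdict

NO_CALL = "--"  # assumed placeholder for a SNP that could not be called


def variable_snps(calls_by_person):
    """Return {rsid: {allele: [people]}} for SNPs showing more than one allele.

    calls_by_person maps a person's label to a dict of rsid -> allele.
    No-calls are ignored, so a SNP is reported only when at least two
    different alleles were actually observed among the contributors.
    """
    by_snp = defaultdict(lambda: defaultdict(list))
    for person, calls in calls_by_person.items():
        for rsid, allele in calls.items():
            if allele != NO_CALL:
                by_snp[rsid][allele].append(person)
    return {
        rsid: {allele: sorted(people) for allele, people in alleles.items()}
        for rsid, alleles in by_snp.items()
        if len(alleles) > 1
    }


# Made-up genotypes for three hypothetical contributors.
samples = {
    "person_A": {"rs0000001": "C", "rs0000002": "G", "rs0000003": "T"},
    "person_B": {"rs0000001": "T", "rs0000002": "G", "rs0000003": "--"},
    "person_C": {"rs0000001": "T", "rs0000002": "G", "rs0000003": "T"},
}

result = variable_snps(samples)
for rsid, alleles in result.items():
    print(rsid, alleles)
```

A SNP that splits the contributors into groups is a candidate for placement on the tree; one that shows the same allele in everyone tells us nothing until a more diverse set of haplogroups is sampled.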
In fact, this cooperative endeavor is taking place informally right now. Since R1b1c is
very common, the first interesting discovery involved a SNP in that haplogroup,
rs34276300, which was found to be ancestral in one branch of R1b1c and derived
in several other branches. Two companies, EthnoAncestry
and Family Tree
A more systematic effort to collect genotype data, from a broader variety of
haplogroups, would u
Contributors to this collaborative effort may write to me for further instructions on
extracting their genotypes from the complete genome scan. No medical
implications of the Y-SNP portion are known at the present time, although
fertility problems might be evidenced by several consecutive no-calls in the
region of genes responsible for sperm production.[12] Contributors may remain anonymous, but if
they are willing to be contacted for further information (such as
Although 23andMe customers may feel they already have all the answers, there are
other points of interest, and input is solicited from them as well. For
instance, there is the question of how unique a UEP (Unique Event Polymorphism)
actually is: how often do parallel and reverse mutations occur? The targeted SNP
tests skip over large numbers of SNPs, and perhaps such a mutation will show up
when more “irrelevant” markers are included. Preliminary results have also
revealed a curious phenomenon: one marker has been heterozygous (exhibiting two
alleles) in several R1b1c individuals, but homozygous (one allele) in haplogroup
I. The reason for this is not known.
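A tally of this kind of observation could be automated along the following lines. The sketch assumes each genotype is reported as a two-letter string (for example `CT`), that a hyphen marks a no-call, and that each contributor's haplogroup is already known; these conventions, the function names, and the sample data are hypothetical and are not taken from any company's file format or from the preliminary results themselves.

```python
from collections import defaultdict


def is_heterozygous(genotype):
    """True when a two-letter call such as 'CT' shows two different alleles."""
    return len(genotype) == 2 and "-" not in genotype and genotype[0] != genotype[1]


def het_counts_by_haplogroup(people):
    """Tally heterozygous calls per marker and haplogroup.

    people is an iterable of (haplogroup, {rsid: genotype}) pairs.
    Returns {rsid: {haplogroup: (het_count, called_count)}}, keeping only
    markers that were heterozygous in at least one person.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for haplogroup, calls in people:
        for rsid, genotype in calls.items():
            if "-" in genotype:  # skip no-calls
                continue
            tally = counts[rsid][haplogroup]
            tally[1] += 1
            if is_heterozygous(genotype):
                tally[0] += 1
    return {
        rsid: {hg: tuple(t) for hg, t in groups.items()}
        for rsid, groups in counts.items()
        if any(t[0] for t in groups.values())
    }


# Made-up calls for three hypothetical contributors.
people = [
    ("R1b1c", {"rs0000010": "CT", "rs0000011": "GG"}),
    ("R1b1c", {"rs0000010": "CT", "rs0000011": "GG"}),
    ("I", {"rs0000010": "CC", "rs0000011": "--"}),
]

report = het_counts_by_haplogroup(people)
for rsid, groups in report.items():
    print(rsid, groups)
```

Reporting the number of called genotypes alongside the heterozygous count makes it easier to see whether a pattern rests on several individuals or on a single observation.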
The sheer quantity of raw data is unprecedented, and genetic genealogists are in a
position to help interpret it.
Ann Turner
Web Resources
http://dnacousins.com/SNPs_on_Chips.xls
http://dnacousins.com/SNPs_on_Chips.zip
Y-SNPs on Chips Database
References
Update (
[1] The current version of the ISOGG Y Phylogenetic Tree can be found at http://www.isogg.org/tree.
[2] Ordering information is
available only on the personal results page of people who have obtained
[4] http://www.dnaheritage.com/ysnp.asp
[6] http://www.23andme.com, http://www.decodeme.com, http://www.seqwright.com, http://www.geneessence.com
[8]
[9] Occasionally, some SNPs cannot be “called,” or assigned a value, and the assignment may not be quite as deep, e.g., R1b1c*.
[11] Not all markers in this tree have rs numbers.
[12] A large-scale deletion, approximately 4 million bases roughly between
positions 23,000,000 and 27,000,000, was described by King et al. (2005). This region contains a commonly tested Y-