Journal of Genetic Genealogy



Using a Y-DNA Surname Project to Dig Deeper Into Your Genealogy: A Case Study

Allan H. Westreich, Ph.D.

Address for correspondence:
Allan H. Westreich, Ph.D., 250 Route 28, Suite 206, Bridgewater, NJ 08807

The primary goal of this study was to test the value of genetic genealogy to help break through brick walls encountered with traditional genealogical methods. More specifically, a Y-DNA surname project was used to better understand the possible connections between separate family tree branches associated with the surname of Westreich. Each project member tested his Y-DNA with the 37-marker short tandem repeat (STR) test. By combining the DNA results with genealogical knowledge obtained through traditional methods, separate Westreich family tree branches were connected into a single merged tree, thus widening each individual tree with new-found cousins. Also, a deeper rabbinical branch enabled the other branches to extend several generations further back in time. Even a non-Westreich branch was connected to the tree. These conclusions were reached through an integrative genealogical approach which combined both genetic and traditional (non-genetic) genealogical methods, and not by either one alone. However, there are definite limitations to these conclusions, the primary one being they are not definitive but based on probabilities. For example, the primary conclusion that all of the project members belong to the same recent paternal lineage is “very likely.” The secondary goal of this study was to clearly document the use of a Y-DNA surname group from start-to-finish to assist others in applying this relatively new technology to their family tree(s) of interest.

Traditional paper-based genealogical research inevitably hits a brick wall, either temporary or permanent. In recent years, another tool has become available to help break through these roadblocks: genetic genealogy. Since each of us shares some DNA with our ancestors, and therefore with our siblings and cousins, the hunt for elusive distant (or sometimes close) relatives can be supplemented by looking for people with whom we share significant amounts of DNA.

 

The primary goal of this study was to test the value of genetic genealogy in better understanding the possible connections between separate family tree branches associated with the surname of Westreich, thereby demonstrating a generalizable framework useful for studying other surnames. Are the separate branches actually connected to each other? If so, members of one branch can incorporate information from the other branches to expand their genealogical knowledge both breadthwise (new cousins in recent generations) and depthwise (new ancestors in earlier generations).

 

The best type of DNA to use for a surname study is Y-DNA (Estes, 2016) since both surname and Y-DNA are transmitted relatively unchanged down the male line, father-to-son-to-son-to-son. Only males have Y-DNA. Therefore, candidates for DNA testing for this study are males with the surname of Westreich. The overall strategy is to compare the DNA of the tested individuals to determine the likelihood of blood relationships.

 

All of the currently known Westreich branches descend from Jewish ancestors from current-day southeastern Poland. This area was formerly known as Galicia, part of the Austro-Hungarian Empire from the late 1700s through the early 1900s. The paper trail stops for most of these family histories in the early 1800s.

 

The earliest known paper-based pedigree belongs to a branch of Galician Westreich rabbis dating back to the early 1700s (Wunder, 1981). It is not unusual for rabbis to have earlier-known family histories since they were considered the royalty of their time and their genealogies were often well-documented (Paull & Briskman, 2015). If descendants from the other Westreich branches are able to match their Y-DNA with descendants from the rabbinical line, then they will be able to extend their family branch back by another three or so generations. Similarly, several Jewish genealogical studies (Paull & Briskman, 2015; Paull, Rosenstein & Briskman, 2016; Paull, Briskman & Twersky, 2016; Akaha & Unkefer, 2015) have tried to identify the Y-DNA “signatures” of renowned rabbinical lines as references to which others can attempt to match and therefore extend back their family trees. Perhaps this strategy is most deftly called “is a rabbi hiding in your family tree?” (Akaha & Unkefer, 2015).

 

This study uses an integrative genealogical approach, which combines both genetic and traditional (non-genetic) genealogical data and methods. When weighing genealogical hypotheses, it is best to consider all available evidence, both genetic and non-genetic (Bettinger, 2016a; Bettinger, 2016b), whether supportive or dismissive. Even so, the solutions to complex problems are often probabilistic, not definitive.

The secondary goal of this study was to clearly document the use of a Y-DNA surname group from start-to-finish to assist others in applying this relatively new technology to their family tree(s) of interest.

Methods

A Y-DNA Surname Project was established with Family Tree DNA (FTDNA; Houston, Texas, USA). This provided the centralized “location” for DNA testing, DNA comparisons, and communication among the group members.

 

Candidates were identified for Y-DNA testing. The basic requirements were male gender with the surname of Westreich or one of its variants (e.g., Westrich, Vestraich). Ideal candidates would have a long-documented family tree, which would allow others in the group of testers to significantly extend their family trees back in time if their DNA matched. Already-known close relatives (e.g., brothers, first cousins) of another tester were not necessary as they would provide little, if any, new information. If female Westreich descendants were identified, they were a potential resource for finding male Westreichs in their local family branch, e.g., a brother or paternal uncle.

 

Internet use was fundamental to finding and contacting these testing candidates. Useful sites included general search engines, genealogical records and sharing sites, online directories, and social networking sites. Sometimes the search process included “reverse genealogy” (Taylor, 2009), where the starting point was a known male Westreich from the past, such as Rabbi Israel Hillel Westreich born circa 1720 (Wunder, 1981), and the search looked forward in time for his living male Westreich descendants. This is the reverse of a typical genealogy search where the starting point is a living person and the goal is to find their earlier ancestors.

 

Male Westreich descendants from six separate Westreich family trees were identified and contacted. Four were successfully recruited to join the FTDNA Surname Project and test their Y-DNA. It took approximately 1 year to identify, recruit, and obtain the DNA results of these four participants.

 

Table 1 presents the basic genealogical information known about each group member’s most distant known ancestor (MDKA), i.e., the highest link in each of the four separate family trees. The first names of the group members have been omitted for privacy reasons. The geographical location(s) associated with each MDKA are where the ancestor may have been born, lived, or died. Also, while these locations are all in modern-day Poland, they may have been part of Galicia, Austria during the lifetime of the MDKA.

 

The DNA test used was the Y-DNA 37-marker short tandem repeat (STR) test. This is the standard for initial testing of members in a DNA surname group (Gleeson, 2016a). As stated above, Y-DNA is used because both surname and Y-DNA are transmitted relatively unchanged down the male line, father-to-son-to-son-to-son. STR markers are locations on the Y-DNA that contain a variable number of repeated patterns of genetic information. Each marker tested yields a value, called an allele, which is the number of repeated patterns at that location. For example, marker DYS393 (DNA Y-Chromosome Segment 393) may have the value of 12, meaning that the genetic pattern of nucleotide bases AGAT (adenine-guanine-adenine-thymine; Wikipedia, 2016) was repeated 12 times. (Each marker has a different genetic pattern of nucleotide bases associated with it.)
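To make the idea of an allele value concrete, the following minimal Python sketch counts how many times a motif repeats back to back in a DNA string. The function name `max_tandem_repeats` and the flanking bases in the example are illustrative assumptions; a testing laboratory derives allele values from instrument data, not from a clean text string like this one.

```python
def max_tandem_repeats(sequence: str, motif: str) -> int:
    """Return the largest number of consecutive copies of `motif` in `sequence`."""
    if not motif:
        raise ValueError("motif must not be empty")
    best = 0
    for start in range(len(sequence)):
        count = 0
        pos = start
        # Walk forward one motif at a time while the motif keeps repeating.
        while sequence.startswith(motif, pos):
            count += 1
            pos += len(motif)
        best = max(best, count)
    return best


# Hypothetical stretch of sequence: made-up flanking bases around 12 AGAT repeats.
example = "GGC" + "AGAT" * 12 + "TTC"
print(max_tandem_repeats(example, "AGAT"))  # 12, i.e. an allele value of 12
```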

 

STR values change (mutate) slowly over generations, so they are useful for testing for relatedness within a recent “genealogical timeframe” of existing surnames, paper records, etc. (roughly the previous several hundred years; Gleeson, 2015a). Descendants of a recent common ancestor should have the same or similar STR values for each of the markers tested.

 

Thirty-seven markers are generally considered to be optimal for initial testing (Gleeson, 2016a). More markers (e.g., 67, 111) can yield a slightly higher resolution at a greater expense, while fewer (e.g., 12, 25) may not adequately distinguish family lines from one another.

 

The primary results of each group member’s Y-DNA test were their paternal haplotype and haplogroup. A haplotype is a list of the allele values for each of the markers tested. An example haplotype is shown in Table 2. In this example, for a tester named “Male1”, marker DYS392 has its genetic pattern (TAT, or thymine-adenine-thymine; Wikipedia, 2016) repeated 11 times. The values in this example are fictitious for the sake of confidentiality. Only 12 markers are shown for simplicity’s sake; each member in this study tested at least 37 markers.

 

A Y-DNA haplogroup, similar to a Y-DNA haplotype, represents a group of men who share the same paternal ancestry. A new haplogroup is defined by a mutation of a single nucleotide polymorphism (SNP). SNP mutations occur on a random basis at a rate much, much slower than STR mutations. If SNP mutations did not occur, all men would belong to the same haplogroup. Compared to a haplotype, a haplogroup: (1) is much, much larger, consisting of groups and sub-groups (called sub-clades) of many more individuals; and (2) originates from much, much more ancient ancestry (dating back to tens of thousands of years ago), thus is not considered to be within a genealogical timeframe.

 

FTDNA reports a “predicted” haplogroup as part of the results from STR testing. This prediction is based on the STR haplotype. To confirm the haplogroup, SNP testing must be performed. Haplogroup names begin with letters, followed by numbers and letters to specify sub-groups. Some examples are B, J2a, and R1b1a. Since these names can get quite long, a shorthand notation has been developed, e.g., J2a1b is also called J-M67, where J is the topmost haplogroup and M67 is the bottommost SNP that defines the sub-branch.

 

At the heart of this study is using DNA to help determine whether two or more people descend from a recent common paternal ancestor and thus belong to the same recent paternal lineage group. Truth be told, DNA can never provide 100% proof of this. However, current DNA technology can help determine whether there is a high probability (or not) of this being true.

 

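A high probability of recent relatedness is based on multiple potential sources of evidence, both genetic and non-genetic (Gleeson, 2015a). It is important to weigh the totality of the genetic and non-genetic evidence, both supportive and dismissive, when considering a genealogical hypothesis. The criteria to be considered for grouping members into the same recent paternal lineage, each of which is applied in the Results section, are: low genetic distance; same haplogroup; same MDKA; same surname; similar geography; shared rare marker values; and same ethnicity and/or ethnic-based traditions.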

 

 

Now the question arises, what genetic distance is considered small enough to conclude that two people are related within a genealogical timeframe (roughly, the past several hundred years)? For 37-marker tests with people of the same surname, FTDNA reports possible “matches” if the genetic distance is less than or equal to 4 (FTDNA Learning Center, 2016c). More specifically, FTDNA uses the guidelines in Table 5 (FTDNA Learning Center, 2016a) for assessing the degree of relatedness based on genetic distance. Note that these are guidelines and are not absolute.
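The comparison described above can be sketched in a few lines of Python. This is a simplified illustration, not FTDNA's actual algorithm: it assumes a plain stepwise model in which genetic distance is the sum of the absolute differences in repeat counts across the markers both men tested, whereas the company's own calculation may treat certain markers differently. The haplotype values are fictitious, and the names `genetic_distance` and `is_possible_match` are hypothetical.

```python
from typing import Mapping

# From the text: for 37-marker tests with the same surname,
# FTDNA reports a possible match at a genetic distance of 4 or less.
MATCH_THRESHOLD_37 = 4


def genetic_distance(a: Mapping[str, int], b: Mapping[str, int]) -> int:
    """Sum of absolute allele differences over the markers both haplotypes share.

    Assumption: simple stepwise model; real matching software may differ.
    """
    shared = a.keys() & b.keys()
    return sum(abs(a[m] - b[m]) for m in shared)


def is_possible_match(a: Mapping[str, int], b: Mapping[str, int]) -> bool:
    return genetic_distance(a, b) <= MATCH_THRESHOLD_37


# Fictitious values, four markers only for brevity.
tester_a = {"DYS393": 12, "DYS390": 23, "DYS19": 14, "DYS391": 10}
tester_b = {"DYS393": 12, "DYS390": 24, "DYS19": 14, "DYS391": 11}

print(genetic_distance(tester_a, tester_b))   # 2
print(is_possible_match(tester_a, tester_b))  # True
```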

 

The simplest and most reasonable strategy for grouping members of a surname study into paternal lineages is to initially group them based on the same or similar haplotype, i.e., low genetic distance, and then use the additional genetic and non-genetic factors listed above for corroboration, particularly for borderline cases (Gleeson, 2015a).

 

After the Y-DNA results of the four recruited members were used to group them into recent paternal lineages, additional candidate group members were identified from their Y-DNA matches, who were already in the FTDNA database. These additional candidate group members will be discussed in the Results section below.

 

Once the paternal lineage groups were established, each of which consists of one or more members who have a high probability of sharing a recent common paternal ancestor, a more in-depth look at the closeness of the genetic connection between the members was undertaken. This was done by estimating the time to the most recent common ancestor (TMRCA) and then using this information along with relevant non-genetic evidence to merge the previously separate family trees. This process is detailed in the Results section below.

Results

The results of the Y-DNA STR 37-marker tests for the four recruited members of the Westreich surname project are presented in Table 6. An asterisk (*) denotes that the value is the same for all four testers for a given marker; the actual value is not specified for privacy reasons. Only the shaded values differ from the most frequent values of the other testers. At first glance, note that most of the STR values (144 of 148) are identical across all four members and that all four are considered a “match” to each other based on FTDNA guidelines (genetic distance is less than or equal to 4). We seem to be barking up the right (family) tree!

 

As stated above in the Methods section, the next step is to group the four testers into recent paternal lineage groups based on the following genetic and non-genetic criteria:

 

  • Low genetic distance. The first criterion for grouping into the same paternal lineage is low genetic distance from the modal haplotype of that group, followed by the additional corroborating factors below. In this case, the modal haplotype is simply the haplotype of Male2 and Male4 Westreich. Male1 and Male3 Westreich each have a low genetic distance of 2 from the modal haplotype, and the three markers that differ from the mode (DYS439, DYS389ii, and DYS456) are known to mutate at moderate rates, neither particularly fast nor slow (Wikipedia, 2016). This suggests they are all recently related by FTDNA genetic distance guidelines (see Table 5). This is summarized in Table 7. All four testers meet the genetic distance criterion for belonging to the same recent paternal lineage group.
  • Same haplogroup. As seen in Table 6, all four testers have the same predicted haplogroup of J-M172. However, since the haplogroups were predicted from the haplotypes, there is no new information here. Independently tested SNP values would be necessary to determine the true haplogroups and provide another meaningful piece of evidence.
  • Same MDKA. None of the testers have the same MDKA.
  • Same surname. All of the testers have the surname Westreich. While this evidence suggests that they share a recent paternal ancestor, it alone is not conclusive. Since the surname of Westreich was used in multiple districts across 19th-century western Galicia (Beider, 2004), the competing hypothesis that unrelated individuals adopted the same surname of Westreich is also a possibility.
  • Similar geography. All of the testers descend from 19th-century ancestors from current-day southeastern Poland, formerly part of western Galicia, part of the Austro-Hungarian Empire from the late 1700s through the early 1900s. The earliest known ancestral towns of each of the branches lie within 90 miles of each other. All of these ancestral towns are also within 60 miles of Sedziszow Malopolski, the earliest ancestral town of the rabbinical branch and therefore possibly the source of all of the Westreich branches in this study. In addition, based on information obtained from traditional genealogical sources, both Male1 and Male4 have ancestors who lived in Brzesko (also known as Brigel in Yiddish), and Male3 and Male4 both have ancestors who lived in Grybow.
  • Shared rare marker values. If testers share marker values that are uncommon in members of their larger haplogroup, this is evidence supportive of a relationship. Estes (2013) considers the frequency cutoff for “very rare” markers as less than or equal to 6% within the haplogroup. For the marker YCAIIa, all four testers have the value of 23, which occurs with a 1% frequency within the larger haplogroup J2 (the haplogroup defined by the SNP M172, hence the name J-M172) (Rootsweb, 2016).
  • Same ethnicity and/or ethnic-based traditions. All of the testers’ ancestors (as best as can be determined) share the same religion, Ashkenazi Judaism. It is a longstanding Ashkenazi Jewish tradition to name children after a deceased ancestor. If two Jewish family trees share given names, this is suggestive of common ancestors. The earliest known ancestor of Male4 is Rabbi Israel Hillel Westreich, whose grandson with the same not-so-common given name undoubtedly was named after him. Male1 also has an Israel Hillel Westreich in his family tree. Furthermore, the grandson Israel Hillel in Male4’s tree lived in Brzesko and died in 1846 (Wunder, 1981). The Israel Hillel in Male1’s tree also lived in Brzesko and was born in 1849, suggesting that he may have been named after his recently deceased ancestor from Male4’s tree.
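The first criterion above rests on the modal haplotype, i.e., the most common value at each marker across the group. The following Python sketch shows one way it could be computed. All allele values are fictitious and were chosen only so that the resulting distances mirror the reported pattern (2, 0, 2, 0). The distance function assumes a simple stepwise model, and the function names are hypothetical.

```python
from collections import Counter


def modal_haplotype(haplotypes):
    """Most common allele at each marker across all testers.

    Ties, if any, go to the value seen first (an arbitrary choice in this sketch).
    """
    markers = next(iter(haplotypes.values())).keys()
    return {
        m: Counter(h[m] for h in haplotypes.values()).most_common(1)[0][0]
        for m in markers
    }


def distance(a, b):
    """Assumed stepwise model: sum of absolute allele differences."""
    return sum(abs(a[m] - b[m]) for m in a)


# Fictitious values for the three markers that differ; the other
# markers are identical across testers and are omitted here.
testers = {
    "Male1": {"DYS439": 12, "DYS389ii": 29, "DYS456": 16},
    "Male2": {"DYS439": 11, "DYS389ii": 29, "DYS456": 15},
    "Male3": {"DYS439": 11, "DYS389ii": 30, "DYS456": 14},
    "Male4": {"DYS439": 11, "DYS389ii": 29, "DYS456": 15},
}

modal = modal_haplotype(testers)
for name, haplotype in testers.items():
    print(name, distance(haplotype, modal))
```

With these made-up values the modal haplotype equals that of Male2 and Male4, and Male1 and Male3 each sit at a distance of 2 from it.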

 

After reviewing all of the above evidence in its totality (low genetic distances and significant corroborating evidence, both genetic and non-genetic), it is very likely that all four Westreich testers belong to the same recent paternal lineage, i.e., share a recent common paternal ancestor. Given all of the above evidence, the competing hypothesis that these Westreichs are not related seems quite unlikely.

 

Once the four recruited group members were assigned to a single paternal lineage group, additional candidate members were identified from their Y-DNA matches, comprising people who had previously tested with FTDNA. Particularly for those candidates who do not share the Westreich surname, they must have a very low genetic distance to the group modal haplotype as well as additional corroborating evidence in order to be considered part of the same recent paternal lineage group.

 

One such candidate surfaced. Male1 Taffel (given name again omitted for privacy) is a perfect 37 of 37 match with the modal haplotype, i.e., he has a genetic distance of 0 from the modal haplotype. There are also several corroborating factors. His predicted haplogroup of J-M172 is the same. His paternal ancestors are also from Galicia. More specifically, they lived in Sedziszow Malopolski (aka Shendishov or Shendishov Malopolski in Yiddish) which, very interestingly, is the same as the location of the earliest known Westreich rabbi ancestor of Male4 (Wunder, 2016). Male1 Taffel’s haplotype shares the same rare marker value of 23 at YCAIIa. His ancestors also share the same Ashkenazi Jewish religious background.

 

Based on the above evidence, Male1 Taffel is very likely to belong to the same paternal lineage group as the four Westreich members. The complete list of group members appears in Table 8.

 

The final challenge of this study was to combine the knowledge gained from the DNA testing with data obtained from traditional genealogical methods to produce a single, merged family tree of the group members. Before DNA testing, all of the family trees of the group members were separate from each other, as shown in Figure 1.

 

As a result of DNA analysis, the members have been grouped into a single recent paternal lineage, i.e., they all share a common paternal ancestor within a genealogical timeframe. Therefore, for each pair of separate family trees, the MRCA is the connecting point. If the number of generations from the testers to the MRCA can be determined for each pair of family trees, then the two trees can be merged. That number of generations is defined as the time to MRCA (TMRCA). (To approximate the number of years from the testers until the MRCA, simply multiply the number of generations by 30; Gleeson, 2016b.)
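The conversion from generations to years mentioned above is simple arithmetic, sketched here in Python. The 30-year figure comes from the text; the function name is a hypothetical label for illustration.

```python
YEARS_PER_GENERATION = 30  # rule of thumb quoted in the text (Gleeson, 2016b)


def generations_to_years(generations: int) -> int:
    """Approximate number of years spanned by a given number of generations."""
    return generations * YEARS_PER_GENERATION


for g in (5, 8):
    print(g, "generations is roughly", generations_to_years(g), "years")
```

A TMRCA of five generations therefore corresponds to roughly 150 years before the testers, and eight generations to roughly 240 years.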

 

Merging two family trees based on TMRCA is illustrated in Figure 2 with an overly simplified example in which two men know their respective fathers and grandfathers and the TMRCA is five generations back from both tested men.

 

Keep in mind that the names of the MRCA and of all the generations from the MDKA up to and including the MRCA are not revealed by DNA and therefore still cannot be included in the tree.
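The merging step can be pictured as padding each known paternal line with unnamed placeholders until both lines reach the shared MRCA. The Python sketch below follows the simplified example above (each man knows his father and grandfather, and the TMRCA is five generations). The names, placeholder labels, and function name are all hypothetical.

```python
UNKNOWN = "(unknown ancestor)"
MRCA = "(unnamed MRCA)"


def pad_to_mrca(known_line, tmrca_generations):
    """Extend a known paternal line, listed from the tester upward, to the MRCA.

    The tester is generation 0, so the MRCA sits at index `tmrca_generations`.
    Generations that DNA cannot name are filled with placeholders.
    """
    if len(known_line) > tmrca_generations:
        raise ValueError("known line already reaches or passes the MRCA")
    gaps = tmrca_generations - len(known_line)
    return list(known_line) + [UNKNOWN] * gaps + [MRCA]


branch_a = pad_to_mrca(["Tester A", "Father A", "Grandfather A"], 5)
branch_b = pad_to_mrca(["Tester B", "Father B", "Grandfather B"], 5)

# Both branches now end at the same placeholder, which is where the trees join.
merged_tree = {"mrca": MRCA, "branches": [branch_a, branch_b]}
print(branch_a)
```

Each padded branch has six entries: three known men, two unnamed generations, and the unnamed MRCA at the top.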

 

To connect the separate trees of the four Westreich testers, the TMRCA needs to be identified between each pair of trees. Although the above simplified example uses a TMRCA of five generations, in practice the exact number is rarely known. The best we can do is to approximate a probable range of number of generations until the MRCA. Both genetic and non-genetic evidence can be very helpful in refining the endpoints of these ranges, as illustrated below.

 

Fortunately, in this example where all the individuals in the family trees have the surname Westreich and therefore their MRCA would very likely also have the Westreich surname, the early end of the TMRCA range is constrained by the time of adoption of Jewish surnames. Jewish surnames were mandated by the Austrian government in 1787, and specifically in western Galicia in 1805 (Paull & Briskman, 2014). Before this time period, Ashkenazi Jews typically did not have surnames with the possible exception of rabbinical lines.

 

Therefore, the first man in the Westreich rabbinical line to adopt the surname Westreich was highly likely to be either Israel Hillel (born circa 1720) or Yosef Yoska (born circa 1750) (Wunder, 2016). And therefore, the earliest possible MRCA between the rabbinical line and each of the other Westreich trees is highly likely to be either Israel Hillel (born circa 1720), Yosef Yoska (born circa 1750), or an unknown brother of Yosef Yoska.

 

The recent end of the range of possible MRCAs is also fairly tightly constrained in this example. For Male1 and Male2, based on the birth year of their MDKA, the latest possible MRCA with the rabbinical line is Israel Hillel Westreich (born circa 1780). For Male3, based on the birth year of his MDKA, the latest possible MRCA with the rabbinical line is Yosef Yoska (born 1810). In fact, this is quite possible since Male3’s MDKA Abraham Westreich (born 1845) as well as Yosef Yoska Westreich (born 1810) were born in Grybow, Poland.

 

In summary, the MRCA for the rabbinical tree with each of the other Westreich trees lies in the range of Israel Hillel Westreich (born circa 1720) and Israel Hillel Westreich (born circa 1780), with the possible exception for Male3 extending down to Yosef Yoska (born 1810). The resulting merged Westreich family tree is illustrated in Figure 3.

 

The last step is to merge the Westreich tree with the Taffel tree, thus creating a single merged tree of all the testers in the single paternal lineage. The most likely reason that these branches have individuals with different surnames is that the MRCA lived before the adoption of Jewish surnames (although this could also be explained by a “non-paternity event”; Estes, 2016). Using the same reasoning as above, the most recent ancestor in the Westreich line without the Westreich surname is likely Israel Hillel (born circa 1720) or his father. So that is the recent end of the TMRCA range.

 

Since we have no paper-based information to determine the early end of the TMRCA range, genetic tools are used. FTDNA estimates the TMRCA given the genetic distance between two haplotypes (see Table 9; FTDNA Learning Center, 2016b). TMRCA is calculated as a set of probabilities that the MRCA lived no more than “x” generations ago. In our example, we want to calculate the TMRCA between the merged Westreich tree and the separate Taffel tree. Therefore, we compare the modal haplotype of the Westreich testers with the haplotype of the Taffel tester. In this case, they are the same, i.e., the genetic distance is 0. Therefore, there is a 50% chance that the MRCA lived no more than two generations before the testers, a 90% chance within five generations, and a 95% chance within seven generations.

 

FTDNA also has a TiP (time predictor) tool (International Society of Genetic Genealogy Wiki, 2016) that further refines the TMRCA estimate. In addition to genetic distance, it takes into account the average mutation rates for each marker to give a more precise estimate. Continuing with our example, the TiP estimates are a 59% chance that the MRCA lived no more than two generations before the testers, a 93% chance within six generations, and a 97% chance within eight generations.

 

Given that these estimates are far from exact (Estes, 2012) and often overestimate the TMRCA (Akaha & Unkefer, 2015; Paull, Briskman & Twersky, 2016), a conservative estimate (Unkefer, 2014) for the early end of the TMRCA is eight generations. This corresponds with Israel Hillel’s father (born circa 1690), who is also in the recent end of the TMRCA range. The resulting merged group family tree is illustrated in Figure 4 and is consistent with the MDKAs of the Westreich rabbinical tree and the Taffel tree both having lived in the same location, Sedziszow Malopolski.
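One way to read the two sets of estimates side by side is sketched below in Python. The probabilities are the ones quoted in the text for a genetic distance of 0. Taking the larger of the two generation counts at a chosen confidence level is an assumption made here for illustration and is not necessarily how the cited conservative estimate was derived. The function name is hypothetical.

```python
# Cumulative probabilities quoted in the text for a genetic distance of 0:
# {generations: probability that the MRCA lived no more than that many generations ago}
FTDNA_TABLE = {2: 0.50, 5: 0.90, 7: 0.95}
TIP_TOOL = {2: 0.59, 6: 0.93, 8: 0.97}


def generations_at_confidence(estimates, confidence):
    """Smallest listed generation count whose probability meets the confidence level."""
    eligible = [g for g, p in estimates.items() if p >= confidence]
    if not eligible:
        raise ValueError("no listed estimate reaches that confidence level")
    return min(eligible)


table_bound = generations_at_confidence(FTDNA_TABLE, 0.95)  # 7
tip_bound = generations_at_confidence(TIP_TOOL, 0.95)       # 8

# Assumed rule for this sketch: keep the larger (more cautious) bound.
conservative_bound = max(table_bound, tip_bound)
print(conservative_bound)  # 8
```

Under this reading, the bound of eight generations matches the early end of the TMRCA range used above.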

 

Conclusions

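The goals of this study were to: (1) test the value of genetic genealogy in better understanding the possible connections between separate family tree branches associated with the surname of Westreich; and (2) clearly document the use of a Y-DNA surname group from start to finish to assist others in applying this technology to their family tree(s) of interest.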

 

 

This Y-DNA project contributed significantly to breaking through some of the brick walls that had been reached in studying the Westreich family. By combining the genetic results with information obtained from traditional genealogical methods, separate Westreich family tree branches were connected into a single merged tree, thus widening each individual tree with new-found cousins. The deeper rabbinical branch enabled the other branches to extend several generations further back in time. And even a non-Westreich branch was connected to the group.

 

However, there are definite limitations to this study, as with genealogical research in general. First and foremost, conclusions drawn from genetic genealogy alone are not definitive, particularly for positive results. Even when combined with known evidence from traditional genealogical methods, all of the conclusions in the above paragraph are “very likely” and not absolute, including the fundamental one that all of the group participants belong to the same recent paternal lineage. The number of generations in DNA-connected branches is approximate at best, and the names of the missing generations will never be supplied by DNA alone.

 

The net result is that the addition of a Y-DNA surname project to information already gathered by traditional genealogical methods has generated some very interesting and significant hypotheses regarding the Westreich family genealogy that are very likely to be true. This, in turn, points to future work to further examine and test these new hypotheses:

 

 

Hopefully this article has also met its secondary goal of clearly explaining the methods of a Y-DNA surname project from start-to-finish to assist others in applying this technology to their family tree(s) of interest. One does not have to be a DNA expert, professional genealogist, or rabbinical scholar to conduct this type of research. The primary requirements are logic and persistence. Or as Thomas A. Edison (1901) said, “1 percent inspiration and 99 percent perspiration.”

 

Acknowledgments

The author would like to thank all of the Y-DNA Surname Project participants, without whom this research would not have been possible.

 

Conflicts of Interest

The author declares no conflicts of interest.

References