Genetic Structure of an Isolated Sub-Tribe of the Adi People of Arunachal Pradesh State in Northeast India: Isonymy Analysis and Selective Neutrality of Surname Distribution in Adi Panggi

 

 

Suvendu Maji  and T. S. Vasulu

 

 

Abstract

 

The distribution of surnames was studied to infer aspects of population structure, namely migration and genetic drift, and to test the expectation of surname neutrality with respect to progeny size distribution among the Panggi (Pangi), a small isolated Tibeto-Burman sub-tribe of the Adi tribe subsisting on hunting and gathering in the Upper Siang district of Arunachal Pradesh, India.  Random isonymy (I), Karlin-McGregor’s ν and Fisher’s α were estimated.  Log-log and Pareto distributions of surname occurrence among husbands and wives were also studied to test surname neutrality.  The estimates of homonymy vary between villages and show greater variation among husbands than wives.  The log-log distributions of surname occurrence in the patrilocal tribe show a nonlinear trend, and the pattern differs between the husband and wife samples.  The results also show a non-linear trend of surname distribution with respect to progeny size among the post-reproductive and reproductive wives and their husbands.  The surname analysis among the Adi Panggi provides some insight into the genetic structure of the isolated tribe, which will help molecular genetic studies to unravel the differential paternal and maternal genetic history of the tribe.

 

 

 

Address for correspondence:  T. S. Vasulu, vasulu@gmail.com

 

Received:  26 Oct 2007; accepted:  05 Feb 2008.

 

 

 

Introduction

 

Surnames are a unique bio-cultural trait that provides a convenient means of investigating microevolution in human populations.  The patrilineal mode of inheritance of surnames mimics highly polymorphic genes on the Y-chromosome; the non-biological nature of their dispersal is expected to be independent of fertility and mortality differentials and therefore satisfies the expectations of the neutral theory of evolution (Kimura, 1980), which has been described by Karlin and McGregor (1967) as the theoretical distribution of different mutant forms that are maintained in a population under the equilibrium between random genetic drift, mutation, and migration.  The occurrence of different surnames in human populations also conforms to Fisher’s logarithmic (Chakraborty et al, 1981) and Pareto’s discrete (Fox 1983) distributions.

 

The earliest applications of theoretical models to surname distributions were among the parishes in the Parma valley (Yasuda et al, 1974) and in island populations of Sardinia (Zei et al, 1983a, b).  Since then, there has been progress in the use of isonymy studies to investigate the genetic structure of a wide variety of populations, particularly in European and Latin American countries (Barrai et al, 1987; Barrai et al, 2003; Colantonio et al, 2003).  These studies have shown that the logarithmic distribution of surname frequency follows a linear trend (Barrai et al, 1987) in conformity with the neutral allele model (Kimura, 1983); however, a few studies have shown deviation from the expected linear trend (Barrai et al, 1987; Barrai et al, 2002; Barrai et al, 2003).  These isonymy studies have revealed (a) geographical patterns in surname distributions as a result of differential migration, as demonstrated in Ferrara, an immigrant community, Italy (Barrai et al, 1987; Barrai et al, 1989; Barrai et al, 1990), Austria (Barrai et al, 2000), Perugia (Rodriguez et al, 1993), and Spain (Rodriguez et al, 2003); (b) the reflection of social and natural selection in Denmark (Boldsen 1992); and (c) deviation from linearity as a result of excesses of surname repetition, especially in the Netherlands (Barrai et al, 2002), Sicily (Scapoli et al, 1997) and Belgium (Barrai et al, 2003).  The majority of the cited studies are based on large samples covering an entire nation or region; such studies have rarely been conducted among small isolated populations.  It is therefore of interest to examine the expectation of neutrality of surname distribution in small populations, especially among isolated tribes, since demographic events and marriage practices bring rapid changes along kinship lines that influence the surname structure, and in such situations neutrality of the surname distribution may not be expected.

 

This study describes the surname distribution and examines the neutral allele model in an isolated small tribe viz., Adi Panggi--one of the several sub-tribes of the Adi tribal cluster--inhabiting the Upper Siang River Valley in central Arunachal Pradesh State, India.

 

Materials and Methods

 

Adi Tribe

 

The Adi Tribe consists of several sub-tribes inhabiting different altitudes of the southeastern part of Himalayan mountain terrain along the Siang river valley in the central region of the State of Arunachal Pradesh (Roy 1960; Singh 1994; Blackburn 2004; Lego 2005).  In general, an Adi tribal village consists of a group of families belonging to a few specific clans living together at different locations of the valley.  Adi Panggi is one of the smallest isolated sub-tribes.  They reside in 7 villages situated in different valleys or hill slopes over an area of about 50 square km in Geku Circle, Upper Siang district, and number about four thousand individuals (Koley 2005).  The sub-tribe speaks the Adi language, a member of the Tibeto-Burman linguistic family.  The northern part of the region shares a border with China.  Although the Adi sub-tribes share a common historical migration, possible common origin, and linguistic and cultural affiliations, each sub-tribe forms a separate group that maintains its identity.  Each sub-tribe is geographically  isolated, practices high endogamy, and has specific clans and surname structure different from other sub-tribes of the Adi (Roy 1960; Lego 2005; Koley 2005).  Like other tribes in India, they are patrilocal, patrilineal, and patriarchal: sons tend to stay in the village or nearby villages, while daughters migrate to their husband’s village after their marriage and adopt their husband’s surname (Das 1953).

 

The clan and surname are indicative of putative origin from a possible common ancestral stock and indicate genetic kinship.  The surname structure plays an important role in marriage, warfare, hunting, and cultural activities, and is stable over generations.  These factors make surname analysis useful for investigating the genetic structure of the population.  Only a few studies on the Adi Panggi have been reported, e.g., on ABO polymorphism (Bhattacharjee 1954; Krithika et al, 2006), cultural aspects (Sharma 1960), and anthropometric variation (Roy 1966), together with a recent ethnographic study (Koley 2005).

 

Sample

 

Demographic data and blood samples of the Adi Panggi (Pangi) tribal population were collected from six villages in Geku circle (Figure 1) for a molecular population genetics study among the Adi tribe of Arunachal Pradesh, India.  The study was approved by the ethical committee of the Indian Statistical Institute, Kolkata.  The genetic analysis will be the subject of an upcoming article.

 

 

Figure 1.  Map showing the location of different Panggi villages ( ) in the Upper Siang District along the Siang River Valley in Arunachal Pradesh, India.

 

 

 

For the surname analysis, the surname of the husband and the maiden surname of the wife were collected through pedigree data from a field survey in 2006 (Maji et al. 2007).  The surname data were collected from 154 husbands and 130 wives.  Of the seven villages, six villages were studied for surname distribution: Sumsing (SS) and Sibum (SB) are remotely located around 15-30 km away from Geku Town (GT) whereas the remaining three villages, Ramku (RK), Kumku (KK) and Peram (PR), are located close to GT.

 

Isonymy Analysis

 

Random isonymy (I) was calculated in each village from the frequency distribution of each surname, separately for males ($p_i$), for females ($q_i$), and for both together.  If $p_i$ and $q_i$ are the frequencies of the ith surname in males (husbands) and females (wives) respectively, then the random isonymy estimates among males, among females, and between males and females are:

 

$$I = \sum_i p_i^2, \qquad I = \sum_i q_i^2, \qquad \text{and} \qquad I = \sum_i p_i q_i$$

 

Since random isonymy is biased by variation in sample size, an unbiased isonymy and its variance were estimated (Dyke et al, 1983; Relethford 1988; Barrai et al, 1989; Barrai et al, 1991).  If there are Ni males and Nj females in the samples, the unbiased random isonymy Iij is defined as:

 

$$I_{ij} = \frac{\sum_s n_{is}\, n_{js}}{N_i \, N_j}$$

 

where nis and njs are the numbers of individuals with surname s in populations i and j, respectively, Ni and Nj are the total numbers of individuals (surname bearers) in populations i and j, and the summation is over all surnames s.
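The two estimators above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names `random_isonymy` and `cross_isonymy` and the toy surname lists are hypothetical and are not taken from the study, and the sketch does not reproduce the variance estimate.

```python
from collections import Counter


def random_isonymy(surnames):
    """Sum of squared relative surname frequencies within one sample."""
    counts = Counter(surnames)
    total = len(surnames)
    return sum((n / total) ** 2 for n in counts.values())


def cross_isonymy(sample_i, sample_j):
    """I_ij = sum_s(n_is * n_js) / (N_i * N_j) for two samples."""
    counts_i = Counter(sample_i)
    counts_j = Counter(sample_j)
    # A Counter returns 0 for surnames absent from the other sample.
    shared = sum(n * counts_j[s] for s, n in counts_i.items())
    return shared / (len(sample_i) * len(sample_j))


# Hypothetical toy data, NOT the Adi Panggi field data.
husbands = ["A", "A", "B", "C"]
wives = ["A", "B", "B", "D"]

i_husbands = random_isonymy(husbands)       # 0.5^2 + 0.25^2 + 0.25^2
i_between = cross_isonymy(husbands, wives)  # (2*1 + 1*2) / (4*4)
print(i_husbands, i_between)
```

With these toy lists the within-sample value is 0.375 and the between-sample value is 0.25, which can be checked by hand from the formulas.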

 

Surname Distribution and Neutral Allele Model

 

Since surnames mimic a genetic trait on the Y-chromosome, and the surname distribution in a large population also correlates with genetic diversity (Barrai et al, 1996), the distribution is expected to conform, under certain assumptions, to the selective neutrality of the infinite allele model described by Karlin and McGregor (1967), which can be accounted for by the logarithmic distribution first given by Fisher.  That is, if we let S represent the number of times that a surname occurs in a population, and let K represent the frequency of S (i.e., K is the number of surnames that occur exactly S times), then we may characterize the distribution by graphing log K versus log S.  Karlin and McGregor’s ν and Fisher’s α were estimated for the Adi Panggi following the formulae proposed by Zei et al. (1983a, 1983b) and Barrai et al. (1992) respectively, where

 

$$\alpha = \frac{1}{I_{ij}} \qquad \text{and} \qquad \nu = \frac{\alpha}{N_i + \alpha}$$

 

and Iij is the random isonymy, Ni is the total number of individuals, α is a measure of surname diversity, and ν is a measure of migration into the population.
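As a worked illustration of these two formulae, the following Python sketch computes α from an isonymy value and ν from α and a sample size. The function names are hypothetical. The input values are taken from this article (the isonymy of 0.1025 reported in the Results, and the husbands' row of Table 3), but the sketch only checks the arithmetic of the formulae and is not the authors' estimation procedure.

```python
def fisher_alpha(isonymy):
    """Surname diversity: alpha = 1 / I."""
    return 1.0 / isonymy


def migration_nu(alpha, n_individuals):
    """Migration parameter: nu = alpha / (N + alpha)."""
    return alpha / (n_individuals + alpha)


# alpha implied by the unbiased random isonymy reported in the Results.
alpha_from_isonymy = fisher_alpha(0.1025)

# nu implied by the husbands' alpha and sample size listed in Table 3.
nu_husbands = migration_nu(5.4830, 154)

print(round(alpha_from_isonymy, 4), round(nu_husbands, 4))
```

For the husbands' row the computed ν rounds to 0.0344, which agrees with the value listed in Table 3.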

 

Further, the $\log_2 S$-$\log_2 K$ distribution of surname occurrence (S) and its frequency (K) was considered separately for husbands, for wives, and for the total sample in the present study.  This distribution is expected to show a linear trend under the assumptions of the neutral allele model (Kimura, 1983).  We also considered the Pareto distribution (Fox 1983), under which the relationship between the logarithms of n and k is expected to be linear, where n is the number of individuals and k is the number of surnames, but the results are presented for the log-log distribution only.
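A minimal Python sketch of how such a log-log distribution can be tabulated and checked for linearity is given here. The helper names `occurrence_spectrum` and `loglog_linear_fit`, the ordinary least-squares fit, and the toy sample are assumptions made for illustration; the article does not state which software or fitting routine was used.

```python
import math
from collections import Counter


def occurrence_spectrum(surnames):
    """Map S (times a surname occurs) to K (number of surnames occurring S times)."""
    per_surname = Counter(surnames)
    return dict(sorted(Counter(per_surname.values()).items()))


def loglog_linear_fit(spectrum):
    """Least-squares line through (log2 S, log2 K); returns slope, intercept, R^2."""
    xs = [math.log2(s) for s in spectrum]
    ys = [math.log2(k) for k in spectrum.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical toy sample: four singletons, two surnames seen twice, one seen four times.
sample = ["a", "b", "c", "d", "e", "e", "f", "f", "g", "g", "g", "g"]
spectrum = occurrence_spectrum(sample)
slope, intercept, r_squared = loglog_linear_fit(spectrum)
print(spectrum, slope, intercept, r_squared)
```

The toy sample was chosen so that the points fall exactly on a line of slope -1; real data with an excess of singletons would pull the fit away from linearity and lower R².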

 

Surname Distribution and Progeny Size

         

The occurrence of surnames with respect to variation in progeny size (separately for male, female, and total children) among the post-reproductive women and their husbands was analyzed to investigate the pattern of surname distribution in a system of patrilocal marriage, where males (the kin group, especially brothers) reside in the same village and females move out of the village after marriage.  A similar analysis was carried out for the sample of reproductive wives and their husbands.
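One way to tabulate progeny size against surname occurrence is sketched below in Python. This is an assumption-laden illustration: the function name, the use of the mean number of children per parent within each occurrence class, and the toy records are all hypothetical, and the article's figures may summarize progeny differently (for example as totals).

```python
from collections import Counter, defaultdict


def mean_progeny_by_occurrence(records):
    """records: (surname, number_of_children) pairs, one per parent.

    Returns {S: mean children per parent} where S is how often the
    parent's surname occurs in the sample.
    """
    records = list(records)
    occurrence = Counter(surname for surname, _ in records)
    totals = defaultdict(lambda: [0, 0])  # S -> [children, parents]
    for surname, children in records:
        bucket = totals[occurrence[surname]]
        bucket[0] += children
        bucket[1] += 1
    return {s: c / p for s, (c, p) in sorted(totals.items())}


# Hypothetical toy records, NOT the Adi Panggi pedigree data.
records = [("A", 4), ("A", 2), ("A", 3), ("B", 6), ("C", 2)]
means = mean_progeny_by_occurrence(records)
print(means)
```

Running the same function separately on counts of sons, daughters, and total children, and separately on post-reproductive and reproductive couples, would give the groupings described in the text.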

 

Results

 

The occurrence of different surnames among husbands and wives of the Adi Panggi tribe is shown in Table 1.  There are 18 different surnames among 154 husbands and 22 surnames among 130 wives in the Panggi tribe, distributed over six villages.  There are 15 surnames of non-Panggi origin that have filtered into the population through marriages with non-Panggi wives.  Husbands carry fewer surnames, whereas wives show greater surname diversity.  On average, a single surname is shared by about 8.5% of husbands, while in the case of wives it is about 6%.  The three most common surnames among husbands are Paron, Panyang, and Mongku, which represent about 47% of the husbands.  In the case of wives, the surnames Panyang, Paron, and Taku are the three most common and occur among 57% of Panggi wives.  The seven surnames with a frequency above 5% represent 78.5% of the husbands and 61.5% of their wives.  The most frequent surnames occur among both husbands and wives with similar percentages, and the seven most frequent (above 5% occurrence) constitute 70.7% of the sample.  The sample contains four singly occurring Panggi surnames, representing 1.3% of the total individuals.  Considering both Panggi and non-Panggi surnames, there are 18 singletons that constitute 6% of the sample.  The unbiased random isonymy Iij for the Adi Panggi samples is 0.1025.

 

 

 

Table 1

Frequencies (%) of different surnames among husbands and wives (Panggi and Non-Panggi) in the Adi Panggi tribe of Arunachal Pradesh

Surname ID   Surname            Husbands   Wives (Panggi Only)   Wives (Panggi and Non-Panggi)
                                N=154      N=130                 N=147
1            Paron              15.58      16.92                 14.97
2            Panyang            22.08      29.23                 25.85
3            Taku               5.84       10.00                 8.84
4            Panggeng           8.44       3.85                  3.40
5            Tagi               7.79       3.85                  3.40
6            Mongku             10.39      3.08                  2.72
7            Tosang             2.60       2.31                  2.04
8            Jopir              0.65       0.00                  0.00
9            Taying             1.95       2.31                  2.04
10           Tatung             1.30       1.54                  1.36
11           Ejing              8.44       3.85                  3.40
12           Paloh              3.90       3.08                  2.72
13           Padun              1.30       5.38                  4.76
14           Tateh              3.90       3.08                  2.72
15           Tayom              1.30       1.54                  1.36
16           Aje                1.95       0.00                  0.00
17           Tarang             1.95       2.31                  2.04
18           Taruk              0.65       3.85                  3.40
19           Teksin             -          0.77                  0.68
20           Kirom              -          0.77                  0.68
21           Gete               -          1.54                  1.36
22           Tangu              -          0.77                  0.68
23           Non-Panggi (15)    -          -                     11.56

 

 

 

 

Table 2 shows the estimates of ν and α, the two parameters of the Karlin-McGregor and Fisher distributions that describe the differential migration rates and surname diversity, among the six villages of the Adi Panggi.  Both the husbands and the wives show wide variation in the rate of migration and surname diversity between the villages.  The two villages KM and RK show the lowest migration rates and surname diversity, whereas SB shows the highest values of ν and α in the case of the husband samples.

 

 

 

Table 2

Indirect Estimation of Migration of Surnames (ν) and Surname Diversity (α) in Different Villages of the Adi Panggi Subtribe.

Sample          Parameter   GT       PR        RK        KM       SS       SB
Husband         ν           0.1144   0.1606    0.0966    0.0560   0.1083   0.2750
Husband         α           4.2612   4.4005    1.2832    1.0085   4.9796   10.6207
Wife            ν           0.0967   0.3444    0.6965    0.1467   0.1334   0.3030
Wife            α           2.5678   9.4574    25.2379   2.5790   5.3877   11.7380
Wife (P & NP)   ν           0.1679   0.4706    0.7450    0.1827   0.1607   0.3444
Wife (P & NP)   α           5.6482   21.3309   40.8934   3.5767   7.0828   14.7090

 

 

Panggi wives show higher surname diversity and migration rates than husbands, except in the case of GT.  Panggi and non-Panggi wives together show higher values than husbands (Table 3).  The estimates of νF and αF, based on random isonymy, are lower than those from the Karlin-McGregor and Fisher distributions.  The results show a similar pattern among husbands (νF = 0.0513 and αF = 8.3278) and Panggi plus non-Panggi wives (νF = 0.0559 and αF = 8.7113).

      

 

Table 3

Estimates of migration and surname diversity in Adi Panggi

Adi Panggi                    Sample Size   ν        α         *νF      *αF
Husband                       154           0.0344   5.4830    0.0513   8.3278
Wife                          130           0.0555   7.6390    0.0483   6.6020
Wife (Panggi & Non Panggi)    147           0.9010   14.7215   0.0559   8.7113

*νF & *αF = Distributions based on Random Isonymy (F)

 

 

 

 

The logarithmic distribution of surname occurrence among husbands and wives and in the total sample for the Panggi is shown in Figures 2 and 3.

 

 

 

 

Figure 2.  Log-Log Distribution of surnames in Adi Panggi Husbands and wives.

 

Figure 3.  Log-Log distribution of surnames in Adi Panggi tribe.

 

 

 

Both figures show a non-linear trend, suggesting deviation from the expected neutral allele model.  The shape of the distribution differs between husbands and wives: it is an inverted parabolic curve in males and a slightly concave curve in females.  The best non-linear fit, a polynomial of degree four, explains about 60% of the variance (R² = 0.59) in the husbands sample, while for wives a polynomial of degree three gives a good fit to the distribution (R² = 0.80), which is better than the linear fit (R² = 0.74).  In the case of all Adi Panggi (both husbands and wives), the distribution shows a nonlinear trend that differs from the trend observed in the separate husband and wife samples (Figure 3).  The best fit to the curve is a polynomial of degree four, which accounts for about 96% of the variance.  The non-linear distribution is due to the higher frequency of rare surnames (especially those that occur only once and are contributed by wives).  However, the truncated log-log distribution (with the rare surnames excluded) shows a linear trend.

 

The surname distribution observed among husbands and wives has also been considered with respect to their progeny size, to investigate fertility differentials with respect to surnames among the post-reproductive wives and their husbands, separately for male, female, and total progeny size.  The results show a non-linear trend, suggesting differential reproductive fitness for different surnames among husbands and wives (Figure 4).  In the case of husbands, the rare surnames (occurring once or twice) and the common surnames (occurring more than 6-9 times) show a trend toward higher progeny size (males, females and total children) than surnames of intermediate frequency (occurring 3-5 times).  The best-fit curve for total children (a polynomial of degree three) accounts for about 94.5 percent of the variance (R² = 0.945).  In the case of females, the trend shows a negative association between the occurrence of surnames and progeny size, and the best-fit curve for total children (a polynomial of degree three) accounts for about 99 percent of the variance (R² = 0.989).  Progeny size decreases across the rare surnames that occur once or twice, and tends to be smaller for the more common surnames (except for the most common surnames, which occur ten times among wives).  These results are a reflection of the population structure of the tribe, especially the patrilocal system of marriage, which tends to increase the prevalence of some specific surnames as a result of preferential marriages among a few specific clans or surnames.

 

 

 

Figure 4.  Occurrence of surname distribution (s); total number of male (), female (∆), and total children (×) among Adi Panggi husbands and wives (above 45 Years) and progeny size distribution (--).

 

 

 

A similar distribution of progeny size with respect to surname distribution was also considered in case of reproductive wives and their husbands’ sample (Figure 5).  The pattern of distributions is quite varied compared to the case of post-reproductive samples.  In the case of the wives, the progeny size shows an increasing trend from the rare surnames to the most frequent surnames, though there is some perturbance at the initial case of rarer ones. 

 

Figure 5.  Occurrence of surname distribution(s): total number of male (), female (), and total children (×) among Adi Panggi husbands and wives (below 45 years) and progeny size distribution (--).

 

 

Among the husband samples, a similar initial perturbance of progeny size is noticed with the rarer surnames, but the progeny size shows stability with the more commonly occurring surnames of the tribe.

 

Discussion

 

Though selective neutrality of surname distribution has been validated in a number of populations in Europe and Latin American countries, a few studies have shown deviation from neutrality, implicating the role of migration and operation of selection with respect to certain surnames.  The results of the present study indicate that such trends are also observed in small isolated populations.  Apart from the expected influences of migration and drift, there could be other related factors of population structure that could cause the deviation from neutrality.  In small tribal populations the structure of the surname distribution might vary widely depending on the history of settlement, prevailing marriage patterns and other socio-cultural and environmental reasons.  Further, the demographic parameters can vary drastically owing to their subsistence pattern, epidemiological factors, and social organization such as warfare history.  Also, internal tribal