Open Access

Demographic estimates from Y chromosome microsatellite polymorphisms: Analysis of a worldwide sample

  • J Michael Macpherson1Email author,
  • Sohini Ramachandran1,
  • Lisa Diamond1, 2 and
  • Marcus W Feldman1
Human Genomics20041:345

DOI: 10.1186/1479-7364-1-5-345

Received: 2 July 2004

Accepted: 2 July 2004

Published: 1 August 2004


Polymorphisms in microsatellites on the human Y chromosome have been used to estimate important demographic parameters of human history. We compare two coalescent-based statistical methods that give estimates for a number of demographic parameters using the seven Y chromosome polymorphisms in the HGDP-CEPH Cell Line Panel, a collection of samples from 52 worldwide populations. The estimates for the time to the most recent common ancestor vary according to the method used and the assumptions about the prior distributions of model parameters, but are generally consistent with other global Y chromosome studies. We explore the sensitivity of these results to assumptions about the prior distributions and the evolutionary models themselves.


Bayesian inference demographic parameters human history statistical genetics time to the most recent common ancestor (TMRCA) Y chromosome


It is possible to estimate evolutionary and demographic parameters from observed genetic variation in contemporary human populations. Of these quantities, the mean time to the most recent common ancestor (TMRCA) of the sample is of particular interest in population genetic studies because the age of the common ancestor indicates the relatedness of the individuals sampled.

Studies of coalescence in human DNA sequences usually focus on the uniparentally inherited Y chromosome and mitochondrial genome, where the absence of recombination limits the complexity of genealogical analysis.

The human Y chromosome is non-recombining over most of its length and is thus a highly informative haplotypic system that permits the tracing of paternal lineages and complements the maternal history of a population, as observed from mitochondrial DNA. Earlier studies have observed a high degree of geographic population structure on the Y chromosome, attributed to mating practices and the small effective population size of the Y chromosome.[14] Analyses of Y chromosomal haplotypes have been used to investigate the origins of specific regional populations. For example, studies have considered Austronesian-speaking populations,[5] histories of males in Israeli and Palestinian Arab populations,[6] and the history of Khoisan languages characterised by click consonants.[7] Fewer studies have looked at Y chromosome markers in globally distributed populations to calculate a TMRCA.[812] In this study, we investigate the global TMRCA, ancestral population size, growth rate and mutation rate from Y chromosome microsatellite polymorphisms in the HGDP-CEPH Human Genome Diversity Cell Line Panel.[13]

We contrast two coalescent-based methods of inference: (1) a modified version [9] of a rejection algorithm (RA) [14] and (2) the Markov Chain Monte Carlo (MCMC) program BATWING [12] Both methods aim to produce posterior distributions for each of the above parameters, given a particular set of prior distributions on their values. These posterior distributions can differ between the two methods for a given set of priors and also are sensitive to the particular prior set chosen. W e investigate both this sensitivity to choice of priors and the robustness of the inferences to changes in the underlying models.



The 677 males in the sample come from 52 populations in seven geographical regions (Africa, America, Central/South Asia, East Asia, Europe, the Middle East and Oceania). The individuals were typed at seven polymorphic microsatellite loci on the non-recombining portion of the Y chromosome. These loci include two trinucleotide repeats (DYS388 and DYS392) and five tetranucleotide repeats (DYS389a, DYS389b, DYS390, DYS391 and DYS395). Across all loci, the sample contains 50 alleles, six of which are found only in a single population.

Computational method

Our goal was to infer the joint distribution of several demographic and genetic parameters, given the polymorphism data. To do this, we used two methods, RA [9, 14] and BATWING, a MCMC implementation.[12, 15] Both methods assume the same growth model, in which the population has a constant effective number of Y chromosomes, N A , until a time t0 before the present. After this time, the population grows exponentially at a rate r0 per generation. Each method uses these parameters together with the coalescent process [16, 17] to generate genealogical trees with appropriately scaled branch lengths. Both also assume that mutations occur independently at each locus as a Poisson process with rate μ, which has as units mutations per locus per generation. Both methods require that prior distributions be specified for each of the previously mentioned parameters. As employed, neither method takes into account the possible effects of recombination, selection or population structure.

The key difference between the methods is that the RA uses summary statistics, while BATWING uses the full data. For this reason, the RA runs much faster than BATWING. In the RA, a genealogy is simulated under a parameter set sampled independently from the priors. If the standardised differences between each of three summary statistics computed for the simulated data and the observed data are all smaller than a threshold δ, the parameter set is accepted into the posterior distribution; if not, the parameter set is rejected. After many repetitions of this process, the collection of accepted sets of parameters forms the joint posterior distribution. The three summary statistics -- number of haplotypes, mean variance in repeat number and mean heterozygosity -- were chosen for their sensitivity to changes in population size.[9] Beaumont et al.[18] investigated the effects of using more summary statistics and a more sophisticated criterion for acceptance-rejection and found that results differed little from the approach of Pritchard et al., [9] provided that the acceptance threshold was set low enough. We used a threshold of δ = 0.1 in runs of at least 1 million trials, which normally resulted in acceptance rates of around 10-3, well within the range recommended in Beaumont et al.[18]

While the RA starts afresh with each iteration, BATWING maintains a tree at all times and progresses by proposing new trees slightly different from the current tree. A new tree replaces the current tree probabilistically. By construction, BATWING's probabilities of transition between trees specify an irreducible Markov chain which is guaranteed to converge upon the joint distribution of interest, although there is no bound on the length of time convergence may require.[15] The computational expense in generating this potentially enormous number of iterations is BATWING's primary limitation (see 'Discussion on the paper' section of Wilson et al.[12]).

When BATWING simulates a mutation event, it is assumed that the number of microsatellite repeats changes by exactly one, with equal probability of increasing or decreasing. This, the stepwise mutation model (SMM),[19] is also the default model used by the RA. We experimented with two other mutation models using the RA, namely the symmetric geometric model (SGM),[9] in which the number of repeats changes by a value chosen from a symmetric geometric distribution having variance σ2, and the range-constraint model (RCM) [20] in which the repeat number has stepwise changes but with hard reflecting boundaries located at a fixed number of repeats on either side of the original value. We set this fixed number to three, leading to a range of six, because the mean observed range of the number of repeats was 5.9.

We used four different sets of priors labelled P, K, W and Z, each consisting of a density function for each of the four parameters under both BATWING and the RA (Table 1). P and W derive from two previous global TMRCA studies of Y chromosome microsatellites, those of Pritchard et al. [9] and Wilson et al.[12]K and Z contrast higher and lower mutation rate means, as reported in the recent literature in Kayser et al.[21] and Zhivotovsky et al.[22]P and W use very diffuse priors for N A , while K and Z use priors with mean N A equal to 1,000, the value strongly suggested by Pritchard et al. [9] The respective priors for r0 and t0 are diffuse and identical across the four prior sets.
Table 1

Prior distributions used to analyse microsatellite polymorphisms on the Y chromosome. The rejection algorithm [9] and BATWING [12] were run on each set of priors. Distributions were chosen based on past studies; the means for each distribution are given in brackets. μ is the mutation rate per locus per generation, N A is the ancestral population size, t0 is the time of start of exponential population growth in generations before present, r is rate of exponential population growth per generation

Prior set

Derived from

μ prior [mean]

N A prior [mean]

t0prior [mean]

rprior [mean]


Pritchard et al. (1999)

gamma (10, 12,500) [0.0008]

Log normal (8.5, 2) [36,000]

exp (0.001) [1000]

gamma (1, 200) [0.005]


Kayser et al. (2000)

gamma (1, 416) [0.0024]

gamma (3, 0.003) [1000]

exp (0.001) [1000]

gamma (2, 400) [0.005]


Wilson, Weale and Balding (2003)

gamma (18, 8,170) [0.0022]

Gamma (3, 0.001) [3000]

exp (0.001) [1000]

gamma (2, 400) [0.005]


Zhivotovsky et al. (2004)

gamma (1.5, 2,175) [0.0069]

gamma (3, 0.003) [1000]

exp (0.001) [1000]

gamma (2, 400) [0.005]

About 6 per cent of the repeat scores (290 of 4,739) are missing from the dataset. Because the number of haplotypes is not defined when data are missing, these missing data must be removed or replaced to allow the RA to proceed. Since most haplotypes which had missing data only lacked a single repeat score, we replaced each missing repeat score at a given locus by a value chosen from a multinomial distribution created from the frequencies of repeat scores observed for the respective population sample at that locus. Although BATWING can handle missing data by treating missing leaves as nuisance parameters, for consistency one such substituted dataset served for all the results reported here. We found that BATWING runs on the unsubstituted data resulted in posteriors very similar to those with the substituted data (results not shown). The dataset used in this analysis can be found at

Since our focus in this study was on the properties of the posterior distribution for the TMRCA and the demographic parameters, rather than the branching pattern of the genealogy, we modified the BATWING source code to reduce computation time. Reasoning that a maximum parsimony tree would have an approximately correct topology, we started BATWING with such a tree and disabled branch swapping after the first 100,000 iterations, which reduced the time of each iteration thereafter by about 30 per cent.


Heterozygosity was computed for each of the seven major geographical regions using an unbiased estimator.[23] These ranged as follows: 0.45 (America), 0.53 (Africa), 0.55 (Middle East), 0.59 (Europe), 0.60 (Oceania), 0.62 (Central/South Asia) and 0.66 (East Asia). We performed an analysis of molecular variance using Genetic Data Analysis [24] and observed that 73-89 per cent of genetic variation occurs between individuals in the same population (Table 2). The American and Middle Eastern populations have especially low within-population variance. Of the six alleles found only in a single population, four appeared exactly once in the dataset, while the other two appeared twice. There were 46 alleles appearing more than once in the sample, of which three were exclusive to one of the seven major geographical regions listed in Table 2.
Table 2

Analysis of molecular variance for the Y chromosome. Ninety-five per cent confidence intervals (CIs; in parentheses) were calculated using 1,000 bootstraps across loci. The World-B97 sampl [28] consists of 14 populations that were chosen in order to approximate the sample of Barbujan et a l.[26] The fifth column corrects for the smaller Y chromosome population size, as in Pérez-Lezaun et a l.[30] The estimate and CI for the among-region variance component for Eurasia are set to zero because Weir's unbiased estimator [23] yields slightly negative value


Number of regions

Number of populations

Variance components and 95% confidence intervals (%)


Within populations

Within populations (corrected)

Among populations within regions

Among regions




80.4 (74.7-84.5)

94.3 (92.2-95.6)

19.6 (15.5-25.2)





80.2 (73.2-85.6)

94.2 (91.6-96.0)

15.0 (13.2-17.1)

4.80 (0.00-10.6)




83.9 (77.5-88.8)

95.4 (93.2-96.9)

15.5 (11.2-17.2)

0.56 (0.00-6.16)




84.7 (71.5-97.1)

95.7 (90.9-99.3)

8.43 (2.90-11.1)

6.85 (0.00-22.1)




91.2 (87.6-94.5)

97.6 (96.6-98.6)

8.78 (5.49-12.4)





86.4 (83.4-89.1)

96.2 (95.3-97.0)

13.6 (10.9-16.6)





88.9 (85.4-91.7)

97.0 (95.9-97.8)

11.1 (8.25-14.4)

0.00 (0.00-0.00)




86.8 (81.3-92.8)

96.3 (94.6-98.1)

13.2 (7.17-18.7)


   Middle East



66.2 (58.7-74.6)

88.7 (85.0-92.2)

33.8 (25.4-41.2)


   Central/South Asia



94.7 (92.0-97.2)

98.6 (97.9-99.3)

5.30 (2.77-8.04)


East Asia



81.0 (71.8-87.7)

94.5 (91.1-96.6)

19.0 (12.3-28.2)





80.6 (70.2-92.3)

94.3 (90.4-98.0)

19.4 (7.65-29.8)





58.0 (48.2-70.0)

84.7 (78.8-90.3)

42.0 (29.9-51.7)


An important feature of human population genetic structure is the fraction of the total genetic variation that lies within populations relative to that among populations.[2528] Due to the lower effective population sizes of the X and Y chromosomes and the mitochondrial genome compared with the autosomes, genetic drift is stronger in these systems, which can explain the smaller within-group variance that has been reported in such nonautosomal regions.[27, 29] A correction suggested by Pérez-Lezaun et al.[30] can be applied. In the fifth column of Table 2, we see that the within-population components, after correction for population size, are generally similar to those reported by Rosenber et al.[28] and Ramachandran et al.[29] When this analysis is repeated using only the tetranucleotide repeats, the results are not affected.[31, 32]

We have summarised the posterior distributions from the RA and BATWING for each of the four sets of prior distributions in Table 3. The three sets of priors P, K and Z tend to produce similar posteriors under the RA and BATWING, including TMRCA point estimates of 60,000-90,000 years before the present (ybp), assuming a constant generation length of 25 years. Note that BATWING may only be compared directly to the RA using the SMM. In many cases, as would be expected from its use of the full data, BATWING produced narrower credible intervals than did the RA, but this reduction in the variance was not universal. It can be seen in Figure 1 that for the P, K and Z priors, the TMRCA traces which form the BATWING posteriors have most support in the region from 60,000-90,000 ybp, and differ in the extent to which the priors permit exploration of parameter space.
Table 3

Demographic parameters estimated from seven Y chromosome microsatellite loci in 677 individuals drawn from 52 global populations. Columns: priors (see Table 1 for details of each set of prior distributions); method (RA = rejection algorithm using stepwise mutation model [SMM], symmetrical geometric model [SGM], or range constraint model [RCM], BW = BATWING followed by number of Markov Chain Monte Carlo updates); remaining columns are posterior distributions, with mean and 95 per cent credible interval, TMRCA is the mean time to the most recent common ancestor in years before present (generation time is assumed to be 25 years), μ is mutation rate per locus per generation, N A is ancestral population size, t0 is time of start of exponential population growth in generation before present is rate of exponential population growth per generation

Prior set


TMRCA (× 10-3) mean [95%]

μ(× 10-3) mean [95%]

N A mean [95%]

t0(× 103) mean [95%]

r(× 10-2) mean [95%]



86 [30-221]

0.71 [0.34-1.19]

1630 [140-4520]

22 [8-51]

0.77 [0.23-2.13]



58 [20-147]

0.71 [0.36-1.18]

990 [70-3110]

23 [8-52]

0.75 [0.25-2.08]



121 [33-441]

0.73 [0.35-1.22]

2070 [240-6510]

21 [8-50]

0.79 [0.22-2.16]


BW, 4 × 106

82 [34-195]

0.80 [0.40-1.34]

2510 [400-11630]

36 [12-73]

0.32 [0.15-0.58]


BW, 200 × 106

64 [31-131]

0.79 [0.39-1.32]

710 [250-1740]

48 [24-92]

0.30 [0.14-0.52]



68 [21-178]

0.89 [0.27-2.15]

1020 [220-2410]

24 [7-64]

0.67 [0.20-1.55]



64 [21-155]

0.66 [0.20-1.59]

960 [220-2270]

27 [9-61]

0.67 [0.21-1.60]



78 [22-199]

0.98 [0.29-2.51]

1090 [290-2390]

22 [7-55]

0.69 [0.23-1.55]


BW,4 × 106

55 [16-143]

1.27 [0.31-4.02]

760 [190-2350]

42 [4-118]

0.43 [0.11-1.33]


BW, 200 × 106

63 [17-178]

1.10 [0.31-2.71]

660 [190-1610]

52 [13-159]

0.37 [0.11-0.92]


BW, 800 × 106

61 [18-164]

0.90 [0.27-2.12]

760 [230-1760]

42 [11-124]

0.47 [0.14-1.10]



39 [19-81]

1.68 [1.01-2.72]

940 [250-1940]

11 [5-18]

1.01 [0.46-1.82]



29 [14-57]

1.49 [0.91-2.18]

600 [230-1300]

13 [7-21]

0.85 [0.49-1.29]



83 [20-266]

1.76 [1.08-2.71]

1470 [250-3730]

11 [5-24]

0.87 [0.36-1.95]


BW,4 × 106

32 [16-66]

1.85 [1.10-2.78]

1100 [260-4160]

14 [5-26]

0.72 [0.41-1.17]


BW, 200 × 106

29 [15-57]

1.81 [1.07-2.74]

590 [240-1210]

16 [9-27]

0.70 [0.40-1.09]



79 [27-200]

0.71 [0.24-1.54]

1120 [290-2550]

26 [9-63]

0.62 [0.20-1.48]



71 [24-170]

0.55 [0.18-1.24]

1010 [250-2310]

30 [10-74]

0.62 [0.20-1.42]



88 [28-215]

0.74 [0.25-1.61]

1210 [330-2660]

26 [9-65]

0.64 [0.20-1.49]


BW,4 × 106

41 [16-94]

1.40 [0.52-2.94]

1490 [540-3420]

23 [5-60]

0.48 [0.18-1.04]


BW, 200 × 106

84 [26-228]

0.72 [0.22-1.64]

690 [210-1630]

80 [22-232]

0.25 [0.08-0.57]
Figure 1

The most recent common ancestor (TMRCA) estimated from BATWING runs over the four sets of priors. Each plot point is the mean of 4 million Markov Chain Monte Carlo (MCMC) update

BATWING and the RA gave different point estimates for t0, r, and N A . The RA estimated a growth period beginning 20,000-25,000 ybp, growing at a rate of 6-8 × 10-3 per generation, whereas BATWING placed greatest support on a longer, slower growth period of 30,000-50,000 years at rate of 3-5 × 10-3 per generation. BATWING also tended to give smaller estimates of ancestral population size than the RA (mean 700-1,000, compared with 1,000-1,500). The mutation rate estimates were similar with both approaches, at 7-9 × 10-4 per locus per generation. Despite the differences in the modes of the distributions, these posterior distributions overlap considerably for each demographic parameter.

The use of W priors resulted in posteriors very different from those of the P, K and Z priors, namely a much younger TMRCA point estimate of 30,000 ybp, and a much greater mutation rate and growth rate (Table 3). We address the discrepancy between the results of the W priors and the other three prior sets in the Discussion.

Effect of mutation and growth models

Because we were interested in the effects of model assumptions on the posterior distributions, we tested three models of mutation and growth using the RA. Following Pritchard et al., [9] we used the relative acceptance rate ratios to determine which model, if any, was more consistent with the data. This amounts to placing an evenly weighted prior on each model and using the RA to assess posterior support for the models.

We observed that the TMRCA posteriors and, to a lesser extent, those of the ancestral population size and the mutation rate, were affected by the choice of mutation model (Table 3). The RCM produced considerably longer TMRCAs (80,000-120,000 ybp) than the SMM, while the SGM produced shorter TMRCAs (30,000-70,000 ybp). This ordering matches that of Pritchard et al., [9] who used the RA with a smaller dataset of similar geographical diversity. The acceptance rates for all three models were similar (Table 4).
Table 4

Ratio of acceptances per million trials at threshold δ = 0.1 and average acceptance rates using the rejection algorithm [9] and different mutation models for all four sets of priors (Table 1)

Prior set

Mutation model





Mean rate (× 10-3)





















Since some studies have documented asymmetry in the microsatellite mutation process (eg Calabrese and Durrett [33]), we compared the RA to a version adapted to permit mutational asymmetry. Using the SGM and the K priors, identical and independent priors were placed on the respective rates of repeat increase and decrease. The mean acceptance rates were nearly identical for the symmetric and asymmetric mutation models (symmetric/asymmetric ratio 52/48; mean acceptance rate 1.63 × 10-3). The posteriors for the increase and decrease rates were very similar and also very similar to the μ posterior for the mutation model. Thus, these data are not better explained by the asymmetric model than the symmetric mutation model.

Several studies [21, 34] have noted that microsatellite loci on the same chromosome can mutate at different rates. We addressed the assumption that each locus mutates at the same rate by comparing RA acceptance rates between runs assuming that all loci mutate at the same rate and runs assuming that each locus mutates at a different rate. We used the Z priors in both cases; in the latter case, one mutation rate was drawn from the Z μ prior for each locus. The acceptance rate was similar to that obtained when a single mutation rate governed all the loci (single rate/multiple rates ratio 49/51; mean acceptance rate 1.42 × 10-3), as were the estimated TMRCAs and parameter posteriors. Both this result and the mutation rate symmetry result held for all sets of priors.

We compared the growth model described above with a model having two exponential growth phases using a modified version of the RA. This second model has additional parameters t1, the time before the present at which the population begins its second growth phase, and r1, the rate at which the population grows thereafter. Over the worldwide data, acceptance rates were similar between the two models (single growth-phase/dual growth-phase ratio 52/48; mean acceptance rate 1.88 × 10-3). Under the dual growth phase model using the P priors, growth was somewhat slower and lasted longer than in the single growth phase model (Table 5). This result was also robust to the choice of prior set.
Table 5

Results from rejection algorithm with one and two exponential growth phases usin P priors


(Mean, 95% range)






(20,000, 14,7000)


(18,000, 130,000)



(44, 3,100)


(55, 3,400)

r 0


(0.0025, 0.028)


(0.00018, 0.015)

r 1




(0.00028, 0.018)

t 0


(8000, 52,000)


(620, 32,000)

t 1




(430, 41,000)

We also compared the single growth model against a model with constant population size. In agreement with Pritchard et al., [9] the growth model had a much higher acceptance rate (constant/growth ratio 0/100; mean acceptance rate 0.68 × 10-3) than did the model of constant population size, with the exception of two populations, namely the Oceanic and American populations, each of which had slightly higher acceptance rates for the constant model than for the expansion model (see also Zhivotovsky et al.[35]).

Using regional subsets of the data, both methods produced different posteriors from those using the entire sample. Table 6 shows results obtained with the RA using the P priors with the SGM from typical runs of both the RA and BATWING. There is great overlap between these posteriors, and in the cases of the TMRCA and the time of expansion, this overlap makes it difficult to discern a clear pattern in the timing of splitting of the population or expansion of the respective subpopulations.
Table 6

Mean parameter estimates and 95 per cent credible intervals for individual populations obtained with the rejection algorithm using the priors and the P symmetrical geometric mode


TMRCA (× 103) mean [95%]

μ (× 10-3) mean [95%]

N A mean [95%]

t0(× 103) mean [95%]

R(×10-2) mean [95%]


58 [20-150]

0.71 [0.34-1.19]

990 [70-3100]

22 [8.5-50]

0.83 [0.26-2.0]


53 [15-160]

0.70 [0.35-1.25]

1,000 [62-3500]

16 [3.8-43]

0.64 [0.12-1.8]


54 [20-140]

0.70 [0.34-1.17]

970 [64-3500]

23 [8.4-51]

0.81 [0.26-2.1]


30 [11-83]

0.74 [0.35-1.25]

730 [57-2100]

14 [1.0-48]

0.43 [0.18-1.7]

Central/South Asia

51 [18-140]

0.73 [0.34-1.27]

970 [61-3300]

21 [6.8-48]

0.65 [0.19-1.8]

East Asia

55 [20-150]

0.72 [0.36-1.26]

1,000 [59-3500]

23 [7.8-53]

0.70 [0.21-1.9]


44 [16-110]

0.69 [0.34-1.19]

800 [52-2700]

20 [7.0-48]

0.70 [0.21-1.9]


51 [19-140]

0.70 [0.35-1.21]

940 [92-3100]

20 [7.0-48]

0.81 [0.25-2.1]

Middle East

52 [15-150]

0.76 [0.37-1.33]

1,100 [96-3300]

15 [1.7-48]

0.43 [0.31-1.3]


59 [18-160]

0.76 [0.36-1.32]

1,400 [120-4100]

17 [0.8-55]

0.37 [0.15-1.4]

With such a large dataset, BATWING required a long time to converge. We observed that after long runs of 200 million iterations, the posteriors that resulted differed from those produced after the first 4 million iterations (Table 3). We monitored progress towards convergence by computing the autocorrelation function (ACF), the correlation of a chain with itself when its indices are offset by some integer lag, and observing the chains and the overall likelihood. In BATWING runs using the entire dataset, we found that the ACF decayed to zero monotonically but slowly as the lag increased. We extended several BATWING runs to hundreds of millions of iterations and watched for signs of nonconvergence, but found that the parameter plots and likelihood plots remained steady (Figure 1). The ACF continued to decline monotonically and more rapidly than before the runs were extended, but it still did not reach zero for lags of less than tens of thousands for any of the demographic parameter chains. This differs from the experience of Wilson et al. [12] who reported ACF declining to zero by lag 30. It is likely that this difference is caused by slow movement of the chain between distant regions of parameter space, which might be expected, since the number of nuisance parameters -- for example the internal node haplotypes -- and therefore the size of the parameter space, is much larger in this study than in Wilson et al.[12]


These two methods of inference support a recent human Y chromosome TMRCA. Three of the sets of priors we examined resulted in a mean TMRCA of 60,000-90,000 ybp. The estimates exceeded 100,000 ybp when we limited the range of mutation under the RA. These values are consistent with other global TMRCA estimates from Y chromosome microsatellite data, including those of Pritchard et al. [9] (46,000-91,000 ybp), and also concur with several single nucleotide polymorphism studies of the Y chromosome, including Thomson et al.[10] (48,000-59,000 ybp) and Tang et al.[36] (91,000 ybp). Interpreting these estimates requires care, because they are sensitive to both the priors and models assumed, and rely on a simple model of evolution.

Sensitivity to priors

We examined the dependence of the results from the two methods on the different sets of priors. It is clear from Table 3, particularly in the case of the W priors, that the choice of priors affects the posteriors. There are two main reasons for the posteriors not to be identical for different sets of priors. First, the data may not be informative. Uninformative data would imply a flat likelihood surface. We would expect to see posteriors resembling the priors under both methods. The microsatellite data used here appear to be informative, because the posteriors differ from their priors (Figure 2), and because the RA and BAT WING tended to infer similar posterior departures from a given set of priors (Table 3). The second reason may be that the priors do not allow the exploration of those regions of the parameter space which would otherwise be included in the posteriors; this appears to explain the divergent results obtained with the W priors.
Figure 2

Comparison of mutation rate prior distributions and posterior distributions estimated by BATWING between four sets of priors. Priors are represented by curves, posteriors by histograms

Using the W priors from Wilson et al.[12] with the HGDP-CEPH data, both inference methods produced a mean TMRCA estimate of around 30,000 years, which is consistent with the findings of that paper. The W W μ prior has much smaller variance than the K μ prior, effectively precluding values of μ smaller than 0.001 (Figure 2). The K and Z μ priors include these smaller values as well as the higher values implied under the W priors. With both low and high mutation rates available, the posteriors from the K and Z priors placed most of their support on mutation rates lower than 0.001 (Figure 2), leading to an older TMRCA. Because of this result, and since a TMRCA of 30,000 years seems improbable in light of archaeological evidence that anatomically modern humans existed outside Africa at least 35,000-45,000 ybp [3740] we suggest that lower mutation rates leading to an earlier TMRCA are more plausible than the higher rates of the W priors.

The inadvertent restriction of parameter space might be mitigated by choosing uniform priors spanning the conceivable range of the parameters. For large datasets, this is not a feasible approach for the RA, much less for BATWING, with available computers. A practicable approach might be to use the RA to compute the standardised differences between the summary statistics and the data as usual, and to simply record these differences rather than accepting or rejecting them, drawing the parameters from broad uniform priors. Some regions of parameter space will likely generate very different summary statistics from the data, and these may be ignored. But those parameter space regions which produce statistics near to those of the data may serve as a basis for establishing sensible priors.

Y chromosome mutation rate

A phylogenetic study by Zhivotovsk et al.[23] estimated the mean effective Y chromosome mutation rate to be 0.69 × 10-3 ± 0.57 × 10-3. By contrast, estimates from pedigree studies suggest higher mutation rates (2.8 × 10-3 and 2.1 × 10-3)[21, 41]. Two of our sets of prior distributions, K and Z, differed only in the priors for μ, with a higher mean of 2.4 × 10-3 and larger variance in the former, and a lower mean of 0.69 × 10-3, with smaller variance in the latter. The resulting posteriors, with respective means of 0.92 × 10-3 and 0.73 × 10-3, tend to support an effective mutation rate more like the lower, phylogeny-derived value.

It appears that the data are consistent with all three mutation models. The RCM limits the repeat length explicitly by prohibiting repeat lengths too distant from the initial value. Neither the SMM nor the SGM place restrictions on the length, but the SMM approaches large repeat lengths more slowly than the SGM because its step size is limited to one. The more that repeat lengths are limited, the longer it takes to achieve some level of diversity in the population, which accounts for the observation that the oldest TMRCA occurs with the RCM while the youngest occurs with the SGM model. We also found that the results were not sensitive to the assumption of a single mutation rate across all loci or the assumption of symmetry in repeat length change.

The estimated TMRCA is critically dependent on prior assumptions about the rate of mutation. Assuming constant population size and a given sample size, the coalescent branch length expectations are directly proportional to the population size; doubling the population size doubles the expected length of each branch in the genealogy. The coalescent in a growing population may also be rescaled, such that all branch lengths change in the same proportion, although this requires more than a simple proportional change in the population size.[42] For example, by appropriately scaling the demographic priors and mutation rate, we obtained arbitrarily large or small TMRCAs using the HGDP-CEPH data under both RA and BATWING, with acceptance rates indistinguishable from those reported in Table 4. The differences in acceptance rates between the four prior sets reflect not the likelihood of the TMRCA given the data, but the likelihood of the combination of tree geometry and mutation rate. Changing the population history changes the relative lengths of the tree branches; some tree geometries are more consistent with the dataset's level of polymorphism than others. The TMRCAs we report are those most consistent with the range of mutation rates reported in the literature.

Were there distinct epochs of population growth?

The estimates in this study of 20,000-50,000 ybp for the time of population expansion long precede 10,000 ybp, the time around which agriculture is widely believed to have developed,[37, 43, 44] and at which the population would naturally be expected to increase. Several other studies also give estimates of expansion time much earlier than 10,000 ybp, including microsatellite studies of the Y chromosome (20,000 ybp, from Pritchard et al. [9]) and autosomes (35,000 ybp, from Zhivotovsky et al.[22]). Estimates of expansion time from extensive studies of nuclear autosomal sequences, such as 0-100,000 yb [45] and 36,000-97,000 ybp [46] also suggest an early start to population expansion. If population increase began long before the development of agriculture, something else, perhaps another behavioural change, may have precipitated this earlier expansion.

Reasoning that the emergence of agriculture might have drastically increased the rate of population growth, we compared the original RA to a version which explicitly allowed for two distinct growth phases. We did not observe much difference between the results for the two growth phases (Table 5), nor between those for the two growth phases combined and the original, single phase of growth. Furthermore, the acceptance rates were quite similar between the two growth models. We conclude that, although we observe a strong signal of growth by comparison to the constant population size model, no sharp increase in the rate of growth after its onset is evident from the data.

Both methods explored here make a number of simplifying assumptions. While recombination can probably be safely disregarded for these Y chromosome markers, the same cannot be said for the possible effects of selection, population structure and sampling error. Selection is known to mimic population growth,[45] compressing towards the present the portion of the genealogy in which it acts. Population structure may also strongly affect genealogies [47] as can the pooling of samples from different populations [48, 49] and uncorrected ascertainment bias.[50] Inferences drawn from other genomic regions and from more specific models will be useful in more accurately understanding human demographic history.

Our estimates for Y chromosome TMRCAs are again shorter than those obtained for the mitochondrion,[36, 51] reinforcing interest in understanding the differences between male and female demography for early modern humans. Further work might include implementing models of range constraints with soft boundaries for microsatellites.[52, 53] The RA might also be modified to include population subdivision, with different expansion times for different populations.



We thank Thomas Lebarbanchon for contributions to the rejection algorithm implementation. Richard Klein, Noah Rosenberg, Hua Tang, Lev Zhivotovsky and our reviewers provided insightful comments on the manuscript. This work was supported by NIH grant GM28428 to MWF. JMM is a Howard Hughes Medical Institute Predoctoral Fellow. SR is supported by an NDSEG fellowship.

Authors’ Affiliations

Department of Biological Sciences, Stanford University
Stanford Genome Technology Center


  1. Seielstad MT, Minch E, Cavalli-Sforza LL: 'Genetic evidence for a higher female migration rate in humans'. Nat Genet. 1998, 20: 278-280. 10.1038/3088.View ArticlePubMedGoogle Scholar
  2. Bertranpetit J: 'Genome, diversity, and origins: The Y chromosome as a storyteller'. Proc Natl Acad Sci USA. 2000, 97: 6927-6929. 10.1073/pnas.97.13.6927.PubMed CentralView ArticlePubMedGoogle Scholar
  3. Underhill PA, Shen PD, Lin AA, et al: 'Y chromosome sequence variation and the history of human populations'. Nat Genet. 2000, 26: 358-361. 10.1038/81685.View ArticlePubMedGoogle Scholar
  4. Hammer MF, Karafet TM, Redd AJ, et al: 'Hierarchical patterns of global human Y chromosome diversity'. Mol Biol Evol. 2001, 18: 1189-1203. 10.1093/oxfordjournals.molbev.a003906.View ArticlePubMedGoogle Scholar
  5. Hurles ME, Nicholson J, Bosch E, et al: 'Y chromosomal evidence for the origins of Oceanic-speaking peoples'. Genetics. 2002, 160: 289-303.PubMed CentralPubMedGoogle Scholar
  6. Nebel A, Filon D, Weiss DA, et al: 'High-resolution V chromosome haplotypes of Israeli and Palestinian Arabs reveal geographic substructure and substantial overlap with haplotypes of Jews'. Hum Genet. 2000, 107: 630-641. 10.1007/s004390000426.View ArticlePubMedGoogle Scholar
  7. Knight A, Underhill PA, Mortensen HM, et al: 'African Y chromosome and mtDNA divergence provides insight into the history of click languages'. Curr Biol. 2003, 13: 464-473. 10.1016/S0960-9822(03)00130-1.View ArticlePubMedGoogle Scholar
  8. Hammer MF, Karafet T, Rasanayagam A, et al: 'Out of Africa and back again: Nested cladistic analysis of human Y chromosome variation'. Mol Biol Evol. 1998, 15: 427-441. 10.1093/oxfordjournals.molbev.a025939.View ArticlePubMedGoogle Scholar
  9. Pritchard JK, Seielstad MT, Pérez-Lezaun A, Feldman MW: 'Population growth of human Y chromosomes: A study of V chromosome microsatellites'. Mol Biol Evol. 1999, 16: 1791-1798. 10.1093/oxfordjournals.molbev.a026091.View ArticlePubMedGoogle Scholar
  10. Thomson R, Pritchard JK, Shen P, Oefner PJ, Feldman MW: 'Recent common ancestry of human Y chromosomes: Evidence from DNA sequence data'. Proc Natl Acad Sci USA. 2000, 97: 7360-7365. 10.1073/pnas.97.13.7360.PubMed CentralView ArticlePubMedGoogle Scholar
  11. Tang H, Siegmund DO, Shen P, et al: 'Estimation of the time to the most recent common ancestor by tree-partition'. Am J Hum Genet. 2001, 69 (Supplement): 394-Google Scholar
  12. Wilson IJ, Weale ME, Balding DJ: 'Inferences from DNA data: Population histories, evolutionary processes and forensic match probabilities'. J R Stat Soc [Ser A]. 2003, 166: 155-201. 10.1111/1467-985X.00264.View ArticleGoogle Scholar
  13. Cann HM, de Toma C, Cazes L, et al: 'A human genome diversity cell line panel'. Science. 2002, 296: 261-262.View ArticlePubMedGoogle Scholar
  14. Tavaré S, Balding DJ, Griffiths RC, Donnelly P: 'Inferring coalescence times from DNA sequence data'. Genetics. 1997, 145: 505-518.PubMed CentralPubMedGoogle Scholar
  15. Wilson IJ, Balding DJ: 'Genealogical inference from microsatellite data'. Genetics. 1998, 150: 499-510.PubMed CentralPubMedGoogle Scholar
  16. Kingman JFC: 'The coalescent'. Stochastic Process Appl. 1982, 13: 235-248. 10.1016/0304-4149(82)90011-4.View ArticleGoogle Scholar
  17. Nordborg M: 'Coalescent theory'. Edited by: Balding, D.J., Bishop, M., Cannings, C. 2003, Handbook of Statistical Genetics, Wiley, Chichester, W. Sussex, UK, 602-635. 2Google Scholar
  18. Beaumont MA, Zhang W, Balding DJ: 'Approximate Bayesin computation in population genetics'. Genetics. 2002, 162: 2025-2035.PubMed CentralPubMedGoogle Scholar
  19. Ohta T, Kimura M: 'A model of mutation appropriate to estimate the number of electrophoretically detectable alleles in a finite population'. Genet Res. 1973, 22: 201-204. 10.1017/S0016672300012994.View ArticlePubMedGoogle Scholar
  20. Feldman MW, Bergman A, Pollock DD, Goldstein DB: 'Microsatellite genetic distances with range constraints: Analytic description and problems of estimation'. Genetics. 1997, 145: 207-216.PubMed CentralPubMedGoogle Scholar
  21. Kayser M, Roewer L, Hedman M, et al: 'Characteristics and frequency of germline mutations at microsatellite loci from the human Y chromosome, as revealed by direct observation in father/son pairs'. Am J Hum Genet. 2000, 66: 1580-1588. 10.1086/302905.PubMed CentralView ArticlePubMedGoogle Scholar
  22. Zhivotovsky LA, Underhill PA, Ginnioglu C, et al: 'The effective mutation rate at Y chromosome short tandem repeats, with application to human population-divergence time'. Am J Hum Genet. 2004, 74: 50-61. 10.1086/380911.PubMed CentralView ArticlePubMedGoogle Scholar
  23. Weir B: Genetic Data Analysis II. 1996, Sinauer Press, Sunderland, MAGoogle Scholar
  24. Lewis PO, Zaykin DV: 'Genetic data analysis: Computer program for the analysis of allelic data. Version 1.0 (d16c)'. 2001, []Google Scholar
  25. Lewontin RC: 'The apportionment of human diversity'. Evol Biol. 1972, 6: 381-398.View ArticleGoogle Scholar
  26. Barbujani G, Magagni A, Minch E, Cavalli-Sforza LL: 'An apportionment of human DNA diversity'. Proc Natl Acad Sci USA. 1997, 94: 4516-4519. 10.1073/pnas.94.9.4516.PubMed CentralView ArticlePubMedGoogle Scholar
  27. Jorde LB, Watkins WS, Bamshad MJ, et al: 'The distribution of human genetic diversity: A comparison of mitochondrial, autosomal, and Y chromosome data'. Am J Hum Genet. 2000, 66: 979-988. 10.1086/302825.PubMed CentralView ArticlePubMedGoogle Scholar
  28. Rosenberg NA, Pritchard JK, Weber JL, et al: 'Genetic structure of human populations'. Science. 2002, 298: 2381-2385. 10.1126/science.1078311.View ArticlePubMedGoogle Scholar
  29. Ramachandran S, Rosenberg NA, Zhivotovsky LA, Feldman MW: 'Robustness of the inference of human population structure: A comparison of X-chromosomal and autosomal microsatellites'. Hum Genom. 2004, 1: 87-97.Google Scholar
  30. Pérez-Lezaun A, Calafell F, Seielstad M, et al: 'Population genetics of Y-chromosome short tandem repeats in humans'. J Mol Evol. 1997, 45: 265-270. 10.1007/PL00006229.View ArticlePubMedGoogle Scholar
  31. Excoffier L, Hamilton G: 'Comment on 'Genetic structure of human populations''. Science. 2003, 300: 1877-View ArticlePubMedGoogle Scholar
  32. Rosenberg NA, Pritchard JK, Weber JL, et al: 'Response to Comment on 'Genetic structure of human populations''. Science. 2003, 300: 1877-View ArticleGoogle Scholar
  33. Calabrese P, Durrett R: 'Dinucleotide repeats in the Drosophila and human genomes have complex, length-dependent mutation processes'. Mol Biol Evol. 2003, 20: 715-725. 10.1093/molbev/msg084.View ArticlePubMedGoogle Scholar
  34. Bowcock AM, Ruiz-Linares A, Tomforhde J, et al: 'High-resolution of human evolutionary trees with polymorphic microsatellites'. Nature. 1994, 368: 455-457. 10.1038/368455a0.View ArticlePubMedGoogle Scholar
  35. Zhivotovsky LA, Rosenberg NA, Feldman MW: 'Features of evolution and expansion of modern humans, inferred from genome-wide microsatellite markers'. Am J Hum Genet. 2003, 72: 1171-1186. 10.1086/375120.PubMed CentralView ArticlePubMedGoogle Scholar
  36. Tang H, Siegmund DO, Shen PD, et al: 'Frequentist estimation of coalescence times from nucleotide sequence data using a tree-based partition'. Genetics. 2002, 161: 447-459.PubMed CentralPubMedGoogle Scholar
  37. Klein RG: The Human Caree. 1999, University of Chicago Press, Chicago, ILGoogle Scholar
  38. Mellars P: 'The fate of the Neanderthals'. Nature. 1998, 395: 539-540. 10.1038/26842.View ArticlePubMedGoogle Scholar
  39. O'Connell JF, Allen J: 'When did humans first arrive in greater Australia and why is it important to know?'. Evol Anthropol. 1998, 6: 132-146. 10.1002/(SICI)1520-6505(1998)6:4<132::AID-EVAN3>3.0.CO;2-F.View ArticleGoogle Scholar
  40. Bowler JM, Johnston H, Olley JM, et al: 'New ages for human occupation and climatic change at Lake Mungo, Australia'. Nature. 2003, 421: 837-840. 10.1038/nature01383.View ArticlePubMedGoogle Scholar
  41. Heyer E, Puymirat J, Dieltjes P, et al: 'Estimating Y chromosome-specific microsatellite mutation frequencies using deep rooting pedigrees'. Hum Mol Genet. 1997, 6: 799-803. 10.1093/hmg/6.5.799.View ArticlePubMedGoogle Scholar
  42. Griffiths RC, Tavaré S: 'Sampling theory for neutral alleles in a varying environment'. Phil Trans R Soc Lond B. 1994, 344: 403-410. 10.1098/rstb.1994.0079.View ArticleGoogle Scholar
  43. Ammerman AJ, Cavalli-Sforza LL: The Neolithic Transition and the Genetics of Populations in Europ. 1984, Princeton University Press, Princeton, NJGoogle Scholar
  44. Cavalli-Sforza LL, Feldman MW: 'The application of molecular genetic approaches to the study of human evolution'. Nat Genet. 2003, 33: 266-275. 10.1038/ng1113.View ArticlePubMedGoogle Scholar
  45. Wall JD, Przeworski M: 'When did the human population size start increasing?'. Genetics. 2001, 155: 1865-1874.Google Scholar
  46. Pluzhnikov A, Di Rienzo A, Hudson RR: 'Inferences about human demography based on multilocus analyses of noncoding sequences'. Genetics. 1999, 161: 1209-1218.Google Scholar
  47. Beaumont MA: 'Recent developments in genetic data analysis: What can they tell us about human demographic history?'. Heredity. 2004, 922: 365-379.View ArticleGoogle Scholar
  48. Ptak S, Przeworski M: 'Evidence for population growth in humans is confounded by fine-scale population structure'. Trends Genet. 2002, 18: 559-593. 10.1016/S0168-9525(02)02781-6.View ArticlePubMedGoogle Scholar
  49. Hammer MF, Blackmer F, Garrigan D, et al: 'Human population structure and its effects on sampling Y chromosome sequence variation'. Genetics. 2003, 164: 1495-1509.PubMed CentralPubMedGoogle Scholar
  50. Wakeley J, Nielsen R, Liu-Cordero SN, Ardlie K: 'The discovery of single-nucleotide polymorphisms -- and inferences about human demographic history'. Am J Hum Genet. 2001, 69: 1332-1347. 10.1086/324521.PubMed CentralView ArticlePubMedGoogle Scholar
  51. Vigilant L, Stoneking M, Harpending H, et al: 'African populations and the evolution of human mitochondrial DNA'. Science. 1991, 253: 1503-1507. 10.1126/science.1840702.View ArticlePubMedGoogle Scholar
  52. Garza JC, Slatkin M, Freimer NB: 'Microsatellite allele frequencies in humans and chimpanzees, with implications for constraints on allele size'. Mol Biol Evol. 1995, 12: 594-603.PubMedGoogle Scholar
  53. Zhivotovsky LA, Feldman MW, Grishechkin SA: 'Biased mutations and microsatellite variation'. Mol Biol Evol. 1997, 14: 926-933. 10.1093/oxfordjournals.molbev.a025835.View ArticlePubMedGoogle Scholar


© Henry Stewart Publications 2004