Skip to main content

On the stability of the Bayenv method in assessing human SNP-environment associations



Phenotypic variation along environmental gradients has been documented among and within many species, and in some cases, genetic variation has been shown to be associated with these gradients. Bayenv is a relatively new method developed to detect patterns of polymorphisms associated with environmental gradients. Using a Bayesian Markov Chain Monte Carlo (MCMC) approach, Bayenv evaluates whether a linear model relating population allele frequencies to environmental variables is more probable than a null model based on observed frequencies of neutral markers. Although this method has been used to detect environmental adaptation in a number of species, including humans, plants, fish, and mosquitoes, stability between independent runs of this MCMC algorithm has not been characterized. In this paper, we explore the variability of results between runs and the factors contributing to it.


Independent runs of the Bayenv program were carried out using genome-wide single-nucleotide polymorphism (SNP) data from samples from 60 worldwide human populations following previous applications of the Bayenv method. To assess factors contributing to the method's stability, we used varying numbers of MCMC iterations and also analyzed a second modified data set that excluded two Siberian populations with extreme climate variables. Between any two runs, correlations between Bayes factors and the overlap of SNPs in the empirical p value tails were surprisingly low. Enrichments of genic versus non-genic SNPs in the empirical tails were more robust than the empirical p values; however, the significance of the enrichments for some environmental variables still varied among runs, contradicting previously published conclusions. Runs with a greater number of MCMC iterations slightly reduced run-to-run variability, and excluding the Siberian populations did not have a large effect on the stability of the runs.


Because of high run-to-run variability, we advise against making conclusions about genome-wide patterns of adaptation based on only one run of the Bayenv algorithm and recommend caution in interpreting previous studies that have used only one run. Moving forward, we suggest carrying out multiple independent runs of Bayenv and averaging Bayes factors between runs to produce more stable and reliable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.


Phenotypic variation along environmental gradients, such as limb length in animals increasing with colder climates [1], animal body size increasing in colder environments [2], and bird coloration becoming darker in more humid climates [3], was first documented in the 1800s. Since then, many associations between phenotypic variation and environmental variation, both among and between species, have been noted [420]. To define these patterns, Huxley [21] introduced the term ‘cline’, which refers to a continuous phenotypic or genetic gradient that is associated with a pattern of geographic and/or environmental heterogeneity.

Various methods have been developed to determine the contribution of environmental variation to allelic or genotypic frequency patterns. Early theory in population genetics was used to derive statistics that allowed variation in allele frequencies to distinguish selection from drift [2224]. Additionally, it has been shown that correlations between features of the environment and allele frequencies can be detected [15, 2527]. Recently, the availability of genome-wide data on DNA polymorphisms has spurred interest in using haplotype frequencies or allele frequency spectra to infer the existence or extent of selection across genomes [2837].

Coop et al. [38] developed a statistical procedure named Bayenv to detect genomic evidence of adaptation to specific environmental variation, and it is this method we address here. The method carries out a locus-by-locus Bayesian analysis to determine whether a pattern in which population allele frequencies vary linearly with environmental variables is more probable than a pattern in which allele frequencies are assumed to be neutral and unrelated to environmental variables.

Given a set of populations with associated environmental variables and genome-wide single-nucleotide polymorphism (SNP) allele frequencies, a null model is first estimated to describe how allele frequencies covary across populations. Sample allele frequencies are drawn from a set of underlying population frequencies, which are assumed to be distributed according to a multivariate normal distribution around a transformed global allele frequency with a variance-covariance matrix representing the genetic structure of the populations. These transformed frequencies are not constrained between 0 and 1. The prior on the variance-covariance matrix is an inverse Wishart distribution, and Markov chain Monte Carlo (MCMC) computation is used to explore the posterior distribution of the covariance matrix, given the ancestral allele frequency and population allele frequencies.

After the covariance of allele frequencies between populations is estimated, the posterior probability of an alternative model for an individual SNP is determined. The alternative model predicts that the transformed population allele frequencies at a SNP are normally distributed around the ancestral allele frequency according to a linear model:

P θ | Ω , α , β ~ N α + β Y , α 1 - α Ω ,

where θ is the transformed allele frequency, α is the ancestral allele frequency, Y is the environmental variable, and Ω is the variance-covariance matrix that was estimated in the first step of this procedure.

For each SNP, the Bayes factor (BF) is the posterior probability of the data at that SNP under the alternative model, integrated over all parameters, then divided by the posterior probability of the data under the null model, integrated over all parameters of the null model. Using a single run of the MCMC under the null distribution and importance sampling, the Bayes factors for multiple environmental variables are estimated quickly for each of the SNPs assayed (see additional details in Coop et al. [38]). This method has been shown to have good power compared to a range of other approaches [39].

The Bayenv method has been applied to many species including humans [4044], plants [4549], fish [50, 51], and mosquitoes [52]. In all cases, the objective was to determine the extent to which a species' pattern of genomic variation can be inferred to be a response to environmental variation - for example climate, geography, photoperiod, or soil type - and hence could be regarded as a signal of adaptation.

Hancock et al. [40] applied the Bayenv method to genome-wide data from 61 worldwide populations for evidence of adaptation to 11 local climatic variables. Analyzing the pattern of variation of genome-wide SNP allele frequencies across the populations, they found evidence for adaptation to several variables, the most significant being latitude, summer relative humidity, summer solar radiation, and winter relative humidity. They reported signs of adaptation at SNPs previously identified by genome-wide association studies (GWAS) to be significantly related to phenotypes associated with environmental gradients, as well as an enrichment of genic versus non-genic SNPs among the SNPs most strongly correlated with the climate variables.

Although the Bayenv method has been applied to different species and environmental variables and has produced some results agreeing with other analytical methods, the stability of the results across multiple MCMC runs is not known. It is also unknown how the results may be affected by the inclusion of specific populations from extreme environments, which could potentially introduce bias [53]. In this study, we carried out many independent runs of the Bayenv program using data from Hancock et al. [40] to determine the degree of variability between runs and the factors that may contribute to any variability.

During the writing of this manuscript, an updated version of this program, Bayenv 2.0 [53], was released. While the Bayes factor estimation procedure is the same, it adds features that allow for the input of pooled population sequencing data and the computation of non-parametric correlations of the allele frequencies with the environmental variables. Although using this new version is advisable, the analyses of the original Bayenv program presented here remain relevant both to further applications of the method as well as to studies using results of previous Bayenv analyses (i.e., Fraser [54]).


All 60 populations; 100,000 MCMC iterations

First, we analyzed the full data set of 60 populations and assessed correlations with 11 climate variables using 100,000 MCMC iterations (as in Hancock et al. [40]). We compared results between pairs of five independent runs as well as between each of our runs and Hancock et al.'s [40] published results. We report comparisons of our runs with an updated set of results made available online by the authors (which we refer to as Hancock2) that used a different number of MCMC burn-ins from the original set of results made available (Hancock1) (see ‘Methods’ section for details). The 16 pairs analyzed here are shown by a's in Table 1.

Table 1 Pairs of Bayenv runs compared

We found that Bayenv produced slightly different results in independent runs of the program. For a given climate variable, pairwise correlations of the logarithms of Bayes factors (BF's) were computed for all SNPs from each run, and correlations were averaged across all pairs of runs mentioned above (shown in Table 1). The averaged correlation coefficients ranged from 0.66 to 0.89 (Table 2a), depending on the climate variable. Despite analyzing only 60 of the 61 populations analyzed in Hancock et al. [40], correlations among our runs were similar to correlations with results from Hancock et al. [40]. Correlations between the two versions of Hancock et al.'s [40] published results were also comparable.

Table 2 Average correlations of the log of the Bayes factors (BF) and average correlations of the empirical p values between runs for each climate variable

Correlations of empirical p values for all SNPs between the pairs of runs listed in Table 1 were also calculated; these correlation coefficients ranged from 0.63 to 0.85 (Table 2e), depending on the climate variable. Again, all pairs of runs compared displayed similar correlation values. For both the BF and p value correlations, the longitude variable had the lowest correlation among runs while winter radiation flux had the highest.

To examine the consistency of SNPs present in the empirical tails of each run, the overlap of SNPs in the tails with p value cutoffs of 0.05, 0.01, 0.005, and 0.001 was calculated. Overlap was defined as the proportion of overlapping SNPs relative to the total number of SNPs in the tail; this proportion was averaged over all climate variables. All pairs of runs showed comparable proportions of overlaps, both when compared among the Blair runs and between Blair runs and Hancock2. These overlap proportions ranged from 0.24 in the 0.001 tails of both runs to 0.57 in the 0.05 tails of both runs when Blair1 was compared to Hancock2 (Table 3a), with similar values for overlap between all Blair runs. In general, greater SNP overlap occurred when the SNPs in one run's tail were compared to the less extreme tail in another run. (In other words, more SNPs in the 0.001 tail of run A overlapped with the SNPs in the 0.005 tail of run B than the 0.001 tail of run B, and even more overlapped with the SNPs in the 0.01 tail of run B).

Table 3 Empirical p value tail comparisons between indicated runs

To assess whether SNPs with more extreme p values had greater overlap among runs, we ordered the empirical p values from the lowest to the highest for each climate variable and binned the SNPs in groups of 1,000. Each bin was compared to the corresponding ordered bin in each of the other Blair runs. Bins of lower p values had the highest overlap and overlap decreased as the empirical p values increased. The mean overlap for the top 1,000 SNPs was 0.44, and overlap was much lower for less significant SNPs (Table 4).

Table 4 Overlap of ranked SNPs (ranked by empirical p value from lowest to highest and binned in groups of 1,000) between runs

There was also variation among the results at individual SNPs. For example, SNPs in the CORIN gene region (rs4558846, rs6447571, rs17601068) gave strong signals (log10BF = 21.9, 28.7, and 20.8 and empirical p values = 2.0 × 10-5, 3.1 × 10-5, and 2.1 × 10-5) of association with minimum winter temperature as reported in Hancock et al. [40] (which we refer to as Hancock1), but show low signals in other runs: Blair1 (log10BF = -0.32, -0.50, and -0.48 and empirical p values = 0.40, 0.57, and 0.44) and Hancock2 (log10BF = -0.33, -0.07, and -0.16 and empirical p values = 0.36, 0.22, and 0.20). Some of this variability could be due to the limited burn-in in Hancock1 (see ‘Methods’ section). Other SNPs, such as rs2075756 in TRIP6, show more consistency. Blair1 obtained a log10BF of 4.0 and empirical p value of 7.9 × 10-4, while Hancock2 reported a log10BF of 1.4 and empirical p value of 3.2 × 10-3 for this SNP. This SNP shows a strong signal with absolute latitude, the variable for which we saw consistently strong signals of genic/non-genic enrichment among SNPs with low empirical p values (see below).

We then assessed whether genic SNPs were enriched among SNPs with the strongest signals by comparing the fraction of genic SNPs to non-genic SNPs in the empirical tails. This enrichment was compared across all five of our runs as well as with Hancock1 and Hancock2. The absolute enrichment values (calculated as described in the ‘Methods’ section) varied across runs and differed from Hancock's [40] published results (results not shown). The significance values of the genic/non-genic enrichments, which represent the proportion of bootstrapped samples that are also enriched, also varied across runs (i.e., for summer solar radiation) (Table 5).

Table 5 Genic/non-genic enrichments for each climate variable in the indicated empirical tails

The absolute latitude and winter relative humidity variables were almost always significant, with the exception of the 0.005 tail in Blair5; the summer precipitation variable was never significant. All three of these findings agreed with Hancock et al.'s published results [40]. However, summer solar radiation and humidity were not consistently significant, though reported to be significant in Hancock et al. [40]. Among the remaining variables, results varied among runs, and our runs showed less significance on average than Hancock et al.'s published results [40].

All 60 populations; 500,000 MCMC iterations

In order to determine whether the observed variability between runs was due to the MCMC not converging, the number of MCMC iterations was increased fivefold to 500,000. Five independent runs were performed, each with different seeds, but with the same full data set of 60 populations. All pairs of the five runs with 500,000 iterations were compared (shown as b's in Table 1).

Between independent runs, both the Bayes factors and p values showed higher correlation coefficients compared to those between independent runs with 100,000 iterations. When averaged across all pairs of runs, the Bayes factor correlation coefficients ranged from 0.82 to 0.95 (Table 2b), and the correlation coefficients of the p values ranged from 0.81 to 0.92 (Table 2f). These correlations are significantly higher than among runs of 100,000 iterations for all environmental variables: paired Wilcoxon tests of these correlations, which for each climate variable are averaged across pairs of runs, give a p value of 0.0038 for the Bayes factors and a p value of 0.0038 for the empirical p values. The SNPs in the empirical p value tails were compared as before, and although the overlap was greater between long runs than between runs with 100,000 iterations, the proportion of overlapping SNPs in the tails was still not high, ranging from 0.64 in the 0.05 tails of two runs to 0.36 in the 0.001 tails of two runs (Table 3b).

Genic/non-genic enrichment significance values also stabilized somewhat in the runs with 500,000 iterations but still did not show the same significance values from run to run (Table 5). Genic/non-genic enrichments were highly significant for the absolute latitude and relative humidity winter variables and again were not significant for the summer precipitation variable across all five runs. Interestingly, none of the five runs showed significant genic/non-genic enrichment for winter precipitation rate, except in the 0.01 tail in Blair5. As before, varying levels of significance were seen for summer solar radiation and summer relative humidity. This differs from Hancock et al. [40], who reported high significance across all tails for these environmental variables.

In searching for the source of the differences described above, biases from allele frequency, SNP ascertainment panel, and method of calculating empirical p values were investigated, but none of these potential biases was shown to explain the variability among runs. First, the data were re-analyzed using empirical p values that were not binned by allele frequency. P value overlap did not increase when binned vs. un-binned p values were compared. Second, the proportions of SNPs from each ascertainment panel in each tail were calculated, and these proportions were found to be similar to the overall proportions of SNPs in each ascertainment panel. Additionally, when each ascertainment panel was considered separately, the overlap in the tails of p values was similar to the overlap in the tails when all three ascertainment panels were considered together.

Without Siberian populations; 150,000 iterations

Because the Siberian populations, the Naukan Yup’ik and Maritime Chukchee, live in some of the most extreme climatic regions among the 60 sampled populations, it may be possible that removing these populations from the data set could affect the variability of Bayes factors, empirical p values, and genic/non-genic enrichments among independent runs. All pairs of runs using this modified data set were compared (shown as c's in Table 1). Pairwise correlations of the log of the Bayes factors (Table 2c) and p values (Table 2g) as well as fractions of overlap of SNPs in the 0.05, 0.01, 0.005, and 0.001 tails of the empirical p values (Table 3c) were comparable to the correlations between the runs of the full data set (60 populations with 100,000 iterations (Table 2a)). This suggests that the inclusion of populations with extreme environmental variables may not affect run-to-run variability.

To determine whether excluding the Siberian populations affected the results of the Bayenv method, each of the runs without the Siberian populations was compared to the updated Hancock results published on the dbCline website (Hancock2) (shown by d's in Table 1). Pairwise correlations between the log of the Bayes factors of these pairs (Table 2d), averaged across all pairs, were similar to correlations between pairs of non-Siberian runs (Table 2c). However, correlations of empirical p values between Hancock2 and non-Siberian runs (Table 2h) were lower than correlations between pairs of non-Siberian runs (Table 2g) and also lower than the correlations between pairs of runs using the full data set (Table 2e). Similarly, overlap in the empirical tails is lower when non-Siberian runs are compared to the results from Hancock2 (Table 3d), demonstrating a slight difference in results when these populations were excluded.

Although genic/non-genic enrichment significance values varied across runs of the non-Siberian data set (Table 5), some patterns differed from the results that included the full data set of 60 populations. The genic/non-genic enrichment for the absolute latitude variable was less significant than in runs analyzing all populations (Table 5), showing a small effect of exclusion of the Siberian populations. In addition, winter precipitation rate was sometimes significant in the non-Siberian runs; this variable was significant in Hancock et al.'s [40] published data but not in our runs with the full population set using 500,000 MCMC iterations.


We have explored the computational replicability of the Bayenv program [38] in determining evidence for positive selection in response to climatic variation. We used genome-wide SNP data from 60 worldwide human populations and 11 climate variables as in Hancock et al. [40] and carried out independent Bayenv runs with 100,000 and 500,000 MCMC iterations, as well as with 150,000 iterations while omitting Siberian populations with extreme climatic variables. In comparing pairs of our runs as well as each of our runs with the results previously published in Hancock et al. [40], results from the Bayenv method appear to be variable across runs in their Bayes factors, empirical p values, and in the values and significances of enrichments of genic SNPs in the empirical tails. Although some run-to-run variability is expected in any MCMC algorithm, we show here that the differences between runs are sufficient to lead to varying biological conclusions regarding human adaptation.

In runs with 100,000 MCMC iterations, correlations of Bayes factors between replicate runs were low (0.66–0.88) between pairs of our runs, between our runs and the results of Hancock et al. [40], and between the different versions of results posted by Hancock et al. (Table 2a). To explore the possibility that the MCMC had failed to converge, we increased the number of MCMC iterations to 500,000; although these correlations increased as a result (0.82–0.93) (Table 2b), there was still substantial run-to-run variability in further analyses.

Because Bayes factors are dependent on the accuracy of the null model and, therefore, may be substantially inflated due to imperfections of this model [40], we focused further analyses on the empirical p values. Hancock et al. [40] write, ‘Since we cannot expect the null model to account fully for the effects of population structure, we emphasize that we cannot take the BFs themselves at face value, nor can they be directly compared across climate variables. Therefore, we took a conservative approach and conducted subsequent analyses by comparing each SNP to the empirical distribution.’ In our analysis, empirical p values were seen to be no more stable than the Bayes factors themselves, as correlation coefficients of the empirical p values were in the same range as the correlation coefficients of the Bayes factors (Table 2e). As before, the correlation of p values between runs improved when the MCMC iterations were increased to 500,000 (Table 2f). Although we recommend using a large number of MCMC iterations, this does not completely stabilize Bayenv results from run to run.

Amounts of SNP overlap in the tails of the empirical p value distributions showed that the relative ranking of the SNPs associated with environmental variables varied across runs, even with a larger number of MCMC iterations. While the overlap of SNPs between the extreme tails (i.e., top 0.01%) of two runs was found to be low, we recognize that this slightly overestimates run-to-run variability. For example, 24% of the 0.001 tail of Hancock2 overlapped with the 0.001 tail of Blair1; that proportion increased to 83% with the 0.005 tail of Blair1 (Table 3a). This suggests that the SNPs with the lowest empirical p values have generally low p values in all runs. Analysis of the overlap of SNPs with low empirical p values versus the overlap of SNPs with high or midrange empirical p values shows that strong signals are also more reproducible from run-to-run than weaker signals.

Despite this trend, a SNP's significance and relative ranking nearly always vary from one run of Bayenv to another. SNPs inferred to be significant using some pre-defined p-value threshold in one run may not be found to be significant in another run. Thus, the low fractions of empirical p-value tail overlap between runs suggest that caution is advisable in drawing inferences based on only one run, especially when making conclusions regarding individual SNPs or genes (see ‘Results’ section).

Finally, we assessed the stability of inferences regarding the enrichment of genic SNPs among SNPs strongly associated with a given climate variable. Evidence of such an enrichment has been argued as evidence of strong selection on genic regions in response to an associated environmental pressure [40, 45]. Because the stability of these significances depends only on the ratio of genic to non-genic SNPs in the empirical tails, as opposed to the rank of each individual SNP, this measure could be consistent despite variability of Bayes factors and empirical p values. However, the significances of climate variables including summer maximum temperature, summer relative humidity, summer solar radiation, winter minimum temperature, winter precipitation rate, and winter solar radiation varied widely from run to run (Table 5).

Of particular interest are summer solar radiation and summer relative humidity, both claimed to be significant by Hancock et al. [40] but for which significance did not hold in our analyses. For these variables, as well as others, neither the enrichment values nor the significance of the enrichment values increased or decreased systematically when the tail cut-off was more extreme, as suggested in Hancock et al. [40]. The lack of significance for these variables in the Hancock2 analysis, which includes all 61 populations, shows that the exclusion of the data from Australian Aborigines from our analysis is not the cause for this difference in significance. Thus, replicate runs demonstrate that these two variables may have been less crucial in human adaptation than suggested previously [40].

Despite this variability between runs, certain climate variables show some consistency in genic/non-genic enrichments and agree with conclusions of Hancock et al. [40] (Table 5). Winter relative humidity is significant across all runs, including those in which the Siberian populations are excluded, suggesting that this may be a robust signal of adaptation in response to this climatic variable. Absolute latitude also shows a strong signal and is significant across all runs that include all populations. That we were able to replicate the significance of both of these variables, despite the fact that we did not include the Australian Aborigines (included in Hancock et al. [40]) underscores their potential importance in human adaptation.

We note that populations with extreme climate variables may indeed have an effect on some conclusions drawn from Bayenv analyses. For example, without the Siberian populations, an enrichment of genic SNPs was no longer significant for latitude (Table 5). Thus, while the inclusion of outlying populations may affect biological conclusions, it does not appear to affect run-to-run variability (see ‘Results’ section).

In summary, our analysis of the consistency of genic enrichments between runs demonstrates the importance of performing multiple runs of Bayenv. Two independent runs can potentially lead to very different biological conclusions. Significance of a particular environmental variable in multiple Bayenv runs constitutes much stronger evidence for its relevance and importance for human adaptation.

A new version of the Bayenv method, Bayenv 2.0 [53], has recently been released. Running the new version with 100,000 MCMC iterations produces slightly more stable results among runs than the old version; however, the variability is comparable to the runs of the old version with 500,000 MCMC iterations (results not shown). Gunther et al. note that their inclusion of non-parametric analyses in Bayenv 2.0 helps to account for potential false positives due to outlying populations [53]; however, our results suggest that results run on the previous version of the Bayenv program that do not include outlying populations may still lead to varying results across runs. Thus, while we suggest using the new version, the results described in this manuscript remain relevant to the application of Bayenv 2.0.

It can be difficult to disentangle stability and specificity, and we recognize that some of the run-to-run variability of this method may be due to less strong signals of environmental adaptation (i.e., winter minimum temperature as compared to latitude). However, as every run may be different, it is still advisable to carry out multiple runs when making biological conclusions, especially when the signal tested may not be strong to begin with. Averaging Bayes factors of two or more independent runs before calculating the empirical p values produces more stable empirical p values and genic/non-genic enrichments (results not shown). Additionally, individual SNPs that show large Bayes factors in at least one run could be rerun with more MCMC iterations to determine a Bayes factor value to which all iterations converge. Unfortunately, this process would not be as useful for a genome-wide analysis, as a large set of SNPs is necessary for the calculation of the empirical p values and the genic/non-genic enrichments.


We conclude that, in using the Bayenv method, relying on the relative rankings of SNPs calculated from only one run of the program may lead to biological conclusions that might not have been reached with another independent run of the program. Neither increasing the number of MCMC iterations nor removing outlying populations completely eliminated this variability. Thus, we suggest carrying out several independent runs of the Bayenv analysis to assess variability of the Bayes factors and empirical p values across runs. We also suggest averaging Bayes factors from independent runs to produce more stable results. With these modifications, future discoveries of environmental adaptation within species using the Bayenv method will be more accurate, interpretable, and easily compared between studies.



Allele frequency data on 623,318 SNPs from samples of 60 human populations from the freq_data file (61pops_climate_freqs.txt) on the dbCline website ( were analyzed. The data set included samples from 52 HGDP populations plus Vasekela !Kung, Amhara, Naukan Yup’ik, Maritime Chukchee, Luhya, Maasai, Tuscans, and Gujarati. These were the same populations used by Hancock et al. [40], with the exception of the Australian Aborigines, who were not included in the allele frequency file on dbCline. To translate allele frequencies into allele counts, the sample sizes from Li et al. [55] were used for the HGDP populations and from the supplementary information of Hancock et al. [40] for the additional eight populations.

Climate variables

The climate data consisted of the 11 variables in Hancock et al. [40] that are posted on the dbCline website: latitude, absolute latitude, longitude, winter minimum temperature, summer maximum temperature, summer precipitation rate, winter precipitation rate, summer radiation flux, winter radiation flux, summer relative humidity, and winter relative humidity. Climate data were obtained for each population directly from the dbCline website (61pops_climate_freqs.txt).

Covariance matrix estimation

For the set of 60 populations, three covariance matrices, each one representing a different ascertainment panel (Illumina 250 K, Illumina 300 K, or Illumina AFR), were generated using the Bayenv program (as in Hancock et al. [40]). The program was run for 100,000 MCMC iterations, and the entries of each matrix were averaged over the last three output matrices (output every 5,000 iterations). Visual comparison showed the matrices to be qualitatively similar to those in Coop et al. [38], and the matrices were stable from run to run. For our analysis, one covariance matrix for each ascertainment panel was estimated as described here and used for all runs.

Bayenv runs

Five independent runs of the Bayenv program were carried out using the full data set of 60 populations and 11 climate variables. As with the covariance matrices, SNPs were analyzed with their corresponding ascertainment panel. Each run used a different random seed but the same input data set and 100,000 MCMC iterations. Bayes factors produced from the three ascertainment panels were combined into one file for each run. These runs are referred to as Blair1 - Blair5 (see Table 1).

In a separate analysis, five independent runs of the Bayenv program were carried out as above, using the same full data set but increasing the MCMC iterations to 500,000 (referred to as LongBlair1 - LongBlair5) (see Table 1). For all runs, we use the default MCMC burn-in values.

Bayenv runs without Siberia

Five independent runs of the Bayenv program were carried out using a modified data set and 150,000 MCMC iterations. In this modified data set, all 11 climate variables were included, but the Siberian populations (Naukan Yup’ik and Maritime Chukchee) were excluded (referred to as W/O Siberia1 - W/O Siberia5; see Table 1). The aim here was to explore how the Siberian populations, who live in some of the most extreme climatic regions, might affect the stability and conclusions drawn from Bayes factors, empirical p values, and genic/non-genic enrichment values (see below).

Hancock runs

Bayes factors and empirical p values from Hancock et al. [40] were obtained from the dbCline website. Following the Hancock et al. [40] publication, the authors posted a set of results, which we refer to as Hancock1, which used fewer burn-ins for the MCMC procedure and thus failed to converge (Anna Di Rienzo, personal correspondence). After further correspondence with the authors, a second set of results that used a longer burn-in was posted in autumn 2013 (which we refer to as Hancock2). We examined BF correlations and empirical p value correlations and overlap between these two data sets (described below). For calculations of genic/non-genic enrichment (described below), we performed the analysis on both Hancock1 and Hancock2. In general, our analysis of the Hancock1 data set gave similar results to those published by Hancock et al. [40].

SNP pvalues

As in Hancock et al. [40], we calculated empirical p values by binning SNPs based on their global allele frequency (across 60 populations). For each SNP, a transformed rank statistic was computed by comparing the computed Bayes factor for the SNP with the Bayes factors from other SNPs in the same global allele frequency bin. For this process, all SNPs were analyzed together, and no distinction was made for SNPs belonging to different ascertainment panels.

Genic/non-genic enrichments

To calculate genic/non-genic enrichments, a SNP was defined as genic if it was within 10 kb of a gene and non-genic if it was greater than 50 kb from a gene (as in Hancock et al. [40], personal correspondence). Enrichment in an empirical tail was calculated with the following equation:

Enrichment = n g n ng N g N ng ,

where ng and nng are the number of genic and non-genic SNPs, respectively, in the empirical tail, and Ng and Nng are the number of genic and non-genic SNPs, respectively, among all tested SNPs. To determine the significance of the genic/non-genic enrichments, 500-kb regions from the genome were re-sampled with replacement (also as in Hancock et al. [40]). Then, given the p values of the SNPs in the re-sampled regions, we calculated enrichment values in each empirical tail. This bootstrap re-sampling procedure was repeated 10,000 times. Significance values were determined by counting the fraction of bootstrapped samples with enrichment values above 1. In Table 5, one star denotes that 95%–97.5% of bootstrapped samples were enriched, two stars denote 97.5%–99%, and three stars denote >99%; zero stars denotes that <95% of bootstrap samples were enriched, and thus that the genic enrichment is not significant.

Comparison of runs

Each run produces a Bayes factor for each climate variable for each SNP. From those, the empirical p values were calculated. For each climate variable, the Pearson correlations of the log of the Bayes factors and the Pearson correlations of the empirical p values were computed for each pair of runs compared (see Table 1). For each set of comparisons (i.e., Blair1-5, or W/O_Sib1-5 vs. Hancock2) the correlation coefficients were then averaged across all relevant pairs of runs (see letters of Table 1). For all pairs shown in Table 1, empirical p values were also compared by calculating the fraction of SNPs in the tail of one run that was present in the tail of the other run. In another analysis, to assess differences in run stability by empirical p value, SNPs were ordered by empirical p value from lowest to highest and binned into groups of 1,000. Overlaps between the corresponding bins of different runs (Blair1-5) were calculated. (This assessed what proportion of the 1,000 most-significant SNPs in Blair1 were present in the 1,000 most-significant SNPs in Blair2, what proportion of SNPs ranked 1,001–2,000 in Blair1 were present in SNPs ranked 1,001–2,000 of Blair2, and so forth.) Finally, significances of genic/non-genic enrichment values were calculated for each run, as described above, and compared.



Bayes factor


Markov Chain Monte Carlo


Single-nucleotide polymorphism.


  1. Allen JA: The influence of physical conditions in the genesis of species. Radical Rev. 1877, 1: 108-140.

    Google Scholar 

  2. Bergmann C: Über die Verhältnisse der wärmeökonomie der Thiere zu ihrer Grösse. Göttinger Studien. 1847, 3: 595-708.

    Google Scholar 

  3. Gloger CL: Das Abändern der Vögel durch Einfluss des Klimas. 1833, August Schulz: Breslau

    Book  Google Scholar 

  4. Ashton KG, Tracy MC, de Queiroz A: Is Bergmann's rule valid for mammals?. Am Nat. 2000, 156: 390-415. 10.1086/303400.

    Article  Google Scholar 

  5. Brown JH, Lee AK: Bergmann's rule and climatic adaptation in woodrats (Neotoma). Evolution. 1969, 23: 329-338. 10.2307/2406795.

    Article  Google Scholar 

  6. Coyne JA, Beecham E: Heritability of two morphological characters within and among natural populations of Drosophila melanogaster. Genetics. 1987, 117: 727-737.

    PubMed Central  CAS  PubMed  Google Scholar 

  7. Freckleton RP, Harvey PH, Pagel M: Bergmann's rule and body size in mammals. Am Nat. 2003, 161: 821-825. 10.1086/374346.

    Article  PubMed  Google Scholar 

  8. Harcourt AH, Schreier BM: Diversity, body mass, and latitudinal gradients in primates. Int J Primatol. 2009, 30: 283-300. 10.1007/s10764-009-9342-5.

    Article  Google Scholar 

  9. Huey RB, Gilchrist GW, Carlson ML, Berrigan D, Serra L: Rapid evolution of a geographic cline in size in an introduced fly. Science. 2000, 287: 308-309. 10.1126/science.287.5451.308.

    CAS  Article  PubMed  Google Scholar 

  10. Jablonski NG, Chaplin G: The evolution of human skin coloration. J Hum Evol. 2000, 39: 57-106. 10.1006/jhev.2000.0403.

    CAS  Article  PubMed  Google Scholar 

  11. Jain S, Bradshaw A: Evolutionary divergence among adjacent plant populations. I. The evidence and its theoretical analysis. Heredity. 1966, 21: 407-441. 10.1038/hdy.1966.42.

    Article  Google Scholar 

  12. Johnston RE, Selander RK: Evolution in the house sparrow II. Adaptive differentiation in North American populations. Evolution. 1971, 25: 1-28. 10.2307/2406496.

    Article  Google Scholar 

  13. Katzmarzyk PT, Leonard WR: Climatic influences on human body size and proportions: ecological adaptations and secular trends. Am J Phys Anthropol. 1998, 106: 483-503. 10.1002/(SICI)1096-8644(199808)106:4<483::AID-AJPA4>3.0.CO;2-K.

    CAS  Article  PubMed  Google Scholar 

  14. Leonard WR, Sorensen MV, Galloway VA, Spencer GJ, Mosher MJ, Osipova L, Spitsyn VA: Climatic influences on basal metabolic rates among circumpolar populations. Am J Hum Biol. 2002, 14: 609-620. 10.1002/ajhb.10072.

    Article  PubMed  Google Scholar 

  15. Mullen LM, Hoekstra HE: Natural selection along an environmental gradient: a classic cline in mouse pigmentation. Evolution. 2008, 62: 1555-1570. 10.1111/j.1558-5646.2008.00425.x.

    CAS  Article  PubMed  Google Scholar 

  16. Pool JE, Aquadro CF: The genetic basis of adaptive pigmentation variation in Drosophila melanogaster. Mol Ecol. 2007, 16: 2844-2851. 10.1111/j.1365-294X.2007.03324.x.

    PubMed Central  Article  PubMed  Google Scholar 

  17. Storz JF, Balasingh J, Bhat HR, Nathan PT, Doss DPS, Prakash AA, Kunz TH: Clinal variation in body size and sexual dimorphism in an Indian fruit bat, Cynopterus sphinx (Chiroptera: Pteropodidae). Biol J Linnean Soc. 2001, 72: 17-31. 10.1111/j.1095-8312.2001.tb01298.x.

    Article  Google Scholar 

  18. Relethford JH: Apportionment of global human genetic diversity based on craniometrics and skin color. Am J Phys Anthropol. 2002, 118: 393-398. 10.1002/ajpa.10079.

    Article  PubMed  Google Scholar 

  19. Sumner F: The analysis of a concrete case of intergradation between two subspecies. Proc Natl Acad Sci. 1929, 15: 110-120. 10.1073/pnas.15.2.110.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  20. Roberts DF: Body weight, race and climate. Am J Phys Anthropol. 1953, 11: 533-558. 10.1002/ajpa.1330110404.

    CAS  Article  PubMed  Google Scholar 

  21. Huxley J: Clines: an auxiliary taxonomic principle. Nature. 1839, 142: 219-220.

    Article  Google Scholar 

  22. Cavalli-Sforza L: Population structure and human evolution. Proc R Soc Lond B Biol Sci. 1966, 164: 362-379. 10.1098/rspb.1966.0038.

    CAS  Article  PubMed  Google Scholar 

  23. Endler JA: Natural Selection in the Wild. 1986, Princeton University Press: Princeton

    Google Scholar 

  24. Lewontin R, Krakauer J: Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics. 1973, 74: 175-195.

    PubMed Central  CAS  PubMed  Google Scholar 

  25. Haldane JB: The theory of a cline. J Genet. 1948, 48: 277-284. 10.1007/BF02986626.

    CAS  Article  PubMed  Google Scholar 

  26. Slatkin M: Gene flow and selection in a cline. Genetics. 1973, 75: 733-756.

    PubMed Central  CAS  PubMed  Google Scholar 

  27. Lenormand T: Gene flow and the limits to natural selection. Trends Ecol Evol. 2002, 17: 183-189. 10.1016/S0169-5347(02)02497-7.

    Article  Google Scholar 

  28. Akey JM, Zhang G, Zhang K, Jin L, Shriver MD: Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002, 12: 1805-1814. 10.1101/gr.631202.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  29. Barreiro L, Laval G, Quach H, Patin E, Quintana-Murci L: Natural selection has driven population differentiation in modern humans. Nat Genet. 2008, 40: 340-345. 10.1038/ng.78.

    CAS  Article  PubMed  Google Scholar 

  30. Bonin A: Explorative genome scan to detect candidate loci for adaptation along a gradient of altitude in the common frog (Rana temporaria). Mol Ecol. 2006, 23: 773-783.

    CAS  Google Scholar 

  31. Coop G, Pickrell JK, Novembre J, Kudaravalli S, Li J, Absher D, Myers R, Cavalli-Sforza LL, Feldman MW, Pritchard JK: The role of geography in human adaptation. PLoS Genet. 2009, 5: e1000500-10.1371/journal.pgen.1000500.

    PubMed Central  Article  PubMed  Google Scholar 

  32. Fournier-Level A, Korte A, Cooper MD, Nordborg M, Schmitt J, Wilczek AM: A map of local adaptation in Arabidopsis thaliana. Science. 2011, 334: 86-89. 10.1126/science.1209271.

    CAS  Article  PubMed  Google Scholar 

  33. Kohn MH, Pelz HJ, Wayne RK: Locus-specific genetic differentiation at Rw among warfarin resistant rat (Rattus norvegicus) populations. Genetics. 2003, 164: 1055-1070.

    PubMed Central  CAS  PubMed  Google Scholar 

  34. Pickrell JK, Coop G, Novembre J, Kudaravalli S, Li JZ, Absher D, Srinivasan BS, Barsh GS, Myers RM, Feldman MW, Pritchard JK: Signals of recent positive selection in a worldwide sample of human populations. Genome Res. 2009, 19: 826-837. 10.1101/gr.087577.108.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  35. Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312: 1614-1620. 10.1126/science.1124309.

    CAS  Article  PubMed  Google Scholar 

  36. Tang K, Thornton KR, Stoneking M: A new approach for using genome scans to detect recent positive selection in the human genome. PLoS Biol. 2007, 5: e171-10.1371/journal.pbio.0050171.

    PubMed Central  Article  PubMed  Google Scholar 

  37. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4 (3): e72-10.1371/journal.pbio.0040072.

    PubMed Central  Article  PubMed  Google Scholar 

  38. Coop G, Witonsky D, Di Rienzo A, Pritchard JK: Using environmental correlations to identify loci underlying local adaptation. Genetics. 2010, 185: 1411-1423. 10.1534/genetics.110.114819.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  39. De Mita S, Thuillet AC, Gay L, Ahmadi N, Manel S, Ronfort J, Vigouroux Y: Detecting selection along environmental gradients: analysis of eight methods and their effectiveness for outbreeding and selfing populations. Mol Ecol. 2013, 22: 1383-1399. 10.1111/mec.12182.

    Article  PubMed  Google Scholar 

  40. Hancock AM, Witonsky DB, Alkorta-Aranburu G, Beall CM, Gebremedhin A, Sukernik R, Utermann G, Pritchard JK, Coop G, Di Rienzo A: Adaptations to climate-mediated selective pressures in humans. PLoS Genet. 2011, 7 (4): e1001375-10.1371/journal.pgen.1001375. doi:10.1371/journal.pgen.1001375

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  41. Hancock AM, Witonsky DB, Ehler E, Alkorta-Aranburu G, Beall C, Gebremedhin A, Sukernik R, Utermann G, Pritchard J, Coop G, Di Reinzo A: Human adaptations to diet, subsistence, and ecoregion are due to subtle shifts in allele frequency. Proc Natl Acad Sci. 2010, 107 (Suppl 2): 8924-8930.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  42. Hancock AM, Clark VJ, Qian Y, Di Rienzo A: Population genetic analysis of the uncoupling proteins supports a role for UCP3 in human cold resistance. Mol Biol Evol. 2011, 28: 601-14. 10.1093/molbev/msq228.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  43. Hancock A, Witonsky D, Gordon AS, Eshel G, Pritchard JK, Coop G, Di Rienzo A: Adaptations to climate in candidate genes for common metabolic disorders. PLoS Genet. 2008, 4: e32-10.1371/journal.pgen.0040032.

    PubMed Central  Article  PubMed  Google Scholar 

  44. Fumagalli M, Sironi M, Pozzoli U, Ferrer-Admettla A, Pattini L, Nielsen R: Signatures of environmental genetic adaptation pinpoint pathogens as the main selective pressure through human evolution. PLoS Genet. 2011, 7 (11): e1002355-10.1371/journal.pgen.1002355.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  45. Hancock AM, Brachi B, Faure N, Horton MW, Jarymowycz LB, Sperone FG, Toomajian C, Roux F, Berelson J: Adaptation to climate across the Arabidopsis thaliana genome. Science. 2011, 334: 83-86. 10.1126/science.1209244.

    CAS  Article  PubMed  Google Scholar 

  46. Eckert AJ, Bower AD, Gonzalez-Martinez SC, Wegrzyn JL, Coop G, Neale DB: Back to nature: ecological genomics of loblolly pine (Pinus taeda, Pinaceae). Mol Ecol. 2010, 2010 (19): 3789-805.

    Article  Google Scholar 

  47. Fang Z, Pyhäjärvi T, Weber AL, Dawe RK, Glaubitz JC, Gonzalez JJ, Ross-Ibarra C, Doebley J, Morrell PL, Ross-Ibarra J: Megabase-scale inversion polymorphism in the wild ancestor of maize. Genetics. 2012, 191: 883-894. 10.1534/genetics.112.138578.

    PubMed Central  Article  PubMed  Google Scholar 

  48. Keller SR, Levsen N, Olson MS, Tiffin P: Local adaptation in the flowering- time gene network of balsam poplar, Populus balsamifera L. Mol Biol Evol. 2012, 29: 3143-52. 10.1093/molbev/mss121.

    CAS  Article  PubMed  Google Scholar 

  49. Pyhajarvi T, Hufford MB, Mezmouk S, Ross-Ibarra J: Complex patterns of local adaptation in teosinte. Genome Biol Evol. 2013, 5 (9): 1594-1609. 10.1093/gbe/evt109.

    PubMed Central  Article  PubMed  Google Scholar 

  50. Jones FC, Chan YF, Schmutz J, Grimwood J, Brady SD, Southwick AM, Absher DM, Myers RM, Riemchen TE, Deagle BE, Schluter D, Kingsley DM: A genome-wide SNP genotyping array reveals patterns of global and repeated species-pair divergence in sticklebacks. Curr Biol. 2011, 2011 (22): 83-90.

    Google Scholar 

  51. Limborg MT, Blankenship SM, Young SF, Utter FM, Seeb LW, Hansen MHH, Seeb JE: Signatures of natural selection among lineages and habitats in Oncorhynchus mykiss. Ecol Evol. 2012, 2: 1-18.

    PubMed Central  Article  PubMed  Google Scholar 

  52. Cheng C, White BJ, Kamdem C, Mockaitis K, Costantini C, Hahn MW, Besansky NJ: Ecological genomics of Anopheles gambiae along a latitudinal cline in cameroon: a population resequencing approach. Genetics. 2012, 190: 1417-1432. 10.1534/genetics.111.137794.

    PubMed Central  Article  PubMed  Google Scholar 

  53. Gunther T, Coop G: Robust identification of local adaptation from allele frequencies. Genetics. 2013, 195: 205-220. 10.1534/genetics.113.152462.

    PubMed Central  Article  PubMed  Google Scholar 

  54. Fraser HB: Gene expression drives local adaptation in humans. Genome Res. 2013, 23: 1089-1096. 10.1101/gr.152710.112.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  55. Li JZ, Absher DM, Tang H, Southwick AM, Casto AM, Ramachandran S, Cann HM, Barsh GS, Feldman M, Cavalli-Sforza LL, Myers RM: Worldwide human relationships inferred from genome-wide patterns of variation. Science. 2008, 319: 1100-1104. 10.1126/science.1153717.

    CAS  Article  PubMed  Google Scholar 

Download references


We would like to thank Jonathan Pritchard, Graham Coop, Brenna Henn, and Anna Di Rienzo for helpful discussion and comments, as well as two anonymous reviewers for their input. This work was supported in part by the Center for Computational, Evolutionary and Human Genomics at Stanford University.

Author information



Corresponding author

Correspondence to Lily M Blair.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors' contributions

LB participated in the design of the study and performed the statistical analysis. JG and MF participated in its design and coordination. All authors read and approved the final manuscript.

Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Cite this article

Blair, L.M., Granka, J.M. & Feldman, M.W. On the stability of the Bayenv method in assessing human SNP-environment associations. Hum Genomics 8, 1 (2014).

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI:


  • Environmental adaptation
  • Positive selection
  • Genome-wide scans
  • Human adaptation
  • Markov chain monte carlo
  • Natural selection