Open Access

Identifying positive selection candidate loci for high-altitude adaptation in Andean populations

  • Abigail W. Bigham1,
  • Xianyun Mao2,
  • Rui Mei3,
  • Tom Brutsaert4,
  • Megan J. Wilson5,
  • Colleen Glyde Julian5,
  • Esteban J. Parra6,
  • Joshua M. Akey7,
  • Lorna G. Moore8 and
  • Mark D. Shriver2
Human Genomics 2009, 4:79

https://doi.org/10.1186/1479-7364-4-2-79

Received: 14 October 2009

Accepted: 14 October 2009

Published: 1 December 2009

Abstract

High-altitude environments (>2,500 m) provide scientists with a natural laboratory to study the physiological and genetic effects of low ambient oxygen tension on human populations. One approach to understanding how life at high altitude has affected human metabolism is to survey genome-wide datasets for signatures of natural selection. In this work, we report on a study to identify selection-nominated candidate genes involved in adaptation to hypoxia in one highland group, Andeans from the South American Altiplano. We analysed dense microarray genotype data using four test statistics that detect departures from neutrality. Using a candidate gene, single nucleotide polymorphism-based approach, we identified genes exhibiting preliminary evidence of recent genetic adaptation in this population. These included genes that are part of the hypoxia-inducible transcription factor (HIF) pathway, a biochemical pathway involved in oxygen homeostasis, as well as three other genomic regions previously not known to be associated with high-altitude phenotypes. In addition to identifying selection-nominated candidate genes, we also tested whether the HIF pathway shows evidence of natural selection. Our results indicate that the genes of this biochemical pathway as a group show no evidence of having evolved in response to hypoxia in Andeans. Results from particular HIF-targeted genes, however, suggest that genes in this pathway could play a role in Andean adaptation to high altitude, even if the pathway as a whole does not show higher relative rates of evolution. These data suggest a genetic role in high-altitude adaptation and provide a basis for genotype/phenotype association studies that are necessary to confirm the role of putative natural selection candidate genes and gene regions in adaptation to altitude.

Keywords

genome scan positive selection Native Americans altitude adaptation

Introduction

Identifying gene regions showing signatures of natural selection in the human genome offers a window into our recent evolutionary past, as well as a deeper understanding of how this evolutionary force has shaped extant patterns of variation. Several recent studies have analysed dense single nucleotide polymorphism (SNP) genotype data to detect signatures of selection in three major continental groups: West Africans, East Asians and Northern Europeans [1–6]. To date, only a few studies have focused on identifying candidate genes under selection with reference to a specific selective pressure [7, 8]. Here, we use high-density SNP data to search for candidate genes for altitude adaptation in Andean populations. By expanding the populations of study to the Americas and targeting a specific selective pressure, hypobaric hypoxia, we can produce a more detailed and nuanced understanding of this evolutionary process.

High-altitude environments provide scientists with a natural laboratory to study the genetic and physiological effects of hypobaric hypoxia, the decreased partial pressure of oxygen at high altitude resulting in lower circulating oxygen levels in the body, on endemic highland species [9–11]. Humans have inhabited three high-altitude (>2,500 m) zones of the world for multiple generations: the Tibetan Plateau, the Andean Altiplano and the Semien Plateau of Ethiopia (Figure 1). Each of these populations exhibits unique circulatory, respiratory and haematological adaptations to life at high altitude. For example, research has shown that Tibetan and Ethiopian populations have relatively low haemoglobin concentrations, in contrast to the 'classic' Andean physiological adaptation (also seen in high-altitude sojourners), where haemoglobin concentrations are elevated compared with low-altitude groups [12–17]. Andeans also exhibit lower levels of resting ventilation, a more 'blunted' hypoxic ventilatory response, higher levels of pulmonary arterial pressure and an increased frequency of chronic mountain sickness compared with their Tibetan counterparts [18, 19]. Overall, this research has led to a substantial body of literature documenting the suite of human physiological responses to high-altitude habitation (for a review, see Hornbein and Schoene [20]).
Figure 1

The geography of human adaptation to high altitude. Geographic locations where humans have adapted to life at high altitude are indicated in grey and include the Andean Altiplano of South America, the Tibetan Plateau of Central Asia and the Semien Plateau of Ethiopia. The inset indicates the sampling locations of the four Native American population samples. The populations include Peruvian Quechua, Bolivian Aymara, Nahua, Mixtec and Tlapanec speakers from Guerrero, Mexico, and Maya from the Yucatan Peninsula, Mexico.

The physiological differences between low- and high-altitude populations have been well documented, but little work has focused on understanding the genetic bases or identifying the genetic variants underlying these adaptations [21, 22]. The few natural selection genetics studies conducted previously have focused on specific genes hypothesised to play a role in adaptation to altitude, but none of them have found conclusive evidence of this evolutionary force [21–27]. One recent study scanned 998 genetic markers in seven Nepalese Sherpa porters and identified genomic regions that may have been involved in adaptation to altitude [28]. However, genome scans using much larger panels of genetic markers and larger sample sizes will be necessary to expand upon these very preliminary findings. Related research has explored the heritability of specific altitude phenotypes such as arterial oxygen saturation, haemoglobin concentration and thoracic skeletal dimensions [16, 29, 30]. One heritability study concluded that a major autosomal dominant locus exists for high oxygen saturation, where Tibetan women carrying this high oxygen saturation allele had a higher offspring survival rate than women possessing the low oxygen saturation allele [30]. Even though research of this nature documents the potential for natural selection to act on phenotypic traits, it does not identify the gene(s) controlling the phenotype.

As part of an ongoing project to understand the role of natural selection in shaping human genetic diversity in high-altitude populations, we genotyped 490,032 autosomal SNPs using the Affymetrix, Inc. (Santa Clara, CA) GeneChip® Mapping 500 K array in 195 persons of high-altitude or low-altitude descent. By comparing high-altitude populations with related populations living at low altitude, a list of selection-nominated candidate genes and gene regions was generated using four summary statistics: locus-specific branch length (LSBL), the natural log of the ratio of heterozygosities (lnRH), Tajima's D and the whole-genome long-range haplotype (WGLRH) test. We focused our attention on the hypoxia-inducible factor (HIF) pathway, which is a transcriptional regulator that controls cellular oxygen homeostasis and plays a key role in energy metabolism. It is upregulated in many cancers and may be involved in the accumulation of adipose tissue. This pathway, comprising at least 75 genes scattered throughout the genome, is thought to regulate many of the physiological responses to cellular hypoxia. Based on their functional roles, we have an a priori reason to expect that genes in this pathway might be involved in adaptation to high altitude [31]. Genomic searches for signatures of natural selection, however, are also a means of aiding the identification of gene function or expanding the current understanding of a gene's function. Several studies of natural selection have helped to identify functional roles for the loci under selection [32–34]. Therefore, we also considered non-HIF genes in this analysis.

Materials and methods

Populations

SNP data were generated using 105 individuals of Native American descent (previously reported on by Mao et al. [35]). This sample could be further divided into two groups, a high-altitude group and a low-altitude group. The high-altitude group was composed of 50 individuals of Andean descent: 25 Quechua collected in Cerro de Pasco, Peru (4,300 m), and 25 individuals of largely Aymara ancestry collected in La Paz, Bolivia (3,600 m) [36, 37]. The low-altitude group consisted of Native American lowlanders from Mexico, including 11 Nahua, nine Mixtec and ten Tlapanec individuals collected in Guerrero (1,600 m) and 25 Maya individuals collected in the Yucatan Peninsula (10 m). Sampling locations for each of the Native American samples are shown in Figure 1. Native American population samples, highland and lowland alike, were selected based on the proportion of Native American to European individual genetic ancestry, with persons showing high levels of Native American and low levels of European ancestry chosen for this research. Genetic ancestry was estimated using a panel of ancestry informative markers (AIMs) that distinguish between West African, Northern European and Native American populations [38, 39]. Additionally, we included 90 East Asian lowlanders in this research: the 45 Haplotype Mapping Project (HapMap) Han Chinese from Beijing and the 45 HapMap Japanese from Tokyo. In this analysis, we split the high-altitude and low-altitude populations into three population groupings: 1) Andeans (Quechua and Aymara); 2) Mesoamericans (Maya, Nahua, Tlapanec and Mixtec); and 3) East Asians (Han Chinese and Japanese).

In addition to the Native American and East Asian populations, 120 West African and Northern European individuals from the HapMap project were also genotyped using the Affymetrix 500 K array set. These included 60 Yoruba from Ibadan, Nigeria and 60 individuals from the USA who were of northern and western European ancestry, collected by the Centre d'Etude du Polymorphisme Humain (CEPH). The availability of the HapMap data made it possible to compare the results of our analysis with those of previous studies of natural selection in the same samples. By doing so, we confirm that this genotyping platform is appropriate for the analysis of Andean signatures of natural selection.

Genome scan data

The Affymetrix, Inc. GeneChip Human Mapping 500 K array set was used to generate high-density multi-locus SNP genotype scores. The SNPs on this mapping array are distributed evenly across the genome, with an average inter-SNP distance of 5.8 kilobases (kb). The set is composed of two arrays named for the restriction enzymes used in the complexity reduction step of the reaction, the Nsp array and the Sty array. Each array assays approximately 250,000 SNPs. In total, this analysis was conducted using 490,032 autosomal SNPs.

Tests for positive selection

We used four statistics to identify candidate loci showing positive selection in the Andean population: LSBL, lnRH, Tajima's D and the WGLRH test [5, 40–42]. LSBL, lnRH and the WGLRH test were implemented as previously described [5, 41, 43]. LSBL was calculated for each SNP in the dataset, whereas an overlapping sliding windows approach was taken to calculate lnRH and Tajima's D. We used a window size of 200,000 base pairs (bp), moving in 50,000 bp increments along each chromosome for lnRH and 100,000 bp with 10,000 bp increments for Tajima's D. Window size was determined by the genome coverage and the marker density of the Affymetrix 500 K array set. Statistical significance for each of LSBL, lnRH and Tajima's D was determined by using its respective genome-wide empirical distribution generated by these data. Those loci whose values fell in the top (LSBL) or bottom (lnRH and Tajima's D) 5 per cent of the empirical distribution, corresponding to an empirical p-value (pE) of 0.05 or less, were considered statistically significant (α = 0.05). For the WGLRH test, significance was assessed by comparing the relative extended haplotype homozygosity (REHH) of a specific core haplotype with the gamma distribution and applying the false discovery rate (FDR) approach to correct for multiple tests [44].
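The windowing and empirical cut-off described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `sliding_windows` and `empirical_tail` are hypothetical, windows that would run past the chromosome end are simply dropped, and the number of significant tests is taken as α times the number of tests, rounded to the nearest integer. None of these details is specified in the text.

```python
from typing import Iterator, List, Tuple


def sliding_windows(chrom_length: int, size: int, step: int) -> Iterator[Tuple[int, int]]:
    """Yield half-open [start, end) windows of `size` bp, advancing by `step` bp.

    Assumption: only full-length windows are kept; the handling of
    chromosome ends is not described in the paper.
    """
    start = 0
    while start + size <= chrom_length:
        yield start, start + size
        start += step


def empirical_tail(values: List[float], alpha: float, upper: bool) -> List[int]:
    """Return the indices of the values in the top (upper=True) or
    bottom (upper=False) `alpha` fraction of the empirical distribution."""
    n_sig = round(alpha * len(values))
    # Rank indices from most to least extreme in the requested direction.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=upper)
    return sorted(order[:n_sig])


# lnRH-style windows: 200,000 bp wide, moving in 50,000 bp increments.
lnrh_windows = list(sliding_windows(1_000_000, 200_000, 50_000))
```

For LSBL one would call `empirical_tail(values, 0.05, upper=True)`, and for lnRH or Tajima's D `upper=False`, mirroring the top and bottom tails named in the text.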

For Tajima's D, we compared standardised Tajima's D across windows, similar to the integrated haplotype score (iHS) statistic of Voight et al. [2]. To do so, we used the following equation:
$$\text{standardised } D = \frac{D_i - \mu(D)}{SD(D)}$$
where D_i is the Tajima's D calculated for a sliding window in a given population panel (Andean, Mesoamerican or East Asian), μ is the mean Tajima's D for all windows and SD is the standard deviation of Tajima's D for all windows. Using standardised D, we identified regions of the genome that were significantly negative in the Andeans. Because we were interested in regions of the genome that have been subject to natural selection in Andeans but not in lowland Native American populations, we also compared the two New World populations, Andeans and Mesoamericans, to identify genomic regions that may have undergone selection in the high-altitude populations but not in the low-altitude populations. To do so, we developed a statistic to summarise the difference in Tajima's D between two populations using the following equation:
$$\text{Tajima's } D \text{ difference} = \frac{(D_{iA} - D_{iB}) - \mu(D_A - D_B)}{SD(D_A - D_B)}$$

Here, D_iA is Tajima's D computed for a given sliding window in population A, D_iB is Tajima's D computed for the same sliding window in population B, μ(D_A - D_B) is the mean of the difference in Tajima's D across all windows and SD(D_A - D_B) is the standard deviation of that difference across all windows. Again, Tajima's D was calculated for each population using an overlapping sliding window size of 100,000 bp with 10,000 bp increments. We did not include East Asians in this comparison, in order to avoid overlooking genomic regions that may have undergone changes in East Asians after the Old World and New World populations split.
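The two standardisations above amount to a z-score across windows, applied either to Tajima's D itself or to the per-window difference between two populations. A minimal sketch follows, assuming that the per-window Tajima's D values have already been computed and that the two populations are evaluated on the same windows in the same order. The function names are hypothetical, and the use of the population standard deviation (`pstdev`) rather than the sample standard deviation is an assumption, since the text does not say which was used.

```python
from statistics import mean, pstdev
from typing import List


def standardise(values: List[float]) -> List[float]:
    """Return (value - mean) / SD for every window.

    Assumption: SD is the population standard deviation across all windows.
    """
    mu = mean(values)
    sd = pstdev(values)
    return [(v - mu) / sd for v in values]


def tajima_d_difference(d_a: List[float], d_b: List[float]) -> List[float]:
    """Standardised difference in Tajima's D between populations A and B.

    d_a[i] and d_b[i] must refer to the same sliding window i.
    """
    if len(d_a) != len(d_b):
        raise ValueError("both populations need the same set of windows")
    diffs = [a - b for a, b in zip(d_a, d_b)]
    return standardise(diffs)


# Toy values only, not data from this study.
andean = [-2.0, 0.1, 0.4, -0.3]
mesoamerican = [0.2, 0.0, 0.5, -0.1]
scores = tajima_d_difference(andean, mesoamerican)
```

In this toy input the first window, where Tajima's D is strongly negative in population A only, receives the most negative score, which is the kind of window the difference statistic is meant to highlight.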

Haplotypic phase was determined for each chromosome prior to calculating two of the statistics, Tajima's D and the WGLRH test. The program FastPHASE resolved the haplotypic phase from the unphased genotype data for Tajima's D [45]. Northern Europeans, West Africans, Native Americans and East Asians were phased individually. The high-altitude and low-altitude Native Americans were phased together as one population. Missing genotypes were inferred for all populations. For the WGLRH test, haplotypic phase was computed using the expectation maximisation (EM) algorithm as implemented in the program Haploview [46]. FastPHASE was not used for this test because we strictly followed the WGLRH algorithm as it was designed, which used Haploview for phasing [41].

Results

Working with 490,032 SNPs from the Affymetrix, Inc. Gene Chip Mapping 500 K array in 195 individuals from high and low altitudes, we identified gene regions (α = 0.05 and α = 0.01) that differed significantly between Andeans and two low-altitude populations, Mesoamericans and East Asians, using four statistics that detect departures from neutrality. The statistics included LSBL, lnRH, Tajima's D and the WGLRH test. The significant SNP comparisons or SNP windows for each of the four test statistics applied to these data are shown in Table 1. The empirical distribution for LSBL is shown in Figure 2.
Table 1

Number of SNP or SNP window comparisons for each test statistic and their empirical p-values at two levels of α

| Test | Number of tests | pE = 0.05 | pE = 0.01 |
|---|---|---|---|
| LSBL | 490,032 | 24,502 | 4,900 |
| lnRH | 53,251 | 2,663 | 533 |
| Tajima's D | 263,882 | 13,194 | 2,639 |
| WGLRH | 43,153 | 55 | NA |

LSBL = locus-specific branch length test; lnRH = natural logarithm of the ratio of heterozygosities; WGLRH = whole-genome long-range haplotype.

Figure 2

Empirical distribution of LSBL across three populations: Andeans, Mesoamericans and East Asians. The inset shows LSBLs above 0.3 for Andeans and Mesoamericans. These two populations have fewer LSBLs above 0.15 compared with the East Asians, which is expected, given their more recent common ancestry. LSBL = Locus-specific branch length.
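LSBL itself is not defined in this paper, which cites earlier work for its implementation. As a reading aid, the sketch below uses a commonly used three-population formulation, assumed here rather than taken from the text: for one SNP, the branch length leading to each population is derived from the three pairwise genetic distances (for example, pairwise FST). The function name `lsbl` and its arguments are hypothetical.

```python
from typing import Tuple


def lsbl(d_ab: float, d_ac: float, d_bc: float) -> Tuple[float, float, float]:
    """Locus-specific branch lengths for populations A, B and C at one SNP.

    Assumption: each argument is a pairwise genetic distance at that SNP,
    and each branch is (sum of the two distances involving the population
    minus the distance between the other two) / 2.
    """
    branch_a = (d_ab + d_ac - d_bc) / 2
    branch_b = (d_ab + d_bc - d_ac) / 2
    branch_c = (d_ac + d_bc - d_ab) / 2
    return branch_a, branch_b, branch_c


# Toy distances only, not values from this study.
andean_branch, mesoamerican_branch, east_asian_branch = lsbl(0.3, 0.4, 0.5)
```

Under this formulation the two branches leading to any pair of populations sum to the pairwise distance between them, so a long branch for one population indicates allele frequency change specific to that population.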

We identified 14, seven and three HIF pathway genes that fell into the 5 per cent tail of the empirical distribution for LSBL, lnRH and Tajima's D difference, respectively. No HIF pathway candidate genes were located in a statistically significant extended haplotype region for the WGLRH test. The SNP genotyping platform used in this analysis, however, did not assay SNPs within the gene boundaries of 13 HIF pathway candidate genes. For this reason, it was important to look 50 kb upstream and downstream of the start and end coordinates of each gene for significant SNPs or SNP windows so as to not exclude a potential candidate gene from analysis. When doing so, 29, ten and eight HIF pathway genes show at least one significant SNP or window for LSBL, lnRH and Tajima's D difference, respectively. Table 2 enumerates the significant HIF pathway genes for each test statistic using the 50 kb upstream and downstream definition.
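The 50 kb flanking rule can be expressed as a simple interval check. The sketch below is illustrative only: the function name, the coordinate convention (inclusive base-pair positions on a single chromosome) and the example coordinates are assumptions, not details from the study.

```python
from typing import Iterable

FLANK_BP = 50_000  # 50 kb upstream and downstream, as described in the text


def gene_has_significant_hit(
    gene_start: int,
    gene_end: int,
    significant_positions: Iterable[int],
    flank: int = FLANK_BP,
) -> bool:
    """Return True if any significant SNP (or window midpoint) lies within
    the gene boundaries extended by `flank` bp on either side."""
    lo = max(0, gene_start - flank)
    hi = gene_end + flank
    return any(lo <= pos <= hi for pos in significant_positions)


# Hypothetical gene at 1,000,000-1,020,000 bp with one significant SNP upstream.
hit_with_flank = gene_has_significant_hit(1_000_000, 1_020_000, [960_000])
hit_without_flank = gene_has_significant_hit(1_000_000, 1_020_000, [960_000], flank=0)
```

With the flank applied the upstream SNP counts towards the gene, whereas with `flank=0` it does not, which is how the gene counts can rise when the wider definition is used.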
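Table 2

Summary of the significant HIF pathway candidate genes for the four test statistics used to detect signatures of positive selection

|  | Test for natural selection |  |  |  |
| Gene name | LSBL | lnRH | Tajima's D difference | WGLRH |
|---|---|---|---|---|
| ADRA1B | X |  |  |  |
| ARNT2 | X |  |  |  |
| ATP1A1 |  | X |  |  |
| ATP1A2 | X |  |  |  |
| CDH1 | X | X |  |  |
| COPS5 | X |  |  |  |
| CXCR4 | X |  |  |  |
| EDN1 | X |  |  |  |
| EDNRA | X | X | X |  |
| EGLN1 | X |  |  |  |
| EGLN2 | X |  |  |  |
| ELF2 | X | X |  |  |
| FRAP1 |  | X |  |  |
| IL1A/IL1B |  | X |  |  |
| IL6 | X |  |  |  |
| IGFBP1 |  |  | X |  |
| IGFBP2 | X |  |  |  |
| MDM2 | X |  |  |  |
| MMP2 | X |  |  |  |
| NOS1 | X |  |  |  |
| NOS2A | X | X | X |  |
| NOTCH1 | X |  |  |  |
| NRP1 |  |  | X |  |
| NRP2 | X |  |  |  |
| POLRA |  | X |  |  |
| PIK3CA | X |  | X |  |
| PIK3CG |  |  | X |  |
| PRKAA1 | X | X | X |  |
| PRKAA2 | X |  |  |  |
| SNAI3 | X |  |  |  |
| SPRY2 | X |  |  |  |
| TF | X |  |  |  |
| TGFA | X |  |  |  |
| TNC | X | X |  |  |
| TNF | X |  |  |  |
| VEGF | X |  | X |  |

LSBL = locus-specific branch length; lnRH = natural log of the ratio of heterozygosities; WGLRH = whole-genome long-range haplotype.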

The number and proportion of significant SNPs or sliding windows varied for each gene. For example, all nine lnRH windows and six of 28 LSBLs for tenascin C (TNC) were statistically significant. By contrast, the gene nitric oxide synthase 2A (NOS2A) displayed only one significant lnRH window out of seven, and the gene vascular endothelial growth factor (VEGF) contained only one significant LSBL among 15 assayed SNPs. Moreover, the only gene with all of the windows in the gene region significant for lnRH was TNC. None of the HIF pathway candidate genes were statistically significant for all of the test statistics. However, we did identify significant HIF pathway genes using two or three out of the four statistics. VEGF was significant for Tajima's D difference and LSBL. TNC and cadherin 1 (CDH1) were statistically significant for LSBL and lnRH. Lastly, three genes, those encoding endothelin receptor type A (EDNRA), protein kinase, AMP-activated, alpha 1 catalytic subunit (PRKAA1) and NOS2A, were statistically significant for LSBL, lnRH and the Tajima's D difference.

To evaluate whether HIF pathway genes are over-represented in the 5 per cent tail of the empirical distribution for Andeans, we used Fisher's exact test. We tested the hypothesis that the proportion of significant LSBL values (α = 0.05) is higher among HIF pathway candidate genes than among non-HIF genes using a 2 × 2 contingency table, where the four categories were: significant LSBLs for HIF genes, non-significant LSBLs for HIF genes, significant LSBLs for non-HIF genes and non-significant LSBLs for non-HIF genes. The results indicated that the HIF pathway candidate genes are not over-represented in the 5 per cent tail of the distribution (OR = 0.644 (95 per cent confidence interval 0.538-0.778); p < 0.001). A second method of testing whether the HIF pathway candidate genes are over-represented in the 5 per cent tail is to compare the LSBL distribution of HIF pathway candidate genes to the LSBL distribution of all non-HIF genes using a one-sided Kolmogorov-Smirnov (K-S) test. Again, the results of this test suggested that the 5 per cent tail of the empirical LSBL distribution is not enriched with HIF genes (Dn,m = 0.0205; p = 0.3162). It is important to note that these results do not preclude particular HIF genes from involvement in genetic adaptation to high altitude. Rather, they denote that the HIF pathway as a whole has not evolved in response to hypoxia among Andeans.
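The 2 × 2 comparison described above can be illustrated with a small standard-library implementation of Fisher's exact test. This is a sketch under assumptions: the function name is hypothetical, the p-value is computed for the one-sided alternative that significant values are over-represented among HIF genes (the text does not state which alternative was used), and the counts in the example are made up because the underlying table is not reported.

```python
from math import comb
from typing import Tuple


def fisher_exact_greater(a: int, b: int, c: int, d: int) -> Tuple[float, float]:
    """One-sided Fisher's exact test for a 2 x 2 table.

    a = significant LSBLs in HIF genes,     b = non-significant LSBLs in HIF genes
    c = significant LSBLs in non-HIF genes, d = non-significant LSBLs in non-HIF genes
    Returns (sample odds ratio, P(X >= a) under the hypergeometric null).
    """
    row1 = a + b          # all LSBLs in HIF genes
    col1 = a + c          # all significant LSBLs
    n = a + b + c + d     # all LSBLs
    upper = min(row1, col1)
    tail = sum(comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1))
    p_value = tail / comb(n, row1)
    odds_ratio = (a * d) / (b * c) if b * c > 0 else float("inf")
    return odds_ratio, p_value


# Made-up counts for illustration only; they are not taken from this study.
odds_ratio, p_value = fisher_exact_greater(3, 1, 1, 3)
```

An odds ratio below 1, as reported in the text, means that significant LSBL values are proportionally less frequent among HIF genes than among non-HIF genes, so a one-sided test for over-representation would return a large p-value for such a table.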

In addition to studying the HIF pathway candidate genes specifically, we also scanned across each chromosome to discern genomic regions showing evidence of reduced variation in Andeans, a hallmark of directional selection. Given the large number of significant tests for LSBL, lnRH and Tajima's D using a 5 per cent significance cut-off we restricted our attention to regions with clusters of significant values for one or more test statistics as selection-nominated candidate gene regions. To identify such regions, we calculated the significance of one megabase non-overlapping windows moving across each chromosome for LSBL, lnRH and Tajima's D using the hypergeometric distribution. The p-value for each window was corrected for multiple tests using the Bonferroni correction [44]. In total, p-values for 2,718 windows were calculated for each of the LSBL, lnRH and Tajima's D statistics. Significant p-values were defined such that one false-positive would be expected for all observed windows. Using this definition, windows for which p ≤ 0.004 were considered to be statistically significant. The results of this analysis revealed 54 regions displaying extended regions of continuously significant statistics for two or more of the three statistics. Three of these regions located on chromosomes 11, 12 and 15 were significant for all three statistics. Table 3 enumerates the chromosomal regions that were statistically significant for LSBL, lnRH and Tajima's D.
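One way to read the window-based test is as an upper-tail hypergeometric probability: given the genome-wide number of tests and the number that are significant, how likely is it that a one megabase window holds at least as many significant tests as observed? The sketch below follows that reading, which is an assumption about the exact formulation, and applies a Bonferroni correction for the 2,718 windows mentioned in the text. The function names and the example counts are hypothetical.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) when n tests are drawn without replacement from N tests,
    of which K are significant genome-wide."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


def bonferroni(p: float, n_tests: int) -> float:
    """Bonferroni-corrected p-value, capped at 1."""
    return min(1.0, p * n_tests)


N_WINDOWS = 2_718  # number of one megabase windows reported in the text

# Hypothetical window: 40 tests, 12 of them in the genome-wide 5 per cent tail.
raw_p = hypergeom_upper_tail(k=12, n=40, K=2_500, N=50_000)
corrected_p = bonferroni(raw_p, N_WINDOWS)
```

A window is then flagged when its corrected p-value falls below the chosen threshold; repeating this for each statistic and intersecting the flagged windows gives regions supported by two or three statistics.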
Table 3

One megabase windows displaying extended regions of statistical significance for LSBL, lnRH, and Tajima's D difference

| Chromosome | Window start | Window end | LSBL p-value* | lnRH p-value* | Tajima's D p-value* | Known genes |
|---|---|---|---|---|---|---|
| 11 | 82000000 | 83000000 | 0.000000 | 0.000001 | 0.000000 | 19 |
| 12 | 109000000 | 110000000 | 0.000000 | 0.000000 | 0.000002 | 41 |
| 15 | 41000000 | 42000000 | 0.000000 | 0.000534 | 0.000000 | 70 |

*p-values have been corrected for multiple tests using the Bonferroni correction.

The WGLRH test identified 43,153 extended haplotype/core regions throughout the genome in the Andean panel. Only 57 of these regions were statistically significant after identifying 'flipped' SNPs and applying an FDR correction for multiple testing. Two of these extended haplotypes were also identified as significant in the Mesoamericans. After removing these two regions from the Andean analysis, 55 significant extended haplotype regions remained. Those significant extended haplotypes containing known genes in their core regions are listed in Table 4. No common core haplotypes were shared between the East Asians and the Andeans. Of the 55 significant 500 kb extended haplotype regions, seven contained SNPs that were statistically significant for LSBL. None of the core regions identified using the WGLRH test overlapped with the statistically significant gene windows for lnRH or Tajima's D.
Table 4

Summary of the significant core regions containing known genes in the Andean population for the WGLRH test

| Chromosome | SNPs in core region | Haplotype frequency | p-value adjusted | Genes in core region* |
|---|---|---|---|---|
| 1 | 3 | 0.060 | 0.032 | KCNK2 |
| 1 | 2 | 0.080 | 0.038 | KCNK2 |
| 1 | 5 | 0.070 | 0.050 | KIAA1026 |
| 1 | 7 | 0.070 | 0.032 | PEX14 |
| 2 | 2 | 0.520 | 0.016 | ERBB4 |
| 2 | 2 | 0.540 | 0.006 | INPP5D |
| 2 | 2 | 0.080 | 0.038 | TMEM169 |
| 3 | 2 | 0.330 | 0.011 | CPNE4 |
| 4 | 2 | 0.051 | 0.007 | LDB2 |
| 4 | 3 | 0.090 | 0.000 | RBM47 |
| 5 | 7 | 0.070 | 0.034 | FCHO2 |
| 6 | 3 | 0.120 | 0.018 | F13A1 |
| 6 | 4 | 0.080 | 0.001 | F13A1 |
| 6 | 3 | 0.120 | 0.036 | KCNK5 |
| 8 | 2 | 0.130 | 0.036 | ADRA1A |
| 9 | 2 | 0.176 | 0.047 | ADAMTSL1 |
| 9 | 2 | 0.610 | 0.044 | PTPRD |
| 9 | 3 | 0.070 | 0.011 | VAV2 |
| 10 | 2 | 0.170 | 0.015 | AK056561 |
| 10 | 2 | 0.500 | 0.021 | FAM107B |
| 10 | 2 | 0.101 | 0.040 | GFRA1 |
| 10 | 3 | 0.540 | 0.023 | OLAH |
| 11 | 2 | 0.090 | 0.047 | LDLRAD3 |
| 11 | 3 | 0.060 | 0.002 | PSMD13 |
| 12 | 2 | 0.330 | 0.032 | CHST11 |
| 12 | 5 | 0.080 | 0.035 | SLC4A8 |
| 14 | 7 | 0.080 | 0.002 | KCNK10 |
| 14 | 3 | 0.440 | 0.043 | PPP2R5C |
| 17 | 2 | 0.390 | 0.036 | C17orf54 |
| 17 | 6 | 0.080 | 0.003 | MPRIP |
| 18 | 2 | 0.450 | 0.038 | PTPRM |
| 19 | 2 | 0.295 | 0.002 | GNA15 |
| 20 | 2 | 0.100 | 0.000 | RP5-1022P6.2 |
| 21 | 2 | 0.490 | 0.001 | ERG |

Core regions that do not contain a known gene are not listed.

*Genes are listed for the core region only and not the 500 kb extended haplotype regions identified by each core.

To validate that this dataset was appropriate for identifying signatures of positive selection in Andean populations, we performed an identical analysis using all four statistical tests for positive selection on the HapMap project populations [47]. The samples included in this analysis have been used in previous genome scans conducted on larger datasets [2, 6]. The populations included 60 Yoruba from Ibadan, Nigeria; 60 individuals of northern and western European ancestry from the USA collected by the CEPH; and 90 East Asians from China and Japan. The East Asians used in this analysis corresponded to the East Asians used for the Andean analysis. This analysis identified significant gene regions consistent with previous studies. For example, SNPs found in the gene solute carrier family 24, member 5 (SLC24A5), a gene associated with skin pigmentation and shown to be under positive selection in European populations but not in East Asian and West African populations, possessed statistically significant LSBL and lnRH values in the European population [32, 48]; however, this was not observed for Tajima's D difference or the WGLRH test. Another gene, ectodysplasin A receptor (EDAR), known to be involved in hair and tooth development, consistently shows evidence of positive directional selection among East Asian populations [49]. For the East Asians in this analysis, SNPs falling within EDAR showed statistically significant LSBL values and a significant Tajima's D window, but non-significant lnRH windows. Additionally, it was not identified as a significant core haplotype for the WGLRH test. The absence of significant lnRH windows is not surprising in this population of Chinese and Japanese individuals, however, given that the haplotype under selection has not swept to fixation in the Japanese population.
By extending our analysis to this group of well-studied Old World populations, we verified that the signatures of selection found in these three populations overlapped with those signatures identified using other SNP datasets, supporting our contention that the SNP dataset and analytical methods used here are appropriate for identifying signatures of positive selection in high-altitude Andeans.

Discussion

Using a dense genome-wide panel of SNPs (Affymetrix 500 K chip), we compared patterns of genetic diversity between high-altitude Andeans, low-altitude Mesoamericans and East Asians to identify selection-nominated candidate genes or gene regions in Andeans. Four tests based on different characteristics of the data were used in our analysis: LSBL, lnRH, Tajima's D and the WGLRH test. We selected these complementary methods because each statistic possesses a varying degree of efficacy for identifying signatures of natural selection depending on the allelic background of the populations used in the analysis, the strength of selection and the length of time elapsed since the start of the selective event. Given the aspects of genetic variation summarised by these statistics, it is not expected that the results of tests will overlap. Rather, these methods should be considered as complementary tests that can be useful for the identification of regions under positive selection.

Based on the results of this study, the HIF pathway genes exhibiting the most compelling evidence of positive directional selection across the test statistics are EDNRA, PRKAA1 and NOS2A. EDNRA, expressed in vascular smooth muscle, encodes a receptor that mediates the vasoconstrictive actions of endothelin-1 [50]. PRKAA1 encodes the alpha 1 catalytic subunit of a heterotrimeric enzyme belonging to the ancient 5'-AMP-activated protein kinase gene family involved in regulation of cellular ATP (reviewed by Kemp et al. [51]). This enzyme functions as a cellular energy sensor under ATP-deprived conditions, such as those that are experienced in hypoxia. Thus, it provides metabolic adaptations to the oxygen-starved cellular environment. NOS2A, in combination with additional nitric oxide synthase enzymes, synthesises nitric oxide (NO) from arginine and oxygen. NO increases blood flow in the arteries and helps to regulate blood pressure. Erzurum et al. [52] have recently shown that NO production is increased in Tibetans resident at 4,200 m compared with sea-level controls. Recent work has also demonstrated higher uterine artery blood flows during pregnancy in Andean than European high-altitude residents, possibly due to greater uterine artery vasodilation [37]. These studies suggest that vascular factors, not just haematological or pulmonary systems, contribute to altitude adaptation in Tibetan and Andean populations. Here, we showed preliminary evidence of positive selection in NOS2A in Andean populations.

It would be worthwhile to extend the work conducted in this study to Tibetan populations who show physiological adaptations with respect to NO production, to determine if a similar genetic signal is present in this Himalayan population.

The three chromosomal regions showing extended regions of statistically significant test results are excellent candidates for further study. They include regions on chromosomes 11, 12 and 15. In addition to these three chromosomal regions, the 55 candidate regions identified by the WGLRH test are also strong candidates for further study; however, the WGLRH test only considers derived alleles whose frequencies have risen to >0.85 in the populations under consideration. One problem with only considering those haplotypes with high frequencies of the derived allele is that natural selection could also act to select the ancestral allele, and these signatures cannot be detected with the WGLRH test. Given the low altitudes inhabited by human ancestors, however, it is more likely that selection acted on a novel mutation in one or more of the genes involved in adaptation to altitude, as opposed to an ancestral variant already present in the population. This is especially relevant with regard to the HIF pathway, as this is an evolutionarily ancient system important in embryogenesis, development and homeostasis.

One potential problem with this and other genome scans is that they use pre-ascertained SNPs to look at the underlying pattern of nucleotide diversity. For example, Tajima's D was first designed for sequence-based tests of selection wherein the nucleotide diversity is known for an entire gene or gene region. When using this statistic in genome scans for natural selection, one must be aware of the ascertainment bias inherent in the analysis. Given the selection criteria of the Affymetrix 500 K panel, uncommon, low-frequency alleles will be under-represented and common alleles will be over-represented. This bias is more likely to cause candidate natural selection genes to be missed than to increase our false-positive rate. Moreover, we used Tajima's D in conjunction with other tests for positive selection so that genes that might have been overlooked using this statistic could have been identified with one or more of the other three statistics. To illustrate this point, consider the genes TNC and CDH1, which, in our analysis, showed signatures of selection by LSBL and lnRH, but not by either of the Tajima's D statistics. This pattern is the same as that observed for the gene SLC24A5, which is known to be the target of natural or sexual selection in European populations [32, 48]. Therefore, it is possible that with unbiased complete sequence data of TNC or CDH1 in Andean and Mesoamerican populations, Tajima's D will reveal a pattern of nucleotide diversity consistent with positive directional selection.

In future work, it would be interesting to compare the overall genetic signals of natural selection found in Andean populations with those found in Tibetan populations, as these two populations are distinct in geographical locale as well as in duration of time living at altitude. Archaeological data indicate that Himalayan populations first inhabited the Tibetan Plateau as early as 25,000 years ago, whereas populations first moved onto the Andean Altiplano 11,000 years ago [53, 54]. By understanding how similar environmental pressures with varying evolutionary time frames can result in either the same or different genetic adaptations, we will be better situated to understand the molecular basis for convergent human adaptations. After identification, all putative natural selection regions identified in Andeans and Tibetans must be confirmed by further research, such as genotype/phenotype association studies and functional assays.

Declarations

Acknowledgements

We would like to thank the people of high altitude who participated in this research and two anonymous reviewers for helpful comments on the manuscript. A.W.B. was supported by a National Science Foundation Graduate Research Fellowship. This work was supported by the National Science Foundation (grant number 0622337 to A.W.B.); the Wenner-Gren Foundation (grant number 7538 to A.W.B.); and the National Institutes of Health (grant numbers 079647, 001188 to L.G.M.). Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.

Authors’ Affiliations

(1) Department of Pediatrics, The University of Washington
(2) Department of Anthropology, Pennsylvania State University
(3) Affymetrix, Inc.
(4) Departments of Exercise Science and Anthropology, Syracuse University
(5) Department of Anthropology and Altitude Research Center, University of Colorado
(6) Department of Anthropology, University of Toronto
(7) Department of Genome Sciences, University of Washington
(8) Graduate School of Arts and Sciences, Wake Forest University

References

  1. Altshuler D, The International HapMap Consortium (TIHC): A haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.
  2. Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4: e72. 10.1371/journal.pbio.0040072.
  3. Hinds DA, Stuve LL, Nilsen GB, Halperin E, et al: Whole-genome patterns of common DNA variation in three human populations. Science. 2005, 307: 1072-1079. 10.1126/science.1105436.
  4. Akey JM, Zhang G, Zhang K, Jin L, et al: Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002, 12: 1805-1814. 10.1101/gr.631202.
  5. Shriver MD, Kennedy GC, Parra EJ, Lawson HA, et al: The genomic distribution of population substructure in four populations using 8,525 autosomal SNPs. Hum Genomics. 2004, 1: 274-286.
  6. Sabeti PC, Varilly P, Fry B, Lohmueller J, et al: Genome-wide detection and characterization of positive selection in human populations. Nature. 2007, 449: 913-918. 10.1038/nature06250.
  7. Hancock AM, Witonsky DB, Gordon AS, Eshel G, et al: Adaptations to climate in candidate genes for common metabolic disorders. PLoS Genet. 2008, 4: e32. 10.1371/journal.pgen.0040032.
  8. McEvoy B, Beleza S, Shriver MD: The genetic architecture of normal variation in human pigmentation: An evolutionary perspective and model. Hum Mol Genet. 2006, 15 (Spec No 2): R176-181.
  9. Baker PL: Man in the Andes: A Multidisciplinary Study of High-Altitude Quechua. 1976, Dowden, Hutchinson, and Ross, Inc., Stroudsburg, PA, USA.
  10. Schull WJ, Rothhammer F: The Aymara: Strategies in Human Adaptation to a Rigorous Environment. 1990, Kluwer Academic Publishers, Boston, MA, USA.
  11. Moore LG, Shriver M, Bemis L, Hickler B, et al: Maternal adaptation to high-altitude pregnancy, an experiment of nature: A review. Placenta. 2004, 25 (Suppl A): S60-S71.
  12. Beall C, Brittenham G, Macuaga F, Barragan M: Variation in hemoglobin concentration among samples of high altitude natives in the Andes and the Himalayas. Am J Hum Biol. 1990, 2: 639-651. 10.1002/ajhb.1310020607.
  13. Beall CM, Decker MJ, Brittenham GM, Kushner I, et al: An Ethiopian pattern of human adaptation to high-altitude hypoxia. Proc Natl Acad Sci USA. 2002, 99: 17215-17218. 10.1073/pnas.252649199.
  14. Beall CM, Goldstein MC: Hemoglobin concentration of pastoral nomads permanently resident at 4,850-5,450 meters in Tibet. Am J Phys Anthropol. 1987, 73: 433-438. 10.1002/ajpa.1330730404.
  15. Adams WH, Strang LJ: Hemoglobin levels in persons of Tibetan ancestry living at high altitude. Proc Soc Exp Biol Med. 1975, 149: 1036-1039.
  16. Beall CM, Brittenham GM, Strohl KP, Blangero J, et al: Hemoglobin concentration of high-altitude Tibetans and Bolivian Aymara. Am J Phys Anthropol. 1998, 106: 385-400. 10.1002/(SICI)1096-8644(199807)106:3<385::AID-AJPA10>3.0.CO;2-X.
  17. Beall CM, Reichsman AB: Hemoglobin levels in a Himalayan high altitude population. Am J Phys Anthropol. 1984, 63: 301-306. 10.1002/ajpa.1330630306.
  18. Zhuang J, Droma T, Sun S, Janes C, et al: Hypoxic ventilatory responsiveness in Tibetan compared with Han residents of 3,658 m. J Appl Physiol. 1993, 74: 303-311.
  19. Groves BM, Droma T, Sutton JR, McCullough RG, et al: Minimal hypoxic pulmonary hypertension in normal Tibetans at 3,658 m. J Appl Physiol. 1993, 74: 312-318.
  20. High Altitude: An Exploration of Human Adaptation. Edited by: Hornbein TF, Schoene RB. 2001, Marcel Dekker, Inc., New York, NY, USA.
  21. Rupert JL, Kidd KK, Norman LE, Monsalve MV, et al: Genetic polymorphisms in the renin-angiotensin system in high-altitude and low-altitude Native American populations. Ann Hum Genet. 2003, 67: 17-25. 10.1046/j.1469-1809.2003.00004.x.
  22. Moore LG, Zamudio S, Zhuang J, Droma T, et al: Analysis of the myoglobin gene in Tibetans living at high altitude. High Alt Med Biol. 2002, 3: 39-47. 10.1089/152702902753639531.
  23. Hochachka PW, Rupert JL: Fine tuning the HIF-1 "global" O2 sensor for hypobaric hypoxia in Andean high-altitude natives. Bioessays. 2003, 25: 515-519. 10.1002/bies.10261.
  24. Suzuki K, Kizaki T, Hitomi Y, Nukita M, et al: Genetic variation in hypoxia-inducible factor 1alpha and its possible association with high altitude adaptation in Sherpas. Med Hypotheses. 2003, 61: 385-389. 10.1016/S0306-9877(03)00178-6.
  25. Rupert JL, Monsalve MV, Devine DV, Hochachka PW: Beta2-adrenergic receptor allele frequencies in the Quechua, a high altitude native population. Ann Hum Genet. 2000, 64: 135-143. 10.1046/j.1469-1809.2000.6420135.x.
  26. Rupert JL, Devine DV, Monsalve MV, Hochachka PW: Angiotensin-converting enzyme (ACE) alleles in the Quechua, a high altitude South American native population. Ann Hum Biol. 1999, 26: 375-380. 10.1080/030144699282688.
  27. Bigham AW, Kiyamu M, Leon-Velarde F, Parra EJ, et al: Angiotensin-converting enzyme genotype and arterial oxygen saturation at high altitude in Peruvian Quechua. High Alt Med Biol. 2008, 9: 167-178. 10.1089/ham.2007.1066.
  28. Malacrida S, Katsuyama Y, Droma Y, Basnyat B, et al: Association between human polymorphic DNA markers and hypoxia adaptation in Sherpa detected by a preliminary genome scan. Ann Hum Genet. 2007, 71: 630-638. 10.1111/j.1469-1809.2007.00358.x.
  29. Kramer AA: Heritability estimates of thoracic skeletal dimensions for high-altitude Peruvian populations. Populations Studies on Human Adaptation and Evolution in the Peruvian Andes. Edited by: Melton TW, Eckhardt RB. 1992, The Pennsylvania State University Press, University Park, PA, USA, 25-49.
  30. Beall CM, Blangero J, Williams-Blangero S, Goldstein MC: Major gene for percent of oxygen saturation of arterial hemoglobin in Tibetan highlanders. Am J Phys Anthropol. 1994, 95: 271-276. 10.1002/ajpa.1330950303.
  31. Moore LG, Shriver M, Bemis L, Vargas E: An evolutionary model for identifying genetic adaptation to high altitude. Adv Exp Med Biol. 2006, 588: 101-118. 10.1007/978-0-387-34817-9_10.
  32. Lamason RL, Mohideen MA, Mest JR, Wong AC, et al: SLC24A5, a putative cation exchanger, affects pigmentation in zebrafish and humans. Science. 2005, 310: 1782-1786. 10.1126/science.1116238.
  33. Tishkoff SA, Reed FA, Ranciaro A, Voight BF, et al: Convergent adaptation of human lactase persistence in Africa and Europe. Nat Genet. 2007, 39: 31-40. 10.1038/ng1946.
  34. Bersaglieri T, Sabeti PC, Patterson N, Vanderploeg T, et al: Genetic signatures of strong recent positive selection at the lactase gene. Am J Hum Genet. 2004, 74: 1111-1120. 10.1086/421051.
  35. Mao X, Bigham AW, Mei R, Gutierrez G, et al: A genome-wide admixture mapping panel for Hispanic/Latino populations. Am J Hum Genet. 2007, 80: 1171-1178. 10.1086/518564.
  36. Brutsaert TD, Parra EJ, Shriver MD, Gamboa A, et al: Spanish genetic admixture is associated with larger V(O2) max decrement from sea level to 4338 m in Peruvian Quechua. J Appl Physiol. 2003, 95: 519-528.
  37. Wilson MJ, Lopez M, Vargas M, Julian C, et al: Greater uterine artery blood flow during pregnancy in multigenerational (Andean), than shorter-term (European), high-altitude residents. Am J Physiol Regul Integr Comp Physiol. 2007, 293: R1313-1324. 10.1152/ajpregu.00806.2006.
  38. Bonilla C, Shriver MD, Parra EJ, Jones A, et al: Ancestral proportions and their association with skin pigmentation and bone mineral density in Puerto Rican women from New York city. Hum Genet. 2004, 115: 57-68. 10.1007/s00439-004-1125-7.
  39. Shriver MD, Parra EJ, Dios S, Bonilla C, et al: Skin pigmentation, biogeographical ancestry and admixture mapping. Hum Genet. 2003, 112: 387-399.
  40. Storz JF, Payseur BA, Nachman MW: Genome scans of DNA variability in humans reveal evidence for selective sweeps outside of Africa. Mol Biol Evol. 2004, 21: 1800-1811. 10.1093/molbev/msh192.
  41. Zhang C, Bailey DK, Awad T, Liu G, et al: A whole genome long-range haplotype (WGLRH) test for detecting imprints of positive selection in human populations. Bioinformatics. 2006, 22: 2122-2128. 10.1093/bioinformatics/btl365.
  42. Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989, 123: 585-595.
  43. Schlotterer C: A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics. 2002, 160: 753-763.
  44. Benjamini Y, Hochberg Y: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc. 1995, 57: 289-300.
  45. Scheet P, Stephens M: A fast and flexible statistical model for large-scale population genotype data: Applications to inferring missing genotypes and haplotypic phase. Am J Hum Genet. 2006, 78: 629-644. 10.1086/502802.
  46. Qin ZS, Niu T, Liu JS: Partition-ligation-expectation-maximization algorithm for haplotype inference with single-nucleotide polymorphisms. Am J Hum Genet. 2002, 71: 1242-1247. 10.1086/344207.
  47. International HapMap Consortium: The International HapMap Project. Nature. 2003, 426: 789-796. 10.1038/nature02168.
  48. Norton HL, Kittles RA, Parra E, McKeigue P, et al: Genetic evidence for the convergent evolution of light skin in Europeans and East Asians. Mol Biol Evol. 2007, 24: 710-722.
  49. Carlson CS, Thomas DJ, Eberle MA, Swanson JE, et al: Genomic regions exhibiting positive selection identified from dense genotype data. Genome Res. 2005, 15: 1553-1565. 10.1101/gr.4326505.
  50. Arai H, Hori S, Aramori I, Ohkubo H, et al: Cloning and expression of a cDNA encoding an endothelin receptor. Nature. 1990, 348: 730-732. 10.1038/348730a0.
  51. Kemp BE, Stapleton D, Campbell DJ, Chen ZP, et al: AMP-activated protein kinase, super metabolic regulator. Biochem Soc Trans. 2003, 31: 162-168.
  52. Erzurum SC, Ghosh S, Janocha AJ, Xu W, et al: Higher blood flow and circulating NO products offset high-altitude hypoxia among Tibetans. Proc Natl Acad Sci USA. 2007, 104: 17593-17598. 10.1073/pnas.0707462104.
  53. Moseley M: The Incas and their Ancestors. 2001, Thames and Hudson, London, UK.
  54. Aldenderfer M: Moving up in the world. Am Sci. 2003, 91: 542-549.

Copyright

© Henry Stewart Publications 2009