Genome-wide scans for loci under selection in humans
© Henry Stewart Publications 2005
Received: 31 January 2005
Accepted: 31 January 2005
Published: 1 June 2005
Natural selection, which can be defined as the differential contribution of genetic variants to future generations, is the driving force of Darwinian evolution. Identifying regions of the human genome that have been targets of natural selection is an important step in clarifying human evolutionary history and understanding how genetic variation results in phenotypic diversity; it may also facilitate the search for complex disease genes. Technological advances in high-throughput DNA sequencing and single nucleotide polymorphism genotyping have enabled several genome-wide scans of natural selection to be undertaken. Here, some of the observations that are beginning to emerge from these studies will be reviewed, including evidence for geographically restricted selective pressures (ie local adaptation) and a relationship between genes subject to natural selection and human disease. In addition, the paper will highlight several important problems that need to be addressed in future genome-wide studies of natural selection.
Phenotypic diversity is a ubiquitous characteristic of natural populations. Individuals vary in almost every conceivable way, including physical appearance, behaviour, disease susceptibility, ability to detoxify drugs and perception of environmental stimuli. Although environmental forces undoubtedly contribute to phenotypic variation, so too does genetic variation. Therefore, explaining the evolutionary forces that create, maintain and shape patterns of human genetic variation is of fundamental importance in understanding phenotypic variation.
An important goal in studies of human genetic variation is to identify loci that have been targets of natural selection due to their variable effects on the fitness of individuals throughout a population's history. Signatures of natural selection delimit regions of the genome that are, or have been, functionally important. Therefore, identifying such regions will facilitate the identification of genetic variation that contributes to phenotypic variation and help to functionally annotate the genome. Unfortunately, inferring the action of natural selection remains a challenge. This is likely to change in the near future, as high-throughput methods for cataloguing genetic variation on a genome-wide scale and new statistical tools for detecting selection have been, and continue to be, developed.
Much important work has been done on genome scans for natural selection in model organisms such as Drosophila [3–5]; this review, however, will focus on studies performed in human populations. Firstly, the effects of natural selection and population history on patterns of genetic variation will be summarised and some of the common statistical methods used to test for deviations from neutrality will be presented. Next, several empirical genome-wide scans for selection will be critically evaluated. Finally, the paper will highlight several important problems, both practical and conceptual, that need to be addressed in future studies.
Human genetic variation: The neutral expectation
The evolutionary sojourn of a newly arisen mutation depends upon how it affects the fitness of the individual who possesses it. The neutral theory of molecular evolution posits that the vast majority of polymorphisms in a population are selectively neutral and have no appreciable effects on fitness [6, 7]. Under neutrality, changes in allele frequency are governed by the stochastic effects of genetic drift in populations of finite size. Thus, the effective population size, Ne, and the neutral mutation rate, μ0, determine levels of polymorphism within species and the rate of divergence between species. In addition, mutations with small fitness effects behave as 'nearly neutral' if the product of Ne and |s| (where s measures the strength of selection) is less than 1 [9, 10]. For human populations, Ne is approximately 10,000 and therefore |s| must be greater than 1/Ne = 10^-4 to overcome the stochastic effects of genetic drift. Because the neutral theory makes explicit and quantitative predictions about expected patterns of genetic variation within and between species, it is an indispensable tool in studies of natural selection. Specifically, the neutral theory provides an essential foundation for evaluating the evidence either for or against selection in empirical data, as it serves as the null hypothesis when exploring alternative evolutionary models [11, 12].
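The threshold arithmetic above can be made concrete with a minimal sketch. The function names are hypothetical, and the rule used is the one stated in the text (drift dominates when Ne × |s| < 1); other treatments scale the product by 2 or 4, which is not assumed here.

```python
def selection_regime(ne: float, s: float) -> str:
    """Classify a mutation by the product Ne * |s| (rule taken from the text)."""
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    # Assumption: the boundary is exactly Ne * |s| = 1, as stated in the text.
    return "nearly neutral" if ne * abs(s) < 1 else "selection dominates"


def min_effective_s(ne: float) -> float:
    """Smallest |s| at which Ne * |s| reaches 1, i.e. 1 / Ne."""
    if ne <= 0:
        raise ValueError("effective population size must be positive")
    return 1.0 / ne


if __name__ == "__main__":
    ne_human = 10_000  # approximate human Ne quoted in the text
    print(min_effective_s(ne_human))
    print(selection_regime(ne_human, 5e-5))
    print(selection_regime(ne_human, 1e-3))
```

With Ne = 10,000 the boundary falls at |s| = 10^-4, so a coefficient of 5 × 10^-5 is classed as nearly neutral and one of 10^-3 as subject to effective selection.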
Evolutionary forces perturb patterns of genetic variation
Natural selection and population demographic history perturb patterns of genetic variation relative to what is expected under a standard neutral model (a constant-sized, randomly mating, panmictic population at mutation-drift equilibrium). Below, the way in which selection and demographic history affect patterns of genetic variation will be considered from a coalescent point of view.
Balancing selection occurs when polymorphisms are selectively maintained in a population. By contrast with positive selection, the genealogy of a locus subject to balancing selection is characterised by an increased time to the most recent common ancestor and long internal branches (Figure 2). The effect of balancing selection on gene genealogies can be understood by considering balanced alleles as distinct subpopulations, such that coalescence events can occur rapidly within a subpopulation but slowly between subpopulations. The signature of balancing selection includes elevated levels of polymorphism relative to neutral expectations and a skew of the allele frequency distribution towards an excess of intermediate-frequency alleles [29, 30, 35].
In addition to natural selection, population demographic history can also have strong influences on patterns of genetic variation, which often mimic the effect of natural selection [36, 37]. In other words, inferences of natural selection are confounded by population demographic history. For example, both positive selection and increases in population size have similar effects on gene genealogies (Figure 2); both processes therefore lead to an excess of low-frequency alleles in a population. In fact, strong positive selection can be thought of as a rapid population expansion of an advantageous allele as it sweeps through a population. Similarly, population structure and balancing selection both result in subdivided genealogies and therefore both processes are expected to result in an excess of intermediate-frequency alleles in a population (Figure 2). Population bottlenecks can lead to an excess of either low- or intermediate-frequency alleles relative to neutral expectations, depending on the age and severity of the bottleneck. Figure 2 demonstrates the effect of a severe and recent bottleneck, which forces all lineages to coalesce at the time of the size reduction and results in a genealogy that is similar to positive selection. Human populations clearly do not meet all of the assumptions of the standard neutral model; hence, rejecting the standard neutral model for a particular locus cannot be interpreted as unambiguous evidence for selection.
Detecting the signature of natural selection
Before the results from genome-wide scans for natural selection are presented, some commonly used statistical methods designed to detect departures from neutrality are briefly described, with some of their strengths and limitations highlighted. The following is not meant to be an exhaustive discussion of such tests, and descriptions of many interesting and useful methods will not be included here. For further study, the reader is encouraged to see an excellent review by Kreitman.
Table: Statistical tests of neutrality

| Test | Strengths | Limitations |
|---|---|---|
| Site-frequency spectrum tests | Most thoroughly studied | Powerful only if non-neutral evolution occurred within a critical time period |
| Coalescent likelihood methods | In principle, more powerful than summary statistic methods; possible to estimate multiple parameters simultaneously | Computationally intensive; not currently feasible for large datasets |
| FST | Does not require sequence data; can be performed with marker genotypes only | Low power to detect balancing selection |
| Long-range haplotype test (LRH) | Does not require sequence data; can be performed with marker genotypes only | Sensitivity of LRH test to population demographics and haplotype construction not well studied; not applicable to detecting balancing selection |
| Within- and between-species tests | | |
| Hudson-Kreitman-Aguadé (HKA) | May be useful for detecting balancing selection | Requires sequences from multiple individuals in two species; a significant HKA test may be difficult to interpret |
| McDonald-Kreitman (MK) | More sensitive than the raw ratio of the number of non-synonymous substitutions in a gene to the number of synonymous substitutions (dn/ds); may be fairly robust to population demographics and recombination | Requires sequences from multiple individuals in two species; selection to change codons may adversely affect test; applicable to protein-coding regions only |
| Ratio of non-synonymous to synonymous substitutions (dn/ds) | Requires only a single sequence from each species; raw dn/ds measure is unlikely to be confounded by demographics or recombination | Raw dn/ds measure is extremely stringent; applicable to protein-coding regions only |
The site-frequency spectrum tests discussed above are confounded by demographic events such as population growth, bottlenecks and subdivision (Figure 2) and are rendered conservative by intra-locus recombination. The desire to estimate population demographic parameters, recombination rates and evolutionary parameters has prompted the development of maximum likelihood-based methods which use the complete data, rather than summary statistics [31, 43, 44]. These methods are computationally intensive and are not currently feasible for large datasets, but they potentially allow for substantial gains in statistical power relative to summary statistics methods and are likely to become increasingly important tools in the future (for a general discussion, see Felsenstein).
Another within-species approach to detecting selection is to compare the variation in allele frequencies between populations, which can be quantified by the statistic FST. Under selective neutrality, FST is determined by genetic drift, whereas natural selection is a locus-specific force that can cause systematic deviations in FST values for a selected gene and nearby genetic markers. For example, geographically restricted directional selection may lead to an increase in FST of a selected locus, whereas balancing or species-wide directional selection may lead to a decrease in FST compared with neutrally evolving loci [46–50]. In a series of simulation experiments analysing two different FST test implementations, Beaumont and Balding found that this approach yielded sufficient power to detect positive selection provided that the selective coefficient was approximately five times larger than the migration rate, but that FST had little power to detect balancing selection.
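As an illustration of how allele frequency differences translate into FST, the following minimal sketch computes Wright's FST for a single biallelic locus as (HT − HS)/HT. It assumes equally weighted populations and known allele frequencies; it is not the estimator used in any of the studies reviewed here, and the function name and the symbols HT and HS are introduced only for this example.

```python
from statistics import mean


def fst_biallelic(freqs):
    """Wright's F_ST for one biallelic locus.

    freqs: frequency of the same allele in each population (values in [0, 1]).
    Assumption: populations are weighted equally and sampling error is ignored.
    """
    if len(freqs) < 2:
        raise ValueError("need allele frequencies from at least two populations")
    if any(p < 0 or p > 1 for p in freqs):
        raise ValueError("allele frequencies must lie in [0, 1]")
    p_bar = mean(freqs)
    h_t = 2 * p_bar * (1 - p_bar)  # expected heterozygosity of the pooled population
    if h_t == 0:
        return 0.0  # locus is monomorphic everywhere; no differentiation to measure
    h_s = mean(2 * p * (1 - p) for p in freqs)  # mean within-population heterozygosity
    return (h_t - h_s) / h_t


if __name__ == "__main__":
    print(fst_biallelic([0.5, 0.5]))  # identical frequencies
    print(fst_biallelic([0.2, 0.8]))  # strongly differentiated
    print(fst_biallelic([0.0, 1.0]))  # fixed differences
```

Identical frequencies give FST = 0 and fixed differences give FST = 1; a locus under geographically restricted selection would be expected to sit towards the upper end of this range relative to other loci.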
Positive selection is also expected to increase levels of linkage disequilibrium (LD) relative to neutral expectations. A recently developed statistical test, the long-range haplotype (LRH) test, takes advantage of ancestral recombination events and the associated decay in LD to identify genes subject to positive selection. The rationale for this test is that a common allele with long-range LD potentially represents a site that has appeared recently and was driven to high frequency before recombination could erode LD. The LRH approach does not detect balancing selection, however, and the robustness of the test to non-neutral population demographics, the choice of haplotype-defining markers and phase misspecification has not been well studied.
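The decay of LD around a core allele can be sketched with extended haplotype homozygosity (EHH), the quantity on which the LRH test is usually built: the probability that two randomly chosen chromosomes carrying the core allele are identical at every marker between the core and a given position. The sketch below assumes phased haplotypes encoded as equal-length strings; the function name and data layout are hypothetical, and no significance testing is attempted.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, end_index):
    """Extended haplotype homozygosity from the core marker out to end_index.

    haplotypes: phased haplotypes as equal-length strings, one character per marker.
    Assumption: phase is known without error, which real data rarely guarantee.
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two chromosomes carrying the core allele")
    lo, hi = sorted((core_index, end_index))
    # Group carriers by their haplotype over the interval [lo, hi].
    groups = Counter(h[lo:hi + 1] for h in carriers)
    # Fraction of chromosome pairs that are identical over the whole interval.
    return sum(comb(c, 2) for c in groups.values()) / comb(n, 2)


if __name__ == "__main__":
    haps = ["0110", "0110", "0111", "0100", "1000"]
    for end in range(4):
        print(end, ehh(haps, core_index=0, core_allele="0", end_index=end))
```

In this toy example EHH starts at 1 at the core and falls to 0.5 and then 1/6 as more distant markers are included; a core allele that is both common and retains high EHH over long distances is the pattern the LRH rationale describes.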
The second major class of neutrality tests compares levels of within-species polymorphism and between-species divergence and includes the Hudson-Kreitman-Aguadé (HKA) and McDonald-Kreitman (MK) tests. The HKA method tests the goodness of fit of the observed levels of polymorphism within species and the observed divergence between species to those predicted under neutral theory. In order to determine polymorphism and divergence expectations under neutrality, data are required from at least two loci in each species, so that a simultaneous estimate can be made of a time-since-speciation parameter and a relative population size parameter. Under the HKA test, rejection of the null is formally interpreted as elevated polymorphism at one locus or reduced polymorphism at the other, or excess divergence at one locus or limited divergence at the other. Thus, it may not be obvious which locus or which process is responsible for producing a statistically significant test. McDonald has described improvements to the HKA test which may ameliorate this problem.
In the MK test, a 2 × 2 contingency table is formed to compare the number of non-synonymous and synonymous sites that are polymorphic within a species (PN and PS) and fixed between species (DN and DS). Under neutrality, the ratio of non-synonymous to synonymous sites that are polymorphic equals the ratio of non-synonymous to synonymous sites that are fixed (ie PN/PS = DN/DS). Under positive selection, however, these two ratios are no longer equal and DN/DS > PN/PS. Among the strengths of the MK test are that it does not require assumptions about population demographic history (although under some circumstances the test can be adversely affected by increases in effective population size) and is relatively insensitive to intra-locus recombination. Positive or purifying selection for codon usage may, however, bias the MK test.
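The 2 × 2 comparison described above can be sketched as follows. The code computes the ratio (PN/PS)/(DN/DS), often called the neutrality index (a name not used in the text), together with a two-sided Fisher's exact test implemented from the hypergeometric distribution. The choice of Fisher's exact test and the counts in the usage example are assumptions made for illustration only.

```python
from math import comb


def mk_test(pn, ps, dn, ds):
    """McDonald-Kreitman style 2 x 2 test.

    Returns (neutrality_index, p_value), where the neutrality index is
    (PN/PS) / (DN/DS) and p_value is a two-sided Fisher's exact p-value.
    Values of the index below 1 correspond to DN/DS > PN/PS.
    """
    polymorphic = pn + ps   # row total: sites polymorphic within species
    nonsyn = pn + dn        # column total: non-synonymous sites
    total = pn + ps + dn + ds
    if total == 0:
        raise ValueError("table is empty")

    def prob(k):
        # Hypergeometric probability of k non-synonymous sites among the polymorphic ones.
        return comb(nonsyn, k) * comb(total - nonsyn, polymorphic - k) / comb(total, polymorphic)

    k_min = max(0, polymorphic - (total - nonsyn))
    k_max = min(polymorphic, nonsyn)
    p_obs = prob(pn)
    # Two-sided p-value: sum over all tables no more probable than the observed one.
    p_value = sum(prob(k) for k in range(k_min, k_max + 1) if prob(k) <= p_obs * (1 + 1e-9))

    denominator = ps * dn
    index = (pn * ds) / denominator if denominator else float("inf")
    return index, min(p_value, 1.0)


if __name__ == "__main__":
    # Illustrative counts only (not taken from the studies reviewed here).
    index, p = mk_test(pn=2, ps=42, dn=7, ds=17)
    print(f"neutrality index = {index:.3f}, p = {p:.4f}")
```

For these illustrative counts the index is well below 1 and the test is significant at the 5 per cent level, which is the pattern expected when fixed differences are enriched for non-synonymous changes.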
The final class of neutrality tests uses between-species data to test for adaptive protein evolution. The classic test of positive selection compares the number of non-synonymous (amino acid-changing) substitutions in a gene (dn) with the number of synonymous (silent) substitutions (ds). Under neutrality, the mutation rate at both categories of sites is the same, and dn/ds is expected to equal one; however, dn/ds < 1 for proteins subject to purifying selection and dn/ds > 1 for proteins under adaptive evolution. Although dn/ds > 1 provides strong evidence for adaptive protein evolution, it is a very conservative test, particularly if only a small number of codons have been selected for. The basic test has also been extended by Nielsen and Yang and others to include models of codon and transition/transversion bias, to detect variation in dn/ds ratios among lineages and to identify specific codons under selection [56, 57].
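A rough sketch of the raw dn/ds calculation is given below. It assumes that the numbers of synonymous and non-synonymous sites and differences between two aligned sequences have already been counted, and it applies a Jukes-Cantor correction for multiple hits in the manner of simple counting methods; this is not the maximum-likelihood codon model approach mentioned above, and the function names are hypothetical.

```python
from math import log


def jukes_cantor(p):
    """Jukes-Cantor distance for a proportion p of differing sites."""
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * log(1 - 4 * p / 3)


def dn_ds(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dn, ds, dn/ds) from pre-counted sites and differences.

    Assumption: site and difference counts are supplied by the caller;
    counting them from codon alignments is outside this sketch.
    """
    if nonsyn_sites <= 0 or syn_sites <= 0:
        raise ValueError("site counts must be positive")
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ValueError("ds is zero; the ratio is undefined")
    return dn, ds, dn / ds


if __name__ == "__main__":
    # Illustrative counts only.
    dn, ds, omega = dn_ds(nonsyn_diffs=10, nonsyn_sites=700, syn_diffs=30, syn_sites=300)
    print(f"dn = {dn:.4f}, ds = {ds:.4f}, dn/ds = {omega:.3f}")
```

In this example dn/ds is well below 1, the pattern expected under purifying selection; because the ratio is averaged over the whole gene, a handful of adaptively evolving codons would leave it largely unchanged, which is why the raw test is so conservative.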
Key advantages of genome-wide analyses
As alluded to above, distinguishing between the confounding effects of natural selection and population demographic history is difficult when studying a single locus. When many unlinked genes are considered, however, a clear strategy emerges. Population demographic history affects patterns of variation at all loci in a genome in a similar manner, whereas natural selection acts upon specific loci [12, 37, 46, 58]. Therefore, by sampling a large number of unlinked loci throughout the genome, empirical distributions of test statistics can be constructed and genes subject to locus-specific forces, such as natural selection, can be identified as outlier loci.
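The outlier strategy can be sketched as follows: each locus is ranked against the empirical distribution of the same statistic across all sampled loci, and those in the extreme upper tail are flagged. The function names and the 1 per cent tail are assumptions chosen for illustration.

```python
from bisect import bisect_left


def empirical_p_values(stats):
    """Upper-tail empirical p-value for each locus.

    stats: mapping from locus name to test statistic (e.g. FST).
    The p-value is the fraction of all loci with a statistic at least as large.
    """
    values = sorted(stats.values())
    n = len(values)
    if n == 0:
        raise ValueError("no loci supplied")
    # bisect_left gives the number of loci with a strictly smaller statistic.
    return {locus: (n - bisect_left(values, v)) / n for locus, v in stats.items()}


def outlier_loci(stats, tail=0.01):
    """Loci whose empirical p-value falls in the upper `tail` of the distribution."""
    return sorted(locus for locus, p in empirical_p_values(stats).items() if p <= tail)


if __name__ == "__main__":
    # 100 background loci with low statistics plus one strongly differentiated locus.
    stats = {f"locus{i}": i / 1000 for i in range(100)}
    stats["candidate"] = 0.9
    print(outlier_loci(stats, tail=0.01))
```

Note that a fixed tail fraction flags some loci by construction even if no locus is under selection, so membership in the tail identifies candidates rather than demonstrating selection.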
In addition to providing empirical distributions, genome-wide scans for natural selection offer several additional advantages compared with single-locus studies. Genome-wide scans can suggest general principles about the types of variation that natural selection acts most forcefully upon. Datasets derived from an unbiased sampling of loci throughout the genome allow for the discovery of novel functional elements whose presence is revealed by evidence for selection. Whole-genome scans also have the potential to reveal networks of genes whose evolutionary histories are correlated due to their collaboration in executing cellular functions. Finally, it is important to stress that genome-wide analyses do not preclude single-locus analyses, and that achieving a detailed and thorough understanding of the selective and demographic forces acting upon a locus will necessitate focused single-locus analyses drawing from multiple scientific disciplines.
Genome scans for natural selection
Table: Summary of genome-wide scans for selection

| Number of loci | Main result |
|---|---|
| 26,530 SNPs | 174 candidate selection genes were identified whose levels of population structure were inconsistent with neutrality |
| 5,257 microsatellites | 43 sliding windows were identified that contained significant deficits in heterozygosity relative to neutral expectations |
| 332 microsatellites | 15 loci were identified with levels of population structure inconsistent with neutrality |
| 624 microsatellites | 13 loci were identified with levels of population structure inconsistent with neutrality |
| 7,645 genes | 1,547 genes with dn/ds > 1 in the human lineage |
One of the first genome-wide screens for selection to be performed analysed 26,530 single nucleotide polymorphisms (SNPs), which were genotyped in three human populations: African-Americans, East Asians and European-Americans. An empirical distribution of FST was constructed and outlier SNPs in gene regions were identified. As discussed above, geographically restricted selection (local adaptation) can accentuate levels of population structure by creating large differences in allele frequencies between populations. Conversely, balancing selection can lead to lower than expected levels of population structure. In total, 174 candidate selection genes were identified whose levels of population structure were significantly different compared with neutral expectations (156 genes had exceptionally high values of FST and 18 had exceptionally low values of FST). In addition, the average FST was significantly different between SNPs located in exons, introns and non-genic regions, which is consistent with the action of purifying selection. One limitation of this study was that it relied upon markers that were discovered in a small number of chromosomes, which can lead to significant ascertainment bias (ie in this case, an over-representation of intermediate-frequency alleles). Such ascertainment bias complicates inferences of natural selection, and, as the authors note, additional analyses are needed to confirm the signature of selection in these genes.
Three genome-wide scans for natural selection have also been performed with microsatellite markers [66–68], the largest of which analysed 5,257 microsatellite markers in 28 individuals of European descent. A sliding window analysis across the genome revealed 43 bins that contained a significant reduction in heterozygosity relative to neutral expectations. Interestingly, the recombination rate in these 43 bins was significantly reduced compared with the genome-wide average, which is consistent with theoretical predictions that positive selection will be easier to detect in regions of the genome with low recombination rates.
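A sliding window analysis of heterozygosity can be sketched as below. The sketch computes an unbiased expected heterozygosity per marker from sampled alleles and then averages it over windows of consecutive markers. Defining windows by marker count rather than physical or genetic distance, and the function names themselves, are assumptions made for illustration; the original study's window definition and significance test are not reproduced.

```python
from collections import Counter


def expected_heterozygosity(alleles):
    """Unbiased expected heterozygosity, n/(n-1) * (1 - sum of squared allele frequencies).

    alleles: the sampled allele (e.g. repeat length) carried by each chromosome.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two sampled chromosomes")
    counts = Counter(alleles)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - homozygosity)


def sliding_window_means(values, window, step=1):
    """Mean of `values` in windows of `window` consecutive markers, advanced by `step`."""
    if window < 1 or step < 1:
        raise ValueError("window and step must be positive integers")
    return [
        sum(values[i:i + window]) / window
        for i in range(0, len(values) - window + 1, step)
    ]


if __name__ == "__main__":
    # Per-marker heterozygosities along a chromosome (illustrative values only).
    het = [0.8, 0.7, 0.1, 0.2, 0.9]
    means = sliding_window_means(het, window=2)
    lowest = min(range(len(means)), key=means.__getitem__)
    print(means, "lowest window starts at marker", lowest)
```

Windows with unusually low mean heterozygosity relative to the rest of the genome are the candidates such a scan would go on to test.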
The other two microsatellite-based genome-wide scans for selection included multiple populations and searched for evidence of local adaptation by identifying outlier loci that exhibited high levels of population structure relative to the empirical distribution of all loci. Specifically, Kayser et al. studied 332 microsatellite markers in 47 Europeans and 47 Africans (23 Ethiopians and 24 South Africans). The test statistics RST, a multiallelic analogue of FST, and ln RV, which is the natural log of the variance in allele sizes between populations, were calculated for all loci. Numerous outlier loci were detected and 11 were studied further by genotyping additional microsatellite markers in these regions. The additional microsatellite analyses confirmed the large differences in genetic differentiation, which strengthens the hypothesis that outlier loci have been targets of geographically restricted selective pressures. Similarly, Storz et al. analysed a total of 624 microsatellite loci that were previously genotyped in multiple populations from Africa, Europe and Asia. Again, measures of population structure were calculated for all markers (FST and an analogue to ln RV) and outlier loci were identified. In total, 13 outlier loci were found and all but one had significant reductions in heterozygosity in non-African populations; this was interpreted as evidence that local adaptation was more common outside of Africa. An important limitation of the microsatellite analyses is that the high mutation rate of microsatellites may obscure signatures of selection, except in low-recombining regions of the genome [70, 71].
In one of the largest gene-based genome-wide screens performed to date, Clark et al. analysed 7,645 orthologous genes from humans, chimpanzees and mice (see also Figure 3D). Maximum-likelihood models were fitted to protein-coding DNA sequences to estimate rates of synonymous (ds) and non-synonymous (dn) substitutions. In total, 1,547 genes had dn/ds ratios > 1 in humans, which is commonly interpreted as evidence for positive selection, but the neutral model could be formally rejected at p < 0.05 for only six of these genes. Using an alternative statistical method with greater sensitivity, branch site models were fitted to the data in order to detect accelerated rates of dn/ds in the human lineage for a subset of nucleotide sites (ie dn/ds does not have to be > 1 for the entire gene). A total of 667 genes were identified as significant at p < 0.05 in this analysis; subsequent bioinformatics analyses revealed two interesting observations. First, accelerated rates of evolution were found for several functional classes of genes, including olfactory, nuclear transport and sensory perception. Secondly, genes with evidence for positive selection were enriched for genes that are associated with human diseases, as defined by the Online Mendelian Inheritance in Man (OMIM) database. OMIM primarily contains monogenic disease genes with large phenotypic effects, and it will therefore be interesting to see if these results also extend to complex disease genes. Indeed, signatures of natural selection have been described for several genes associated with various complex diseases [34, 72–78]. If complex disease genes are enriched for signatures of natural selection, finding targets of adaptive evolution may be a useful strategy for prioritising candidate genes in disease-mapping studies.
It is important to note that a recent theoretical study has suggested that maximum-likelihood branch site models may have a high false-positive rate and, therefore, the 667 significant (at p < 0.05) genes in the study by Clark et al. may contain a higher than anticipated fraction of false positives. In addition, increased rates of dn/ds along a lineage do not always indicate the action of positive selection and can also occur due to relaxation of purifying selection [79, 80]. As the authors point out, obtaining polymorphism data from human populations would provide further insight into the evolutionary history of these genes and help to clarify some of the issues raised above.
Table: Genes with evidence of local adaptation (column heading: potential selective pressure)
In addition, several studies have found that non-African populations possess more evidence for selection relative to African populations [60, 67, 68]. As most studies have considered only a single African population, however, it is difficult to determine whether the observed difference in the frequency of selective events between African and non-African populations is a general phenomenon or simply reflects the need to sample African populations more comprehensively. Furthermore, theoretical studies have demonstrated that the power to detect a recent selective sweep is greater than the power to detect an older sweep [41, 42, 88, 89]. Therefore, selective events may be similarly frequent in African and non-African populations, but easier to detect in non-African populations if they occurred more recently.
Looking ahead: The HapMap project
The HapMap project (http://www.hapmap.org/) is a large international collaboration to describe patterns of common haplotype variation throughout the human genome. The initial goal of the HapMap project is to genotype 600,000 SNPs in 270 individuals: 90 individuals of northern and western European ancestry (30 trios consisting of two parents and an adult child), 90 Yoruban individuals from Ibadan, Nigeria (30 trios), 45 unrelated Japanese individuals from Tokyo, Japan, and 45 unrelated Han Chinese individuals from Beijing, China. Although the HapMap project was initially developed to facilitate the search for complex disease genes, it will provide a powerful resource for population genetics and evolutionary studies. Specifically, it will provide a unifying publicly available resource of genome-wide variation data to interrogate systematically for signatures of natural selection. As numerous evolutionary analyses will undoubtedly be conducted on the HapMap data, results can be verified across studies, which will allow candidate selection genes to be prioritised for subsequent studies.
It is important to temper our enthusiasm for genome-wide scans of natural selection because several analytical and conceptual challenges remain. For example, as indicated above, thousands of hypothesis tests will be performed in a typical study and it is necessary to correct for multiple tests to avoid an unacceptably high false-positive rate. One particularly appealing approach is to control the false discovery rate [90, 91], which is more powerful than traditional methods such as Bonferroni corrections and has been used in a wide variety of genomics analyses. Furthermore, as numerous genome-wide scans for selection will be applied to common datasets, such as the HapMap, methods for combining results across studies would be invaluable.
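The difference between the two corrections can be illustrated with a minimal sketch of the Benjamini-Hochberg step-up procedure, one standard way of controlling the false discovery rate, alongside a Bonferroni correction. The text does not specify which false discovery rate procedure is intended, so the choice of Benjamini-Hochberg, the function names and the p-values are assumptions made for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans, True where the corresponding hypothesis is
    rejected while controlling the false discovery rate at level alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, idx in enumerate(order, start=1):
        # Find the largest rank k with p_(k) <= k * alpha / m.
        if p_values[idx] <= rank * alpha / m:
            largest_rank = rank
    rejected = [False] * m
    for idx in order[:largest_rank]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Bonferroni correction: reject only where p <= alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


if __name__ == "__main__":
    # Illustrative p-values only.
    p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print("BH rejections:        ", sum(benjamini_hochberg(p)))
    print("Bonferroni rejections:", sum(bonferroni(p)))
```

For these ten illustrative p-values the step-up procedure rejects two hypotheses and Bonferroni only one, which is the gain in power referred to above; with thousands of loci the gap is typically much larger.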
A critical issue that has already arisen in current genome-wide scans for selection is the need to verify the signature of selection through replication studies and by alternative experimental approaches. The importance of follow-up studies cannot be overstated because in their absence we will simply be left with a list of interesting 'candidate selection genes'. The problem of follow-up replication in genome-wide studies is a general one that has been considered in linkage analysis and genetic association studies. Clearly, replication in independent samples from the same population is an important criterion that can be used to discard false positives that accumulate from the multiple testing inherent in genome scans. Genome-wide study designs are known to suffer from the 'winner's curse' phenomenon, however, whereby the effect sizes of statistically significant loci are systematically over-estimated [93, 94]. If such concerns are ignored, the statistical power of subsequent replication attempts is likely to be over-estimated, leading the community to place undue faith in the veracity of failed replication attempts. Even if signatures of selection are confirmed, it remains difficult to identify the specific variants that have been subject to selection. Ideally, suspected targets of selection will be functionally characterised, which will facilitate inferences on genotype-phenotype correlations and ultimately on how the putative selected alleles affect fitness. Finally, more powerful methods to estimate evolutionary parameters, such as the timing of selective events and the strength of selection, need to be developed.
In addition to the issues described above, it is important to note that all of the statistical methods and studies considered in this review are predicated upon simple theoretical models of natural selection. For example, tests such as Tajima's D search for signatures of selection that act on a single locus. Genes do not exist in isolation, however, and it is possible, perhaps even likely, that selection acts on combinations of alleles, a process that is referred to as epistatic selection. Recently, two studies in Drosophila melanogaster demonstrated strong empirical evidence for epistatic selection [96, 97]. It seems likely that progress in reconstructing gene and protein networks will serve as a valuable guide in beginning to explore epistatic selection in humans.
The intersection of high-throughput methods to access human genetic variation on a genome-wide scale and statistical tools to identify signatures of natural selection will undoubtedly provide a deeper understanding of how adaptive processes helped to shape our genomes. Furthermore, the same resources used to scan the genome for signatures of selection will also provide a more comprehensive understanding of human demographic history, which will be necessary to understand how neutral and non-neutral evolutionary forces have interacted to shape extant patterns of human genetic and phenotypic diversity. Although many hurdles are likely to be encountered, the evolutionary insights obtained from genome-wide analyses will have implications for many contemporary issues, such as the functional annotation of the human genome and the discovery of complex disease genes.
We thank Jennifer Madeoy, Dayna Akey and an anonymous reviewer for critical reading of the manuscript and providing valuable comments. J.R. is supported by the University of Washington Medical Scientist Training Program. J.M.A. is supported by a Pilot and Feasibility Award from the Clinical Nutrition Research Unit at the University of Washington.
- Valle D: Genetics, individuality, and medicine in the 21st century. Am J Hum Genet. 2004, 74: 374-381. 10.1086/382790.PubMed CentralView ArticlePubMedGoogle Scholar
- Bamshad M, Wooding SP: Signatures of natural selection in the human genome. Nat Rev Genet. 2003, 4: 99-111. 10.1038/nrg999.View ArticlePubMedGoogle Scholar
- Harr B, Kauer M, Schlotterer C: Hitchhiking mapping: A population-based fine mapping strategy for adaptive mutations in Drosophila melanogaster. Proc Natl Acad Sci USA. 2002, 99: 12949-12954. 10.1073/pnas.202336899.PubMed CentralView ArticlePubMedGoogle Scholar
- Kauer MO, Dieringer D, Schlotterer C: A microsatellite variability screen for positive selection associated with the "Out of Africa" habitat expansion of Drosophila melanogaster. Genetics. 2003, 165: 1137-1148.PubMed CentralPubMedGoogle Scholar
- Schofl G, Schlotterer C: Patterns of microsatellite variability among X chromosomes and autosomes indicate a high frequency of beneficial mutations in non-African D. simulans. Mol Biol Evol. 2004, 21: 1384-1390. 10.1093/molbev/msh132.View ArticlePubMedGoogle Scholar
- Kimura M: Evolutionary rate at the molecular level. Nature. 1968, 217: 624-626. 10.1038/217624a0.View ArticlePubMedGoogle Scholar
- King JL, Jukes TH: Non-Darwinian evolution. Science. 1969, 164: 788-798. 10.1126/science.164.3881.788.View ArticlePubMedGoogle Scholar
- Kimura M: The Neutral Theory of Molecular Evolution. 1983, Cambridge University Press, Cambridge, UKView ArticleGoogle Scholar
- Ohta T: Slightly deleterious mutant substitutions in evolution. Nature. 1973, 246: 96-98. 10.1038/246096a0.View ArticlePubMedGoogle Scholar
- Ohta T, Gillespie JH: Development of neutral and nearly neutral theories. Theor Popul Biol. 1996, 49: 128-142. 10.1006/tpbi.1996.0007.View ArticlePubMedGoogle Scholar
- Otto SP: Detecting the form of selection from DNA sequence data. Trends Genet. 2000, 16: 526-529. 10.1016/S0168-9525(00)02141-7.View ArticlePubMedGoogle Scholar
- Nielsen R: Statistical tests of selective neutrality in the age of genomics. Heredity. 2001, 86: 641-647. 10.1046/j.1365-2540.2001.00895.x.View ArticlePubMedGoogle Scholar
- Kingman JFC: The coalescent. Stochastic Process Appl. 1982, 13: 235-248. 10.1016/0304-4149(82)90011-4.
- Kingman JFC: On the genealogy of large populations. J Appl Prob. 1982, 19A: 27-43.
- Hudson RR: Properties of a neutral allele model with intragenic recombination. Theor Popul Biol. 1983, 23: 183-201. 10.1016/0040-5809(83)90013-8.
- Hudson RR: Testing the constant-rate neutral allele model with protein sequence data. Evolution. 1983, 37: 203-217. 10.2307/2408186.
- Tajima F: Evolutionary relationship of DNA sequences in finite populations. Genetics. 1983, 105: 437-460.
- Fu YX, Li WH: Coalescing into the 21st century: An overview and prospects of coalescent theory. Theor Popul Biol. 1999, 56: 1-10. 10.1006/tpbi.1999.1421.
- Rosenberg NA, Nordborg M: Genealogical trees, coalescent theory and the analysis of genetic polymorphisms. Nat Rev Genet. 2002, 3: 380-390. 10.1038/nrg795.
- Charlesworth B, Morgan MT, Charlesworth D: The effect of deleterious mutations on neutral molecular variation. Genetics. 1993, 134: 1289-1303.
- Hudson RR, Kaplan NL: Deleterious background selection with recombination. Genetics. 1995, 141: 1605-1617.
- Neuhauser C, Krone SK: The genealogy of samples in models with selection. Genetics. 1997, 145: 519-534.
- Maynard Smith J, Haigh J: The hitch-hiking effect of a favorable gene. Genet Res. 1974, 23: 23-35.
- Thomson G: The effect of a selected locus on a linked neutral locus. Genetics. 1977, 85: 752-788.
- Kaplan N, Hudson RR, Langley CH: The "hitchhiking effect" revisited. Genetics. 1989, 123: 887-899.
- Stephan W, Wiehe THE, Lenz MW: The effect of strongly selected substitutions on neutral polymorphism: Analytical results based on diffusion theory. Theor Popul Biol. 1992, 41: 237-254. 10.1016/0040-5809(92)90045-U.
- Nordborg M: Structured coalescent processes on different time scales. Genetics. 1997, 146: 1501-1514.
- Schierup MH, Vekemans X, Charlesworth D: The effect of subdivision on variation at multi-allelic loci under balancing selection. Genet Res. 2000, 76: 51-62. 10.1017/S0016672300004535.
- Kelly JK, Wade MJ: Molecular evolution near a two-locus balanced polymorphism. J Theor Biol. 2000, 204: 83-101. 10.1006/jtbi.2000.2003.
- Nordborg M, Innan H: The genealogy of sequences containing multiple sites subject to strong selection in a subdivided population. Genetics. 2003, 163: 1201-1213.
- Nielsen R: Estimation of population parameters and recombination rates from single nucleotide polymorphisms. Genetics. 2000, 154: 931-942.
- Braverman JM, Hudson RR, Kaplan NL, et al: The hitchhiking effect on the site frequency spectrum of DNA polymorphism. Genetics. 1995, 140: 783-796.
- Fay JC, Wu CI: Hitchhiking under positive Darwinian selection. Genetics. 2000, 155: 1405-1413.
- Sabeti PC, Reich DE, Higgins JM, et al: Detecting recent positive selection in the human genome from haplotype structure. Nature. 2002, 419: 832-837. 10.1038/nature01140.
- Takahata N, Nei M: Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics. 1990, 124: 967-978.
- Tajima F: The effect of change in population size on DNA polymorphism. Genetics. 1989, 123: 597-601.
- Przeworski M, Hudson RR, Di Rienzo A: Adjusting the focus on human variation. Trends Genet. 2000, 16: 296-302. 10.1016/S0168-9525(00)02030-8.
- Kreitman M: Methods to detect selection in populations with applications to the human. Annu Rev Genomics Hum Genet. 2000, 1: 539-559. 10.1146/annurev.genom.1.1.539.
- Tajima F: Statistical method for testing the neutral mutation hypothesis by DNA polymorphism. Genetics. 1989, 123: 585-595.
- Fu YX, Li WH: Statistical test of neutrality of mutations. Genetics. 1993, 133: 693-709.
- Fu YX: Statistical tests of neutrality of mutations against population growth, hitchhiking and background selection. Genetics. 1997, 147: 915-925.
- Simonsen KL, Churchill GA, Aquadro CF: Properties of statistical tests of neutrality for DNA polymorphism data. Genetics. 1995, 141: 413-429.
- Kuhner MK, Yamato J, Felsenstein J: Estimating effective population size and mutation rate from sequence data using Metropolis-Hastings sampling. Genetics. 1995, 140: 1421-1430.
- Kuhner MK, Yamato J, Felsenstein J: Maximum likelihood estimation of population growth rates based on the coalescent. Genetics. 1998, 149: 429-434.
- Felsenstein J: Likelihood calculations on coalescents. Inferring Phylogenies. Edited by: Felsenstein J. 2004, Sinauer Associates, Sunderland, MA, 470-487.
- Cavalli-Sforza LL: Population structure and human evolution. Proc R Soc Lond B Biol Sci. 1966, 164: 362-379. 10.1098/rspb.1966.0038.
- Lewontin RC, Krakauer J: Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms. Genetics. 1973, 74: 175-195.
- Weir BS, Cockerham CC: Estimating F-statistics for the analysis of population structure. Evolution. 1984, 38: 1358-1370. 10.2307/2408641.
- Vitalis R, Dawson K, Boursot P: Interpretation of variation across marker loci as evidence of selection. Genetics. 2001, 158: 1811-1823.
- Beaumont MA, Balding DJ: Identifying adaptive genetic divergence among populations from genome scans. Mol Ecol. 2004, 13: 969-980. 10.1111/j.1365-294X.2004.02125.x.
- Hudson RR, Kreitman M, Aguade M: A test of neutral molecular evolution based on nucleotide data. Genetics. 1987, 116: 153-159.
- McDonald JH, Kreitman M: Adaptive protein evolution at the Adh locus in Drosophila. Nature. 1991, 351: 652-654. 10.1038/351652a0.
- McDonald JH: Improved tests for heterogeneity across a region of DNA sequence in the ratio of polymorphism to divergence. Mol Biol Evol. 1998, 15: 377-384. 10.1093/oxfordjournals.molbev.a025934.
- Eyre-Walker A: Changing effective population size and the McDonald-Kreitman test. Genetics. 2002, 162: 2017-2024.
- Nielsen R, Yang Z: Likelihood models for detecting positive selected amino acid sites and applications to the HIV-1 envelope gene. Genetics. 1998, 148: 929-936.
- Nei M, Gojobori T: Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol. 1986, 3: 418-426.
- Suzuki Y, Gojobori T: A method for detecting positive selection at single amino acid sites. Mol Biol Evol. 1999, 16: 1315-1328. 10.1093/oxfordjournals.molbev.a026042.
- Andolfatto P: Adaptive hitchhiking effects on genome variability. Curr Opin Genet Dev. 2001, 11: 635-641. 10.1016/S0959-437X(00)00246-X.
- Hudson RR: Generating samples under a Wright-Fisher neutral model of genetic variation. Bioinformatics. 2002, 18: 337-338. 10.1093/bioinformatics/18.2.337.
- Akey JM, Eberle MA, Rieder MJ, et al: Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2004, 2: 1591-1599.
- International HapMap Consortium: The international HapMap project. Nature. 2003, 426: 789-794. 10.1038/nature02168.
- Clark AG, Glanowski S, Nielsen R, et al: Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science. 2003, 302: 1960-1963. 10.1126/science.1088821.
- Yang Z, Nielsen R: Estimating synonymous and nonsynonymous substitution rates under realistic evolutionary models. Mol Biol Evol. 2000, 17: 32-43. 10.1093/oxfordjournals.molbev.a026236.
- Yang Z: PAML: A program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997, 13: 555-556.
- Akey JM, Zhang G, Zhang K, et al: Interrogating a high-density SNP map for signatures of natural selection. Genome Res. 2002, 12: 1805-1814. 10.1101/gr.631202.
- Payseur BA, Cutter AD, Nachman MW: Searching for evidence of positive selection in the human genome using patterns of microsatellite variability. Mol Biol Evol. 2002, 19: 1143-1153. 10.1093/oxfordjournals.molbev.a004172.
- Kayser M, Brauer S, Stoneking M: A genome scan to detect candidate regions influenced by local natural selection in human populations. Mol Biol Evol. 2003, 20: 893-900. 10.1093/molbev/msg092.
- Storz JF, Payseur BA, Nachman MW: Genome scans of DNA variability in humans reveal evidence for selective sweeps outside of Africa. Mol Biol Evol. 2004, 21: 1800-1811. 10.1093/molbev/msh192.
- Schlötterer C: A microsatellite-based multilocus screen for the identification of local selective sweeps. Genetics. 2002, 160: 753-763.
- Schlötterer C, Wiehe T: Microsatellites, a neutral marker to infer selective sweeps. Microsatellites: Evolution and Applications. Edited by: Goldstein D, Schlötterer C. 1999, Oxford University Press, Oxford, UK, 238-248.
- Wiehe T: The effect of selective sweeps on the variance of the allele distribution of a linked multi-allele locus-hitchhiking of microsatellites. Theor Popul Biol. 1998, 53: 272-283. 10.1006/tpbi.1997.1346.
- Hamblin MT, Di Rienzo A: Detection of the signature of natural selection in humans: Evidence from the Duffy blood group locus. Am J Hum Genet. 2000, 66: 1669-1679. 10.1086/302879.
- Tishkoff SA, Varkonyi R, Cahinhinan N, et al: Haplotype diversity and linkage disequilibrium at human G6PD: Recent origin of alleles that confer malarial resistance. Science. 2001, 293: 455-462. 10.1126/science.1061573.
- Hamblin MT, Thompson EE, Di Rienzo A: Complex signatures of natural selection at the Duffy blood group locus. Am J Hum Genet. 2002, 70: 369-383. 10.1086/338628.
- Bamshad MJ, Mummidi S, Gonzalez E, et al: A strong signature of balancing selection in the 5′ cis-regulatory region of CCR5. Proc Natl Acad Sci USA. 2002, 99: 10539-10544. 10.1073/pnas.162046399.
- Fullerton SM, Bartoszewicz A, Ybazeta G, et al: Geographic and haplotype structure of candidate type 2 diabetes susceptibility variants at the calpain-10 locus. Am J Hum Genet. 2002, 70: 1096-1106. 10.1086/339930.
- Rockman MV, Hahn MW, Soranzo N, et al: Positive selection on MMP3 regulation has shaped heart disease risk. Curr Biol. 2004, 14: 1531-1539. 10.1016/j.cub.2004.08.051.
- Nakajima T, Wooding S, Sakagami T, et al: Natural selection and population history in the human angiotensinogen gene (AGT): 736 complete AGT sequences in chromosomes from around the world. Am J Hum Genet. 2004, 74: 898-916. 10.1086/420793.
- Zhang J: Frequent false detection of positive selection by the likelihood method with branch-site models. Mol Biol Evol. 2004, 21: 1332-1339. 10.1093/molbev/msh117.
- Rooney AP, Zhang J: Rapid evolution of a primate sperm protein: Relaxation of functional constraint or positive Darwinian selection? Mol Biol Evol. 1999, 16: 706-710. 10.1093/oxfordjournals.molbev.a026153.
- Gilad Y, Rosenberg S, Przeworski M, et al: Evidence for positive selection and population structure at the human MAO-A gene. Proc Natl Acad Sci USA. 2002, 99: 862-867. 10.1073/pnas.022614799.
- Rana BK, Hewett-Emmett D, Jin L, et al: High polymorphism at the human melanocortin 1 receptor locus. Genetics. 1999, 151: 1547-1557.
- Bersaglieri T, Sabeti PC, Patterson N, et al: Genetic signatures of strong recent positive selection at the lactase gene. Am J Hum Genet. 2004, 74: 1111-1120. 10.1086/421051.
- Stephens JC, Reich DE, Goldstein DB, et al: Dating the origin of the CCR5-Delta32 AIDS-resistance allele by the coalescence of haplotypes. Am J Hum Genet. 1998, 62: 1507-1515. 10.1086/301867.
- Rockman MV, Hahn MW, Soranzo N, et al: Positive selection on a human-specific transcription factor binding site regulating IL4 expression. Curr Biol. 2003, 13: 2118-2123. 10.1016/j.cub.2003.11.025.
- Nijenhuis T, Hoenderop JGJ, Nilius B, Bindels RJM: (Patho)physiological implications of the novel epithelial Ca2+ channels TRPV5 and TRPV6. Pflugers Arch. 2003, 446: 401-409. 10.1007/s00424-003-1038-7.
- van de Graaf SF, Hoenderop JG, Gkika D, et al: Functional expression of the epithelial Ca2+ channels (TRPV5 and TRPV6) requires association of the S100A10-annexin 2 complex. EMBO J. 2003, 22: 1478-1487. 10.1093/emboj/cdg162.
- Kim Y, Stephan W: Joint effects of genetic hitchhiking and background selection on neutral variation. Genetics. 2000, 155: 1415-1427.
- Przeworski M: The signature of positive selection at randomly chosen loci. Genetics. 2002, 160: 1179-1189.
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J R Stat Soc Ser B. 1995, 57: 289-300.
- Storey JD, Tibshirani R: Statistical significance for genome-wide experiments. Proc Natl Acad Sci USA. 2003, 100: 9440-9445. 10.1073/pnas.1530509100.
- Lander E, Kruglyak L: Genetic dissection of complex traits: Guidelines for interpreting and reporting linkage results. Nat Genet. 1995, 11: 241-247. 10.1038/ng1195-241.
- Lohmueller KE, Pearce CL, Pike M, et al: Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003, 33: 177-182. 10.1038/ng1071.
- Goring HH, Terwilliger JD, Blangero J: Large upward bias in estimation of locus-specific effects from genomewide scans. Am J Hum Genet. 2001, 69: 1357-1369. 10.1086/324471.
- Lewontin RC, Kojima K: The evolutionary dynamics of complex polymorphisms. Evolution. 1960, 14: 458-472. 10.2307/2405995.
- Takano-Shimizu T, Kawabe A, Inomata N, et al: Interlocus nonrandom association of polymorphisms in Drosophila chemoreceptor genes. Proc Natl Acad Sci USA. 2004, 101: 14156-14161. 10.1073/pnas.0401782101.
- Zapata C, Nunez C, Velasco T: Distribution of nonrandom associations between pairs of protein loci along the third chromosome of Drosophila melanogaster. Genetics. 2002, 161: 1539-1550.