The genomic distribution of population substructure in four populations using 8,525 autosomal SNPs
© Henry Stewart Publications 2004
Received: 2 February 2004
Accepted: 2 February 2004
Published: 1 May 2004
Understanding the nature of evolutionary relationships among persons and populations is important for the efficient application of genome science to biomedical research. We have analysed 8,525 autosomal single nucleotide polymorphisms (SNPs) in 84 individuals from four populations: African-American, European-American, Chinese and Japanese. Individual relationships were reconstructed using the allele sharing distance and the neighbour-joining tree making method. Trees show clear clustering according to population, with the root branching from the African-American clade. The African-American cluster is much less star-like than European-American and East Asian clusters, primarily because of admixture. Furthermore, on the East Asian branch, all ten Chinese individuals cluster together and all ten Japanese individuals cluster together. Using positional information, we demonstrate strong correlations between inter-marker distance and both locus-specific FST (the proportion of total variation due to differentiation) levels and branch lengths. Chromosomal maps of the distribution of locus-specific branch lengths were constructed by combining these data with other published SNP markers (total of 33,704 SNPs). These maps clearly illustrate a non-uniform distribution of human genetic substructure, an instructional and useful paradigm for education and research.
Keywords: population genomics, population genetics, microarray genotyping, evolution, admixture
The completion of the primary human genome sequence was announced in 2003 and millions of single nucleotide polymorphisms (SNPs) are already available in public databases [eg The SNP Consortium (TSC), dbSNP, HGVbase]. Paralleling these advances in our knowledge of the human genome have been remarkable breakthroughs in genotyping technologies, providing >1,000-fold increases in genotyping capacity. Thus, we are on the brink of an unprecedented understanding of human variation and the evolution of our species. A detailed understanding of the extent, pattern and meaning of human variation is fundamental to the effective application of genomics to studies of human biology. For example, understanding the amount of genetic structure present in human populations is relevant to epidemiological studies as, if uncontrolled for, it can produce false-positive results in association studies and lower statistical power in linkage analyses. Additionally, patterns of structure within and between human populations can be important in terms of epidemiological risks and the evaluation of drug response.
Previous studies have used relatively small numbers of genetic markers to explore the geographical patterns of human genetic variation, resulting in an incomplete picture of human diversity at the genomic level [4, 5]. Only recently has it become possible to carry out studies with thousands of markers on a genome-wide scale under the new paradigm of population genomics, which models genetic variation at both a genomic and a locus-specific level [6, 7]. We analysed the genetic variation of 8,525 autosomal markers in four population samples and two Centre d'Etude du Polymorphisme Humain (CEPH) family trios. The SNP multilocus genotype data were collected using a new method developed by Affymetrix, called 'whole genome amplification' (WGA). A sample of 78 unrelated individuals (38 European-Americans, 20 African-Americans and 20 East Asians, the last comprising ten Chinese and ten Japanese) was selected from the TSC core panel of individuals for analysis on the WGA microarrays. The two CEPH family trios (mother, father and child) are of European-American ancestry. We combined these new data with a recently available dataset consisting of 26,530 SNPs compiled from a public database. Positional information was available for 33,704 of these 36,347 SNPs and was used to investigate human population substructure at a locus-specific level, whereby genomic regions are not averaged together, but investigated as individual data elements.
The DNA samples we analysed were from two publicly available sample sets curated at the Coriell Institute (Camden, NJ): TSC and CEPH. Two family trios (mother, father and child) were selected from the CEPH family mapping panels and were European-American. The four population samples were subsets of a commonly used set of samples assembled by TSC for the purposes of SNP verification and allele frequency estimation. From these TSC panels, we included 38 European-Americans, 20 African-Americans, ten Chinese and ten Japanese. The Chinese and Japanese subjects were ascertained in the USA, but were of Chinese and Japanese ancestry.
WGA technology was used to genotype individuals in this study. Details of this method have been published elsewhere but, briefly, fractions of the genome are obtained by restriction enzyme digestion of genomic DNA, ligated with adaptors and subsequently amplified with a universal primer. The amplified target is fragmented, labelled with terminal transferase and biotin-ddATP (dideoxyadenosine triphosphate) and hybridised overnight to synthetic microarrays. Genotypes are called by interpreting signals from allele-specific probes using a model-based algorithm. The accuracy of this method is >99.5 per cent. SNPs were chosen from the TSC database on the basis of their predicted location on 400-800 base pair fragments generated by in silico digestion of human genome sequences with various restriction enzymes.
Individual genetic distances were estimated using the allele sharing distance (ASD). The tree of individuals, based on the ASD, was constructed using the neighbour-joining method with the Molecular Evolutionary Genetics Analysis software package (MEGA version 2.1). The tree branching pattern was evaluated by bootstrapping, based on 100 replicates. The principal coordinates analysis (PCA) was carried out with NTSYS software. The computer program STRUCTURE 2.0 was used to infer relative individual admixture levels in the sample. The analysis was carried out with an admixture model of K = 3 (three populations), the model previously determined to show the highest posterior probabilities for these data. A total of 25,000 simulation iterations were run for the burn-in period and 75,000 additional iterations were run to obtain parameter estimates. For estimations of individual admixture in the African-American sample, we included only the European-American and African-American subjects and set K = 2 with independent alphas. The average individual admixture in the African-American sample was 0.25.
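As an illustration of the distance step described above, the following is a minimal sketch of an allele sharing distance for biallelic SNPs. It assumes genotypes are coded as counts of one allele (0, 1 or 2), with None for a missing call, and it scores each locus as 0, 1 or 2 according to the number of alleles not shared; some definitions rescale this to the range 0 to 1. The function names and the coding scheme are assumptions made for this example and are not taken from the software used in the study.

```python
def allele_sharing_distance(g1, g2):
    """Mean per-locus allele sharing distance between two individuals.

    g1, g2: sequences of allele counts (0, 1 or 2); None marks a missing call.
    For a biallelic SNP, |g1 - g2| is 0 when both alleles are shared,
    1 when one is shared and 2 when none is shared.
    """
    total = 0
    n_loci = 0
    for a, b in zip(g1, g2):
        if a is None or b is None:
            continue  # skip loci not genotyped in both individuals
        total += abs(a - b)
        n_loci += 1
    if n_loci == 0:
        raise ValueError("no loci genotyped in both individuals")
    return total / n_loci


def asd_matrix(genotypes):
    """Symmetric matrix of pairwise distances for a list of individuals."""
    n = len(genotypes)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = allele_sharing_distance(genotypes[i], genotypes[j])
            matrix[i][j] = matrix[j][i] = d
    return matrix
```

A matrix of this kind is the usual input to a neighbour-joining or principal coordinates routine.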
Phylogenetic and clustering relationships were studied among the 84 persons using the ASD method to estimate the average distance between all pairwise combinations of individuals. Matrices of ASD measures were used to reconstruct individual trees with the neighbour-joining method and prepare a two-dimensional plot based on a PCA. Additionally, we used subsets of the total marker panels based on FST level (both high and low FST markers) to explore the effects of allele frequency difference on the results. Finally, we used pairwise FST measures to calculate locus-specific branch lengths (LSBLs) for each of the populations. Markers were grouped by the distance between them and tested for the level of correlation in branch length values.
Tree-based and PCA approaches quantify the average evolutionary relationships among individuals and populations, and FST calculated from many loci quantifies the average amount of genetic variation due to differences among groups or geographical regions. As illustrated in Figures 2, 3B and 3C, not all loci across the genome have experienced the same amount of evolutionary change. Rather, most loci have undergone only marginal changes in allele frequency, while a smaller number of loci have undergone very large changes in frequency. Although FST can be used to quantify the degree of evolution at a particular locus, and this approach has proven successful in several studies, it is not without some drawbacks.
One particular drawback is that FST is sensitive to changes in any of the populations included in the analysis. Any one (or more) of the populations could have different allele frequencies from the others, leading to a higher FST. With this in mind, we extended this approach to quantify the degree of evolution at a particular locus by calculating LSBLs (see Figure 1 and Methods section). This approach geometrically isolates allele frequency change, allowing specification of not only the amount of evolution that has occurred, but also the population(s) that underwent changes at particular loci.
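The decomposition described here can be sketched as follows. The text states only that LSBLs are calculated from the three pairwise FST values; the sketch assumes the standard three-taxon solution, in which each branch is half the sum of the two distances involving a population minus the distance between the other two. The FST function is a simple heterozygosity-based estimate, (HT - HS)/HT, with equal population weights, used here for illustration only and not the estimator applied in the study. All function names are hypothetical.

```python
def pairwise_fst(p1, p2):
    """Illustrative FST at one biallelic locus from two allele frequencies.

    Uses (HT - HS) / HT with equal weights for the two populations.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)
    if h_total == 0:
        return 0.0  # locus is monomorphic in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2
    return (h_total - h_within) / h_total


def lsbl_from_fst(fst_ab, fst_ac, fst_bc):
    """Branch lengths for populations A, B and C from three pairwise distances.

    The branches satisfy a + b = fst_ab, a + c = fst_ac and b + c = fst_bc.
    """
    a = (fst_ab + fst_ac - fst_bc) / 2
    b = (fst_ab + fst_bc - fst_ac) / 2
    c = (fst_ac + fst_bc - fst_ab) / 2
    return a, b, c


def lsbl_from_frequencies(p_a, p_b, p_c):
    """Convenience wrapper: allele frequencies in, three branch lengths out."""
    return lsbl_from_fst(
        pairwise_fst(p_a, p_b),
        pairwise_fst(p_a, p_c),
        pairwise_fst(p_b, p_c),
    )
```

When only population A differs from the other two, the whole of the change is assigned to the A branch and the B and C branches are zero, which is the sense in which the branch lengths isolate population-specific change. With estimated distances, an individual branch can come out slightly negative.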
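Table 1. Summary statistics for heterozygosity, branch lengths and FST. (Only fragments of the table body survive: the row labels 'Full FST' and 'LSBL raw data' and the values 1.6 × 10^-12, 1.4 × 10^-15, 5.5 × 10^-33, 9.0 × 10^-7 and 3.6 × 10^-63.)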
In addition to examining the branch lengths using measured allele frequencies, we adjusted for, and analysed, the effects of admixture. The European admixture rate in this sample was measured with STRUCTURE 2.0 to be 25 per cent, a reasonable level given what has been observed in other African-American populations [22, 23]. The effect of gene flow is to decrease genetic distance, in this case between the African-American and European-American samples. This results in shorter African-American and European-American branches, and relatively longer East Asian branches. Indeed, when controlling for gene flow, average branch lengths change substantially: African-American autosomal branch length increases to 0.114 from a raw level of 0.069, European-American branch length increases to 0.046 from 0.039 and East Asian branch length decreases to 0.055 from 0.066.
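The adjustment formula is not given in the text. One common way to obtain ancestry-adjusted allele frequencies, shown here purely as an assumed sketch, treats the observed frequency in the admixed sample as a linear mixture of the two parental frequencies, p_observed = m × p_source + (1 - m) × p_parental, and solves for the unadmixed parental frequency. The function name, the argument names and the clipping to the interval 0 to 1 are choices made for this example.

```python
def adjust_for_admixture(p_observed, p_source, m):
    """Estimate the unadmixed parental allele frequency at one locus.

    p_observed: allele frequency in the admixed sample
    p_source:   allele frequency in the population contributing gene flow
    m:          admixture proportion from the source population (0 <= m < 1)

    Assumes p_observed = m * p_source + (1 - m) * p_parental.
    """
    if not 0 <= m < 1:
        raise ValueError("admixture proportion must be in [0, 1)")
    p_parental = (p_observed - m * p_source) / (1 - m)
    # Sampling error can push the estimate outside [0, 1]; clip it back.
    return min(1.0, max(0.0, p_parental))
```

For example, with m = 0.25, an observed frequency of 0.4 and a source frequency of 0.1, the estimated parental frequency is 0.5. Branch lengths recomputed from such adjusted frequencies can then be compared with those from the raw frequencies.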
It is important to know if the admixture adjustment affected all markers similarly. Therefore, we calculated the correlation between the unadjusted and the admixture-adjusted branch lengths and found a high correlation (R^2 levels of 0.944, 0.977 and 0.963 for the African-American, European-American and East Asian branch lengths, respectively).
Regions where multiple SNPs showing high LSBL are in close genomic proximity indicate locations that have recently undergone dramatic changes in allele frequency because of either random genetic drift or natural selection. Therefore, it may be instructive to plot the LSBL estimates relative to their chromosomal positions. Genomic positions were obtained for a total of 33,704 SNPs from a combined set of markers (9,817 total markers from the WGA chip and 26,530 from a recent version of dbSNP, January 2003). The results have been plotted for each of the 23 chromosomes and are presented in the online supplementary material (http://www.anthro.psu.edu/biolab). Figure 1 presents chromosome 1 as an example of these plots. Patterns of high and low FST levels are clarified and decomposed by branch length values. It is usually the case that high branch lengths for linked SNPs in particular populations result in clusters of high FST levels. This is reasonable because branch lengths are calculated from the three pairwise FST values. As such, the impression is that the full FST plot is noisier, having more spikes (single high values) than the branch length plots and a higher level of extreme values for FST compared with branch lengths.
The large number of markers used in these analyses provides an unprecedented level of resolution facilitating the study of human history at the genomic level. Our investigation of multilocus genotype data on 8,525 autosomal SNPs reinforces two observations consistently reported in the literature. First, the root of the tree branches from within the African-American clade, as has been previously observed [24, 25]. This is consistent with palaeontological evidence indicating an African origin of our species. Secondly, most human genetic variation is found within populations, while a minor proportion of the total variance is due to differences between continental population groups. The average FST in this study (13.2 per cent) is similar to values reported in numerous previous studies [4, 5, 9, 18, 12, 22, 23], beginning with the classic paper by Richard Lewontin in 1972. The average FST obtained with this study's markers is not significantly different from the average FST obtained in a recent independent study based on SNPs typed in these same three populations (13.2 per cent versus 12.3 per cent; t-test p > 0.05). Additionally, the average FST level for X-chromosome SNPs is greater than that for autosomal SNPs (Table 1: 19.4 per cent vs 13.0 per cent; t-test p < 3.6 × 10^-63). Likewise, the LSBL was observed to be significantly higher for X-chromosomal markers compared with autosomal markers in all three populations. A faster rate of evolution for X-chromosome markers has been noted previously [27, 28] and the potential causes discussed in detail. A higher average level of X-chromosomal differentiation is consistent with the action of higher selection pressure, especially in the African-American sample, where the heterozygosity is not decreased; however, additional population samples without admixture are needed, particularly West African samples, before conclusions to this effect are drawn.
Although genetic variation between major continental groups represents a minor fraction of the total variation observed in humans, it is misleading to describe variation between world populations as negligible. There is a wide dispersion of FST values at the genomic level. Most of the 8,525 markers analysed in this study show small allele frequency differences between populations; however, there is a subset of markers showing very high FST values and, as illustrated in Figures 3A, 3B and 4, these loci can have major effects on the observed clustering of populations and individuals according to geographical origin. At the continental level, bootstrap support is high for individuals belonging to major branches of this tree. Japanese and Chinese individuals cluster in two separate groups within the East Asian branch, but show a lower bootstrap level. Demographic factors, restricted gene flow and natural selection driving adaptation to different environments have resulted in genetic divergence between major human continental groups that can be captured at both the population and the individual level using a large number of markers. These results confirm and extend earlier work by Cavalli-Sforza and colleagues [4, 30]. As demonstrated by Mountain and Cavalli-Sforza, clustering relationships are expected to change as additional populations are analysed and the number of subjects is increased . To some unknown extent, the large degree of separation observed in the PCA plot and on the trees is a result of having data representing geographically extreme populations to the exclusion of intermediate groups.
It is also important to note that admixture can have a profound impact on the genetic clustering of individuals [31, 32]. By contrast with the patterns observed for European-Americans and East Asians on both the PCA plot and the trees, African-American individuals do not cluster tightly or in a similarly globular fashion. We estimated relative individual ancestry levels in the African-American sample using STRUCTURE 2.0 and compared these with the branching pattern of the tree of individuals. There is remarkable correspondence between the individual admixture estimates obtained with STRUCTURE and both tree branching order (ρ = 0.983, p < 0.0001) and the PCA results (ρ = 0.988, p < 0.0001). While the role of admixture in the origins of African-American populations is widely appreciated [22, 23], the extent to which non-European ancestry is present in European-Americans has received much less attention. One individual (Eu5--Coriell# NA17205) among the 44 European-Americans stands out from the others on both the trees and the PCA graph, suggesting a significant proportion of non-European ancestry. Indeed, this person clusters with South Asians (from India) in separate analyses (data not shown). These results emphasise that in some contemporary populations, quantitative descriptions of the genetic clustering of individuals may be more appropriate than dichotomous classifications [1, 11, 33, 34].
In addition to considering its influence on genetic clustering, it is important to recognise how admixture can affect the magnitude of LSBLs. As shown in Table 1, the effect of European ancestry in the African-American sample is both to shorten African-American and European-American branches and to lengthen East Asian branches. Since more individuals of European ancestry were used to identify and validate the TSC SNPs, ascertainment bias may be affecting the overall magnitude of branch length. Although admixture has a dramatic effect on average branch length, there is a high correlation between LSBLs calculated with raw allele frequencies and those calculated with ancestry-adjusted allele frequencies.
Although potentially useful and descriptive, qualitative and quantitative assessments of individual and population affiliations and phylogenies are heuristic rather than definitive statements regarding genetic variation. Average values may describe important historical and demographic aspects of both individuals and populations under consideration; however, the nature of human genomic variation is such that there is no one history . Independent assortment, recombination, natural selection and genetic drift have resulted in tens of thousands of genomic regions, each with a unique history. The identification and subsequent exclusion of loci responding to selective pressures allows for a more realistic assessment of population demographics. This approach, first recognised by Lewontin and Krakauer, has since been expanded upon in other analyses interrogating the genome for markers affected by natural selection [9, 37–39]. Much of this focus has been concentrated on FST, which summarises the proportion of total variation due to group differences. We have used pairwise FST measures to calculate the three LSBLs, thus effectively decomposing the full FST into component parts. In this way, we can isolate and evaluate the population-specific changes in allele frequency.
To test whether LSBL captures evolutionary history, we compared branch length levels for pairs of markers (see Figure 5). Levels of correlation, which are high for closely spaced SNPs, decrease as a function of inter-marker spacing. On a genomic scale, nearby regions share more in terms of common evolutionary histories than do more widely spaced regions. A relationship between the correlation of FST for marker pairs and inter-marker distance was first shown by Akey et al. using a subset of the data analysed here. These researchers found that the correlation observed between FST and inter-marker distance was stronger than a simulated coalescent distribution assuming selective neutrality. They interpreted this higher correlation as the footprint of adaptive hitchhiking. While adaptive selection has unquestionably occurred at particular genomic locations, these correlations represent summaries of data on markers from across the genome. Since demographic events will also affect the levels of linkage disequilibrium and haplotype block characteristics, they are expected to affect relationships between levels of evolution and inter-marker distance. Thus, more generally, it can be concluded that these correlations in branch length and FST levels are functions of the shared evolutionary histories of closely linked markers and reflect a non-uniform distribution of human genetic substructure across the genome.
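The comparison of marker pairs by spacing can be sketched as below. This is an assumed, simplified procedure and not the authors' pipeline: markers on one chromosome are given as (position, value) tuples, every pair is assigned to a distance bin according to its inter-marker spacing, and a Pearson correlation of the paired values is computed within each bin. The bin edges, the minimum of three pairs per bin and the function names are illustrative choices.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists; None if undefined."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    return sxy / sqrt(sxx * syy)


def correlation_by_distance(markers, bin_edges):
    """Correlate values (eg LSBL or FST) of marker pairs within distance bins.

    markers:   list of (position_bp, value) tuples from one chromosome
    bin_edges: ascending edges; bin k covers [bin_edges[k], bin_edges[k + 1])
    Returns a dict mapping (low, high) to the correlation, or None when a
    bin holds fewer than three pairs.
    """
    markers = sorted(markers)
    pairs = defaultdict(lambda: ([], []))
    for i, (pos_i, val_i) in enumerate(markers):
        for pos_j, val_j in markers[i + 1:]:
            gap = pos_j - pos_i
            if gap >= bin_edges[-1]:
                break  # positions are sorted, so later gaps are larger still
            for k in range(len(bin_edges) - 1):
                if bin_edges[k] <= gap < bin_edges[k + 1]:
                    pairs[k][0].append(val_i)
                    pairs[k][1].append(val_j)
                    break
    result = {}
    for k in range(len(bin_edges) - 1):
        xs, ys = pairs[k]
        key = (bin_edges[k], bin_edges[k + 1])
        result[key] = pearson(xs, ys) if len(xs) >= 3 else None
    return result
```

Plotting the per-bin correlation against the bin midpoint gives a decay curve of the kind described in the text.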
The multiform distribution of genetic substructure has significant implications for research, not only in evolutionary, but also in biomedical contexts. Using LSBL to isolate allele frequency change allows for the identification and subsequent investigation of genomic regions that are candidates for having experienced recent directional selection by virtue of containing clusters of outlying SNPs. Additional work is required to develop statistics which combine positional and branch length information so that regions least likely to have been the result of genetic drift alone can be identified. The human genome has a multivariate history; consequently, efforts to control for population structure (eg genomic control, structured association and combined methods) can and should be improved through marker selection efforts. It will ultimately be possible to produce data on the scale that we demonstrate here on a routine basis in disease association studies. Until that time, however, smaller sets of informative markers can be selected from large surveys because they are informative for particular axes across which population substructure is found. Such selected sets of markers can be used to efficiently detect and adjust for even very high levels of population substructure [32, 42]. In sum, these analyses, which reveal a non-uniform distribution of human genetic substructure, suggest a paradigm relevant to the further explorations of genotype/phenotype relationships both within and among populations.
This work was supported in part by a grant: NIH/NHGRI (HG02154) to MDS. We would like to acknowledge helpful discussions with Rick Kittles, Nik Schork, Kateryna Makova and Bruce Lindsey.
- Risch N, Burchard E, Ziv E, et al: 'Categorization of humans in biomedical research: Genes, race and disease'. Genome Biol. 2002, 3: 1-12.
- Schork N, Fallin D, Tiwari HK, et al: 'Pharmacogenetics'. Handbook of Statistical Genetics. Edited by: Balding, D., Bishop, M. and Cannings, C. 2001, John Wiley and Sons, Hoboken, NJ, 741-764.
- Burroughs VJ, Maxey RW, Levy RA: 'Racial and ethnic differences in response to medicines: Towards individualized pharmaceutical treatment'. J Natl Med Assoc. 2002, 94: 1-26.
- Bowcock AM, Ruiz-Linares A, Tomfohrde J, et al: 'High resolution of human evolutionary trees with polymorphic microsatellites'. Nature. 1994, 368: 455-457. 10.1038/368455a0.
- Rosenberg NA, Pritchard JK, Weber JL, et al: 'Genetic structure of human populations'. Science. 2002, 298: 2381-2385. 10.1126/science.1078311.
- Cavalli-Sforza LL: 'Population structure and human evolution'. Proc R Soc Lond B Biol Sci. 1966, 164: 362-379. 10.1098/rspb.1966.0038.
- Black WC, Baer CF, Antolin MF, et al: 'Population genomics: Genome-wide sampling of insect populations'. Annu Rev Entomol. 2001, 46: 441-469. 10.1146/annurev.ento.46.1.441.
- Kennedy GC, Matsuzaki H, Dong S, et al: 'Large-scale genotyping of complex DNA'. Nat Biotechnol. 2003, 21: 1233-1237. 10.1038/nbt869.
- Akey JM, Zhang G, Zhang K, et al: 'Interrogating a high-density SNP map for signatures of natural selection'. Genome Res. 2002, 12: 1805-1814. 10.1101/gr.631202.
- Chee M, Yang R, Hubbell E, et al: 'Accessing genetic information with high-density DNA arrays'. Science. 1996, 274: 610-614. 10.1126/science.274.5287.610.
- Chakraborty R, Jin L: 'A unified approach to study hypervariable polymorphisms: Statistical considerations of determining relatedness and population distances'. DNA Fingerprinting: Current State of the Science. Edited by: Pena, S.D.J., Jeffreys, A.J., Epplen, J. and Chakraborty, R. 1993, Birkhauser, Basel, Switzerland, EXS Vol. 67: 153-175.
- Saitou N, Nei M: 'The neighbor-joining method: A new method for reconstructing phylogenetic trees'. Mol Biol Evol. 1987, 4: 406-425.
- Kumar S, Tamura K, Jakobsen IB, et al: 'MEGA2: Molecular evolutionary genetics analysis software'. Bioinformatics. 2001, 17: 1244-1245. 10.1093/bioinformatics/17.12.1244.
- Rohlf FJ: 1992, NTSYS-pc version 1.70.
- Pritchard JK, Stephens M, Donnelly P: 'Inference of population structure from multilocus genotype data'. Genetics. 2000, 155: 945-959.
- Hudson RR: 'Generating samples under a Wright-Fisher neutral model'. Bioinformatics. 2002, 18: 337-338. 10.1093/bioinformatics/18.2.337.
- Weir BS, Cockerham CC: 'Estimating F-statistics for the analysis of population substructure'. Evolution. 1984, 38: 1358-1370. 10.2307/2408641.
- Romualdi C, Balding D, Nasidze IS, et al: 'Patterns of human diversity, within and among continents, inferred from biallelic DNA polymorphisms'. Genome Res. 2002, 12: 602-612. 10.1101/gr.214902.
- Jorde LB, Watkins WS, Bamshad MJ, et al: 'The distribution of human genetic diversity: A comparison of mitochondrial, autosomal, and Y-chromosomal data'. Am J Hum Genet. 2000, 66: 979-988. 10.1086/302825.
- Cavalli-Sforza LL, Menozzi P, Piazza A: The History and Geography of Human Genes. 1994, Princeton University Press, Princeton, NJ.
- Lewontin R: 'The apportionment of human diversity'. Evol Biol. 1972, 6: 381-398.
- Pfaff CL, Parra EJ, Bonilla C, et al: 'Population structure in admixed populations: Effects of admixture dynamics on the pattern of linkage disequilibrium'. Am J Hum Genet. 2001, 68: 198-207. 10.1086/316935.
- Parra EJ, Marcini A, Akey J, et al: 'Estimating African American admixture proportions by use of population specific alleles'. Am J Hum Genet. 1998, 63: 1839-1851. 10.1086/302148.
- Watkins WS, Ricker CE, Bamshad MJ, et al: 'Patterns of ancestral human diversity: An analysis of Alu insertion and restriction site polymorphisms'. Am J Hum Genet. 2001, 68: 738-752. 10.1086/318793.
- Nei M, Takezaki N: 'The root of the phylogenetic tree of human populations'. Mol Biol Evol. 1996, 13: 170-177. 10.1093/oxfordjournals.molbev.a025553.
- Stringer C: 'Modern human origins: Progress and prospects'. Phil Trans R Soc Lond. 2002, 357: 563-579. 10.1098/rstb.2001.1057.
- Charlesworth B, Coyne JA, Barton NH: 'The relative rates of evolution of sex chromosomes and autosomes'. Am Nat. 1987, 130: 113-149. 10.1086/284701.
- Payseur BA, Cutter AJ, Nachman MW: 'Searching for evidence of natural selection in the genome using microsatellite variability'. Mol Biol Evol. 2002, 19: 1143-1153. 10.1093/oxfordjournals.molbev.a004172.
- Kayser M, Brauer S, Stoneking M: 'A genome scan to detect candidate regions influenced by Local Natural Selection in Human Populations'. Mol Biol Evol. 2003, 20: 893-900. 10.1093/molbev/msg092.
- Mountain J, Cavalli-Sforza LL: 'Multilocus genotypes, a tree of individuals and human evolutionary history'. Am J Hum Genet. 1997, 61: 705-718. 10.1086/515510.
- McKeigue PM, Carpenter J, Parra EJ, et al: 'Estimation of admixture and detection of linkage in admixed populations by a Bayesian approach using Markov chain simulation: Application to African-American populations'. Ann Hum Genet. 2000, 64: 171-186. 10.1046/j.1469-1809.2000.6420171.x.
- Shriver MD, Parra EJ, Dios S, et al: 'Skin pigmentation, biogeographical ancestry and admixture mapping'. Hum Genet. 2003, 112: 387-399.
- Wilson JF, Weale ME, Smith AC, et al: 'Population genetic structure of variable drug response'. Nat Genet. 2001, 29: 265-269. 10.1038/ng761.
- Bamshad MJ, Wooding S, Watkins WS, et al: 'Human population genetic structure and inference of group membership'. Am J Hum Genet. 2003, 72: 578-589. 10.1086/368061.
- Mountain J, Cavalli-Sforza LL: 'Inference of human evolution through cladistic analysis of nuclear DNA restriction polymorphisms'. Proc Natl Acad Sci USA. 1994, 91: 6515-6519. 10.1073/pnas.91.14.6515.
- Lewontin RC, Krakauer J: 'Distribution of gene frequency as a test of the theory of the selective neutrality of polymorphisms'. Genetics. 1973, 74: 175-195.
- Bowcock AM, Kidd JR, Mountain JL, et al: 'Drift, admixture, and selection in human evolution: A study with DNA polymorphisms'. Proc Natl Acad Sci USA. 1991, 88: 839-843. 10.1073/pnas.88.3.839.
- Beaumont MA, Nichols RA: 'Evaluating loci for use in the genetic analysis of population structure'. Proc Biol Soc. 1996, 263: 1619-1626. 10.1098/rspb.1996.0237.
- Vitalis R, Dawson K, Boursot P: 'Interpretation of variation across marker loci as evidence of selection'. Genetics. 2001, 158: 1811-1823.
- Devlin B, Roeder K: 'Genomic control for association studies'. Biometrics. 1999, 55: 997-1004. 10.1111/j.0006-341X.1999.00997.x.
- Pritchard JK, Donnelly P: 'Case-control studies of association in structured or admixed populations'. Theor Popul Biol. 2001, 60: 227-237. 10.1006/tpbi.2001.1543.
- Hoggart CJ, Parra EJ, Shriver MD, et al: 'Control of confounding of genetic associations in stratified populations'. Am J Hum Genet. 2003, 72: 1492-1504. 10.1086/375613.