In this study, we used a panel of 168 AIMs to estimate the ancestry composition of a multi-ethnic US sample collected for studies of drug addiction in New York City and Las Vegas. We compared this information to self-identified ethnicity and family history data. This comparison revealed high concordance in major ancestry between self-identified ethnicity and AIMs analysis in African Americans and European Americans, in agreement with other studies [10, 29–31]. However, self-identified ethnicity and family history data could not predict the degree of admixture, which may affect allele frequencies.
This study reiterates the complexity of the ‘Hispanic/Latino’ term. Our results are compatible with studies indicating a relatively high European contribution (>50%) in subgroups of this population [5, 7–9, 12–14]. The study emphasizes the importance of AIMs data in genetic studies of HA, since self-identified ethnicity and family history may not reveal the complex ancestry contributions of this group. Special scrutiny is required in case–control association studies in this population, and AIMs data should be used to correct for potential population stratification.
The major Hispanic subgroup in this study was Puerto Rican, a population that currently represents approximately 1.5% of the US population. The pattern of ancestral proportions may have clinical significance for specific diseases when a specific ancestry has a protective effect based on alleles with higher frequency in this population. For example, a recent study of end-stage kidney disease in Hispanics from New York City reported an approximately 30% African contribution and a very small Native American contribution, emphasizing the difference between ‘Mexican Hispanics’ and ‘Caribbean Hispanics.’ The sample in the current study was collected in Las Vegas and New York City and has a small, unrepresentative proportion of Hispanics of Mexican origin; conversely, studies with a mix of HA from the East Coast, the Southwest, and the West Coast of the USA are expected to show an even greater level of admixture.
This study confirmed the finding of other studies showing a highly variable proportion of European ancestry in self-identified AAs (7–21%) [9, 22, 23, 30, 32, 33]. This diversity can be explained in part by the historical ‘one-drop rule’ (which classified individuals with any level of African ancestry as ‘African Americans’). It is clear that for the AA population, self-identified ethnicity is not sufficient to estimate the admixture level, and a random AA sample may differ in admixture level from another sample to an extent that will affect allele frequencies. The average of 7% European admixture in this sample is compatible with some studies [22–24, 33] but is lower than in other studies [30, 32, 34, 35]. This difference may be explained in part by the different numbers of defined clusters used in the different studies. Our STRUCTURE analysis was based on seven clusters, and the European cluster obtained in studies based on a small number of clusters is most probably represented in our study by two clusters: Europe and Middle East. These clusters were found to be relatively close (population differentiation index Fst = 0.005), and the Middle East cluster was shown to form a gradient across Europe. Including the Middle East cluster in the total European contribution would result in a 13% contribution, which is closer to the estimates of other studies. The difference between our estimates of ancestral proportions and those of other studies may also reflect recruitment from different US regions and the use of a different set of AIMs.
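The effect of cluster definition on the reported European contribution can be illustrated with a minimal sketch. Only the 7% European and 13% combined figures come from the text above; the cluster names, the remaining proportions, and the helper `merge_clusters` are hypothetical placeholders chosen for illustration and are not STRUCTURE output from this study.

```python
# Illustrative mean cluster proportions for one sample (they sum to 1).
# Only "europe" (0.07) and the combined value (0.13) are taken from the text;
# every other value and all cluster names are assumptions.
proportions = {
    "africa": 0.84,
    "europe": 0.07,
    "middle_east": 0.06,
    "central_asia": 0.01,
    "far_east_asia": 0.01,
    "oceania": 0.005,
    "america": 0.005,
}


def merge_clusters(props, members, merged_name):
    """Return a new dict in which the clusters in `members` are summed into one."""
    merged = {name: value for name, value in props.items() if name not in members}
    merged[merged_name] = sum(props[name] for name in members)
    return merged


# Collapse Europe and Middle East, as a study with fewer clusters might do.
merged = merge_clusters(proportions, ["europe", "middle_east"], "europe_broad")
print(round(merged["europe_broad"], 2))
```

Merging leaves the total unchanged and only reassigns membership, which is why estimates from studies with different numbers of clusters are not directly comparable.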
Our finding of a very low Native American contribution in this AA sample, based on AIMs analysis, is compatible with other studies [9, 22, 32] and may represent a conflict with some of the reported family history. It is also possible that this sample does not represent other AA groups in the USA. It most likely does not reflect a limitation in the AIMs set or the analysis since this cluster was clearly detected in HA.
In this study, we have shown the effect of admixture on allele frequencies of two SNPs in the mu opioid receptor gene, OPRM1 (118A > G and 17C > T). The significantly higher frequency of the 118 G allele in this small random HA sample compared with the EA sample probably reflects the contribution of Asian and Native American ancestries. Similarly, the significantly higher frequency of the 17 T allele in the HA sample compared with the EA sample most probably reflects the contribution of African ancestry. This study reiterates the importance of AIMs in defining ancestry, especially in admixed populations, and emphasizes the concept that ‘Hispanic American’ is not a valid category in genetic research.
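The reasoning above rests on a simple mixing model: under admixture, the expected allele frequency in a sample is approximately the ancestry-weighted average of the frequencies in the contributing populations. The sketch below illustrates that calculation only. All ancestry proportions and allele frequencies in it are hypothetical values chosen for illustration, not estimates from this study, and the function name `expected_allele_frequency` is an assumption.

```python
def expected_allele_frequency(ancestry_props, allele_freqs):
    """Ancestry-weighted average allele frequency under a simple mixing model."""
    if abs(sum(ancestry_props.values()) - 1.0) > 1e-9:
        raise ValueError("ancestry proportions must sum to 1")
    return sum(ancestry_props[pop] * allele_freqs[pop] for pop in ancestry_props)


# Hypothetical frequencies of one allele in four ancestral populations.
allele_freqs = {
    "european": 0.10,
    "african": 0.02,
    "native_american": 0.30,
    "asian": 0.40,
}

# Hypothetical mean ancestry proportions of an admixed sample.
admixed_sample = {
    "european": 0.55,
    "african": 0.15,
    "native_american": 0.25,
    "asian": 0.05,
}

expected = expected_allele_frequency(admixed_sample, allele_freqs)
# With these placeholder values, the admixed sample is expected to carry the
# allele more often than a sample of purely European ancestry.
print(expected > allele_freqs["european"])
```

Under this model, two samples with the same self-identified label but different ancestry proportions would be expected to show different allele frequencies, which is the concern raised for case–control comparisons.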
This study provides support for the robustness of this set of AIMs, as our results corroborate the results of other studies using this set [18, 22–24]. The study demonstrates that computation of ethnic factor scores ‘anchored’ against worldwide genetic diversity (CEPH reference populations) yields a stable factor structure, allows comparisons between different datasets, and may permit combining data from different studies. This set of AIMs is especially useful in situations where large-scale genotyping is not available.
There are several limitations to this study. First, our sample does not represent the general US population, as it was derived from only two main locales (Las Vegas and New York City), with an unrepresentatively low proportion of Mexican Americans. Second, the specific AIMs used in this study were selected based on HapMap data (release #16c.1, 2005) and as such are limited to allele frequency data from small samples of the three main original HapMap populations (Northern and Western Europe (CEPH), Nigeria (Yoruba), and Han Chinese (Beijing)) and may not be suited for analysis of certain populations.
Although research of this kind is very promising, great care must be taken to avoid misleading interpretations. Genetic ancestry estimates could help dismiss the concept of race, but they may also support the notion of distinct human biological subgroups, which could increase stigmatization and discrimination [38, 39]. There is growing evidence that major health differences between populations involve gene-environment interaction and, as such, understanding them will require not only genetic tools but also social and cultural information.