Ancestry-informative markers on chromosomes 2, 8 and 15 are associated with insulin-related traits in a racially diverse sample of children

Type 2 diabetes represents an increasing health burden. Its prevalence is rising among younger age groups and differs among racial/ethnic groups. Little is known about its genetic basis, including whether there is a genetic basis for racial/ethnic disparities. We examined a multi-ethnic sample of 253 healthy children to evaluate associations between insulin-related phenotypes and 142 ancestry-informative markers (AIMs), while adjusting for sex, age, Tanner stage, genetic admixture, total body fat, height and socio-economic status. We also evaluated the effect of measurement errors in the estimation of the individual ancestry proportions on the regression results. We found that European genetic admixture is positively associated with insulin sensitivity (SI), and negatively associated with the acute insulin response to glucose, fasting insulin levels and the homeostasis model assessment of insulin resistance. Our analysis revealed associations between individual AIMs on chromosomes 2, 8 and 15 and these phenotypes. Most notably, marker rs3287 at chromosome 2p21 was found to be associated with SI (p = 5.8 × 10⁻⁵). This marker may be in admixture linkage disequilibrium with nearby loci (THADA and BCL11A) that previously have been reported to be associated with diabetes and diabetes-related phenotypes in several genome-wide association and linkage studies. Our results provide further evidence that variation in the 2p21 region containing THADA and BCL11A is associated with type 2 diabetes. Importantly, we have implicated this region in the early development of diabetes-related phenotypes, and in the genetic aetiology of population differences in these phenotypes.


Introduction
Type 2 diabetes prevalence in the paediatric population is increasing, while age at onset is decreasing. 1,2 Type 2 diabetes also disproportionately affects racial/ethnic minorities in the USA. 3 Twin and familial studies have shown a substantial genetic component to the disease, as well as its related phenotypes. 4-10 Although much is known about environmental contributions to type 2 diabetes, only a very small proportion of the variation due to genetic factors is currently explainable by identified genetic polymorphisms. 11-13 Similarly, little is known about the specific genetic factors that may contribute to population differences in diabetes prevalence. Because the origins of type 2 diabetes are likely to be rooted in childhood, a better understanding of genetic determinants among paediatric populations could lead to a better insight into the aetiology of type 2 diabetes and eventually improved prediction and prevention of the disease.
Endo-phenotypes can be useful in closely dissecting the genetic basis of eventual disease status. 14 For type 2 diabetes, several such measurable phenotypes exist, typically examining measures of glucose and insulin homeostasis. These measures serve as indicators of reduced insulin response and action that may presage type 2 diabetes. 15,16 Furthermore, previous studies have suggested a genetic basis for racial/ethnic differences in insulin dynamics. 17,18 Examining the genetic basis for these detailed phenotypes therefore allows for a much better understanding of the link between the genetic and metabolic pathways that underlie the development of type 2 diabetes.
Since the loci recently identified by genome-wide association studies (GWAS) 19-21 were found predominantly among individuals of European descent, there is considerable uncertainty as to whether these associations translate to other populations. Hispanic Americans (HAs) and African Americans (AAs) suffer from higher rates of type 2 diabetes than European Americans (EAs), 22,23 and similar differences exist for endo-phenotypes among children, regardless of disease status. 24-28 Admixed populations, such as HAs, AAs and EAs, can be examined to determine if there is a genetic basis for these population differences and to identify specific genetic regions associated with both ancestry and insulin-related outcomes. 29 Previous studies have shown that admixture is a strong predictor of diabetes and insulin-related traits but also that this relationship may be mediated through environmental factors such as income or educational level that are correlated with genetic admixture. 17,30-32 Genetic mapping methods such as admixture mapping capitalise on population differences in a trait and the extended blocks of linkage disequilibrium (LD) created after the admixture process. 33 They are, therefore, most applicable to recently admixed populations such as HAs and AAs. To our knowledge, only one previous study has used this type of method to localise genetic variants associated with type 2 diabetes. 34 Given the large differences in type 2 diabetes and the plausible genetic origins of these differences, there is a great need for additional studies that attempt to pinpoint population-specific genetic risk factors.
In this study, we also capitalise on population differences in a phenotype and the extended LD blocks resulting from admixture in an attempt to identify specific genetic regions that might be involved in the aetiology of population differences in insulin-related traits. Specifically, we examine the association between genetic admixture and four insulin-related outcomes: insulin sensitivity (SI), acute insulin response to glucose (AIRg), fasting insulin (FI) levels and the homeostasis model assessment of insulin resistance (HOMA-IR). We then examine the association between 142 ancestry-informative markers (AIMs) and each of these traits to identify genetic regions that potentially account for ethnic/racial differences. We performed this study among a multi-ethnic sample of children, adjusting for known influential covariates, including genetic admixture as a control for genetic background, and socio-economic status. We place our findings in the context of other studies that have found associations in similar regions.

Study participants
A total of 253 children between the ages of seven and 12 (52 per cent male) were recruited as part of a cross-sectional cohort study examining population differences in metabolic phenotypes among children with no major illnesses or medical diagnoses. The children were classified by parental report as AA (n = 87), EA (n = 108), HA (n = 52) and bi-racial (n = 6). All children were pubertal stage 3, as assessed by a paediatrician according to the criteria of Marshall and Tanner. 35 Written informed assent and consent were obtained from children and parents, respectively, as approved by the University of Alabama at Birmingham Institutional Review Board. All measurements were taken between 2004 and 2008 at the University of Alabama at Birmingham General Clinical Research Center (GCRC) and Department of Nutrition Sciences.

Anthropometric measurements
In the first of two sessions completed by participants, pubertal status, anthropometric measurements and body composition were assessed. Height was measured without shoes to the nearest centimetre using a stadiometer (Heightronic 235; Measurement Concepts, Snoqualmie, WA, USA). Body composition was assessed by dual-energy X-ray absorptiometry (DXA) using a GE Lunar Prodigy densitometer (GE LUNAR Radiation Corp., Madison, WI, USA), as previously described. 36 Participants were measured while lying flat on their backs with arms at their sides, wearing light clothing. Analysis of DXA scans was performed using paediatric software (Encore 2002, version 6.10.029).

Insulin-related measurements
At the second visit (which took place within 30 days of the first visit), participants were admitted to the GCRC for an overnight visit. Following an overnight fast, blood samples were obtained to establish the basal levels of glucose and insulin, and a frequently sampled intravenous glucose tolerance test (FSIGTT) was performed as described elsewhere. 37-39 SI (the increase in fractional glucose disappearance per unit of insulin increase) and AIRg (the area above baseline insulin concentration during ten minutes following exposure to glucose) were estimated from the FSIGTT using minimal modelling. 40 HOMA-IR, a surrogate measure of insulin resistance, was also calculated as fasting glucose (mg/dl) × fasting insulin (μU/ml)/405. 41
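As a minimal illustration of the HOMA-IR formula given above, the calculation can be sketched in Python. The function name and the example values are hypothetical and are not study data; the sketch assumes insulin is expressed in μU/ml, the unit conventionally paired with the constant 405.

```python
def homa_ir(fasting_glucose_mg_dl: float, fasting_insulin_uU_ml: float) -> float:
    """HOMA-IR as defined in the text: glucose (mg/dl) x insulin (uU/ml) / 405.

    The unit of insulin (uU/ml) is an assumption based on the usual
    convention for the constant 405.
    """
    if fasting_glucose_mg_dl <= 0 or fasting_insulin_uU_ml <= 0:
        raise ValueError("glucose and insulin must be positive")
    return fasting_glucose_mg_dl * fasting_insulin_uU_ml / 405.0


# Hypothetical values for illustration only (not taken from the study).
example = homa_ir(90.0, 9.0)
print(example)  # 90 * 9 / 405 = 2.0
```

Higher values indicate greater insulin resistance, which is why HOMA-IR moves in the opposite direction to SI in the results.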

Socio-economic status (SES)
SES was measured using the Hollingshead 4-factor index of social class, which combines information on the education and occupational prestige of parents. 42 Scores range from 8 to 66, with higher scores representing higher status.

Genetic analysis
DNA from blood was obtained from all study participants and was typed at 142 AIMs by Prevention Genetics (Marshfield, WI, USA). Each marker (single nucleotide polymorphism [SNP]) was genotyped using a fluorescent allele-specific polymerase chain reaction (AS-PCR)-based assay. 43 Reaction components were assembled on an array tape platform (http://www.douglasscientific.com) using nanolitre volumes (500-1000 nl). PCRs were carried out in a water bath thermocycler using a standard three-stage parameter (denaturation, primer annealing, primer extension). The specific parameters of each PCR vary depending on the nature of the primers and the SNP being genotyped. The array tape was scanned post-PCR and the ratio of fluorescent signals was used to determine the genotype (homozygous for one allele, or heterozygous). A subset of these AIMs is described elsewhere. 44 These markers were chosen because they exhibit large frequency differences between ancestral West African, Amerindian and West European populations. Individual West African, Amerindian and European genetic admixture estimates were obtained by maximum likelihood estimation, 45 using the genotypes at each AIM and an estimate of the allele frequencies of these AIMs in the three ancestral parental populations (see Table S1).
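To make the idea of maximum likelihood admixture estimation concrete, the following Python sketch estimates ancestry proportions for a single individual from AIM genotypes and fixed parental allele frequencies. It is an illustration under stated assumptions, not the cited method: it assumes that each allele copy is drawn independently from a mixture of the parental populations and maximises that likelihood with an expectation-maximisation (EM) loop. The function name, arguments and example frequencies are all hypothetical.

```python
def estimate_admixture(genotypes, parental_freqs, n_iter=200):
    """Maximum likelihood ancestry proportions for one individual via EM.

    genotypes: counts (0, 1 or 2) of the reference allele at each marker;
               None marks a missing genotype.
    parental_freqs: one list per parental population giving the reference
               allele frequency at each marker.
    Returns one proportion per parental population; they sum to 1.
    Assumes allele copies are independent given ancestry (an assumption
    of this sketch, not a statement about the cited method).
    """
    n_pops = len(parental_freqs)
    q = [1.0 / n_pops] * n_pops  # start from equal ancestry
    eps = 1e-6  # keep frequencies strictly between 0 and 1
    for _ in range(n_iter):
        totals = [0.0] * n_pops
        n_alleles = 0
        for j, g in enumerate(genotypes):
            if g is None:
                continue
            # g copies of the reference allele, 2 - g copies of the other
            for is_ref, copies in ((True, g), (False, 2 - g)):
                if copies == 0:
                    continue
                weights = []
                for k in range(n_pops):
                    p = min(max(parental_freqs[k][j], eps), 1.0 - eps)
                    weights.append(q[k] * (p if is_ref else 1.0 - p))
                norm = sum(weights)
                # E-step: expected number of these copies from each population
                for k in range(n_pops):
                    totals[k] += copies * weights[k] / norm
                n_alleles += copies
        if n_alleles == 0:
            return q  # no usable genotypes; keep the starting values
        # M-step: proportions are the expected share of allele copies
        q = [t / n_alleles for t in totals]
    return q


# Hypothetical frequencies for three parental populations at four markers.
freqs = [
    [0.90, 0.80, 0.95, 0.85],  # population 0
    [0.10, 0.20, 0.05, 0.10],  # population 1
    [0.10, 0.15, 0.10, 0.20],  # population 2
]
q_hat = estimate_admixture([2, 2, 2, 2], freqs)
print([round(v, 3) for v in q_hat])
```

With only a handful of markers such estimates are noisy; markers with large frequency differences between parental populations carry most of the information, which is the reason for selecting AIMs.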

Statistical analyses
Differences in mean values for phenotypes between racial/ethnic groups were examined using analysis of variance. Multiple linear regression analyses were used to test the association between European admixture and total fat and the four insulin-related phenotypes, and to examine the association between each of 142 SNPs and four insulin-related phenotypes. For SI, FI and HOMA-IR, the model was defined by age, Tanner stage, sex, SES, European admixture, Amerindian admixture, total fat and height. By controlling for two of three admixture estimates, we prevented the introduction of collinearity in the statistical models, since the three admixture estimates add up to 1. For AIRg, the model was additionally adjusted for SI. To conform to the assumptions of regression, all models were evaluated for residual normality; logarithmic transformation was performed when appropriate. Outliers were removed based on whether residuals were greater than three standard deviations away from the mean.
Genotyped SNPs were tested for association with the four insulin-related phenotypes using linear regression under additive, dominant, recessive and two-degrees-of-freedom genotypic models. Considering each phenotype and each genetic model separately, we applied a Bonferroni correction for multiple testing to the marker association tests; a p-value cut-off of 3.6 × 10⁻⁴ keeps the nominal type I error rate at 0.05. To determine the extent to which measurement error in admixture estimates could skew the results, we applied the method described by Divers et al. 46 Briefly, we obtained an estimate of the measurement error covariance and applied the simulation extrapolation (SimEx) algorithm 47 to retest for association between each marker and phenotype under each mode-of-inheritance model. Analyses were carried out with PLINK, 48 SAS 9.1 software (SAS Institute, Cary, NC, USA) and R. 49

Results
Table 1 shows the descriptive characteristics of the sample. Differences in total fat, AIRg, FI and HOMA-IR were statistically significant between racial/ethnic groups (all at p < 0.01). HA had higher total fat, FI and HOMA-IR than both EA and AA. AA had higher AIRg, and lower SI than other groups. EA had the highest SI values (p < 0.0001).
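The covariate model described above can be written out as follows. This is a sketch of one plausible reading of the text: the coefficient symbols, the variable abbreviations (EUR, AMR and AFR for European, Amerindian and West African admixture) and the transformation f are notational assumptions introduced here, not the authors' notation.

```latex
\begin{align}
% Model for S_I, FI and HOMA-IR; f is the identity or a logarithm, as appropriate
f(Y_i) &= \beta_0 + \beta_1\,\mathrm{Age}_i + \beta_2\,\mathrm{Tanner}_i
          + \beta_3\,\mathrm{Sex}_i + \beta_4\,\mathrm{SES}_i \nonumber \\
       &\quad + \beta_5\,\mathrm{EUR}_i + \beta_6\,\mathrm{AMR}_i
          + \beta_7\,\mathrm{Fat}_i + \beta_8\,\mathrm{Height}_i + \varepsilon_i \\
% The three ancestry proportions sum to one, so only two can enter the model
\mathrm{EUR}_i + \mathrm{AMR}_i + \mathrm{AFR}_i &= 1
\end{align}
```

Because the three proportions sum to 1, including all of them alongside an intercept would make the design matrix rank deficient; omitting the West African proportion makes it the implicit reference. For AIRg, SI would enter as one further covariate, and for the single-marker tests a genotype term would be added to the same model.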

Association between genetic admixture and insulin-related phenotypes
The associations between European genetic admixture and total fat and all insulin-related phenotypes were statistically significant (p < 0.05; Table 2). Individuals with higher European admixture had less total fat, higher SI and lower AIRg, FI, and HOMA-IR. In these models, there were significant associations of Tanner stage and total fat with SI, FI and HOMA-IR (p < 0.01). Total fat was also associated with AIRg (p = 0.04). Upon analysing these associations within racial/ethnic groups, we found significant associations of European genetic admixture only among HA for SI, FI and AIRg (p < 0.05). These results suggest that specific genetic variants may exist, contributing to population differences for these phenotypes.

Association between single markers and insulin-related phenotypes
The results of single SNP analyses are presented in Table 3. Marker rs3287, located at 2p21, is significantly associated with SI under the recessive (p = 5.8 × 10⁻⁵) and genotypic (p = 1.9 × 10⁻⁴) models. Allele G at this SNP is associated with decreased SI, and is at a higher frequency in the West African parental population (0.75) compared with both the European (0.18) and Amerindian (0.18) parental populations. In our sample, the frequency of the G allele is 0.58 among AAs, 0.25 among EAs and 0.21 among HAs. We found no associations that withstand Bonferroni correction (p < 3.6 × 10⁻⁴) within each racial/ethnic group. Marker rs1373302, located at 8q13, is significantly associated with AIRg under the dominant model (p = 9.7 × 10⁻⁵). Allele T at this marker is associated with increased AIRg and is at a higher frequency among the West African (0.65) and European (0.73) parental populations than among the Amerindian parental population (0.08). We found no significant associations within racial/ethnic groups for this marker after Bonferroni correction. Marker rs2671110, located at 7q32.3, is significantly associated with AIRg among AAs under the recessive model (p = 1.4 × 10⁻⁴). Allele A at this marker is associated with increased AIRg and is at a higher frequency among the West African parental population (0.94) than among the Amerindian (0.0) and European (0.13) parental populations.
Marker rs12439722 is significantly associated with FI and HOMA-IR under the recessive and genotypic models (p = 2.8 × 10⁻⁴ and 1.4 × 10⁻⁴, respectively). Allele A at this marker is associated with increased FI, and is at a higher frequency in the West African (1.0) and European (0.97) parental populations than in the Amerindian (0.17) parental population.
The application of the measurement error correction methods did not yield results that were significantly different to those we observed with the naïve analyses. For example, we observed a p-value of 7.68 × 10⁻⁵ for the association between rs3287 and SI under the recessive model after accounting for the measurement error versus 5.9 × 10⁻⁵ without the adjustment. This result confirms that individual ancestry proportions are very well measured. Therefore, with 142 AIMs, we can be confident that our results are not driven by measurement errors in the estimation of individual ancestry.
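The additive, dominant, recessive and genotypic models referred to in these results differ only in how a genotype is turned into regression predictors. The Python sketch below shows one common coding; the function name is hypothetical and the coding is the usual textbook convention, assumed here rather than taken from the authors' analysis scripts.

```python
def code_genotype(n_alleles: int, model: str):
    """Regression predictor(s) for a genotype carrying 0, 1 or 2 copies
    of the tested allele, under a given mode-of-inheritance model."""
    if n_alleles not in (0, 1, 2):
        raise ValueError("genotype must be 0, 1 or 2 copies of the allele")
    if model == "additive":
        # effect grows linearly with the number of copies
        return [float(n_alleles)]
    if model == "dominant":
        # one copy is enough to show the effect
        return [1.0 if n_alleles >= 1 else 0.0]
    if model == "recessive":
        # two copies are required to show the effect
        return [1.0 if n_alleles == 2 else 0.0]
    if model == "genotypic":
        # two indicator variables, hence two degrees of freedom
        return [1.0 if n_alleles == 1 else 0.0,
                1.0 if n_alleles == 2 else 0.0]
    raise ValueError(f"unknown model: {model}")


for model in ("additive", "dominant", "recessive", "genotypic"):
    print(model, [code_genotype(g, model) for g in (0, 1, 2)])
```

Under the recessive coding, for example, heterozygotes are grouped with non-carriers, so a recessive association contrasts individuals with two copies of the allele against everyone else.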
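The logic of simulation extrapolation can be illustrated with a deliberately simplified Python sketch. It is not the method of Divers et al.: it assumes a single error-prone predictor with a known measurement error standard deviation, uses simulated data, and extrapolates with a quadratic through only three noise levels. All names and settings are hypothetical.

```python
import random


def ols_slope(x, y):
    """Slope of the least-squares line of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def quadratic_extrapolate(lams, vals, target=-1.0):
    """Evaluate at `target` the quadratic passing through three points."""
    total = 0.0
    for i in range(3):
        term = vals[i]
        for j in range(3):
            if j != i:
                term *= (target - lams[j]) / (lams[i] - lams[j])
        total += term
    return total


def simex_slope(w, y, sigma_u, lams=(0.0, 1.0, 2.0), n_rep=50, rng=None):
    """SimEx estimate of the slope of y on the error-free predictor.

    w is the predictor observed with error of standard deviation sigma_u.
    For each lambda, extra noise with variance lambda * sigma_u**2 is added
    and the naive slope re-estimated; the trend is extrapolated to -1.
    """
    rng = rng or random.Random(0)
    means = []
    for lam in lams:
        if lam == 0.0:
            means.append(ols_slope(w, y))  # the naive estimate
            continue
        sd = (lam ** 0.5) * sigma_u
        reps = []
        for _ in range(n_rep):
            w_noisy = [wi + rng.gauss(0.0, sd) for wi in w]
            reps.append(ols_slope(w_noisy, y))
        means.append(sum(reps) / n_rep)
    return quadratic_extrapolate(lams, means)


# Simulated data: true slope 1, measurement error as large as the signal.
rng = random.Random(1)
n = 2000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]
w = [xi + rng.gauss(0.0, 1.0) for xi in x]  # predictor measured with error
y = [xi + rng.gauss(0.0, 0.5) for xi in x]
naive = ols_slope(w, y)
corrected = simex_slope(w, y, sigma_u=1.0)
print(naive, corrected)
```

In this simulated setting the naive slope is attenuated towards zero (about half the true slope in expectation), and the SimEx estimate moves back towards the true value, although a quadratic extrapolant typically does not remove all of the bias. When the measurement error is small relative to the spread of the predictor, the naive and corrected estimates are expected to be close, which is the pattern reported above for the admixture estimates.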

Discussion
We sought to examine the potential genetic basis for population differences in insulin-related phenotypes in a racially/ethnically diverse sample of children. We found that European genetic admixture is associated with insulin-related phenotypes. Next, we determined whether any of the individual 142 AIMs scattered throughout the genome were associated with any of the insulin-related phenotypes. We found a strong association between SI and an AIM at chromosome 2p21 (rs3287), explaining 4.14 per cent of the variance of the trait. Although this effect size may appear large compared with other genetic association studies, our use of refined phenotypes, the inclusion of many covariates and the use of admixed individuals is likely to have increased our ability to detect an effect size of this magnitude. We also found weaker, but statistically significant associations between AIRg and an AIM at chromosome 8q13 (rs1373302) located in the transient receptor potential cation channel, subfamily A, member 1 (TRPA1) gene, and between FI and HOMA-IR and an AIM at chromosome 15q22 (rs12439722) in the gene for the hect (homologous to the E6-AP[UBE3A] carboxyl terminus) domain and RCC1 (CHC1)-like domain (RLD) 1 (HERC1). It should be noted that, although we have used a multiple-testing correction for the 142 markers tested, we have not corrected for each of the genetic models tested. If we were to use a Bonferroni correction for all markers and models tested, the p-value threshold would be 8.8 × 10⁻⁵. In this case, only the association between rs3287 and SI (p = 5.8 × 10⁻⁵) would be considered statistically significant. The four genetic models are likely to be correlated, however, making such a correction overly conservative.
Our finding that insulin-related phenotypes are associated with European admixture is in agreement with previous findings. 17 European admixture is positively associated with favourable insulin-related phenotypes (higher SI and lower AIRg, FI and HOMA-IR). When we examine the associations with European admixture within racial/ethnic groups, they are statistically significant only among HAs. It is difficult to interpret these results, however, because of the different sample sizes and different admixture proportion distributions by racial/ethnic group. Among the other covariates examined, we find that total body fat and Tanner stage are the strongest risk factors associated with these insulin-related traits. These results are consistent with those of other studies that have shown that adiposity is a major risk factor for these traits. 50,51 Insulin-related traits have also previously been found to be associated with pubertal stage. 52,53
The 2p21 chromosomal region previously has been identified as being associated with type 2 diabetes and related traits via both linkage scans and GWAS. Marker rs3287 is located between two loci, thyroid adenoma associated (THADA) and B-cell chronic lymphocytic leukaemia/lymphoma 11A (BCL11A) at 2p21. These loci have been identified previously in two recent GWAS meta-analyses of type 2 diabetes. 21,54 This region was also identified in linkage scans for insulin- and diabetes-related traits. 55,56 It is plausible that, through their effects on cell apoptosis 57 and/or nutrient transport, 58 these loci may be associated with the progression of type 2 diabetes and/or that different pathways may be involved across populations.
Given that the occurrence of the rs3287 risk allele is higher in the West African parental population than in the European and Amerindian parental populations, and that AAs tend to have lower SI, this or another nearby variant that is in admixture LD may explain part of the observed differences in type 2 diabetes susceptibility between AAs and EAs. The markers that we have found to be associated with FI and HOMA-IR are in a region on chromosome 15 that previously has been found to be associated with insulin-related traits in a linkage scan. 59 Unlike other association studies, we have identified these associations relatively early in the lifespan. It could be that the children with unfavourable insulin-related phenotypes are already on the path towards developing type 2 diabetes. In the long term, these markers could inform prediction and treatment strategies for early-onset type 2 diabetes and explain population differences. The fact that none of the AIMs showed any significant association with any of the insulin-related phenotypes when performing analyses within racial/ethnic groups may be related to reduced power due to a smaller sample size. Furthermore, our ability to find significant associations by race is strongly influenced by the frequency of the variants and the fact that AIM alleles tend to have a low frequency in one group and a high frequency in another group. By using a multi-ethnic approach, we have the advantage of having more intermediate allele frequencies represented, thus increasing the power to detect associations.
This study had several strengths. First, the use of several endo-phenotypes that are likely to be proximal to the development of type 2 diabetes may have pinpointed more effectively the genetic factors that eventually lead to disease phenotypes. Secondly, the inclusion of individuals from different racial/ethnic backgrounds, and the use of markers that differ in frequency between populations, can lead to a better understanding of the genetic basis for population differences in insulin-related phenotypes and the prevalence of diabetes. Thirdly, the inclusion of environmental and phenotypic measurements enhances the ability to pinpoint the genetic regions that directly influence the disease-causing phenotype.
The study also had some limitations. A relatively small number of genetic markers were used, reducing our ability to provide a high level of resolution with regard to the precise location of potential risk variants. The main weakness of the study was the small sample size, which raises concerns about the statistical power of the study. Of all the reported associations, the one between marker rs3287 and SI will require the greatest level of statistical power in order to reject the null hypothesis. A power calculation for this association model reveals that at a p-value of 6 × 10⁻⁵, our data provide 67 per cent power to detect the R-squared effect of 0.45 that we obtained for the full model, with a semi-partial R-squared for the marker of 0.04. Although this level of power might not seem sufficient, the concerns of this association being a type I error are dissipated by the fact that it represents a form of replication of several previously reported findings at chromosome 2p21. Evidently, this level of detection with a small sample size was aided by the use of precise phenotyping, the consideration of physiological parameters and the inclusion of admixture estimates, as previously discussed.
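The stricter threshold mentioned above follows from simple arithmetic, sketched here in Python. The function name is hypothetical; the p-values in the dictionary are those reported in the Results for the whole-sample associations. Dividing 0.05 by 142 gives roughly 3.5 × 10⁻⁴, slightly below the per-model cut-off of 3.6 × 10⁻⁴ quoted in the text, so the exact number of tests used by the authors may differ from the assumption made here.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that keeps the family-wise
    type I error rate at `alpha` across `n_tests` tests."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


n_markers = 142
n_models = 4  # additive, dominant, recessive, genotypic

per_model = bonferroni_threshold(0.05, n_markers)
all_tests = bonferroni_threshold(0.05, n_markers * n_models)
print(f"{per_model:.1e}")  # 3.5e-04
print(f"{all_tests:.1e}")  # 8.8e-05

# Smallest p-value reported in the text for each marker (whole sample).
reported = {"rs3287": 5.8e-5, "rs1373302": 9.7e-5, "rs12439722": 1.4e-4}
survivors = [snp for snp, p in reported.items() if p < all_tests]
print(survivors)  # ['rs3287']
```

The Bonferroni correction treats all tests as independent, which is why correlated genetic models make the stricter threshold conservative.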
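For readers who wish to relate the two R-squared values quoted in the power calculation, one common convention (assumed here; the authors do not state which effect size measure they used) is Cohen's f² for a single predictor added to a multiple regression model:

```latex
\begin{equation}
f^2 = \frac{sr^2}{1 - R^2_{\text{full}}} = \frac{0.04}{1 - 0.45} \approx 0.073
\end{equation}
```

Here sr² denotes the semi-partial R-squared for the marker and R²_full the R-squared of the full model; both symbols are notation introduced for this illustration.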
In conclusion, we have shown that regions on chromosomes 2, 8 and 15 were associated with insulin-related traits in this sample. These results suggest that these regions may harbour causal variants that may also explain population differences in the insulin-related phenotypes and, ultimately, type 2 diabetes prevalence, since the markers tested exhibit large frequency differences between groups. Future studies must combine detailed phenotypic, environmental and genetic measures on similarly diverse but larger sample sizes. The inclusion of different populations is of paramount importance if we are to understand the genetic basis for population differences and fairly implement effective prevention, intervention and treatment strategies.