Analysis of pharmacogenetic traits in two distinct South African populations

Our knowledge of pharmacogenetic variability in diverse populations is scarce, especially in sub-Saharan Africa. To bridge this gap in knowledge, we characterised population frequencies of clinically relevant pharmacogenetic traits in two distinct South African population groups. We genotyped 211 tagging single nucleotide polymorphisms (tagSNPs) in 12 genes that influence antiretroviral drug disposition, in 176 South African individuals belonging to two distinct population groups residing in the Western Cape: the Xhosa (n = 109) and Cape Mixed Ancestry (CMA) (n = 67) groups. The minor allele frequencies (MAFs) of eight tagSNPs in six genes (those encoding the ATP binding cassette sub-family B, member 1 [ABCB1], four members of the cytochrome P450 family [CYP2A7P1, CYP2C18, CYP3A4, CYP3A5] and UDP-glucuronosyltransferase 1 [UGT1A1]) were significantly different between the Xhosa and CMA populations (Bonferroni p < 0.05). Twenty-seven haplotypes were inferred in four genes (CYP2C18, CYP3A4, the gene encoding solute carrier family 22 member 6 [SLC22A6] and UGT1A1) between the two South African populations. Characterising the Xhosa and CMA population frequencies of variant alleles important for drug transport and metabolism can help to establish the clinical relevance of pharmacogenetic testing in these populations.


Introduction
The field of pharmacogenomics aims to utilise the genetic composition of an individual to personalise therapeutic regimens and improve treatment outcomes. Most of the initial examples of the clinical utility of pharmacogenomics were elucidated for cancer treatments. Currently, however, there are more than 15 drugs used in the treatment of a variety of chronic diseases, such as cardiovascular disease, HIV/AIDS and seizures, for which the US Food and Drug Administration (FDA) recommends or requires pharmacogenomic testing to prevent drug-related toxicity or improve drug efficacy. 1 The increase in the number and breadth of drugs for which pharmacogenetic tests are recommended or required by the FDA is an indication of the important role that genetics plays in predicting treatment outcomes.
In order for pharmacogenetic testing to have the most impact in as many people as possible, it is important to understand which genetic variants are predictive of treatment outcomes in diverse populations. Most pharmacogenetics studies to date have been conducted in a limited number of population groups, most frequently in Western European and North American Caucasians. As a result of these limitations, genotype-to-phenotype correlates of drug response or toxicity for a number of drugs are clinically applicable in relatively few treated individuals. Furthermore, pharmacogenetic profiles characterised in Caucasians are often extrapolated for use and interpretation in other populations, in spite of at least two major problems with this method. First, it is clear that the population frequency of variants can differ markedly between populations, for example between Caucasians and other groups. These differences in the population frequencies of variant alleles have an impact on the clinical utility of pharmacogenetic testing, which is of greater use in populations with a higher frequency of the variant allele than in populations in which the variant allele is rare. Secondly, ethnically specific variants exist in non-Caucasian populations which are more predictive of treatment outcomes than those identified in Caucasians. For example, although the UGT1A1*28 polymorphism is predictive of toxicity to the anticancer drug irinotecan in Caucasians, the UGT1A1*6 polymorphism is more predictive of irinotecan toxicity in Asians. 2 Such ethnically specific variation is not currently taken into account in most commercially available pharmacogenetic tests or on FDA drug labels.
African populations are among the most genetically diverse in the world. 3 In spite of this diversity, very few pharmacogenetics studies have been conducted in African populations, even though inter-ethnic variability in pharmacogenetic traits between African populations is documented. 4 Although the International HapMap Project has included three African populations, the Yoruba of Nigeria and the Maasai and the Luhya of Kenya, the population genetics of these three groups cannot represent the remaining populations in West and East Africa, or other populations living in Southern Africa. In order to achieve the goal of personalised medicine and individualisation of therapy in Africa, it is important to study pharmacogenetic traits carefully and systematically in as many distinct African population groups as possible.
To bridge gaps in pharmacogenetic mapping in African populations, especially those residing in Southern Africa, this study prioritises the genotyping of 211 single nucleotide polymorphisms (SNPs) in 12 genes known to affect drug absorption, transport and metabolism in the Xhosa and Cape Mixed Ancestry (CMA) populations living in the Western Cape, South Africa. These genes are relevant for the pharmacokinetic disposition of a number of medications, including those used for the treatment of HIV infection, which is having a devastating impact in the region. The Xhosa population is indigenous to the Eastern Cape of South Africa, is the second largest ethnic group in South Africa and comprises approximately 17.6 per cent (~8 million) of the South African population. 5 The CMA population is known to have the highest rate of admixture worldwide, including mixes of European, African, South Asian and Indonesian ancestry, and comprises 8.9 per cent (~4 million) of the South African population. 5 In this study, we first characterised and examined differences in the minor allele frequency (MAF) estimates of pharmacogenetic alleles between the Xhosa and CMA populations. Secondly, we characterised haplotypes of the pharmacogenetic genes in both the Xhosa and CMA groups, and, finally, the MAF estimates we obtained for the Xhosa and CMA were compared with the HapMap estimates for other African, US and Asian populations. Taken together, these data should help lay the foundation for future pharmacogenetics studies in other South African populations, as well as the eventual use of pharmacogenetic testing, where clinically relevant, for the South African population.

Study design
Written informed consent for the collection, storage and extraction of genomic DNA was obtained in English, Afrikaans or Xhosa from 176 unrelated HIV-positive South Africans. 6 The DNA belonged to 109 Xhosa and 67 CMA individuals. 7 Ethnicity was determined by self-report. This study was approved by the individual Committees on Human Research at Stellenbosch University, South Africa and the University of California, San Francisco, USA.

Measurements
Two hundred and eleven tagging SNPs (tagSNPs) in 12 genes (Supplementary Table S1) that are important for drug absorption, transport and metabolism of antiretrovirals and other medications were selected using Snagger software. 8 This software takes into account different population frequencies of SNPs, as reported in HapMap, to generate a representative list of SNPs. 8 Because little is known about South African population genetic substructure and polymorphisms, HapMap Phase I, build 36 was used to select tagSNPs informative across all four population samples (ie Caucasian, Yoruba, Japanese and Han Chinese). In this manner, the likelihood of selecting for markers that may be informative in the Xhosa and the CMA populations is increased. Other SNPs were force-included based on their clinical pharmacogenetic relevance, as reported in the literature. All SNPs were included on a custom SNP genotyping array and DNA samples genotyped using the Illumina GoldenGate Assay kit (Illumina, Inc., San Diego, CA, USA).

Statistical analyses
Call rates, MAFs and Hardy-Weinberg disequilibrium test p values were calculated using the R statistical package. Chi-squared tests were used to test for Hardy-Weinberg disequilibrium. When small observed numbers were present for one or more genotype groups, Fisher's exact test was applied. Association analyses were performed using the co-dominant genetic model to report on SNPs with significantly different frequencies between the Xhosa and CMA groups. The significance criterion was set at a Bonferroni-corrected p value < 0.05. In order to improve the quality of the genotype data, the SNP call rate was required to meet or exceed 90 per cent. The MAF of SNPs retained for association tests was required to meet or exceed 5 per cent in the Xhosa.
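The quantities described above can be illustrated with a minimal Python sketch. It is not the R code used in this study; the function names and all genotype counts are hypothetical and chosen only for illustration. The sketch computes a MAF from genotype counts, a one-degree-of-freedom chi-squared test for Hardy-Weinberg disequilibrium, a chi-squared test on the 2 x 3 genotype table (one common way of expressing a co-dominant comparison between two groups) and a Bonferroni adjustment. Fisher's exact test, which the analysis applies when genotype counts are small, is not implemented here.

```python
import math


def minor_allele_frequency(n_aa, n_ab, n_bb):
    """Return the minor allele frequency from counts of genotypes AA, AB, BB."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared goodness-of-fit test against Hardy-Weinberg proportions (1 df)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    observed = [n_aa, n_ab, n_bb]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of the chi-squared distribution with 1 df.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def codominant_chi_square(group1, group2):
    """Chi-squared test comparing genotype counts (AA, AB, BB) between two groups."""
    # Drop genotype classes that are empty in both groups.
    cols = [(a, b) for a, b in zip(group1, group2) if a + b > 0]
    n1 = sum(a for a, _ in cols)
    n2 = sum(b for _, b in cols)
    if n1 == 0 or n2 == 0:
        raise ValueError("each group needs at least one genotyped individual")
    total = n1 + n2
    stat = 0.0
    for a, b in cols:
        col_total = a + b
        e1 = n1 * col_total / total
        e2 = n2 * col_total / total
        stat += (a - e1) ** 2 / e1 + (b - e2) ** 2 / e2
    df = len(cols) - 1
    if df == 2:
        p_value = math.exp(-stat / 2.0)  # chi-squared survival function, 2 df
    elif df == 1:
        p_value = math.erfc(math.sqrt(stat / 2.0))  # 1 df
    else:
        p_value = 1.0  # a single genotype class carries no information
    return stat, df, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted p value, capped at 1."""
    return min(1.0, p_value * n_tests)


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, AB, BB) for one SNP in two groups.
    group_one = (70, 34, 5)
    group_two = (30, 28, 9)
    print("MAF, group one:", minor_allele_frequency(*group_one))
    print("HWE test, group one:", hwe_chi_square(*group_one))
    stat, df, p = codominant_chi_square(group_one, group_two)
    print("Genotype test:", stat, df, p)
    # The number of tests (here 100) is likewise a placeholder.
    print("Bonferroni-adjusted p:", bonferroni(p, 100))
```

In a real analysis the number of tests passed to the Bonferroni adjustment would be the number of SNPs that pass the call-rate and MAF filters described above.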
The R package, haplo.stats, was used to infer haplotypes. A sliding window haplotype association test was performed for each SNP represented in a given gene. This tests for association between haplotypes and an outcome. Given a set of markers ordered by chromosomal location (1, 2, 3, ..., n), sliding windows of overlapping haplotypes are tested in sequence (ie for window size = 3, markers 1-2-3 are treated as a single haplotype, then markers 2-3-4 are treated as a single haplotype, then markers 3-4-5, etc.). Haplotypes of varying sizes (2-, 3- and 4-SNP haplotypes) were assessed within each gene for this dataset. This haplotype test also assessed the association between identified haplotypes and outcome (in our case, ethnicity), as previously described by another group. 9
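The windowing scheme described above can be sketched in a few lines of Python. This shows only how overlapping marker windows are enumerated; it does not reproduce the haplotype inference or the association test performed by haplo.stats, and the function names and marker labels are hypothetical.

```python
def sliding_windows(markers, window_size):
    """Return every run of `window_size` consecutive markers, in order."""
    last_start = len(markers) - window_size
    return [tuple(markers[i:i + window_size]) for i in range(last_start + 1)]


def windows_by_size(markers, sizes=(2, 3, 4)):
    """Map each window size to its list of overlapping marker windows."""
    return {size: sliding_windows(markers, size) for size in sizes}


if __name__ == "__main__":
    # Hypothetical markers, already ordered by chromosomal location.
    ordered_markers = ["snp1", "snp2", "snp3", "snp4", "snp5"]
    for size, windows in windows_by_size(ordered_markers).items():
        print(size, windows)
```

Each window would then be treated as a single haplotype and tested against the outcome; a gene with fewer markers than the window size simply yields no windows of that size.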

Study population
Our study population consisted of 176 HIV-positive, unrelated South African individuals, aged 21 and older. There were 109 Xhosa individuals, consisting of 79 females and 30 males, and 67 CMA individuals, consisting of 35 females and 32 males.

Comparison of SNPs between the Xhosa and CMA populations
Six of the 12 genes studied (those encoding the ATP binding cassette [ABC] sub-family B, member 1 [ABCB1], CYP3A5, UGT1A1, CYP2C18, CYP3A4 and CYP2A7P1, in descending order of significance) contained at least one tagSNP whose frequency differed significantly between the Xhosa and the CMA populations (Table 1). The tagSNP results in Table 1 are likewise presented in descending order of statistical significance. Among the six genes, the MAFs of eight of the 211 genotyped SNPs were statistically different between the Xhosa and the CMA, with the greatest difference found for the ABCB1 SNP rs13233308 (p = 1.77E-05; Table 1) and the least difference found for the ABCB1 SNP rs1202184 (p = 0.0459).
The CYP3A5 SNP (rs4646450) occurred at a frequency of 0.03 in the Xhosa and 0.17 in the CMA (p = 0.00393). Two SNPs in UGT1A1 were found to be statistically significantly different between the Xhosa and the CMA. The first SNP in UGT1A1 (rs7572563) occurred at a frequency of 0.14 in the Xhosa and 0.32 in the CMA (p = 0.0108) and the second SNP in UGT1A1 (rs4148329) occurred at a frequency of 0.21 in the Xhosa and 0.41 in the CMA population (p = 0.0445). One SNP in CYP2C18 (rs2860840) was not detected in the Xhosa but occurred at a frequency of 0.09 in the CMA (p = 0.0148). A single SNP in CYP3A4 (rs2738258) occurred at a frequency of 0.35 in the Xhosa and 0.17 in the CMA (p = 0.0263). A SNP in CYP2A7P1 (rs11666982) occurred at a frequency of 0.22 in the Xhosa and 0.43 in the CMA (p = 0.0279).

Haplotype analysis
Based on the genotyped tagSNPs, haplotypes were constructed for each of the 12 genes. There were no identifiable haplotypes in the Xhosa and the CMA in the following genes: ABCB1, ABCC2, CYP2A7P1, CYP2B6, CYP2C19, CYP2D6, CYP3A5 or CYP3A7; however, haplotypes were identified in the gene encoding solute carrier family 22 member 6 (SLC22A6), CYP2C18, CYP3A4 and UGT1A1 (Table 2).
A total of four haplotypes were identified in the SLC22A6 gene. The four-SNP TAGG haplotype of SLC22A6 was found to occur at a significantly different frequency in the Xhosa (0.19) and in the CMA (0.05) (p = 2.7E-04; Table 2). In CYP2C18, a total of three haplotypes were identified. The four-SNP AAGC haplotype of CYP2C18 occurred at a frequency of 0.40 in the Xhosa and 0.25 in the CMA (p = 3.0E-03). A total of seven haplotypes were identified in CYP3A4, which included the *1B SNP (rs2740574; Table 2). Six of the seven CYP3A4 haplotypes differed significantly in population frequency between the Xhosa and the CMA. The four-SNP haplotype of CYP3A4 which differed the most between the groups was the GCAG haplotype, which occurred at a frequency of 0.04 in the Xhosa, compared with 0.22 in the CMA population (p = 3.3E-05; Table 2).
A total of ten haplotypes in UGT1A1 were identified. Unlike the other genes, two different haplotype blocks were identified in UGT1A1. One of the UGT1A1 haplotype blocks consisted of four SNPs and comprised five haplotypes. Two of the four-SNP haplotypes were significantly different in frequency between the Xhosa and CMA South African populations (Table 2). The second UGT1A1 haplotype block consisted of three SNPs and comprised five haplotypes. Two of the three-SNP haplotypes were found to be significantly different in frequency between the Xhosa and the CMA (Table 2). The most significant haplotype difference was the GGA haplotype of UGT1A1, which occurred at a frequency of 0.12 in the Xhosa and 0.33 in the CMA (p = 5.2E-06).

Comparison of SNPs between the South African and the HapMap African, US and Asian populations
A comparison of the MAF of 35 pharmacogenetic SNPs with known functional or clinical associations in 10 genes (ABCB1, ABCC2, CYP2B6, CYP2C18, CYP2C19, CYP2D6, CYP3A4, CYP3A5, CYP3A7 and UGT1A1) in the Xhosa and CMA populations is presented in Table 3. The MAFs of these SNPs do not differ statistically between the Xhosa and CMA populations, except for CYP3A5 rs4646450 (p = 0.00393) and CYP2C18 rs2860840 (p = 0.0148). Table 3 also shows a comparison between the allele frequencies obtained in the two distinct South African population groups in our study and available reports for other African populations, for which most data are available for the Yoruba from Nigeria and, most recently, the Maasai and the Luhya of Kenya. In addition, the table displays a comparison of the allele frequencies in the African populations with other diverse populations in the USA and Asia.

Discussion
In this study, we analysed the allelic variation of 211 tag SNPs in 12 genes that are important in drug disposition and treatment outcome in two South African population groups: the Xhosa and the CMA. We identified both single SNPs and haplotypes which occurred at significantly different frequencies in the two populations.
In most sub-Saharan African countries, HIV/AIDS comprises one of the top socioeconomic and health burdens. It is estimated that 25 per cent of the adult population living in Southern Africa is infected with HIV, with a prevalence of approximately 18 per cent in South Africa alone. 10 Given the high prevalence of HIV/AIDS in South Africa, the greatest impact of pharmacogenetics may initially be made by improving treatment outcomes on antiretroviral therapy (ART).
In terms of the pharmacogenetic relevance of the ABC family of transporter genes, the evidence of their role in predicting HIV treatment-related toxicity is inconclusive. The presence of the ABCB1 3435T allele is associated with a decreased risk of hepatotoxicity in HIV patients treated with either efavirenz or nevirapine. 11,12 There is no conclusive evidence of the clinical significance of the ABCB1 1236C>T allele, however, although it appears to have a minimal effect on the kinetics of the immunosuppressant drug cyclosporine. 13 Both efavirenz and nevirapine, which are non-nucleoside reverse transcriptase inhibitors (NNRTIs), are currently used in first-line treatment regimens of HIV-infected individuals in South Africa. 14 Therefore, it would be important to assess the contribution of the ABCB1 variant alleles to drug-related toxicity with these NNRTIs. Parathyras et al. studied the association between a number of variants of ABCB1 and immune recovery in South Africans treated with ART and found no association between the well-known 3435T allele and immune recovery; 7 however, an association was found between the ABCB1 G2677A SNP and immune recovery in that study. 7 Based on the results of the present study, it would be interesting to investigate whether there is an association between immune recovery and the two ABCB1 tagSNPs (rs13233308 and rs1202184) found to be significantly different between the Xhosa and CMA populations.
According to the South African Department of Health, the current second-line ART regimen should include the anchoring agents lopinavir and ritonavir. 14 Therefore, pharmacogenetic traits of CYP3A4 and CYP3A5 may have an impact on the treatment outcome of second-line therapy. As protease inhibitors are both substrates and inhibitors of CYP3A, however, the influence of CYP3A gene variation on ART treatment outcomes is difficult to discern, 15 although the CYP3A4*1B variant allele is associated with variability in the pharmacokinetics of the protease inhibitor indinavir. 16 In fact, homozygotes for the *1B variant have a lower bioavailability of indinavir than heterozygotes and homozygotes for the *1A common allele. 16 Similarly, the common allele of CYP3A5 A6986G is associated with increased clearance of indinavir. 17 Similar association studies should be carried out to assess the contribution of CYP3A variants to response or exposure to lopinavir in the South African population. It would be interesting to assess the influence of the CYP3A5*6 variant, which results in a loss of function of the CYP3A5 enzyme, on lopinavir exposure and treatment outcome in South Africans, as this allele is more common in people of African descent than in Caucasians and Asians. 18,19 The variant occurred at a frequency of 0.2 in the Xhosa and 0.17 in the CMA populations in the present study. In addition, the influence on lopinavir exposure and treatment outcome of the CYP3A4 SNP (rs2738258) and the CYP3A5 SNP (rs4646450), both found to be statistically significantly different between the Xhosa and CMA in the present study, should be investigated.
The single SNPs and haplotype structures inferred for the Xhosa and the CMA populations in the UGT1A1 gene could be used to stratify the two populations more accurately in order to perform pharmacogenetic association studies. The South African HIV treatment guidelines changed in April 2010, and first-line ART now includes tenofovir in addition to either nevirapine or efavirenz and lamivudine. 14 Although both lamivudine and tenofovir are only nominally affected by CYP enzymes, they are glucuronidated in the liver and excreted unchanged through the kidneys. 15 Therefore, studies could be designed to assess the impact of UGT1A1 SNPs (rs7572563 and rs4148329) and haplotypes on treatment outcomes of antiretroviral drugs that may undergo glucuronidation prior to excretion, such as tenofovir. It would make sense initially to genotype SNPs of the UGT1A1 haplotype with known functional alleles such as the UGT1A1*93 or the rs887829 SNP, both of which have been associated with hyperbilirubinaemia, 20,21 and assess their impact on the response to tenofovir.
It is clear that there are differences in the MAF of key pharmacogenetic alleles in South African populations compared with other African populations (Table 3). Of particular interest, the loss-of-function CYP2B6*18 variant allele is thought to occur most frequently in West African populations, with a reported MAF of 0.04. 22 In the present study, however, we find that it occurs at a frequency of 0.17 in the Xhosa and 0.09 in the CMA populations, compared with a reported frequency of 0.07 in the Luhya and 0.02 in the Maasai populations (Table 3). The CYP2B6 gene plays an important role in the metabolism of two of the first-line ART drugs used in South Africa: efavirenz and nevirapine. The CYP2B6*18 SNP is the only coding SNP in CYP2B6. 23 The variant is associated with elevated plasma concentrations of efavirenz and nevirapine and hepatotoxicity in HIV patients from Mozambique treated with either drug. 24-26 To our knowledge, the present study is the first report on the MAF of the CYP2B6*18 variant in the Xhosa and the CMA populations. Given that efavirenz and nevirapine are both first-line treatment agents in this region, further investigation of the association between CYP2B6 null variant alleles and adverse reactions in South African populations is warranted. Such findings have important implications for the incidence of adverse reactions to efavirenz and nevirapine in different African populations.
The current study is the first of its kind to characterise systematically tagging and clinically relevant pharmacogenetic SNPs in the two South African populations; however, there are inherent limitations to our analyses. First, as this was a purely descriptive study, there are no associations made with any disease (eg HIV) or treatment outcomes; however, this work lays the foundation for the future study of such associations. Secondly, whereas the size of the Xhosa sample is adequate, the size of the CMA sample is modest and it is possible that lower frequency alleles could not be detected in this group. Thirdly, the SNPs typed in this study were previously characterised in other populations and therefore we cannot rule out the presence in the Xhosa and CMA populations of novel SNPs for which we did not test, such as those recently reported in CYP2C19 and CYP2D6 in these populations. 5,27 Fourthly, the haplotypes inferred are limited by the sample size of our population and there may be others that remain to be identified. Lastly, 12 genes that are known to be associated with treatment outcomes in HIV infection were characterised but there are likely to be more genes that remain to be studied.

Conclusion
To our knowledge, this is the largest pharmacogenetics study of two distinct South African population groups. Our work shows that there are significant differences in the frequencies of variant alleles in several genes (ABCB1, CYP2A7P1, CYP2C18, CYP3A4, CYP3A5 and UGT1A1) associated with treatment outcome in the Xhosa and the CMA populations of South Africa. It also shows that for the majority of SNPs analysed, there is great similarity in allele frequency between the two groups. Such work is of great importance for laying the foundation for ethnicity-specific genotype-to-phenotype correlates of treatment outcome for these various enzyme polymorphisms and their drug substrates. Importantly, we also identified novel haplotype structures in four genes (CYP2C18, CYP3A4, SLC22A6 and UGT1A1) in the two distinct South African populations. The haplotypes could be used, in addition to single SNPs, to more accurately stratify patient groups according to ethnicity and to aid in identifying associations between causative variants and drug response.
It is clear from this work and that of others that not all African groups share the same allele frequencies of key pharmacogenetic genes. 4,5,7 Therefore, it is important that studies such as this are performed in as many populations as possible, to generate the most useful information on the clinical application of pharmacogenetics for these specific populations. Caution is advised in using a single African population in pharmacogenetics studies, since it cannot be representative of all Africans.