Analysis of the genetic landscape in 208 human pharmacogenes
We analyzed the genetic variability in 208 genes with importance for drug ADME using exome sequencing data from 60,706 unrelated individuals. In total, we identified 69,923 variants distributed across transporter genes (33,792 variants in 73 genes), genes encoding phase 1 (21,161 variants in 71 genes) and phase 2 enzymes (10,411 variants in 46 genes), nuclear receptors (2338 variants in 9 genes), and other pharmacogenes with miscellaneous functions (2221 variants in 6 genes; Fig. 1a). Notably, 57,773 (83%) of these 69,923 variants we identified were novel as compared to dbSNP release 135 (Fig. 1b).
Evolutionary constraints in transporters as well as in phase 1 and phase 2 drug-metabolizing genes were low as judged by the large numbers of loss-of-function variants identified in these genes (LoF intolerance score = 0.08 ± 0.02 SEM; see the “Methods” section for details). In contrast, nuclear receptors and other selected genes with importance for drug response were highly LoF-intolerant (LoF intolerance score = 0.53 ± 0.11 SEM), comparable to values observed across haploinsufficient genes (LoF intolerance score > 0.5; Fig. 1c) [11]. Importantly, the vast majority of variants were rare (98.5%; MAF < 1%) or very rare (96.2%; MAF < 0.1%) and more than half (51.1%) of all variants were only detected in a single individual, highlighting the genetic diversity in human pharmacogenes (Fig. 1d).
Rare variants contribute substantially to functional variability
To evaluate the functional importance of rare pharmacogenetic variants, we computed functionality assessments of each SNV using a computational assessment model specifically optimized for the assessment of pharmacogenes (see the “Methods” section for details). We then aggregated frequencies of frameshift, splice, start-lost, stop-gain, and putatively deleterious missense variants and found that the pattern and distribution of genetic variability differed substantially across the 208 pharmacogenes analyzed. Genetic variability with functional impact was governed by few high-frequency variants for some genes, including ABCB5, SLC22A10, CYP1A2, CYP2C8, or GSTT2 (Fig. 2a). In contrast, the functionality of the majority of pharmacogenes, including ABCB1, SLC10A1, and CYP3A7, is dominated by rare genetic variants. The frequency of genetic variants predicted to affect the functionality of the gene product differed more than 1000-fold between genes. The most highly variable genes were SLC22A10 (aggregated functional variant frequency 1.08), ABCB5 (0.91), and FMO2 (0.86), whereas the lowest numbers of functional variants were observed for GSTT1 (0.0006), RXRA (0.0007), PPARD (0.0011), and CYP17A1 (0.0014; Fig. 2a).
Notably, rare pharmacogenetic variants were strongly enriched in mutations predicted to cause functional alterations (Fig. 2b), consistent with previous reports [12, 14, 15]. In the 208 pharmacogenes combined, each individual harbored on average a total of 40.6 putatively functional variants (Fig. 2c). Rare variants accounted for 4.4 (10.8%) of these functional variants, of which 1.8, 1.7, 0.7, and 0.2 were allotted to transporters, phase 1, phase 2, and other pharmacogenes, respectively (Fig. 2c).
Prediction of the importance of rare genetic variants for drug response
Given the significant contribution of rare genetic variants to the functional variability in pharmacogenes, we considered it of importance to include rare variants into predictions of drug response. Using the genetic information as template, we analyzed the contribution of rare genetic variants for drug pharmacokinetics and/or drug response, focusing on five drugs with well-characterized pharmacology and substantial unexplained inter-individual variability. Specifically, we evaluated the relevance of rare SNVs for the anticoagulant warfarin, the HMG-CoA reductase inhibitor simvastatin, the antifungal voriconazole, the antipsychotic olanzapine, and the antineoplastic agent irinotecan (Additional file 2: Table S2). We first estimated the relative importance of different genetic factors for drug metabolism phenotypes of the specific drugs based on extensive literature analysis. Subsequently in a second step, we integrated these evaluations with our genetic variability data to derive assessments of the impact of rare genetic variants on the pharmacokinetics of or response to the given drug.
Warfarin response
Warfarin is a racemic mixture of the R- and S-stereoisomers of which the S-form is at least five times more potent. Warfarin response is influenced by common genetic polymorphisms in CYP2C9, CYP4F2, and VKORC1, which jointly explain up to 45% of warfarin dose requirements [16]. Yet also other genes, such as CYP3A4, CYP1A2, EPHX1, and ABCB1, have been implicated in warfarin pharmacokinetics [17,18,19]. However, despite this extensive knowledge of warfarin transport and metabolism, around 40% of the variability in warfarin dose requirements remains unexplained by common genetic variants and other patient-specific factors [20].
Our analyses predict that rare genetic variants contribute only minorly to the metabolism of the pharmacologically less potent R-enantiomer of warfarin (Fig. 3a–c). Similarly, their contribution to warfarin pharmacodynamics by alterations in CYP4F2 (3.6%) and VKORC1 (2%) function is expected to be relatively low. Importantly however, rare SNVs have a major impact on hepatic S-warfarin metabolism. Overall, 2.1% of CYP2C9 alleles are predicted to harbor rare variants with deleterious effects, accounting for 18.4% of the genetically encoded functional differences in CYP2C9 activity (Fig. 3b). Moreover, our analyses predict rare variants with functional consequences in 1.3% of ABCB1 alleles, encoding the P-gp/MDR1 transporter that is implicated in warfarin clearance, whereas no common deleterious variants were identified (Fig. 3b). However, given the controversy regarding the functional impacts of common ABCB1 variants, such as rs1045642 and rs2032582, future research is necessary to delineate associations between ABCB1 genotypes and factors related to P-gp/MDR1 activity [21]. Combined, our analyses pinpoint CYP2C9 and ABCB1 as loci for which comprehensive NGS profiling can likely reveal substantial additional information regarding the unexplained variability in warfarin dose requirements.
Simvastatin myopathy
ADRs related to high-dose simvastatin therapy are strongly linked to the common (12.9% MAF) genetic variant rs4149056 in SLCO1B1 (encoding the transporter OATP1B1) with an odds ratio of 4.5 per copy of the risk allele [22]. Toxicity is caused by an impaired hepatic uptake of the drug that results in elevated plasma concentrations of simvastatin acid, which have been shown to cause myotoxicity in vitro [23] (Fig. 3d). In our analyses, we correctly predicted the functional impact of rs4149056 and did not find additional common variants with reduced functionality. However, we identified rare deleterious variants with an aggregated frequency of 1.2%, which are estimated to jointly explain 8.7% of the genetic basis of SLCO1B1 variability (Fig. 3e). Similarly, rare variants are expected to contribute substantially to the metabolism and transport of simvastatin with an aggregated rare functional variant frequency of 1.3, 1.1, and 0.8% for ABCB1, CYP3A4, and ABCG2, respectively (Fig. 3f).
Voriconazole efficacy and ADRs
Voriconazole is a triazole antifungal agent exhibiting large inter-individual variability in serum concentrations that is a common reason for therapeutic failure or the manifestation of ADRs. Voriconazole is extensively metabolized by various CYPs (CYP2C19, CYP2C9, and CYP3A4) and FMOs (FMO1, FMO3, and FMO5) accounting for 75 and 25% of its hepatic metabolism, respectively (Fig. 4a) [24, 25]. Genetic polymorphisms in CYP2C19 have been reproducibly linked to differences in voriconazole serum levels and jointly explain around 50% of the inter-individual variability in voriconazole metabolism ([26] and references therein). In addition to CYP2C19 alleles, clinical pharmacogenetic studies also implicated reduced functionality variants of CYP2C9 (CYP2C9*2) and CYP3A4 (rs4646437) in differences in voriconazole pharmacokinetics [27, 28]. For CYP2C19, our analyses identified rare deleterious variants with an aggregated frequency of 1.6%, whereas the common functional CYP2C19 alleles CYP2C19*2 and CYP2C19*17 showed frequencies of 18.5 and 15.3%, respectively (Fig. 4b). Consequently, rare variants are estimated to account for 4.4% of the overall genetic variability of CYP2C19 function. Furthermore, rare alleles contributed substantially to the variability in other genes implicated in voriconazole efficacy and ADRs, including FMO1 (100% contribution), FMO5 (100%), CYP3A4 (43.1%), and CYP2C9 (18.4%; Fig. 4b, c).
Serum olanzapine levels
The therapeutic benefits for schizophrenic or bipolar patients when treated with the antipsychotic olanzapine are limited by extensive inter-individual variability in olanzapine serum concentrations, which can result in exposure levels outside the therapeutic interval [29]. As olanzapine serum levels are directly linked to the likelihood of therapeutic success [30] and the risk of ADRs [31, 32], an individualization of dosing regimens promises to increase treatment success rates.
While the metabolism of olanzapine is well characterized, the influence of genetic factors is more controversial (Fig. 4d). CYP2D6 and UGT2B10 hydroxylate or glucuronidate olanzapine, respectively, but so far, no study demonstrated clinically relevant effects of haplotypes of these genes on olanzapine exposure. In contrast, multiple clinical association studies found that genetic variants in CYP1A2, FMO3, and UGT1A4 could explain up to 50% of differences in olanzapine serum levels, whereas other studies failed to replicate such associations ([33] and references therein). Rare variants substantially contribute to UGT1A4 and UGT2B10 variability and are predicted to account for 100 and 47% of the genetically encoded inter-individual variability in the functionality of these genes (Fig. 4e, f). On the contrary, we estimate that rare variants only explain 6.3, 5.5, and 1.6% of the variability in CYP2D6, FMO3, and CYP1A2, respectively.
Irinotecan toxicity
Irinotecan is a topoisomerase inhibitor prodrug that is used in combination therapy for advanced colorectal, lung, and other cancers. Irinotecan has a narrow therapeutic window and, as a consequence, up to 36% of patients suffer from dose-limiting toxicities [34]. Irinotecan is subjected to a complex interplay of competing metabolic activation and inactivation pathways (Fig. 5a). Around 97% of irinotecan is metabolized by CYP3A4 and CYP3A5 to the pharmacologically inactive metabolites APC and NPC, while only 3% become metabolically activated into SN-38 by the carboxylesterases CES1 and CES2. Subsequently, SN-38 is detoxified by glucuronidation mediated by UGT1A1 and, to a lesser extent, UGT1A9. Irinotecan and its metabolites are excreted into the bile and intestine via multiple transporters of the ABC family. Importantly, the β-glucuronidase enzymes of the intestinal microflora can re-activate glucuronidated SN-38 resulting in diarrhea and damage to all segments of the intestine [35].
Genetic variants in UGT1A1 have been reproducibly linked to neutropenia and diarrhea toxicity in various ethnicities and dosing regimens [36]. Furthermore, multiple polymorphisms in the transporter genes ABCB1, ABCC1, ABCC2, ABCG2, and SLCO1B1 have been implicated in irinotecan clearance and/or risk of toxicity [37,38,39,40]. While associations between CYP3A genotype and irinotecan pharmacokinetics are controversial, incorporation of CYP3A activity data into dosing calculations have resulted in reduced incidence of severe neutropenia [41].
Interestingly, our computational analyses of population-scale sequencing data indicate that rare genetic variants are important factors for irinotecan activation and transport (Fig. 5b). The aggregated frequency of rare deleterious alleles in CES1 and CES2 were 1.5 and 0.4%, respectively, accounting for 19.6 and 100% of the functional genetic variants in these genes. Similarly, for SLCO1B1 and ABCC1, 8.7 and 40.5% of all deleterious variants were assigned to rare variants, while for ABCB1, ABCC2, and ABCG2, no common variants with functional consequences were identified. Notably though, we do not consider here variants with functional impacts that do not result in changes of the gene product, such as the synonymous variants rs1045642 (ABCB1 I1145I) and rs1128503 (ABCB1 G412G) or the UTR variant rs717620 in ABCC2. In contrast, inter-individual variability in UGT1A1 is primarily due to common polymorphisms, such as UGT1A1*28, and only an estimated 4.5% are allotted to rare SNVs.
Combined, our evaluations indicate that rare genetic variants in pharmacogenes have the potential to explain a substantial part of the unexplained genetic variability in drug metabolism phenotypes. Examples were selected for which gene-drug interactions were well studied, and we speculate that the relative importance of rare variants is even higher for less extensively characterized drugs. Furthermore, we give indications about the extent of genetically encoded functional variability that would be missed when only considering common genetic variants, thereby providing guidance for the optimal drug-specific choice of genotyping strategy.