Trick or treat: The effect of placebo on the power of pharmacogenetic association studies

The genetic mapping of drug-response traits is often characterised by a poor signal-to-noise ratio that is placebo related and which distinguishes pharmacogenetic association studies from classical case-control studies for disease susceptibility. The goal of this study was to evaluate the statistical power of candidate gene association studies under different pharmacogenetic scenarios, with special emphasis on the placebo effect. Genotype/phenotype data were simulated, mimicking samples from clinical trials, and response to the drug was modelled as a binary trait. Association was evaluated by a logistic regression model. Statistical power was estimated as a function of the number of single nucleotide polymorphisms (SNPs) genotyped, the frequency of the placebo 'response', the genotype relative risk (GRR) of the response polymorphism, the strategy for selecting SNPs for genotyping, the number of individuals in the trial and the ratio of placebo-treated to drug-treated patients. We show that: (i) the placebo 'response' strongly affects the statistical power of association studies -- even a highly penetrant drug-response allele requires at least a 500-patient trial in order to reach 80 per cent power, several-fold more than the value estimated by standard tools that are not calibrated to pharmacogenetics; (ii) the power of a pharmacogenetic association study depends primarily on the penetrance of the response genotype and, when this penetrance is fixed, power decreases for larger placebo effects; (iii) power is dramatically increased when adding markers; (iv) an optimal study design includes a similar number of placebo- and drug-treated patients; and (v) in this setting, straightforward haplotype analysis does not seem to have an advantage over single marker analysis.


Introduction
Pharmacogenetics (PGx) - the study of how genetic differences influence the variability in patients' response to drugs 1 - investigates genes ideally covering all of the drug's interactions in the course of its passage through the body. 2 The objective of PGx research is to identify the genetic profile contributing to an individual's response pattern to a specific drug. Little is known about the genetic basis of differential drug response. There are examples where a single gene may exert a dominant effect on treatment efficacy, as in the case of cytochrome P450 2D6 (CYP2D6), where deficient patients need to be identified before treatment initiation by codeine and its derivatives due to efficacy loss. 3 More commonly, the phenotype of drug response is classified as multifactorial, as it generally results from the interaction of a number of different genetic, as well as environmental, factors. An example of this is the efficacy of clozapine therapy in the treatment of schizophrenia. 4
Traditionally, genetic mapping can be approached either by linkage (family-based) methods or by association study (population-based) designs. The latter are particularly likely to play a prominent role in pharmacogenetics, as it may be difficult to collect informative families with multiple patients treated with the same drugs. The simplest and most widely applied strategy of association studies is the case-control design; however, several key aspects distinguish PGx association studies from standard disease-oriented case-control studies. First, PGx association studies are usually based on either prospective or ongoing clinical trials, where, classically, patients are randomly assigned to one of two groups: a treatment group, receiving the tested drug; and a control group, receiving placebo (randomised, controlled study). As a result, the number of responders ('cases') can only be determined once the study has been completed and not a priori, complicating the recruitment of the required cohort.
PRIMARY RESEARCH. © HENRY STEWART PUBLICATIONS 1479-7364. HUMAN GENOMICS. VOL 2. NO 1. 28-38 MARCH 2005
Secondly, PGx association studies in general, and those of medications for psychiatric and immunological diseases in particular, are characterised by a poor signal-to-noise ratio: approximately one third of the patients enrolled in efficacy trials may respond to placebo treatment. The placebo 'response' in randomised clinical trials includes such statistical artefacts as regression to the mean, 5 drift in measurement of the response over time and bias of expectations by both patients and evaluators, as well as real effects such as spontaneous recovery, a tendency to seek treatment outside the study and the response to additional attention and concern arising from participation in clinical trials. 6 Although a systematic review of placebo versus no treatment found little evidence for placebo effect, 7 one issue seems unquestionable: the placebo effect is present in clinical practice and in clinical trials - by whichever name we choose to call it or the nature of the phenomenon - and its amplitude may vary with drug treatment. 6 Therefore, the impact of placebo effects on statistical power in the context of PGx association studies needs to be evaluated and quantified.
Several factors have been shown to influence power estimations for association studies, such as: disease penetrance and prevalence; the net effect of the susceptibility locus; the frequency of the disease allele(s); the frequency of the marker allele(s); and the extent of linkage disequilibrium (LD) between disease alleles and marker alleles. 8,9 At present, there are no analytical derivations of power estimation that handle more realistic situations, such as complex dependencies between linked markers and the disease-causing allele frequency, recombination hot spots etc. Therefore, the strategy of choice is simulations. Long and Langley 10 pursued this strategy to quantify the power of complex trait association studies across a wide range of settings using a large number of simulations. They simulated genotypes based on the coalescent model; 11 phenotypes were randomised, with phenotype probability being conditioned on the causative single nucleotide polymorphism (SNP) genotypes, and association was evaluated using appropriate statistical tools. The study concluded that greater power was achieved by increasing the sample size than by increasing the number of polymorphisms, and that marker-based tests were more powerful than simple haplotype-based analyses.
PGx studies differ considerably from standard case-control association studies, however, as illustrated above and confirmed by our results; hence, it is important to quantify the statistical power of association studies in the context of PGx and to map the parameter space of such studies. Power estimation for PGx studies has been previously studied by Cardon et al., 12 who used analytical formulae to study simplistic trial designs. They explored how different properties of SNPs, for example the frequency of the disease-causing alleles, might influence the required size and expected power of the clinical trial. Unfortunately, for PGx studies - as for complex trait associations - the frequencies of these phenotype-causing variants are unknown and their distribution is complex, motivating a simulation-based approach.
The goal of this study was to evaluate the power of PGx association studies under different scenarios, with special emphasis on the placebo effect. The setting was a drug clinical trial consisting of a double-blind, randomised controlled study, which included a placebo-treated control group and a drug-treated group. SNPs for a candidate gene region were then genotyped in these groups and tested for association with the response phenotype under the assumption of complete LD. Drug response was simplistically treated as a binary trait, and marker allele frequencies were then compared between responders (cases) and non-responders (controls), similar to a case-control design nested within a cohort. 13 Power was estimated by simulation, as in the study by Long and Langley, 10 and association was evaluated using a logistic regression model. 14,15 Since a considerable fraction of responders were expected to respond because of the placebo phenocopy (an indistinguishable phenotype unrelated to the tested causative allele), we focused on the interaction between genotype and drug/placebo labelling. The model we propose assumes that specific genotypes have differential effects in the drug-treated group but not in the placebo-treated group. 12 Thus, the logistic regression term that is expected to indicate true association is the interaction term for genotype by drug. Various studies (eg Gauderman 16 ) have calculated the required sample size for studies of gene-environment interactions, but the methods suggested are usually applicable to very specific designs, and calculations are presented for specific sets of parameters; they are therefore not directly applicable to the PGx context and the particular design of interest (randomised controlled study).
Power was estimated over a wide range of experimental design parameters: first and foremost, the number of individuals that participated in the clinical trial, the magnitude of the placebo effect and the penetrance of the response locus. We further examined direct (typing the causative allele itself) versus indirect (typing a tightly correlated SNP) tests and haplotype versus single marker frequency analyses. We also changed the ratio between the sizes of placebo- and drug-treated patient groups, the number of SNPs and the method for choosing those SNPs (either randomly or categorised in allele frequency bins). 9,17 Combined, our analyses provide a comprehensive examination of the parameter space for PGx study designs.

Materials and methods
For each setting of parameters, we evaluated power as the fraction of simulations, out of R = 100 or 1,000 (see below) repetitions, in which true association was detected, with an expected type I error of 5 per cent. Each of the R simulations was performed as outlined below:
• Generate genotype data
• Generate phenotype data
• For indirect tests, select SNPs for study
• Assess association between marker alleles/haplotypes and phenotype.
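The definition of power as a fraction of successful repetitions can be sketched as follows. This is a minimal illustration, not the code used in the study: the function names, the `run_one_simulation` callback and the toy stand-in are all hypothetical, and the binomial standard error is added here only to show how the precision of the estimate depends on R.

```python
import math
import random

def estimate_power(run_one_simulation, repetitions, alpha=0.05, seed=0):
    """Return (power, standard error) over `repetitions` simulated trials.

    run_one_simulation(rng) is a hypothetical callback that simulates one
    trial and returns the p value of its association test.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(repetitions) if run_one_simulation(rng) < alpha)
    power = hits / repetitions
    # Binomial standard error of a proportion estimated from R repetitions.
    std_err = math.sqrt(power * (1.0 - power) / repetitions)
    return power, std_err

def toy_simulation(rng):
    # Stand-in only: a uniform "p value", as expected when no association exists.
    return rng.random()
```

For a true power of 0.8, the binomial standard error is sqrt(0.8 × 0.2 / R), ie 0.04 for R = 100 and about 0.013 for R = 1,000.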

Parameters tested
We evaluated statistical power as a function of the number (N) of individuals in the clinical trial (N = 100 to N = 1,500), under a range of different parameter settings:
• The frequency (f0 = 15 per cent to f0 = 40 per cent) of the placebo-response phenocopy. Importantly, this magnitude of the placebo effect is assumed to equal the penetrance (frequency of response) among homozygotes for the non-response allele.
• The size ratio between drug- and placebo-treated patient groups (either by suggesting a different study design - ie fixing the total number of patients - or by suggesting drug-only follow-up studies, fixing the number of placebo-treated individuals).
• The genotype relative risk (GRR) of the response polymorphism (2 to 4). GRR is defined as the ratio between the penetrance among homozygotes for the response allele (f2) and homozygotes for the non-response allele (or placebo effect, f0). 18
• The number of SNPs examined (M = 3 or M = 5).
• The strategy for SNP selection (randomly or by frequency categories).

Generation of genotype data
The coalescent approach 11 was used to generate samples consisting of completely linked SNPs. A simple population genetic model involving only mutation and random genetic drift was assumed, without recombination within the small region considered. We simulated a fixed number of sites, using the ms software (see Hudson 19 for further details on haplotype generation). A single realisation of the coalescent process resulted in a set of haplotypes for 50 polymorphic sites. Sites were correlated, as expected for sites in complete LD. One of the sites was randomly chosen as the response site. The only requirement was that the frequency of its minor allele was more than 5 per cent. To further simplify the model, the ancestral allele was assigned as the aetiological allele. Haplotypes were then randomly paired to form genotypes.
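The post-processing of simulated haplotypes described above (choosing a response site, then pairing haplotypes) might look roughly like the sketch below. It does not run the coalescent itself; it assumes haplotypes are already available as tuples of 0/1 alleles, with 0 denoting the ancestral allele as in ms output. The function names are hypothetical, and pairing by shuffling without replacement is one possible reading of 'randomly paired'.

```python
import random

def choose_response_site(haplotypes, min_maf=0.05, rng=None):
    """Pick at random one site whose minor allele frequency exceeds min_maf."""
    rng = rng or random.Random()
    n = len(haplotypes)
    eligible = []
    for site in range(len(haplotypes[0])):
        derived = sum(h[site] for h in haplotypes) / n
        if min(derived, 1.0 - derived) > min_maf:
            eligible.append(site)
    return rng.choice(eligible)  # raises IndexError if no site qualifies

def pair_haplotypes(haplotypes, rng=None):
    """Shuffle the haplotypes and pair neighbours to form diploid genotypes.

    With an odd number of haplotypes the last one is left unpaired.
    """
    rng = rng or random.Random()
    shuffled = list(haplotypes)
    rng.shuffle(shuffled)
    return [(shuffled[i], shuffled[i + 1])
            for i in range(0, len(shuffled) - 1, 2)]

def response_allele_count(pair, site):
    """Count copies (0, 1 or 2) of the ancestral (0) allele, taken as the response allele."""
    return sum(1 for h in pair if h[site] == 0)
```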

Generation of phenotypic data
Patients were randomly assigned to the drug- or placebo-treated group with equal probability, or according to a fixed drug/placebo group size ratio. Patients assigned to the placebo group were randomly defined as responders or non-responders, with the probability of the former equal to the 'placebo effect'. Patients assigned to the drug group were randomly labelled responder/non-responder, with the probability of response determined by the penetrance of each genotype. For the non-response homozygotes, this probability was equal to the placebo effect. The penetrance of the heterozygote was set to the mean of the two homozygote penetrances, representing an additive mode of inheritance.
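The phenotype model above can be sketched in a few lines. This is an illustration under stated assumptions rather than the study's implementation: the function names and the `drug_fraction` parameter are hypothetical, and group assignment is drawn independently per patient, which matches a fixed group size ratio only in expectation.

```python
import random

def penetrances(f0, grr):
    """Return (f0, f1, f2), indexed by the number of response alleles."""
    f2 = grr * f0  # GRR is defined as f2 / f0
    if not 0.0 <= f2 <= 1.0:
        raise ValueError("GRR * f0 must be a probability")
    f1 = (f0 + f2) / 2.0  # additive mode: mean of the two homozygotes
    return (f0, f1, f2)

def simulate_phenotypes(genotypes, f0, grr, drug_fraction=0.5, rng=None):
    """genotypes: response-allele counts (0, 1 or 2), one per patient.

    Returns a list of (on_drug, responder) pairs.
    """
    rng = rng or random.Random()
    pen = penetrances(f0, grr)
    patients = []
    for g in genotypes:
        on_drug = rng.random() < drug_fraction
        # Placebo patients respond at the placebo rate whatever their genotype.
        p_response = pen[g] if on_drug else f0
        patients.append((on_drug, rng.random() < p_response))
    return patients
```

For example, `penetrances(0.2, 3)` gives response probabilities of approximately 0.2, 0.4 and 0.6 for carriers of zero, one and two response alleles in the drug group.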

Strategy for SNP selection
M = 3 or M = 5 markers out of the 50 simulated markers in the candidate region were selected for genotyping. The number of SNPs per gene was limited to adhere to the budget constraints of the experimental design and, more importantly, availability: SNPs must be known (as if mined from public databases), technically typeable and polymorphic in the study population(s). The causative SNP was not explicitly excluded and could appear as one of the markers. Two strategies were tested for selecting the SNPs for genotyping:
• Category approach. In the presence of LD, adequate matching of allele frequencies at marker and trait loci determines if a marker site will be useful for detecting an association with the trait variant. 9,17 Following this principle, SNPs were classified into three or five distinct categories by their minor allele frequencies. One SNP from each category was then selected at random. If one category was empty of SNPs, we 'walked' along the chromosome until hitting a SNP with a frequency not already present in the selected set. The frequency categories were: 0.1 -0.

Detecting association between markers and drug response
Association was detected by a logistic regression model commonly used to analyse categorical data. We used the commercially available SAS statistical software. 20 In this analysis, the log odds of being a responder was regressed on the independent variables. The model contained two independent variables - a 'drug' indicator variable D (drug or placebo) and the genotype variable G (having three possible values: 0, 1 or 2) - and the interaction between them (D * G), namely

$$\operatorname{logit}\,\Pr(\text{responder}) = \beta_0 + \beta_1 D + \beta_2 G + \beta_3 (D * G),$$

where $\beta_0$ is the intercept and $\beta_i$ ($i = 1$ to 3) is the change in log odds as a result of a unit increase in D, G or D * G, respectively. Association was detected by a significant (p < 0.05) drug by genotype interaction effect.
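To make the interaction test concrete, the sketch below fits a model with an intercept, D, G and D * G terms by Newton-Raphson and returns the Wald chi-square of the interaction coefficient. It is a standard-library stand-in for illustration only (the study used SAS); the function names are hypothetical, and no safeguards are included for separable data, where the maximum-likelihood estimate does not exist.

```python
import math

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def fit_interaction_model(drug, genotype, response, iterations=25):
    """Fit logit P(response) = b0 + b1*D + b2*G + b3*D*G.

    Returns (coefficients, Wald chi-square of the D*G term).
    """
    rows = [[1.0, d, g, d * g] for d, g in zip(drug, genotype)]
    beta = [0.0] * 4
    for _ in range(iterations):
        grad = [0.0] * 4
        info = [[0.0] * 4 for _ in range(4)]
        for x, y in zip(rows, response):
            eta = sum(b * v for b, v in zip(beta, x))
            p = 1.0 / (1.0 + math.exp(-eta))
            for i in range(4):
                grad[i] += (y - p) * x[i]
                for j in range(4):
                    info[i][j] += p * (1.0 - p) * x[i] * x[j]
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    # Var(b3) is the last diagonal entry of the inverse information matrix
    # (taken from the final iteration, which equals the optimum once converged).
    var_b3 = solve(info, [0.0, 0.0, 0.0, 1.0])[3]
    return beta, beta[3] ** 2 / var_b3

def wald_p_value(chi_square):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_square / 2.0))
```

A trial would then count as a detection when `wald_p_value` of the interaction statistic falls below 0.05.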
Intuitively, this is just a more general version of implementing an association test of responders versus non-responders in a drug-only experimental design, while accounting for the level of the placebo phenocopy, known from a separate, placebo-only design. Two approaches were considered:
• A 'direct association' approach, in which potential drug-response variants were tested one at a time. The suspected causative SNP was therefore the only genotype considered in the logistic regression model. In this approach, R = 1,000 iterations were performed.
• An 'indirect association' approach, in which several markers (three or five) were typed, hopefully turning out to be significantly correlated with the response locus. Genotypes of all of the SNPs were therefore considered in the logistic regression model, either marker by marker (testing each of the three or five SNPs with separate regression models and recording the highest statistic, as explained below) or as haplotypes. The individual contribution of each SNP varied, as expected between different random runs of the simulation process, and we focused on the overall significance of association. The significance of single-marker association was computed through a Monte Carlo permutation approach 21 and compared with haplotype analysis. For all indirect marker-based tests, which employed a Monte Carlo procedure 22 for power estimation, R = 100 was used, due to the computationally intensive nature of this analysis.
To assess the significance of single-marker association, we applied logistic regression analysis to each genotyped marker and recorded the highest statistic (Wald χ²) for the drug by genotype interaction term. We randomly permuted the response labels and repeated the same analysis 500 times to obtain the distribution of the maximum χ² score under the null hypothesis of no association. The p value for a given simulation was estimated according to this distribution.
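The max-statistic permutation procedure just described can be sketched generically. In this illustration the `statistic` argument is a hypothetical callback standing in for the Wald χ² of the drug by genotype interaction term; the add-one correction in the final line is an assumption made here to avoid p values of exactly zero, and the text does not say whether the study applied it.

```python
import random

def max_statistic_p_value(statistic, markers, drug, response,
                          permutations=500, rng=None):
    """Permutation p value for the largest single-marker statistic.

    statistic(drug, genotype, response) -> float (hypothetical callback)
    markers: one genotype list per typed SNP
    """
    rng = rng or random.Random()
    observed = max(statistic(drug, g, response) for g in markers)
    shuffled = list(response)  # copy, so the caller's labels stay untouched
    exceed = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)  # break any genotype-response association
        null_max = max(statistic(drug, g, shuffled) for g in markers)
        if null_max >= observed:
            exceed += 1
    # Add-one correction: an assumption of this sketch, not taken from the text.
    return (exceed + 1) / (permutations + 1)
```

Taking the maximum over markers inside every permutation is what accounts for testing several correlated SNPs at once, so no separate multiple-testing correction is applied to the resulting p value.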
Haplotype analysis was more straightforward, since it did not require maximisation over many single marker scores. In this case, the logistic regression model included haplotypes and drug by haplotype terms, instead of the respective genotype terms. A haplotype variable assumes a value in {0,1,2}, denoting its copy number in the genotype of an individual. Haplotypes are assumed to be resolved by pedigrees or computation (eg Stephens et al. 23 ). Note that the combination of complete LD and the selection of non-redundant SNPs implied that there are exactly M + 1 haplotypes. R = 1,000 simulations were run.
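The copy-number coding of haplotype variables can be illustrated as follows. The sketch assumes phase-resolved genotypes given as pairs of haplotypes (tuples of alleles); the function name and data layout are hypothetical.

```python
def haplotype_dosage(genotype_pairs):
    """Code each distinct haplotype as a copy-number variable in {0, 1, 2}.

    genotype_pairs: list of (hap_a, hap_b), each haplotype a tuple of alleles.
    Returns (haplotypes, dosage), where dosage[i][k] is the number of copies
    of haplotypes[k] carried by individual i.
    """
    haplotypes = sorted({h for pair in genotype_pairs for h in pair})
    index = {h: k for k, h in enumerate(haplotypes)}
    dosage = []
    for hap_a, hap_b in genotype_pairs:
        row = [0] * len(haplotypes)
        row[index[hap_a]] += 1
        row[index[hap_b]] += 1
        dosage.append(row)
    return haplotypes, dosage
```

Because every row sums to 2, the haplotype variables are collinear with the intercept, so a regression on them would normally leave one haplotype out as the reference.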

Type I error
Naturally, power should be compared when the false-positive rates are fixed to be the same across different methods. The statistical tests performed in these simulations were designed to hold the type I error at a constant rate of 5 per cent. To validate the rate of our type I error, simulations were run with GRR equal to 1 - ie f2 was equal to the placebo effect. The proportion of false associations was then recorded for the different tests: direct analysis on the causative effect, the single marker Monte Carlo permutation approach and the haplotype analysis for N = 500 and N = 1,000. The probability of detecting a false association was estimated when the placebo effect was 26 per cent (as in GRR = 3). The results of this validation benchmark are shown in Table 1. Note that the variance in false-positive rates for random SNPs seemed to be higher than that for haplotypes.

Comparison with predictions by existing tools
In order to compare the numbers obtained in this study with a scenario in which there was no placebo effect, power was calculated with the 'Genetic Power Calculator' (GPC) program, 24 for a 'classical' case-control study. The parameters were set as follows:

Results
Table 1. Estimated false-positive rates for the different statistical tests (columns: number of persons; false-positive rates).

We first examined the power under the optimistic assumption of detecting direct association (ie the tested marker is the causative SNP). In Figure 1a, power is plotted as a function of the total number of persons participating in the clinical trial (half placebo-treated and half drug-treated) for different penetrance scenarios. Even for the best penetrance scenario examined (GRR = 3 and placebo effect f0 = 26.6 per cent), more than 500 individuals are required to be included in the clinical trial to reach the standard level of 80 per cent power. This is in sharp contrast to the predictions of the GPC, 24 which are an order of magnitude smaller than the worst penetrance scenario examined (in Figure 1a, compare GPC [plotted in dashed curve] and GRR with
Another observation is that the power curves are sorted according to the penetrance of the response genotype, f2. This may be expected, given that GRR = f2/f0 and that the prevalence of response is a function of f2 and f0. To better evaluate the relative impact of the penetrances f2 and f0 on power, in Figure 1b we plotted the power as a function of these parameters for a fixed number (N = 1,000) of persons per trial.
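As a worked check of these penetrance parameters, the best scenario in Figure 1a (GRR = 3, f0 = 26.6 per cent) can be expanded using the definition of GRR and the additive heterozygote penetrance given in the Methods. The symbol $f_1$ for the heterozygote penetrance is introduced here for illustration only.

$$f_2 = \mathrm{GRR} \times f_0 = 3 \times 0.266 = 0.798, \qquad f_1 = \frac{f_0 + f_2}{2} = \frac{0.266 + 0.798}{2} = 0.532.$$

Under these values, roughly 80 per cent of drug-treated homozygotes for the response allele respond, compared with about 27 per cent of placebo-treated patients of any genotype.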
Fixing each of these penetrance parameters reveals that the power, across its dynamic range, is almost a linear function of the other penetrance. We can observe that for a given value of f2, power decreases approximately linearly as f0 increases. Moreover, for a given value of f0, power increases approximately linearly as a function of f2 at most of the power ranges. In addition, for a given GRR ratio, power is considerably affected by the value of f2. Thus, considering the parameter space defined in our simulations, power for f2 = 0.4, 0.6 or 0.8, and GRR = 2, is 0.39, 0.644 and 0.881, respectively.
Figure 2 presents the effect of different drug/placebo group size ratios on power for the best penetrance scenario in Figure 1a. Figure 2a refers to the scenario where a first clinical trial including drug- and placebo-treated groups has been completed and, in order to enlarge the sample size, drug-only follow-up studies are included in subsequent analyses. We therefore used a fixed number of placebo-treated individuals and increased the size of the cohort of drug-treated patients. Plots of power versus study size for different ratios (1:1, 2:1 or 4:1) between placebo- and drug-treated group sizes are shown. Worst and intermediate penetrance scenarios in Figure 1a were also analysed (data not shown). Increasing the number of drug-treated patients improved power only minimally (usually 10-25 per cent for the first doubling and an additional < 15 per cent for the second). Improvement was largest for the less powered scenario (data not shown). To evaluate the best design for a PGx association study when the number of patients is limited, we calculated power for 1:1, 2:1 and 1:2 drug-/placebo-treated group size ratios for the same GRR/f2/f0 scenarios as in Figure 2a, but this time fixing the total number of patients participating in the clinical trial.
While the best ratio seems to be 1:1, and the worst 2:1, differences are small and often statistically insignificant (Figure 2b).
We next evaluated power for the indirect approach - ie the tested marker is distinct from the response SNP (Figures 3-5). The power curve for analysis, including the causative SNP, is also presented for comparison. In Figure 3, we compared power for two different strategies for selecting the markers to be genotyped, either randomly or by categories (see Methods section for details), examining three penetrance scenarios and two options for the number of markers typed (M = 3 or M = 5). Only for the most empowered setting (Figure 3f) did the 'categories strategy' show a consistent advantage over the 'random strategy'.
Comparing power obtained for the different number M of markers typed on the same simulated datasets yielded similar plots, with enhanced power for M = 5 over M = 3 (Figure 4). This improvement is large for larger study sizes and it is significant (see grey-shaded patches in Figure 4), even for the modest number of performed simulations when the study size is increased.
We used the same datasets (categories strategy) to compare the relative power of haplotype versus single marker analysis (Figure 5). Perhaps surprisingly, straightforward haplotype analysis does not seem to have an advantage over single marker analysis (which seems superior in the scenarios examined in Figures 5b and 5f). Furthermore, none of the power plots in Figures 5a-f indicates statistically significant differences between these analytical approaches.

Discussion
We have shown that the attributes characteristic of a clinical trial, particularly the magnitude of the placebo effect, have unexpected implications on the statistical power of PGx association studies. Our simulation results stand in sharp contrast to the over-optimistic predictions of tools designed primarily for case-control disease association studies 24 and highlight the marked impact that a substantial placebo effect can have on reducing study power. In the absence of analytical tools specifically tailored to calculate power in the PGx context, where gene-environment interactions are integrated, our results can only be compared with tools designed for classical disease association studies. The simulation study presented here shows that even under the most favourable scenario - involving high penetrance conditions - reliable association (80 per cent power) between SNPs in a candidate gene or region and the response to a drug requires the recruitment of an 'optimal number' - N ≈ 500 patients - in a clinical trial, given that the causative SNP is genotyped, and N ≈ 800 patients when five perfectly linked markers are genotyped (Figure 4). Despite the fact that for some results regarding the indirect association approach the standard errors are still large (due to the limited number of simulations performed), a general trend is nevertheless visible. It is hence crucial to take the marked impact of the placebo effect on power into consideration in PGx studies. Our empirical approach allows exploration of a complex array of practical issues of study design, in contrast to previous, theoretical, simplistic studies. 12 Therefore, the results presented here are meant to guide the optimal integration of genotype data into ongoing clinical trials and to define the size of such a trial required for a PGx study.
In practice, once a beneficial effect of a new treatment is clearly demonstrated, patients on placebo treatment are shifted to real therapeutic regimens. Hence, the total size of a given placebo-treated cohort will often remain limited, while the number of drug-treated patients will potentially significantly increase. We report in this study that the optimal study design in the presence of a placebo effect under the models examined comprises an equal number of drug- and placebo-treated patients, as is usually the case in Phase III clinical trials. Adding more drug-treated patients, even four times as many, increases power only mildly. This is in sharp contrast to the more classical case-control studies aimed at the elucidation of the aetiology of common diseases, where the number of affected cases is the limiting variable and where significant gains in power could be obtained by increasing the size of the control group. 9 We speculate that the rationale for this differential impact of relative cohort sizes is that in PGx it is essential to evaluate the penetrance for the non-causative genotype (f0), which is negligible in disease susceptibility, and therefore the number of placebo-treated individuals becomes a tighter bottleneck.
A further potential improvement for the study design is an educated selection of markers. Ideally, markers need to be chosen in such a manner as to improve the chances of matching the causative allele frequency. 8,9 Yet the latter is unknown (ie whether common, as proposed under the 'common-disease, common-variant hypothesis', 25 or less frequent, as also advocated 26 ). Even though detailed haplotype maps 27 are well underway, which may eventually allow SNP selection based on phylogenetic analysis 28 or haplotype blocks, 29 until such data are understood, one is still restricted to choosing markers from a modest set of validated SNPs, often with allele frequencies being the only additional data available. In this study, we spread marker frequencies over the possible range of informative alleles (> 5 per cent or > 10 per cent). We compared this strategy with that of choosing markers randomly. Surprisingly, little difference in power is reported, if at all. One possible explanation might be that redundant markers are not the major source of power loss when only a small set of markers is used, as these SNPs are likely to fall in different allele frequency categories by chance. Yet our results suggest that power is greatly increased if five markers (M = 5) are typed instead of three (M = 3) (Figure 4), as with case-control association studies. This is likely to stem from the increased chances, as M gets larger, of hitting a marker allele which is in phase with the response allele. Since the number of individuals participating in a clinical trial is limited, increasing the number of genotyped markers may be the strategy of choice, and the only feature controlled by study designers, for improving the power of a PGx association study.
In this study, we also considered the option of improving power by a higher-level analysis of the genotypic data. Our simulations extend earlier results in a complex-trait context 10,29 to the PGx framework, regarding similarity of power in analysis based on haplotypes versus single markers. More sophisticated analysis of haplotypes, exploiting their cladistic structures, may, however, be more advantageous in PGx than in other areas, 30,31 yet the impacts of a departure from the infinite site model (an assumption implicit in our coalescent simulation) and of homoplasy remain to be calibrated. These results place another pin on the map of the literature on haplotype versus single marker analyses, each method having its own advantages. 10,29,32,33
The frequency of the response allele is an important determinant of the power of association studies. 8,29 Since this aspect of association studies has been extensively analysed, however, we avoid handling this issue, relying instead on existing analysis.
Simulation assumptions in this study consider a very basic genetic model: an equilibrium population with only mutation and random genetic drift modifying a non-recombinant haplotype block containing the candidate gene under study. Real life is far more complex. Nonetheless, this model is already sufficient to indicate the general trends of the factors that may confound PGx studies. While this simple model does not accurately reflect samples drawn from human populations, we consider it preferable to more assumptive, but often still controversial, models. Incorporating other factors, such as recombination, gene conversion, recurrent mutations or demographic expansion, into the coalescent model is likely to deteriorate the power estimated in the present study. It should be noted that we make implicit assumptions in the manner in which simulations are laid out. First, the response allele is assumed to be the ancestral, usually more common, one. This assumption is rationalised by our focus on drugs that, by default, do evoke a response, by contrast with long-shot treatments whose success is the exception and which require separate analysis. Furthermore, the range of minor allele frequencies that are examined in this work may bias our findings. The simulation parameters analysed implicitly focus this work on more common SNPs, more akin to the common-disease, common-variant scenario. Other excluded factors relevant specifically to a PGx power study - such as multiple drug doses, quantitative or categorical outcomes instead of a binary response, different models for placebo effect, allelic heterogeneity, epistatic interactions and genotyping errors - all motivate further research. Lastly, studies of adverse drug effects, which are not examined in the current study, may require further research involving this particular design.
The interest of large pharmaceutical companies in PGx studies, the strong possibility that new drugs will be required to be evaluated for PGx by the Food and Drug Administration and the public demand for more personalised medicines are likely to increase the number of PGx studies in the near future. To increase the likelihood of obtaining significant results, studies need to be designed to take into consideration the parameters that affect power estimation. The present study implies that simple transpositions of conventional case-control models and power evaluations to PGx are not straightforward and require separate consideration. While statistical power in PGx is affected by some parameters, as with disease susceptibility studies, the particularities of a study design that is based on a clinical trial change the set of controllable parameters and transform the landscape of success probabilities. The follow-ups suggested above are expected to further refine the outline characteristics of statistical power in PGx studies of drug response.