
Genetic association studies in cancer: Good, bad or no longer ugly?


For some time, investigators have appreciated that genetic association studies in cancer are complex because of the multi-stage process of cancer and the daunting challenge of analysing genetic variants in population and family studies. Because of recent technological advances and annotation of common genetic variation in the human genome, it is now possible for investigators to study genetic variation and cancer risk in many different settings. While these studies hold great promise for unravelling multiple genetic risk factors that contribute to the set of complex diseases called cancer, it is also imperative that study design and methods of interpretation be carefully considered. Replication of results in sufficiently large, well-powered studies is critical if genetic variation is to realise the promise of personalised medicine -- namely, using genetic data to individualise medical decisions. In this regard, the plausibility of validated genetic variants can only be realised by the study of gene-gene and gene-environment interactions. The genetic association study in cancer has come a long way from the days of restriction fragment length polymorphisms, and now promises to scan an entire genome 'agnostically' in search of genetic markers for a disease or outcome. Nevertheless, the application and interpretation of these studies should be conducted cautiously.


The promise of analysing common germ-line genetic variation and cancer risk has been accelerated by knowledge gained from annotating the draft sequence of the human genome. Genetic variation in different populations can now be used to search for genetic markers that associate with cancer risk, therapeutic response and outcome. This new paradigm, the study of complex diseases, such as cancer, by the analysis of common genetic variation, represents the first step in surveying the genome comprehensively. It is also known that the differences between individual human genomes include other types of variation, such as microsatellite markers, insertions and deletions (from a single base to large regions of thousands of bases) and copy number variation; nevertheless, the first large-scale maps have been generated for single nucleotide polymorphisms (SNPs) [1, 2]. Since most common SNPs (with a minor allele frequency greater than 5 per cent in a studied population) are silent and have no apparent function, testing for SNPs is currently directed at identifying markers of disease risk or outcome [3-5]. There is a subset of SNPs, however, that have functional consequences, which can result in a subtle change in gene function, such as alteration of a transcription factor binding site in the promoter of a gene or in the coding sequence of a gene product.

One of the first major steps towards identifying the common SNPs for study was the establishment of the International HapMap Project, which has developed a fine-scale haplotype map of the human genome [6]. This project has genotyped more than 2.6 million SNPs in three distinct continental populations. In parallel, other initiatives have begun to sequence genes of great biological interest in search of common and uncommon SNPs. Sequence verification, while slower and more costly, has provided important insights into the spectrum of common and uncommon single-base nucleotide substitutions in the genome. For example, the National Cancer Institute SNP500 Cancer project is validating SNPs in genes implicated in cancer biology [7], and the National Heart, Lung and Blood Institute's Seattle SNPs project has focused on candidate genes and pathways that underlie the inflammatory response.

Genotyping technology has advanced significantly and it is now possible to genotype hundreds of thousands of SNPs in accurate, high-throughput platforms at lower prices [8, 9]. In fact, there are commercial products available for interrogation of common genetic variation across the 'whole genome', utilising a strategy of surrogacy testing. Based on the HapMap Phase 2 data [6], it is possible to take advantage of linkage disequilibrium across the genome by choosing a set of tagging SNPs as markers of genetic variation across the human genome. It is estimated that with this approach, at least 500,000 SNPs would be required to survey common genetic variation [10, 11]. This significant expansion of knowledge of normal human genetic variation, together with technical advances, has created an opportunity to interrogate the genetic basis of cancer risk, response to therapy and outcome. Nevertheless, there are many issues in study design and analysis that must be considered carefully.
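The surrogacy strategy described above rests on a pairwise measure of linkage disequilibrium, usually the squared correlation r² between alleles at two SNPs. The following is a minimal sketch of that calculation, assuming phased chromosomes coded 0/1 at each SNP; the function name `ld_r2` and the haplotype counts are hypothetical and are not drawn from HapMap data.

```python
def ld_r2(haplotypes):
    """Return r^2 between two biallelic SNPs.

    haplotypes: list of (a, b) tuples, one per phased chromosome,
    where a and b are 0/1 allele codes at SNP A and SNP B.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n   # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n   # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


# Hypothetical phased chromosomes (illustrative only).
perfect = [(1, 1)] * 30 + [(0, 0)] * 70
partial = [(1, 1)] * 25 + [(1, 0)] * 5 + [(0, 1)] * 5 + [(0, 0)] * 65

print(round(ld_r2(perfect), 3))   # alleles always travel together
print(round(ld_r2(partial), 3))   # some recombinant haplotypes lower r^2
```

When r² between a genotyped SNP and an untyped SNP is close to 1, the genotyped SNP acts as a near-complete surrogate; as r² falls, more subjects are needed to detect the same underlying effect through the marker.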

The complexities of genetic association studies in cancer

The study of genetic variation and its contribution to cancer risk is a daunting undertaking because of the need to combine large population-based studies with dense genetic analyses. Figure 1 shows many of the steps to consider in designing and interpreting a genetic association study in cancer.

Figure 1

The steps of a genetic association study in cancer. Issues related to each step are noted in the 'staircase'. If the end goal of an association study is personalised medicine, careful planning and analysis is crucial.

Although the complexity of cancer as a disease has been described by others, the interaction between genes and the environment has not yet been explored in detail [12, 13]. In any one type of cancer, there are often significant differences in age of onset, rapidity of tumour growth, presence of metastases, pathological appearance, gene expression patterns, somatic genetic changes, response to therapy and familial risk. Thus, the task of searching for common factors that associate with genetic markers has to carefully consider well-designed studies that address specific hypotheses.

Cancer genetics

Studies of familial cancer have provided great insights into cancer biology by mapping rare familial mutations that have been subsequently evaluated in the laboratory, thus adding plausibility to the observed disruption in function due to a mutation in one or more genes. These observations have also led to insights in sporadic cancers. In this regard, studies in rare paediatric cancers have yielded important insights. For example, the RB gene was the first tumour suppressor gene identified through a genetic association study [14]. Knudson's original description of the inheritance of retinoblastoma became the foundation of an excellent understanding of the role that RB plays as a tumour suppressor and transcriptional regulator [15]. Another such example is the Li-Fraumeni syndrome, which is characterised by family pedigrees with high rates of sarcoma and breast cancer, as well as leukaemia, brain tumours and adrenocortical carcinomas [16, 17]. Subsequently, the identification of mutations in the TP53 gene in a majority of, but not all, patients with Li-Fraumeni syndrome led directly to an understanding of TP53 and its role as a critical transcription factor in normal cell growth, apoptosis and DNA repair [18-20].

The identification of familial breast cancer pedigrees through careful epidemiological study identified the BRCA1 and BRCA2 genes [21, 22]; in turn, follow-up studies have generated important insights into the function of these genes in DNA repair. Mutations in these genes in family pedigrees are highly penetrant and are associated with a significant risk for breast and ovarian cancers. Common genetic variation in BRCA1 and BRCA2 also appears to contribute to the risk for sporadic breast cancer, albeit with a substantially smaller effect. For example, genetic variation in BRCA2 was shown to result in an increased risk for sporadic breast cancer in the Multiethnic Cohort (MEC) [23]. Specifically, a single SNP in intron 24 was associated with a two-fold increased risk for breast cancer. This suggests that, even in the absence of a mutation that could change protein function or regulation, more subtle variants can serve as markers for increased risk for cancer.

SNPs as disease markers

Although the early history of SNP analysis was predicated on choosing candidate SNPs with known functional consequences, currently no functional information is available for the vast majority of SNPs. In fact, it is unlikely that most SNPs have functional consequences [24]. SNPs in certain genomic regions, such as promoters or intron-exon splice sites, could result in significant functional alterations in gene regulation, but the effort to validate this in the laboratory is arduous. It has been suggested by others that, in choosing SNPs for a genetic association study, one should cull from high-priority lists of SNPs with functional implications [25]. This approach has the potential to find the more highly penetrant SNPs in an association study, but is limited because it underutilises SNPs as genetic markers and, in particular, other untested SNPs that could be in linkage disequilibrium with the positive marker SNP. Until recently, many studies focused on non-synonymous SNPs because of potential amino acid changes that could affect protein structure and function. Non-synonymous SNPs contribute to the genetic diversity seen in the immune system [26] and potentially change the structure or function of the protein of interest; however, a large number of non-synonymous SNPs may be conservative and have minimal or no effect on gene function [27, 28].

SNPs that change gene regulation have also been described. Examples include an SNP in the promoter of MDM2, a negative regulator of p53, which was shown to increase the affinity of the transcriptional activator Sp1, resulting in higher levels of MDM2 RNA and protein [29], and synonymous variants in the human dopamine receptor 2 (DRD2) gene, which affect mRNA stability and translation [30]. Other such functional variants have been recently described, especially in pharmacogenomics [31]. It is possible that SNPs that result in subtle changes in gene regulation are of minimal consequence in the short term, but, over the life span of an individual, accumulated changes could be significant. It is quite likely, however, that even when the nuances of gene regulation are fully understood, the majority of SNPs will still serve best as genetic markers of disease.

An understanding of population-specific genetic variation in healthy individuals is critical in choosing SNPs to investigate in a study of cancer risk. It has been well established that the incidence of specific cancers can vary greatly across global populations. While some of this has been ascribed to different environmental factors, it is also plausible that differences in the genetic variation of distinct populations could contribute. In many ways, large association studies in cancer are designed to analyse genetic profiles of common variation that has been shaped by unrelated factors. In this regard, the molecular evolution of SNPs reflects the specific history of populations -- in particular the admixture of different populations over time. This latter issue has been exploited by some in the use of admixture markers to investigate cancers with a disparate incidence between populations [32, 33]. Throughout evolution, humans have been subjected to different selective pressures (eg endemic pathogens or dietary needs), resulting in genetic variants which have been 'fine-tuned' in their ability to fight infection, reproduce and respond to other challenges [27, 34, 35]. This results in genetic differences between different populations around the world. Differences in the origin of groups within a study can be significant enough to generate population stratification and thus add a potential confounding factor in the genetic epidemiology of complex disease [36-38].

Multiple interactions

Other biomarkers and environmental influences which contribute to the multi-factorial nature of cancer, as well as other complex diseases, further complicate the study of genetic association and cancer risk. Gene-gene interactions are also crucial to cancer risk assessment. The recent report from the InterLymph Consortium showed the greatest risk for non-Hodgkin lymphoma to be in individuals homozygous for the TNF-308A allele and carrying at least one IL10-3575A allele (odds ratio [OR] 2.13) [39]. The importance of gene-gene interactions was also demonstrated in a study of gastric cancer and cytokine gene SNPs [40]. Individuals with multiple polymorphisms of interleukin (IL)-1 receptor antagonist, tumour necrosis factor A and IL-10 had the greatest risk for gastric cancer, with ORs of 2.8 for one, 5.4 for two and 27.3 for three or four high-risk genotypes.

Gene-environment interactions add complexity to the interpretation of genetic association studies. One example is the investigation of the contribution of genetic variation in the N-acetyltransferase 2 (NAT2) gene to the risk for specific cancers, especially bladder and lung cancer [41]. In particular, differences in the activity of NAT2 (ie rapid and slow acetylator genotypes) could explain the association between the NAT2 gene and tobacco smoke and subsequent risk for bladder cancer. The slow acetylator phenotype is associated with an increased risk for bladder cancer compared with the rapid acetylator phenotype, especially when combined with tobacco use [42, 43]. Interestingly, the type of tobacco appears to be important; for example, so-called black tobacco is more strongly associated with the observed effect of NAT2 genotypes [42, 43].

Genetic association and other clinical studies often assess only two outcomes: affected or unaffected. This approach is useful in cancer studies because cancer is usually an all-or-none diagnosis at the time of the study. When intermediate precursors or quantitative traits of disease are added to the analysis, however, the complexity significantly increases. Mendelian randomisation is a concept that attempts to bring together independent inheritance of individual traits with environmentally modifiable exposures [44, 45]. By using independent inheritance of traits, it is possible to reduce the confounding in studying exposure-disease associations [46]. Examples include studies of serum cholesterol, cancer risk and the APOE gene [47]; folate, homocysteine, coronary heart disease and the MTHFR gene [44]; and the relationship between alcohol, variation in the ALDH2 gene and oesophageal cancer [48].
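One common way to turn the Mendelian randomisation idea into a number is the ratio (Wald) estimator, which divides the variant's effect on disease by its effect on the exposure. The text above does not specify an estimator, so the sketch below is an illustration under that assumption; the function name and all effect sizes are hypothetical.

```python
import math


def wald_ratio(beta_gene_outcome, beta_gene_exposure):
    """Ratio estimate of the exposure's effect on the outcome.

    beta_gene_outcome:  per-allele log odds ratio of disease.
    beta_gene_exposure: per-allele change in the exposure (in exposure units).
    Valid only if the variant affects disease solely through the exposure.
    """
    if beta_gene_exposure == 0:
        raise ValueError("the variant must be associated with the exposure")
    return beta_gene_outcome / beta_gene_exposure


# Hypothetical inputs: each allele raises the exposure by 0.5 units
# and raises the log odds of disease by 0.10.
log_or_per_unit = wald_ratio(0.10, 0.5)
or_per_unit = math.exp(log_or_per_unit)   # odds ratio per unit of exposure
print(round(log_or_per_unit, 2), round(or_per_unit, 2))
```

Because genotype is assigned at conception, the ratio is less exposed to confounding by lifestyle than a direct exposure-disease comparison, provided the variant has no other route to disease.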

Study design

Subject selection and sample size

In designing a study of genetic variation and cancer risk in a population, there are a number of critical factors to consider, such as sample size, population stratification, allele frequencies of the SNPs of interest, environmental risk factors and phenotype definition. In particular, a careful definition of the cancer phenotype to be studied is crucial. Genetic factors that contribute to low-grade prostate cancer could be different to those that contribute to high-grade prostate cancer. If so, a study in which low- and high-grade diseases are grouped together could miss a potential genetic contribution for one form of the disease [49, 50]. While it may be difficult to ensure a study population that is as homogeneous as possible, it is crucial to limit confounding due to background genetic differences. Differences in genetic variation between ethnic groups have been well described and are due to a combination of evolutionary history, migration and admixture [36, 38, 51]. Efforts to avoid population stratification also need to be taken to provide cases and controls with genetic backgrounds as similar as possible.
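The confounding effect of population stratification can be illustrated with a small numerical sketch. The counts below are hypothetical: in each of two populations the variant has no effect (OR of 1), but the populations differ in carrier frequency and in their share of cases, so pooling them produces a spurious association. The Mantel-Haenszel estimator, used here as one standard stratified analysis, recovers the null result.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a, b = carrier and non-carrier cases;
    c, d = carrier and non-carrier controls."""
    return (a * d) / (b * c)


def mantel_haenszel_or(strata):
    """Mantel-Haenszel pooled OR across strata of (a, b, c, d) tables."""
    num = sum(a * d / (a + b + c + d) for a, b, c, d in strata)
    den = sum(b * c / (a + b + c + d) for a, b, c, d in strata)
    return num / den


# Hypothetical counts (illustrative only).
pop1 = (160, 240, 40, 60)   # 40% carriers; 400 cases, 100 controls
pop2 = (10, 90, 40, 360)    # 10% carriers; 100 cases, 400 controls

# Naive analysis: add the two tables together cell by cell.
pooled = tuple(x + y for x, y in zip(pop1, pop2))

print(round(odds_ratio(*pop1), 2), round(odds_ratio(*pop2), 2))  # within-population ORs
print(round(odds_ratio(*pooled), 2))                             # inflated pooled OR
print(round(mantel_haenszel_or([pop1, pop2]), 2))                # stratified OR
```

Stratified analysis requires that ancestry be known or inferred; matching cases and controls on genetic background at the design stage addresses the same problem earlier.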

To address some of these issues, large cohort studies, such as the MEC [52], are being established to create the large sample sizes needed. One strength of the MEC is that exposure and biomarker data on individuals from five different ethnic groups in Hawaii and California have been collected. This study is an immense resource for genetic epidemiology. Another such study is the National Cancer Institute (NCI)'s Breast and Prostate Cancer Cohort Consortium, consisting of over 5,000 breast cancer and 8,000 prostate cancer cases. The consortium's goal is to study genetic variation in genes in key pathways [53]. The Network of Investigator Networks [54], sponsored by the Human Genome Epidemiology Network, seeks to pool analysis from multiple investigations for critical analysis and to address reproducibility issues [55-57].

SNP choice and interpreting the results

In order fully to understand the results of a genetic association study, all of the study endpoints described above must be considered to design a study with sufficient power to detect a measurable effect. So far, the majority of genetic association studies with common SNPs in cancer have reported modest associations, with ORs typically between 1 and 2. Examples of meta-analyses that found ORs in this range in lung cancer include XPD 751GG (OR 1.27) [58] and CYP1A1 exon 7 polymorphism (OR 1.15) [59], in breast cancer include XRCC3 T241M (OR 1.16) and BRCA2 N372H (OR 1.13) [60] and in gastric cancer include an approximately twofold increased risk for the IL8-251A allele [61-63].
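For readers less familiar with the measure, the sketch below shows how an OR and its 95 per cent confidence interval are obtained from a case-control 2x2 table, using the standard log-OR (Woolf) approximation. The counts are hypothetical and chosen to give an OR in the modest range discussed above; the function name is an assumption for illustration.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate confidence interval from a 2x2 table.

    a, b = carrier and non-carrier cases; c, d = carrier and non-carrier controls.
    z = 1.96 gives a 95% interval. All four cells must be non-zero.
    """
    or_ = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of log(OR)
    low = math.exp(math.log(or_) - z * se_log)
    high = math.exp(math.log(or_) + z * se_log)
    return or_, low, high


# Hypothetical study: 1,000 cases (330 carriers) and 1,000 controls (300 carriers).
or_, low, high = odds_ratio_ci(330, 670, 300, 700)
print(f"OR {or_:.2f} (95% CI {low:.2f}-{high:.2f})")
```

In this made-up example the interval spans 1 even with 1,000 cases and 1,000 controls, which is the practical reason modest ORs demand large studies or pooled analyses.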

These studies illustrate the fact that the likelihood of finding a strong association (ie OR > 2) in a large study of a sporadic cancer is low, even for candidate genes with a strong prior. Since, by definition, SNPs are common genetic variants, individuals with a particular risk allele may never develop disease. Instead, it has become apparent that a large number of variants will each have a small contribution, perhaps evident in a population-attributable risk of 1-2 per cent per SNP. The consequence of searching for alleles with a moderate effect, namely an OR less than 1.8, is that studies have to be large and can, with rare exception, only address high-frequency SNPs (ie SNPs with a minor allele frequency greater than 5 per cent). Moreover, the opportunity to examine gene-environment interactions should be considered as an important reason for conducting a study.
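The dependence of study size on effect size can be made concrete with the standard normal-approximation formula for comparing two proportions. The sketch below is a rough planning calculation under stated assumptions: a simple carrier versus non-carrier comparison, equal numbers of cases and controls, a two-sided alpha of 0.05, 80 per cent power and no correction for multiple testing. The function name and inputs are hypothetical.

```python
import math
from statistics import NormalDist


def cases_needed(odds_ratio, control_freq, alpha=0.05, power=0.8):
    """Approximate number of cases (with an equal number of controls)
    needed to detect the given OR for a carrier frequency in controls."""
    p0 = control_freq
    # Carrier frequency among cases implied by the OR.
    p1 = odds_ratio * p0 / (1 + p0 * (odds_ratio - 1))
    z_a = NormalDist().inv_cdf(1 - alpha / 2)
    z_b = NormalDist().inv_cdf(power)
    p_bar = (p0 + p1) / 2
    numerator = (z_a * math.sqrt(2 * p_bar * (1 - p_bar))
                 + z_b * math.sqrt(p0 * (1 - p0) + p1 * (1 - p1))) ** 2
    return math.ceil(numerator / (p1 - p0) ** 2)


# Compare a modest and a strong effect at a 20% carrier frequency in controls.
for target_or in (1.2, 1.5, 2.0):
    print(target_or, cases_needed(target_or, 0.20))
```

Under these assumptions an OR of 1.2 calls for cases in the low thousands, whereas an OR of 2.0 needs fewer than 200; stricter significance thresholds for multiple testing push the numbers higher still.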

Biological plausibility is a critical step in choosing genes for either a candidate gene or pathway approach. So far, less than 2 per cent of genes have been studied, but with the advent of new tools of whole-genome scans, there is now an opportunity to look across the genome. Still, for many studies, SNPs have to be selected based on knowledge of the pattern of linkage disequilibrium across the gene or chromosomal region. It is fortuitous that genetic association studies have increased rapidly in scope, moving away from a single SNP in a single gene to haplotype-tagging methods for SNP selection in pathways of genes or, in the near future, whole-genome scans of 500,000 or more SNPs per individual. In the end, whole-genome scans will identify markers that will need to be carefully mapped, similar to the approach for candidate gene studies.

One of the key issues in SNP association studies is replication of results. The literature is strewn with false-positive associations and reproducibility issues. One way to address the false-positive association problem is by using the probability of a false-positive report as a means to weight the likelihood that a SNP would be associated with disease based on knowledge of the gene and/or pathway [64]. The concept of false discovery rate (FDR) is an alternative, useful way of correcting for multiple testing comparisons without the stringent penalty stipulated by the Bonferroni correction [65]. The expected proportion of false rejections of the null hypothesis among the total number of rejections is used as a measure of global error. This method has been applied successfully to studies of qualitative [65] and quantitative [66] traits. Due to linkage disequilibrium between SNPs, however, the Bonferroni correction -- which tests each SNP as an individual entity -- may be too stringent, and an FDR approach may be more conducive to multiple testing concerns in genetic association studies.
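The three approaches named above can be compared in a short sketch. The false-positive report probability is written here as alpha(1 - prior) / (alpha(1 - prior) + power x prior), and the FDR procedure shown is the Benjamini-Hochberg step-up rule, which is one widely used FDR method and is assumed here for illustration. The p values and priors are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject each test whose p value is at most alpha / number of tests."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure controlling the FDR at level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0                       # largest rank k with p_(k) <= k * q / m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * q / m:
            k_max = rank
    rejected = [False] * m
    for i in order[:k_max]:         # reject the k_max smallest p values
        rejected[i] = True
    return rejected


def fprp(alpha, power, prior):
    """Probability that a 'significant' finding is a false positive,
    given the prior probability of a true association."""
    return alpha * (1 - prior) / (alpha * (1 - prior) + power * prior)


# Hypothetical p values for ten SNPs.
p_vals = [0.001, 0.008, 0.012, 0.019, 0.3, 0.5, 0.7, 0.9, 0.95, 0.99]
print(sum(bonferroni(p_vals)), sum(benjamini_hochberg(p_vals)))
print(round(fprp(0.05, 0.8, 0.01), 2), round(fprp(0.05, 0.8, 0.25), 2))
```

With these inputs the FDR procedure retains more SNPs than the Bonferroni correction, and the same nominally significant result is far more likely to be a false positive when the prior probability of association is low.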

The whole-genome association study is based on the extremely high-throughput methods of genotyping hundreds of thousands of SNPs in each individual in the study. An advantage of this method is that the extent of genetic variation across the entire human genome can be evaluated at one time, in an 'agnostic manner'; namely without prior knowledge of the putative functional importance of a region. The NCI Cancer Genetic Markers of Susceptibility Strategic Initiative is a programme designed to conduct whole-genome scans in breast and prostate cancer, separately, and make the data available to the public. Built into the study is the availability of nearly 7,000 cases and 7,000 controls for each disease to conduct rapid replication of findings based on an initial scan of 1,200 cases and 1,200 controls per disease, drawn from prospective cohort studies. Over 500,000 SNPs will be analysed per subject. The choice of SNPs is based on tagging bins of SNPs using a pairwise correlation (r² > 0.8) in the North European group in HapMap Phase 2 [11].
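Binning SNPs by pairwise correlation can be sketched as a greedy procedure: repeatedly pick the SNP that captures the most not-yet-tagged SNPs at the chosen r² threshold and place them in one bin. This is a simplified illustration in the spirit of the approach in reference [11], not the procedure actually used by the initiative; the SNP identifiers and r² values are hypothetical.

```python
def greedy_tag_snps(snps, r2, threshold=0.8):
    """Group SNPs into bins, each represented by one tag SNP.

    snps: list of SNP identifiers.
    r2:   dict mapping frozenset({snp_i, snp_j}) to pairwise r^2;
          pairs missing from the dict are treated as r^2 = 0.
    Returns a list of (tag, sorted bin members) in the order bins were formed.
    """
    def tagged_by(candidate, pool):
        return {s for s in pool
                if s == candidate
                or r2.get(frozenset((candidate, s)), 0.0) > threshold}

    untagged = set(snps)
    bins = []
    while untagged:
        # Pick the untagged SNP that captures the most untagged SNPs;
        # ties go to the SNP listed first in `snps`.
        best = max((s for s in snps if s in untagged),
                   key=lambda s: len(tagged_by(s, untagged)))
        members = tagged_by(best, untagged)
        bins.append((best, sorted(members)))
        untagged -= members
    return bins


# Hypothetical SNPs and pairwise r^2 values.
snp_ids = ["rsA", "rsB", "rsC", "rsD", "rsE"]
pairwise = {
    frozenset(("rsA", "rsB")): 0.95,
    frozenset(("rsA", "rsC")): 0.90,
    frozenset(("rsB", "rsC")): 0.85,
    frozenset(("rsD", "rsE")): 0.60,
}
print(greedy_tag_snps(snp_ids, pairwise))
```

Here three correlated SNPs collapse into one bin with a single tag, while the two SNPs with r² of 0.6 each need their own assay, which is how a threshold of 0.8 translates into the number of SNPs that must be genotyped.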


As mentioned above, one of the most challenging aspects of genetic association studies in cancer is replication of study results. This is essential for a more thorough understanding of biological mechanisms and the development of preventive or treatment strategies. Many studies resulting in a possible association of a particular genetic variant with an increased risk of cancer have failed to be reproduced. Some of the reasons for this include small study size, population stratification, gene-environment interactions, linkage disequilibrium around the variant studied and other intrinsic study biases. For example, in an analysis of 201 studies of complex disease of 25 different associations, Lohmueller et al. found evidence for replication in just less than half of the studies [67]. Another review of genetic association studies in complex diseases also showed low reproducibility [68].

Meta-analyses and large investigator networks are crucial to address these issues. Recent meta-analyses have shown reproducibility of both positive and negative associations. These include a null association of GSTM1 deficiency in breast cancer [69-71], prostate cancer [72] and colorectal cancer [73], but positive associations of GSTM1 in leukaemia [74] and bladder cancer [75]. Meta-analyses have confirmed positive associations of IGF1 promoter [CA]n repeats in breast cancer [76], the NAT2 slow-acetylator phenotype in bladder cancer [75] and polymorphisms in DNA-repair genes in breast cancer [60, 77]. The InterLymph Consortium investigated SNPs in key immune pathway genes, TNF and IL10, in non-Hodgkin's lymphoma (NHL) and showed an increased risk for NHL in TNF-308A and IL10-3575A allele carriers [39].
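The way a meta-analysis sharpens a modest association can be sketched with inverse-variance, fixed-effect pooling of log ORs. This is the simplest pooling model and is assumed here for illustration; the cited meta-analyses may have used other models, such as random effects. The three study results are hypothetical.

```python
import math


def pooled_odds_ratio(studies, z=1.96):
    """Fixed-effect pooled OR from (odds_ratio, ci_low, ci_high) tuples.

    Each study's standard error is recovered from its 95% CI on the log scale,
    and studies are weighted by the inverse of their variance.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for or_, low, high in studies:
        se = (math.log(high) - math.log(low)) / (2 * z)
        weight = 1 / se ** 2
        total_weight += weight
        weighted_sum += weight * math.log(or_)
    pooled_log = weighted_sum / total_weight
    pooled_se = math.sqrt(1 / total_weight)
    return (math.exp(pooled_log),
            math.exp(pooled_log - z * pooled_se),
            math.exp(pooled_log + z * pooled_se))


# Hypothetical studies, each individually inconclusive (CI spans 1).
example_studies = [(1.30, 0.95, 1.78), (1.15, 0.90, 1.47), (1.25, 0.98, 1.59)]
pooled, pooled_low, pooled_high = pooled_odds_ratio(example_studies)
print(f"pooled OR {pooled:.2f} (95% CI {pooled_low:.2f}-{pooled_high:.2f})")
```

In this made-up example no single study excludes an OR of 1, but the pooled interval does. The same arithmetic can equally tighten an interval around 1 and so support a null association.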

A hopeful future

Well-designed, well-powered studies of genetic association in cancer hold great promise for advancing knowledge of cancer biology, genetic risk factors for cancer, therapeutic response and outcome. SNPs have the potential to be used as markers of disease risk, even in the absence of understanding the functional implications of the SNP. Studies of mutations in genes such as BRCA1 and TP53 in families have made profound impacts on our understanding of molecular and cellular biology [19, 22]. While SNPs may not be associated with cancer risk to the same degree as a highly penetrant mutation in familial cancer, they will still contribute significantly to an understanding of a pathway or process in cancer biology. SNPs may confer as yet unknown subtle changes in gene function, transcription, intron-exon splicing or protein folding that, in the context of the right environmental exposure and/or in the appropriate genetic background of other variants, could have a significant effect on disease risk or outcome.

The public health implications of genetic association studies in cancer and other complex diseases are just beginning to emerge [44]. An excellent example of this is a study of age-related macular degeneration in which the population-attributable risk of genetic variation in the complement factor H gene is approximately 50 per cent [78-80]. A population-attributable risk for genetic variation in cancer this significant has yet to be described, but it is possible. This, in combination with improved understanding of gene-gene and gene-environment interactions, will provide the basis for early diagnosis, intervention and prevention of cancer. It should be pointed out, however, that the promise of studying genetic variation in cancer cannot be realised without the careful collection and annotation of cases and controls in sufficiently large studies. For the low penetrant SNPs, replication of results will have to be followed by demonstration of plausibility before entering clinical testing.
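Population-attributable risk depends on both how common a variant is and how strongly it raises risk. A minimal sketch using Levin's formula, PAR = p(RR - 1) / (1 + p(RR - 1)), is given below; for a rare disease the OR can stand in for the relative risk. The frequencies and relative risks are hypothetical and are not taken from the cited studies.

```python
def population_attributable_risk(exposure_freq, relative_risk):
    """Levin's formula: fraction of disease in the population
    attributable to the exposure (here, carrying the risk variant)."""
    excess = exposure_freq * (relative_risk - 1)
    return excess / (1 + excess)


# Hypothetical scenarios.
modest = population_attributable_risk(0.10, 1.15)   # uncommon carrier state, weak effect
common = population_attributable_risk(0.30, 1.20)   # common carrier state, weak effect
strong = population_attributable_risk(0.50, 3.00)   # very common carrier state, strong effect
print(round(modest, 3), round(common, 3), round(strong, 3))
```

The first scenario gives a figure in the 1-2 per cent range discussed earlier for typical cancer SNPs, whereas a figure near 50 per cent requires a variant that is both very common and associated with a several-fold increase in risk.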

In conclusion, the tools for looking at common genetic variation are now available. Moreover, the opportunity to sequence large portions of the genome in many cases and controls is on the horizon. The genetic opportunities will best be realised when studies that include outcome and co-variates have been carried out, especially those that reflect the environmental contributions to cancer.


This is a US Government work, and, as such, is in the public domain of the United States of America.


  1. 1.

    Brookes AJ: The essence of SNPs. Gene. 1999, 234: 177-186. 10.1016/S0378-1119(99)00219-X.

    CAS  Article  PubMed  Google Scholar 

  2. 2.

    Kong A, Gudbjartsson DF, Sainz J, et al: A high-resolution recombination map of the human genome. Nat Genet. 2002, 31: 241-247.

    CAS  PubMed  Google Scholar 

  3. 3.

    Collins A, Lonjou C, Morton NE: Genetic epidemiology of single-nucleotide polymorphisms. Proc Natl Acad Sci USA. 1999, 96: 15173-15177. 10.1073/pnas.96.26.15173.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  4. 4.

    Cardon LR, Abecasis GR: Using haplotype blocks to map human complex trait loci. Trends Genet. 2003, 19: 135-140. 10.1016/S0168-9525(03)00022-2.

    CAS  Article  PubMed  Google Scholar 

  5. 5.

    Martin ER, Lai EH, Gilbert JR, et al: SNPing away at complex diseases: Analysis of single-nucleotide polymorphisms around APOE in Alzheimer disease. Am J Hum Genet. 2000, 67: 383-394. 10.1086/303003.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  6. 6.

    Altshuler D, Brooks LD, Chakravarti A, et al: A Haplotype map of the human genome. Nature. 2005, 437: 1299-1320. 10.1038/nature04226.

    Article  Google Scholar 

  7. 7.

    Packer BR, Yeager M, Burdett L, et al: SNP500Cancer: A public resource for sequence validation, assay development, and frequency analysis for genetic variation in candidate genes. Nucleic Acids Res. 2006, 34: D617-D621. 10.1093/nar/gkj151.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  8. 8.

    Kwok PY, Chen X: Detection of single nucleotide polymorphisms. Cur Issues Mol Biol. 2003, 5: 43-60.

    CAS  Google Scholar 

  9. 9.

    Steemers FJ, Chang W, Lee G, et al: Whole-genome genotyping with the single-base extension assay. Nat Methods. 2006, 3: 31-33. 10.1038/nmeth842.

    CAS  Article  PubMed  Google Scholar 

  10. 10.

    de Bakker PI, Yelensky R, Pe'er I, et al: Efficiency and power in genetic association studies. Nat Genet. 2005, 37: 1217-1223. 10.1038/ng1669.

    CAS  Article  PubMed  Google Scholar 

  11. 11.

    Carlson CS, Eberle MA, Rieder MJ, et al: Selecting a maximally informative set of single-nucleotide polymorphisms for association analyses using linkage disequilibrium. Am J Hum Genet. 2004, 74: 106-120. 10.1086/381000.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  12. 12.

    Balmain A, Gray J, Ponder B: The genetics and genomics of cancer. Nat Genet. 2003, 33 (Suppl): 238-244.

    CAS  Article  PubMed  Google Scholar 

  13. 13.

    Chen YC, Hunter DJ: Molecular epidemiology of cancer. CA Cancer J Clin. 2005, 55: 45-54. 10.3322/canjclin.55.1.45.

    Article  PubMed  Google Scholar 

  14. 14.

    Knudson AG: Mutation and cancer: Statistical study of retinoblastoma. Proc Natl Acad Sci USA. 1971, 68: 820-823. 10.1073/pnas.68.4.820.

    PubMed Central  Article  PubMed  Google Scholar 

  15. 15.

    Zhu L: Tumour suppressor retinoblastoma protein Rb: A transcriptional regulator. Eur J Cancer. 2005, 41: 2415-2427. 10.1016/j.ejca.2005.08.009.

    CAS  Article  PubMed  Google Scholar 

  16. 16.

    Li F, Fraumeni JF: Soft-tissue sarcomas, breast cancer, and other neoplasms. A familial syndrome?. Ann Intern Med. 1969, 71: 747-752.

    CAS  Article  PubMed  Google Scholar 

  17. 17.

    Li FP, Fraumeni JF, Mulvihill JJ, et al: A cancer family syndrome in twenty-four kindreds. Cancer Res. 1988, 48: 5358-5362.

    CAS  PubMed  Google Scholar 

  18. 18.

    Sengupta S, Harris CC: p53: Traffic cop at the crossroads of DNA repair and recombination. Nat Rev Mol Cell Biol. 2005, 6: 44-55. 10.1038/nrm1546.

    CAS  Article  PubMed  Google Scholar 

  19. 19.

    Artandi SE, Attardi LD: Pathways connecting telomeres and p53 in senescence, apoptosis, and cancer. Biochem Biophys Res Commun. 2005, 331: 881-890. 10.1016/j.bbrc.2005.03.211.

    CAS  Article  PubMed  Google Scholar 

  20. 20.

    Vogelstein B, Kinzler KW: Cancer genes and the pathways they control. Nat Med. 2004, 10: 789-799. 10.1038/nm1087.

    CAS  Article  PubMed  Google Scholar 

  21. 21.

    Hopper JL: Genetic epidemiology of female breast cancer. Semin Cancer Biol. 2001, 11: 367-374. 10.1006/scbi.2001.0392.

    CAS  Article  PubMed  Google Scholar 

  22. 22.

    Yoshida K, Miki Y: Role of BRCA1 and BRCA2 as regulators of DNA repair, transcription, and cell cycle in response to DNA damage. Cancer Sci. 2004, 95: 866-871. 10.1111/j.1349-7006.2004.tb02195.x.

    CAS  Article  PubMed  Google Scholar 

  23. 23.

    Freedman ML, Penney KL, Stram DO, et al: Common variation in BRCA2 and breast cancer risk: A haplotype-based analysis in the Multiethnic Cohort. Hum Mol Genet. 2004, 13: 2431-2441. 10.1093/hmg/ddh270.

    CAS  Article  PubMed  Google Scholar 

  24. 24.

    Chanock S: Candidate genes and single nucleotide polymorphisms (SNPs) in the study of human disease. Dis Markers. 2001, 17: 89-98.

    PubMed Central  CAS  Article  PubMed  Google Scholar 

  25. 25.

    Tabor HK, Risch NJ, Myers RM: Opinion: Candidate-gene approaches for studying complex genetic traits: Practical considerations. Nat Rev Genet. 2002, 3: 391-397.

    CAS  Article  PubMed  Google Scholar 

  26. 26.

    Hughes AL, Packer B, Welch R, et al: High level of functional polymorphism indicates a unique role of natural selection at human immune system loci. Immunogenetics. 2005, 57: 821-827. 10.1007/s00251-005-0052-7.

    CAS  Article  PubMed  Google Scholar 

  27. 27.

    Fredman D, Sawyer SL, Stromqvist L, et al: Nonsynonymous SNPs: Validation characteristics, derived allele frequency patterns, and suggestive evidence for natural selection. Hum Mutat. 2006, 27: 173-186. 10.1002/humu.20289.

    CAS  Article  PubMed  Google Scholar 

  28. 28.

    Johnson MM, Houck J, Chen C: Screening for deleterious nonsynonymous single-nucleotide polymorphisms in genes involved in steroid hormone metabolism and response. Cancer Epidemiol Biomarkers Prev. 2005, 14: 1326-1329. 10.1158/1055-9965.EPI-04-0815.

    CAS  Article  PubMed  Google Scholar 

  29. 29.

    Bond GL, Hu W, Bond EE, et al: A single nucleotide polymorphism in the MDM2 promoter attenuates the p53 tumor suppressor pathway and accelerates tumor formation in humans. Cell. 2004, 119: 591-602. 10.1016/j.cell.2004.11.022.

    CAS  Article  PubMed  Google Scholar 

  30. 30.

    Duan J, Wainwright MS, Comeron JM, et al: Synonymous mutations in the human dopamine receptor D2 (DRD2) affect mRNA stability and synthesis of the receptor. Hum Mol Genet. 2003, 12: 205-216. 10.1093/hmg/ddg055.

  31.

    Watters JW, McLeod HL: Cancer pharmacogenomics: Current and future applications. Biochim Biophys Acta. 2003, 1603: 99-111.

  32.

    Reich D, Patterson N: Will admixture mapping work to find disease genes? Philos Trans R Soc Lond B Biol Sci. 2005, 360: 1605-1607. 10.1098/rstb.2005.1691.

  33.

    Patterson N, Hattangadi N, Lane B, et al: Methods for high-density admixture mapping of disease genes. Am J Hum Genet. 2004, 74: 979-1000. 10.1086/420871.

  34.

    Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4: e72. 10.1371/journal.pbio.0040072.

  35.

    Akey JM, Eberle MA, Rieder MJ, et al: Population history and natural selection shape patterns of genetic variation in 132 genes. PLoS Biol. 2004, 2: e286. 10.1371/journal.pbio.0020286.

  36.

    Ioannidis JP, Ntzani EE, Trikalinos TA: "Racial" differences in genetic effects for complex diseases. Nat Genet. 2004, 36: 1312-1318. 10.1038/ng1474.

  37.

    Cardon LR, Palmer LJ: Population stratification and spurious allelic association. Lancet. 2003, 361: 598-604. 10.1016/S0140-6736(03)12520-2.

  38.

    Marchini J, Cardon LR, Phillips MS, Donnelly P: The effects of human population structure on large genetic association studies. Nat Genet. 2004, 36: 512-517. 10.1038/ng1337.

  39.

    Rothman N, Skibola CF, Wang SS, et al: Genetic variation in TNF and IL10 and risk of non-Hodgkin lymphoma: A report from the InterLymph Consortium. Lancet Oncol. 2006, 7: 27-38. 10.1016/S1470-2045(05)70434-4.

  40.

    El Omar EM, Rabkin CS, Gammon MD, et al: Increased risk of noncardia gastric cancer associated with proinflammatory cytokine gene polymorphisms. Gastroenterology. 2003, 124: 1193-1201. 10.1016/S0016-5085(03)00157-4.

  41.

    Hein DW, Doll MA, Fretland AJ, et al: Molecular genetics and epidemiology of the NAT1 and NAT2 acetylation polymorphisms. Cancer Epidemiol Biomarkers Prev. 2000, 9: 29-42.

  42.

    Golka K, Prior V, Blaszkewicz M, Bolt HM: The enhanced bladder cancer susceptibility of NAT2 slow acetylators towards aromatic amines: A review considering ethnic differences. Toxicol Lett. 2002, 128: 229-241. 10.1016/S0378-4274(01)00544-6.

  43.

    Hein DW: Molecular genetics and function of NAT1 and NAT2: Role in aromatic amine metabolism and carcinogenesis. Mutat Res. 2002, 506-507: 65-77.

  44.

    Davey Smith G, Ebrahim S: "Mendelian randomization": Can genetic epidemiology contribute to understanding environmental determinants of disease? Int J Epidemiol. 2003, 32: 1-22. 10.1093/ije/dyg070.

  45.

    Davey Smith G, Ebrahim S, Lewis S, et al: Genetic epidemiology and public health: Hope, hype, and future prospects. Lancet. 2005, 366: 1484-1498. 10.1016/S0140-6736(05)67601-5.

  46.

    Davey Smith G, Ebrahim S: Mendelian randomization: Prospects, potentials, and limitations. Int J Epidemiol. 2004, 33: 30-42. 10.1093/ije/dyh132.

  47.

    Katan MB: Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet. 1986, 1 (8479): 507-508.

  48.

    Lewis SJ, Davey Smith G: Alcohol, ALDH2, and esophageal cancer: A meta-analysis which illustrates the potentials and limitations of a Mendelian randomization approach. Cancer Epidemiol Biomarkers Prev. 2005, 14: 1967-1971. 10.1158/1055-9965.EPI-05-0196.

  49.

    Sorlie T, Tibshirani R, Parker J, et al: Repeated observation of breast tumor subtypes in independent gene expression data sets. Proc Natl Acad Sci USA. 2003, 100: 8418-8423. 10.1073/pnas.0932692100.

  50.

    Staudt LM: Molecular diagnosis of the hematologic cancers. N Engl J Med. 2003, 348: 1777-1785. 10.1056/NEJMra020067.

  51.

    Carlson CS, Eberle MA, Rieder MJ, et al: Additional SNPs and linkage-disequilibrium analyses are necessary for whole-genome association studies in humans. Nat Genet. 2003, 33: 518-521. 10.1038/ng1128.

  52.

    Kolonel LN, Altshuler D, Henderson BE: The multiethnic cohort study: Exploring genes, lifestyle and cancer risk. Nat Rev Cancer. 2004, 4: 519-527. 10.1038/nrc1389.

  53.

    Hunter DJ, Riboli E, Haiman CA, et al: A candidate gene approach to searching for low-penetrance breast and prostate cancer genes. Nat Rev Cancer. 2005, 5: 977-985. 10.1038/nrc1754.

  54.

    Ioannidis JP, Gwinn M, Little J, et al: A road map for efficient and reliable human genome epidemiology. Nat Genet. 2006, 38: 3-5. 10.1038/ng0106-3.

  55.

    Benhamou S, Sarasin A: ERCC2/XPD gene polymorphisms and lung cancer: A HuGE review. Am J Epidemiol. 2005, 161: 1-14. 10.1093/aje/kwi018.

  56.

    Hung RJ, Hall J, Brennan P, Boffetta P: Genetic polymorphisms in the base excision repair pathway and cancer risk: A HuGE review. Am J Epidemiol. 2005, 162: 925-942. 10.1093/aje/kwi318.

  57.

    Masson LF, Sharp L, Cotton SC, Little J: Cytochrome P-450 1A1 gene polymorphisms and risk of breast cancer: A HuGE review. Am J Epidemiol. 2005, 161: 901-915. 10.1093/aje/kwi121.

  58.

    Hu Z, Wei Q, Wang X, Shen H: DNA repair gene XPD polymorphism and lung cancer risk: A meta-analysis. Lung Cancer. 2004, 46: 1-10. 10.1016/j.lungcan.2004.03.016.

  59.

    Le Marchand L, Guo C, Benhamou S, et al: Pooled analysis of the CYP1A1 exon 7 polymorphism and lung cancer (United States). Cancer Causes Control. 2003, 14: 339-346. 10.1023/A:1023956201228.

  60.

    Garcia-Closas M, Egan KM, Newcomb PA, et al: Polymorphisms in DNA double-strand break repair genes and risk of breast cancer: Two population-based studies in USA and Poland, and meta-analyses. Hum Genet. 2006, 119: 376-388. 10.1007/s00439-006-0135-z.

  61.

    Savage SA, Abnet CC, Mark SD, et al: Variants of the IL8 and IL8RB genes and risk for gastric cardia adenocarcinoma and esophageal squamous cell carcinoma. Cancer Epidemiol Biomarkers Prev. 2004, 13: 2251-2257.

  62.

    Taguchi A, Ohmiya N, Shirai K, et al: Interleukin-8 promoter polymorphism increases the risk of atrophic gastritis and gastric cancer in Japan. Cancer Epidemiol Biomarkers Prev. 2005, 14: 2487-2493. 10.1158/1055-9965.EPI-05-0326.

  63.

    Ohyauchi M, Imatani A, Yonechi M, et al: The polymorphism interleukin 8 -251 A/T influences the susceptibility of Helicobacter pylori related gastric diseases in the Japanese population. Gut. 2005, 54: 330-335. 10.1136/gut.2003.033050.

  64.

    Wacholder S, Chanock S, Garcia-Closas M, et al: Assessing the probability that a positive report is false: An approach for molecular epidemiology studies. J Natl Cancer Inst. 2004, 96: 434-442. 10.1093/jnci/djh075.

  65.

    Sabatti C, Service S, Freimer N: False discovery rate in linkage and association genome screens for complex disorders. Genetics. 2003, 164: 829-833.

  66.

    Weller JI, Song JZ, Heyen DW, et al: A new approach to the problem of multiple comparisons in the genetic dissection of complex traits. Genetics. 1998, 150: 1699-1706.

  67.

    Lohmueller KE, Pearce CL, Pike M, et al: Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003, 33: 177-182. 10.1038/ng1071.

  68.

    Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K: A comprehensive review of genetic association studies. Genet Med. 2002, 4: 45-61. 10.1097/00125817-200203000-00002.

  69.

    Egan KM, Cai Q, Shu XO, et al: Genetic polymorphisms in GSTM1, GSTP1, and GSTT1 and the risk for breast cancer: Results from the Shanghai Breast Cancer Study and meta-analysis. Cancer Epidemiol Biomarkers Prev. 2004, 13: 197-204. 10.1158/1055-9965.EPI-03-0294.

  70.

    Sull JW, Ohrr H, Kang DR, Nam CM: Glutathione S-transferase M1 status and breast cancer risk: A meta-analysis. Yonsei Med J. 2004, 45: 683-689.

  71.

    Vogl FD, Taioli E, Maugard C, et al: Glutathione S-transferases M1, T1, and P1 and breast cancer: A pooled analysis. Cancer Epidemiol Biomarkers Prev. 2004, 13: 1473-1479.

  72.

    Ntais C, Polycarpou A, Ioannidis JP: Association of GSTM1, GSTT1, and GSTP1 gene polymorphisms with the risk of prostate cancer: A meta-analysis. Cancer Epidemiol Biomarkers Prev. 2005, 14: 176-181.

  73.

    Smit KM, Gaspari L, Weijenberg MP, et al: Interaction between smoking, GSTM1 deletion and colorectal cancer: Results from the GSEC study. Biomarkers. 2003, 8: 299-310. 10.1080/1354750031000121467.

  74.

    Ye Z, Song H: Glutathione s-transferase polymorphisms (GSTM1, GSTP1 and GSTT1) and the risk of acute leukaemia: A systematic review and meta-analysis. Eur J Cancer. 2005, 41: 980-989. 10.1016/j.ejca.2005.01.014.

  75.

    Garcia-Closas M, Malats N, Silverman D, et al: NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: Results from the Spanish Bladder Cancer Study and meta-analyses. Lancet. 2005, 366: 649-659. 10.1016/S0140-6736(05)67137-1.

  76.

    Wen W, Gao YT, Shu XO, et al: Insulin-like growth factor-I gene polymorphism and breast cancer risk in Chinese women. Int J Cancer. 2005, 113: 307-311. 10.1002/ijc.20571.

  77.

    Zhang Y, Newcomb PA, Egan KM, et al: Genetic polymorphisms in base-excision repair pathway genes and risk of breast cancer. Cancer Epidemiol Biomarkers Prev. 2006, 15: 353-358. 10.1158/1055-9965.EPI-05-0653.

  78.

    Klein RJ, Zeiss C, Chew EY, et al: Complement factor H polymorphism in age-related macular degeneration. Science. 2005, 308: 385-389. 10.1126/science.1109557.

  79.

    Edwards AO, Ritter R III, Abel KJ, et al: Complement factor H polymorphism and age-related macular degeneration. Science. 2005, 308: 421-424. 10.1126/science.1110189.

  80.

    Haines JL, Hauser MA, Schmidt S, et al: Complement factor H variant increases the risk of age-related macular degeneration. Science. 2005, 308: 419-421. 10.1126/science.1110359.

Author information



Corresponding author

Correspondence to Stephen J Chanock.

About this article

Cite this article

Savage, S.A., Chanock, S.J. Genetic association studies in cancer: Good, bad or no longer ugly? Hum Genomics 2, 415 (2006).


  • single nucleotide polymorphism
  • haplotype
  • association study
  • genome-wide scan
  • linkage disequilibrium
  • cancer risk