From DNA to RNA to disease and back: The 'central dogma' of regulatory disease variation
© Henry Stewart Publications 2006
Received: 25 April 2006
Accepted: 25 April 2006
Published: 1 June 2006
Much of the focus of human disease genetics is directed towards identifying nucleotide variants that contribute to disease phenotypes. This is a complex problem, often involving contributions from multiple loci and their interactions, as well as effects due to environmental factors. Although some diseases with a genetic basis are caused by nucleotide changes that alter an amino acid sequence, in other cases, disease risk is associated with altered gene regulation. This paper focuses on how studies of gene expression variation might complement disease studies and provide crucial links between genotype and phenotype.
Keywords: gene expression, human disease, linkage mapping, association mapping
Understanding the causes of human disease is one of the most fundamental goals of modern medicine. Individuals differ with respect to disease susceptibility, disease progression and effectiveness of treatment. Identifying the factors contributing to these differences, and elucidating their interactions as they contribute to aspects of disease phenotype, is a precursor to improved prevention, detection and treatment of disease.
Much of the understanding of human disease derives from the study of those diseases that segregate in families in a Mendelian fashion, where the causative variants and the genes in which they reside have been identified through classical family linkage approaches and through studies in large pedigrees and in isolated populations based on founder effects. The vast majority of common diseases exhibit a more complex mode of inheritance, however, aggregating in families but rarely exhibiting Mendelian inheritance. Examples of diseases of this type include diabetes, obesity, schizophrenia and asthma. Understanding of these 'complex' diseases is improving, although still limited, but it is clear that genetic variation plays an important role in susceptibility to disease, for example in autoimmune and infectious diseases. Most complex disease is thought to be caused by the combined effect of genetic variants at a few loci or multiple loci, each with only modest functional effects on susceptibility. Additional roles are played by environmental factors and their interactions. Mapping the genomic regions contributing to disease creates new directions for disease research and is an important step towards improving human health.
Approaches to identifying the genes involved in complex disease can be generally grouped into two categories: candidate gene studies and linkage/association studies. Candidate gene studies use knowledge about the biology of a disease, and about genes in physiologically or biochemically relevant pathways, and attempt to correlate genetic variation at these 'candidate genes' with disease phenotype. Unfortunately, for most diseases, this type of information is unavailable or too incomplete to be widely useful, and including poorly supported candidate genes in an analysis is more likely to add noise than to reduce the search space for the disease. Genome-wide linkage studies and association analyses serve as alternative approaches to surveying the contribution to disease of genetic variants located anywhere in the genome. The genome-wide aspect means that these studies do not require any a priori hypothesis that a particular region is involved, although predictions about the potential effect of specific variants (eg non-synonymous single nucleotide polymorphisms [SNPs]) can be incorporated in the models. In this respect, these approaches are unbiased. Family-based linkage studies entail identifying genetic variants in families that co-segregate with disease more often than would be expected by chance. In general, linkage studies have achieved limited success in identifying genomic regions involved in complex disease, in part because they are underpowered to detect moderate genetic effects. Furthermore, because identifying a region associated with the disease or trait requires identifying the alleles that segregate with the disease in families, which in turn depends on recombination within those families, it can be difficult to narrow a region exhibiting significant linkage.
An alternative methodology is to perform association analysis, which looks for correlation of genetic variants with aspects of phenotype, but does not require a pedigree structure for the individuals. Association analyses are more powerful for the detection of common disease alleles with small to modest effects [4, 5], and increasingly are being used successfully in studies to identify genes contributing to disease [6, 7].
Genes with non-coding variants affecting disease
HIV-1 progression and transmission
Cardiovascular disease risk
Attention deficit hyperactivity disorder
Rheumatoid arthritis and autoimmune disease
Type I diabetes
Type II diabetes
Breast cancer progression
Coronary artery disease
Inflammatory bowel disease
Colorectal cancer, breast cancer, hepatocellular carcinoma
Resources for genome-wide analysis
An efficient approach to the study of human disease benefits from the use of shared resources. For example, in order to perform genome-wide linkage or association analyses, suitable DNA markers are required. The human genome is estimated to harbour more than 10 million SNPs, present at >1 per cent frequency, and these SNPs are located throughout the genome in regions of coding and non-coding DNA. Publicly available databases of SNP alleles, assays and genotypes are accessible online (eg dbSNP and HapMap). High-throughput genotyping platforms and reductions in genotyping costs now make whole-genome genotyping feasible for large numbers of samples. Gene expression can also be quantified in a high-throughput manner using commercially available microarrays, permitting the detection of small differences in expression levels among samples.
The establishment of cell lines creates resources that can be used by multiple research groups from around the world to survey various cellular phenotypes. With respect to the study of gene expression, it is desirable to establish cell lines from different tissues because gene expression is highly dependent on developmental and cellular context and, indeed, some diseases manifest their phenotypes only in specific tissues. In addition, because establishing a cell line perturbs the cells, there is a case for also studying gene expression in primary tissues, although, clearly, the choice of sample depends on the purpose, stage and feasibility of a study, the sample size required and its availability. Although cell lines are imperfect proxies for the complete set of human tissues, they allow data to be collected on a large scale with respect to sample size and reproducibility and can provide candidates for further study in other samples. Currently, there are relatively few data on gene expression across the diversity of healthy human tissues or across multiple individuals from different populations. These data from healthy individuals will provide important information on the range of naturally occurring gene expression variation and will serve as a baseline against which to compare disease-associated molecular phenotypes.
Statistical issues in genome-wide analysis
Although genome-wide association studies are thought to have more power than family-based linkage studies, they present substantial challenges of statistical interpretation. For example, a simple genome-wide association study may test hundreds of thousands of SNPs for association with a phenotype (or, more typically, multiple phenotypes), and more complicated models allowing for SNP-SNP interactions vastly increase the already large number of statistical tests. With such a large number of tests, the significance threshold must be adjusted to control the number of false-positive associations. Although procedures for multiple test correction exist (for example, Bonferroni correction, false discovery rate [35, 36] and permutations of phenotypes relative to genotypes), it remains unclear which is the best method to apply in this context. It is also not trivial to infer the biological significance of an association from statistical significance, because allele frequencies, variance of the phenotype, density of markers and linkage disequilibrium (LD) can have a tremendous impact on the statistical significance inferred.
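To make two of the correction procedures named above concrete, the following is a minimal sketch of a Bonferroni threshold and of the Benjamini-Hochberg step-up procedure for controlling the false discovery rate. The function names, the example p-values and the figure of 500,000 tests are illustrative assumptions, not values taken from any study discussed here.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test threshold that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def benjamini_hochberg(p_values, q=0.05):
    """Return the indices of tests declared significant at false discovery rate q."""
    m = len(p_values)
    # Rank the tests from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * q.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff_rank = rank
    # Every test ranked at or below the cutoff is rejected.
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # Hypothetical p-values for six SNP-phenotype tests.
    example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.60]
    print(bonferroni_threshold(0.05, 500_000))
    print(benjamini_hochberg(example_p, q=0.05))
```

With hundreds of thousands of tests the Bonferroni threshold becomes very stringent, which is one reason the less conservative false discovery rate is often considered alongside it.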
Human genetic variation is structured into haplotypes, such that alleles at nearby loci often show strong statistical association with one another. Because of this association, known as LD, a large region may contain multiple SNPs exhibiting a significant association with a given phenotype. Although this structure of human genetic variation facilitates association mapping, it can complicate subsequent fine-scale mapping to narrow the associated region and locate the causal variant, as discussed below. Another concern in association studies is the potential for false associations caused by population stratification, so care must be taken to reduce these effects through appropriate experimental design and data analysis [39, 40].
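As an illustration of how the statistical association between alleles at two loci can be quantified, the sketch below computes two standard LD measures, D and r², from counts of the four possible haplotypes at a pair of biallelic SNPs. It assumes that phased haplotype counts are available; the function and argument names are hypothetical.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r_squared) for two biallelic SNPs from haplotype counts.

    n_AB is the number of haplotypes carrying allele A at the first SNP and
    allele B at the second, and so on for the other three combinations.
    """
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total           # frequency of the A-B haplotype
    p_A = (n_AB + n_Ab) / total   # frequency of allele A
    p_B = (n_AB + n_aB) / total   # frequency of allele B
    # D is the excess of the observed haplotype frequency over the frequency
    # expected if the alleles at the two SNPs were independent.
    d = p_AB - p_A * p_B
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    return d, d * d / denom
```

An r² of 1 means that one SNP is a perfect proxy for the other, which is why several SNPs in a region of strong LD can show indistinguishable association signals.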
Generating functional candidates using eQTL mapping
Regions with functional effects on gene expression can be localised through the use of association mapping. Gene expression, or mRNA level, is a quantitative phenotype that can be assayed in multiple individuals. When the same individuals are surveyed for genetic variation at marker loci, for example SNPs, association analysis tests whether variation at each SNP can explain the observed phenotypic variation. The rationale behind this analysis is that markers themselves are either the causal variant or are highly correlated (in LD) with the causal variant.
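A single SNP-expression test of this kind can be sketched as follows. This is only one simple way to set up the analysis, assuming an additive model in which genotypes are coded as the number of copies (0, 1 or 2) of one allele and significance is assessed by permuting expression values relative to genotypes. The function names, the default of 1,000 permutations and the fixed seed are illustrative choices.

```python
import random


def regression_slope_r2(genotypes, expression):
    """Least-squares slope and r^2 of expression regressed on allele count."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_e = sum(expression) / n
    s_gg = sum((g - mean_g) ** 2 for g in genotypes)
    s_ee = sum((e - mean_e) ** 2 for e in expression)
    s_ge = sum((g - mean_g) * (e - mean_e) for g, e in zip(genotypes, expression))
    if s_gg == 0 or s_ee == 0:
        raise ValueError("genotype and expression must both vary in the sample")
    return s_ge / s_gg, s_ge ** 2 / (s_gg * s_ee)


def permutation_p_value(genotypes, expression, n_perm=1000, seed=0):
    """Empirical p-value: how often shuffled data explain as much variance."""
    rng = random.Random(seed)
    _, observed_r2 = regression_slope_r2(genotypes, expression)
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        # Shuffling breaks any genotype-expression relationship.
        rng.shuffle(shuffled)
        _, r2 = regression_slope_r2(genotypes, shuffled)
        if r2 >= observed_r2:
            hits += 1
    # Adding one to numerator and denominator avoids reporting p = 0.
    return (hits + 1) / (n_perm + 1)
```

In a genome-wide setting this test would be repeated for every SNP-gene pair of interest, which is where the multiple-testing issues discussed earlier arise.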
Association mapping of gene expression variation has been successful in many species, including human [42–45], yeast [46–48], mouse [49–52], rat, fish [54, 55] and maize. Together, these studies provide several striking observations related to the nature of functional variation influencing gene expression. First, variation in gene expression levels among individuals is common, and much of that phenotypic variation has a genetic basis. Much of the association signal is located cis- to the gene of interest [45, 52, 56], although trans-acting variants have also been observed. Hotspots of gene regulation (ie regions of the genome influencing expression of several genes) have been observed in some, but not all, studies.
There are several ways in which the study of the regulation of gene expression can enhance disease studies as well as narrow the choice of candidate regions for disease association studies. Where information exists about the contribution of particular genes to a disease phenotype or susceptibility, understanding the regulatory control of those genes may assist in elucidating the complete set of effects. In addition, understanding the regulation of categories of genes, or genes of a particular pathway, may provide targets for further follow-up in disease studies. It may prove more time-efficient and cost-effective, however, to have a list of many potential functional variants located throughout the genome and to test them against a large number of diseases. Whole-genome eQTL studies can provide a list of regions of the genome with functional effects on the expression of known genes (Figure 1a). SNPs located within these regions can then serve as candidates for disease association studies, much in the same way that non-synonymous SNPs are often considered because of their potential functional effect. There are several advantages to this type of targeted approach over a whole-genome scan. First, because the number of SNPs to be genotyped in each individual is reduced, many more individuals can be surveyed in a disease study without vastly increasing costs. Testing fewer markers also eases some of the problems of multiple test correction: less stringent thresholds can be used and variants of smaller effect can be detected. Secondly, any significant associations detected between SNP and disease phenotype provide both a mechanism (gene regulation) and the identity of the affected gene.
Finally, the fact that potential causal regulatory variants were initially discovered in healthy individuals and subsequently have been associated with disease means that such variants are common and are likely to contribute significantly to the disease risk of the population.
The methodology above carries the risk of focusing only on certain types of genomic variants, when much about genome function is still unknown. One way to circumvent this problem is to use commercially available whole-genome SNP genotyping chips in disease studies and to incorporate the data on functional regulatory regions through Bayesian association methods that assign different prior probabilities to the SNPs on the array. Under such a scenario, SNPs located in regions with known functional effects on expression of specific genes, as identified through eQTL studies, would be assigned a higher prior probability of being associated with a phenotype. In addition, one might assign a higher prior probability to SNPs in known promoters, enhancers or transcription factors. Thus, one could focus on the effects of candidate variants without missing other important signals. Another substantial advantage of knowing regulatory variants before performing a genome-wide association study is that one can correlate phenotypes and regulatory networks and utilise such information in the statistical modelling of the disease.
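The effect of annotation-dependent priors can be shown with a minimal sketch, assuming that the evidence for each SNP has already been summarised as a Bayes factor. The prior probabilities below are arbitrary placeholders chosen only to show the mechanics, and the annotation labels and function names are hypothetical.

```python
# Hypothetical prior probabilities of association for each annotation class.
PRIORS = {
    "eqtl_region": 1e-3,           # region with a known effect on expression
    "promoter_or_enhancer": 5e-4,  # annotated regulatory element
    "other": 1e-5,                 # no functional annotation
}


def posterior_probability(bayes_factor, prior):
    """Posterior probability of association from a Bayes factor and a prior."""
    prior_odds = prior / (1 - prior)
    posterior_odds = bayes_factor * prior_odds
    return posterior_odds / (1 + posterior_odds)


def rank_snps(snps):
    """Rank (snp_id, annotation, bayes_factor) tuples by posterior probability."""
    scored = [
        (snp_id, posterior_probability(bf, PRIORS.get(annotation, PRIORS["other"])))
        for snp_id, annotation, bf in snps
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Under this scheme, two SNPs with the same strength of evidence are ranked differently if only one lies in a region with a known regulatory effect, while an unannotated SNP with overwhelming evidence can still rise to the top.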
Supporting genome-wide disease association studies: Narrowing on disease-associated non-coding signals
Genome-wide association studies are now increasing in frequency [6, 57], and although it would have been preferable to have identified all functional regulatory variants in advance, investigators will be faced with the challenge of interpreting some of the strongest association signals. Many of the association studies have a multi-phase design, wherein a fraction of the SNPs with the top statistical significance in the first phase are genotyped in a subsequent phase in a new set of individuals. The statistical exercise must eventually give way to biological interpretation, however, and the identification of the causal variant will be necessary. Although most of the confirmed disease-causing variants are located in coding regions, this observation is due to an ascertainment bias in the ability to predict the potential functional consequences of nucleotide variation. As the human genome is composed of only ~3-5 per cent coding DNA, and studies increasingly attribute function to non-coding DNA, it might be expected that much of the disease-causing variation will be non-coding and that many of the significant peaks in an association analysis will fall in regions devoid of genes.
Several studies illustrate the utility and validity of using gene expression variation for disease fine mapping. Two of these studies have focused on identifying functional nucleotide variation by focusing primarily on the regions surrounding each of a set of genes (cis-), but also considering other regions located trans- to the genes [42, 45]. These studies showed that a large fraction of genes (10-20 per cent) have significant variation that affects their expression in cis, and in some cases in trans. The regulatory variants that affect gene expression variation can be mapped with the same resolution as disease variants in genome-wide association studies because, in both cases, the resolution depends on the LD structure of the human populations. These studies, which allow the identification of regulatory haplotypes, need to be verified before functional experiments can be performed. The most appropriate way to perform a first-pass verification is to test whether allelic imbalance in expression is correlated with heterozygosity in the same SNPs as those that showed genotypic association with gene expression.
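One way such a first-pass check might be set up is sketched below, under the assumption that allele-specific transcript counts are available for each individual at a transcribed marker. Each individual is tested for departure from equal expression of the two alleles with an exact two-sided binomial test, and the proportion showing imbalance is then compared between individuals who are heterozygous and homozygous at the candidate regulatory SNP. The data layout, function names and the 0.05 threshold are illustrative assumptions.

```python
from math import comb


def allelic_imbalance_p_value(ref_count, alt_count):
    """Exact two-sided binomial p-value against equal expression of both alleles."""
    n = ref_count + alt_count
    if n == 0:
        raise ValueError("no allele-specific counts available")
    k = min(ref_count, alt_count)
    # Under the null each transcript comes from either allele with probability
    # 0.5, so the distribution is symmetric and the two-sided p-value is twice
    # the lower tail, capped at 1.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * lower_tail)


def imbalance_by_zygosity(samples, alpha=0.05):
    """Fraction of individuals with significant imbalance, split by zygosity.

    samples holds (is_heterozygous_at_candidate_snp, ref_count, alt_count).
    A group with no individuals is reported as None.
    """
    result = {}
    for group in (True, False):
        p_values = [
            allelic_imbalance_p_value(ref, alt)
            for is_het, ref, alt in samples
            if is_het == group
        ]
        if p_values:
            result[group] = sum(p < alpha for p in p_values) / len(p_values)
        else:
            result[group] = None
    return result
```

If the candidate SNP marks a cis-acting regulatory variant, imbalance would be expected mainly among its heterozygotes; a similar rate of imbalance in both groups would argue against it.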
Even with this information, and even when a causal variant is known to act through gene regulation, identifying the exact DNA variant that causes the regulatory effect and the subsequent increase in disease risk remains a long way off. This is a stage where things become complex for many reasons. For example, although the genome-wide distribution of LD is quite variable, average LD in the human genome extends over large regions, which makes it challenging to fine map a causal variant in many regions. In the best case scenario, associated SNPs would be identified in a region of very low LD, thus reducing the number of potential causal variants to test subsequently. More often, an associated region of approximately 10-20 kilobases will be identified. Although fine mapping in a population with reduced LD (eg African populations) might assist in identifying shorter associated regions, it is at this stage where extensive amounts of information about genome function are crucial. The diversity of methodologies for large-scale interrogation of the human genome for function is increasing; the resulting information will be very important for prioritising which of those associated DNA segments to focus on first.
Interpreting regulatory variation
The identification of the causal variant can benefit from incorporating information about genome function. Many studies to determine functionality within the human genome sequence are now in progress using high-throughput, genome-wide methodologies. The ENCyclopedia Of DNA Elements (ENCODE) project is the best example of this type of study. The aim of this project is to attribute a functional identity to each nucleotide of the human genome. In its pilot phase, 1 per cent of the human genome (44 genomic regions) has been studied extensively for function, interspecies conservation and population genetic variation. The comprehensive analysis of these 44 regions will provide important clues for the pattern and structure of genome function and will allow predictions for the nature of variations behind complex disease and phenotypic variation. This, and other ongoing studies, will offer a first-pass annotation of functional elements in the human genome and will provide the framework for detailed characterisation of functional variation.
If an established and confirmed association of a region with disease and gene expression variation exists, and there is light annotation of the associated region for coding and non-coding elements, it is possible to apply brute force approaches to identify the specific DNA changes that are causal. A recommended strategy is to perform extensive resequencing of potentially functional segments of the region in high and low expressing individuals. The number of individuals required to be assayed depends on the magnitude of the functional effect and the predicted within-population frequency of the causal variant. This can be assisted by initial power calculations that allow prediction of what is likely to be identified, given the study design. The optimal approach seems to be to sample sequences from individuals at each of the two ends of the phenotypic (expression) distribution, and then proceed inwards.
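The sampling order described above, starting at the two ends of the expression distribution and working inwards, can be written down in a few lines. This sketch covers only the ordering of individuals, not the power calculation, and the function name and input format (a mapping from individual identifier to expression level) are hypothetical.

```python
def resequencing_order(expression_by_individual):
    """Order individuals from the extremes of the expression distribution inwards."""
    # Rank individuals from lowest to highest expression.
    ranked = sorted(expression_by_individual, key=expression_by_individual.get)
    order = []
    low, high = 0, len(ranked) - 1
    while low <= high:
        # Take the lowest remaining expresser, then the highest remaining one.
        order.append(ranked[low])
        if low != high:
            order.append(ranked[high])
        low += 1
        high -= 1
    return order
```

Resequencing can then stop at whatever depth into this list the budget or the power calculation suggests, with the most informative individuals already covered.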
As soon as a set of genomic segments has been resequenced, one should look for variants that appear to have equal or better correlation with the phenotype than that observed in the initial association study. This can be determined by genotyping all of the potentially functional variants (identified in the resequencing approach in a subset of the individuals) in the original complete sample. Depending on the strength of these correlations, and where the highly correlated variants are located, the appropriate approach should be adopted for direct functional testing of causal haplotypes. Such approaches can include reporter constructs, binding assays, RNA stability assays and chromatin modification assays using all of the alternative haplotypes.
In this paper, some issues have been discussed that arise from the incorporation of gene expression variation data in disease studies. The overall message is that gene expression can greatly assist the discovery of disease variants, as well as the interpretation of the biological effects of causal variants. Further exploration of gene expression variation in more samples and more cell types will greatly enhance our understanding of both phenotypic variation in humans and the nature of regulatory variation and its impact on complex disease.
- Jimenez-Sanchez G, Childs B, Valle D: Human disease genes. Nature. 2001, 409: 853-855. 10.1038/35057050.
- Gulcher JR, Kong A, Stefansson K: The role of linkage studies for common diseases. Curr Opin Genet Dev. 2001, 11: 264-267. 10.1016/S0959-437X(00)00188-X.
- Cooke GS, Hill AV: Genetics of susceptibility to human infectious disease. Nat Rev Genet. 2001, 2: 967-977. 10.1038/35103577.
- Risch N, Merikangas K: The future of genetic studies of complex human diseases. Science. 1996, 273: 1516-1517. 10.1126/science.273.5281.1516.
- Hirschhorn JN, Daly MJ: Genome-wide association studies for common diseases and complex traits. Nat Rev Genet. 2005, 6: 95-108.
- Klein RJ, Zeiss C, Chew EY, et al: Complement factor H polymorphism in age-related macular degeneration. Science. 2005, 308: 385-389. 10.1126/science.1109557.
- Ozaki K, Ohnishi Y, Iida A, et al: Functional SNPs in the lymphotoxin-alpha gene that are associated with susceptibility to myocardial infarction. Nat Genet. 2002, 32: 650-654. 10.1038/ng1047.
- Kasvosve I, Delanghe JR, Gomo ZA, et al: Transferrin polymorphism influences iron status in blacks. Clin Chem. 2000, 46: 1535-1539.
- Ueda H, Howson JM, Esposito L, et al: Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease. Nature. 2003, 423: 506-511. 10.1038/nature01621.
- Tournamille C, Le Van Kim C, Gane P, et al: Molecular basis and PCR-DNA typing of the Fya/fyb blood group polymorphism. Hum Genet. 1995, 95: 407-410.
- Tsuge M, Hamamoto R, Silva FP, et al: A variable number of tandem repeats polymorphism in an E2F-1 binding element in the 5' flanking region of SMYD3 is a risk factor for human cancers. Nat Genet. 2005, 37: 1104-1107. 10.1038/ng1638.
- Zhou XF, Cui J, DeStefano AL, et al: Polymorphisms in the promoter region of catalase gene and essential hypertension. Dis Markers. 2005, 21: 3-7.
- McDermott DH, Zimmerman PA, Guignard F, et al: CCR5 promoter polymorphism and HIV-1 disease progression. Multicenter AIDS Cohort Study (MACS). Lancet. 1998, 352: 866-870. 10.1016/S0140-6736(98)04158-0.
- Kostrikis LG, Neumann AU, Thomson B, et al: A polymorphism in the regulatory region of the CC-chemokine receptor 5 gene influences perinatal transmission of human immunodeficiency virus type 1 to African-American infants. J Virol. 1999, 73: 10264-10271.
- Sakuntabhai A, Turbpaiboon C, Casademont I, et al: A variant in the CD209 promoter is associated with severity of dengue disease. Nat Genet. 2005, 37: 507-513. 10.1038/ng1550.
- Carlson CS, Aldred SF, Lee PK, et al: Polymorphisms within the C-reactive protein (CRP) promoter region are associated with plasma CRP levels. Am J Hum Genet. 2005, 77: 64-77. 10.1086/431366.
- VanNess SH, Owens MJ, Kilts CD: The variable number of tandem repeats element in DAT1 regulates in vitro dopamine transporter density. BMC Genet. 2005, 6: 55.
- Kochi Y, Yamada R, Suzuki A, et al: A functional variant in FCRL3, encoding Fc receptor-like 3, is associated with rheumatoid arthritis and several autoimmunities. Nat Genet. 2005, 37: 478-485. 10.1038/ng1540.
- Kwok JB, Hallupp M, Loy CT, et al: GSK3B polymorphisms alter transcription and splicing in Parkinson's disease. Ann Neurol. 2005, 58: 829-839. 10.1002/ana.20691.
- Al-Zahrani A, Sandhu MS, Luben RN, et al: IGF1 and IGFBP3 tagging polymorphisms are associated with circulating levels of IGF1, IGFBP3 and risk of breast cancer. Hum Mol Genet. 2006, 15: 1-10. 10.1093/hmg/ddl043.
- Bennett ST, Lucassen AM, Gough SC, et al: Susceptibility to human type 1 diabetes at IDDM2 is determined by tandem repeat variation at the insulin gene minisatellite locus. Nat Genet. 1995, 9: 284-292. 10.1038/ng0395-284.
- Karim MA, Wang X, Hale TC, Elbein SC: Insulin promoter factor 1 variation is associated with type 2 diabetes in African Americans. BMC Med Genet. 2005, 6: 37.
- Przybylowska K, Kluczna A, Zadrozny M, et al: Polymorphisms of the promoter regions of matrix metalloproteinases genes MMP-1 and MMP-9 in breast cancer. Breast Cancer Res Treat. 2006, 95: 65-72. 10.1007/s10549-005-9042-6.
- Humphries SE, Luong LA, Talmud PJ, et al: The 5A/6A polymorphism in the promoter of the stromelysin-1 (MMP-3) gene predicts progression of angiographically determined coronary artery disease in men in the LOCAT gemfibrozil study Lopid Coronary Angiography Trial. Atherosclerosis. 1998, 139: 49-56. 10.1016/S0021-9150(98)00053-7.
- Ye S, Eriksson P, Hamsten A, et al: Progression of coronary atherosclerosis is associated with a common genetic variant of the human stromelysin-1 promoter which results in reduced gene expression. J Biol Chem. 1996, 271: 13055-13060. 10.1074/jbc.271.22.13055.
- Borm ME, van Bodegraven AA, Mulder CJ, et al: A NFKB1 promoter polymorphism is involved in susceptibility to ulcerative colitis. Int J Immunogenet. 2005, 32: 401-405. 10.1111/j.1744-313X.2005.00546.x.
- Emison ES, McCallion AS, Kashuk CS, et al: A common sex-dependent mutation in a RET enhancer underlies Hirschsprung disease risk. Nature. 2005, 434: 857-863. 10.1038/nature03467.
- Grice EA, Rochelle ES, Green ED, et al: Evaluation of the RET regulatory landscape reveals the biological relevance of a HSCR-implicated enhancer. Hum Mol Genet. 2005, 14: 3837-3845. 10.1093/hmg/ddi408.
- Knight JC, Udalova I, Hill AV, et al: A polymorphism that affects OCT-1 binding to the TNF promoter region is associated with severe malaria. Nat Genet. 1999, 22: 145-150. 10.1038/9649.
- Gottesman II, Gould TD: The endophenotype concept in psychiatry: Etymology and strategic intentions. Am J Psychiatry. 2003, 160: 636-645. 10.1176/appi.ajp.160.4.636.
- Watts JA, Morley M, Burdick JT, et al: Gene expression phenotype in heterozygous carriers of ataxia telangiectasia. Am J Hum Genet. 2002, 71: 791-800. 10.1086/342974.
- Kruglyak L, Nickerson DA: Variation is the spice of life. Nat Genet. 2001, 27: 234-236. 10.1038/85776.
- Benjamini Y, Hochberg Y: Controlling the false discovery rate -- A practical approach to multiple testing. J R Stat Soc Ser B Methodol. 1995, 57: 289-300.
- Storey JD, Tibshirani R: Statistical significance for genomewide studies. Proc Natl Acad Sci USA. 2003, 100: 9440-9445. 10.1073/pnas.1530509100.
- Doerge RW, Churchill GA: Permutation tests for multiple loci affecting a quantitative character. Genetics. 1996, 142: 285-294.
- Cardon LR, Palmer LJ: Population stratification and spurious allelic association. Lancet. 2003, 361: 598-604. 10.1016/S0140-6736(03)12520-2.
- Pritchard JK, Stephens M, Rosenberg NA, Donnelly P: Association mapping in structured populations. Am J Hum Genet. 2000, 67: 170-181. 10.1086/302959.
- Tang H, Quertermous T, Rodriguez B, et al: Genetic structure, self-identified race/ethnicity, and confounding in case-control association studies. Am J Hum Genet. 2005, 76: 268-275. 10.1086/427888.
- Stranger BE, Dermitzakis ET: The genetics of regulatory variation in the human genome. Hum Genomics. 2005, 2: 126-131.
- Cheung VG, Spielman RS, Ewens KG, et al: Mapping determinants of human gene expression by regional and genome-wide association. Nature. 2005, 437: 1365-1369. 10.1038/nature04244.
- Monks SA, Leonardson A, Zhu H, et al: Genetic inheritance of gene expression in human cell lines. Am J Hum Genet. 2004, 75: 1094-1105. 10.1086/426461.
- Morley M, Molony CM, Weber TM, et al: Genetic analysis of genome-wide variation in human gene expression. Nature. 2004, 430: 743-747. 10.1038/nature02797.
- Stranger BE, Forrest MS, Clark AG, et al: Genome-wide associations of gene expression variation in humans. PLoS Genet. 2005, 1: e78. 10.1371/journal.pgen.0010078.
- Brem RB, Kruglyak L: The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci USA. 2005, 102: 1572-1577. 10.1073/pnas.0408709102.
- Brem RB, Yvert G, Clinton R, Kruglyak L: Genetic dissection of transcriptional regulation in budding yeast. Science. 2002, 296: 752-755. 10.1126/science.1069516.
- Yvert G, Brem RB, Whittle J, et al: Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nat Genet. 2003, 35: 57-64.
- Cowles CR, Hirschhorn JN, Altshuler D, Lander ES: Detection of regulatory variation in mouse genes. Nat Genet. 2002, 32: 432-437. 10.1038/ng992.
- Doss S, Schadt EE, Drake TA, Lusis AJ: Cis-acting expression quantitative trait loci in mice. Genome Res. 2005, 15: 681-691. 10.1101/gr.3216905.
- Sandberg R, Yasuda R, Pankratz DG, et al: Regional and strain-specific gene expression mapping in the adult mouse brain. Proc Natl Acad Sci USA. 2000, 97: 11038-11043.
- Schadt EE, Monks SA, Drake TA, et al: Genetics of gene expression surveyed in maize, mouse and man. Nature. 2003, 422: 297-302. 10.1038/nature01434.
- Walker JR, Su AI, Self DW, et al: Applications of a rat multiple tissue gene expression data set. Genome Res. 2004, 14: 742-749. 10.1101/gr.2161804.
- Oleksiak MF, Churchill GA, Crawford DL: Variation in gene expression within and among natural populations. Nat Genet. 2002, 32: 261-266. 10.1038/ng983.
- Oleksiak MF, Roach JL, Crawford DL: Natural variation in cardiac metabolism and gene expression in Fundulus heteroclitus. Nat Genet. 2005, 37: 67-72.
- Yan H, Yuan W, Velculescu VE, et al: Allelic variation in human gene expression. Science. 2002, 297: 1143. 10.1126/science.1072545.
- Herbert A, Gerry NP, McQueen MB, et al: A common genetic variant is associated with adult and childhood obesity. Science. 2006, 312: 279-283.
- The ENCODE Project Consortium: The ENCODE (ENCyclopedia Of DNA Elements) Project. Science. 2004, 306: 636-640.