Functional intronic polymorphisms: Buried treasure awaiting discovery within our genes
Human Genomics volume 4, Article number: 284 (2010)
'In Nature's infinite book of secrecy, a little I can read.'
Antony and Cleopatra [Act I, Scene 2], William Shakespeare
Pathological mutations occurring within the extended consensus sequences of exon-intron splice junctions account for ~10 per cent of all inherited lesions logged in The Human Gene Mutation Database (HGMD®; http://www.hgmd.org) and are frequently encountered in mutation screening studies. Mutations residing in other intronic locations (including the canonical branch-point sequence, 5'-YURAY-3'), however, may often go undetected unless patient RNA can be analysed and the mutations in question induce aberrant splicing (eg exon skipping or cryptic splice site utilisation) that is readily distinguishable qualitatively or quantitatively from normal (and/or normal alternative) splicing. Indeed, introns probably represent a substantially larger mutational target than has hitherto been appreciated, on account of their containing a multiplicity of functional elements, including intron splice enhancers and silencers that regulate alternative splicing,[4, 5] trans-splicing elements and other regulatory elements, some of which may be deeply embedded within very large introns.
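By way of illustration, the branch-point consensus 5'-YURAY-3' mentioned above can be expanded from its IUPAC codes (Y = pyrimidine, R = purine) and searched for within an intronic sequence. The following is a minimal sketch only: the function name and the example sequence are hypothetical, and a motif match on its own is not evidence of a functional branch point, which would still require experimental confirmation.

```python
import re

# YURAY expanded for a DNA-alphabet search (U written as T):
# Y = C or T, U = T, R = A or G, A = A, Y = C or T.
# The lookahead lets overlapping candidates be reported as well.
BRANCH_POINT_PATTERN = re.compile(r"(?=([CT]T[AG]A[CT]))")


def find_branch_point_candidates(intron_seq):
    """Return (0-based start, matched motif) for every YURAY-like site.

    Accepts DNA or RNA in either case; U is converted to T before searching.
    """
    seq = intron_seq.upper().replace("U", "T")
    return [(m.start(), m.group(1)) for m in BRANCH_POINT_PATTERN.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical intronic fragment, used purely for illustration.
    print(find_branch_point_candidates("GGCTAACGG"))
```

Candidate sites identified in this way would serve only as a starting point for prioritising intronic variants for RNA-based analysis.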
In addition to pathological mutations sensu stricto, introns also harbour functional polymorphisms that can influence the expression of the genes that host them. Some of these intronic variants may also confer susceptibility to disease or otherwise modulate the genotype-phenotype relationship. For the reasons discussed above, it is very likely that such variants will have been seriously under-ascertained to date. Although most of these variants are single nucleotide polymorphisms (SNPs), others may be of the insertion/deletion type. With the advent of genome-wide association studies (GWAS), an increasing number of potentially functional intronic variants are being identified. In the majority of cases, however, it is unclear whether such variants are of direct functional significance, as opposed to simply being in linkage disequilibrium with another (as yet unidentified) functional SNP in the vicinity. Even when a GWAS deems a newly identified intronic polymorphism to be 'functional', it should be appreciated that the term is often ascribed solely on the basis of an observed association between a specific allele and a plasma protein level, enzymatic activity or a clinical/laboratory phenotype, even though in reality such associations cannot readily distinguish a bona fide functional SNP from a linkage disequilibrium effect.
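The difficulty posed by linkage disequilibrium can be made concrete with the standard pairwise statistic r², computed here from phased haplotypes at two biallelic SNPs. This is a minimal sketch under stated assumptions: the function name and the 0/1 allele coding are hypothetical choices made for illustration, and the input data are invented rather than drawn from any study cited here.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two biallelic SNPs from phased haplotypes.

    `haplotypes` is a sequence of (allele_at_snp1, allele_at_snp2) pairs,
    with each allele coded 0 or 1 (an assumed, illustrative encoding).
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("at least one haplotype is required")

    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1 at SNP 1
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1 at SNP 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n

    denominator = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denominator == 0:
        raise ValueError("r^2 is undefined when either SNP is monomorphic")

    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    return d * d / denominator


if __name__ == "__main__":
    # Invented haplotypes in which the two SNPs are always inherited together.
    print(ld_r_squared([(0, 0), (1, 1), (0, 0), (1, 1)]))
```

When r² approaches 1, the two SNPs carry essentially the same association signal, so association data alone cannot indicate which of them (if either) is the functional variant.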
As has been noted with pathological mutations, the vast majority of known functional intronic polymorphisms are located within the extended consensus sequences of exon-intron splice junctions. Some intronic polymorphic variants do not occur within the splice junctions, however, but nevertheless still act so as to change the splicing phenotype as a consequence of their being located within an intron splice enhancer or branchpoint site, or by activating a cryptic splice site [11, 12]. This is, from a biological point of view, a more interesting category of intronic SNP to study, since the mechanisms by which these variants exert their effects on the splicing phenotype are often unclear and may be quite subtle. In the pages of this issue, Millar et al. report that a SNP, buried deep within intron 4 of the human growth hormone (GH1) gene, is of direct functional significance by virtue of its influence on the expression of this gene. This polymorphism therefore joins the ranks of the hitherto relatively small number of human intronic SNPs located outwith exon-intron splice junctions that have been shown by various methods of in vitro characterisation to be of direct functional significance. Table 1 lists some of the best characterised examples of such functional SNPs, most of which are located at least ~30 base pairs (bp) from the nearest splice site. These SNPs have been shown to influence either the transcriptional activity or the splicing efficiency of their host genes, or instead to alter the expression of alternative transcripts.
How should we go about increasing the number of identified functional intronic polymorphisms? One approach would be to employ exon-tiling microarrays to perform genome-wide scans to identify intronic SNPs responsible for inter-individual differences in the splicing phenotype [11, 14, 15]. Since currently available bioinformatics tools are inadequate to the task of predicting splicing consequences, however, all SNPs identified in this way would have to be further validated using mini-gene constructs to determine the resulting splicing phenotype. One feature that might prove helpful in identifying intronic SNPs is that such variants are often located within gene regions that are characterised by a reduced level of genetic variation.
Precisely because we invariably adopt a gene-centric approach to screening introns for functional polymorphisms, we should be wary of the existence of overlapping genes, a not infrequent occurrence in our complex genome. Thus, for example, the functional SNP rs4988235, located 13.9 kilobases upstream of the lactase (LCT) gene and associated with adult-type hypolactasia, actually resides deep within intron 13 of the minichromosome maintenance complex component 6 (MCM6) gene [17–19]. In addition, since disease-associated intronic SNPs that play a role in long-range gene regulation have also recently been identified,[20, 21] we should be aware that some SNPs may influence the expression of remote genes at a distance, rather than the expression of those genes which actually host them. These caveats notwithstanding, new techniques such as chromosome conformation capture and chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) promise greatly to increase the number of functional intronic polymorphisms identified, thereby potentially pinpointing the locations of a whole new lexicon of intron-located regulatory elements, which will increase our understanding of intron structure and function.
Stenson PD, Mort M, Ball EV, Howells K, et al: 'The Human Gene Mutation Database: 2008 update'. Genome Med. 2009, 1: 13. 10.1186/gm13.
Krawczak M, Thomas NS, Hundrieser B, Mort M, et al: 'Single base-pair substitutions in exon-intron junctions of human genes: Nature, distribution, and consequences for mRNA splicing'. Hum Mutat. 2007, 28: 150-158. 10.1002/humu.20400.
Královicová J, Lei H, Vorechovský I: 'Phenotypic consequences of branch point substitutions'. Hum Mutat. 2006, 27: 803-813. 10.1002/humu.20362.
Wang X, Wang K, Radovich M, Wang Y, et al: 'Genome-wide prediction of cis-acting RNA elements regulating tissue-specific pre-mRNA alternative splicing'. BMC Genomics. 2009, 10 (Suppl 1): S4. 10.1186/1471-2164-10-S1-S4.
Tress ML, Martelli PL, Frankish A, Reeves GA, et al: 'The implications of alternative splicing in the ENCODE protein complement'. Proc Natl Acad Sci USA. 2007, 104: 5495-5500. 10.1073/pnas.0700800104.
Gingeras TR: 'Implications of chimaeric non-co-linear transcripts'. Nature. 2009, 461: 206-211. 10.1038/nature08452.
Solis AS, Shariat N, Patton JG: 'Splicing fidelity, enhancers, and disease'. Front Biosci. 2008, 13: 1926-1942. 10.2741/2812.
Wilkins JM, Southam L, Mustafa Z, Chapman K, et al: 'Association of a functional microsatellite within intron 1 of the BMP5 gene with susceptibility to osteoarthritis'. BMC Med Genet. 2009, 10: 141. 10.1186/1471-2350-10-141.
Manolio TA, Collins FS, Cox NJ, Goldstein DB, et al: 'Finding the missing heritability of complex diseases'. Nature. 2009, 461: 747-753. 10.1038/nature08494.
McCauley JL, Kenealy SJ, Margulies EH, Schnetz-Boutaud N, et al: 'SNPs in multi-species conserved sequences (MCS) as useful markers in association studies: A practical approach'. BMC Genomics. 2007, 8: 266. 10.1186/1471-2164-8-266.
Kwan T, Benovoy D, Dias C, Gurd S, et al: 'Genome-wide analysis of transcript isoform variation in humans'. Nat Genet. 2008, 40: 225-231. 10.1038/ng.2007.57.
Coulombe-Huntington J, Lam KC, Dias C, Majewski J: 'Fine-scale variation and genetic determinants of alternative splicing across individuals'. PLoS Genet. 2009, 5: e1000766. 10.1371/journal.pgen.1000766.
Millar DS, Horan M, Chuzhanova NA, Cooper DN: 'Characterisation of a functional intronic polymorphism in the human growth hormone (GH1) gene'. Hum Genomics. 2010, 4: 289-301.
Hull J, Campino S, Rowlands K, Chan M-S, et al: 'Identification of common genetic variation that modulates alternative splicing'. PLoS Genet. 2007, 3: e99. 10.1371/journal.pgen.0030099.
Nembaware V, Lupindo B, Schouest K, Spillane C, et al: 'Genome-wide survey of allele-specific splicing in humans'. BMC Genomics. 2008, 9: 265. 10.1186/1471-2164-9-265.
Lomelin D, Jorgenson E, Risch N: 'Human genetic variation recognizes functional elements in noncoding sequence'. Genome Res. 2010, 20: 311-319. 10.1101/gr.094151.109.
Enattah NS, Sahi T, Savilahti E, Terwilliger JD, et al: 'Identification of a variant associated with adult-type hypolactasia'. Nat Genet. 2002, 30: 233-237. 10.1038/ng826.
Olds LC, Sibley E: 'Lactase persistence DNA variant enhances lactase promoter activity in vitro: Functional role as a cis regulatory element'. Hum Mol Genet. 2003, 12: 2333-2340. 10.1093/hmg/ddg244.
Lewinsky RH, Jensen TG, Møller J, Stensballe A, et al: 'T-13910 DNA variant associated with lactase persistence interacts with Oct-1 and stimulates lactase promoter activity in vitro'. Hum Mol Genet. 2005, 14: 3945-3953. 10.1093/hmg/ddi418.
Ragvin A, Moro E, Fredman D, Navratilova P, et al: 'Long-range gene regulation links genomic type 2 diabetes and obesity risk regions to HHEX, SOX4, and IRX3'. Proc Natl Acad Sci USA. 2010, 107: 775-780. 10.1073/pnas.0911591107.
Jowett JB, Curran JE, Johnson MP, Carless MA, et al: 'Genetic variation at the FTO locus influences RBL2 gene expression'. Diabetes. 2010, 59: 726-732. 10.2337/db09-1277.
Dostie J, Dekker J: 'Mapping networks of physical interactions between genomic elements using 5C technology'. Nat Protoc. 2007, 2: 988-1002. 10.1038/nprot.2007.116.
Visel A, Blow MJ, Li Z, Zhang T, et al: 'ChIP-seq accurately predicts tissue-specific activity of enhancers'. Nature. 2009, 457: 854-858. 10.1038/nature07730.