Integrative genomic analysis reveals functional diversification of APOBEC gene family in breast cancer
© Zhang et al. 2015
Received: 27 September 2015
Accepted: 9 December 2015
Published: 18 December 2015
The human APOBEC protein family plays critical but distinct roles in host defense. Recent studies revealed that APOBECs mediate C-to-T mutagenesis in multiple cancers, including breast cancer. It is still unclear whether APOBEC gene family shows functional diversification involved in cancer mutagenesis.
We performed an integrated analysis to characterize the functional diversification of APOBEC gene family associated with breast cancer mutagenesis relative to estrogen receptor (ER) status. Among the APOBEC family, we found that both APOBEC3B and APOBEC3C mRNA levels were significantly higher in estrogen receptor negative (ER−) subtype compared with estrogen receptor positive (ER+) subtype (P < 2.2 × 10−16 and P < 3.1 × 10−5, respectively). Epigenomic data further reflected the distinct chromatin states of APOBEC3B and APOBEC3C relative to ER status. Notably, we observed the significantly positive correlation between the APOBEC3B-mediated mutagenesis and APOBEC3B expression levels in ER+ cancers but not in ER− cancers. In contrast, we discovered the negative correlation of APOBEC3C mRNA levels with base-substitution mutations in ER− tumors. Meanwhile, we observed that breast cancers in carriers of germline deletion of APOBEC3B gene harbor similar mutation patterns, but higher mutation rates in the TCW motif (W corresponds to A or T) than cancers in non-carriers, indicating additional factors may also induce carcinogenic mutagenesis.
These results suggest that functional potential of APOBEC3B and APOBEC3C involved in cancer mutagenesis is associated with ER status.
KeywordsAPOBEC family Estrogen receptor Breast cancer Methylation Mutagenesis Chromatin state
In the human genome, APOBEC family comprises of a total of 11 genes. They have been known to encode Zn2+-dependent DNA cytosine deaminases. Of these 11 members, seven apolipoprotein B mRNA editing enzyme, catalytic polypeptide-like 3 genes (APOBEC3A, APOBEC3B, APOBEC3C, APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H) are tandemly distributed on chromosome 22 . The other four members are located on other chromosomes, including APOBEC1, APOBEC2, APOBEC4, and activation-induced cytidine deaminase (AICDA). The APOBEC family proteins show different classes of DNA cytosine deaminase domains and distinct tissue expression profiles [2–5], indicating they may have diverse biological functions. For example, the APOBEC proteins exhibit different activities in restricting virus replication and inhibiting LINE-1 retrotransposition [6–8].
Recent studies showed the elevated expression of APOBEC3B in multiple tumors [9–12]. Meanwhile, the strong correlation of somatic base-substitution mutation with APOBEC3B mRNA levels in cancer samples has implicated APOBEC3B as an enzymatic source inducing the C-to-T somatic mutations [9, 10, 13, 14]. However, in breast cancer subtypes, the distinct mutational patterns and genomic changes have also been reported [15–18], meaning that the APOBEC-mediated mutagenesis and APOBEC family expression may be variable in cancer subtypes .
Thanks to two projects, The Cancer Genome Atlas (TCGA)  and the Encyclopedia of DNA Elements (ENCODE) project , leveraging these diverse types of data to interpret functional features of genes or gene families in particular cancer types is possible. Here, we performed integrated analysis on diverse high-throughput sequencing data involved in somatic mutation, gene expression, and epigenetic profiles. The aim of this study is to investigate the functional diversification of the APOBEC family genes in breast cancer, in cancer subtypes with a focus on estrogen receptor (ER) status.
For breast cell lines, a total of 42 high-throughput sequencing data, including 10 RNA sequencing (RNA-seq) data from two data sets [20, 21], 18 ChIP sequencing (ChIP-seq) data, 4 corresponding input DNA (control) from five data sets, and 10 bisulfite sequencing (BS-seq) data from two data sets [19–22], were collected in this study (Additional file 1: Table S1). Among ten RNA-seq data, eight are from breast cancer cell lines and two are from normal breast cell lines.
Of the 18 ChIP-seq datasets, six histone modifications, including histone H3 lysine 4 methylations (H3K4me1, H3K4me3), lysine 9 trimethylation (H3K9me3), lysine 27 trimethylation (H3K27me3), lysine 36 trimethylation (H3K36me3), and histone H3 lysine 27 acetylations (H3K27ac), were conducted in each of the HCC1954 cell line, an estrogen receptor negative (ER−) breast cancer cell line, MCF-7 cell line, an estrogen receptor positive (ER+) breast cancer cell line, and human mammary epithelial cells (HMEC), a normal breast cell line.
For the ten bisulfite-seq datasets, two included whole-genome BS-seq at the single nucleotide resolution in the ER− breast cancer cell line HCC1954 and the normal cell line HMEC . Eight were generated by reduced representation bisulfite sequencing (RRBS) in low depth . As no or low sequencing coverage in the proximal promoters of APOBECs in the initial analysis, we discarded these eight RRBS data for further analyses.
Summary of breast cell lines and tissue specimens in the study
Breast non-tumor cell lines
Breast cancer cell lines
Breast tumor tissues
Adjacent normal breast tissues
No. of samples
4 (50 %)
664 (72.6 %)
67 (78.9 %)
2 (100 %)
4 (50 %)
196 (21.4 %)
15 (17.6 %)
55 (6.0 %)
3 (3.5 %)
28 (3.1 %)
2 (2.4 %)
162 (17.7 %)
17 (20 %)
597 (65.2 %)
66 (77.6 %)
128 (14.0 %)
Availablity of exome-seq data
750 (82.0 %)
165 (18.0 %)
Expression profiling of APOBECs in breast cancer
We then expanded this analysis to include a total of 1000 RNA-seq data from TCGA project. In agreement with observation in cancer cell lines, both the APOBEC3B and APOBEC3C genes show higher expression levels in ER− cancers (P < 2.2 × 10−16 and P < 3.1 × 10−5, respectively, two-sided Wilcoxon rank sum test) relative to ER+ cancers (Fig. 1b). For other APOBEC family members (Additional file 2: Figure S1), they show either extremely low (median log2-transformed RPKM < 0) mRNA levels (including APOBEC3A, APOBEC3H, APOBEC1, APOBEC2, APOBEC4, and AICDA), or no obvious expression differences between ER+ and ER− breast cancers (including APOBEC3D, APOBEC3F, and APOBEC3G). Additionally, the expression levels of APOBEC family genes in 55 breast cancer tissues with unknown ER status are similar with those samples of ER+ subtypes, indicating many of these 55 patients probably belong to ER+ subtype. These results together indicate that the expression patterns of APOBEC3B and APOBEC3C genes may be associated with ER status.
Chromatin states of APOBEC gene family in breast cancer cells
To investigate whether the different expression profiles of APOBEC family members could show different chromatin states in relation to ER status, we integrated ChIP-seq data for six histone modifications, including H3K4me3 and H3K36me3 (two epigenetic markers connected with gene activity) [23, 24], H3K27ac and H3K4me1 (two enhancer-associated hallmarks) [25, 26], and H3K9me3 and H3K27me3 (two typical epigenetic markers indicative of transcriptional repression and heterochromatin formation) [23, 27], across three breast tissue relevant cell types: ER+ (MCF-7), ER− (HCC1954) breast cancer cells, and normal mammary epithelial cells (HMEC). Notably, in the ER− cell line HCC1954, we observed active promoter and clustered enhancers surrounding both the APOBEC3B and APOBEC3C loci (Fig. 1c). Meanwhile, these two genes show strong transcription signal. In addition, parts of active promoters or enhancers at the APOBEC3F and APOBEC3G genes were also found. For the other APOBEC family genes, we did not observe active promoter or enhancers (Additional file 2: Figure S2).
In contrast, in the ER+ cancer cell line MCF-7, we only observed promoter (Fig. 1c) activity at the APOBEC3B, but not at the APOBEC3C and the other APOBEC family genes (Additional file 2: Figure S2). Furthermore, compared with the normal HMECs, the loss of promoter activity at the APOBEC3C, APOBEC3F, and APOBEC3G genes was specifically detected in the ER+ cancer cell (Fig. 1c). The observed chromatin state profiles of the APOBEC family genes are quite concordant with the gene expression patterns, further suggesting that the functional divergence of the APOBEC3B and APOBEC3C genes may be related to the ER status.
In addition, with respect to several APOBEC family members (including APOBEC3A, APOBEC3H, APOBEC1, APOBEC2, APOBEC4, and AICDA) showing no or extremely low expression levels in both ER+ and ER− breast cancers, we did not find any enrichment of repressive chromatin modifications (H3K9me3 and H3K27me3) in these genes (Fig. 1c and Additional file 2: Figure S2), suggesting some additional molecular mechanisms probably participate in the regulation of these genes.
DNA methylation of APOBECs in breast cancer cells
Clinical outcome with mRNA levels of APOBECs
APOBEC-mediated mutagenesis in relation to ER status
APOBEC-mediated mutagenesis in relation to germline deletion of APOBEC3B
To address whether the APOBEC-mediated mutagenesis in breast tumors is directly induced by the APOBEC3B protein, stratified by CNV of APOBEC3B locus, we further compared the number of mutations per tumor exome in breast cancer tissues. The results show the remarkably positive correlation (r = 0.33, P = 8.9 × 10−16) between the number of SNVs per exome and APOBEC3B expression levels for APOBEC3B with CN2, but not for CN1 and CN0 types (Fig. 5b). Similar but negative correlation of mutations per exome with APOBEC3C expression levels is observed. Meanwhile, the comparative results show that the number of somatic SNVs in the absence of APOBEC3B (CN0) is not significantly different with the other two genotypes (CN1 and CN2, Fig. 5c), although some biases could be introduced in the comparison, including the unequal sample size and regional (not whole-genome) variants for three genotypes. Furthermore, in terms of mutation types, the mutation patterns are quite similar with predominance of C-T transition for all three APOBEC3B genotypes (Fig. 5d). Compared with samples with CN1 and CN2 of APOBEC3B, we observed elevated frequency of C-G mutation in samples containing the CN0 of APOBEC3B (Fig. 5d). The excessive analysis of trinucleotide motifs with centered C>A, C>G, and C>T mutation types shows a significantly increased TCW motif occurrence in CN0 group (where W corresponds to A or T, Fig. 5e) compared with CN1 (P = 2.4 × 10−4) and CN2 groups (P < 2.2 × 10−16) based on Fisher’s exact test.
In this study, we investigated the genomic, transcriptomic, and epigenetic regulation of APOBEC gene family in breast cancers by integrating diverse high-throughput sequencing data. The transcriptome data from both breast cancer cell lines and tumor specimens reveal that the APOBEC3B and APOBEC3C genes show significant differences at the transcription levels between ER+ and ER− breast tumors. On the one hand, we recapitulated previous findings that the up-regulation of APOBEC3B gene represents a growing enzyme-catalyzed cytosine-to-uracil deamination activity in carcinogenesis [9, 10]. On the other hand, the substantial up-regulation of APOBEC3B gene and absence of down-regulation of APOBEC3C gene in ER− subtypes suggest that the cytosine deaminase activity for APOBECs may be distinct in cancer subtypes. Furthermore, our study also suggests that the APOBEC-induced cancer mutagenesis is distinct regarding ER status, in agreement with the phenomena of different mutational spectrum observed between ER+ and ER− breast cancers .
Our results provide an additional support of the functional significance of APOBEC3B and probably APOBEC3C in breast carcinogenesis . Based on the overall similar mutation patterns between breast cancers in carriers of germline deletion of APOBEC3B genes and non-carriers, and a higher occurrence of somatic mutations on the TCW motif in APOBEC3B deletion samples, we speculate that germline deletion of APOBEC3B, expression of APOBEC3B and APOBEC3C, and somatic mutation may interact in contributing to breast cancer.
Integrated analysis of functional genomic data is a powerful approach with broad applications in many areas, including prediction of gene activity  and annotation of non-coding RNAs . Using a similar approach, our integrated analyses illuminate the difference of chromatin states among APOBECs in breast cancer subtypes, which is highly consistent with the gene expression data. For instance, the active promoter and enhancer states at the APOBEC3B gene are concordant with the over-expression of APOBEC3B, particularly in ER− subtype. Meanwhile, the loss of active promoter and enhancer signals at the APOBEC3C gene in the ER+ cancer cell reflects the down-regulation of this gene observed in the ER+ subtype. In addition, the DNA hypermethylation may explain no or low expression profiles of other APOBECs. These results are concordant with Isobe’s report [33, 34] in blood cells. Meanwhile, Pauklin et al.  have reported the estrogen induced increase of APOBEC3B gene expression by nearly 1.5-fold in a dose-dependent manner, but not APOBEC3C gene and other members in human MCF-7 cells, suggesting the possibility that ER may directly (or indirectly) regulate APOBECs in breast cancer cells through recruiting chromatin modifiers .
Compared with APOBEC3B, the APOBEC3C gene may play a different role in the cancer genome mutagenesis. For example, we observed that the elevated expression levels of APOBEC3C but lowered expression levels of APOBEC3B in breast cancer patients have better clinical outcomes. The opposite correlation of APOBEC3B and APOBEC3C genes’ expression with clinical outcomes further provides a possibility of using both two genes as potential biomarkers in prognosis. Although no functional assays are performed in this study, several lines of evidence support the possible functionality of the APOBEC3C protein in breast cancer.
First, APOBEC3C is localized in both the cytoplasmic and nuclear compartments, which is not only observed in human cells [37–39], but also observed in the rhesus counterparts , suggesting functionally constrained localization of APOBEC3C. However, for other APOBECs, they show either tissue-restricted expression patterns (APOBEC3A, APOBEC1, APOBEC2, APOBEC4, and AICDA) or cytoplasmic localization (APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H) in human cells [3, 4, 6, 7, 10, 39, 40]. Along with high expression levels of APOBEC3C gene in ER− breast cancer cells, it is possible that both APOBEC3B and APOBEC3C retain DNA cytosine deaminase activity in breast cancer. Meanwhile, Lackey et al. reported that throughout mitosis in HEK293T and HeLa cells, APOBEC3C has access to genomic DNA during interphase and telophase, while APOBEC3B is excluded from the genomic DNA during mitosis . Such phenomena are consistent with our findings that there is less functional interaction between APOBEC3C and APOBEC3B proteins, suggesting they may perform different roles in cancer cells.
Second, given that the APOBEC3C possesses a single active Z2-cytosine deaminase domain, while the APOBEC3B has double Z-coordinating (Z1 and Z2) deaminase domains [8, 41], the crystal structure determinants and functional comparison have revealed the distinct substrate preferences for binding HIV-1 DNA between the single- and double-domained APOBEC3 enzymes [42–44], raising the possibility that these two enzymes have differential DNA binding specificity which might help explain the relative differences in their observed mutagenesis in breast cancer cells, especially in the ER− cancers.
Third, besides the deaminase activity for APOBECs, the APOBEC3 family proteins also contribute to inhibit the L1 retrotransposition in a deaminase independent manner [6, 38, 45]. For instance, depletion of APOBEC3C significantly increases the L1 retrotransposition activity by ~80 % in HeLa cells . Meanwhile, many studies have strongly indicated the enhanced activity of LINE-1 retrotransposons in human cancers induces genome instability, DNA damage, and genetic variation [46–51], further implying the potentially functional roles of APOBEC3C in breast cancer.
In conclusion, our integrated analyses suggested that APOBEC3B and APOBEC3C expression patterns were correlated with ER status and clinical outcome, providing an additional implication that these two genes may contribute to mutation profile and clinical outcome in breast cancer subtypes.
We collected functional genomics data from experiments including the following: (1) RNA-seq, (2) chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq), and (3) bisulfite sequencing (BS-seq) data, for breast cell lines (including both normal and cancer) from the Gene Expression Omnibus (GEO) database  and ENCODE project. We initially searched the GEO database using the following terms: breast [AND] sequencing. After manual curation to rule out data where cell lines were treated by additional chemicals or siRNAs in experiments, all relevant data with no redundancy were retained for further analyses. We retrieved the ChIP-seq data released by the ENCODE project for well-characterized chromatin modifications in two breast cell lines: one normal cell line, called human mammary epithelial cells (HMEC), and the other ER+ breast cancer cell line (MCF-7). We also downloaded a total of 1000 RNA-seq data with approval (BAM files, Level 1) and corresponding clinical data (Biotab format) for breast tissue specimens (including 915 cancer and 85 adjacent normal tissue specimens) from TCGA (https://tcga-data.nci.nih.gov/tcga/). In addition, we downloaded 840 CEL files (Affymetrix SNP 6.0 array, level 1) and processed somatic mutation data (level 2) from 776 whole exome-sequencing (exome-seq) for the breast cancer from TCGA data portal, where 787 and 750 have matched RNA-seq data, respectively.
Data processing and statistics
All raw sequencing data (FASTQ format) were initially mapped to the human reference genome (hg19) using Bowtie2 program  with the default setting. Aligned data in SAM format were processed and converted into BAM files using SAMtools program . As each type of sequencing data conveys specifically biological purpose, we used the following methods to process the corresponding sequencing data. Except relevant programs described, all other bioinformatics analyses were implemented using Perl and R programming.
For RNA-seq data, we used similar methods described elsewhere . In brief, to characterize the quantitative expression of each RefSeq gene, the reads per kilobase and million mapped reads (RPKM) was calculated as the number of mapped reads with the mapping quality (MAQ) ≥30 aligning to each transcript multiplied by 1 million and divided by the length of the transcript times the total number of aligned reads. Then the RPKM values for APOBEC family members were extracted to quantify their expression levels. The same method was also utilized to process BAM files for TCGA RNA-seq data.
To uniformly analyze and visualize ChIP-seq data, we used the MACS14 (version 1.4.2) algorithm  to call peaks of ChIP-seq data for histone modifications against the corresponding control data in 20 bp resolution (−−space = 20) at P < 1 × 10−5. The resulting peak files (.bed) were then implemented into the ChromHMM algorithm  at a 200-bp resolution for chromatin state characterization. We ran ChromHMM with a range of possible states and settled on a 9 state model as it accurately captured biologically meaningful patterns in a reproducible way. The resulting genome segmentation files were uploaded to the UCSC Genome Browser as a custom track to visualize the chromatin sates and comparison among APOBEC family members.
For BS-seq data, we employed the BSMAP (version 2.74) program  to analyze DNA methylation ratios on each CpG site. Each CpG site supported by ≥4 reads was considered. To compare the DNA methylation ratios among APOBECs, we calculated mean DNA methylation levels within each 100-bp sliding window across ±5 kb regions of transcription start site (TSS) for each of APOBEC family members.
Level 1 CEL files for the breast cancer were used for copy number variation (CNV) calling. CNVs were detected based on the signal intensities of over 1.8 million SNP or copy number probes on the Affymetrix SNP 6.0 array. The Affymetrix Power Tools (APT-1.14.3) package was used to normalize for samples run on the same plate according to file handles and the chemistry file available. CNVs were called using Birdsuite (version 1.4) , which calls both common and rare CNVs. The APOBEC3B deletion is named as CNP2576 (hg19, chr22: 39363620–39375307) in Canary and was determined by 24 probes. In the present study, we only focus on this APOBEC3B deletion and copy number states of 0, 1, and 2 (CN0, CN1, and CN2) were used. Genotypes were checked against sample barcodes and participant IDs to identify possible duplicate samples.
For somatic mutation data, both somatic single nucleotide variants (SNVs) and C>T/G>A mutations per tumor were first summed. The correlation between the number of C>T/G>A mutations (or somatic SNVs) and expression levels of APOBECs was calculated using non-parametric Spearman’s Rank correlation statistics. The same analysis was employed stratifying for APOBEC3B germline deletion states. Six types of somatic SNVs (C>A/G>T, C>G/G>C, C>T/G>A, T>A/A>T, T>C/A>G, T>G/A>C) were counted and their relative frequencies were then calculated and compared. Following the method reported in the previous study , we compared the trinucleotides motif patterns with the background distribution normalized among breast cancer samples in different copies of APOBEC3B gene.
We thank Jing He and Yan Guo for data processing and analyses and Samantha Stansel for clerical support in the preparation of this manuscript. The content is solely the responsibility of the authors and does not necessarily represent the official views of the funding agents. This work was supported in part by US National Institutes of Health grants R01CA137013.
Open AccessThis article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Jarmuz A, Chester A, Bayliss J, Gisbourne J, Dunham I, Scott J, et al. An anthropoid-specific locus of orphan C to U RNA-editing enzymes on chromosome 22. Genomics. 2002;79(3):285–96.PubMedView ArticleGoogle Scholar
- Rogozin IB, Basu MK, Jordan IK, Pavlov YI, Koonin EV. APOBEC4, a new member of the AID/APOBEC family of polynucleotide (deoxy)cytidine deaminases predicted by computational analysis. Cell Cycle. 2005;4(9):1281–5.PubMedView ArticleGoogle Scholar
- Sato Y, Probst HC, Tatsumi R, Ikeuchi Y, Neuberger MS, Rada C. Deficiency in APOBEC2 leads to a shift in muscle fiber type, diminished body mass, and myopathy. J Biol Chem. 2010;285(10):7111–8.PubMed CentralPubMedView ArticleGoogle Scholar
- Muramatsu M, Sankaranand VS, Anant S, Sugai M, Kinoshita K, Davidson NO, et al. Specific expression of activation-induced cytidine deaminase (AID), a novel member of the RNA-editing deaminase family in germinal center B cells. J Biol Chem. 1999;274(26):18470–6.PubMedView ArticleGoogle Scholar
- Daniel J, Nussenzweig A. The AID-induced DNA damage response in chromatin. Mol Cell. 2013;50(3):309–21.PubMed CentralPubMedView ArticleGoogle Scholar
- Kinomoto M, Kanno T, Shimura M, Ishizaka Y, Kojima A, Kurata T, et al. All APOBEC3 family proteins differentially inhibit LINE-1 retrotransposition. Nucleic Acids Res. 2007;35(9):2955–64.PubMed CentralPubMedView ArticleGoogle Scholar
- Hultquist JF, Lengyel JA, Refsland EW, LaRue RS, Lackey L, Brown WL, et al. Human and rhesus APOBEC3D, APOBEC3F, APOBEC3G, and APOBEC3H demonstrate a conserved capacity to restrict vif-deficient HIV-1. J Virol. 2011;85(21):11220–34.PubMed CentralPubMedView ArticleGoogle Scholar
- Langlois MA, Beale RCL, Conticello SG, Neuberger MS. Mutational comparison of the single-domained APOBEC3C and double-domained APOBEC3F/G anti-retroviral cytidine deaminases provides insight into their DNA target site specificities. Nucleic Acids Res. 2005;33(6):1913–23.PubMed CentralPubMedView ArticleGoogle Scholar
- Burns MB, Lackey L, Carpenter MA, Rathore A, Land AM, Leonard B, et al. APOBEC3B is an enzymatic source of mutation in breast cancer. Nature. 2013;494(7437):366–70.PubMed CentralPubMedView ArticleGoogle Scholar
- Burns MB, Temiz NA, Harris RS. Evidence for APOBEC3B mutagenesis in multiple human cancers. Nat Genet. 2013;45(9):977–83.PubMed CentralPubMedView ArticleGoogle Scholar
- Tsuboi M, Yamane A, Horiguchi J, Yokobori T, Kawabata-Iwakawa R, Yoshiyama S, et al. APOBEC3B high expression status is associated with aggressive phenotype in Japanese breast cancers. Breast Cancer. 2015:1–9. doi:https://doi.org/10.1007/s12282-015-0641-8
- Cescon DW, Haibe-Kains B, Mak TW. APOBEC3B expression in breast cancer reflects cellular proliferation, while a deletion polymorphism is associated with immune activation. Proc Natl Acad Sci. 2015;112(9):2841–6.PubMed CentralPubMedView ArticleGoogle Scholar
- Leonard B, Hart SN, Burns MB, Carpenter MA, Temiz NA, Rathore A, et al. APOBEC3B upregulation and genomic mutation patterns in serous ovarian carcinoma. Cancer Res. 2013;73(24):7222–31.PubMedView ArticleGoogle Scholar
- Roberts SA, Lawrence MS, Klimczak LJ, Grimm SA, Fargo D, Stojanov P, et al. An APOBEC cytidine deaminase mutagenesis pattern is widespread in human cancers. Nat Genet. 2013;45(9):970–6.PubMed CentralPubMedView ArticleGoogle Scholar
- Nik-Zainal S, Alexandrov LB, Wedge DC, Van Loo P, Greenman CD, Raine K, et al. Mutational processes molding the genomes of 21 breast cancers. Cell. 2012;149(5):979–93.PubMed CentralPubMedView ArticleGoogle Scholar
- Kan Z, Jaiswal BS, Stinson J, Janakiraman V, Bhatt D, Stern HM, et al. Diverse somatic mutation patterns and pathway alterations in human cancers. Nature. 2010;466(7308):869–73.PubMedView ArticleGoogle Scholar
- Network TCGA. Comprehensive molecular portraits of human breast tumours. Nature. 2012;490(7418):61–70.View ArticleGoogle Scholar
- Stephens PJ, McBride DJ, Lin ML, Varela I, Pleasance ED, Simpson JT, et al. Complex landscapes of somatic rearrangement in human breast cancer genomes. Nature. 2009;462(7276):1005–10.PubMed CentralPubMedView ArticleGoogle Scholar
- Consortium TEP. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489(7414):57–74.View ArticleGoogle Scholar
- Hon GC, Hawkins RD, Caballero OL, Lo C, Lister R, Pelizzola M, et al. Global DNA hypomethylation coupled to repressive chromatin domain formation and gene silencing in breast cancer. Genome Res. 2012;22(2):246–58.PubMed CentralPubMedView ArticleGoogle Scholar
- Sun Z, Asmann YW, Kalari KR, Bot B, Eckel-Passow JE, Baker TR, et al. Integrated analysis of gene expression, CpG island methylation, and gene copy number in breast cancer cells by deep sequencing. PLoS ONE. 2011;6(2):e17490.PubMed CentralPubMedView ArticleGoogle Scholar
- Frietze S, Wang R, Yao L, Tak Y, Ye Z, Gaddis M, et al. Cell type-specific binding patterns reveal that TCF7L2 can be tethered to the genome by association with GATA3. Genome Biol. 2012;13(9):R52.PubMed CentralPubMedView ArticleGoogle Scholar
- Kouzarides T. Chromatin modifications and their function. Cell. 2007;128(4):693–705.PubMedView ArticleGoogle Scholar
- Mikkelsen TS, Ku M, Jaffe DB, Issac B, Lieberman E, Giannoukos G, et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature. 2007;448(7153):553–60.PubMed CentralPubMedView ArticleGoogle Scholar
- Wang Z, Zang C, Rosenfeld JA, Schones DE, Barski A, Cuddapah S, et al. Combinatorial patterns of histone acetylations and methylations in the human genome. Nat Genet. 2008;40(7):897–903.PubMed CentralPubMedView ArticleGoogle Scholar
- Heintzman ND, Stuart RK, Hon G, Fu Y, Ching CW, Hawkins RD, et al. Distinct and predictive chromatin signatures of transcriptional promoters and enhancers in the human genome. Nat Genet. 2007;39(3):311–8.PubMedView ArticleGoogle Scholar
- Zhao XD, Han X, Chew JL, Liu J, Chiu KP, Choo A, et al. Whole-genome mapping of histone H3 Lys4 and 27 trimethylations reveals distinct genomic compartments in human embryonic stem cells. Cell Stem Cell. 2007;1(3):286–98.PubMedView ArticleGoogle Scholar
- Jones PA. Functions of DNA methylation: islands, start sites, gene bodies and beyond. Nat Rev Genet. 2012;13(7):484–92.PubMedView ArticleGoogle Scholar
- Weisenberger DJ. Characterizing DNA, methylation alterations from The Cancer Genome Atlas. J Clin Invest. 2014;124(1):17–23.PubMed CentralPubMedView ArticleGoogle Scholar
- Kidd JM, Newman TL, Tuzun E, Kaul R, Eichler EE. Population stratification of a common APOBEC gene deletion polymorphism. PLoS Genet. 2007;3(4):e63.PubMed CentralPubMedView ArticleGoogle Scholar
- Creyghton MP, Cheng AW, Welstead GG, Kooistra T, Carey BW, Steine EJ, et al. Histone H3K27ac separates active from poised enhancers and predicts developmental state. Proc Natl Acad Sci. 2010;107(50):21931–6.PubMed CentralPubMedView ArticleGoogle Scholar
- Guttman M, Amit I, Garber M, French C, Lin MF, Feldser D, et al. Chromatin signature reveals over a thousand highly conserved large non-coding RNAs in mammals. Nature. 2009;458(7235):223–7.PubMed CentralPubMedView ArticleGoogle Scholar
- Fujimura S, Matsui T, Kuwahara K, Maeda K, Sakaguchi N. Germinal center B-cell-associated DNA hypomethylation at transcriptional regions of the AID gene. Mol Immunol. 2008;45(6):1712–9.PubMedView ArticleGoogle Scholar
- Isobe T, Song SN, Tiwari P, Ito H, Yamaguchi Y, Yoshizaki K. Activation-induced cytidine deaminase auto-activates and triggers aberrant gene expression. FEBS Lett. 2013;587(16):2487–92.PubMedView ArticleGoogle Scholar
- Pauklin S, Sernandez IV, Bachmann G, Ramiro AR, Petersen-Mahrt SK. Estrogen directly activates AID transcription and function. J Exp Med. 2009;206(1):99–111.PubMed CentralPubMedView ArticleGoogle Scholar
- Gaughan L, Stockley J, Coffey K, ONeill D, Jones DL, Wade M, et al. KDM4B is a master regulator of the estrogen receptor signalling cascade. Nucleic Acids Res. 2013;41(14):6892–904.PubMed CentralPubMedView ArticleGoogle Scholar
- Lackey L, Law EK, Brown WL, Harris RS. Subcellular localization of the APOBEC3 proteins during mitosis and implications for genomic DNA deamination. Cell Cycle. 2013;12(5):762–72.PubMed CentralPubMedView ArticleGoogle Scholar
- Niewiadomska AM, Tian C, Tan L, Wang T, Sarkis PTN, Yu XF. Differential inhibition of long interspersed element 1 by APOBEC3 does not correlate with high-molecular-mass-complex formation or P-body association. J Virol. 2007;81(17):9577–83.PubMed CentralPubMedView ArticleGoogle Scholar
- Muckenfuss H, Hamdorf M, Held U, Perkovic M, Lower J, Cichutek K, et al. APOBEC3 proteins inhibit human LINE-1 retrotransposition. J Biol Chem. 2006;281(31):22161–72.PubMedView ArticleGoogle Scholar
- Fujino T, Navaratnam N, Scott J. Human apolipoprotein B RNA editing deaminase gene (APOBEC1). Genomics. 1998;47(2):266–75.PubMedView ArticleGoogle Scholar
- LaRue RS, Andresdottir V, Blanchard Y, Conticello SG, Derse D, Emerman M, et al. Guidelines for naming nonprimate APOBEC3 genes and proteins. J Virol. 2009;83(2):494–7.PubMed CentralPubMedView ArticleGoogle Scholar
- Kitamura S, Ode H, Nakashima M, Imahashi M, Naganawa Y, Kurosawa T, et al. The APOBEC3C crystal structure and the interface for HIV-1 Vif binding. Nat Struct Mol Biol. 2012;19(10):1005–10.PubMedView ArticleGoogle Scholar
- Siu KK, Sultana A, Azimi FC, Lee JE. Structural determinants of HIV-1 Vif susceptibility and DNA binding in APOBEC3F. Nat Commun. 2013;4–2593.Google Scholar
- Mitra M, Hercik K, Byeon IJ, Ahn J, Hill S, Hinchee-Rodriguez K, et al. Structural determinants of human APOBEC3A enzymatic and nucleic acid binding properties. Nucleic Acids Research. 2014;42(2):1095-110.Google Scholar
- Horn AV, Klawitter S, Held U, Berger A, Jaguva Vasudevan AA, Bock A, et al. Human LINE-1 restriction by APOBEC3C is deaminase independent and mediated by an ORF1p interaction that affects LINE reverse transcriptase activity. Nucleic Acids Res. 2014;42(1):396–416.PubMed CentralPubMedView ArticleGoogle Scholar
- Gary LB, Robert DCR, Thomas GF. Expression of LINE-1 retrotransposons in human breast cancer. Cancer. 1994;73(9):2333–6.View ArticleGoogle Scholar
- Belgnaoui SM, Gosden R, Semmes OJ, Haoudi A. Human LINE-1 retrotransposon induces DNA damage and apoptosis in cancer cells. Cancer Cell Int. 2006;6(1):13.PubMed CentralPubMedView ArticleGoogle Scholar
- Gasior SL, Wakeman TP, Xu B, Deininger PL. The human LINE-1 retrotransposon creates DNA double-strand breaks. J Mol Biol. 2006;357(5):1383–93.PubMed CentralPubMedView ArticleGoogle Scholar
- Daskalos A, Nikolaidis G, Xinarianos G, Savvari P, Cassidy A, Zakopoulou R, et al. Hypomethylation of retrotransposable elements correlates with genomic instability in non-small cell lung cancer. Int J Cancer. 2009;124(1):81–7.PubMedView ArticleGoogle Scholar
- Hancks DC, Kazazian J. Active human retrotransposons: variation and disease. Curr Opin Genet Dev. 6/20122012. 2012;22(3):191–203.Google Scholar
- Symer DE, Connelly C, Szak ST, Caputo EM, Cost GJ, Parmigiani G, et al. Human L1 retrotransposition is associated with genetic instability in vivo. Cell. 2002;110(3):327–38.PubMedView ArticleGoogle Scholar
- Barrett T, Troup DB, Wilhite SE, Ledoux P, Evangelista C, Kim IF, et al. NCBI GEO: archive for functional genomics data sets—10 years on. Nucleic Acids Res. 2011;39 suppl 1:D1005–D10.PubMed CentralPubMedView ArticleGoogle Scholar
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Meth. 2012;9(4):357–9.View ArticleGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map (SAM) Format and SAMtools. Bioinformatics. 2009.Google Scholar
- Yan G, Zhang G, Fang X, Zhang Y, Li C, Ling F, et al. Genome sequencing and comparison of two nonhuman primate animal models, the cynomolgus and Chinese rhesus macaques. Nat Biotech. 2011;29(11):1019–23.View ArticleGoogle Scholar
- Zhang Y, Liu T, Meyer C, Eeckhoute J, Johnson D, Bernstein B, et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 2008;9(9):R137.PubMed CentralPubMedView ArticleGoogle Scholar
- Ernst J, Kellis M. ChromHMM: automating chromatin-state discovery and characterization. Nat Meth. 2012;9(3):215–6.View ArticleGoogle Scholar
- Xi Y, Li W. BSMAP: whole genome bisulfite sequence MAPping program. BMC Bioinformatics. 2009;10(1):232.PubMed CentralPubMedView ArticleGoogle Scholar
- Korn JM, Kuruvilla FG, McCarroll SA, Wysoker A, Nemesh J, Cawley S, et al. Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet. 2008;40(10):1253–60.PubMed CentralPubMedView ArticleGoogle Scholar