Whole-genome sequencing of Chinese centenarians reveals important genetic variants in aging WGS of centenarian for genetic analysis of aging
Human Genomics volume 14, Article number: 23 (2020)
Genetic research on longevity has provided important insights into the mechanism of aging and aging-related diseases. Pinpointing important genetic variants associated with aging could provide insights for aging research.
We performed whole-genome sequencing of 19 centenarians to establish the genetic basis of human longevity.
Using SKAT analysis, we found 41 significantly correlated genes in centenarians as compared to control genomes. Pathway enrichment analysis of these genes showed that immune-related pathways were enriched, suggesting that immune pathways might be critically involved in aging. HLA typing was next performed based on the whole-genome sequencing data obtained. We discovered that several HLA subtypes were significantly overrepresented.
Our study indicated a new mechanism of longevity, suggesting potential genetic variants for further study.
With the development of human genomics research, a large number of studies of the genetics of longevity have been conducted. Scientists from various countries have proposed many different theories concerning the mechanisms of aging from different perspectives, involving oxidative stress, energy metabolism, signal transduction pathways, immune response, etc. [1, 2]. These mechanisms interact with each other and are influenced by heredity to some degree [2, 3]. The identification of longevity-related biological markers is critical to an in-depth understanding of the mechanisms of carrier protection against common disease and/or of the retardation of the process of aging.
Studies have identified between 300 and 750 longevity-related genes that are critically involved in a variety of life activities, such as growth and development, energy metabolism, oxidative stress, genomic stability maintenance, and neurocognition. These candidate genes include mainly APOE, a gene involved in lipoprotein metabolism [5, 6]. Others are involved in cell cycle regulation, cell growth and signal transduction, the maintenance of genome stability, and the endocrine-related pathway [7,8,9]. In addition, the candidates for longevity encompass genes related to drug metabolism, genes involved in protein folding, stabilization, and degradation, as well as those related to coagulation and regulation of circulation, etc. In most cases, these genes or their polymorphic sites were examined in multiple population replication studies, which discovered certain longevity-associated genes or pathways [4,5,6,7,8,9,10].
In addition, longevity is associated with immunity and inflammation [11, 12]. The HLA genes, also known as the major histocompatibility complex (MHC) genes, form a gene family that exists in most vertebrate genomes and is closely related to the immune system. An earlier study indicated that HLA may be the genetic basis for the specific response patterns of longevity and longevity immunity. Inflammatory cytokines, such as TNF-α, IL-1β, and IL-6, may be key players. IL-10 limits and terminates the inflammatory response by inhibiting the action of T cells, monocytes, and macrophages, and thus, the genetic variants of this gene may also affect the longevity phenotype.
However, at present, most investigations of longevity factors in centenarians are performed on a small number of candidate miRNAs, single tissues, or single samples, and only a few studies have systematically conducted analyses of multiple tissues and copies at the whole-genome level [17,18,19].
Building on the results of a previous cohort study, we aimed to use whole-genome sequencing to conduct genome-wide association analyses of centenarians. Our findings would facilitate a more accurate focus on the most important genetic basis and molecular mechanisms associated with longevity. The conclusions of this study can serve as a basis for public efforts towards extending the length of life. Moreover, they will provide a scientific reference for further clinical research on disease treatment and overall health care promotion.
SKAT analysis revealed significantly correlated genes in aging
The sequencing platform Illumina XTen (Illumina, San Diego, CA, USA) was used for sequencing of the entire genome of 19 centenarians at an average depth of 30×. The sequencing quality metrics are provided in Supplementary Table 1. Baseline information of the centenarians is shown in Table 1. The identified variants were annotated, and non-synonymous variants affecting gene function were selected for association analysis.
The experimental design is shown in Fig. 1. Association analysis was applied to the WGS data to find important genes and pathways. All centenarians and controls were Eastern Asiatic Mongoloids ascertained to be of Chinese descent (Zhejiang Province, Southeast China). The correlative analysis involved mainly association analysis, together with SKAT and burden tests for rare variants. These methods are commonly employed in GWAS research, especially for case/control samples. Associated variants are identified by the difference in their frequency of occurrence between case and control samples. Association analysis, which is generally directed against common variants, detects variations associated with the phenotype of the disease, where both the frequency of occurrence and the variation itself contribute to the phenotype. Annotations and literature surveys can then be used on significantly related variants to further determine the effect of the related genes and variants on gene function.
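To make the difference between a burden-type and a SKAT-type statistic concrete, here is a minimal sketch in plain Python. It is not the implementation used in the study: it assumes a binary phenotype with no covariates, Beta(1, 25) density weights (a commonly used SKAT default), and a permutation p-value in place of the usual asymptotic one. All function names and the toy genotype matrix are hypothetical.

```python
import random

def maf_per_variant(genotypes):
    """MAF of each column; genotypes[i][j] is a minor-allele count (0, 1 or 2)."""
    n_samples = len(genotypes)
    return [sum(row[j] for row in genotypes) / (2 * n_samples)
            for j in range(len(genotypes[0]))]

def beta_1_25_weights(mafs):
    """Beta(1, 25) density at each MAF; rarer variants get larger weights."""
    return [25.0 * (1.0 - p) ** 24 for p in mafs]

def score_per_variant(genotypes, phenotype):
    """S_j = sum_i G_ij * (y_i - mean(y)) for each variant j."""
    mean_y = sum(phenotype) / len(phenotype)
    return [sum(row[j] * (y - mean_y) for row, y in zip(genotypes, phenotype))
            for j in range(len(genotypes[0]))]

def skat_q(genotypes, phenotype, weights):
    """Variance-component statistic: sum of squared weighted scores."""
    scores = score_per_variant(genotypes, phenotype)
    return sum((w * s) ** 2 for w, s in zip(weights, scores))

def burden_q(genotypes, phenotype, weights):
    """Burden statistic: square of the summed weighted scores."""
    scores = score_per_variant(genotypes, phenotype)
    return sum(w * s for w, s in zip(weights, scores)) ** 2

def permutation_p(stat_fn, genotypes, phenotype, weights, n_perm=500, seed=0):
    """Empirical p-value obtained by shuffling case/control labels."""
    rng = random.Random(seed)
    observed = stat_fn(genotypes, phenotype, weights)
    labels = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if stat_fn(genotypes, labels, weights) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

# Toy gene with two variants: one seen only in cases, one only in controls.
toy_phenotype = [1, 1, 1, 1, 0, 0, 0, 0]
toy_genotypes = [[1, 0], [1, 0], [0, 0], [0, 0],
                 [0, 1], [0, 1], [0, 0], [0, 0]]
toy_weights = beta_1_25_weights(maf_per_variant(toy_genotypes))
```

In this toy gene the two variants act in opposite directions, so the burden statistic cancels to zero while the SKAT statistic stays positive. This is the usual argument for running SKAT alongside a burden test when effect directions may be mixed.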
Based on the whole-genome sequencing data, this analysis was performed on the variants detected with MAF > 1%. The control sample consisted of selected 1000G East Asian population data, and the total number of control samples was 208. PCA was conducted to evaluate the stratification of the case and control groups (Supplementary Fig. 1). For rare variants with a low frequency of variation, a rare-variant case/control association test was performed.
A total of 41 significantly correlated genes were obtained through SKAT analysis (Supplementary Table 2). The top 10 genes were as follows: PABPC3, BAGE2, HLA-DRB1, PDE4DIP, PADI4, CHI3L2, MUC17, WARS, HLA-DRB5, and SIRPB1 (Table 2).
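The frequency split described above can be sketched as follows. This is an illustration, not the authors' code: it assumes diploid genotypes coded as alternate-allele counts (0, 1, 2, or None for a missing call), and the function and variant names are hypothetical. Only the 1% threshold comes from the text.

```python
def minor_allele_frequency(genotypes):
    """Return the MAF of one variant, or None if no genotype was called."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return None
    alt_freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two alleles is less frequent.
    return min(alt_freq, 1.0 - alt_freq)

def split_by_maf(variants, threshold=0.01):
    """Split variants into common (MAF > threshold) and rare sets."""
    common, rare = {}, {}
    for variant_id, genotypes in variants.items():
        maf = minor_allele_frequency(genotypes)
        if maf is None or maf == 0.0:
            continue  # uncalled or monomorphic: no association signal
        (common if maf > threshold else rare)[variant_id] = maf
    return common, rare

# Toy data for 100 samples (hypothetical variant identifiers).
toy_variants = {
    "var_common": [1] * 20 + [0] * 80,   # 20 alt alleles of 200
    "var_rare": [1] + [0] * 99,          # 1 alt allele of 200
    "var_mono": [0] * 100,               # never observed
}
common_set, rare_set = split_by_maf(toy_variants)
```

In this sketch the common set would go to single-variant association testing and the rare set to the gene-level rare-variant tests.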
Immune system-related pathway was significantly enriched
The significant genes were subjected to pathway enrichment analysis. MSigDB was used to test for enrichment of KEGG and Reactome pathways. As can be seen in Table 3, the associated genes were significantly enriched in pathways related to immune and inflammatory responses, such as those of interferons, antibodies, and immunity.
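Enrichment of this kind is commonly scored as over-representation with a hypergeometric tail probability. The sketch below shows that calculation under stated assumptions; it is not necessarily the exact procedure of the tools named in the text. The background size, pathway size, and overlap are hypothetical toy numbers, and only the gene-list size of 41 comes from the text.

```python
from math import comb

def hypergeom_enrichment_p(n_background, n_pathway, n_hits, n_overlap):
    """P(X >= n_overlap) when n_hits genes are drawn without replacement
    from n_background genes, of which n_pathway belong to the pathway."""
    upper = min(n_pathway, n_hits)
    total = comb(n_background, n_hits)
    tail = sum(
        comb(n_pathway, k) * comb(n_background - n_pathway, n_hits - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total

# Toy example: 41 significant genes, 5 of which fall in a 100-gene pathway,
# against an assumed background of 20,000 genes.
toy_p = hypergeom_enrichment_p(20000, 100, 41, 5)
```

When many pathways are tested, the resulting p-values would normally be corrected for multiple testing before any pathway is called enriched.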
HLA subtypes are correlated with aging
Based on the whole-genome sequencing data obtained, HLA typing was performed. Through the analysis of HLA type distribution, as presented in Tables 3 and 4 and Fig. 2, we found that the class II HLA genes had an important relationship with longevity. Among them, HLA-DRB1*13:02, HLA-DRB1*14:01, and HLA-DRB1*16:02 were significantly associated with longevity.
Research on the genetic mechanisms of longevity has been conducted from many perspectives, including longevity-related genes, variants, and biological pathways [4, 10]. With advances in NGS technology and analysis algorithms, more and more longevity-related genetic features can be found, which will be useful for understanding the mechanisms of longevity and related diseases [21, 22].
In our study, we used a small centenarian cohort to establish the association between genetic variants and longevity. SKAT analysis identified rare variants related to the longevity phenotype. HLA-DRB1, HLA-DRB5, and PDE4DIP have been reported to be associated with longevity [23,24,25]. Importantly, HLA-DRB1 variants have been specifically reported to be significantly enriched in a French centenarian study. Further, through the analysis of HLA type distribution, we found several subtypes of HLA-DRB1 that have a closer relationship with longevity. Other significantly related genes found in this study, such as PABPC3, BAGE2, PADI4, CHI3L2, MUC17, WARS, and SIRPB1, have not been reported before, and their functions deserve further study. Pathway enrichment analysis showed an important association between immune-related pathways and the aging process. Previous studies revealed that immune and inflammatory responses are closely related to aging-related diseases (ARD) [27, 28]. The significant differences in gene enrichment of the related pathways suggest that a possible longevity mechanism may be associated with protective variants in genes that occur in those pathways. Here, we have shown that genome-wide data can be mined further than is possible in traditional SNP studies. For example, HLA types derived from the same data could also be associated with the phenotype, which illustrates one way of extending the mining of genome-wide data. Nonetheless, the relatively small sample size limited the power of our findings, and thus further validation by large cohort studies is required.
In conclusion, the findings of our study provide novel insights into aging mechanisms, suggesting the involvement of several genes, pathways, and specific HLA subtypes that are worth further investigation.
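One common way to compare the frequency of an HLA allele between cases and controls is a two-sided Fisher's exact test on allele counts. The text does not state which test was used, so the sketch below is an assumption rather than the authors' method. The helper names and genotype calls are hypothetical toy data, not results from this study.

```python
from math import comb

def allele_table(case_calls, control_calls, allele):
    """Build a 2x2 allele-count table; each call is a pair of alleles."""
    case_hits = sum(pair.count(allele) for pair in case_calls)
    ctrl_hits = sum(pair.count(allele) for pair in control_calls)
    return (case_hits, 2 * len(case_calls) - case_hits,
            ctrl_hits, 2 * len(control_calls) - ctrl_hits)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x hits in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1)
                if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, total)

# Toy calls: two alleles per individual at one locus.
toy_cases = [("DRB1*13:02", "DRB1*09:01"), ("DRB1*13:02", "DRB1*13:02")]
toy_controls = [("DRB1*09:01", "DRB1*13:02"), ("DRB1*09:01", "DRB1*15:01")]
toy_table = allele_table(toy_cases, toy_controls, "DRB1*13:02")
toy_p_value = fisher_exact_two_sided(*toy_table)
```

With only 19 cases, an exact test of this kind is preferable to a chi-square approximation, because expected cell counts for individual HLA alleles are small.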
Materials and methods
A random number table was used to select participants from the centenarian cohort in Zhejiang Province, China, and whole-blood samples were collected from them. Ten percent of the centenarians, that is, 19 centenarians, were chosen. All of them were free of major age-related diseases, i.e., cardiovascular or cerebrovascular disease, cancer, dementia, renal or hepatic failure, etc. They were informed about the study and signed a letter of consent, in accordance with the guidelines of the Ethics Committee of the First Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China (2015-KL-008-01). Peripheral blood mononuclear cells (PBMCs) were isolated for extraction of genomic DNA. Nineteen sex-matched samples with good physical condition were randomly selected from 1000G East Asian population data to serve as a control group. All centenarians and controls were Eastern Asiatic Mongoloids ascertained to be of Chinese descent (Zhejiang Province, Southeast China).
Library preparation and whole-genome sequencing
Indexed Illumina NGS libraries were prepared from plasma DNA and germline genomic DNA. Next, an NGS library was prepared using the KAPA library preparation kit (Kapa Biosystems). Agencourt AMPure XP beads (Beckman-Coulter) were used to purify the extracted DNA. A 100-fold mole excess of ligation Illumina TruSeq adaptors was used for ligation at 16 °C for 16 h. Size selection of DNA fragments was performed in the 100-μL solution system, and then, the connected fragments were amplified with 500 μm Illumina backbone oligonucleotides for 4–9 rounds of PCR. After that, the DNA fragments were input.
The library concentration was assessed by Qubit and qPCR. The fragment length was determined using a 2100 Bioanalyzer with a DNA 1000 kit (Agilent). DNA fragments were mixed with HiFi Hot Start Ready Mix (1×), and 2 × 150 bp sequencing of multiple libraries was finally performed with Illumina HiSeq X10.
Paired reads were aligned to the hg19 reference genome using the BWA-MEM command of BWA (v0.7.15-r1140). They were then sorted and indexed using SAMtools. An in-house Python script was used to evaluate the statistics collected, including mapping statistics, read quality, and panel capture efficiency.
For each sample, the SAMtools pileup function was employed to generate variant candidates among the corresponding sites. We excluded the SNP sites and lower depth sites (≤ eu) among the candidates and removed the reads with low base quality (< Q30) or low mapping quality (< 40).
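The align, sort, and index steps can be sketched as a small command builder. This is an illustration only: the file names and thread count are hypothetical, the commands are built but not executed, and the `samtools sort -o` form assumes SAMtools 1.3 or later, since the text does not state the SAMtools version.

```python
import shlex

def alignment_commands(reference, fastq_r1, fastq_r2, sample, threads=4):
    """Return shell commands that align paired reads, then sort and index."""
    q = shlex.quote
    bam = f"{sample}.sorted.bam"
    # bwa mem writes SAM to stdout; samtools sort reads it from stdin ("-").
    align_and_sort = (
        f"bwa mem -t {threads} {q(reference)} {q(fastq_r1)} {q(fastq_r2)}"
        f" | samtools sort -@ {threads} -o {q(bam)} -"
    )
    index = f"samtools index {q(bam)}"
    return [align_and_sort, index]

# Hypothetical file names for a single sample.
toy_commands = alignment_commands(
    "hg19.fa", "s1_R1.fastq.gz", "s1_R2.fastq.gz", "s1"
)
for command in toy_commands:
    print(command)
```

Piping the aligner into the sorter avoids writing an intermediate uncompressed SAM file, which matters at 30× whole-genome depth.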
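The quality thresholds above can be read as a per-observation filter, sketched below. The Q30 and MAPQ 40 cutoffs come from the text. The `PileupBase` record and the function names are hypothetical. The depth cutoff is not legible in the source, so it is left to the caller.

```python
from dataclasses import dataclass

@dataclass
class PileupBase:
    """One read base covering a candidate site (hypothetical record)."""
    base_quality: int      # Phred-scaled base quality
    mapping_quality: int   # MAPQ of the read carrying the base

def phred_to_error_prob(q):
    """Convert a Phred score to an error probability: p = 10^(-Q/10)."""
    return 10 ** (-q / 10)

def passes_quality(obs, min_base_q=30, min_map_q=40):
    """Keep a base only if both quality scores reach their thresholds."""
    return obs.base_quality >= min_base_q and obs.mapping_quality >= min_map_q

def usable_depth(observations, min_base_q=30, min_map_q=40):
    """Count the observations at a site that survive the quality filter."""
    return sum(passes_quality(o, min_base_q, min_map_q) for o in observations)

# Toy site covered by four read bases.
toy_site = [
    PileupBase(37, 60),  # kept
    PileupBase(29, 60),  # dropped: base quality below Q30
    PileupBase(35, 20),  # dropped: mapping quality below 40
    PileupBase(30, 40),  # kept: both values exactly at threshold
]
toy_depth = usable_depth(toy_site)
```

A Q30 base has an expected error rate of 1 in 1,000, so this filter discards bases whose miscall probability exceeds 0.1%.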
The file with the data of the centenarian variants was subjected to SKAT. Functional annotation of the genomic variation of each sample was performed, distinguishing between rare variants and variants affecting the protein function. Next, the influence of the variants in the gene was scored. We used the aforementioned three methods to test candidate genome variants to identify potential rare variants associated with the phenotype. Literature and databases were searched to find how the associated variants affected the biological processes, and speculation on disease mechanisms was carried out.
Pathway enrichment analysis
The KEGG pathway enrichment was conducted using the DAVID Functional Annotation Bioinformatics Microarray Analysis.
HLA typing was performed through the HLAscan algorithm. HLAscan started with sequence reads in FASTQ format for mapping to IMGT/HLA data. For targeted sequencing data, sequence reads were used as direct input for HLAscan, whereas for WGS and WES data, we selected reads for HLA genes prior to running the HLAscan. In comparison with the targeted sequencing data, alignment of whole-genome/exome data directly to the IMGT/HLA database may lead to the omission of some HLA reads. Nonetheless, this algorithm was adopted because alignment of HLA reads to the IMGT/HLA database is advantageous in regard to both time and computational processing without loss of predictive accuracy. Initial alignment was performed using BWA-MEM (v0.7.10-r789) with default options. The alignment was the best fit for HLAscan in our investigation, which involved many allele sequences in IMGT/HLA and BWA-MEM. Sequence reads in the BAM file were sorted by reference coordinates using the FixMateInformation function, followed by removal of duplicate reads using MarkDuplicates in the Picard software package (version 1.68) (http://picard.sourceforge.net). Subsequently, identification of indels and re-alignment around these features were performed with the RealignerTargetCreator and IndelRealigner tools, respectively, and base-pair quality scores were recalibrated with BaseRecalibrator and PrintReads using the GATK software (version 3.3.0). Throughout these processes, sequence reads corresponding to the exonic regions of HLA genes were selected based on an initial alignment generated using GATK with a whole-genome reference (GRCh37.p13). This filtering step does not classify the sequence reads into specific HLA genes.
Availability of data and materials
Please contact the author for data requests.
Santos-Lozano A, Santamarina A, Pareja-Galeano H, Sanchis-Gomar F, Fiuza-Luces C, Cristi-Montero C, Bernal-Pino A, Lucia A, Garatachea N. The genetics of exceptional longevity: insights from centenarians. Maturitas. 2016;90:49–57 https://doi.org/10.1016/j.maturitas.2016.05.006.
Govindaraju D, Gil A, Barzilai N. Genetics, lifestyle and longevity: lessons from centenarians. Applied and Translational Genomics. 2015;4:23–32 https://doi.org/10.1016/j.atg.2015.01.001.
Gierman HJ, Kristen F, Roach JC, Coles NS, Li H, Gustavo G, Markov GJ, Smith JD, Leroy H, Stephen CL, Kim SK. Whole-genome sequencing of the world’s oldest people. PLoS One. 2014;9:1–10 https://doi.org/10.1371/journal.pone.0112430.
Budovsky A, Craig T, Wang J, Tacutu R, Csordas A, Lourenço J, Fraifeld VE, de Magalhães JP. LongevityMap: a database of human genetic variants associated with longevity. Trends Genet. 2013;29:559–60 https://doi.org/10.1016/j.tig.2013.08.003.
Schächter F, Faure-Delanef L, Guénot F, Rouger H, Froguel P, Lesueur-Ginot L, Cohen D. Genetic associations with human longevity at the APOE and ACE loci. Nat Genet. 1994;6:29–32 https://doi.org/10.1038/ng0194-29.
Garatachea N, Emanuele E, Calero M, Fuku N, Arai Y, Abe Y, Murakami H, Miyachi M, Yvert T, Verde Z, Zea MA, Venturini L, Santiago C, Santos-Lozano A, Rodríguez-Romo G, Ricevuti G, Hirose N, Rábano A, Lucia A. ApoE gene and exceptional longevity: insights from three independent cohorts. Exp Gerontol. 2014;53:16–23 https://doi.org/10.1016/j.exger.2014.02.004.
Willcox BJ, Donlon TA, He Q, Chen R, Grove JS, Yano K, Masaki KH, Willcox DC, Rodriguez B, Curb JD. FOXO3A genotype is strongly associated with human longevity. PNAS. 2008;105:13987–92 https://doi.org/10.1073/pnas.0801030105.
Mustafina OE, Nasibullin TR, Érdman VV, Tuktarova IA. Association analysis of polymorphic loci of TP53 and NFKB1 genes with human age and longevity. Adv Gerontol. 2011;24:397–404 https://doi.org/10.1134/S2079057012020129.
Barbieri M, Bonafè M, Franceschi C, Paolisso G. Insulin/IGF-I-signaling pathway: an evolutionarily conserved mechanism of longevity from yeast to humans. Am J Physiol Endocrinol Metab. 2003;285:E1064–71 https://doi.org/10.1152/ajpendo.00296.2003.
Christensen K, Johnson TE, Vaupel JW. The quest for genetic determinants of human longevity: challenges and insights. Nat Rev Genet. 2006;7:436–48 https://doi.org/10.1038/nrg1871.
Franceschi C, Bonafè M, Valensin S, Olivieri F, De Luca M, Ottaviani E, De Benedictis G. Inflamm-aging. An evolutionary perspective on immunosenescence. Ann N Y Acad Sci. 2000;908:244–54 https://doi.org/10.1111/j.1749-6632.2000.tb06651.x.
Franceschi C, Capri M, Monti D, Giunta S, Olivieri F, Sevini F, Panourgia MP, Invidia L, Celani L, Scurti M, Cevenini E, Castellani GC, Salvioli S. Inflammaging and anti-inflammaging: a systemic perspective on aging and longevity emerged from studies in humans. Mech Ageing Dev. 2007;128:92–105 https://doi.org/10.1016/j.mad.2006.11.016.
Marsh SG, Albert ED, Bodmer WF, Bontrop RE, Dupont B, Erlich HA, Fernández-Viña M, Geraghty DE, Holdsworth R, Hurley CK, Lau M, Lee KW, Mach B, Maiers M, Mayr WR, Müller CR, Parham P, Petersdorf EW, Sasazuki T, Strominger JL, Svejgaard A, Terasaki PI, Tiercy JM, Trowsdale J. Nomenclature for factors of the HLA system, 2010. Tissue Antigens. 2010;75:291–455 https://doi.org/10.1111/j.1399-0039.2010.01466.x.
Caruso C, Candore G, Colonna Romano G, Lio D, Bonafè M, Valensin S, Franceschi C. HLA, aging, and longevity: a critical reappraisal. Hum Immunol. 2000;61:942–9 https://doi.org/10.1016/S0198-8859(00)00168-3.
Chung HY, Cesari M, Anton S, Marzetti E, Giovannini S, Seo AY, Carter C, Yu BP, Leeuwenburgh C. Molecular inflammation: underpinnings of aging and age-related diseases. Ageing Res Rev. 2009;8:18–30 https://doi.org/10.1016/j.arr.2008.07.002.
Lio D, Scola L, Crivello A, Colonna-Romano G, Candore G, Bonafè M, Cavallone L, Franceschi C, Caruso C. Gender-specific association between -1082 IL-10 promoter polymorphism and longevity. Genes Immun. 2002;3:30–3 https://doi.org/10.1038/sj.gene.6363827.
Noren Hooten N, Fitzpatrick M, Wood WH 3rd, De S, Ejiogu N, Zhang Y, Mattison JA, Becker KG, Zonderman AB, Evans MK. Age-related changes in microRNA levels in serum. Aging. 2013;5:725–40 https://doi.org/10.18632/aging.100603.
Sanchis-Gomar F, Pareja-Galeano H, Santos-Lozano A, Garatachea N, Fiuza-Luces C, Venturini L, Ricevuti G, Lucia A, Emanuele E. A preliminary candidate approach identifies the combination of chemerin, fetuin-A, and fibroblast growth factors 19 and 21 as a potential biomarker panel of successful aging. Age. 2015;37:9776 https://doi.org/10.1007/s11357-015-9776-y.
van der Spoel E, Jansen SW, Akintola AA, Ballieux BE, Cobbaert CM, Slagboom PE, Blauw GJ, Westendorp RGJ, Pijl H, Roelfsema F, van Heemst D. Growth hormone secretion is diminished and tightly controlled in humans enriched for familial longevity. Aging Cell. 2016;15:1126–31 https://doi.org/10.1111/acel.
Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, Korbel JO, Marchini JL, McCarthy S, McVean GA, Abecasis GR. A global reference for human genetic variation. Nature. 2015;526:68–74 https://doi.org/10.1038/nature15393.
Hannum G, Guinney J, Zhao L, Zhang L, Hughes G, Sadda S, Klotzle B, Bibikova M, Fan JB, Gao Y, Deconde R, Chen M, Rajapakse I, Friend S, Ideker T, Zhang K. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol Cell. 2013;49:359–67 https://doi.org/10.1016/j.molcel.2012.10.016.
van den Akker EB, Deelen J, Slagboom PE, Beekman M. Exome and whole genome sequencing in aging and longevity. Adv Exp Med Biol. 2015;847:127–39 https://doi.org/10.1007/978-1-4939-2404-2_6.
Lio D, Pes GM, Carru C, Listì F, Ferlazzo V, Candore G, Colonna-Romano G, Ferrucci L, Deiana L, Baggio G, Franceschi C, Caruso C. Association between the HLA-DR alleles and longevity: a study in Sardinian population. Exp Gerontol. 2003;38:313–7 https://doi.org/10.1016/s0531-5565(02)00178-x.
Lagaay AM, D'Amaro J, Ligthart GJ, Schreuder GM, van Rood JJ, Hijmans W. Longevity and heredity in humans. Association with the human leucocyte antigen phenotype. Ann N Y Acad Sci. 1991;621:78–89 https://doi.org/10.1111/j.1749-6632.1991.tb16970.x.
Phillips BE, Williams JP, Gustafsson T, Bouchard C, Rankinen T, Knudsen S, Smith K, Timmons JA, Atherton PJ. Molecular networks of human muscle adaptation to exercise and age. PLoS Genet. 2013;9:e1003389 https://doi.org/10.1371/journal.pgen.1003389.
Ivanova R, Hénon N, Lepage V, Charron D, Vicaut E, Schächter F. HLA-DR alleles display sex-dependent effects on survival and discriminate between individual and familial longevity. Hum Mol Genet. 1998;7:187–94 https://doi.org/10.1093/hmg/7.2.187.
Goldberg EL, Dixit VD. Drivers of age-related inflammation and strategies for healthspan extension. Immunol Rev. 2015;265:63–74 https://doi.org/10.1111/imr.12295.
Licastro F, Candore G, Lio D, Porcellini E, Colonna-Romano G, Franceschi C, Caruso C. Innate immunity and inflammation in ageing: a key for understanding age-related diseases. Immun Ageing. 2005;2:8 https://doi.org/10.1186/1742-4933-2-8.
Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60 https://doi.org/10.1093/bioinformatics/btp324.
Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, 1000 Genome Project Data Processing Subgroup. The sequence alignment/map format and SAMtools. Bioinformatics. 2009;25:2078–9 https://doi.org/10.1093/bioinformatics/btp352.
Ka S, Lee S, Hong J, Cho Y, Sung J, Kim HN, Kim HL, Jung J. HLAscan: genotyping of the HLA region using next-generation sequencing data. BMC Bioinformatics. 2017;18:258 https://doi.org/10.1186/s12859-017-1671-3.
McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, Garimella K, Altshuler D, Gabriel S, Daly M, DePristo MA. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303 https://doi.org/10.1101/gr.107524.110.
We express our sincere gratitude to Mr. Maoyang for the valuable assistance.
This research was supported by the Science Research Foundation for TCM of Zhejiang Province, China (no. ZZYJ-WTRW-2018-01, no. 2017ZA045) and the National Natural Science Foundation of China (no. 81503527).
Ethics approval and consent to participate
Nineteen centenarians were chosen. All of them were free of major age-related diseases, i.e., cardiovascular or cerebrovascular disease, cancer, dementia, renal or hepatic failure, etc. They were informed about the study and signed a letter of consent, in accordance with the guidelines of the Ethics Committee of the First Affiliated Hospital of Zhejiang Chinese Medical University, Hangzhou, China (2015-KL-008-01).
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
PCA plot of the case and control groups. The PCA plot was made using the SNP genotypes of the centenarian group and the control group.
Sequencing quality metrics
Significantly correlated genes obtained through SKAT analysis.
About this article
Cite this article
Shen, S., Li, C., Xiao, L. et al. Whole-genome sequencing of Chinese centenarians reveals important genetic variants in aging WGS of centenarian for genetic analysis of aging. Hum Genomics 14, 23 (2020). https://doi.org/10.1186/s40246-020-00271-7