A study of the role of GATA4 polymorphism in cardiovascular metabolic disorders

Background The study was designed to evaluate the association of GATA4 gene polymorphism with coronary artery disease (CAD) and its metabolic risk factors, including dyslipidaemic disorders, obesity, type 2 diabetes and hypertension, following a preliminary study linking early onset of CAD in heterozygous familial hypercholesterolaemia to chromosome 8, which harbours the GATA4 gene. Results We first sequenced the whole GATA4 gene in 250 individuals to identify variants of interest and then investigated the association of 12 single-nucleotide polymorphisms (SNPs) with the disease traits using TaqMan chemistry in 4,278 angiographed Saudi individuals. Of the studied SNPs, rs804280 (1.14 (1.03–1.27); p = 0.009) was associated with CAD (2,274 cases vs 2,004 controls), hypercholesterolaemia (1,590 vs 2,487) (1.61 (1.03–2.52); p = 0.037) and elevated low-density lipoprotein-cholesterol (hLDLC) (575 vs 3,404) (1.87 (1.10–3.15); p = 0.020). Additionally, rs3729855_T (1.52 (1.09–2.11); p = 0.013) and rs17153743 (AG + GG) (2.30 (1.30–4.26); p = 0.005) were implicated in hypertension (3,312 vs 966), following adjustments for confounders. Furthermore, haplotypes CCCGTGCC (χ2 = 4.71; p = 0.041) and GACCCGTG (χ2 = 3.84; p = 0.050) constructed from the SNPs were associated with CAD and ACCCACGC (χ2 = 6.58; p = 0.010) with myocardial infarction, while hypercholesterolaemia (χ2 = 3.86; p = 0.050) and hLDLC (χ2 = 4.94; p = 0.026) shared the haplotype AACCCATGT, and AACCCATGTC was associated with hLDLC (χ2 = 4.83; p = 0.028). A 10-mer GACCCGCGCC (χ2 = 7.59; p = 0.006) was associated with obesity (1,631 vs 2,362), and the 9-mer GACACACCC (χ2 = 4.05; p = 0.044) was implicated in type 2 diabetes mellitus (2,378 vs 1,900). Conclusion Our study implicates GATA4 in CAD and its metabolic risk traits. The finding also points to the possible involvement of yet undefined entities related to GATA4 transcription activity or gene regulatory pathways in events leading to these cardiovascular disorders.


Background
The GATA binding proteins constitute a family of cell-restricted zinc-finger transcription factors (TFs), which recognize the GATA motif present in the promoters of many genes. This family, comprising six developmental/cell-type specific transcription factors, GATA1-6, is critical to the development of diverse tissues [1][2][3][4] and acts in cooperation with more widely expressed factors to direct lineage-specific gene expression [5][6][7][8][9][10][11]. Their transcriptional activity is modulated through interactions with nuclear proteins, including the zinc finger proteins of the Kruppel and FOG/U-shaped families, general coactivators of the p300 and cAMP-response element-binding (CREB) protein (CBP), the myocardial-expressed protein Nkx2.5, and NF-AT3 [12][13][14][15]. GATA4 in particular, whose expression in cardiac tissue is more than 20-fold greater than in other tissues [16], is a critical regulator of cardiac angiogenesis and gene expression, and is involved in modulating cardiomyocyte differentiation and adaptive responses of the adult heart [5,17,18]. Additionally, GATA4 is abundantly expressed in the endocardium and endothelial cells, pointing to an important regulatory control in their development and function [11,19]. The abundance and localization of GATA4 in the myocardial tissue suggest a pivotal cardiovascular functional input which, if perturbed, may lead to various cardiac disorders and coronary artery disease (CAD)-related events. However, while the importance of this gene is now well appreciated in congenital heart diseases [20][21][22][23][24][25][26] and some other cardiac malformations [27][28][29], little is known about its potential role in vascular activity-related diseases, such as dyslipidaemia and diabetes mellitus. Dyslipidaemia can be triggered as a result of a genetic predisposition, secondary causes, or a combination of both. However, the genetic causes of this disorder remain to be identified.
In a preliminary study investigating the genomic linkage to early onset of CAD in two Saudi families with heterozygous familial hypercholesterolaemia (HFH), we identified a locus on chromosome 8, which harbours the GATA4 gene, as a plausible candidate for CAD, HFH and low high-density lipoprotein levels. This led to the notion that GATA4 presents a potential candidate for CAD onset, especially in dyslipidaemic conditions. Given the lack of information on the role of this gene in the etiology of atherosclerosis, our study sought to comprehensively investigate the likelihood of GATA4 polymorphism predisposing individuals to the cardiovascular metabolic risk traits, particularly dyslipidaemia, obesity, type 2 diabetes mellitus and hypertension, as a potential trigger for the disease onset related to these disorders. To this end, we first sequenced the gene in the two HFH families and an additional 250 individuals from the Saudi general population to identify potentially informative variants, and then performed a population-based association study for selected variants with CAD and its risk traits in a larger cohort of angiographed Saudi individuals.

Linkage analysis for dyslipidaemia and early onset coronary artery disease
The initial scanning of the HFH families yielded several peaks in different genomic regions including Chromosome (Chr) 8, among others, giving a logarithm of the odds (LOD) score of 1.8 that isolated at least three of the affected siblings with the early onset of CAD in both families. One of the potential culprits at this locus is GATA4, which we elected to pursue further for its role in dyslipidaemia-related onset of CAD. Subsequent sequencing of the gene in all members of the two HFH families and an additional 250 individuals led to the selection of 12 single-nucleotide polymorphisms (SNPs), rs2740434_G > A (minor allele frequency = 0.26), rs17153743_A > G (0.02), rs13264774_C > T (0.14), rs56298569_G > C (0.44), rs804280_A > C (0.04), rs3729855_C > T (0.18), rs3729856_A > G (0.45), rs1062219_C > T (0.45), rs12825_C > G (0.09), rs804291_C > T (0.32), rs11785481_C > T (0.12) and rs3203358_C > G (0.20), for further case-control studies. Selection of the SNPs was based partly on their prevalence in our general population and partly on currently available information on their role in the disease. Furthermore, these SNPs reside in the latter portion of the gene (also encompassing its three prime untranslated region, 3′-UTR), which encodes the C-terminal of the protein and harbours gene regulatory motifs, both of which are thought to be important in the transcriptional activity of GATA4 (Figure 1). We were also curious to understand the potential role of changes in the 3′-UTR in the disease. Figure 2 displays the linkage disequilibrium (LD) structure of the ten SNPs included in the haplotyping.
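To make the reported LOD score more concrete, the short sketch below converts a LOD value into the corresponding odds in favour of linkage, using the standard definition of the LOD score as the base-10 logarithm of that odds ratio. The function name `lod_to_odds` is a hypothetical helper introduced only for illustration; it is not part of the linkage software used in the study.

```python
def lod_to_odds(lod: float) -> float:
    """Convert a LOD score (log10 of the odds) into odds in favour of linkage."""
    return 10.0 ** lod


# LOD score reported in the text for the Chr 8 locus.
reported_lod = 1.8
odds = lod_to_odds(reported_lod)
print(f"LOD {reported_lod} corresponds to odds of roughly {odds:.0f}:1 in favour of linkage")

# For comparison only: a LOD of 3 corresponds to odds of 1000:1.
print(f"LOD 3.0 corresponds to odds of {lod_to_odds(3.0):.0f}:1")
```

Read this way, a LOD of 1.8 corresponds to odds of about 63:1 in favour of linkage at this locus.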
We then proceeded to investigate further potential relationships of these haplotypes with other metabolic risk traits, particularly hypertension, type 2 diabetes and obesity. The results showed that a 10-mer GACCCGCGCC (χ2 = 7.59; p = 0.006) was associated with obesity, while a 9-mer GACACACCC (χ2 = 4.05; p = 0.044) was implicated in type 2 diabetes mellitus (Table 4). We also observed several protective haplotypes for all disease traits, which were primarily complementary to the core of the causative sequences (see also Additional file 2, GATA4 Haplo Suppl data). However, we could not establish any significant causative link with hypertension.
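The relationship between the χ2 statistics and p-values quoted above can be checked with a few lines of code. The sketch below assumes that each haplotype test has one degree of freedom, which the text does not state explicitly; under that assumption the upper-tail probability has a closed form in terms of the complementary error function. The helper name `chi2_p_value_1df` is hypothetical.

```python
import math


def chi2_p_value_1df(chi2: float) -> float:
    """Upper-tail p-value of a chi-square statistic, ASSUMING 1 degree of freedom.

    For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    return math.erfc(math.sqrt(chi2 / 2.0))


# Chi-square values quoted in the text for the obesity and type 2 diabetes haplotypes.
for label, chi2 in [("obesity 10-mer GACCCGCGCC", 7.59), ("T2DM 9-mer GACACACCC", 4.05)]:
    print(f"{label}: chi2 = {chi2:.2f}, p = {chi2_p_value_1df(chi2):.3f}")
```

Under the one-degree-of-freedom assumption, both statistics give p-values that round to the figures reported in the text (0.006 and 0.044).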

Discussion
In the present study, we first screened two families with HFH in which the primary probands presented with severe phenotypes of early onset CAD. We identified several potential loci, of which Chr 8 appeared the most attractive choice for detailed investigation of the genetic basis for dyslipidaemia-related onset of the disease. This locus harbours the GATA4 gene at Chr 8p23.1-22, which we elected to study first as the most likely candidate. Screening the complete coding and non-coding areas of the gene revealed several mutations, among which we selected 12 for the case-control study. The study has unequivocally identified two SNPs, rs1062219 and rs804280, as risk variants for CAD, both of which were also linked to congenital heart disease in the present study as well as in previous studies in different ethnic populations [25,26,30]. Therefore, our study not only furnishes strong support for an important role for these SNPs in congenital heart disease, but also points to a possible sharing of common disease pathways involved in the etiologies of CAD and congenital heart disease at the GATA4 signalling level. Notably, the great majority of GATA4 mutations studied to date seem to point to an influence on its transcriptional activity, and have been associated primarily with cardiac malformations, such as congenital heart disease. Thus, the difference in the nature of the congenital heart disease and CAD etiologies makes it intriguing that these two disorders would share GATA4, or any other signalling pathway for that matter, as a common disease pathway. A number of speculations have recently been advanced on how changes in the GATA4 gene may influence disease pathways. One of these suggests that the impediment of GATA4 transcription activity in congenital heart disease involves at least two of the variants, p.A348A and p.S377G [26], which were also included in the present study.
Since GATA4 transcription activity is subject to regulation at the level of gene expression and through post-translational modifications of the protein, the processes involved in these regulatory mechanisms are worth considering as possible culprits in our study. On the other hand, in our study, the former of these two variants was implicated in MI, whereas the latter was actually protective against CAD. Accordingly, these observations seem to point to delineable differences between CAD and congenital heart disease in their interactions with the GATA4 variants. Besides, the ubiquity of GATA4 in the myocardium also raises fundamental questions with respect to possible diversity in its cardiac-related function(s) and its involvement in other cardiovascular disease pathways, such as those leading to atherosclerosis. Hence, the primary question raised in our study was whether a link existed between the impact of GATA4 on CAD manifestation and the presence of metabolic disorders, as our linkage study in HFH seemed to suggest. To begin with, our results pointed to an association of rs804291 and rs11785481 with both hChol and hLDLC, which appeared to follow primarily an autosomal recessive mode of inheritance. A closer analysis of the data also indicated that while rs11785481 was not directly related to CAD per se, the probability of an individual acquiring CAD increased greatly in hypercholesterolaemic individuals. We also noted that two other SNPs that were not directly associated with hChol became so in the presence of MI, similarly indicative of a possible interaction of the dyslipidaemic disease traits with changes in GATA4 as a possible link to CAD/MI manifestation in these individuals. Moreover, the analyses for the other metabolic risk factors revealed an association of two variants with hypertension, further linking risk traits for metabolic syndrome to GATA4 polymorphism.
Altogether, these observations support the notion that interactions of metabolic risk traits with GATA4 contribute to the disease pathways leading to atherosclerosis, an assertion that requires further investigation.
The various ways in which the GATA4 variants relate to alterations in lipid levels and CAD/MI in this study lead to important questions regarding the possible mechanisms or pathways linking them with one another. Based on our results, it appears that such mechanisms may involve events associated with lHDLC, for example, being linked to pathways directly influencing circulating lipid levels and possibly leading to CAD in dyslipidaemic individuals, as indicated by the inverse relationship of some of these traits with the different GATA4 variants. Accordingly, the processes appear to be separable from those leading to cardiac malformations. Hence, the novel finding implicating GATA4 in dyslipidaemia seems to point to yet undefined entities, possibly involving the regulation of the GATA4 functional state, as playing an important role in the disease process. In this regard, perhaps the most revealing observation of the present study is the linking of noncoding variants of the gene to disease. Of particular interest was the finding of causative haplotypes for CAD/MI as well as the metabolic risk traits encompassing variants in the 3′-UTR of the gene. Moreover, isolating the two coding SNPs from those residing in the 3′-UTR did not alter the significance level for the latter, possibly pointing to changes at this chromosomal locus, rather than the individual variants, as the underlying genetic basis for these manifestations. Notably, the haplotype associations were more significant than those of the individual constituent variants, pointing to the importance of these genomic sequences in revealing an impact of a gene on disease that would otherwise remain undetected through associations with changes at individual loci. The implication is that these events may be due to changes other than those directly pertaining to a functional motif of the GATA4 protein.
Furthermore, the close proximity of these variants may also suggest the presence of sequences encoding some other yet unidentified entities as the potential culprits, especially considering the fact that several 3-mer haplotypes at this locus were shared by some of the traits. Thus, it is plausible to postulate that alteration at this genomic locus, and not necessarily in the GATA4 gene function per se, may offer an explanation for the observed link to alterations in the metabolic risk traits and CAD manifestation.
A number of speculations have been raised with regard to the possible ways in which changes in the 3′-UTR of the GATA4 gene may influence disease pathways. For example, both germline and somatic mutations predicted to affect RNA folding have recently been described in this region as a cause of congenital heart disease [25]. This is in line with the notion that the mechanisms by which GATA4 contributes to hChol metabolism may be related to the regulation of the gene itself or its mRNA maturation rather than the transcriptional activity of its protein product. These mechanisms are likely to be the result of complex interactions with yet unidentified cofactors, as has been suggested recently by some studies [6]. While the study has produced interesting data that may contribute to our knowledge of GATA4 interaction with cardiovascular disease, there are some limitations in the extent of its applicability. One of the main limitations is that no such potential mechanisms were tested to verify the notion that the 3′-UTR of the GATA4 gene plays an important role in the discussed cardiovascular disease pathways. Furthermore, as with other association studies on complex diseases, the potential impact of the present findings may be limited to ethnic Arab populations due to inter-ethnic variations in prevalent epigenetic and environmental factors. Besides, it is not likely that our present findings per se can be exploited as predictive markers for the dyslipidaemic disorders.

Conclusion
In summary, our study identified the GATA4 transcription factor as an independent risk factor for congenital heart disease and CAD/MI and a metabolic risk trait for cardiovascular diseases. Thus, apart from the demonstrated association of GATA4 polymorphism with dyslipidaemia, our results also point to interrelationships of the two disease components with CAD/MI, which may explain partly how these diseases lead to atherosclerosis. The finding of several causative haplotypes for MI/CAD encompassing the 3′-UTR of the GATA4 gene points to important roles for this chromosomal locus in the etiology of CAD/MI and the possible involvement of yet undefined entities related to GATA4 transcription activity or gene regulatory pathways in events leading to these cardiovascular disorders.

Study population
The present study was performed in three stages involving three independent groups of Saudi individuals. The first group comprised two families of a total of 22 members in which HFH was prevalent (Figure 3). The index case of the first family was the third (S3) of seven sons and two daughters born to unrelated parents, who underwent triple coronary artery bypass surgery at the age of 14 years. On admission, this propositus had reported with a chest wound and xanthomas, as well as clinical features of bilateral carotid artery disease and a very severe form of familial hyperlipidaemia (FH). He presented with a very high cholesterol (Chol) level of 10.1 mmol/L (desirable <5.2 mmol/L) and an LDL-Chol level of 7.9 mmol/L (optimal <2.59 mmol/L). Furthermore, he harboured low HDL-Chol levels (0.51 mmol/L; normal 1.04-1.55). The father displayed the characteristics of borderline HFH (Chol 6.1 mmol/L, LDL 4.0 mmol/L), while the mother, two other sons (S4 and S6) and a daughter (D2) had clinical phenotypes of the disease (combination of high Chol (≥6.2 mmol/L) and high LDL-Chol (≥4.12 mmol/L)). Two other sons with otherwise normal Chol and LDL-Chol levels also displayed low HDL-Chol levels (<1.04 mmol/L), while none of the family members had isolated elevated LDL-Chol levels. In the second family, the two index cases, two daughters of a family of seven daughters and two sons, concurrently presented with early onset CAD at the age of 17 and 15 years. The two candidates had identical, exceedingly high Chol levels of 22.5 mmol/L and LDL-Chol levels of 19.6 (D6) and 18.0 mmol/L (D7), in addition to harbouring low HDL levels of 0.93 and 0.65 mmol/L, respectively. The father (8.6 mmol/L), mother (7.7 mmol/L) and a third daughter (D3; 7.6 mmol/L) were also hypercholesterolaemic.
Figure 3 Pedigrees of two families with heterozygous hyperlipidaemia. Proband S3 in Family One underwent triple coronary artery bypass at the age of 14 years, and D6 and D7 in Family Two presented with early onset coronary artery disease at the age of 17 and 15 years.
The criteria for the diagnosis of familial hyperlipidaemia (FH) were adapted from our Institutional Guidelines, employing the reference values approved by the USA National Cholesterol Education Program (NCEP). Following the identification of the Chr 8p23 region, which harbours the GATA4 gene (8p23.1-22), as a potential culprit for the disease through the linkage study of the above two families, we elected to sequence the gene in the two families and a group of 250 healthy blood donors visiting our Blood Bank Clinic to identify potentially informative SNPs of interest for the case-control association study.
The case-control study was performed in a total of 4,278 Saudi individuals consisting of 2,274 CAD patients (1,736 males and 538 females, mean age 60.8 ± 0.4 years) with angiographically determined narrowing of the coronary vessels by at least 50% and 2,004 angiographed controls (1,074 males and 930 females, mean age 50.2 ± 0.5 years) (Table 5). These controls (CON) were individuals undergoing surgery for valvular disease or reporting for follow-up on various other cardiac procedures, but were established to have no significant narrowing of the coronary vessels. Among the study population, 1,590 were hypercholesterolaemic (hChol) patients (1,051 males and 539 females, mean age 58.2 ± 0.2 years) undergoing anti-hyperlipidaemic therapy and/or known to have high levels of total cholesterol (Chol >6.20 mmol/L), 1,776 patients harboured low HDL-cholesterol (lHDLC) levels (<1.04 mmol/L), 1,088 had high triglyceride (hTG) levels (>2.25 mmol/L), and 575 exhibited high low-density lipoprotein-cholesterol (hLDLC) levels. These were denoted as dyslipidaemia study patients. Another subset of interest comprised 2,378 individuals having type 2 diabetes mellitus (T2DM; formerly known as non-insulin-dependent diabetes mellitus or adult-onset diabetes). The National Diabetes Data Group of the USA and the second World Health Organization Expert Committee on Diabetes Mellitus (1998) [31] define type 2 diabetes mellitus as a metabolic disorder that is characterized by high blood glucose (generally defined as fasting glucose level >126 mg/dL) in the context of insulin resistance and relative insulin deficiency. Furthermore, 3,312 individuals had primary (essential) hypertension (HTN) (Table 5), defined as ≥140 mm Hg systolic blood pressure or ≥90 mm Hg diastolic pressure based on The Sixth Report of the Joint National Committee on Prevention, Detection, Evaluation, and Treatment of High Blood Pressure (JNC VI) criteria [32].
Accordingly, essential, primary, or idiopathic hypertension is defined as high blood pressure in which secondary causes such as renovascular disease, renal failure, pheochromocytoma, aldosteronism, or other causes of secondary hypertension or Mendelian (monogenic) forms are absent [32]. The fourth group comprised 1,631 obese candidates with a body-mass index (BMI) of ≥30.0 kg/m2, classified as the obesity subset. Among these subsets of patients, some harboured a combination of two or possibly three of the cardiovascular risk traits. All individuals with major cardiac rhythm disturbances, incapacitating or life-threatening illness, major psychiatric illness or substance abuse, history of cerebral vascular disease, neurological disorders, and administration of psychotropic medication or any other disorders likely to interfere with variables under investigation were excluded from the study. This study was performed in accordance with the regulations laid down by the King Faisal Specialist Hospital and Research Centre Ethics Committee in compliance with the Helsinki Declaration [33], and all participants signed an informed consent form. Five millilitres of peripheral blood was sampled in EDTA tubes after obtaining written consent, and genomic DNA was extracted from leukocytes by the standard salt methods using the PUREGENE DNA isolation kit (Qiagen, Germantown, MD, USA).
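The numeric cut-offs quoted above for assigning participants to the risk-trait subsets can be collected in one place, as in the minimal sketch below. It is a simplification under stated assumptions: it applies only the numeric thresholds given in the text, and ignores the non-numeric criteria (for example, ongoing anti-hyperlipidaemic therapy, insulin resistance, or the exclusion of secondary hypertension). The hLDLC subset is omitted because no cut-off is quoted for it in this passage. The `Participant` record and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    """Hypothetical record; field names are illustrative, not taken from the study."""
    total_chol_mmol_l: float
    hdl_chol_mmol_l: float
    triglycerides_mmol_l: float
    fasting_glucose_mg_dl: float
    systolic_mm_hg: float
    diastolic_mm_hg: float
    bmi_kg_m2: float


def risk_traits(p: Participant) -> set:
    """Return the trait labels whose numeric cut-off, as quoted in the text, is met."""
    traits = set()
    if p.total_chol_mmol_l > 6.20:        # hypercholesterolaemia
        traits.add("hChol")
    if p.hdl_chol_mmol_l < 1.04:          # low HDL-cholesterol
        traits.add("lHDLC")
    if p.triglycerides_mmol_l > 2.25:     # high triglycerides
        traits.add("hTG")
    if p.fasting_glucose_mg_dl > 126:     # fasting glucose criterion for T2DM
        traits.add("T2DM")
    if p.systolic_mm_hg >= 140 or p.diastolic_mm_hg >= 90:  # JNC VI cut-offs
        traits.add("HTN")
    if p.bmi_kg_m2 >= 30.0:               # obesity
        traits.add("obesity")
    return traits


# Made-up example values, not study data.
example = Participant(6.5, 0.9, 1.8, 110, 150, 85, 31.2)
print(sorted(risk_traits(example)))
```

Because a participant can satisfy several cut-offs at once, the function returns a set of labels, which mirrors the observation that some patients harboured a combination of two or three risk traits.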

Linkage analysis and gene sequencing
Whole genome-wide scanning of two families with heterozygous familial hypercholesterolaemia was performed using the Affymetrix Gene Chip 250 Sty1 mapping array (Affymetrix, Inc., Santa Clara, CA, USA), and multipoint parametric linkage analysis for estimating the LOD scores performed using the GeneHunter Easy Linkage analysis software 4.0 (Affymetrix, Inc., Santa Clara, CA, USA) as described previously [34]. A recessive model of inheritance was used with a population-disease allele frequency of 0.0001, based on the Asian SNP allele frequencies, and the Copy Number Analyzer for GeneChip® (CNAG) Ver. 3.0 (Affymetrix, Inc., Santa Clara, CA, USA) software was employed in order to search for shared chromosomal regions of homozygosity. Following the identification of Chr 8 as a potential risk locus, screening for GATA4 mutations of interest was accomplished by sequencing on the MegaBACE DNA analysis system (Amersham Biosciences, Sunnyvale, CA, USA). Briefly, the DNA was subjected to PCR amplification by standard methods, following which the PCR products were sequenced and the data analyzed
The table summarizes the univariate and multivariate analyses for the relationships of the gene variants with coronary artery disease, myocardial infarction and congenital heart disease in the studied 4,278 individuals. Bonferroni tests have been performed to adjust for age and sex, and adjustment made for the confounders in the multivariate tests. *p < 0.05; **p < 0.005.
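The association results in this study are reported in the form "1.14 (1.03 to 1.27)", which reads as an effect estimate with a confidence interval. As an illustration of how such a figure arises in a univariate case-control comparison, the sketch below computes an unadjusted odds ratio with a Wald 95% confidence interval from a 2×2 table. Treating the reported figures as odds ratios with 95% intervals is an assumption, since the text does not label them; the function name and the counts are hypothetical, and the calculation does not reproduce the adjusted multivariate analyses.

```python
import math


def odds_ratio_wald_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: carriers among cases       b: non-carriers among cases
    c: carriers among controls    d: non-carriers among controls
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the usual large-sample approximation.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Made-up counts for illustration only; these are NOT data from the study.
or_, lo, hi = odds_ratio_wald_ci(a=60, b=40, c=45, d=55)
print(f"OR = {or_:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

An interval that excludes 1 corresponds to a nominally significant association at the chosen level, which is the sense in which the intervals quoted in the results can be read.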