Analysis of ACE2 Genetic Variants by Direct Exome Sequencing in 99 SARS-CoV-2 Positive Patients

Coronaviruses (CoV) are a large family of viruses that are common in humans and many animal species. Animal coronaviruses rarely infect humans, with the exceptions of the Middle East Respiratory Syndrome coronavirus (MERS-CoV), the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), and now SARS-CoV-2, which is the cause of the ongoing pandemic of coronavirus disease 2019 (COVID-19). Many studies have suggested that genetic variants in the ACE2 gene may influence host susceptibility or resistance to SARS-CoV-2, given the functional role of ACE2 in human pathophysiology. However, many of these studies have been conducted in silico, based on epidemiological and population data. We therefore investigated the occurrence of ACE2 variants by direct DNA analysis in a cohort of 99 unrelated Italian individuals clinically diagnosed with COVID-19, to test for a possible allelic association with the disease. We identified three different germline variants: one intronic, c.439+4G>A, and two missense, c.1888G>C p.(Asp630His) and c.2158A>G p.(Asn720Asp), in a total of 26 patients, with similar frequencies in males and females. Thus far, only c.1888G>C p.(Asp630His) shows a statistically different frequency compared to ethnically matched populations, suggesting that further research is needed to establish whether this variant is of functional significance. Our results suggest that there is no evidence of consistent ACE2 variants associated with COVID-19. We hypothesize that rare susceptibility/resistance alleles could be located in the non-coding regions of the ACE2 gene, known to play a role in the regulation of gene activity.


Introduction
In December 2019, Chinese authorities reported several cases of pneumonia in Wuhan City, Hubei province, China [1]. A novel Betacoronavirus was identified as the causative agent of this acute viral respiratory distress in humans [2,3]. The disease was subsequently named "Coronavirus Disease 2019 (COVID-19)" by the World Health Organization (WHO) [4].
Coronaviruses (CoV) are a large family of viruses that are common in humans and other animal species, including bats [5], camels, cattle and cats. Animal coronaviruses rarely infect humans and then spread between subjects, with the exceptions of the Middle East Respiratory Syndrome coronavirus (MERS-CoV), the Severe Acute Respiratory Syndrome coronavirus (SARS-CoV), and now SARS-CoV-2, which is the cause of the ongoing pandemic [6]. A critical step in viral infection is receptor recognition and binding to the host-cell surface. The Angiotensin-Converting Enzyme 2 (ACE2) has been identified as a functional receptor for SARS-CoV-2, allowing host-cell entry [7]. SARS-CoV-2 uses an extensively glycosylated spike (S) protein that protrudes from the viral envelope and mediates binding to ACE2 [5], the carboxypeptidase that catalyzes the hydrolysis of angiotensin II to angiotensin (1-7) [8]. The S protein is a 1,273 aa long structural glycoprotein located on the outer envelope of the virus. It has two functional subunits: an N-terminal S1 subunit and a shorter C-terminal S2 subunit. ACE2 is a single-pass type I membrane protein (805 aa) that contains an N-terminal peptidase M2 domain and a C-terminal collectrin domain. The binding affinity of ACE2 for the Receptor-Binding Domain (RBD), located in the C-terminal domain of the S1 subunit of the SARS-CoV-2 S protein, is 10- to 20-fold higher than for that of SARS-CoV, which may contribute to the higher infectivity and transmissibility of SARS-CoV-2 [9,10,11]. The high variation in clinical severity observed among patients may suggest a critical role for inter-individual variability in the host genetic background. Several studies inferred that genetic variants in the ACE2 gene (MIM: #300335) may influence individual susceptibility or resistance to SARS-CoV-2, given the functional role of ACE2 in human pathophysiology [12]. However, those previous analyses were conducted in silico, based on epidemiological and population data.
In this study, we therefore investigated the occurrence of ACE2 variants in a cohort of 99 Italian SARS-CoV-2 positive patients by direct exome sequencing, followed by Sanger sequencing analysis. We identified three variants in the ACE2 gene in a total of 26 of the 99 patients examined: c.1888G>C, p.(Asp630His); c.439+4G>A; c.2158A>G, p.(Asn720Asp). We then extended the study to a cohort of 1,000 individuals from the Italian general population to establish the frequency of these ACE2 variants in this Italian cohort. Finally, we compared the allelic frequencies of these variants detected in our COVID-19 cohort with those present in gnomAD (EUR) and with the frequencies in our Italian cohort.

COVID-19 patients
We enrolled 99 COVID-19 patients hospitalized at the University Hospital of Rome "Tor Vergata" and Bambino Gesù Children's Hospital, Rome, between March and April 2020. All patients were diagnosed with COVID-19 based on clinical evidence, confirmed by viral RNA detection in oropharyngeal and nasopharyngeal swabs by real-time PCR. We divided them into two groups according to the severity of respiratory disease: the severe form was diagnosed as respiratory impairment requiring non-invasive ventilation; the extremely severe form was defined as respiratory failure requiring invasive ventilation and intensive care unit admission, or as death. The majority of the enrolled patients were males (61 males, 38 females). Median age was [...]. The study was approved by the ethics committee at University Hospital of Rome Tor Vergata (PTV) (protocol no. 50/20). The study was conducted in agreement with the principles of the Declaration of Helsinki. Informed written consent was obtained from each patient. The Italian cohort of control subjects comprised 500 males and 500 females.

Whole Exome Sequencing and Data Preprocessing
Library preparation and whole exome capture were performed using the Twist Human Core Exome Kit (Twist Bioscience) according to the manufacturer's protocol, and libraries were sequenced on the Illumina NovaSeq 6000 platform. The BaseSpace pipeline (Illumina) and the TGex software (LifeMap Sciences) were used for variant calling and variant annotation, respectively. Sequencing data were aligned to the hg19 human reference genome. Based on the guidelines of the American College of Medical Genetics and Genomics (ACMG), a minimum depth of coverage of 30X was considered suitable for analysis. Variants were examined for coverage and Qscore (minimum threshold of 30), and visualized with the Integrative Genomics Viewer (IGV).
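The coverage and quality thresholds stated above can be expressed as a simple filter. The following is a minimal sketch only, not the pipeline used in the study (which relied on the BaseSpace and TGex software). The `VariantCall` record, its field names, and the example calls are hypothetical and serve only to illustrate the two cut-offs; we assume both thresholds are inclusive minima.

```python
from dataclasses import dataclass

MIN_DEPTH = 30   # minimum depth of coverage (30X), as stated in the text
MIN_QSCORE = 30  # minimum Qscore, as stated in the text


@dataclass(frozen=True)
class VariantCall:
    """Hypothetical record for one called variant (illustration only)."""
    name: str      # label of the variant
    depth: int     # read depth at the variant site
    qscore: float  # quality score reported for the call


def passes_qc(call: VariantCall) -> bool:
    """Keep a call only if it meets both minimum thresholds."""
    return call.depth >= MIN_DEPTH and call.qscore >= MIN_QSCORE


# Made-up calls, not data from the study.
calls = [
    VariantCall("variant_A", depth=112, qscore=38.0),  # passes both thresholds
    VariantCall("variant_B", depth=18, qscore=36.0),   # fails on depth
    VariantCall("variant_C", depth=64, qscore=22.5),   # fails on Qscore
]

retained = [call.name for call in calls if passes_qc(call)]
print(retained)
```

Calls that pass such a filter would still be inspected visually, as described above for IGV.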

Statistical Analysis
Differences in allele frequencies between groups were evaluated by the Pearson χ2 test or by Fisher's exact test, as appropriate for the number of samples in the compared groups. P values less than 0.05 were considered statistically significant. Hardy-Weinberg equilibrium was evaluated, where possible, by the Pearson χ2 test.
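The two tests named above can be illustrated on a 2x2 table of allele counts (rows: patients and controls; columns: variant and reference alleles). The sketch below uses only the Python standard library and is an assumption about how such a comparison might be set up, not the software used in the study. The function names and the example counts are hypothetical. The Pearson χ2 test is computed without continuity correction, and the two-sided Fisher p value sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import comb, erfc, sqrt


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over every table at most as probable as the observed one
    # (small tolerance guards against floating-point ties).
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> tuple[float, float]:
    """Pearson chi-square statistic (1 df, no continuity correction) and p value.

    Assumes all four marginal totals are non-zero.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    return chi2, erfc(sqrt(chi2 / 2))


# Made-up counts, not data from the study:
# (variant alleles in patients, reference alleles in patients,
#  variant alleles in controls, reference alleles in controls)
table = (3, 1, 1, 3)
p_fisher = fisher_exact_two_sided(*table)
chi2, p_chi2 = pearson_chi2_2x2(*table)
print(f"Fisher p = {p_fisher:.4f}; chi2 = {chi2:.2f}, p = {p_chi2:.4f}")
```

Fisher's exact test is generally preferred when expected cell counts are small, which is the situation for very rare alleles.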

Results
In the ACE2 gene we identified three different germline variants, one intronic, c.439+4G>A, and two missense, c.1888G>C p.(Asp630His) and c.2158A>G p.(Asn720Asp), in a total of 26 patients (14 females and 12 males). The frequencies of the three identified variants are similar between male and female patients, suggesting that there is no sex effect underlying the frequency distribution of ACE2 variants (Table 1). GnomAD database analysis revealed that the identified ACE2 variants are reported with a cumulative frequency of 0.2289 in ethnically matched populations (EUR). The cumulative frequency of these variants in our Italian cohort is 0.2353 and is not statistically different (Table 1). A significant difference was detected only for c.1888G>C p.(Asp630His), found in a heterozygous female (p = 0.0067) (Table 1). The allelic frequency of this variant in gnomAD for the EUR reference population is 0.0000368, confirming that this is a very rare allele. This variant was not found in our Italian control population. To predict the potential impact of this variant on the protein, we used several tools (PolyPhen2, Mutation Taster, SIFT, MetaLR_pred, and MetaSVM_pred). The in silico analysis gave conflicting computational verdicts, with 3 benign predictions vs. 2 pathogenic predictions. Sequence alignment of the ACE2 protein with its orthologous proteins shows that the wild-type residue is not highly conserved across species, suggesting that this residue does not play a major functional or structural role in the ACE2 protein. However, this variant deserves further investigation in larger COVID-19 cohorts as well as in functional studies. Concerning the other two variants, the recurrent c.439+4G>A (rs2285666) intronic variant has been previously reported by Strafella et al. [14] and by Asselta et al. [15] in two different Italian

Discussion
Numerous in silico studies have suggested that ACE2 variants in structural parts of the protein could have an impact on pathogen binding dynamics or increase the quantitative expression of ACE2. All these studies were carried out on an epidemiological basis, using population allele frequencies deposited in the various available databases. We systematically analyzed the ACE2 coding-region variants in a representative cohort of Italian patients severely affected by COVID-19 in order to identify rare and causative predisposing alleles. Although we identified in a single COVID-19 patient a variant (p.Asp630His) that is very rare in the European population and was not detected in our Italian control population, we do not believe that there is an enrichment of ACE2 coding mutant alleles in the population of Italian patients affected by COVID-19. However, further studies in larger COVID-19 cohorts could help to clarify its role. Our results confirm and extend the knowledge that ACE2 is a gene with a low allelic frequency of missense variants, as expected on the basis of gnomAD population data. In fact, we provide evidence that the rate of amino acid changes at the region binding SARS-CoV-2 and at the protein cleavage sites is very low. This suggests that these regions have been under evolutionary pressure, probably because of the essential catalytic role of ACE2 as a transmembrane carboxypeptidase. It is possible that rare susceptibility alleles are located in the non-coding regions of the gene, which are involved in the regulation of ACE2 gene activity. A recent genome-wide association (GWA) study on a large number of patients also did not show evidence of association with ACE2 variability [17]. Mutant alleles in non-coding DNA can cause alterations in expression levels or timing. These variations concern enhancers, promoters, insulators and silencers, or regions that provide instructions for producing functional RNA molecules, such as transfer RNA, miRNAs or long non-coding RNA [16].
By inspecting the pool of human genetic variants available at https://www.ncbi.nlm.nih.gov/snp/, approximately 16,493 SNPs were extracted after filtering for the non-coding regions of ACE2. We are aware that not all of these variants have functional significance. However, some of them may influence the expression of the receptor in a tissue-dependent way. It is therefore of interest to explore the existence of ACE2 susceptibility alleles to SARS-CoV-2 in these regulatory regions. Interestingly, Bunyavanich et al. [18] recently showed age-dependent expression of the ACE2 gene in nasal epithelium, highlighting that different levels of ACE2 expression may be the reason for the lower incidence of COVID-19 in children. Several studies have shown that the ACE2 gene is regulated by at least four miRNAs: miR-200c, let-7b, miR-1246, and miR-125b [19,20,21,22]. Polymorphisms within the genes encoding these miRNAs could help future, more in-depth studies to investigate the regulation of ACE2 gene expression and the possible significance of such variations.

Conclusions
In conclusion, our study suggests that there are no ACE2 coding variants associated with severe COVID-19. However, we cannot rule out a Type II error, given the relatively small size of the sample tested. Despite this, we hypothesize that rare susceptibility alleles may be located in the non-coding regions of the ACE2 gene, known to have a role in regulating gene activity. It is therefore of interest to explore the existence of ACE2 susceptibility alleles to SARS-CoV-2 in the regulatory regions of the gene.

Declarations

Ethical declarations
Biological samples included in the study were collected according to the ethical procedures of the GEFACOVID2.0 research program promoted by the University of Rome Tor Vergata. This program ensures that its work is carried out with the highest regard for ethical issues and with respect for the rights, integrity and privacy of patients. All consent, material/information storage and distribution procedures have been approved by the local Ethics Committees (CEI PTV protocol no. 50/20). SARS-CoV-2 positive patients who participated in the study signed an informed consent form prepared ad hoc, which provided detailed information on the type of test, the implications of the genetic results and the possible psychosocial implications. As regards the participation of children in the research, consent and authorization were signed by the parents in accordance with the rules laid down by the Ethics Committee of the Bambino Gesù Hospital in Rome (http://www.ospedalebambinogesu.it/en/home).

Consent for publication
Not applicable.

Availability of data and materials
Please contact the authors for data requests.

Competing Interests
The authors declare no competing interests.