Identification of Differentially Expressed Genes and Signaling Pathways in Human Conjunctiva and Reproductive Tract Infected With Chlamydia Trachomatis

Guo-Dong Zhu Departments of Geriatrics and Oncology, Guangzhou First People’s Hospital, School of Medicine, South China University of Technology, Guangzhou, Guangdong, 510180, China
Xun-Jie Cao Department of Clinical Laboratory Medicine, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China. Department of Clinical Medicine, The Third Clinical School of Guangzhou Medical University, Guangzhou, 511436, China
Ya-Ping Li Department of Clinical Laboratory Medicine, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China. Department of Clinical Medicine, The Third Clinical School of Guangzhou Medical University, Guangzhou, 511436, China. Depar
Jia-Xin Li Department of Clinical Laboratory Medicine, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China. Department of Clinical Medicine, The Third Clinical School of Guangzhou Medical University, Guangzhou, 511436, China
Zi-Jian Leng Department of Clinical Laboratory Medicine, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China. Department of Clinical Medicine, The Third Clinical School of Guangzhou Medical University, Guangzhou, 511436, China
Li-Min Xie Department of Clinical Laboratory Medicine, The Third Affiliated Hospital of Guangzhou Medical University, Guangzhou, 510150, China. Department of Clinical Medicine, The Third Clinical School of Guangzhou Medical University, Guangzhou, 511436, China
Xu-guang Guo (gysygxg@gmail.com) The Third Affiliated Hospital of Guangzhou Medical University https://orcid.org/0000-0003-1302-5234

CD40, and CSF3 in the reproductive tract infection group proved to have considerable statistical significance.

Conclusion
In our research, the key genes in the biological process of reproductive tract infection with Chlamydia trachomatis were clarified through bioinformatics analysis. These hub genes may be further used in clinical treatment and clinical diagnosis.

Background
Chlamydia trachomatis (C. trachomatis, Ct), the causative agent of both the blinding eye disease trachoma and sexually transmitted infections, is an obligate intracellular bacterium whose only natural hosts are humans [1]. The distinct developmental cycle of Ct consists of two phases, an infectious, non-replicative elementary body (EB) and a replicative, non-infectious reticulate body (RB), and the bacteria alternate between these two morphologically distinct forms [2]. The latest data from the World Health Organization show that Chlamydia trachomatis is the main cause of bacterial sexually transmitted diseases, with 127 million new cases every year and almost 60% of these infections occurring in young adults aged 14-24 [3,4].

Ct is the main pathogen that causes both trachoma and sexually transmitted diseases clinically. Most Ct infections have no obvious symptoms, resulting in potential epidemics and often causing multiple complications [5]. Studies have shown that Chlamydia trachomatis infections can cause serious damage to the female reproductive system, including (but not limited to) fallopian tube damage, pelvic inflammatory disease, cervicitis, endometritis, ectopic pregnancy, and ultimately female infertility due to fallopian tube abnormalities [6]. Ct infection can also cause urethritis, epididymitis, prostatitis, and infertility in men [7]. Moreover, trachoma is caused by repeated infection with Chlamydia trachomatis, which leads to the formation of scar tissue on the inner surface of the eyelid, along with erosion of the surface of the cornea, ultimately leading to blindness [8]. This disease is still an important cause of blindness worldwide. In addition, Chlamydia trachomatis infection increases the risk of chronic fatigue syndrome and reactive arthritis, doubles the risk of ectopic pregnancy, and increases the risk of infection with sexually transmitted pathogens such as human papillomavirus and HIV [9].
Gene expression is examined using array technology, which can effectively and simultaneously measure thousands of different mRNAs from a single biological sample [10,11]. Microarrays can be manufactured by a variety of techniques, including spotting, inkjet synthesis, and photolithographic synthesis [12]. So far, this technology has been applied in many fields [13-15]. Additionally, it can be used to analyze the transcriptome of humans infected with Chlamydia trachomatis.
To study the characteristics of early host cells infected with Chlamydia trachomatis, this study used bioinformatics to analyze the RNA transcription profiles of human conjunctiva, fallopian tube, and endometrial tissue infected by Chlamydia trachomatis.

Data Source
The datasets analyzed in this study were all from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) and were found by searching for keywords related to Chlamydia trachomatis. In total, 79 series about Ct were obtained. We selected four separate gene expression profiles (GSE20430, GSE20436, GSE26692, and GSE41075) for our study. Information about 119 human samples of Chlamydia trachomatis infection was retrieved from these profiles. Among them, GSE20430 was based on the GPL201 platform, GSE20436 was based on GPL570, GSE26692 was based on GPL4133, and GSE41075 was based on GPL571.

Data processing and identification of DEGs
Using the R 4.0.1 software, the data of GSE26692 were batch calibrated and standardized via the marray package. The data of GSE20430, GSE20436, and GSE41075 were batch calibrated and standardized using the affy package. The limma package was then applied to identify the DEGs. In addition, the ggpubr and ggthemes packages were used to draw volcano maps, and the pheatmap package was used to draw heatmaps to visualize the DEGs. When analyzing the data of GSE26692 and GSE41075, P < 0.2 and |FC|≥1 were used as critical values to compare the gene expression profiles of infected and uninfected samples. The data of GSE20430 and GSE20436 were analyzed with P < 0.05 and |FC|≥1 as critical values.
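The thresholding step described above can be illustrated with a minimal Python sketch. It is not the limma workflow used in this study. It assumes that the fold-change cutoff applies to the log2 fold change, and the function name, gene labels, and values are hypothetical, chosen only to show how the two P-value cutoffs (0.05 and 0.2) change the selection.

```python
def select_degs(rows, p_cutoff=0.05, fc_cutoff=1.0):
    """Return (gene, direction) pairs that pass both cutoffs.

    rows: iterable of (gene, log2_fold_change, p_value) tuples.
    Assumption: the fold-change cutoff is applied to |log2 fold change|.
    """
    selected = []
    for gene, log2_fc, p_value in rows:
        if p_value < p_cutoff and abs(log2_fc) >= fc_cutoff:
            direction = "up" if log2_fc > 0 else "down"
            selected.append((gene, direction))
    return selected


# Hypothetical values, for illustration only (not taken from the datasets)
example = [
    ("GENE_A", 2.3, 0.001),
    ("GENE_B", -1.4, 0.03),
    ("GENE_C", 0.4, 0.0005),  # significant, but fold change too small
    ("GENE_D", 1.8, 0.12),    # passes only the relaxed P cutoff
]

strict = select_degs(example, p_cutoff=0.05)   # cutoff used for GSE20430/GSE20436
relaxed = select_degs(example, p_cutoff=0.2)   # cutoff used for GSE26692/GSE41075
print(strict)
print(relaxed)
```

In this toy input, relaxing the P-value cutoff admits one additional gene while the fold-change filter is unchanged.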

GO and KEGG pathway analysis of DEGs
The Gene Ontology (GO) database was used for the functional analysis of DEGs in terms of molecular function (MF), biological process (BP), and cellular component (CC). The Kyoto Encyclopedia of Genes and Genomes (KEGG) was used to investigate the signaling pathways of DEGs. The GO and KEGG pathway analyses of the DEGs were performed using the clusterProfiler package in the R software (P value cutoff = 0.05).
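Enrichment analyses of this kind are commonly framed as an over-representation test based on the hypergeometric distribution. The following Python sketch shows that calculation for a single term. It is an illustration under that assumption rather than a reproduction of the R analysis, it applies no multiple-testing correction, and the function name and counts are hypothetical.

```python
from math import comb


def enrichment_p_value(universe, annotated, deg_count, overlap):
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe:  number of background genes
    annotated: background genes annotated to the term
    deg_count: number of DEGs drawn from the background
    overlap:   DEGs annotated to the term
    """
    upper = min(annotated, deg_count)
    total = comb(universe, deg_count)
    tail = sum(
        comb(annotated, k) * comb(universe - annotated, deg_count - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 background genes, 5 in the term, 4 DEGs, 2 of them in the term
p = enrichment_p_value(universe=20, annotated=5, deg_count=4, overlap=2)
print(round(p, 4))
```

A term would be reported as enriched when this probability, usually after adjustment for the number of terms tested, falls below the chosen cutoff.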

PPI Network Construction and Hub Gene Identification
We used the Search Tool for the Retrieval of Interacting Genes (STRING), a database analysis platform, to analyze our DEG data and obtain a PPI map. Pairs with a combined score > 0.4 were selected to visualize the PPI network using Cytoscape 3.7.2 software. The degree of node connection is positively correlated with the stability of the whole network. In addition, we selected the top 10 genes by centrality index as the core candidate genes.
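One simple way to rank candidate hub genes is by node degree after filtering interactions on the combined score. The Python sketch below assumes degree is the centrality measure, which the text does not state explicitly, and it uses a hypothetical edge list with scores on a 0-1 scale. It does not call STRING or Cytoscape.

```python
from collections import Counter


def top_hub_genes(edges, min_score=0.4, top_n=10):
    """Rank genes by degree, counting only edges with score > min_score.

    edges: iterable of (gene_a, gene_b, combined_score) tuples.
    Returns a list of (gene, degree) pairs, highest degree first.
    """
    degree = Counter()
    for gene_a, gene_b, score in edges:
        if score > min_score:
            degree[gene_a] += 1
            degree[gene_b] += 1
    # Sort by descending degree, then by name so that ties are deterministic
    ranked = sorted(degree.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Hypothetical interactions, for illustration only
edges = [
    ("G1", "G2", 0.9),
    ("G1", "G3", 0.7),
    ("G1", "G4", 0.5),
    ("G2", "G3", 0.45),
    ("G3", "G4", 0.2),  # dropped: score is not above 0.4
]

hubs = top_hub_genes(edges, top_n=2)
print(hubs)
```

Other centrality measures could be substituted by replacing the degree count, with the score filter left unchanged.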

Verification of intersection hub genes and construction of intersection gene-miRNA interaction
The DEGs between the normal (non-infected) samples and conjunctiva samples infected by Chlamydia trachomatis were identified from two transcription profiles (GSE20430 and GSE20436). The DEGs between the normal samples and reproductive tract samples infected by Chlamydia trachomatis were identified from two other transcription profiles (GSE26692 and GSE41075). The tool used for this analysis was OmicShare Tools (https://www.omicshare.com/tools/Home/Soft/venn). After obtaining two sets of hub genes, GSE87110 was used to verify the hub genes of the reproductive tract infection.
However, the hub genes of the conjunctival infection could not be verified, since there was no suitable dataset. The verification analysis was performed with the GraphPad Prism 8.0 software. In addition, the gene-miRNA interaction network was constructed on the NetworkAnalyst 3.0 platform using the miRTarBase v8.0 database (https://www.networkanalyst.ca/NetworkAnalyst/home.xhtml).
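The intersection of the DEG lists from two profiles, as produced by a Venn tool, amounts to a set intersection. A minimal Python sketch with hypothetical gene labels follows. The function name is an assumption and the sketch is not connected to the OmicShare service.

```python
def intersect_degs(*deg_lists):
    """Return the sorted genes that appear in every supplied DEG list."""
    gene_sets = [set(genes) for genes in deg_lists]
    if not gene_sets:
        return []
    return sorted(set.intersection(*gene_sets))


# Hypothetical DEG lists for two profiles of the same tissue
profile_one = ["G1", "G2", "G3"]
profile_two = ["G2", "G3", "G4"]

shared = intersect_degs(profile_one, profile_two)
print(shared)
```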

Identification of DEGs
The two conjunctival infection groups showed 1104 and 4159 DEGs, respectively. The fallopian tube infection group and the endometrial infection group yielded 12441 DEGs and 6744 DEGs, respectively. Volcano maps and heatmaps were used to visualize the differences in gene expression. The gene expression profiles of infected conjunctiva were GSE20430 (Fig. 1A) and GSE20436 (Fig. 1B); the gene expression profile of infected fallopian tube epithelial cells was GSE26692 (Fig. 2A). The gene expression profile of infected endometrial epithelial cells was GSE41075 (Fig. 2B).

GO enrichment analysis of DEGs
The GO functional enrichment analysis was performed for the GSE20430, GSE20436, GSE26692, and GSE41075 gene expression profiles. With respect to the GSE20430 dataset (Fig. 3A), the results of CC showed that the DEGs were mainly enriched at the external side of the plasma membrane, cell-substrate junction, and focal adhesion. As for BP, the results indicated that the DEGs were principally enriched in T-cell activation, leukocyte migration, and positive regulation of cell adhesion. The results of the MF analysis showed that the DEGs were significantly enriched in immune receptor activity, cytokine binding, and cytokine receptor activity.
In the GSE20436 dataset (Fig. 3B), the DEGs in BP were enriched in T-cell activation, regulation of lymphocyte activation, and regulation of cell-cell adhesion; in MF, they were enriched in actin binding, cytokine receptor, and binding cytokine activity; in CC, they were enriched in the external side of plasma membrane, the secretory granule membrane, and membrane region.
In the GSE26692 dataset (Fig. 3C), the DEGs in BP were enriched in response to molecules of bacterial origin, regulation of cell-cell adhesion, and response to lipopolysaccharides; in MF, they were enriched in receptor ligand activity, signaling receptor activator activity, and cytokine activity; in CC, they were enriched in collagen-containing extracellular matrix, secretory granule lumen, and cytoplasmic vesicle lumen.
In the GSE41075 dataset (Fig. 3D), the DEGs in BP were enriched in neutrophil activation involved in immune response, neutrophil-mediated immunity, and regulation of membrane potential; in MF, they were enriched in signaling receptor activator activity, receptor ligand activity, and cell adhesion molecule binding; in CC, they were enriched in neuronal cell body, presynapse, and synaptic membrane.

KEGG pathway enrichment analysis
To better identify the biological functions of the DEGs, a KEGG pathway analysis was conducted. P < 0.05 was considered statistically significant. The results of the analysis are shown in Fig. 4. According to the P-value, ten significantly enriched pathways were obtained for each of the two conjunctival infection groups, the fallopian tube infection group, and the endometrial infection group. The significantly enriched pathways found for the conjunctival infection groups are shown in Figs. 4A and 4B, respectively. The significantly enriched pathways found for fallopian tube infections and endometrial infections are shown in Fig. 4C and Fig. 4D, respectively.

Construction of PPI network and identification of hub genes
The intersection genes of GSE20430 and GSE20436 with respect to conjunctiva (Figure S1) were obtained. To study the relationships among the proteins encoded by the conjunctival intersection genes, we uploaded 600 DEGs to STRING to establish a PPI network (Fig. 5A). The PPI network involved a total of 599 nodes and 5595 edges. The hub genes are shown in Fig. 5B.
Then the intersection genes of GSE26692 and GSE41075 in reproductive tract infection were obtained (Figure S2). To study the relationships among the proteins encoded by this group of intersection genes, we uploaded 135 DEGs to STRING to establish a PPI network (Fig. 6A). The PPI network involved a total of 131 nodes and 262 edges. The hub genes are shown in Fig. 6B.

Verification of intersection hub genes and construction of intersection gene-miRNA interaction
To make the research more rigorous, we used GSE87110 to verify the hub genes of reproductive tract infections. With P < 0.05 as the threshold, we found that the results for the hub genes CSF2, CD40, and CSF3 were statistically significant (Fig. 7). The gene-miRNA interaction of the two sets of intersection genes is shown in Fig. 8. There were two subnetworks of gene-miRNA interaction for the intersection genes of reproductive tract infection and conjunctival infection. The top 5 miRNAs ranked by degree of significance for reproductive tract infection and conjunctival infection are shown in Table 1 and Table 2, respectively.

Discussion
Bioinformatics methods were applied in this research to analyze the RNA transcription profiles of human conjunctiva, fallopian tube, and endometrial epithelial cells infected by Ct to study the characteristics of early infected host cells.
According to the GO enrichment analyses, regulation of T-cell activation showed high enrichment scores in the BP category for the conjunctival infection group, which is consistent with previous findings that CD4 T cells and IFN-γ play a primary role in immunity against Ct infection [16]. Interestingly, in the endometrium group, the BP terms were significantly associated with neutrophil activation involved in immune response. In 2018, Karthika Rajeeve et al. reported that Chlamydia trachomatis paralyzes neutrophils to evade the host's innate immune response, implying that neutrophil activation is essential to anti-infectious immunity [17]. In addition, the MF terms of the conjunctiva and fallopian tube groups were enriched in cytokine receptor activity, including interferon-class cytokine receptor activity and interleukin receptor activity. Numerous studies have confirmed the importance of cytokines such as IFN-γ in host resistance to Ct infection [18]. Among them, multiple interferon-stimulated gene (ISG)-mediated cell-autonomous host defenses have been shown to protect mice against experimental Ct infection [19]. In the fallopian tube infection group and endometrium infection group, the MF terms were highly associated with receptor ligand activity, including vitamin D receptor activator activity, suggesting its involvement in defense against Ct infections [20].
In the current study, KEGG was used to identify cell signaling pathways that are closely related to Ct infection. The enriched pathways common to the two conjunctival datasets were the hematopoietic cell lineage pathway and the osteoclast differentiation pathway. Studies have shown that the blood system responds strongly to inflammatory signals caused by infection or injury, and that hematopoietic stem cells are responsible for the final production of blood cells and play an important role in immunity and tissue repair [21]. The hematopoietic cell lineage pathway plays an important role in early infections. Therefore, the hematopoietic cell lineage pathway may act as a "first responder" in host defense when Chlamydia trachomatis infects conjunctival epithelial cells. In addition, Ct infection of conjunctival epithelial cells promotes osteoclast differentiation, which is achieved through the PI3K-Akt signaling pathway and the cAMP signaling pathway [22]. These two signaling pathways were also enriched in the fallopian tube infection in this study. Many studies have determined that the PI3K-Akt signaling pathway plays an important role in the differentiation and function of osteoclasts [22]. Osteoclasts play an absorption role in local inflammatory lesions. Because cAMP regulates transcription in Chlamydia trachomatis, it has an inhibitory effect on the developmental cycle of Chlamydia trachomatis [23,24]. Therefore, it can be inferred that the osteoclast differentiation pathway contributes to limiting the aggravation of infection and to immune clearance in Ct infection of the conjunctiva. Additionally, the results of this study show that the cytokine-cytokine receptor interaction pathway is enriched in infections of the conjunctiva, fallopian tube, and endometrium.
Studies have shown that the interaction between cytokines and their receptors may be crucial for determining the role of inflammation in the development of diseases, because infected host cells induce a large number of cytokines, chemokines, and reactive oxygen species. This in turn activates innate immunity and regulatory immune responses, which, through the interaction of cytokines and cytokine receptors, play a role in the clearance process [25]. Therefore, in Ct infection of conjunctival and genital epithelial cells, the interaction of cytokines and cytokine receptors may play a role in the immune response and immune clearance.
Determining the relationship between proteins is an important step in understanding protein functions and identifying related biological pathways [26]. A growing body of evidence shows that protein-protein interactions are critical in many biological processes in living cells [27]. Therefore, the hub genes selected by PPI analysis may also play a crucial role in the biological process of Ct infection. We used GSE87110 to verify the hub genes of reproductive tract infections and found that the CSF2, CD40, and CSF3 genes were statistically significant. Studies have shown that epithelial cells infected by Chlamydia trachomatis can release CSF2, which can mediate the influx and activation of inflammatory cells at the infection site [28,29]. In addition, the secretion of CSF2 can promote the maturation and activation of neutrophils [30]. Therefore, CSF2 plays an important role in the inflammatory response. In Schlievert et al.'s study of Staphylococcus aureus infection, it was found that superantigens can destroy the mucosal barrier by binding to CD40 and then inducing the expression of chemokines that promote infection [31]. Therefore, we can infer that the upregulation of CD40 means that such a mechanism may also exist in Ct infection. Like CSF2, CSF3 is also involved in the host response to microbial infections [32]. CSF3 can increase the chemotactic activity of neutrophils, thus causing inflammation [33].
In this study, we identified the key genes for Ct infection in the reproductive tract and conjunctiva. However, we could only verify the key genes of reproductive tract infection through an external dataset. More experiments are needed to verify these key genes in conjunctival cells as well.

Conclusion
In our research, the key genes in the biological process of reproductive tract infection with Chlamydia trachomatis were clarified through bioinformatics analysis. These hub genes mainly affect the verification process of Chlamydia trachomatis infection and may be further used in clinical treatment and clinical diagnosis.

Declarations
Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Funding
There is no funding support for our study.

Tables
Due to technical limitations, table 1

Figure 1
Comparison of DEGs in infected samples and control samples. The horizontal axis is log2 (fold change); the further a point lies from the center, the larger the fold change. The vertical axis is -log10 (adjusted P value); the closer a point is to the top of the graph, the more significant the difference. Red dots represent upregulated genes, blue dots represent downregulated genes, and gray dots represent genes without a significant difference.

Figure 2
Volcano map and heat map of differentially expressed genes (DEGs) in GSE26692 (A) and GSE41075 (B). Comparison of DEGs in infected samples and control samples. The horizontal axis is log2 (fold change); the further a point lies from the center, the larger the fold change. The vertical axis is -log10 (adjusted P value); the closer a point is to the top of the graph, the more significant the difference. Red dots represent upregulated genes, blue dots represent downregulated genes, and gray dots represent genes without a significant difference.