Skip to main content

Identification of novel immune-related signatures for keloid diagnosis and treatment: insights from integrated bulk RNA-seq and scRNA-seq analysis

Abstract

Background

Keloid is a disease characterized by proliferation of fibrous tissue after the healing of skin tissue, which seriously affects the daily life of patients. However, the clinical treatment of keloids still has limitations, that is, it is not effective in controlling keloids, resulting in a high recurrence rate. Thus, it is urgent to identify new signatures to improve the diagnosis and treatment of keloids.

Method

Bulk RNA seq and scRNA seq data were downloaded from the GEO database. First, we used WGCNA and MEGENA to co-identify keloid/immune-related DEGs. Subsequently, we used three machine learning algorithms (Randomforest, SVM-RFE, and LASSO) to identify hub immune-related genes of keloid (KHIGs) and investigated the heterogeneous expression of KHIGs during fibroblast subpopulation differentiation using scRNA-seq. Finally, we used HE and Masson staining, quantitative reverse transcription-PCR, western blotting, immunohistochemical, and Immunofluorescent assay to investigate the dysregulated expression and the mechanism of retinoic acid in keloids.

Results

In the present study, we identified PTGFR, RBP5, and LIF as KHIGs and validated their diagnostic performance. Subsequently, we constructed a novel artificial neural network molecular diagnostic model based on the transcriptome pattern of KHIGs, which is expected to break through the current dilemma faced by molecular diagnosis of keloids in the clinic. Meanwhile, the constructed IG score can also effectively predict keloid risk, which provides a new strategy for keloid prevention. Additionally, we observed that KHIGs were also heterogeneously expressed in the constructed differentiation trajectories of fibroblast subtypes, which may affect the differentiation of fibroblast subtypes and thus lead to dysregulation of the immune microenvironment in keloids. Finally, we found that retinoic acid may treat or alleviate keloids by inhibiting RBP5 to differentiate pro-inflammatory fibroblasts (PIF) to mesenchymal fibroblasts (MF), which further reduces collagen secretion.

Conclusion

In summary, the present study provides novel immune signatures (PTGFR, RBP5, and LIF) for keloid diagnosis and treatment, and identifies retinoic acid as potential anti-keloid drugs. More importantly, we provide a new perspective for understanding the interactions between different fibroblast subtypes in keloids and the remodeling of their immune microenvironment.

Introduction

Keloid is mainly caused by traumatic or inflammatory skin damage, and it is widely believed that keloid formation is based on dysfunctional tissue repair [1]. However, keloids are not scars in the usual sense because they extend far beyond the original lesion area [2]. The most common pier sites are the chest, shoulder, back, and earlobes [3]. There are numerous treatment methods for keloids, including surgical and non-surgical approaches and comprehensive treatment that combines various methods. However, it has not been effectively controlled, and the recurrence rate remains very high [4]. Surgical resection and postoperative radiotherapy are still the primary methods for treating keloid lesions, but the recurrence rate after postoperative radiotherapy can also reach 16% [5].

Studies have shown that the cause of keloids is not fully understood, but fibroblasts are known to be the primary effector cells in keloid development [6]. Among these factors, TGF-β is believed to be closely associated with the increased proliferation activity of fibroblasts in keloids, enhanced invasiveness, and excessive collagen deposition [7]. Keloids are composed of a variety of subtypes of fibroblasts, which include mesenchymal fibroblast (MF), pro-inflammatory fibroblast (PIF), secretory-papillary fibroblast (SPF), secretory-reticular fibroblast (SRF) [8]. There has been increasing evidence that dysregulation of immune genes affects wound healing and leads to keloid formation [9] and that levels of various immune cells are differentially increased in keloid tissues [10]. Thus, the genome-wide revelation of dysregulated immune-related genes in this direction is expected to identify immune-related signatures that could help investigate the molecular features and mechanisms of keloid development and provide new strategies for treating keloids. Recently, high-throughput sequencing has been the hottest topic in “omics” research [11]. Machine learning is also valuable for analyzing multi-dimensional transcriptome data and identifying biologically significant genes [12, 13]. Single cell RNA sequencing is a major innovation and technological advancement in the life sciences, which provides gene expression information at the individual cell level and is essential for revealing cellular heterogeneity [14]. Thus, bulk RNA-seq combined with scRNA seq can identify different gene expression features and heterogeneity of fibroblast subtypes in keloid scars.

In summary, we performed the present study to identify and elucidate the role of immune signatures in keloids. First, we applied Weighted Gene Co-expression Network Analysis (WGCNA), Multiscale Embedded Gene Co-expression Network Analysis (MEGENA), and three machine learning algorithms to screen for hub immune-related genes of keloid. Next, we analyzed the heterogeneous expression of KHIGS in keloid fibroblast subtypes using single-cell data. Finally, we identified the promising novel drugs by targeting KHIGs, among which retinoic acid can treat keloid by reducing the differentiation of PIF to MF. This provides a new perspective for treating keloid.

Materials and methods

Data download and processing

Both keloid bulk RNA seq (GSE44270 and GSE7890) and scRNA seq (GSE163973) used in the present study were downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/), the details of which are shown in Table 1. We grouped samples derived from normal skin tissues into the normal group, while samples derived from keloid tissues were considered as the keloid group. In dataset GSE7890, we selected samples of keloid and normal skin that were not treated with hydrocortisone. In the GSE44270 dataset, we selected fibroblast samples from non-lesional, control and keloid. Additionally, since keloids usually occur under the age of 40 and in various locations [15,16,17,18]. Therefore, the age of the patients and the locations of the skin specimens needed to be clarified. Based on the above inclusion criteria, we excluded from the study samples that were older than 40 years of age and the location from which the skin specimen was taken was not clear. The basic information of the samples included in the present study is shown in Table S1 To eliminate the batch effect between datasets caused by different platforms, we used “ComBat” in the R package “sva” to eliminate the batch effect of the merged data array of GSE44270 and GSE7890. In addition, the immune-related genes used in the present study were annotated in the ImmPort database (https://www.immport.org/ resources), shown in Table S2 Finally, we used the R package “Limma” for screening differentially expressed genes (DEGs) between keloid tissue and normal skin tissue [19].

Table 1 The basic information regarding the dataset in the present study

Co-identification of keloid-related DEGs by WGCNA and MEGENA

WGCNA is a systems biology method for describing the correlation patterns between genes in microarray samples by applying the R package “WGCNA” to construct disease-related gene network modules [20, 21]. Firstly, a weighted adjacency matrix was generated by calculating Pearson’s Correlation among all pairs of genes. The optimal soft threshold was ten, and a hierarchical clustering tree was constructed using the degree of topological encounter as distance. The module is divided by the dynamic cutting tree method. Values of sample phenotypic characteristics were turned into visual heat maps of colors. Finally, the correlation and significance between the modules and keloid were calculated, and the values of sample phenotypic characteristics were turned into visual heat maps of colors. Finally, the correlation and significance between the modules and keloid were calculated, and the correlation heatmap was drawn to determine the modules most relevant to keloid.

MEGENA can effectively construct and analyze large-scale planar filter co-expression networks [22]. We performed MEGENA analysis using the R package “MEGENA”. Multiscale hub analysis (MHA) is used to identify highly complex network clustering features (CTA). We also used Cytoscape to visualize the genes in clusters c1_2, c1_4, c1_6, and c1_15 of MEGENA. Subsequently, we used the R function “phyper” to test the hypergeometric distribution of all MEGENA and turquoise modules in WGCNA. Finally, we filtered out the MEGENA modules with significant overlap with turquoise modules in WGCNA as keloid-related gene modules under the MEGENA algorithm. Subsequently, we intersected the turquoise module of WGCNA with the keloid-related gene modules in MEGENA, and the intersected genes obtained were used as the keloid-related DEGs in the present study.

Functional enrichment analysis

Gene Ontology (GO) term and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis of keloid/immune-related DEGs were performed using the “clusterProfiler” package [23]. P value < 0.05 is a screening criterion to investigate their biological functions, signaling pathway enrichment, and disease similarities.

Identification of hub immune-related genes of keloid

Machine learning is widely used in the biomedical field and can efficiently and rapidly analyze biological data to accurately identify hub genes in gene expression profiles [24]. We used three machine learning algorithms to identify KHIGs in the present study, namely, last absolute shrinkage and selection operator (LASSO), random forest (RF), and support vector machine recursive feature elimination (SVM-RFE). Specifically, we screened keloid/immune-related DEGs for potential candidate genes using the LASSO algorithm in the “glmnet” package, the RF algorithm in the “randomForest” package, and the SVM-RFE algorithm in the “e1071” package [25,26,27]. Then, we used upset diagrams to crossover the candidate genes screened by the three algorithms mentioned above and finally identified three crossover candidate genes named KHIGs.

Construction and verification of the ANN (Artificial neural network) model

We constructed an ANN diagnostic model based on the transcriptome level of KHIGs and the clinical characteristics of the samples (age and gender) using the R package “neuralnet”. The ANN model is able to mimic the structure and function of the brain’s neural networks to derive a set of classification rules from complex and irregular data, resulting in a highly accurate diagnostic model [28]. Specifically, we use the “caret” R package to divide the samples into a training set and a testing set in the ratio of 7:3, and the detailed sample division is shown in Table S3. This division allows us to train the model on one subset and evaluate the performance of the model on a separate subset, thus ensuring a fair assessment of the accuracy of the algorithm. Next, we utilized the R packages “neuralnet” and “neuralnettools” to construct an ANN diagnostic model, which consists of three main layers: input, hidden, and output layers, all of which work in coordination to work to achieve accurate keloid classification. During the construction of the model, the input data consisted of the expression of KHIGs and the clinical characteristics of the samples (age and gender). Additionally, the output data was composed of two nodes that we defined: “Normal” for the normal group and “Keloid” for the keloid group. Each of these nodes employed a softmax activation function to ensure the model in the configuration of the key parameters of the model, we set the first hidden layer to 8 neurons and the second hidden layer to 3 neurons, and utilized the gene weight information to produce an accurate keloid classification model. Meanwhile, we reduce the risk of overfitting and underfitting by 10-fold cross-validation. Finally, we use accuracy; precision; recall; F1-score and AUC to evaluate the diagnostic ability of the ANN model in the training set and testing set.

Construction of immune genes score

To quantify the expression patterns of immune-related genes in keloid patients, we constructed a scoring system to quantify the expression of immune-related genes and their diagnostic effects in keloid patients, which we called immune genes score (IG score).The process of establishing IG score is as follows:

First, we used the R package “ConensusClusterPlus” to perform unsupervised clustering of keloid patients; in terms of the clustering effect, the clustering stability was higher when k = 2. Thus, we characterized keloid patients into two different clusters (C1 and C2) based on the unsupervised clustering results. To construct the IG score, we performed principal component analysis (PCA) based on the KHIGs expression levels and used principal component 1 and principal component 2 as the feature scores.The formula for calculating the IG score is as follows:

IG score=∑(PC1i + PC2i)

where “i” stands for KHIG. we divided the samples with IG scores greater > 0 into the high IG score group, and the samples with IG scores < 0 into the low IG score group [29, 30].

Evaluation of immune cell infiltration and their correlation with the KHIGs

Infiltration of immune cells was assessed using ssGSEA [31]. Specifically, ssGSEA was performed in R language using the R packages “GSVA” and “GSEABase” and the immunological characteristics of keloid patients were assessed using the ssGSEA algorithm, respectively. We performed single-sample gene set enrichment analysis (ssGSEA) and calculated ssGSEA scores. We used the “pheatmap” R package to visualize the infiltration levels of different immune cells. To assess the differential infiltration abundance of different immune cells between normals and keloids, we used the Wilcoxon test for pairwise comparisons.

Fibroblast cells subtype recognition

We downloaded the scRNA-seq data of keloid from the previous study of Deng et al. [32], which contains 3 keloid samples, 3 control samples, and 40,655 cells, including 13,437 fibroblast cells after Deng et al. obtained the cells by quality control. We retrieved the scRNA-seq of 13,437 fibroblasts, which had been annotated, and analyzed them using the R package “Seurat“ [33]. Next, we normalized gene expression in fibroblasts using the “NormalizeData” function and performed principal component analysis using the “ElbowPlot” function to extract the top 20 principal components (PCs), which were further analyzed using the “FindVariableFeatures” function. We used the “FindNeighbors” and “FindClusters” functions for unsupervised and unbiased clustering of cell subpopulations. Then, we annotated fibroblast subpopulations using known markers [32], respectively PIF (APOE/CXCL3), MF (ASPN/POSTN), SRF (APCDD1/COL18A1), SPF (WISP2/MFAP5).

Pseudotime trajectory analysis

Pseudotime trajectory analysis uses single-cell transcriptome sequencing to determine cell differentiation status. In the present study, we performed pseudotime trajectory analysis of individual cells using the R software package “monocle2“ [34,35,36]. We utilized the Seurat processed data as a matrix of Genes-Cells for raw UMI counts. We used the “newCellDataset” function to generate an object (expressionFamily = negbinomial. size) for pseudotime trajectory analysis. Subsequently, we included genes with an average expression value higher than 0.1 and detected their expression in at least ten cells for pseudotime trajectory analysis. We used the “reduceDimension” function to reduce the dimensionality of the cell differentiation trajectories with parameters reduction_method = “DDRTree” and max_components = 2. We used the “orderCells” function to construct pseudotime trajectories and order the cells, and we used “plot_cell_trajectory” to classify and visualize the cells.

Identification and docking of drugs targeting KHIGs

Previous studies have shown that drug discovery begins with identifying disease targets, and target-based drug discovery is the most common strategy for new drug development [37, 38]. Thus, to identify drugs that can target KHIGs, we used the Enrichr platform (https://maayanlab.cloud/Enrichr/) for online identification and analysis. First, we entered the KHIGs gene symbols on the homepage of the Enrichr platform. Then, we screened the DSigDB database in the “Disease/Drugs” module to identify drugs that target KHIGs and set p < 0.05 as statistically significant. Subsequently, we investigated the interactions of the screened drugs with KHIGs using molecular docking techniques to identify the most promising drugs. Specifically, we obtained the molecular structures of the screened drugs from the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). Meanwhile, the 3D coordinates of LIF (PDB ID, 1BQU; resolution, 2.00Å), PTGFR (PDB ID, 8IUK; resolution, 2.67 Å), and RBP5 (PDB ID, 7A9Y; resolution, 1.64 Å) were retrieved from the PDB database (https://www.rcsb.org/). Protein and molecular files were converted to PDBQT format, excluding water molecules, and including polar hydrogen atoms. To allow unrestricted molecular movement, a centered grid box surrounded the structural domains of each protein. Use the AutoDock tool for protein-ligand docking and visualize the interactions between receptors and ligands using PyMOL.

Patients samples

In previous analyses, we found dysregulated expression of KHIGs in keloid scars. To further validate this finding, we collected samples from clinical patients in both disease and non-disease groups, and then validated KHIGs at the gene and protein levels (qPCR, WB and IHC). This study included three untreated keloid patients and three requiring skin grafting for abdominal skin removal. The detailed information is shown in Table S4. All patients signed the informed consent form, and samples (the full-thickness skin tissue) were collected, processed, and analyzed under the guidance of the Ethics Committee of Guangzhou Red Cross Hospital (Ethics Approval number 2022-292-01).

Cell extraction and culture

Briefly, harvested tissue samples are washed with phosphate-buffered saline, cut into chunks, and digested with 1 mg/ml collagenase type I (SIGMA, USA). The digested tissue samples were centrifuged at 1500 × g for 5 min and filtered through a 100 μm filter. The supernatant was discarded, re-suspended in DMEM (CORNING, China), and cultured in DMEM supplemented with 10% fetal bovine serum (Gibco, Australia) at 37℃ and 5% CO2. The experiment used 2–3 primary cells.

Eosin staining

Tissues were embedded in paraffin, sectioned into 4-µm intervals, and hydrated with a series of ethanol concentrations. The cleaned sections were placed in hematoxylin dye for about 5 min and eosin dye for 5 min. Then, the slices were dehydrated, sealed, observed, and photographed under microscopy (Nikon).

Masson staining

Formalin-fixed paraffin-embedded skin tissue sections were stained with hematoxylin for 8 min, differentiated with 1% hydrochloric acid alcohol for 15 s, incubated with Masson blue solution for 15 s, and then subjected to Ponceau acid fuchsin staining for 5 min. After washing with 1% phosphomolybdic acid solution for 2 min, aniline blue staining was conducted for 2 min. The tissue sections were dehydrated with anhydrous ethanol three times for 10 s each, then incubated with xylene 3 times for 1 min each. Finally, the tissue sections were sealed with neutral resin, and microscopy (Nikon) was used to observe the stained skin tissues.

Quantitative real-time polymerase chain reaction

Total RNA was extracted from keloid fibroblasts (KFb) and normal skin fibroblasts (HSFb) using Trizol (Ambion, USA). Then 500 ng RNA was used to synthesize cDNA with PrimeScriptTM RT Master Mix (Takara, Japan), and the mRNA levels were quantified using the TB GreenTM Premix Ex Taq II (Takara, Japan). The 2−ΔΔCT relative quantitative algorithm analyzed the results. Primer sequence-Table S5.

Western blot assay

Protein concentrations were determined by the BCA Protein Assay Kit (Thermo Fisher Scientific; USA) following HSFb and KFb cell lysis in RIPA buffer and PMSF (Beyotime; China). The 20ug protein sample was placed into a 10% or 8% sodium dodecyl sulfate-polyacrylamide gel for electrophoresis and then transferred to a polyvinylidene fluoride membrane (Merck; USA). The membrane was blocked for one hour at room temperature and incubated with primary antibodies overnight at 4℃. The primary antibodies used were Mouse anti-β-actin (1:1000, Servicebio, China). Rabbit anti-LIF (1:1000, Zenbio, China), Rabbit anti-RBP5 (1:1000, Zenbio, China), Rabbit anti-PTGFR (1:1000, Abcam, USA). The membranes were washed three times with 0.05% Tween 20 Tris-buffered saline and incubated with the corresponding HRP-conjugated secondary antibodies for two hours at room temperature. We measured the blots’ densities using an ECL reagent kit (Thermo Fisher, USA). The protein levels were detected using the ChemiDocTMXRS + imaging system (Bio-Rad; USA), which detected the protein expression. The data analysis uses ImageJ software. The internal control is β-actin.

Immunohistochemistry assay

Formalin-fixed paraffin-embedded keloid and skin tissue are Dewaxed and rehydrated with dimethylbenzene and different concentrations of ethyl alcohol, then antigenically repaired and sealed. The sections were then incubated with the following primary antibodies overnight at 4℃: Rabbit anti-LIE (1:100, Zenbio, China), Rabbit anti-RBP5 (1:100, Zenbio, China), and Rabbit anti-PTGFR (1:400, Servicebio, China). Subsequently, the sections were rinsed thrice with phosphate-buffered saline for 5 min each time. The sections were then sequentially incubated with biotin-conjugated secondary antibodies and horseradish peroxidase (HRP)-conjugated secondary antibodies at 22℃ for 30 min. Chromogenic, hematoxylin staining. The samples were photographed by microscope (Nikon), and images were analyzed using ImageJ v2.1.0.

Immunofluorescent assay

The sections of HSFb and KFb cells were treated with 0.3% Triton X-100 and blocked with 10% BSA. The processed sections were incubated overnight with primary antibodies at 4℃. The primary antibodies used were Rabbit anti-collagen I(1:200; Abcam, USA). The corresponding secondary antibodies (1:100; Zenbio, China) were added to the sections for specific binding. Phalloidin-SF488(Zenbio, China) was used to stain the actin filaments, and DAPI (Beyotime; China) was used to stain the nucleus. The stained sections were imaged using microscopy (Nikon).

Statistical analysis

Statistical analysis and visualization were conducted using R and GraphPad Prism software for this study. The t-test and the Mann–Whitney U-test (the Wilcoxon rank sum test) were selected for whether the data conformed to a normal distribution. Statistical significance was defined as p < 0.05.

Results

Identification of differentially expressed immune-related genes in keloid

We performed a series of analyses based on keloid bulk RNA-seq to systematically investigate the functions of immune-related genes in keloid, and the workflow chart of this study is shown in Fig. 1. However, different datasets exhibit data heterogeneities and batch effects. Figs. 2A and B illustrate the overall profiles of GSE44270 and GSE7890 before merging. We used the “ComBat” function in the R package “sva” to eliminate the batch effect [39]. Figs. 2C and D show the overall landscape of the GSE44270 and GSE7890 data after removing the batch effect using the ComBat function. The results indicated that the samples were uniformly distributed, the batch effect between GSE44270 and GSE7890 was removed, and the data could be used for subsequent analysis.

Following this, we obtained 836 DEGs by gene expression differential analysis (Table S6), of which 426 were significantly up-regulated and 410 were significantly down-regulated genes (Fig. 2E). The overall landscape of DEGs is shown in Fig. 2F. Subsequently, we subjected the 836 DEGs to WGCNA and MEGENA analyses to identify keloid-related DEGs. Specifically, we selected 10 as the soft threshold based on mean connectivity and scale independence (Fig. 2G). The hierarchical clustering of modules is carried out based on the topological overlap matrix (TOM), and similar modules on the cluster tree are merged (Fig. 2H). Cluster tree maps and characteristic heat maps of module feature genes were drawn (Fig. 2I). Lastly, we obtained three gene modules, and the results showed that the turquoise module was significantly positively correlated with keloids and had the highest correlation (r = 0.83, p = 8e-07) (Fig. 2J). Thus, we identified 497 genes relevant to keloids through the turquoise module (Table S7). Similarly, we also constructed the MEGENA network with 836 DEGs (Fig. 2K), and we identified a total of 53 modules by MEGENA analysis (Table S8), of which 25 modules had p value ≤ 0.05 (Fig. S1A), and the first four modules were shown as Fig. S1B. Eventually, the upset diagram showed 44 keloid/immune-related DEGs between WGCNA’s turquoise module, MEGENA’s significant module, and immune-related genes (Fig. 2L).

Meanwhile, we investigated the potential biological functions of the 44 overlapping DEGs based on GO and KEGG enrichment analyses, in which the GO results showed that they were mainly enriched in “chemokine receptor binding,” “growth factor receptor binding,” “vascular endothelial growth factor binding,” “extracellular matrix,” “transcription factor AP-1 complex”, “cell proliferation,” “apoptotic process” and “inflammatory response“ (Fig. S2A). The results of KEGG showed that they were mainly enriched in the” TNF signaling pathway,” “IL-17 signaling pathway”, “ErbB signaling pathway,” “TGF-beta signaling pathway,” “Chemokine signaling pathway,” “MAPK signaling pathway,” “Apoptosis” and “Jak-STAT signaling pathway“ (Fig. S2B).

Fig. 1
figure 1

Schematic view of the procedures for data collection and analyses in keloid

Identification of KHIGs based on multiple machine learning algorithms

To identify the hub immune genes in keloid, we applied three machine learning algorithms to analyze and identify the 44 keloid/immune-related DEGs, including LASSO, SVM-RFE and RF (Table S9). Specifically, we identified 8 candidate genes (ARG2, STC2, LIF, RBP5, C5, LEFTY2, GREM1, and PTGFR) by LASSO algorithm (Fig. 3A) and 15 candidate genes by SVM-RFE algorithm with an accuracy of 0.997 and an error rate of 0.003 (PTGFR, OPRD1, ARG2, GREM1, LIF, RBP5, C5, GAST, HSPA1L, ULBP1, GALR3, SEMA4G, LEFTY2, TNC, NEO1) (Fig. 3B), and RF algorithm identified 8 candidate genes with importance greater than 0.5 (LIF, RBP5, SEMA4G, SEMA3A, CCL20, PTGFR, FGF2, LIFR) (Fig. 3C). Subsequently, we intersected the candidate genes obtained from the mentioned above machine learning algorithms and finally found that PTGFR, RBP5, and LIF could be indicated by all the algorithms, meaning that PTGFR, RBP5, and LIF could be used as KHIGs for keloid under the multiplex algorithm (Fig. 3D).

Fig. 2
figure 2

Identification of keloid/immune-related DEGs based on bulk RNA-seq. (A-B) The overall landscape of unprocessed data from the cohort. (C-D) The overall landscape of data from the cohort after removing batch effects. (E) Volcanic map showing the DEGs between keloids and controls. Red dots indicate up-regulated genes, and blue dots indicate down-regulated genes. (F) Heatmap showing the overall landscape of the DEGs between keloids and normals. (G) Graphs of the soft-threshold power versus scale-free topology model fit index and mean connectivity. (H) Dendrogram of the genes clustered based on a dissimilarity measure. (I) Dendrogram of samples and a heatmap plot of the indicated traits. (J) Analysis of the relationship between gene modules and traits (Keloid/Normal). (K) MEGENA analysis of DEGs, where each node represents a module and larger nodes indicate more genes. (L) The upset diagram showed crossover genes (keloid/immune-related DEGs) of WGCNA, MEGENA, and immune-related genes

Fig. 3
figure 3

Identification of KHIGs using 3 machine learning algorithms. (A) LASSO coefficient profiles of the 44 overlapping DEGs (left panel). After cross-validation for tuning parameter selection, 8 candidate genes were identified (right panel). (B) SVM–RFE algorithm identified 15 candidate genes with an accuracy of 0.997 (left panel) and an error of 0.003 (right panel). (C) RandomForest algorithm identified 8 candidate genes. RandomForest error rate versus the number of classification trees (left panel) and gene importance scores (right panel). (D) Venn diagram showing the candidate genes (KHIGs) screened by RandomForest, SVM-RFE, and LASSO algorithms

Evaluation of the diagnostic performance of KHIGs applying to keloid

Immediately after that, we verified the importance of KHIGs in multiple dimensions. Specifically, based on the expression level, KHIGs were all dysregulated expression (Figs. 4A-C). Based on the diagnostic perspective, the receiver operating characteristic curve (ROC) showed that KHIGs all had a high area under the curve (AUC) values in which the AUC values of LIF, PTGFR, and RBP5 were respectively 0.937, 0.921, and 0.881 (Fig. 4D). Additionally, we also constructed an ANN model to diagnose the onset of keloids based on the transcriptome level of KHIGs and the clinical characteristics of the samples (age and gender) (Fig. 4E). Meanwhile, we assessed the prediction ability of the ANN using multiple evaluation metrics, including accuracy, precision, recall, F1-score and AUC. The results of the ANN predicting training and testing sets are shown in Table 2; Figs. 4F and G, where the accuracy, precision, recall, F1-score and AUC of the training set are 0.813, 0.890, 0.800, 0.842 and 0.817, respectively, and those of the testing set are 0.714, 0.750, 0.750, 0.750 and 0.708, respectively. Overall, the ANN model was convincing as an independent diagnostic predictor of keloids.

Fig. 4
figure 4

Diagnostic performance of KIHGs in keloids. (A-C) The expression of the screened KHIGs between the keloid and normal samples. (D) ROC showing the diagnostic performance of the screened KHIGs. (E) Construction of ANN based on the transcriptome level of KHIGs and the clinical characteristics of the samples (age and gender). (F) The AUC of the training set with a value of 0.817. (G) The AUC of the testing set with a value of 0.708

Table 2 ANN diagnosis effect for the training and testing set

Construction and validation of the IG score based on KHIGs

To assess the correlation between KHIGs and keloid risk, we performed unsupervised clustering of keloid patients (Figs. 5A-C). The heatmap showed clear boundaries and stable and reliable clustering of the cluster1 and cluster2 (C1 and C2) when k = 2, and the Uniform Manifold Approximation and Projection (UMAP) and t-distributed stochastic neighbor embedding (t-SNE) dimensionality reduction analyses performed also showed that C1 and C2 were well-dispersed (Fig. 5D). Subsequently, we calculated the IG score based on the KHIGs and further assessed the risk of keloid in each sample using principal component analysis, and all samples were divided into low or high IG score subgroups based on IG scores < or > 0. The results showed that the high IG score subgroup belonged to C1, whereas the low IG score subgroup belonged to C2 and the normal group (Fig. 5E). We also compared the IG score of the normal, C1 and C2 separately, and the results showed that the C1 had higher IG scores than the C2 and the IG scores of keloid (C1 and C2) were higher than the normal group (Figs. 5F and G). Lastly, we performed the ROC analysis to further assess the predictive ability of the IG score, and the results showed that the AUC value of IG score was 0.977, which means that it has a good predictable performance (Fig. 5H). In conclusion, these data not only suggest that IG score based on KHIGs are predictive in identifying the risk of keloid patients, but also imply that keloids themselves may be heterogeneous and have different disease subtypes.

Fig. 5
figure 5

Construction of IG scores for assessment of keloid risk. (A-C) Unsupervised clustering of keloids based on KHIGs. (D) UMAP (left panel) and t-SNE (right panel) analyses to validate the effect of unsupervised clustering of keloids by KHIGs, with each point representing a sample. (E) Alluvial plots showing the associations between the three groups: disease state grouping, unsupervised clustering grouping and IG score grouping. (F-G) Violin plots showing the comparison of IG scores between the normals and the keloids distributed in C1 and C2. (H) The ROC curve used to evaluate the IG score

Immune cell infiltration analysis

We performed evaluation of immunological characterization of keloid samples using the ssGSEA algorithm, that is, we calculated the of immune cells abundance for each sample, including normal samples. Fig. 6A showed the overall immune cell infiltration, and the results indicated that there was a significant difference in immune cell infiltration between the keloids and the normals. Compared with normals, keloids have lower “T follicular helper cell” and “Macrophage” (Fig. 6B). Following this, we investigated the relationship between the KHIGs expression and the abundance of the immune cells mentioned above that are dysregulated in abundance by correlation analysis. We found significant correlations between the KHIGs expression and the abundance of these immune celltypes. For example, the LIF expression was negatively correlated with the abundance of “T follicular helper cell” and “Macrophage” (Fig. 6C), whereas the RBP5 expression was negatively correlated with its abundance (Fig. 6D), and then the PTGFR expression was positively correlated with the abundance of “T follicular helper cell” and “Gamma delta T cell” (Fig. 6E). The significant correlation between the KHIGs expression and the immune cell abundance implies that KHIGs may have a potential role in regulating the immune microenvironment of keloid.

Fig. 6
figure 6

Association of immune cell infiltration abundance with KHIGs. (A) Heatmap showing the overall landscape of immune cell abundance for keloid samples and normal samples. (B) Box plots showing the differences in immune cell infiltration abundance between keloid samples and control samples. (C) Correlation analysis between the expression of LIF and immune cell infiltration abundance. On the outside of the thick black line it is a positive correlation and on the inside of the thick black line it is a negative correlation. (D) Correlation analysis between the expression of RBP5 and immune cell infiltration abundance. On the outside of the thick black line it is a positive correlation and on the inside of the thick black line it is a negative correlation. (E) Correlation analysis between the expression of PTGFR and immune cell infiltration abundance. On the outside of the thick black line it is a positive correlation and on the inside of the thick black line it is a negative correlation. (ns, no significance, *p < 0.05, **p < 0.01, ***p < 0.001)

Single-cell RNA seq analysis of skin tissue

Single-cell seq of 13,437 fibroblasts from 3 keloid samples and 3 normal samples were integrated into the present study (Table S10), and four clusters were identified based on principal component analysis and the UMAP algorithm (Fig. 7A). We annotated the fibroblast subtypes according to the known markers [32], that is, PIF (APOE/CXCL3), MF (ASPN/POSTN), SRF (APCDD1/COL18A1), SPF (WISP2/MFAP5) (Figs. 7B-F). The single-cell landscapes of annotated fibroblast subtypes are shown in Figs. 7G and H. We found that MF and PIF accounted for a higher proportion of the various fibroblast subtypes by cell percentage counting, with MF being more prevalent in keloids and PIF being more prevalent in normal tissues (Fig. 7I). Lastly, we have shown the KHIGs expression in fibroblast subtypes in Figs. 7J-M.

Fibroblasts exhibit temporal heterogeneity in gene expression patterns during differentiation. To explore the heterogeneous expression of KHIGs during fibroblast differentiation, we used the “Monocle2” R package to perform pseudotime analysis. The pseudotime analysis of fibroblasts showed one node and three states during fibroblast differentiation, which were attributed to pseudotime values for the differentiation process. The results according to differentiation state, cell type, and pseudotime sequence showed that fibroblasts differentiated chronologically from the beginning to the end of the differentiation trajectory in state 1, state 3, and state 2, which means that PIF, SPF, and SRF tend to differentiate towards MF (Fig. 7N). Subsequently, we found that the expression pattern of KHIGs on fibroblast differentiation trajectories also showed temporal heterogeneity; that is, LIF showed a tendency to decrease, then increase, and then decrease, and the expression of PTGFR gradually decreased, whereas the expression of RBP5 showed a tendency to increase and then decrease (Fig. 7O).

Fig. 7
figure 7

Single-cell RNA seq analysis of skin tissue. (A) 4 cell clusters were identified. (B-E) Heatmap of known marker expression in each type of cell subpopulation. (F) Dot plot of known marker expression in each type of cell subpopulation. (G-H) Mesenchymal fibroblast (MF), pro-inflammatory fibroblast (PIF), secretory-papillary fibroblast (SPF), and secretory-reticular fibroblast (SRF), the 4 types of fibroblast subtypes were annotated by known markers. (I) Counting of cell proportions. (J) Dot plot of KHIGs expression in the four types of fibroblast subtypes. (K-M) Violin plots (left panels) and heatmaps (right panels) of KHIGs expression in four types of fibroblast subtypes; each dot represents one cell. (N) Differentiation trajectories of fibroblasts, respectively, grouped based on differentiation status, pseudotime sequence, and cell types. (O) Jitter plot of KHIGs heterogeneously expressed with cell differentiation (pseudotime sequence). (ns, no significance, *p < 0.05, **p < 0.01, ***p < 0.001)

Validation of KHIGs

The HE and Masson stain can show the difference between keloids and normal skin lesions. In normal skin, the fibers are arranged regularly and densely. In keloid, more fibers are found in the dermis, arranged irregularly and densely (Figs. 8A and B). To verify the expression of KHIGs in keloid, mRNA and protein expression levels of fibroblasts in keloid and normal skin were detected by qRT-PCR and Western blot, and KHIGs protein levels in keloid and normal skin were detected by IHC. The results of qRT-PCR showed that KHIGs had significant differences in keloid fibroblast samples (p value ≤ 0.05) (Fig. 8C), and LIF expression was up-regulated in keloid. At the same time, PTGFR and RBP5 were down-regulated in keloid. Western blot (Figs. 8D and E) and IHC (Figs. 8F-K) showed that PTGFR and RBP5 verified our results at the protein level (p value ≤ 0.05), and LIE expression was less and did not differ at the protein level.

Fig. 8
figure 8

Validation of KHIGs. (A-B) HE and Masson staining of normal skin and keloid. (C) Relative mRNA expression levels of KHIGs. (D) Western blotting assay of KHIGs, it is related to Figure S3. (E) Relative protein levels of KHIGs. (F-H) Immunofluorescent assay of KHIGs. (I-K) Relative protein levels of KHIGs. (ns, no significance, *p < 0.05, **p < 0.01, ***p < 0.001)

Evaluation of retinoic acid

We used the Enrichr platform to identify potential drugs by targeting KHIGs that can be used to treat or alleviate keloids. We found that Luteolin(LIF) (Fig. 9A), Retinoic acid (RBP5) (Fig. 9B), and Parthenolide (PTGFR) (Fig. 9C) bind closely (p value ≤ 0.05). To analyze the binding affinity of the three drugs, the MMPBSA method was used to calculate the binding free energy of drugs and molecules. The results showed that KHIGs were all able to dock with the identified drugs with high binding energies, such as LIF-Luteolin (-7.1 kcal/mol), RBP5-Retinoic acid (-7.6 kcal/mol), PTGFR- Parthenolide (-6.6 kcal/mol). In summary, it is suggested that these drugs may positively affect the targeted treatment of keloid. To explore the role of Retinoic acid in the pathogenesis of keloid. First, we will treat HSF cells with 10µM TGF-β1 for 48 h. We found that the mRNA expression of MF (ASPN, POSTN, and COMP) was up-regulated (p value ≤ 0.05), and the mRNA expression of PIF (APOE, CCL19, and CXCL3) was down-regulated(p value ≤ 0.05) (Figs. 9D and E). They indicate that TGF-β1 could further increase the proportion of MF in normal skin fibroblasts. Then, 1µM Retinoic acid was added to 10µM TGF-β1 and treated together for 48 h. The results showed (Figs. 9F-H) that 10µM TGF-β1 group protein RBP5 expression decreased, and collagen type I significantly increased compared with the control group. More interestingly, after adding Retinoic acid to TGF-β1, we found that Retinoic acid can promote the increase of RBP5 expression and inhibit the production of collagen I.

Fig. 9
figure 9

Retinoic Acid inhibition of TGF-β1-induced collagen I production in fibroblasts. (A) The structure of the complex is formed by the docking of Luteolin with LIF. (B) The structure of the complex is formed by the docking of Retinoic acid with RBP5. (C) The structure of the complex is formed by the docking of Parthenolide with PTGFR. (D) Relative mRNA expression levels of ASPN, POSTN, and COMP. (E) mRNA expression levels of APOE, CXCL3, and CCL19. (F) Western blotting assay of RBP5 and collagen I, it is related to Figure S3. (G) Relative protein levels of RBP5 and collagen I. (H) Immunofluorescent assay of Collagen I, DAPI, and Phalloidin. (ns, no significance, *p < 0.05, **p < 0.01, ***p < 0.001)

Fig. 10
figure 10

Schematic illustration illustrating the pathogenesis of keloid

The potential regulatory mechanism of KHIGS in contributing to keloid

To reveal the mechanism of KHIGS in keloid, we further integrated the literature reports [32, 40] and our research results. We drew the mechanism diagram of KHIG in keloid: (1) Fibroblast differentiation trajectory: the trend of PIF, SPF, and SRF differentiation to MF. (2) TGF-β, TNF-α, PDGF, and Wnt signaling pathways play a key role in regulating fibroblast function. (3) KHIGs also showed temporal heterogeneity in the expression pattern of fibroblast differentiation locus; that is, LIF decreased first and then increased, and PTGFR expression decreased first and then gradually decreased. The expression of RBP5 decreased first and then increased. (4) Communication between SPF and SRF cells and PIF decreased, while communication between PIF and MF cells increased. (5) Retinoic acid further inhibits the transformation of PIF to MF by promoting the expression of RB5, thereby reducing the production of collagen and extracellular matrix (ECM) (Fig. 10).

Discussion

Keloids result from abnormal scarring on injured skin and are characterized by excessive proliferation of fibroblasts, excessive production of extracellular matrix, and excessive deposition of collagen, often causing physical and psychological distress to the patient [41, 42]. At present, treatments for keloids, such as surgery, radiation, and physical therapy, are ineffective and have a high recurrence rate [43]. In keloid pathogenesis, the accumulation of many inflammatory cells in keloid lesions, such as macrophages, mast cells, and T lymphocytes, suggests that keloid lesions may be an inflammatory skin disease [44]. Hence, most current research has focused on the inflammatory aspects of keloids [45]. Here is little emphasis on the importance of immunity for keloids. There is a new view that immunity plays a crucial role in preventing pathogen invasion, inducing inflammation, and regulating fibroblasts in keloid in the study of keloid mechanism [46]. Thus, it is necessary to study keloids in immune-related aspects through bioinformatics.

In the present study, we first obtained 836 DEGs, including 410 up-regulated genes and 426 down-regulated genes, from publicly available datasets of keloid and normal skin tissue samples after removal of batch effects. Then, we intersected the keloid-related genes identified by WGCNA and MEGENA analyses with the immune gene list and finally obtained 44 keloid-related immune genes. We performed GO and KEGG enrichment analyses on the mentioned 44 genes, and GO analyses showed that these genes were mainly involved in inflammatory responses, cell proliferation, apoptosis, and other processes in biological processes, which was consistent with the results of previous studies [47,48,49]. Regarding cellular components, these genes were mainly enriched in the extracellular matrix, transcription factor AP-1 complex, collagen-containing extracellular matrix, etc., consistent with previous studies [50,51,52]. Regarding molecular functions, these genes were mainly enriched in chemokine receptor binding, growth factor receptor binding, integrin binding, etc., which was consistent with previous studies [53,54,55]. KEGG analysis showed that 44 keloid-related immune genes were mainly enriched in the Jak-STAT signaling pathway, MAPK signaling pathway, TGF-β signaling pathway, and other related signaling pathways, consistent with previous studies [56,57,58]. These results suggest that 44 keloid-related immune genes may regulate these key signaling pathways during the development and progression of keloids, which needs to be further investigated.

We further screened 44 keloid-related immune genes using multiple machine-learning algorithms: LASSO, SVM-RFE, and RF. We showed that LIF, PTGFR, and RBP5 can be indicated by all algorithms named KHIGs. LIF was significantly up-regulated in keloids compared to normal skin, while PTGFR and RBP5 were down-regulated in keloids. In addition, our ANN models constructed based on the expression of KHIGs and the clinical information of the samples all had AUC values higher than 0.7. Previous studies have shown that the area under the ROC curve is greater than 0.5, proving that the diagnostic model has some diagnostic value [59]. That means that KHIGs have good performance as dysregulated immune genes associated with keloid in predicting keloid. Notably, the leukemic inhibitory factor (LIF) is the most variable member of the interleukin-6 family [60], and LIF has previously been shown to be a potential therapeutic target for renal interstitial fibrosis [61]. The prostaglandin F(2α) receptor (PTGFR) is thought to play a role in pregnancy and labor [62], while Retinol-binding protein 5 (RBP5) regulates cell differentiation and proliferation [63].

It is well-known that fibroblasts are crucial factors in the pathogenesis of keloids. Previous studies have shown four different subtypes of fibroblasts in keloids and normal tissues [64]. We further found that the trend of KHIGS expression was also different in different cell subpopulations, which may suggest that the heterogeneous expression of KHIGS in different cell subtypes may also promote keloid. In addition, in our pseudo-time analysis, MF belonged to terminally differentiated cells. In contrast, fibroblasts of the other three subtypes could be transformed into MF. This further suggests that KHIGS plays a sophisticated function in the molecular mechanisms of keloids, implying that the heterogeneous expression of KHIGS in different fibroblast subtypes epitomizes the heterogeneity of keloids, meaning that KHIGS may serve as a predictor of keloid risk.

It has been shown in previous studies that the process of drug discovery begins with the identification of disease targets, and target-based drug discovery is the most common strategy for new drug exploitation [37, 38]. Current data mining of existing biomedical data and information contributes significantly to target discovery in the “Omics” era. This implies that target discovery is a critical step in the biomarker and drug discovery process for diagnosing and treating human diseases [65,66,67]. In the present study, we identified three potential drugs by targeting KHIGS: Parthenolide (PTGFR), Luteolin (LIF), and Retinoic acid (RBP5). Parthenolide, naturally occurring sesquiterpene lactone derived from feverfew, exhibits exceptional anti-cancer and anti-inflammatory properties [68]. The research suggests that Parthenolide could inhibit the UVB-induced proliferation of keratinocytes and melanocytes in the mouse skin [69]. Luteolin is a flavonoid found in different plants, such as vegetables, medicinal herbs, and fruits, as a modulator of skin aging and inflammation [70]. As a modulator of skin aging and inflammation, it has been extensively studied in skin aging, skin cancer, wound healing, and inflammatory skin diseases [71]. Retinoic acid is a metabolic product derived from vitamin A, acting at a nuclear level to maintain the proper transcriptional activity [72]. Retinoic acid and its derivatives have therapeutic potential for severe skin diseases [73]. Based on the above studies, it can be seen that the above three drugs have positive effects on skin diseases. We then used molecular docking methods to investigate further the binding affinity between these drugs and their related KHIGS, so we speculate that Parthenolide, Luteolin, and Retinoic acid are potential drugs for treating keloids. Finally, we chose retinoic acid with the highest binding energy. Previous studies have shown that retinoic acid can treat keloids [74], but the specific mechanism is unknown. In the experiment, we believe that retinoic acid may inhibit the form of PIF to MF differentiation through RBP5 to reduce collagen secretion further.

Conclusion

In conclusion, the present study provided new immune signatures (PTGFR, RBP5, and LIF) for keloid diagnosis and treatment using multiple bioinformatic analyses and machine learning algorithms. Meanwhile, retinoic acid was identified as a potential anti-keloid drug by molecular docking methods as well as experimental validation. It helps to break through the current dilemma encountered in the diagnosis and treatment of keloid in the clinic.

Data availability

Datasets related to this article are from public database (GSE44270, GSE7890 and GSE163973). All data generated or analyzed during this study are included in this article/Additional files, further inquiries can be directed to the corresponding author.

References

  1. Delaleu J, Charvet E, Petit A. Keloid disease: review with clinical atlas. Part I: definitions, history, epidemiology, clinics and diagnosis. Ann Dermatol Venereol. 2023;150(1):3–15.

    Article  CAS  PubMed  Google Scholar 

  2. Katayama Y, Naitoh M, Kubota H, Yamawaki S, Aya R, Ishiko T, Morimoto N. Chondroitin sulfate promotes the proliferation of keloid fibroblasts through activation of the integrin and protein kinase B pathways. Int J Mol Sci 2020, 21(6).

  3. Murray JC, Pollack SV, Pinnell SR. Keloids: a review. J Am Acad Dermatol. 1981;4(4):461–70.

    Article  CAS  PubMed  Google Scholar 

  4. Ogawa R. Keloid and hypertrophic scars are the result of chronic inflammation in the reticular dermis. Int J Mol Sci 2017, 18(3).

  5. Hwang NH, Chang JH, Lee NK, Yang KS. Effect of the biologically effective dose of electron beam radiation therapy on recurrence rate after keloid excision: a meta-analysis. Radiotherapy Oncology: J Eur Soc Therapeutic Radiol Oncol. 2022;173:146–53.

    Article  CAS  Google Scholar 

  6. Marty P, Chatelain B, Lihoreau T, Tissot M, Dirand Z, Humbert P, Senez C, Secomandi E, Isidoro C, Rolin G. Halofuginone regulates keloid fibroblast fibrotic response to TGF-β induction. Biomed Pharmacotherapy = Biomedecine Pharmacotherapie. 2021;135:111182.

    Article  CAS  PubMed  Google Scholar 

  7. Zhang T, Wang XF, Wang ZC, Lou D, Fang QQ, Hu YY, Zhao WY, Zhang LY, Wu LH, Tan WQ. Current potential therapeutic strategies targeting the TGF-β/Smad signaling pathway to attenuate keloid and hypertrophic scar formation. Biomed Pharmacotherapy = Biomedecine Pharmacotherapie. 2020;129:110287.

    Article  CAS  PubMed  Google Scholar 

  8. Xie J, Chen L, Cao Y, Wu D, Xiong W, Zhang K, Shi J, Wang M. Single-Cell Sequencing Analysis and Weighted Co-expression Network Analysis Based on Public Databases identified that TNC is a Novel Biomarker for Keloid. Front Immunol. 2021;12:783907.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  9. Wang ZC, Zhao WY, Cao Y, Liu YQ, Sun Q, Shi P, Cai JQ, Shen XZ, Tan WQ. The roles of inflammation in Keloid and hypertrophic scars. Front Immunol. 2020;11:603187.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  10. Shaker SA, Ayuob NN, Hajrah NH. Cell talk: a phenomenon observed in the keloid scar by immunohistochemical study. Appl Immunohistochem Mol Morphology: AIMM. 2011;19(2):153–9.

    Article  PubMed  Google Scholar 

  11. Pareek CS, Smoczynski R, Tretyn A. Sequencing technologies and genome sequencing. J Appl Genet. 2011;52(4):413–35.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  12. Uddin S, Khan A, Hossain ME, Moni MA. Comparing different supervised machine learning algorithms for disease prediction. BMC Med Inf Decis Mak. 2019;19(1):281.

    Article  Google Scholar 

  13. Choi RY, Coyner AS, Kalpathy-Cramer J, Chiang MF, Campbell JP. Introduction to machine learning, neural networks, and Deep Learning. Translational Vis Sci Technol. 2020;9(2):14.

    Google Scholar 

  14. González-Silva L, Quevedo L, Varela I. Tumor Functional Heterogeneity unraveled by scRNA-seq technologies. Trends cancer. 2020;6(1):13–9.

    Article  PubMed  Google Scholar 

  15. Littlefield A, Lenahan C. Cholelithiasis: presentation and management. J Midwifery Women’s Health. 2019;64(3):289–97.

    Article  Google Scholar 

  16. Zhu Z, Kong W, Wang H, Xiao Y, Shi Y, Gan L, Sun Y, Tang H, Xia Z. Clinical status of hospitalized keloid cases from 2013 to 2018. Burns: J Int Soc Burn Injuries. 2022;48(8):1874–84.

    Article  Google Scholar 

  17. Wang JC, Fort CL, Hom DB. Location propensity for keloids in the Head and Neck. Facial Plast Surg Aesthetic Med. 2021;23(1):59–64.

    Article  Google Scholar 

  18. Park TH, Chang CH. Location of keloids and its treatment modality may influence the Keloid recurrence in children. J Craniofac Surg. 2015;26(4):1355–7.

    Article  PubMed  Google Scholar 

  19. Ritchie ME, Phipson B, Wu D, Hu Y, Law CW, Shi W, Smyth GK. Limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 2015;43(7):e47.

    Article  PubMed  PubMed Central  Google Scholar 

  20. Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008;9:559.

    Article  PubMed  PubMed Central  Google Scholar 

  21. Luo Y, Coskun V, Liang A, Yu J, Cheng L, Ge W, Shi Z, Zhang K, Li C, Cui Y, et al. Single-cell transcriptome analyses reveal signals to activate dormant neural stem cells. Cell. 2015;161(5):1175–86.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  22. Song WM, Zhang B. Multiscale Embedded Gene Co-expression Network Analysis. PLoS Comput Biol. 2015;11(11):e1004574.

    Article  PubMed  PubMed Central  Google Scholar 

  23. Wu T, Hu E, Xu S, Chen M, Guo P, Dai Z, Feng T, Zhou L, Tang W, Zhan L, et al. clusterProfiler 4.0: a universal enrichment tool for interpreting omics data. Innov (Cambridge (Mass). 2021;2(3):100141.

    CAS  Google Scholar 

  24. Chen Y, Li Y, Narayan R, Subramanian A, Xie X. Gene expression inference with deep learning. Bioinf (Oxford England). 2016;32(12):1832–9.

    CAS  Google Scholar 

  25. Noble WS. What is a support vector machine? Nat Biotechnol. 2006;24(12):1565–7.

    Article  CAS  PubMed  Google Scholar 

  26. Paul A, Mukherjee DP, Das P, Gangopadhyay A, Chintha AR, Kundu S. Improved Random Forest for classification. IEEE Trans Image Processing: Publication IEEE Signal Process Soc. 2018;27(8):4012–24.

    Article  Google Scholar 

  27. Vasquez MM, Hu C, Roe DJ, Chen Z, Halonen M, Guerra S. Least absolute shrinkage and selection operator type methods for the identification of serum biomarkers of overweight and obesity: simulation and application. BMC Med Res Methodol. 2016;16(1):154.

    Article  PubMed  PubMed Central  Google Scholar 

  28. Shi Y, Ying X, Yang J. Deep unsupervised domain adaptation with Time Series Sensor Data: a Survey. Sensors 2022, 22(15).

  29. Xu J, Liang C, Li J. A signal recognition particle-related joint model of LASSO regression, SVM-RFE and artificial neural network for the diagnosis of systemic sclerosis-associated pulmonary hypertension. Front Genet. 2022;13:1078200.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  30. Zhang B, Wu Q, Li B, Wang D, Wang L, Zhou YL. M(6)a regulator-mediated methylation modification patterns and tumor microenvironment infiltration characterization in gastric cancer. Mol Cancer. 2020;19(1):53.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  31. Hänzelmann S, Castelo R, Guinney J. GSVA: gene set variation analysis for microarray and RNA-seq data. BMC Bioinformatics. 2013;14:7.

    Article  PubMed  PubMed Central  Google Scholar 

  32. Deng CC, Hu YF, Zhu DH, Cheng Q, Gu JJ, Feng QL, Zhang LX, Xu YP, Wang D, Rong Z, et al. Single-cell RNA-seq reveals fibroblast heterogeneity and increased mesenchymal fibroblasts in human fibrotic skin diseases. Nat Commun. 2021;12(1):3709.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  33. Satija R, Farrell JA, Gennert D, Schier AF, Regev A. Spatial reconstruction of single-cell gene expression data. Nat Biotechnol. 2015;33(5):495–502.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  34. Qiu X, Mao Q, Tang Y, Wang L, Chawla R, Pliner HA, Trapnell C. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods. 2017;14(10):979–82.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  35. Trapnell C, Cacchiarelli D, Grimsby J, Pokharel P, Li S, Morse M, Lennon NJ, Livak KJ, Mikkelsen TS, Rinn JL. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol. 2014;32(4):381–6.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  36. Haghverdi L, Büttner M, Wolf FA, Buettner F, Theis FJ. Diffusion pseudotime robustly reconstructs lineage branching. Nat Methods. 2016;13(10):845–8.

    Article  CAS  PubMed  Google Scholar 

  37. Paananen J, Fortino V. An omics perspective on drug target discovery platforms. Brief Bioinform. 2020;21(6):1937–53.

    Article  CAS  PubMed  Google Scholar 

  38. Trajanoska K, Bhérer C, Taliun D, Zhou S, Richards JB, Mooser V. From target discovery to clinical drug development with human genetics. Nature. 2023;620(7975):737–45.

    Article  CAS  PubMed  Google Scholar 

  39. Leek JT, Johnson WE, Parker HS, Jaffe AE, Storey JD. The sva package for removing batch effects and other unwanted variation in high-throughput experiments. Bioinf (Oxford England). 2012;28(6):882–3.

    CAS  Google Scholar 

  40. Plikus MV, Wang X, Sinha S, Forte E, Thompson SM, Herzog EL, Driskell RR, Rosenthal N, Biernaskie J, Horsley V. Fibroblasts: origins, definitions, and functions in health and disease. Cell. 2021;184(15):3852–72.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  41. Naik PP. Novel targets and therapies for keloid. Clin Exp Dermatol. 2022;47(3):507–15.

    Article  CAS  PubMed  Google Scholar 

  42. Andrews JP, Marttala J, Macarak E, Rosenbloom J, Uitto J. Keloids: the paradigm of skin fibrosis - pathomechanisms and treatment. Matrix Biology: J Int Soc Matrix Biology. 2016;51:37–46.

    Article  CAS  Google Scholar 

  43. Hao YH, Xing XJ, Zhao ZG, Xie F, Hao T, Yang Y, Li CX. A multimodal therapeutic approach improves the clinical outcome of auricular keloid patients. Int J Dermatol. 2019;58(6):745–9.

    Article  CAS  PubMed  Google Scholar 

  44. Zhang X, Wu X, Li D. The communication from Immune cells to the fibroblasts in Keloids: implications for Immunotherapy. Int J Mol Sci 2023, 24(20).

  45. Hong YK, Chang YH, Lin YC, Chen B, Guevara BEK, Hsu CK. Inflammation in Wound Healing and pathological scarring. Adv Wound care. 2023;12(5):288–300.

    Article  Google Scholar 

  46. Chen Y, Jin Q, Fu X, Qiao J, Niu F. Connection between T regulatory cell enrichment and collagen deposition in keloid. Exp Cell Res. 2019;383(2):111549.

    Article  CAS  PubMed  Google Scholar 

  47. Kurimoto-Nishiguchi M, Muraoka K, Inaba Y, Kunimoto K, Yamamoto Y, Kumegawa S, Ueno K, Asamura S, Nakatani Y, Sawamura S, et al. Glycoprotein M6A upregulation detected by transcriptome analysis controls the proliferation of keloidal fibroblasts. J Dermatol. 2023;50(9):1170–9.

    Article  CAS  PubMed  Google Scholar 

  48. Wang XM, Liu XM, Wang Y, Chen ZY. Activating transcription factor 3 (ATF3) regulates cell growth, apoptosis, invasion and collagen synthesis in keloid fibroblast through transforming growth factor beta (TGF-beta)/SMAD signaling pathway. Bioengineered. 2021;12(1):117–26.

    Article  PubMed  Google Scholar 

  49. Huang C, Ogawa R. Role of inflammasomes in Keloids and hypertrophic scars-lessons learned from Chronic Diabetic wounds and skin fibrosis. Int J Mol Sci 2022, 23(12).

  50. Liu L, Yu H, Long Y, You Z, Ogawa R, Du Y, Huang C. Asporin inhibits collagen matrix-mediated intercellular mechanocommunications between fibroblasts during keloid progression. FASEB Journal: Official Publication Federation Am Soc Experimental Biology. 2021;35(7):e21705.

    Article  CAS  Google Scholar 

  51. Xue M, Jackson CJ. Extracellular matrix reorganization during Wound Healing and its impact on abnormal scarring. Adv Wound care. 2015;4(3):119–36.

    Article  Google Scholar 

  52. Xia W, Kong W, Wang Z, Phan TT, Lim IJ, Longaker MT, Yang GP. Increased CCN2 transcription in keloid fibroblasts requires cooperativity between AP-1 and SMAD binding sites. Ann Surg. 2007;246(5):886–95.

    Article  PubMed  Google Scholar 

  53. Furie N, Shteynberg D, Elkhatib R, Perry L, Ullmann Y, Feferman Y, Preis M, Flugelman MY, Tzchori I. Fibulin-5 regulates keloid-derived fibroblast-like cells through integrin beta-1. Int J Cosmet Sci. 2016;38(1):35–40.

    Article  CAS  PubMed  Google Scholar 

  54. Rees PA, Greaves NS, Baguneid M, Bayat A. Chemokines in Wound Healing and as potential therapeutic targets for reducing cutaneous scarring. Adv Wound care. 2015;4(11):687–703.

    Article  Google Scholar 

  55. Daian T, Ohtsuru A, Rogounovitch T, Ishihara H, Hirano A, Akiyama-Uchida Y, Saenko V, Fujii T, Yamashita S. Insulin-like growth factor-I enhances transforming growth factor-beta-induced extracellular matrix protein production through the P38/activating transcription factor-2 signaling pathway in keloid fibroblasts. J Invest Dermatol. 2003;120(6):956–62.

    Article  CAS  PubMed  Google Scholar 

  56. Qian L, Wang Q, Wei C, Wang L, Yang Y, Deng X, Liu J, Qi F. Protein tyrosine phosphatase 1B regulates fibroblasts proliferation, motility and extracellular matrix synthesis via the MAPK/ERK signalling pathway in keloid. Exp Dermatol. 2022;31(2):202–13.

    Article  CAS  PubMed  Google Scholar 

  57. Chao H, Zheng L, Hsu P, He J, Wu R, Xu S, Zeng R, Zhou Y, Ma H, Liu H et al. IL-13RA2 downregulation in fibroblasts promotes keloid fibrosis via JAK/STAT6 activation. JCI Insight 2023, 8(6).

  58. Hu H, Mao G, Zheng J, Guo F. Keloid patient plasma-derived exosomal hsa_circ_0020792 promotes normal skin fibroblasts proliferation, Migration, and Fibrogenesis via modulating miR-193a-5p and activating TGF-β1/Smad2/3 signaling. Drug Des Devel Ther. 2022;16:4223–34.

    Article  PubMed  PubMed Central  Google Scholar 

  59. Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Müller M. pROC: an open-source package for R and S + to analyze and compare ROC curves. BMC Bioinformatics. 2011;12:77.

    Article  PubMed  PubMed Central  Google Scholar 

  60. Nicola NA, Babon JJ. Leukemia inhibitory factor (LIF). Cytokine Growth Factor Rev. 2015;26(5):533–44.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  61. Xu S, Yang X, Chen Q, Liu Z, Chen Y, Yao X, Xiao A, Tian J, Xie L, Zhou M, et al. Leukemia inhibitory factor is a therapeutic target for renal interstitial fibrosis. EBioMedicine. 2022;86:104312.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  62. Hay A, Wood S, Olson D, Slater DM. Labour is associated with decreased expression of the PGF2alpha receptor (PTGFR) and a novel PTGFR splice variant in human myometrium but not decidua. Mol Hum Reprod. 2010;16(10):752–60.

    Article  CAS  PubMed  Google Scholar 

  63. Ho JC, Cheung ST, Poon WS, Lee YT, Ng IO, Fan ST. Down-regulation of retinol binding protein 5 is associated with aggressive tumor features in hepatocellular carcinoma. J Cancer Res Clin Oncol. 2007;133(12):929–36.

    Article  CAS  PubMed  Google Scholar 

  64. Shan M, Xiao M, Xu J, Sun W, Wang Z, Du W, Liu X, Nie M, Wang X, Liang Z, et al. Multi-omics analyses reveal bacteria and catalase associated with keloid disease. EBioMedicine. 2023;99:104904.

    Article  PubMed  PubMed Central  Google Scholar 

  65. Yang Y, Adelstein SJ, Kassis AI. Target discovery from data mining approaches. Drug Discovery Today. 2012;17(Suppl):S16–23.

    Article  CAS  PubMed  Google Scholar 

  66. Sleno L, Emili A. Proteomic methods for drug target discovery. Curr Opin Chem Biol. 2008;12(1):46–54.

    Article  CAS  PubMed  Google Scholar 

  67. Sams-Dodd F. Target-based drug discovery: is something wrong? Drug Discovery Today. 2005;10(2):139–47.

    Article  CAS  PubMed  Google Scholar 

  68. Mathema VB, Koh YS, Thakuri BC, Sillanpää M. Parthenolide, a sesquiterpene lactone, expresses multiple anti-cancer and anti-inflammatory activities. Inflammation. 2012;35(2):560–5.

    Article  CAS  PubMed  Google Scholar 

  69. Tanaka K, Hasegawa J, Asamitsu K, Okamoto T. Prevention of the ultraviolet B-mediated skin photoaging by a nuclear factor kappaB inhibitor, parthenolide. J Pharmacol Exp Ther. 2005;315(2):624–30.

    Article  CAS  PubMed  Google Scholar 

  70. Imran M, Rauf A, Abu-Izneid T, Nadeem M, Shariati MA, Khan IA, Imran A, Orhan IE, Rizwan M, Atif M, et al. Luteolin, a flavonoid, as an anticancer agent: a review. Biomed Pharmacotherapy = Biomedecine Pharmacotherapie. 2019;112:108612.

    Article  CAS  PubMed  Google Scholar 

  71. Gendrisch F, Esser PR, Schempp CM, Wölfle U. Luteolin as a modulator of skin aging and inflammation. Biofactors. 2021;47(2):170–80.

    Article  CAS  PubMed  Google Scholar 

  72. Pouso MR, Cairrao E. Effect of retinoic acid on the neurovascular unit: a review. Brain Res Bull. 2022;184:34–45.

    Article  CAS  PubMed  Google Scholar 

  73. Szymański Ł, Skopek R, Palusińska M, Schenk T, Stengel S, Lewicki S, Kraj L, Kamiński P, Zelent A. Retinoic acid and its derivatives in skin. Cells 2020, 9(12).

  74. Correa-Gallegos D, Ye H, Dasgupta B, Sardogan A, Kadri S, Kandi R, Dai R, Lin Y, Kopplin R, Shenai DS, et al. CD201(+) fascia progenitors choreograph injury repair. Nature. 2023;623(7988):792–802.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

Download references

Acknowledgements

The authors thank all individuals who participated in this work. They sincerely acknowledge the Gene Expression Omnibus database (GEO) for providing transcriptomic and single-cell sequencing data. Guangzhou Medical Essential Subject, Guangzhou Municipal Health Commission, 2021-03-2024-03. Guangzhou Science and Technology Plan Project 202102010058, 202201020003, 202102020481, and 202102010074, Guangzhou Research Hospital Construction Project 2023A03J0524 support this work.

Funding

This work is supported by Guangzhou Science and Technology Plan Project 202102010058, 202201020003, 202102020481, and 202102010074.

Author information

Authors and Affiliations

Authors

Contributions

K.X. and S.W. wrote the main manuscript text, X.L., B.C., and Z.Z. organized the project and K.X., S.W., W.C., Y.H., P.L.,Z.C.,and J.L. prepared Figs. 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10. All authors reviewed the manuscript.

Corresponding authors

Correspondence to Bin Chen, Zhi Zhang or Xiaojian Li.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary Material 1

Supplementary Material 2

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Reprints and permissions

About this article

Check for updates. Verify currency and authenticity via CrossMark

Cite this article

Xiao, K., Wang, S., Chen, W. et al. Identification of novel immune-related signatures for keloid diagnosis and treatment: insights from integrated bulk RNA-seq and scRNA-seq analysis. Hum Genomics 18, 80 (2024). https://doi.org/10.1186/s40246-024-00647-z

Download citation

  • Received:

  • Accepted:

  • Published:

  • DOI: https://doi.org/10.1186/s40246-024-00647-z

Keywords