Background
Systemic lupus erythematosus (SLE), one of the most complicated autoimmune diseases in the world, is caused by various endogenous antigens [1]. Lupus nephritis (LN), a common and serious complication of SLE, is characterized by hematuria, proteinuria, and impaired glomerular filtration rate [2]. The lack of understanding regarding the molecular mechanisms of LN hinders the development of specific targeted therapy for this progressive disease [3]. Tracking the biological changes in LN at the genomic level is a worthwhile strategy [4]. In recent years, gene sequencing technology combined with bioinformatic analysis has been used to identify disease-relevant genes that might serve as prognostic biomarkers and be developed as therapeutic targets in the future [5]. Bioinformatic analysis can process large numbers of samples within a short time and provide valuable information about diseases, and several genes closely associated with SLE have been identified in recent years and have driven research innovations [6–8]. However, few studies have used bioinformatic analysis to characterize kidney tissue in the context of LN.
Many previous studies have found that immune cell infiltration is associated with treatment and clinical outcome in different types of cancer [9, 10]. Immune cells, comprising innate and adaptive populations such as dendritic cells, macrophages, neutrophils, T cells, and B cells, are associated with both activating and suppressive immune functions [11]. Given the functionally distinct cell types that make up the immune response, it is important to assess immune infiltration and to determine whether differences in its composition can inform the development of novel immunotherapeutic drugs that target these cells. The CIBERSORT algorithm is an analytical tool that uses gene expression data, such as RNA-seq data, to estimate the proportions of various types of immune cells in a sample. CIBERSORT covers 22 cell types, encompassing monocytes, natural killer cells, B cells, T cells, eosinophils, macrophages, neutrophils, plasma cells, dendritic cells, and mast cells [12]. It has been widely used to determine the immune cell landscapes of many malignant tumors, such as breast cancer, hepatocellular carcinoma, and colorectal cancer [13–15]. In SLE pathogenesis, various immune cells have been widely evaluated and shown to be harmful [16]. Immune cell infiltration is also a hallmark of LN. Immune cells, such as monocytes, B cells, and T cells, are recruited to kidney tissue, where they produce cytokines and chemokines that cause tissue damage [17]. However, the landscape of immune infiltration in LN has not been fully characterized.
Although LN can affect all components of the kidney, the glomerulus is the most suitable tissue to examine and is closely related to the pathogenesis and treatment of the disease [18]. In the present study, microarray data were downloaded from the Gene Expression Omnibus (GEO) database. Using CIBERSORT, we first investigated the differences in immune infiltration between LN kidney tissue and normal tissue across 22 subpopulations of immune cells. Gene set enrichment analysis (GSEA) was then used for functional enrichment analysis and to determine the most significant functional terms. A list of genes closely related to immune infiltration was screened out and validated against another GEO dataset that includes clinical information. This study aimed to describe the characteristics of glomerular immune infiltration in LN for the first time and to identify key genes related to immune infiltration that affect clinical manifestations, so as to provide data resources for future research.
Discussion
With the development of bioinformatics, increasing attention has been focused on finding hub genes in various diseases, and the collected information on these genes can provide new means for exploring diseases. Multiple susceptibility genes may determine disease occurrence.
In this study, we uncovered distinct patterns of immune cell infiltration in LN and their association with clinical features. Monocytes were the most prominent differentially infiltrating cell type. They are important components of the innate immune system; they have antigen-presenting capacity and produce several inflammatory cytokines in SLE [19]. Monocytes accounted for approximately 4% of blood leukocytes in healthy mice and over 50% in lupus-prone mice [20]. Our results also showed that monocytes constituted 30–50% of immune cells in human LN glomeruli. Activated NK cells were also increased in glomeruli, whereas other studies have reported lower proportions of NK cells in the blood of patients with SLE, especially those with LN [21, 22]. In rheumatoid arthritis, however, NK cells in tissue were reported to differ in function from circulating NK cells, which suggests that tissue NK cells may have different effects from blood NK cells in autoimmune disease [23]. Clinical and experimental evidence indicates that aberrant memory B cells and Tfh cells play an important role in the pathogenesis of human SLE [24–26]. Resting M0 macrophages can polarize into M1 and M2 macrophages in the presence of the appropriate cytokines [27]. However, no research has explained the function of increased M0 macrophages in LN. The specific roles of these immune cells in functional immune responses remain to be elucidated.
“Activation of immune response” was the top associated term in the GSEA-based GO analysis. Activation of the innate and adaptive immune systems, which triggers immune complex deposition, complement activation, and self-antigen production, has a toxic effect on renal glomerular and tubular cells, thereby promoting the development of nephritis in patients with SLE [28, 29]. In the KEGG pathway analysis, several viral infection pathways were associated with LN. The immune reaction in LN and the response to viruses may therefore share several common features.
By combining the CIBERSORT results with the “activation of immune response” GO term, we found many novel commonly expressed genes, some of which are important in autoimmune diseases. For example, FCN1 was shown to be associated with monocytes in patients with microscopic polyangiitis [30]. Another study using weighted correlation network analysis showed that RSAD2, which is related to CD4+ T cells, may be the most highly ranked hub gene in SLE [7]. BTK mediates TLR signaling in macrophages, and its inhibition may be a promising treatment approach for LN [31–33]. These genes were observed to be strongly or mildly associated with immune cells in kidney tissue.
Through a review of the literature on lupus and related genes [34–48], 15 core genes related to clinical manifestations were found to have been implicated in autoimmune disease (Table 1). FCER1G, CLEC7A, MARCO, PSMB9, and PSMB8 showed an apparent correlation with clinical manifestations. FCER1G, which is associated with multiple leukocyte receptor complexes and mediates signal transduction, plays a negative regulatory role in B cell responses [36]. CLEC7A, also known as dectin-1, is a type II membrane receptor expressed on the membrane of some leukocytes and likely contributes to the synthesis of pro-inflammatory cytokines in autoimmune conditions [37]. MARCO, a member of the scavenger receptor family, plays important roles in the clearance of apoptotic cells. The presence of anti-MARCO antibodies in patients with SLE might contribute to the breakdown of self-tolerance and the pathogenesis of SLE [46]. PSMB8 is involved in antigen processing and presentation in naïve CD4+ T cells, and PSMB9 is induced by interferon stimulation in SLE [41, 48]. All these core genes require additional study to elucidate their complex interactions with clinical features.
Table 1
Previous studies of core genes in autoimmune disease
Gene | Tissue | Function | Reference | |
GBP1 | Blood | Promotes antimicrobial immunity and cell death. Key mediator of the angiostatic effects of inflammation; induced by interferon (IFN)-α and IFN-γ. | | |
CD36 | Blood | Expressed on the cell surface of monocytes/macrophages and involved in the recognition and uptake of pro-atherogenic oxidized low-density lipoprotein (LDL). | | |
FCER1G | Spleen | Associated with multiple leukocyte receptor complexes and mediates signal transduction. | | |
CLEC7A | Blood | Involved in the clearance of apoptotic cells and in the uptake and presentation of cellular antigens; triggers different cytokines and chemokines. | Salazar-Aldrete et al. [37] | |
ITGB2 | Bone marrow | Encodes integrin β2 protein (CD18). Plays important roles in leukocyte adhesion, immune and inflammatory reactions, transendothelial migration, and chemotaxis. | | |
LILRB4 | Blood | Associated with increased inflammatory cytokine levels in SLE and expressed by many leukocytes. | | |
HLA-DRA | Blood | SLE susceptibility gene; plays a central role in the immune system by presenting peptides derived from extracellular proteins. | | |
PSMB9 | Skin | Upregulated in the pathophysiology of cutaneous lesions of dermatomyositis and SLE. | | |
BTK | Blood | Plays an important role in both B cell and FcγR-mediated myeloid cell activation. BTK inhibition may be a promising treatment approach for lupus nephritis. | | |
PYCARD | Blood | Forms inflammasome complexes that mediate inflammatory and apoptotic signaling pathways. | | |
CFP | Blood | The only positive regulator of the complement system. Recognizes apoptotic and necrotic cells. | | |
CFD | Blood | Encodes a protein that functions as an adipokine and is involved in the regulation of the immune system and inflammatory responses. | | |
MARCO | Blood | Binds to apoptotic cells and contributes to their clearance. | | |
CD3D | Blood | Single nucleotide polymorphism in the immune compartment and B cells, also involved in T cell signaling. | | |
PSMB8 | Blood | Involved in antigen processing and presentation in naïve CD4+ T cells and hypomethylated in SLE. | | |
To our knowledge, the current work is the first to use CIBERSORT to analyze immune cell infiltration of glomerular tissue in LN. All data were derived from the public GEO database. The core genes obtained by correlating the CIBERSORT and GSEA results were validated against clinical data, which provides new information for our future research. However, our study has some limitations. Only a few LN datasets were available in the GEO database; therefore, the number of samples included in this study was relatively small. Despite the small sample size, we still found some significant differences among groups. In addition, clinical tests need to be conducted to support our results.
Methods
Microarray data processing
The data in our study came from the public domain. The normalized expression matrix and sample information were downloaded from the GEO database (www.ncbi.nlm.nih.gov/geo). We used “lupus nephritis” as the search keyword. The data selection criteria were as follows: (1) the study type was expression profiling by array; (2) the organism was Homo sapiens; (3) the samples of each dataset included glomerular tissue. In accordance with these criteria, the GSE32591 microarray dataset, based on the Affymetrix Human GeneChip U133A (affy) platform, was selected and used for CIBERSORT. The GSE113342 dataset, based on the nCounter NanoString Human Immunology v2 panel, was later used to examine the association between selected genes and clinical features. Only approximately 500 immune-related genes were measured in this dataset.
Evaluation of immune cell infiltration
The GSE32591 gene expression data were processed to remove null values. Missing values were imputed with the k-nearest neighbor (KNN) method in the “impute” package [49]; the data were then formatted as required by CIBERSORT and uploaded to the CIBERSORT web portal (http://cibersort.stanford.edu/). We used the original CIBERSORT gene signature file LM22, which defines 22 immune cell subtypes, to analyze the datasets from human glomerular tissues and normal tissues. Only samples with a CIBERSORT p-value < 0.05 were included.
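The imputation and filtering steps can be illustrated with a minimal sketch. This is not the “impute” package or the CIBERSORT portal: it is a simplified, pure-Python KNN fill that averages the nearest complete rows, followed by a filter on a per-sample p-value. The function names, the sample records, and the choice of k are assumptions made for illustration only.

```python
import math


def knn_impute(matrix, k=2):
    """Fill None entries in a genes x samples matrix (simplified KNN).

    Assumes at least one complete row exists and that every incomplete
    row has at least one observed value.
    """
    complete = [row for row in matrix if None not in row]
    imputed = []
    for row in matrix:
        if None not in row:
            imputed.append(list(row))
            continue
        observed = [j for j, v in enumerate(row) if v is not None]

        def distance(other):
            # Root mean squared difference over the observed columns only.
            total = sum((row[j] - other[j]) ** 2 for j in observed)
            return math.sqrt(total / len(observed))

        neighbours = sorted(complete, key=distance)[:k]
        # Replace each missing entry with the mean of the neighbours' values.
        imputed.append([
            v if v is not None else sum(n[j] for n in neighbours) / len(neighbours)
            for j, v in enumerate(row)
        ])
    return imputed


def keep_significant(samples, alpha=0.05):
    """Keep deconvolution results whose per-sample p-value is below alpha."""
    return [s for s in samples if s["p_value"] < alpha]


# Made-up expression values: four genes (rows) by three samples (columns).
expression = [
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [9.0, 8.0, 7.0],
    [1.0, None, 3.0],  # one missing value
]
filled = knn_impute(expression, k=2)

# Hypothetical per-sample deconvolution p-values.
results = [
    {"sample": "sample_1", "p_value": 0.01},
    {"sample": "sample_2", "p_value": 0.20},
]
kept = keep_significant(results)
```

In this toy input, the two nearest complete rows to the incomplete gene are the first two, so the missing entry is filled with the mean of their values in that column.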
Differential analysis of immune cell infiltration types
To identify immune cell types with significantly different infiltration, we performed a differential analysis between the disease group and the control group. The limma package and its Bayesian method were used to construct a linear model [50]. A P-value < 0.05 was the cut-off standard. To further understand the relationships among these different types of immune cell infiltration, the Pearson correlation coefficient was calculated between the differentially infiltrating immune cell types.
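The correlation step can be sketched as follows. The sketch covers only the pairwise Pearson coefficient between cell-type fractions, not the limma linear model. The fractions are made-up numbers, and the dictionary layout is an assumption for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Made-up fractions of three cell types across four samples.
fractions = {
    "Monocytes": [0.30, 0.35, 0.40, 0.50],
    "NK cells activated": [0.05, 0.07, 0.06, 0.09],
    "Macrophages M2": [0.20, 0.18, 0.15, 0.10],
}

# One coefficient per unordered pair of cell types.
names = list(fractions)
correlations = {
    (a, b): pearson(fractions[a], fractions[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}
```

Each key of `correlations` is a pair of cell types; a value close to -1, as for monocytes and M2 macrophages in this toy input, indicates that one fraction falls as the other rises.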
GSEA preparation
GSEA is an analytical method for genome-wide expression profile microarray data. It identifies functional enrichment by comparing a ranked gene list with predefined gene sets. A gene set is a group of genes that share localization, pathways, functions, or other features. GSEA was conducted using the clusterProfiler package (version 3.5) [51]. The fold change in gene expression between the LN group and the control group was calculated, and the gene list was generated according to |log2FC|. We then performed GSEA-based Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) enrichment analyses.
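The idea behind GSEA can be illustrated with a minimal running-sum sketch. It is not the clusterProfiler implementation and it omits the permutation-based significance test. It assumes that expression values are already on a log2 scale, so that log2FC is a difference of group means, and that genes are ranked by signed log2FC. The gene names and values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def rank_genes(case, control):
    """Return (gene, log2FC) pairs sorted from most up- to most down-regulated.

    Assumes log2-scale expression, so log2FC is a difference of means.
    """
    scores = {gene: mean(case[gene]) - mean(control[gene]) for gene in case}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def enrichment_score(ranked, gene_set, weight=1.0):
    """Weighted running-sum enrichment score of gene_set in a ranked list.

    Assumes the list contains at least one gene inside the set and at
    least one gene outside it.
    """
    in_set = [gene in gene_set for gene, _ in ranked]
    hit_total = sum(
        abs(score) ** weight for (_, score), hit in zip(ranked, in_set) if hit
    )
    miss_step = 1.0 / in_set.count(False)
    running = 0.0
    extreme = 0.0
    for (_, score), hit in zip(ranked, in_set):
        # Step up at set members (weighted by |score|), step down otherwise.
        running += (abs(score) ** weight / hit_total) if hit else (-miss_step)
        if abs(running) > abs(extreme):
            extreme = running
    return extreme


# Hypothetical log2 expression values for four genes, two samples per group.
case = {"gene_a": [8.0, 9.0], "gene_b": [6.0, 6.0],
        "gene_c": [5.0, 5.0], "gene_d": [3.0, 3.0]}
control = {"gene_a": [5.0, 6.0], "gene_b": [4.0, 4.0],
           "gene_c": [6.0, 6.0], "gene_d": [5.0, 5.0]}

ranked = rank_genes(case, control)
es_top = enrichment_score(ranked, {"gene_a", "gene_b"})
es_mixed = enrichment_score(ranked, {"gene_a", "gene_d"})
```

A gene set concentrated at the top of the ranked list, as in `es_top`, reaches a larger score than a set split between the two ends of the list, as in `es_mixed`.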
GSEA-based enriched GO analysis
GO analysis includes three categories: molecular function, biological process, and cellular component. In the present study, we selected only biological process for GO analysis. The analysis was performed with the gseGO function in the clusterProfiler package. An adjusted p-value < 0.05 was set as the cut-off criterion. The connections between the most significant GO terms and their participating genes were visualized as a network diagram with the GOenrich package.
GSEA-based KEGG pathway analysis
KEGG pathway enrichment analysis was also conducted with the gseKEGG function in the clusterProfiler package. An adjusted p-value < 0.05 was set as the cut-off criterion.
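The adjustment method is not named in the text. The sketch below assumes the Benjamini-Hochberg procedure, which is the default `pAdjustMethod` ("BH") of the clusterProfiler GSEA functions; whether that default was used here is an assumption. The p-values are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, so each adjusted value is the
    # minimum of p * m / rank over its own rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up raw p-values for four terms.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [i for i, p in enumerate(adjusted) if p < 0.05]
```

With these inputs, three raw p-values are below 0.05, but only the first term stays below the cut-off after adjustment.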
Core gene list and correlation analysis
The core gene list obtained in the most significant GO term was analyzed by Spearman correlation with the differentially expressed immune cells from CIBERSORT results. Five groups of correlation analysis data were obtained. P-value < 0.05 was used as the cut-off standard, and genes with the top 10 highest absolute values of correlation coefficients were visualized in each group.
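The ranking step can be sketched as follows: a Spearman coefficient is computed as the Pearson coefficient of average ranks, and genes are then sorted by absolute coefficient. The P-value filter described above is omitted from the sketch. The gene names, expression values, and cell fractions are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # mean of ranks pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def pearson(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def top_correlated(cell_fraction, gene_expression, k=10):
    """Genes with the k largest absolute Spearman coefficients."""
    scored = [
        (gene, spearman(values, cell_fraction))
        for gene, values in gene_expression.items()
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)[:k]


# Made-up monocyte fractions and gene expression across five samples.
monocytes = [0.30, 0.35, 0.40, 0.50, 0.45]
genes = {
    "gene_a": [5, 6, 7, 9, 8],
    "gene_b": [9, 7, 8, 5, 6],
    "gene_c": [1, 3, 2, 2, 3],
}
top = top_correlated(monocytes, genes, k=2)
```

Sorting by absolute value keeps strongly negative associations, such as `gene_b` in this toy input, alongside strongly positive ones.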
Validation of core genes and association with clinical manifestations
In dataset GSE113342, which includes clinical information, patient part B was excluded because it comprised post-treatment data. Only the first renal biopsy data (patient part A), which contained approximately 500 immune gene expression values, were analyzed. The intersection of these genes with the genes obtained in the most significant GO term associated with immune response was calculated first. Spearman correlation analysis was then applied between the intersecting genes and clinical information, such as age, grade, and 12-month treatment response.