Introduction
As the coronavirus disease 2019 (COVID-19) pandemic enters its third year, the disease’s impact on the cardiovascular system is becoming increasingly evident [
1]. COVID-19 is caused by SARS-CoV-2 infection, and angiotensin-converting enzyme 2 (ACE2) is the key receptor for this virus to enter cells. In addition to mediating the interaction between host cells and the SARS-CoV-2 spike (S) glycoprotein, ACE2 has a homeostatic function in regulating the renin-angiotensin-aldosterone system (RAAS), which is pivotal for both the cardiovascular and immune systems [
2,
3]. ACE2 expression has been reported to correlate with increased viral load in human cell lines and mice [
4]. Therefore, ACE2 may be a key link between SARS-CoV-2 infection and cardiovascular disease (CVD) [
5,
6,
7,
8].
Nasal and pulmonary epithelial cells are thought to be the first cells infected by SARS-CoV-2. After initial viral replication and circulation, however, the virus may also reach the heart, where many cardiomyocytes express components necessary for SARS-CoV-2 uptake and replication. Although expression of the transmembrane serine protease TMPRSS2 is low in the heart, ACE2 and other auxiliary proteases (e.g., ADAM17, FURIN, NRP1, CTSL) known to be involved in viral activation and membrane fusion, as well as integrin coreceptors, are highly expressed in cardiac tissue [
9,
10,
11]. ADAM17 (a disintegrin and metalloprotease 17) was found to mediate the proteolysis and ectodomain shedding of ACE2; its activity was upregulated after SARS-CoV bound to ACE2 and promoted viral entry, whereas knockdown of ADAM17 by siRNA severely attenuated viral entry into cells [
12,
13]. It is also noteworthy that in cells lacking TMPRSS2 expression, cathepsin L (CTSL) similarly promotes infection of host cells by the human coronaviruses SARS-CoV and SARS-CoV-2 via a slow acid-activated pathway [
14,
15,
16,
17]. However, the internal mechanism of SARS-CoV-2 invasion of myocardial tissue and how clinical features such as cardiovascular complications affect SARS-CoV-2 infection are not yet fully understood.
Individuals differ widely in the clinical consequences of infection, from asymptomatic illness to death. The severity and mortality of COVID-19 are closely related to CVD. A prospective, multicenter cohort study reported that 12.3% of 73,197 COVID-19 inpatients had cardiovascular comorbidities [
18]. However, the prognosis of CVD in patients with COVID-19 seems to be controversial. Some studies have found that myocardial injury is significantly correlated with the fatal outcome of COVID-19, and patients with underlying CVD but no myocardial injury have a better prognosis [
19]. Therefore, it is necessary to further explore the mechanism of SARS-CoV-2 infection of myocardial tissue to reveal the impact of cardiovascular diseases on COVID-19. Factors such as age and sex are also considered to be associated with the severity and mortality of COVID-19 [
18]. We can better understand COVID-19 tropism and illness outcome heterogeneity by identifying the specific cell types that can be infected by SARS-CoV-2 and correlating proteins critical to SARS-CoV-2 infection with important variables such as age and sex [
20].
Limited by the difficulty of obtaining human heart samples and the natural differences between other biological models and the human body, research on the molecular mechanisms of human heart disease has been hindered [
21]. Utilizing single-cell RNA-sequencing (scRNA-seq) datasets from human cardiac samples may be one avenue to overcome these problems. In this study, we performed the first single-cell meta-analysis of the human heart and mapped the cell-type-specific expression patterns of ACE2 and auxiliary proteases through comprehensive analysis of the scRNA-seq datasets of 10 CVD patients.
Discussion
This study reveals, for the first time at single-cell resolution, the factors that affect the entry of SARS-CoV-2 into cardiac tissue cells in different CVD states. Single-cell technologies offer powerful new tools to dissect cell types that reside within healthy and diseased tissues [
23,
24]. In recent years, this approach has been leveraged to provide a deeper understanding of the cellular composition of the healthy human heart, but deciphering how cardiovascular disease affects the cardiac cellular transcriptional landscape has been hampered by limited sample data [
25,
26]. By analyzing approximately 90,000 single cells from different cardiac regions of 10 donors with DCM, ICM, hypertension, and healthy states, we identified 10 major cardiomyocyte types and revealed their cell type-specific transcriptional programs. Because of the low expression of TMPRSS2 in myocardial tissue, we used coexpression of ACE2 with the auxiliary proteases ADAM17 and CTSL to locate the specific cell types, pericytes and fibroblasts, that SARS-CoV-2 might invade. Within specific cell subpopulations (PC0, FC1 and FC2), we correlated key factors of SARS-CoV-2 invasion of host cells (ACE2, ADAM17 and CTSL) with key covariates such as age, sex and cardiovascular comorbidity and found correlations between them. In addition, the expression of other potential auxiliary proteases may help in the search for therapeutic possibilities related to the disruption of viral processing by protease inhibition. This study identified AGT, CALM3, PCSK5, NRP1 and LMAN as proteases coexpressed with ACE2. The enrichment analysis revealed that relevant immune pathways involved in viral infection might include the extracellular matrix interaction pathway, focal adhesion pathway, vascular smooth muscle contraction, inflammatory response and oxidative stress. Correlation analysis revealed specific high expression of IFITM3 and AGT in pericytes. Finally, cell‒cell communication analysis revealed differences in the IFN-II signaling pathway and PAR signaling pathway in fibroblasts across different cardiovascular comorbidities.
Pericytes are mural cells that cover and adhere to the basement membrane of the cardiac microcirculation (including terminal microarteries, precapillary microvenules and capillaries) and have the function of regulating blood flow and vascular permeability [
27]. Pericytes may be one of the first cells contacted by virus particles entering myocardial tissue through the blood circulation. Given their high expression of ACE2, ADAM17, CTSL and related auxiliary proteases, SARS-CoV-2 may infect and replicate in pericytes. In an autopsy report of a patient with COVID-19, SARS-CoV-2 was found in the interstitial cells of myocardial tissue rather than in cardiomyocytes in sections of cardiac tissue [
28]. Despite the antiviral function of the pericyte-specific, highly expressed IFITM3 protein, the human IFITM3 protein exhibits a pro-viral effect, i.e., enhanced viral fusion at the plasma membrane [
29,
30]. Mutation of residues in the YxxΦ endocytosis motif of IFITM3 converts human IFITM3 into an enhancer of SARS-CoV-2 infection, and cell-to-cell fusion assays confirm that these endocytic mutants enhance spike-mediated fusion with the plasma membrane [
30]. The genetic variant rs12252-C of IFITM3 is associated with more severe COVID-19, possibly by causing defects in the control of viral replication in cells [
31]. Angiotensinogen, expressed by AGT, is an important component of the RAAS and is a potent regulator of blood pressure, fluid balance, and electrolyte homeostasis. The specific high expression of AGT and the interaction between pericytes and endothelial cells may be one of the reasons why SARS-CoV-2 infection of pericytes leads to vasoconstriction and decreased myocardial blood flow [
27]. In addition, it has been shown that the S glycoprotein of SARS-CoV-2 induces cardiac pericytes to produce proinflammatory cytokines, including MCP1, IL-6, IL-1β, and TNF-α, which are important components of the cytokine storm associated with respiratory failure and high mortality in patients with COVID-19 [
32].
This study used a mixed-effects model to attempt to correlate key factors (ACE2, ADAM17, and CTSL) for SARS-CoV-2 invasion of host cells with key covariates such as age, sex, and cardiovascular comorbidities. Sex is an important factor affecting SARS-CoV-2-infected cells. In the present study, the expression of ACE2 and auxiliary proteases clearly trended higher in male patients. Epidemiological studies have found that men are more likely to be infected than women, and they account for the majority of severe illnesses and deaths [
33]. Surprisingly, the expression of ACE2 and auxiliary proteases in healthy myocardial tissue in this study showed a more pronounced trend for various cardiovascular comorbidities. Previous studies have focused more on myocardial damage by COVID-19, ignoring the effect of cardiovascular comorbidities themselves on the virus. Mortality from COVID-19 among adults with congenital heart disease (CHD) is commensurate with that in the general population [
34]. A possible explanation for this phenomenon is that preconditioning of the underlying CVD paradoxically protects the myocardium from damage caused by SARS-CoV-2, thereby mitigating the overall impact on the myocardium [
34,
35]. Historical data from previous viral epidemics confirm adverse pulmonary and cardiovascular effects in CVD patients [
36], but this phenomenon may be due to an overwhelming immune-inflammatory cytokine response, direct viral invasion of cardiomyocytes, or poor myocardial oxygenation from severe hypoxia due to lung injury. These mechanisms might plausibly complicate the course of patients with CHD who were already prone to myocardial dysfunction, limited myocardial oxygenation, or pulmonary vascular disease. We attempted to reduce the background variance and the unbalanced distribution of explanatory covariates in the proposed model. Potential confounders and the limited sample size made the modeling of clinical characteristics crude. A larger dataset may help address these issues. Therefore, further well-designed experiments are needed to validate the actual effect of cardiovascular comorbidities on COVID-19.
Finally, the expression of other potential auxiliary proteases may contribute to the generation of therapeutic hypotheses related to the disruption of viral processing through protease inhibition. The S glycoprotein of SARS-CoV-2 contains two subunits, S1 and S2, and NRP1 can bind to the furin cleavage site of the S1 subunit, which can promote SARS-CoV-2 infection [
7]. Because PCSK5 proteases are located in different membrane compartments, they may process the S glycoprotein of SARS-CoV-2 at different viral stages [
37]. The protein CALM3, which is significantly coexpressed with ACE2, is part of the calcium signal transduction pathway, which mediates the control of a large number of enzymes, ion channels, aquaporins and other proteins through calcium binding [
38]. LMAN can participate in biological processes such as protein metabolism, Golgi transport dynamics and subsequent protein modification. In addition, fibroblasts are among the specific cells that SARS-CoV-2 may first invade by virtue of their double-positive protein localization [
23,
28]. We observed significant differences in the IFN-II signaling pathway and PAR signaling pathway in fibroblasts under different cardiovascular comorbidities by cell‒cell communication analysis. One of the hallmarks of severe COVID-19 is a persistent interferon (IFN) response [
39]. An inborn error affecting human fibroblast IFN immunity may underlie life-threatening COVID-19 pneumonia in patients without prior severe infection [
40]. Protease-activated receptors (PARs) may enhance platelet activation through thrombin-mediated platelet calcium mobilization in injured myocardium [
41].
This study still has some limitations that should be mentioned. First, it attempted to reveal the influence of underlying cardiovascular disease factors alone on SARS-CoV-2 infection of myocardial tissue, so SARS-CoV-2-infected tissues were not used. SARS-CoV-2 that has infected other tissues may instruct cells to replicate its genome, produce its proteins and assemble them into many new copies of the virus, which, upon release, can circulate in the blood and affect myocardial tissue [
32]. Therefore, we tried to rule out the influence of SARS-CoV-2-infected tissue data. Second, the difficulty of obtaining human heart samples restricted the sample size of this study. However, studies like this may be one of the ways to solve the limitation of sample size: by referring to previous studies, using published research data and using the research model of coexpression of virus and key proteins [
42]. A strength of this study is that it used an innovative research model of the mechanism by which SARS-CoV-2 affects myocardial tissue, and it excluded some negative factors of the impact of SARS-CoV-2 infection on myocardial tissue.
In conclusion, this study is the first to correlate cell type-specific changes in expression levels with age, sex, and cardiovascular disease. It provides new insight into the pathway by which SARS-CoV-2 infects myocardial tissue. Our meta-analysis provides a detailed molecular and cellular map to help us better understand the invasion, pathogenesis, and association with clinical features of SARS-CoV-2 in myocardial tissue.
Methods
Published dataset and patient samples
Sample collection was reviewed and approved by the Institutional Review Board (IRB) at the institution where each sample was originally collected. GSE145154 was approved by the Ethics Committee of Fuwai Hospital in Beijing, China. Tissue samples of hearts with DCM and ICM in the GSE145154 dataset were obtained from patients undergoing transplant; patients with the following causes of DCM were excluded: cardiac amyloidosis, cardiac sarcoidosis, viral myocarditis, giant cell myocarditis, peripartum cardiomyopathy, chemotherapy-associated cardiomyopathy, obesity, diabetic cardiomyopathy, coronary artery disease, valvular disease, and congenital heart disease. GSE134355 was approved by the Research Ethics Committee of the Zhejiang University School of Medicine, Research Ethics Committee of the First Affiliated Hospital, Research Ethics Committee of the Second Affiliated Hospital and Research Ethics Committee of Women’s Hospital at Zhejiang University (Approval Numbers: 20170029, 20180017, 20190034, 2018015, 2018507, 2018766 and 2018185). Informed consent for fetal tissue collection and research was obtained from each patient after her decision to legally terminate her pregnancy but before the abortive procedure was performed. Informed consent for collection and research of surgically removed adult tissues was obtained from each patient before the operation. Informed consent for the collection and research of tissues from deceased-organ donation was obtained from the donor family after the cardiac death of the donor.
Publicly available single-cell RNA-seq datasets were downloaded from the Gene Expression Omnibus (GEO) [
43]. GSE145154 was sequenced on the Illumina HiSeq 6000 and HiSeq X Ten platforms using 10x Genomics technology and includes 2 DCM tissues, 2 ICM tissues, and 1 healthy heart tissue, with left and right ventricular samples taken from each patient for sequencing. GSE134355 was sequenced on the HiSeq X Ten platform using 10x Genomics technology and includes heart tissues from 2 adult hypertensive patients and 2 normal fetal heart tissues.
Integrated analysis of published datasets
The single-cell data used were the original UMI count data. Preprocessing, quality control, normalization, and dimensionality reduction clustering were all performed using the Seurat package (v4.0) [
44]. The quality control standards were as follows: (1) each gene had to be expressed in at least 3 cells; (2) each cell had to express at least 500 genes; (3) the number of detected features (nFeature) and the total counts of each sample were screened against a median ± 3 × MAD (median absolute deviation) criterion; (4) the mitochondrial gene proportion was thresholded at 10%; and (5) the hemoglobin gene proportion was thresholded at 1%. The subsequent data standardization, normalization, identification of highly variable genes, and dimensionality reduction clustering were all done according to the default parameters and standard procedures of the Seurat package.
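The median ± 3 × MAD criterion in step (3) can be illustrated with a small sketch. It is written in plain Python rather than with the package used in the study, and the function names (`mad_bounds`, `keep_cells`) and the toy values are hypothetical. The MAD here is the raw median absolute deviation without a scaling constant, which is an assumption, since the text does not state whether one was applied.

```python
from statistics import median

def mad_bounds(values, n_mads=3.0):
    """Return (lower, upper) = median -/+ n_mads * MAD for a list of numbers."""
    med = median(values)
    # Raw MAD without a consistency constant (assumption, see lead-in)
    mad = median(abs(v - med) for v in values)
    return med - n_mads * mad, med + n_mads * mad

def keep_cells(n_features, min_genes=500, n_mads=3.0):
    """Indices of cells passing the min_genes and median +/- n_mads * MAD filters."""
    lo, hi = mad_bounds(n_features, n_mads)
    return [i for i, n in enumerate(n_features) if n >= min_genes and lo <= n <= hi]

# Toy per-cell gene counts (hypothetical values, not from the study)
genes_per_cell = [900, 1000, 1100, 1050, 950, 300, 5000]
kept = keep_cells(genes_per_cell)
```

In this toy input the cell with 300 genes fails both filters and the cell with 5000 genes falls above the upper bound, so only the first five cells are retained.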
Each dataset was log-normalized with the log1p transform $\ln(10{,}000 \times g_{ij} + 1)$, where $g_{ij}$ is the UMI count of gene $i$ in cell $j$ divided by the total of all UMI counts for cell $j$ (the column sum). We used the harmony-pytorch Python implementation (v0.1.1; https://github.com/lilab-bcb/harmony-pytorch/) of the Harmony scRNA-seq integration method for batch correction to integrate data between different samples, and selected the first 30 principal components and resolution = 1 for dimensionality reduction clustering [45]. Cell clusters were named by collecting marker genes from the literature and annotating the clusters manually.
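As a minimal sketch of the log-normalization described above, the following plain-Python function divides each UMI count by its cell's column sum, scales by 10,000 and applies log1p. The function name `log_normalize`, the genes-by-cells list layout and the toy matrix are assumptions made for illustration; the study itself relied on existing packages.

```python
import math

def log_normalize(counts, scale=10_000):
    """counts[i][j] = UMI count of gene i in cell j; returns ln(scale * g_ij + 1)."""
    n_genes, n_cells = len(counts), len(counts[0])
    # Column sums: total UMI count of each cell j
    totals = [sum(counts[i][j] for i in range(n_genes)) for j in range(n_cells)]
    return [
        [math.log1p(scale * counts[i][j] / totals[j]) if totals[j] else 0.0
         for j in range(n_cells)]
        for i in range(n_genes)
    ]

# Toy 2-gene x 2-cell UMI matrix (hypothetical values)
umi = [[1, 0],
       [3, 4]]
norm = log_normalize(umi)
```

Zero counts map to zero after the transform, so the sparsity of the count matrix is preserved.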
Differential gene expression analysis
To identify differentially expressed genes among cell populations, we used the FindMarkers function in Seurat v4.0. Differential genes were selected with an adjusted P value < 0.05, and the selection criteria for logFC were based on an earlier report [
44].
Coexpression analysis across diseases and cell types
We collected single-cell sequencing data of 3 CVDs and normal cardiomyocytes. To evaluate the coexpression of ACE2 and ADAM17 in different cell types and different disease conditions, we selected cell types with more than 15 ACE2+ cells for analysis, and ACE2- cells were selected by downsampling according to clinical characteristics using the ROSE package. We employed a mixed model with a random intercept that differed for each donor to account for donor-specific effects (i.e., batch effects):
$$Y_i \sim \text{ACE2} + (1 \mid S)$$
where ACE2 represents the binary coexpression state of each cell (that is, double-positive versus double-negative cells), $Y_i$ represents the expression level of gene $i$ in cells, expressed in units of log2(transcripts per 10,000 reads (TP10K) + 1), and $S$ represents the donor from which each cell was isolated. The models were fitted with the lme4 package in R [
46].
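Written out per cell, the model formula above corresponds to a standard random-intercept model. The expansion below is a sketch under the usual lme4 assumptions of Gaussian random intercepts and residuals; the cell index $c$ and the symbols $\beta_{0i}$, $\beta_{1i}$, $u_{i,s}$ and $\varepsilon_{ic}$ are introduced here for illustration and do not appear in the original description.

$$Y_{ic} = \beta_{0i} + \beta_{1i}\,\text{ACE2}_{c} + u_{i,S(c)} + \varepsilon_{ic}, \qquad u_{i,s} \sim \mathcal{N}(0, \sigma_{u,i}^{2}), \quad \varepsilon_{ic} \sim \mathcal{N}(0, \sigma_{i}^{2})$$

Here $\beta_{1i}$ is the fixed effect of the coexpression state on gene $i$, and $u_{i,S(c)}$ is the intercept shift shared by all cells from donor $S(c)$, which is what absorbs donor-specific (batch) effects.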
Integrated analysis for associating ACE2, ADAM17 and CTSL expression with age, sex and cardiovascular comorbidities
We combined all scRNA-seq datasets of human left and right ventricular cells, as well as fetal samples, including the expression counts of just the above three genes, to analyze the relationships between age, sex, and cardiovascular comorbidities and the expression of ACE2, ADAM17, and CTSL. First, to refine the localization of ACE2+ADAM17+ cells, we subclustered each cell population, integrated data between different samples of the same cell type using the harmony package, and selected the top 30 principal components and a resolution specific to each cell population for dimensionality reduction clustering [45]. Subclusters were named by collecting marker genes from the literature and annotating them manually. Then, the expression levels of ACE2, ADAM17, and CTSL in different cell subpopulations were plotted according to the subclustering results, and the subpopulations with a higher proportion of double-positive cells were selected to explore the relationship between ACE2+ADAM17+ cells and clinical characteristics. Data imbalance was again handled using the downsampling method in the ROSE package, and the relationship between expression and clinical characteristics was assessed using the mixed-effects model in the lme4 package [
46].
$$Y_i \sim X_i + (1 \mid \text{donor})$$
where $Y_i$ represents the expression level of the dichotomized genes, while $X_i$ represents the different clinical features. The specific method can be found in the published literature [
20].
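The class balancing step can be pictured as random undersampling of the majority class. The sketch below is a plain-Python illustration of that idea and is not the ROSE implementation; the function name `downsample_majority`, the fixed seed and the toy labels are hypothetical, and the sketch ignores the stratification by clinical characteristics mentioned in the text.

```python
import random

def downsample_majority(labels, seed=0):
    """Randomly drop majority-class cells until both classes have equal size.

    labels: list of booleans (True = positive cell). Returns sorted kept indices.
    """
    pos = [i for i, y in enumerate(labels) if y]
    neg = [i for i, y in enumerate(labels) if not y]
    minority, majority = (pos, neg) if len(pos) <= len(neg) else (neg, pos)
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sampled = rng.sample(majority, len(minority))
    return sorted(minority + sampled)

# Toy labels: 3 positive cells among 10 (hypothetical)
is_positive = [True, False, False, True, False, False, False, True, False, False]
kept = downsample_majority(is_positive)
```

All minority-class cells are retained and an equal number of majority-class cells is drawn without replacement, so the balanced set here contains six cells.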
Coexpression of ACE2 and other auxiliary protease classes
Additional proteases may play a role in the proteolytic cleavage of viral proteins during entry and exit. To predict such proteases, we tested the coexpression of ACE2 with each of 625 annotated human protease genes [
47]. To further analyze the coexpression of ACE2 and other protease classes, we assessed the coexpression of all genes and ACE2 in three disease tissue types and in healthy tissues using a random-effects model with the lme4 package [
46]. The relationships between ACE2 and the PCSK family, CTSL, ADAM17, NRP1, HMGB1, CALM1, CALM3, KNG1, AAMP, NTS, AGT, DEFA5, SLC6A19 and other proteases were investigated. The relationships between ACE2 and these proteases in different cell types were also analyzed.
Functional enrichment analysis of double-positive cells
To further analyze the functional enrichment in the double-positive cells, we selected ACE2+ADAM17+ cells and ACE2-ADAM17- cells in different tissues for differential analysis to obtain the differential genes. The numbers of ACE2+ADAM17+ cells and ACE2-ADAM17- cells were balanced by a downsampling method and then modeled using a random forest algorithm. The top 500 genes associated with double-positivity of cells were filtered by importance, and the intersection of the top 500 genes in different tissues and the genes specific to each tissue were calculated separately. Then, the top 10 genes ranked by the sum of importance were taken for visualization using Cytoscape software [48]. In addition, we input the genes shared among the top 500 genes of the different tissues into KEGG enrichment analysis using the clusterProfiler package in R [49]. The functional enrichment map of ACE2+CTSL+ cells was drawn in the same way.
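The gene selection described above (top genes per tissue by importance, intersection across tissues, ranking by summed importance) can be sketched in plain Python. The importance scores would come from the fitted random forest models, which are not reproduced here; the function names, gene labels and scores below are hypothetical, and the toy call uses a top 3 and top 2 in place of the top 500 and top 10.

```python
def top_genes(importance, n):
    """Names of the n genes with the highest importance score."""
    return set(sorted(importance, key=importance.get, reverse=True)[:n])

def shared_top_genes(importance_by_tissue, n_top=500, n_show=10):
    """Genes present in every tissue's top n_top, ranked by summed importance."""
    tops = [top_genes(imp, n_top) for imp in importance_by_tissue.values()]
    shared = set.intersection(*tops)
    total = {g: sum(imp[g] for imp in importance_by_tissue.values()) for g in shared}
    # Sort by descending summed importance; ties are broken alphabetically
    return sorted(shared, key=lambda g: (-total[g], g))[:n_show]

# Toy importance scores (hypothetical genes and values)
scores = {
    "DCM":     {"G1": 0.9, "G2": 0.5, "G3": 0.1, "G4": 0.4},
    "ICM":     {"G1": 0.2, "G2": 0.8, "G3": 0.7, "G4": 0.1},
    "healthy": {"G1": 0.6, "G2": 0.7, "G3": 0.2, "G4": 0.5},
}
ranked = shared_top_genes(scores, n_top=3, n_show=2)
```

Tissue-specific genes can be obtained the same way by subtracting the shared set from each tissue's own top set.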
To identify the genes related to double-positive cells in different cell types, we fitted the random-effects model within each cell type, sorted the genes by effect size, and visualized the top 12 genes using Cytoscape software [
48].
Analysis of cell‒cell communications
CellChat objects were created based on the pericyte UMI count matrix of each group (DCM, ICM, hypertension, and healthy) via CellChat (
https://github.com/sqjin/CellChat, R package, v.1). Differences between the cell interactions of the diseased myocardial tissues and those of normal myocardial tissue were calculated with the CellChat package. With “CellChatDB.human” set as the ligand-receptor interaction database, cell‒cell communication analysis was performed with the default settings. The total number of interactions and the interaction strength were compared by merging the CellChat objects of each group with the function mergeCellChat. Differences in the number of interactions or interaction strength among cell populations were visualized with the function netVisual_diffInteraction. Finally, differentially expressed signaling pathways were identified with the function rankNet, and the distribution of signaling gene expression between datasets was visualized with the function plotGeneExpression [
50].
Statistical analysis
All data calculations and statistical analyses in this study were done using R software (https://www.r-project.org/, version 4.1.2). All statistical P values were two-sided; differential gene screening was considered statistically significant at a corrected P value < 0.05, and the P value thresholds for the remaining statistical tests were as described in the text.