Background
Lung cancer represents a major health issue worldwide due to its high incidence and mortality rates [1]. The disease is often diagnosed at advanced stages, when current clinical options are largely ineffective, resulting in five-year survival rates of less than 10% [2]. Identification of circulating molecular biomarkers is therefore critical to improve early detection of lung cancer and represents an approach with large clinical potential [3]. Despite recent improvements in the characterization of different circulating biomarkers such as cell-free DNA [4, 5], circulating tumor cells [6–8], extracellular vesicles or circulating miRNA [9, 10], the search for optimal biomarkers remains a challenge. In the past, most studies centered on identifying potential biomarkers among molecules differentially expressed by tumor cells. However, in the last few years, the concept that each tumor is a complex system composed of both cancer cells and the surrounding stroma, which comprises a variety of cell types such as fibroblasts, immune cells and endothelial cells, has become well established [11]. This notion has important implications for biomarker research, as novel candidates with diagnostic or prognostic value can potentially be obtained by analyzing molecules produced by stromal cells during their interactions with cancer cells [12, 13].
In particular, activated fibroblasts play a prominent role in lung carcinogenesis due to their ability to trigger several signaling pathways implicated in tumor formation and metastasis [14–16]. It has also been demonstrated that cancer-associated fibroblasts (CAFs) within the reactive stroma are responsible for the deposition of elevated amounts of extracellular matrix (ECM) [17]. Under physiological conditions, the ECM provides mechanical and biochemical support to the surrounding cells and is actively involved in cell proliferation and migration [18]. Under pathological conditions, on the other hand, the interactions between activated fibroblasts and epithelial cells result in the production of different growth factors, cytokines and proteases that modify the surrounding ECM by changing its composition, facilitating pathological alterations such as chronic obstructive pulmonary disease (COPD) or idiopathic pulmonary fibrosis (IPF) [19] and potentially leading to cancer development [20, 21].
As a consequence of this remodeling, ECM-related proteins are released into the blood and could be considered potential novel circulating biomarkers [22]. New biomarkers for early detection of lung cancer derived from ECM remodeling could therefore be developed by identifying candidate genes and pathways from gene expression profiling of CAFs. A recent review started to unravel the key pathways involved in the functional effects of CAFs, highlighting the existence of common mechanisms as well as specificities in different cancer types (breast, prostate and lung cancer) [23]. The results showed that most of the commonly enriched gene sets characterizing tumor-promoting fibroblasts were related to structural ECM molecules and ECM organization (e.g. collagens) and, in particular, to the ECM3 signature, an ECM-based signature previously associated with poor prognosis in aggressive (i.e. grade III) breast carcinomas [24]. Mechanistically, the deposition of collagens around the tumor influences cancer cell behavior, promoting cancer progression and invasion in several cancer types [25, 26]. However, the utility of any specific collagen fragment as a circulating plasma biomarker in lung cancer remains unproven. Several studies have also pointed out the importance of another lung microenvironment-related protein, secreted protein acidic and rich in cysteine (SPARC), a collagen-binding matricellular protein considered a key player in tumor progression, most likely by supporting crosstalk at the tumor–stroma interface [27, 28]. SPARC has been predominantly detected in the tumor-associated stroma, specifically in the ECM produced by activated fibroblasts. Interestingly, the localization of SPARC in non-small cell lung cancer (NSCLC) tissues is linked to disease prognosis. High levels of SPARC expression within NSCLC tumor tissue are associated with longer survival, while its absence represents a negative prognostic factor. On the other hand, high expression of SPARC in the stroma is associated with poor overall survival in lung cancer patients [29]. Since changes in the ECM can occur early in cancer progression, in this study we aimed to identify circulating plasma proteins originating from the ECM compartment and to investigate their potential utility as biomarkers for early diagnosis of lung cancer. In addition, we aimed to explore the association of ECM proteins with different clinical parameters and their potential prognostic value.
Methods
Patient characteristics and tissue sampling
The current study was approved by the Ethics Review Board of the Fondazione IRCCS Istituto Nazionale dei Tumori and included all consecutive patients for whom plasma samples were available and who underwent a complete anatomical resection for primary lung cancer at the Thoracic Surgery Division of the National Cancer Institute of Milan from January 2012 to July 2014. Healthy heavy-smoker controls were enrolled in a lung cancer screening program (ClinicalTrials.gov NCT02247453, www.biomild.org) from January 2013 to January 2016. Written informed consent for blood collection was obtained from all patients and healthy heavy-smoker controls. All cases used in this study were confirmed to be primary lung cancer by pathology review. Study participants were mainly heavy smokers (12 non-smokers for the analysis of COL11A1 and COL10A1). Controls were matched 1:1 to the patient cohorts according to sex and age class (< 50, 50–54, 55–59, 60–64, 65–69, 70–75, > 75) for the analysis of COL11A1 and COL10A1, and according to sex, smoking history and age class for the analysis of SPARC. Overall survival was the study outcome of interest; patients therefore contributed the time interval from surgery until the date of death or, for survivors, until 16 January 2017. Blood collection was performed shortly before surgery to avoid any impact of surgery on marker quantification. Plasma extraction has been described elsewhere [30]. Briefly, whole blood samples (5–10 ml) were collected as first blood in spray-coated K2EDTA tubes (BD-Becton, Dickinson and Company, Plymouth, UK). Within 2 h, plasma was separated by a first centrifugation step at 2500 rpm at 4 °C for 10 min. The supernatant containing plasma was carefully collected, avoiding the fraction closest to the lymphocytic ring. Plasma was then centrifuged a second time at 2500 rpm at 4 °C for 10 min, and the supernatant was collected and stored at −80 °C until further use.
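The 1:1 matching on sex and age class described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names, the record layout (`id`, `sex`, `age`) and the greedy first-available pairing are assumptions made here, and ages are assumed to be whole years.

```python
from collections import defaultdict

def age_class(age: int) -> str:
    """Map an age in whole years to the classes listed in the text:
    <50, 50-54, 55-59, 60-64, 65-69, 70-75, >75."""
    if age < 50:
        return "<50"
    if age > 75:
        return ">75"
    if age >= 70:
        return "70-75"
    lower = 50 + 5 * ((age - 50) // 5)
    return f"{lower}-{lower + 4}"

def match_one_to_one(patients, controls):
    """Pair each patient with one unused control of the same sex and age class.
    Greedy pairing is an assumption; patients without a candidate stay unmatched."""
    pool = defaultdict(list)
    for c in controls:
        pool[(c["sex"], age_class(c["age"]))].append(c)
    pairs = []
    for p in patients:
        candidates = pool[(p["sex"], age_class(p["age"]))]
        if candidates:
            pairs.append((p["id"], candidates.pop(0)["id"]))
    return pairs

# Hypothetical records, for illustration only
patients = [{"id": "P1", "sex": "F", "age": 62}, {"id": "P2", "sex": "M", "age": 71}]
controls = [{"id": "C1", "sex": "M", "age": 74},
            {"id": "C2", "sex": "F", "age": 60},
            {"id": "C3", "sex": "F", "age": 63}]
pairs = match_one_to_one(patients, controls)
```

Matching on smoking history for the SPARC analysis would only require adding that field to the grouping key.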
Establishment of cell cultures and conditioned medium
Primary cultures of cancer-associated fibroblasts (CAF) and of fibroblasts derived from the normal counterpart tissue (NF) of lung cancer patients were isolated from surgical specimens and cultured as previously described [31]. All cell lines were routinely tested to exclude mycoplasma contamination, grown as adherent monolayers and harvested at controlled density. To obtain conditioned medium (CM), cells were grown under controlled conditions in serum-free medium at the same density (cell number = 1 × 10^6). After 24 h, the CM was collected, centrifuged to eliminate cell debris and stored at −80 °C until further use.
RNA purification, microarray and data analysis
Total RNA was extracted from fibroblast cell cultures using the RNeasy kit (Qiagen), followed by a clean-up treatment to remove genomic DNA. RNA purity was assessed with a Bioanalyzer (Agilent Technologies) and RNA concentration was measured with a NanoDrop 2000c (Thermo Scientific). Each microarray experiment was performed using 300 ng of total RNA. Procedures included first-strand synthesis, second-strand synthesis, double-stranded cDNA clean-up, in vitro transcription, cRNA purification and fragmentation. One microgram of biotinylated cRNA was finally applied to each hybridization array, the Illumina Human HT-12v4 Expression BeadChip (Illumina, Inc., San Diego, CA, USA), at 58 °C for 18 h. Illumina BeadStudio software version 3.8 was used to obtain the raw data. Class comparison analysis was performed using the limma Bioconductor package [32]. Cancer-associated fibroblasts were compared with normal fibroblasts and all genes were ranked according to the resulting moderated t-statistic values. These ranked gene lists were subjected to Gene Set Enrichment Analysis (GSEA, v.4.0) to identify significantly enriched Gene Ontology terms or Canonical Pathways (BIOCARTA, KEGG, REACTOME). Enrichment was considered significant at p-value < 0.05.
For cell lysate preparation, cell lines were solubilized for 1 h on ice with TNTG lysis buffer containing 50 mM Tris-HCl pH 7.5, 150 mM NaCl, 100 mM NaF, 10 mM sodium pyrophosphate, 10% glycerol, 1% Triton X-100 and protease inhibitor cocktail (Complete Mini, Roche, Basel, Switzerland). For extracellular matrix isolation, cultured cells were treated with the hypotonic buffer 20 mM NH4OH for 20 min. After two washes with 1X phosphate-buffered saline (PBS), the extracellular matrix remaining on the plastic plate was recovered with heated loading buffer (Laemmli solution) with the help of a scraper. Protein levels in cell-derived extracellular matrix were normalized to the number of cells seeded and grown under the same conditions as the cells used for extracellular matrix recovery. Conditioned media (CMs) were processed with an Amicon Ultra-15 Centrifugal Filter Unit with Ultracel-3 membrane (Merck Millipore, Billerica, MA, USA) to concentrate proteins with a molecular mass greater than 5 kDa, according to the manufacturer’s instructions. Concentration factors ranged from 30 to 40X.
Proteins analysis techniques
Western Blot. Protein lysates (20 μg), a fixed volume of solubilized extracellular matrix (20 μl) or a fixed volume of concentrated CM (26 μl) were mixed with loading buffer under reducing conditions, heated for 5 min at 95 °C and loaded on 4–12% precast NuPage SDS-Bis-Tris gels (Life Technologies, Carlsbad, CA, USA). The proteins were then transferred to PVDF membranes (Merck Millipore) and stained with Ponceau Red to check loading, and the membranes were saturated for 1 h at room temperature in blocking solution (5% low-fat milk in TBS + 0.1% Tween-20) before probing with the appropriate antibodies. Blots were washed with TBS-0.5% Tween-20 and further incubated with horseradish peroxidase-conjugated secondary antibodies (GE Healthcare, Little Chalfont, UK) for 1 h at room temperature. Western blots were developed using the enhanced chemiluminescence method (GE Healthcare) according to the manufacturer’s instructions. Data were acquired and analyzed using Quantity One 4.6.6 software (Bio-Rad, Hercules, CA, USA). The following primary antibodies were used: COL11A1 (1:500 rabbit polyclonal, NBP1-55803, Novus Biologicals, Littleton, CO, USA), COL10A1 (1:1000 rabbit polyclonal, LS-C157654, LSBio LifeSpan Biosciences, Seattle, WA, USA), SPARC (5 μg/ml mouse IgG1, 33-5500, Invitrogen), and vinculin (1:1000 mouse monoclonal, hVIN-1 clone, Sigma Aldrich). Vinculin quantification, concentration factor and cell counts were used to normalize ECM proteins in cell lysates, CMs and extracellular matrix preparations, respectively.
Circulating proteins were measured using commercially available ELISA kits (COL10A1 and COL11A1: LifeSpan Biosciences, Inc.; SPARC: R&D Systems), according to the manufacturers’ instructions. Each sample was measured in duplicate. Protein levels were expressed as OD values measured with a Tecan Infinite® M1000 microplate reader.
Statistical analysis
After matching, the analysis of the levels of ECM molecules was performed on a set of 57 lung cancer patients and 57 healthy controls for COL11A1 and COL10A1, and on a set of 90 lung cancer patients and 90 healthy controls for SPARC. Raw absorbance values were corrected using the values of the ELISA standards. The distributions of the absorbance values of ECM molecules in plasma samples of lung cancer patients and healthy controls were compared using the Wilcoxon test. The association between the level of each molecule and clinicopathological variables was investigated using the Wilcoxon test for categorical variables and the Spearman correlation coefficient for continuous variables. A univariable logistic regression model including the molecule was fitted for the comparison between patients and healthy controls, and the area under the ROC curve (AUC) was estimated as a measure of discriminative ability. Sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) were derived at the ROC cutoff maximizing the Youden index. The 95% confidence intervals for AUC, sensitivity, specificity, PPV and NPV were estimated using a bootstrap procedure (Efron B. An Introduction to the Bootstrap. Chapman and Hall, New York, NY; 1993). Multivariable quantile regression models with the median as the reference quantile were implemented to study the association between the molecules and disease status (tumor vs control), adjusting for possible confounders such as age, pack-years, COPD and sex. Moreover, we studied the prognostic effect of ECM molecule levels on overall survival (OS). In univariable analyses, each molecule was categorized according to its tertiles, and the Kaplan-Meier curves were estimated and compared by means of the log-rank test. Multivariable Cox models were also implemented, in which the effect of the molecule was adjusted for age, pack-years, COPD (present vs absent) and sex (M vs F); the corresponding hazard ratios (HR) were estimated. In all the models the molecules were included as continuous variables using 3-knot restricted cubic splines [33]. The analyses were carried out using R software, version 3.2.0 (http://www.r-project.org/). Test results were considered statistically significant whenever a two-sided p-value below 0.05 was obtained.
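The ROC-based quantities described above (AUC, the cutoff maximizing the Youden index, and bootstrap confidence intervals) can be illustrated with a minimal standard-library Python sketch. It is not the authors' R code. The function names and toy absorbance values are hypothetical, and the percentile interval with separate resampling of cases and controls is an assumption, since the text does not specify the bootstrap variant. When higher marker values indicate disease, the ROC curve of a univariable logistic model coincides with that of the raw marker, so the marker is used directly.

```python
import random

def auc(cases, controls):
    """AUC as the probability that a random case scores above a random control
    (ties count one half)."""
    wins = sum((x > y) + 0.5 * (x == y) for x in cases for y in controls)
    return wins / (len(cases) * len(controls))

def youden_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity) maximizing J = sens + spec - 1.
    A sample is called positive when its value is >= cutoff."""
    best = None
    for cut in sorted(set(cases) | set(controls)):
        sens = sum(x >= cut for x in cases) / len(cases)
        spec = sum(y < cut for y in controls) / len(controls)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1], best[2], best[3]

def bootstrap_ci(cases, controls, stat, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval (assumed variant); cases and controls
    are resampled with replacement within their own group."""
    rng = random.Random(seed)
    values = sorted(
        stat(rng.choices(cases, k=len(cases)),
             rng.choices(controls, k=len(controls)))
        for _ in range(n_boot)
    )
    lo = values[int((alpha / 2) * n_boot)]
    hi = values[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Toy absorbance values, for illustration only
cases = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.5, 0.3, 0.2]
point_auc = auc(cases, controls)
cutoff, sens, spec = youden_cutoff(cases, controls)
ci_low, ci_high = bootstrap_ci(cases, controls, auc)
```

The same `bootstrap_ci` helper could be reused for sensitivity or specificity by passing a different `stat` function.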
Discussion
In this study, we aimed to determine whether proteins related to the extracellular matrix are released into plasma and can potentially be useful as lung cancer biomarkers. We started our investigation from gene expression profiling of lung fibroblasts, which are known to play an important role in cancer progression and ECM remodeling [20]. This approach revealed that ECM-related proteins are particularly enriched in primary cultures of CAF from lung cancer patients [23] and are also released into their conditioned culture medium. We then focused on molecules belonging to the ECM3 signature, previously identified in breast cancer and found to be prognostic when overexpressed in the most aggressive tumors [24]. We therefore analyzed two collagens (COL11A1 and COL10A1, the most highly expressed in the lung fibroblast gene profiling) and SPARC. Collagens represent the most abundant proteins of the extracellular matrix, and recent studies have highlighted an important role for COL11A1 and COL10A1 in many aspects of neoplastic progression [34]. Previous observations reported that COL11A1 is more highly expressed in CAF than in normal breast or pancreatic fibroblasts [35–37]. Overexpression of COL11A1 was also found in non-small cell lung cancer tissue samples, where it correlates with pathological stage, presence of lymph node metastasis, and poor prognosis [36, 38]. Our study showed higher COL11A1 gene expression in CAF compared to NF, but no difference in protein expression in cells or in its release into culture medium. In addition, we did not find any difference in plasma protein levels between patients and heavy-smoker controls, highlighting the inadequacy of this protein as a potential indicator of pathological features. This observation indicates that COL11A1, albeit differentially expressed in CAF at the gene level, is not sufficiently discriminatory at the protein level.
In contrast, COL10A1 levels showed a markedly significant difference between controls and lung cancer patients, making it a potential diagnostic candidate. However, subgroup analyses showed that this finding was restricted to the female group. In recent years, the importance of gender-related biomarkers has gained more attention, especially for lung cancer [39]. Several biological processes show substantial differences between males and females in various hormonal states, highlighting their impact on biomarker studies [40]. Although we do not have additional data or insights to explain such differences, we underline that gender effects should be considered before starting any biomarker development study. Our data suggest, however, that COL10A1 could represent a promising biomarker for lung cancer in females and that its relevance could be explored in gender-related cancers such as breast or ovarian cancer, or in specific subgroups in other cancers. The most promising results were obtained for the SPARC protein. Although SPARC has recently emerged as a prognostic biomarker in different tumors [41–43], its role in lung cancer remains controversial [44]. In non-small cell lung cancer, the localization (stromal or tumoral) of SPARC expression is associated with different disease prognosis. Absence of SPARC expression within the tumor is a negative prognostic factor [42], while high levels of SPARC protein expression, albeit rare, are associated with longer survival and could be protective against tumor aggressiveness [45]. On the other hand, patients bearing SPARC-positive stroma have significantly poorer overall survival [29]. Other studies found no prognostic impact of stromal SPARC expression [46]. To our knowledge, this is the first study showing the release of high levels of SPARC into the plasma of lung cancer patients, indicating that this protein could represent a useful diagnostic biomarker for lung cancer. Based on the previous literature and given the detectable levels of SPARC in the conditioned medium of fibroblasts, we hypothesize that circulating SPARC in our patients originates from stromal fibroblasts rather than from cancer cells, and that plasma protein levels could reflect changes in a microenvironment that becomes activated and permissive to tumor growth and progression.
SPARC protein levels in plasma were high in all stages of the disease, indicating that SPARC could also serve as a marker of early lung cancer. Consistent with a previous study [46], we did not find any association between levels of circulating SPARC and the prognosis of our patients, reinforcing the hypothesis that its presence reflects early changes in the microenvironment that are more related to initial tumor growth than to relapse and metastasis. We can therefore speculate that SPARC expression influences stroma responsiveness during tumor formation, representing an indicator of the disease in its earlier phases. The effect of SPARC on malignant progression may instead depend on other events such as EMT, growth factor modulation or immune modulation [47, 48].
Most importantly, the multivariable analysis confirmed the significant association between disease status and levels of circulating SPARC. The potential of SPARC as a diagnostic biomarker was further supported by the ROC analysis, which showed an AUC of 0.744, with an optimal cutoff corresponding to a sensitivity of 64.4% and a specificity of 78.9%. However, further studies involving larger numbers of subjects are required to confirm these results.
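As a worked illustration of how the reported sensitivity and specificity translate into predictive values, the sketch below applies Bayes' rule. The function name and the prevalence values are assumptions introduced here: 0.5 mirrors the 1:1 matched case-control design, while 0.01 is a purely hypothetical screening prevalence and not a figure from this study.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and disease prevalence
    using Bayes' rule."""
    tp = sensitivity * prevalence                  # true-positive fraction
    fp = (1 - specificity) * (1 - prevalence)      # false-positive fraction
    tn = specificity * (1 - prevalence)            # true-negative fraction
    fn = (1 - sensitivity) * prevalence            # false-negative fraction
    return tp / (tp + fp), tn / (tn + fn)

# Sensitivity and specificity as reported for the optimal SPARC cutoff
SENS, SPEC = 0.644, 0.789

# Prevalence of 0.5 reflects the matched design (assumption made for illustration)
ppv_matched, npv_matched = predictive_values(SENS, SPEC, 0.5)

# Hypothetical low-prevalence setting, chosen only to show the dependence on prevalence
ppv_screen, npv_screen = predictive_values(SENS, SPEC, 0.01)
```

Because PPV falls as prevalence falls, predictive values estimated in a matched set should not be read as the values expected in a screening population.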
Interestingly, it has been suggested that ECM modification provides protection against chemotherapy-induced apoptosis and may play a role in the failure of cancer therapy [49]. Since a decrease (in terms of degradation) in ECM-related proteins could be a marker of drug response, we also analyzed patients who were treated with chemotherapy before surgery. Although our results are preliminary due to the low number of cases analyzed, we report slightly decreased levels of SPARC protein in these patients, suggesting that, besides being a diagnostic marker, SPARC may represent a marker of treatment response. Future longitudinal studies are, however, needed before firm conclusions can be drawn.
In conclusion, this is the first study to test circulating biomarkers related to ECM remodeling as possible diagnostic tools for lung cancer patients. SPARC emerged as the most promising biomarker, but other genes identified in the comparative expression analysis of lung fibroblasts could also be useful for diagnostic purposes, either alone or in combination. It is now apparent that panels of protein-based and nucleic acid-based cancer biomarkers, as opposed to single biomarkers, will probably be necessary for reliable cancer detection, especially to improve the selection of high-risk individuals for CT screening, to distinguish malignant from benign nodules, or to identify patients with particularly aggressive cancers. Since the classical ELISA method used in our study has limitations in analysis time, sample size and equipment cost, and is not easily scalable to panels of proteins, new bioanalytical technologies should be developed to realize the full potential of protein biomarkers in the clinical setting. Our work is an exploratory study to verify whether ECM-derived proteins can be measured in the plasma of lung cancer patients and to assess their utility as circulating biomarkers; it provides proof of concept for the feasibility and potential of this approach. Additionally, our study supports the concept that stroma-related plasma biomarkers may be better suited as early diagnostic biomarkers than strictly tumor-related ones. However, large prospective clinical studies are clearly warranted to confirm our preliminary results and to further explore the potential of ECM-related circulating proteins.
Acknowledgements
The authors thank Claudio Citterio for data management and Paola Suatoni for handling of samples. The work was supported by the Italian Association for Cancer Research (Special Program “Innovative Tools for Cancer Risk Assessment and early Diagnosis”, 5 × 1000).