Background
Soft tissue sarcomas (STS) represent a diverse group of malignancies with different clinical behaviors. Adult STS can be grouped into two broad categories. One category has simple genomic profiles and specific cytogenetic changes, such as a point mutation or translocation (for example SYT-SSX in synovial sarcoma). The second category comprises tumors with more complex genomic patterns characterized by multiple gains and losses, including many leiomyosarcomas (LMS), pleomorphic liposarcomas, and undifferentiated pleomorphic sarcomas (UPS) (previously termed malignant fibrous histiocytomas) [1‐5]. Although UPS may represent a distinct tumor entity, many UPS have mRNA expression profiles that are similar to those of other well-defined subtypes of STS, including LMS and liposarcoma, although they are not easily recognized as such based on histology (http://www.iarc.fr/en/publications/pdfs-online/pat-gen/bb5/bb5-classifsofttissue.pdf) [6‐10].
While some differences in behavior generally correlate with histologic diagnosis and grade, significant heterogeneity of tumor biology exists even within histologic subsets. The heterogeneity of biological behavior complicates clinical care of patients with STS. One clinically important variable is whether a tumor will metastasize or not.
Gene expression patterns may be useful in the subclassification of STS, both for diagnosis and for prediction of clinical behavior [2, 7‐16]. In some cases, gene expression patterns may correlate better with biological behavior than histology does, and some studies have suggested that gene expression patterns may correlate with metastatic potential in some high-grade STS [11, 12, 14, 17]. A recent study identified a set of 67 genes involved in mitosis and chromosome integrity, termed the complexity index in sarcomas (CINSARC), that can predict metastasis outcome in non-translocation-dependent STS [11] and also in synovial sarcoma [18].
In earlier studies, we described gene expression profiles that identified two general subgroups in a set of clear cell renal cell carcinomas (ccRCC-gene set), a set of ovarian carcinomas (OVCA-gene set), and a set of aggressive fibromatosis samples (AF-gene set) [19‐22]. We recently reported the use of a gene set derived from these three studies to separate 73 high-grade STS into 2 or 4 groups with different propensities for metastasis [14]. Because the expression data for that STS sample set came from a platform other than the Affymetrix system and were therefore limited, we pooled the ccRCC-, OVCA-, and AF-gene sets for the earlier study.
In this study, we confirmed the results of our earlier studies with an independent data set. We used our three gene sets to examine a larger group of 309 non-translocation-associated STS using Affymetrix chip-based expression profiling, in data sets in which all probes used in our earlier studies were represented. These gene sets successfully separated the STS samples into subsets with different probabilities of developing metastases.
Discussion
In the current study, 309 cases of four histologic subtypes (LipoD, UPS, LMS, OTH) of non-translocation-associated soft tissue sarcoma (STS) were separated into subsets with different probabilities of metastasis using gene sets derived from our earlier studies with ccRCC, AF, and OVCA [15, 19‐22]. Each of the three gene sets separated the 136 UPS samples into two groups of differing metastatic propensity, while the AF-gene set also identified four subsets of the 62 LipoD samples. These data support the concept that these gene sets may predict biological behavior in some STS. The data also confirmed differences in the metastatic propensity between LMS and the other STS examined.
Both the current and CINSARC gene sets identified subsets of the UPS samples, when analyzed as a separate group, that differed in time to metastasis. Interestingly, the current study also detected subsets of LipoD, when analyzed as a separate group, that differed in probability of metastasis, but did not detect such subsets in the LMS or OTH subgroups. These differences were apparent even after excluding grade 1 tumors from the analysis. In contrast, the CINSARC gene set identified subsets of LMS, when analyzed as a separate group, that differed in probability of metastasis, but did not detect such subsets in the LipoD subgroup [11]. Copy number losses, which may also reflect genomic instability, have recently been shown to identify subgroups of LipoD with a poor prognosis [24]. These results are consistent with the hypothesis that gene expression patterns that correlate with metastatic propensity may depend on the background gene expression of the tumor, which may be determined by other gene sets and partly reflected by histology. For example, a set of genes that predicts metastasis in LMS might not be predictive in UPS, or a gene might be predictive in lung cancer but not in colon cancer.
Differences were observed in the global gene expression patterns of the subsets that were identified in this study. For example, the subset of the LipoD samples with the lower probability of metastasis (LipoD-A) tended to over-express a number of genes expressed in adipocytes such as ADIPOQ, PLIN1, FABP4, LPL, PPARG, and THRSP. These samples might be viewed as less de-differentiated. Adiponectin, encoded by the ADIPOQ gene, is a hormone produced in adipose tissue that regulates several metabolic processes including fatty acid catabolism. It is strongly expressed in preadipocytes differentiating to adipocytes [25, 26]. Perilipin, also known as lipid droplet-associated protein, is encoded by the PLIN1 gene and coats lipid droplets in adipocytes, separating the stored lipids from lipases. FABP4, also known as adipocyte protein 2, is a carrier protein for fatty acids and is expressed in adipocytes and macrophages. Lipoprotein lipase (LPL) is expressed in a number of tissues; the form in adipocytes is activated by insulin. PPARG, or peroxisome proliferator-activated receptor gamma, is a nuclear receptor that regulates fatty acid storage and glucose metabolism. PPARG activates genes that stimulate lipid uptake and adipogenesis by fat cells. THRSP, or thyroid hormone responsive SPOT14, plays an important role in regulating lipid metabolism [27]. Interestingly, SPOT14 has been reported to be a marker of aggressive breast cancer [28].
Other genes that were over-expressed in the LipoD-A subset of samples included ALDH1, ADH1B, and NTRK2. Alcohol dehydrogenase 1B (ADH1B) metabolizes a variety of substrates. Aldehyde dehydrogenase 1 (ALDH1) oxidizes aldehydes to carboxylic acids and also metabolizes a variety of substrates. ALDH1 is a member of a superfamily of genes and has been reported to be a marker of normal and malignant mammary stem cells, and a predictor of poor clinical outcome in breast cancer [29]. NTRK2 functions as a receptor for several neurotrophins, including BDNF, NT-3, and NT-4; it can mediate several effects, including differentiation and survival.
In contrast, the LipoD-B subset (with the higher risk of developing metastatic disease) over-expressed a number of genes involved in regulating cell growth and cell division. Among the genes over-expressed in the LipoD-B subset compared to the LipoD-A subset, RUNX2 is a transcription factor that has a Runt DNA-binding domain and is active in osteoblast differentiation. CDC20 (cell-division cycle protein 20), cyclin B1 (CCNB1), cyclin A2 (CCNA2), cyclin E2 (CCNE2), cyclin-dependent kinase-1 (CDK1, also known as CDC2), cyclin-dependent kinases regulatory subunit 2 (CKS2), cyclin-dependent kinase inhibitor 3 (CDKN3), aurora A kinase (AURKA), and aurora B kinase (AURKB) are involved in regulating cell division. TOP2A (DNA topoisomerase 2-alpha) is important in DNA replication and is considered a target for doxorubicin and etoposide. In some models, higher levels of TOP2A correlated with more resistance to doxorubicin [30]. Podoplanin (PDPN) is a mucin-like protein expressed in a variety of cells. Notably, PDPN is a specific marker for lymphatic endothelial cells, and has been shown to be over-expressed in a variety of tumors. ANLN (actin binding protein anillin) binds actin, but is also localized to the nucleus in some cancer cells, and has been reported to be over-expressed in hormone-resistant prostate cancer and squamous cell head and neck cancer. Lysyl oxidase (LOX) is a copper-containing enzyme that catalyzes the formation of aldehydes from lysine, especially in collagen and elastin [31]. LOX is regulated by hypoxia-inducible factors (HIFs) and has been reported to be upregulated in a number of cancers. In a mouse model, LOX inhibitors decreased metastasis [32] and such inhibitors could eventually be clinically useful [33]. Many of these regulatory proteins could serve as drug targets.
Extracellular matrix genes were also differentially expressed in the two LipoD subsets, with FN1 and CTHRC1 over-expressed in LipoD-B. Differential expression of extracellular matrix proteins in different subsets of malignancies appears to be a common theme. A desmoplastic response to cancer has long been recognized [34‐36], and tumor-associated fibroblasts may play a role in cancer growth and development [37‐42]. The expression of several extracellular matrix and collagen genes has been related to invasion and survival in other tumors [43‐45], and studies in breast cancer described two types of stromal responses: a fibromatosis-like stromal gene signature and a CSF-1 macrophage stromal gene signature [46].
Genes over-expressed in UPS-A (the good prognosis group) compared to UPS-B included: SCARA5, TNXB, ALDH1A3, ADH1B, MFAP5, AKAP12, DEFB1, TGFBR3, GAS7, IL17D, IL33, and CD34, as well as the growth factors NEGR1, PDGFD, FGF18, NTRK2, and IGFBP6, a regulator of IGF1. SCARA5 was expressed 34-fold higher in the UPS-A subgroup, and may function as a tumor suppressor gene. Over-expression of SCARA5 suppressed some malignant behaviors in hepatoma cells, and SCARA5 knockdown was associated with activation of MMP9 [23]. Tenascin-XB (TNXB) is an extracellular matrix protein that may regulate collagen deposition by fibroblasts. Immune-related proteins over-expressed in UPS-A included: DEFB1 (defensin beta 1), IL17D, IL33, and A-kinase anchor protein 12 (AKAP12).
In contrast, genes over-expressed in UPS-B (the bad prognosis group) compared to UPS-A (the good prognosis group) included: STEAP1, CA12, AIM2, TNC, HS3ST3A1, RUNX2, POSTN, ADAM12, PLAUR, NRP2, RGS1, extracellular matrix proteins (COL11A1, COL10A1, FN), matrix metalloproteinases (MMP9, MME, MMP13, MMP1), FAP (a marker of fibroblast activation), and growth regulator proteins. Six-transmembrane epithelial antigen of the prostate (STEAP1) is an antigen detected in prostate cells, but up-regulated in multiple cancer cell lines, and is a metalloreductase. Recent studies on the expression patterns of STEAP1 and STEAP2 have suggested that they may be markers of mesenchymal stem cells [47]. High STEAP1 expression in Ewing sarcoma has been reported to be associated with an improved outcome [48]. In contrast, STEAP1 expression has also been reported to promote invasiveness in Ewing tumors [49]. PLAUR has been associated with cell migration, cell adhesion, and cell cycle regulation.
Of the genes upregulated in both of the poor-prognosis groups (UPS-B and LipoD-B) compared with UPS-A and LipoD-A, respectively, only RRM2 was present in the CINSARC gene set; five genes, RRM2, LEF1, KDELR3, FN1, and CTHRC1, were present in the ccRCC-gene set; two genes, MICAL2 and KDELR3, were present in the AF-gene set; and one gene, RRM2, was present in the OVCA-gene set. The RRM2 gene encodes ribonucleoside-diphosphate reductase subunit M2, one of two subunits for ribonucleotide reductase. Ribonucleotide reductase catalyzes the formation of deoxyribonucleotides from ribonucleotides, a rate-limiting step in DNA synthesis, and is regulated in a cell-cycle dependent manner. Over-expression of RRM2 has been reported to enhance the metastatic potential of some tumors [50]. LEF1 is a transcription factor involved in the Wnt-signaling pathway, and has been associated with other malignancies. KDELR3 is a receptor involved in protein sorting in the endoplasmic reticulum. MICAL2 is a monooxygenase that promotes depolymerization of F-actin, and has also been associated with progression of prostate cancer.
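The overlap described in this paragraph can be tabulated with simple set operations. The following is a minimal sketch, not part of the original analysis: the dictionary holds only the shared up-regulated genes that the text assigns to each gene set (not the full gene sets), and the variable names are illustrative assumptions.

```python
from collections import Counter

# Shared up-regulated genes (UPS-B and LipoD-B) as assigned to each
# gene set in the text above; these are NOT the complete gene sets.
shared_genes_by_set = {
    "CINSARC": {"RRM2"},
    "ccRCC": {"RRM2", "LEF1", "KDELR3", "FN1", "CTHRC1"},
    "AF": {"MICAL2", "KDELR3"},
    "OVCA": {"RRM2"},
}

# Count in how many gene sets each gene appears.
membership = Counter(
    gene for genes in shared_genes_by_set.values() for gene in genes
)

# Genes that recur in more than one gene set.
in_multiple = sorted(gene for gene, n in membership.items() if n > 1)

# All distinct genes mentioned across the four gene sets.
all_genes = sorted(set().union(*shared_genes_by_set.values()))

print(in_multiple)
print(all_genes)
```

Tabulated this way, RRM2 and KDELR3 are the only genes in the paragraph that recur across gene sets.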
Although in the current study we focused on the identification of at most four subsets of each STS set with different biological behavior (manifest as metastatic propensity), in some cases we observed statistically significant differences in time to metastasis when the samples were analyzed as more than four subsets. The ability to detect multiple subgroups is strongly influenced by sample number and the distribution of samples among the various groups. Perhaps further heterogeneity that is clinically useful may be identified in larger sample sets. The ability to better predict the long-term outcome following surgery will greatly improve the treatment of patients with sarcomas, and gene expression profiles may provide a clinically meaningful approach to this problem, restricting the use of adjuvant modalities of therapy and reducing heterogeneity among groups in clinical trials of new drugs.
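As an illustration of how time to metastasis can be compared between subsets, the sketch below computes a Kaplan-Meier-style metastasis-free curve for each of two groups. It rests on stated assumptions: this section does not specify which estimator was used, the function name `kaplan_meier` is hypothetical, and the follow-up times and event flags are toy values invented for illustration, not study data.

```python
def kaplan_meier(times, events):
    """Return [(time, metastasis-free probability)] at each event time.

    times  -- follow-up time per sample (e.g. months)
    events -- 1 if metastasis was observed at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        # Samples leaving the risk set at time t, and how many had an event.
        n_t = sum(1 for tt, _ in data if tt == t)
        d_t = sum(1 for tt, e in data if tt == t and e)
        if d_t:
            surv *= 1 - d_t / at_risk
            curve.append((t, surv))
        at_risk -= n_t
        i += n_t
    return curve


# Toy data only (hypothetical): one lower-risk and one higher-risk subset.
curve_a = kaplan_meier([12, 30, 45, 60, 60], [0, 1, 0, 0, 0])
curve_b = kaplan_meier([6, 10, 18, 24, 40], [1, 1, 0, 1, 0])

print(curve_a)
print(curve_b)
```

With only five samples per group, each event moves the curve by a large step, which illustrates why the ability to resolve additional subgroups depends on sample number and on how the samples are distributed among groups.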
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
KMS participated in study design, data analysis, and helped draft the manuscript. APNS participated in study design, data analysis, and helped draft the manuscript. WX participated in data analysis, and helped draft the manuscript. XL participated in data analysis, and helped draft the manuscript. PL participated in data acquisition and analysis, and helped draft the manuscript. JMC participated in data acquisition and analysis, and helped draft the manuscript. FC participated in data acquisition and analysis, and helped draft the manuscript. All authors read and approved the final manuscript.