At present, histological analysis of gastroscopic biopsy specimens is affected by the sampling location and the amount of tissue obtained [8]. In this study, we developed a robust qualitative transcriptional signature, comprising two gene pairs formed from three genes, to aid the early diagnosis of GC using either gastroscopic biopsy or surgical resection specimens. The signature accurately distinguishes GC tissues from non-GC tissues, including normal, gastritis and intestinal metaplasia tissues. It correctly classified GC tissues as GC even when the proportion of tumor epithelial cells was as low as 14%. In particular, it identified most GC adjacent-normal tissues as cancer, suggesting that the signature can detect GC even when the sampling location is inaccurate. Notably, all the non-GC tissues sampled by gastroscopic biopsy were correctly identified as non-GC. However, the number of gastritis and intestinal metaplasia specimens sampled by gastroscopic biopsy was limited, and further studies using larger collections of non-GC specimens are warranted.
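As a rough illustration of how a relative expression ordering (REO) signature of this kind can be applied to a single sample, consider the minimal sketch below. The direction of each pair (CYR61 higher than its partner counting as a vote for cancer) is inferred from the fold changes reported in this discussion. The `min_votes` threshold, the function names and the expression values are hypothetical and are not taken from the study.

```python
from typing import Mapping

# Gene pairs named in the text. The direction (CYR61 > partner => cancer vote)
# is an assumption inferred from the reported fold changes.
SIGNATURE_PAIRS = [("CYR61", "MMP28"), ("CYR61", "ACOX1")]


def reo_votes(expr: Mapping[str, float]) -> int:
    """Count the gene pairs whose within-sample ordering is CYR61 > partner."""
    return sum(1 for a, b in SIGNATURE_PAIRS if expr[a] > expr[b])


def classify(expr: Mapping[str, float], min_votes: int = 2) -> str:
    """Label a sample 'GC' when at least `min_votes` pairs vote for cancer.

    `min_votes` is a hypothetical parameter: the decision rule actually used
    by the signature is not stated in this section.
    """
    return "GC" if reo_votes(expr) >= min_votes else "non-GC"


# Made-up expression values, for illustration only.
sample_a = {"CYR61": 9.1, "MMP28": 7.4, "ACOX1": 8.0}
sample_b = {"CYR61": 6.2, "MMP28": 7.9, "ACOX1": 8.3}

print(classify(sample_a))
print(classify(sample_b))
```

Because the rule depends only on which gene of a pair is higher within the same sample, it needs no normalization across samples or reference to a training cohort.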
The gastroscopic biopsy specimens used in this study yielded about 1–8 µg of total RNA [41–43], which is a relatively large amount. In clinical practice, it is often difficult to obtain sufficient biopsy material for gene expression profiling or other molecular measurements [11, 44]. Fortunately, we have shown that REO-based signatures can be robustly applied to specimens whose RNA was amplified from as little as 150–250 pg of total RNA from cancer cells [31]. Therefore, it is highly likely that the two gene pairs could be applied to gastroscopic biopsy specimens with minimal sampling amounts.
We compared the expression levels of the two genes in each of the signature gene pairs. The fold changes (FC) between the two genes of each pair differed considerably across datasets for the GC, GC adjacent-normal and non-GC groups (Additional files 7 and 8). For the gene pair CYR61 and MMP28, the median FC between CYR61 and MMP28 ranged from 1.17 to 30.56 in the GC group across datasets, whereas in the non-GC group it ranged from 0.76 to 0.89 (Additional file 7: Table S4). Similar results were observed for the gene pair CYR61 and ACOX1 (Additional files 7 and 8). Notably, two genes that are both highly expressed in a sample can hardly reach a large FC even if the absolute difference between their expression levels is large. Conversely, two genes that are both expressed at low levels may reach a large FC simply because of large measurement variation [45]. To show the quantitative difference between the two genes of each pair more clearly, we also calculated, for each sample, the expression level of CYR61 minus that of MMP28 (or ACOX1) (Additional files 9 and 10). The median value of CYR61 minus MMP28 ranged from 1.30 to 1868.50 in the GC group across datasets, whereas in the non-GC group it ranged from −2.29 to −0.73 (Additional file 9: Table S5). The results were similar for the gene pair CYR61 and ACOX1 (Additional files 9 and 10). The subtraction values differed considerably between platforms, and they varied even within the same platform. For example, the median value of CYR61 minus MMP28 in the GC group ranged from 2.84 to 1868.5 for GPL6947 (Additional files 9 and 10). These results show that quantitative differences (such as FC and subtraction) between the genes of each signature pair vary widely across samples in both the GC and non-GC groups, because quantitative gene expression measurements are affected by batch effects and many other factors such as sample quality [29, 31, 46]. However, the REOs of the gene pairs within each group are very stable.
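The contrast between fold change, subtraction and REO can be made concrete with a small numerical sketch. The expression values below are invented for illustration and are not taken from the additional files; `pair_summary` is a hypothetical helper, and the factor of 3 merely stands in for a multiplicative batch effect.

```python
def pair_summary(a: float, b: float) -> dict:
    """Summarize a gene pair by fold change, difference and ordering (REO)."""
    return {"fc": a / b, "diff": a - b, "reo": a > b}


# Two highly expressed genes: large absolute difference, small fold change.
high = pair_summary(8000.0, 6400.0)

# Two lowly expressed genes: tiny absolute difference, large fold change.
low = pair_summary(0.5, 0.125)

# The same high-expression pair after an assumed multiplicative batch effect.
scaled = pair_summary(3 * 8000.0, 3 * 6400.0)

for name, summary in [("high", high), ("low", low), ("scaled", scaled)]:
    print(name, summary)
```

In this toy case the difference triples under the scaling while the ordering of the two genes is unchanged, and the fold change and the difference disagree about which pair shows the larger effect. This is the behavior that motivates relying on the ordering rather than on either quantitative measure.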
We additionally evaluated the performance of the signature on other cancer types, including liver, colorectal and pancreatic cancers (Additional file 11: Table S6). As shown in Additional file 12: Table S7, the signature was unsuitable for these cancer types: it classified cancer tissues of the liver, colorectum and pancreas as cancer, but it failed to classify most non-cancer tissues as non-cancer. The signature genes, CYR61, MMP28 and ACOX1, may play important roles in the initiation and progression of cancer. As shown in Additional file 13: Table S8, CYR61 and MMP28 are involved in functions related to the initiation and progression of cancer, such as cell proliferation, differentiation or metastasis. ACOX1 has been reported to regulate cancer development [47], and its dysfunction is linked to hepatocarcinogenesis [48] and to the migration and invasion of colorectal cancer cells [49]. Therefore, the stable REOs of the signature genes may be an inherent feature of cancer, which deserves further study.