Background
Reverse transcription quantitative real-time polymerase chain reaction (RT-qPCR) is a powerful tool for validating observed differences in gene expression because of its high sensitivity and specificity. In traditional gene expression studies, a 'reference gene', also called an 'internal standard' or 'housekeeping gene', is used for normalization. The expression of beta-actin (ACTB) and glyceraldehyde-3-phosphate dehydrogenase (GAPDH), used in a majority of studies [1], was reported to vary with experimental conditions [2] and with the clinical status of the tissue studied (e.g. asthma), making these genes unsuitable as internal standards for normalization of gene expression [3]. Thus, the validity of the chosen reference gene is crucial for avoiding misinterpretation of data and invalid conclusions [4].
It has been suggested that at least three considerations should be taken into account in choosing a reference gene: 1) constancy of its expression throughout the intervention, 2) its amplification efficiency and 3) its abundance, which should be similar to that of the genes of interest [5]. In addition, to ensure the relevance, accuracy and correct interpretation of RT-qPCR results, it is recommended that the MIQE (Minimum Information for Publication of Quantitative Real-Time PCR Experiments) guidelines be followed [6]. Several statistical tools, such as NormFinder [7], geNorm [8] and BestKeeper [9], have been developed to help in the choice of appropriate reference genes. These tools assess the variation in expression of a number of candidate reference genes and suggest which reference gene(s) are appropriate for normalization of gene expression data in a given study.
Stomach cancer is the fourth most common cancer worldwide, with a reported 934,000 cases in 2002 [10]. Survival from stomach cancer is poor because patients are often diagnosed only after the disease has advanced significantly [11], which makes early detection very important. Screening aimed at early detection involves endoscopic examination. To confirm the presence of cancer, biopsies are taken from suspect tissues and subjected to RT-qPCR to confirm abnormal expression of cancer-related genes. However, appropriate reference genes have to be identified for valid comparisons of gene expression between normal and cancer tissues. Reference genes have been described for RT-qPCR studies in various cancers of other tissues [1,12-21], but there seems to be no consensus on reference genes for gene expression studies in stomach cancer. We therefore searched PubMed with the MeSH terms "gastric cancer", "real-time", and "PCR". In an evaluation of 115 articles published from May 2007 to November 2009, we found that GAPDH (53 cases; 46.1%) and ACTB (41 cases; 35.7%) were the most frequently used reference genes in gastric cancer studies, followed by 18S rRNA (8 cases; 7.0%), beta-2-microglobulin (B2M; 3 cases; 2.6%), hypoxanthine phosphoribosyl transferase 1 (HPRT1; 2 cases; 1.7%), TATA binding protein (TBP; 1 case; 0.9%), and beta-tubulin (TUBB; 1 case; 0.9%). In five cases (4.3%), an external standard curve was used for absolute quantification (AQ) instead of normalization to a reference gene.
The present study was therefore designed to find the best reference genes for gene expression studies in stomach cancer. We investigated the five reference genes most frequently used in stomach cancer studies (ACTB, GAPDH, B2M, 18S rRNA, and HPRT1) and, for comparison, RPL29, a reference gene used in other cancer studies, in 'non-stomach cancer cell lines', 'stomach cancer cell lines', 'normal stomach tissues' and 'tumor stomach tissues' (Table 1). To choose the most appropriate reference gene from this list, we compared the expression of glycoprotein NMB (GPNMB), our target gene, with that of each candidate reference gene.
Table 1
Potential reference genes evaluated in this study.
Gene symbol | Accession number | Gene name | Chromosomal location | Function |
ACTB | NM_001101 | Beta-actin | 7p15-12 | Cytoskeletal structural protein |
GAPDH | NM_002046 | Glyceraldehyde-3-phosphate dehydrogenase | 12p13 | Oxidoreductase in glycolysis and gluconeogenesis |
HPRT1 | NM_000194 | Hypoxanthine phosphoribosyl transferase 1 | Xq26 | Metabolic salvage of purines |
B2M | NM_004048 | Beta-2-microglobulin | 15q21.1 | Beta-chain of major histocompatibility complex class I molecules |
18S rRNA | NR_003286 | 18S ribosomal RNA | 22p12 | Ribosome subunit |
RPL29 | NM_000992 | Ribosomal protein L29 | 3p21.3-p21.2 | Structural constituent of ribosome |
GPNMB | NM_001005340 | Glycoprotein (transmembrane) nmb | 7p15||C | Involved in growth delay and reduction of metastatic potential |
Discussion
Differential gene expression identified in cancer transcriptome studies suggests that specific genes might be involved in tumorigenesis and metastasis. RT-qPCR is a robust and specific method for validating candidate genes in stomach cancer, because it can detect even very weak signals from the extremely small biopsy samples available when a patient is in an early stage of cancer. However, in the absence of appropriate reference genes, the data obtained are open to question and may be misinterpreted. Prior to this study, no validated reference gene had been identified for 'stomach cancer cell lines' or 'stomach cancer tissues', yet ACTB and GAPDH have been used most frequently without consideration of their inconsistent expression under different experimental settings and clinical conditions. In addition to ACTB and GAPDH, we examined four other genes, HPRT1, RPL29, 18S rRNA, and B2M, that have been evaluated as reference genes in recent studies of other human cancers.
Choosing an appropriate primer set is an important starting point for obtaining accurate results. We considered the following points in selecting primers. First, we adopted primer sets that were previously reported to have, or were designed to have, an amplicon length of around 200 bp. Second, all primer sets were required to span at least two neighboring exons, except for the 18S rRNA gene, which is not an mRNA. Both points relate to amplification efficiency. The reference gene and the target gene need to have similar amplification efficiencies [13]. Amplicon length is closely related to amplification efficiency [24], so one would expect similar efficiency from amplicons of similar length, and higher efficiency from a shorter amplicon. A further benefit of a short amplicon (70-250 bp) in RT-qPCR is that amplification is "independent" of RNA quality [25]. Amplification efficiency is also affected by gDNA contamination, because competitive binding of primers acts as a limiting factor that decreases amplification efficiency [13]. In this context, DNaseI treatment during RNA purification is crucial to avoid amplification from residual gDNA, but it might not be totally effective. Our second consideration therefore helps to detect possible gDNA contamination, because a product amplified from gDNA would have a different, longer amplicon size. Using BLAT searches against the human genome sequence, we confirmed that the forward and reverse primers of each set are anchored on different exons, and we also confirmed that no product with an extended amplicon length was amplified from contaminating gDNA (Table 2). In addition, we ensured the high quality of the starting RNA in several ways and performed the experiments in triplicate for every gene and every sample.
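Amplification efficiency of a primer set is commonly estimated from a standard curve: Cq values are measured for a dilution series, a line is fitted to Cq against log10 of the template amount, and the efficiency is taken as E = 10^(-1/slope), so that a slope of about -3.32 corresponds to E of about 2.0. The sketch below illustrates this calculation only; the dilution series and Cq values are hypothetical, and it is an assumption, not a statement from this study, that efficiencies were derived in this way.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def amplification_efficiency(amounts, cq_values):
    """Efficiency E = 10 ** (-1 / slope) of the Cq vs log10(amount) line.

    E = 2.0 means the amount of product doubles in every cycle.
    """
    log_amounts = [math.log10(a) for a in amounts]
    slope = fit_slope(log_amounts, cq_values)
    return 10 ** (-1.0 / slope)


# Hypothetical 10-fold dilution series (relative template amounts)
# and hypothetical Cq values; these are not data from this study.
amounts = [1000, 100, 10, 1]
cq_values = [20.0, 23.32, 26.64, 29.96]

efficiency = amplification_efficiency(amounts, cq_values)
print(f"Estimated amplification efficiency: {efficiency:.2f}")
```

Comparing the value obtained for each reference gene primer set with that of the target gene primer set gives a simple check that the efficiencies are similar.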
Since the development of qPCR, several statistical programs have been developed to identify optimal reference genes. We chose geNorm and NormFinder to analyze the stability of the six reference genes studied. The geNorm program calculates an M-value for each gene, based on the average pairwise variation of that gene compared with all other candidate reference genes, and ranks the genes accordingly [8]. In comparison, NormFinder adopts a strategy called a 'model-based approach to estimation of expression variation' [7]. These distinct strategies identified the best single reference gene or combination of reference genes in each group of comparisons. geNorm identified RPL29-HPRT1 as the optimal combination for 'all stomach tissues', whereas NormFinder identified RPL29-B2M. Although HPRT1 and B2M were in the best combinations identified by geNorm and NormFinder, respectively, each was ranked third in the single reference gene ranking of the corresponding analysis (Table 4). With 'stomach cancer cell lines', the rankings from the two analyses were identical, i.e. GAPDH-B2M was the most stable reference gene combination, followed by RPL29. These results were supported by the statistical data, because the highly ranked reference genes have narrower ranges of variation in expression level (Figure 1). For example, the most unstable gene in 'stomach cancer cell lines', HPRT1, has a much wider range of expression than GAPDH or B2M. This is also true for 'all stomach tissues', in which RPL29 and HPRT1 have much narrower ranges of expression than GAPDH or ACTB.
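The pairwise-variation idea behind the geNorm M-value described above can be sketched as follows. For each pair of genes, the standard deviation of their log2 expression ratio across samples is computed, and the M-value of a gene is the mean of these standard deviations over all other genes; a lower M indicates more stable expression. This is a simplified illustration and not the geNorm program itself: it assumes an amplification efficiency of 2 for every assay (so that the log2 ratio equals a Cq difference), uses the sample standard deviation, and omits the stepwise exclusion of the least stable gene. The gene names and Cq values are hypothetical.

```python
from statistics import mean, stdev


def genorm_m_values(cq):
    """Return {gene: M} from {gene: [Cq per sample]}.

    All lists must follow the same sample order. Assuming an efficiency
    of 2, log2(expression_j / expression_k) in a sample is Cq_k - Cq_j.
    """
    genes = list(cq)
    m_values = {}
    for j in genes:
        pairwise_sd = []
        for k in genes:
            if k == j:
                continue
            log2_ratios = [ck - cj for cj, ck in zip(cq[j], cq[k])]
            pairwise_sd.append(stdev(log2_ratios))
        # M is the average pairwise variation of gene j against all others.
        m_values[j] = mean(pairwise_sd)
    return m_values


# Hypothetical Cq values for four samples; not data from this study.
example_cq = {
    "GENE_A": [20.0, 21.0, 20.5, 21.5],
    "GENE_B": [18.0, 19.0, 18.5, 19.5],
    "GENE_C": [25.0, 24.0, 27.0, 23.0],
}

m = genorm_m_values(example_cq)
ranking = sorted(m, key=m.get)  # most stable gene first
for gene in ranking:
    print(f"{gene}: M = {m[gene]:.3f}")
```

In this toy input GENE_A and GENE_B track each other exactly across samples, so they share the lowest M-value, while GENE_C varies independently and ranks last.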
Some reference genes, such as ACTB and B2M, are expressed at somewhat higher levels in stomach tissues than in cancer cell lines. Cancer cell lines are expected to be metabolically more active and therefore to display higher transcriptional activity. However, higher expression of ACTB and B2M was reported in stomach tissues than in the AGS and SNU-638 stomach cancer cell lines [26], as was higher B2M expression in liver tissues than in the HepG2, Hep3B, SK-HEP-1 and SNU-182 liver cancer cell lines [17]. In contrast, in this study the average expression levels of GAPDH and RPL29 were similar in stomach tissues and cancer cell lines. Thus, it appears that metabolically more active cancer cells do not always show higher transcriptional activity for every gene and every kind of cell line.
To determine the best reference genes, we analyzed our results for the six candidate reference genes according to the suggested rules [5]. First, in terms of amplification efficiency, all primer sets seem acceptable because their efficiencies are similar and close to the perfect amplification efficiency of 2.0 (Table 2). Second, in terms of constant expression under comparable conditions, HPRT1 and 18S rRNA should be excluded from the list of best candidates for comparing gene expression in 'normal and tumor stomach tissues', because statistical analysis revealed a significant increase in their expression in tumor tissues compared with normal tissues. Third, in terms of the abundance of the reference genes and the target gene, the reference gene should ideally be almost equal in abundance to the target gene. In reality, however, it is hard to find genes with exactly the same level of expression. It is therefore advisable to use the reference gene whose abundance differs least from that of the target gene, to allow a more accurate interpretation. On this criterion, RPL29 seems appropriate because its expression is close to that of GPNMB. Lastly, in selecting the best reference genes from the two algorithms, we considered whether selecting multiple reference genes in combination is better than selecting a single reference gene, because there is still considerable difference of opinion on the use of multiple reference genes, as reported in several studies [13]. Since some of the genes included in the combinations were shown to be differentially expressed between normal and tumor tissues, or between stomach cancer cell lines and tissues, it is necessary to take into account the consistency of the expression range of each reference gene. Although the RPL29-HPRT1 combination was suggested as the best for 'all stomach tissues' by geNorm, HPRT1 expression increased from normal to tumor tissues, so this combination is not considered suitable. For the same reason, it is not advisable to accept the best combinations for 'all stomach cancer cell lines and tissues': both algorithms suggested combinations that include 18S rRNA, which showed different levels of expression between normal and tumor stomach tissues. In fact, only RPL29 showed a consistent expression range across all stomach cancer cells and tissues, suggesting that a single reference gene may be more appropriate for such comparisons. Taking these findings together, B2M seems to be the most suitable single reference gene for 'stomach cancer cell lines' and RPL29 for 'all stomach tissues'. RPL29 is also the best for comparing target gene expression between stomach cancer cells and tissues. The GAPDH-B2M combination is therefore recommended for comparing gene expression in 'stomach cancer cell lines' and the RPL29-B2M combination for comparing gene expression in 'all stomach tissues'.
We recognize that this study is limited by the small number of samples examined, but our conclusions and recommendations are supported in part by previously deposited microarray expression data and by reports in the literature. In a study of asthmatic airways, ACTB and GAPDH were found to be unsuitable as reference genes [4]. The same conclusion was reported for breast, prostate and pancreatic cancers, in which transcript levels of GAPDH were found to be elevated [15]. For stomach tissues, we confirmed this using microarray data deposited in the ArrayExpress Gene Expression ATLAS http://www.ebi.ac.uk/arrayexpress. The gene expression profile of advanced gastric cancer tissues (E-GEOD-2685) showed elevated expression of ACTB (p-value = 7.73e-3) and GAPDH (p-value = 1.94e-3), but no significant difference for the other four candidate reference genes. For the target gene GPNMB, elevated expression was observed in primary gastric tumors (p-value = 1.11e-8; E-GEOD-15460). Thus, blindly choosing ACTB or GAPDH without such evaluation should be avoided.
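To make concrete how a reference gene, or a combination of reference genes, enters the comparison of target gene expression, the sketch below normalizes a target Cq to one or more reference Cq values and then computes a tumor-to-normal fold change. It assumes the widely used delta-Cq approach with the same efficiency (2.0 by default) for all assays, in which averaging the reference Cq values is equivalent to taking the geometric mean of the reference quantities. The Cq values are hypothetical, and this is not presented as the calculation used in this study.

```python
from statistics import mean


def relative_expression(target_cq, reference_cqs, efficiency=2.0):
    """Target abundance relative to the reference gene(s) in one sample.

    reference_cqs holds the Cq of each reference gene; one value for a
    single reference gene, two or more for a combination.
    """
    reference_cq = mean(reference_cqs)
    return efficiency ** (reference_cq - target_cq)


def fold_change(tumor_target_cq, tumor_reference_cqs,
                normal_target_cq, normal_reference_cqs, efficiency=2.0):
    """Normalized target expression in tumor divided by that in normal."""
    tumor = relative_expression(tumor_target_cq, tumor_reference_cqs, efficiency)
    normal = relative_expression(normal_target_cq, normal_reference_cqs, efficiency)
    return tumor / normal


# Hypothetical Cq values; not data from this study.
# Two reference genes are stable (same Cq in both tissues), and the
# target Cq drops by two cycles in the tumor sample.
fold = fold_change(
    tumor_target_cq=24.0, tumor_reference_cqs=[20.0, 22.0],
    normal_target_cq=26.0, normal_reference_cqs=[20.0, 22.0],
)
print(f"Tumor vs normal fold change: {fold:.1f}")

# If a reference gene itself rises in tumor tissue (lower Cq), the same
# target data yield a smaller apparent fold change.
biased_fold = fold_change(
    tumor_target_cq=24.0, tumor_reference_cqs=[19.0],
    normal_target_cq=26.0, normal_reference_cqs=[20.0],
)
print(f"Fold change with an unstable reference: {biased_fold:.1f}")
```

The second call shows why a reference gene whose expression differs between normal and tumor tissues distorts the result: the change in the reference is subtracted from the change in the target.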
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
H-WR performed all the experiments and statistical analyses, and drafted the manuscript. B-CL and E-SC performed RNA purification and the RT-qPCR experiments. I-JC contributed to the acquisition of patient tissues and clinical data, and to the interpretation of data. Y-SL participated in designing the experiments and in the interpretation of data. S-HG conceived and designed the study and drafted the manuscript. All authors read and agreed to the content of this manuscript.