Background
Although the incidence of gastric cancer has decreased worldwide, it remains the fifth most common malignancy in the world [
1]. Gastric cancer remains a heavy burden in Eastern Asia, South America, and a number of European countries. However, prevention and screening programs for gastric cancer, particularly at the national level, have not yet been established in most countries. The exceptions are Korea and Japan, where gastric cancer screening programs have already been introduced [
2]. Recently, the International Agency for Research on Cancer has suggested the establishment of
Helicobacter pylori screening and eradication programs in countries with a high incidence of gastric cancer, taking the local context into consideration [
3]. However, the efficacy of the screening methods used has not yet been evaluated, although they have been anticipated to reduce gastric cancer incidence. Thus,
H. pylori screening has not yet been officially introduced as either a national or a regional program.
Chronic
H. pylori infection plays a central role in the development of gastric cancer as shown by biological and epidemiological studies [
4]. In a recent study,
H. pylori infection was reported to be associated with 90% of non-cardia gastric cancer [
5]. The associations of other factors, including CagA, blood type, and lifestyle, with gastric cancer have also been investigated [
6‐
11]. Based on several risk factors related to lifestyle, prediction models for gastric cancer have been developed, and these models have demonstrated the capability of discriminating high-risk individuals [
12]. The serum pepsinogen (PG) test can diagnose gastric atrophy and, in combination with the
H. pylori antibody test, has been used for gastric cancer screening and risk stratification [
13‐
17]. Sasazuki et al. reported that the odds ratio for gastric cancer development associated with
H. pylori infection with gastric atrophy was higher than that associated with
H. pylori infection alone, but lower than that associated with
H. pylori infection with a positive CagA result [
11]. Charvat et al. developed a prediction model for gastric cancer based on
H. pylori infection and gastric atrophy with the risk factors related to lifestyle [
18]. The strong association between gastric cancer and these risk factors suggested a high possibility of predicting gastric cancer incidence in the high-risk group detected by the serum PG and
H. pylori antibody tests. If the future risk for gastric cancer development can be clarified, appropriate preventive measures can be taken according to individual risk. Gastric cancer screening could also be made more efficient by accurately targeting screening subjects and decreasing the screening frequency for the low-risk group. However, these results are not directly connected with primary cancer screening. To adopt these biomarkers in gastric cancer screening, both sensitivity and specificity should be assessed, considering the balance of benefits and harms.
Receiver operating characteristic (ROC) analysis is a widely accepted method for selecting an optimal cut-off value for tests as well as for comparing the sensitivity and specificity of diagnostic tests [
19]. Optimal sensitivity and specificity can maintain the balance of the benefits and harms of a diagnostic test. A high possibility of predicting gastric cancer incidence indicates high sensitivity, but the specificity, which is the proportion of subjects without gastric cancer who are correctly identified as such, is still unclear. A low specificity means a high false-positive rate, which constitutes a harm for asymptomatic people [
20]. Therefore, a high specificity is also required. However, the predictive sensitivity and specificity of these biomarkers for gastric cancer development remain unclear. In this study, we evaluated the predictive sensitivity and specificity of the
H. pylori antibody and serum PG tests for predicting gastric cancer development by ROC analysis based on a long follow-up period.
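As a brief illustration of the two measures discussed above, the following minimal Python sketch computes sensitivity and specificity from the four cells of a 2x2 table. The function name and the counts are hypothetical and are not taken from the study data.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table."""
    sensitivity = tp / (tp + fn)  # proportion of cases with a positive test
    specificity = tn / (tn + fp)  # proportion of non-cases with a negative test
    return sensitivity, specificity


# Hypothetical counts, for illustration only (not study data).
sens, spec = sensitivity_specificity(tp=90, fn=10, tn=300, fp=700)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this made-up example most cases are detected, but 700 of 1,000 subjects without cancer would also test positive, which is the kind of imbalance the present study set out to quantify.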
Methods
Study population
The Japan Public Health Center (JPHC)-based prospective study on cancer and cardiovascular disease (JPHC study) was established in 1990. The study population was defined as all inhabitants in 27 municipalities under 9 public health centers. The study population and design of the JPHC study have been described in detail elsewhere [
11]. As a whole, a population-based cohort of 61,009 men and 62,567 women was identified and followed from January 1, 1990 to December 31, 2004. Blood samples were provided voluntarily by these subjects during their health check-ups and were collected from 1990 to 1995. Although a questionnaire survey was performed at the health check-ups, it included no question about medicines the subjects were taking for gastric diseases. Newly diagnosed cases of cancer were identified through major local hospitals and local cancer registries.
This study was approved by the Institutional Review Board of the National Cancer Center, Japan (Approval number: 2001-013, 14-038). Written informed consent was obtained from all the participants in the JPHC study.
Laboratory data
The level of IgG antibodies to H. pylori was measured using a direct ELISA kit (E.Plate ‘Eiken’ H. pylori Antibody, Eiken Kagaku Co., Ltd., Tokyo, Japan). The serum levels of PG I and II were measured by two-step enzyme immunoassay using commercial kits (E.Plate ‘Eiken’ Pepsinogen I and Pepsinogen II Eiken Kagaku Co., Ltd.). All measurements were performed by a person blinded to the study. The levels of PG I, PG II, PG I/II, and H. pylori antibody were used for diagnosing and predicting gastric cancer development. In Japan, a combination of PG I, PG I/II, and H. pylori antibody measurements has been a commonly used method for stratifying the risk of gastric cancer. PG I ≤ 70 ng/mL and PG I/II ≤ 3.0 indicate chronic atrophic gastritis. H. pylori infection was classified as positive when the H. pylori antibody titer was ≥ 10 U/mL.
Statistical analysis
ROC analysis was performed following the method of Hanley and McNeil. The area under the curve (AUC) was used to indicate diagnostic accuracy, and the ROC curve was used to define the optimal cut-off points of the diagnostic tests. The AUC and 95% confidence interval (CI) were estimated and compared among the different biomarkers and their combinations. The cut-off value at which the highest likelihood ratio was obtained was defined as optimal for sensitivity and specificity. Statistical analysis was performed using Stata 13.0 (StataCorp, College Station, TX, USA). All tests were two-tailed, and p-values < 0.05 were considered to indicate a statistically significant difference.
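The AUC estimation described above can be sketched in Python as follows. This is a minimal sketch, not the Stata procedure actually used: it assumes the AUC is computed as the Mann-Whitney probability that a case scores higher than a non-case, and that its standard error follows the approximation published by Hanley and McNeil (1982). The function names and the scores are hypothetical.

```python
import math


def auc_mann_whitney(case_scores, control_scores):
    """AUC as the probability that a case scores higher than a control.

    Scores must be oriented so that higher values are more suspicious;
    a marker such as PG I/II, where low values indicate atrophy, would
    need to be negated first. Ties count as one half.
    """
    wins = 0.0
    for x in case_scores:
        for y in control_scores:
            if x > y:
                wins += 1.0
            elif x == y:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


def hanley_mcneil_se(auc, n_cases, n_controls):
    """Standard error of the AUC (Hanley and McNeil 1982 approximation)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_cases - 1) * (q1 - auc ** 2)
        + (n_controls - 1) * (q2 - auc ** 2)
    ) / (n_cases * n_controls)
    return math.sqrt(variance)


# Hypothetical scores, for illustration only (not study data).
case_scores = [3, 5, 6, 8]
control_scores = [1, 2, 4, 5]

auc = auc_mann_whitney(case_scores, control_scores)
se = hanley_mcneil_se(auc, len(case_scores), len(control_scores))

# Normal-approximation 95% CI, clipped to the valid range [0, 1].
ci_low = max(0.0, auc - 1.96 * se)
ci_high = min(1.0, auc + 1.96 * se)
print(f"AUC={auc:.3f}, SE={se:.3f}, 95% CI=({ci_low:.3f}, {ci_high:.3f})")
```

The pairwise loop is quadratic in the sample size, which is acceptable for a sketch; a rank-based computation would be preferable for a cohort-sized dataset.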
Before the main analysis for the prediction of gastric cancer development, ROC analysis of these biomarkers was performed to investigate their ability to diagnose H. pylori infection. H. pylori infection was defined as the outcome using 2 cut-off values of the antibody titer (≥ 10 U/mL and ≥ 5 U/mL), and the AUCs of PG I, PG II, and PG I/II were compared. Then, gastric cancer was used as the predictive outcome, and the AUCs of PG I, PG II, PG I/II, and H. pylori antibody titer were compared. The AUCs were also compared among combination methods using PG I, PG II, PG I/II, and H. pylori antibody. Finally, the AUC was estimated and compared for the combination method using PG I, PG I/II, and H. pylori antibody under its commonly used definition.
The subjects were classified into 4 groups according to the risk for gastric cancer development based on their levels of serum PG and H. pylori antibody at enrollment. To discriminate positive from negative results, the following standard cut-off values were used: PG I/II = 3.0, PG I = 70.0 ng/mL, and H. pylori antibody = 10.0 U/mL. Atrophic gastritis was defined on the basis of the combination of PG I/II and PG I. Subjects in the first group had a “normal” PG level (negative atrophy) and were negative for H. pylori antibody (negative H. pylori infection). Subjects in the second group had a “normal” PG level and were positive for H. pylori antibody. Subjects in the third group had an “atrophic” PG level and were positive for H. pylori antibody. Subjects in the fourth group had an “atrophic” PG level and were negative for H. pylori antibody.
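The 4-group classification can be expressed as a short Python sketch. It assumes, in line with the cut-off values given in the Laboratory data section, that atrophy requires both PG I ≤ 70 ng/mL and PG I/II ≤ 3.0 and that an antibody titer ≥ 10 U/mL counts as positive. The function name and the example values are hypothetical.

```python
def classify_risk_group(pg1, pg2, hp_titer):
    """Return the risk group (1 to 4) for one subject.

    pg1, pg2: serum PG I and PG II in ng/mL (pg2 is assumed to be > 0).
    hp_titer: H. pylori IgG antibody titer in U/mL.
    """
    pg_ratio = pg1 / pg2
    atrophic = pg1 <= 70.0 and pg_ratio <= 3.0  # assumed joint criterion
    hp_positive = hp_titer >= 10.0

    if not atrophic and not hp_positive:
        return 1  # normal PG level, H. pylori negative
    if not atrophic and hp_positive:
        return 2  # normal PG level, H. pylori positive
    if atrophic and hp_positive:
        return 3  # atrophic PG level, H. pylori positive
    return 4      # atrophic PG level, H. pylori negative


# Hypothetical measurements, for illustration only (not study data).
print(classify_risk_group(pg1=80.0, pg2=10.0, hp_titer=5.0))
print(classify_risk_group(pg1=40.0, pg2=20.0, hp_titer=20.0))
```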
Discussion
The association of various risk factors, including CagA, blood type, and lifestyle, with gastric cancer development has been investigated, and several risk factors have been shown to have a strong association [
6‐
11]. Although prediction models have been developed based on these results, methods for risk stratification in connection with gastric cancer screening have not been conclusively identified [
12,
18]. However, the adoption of the serum PG and
H. pylori antibody tests has been anticipated because these methods involve simple blood tests [
13‐
17]. A meta-analysis of prospective cohort studies of gastric cancer development using the combination method of
H. pylori antibody and serum PG tests with gastric cancer screening has shown that it is possible to stratify the background risks of gastric cancer [
21]. In this study, we investigated the best available sensitivity and specificity of the serum PG and
H. pylori antibody tests for the prediction of gastric cancer development in connection with cancer screening using ROC analysis. However, the AUCs were generally low because the sensitivity was relatively high only when the specificity became extremely low. ROC analysis is a graphical technique for assessing the ability of a test to discriminate between those with disease and those without disease [
22]. It allows the determination of the cut-off value at which optimal sensitivity and specificity can be obtained and enables the comparison of 2 or more diagnostic tests. Regarding the interpretation of AUC results, an area > 0.9 indicates high accuracy, 0.7–0.9 moderate accuracy, 0.5–0.7 low accuracy, and 0.5 a chance result [
22]. In the present ROC analysis, the AUCs for all the methods were below 0.7, including the highest AUC, which was obtained when PG I/II was used as a predictive biomarker for gastric cancer development. Based on these definitions, the predictive accuracy for gastric cancer development was found to be low in all single tests and combination methods using serum PG and
H. pylori antibody. Thus, these biomarkers could not discriminate clearly between individuals with and without gastric cancer development in this study.
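The interpretation bands cited above can be written as a small Python helper. This is only a sketch: the source does not state how the boundary values are assigned, so treating exactly 0.7 and exactly 0.9 as "moderate" and values at or below 0.5 as "chance" are assumptions, as is the function name.

```python
def interpret_auc(auc):
    """Map an AUC to the accuracy bands cited in the text.

    Boundary handling (0.7 and 0.9 counted as moderate, values at or
    below 0.5 counted as chance) is an assumption of this sketch.
    """
    if auc > 0.9:
        return "high"
    if auc >= 0.7:
        return "moderate"
    if auc > 0.5:
        return "low"
    return "chance"


# Any AUC below 0.7 but above 0.5, as reported for all methods here,
# falls into the low-accuracy band.
print(interpret_auc(0.65))
```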
When the combination method using serum PG and
H. pylori antibody tests was evaluated in this study, a high sensitivity was obtained; however, the specificity was low. In a previous study using the same dataset for a nested case-control study, a strong association between
H. pylori infection, gastric atrophy, and gastric cancer development was shown. The following odds ratios for gastric cancer development were obtained in comparison with individuals negative for both
H. pylori infection and gastric atrophy: 4.2 (95% CI: 2.2-8.0) for individuals with positive
H. pylori infection and negative atrophy; 10.1 (95% CI: 5.6-18.2) for individuals with both positive
H. pylori infection and gastric atrophy; and 4.9 (95% CI: 2.05-12.1) for individuals with negative
H. pylori infection and positive gastric atrophy [
11]. Similar results were obtained from other studies that evaluated the association of
H. pylori infection, gastric atrophy and gastric cancer development [
11,
12,
21]. Although these results confirmed the validity of the strong association for gastric cancer development, and the results supported a high sensitivity for the prediction of gastric cancer development, the possibility of not developing gastric cancer was not assessed and the specificity was ignored. When a prediction model is adopted in clinical practice, it is necessary to provide accurate and discriminating predictions in both situations: with and without gastric cancer development [
23]. Therefore, specificity is an important indicator, particularly in connection with cancer screening, because the target subjects are asymptomatic people. Low specificity translates into an increase in the number of unnecessary examinations and imposes the psychological burden of being mislabeled [
20]. When the specificities were calculated on the basis of previous studies which evaluated the association between
H. pylori infection, gastric atrophy, and gastric cancer development, similar results related to sensitivity and specificity were obtained. Based on previous studies related to gastric cancer screening [
13‐
15], the predictive sensitivity and specificity of the combination method using PG I/II, PG I, and
H. pylori antibody with a standard cut-off value were 94.0% and 34.3%, respectively. Even if other risk factors of gastric cancer were included in the model using PG I/II, PG I, and
H. pylori antibody, the sensitivity and specificity of gastric cancer development were 96.5% and 28.8%, respectively [
18]. Although the baseline conditions and follow-up times differed among these studies, the predictive accuracy for gastric cancer development was consistently low using the serum PG and
H. pylori antibody tests. These results have not received attention because the balance of benefits and harms in connection with gastric cancer screening has not been evaluated from a wide perspective. Thus, only sensitivity was evaluated in these studies.
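To make the imbalance between the sensitivity and specificity values quoted above more concrete, the following Python sketch converts them into likelihood ratios and into the fraction of subjects without gastric cancer who would test positive. The function name and the dictionary labels are hypothetical; the percentages are those given in the text.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (positive LR, negative LR) for a binary test."""
    lr_pos = sensitivity / (1.0 - specificity)
    lr_neg = (1.0 - sensitivity) / specificity
    return lr_pos, lr_neg


# Sensitivity and specificity values quoted in the text.
quoted = {
    "combination method, standard cut-off": (0.940, 0.343),
    "model including other risk factors": (0.965, 0.288),
}

for label, (sens, spec) in quoted.items():
    lr_pos, lr_neg = likelihood_ratios(sens, spec)
    false_positive_fraction = 1.0 - spec
    print(
        f"{label}: LR+={lr_pos:.2f}, LR-={lr_neg:.2f}, "
        f"false-positive fraction={false_positive_fraction:.1%}"
    )
```

With the standard cut-off, the positive likelihood ratio is only about 1.4, and roughly two thirds of subjects who do not develop gastric cancer would be labeled positive.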
Prognosis is estimated from the risk of future outcomes in individuals based on their clinical and non-clinical characteristics. Prediction can be used to target high-risk groups for cancer screening and to encourage their participation in screening. In the case of low-dose CT screening for lung cancer, risk prediction models have been developed based on different variables including smoking and other risk factors [
24‐
27]. The AUCs of these models were 0.67 to 0.88 and these models discriminated the risk of lung cancer adequately. Although
H. pylori infection is a primary cause of gastric cancer development, the serum PG and
H. pylori antibody tests are insufficient in predicting whether or not an individual has gastric cancer. The aim of an etiological study is to identify particular risk factors attributed to the outcomes. On the other hand, a prediction study provides possible outcomes based on multiple variables associated with the outcome regardless of the cause [
28]. In the prediction model, every causal factor is a predictor, but not every predictor is a necessary cause. Because of the possible confusion between an etiological study and a prediction model, biomarkers have been expected to be adopted as cancer screening methods [
29,
30]. Even an accurate and validated prognostic model neither provides any benefit nor changes the behavior of the target population of cancer screening if it is not generalizable [
31]. In addition, inappropriate use of these biomarkers can lead to a misunderstanding and mismatched labeling of individual risks of cancer. In this study, the highest AUC was obtained in PG I/II, which was also correlated with
H. pylori infection. Although the ability of PG I/II to discriminate gastric cancer development is limited, it might still be useful for determining the appropriate screening interval. In HPV screening for cervical cancer, the screening interval can be extended after a negative HPV test result [
32,
33]. The diagnosis of atrophy by conventional endoscopy has improved, and it has been adopted in clinical practice and endoscopic screening. Nomura et al. have reported that endoscopic findings correlated well with PG I/II based on a multicenter prospective study [
34]. Hence, endoscopic diagnosis should also be investigated for the prediction of gastric cancer development in connection with gastric cancer screening. Despite the limitation of PG I/II for predicting gastric cancer development, further study on how to utilize it effectively for gastric cancer screening is warranted.
This study has several limitations.
First, the background situation has changed since the participants were recruited in the early 1990s for a large-scale cohort study in Japan. Over the last 2 decades, the incidence of gastric cancer and the infection rate of
H. pylori have decreased, particularly in younger age groups [
35,
36]. Therefore, the present results might not be completely applicable to the current situation.
Second, the study subjects might not be a representative sample of the whole Japanese population. Our study subjects were taken from the dataset of a previous nested case-control study. The subjects were chosen from 97,644 eligible subjects who participated in the survey and provided blood samples. In the previous study, the participants in the health check-up survey had a different socioeconomic status and a more favorable lifestyle profile, such as smoking less, engaging in more physical exercise, and eating more green vegetables and fruits [
37].
Third, we used a case-control dataset for this analysis. Diagnostic accuracy can be overestimated if the test is evaluated in a group of patients already known to have the disease and in a separate group of healthy individuals [
38,
39]. Thus, the present results might also be overestimated.
Fourth, there was no detailed information regarding the medicines the subjects took for gastric disease. As health insurance did not cover
H. pylori eradication during the study period, asymptomatic people had few opportunities to receive it. Moreover, the use of proton pump inhibitors might also have led to misclassification.
Finally, we could not completely exclude individuals with gastric cancer at baseline because the baseline survey included a general health check-up but not an endoscopic examination. Therefore, the predictive sensitivity for gastric cancer development might be overestimated.
Acknowledgements
We also thank Ms. Kanoko Matsushima and Ms. Ikuko Tominaga for research assistance.