Introduction
Colorectal cancer (CRC) is the second leading cause of cancer-related death in the United States and accounts for 10% of all cancer-related deaths worldwide [1]. Early detection through screening programs has proven essential in reducing cancer mortality [2]. While colonoscopy is considered the gold standard for early detection of CRC, the cost and cumbersome nature of the procedure have led to more frequent use of noninvasive initial screening tests. The fecal immunochemical test (FIT) is the primary modality used in the Los Angeles County Department of Health Services (LADHS) for asymptomatic, average-risk patients. At a hemoglobin cutoff of 100 ng/mL (20 µg/g), the specificity of FIT is 95% for CRC and 97% for advanced neoplasia [3].
Despite these promising statistics, 39–52% of patients referred for diagnostic colonoscopy after a positive FIT have neither adenomas nor more advanced pathology [4]. Such false-positive FIT (FP-FIT) results expose patients to unnecessary colonoscopy, which increases healthcare burden and cost, reduces patient adherence to yearly FIT testing [5], and can cause psychological distress for up to 6 weeks after a normal colonoscopy [6]. FP-FIT results therefore directly hamper efforts to reduce unnecessary healthcare interventions.
Previous studies examining factors that affect FP-FIT results have reported conflicting findings. Some suggest that pharmacologic agents such as aspirin, clopidogrel, warfarin, and nonsteroidal anti-inflammatory drugs (NSAIDs) have no impact on FIT performance characteristics [7–10]. Others, however, suggest that one or more of these medications significantly affect FIT results [9, 11–13]. Studies also disagree as to whether factors such as age, sex, presence of hemorrhoids, smoking history, CRC history, or BMI influence FP-FIT results [4, 11, 14–23].
Given the contradictory evidence from previous studies and the lack of individual studies comparing multiple factors within the same population, we aimed to identify demographic, personal, pharmacologic, and other clinical predictors of FP-FIT in our largely Hispanic LADHS population. Identifying these factors can aid in creating personalized screening strategies, strengthening conclusions drawn from FIT results, and reducing unnecessary colonoscopies. Finally, we trained multiple machine learning models (MLMs) to predict FP-FIT as well as the presence of advanced adenoma and compared their performance.
Materials and Methods
Study Population
We conducted a retrospective study of average-risk patients aged 50 years or older who underwent diagnostic colonoscopy following a positive screening FIT between 2015 and 2018 at Olive View-UCLA Medical Center (OVMC), one of three major hospitals within LADHS. Average-risk patients were asymptomatic individuals without a family history of colorectal cancer or prior premalignant or malignant polyps. A total of 596 adult patients were identified, all of whom had undergone FIT screening with the OC-Auto-FIT test, an immunochemical test using a hemoglobin level of 100 ng/mL (20 µg/g) as the threshold for a positive FIT.
Endoscopic and Pathologic Procedures and Definitions
All colonoscopies were performed at OVMC. Participants prepared for colonoscopy per standardized instructions, which included a clear liquid diet the day before endoscopy and four liters of split-dose polyethylene glycol solution (GoLYTELY) the evening prior to endoscopy. Colonoscopies were excluded if the performing endoscopist deemed the bowel preparation suboptimal or inadequate, so the study included only colonoscopies with adequate preparation. All visualized lesions were biopsied or removed and sent for histologic assessment.
Data on endoscopic and histologic findings were collected, including the number of adenomatous polyps and the size of the largest polyp found. Colonoscopies demonstrating one or more adenomas or more advanced pathology were defined as positive. Advanced adenomas were classified according to recent societal guidelines (adenomas with size greater than 10 mm, three or more adenomas, or histology showing tubulovillous or villous morphology or adenocarcinoma) [31]. Colonoscopies demonstrating only hyperplastic polyps were defined as negative (i.e., FP-FIT).
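The outcome definitions above can be written as a small rule set. The following is a minimal Python sketch, not the authors' code: the `Colonoscopy` record, its field names, and the histology labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Histology labels treated as advanced, following the definition in the text.
ADVANCED_HISTOLOGY = frozenset({"tubulovillous", "villous", "adenocarcinoma"})


@dataclass
class Colonoscopy:
    """Hypothetical per-patient summary; field names are illustrative."""
    n_adenomas: int                     # number of adenomatous polyps found
    largest_adenoma_mm: float           # size of the largest adenoma, 0 if none
    histology: frozenset = frozenset()  # histology labels reported by pathology


def has_advanced_adenoma(c: Colonoscopy) -> bool:
    """Size > 10 mm, three or more adenomas, or advanced histology."""
    return (
        c.largest_adenoma_mm > 10
        or c.n_adenomas >= 3
        or bool(c.histology & ADVANCED_HISTOLOGY)
    )


def is_false_positive_fit(c: Colonoscopy) -> bool:
    """FIT-positive patient with no adenoma and no more advanced pathology."""
    return c.n_adenomas == 0 and not has_advanced_adenoma(c)
```

Under this sketch, a colonoscopy showing only hyperplastic polyps has `n_adenomas == 0` and no advanced histology, so it is counted as an FP-FIT, which matches the definition in the text.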
Predictor Variables
Predictors collected from each patient’s electronic health records and colonoscopy reports included: age, sex, ethnicity, body mass index (BMI), history of smoking, personal history of gastrointestinal malignancy, presence of diverticula on colonoscopy, presence of hemorrhoids, NSAID use, antiplatelet agent use, and anticoagulation use. Medications were only included if actively used by the patient at the time of positive FIT testing as evidenced by clinic visit notes.
Statistical Analysis and Machine Learning Models
We performed descriptive statistics to depict the patient population and compare characteristics between patients with and without a FP-FIT (primary outcome). We used multiple logistic regression to investigate relationships between the aforementioned predictors and a FP-FIT result. In addition, we used linear regression to elucidate the relationship between the same predictors and the number of adenomatous polyps observed on colonoscopy (secondary outcome). Patients missing data for any of the predictors were excluded from regression modeling.
Next, we trained machine learning models (MLMs) to predict an FP-FIT result as well as the presence of ≥ 1 advanced adenoma (secondary outcome). The goal was to create a statistical model that could inform the clinician of the likelihood of an FP-FIT result or, conversely, an advanced adenoma using readily available demographic and clinical parameters. MLMs have been used in recent years to predict CRC using noninvasive parameters such as complete blood count and fecal microbiota composition [24–30]. Model performance generally improves as the number of data points (patients) on which the model is “trained” grows. A subset of the dataset (the testing set) is traditionally held out to measure the performance of the model. Because the model is not trained on this subset, its performance there indicates how accurately it will predict a particular outcome in new, previously unseen patients.
Our dataset was randomly divided with 80% of observations assigned to the training set and the remaining 20% to the testing set. The FP-FIT training set consisted of 470 patients, of whom 212 (45.1%) had an FP-FIT result and 258 (54.9%) had a true-positive FIT (TP-FIT) result. The testing set included 117 patients, of whom 64 (54.7%) had an FP-FIT and 53 (45.3%) had a TP-FIT. The advanced adenoma training set consisted of 472 patients, of whom 148 (31.4%) had an advanced adenoma and 324 (68.6%) did not, while the testing set comprised 117 patients, of whom 36 (30.8%) had an advanced adenoma and 81 (69.2%) did not.
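As an illustration of the 80/20 division, the following Python sketch shuffles patient indices and assigns the first 80% to training. It is not the authors' code (their analysis used R), and the seed and the choice to round the training size up are assumptions. Rounding up happens to reproduce the reported set sizes for cohorts of 587 and 589 patients, that is, 596 minus the 9 and 7 patients excluded for missing data.

```python
import random


def split_80_20(n_patients: int, seed: int = 0):
    """Return (train_indices, test_indices) for a simple random 80/20 split."""
    indices = list(range(n_patients))
    random.Random(seed).shuffle(indices)  # seed is arbitrary, for repeatability
    n_train = -(-n_patients * 4 // 5)     # ceil(0.8 * n) in integer arithmetic
    return indices[:n_train], indices[n_train:]


fp_train, fp_test = split_80_20(587)  # FP-FIT dataset: 596 - 9 patients
aa_train, aa_test = split_80_20(589)  # advanced adenoma dataset: 596 - 7 patients
print(len(fp_train), len(fp_test))    # 470 117
print(len(aa_train), len(aa_test))    # 472 117
```

This sketch does not stratify by outcome; whether the original split was stratified is not stated in the text.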
We trained the following four supervised MLMs on both the FP-FIT and advanced adenoma data: (1) generalized linear model (GLM), (2) support vector machine (SVM) with linear kernel, (3) SVM with radial basis function (RBF) kernel, and (4) random forest. The predictors described previously were used as features in each of the MLMs. Nine patients were excluded from the FP-FIT dataset and seven patients were excluded from the advanced adenoma dataset because data were missing for one or more features. Imputation was not performed as these patients comprised only 1.5% and 1.2% of the entire cohort, respectively. We used tenfold cross-validation for resampling when tuning the SVM and random forest model hyperparameters. Each MLM was then validated on the FP-FIT or advanced adenoma testing set, and receiver operating characteristic (ROC) curves with corresponding area under the ROC curve (AUROC) were generated.
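For readers less familiar with the AUROC, the sketch below computes it directly from its rank-based interpretation: the probability that a randomly chosen patient with the outcome receives a higher predicted score than a randomly chosen patient without it, with ties counted as one half. This is a plain Python illustration on toy data, not the pROC routine used in the study.

```python
def auroc(labels, scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly,
    counting tied scores as half a correct pair."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data, not study data: 1 = outcome present, 0 = outcome absent.
toy_labels = [1, 1, 1, 0, 0, 0]
toy_scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
print(round(auroc(toy_labels, toy_scores), 3))  # 0.778
```

An AUROC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation of the two groups.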
Youden’s index and the point closest to (0,1) method were used to calculate the optimal cut points above which a patient was considered to have a FP-FIT or advanced adenoma. Youden’s index maximizes the sum of sensitivity and specificity, while the point closest to (0,1) method minimizes the Euclidean distance between the ROC curve and the (0,1) point [32]. Accuracy, sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) were calculated for the best-performing MLM at its optimal cut point.
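The two cut-point criteria and the performance measures can be sketched as follows. This is an illustrative Python example on toy data, not the study's R code. It assumes a patient is classified as positive when the predicted score is at or above the cut point; function names are hypothetical.

```python
import math


def roc_points(labels, scores):
    """Return (threshold, sensitivity, specificity) for every observed score,
    predicting positive when score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= t)
        tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < t)
        points.append((t, tp / n_pos, tn / n_neg))
    return points


def youden_cut(points):
    """Threshold maximizing J = sensitivity + specificity - 1."""
    return max(points, key=lambda p: p[1] + p[2] - 1)[0]


def closest_to_01_cut(points):
    """Threshold minimizing the distance from (1 - spec, sens) to (0, 1)."""
    return min(points, key=lambda p: math.hypot(1 - p[2], 1 - p[1]))[0]


def _ratio(numerator, denominator):
    """Safe division; undefined ratios are reported as NaN."""
    return numerator / denominator if denominator else float("nan")


def metrics_at(labels, scores, cut):
    """Accuracy, sensitivity, specificity, PPV, and NPV at a cut point."""
    pairs = [(y, s >= cut) for y, s in zip(labels, scores)]
    tp = sum(1 for y, pred in pairs if y == 1 and pred)
    fp = sum(1 for y, pred in pairs if y == 0 and pred)
    fn = sum(1 for y, pred in pairs if y == 1 and not pred)
    tn = sum(1 for y, pred in pairs if y == 0 and not pred)
    return {
        "accuracy": _ratio(tp + tn, len(pairs)),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }


# Toy data, not study data.
toy_y = [1, 1, 1, 1, 0, 0, 0, 0, 0]
toy_s = [0.9, 0.8, 0.7, 0.35, 0.6, 0.3, 0.2, 0.1, 0.05]
toy_points = roc_points(toy_y, toy_s)
print(youden_cut(toy_points), closest_to_01_cut(toy_points))  # 0.35 0.35
print(metrics_at(toy_y, toy_s, youden_cut(toy_points)))
```

On this toy example both criteria select the same cut point, but they can disagree when the ROC curve is asymmetric, which is one reason to report both.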
Descriptive statistics were performed using Stata/IC 16.1 (StataCorp, College Station, TX, USA). Logistic regression, linear regression, and machine learning experiments were performed using R 4.0.2 and the caret, ranger, and pROC libraries. A p value < 0.05 was considered statistically significant.
Discussion
In this study, we examined factors associated with FP-FIT results and, conversely, factors associated with the presence of adenomatous polyps. Our cohort included average-risk LADHS patients who had a positive result with the standard OC-Auto-FIT kit using a hemoglobin cutoff of 20 µg/g. The study identified 596 FIT-positive participants, 45% of whom were found to have an FP-FIT after undergoing colonoscopy. Given the large healthcare burden caused by FP-FIT results and subsequent colonoscopies, it is crucial to identify which patients are likely to have a high or low diagnostic yield for CRC or advanced adenomas on colonoscopy.
Across the literature, there does not appear to be any consensus as to which factors are predictive of FP-FIT results. In studies from Germany and the Netherlands, male sex, older age, and greater BMI were significant predictors of an FP-FIT [9]. In contrast, studies from Barcelona, Australia, and Italy showed that older age and male sex were associated with increased odds of advanced neoplasia, and thus of a TP-FIT [15, 19, 21]. A large meta-analysis that included both Asian and European populations found female sex and NSAIDs to be significant predictors of FP-FIT [9]. In addition, hemorrhoids were found to increase the odds of an FP-FIT in a large Korean cohort [13], though hemorrhoids were not found to be a significant predictor in studies from the Netherlands and Taiwan [5, 33, 34].
Within our largely Hispanic LADHS cohort, we found that female sex, younger age, lower BMI, and presence of hemorrhoids on colonoscopy significantly increased the odds of an FP-FIT result. Similarly, female sex and presence of hemorrhoids were associated with fewer adenomatous polyps. These results are consistent with established data that male sex, older age, and higher BMI are known predictors of gastrointestinal malignancies [35–37]. Furthermore, the presence of hemorrhoids is a commonly suspected cause of FP-FIT, as hemorrhoids are known to cause rectal bleeding. Finally, our findings that anticoagulants, antiplatelet agents, and NSAIDs do not affect FP-FIT results are in line with data from other studies [7–10].
We trained four MLMs to predict FP-FIT and four other MLMs to predict the presence of advanced adenomas. The SVM with RBF kernel demonstrated the best performance predicting an FP-FIT result, while the GLM demonstrated the best performance predicting the presence of advanced adenomas, with AUROCs of 0.618 and 0.614, respectively. Given the low AUROCs, the MLMs do not perform well enough to be clinically valuable; however, retraining them using a larger dataset and a different set of features may improve their performance.
A primary strength of this study was the consistency of FIT testing, as all participants received the same OC-Auto-FIT test with a standard hemoglobin cutoff of 20 µg/g. Additionally, multiple predictors that had previously been examined individually in separate studies were evaluated collectively within this study. These factors included not only demographic data, but also clinical history, presence of diverticula or hemorrhoids on colonoscopy, and use of NSAIDs, antiplatelet agents, or anticoagulants. Finally, this study is one of the largest studies of FP-FIT involving a predominantly Hispanic, safety net population.
Despite the substantial strengths of this study, several limitations should be acknowledged. First, the study did not include patients with positive FIT results who did not attend their colonoscopy appointment or were lost to follow-up. Because these individuals did not have a diagnostic result after a positive FIT, their exclusion may have introduced selection bias. Second, the presence of hemorrhoids on colonoscopy predicted an FP-FIT. Notably, however, this information may not be known at the time of FIT invitation, which limits its usefulness to clinicians trying to predict FP-FIT in advance. Third, we did not distinguish between initial and repeated FIT testing; repeated testing can improve adenoma detection, and subgroup analysis of repeated versus one-time FIT may improve the sensitivity of our FP-FIT MLM. Finally, the study did not examine the statistical impact of adenomas that may have been missed during colonoscopy.
Based on the findings in this study, clinicians can implement a risk-based screening strategy to determine which patients may have a low- or high-yield diagnostic colonoscopy following a positive FIT result. Pending further validation, these data may be useful in determining the most appropriate CRC screening modality for patients who are at increased a priori risk of an FP-FIT and when interpreting a positive FIT result. Particularly with the new US Preventive Services Task Force (USPSTF) proposal to initiate screening at age 45, gastroenterologists have an even greater responsibility to stratify which patients should be scheduled for colonoscopy and which patients can undergo FIT testing [38]. Overall, the addition of personalized, risk-based screening strategies could increase the accuracy and diagnostic yield of FIT screening, reducing the number of unnecessary colonoscopies and the associated healthcare burden.