Introduction
Cervical cancer is the fourth most common cancer of the female reproductive system and the seventh most common cancer worldwide. Tumors are more likely to arise where endocervical cells transition into exocervical cells, near the squamocolumnar junction (SCJ). Cervical cancer remains a leading cause of cancer death among women worldwide [
1]. According to the World Health Organization (WHO) cervical cancer report in 2020, there were about 604,127 diagnosed cases and 341,831 deaths worldwide, of which 1,056 diagnosed cases and 644 deaths occurred in Iran [
3]. Sexually transmitted diseases, multiple sexual partners, smoking, poor nutrition, and a weakened immune system play a role in the development of cervical cancer [
3]. An important risk factor for cervical cancer is the persistence of human papillomavirus (HPV), especially genotypes 16 and 18 [
4]. Although about 90% of human papillomavirus infections clear on their own within two years, some may still lead to the growth of cancerous masses in the cervix [
5,
6]. Diagnosing a cancerous mass at an early stage increases the patient's chances of survival and successful treatment, whereas late diagnosis reduces the possibility of complete recovery [
7]. Cervical cancer is entirely preventable and treatable if pre-cancerous changes are identified at an early stage. The Pap smear is frequently used to screen for cervical cancer: a few cervical cell samples are taken, a smear is prepared, the cells are examined under a microscope for abnormalities, and the result yields a diagnosis of the cervical condition [
8]. Physicians consider the patient's chance of survival to guide their treatment plan.
Survival prediction is a set of statistical methods for data analysis in which the outcome variable is the time to an event; in other words, survival is modeled as the time between exposure and the occurrence of the event [
9]. According to the American Society of Clinical Oncology (ASCO), the average 5-year overall survival rate for cervical cancer is 66%; that is, about 66% of people diagnosed with cervical cancer today will survive for at least the next five years. The best treatment for each patient can be chosen by evaluating the patient's clinical and treatment data to accurately predict survival. Researchers have traditionally used classical statistical methods such as non-parametric, parametric, and semi-parametric (Cox) models to predict survival [
10]. In recent years, artificial intelligence algorithms, with their impressive capabilities, have competed strongly with statistical tests and have grown significantly in survival prediction.
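As a concrete illustration of the non-parametric methods mentioned above, the Kaplan-Meier estimator can be sketched in a few lines of plain Python; the follow-up times and event flags below are hypothetical:

```python
# Minimal Kaplan-Meier estimator: the survival probability is the product
# of (1 - deaths / at-risk) over the distinct observed event times.
def kaplan_meier(times, events):
    """times: follow-up in months; events: 1 = event observed, 0 = censored."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    surv, curve = 1.0, []
    i = 0
    while i < len(order):
        t = times[order[i]]
        deaths = censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(order) and times[order[i]] == t:
            if events[order[i]]:
                deaths += 1
            else:
                censored += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= deaths + censored
    return curve

# Hypothetical cohort of five patients, two censored (event flag 0).
print(kaplan_meier([6, 13, 13, 20, 27], [1, 1, 0, 1, 0]))
```

Each tuple gives a distinct event time and the estimated probability of surviving past it; censored patients leave the risk set without lowering the curve.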
Big data are being generated and stored with the rapid growth of digital technologies in healthcare and the evolution of electronic health records (EHR) [
11]. Classical statistical methods often focus on the relationships between the dependent and independent variables to reach a final result, whereas machine learning algorithms can learn hidden patterns in the data. Machine learning algorithms do not require strict assumptions and can handle non-linear relationships between variables [
12]. Machine learning makes computers intelligent without directly teaching them how to make decisions and solve problems [
13]. Today, machine learning algorithms have been studied and developed in the diagnosis, prognosis, and prediction of the occurrence of many diseases [
14], and they have performed very well with big data [
15].
This study aimed to evaluate published studies on machine learning algorithms in predicting the survival of patients with cervical cancer, considering overall, disease-free, and progression-free survival.
Discussion
A systematic review of 229 articles resulted in the inclusion of 13 articles. The selected articles contained qualitative and quantitative information about predicting and analyzing the survival of cervical cancer patients using machine learning algorithms. Few articles have used machine learning algorithms to predict cervical cancer survival, so studies on all three survival types (overall, disease-free, and progression-free) were included, given the variation in survival definitions and the small number of studies specific to each type.
The three included studies that used open-access databases were more transparent and comparable in their preprocessing and model building. Multiple researchers can analyze an open-access database to discover its most valuable features and the best machine-learning model for that particular dataset. Another essential point, also noted in article [
32], was the dependence of the model output on the data of a specific geographical environment and on changes in medical prescriptions over time. Generalizability and the time interval between data collection and modeling should be evaluated when assessing the applicability of a model's output. Open-access databases were therefore more suitable and valuable for studying and predicting survival.
The included articles used datasets of different sizes and types for modeling. The largest dataset was from article [
17], with 14,946 clinical tabular records and a C-index of 0.86; the smallest was from article [
26], with 85 image records (PET/CT) and a C-index of 0.77. Image datasets had fewer records than other dataset types among the included articles. According to Horenko [
33], small training datasets often cause overfitting and reduce the model's capacity for generalization. Image datasets sometimes yield more accurate models than tabular data, which can be attributed to the power of image processing algorithms [
34]. Feature extraction, feature selection, transfer learning, fine-tuning, augmentation, object segmentation, and object detection are among the most important capabilities of image processing algorithms [
34‐
36]. In addition, convolutional neural networks have obtained valuable results on 3D images [
37]. Recently, medical image datasets have been used to predict patient survival; however, larger image datasets and better-optimized convolutional neural network architectures are needed to reach a robust model.
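The C-index values quoted above measure how well a model's predicted risks order the observed outcomes. Ignoring censoring for simplicity, a minimal sketch (with hypothetical risk scores and event times) is:

```python
def concordance_index(event_times, risk_scores):
    """Fraction of comparable patient pairs in which the higher-risk
    patient has the shorter event time (ties in risk count as 0.5)."""
    concordant = comparable = 0.0
    n = len(event_times)
    for i in range(n):
        for j in range(i + 1, n):
            if event_times[i] == event_times[j]:
                continue  # tied times are not comparable in this sketch
            comparable += 1
            shorter = i if event_times[i] < event_times[j] else j
            longer = j if shorter == i else i
            if risk_scores[shorter] > risk_scores[longer]:
                concordant += 1
            elif risk_scores[shorter] == risk_scores[longer]:
                concordant += 0.5
    return concordant / comparable

# Hypothetical: the model mostly assigns higher risk to earlier events.
print(concordance_index([5, 10, 15, 20], [0.9, 0.7, 0.8, 0.2]))
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering; published implementations additionally restrict "comparable" pairs using censoring indicators.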
Only two of the included articles performed external validation. Article [
18], using molecular data, and article [
24], using a combination of clinical tabular data and PET/CT images, obtained precisions of 0.82 and 0.42, respectively. External validation gives a more reliable estimate of a model's generalizability because it uses different data. Most included articles used five-fold cross-validation for internal validation. Cross-validation is a resampling method for evaluating a model with limited data [
38]. The advent of open-access datasets and standard databases of medical data has made it more feasible to evaluate models using external validation methods.
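A minimal sketch of how five-fold cross-validation partitions a dataset (index-based only, with no real patient data):

```python
def k_fold_indices(n_samples, k=5):
    """Split indices 0..n-1 into k roughly equal folds; each fold serves
    once as the validation set while the remaining folds form the
    training set."""
    folds = [list(range(n_samples))[i::k] for i in range(k)]
    for held_out in range(k):
        val = folds[held_out]
        train = [i for f in range(k) if f != held_out for i in folds[f]]
        yield train, val

# With 10 samples and k=5, each split trains on 8 and validates on 2.
for train_idx, val_idx in k_fold_indices(10, k=5):
    print(len(train_idx), len(val_idx))
```

In practice the per-fold validation scores are averaged, which gives a more stable performance estimate than a single train/test split when data are scarce.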
Data wrangling and preprocessing play an essential role in modeling and model output. Medical datasets often include noise, redundant data, outliers, missing data, and irrelevant variables [
39]. Hoeren mentioned that the actual value of data lies in its usability [
40], and data quality is the most critical concern in model training. Data cleaning is one of the essential solutions in the data preprocessing stage for reducing errors, preventing model bias caused by dirty data, and obtaining the best results [
41]. Therefore, data preprocessing steps such as cleaning, transformation, reduction, and integration should be conducted properly; together they account for 70–80% of the modeling workload [
42]. All the included studies paid attention to this principle.
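As a small illustration of the cleaning stage, the sketch below imputes missing values with the median and flags outliers using the common 1.5 × IQR rule; the tumor-size column is hypothetical:

```python
import statistics

def clean_column(values):
    """Impute missing entries (None) with the median and flag values
    outside 1.5 * IQR, two common steps in the data-cleaning stage."""
    observed = [v for v in values if v is not None]
    med = statistics.median(observed)
    imputed = [med if v is None else v for v in values]
    q1, _, q3 = statistics.quantiles(observed, n=4)
    iqr = q3 - q1
    lo, hi = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in imputed if v < lo or v > hi]
    return imputed, outliers

# Hypothetical tumor-size column (mm): one missing value, one outlier.
print(clean_column([12.0, 15.0, None, 14.0, 13.0, 14.0, 16.0, 90.0]))
```

Whether a flagged value is dirty data or a genuine extreme case is a clinical judgment; cleaning pipelines usually surface such values for review rather than dropping them silently.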
Among all the included articles, six used hyperparameter tuning and feature selection methods in their study [
18,
21,
24‐
26,
28]. Studies often used hyperparameter tuning and feature selection to avoid overfitting or to achieve high-accuracy models [
24,
25]. According to articles [
25,
32], selecting appropriate modeling variables directly affects the model's output. Therefore, feature selection, extraction, reduction, and engineering are necessary to reach an ideal model. Hyperparameter tuning is an essential step in the model-building pipeline and can produce a highly accurate model by finding optimal input parameters; most of the included studies used the grid search method for this operation. Although feature selection in convolutional neural networks is done automatically, background knowledge can enhance the model's reliability. Approaches such as Bayesian optimization and evolutionary algorithms like genetic algorithms [
26] and Artificial Fish Swarm [
18] can be more suitable for hyperparameter tuning and feature selection.
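A minimal sketch of the grid search idea: every combination of hyperparameters is evaluated and the best-scoring one is kept. The parameter names and the stand-in scoring function below are hypothetical:

```python
from itertools import product

def grid_search(score_fn, grid):
    """Exhaustively evaluate every hyperparameter combination in the
    grid and return the best-scoring one (higher score is better)."""
    best_params, best_score = None, float("-inf")
    for combo in product(*grid.values()):
        params = dict(zip(grid.keys(), combo))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score

# Stand-in for a cross-validated accuracy; peaks at depth 4, rate 0.1.
def mock_cv_score(p):
    return -abs(p["max_depth"] - 4) - abs(p["learning_rate"] - 0.1)

grid = {"max_depth": [2, 4, 8], "learning_rate": [0.01, 0.1, 0.3]}
print(grid_search(mock_cv_score, grid))
```

The cost grows multiplicatively with each added hyperparameter, which is exactly why Bayesian optimization and evolutionary methods become attractive on larger search spaces.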
Recently, the use of hybrid and ensemble models has increased in the medical field, especially for predicting survival. Three of the included studies used these methods to predict survival and obtained acceptable accuracy and precision [
18,
19,
21]. Random forest (RF) and extreme gradient boosting (XGBoost) are themselves ensemble learning (EL) algorithms [
26]. Developing and optimizing machine learning models with hybrid and ensemble techniques continuously improves computational efficiency, performance, generalizability, and accuracy [
43]. Ensemble models, like deep learning algorithms, can perform feature selection automatically. In both ensemble and hybrid learning, several weak learners are trained on a specific problem and their outputs are combined to achieve better results [
44].
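The combination step can be illustrated with the simplest ensemble rule, majority voting over several weak learners (the learner outputs below are hypothetical):

```python
def majority_vote(predictions):
    """Combine binary predictions from several weak learners by
    majority vote, the simplest form of ensemble combination."""
    return [1 if sum(votes) * 2 > len(votes) else 0
            for votes in zip(*predictions)]

# Hypothetical outputs of three weak learners on four patients
# (1 = predicted to survive past five years).
learner_a = [1, 0, 1, 1]
learner_b = [1, 1, 0, 1]
learner_c = [0, 0, 1, 1]
print(majority_vote([learner_a, learner_b, learner_c]))
# → [1, 0, 1, 1]
```

Bagging methods such as random forest use exactly this kind of vote over trees, while boosting methods like XGBoost instead weight the learners' contributions.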
Most studies have combined clinical, imaging, and molecular data to achieve greater accuracy when training machine learning models for survival prediction. Articles [
22‐
25] combined clinical data with other data types and achieved greater accuracy and reliability. Most articles that used composite data to predict cervical cancer survival were published from 2021 onwards. Random forest and deep learning were the algorithms most often used for mixed-data modeling. With the help of artificial intelligence, all types of patient data can play a significant role in precision medicine.
With recent advances in artificial intelligence, deep learning algorithms have undeniably gained power as well. Deep learning algorithms are able to recognize patterns in large, extensive, and heterogeneous data. They have also shown an admirable ability to process images, video, text, audio, and signals [
45]. Comparative studies have determined that artificial intelligence performs better than classical statistics [
45]. With the daily advancement of technologies and the rapid expansion of artificial intelligence science, we will see the use of transformers [
46], meta learning [
47] and quantum machine learning [
48] in medical data processing in the near future. Nevertheless, solutions to the questions of interpretability and explainability should be considered together with the immense potential of AI in health research [
49].
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.