Electronic supplementary material
The online version of this article (doi:10.1186/1472-6947-12-54) contains supplementary material, which is available to authorized users.
The authors declare that they have no competing interests.
MT (Takada) carried out the statistical analysis. MS performed data-mining analysis. MT, MS and YN drafted the manuscript. HM, WH and DN collected the validation data and drafted the manuscript. MK helped to design the study and helped to draft the manuscript. KK collected the training data. HS, TI and MT (Tomita) helped to design the study. MT (Toi) conceived the fundamental idea, designed the study and drafted the manuscript. All authors read and approved the final manuscript.
The aim of this study was to develop a new data-mining model to predict axillary lymph node (AxLN) metastasis in primary breast cancer. To this end, we used the alternating decision tree (ADTree), a decision tree-based prediction method.
Clinical datasets for primary breast cancer patients who underwent sentinel lymph node biopsy or AxLN dissection without prior treatment were collected from three institutes (institute A, n = 148; institute B, n = 143; institute C, n = 174) and were used for variable selection, model training and external validation, respectively. The models were evaluated by the area under the receiver operating characteristic (ROC) curve, which measures how well a model discriminates node-positive patients from node-negative patients.
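As an illustration of the evaluation measure, the area under the ROC curve equals the probability that a randomly chosen node-positive patient receives a higher model score than a randomly chosen node-negative patient, with ties counted as one half. The following is a minimal sketch of that calculation, not the software used in the study; the function name `roc_auc` and the toy labels and scores are assumptions made up for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for node-positive, 0 for node-negative (hypothetical coding).
    scores: model outputs, where a higher score means more likely node-positive.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive patient ranked above negative patient
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not from the study).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.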
The ADTree model selected 15 of 24 clinicopathological variables in the variable selection dataset. The resulting area under the ROC curve was 0.770 (95% confidence interval [CI]: 0.689–0.850) for the model training dataset and 0.772 (95% CI: 0.689–0.856) for the validation dataset, demonstrating the high accuracy and generalization ability of the model. The bootstrap value for the validation dataset was 0.768 (95% CI: 0.763–0.774).
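The abstract does not state how the bootstrap value and its interval were computed. One common approach, assumed here purely for illustration, is to resample patients with replacement, recompute the area under the ROC curve on each resample, and report the mean together with the 2.5th and 97.5th percentiles. The names `roc_auc` and `bootstrap_auc`, the number of resamples, the seed, and the toy data are all hypothetical.

```python
import random


def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    """Percentile bootstrap for the AUC (an assumed method, see lead-in)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(labels)
    aucs = []
    while len(aucs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        ys = [labels[i] for i in idx]
        if len(set(ys)) < 2:
            continue  # resample lacks one class, so the AUC is undefined
        aucs.append(roc_auc(ys, [scores[i] for i in idx]))
    aucs.sort()
    mean = sum(aucs) / n_boot
    lower = aucs[int(0.025 * n_boot)]       # about the 2.5th percentile
    upper = aucs[int(0.975 * n_boot) - 1]   # about the 97.5th percentile
    return mean, lower, upper


# Toy data, invented for illustration only (not from the study).
labels = [1, 0, 1, 0, 0, 1, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8, 0.4, 0.3]
mean, lower, upper = bootstrap_auc(labels, scores, n_boot=200)
```

Skipping resamples that contain only one class is a design choice of this sketch; with datasets of the size reported here such resamples would be extremely rare.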
Our prediction model showed high accuracy for predicting nodal metastasis in patients with breast cancer using commonly recorded clinical variables. Therefore, our model might help oncologists in the decision-making process for primary breast cancer patients before starting treatment.