Electronic supplementary material
The online version of this article (https://doi.org/10.1186/s12911-019-0878-9) contains supplementary material, which is available to authorized users.
Matthias Eberl and Simone M. Cuff contributed equally to this work.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
A substantial proportion of microbiological screening in diagnostic laboratories is due to suspected urinary tract infections (UTIs), yet approximately two thirds of urine samples typically yield negative culture results. By reducing the number of query samples to be cultured and enabling diagnostic services to concentrate on those in which there are true microbial infections, a significant improvement in efficiency of the service is possible.
Screening process for urine samples prior to culture was modelled in a single clinical microbiology laboratory covering three hospitals and community services across Bristol and Bath, UK. Retrospective analysis of all urine microscopy, culture, and sensitivity reports over one year was used to compare two methods of classification: a heuristic model using a combination of white blood cell count and bacterial count, and a machine learning approach testing three algorithms (Random Forest, Neural Network, Extreme Gradient Boosting) whilst factoring in independent variables including demographics, historical urine culture results, and clinical details provided with the specimen.
A total of 212,554 urine reports were analysed. Initial findings demonstrated the potential for using machine learning algorithms, which outperformed the heuristic model in terms of relative workload reduction achieved at a classification sensitivity > 95%. Upon further analysis of classification sensitivity of subpopulations, we concluded that samples from pregnant patients and children (age 11 or younger) require independent evaluation. First the removal of pregnant patients and children from the classification process was investigated but this diminished the workload reduction achieved. The optimal solution was found to be three Extreme Gradient Boosting algorithms, trained independently for the classification of pregnant patients, children, and then all other patients. When combined, this system granted a relative workload reduction of 41% and a sensitivity of 95% for each of the stratified patient groups.
Based on the considerable time and cost savings achieved, without compromising the diagnostic performance, the heuristic model was successfully implemented in routine clinical practice in the diagnostic laboratory at Severn Pathology, Bristol. Our work shows the potential application of supervised machine learning models in improving service efficiency at a time when demand often surpasses resources of public healthcare providers.