A comparative analysis of semi-supervised learning: The case of article selection for medical systematic reviews

Information Systems Frontiers

Abstract

Although systematic reviews are an essential element of modern evidence-based medical practice, creating them is resource intensive. To mitigate this problem, there have been attempts to use supervised machine learning to automate the article triage procedure. This approach has proven helpful for updating existing systematic reviews. However, it holds very little promise for creating new reviews, because labeled training data is rarely available when a review is first created. In this research, we assess and compare the applicability of semi-supervised learning to overcome this labeling bottleneck and support the creation of systematic reviews. The results indicate that semi-supervised learning can significantly reduce human effort and is a viable technique for automating medical systematic review creation with a small training dataset.
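To make the idea of learning from a small labeled set concrete, the following is a minimal sketch of self-training, one common form of semi-supervised learning. It is not the authors' method: the abstract does not say which algorithms were compared, and the nearest-centroid classifier, the `margin` threshold, the function names, and the toy citation titles are all assumptions made up for illustration.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words term counts for one whitespace-tokenized, lowercased title."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def centroids(labeled):
    """Sum the term counts of each class into one centroid vector per label."""
    cents = {}
    for text, label in labeled:
        cents.setdefault(label, Counter()).update(vectorize(text))
    return cents

def predict(cents, text):
    """Return (label, margin): the closest class and its lead over the runner-up."""
    vec = vectorize(text)
    scores = sorted(((cosine(vec, c), label) for label, c in cents.items()),
                    reverse=True)
    best_score, best_label = scores[0]
    runner_up = scores[1][0] if len(scores) > 1 else 0.0
    return best_label, best_score - runner_up

def self_train(labeled, unlabeled, margin=0.1, max_rounds=10):
    """Repeatedly pseudo-label confident unlabeled items and refit the centroids."""
    labeled, pool = list(labeled), list(unlabeled)
    for _ in range(max_rounds):
        cents = centroids(labeled)
        confident, rest = [], []
        for text in pool:
            label, lead = predict(cents, text)
            (confident if lead >= margin else rest).append((text, label))
        if not confident:          # nothing new to learn from; stop early
            break
        labeled.extend(confident)  # treat confident predictions as labels
        pool = [text for text, _ in rest]
    return centroids(labeled), labeled

# Hypothetical toy data: two hand-labeled citations and three unlabeled ones.
seed = [
    ("randomized trial of dementia screening in older adults", "include"),
    ("protein interaction extraction from text", "exclude"),
]
unlabeled = [
    "cognitive screening trial in older adults",
    "extraction of protein interaction sentences",
    "dementia screening accuracy in adults",
]
model, grown = self_train(seed, unlabeled)
decision, lead = predict(model, "screening older adults for cognitive impairment")
```

In this sketch the reviewer labels only two citations by hand; the remaining three are labeled by the model itself once its prediction margin exceeds the assumed threshold. In a real triage setting the threshold would need tuning, since confidently wrong pseudo-labels reinforce themselves in later rounds.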

Author information

Corresponding author

Correspondence to Jun Liu.

About this article

Cite this article

Liu, J., Timsina, P. & El-Gayar, O. A comparative analysis of semi-supervised learning: The case of article selection for medical systematic reviews. Inf Syst Front 20, 195–207 (2018). https://doi.org/10.1007/s10796-016-9724-0

  • DOI: https://doi.org/10.1007/s10796-016-9724-0
