Abstract
Many applications involve the use of binary classifiers, including applications where safety and security are critical. The quantitative assessment of such classifiers typically involves receiver operating characteristic (ROC) methods and the estimation of sensitivity/specificity. But such techniques have their limitations. For safety/security critical applications, more relevant measures of reliability and risk should be estimated. Moreover, ROC techniques do not explicitly account for: 1) the inherent uncertainties one faces during assessments, 2) reliability evidence other than the observed failure behaviour of the classifier, and 3) how this observed failure behaviour alters one’s uncertainty about classifier reliability. We address these limitations using conservative Bayesian inference (CBI) methods, producing statistically principled, conservative values for risk/reliability measures of interest. Our analyses reveal trade-offs amongst all binary classifiers with the same expected loss – the most reliable classifiers are those most likely to experience high-impact failures. This trade-off is harnessed by using diverse redundant binary classifiers.
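The loss-size/reliability trade-off stated in the abstract can be made concrete with a small sketch (all probabilities and loss values below are invented for illustration, not taken from the paper): two hypothetical classifiers can have identical expected loss per demand, yet the one that fails least often is exactly the one whose failures, when they do occur, are most costly.

```python
# Illustrative sketch (hypothetical numbers): two binary classifiers with
# the SAME expected loss per demand but different reliability profiles.

def expected_loss(p_fail: float, loss_given_failure: float) -> float:
    """Expected loss per classification = P(failure) * conditional loss."""
    return p_fail * loss_given_failure

# Classifier A: rarely fails, but its failures are high impact.
p_fail_A, loss_A = 0.001, 100.0
# Classifier B: fails often, but each failure is low impact.
p_fail_B, loss_B = 0.1, 1.0

# Both have expected loss 0.1 per demand (up to floating-point rounding).
assert abs(expected_loss(p_fail_A, loss_A) - expected_loss(p_fail_B, loss_B)) < 1e-12

def reliability(p_fail: float, n: int) -> float:
    """P(no failure over n independent demands)."""
    return (1.0 - p_fail) ** n

n = 100
print(f"A: reliability over {n} demands = {reliability(p_fail_A, n):.4f}")
print(f"B: reliability over {n} demands = {reliability(p_fail_B, n):.2e}")
# A survives 100 demands with high probability; B almost surely fails at
# least once -- yet A's failures cost 100x more when they happen.
```

This is only the deterministic book-keeping behind the trade-off; the paper's contribution is the conservative Bayesian treatment of one's uncertainty about these quantities.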
Notes
1. View this as the conditional expected loss, given the occurrence of the relevant error.
2. In [31], the focus was on uncertainty about the value of a system’s probability of failure. By contrast, our result applies to uncertainty about whether a given classifier will fail on its next classification task, and about the loss incurred if it does.
3. From these, the expected loss for any of the adjudication functions may be computed.
References
Bartlett, P., Jordan, M., McAuliffe, J.: Convexity, classification, and risk bounds. J. Am. Stat. Assoc. 101, 138–156 (2006). https://doi.org/10.1198/016214505000000907
Bishop, P., Bloomfield, R., Littlewood, B., Povyakalo, A., Wright, D.: Toward a formalism for conservative claims about the dependability of software-based systems. IEEE Trans. Softw. Eng. 37(5), 708–717 (2011)
Blough, D.M., Sullivan, G.F.: A comparison of voting strategies for fault-tolerant distributed systems. In: Proceedings Ninth Symposium on Reliable Distributed Systems, pp. 136–145 (1990). https://doi.org/10.1109/RELDIS.1990.93959
Box, G.E., Tiao, G.C.: Nature of Bayesian Inference, chap. 1, pp. 1–75. Wiley (2011). https://doi.org/10.1002/9781118033197.ch1
Buda, M., Maki, A., Mazurowski, M.A.: A systematic study of the class imbalance problem in convolutional neural networks. Neural Netw. 106, 249–259 (2018). https://doi.org/10.1016/j.neunet.2018.07.011
Dembczyński, K., Kotłowski, W., Koyejo, O., Natarajan, N.: Consistency analysis for binary classification revisited. In: Precup, D., Teh, Y.W. (eds.) Proceedings of the 34th International Conference on Machine Learning. Proceedings of Machine Learning Research, PMLR, International Convention Centre, Sydney, Australia, vol. 70, pp. 961–969, 06–11 August 2017. http://proceedings.mlr.press/v70/dembczynski17a.html
Di Giandomenico, F., Strigini, L.: Adjudicators for diverse-redundant components. In: Proceedings Ninth Symposium on Reliable Distributed Systems, pp. 114–123, October 1990. https://doi.org/10.1109/RELDIS.1990.93957
Fawcett, T.: An introduction to ROC analysis. Pattern Recogn. Lett. 27(8), 861–874 (2006). https://doi.org/10.1016/j.patrec.2005.10.010
Fawcett, T., Flach, P.A.: A response to Webb and Ting’s on the application of ROC analysis to predict classification performance under varying class distributions. Mach. Learn. 58(1), 33–38 (2005). https://doi.org/10.1007/s10994-005-5256-4
Flach, P., Shaomin, W.: Repairing concavities in ROC curves. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence (IJCAI 2005), IJCAI, pp. 702–707, August 2005
Gaffney, J.E., Ulvila, J.W.: Evaluation of intrusion detectors: a decision theory approach. In: Proceedings of the 2001 IEEE Symposium on Security and Privacy, pp. 50–61. IEEE (2001). http://dl.acm.org/citation.cfm?id=882495.884438
Gaffney, J.E., Ulvila, J.W.: Evaluation of intrusion detection systems. J. Res. Natl. Inst. Stand. Technol. 108(6), 453–473 (2003)
Gelman, A., Carlin, J.B., Stern, H.S., Rubin, D.B.: Bayesian Data Analysis, 2nd edn. Chapman and Hall/CRC (2004)
Gelman, A., Shalizi, C.R.: Philosophy and the practice of Bayesian statistics. Br. J. Math. Stat. Psychol. 66(1), 8–38 (2013). https://doi.org/10.1111/j.2044-8317.2011.02037.x
Hand, D.J., Till, R.J.: A simple generalisation of the area under the ROC curve for multiple class classification problems. Mach. Learn. 45(2), 171–186 (2001). https://doi.org/10.1023/A:1010920819831
Japkowicz, N., Stephen, S.: The class imbalance problem: a systematic study. Intell. Data Anal. 6(5), 429–449 (2002)
Koyejo, O.O., Natarajan, N., Ravikumar, P.K., Dhillon, I.S.: Consistent binary classification with generalized performance metrics. In: Ghahramani, Z., Welling, M., Cortes, C., Lawrence, N.D., Weinberger, K.Q. (eds.) Advances in Neural Information Processing Systems, vol. 27, pp. 2744–2752. Curran Associates, Inc. (2014)
Littlewood, B., Salako, K., Strigini, L., Zhao, X.: On reliability assessment when a software-based system is replaced by a thought-to-be-better one. Reliab. Eng. Syst. Saf. 197, 106752 (2020). https://doi.org/10.1016/j.ress.2019.106752
Markowitz, H.M.: Portfolio selection. J. Finan. 7(1), 77–91 (1952)
Markowitz, H.M.: Portfolio Selection, Efficient Diversification of Investments. Wiley, Hoboken (1959)
Nayak, J., Naik, B., Behera, D.H.: A comprehensive survey on support vector machine in data mining tasks: applications and challenges. Int. J. Database Theory Appl. 8, 169–186 (2015). https://doi.org/10.14257/ijdta.2015.8.1.18
Pouyanfar, S., et al.: A survey on deep learning: algorithms, techniques, and applications. ACM Comput. Surv. 51(5) (2018). https://doi.org/10.1145/3234150
Provost, F., Fawcett, T.: Robust classification systems for imprecise environments. In: Proceedings of AAAI 1998, pp. 706–713. AAAI press (1998)
Provost, F., Fawcett, T.: Analysis and visualization of classifier performance: comparison under imprecise class and cost distributions. In: Proceedings of the Third International Conference on Knowledge Discovery and Data Mining, KDD 1997, pp. 43–48. AAAI Press (1997)
Provost, F., Fawcett, T.: Robust classification for imprecise environments. Mach. Learn. 42(3), 203–231 (2001). https://doi.org/10.1023/A:1007601015854
RavinderReddy, R., Kavya, B., Yellasiri, R.: A survey on SVM classifiers for intrusion detection. Int. J. Comput. Appl. 98, 34–44 (2014). https://doi.org/10.5120/17294-7779
Schilling, R.: Measures, Integrals and Martingales, 2nd edn. Cambridge University Press, Cambridge (2017)
Scott, M.J.J., Niranjan, M., Prager, R.W.: Realisable classifiers: improving operating performance on variable cost problems. In: Proceedings of the British Machine Vision Conference, pp. 31.1–31.10. BMVA Press (1998). https://doi.org/10.5244/C.12.31
Sharpe, W.F.: Capital asset prices: a theory of market equilibrium under conditions of risk. J. Finan. 19(3), 425–442 (1964)
Strigini, L., Povyakalo, A.: Software fault-freeness and reliability predictions. In: Bitsch, F., Guiochet, J., Kaâniche, M. (eds.) SAFECOMP 2013. LNCS, vol. 8153, pp. 106–117. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-40793-2_10
Strigini, L., Wright, D.: Bounds on survival probability given mean probability of failure per demand; and the paradoxical advantages of uncertainty. Reliab. Eng. Syst. Saf. 128, 66–83 (2014). https://doi.org/10.1016/j.ress.2014.02.004
Swets, J., Dawes, R., Monahan, J.: Better decisions through science. Sci. Am. 283, 82–87 (2000). https://doi.org/10.1038/scientificamerican1000-82
Swets, J.A.: Measuring the accuracy of diagnostic systems. Science 240(4857), 1285–1293 (1988)
Webb, G., Ting, K.: On the application of ROC analysis to predict classification performance under varying class distributions. Mach. Learn. 58, 25–32 (2005). https://doi.org/10.1007/s10994-005-4257-7
Zhao, X., Littlewood, B., Povyakalo, A., Strigini, L., Wright, D.: Modeling the probability of failure on demand (pfd) of a 1-out-of-2 system in which one channel is ‘quasi-perfect’. Reliab. Eng. Syst. Saf. 158, 230–245 (2017)
Zhao, X., Littlewood, B., Povyakalo, A., Strigini, L., Wright, D.: Conservative claims for the probability of perfection of a software-based system using operational experience of previous similar systems. Reliab. Eng. Syst. Saf. 175, 265–282 (2018). https://doi.org/10.1016/j.ress.2018.03.032
Zhao, X., Robu, V., Flynn, D., Salako, K., Strigini, L.: Assessing the safety and reliability of autonomous vehicles from road testing. In: The 30th International Symposium on Software Reliability Engineering (ISSRE), Berlin, Germany. IEEE (2019, in press)
Acknowledgment
This work was supported by the European Commission through the H2020 programme under grant agreement 700692 (DiSIEM). My thanks to the anonymous reviewers for their helpful suggestions for improving the presentation.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this paper
Salako, K. (2020). Loss-Size and Reliability Trade-Offs Amongst Diverse Redundant Binary Classifiers. In: Gribaudo, M., Jansen, D.N., Remke, A. (eds) Quantitative Evaluation of Systems. QEST 2020. Lecture Notes in Computer Science, vol. 12289. Springer, Cham. https://doi.org/10.1007/978-3-030-59854-9_8
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-59853-2
Online ISBN: 978-3-030-59854-9