Estimation of prediction error by using K-fold cross-validation

Fushiki, Tadayoshi

doi:10.1007/s11222-009-9153-8

Published: 10 October 2009

Volume 21, pages 137–146, (2011)

Statistics and Computing

Abstract

Estimation of prediction accuracy is important when our aim is prediction. The training error is an easy estimate of prediction error, but it has a downward bias. On the other hand, K-fold cross-validation has an upward bias. The upward bias may be negligible in leave-one-out cross-validation, but it sometimes cannot be neglected in 5-fold or 10-fold cross-validation, which are favored from a computational standpoint. Since the training error has a downward bias and K-fold cross-validation has an upward bias, there will be an appropriate estimate in a family that connects the two estimates. In this paper, we investigate two families that connect the training error and K-fold cross-validation.
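
As a rough illustration of the two estimates contrasted in the abstract, the sketch below computes the training error and a K-fold cross-validation error for a one-predictor least-squares fit on simulated data. Everything in it is an assumption made for illustration: the data-generating model, the helper names (`fit_line`, `training_error`, `kfold_cv_error`, `blended_estimate`) and, in particular, the convex combination with weight `lam`, which is only one simple way to connect the two estimates. The abstract does not specify the two families studied in the paper, so this should not be read as the paper's method.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def squared_errors(model, xs, ys):
    """Sum of squared prediction errors of `model` on (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def training_error(xs, ys):
    """Mean squared error on the same data used for fitting."""
    return squared_errors(fit_line(xs, ys), xs, ys) / len(xs)


def kfold_cv_error(xs, ys, k):
    """Mean squared error where each point is predicted by a model
    fitted without the fold that contains it."""
    n = len(xs)
    total = 0.0
    for fold in range(k):
        held_out = set(range(fold, n, k))  # every k-th point forms one fold
        train_x = [xs[i] for i in range(n) if i not in held_out]
        train_y = [ys[i] for i in range(n) if i not in held_out]
        test_x = [xs[i] for i in sorted(held_out)]
        test_y = [ys[i] for i in sorted(held_out)]
        total += squared_errors(fit_line(train_x, train_y), test_x, test_y)
    return total / n


def blended_estimate(train_err, cv_err, lam):
    """Hypothetical one-parameter family: lam=0 gives the training error,
    lam=1 gives the K-fold estimate. An illustrative assumption only."""
    return (1 - lam) * train_err + lam * cv_err


# Simulated data (assumed model: y = 1 + 2x + Gaussian noise).
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(30)]
ys = [1.0 + 2.0 * x + rng.gauss(0.0, 0.5) for x in xs]

train_err = training_error(xs, ys)
cv_err = kfold_cv_error(xs, ys, k=5)
mid_err = blended_estimate(train_err, cv_err, lam=0.5)

print("training error:", train_err)
print("5-fold CV error:", cv_err)
print("blend (lam=0.5):", mid_err)
```

For a least-squares fit with squared-error loss, the cross-validated error computed this way is never smaller than the training error on the same sample, so any value of `lam` between 0 and 1 gives an estimate lying between the two.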

Author information

Authors and Affiliations

The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo, 190-8562, Japan
Tadayoshi Fushiki

Corresponding author

Correspondence to Tadayoshi Fushiki.

About this article

Cite this article

Fushiki, T. Estimation of prediction error by using K-fold cross-validation. Stat Comput 21, 137–146 (2011). https://doi.org/10.1007/s11222-009-9153-8

Received: 14 March 2009
Accepted: 30 September 2009
Published: 10 October 2009
Issue Date: April 2011
DOI: https://doi.org/10.1007/s11222-009-9153-8
