Published in: European Journal of Epidemiology 8/2019

3 June 2019 | METHODS

A descriptive review of variable selection methods in four epidemiologic journals: there is still room for improvement

Authors: Denis Talbot, Victoria Kubuta Massamba

Abstract

A review of epidemiological papers conducted in 2009 concluded that several studies employed variable selection methods prone to introducing bias and yielding inadequate inferences. Many new confounder selection methods have been developed since then. The goal of this study was to provide an updated descriptive portrait of the variable selection methods epidemiologists use to analyze observational data. Studies published in four major epidemiological journals in 2015 were reviewed. Only articles that pursued a predictive or explanatory objective and reported on the analysis of individual data were included. The method(s) used to select variables were extracted from the retained articles. A total of 975 articles were retrieved and 299 met the eligibility criteria, 292 of which pursued an explanatory objective. Among those, 146 studies (50%) reported using prior knowledge or causal graphs to select variables, 34 (12%) used change-in-effect-estimate methods, 26 (9%) used stepwise approaches, 16 (5%) employed univariate analyses, 5 (2%) used various other methods, and 107 (37%) did not provide sufficient details to allow classification (more than one method could be employed in a single article). Although less frequent than in the previous review, stepwise and univariate analyses, which are prone to introducing bias and producing inadequate inferences, were still prevalent. Moreover, 37% of studies did not provide sufficient details to assess how variables were selected. We thus believe there is still room for improvement in the variable selection methods used by epidemiologists and in their reporting.
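
As a quick check on the figures above, the following minimal sketch recomputes the reported percentages from the counts given in the abstract, assuming each percentage uses the 292 explanatory studies as its denominator. The dictionary keys and variable names are illustrative labels, not terms defined by the authors.

```python
# Counts reported in the abstract for the studies with an explanatory objective.
# The labels are shorthand chosen for this sketch.
counts = {
    "prior knowledge or causal graphs": 146,
    "change in effect estimate": 34,
    "stepwise": 26,
    "univariate analyses": 16,
    "other methods": 5,
    "insufficient details": 107,
}
n_explanatory = 292  # assumed denominator for every percentage

# Percentage of explanatory studies in each category, rounded to whole numbers.
percentages = {m: round(100 * c / n_explanatory) for m, c in counts.items()}

# Categories are not mutually exclusive, so mentions can exceed the study count.
total_mentions = sum(counts.values())

for method, pct in percentages.items():
    print(f"{method}: {counts[method]} ({pct}%)")
print(f"total mentions: {total_mentions} across {n_explanatory} studies")
```

The category counts sum to more than 292, which is consistent with the abstract's note that a single article could employ more than one method.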
Metadata
Title
A descriptive review of variable selection methods in four epidemiologic journals: there is still room for improvement
Authors
Denis Talbot
Victoria Kubuta Massamba
Publication date
3 June 2019
Publisher
Springer Netherlands
Published in
European Journal of Epidemiology / Issue 8/2019
Print ISSN: 0393-2990
Electronic ISSN: 1573-7284
DOI
https://doi.org/10.1007/s10654-019-00529-y