The online version of this article (doi:10.1186/1471-2288-14-15) contains supplementary material, which is available to authorized users.
Dean W Yergens and Daniel J Dutton contributed equally to this work.
DWY is a co-founder of Synthesis Research Inc, which owns the intellectual property for the Synthesis software application. DJD and SP declare that they have no competing interests.
DWY contributed to the study design, software development, analysis and the writing of the manuscript. DJD contributed to the study design, analysis, and editing of the manuscript. SP contributed to the study design and was integral in all aspects of the project and the writing of the manuscript. All authors reviewed, commented on, read and approved the final manuscript.
The Canadian Community Health Survey (CCHS) is a cross-sectional survey that has collected information on health determinants, health status and health system utilization in Canada since 2001. Several hundred articles have been written using the CCHS dataset. Previous analyses of the statistical methods used in the literature have focused on a particular journal or set of journals, with the aim of gauging the statistical literacy needed to understand the published research. In this study, we describe the statistical methods referenced in the published literature that uses the CCHS dataset(s).
A descriptive study was undertaken of references associated with the CCHS and indexed in Medline, Embase, Web of Knowledge and Scopus. These references were imported into a Java application built on the searchable Apache Lucene text database and screened against pre-defined inclusion and exclusion criteria. Full-text PDF articles that met the inclusion criteria were then searched for descriptive, elementary and regression statistical methods. The methods were identified through an automated keyword search of the full-text articles using the Java application.
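The automated keyword search described above can be illustrated with a minimal sketch. This is not the authors' Lucene-based Java application: it is a plain Python approximation, and the keyword lists, the `METHOD_KEYWORDS` name and the `find_methods` function are hypothetical choices made for illustration only.

```python
import re

# Hypothetical keyword lists for illustration; these are not the study's actual search terms.
METHOD_KEYWORDS = {
    "descriptive": ["mean", "proportion", "median"],
    "elementary": ["t-test", "chi-square"],
    "regression": ["logistic regression", "linear regression", "regression"],
}


def find_methods(full_text):
    """Return the set of method categories with at least one keyword in the text."""
    text = full_text.lower()
    found = set()
    for category, keywords in METHOD_KEYWORDS.items():
        for keyword in keywords:
            # Require non-word characters on both sides so "mean" does not
            # match "meaning"; an optional trailing "s" covers simple plurals.
            pattern = r"(?<!\w)" + re.escape(keyword) + r"s?(?!\w)"
            if re.search(pattern, text):
                found.add(category)
                break  # one hit is enough to flag the category
    return found


sample = "We report means and proportions, then fit a logistic regression model."
print(sorted(find_methods(sample)))
```

A category is flagged once per article no matter how many of its keywords appear, which matches reporting at the level of "articles that reference a method" rather than counting individual mentions.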
We identified 4811 references from the 4 bibliographical databases for possible inclusion. After exclusions, 663 references were used for the analysis. Descriptive statistics such as means or proportions were presented in the large majority of the articles (97.7%). Elementary-level statistics such as t-tests were referenced far less often (29.7%). Regression methods were frequently referenced: 79.8% of articles referred to regression in general, and logistic regression was the most common type, appearing in 67.1% of the articles.
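The arithmetic behind figures like these can be sketched as follows. The screening counts (4811 and 663) come from the text. The `category_shares` helper and the four-article `toy` dataset are hypothetical and serve only to show how per-article flags turn into percentages; they do not reproduce the study's data.

```python
from collections import Counter

# Screening figures reported in the text.
identified = 4811
included = 663
retention = 100 * included / identified
print(f"Retained after screening: {retention:.1f}%")


def category_shares(article_categories):
    """Percentage of articles that reference each method category."""
    total = len(article_categories)
    if total == 0:
        return {}
    counts = Counter()
    for categories in article_categories:
        # Count each category at most once per article.
        counts.update(set(categories))
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Hypothetical toy data: one set of flagged categories per article.
toy = [
    {"descriptive", "regression"},
    {"descriptive"},
    {"descriptive", "elementary", "regression"},
    {"descriptive", "regression"},
]
shares = category_shares(toy)
print(shares)
```

Because an article can reference several categories, the shares are not expected to sum to 100%, which is also true of the percentages reported in the text.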
Our study shows a diverse set of analysis methods being referenced in the CCHS literature; however, the literature relies heavily on only a subset of all possible statistical tools. This information can be used to identify gaps in statistical methods that could be applied to future analyses of public health surveys, to inform training and educational programs, and to gauge the level of statistical literacy needed to understand the published literature.
- An overview of the statistical methods reported by studies using the Canadian community health survey
Dean W Yergens
Daniel J Dutton
Scott B Patten
- BioMed Central