Original Article

How Low Can You Go?

An Investigation of the Influence of Sample Size and Model Complexity on Point and Interval Estimates in Two-Level Linear Models

Published Online: https://doi.org/10.1027/1614-2241/a000062

Although general sample size guidelines have been suggested for estimating multilevel models, they generalize only to a relatively limited number of data conditions and model structures, neither of which is very feasible for the applied researcher. In an effort to expand our understanding of two-level multilevel models under less than ideal conditions, Monte Carlo methods, implemented in SAS/IML, were used to examine model convergence rates, parameter point estimates (statistical bias), parameter interval estimates (confidence interval accuracy and precision), and both Type I error control and statistical power of tests associated with the fixed effects from linear two-level models estimated with PROC MIXED. These outcomes were analyzed as a function of: (a) level-1 sample size, (b) level-2 sample size, (c) intercept variance, (d) slope variance, (e) collinearity, and (f) model complexity. Bias was minimal across nearly all conditions simulated. The 95% confidence interval coverage and Type I error rate tended to be slightly conservative. The degree of statistical power was related to sample sizes and level of fixed effects; higher power was observed with larger sample sizes and for level-1 fixed effects.
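The outcome measures named above (bias, confidence interval coverage and width, Type I error rate) can be illustrated with a small Monte Carlo sketch. This is not the authors' SAS/IML and PROC MIXED code. It is a deliberately simplified stand-in that assumes a balanced, intercept-only random-intercept model, for which the mean of the cluster means estimates the fixed intercept and a t interval on the cluster means applies. The function names, the parameter values (10 clusters of 5, intercept variance 0.3, residual variance 1.0), and the seed are all hypothetical choices made for illustration.

```python
import random
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom.
# It matches the default of 10 clusters below; change both together.
T_CRIT_DF9 = 2.262


def simulate_once(rng, n_clusters, cluster_size, gamma00, tau00, sigma2, t_crit):
    """Generate one balanced two-level data set; return (estimate, lower, upper)."""
    cluster_means = []
    for _ in range(n_clusters):
        u0 = rng.gauss(0.0, tau00 ** 0.5)  # level-2 random intercept
        ys = [gamma00 + u0 + rng.gauss(0.0, sigma2 ** 0.5)  # level-1 outcomes
              for _ in range(cluster_size)]
        cluster_means.append(statistics.fmean(ys))
    # Balanced intercept-only case: the fixed intercept is estimated by the
    # mean of the cluster means, with a standard error based on their spread.
    estimate = statistics.fmean(cluster_means)
    se = statistics.stdev(cluster_means) / n_clusters ** 0.5
    return estimate, estimate - t_crit * se, estimate + t_crit * se


def run_simulation(replications=2000, n_clusters=10, cluster_size=5,
                   gamma00=0.0, tau00=0.3, sigma2=1.0,
                   t_crit=T_CRIT_DF9, seed=2024):
    """Summarize bias, CI coverage, mean CI width, and rejection rate of H0: intercept = 0."""
    rng = random.Random(seed)
    estimates, widths = [], []
    covered = 0
    rejected = 0
    for _ in range(replications):
        est, lower, upper = simulate_once(
            rng, n_clusters, cluster_size, gamma00, tau00, sigma2, t_crit)
        estimates.append(est)
        widths.append(upper - lower)
        if lower <= gamma00 <= upper:  # interval captures the true value
            covered += 1
        if not (lower <= 0.0 <= upper):  # test of H0: intercept = 0 rejects
            rejected += 1
    return {
        "bias": statistics.fmean(estimates) - gamma00,
        "coverage": covered / replications,
        "mean_width": statistics.fmean(widths),
        # A Type I error rate only when gamma00 is 0, as in the default.
        "type1_error": rejected / replications,
    }


if __name__ == "__main__":
    for name, value in run_simulation().items():
        print(f"{name}: {value:.4f}")
```

Under these assumptions, coverage should fall near the nominal 95% and the rejection rate near 5%. Studying the other factors in the abstract (slope variance, collinearity, model complexity) would require fitting a full mixed model, which this sketch does not attempt.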
