Multilevel Analysis, Hierarchical Linear Models

The term “Multilevel Analysis” is mostly used interchangeably with “Hierarchical Linear Modeling,” although strictly speaking these terms are distinct. Multilevel Analysis may be understood to refer broadly to the methodology of research questions and data structures that involve more than one type of unit. It originated in studies involving several levels of aggregation, such as individuals and counties, or pupils, classrooms, and schools. Robinson’s (1950) discussion of the ecological fallacy, in which associations between variables at one level of aggregation are mistakenly regarded as evidence for associations at a different aggregation level (see Alker 1969 for an extensive review), sparked interest in how to analyze data that include several aggregation levels. This situation arises as a matter of course in educational research, and studies of the contributions made by different sources of variation such as students, teachers, classroom composition, school organization, etc., were seminal in the development of statistical methodology in the 1980s (see the review in Chap. 1 of de Leeuw and Meijer 2008). The basic idea is that studying the simultaneous effects of variables at the levels of students, teachers, classrooms, etc., on student achievement requires regression-type models that contain error terms for each of those levels separately; this is similar to the mixed effects models studied in the traditional linear models literature such as Scheffé (1959).

The prototypical statistical model that expresses this is the Hierarchical Linear Model, which is a mixed effects regression model for nested designs. In the two-level situation – applicable, e.g., to a study of students in classrooms – it can be expressed as follows. The more detailed level (students) is called the lower level, or level 1; the grouping level (classrooms) is called the higher level, or level 2. Highlighting the distinction with regular regression models, the terminology speaks of units rather than cases, and there are specific types of unit at each level. In our example, the level-1 units, students, are denoted by \(i\) and the level-2 units, classrooms, by \(j\). Level-1 units are nested in level-2 units (each student is a member of exactly one classroom), and the data structure is allowed to be unbalanced, so that \(j\) runs from 1 to \(N\) while \(i\) runs, for a given \(j\), from 1 to \(n_{j}\). The basic two-level hierarchical linear model can be expressed as

$$Y_{ij} = \beta_{0} + \sum\limits_{h=1}^{r} \beta_{h}\, x_{hij} + U_{0j} + \sum\limits_{h=1}^{p} U_{hj}\, z_{hij} + R_{ij};$$
(1a)

or, more succinctly, as

$$\mathbf{Y} = \mathbf{X}\,\beta + \mathbf{Z\,U} + \mathbf{R}.$$
(1b)

Here \(Y_{ij}\) is the dependent variable, defined for level-1 unit \(i\) within level-2 unit \(j\); the variables \(x_{hij}\) and \(z_{hij}\) are the explanatory variables. The \(R_{ij}\) are residual terms, or error terms, at level 1, while the \(U_{hj}\) for \(h = 0, \ldots, p\) are residual terms, or error terms, at level 2. In the case \(p = 0\) this is called a random intercept model; for \(p \geq 1\) it is called a random slope model. The usual assumption is that all \(R_{ij}\) and all vectors \(\mathbf{U}_{j} = (U_{0j}, \ldots, U_{pj})\) are independent, the \(R_{ij}\) having a normal \(\mathcal{N}(0,\sigma^{2})\) distribution and the \(\mathbf{U}_{j}\) having a multivariate normal \(\mathcal{N}_{p+1}(\mathbf{0},\mathbf{T})\) distribution. The parameters \(\beta_{h}\) are regression coefficients (fixed effects), while the \(U_{hj}\) are random effects. The presence of both of these makes (1) a mixed linear model. In most practical cases, the variables with random effects are a subset of the variables with fixed effects (\(x_{hij} = z_{hij}\) for \(h \leq p\); \(p \leq r\)), but this is not necessary.
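To make this concrete, the following minimal sketch shows how model (1) could be fitted in Python with the statsmodels package; the data file and the column names (score, ses, classroom) are hypothetical and serve only as an illustration of the random intercept and random slope specifications.

# A minimal sketch: fitting the two-level model (1) with statsmodels.
# The data file and column names (score, ses, classroom) are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("students.csv")        # hypothetical: one row per student (level-1 unit)

# Random intercept model (p = 0): only the intercept U_{0j} varies across classrooms.
ri = smf.mixedlm("score ~ ses", df, groups=df["classroom"]).fit()

# Random slope model (p = 1): the intercept and the coefficient of ses both vary
# across classrooms; their 2 x 2 covariance matrix corresponds to T in the text.
rs = smf.mixedlm("score ~ ses", df, groups=df["classroom"], re_formula="~ses").fit()

print(ri.summary())
print(rs.summary())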

More Than Two Levels

This model can be extended to three or more levels for data with three or more nested levels by including random effects at each of these levels. For example, for a three-level structure where level-3 units are denoted by \(k = 1, \ldots, M\), level-2 units by \(j = 1, \ldots, N_{k}\), and level-1 units by \(i = 1, \ldots, n_{jk}\), the model is

$$\begin{array}{rcl}
Y_{ijk} & = & \beta_{0} + \sum\limits_{h=1}^{r} \beta_{h}\, x_{hijk} + U_{0jk} + \sum\limits_{h=1}^{p} U_{hjk}\, z_{hijk} + V_{0k} \\
& & +\, \sum\limits_{h=1}^{q} V_{hk}\, w_{hijk} + R_{ijk},
\end{array}$$
(2)

where the \(U_{hjk}\) are the random effects at level 2, while the \(V_{hk}\) are the random effects at level 3. An example is research into outcome variables \(Y_{ijk}\) of students (\(i\)) nested in classrooms (\(j\)) nested in schools (\(k\)), and the presence of error terms at all three levels provides a basis for testing effects of pupil variables, classroom or teacher variables, as well as school variables.
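As an illustration of the nesting structure described by equation (2), the sketch below simulates data from a three-level random intercept model (the special case \(p = q = 0\)); all sample sizes and parameter values are arbitrary assumptions chosen only for illustration.

# Simulation sketch of the three-level model (2) with random intercepts only;
# all parameter values and sample sizes are illustrative assumptions.
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
M, N_k, n_jk = 20, 10, 25               # schools, classrooms per school, students per classroom
beta0, beta1 = 2.0, 0.5                 # fixed effects
tau_V, tau_U, sigma = 0.6, 0.8, 1.0     # sd of level-3, level-2, and level-1 residuals

rows = []
for k in range(M):                       # level-3 units (schools)
    V_0k = rng.normal(0.0, tau_V)
    for j in range(N_k):                 # level-2 units (classrooms)
        U_0jk = rng.normal(0.0, tau_U)
        for i in range(n_jk):            # level-1 units (students)
            x = rng.normal()
            y = beta0 + beta1 * x + U_0jk + V_0k + rng.normal(0.0, sigma)
            rows.append((k, j, i, x, y))

df3 = pd.DataFrame(rows, columns=["school", "classroom", "student", "x", "y"])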

The development both of inferential methods and of applications was oriented first toward this type of nested model, but much interest is now also given to the more general case in which the restriction to nested random effects is dropped. In this sense, multilevel analysis refers to the methodology of research questions and data structures that involve several sources of variation – each type of unit then refers to a specific source of variation, with or without nesting. In social science applications this can be fruitfully applied to research questions in which different types of actors and contexts are involved; e.g., patients, doctors, hospitals, and insurance companies in health-related research; or students, teachers, schools, and neighborhoods in educational research. The word “level” is then used for such a type of unit. Given the use of random effects, the most natural applications are those where each “level” is associated with some population of units.

Longitudinal Studies

A special area of application of multilevel models is longitudinal studies, in which the lowest level corresponds to repeated observations of the level-2 units. Often the level-2 units are individuals, but they may also be organizations, countries, etc. This application of mixed effects models was pioneered by Laird and Ware (1982). An important advantage of the hierarchical linear model over other statistical models for longitudinal data is the possibility of obtaining parameter estimates and tests also in highly unbalanced situations, where the number of observations per individual, and the time points at which they are measured, differ between individuals. Another advantage is the possibility of seamless integration with the nesting of individuals within higher-level units.
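The sketch below illustrates such an unbalanced longitudinal structure: simulated individuals are observed at different numbers of irregularly spaced time points, which poses no difficulty for a model with a random intercept and a random slope for time. All parameter values are assumptions made only for illustration.

# Sketch of an unbalanced longitudinal data structure: persons (level-2 units) have
# different numbers of occasions at irregular times; values are simulated under
# assumed parameters for illustration only.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
rows = []
for person in range(50):                             # level-2 units (individuals)
    n_obs = rng.integers(2, 8)                       # unequal numbers of occasions
    times = np.sort(rng.uniform(0.0, 5.0, n_obs))    # irregular measurement times
    u0, u1 = rng.multivariate_normal([0.0, 0.0],
                                     [[1.0, 0.2],
                                      [0.2, 0.1]])   # person-specific intercept and slope deviations
    for t in times:
        y = (10.0 + u0) + (0.7 + u1) * t + rng.normal(0.0, 1.0)
        rows.append((person, t, y))

long_df = pd.DataFrame(rows, columns=["person", "time", "y"])
# A random intercept and slope model for time can be fitted to long_df exactly as in
# the two-level sketch above, despite the varying numbers and spacings of occasions.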

Model Specification

The usual considerations for model specification in linear models apply here too, but additional considerations arise from the presence of random effects in the model and from the data structure being nested or having multiple types of unit in some other way. An important practical issue is to avoid the ecological fallacy mentioned above, i.e., to attribute fixed effects to the correct level. One of the examples in the original paper by Robinson (1950) concerned the correlation between literacy and ethnic background as measured in the USA in the 1930s, computed either as a correlation at the individual level or at the level of averages for large geographical regions. The correlation was .203 between individuals and .946 between regions, illustrating how widely correlations at different levels of aggregation may differ.

Consider a two-level model (1) in which variable \(X_{1}\) with values \(x_{1ij}\) is defined as a level-1 variable – literacy in Robinson’s example. For “level-2 units” we also use the term “groups.” To avoid the ecological fallacy, one has to include a relevant level-2 variable that reflects the composition of the level-2 units with respect to variable \(X_{1}\). The most commonly used composition variable is the group mean of \(X_{1}\),

$$\bar{x}_{1.j} = \frac{1}{n_{j}} \sum\limits_{i=1}^{n_{j}} x_{1ij}.$$

The usual procedure then is to include \(x_{1ij}\) as well as \(\bar{x}_{1.j}\) among the explanatory variables with fixed effects. This allows separate estimation of the within-group regression (the coefficient of \(x_{1ij}\)) and the between-group regression (the sum of the coefficients of \(x_{1ij}\) and \(\bar{x}_{1.j}\)).
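A minimal sketch of this procedure is given below, assuming a hypothetical data set with columns y, x1, and group; the group mean is added as an explicit level-2 composition variable, and the within- and between-group coefficients are then recovered from the estimated fixed effects.

# Sketch: adding the group mean of a level-1 variable so that within-group and
# between-group regressions can be separated. Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("two_level_data.csv")                          # hypothetical data set
df["x1_groupmean"] = df.groupby("group")["x1"].transform("mean")

m = smf.mixedlm("y ~ x1 + x1_groupmean", df, groups=df["group"]).fit()

within_coef = m.fe_params["x1"]                                 # within-group regression coefficient
between_coef = m.fe_params["x1"] + m.fe_params["x1_groupmean"]  # between-group regression coefficient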

In some cases, notably in many economic studies (see Greene 2003), researchers are especially interested in the within-group regression coefficients and wish to control for possible unmeasured heterogeneity between the groups. If there is no interest in the between-group regression coefficients, one may use a model with fixed effects for all the groups; in the simplest case this is

$$Y_{ij} = \beta_{0} + \sum\limits_{h=1}^{r} \beta_{h}\, x_{hij} + \gamma_{j} + R_{ij}.$$
(3)

The parameters \(\gamma_{j}\) (which here have to be restricted, e.g., to have mean 0, in order to achieve identifiability) then represent all differences between the level-2 units, insofar as these differences apply as a constant additive term to all level-1 units within the group. For example, in the case of longitudinal studies where level-2 units are individuals and a linear model is used, this represents all time-constant differences between individuals. Note that (3) is a linear model with only one error term.
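A sketch of this fixed effects specification, again for the hypothetical data set used above, is shown below; the default treatment coding fixes one \(\gamma_{j}\) at zero, which is an identification equivalent to the zero-mean restriction just mentioned.

# Sketch of the fixed effects specification (3): group differences are absorbed by
# dummy variables rather than by a random intercept. Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

df = pd.read_csv("two_level_data.csv")       # hypothetical data set, as above

# C(group) adds a dummy variable for each level-2 unit; only within-group variation
# in x1 is then used to estimate its coefficient.
fe = smf.ols("y ~ x1 + C(group)", data=df).fit()
print(fe.params["x1"])                       # within-group regression coefficient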

Model (1) implies the distribution

$$\mathbf{Y} \sim \mathcal{N}\left(\mathbf{X}\,\beta,\; \mathbf{Z}\,\mathbf{T}\,\mathbf{Z}' + \sigma^{2}\mathbf{I}\right).$$

Generalizations are possible where the level-1 residual terms \(R_{ij}\) are not i.i.d.; they can be heteroscedastic, have time-series dependence, etc. The specification of the variables \(\mathbf{Z}\) having random effects is crucial to obtain a well-fitting model. See Chap. 9 of Snijders and Bosker (1999), Chap. 9 of Raudenbush and Bryk (2002), and Chap. 3 of de Leeuw and Meijer (2008).
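As a small numerical illustration of the marginal covariance structure implied above, the sketch below computes \(\mathbf{Z}\,\mathbf{T}\,\mathbf{Z}' + \sigma^{2}\mathbf{I}\) for a single group with a random intercept and one random slope; the numerical values of \(\mathbf{T}\), \(\sigma^{2}\), and the design are arbitrary assumptions.

# Numerical sketch of the implied within-group covariance matrix Z T Z' + sigma^2 I
# for one group; all values are illustrative assumptions.
import numpy as np

z = np.array([0.0, 1.0, 2.0])               # values of the random-slope variable for 3 level-1 units
Z = np.column_stack([np.ones_like(z), z])   # columns: random intercept, random slope
T = np.array([[0.8, 0.1],
              [0.1, 0.3]])                  # assumed covariance matrix of (U_0j, U_1j)
sigma2 = 1.0                                # assumed level-1 residual variance

V = Z @ T @ Z.T + sigma2 * np.eye(len(z))   # implied covariance matrix of Y within the group
print(V)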

Inference

A major reason for the take-off of multilevel analysis in the 1980s was the development of algorithms for maximum likelihood estimation in unbalanced nested designs. The EM algorithm (Dempster et al. 1981), Iterative Generalized Least Squares (Goldstein 1986), and Fisher Scoring (Longford 1987) were applied to obtain ML estimates for hierarchical linear models. The MCMC implementation of Bayesian procedures has proved very useful for a large variety of more complex multilevel models, both for non-nested random effects and for generalized linear mixed models; see Browne and Draper (2000) and Chap. 2 of de Leeuw and Meijer (2008).

Hypothesis tests for the fixed coefficients \(\beta_{h}\) can be carried out by Wald or likelihood ratio tests in the usual way. For testing parameters of the random effects, some care must be taken because the estimates of the random effect variances \(\tau_{hh}^{2}\) (the diagonal elements of \(\mathbf{T}\)) are not approximately normally distributed if \(\tau_{hh}^{2} = 0\). Tests for these parameters can be based on estimated fixed effects, using least squares estimates for the \(U_{hj}\) in a specification where these are treated as fixed effects (Raudenbush and Bryk 2002, Chap. 3); based on the appropriate non-standard distribution of the log likelihood ratio, which arises because the null value lies on the boundary of the parameter space; or obtained as score tests (Berkhof and Snijders 2001).
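As an illustration of the boundary problem, the sketch below compares a random intercept model and a random slope model fitted by ML and computes the likelihood ratio statistic; the mixture reference distribution shown (a 50:50 mixture of chi-square distributions with 1 and 2 degrees of freedom, a commonly used approximation for this comparison) is an assumption of the sketch rather than a prescription from the text, and the data and column names are hypothetical.

# Sketch of a likelihood ratio comparison for a random slope variance; the halved
# degrees of freedom in the mixture approximation reflect the boundary problem.
import pandas as pd
import statsmodels.formula.api as smf
from scipy import stats

df = pd.read_csv("students.csv")   # hypothetical data set, as in the earlier sketch

# Both models are fitted by ML so that their log likelihoods are comparable.
m0 = smf.mixedlm("score ~ ses", df, groups=df["classroom"]).fit(reml=False)
m1 = smf.mixedlm("score ~ ses", df, groups=df["classroom"], re_formula="~ses").fit(reml=False)

lr = 2 * (m1.llf - m0.llf)                  # likelihood ratio statistic for the random slope
p_naive = stats.chi2.sf(lr, df=2)           # naive reference: chi-square with 2 df
p_mixed = 0.5 * stats.chi2.sf(lr, df=1) + 0.5 * stats.chi2.sf(lr, df=2)  # boundary-corrected mixture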

About the Author

Professor Snijders is Elected Member of the European Academy of Sociology (2006) and Elected Correspondent of the Royal Netherlands Academy of Arts and Sciences (2007). He was made Knight of the Order of the Netherlands Lion (2008). Professor Snijders was Chairman of the Department of Statistics, Measurement Theory, and Information Technology of the University of Groningen (1997–2000). He has supervised 52 Ph.D. students. He has been associate editor of various journals, and Editor of Statistica Neerlandica (1986–1990). Currently he is Co-editor of Social Networks, Associate Editor of the Annals of Applied Statistics, and Associate Editor of the Journal of Social Structure. Professor Snijders has (co-)authored about 100 refereed papers and several books, including Multilevel Analysis: An Introduction to Basic and Advanced Multilevel Modeling (with R.J. Bosker; London: Sage Publications, 1999). In 2005, he was awarded an honorary doctorate in the Social Sciences from the University of Stockholm.

Cross References

Bayesian Statistics

Cross Classified and Multiple Membership Multilevel Models

Mixed Membership Models

Moderating and Mediating Variables in Psychological Research

Nonlinear Mixed Effects Models

Research Designs

Statistical Analysis of Longitudinal and Correlated Data

Statistical Inference in Ecology