Elsevier

Journal of Econometrics

Volume 142, Issue 2, February 2008, Pages 615-635
Regression discontinuity designs: A guide to practice

https://doi.org/10.1016/j.jeconom.2007.05.001

Abstract

In regression discontinuity (RD) designs for evaluating causal effects of interventions, assignment to a treatment is determined at least partly by the value of an observed covariate lying on either side of a fixed threshold. These designs were first introduced in the evaluation literature by Thistlewaite and Campbell [1960. Regression-discontinuity analysis: an alternative to the ex-post facto experiment. Journal of Educational Psychology 51, 309–317]. With the exception of a few unpublished theoretical papers, these methods did not attract much attention in the economics literature until recently. Starting in the late 1990s, there has been a large number of studies in economics applying and extending RD methods. In this paper we review some of the practical and theoretical issues in implementation of RD methods.

Introduction

Since the late 1990s there has been a large number of studies in economics applying and extending regression discontinuity (RD) methods, including Van Der Klaauw (2002), Black (1999), Angrist and Lavy (1999), Lee (2007), Chay and Greenstone (2005), DiNardo and Lee (2004), Chay et al. (2005), and Card et al. (2006). Key theoretical and conceptual contributions include the interpretation of estimates for fuzzy regression discontinuity (FRD) designs allowing for general heterogeneity of treatment effects (Hahn et al., 2001; hereafter HTV), adaptive estimation methods (Sun, 2005), specific methods for choosing bandwidths (Ludwig and Miller, 2005), and various tests for discontinuities in means and distributions of non-affected variables (Lee, 2007; McCrary, 2007).

In this paper, we review some of the practical issues in implementation of RD methods. Relatively little in this discussion is novel. Our general goal is instead to address practical issues in implementing RD designs and to review some of the new theoretical developments.

After reviewing some basic concepts in Section 2, the paper focuses on five specific issues in the implementation of RD designs. In Section 3 we stress graphical analyses as powerful methods for illustrating the design. In Section 4 we discuss estimation and suggest using local linear regression methods using only the observations close to the discontinuity point. In Section 5 we propose choosing the bandwidth using cross-validation. In Section 6 we provide a simple plug-in estimator for the asymptotic variance and a second estimator that exploits the link with instrumental variable methods derived by HTV. In Section 7 we discuss a number of specification tests and sensitivity analyses based on tests for (a) discontinuities in the average values for covariates, (b) discontinuities in the conditional density of the forcing variable, as suggested by McCrary, and (c) discontinuities in the average outcome at other values of the forcing variable.

Section snippets

Basics

Our discussion will frame the RD design in the context of the modern literature on causal effects and treatment effects, using the Rubin Causal Model (RCM) setup with potential outcomes (Rubin, 1974; Holland, 1986; Imbens and Rubin, 2007), rather than the regression framework that was originally used in this literature. For a general discussion of the RCM and its use in the economic literature, see the survey by Imbens and Wooldridge (2007).

In the basic setting for the RCM (and for the RD

Nonparametric regression at the boundary

The practical estimation of the treatment effect τ in both the SRD and FRD designs is largely a standard nonparametric regression problem (e.g., Pagan and Ullah, 1999; Härdle, 1990; Li and Racine, 2007). However, there are two unusual features. In this case we are interested in the regression function at a single point, and in addition that single point is a boundary point. As a result, standard nonparametric kernel regression does not work very well. At boundary points, such estimators have a
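To make the boundary problem concrete, the following is a minimal sketch of a local linear estimate of the jump at the cutoff in a sharp design. It assumes a rectangular (uniform) kernel, so it reduces to ordinary least squares on the observations within a distance h of the cutoff, fitted separately on each side. The function names, the argument names, and the convention that observations exactly at the cutoff count as treated are illustrative assumptions, not notation from the paper.

```python
import numpy as np

def local_linear_boundary(x, y, cutoff, h, side):
    """Fit y = a + b * (x - cutoff) by least squares on one side of the
    cutoff, using only observations within distance h (rectangular kernel).
    Returns the intercept a, the estimated limit of the regression
    function as x approaches the cutoff from that side."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if side == "right":
        mask = (x >= cutoff) & (x <= cutoff + h)  # assumption: x == cutoff is treated
    else:
        mask = (x < cutoff) & (x >= cutoff - h)
    design = np.column_stack([np.ones(mask.sum()), x[mask] - cutoff])
    coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
    return coef[0]

def srd_estimate(x, y, cutoff, h):
    """Sharp RD estimate: difference between the two boundary intercepts."""
    right = local_linear_boundary(x, y, cutoff, h, "right")
    left = local_linear_boundary(x, y, cutoff, h, "left")
    return right - left
```

Because each side gets its own intercept and slope, the estimate at the cutoff is the fitted value of a line rather than a local average, which is the feature that reduces the bias of kernel averages at a boundary point.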

Bandwidth selection

An important issue in practice is the selection of the smoothing parameter, the bandwidth h. In general there are two approaches to choosing bandwidths. A first approach consists of characterizing the optimal bandwidth in terms of the unknown joint distribution of all variables. The relevant components of this distribution can then be estimated, and plugged into the optimal bandwidth function. The second approach, on which we focus here, is based on a cross-validation procedure. The specific
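The excerpt breaks off before the details of the procedure, so the following is only a sketch of one plausible cross-validation criterion, under stated assumptions. Each observation is predicted from a local linear fit that uses only observations on its far side from the cutoff (to its left if it lies below the cutoff, to its right otherwise), so that every prediction is made at a boundary point, as in the actual estimation problem. The one-sided window, the rectangular kernel, and all function names are assumptions made for illustration.

```python
import numpy as np

def boundary_prediction(x, y, x0, h, use_left):
    """Predict the regression function at x0 from a local linear fit that
    uses only observations strictly on one side of x0, within distance h.
    Returns NaN when fewer than two observations fall in the window."""
    if use_left:
        mask = (x < x0) & (x > x0 - h)
    else:
        mask = (x > x0) & (x < x0 + h)
    if mask.sum() < 2:
        return np.nan
    design = np.column_stack([np.ones(mask.sum()), x[mask] - x0])
    coef, *_ = np.linalg.lstsq(design, y[mask], rcond=None)
    return coef[0]

def cv_criterion(x, y, cutoff, h):
    """Mean squared one-sided prediction error for bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    errors = []
    for xi, yi in zip(x, y):
        pred = boundary_prediction(x, y, xi, h, use_left=(xi < cutoff))
        if not np.isnan(pred):
            errors.append((yi - pred) ** 2)
    return float(np.mean(errors)) if errors else float("nan")

def choose_bandwidth(x, y, cutoff, candidates):
    """Return the candidate bandwidth with the smallest criterion value.
    Raises ValueError if the criterion is NaN for every candidate."""
    scores = [cv_criterion(x, y, cutoff, h) for h in candidates]
    return candidates[int(np.nanargmin(scores))]
```

A grid of candidate bandwidths is passed in by the user; the sketch does not restrict the criterion to observations near the cutoff, which a practical implementation might want to do.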

Inference

We now discuss some asymptotic properties for the estimator for the FRD case given in (4.7) or its alternative representation in (4.9). More general results are given in HTV. We continue to make some
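Equations (4.7) and (4.9) are not reproduced in this excerpt. As a hedged illustration of the instrumental-variables link noted above, the sketch below computes a fuzzy RD point estimate as the ratio of the estimated jump in the outcome to the estimated jump in the treatment indicator at the cutoff, with both jumps obtained from local linear fits under a rectangular kernel and a common bandwidth. The function names and the use of a single bandwidth for numerator and denominator are assumptions for illustration.

```python
import numpy as np

def jump_at_cutoff(x, v, cutoff, h):
    """Difference between the right and left local linear intercepts of v
    at the cutoff, using observations within distance h on each side."""
    x = np.asarray(x, dtype=float)
    v = np.asarray(v, dtype=float)

    def intercept(mask):
        design = np.column_stack([np.ones(mask.sum()), x[mask] - cutoff])
        coef, *_ = np.linalg.lstsq(design, v[mask], rcond=None)
        return coef[0]

    right = (x >= cutoff) & (x <= cutoff + h)
    left = (x < cutoff) & (x >= cutoff - h)
    return intercept(right) - intercept(left)

def frd_estimate(x, y, w, cutoff, h):
    """Fuzzy RD estimate: jump in outcome y divided by jump in treatment w."""
    return jump_at_cutoff(x, y, cutoff, h) / jump_at_cutoff(x, w, cutoff, h)
```

When the treatment jumps from 0 to 1 at the cutoff, the denominator equals one and the ratio collapses to the sharp RD estimate.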

Specification testing

There are generally two main conceptual concerns in the application of RD designs, sharp or fuzzy. A first concern about RD designs is the possibility of other changes at the same cutoff value of the covariate. Such changes may affect the outcome, and these effects may be attributed erroneously to the treatment of interest. For example, at age 65 individuals become eligible for discounts at many cultural institutions. However, if one finds that there is a discontinuity in the number of hours

Conclusion: a summary guide to practice

In this paper, we reviewed the literature on RD designs and discussed the implications for applied researchers interested in implementing these methods. We end the paper by providing a summary guide of steps to be followed when implementing RD designs. We start with the case of SRD, and then add a number of details specific to the case of FRD.

Case 1: SRD designs

  1. Graph the data (Section 3) by computing the average value of the outcome variable over a set of bins. The binwidth has to be large
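The graphing step can be sketched as follows, assuming equal-width bins aligned so that the cutoff is a bin edge and no bin mixes observations from both sides. The function name, its arguments, and the half-open bin convention are illustrative assumptions; the returned midpoints and means can be passed to any plotting library.

```python
import numpy as np

def binned_means(x, y, cutoff, binwidth, n_bins_each_side):
    """Average y within equal-width bins of the forcing variable x.
    Bins are [cutoff + k * binwidth, cutoff + (k + 1) * binwidth), so the
    cutoff is always a bin edge. Empty bins get NaN; observations outside
    the covered range are ignored."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    edges = cutoff + binwidth * np.arange(-n_bins_each_side, n_bins_each_side + 1)
    midpoints = (edges[:-1] + edges[1:]) / 2
    # Bin index of each observation, shifted so the leftmost bin is 0.
    idx = np.floor((x - cutoff) / binwidth).astype(int) + n_bins_each_side
    means = np.full(len(midpoints), np.nan)
    for k in range(len(midpoints)):
        in_bin = idx == k
        if in_bin.any():
            means[k] = y[in_bin].mean()
    return midpoints, means
```

Plotting the means against the midpoints, with a vertical line at the cutoff, gives the kind of graph this step describes.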

Acknowledgments

We are grateful for discussions with David Card and Wilbert Van Der Klaauw. Financial support for this research was generously provided through NSF Grant SES 0452590 and the SSHRC of Canada.

References (42)

  • W. Trochim. Regression-discontinuity design.
  • J.D. Angrist et al., 1991. Does compulsory school attendance affect schooling and earnings? Quarterly Journal of Economics.
  • J.D. Angrist et al., 1999. Using Maimonides’ rule to estimate the effect of class size on scholastic achievement. Quarterly Journal of Economics.
  • J.D. Angrist et al., 1996. Identification of causal effects using instrumental variables. Journal of the American Statistical Association.
  • Battistin, E., Rettore, E., 2007. Ineligibles and eligible non-participants as a double comparison group in...
  • S. Black, 1999. Do better schools matter? Parental valuation of elementary education. Quarterly Journal of Economics.
  • Card, D., Dobkin, C., Maestas, N., 2004. The impact of nearly universal insurance coverage on health care utilization...
  • Card, D., Mas, A., Rothstein, J., 2006. Tipping and the dynamics of segregation in neighborhoods and schools....
  • K. Chay et al., 2005. Does air quality matter? Evidence from the housing market. Journal of Political Economy.
  • K. Chay et al., 2005. The central role of noise in evaluating interventions that use test scores to rank schools. American Economic Review.
  • J. DiNardo et al., 2004. Economic impacts of new unionization on private sector employers: 1984–2001. Quarterly Journal of Economics.
  • J. Fan et al., 1996. Local Polynomial Modelling and its Applications.
  • Hahn, J., Todd, P., Van Der Klaauw, W., 1999. Evaluating the effect of an anti discrimination law using a...
  • J. Hahn et al., 2001. Identification and estimation of treatment effects with a regression discontinuity design. Econometrica.
  • W. Härdle, 1990. Applied Nonparametric Regression.
  • J.J. Heckman et al., 1989. Alternative methods for evaluating the impact of training programs (with discussion). Journal of the American Statistical Association.
  • P. Holland, 1986. Statistics and causal inference (with discussion). Journal of the American Statistical Association.
  • G. Imbens, 2004. Nonparametric estimation of average treatment effects under exogeneity: a review. Review of Economics and Statistics.
  • G. Imbens et al., 1994. Identification and estimation of local average treatment effects. Econometrica.
  • G. Imbens et al., 2007. Causal Inference: Statistical Methods for Estimating Causal Effects in Biomedical, Social, and Behavioral Sciences.
  • G. Imbens et al., 1995. Evaluating the cost of conscription in The Netherlands. Journal of Business and Economic Statistics.