Regression discontinuity designs: A guide to practice
Introduction
Since the late 1990s there has been a large number of studies in economics applying and extending regression discontinuity (RD) methods, including Van Der Klaauw (2002), Black (1999), Angrist and Lavy (1999), Lee (2007), Chay and Greenstone (2005), DiNardo and Lee (2004), Chay et al. (2005), and Card et al. (2006). Key theoretical and conceptual contributions include the interpretation of estimates for fuzzy regression discontinuity (FRD) designs allowing for general heterogeneity of treatment effects (Hahn et al., 2001, hereafter HTV), adaptive estimation methods (Sun, 2005), specific methods for choosing bandwidths (Ludwig and Miller, 2005), and various tests for discontinuities in means and distributions of non-affected variables (Lee, 2007, McCrary, 2007).
In this paper, we review some of the practical issues in implementing RD methods. Relatively little in this discussion is novel. Our goal is instead to give applied researchers practical guidance on implementing RD designs and to review some of the new theoretical developments.
After reviewing some basic concepts in Section 2, the paper focuses on five specific issues in the implementation of RD designs. In Section 3 we stress graphical analyses as powerful methods for illustrating the design. In Section 4 we discuss estimation and suggest using local linear regression methods using only the observations close to the discontinuity point. In Section 5 we propose choosing the bandwidth using cross-validation. In Section 6 we provide a simple plug-in estimator for the asymptotic variance and a second estimator that exploits the link with instrumental variable methods derived by HTV. In Section 7 we discuss a number of specification tests and sensitivity analyses based on tests for (a) discontinuities in the average values for covariates, (b) discontinuities in the conditional density of the forcing variable, as suggested by McCrary, and (c) discontinuities in the average outcome at other values of the forcing variable.
Section snippets
Basics
Our discussion will frame the RD design in the context of the modern literature on causal effects and treatment effects, using the Rubin Causal Model (RCM) set up with potential outcomes (Rubin, 1974, Holland, 1986, Imbens and Rubin, 2007), rather than the regression framework that was originally used in this literature. For a general discussion of the RCM and its use in the economic literature, see the survey by Imbens and Wooldridge (2007).
In the basic setting for the RCM (and for the RD
Nonparametric regression at the boundary
The practical estimation of the treatment effect in both the SRD and FRD designs is largely a standard nonparametric regression problem (e.g., Pagan and Ullah, 1999, Härdle, 1990, Li and Racine, 2007). However, there are two unusual features. In this case we are interested in the regression function at a single point, and in addition that single point is a boundary point. As a result, standard nonparametric kernel regression does not work very well. At boundary points, such estimators have a
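As a rough illustration of estimating a regression function at a boundary point, the following Python sketch fits a separate straight line on each side of the cutoff, using only the observations within a bandwidth of it, and takes the difference between the two fitted values at the cutoff as an SRD estimate. It assumes a rectangular (uniform) kernel, so each fit is an ordinary least-squares line. The function names `fit_value_at` and `srd_estimate`, the cutoff `c`, and the bandwidth `h` are hypothetical names chosen for this example, not notation taken from the paper.

```python
def fit_value_at(xs, ys, x0):
    """Least-squares line through (xs, ys), evaluated at x0."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations in the window")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((xi - mean_x) ** 2 for xi in xs)
    if sxx == 0.0:
        raise ValueError("forcing variable has no variation in the window")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(xs, ys)) / sxx
    return mean_y + slope * (x0 - mean_x)


def srd_estimate(x, y, c, h):
    """Difference of one-sided local linear fits at the cutoff c.

    Assumption of this sketch: rectangular kernel, so only observations
    with c - h <= x < c (left) and c <= x <= c + h (right) are used.
    """
    left = [(xi, yi) for xi, yi in zip(x, y) if c - h <= xi < c]
    right = [(xi, yi) for xi, yi in zip(x, y) if c <= xi <= c + h]
    mu_left = fit_value_at([p[0] for p in left], [p[1] for p in left], c)
    mu_right = fit_value_at([p[0] for p in right], [p[1] for p in right], c)
    return mu_right - mu_left
```

Fitting each side separately means that no observation from the other side of the cutoff enters either fit, so each regression is evaluated at the edge of its own window, which is the boundary situation described above.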
Bandwidth selection
An important issue in practice is the selection of the smoothing parameter, the binwidth. In general there are two approaches to choosing bandwidths. A first approach consists of characterizing the optimal bandwidth in terms of the unknown joint distribution of all variables. The relevant components of this distribution can then be estimated, and plugged into the optimal bandwidth function. The second approach, on which we focus here, is based on a cross-validation procedure. The specific
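One possible way to set up a cross-validation criterion for this problem is sketched below in Python. The details are assumptions of this example, since the specifics are not reproduced in the text above: each observation is predicted from a straight line fitted only to observations on one side of it (to the left for points below the cutoff, to the right for points at or above it), so that every prediction mimics estimation at a boundary point. The bandwidth with the smallest mean squared prediction error over a candidate grid is then selected. The names `linear_prediction`, `cv_criterion`, and `choose_bandwidth` are hypothetical.

```python
def linear_prediction(points, x0):
    """Fit a least-squares line to (x, y) pairs and evaluate it at x0.

    Returns None when fewer than two points are available.
    """
    n = len(points)
    if n < 2:
        return None
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mean_x) ** 2 for p in points)
    if sxx == 0.0:
        return mean_y  # no variation in x: fall back to the window mean
    slope = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / sxx
    return mean_y + slope * (x0 - mean_x)


def cv_criterion(x, y, c, h):
    """Mean squared error of one-sided predictions for bandwidth h."""
    data = list(zip(x, y))
    squared_errors = []
    for xi, yi in data:
        if xi < c:
            # Below the cutoff: use only points strictly to the left of xi.
            window = [(xj, yj) for xj, yj in data if xi - h <= xj < xi]
        else:
            # At or above the cutoff: use only points strictly to the right.
            window = [(xj, yj) for xj, yj in data if xi < xj <= xi + h]
        prediction = linear_prediction(window, xi)
        if prediction is not None:  # skip points with too few neighbours
            squared_errors.append((yi - prediction) ** 2)
    if not squared_errors:
        return float("inf")
    return sum(squared_errors) / len(squared_errors)


def choose_bandwidth(x, y, c, candidate_bandwidths):
    """Return the candidate bandwidth with the smallest CV criterion."""
    return min(candidate_bandwidths, key=lambda h: cv_criterion(x, y, c, h))
```

A call such as `choose_bandwidth(x, y, c, [0.1, 0.2, 0.5])` compares the candidates on the same data. The loop is quadratic in the sample size, which keeps the sketch short but would be slow for large samples.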
Inference
We now discuss some asymptotic properties for the estimator for the FRD case given in (4.7) or its alternative representation in (4.9). More general results are given in HTV. We continue to make some
Specification testing
There are generally two main conceptual concerns in the application of RD designs, sharp or fuzzy. A first concern about RD designs is the possibility of other changes at the same cutoff value of the covariate. Such changes may affect the outcome, and these effects may be attributed erroneously to the treatment of interest. For example, at age 65 individuals become eligible for discounts at many cultural institutions. However, if one finds that there is a discontinuity in the number of hours
Conclusion: a summary guide to practice
In this paper, we reviewed the literature on RD designs and discussed the implications for applied researchers interested in implementing these methods. We end the paper by providing a summary guide of steps to be followed when implementing RD designs. We start with the case of SRD, and then add a number of details specific to the case of FRD.
Case 1: SRD designs
1. Graph the data (Section 3) by computing the average value of the outcome variable over a set of bins. The binwidth has to be large
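The binned averages used for such a graph could be computed along the following lines. This is a minimal Python sketch with hypothetical names (`binned_means`, `c` for the cutoff, `binwidth`). It assumes that bins are aligned so that the cutoff is a bin edge, which keeps any single bin from mixing observations from both sides of the discontinuity.

```python
import math
from collections import defaultdict


def binned_means(x, y, c, binwidth):
    """Average outcome per bin of the forcing variable.

    Bin k covers [c + k * binwidth, c + (k + 1) * binwidth), so bins with
    k < 0 lie left of the cutoff and bins with k >= 0 lie at or right of it.
    Returns (bin midpoint, mean outcome, count) tuples sorted by midpoint.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for xi, yi in zip(x, y):
        k = math.floor((xi - c) / binwidth)
        sums[k] += yi
        counts[k] += 1
    return [
        (c + (k + 0.5) * binwidth, sums[k] / counts[k], counts[k])
        for k in sorted(counts)
    ]
```

Plotting the mean outcome against the bin midpoint, with a vertical line at the cutoff, gives the kind of graph described in this step. The counts returned alongside the means show how many observations support each plotted point.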
Acknowledgments
We are grateful for discussions with David Card and Wilbert Van Der Klaauw. Financial support for this research was generously provided through NSF Grant SES 0452590 and the SSHRC of Canada.
References (42)
Regression-discontinuity design
- et al. Does compulsory school attendance affect schooling and earnings? Quarterly Journal of Economics, 1991.
- et al. Using Maimonides’ rule to estimate the effect of class size on scholastic achievement. Quarterly Journal of Economics, 1999.
- et al. Identification of causal effects using instrumental variables. Journal of the American Statistical Association, 1996.
- Battistin, E., Rettore, E., 2007. Ineligibles and eligible non-participants as a double comparison group in...
- Do better schools matter? Parental valuation of elementary education. Quarterly Journal of Economics, 1999.
- Card, D., Dobkin, C., Maestas, N., 2004. The impact of nearly universal insurance coverage on health care utilization...
- Card, D., Mas, A., Rothstein, J., 2006. Tipping and the dynamics of segregation in neighborhoods and schools....
- et al. Does air quality matter? Evidence from the housing market. Journal of Political Economy, 2005.
- et al. The central role of noise in evaluating interventions that use test scores to rank schools. American Economic Review, 2005.
- Economic impacts of new unionization on private sector employers: 1984–2001. Quarterly Journal of Economics.
- Local Polynomial Modelling and its Applications.
- Identification and estimation of treatment effects with a regression discontinuity design. Econometrica.
- Applied Nonparametric Regression.
- Alternative methods for evaluating the impact of training programs (with discussion). Journal of the American Statistical Association.
- Statistics and causal inference (with discussion). Journal of the American Statistical Association.
- Nonparametric estimation of average treatment effects under exogeneity: a review. Review of Economics and Statistics.
- Identification and estimation of local average treatment effects. Econometrica.
- Causal Inference: Statistical Methods for Estimating Causal Effects in Biomedical, Social, and Behavioral Sciences.
- Evaluating the cost of conscription in The Netherlands. Journal of Business and Economic Statistics.