Abstract
Missing responses are ubiquitous in survey sampling and in medical, social science and epidemiological studies. Non-ignorable missingness, in which the probability that a response is missing depends on its own value, is well known to be the most difficult missing data problem. In contrast to the ignorable case, the statistical literature on non-ignorable missing data is sparse apart from fully parametric model-based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. Using the empirical likelihood method of Owen (1988), we obtain constrained maximum empirical likelihood estimators of the parameters in the missing probability and of the mean response, and show that they are asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, an analysis of data from a real AIDS trial shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on the observed data alone is biased.
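The bias of the observed-data sample mean under non-ignorable missingness can be illustrated with a small simulation. The sketch below is not the empirical likelihood estimator studied in the paper. It assumes, purely for illustration, a standard normal response and a logistic missing-data mechanism with hypothetical parameters `beta0` and `beta1`, and it compares the naive mean of the observed responses with an inverse-probability-weighted mean that uses the true response probabilities, which would be unknown in practice.

```python
import math
import random


def simulate(n=20000, beta0=-0.5, beta1=1.0, seed=0):
    """Compare the full-data mean, the naive observed-data mean and an
    inverse-probability-weighted (IPW) mean under non-ignorable missingness.

    Assumptions made only for this illustration: Y ~ N(0, 1) and
    P(D = 1 | Y = y) = 1 / (1 + exp(-(beta0 + beta1 * y))).
    """
    rng = random.Random(seed)
    y = [rng.gauss(0.0, 1.0) for _ in range(n)]

    # The response probability depends on the response itself (non-ignorable).
    pi = [1.0 / (1.0 + math.exp(-(beta0 + beta1 * v))) for v in y]
    d = [1 if rng.random() < p else 0 for p in pi]

    full_mean = sum(y) / n

    # Naive estimator: average of the observed responses only.
    observed = [v for v, di in zip(y, d) if di == 1]
    naive_mean = sum(observed) / len(observed)

    # Normalised IPW estimator using the TRUE probabilities (an oracle).
    weights = [di / p for di, p in zip(d, pi)]
    ipw_mean = sum(w * v for w, v in zip(weights, y)) / sum(weights)

    return full_mean, naive_mean, ipw_mean


if __name__ == "__main__":
    full_mean, naive_mean, ipw_mean = simulate()
    print("full-data mean:", round(full_mean, 3))
    print("naive observed mean:", round(naive_mean, 3))
    print("oracle IPW mean:", round(ipw_mean, 3))
```

With `beta1 > 0`, larger responses are more likely to be observed, so the naive mean overshoots the full-data mean, while the weighted mean stays close to it.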
References
Alho JM (1990) Adjusting for nonresponse bias using logistic regression. Biometrika 77:617–624
Chan KCG, Yam SCP (2014) Oracle, multiple robust and multipurpose calibration in a missing response problem. Stat Sci 29:380–396
Chen J, Qin J (1993) Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80:107–116
Chen K (2001) Parametric models for response-biased sampling. J R Stat Soc Ser B Stat Methodol 63:775–789
Cochran WG (1977) Sampling techniques, 3rd edn. Wiley series in probability and mathematical statistics. Wiley, New York
Davidian M, Tsiatis AA, Leon S (2005) Semiparametric estimation of treatment effect in a pretest-posttest study with missing data. Stat Sci 20:261–301 (with comments and a rejoinder by the authors)
Godambe VP (1960) An optimum property of regular maximum likelihood estimation. Ann Math Stat 31:1208–1211
Greenlees JS, Reece WS, Zieschang KD (1982) Imputation of missing values when the probability of response depends on the variable being imputed. J Am Stat Assoc 77:251–261
Hall P, Scala BL (1990) Methodology and algorithms of empirical likelihood. Int Stat Rev/Revue Internationale de Statistique 58:109–127
Hammer SM, Katzenstein DA, Hughes MD, Gundacker H, Schooley RT, Haubrich RH, Henry WK, Lederman MM, Phair JP, Niu M, Hirsch MS, Merigan TC (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. N Engl J Med 335:1081–1090
Han P, Wang L (2013) Estimation with missing data: beyond double robustness. Biometrika 100:417–430
Kim JK, Im J (2014) Propensity score adjustment with several follow-ups. Biometrika 101:439–448
Kim JK, Yu CL (2011) A semi-parametric estimation of mean functionals with non-ignorable missing data. J Am Stat Assoc 106:157–165
Li L, Shen C, Li X, Robins JM (2011) On weighting approaches for missing data. Stat Methods Med Res
Liang K-Y, Qin J (2000) Regression analysis under non-standard situations: a pairwise pseudolikelihood approach. J R Stat Soc Ser B 62:773–786
Little RJA (1982) Models for nonresponse in sample surveys. J Am Stat Assoc 77:237–250
Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley series in probability and statistics. Wiley, Hoboken
Nevo A (2003) Using weights to adjust for sample selection when auxiliary information is available. J Bus Econ Stat 21:43–52
Niu C, Guo X, Xu W, Zhu L (2014) Empirical likelihood inference in linear regression with nonignorable missing response. Comput Stat Data Anal 79:91–112
Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75:237–249
Owen AB (2001) Empirical likelihood. Chapman & Hall, Boca Raton
Qin J, Zhang B (2007) Empirical-likelihood-based inference in missing response problems and its application in observational studies. J R Stat Soc Ser B 69:101–122
Rotnitzky A, Robins JM (1997) Analysis of semi-parametric regression models with non-ignorable non-response. Stat Med 16:81–102
Scharfstein DO, Rotnitzky A, Robins JM (1999) Adjusting for nonignorable drop-out using semiparametric nonresponse models. J Am Stat Assoc 94:1096–1146 (with comments and a rejoinder by the authors)
Small CG, McLeish DL (1988) Generalizations of ancillarity, completeness and sufficiency in an inference function space. Ann Stat 16:534–551
Small CG, McLeish DL (1989) Projection as a method for increasing sensitivity and eliminating nuisance parameters. Biometrika 76:693–703
Tan Z (2010) Bounded, efficient and doubly robust estimation with inverse weighting. Biometrika 97:661–682
Tang CY, Leng C (2011) Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika 98:1001–1006
Tang CY, Qin Y (2012) An efficient empirical likelihood approach for estimating equations with missing data. Biometrika 99:1001–1007
Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764
Vardi Y (1982) Nonparametric estimation in the presence of length bias. Ann Stat 10:616–620
Vardi Y (1985) Empirical distributions in selection bias models. Ann Stat 13:178–205 (with discussion by C. L. Mallows)
Wang Q, Dai P (2008) Semiparametric model-based inference in the presence of missing responses. Biometrika 95:721–734
Wang S, Shao J, Kim JK (2014) An instrument variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116
Zhao P-Y, Tang M-L, Tang N-S (2013) Robust estimation of distribution functions and quantiles with non-ignorable missing data. Can J Stat 41:575–595
Zhong P-S, Chen S (2014) Jackknife empirical likelihood inference with regression imputation and survey data. J Multivar Anal 129:193–205
Zhou Y, Wan ATK, Wang X (2008) Estimating equations inference with missing data. J Am Stat Assoc 103:1187–1199
Acknowledgments
The authors would like to thank the Editor, the Guest Editor and the two referees for their careful reading and for some useful comments and suggestions that have greatly improved the original submission.
Appendix
1.1 Derivation of loglikelihood \(\ell (\beta )\) by profiling out \((\eta ,\theta )\)
Differentiating \(\ell (\varvec{\lambda },\varvec{\nu },\xi )\) w.r.t. \((\eta ,\theta )\) and using (4) and (5) we have
Equating the above derivatives to zero and noting \(\frac{\partial }{\partial \eta } \psi (x,y;\xi )= (-1, \theta ^{\mathrm { T} }, 0)^{\mathrm { T} }\) and \( \frac{\partial }{\partial \theta } \psi ^{\mathrm { T} }(x,y;\xi )= \left( \varvec{0}, -(1-\eta )I, \varvec{0}\right) \), we have
Therefore we have proved (6), which implies
Plugging these into \(\ell (\varvec{\lambda },\varvec{\nu },\xi )\) we obtain \(\ell (\varvec{\lambda },\varvec{\nu },\xi )=\tilde{\ell }(\varvec{\omega })-n_0\log (n_1/n_0)\). So we arrive at (7) by ignoring the parameter-free term. The expressions for \(S({\varvec{\omega }})\) and \(\hat{p}_i\) follow from the above results.
In the rest of this section we present proofs of the theorems in Sect. 2. We assume that \(\eta =\eta _0\) and \({\varvec{\omega }}=(\varvec{\lambda }^{\mathrm { T} },\beta ^{\mathrm { T} })^{\mathrm { T} }={\varvec{\omega }}_0=(\varvec{\lambda }^{\mathrm { T} }_0,\beta ^{\mathrm { T} }_0)^{\mathrm { T} }\). We denote the conditional nonresponse probability by \(\bar{\pi }(x, y)=1-\pi (x, y)\). To simplify notation, we suppress the dependence of some quantities on the parameter \(\beta \) and on the random variables X and/or Y.
1.2 Proof of Theorem 1
It is easy to prove that when \(\bar{\mu }=\mu _0\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), the estimating functions are \(S({\varvec{\omega }})=\sum _{i=1}^n z_i +\mathcal{{O}}_P(1)\), where \(z_i=g(x_i, y_i, d_i)\), \(g=(g_{1}^{\mathrm { T} }, g_{2}^{\mathrm { T} }, g_{3}^{\mathrm { T} })^{\mathrm { T} }\) and
It can be shown that in distribution
where U is the variance-covariance matrix of \(Z=g(X,Y,D)\). It is easy to see that when \(\bar{\mu }=\mu _0\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), the entries of \(U=(u_{ij})_{3\times 3}\) are
The entries of the Hessian matrix of \(\tilde{\ell }({\varvec{\omega }})\), \(H({\varvec{\omega }})=\{H_{ij}({\varvec{\omega }})\}_{3\times 3}\), are
It can also be shown that when \({\varvec{\omega }}={\varvec{\omega }}_0\), \(\eta =\eta _0\), and either \(\bar{\mu }=\mu _0\) or \(\bar{\mu }=n^{-1}\sum _{i=1}^n\mu (x_i)\), we have
Let \(\Sigma _\mu =\text{ var }\{\mu (X)\}\). We have
where
Clearly, under the conditions of the theorem, matrix V is positive definite. It is also true that
Therefore, Part (i) of the theorem follows from (9) and (16).
If \(\bar{\mu }=\frac{1}{n}\sum _{i=1}^n\mu (x_i)\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), then
where \(z_i^*=[0,\eta \{\mu (x_i)-\mu _0\}^{\mathrm { T} },0]^{\mathrm { T} }\). So the asymptotic variance of \(S({\varvec{\omega }})\) in this case is \(U+W\), where \(W=(w_{ij})_{3\times 3}\) is a symmetric matrix whose entries are
So Part (ii) is also proved.
1.3 Proof of Theorem 2
Expanding \(\hat{\mu }_Y\) as a function of \(\hat{\varvec{\omega }}\) at \({\varvec{\omega }}={\varvec{\omega }}_0\), we have
where
Thus if \(\bar{\mu }=\mu _0\), \(\sqrt{n}(\hat{\mu }_Y-\mu _Y)\rightarrow N(0,\sigma ^2), \) where
If \(\bar{\mu }=\frac{1}{n}\sum _{i=1}^n\mu (x_i)\), \(\sqrt{n}(\hat{\mu }_Y-\mu _Y)\rightarrow N(0,\sigma ^2_*), \) where
and \(Z^*=[0,\eta \{\mu (X)-\mu _0\},0]^{\mathrm { T} }\). The proof of Theorem 2 is complete.
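The asymptotic normality in Theorem 2 is what justifies a Wald-type confidence interval for \(\mu _Y\). The following is a minimal sketch, assuming that a point estimate and a consistent plug-in estimate of the asymptotic variance (\(\sigma ^2\) or \(\sigma ^2_*\)) are already available. The function name `wald_ci` and its arguments are hypothetical and are not part of the paper.

```python
import math
from statistics import NormalDist


def wald_ci(mu_hat, sigma2_hat, n, level=0.95):
    """Wald interval based on sqrt(n) * (mu_hat - mu_Y) -> N(0, sigma^2).

    mu_hat     : point estimate of the mean response (supplied by the user)
    sigma2_hat : plug-in estimate of the asymptotic variance (assumed given)
    n          : total sample size
    """
    if sigma2_hat < 0 or n <= 0:
        raise ValueError("sigma2_hat must be non-negative and n positive")
    # Standard normal quantile for a two-sided interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * math.sqrt(sigma2_hat / n)
    return mu_hat - half_width, mu_hat + half_width
```

The interval has the usual form of estimate plus or minus a normal quantile times the estimated standard error \(\sqrt{\hat{\sigma }^2/n}\).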
1.4 Proof of Theorem 3
Without loss of generality, we assume \(p=\text{ dim }(\beta _2)=1\) and \(r(y)=y\). Under the null hypothesis \(H_0\): \(\beta _2=0\), \( \pi (x,y;\beta ) =\eta _0. \) Replacing \(\psi (x,y,\xi )\) by \(\psi _1(x,y,\xi )\) and using the above model for \(\pi =P(D=1)\), we have, by Taylor expansion, when \(\eta =\eta _0\)
where \({\varvec{\omega }}^*=(\lambda ^{\mathrm { T} }, \beta _1)^{\mathrm { T} }\) takes on \(({\mathbf \lambda }^{\mathrm { T} }_0, \beta _{10})^{\mathrm { T} }\), \(\tilde{S}({\varvec{\omega }}^*)=\{\tilde{S}_1^{\mathrm { T} }({\varvec{\omega }}^*),\tilde{S}_2({\varvec{\omega }}^*)\}^{\mathrm { T} }\),
and
While using model (1) we have
where \({\varvec{\omega }}=(\lambda ^{\mathrm { T} },\beta ^{\mathrm { T} })^{\mathrm { T} }\) takes on \((\lambda ^{\mathrm { T} }_0,\beta ^{\mathrm { T} }_0)^{\mathrm { T} }\) with \(\beta _0=(\beta _{10},\beta _{20}=0)^{\mathrm { T} }\),
Inverting the partitioned matrix, we obtain
where \( \varsigma ^2= V_0^{\mathrm { T} }\tilde{V}^{-1}V_0 = \eta (1-\eta ) \left( C^{\mathrm { T} }A^{-1}C-\mu _Y^2\right) . \) It follows from (18) that
where \(\tilde{\eta }=n_1/n\). By (26) and (26) we can get
where \(D=C-\mu _YB\). By Theorem 2, \(\sqrt{n}D^{\mathrm { T} }\tilde{\lambda }\rightarrow N(0, \sigma _0^2)\) in distribution, where
and \(\Sigma _\phi =\text{ var }_F(\phi )\). Simple matrix algebra shows that
Consequently, \(R=2\{\tilde{\ell }(\hat{\varvec{\omega }})-\tilde{\ell }(\hat{{\varvec{\omega }}}^*)\}\) converges in distribution to the \(\chi ^2(p)\) distribution.
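In practice, the limiting \(\chi ^2(p)\) distribution turns an observed value of \(R\) into a p-value for the null hypothesis \(\beta _2=0\). The sketch below covers only the case \(p=1\) assumed in the proof, where the chi-square tail probability has a closed form in terms of the complementary error function. The function name `lr_pvalue_df1` is hypothetical.

```python
import math


def lr_pvalue_df1(r_stat):
    """Asymptotic p-value P(chi2_1 > r_stat) for a likelihood ratio statistic.

    For one degree of freedom, chi2_1 is the square of a standard normal Z,
    so P(chi2_1 > r) = P(|Z| > sqrt(r)) = erfc(sqrt(r / 2)).
    """
    if r_stat < 0:
        raise ValueError("the likelihood ratio statistic must be non-negative")
    return math.erfc(math.sqrt(r_stat / 2.0))
```

A small p-value is evidence against missingness completely at random, in favour of the non-ignorable model. For \(p>1\), a general chi-square survival function would be needed instead.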
Cite this article
Guan, Z., Qin, J. Empirical likelihood method for non-ignorable missing data problems. Lifetime Data Anal 23, 113–135 (2017). https://doi.org/10.1007/s10985-016-9381-0
DOI: https://doi.org/10.1007/s10985-016-9381-0