Empirical likelihood method for non-ignorable missing data problems

Lifetime Data Analysis

Abstract

The missing response problem is ubiquitous in survey sampling, medical, social science and epidemiology studies. It is well known that non-ignorable missingness is the most difficult missing data problem, in which whether a response is missing depends on its own value. In the statistical literature, unlike the ignorable missing data problem, few treatments of non-ignorable missing data are available apart from fully parametric model based approaches. In this paper we study a semiparametric model for non-ignorable missing data in which the missing probability is known up to some parameters, but the underlying distributions are not specified. By employing Owen's (1988) empirical likelihood method we obtain the constrained maximum empirical likelihood estimators of the parameters in the missing probability and of the mean response, which are shown to be asymptotically normal. Moreover, the likelihood ratio statistic can be used to test whether the missingness of the responses is non-ignorable or completely at random. The theoretical results are confirmed by a simulation study. As an illustration, the analysis of a real AIDS trial data set shows that the missingness of CD4 counts at around two years is non-ignorable and that the sample mean based on the observed data only is biased.
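
To fix ideas, the following is a minimal numerical sketch of Owen's (1988) empirical likelihood for a scalar mean, solved through the dual Lagrange-multiplier equation. It illustrates only the general machinery the paper builds on, not the constrained estimator developed here; the data and all function names are illustrative.

import numpy as np
from scipy.optimize import brentq

def el_log_ratio(x, mu):
    # Empirical likelihood for a mean: maximize prod(p_i) subject to
    # sum(p_i) = 1 and sum(p_i * (x_i - mu)) = 0 (Owen 1988).
    # The dual solution is p_i = 1 / (n * (1 + lam * (x_i - mu))).
    z = x - mu
    n = len(x)
    # lam must keep every weight positive: 1 + lam * z_i > 0 for all i.
    lo = -1.0 / z.max() + 1e-10
    hi = -1.0 / z.min() - 1e-10
    # The dual score is strictly decreasing in lam, so bracketing works.
    lam = brentq(lambda l: np.sum(z / (1.0 + l * z)), lo, hi)
    p = 1.0 / (n * (1.0 + lam * z))      # constrained EL weights
    return -2.0 * np.sum(np.log(n * p))  # -2 log R(mu), asympt. chi^2(1)

rng = np.random.default_rng(0)
x = rng.exponential(2.0, size=200)
print(el_log_ratio(x, mu=2.0))  # small value: mu = 2 is well supported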


References

  • Alho JM (1990) Adjusting for nonresponse bias using logistic regression. Biometrika 77:617–624

  • Chan KCG, Yam SCP (2014) Oracle, multiple robust and multipurpose calibration in a missing response problem. Stat Sci 29:380–396

  • Chen J, Qin J (1993) Empirical likelihood estimation for finite populations and the effective usage of auxiliary information. Biometrika 80:107–116

  • Chen K (2001) Parametric models for response-biased sampling. J R Stat Soc Ser B Stat Methodol 63:775–789

  • Cochran WG (1977) Sampling techniques, 3rd edn. Wiley Series in Probability and Mathematical Statistics. Wiley, New York

  • Davidian M, Tsiatis AA, Leon S (2005) Semiparametric estimation of treatment effect in a pretest-posttest study with missing data (with comments and a rejoinder by the authors). Stat Sci 20:261–301

  • Godambe VP (1960) An optimum property of regular maximum likelihood estimation. Ann Math Stat 31:1208–1211

  • Greenlees JS, Reece WS, Zieschang KD (1982) Imputation of missing values when the probability of response depends on the variable being imputed. J Am Stat Assoc 77:251–261

  • Hall P, La Scala B (1990) Methodology and algorithms of empirical likelihood. Int Stat Rev 58:109–127

  • Hammer SM, Katzenstein DA, Hughes MD, Gundacker H, Schooley RT, Haubrich RH, Henry WK, Lederman MM, Phair JP, Niu M, Hirsch MS, Merigan TC (1996) A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. N Engl J Med 335:1081–1090

  • Han P, Wang L (2013) Estimation with missing data: beyond double robustness. Biometrika 100:417–430

  • Kim JK, Im J (2014) Propensity score adjustment with several follow-ups. Biometrika 101:439–448

  • Kim JK, Yu CL (2011) A semi-parametric estimation of mean functionals with non-ignorable missing data. J Am Stat Assoc 106:157–165

  • Li L, Shen C, Li X, Robins JM (2011) On weighting approaches for missing data. Stat Methods Med Res

  • Liang K-Y, Qin J (2000) Regression analysis under non-standard situations: a pairwise pseudolikelihood approach. J R Stat Soc Ser B 62:773–786

  • Little RJA (1982) Models for nonresponse in sample surveys. J Am Stat Assoc 77:237–250

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley Series in Probability and Statistics. Wiley, Hoboken

  • Nevo A (2003) Using weights to adjust for sample selection when auxiliary information is available. J Bus Econ Stat 21:43–52

  • Niu C, Guo X, Xu W, Zhu L (2014) Empirical likelihood inference in linear regression with nonignorable missing response. Comput Stat Data Anal 79:91–112

  • Owen AB (1988) Empirical likelihood ratio confidence intervals for a single functional. Biometrika 75:237–249

  • Owen AB (2001) Empirical likelihood. Chapman & Hall, Boca Raton

  • Qin J, Zhang B (2007) Empirical-likelihood-based inference in missing response problems and its application in observational studies. J R Stat Soc Ser B 69:101–122

  • Rotnitzky A, Robins JM (1997) Analysis of semi-parametric regression models with non-ignorable non-response. Stat Med 16:81–102

  • Scharfstein DO, Rotnitzky A, Robins JM (1999) Adjusting for nonignorable drop-out using semiparametric nonresponse models (with comments and a rejoinder by the authors). J Am Stat Assoc 94:1096–1146

  • Small CG, McLeish DL (1988) Generalizations of ancillarity, completeness and sufficiency in an inference function space. Ann Stat 16:534–551

  • Small CG, McLeish DL (1989) Projection as a method for increasing sensitivity and eliminating nuisance parameters. Biometrika 76:693–703

  • Tan Z (2010) Bounded, efficient and doubly robust estimation with inverse weighting. Biometrika 97:661–682

  • Tang CY, Leng C (2011) Empirical likelihood and quantile regression in longitudinal data analysis. Biometrika 98:1001–1006

  • Tang CY, Qin Y (2012) An efficient empirical likelihood approach for estimating equations with missing data. Biometrika 99:1001–1007

  • Tang G, Little RJA, Raghunathan TE (2003) Analysis of multivariate missing data with nonignorable nonresponse. Biometrika 90:747–764

  • Vardi Y (1982) Nonparametric estimation in the presence of length bias. Ann Stat 10:616–620

  • Vardi Y (1985) Empirical distributions in selection bias models (with discussion by CL Mallows). Ann Stat 13:178–205

  • Wang Q, Dai P (2008) Semiparametric model-based inference in the presence of missing responses. Biometrika 95:721–734

  • Wang S, Shao J, Kim JK (2014) An instrumental variable approach for identification and estimation with nonignorable nonresponse. Stat Sin 24:1097–1116

  • Zhao P-Y, Tang M-L, Tang N-S (2013) Robust estimation of distribution functions and quantiles with non-ignorable missing data. Can J Stat 41:575–595

  • Zhong P-S, Chen S (2014) Jackknife empirical likelihood inference with regression imputation and survey data. J Multivar Anal 129:193–205

  • Zhou Y, Wan ATK, Wang X (2008) Estimating equations inference with missing data. J Am Stat Assoc 103:1187–1199

Acknowledgments

The authors would like to thank the Editor, the Guest Editor and the two referees for their careful reading and for some useful comments and suggestions that have greatly improved the original submission.

Author information

Corresponding author

Correspondence to Zhong Guan.

Appendix

1.1 Derivation of loglikelihood \(\ell (\beta )\) by profiling out \((\eta ,\theta )\)

Differentiating \(\ell (\varvec{\lambda },\varvec{\nu },\xi )\) w.r.t. \((\eta ,\theta )\) and using (4) and (5) we have

$$\begin{aligned} \frac{\partial \ell (\varvec{\lambda },\varvec{\nu },\xi )}{\partial \eta }= & {} -\frac{n_0}{1-\eta }-\sum _{i=1}^{n_1}\frac{ \varvec{\lambda }^{\mathrm { T} }\frac{\partial }{\partial \eta }\psi (x_i,y_i)}{1+\varvec{\lambda }^{\mathrm { T} }\psi (x_i,y_i)},\\ \frac{\partial \ell (\varvec{\lambda },\varvec{\nu },\xi )}{\partial \theta }= & {} -\sum _{i=1}^{n_1}\frac{\frac{\partial }{\partial \theta }\psi ^{\mathrm { T} }(x_i,y_i)\varvec{\lambda }}{1+\varvec{\lambda }^{\mathrm { T} }\psi (x_i,y_i)}+\sum _{j=n_1+1}^n \frac{ \varvec{\nu }}{1+\varvec{\nu }^{\mathrm { T} }\{\phi (x_j)-\theta \}}. \end{aligned}$$

Equating the above derivatives to zero and noting \(\frac{\partial }{\partial \eta } \psi (x,y;\xi )= (-1, \theta ^{\mathrm { T} }, 0)^{\mathrm { T} }\) and \( \frac{\partial }{\partial \theta } \psi ^{\mathrm { T} }(x,y;\xi )= \left( \varvec{0}, -(1-\eta )I, \varvec{0}\right) \), we have

$$\begin{aligned} 0= & {} -\frac{n_0}{1-\eta }+(\lambda _0-\theta ^{\mathrm { T} }\lambda ^{(1)}) \sum _{i=1}^{n_1}\frac{1}{1+\varvec{\lambda }^{\mathrm { T} }\psi (x_i,y_i)}=(\lambda _0-\theta ^{\mathrm { T} }\lambda ^{(1)}){n_1}-\frac{n_0}{1-\eta },\\ 0= & {} \sum _{i=1}^{n_1}\frac{(1-\eta )\varvec{\lambda }^{(1)}}{1+\varvec{\lambda }^{\mathrm { T} }\psi (x_i,y_i)}+\sum _{j=n_1+1}^n \frac{ \varvec{\nu }}{1+\varvec{\nu }^{\mathrm { T} }\{\phi (x_j)-\theta \}} =n_1(1-\eta )\varvec{\lambda }^{(1)} +n_0\varvec{\nu }. \end{aligned}$$

Therefore we have proved (6), which implies

$$\begin{aligned} 1+\varvec{\lambda }^{\mathrm { T} }\psi (x_i,y_i)= & {} \lambda _0[\pi (x, y;\beta )-\eta ]+[1-\pi (x, y;\beta )]\varvec{\lambda }^{(1)}\phi (x)-(1-\eta )\theta ^{\mathrm { T} }\varvec{\lambda }^{(1)}\\&+\varvec{\lambda }_2[\mu (x)-\bar{\mu }]\\= & {} \lambda _0[\pi (x, y;\beta )-1]+[1-\pi (x, y;\beta )]\varvec{\lambda }^{(1)}\phi (x)+\frac{n_0}{n_1}\\&+\varvec{\lambda }_2[\mu (x)-\bar{\mu }]\\= & {} \frac{n_0}{n_1}-\lambda _0[1-\pi (x, y;\beta )]+[1-\pi (x, y;\beta )]\varvec{\lambda }^{(1)}\phi (x)\\&+\varvec{\lambda }_2[\mu (x)-\bar{\mu }]\\= & {} \frac{n_0}{n_1}+\varvec{\lambda }^{\mathrm { T} }\gamma (x, y;\beta ),\\ 1+\varvec{\nu }^{\mathrm { T} }\{\phi (x_j)-\theta \}= & {} 1-\frac{n_1(1-\eta )}{n_0}\{\phi (x_j)-\theta \}^{\mathrm { T} }\varvec{\lambda }^{(1)}\\= & {} 1-\frac{n_1(1-\eta )}{n_0} \phi ^{\mathrm { T} }(x_j)\varvec{\lambda }^{(1)}+\frac{n_1(1-\eta )}{n_0}\theta ^{\mathrm { T} }\varvec{\lambda }^{(1)}\\= & {} 1-\frac{n_1(1-\eta )}{n_0} \phi ^{\mathrm { T} }(x_j)\varvec{\lambda }^{(1)}+\frac{n_1(1-\eta )}{n_0} \lambda _0-1\\= & {} \frac{n_1(1-\eta )}{n_0} \lambda _0-\frac{n_1(1-\eta )}{n_0} \phi ^{\mathrm { T} }(x_j)\varvec{\lambda }^{(1)}\\= & {} \frac{n_1(1-\eta )}{n_0}\varvec{\lambda }_1^{\mathrm { T} }\rho (x_j;\beta ). \end{aligned}$$

Plugging these into \(\ell (\varvec{\lambda },\varvec{\nu },\xi )\) we obtain \(\ell (\varvec{\lambda },\varvec{\nu },\xi )=\tilde{\ell }(\varvec{\omega })-n_0\log (n_1/n_0)\). So we arrive at (7) by ignoring the parameter-free term. The expressions for \(S({\varvec{\omega }})\) and \(\hat{p}_i\) follow from the above results.
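
The two-stage structure of this derivation, an inner maximization over the nuisance parameters at fixed \(\beta \) followed by an outer maximization of the resulting profile, can be mimicked numerically. The sketch below uses a toy normal likelihood in place of \(\ell (\varvec{\lambda },\varvec{\nu },\xi )\); the model and all names are illustrative only.

import numpy as np
from scipy.optimize import minimize_scalar

def loglik(beta, eta, x):
    # Toy joint log-likelihood: x_i ~ N(beta, exp(eta)); eta is a nuisance.
    s2 = np.exp(eta)
    return -0.5 * np.sum((x - beta) ** 2) / s2 - 0.5 * len(x) * np.log(s2)

def profile_loglik(beta, x):
    # Inner step: profile the nuisance parameter out at fixed beta.
    inner = minimize_scalar(lambda e: -loglik(beta, e, x),
                            bounds=(-10.0, 10.0), method="bounded")
    return -inner.fun

rng = np.random.default_rng(1)
x = rng.normal(1.5, 2.0, size=300)
# Outer step: maximize the profile log-likelihood over beta alone.
outer = minimize_scalar(lambda b: -profile_loglik(b, x),
                        bounds=(-10.0, 10.0), method="bounded")
print(outer.x)  # approximately the sample mean, the profile MLE of beta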

In the rest of this section we present proofs of the theorems in Sect. 2. We assume that \(\eta =\eta _0\) and \({\varvec{\omega }}=(\varvec{\lambda }^{\mathrm { T} },\beta ^{\mathrm { T} })^{\mathrm { T} }={\varvec{\omega }}_0=(\varvec{\lambda }^{\mathrm { T} }_0,\beta ^{\mathrm { T} }_0)^{\mathrm { T} }\). We denote the conditional nonresponse probability by \(\bar{\pi }(x, y)=1-\pi (x, y)\). To simplify notation, we will suppress the dependence of some quantities on the parameter \(\beta \) and even on the random variables X and/or Y.

1.2 Proof of Theorem 1

It is easy to prove that when \(\bar{\mu }=\mu _0\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), the estimating functions are \(S({\varvec{\omega }})=\sum _{i=1}^n z_i +\mathcal{{O}}_P(1)\), where \(z_i=g(x_i, y_i, d_i)\), \(g=(g_{1}^{\mathrm { T} }, g_{2}^{\mathrm { T} }, g_{3}^{\mathrm { T} })^{\mathrm { T} }\) and

$$\begin{aligned} g_{1}(x,y,d)&= \eta \left( d\left[ \frac{\rho (x)}{\pi (x,y)}+\text{ E }\left\{ \frac{\bar{\pi }(X,Y)\rho (X)}{\eta \pi (X,Y)}\right\} \right] -\rho (x)- \text{ E }\left\{ \frac{\bar{\pi }(X,Y)\rho (X)}{\pi (X,Y)}\right\} \right) ,\\ g_2(x,y,d)&= -\eta \left( d\left[ \frac{\mu (x)-\mu _0}{\pi (x,y)}+\text{ E }\left\{ \frac{\mu (X)-\mu _0}{\eta \pi (X,Y)}\right\} \right] - \text{ E }\left\{ \frac{\mu (X)-\mu _0}{\pi (X,Y)}\right\} \right) , \\ g_3(x,y,d)&= -\frac{1}{\eta }\text{ E }\left\{ \frac{\dot{\pi }_\beta (X,\,Y)}{\pi (X, \,Y)}\right\} (d-\eta ). \end{aligned}$$

It can be shown that in distribution

$$\begin{aligned} \frac{1}{\sqrt{n}}\sum _{i=1}^n z_i\rightarrow N(0, U), \end{aligned}$$
(9)

where U is the variance-covariance matrix of \(Z=g(X,Y,D)\). It is easy to see that when \(\bar{\mu }=\mu _0\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), the entries of \(U=(u_{ij})_{3\times 3}\) are

$$\begin{aligned} u_{11}= & {} {\eta ^2}\text{ E }\left( \frac{\bar{\pi }}{\pi }\rho \rho ^{\mathrm { T} }\right) +\eta (1-\eta )\text{ E }\left( \frac{\bar{\pi }}{\pi }\rho \right) \text{ E }\left( \frac{\bar{\pi }}{\pi }\rho ^{\mathrm { T} }\right) \nonumber \\&+\eta \left\{ \text{ E }\left( \bar{\pi }\rho \right) \text{ E }\left( \frac{\bar{\pi }}{\pi }\rho ^{\mathrm { T} }\right) +\text{ E }\left( \frac{\bar{\pi }}{\pi }\rho \right) \text{ E }\left( \bar{\pi }\rho ^{\mathrm { T} }\right) \right\} , \end{aligned}$$
(10)
$$\begin{aligned} u_{21}= & {} u_{12}^{\mathrm { T} }= -\eta ^2 \left[ \text{ E }\left\{ \frac{\bar{\pi }}{\pi }(\mu -\mu _0)\rho ^{\mathrm { T} }\right\} +\frac{1-\eta }{\eta } \text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) \right. \nonumber \\&\left. \text{ E }\left( \frac{1-\eta +\pi }{\pi }\bar{\pi }\rho ^{\mathrm { T} }\right) \right] , \end{aligned}$$
(11)
$$\begin{aligned} u_{31}= & {} u_{13}^{\mathrm { T} }=-\text{ E }\left( \frac{\dot{\pi }_\beta }{\pi }\right) \text{ E }\left( \frac{1-\eta +\pi }{\pi }\bar{\pi }\rho ^{\mathrm { T} }\right) , \end{aligned}$$
(12)
$$\begin{aligned} u_{22}= & {} \eta (1-\eta )\text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) \text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) ^{\mathrm { T} }\nonumber \\&+\eta ^2\text{ E }\left\{ \frac{(\mu -\mu _0)(\mu -\mu _0)^{\mathrm { T} }}{\pi }\right\} , \end{aligned}$$
(13)
$$\begin{aligned} u_{32}= & {} u_{23}^{\mathrm { T} }=(1-\eta )\text{ E }\left( \frac{\dot{\pi }_\beta }{\pi }\right) \text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) ^{\mathrm { T} }, \end{aligned}$$
(14)
$$\begin{aligned} u_{33}= & {} \frac{1-\eta }{\eta }\text{ E }\left( \frac{\dot{\pi }_\beta }{\pi }\right) \text{ E }\left( \frac{\dot{\pi }_\beta ^{\mathrm { T} }}{\pi }\right) . \end{aligned}$$
(15)

The entries of the Hessian matrix of \(\tilde{\ell }({\varvec{\omega }})\), \(H({\varvec{\omega }})=\{H_{ij}({\varvec{\omega }})\}_{3\times 3}\), are

$$\begin{aligned} H_{11}({\varvec{\omega }})= & {} \sum _{i=1}^{n_1}\frac{\bar{\pi }^2(x_i,y_i;\beta )\rho (x_i)\rho ^{\mathrm { T} }(x_i)}{\{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}^2} +\sum _{j=n_1+1}^n \frac{\rho (x_j)\rho ^{\mathrm { T} }(x_j)}{\{\varvec{\lambda }_{1}^{\mathrm { T} }\rho (x_j;\beta )\}^2}, \\ H_{12}({\varvec{\omega }})= & {} -\sum _{i=1}^{n_1}\frac{\bar{\pi }(x_i,y_i;\beta )\rho (x_i)\{\mu (x_i)-\bar{\mu }\}^{\mathrm { T} }}{\{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}^2},\\ H_{13}({\varvec{\omega }})= & {} \sum _{i=1}^{n_1}\frac{\left[ \{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}I+\bar{\pi }(x_i,y_i)\rho (x_i)\varvec{\lambda }_{1}^{\mathrm { T} }\right] }{\{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}^2}\\&\quad \quad \cdot \Big . \left[ \bar{\pi }(x_i,y_i)\dot{\rho }_{\beta ^{\mathrm { T} }}(x_i)-\rho (x_i)\dot{\pi }_\beta ^{\mathrm { T} }(x_i,y_i)\right] \\&-\sum _{j=n_1+1}^n \frac{\left\{ \varvec{\lambda }_1^{\mathrm { T} }\rho (x_j)I-\rho (x_j)\varvec{\lambda }_{1}^{\mathrm { T} }\right\} \dot{\rho }_{\beta ^{\mathrm { T} }}(x_j)}{\{\varvec{\lambda }_{1}^{\mathrm { T} }\rho (x_j;\beta )\}^2},\\ H_{22}({\varvec{\omega }})= & {} \sum _{i=1}^{n_1}\frac{\{\mu (x_i)-\bar{\mu }\}\{\mu (x_i)-\bar{\mu }\}^{\mathrm { T} }}{\{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}^2},\\ H_{23}({\varvec{\omega }})= & {} -\sum _{i=1}^{n_1}\frac{\{\mu (x_i)-\bar{\mu }\}\varvec{\lambda }_{1}^{\mathrm { T} }\left[ \bar{\pi }(x_i,y_i)\dot{\rho }_{\beta ^{\mathrm { T} }}(x_i)-\rho (x_i)\dot{\pi }_\beta ^{\mathrm { T} }(x_i,y_i)\right] }{\{n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\}^2},\\ H_{33}({\varvec{\omega }})= & {} \sum _{i=1}^{n_1}\frac{\pi (x_i,y_i;\beta )\ddot{\pi }_{\beta \beta }(x_i,y_i;\beta )-\dot{\pi }_{\beta }(x_i,y_i;\beta )\dot{\pi }_{\beta }^{\mathrm { T} }(x_i,y_i;\beta )}{\pi ^2(x_i,y_i,\beta )}\\&-\sum _{i=1}^{n_1}\frac{\left\{ n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\right\} \frac{\partial }{\partial \beta ^{\mathrm { T} }}\dot{\gamma }^{\mathrm { T} }_\beta (x_i,y_i) \varvec{\lambda } -\dot{\gamma }^{\mathrm { T} }_\beta (x_i,y_i) \varvec{\lambda }\left\{ \dot{\gamma }^{\mathrm { T} }_\beta (x_i,y_i) \varvec{\lambda }\right\} ^{\mathrm { T} }}{\left\{ n/n_1+\varvec{\lambda }^{\mathrm { T} }\gamma (x_i,y_i)\right\} ^2}\\&+\sum _{j=n_1+1}^n\frac{ \varvec{\lambda }_1^{\mathrm { T} }\rho (x_j;\beta )\frac{\partial }{\partial \beta ^{\mathrm { T} }}\dot{\rho }^{\mathrm { T} }_\beta (x_j)\varvec{\lambda }_1 -\dot{\rho }^{\mathrm { T} }_\beta (x_j)\varvec{\lambda }_1\left\{ \dot{\rho }^{\mathrm { T} }_\beta (x_j)\varvec{\lambda }_1\right\} ^{\mathrm { T} }}{\left\{ \varvec{\lambda }_1^{\mathrm { T} }\rho (x_j;\beta )\right\} ^2}. \end{aligned}$$

It can also be shown that when \({\varvec{\omega }}={\varvec{\omega }}_0\), \(\eta =\eta _0\), and either \(\bar{\mu }=\mu _0\) or \(\bar{\mu }=n^{-1}\sum _{i=1}^n\mu (x_i)\), we have

$$\begin{aligned} V_n\equiv \frac{1}{n}H(\varvec{\omega })=\frac{1}{n}\frac{\partial S({\varvec{\omega }})}{\partial {\varvec{\omega }}} \rightarrow V=(v_{ij})_{3\times 3}, \;\text{ a.s. } \end{aligned}$$
(16)

Let \(\Sigma _\mu =\text{ var }\{\mu (X)\}\). We have

$$\begin{aligned} V= \left( \begin{array}{ll} V_{11} &{} -\eta \text{ E }\left( \frac{1}{\pi }\Phi \dot{\pi }_\beta ^{\mathrm { T} }\right) \\ -\eta \text{ E }\left( \frac{1}{\pi }\dot{\pi }_\beta \Phi ^{\mathrm { T} }\right) &{} \mathbf 0 _{(p+1)\times (p+1)} \\ \end{array} \right) , \end{aligned}$$
(17)

where

$$\begin{aligned} V_{11}\equiv \left( \begin{array}{ll} v_{11} &{} v_{12} \\ v_{21} &{} v_{22} \\ \end{array} \right) =\eta ^2\text{ E }\left( \frac{\bar{\pi }}{\pi }\Phi \Phi ^{\mathrm { T} }\right) +\eta ^2\left( \begin{array}{ll} 0 &{} 0\\ 0 &{} \Sigma _\mu \\ \end{array} \right) . \end{aligned}$$

Clearly, under the conditions of the theorem, the matrix V is positive definite. It is also true that

$$\begin{aligned} \sqrt{n}(\hat{{\varvec{\omega }}}-{\varvec{\omega }})=-V_n^{-1}\frac{1}{\sqrt{n}}\sum _{i=1}^n z_i+\mathcal{O}_P(n^{-1/2}). \end{aligned}$$
(18)

Therefore, Part (i) of the theorem follows from (9) and (16).

If \(\bar{\mu }=\frac{1}{n}\sum _{i=1}^n\mu (x_i)\), \(\eta =\eta _0\) and \({\varvec{\omega }}={\varvec{\omega }}_0\), then

$$\begin{aligned}S_3({\varvec{\omega }})=\sum _{i=1}^{n} \left( z_i+z_i^* \right) +\mathcal{{O}}_P(1),\end{aligned}$$

where \(z_i^*=[0,\eta \{\mu (x_i)-\mu _0\}^{\mathrm { T} },0]^{\mathrm { T} }\). So the asymptotic variance of \(S({\varvec{\omega }})\) in this case is \(U+W\), where \(W=(w_{ij})_{3\times 3}\) is a symmetric matrix whose entries are

$$\begin{aligned} w_{ij}= & {} 0, \;\; \text{ if } i\ne 2 \hbox { and } j\ne 2, \end{aligned}$$
(19)
$$\begin{aligned} w_{21}= & {} w_{12}^{\mathrm { T} }=\eta ^2\text{ E }\left( \frac{\pi \mu }{\eta }-\mu _0\right) \text{ E }\left( \frac{\bar{\pi }}{\pi }\rho ^{\mathrm { T} }\right) , \end{aligned}$$
(20)
$$\begin{aligned} w_{22}= & {} -\eta ^2\left\{ \text{ E }\left( \frac{\pi \mu }{\eta }-\mu _0\right) \text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) ^{\mathrm { T} }\right. \nonumber \\&\left. +\text{ E }\left( \frac{\mu -\mu _0}{\pi }\right) \text{ E }\left( \frac{\pi \mu }{\eta }-\mu _0\right) ^{\mathrm { T} }\right\} -\eta ^2\text{ var }(\mu ) \end{aligned}$$
(21)
$$\begin{aligned} w_{32}= & {} w_{23}^{\mathrm { T} }= -\eta \text{ E }\left( \frac{\dot{\pi }_\beta }{\pi }\right) \text{ E }\left( \frac{\pi \mu }{\eta }-\mu _0\right) . \end{aligned}$$
(22)

So Part (ii) is also proved.
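
Display (18) is the standard Z-estimator expansion, so the limiting variance of \(\sqrt{n}(\hat{{\varvec{\omega }}}-{\varvec{\omega }})\) takes the sandwich form \(V^{-1}UV^{-1}\). A generic numerical version of this sandwich estimator is sketched below, illustrated on a one-parameter Bernoulli score rather than the paper's \(S({\varvec{\omega }})\); the example and names are assumptions for illustration.

import numpy as np

def sandwich_var(scores, hessians):
    # Estimate var(omega_hat) = V^{-1} U V^{-1} / n from per-observation
    # scores z_i (n x p) and Hessian contributions (n x p x p); cf. (9), (16), (18).
    n = scores.shape[0]
    U = scores.T @ scores / n       # estimates var(Z) = U in (9)
    V = hessians.mean(axis=0)       # estimates the limit V in (16)
    Vinv = np.linalg.inv(V)
    return Vinv @ U @ Vinv / n

rng = np.random.default_rng(2)
d = rng.binomial(1, 0.7, size=1000)
eta_hat = d.mean()                       # solves sum_i (d_i - eta) = 0
z = (d - eta_hat).reshape(-1, 1)         # per-observation scores
h = -np.ones((len(d), 1, 1))             # d z_i / d eta = -1 for every i
print(sandwich_var(z, h)[0, 0], eta_hat * (1 - eta_hat) / len(d))  # equal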

1.3 Proof of Theorem 2

Expanding \(\hat{\mu }_Y\) as a function of \(\hat{\varvec{\omega }}\) at \({\varvec{\omega }}={\varvec{\omega }}_0\), we have

$$\begin{aligned} \sqrt{n}(\hat{\mu }_Y-\mu _Y)= & {} \frac{1}{\sqrt{n}}\sum _{i=1}^{n} \left[ \left\{ \frac{d_iy_i}{\pi (x_i,y_i;\beta )}-\mu _Y\right\} +\frac{1}{\eta }\text{ E }\left\{ \frac{Y}{\pi (X,Y;\beta )}\right\} (d_i-\eta )\right] \\&- \frac{1}{\sqrt{n}}K^{\mathrm { T} }(\beta ,\eta ) V^{-1}S({\varvec{\omega }})+\mathcal{O}_P(n^{-1/2}), \end{aligned}$$

where

$$\begin{aligned} K(\beta ,\eta )= \text{ E }\left\{ \frac{Y}{\pi (X,Y;\beta )} \left( \begin{array}{c} \eta \gamma (X,Y;\beta )\\ \dot{\pi }_\beta (X,Y;\beta ) \end{array} \right) \right\} . \end{aligned}$$

Thus if \(\bar{\mu }=\mu _0\), \(\sqrt{n}(\hat{\mu }_Y-\mu _Y)\rightarrow N(0,\sigma ^2), \) where

$$\begin{aligned} \sigma ^2= & {} \frac{1-\eta }{\eta }\left\{ \text{ E }\left( \frac{Y}{\pi }\right) -\mu _Y\right\} ^2 +\left\{ \text{ E }\left( \frac{Y^2}{\pi }\right) -\frac{\mu _Y^2}{\eta }\right\} \nonumber \\&+K^{\mathrm { T} }(\beta ,\eta )\Sigma K(\beta ,\eta ) \nonumber \\&-2K^{\mathrm { T} }(\beta ,\eta )V^{-1}\left\{ \text{ E }(YZ)+\frac{1}{\eta }\text{ E }\left( \frac{Y}{\pi }\right) \text{ E }\left( {\pi }Z\right) \right\} . \end{aligned}$$
(23)

If \(\bar{\mu }=\frac{1}{n}\sum _{i=1}^n\mu (x_i)\), \(\sqrt{n}(\hat{\mu }_Y-\mu _Y)\rightarrow N(0,\sigma ^2_*), \) where

$$\begin{aligned} \sigma _*^2= & {} \sigma ^2+K^{\mathrm { T} }(\beta ,\eta )V^{-1}WV^{-1} K(\beta ,\eta ) \nonumber \\&-2K^{\mathrm { T} }(\beta ,\eta )V^{-1}\left[ \text{ E }\{YZ^*\}+\frac{1}{\eta }\text{ E }\left( \frac{Y}{\pi }\right) \text{ E }\left\{ {\pi }Z^*\right\} \right] , \end{aligned}$$
(24)

and \(Z^*=[0,\eta \{\mu (X)-\mu _0\},0]^{\mathrm { T} }\). The proof of Theorem 2 is complete.
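The leading term of the expansion above is an inverse-probability-weighted mean with weights \(1/\pi (x_i,y_i;\beta )\). The short simulation below, with a hypothetical logistic missingness model in the spirit of the paper, illustrates why the complete-case mean is biased under non-ignorable missingness while weighting by the true \(\pi \) removes the bias; it is a sketch, not the estimator \(\hat{\mu }_Y\) itself.

import numpy as np

rng = np.random.default_rng(3)
n = 200_000
y = rng.normal(1.0, 1.0, size=n)
# Hypothetical non-ignorable mechanism: P(observed) depends on y itself.
pi = 1.0 / (1.0 + np.exp(-(0.5 + 1.0 * y)))
d = rng.binomial(1, pi)

naive = y[d == 1].mean()        # complete-case mean: biased upward
ipw = np.sum(d * y / pi) / n    # inverse-probability-weighted mean
print(naive, ipw)               # naive is far from 1.0; ipw is close to 1.0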

1.4 Proof of Theorem 3

Without loss of generality, we assume \(p=\text{ dim }(\beta _2)=1\) and \(r(y)=y\). Under the null hypothesis \(H_0\): \(\beta _2=0\), \( \pi (x,y;\beta ) =\eta _0. \) Replacing \(\psi (x,y,\xi )\) by \(\psi _1(x,y,\xi )\) and using the above model for \(\pi =P(D=1)\), we have, by Taylor expansion, when \(\eta =\eta _0\)

$$\begin{aligned} \tilde{\ell }(\hat{\varvec{\omega }}^*)=\tilde{\ell }({\varvec{\omega }}^*)-\frac{1}{2n}\tilde{S}^{\mathrm { T} }({\varvec{\omega }}^*)\tilde{V}^{-1}\tilde{S}({\varvec{\omega }}^*)+o_P(1), \end{aligned}$$
(25)

where \({\varvec{\omega }}^*=(\lambda ^{\mathrm { T} }, \beta _1)^{\mathrm { T} }\) takes on \(({\mathbf \lambda }^{\mathrm { T} }_0, \beta _{10})^{\mathrm { T} }\), \(\tilde{S}({\varvec{\omega }}^*)=\{\tilde{S}_1^{\mathrm { T} }({\varvec{\omega }}^*),\tilde{S}_2({\varvec{\omega }}^*)\}^{\mathrm { T} }\),

$$\begin{aligned} \tilde{S}_1({\varvec{\omega }}^*)= & {} \sum _{i=1}^{n}\left( (d_i-\eta )\left[ \rho (x_i)+\frac{1-\eta }{\eta }\text{ E }\left\{ \rho (X)\right\} \right] \right) +\mathcal{{O}}_P(1),\\ \tilde{S}_2({\varvec{\omega }}^*)= & {} \frac{1-\eta }{\eta } \sum _{i=1}^{n}(d_i-\eta )+\mathcal{{O}}_P(1), \end{aligned}$$

and

$$\begin{aligned}\tilde{V}={\eta (1-\eta )} \left( \begin{array}{ll} A &{} B \\ B^{\mathrm { T} }&{} 0\\ \end{array} \right) ,\quad A=\text{ E }(\rho \rho ^{\mathrm { T} }) ,\quad B=\text{ E }(\rho ). \end{aligned}$$

While using model (1) we have

$$\begin{aligned} \tilde{\ell }(\hat{\varvec{\omega }})=\tilde{\ell }({\varvec{\omega }})-\frac{1}{2n}S^{\mathrm { T} }({\varvec{\omega }})V^{-1}S({\varvec{\omega }})+o_P(1), \end{aligned}$$
(26)

where \({\varvec{\omega }}=(\lambda ^{\mathrm { T} },\beta ^{\mathrm { T} })^{\mathrm { T} }\) takes on \((\lambda ^{\mathrm { T} }_0,\beta ^{\mathrm { T} }_0)^{\mathrm { T} }\) with \(\beta _0=(\beta _{10},\beta _{20}=0)^{\mathrm { T} }\),

$$\begin{aligned}S({\varvec{\omega }})= \left( \begin{array}{l} \tilde{S}({\varvec{\omega }}^*) \\ n\mu _Y\frac{1-\eta }{\eta } (\tilde{\eta }-\eta )\\ \end{array} \right) ,\;\;V= \left( \begin{array}{ll} \tilde{V} &{} V_0 \\ V_0^{\mathrm { T} }&{} 0 \\ \end{array} \right) ,\\ V_0=\eta (1-\eta )(C^{\mathrm { T} },0)^{\mathrm { T} },\;\;C^{\mathrm { T} }= \left\{ \mu _Y, -\text{ E }(y\phi ^{\mathrm { T} }) \right\} . \end{aligned}$$

Inverting the partitioned matrix, we obtain

$$\begin{aligned} V^{-1} =\left( \begin{array}{ll} \tilde{V}^{-1} &{}\quad \mathbf 0 _{(d+1)\times 1} \\ \mathbf 0 ^{\mathrm { T} }_{(d+1)\times 1} &{}\quad 0 \\ \end{array} \right) -\frac{1}{\varsigma ^2} \left( \begin{array}{ll} \tilde{V}^{-1}V_0V_0^{\mathrm { T} }\tilde{V}^{-1} &{}\quad -{\tilde{V}^{-1}V_0} \\ -{V_0^{\mathrm { T} }\tilde{V}^{-1}} &{}\quad 1 \\ \end{array} \right) , \end{aligned}$$

where \( \varsigma ^2= V_0^{\mathrm { T} }\tilde{V}^{-1}V_0 = \eta (1-\eta ) \left( C^{\mathrm { T} }A^{-1}C-\mu _Y^2\right) . \) It follows from (18) that

$$\begin{aligned} \tilde{S}({\varvec{\omega }})=-n\tilde{V} ({\varvec{\omega }}^*-{\varvec{\omega }})+\mathcal{O}_P(1),\;\; \tilde{\eta }-\eta =-\eta ^2B^{\mathrm { T} }\tilde{\lambda }+\mathcal{O}_P(n^{-1}), \end{aligned}$$

where \(\tilde{\eta }=n_1/n\). By (25) and (26) we obtain

$$\begin{aligned} 2\{\tilde{\ell }(\hat{\varvec{\omega }})-\tilde{\ell }(\hat{{\varvec{\omega }}}^*)\}= & {} \frac{n}{\varsigma ^2}\left\{ \eta (1-\eta )D^{\mathrm { T} }\tilde{\lambda }\right\} ^2, \end{aligned}$$

where \(D=C-\mu _YB\). By Theorem 2, \(\sqrt{n}D^{\mathrm { T} }\tilde{\lambda }\rightarrow N(0, \sigma _0^2)\) in distribution, where

$$\begin{aligned} \sigma _0^2= & {} (D^{\mathrm { T} },0)\tilde{V}^{-1}\tilde{U}\tilde{V}^{-1}(D^{\mathrm { T} },0)^{\mathrm { T} },\\ \tilde{U}= & {} \frac{1-\eta }{\eta } \left( \begin{array}{ll} \tilde{A} &{} (1-\eta ) B \\ (1-\eta ) B^{\mathrm { T} }&{} (1-\eta )^2 \\ \end{array} \right) , \;\tilde{A}=A-(1-\eta ^2) \left( \begin{array}{ll} 0 &{} 0\\ 0 &{} \Sigma _\phi \\ \end{array} \right) , \end{aligned}$$

and \(\Sigma _\phi =\text{ var }_F(\phi )\). Simple matrix algebra shows that

$$\begin{aligned} \sigma _0^2= & {} \frac{1}{\eta ^3(1-\eta )} D^{\mathrm { T} }A^{-1}\tilde{A}A^{-1}D =\frac{\varsigma ^2}{\{\eta (1-\eta )\}^2}. \end{aligned}$$

Consequently, \(R=2\{\tilde{\ell }(\hat{\varvec{\omega }})-\tilde{\ell }(\hat{{\varvec{\omega }}}^*)\}\) converges in distribution to the \(\chi ^2(p)\) distribution.
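
The \(\chi ^2(p)\) limit of R is a Wilks-type result, and its calibration can be checked by simulation. The sketch below uses a fully parametric Bernoulli likelihood ratio in place of the empirical likelihood ratio studied here; the sampling setup is hypothetical.

import numpy as np
from scipy import stats

def lrt_stat(d, eta0):
    # 2 * {log-likelihood at the MLE - log-likelihood at eta0}, Bernoulli(eta) data.
    n1, n = d.sum(), len(d)
    ll = lambda e: n1 * np.log(e) + (n - n1) * np.log(1.0 - e)
    return 2.0 * (ll(n1 / n) - ll(eta0))

rng = np.random.default_rng(4)
R = np.array([lrt_stat(rng.binomial(1, 0.6, size=500), 0.6)
              for _ in range(2000)])
# Under the null, the rejection rate at the chi^2(1) 95% point is ~0.05:
print(np.mean(R > stats.chi2.ppf(0.95, df=1)))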

Cite this article

Guan, Z., Qin, J. Empirical likelihood method for non-ignorable missing data problems. Lifetime Data Anal 23, 113–135 (2017). https://doi.org/10.1007/s10985-016-9381-0
