Background
Missing data occur in almost all types of studies and lead to inefficient and biased parameter estimates if they are handled improperly. In a survey, missing data occur when a selected respondent refuses to participate (unit nonresponse) or does not answer some of the survey questions (item nonresponse) [1, 2]. For unit nonresponse, weighting adjustment is applied, in which the weights of respondents are increased to represent nonrespondents [3], whereas for item nonresponse, imputation methods are employed [1, 4].
Missing data arise under three mechanisms: missing completely at random (MCAR), missing at random (MAR) and missing not at random (MNAR) [1, 5]. When data are MCAR, the probability of missingness depends neither on the missing values nor on the observed data. An example is survey papers being lost accidentally. When data are MAR, the probability of missingness depends only on the observed data, not on the missing values themselves. For example, people from different demographic backgrounds may decline to answer a question because of their beliefs or traditions. When data are MNAR, the probability of missingness depends on the missing values themselves, possibly in addition to the observed data. For example, people with high incomes are less likely to report their incomes than people with average or low incomes. The MCAR assumption can be tested statistically with Little’s test [6]. However, there is no clear technique to distinguish between MAR and MNAR, so these two mechanisms can only be reasoned or hypothesized [1, 4].
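The three mechanisms can be illustrated with a small simulation. The following Python sketch is not part of this study’s analysis: the variable names, the logistic missingness model and its coefficient of 2 are assumptions chosen only for illustration. A variable y with true mean 0 is made missing with a probability that is constant (MCAR), depends on a fully observed covariate x (MAR), or depends on y itself (MNAR), and the mean of the remaining observed values is then computed.

```python
import math
import random


def simulate_data(n=20000, seed=1):
    """Return (x, y) pairs: x is always observed, y = x + noise has true mean 0."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        rows.append((x, x + rng.gauss(0.0, 1.0)))
    return rows


def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))


def complete_case_mean(rows, mechanism, seed=2):
    """Delete y under the given (assumed) mechanism and average what is left."""
    rng = random.Random(seed)
    observed = []
    for x, y in rows:
        if mechanism == "MCAR":
            p_missing = 0.5                 # unrelated to x and y
        elif mechanism == "MAR":
            p_missing = logistic(2.0 * x)   # depends only on observed x
        elif mechanism == "MNAR":
            p_missing = logistic(2.0 * y)   # depends on the missing value y
        else:
            raise ValueError("mechanism must be MCAR, MAR or MNAR")
        if rng.random() >= p_missing:
            observed.append(y)
    return sum(observed) / len(observed)


if __name__ == "__main__":
    data = simulate_data()
    for name in ("MCAR", "MAR", "MNAR"):
        print(name, round(complete_case_mean(data, name), 3))
```

Under these assumptions, the complete-case mean stays close to the true mean only under MCAR. Under MAR and MNAR it is shifted downwards, because larger values of y are removed more often.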
Several studies have examined methods for handling missing data under each missing mechanism [7]. The most common method is case deletion, in which subjects with missing values are deleted. The results from this method are inefficient but unbiased when the missing data satisfy the MCAR assumption. However, when data are not MCAR, the results are both inefficient and biased [4, 5]. Methods such as mean substitution, last observation carried forward, hot deck imputation, cold deck imputation and regression imputation are forms of single imputation, in which each missing value is replaced by one synthetic value [2, 8]. The first two of these methods assume that missing data are MCAR, while the remaining methods assume that missing data are MAR [7]. The results obtained from mean substitution and hot deck imputation are biased under all three missing mechanisms. However, the results obtained from conditional mean imputation are unbiased under MCAR and MAR, but may be biased under MNAR [4]. Furthermore, in single imputation, values are imputed only once, so the uncertainty created by the missing values is not accounted for. As a result, standard errors and p-values are too small and confidence intervals are too narrow [5, 9]. In multiple imputation, unlike single imputation, missing values are imputed more than once and the uncertainty created by the missing values is incorporated, resulting in larger standard errors and wider confidence intervals [1]. In addition, multiple imputation is reported to provide unbiased results under either the MAR or the MNAR assumption [4].
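How multiple imputation carries this extra uncertainty into the standard error can be sketched with Rubin’s rules, the standard way of pooling estimates across imputed data sets. The text above does not state which pooling rule is used, so applying Rubin’s rules here is an assumption, and the function name and the numbers in the usage note are hypothetical.

```python
import math
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool m point estimates and their squared standard errors (Rubin's rules).

    Returns the pooled estimate and the pooled standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two estimates with matching variances")
    q_bar = mean(estimates)            # pooled point estimate
    within = mean(variances)           # average within-imputation variance
    between = variance(estimates)      # between-imputation variance (divisor m - 1)
    total = within + (1.0 + 1.0 / m) * between
    return q_bar, math.sqrt(total)


if __name__ == "__main__":
    # Hypothetical estimates from m = 3 imputed data sets.
    print(pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5]))
```

For the hypothetical inputs above, the within-imputation variance is 0.5 and the between-imputation variance is 1.0, so the total variance is 0.5 + (1 + 1/3) × 1.0. The pooled standard error exceeds the single-imputation value whenever the imputed data sets disagree, which is the widening of confidence intervals described above.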
In southern Asia and sub-Saharan Africa, more than half of women give birth at home [10]. Therefore, analyses restricted to infants delivered in hospitals would be biased [11]. As an alternative to hospital-based data, household surveys began to collect information on infants born outside health facilities [12]. However, birth weight data from household surveys are limited, since many mothers are unable to provide a numeric birth weight [11, 12]. The Nepal Demographic and Health Survey (NDHS) 2011 reported that only 36% of infants were weighed at birth [13]. The same survey also reported that the prevalence of low birth weight (LBW) in Nepal was 12%, calculated from the available birth weights. Studies of LBW in Nepal using Demographic and Health Survey (DHS) data have either used the mother’s recall of the infant’s size at birth as an alternative to birth weight [14] or analyzed the subset of infants with measured birth weight [15] to identify the prevalence of LBW and the factors associated with it. Estimating the prevalence of LBW and identifying its determinants only from the available birth weights may be biased when the missing birth weights are not MCAR. Besides birth weight, missing values are also present in the determinants of birth weight, but most previous studies did not handle them, so their results may be misleading. Thus, the main objective of this study is to identify factors associated with LBW using multiple imputation to handle missing data in both the outcome and the determinants.
Discussion
The overall prevalence of LBW in this study is 15.4%, which differs from the 11.5% found by [15] in a study that included only infants with measured birth weight. The difference is expected, because this study additionally includes 3,318 infants with missing birth weight in the analysis. The study by [14] found the prevalence of small size at birth to be 16%, which is close to the prevalence in this study. This may be because the mother’s recall of the infant’s size at birth, along with other variables, is used to impute missing values in this study. As shown in Table 1, the prevalences of LBW for determinants such as mother’s age at child’s birth, gender of child, residence, ethnicity and parity are almost equal across subgroups. This suggests that these subgroups have a similar chance of having LBW infants. In this study, the prevalence of LBW is higher among mothers living in poorer conditions (being poor, using highly polluting cooking fuels, not attending ANC visits and not consuming iron tablets during pregnancy) than among mothers in the corresponding comparison subgroups, and this finding is consistent with the previous study by [14].
The prevalences of LBW by BMI and ethnicity differ, surprisingly, from common expectation. For BMI, overweight women have a lower prevalence and lower odds of LBW than normal-weight and underweight women. A possible explanation is that overweight mothers are likely to give birth to bigger babies and underweight mothers to smaller babies. This finding is consistent with the studies by [32, 33]. Furthermore, the results of this study show that the prevalence of LBW is higher among relatively advantaged mothers than among relatively disadvantaged mothers (janajati). Although there have been studies of the effect of ethnicity on LBW, they were performed in high-income countries [34, 35]. Those studies suggest that mothers from advantaged groups are less likely to give birth to LBW infants. However, in this study, the differences in LBW between mothers of different ethnic backgrounds are not statistically significant, because the p-value for the unadjusted odds ratio is higher than 0.05. Therefore, it cannot be concluded that the odds of having LBW infants differ between mothers of different ethnicities.
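The link between an unadjusted odds ratio and its significance at the 0.05 level can be sketched from a 2×2 table. The counts below are hypothetical and are not taken from this study, and the Wald interval on the log odds ratio is an assumed method, since the text does not state how its intervals were computed.

```python
import math


def unadjusted_odds_ratio(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% confidence interval from a 2x2 table.

    a, b: LBW and non-LBW infants in group 1
    c, d: LBW and non-LBW infants in group 2 (reference)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


if __name__ == "__main__":
    # Hypothetical counts: 20 of 100 LBW in group 1, 10 of 100 in group 2.
    print(unadjusted_odds_ratio(20, 80, 10, 90))
```

When the 95% interval contains 1, the two-sided Wald p-value is above 0.05, so an odds ratio that looks different from 1 can still be statistically indistinguishable from it. This is the situation described above for ethnicity.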
The current study finds that a mother has higher odds of giving birth to an LBW baby when decisions on her use of health services are made only by others rather than by herself. This finding is supported by [36], in which women with the lowest decision-making autonomy were more likely to have LBW infants. This is probably because women with the lowest decision-making autonomy over their health care are less likely to receive regular health checkups, ANC visits during pregnancy, safe deliveries and health information regarding pregnancy and childbirth. In addition, women with the lowest decision-making autonomy over their own health may have poor nutritional intake during pregnancy, which may consequently impair fetal growth [36]. ANC visits and consumption of iron tablets during pregnancy are not significantly associated with LBW in the current study. However, the studies by [14, 15] found that mothers who did not attend ANC visits and mothers who did not consume iron tablets during pregnancy were more likely to give birth to LBW infants. This difference may be because [14, 15] treated missing values for ANC visits and iron tablet consumption as no ANC visit and no consumption of iron tablets, respectively. This study also finds that mothers who use highly polluting fuel are more likely to give birth to LBW infants, a finding supported by a study conducted in India [37]. However, cooking fuel was not significant in the previous studies conducted in Nepal by [14, 15]. This is probably because [14, 15] assumed that mothers who were not usual members of the household (non-de jure residents) used highly polluting cooking fuel.
The current study has missing data in variables such as birth weight, BMI, ANC visits, consumption of iron tablets during pregnancy, cooking fuel and women’s decision on the use of health services. For birth weight, although the percentage of infants weighed at birth rose considerably over 5 years, from 17% in 2006 to 36% in 2011 [13, 38], home delivery is still the preferred choice for most mothers in Nepal, as stated in [39, 40]. Consequently, the problem of missing data on birth weight may continue for a long period. This suggests promoting and strengthening institutional delivery, and providing weighing scales and training to community health workers so that infants born at home can be weighed. Missing data in other variables, however, can be minimized with other measures. For instance, in DHS surveys, questions on cooking fuel are collected at the household level and the answers are then assigned to individuals in the individual data file. Thus, a mother who is not a member of the household lacks data on cooking fuel. This problem can be avoided if questions on cooking fuel are also included in the women’s questionnaire.
Multiple imputation is employed in this study to handle missing data because an analysis based only on complete cases with measured birth weight is not appropriate: missing data are present in more than one variable and are assumed to be MAR. Moreover, multiple imputation reduces bias compared with complete-case analysis, but this does not mean that imputing missing values removes the bias completely.
One limitation of this study is that the efficiency of multiple imputation cannot be determined, because no complete data set is available for comparison. Secondly, this efficiency might be reduced by the large amount of missing data. The study by [7] noted that the results of statistical analysis are more prone to bias when more than 10% of the data are missing. However, as stated by [41], the missing data pattern and the missing mechanism are more important than the percentage of missing data. Furthermore, the current study used secondary data; thus, the exact reason for missing data is not clear for many variables.
Acknowledgements
We acknowledge Thailand’s Education Hub for ASEAN Countries (TEH-AC) for supporting the Master’s degree of US at Prince of Songkla University. We would like to express our sincere gratitude to Prof Don McNeil for providing guidance and support. We also thank MEASURE DHS for granting us permission to conduct this study.