Background
When the outcome of a study is binary, the most common approach to estimating the effect is to calculate an odds ratio (OR) as an estimate of the relative risk (RR) using logistic regression. When the outcome prevalence is high (>10%), the OR can still be estimated from the logistic regression model, but it is no longer an acceptable estimate of the RR [1]. Interpreting the OR as equivalent to the RR still occurs in medical research, leading to overstated effects in study findings [2, 3]. The degree of overstatement depends on the outcome rate (e.g., disease prevalence): a higher outcome rate leads to greater exaggeration. Zhang and Yu proposed a formula to convert the adjusted OR derived from a logistic regression model to a risk ratio in studies with common outcomes [4]. However, McNutt et al. noted that this method produces biases in both point estimates and confidence intervals (CIs) [5]. Miettinen suggested the doubling-of-cases method to estimate the OR as an approximation of the RR using logistic regression on a modified dataset [6]. Schouten et al. improved the method by applying robust standard errors [7]. However, Skov et al. noted that this method can produce predicted prevalences greater than one [8]. Several model-based methods have been proposed to estimate the RR and its CI directly [9]. The most popular are the robust (also known as modified) Poisson model [10-12] and the log-binomial model [8, 11, 13]. Simulation studies suggest that the log-binomial model and the robust Poisson model perform equally well [3, 11, 12, 14] when sample sizes are reasonably large. Of the two models, the robust Poisson model has been reported to be less affected by outliers than the log-binomial method [15]. However, research in this area is very limited. The purpose of this study is to evaluate the performance of the two methods by simulation in several scenarios in which outliers exist and to provide insight into the selection of appropriate models.
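The Zhang–Yu conversion mentioned above has a simple closed form, RR = OR / (1 − P0 + P0 × OR), where P0 is the outcome risk in the unexposed (reference) group. A minimal sketch with illustrative numbers (not part of the study's analysis):

```python
def or_to_rr(odds_ratio, p0):
    """Zhang-Yu approximation: convert an (adjusted) odds ratio to a
    risk ratio, given p0, the outcome risk in the unexposed group."""
    return odds_ratio / (1.0 - p0 + p0 * odds_ratio)

# With a common outcome (P0 = 40%), an OR of 2.5 corresponds to a much
# smaller RR; reading the OR as an RR would overstate the effect.
print(or_to_rr(2.5, 0.40))  # 2.5 / (0.6 + 1.0) = 1.5625

# With a rare outcome, OR and RR nearly coincide, as expected.
print(or_to_rr(2.5, 0.01))  # about 2.46
```

As McNutt et al. [5] noted, applying this conversion to an adjusted OR can still bias both the point estimate and its CI, which motivates the direct model-based approaches discussed in this study.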
Discussion
In this study, we evaluated the statistical properties of the two most popular model-based approaches to estimating RR for common binary outcomes in various scenarios in which outliers existed. The results suggest that for data coming from a population in which the true relationship between the outcome and the covariate is not in a simple form (e.g., containing a higher-order term), the robust Poisson model consistently outperforms the log-binomial model even when the level of contamination is low (e.g., 2%). Statistical software uses the iterative weighted least squares (IWLS) approach, or variations of it, to find MLEs for generalized linear models. For log-binomial models, the weights used by the IWLS approach contain the term 1/(1 − p), where p = exp(X^T β) with a range from 0 to 1 [19]. Lumley et al. pointed out that the MLE of a log-binomial model is likely to be too sensitive to outliers because a very large p (referred to as μ by the authors) has a large influence on the weights, even though the sum of the covariate values is still bounded [15]. In our study, both the MLEs generated by the log-binomial models and the pseudo-likelihood estimators produced by the robust Poisson models deteriorated when outliers were introduced. However, the level of deterioration differed when the relationship between the confounder and the outcome was not in a simple form, possibly because of the larger "μ"s yielded by the log-binomial model when the higher-order term of Z was added to the model.
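To see why the log-binomial IWLS weights are fragile, note how the 1/(1 − p) factor explodes as a fitted probability approaches one; a single observation with a large fitted p can then dominate the weighted least-squares step. A small illustration (the probabilities are arbitrary):

```python
def iwls_weight_factor(p):
    """The 1/(1 - p) term that enters the IWLS working weights of a
    log-binomial model, where p = exp(x' beta) is the fitted probability."""
    return 1.0 / (1.0 - p)

# The factor grows without bound as p -> 1, so an outlying record with a
# fitted probability near 1 dominates the estimation.
for p in (0.5, 0.9, 0.99, 0.999):
    print(p, iwls_weight_factor(p))
```

By contrast, the Poisson working weights contain no such 1/(1 − p) term, which is consistent with the robust Poisson model's better behavior under contamination observed here.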
Deddens and Petersen compared the log-binomial and robust Poisson models using three real-life examples [20]. In two of the three examples, the models produced different point estimates and SEs; in one of these, the difference was nearly two-fold for both the point estimate and the SE of one of the covariates. The authors concluded that "the decision on which method to use is very important" [20]; however, since the truth was unknown, it is impossible to tell which of the two models yields estimates closer to the truth. In one of the two examples in which differences between the two models were observed, the model contained a higher-order (quadratic) term. It is unclear whether the complexity of the model degraded performance, especially for the log-binomial model.
Of the two methods, the log-binomial method is generally preferred because the MLEs from log-binomial models are more efficient than the pseudo-likelihood estimators used by the robust Poisson models ([10], page 2300). Spiegelman and Hertzmark recommend the log-binomial model over the robust Poisson model when convergence is not an issue [21]. Very small differences were observed in a simulation study with a sample size of 100 and a single uniformly distributed independent variable [14]. When the data perfectly follow a log-binomial distribution (i.e., without outliers), the current study did not observe any difference in either bias or variance between the log-binomial and the robust Poisson models for large (n = 1,500) and moderate (n = 500) sample sizes. The gain in efficiency therefore appears to benefit log-binomial models only in small samples.
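To make concrete what the robust Poisson estimator does, a Poisson working model can be fitted to binary outcomes by maximum likelihood, with sandwich (robust) standard errors replacing the model-based ones that would otherwise be wrong for Bernoulli data. The following is our own toy numpy implementation under a log link, not the code used in the study; the simulated data assume a baseline risk of 0.2 and a true RR of 2:

```python
import numpy as np

def robust_poisson(X, y, iters=25):
    """Fit log(E[y]) = X @ beta by Poisson maximum likelihood (Newton-
    Raphson) and return coefficients with sandwich (HC0) standard errors,
    which remain valid when y is binary rather than a true Poisson count."""
    beta = np.zeros(X.shape[1])
    for _ in range(iters):
        mu = np.exp(X @ beta)
        # Newton step: solve (X' diag(mu) X) delta = X'(y - mu)
        H = X.T @ (X * mu[:, None])
        beta = beta + np.linalg.solve(H, X.T @ (y - mu))
    mu = np.exp(X @ beta)
    bread = np.linalg.inv(X.T @ (X * mu[:, None]))
    meat = X.T @ (X * ((y - mu) ** 2)[:, None])
    se = np.sqrt(np.diag(bread @ meat @ bread))
    return beta, se

# Toy data: binary exposure doubling the risk (true RR = 2, baseline 0.2).
rng = np.random.default_rng(0)
n = 20000
x = rng.integers(0, 2, n)
p = 0.2 * 2.0 ** x                      # 0.2 unexposed, 0.4 exposed
y = (rng.random(n) < p).astype(float)
X = np.column_stack([np.ones(n), x])
beta, se = robust_poisson(X, y)
print(np.exp(beta[1]))                  # estimated RR, close to 2
```

The "bread–meat–bread" product is the standard HC0 sandwich variance; the same estimator is available in mainstream statistical software, which is why the robust Poisson model is easy to apply in practice.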
It is not surprising to observe negative biases when the simulation datasets were contaminated: flipping the outcomes of records with very low or very high probabilities tends to weaken the association between the exposure and the outcome, pulling the estimates toward the null. Nor is it surprising that biases were more elevated at an outcome rate of 10% than at 25% or 40%, since the impact of flipping on the estimates is expected to be larger in scenarios with more extreme outcome probabilities (close to 0% or 100%).
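The direction of this bias can be seen with simple arithmetic in a simplified setting where every outcome, rather than only the extreme-probability records, is flipped with the same probability e: a true risk p becomes p(1 − e) + (1 − p)e, which pulls the risk ratio toward 1. A quick check with illustrative numbers:

```python
def flipped_risk(p, e):
    """Risk after each binary outcome is flipped with probability e."""
    return p * (1 - e) + (1 - p) * e

p0, p1, e = 0.20, 0.40, 0.02            # true RR = 2.0, 2% contamination
rr_true = p1 / p0
rr_contaminated = flipped_risk(p1, e) / flipped_risk(p0, e)
print(rr_true, rr_contaminated)         # 2.0 vs about 1.91: toward the null

# The attenuation is stronger at more extreme baseline risks:
print(flipped_risk(0.10, e) / flipped_risk(0.05, e))  # about 1.71, not 2.0
```

Although the study's contamination mechanism targeted extreme-probability records specifically, this back-of-the-envelope version shows the same qualitative pattern: negative bias, larger at lower outcome rates.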
Robust methods to detect outliers, especially high-leverage points, for logistic regression models are available in popular statistical software packages [22-24]. However, similar approaches for log-binomial models are not yet available in commercial software packages, although the adaptation of the diagnostic statistics from logistic regression models has been demonstrated and applied in a real-life example [25]. Efforts to develop goodness-of-fit tests have produced reasonable type I error rates but low to moderate power [25]. For this reason, the robust Poisson model appears to be the more attractive choice because it provides more robust results when outliers go undetected.
For the COPY method, the accuracy of the parameter estimates depends on the number of virtual copies. Petersen and Deddens pointed out that "with 10,000 copies the results were generally accurate to three decimal places" in their scenarios [17]. Therefore, the number of virtual copies we used (1,000,000) should provide sufficient accuracy for our evaluation. The number of virtual copies should nevertheless be chosen carefully, because too many copies may cause the model to fail to converge.
Occasionally, convergence failure remains an issue for log-binomial models even when the COPY method is applied. Petersen and Deddens [14] included a continuous exposure variable (referred to as the continuous covariate by the authors) when applying the COPY method in their simulation and reported convergence rates between 70% and 100%. In this study, where C, the number of virtual copies, was set to 1,000,000, the COPY method converged in all 1,000 simulations in all 48 scenarios with the linear confounder, and in 30 out of 48 scenarios with the non-linear confounder. In the 18 scenarios in which the COPY method did not converge in all 1,000 simulations (i.e., it failed in one or more), the convergence rates ranged from 0.983 to 0.999 (median 0.996). If the COPY method fails to converge and maximum likelihood-based estimators are desired, one can use the Non-Linear Programming (NLP) procedure in SAS [26]. The NLP method is computationally expensive but does not encounter convergence issues.
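The COPY(C) pseudo-dataset (C − 1 copies of the original records plus one copy with the outcomes reversed) need not be materialized record by record; it is equivalent to a frequency-weighted dataset of only 2n rows, which is how it is usually passed to a GLM routine. A sketch of that construction (our illustration, not the study's code):

```python
import numpy as np

def copy_method_data(X, y, C=1000):
    """Build the frequency-weighted equivalent of the COPY(C) pseudo-
    dataset: the original records with weight (C - 1)/C, stacked with the
    same records whose outcomes are reversed, with weight 1/C."""
    X_aug = np.vstack([X, X])
    y_aug = np.concatenate([y, 1 - y])
    w = np.concatenate([np.full(len(y), (C - 1) / C),
                        np.full(len(y), 1 / C)])
    return X_aug, y_aug, w

# Tiny example: two records (intercept + binary exposure).
X = np.array([[1.0, 0.0], [1.0, 1.0]])
y = np.array([0.0, 1.0])
X_aug, y_aug, w = copy_method_data(X, y, C=1000)
print(w.sum())   # total weight equals the original sample size, 2.0
```

Because the total weight stays at n, the weighted likelihood is a slight perturbation of the original one; larger C means a smaller perturbation (and estimates closer to the boundary MLE) but, as noted above, a greater risk of convergence failure.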
Given the lack of robustness of log-binomial models, the authors recommend using robust Poisson models to estimate RR when there are continuous covariates in the model, especially when the covariates are not in a simple linear form. Because of the robust Poisson model's lower efficiency in small samples, the log-binomial model may still be the choice when the sample size is small.
A potential limitation of this study is that the complex forms of the relationship between the confounder and the outcome were generated by a quadratic equation. It is not clear whether the findings generalize to other complex situations. In addition, all of the generated outliers occurred in records with very low or very high probabilities, and such outliers are more likely to be leverage points. The impact of the outliers generated in this study could therefore be larger than that in a study whose outliers were distributed differently.
In summary, the current study provides evidence supporting the robustness of the robust Poisson model in various scenarios. Further research should focus on model misspecification due to deviations of the underlying probabilities. It is also desirable for future studies to develop methods to identify leverage points and efficient goodness-of-fit tests for log-binomial models.
Competing interests
The authors declare that they have no competing interests.
Authors’ contributions
WC conceived and carried out the study, and drafted the manuscript. JS participated in the design, data generation and interpretation of the analyses. LQ participated in the design, simulation and interpretation of the analyses. SA participated in the design and provided guidance. All the authors read and approved the final manuscript.