Ethics, consent and permissions
This study was approved by the Scientific and Ethics Review Unit of the Kenya Medical Research Institute (KEMRI). Additionally, it was approved by the Ministry of Health, with the Medical Superintendents of participating hospitals giving consent for participation. Individual consent for access to de-identified patient data was not required.
Study participants, data sources and management
All children aged 2–59 months admitted as paediatric inpatients to the selected hospitals from 1 February 2014 to 28 February 2016 who had non-severe pneumonia at admission were eligible for inclusion in this study. This was determined from the clinician diagnosis and clinical signs documented in patient records. To avoid confusion, we have used the term pneumonia to refer to children with a documented clinical diagnosis of pneumonia, and the terms non-severe and severe pneumonia to refer to those for whom WHO, under the 2013 revised definitions, recommends outpatient and inpatient care, respectively. The ideally diagnosed and managed population consisted of patients with a clinician-assigned admission diagnosis of non-severe pneumonia who, based on WHO guidelines, had clinical signs supporting this diagnosis and were treated with penicillin monotherapy. We excluded children born before the introduction of the pneumococcal conjugate vaccine to the national childhood immunisation schedule in January 2011; thus, the study population included children who were born after the introduction of both the pneumococcal and
Haemophilus influenzae type B (Hib) conjugate vaccines (introduced nationally in 2001). Comprehensive data collected for these admissions comprised clinical, investigation and treatment data focused on admission and discharge events, with up to 350 variables per patient encounter collected. These variables span different disease conditions. A detailed description of the methods of data collection and analysis is reported elsewhere [
11]. For our study, 37 variables were used: 18 in the pneumonia classification criteria and in identifying the analysis population, and 19 in the subsequent statistical analysis, 6 of which were interaction terms (variables used to test whether the effect of one independent variable differs depending on the level of another independent variable). In brief, hospitals implemented two data collection tools (a paediatric admission record and a discharge form), with one clerical assistant posted to each hospital to collect data from the medical records and laboratory reports. Data collection was conducted as soon as possible after discharge by abstracting data from inpatient paper records into a non-proprietary electronic tool, Research Electronic Data Capture (REDCap) [
12]. Data quality reports were generated by R scripts [
13] based on validation rules and metadata pulled from REDCap’s application programming interface. These reports were fed back to the hospitals to improve the quality of clinical data used in this research. We have reported in detail elsewhere the process by which we established a clinical information network in Kenya, the multiple unique challenges we faced including the development of new data collection procedures and new methods to implement the provision of accurate reporting to hospitals [
11].
Quantitative variables
Our prognostic models focused on paediatric inpatient hospital mortality, described by a binary variable (dead or alive). Predictors were identified a priori guided by clinical expert opinion and literature review. We selected variables posited to be associated with mortality and which could also be widely ascertained in low-resource clinical settings. To denote nutritional status, we used recorded weight and age to retrospectively compute weight-for-age Z-scores (WAZ) using WHO child growth standards [
14], as data for these two variables were complete for the majority of patients studied. This resulted in the following predictors being selected, covering demographics and clinical characteristics:
Age < 12 months (binary), sex female (binary), respiratory rate ≥ 70 breaths/min (binary), temperature ≥ 39 °C (binary), weight-for-age Z-score (ordinal, 3 levels), dehydration status (ordinal, 3 levels), pallor (ordinal, 3 levels), malaria status (ordinal, 3 levels), presence of ≥ 1 comorbidity (binary), hospital in malaria endemic area (binary), and acute nutrition status (binary). Table 1 provides a description of the levels of the ordinal variables. Severity of pneumonia was categorised based on documented WHO clinical criteria (non-severe vs severe) [15]. Dummy binary variables were created for all levels of the ordinal predictors. All predictors were assessed at the time of admission.
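The WHO growth standards derive Z-scores from age- and sex-specific LMS reference values (Box-Cox power L, median M, coefficient of variation S). Purely as an illustrative sketch of how a weight-for-age Z-score and its three-level ordinal category could be computed (the study itself used R, and the LMS constants below are hypothetical stand-ins for the official WHO reference tables):

```python
import math

def lms_zscore(measurement, L, M, S):
    """Z-score under the LMS method used by the WHO growth standards:
    ((x / M) ** L - 1) / (L * S), with the log form in the limit L -> 0."""
    if L == 0:
        return math.log(measurement / M) / S
    return ((measurement / M) ** L - 1.0) / (L * S)

def waz_category(z):
    """Collapse a weight-for-age Z-score into the three ordinal levels
    used in the analysis: > -2SD, -2 to -3SD, < -3SD."""
    if z > -2:
        return "> -2SD"
    if z >= -3:
        return "-2 to -3SD"
    return "< -3SD"

# Hypothetical LMS constants for illustration only; real analyses must
# look up L, M and S by age and sex in the WHO reference tables.
z = lms_zscore(measurement=7.2, L=0.2, M=9.0, S=0.11)
category = waz_category(z)
```

In a real analysis the LMS lookup, not the formula, is the hard part; the formula itself is fixed by the WHO standard.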
Table 1
Descriptive summary statistics of the included predictors and variables of interest (N = 10,687)
Age < 12 months | No | 5719 (53.51%) |
Yes | 4968 (46.49%) |
Female | No | 5856 (54.8%) |
Yes | 4736 (44.32%) |
Missing | 95 (0.89%) |
Pallor | None | 7613 (71.24%) |
Mild/moderate | 374 (3.5%) |
Severe | 98 (0.92%) |
Missing | 2602 (24.35%) |
Respiratory rate ≥ 70 breaths/min | No | 6622 (61.96%) |
Yes | 4065 (38.04%) |
Weight-for-age Z-score (WAZ) | > –2SD | 8311 (77.77%) |
–2 to –3SD | 1202 (11.25%) |
< –3SD | 719 (6.73%) |
Missing | 455 (4.26%) |
Temperature ≥ 39 °C | No | 6577 (61.54%) |
Yes | 1257 (11.76%) |
Missing | 2853 (26.7%) |
Dehydration | No dehydration | 10,026 (93.81%) |
Some dehydration | 622 (5.82%) |
Missing | 39 (0.36%) |
Malaria | No malaria | 9611 (89.93%) |
Non-severe malaria | 1076 (10.07%) |
Hospital located in malaria endemic area | Yes | 4447 (41.61%) |
No | 6240 (58.39%) |
Acute malnutrition | None/at risk | 10,572 (98.92%) |
Moderate | 115 (1.08%) |
Presence of comorbidity^a | No | 7330 (68.59%) |
Yes | 3357 (31.41%) |
Patients with non-severe pneumonia with an additional diagnosis of either (1) severe dehydration or (2) severe malaria were recoded to severe pneumonia, since either diagnosis would render the patient ineligible for outpatient care. All cases of severe pneumonia were excluded from the analysis. Pneumonia cases with an additional admission diagnosis of meningitis, acute malnutrition or shock were also excluded from the study sample; these conditions follow alternative management protocols under the clinical guidelines [
16].
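The recoding and exclusion rules above amount to a small filtering step. A minimal Python/pandas sketch with hypothetical column names (not the study's actual REDCap fields):

```python
import pandas as pd

# Toy admission records; column names are hypothetical, not the
# study's actual REDCap fields.
df = pd.DataFrame({
    "severity": ["non-severe", "non-severe", "non-severe", "severe"],
    "severe_dehydration": [False, True, False, False],
    "severe_malaria": [False] * 4,
    "meningitis": [False] * 4,
    "acute_malnutrition": [False] * 4,
    "shock": [False, False, True, False],
})

# Severe dehydration or severe malaria makes the child ineligible for
# outpatient care, so recode those admissions to severe pneumonia.
recode = df["severe_dehydration"] | df["severe_malaria"]
df.loc[recode, "severity"] = "severe"

# Exclude all severe cases, then conditions managed under alternative
# protocols (meningitis, acute malnutrition, shock).
analysis = df[df["severity"] == "non-severe"]
analysis = analysis[~(analysis["meningitis"]
                      | analysis["acute_malnutrition"]
                      | analysis["shock"])]
```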
Statistical methods
Data manipulation and statistical analyses were performed using R software [
13] employing the caret package [
17]. Categorical data were tabulated and summarised as proportions, while continuous variables were summarised as medians and interquartile ranges. To evaluate differences in risk profile between survivors and non-survivors of inpatient admission, and to identify predictors that substantively account for these differences, an adjusted multivariable logistic regression model was used.
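The study fitted its adjusted model in R; as a minimal sketch of the same idea in Python, with simulated data and illustrative predictor names, an adjusted multivariable logistic regression with exponentiated coefficients as odds ratios might look like:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Simulated binary admission predictors (names are illustrative only).
age_lt_12m = rng.integers(0, 2, n)
rr_ge_70 = rng.integers(0, 2, n)
X = np.column_stack([age_lt_12m, rr_ge_70])

# Simulate inpatient mortality with higher risk when either flag is set.
logit = -3.0 + 0.8 * age_lt_12m + 1.2 * rr_ge_70
y = rng.random(n) < 1.0 / (1.0 + np.exp(-logit))

# Adjusted model: each coefficient is conditional on the others;
# exponentiating gives adjusted odds ratios.
model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_[0])
```

Note that scikit-learn applies L2 regularisation by default, so coefficients are slightly shrunk relative to an unpenalised fit such as R's glm.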
In previous research, logistic regression modelling has been used to examine risk factors in pneumonia (Ambrose Agweyu et al., Appropriateness of clinical severity classification of new World Health Organization (WHO) childhood pneumonia guidance: a multi-hospital retrospective cohort study. The Lancet Global Health, under review). However, because the assumption of independence of predictors is violated (e.g. the age variable is collinear with the weight-for-age Z-score variable) and the relationship of the predictors with mortality in non-severe pneumonia is poorly understood, this approach is susceptible to incorrect inferences about relationships between explanatory and response variables. Additionally, beyond model coefficients and significance tests, logistic regression offers limited guidance on feature selection that could inform future intervention design. Feature selection is defined and explained further in Additional file
2: Table S1 and in published reports [
18,
19]. Therefore, the magnitude of the coefficients in traditional adjusted logistic regression models may not be a good indicator of the clinical value of features, since coefficients do not incorporate the clinical consequences of targeting those features [
20,
21].
To address these challenges in generating decision-analytic solutions from logistic models, machine learning techniques were used. These techniques also allowed us to test whether, given the available data and choice of predictors, more complex adaptive models better determine the inpatient mortality risk associated with non-severe pneumonia. They additionally provide implicit feature selection as part of the model output and let us evaluate the consistency of findings across model choices. The machine learning models used in the analysis were partial least squares discriminant analysis (PLS-DA) [
22], random forests (RFs) [
23], support vector machines (SVMs) [
24] and elastic nets [
25]. Brief descriptions of these models are given in Additional file
2: Table S2. Detailed descriptions of the models are provided in the referenced works; here we offer only a brief introduction to techniques that may be less familiar.
Models were validated using 10-fold internal cross-validation on two thirds of the data; the remaining third was held out as the validation set. This is further explained in Additional file
2: Table S1. Variable importance scores, which would guide feature selection, were generated to identify predictor contribution to classification, with higher scores considered more relevant in classification. Detailed explanations of variable importance estimation for the models included in the analysis are reported elsewhere [
26]. Critical parameters for each modelling technique were determined automatically by the R caret train function, which selects the tuning parameters producing the highest cross-validated area under the receiver operating characteristic (ROC) curve over a grid search. These parameters are provided in Additional file
2: Table S2.
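caret's train is R-specific; an analogous tuning loop in Python with scikit-learn, shown here as a hedged sketch (random forest only, with an illustrative parameter grid rather than the study's actual grids), selects tuning parameters by cross-validated ROC AUC and exposes variable importance scores:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

rng = np.random.default_rng(1)
X = rng.random((300, 5))
y = (X[:, 0] + X[:, 1] + rng.normal(0, 0.3, 300) > 1.0).astype(int)

# Hold out one third as a validation set, mirroring the 2/3 vs 1/3
# split described above.
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=1 / 3, random_state=0)

# Tune by 10-fold cross-validated ROC AUC over a parameter grid
# (illustrative grid, not the study's).
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid={"max_features": [1, 2, 3]},
    scoring="roc_auc",
    cv=10,
)
search.fit(X_tr, y_tr)

# Variable importance scores from the tuned model guide feature
# selection: predictors with higher scores contribute more to
# classification (features 0 and 1 carry the signal here).
importances = search.best_estimator_.feature_importances_
```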
To evaluate the clinical impact of implementing the models in practice as part of screening algorithms, we performed decision curve analysis, evaluating how different threshold probabilities vary the false-positive and false-negative rate expressed in terms of net benefit [
27]. The unit of net benefit is true positives, and the details of its calculation are extensively reported elsewhere [
20]. When carrying out a head-to-head comparison of different prediction models on the same population, the interpretation is straightforward: at each clinically relevant probability threshold, the model with the highest net benefit is preferred. Models are also compared with the two extreme strategies of designating all patients, or no patients, as at high risk of inpatient mortality.
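Net benefit at a threshold probability weighs true positives against false positives, with the false positives discounted by the odds threshold/(1 − threshold). A minimal sketch of this calculation on toy data:

```python
import numpy as np

def net_benefit(y_true, risk, threshold):
    """Net benefit (in true positives per patient) of flagging all
    patients whose predicted risk meets the threshold probability:
    TP/n - FP/n * threshold / (1 - threshold)."""
    y_true = np.asarray(y_true, dtype=bool)
    flagged = np.asarray(risk) >= threshold
    n = len(y_true)
    tp = np.sum(flagged & y_true) / n
    fp = np.sum(flagged & ~y_true) / n
    return tp - fp * threshold / (1.0 - threshold)

# Toy outcomes and model-predicted risks.
y = np.array([1, 0, 0, 1, 0, 0, 0, 1])
risk = np.array([0.9, 0.2, 0.4, 0.8, 0.1, 0.3, 0.2, 0.6])

nb_model = net_benefit(y, risk, threshold=0.3)
# Extreme "treat all" reference strategy: flag every patient.
# ("Treat none" has net benefit 0 by definition.)
nb_all = net_benefit(y, np.ones_like(risk), threshold=0.3)
```

A decision curve simply plots this quantity over a range of clinically relevant thresholds for each model and for the two extreme strategies.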
Learning using imbalanced outcome data
From the description of our outcome (inpatient mortality cases in non-severe pneumonia), we expect the classes to be imbalanced; i.e. the number of positive cases is much smaller than the number of negative cases. This introduces a high possibility of the resulting model being biased towards the dominant class and classifying the rare positive cases poorly. To minimise this bias, we used the Synthetic Minority Over-sampling Technique (SMOTE) filter [
28,
29] to address the imbalanced data. SMOTE oversamples the minority (positive) cases, avoiding the information loss that discarding majority-class cases would entail. Synthetic instances are created by interpolating between the features of existing minority instances and those of their nearest neighbours. More details are provided in Additional file
2: Table S3.
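Off-the-shelf implementations of SMOTE exist (e.g. the imbalanced-learn package in Python); purely to illustrate the interpolation idea, here is a stripped-down sketch, not the full filter used in the study:

```python
import numpy as np

def smote_like(minority, n_new, k=2, seed=0):
    """Minimal sketch of SMOTE's core idea: create synthetic
    minority-class samples by interpolating between an existing
    minority point and one of its k nearest minority neighbours."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(minority))
        x = minority[i]
        d = np.linalg.norm(minority - x, axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the point itself
        nb = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(x + gap * (nb - x))
    return np.array(synthetic)

# Three toy minority-class points; synthetic samples fall on segments
# between them, so they stay inside the minority region.
minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
new = smote_like(minority, n_new=5)
```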
Missing data
To handle missing data, multiple imputation by chained equations (generating 10 imputed datasets) was performed under a missing at random (MAR) assumption [
30].
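As an illustrative Python counterpart (the study used R's chained-equations machinery), scikit-learn's IterativeImputer performs one chained-equations imputation; repeating it with sample_posterior=True and different random states approximates drawing the study's 10 imputed datasets. Simulated data, not the study's:

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] + 0.1 * rng.normal(size=200)  # correlated column

# Knock out ~10% of one column, assumed missing at random (MAR).
mask = rng.random(200) < 0.1
X_miss = X.copy()
X_miss[mask, 2] = np.nan

# One chained-equations imputation draw; each variable with missing
# values is modelled from the others in round-robin fashion.
imputer = IterativeImputer(random_state=0, sample_posterior=True)
X_imp = imputer.fit_transform(X_miss)
```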
Model performance was analysed using sensitivity (true positive rate), specificity (true negative rate) and the area under the ROC curve (AUC). AUC is a combined indicator of sensitivity and specificity, equal to the probability that a classifier will rank a randomly chosen positive instance higher than a randomly chosen negative one [
31]. DeLong’s significance test was used to compare the ROC curves from each model type [
32].
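The probabilistic definition of AUC given above can be computed directly, without constructing the ROC curve, as the fraction of positive-negative pairs ranked correctly; this is also the statistic underlying DeLong's test. A small sketch:

```python
import numpy as np

def auc_rank(y_true, scores):
    """AUC from its probabilistic definition: the chance that a random
    positive outranks a random negative (ties count half), i.e. the
    normalised Mann-Whitney U statistic."""
    y_true = np.asarray(y_true, dtype=bool)
    pos = np.asarray(scores, dtype=float)[y_true]
    neg = np.asarray(scores, dtype=float)[~y_true]
    wins = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

# Two positives, three negatives: 5 of the 6 pairs are ranked
# correctly (0.4 loses to 0.5), so AUC = 5/6.
auc = auc_rank([1, 1, 0, 0, 0], [0.9, 0.4, 0.5, 0.2, 0.1])
```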
Sensitivity analysis
We used alternative definitions of pneumonia severity to conduct sensitivity analyses. We compared clinician-assigned severity (non-severe vs severe) and the initial treatment prescribed at admission (benzyl penicillin monotherapy vs an alternative broad-spectrum regimen) against the definition based on WHO severity criteria as the "gold standard". The three definitions should ideally represent populations that overlap perfectly; however, inconsistencies have been observed in previous work [
33,
34]. Comprehensive comparisons of risk where pneumonia guidelines were not adhered to (which is a common occurrence in low-resource settings) are lacking in the literature. Here, the key consideration was the widely reported lack of concordance between health workers' pneumonia severity classifications and the clinical guidelines under routine conditions [
33–
35].
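Concordance between the severity definitions can be quantified by cross-tabulation against the WHO "gold standard" and a simple agreement proportion. An illustrative sketch with hypothetical labels, not the study's data:

```python
import pandas as pd

# Hypothetical per-patient labels under the three severity
# definitions (WHO criteria, clinician classification, treatment).
df = pd.DataFrame({
    "who": ["non-severe", "non-severe", "severe", "non-severe"],
    "clinician": ["non-severe", "severe", "severe", "non-severe"],
    "treatment": ["penicillin", "broad", "broad", "penicillin"],
})

# Cross-tabulate clinician classification against the WHO
# "gold standard" and compute crude agreement.
tab = pd.crosstab(df["who"], df["clinician"])
agreement = (df["who"] == df["clinician"]).mean()
```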