Background
Methods
Setting
Information type | Variable | Variable type | Details |
---|---|---|---|
Temporal information | Arrival date and Time | Temporal | To the nearest minute. From 2006 to present day |
Presentation information | Facility | Categorical | One of 18 Hospitals spread over QLD (including a children’s hospital) |
Presentation information | Triage category | Categorical (Ordered) | Rating of urgency on presentation: 1, 2, 3, 4, 5 (from 1, requiring resuscitation, down to 5, non-urgent) |
Presentation information | Departure Status | Categorical | Discharged, Admitted, Did Not Wait, Transferred, Died in ED, Left Against Advice, Dead On Arrival |
Presentation information | ICD-10 Code | Categorical | A coding of diseases, signs and symptoms, abnormal findings, complaints, social circumstances and external causes of injury or diseases; 5000 unique codes present in data set [19] |
Demographic Information | Age | Continuous | Age in years |
Demographic Information | Sex | Categorical | |
- have the potential to negatively affect the operation of a hospital during the winter bed crisis (e.g. due to their infectious nature or the sheer volume of cases)
- provide an opportunity for intervention
- have a behaviour which is difficult to predict
Disease group | Subgroup of ICD-10 codes | Description |
---|---|---|
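Flu: Influenza-associated diseases | J00-J06 | Acute upper respiratory infections |
Flu: Influenza-associated diseases | J09-J18 | Influenza and pneumonia |
Flu: Influenza-associated diseases | J20-J22 | Other acute lower respiratory infections |
Flu: Influenza-associated diseases | A00-A09 | Intestinal infectious diseases |
Flu: Influenza-associated diseases | B25-B34 | Other viral diseases |
Respiratory: Diseases of the respiratory system | J30-J39 | Other diseases of upper respiratory tract |
Respiratory: Diseases of the respiratory system | J40-J47 | Chronic lower respiratory diseases |
Respiratory: Diseases of the respiratory system | J60-J70 | Lung diseases due to external agents |
Respiratory: Diseases of the respiratory system | J80-J84 | Other respiratory diseases principally affecting the interstitium |
Respiratory: Diseases of the respiratory system | J85-J86 | Suppurative and necrotic conditions of lower respiratory tract |
Respiratory: Diseases of the respiratory system | J90-J94 | Other diseases of pleura |
Respiratory: Diseases of the respiratory system | J95-J99 | Other diseases of the respiratory system |
Factors: Factors influencing health status and contact with health services | Z00-Z13 | Persons encountering health services for examination and investigation |
Factors: Factors influencing health status and contact with health services | Z20-Z29 | Persons with potential health hazards related to communicable diseases |
Factors: Factors influencing health status and contact with health services | Z30-Z39 | Persons encountering health services in circumstances related to reproduction |
Factors: Factors influencing health status and contact with health services | Z40-Z54 | Persons encountering health services for specific procedures and health care |
Factors: Factors influencing health status and contact with health services | Z55-Z65 | Persons with potential health hazards related to socioeconomic and psychosocial circumstances |
Factors: Factors influencing health status and contact with health services | Z70-Z76 | Persons encountering health services in other circumstances |
Factors: Factors influencing health status and contact with health services | Z80-Z99 | Persons with potential health hazards related to family and personal history and certain conditions influencing health status |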
Phase 1: developing a predictive model for patient counts
- including predictor variables of different types (nominal, ordered categorical and continuous);
- managing the sparsity of the data when we consider counts at such a detailed level of classification;
- modelling the mean of the system and capturing the variation in order to correctly establish unusual cases in the testing phase; and
- addressing the computational challenges posed by the scale of the problem (e.g. even holding the counts in memory for this large target space across many time points is constrained by current memory resources).
Step 1: developing a time-dependent model for total counts
Variable | Type | Description | Selected in Flu group model | Selected in respiratory group model | Selected in factors group model |
---|---|---|---|---|---|
Day | Continuous | Number of days since beginning of training period in 2006 | Yes | Yes | Yes |
Weekday | Categorical | The day of the week (reference category ‘Monday’) | Yes | Yes | Yes |
sin.day, cos.day | Continuous | Yearly seasonal harmonics (sine and cosine terms) | Yes | Yes | Yes |
log1p.lagn | Continuous | log of one plus the count on the n-th preceding day, n = 1, 2, …, 7 | Yes | Yes | Yes |
is.public.hol | Binary | An indicator for whether or not it is a QLD State public holiday | Yes | Yes | No |
is.school.hol | Binary | An indicator for whether or not it is a QLD State school holiday | Yes | No | No |
l2.mod | Categorical | The subgroup of ICD-10 codes | Yes | Yes | Yes |
l2.mod*Day | Interaction | Interaction between the level 2 disease group and Day | Yes | Yes | No |
l2.mod*Weekday | Interaction | Interaction between the level 2 disease group and Weekday | Yes | Yes | Yes |
l2.mod*sin.day | Interaction | Interaction between the level 2 disease group and sin.day | Yes | Yes | Yes |
l2.mod*cos.day | Interaction | Interaction between the level 2 disease group and cos.day | Yes | Yes | Yes |
l2.mod*log1p.lag1 | Interaction | Interaction between the level 2 disease group and log1p.lag1 | Yes | Yes | Yes |
Weekday*is.school.hol | Interaction | Interaction between Weekday and whether or not it is a school holiday | Yes | No | No |
Day*Weekday | Interaction | Interaction between Day and Weekday | No | Yes | No |
Day*sin.day | Interaction | Interaction between Day and sin.day | No | Yes | No |
Day*cos.day | Interaction | Interaction between Day and cos.day | No | Yes | No |
Day*is.public.holiday | Interaction | Interaction between Day and is.public.holiday | No | Yes | No |
Day*is.school.holiday | Interaction | Interaction between Day and is.school.holiday | No | No | No |
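To make the time-dependent predictors in the table concrete, the following is a minimal sketch of how one row of Day, Weekday, sin.day, cos.day and log1p.lagn could be built from a daily count series. The function name `make_features`, the harmonic period of 365.25 days and the use of days since the start of the series as the harmonic argument are assumptions for illustration, not details taken from the fitted models.

```python
import math
from datetime import date, timedelta

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def make_features(counts, start, t, period=365.25, n_lags=7):
    """Build one predictor row for day index t (requires t >= n_lags).

    counts : daily totals, counts[0] being the first day of the series
    start  : calendar date of counts[0] (hypothetical start date)
    period : assumed length of the yearly cycle in days
    """
    if t < n_lags:
        raise ValueError("need at least n_lags earlier days")
    current = start + timedelta(days=t)
    angle = 2.0 * math.pi * t / period
    row = {
        "Day": t,                                # days since start of series
        "Weekday": WEEKDAYS[current.weekday()],  # categorical day of week
        "sin.day": math.sin(angle),              # yearly harmonic (sine)
        "cos.day": math.cos(angle),              # yearly harmonic (cosine)
    }
    # log1p.lagn = log(1 + count on the n-th preceding day)
    for n in range(1, n_lags + 1):
        row[f"log1p.lag{n}"] = math.log1p(counts[t - n])
    return row
```

For example, `make_features([0, 1, 2, 3, 4, 5, 6, 7], date(2006, 1, 1), 7)` describes 8 January 2006, with `log1p.lag1` computed from the count of 6 on the previous day. The holiday indicators and the interactions with l2.mod would be added by the model formula and are omitted here.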
Step 2: predicting expected proportions to cells
- by aggregating the data over time we achieve a computationally significant dimension reduction;
- variables of different types are easily included;
- regions of very low or zero frequency are grouped together and are given low (but non-zero) expected values; and
- interactions are naturally included. While these interactions are empirically determined, at the model evaluation stage we can check that the interactions identified by domain experts are captured.
Step 3: assigning expected counts to cells
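As an illustration of how the outputs of Steps 1 and 2 might be combined, the sketch below spreads a predicted daily total over cells in proportion to their expected shares. It assumes that the expected count for a cell is simply the predicted total multiplied by the cell's expected proportion; the function name `expected_cell_counts` and the cell labels are hypothetical.

```python
def expected_cell_counts(predicted_total, proportions):
    """Distribute a predicted total over cells by their expected proportions.

    predicted_total : expected total count for the day (from Step 1)
    proportions     : mapping cell -> expected proportion (from Step 2);
                      renormalised here so the cell counts sum to the total
    """
    norm = sum(proportions.values())
    if norm <= 0:
        raise ValueError("proportions must have a positive sum")
    return {cell: predicted_total * p / norm
            for cell, p in proportions.items()}
```

With a predicted total of 120 and shares of 0.5, 0.3 and 0.2, the three cells receive expected counts of 60, 36 and 24. Because every cell keeps a non-zero proportion, every cell also keeps a non-zero expected count.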
Phase 2: testing for unusually high counts using EWMA surveillance trees
- applying the EWMA (Exponentially Weighted Moving Average) based temporal smoothing of observed and expected counts;
- growing a Surveillance Tree on departures from expected value in the smoothed counts using a binary recursive partitioning approach; and
- pruning the Surveillance Tree to reveal signals and control the false alarm rate.
Step 4: EWMA smoothing
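The standard EWMA recursion is S_t = λ x_t + (1 − λ) S_{t−1}. The sketch below applies it to an observed and an expected count series and returns the smoothed departure for each day. The smoothing weight `lam = 0.2`, the choice of the first value as the starting state and the function names are assumptions for illustration only.

```python
def ewma(values, lam=0.2, start=None):
    """Exponentially weighted moving average of a sequence.

    lam   : smoothing weight in (0, 1]; larger values track recent data more
    start : initial state S_0; defaults to the first value (an assumption)
    """
    if not 0.0 < lam <= 1.0:
        raise ValueError("lam must lie in (0, 1]")
    if not values:
        return []
    prev = values[0] if start is None else start
    smoothed = []
    for x in values:
        prev = lam * x + (1.0 - lam) * prev  # S_t = lam*x_t + (1-lam)*S_{t-1}
        smoothed.append(prev)
    return smoothed


def smoothed_departures(observed, expected, lam=0.2):
    """Smoothed observed minus smoothed expected counts, day by day."""
    obs_s = ewma(observed, lam)
    exp_s = ewma(expected, lam)
    return [o - e for o, e in zip(obs_s, exp_s)]
```

Smoothing both series with the same weight keeps the departures comparable over time; positive values indicate counts running above expectation.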
Step 5: growing the surveillance tree
Step 6: pruning the surveillance tree
Term | Age | Sex | Triage category | Departure status | Facility | Disease group | Disease subgroup |
---|---|---|---|---|---|---|---|
(Intercept) | 3.7598 | 3.6926 | 3.9562 | 3.9712 | 3.3100 | 4.1558 | 3.5891 |
μ | 0.0000 | -0.0022 | -0.0012 | -0.0007 | 0.0017 | -0.0012 | -0.0015 |
depth | -0.1481 | -0.1772 | -0.2327 | -0.2209 | -0.0675 | -0.2605 | -0.1022 |
1/μ | 0.6342 | 1.2781 | 0.6233 | 0.5121 | 0.8856 | 0.9699 | 0.0327 |
depth² | 0.0060 | 0.0076 | 0.0121 | 0.0108 | 0.0015 | 0.0121 | 0.0038 |
1/depth | -0.9182 | -0.8120 | -1.0811 | -1.0864 | -0.4693 | -1.2520 | -1.3850 |
μ*depth | -0.0006 | -0.0001 | -0.0003 | -0.0003 | -0.0008 | -0.0002 | -0.0006 |
μ*(1/depth) | 0.0004 | 0.0021 | 0.0014 | 0.0010 | -0.0009 | 0.0013 | 0.0043 |
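To show how one column of the coefficient table can be evaluated, the sketch below computes the fitted linear combination for given values of μ and depth using the Age column. It assumes that depth2 denotes depth squared and that μ*(1/depth) is the product of μ and 1/depth; the scale of the response is not stated in the table, so the function returns only the value of the linear combination. The names `AGE_COEFS` and `linear_predictor` are hypothetical.

```python
# Coefficients from the Age column of the table above.
AGE_COEFS = {
    "intercept": 3.7598,
    "mu": 0.0000,
    "depth": -0.1481,
    "inv_mu": 0.6342,
    "depth_sq": 0.0060,
    "inv_depth": -0.9182,
    "mu_depth": -0.0006,
    "mu_inv_depth": 0.0004,
}


def linear_predictor(mu, depth, coefs=AGE_COEFS):
    """Evaluate the fitted linear combination for expected value mu and depth.

    Both mu and depth must be positive because the terms 1/mu and 1/depth
    appear in the expression.
    """
    if mu <= 0 or depth <= 0:
        raise ValueError("mu and depth must be positive")
    return (coefs["intercept"]
            + coefs["mu"] * mu
            + coefs["depth"] * depth
            + coefs["inv_mu"] / mu
            + coefs["depth_sq"] * depth ** 2
            + coefs["inv_depth"] / depth
            + coefs["mu_depth"] * mu * depth
            + coefs["mu_inv_depth"] * mu / depth)
```

Passing a different coefficient dictionary, built from another column of the table, evaluates the same expression for that variable.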
Applying the test prospectively
Evaluation of the methodology by simulation
Simulated hotspots
Evaluation measures
Results and discussion
Surveillance trees compared to univariate control chart in terms of effectiveness and timeliness
The effect of hotspot clustering
- The hotspot affects the whole population being monitored by the univariate control chart, i.e. there is no clustering of the higher counts in a subspace. In other words, what do the Surveillance Trees lose in performance when we are in the optimal situation for the univariate control chart?
- The hotspot affects a subgroup of the population being monitored by the univariate control chart. In other words, what do we gain by using the Surveillance Tree method to search for subgroups?
Hotspot | Number found by hotspot peak: univariate control chart | Number found by hotspot peak: surveillance tree | Number found by hotspot end: univariate control chart | Number found by hotspot end: surveillance tree |
---|---|---|---|---|
a. All flu cases | 839 | 282 | 971 | 770 |
b. Subgroup of flu cases | 824 | 899 | 972 | 998 |
Hotspot | Number found by hotspot peak: univariate control chart | Number found by hotspot peak: surveillance tree | Number found by hotspot end: univariate control chart | Number found by hotspot end: surveillance tree |
---|---|---|---|---|
a. Subgroup of All Cases | 407 | 634 | 725 | 968 |
b. Subgroup of Flu Cases with ceiling | 39 | 943 | 70 | 1000 |
The effect of hotspot strength, duration and timing
Hotspot | Number found by hotspot peak: univariate control chart | Number found by hotspot peak: surveillance tree | Number found by hotspot end: univariate control chart | Number found by hotspot end: surveillance tree |
---|---|---|---|---|
Peak 20 | 117 | 89 | 281 | 280 |
Peak 40 | 407 | 634 | 725 | 968 |
Peak 80 | 930 | 1000 | 997 | 1000 |
Hotspot | Number found by hotspot peak: univariate control chart | Number found by hotspot peak: surveillance tree | Number found by hotspot end: univariate control chart | Number found by hotspot end: surveillance tree |
---|---|---|---|---|
Peak at Day 3 | 109 | 61 | 303 | 261 |
Peak at Day 7 | 407 | 634 | 725 | 968 |
Peak at Day 14 | 721 | 981 | 922 | 1000 |
Peak at Day 21 | 873 | 1000 | 977 | 1000 |