Summary
We have developed and implemented an adaptation of the CUSUM methodology to detect changes in prescribing for one CCG or practice, in relation to the whole population of CCGs or practices, across a wide range of prescribing measures. Our modification and implementation successfully met various specific requirements of our use case, as discussed below. The method was effective in detecting changes that we determined to be clinically important. Though we did not formally assess the utility and appropriateness of the alerts generated, we plan to assess their impact once sufficient follow-up data has been accrued.
Strengths & weaknesses
Our modification and implementation of the CUSUM method meet various specific requirements of our use case. Firstly, in contrast to standard Shewhart control charts [7, 9], the approach described here is able to detect small changes over a period of time that may still be clinically interesting. Secondly, by using a multiple of the standard deviation of the reference mean as the threshold value for detecting changes, the method is able to adapt to our diverse range of measures and across many CCGs and practices. This means that where the level of noise is especially high, the algorithm adjusts such that typical levels of noise do not trigger an alert. Conversely, where the variation in percentile is very low initially, an alert is triggered very quickly once a change occurs, to indicate atypical behaviour.
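As a minimal sketch of this scaling, assuming a tabular two-sided CUSUM in the style described by Montgomery, the following Python computes upper and lower cumulative sums on a series of percentiles and scales both the allowance and the alert threshold by the standard deviation of a reference window. The function name, the 12-point reference window and the multipliers `k=0.5` and `h=5.0` are illustrative assumptions, not necessarily the values used on OpenPrescribing.

```python
from statistics import mean, stdev


def two_sided_cusum(percentiles, ref_window=12, k=0.5, h=5.0):
    """Return (index, direction) of the first alert, or None.

    Assumptions (illustrative only): the reference mean and standard
    deviation come from the first `ref_window` points, and `k` and `h`
    are multiples of that standard deviation.
    """
    ref = percentiles[:ref_window]
    mu = mean(ref)
    sigma = stdev(ref)
    slack = k * sigma      # allowance: drift smaller than this is ignored
    threshold = h * sigma  # alert threshold scales with the noise level
    c_pos = c_neg = 0.0
    for i, x in enumerate(percentiles[ref_window:], start=ref_window):
        # Accumulate deviations above and below the reference mean separately.
        c_pos = max(0.0, c_pos + (x - mu) - slack)
        c_neg = max(0.0, c_neg + (mu - x) - slack)
        if c_pos > threshold:
            return i, "increase"
        if c_neg > threshold:
            return i, "decrease"
    return None


# Synthetic series: the same 3-point shift after a quiet and a noisy reference period.
quiet = [50, 51, 49] * 4
noisy = [50, 60, 40] * 4
quiet_alert = two_sided_cusum(quiet + [53] * 6)
noisy_alert = two_sided_cusum(noisy + [53] * 6)
```

In this synthetic example the shift of three percentile points triggers an alert on the second shifted month when the reference period is quiet, while the same shift after a reference period with ten times the spread triggers no alert.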
Thirdly, after an initial alert has been triggered, our modification of the standard CUSUM implementation checks for continuing deviation from the mean over the preceding 12 months, and re-triggers an alert if such continued change is detected. This meets an important requirement on OpenPrescribing: the alerts service is open to any user, some of whom may sign up for alerts shortly after an initial trigger has been sent, and may not be aware of historic alerts. This confers the additional benefit of reminding CCGs or practices that do not respond to the initial alert that a change on a measure has both occurred and is ongoing. This adaptation also has the unintended benefit of sometimes selecting a more appropriate reference mean – often after the change has largely stopped – which then reduces the chance of unnecessary alerts being generated after the change has taken place. Another advantage of the approach that we have taken is that it is easy to modify the parameters of the CUSUM algorithm, in order to alter how sensitive it is to change. We set these parameters according to recommendations by Montgomery [16], and in our view, the algorithm triggered alerts at times that we considered clinically appropriate.
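One possible reading of the retriggering step can be sketched in Python as follows. This is a hypothetical illustration rather than the OpenPrescribing code: it assumes that, after each alert, the reference mean and standard deviation are re-estimated from the preceding 12 points and the cumulative sums restart from zero. The function name and the parameters `window`, `k` and `h` are assumptions made for the example.

```python
from statistics import mean, stdev


def cusum_with_retrigger(series, window=12, k=0.5, h=5.0):
    """Return a list of (index, direction) alerts.

    Hypothetical retriggering rule: after each alert, re-estimate the
    reference mean and standard deviation from the preceding `window`
    points and restart both cumulative sums.
    """
    alerts = []
    mu, sigma = mean(series[:window]), stdev(series[:window])
    c_pos = c_neg = 0.0
    for i in range(window, len(series)):
        x = series[i]
        c_pos = max(0.0, c_pos + (x - mu) - k * sigma)
        c_neg = max(0.0, c_neg + (mu - x) - k * sigma)
        direction = None
        if c_pos > h * sigma:
            direction = "increase"
        elif c_neg > h * sigma:
            direction = "decrease"
        if direction is not None:
            alerts.append((i, direction))
            # Move the reference window so that it ends at the alert.
            recent = series[i - window + 1 : i + 1]
            mu, sigma = mean(recent), stdev(recent)
            c_pos = c_neg = 0.0
    return alerts


# Synthetic series: a stable year followed by a steady rise of 2 points per month.
rising = [50, 51, 49] * 4 + [52 + 2 * m for m in range(24)]
rising_alerts = cusum_with_retrigger(rising)
```

Under these assumptions a sustained rise produces repeated alerts, whereas once the series settles at a new level the moving reference window catches up with it and alerts stop.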
Through informal user testing (not reported here) and iteration, we think that an appropriate balance has been found in the level and suitability of alerting. An interesting point to note is that CCGs tended to have more detected changes than practices. This is likely due to a higher level of statistical noise in practices, resulting from generally lower prescribing numbers. It is not necessarily a problem for CCGs to receive a higher volume of alerts, given that they often have a dedicated medicines optimisation team who can investigate alerts appropriately.
Occasionally, small changes in the percentile are detected as alerts. This occurs where the percentile is especially consistent, and is more common at extreme percentiles, where the percentiles are more spaced out. However, such small changes in percentile can correspond to substantial absolute changes in prescribing. For example, in the case shown in Fig. 1, between May and June 2016 the CCG moves from the 100th to the 99th percentile, but this corresponds to a change from 62.2 to 34.8% in the proportion of Cerazette prescribing. It is therefore not useful to set universal limits for the size of percentile change that should trigger an alert.
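The arithmetic behind this can be illustrated with a small Python sketch. The peer values below are synthetic and chosen only to mimic a skewed distribution with sparse extremes; they are not the data behind Fig. 1. The `percentile_rank` helper and its definition (the percentage of the population at or below a value) are assumptions for this example.

```python
def percentile_rank(value, peers):
    """Percentage of the population (peers plus this organisation) at or below `value`."""
    population = peers + [value]
    at_or_below = sum(1 for p in population if p <= value)
    return 100 * at_or_below / len(population)


# Synthetic peers: 98 organisations spread between 0% and 29.1%, plus one at 40%.
peers = [0.3 * i for i in range(98)] + [40.0]

before = percentile_rank(62.2, peers)  # above every peer
after = percentile_rank(34.8, peers)   # now below the single peer at 40%
```

Here a fall of 27.4 percentage points in the measure moves the organisation by only one percentile, because so few peers sit at the top of the distribution.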
In a few cases, the algorithm detects a change in a somewhat arbitrary place (e.g. high-cost ACE inhibitors for CCG 05Y in Additional file 1: Appendix A). This is possible when the level of noise within the percentiles changes over time. For example, if the level of noise is low initially, a low trigger threshold will be set; if the noise then increases (perhaps due to a reduction in overall prescribing for that measure), this may occasionally trigger an alert when there is no underlying shift in the measure. This also occurs where prescribing numbers are especially small (low single-figure denominators). This is more common in small practices and can cause the percentile to change very erratically. Though this does not always trigger an inappropriate alert, there may be some utility in filtering out alerts where changes are detected based on very small numbers; we will consider and respond to user feedback on this issue.
These examples highlight some potential pitfalls in applying the same method to a diverse array of data, but do not negate the utility of these methods; rather they emphasize the need for users to investigate alerts individually. Indeed, these limitations are mostly restricted to situations where the underlying data are not sufficient to make a meaningful judgement about a CCG or practice’s prescribing, even with careful clinical consideration. Given the lack of formal testing here, it is currently left to the reader and user to determine how useful the generated alerts are. Here we set out to describe the development of the method, such that users can understand how alerts are generated and that others may use the same implementation.
Context of other findings
There are many examples of the use of SPC, and even CUSUM, in medicine. The most comparable study that we know of [22] used similar prescribing data and used the CUSUM methodology to detect a change of one clinical entity in relation to others in the local area, for a prespecified prescribing intervention. This is a good initial demonstration of the utility of CUSUM in detecting changes against background noise. We go further by creating an automated tool that is effective across many diverse prescribing measures, and diverse sizes of centre, across the health service of a whole country.
Additionally, SPC is being used increasingly in medical research generally, for example for monitoring surgical outcomes [23–25], monitoring emergency medical outcomes [26] and even monitoring physiological response to antihypertensive treatments [27]. These studies have used various CUSUM implementations (summarised in [28, 29]) according to their different needs.
We used a two-sided implementation as described by Montgomery [16] because we are interested in notifying practices when their prescribing behaviour changes in either direction. We do not know of any other studies that have used our retriggering adaptation, where we determine whether an increase is persistently occurring. However, the adaptation bears some mathematical resemblance to the manner in which the V-mask CUSUM method is calculated [30]. Other adaptations to the CUSUM method are unlikely to be useful for our needs. For example, Novick et al. [24] compare a risk-adjusted CUSUM implementation to an unadjusted one. The risk adjustment is used in this case to correct for the baseline risk changing over time in surgical outcomes. Additionally, a Bernoulli CUSUM can be used for situations where a binary outcome is being measured [31]. Though the prescribing measures used here could be described in terms of binary prescribing choices, we believe that it is simpler and more elegant to use the percentile for our needs.
Policy implications and further research
The intention of this implementation of the CUSUM algorithm is to notify interested users (i.e. those who subscribe to the alerts) of clinically important changes to their prescribing patterns in relation to the prescribing of peers. It is clear from the user testing that in order for the alerts to have the maximum positive impact, the manner in which they are communicated must be carefully considered. The user testing highlighted the need to communicate the size and duration of the change that has occurred along with the notification. Although we have treated increases and decreases in the same way methodologically here, they clearly have different implications. A detected increase in percentile may (for most measures) highlight a need for action by the CCG or practice to bring prescribing back into line with their peers, whereas a detected decrease might indicate that a recent change that was made was effective in improving prescribing. There are two prescribing measures in the current set on OpenPrescribing (DOACs [13] and pregabalin [14]) where no value judgement is made over an increase or decrease in the measure, but change in relation to peers is noteworthy regardless, so these will be communicated in alerts differently to other measures. Additionally, while there are many examples of practices getting worse as defined by our measures, in some cases there are legitimate underlying reasons for this. It is therefore important to stress that the alerts are intended as an initial signpost that something has changed, and it is important that each CCG, practice, or other user investigates any underlying reasons for a change identified.
There are two mechanisms for collecting further information on the impact and quality of this analytic approach. Firstly, within the OpenPrescribing project, prescribing behaviour can be monitored over time after changes are detected. As we know from the OpenPrescribing dataset who is receiving alerts and who has interacted with the emails in various ways (e.g. clicked links to investigate an alert further), we will be able to assess the impact of alerts by comparing the change in prescribing in the months after an alert by subscribing versus non-subscribing institutions. Secondly, this service is now generating alerts to users, and will shortly be presented on the OpenPrescribing “labs” page. We encourage users to review the triggering of alerts on a measure at any CCG/practice of interest and give feedback on whether they view the alerts and thresholds as clinically useful, or any other aspect of the OpenPrescribing project, by emailing feedback@openprescribing.net.