We assessed indicators of case management, which were the subject of the research study and might therefore have been influenced when health staff were under observation. This study did not find strong evidence that the presence of the exit survey altered the prescribing behaviour of health staff.
There is an increasing need to capture and monitor the performance of health staff in resource-poor countries as investment in health services increases and the tasks expected of health staff become more complex and diverse. However, there are relatively few established methodologies for capturing the content of the consultation in primary care settings. One can review routine documentation of the consultation, although the reliability of self-reported practices is uncertain [12]. A commonly used alternative is to observe the consultation directly [13, 14]. This may be complemented by a repeat consultation by an “expert” immediately after the consultation of interest [15, 16]. These methods have a variety of potential limitations, including the cost and practicality of having qualified health professionals observe or repeat a consultation, and the strong influence that peer observation may have on health workers. The patient exit survey is an interesting alternative [17‐19], as it may reduce errors associated with inaccurate completion of routine records and minimise patient recall error by asking about the content of the consultation immediately after its completion.
The Hawthorne effect
All of these methods have the potential to alter the behaviour of health workers by creating anxiety, raising awareness through the novelty of the situation, or creating a desire to satisfy the expectations of the researchers. This ‘observer effect’ is generally referred to as the Hawthorne effect, after the studies conducted at the Hawthorne electronics factory in Illinois, USA, in the late 1920s [1, 2]. Although definitions vary widely, it usually refers to the difference in a person’s behaviour when they are aware of participating in research, or of being under scrutiny, as opposed to their behaviour in a more ‘natural’ setting. Rigorous evaluation of this effect is, however, limited [6], possibly because of the complexity of this context-specific, multi-component concept, and the challenge of measuring it without inducing it. Some studies have examined the effect of direct observation on medical consultations [20‐22]. Although these mostly used before-and-after designs, and other factors could have influenced the results, they generally observed a shift toward better practices when health workers were being observed. This study is, to our knowledge, the first evaluation of the Hawthorne effect when conducting patient exit interviews.
Discussion of findings
Our primary analysis found no strong statistical evidence of important differences in clinical practice on days when exit surveys were conducted, but the differences we did find were all in the direction of improved clinical practice on days when exit interviews were performed. The point estimates of effect size are modest, lying between 0.73 and 1.11, although the lower bound of the confidence interval extended down to 0.53 for one of the outcomes. These results have implications for the interpretation of data captured through exit interviews and should be kept in mind when extrapolating from exit surveys to “real-world” practices. In the case of the TACT trial, for example, the proportion of patients “appropriately treated” as captured using exit interviews could be an over-estimate.
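As an illustration of how effect estimates of this kind and their confidence intervals are typically derived from a two-by-two table of survey versus non-survey days, the following sketch computes an odds ratio with a Wald confidence interval. The counts are hypothetical, chosen only to show the mechanics, and are not taken from the study:

```python
from math import exp, log, sqrt

def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Odds ratio and Wald 95% CI for a 2x2 table:
    a, b = outcome present/absent on survey days
    c, d = outcome present/absent on non-survey days
    """
    or_estimate = (a * d) / (b * c)
    # Standard error of log(OR) from the cell counts
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(or_estimate) - z * se_log_or)
    upper = exp(log(or_estimate) + z * se_log_or)
    return or_estimate, lower, upper

# Hypothetical counts, for illustration only
or_, lo, hi = odds_ratio_wald_ci(70, 30, 70, 30)
```

With identical outcome proportions on survey and non-survey days, the point estimate is 1.0 and the interval spans 1, the pattern corresponding to weak evidence of an observer effect.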
The efficacy estimates could also be affected if the extent of the Hawthorne effect differed across trial arms, but our analysis did not suggest this. All methods for assessing case management have limitations [23], and the most complete overall picture is likely to result from triangulation of the results from a variety of methods.
It is pertinent to consider why the Hawthorne effect arises. It is possible that participants become more attentive to their whole work routine, even for aspects of care that are not under scrutiny. On the other hand, by trying to excel in the practice being assessed, health workers may neglect other aspects of care. In our study we found suggestions of better record-keeping in the MTUHA book on days when exit surveys were conducted. This could indicate that consultations were recorded more systematically on days when an external observer was present. There were also some indications of differences in the completeness of other MTUHA information; however, these results should be interpreted with caution, as the pattern of completion of some of the information remained unclear and an appropriate statistical model could not always be fitted.
The last hypothesis explored in this paper was a change in the Hawthorne effect over time. The initial assumption was that any novelty effect would tend to diminish over time as participants became accustomed to being observed, so that their ‘natural’ behaviour would return and dominate the observation-conditioned behaviour. The change in the Hawthorne effect over time for our primary outcome (RDT uptake) was not as expected: a significant decrease, and then increase, in the observer effect was observed (Fig. 1). In the second period of the study, health workers were significantly less likely to report an RDT result on survey days, for reasons that remain unclear. One hypothesis was that this could be related to the regular visits by the research team to check supplies, after which health workers may have been more motivated to demonstrate good performance (even though this was not the aim of these visits). However, the visits were regular and do not seem to explain the curvilinear pattern. Seasonal variation in malaria transmission did not seem to explain the pattern either. More importantly, however, we did not find any suggestion of a reduction in the Hawthorne effect over time on any of the three outcomes. Although it is often assumed that any Hawthorne effect would diminish over time, we found no evidence of this here, and no such effect was actually evident in the original Hawthorne studies [24].
Other secondary findings of interest include the absence of evidence of differences in the Hawthorne effect between trial arms, which does not support the idea that health workers in the intervention arms paid more attention to their practice on days when trial outcomes were measured in order to satisfy the wishes of the investigators [25]. Another issue arising is the difficulty of working with routine data, particularly data originating in a handwritten book and then transferred into a database via photographs. Not all book pages could be recorded, and some patterns of information availability were surprising; for example, the “village of origin” was completely missing on some days and completely recorded on others, without a clear explanation (such as different health workers, or variations in book format or workload). Electronic recording of routine data could facilitate access and improve consistency, and recently introduced integrated systems for RDT reading and recording may also offer useful benefits [26].
Generalisability
It seems likely that the Hawthorne effect is sensitive to the context of the study, and our findings may not apply to other settings or methodologies. Our study was conducted in health facilities participating in a randomised trial in one region of Tanzania, a very specific context. However, the use of exit surveys is common, and the findings have some wider implications. Certain conditions are likely to modify the Hawthorne effect, including any situation in which some level of reward or sanction could result from the outcome of the study, or in which health workers expect this to be the case. The perception of an exit survey conducted as part of a trial may well differ from that of one conducted as part of a national monitoring programme. In addition, more intense observation, such as a researcher directly observing the consultation or an expert replicating it, could also be expected to modify behaviour, and our results are unlikely to apply to these situations. Having the interviews performed by a trained non-health professional from the community may have reduced health workers’ fear of being judged.
Limitations
The study has a number of limitations. Firstly, there was some awareness among health staff that their routine records would be reviewed, although they were informed that this would be primarily to document the RDT result. All staff were reassured that the results of the study would be accessible only to research staff, and that data on individual health facilities or staff would not be revealed to anyone outside the research team, in particular to senior or supervisory staff of the health clinics. Nonetheless, the trial could have heightened the feeling of “scrutiny”, and health workers may have paid more attention to their practice even on days when exit surveys were not conducted, which would have reduced the apparent Hawthorne effect. The second major limitation is the reliance on the completion of basic records and the assumption that what was written was a true reflection of what was done. This is an inherent limitation of any study that aims to capture health worker performance without information obtained from direct observation. However, the main interest of the study was to investigate whether exit surveys resulted in systematic differences in recording; we should therefore speak of differences in ‘recording’ rather than differences in actual ‘practice’. Data used for this analysis were based on single data entry from photographs of the MTUHA records and may not reflect the exact content of the book. For example, instances were reported where data could not be entered because the photographs could not be read. Across the study periods, the median number of health facilities with data available on any specific day was 15 (out of 18). Again, this should not influence the assessment of the Hawthorne effect if such missingness is independent of survey days, but could bias the results otherwise (e.g. if health workers paid more attention to readability on days when exit surveys were conducted).
Because surveys were conducted on two randomly selected days per week, the design controlled for potential confounding differences and allows us to attribute observed differences to the exit interviews themselves. However, the schedule was not always strictly followed (see methods), and other biases could have occurred. Indeed, we observed differences in survey rates between health facilities, study periods and days of the week. We controlled for these factors in our analysis, but other unmeasured factors could have differed between surveyed and non-surveyed days and biased our findings. Another consideration is that what is reported here may not capture the whole Hawthorne effect, which would encompass any difference in behaviour within and outside the research context. We were able to capture the effect of conducting exit surveys, but if health workers behaved differently in general (even on days that were not monitored) because they were participating in a trial, this would not have been captured here.
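The randomised scheduling underlying this design can be sketched as follows. This is a minimal illustration of drawing two survey days per working week so that survey and non-survey days are comparable on average; the function and day names are assumptions for illustration, not the study’s actual allocation procedure:

```python
import random

WEEKDAYS = ("Mon", "Tue", "Wed", "Thu", "Fri")

def assign_survey_days(n_weeks, seed=42):
    """Randomly select two distinct survey days for each week.

    A fixed seed makes the schedule reproducible; in practice the
    draw would be made in advance for the whole study period.
    """
    rng = random.Random(seed)
    return [sorted(rng.sample(WEEKDAYS, 2)) for _ in range(n_weeks)]

schedule = assign_survey_days(4)
```

Because every weekday has the same chance of being a survey day, systematic day-of-week differences (e.g. market days with heavier workloads) are balanced in expectation, which is what licenses attributing survey-day versus non-survey-day differences to the exit interviews, subject to the deviations from the schedule noted above.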