Background
Mathematical models of infection are useful decision support tools for health policy makers choosing between alternative interventions to limit the spread of infectious diseases [1]. Increased focus on influenza pandemic preparedness and response in recent years has stimulated the development of agent-based models exploring alternative containment and mitigation strategies [2-4] that may be employed to limit disease transmission. Such models have highlighted uncertainty regarding critical transmission parameters describing mixing between age classes or social groups in heterogeneously mixing populations [3]. As model conclusions are often exquisitely sensitive to these parameter assignments [5,6], more recent models have drawn on European-derived data on age-based interactions to explore the impact of targeted immunisation strategies on Influenza A(H1N1) 2009 spread [7].
European researchers have developed paper diary tools to estimate the number and intimacy of conversational and physical encounters between individuals [8]. By these means, social mixing patterns within and between age groups have been characterised in a range of European countries [9]. The information obtained from paper diaries has been compared with retrospective web-based self-report [10]; while the two were broadly consistent, prospective recording was more informative. Conversely, Glass et al compared the use of paper diaries with a classroom-completed survey and found the latter to be more consistently completed by school-aged individuals [11].
The timeliness and hence completeness of recording using paper diaries has been questioned in other applications. Stone and colleagues, using a paper diary electronically instrumented to capture opening and closure, were able to demonstrate a marked discrepancy between reported and actual compliance with completion of a pain diary [12]. Diaries were not even opened on 32% of study days; in contrast, 94% actual compliance was achieved with an electronic diary [12]. Similar comparison of paper and electronic tools in a range of health services research applications has consistently demonstrated improved completeness and accuracy when electronic devices are used [13]. Particular benefits are the ability to objectively assess timeliness of data entries [14], and ease and acceptability of use [15].
The primary aim of this study was to compare three different methods for the recording of social contacts likely to be sufficient for transmission of respiratory pathogens. Timeliness and completeness of entries in paper diary tools were compared with those of entries in a hand-held electronic diary (PDA). We also assessed the ability of a questionnaire administered before study commencement to predict social encounters. Furthermore, we employed an explicitly location-based design across all three methods (pre-entry, paper, PDA) to aid in construction of bi-partite network models, in which interactions are characterised by the places in which they occur. Statistical characteristics of the recorded interactions were used for methodological assessment and comparison with European and other studies.
Discussion
This study used an explicitly location-dependent ascertainment method to capture social encounters. We were convinced of the validity of participants' self-reported encounters relevant to respiratory pathogen transmission by the rich narrative thread describing the day's activities captured within the diaries. More encounters were captured using the paper diary than other methods, with participants consistently reporting ease of use and timely data entry by this means. Moreover, respondents uniformly underestimated social encounters when asked to predict their contacts over the survey period (pre-entry questionnaire), justifying the need for contemporaneous diary recording.
The study's crossover design reduced between-individual variation when comparing the number of recorded contacts across prospective (paper, PDA) methods. Potential biases due to training, learning or 'burn-out' effects were minimised by ensuring that all participants were recruited and enrolled by a single research assistant (PMN) and by randomising subjects to commence recording using either the paper or electronic diary. Future protocols will additionally include randomisation of the first recording day [17], to minimise interactions between day of the week and survey day.
Several factors may have contributed to the relatively poor acceptance of the electronic diary in this study compared with others, despite the relatively high representation of health-based researchers in this convenience sample who might be anticipated to be familiar with similar recording tools. Firstly, our location-based methodology may place a greater burden on participants than other study protocols. Secondly, the custom-built software designed for this study could potentially have been made more accessible with participant input over a longer development phase, although the close correlation observed between paper and PDA recordings for each subject suggested that most participants persisted with the electronic diary in spite of perceived difficulty. With the increasing use of 'smartphone' technologies, development of high-quality PDA-style software presents as an emergent challenge to field-based researchers. In relation to the study's objectives, the fact that respondents reported longer delays in entering contacts into the PDA went against the rationale for its use in other settings [13], as improved timeliness of recording is desirable to minimise recall bias.
Definitive validation of recorded contacts by some form of external non-participant based observation would be desirable, but aside from privacy concerns, would conceivably change behaviour. As with other studies to date designed to measure the social interactions relevant to respiratory disease transmission, this study was not able to capture illness or exposure events. Mikolajczyk et al [18] investigated whether contact counts for a group of school children in Germany were predictive of infection in the past 6 months, with no effect of household size or contacts observed. It remains a key challenge to design and deploy study protocols able to capture both an individual's dynamic social network and their concurrent disease status (e.g. susceptible, exposed, infectious). Such data would provide definitive validation of the methods employed both here and in other studies.
The diary method developed for this protocol differed from tools used in other studies as location was the focal point for defining and describing each new set of recorded encounters. In consequence, and in contrast to earlier work [8], repeated encounters were commonly described. For the purpose of assessing the comparability of our findings with other studies, we have identified the number of uniquely named individuals encountered on each day. The utility of signalling a change of location as a stimulus to recording is suggested by the relatively high number of contacts recorded by our study participants, compared with respondents from the European Union [9]. For example, the mean (and standard deviation) number of reported contacts per day was 11.74 (7.67) in the United Kingdom and 19.77 (12.27) in Italy. The dynamic nature of contacts over the three surveyed days, with new casual contacts still being made on the third day of diary recording, concords with earlier findings from the United Kingdom [17].
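The de-duplication step described above, in which repeated encounters across locations are reduced to a count of uniquely named individuals per day, might be sketched as follows. This is a minimal illustration only: the record layout (participant, day, location, contact name), the example values and the function name are assumptions made for this sketch, not the study's actual data structure or analysis code.

```python
from collections import defaultdict

# Hypothetical diary records: (participant, day, location, contact_name).
# The same named contact may appear several times in a day because
# each change of location starts a new set of recorded encounters.
records = [
    ("P1", 1, "home", "Alex"),
    ("P1", 1, "work", "Sam"),
    ("P1", 1, "home", "Alex"),  # repeat encounter on the same day
    ("P1", 2, "home", "Alex"),
    ("P1", 2, "shop", "Kim"),
    ("P2", 1, "work", "Lee"),
]


def unique_contacts_per_day(records):
    """Count uniquely named contacts for each (participant, day) pair."""
    seen = defaultdict(set)
    for participant, day, _location, contact in records:
        seen[(participant, day)].add(contact)
    return {key: len(names) for key, names in seen.items()}


counts = unique_contacts_per_day(records)
for (participant, day), n in sorted(counts.items()):
    print(participant, day, n)
```

Counting unique names per day, rather than raw encounter rows, is what makes a location-based diary comparable with diaries that record each contact person once per day.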
In an analysis of the POLYMOD data, Kretzschmar et al [19] have classified individuals into seven distinct contact profiles, based on the locations in which they predominantly mix. The analysis will allow for the characterisation of how an infection spreads between locations, fulfilling a similar aim to our locations-based data for parameterising bi-partite networks of social contacts and locations. The diary format used here, whereby the 'intensity' of contact events in each separate location is captured through multiple measures (all people in the location, those at arm's length, those with a recorded contact), will allow comparative analysis of different measures of intensity when constructing similar contact profiles.
The usefulness of recording the intensity of contact is confirmed by consistent observation of closer mixing in home and social settings than work environments [17]. However, the importance of sampling to observed within and between-age mixing patterns is demonstrated by contrasting findings of university-based studies [8,17,20] with those containing a larger proportion of family households, such as ours. While the convenience sample employed in the present protocol was not representative of metropolitan Melbourne, our findings were more concordant with results from a large-scale European study, in demonstrating a high level of child-adult mixing within families [9]. In addition, we too observed an association between household size and the total number of reported contacts [9]. These findings highlight the importance of deriving population samples that are representative not only of age, but also the household size distribution within a given country and setting.
With others, we observed clear differences between weekend and weekdays, both in the number and location of encounters. Due to budgetary and resource limitations, we restricted collection to three study days, choosing to have only one weekend day (Sunday). The absolute number of contacts observed was similar to findings from the European POLYMOD surveys [9]. Clearly it is desirable to capture information about individuals over as many days as possible in order to assess true daily variation. However, we have confirmed the observation of reduced compliance with recording over time [21], which undermines the validity of repeated observations. Hens et al also observed decreased compliance over just two days using a modification of the POLYMOD survey in Belgium [22]. Defining the optimal sampling time frame is a necessary trade-off and may be driven by the primary study question of interest. For example, it may be desirable to estimate the total number of potential contacts made by an individual infected with a given respiratory pathogen, in which case the duration of the infectious period may determine the number of study days, with appropriate caveats regarding data quality.
Our results confirm the importance of household-level mixing to providing opportunities for the spread of infection both within and between age groups. Further data collection is required to supplement this pilot study in order to aid parameterisation of models describing heterogeneity of population mixing in the Australian urban context. How these results compare to similar European studies will be of interest.
Models of infectious disease spread are of use to policy makers aiming to predict the likely impact of interventions to reduce disease transmission targeted at specific age groups or social settings. Discussion of the possible benefit of school closure to mitigate the spread of pandemic influenza is a timely example and reflects current uncertainty regarding the contribution of spread in different settings to outbreak dynamics [23,24].
We have reported both mean and median encounters in Figure 3 as our data serve dual purposes. Mean encounters are the most appropriate input into stratified compartmental mathematical models of disease transmission, which assume homogeneous contact numbers within age strata. Furthermore, in such models it is the structure of the Who-Acquires-Infection-From-Whom (WAIFW) matrix, used to capture the relative propensity of mixing between strata, that is of key importance [25]. The main concern in this context is with the distribution of contacts across strata, rather than their absolute number, so comparison of data recording sources should focus on a comparison of the inferred WAIFW matrices. In contrast, stochastic agent-based simulations or network-based models of disease transmission directly model the distribution of cases. In this context the absolute number of contacts is important, for example in modelling the degree distribution of network nodes. As such, characterisation of our data by median and interquartile ranges of the observations is appropriate.
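To make the distinction between absolute contact numbers and relative mixing structure concrete, the sketch below tabulates mean contacts between age strata and then row-normalises the result. It is illustrative only: the age strata, participant identifiers and contact records are invented for this example, and the row-normalised mixing matrix stands in for a WAIFW matrix only under the simplifying assumption that transmission between strata is proportional to contact frequency.

```python
# Hypothetical age strata and diary data; not values from the study.
AGE_GROUPS = ["0-19", "20-64", "65+"]

# participant id -> participant's own age group
participants = {"P1": "0-19", "P2": "20-64", "P3": "20-64"}

# One row per recorded contact: (participant id, contact's age group)
contacts = [
    ("P1", "0-19"), ("P1", "20-64"), ("P1", "20-64"),
    ("P2", "20-64"), ("P2", "0-19"),
    ("P3", "20-64"), ("P3", "65+"), ("P3", "20-64"),
]


def mean_contact_matrix(participants, contacts, groups):
    """Mean number of contacts in group j reported per participant in group i."""
    index = {g: i for i, g in enumerate(groups)}
    n = len(groups)
    totals = [[0.0] * n for _ in range(n)]
    n_participants = [0] * n
    for group in participants.values():
        n_participants[index[group]] += 1
    for pid, contact_group in contacts:
        totals[index[participants[pid]]][index[contact_group]] += 1
    return [
        [totals[i][j] / n_participants[i] if n_participants[i] else 0.0
         for j in range(n)]
        for i in range(n)
    ]


def row_normalise(matrix):
    """Convert each row to proportions, keeping mixing structure only."""
    result = []
    for row in matrix:
        total = sum(row)
        result.append([x / total if total else 0.0 for x in row])
    return result


mean_matrix = mean_contact_matrix(participants, contacts, AGE_GROUPS)
mixing = row_normalise(mean_matrix)
```

Two recording methods that capture different absolute numbers of contacts can still yield similar row-normalised matrices, which is the comparison relevant to compartmental models; the unnormalised counts matter for network and agent-based models.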
Variation in behaviour is likely to explain, at least in part, differences in the number of secondary transmission events between individuals. Published stochastic models [26] have examined the consequence of allowing for variation in the offspring distribution, while such effects can be routinely incorporated into agent-based and network-based models. Furthermore, our location-based methodology, in contrast to other methods [9], captures the different environments in which repeat encounters are likely to occur. Spatial agent-based simulations [2,4] model the movement of agents (individuals) from location to location over time, currently parameterised using census data. Our study method provides data of exactly this type, with associated proxy measures of exposure-risk given by the intensity of contact events within each location. Similarly, our location-based data is key to the development of pathogen transmission models based on bi-partite graphs of social networks by location. Of course, the relative importance of different locations to pathogen transmission is a complex function of time, number of contacts, 'intensity' of contacts and the pathogen itself, with no definitive studies as yet to reject or accept one model structure over another.
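As a rough illustration of the bi-partite structure referred to above, the sketch below links individuals to the locations they visited and then projects that graph onto individuals, so that two people are connected if they share at least one location. The visit records, identifiers and function names are hypothetical, and sharing a location is treated here only as a proxy for potential exposure; timing and intensity of contact are ignored in this simplified sketch.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (person, location) visit records; not study data.
visits = [
    ("P1", "home_A"),
    ("P2", "home_A"),
    ("P2", "office"),
    ("P3", "office"),
    ("P4", "cafe"),
]


def build_bipartite(visits):
    """Return the two adjacency maps of the person-location graph."""
    person_to_locations = defaultdict(set)
    location_to_persons = defaultdict(set)
    for person, location in visits:
        person_to_locations[person].add(location)
        location_to_persons[location].add(person)
    return dict(person_to_locations), dict(location_to_persons)


def project_onto_persons(location_to_persons):
    """One-mode projection: person pairs mapped to the locations they share."""
    shared = defaultdict(set)
    for location, persons in location_to_persons.items():
        for a, b in combinations(sorted(persons), 2):
            shared[(a, b)].add(location)
    return dict(shared)


person_to_locations, location_to_persons = build_bipartite(visits)
links = project_onto_persons(location_to_persons)
```

Keeping the location layer explicit, rather than storing only person-to-person links, preserves the information needed to ask how infection might move between settings such as households and workplaces.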
Vaccine & Immunisation Research Group, Murdoch Childrens Research Institute & Melbourne School of Population Health, The University of Melbourne, Victoria, Australia (James M McCaw, Paula M Nathan, Kristian Forbes, Terence M Nolan, Jodie McVernon)
School of Behavioural Science, The University of Melbourne, Victoria, Australia (Philippa E Pattison, Garry L Robins)
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
JMcC, JMcV and KF performed the statistical analyses and interpreted the results. PN coordinated the study, recruiting all subjects and collating the data. JMcC, JMcV, PP, GR and TN conceived the study and designed the diary collection tools. JMcV and JMcC wrote the initial drafts of the manuscript. All authors contributed to and approved the final manuscript.