Participants
We defined our sampling frame as 1) researchers with experience studying or developing A&F interventions, 2) methodologists from organizations that routinely provide A&F interventions, and 3) knowledge users with specific expertise in A&F interventions. Using participant lists from several international meetings on A&F intervention science and implementation, we generated a list of 211 individuals (75 male, 136 female), primarily from Canada (66%), the UK (27%), and other countries (7%).
Developing the prioritization list
Our previous work identified 313 hypotheses suggested by experts to be testable, theory-informed predictions for how health care A&F interventions could be improved [4]. The list of hypotheses was organized into 30 themes through independent, iterative coding by three coders and confirmed by a fourth member of the team. The resulting hypothesis list was comprehensive, but efforts to translate it into a survey made it evident that many hypotheses were conceptually similar, redundantly phrased, or unclear. Because such items would be difficult to prioritize, we undertook a process aimed specifically at eliminating these issues. First, two independent reviewers (HLC; KC) reviewed the full list of hypotheses to group similar hypotheses together and flag unclear ones. Next, three members of the team (HLC; KC; JCB) held consensus discussions to confirm which hypotheses were unclear or redundant and to select the most clearly worded hypothesis from each grouping of similar ones. Through this process, 98 hypotheses were deleted and one hypothesis was split into two, leaving 216 hypotheses (313 − 98 + 1); the number of themes was also reduced from 30 to 29. The remaining 216 hypotheses were then reviewed again by all three team members for clarity, which led to examples being added to four hypotheses and 15 hypotheses being reworded.

As an example of redundancy, of the following two hypotheses the first was retained and the second eliminated: ‘Feedback will be more effective when focused on the few, most important behaviours’; ‘Feedback will be more effective if the focus is on only one specific behaviour at a time’. An example of a vague (and therefore eliminated) hypothesis was ‘Feedback needs to consider alternatives and substitutes beyond the one focal intervention’. Lastly, the hypothesis ‘Feedback will be more effective if emphasis is on what needs to be achieved (loss framing) as opposed to what was achieved (gain framing)’ was reworded to include this example: ‘i.e., 20% of your patients did not receive the proper prescription vs. 80% did receive the proper prescription’.

In our previous work, when developing the hypotheses and the resulting themes, we used the term ‘feedback’ to refer to the specific data-provision episode within the A&F intervention. While the broader term ‘A&F intervention’ can also include other components of a complex intervention, we expect that our use of ‘feedback’ in the context of specific interventions was generally interpreted as a placeholder for ‘A&F intervention’. In this manuscript, we use the term ‘A&F intervention’ to refer to these interventions but have retained the term ‘feedback’ when describing the hypotheses, the themes, and the study materials in Additional files 1 and 2.
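For transparency, the arithmetic of the consolidation can be checked with a trivial bookkeeping sketch (in Python; the counts are those reported above, and the variable names are ours):

```python
# Reconcile hypothesis counts across the consolidation steps described above.
initial = 313      # hypotheses identified in our previous work
deleted = 98       # removed as redundant or unclear
split_gain = 1     # splitting one hypothesis into two adds one item

remaining = initial - deleted + split_gain
assert remaining == 216
print(f"{remaining} hypotheses retained for the prioritization survey")
```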
Survey design
In designing the prioritization exercise, our goal was to identify a list of hypotheses that members of the A&F intervention research community believe should be prioritized for further exploration. Because asking participants to rank-order all 216 hypotheses would likely have had an adverse effect on the response rate, we instead asked them to choose up to 50 ‘priority’ hypotheses. We chose the number 50 to achieve a reasonable trade-off between restrictiveness (‘why can’t I choose this important additional hypothesis?’) and specificity (i.e., requiring respondents to be more selective than a simple yes/no endorsement).
The online survey was developed for this study and was created by, and hosted at, the Ottawa Hospital Research Institute (see Additional file 2 for a copy of the survey). It consisted of four tabs that respondents could select sequentially: Instructions, Demographics, Prioritization Exercise, and Summary. The Instructions asked respondents to consider 1) the quality of the idea behind each hypothesis (as best they could interpret it), and 2) its likelihood of advancing the field. If they thought a hypothesis was interesting but poorly worded, they could still select it but were asked to comment on the wording and explain why they selected it despite the problem identified. Respondents were also instructed not to select hypotheses they felt were unclear, uninteresting, or already well understood.
Demographics collected on respondents included Country where the respondent does most of their work (via text box), Work Role (check all that apply: Researcher, Policy Maker, Health System Administrator, Healthcare Delivery, Other), and Career Level (via drop-down menu: early (< 5 years), mid (5–15 years), and senior (> 15 years)). As the survey group was known to our team (i.e., names were gathered from invited meetings), we extracted the Sex and Country variables for the entire sampling frame from existing meeting information; identifying information was removed from the dataset prior to analysis.
The prioritization exercise comprised a series of web pages (~ 10 hypotheses per page) that displayed the theme from which each hypothesis was derived, the hypothesis number, the hypothesis itself, a checkbox for selecting that hypothesis as one of the top 50, and an optional comment box next to each hypothesis. The prioritization tab showed a running tally of how many hypotheses had been selected (e.g., ‘35 of 50 selected’). If more than 50 hypotheses were selected, a pop-up message reminded the respondent to limit their selection to 50; if more than 60 were selected, a warning message indicated that the maximum number had been reached. That is, while participants were told to choose 50 hypotheses, they could select any number up to 60. Themes were presented in random order for each participant to reduce order effects that might arise from respondent fatigue, but hypotheses within each theme remained in a consistent order to preserve the coherence of the theme.
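To illustrate the selection logic described above, the following is a minimal sketch in Python; the survey itself was custom web software built at the Ottawa Hospital Research Institute, and the function names, limit constants, and seeding scheme here are our own illustrative assumptions.

```python
import random

SOFT_LIMIT = 50  # respondents were asked to choose up to 50 hypotheses
HARD_LIMIT = 60  # absolute cap enforced by the survey

def tally_message(n_selected: int) -> str:
    """Running tally / reminder text shown on the prioritization tab."""
    if n_selected >= HARD_LIMIT:
        return "The maximum number of hypotheses has been selected."
    if n_selected > SOFT_LIMIT:
        return f"{n_selected} selected; please limit your selection to {SOFT_LIMIT}."
    return f"{n_selected} of {SOFT_LIMIT} selected"

def order_for_participant(themes: dict[str, list[str]], participant_seed: int):
    """Randomize theme order per participant while keeping the order of
    hypotheses within each theme fixed."""
    rng = random.Random(participant_seed)  # reproducible per participant
    theme_names = list(themes)
    rng.shuffle(theme_names)               # themes in random order
    return [(name, themes[name]) for name in theme_names]  # hypotheses unchanged
```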
The Summary tab listed all chosen hypotheses, their themes, and the total number of hypotheses selected. Respondents could review their selections and make changes as needed; once satisfied, they were instructed to click the “all done” button at the bottom of the page.
The task was piloted in two phases. An initial beta test (in Microsoft Excel) was carried out among team members (HLC; KWE; NI; SM; JCB) to ensure that the instructions were clear and the task easy to complete, and to gauge the time commitment involved. After final web programming, the survey was pilot tested again by the team (HLC; KC; JCB) to confirm ease of use, understandability, and functionality.
Survey administration
Participants were sent an invitation email from the study PI on January 9, 2018 that included a short description of the work leading up to this prioritization survey, information on the task at hand, a unique participant ID with password, and the web survey link. The email also listed the names of our study team and included, as attachments, a published paper about our work [5] and the REB-approved participant information sheet. The participant information sheet covered all regulatory requirements, such as the study purpose, funding information, and how personal information would be protected. Participants were told that the survey was voluntary and would likely take no more than 60 min to complete. Non-responders were sent three follow-up emails at approximately 2-week intervals (i.e., January 22, 2018, February 13, 2018, and March 9, 2018). Duplicate entries were avoided by assigning unique participant IDs with passwords, managing password resets through a research coordinator, and having participants who logged in multiple times re-enter the survey at the point where they last exited.
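A minimal sketch of the resume-on-login behaviour is given below; this is a hypothetical illustration, as the implementation details of the survey platform were not published.

```python
# Hypothetical sketch of resume-on-login behaviour; participant IDs are the
# unique credentials described above, and page indices are illustrative.
sessions: dict[str, int] = {}  # participant ID -> furthest page reached

def record_progress(participant_id: str, page: int) -> None:
    """Persist the participant's furthest page so a later login resumes there."""
    sessions[participant_id] = max(page, sessions.get(participant_id, 0))

def resume_page(participant_id: str) -> int:
    """Return the page a returning participant should land on (0 if new)."""
    return sessions.get(participant_id, 0)
```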
Analysis
The data were downloaded from the locally hosted secure server into Microsoft Excel for analysis. Frequencies were calculated for all demographic variables and for Sex and Country, and chi-squared analyses were used to evaluate differences between responders and non-responders.
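Although the analysis was run in Microsoft Excel, an equivalent responder versus non-responder comparison could be expressed in Python as follows; the contingency-table counts are hypothetical (they sum to the 211-person sampling frame with 75 males and 136 females, but the responder split is invented for illustration).

```python
from scipy.stats import chi2_contingency

# Hypothetical 2x2 table: rows = responders / non-responders,
# columns = male / female. Illustrative counts only, not study data.
table = [[30, 50],   # responders
         [45, 86]]   # non-responders

chi2, p, dof, expected = chi2_contingency(table)
print(f"chi-squared = {chi2:.2f}, dof = {dof}, p = {p:.3f}")
```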
We calculated the total number of times each individual hypothesis, and the hypotheses in each theme, were endorsed in respondents’ top 50. As the number of hypotheses per theme varied, theme endorsement was calculated as a proportion of possible endorsements: the number of endorsements of hypotheses in a theme divided by the number of hypotheses in the theme multiplied by the number of participants.
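The theme-level calculation amounts to the following one-line function (a sketch under the definition above; the example numbers are invented):

```python
def theme_endorsement(endorsements: int, n_hypotheses: int, n_participants: int) -> float:
    """Proportion of possible endorsements a theme received:
    endorsements / (hypotheses in theme * participants)."""
    return endorsements / (n_hypotheses * n_participants)

# Illustrative example (not study data): a 6-hypothesis theme endorsed
# 120 times across 80 participants -> 0.25 (25% of possible endorsements).
print(theme_endorsement(120, 6, 80))
```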