Background
Programmes that aim to increase physical activity and improve dietary behaviours in individuals, groups and populations play a central role in addressing local, national and global public health priorities [1, 2]. Recent strategies have advocated approaches that are multi-sectoral, community-centred and evidence-based [1, 3–5]. Understanding if, when, and how these programmes are effective is important to justify policy, programme and funding decisions, and to inform and improve future decisions and practice. In order to achieve this, there is a need for appropriate and comprehensive programme evaluation [6, 7].
Practice-based evidence is generated from formal evaluation of programmes in real-world settings and is a fundamental part of evidence-based public health [8–10]. Those involved in the design, delivery and commissioning of physical activity and dietary change programmes are expected to evaluate programmes and contribute to the evidence base. However, real-world behaviour change programmes are complex and difficult to evaluate [11, 12]. The challenges of programme evaluation may relate to contextual factors that influence the complexity of the programme itself, e.g. its setting, target population, intervention function(s), or intended outcome(s) [12], or to factors that influence the evaluation priorities and objectives, e.g. differing stakeholder evaluation needs and organisational, political or resourcing factors [13]. Some of the practical challenges in conducting evaluation include the use of appropriate evaluation methods and tools, understanding what counts as evidence and how it is applied, and the roles of practitioners and researchers in evaluating real-world programmes [7, 9, 11, 14, 15].
Evaluation frameworks facilitate a systematic approach to evaluation and can help mitigate some of the above challenges. Frameworks can enable multiple stakeholders to gain a shared understanding of the programme and evaluation process, and help to identify and agree upon appropriate objectives and methods. In this way, they can facilitate a more comprehensive evaluation, and may improve the fit between researcher-led and practitioner-led evaluation approaches [14]. A range of evaluation frameworks have been published. These include frameworks developed specifically for programmes targeting particular health behaviours, conditions or populations (e.g. physical activity programmes [16–18]), those developed for health promotion and public health programmes more broadly (e.g. RE-AIM [19]), and generic frameworks intended to be applicable across a range of contexts, settings and sectors (e.g. Realist Evaluation [20]).
There is wide variation in the terminology used to describe frameworks, in their format, and in the contexts and ways in which they are intended to be used. Differentiating between frameworks, guidance, models or tools can be a challenge [21]. In this review the term ‘evaluation framework’ is used to include any structured guidance which facilitates a systematic evaluation of the implementation or outcomes of a programme. A ‘generic’ framework refers to one that is intended for use across a range of contexts, settings and sectors, as opposed to one developed for use in a specific context or field. Several frameworks have been developed for evaluation of programme implementation (process evaluation), whilst others focus on programme effectiveness (outcome evaluation) or are intended to facilitate an overall or comprehensive evaluation. In order to understand the content and focus of the frameworks and the contexts in which they may be applied, we have referred to the individual elements encompassed within evaluation as ‘evaluation components’.
Many frameworks and developments in evaluation come from the research community, yet their intended audience and purpose are often unclear. For example, questions remain about the extent to which these frameworks are intended for use in practitioner-led or researcher-led evaluation, and about their applicability to different evaluation objectives, programmes, and contexts.
Previous reviews of evaluation frameworks have been limited to frameworks which evaluate specific aspects of a programme, for example health inequalities [22], or to methods used in health programme evaluations [23, 24]. Within the field of implementation science, reviews have focused on frameworks for translation of research to practice [25, 26]. The review by Denford et al. [27] made a valuable contribution by providing an overview of guidance available to support evaluation of public health programmes. However, it was limited to a subset of 48 documents created or sourced by national and international organisations and published since 2000. As a result, some key evaluation frameworks published before 2000 or within the academic literature were not included, such as RE-AIM [19] and Realist Evaluation [20]. Denford et al. included various guidance documents intended for use in evaluating programmes targeting a broad range of health behaviours and health problems (e.g. smoking, asthma), as well as generic ones. Whilst they suggested that the wealth and breadth of available evaluation guidance may limit the ability of practitioners to access and apply appropriate guidance, the resulting review [27] and associated online catalogue [28] may still overwhelm practitioners seeking guidance on how to evaluate their specific programme.
To resolve some of this complexity we sought to develop a typology of frameworks, to help guide decision making by those involved in programme evaluation. The purpose was to appraise the frameworks that may be applicable for the evaluation of physical activity or dietary change programmes. By mapping the frameworks against a range of evaluation components (such as elements of process or outcome evaluation), we aimed to develop an overview of guidance included in each framework, enabling practitioners, commissioners and evaluators to identify and agree which frameworks may best meet their needs.
Objectives
1. To identify published frameworks that can be used for evaluation of physical activity and/or dietary change programmes.
2. To identify each framework’s stated scope in order to assess its applicability to different evaluation objectives, programmes and contexts.
3. To identify and map which evaluation components are encompassed within each framework.
4. To use the findings to develop a typology of frameworks.
Method
A scoping review approach was used, as this allowed the extent and nature of the literature on evaluation guidance to be identified and an overview of the available frameworks to be developed [29–31]. In line with the stages of a scoping review [29, 30], the process involved identification of the research question, a systematic search, consultation with experts, and mapping of the frameworks against different components of evaluation. We followed the PRISMA-ScR statement for the reporting of scoping reviews [32].
Search strategy
To identify any frameworks that could be applied to physical activity and/or dietary change programmes, we used a broad search strategy to find those intended for use in public health, health promotion and generic programmes, as well as those developed specifically for evaluating physical activity and dietary change programmes. Firstly, a search was conducted in Scopus, the world’s largest abstract and citation database of peer-reviewed literature. As a meta-database, Scopus includes records from MEDLINE and EMBASE as well as other sources, covering fields including medicine, the sciences, humanities and social sciences. The following search strategy was used: (TITLE ((framework OR model OR guid* OR tool)) AND TITLE-ABS-KEY ((“physical activity” OR exercise OR diet OR obes* OR overweight OR “public health” OR “health promotion”)) AND TITLE-ABS-KEY (communit*) AND TITLE-ABS-KEY (evaluat*)). No date restriction was applied. The search was undertaken in March 2018. All sources identified from the search were downloaded into the EndNote reference manager, and any duplicates were removed.
Secondly, between April and September 2018, we searched for grey literature on the websites of key organisations interested in evaluation of physical activity and/or dietary change programmes, using “evaluation framework” as a search term. This included the World Health Organization (WHO), Public Health England (PHE), Sport England, and the Centers for Disease Control and Prevention (CDC). Additional sources were identified from the authors’ existing files. We consulted evaluation experts and stakeholders including academics, those involved in public health policy development and evaluation, and evaluation consultants within the domains of physical activity or dietary change, to augment the search results. These experts were contacted and asked to provide feedback on the list of frameworks we had identified by the search strategy and to identify any omissions. Reference lists were examined for additional relevant sources.
Sources were screened by title and abstract, and then by full text (JF). Full text screening was independently validated (KM) and disagreements resolved through discussion. Consensus could not be reached for six sources, which were checked by a third reviewer (AJ) and agreed through further discussion.
Inclusion and exclusion criteria
Inclusion and exclusion criteria were defined a priori and applied to all sources (JF). Table 1 provides details of the full inclusion and exclusion criteria. Sources were included from both the academic and grey literature that described a framework to support systematic evaluation of a physical activity and/or dietary change programme, including generic, public health or health promotion frameworks applicable to physical activity or dietary change programmes. Academic literature included journal articles and books. Grey literature was defined as all other printed and electronic documents published by organisations and agencies. Web-based sources were included if they provided systematic guidance on how to conduct an evaluation, but excluded if they were an organisation’s general website without such guidance. Only sources in English were included.
Table 1
Inclusion and Exclusion Criteria
Inclusion | Exclusion
Sources describing a framework or guidance to support evaluation of a programme, e.g. process and/or outcome evaluation. | Sources describing a specific measurement tool.
Sources describing a framework or guidance to facilitate evaluation of physical activity, dietary change, public health or health promotion programmes. | Frameworks designed to support evaluation of programmes targeting other health behaviours (e.g. smoking, alcohol, substance abuse) or conditions not specifically linked to physical activity or dietary behaviours (e.g. HIV, mental health).
Sources describing a framework or guidance to support evaluation of a specific evaluation component that aligns with the underlying principles of real-world, community-based or health promotion programmes, e.g. community development, participation, wider health and non-health outcomes. | Sources describing frameworks or guidelines intended to support evaluation of technology-based programmes or cost-effectiveness, as these relate to distinct specialised areas of evaluation or health promotion approach.
Empirical and/or methodological studies reporting the development and/or validation of an evaluation framework, as well as conceptual or discussion papers describing a framework or guidance on evaluation. | Theoretical or conceptual models of conditions or interventions. Guidance on policy or action for management of disease, policy or clinical practices. Evaluation studies reporting the use of an evaluation framework.
Data extraction and synthesis
To address the first and second objectives, a data extraction template was used to collate information about each framework. The name of each framework was identified; where no name was provided in the source, a short name was given based on the authors’ description in the title or abstract. To assess each framework’s scope and applicability to the evaluation of physical activity and/or dietary change programmes, data extraction fields included the stated evaluation objective, the types of programme it was intended for, and additional data on general characteristics of each framework, e.g. its intended audience, format and development process.
To address the third objective we developed a set of data extraction fields to enable us to appraise whether each framework provided any guidance on a range of evaluation components, and what that guidance comprised. We have used the term ‘evaluation component’ to refer to individual elements encompassed within evaluation, for example elements of process or outcome evaluation. The list of evaluation components included in the data extraction template was identified a priori and developed through a process of consensus building. We initially identified a list of evaluation components informed by recommendations for good practice in the evaluation literature, for example implementation, reach and unanticipated outcomes [12, 33–35]. This was further developed through consultation with evaluation experts, who were asked to comment on the appropriateness of the evaluation components we had identified and to identify any gaps or additional components based on their personal experience and knowledge of programme evaluation. Table 2 shows the full list of evaluation components, grouped into those related to: (1) process evaluation, (2) outcome evaluation and (3) study design. Grouping programme context, theory of change and logic models within process evaluation components aligns with their inclusion in the UK Medical Research Council (MRC) Process Evaluation guidance [35], and recognises the crucial role of logic models in the early stages of developing an evaluation plan, in reporting causal assumptions about how a programme works, and in informing process and outcome questions and methods. Where possible, pre-defined categorical responses were developed to facilitate data extraction, coding and synthesis.
Table 2
Evaluation Components Agreed for Data Extraction and Mapping of Frameworks
(1) Process Evaluation | Describing programme context; Using theory of change or logic models; Reach; Implementation; Maintenance; Any other process measures stated
(2) Outcome Evaluation | Behavioural outcomes; Health outcomes; Non-health outcomes; Unanticipated outcomes
(3) Study Design | Stakeholder involvement; Participatory evaluation; Evaluation linked to stages of programme; Evaluation at different time points; Study design/method; Data collection; Data analysis; Dissemination and reporting of findings
Where authors had described the scope of a framework variably, and where terms were not mutually exclusive, multiple terms were noted in the data extraction table. For example, terms such as ‘community’ or ‘practice-based’ were used interchangeably to describe a study, intervention, setting or population. Where frameworks gave more detailed guidance on specific evaluation components, we also extracted a summary of what the guidance comprised. For each evaluation component we assessed whether the framework simply mentioned the component or provided more detailed guidance on how to evaluate or break it down.
Data extraction was completed by JF. To verify the data extraction, a random sample of 20 sources was checked independently by AJ and WH. Differences were resolved through discussion and used to establish agreed definitions that were then applied to further data extraction.
Framework format, programme type and evaluation objectives are typically used to describe frameworks, so we used these aspects to develop our typology. For the purposes of categorising the frameworks within the typology, we used the dominant term presented in the description and content of the source as the basis for identifying each framework’s most defining characteristic. The extracted data were also used to map each framework against the evaluation components in order to provide an overview of the guidance encompassed within the frameworks. A narrative synthesis of the findings is presented.
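The mapping step described above can be sketched informally. The following snippet is a minimal illustration only: the framework names and scores are invented, not data from the review, and the three-level coding (absent, mentioned, detailed) mirrors the distinction drawn during data extraction.

```python
# Illustrative sketch of mapping frameworks against evaluation components.
# All framework names and scores below are hypothetical examples.

COMPONENTS = ["reach", "implementation", "participatory evaluation"]

# Extraction results: for each (invented) framework, the components it covers
# and whether coverage was merely "mentioned" or "detailed" guidance.
extraction = {
    "Framework A": {"reach": "detailed", "implementation": "mentioned"},
    "Framework B": {"reach": "mentioned", "participatory evaluation": "detailed"},
    "Framework C": {"implementation": "detailed"},
}

def coverage(extraction, components):
    """Count how many frameworks mention or detail each evaluation component."""
    counts = {c: 0 for c in components}
    for scores in extraction.values():
        for c in components:
            if scores.get(c) in ("mentioned", "detailed"):
                counts[c] += 1
    return counts

print(coverage(extraction, COMPONENTS))
```

An overview of this kind makes it easy to see which components are covered by most frameworks and which have sparse guidance, which is the basis of the gap analysis reported in the Discussion.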
Discussion
Our scoping review identified 71 evaluation frameworks, considerably more than previous reviews of evaluation frameworks within the field of public health [25–27]. The broad search strategy we applied enabled us to identify frameworks developed within a range of domains that we could add to those included in these earlier reviews. The focused set of inclusion and exclusion criteria we then applied meant that we only included frameworks specific to or generalisable to physical activity and/or dietary change programmes. In addition to the 12 frameworks specifically intended for physical activity and/or dietary change programme evaluation, we identified a further 59 intended for public health, health promotion, behaviour change or generic programmes that were applicable to physical activity and/or dietary change programmes.
Our review has highlighted the plethora of frameworks available; previous reviews [27] reported this as a potential challenge for practitioners and evaluators navigating and making use of the available guidance. Our review also highlighted the variability in the terms used by authors to describe the purpose and scope of the frameworks. Although we identified a growing number of frameworks developed by and for practitioners, e.g. [102, 103, 106, 107, 111], in many frameworks the intended audience was unclear. Terms used to describe programme types were poorly defined and were often used interchangeably. Some phrases, such as ‘natural experiment’ and ‘real-world’, were used to refer to both the evaluation approach and the intervention itself, whilst others (e.g. behaviour change and sustainability) were used to refer to both intervention processes and outcomes. Several frameworks which stated they were intended to support both programme planning and evaluation provided insufficient detail about how they facilitated evaluation. The lack of clarity about the extent to which frameworks are intended for researcher-led or practitioner-led evaluation, and about their applicability to different programmes and evaluation objectives, has implications for those using the available guidance. There needs to be greater consensus on how terms are defined within public health evaluation. An agreed common language would enable those involved in programme evaluation to understand more clearly the applicability of the different frameworks and would help this research area to move forward.
Our typology and mapping resolve some of that complexity in the purpose and scope of frameworks by signposting to relevant frameworks and by providing an overview of what guidance is encompassed within each. Our appraisal of frameworks has highlighted areas of overlap, strengths and limitations in the guidance available to support programme evaluation. For example, the inclusion of key process evaluation components (e.g. describing programme contexts and causal mechanisms, reach, and use of logic models) in most frameworks reflects the growing understanding of the importance of these aspects of evaluation in facilitating a more detailed understanding of whether and how a programme works [7, 33–35, 118]. These components represent strengths within the existing guidance, and areas where guidance is already abundant.
The mapping process and appraisal also identified components where more guidance would be beneficial. We found limited guidance on participatory approaches; non-health and unanticipated outcomes; wider programme components (e.g. resources, training, delivery, adaptation, partnerships, organisational structures); and sustainability. These components represent aspects of evaluation that require further development of guidance. Stakeholder involvement or participatory evaluation was mentioned in all but nine of the frameworks, reflecting the growing recognition of the importance of stakeholder engagement in evaluation decisions and processes [34, 84]. However, detailed guidance on how to incorporate participatory evaluation methods was provided by only seven frameworks [34, 56, 64, 68, 73, 80, 81], and this represents another area where further development of guidance would be beneficial. Compared to other categories within the typology, frameworks specific to physical activity programmes more consistently provided guidance on evaluation of health and behavioural outcomes, including the use of appropriate data collection and analysis methods. By their nature these components are specific and therefore may be difficult to define within more generic frameworks. Frameworks developed to facilitate evaluation of specific programme elements, such as sustainability [76, 93], and those intended to facilitate evaluation of partnerships [78, 80, 92] or community [68, 69, 80], also addressed some of the gaps within the more generic frameworks.
Our mapping and typology signpost to frameworks where guidance on specific components can be found. Although availability does not necessarily equate to accessibility or usability of information, the mapping of frameworks can be used to help understand some of the strengths and limitations within the guidance provided. Further investigation of whether and how frameworks have been used may provide insight into how fit for purpose they are, and into the benefits and challenges of applying them within physical activity or dietary change programme evaluation. Furthermore, the typology and mapping can be used by practitioners, commissioners and evaluators of physical activity and/or dietary change programmes to identify frameworks relevant to their evaluation needs. They can also be used by researchers and those interested in developing evaluation guidance to identify evaluation components where it would be most useful to focus their efforts, rather than developing more guidance for components where guidance is already abundant. Our categorisation could also be used by researchers publishing frameworks to report more clearly how these are intended to be used, and by those reporting evaluation studies to state more clearly how frameworks have been used.
Strengths and limitations
Our broad search strategy enabled a comprehensive review which identified 71 frameworks within the academic and grey literature. By drawing on frameworks developed within different domains, we have added to previous reviews [25, 27] to map a wide range of evaluation frameworks applicable to physical activity and/or dietary change programmes.
Our scoping review methods, which included consultation with experts, helped to maximise the chances of identifying relevant frameworks and of appraising them against components agreed by consensus. It was not our intention to apply a formal consensus-building method; however, we recognise that a more formalised process would have been an alternative approach. By consulting both practice- and research-based experts, we are confident that the results will be of interest and value to both practitioners and researchers concerned with evaluation of physical activity and/or dietary change programmes.
There are limitations to the review. It only included sources published in English. The heterogeneity and ambiguity in the use of terminology was a methodological challenge during screening, data extraction and synthesis. Frameworks intended to support specialist aspects of evaluation, such as health economic evaluation and evaluation of programmes using digital technologies (e.g. mobile health), are critical to practice and policy decisions; however, we excluded these frameworks due to their specificity and the large number available. A separate review of the available guidance to support these specialist aspects of evaluation would be beneficial.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.