Background
Ever since the US Institute of Medicine described the so-called "quality chasm" in health care [1], quality improvement has become an important policy issue. A proposed solution for bridging the chasm is setting quality improvement collaboratives (QIC's) to work. A well-known example is the Breakthrough Series (BTS) model, which brings together teams from different hospitals or clinics with the aim of attaining improvements on a certain theme [2]. The QIC model in general and BTS in particular are widely adopted in Western countries [3]. So far, however, there is little evidence on the effectiveness of QIC's [3,4].
Despite the lack of evidence concerning the effectiveness of QIC's, most studies evaluating QIC's investigate their effectiveness rather than following the collaborative as it gets formed. Bate and Robert argue that many evaluation studies take up an approach they describe as "summative, noninterventionist, and heavily reliant on quantitative assessments of 'success'", which is "outcome-oriented" [5]. They contrast this approach with an action-oriented or formative approach that is mainly qualitative and that is devised to improve the method along the way by giving feedback to leaders of improvement projects. As (qualitative) process descriptions are lacking, QIC's are often described as "black boxes" [3,6]. Knowing what actually occurs in setting up and carrying out collaboratives would seem crucial for interpreting the effectiveness results [3,6,7].
Several suggestions for opening up the black box have been made. For example, Wilson et al. asked collaborative leaders what they thought were crucial aspects of QIC's [6]. On the basis of the information retrieved, they proposed a framework of core elements that have to be described in order to meaningfully link effectiveness data to the workings of the collaborative. This framework is limited, however, in that the set topic and main elements are considered to stay fixed during the project, as if the elements merely have to be implemented rather than being changed themselves as a result of implementation. By assuming that the topic of a QIC can be predefined, the authors, for example, do not focus on the construction of the QIC, nor do they explore whether the topic may change during the collaborative process.
Bate and Robert and colleagues [8,9] took up a more dynamic approach by providing process descriptions of two collaboratives and detailing the extent to which the collaborative method was implemented. Yet they, too, did not analyze the way the features can be created or constructed within collaboratives. For example, they described the difficulties that measuring could pose for improvement teams, but they did not analyze how measurability was constructed or what different functions measuring could have within such projects [10]. They mainly looked at the success of implementation. In this sense, even Bate and Robert, in their more dynamic approach, still assume that they know what collaboratives are before they start opening the black box of collaboratives.
One of the reasons why this black box should be opened is to gain insight into the construction of effectiveness data within collaboratives, i.e. the relation between the topic and the outcomes of a QIC. As effectiveness is based both on the interventions carried out and on the way improvements are measured, two interrelated questions must be addressed. First, as summative research investigates the predefined effectiveness of a collaborative derived from the topic, it should be analyzed how this topic is created. Leading questions then concern possible changes in the topic during the project and the possible consequences of these changes for the predefined effectiveness. By focusing on the construction of the topic rather than on predefined elements, more insight is gained into what is actually done within the collaborative.
Secondly, the effectiveness measurement practices themselves should be analyzed. Within Breakthrough projects, measuring is assumed to play an important role. First, it helps in investigating a project's overall effectiveness. Second, teams themselves often use measurement instruments to investigate their own effectiveness and to adjust their improvement actions based on the results. But what roles exactly do these measurement practices play within QIC's, and what is the relation between the topic and its measurability? Do the measurement instruments merely describe the topic and the improvements attained, or do they affect the improvement practices as well?
In this article we will propose a way of opening the black box of collaboratives by using a dynamic perspective, though different from that of Bate and Robert. We study how the collaborative gets formed rather than taking fixed elements or the extent to which the elements are implemented as a starting point. To do so, we draw upon empirical material of two projects aimed at improving mental health care and care for the intellectually disabled. We studied these projects in the context of a larger evaluation study of the QIC they are part of.
The aim of our approach is threefold. First, we would like to propose a way of opening the black box of QIC's by focusing on their construction. Our second aim is to provide more insight into the dynamics of the collaborative process. So whereas the first aim is more methodological, the second is empirical. Thirdly, we will study possible consequences of the findings from this analysis for future evaluation studies.
The article is structured as follows. First we will describe a theoretical framework based on Actor-Network Theory (ANT). Then we will describe the two improvement projects and the way we gathered data for this paper. In the next sections we explore empirically what opening the black box of QIC's may mean from an ANT perspective. We do so by focusing on the way the topic and its measurability get constructed within the collaborative. In the conclusion we come back to the question of what our analysis can add to (discussions on) evaluation studies of QIC's.
Theoretical framework
As a methodology for opening the black box of QIC's, we draw on Actor-Network Theory. From an ANT perspective, none of a collaborative's elements is fixed before the start of a project. Seen from this perspective, a collaborative is a dynamic process in which its elements get constructed [11]. In drawing on ANT, researchers need to "follow the actors" and to analyze how these actors themselves define what is going on [12,13]. Therefore, we will not predefine the concept of 'collaborative' and its elements. Rather, we look at the way the collaborative is formed during the project and what consequences this process has for the actors involved.
To analyze the dynamics within the topic of a collaborative, we use the ANT notion of problematisation [14], which involves a dynamic way of defining and constructing the problem. Seen from this perspective, a problem is not given and already out there, but is constructed in a process in which actors can always (implicitly or explicitly) oppose the problematisation. We use the term 'problematisation' instead of 'problem definition' because it offers two advantages. First, it means that the problem definition emerges from a performance and not just from a perspective [15]. Secondly, it implies that the problematisation is not a singular event but is done over and over again, because (dynamic) practices make up a problematisation. The term 'problematisation' thereby allows us to follow how the different actors involved construct the topic of a QIC. It suggests investigating the way leaders of improvement projects present the topic, the way improvement teams participating in the project discuss the topic, and the way teams perform it within the care organizations.
Looking at the problematisation process within improvement projects has already proven to be relevant. For example, Zuiderent-Jerak et al. [16] showed that different problematisations can co-exist within a medication safety improvement project. In this project some teams focused on client autonomy whereas others sought to reduce medication errors. The authors thus showed that teams may differ in how they carry out a problematisation process, but they did not analyze the construction of the problematisation over the course of an improvement project, which will be our focus.
Next to studying the problematisation process, we investigate the measurement practices. ANT suggests not making an a priori distinction between human and non-human actors [12]. The measurement instruments used within the projects can be perceived as non-human actors that possibly contribute to the collaborative and to the way the topic is performed. As said, measuring is assumed to play a dominant role in QIC's, notably regarding rapid cycle improvements. This means that improvement teams are to carry out small-scale actions, measure whether the actions led to the expected outcomes, and, if not, adjust the actions [2]. Furthermore, measuring is often used to estimate a project's overall effectiveness.
Yet ANT scholars and others have pointed at the performative effect of measurement instruments, meaning that instruments not only measure a situation but also affect this situation in foreseen and unforeseen ways. Perhaps the most famous example of performativity is that of opinion polls, which are meant to gauge the expected election result but at the same time influence actual voting behaviour and thus possibly change the election result. Furthermore, work in the social sciences holds that measuring plays a profound role in shaping the identities of persons and groups [17-19]. For example, Hacking holds that classification, including classification based on measuring, produces so-called looping effects, in which people react to the classification and make it either more true by behaving in line with it or less true by opposing it [20]. Here, too, classification and measuring are said to "interact" [20] with the world they refer to; they do not just represent this world but change it. As a last example, in the organizational sciences Power [21,22] illustrated the performativity of measurements used within and between organizations. The data not only represent practices in the organization but also co-construct the organization and the actors involved in foreseen and unforeseen ways. People sometimes start to focus mainly on the measures and on attaining high results, thereby focusing less on other issues not captured in the measures [21,23].
Given that measurement practices can have a performative effect, their exact role(s) should be analyzed if we want to study the construction of a collaborative, because the measurement practices possibly affect the improvement practices and thereby also the performance of the topic - i.e. the problematisation - in foreseen and unforeseen ways. Consequently, if we want to address the question of what effectiveness in improvement projects may mean, we should look at the way in which the problematisation process and the measurement practices are interlinked.
Methods
The collaborative approach
In this article we focus on two improvement projects that were part of a larger collaborative performed in the Netherlands: Care for Better. These projects were named 'recovery-oriented care' and 'social participation'. They aimed at improving long term mental health care and care for the intellectually disabled. Both projects started in 2007 and consisted of two rounds each lasting one year.
Twelve to fifteen improvement teams collaborated in each round of each project. Headed by a project leader, each team generally consisted of four to nine members. Each of the two faculty teams consisted of an expert team and a core team made up of the program leader of the overall project and two or three 'process counsellors'. In this article, the leaders of improvement teams are thus called project leaders, and the leaders of the overall project are called program leaders. In the 'recovery-oriented care' project only teams from mental health care participated. In the 'social participation' project a mixture of teams participated, some delivering care to intellectually disabled clients and others to psychiatric clients. The clients involved usually lived in a form of sheltered housing or on a ward of the institution.
For each round of each project, four national working conferences were organized at which faculty provided recommendations on the improvement actions and the method for improving. The starting conferences were mainly intended to familiarize teams with the proposed problem and improvement method. In the first and second working conferences the improvement practices were discussed in a mix of plenary sessions and workshops. The closing conference mostly served to sum up the results attained and to focus on sustaining and spreading the findings.
Despite the organizational similarities, the projects had different goals. The 'recovery-oriented care' project was devised to give clients more control over their lives, while the 'social participation' project aimed at enlarging and enriching the clients' social networks, supposedly making them feel less lonely. The exact interventions of improvement teams participating in these projects are part of our analysis and will be discussed in the results section.
Research methods
We evaluated these projects within the context of our larger evaluation study of the Care for Better collaborative [16,24]. We carried out ethnographic observations at nine of the sixteen conferences, distributed over the two rounds of the two projects. Most data gathered at these conferences comprise lectures by the faculty team, reactions of improvement teams to these lectures, and discussions concerning the improvement practices. Furthermore, we interviewed the two program leaders separately. In addition, we studied six improvement teams in depth. These teams were selected on the basis of our observations at the conferences. In all cases we interviewed the project leaders of these improvement teams. Sometimes additional interviews with team members were conducted. These case studies usually lasted half a day or one day.
We focus on only these two projects because this gives us the opportunity to analyze them in more depth, but other projects could have been used as examples as well. We draw on the 'recovery-oriented care' project to illustrate the problematisation process as suggested by faculty and as actually performed by participating teams. In zooming in on the problematisation process, we first analyze the way faculty of the 'recovery-oriented care' project presented the problem and the interrelated proposed solutions. Secondly, we analyze how teams discussed things and set improvement projects going [16], as well as whether and how they adopted the problematisation proposed by faculty. Measurement practices, on the other hand, are illustrated by observations from the 'social participation' project. We investigate the relation between the measurement practices and the initial problematisation of faculty, and furthermore explore the consequences of the measurement practices for the project and for the actors involved.
Conclusion
In this article we proposed a way of opening the black box of QIC's, going beyond a mere description of the elements or the extent to which they are implemented. We studied a collaborative in action and analyzed how it was formed over the course of one-year-long improvement projects. To illustrate our method and to actually open the black box we zoomed in on the problematisation process and the related measurement practices. Our empirical material came from two projects in a larger collaborative aimed at improving mental health care and care for the intellectually disabled.
The problematisation process in the 'recovery-oriented care' project proved to have undergone a transformation. At baseline professionals mainly sought to support clients' recovery process by not hindering it. Later on many teams were trying to stimulate clients. This problematisation process also had consequences for the different roles proposed for the actors involved; it had consequences for "who has the right and who has the obligation" to do something about the problem [27]. So by using an ANT-perspective we showed that the topic is not fixed and given prior to the collaborative, but instead is formed within the collaborative, both by the expert knowledge of faculty and the local knowledge of improvement teams.
This change in problematisation may have been more pronounced in the projects we studied because the improvement teams were free to come up with their own targets, to 'do' their own problematisation. This may be different for other improvement projects. When improvement teams have that much leeway, a dynamic perspective is even more needed if only to see what the improvement project is all about.
To further open up the black box, we studied the role of the measurement practices within the 'social participation' project. As effectiveness studies often assume a direct relation between a topic and its measurability, researchers should unravel this relation, as we suggested. Our analysis showed that the measurement instrument is linked to the problematisation in more than one way. First, it measures the effectiveness of the collaborative in reaching the predefined goal(s). Secondly, it may strengthen the problematisation, supporting both the problem and the solution that faculty proposed. The instruments may then make it more likely that the teams will adopt the problematisation. Moreover, there were effects on the improvement practices as well. Measurement instruments inevitably carry assumptions, for example that clients are willing and able to have conversations about their network, and would do well not to count professionals among their friends. Therefore measurement instruments also co-define and co-produce the actors involved, and thus have a performative effect on the practices they measure [21].
Both human and non-human actors play a role in constructing the collaborative, as we showed by using an ANT-perspective. However, by using this perspective, we were less concerned with the "why" or the intentionality question [28], for example with the question why the project was framed in a certain way or why the faculty team of the 'social participation' project urged professionals and clients alike not to classify professionals as clients' friends. Instead we focused on the performance of a project and what consequences this performance has for the actors involved.
By focusing on these questions, we showed that the problem cannot be assumed to stay fixed over the course of the project. Yet these changes in problematisation do not automatically have consequences for measuring the effectiveness of the collaborative. In the 'recovery-oriented care' project, the solution changed, but teams were still trying to solve the same problem that faculty pointed at: the lack of future perspective. Whether such changes imply a change in effectiveness as well depends on the actual goal of the program and on the indicators for success.
As much "summative" research assumes that goals do not change during the project, however, it is important to test this assumption. Otherwise, it is hard to ascribe the effectiveness - or lack of effectiveness - to the improvement actions and the collaborative method. So it would seem crucial not only to report on outcomes but also to analyze what happened in the collaborative. Therefore, our analysis can be seen as a plea for a mixed methods approach. This mixed methods approach is part of an ongoing debate, and although some scholars argue for such an approach, the extent to which it is actually applied leaves much to be desired [3,4].
Bate and Robert, in contrast, argue that a summative (mainly quantitative) evaluation and a formative (mainly qualitative) evaluation will not mix at the end of the day: "Although there are some overlaps and similarities, they are, in our view, ultimately incompatible and incommensurable research paradigms (...)" [5]. A formative approach implies intervening in the object one studies, which affects the outcomes, they say. This creates "impossible and, largely, unmanageable tensions" between intervention and experiment, because no valid statements can be made about the method itself affecting certain results [5].
Yet our analysis can be read as a contradiction of, or at least as a critical note to, their argument. We have seen that measuring practices indeed can have a performative function. The improvement teams' own measurement practices are often used by evaluation researchers to examine the effectiveness of QIC's. Therefore, this type of evaluation research has to deal with the same performativity of the measurement practices, which has consequences for the proposed distinction between intervention and experiment. But even if measurement instruments are used that are not owned by the improvement teams themselves, one could question whether these are free from performative effects. Even the fact that measuring takes place already influences its outcomes and thus can be seen as an intervention in itself [16,21,22].
Yet the performative effect of measuring will only occur, or will be more likely to occur, if measuring is part of a reflexive process [29]. If it is not reflected upon, measuring cannot lead to more awareness of a situation, to a redefinition of concepts, or to confrontations with certain circumstances, which may all be effects of measuring, as we have seen in the analysis. So only if this reflection occurs - which is highly likely when questions are asked to clients, as in our case - is measuring a performative process.
When accepting that measuring - given reflexivity - can have (per)formative elements as well, there will be no unmanageable tensions, no incommensurable paradigms between intervention and experiment. Measuring is based on an interpretative and presumably also performative process, and is thus a formative instrument in itself. If there are always tensions between intervention and experiment, these tensions are no valid reason for not combining the more formative and the more summative research. Besides, as collaboratives are governmental instruments to improve certain aspects of care, they deserve confirmation of the expected outcomes. Thus there is every reason to explore ways to intelligently combine qualitative and quantitative methods, gaining insight into both the outcomes and the (dynamic) construction of the collaborative. This then is the next challenge for evaluation researchers of quality improvement collaboratives.
Competing interests
The authors declare that they have no competing interests.
Authors' contributions
TB collected ethnographic data at the conferences and within the organizations. AN and RB helped to draft the manuscript and took part in reviewing it. All authors read and approved the final manuscript.