Background
Despite efficacious therapies, depression remains a leading cause of disability [
1‐
3]. Most depression is detected in primary care, yet rates of appropriate treatment for detected patients remain low. There is ample randomized trial evidence that collaborative care management (CCM) for depression is an effective [
4‐
6] and cost-effective [
7] approach to improving treatment and outcomes for these patients. In CCM, a care manager supports primary care clinicians (PCCs) in assessing and treating depression symptoms, often with active involvement of collaborating mental health specialists. Care managers typically carry out a comprehensive initial assessment followed by a series of subsequent contacts focusing on treatment adherence and patient education and/or activation. Use of CCM, however, is not yet widespread in routine primary care settings. This study used a cluster-randomized design to formatively evaluate the success of evidence-based quality improvement (EBQI) methods in implementing effective CCM as part of routine Veterans Affairs (VA) care. Problems detected through our rigorous evaluation could then be used to support higher-quality model development for sustaining and spreading CCM in VA primary care practices nationally. The study's major goals were thus: to learn about the process of implementing research in practice, including effects of context; to test the effectiveness of EBQI for adapting research-based CCM while maintaining its effectiveness; and to provide information for improving the implemented model.
Implementation of CCM as part of routine primary care requires system redesign. EBQI is a redesign method that supports clinical managers in making use of prior evidence on effective care models while taking account of local context [
8‐
10]. For this study, VA regional leaders and participating local sites adapted CCM to VA system and site conditions using EBQI [
11]. We term the locally adapted CCM model EBQI-CCM. The study used a cluster-randomized design to evaluate seven EBQI-CCM primary care practices versus three equivalent practices without EBQI-CCM.
Theory suggests that durable organizational change of the kind required by CCM is most likely when stakeholders are involved in design and implementation [
12,
13]. Yet classical continuous quality improvement (CQI) for depression, which maximizes participation, does not improve outcomes [
14,
15]. EBQI, a more structured form of CQI that engages leaders and QI teams in setting depression care priorities and understanding the evidence base and focuses teams on adapting existing CCM evidence and tools, has been more successful [
8,
16‐
18]. This study built on previous EBQI studies by adding external technical expert support from the research team to leverage the efforts of QI teams [
19].
CCM can be considered a practice innovation. Research shows that early adopters of innovations may be different from those who lag in using the innovation [
20]. We hypothesized that CCM, which depends on PCC participation, might yield different outcomes for patients of early adopter clinicians compared to patients of clinicians who demonstrated less use of the model. We found no prior CCM studies on this topic. Because this study tested a CCM model that was implemented as routine care prior to and during the randomized trial reported here, we were able to classify clinicians in terms of predilection to adopt CCM based on their observed model participation outside of the randomized trial. We then assessed CCM outcomes for our enrolled patient sample as a function of their PCC's predilection to adopt the model.
In this paper, we evaluate implementation by asking the intent-to-treat question: did depressed EBQI-CCM practice patients enrolled in the randomized evaluation and referred to CCM have better care than depressed patients at practices not implementing CCM? We also ask the contextual subgroup question: do EBQI-CCM site patients of early adopter clinicians experience different CCM participation outcomes than those of clinicians with a low predilection to adopt CCM? Because our purpose was to study and formatively evaluate the implementation of a well-researched technology [5], our grant proposal powered the study on a process of care change (antidepressant use). We also assessed pre-post depression symptom outcome data on all patients referred to care management as part of EBQI. These data are documented elsewhere [11], and are used in this paper to gain insight into differences between naturalistically referred patients (representing true routine care use of CCM in the sites outside of research) and study-enrolled patients.
Discussion
This study aimed to determine whether healthcare organizations could improve depression care quality using EBQI for adapting CCM to local and organizational context [
8‐
11]. This multi-level research/clinical partnership approach called for systematic adaptation of previously-tested, effective CCM for depression [
4‐
7], and for focusing on the key aspects of the Chronic Illness Care Model (i.e., patient self-management support, decision support, delivery system design, and clinical information systems) [
48]. Researchers served as technical experts rather than decision makers or implementers. To rigorously and formatively evaluate EBQI-CCM early in its life cycle in VA, we used cluster randomized trial methodology powered to detect changes in the process, but not outcomes, of care. Our answer regarding whether healthcare organizations can use EBQI methods to improve depression care is mixed in that EBQI-CCM showed improvements in process (antidepressant use) while not demonstrating outcome improvements.
Unlike our answer regarding the effectiveness of EBQI-CCM, our answer regarding the usefulness of randomized trials for formative evaluation is strongly positive. The trial was inexpensive ($750,000) relative to system costs for fully implementing CCM nationally, identified the need to improve the EBQI-CCM model, and provided critical information on how to improve it. Information on issues raised by this trial, such as care manager workload, follow-up, access, and clinician effects on patient outcomes, is essential if the results of over 35 CCM randomized trials are to be replicated in routine care.
To understand study results, we evaluated adherence to the study intervention plans. In this study, the investigators implemented EBQI, rather than CCM itself; engaged sites implemented CCM. In terms of overall adherence to EBQI methods, our approach engaged regional and local leaders effectively in guiding local QI teams through PDSA cycles for designing and implementing CCM. All participating practices implemented and maintained EBQI-CCM throughout the study. In terms of adherence to CCM, as implemented by regional and local leaders with EBQI support, QI data [
11] showed that EBQI-CCM incorporated the key features identified in the published CCM literature [
50] in terms of patient education and activation [
26,
51], care management follow-up, systematically assessed symptoms, and collaboration between primary care providers, care managers, and mental health specialists [
31,
32]. Adherence to the TIDES DCM protocol for promptness and completion of all required clinical assessments for individual randomized evaluation patients, however, was problematic.
Despite the identified problems with completion of assessments, the EBQI-CCM site patients were prescribed antidepressants at appropriate doses significantly more often than those in non-EBQI-CCM practices (a 23% difference). EBQI-CCM site patients similarly had significantly more prescriptions actually filled (a 22% difference). Prior studies of CQI or lower-intensity EBQI for depression in primary care have not shown improved prescribing [8, 14, 15]. Increased antidepressant use, however, did not translate into robust improvements in depression symptoms, functional status, or satisfaction with care in intent-to-treat analyses.
Because we aimed for the most efficient use of study resources for evaluating the process and effectiveness of an implementation method, rather than the effectiveness of CCM, we predicated our design on the lower sample size required for assessing a key process of care (antidepressant use) rather than the more demanding sample size needed for assessing effects on patient symptom outcomes. Our power to detect a 20% difference in depression symptoms was only between 0.21 and 0.29, based on the obtained sample size and ICC. We thus cannot definitively say that depression symptom outcomes did not improve within the timeframe studied. We were, however, disappointed in the lack of robust symptom impacts, and sought to determine more explicitly what lessons readers should take away from our work.
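The dependence of power on sample size and ICC noted above can be illustrated with a minimal sketch. It assumes a normal-approximation test for a difference between two proportions and the standard design effect for cluster randomization, 1 + (m - 1) x ICC. The function names and all input values are hypothetical placeholders rather than study values, and the study's actual power calculation may have used a different method.

```python
from math import sqrt
from statistics import NormalDist


def design_effect(mean_cluster_size: float, icc: float) -> float:
    """Variance inflation due to clustering: 1 + (m - 1) * ICC."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def power_two_proportions(p1: float, p2: float, n1: float, n2: float,
                          deff: float = 1.0, alpha: float = 0.05) -> float:
    """Approximate two-sided power for a difference in proportions.

    Sample sizes are deflated by the design effect to obtain effective
    sample sizes; the negligible opposite-tail rejection probability
    is ignored.
    """
    n1_eff = n1 / deff
    n2_eff = n2 / deff
    se = sqrt(p1 * (1 - p1) / n1_eff + p2 * (1 - p2) / n2_eff)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(abs(p1 - p2) / se - z_crit)


# Hypothetical inputs for illustration only (not taken from the study).
deff = design_effect(mean_cluster_size=50, icc=0.02)
print(f"design effect: {deff:.2f}")
print(f"power, ignoring clustering: {power_two_proportions(0.40, 0.60, 150, 150):.2f}")
print(f"power, with clustering:     {power_two_proportions(0.40, 0.60, 150, 150, deff=deff):.2f}")
```

Under these assumptions, even a modest ICC reduces the effective sample size, which is why a sample sized for a large process-of-care difference can leave low power for smaller differences in symptom outcomes.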
To place our findings in context, we first asked: Do our randomized evaluation results test the effectiveness of CCM as a model? We conclude they do not. Other CCM studies have tested the CCM model as implemented using designs with higher researcher control, and shown effectiveness in diverse healthcare sites [
4‐
7]. These studies, however, did not test self-implementation of CCM by healthcare practices or sites using QI methods. For example, meta-analyses on the effectiveness of the CCM model [
4‐
6] exclude prior studies of QI methods for implementing CCM [
8,
14,
15], recognizing that these studies address a different question. Our randomized evaluation results test the ability of healthcare organizations and sites to adapt and implement research-designed CCM as a part of their organizational cultures and structures. In so doing, the results provide information for improvement. Because typical healthcare organizations or practices must use QI methods to adopt research-based depression care models, the challenges faced by this study are likely to be relevant to managers, policymakers, and researchers interested in improving depression care at a system or organizational level.
Our goal of combining a randomized evaluation with QI methods resulted in challenges related to timing. Our fixed windows for baseline and follow-up surveys meant that delays in initiation of care management for study patients, followed by lags in PCC ordering of treatments, were not accommodated in assessing outcomes. Thus, for many patients, antidepressant treatment or psychotherapy began only a short time before the seven-month follow-up survey. Furthermore, while 81% of randomized evaluation patients eligible for panel management completed the six-month DCM assessment, many fewer had completed the designated number of follow-up contacts by that time. These results highlight the challenges for researchers in timing outcome measures relative to patient access to CCM in a study with a rigorous randomized design but low researcher control of the intervention.
Excessive demand for care management proved to be another challenge, and one that is potentially relevant to CCM program managers. While we did not intend to overload care managers, we inadvertently did. As originally envisioned, the randomized evaluation would have begun after EBQI-CCM practices had completed a small number of PDSA cycles of the CCM intervention involving as few as ten and no more than fifty total patients. Under this scenario, care managers could have covered both naturalistic referrals and randomized evaluation referrals, given typical care manager caseloads [
52]. In reality, the requirements of eight separate IRBs, faced with an unfamiliar implementation research model and with the introduction of HIPAA [
53], led to a prolonged period between start of naturalistic PDSA intervention development and start of the randomized evaluation (a gap per intervention practice of between 111 and 334 days with a mean of 263 days). The study team discovered that it was not feasible, under QI conditions, to turn off naturalistic referrals. Thus, care manager caseloads were full with naturalistically referred patients prior to the start of the randomized evaluation.
Our study mimicked a potential organizational policy such that patients screening positive for depression would be automatically referred to CCM, along with patients referred by their PCCs. This scenario represents a potentially realistic policy in VA in particular, because routine depression screening is already mandated. Our results show that implementing effective follow-up of primary care-detected depression for all eligible patients will be challenging.
Few previous studies of CCM have tested reach [
54] or the degree to which all eligible individuals in a given clinical setting can have access to the CCM model. Most previous CCM randomized trials have limited total patient access to CCM to patients enrolled in the trial, thus artificially controlling demand. Care manager caseload capacity [
52] bounds effective reach under naturalistic conditions. Our study ratio of approximately one care manager per 10,000 primary care patients may need to be adjusted, or its effects ameliorated by other depression care redesigns.
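How caseload capacity bounds reach can be illustrated with simple arithmetic. The sketch below is an illustration under stated assumptions only: the function name, the eligible fraction, and the caseload per care manager are hypothetical placeholders, not values measured in this study.

```python
def effective_reach(panel_size: int, eligible_fraction: float,
                    care_managers: int, caseload_per_manager: int) -> float:
    """Fraction of eligible patients who can be served, capped at 1.0."""
    eligible = panel_size * eligible_fraction
    if eligible <= 0:
        raise ValueError("eligible patient count must be positive")
    capacity = care_managers * caseload_per_manager
    return min(1.0, capacity / eligible)


# Hypothetical inputs: one care manager per 10,000 primary care patients,
# with an assumed eligible fraction and an assumed active caseload.
reach = effective_reach(panel_size=10_000, eligible_fraction=0.05,
                        care_managers=1, caseload_per_manager=100)
print(f"effective reach: {reach:.0%}")
```

Under these assumptions, reach falls below 100% whenever the number of eligible patients exceeds total caseload capacity, regardless of how referrals are generated.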
Looking within the randomized evaluation and its patients, we found unanticipated associations between patient outcomes and whether their PCC was an early adopter of CCM. Patients who were referred to CCM by the study team and whose PCC was an early CCM adopter [
20], were significantly more likely to be assessed by care managers and to receive adequate care manager follow-up than other patients, independent of patient depression severity or comorbidities. These results suggest that patient access to full CCM care was shaped more by who their provider was than by patient need.
We think the clinician effects we observed are likely to substantially affect CCM models under naturalistic conditions. Clinician effects did not influence patient enrollment in the study. No clinicians in these practices refused to have their patients referred to CCM by study personnel, and the proportions of randomized evaluation patients belonging to early adopter clinicians versus those with a low predilection to adopt the model mirrored the proportions of clinicians who fell into these categories in the study practices.
Clinician predisposition to adopt CCM affects use of the model under naturalistic conditions in another way as well. In essence, patients of early adopter clinicians tend to monopolize the DCM resource. For example, 73% of naturalistically referred patients belonged to clinicians who habitually used CCM (ten or more naturalistic referrals), while only 27% of randomized evaluation patients belonged to this type of clinician. Registry data on patients naturalistically referred to CCM show the same pattern we saw in data from the randomized evaluation: the quality of CCM care is better among patients of early adopter clinicians [
11]. Differential use of and benefits from CCM resources based on clinician characteristics should be taken into account both in interpreting studies of CCM and in implementing CCM as routine care.
Theories of social justice and mental health parity would argue that patient need, rather than clinician characteristics, should govern access to clinical resources. Our results emphasize the importance of active PCC engagement in CCM, a topic not addressed in most prior CCM research, though identified as an issue by prior qualitative work on the TIDES program [
49]. Clinician effects may be mediated by, for example, more effective encouragement to patients about participating, and greater promptness in responding to care manager suggestions about ordering treatment. These activities might in turn be moderated by differences in clinician knowledge and attitudes about depression and/or greater experience in using CCM. Monitoring of clinician effects on use of CCM (and potentially other mental health services) is critical for CCM programs, and better methods for bringing all clinicians in line with the early adopters are needed [
55].
There are reasons to collect program evaluation data other than testing program efficacy or effectiveness. We think CCM programs should track patient outcomes using a registry, as was done in this study, to monitor program utility and safety. Unlike our randomized evaluation results, our pre-post care manager registry data reflect how the program functions naturalistically. These naturalistic data, while not testing model effectiveness, show that CCM as implemented in these VA sites performed safely, and with outcomes that met or exceeded CCM targets for patients whose clinicians chose to refer them.
We used our registry data as well as our randomized trial data to shape and improve EBQI-CCM. As reported elsewhere [
11,
19], registry results on naturalistically referred patients show that: 82% of 208 were treated for depression in primary care without specialty referral; 74% stayed on medication for the recommended time; and 90% of primary care managed patients and 50% of mental health specialty managed patients had clinically significant reductions in depressive symptomatology (PHQ-9 scores < 10) at six months. On average, PHQ-9 scores improved nine points; improvement remained significant controlling for depression severity and complexity. While subject to selection bias, these results showed potential benefit of the program for diverse patients. If the results had been different, such as showing little or no symptom improvement or raising safety concerns, we would have considered stopping or fully redesigning the program. Instead, our follow-up PDSA cycles focused on reducing patient and clinician selection effects while continuing to monitor patient outcomes.
Our comparison of registry data with rigorous study data is encouraging regarding the validity of registry data for depression program monitoring purposes. Although we would not have discovered the effects of clinician adopter status based on registry data, because of the few included patients belonging to low adopter clinicians, we can replicate this clinician effect in retrospect through the registry. Other major context effects demonstrated by the randomized experiment, including effects of patient complexity, can also be observed through the registry data, supporting its validity. Per protocol analysis (looking only at patients who received full CCM) of symptom and functional status outcomes for randomized evaluation patients was consistent with registry results in terms of the level of outcome improvement observed, providing qualitative triangulation on the importance of ensuring patient completion of CCM as a critical target for improvement. We conclude that registry data on patient care and symptom outcomes can be accurate enough for program monitoring, with appropriate attention to selection bias and care manager training on data collection.
Registry data have limitations. Registry patients were selected and are not representative of all eligible patients. In addition, care managers, though extensively trained, may have been less objective or consistent in their administration of the PHQ-9 than were external data collectors.
One of the explicit goals of the TIDES program was to develop a CCM model that could be spread nationally in VA [
56]. In terms of this goal, the randomized evaluation of EBQI reported here informed the ongoing QI and model spread process for TIDES. For example, the issues with differential PCC involvement led to systematic training and engagement approaches on a national basis [
56]. As it improved, the TIDES program became one of the bases for the Veterans Health Administration (VHA) Primary Care-Mental Health Integration (PC-MHI) initiative [
57‐
59] and is codified in the VHA's Uniform Mental Health Services Package directive [
60]. EBQI thus seems to be an effective method for designing a program that sustains and spreads. Data from the randomized evaluation reported here, however, indicate that ensuring that the sustained, spread programs produced by EBQI achieve comparative effectiveness on a population basis is also critical. Ongoing national evaluation of primary care-mental health integration in VA has the potential to achieve this goal.
This study has limitations. First, it focused on a single healthcare system (the VA) and on non-academic, small to medium-sized practices, almost a third of which were rural [
21]. Results may not be generalizable to other systems or practice types. Second, our study's power to detect symptom outcomes was limited by our follow-up window of seven months. For some patients, delays in access to care managers and thus in treatment initiation may have limited the possibilities for completing treatment within the study window. Third, we were not able to confirm registry data on care manager visits by analysis of administrative data because a specific encounter code for depression care management was not introduced by the VA until after this study. Fourth, use of consecutive sampling over-represents more frequent users of primary care relative to the full population of visiting patients [
61]. Finally, our process evaluation subgroup analyses on early adopters are not appropriate for drawing conclusions about causality or the overall effectiveness of EBQI-CCM. Selection bias, in particular, cannot be eliminated as a factor in these analyses.
In summary, this study showed that CCM, as implemented using EBQI, improved antidepressant prescribing across a representative sample of patients attending study practices. While this randomized evaluation does not test the effectiveness of CCM as an ideal model, it does test the effectiveness of CCM as designed and implemented in VA using QI methods, albeit early in the program's lifespan. The study encountered a number of difficulties likely to apply to other healthcare organizations implementing CCM as routine care, including the consequences of care manager overload and of differential PCC adoption of CCM. The lack of robust patient symptom improvement for the experimental group compared to usual care points to the importance of continuously monitoring and improving CCM programs during and after implementation. Otherwise, the cost-effectiveness benefits promised by CCM studies will not be achieved in reality.