Effectiveness of Alternative Ways of Implementing Care Management Components in Medicare D-SNPs: The Care Wisconsin and Gateway Study
Jelena Zurovac, Randy Brown, Bob Schmitz and Richard Chapman
Mathematica Policy Research
Evidence on best practices in care management for chronically ill Medicare beneficiaries offers few clear guidelines about what works best. Given the wide variation both within and across plans in how special needs plan (SNP) services are provided, it becomes important to identify how best to implement or improve intervention components rather than testing only the overall effectiveness of SNPs in general. In this study, we sought to understand which of two alternative ways of implementing each of several components of care management leads to better health outcomes in the two participating SNPs. We used an efficient orthogonal design that allowed us to simultaneously compare the effectiveness of alternative approaches to implementing ten components of care management services. Efficient orthogonal designs have been used extensively in manufacturing, and in some health care organizations, but not in published health care evaluations. Such designs enable the testing of multi-component interventions and various ways of deploying each component, offering great potential as a tool for continuous improvement in health care quality.
We tested two alternatives--routine care (services routinely provided at the plans before the study) and enhanced care (more frequent or more intensive versions of the services)--for each of ten care components. The tested components included frequency-of-routine contacts; falls-risk screening frequency and referral to fall prevention programs; depression screening frequency, use of depression screening instruments, and mode of referral; member education and coaching strategies; and management of care transitions, including frequency of follow-up and use of protocols and tools.
Study Design and Analysis Methods
Randomization, Outcomes, and Data
The participants were: (1) care managers in Care Wisconsin and Gateway (in Pennsylvania) plans who implemented the interventions; and (2) the 1,562 dually eligible noninstitutionalized members with disabilities or frail elderly who comprised these care managers' caseloads. We randomly assigned each of the 24 care teams/care managers in the study (17 from Care Wisconsin and seven from Gateway) to implement a different, pre-selected combination of alternatives (routine care or enhanced care) for each of ten components, for one year. The care managers implemented the same intervention components for all of their members.
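To make the idea of a pre-selected combination of alternatives concrete, the following is a minimal sketch of one way to build a 24-row, two-level orthogonal array for ten components. It uses the Paley (quadratic-residue) construction that underlies Plackett-Burman designs. This is an illustration only: the summary does not say which orthogonal array the study used, and the names below (design, chi, N_COMPONENTS) are hypothetical.

```python
# Illustrative only: a 24-run, two-level orthogonal array built with the
# Paley (quadratic-residue) construction. This is NOT the study's actual
# assignment table, which the summary does not describe.
Q = 23             # prime with Q % 4 == 3, which yields Q + 1 = 24 runs
N_COMPONENTS = 10  # number of care components tested in the study

# Nonzero quadratic residues modulo Q.
residues = {(x * x) % Q for x in range(1, Q)}


def chi(a):
    """Return +1 for a nonzero quadratic residue mod Q, otherwise -1.

    Zero is mapped to -1 here, which sets the diagonal of the array below.
    """
    return 1 if (a % Q) in residues else -1


# 23 cyclic rows plus one closing row of +1 give 24 runs by 23 columns.
rows = [[chi(j - i) for j in range(Q)] for i in range(Q)]
rows.append([1] * Q)

# Keep one column per component: +1 = enhanced care, -1 = routine care.
design = [row[:N_COMPONENTS] for row in rows]


def column(k):
    """Return the settings of component k across all 24 runs."""
    return [row[k] for row in design]


def dot(u, v):
    """Inner product of two equal-length columns."""
    return sum(a * b for a, b in zip(u, v))


# Balance: every component is enhanced in 12 runs and routine in 12 runs.
balanced = all(sum(column(k)) == 0 for k in range(N_COMPONENTS))

# Orthogonality: every pair of component columns has zero inner product.
orthogonal = all(
    dot(column(a), column(b)) == 0
    for a in range(N_COMPONENTS)
    for b in range(a + 1, N_COMPONENTS)
)

print(len(design), balanced, orthogonal)
```

In a design like this, each row would be the set of ten routine or enhanced settings handed to one care team, and teams would be assigned to rows at random. Because the columns are balanced and mutually orthogonal, the comparison for any one component is not confounded with the settings of the other nine.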
For each component, we analyzed whether members assigned to the enhanced care variant experienced different outcomes than those assigned to the routine care variant. Outcomes examined included: (1) the number of inpatient admissions for any reason; (2) the incidence of readmission within 30 days of any discharge (i.e., including those for mental health problems) and within 30 days of a discharge from medical stays only; and (3) the number of emergency room (ER) visits. We focused on readmissions for members hospitalized at least once during the follow-up because members who were not hospitalized cannot be readmitted. The program ran from May 16, 2011, through May 15, 2012, and we measured impacts over this period. We received approval for the study from the New England Institutional Review Board. U.S. Office of Management and Budget approval was not required because Mathematica did no primary data collection.
To analyze the effectiveness of enhanced versus routine care, we used two sources of secondary data obtained from the participating plans: (1) de-identified claims data on members' service use and chronic conditions; and (2) de-identified data on members' demographic characteristics and risk level, as assessed by each plan. For the implementation analysis, we used data collected by participating plans via tracking tools to assess the care managers' fidelity to their assigned component options. We also conducted discussions with care management staff to understand how faithfully the components were implemented.
Impact and Implementation Analysis Methods
We used regression analysis to compare the outcomes for members receiving routine care to the outcomes for members receiving enhanced care, controlling for any pre-intervention differences between the two groups in members' and care managers' characteristics. All four outcomes were analyzed over these follow-up periods for all members: 1-6 months; 7-12 months; and the full 1-12 month period. Analyses of effects of components on readmissions were done for hospitalized members over the 1-12 month follow-up. Regression analyses controlled for member characteristics observed over the two-year baseline period (May 16, 2009, to May 15, 2011).
Implementation analysis is particularly important because a finding from regression analyses that routine and enhanced care options were equally effective in terms of observed health outcomes for a given component might be incorrect if such care was not fully implemented. If such analysis suggests that some planned intervention enhancements were not well-implemented, further assessment should be done to identify the barriers to implementation of those interventions. In June through August 2012 (between one and three months after the intervention period ended, but before the analysis results were produced), we held discussions with care management staff to get their views on why enhanced care may have been more effective than routine care for some components but not for others, and to identify implementation facilitators and barriers. Care managers were instructed to use the tracking tool form after each contact with the members to record which components were provided. Using this information, we assessed the fidelity to assignments by examining: (1) the proportion of members receiving the assigned option at least once; (2) the annualized number of times each component or option was provided per member; and (3) the proportion of members receiving the option at least as often as assigned.
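The three fidelity measures listed above can be sketched as follows. This is a minimal illustration, not the study's analysis code: the record layout, the field names, and the rule of comparing an annualized count against an assigned per-year frequency are all assumptions, and the four example records are made up solely to exercise the function.

```python
from dataclasses import dataclass


@dataclass
class MemberTracking:
    """Hypothetical per-member summary of tracking-tool entries for one component."""
    times_provided: int       # times the component was recorded as provided
    days_enrolled: int        # days in care management during the study year
    assigned_per_year: float  # assigned minimum frequency, expressed per year


def fidelity_summary(records):
    """Compute the three fidelity measures for one component."""
    n = len(records)
    # Scale each member's count to a full year of exposure (assumed convention).
    annualized = [r.times_provided * 365.0 / r.days_enrolled for r in records]
    return {
        # (1) proportion of members receiving the assigned option at least once
        "share_received_once": sum(r.times_provided >= 1 for r in records) / n,
        # (2) annualized number of times the option was provided per member
        "mean_annualized_count": sum(annualized) / n,
        # (3) proportion receiving the option at least as often as assigned
        "share_met_assignment": sum(
            a >= r.assigned_per_year for a, r in zip(annualized, records)
        ) / n,
    }


# Made-up records for illustration; these are not study data.
example = [
    MemberTracking(times_provided=4, days_enrolled=365, assigned_per_year=4.0),
    MemberTracking(times_provided=1, days_enrolled=365, assigned_per_year=4.0),
    MemberTracking(times_provided=0, days_enrolled=365, assigned_per_year=4.0),
    MemberTracking(times_provided=2, days_enrolled=146, assigned_per_year=4.0),
]
summary = fidelity_summary(example)
print(summary)
```

Annualizing matters when members are enrolled for only part of the year: a member with two contacts in 146 days is treated as receiving contacts at a rate of five per year, so short enrollment alone does not count as low fidelity.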
Study Findings and Discussion
The population of members in the study was composed of older adults, included more women than men, and was largely Caucasian. The proportions living in rural versus urban areas were about equal. Gateway members were younger, less likely to be newly enrolled in the plan, and less healthy than Care Wisconsin members. Use of hospital and ER services was high at both baseline and follow-up--47 percent of members were hospitalized during baseline and 43 percent were hospitalized during the follow-up period.
Although outcomes for most of the ten care components were similar for members whose care managers were assigned to the enhanced version and for those assigned to the routine version, there were a few exceptions:
Requiring a higher minimum frequency of contacts and medication reviews was associated with 16 percent fewer ER visits at the full year of follow-up. Members assigned to enhanced care received slightly more contacts and many more medication reviews (38 percent) than members assigned to routine care. Care managers remarked that they had difficulties maintaining the more frequent contact rates due to already high caseloads.
Surprisingly, patients of care coordinators assigned to use the enhanced ("teachback") coaching method had 15 percent more ER visits than patients of those assigned to the routine coaching method at the full year of follow-up, a finding possibly attributable to care managers' lack of familiarity with the teachback method. Fidelity analysis showed that care managers assigned to teachback provided less coaching to their patients, in terms of both the percentage of patients who received any coaching and the number of times they received coaching. The effect on ER visits dissipated in the second six-month follow-up, which might be due to care managers improving their teachback skills over time.
Results for outcomes measured over the periods of 1-6 months and 7-12 months were similar to those for the full period, suggesting that most of the enhanced options neither had early effects that later dissipated nor took several months to take effect. One exception is that members assigned to more frequent falls-risk screenings with an instrument had a greater likelihood of readmission following a medical discharge than those assigned to routine care; at the second six-month follow-up, the finding was reversed. These findings most likely reflect chance variation in when these readmissions occurred rather than real effects of opposite signs.
Some findings of no difference in outcomes may be attributable to a failure to implement the enhanced care option in a manner that sufficiently distinguished it from the routine care option. For example, although the teachback method was qualitatively more intensive, the fidelity analysis showed that the enhanced and routine care groups received approximately the same number of post-discharge follow-ups, consistent with care managers reporting having difficulties conducting the second follow-up because calls were time consuming and members difficult to reach. Although post-discharge follow-up with an instrument and a checklist was qualitatively more intensive and reported as useful by care managers because it provided structure, care managers assigned to the instrument/checklist performed fewer follow-ups, so it is not surprising that we observed similar outcomes for routine and enhanced care groups on this component.
Despite the findings that outcomes were not better for the enhanced version of most of the components tested, the participating plans have nonetheless decided to adopt some of these enhancements. The plans' decisions to adopt these enhancements were made before the results on relative effectiveness of enhancements were available to them, and were therefore based solely on their experience with the options. Care management staff reported several important lessons learned from the study implementation. We found that both plans' care managers and leaders believed that use of the teachback method was a useful and appealing innovation; the two plans intend to train all care managers in the method before requiring its routine use. In addition, Care Wisconsin plans to implement the Patient Health Questionnaire Nine-Question Instrument for depression screening because it was shorter than the tools used at the plan before the study and because community clinicians were familiar with it. Care Wisconsin is considering training care managers in falls-risk assessment. Care Wisconsin has also developed a post-discharge tool similar to one used in the study and is considering adoption of a second post-discharge follow-up because of positive feedback from care managers and because both these enhancements are believed to be helpful to members. Gateway noted that the study introduced more structure in routine contacts, falls-risk screening, and care transitions management, which it considers to be valuable and intends to continue. In addition, the plan intends to train care managers in depression screening. For both plans, the study highlighted the need to track the services delivered by care managers.
Several limitations in the study should be noted. Because only 24 care managers/teams participated in the study, only large differences in outcomes between routine and enhanced care options (22-32 percent of the mean outcome) were likely to be detected. Moreover, to obtain even this number of care managers, it was necessary to "pool" care managers from the two plans and estimate a single effect. We estimated intervention effects under the assumption that these effects were equal for the two plans after controlling for patient characteristics and plan-level differences in mean outcomes.
Members in the Gateway plan were not enrolled in care management for the full one-year duration of the study, since members who reach their goals "graduate" from the program and cease to receive care management services. On average, members receive care management for 4-5 months. Therefore, it may have been even more difficult to detect differences between the tested options because 30 percent of the sample did not receive the interventions for long. Further, it is possible that exposure to intervention components for one year was not long enough for the measured outcomes to change, given the time required to learn how to implement a change in care management protocols effectively and for opportunities for preventing a hospitalization to arise. However, because we are analyzing a high-risk population and types of services that should show results relatively quickly, one year should be long enough for effects to be observed, if they are going to occur at all.
Given that we performed many comparisons between enhanced and routine care, it is possible that some findings resulted from chance. The number of significant differences was about what would be expected by chance for the 80 comparisons (two outcomes were analyzed for ten components for all members for three periods and two outcomes for ten components for hospitalized members). A joint test of whether all enhanced versus routine care differences were zero could not be rejected, indicating that even the few statistically significant observed differences may have been due to chance rather than to the interventions. This indicates that as a group, enhanced components did not have a different effect on measured outcomes than routine practices. While this could be viewed as routine care being just as effective as the tested enhancements, the lack of significant findings may also be due to insufficient statistical power to detect what may have been modest-size favorable effects. Only impacts of 22-32 percent or larger were detectable with 80 percent power.
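The count of 80 comparisons, and the number of significant results that chance alone would produce, can be worked through in a few lines. This is a minimal sketch: the summary does not state the significance threshold used, so the alpha values below are assumptions chosen only for illustration.

```python
# Counts taken from the text above.
components = 10
outcomes_all_members = 2   # outcomes analyzed for all members
periods_all_members = 3    # 1-6 months, 7-12 months, and the full 1-12 months
outcomes_hospitalized = 2  # readmission outcomes, hospitalized members only

n_comparisons = (
    outcomes_all_members * components * periods_all_members
    + outcomes_hospitalized * components
)


def expected_false_positives(n_tests, alpha):
    """Expected number of tests significant by chance if every true effect is zero."""
    return n_tests * alpha


# Assumed thresholds; the summary does not report which one was used.
for alpha in (0.05, 0.10):
    print(n_comparisons, alpha, expected_false_positives(n_comparisons, alpha))
```

The expectation holds whether or not the individual tests are independent, but it says nothing about which particular comparisons are false positives, which is one reason a joint test across all comparisons is informative.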
The findings from the implementation analysis of the tracking data may be flawed by incomplete reporting by the care management staff on their activities. Care managers at both plans experienced some difficulties integrating tracking sheets into day-to-day activities, which indicates that future studies should consider other ways to track fidelity, such as via electronic health records. In the related Brand New Day orthogonal design study that we conducted (reported elsewhere), the plan collected such information as part of the electronic health record.
Another major limitation was that for five of the components, the care actually delivered by care coordinators assigned to the enhanced care option did not differ meaningfully from the care delivered by those assigned to routine care. Thus, it is not possible to determine whether the intended enhancements of these components would be more effective than routine care. For two of the components, at least part of the reason for failure to deliver the assigned intervention appeared to be a lack of explicit, unambiguous descriptions of how the enhanced care was to be delivered. For example, conversations with care managers revealed that there was confusion about how often to review a care plan (Component 6) and what is involved in a plan review, which is an important finding for the participating plans. The multiplicity of routine care practices and a lack of understanding of what is involved in these practices illustrate that sharp differences between studied options can be difficult to specify and explain to care coordinators.
For the other three components for which the enhanced care option was not implemented in a manner that distinguished it sufficiently from the routine care option, the problem was either that the enhancement was not implemented consistently or fully, or that routine care was more intensive when delivered than specified by the participating plans. However, while this situation makes it impossible to evaluate the effectiveness of the planned enhancement, it should be viewed less as a limitation than as an important finding that can inform the plans of needs to standardize routine care practices, and an opportunity to learn why planned enhancements were not enacted. The analyses in this report took an "intent-to-treat" approach in which component effects are computed by comparing outcomes of those assigned to the two options, regardless of whether or how thoroughly the options were actually delivered. Standard supervisory measures at the two plans were continued throughout the study, so that the components were tested in a "real-world" environment with the currently available resources, rather than in a strictly controlled setting. Follow-up discussions with care coordinators revealed several reasons for the lack of full implementation of the enhanced variants, such as high caseloads, difficulty tracking which components they had already provided to a given patient, and multiple organizational changes occurring during the study period that were unrelated to the study.
Implications for Policy and Practice
The study illustrates the potential of orthogonal design for improving the effectiveness and efficiency of care management programs if enough observational units, such as care managers, are available. Orthogonal design combines the rigor of experimental design with the ability to produce rapid results on the effectiveness of several components in a single experiment. It accommodates planned testing of alternative approaches to multi-component interventions and permits practitioners and researchers to tailor interventions to the target population and test enhancements to routine care. Given that orthogonal design tests combinations of routine and enhanced care, there is no traditional control group; all members receive each component of care (e.g., screening), but some receive it in a different style or intensity than had previously been used. Further, orthogonal studies are attractive because all of the care managers who implement the interventions are engaged in testing new variations: each care manager implements some enhanced care and some routine care options. Care manager engagement is greatly enhanced if they are included in the development of the enhancements to be tested; this should always be a feature of orthogonal design studies of care management.
An important benefit of an orthogonal design study, as we have seen from the reaction of the participating plans, is enhanced clarity of expectations about how interventions are to be provided. Rather than implementing a broad model of care, care managers are told precisely how they are expected to implement each of the components of care management being tested. When routine care is not well defined or the way routine care is implemented differs across care managers, as is often the case, this structure itself can help standardize the care management intervention, leading to less variation in implementation across managers. Further, fidelity analysis allows participating plans to assess the degree to which components were carried out as specified, which can help the plan identify the areas of care management to focus on in their quality improvement efforts. While efforts to standardize care management interventions can be done without orthogonal design studies, conducting such a study forces plans to re-examine their processes of care and protocols, and can uncover unknown areas of confusion or misinterpretation concerning routine care and operations. The orthogonal design approach also encourages organizations to create a culture of learning by providing participants with a rigorous approach for testing out their new ideas.
This study also identifies some important difficulties with conducting orthogonal design studies in health care organizations. The variations in care coordination delivery studied here are likely to generate only moderate-size effects on hospitalizations or ER use--that is, they are not strikingly different ways of delivering care coordination, but rather relatively minor twists. Furthermore, some of the interventions can only affect subsets of the enrollees (e.g., those with depression, those with a hospital admission), so the expected effect calculated over all enrollees is attenuated. To have adequate statistical power to detect such modest expected effects, a sizeable number of care coordinator units is needed because the variance of these outcomes across care coordinators is large. Without adequate power, statistically insignificant differences in outcomes between enhanced and routine versions of a care component cannot be taken as valid evidence that the routine (and typically less expensive) version of the intervention is just as effective as the enhanced version. Although the number of care coordinators (24) participating in this study exceeds the number used in some studies in other fields, it was not sufficient for this study due to the large variation in hospitalization rates and other key outcomes across coordinators.
The study also identifies how hard it can be to change the behavior of even dedicated health professionals. For each of the components, both the enhanced and routine care groups received the assigned component less often than specified in the study. Very few members received at least the minimum number of services (for example, contacts, post-discharge visits, screenings) as specified in the study protocol, and an even smaller percentage of members assigned to enhanced care received services at least as often as assigned. Even though this finding may have been due in part to under-reporting of services provided, the gap is so large that it seems likely that many patients did not receive the full complement of intended services, reflecting various barriers. The qualitative investigation of barriers to implementation is just as important for learning as estimation of the effects of the various enhancements.
The Full Report is also available from the DALTCP website (http://aspe.hhs.gov/office_specific/daltcp.cfm) or directly at http://aspe.hhs.gov/daltcp/reports/2014/OrthoV1.shtml.