1. Financial incentives are primary drivers of the adoption of new clinical practices, whether or not these practices are supported by the CER evidence. CER results that threaten the financial interests of a stakeholder will be challenged at all phases of the translation process.
The most fundamental determinant of successful CER translation is the extent to which the economics of adopting a new clinical practice are favorable to providers and patients. Not surprisingly, the marginal profitability of a procedure smooths the way for implementation of a supportive CER finding and acts as a strong deterrent to the implementation of one that suggests that a more expensive therapy may not be superior to a less expensive one. Our case studies of the comparative effectiveness of interventional and noninterventional procedures highlight the perverse consequences of fee-for-service reimbursement in the face of CER evidence showing little or no marginal benefit. Rates of reimbursement for PCI, CRT-D, and spinal stenosis surgery are orders of magnitude higher than those for alternative treatments, and many discussants noted that once patients were referred to interventional specialists, even if only for consultation, there was a high likelihood that they would receive an invasive procedure. In the words of one HF physician, interventionists are “mainly looking for contraindications [at that point].”
CATIE is a notable exception, in that prescribers typically have little financial stake in the choice of antipsychotics, so change in the implementation phase could not have been driven by financial considerations. But the lack of practice change following COURAGE and SPORT, despite seemingly equivalent effectiveness of treatments and the modest benefit of surgery, respectively, illustrates the tendency to prescribe a more costly treatment when reimbursement policy is not well aligned with the CER evidence base. The high rates of inappropriate use of CRT-D and complex spinal surgery after the SPORT results were published reinforce this general conclusion.
While the role of financial incentives in the utilization of expensive therapies is fairly well known and may by itself explain the failure of implementation of a CER-based practice, our case studies highlight their role in influencing more than just the implementation phase. Financial incentives may have a greater influence on the adoption of new clinical practices than CER evidence in the following ways:
· Stakeholders with a financial interest in the outcome of a CER study may seek to influence the study design to increase the odds that its results will favor them, or they may initiate efforts to critique and thus undermine potentially unfavorable CER studies at the time participants are being enrolled. According to several physician discussants, the pharmaceutical and device industries, respectively, developed campaigns to raise doubts about the design of the CATIE and SPORT trials even before their findings were known. Critiques of CER study design by interested stakeholders may peak at the time results are released to maximize the likelihood that the studies will be viewed as methodologically weak. Many physician discussants believe these messages are not likely to be a critical factor in the way the average practitioner interprets the studies, but their impact on the formalization of CER evidence through guidelines, quality measures, and clinical decision support may be important, and this may have downstream effects on local implementation.
· The interpretation of CER results through a dynamic scientific debate among stakeholders appears to be influenced by financial incentives of the participants. In the case of CATIE, industry sponsorship of key opinion leaders and aggressive detailing practices reflected a high degree of pharmaceutical-industry effort to influence the interpretation of the results in a manner that was favorable to its financial interests. Subspecialty societies with and without potential financial gains publicly disagreed about the interpretation of COURAGE and SPORT results.
· The formalization of guidelines and quality measures based on CER evidence may be influenced in subtle ways by financing. Historically, most guideline and quality measure development activities have been supported by professional societies, industry, or some combination of the two. Conflicting subspecialty guidelines are a well-known problem, although this may begin to change in the wake of the recent IOM recommendations for improving the consistency and credibility of clinical-practice guidelines (Institute of Medicine, 2011). Professionals have few financial incentives to facilitate the development of performance measures unless they will be paid based on the measure results. Discussants noted that guideline recommendations for spinal surgery tracked closely with the mission of the sponsoring professional society and were likely to conflict with one another. Moreover, some guidelines may not change in a timely way in response to CER evidence. For example, the APA took the position that formulary restrictions and other policies of payers should not be altered by CATIE results, instead choosing to defend prescriber autonomy. The prescribing guidelines were eventually modified, but other formalization activities such as the modification of quality measures were slow or nonexistent.
· Advocacy is often directed against payers in public forums and may have the effect of limiting their willingness to formalize CER evidence into coverage policies or utilization review policies. All of our payer discussants indicated that their organizations are averse to restricting coverage for treatments that are well established. Rather, they selectively pursue areas in which practice variation is extensive, where evidence clearly does not support the practice, and where risks to patients are unambiguous. Two examples of areas targeted by payers are the use of antipsychotics in pediatric populations and lumbar fusion surgery. Payers appear to be less willing to pursue prior authorization policies to avoid facing protracted and costly challenges by professional societies. In addition, payers who have narrow service areas may be more concerned with preserving good public relations, and public payers and publicly funded organizations like the VA may face additional political pressures to preserve generous coverage for their beneficiaries. When payers do implement prior authorization policies, they may rely on industry-standard medical-necessity criteria, which tend to be rather permissive.
· The dissemination of new practices based on CER results is an expensive undertaking. Detailing and advertising of new procedures and treatments are in the interest of the industry involved, but counterdetailing is rarely supported unless payers or the public choose to back it. Indeed, the lack of financing for dissemination activities to support CER-based practices may be an important impediment to practice change. Provider demand for clinical decision support tools and patient decision aids may drive vendors’ efforts to develop these tools, but with the exception of meaningful-use standards, there is little financial incentive for physicians to use them.
Despite the seemingly powerful influence of financial incentives favoring both the status quo and an accelerating panoply of new procedures, recent trends, including the emergence of new models of provider organization and payment, provide some basis for optimism that CER evidence can be more influential in the future. Physicians continue to abandon private, independent practice in favor of health-system affiliation and increasing reliance on salary-based payment. This trend may dampen their exposure to perverse financial incentives associated with fee-for-service payment. Physicians working within a larger organizational context may be more likely to use performance measurement and feedback, patient decision aids, clinical decision support tools, and registries, all of which have the potential to increase responsiveness to CER. Discussants commented that such tools had the potential to shed light on inappropriate utilization or other quality concerns that might prompt involvement by payers or policymakers. Value-based and episode-based payment approaches may also encourage more cost-effective care, particularly if these models encourage quality improvement.
Payers indicated to us that they are actively engaged in horizon-scanning for CER evidence that may form the basis of policies before practices become widespread in the community. While our case studies focused on well-established practices, newer CER may focus on less-widespread practices and may involve less-controversial treatment comparisons. Activities that curtail the influence of financial interests in each of the phases of CER translation (such as the recent IOM report calling for greater transparency and integrity of the guideline development process) may reduce the countervailing forces that work to undermine or neutralize even the best CER evidence.
2. Even the best CER studies may fail to produce an unambiguous “winner,” so it may be difficult to achieve a consensus interpretation of the results.
The promise of CER lies in its potential to increase the use of the safest and most effective treatments and decrease the use of ineffective treatments or those for which an equivalent but less expensive treatment exists. CER studies that produce clear “winners” (showing unambiguously that one treatment is better than another or that two treatments have essentially equivalent effectiveness) should be more likely to change practice because their results are difficult to challenge. For example, the Women’s Health Initiative demonstrated conclusively that hormone replacement therapy increased the risk of clotting and MI. However, our case studies suggest that even among the best-designed and -conducted CER studies, ambiguous results are likely to be common. Moreover, persuading stakeholders about treatment equivalence may be much more difficult than persuading them of the superiority of one treatment over another, because equivalence is generally defined across multiple outcomes, while claims of superiority are often based on a statistically significant difference on a single outcome.
SPORT is a typical example of a CER study that was plagued by methodological concerns and failed to produce a clear winner. Given ambiguous results, professional societies interpreted the results differently, leading to contradictory guidelines. CATIE and COURAGE, two well-executed CER studies, found comparable effectiveness of alternative treatments, yet their impact on practice has been muted because of limitations in their designs. Debate about the meaning of the trials and their applicability beyond the included populations has continued for years. The CPOE trial produced unambiguous improvements in safety, but the generalizability of the results was questioned. COMPANION (as well as the subsequent CARE-HF trial) demonstrated a relatively unambiguous significant survival benefit of CRT, but an unclear benefit of CRT-D vis-à-vis CRT due to a positive finding on a single secondary outcome.
Many factors increase the risk of an ambiguous result from a CER study. Comparison of two somewhat effective treatments may result in smaller differences in expected effects than would result from a placebo-controlled study. The use of active comparison groups, more-inclusive patient populations, and real-world practice settings may level the playing field, introduce greater statistical noise, or both. Stakeholders may differentially weight the importance of study end points (e.g., clinical effectiveness, side effects, safety), contributing to different interpretations of the results. Meanwhile, stakeholders may differ in their equipoise for recommending treatments and therefore may view the same effect size differently. All of these factors complicate the interpretation of findings. In the CATIE case study, some stakeholders focused on the equivalent effectiveness outcomes between conventional and atypical antipsychotics to argue for step therapy, while others cited heterogeneity in benefits and harms of individual drugs to argue for maintaining open access to all antipsychotics. The nonsurgical spine community (and several spine-surgeon discussants) found the benefits of spinal stenosis surgery to be marginal, while most surgeons found that the results confirmed prior, smaller studies demonstrating the superiority of surgery.
CER studies that produce ambiguous results open the door to selective interpretation, which tends to undermine consensus that could facilitate guideline updates by professional societies or the formation of coverage policies by payers. In the case of CATIE, SAMHSA, the entity responsible for translating mental health research into practice, pursued a neutral policy of “greater patient engagement” because it viewed the CATIE results as “mixed.” While the American College of Cardiology typically drafts consensus statements following key trials, it failed to do so following COURAGE. Several payers indicated that they avoid focusing on clinical areas that lack clear winners to ensure that providers have enough flexibility to use clinical judgment when making treatment decisions. Payers are more likely to focus on off-label uses (e.g., antipsychotics for children), practices out of the mainstream (e.g., dosing limits that are multiples of recommended limits), and procedures that display substantial variation in rates (e.g., spinal fusion surgery). In many states, payers are permitted by law to use cost-effectiveness data for coverage decisions as long as treatments can be considered equivalent, but operational definitions of equivalence are incomplete.
In cases where one treatment is unambiguously harmful, clinical practice has been known to change rapidly, and our findings confirm to some extent that general rule. Anecdotal reports suggest that the use of olanzapine, which had the most severe metabolic side effects (and greatest efficacy) among the medications assessed in CATIE, may have declined in the post-CATIE period, indicating that physicians may have perceived that the metabolic risks exceeded the benefits. The use of performance measurement, including mortality and complication rates, both within and outside of registries may also help provide data on treatment harms in the future and may influence providers’ decisions to recommend PCI, CRT, and spinal stenosis surgery or payers’ willingness to implement coverage policies. Thus, adverse event data from registries may help to identify more unambiguous losers over time.
While ambiguity may lead to incomplete use of CER results and limit the potentially attainable change in clinical practice, discussants emphasized that the lack of winners does not invariably mean that the CER fails to have an impact on clinical practice. Many indicated that identifying winners is not the goal of CER; the goal is to generate information to help physicians and patients arrive at satisfactory treatment decisions. Discussants indicated that CATIE, COURAGE, and SPORT reassured physicians that using moderate-dose conventional antipsychotics or less-aggressive therapies can confer equivalent or nearly equivalent benefits to patients. However, only an empirical study can confirm this hypothesis.
3. Cognitive biases play an important role in stakeholder interpretation of CER evidence and may be a formidable barrier in all phases of CER translation.
Our case studies suggest that at least three cognitive biases may influence the way physicians and other stakeholders interpret new CER evidence. First, confirmation bias, the tendency for a stakeholder to embrace evidence that confirms preconceived notions of treatment effectiveness and reject evidence to the contrary, may reinforce established practice patterns. Confirmation bias may be more prevalent in a CER context than in other types of clinical research, because treatments have often been in use for years and providers may be emotionally invested in them. In the years preceding CATIE, psychiatrists had been exposed to overwhelmingly positive research and messaging about atypical antipsychotics, which contributed to skepticism about CATIE’s unexpected findings. Both CATIE and COURAGE demonstrate how confirmation bias may have led stakeholders to dismiss CER results that challenged widely held treatment paradigms and, instead, to criticize the studies on methodological grounds. In our case studies, confirmation bias and financial incentives reinforced one another (with the possible exception of CATIE for prescribers), making it difficult to separate the magnitude of each effect. Stronger study designs (particularly emphasizing the generalizability of findings) and careful monitoring of study conduct (particularly to prevent crossover) may preempt these critiques and counteract the influence of confirmation bias.
A second cognitive bias is the belief that aggressive intervention (even for low marginal benefit) is better than inaction. This belief appears to be a potent driver of the use of PCI in patients with stable angina (the so-called “oculostenotic reflex”). It may be reinforced by perverse financial incentives and by providers’ perceived risk of malpractice liability if they decline to recommend the procedure. Payer discussants noted that physicians perceive that they “are not protected if there is an adverse outcome, even if you follow the evidence.” While patient preferences may also drive the use of more-aggressive treatment, providers may be less likely to attempt to convince a patient about a different course of treatment than the one favored by the patient because these situations may heighten their perceived exposure to a malpractice suit. More complete data on treatment harms or heterogeneity of benefits may promote greater equipoise among both physicians and patients. In addition, the use of utilization review with feedback to providers—particularly outlying physicians—can be effective in helping them recognize aberrant utilization. However, payers typically do not have all the elements within their administrative data systems to be able to measure appropriate utilization.
A third cognitive bias (at least in the United States) is the widespread tendency to perceive new technologies as superior to older technologies. Stakeholders who have financial interests in new technologies may also disseminate messages that reinforce this bias. For example, atypical antipsychotics were often marketed as “second-generation” medications, implying that they offered improvement over conventional antipsychotics. But, as one discussant noted, “We didn’t call beta-blockers ‘second-generation anti-hypertensives.’” Lengthy CER studies, such as COURAGE, are open to the criticism that technology has advanced, and this argument may carry weight with the average practitioner who observes the rapid change in medical technology first-hand. This is particularly important for the interpretation of CER studies involving devices. Adaptive study designs could provide the flexibility to allow the evolution of treatments through the course of a trial, but such designs were used in only one of the trials we studied.
Strategies for mitigating cognitive biases are available, but their effectiveness is not completely clear. Enhancing the transparency of stakeholder positions by using approaches that foster explicit formal decisionmaking processes could mitigate cognitive biases in a policymaking context. Disclosure of financial and intellectual conflicts of interest is another strategy used by the IOM and others, and regulating detailing and direct-to-consumer advertising may also be effective. These options are discussed in greater detail in Chapter Eight.
4. The questions posed by a CER study and its design may not adequately address the needs of end users or focus adequately on the decisionmaking opportunities with the greatest potential to influence clinical practice.
Our case studies suggest that multiple end users have potentially unrealistic expectations of CER studies. Even successfully designed and executed studies appear to disappoint some stakeholders. This occurs for a number of reasons. First, there is an unavoidable tension in the design of CER studies between the goals of producing treatment effectiveness data for selected populations (which would support personalized medicine) and producing general estimates of treatment effectiveness for a broadly inclusive population (which would support generalizable results). COURAGE and SPORT were designed to produce general population estimates (the second category of studies) and have been criticized for producing insufficiently detailed data to inform clinical treatment decisions for specific patients. While these two studies provide useful data on the average course of patients who pursued less-aggressive therapy, they provide limited data to enable the tailoring of treatment for individual patients. In SPORT, a sizable percentage of patients did not respond clinically to spine surgery, suggesting significant heterogeneity in benefits. Similarly, CRT may not have been disseminated widely because clinicians noted that a substantial population did not respond to it, but they lacked information that would allow them to predict which patients would respond. Assessing important causes of heterogeneity up front, if known, and conducting pre-planned analyses may be one way to deal with this tension; however, not all CER studies will be powered to detect such effects. And while registries may provide greater power to develop prediction rules, they lack the randomization of the original CER studies.
CER studies may not be designed with a comprehensive or explicit understanding of the beliefs and concerns of clinical practitioners. For example, CATIE focused on the relative effectiveness of different types of antipsychotic medications, but it had neither adequate power nor sufficient follow-up time to evaluate differences in safety—particularly the incidence of tardive dyskinesia. In retrospect, it was apparent that side effects were a chief concern for many practitioners, so some might argue that the question posed (i.e., relative effectiveness) was less compelling than questions about potential side effects and personalization of therapy. In this light, the trial design may not have been optimized to inform clinical practice. Similarly, hospital executives, who control decisionmaking regarding delivery interventions like CPOE, did not have relevant information on the applicability of the CER results to their particular settings.
Finally, many CER studies focus on a single step in a clinical decision algorithm. This is scientifically necessary, to provide clarity about the specific question the CER study is designed to address. However, the matching of scientific questions to high-leverage clinical decision points is critically important if the goal is to optimize the quality and costs of clinical practice. Head-to-head comparisons of treatments may help providers select appropriate treatments in the latter stages of clinical decision algorithms, and this is clearly important. However, upstream decisions about whether to pursue diagnostic tests or procedures may have a greater impact on patient outcomes and the overall value of care, especially if the downstream treatment decisions involve equivalent effectiveness of alternative treatments. Our discussants noted that the design of COURAGE focused on a downstream decision point at which interventionists were not at equipoise. CER that develops data for clinical-prediction rules and diagnostic algorithms may overcome this problem. For example, some current trials are focusing on the role of stress-testing to better guide the decision about whether to refer a patient to angiography. Similarly, diagnostic imaging has emerged as a key step in pathways leading to both PCI and spinal stenosis surgery. If providers such as primary-care physicians are responsible for upstream decisions to refer, and if both providers and patients have weaker incentives than interventionists have to choose an intervention, their decisionmaking may be more readily influenced by evidence.
5. Clinical decision support and patient decision aids can help to align clinical practice with CER evidence, but they are not widely used.
The use of clinical decision support in many areas of healthcare is limited, and the use of patient decision aids is suboptimal. Our SPORT, COURAGE, and COMPANION case studies confirm this. Furthermore, perverse financial incentives and lack of accountability for implementation have limited the production and dissemination of both of these tools. Limited adoption of CPOE is a prime example of the larger problem of inadequate health-information infrastructure that can support more consistent and effective use of CER to inform clinical decisions. For example, the absence of EHR-based algorithms that would identify potentially eligible patients and alert physicians to consider referral for CRT is an important missed opportunity leading to underuse of that technology. In general, decision support tools to promote evidence-based diagnostic testing and appropriate referral to specialists are uncommon, although both may lead to the use of treatments that are better aligned with CER evidence. According to one of our discussants who specializes in CER, one of the purposes of decision support is to increase the confidence of primary-care physicians in managing patients who do not require referral to specialists.
The consequences of limited patient empowerment in medical decisionmaking are significant. Many patients are unaware of the relative benefits and harms of treatment, and in some cases they greatly overestimate benefits and underestimate harms. For example, patients undergoing PCI for stable angina often mistakenly assume that the procedure reduces their risk of future heart attacks, which is not supported by the COURAGE findings or any other body of evidence. Well-informed patients who make treatment decisions according to their preferences may serve as a counterweight to providers who lack equipoise, particularly in the case of PCI and spinal stenosis surgery. Moreover, patient satisfaction with care can be significantly enhanced when decision aids are used. Although numerous medical publishing companies are developing products tailored specifically for patients, and professional societies have developed patient-oriented websites, both appear to be only weakly integrated with clinical processes, and their effectiveness is unclear. In contrast, direct-to-consumer advertising has become more prevalent and may create or reinforce misconceptions about treatments. The incorporation of CER evidence into direct-to-consumer advertising is an area that requires further study.
Even if incentives were better aligned and providers were held accountable for their use, it is unclear how to best incorporate clinical decision support and patient decision aids into clinical practice. Both sets of tools must be seamlessly embedded in clinical workflows. Physician discussants noted that primary-care physicians operate under extreme time pressures and resource constraints and may have limited willingness or even opportunities to integrate these tools. While many payers are beginning to experiment with shared-decisionmaking initiatives, progress has been slow. Most tools are based on first-generation technologies that tend to overremind providers and disrupt workflow. Limited HIT infrastructure may continue to pose problems for the implementation of clinical decision support in the coming years unless CMS’s EHR Incentive program succeeds in transforming the HIT capacity nationwide. Finally, providers need significant training to develop competence in helping patients successfully use shared-decisionmaking tools.