An Environmental Scan of Pay for Performance in the Hospital Setting: Final Report. The Empirical Literature on Hospital P4P


As of June 2007, few peer-reviewed studies existed on the use of financial incentives and their impact on quality, patient experience, safety, or the efficient use of resources. Although more than 40 hospital-based P4P programs are operating in the U.S., little empirical evidence has emerged from these payment reform experiments to gauge whether hospital P4P meets programmatic goals or to understand how various design features affect engagement in the program, the likelihood of unintended consequences (such as reduced access to care for more difficult patients), or the distribution of payments to providers. Few P4P programs are undergoing formal evaluation, and evaluating real-world applications is challenging because they generally lack the comparison group required to assess the impact of the P4P intervention.

We reviewed the literature published between January 1996 and June 2007 and found only nine studies that address the impact of three separate hospital P4P programs in which formal evaluations have been occurring:

  1. The Hawaii Medical Service Association (HMSA) P4P program 
  2. The Blue Cross Blue Shield (BCBS) of Michigan Hospital Incentive Program
  3. The Premier Hospital Quality Incentive Demonstration (PHQID)

Of the eight studies examining changes in performance, each reported improvements over time in at least some of the hospital performance measures or condition-specific composites it included; however, it is difficult to disentangle the P4P effect from the effect of other quality improvement efforts that were occurring simultaneously. The strongest evidence to date on the impact of hospital P4P comes from the Lindenauer et al. (2007) study of the impact of PHQID relative to the Medicare Reporting Hospital Quality Data for Annual Payment Update (RHQDAPU) program. These studies, while showing a positive effect of P4P, reveal that the additional effects of P4P are somewhat modest relative to public reporting and other quality interventions that are occurring simultaneously. Improvements in hospital performance have been observed in response to feedback reports (Williams et al., 2005) and to public reporting with a financial incentive for submitting data (Grossbart, 2006; Lindenauer et al., 2007). One study found improvements in a few performance areas associated with P4P as compared with control hospitals participating in voluntary quality improvement activities (Glickman et al., 2007). It has been argued, however, that to accomplish sustained quality improvement, interventions should be multifaceted and focus on different levels of the health care system (Grol et al., 2002; Grol and Grimshaw, 2003). This suggests that, to be most effective, P4P should be partnered with other activities, such as public reporting and internal quality improvement activities, that also encourage quality improvement for the same clinical area.

There is less evidence of the effect of P4P on patient outcomes. One uncontrolled study (Berthiaume et al., 2006) found reduced complication rates for obstetrical and surgical patients, though it did not report whether those improvements were statistically significant. Glickman et al. (2007) did not find significant differences in inpatient mortality improvement for AMI between PHQID hospitals and control hospitals exposed to an AMI quality improvement intervention. None of the studies evaluating PHQID separately analyzed the other patient outcome measures included in the program (for coronary bypass surgery and hip and knee replacement surgery), so it is not clear whether improvements occurred in these measures.

Most of the published studies have significant methodological limitations. Six of the nine had no controls, which are critical for providing evidence of a link between P4P and performance improvements. Controls are particularly important given the documented temporal trend toward improving performance on many hospital quality metrics. Another important issue is whether the experience of these smaller-scale incentive programs, with the exception of the PHQID, can be generalized to predict the effects of wholesale national implementation of a hospital P4P program by Medicare.

