An Environmental Scan of Pay for Performance in the Hospital Setting: Final Report. Measures


  • Measure Set Determination. We identified two general approaches used by sponsors to determine the measure set for their hospital P4P programs. The first is a standardized, “one-size-fits-all” approach in which the measures applied to hospitals in the program do not vary. The second approach involves customization in one of two ways: (1) each hospital, in consultation with the program sponsor, selects from a structured, pre-determined menu of measures a subset on which to be measured (i.e., measures are from a pre-determined menu), or (2) each hospital works with the program sponsor to create a customized set of measures from the universe of measures that exist (i.e., measures are not from a pre-determined menu). Regardless of how the measure set was determined, many programs used all-payer data to construct the measures, primarily to ensure adequate amounts of data to score hospitals (i.e., to avoid the small-numbers problem).
  • Common Measure Types.
  • Clinical Quality. Consistent with their key goal of improving clinical quality, all sponsors included clinical process and/or outcome measures as part of their hospital P4P programs (23/23). Process-of-care measures were much more commonly included (22/23) than outcomes were (3/23). The reasons cited for the focus on process measures included the availability of measures and performance scores collected and reported by national organizations such as the Joint Commission and CMS, and concerns about the adequacy of risk adjustment for outcome measures. There is substantial overlap between the measures included by the Joint Commission, CMS, and HQA (as shown in Appendix C, which lists existing hospital measures and their sources). The most frequently used process measure sets were:
    • The Joint Commission’s “core” measures (10/23)
    • The CMS’ P4R (RHQDAPU) ten starter-set measures (7/23)
    • The HQA-approved measures (that have since been incorporated into the RHQDAPU program) (5/23), and
    • The Surgical Care Improvement Project (SCIP) measures (3/23).

  The most frequently tracked outcome measures were:

    • Complications of care (e.g., Healthcare Cost and Utilization Project measures concerning pneumonia after major surgery) (3/23)
    • Mortality (3/23).


    • Patient Safety. Another important area of measurement used by a large number of program sponsors (16/23) was patient safety. Among the most commonly used measures were: 
      • 3 Leapfrog Leaps:
        • CPOE (12/23)
        • Use of Intensivists (9/23)
        • Evidence-based Referral based on Volume (6/23)
      • National Quality Forum (NQF) Safe Practices (4th Leapfrog Leap) (7/23)
      • Safe Medication Practices (6/23). 
        • Efficiency or Resource Use. Approximately half of the program sponsors included measures of efficiency or resource use in their P4P programs (11/23). A challenge cited in this area was identifying reliable and valid measures, given that their development has lagged that of clinical measures. Resource use measures most frequently included were:
          • Readmission rates (5/23)
          • Average length of stay (4/23). 
          Other resource use measures used by sponsors included unit cost, avoidable days, and admissions per 1,000 members.

        • Patient Experience. Measures of patient experience were used by many sponsors in their P4P programs (9/23), often in the form of “homegrown” metrics (6/23). Many said that, going forward, they anticipated using the emerging national standard, H-CAHPS, which was undergoing approval by the NQF and which they expected CMS to require under the RHQDAPU program.


        • Structure. Some sponsors were also focusing on the structural components of hospitals (9/23). Typically, these measures center on use of an electronic health record (EHR) or other IT implementation beyond the use of CPOE (5/23). A notable exception was one sponsor’s inclusion in its P4P program of whether hospitals used rapid response teams. 


        • Quality Improvement. Some sponsors (8/23) included metrics related to hospital quality improvement activities, consistent with their desire to improve the quality of care delivered to their members. More specifically, some took into account hospital participation in the following quality improvement efforts:
          • Regional quality improvement initiatives (3/23)
          • National registries/databases (3/23)—for example, the registries managed by the ACC and the Society of Thoracic Surgeons
          • Internal quality improvement initiatives (2/23)
          • Institute for Healthcare Improvement’s (IHI’s) 100,000 Lives Campaign (2/23) 
          • AHA’s “Get with the Guidelines” program (coronary artery disease, stroke) (2/23).


        • Administrative. Only a small number of the sponsors with whom we spoke included administrative performance measures (5/23). When used, these primarily focused on metrics having to do with claims submissions, such as:
          • Number of claims re-submitted (2/23)
          • Electronic claims submitted (2/23). 
        • Measurement Selection Criteria. Sponsors consistently said that one of the most important criteria they use in selecting measures for their hospital P4P programs is consistency with other reporting activities (17/23), the objective being to minimize hospital reporting burdens (15/23). They said that coordinating with other efforts, such as Joint Commission core measures and CMS RHQDAPU measures, makes it easier to launch and maintain their own programs. Doing so was considered essential for avoiding a cacophony of measures and for setting a collaborative, rather than combative, tone with hospitals. Although many of the sponsors valued the ability to use existing measures reported to CMS and the Joint Commission, they said that the current set of measures was too narrow in scope and needed to be expanded to measure hospital performance more comprehensively. Additionally, the sponsors indicated that performance has “topped out” on many of the measures (e.g., care for AMI), making them less useful for quality improvement or for distinguishing differences between hospitals. Evidence-based measures (13/23) and/or endorsement by known organizations (such as NQF, Joint Commission, or HQA) (12/23) were also cited as key factors used in selecting measures. Relying on such measures not only assists with consistency across programs, but also reduces “pushback” from hospitals, especially in the case of measures that have been endorsed by HQA. Lastly, the practical points of ease of data collection (12/23) and data availability (12/23) were also important considerations in measurement selection.


        • Risk Adjustment. Many sponsors risk-adjust some of the measures in their program (15/23), generally outcomes of care, complications, and/or cost/efficiency measures. All sponsors noted that they use the risk adjustment methods recommended by the organization that developed the measure.


        • Composites. Many sponsors used composite measures, which summarize performance across multiple individual measures, rather than reporting individual metrics (17/23). Composites are typically used for payout (10/23) or in report cards to facilitate consumer understanding (8/23). Composites were frequently produced at the condition level, such as AMI or CHF. They can take a variety of forms, ranging from an average of performance on the individual measures weighted by the size of the denominators, to an assessment of whether each patient received all of the measured care for which they were eligible (referred to as the appropriate care composite). Because few hospitals provide the right care 100% of the time to patients with any given condition, the use of an appropriate care composite typically results in a performance score that is lower than the scores for the individual measures. Shifting the performance measure to achievement of all recommended care can reduce the extent to which hospital scores “top out,” which may have occurred for the individual measures that make up the composite.
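The two composite forms described above can be illustrated with a minimal sketch. It is not taken from any sponsor's program: the function names, the data layout, the measure names (`aspirin_at_arrival`, `beta_blocker_at_discharge`), and the patient records are all hypothetical, chosen only to show how the two calculations differ.

```python
from dataclasses import dataclass


@dataclass
class MeasureResult:
    numerator: int    # patients who received the measured care
    denominator: int  # patients eligible for the measure


def weighted_average_composite(measures):
    """Average of individual measure rates, weighted by denominator size.

    Weighting each rate by its denominator is arithmetically the same as
    dividing the summed numerators by the summed denominators.
    """
    total_eligible = sum(m.denominator for m in measures)
    if total_eligible == 0:
        raise ValueError("no eligible patients")
    return sum(m.numerator for m in measures) / total_eligible


def appropriate_care_composite(patients):
    """Share of patients who received ALL measured care they were eligible for.

    Each patient is a dict mapping a measure name to True (care received),
    False (care not received), or None (patient not eligible).
    """
    eligible_results = [
        [received for received in p.values() if received is not None]
        for p in patients
    ]
    # Drop patients who were not eligible for any measure.
    eligible_results = [r for r in eligible_results if r]
    if not eligible_results:
        raise ValueError("no eligible patients")
    return sum(all(r) for r in eligible_results) / len(eligible_results)


# Hypothetical AMI patients at one hospital (illustrative data only).
patients = [
    {"aspirin_at_arrival": True, "beta_blocker_at_discharge": True},
    {"aspirin_at_arrival": True, "beta_blocker_at_discharge": False},
    {"aspirin_at_arrival": True, "beta_blocker_at_discharge": None},
    {"aspirin_at_arrival": False, "beta_blocker_at_discharge": True},
]

# The same records summarized per measure: aspirin 3 of 4, beta blocker 2 of 3.
measures = [MeasureResult(3, 4), MeasureResult(2, 3)]

weighted = weighted_average_composite(measures)
appropriate = appropriate_care_composite(patients)
```

In this made-up example the individual measure rates are 75% and about 67%, the denominator-weighted composite is about 71%, and the appropriate care composite is 50%, which is consistent with the observation that the all-or-none form typically yields a lower score than the individual measures.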


        •  Piloting Measures. Sponsors expressed mixed thoughts about the need to pilot the measures being used in their P4P programs prior to payout. Some felt strongly that a trial run is “necessary to be fair,” especially if using newly created or not commonly used measures. Others, primarily those adopting measures used by the Joint Commission or CMS, thought that hospitals have had enough time to get used to both measurement and P4P and that, consequently, it was time to “just get on with it.”

