The environmental scan also uncovered a number of program implementation challenges that warrant consideration during program design and implementation.

The Small Numbers Problem:  A sizeable number of hospitals have only a small number of events or cases to report for one or more measures.  Scoring a small number of events yields unstable estimates of performance, which are a poor basis for determining performance-based incentive payments.  While this is a more acute problem for small and rural hospitals with a small number of patients per year, the problem also occurs in some medium- and large-size hospitals depending on their service mix, the details of measure specifications, and the use of sampling during data collection.  Using all-payer data, collecting and aggregating data over longer periods of time, using composite measures,1 and identifying measures relevant to smaller providers are approaches that can help to mitigate the small numbers problem and allow for the construction of more stable estimates of performance.
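
To illustrate why small denominators produce unstable estimates, the following is a minimal sketch rather than a method drawn from the environmental scan. It assumes a simple binomial model and a normal-approximation interval, which is itself rough at very small denominators. The 80 percent rate, the 10 cases per quarter, the five-measure composite, and all function names are hypothetical values chosen for illustration only.

```python
import math


def proportion_se(rate: float, n: int) -> float:
    """Standard error of an observed rate under a simple binomial model."""
    return math.sqrt(rate * (1.0 - rate) / n)


def ci_half_width(rate: float, n: int, z: float = 1.96) -> float:
    """Half-width of an approximate 95% normal-approximation interval."""
    return z * proportion_se(rate, n)


def composite_denominators(patients: int, measures: int) -> tuple[int, int]:
    """Scored events under two hypothetical composite constructions.

    Opportunity-style composite: every patient-measure pair is one event.
    Appropriate-care composite: each patient is scored once (all-or-none),
    so the denominator does not grow with the number of measures.
    Assumes every patient is eligible for every measure.
    """
    return patients * measures, patients


TRUE_RATE = 0.80          # hypothetical underlying performance rate
CASES_PER_QUARTER = 10    # hypothetical small-hospital volume on one measure

# Uncertainty from a single quarter versus eight quarters pooled together.
one_quarter = ci_half_width(TRUE_RATE, CASES_PER_QUARTER)
eight_quarters = ci_half_width(TRUE_RATE, CASES_PER_QUARTER * 8)

# Denominators for a hypothetical five-measure composite in one quarter.
opportunity_n, appropriate_care_n = composite_denominators(CASES_PER_QUARTER, 5)

print(f"one quarter:    +/- {one_quarter:.3f}")
print(f"eight quarters: +/- {eight_quarters:.3f}")
print(f"composite denominators: {opportunity_n} vs {appropriate_care_n}")
```

Under these assumptions, pooling eight quarters narrows the interval by a factor equal to the square root of eight. The composite comparison shows why construction matters: only the opportunity-style composite increases the number of scored events, and treating those events as independent likely overstates the gain in precision because events for the same patient tend to be correlated.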

The Burden of Data Collection:  The data collection burden, which affects how many measures a P4P program can reasonably require a hospital to collect and report, creates challenges for efforts to comprehensively assess the performance of hospitals given the wide range of care and services provided within hospitals.  The more comprehensive the measure set used, the greater the burden on hospitals in the near term, given that most of the data needed to construct performance measures is contained in paper medical records.  In most cases, hospital information systems are not yet equipped to capture and easily retrieve the clinical information used to create performance measures, nor are they structured to enable routine monitoring of quality of care.  Until health information systems are upgraded to capture this information, program sponsors may be constrained in the number and breadth of measures they can expect hospitals to collect and report. Once effective information systems are built and put into place, the number of measures included in a P4P program could be expanded. 

Ensuring the Validity of Data Used to Make Differential Payments:  P4P programs are also challenged with an acute need to ensure the integrity of the data used to score hospitals and make differential payments, which requires resources for data validation.  Allocating sufficient resources to validation work is critical for program credibility, and today only limited resources are being used for data validation within P4P programs.  Most hospitals stated that the current level of validation is insufficient, and the incentives to game the system will increase as the amount of money at risk in P4P programs increases.

  1. There are a variety of ways to construct composite measures, not all of which would help mitigate the small numbers problem.
  2. Public Law 108-173, December 8, 2003.
  3. An appropriate care measure is a composite measure that assesses what percentage of time a patient with a given clinical condition (e.g., AMI) received all of the recommended processes of care—in other words, how often a hospital provided “optimal” care for a patient with a given clinical condition.
  4. The journals searched were Managed Care, Hospitals and Health Networks, Modern Healthcare, Managed Health Care Executives, Healthcare Intelligence Network, Medical Economics, Managed Care Weekly, Modern Physician, Business Insurance, California Healthline, Managed Care Online, and Managed Care Magazine. The search terms used included pay for performance, pay for quality improvement, financial incentive, bonus, reward, hospital payment, performance improvement, and quality initiative.
  5. Any denominator less than 23 indicates that one or more of the organizations did not respond to the question. Non-responses were typically caused by limited time or a respondent’s inability to answer the question.
  6. Sponsors cited use of the AHA “Get with the Guidelines” database, the American College of Cardiology, and the Centers for Disease Control’s National Healthcare Safety Network (NHSN).
  7. As previously described, there are a variety of methods that can be used to construct composite measures, and not all of the methods would help mitigate the small numbers problem. For example, the appropriate care model does not create more denominator events to be scored.
  8. CAHs serve as a “proxy” for the likely experience of small hospitals. CAHs are not required to submit data under RHQDAPU, although some voluntarily do so. CAHs are not Subsection D hospitals and are excluded from the proposed Medicare VBP program, as outlined in the Deficit Reduction Act of 2005.

In summary, P4P programs have the potential to drive system improvements but their impact is likely influenced not only by their design but also by what other structures are in place to support P4P—such as enhanced information systems for quality monitoring and feedback, aligned payments across all providers, and transparency.  The success of these programs in meeting improvement goals likely will be affected by their design, how they are implemented, and whether sufficient resources are allocated to provide the necessary day-to-day support for program operations and ongoing modification of the program.

Hospitals understand that P4P is likely to be part of their future and generally seem supportive of the concept.  They face a number of challenges to their ability to successfully participate in these programs, including lack of physician engagement, inadequate information infrastructure that necessitates the manual collection of data from charts, and potentially conflicting signals from various organizations measuring hospital performance. These implementation challenges are important to consider carefully in the design of any hospital P4P program.

