Another important issue to consider in designing a performance measurement system is the consequences of meeting - or failing to meet - the established standards or performance targets. While the question has not been formally studied, it is reasonable to assume that the greater the dollar amount of the penalty or bonus, the greater the incentive or deterrent effect. Determining the optimal amount is a challenge. If a limited pool of bonus funds is divided among a large number of measures, all with significant weights, the incentive to perform well on any one measure is likely to be eroded. When a bonus is set at a fixed amount, regardless of the size of the state's basic grant, as was the case for the TANF bonus for reductions in out-of-wedlock births, it is likely to have more of an effect on states with smaller grants, for which the bonus could be quite large in relation to the grant amount. Because of this consideration, the High Performance Bonus awards were allocated to the top-performing states in amounts proportional to their TANF block grants. A third possibility is that a penalty is set so large that imposing it is not practically or politically viable, which undermines its credibility as a deterrent.
It is not clear, in fact, whether it is necessary to attach any financial consequences to an outcome-based performance measurement system. Some have argued that the honor of being singled out as a high performer - or particularly the stigma of being singled out as a poor performer - may be a powerful enough incentive on its own. For example, a substantial amount of attention is paid to the annual Kids Count Data Book, which reports state performance on a wide range of indicators of child well-being. There are also political consequences for states associated with being found penalty-liable or selected for a bonus, regardless of the dollar amount (Dickinson and West, 1988).
State feedback at our consultation suggested that the threat of being penalized was very salient, regardless of the amount of the penalty or whether it was ultimately possible to avoid the penalty through a corrective compliance process. In support of this argument, they noted that attempts to enforce financial penalties in the past have inevitably resulted in expensive and time-consuming administrative and judicial appeals, which have long delayed, if not negated, any actual transfer of funds. Penalties appear to have greater political consequences than bonuses, possibly because of the negative publicity and the great difficulty of finding state funds to replace those lost to the penalty. (Under TANF, states that are subject to a penalty must replace the withheld funds with "state-only" funds, which do not count toward satisfying the maintenance-of-effort requirement.)
There are some circumstances under which financial incentives may even be counterproductive. For example, financial incentives may result in increased "creaming" of participants, avoidance of innovative but unproven strategies, or even inaccurate data reporting. When stakeholders are reluctant to adopt outcome measures, collecting and reporting performance data without attaching financial consequences could relieve some of their concerns.
Across the states, legislatures have come down on both sides of this issue. In some states, budgeting has been linked to performance standards, so that high performing programs - and even individual offices - can receive additional money, while low performers are at risk of losing funding. In other states, there are no financial consequences attached to the performance standards, but the results are widely disseminated each year and used to provide feedback in order to improve program operations (Hatry, 1999; Horsch, 1996(a); Schilder, 1998; Yates, 1997). The data can help public managers and service providers make decisions and monitor progress toward specific goals. Coupled with program evaluation data, performance measures can potentially be used to assess service strategies, determine why results were achieved or not, and decide how programs need to be changed.
An additional factor must be considered when a new performance measurement system is adopted. As discussed above, when data for a new measure are first collected, states often have little ability to predict their performance in advance - either because the program is new and there is no past performance, or because the data collection requirement is new and there are no baseline data. This uncertainty about performance levels appears to have very different consequences depending on whether a bonus or a penalty is involved.
In the context of penalties, performance uncertainty appears to lead to highly risk-averse behavior. For example, in defining work activities in which welfare recipients could participate, many states initially restricted the permissible activities to those that could be counted toward the federal work participation rate. Now that a few years of data are available, many states have discovered that they are in no danger of being penalized and have expanded the range of activities they allow for participants. For instance, some states now permit TANF participants, when determined appropriate, to engage in educational activities not directly related to employment - including high school and equivalency programs, basic and remedial education, English as a Second Language, and post-secondary education - activities that counted toward the participation requirement under JOBS.
In the context of bonuses, uncertainty appears to lead to a "wait-and-see" attitude. Without a solid idea of either how much effort is needed to achieve a certain level of performance or the potential payoff (including the size of the bonus), some states may be unwilling to invest much effort or money in order to improve their ratings. For example, in many cases, the states that received bonuses in the first year of the High Performance Bonus were those that had made investments in work and work supports even before the interim performance criteria were announced. It would not be surprising to see other states - particularly those that were close to receiving bonuses - now begin to make or expand their investments in these areas.
One possible means of mitigating the negative consequences of this asymmetry would be to implement a new measurement system in phases, beginning first with bonuses for high performers and adding penalties only after several years of experience with the measures, when more information would be available to use in setting standards. This approach was recommended by a participant in the consultation in post-consultation correspondence.
2. In program evaluation literature, the impacts of a program are defined as the difference between the outcomes of a group that participated in the program and the average outcomes the same group would have achieved had it not participated. In a formal evaluation, this comparison is most reliably estimated by randomly assigning individuals either to an experimental group that participates in the program or to a control group that does not, and then comparing their outcomes. Because the experimental and control groups are randomly assigned, any differences in their outcomes can be attributed to the program being evaluated.
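Under the logic described in this footnote, random assignment makes the control group's average outcome a valid stand-in for the unobservable counterfactual, so the impact estimate can be written as a simple difference in group means (notation here is illustrative, not drawn from the source):

```latex
% Estimated program impact as a difference in average outcomes,
% where \bar{Y}_E is the mean outcome of the experimental (participant)
% group and \bar{Y}_C is the mean outcome of the control group.
\widehat{\text{Impact}} = \bar{Y}_{E} - \bar{Y}_{C}
```

Random assignment ensures that, in expectation, the two groups differ only in program participation, which is what licenses interpreting this difference as the program's causal effect.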
3. In FY 1999, 15 states or territories did not serve two-parent families under the TANF program. They either served two-parent families entirely through separate state programs, so that the TANF two-parent participation requirements did not apply, or did not serve two-parent families at all (HHS, 2000(a)).