Report on Alternative Outcome Measures: Temporary Assistance for Needy Families (TANF) Block Grant. Issues in Using Outcome-Based Performance Measures in Welfare-to-Work Programs


Outcome-based performance measures, particularly measures that use the employment and earnings of program participants to gauge program success, are increasingly common among welfare and workforce development programs. However, studies have highlighted a number of issues that need to be addressed when these types of measures are used (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994). These issues, each discussed in more detail below, include an inconsistent relationship between outcomes and program effectiveness, the need to ensure that measures are fair and equitable, the possibility of unintended consequences, and the problem of multiple program goals. Some of these issues stem from questions that research cannot yet answer, while others reflect a growing body of evidence about the inherent challenges of designing outcome-based performance systems for welfare-to-work programs.

Inconsistent relationship between outcomes and program effectiveness

One concern regarding the use of outcome-based performance measures to reflect program success is that commonly used measures, such as employment rates or earnings, often do not accurately capture the "added value" of the programs (Bartik, 1996; Barnow, 1999; U.S. Department of Health and Human Services, 1994).

Research on welfare caseload dynamics has documented the natural movement of welfare recipients on and off welfare (Bane and Ellwood, 1983; Pavetti, 1993). The findings show that a large proportion of welfare recipients exit welfare after relatively short periods of time, while a substantial minority remain on welfare for longer spells. Some of those who leave do so because they find employment; others leave for reasons related to marriage, remarriage, or other changes in their personal or economic circumstances; and still others leave for reasons that are not known. The studies also show that, in most cases, a large majority of those who leave welfare do so on their own, without either the benefit of an employment program or the requirement to participate in one. This movement off welfare and into employment represents what might be called a baseline or "natural" outcome, unrelated to the operations of a welfare-to-work program.

The role of an employment program is not necessarily to achieve high outcome rates but to add to the outcomes that would occur naturally. To be judged successful, a program must exceed, or "add value to," the natural outcome rates. A program could do so in several ways. It could move people into jobs and off welfare more quickly than they otherwise would have moved (for example, through job search activities), or it could help people get jobs that they would not otherwise have gotten, such as by providing job training or specialized services for clients facing difficult barriers (e.g., domestic violence, substance abuse). The strict enforcement of participation requirements may also cause some individuals to leave cash assistance rather than participate.

Evaluations of welfare-to-work programs have found that there is not a strong correlation between the "value added" by an employment program and the attainment of high outcomes on employment-related measures. The 1994 Report to Congress by the U.S. Department of Health and Human Services examined random assignment studies of the welfare-to-work programs that operated in the 1980s. These evaluations were specifically designed to measure the "added value," or impact, of programs targeted at welfare recipients in increasing the earnings and reducing the welfare dependency of those referred to an employment program (the program group) compared to an identical group of individuals (the control group) who did not receive program services. This review found that programs that performed well on specific outcome measures related to moving people into jobs and off welfare did not necessarily have greater success, in terms of program impacts, than those that did not perform well on the measures.

More recent data from the National Evaluation of Welfare-to-Work Strategies (NEWWS) (formerly the JOBS evaluation) confirm this finding (Freedman et al., 2000). This study included random assignment studies of welfare-to-work programs in seven sites; for simplicity, results from five sites are discussed here.(2)  As Table 1 shows, even though Columbus had the highest employment rate (50.2 percent), its "added value" (3.5 percentage points) was lower than that of the other sites in the evaluation. Moreover, the outcomes in Portland, which had a substantially higher "added value" (10.9 percentage points), were similar to those of several other sites. Thus, in this case, outcomes do not serve as a good "proxy" for added value, and an assessment of the relative effectiveness of the programs based solely on outcomes would have been mistaken.

Table 1.
Employment Outcomes and "Added Value" in NEWWS Sites

Site            Longitudinal      Percent Employed After Two Years     "Added Value"
                Participation     Program Group     Control Group
                Rate
Atlanta         73.8%             42.8%             38.5%               +4.4%
Columbus        52.1%             50.2%             46.7%               +3.5%
Grand Rapids    69.0%             47.2%             43.1%               +4.1%
Portland        61.1%             46.2%             35.3%               +10.9%
Riverside       43.8%             31.3%             27.1%               +4.2%

Table 1 also shows that participation rates, a process measure, are not good proxies for program impacts. The participation rates shown in Table 1 are longitudinal measures, which report the proportion of individuals who participated in the program at least one day within a two-year period, and are calculated differently than the monthly participation rates required under JOBS and TANF(3)  (which report the number of individuals who participate a certain number of hours per week each month). These results show that the program with the highest participation rate, Atlanta (73.8 percent), had a very similar added value to the site with the lowest participation rate, Riverside (43.8 percent).

Table 2 shows the outcomes and "added value" for a different measure, earnings over a two-year period, with similar results. The table shows that outcomes can sometimes be correlated with the level of added value: the Portland program group achieved both a high level of earnings ($7,133) and the highest level of added value ($1,842). However, the relationship is not consistent. Columbus had comparable (in fact somewhat higher) earnings ($7,569), but its added value was dramatically lower ($677). In addition, the site with the lowest earnings, Riverside ($5,488), had the second highest added value ($1,276). As with the findings on employment rates, longitudinal participation rates are not correlated with the added value of programs on earnings measures.

Table 2.
Earnings Outcomes and "Added Value" in NEWWS Sites

Site            Longitudinal      Average Total Earnings Over Two Years    "Added Value"
                Participation     Program Group     Control Group
                Rate
Atlanta         73.8%             $5,820            $5,006                  $813
Columbus        52.1%             $7,569            $6,892                  $677
Grand Rapids    69.0%             $5,674            $4,639                  $1,035
Portland        61.1%             $7,133            $5,291                  $1,842
Riverside       43.8%             $5,488            $4,213                  $1,276
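The "added value" figures in both tables are simply the program-group outcome minus the control-group outcome. A minimal sketch (data transcribed from Tables 1 and 2; recomputed impacts may differ from the published figures by a point or a dollar due to rounding) illustrates why high outcomes are a poor proxy for high impacts:

```python
# Added value (impact) = program-group outcome minus control-group outcome.
# Data transcribed from Tables 1 and 2 (NEWWS; Freedman et al., 2000).
sites = {
    # site: (program % employed, control % employed,
    #        program earnings, control earnings)
    "Atlanta":      (42.8, 38.5, 5820, 5006),
    "Columbus":     (50.2, 46.7, 7569, 6892),
    "Grand Rapids": (47.2, 43.1, 5674, 4639),
    "Portland":     (46.2, 35.3, 7133, 5291),
    "Riverside":    (31.3, 27.1, 5488, 4213),
}

for name, (emp_p, emp_c, earn_p, earn_c) in sites.items():
    print(f"{name:12s}  employment outcome {emp_p:5.1f}%  "
          f"added value {emp_p - emp_c:+5.1f} pts  "
          f"earnings impact ${earn_p - earn_c:,}")

# The site with the highest outcome and the site with the highest
# impact need not be the same one.
best_outcome = max(sites, key=lambda s: sites[s][0])              # Columbus
best_impact = max(sites, key=lambda s: sites[s][0] - sites[s][1])  # Portland
print(best_outcome, best_impact)
```

Ranking these five sites by raw employment outcomes would put Columbus first, yet Portland's program produced roughly three times Columbus's impact.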

Research on workforce development programs has found similar results. Barnow (1999) found a weak correspondence between program impacts and measured performance in the Job Training Partnership Act (JTPA) program. Examining the 16 sites in the National JTPA Evaluation (evaluated using a random assignment design), this study found that the relationship between program performance on employment-related measures and program impact was positive but statistically insignificant.

This evidence suggests that a system of performance measurement focused on outcomes may not necessarily lead programs to increase their added value. Rather, it could reward the substantial amount of normal employment activity by welfare recipients rather than the program's added value: programs with higher (or lower) overall outcome rates may simply reflect higher (or lower) natural outcomes. However, controlled evaluations, which are the best way to measure program impacts, are generally too expensive and time-consuming to rely on for ongoing feedback and monitoring of programs.

Developing fair and equitable measures

As discussed above, there is a natural rate at which welfare recipients find jobs with no assistance from employment programs. Studies have found that this natural rate reflects the influence of several factors over which state and local managers have little control, including the state's economic conditions and the demographics of the welfare caseload (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994). An important dimension of performance measurement systems is holding states accountable for performance that is within their control, not for factors for which they can be expected to have little or no responsibility.

Different state and local welfare-to-work programs operate within significantly different labor markets and under diverse, highly variable economic conditions. These conditions may have a significant effect on the outcomes produced by the state or local program. For example, because fewer jobs are available, a program operating in a depressed economy may place fewer recipients in jobs than one functioning in a booming labor market. In this case, the economy, not the effectiveness of the employment program, may be the key factor explaining the difference between the states' results.

States and localities also have different and changing welfare caseloads in terms of overall size, demographics, and other local factors. For example, some states have a relatively high proportion of very disadvantaged recipients (e.g., those lacking educational credentials or employment histories) on their cash assistance caseload. Because it is more difficult to employ this group, a state in this situation could appear less successful, based on a job-related outcome measure, than a state that served a more job-ready population. In this case, a state's performance would, at least in part, be driven by the composition of the cash assistance caseload.

Other important factors that affect the outcomes of a state's employment program are its cash assistance benefit levels and income disregard policies. Cash assistance benefit levels for a family of three with no income range from $120 per month in Mississippi to $923 per month in Alaska (Gallagher et al., 1997). Earnings disregards, the amount of income disregarded when calculating the benefit level, also vary, from small, decreasing disregards in some states to policies that disregard all income as long as the family is below the poverty line. As a result, some states will perform better on certain outcome measures, such as the number of individuals who leave cash assistance due to employment, because of their grant level and earnings disregard policies rather than because of their program's performance. While states could change their benefit levels and earnings disregards so that they fared better on certain types of performance measures, that is not the goal of a performance measurement system (see the discussion of unintended consequences below). Instead, the goal is to develop measures that treat states equitably regardless of their benefit and earnings disregard levels.
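The mechanics can be illustrated with a stylized benefit formula. The parameter values below are hypothetical (only the $120 and $923 maximum benefits come from the text above), and no state's actual rules are this simple; the sketch only shows how the same job can end eligibility in one state but not another:

```python
# Stylized benefit calculation (hypothetical parameters, not any state's
# actual formula): benefit = maximum grant minus countable earnings.
def monthly_grant(max_benefit, earnings, flat_disregard, percent_disregard):
    """Countable earnings = (earnings - flat disregard) reduced by the
    percentage disregard; the grant never falls below zero."""
    countable = max(0.0, (earnings - flat_disregard) * (1 - percent_disregard))
    return max(0.0, max_benefit - countable)

earnings = 700  # hypothetical part-time monthly earnings

# Low-benefit state with a small flat disregard: this job ends eligibility,
# so the family counts as "leaving cash assistance due to employment."
low = monthly_grant(max_benefit=120, earnings=earnings,
                    flat_disregard=90, percent_disregard=0.0)

# High-benefit state with a generous disregard: the same job leaves the
# family on assistance with a reduced grant.
high = monthly_grant(max_benefit=923, earnings=earnings,
                     flat_disregard=200, percent_disregard=0.5)

print(low, high)  # 0.0 673.0
```

Identical families with identical jobs thus produce opposite results on an exits-due-to-employment measure, purely because of policy differences.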

These findings suggest that it is important to recognize the role that uncontrollable factors can have on performance measures and to develop mechanisms that ascribe differences in outcomes to the right factors. This process of ensuring that standards are fair and equitable across states is known as "leveling the playing field." Later sections of this paper discuss some of the mechanisms that have been developed to address this issue, including regression-adjusting measures to reflect differences in economic conditions and welfare caseloads, standards of performance that are negotiated to reflect local conditions, and using measures of improvement rather than absolute levels of performance.
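One such mechanism, regression adjustment, can be sketched with entirely hypothetical data: fit a simple regression of outcomes on an uncontrollable factor (here, the local unemployment rate), then judge each state by its residual, actual minus expected, rather than by its raw outcome:

```python
# Regression-adjusted performance sketch (hypothetical data): states with
# weak labor markets are not penalized for their low raw outcomes.
states = {  # state: (unemployment rate %, observed job-entry rate %)
    "A": (9.0, 30.0),
    "B": (4.0, 44.0),
    "C": (6.5, 40.0),
    "D": (5.0, 38.0),
}

# Ordinary least squares for one predictor, computed from the usual
# closed-form slope/intercept formulas.
xs = [u for u, _ in states.values()]
ys = [r for _, r in states.values()]
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
intercept = my - slope * mx

for name, (u, rate) in states.items():
    expected = intercept + slope * u          # outcome predicted from economy
    print(f"State {name}: outcome {rate:.1f}%, expected {expected:.1f}%, "
          f"adjusted score {rate - expected:+.1f}")
```

With these numbers, state B has the highest raw job-entry rate, but state C, operating in a weaker labor market, beats its expected outcome by the widest margin and ranks first on the adjusted measure.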

Unintended consequences

Another issue that can arise in developing outcome-based performance measures is the creation of unintended consequences (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994): efforts to reward a certain result can induce behavior that works against the program's broader goals. The most prevalent example in the world of employment and training programs is known as "creaming." When only the outcomes for clients enrolled in the program are considered, programs can enhance their performance on employment and earnings measures by serving those clients who are most "job ready" and who, with minimal program assistance, are most likely to become employed on their own. Programs may also have a disincentive to focus on hard-to-serve clients, who may actually be more in need of the services provided by the program, because doing so would affect their ability to achieve a certain level on performance measures.

There may be additional unintended or adverse consequences as well. For example, a focus on caseload reduction can lead to incentives to divert recipients from receiving assistance or to lower grant levels or earnings disregards so individuals leave assistance more quickly when they find jobs.

Multiple program goals

A final issue in developing an outcome-based system of performance measurement grows out of the relatively broad purposes of welfare-to-work programs. While the overall goal is to move individuals into employment and/or off cash assistance, a number of objectives could be pursued in achieving it. For example, some programs may emphasize finding "better jobs," that is, jobs with benefits or higher wages, while others may emphasize moving individuals quickly into jobs regardless of wage level. Moreover, some states may use their TANF programs to reduce poverty, for example by providing income support for needy families as long as they remain poor, even if this means the families receive assistance for a longer period. Others may view self-sufficiency and reduced dependency as the primary goal and reduce the level of support provided in order to make work a necessity.

Studies of welfare programs in the 1980s found that, depending on the objectives administrators identify and the service and management strategies they adopt, programs move in substantially different directions with different results. For example, one study found that a range of programs was successful depending on the specific measures used: some achieved high employment rates, others produced larger earnings gains, and still others were more cost-effective (U.S. Department of Health and Human Services, 1994). However, achievement on one outcome did not necessarily correlate with achievement on the others. Thus, it is important to take care in identifying and promoting one particular program objective over another.