Report on Alternative Outcome Measures: Temporary Assistance for Needy Families (TANF) Block Grant

Publication Date

Nov 30, 2000

Department of Health and Human Services (HHS)
Administration for Children and Families (ACF)
and
Assistant Secretary for Planning and Evaluation (ASPE)

Context for this Report

Legislative Context

The following report is submitted pursuant to section 107 of the Personal Responsibility and Work Opportunity Reconciliation Act of 1996 (P.L. 104-193; PRWORA), which provides that the Secretary of the Department of Health and Human Services (HHS) shall conduct a study of alternative outcomes measures under the Temporary Assistance for Needy Families (TANF) program and submit a report on the findings of that study to the Senate Finance Committee and the House Ways and Means Committee.

Sec. 107. STUDY ON ALTERNATIVE OUTCOMES MEASURES
(a) STUDY.-- The Secretary shall, in cooperation with the States, study and analyze outcomes measures for evaluating the success of the States in moving individuals out of the welfare system through employment as an alternative to the minimum participation rates described in section 407 of the Social Security Act. The study shall include a determination as to whether such alternative outcomes measures should be applied on a national or a State-by-State basis and a preliminary assessment of the effects of section 408(a)(7)(C) of such Act⁽¹⁾.

This report has been prepared by staff in the Department of Health and Human Services, Administration for Children and Families (ACF) and the Office of the Assistant Secretary for Planning and Evaluation (ASPE) with assistance from contractor staff.

Introduction

The Department of Health and Human Services' efforts to study measures of welfare program outcomes began with a requirement of the Family Support Act of 1988. The resulting report acknowledged that performance under the welfare program's Job Opportunities and Basic Skills Training (JOBS) program should not be measured solely by levels of activity or participation, but it stopped short of making recommendations for specific performance standards and measures of outcomes (HHS, 1994). While enactment of the Government Performance and Results Act (GPRA) of 1993 has increased the use of performance measurement systems at all levels of government, critical issues, as discussed below, still need to be resolved before making recommendations for the best set of outcomes for performance measurement and standards, particularly when they are linked to financial consequences (penalties or bonuses).

In developing this report, we set out to accomplish several objectives. First, we reviewed the literature on performance measurement specifically as it applies to and is used in welfare and welfare-to-work programs. Second, we analyzed the participation requirements under the TANF program (as well as the JOBS participation requirements) to determine the benchmark against which potential outcome measures for evaluating the success of state block grant programs in helping recipients move from welfare to work would be compared. In order to take advantage of the experience gained through implementation of the TANF participation rate requirements, we delayed this analysis until the participation rate data for the first year of TANF implementation became available in summer 1999. Third, we sought input and advice from a wide range of individuals and organizations (including state representatives, researchers and advocates) to get their sense of the goals of TANF for which outcome measures would be appropriate and their suggestions for potential measures. Fourth, starting with the suggestions we received during our consultation process, we constructed a representative list of possible outcome measures, including an analysis of the data and measurement issues that would affect their usefulness as alternatives to the minimum participation rates. Finally, this report stops short of making recommendations for specific measures; rather, it provides a framework for policymakers to use in determining whether outcome-based performance measures should be used as a substitute for or in conjunction with the minimum participation rates, and if so, in selecting among potential measures.

Our immediate objective in preparing this report is to identify for policymakers the range of factors affecting state performance that might be influenced by outcome-based measures in general, and by some specific outcome measures in particular. As noted above, the legislative request for this report mandated that outcome-based performance measures be evaluated as an alternative to the minimum participation rate requirements under current law. Because there are financial penalties linked to these participation rate requirements, we looked for outcome-based performance measures that are sufficiently robust to justify linking them to financial consequences.

Our goal is to present the strengths and weaknesses of different approaches to outcome-based performance measurement and different performance measures and highlight the tradeoffs that must be made, such as between measuring the long-term success of a program and having results quickly in order to provide timely feedback on program performance. In particular, we discuss in some detail the data limitations that constrain the choices of possible performance measures.

Chapter I of this report provides a brief summary of the evolution and application of performance measurement techniques for government-wide accountability under GPRA, with a special focus on the history of performance measurement in welfare-to-work programs. It also highlights the current system of penalties and rewards under the TANF program.

In Chapter II we discuss the basic elements of an outcome-based performance measurement system, and identify a number of issues that need to be taken into consideration in applying such a system to the TANF program.

Chapter III presents an illustrative but not exhaustive set of possible alternative outcomes measures for the TANF program, most of which were discussed at a consultation session we held in July 1999. These examples are presented both to illustrate the range of issues that need to be considered in developing an outcome-based performance measurement system for TANF, and because measures similar to these are likely to be among the candidates for inclusion in such a system. For each item addressed, we describe the performance measure and its relationship to the TANF program and examine various measurement issues that arise in defining how the measure would be calculated, issues related to the availability, quality and timeliness of data to calculate the measure, and fairness issues related to whether some states would be advantaged or disadvantaged by a particular measure.

Finally, in Chapter IV, drawing from our examination, we offer some conclusions to guide future considerations of instituting outcomes measures to evaluate states' success in implementing TANF.

This report also includes five appendices. Appendix A provides the results of our literature review of the use of outcome-based performance measures in welfare and workforce development programs. Appendix B summarizes the meeting we hosted in July 1999 to consult with states, researchers and advocates on potential outcomes measures for evaluating the success of the states in moving individuals out of the welfare system through employment as an alternative to the minimum participation rates. Although summarized in this Appendix, the views of consultation participants are incorporated throughout the report. Appendix C includes a description of the participation requirements for welfare recipients under both TANF and the predecessor JOBS program, and compares the two. In Appendix D we identify select data sources (both survey and administrative) for potential outcome measures and describe the characteristics of each. A bibliography is included as Appendix E.

It is our hope and intention that this report provides relevant information on strategies for evaluating states' success in implementing TANF to inform upcoming discussions pertinent to the program's reauthorization.

Background on Performance Measurement

Over the years, a range of terms has been used to describe the different types of performance measures used to gauge program success. Some studies have tried to achieve consensus on definitions of key terms in performance measurement, particularly for use in welfare-to-work and other employment programs (Brown and Corbett, 1997; Hatry, 1999; Martin and Kettner, 1996; Midwest Welfare Peer Assistance Network, 1999; U.S. Department of Health and Human Services, 1994). In particular, these studies draw a crucial distinction between process measures and outcome measures.

Process measures. Process measures address administrative or operational activities of the program. These types of measures usually reflect the "means" to getting to an end result rather than the goal itself. Examples of process measures include participation rates reflecting the type and level of service received through the program, and the percentage of applications for assistance which are acted upon in a timely manner.
Outcome measures. Outcome measures focus on the goals which the program hopes to achieve. In most cases, these measures focus on the outcomes for a group of individuals involved in the program. In welfare-to-work programs, key outcome measures typically include job placement rates, employment retention rates, or wage rates.

While this distinction is conceptually important, in practice, there is often some uncertainty about whether a particular measure should be considered a process or an outcome measure. This is not merely the result of confusing terminology, but reflects the reality that there is often a continuum between pure process measures and pure outcome measures. For example, in a program for teen parents, the number of people served and the cost per participant are process measures. Depending on the specific goals of the program, outcome measures might include the fraction of participants who have not had a subsequent child two years later, or the fraction of participants who are employed. Measures that fall in the middle of the continuum might include the fraction of participants who attend high school, the fraction of participants who have received a certificate of General Educational Development (GED), or the fraction who meet the program's internal definition of successful completion. Such intermediate goals are sometimes referred to as "interim outcome measures" because they represent an important milestone even though they are not the ultimate goal of the program. Other sources refer to such measures as "outputs."
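As an illustration only, the continuum described above can be sketched in Python for the teen parent example. The record fields, sample values, and grouping labels below are hypothetical and are not drawn from any actual program data; the sketch simply shows that process, interim, and outcome measures can be calculated from the same participant records.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Participant:
    """Hypothetical record for one teen parent served by the program."""
    program_cost: float               # dollars spent serving this participant
    attends_high_school: bool         # interim measure
    earned_ged: bool                  # interim measure
    employed: bool                    # outcome measure
    subsequent_child_within_2y: bool  # outcome measure (counted when False)


def fraction(flags: List[bool]) -> float:
    """Share of participants for whom the flag is True."""
    return sum(flags) / len(flags)


def continuum_measures(people: List[Participant]) -> dict:
    """Group example measures by their place on the process-outcome continuum."""
    if not people:
        raise ValueError("at least one participant record is required")
    return {
        "process": {
            "number_served": len(people),
            "cost_per_participant": sum(p.program_cost for p in people) / len(people),
        },
        "interim": {
            "attending_high_school": fraction([p.attends_high_school for p in people]),
            "earned_ged": fraction([p.earned_ged for p in people]),
        },
        "outcome": {
            "employed": fraction([p.employed for p in people]),
            "no_subsequent_child": fraction(
                [not p.subsequent_child_within_2y for p in people]
            ),
        },
    }


# Hypothetical sample of four participants.
sample = [
    Participant(1000.0, True, False, True, False),
    Participant(2000.0, True, False, False, False),
    Participant(1500.0, False, True, True, True),
    Participant(1500.0, True, False, False, False),
]
result = continuum_measures(sample)
```

Which of these figures a program treats as its headline measure depends on the program's goals, which is the point of the distinction drawn above.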

Performance measurement, or the measurement of the results (or outcomes) and efficiency of services or programs (Hatry, 1999), has been the subject of growing interest at all levels of government in recent years. In particular, there has been a recent movement toward increased use of outcome measures, rather than process measures. A number of broad trends have contributed to this growth.

In response to critics who have expressed skepticism about the value of government services, many providers of government services have turned to outcome-based measures in order to prove the utility of their efforts. Specifically, this new accountability focus now requires providers to show not just that they have in fact spent the public money on the activities for which it was designated, and not just that they have been efficient in serving as many people as possible with the available funds (process measures), but also that the statutory goals of the programs are being met and that recipients are better off as a result (outcome measures).

To some degree, this trend represents the spread of techniques used in the private sector - including a focus on measurement of product quality and customer satisfaction and on the establishment of numerical targets for improvement. Similar techniques had also been used in the Defense Department since the 1950s to compare the expected costs and effectiveness of various proposed weapons systems. However, such techniques had not often been applied to the provision of social services. Many providers of social services had only rudimentary capacities to track what happened to the recipients of their services.

With the enactment of the GPRA in 1993, Congress required all federal agencies to identify the goals of their programs and to report annually on their progress in achieving these goals. GPRA seeks to shift the focus of federal management and decisionmaking from a preoccupation with process measures such as the number of tasks completed or units of service provided to a more direct consideration of the results or outcomes of programs - that is, the real differences the tasks or services provided make in people's lives (Hinchman, 1997).

The recent devolution of policy and program design and funding to the state and local level has also increased the attention paid to performance measurement. In a number of areas, including human services policy, federal policy makers have created block grants, which give states great flexibility in their use of funds within broad program parameters. In such an environment, the most logical way to hold states accountable for their use of public funds is to monitor program outcomes.

The increased emphasis on holding public agencies accountable for the attainment of program goals and the outcomes of their clients is also reflected in the numerous state initiatives to develop and use performance measures. In some states, the welfare agencies are participating in comprehensive performance measurement systems which focus on establishing indicators or benchmarks of progress toward goals across programs and agencies. In other states, performance measures are used internally by agencies to monitor the performance and accountability of local offices or of contractors.

However, the shift toward use of outcome-based performance measures has occasioned some controversy, particularly when financial consequences have been attached to agencies' success or failure in achieving specified targets or standards. The major reason is that even the most effective programs are only one element out of many that affect participants' outcomes. As Forsythe (2000) notes, "almost by definition, high-level outcome measures track social changes that are influenced by factors that are not under the direct control of operating agencies," such as the overall state of the economy, the underlying social and demographic characteristics of participants, and societal attitudes about the roles of men and women. Most program administrators are understandably leery of being measured - and possibly rewarded or penalized - on results which they cannot fully control. Yet, the only measures which are totally under the control of program operators are process measures, such as the number of clients served. This issue is discussed in more detail below.

Performance Standards in Welfare-to-Work Programs Before PRWORA

Interest in outcome-based performance measures in the employment training and welfare-to-work arenas has been growing for quite some time. Since 1982, the U.S. Department of Labor has required states and local service agencies receiving funding under the Job Training Partnership Act (JTPA) program to report data on client outcomes and has provided corresponding incentives and sanctions on the basis of that outcome data. For adult JTPA participants, the key performance measures were the employment rate and average weekly earnings during the 13th week after program exit. These outcomes were measured both for all adult participants and for the subset of participants who were also welfare recipients. The Workforce Investment Act of 1998 (WIA), which replaced JTPA, includes a performance measurement system which builds on the JTPA model.
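A minimal sketch of how the two adult measures might be calculated from follow-up records appears below. The record layout and sample values are hypothetical, and the sketch assumes that average weekly earnings are taken over those employed in the 13th week; the precise JTPA reporting definitions are not given in this report.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class ExitRecord:
    """Hypothetical follow-up record for one adult who left the program."""
    employed_week13: bool          # employed during the 13th week after exit
    weekly_earnings_week13: float  # earnings in that week (0.0 if not employed)
    welfare_recipient: bool        # also a welfare recipient


def followup_measures(records: Iterable[ExitRecord]) -> dict:
    """Return the employment rate and average weekly earnings at week 13.

    Assumption: average earnings are computed only over those employed
    in week 13, not over all who left the program.
    """
    records = list(records)
    if not records:
        return {"employment_rate": None, "avg_weekly_earnings": None}
    employed = [r for r in records if r.employed_week13]
    rate = len(employed) / len(records)
    avg = (
        sum(r.weekly_earnings_week13 for r in employed) / len(employed)
        if employed
        else None
    )
    return {"employment_rate": rate, "avg_weekly_earnings": avg}


# Hypothetical sample: four adults, two of whom are welfare recipients.
sample = [
    ExitRecord(True, 300.0, True),
    ExitRecord(False, 0.0, True),
    ExitRecord(True, 400.0, False),
    ExitRecord(True, 200.0, False),
]

# Measured for all adults and for the welfare-recipient subset.
all_adults = followup_measures(sample)
welfare_subset = followup_measures(r for r in sample if r.welfare_recipient)
```

Calculating the same measures for the welfare-recipient subset requires only a filter on the records, which is why the two populations could be reported side by side.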

The Family Support Act of 1988 (P.L. 100-485), which created the JOBS program, also established work participation rate requirements for the states to meet. Prior to the establishment of JOBS, the primary system for holding states accountable for their use of AFDC funds was the Quality Control (QC) system. QC focused on the accuracy of the eligibility determinations and benefit calculations. The Family Support Act emphasized participation in work activities and established a participation rate requirement as a measure of programs' success in engaging recipients in work-related activities, primarily education and training. States were required to engage seven percent of non-exempt recipients in activities in Fiscal Year (FY) 1990, rising to 20 percent by FY 1995. (More detail on the JOBS participation rate requirements and a comparison of them to the participation rate requirements under TANF are provided in Appendix C.)

In the Family Support Act, Congress asked HHS to develop recommendations for performance standards regarding "specific measures of outcomes" beyond simple measures of levels of activity or participation. The resultant report, completed in 1994, acknowledged the importance of developing an outcome-based system of performance measures, but raised some critical issues that the Department believed "needed to be addressed and dealt with prior to using outcomes as the basis for performance measurement and standards" (HHS, 1994). These included:

the inconsistent relationship between program outcomes and program effectiveness, as measured through program evaluations of controlled experiments;
the need to create a "level playing field" across and within states, and over time, taking into account differing economic and demographic circumstances;
the effect of the choice regarding who is counted toward the outcome measure on performance; and
the recognition that different state JOBS programs may have different objectives.

The 1994 report did not recommend specific outcome-based performance measures, but rather presented a workplan for a process to refine existing participation rates, develop outcome-based performance measures and standards, and strengthen accountability mechanisms, including modifications to the Quality Control system. This proposed course of action was overtaken by the enactment of PRWORA and the implementation of the new TANF block grants.

Penalties and Bonuses under TANF

If state variation in JOBS program objectives (considering the relatively prescriptive federal mandates) contributed to difficulty in recommending a system of outcome-based performance measures under JOBS, then passage of PRWORA added to the problem's complexity. PRWORA eliminated the cash welfare (AFDC) and job training (JOBS) programs and replaced them with a block grant called Temporary Assistance for Needy Families (TANF). Its purpose was to increase state flexibility in providing assistance to needy families within the framework of four broad Congressionally mandated goals. These goals are to:

provide assistance to needy families so that children may be cared for in their own homes or in the homes of relatives;
end the dependence of needy parents on government benefits by promoting job preparation, work, and marriage;
prevent and reduce the incidence of out-of-wedlock pregnancies and establish annual numerical goals for preventing and reducing the incidence of these pregnancies; and
encourage the formation and maintenance of two-parent families.

States have total flexibility in setting priorities among these goals as they choose how to spend their TANF block grant. There is no requirement that states spend equal amounts on each of these goals, or even that they spend any funds on a given goal. To date, the majority of TANF funds have been spent on cash and work-based assistance, which was the primary use of funds allowed under AFDC. Most of the remaining funds have been spent on work activities, child care, and other work-based supports. States also have full control over such aspects of the program as benefit levels, eligibility, and the design and sequencing of work activities. In order to protect state flexibility, Congress prohibited HHS from regulating state behavior unless specifically required under the law.

In counterpoint to this broad state flexibility, in designing TANF, Congress was quite prescriptive in some specific areas - such as time limits, sanctions, and requirements imposed on teen parents. These mandates are backed up by financial penalties, under which states may lose a portion of their block grant allocation for such violations as failure to participate in the income and eligibility verification system, failure to maintain a certain level of historic funding effort, or failure to comply with the five-year time limit on assistance. These penalties are all attached to process measures, and are designed to ensure compliance with Congressional priorities.

The penalty which has attracted the most attention so far is for failure to meet the work participation rate requirement, under which states must engage a target percentage of all recipients (with very limited exceptions) in work and work-related activities. As described in detail in Appendix C, the list of activities which may be counted under this requirement is more restrictive than the countable activities under JOBS. This participation rate requirement is becoming more challenging over time, as both the hours of participation required in order to be counted and the target participation rate rise each year. In the first three years of the TANF program, all states have achieved the required all-families participation rate, but 19 states failed to achieve the higher target for two-parent families in FY 1997, 14 failed to do so in FY 1998 and 8 failed to do so in FY 1999. (See Table 2 and Table 3 in Appendix C.)

While TANF does not include any penalties based on outcome-based performance measures, it is worth noting that the participation rate does have some aspects of an outcome measure, in that most of the people whom the states can count toward the rate are working in unsubsidized employment, which is key to one of the most fundamental goals of TANF - requiring families to make efforts to work. Moreover, states receive credit toward the participation rate for the degree to which their caseloads have declined since 1995, to the extent that these changes were not caused by changes in eligibility. Thus, the participation rate also rewards states that have moved families off welfare.
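The interaction between the participation rate and the credit for caseload decline can be sketched as follows. This is an illustration under stated assumptions, not the statutory formula: it assumes the credit equals the percentage decline in the caseload since 1995, net of any decline attributed to eligibility changes, and that the credit lowers the target a state must reach. All function names and figures are hypothetical.

```python
def participation_rate(families_participating: int, families_required: int) -> float:
    """Percent of families subject to the requirement that count as participating."""
    if families_required <= 0:
        raise ValueError("families_required must be positive")
    return 100.0 * families_participating / families_required


def caseload_reduction_credit(
    caseload_1995: float,
    caseload_now: float,
    decline_from_eligibility_changes: float = 0.0,
) -> float:
    """Percentage-point credit for caseload decline since 1995.

    Assumption: the credit is the percentage decline in the caseload,
    excluding decline caused by eligibility changes, and never negative.
    """
    if caseload_1995 <= 0:
        raise ValueError("caseload_1995 must be positive")
    net_decline = (caseload_1995 - caseload_now) - decline_from_eligibility_changes
    return max(0.0, 100.0 * net_decline / caseload_1995)


def meets_requirement(rate: float, target: float, credit: float) -> bool:
    """Assumption: the credit reduces the target rate, floored at zero."""
    adjusted_target = max(0.0, target - credit)
    return rate >= adjusted_target


# Hypothetical state: caseload fell from 100,000 to 80,000 families,
# with 5,000 of that decline attributed to eligibility changes.
rate = participation_rate(families_participating=12_000, families_required=60_000)
credit = caseload_reduction_credit(100_000, 80_000, decline_from_eligibility_changes=5_000)
passes_with_credit = meets_requirement(rate, target=30.0, credit=credit)
passes_without_credit = meets_requirement(rate, target=30.0, credit=0.0)
```

In this hypothetical case the state's 20 percent participation rate falls short of a 30 percent target but meets the target once the 15-point credit is applied, which illustrates how the credit rewards caseload decline.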

The bonuses under TANF are more outcome-oriented. Under one bonus, the Bonus for Reductions in Out-of-Wedlock Births, Congress provided up to $100,000,000 per year in bonuses for up to five states that demonstrate the greatest decreases in out-of-wedlock births, so long as those states also have a reduction in the abortion rate from FY 1995. Under a second bonus, the High Performance Bonus, Congress appropriated an average of $200,000,000 per year for five years to reward the highest performing states in achieving the goals of TANF. Congress did not specify the measures to be used, but required HHS to develop a formula for scoring states' performance, in consultation with the National Governors' Association and the American Public Human Services Association (formerly the American Public Welfare Association). For bonuses awarded in FYs 1999-2001, the measures used are primarily related to the first two goals of TANF: job entry and success in the workforce, measured by a weighted combination of earnings gains and job retention. Awards are provided both for absolute performance and for improvement compared to the previous year. States choose whether to compete for any or all of these work measures.
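The scoring approach described above, a weighted combination of two work measures with states compared on both absolute performance and improvement, can be illustrated with the sketch below. The equal weights, state names, and rates are hypothetical and are not the formula HHS adopted.

```python
def workforce_success_score(
    retention_rate: float,
    earnings_gain_rate: float,
    retention_weight: float = 0.5,   # hypothetical weight
    earnings_weight: float = 0.5,    # hypothetical weight
) -> float:
    """Weighted combination of job retention and earnings gains."""
    return retention_weight * retention_rate + earnings_weight * earnings_gain_rate


def rank_states(scores: dict) -> list:
    """Return state names ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical (retention rate, earnings gain rate) pairs for two years.
current_year = {"State A": (0.80, 0.20), "State B": (0.70, 0.10), "State C": (0.60, 0.30)}
previous_year = {"State A": (0.80, 0.18), "State B": (0.60, 0.08), "State C": (0.60, 0.26)}

absolute = {s: workforce_success_score(*v) for s, v in current_year.items()}
baseline = {s: workforce_success_score(*v) for s, v in previous_year.items()}
improvement = {s: absolute[s] - baseline[s] for s in absolute}

absolute_ranking = rank_states(absolute)        # ranked on current performance
improvement_ranking = rank_states(improvement)  # ranked on year-over-year change
```

With these hypothetical figures, the state ranked first on absolute performance is ranked last on improvement, which suggests why the two bases for awards can reward different states.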

Under the Final Rule, published in the Federal Register on August 30, 2000 (65 FR 52814), for High Performance Bonuses awarded in FYs 2002 and 2003, states will also have the opportunity to compete under a measure of family formation and stability and five measures of states' success in supporting work and self-sufficiency by providing eligible families with health insurance (through Medicaid and SCHIP), food stamps, and child care subsidies (HHS, 2000(a)).

There is some overlap between the mandate for this report and the requirement to award a High Performance Bonus to states identified as high performers on the basis of formulas that measure states' performance in operating their TANF programs. This report takes into account the input received through the High Performance Bonus consultation process, as well as lessons learned in developing the High Performance Bonus interim guidance and proposed and final rules. We also include the High Performance Bonus measures among those discussed in detail in Chapter III, as these measures could well serve as the building blocks for a system of outcome-based performance measurement.

However, the two requirements differ in some important respects. First, the High Performance Bonus measures needed to be selected and implemented in a narrow time frame, with limited possibility of developing new data sources or using past performance to establish benchmarks. More importantly, in considering outcome-based performance measures as an alternative to the work participation requirements, this report examines the use of such measures for penalties as well as for bonuses, and explores the issues which would need to be addressed if outcome-based performance measures were to be used as the primary mechanism for holding states accountable for their use of federal TANF funds.

Consultation

In keeping with the Department's commitment to a broad consultation strategy used throughout the development of TANF-related regulations, HHS hosted a consultation meeting on July 21, 1999 with representatives from states and research and advocacy organizations. The purpose of the meeting was to identify and discuss some potential outcomes measures that could be used to evaluate state performance as an alternative to the minimum participation rates. All states were invited to send representatives, as were a substantial number of research and advocacy organizations with whom the Department has consulted on other TANF issues, including the development of state guidance on the High Performance Bonus. About half the states participated, as well as almost 20 research and advocacy organizations and several federal agencies. A list of states and organizations represented is included in Appendix B, along with a more detailed summary of the consultation meeting.

The objectives of the consultation were to discuss:

the overall goals related to work that should be promoted through performance measures (e.g., employment, earnings, income, self-sufficiency, marriage);
specific measures that might be used to promote these goals without creating perverse incentives;
whether timely and accurate data are available for these measures at a reasonable cost;
the appropriateness of linking performance measures to penalties and/or bonuses at the state level; and
whether outcome-based performance measures should be applied on a national or state-by-state basis.

A major topic of discussion by the group was whether outcome-based performance measures should be focused only on the goal of moving recipients off welfare through employment, or whether they should address the other goals of TANF as well. Participants suggested measures representing a wide range of outcomes of interest, falling roughly into nine broad categories:

work participation and employment;
poverty and movement to self-sufficiency;
requirements for two-parent families;
duration of welfare receipt;
caseload reductions;
child outcomes;
supportive services;
customer satisfaction; and
educational outcomes.

Most of the participants were open in principle to measuring state performance across the wider range of TANF goals, but there was no consensus on which goals were most important. The first two categories of outcomes (i.e., work participation and employment and poverty and movement to self-sufficiency) were the focus of most of the discussion, along with identification of some specific measures that might be used to promote achievement of the desired outcomes. Discussion of the broader data issues affecting the measures (e.g., are the data available, how much do they cost to collect, are they accurate and precise at the state level, are they reported frequently and in a timely manner?) was limited, although in general, states expressed reservations about any measures that would increase their data collection and reporting burden or change their data requirements in the short term.

Efforts to identify the "preferred" and "least favorite" measures among those generated through the brainstorming session revealed fairly substantial differences between the group of state representatives and the group of representatives from research and advocacy organizations. State representatives favored a limited set of core measures, such as those used for the first two years of the High Performance Bonus under the interim guidance. They suggested that states should have the option whether to compete on additional measures, such as progression along the poverty continuum, a measure of the percent of those required to work who have earnings, and welfare recidivism. On the other hand, the researchers and advocates favored multiple measures in order to reflect the wide range of possible goals under TANF, such as a measure of labor market success, a broader measure of program participation, a measure of extreme child poverty, and a measure of the provision of supportive services. There was widespread agreement among all participants, however, that the two-parent work participation rate was a "least favorite" measure.

The summary of the highlights of the consultation, which was shared with all participants and consultation invitees, is at Appendix B.

Endnotes

1. Section 408(a)(7)(C) of the Social Security Act, as amended by PRWORA, provides for an exemption from the 60-month time limit for individuals who have been "battered or subjected to extreme cruelty." See page 17 of this report for a discussion of the domestic violence provisions of TANF.

Principles of Outcome-Based Performance Measurement

An outcome-based performance measurement system consists of four major elements:

Goals - What are the desired goals of the program? Will the goals be set at the national level or will states and communities be given the authority to choose among a range of possible priorities?
Measures - Are the measures relevant to the goals and desired outcomes of the program? Will users understand what is being measured and reported? Are valid data available to measure the outcomes? Do program administrators believe that these measures accurately reflect their performance?
Standards - What levels of performance do we expect? Should a single minimum nationwide standard be set for each measure, or should standards be adjusted to account for the economic and demographic differences among states and localities?
Consequences - What form should the consequences take - is publicity enough to affect state choices, or are financial penalties/incentives needed? Should penalties or bonuses, or some combination, be imposed on the basis of actual performance relative to the standards?

Engaging a broad range of stakeholders in structured discussions about all of these elements can be the basis for building performance partnerships in designing and implementing a performance measurement process. Even if one is not trying to achieve consensus, this wider audience provides a better perspective on what is possible in the real world (e.g., what is operationally feasible, what are possible unintended consequences, etc.) and may facilitate future data collection efforts (Hatry, 1999). Based on HHS's experiences in developing the TANF High Performance Bonus guidance and rulemaking and reports from states that have developed outcome-based performance measurement systems, holding inclusive discussions appears to be the preferred approach. Some states have used the process of developing results-based accountability systems to engage the wider public in a discussion about statewide policy goals and priorities (either within or across specific programs) and to build a commitment of public, private, and nonprofit resources toward these ends.

Goals

Developing an outcome-based performance system starts with identifying the goals of the program. What is the purpose of a program? What are its desired outcomes? How will we know if the program is working? This is not necessarily an easy task for a program with as many purposes and as much flexibility in the possible uses of funds as TANF - it may be difficult to reach consensus on the outcomes that we care about enough to single out for performance measures, particularly if the associated consequences are substantial.

As discussed above, under TANF, states have a great deal of flexibility in selecting how to distribute their funds among the four Congressionally-specified purposes. While all states have invested substantially in promoting work and self-sufficiency, beyond this common core, they have made different choices in the goals they promote: some have invested in programs to support the formation and maintenance of two-parent families, some have focused on preventing teen pregnancies, and others have expanded their supports for all working poor families, whether or not they have previously received cash assistance. Moreover, even within the general area of promoting work, states have made different decisions about how best to achieve this goal. For example, some states have adopted "work-first" programs which encourage recipients to accept any job they can get in order to acquire work experience, while others have encouraged recipients to be more selective and to participate in training to qualify for a job that offers some promotion potential, or that provides health insurance or other benefits. Which of these approaches will be determined to have the best outcomes depends at least in part on the specific measure that is selected.

The question of whether this degree of variation in goals is appropriate and desirable or whether state flexibility in this area should be restricted is a topic for TANF reauthorization discussions. There are any number of potential approaches, including:

Adopt mandatory measures of performance to reflect all goals which Congress considers of high priority. Particularly if penalties were associated with these measures, this approach would reduce state flexibility in the use of TANF and MOE funds.
Allow states to select from a list of performance measures, reflecting a range of goals, the ones under which they wish to compete for bonuses. This is the approach adopted by HHS in implementing the High Performance Bonuses. This approach gives states an incentive to improve their performance in areas where they may not have focused as much attention, but does not penalize those states which are not interested in competing under these measures.
Adopt only outcome-based performance measures related to those goals around which there are both general agreement and comparable data. Currently, this would probably limit outcome measurement to work-related measures.

Measures

Once there is agreement upon the goals of a program, the next step is to develop specific measures that reflect these goals. At this stage, both operational and theoretical concerns must be taken into account.

The availability of data is the primary operational concern. Overall goals must be linked to specific measures for which accurate and comparable data are available at the state level in a timely fashion and at a reasonable cost (Brown and Corbett, 1997; Hatry, 1999; Yates, 1997; Zornitsky and Rubin, 1988). Comparability of data means that the measures should detect real differences in program performance across states or localities or over time rather than reflect differences in the quality of data used to calculate the measures. Timeliness of data has not always been considered in the selection of measures, but it is necessary if the performance measurement system is expected to provide policy-relevant feedback to states on the results of their actions. At the July 1999 consultation, states expressed a clear preference for minimizing the cost and burden of data collection: they favored measures that could be assessed using national survey data or data they were already reporting over measures that would increase their data collection and reporting responsibilities.

Experience to date has shown that data systems at all levels of government fall short of the ideal (Brown and Corbett, 1997). Data for some measures cannot be collected at all, while other measures can be captured only poorly. Moreover, the cost of developing or improving data collection systems can be substantial. While states are collecting a range of information about TANF recipients beyond that required under the federal reporting rules, they do not all collect the same data elements. Even when the same general information is collected, there is no consistency in how it is measured across states (APHSA, 2000).

Some performance measurement systems have had success with existing administrative data - such as Unemployment Insurance (UI) records - which are collected uniformly across a range of states or localities (Bartik, 1996; Yates, 1997). Administrative data usually attain some level of quality and the cost of collecting the data is limited since they are generally collected for other purposes. However, the types of measures that can be derived from administrative data are limited. For example, UI records include data on earnings over a quarter, but not on hourly wages. Moreover, the quality of administrative data is highest for data elements that are directly related to the purpose for which the data were collected - such as the amount of benefits paid - and lower for other elements - such as the educational level of recipients (GAO, 1997). National or state surveys can also provide data on a wider range of measures, particularly if existing survey efforts can meet the needs of the performance measurement system. However, initiating survey efforts can be relatively expensive. If one is interested in outcomes that are valid at the level of specific states or localities, relatively large sample sizes will be required to achieve this level of precision. Appendix D of this report reviews the merits and disadvantages of several potential data sources for outcome measures.

The theoretical concerns in developing measures are driven by the fact, as noted earlier, that all high-level outcome measures are affected by a range of factors, not just by program performance. This would not be a problem if there were a strong correlation between performance on an outcome measure and program effectiveness, as shown through evaluation. Unfortunately, research has shown that there is not always a consistent relationship between program outcomes and impacts.⁽²⁾ For example, many welfare recipients find jobs on their own - without the assistance of welfare-to-work programs. The role of welfare-to-work programs is to add value to the "natural" movement off welfare and into employment. States with stronger economies and lower unemployment rates are generally able to move more individuals into employment than those with weaker economies. Similarly, states with a more disadvantaged caseload may have greater difficulty moving individuals into work than states with a more job-ready caseload. Therefore, performance on an outcome measure may have more to do with differences in economic conditions or caseload composition than with the effectiveness of the welfare-to-work program.

Appendix A examines this issue in more detail. Using data from random-assignment evaluations of welfare-to-work programs in five sites, it can be seen that there is not a consistent relationship between the programs with the highest employment rates or average earnings - two possible outcome measures - and the programs that produced the greatest impact on these measures. This problem is one of the major issues identified in the 1994 report to Congress (HHS, 1994) that needs to be resolved in order to adopt an outcome-based performance measurement system. However, the research also shows that this problem is not unique to outcome measures - participation rates over time also are poorly correlated with program impacts.
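The distinction between an outcome level and an impact can be made concrete with a small calculation. The following is a minimal sketch in Python using invented numbers, not the Appendix A data; the site labels, sample sizes, and the function name `outcome_and_impact` are hypothetical. It assumes the usual random-assignment logic: the outcome is the experimental group's employment rate, and the impact is that rate minus the control group's rate.

```python
from statistics import mean


def outcome_and_impact(experimental, control):
    """Return (outcome, impact) for 0/1 employment indicators.

    outcome: employment rate of the experimental (program) group
    impact:  experimental rate minus the control group's rate
    """
    outcome = mean(experimental)
    impact = outcome - mean(control)
    return outcome, impact


# Hypothetical site X: strong local economy, so most people find work anyway.
site_x_experimental = [1] * 7 + [0] * 3   # 7 of 10 employed
site_x_control = [1] * 6 + [0] * 4        # 6 of 10 employed

# Hypothetical site Y: weaker economy, but the program adds more value.
site_y_experimental = [1] * 5 + [0] * 5   # 5 of 10 employed
site_y_control = [1] * 3 + [0] * 7        # 3 of 10 employed

out_x, imp_x = outcome_and_impact(site_x_experimental, site_x_control)
out_y, imp_y = outcome_and_impact(site_y_experimental, site_y_control)

print(f"Site X: outcome {out_x:.2f}, impact {imp_x:.2f}")
print(f"Site Y: outcome {out_y:.2f}, impact {imp_y:.2f}")
```

In this invented example, site X ranks first on the outcome measure while site Y produces the larger impact, which is the kind of inconsistency described above.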

Several lessons can be drawn from this research:

Outcome-based performance measurement is only one element of a comprehensive monitoring and research program. By necessity, performance measurement systems are limited to those elements for which data can be collected inexpensively, routinely and in a timely fashion. In-depth understanding of participant experiences and program effectiveness requires different approaches, including detailed participant surveys, rigorous evaluations, and advanced econometric analysis (Forsythe, 2000).
Outcome-based performance measurement can still be a useful tool to monitor program operations and promote improvements, as long as stakeholders at all levels of operations agree that there is a clear logical system connecting the activities of program operators to the outcomes that are measured. Outcome measures may be used to identify areas where additional resources or technical assistance are needed (Perrin and Koshel, 1997).
The chosen measures must not give programs incentives to achieve high levels on performance measures through the use of strategies that subvert their fundamental intent. For example, it is important to develop measures for welfare-to-work programs that minimize incentives for creaming (e.g., serving only those who are most job-ready and most likely to become employed on their own, with minimal program assistance). Likewise, in measuring how well states are succeeding at helping former welfare recipients achieve enduring self-sufficiency, care should be taken not to focus exclusively on welfare recidivism, since a state could achieve very low levels of recidivism by making it impossible for former recipients to reapply for cash assistance, even without supports for employment.
Because it is impossible to fully account and adjust for all the variations in circumstances among states, no performance measurement system can be perfectly fair. It is important to develop mechanisms which recognize that states are facing different economic and demographic environments. Discussed below in the next section are a number of ways in which the standards for performance measures can be adjusted to provide a more level playing field for all states, and the advantages and disadvantages of each.

The selection of specific measures inevitably involves trade-offs. The use of multiple measures can help guard against any unintended consequences that might be caused by reliance solely on a single measure. However, it is important not to err by going too far in the other direction - a relatively complex system can have a less immediate effect on motivating programs in any particular direction (Bartik, 1996). It is also important not to lose sight of the program goals and desired outcomes: the measures that have been chosen must reflect the initial choice of goals.

Part III of this report includes a detailed examination of several potential measures that could be used to assess the performance of state Temporary Assistance for Needy Families (TANF) programs. These include the measures that have been selected for the TANF High Performance Bonus.

Standards

Standards identify expected levels of performance and provide the basis for assessing whether states are achieving program goals and, therefore, should be rewarded or penalized. The standards included in an outcome-based performance measurement system should be challenging yet achievable. Standards appear to be most likely to affect states at the margin - those which are in danger of being penalized or within striking range of receiving a bonus. In a study of JTPA programs, for example, Dickinson and West (1988) found that about 42 percent of the local operating entities they studied tried to maximize their measured performance, one-fourth tried only to slightly exceed their standards, and about one-third tried merely to meet their standards in order to avoid program sanctions. If a standard is set too low, it loses its effectiveness as an incentive for states to improve their performance. If it is set too high, states are likely to be put off by the unreasonable standard. Depending on the consequences for failure, states are likely to either simply give up trying to achieve the unreasonable standard or look for ways to get around it. For example, it appears that the high participation rate requirements for two-parent families on TANF have caused several states to change the way they provide assistance to such families by using state maintenance of effort (MOE) dollars rather than federal TANF funds.⁽³⁾

It is extremely difficult to determine an appropriate standard without baseline data on past performance. When data for a specific measure have never been collected or analyzed before, neither state nor federal policymakers are likely to know what would be a reasonable level of performance. In developing the TANF High Performance Bonus, HHS dealt with this issue by rewarding the top states in each category, rather than by establishing a fixed standard. It is still too early to tell whether this approach of rewarding the top performers will motivate the broad middle range of states to improve their performance. One encouraging sign, however, is that in the first year of the High Performance Bonus, a wide range of states (46) elected to submit data to compete for a bonus on one or more of the four measures.

The national average is another method that is used to set a standard, as is the case with the Quality Control system for the Food Stamp Program. At the consultation, state representatives expressed opposition to this approach. They objected to not knowing their performance target up front, and to the possibility that a state could find itself penalty-liable in a given year without experiencing any change in its performance, due simply to changes in other states' performance. (The same concerns would apply to a system that penalized the bottom n performers under a measure.)
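The states' objection to a moving target can be illustrated with a short sketch. The example below is in Python and uses hypothetical states and rates; the function name `below_national_average` and the rule that any state below the unweighted average is penalty-liable are assumptions made for illustration, not a description of any program's actual rules.

```python
from statistics import mean


def below_national_average(rates):
    """Return the set of states whose rate falls below the unweighted average."""
    average = mean(rates.values())
    return {state for state, rate in rates.items() if rate < average}


# Hypothetical performance rates (percent) in two consecutive years.
year_1 = {"State A": 40.0, "State B": 36.0, "State C": 30.0}
year_2 = {"State A": 40.0, "State B": 50.0, "State C": 46.0}

liable_year_1 = below_national_average(year_1)
liable_year_2 = below_national_average(year_2)

print("Year 1 below average:", sorted(liable_year_1))
print("Year 2 below average:", sorted(liable_year_2))
```

State A's rate is identical in both years, yet it falls below the average in the second year solely because the other two states improved.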

One important issue is whether to establish a single nationwide performance standard for each measure or to adjust standards to account for differences in economic and demographic circumstances among states. In the past, different federal programs have chosen different options. Within TANF, we have examples of both absolute standards (the state work participation rates) and negotiated standards (the participation rates under the Tribal TANF program). JTPA used a regression model that took into account economic and demographic factors to adjust its performance standards for each state and for local areas. WIA provides for negotiated performance standards at both the state and sub-state level. Elements that must be considered in the negotiations process include: how the standards compare to other areas, taking into account economic and demographic factors and program design; the extent to which the standards promote continuous performance improvement; and the extent to which the standards assist the program in achieving a high level of customer satisfaction. (The use of outcome-based performance measures in these and other welfare and workforce development programs is discussed in more detail in Appendix A.)

One of the concerns that has been raised about modifying standards to reflect differences in demographic conditions is that it reduces the incentive for states to provide appropriate services to those populations identified as "hard-to-serve." The TANF program takes a unique approach to this issue with respect to the domestic violence hardship exemption. Section 408(a)(7)(C) of the Social Security Act, as amended by PRWORA, permits states to exempt victims of domestic violence from the time limit and, under regulations implementing that provision, from the work requirements. Individuals receiving an exemption from work participation rates or the time limit due to domestic violence are not removed from the initial calculations. However, if a state fails to meet the work participation rate requirements or exceeds the cap on time limit extensions, and can show that this failure is due to provision of good cause domestic violence waivers, HHS may grant reasonable cause relief from the penalties. States may only receive this relief if they have adopted the Family Violence Option and are providing appropriate services to individuals granted waivers. To date, no state has needed this relief.

Under the current participation rate requirements, similar relief is not provided to states that fail to meet the standards due to exemptions provided to individuals with other barriers to employment, such as mental health issues or substance abuse. In particular, states do not receive credit for engaging recipients in appropriate services that are not among the list of specific countable work-related activities.

A different approach, which does not directly adjust for economic and demographic conditions but has some of the same effects, is to reward states for improvements rather than (or in addition to) absolute levels of performance. This approach was taken by HHS in developing the TANF High Performance Bonus measures. This gives states that have performed poorly in the past a strong incentive to improve, even if they are unlikely to achieve results that place them in the ranks of higher-performing states. Moreover, since demographic conditions do not change very much from year to year, improvements are likely to be caused by changes in program operations rather than by underlying conditions.

While states participating in the consultation were generally receptive to the notion of basing some bonuses on improvement, some expressed concern about standards that have incremental increases each year. A few states that began their welfare reform efforts early on, under waiver policies, felt that they were approaching the maximum realistic levels of work participation and should not be penalized if they did not continue to improve.

States also expressed a great deal of concern at the consultation about rigid thresholds for penalties that create "cliffs" in which a small difference in outcomes could result in the imposition of large penalties. This is a particular concern where the data are believed to be "noisy" and error-prone. The current TANF regulations illustrate one way in which such threshold effects can be minimized - the amount of the penalty assessed for failure to achieve the minimum participation rate requirements is proportional to the degree of the failure. However, there is still a cliff under the statute because a state's failure to meet the participation rate, by whatever margin, results in its "maintenance of effort" (MOE) funding requirement increasing from 75 percent to 80 percent.
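The difference between a proportional penalty and a threshold "cliff" can be sketched as follows. This Python example is a deliberate simplification: the linear penalty formula, the required rate, and the maximum penalty amount are assumptions chosen for illustration and do not reproduce the regulatory calculation. Only the 75 percent and 80 percent MOE levels are taken from the text above.

```python
def proportional_penalty(achieved, required, max_penalty):
    """Assumed linear rule: the penalty scales with the relative shortfall."""
    if achieved >= required:
        return 0.0
    shortfall = (required - achieved) / required
    return max_penalty * shortfall


def moe_requirement(achieved, required):
    """Cliff: any shortfall, however small, raises the MOE level to 80 percent."""
    return 0.75 if achieved >= required else 0.80


required_rate = 35.0            # hypothetical required participation rate (%)
max_penalty = 1_000_000.0       # hypothetical maximum penalty in dollars

for achieved_rate in (35.0, 34.9, 17.5):
    penalty = proportional_penalty(achieved_rate, required_rate, max_penalty)
    moe = moe_requirement(achieved_rate, required_rate)
    print(f"rate {achieved_rate:5.1f}%: penalty ${penalty:,.0f}, MOE level {moe:.0%}")
```

Under these assumptions, a state that misses the standard by a tenth of a percentage point faces only a small proportional penalty but the full five-point increase in its MOE requirement.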

Consequences

Another important issue to consider in designing a performance measurement system is the consequences of meeting - or failing to meet - the established standards or performance targets. While the question has not been formally studied, it is reasonable to assume that the greater the dollar amount of the penalty or bonus, the greater the incentive or deterrent effect. Determining the optimal amount is a challenge. If a limited pool of bonus funds is divided among a large number of measures, all with significant weights, the incentive to perform well on any one measure is likely to be eroded. When a bonus is set at a fixed amount, regardless of the size of the state's basic grant, as was the case for the TANF bonus for reductions in out-of-wedlock births, it is likely to have more of an effect on states with smaller grants, for which the bonus could be quite large in relation to their grant amount. Because of this consideration, the High Performance Bonus awards were allocated to the top performing states in amounts proportional to their TANF block grants. A further consideration is that a penalty can be set too large to be viable.
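The contrast between a fixed bonus and one proportional to the block grant can be shown with a brief sketch. The Python example below uses hypothetical grant and bonus amounts (in millions of dollars); the function name `proportional_awards` and the simple pro-rata rule are assumptions for illustration and omit any caps or other details of the actual award rules.

```python
def proportional_awards(pool, grants):
    """Split a bonus pool among winning states in proportion to their grants."""
    total = sum(grants.values())
    return {state: pool * grant / total for state, grant in grants.items()}


# Hypothetical block grants for two winning states, in millions of dollars.
grants = {"Small State": 50.0, "Large State": 950.0}

# Option 1: every winner receives the same fixed bonus.
fixed_bonus = 20.0
fixed_share = {state: fixed_bonus / grant for state, grant in grants.items()}

# Option 2: a pool of the same total size is divided pro rata.
awards = proportional_awards(pool=40.0, grants=grants)
pro_rata_share = {state: awards[state] / grants[state] for state in grants}

for state in grants:
    print(f"{state}: fixed bonus = {fixed_share[state]:.1%} of grant, "
          f"pro rata bonus = {pro_rata_share[state]:.1%} of grant")
```

With a fixed amount, the bonus is a far larger fraction of the small state's grant; with pro rata allocation, the bonus is the same fraction of each winner's grant.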

It is not clear, in fact, whether it is necessary to attach any financial consequences to an outcome-based performance measurement system. Some have argued that the honor of being singled out as a high performer - or particularly the stigma of being singled out as a poor performer - may be a powerful enough incentive on its own. For example, a substantial amount of attention is paid to the annual Kids Count Data Book, which reports state performance on a wide range of indicators of child well-being. There are also political consequences for states associated with being found penalty-liable or selected for a bonus, regardless of the dollar amount (Dickinson and West, 1988).

State feedback at our consultation suggested that the threat of being penalized was very salient, regardless of the amount of the penalty or whether it was ultimately possible to avoid the penalty through a corrective compliance process. In support of this argument, they noted that attempts to enforce financial penalties in the past have inevitably resulted in expensive and time-consuming administrative and judicial appeals, which have long delayed, if not negated, any actual transfer of funds. Penalties appear to have greater political consequences than bonuses, possibly because of the negative publicity and the great difficulty in finding the funds needed to replace the funds lost as a result of the penalty. (Under TANF, states that are subject to a penalty must replace the withheld funds with "state-only" funds which do not count toward satisfying the maintenance-of-effort requirement.)

There are some circumstances under which financial incentives may even be counterproductive. For example, financial incentives may result in increased "creaming" of participants, avoidance of innovative, but unproven strategies, or even inaccurate data reporting. When stakeholders are reluctant to adopt outcome measures, collecting performance data without financial incentives could relieve some of their concerns.

Across the states, legislatures have come down on both sides of this issue. In some states, budgeting has been linked to performance standards, so that high performing programs - and even individual offices - can receive additional money, while low performers are at risk of losing funding. In other states, there are no financial consequences attached to the performance standards, but the results are widely disseminated each year and used to provide feedback in order to improve program operations (Hatry, 1999; Horsch, 1996(a); Schilder, 1998; Yates, 1997). The data can help public managers and service providers make decisions and monitor progress toward specific goals. Coupled with program evaluation data, performance measures can potentially be used to assess service strategies, determine why results were achieved or not, and decide how programs need to be changed.

An additional factor must be considered when a new performance measurement system is adopted. As discussed above, when data for a new measure are first collected, in many cases, states will have little ability to predict their performance in advance - either because the program is new and there is no past performance, or because the data collection requirement is new and there are no baseline data. This uncertainty about performance levels appears to have very different consequences depending on whether a bonus or a penalty is involved.

In the context of penalties, performance uncertainty appears to lead to highly risk-averse behavior. For example, in defining work activities in which welfare recipients could participate, many states initially restricted the permissible activities to those that could be counted toward the federal work participation rate. Now that a few years of data are available, many states have discovered that they are in no danger of being penalized and have expanded the range of activities they allow for participants. Some states now include, for instance, educational activities not directly related to employment (including high school and equivalency programs, basic and remedial education, English as a Second Language, and post-secondary education), which counted toward the participation requirement under JOBS, among the permissible activities for TANF participants when determined appropriate.

In the context of bonuses, uncertainty appears to lead to a "wait-and-see" attitude. Without a solid idea of either how much effort is needed to achieve a certain level of performance or the potential payoff (including the size of the bonus), some states may be unwilling to invest much effort or money in order to improve their ratings. For example, in many cases, the states that received bonuses in the first year of the High Performance Bonus were those that had made investments in work and work supports even before the interim performance criteria were announced. It would not be surprising to see other states - particularly those that were close to receiving bonuses - now begin to make or expand their investments in these areas.

One possible means of mitigating the negative consequences of this asymmetry would be to implement a new measurement system in phases, beginning first with bonuses for high performers and adding penalties only after several years of experience with the measures, when more information would be available to use in setting standards. This approach was recommended by a participant in the consultation in post-consultation correspondence.

Endnotes

2. In program evaluation literature, the impacts of a program are defined as the differences between the average outcomes of a group that participated in the program and the average outcomes the group would have achieved had it not participated. In a formal evaluation, this comparison is most reliably estimated by randomly assigning individuals to an experimental group that participates in the program or to a control group that does not and comparing their outcomes. Because the experimental and control groups are randomly assigned, any differences in their outcomes can be assumed to be caused by the program being evaluated.

3. In FY 1999, 15 states or territories did not serve two-parent families under the TANF program. They either served two-parent families entirely through separate state programs, so the TANF two-parent participation requirements did not apply, or did not serve two-parent families at all (HHS, 2000(a)).

Examination of Selected Outcome Measures

In this section, we consider selected outcome measures that could potentially be used to measure the performance of states' Temporary Assistance for Needy Families (TANF) programs. In order to assess the feasibility of using these alternative measures to gauge the success of state TANF programs, we examine several issues for each potential measure:

What the indicator will measure and its relationship to the goals of the TANF program;
Measurement issues that should be considered when defining how the indicator should be calculated;
A brief discussion of the availability, quality, and timeliness of data to calculate the measure, particularly on a state-by-state basis (see Appendix D for more detail on a variety of potential data sources); and
Issues related to the "fairness" of the measure, i.e., whether states are able to perform on equal footing with other states, and whether the TANF agency can reasonably be expected to affect the outcome. (The general questions of whether outcome-based performance measures accurately reflect program effectiveness and of how to adjust standards to compensate for differences in economic and socio-demographic factors are not discussed here.)

The measures considered below are not a comprehensive listing of possible alternative outcome measures, but are a representative sample of measures which focus on the goals that were of high priority to most participants in the consultation process: employment, child and family well-being, and the formation and stability of two-parent families. Within the context of these goals, we selected measures that seemed both salient and possible. In some cases, the proposed measures may be more accurately described as interim outcome measures or output measures, rather than true outcome measures. Several are based upon those measures being used for the TANF High Performance Bonus. We did not include measures that are already being used for other bonuses, such as the child support incentives, because the participants in the consultation generally agreed that it did not make sense to credit (or punish) states twice for the same performance under separate systems.

Table 1:
Potential Alternative Performance Measures for the TANF Program
Potential Performance Measure	Primary Data Source(s)*
Employment Related Measures
Job Entry Rate for TANF Recipients	UI Records (linked to TANF administrative records)
Employment Retention Rate for TANF Recipients	UI Records (linked to TANF administrative records)
Earnings Gains for TANF Recipients	UI Records (linked to TANF administrative records)
Percentage of Those Required to Work with Earnings	UI Records (linked to TANF administrative records)
Recidivism Rate for TANF Leavers	Linked state TANF administrative records
Measures of Child and Family Well-Being
Food Stamp Receipt: Percent of families eligible for Food Stamps that receive them; percent of poor children who are in working families that receive Food Stamps; percent of former TANF recipients receiving Food Stamps	American Community Survey, Food Stamp administrative data (linked to TANF administrative records)
Medicaid/SCHIP Receipt: Percent of families eligible for Medicaid/SCHIP that receive the benefit; percent of poor children who are in working families that receive Medicaid/SCHIP; percent of former TANF recipients receiving Medicaid/SCHIP	American Community Survey, Medicaid/SCHIP administrative data (linked to TANF administrative records)
Child Care Affordability and Quality: Percent of eligible children receiving child care subsidies	Child Care Development Fund administrative data, linked to American Community Survey
Receipt of TANF and other types of transitional assistance by needy families	American Community Survey, state administrative data
Extreme Poverty Rate	American Community Survey
Family Formation/Stability Measures
Percentage of Children Living in Married Couple Homes	American Community Survey
Out-of-Wedlock Birth Rate for TANF Families	State TANF administrative data, National Center for Health Statistics
* See Appendix D for a description of the characteristics of various survey and administrative data sources.

Employment-Related Measures

Employment is one of the key objectives of the TANF program. At the federal level, the statute governing the TANF program explicitly defines ending dependence on government programs through the promotion of work as one of the four goals of the program. This is also one of the top priorities for states under TANF; when given the opportunity to compete for bonuses based on employment-related measures in FY 1999, a total of 46 states chose to do so. Because it is such a fundamental goal of TANF, multiple performance measures related to a state's success in helping individuals find and keep employment are discussed in this section.

Job Entry Rate

The job entry rate would measure the proportion of the unemployed TANF adult caseload that obtained a job. This measure gauges the success of states in achieving one of the key goals of the TANF program - moving individuals into employment. The job entry rate is one of the measures used by HHS to award the TANF High Performance Bonus. (The data used for making the FY 1999 awards are shown in Table 2.) In addition, the job entry rate has been used as a performance measure for workforce development programs operating under the Job Training Partnership Act (JTPA) and its successor, the Workforce Investment Act (WIA).

Table 2.
Job Entry Rate
(Data Reported for FY 1999 High Performance Bonus Awards)
State	1997 Rate	1997 Rank	1998 Rate	1998 Rank	97-98 % Improvement	Improvement Rank
Alabama	40.63	17	41.72	22	2.67	30
Alaska	*	*	48.78	13	*	*
Arizona	45.96	12	47.73	15	3.87	26
Arkansas	39.08	18	41.36	24	5.83	22
California	31.51	34	33.66	37	6.82	20
Colorado	34.14	30	36.84	32	7.89	16
Connecticut	27.32	38	24.40	41	-10.71	37
Delaware	52.79	6	62.71	2	18.80	8
Dist. of Columbia	21.34	41	23.58	43	10.50	12
Florida	27.73	37	28.65	40	3.29	28
Georgia	37.00	24	38.12	27	3.05	29
Hawaii	21.68	40	18.82	46	-13.22	39
Idaho	*	*	*	*	*	*
Illinois	47.45	10	52.37	10	10.37	14
Indiana	*	*	88.41	1	*	*
Iowa	37.83	20	40.41	25	6.84	19
Kansas	45.47	13	44.71	18	-1.66	34
Kentucky	34.35	29	37.22	31	8.35	15
Louisiana	35.92	27	49.31	12	37.27	2
Maine	*	*	*	*	*	*
Maryland	32.33	32	33.47	38	3.52	27
Massachusetts	31.04	35	35.45	36	14.20	9
Michigan	42.33	14	46.99	16	11.02	11
Minnesota	37.56	21	45.40	17	20.90	7
Mississippi	41.55	16	36.71	33	-11.65	38
Missouri	37.03	23	36.22	34	-2.18	35
Montana	*	*	42.84	20	*	*
Nebraska	*	*	*	*	*	*
Nevada	48.91	9	61.48	5	25.70	4
New Hampshire	38.77	19	36.02	35	-7.09	36
New Jersey	35.39	28	37.26	30	5.28	23
New Mexico	*	*	*	*	*	*
New York	27.78	36	30.68	39	10.47	13
North Carolina	37.25	22	38.10	28	2.28	32
North Dakota	59.80	2	62.36	4	4.29	25
Ohio	*	*	24.20	42	*	*
Oklahoma	33.22	31	42.56	21	28.09	3
Oregon	24.23	39	20.07	44	-17.17	40
Pennsylvania	54.79	5	58.77	6	7.27	17
Rhode Island	36.48	26	41.55	23	13.89	10
South Carolina	41.57	15	44.55	19	7.17	18
South Dakota	32.15	33	39.62	26	23.21	5
Tennessee	59.43	3	62.43	3	5.04	24
Texas	51.65	7	54.84	9	6.17	21
Utah	56.95	4	56.32	8	-1.11	33
Vermont	47.02	11	48.15	14	2.40	31
Virginia	*	*	*	*	*	*
Washington	36.74	25	51.27	11	39.52	1
West Virginia	16.57	42	20.04	45	20.93	6
Wisconsin	49.12	8	37.41	29	-23.84	41
Wyoming	82.39	1	57.72	7	-29.94	42
* State not participating
The job entry rate is the unduplicated number of adult recipients who entered employment for the first time in a given year (job entries) as a percent of the total unduplicated number of recipient adults unemployed for the first time in that year. Adult recipients participating in workfare or fully subsidized employment are not included in the numerator but are included in the denominator.
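
As an illustration only, the calculation described in the note above could be sketched in Python as follows. The function names and the counts are hypothetical. The report does not define how the "97-98 % Improvement" column is computed; the formula below (change relative to the absolute value of the prior-year rate) is an assumption. Because published figures may have been computed from unrounded rates, recomputing them from the rounded table values can differ in the second decimal place.

```python
def job_entry_rate(job_entries: int, unemployed_adults: int) -> float:
    """Job entries as a percent of unduplicated unemployed adult recipients.

    Both counts are assumed to be already unduplicated. Per the table note,
    workfare and fully subsidized employment are left out of job_entries
    but the adults concerned remain in unemployed_adults.
    """
    if unemployed_adults <= 0:
        raise ValueError("unemployed_adults must be positive")
    return 100.0 * job_entries / unemployed_adults


def percent_improvement(prior_rate: float, current_rate: float) -> float:
    """Year-over-year change as a percent of the prior-year rate.

    Assumption: the denominator is the absolute value of the prior rate,
    so that a move from a negative rate toward zero counts as improvement.
    """
    if prior_rate == 0:
        raise ValueError("prior_rate must be nonzero")
    return 100.0 * (current_rate - prior_rate) / abs(prior_rate)


# Hypothetical state: 4,000 job entries among 10,000 unemployed adults.
print(job_entry_rate(4000, 10000))  # 40.0

# Rounded Nevada rates from Table 2 (48.91 in 1997, 61.48 in 1998).
print(round(percent_improvement(48.91, 61.48), 2))
```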

Measurement issues. When defining the job entry rate, it is necessary to determine whether both unsubsidized and subsidized employment will count toward the rate. Many programs that use job entry as a performance measure count all employment that is not fully subsidized, because doing so is perceived as more aligned with the goals of welfare-to-work programs. In the interest of minimizing the data reporting burden, under the TANF High Performance Bonus final rule, HHS is counting all employment, whether or not subsidized.

Another measurement issue is determining which individuals to include in the rate. Should only those who receive benefits counting as "assistance" under the TANF rules be included, or should recipients of other types of services be counted? What about applicants who are diverted from receiving assistance? If diverted applicants are not counted, states with successful diversion programs will actually be lowering their job entry rate. Because there is no standardized definition of what constitutes an application, it would be extremely difficult to develop a measure that is comparable across states.

A related question is whether to count only individuals who leave cash assistance when they find a job or to also include those who remain on aid while working. Because individuals in states with low grant amounts are more likely to leave cash assistance when they find a job, states are treated more equitably if the rate counts all individuals who move into employment regardless of whether they leave cash assistance. Allowing only individuals who leave cash assistance for employment to count toward the rate may also give states incentives to reduce their grant levels or their earnings disregards. For these reasons, the job entry rate for the TANF High Performance Bonus allows states to count all individuals moving into employment, whether or not they are receiving cash assistance. However, finding work that ends dependence on cash assistance remains the ultimate goal of the TANF program.

A more expansive approach could examine work participation for a broader range of low-income families, not limited to those who have received TANF benefits. Such an approach would reward states that used their flexibility under TANF to serve a range of low-income families. However, this measure would probably be more affected by underlying economic conditions than by any action taken by the state's TANF program.

Data issues. The job entry rate could be measured on a state-by-state basis through several sources: state TANF administrative data, surveys, and Unemployment Insurance (UI) wage records. Of these sources, UI wage records, collected by state employment security agencies, are preferable because they provide high-quality employment data at a relatively low cost. However, they have two important limitations. First, they do not cover all jobs in a state, excluding such jobs as self-employment, agricultural employment, employment by the federal government or military, and jobs outside state boundaries. Second, UI records track only total quarterly earnings; they cannot be used to calculate hourly wages. Historically, state administrative data on cash assistance recipients who find jobs have been of varying quality, in part because recipients often do not notify the welfare department when they find a job. Because data on job entry based on state TANF administrative records may vary in quality from state to state, it may be difficult to rank or evaluate state performance based on this source. Surveys may provide more accurate data, but they are generally too expensive for states to rely on for ongoing data needs.

Because of these concerns, for the 1999 and 2000 awards, states were allowed to submit the best data they had available for the interim High Performance Bonus measures, regardless of the source. Thus states could use matches with UI data, surveys, administrative records, or a combination of these data sources. In general, in 1999, states opted to use linked UI data, in some cases supplemented with information from administrative data regarding jobs not covered by the UI system. Under the final High Performance Bonus regulation, HHS has decided to calculate the work-related measures by linking information on TANF recipients with data from the National Directory of New Hires (NDNH), which combines the information in state UI databases with information from federal agency personnel offices. This minimizes the reporting burden on state agencies and ensures that the measures will be calculated based on data that are consistent across states.

Fairness issues. Data available to date for this measure (see Table 2 for FY 1998 data) show a great deal of variation in the job entry rates achieved. (For FY 1998, rates ranged from under 20 percent to almost 90 percent, with most states achieving rates between 30 and 50 percent.) This variation could be attributed to any one or a combination of several factors, including the usual differences in economic conditions, the fraction of employment that is covered by UI in the state, and/or each state's stage of welfare reform implementation at the time of measurement. With respect to the last factor, states that had taken earlier aggressive steps to move recipients into work may have found that those recipients remaining unemployed faced substantial barriers to employment and thus were harder to place.

Employment Retention Rate

The employment retention rate would measure the length of time TANF recipients who found jobs stayed employed. In order to help individuals maintain self-sufficiency - a key objective of the TANF program - it is critical that states help individuals stay in their jobs or find another job quickly if they lose their initial job. Job retention is one component of the Success in the Workforce measure used under the TANF High Performance Bonus. (The data used for making the FY 1999 awards are shown in Table 3.)

Table 3.
Job Retention Rate
(Data Reported for FY 1999 High Performance Bonus Awards)
State	1997 Rate	1997 Rank	1998 Rate	1998 Rank	97-98 % Improvement	Improvement Rank
Alabama	*	*	*	*	*	*
Alaska	*	*	79.78	15	*	*
Arizona	81.93	8	83.12	8	1.45	8
Arkansas	78.95	20	79.13	19	0.22	13
California	83.94	5	84.61	3	0.80	10
Colorado	75.55	32	76.04	31	0.65	11
Connecticut	86.21	1	85.06	2	-1.33	26
Delaware	77.10	29	75.70	32	-1.81	29
Dist. of Columbia	75.09	33	69.69	39	-7.19	37
Florida	72.84	36	79.37	16	8.97	1
Georgia	73.36	35	67.93	41	-7.40	38
Hawaii	84.71	4	86.45	1	2.05	6
Idaho	*	*	*	*	*	*
Illinois	82.42	7	82.79	10	0.46	12
Indiana	*	*	83.48	6	*	*
Iowa	82.84	6	82.91	9	0.08	14
Kansas	77.44	28	76.58	29	-1.11	23
Kentucky	59.64	40	59.37	43	-0.46	21
Louisiana	66.39	38	57.19	44	-13.87	41
Maine	*	*	*	*	*	*
Maryland	76.17	30	73.82	35	-3.09	32
Massachusetts	77.99	25	73.21	37	-6.12	36
Michigan	78.84	21	78.74	20	-0.13	17
Minnesota	80.70	10	83.50	5	3.47	4
Mississippi	77.69	26	77.55	25	-0.19	18
Missouri	78.53	22	77.29	26	-1.58	27
Montana	*	*	67.46	42	*	*
Nebraska	*	*	*	*	*	*
Nevada	77.62	27	76.77	28	-1.10	22
New Hampshire	78.34	23	74.98	33	-4.29	34
New Jersey	80.24	13	76.99	27	-4.05	33
New Mexico	*	*	*	*	*	*
New York	81.86	9	83.25	7	1.70	7
North Carolina	79.56	16	78.62	21	-1.19	25
North Dakota	72.39	37	71.22	38	-1.62	28
Ohio	*	*	76.23	30	*	*
Oklahoma	62.52	39	68.11	40	8.93	2
Oregon	79.88	15	78.41	22	-1.84	30
Pennsylvania	79.55	17	79.30	17	-0.32	19
Rhode Island	80.04	14	82.16	11	2.64	5
South Carolina	80.41	12	81.18	13	0.96	9
South Dakota	75.62	31	74.09	34	-2.02	31
Tennessee	80.49	11	80.52	14	0.03	15
Texas	78.16	24	77.82	24	-0.44	20
Utah	74.12	34	73.26	36	-1.17	24
Vermont	79.21	18	79.17	18	-0.05	16
Virginia	*	*	*	*	*	*
Washington	78.96	19	83.81	4	6.14	3
West Virginia	59.21	41	53.41	45	-9.81	40
Wisconsin	85.57	2	78.15	23	-8.67	39
Wyoming	85.44	3	81.32	12	-4.82	35
* State not participating
The job retention rate is the average of the sum of the unduplicated number of employed adult recipients in one quarter who were also employed in the first subsequent quarter, as a percent of the sum of the unduplicated number of employed adult recipients in each quarter. (At this point they might be former recipients.) Adult recipients participating in workfare or in fully subsidized employment are not included in either the numerator or the denominator.
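
The pooled calculation in the note above could be sketched as follows. This is a minimal illustration, not the official computation: the function name, the use of sets of person identifiers, and the sample data are all hypothetical, and workfare and fully subsidized employment are assumed to have been excluded before the sets are built.

```python
def job_retention_rate(cohorts: list[set[str]],
                       employed_next: list[set[str]]) -> float:
    """Pooled share of employed recipients still employed one quarter later.

    cohorts[i]       -- IDs of adult recipients employed in quarter i
    employed_next[i] -- IDs of anyone with earnings in the quarter after
                        quarter i (they may be former recipients by then)
    """
    if len(cohorts) != len(employed_next):
        raise ValueError("one follow-up quarter is needed per cohort")
    base = sum(len(cohort) for cohort in cohorts)
    if base == 0:
        raise ValueError("no employed recipients in any quarter")
    retained = sum(len(cohort & nxt)
                   for cohort, nxt in zip(cohorts, employed_next))
    return 100.0 * retained / base


# Hypothetical four-quarter example.
cohorts = [{"a", "b", "c", "d"}, {"a", "b", "e"}, {"b", "e", "f"}, {"b", "f"}]
employed_next = [{"a", "b", "e"}, {"b", "e", "f"}, {"b", "f", "g"}, {"b"}]

# 7 of the 12 person-quarters are followed by a quarter with earnings.
print(round(job_retention_rate(cohorts, employed_next), 2))  # 58.33
```

Because the inputs record only whether a person had earnings in a quarter, the sketch measures continued employment, not continuation in the same job.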

Measurement issues. When defining a retention measure, it is necessary to consider whether the focus is on job retention - the amount of time an individual stays in a specific job - or employment retention - the amount of time an individual remains employed regardless of whether it is the same job or a different job. It may be preferable to focus on employment retention because, in some instances, a change to a different job may be a move to a better job.

Another issue in defining how to calculate the employment retention rate is determining how long individuals will have to remain employed in order to be counted in the rate. The TANF program is likely to be more connected to individuals during the initial period after they find jobs and may have more influence over employment retention in this early period. However, longer periods of retention are clearly more meaningful and desirable. Different retention period choices have been made in the past under various programs. The TANF High Performance Bonus for FYs 1999 and 2000 requires that individuals who find employment in one quarter retain employment into the next quarter. Beginning in the third year of this bonus (FY 2001), the "job retention" measure will be based on three consecutive quarters of employment. The WIA program measures employment retention six months after an individual initially finds a job. The JTPA program used a measure that combined job entry and retention by measuring the employment rate 13 weeks after program exit.

Data issues. The employment retention rate could be most easily measured on a state-by-state basis through UI wage records or at a national level through the NDNH. As discussed above, these records provide relatively comparable and timely data across states at a relatively low cost. States are currently not required to collect information on job retention for the TANF program's reporting requirements, and it would be costly for them to track such individuals. While the JTPA program measured job retention by conducting a survey 13 weeks after individuals left the program, in part because of the cost of this data collection, the successor WIA program will use UI data to calculate employment retention.

While UI or NDNH records can be used to determine whether an individual was employed in several consecutive quarters, they do not indicate whether the individual is in the same job, whether they left their initial job and found a subsequent job, and/or whether there was a break in employment (unless the break covered an entire quarter). Thus, these data can be used only to provide measures of employment retention, rather than job retention, over a certain number of quarters, but they will not capture certain breaks in employment.

Earnings Gains

The earnings gains measure would reflect the increase in earnings over a specified time period for employed TANF recipients. Individuals who experience earnings gains are more likely to sustain their self-sufficiency over the long run and to escape poverty. Earnings gains are included as a performance measure for the TANF High Performance Bonus (the data used for making the FY 1999 awards are shown in Table 4) and the WIA program.

Table 4.
Earnings Gain Rate
(Data Reported for FY 1999 High Performance Bonus Awards)
State	1997 Rate	1997 Rank	1998 Rate	1998 Rank	97-98 % Improvement	Improvement Rank
Alabama	*	*	*	*	*	*
Alaska	*	*	16.93	39	*	*
Arizona	48.45	5	43.04	6	-11.16	27
Arkansas	27.41	26	20.53	35	-25.08	37
California	23.93	29	21.87	32	-8.62	24
Colorado	47.30	7	42.38	8	-10.39	26
Connecticut	25.50	27	24.50	31	-3.92	15
Delaware	35.40	16	27.03	27	-23.63	36
Dist. of Columbia	19.39	35	29.60	24	52.62	1
Florida	40.50	9	38.53	11	-4.86	19
Georgia	31.30	20	31.79	21	1.58	11
Hawaii	10.69	39	3.91	43	-63.41	40
Idaho	*	*	*	*	*	*
Illinois	21.75	32	20.96	34	-3.62	14
Indiana	*	*	25.50	29	*	*
Iowa	28.86	25	27.53	26	-4.63	18
Kansas	62.50	2	54.31	2	-13.10	30
Kentucky	30.37	23	24.50	30	-19.33	33
Louisiana	23.38	31	21.32	33	-8.80	25
Maine	*	*	*	*	*	*
Maryland	37.52	13	42.96	7	14.52	5
Massachusetts	13.82	37	9.04	42	-34.61	39
Michigan	32.26	18	33.15	19	2.79	10
Minnesota	47.63	6	40.27	9	-15.46	31
Mississippi	37.64	12	30.10	22	-20.03	34
Missouri	36.20	14	33.95	18	-6.21	21
Montana	*	*	48.99	4	*	*
Nebraska	*	*	*	*	*	*
Nevada	31.45	19	32.65	20	3.81	8
New Hampshire	56.47	3	54.24	3	-3.95	16
New Jersey	24.88	28	19.20	36	-22.84	35
New Mexico	*	*	*	*	*	*
New York	15.14	36	15.58	40	2.90	9
North Carolina	36.09	15	36.50	14	1.15	12
North Dakota	55.15	4	40.04	10	-27.41	38
Ohio	*	*	29.98	23	*	*
Oklahoma	-5.59	40	-10.79	44	-92.97	41
Oregon	43.52	8	43.29	5	-0.54	13
Pennsylvania	19.49	34	17.06	37	-12.46	28
Rhode Island	20.95	33	17.04	38	-18.69	32
South Carolina	32.46	17	35.70	16	9.98	6
South Dakota	63.81	1	67.98	1	6.54	7
Tennessee	30.17	24	26.28	28	-12.88	29
Texas	39.96	10	36.54	13	-8.56	23
Utah	30.50	22	38.51	12	26.28	4
Vermont	30.68	21	29.12	25	-5.08	20
Virginia	*	*	*	*	*	*
Washington	12.80	38	12.27	41	-4.21	17
West Virginia	-26.74	41	-15.46	45	42.17	3
Wisconsin	23.50	30	34.63	17	47.34	2
Wyoming	38.82	11	36.34	15	-6.40	22
* State not participating
The earnings gain rate is the sum of the gain in earnings between the initial and second subsequent quarter in each of quarters 1 through 4 of the FY for the adult recipients employed in both these quarters, as a percent of the sum of their initial earnings in each of quarters 1 through 4. (At this point they might be former recipients.) Earnings gains of adult recipients participating in workfare or in fully subsidized employment are not included in either the numerator or the denominator.
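
A minimal sketch of this calculation follows, assuming that each record pairs a person's earnings in an initial quarter with earnings in the second subsequent quarter, and that records from all four quarters of the fiscal year are pooled. The function name and the dollar amounts are hypothetical.

```python
def earnings_gain_rate(pairs: list[tuple[float, float]]) -> float:
    """Pooled earnings gain as a percent of pooled initial earnings.

    Each pair is (initial quarter earnings, earnings two quarters later).
    Only people with earnings in both quarters are kept, as the note requires.
    """
    both = [(first, later) for first, later in pairs if first > 0 and later > 0]
    initial_total = sum(first for first, _ in both)
    if initial_total == 0:
        raise ValueError("no one was employed in both quarters")
    gain_total = sum(later - first for first, later in both)
    return 100.0 * gain_total / initial_total


# Hypothetical records; the third and fourth lack earnings in one quarter.
pairs = [(1000, 1500), (2000, 2200), (1500, 0), (0, 900), (3000, 2700)]

# Gains of 500 + 200 - 300 = 400 on initial earnings of 6,000.
print(round(earnings_gain_rate(pairs), 2))  # 6.67
```

The rate is negative whenever pooled earnings fall between the two quarters, which is how a state can report a negative value in Table 4.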

Measurement issues. An important question is whether to measure earnings gains (which may be due to any of a number of factors - including an increase in the hourly wage rate, an increase in hours worked, or an increase in the number of days or weeks employed - or a combination of these) or to focus on an increase in hourly wages. Experimental studies have found that welfare-to-work programs have produced most of their earnings gains through increases in the amount of employment rather than through increases in hourly wages.

Another measurement issue relates to the time period over which earnings gains will be measured. The TANF program is likely to be more directly connected to individuals during the initial period after they find jobs, and may have more influence over recipient experiences in this early period. On the other hand, individuals who find jobs may have to spend some time in the labor force before they experience earnings gains. Welfare-to-work programs that use earnings gains as a performance measure typically do so over a relatively short time period. The TANF High Performance Bonus measures earnings gains between one quarter and the second subsequent quarter. The WIA program measures earnings gains over the six months after entry into unsubsidized employment, as compared to earnings in the period prior to WIA program participation.

Data issues. Like the employment retention rate, the earnings gains measure could most feasibly be calculated on a state-by-state basis through UI or NDNH records. While states usually have detailed earnings information in their TANF administrative data for current recipients, they do not systematically track the earnings of former recipients because it would be relatively expensive to collect. Thus, UI or NDNH records provide the highest quality data at the lowest cost.

Using UI or NDNH data, however, means that earnings progression can only be measured between one three-month period (quarter) and a subsequent one. These records do not provide data on the number of hours worked, and so they cannot be used to measure wage gains. Moreover, because individuals may start or stop working in the middle of a quarter, earnings in a quarter may reflect a period of unemployment rather than a job with low pay or few hours. This is particularly likely to affect the first quarter of earnings.

Percentage of Those Required to Work with Earnings

This measure reports on the proportion of the TANF adult caseload that is required to work and that has earnings. This measure differs from those discussed above because it would measure the employment level only for those TANF recipients who are required to work and would include both individuals who find jobs as well as those who were already working. Because it includes all individuals on the caseload who are working, it is closely related to the current participation rate.

The statute governing the TANF program allows states to exempt certain individuals from the work participation rate - primarily individuals with a child under age one and disabled individuals in two-parent families - while the remainder of the caseload is required to work. In addition, states with a work program waiver in effect prior to the enactment of TANF are allowed to continue with their prior (and broader) exemption policies. Because it includes only individuals whom the state is attempting to move into employment, this type of measure may more accurately reflect a state's success in moving individuals into work compared to a job entry rate based on the entire adult caseload.

Measurement issues. A determination would have to be made whether those considered "required to work" would be measured according to the definition provided in the federal TANF statute or whether states operating under waivers with broader exemption policies could use their own definitions. Both definitions are problematic in some respect. If the federal definition is used, states operating under pre-existing waivers with broader exemption policies of their own may not perform as well on the measure because some of their exempt cases would technically be "required to work" under the federal statute. Similarly, if each state used its own definition of "required to work," it would be difficult to compare outcomes across states because of the differences in the populations being served.

Data issues. Like the other work-based measures, UI or NDNH records are the best source of data for this measure. For this measure the TANF adult caseload that is required to work (rather than the entire adult caseload) would be matched to UI or NDNH records. Since the final TANF regulations require states to track whether individuals are required to work, this should not impose a new or significant burden.

Fairness issues. There are a number of factors that would have uneven effects on state performance on this measure. As discussed above, the existing differences in state exemption policies because of pre-existing waiver policies create an unequal playing field for this measure. It may not be possible to compare state performance equitably given the differences in their current exemption policies. In addition, the generosity of earnings disregards and cash grant levels affect the ability and willingness of individuals to combine work and welfare when they find jobs. Finally, like the other employment-related measures discussed above, the health of a state's economy and the relative number of hard-to-serve (i.e., disadvantaged) individuals in its caseload will affect performance on this type of measure.

Recidivism Rate

The recidivism rate measures the extent to which individuals who leave TANF cash assistance return to the program within some specified time period. Because a key goal of the TANF program is to promote sustained self-sufficiency by ending dependence on assistance, the recidivism rate is an important measure of a state's success in achieving this goal.

Measurement issues. In defining the recidivism rate, it is necessary to specify the population that will be included. Recipients leave TANF assistance for a variety of reasons, including employment, full-family sanctions, time limits, and other reasons. If recipients leaving TANF for all of these reasons are included, a state's performance on this measure is likely to be more heavily affected by state policies than by its success in promoting self-sufficiency. For example, a state with many case closures due to a lifetime time limit is likely to have a relatively low recidivism rate. One way to try to level the playing field might be to limit this measure to those recipients who leave cash assistance due to employment. However, this approach raises its own set of problems; research shows that many more welfare recipients are employed after leaving welfare than administrative data records show as having closed their cases due to employment.

It may also be appropriate to limit the measure to cases that have closed for at least a minimum time period, in order to exclude those cases closed and then re-opened due to administrative churning (e.g., when a case is closed for failing to provide required information and re-opened a few days later when the client brings in the needed documentation). For example, in HHS-funded studies of TANF recipients who left assistance, state grantees were encouraged to define "leavers" as cases closed for at least two months, in order to minimize the effects of churning.

Another issue is that the members of a case may not all move on or off welfare together. For example, a case may be closed due to a full-family sanction, but the children in the case may later move into the custody of another family member and may receive assistance as a child-only case. A decision must be made whether recidivism means the return to welfare of the case head, of any adult in the case, or any member of the household, including children.

Data issues. At this point in time, state administrative records are the only reliable data source for measuring a recidivism rate. Because states routinely track the months that individuals receive TANF assistance, state administrative data should be of relatively high quality and available on a timely basis. While states are required to track the reasons individuals leave assistance, this measure would be somewhat more complex for states to calculate if only certain types of "leavers" were included in the rate (for example, if those leaving because of time limits or sanctions had to be distinguished from other leavers) than if all leavers were included.

Fairness issues. It may be difficult to develop a recidivism measure that treats states fairly and equitably. As noted above, state need standards and grant levels affect the ability and willingness of recipients who work part-time or at low wages to continue to receive assistance. This in turn affects an individual's job experience level (and the likelihood of finding another job quickly) when they leave assistance, as well as the relative attractiveness of returning to assistance if a job is lost. For example, individuals who leave welfare in a state with relatively high grant levels and/or generous income disregards may be less likely to return to the rolls - regardless of the effectiveness of the state's welfare-to-work program - because they are working at relatively high wages or for a greater number of hours at the time they leave welfare. Conversely, the higher grant level may make it more worthwhile to return to assistance if a job is lost. In a state with low grant levels and/or low earnings disregards, on the other hand, individuals will be forced to leave cash assistance when they find jobs even if they are low-wage, part-time, or temporary. With relatively less job experience, this group may be more likely to return to cash assistance over the long run, or they may feel it is not worth the hassle of returning to assistance for a small grant amount. Differences in state sanction policies and time limit rules will also affect how states perform on this type of measure. In addition, those states with a full-family sanction policy will have a different group in their "leaver" population than a state that only removes the adult from the case when a sanction is enacted.

Measures of Child and Family Well-Being

The well-being measures address the first TANF goal of providing assistance to needy families so that children may be cared for in their own homes. They assess state performance in taking action to ensure that low-income working families continue to receive the supports they need so that they may provide food, health care, child care and other basic needs for their children. These measures also address the goal of ending the dependence of needy parents on government benefits by promoting job preparation and work, because in many cases, low-income parents need aid from government programs - particularly health care coverage, assistance in purchasing food, and support in paying for child care - in order to work. Assistance from these programs helps make it possible for families to move off welfare into employment and to progress on the job toward eventual full independence.

Measures of child and family well-being, while carefully monitored at the national level, have traditionally been difficult to measure at the state level. The Current Population Survey (CPS) is currently the best source of annual data for well-being measures at the national level, but its comparatively small sample sizes limit its ability to measure state-specific outcomes except in a few, large states. Various small area estimation techniques are currently used by the Census Bureau and others to produce reliable state-by-state estimates, including combining and averaging three or four years of data into "moving averages." Unfortunately, moving averages make it difficult to track improvement or declines in state performance over short periods of time.
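
The smoothing effect of a moving average can be seen in a small sketch. The annual figures below are hypothetical and do not come from the CPS; the function name is likewise illustrative.

```python
def moving_average(values: list[float], window: int = 3) -> list[float]:
    """Average each run of `window` consecutive annual estimates."""
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [sum(values[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(values))]


# Hypothetical single-year estimates of a state rate (percent).
annual = [20.0, 14.0, 17.0, 11.0]

print(moving_average(annual))  # [17.0, 14.0]
```

In this example the single-year estimate falls by six points in the final year, from 17 to 11, while the three-year average falls by only three, from 17 to 14.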

If it is implemented as planned, the American Community Survey (ACS) would be the preferred data source for well-being measures. We anticipate that nationwide data appropriate to calculate state-by-state performance measures will be available for 2000. Once it is in full operation, the ACS (based on the decennial census long form) will be available every year for areas and population groups of 65,000 or more. It should be noted that use of any measure that relies on the ACS is contingent upon the continued availability of the new Census Bureau data. Appendix D includes descriptions of these and other national surveys.

Food Stamp Receipt

An important indicator of well-being is whether all members of a family have regular access to food and are free from hunger. Ideally, we would measure access to food directly, such as through the food insecurity and hunger scale developed by the U.S. Department of Agriculture. However, while this scale is administered annually as a supplement to the Current Population Survey, the sample sizes are not large enough to track changes at the state level. One possible alternative is to use an intermediate outcome measure, such as participation in the Food Stamp Program. The Food Stamp Program is an entitlement program that is available to help all low-income families who meet the national eligibility standards, including families receiving cash assistance and working families, to purchase food for an adequate diet. Several measures could potentially be used to gauge states' success in ensuring that eligible families receive Food Stamps. These include: (1) the percentage of families eligible for Food Stamps that receive them; (2) the percentage of poor children in working families who are receiving Food Stamps; and (3) the percentage of former TANF recipients receiving Food Stamps; as well as variations on these alternatives.

Under the High Performance Bonus, HHS has chosen to award bonuses based on a variation of option 2. Beginning in FY 2002, bonuses will be provided to the three states with the greatest percentage of low-income working households in the state receiving Food Stamps and to the seven states with the greatest percentage point improvement in the same measure. For this purpose, low-income working households would be defined as households with children under the age of 18 which have an income of less than 130 percent of poverty and earnings equal to at least half-time, full-year employment at minimum wage. The threshold of 130 percent of poverty was used because most, although not all, families at this income level are eligible for Food Stamps.
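
The household definition above could be applied to survey records roughly as follows. This is a sketch under stated assumptions: half-time, full-year employment is taken to mean half of 40 hours a week for 52 weeks; the poverty threshold is supplied per household because it varies with household size; survey weights are ignored; and the class, function names, and dollar figures are hypothetical. The minimum wage is a parameter; the example uses $5.15 an hour, the federal minimum in 2000.

```python
from dataclasses import dataclass

FULL_TIME_HOURS_PER_YEAR = 40 * 52  # assumption: 40 hours a week, 52 weeks


@dataclass
class Household:
    annual_income: float
    annual_earnings: float
    poverty_threshold: float  # threshold for this household's size
    has_child_under_18: bool
    receives_food_stamps: bool


def is_low_income_working(h: Household, minimum_wage: float) -> bool:
    """Children present, income under 130 percent of poverty, and earnings
    of at least half-time, full-year work at the minimum wage."""
    earnings_floor = 0.5 * FULL_TIME_HOURS_PER_YEAR * minimum_wage
    return (h.has_child_under_18
            and h.annual_income < 1.30 * h.poverty_threshold
            and h.annual_earnings >= earnings_floor)


def food_stamp_receipt_rate(households: list[Household],
                            minimum_wage: float) -> float:
    """Percent of low-income working households receiving Food Stamps."""
    group = [h for h in households if is_low_income_working(h, minimum_wage)]
    if not group:
        raise ValueError("no low-income working households in the data")
    receiving = sum(h.receives_food_stamps for h in group)
    return 100.0 * receiving / len(group)


# Hypothetical records, all with a hypothetical $17,000 poverty threshold.
sample = [
    Household(15000, 12000, 17000, True, True),   # counted, receiving
    Household(20000, 18000, 17000, True, False),  # counted, not receiving
    Household(25000, 25000, 17000, True, False),  # income too high
    Household(9000, 3000, 17000, True, True),     # earnings too low
    Household(14000, 14000, 17000, False, True),  # no children under 18
]
print(food_stamp_receipt_rate(sample, minimum_wage=5.15))  # 50.0
```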

Measurement issues. While these measures are similar in that they attempt to capture the proportion of poor individuals receiving Food Stamps, they differ in the extent to which they can be influenced by the TANF program. The first option, the percentage of eligible families receiving Food Stamps, uses a relatively broad population as the denominator - the population that is eligible for Food Stamps. Although a large part of the Food Stamp caseload traditionally has received AFDC/TANF, this measure would address the overall effectiveness of the Food Stamp Program in reaching its target population, as well as the effectiveness of the TANF program in ensuring that individuals who are diverted from or leave cash assistance receive this benefit. The second measure, the percentage of poor children in working families receiving Food Stamps, is more narrow than the first in that it focuses on children in poor working families. The measure examines Food Stamp receipt by a group that up to now has had relatively low participation rates in Food Stamps, although the levels are increasing somewhat. The third measure focuses on whether individuals who had received TANF were receiving Food Stamps after they left the program. While this measure focuses on an outcome that can be directly influenced by the TANF program, it cannot capture the extent to which individuals who are diverted from TANF assistance on the front end receive Food Stamp benefits. Since Food Stamp eligibility is on a household basis, it should not matter much whether children or adults are the unit for the measure.

Data issues. The population measures are best calculated from a combination of national survey data and Food Stamp administrative data. Among national surveys, the American Community Survey (ACS) is the preferred data source. National survey data are best suited to calculating the denominator of the first and second measures. The number of poor children (defined as "below the poverty line") in working families (or a reasonable proxy for working) would be measured directly using the ACS. Food Stamp eligibility would not be measured directly, but could be estimated based on the information collected, although some measurement errors will result from the different periods of observation (surveys collect annual data while Food Stamp eligibility is based on monthly income).

While the CPS and ACS collect data on Food Stamp receipt, surveys have historically under-reported information on receipt of public benefits and are not considered the best gauge of program participation levels. An alternative would be to use Food Stamp Quality Control (QC) data to calculate the numerator of the measures. Food Stamp QC data provide high-quality, timely program enrollment data on an annual basis, although they are based on a sample and there are measurement differences between the QC database and national surveys (e.g., QC captures data on program participation and household circumstances including income for a specific month, while national surveys examine program participation and household circumstances over the past year).
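
The pairing of an administrative numerator with a survey denominator can be sketched as follows. This is a minimal illustration, not a prescribed method: the function name is hypothetical, and averaging the monthly QC counts to put them on a footing comparable with an annual survey estimate is an assumption, since the report does not say how the two observation periods would be reconciled.

```python
def participation_rate(qc_monthly_recipients, survey_eligible):
    """Percent of the survey-estimated eligible population receiving benefits.

    qc_monthly_recipients: weighted monthly recipient counts from QC data.
    survey_eligible: annual survey estimate of the eligible population.
    Averaging the monthly counts is an assumed way to align the two sources.
    """
    if survey_eligible <= 0:
        raise ValueError("eligible population estimate must be positive")
    average_monthly = sum(qc_monthly_recipients) / len(qc_monthly_recipients)
    return 100.0 * average_monthly / survey_eligible
```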

The third measure - the percentage of former TANF recipients who receive Food Stamps - could be calculated by using linked TANF and Food Stamp administrative data. This would require states to use their TANF administrative data to identify those individuals who left TANF and to match this group against Food Stamp administrative data. This measure would be more workable if it were restricted to a limited period of time after TANF exit.
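
A record match of this kind, limited to a fixed follow-up period, might look like the sketch below. The data layout (dictionaries keyed by a person identifier), the function name, and the 120-day default window are all hypothetical; the report does not specify a window length or a record format.

```python
from datetime import date, timedelta


def leaver_food_stamp_rate(tanf_exits, food_stamp_receipts, window_days=120):
    """Percent of TANF leavers with Food Stamp receipt soon after exit.

    tanf_exits: dict of person id -> TANF exit date.
    food_stamp_receipts: dict of person id -> list of receipt dates.
    window_days: assumed follow-up window; the report does not set one.
    """
    if not tanf_exits:
        return 0.0
    matched = 0
    for person, exit_date in tanf_exits.items():
        cutoff = exit_date + timedelta(days=window_days)
        receipts = food_stamp_receipts.get(person, [])
        # Count the leaver once if any receipt falls inside the window.
        if any(exit_date < d <= cutoff for d in receipts):
            matched += 1
    return 100.0 * matched / len(tanf_exits)
```

Leavers with no Food Stamp record at all stay in the denominator, so an incomplete match lowers the measured rate rather than dropping cases silently.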

Fairness issues. In terms of treating states fairly and equitably, the size of the relatively broad population groups that form the denominators for the first two measures is likely to be influenced by factors that are beyond the reach of the TANF program. The number of families eligible for Food Stamps or in poor working families is likely to be affected by the economy in the state as well as other programs and policies the state may have in place - including child care, health insurance, state earned income tax credits, and others. States with more poor families would have to work with a greater proportion of families than states with fewer poor families.

In addition, all else equal, states with higher TANF benefit levels will probably fare better on the two population measures because many of their working poor families will be eligible for TANF as well as Food Stamps (participation rates for Food Stamps are higher when families are also receiving TANF). These states are likely to perform more poorly on the leavers measure because those families earning enough to exit TANF will have higher incomes, making them eligible for only a small amount of Food Stamp benefits, which might not be worth the effort needed to continue to participate. In states with lower TANF benefit levels, families with earnings would become ineligible for TANF sooner, making their Food Stamp benefit levels higher, and increasing the likelihood of their remaining on Food Stamps longer.⁽⁴⁾

Medicaid/SCHIP Receipt

The Medicaid and State Children's Health Insurance Programs (SCHIP) are designed to provide health insurance to low-income individuals. Medicaid provides coverage to adults and children;⁽⁵⁾ SCHIP covers children. State performance in ensuring receipt of Medicaid/SCHIP by eligible individuals can be measured in several ways. These potential measures are parallel to those suggested for Food Stamps and include: (1) the percentage of individuals eligible for Medicaid/SCHIP that receive the benefit, (2) the percentage of poor children in working families who are receiving Medicaid/SCHIP, and (3) the percentage of former TANF recipients receiving Medicaid/SCHIP.

Under the TANF High Performance Bonus, beginning in FY 2002, a bonus will be provided to the three states with the highest rates of Medicaid/SCHIP enrollment for former TANF recipients and to the seven states with the greatest percentage point improvement. Specifically, the measure of state performance will be the percentage of leavers who are enrolled in Medicaid or SCHIP four months after leaving TANF (and who are not currently receiving TANF assistance in that month). The population will be limited to leavers who were enrolled in Medicaid or SCHIP at the time of case closure. Virtually all such individuals should still be eligible for Medicaid or SCHIP four months later.
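
The leavers measure described above can be expressed as a short calculation. The field names are hypothetical, and one point is an explicit assumption: leavers who are back on TANF in the fourth month are dropped from the population rather than counted as not enrolled. The text can be read either way, so this sketch shows only one possible reading.

```python
def medicaid_retention_rate(leavers):
    """Percent of qualifying leavers enrolled in Medicaid/SCHIP in month four.

    Each leaver is a dict with three hypothetical boolean fields:
      enrolled_at_closure - enrolled in Medicaid/SCHIP at case closure
      enrolled_month_4    - enrolled four months after leaving TANF
      on_tanf_month_4     - receiving TANF assistance in that month
    """
    # Assumed reading: restrict the population to leavers who were enrolled
    # at closure and who are not receiving TANF again in the fourth month.
    population = [p for p in leavers
                  if p["enrolled_at_closure"] and not p["on_tanf_month_4"]]
    if not population:
        return 0.0
    still_enrolled = sum(1 for p in population if p["enrolled_month_4"])
    return 100.0 * still_enrolled / len(population)
```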

Measurement issues. The measurement issues for these Medicaid/SCHIP performance measures are similar to those for Food Stamps discussed above. While the three proposed measures are similar in that they attempt to capture the proportion of poor individuals receiving Medicaid/SCHIP, they differ in their relationship to TANF. The first and second measures use a relatively broad population as the denominator - the eligible population and working families with children, respectively. Thus, these two measures will capture the overall effectiveness of the Medicaid/SCHIP programs in reaching their intended populations, as well as the effectiveness of the TANF program in ensuring that individuals who are diverted from or leave cash assistance receive Medicaid/SCHIP. The third measure may be better focused on directly measuring the outcomes of the TANF program, but it does not capture the extent to which those who are diverted from TANF on the front end receive Medicaid/SCHIP.

The Medicaid/SCHIP participation rate might vary substantially depending on whether it was measured for adult leavers or for their children, because eligibility is at the individual level, not for the whole family unit. Data from HHS-funded studies of families exiting welfare indicate that in some states children are substantially more likely than their parents to be enrolled in Medicaid. (In other states there was little difference between adult and child participation rates.)

Data issues. These measures would be calculated in similar ways to those described above for Food Stamps. Medicaid administrative data - primarily the Medicaid Statistical Information System (MSIS) - would have to be used to calculate the numerator of the rates because the American Community Survey (ACS) does not collect data on participation in Medicaid or SCHIP. All states have developed or are in the process of developing the capability of reporting Medicaid and SCHIP enrollment data using MSIS. These enrollment data would be collected and reported by states electronically on a quarterly basis and would be available within a reasonable period after the end of a quarter. As with the Food Stamp measures, national survey data, preferably the ACS, could be used to calculate the denominator of the first two rates - the population eligible for Medicaid and working families with incomes below the poverty line. The third measure would be calculated by matching the administrative records of individuals leaving TANF assistance with Medicaid/SCHIP enrollment data maintained on MSIS.

Fairness issues. Unlike Food Stamps, where eligibility is set at the federal level, states have some discretion over Medicaid and SCHIP eligibility. This creates two types of issues. On the first measure - which examines the proportion of the eligible population that receives Medicaid/SCHIP - states have very different eligible populations. A state with a broad eligible population may have more difficulty performing well on this measure than a state that defines Medicaid eligibility more narrowly. On the second measure, by contrast - which examines the proportion of poor working families that receives Medicaid/SCHIP - states that provide coverage to a broader segment of the population will do better because of their current policies. While the goal of this measure may be to encourage states to expand the reach of these programs, states that have already done so would start out with an advantage on the absolute measure if bonuses or penalties are based on level of coverage, and with a disadvantage on a measure based on improvements in coverage. As with measures of Food Stamp participation, some have argued that these measures are not appropriate since state policymaking on health insurance coverage for a broader eligible population goes beyond the purview of the TANF program.

As discussed above, states with higher TANF benefit levels will probably fare better on the two population measures because many of their working poor families will be eligible for TANF as well as Medicaid and participation rates for Medicaid are higher when families are also participating in TANF. In addition, for the two population measures, state administrators may not have control over the size of the needy population in their state. States with a relatively larger population below the poverty line may have more difficulty succeeding on these types of measures compared to states with a relatively smaller poor population.

Child Care Affordability and Quality

As more and more families move from welfare to work, there is an increasing need for access to affordable, high-quality child care. Affordable child care is necessary to enable most low-income parents, particularly single parents, to move from welfare to work. A growing body of research indicates that the quality and stability of the child care setting influence outcomes for children as well as the ability of parents to retain employment. Moreover, high-quality child care can contribute to the healthy development of children, especially children in low-income families who are often disadvantaged educationally as well as financially. However, most high-quality child care is unaffordable for low-income workers, including those moving from welfare to work, without government subsidies.

Measurement issues. There is no simple way to measure the success of state efforts to make high-quality child care accessible and affordable to low-income families. Under both TANF and the Child Care and Development Fund (CCDF), states have almost total flexibility to determine both eligibility for child care subsidies and the package of subsidies that is provided. An ideal measure of child care access and affordability would reflect the percentage of eligible children served and how great a share of their income all low-income parents are paying for child care, both those who are benefitting from subsidies and those who are not. However, such a measure could only be calculated by asking questions not currently included on any survey that produces annual state-level estimates. For the FY 2002 High Performance Bonus, HHS intends to use a measure which combines the fraction of eligible children receiving child care subsidies with the amount of the required copayment as compared to family income. These factors measure the breadth and the depth of state child care subsidies.

Measuring the quality of child care is even more complex. While research has indicated several factors that are believed to contribute to high-quality child care (e.g., training of child care providers, high staff-to-child ratios), there is no universal agreement on what constitutes high-quality child care. For this reason, in the FY 2003 High Performance Bonus HHS intends to use a process measure that indirectly assesses the quality of services by comparing actual rates paid to applicable market rates. The logic behind the selection of this measure is that families in states that reimburse at a higher fraction of market rates will have access to a broader range of child care options, including higher-quality care, which often costs more than mediocre care. While high reimbursement rates do not guarantee that families will access high-quality care, low reimbursement rates ensure that low-income families will not be able to access such care.

Data issues. A direct assessment of the affordability and quality of child care used by low-income families would require a new survey effort, which is not feasible for an ongoing performance measurement system.

The measures used under the TANF High Performance Bonus can generally be calculated using a combination of administrative data reported through CCDF and Census Bureau information on family income. HHS measures access to child care based on the percentage of children in families with 85 percent or less of the state's median income (the maximum eligibility level allowed under federal law) who are served with child care subsidies. HHS measures affordability based on the relationship between the state's reported family co-payments for subsidized child care and reported family income. The final quality component of the High Performance Bonus measure of child care requires states that choose to compete for the Bonus to report additional data collected through their mandatory biennial market rate surveys.
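
The access and affordability components described above reduce to two ratios, sketched below. The function and argument names are hypothetical. The report does not state how the components are combined into a single score, so the sketch returns them separately rather than guessing at a formula.

```python
def child_care_components(children_served, children_eligible,
                          total_copayments, total_family_income):
    """Return (access, copayment share), both in percent.

    access: subsidized children as a share of children in families at or
        below 85 percent of state median income.
    copayment share: reported family co-payments as a share of reported
        family income for subsidized families.
    """
    access = 100.0 * children_served / children_eligible
    copay_share = 100.0 * total_copayments / total_family_income
    return access, copay_share
```

A higher access figure and a lower copayment share both point in the direction the measure is meant to reward.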

Fairness issues. Given a fixed amount of funds to be spent on child care subsidies, states may choose to allocate it in a variety of ways. They may provide generous subsidies to a smaller number of families, or more limited subsidies to a greater number of families. If the spending per family is to be limited, this can be accomplished either by requiring families to pay a greater portion of the child care expense (reducing affordability) or by capping the amount they will pay per child (thereby restricting family choice of child care providers, and potentially reducing the quality of the care received). Additional funding can be used to expand a program in any of these dimensions. The High Performance Bonus measures combine these elements in order to reflect these tradeoffs and avoid promoting one choice over the others. However, because this measure has not been calculated in the past (and the precise details of the measure are still under development), it is unclear whether this measure will truly be neutral among all approaches.

Receipt of TANF and Other Types of Transitional Assistance

Another possible performance measure related to child and family well-being is a broad measure that examines the percentage of needy children that receives TANF and/or other types of assistance, including Medicaid/SCHIP, Food Stamps, and child care.

Measurement issues. In calculating this measure, care would have to be taken to ensure that individuals receiving benefits from more than one program are not double counted in the numerator of the rate. Otherwise, the extent to which states are serving eligible needy families would be overstated. This measure could be calculated based either on children or family units.
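
The double-counting concern can be made concrete with a small sketch. It assumes, hypothetically, that each program's roll is available as a set of child identifiers that are consistent across programs, which in practice is the difficult part of the linkage.

```python
def unduplicated_rate(program_rolls, needy_children):
    """Percent of needy children receiving at least one benefit.

    program_rolls: dict of program name -> set of child identifiers.
    needy_children: estimated number of needy children (the denominator).
    """
    # A set union counts a child once no matter how many programs serve them.
    served = set().union(*program_rolls.values())
    return 100.0 * len(served) / needy_children


def duplicated_rate(program_rolls, needy_children):
    """The overstated rate obtained by simply adding program caseloads."""
    total = sum(len(roll) for roll in program_rolls.values())
    return 100.0 * total / needy_children
```

With three overlapping rolls of three children each covering five distinct children out of ten, the unduplicated rate is 50 percent while the simple sum of caseloads would suggest 90 percent.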

Data issues. This measure is relatively complex to calculate and would require a combination of both administrative and survey data. Similar to the Food Stamp and Medicaid/SCHIP measures, the denominator of the rate could be calculated relatively easily using the American Community Survey. However, because it requires knowledge of the receipt of a number of different programs, determining the numerator of the rate is more difficult. The ACS cannot be used to determine the numerator of the rate because it does not capture whether individuals use certain public assistance programs - particularly child care subsidies and Medicaid/SCHIP. While administrative data from each of these programs could be used to calculate the numerator, this is more complicated than other similar measures discussed in this paper because a number of administrative data sources would have to be used and linked together (TANF, Food Stamps, Medicaid/SCHIP, and child care). Thus, while it is not impossible to calculate this measure, it is more burdensome for states than some of the other related measures.

Fairness issues. States with more generous benefit programs for TANF, child care, and Medicaid/SCHIP will perform better on this measure - at least initially - than states with less generous programs. While the goal of this measure may be to encourage states to expand these programs, states that have already done so will start out with an advantage on the performance measure when capturing absolute performance. States that have more restrictive coverage could potentially perform better on a measure based on improvement. In addition, as discussed above, state administrators may not have control over the size of the needy population in their state. States with a greater population below the federal poverty line may have more difficulty succeeding on this type of measure compared to states with fewer poor people.

Extreme Poverty Rate

Another alternative approach to measuring the well-being of children and families is to look directly at income levels, rather than program participation. One proposed measure would examine the rate of extreme child poverty - those living at 50 percent of the poverty level or less - or the change in this rate. This measure would focus on the extent to which states have success in improving the status of a very needy group - the "poorest of poor." At the July 1999 consultation meeting, there was a general sense that a measure below 100 percent of poverty was more likely to be influenced by changes in TANF policy than the standard measure at 100 percent of poverty, because most states' cash assistance need standards and payment levels are substantially below the official poverty line.⁽⁶⁾ In addition, these very poor families may be at greatest risk of experiencing substantial material hardships, such as hunger or homelessness.

Measurement issues. The simplest way to measure extreme poverty would be to use the standard definition of income, which includes cash assistance, but not the cash equivalent value of food stamps or the effects of taxes. A modified definition of income, which includes both taxes and transfers, would better capture the effects of a range of state policy choices on family well-being. The extreme poverty rate could be measured based on family units, on individuals, or on children. Focusing on children gives more weight to the circumstances of large families (i.e., each child in a large family would be counted separately), which are somewhat more likely to be poor. By including all families, whether or not they receive TANF assistance, and whether or not they are employed, this measure rewards states that provide a comprehensive safety net for the most disadvantaged.

Data issues. The preferred data source for this measure is the ACS, which will provide reliable, timely state-specific data that could be used to calculate both the numerator and denominator of the rate, starting in 2000. The CPS provides data that could be used to calculate this measure in the short-term, but for all but the largest states we would have to rely on small area estimation techniques, such as moving averages (combining and averaging three or four years of data).
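
The child-based rate and the moving-average smoothing mentioned above can be sketched as follows. This is an unweighted illustration: real survey estimates would apply sampling weights, and the function names and the three-year default window are assumptions.

```python
def extreme_poverty_rate(children):
    """Percent of children at or below 50 percent of the poverty level.

    children: list of (family_income, poverty_threshold) pairs, one entry
    per child, so each child in a large family is counted separately.
    Survey weights are ignored in this simplified sketch.
    """
    poor = sum(1 for income, threshold in children
               if income <= 0.5 * threshold)
    return 100.0 * poor / len(children)


def moving_average(annual_rates, window=3):
    """Average each year's rate with the preceding window - 1 years."""
    return [sum(annual_rates[i - window + 1:i + 1]) / window
            for i in range(window - 1, len(annual_rates))]
```

Averaging over several years reduces sampling noise for smaller states at the cost of making the measure slower to reflect recent change.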

Fairness issues. The extreme poverty rate is likely to be affected by a number of factors - some of which are outside the scope of a state's TANF program. For example, the poverty rate is likely to be affected by a state's economy and by the policies of other social programs for the poor. Both the absolute rate and the change in the rate would be affected by these other factors. However, using the change in the extreme poverty rate as a measure might give states with historically high poverty rates an incentive to improve their performance, since they have a greater chance of performing well on a measure based on improvement.

Family Formation/Stability Measures

Family formation/stability measures address another goal of the TANF program - to promote marriage and encourage the formation and maintenance of two-parent families. This goal is based on the belief that families are one of the strongest factors in developing and sustaining high levels of individual competence and functioning in our complex society. The number of parents living with a child is generally tied to the amount and quality of human and economic resources available to that child. Children who live in a household with one parent are five times more likely to have family incomes below the poverty line than are children who grow up in a household with two parents.

Percentage of Children Living in Married-Couple Homes

Measuring the percentage of children living in married-couple families addresses the TANF goal of encouraging the formation and maintenance of two-parent families. Unlike a measure of nonmarital births, this measure considers whether parents marry and stay married for as long as they have children at home, not just at the time the child is born. In addition, it addresses the well-being of low-income children and families and could serve to stimulate state interest in a range of strategies that promote intact families. The TANF High Performance Bonus includes a measure of the percentage point change in the rate of all children who reside in married-couple families, based on a comparison of data between years.

Measurement issues. The primary issue for this measure is the population to which it will be applied. Under the Notice of Proposed Rulemaking (NPRM) for the TANF High Performance Bonus, HHS proposed to apply this measure to children in families with incomes below 200 percent of poverty who reside in married-couple families. HHS proposed to restrict the measure to poor and near-poor families because these are the ones who are most likely to be affected by welfare policy and to be targeted in state programs to promote marriage.

However, many commenters expressed concern about this measure, noting that if the measure were focused on poor or near-poor children, as the NPRM proposed, there was the potential for states to be rewarded for undesirable outcomes. For example, an increase in the number of low-income families, such as might be caused by an economic downturn, would also likely result in an increase in the share of low-income two-parent families. Conversely, policies that promoted marriage or rewarded work among two-parent families might cause these families to have incomes greater than 200 percent of the poverty level, causing a state to be punished for a positive outcome. In response to these concerns, in the Final Rule for the High Performance Bonus, HHS adopted a broader measure rewarding those states with the greatest increase in the percentage of children who reside in married-couple families, regardless of family income.
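
A worked example may help show how the income-restricted measure could move without any change in family structure. All figures below are hypothetical and chosen only to illustrate the arithmetic; they do not come from the NPRM, the Final Rule, or any state's data.

```python
def married_share(children_in_married_families, all_children):
    """Percent of a group of children living in married-couple families."""
    return 100.0 * children_in_married_families / all_children


# Hypothetical state with 1,000 children, 700 in married-couple families.
# Before a downturn: 400 children are below 200 percent of poverty,
# 100 of them in married-couple families.
low_income_before = married_share(100, 400)

# A downturn pushes 50 more children from married-couple families below
# 200 percent of poverty. No family has formed or dissolved.
low_income_after = married_share(150, 450)

# The all-children measure adopted in the Final Rule is unaffected.
all_children_before = married_share(700, 1000)
all_children_after = married_share(700, 1000)
```

In this illustration the restricted measure rises from 25 percent to about 33 percent purely because incomes fell, while the measure covering all children stays at 70 percent.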

Data issues. Using this measure would entail no new data collection responsibilities on the part of the state. It is anticipated that national data will be available to measure state performance from the Census Bureau's decennial and annual (e.g., ACS) demographic programs. The Census Bureau's decennial and annual demographic programs would provide uniform, objective, and reliable state-level data beginning in 2001 (with respect to 2000). As noted above, use of this (or any measure that relies on the ACS) is contingent upon the continued availability of the new Census Bureau data.

Fairness issues. Since this is a population-based measure, the TANF agency would have limited ability to influence this family formation measure beyond the TANF population. However, the TANF goal of supporting two-parent families is not limited to "needy families," so it might be appropriate for TANF agencies to look beyond their traditional service population of families receiving cash assistance.

Out-of-Wedlock Birth Rate for TANF Families

This measure would report on the rate of nonmarital births that occur to TANF families and address the TANF goal of preventing and reducing the incidence of out-of-wedlock pregnancies. This measure differs from the existing Bonus for Reduction in Out-of-Wedlock Births, which measures changes in nonmarital birth rates for all families in a state.

Measurement issues. The advantage of restricting a measure of nonmarital births to TANF recipients (rather than all families) is that TANF agencies have much more ability to affect the choices of TANF recipients, through financial incentives, life skills classes, parenting education, and referrals to family planning providers, than they do to affect the general population.

However, such a measure has the possibility of undesired consequences. There is already some controversy over the "family cap" policies that several states have adopted, which deny additional benefits to families for children conceived while the family was receiving welfare. There would probably be greater concern if states included a requirement not to have additional children in a personal responsibility contract, and denied assistance entirely to recipients who fail to comply. Policy makers might wish to add to this measure restrictions to ensure that the desired goals were not achieved through unacceptable means; for example, the Bonus for Reduction in Out-of-Wedlock Births includes a requirement that states which receive the bonus must also have achieved a reduction in their abortion rate.

Another concern is that the composition of the denominator of the measure - the TANF caseload - will be constantly changing. For example, it is widely believed that as the welfare caseloads have declined, the most advantaged (job-ready) recipients have disproportionately left the rolls, leaving a higher proportion of more disadvantaged, harder to employ, families behind. It is possible that the least job-ready recipients may have a higher rate of nonmarital births than the more job-ready individuals who leave TANF more quickly. Thus, it may be difficult to distinguish whether a change in the nonmarital birth rate was due to a change in the number of out-of-wedlock births or a change in the composition of the TANF caseload.
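
The composition problem can be illustrated with a small calculation in which each group's birth rate is held fixed and only the caseload mix changes. The group sizes and rates are hypothetical and serve only to show the mechanism.

```python
def caseload_birth_rate(groups):
    """Nonmarital births per 100 TANF families across caseload groups.

    groups: list of (number_of_families, births_per_100_families) pairs.
    """
    families = sum(n for n, _ in groups)
    births = sum(n * rate / 100.0 for n, rate in groups)
    return 100.0 * births / families


# Hypothetical caseload: 600 job-ready families at 2 births per 100 and
# 400 harder-to-employ families at 5 births per 100.
rate_before = caseload_birth_rate([(600, 2), (400, 5)])

# Two-thirds of the job-ready families leave; group rates do not change.
rate_after = caseload_birth_rate([(200, 2), (400, 5)])
```

Here the measured rate rises from 3.2 to 4.0 births per 100 families even though no group's behavior changed, which is the ambiguity the paragraph above describes.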

Data issues. There are two potential data sources for calculating the out-of-wedlock birth rate for TANF families. The most accessible and timely data source is TANF state administrative records. As part of their aggregate data collection for TANF, states are required to report on a quarterly basis the monthly number of families receiving TANF with an out-of-wedlock birth. The quality of these data has not yet been assessed. The National Center for Health Statistics (NCHS) is another potential data source. States report data on all out-of-wedlock births in their state to NCHS on an annual basis. These data could be matched to TANF administrative records to determine the out-of-wedlock birth rate for TANF families. However, this type of match has not been done yet, so its feasibility and burden would have to be determined.

In addition, a few states do not directly ask the mother's marital status on the birth registration form, but instead infer it from other information collected (typically, whether the mother and father have the same last name). This reduces the comparability of data among states.

Fairness issues. This measure has not previously been calculated. However, it is known from the existing Bonus for Reduction in Out-of-Wedlock Births that there is a great deal of variation across states in the percentage of all births that are out-of-wedlock. This variation appears to be related to the demographic characteristics of the state's population, rather than to specific state policies. It is likely that this variation will also occur among states with respect to nonmarital births in the TANF population, again for reasons other than state policy choices. This suggests that a measure of improvement (as used under the Bonus for Reduction in Out-of-Wedlock Births) would be more appropriate than a simple absolute rate. However, as discussed previously, improvement is generally more difficult to achieve for states that have already achieved a high level of performance (in this case, a low rate of nonmarital births).

Endnotes

4. Unlike TANF, Food Stamp income eligibility standards and benefit levels are set at the national level. Eligibility is based on the number of people in the household and the amount of income the household has; most households must have income at or below the Federal Poverty guidelines after deductions are allowed. Almost all types of income, including cash assistance, are counted to determine if a household is eligible. Food Stamp benefits are 100% federally funded.

5. Medicaid is a federal-state matching entitlement program. The federal matching rate is inversely related to a state's per capita income, and can range from 50 to 83 percent. Within federal guidelines (states are required to serve some population groups and are permitted to serve others), each state designs and administers its own program. Thus, there is substantial variation among states in coverage, types and scope of benefits offered, and amounts of payments for services. Applicants' income and assets must be within program financial standards - for some population groups these standards vary among states; for others, standards are set by federal law.

6. A separate provision of PRWORA requires the chief executive officer of each state to report annually on the child poverty rate in the state. If, as a result of PRWORA, the child poverty rate of the state has increased by 5 percent or more, the state must prepare and submit a corrective action plan outlining the manner in which the state will reduce the child poverty rate.

Conclusions

The Department of Health and Human Services is fully committed to the concept of outcome-based performance measures. Measuring accomplishments rather than activities or processes is an important step toward ensuring that both federal agencies and our state and local partners maintain focus on the goals of the programs and are accountable for producing results. An emphasis on outcomes is particularly appropriate for a decentralized program such as TANF, where the federal government does not prescribe detailed program parameters.

This commitment is reflected in the Department's overall strategic planning process that links all activities to the goals established under the GPRA plan. In addition, states have been given outcome-based incentives under TANF through the High Performance Bonus and the Bonus for Reductions in Out-of-Wedlock Births. Although these bonus systems provide only financial incentives for positive outcomes without penalizing negative outcomes, and the GPRA plan has neither bonuses nor penalties attached to it, these systems are, nonetheless, important components of outcome-based performance measurement and lay the groundwork upon which future enhancements will be based.

The question is no longer whether outcome-based performance measurement is a valuable tool, but rather how to incorporate outcome-based measures into an overall accountability system. The following observations are offered to help guide this development process. We hope that these observations will prove useful during upcoming discussions on evaluating states' success under TANF as the program is considered for reauthorization.

Performance measures are only one element of a comprehensive system of program monitoring, research, and evaluation.

Outcome-based performance measures are part of a broader system of program information. As such, they should not be viewed as the way to answer all of our questions about the effects of a program. Neither should they be expected to substitute for other kinds of studies that provide different kinds of information.

For example, policymakers and researchers are greatly interested in whether individuals leaving welfare are succeeding in retaining employment. This question can best be answered with long-term data, covering periods of a year or more. However, such a time frame may be inappropriate for a performance measurement system, which must provide timely feedback to program operators. Similarly, cost constraints may prohibit the inclusion of performance measures based on data that can be collected only through detailed surveys. Therefore, it is usually necessary to supplement the data reported through a performance measurement system with more intensive studies to address questions that are outside the range of such a system. However, such studies are generally conducted only intermittently and in limited geographic areas. Performance measurement, on the other hand, provides ongoing information about programs that cannot be obtained from such studies.

In the welfare-to-work arena, this report has shown that it is difficult to identify performance measures that are reliable measures of the "value-added" benefits of a program - that is, measures that reflect outcomes beyond those that would have occurred without the program in place. This difficulty is compounded for a program such as TANF, where state policies and strategies vary widely, as do state capacities, economic conditions and caseload characteristics. To truly understand program effectiveness, it is important to supplement performance measures with other evaluation activities. Together these efforts will provide a more thorough and accurate assessment of program performance.

Used properly, performance measurement systems and rigorous evaluations complement each other. Performance measures can be a valuable tool for generating hypotheses about the relationships between interventions and outcomes. These hypotheses can then be tested through in-depth evaluations of program effectiveness in a limited number of research sites. Similarly, the findings from research and evaluation can be used to refine performance measurement systems and improve their value for monitoring and motivating performance.

Before specific measures can be identified, it is critical to reach agreement among all major stakeholders on appropriate program goals.

Several studies on the development of outcome-based performance measures have stressed that identifying and reaching consensus on program goals is one of the first and most critical steps (Dyer, 1994; Horsch, 1996(b); Yates, 1997). This aspect of building performance measurement systems can be both challenging and time-consuming - a study of GPRA implementation found that developing long-term strategic goals and translating them into specific performance measures was one of the most difficult elements of building a system (GAO, 1997). This process generally requires stakeholders to engage in consensus-building to define broadly shared visions of what program goals are important and what strategies are required to achieve them. As part of this effort, it is also critical to ensure that there is buy-in from top leadership at the outset (Bittner, 1998; Hatry, 1999).

The authorizing legislation clearly articulates four goals for TANF, and states enjoy wide discretion in setting their own program priorities from among those goals and in choosing how to spend their block grant funds. There is a broad consensus among stakeholders that work is a central focus of TANF. This consensus is reflected by the near-universal first-year performance competition among states on the various work-related measures of the TANF High Performance Bonus. Many states have also adopted internal goals on measures of their success in placing welfare recipients in jobs and assisting them in retaining employment and increasing their wages.

Agreement among stakeholders on goals beyond work is less universal. States are not required to spend equal amounts of block grant funding on each of the four TANF goals nor even to spend funds on each of the goals. To date, states have exercised their discretion by funding activities as diverse as programs focused on cash and work-based assistance; programs that include work activities, child care, and other work-based supports; programs that support the formation and maintenance of two-parent families; programs focused on preventing out-of-wedlock pregnancies; programs that expand supports for all working poor families, whether or not they have previously received cash assistance; and various combinations of these activities.

Only when goals have been agreed upon does it make sense to discuss specific measures and related performance standards. Even if it is not possible to achieve full consensus, it is still important to include all stakeholders in the discussion, in order to obtain a real-world perspective on what is operationally feasible, what the potential is for unintended consequences, and so on.

There is no perfect outcome-based performance measurement system. Therefore, it would be appropriate to build gradually upon an existing system.

This report has shown that outcome-based performance measures - particularly in the welfare-to-work arena - are susceptible to some shortcomings. Some measures may be influenced by factors other than the program, such as the state of the economy or the composition of the welfare caseload, and it may be too complicated or costly to detect the statistical impact or adjust for these factors. Other measures may elicit an unintended response by states or localities - such as creaming. Data problems make it impossible to use some desirable measures, because the data are too costly to collect, or not available at the state level.

This does not mean that it is not worthwhile to develop outcome-based performance measures. On the contrary, the experiences of the welfare and workforce development systems in developing and using these measures have shown that there are methods to work around these issues. But it is appropriate to proceed by building on existing measures and data systems, rather than starting from scratch with ambitious new measures.

Approaches that may be adopted include:

As recommended by Bartik (1996), select measures that meet a "minimum" fairness threshold, that cannot be manipulated by the client mix or program data in ways that do not increase the "value-added" of the program.
Use moderate or limited bonuses and penalties. Some researchers suggest keeping the level of bonuses or sanctions relatively weak unless and until a clear link with program effectiveness can be established (Barnow, 1999). Others find that attaching high stakes to accountability systems - particularly early in their development - may lead to a perception that the accountability systems are arbitrary or unreasonable and thereby undermine support for them (Bartik, 1996; Horsch, 1996(a)).
Collect baseline data on state performance for some time period before establishing standards for a particular measure, particularly when penalties are involved. For example, a system of outcome-based performance measures could initially be based on those measures established under the TANF High Performance Bonus, which reward the top performers on a measure without establishing a minimum performance threshold. Data for new measures might be collected first for information purposes only, then attached to bonuses, and only later, attached to penalties.
Implement a long-term program of developing additional data sources that measure the desired outcomes accurately, precisely, and at a reasonable cost. In some cases, it may be appropriate to add additional elements to the TANF data reporting system. In other cases, it may be desirable to add questions to large-scale national surveys.

Using outcome measures does not mean abandoning useful process measures, especially because of problems in attributing outcomes to program interventions.

It is not necessary to make an either/or choice between outcome-based performance measures and process measures, such as the work participation rate. Along with outcome measures, participation measures are useful for several reasons:

To maintain a level of federal direction over the types and hours of activities in which welfare recipients participate. The participation rate under TANF places a strong emphasis on actual work, whether subsidized or unsubsidized, and allows education, training, and job search to count only under strictly limited circumstances. At the July 1999 consultation, many state representatives and researchers spoke out in favor of continuing a participation rate, but expanding the range of allowable activities.
To ensure that a high percentage of recipients is receiving services. This is particularly critical now that welfare has become a time-limited program; if the mandatory work participation rates are not enforced, there is a danger that people will reach their time limits and lose cash assistance without ever having received services or the experience in work to help them move to self-sufficiency. Together, the participation rate target and the definition of which individuals are counted (e.g., whether recipients who are exempt from participation should be excluded from the calculation) determine what fraction of all recipients will be expected to be engaged in work-related activities.⁽⁷⁾
To give states some credit for engaging recipients in work-related activities even if, due to state economic conditions or the characteristics of the recipient caseload, they do not achieve the desired levels of outcomes (e.g., unsubsidized employment). It may take several years of services before the "hardest-to-serve" recipients can achieve unsubsidized employment, and participation rate requirements can reward states for helping such recipients to achieve intermediate steps, such as participation in substance abuse treatment or community service employment.

Regardless of motivation, outcome-based performance measures and work participation rates are not mutually exclusive.

As our experience with the High Performance Bonus system has shown, implementing an outcome-based performance measurement system under TANF would not be easy, either for those who would set the standards or for those whose performance would be measured against them. As these conclusions indicate, it is important to proceed judiciously, with an understanding of the real-world limits of such a system as well as of its strengths. But, as Secretary Shalala said, speaking at the University of Michigan in 1997:

[B]y focusing on outcomes we do more than fulfill our moral obligation... we force ourselves to use scarce resources wisely; to develop objective standards that we can use to demand accountability; and to put ourselves in a position to achieve even better results in the future.⁽⁸⁾

Endnotes

7. Note that the target participation rate will always be lower than the fraction of the caseload that is mandated to participate, because some recipients will inevitably be between activities, or noncompliant.

8. Remarks by: Donna E. Shalala, Secretary of Health and Human Services, Fedele and Iris Fauri Memorial Lecture, University of Michigan, Ann Arbor, Michigan, September 18, 1997.

Appendices

Appendix A: Literature Review: The Use of Outcome-Based Performance Measures in Welfare and Workforce Development Programs

Introduction

Public programs focused on improving the employment levels and earnings among economically disadvantaged groups have witnessed an increasing use of outcome-based measures to determine program success. These programs use measures focusing on "results" to gauge program success and to hold public agencies accountable for achieving certain goals. The Personal Responsibility and Work Opportunity Reconciliation Act of 1996 (PRWORA) - which changed the nature of the welfare system by devolving program responsibility to the states, enacting restrictions on the amount of time individuals can receive assistance, and requiring recipients to engage in work quickly - required the U.S. Department of Health and Human Services (HHS) to reward states for the success of their cash assistance and welfare-to-work programs (known as the Temporary Assistance for Needy Families (TANF) program) based on their performance on a range of outcome-based measures.

The workforce development system also uses outcome-based performance measures to determine program success. The Workforce Investment Act of 1998 (WIA) - which consolidates and streamlines a range of employment and training services for economically disadvantaged individuals - requires the state and local workforce development agencies that operate the program to meet specific outcome-based performance measures and provides incentives to do so.⁽¹⁾ The WIA performance measurement system builds on the one developed under the Job Training Partnership Act (JTPA). Administered by the U.S. Department of Labor, JTPA was the principal federal job training program prior to WIA for economically disadvantaged youth and adults, dislocated workers, and others who faced significant barriers to employment.

This paper describes the experiences of programs designed to improve the employment prospects and earnings of economically disadvantaged adults - particularly welfare-to-work programs - in using outcome-based performance measures. To provide context, the paper begins with a review of the literature on the goals and defining elements of performance measurement systems. Next, the paper identifies issues that are critical to address when developing and using outcome-based performance measures in welfare-to-work programs. It also reviews studies that have described and assessed the use of outcome-based performance measures in the welfare and workforce development systems at the federal level and then turns to a discussion of similar initiatives at the state level. The paper concludes with a discussion of the lessons drawn from these experiences.

Defining Performance Measurement

An outcome or results-oriented system for measuring program success represents a shift from traditional approaches to accountability, which typically involve tracking inputs and processes. While laws like PRWORA and WIA require the use of outcome-based performance measures, many other human service programs are also increasingly using this type of accountability system. This change in emphasis stems in part from the Government Performance and Results Act (GPRA) enacted by Congress in 1993. Seeking to promote improved government performance and greater confidence in public programs, GPRA established a government-wide requirement for agencies to identify agency and program goals and to report on their results in achieving those goals. The increasing use of performance measures in all types of human service programs has prompted a number of researchers to examine the goals and the defining elements of measures of program performance - commonly known as performance measurement systems.

The Urban Institute (1980) defines performance measurement systems as the regular collection and reporting of program information in three areas: efficiency, quality, and effectiveness. According to Martin and Kettner (1996), measuring the efficiency of a welfare-to-work program, for instance, involves assessing the amount of service provided and the number of clients completing the program and comparing these measures against the costs involved. Measuring quality involves assessing the nature of the services provided, with the aim of maximizing the quality of those services in relation to program inputs. Measures of effectiveness focus on outcomes - also referred to as results or accomplishments of programs - such as the number of individuals who find jobs through an employment program. As described more extensively below, in both the welfare and workforce development systems, an emphasis has been placed on measuring the effectiveness of programs rather than their efficiency or quality. This is usually what is meant by outcome-based performance measurement.

Studies have also explored how performance measurement systems can be used to fulfill a variety of purposes (Bartik, 1996; Behn, 1991; Brown and Corbett, 1997; Hatry, 1999; U.S. Department of Health and Human Services, 1994). Several goals of performance measurement systems have been recognized, each designed to make programs publicly accountable for their operations:

To assure basic fiscal integrity in the expenditure of public funds and their use for specified, authorized activities. This includes ensuring funds are spent on appropriate individuals or services.
To provide for sound management of program services by supplying information on program operations and outcomes. Performance measurement systems can provide information to program managers regarding whether the program is achieving critical program objectives beyond fiscal accountability, such as who is served in the program, what types of services are provided, and the results of the program's efforts. This information can be used to inform management and policy decisions.
To monitor program outcomes. Performance measures provide information that allows higher levels of government or program managers to track how well a program is doing in achieving specific goals or objectives.
To motivate better performance by managers and line workers associated with the program. Some performance measures are designed to motivate staff to achieve particular goals that may be important to the program.

These goals are not mutually exclusive, but different goals may require different types of performance measures (Bartik, 1996). For example, to identify how to improve the effectiveness of programs, the measures selected have to be an accurate gauge of the program's effectiveness and may need to be linked to information on operational strategies so that it is known why particular approaches are effective. In contrast, to motivate local offices and staff, measures must be timely and understandable and linked to the allocation of resources. This indicates a need for a variety of different performance measures, particularly if the system has multiple goals.

Over the years, a range of terms has been used to describe the different types of performance measures used to gauge program success. These terms are often used interchangeably, although they have varying connotations and meanings. Some studies have tried to achieve consensus on useful ways of defining performance measurement-related terms, particularly for use in welfare-to-work and other employment programs (Brown and Corbett, 1997; Hatry, 1999; Martin and Kettner, 1996; Midwest Welfare Peer Assistance Network, 1999; U.S. Department of Health and Human Services, 1994).

Outcome measures. Outcome performance measures focus on the results of a policy or program and are generally related to the goals the program hopes to achieve. In most cases, these measures focus on the outcomes for a group of individuals involved in the program. In welfare-to-work programs, key outcome measures include job placement rates, employment retention rates, or wage rates. Further, some types of outcome measures - such as General Educational Development (GED) certificate attainment rates - are referred to as "interim outcome measures" because they represent an important milestone even though they may not be the ultimate goal of the program.
Process measures. Process measures address administrative or operational activities of the program. These types of measures usually reflect the "means" to getting to an end result rather than the goal itself. Examples of process measures include participation rates reflecting the type and level of service received through the program, the percentage of applications for assistance which are acted upon in a timely manner, and the percentage of cases in which the cash benefits are calculated accurately.
Indicators. Indicators are measures of behavior, status, or condition that can be tracked over time and across people. Examples of indicators include state marriage and divorce rates or poverty rates. Indicators typically track the behavior or situations of broad population groups.
Performance standards. These are numerical "goals" or standards established for a performance measure, such as a 70 percent employment rate or a 25 percent participation rate.

Clearly, performance measurement systems can be used to meet a variety of goals, and performance can be measured in different ways. Given these issues, careful consideration must be given to the design of performance measurement systems that use outcome measures. There are also issues that are specific to the design of outcome-based performance measures in welfare-to-work programs; these are discussed below.

Issues in Using Outcome-Based Performance Measures in Welfare-to-Work Programs

Outcome-based performance measures - particularly measures that use the employment and earnings of program participants to gauge program success - are increasingly common among welfare and workforce development programs. However, studies have highlighted a number of issues that need to be addressed when these types of measures are used (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994). These issues - each discussed in more detail below - include an inconsistent relationship between outcomes and program effectiveness, a need to ensure that measures are fair and equitable, the possibility of unintended consequences, and the problem of multiple goals. Some of these issues stem from questions that research is not able to answer at this time, while others are due to a growing body of evidence that suggests the inherent challenges of designing outcome-based performance systems for welfare-to-work programs.

Inconsistent relationship between outcomes and program effectiveness

One concern regarding the use of outcome-based performance measures to reflect program success is that specific measures that are commonly used - such as increasing employment rates or earnings - often do not accurately measure the "added value" of the programs (Bartik, 1996; Barnow, 1999; U.S. Department of Health and Human Services, 1994).

Research on welfare caseload dynamics has shown the natural movement of welfare recipients on and off welfare (Bane and Ellwood, 1983; Pavetti, 1993). The findings show that a large proportion of welfare recipients exit welfare after relatively short periods of time, while a substantial minority remain on welfare for longer spells. Some of those who leave do so due to employment; others leave for reasons related to marriage, remarriage, or further changes in their personal or economic situation; and still others leave for reasons that are not known. The studies also show that, in most cases, a large majority of those who leave welfare do so on their own, without either the benefit of an employment program or the requirement to participate in one. This movement off welfare and into employment represents what might be called a baseline or "natural" outcome unrelated to the operations of a welfare-to-work program.

The role of an employment program is not necessarily to achieve high outcome rates but to add to the outcomes that would normally occur. To be judged successful, a program must exceed or "add value to" the natural outcome rates. A program could do so in a number of ways. It could move people into jobs and off welfare more quickly than they otherwise would have moved (for example, through job search activities). Or it could help people get jobs who would not otherwise have gotten them, such as by providing job training or specialized services for clients facing difficult barriers (e.g., domestic violence, substance abuse). The strict enforcement of participation requirements may also cause some individuals to leave cash assistance rather than participate.

Evaluations of welfare-to-work programs have found that there is not a strong correlation between the "value-added" by the employment program and the attainment of high outcomes on employment-related measures. The 1994 Report to Congress by the U.S. Department of Health and Human Services examined random assignment studies of the welfare-to-work programs that operated in the 1980s. These evaluations were specifically designed to measure the "added value" or impact of programs targeted at welfare recipients in increasing the earnings and reducing the welfare dependency of those referred to an employment program (the program group) compared to an identical group of individuals (the control group) who did not receive program services. This review found that programs that performed well on specific outcome measures related to moving people into jobs and off welfare did not necessarily have greater success in terms of program impacts than those that did not perform well on the measures.

More recent data from the National Evaluation of Welfare-to-Work Strategies (NEWWS) (formerly the JOBS evaluation) confirm this finding (Freedman et al., 2000). This study included random assignment studies of welfare-to-work programs in seven sites; for simplicity, results from five sites are discussed here.⁽²⁾ As Table 1 shows, even though Columbus had the highest employment rate (50.2 percent), the "added value" of its program (3.5 percentage points) was lower than that of the other sites in the evaluation. Moreover, the outcomes in Portland, which had a substantially higher "added value," were similar to those of several of the other sites. Thus, in this case, outcomes do not serve as a good "proxy" for added value, and an assessment of the relative effectiveness of the programs based solely on outcomes would have been mistaken.

Table 1.
Employment Outcomes and "Added Value" in NEWWS Sites
County	Longitudinal Participation Rate	Percent Employed After Two Years: Program Group	Percent Employed After Two Years: Control Group	"Added Value" (Difference)
Atlanta	73.8%	42.8%	38.5%	+4.4%
Columbus	52.1%	50.2%	46.7%	+3.5%
Grand Rapids	69.0%	47.2%	43.1%	+4.1%
Portland	61.1%	46.2%	35.3%	+10.9%
Riverside	43.8%	31.3%	27.1%	+4.2%

This table also shows that participation rates - a process measure - are likewise not good proxies for program impacts. The participation rates shown in Table 1 are longitudinal measures, which report the proportion of individuals who participated in the program at least one day within a two-year period. They are calculated differently from the monthly participation rates required under JOBS and TANF⁽³⁾ (which report the number of individuals who participate a certain number of hours per week each month). These results show that the program with the highest participation rate, Atlanta (73.8 percent), had an added value very similar to that of the site with the lowest participation rate, Riverside (43.8 percent).

Table 2 shows the outcomes and "added value" for a different measure - earnings over a two-year period - with similar results. This table shows that outcomes can sometimes be correlated with the level of added value. The Portland program group achieved both a high level of earnings ($7,133) and the highest level of added value ($1,842). However, the relationship is not consistent. Columbus had earnings similar to Portland's ($7,569), but its added value was dramatically lower ($677). In addition, the site with the lowest earnings, Riverside ($5,488), had the second highest added value ($1,276). As with employment rates, longitudinal participation rates are not correlated with the added value of programs on earnings measures.

Table 2.
Earnings Outcomes and "Added Value" in NEWWS Sites
County	Longitudinal Participation Rate	Average Total Earnings Over Two Years: Program Group	Average Total Earnings Over Two Years: Control Group	"Added Value" (Difference)
Atlanta	73.8%	$5,820	$5,006	$813
Columbus	52.1%	$7,569	$6,892	$677
Grand Rapids	69.0%	$5,674	$4,639	$1,035
Portland	61.1%	$7,133	$5,291	$1,842
Riverside	43.8%	$5,488	$4,213	$1,276
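The comparison drawn from Tables 1 and 2 can be reproduced with a short calculation. The sketch below is illustrative only: the figures are transcribed from the two tables, while the function names and the choice of a simple Pearson correlation are assumptions made for this example rather than methods used in the NEWWS evaluation. With only five sites, any correlation computed this way is descriptive, not a statistical test. Differences recomputed from the rounded table entries can also deviate slightly from the published "added value" column (for example, Atlanta's employment figures give 4.3 rather than 4.4).

```python
from math import sqrt

# Figures transcribed from Tables 1 and 2 (NEWWS, two-year follow-up).
# Each entry: (longitudinal participation rate %, program group, control group).
EMPLOYMENT = {
    "Atlanta": (73.8, 42.8, 38.5),
    "Columbus": (52.1, 50.2, 46.7),
    "Grand Rapids": (69.0, 47.2, 43.1),
    "Portland": (61.1, 46.2, 35.3),
    "Riverside": (43.8, 31.3, 27.1),
}
EARNINGS = {
    "Atlanta": (73.8, 5820, 5006),
    "Columbus": (52.1, 7569, 6892),
    "Grand Rapids": (69.0, 5674, 4639),
    "Portland": (61.1, 7133, 5291),
    "Riverside": (43.8, 5488, 4213),
}


def added_value(table):
    """Program-group outcome minus control-group outcome, per site."""
    return {site: round(prog - ctrl, 1) for site, (_, prog, ctrl) in table.items()}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient (illustrative choice of statistic)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def summarize(table):
    """Relate added value to the outcome level and to the participation rate."""
    impacts = added_value(table)
    sites = list(table)
    gains = [impacts[s] for s in sites]
    return {
        "added_value": impacts,
        "r_outcome": pearson([table[s][1] for s in sites], gains),
        "r_participation": pearson([table[s][0] for s in sites], gains),
    }


if __name__ == "__main__":
    for label, table in (("Employment", EMPLOYMENT), ("Earnings", EARNINGS)):
        result = summarize(table)
        print(label, result["added_value"])
        print("  r(outcome, added value):", round(result["r_outcome"], 2))
        print("  r(participation, added value):", round(result["r_participation"], 2))
```

Ranking the sites by the program-group column and then by the recomputed difference makes the point in the text concrete: the site with the highest outcome is not the site with the highest added value on either measure.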

Research on workforce development programs has found similar results. Barnow (1999) found a weak correspondence between program impacts and measured performance in the JTPA program. In examining the 16 sites in the National JTPA Evaluation (evaluated using a random assignment design), this study found that the relationship between program performance on employment-related measures and program impact was positive but statistically insignificant.

This evidence suggests that a system of performance measurement that focuses on outcomes may not necessarily lead programs to increase their added value. Rather, it could reward the substantial amount of normal employment activity by welfare recipients rather than the program's added value: programs with higher (or lower) outcome rates overall may simply reflect the higher (or lower) natural outcomes. However, controlled evaluations, which are the best way to measure program impacts, are generally too expensive and time-consuming to rely on for ongoing feedback and monitoring of programs.

Developing fair and equitable measures

As discussed above, there is a natural rate at which welfare recipients find jobs with no assistance from employment programs. Studies have found that this natural rate is due to the influence of several factors over which state and local managers have little control, including the state's economic conditions and the demographics of the welfare caseload (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994). An important dimension of performance measurement systems is holding states accountable for performance that is within their control, not for factors for which they can be expected to have little or no responsibility.

Different state and local welfare-to-work programs operate within significantly different labor markets and under economic conditions that are diverse and highly variable. This may have a significant effect on the outcomes produced by the state or local program. For example, because there are fewer jobs available, a program operating in a depressed economy may place fewer recipients in jobs than one functioning in a booming labor market. In this case, the economy, not the effectiveness of the employment program, may be a key factor in explaining the difference between the states' results.

States and localities also have different and changing welfare caseloads in terms of overall size, demographics, and other local factors. For example, some states have a relatively high proportion of very disadvantaged recipients (e.g., those lacking educational credentials or employment histories) on their cash assistance caseload. Because it is more difficult to employ this group, a state in this situation could appear less successful, based on a job-related outcome measure, than a state that served a more job-ready population. In this case, a state's performance would, at least in part, be driven by the composition of the cash assistance caseload.

Other important factors that affect the outcomes of a state's employment program are the cash assistance benefit levels and income disregard policies. Cash assistance benefit levels for a family of three with no income range from $120 per month in Mississippi to $923 per month in Alaska (Gallagher et al., 1997). Earnings disregards - the amount of income disregarded when calculating the benefit level - also vary, from small, decreasing disregards in some states to disregards of all income in others as long as the family is below the poverty line. As a result, some states will perform better on certain outcome measures - such as the number of individuals who leave cash assistance due to employment - because of their grant level and earnings disregard policies rather than because of their program's performance. While states could change their benefit levels and earnings disregards so that they fared better on certain types of performance measures, that is not the goal of a performance measurement system. (See below for a discussion of unintended consequences.) Instead, the goal is to develop measures that treat states equitably regardless of their benefit and earnings disregard levels.

These findings suggest that it is important to recognize the role that uncontrollable factors can have on performance measures and to develop mechanisms that ascribe differences in outcomes to the right factors. This process of ensuring that standards are fair and equitable across states is known as "leveling the playing field." Later sections of this paper discuss some of the mechanisms that have been developed to address this issue, including regression-adjusting measures to reflect differences in economic conditions and welfare caseloads, standards of performance that are negotiated to reflect local conditions, and using measures of improvement rather than absolute levels of performance.
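One of the mechanisms named above, regression adjustment, can be sketched in a few lines. The example below is a deliberately simplified illustration, not the adjustment model used under JTPA or any other program: it fits a single-factor line relating a job entry rate to the unemployment rate and treats each state's distance from that line as its adjusted performance. All state names and figures are hypothetical and invented for this sketch, and the function names are assumptions. A working model would typically include several economic and caseload factors.

```python
# HYPOTHETICAL figures for illustration only; not drawn from any state's data.
# Each entry: (unemployment rate %, job entry rate %).
STATES = {
    "State A": (3.0, 40.0),
    "State B": (5.0, 36.0),
    "State C": (7.0, 30.0),
    "State D": (9.0, 29.0),
}


def fit_line(xs, ys):
    """Ordinary least squares for one explanatory variable: y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


def adjusted_performance(states):
    """Actual outcome minus the outcome predicted from economic conditions alone."""
    xs = [unemployment for unemployment, _ in states.values()]
    ys = [rate for _, rate in states.values()]
    slope, intercept = fit_line(xs, ys)
    return {
        name: rate - (intercept + slope * unemployment)
        for name, (unemployment, rate) in states.items()
    }


if __name__ == "__main__":
    adjusted = adjusted_performance(STATES)
    for name, (unemployment, rate) in STATES.items():
        print(f"{name}: raw {rate:.1f}%, adjusted {adjusted[name]:+.1f} points")
```

In this made-up data, the state with the lowest raw job entry rate operates in the weakest labor market and ends up with the highest adjusted score, which is the kind of reordering that "leveling the playing field" is meant to allow.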

Unintended consequences

Another issue that can be encountered in developing outcome-based performance measures is the creation of unintended consequences (Barnow, 1999; Bartik, 1996; U.S. Department of Health and Human Services, 1994). This means that an unintended behavior is created when trying to achieve a certain result. The most prevalent example of this in the world of employment and training programs is known as "creaming." When only the outcomes for clients enrolled in the program are considered, programs can enhance their performance on employment and earnings measures by serving those clients who are most "job ready" and who, with minimal program assistance, are most likely to become employed on their own. Programs may also have a disincentive to focus on the hard-to-serve clientele who may actually be more in need of the services provided by the program because it would affect their ability to achieve a certain level on performance measures.

There may be additional unintended or adverse consequences as well. For example, a focus on caseload reduction can lead to incentives to divert recipients from receiving assistance or to lower grant levels or earnings disregards so individuals leave assistance more quickly when they find jobs.

Multiple program goals

A final issue to address in developing an outcome-based system of performance measurement grows out of the relatively broad purposes of welfare-to-work programs. While the overall goal is to move individuals into employment and/or off cash assistance, there are a number of objectives that could be pursued to achieve this goal. For example, some programs may emphasize finding "better jobs" or jobs with benefits or higher wages while others may emphasize moving individuals quickly into jobs regardless of their wage level. Moreover, some programs may use their TANF programs to reduce poverty, for example by providing income support for needy families as long as they remain poor even if this means the families receive assistance for a longer period. Others may view self-sufficiency and reduced dependency as the primary goal and reduce the level of support provided in order to make work a necessity.

Studies of welfare programs in the 1980s found that, depending on the objectives administrators identify and the service and management strategies they adopt, programs move in substantially different directions with different results. For example, one study found that a range of programs was successful depending on the specific measures used: some achieved high employment rates, others had larger earnings gains, and still others were more cost-effective (U.S. Department of Health and Human Services, 1994). However, achievement of one outcome did not necessarily correlate with achievement on the other outcomes. Thus, it is important to take care in identifying and promoting one particular program objective over another.

Federal-Level Experiences in Using Outcome-Based Performance Measures in the Welfare and Workforce Development Systems

In spite of the potential challenges in using outcome-based performance measures in welfare-to-work programs, at the federal level, both welfare and workforce development programs have increasingly focused on the use of these measures to gauge program success and effectiveness. This section discusses the evolution of each of these outcome-based performance systems and, to the extent possible, discusses how they have addressed the specific challenges discussed above. In addition, the paper examines similarities as well as differences between these two systems.

The Use of Outcome Measures in the Welfare System

Welfare-to-work programs administered by the U.S. Department of Health and Human Services are increasingly shifting toward using outcome-based performance measures to gauge program success. The welfare-to-work program that preceded the TANF program, the Job Opportunities and Basic Skills Training (JOBS) program, did not explicitly require the federal government to establish outcome-based performance measures. Rather, this program relied on two process measures (participation rates and the proportion of funds spent on long-term welfare recipients) as its primary measures of program accomplishments. Some studies found that the JOBS program was not sufficiently focused on employment, in part due to the nature of its performance measurement system (U.S. General Accounting Office, 1994, 1995).

The legislation governing the JOBS program did require the U.S. Department of Health and Human Services to develop recommendations for outcome-based measures. In its 1994 Report to Congress (U.S. Department of Health and Human Services, 1994), HHS developed a timeframe for developing these measures with measures to be put in place in 1996. However, the passage of PRWORA in 1996 superseded these plans. Possible performance measures mentioned in the 1994 Report to Congress included: percent of the cash assistance caseload that received aid for more than a specified period, the JTPA performance measures (see below), increases in employment and earnings of program participants after leaving the JOBS program, and retention of JOBS participants in unsubsidized employment.

The statute governing the TANF program contains more explicit guidance concerning the development of outcome-based performance measures. Like the JOBS statute, PRWORA also detailed participation rates states are required to meet. However, it also required HHS to develop a "high performance bonus" to reward states based on their success in attaining the goals of the act and to distribute a bonus to reward states based on their success in reducing out-of-wedlock births. For the high performance bonus, the law gave HHS, working with the states, discretion over what measures should be used. Congress was much more specific regarding the performance measure for out-of-wedlock births.

For the initial three years of the high performance bonus, HHS developed interim guidance that included outcome-based performance measures that reflect states' performance in moving individuals from welfare to work (U.S. Department of Health and Human Services, 1998, 1999, and 2000). The guidance included four key work measures for the high performance bonus: (1) the job entry rate; (2) the success in the workforce rate (includes measures of both job retention and earnings gains); (3) the increase in the job entry rate; and (4) the increase in the success in the workforce rate. States use quarterly Unemployment Insurance (UI) records and other administrative data to calculate these measures. Bonuses are awarded to the ten states with the best performance on each measure.
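The ranking logic described above can be sketched in a few lines. This is only an illustration under stated assumptions: the state names and rates are invented, improvement is taken as a simple difference in percentage points between two years, and ties are broken alphabetically. The interim guidance itself defines how each rate is computed; none of those details are reproduced here.

```python
def top_states(scores, n=10):
    """Return the n states with the highest scores, best first.

    Ties are broken alphabetically (an assumption made for this sketch).
    """
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [state for state, _ in ranked[:n]]


def improvement(current, prior):
    """Change in a measure between two years, in percentage points."""
    return {s: current[s] - prior[s] for s in current if s in prior}


# Hypothetical job entry rates (percent) for three invented states.
prior_rate = {"State A": 30.0, "State B": 45.0, "State C": 38.0}
current_rate = {"State A": 41.0, "State B": 47.0, "State C": 39.0}

best_level = top_states(current_rate, n=1)
best_gain = top_states(improvement(current_rate, prior_rate), n=1)

print("Highest job entry rate:", best_level)
print("Largest increase in job entry rate:", best_gain)
```

In this invented example the state with the highest rate and the state with the largest gain differ, which is the reason for pairing each level measure with an improvement measure.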

In the final rule for bonuses to be awarded in FYs 2002 and 2003, HHS retained the work measures (but changed the data source to the Federal Parent Locator Service/National Directory of New Hires) and added a measure on family formation and stability (using Census Bureau data), and three measures of states' success in supporting work and self-sufficiency by providing low-income working families with health insurance (using data submitted by the states), food stamps (using Census Bureau data), and child care assistance (using Census Bureau data and data submitted by the states). Awards totaling $200 million per year will be made for bonus years 1999-2003.

The TANF program also includes a bonus to decrease out-of-wedlock births. For this bonus, the five states with the largest decrease in the ratio of out-of-wedlock births to total births (which also have a reduction in their abortion rates) will receive a bonus.⁽⁴⁾ A total of $100 million per year is available for this bonus.
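One possible reading of this rule is sketched below. It assumes that states are first screened for a decrease in the ratio and a reduction in the abortion rate, and that the top five are then chosen from the states that pass. The order of those two steps, the data layout, and all figures are assumptions made for illustration, not the statutory procedure.

```python
def bonus_states(prior, current, n=5):
    """Pick up to n states with the largest decrease in the ratio of
    out-of-wedlock births to total births, among states whose abortion
    rate also fell.

    prior/current map state -> (out_of_wedlock_births, total_births,
    abortion_rate). Screening before ranking is an assumption.
    """
    decreases = {}
    for state in prior:
        p_births, p_total, p_abortion = prior[state]
        c_births, c_total, c_abortion = current[state]
        decrease = p_births / p_total - c_births / c_total
        if decrease > 0 and c_abortion < p_abortion:
            decreases[state] = decrease
    return sorted(decreases, key=lambda s: (-decreases[s], s))[:n]


# Invented figures for three hypothetical states.
prior = {"A": (300, 1000, 20.0), "B": (350, 1000, 18.0), "C": (400, 1000, 25.0)}
current = {"A": (280, 1000, 19.0), "B": (300, 1000, 19.0), "C": (390, 1000, 24.0)}

winners = bonus_states(prior, current)
print(winners)
```

In this example, state B has the largest decrease in the ratio but is excluded because its abortion rate rose.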

While it is too early to assess the effects of the performance measurement system for the TANF program, it is important to note that the welfare system includes a number of mechanisms to deal with the potential issues in using outcome-based measures discussed above. By including measures based on program improvement, the TANF program adjusts somewhat, although imperfectly, for the lack of a level playing field. Using a program improvement measure allows states that may be facing difficult economic conditions or serving a difficult caseload, and that would not otherwise receive a bonus, to obtain an award. The system also uses a range of measures, including job placement, job retention, and earnings progression, to gauge success. This provides some opportunity for programs with different or multiple goals to compete for a bonus. In addition, the work measures for the high performance bonus include both working adults who leave TANF and those who remain on TANF. This dual focus reduces the impact of state program design and payment standards on state performance.

Finally, because the measures are based on the performance of all cash assistance recipients, the likelihood of "creaming," i.e., serving only the most employable welfare recipients, is reduced. The TANF participation rates, which also require participation by a broad segment of the welfare population, also serve to counterbalance the potential creaming effect of the measures.

The Use of Outcome Measures in the Workforce Development System

The WIA (and its predecessor JTPA) program, administered by the U.S. Department of Labor (DOL), provides a range of employment-related services to different types of disadvantaged individuals including adults, welfare recipients, and youth. The JTPA program, in particular, has had extensive experience using outcome-based performance measures to gauge the success of its employment and training programs. Because of this longer experience, more studies have been conducted on both the experiences and effects of the JTPA performance measurement system than on performance measurement within welfare programs. This section describes the JTPA and WIA performance measurement systems, as well as findings on the effectiveness of these systems.

The JTPA Program

Several studies have examined the experience of the JTPA program in developing and using outcome-based performance measures (Bartik, 1994; Barnow, 1999; Dickinson and West, 1988; Zornitsky and Rubin, 1988). When JTPA was enacted in 1982, the legislation included specific requirements for outcome-based performance standards. As described by Barnow (1999), the JTPA system had two primary goals: to monitor how well the state and local levels of government were performing in achieving the goals and objectives of the law and to improve performance by giving program operators incentives to achieve these goals and objectives.

Under JTPA, the U.S. Department of Labor was responsible for determining the performance measures for the local Service Delivery Areas (SDAs), the entities that operated the program at the local level. The primary role of states was to decide how bonus money should be distributed among the SDAs and how any performance-based sanctions should be imposed. In addition to the federally-set performance measures, states could propose supplementary performance measures to be used for allocating bonuses.

The JTPA performance measurement system had four core measures for adults and relied on survey data collected from program participants to calculate these measures. (Administrative data were used to compute performance on two youth measures.) SDAs were expected to meet or exceed performance standards; specific thresholds were set at the federal level. Based on the most recent program experiences for all SDAs, DOL set the standards for the core performance measures at levels where 75 percent of the SDAs would be expected to exceed these minimum performance expectations. The measures for adults are listed below; the national standards for 1996/97 are noted in parentheses (Barnow, 1999):

The adult follow-up employment rate, defined as the proportion of adult respondents who were employed at least 20 hours per week during the 13th week after termination (59 percent);
The adult follow-up weekly earnings, defined as the average weekly earnings for all adults who were employed for at least 20 hours per week during the 13th week after termination ($281);
The welfare adult follow-up employment rate, defined in the same manner as the adult follow-up employment rate but for adult welfare recipients only (50 percent);
The welfare adult follow-up weekly earnings, defined in the same manner as the adult follow-up weekly earnings but for adult welfare recipients only ($244).
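Setting a standard that 75 percent of SDAs would be expected to exceed amounts to placing it near the 25th percentile of recent SDA outcomes. The sketch below illustrates only that idea. The eight employment rates are invented, and the use of Python's default quartile method is an assumption; it is not a description of DOL's actual procedure.

```python
import statistics


def standard_exceeded_by_75_percent(sda_outcomes):
    """Return the first quartile of SDA outcomes, i.e. a threshold that
    roughly three-fourths of SDAs exceed."""
    return statistics.quantiles(sda_outcomes, n=4)[0]


# Hypothetical follow-up employment rates (percent) for eight SDAs.
rates = [48, 52, 55, 59, 61, 63, 66, 70]

standard = standard_exceeded_by_75_percent(rates)
share_exceeding = sum(rate > standard for rate in rates) / len(rates)

print("Standard:", standard)
print("Share of SDAs exceeding it:", share_exceeding)
```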

The JTPA performance measurement system included both monetary rewards and programmatic sanctions for SDAs that exceeded or failed to meet the performance standards. States were given control over which SDAs received positive and negative incentives. Up to five percent of JTPA funds were set aside to be used by states to reward SDAs that exceeded the performance standards. SDAs that exceeded the standards could receive additional funding, and activities undertaken with those funds could be exempted from performance standards. Thus, good performance provided SDAs with more flexibility to try new approaches or to serve more at-risk groups (Barnow, 1999). On the negative side, programs that failed to meet the standards set for them in two consecutive years were subject to reorganization by the Governor (meaning the program could be restructured or restaffed).

In its early years of implementation, the JTPA performance measurement system was criticized because it promoted creaming and other unintended consequences (Barnow, 1992; Bartik, 1994; Zornitsky and Rubin, 1988). Unlike TANF, each SDA had some control over who was enrolled in program services and participation in the program was voluntary. This resulted in a stronger potential for creaming. Studies found that while the creaming tendencies were not universal across all SDAs, the standards did result in a focus on the less disadvantaged in some localities. Dickinson and West (1988) found that the JTPA performance standards did not prevent SDAs that had a strong commitment to serving hard-to-serve groups from targeting and serving those groups. In addition, Heckman et al. (1996) found that JTPA case workers accepted the least employable applicants into the program in spite of their effect on performance standards, in part because they preferred to assist the most disadvantaged clients. However, Dickinson and West (1988) also found that those SDAs that had the strongest focus on meeting the standards were also less likely to serve disadvantaged groups.

In addition to enrolling the most advantaged among those eligible for program services, the JTPA system was also thought to be encouraging SDAs to offer low-cost services (an early JTPA performance measure included measures of program costs per program terminee), such as job search assistance, rather than more intensive services such as long-term training. Concerns were raised because more intensive and longer-term training was believed to have a greater impact on earnings in the long run (Barnow, 1999).

In response to these issues, the 1992 amendments to JTPA required states to adjust their performance standards to reflect differences in economic conditions and in the demographic characteristics of the program participants in each SDA (this had previously been at the discretion of the state). States were allowed to use DOL adjustment factors or an alternative procedure approved by DOL.⁽⁵⁾ In addition to providing a mechanism to level the playing field, it was intended that these adjustments would give states incentives to serve more disadvantaged individuals (Barnow, 1999). The amendments also prohibited any performance measures based on costs.
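The general idea of adjusting a standard for local conditions can be sketched as a linear shift away from a national standard. Everything below other than the 59 percent national standard cited earlier is hypothetical: the factor names, the weights (which in a real model would come from a regression), and the national averages. It is not the DOL adjustment model itself.

```python
def adjusted_standard(national_standard, weights, local_factors, national_means):
    """Shift a national standard by weighted differences between local
    conditions and national averages (a generic linear adjustment)."""
    adjustment = sum(
        weights[name] * (local_factors[name] - national_means[name])
        for name in weights
    )
    return national_standard + adjustment


# Hypothetical weights: each point of unemployment lowers the expected
# employment rate by 0.8 points; each point of participants without a
# high school diploma lowers it by 0.1 points.
weights = {"unemployment_rate": -0.8, "pct_no_diploma": -0.1}
national_means = {"unemployment_rate": 5.0, "pct_no_diploma": 30.0}
local_factors = {"unemployment_rate": 8.0, "pct_no_diploma": 40.0}

local_standard = adjusted_standard(59.0, weights, local_factors, national_means)
print(round(local_standard, 1))
```

An SDA facing higher unemployment and serving a more disadvantaged group is held to a lower standard in this sketch, which is how an adjustment of this kind could reduce the incentive to cream.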

To some extent these efforts appear to have mitigated creaming in the JTPA program. While local SDAs did not always understand how the adjustment model would affect their performance on the measures (Barnow, 1992; Zornitsky and Rubin, 1988), Dickinson and West (1988) found, in a study done before the adjustment model was mandatory, that the SDAs that did use the adjustment model significantly increased the percentage of disadvantaged groups served.

Overall, the effect of the outcome-based standards on program performance in JTPA is mixed. As noted above, Barnow (1999) did not find a strong link between program effectiveness and performance on the JTPA standards. In addition, there appears to be considerable variation in the extent to which the performance standards influenced the local SDAs. Dickinson and West (1988) found, for example, that about 42 percent of the SDAs they studied tried to maximize their measured performance, one-fourth tried only to exceed their standards slightly, and about one-third tried merely to meet their standards in order to avoid program sanctions.

The variation in how SDAs responded to the performance standards may in part be due to the decentralized nature of the JTPA performance system. Because some states rewarded only the best performers while others distributed funds broadly among all SDAs that met or exceeded standards (Barnow, 1999), SDAs faced differing levels of financial incentives. In addition, because reorganization is such an extreme measure, states often used their discretionary authority to modify the standards so that poor performers would not fail in two consecutive years (Barnow, 1999). The fact that few SDAs actually faced the most severe penalties could have also influenced their response to the measures.

Workforce Investment Act

Building on the system developed under its predecessor, the WIA statute also places a strong emphasis on outcome-based performance standards. WIA requires that a comprehensive performance accountability system be developed with the following components: a focus on results defined by "core indicators" of performance; measures of "customer satisfaction" with programs and services; a strong emphasis on the continuous improvement of services; annual performance levels and improvement plans developed during negotiations with federal, state, and local partners; and awards and sanctions based on state and local performance.

The WIA performance system continues some aspects of the JTPA system but with some critical differences (DOL, 1999(a); DOL, 1999(b); DOL, 2000). Table 3 provides a comparison of the performance measures used under the two systems. The performance measures for adults include: entry into unsubsidized employment; retention in unsubsidized employment six months after entry into employment; earnings gains in unsubsidized employment six months after job entry; and attainment of a recognized credential in relation to the achievement of educational or occupational skills, by those who enter into unsubsidized employment. Customer satisfaction will be measured based on the responses of both program participants and employers. States may also develop additional indicators of performance. Unlike JTPA, which relied on a survey of program participants, WIA requires states to use quarterly wage records (UI data) to compute performance on employment-related measures because of the reduced cost associated with data collection.

Table 3
Comparison of JTPA and WIA Performance Standards
JTPA Performance Measures
Percent of adults employed for at least 20 hours per week 13 weeks after program exit
Average weekly earnings for those employed at least 20 hours per week 13 weeks after program exit
Percent of welfare recipients employed for at least 20 hours per week 13 weeks after program exit
Average weekly earnings for welfare recipients employed at least 20 hours per week 13 weeks after program exit

WIA Performance Measures
Percent who enter into unsubsidized employment
Retention rates in unsubsidized employment six months after entry into employment
Earnings gains in unsubsidized employment six months after job entry
Attainment of a recognized credential in relation to the achievement of educational or occupational skills, by those who enter unsubsidized employment
"Customer" (i.e., program participants and employers) satisfaction

WIA requires that the expected levels of performance on each core indicator be negotiated between the Department of Labor and individual states, an approach that is different from the SDA-level standards used in JTPA. The agreed-upon level of performance for each state must reflect how it compares with other states (taking into account differences in economic conditions, participant characteristics, and the proposed service mix and strategies). Each local workforce investment area can negotiate with the state and reach agreement on the local level of performance expected on each core indicator, taking similar factors into account.

Like JTPA, WIA has an incentive system with both rewards and sanctions. Rather than providing incentive grants to states, if a state fails to meet the adjusted levels of performance in two consecutive years, the state allocation can be reduced by up to five percent. The Department of Labor is required to award an incentive grant to each state that exceeds its performance levels for WIA (as well as those required for the Vocational and Applied Technology Education Act (Perkins Act)). States must set aside part of their allocation to provide incentive grants (or bonuses) to localities, at the discretion of the Governor. Localities that fail to meet the core indicators of performance for two consecutive years may be required to reorganize.

Because the population that can be served under WIA is broader than that served under JTPA, WIA does not require states to use a statistical model to adjust their performance standards. Instead, WIA levels the playing field by providing for negotiated performance standards at both the state and local level. Among the factors which must be considered in the negotiations process are how the levels compare to other state or local programs, taking into account such factors as economic and demographic characteristics and service design. (Other factors include promoting continuous improvement in performance measures and attaining a high level of customer satisfaction.) Statistical analyses may be taken into account as part of these negotiations. In addition, by including a relatively broad range of measures to gauge program performance, including customer satisfaction and credential attainment, and by allowing states to add additional measures, the WIA system has some accommodation for programs with different goals.

Overall, the welfare and workforce development performance measurement systems have some common elements. The most striking similarity is the type of core measures used: both systems use measures based on the job placement rate, earnings progression, and job retention (at least for a preliminary period under TANF). Both systems also rely primarily on a bonus system for rewarding states on the selected outcome-based performance measures.

However, there are several key distinctions between the two systems. First, TANF combines the bonuses for achievement of outcome-based performance measures with penalties based on process measures, such as the work participation rate requirements. WIA links financial penalties as well as bonuses to the performance measures. Second, the new WIA system relies on standards that are negotiated at the federal and state levels. In addition, the WIA/JTPA system also allows adjustments to performance expectations based on economic conditions and demographic characteristics. In contrast, the welfare system relies on overall rankings of states and measures of improvement. Finally, compared to TANF, the workforce development system is more uniformly decentralized. Under WIA, awards and sanctions apply to both states and localities and the state plays a major role in how funds are distributed within the state. Under TANF, states have the discretion to decide if funds should be distributed at all to local agencies.

State Experiences in Developing Outcome-Based Performance Measurement Systems

As discussed in the previous section, the federal government has played a strong role in the development of outcome-based performance standards for both the welfare and workforce development systems. In addition to these federal efforts, however, states have also become increasingly involved in developing their own performance measurement systems. This section discusses state efforts to build these systems, which range from very broad efforts that cover a range of human service programs to more narrow efforts focused on welfare or workforce development programs.

State initiatives to develop their own performance measurement systems are in part due to changes at the federal level that transfer additional responsibilities to states and local units of government. This devolution of responsibility poses new challenges for public administrators, who not only have to manage programs but also maintain and expand existing information systems needed to monitor program performance.

The design and use of performance measurement systems vary widely by state. One study (Horsch, 1996(a)) developed several useful dimensions for describing state efforts to build these systems.

Cross-sector vs. sector-specific models. A few states are developing cross-sector outcome-based systems that cover many or all of a state's programs focused on children and families. This approach typically begins with a statewide strategic plan and statewide goals cutting across a number of different agencies and programs. Many more states have begun to develop systems as sector-specific efforts, such as those focused on employment and training programs or programs for youth.
Level of participation. The participation of political appointees, agency personnel, and citizens in the design of the system also varies among the states. Some states have chosen an inclusive participatory process incorporating the viewpoints of people from a number of different areas, including citizens and line staff. For other states, the planning process involves state agency-level staff.
Type of collaboration. Collaboration may occur across government agencies, between state and local entities, or both. At the state level, some have reorganized their agencies by bringing together those that focus on the achievement of similar results, while others have established formal or informal bodies to coordinate performance measurement systems. The extent and the nature of the involvement of localities also vary: some may be involved in the development of the state effort, others may define their own results within a framework established at the state level, and others may allow locally-determined as well as state measures.
Use of results. State systems also differ in the nature of the results they identify and the process used to articulate the important results they wish to achieve. Some states have developed comprehensive models on how results will be identified, collected, and reported. Other states have begun developing their systems only by identifying, collecting, and reporting those measures which meet their immediate needs. Still others have identified a comprehensive set of measures, but are not currently collecting and reporting all of them because of time, data, or resource constraints.
Types of measures. Measures can be articulated at the child/family/community level, the agency level, or the program level. The level of the measures generally dictates who is responsible for achieving the results.

There are clearly a number of different approaches at the state level for developing outcome-based performance measures. At one end are comprehensive, cross-sector systems which focus on establishing indicators or benchmarks of progress toward certain goals. Rather than measuring outcomes associated with specific programs (such as welfare-to-work or WIA programs), these states use measurable social goals, such as reducing poverty or lowering the teen birth rate, to monitor the overall effectiveness of public programs, to set goals, and/or to coordinate efforts across agencies (Brown et al., 1997; Yates, 1997).

Oregon developed the first and most comprehensive effort (Dyer, 1996; Lewis and Dunkle, 1996; Popovich, 1996), although Florida, Minnesota, and Vermont have embarked on similar efforts.⁽⁶⁾ As summarized in Chart 1, Oregon has put substantial energy into reorganizing government activities at all levels to take a more goals-oriented approach. This was done through a multi-year collaborative process involving stakeholders at all levels, including state agencies, local community leaders, the business community, and citizens. While public agencies were not consolidated as part of this effort, individual agencies use the indicators to guide policy and contracting decisions.

While some states have developed these relatively broad performance measurement systems, others focused more specifically on welfare and/or workforce development programs. For example, in Ohio, the state is giving counties block grants to administer the TANF programs and using some of the same performance measures as those used at the federal level on TANF to gauge success (Yates, 1997). Counties that exceed performance standards in the areas of a work participation rate and out-of-wedlock births receive greater funding flexibility and earn financial incentives.

Wisconsin established performance standards for its private contractors that administer the state's TANF program. Contractors are required to meet a base level of performance to remain in compliance with the contract requirements and receive bonuses for exceeding the standards. Standards are established for: employment rates, wage rates, job retention, the provision of appropriate basic educational services (to certain types of individuals), the availability of employer-provided health insurance, and basic skills/job skills attainment (an optional measure for bonuses only).

A study of state welfare-to-work performance systems completed before the 1996 welfare law found that over half had statewide performance standards (GAO, 1995). Most of these states used employment rates as the key performance measure, although some also used other measures such as wage rates, job retention, and education and training achievement. Another study found state variation in whether uniform performance standards for welfare-to-work were set for local offices or negotiated with the state, whether they were adjusted using JTPA standards, and whether rewards and sanctions were significant or relatively informal (APWA, 1994).

Other states are developing performance measures and standards that focus more specifically on workforce development programs. Almost half the states are using outcome-based performance measures and standards to assess and monitor performance across the entire workforce development system, including JTPA/WIA, the Employment Service (ES), vocational education, and others (Hyland, 1997; Simon, 1998). Some of these efforts include TANF programs as well. As an example, the California legislature enacted a statewide system to evaluate the performance of all its publicly-funded workforce development programs, including JTPA (now WIA), TANF, the Employment Service, vocational education, community colleges, and rehabilitation programs (State of California, 1999) (see Chart 1 for details).

Some states sought to effect change to their workforce development system through a common policy framework for designated programs established by an interprogram or interagency team (North Carolina and Illinois). Others achieved this goal by reorganizing their programs into a single administrative entity (Texas and Iowa).

Overall, many states are moving forward on developing outcome-based performance measures. A few states have developed relatively comprehensive systems that track social indicators rather than the outcomes of specific programs. However, others are more narrowly focused on welfare or workforce development programs. These efforts generally complement the development of outcome-based measures occurring at the federal level.

Chart 1
Examples of State Outcome-Based Performance Measurement Systems
Oregon	Oregon's results-based accountability system for children and families is part of a comprehensive statewide system. With the goal of creating a more results-oriented, decentralized governance system, Oregon first developed a strategic vision (known as Oregon Shines) with a task force of over 150 individuals identifying broad goals for the state. The state then established through a broad participatory process a total of 92 specific benchmarks to measure whether the state was making progress in achieving these goals (known as Oregon Benchmarks). Benchmarks were established in key policy areas related to the strategic vision including education and workforce development, reducing welfare dependency and increasing self-sufficiency, and protecting the health of children. Indicators for measuring these benchmarks were then developed and goals set for each indicator. Specific goals include reducing the welfare caseload from 40,000 to 33,000 through self-sufficiency efforts; reducing the percentage of children living in poverty from 11 percent to 6 percent; and reducing the first time demand for public assistance among young adults by decreasing rates of teen pregnancy, teen drug use, and juvenile crime and increasing school graduation and placement rates. Oregon Option, an effort designed to break down barriers and facilitate cooperation between levels of government and across agencies and to promote an outcome-based approach to planning, was also a key component of this state's effort.
California	California has a number of separate but inter-related efforts to track the well-being of children and families. The state tracks indicators in several areas including education, health, and workforce development.
In the area of workforce development, the California legislature enacted in 1996 a statewide system to evaluate the performance of all its publicly-funded workforce programs including JTPA (now WIA), TANF, ES, vocational education, community colleges, and rehabilitation programs. With a goal of being fully operational by 2001, the outcome measures in this system include: rate of employment, length of employment retention, earnings before and after program participation, rate of change in Unemployment Insurance (UI) status, rate of change in the number of individuals moving from tax receiver to tax payer, and rate of advancement to higher education.

Endnotes

1. WIA mandates the consolidation of specific employment and training programs administered by the Departments of Labor (DOL; the Job Training Partnership Act (JTPA) and the Employment Service), community colleges, other vocational and adult education providers, vocational rehabilitation providers as well as employment and training activities provided through the Community Services Block Grant, the Department of Housing and Urban Development, and the Veterans Administration. Welfare-to-work activities provided under TANF are not required to be part of the workforce development system, but they may be, and in many states and localities are, included. However, grantees under the Welfare-to-Work (WtW) program administered by DOL are mandatory partners under WIA.

2. In four of the sites (Atlanta, Columbus, Grand Rapids, and Riverside), random assignment occurred to two different program groups. Results from only one of the programs in each of these sites are included here. In Atlanta, Grand Rapids, and Riverside, results for the Labor Force Attachment programs (an approach stressing quick entry into the labor market) were included on the table rather than the Human Capital Development program (an approach stressing an investment in upfront education and training). In Columbus, results for the Traditional Case Management program (an approach that used separate staff to provide income eligibility and JOBS case management responsibilities) rather than the Integrated program (an approach that combined the responsibility of income eligibility and case management functions into one worker) were included. Results for the programs not included on the table were similar to those that were and would not have changed the overall rankings on outcomes or impacts. In addition, results from Detroit and Oklahoma were not included; including these sites would not have changed the overall conclusions drawn from the table.

3. Data on participation rates in the NEWWS evaluation were obtained from Hamilton, et al., 1997 (Atlanta, Columbus, and Grand Rapids); Scrivner, et al., 1998 (Portland) and Brock and Hartnett, 1998 (Columbus).

4. In FY 1999, the states that received bonuses were Alabama, California, the District of Columbia, Massachusetts, and Michigan.

5. Adjustment factors used in the JTPA system included: percent of participants who were female, percent age 55 or more, percent high school dropout, percent African-American, percent cash assistance recipient, percent long-term cash assistance recipient, percent basic skills deficient, percent with disabilities, percent lacking significant work history, percent not in the labor force, the local employment rate, the three-year growth rate of earnings in retail and wholesale trade, and annual earnings in retail and wholesale trade.

6. For more information on Oregon and other states developing and using comprehensive indicators to measure program performance, see Brown, et al., 1997.

Appendix B: Summary of the Minutes of the Consultation on Alternative Outcome Measures, July 21, 1999

Background

In 1994, the Department of Health and Human Services issued a report on the use of outcome measures to evaluate welfare-to-work programs. This report was issued in response to a Congressional mandate to study alternatives to the work activity participation rates required under the Job Opportunities and Basic Skills Training (JOBS) program. The 1994 report on outcome-based performance measures was generally supportive of the concept of outcome measures, but expressed concern about possible unintended consequences. The 1996 welfare reform act mandated a further examination of outcome-based measures for evaluating states' success in moving individuals from welfare to work, in light of the dramatic program changes the law put in place. In recent years, outcome-based performance measurements have become more common generally. Many states have adopted their own internal outcome-based performance measurement and incentive systems. A total of 47 states submitted data in 1999 in order to compete for the first year of the High Performance Bonus.

Participation rates were initially designed to introduce a sense of reciprocal obligation into a welfare system that, heretofore, had been based on issuing benefits; henceforth, at least some portion of the caseload would be required to do something in exchange for receiving benefits. With the advent of time limits, participation rates also provided a means of ensuring that clients received services that would promote self-sufficiency.

Today's environment could not have been foreseen when the welfare reform law passed in 1996:

Most (almost two-thirds in 1997 and 70 percent in 1998) of the welfare recipients who count toward the participation rate are working in unsubsidized employment, rather than participating in other activities such as job readiness or training. This is, in part, a result of state decisions to increase earnings disregards, which allow individuals to combine work and welfare for longer periods.
Although many states have achieved impressive participation rates, in general state participation rate targets are much lower than initially had been expected, due to the caseload reduction credit.
Nonetheless, it may become harder for states to achieve the participation rates over time for three reasons: the minimum participation rate increases each year until 2002; the number of hours of participation required in order to count toward the rate also increases; and as the most job-ready participants leave welfare the share of the caseload that is harder-to-serve (with greater barriers to employment) appears to be increasing.

Brainstorming on Measures

On July 21, 1999, the Department hosted a consultation meeting with representatives from states and research and advocacy organizations to initiate our formal study and analysis of potential outcomes measures for evaluating the success of the states in moving individuals out of the welfare system through employment, as an alternative to the minimum participation rates. Consultation participants discussed TANF's overall goals related to work that should be advanced through performance measures and, in turn, some specific measures that might be used to promote those goals without creating perverse incentives. A wide range of potentially desirable outcomes were identified through a brainstorming process, both from the narrower perspective of targeting employment successes solely, and from the broader perspective of recognizing success in meeting any of the purposes of TANF. Potential measures fell roughly into the following broad categories:

work participation and employment;
poverty and movement to self-sufficiency;
requirements for two-parent families;
duration of welfare receipt;
caseload reductions;
child outcomes;
supportive services;
customer satisfaction; and
educational outcomes.

The majority of the discussion focused on the first two of these categories.

Work Participation and Employment

There was general agreement among consultation participants that there is interest in what is happening to both current participants and leavers. However, many technical concerns were raised:

Due to differences across state TANF programs, leavers are not the same population from state to state. Therefore, it is unfair to compare leaver populations across states. Does it make sense to have the base be all recipients at a given point in time?
How should multiple exits and entries within a specified period be treated? Should we exclude people who return to welfare?
In considering possible measures, discussants were unclear what treatment should be measured. In the Workforce Investment Act (WIA), the treatment is services. What is the TANF treatment: receipt of cash assistance?
If employment were a measure, what should count as employment? Would one dollar of earnings in a quarter constitute employment?

Most participants agreed that, in an ideal world, employment stability and earnings progression are also outcomes of interest along with job entry, but that there are some concerns regarding the selection of specific measures. They discussed the problems of selecting an appropriate time frame for measuring employment stability and wage progression. There was general consensus that outcome measures lose their effectiveness as incentives if there is too great a lag between the time the actions are taken and the calculation of outcome measures, whether tied to penalties or bonuses. Yet, it is unrealistic to expect significant wage progression in a short period of time, and job retention is more meaningful over longer periods of time. Six months was suggested as a compromise by one researcher, but some states argued that it was important to use the same timeframe (13 weeks) used for performance measurement under WIA.

Poverty and Movement to Self-Sufficiency

Consultation participants were sharply divided on the issue of whether poverty would be an appropriate measure. Researchers and advocates almost universally argued in favor of a measure of poverty and income, while state representatives argued against it on the grounds that it is not reasonable to expect TANF agencies to solve the poverty problem. Everyone recognized that AFDC never served more than a fraction of the poor population. It was generally concluded that a measure of extreme poverty (under 50 percent of the poverty level) would be more likely to capture changes caused by TANF than the basic poverty rate.

Two Perspectives

Following the brainstorming process to identify desirable outcomes and potential measures, attendees were divided into two groups, one consisting of state representatives and the other consisting of researchers and advocates. Each group discussed the measures proposed in the brainstorming session and identified its most and least favorite measures. The results are summarized in the table below.

There was an overall broad consensus that, in a program with as many different objectives as TANF, no single measure could adequately capture all of the variety of possible state actions. In general, researchers and advocates were, therefore, inclined toward a wide range of measures. Mindful of the administrative burden involved, state representatives supported the idea of a menu of possible measures from which states could choose the ones under which they wish to compete.

Researchers and advocates supported the inclusion of a broad population measure, such as the rate of extreme child poverty. They argued that, since TANF is not an entitlement, it is essential to include a measure that reflects states' choices of whom to serve under TANF, not just their success with the population served. By contrast, state representatives strongly opposed such measures. They argued that population measures are driven by many factors over which state TANF agencies have little or no control. Researchers and advocates agreed that it would be important to adjust the measures to reflect the different circumstances and demographics in each state. Such adjustments might include statistical models (such as under the Job Training Partnership Act (JTPA) performance measurement system), negotiated standards (such as under the Tribal TANF program), or measures that reflect improvements over time.

	Most Favorite Measures	Least Favorite Measures
State Representatives	Limited set of core measures such as those in High Performance Bonus (HPB) with additional measures possible at state option; Progression along the poverty continuum; Percent of those required to work who have earnings; Recidivism (would need to avoid counting administrative churning); Increasing what counts as participation	Anything that would require additional data collection by the states; Anything over which TANF has no control, e.g., broad population anti-poverty measures; Customer satisfaction; Two-parent work participation rate; Process measures
Researchers and Advocates	Multiple measures in order to reflect the wide range of possible goals under TANF; Labor market success; Broader population-based measure of participation; Extreme child poverty; Supportive services	Caseload reduction; Two-parent work participation rate; Child support enforcement (CSE) measures (covered sufficiently in CSE incentives)

There was a great deal of ambivalence about the appropriate role of process measures. On the one hand, states thought federally established process measures were inappropriate for a block grant such as TANF. (There was much more interest in process measures at the state level as a management tool for program administrators.) At the same time, state representatives were deeply concerned about being evaluated using outcome measures over which they did not have full control. There was some support expressed for a two-stage system, under which the federal government would look first at outcomes and consider process measures only for those states that had not met their outcome performance goals. Such an approach would allow high-performing states to operate under reduced federal oversight, while states that did all the right things but had poor outcomes due to events beyond their control would be protected from penalties. Participants generally agreed that there was a continued role for a participation rate measure, but that the list of activities that count as participation should be expanded to give states credit for engaging recipients in education and other activities such as mental health services or substance abuse treatment, when needed.

States generally opposed the addition of measures for which there are no baseline data to indicate what is a reasonable level of performance. This was due both to a natural concern about how standards would be set in the absence of such data (the two-parent family participation rate was cited as an example of an unreasonable standard) and to a sense that outcome measures lose their effectiveness as incentives if states do not know whether they have a reasonable chance at a bonus or if they are in danger of a penalty. This is particularly true for a rank-based system, such as the current High Performance Bonus, since states have little idea of how their performance compares to that of their peers.

States were uniformly concerned about the burden of data collection. They recommended that any new measures be based on either national survey data or on the administrative data that states are already required to collect and report under TANF. They also urged that performance measures under TANF be made consistent with those under other related programs, particularly the workforce development programs under WIA. They asked that no changes be made to data collection requirements at least until reauthorization, so that they could stabilize their systems.

The question of performance penalties versus performance bonuses was discussed at some length. State representatives generally supported bonuses rather than penalties. Some basic principles were enunciated:

A core set of the most critical measures should have both penalties and bonuses attached.
It is more appropriate to give bonuses than to impose penalties when states have less control over outcomes.
Caution is needed when considering penalties for measures that are not highly accurate. In particular, thresholds are bad in a "noisy" data system, because narrow real differences can have large consequences. (The TANF requirement for higher state spending levels (MOE) when participation requirements are not met was cited as an example.)
It is essential to develop reasonable standards if penalties are to be assessed. They should be based on baseline data, but not on a simple national average.
Penalties typically result in an adversarial relationship between the states and the federal government. Historically, when sanctions have been imposed, legal battles have followed, resulting in high costs in both money and time for both sides. Is this worth it?

There was a general sense that in order for bonuses to have incentive effects, all states should be able to compete. This argues for systems where all states can receive awards if they achieve at the appropriate level, as opposed to systems where only a small number of top performers receive awards. Some participants argued for rating states based on improvement, rather than absolute performance, but others opposed this, saying that in the long run it was unrealistic to expect continual improvement. Others suggested that adjustments to reflect state economic circumstances and demographics could help resolve this problem, or that states could be organized into comparable groups and then ranked against their peers.

List of Participating States and Organizations

Participating States

Colorado
Connecticut
District of Columbia
Florida
Georgia
Illinois
Louisiana
Maine
Maryland
Massachusetts
Michigan
Minnesota
Nevada
New Jersey
New York
Ohio
Oklahoma
Oregon
Pennsylvania
Rhode Island
South Carolina
Texas
West Virginia

Participating Organizations

American Federation of State, County and Municipal Employees
California Welfare Directors Association
Children's Defense Fund
Census Bureau
Center on Budget and Policy Priorities
Center for Law and Social Policy
Domestic Policy Council
Johns Hopkins University, Institute for Policy Studies
Mathematica Policy Research, Inc.
National Association of Counties
National Association on Welfare Research and Statistics
National Organization for Women, Legal Defense and Education Fund
National Center for Children in Poverty Research Forum
National Governors Association
Office of Management and Budget
The Urban Institute
U.S. Department of Agriculture, Food and Nutrition Service
U.S. Department of Health and Human Services, Health Care Financing Administration
U.S. Department of Labor

Appendix C: Participation Requirements for Welfare Recipients Under the JOBS and TANF Programs

Introduction

One of the major elements of the welfare reform act is the requirement that states mandate adults to participate in welfare-to-work programs as a condition of their receiving cash assistance. The Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA) of 1996 requires states to meet new and more stringent participation requirements for adults receiving assistance through the Temporary Assistance for Needy Families (TANF) program. The TANF participation requirements replace those established by the Job Opportunities and Basic Skills Training (JOBS) program created by the Family Support Act of 1988.

Participation requirements in welfare-to-work programs require states to involve a certain portion of their welfare population in welfare-to-work activities or face financial penalties. In JOBS and TANF, participation in welfare-to-work programs is measured by a monthly participation rate, a ratio that represents the number of individuals who participate in the welfare-to-work program (the numerator) out of all those expected to participate (the denominator). The stringency of the participation requirement depends on both the level of the rate states are expected to meet and how the numerator and denominator of the rate are defined.
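
As a rough illustration of the ratio described above, the following Python sketch computes a monthly participation rate from a numerator and a denominator. The function name and the sample counts are hypothetical and are not drawn from the statute or from state data; the sketch also ignores the detailed rules on who counts in each term.

```python
def monthly_participation_rate(participants: int, expected: int) -> float:
    """Return the participation rate as a percentage.

    participants: individuals counted as participating (the numerator).
    expected: individuals expected to participate (the denominator).
    """
    if expected <= 0:
        raise ValueError("denominator must be positive")
    if participants < 0 or participants > expected:
        raise ValueError("numerator must be between 0 and the denominator")
    return 100.0 * participants / expected


# Hypothetical month: 300 of the 1,200 adults expected to participate did so.
rate = monthly_participation_rate(300, 1200)
print(f"{rate:.1f} percent")  # prints "25.0 percent"
```

Holding the required rate fixed, a narrower denominator or a broader definition of countable activities in the numerator makes the same requirement easier to meet.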

To provide an understanding of participation mandate changes between the JOBS and TANF programs, this paper provides a comparison of the participation requirements under the two programs. The paper explores the key ways in which the measurement of participation as expressed by the monthly participation rate varies in these two programs. The first section of the paper examines the proportion of the welfare caseload states are expected to involve in welfare-to-work activities under JOBS and TANF. The second section explains how the denominator of the rate is defined under the two programs; this is followed by a discussion of how the numerator of the rate is defined. Finally, the paper discusses factors that may affect a state's ability to achieve the expected levels of participation.

What Rates are States Required to Meet?

The statutes governing the JOBS and TANF programs established target participation rates that states are required to meet. Both programs established an overall rate (which includes both single-parent and two-parent families) and a separate rate for two-parent families. As shown in Table 1, both programs had target participation rates that increased over time; however, the target participation rates in the JOBS program were substantially lower. The target overall rate in JOBS increased from 7 percent in Fiscal Year (FY) 1990 to 20 percent in FY 1995, while in TANF it started at 25 percent in FY 1997 and increases to 50 percent for FY 2002. The differences in the rates between the two programs are not as dramatic for the two-parent rate. In JOBS, the two-parent rate increased from 40 percent in FY 1994 to 75 percent in FY 1998⁽¹⁾, while in TANF this rate started at 75 percent in FY 1997 and increased to 90 percent in FY 1999.

Table 1
Required Participation Rates in the JOBS and TANF Programs
	JOBS Program		TANF Program
Year	Overall	Two-Parent Families	Overall	Two-Parent Families
Year 1	7%	40%	25%	75%
Year 2	7%	50%	30%	75%
Year 3	11%	60%	35%	90%
Year 4	11%	75%	40%	90%
Year 5	15%	75%	45%	90%
Year 6	20%	None	50%	90%

In both programs, states are subject to financial penalties if they do not meet the required participation rates. In JOBS, a state's federal matching rate for the JOBS program could be reduced if it failed to meet the participation requirements. Under TANF, states face up to a five percent reduction in their TANF block grant and an increase in the maintenance of effort requirement (from 75 percent to 80 percent) for not meeting these requirements.

While the statute governing the TANF program sets target participation rates, there is another provision in this law that affects the actual rate each state is expected to meet. This provision, known as the caseload reduction credit, allows states to reduce their participation rates based on declines in their TANF caseload. Specifically, participation rates are reduced by one percentage point for each percentage point that a state's welfare caseload in the last fiscal year is below FY 1995 levels. Both the overall and the two-parent rate can be reduced by the caseload reduction credit; however, caseload decreases due to changes in federal law or in state-defined eligibility criteria do not count toward the credit. For example, a state that experienced a ten percent caseload decline between FY 1995 and FY 1997 would be required to meet an overall participation rate of 15 percent (rather than 25 percent) for FY 1997.
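
The arithmetic of the credit can be sketched in a few lines of Python. This is a simplified illustration, not the regulatory calculation: the function name and caseload figures are hypothetical, the sketch assumes the adjusted target cannot fall below zero, and it does not exclude caseload declines that stem from changes in federal law or state eligibility criteria.

```python
def adjusted_target(target_rate: float,
                    base_caseload: float,
                    recent_caseload: float) -> float:
    """Return the participation target after the caseload reduction credit.

    target_rate: statutory target rate, in percent.
    base_caseload: caseload in the base year (FY 1995 in the text).
    recent_caseload: caseload in the most recent fiscal year.
    """
    if base_caseload <= 0:
        raise ValueError("base caseload must be positive")
    # Percentage decline relative to the base year.
    decline_points = 100.0 * (base_caseload - recent_caseload) / base_caseload
    # Assumption: a caseload increase earns no credit and does not raise the target.
    credit = max(0.0, decline_points)
    # Assumption: the adjusted target is floored at zero.
    return max(0.0, target_rate - credit)


# The example from the text: a ten percent decline against a 25 percent target.
print(adjusted_target(25.0, 100_000, 90_000))  # prints 15.0
# A hypothetical 40 percent decline against a 30 percent target.
print(adjusted_target(30.0, 100_000, 60_000))  # prints 0.0
```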

To date, the caseload reduction credit has had a significant effect on the participation rates states are required to meet. Table 2 shows the "adjusted target" for the overall rate (the participation rate that takes the caseload reduction credit into account) for each state in FY 1997 and FY 1998. As shown, as a result of the caseload reduction credit, in FY 1998, some states such as Indiana, Wisconsin, and Wyoming had their overall participation rate reduced to zero. Others, such as Alaska and Hawaii, had only a small reduction in the FY 1998 target participation rate of 30 percent.

Table 2.
TANF Overall Participation Rates, FY 1997-99
(in percents)
State	FY1997		FY1998		FY1999
	(Target Rate = 25 Percent)		(Target Rate = 30 Percent)		(Target Rate = 35 Percent)
	Adjusted Target	Rate Achieved	Adjusted Target	Rate Achieved	Adjusted Target	Rate Achieved
Alabama	17.1	42.3	5.0	38.9	0.0	37.4
Alaska	NA	NA	26.8	42.5	16.8	46.0
Arizona	16.1	26.9	8.7	30.2	0.0	32.1
Arkansas	NA	NA	16.6	19.4	6.0	23.7
California	19.5	20.6	17.7	36.6	8.5	42.2
Colorado	NA	NA	7.5	28.7	0.0	36.4
Connecticut	20.3	58.4	21.5	41.4	19.7	47.4
Delaware	NA	NA	9.4	26.2	0.0	24.9
DC	21.0	31.3	20.1	22.8	13.9	26.7
Florida	16.4	28.4	5.8	34.5	0.0	31.6
Georgia	18.7	20.6	6.1	29.3	0.0	17.6
Guam	NA	NA	30.0	12.4	35.0	16.1
Hawaii	NA	NA	28.1	30.0	23.1	41.1
Idaho	NA	NA	4.2	28.6	0.0	43.7
Illinois	NA	NA	13.6	37.7	6.1	60.4
Indiana	5.6	25.3	0.0	29.9	0.0	33.3
Iowa	14.9	52.8	9.1	56.9	4.7	54.8
Kansas	14.1	33.3	1.9	41.3	3.9	57.3
Kentucky	20.3	33.1	16.3	39.3	5.8	38.1
Louisiana	13.4	13.5	2.0	29.2	0.0	29.4
Maine	19.3	41.6	15.1	45.6	5.9	54.9
Maryland	16.3	18.3	3.1	12.7	7.2	11.2
Massachusetts	12.6	31.5	7.3	29.0	0.9	27.8
Michigan	13.3	41.1	5.2	49.2	0.0	43.8
Minnesota	NA	NA	17.0	30.6	13.7	36.9
Mississippi	16.3	17.2	3.7	25.2	0.0	27.0
Missouri	17.6	23.2	10.4	24.1	2.2	28.2
Montana	19.2	49.5	7.2	78.3	0.0	92.3
Nebraska	20.5	34.0	20.6	36.2	19.7	34.7
Nevada	20.0	31.2	6.0	34.5	1.1	34.8
New Hampshire	13.3	36.1	5.5	37.3	0.0	29.9
New Jersey	19.5	20.7	14.7	26.5	7.1	30.3
New Mexico	NA	NA	8.5	15.9	0.0	27.6
New York	19.6	27.9	15.0	37.5	8.3	36.3
North Carolina	15.1	25.9	10.0	14.5	0.0	16.0
North Dakota	NA	NA	10.7	31.5	0.8	31.7
Ohio	15.6	38.3	11.6	44.9	1.4	53.7
Oklahoma	11.6	27.8	0.0	35.2	0.0	42.9
Oregon	10.2	96.7	0.0	98.2	0.0	96.7
Pennsylvania	NA	NA	9.9	19.3	0.9	16.2
Puerto Rico	NA	NA	17.1	6.8	12.3	20.7
Rhode Island	NA	NA	19.3	27.5	22.0	28.8
South Carolina	18.4	38.9	19.0	42.7	6.7	44.7
South Dakota	20.4	44.8	11.2	39.2	1.6	46.5
Tennessee	20.3	38.6	2.0	43.2	0.0	41.1
Texas	14.6	19.4	5.2	25.2	0.0	27.3
Utah	13.7	39.6	2.5	39.8	2.2	44.0
Vermont	NA	NA	NA	NA	NA	NA
Virgin Islands	NA	NA	27.7	15.5	19.7	11.5
Virginia	15.6	17.3	6.8	27.5	0.0	41.1
Washington	22.0	24.0	21.1	48.5	12.9	40.3
West Virginia	15.0	18.3	19.2	33.4	0.0	25.6
Wisconsin	8.0	52.8	0.0	64.0	0.0	80.1
Wyoming	16.0	52.6	0.0	55.3	0.0	57.7
National Average		28.1		35.4		38.3

Table 3 provides this information for the two-parent rate. As with the overall rate, the caseload reduction credit has had a significant impact on the two-parent rates in some states. While some states still have to meet relatively stringent rates, in FY 1998, 10 states had their participation requirement for two-parent families cut to 10 percent or less (compared to a target rate of 75 percent).

Table 3.
TANF Two-Parent Participation Rates, FY 1997-99
(in percents)
State	FY 1997		FY 1998		FY 1999
	(Target Rate = 75 Percent)		(Target Rate = 75 Percent)		(Target Rate = 90 Percent)
	Adjusted Target	Rate Achieved	Adjusted Target	Rate Achieved	Adjusted Target	Rate Achieved
Alabama	35.8	31.5	NA	NA	NA	NA
Alaska	NA	NA	68.6	36.8	66.8	44.8
Arizona	66.1	68.8	53.7	76.6	48.9	88.4
Arkansas	NA	NA	57.8	20.3	61.0	10.5
California	68.0	24.5	32.7	36.2	36.9	54.3
Colorado	NA	NA	15.1	25.7	44.9	41.2
Connecticut	70.3	90.6	66.5	73.2	NA	NA
Delaware	NA	NA	54.4	23.7	NA	NA
DC	48.2	14.0	30.1	22.5	23.3	19.5
Florida	NA	NA	NA	NA	NA	NA
Georgia	NA	NA	NA	NA	NA	NA
Guam	NA	NA	75.0	13.0	90.0	10.7
Hawaii	NA	NA	NA	NA	NA	NA
Idaho	NA	NA	0.0	22.5	20.6	44.0
Illinois	NA	NA	45.0	77.7	46.2	92.4
Indiana	35.1	35.2	20.1	32.8	32.3	41.4
Iowa	64.9	48.5	51.4	53.6	49.0	55.5
Kansas	44.0	33.7	23.2	44.2	56.9	64.9
Kentucky	50.8	51.9	37.5	52.4	17.6	46.6
Louisiana	5.7	15.4	0.0	38.1	33.5	43.1
Maine	61.7	50.5	35.3	49.9	23.6	51.0
Maryland	NA	NA	NA	NA	NA	NA
Massachusetts	52.1	71.1	44.6	73.3	55.9	66.4
Michigan	60.3	47.4	38.4	63.9	15.2	69.1
Minnesota	NA	NA	42.5	30.8	68.7	43.6
Mississippi	54.0	24.4	1.2	70.4	39.7	87.5
Missouri	21.7	47.8	0.0	34.9	0.0	29.8
Montana	69.2	82.1	52.2	86.4	45.2	87.0
Nebraska	61.2	42.0	53.1	39.5	74.7	53.8
Nevada	45.8	39.2	31.7	58.7	56.1	69.6
New Hampshire	21.9	46.0	1.6	44.6	8.0	31.6
New Jersey	55.8	25.2	NA	NA	NA	NA
New Mexico	NA	NA	35.6	16.8	54.1	29.3
New York	61.1	63.6	38.5	58.8	23.1	58.4
North Carolina	56.2	45.5	55.0	30.9	45.1	30.3
North Dakota	NA	NA	NA	NA	NA	NA
Ohio	47.8	39.8	49.2	51.5	35.1	65.4
Oklahoma	36.9	16.1	4.2	31.4	NA	NA
Oregon	46.8	98.3	9.8	95.2	38.5	96.1
Pennsylvania	NA	NA	26.3	21.8	20.7	24.9
Puerto Rico	NA	NA	NA	NA	NA	NA
Rhode Island	NA	NA	51.1	32.4	77.0	94.7
South Carolina	24.9	40.2	48.5	60.9	61.7	78.1
South Dakota	NA	NA	NA	NA	NA	NA
Tennessee	46.0	52.4	4.6	39.1	5.2	44.3
Texas	55.4	34.3	47.9	44.3	48.2	61.0
Utah	63.7	64.1	47.5	49.7	NA	NA
Vermont	NA	NA	NA	NA	NA	NA
Virgin Islands	NA	NA	NA	NA	NA	NA
Virginia	65.6	19.4	51.8	26.5	NA	NA
Washington	66.0	18.6	52.2	45.5	47.8	55.3
West Virginia	61.3	49.0	46.8	37.2	51.1	25.9
Wisconsin	39.9	51.3	0.0	39.2	20.1	55.8
Wyoming	54.9	63.9	4.9	65.8	0.0	90.7
National Average		34.3		35.4		54.7

Tables 2 and 3 also show the participation rates states achieved during the initial years of the TANF program (FY 1997 and FY 1998) for the overall and the two-parent rate, respectively. Table 4 shows the rates achieved during the last year of the JOBS program (FY 1996). As shown, under both JOBS and TANF, all states met the participation requirement for the overall rate (adjusted by the caseload reduction credit under TANF) in the years examined in this report.⁽²⁾ However, under both JOBS and TANF, states experienced much more difficulty meeting the more stringent two-parent rates. Under JOBS, roughly half of the states (26) met the two-parent requirements. Under TANF, 16 states met their adjusted two-parent rate in FY 1997 and 28 states met this rate in FY 1998.⁽³⁾
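
A comparison of this kind can be sketched as follows, using a handful of FY 1997 two-parent rows copied from Table 3. The helper name is hypothetical, the sketch assumes that a state meets its requirement when the rate achieved is at least the adjusted target, and rows reported as NA are left undetermined rather than counted either way.

```python
from typing import Optional


def met_target(adjusted_target: Optional[float],
               rate_achieved: Optional[float]) -> Optional[bool]:
    """Return True/False if both figures are available, else None (NA)."""
    if adjusted_target is None or rate_achieved is None:
        return None
    # Assumption: meeting the target means achieving at least the adjusted rate.
    return rate_achieved >= adjusted_target


# (adjusted target, rate achieved) for FY 1997, two-parent rate, from Table 3.
fy1997_two_parent = {
    "Alabama": (35.8, 31.5),
    "Alaska": (None, None),  # reported as NA
    "Arizona": (66.1, 68.8),
    "California": (68.0, 24.5),
    "Connecticut": (70.3, 90.6),
}

results = {state: met_target(*row) for state, row in fy1997_two_parent.items()}
met = sorted(state for state, ok in results.items() if ok)
print(met)  # prints ['Arizona', 'Connecticut']
```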

Table 4.
JOBS Overall and Two-Parent Participation Rates, FY 1996
(in percents)
State	Overall Rate (Target Rate = 25 Percent) Rate Achieved	Two-Parent Rate (Target Rate = 60 Percent) Rate Achieved
Alabama	56.1	35.9
Alaska	NA	48.3
Arizona	49.4	55.4
Arkansas	20.1	29.3
California	27.1	39.0
Colorado	22.2	61.1
Connecticut	NA	84.6
Delaware	14.4	21.4
DC	NA	NA
Florida	81.8	100.0
Georgia	40.8	71.4
Guam	20.0	16.9
Hawaii	21.1	8.1
Idaho	53.0	61.7
Illinois	29.4	67.8
Indiana	22.5	34.9
Iowa	37.5	75.1
Kansas	33.0	60.1
Kentucky	31.8	90.0
Louisiana	33.5	77.8
Maine	35.4	74.7
Maryland	26.1	34.6
Massachusetts	33.5	65.3
Michigan	29.5	51.5
Minnesota	25.3	38.4
Mississippi	24.6	11.6
Missouri	37.7	67.2
Montana	30.2	65.5
Nebraska	73.2	51.9
Nevada	29.6	67.8
New Hampshire	68.1	58.6
New Jersey	26.7	66.7
New Mexico	36.4	67.8
New York	27.7	54.2
North Carolina	NA	18.0
North Dakota	52.2	75.6
Ohio	42.9	48.0
Oklahoma	29.6	24.6
Oregon	76.4	35.2
Pennsylvania	32.8	72.5
Puerto Rico	27.4	NA
Rhode Island	22.5	69.6
South Carolina	24.2	39.3
South Dakota	76.9	83.0
Tennessee	36.2	60.4
Texas	28.7	41.2
Utah	55.2	87.1
Vermont	22.0	60.6
Virgin Islands	14.8	NA
Virginia	34.8	23.7
Washington	65.8	49.7
West Virginia	26.1	25.8
Wisconsin	60.6	76.9
Wyoming	68.4	81.7
National Average	33.1	NA

Overall, while TANF requires higher levels of participation to meet both the overall and two-parent rate than required under JOBS, the participation requirement for TANF is moderated in many states by the caseload reduction credit. As explained in detail below, because of the differences in the way participation rates are defined under JOBS and TANF, it is very difficult to compare the participation rates achieved under JOBS to those achieved under TANF.

Who Must Participate?

A key issue in the calculation of participation rates is defining who is required to participate, that is, who is counted in the denominator of the participation rate. As discussed below, the TANF program requires participation from a broader segment of adults receiving assistance than was required under JOBS.

It is important to note that TANF funding covers both cash assistance and welfare-to-work programs and services and can be used for a wider range of cash and non-cash assistance than the cash assistance program (Aid to Families with Dependent Children (AFDC)) and job training program (JOBS) that it replaced. However, with regard to the participation requirements, the TANF regulations specify that only families receiving TANF "assistance" are included in the denominator of the participation rate. "Assistance" includes cash payments, vouchers, and other benefits designed to meet a family's ongoing needs, as well as supportive services provided to families who are not employed.⁽⁴⁾ Thus, the participation requirements in TANF generally apply to a population similar to the one covered under JOBS, i.e., those receiving cash assistance rather than other types of services.

The JOBS program mandated participation in program activities for adults in both two-parent and single-parent families receiving cash assistance, but only for a portion of these populations. Under JOBS, certain groups of individuals who received cash assistance, particularly those who were in circumstances that made participation more difficult, were "exempted" from the participation requirement. As summarized in Table 5, exemptions were made for those individuals who were: ill, incapacitated, or of advanced age (over age 60); needed in the home because of illness or incapacity of another family member; the caretaker of a child under age three (or, at state option, under age one); employed more than 30 hours per week; a dependent child under age 16 or attending an education program full-time; in the second or third trimester of pregnancy; or residing in an area where the (JOBS) program was not available. For two-parent families, the exemption due to the age of the child applied to only one parent.

The TANF program takes a broader approach to defining the participation mandate than the JOBS program. Unlike JOBS, the approach under TANF is to exempt as few individuals as possible from welfare-to-work activities. Thus, while TANF does allow certain individuals who receive assistance to be exempted from the work requirements, fewer exemptions are allowed than under JOBS. As shown in Table 5, the following exemptions are allowed in TANF:

States can exempt single parents with a child under age one from participation; these parents may be disregarded from the calculation of participation rates for up to 12 months.
A family that includes a disabled parent does not count in calculating the two-parent rate.
A state can waive participation requirements for individuals with a history of domestic violence. If a state fails to meet the required participation rates, the calculations will be adjusted to exclude these cases before a penalty is applied.⁽⁵⁾ So far, none of the states have required an adjustment to their overall or two-parent participation rate based on this provision.

Table 5
Exemptions from Welfare-to-Work Activities Under the JOBS and TANF Programs
Exemptions in the JOBS Program	Exemptions in the TANF Program
Illness; Incapacitated; Over age 60; Needed in the home because of illness or incapacity of another family member; Caretaker of a child under age three (or, at state option, under age one); Employed more than 30 hours per week; Dependent child under age 16 or attending an education program full-time; Women in the second and third trimester of pregnancy; Residing in an area where the program was not available	At state option, single parents with a child under age one; Disabled parents in two-parent families; Individuals with a history of domestic violence (if a state fails to meet the required participation rates, the calculations will be adjusted to exclude these cases before a penalty is applied)

TANF also allows exemptions from the participation requirements for states that had work program waivers in effect when PRWORA was enacted. States that had waivers for any of a number of work program provisions are allowed to continue their pre-TANF policies (if they are inconsistent with TANF) until the waiver expires. States that received waivers in one or more areas of their work program (specifically, allowable work activities, hours, exemptions, and/or sanctions for noncompliance) that are inconsistent with TANF are allowed to continue operating under their prior rules in all of these areas. Thus, states with pre-existing work program waivers can continue with the JOBS exemptions or other exemption policies they established through a waiver. States in these circumstances can exempt a broader range of families from welfare-to-work activities than would otherwise be allowed under TANF. In addition, one state (Vermont) claims that waiver inconsistencies exempt all cases from the participation rates.

Another issue regarding the denominator of the participation rate under TANF concerns the treatment of families who have been sanctioned. Under TANF, families who have had their TANF grants reduced due to a sanction are subtracted from the total number of families included in the denominator unless they have been sanctioned for more than three of the past 12 months.
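
The denominator rules described in this section can be summarized in a short sketch. The Python below is illustrative only and is not an official calculation: the `Family` record, its field names, and both function names are hypothetical. It covers only the rules stated above for the overall rate (receipt of assistance, the state-option disregard for single parents with a child under age one, and the sanction rule). Waiver-based exemptions and the domestic violence adjustment are left out, and how the current month is counted toward the "three of the past 12 months" test is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Family:
    """Hypothetical record describing one family in one month."""
    receives_assistance: bool
    single_parent_child_under_one: bool = False
    state_uses_child_under_one_option: bool = False
    months_already_disregarded: int = 0      # months disregarded so far under the option
    sanctioned_this_month: bool = False
    months_sanctioned_in_past_12: int = 0    # assumed to include the current month


def in_denominator(f: Family) -> bool:
    """Return True if the family is counted in the overall-rate denominator."""
    # Only families receiving TANF "assistance" are counted.
    if not f.receives_assistance:
        return False
    # State option: disregard single parents with a child under one, up to 12 months.
    if (f.state_uses_child_under_one_option
            and f.single_parent_child_under_one
            and f.months_already_disregarded < 12):
        return False
    # Sanctioned families are subtracted unless sanctioned more than 3 of the past 12 months.
    if f.sanctioned_this_month and f.months_sanctioned_in_past_12 <= 3:
        return False
    return True


def participation_rate(families: Iterable[Family],
                       is_participant: Callable[[Family], bool]) -> float:
    """Percent of denominator families that count as participating."""
    denominator = [f for f in families if in_denominator(f)]
    if not denominator:
        return 0.0
    numerator = sum(1 for f in denominator if is_participant(f))
    return 100.0 * numerator / len(denominator)
```

The `is_participant` argument is deliberately left to the caller, because the numerator depends on the hours and activity rules discussed in the following sections.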

Overall, as a result of these different approaches regarding who is required to participate and who is counted in the denominator of the participation rate, the TANF program extends its participation mandate to a much broader population. Under the JOBS program, the adults in approximately 43 percent of families receiving cash assistance were not exempt and therefore were subject to the participation mandate in FY 1995 (Committee on Ways and Means, 1997). In contrast, this proportion doubles under TANF: approximately 86 percent of the adults receiving assistance were subject to the participation requirements in TANF in FY 1998.⁽⁶⁾

For How Many Hours are Individuals Required to Participate?

Another critical element in defining participation rates is establishing for how many hours and in what types of activities individuals must participate in the welfare-to-work program each month in order to count toward the rate, that is, to be counted in the numerator of the rate. This section discusses how long individuals are required to participate, while the next section discusses the types of activities required. As discussed below, JOBS and TANF established different rules regarding who is counted as a participant based on the number of hours individuals were required to participate each week (summarized in Table 6).

Table 6
Weekly Hours of Participation Required Under the JOBS and TANF Programs
Hours Per Week Required in the JOBS Program	Hours Per Week Required in the TANF Program
Scheduled for program activities for an average of 20 hours per week; must attend at least 75 percent of the scheduled hours; maximum of 20 hours per week required for single parents with a child under age six	Hours (actual, not scheduled) required increase from 20 hours per week in FY 1997 to 25 hours per week in FY 1999 to 30 hours per week in FY 2000 and beyond for the overall rate; 20 hours per week required for single parents with a child under age six; 35 hours per week required for two-parent families

In JOBS, to count toward the overall rate, all individuals who could possibly be counted as participants must be scheduled for program activities for an average of 20 hours per week.⁽⁷⁾ To actually be counted as a "participant" these individuals must attend at least 75 percent of their scheduled hours. A parent of a child under age six who is personally providing care to the child can be required to participate only 20 hours per week.

In TANF, for the overall work participation rate, the number of hours (actual, not scheduled) required to meet the participation requirement increased from 20 hours per week in FY 1997 to 25 hours per week in FY 1999 to 30 hours per week in FY 2000 and beyond. However, the number of required hours for a single parent with a child younger than six years old remains at 20 hours per week throughout the period covered by the legislation. Hours of participation may be averaged across weeks in a month for any given participant, but may not be averaged across participants.

To qualify as a participant for the two-parent rate, 35 hours of participation per week are required. This number remains constant throughout the TANF authorization period, but it may be met by a combination of effort between the two parents. However, if a two-parent family is receiving federally funded child care assistance, then the two parents together must meet a 55-hour-per-week requirement.
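
As a rough illustration of the TANF hours rules just described, the sketch below looks up the weekly requirement and applies the within-month averaging rule for a single participant. The function and parameter names are hypothetical. The text gives figures for FY 1997, FY 1999, and FY 2000 and beyond; treating FY 1998 as 20 hours is an assumption of this sketch. States operating under a prior waiver are not modeled.

```python
def required_weekly_hours(fiscal_year: int,
                          two_parent: bool = False,
                          single_parent_child_under_six: bool = False,
                          federally_funded_child_care: bool = False) -> int:
    """Weekly hours needed to count toward the TANF rate (illustrative only)."""
    if fiscal_year < 1997:
        raise ValueError("TANF participation rates begin in FY 1997")
    if two_parent:
        # 35 hours, or 55 hours when the family receives federally funded child care.
        return 55 if federally_funded_child_care else 35
    if single_parent_child_under_six:
        return 20
    if fiscal_year <= 1998:      # FY 1998 = 20 is an assumption; see lead-in
        return 20
    if fiscal_year == 1999:
        return 25
    return 30                    # FY 2000 and beyond


def meets_hours(weekly_hours: list, required: int) -> bool:
    """Average ONE participant's actual hours across the weeks of a month.

    Averaging across different participants is not allowed, so this
    function must be called separately for each participant.
    """
    if not weekly_hours:
        return False
    return sum(weekly_hours) / len(weekly_hours) >= required
```

For example, a participant with actual weekly hours of 30, 10, 25, and 15 averages 20 hours and would meet a 20-hour requirement, even though two of the four weeks fall short.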

States with a prior work program waiver can choose to maintain the hourly requirements established by JOBS or by their waiver (whichever is applicable). Once the waiver expires, these states will be required to meet the hourly requirements established by TANF.

What Activities Count Toward the Participation Rate?

A final element in determining whether an individual counts toward a state's participation requirement is examining the types of activities recipients must participate in. As with the other factors, the two programs take very different approaches regarding what types of activities individuals can participate in to count toward the rate. While the JOBS program took a relatively expansive approach regarding what counted, TANF narrows the type of activities and requires participation in more work-focused activities.

For the JOBS overall rate, a wide range of employment-related, education, and training activities counted toward the participation requirement. As summarized in Table 7, these included job development and placement; job skills training; educational activities (including high school and equivalency programs, basic and remedial education, and English as a Second Language (ESL)); group and individual job search (for up to eight weeks); community work experience; work supplementation; on-the-job training; and post-secondary education. Non-exempt custodial parents under the age of 20 who had not completed high school (or its equivalent) were required to participate in educational activities. Individuals who entered a job were also counted as participants if they engaged in a JOBS activity during the month of job entry or during the preceding month.

Table 7
Activities that Count Toward the Participation Requirement Under the JOBS and TANF Programs
Countable Activities in the JOBS Program	Countable Activities in the TANF Program
Job development and placement; Job skills training; Educational activities (including high school and equivalency programs, basic and remedial education, and English as a Second Language (ESL)); Group and individual job search (for up to eight weeks); Community work experience; Work supplementation; On-the-job training; Post-secondary education; Employment for those engaged in a JOBS activity during the month of job entry or during the preceding month	At least 20 hours (30 hours for two-parent families) must be spent in the following: Unsubsidized employment; Subsidized employment; Work experience; On-the-job training; Job search and job readiness activities; Community service programs; Vocational educational training (for up to 12 months per person); Provision of child care services to a person participating in community service. The remaining required hours may be in the above or any of the following activities: Job skills training directly related to employment; Education directly related to employment (for those with no high school diploma); Secondary school or program leading to a certificate of general equivalence (for those with no high school diploma)

The requirements for meeting the two-parent rate under JOBS were more stringent. JOBS required individuals in these families to participate 16 hours per week in a work supplementation program, a community work experience program, on-the-job training, or other types of work programs. For both the overall and two-parent rates, individuals who were working fewer than 30 hours per week were not counted in the rates at all under JOBS, while those working more than 30 hours per week were exempted from the rates.

The TANF statute is much more prescriptive with respect to the types of activities that count toward a state's participation rate. As summarized in Table 7, in order to count a participant for a month in any year, at least 20 hours per week for all families and 30 hours per week for two-parent families⁽⁸⁾ must be spent in one or more of the following activities: unsubsidized employment, subsidized private or public sector employment, work experience, on-the-job training, job search and job readiness activities (for up to six weeks total per individual, or 12 weeks if the state meets the definition of a needy state,⁽⁹⁾ but not for more than four consecutive weeks), community service programs, vocational educational training (for up to 12 months per person), and the provision of child care services to a person participating in community service.

The remaining required hours may be in any of the above or the following activities: job skills training directly related to employment and, for those who do not have a high school diploma or equivalent, education directly related to employment or satisfactory attendance in a secondary school or in a course of study leading to a certificate of general equivalence. Teen heads of household can count toward the participation rate by maintaining satisfactory attendance in high school or the equivalent (regardless of the number of hours) or by participating in education directly related to employment for at least 20 hours per week.
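
The two-tier activity rule described above (a minimum number of hours in work-focused activities, with the remainder allowed in a wider set) can be sketched as follows. This is a simplified illustration, not the statutory test: the activity labels and the function name are hypothetical, and the sketch ignores the time limits on job search and vocational training, the high school diploma condition on the education activities, and the special rule for teen heads of household.

```python
# Activities in which the first 20 hours (30 for two-parent families) must be spent.
CORE_ACTIVITIES = frozenset({
    "unsubsidized_employment",
    "subsidized_employment",
    "work_experience",
    "on_the_job_training",
    "job_search_and_readiness",
    "community_service",
    "vocational_educational_training",
    "child_care_for_community_service_participant",
})

# Activities that may fill only the remaining required hours.
SUPPLEMENTARY_ACTIVITIES = frozenset({
    "job_skills_training",
    "education_related_to_employment",
    "secondary_school_or_ged",
})


def counts_toward_rate(hours_by_activity: dict,
                       required_hours: float,
                       core_minimum: float = 20) -> bool:
    """Check average weekly hours against the core and total requirements."""
    core = sum(h for a, h in hours_by_activity.items() if a in CORE_ACTIVITIES)
    extra = sum(h for a, h in hours_by_activity.items()
                if a in SUPPLEMENTARY_ACTIVITIES)
    # Hours in unlisted activities are ignored entirely.
    if core < core_minimum:
        return False
    return core + extra >= required_hours
```

Under these assumptions, 20 hours of unsubsidized employment plus 10 hours of job skills training satisfies a 30-hour requirement, while 15 hours of each does not, because the core minimum is not met.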

The TANF program has specific rules that limit the number of individuals who can count toward the participation requirement by attending education and training activities. For FYs 1997-1999, no more than 30 percent of those individuals who count toward the rate can do so by participating in vocational educational training. In FY 2000 and thereafter, no more than 30 percent of those counted can meet the requirements by either participating in vocational education training or being a teen head of household in school.
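
The 30 percent limit can be made concrete with a little arithmetic. The sketch below adopts one possible reading, namely that the cap is measured against the final count of participants, including the capped individuals themselves; this reading and the function name are assumptions, not text from the statute or regulations. If `other` individuals count through uncapped activities and `c` capped individuals are counted, the condition c <= 0.30 * (other + c) simplifies to c <= 3 * other / 7.

```python
def countable_under_cap(other_participants: int, capped_participants: int) -> int:
    """Number of capped participants who can count toward the rate.

    'Capped' means vocational educational training in FYs 1997-1999 and,
    from FY 2000 on, also teen heads of household counted through school.
    Assumes the 30 percent share is taken of the final counted total.
    """
    if other_participants < 0 or capped_participants < 0:
        raise ValueError("counts must be non-negative")
    # Largest whole number c with 10 * c <= 3 * (other + c), i.e. 7 * c <= 3 * other.
    max_capped = (3 * other_participants) // 7
    return min(capped_participants, max_capped)
```

With 70 participants in uncapped activities and 50 in vocational educational training, for instance, only 30 of the 50 would count under this reading, giving 100 counted participants of whom exactly 30 percent are in the capped group.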

As with other TANF provisions, states with one or more work program waivers (see above) in effect prior to the passage of TANF can continue counting the activities allowed under JOBS or the activities allowed through their waiver. Once the waiver expires, however, states will be required to use the definition of program activities established by TANF.

It should be noted that there is some overlap in the types of activities allowed under JOBS and TANF, such as job search and on-the-job training. In some cases the overlap is not clear because different terms are used for activities that are actually very similar. For example, both programs allow subsidized employment, work experience, and vocational training to count toward the participation requirement, although in JOBS these activities were called work supplementation, community work experience, and job skills training. In both programs, states are given discretion over how the specific activities are defined.

Overall, the TANF law narrows the type of activities that count toward the participation rate and focuses those that do count on work and work-related activities. Education and training activities, a major component of the JOBS program, count toward the TANF participation rate only in limited circumstances. Unsubsidized employment, which did not count toward the JOBS participation requirement, is a key activity under TANF. Under TANF, over 80 percent of the welfare-to-work program participants were in work or work-related activities (unsubsidized or subsidized employment, work experience, or on-the-job training) compared to seven percent in JOBS.⁽¹⁰⁾ In contrast, in JOBS about 47 percent of the program participants were involved in education and training activities (vocational and job skills training, high school, post-secondary education, remedial education, ESL, or education related to employment), while in TANF only 11 percent of the participants received these services (U.S. Department of Health and Human Services, 1999, and Administration for Children and Families, 1997). Clearly, TANF has had a significant impact on the types of activities provided in welfare-to-work programs.

Assessment of the Change in Participation Requirements Under JOBS and TANF

While the previous sections of this paper described how participation is measured under the JOBS and TANF programs, the paper concludes by discussing factors that may influence a state's ability to achieve the required participation rates and provides an assessment of how the participation requirements changed under the two programs.

An important factor in a state's ability to achieve the expected level of participation is the economy. Increases or decreases in the unemployment rate can affect welfare caseloads by increasing or decreasing the need for cash assistance. This in turn affects the number of individuals states will have to involve in activities to meet the minimum participation requirements. When the economy declines, welfare caseloads tend to increase (Blank, 1997). This occurred after the passage of the Family Support Act in 1988. In part due to an economic recession, the number of families on welfare grew by 38 percent, from 3.7 million in 1988 to 5.1 million in 1994. As caseloads increased, states had to involve more individuals in work activities to meet the required participation rates.

In contrast, the TANF statute passed in the midst of an economic recovery and the strong economy has played at least a partial role in the recent decline in welfare caseloads (Council of Economic Advisors, 1997 and 1999). As of March 1999, 2.7 million families were receiving TANF assistance. As a result of this decline, states have been required to involve fewer individuals in work activities to meet the participation requirements than they would have if caseloads had been higher. If economic conditions worsen and welfare caseloads increase, states may face greater challenges in future years when the target rates are higher and the caseload reduction credit is smaller.

Another way the economy can affect a state's ability to achieve participation rates is through the availability of jobs in the labor market, particularly when work counts as an activity, as it does under TANF. A lack of unsubsidized jobs suitable for individuals leaving welfare will require states and localities to develop more program-based options (such as work experience positions) or to involve individuals in job skills training options in order to meet participation rates. This can increase program costs and place more demands on program managers and line staff. A related issue is the lack of unskilled jobs in some local economies. In this case, jobs may be available but they may not be in the low-skilled labor market in which most welfare recipients find jobs.

A second factor that can affect the ability of states to meet minimum participation requirements is the composition of the welfare caseload, particularly the proportion of the caseload defined as "hard-to-serve". Welfare recipients with more employment barriers, such as low employment skills, limited work experience, and/or problems with domestic violence and substance abuse, may have more difficulty sustaining participation in program activities. States with a higher proportion of these types of individuals in their caseload have greater difficulty meeting participation requirements. Interestingly, because of the dramatic caseload decline experienced recently, it appears that the most employable recipients have found jobs and that those remaining on the caseload have more barriers to employment. This result may mitigate somewhat the positive effect that falling caseloads have on a state's ability to achieve the minimum participation rates.

A third factor affecting the ability of states to meet the required participation rates is the availability of support services, particularly child care and transportation. Studies of welfare-to-work programs have found that problems in providing child care resulted in lower participation levels (Hamilton and Scrivener, 1999). Child care problems can result from a shortage of slots (or from a shortage of certain types of slots such as infant care or care during non-traditional or irregular hours), from inadequate resources to provide child care subsidies, and from unreliable arrangements with no backup options. Similarly, assistance in obtaining reliable transportation is also necessary if individuals are to participate in activities on a regular basis, particularly in areas with limited public transportation.

Finally, policies that encourage work by making work pay, such as earnings disregards, can increase the ability of states to achieve participation rates when work counts toward the requirement. These financial incentives encourage work by allowing workers to remain on welfare after their earnings would otherwise have exceeded eligibility limits. However, states with low grant levels may have more difficulty in meeting participation rates, particularly when work counts toward the rate. In states with low monthly assistance grants (and without generous income disregards), any job that meets the hourly threshold for participation is likely to make a worker ineligible for assistance.

In addition, some types of program approaches may enhance the ability of states to achieve participation rates. Research has shown that programs that emphasize skill development can result in higher participation rates, largely because activities are more seamless (Hamilton and Scrivener, 1999). When people move from one activity to another in a welfare-to-work program, the "in-between" periods can cause monthly participation rates to drop. Because education and training activities tend to be longer-term than job readiness and job search activities, programs that incorporate these activities more extensively may be able to achieve higher monthly participation rates more easily.

Overall, compared to the JOBS program, the TANF program has resulted in a participation mandate covering a much broader range of welfare recipients and has established stronger incentives to enroll welfare recipients in work-focused activities. In some respects, TANF made achieving minimum participation rates more difficult for states, particularly because of the broader mandate, the more limited range of activities that count toward participation, and the higher target rates. However, this has been moderated by the caseload reduction credit that has reduced target rates, by declining caseloads that require states to involve fewer individuals in program activities to meet the rates, and by state waivers that in some cases have temporarily narrowed the proportion of the caseload that is required to participate in program activities and allowed a broader range of activities to count toward the requirement.

Compared to JOBS, TANF has resulted in a slight overall increase in the number of welfare recipients participating in welfare-to-work activities. In FY 1996 approximately 665,000 welfare recipients participated in JOBS activities, while in FY 1998 almost 700,000 individuals participated in work or work-related activities under TANF (U.S. Department of Health and Human Services, 1999, and Administration for Children and Families, 1997). However, as discussed above, these numbers are not strictly comparable because participation is defined in different ways under the two programs. For example, individuals who were working while receiving cash assistance under JOBS are not included in this number (but they are in TANF) while in TANF some individuals participating in education and training may not count toward the rate (but they did in JOBS).

These figures also reflect the decline in the number of families receiving cash assistance. While the number of program participants is close in absolute terms, a higher proportion of the cash assistance caseload is now participating in welfare-to-work activities. The number of JOBS participants in FY 1996 represents 14 percent of the number of families on welfare (including families that were exempted from the participation requirement). Under TANF, 22 percent of all families receiving assistance (again, including families that were exempted) were involved in work or work-related activities in FY 1998.

In sum, this paper has shown that participation requirements in welfare-to-work programs are very sensitive to how they are defined as well as to outside factors such as the state of the economy and other program features, including the availability of support services and grant and earnings disregard levels. Thus, it is important to consider all these factors when comparing participation rates across states and from one program to another.

Endnotes

1. The participation requirement for two-parent families in JOBS did not apply in the initial years of the program; it did not phase in until FY 1994. The JOBS statute specified rates for two-parent families through 1998; however, these rates were superseded by the TANF law beginning in FY 1997.

2. None of the territories met this participation requirement for TANF in FY 1998.

3. The two-parent requirements do not apply in eleven states and territories. Six states (Alabama, Florida, Georgia, Hawaii, Maryland, and New Jersey) served two-parent families through a separate state program, and thus the TANF two-parent participation requirements do not apply. North Dakota, Puerto Rico, South Dakota, and the Virgin Islands do not serve two-parent families. Vermont exempts all families from participation requirements due to a waiver inconsistency (see Section III of this Appendix).

4. "Assistance" does NOT include other benefits such as non-recurrent, short-term benefits, work subsidies, counseling and supportive services provided to families who are employed.

5. The Family Violence Option (FVO) in the PRWORA statute permits a state to waive program requirements for a victim of domestic violence if complying with the requirements would make it more difficult for the victim to escape violence or would unfairly penalize the individual. Under the FVO, the state must also develop a system to screen for victims of domestic violence and refer them to appropriate counseling and support services. HHS will determine that a state has reasonable cause for failing to meet the work participation rates or to comply with the five-year limit on federal assistance if its failure was due to its provision of good cause domestic violence waivers.

6. Calculations from Table 3:5: TANF Work Activities, Excluding Waivers, for Families Meeting the All-Family Work Requirements, Fiscal Year 1998, in U.S. Department of Health and Human Services, 1999. Child-only cases were not included.

7. This means some individuals could be scheduled for activities that lasted 10 hours per week and others could be scheduled for activities that lasted 30 hours per week. The JOBS regulations required that the number of scheduled hours across all participants average 20 hours per week.

8. And 50 hours for those with a 55-hour requirement.

9. Needy states are defined by high unemployment or increased participation in the Food Stamp program.

10. Calculations from Table 3:5: TANF Work Activities, Excluding Waivers, for Families Meeting the All-Family Work Requirements, Fiscal Year 1998, in U.S. Department of Health and Human Services, 1999, and Administration for Children and Families, 1997.

Appendix D: Select Sources of Data for Outcome Measures

References have been made throughout this report to the need for timely and reliable data that would permit an assessment of states' performance in achieving the goals of TANF. Discussion of the various outcome-based performance measures has included brief references to various data sources that might or might not be suitable for that purpose. Described below are some of the major sources of data for potential outcome measures, including both survey and administrative sources. This is neither a comprehensive listing nor intended to be a comprehensive description of data sources, but rather a brief introduction to the types of information that are available through each of these sources. Internet addresses are provided for those seeking additional information about the surveys discussed.

Following the narrative description is a table of selected characteristics of each data source, which includes the agency that collects the data, the unit of analysis, the frequency with which data are collected, how soon after collection the data are available for analysis, the sample size and whether the sample is large enough to allow for state-level estimates.

Survey Data

The Decennial Census is a major source of detailed population information, and the benchmark for estimation from almost all other data sets. It is a comprehensive source of information on individuals' economic and social characteristics in local areas across the country. The short form is conducted primarily as a mail-out, mail-back survey to every household in the United States and is the basis on which seats in Congress are apportioned. It asks only name, sex, age, race/Hispanic origin, relationship, and whether the household owns or rents its housing. The long form is sent to a sample of the population (17 percent in 2000) and forms the basis for social and economic information published by the Census Bureau. The long form includes questions on marital status, education, ancestry, migration, employment, income, welfare receipt, and housing conditions. New in 2000 is a question, required by PRWORA, on grandparents as caregivers for children. The census is conducted every 10 years.

Web site: http://www.census.gov/dmd/www/2khome.htm

The American Community Survey (ACS) is a new approach under development by the U.S. Census Bureau for collecting accurate, timely information needed for critical government functions. The ACS instrument is based on the Decennial Census long form. If fully implemented, this new data collection approach will provide data users with timely demographic, housing, social, and economic data updated every year that can be compared across states, communities, and population groups. In addition, the ACS is a flexible data collection method with the ability to adapt to changing data needs; for example, the potential exists for adding questions of national policy interest or specialized supplements in the future. With the American Community Survey, data will be available every year for all states, as well as for all cities, counties, metropolitan areas, and population groups of 65,000 people or more. For smaller areas, it will take two to five years to accumulate a sufficient sample to produce data for areas as small as census tracts. For example, for areas of 20,000 to 30,000, data can be averaged over three years. For rural areas and city neighborhoods or population groups of less than 15,000 people, it will take five years to accumulate an adequate sample size.

Web site: http://www.census.gov/acs/www/

The Current Population Survey (CPS) has been conducted monthly by the Bureau of the Census since 1942. Its main purpose is to provide estimates for employment, unemployment, and other characteristics of the labor force. The survey focuses on individuals aged 15 and older, but since 1979 limited demographic data have been collected on children in the sample. In addition to the core monthly survey, the CPS also collects annual data in the March Supplement on prior year work experience, education, income (including welfare receipt and program participation), and migration. Other supplements focus on such topics as school enrollment, child support and alimony, and fertility. The CPS is a probability-based sample, with a total sample size of about 71,000 households per month (50,000 to 57,000 are actually interviewed). The sample is representative at both the state and the national level. However, the small sample size for many states restricts its usefulness as a source of annual data on state performance, as the standard errors of the state estimates for a single year are quite large. Various small area estimation techniques are currently used by the Census Bureau and others to produce reliable state-by-state estimates, including combining and averaging three or four years of data.

Web site: http://www.bls.census.gov/cps/cpsmain.htm
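
The effect of combining and averaging several years of CPS data, as described above, can be illustrated with a minimal sketch. The function name, the state rates, and the standard errors below are hypothetical and are not drawn from the CPS. The sketch also assumes that the yearly estimates are independent. Because part of the CPS sample is interviewed in consecutive years, the actual gain in precision would likely be smaller than shown here.

```python
from math import sqrt

def pooled_estimate(yearly_estimates, yearly_std_errors):
    """Average single-year estimates and approximate the standard error
    of that average.

    Assumption (not from the report): the yearly estimates are treated as
    independent, so the variance of the average is the sum of the yearly
    variances divided by the square of the number of years.
    """
    k = len(yearly_estimates)
    mean = sum(yearly_estimates) / k
    pooled_se = sqrt(sum(e ** 2 for e in yearly_std_errors)) / k
    return mean, pooled_se

# Hypothetical single-year state employment rates (percent) and standard errors.
rates = [52.0, 55.0, 58.0]
errors = [3.0, 3.0, 3.0]

mean, pooled_se = pooled_estimate(rates, errors)
print(f"three-year average: {mean:.1f}, approximate standard error: {pooled_se:.2f}")
```

Under these assumptions, averaging three years with equal standard errors reduces the standard error by a factor of the square root of three. The cost is that the averaged figure describes a three-year period rather than a single year.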

The Survey of Income and Program Participation (SIPP) is a continuous series of national panels begun in 1983 by the Census Bureau. The SIPP content is built around a core of labor force, program participation, and income questions designed to measure the economic situation of persons in the United States. Panel members are asked the core questions every four months, and are asked to recall their activities over the four previous months. In each wave of interviews, a set of modules on topics not covered in the core section is also asked. Topics covered by the modules include personal history, child care, wealth, program eligibility, child support, disability, school enrollment, and taxes. Until recently, the SIPP consisted of overlapping panels, with a new panel of 14,000 to 20,000 households introduced each February (through 1993) and interviewed for a total of 2 ½ years. Starting in 1996, the SIPP panels have been expanded in both size (to about 36,000 households) and duration (to 4 years in 1996, 3 years thereafter), but a new panel will only be drawn every 4 years. The redesigned SIPP includes enhanced questions about receipt of government program benefits, including reasons why receipt was begun or ended, and which household members were covered.

Web site: http://www.sipp.census.gov/sipp/sipphome.htm

The Survey of Program Dynamics (SPD) was created specifically for the Census Bureau to track the effects of PRWORA using a "pre-post" comparison. Starting with SIPP respondents first interviewed in 1992 and 1993, the SPD is a longitudinal survey with data from annual retrospective interviews conducted each year between 1997 and 2002. Combined with SIPP data collected from the 1992 and 1993 panels, the SPD will provide longitudinal panel data on approximately 18,500 households for 10 years. The survey primarily focuses on employment and earnings, income, and program participation, and also includes questions on child well-being and adolescent behaviors.

Web site: http://www.census.gov/apsd/www/spdmenu.html

The Panel Study of Income Dynamics (PSID) is designed to study the determinants of changes in the economic well-being of families and individuals across time and generations. The survey, conducted by the University of Michigan, is based on a probability sample of about 5,000 U.S. households first interviewed in 1968. The individuals in these households are interviewed through the years, regardless of whether they remain in the same household. For example, children are followed as they advance through childhood and into adulthood, forming family units of their own. Although the original design over-sampled lower income and minority households, the sample also included a complete representative sample of families at all income levels. Surveys were conducted annually through 1997, then switched to every other year for cost reasons. While the sample size is smaller than most other national data sets, the data collected are extremely rich. The survey emphasizes the dynamic aspects of economic and demographic behavior, but covers a broad range of topics including: employment, income, wealth, housing, food expenditures, transfer income, and marital and fertility behavior.

Web site: http://www.isr.umich.edu/src/psid/

The National Longitudinal Surveys (NLS) of Labor Market Experience are a collection of panel surveys sponsored by the Department of Labor. The primary focus in these surveys is on education and labor market transitions; however, all these surveys feature a comprehensive set of questions about family relationships, income, welfare receipt and numerous other subjects. Starting in the mid-1960s, four groups were surveyed: young men (aged 14-24 in 1966), older men (aged 45-59 in 1966), young women (aged 14-24 in 1968) and mature women (aged 30-44 in 1967). The first two groups were last interviewed in 1981 and 1990 respectively, while the other two groups are still being interviewed. Another youth (both young men and young women) survey (NLSY79) was begun in 1979 and, starting in 1986, supplemental information was collected on the children born to the young women in this panel. A new youth survey (NLSY97) was begun in 1997. A key feature of the NLSY is that all respondents were asked to take the Armed Services Vocational Aptitude Battery, a high-quality test of academic and non-academic knowledge and skills.

Web site: http://stats.bls.gov/nlshome.htm

Administrative Data

Under section 411 of the Social Security Act, states are required to collect and report TANF Administrative Data. The data are collected by states monthly, reported to HHS (Administration for Children and Families) quarterly, and consist of both disaggregated and aggregated data on TANF recipients and some others in TANF households. The states also report similar data on closed cases and on participants in separate state programs. The data are used for many purposes, including calculating participation rates for one- and two-parent families, determining the number of families reaching the time limits for receipt of TANF, and compiling the characteristics of TANF recipients. States have the option of providing the data via a sample of the population or the entire population. This data source could be used for measuring:

Recidivism rate;
Food Stamp receipt among TANF recipients;
Medicaid/SCHIP receipt among TANF recipients;
Out-of-wedlock birth rates for TANF families;
Level of participation in work and work-related activities;
Percentage of caseload reaching time limits; and
Caseload reduction.

Most states also collect, for their own policy and program management functions, additional data elements that are not required by the federal government. State administrative data systems frequently collect information on such areas as caseload demographics, caseload reduction, the reasons clients left the TANF caseload, work participation rates, job placement data, caseload distribution by local office, and type of child care TANF clients use (APHSA, 2000). While these data elements are not collected consistently by all states, they could be the basis for optional performance measures.

States collect Food Stamp Quality Control (QC) Data as part of quality control reviews conducted in accordance with section 16 of the Food Stamp Act of 1977, as amended, and Part 275, Subpart C of the Food Stamp Program regulations (7 CFR 275). Data are collected monthly from a sample of households selected for review as part of the Integrated Quality Control System (IQCS), an ongoing review of food stamp household circumstances. The IQCS is designed to determine (1) if households are eligible to participate or are receiving the correct benefit amount and (2) if household participation is correctly denied or terminated. The IQCS is based on a national probability sample of approximately 50,000 participating food stamp households, and on a somewhat smaller number of denials and terminations. The national sample of participating households collected in the IQCS is stratified by the 50 states, the District of Columbia, Guam and the Virgin Islands. Annual required state samples range from a minimum of 300 to 1,200 reviews, depending on the size of the state's caseload. State agencies select an independent sample each month that is generally proportionate to the size of the monthly participating caseload.

The Medicaid Management Information System (MMIS) is designed to collect, manage, analyze, and distribute information on eligibles, recipients, use and payment for services covered by State Medicaid programs. States provide the Health Care Financing Administration (HCFA) with quarterly files containing specified data elements for: (1) persons covered by Medicaid (Eligible File), and (2) adjudicated claims for medical services reimbursed with Title XIX funds (Claims File). These data are furnished on a Federal Fiscal Year quarterly schedule, which begins October 1 of each year. The Eligible File, which is used to track enrollment on a quarterly basis, contains one record for each person who was eligible for Medicaid for at least one day during the reporting quarter. These files classify individuals by type of eligibility category and contain a flag for TANF receipt, even though Medicaid eligibility is no longer automatically linked to welfare receipt. The Claims File contains several types of records: all Current Claims for Medical Services, Adjustments to Previously Paid Claims, Premium Payments, Lump Sum Adjustments, and Dummy Claims. Dummy Claims simulate claims that would have been generated for Managed Care patients if they were billed on a fee-for-service basis. The Claims Files are submitted quarterly based on the date of payment, not on the date of service.
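
The reporting schedule described above can be made concrete with a short sketch. The function name and the sample claim are hypothetical, and the code is an illustration only, not part of any MMIS specification. It assumes the usual convention that a Federal Fiscal Year is named for the calendar year in which it ends, so the quarter beginning October 1, 1999 is the first quarter of fiscal year 2000.

```python
from datetime import date

def federal_fiscal_quarter(d):
    """Return (fiscal_year, quarter) for a date, with the fiscal year
    beginning on October 1."""
    if d.month >= 10:
        # October through December open the next fiscal year.
        return d.year + 1, 1
    # January-March is quarter 2, April-June is 3, July-September is 4.
    return d.year, (d.month - 1) // 3 + 2

# Hypothetical claim: the service was delivered in one quarter and paid in the next.
claim = {"service_date": date(1999, 9, 20), "payment_date": date(1999, 10, 5)}

# Claims Files are organized by date of payment, not date of service.
reporting_quarter = federal_fiscal_quarter(claim["payment_date"])
service_quarter = federal_fiscal_quarter(claim["service_date"])
print(reporting_quarter, service_quarter)
```

In this example the claim would appear in the file for the first quarter of fiscal year 2000, even though the service was delivered in the fourth quarter of fiscal year 1999. Analyses that need service dates would have to account for this lag.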

The State Children's Health Insurance Program (SCHIP) requires States to provide quarterly expenditure reports to support claims for federal matching funds and financial/statistical data for purposes of program monitoring and evaluation. Regarding expenditure reports, in order to receive Federal matching funds, States report quarterly expenditures made during any quarter of the State program's operation. Such expenditure reports should be submitted within 30 days of the end of the quarter for use in calculating federal funding requirements. For enrollment data, States report enrollment figures on a quarterly basis in four age categories for children under age 19 (under 1, 1-5, 6-12, and 13-18) and family income categories related to Federal poverty levels and State cost sharing requirements. States report on the total unduplicated number of children served in the SCHIP program (both separate SCHIP and Medicaid expansion) as well as the non-SCHIP-related children in the regular Medicaid program, and enrollees and disenrollees for each quarter. These reports also provide for reporting enrollment status data for each type of service delivery system that the individual is enrolled in (i.e., fee-for-service, managed care, or primary care case management) by age and income categories.

As under TANF, states frequently collect additional information regarding their Food Stamp program, Medicaid, and SCHIP that is not required by the federal government. These data may be stored in separate or unified data systems. Some states record monthly data in such a manner that longitudinal event histories for cases may be constructed, while others store only the most current information.

State employment security agencies collect Unemployment Insurance (UI) Quarterly Earnings Records for the purposes of determining eligibility for unemployment insurance. UI records are available on a relatively timely basis, within six months of the end of a quarter. UI reports total quarterly earnings, but does not provide information on wage rates or hours of work. UI records do not provide the start or end date of employment, so it may be difficult to determine whether an individual left assistance before or after starting a job. UI does not capture all employment. By law, certain types of employment are not "covered" by unemployment insurance, meaning the system does not capture the employment of individuals working in specific types of jobs. Employment that is not covered by the UI system includes independent contractors, federal (including military) and foreign government employees, student employees at an educational institution, domestic employees (below a specified earnings level), certain agricultural workers and elected officials. Moreover, there is reason to believe that much casual or irregular employment is never reported to any government agency, and is therefore missed by the UI data. While there is not believed to be significant variation among states in terms of how coverage is defined, states with higher percentages of non-covered employment may be disadvantaged by use of UI data to measure employment rates. Finally, because UI records are maintained at the state level, they do not record circumstances when individuals find employment in another state. This would affect areas where major employment centers cross state lines.

UI data can be linked to a list of social security numbers in order to calculate employment rates and earnings levels for a given population. While this linking task is not trivial, it is being widely performed by states for research purposes as well as to calculate the performance measures for the TANF High Performance Bonus. The main advantage of using UI records is that all of the data collection is occurring anyway, and therefore the additional burden and cost is minimal.
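
The linking task described above can be sketched as follows. This is a simplified illustration rather than any state's actual procedure. The function name, the record layout, and all of the identifiers and earnings amounts are hypothetical. It assumes that each UI record carries a social security number, a quarter, and the earnings reported by one employer, so a person with two employers in a quarter has two records.

```python
def employment_outcomes(recipient_ssns, ui_records, quarter):
    """Match a set of recipient SSNs to UI earnings records for one quarter.

    Returns (employment_rate, mean_quarterly_earnings_of_the_employed).
    ui_records is a list of (ssn, quarter, earnings) tuples (hypothetical layout).
    """
    totals = {}
    for ssn, qtr, earnings in ui_records:
        # Keep only records for the requested quarter and for listed recipients.
        if qtr == quarter and ssn in recipient_ssns:
            totals[ssn] = totals.get(ssn, 0.0) + earnings

    employed = [ssn for ssn in recipient_ssns if totals.get(ssn, 0.0) > 0]
    rate = len(employed) / len(recipient_ssns) if recipient_ssns else 0.0
    mean_earnings = (
        sum(totals[ssn] for ssn in employed) / len(employed) if employed else 0.0
    )
    return rate, mean_earnings

# Hypothetical recipient list and UI records.
recipients = {"000-00-0001", "000-00-0002", "000-00-0003", "000-00-0004"}
records = [
    ("000-00-0001", "1999Q4", 1500.0),  # first employer
    ("000-00-0001", "1999Q4", 500.0),   # second employer, same quarter
    ("000-00-0002", "1999Q4", 3000.0),
    ("000-00-0003", "1999Q3", 1200.0),  # earnings in a different quarter
    ("000-00-0009", "1999Q4", 4000.0),  # not on the recipient list
]

rate, mean_earnings = employment_outcomes(recipients, records, "1999Q4")
print(f"employment rate: {rate:.0%}, mean earnings of the employed: {mean_earnings:.2f}")
```

A recipient with no matching record is counted as not employed, which is where the coverage gaps noted above would enter: work in uncovered jobs or in another state would not appear in the match.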

The National Directory of New Hires (NDNH) is part of the expanded Federal Parent Locator Service provided by the Office of Child Support Enforcement, Administration for Children and Families, U.S. Department of Health and Human Services. The primary purpose of the NDNH is the establishment and enforcement of child support obligations by helping states locate non-custodial parents and identify their employers. Other purposes of the database include support for the administration of IV-A programs (TANF) and research. The NDNH contains information obtained from State New Hire Directories, State Employment Service Agencies (quarterly earnings and Unemployment Insurance benefits) and Federal Agency personnel offices (new hire and quarterly wage data). The data are maintained for a period of two years unless there is an active child support enforcement case.

The NDNH shares many of the strengths and weaknesses of the underlying UI data on which it is based. One key difference is that the NDNH includes federal civilian and military workers, who are not covered under states' UI systems. In addition, because the NDNH is a national dataset, it captures information on employment even if a recipient is not working in the same state where she receives welfare. Finally, because the federal government could perform the match between the NDNH and the list of TANF recipients, the data could be more consistently measured across states and the burden on state agencies could be reduced.

Table D-1.
Select Data Bases and Their Characteristics.
Title	Agency	Unit of Analysis	Frequency	Data Availability	Sample Size	State-Level Estimates	Comments
CROSS-SECTIONAL DATA
Decennial Census	Census	Households Families Individuals	Every 10 years	18 months after data collection	1 in 6 housing units (about 20 million units)	Yes (estimates for smaller geographic areas possible)
Proposed American Community Survey (ACS)	Census	Households Families Individuals	Monthly	Data released on annual basis approx. 6 months after each 12 month period (for populations 65,000+)	Estimated full-scale survey annual sample: 3,000,000	Yes (estimates for smaller geographic areas possible with greater time delay)	Pretesting: 1996-1998 Comparison testing (with the 2000 long form): 1999-2002 If fully implemented, estimated date for full implementation: 2003
March Current Population Survey (CPS)	Census/ BLS	Households Families Individuals	Annual	6 months after data collection	Approx. 50,000 households	Yes (3-4 year rolling averages required due to sample size)	The March CPS Demographic Supplement provides labor force data (current week and previous calendar year) as well as data on income, education and program participation.
LONGITUDINAL DATA
Survey of Income & Program Participation (SIPP)	Census	Households Families Individuals	Every 4 months; Panel length: 2.5 - 4 years	2-3 years after data collection	14,000-36,000 households	No	SIPP is built around a core of labor force, program participation, and income questions. Additional modules cover child care, child well-being, disability, taxes, wealth, and other topics.
Survey of Program Dynamics (SPD)	Census	Households Families Individuals	Annual	2-3 years after data collection	18,500 households	No	The SPD is a longitudinal follow-up of the 1992 and 1993 panels of the SIPP primarily focused on program participation, income and employment.
Panel Study of Income Dynamics (PSID)	Univ. of Michigan	Households Families Individuals	Annual, every other year starting in 1997	Minimum of 2 years after data collection	8,700 core families	No	Tracks the economic well-being of families and individuals across time and generations.
National Longitudinal Surveys: Young men (1966-1981) Older men (1966-1990) Young women (1968- ) Mature women (1967- ) NLSY79 (1979- ) NLSY79 Children (1986- ) NLSY97 (1997- )	BLS	Households Families Individuals	1-yr. and 2-yr. intervals	Approx. 12-15 months after data collection	Varying sample sizes: 5,600-12,600	No	The surveys involve 7 cohorts. Data are gathered on labor market issues, as well as education, training, income and program participation.
ADMINISTRATIVE DATA
Temporary Assistance for Needy Families (TANF) Data	ACF	TANF assistance units	Monthly data reported quarterly	Approx. 6 months after FY end	3,000 per state	Yes	Full data collection requirements took effect for FY 2000. States may collect additional data.
Food Stamp Quality Control (QC) Data	FNS	Food stamp assistance units	Monthly data reported quarterly	Approx. 6 months after FY end	300-1,200 per state	Yes
Medicaid Management Information System (MMIS)	HCFA	Individuals	Monthly data reported quarterly	Month after reporting	Full population	Yes
State Child Health Insurance Program (SCHIP)	HCFA	Aggregate data	Quarterly	Quarter after reporting	Full population	Yes	SCHIP is reported on MMIS for those states implementing SCHIP under Medicaid
Unemployment Insurance (UI) Data	States	Individuals	Quarterly	Quarter after reporting	Full population	Yes	Does not include cross-state, federal or other uncovered employment.
National Directory of New Hires: New Hires Data (W4)	ACF	Individuals	Monthly	Month after reporting	Full population	Yes
National Directory of New Hires: Quarterly Earnings	ACF	Individuals	Quarterly	Quarter after reporting	Full population	Yes

Appendix E: Bibliography

Administration for Children and Families. 1997. Final Estimates of JOBS Program Participation for FY 1996. Washington, DC: U.S. Department of Health and Human Services.

American Public Human Services Association. 2000. States' Use of Administrative Data for Policy and Program Management (Research Notes: March 2000). Washington, DC: Author.

American Public Welfare Association. 1994. JOBS: Measuring Client Success. Washington, DC: The Institute for Family Self-Sufficiency.

Bane, Mary Jo and David Ellwood. 1983. The Dynamics of Dependence: The Routes of Self-Sufficiency. Cambridge, MA: Urban Systems Research and Engineering, Inc.

Barnow, Burt S. 1999. Exploring the Relationship Between Performance Management and Program Impact: A Case Study of the Job Training Partnership Act. Baltimore, MD: Institute for Policy Studies, Johns Hopkins University.

Bartik, Timothy. 1996. Using Performance Indicators to Improve the Effectiveness of Welfare-to-Work Programs. Upjohn Institute Staff Working Paper 95-36. Kalamazoo, MI: W.E. Upjohn Institute for Employment Research.

Behn, Robert D. 1991. Leadership Counts: Lessons for Public Managers from the Massachusetts Welfare, Training, and Employment Program. Cambridge, MA: Harvard University Press.

Blank, Rebecca. 1997. What Causes Public Assistance Caseloads to Grow? Chicago, IL: Northwestern University.

Brock, Tom and Kristin Hartnett. 1998. "A Comparison of Two Welfare-to-Work Models." Social Science Review.

Brown, Brett and Thomas Corbett. 1997. Social Indicators and Public Policy in the Age of Devolution. Institute for Research on Poverty, Special Report No. 71, University of Wisconsin-Madison.

Brown, Brett, Gretchen Kirby and Christopher Botsko. 1997. Social Indicators of Child and Family Well-Being: A Profile of Six State Systems. Institute for Research on Poverty, Special Report No. 72, University of Wisconsin-Madison.

Bittner, Janet. 1998. "Do You Really Want to Be Accountable for Results? Musings on Georgia's Learnings." The Evaluation Exchange. Volume IV, Number 1. Cambridge, MA: Harvard Family Research Project.

Committee on Ways and Means, U.S. House of Representatives. 1998. 1998 Green Book: Background Material and Data on Programs within the Jurisdiction of the Committee on Ways and Means. Washington, DC: U.S. Government Printing Office.

Council of Economic Advisors. 1997. Explaining the Decline in Welfare Receipt, 1993-1996. Washington, DC: Author.

Council of Economic Advisors. 1999. The Effects of Welfare Policy and the Economic Expansion on Welfare Caseloads: An Update. Washington, DC: Author.

Dickinson, Katherine and Richard West. 1988. Evaluation of the Effects of JTPA Performance Standards on Clients, Services, and Costs. Prepared by SRI International and Berkeley Planning Associates. Washington, DC: National Commission on Employment Policy.

Forsythe, Dall. 2000. Performance Management Comes to Washington: A Status Report on the Government Performance and Results Act. Albany, NY: Rockefeller Institute of Government.

Freedman, Stephen et al. 2000. National Evaluation of Welfare-to-Work Strategies: Evaluating Alternative Welfare-to-Work Approaches: Two-Year Impacts for Eleven Programs. Washington, DC: U.S. Department of Health and Human Services and U.S. Department of Education.

Friedlander, Daniel. 1988. Subgroup Impacts and Performance Indicators for Selected Welfare Employment Programs. New York, NY: Manpower Demonstration Research Corporation.

Friedman, Mark. 1997. A Guide to Developing and Using Performance Measures in Results-based Budgeting. Washington, DC: The Finance Project.

Gallagher, Jerome et al. 1998. One Year After Welfare Reform: A Description of State Temporary Assistance for Needy Families (TANF) Decisions as of October 1997. Washington, DC: The Urban Institute.

Hamilton, Gayle et al. 1997. National Evaluation of Welfare-to-Work Strategies: Evaluating Alternative Welfare-to-Work Approaches: Two-Year Findings on the Labor Force Attachment and Human Capital Development Programs in Three Sites. Washington, DC: U.S. Department of Health and Human Services and U.S. Department of Education.

Hamilton, Gayle and Susan Scrivner. 1999. Promoting Participation: How to Increase Involvement in Welfare-to-Work Activities. New York, NY: Manpower Demonstration Research Corporation.

Hatry, Harry P. 1999. Performance Measurement: Getting Results. Washington, DC: The Urban Institute Press.

Heckman, James, Jeffrey A. Smith and Christopher Taber. 1996. What Do Bureaucrats Do? The Effects of Performance Standards and Bureaucratic Preferences on Acceptance into the JTPA Program. Cambridge, MA: National Bureau of Economic Research.

Horsch, Karen. 1996 (a). Resource Guide of Results-Based Accountability Efforts. Cambridge, MA: Harvard Family Research Project.

Horsch, Karen. 1996 (b). "Results-Based Accountability Systems: Opportunities and Challenges." The Evaluation Exchange. Volume II, Number 1. Cambridge, MA: Harvard Family Research Project.

Hyland, Jill. 1997. Restructuring and Reinventing State Workforce Development Systems. Washington, DC: National Governors Association.

Interstate Conference of Employment Security Agencies. 1999. Workforce Investment Act: Guidelines for Reporting and Management Information Systems. Washington, DC.

Jensen, Martin. 1997. Building State Systems Based on Performance. The Workforce Development Experience. Washington, DC: National Governors Association.

Koshel, Jeff. 1997. Indicators as Tools for Managing and Evaluating Programs at the National, State, and Local Levels of Government. Institute for Research on Poverty, Special Report No. 73, University of Wisconsin-Madison.

Lewis, Ann and Margaret Dunkle. 1996. The New Oregon Trail: Accountability for Results. Washington, DC: The Institute for Educational Leadership.

Martin, Lawrence and Peter Kettner. 1996. Measuring Program Outcomes: A Practical Approach. New York, NY: Sage Publications.

Pavetti, LaDonna. 1993. The Dynamics of Welfare and Work: Exploring the Process by Which Women Work their Way Off Welfare. Harvard University Doctoral Dissertation.

Perrin, Edward and Jeffrey Koshel, editors. 1997. Assessment of Performance Measures for Public Health, Substance Abuse, and Mental Health. Washington, DC: National Academy Press.

Popovich, Mark. 1996. Toward Results-Oriented Intergovernmental Systems: An Historical Look at the Development of the Oregon Option Benchmarks. Washington, DC: The Alliance for Redesigning Government.

Schilder, Diane. 1998. "Aiming for Accountability: Lessons Learned from Eight States." The Evaluation Exchange. Volume IV, Number 1. Cambridge, MA: Harvard Family Research Project.

Scrivner, Susan et al. 1998. National Evaluation of Welfare-to-Work Strategies: Evaluating Alternative Welfare-to-Work Approaches: Implementation, Participation Patterns, Costs and Two-Year Impacts of the Portland (Oregon) Welfare-to-Work Program. Washington, DC: U.S. Department of Health and Human Services and U.S. Department of Education.

Simon, Martin. 1998. An Update on State Workforce Development Reforms. Washington, DC: National Governors Association.

Social Policy Research Associates. 1997. Workforce Development Performance Measurement: Options for Performance Measures. Washington, DC: Employment and Training Administration, U.S. Department of Labor.

Social Policy Research Associates. 1997. What Do Customers Want? A Review of Customer Goals for Workforce Development Programs. Washington, DC: Employment and Training Administration, U.S. Department of Labor.

State of California, State Job Training Coordinating Council. 1999. Overview of the Performance-Based Accountability System. www.sjtcc.cahwnet.gov/pba/.

Trott, Charles E. and John Baj. 1997. Building State Systems Based on Performance: The Workforce Development Experience, A Guide for States. Washington, DC: National Governors Association.

Urban Institute. 1980. Performance Measurement. Washington, DC: Author.

U.S. Department of Health and Human Services. 1994. Report to Congress: Recommendations on Performance Standards for the JOBS Program. Washington, DC: Administration for Children and Families and Office of the Assistant Secretary for Planning and Evaluation.

U.S. Department of Health and Human Services. 1998. Formula for Awarding the First High Performance Bonus in Fiscal Year (FY) 1999. Memorandum No. TANF-ACF-PI-98-1. Washington, DC: Administration for Children and Families, Office of Planning, Research and Evaluation.

U.S. Department of Health and Human Services. 1999. Temporary Assistance for Needy Families (TANF) Program: Second Annual Report to Congress. Washington, DC: Administration for Children and Families, Office of Planning, Research and Evaluation.

U.S. Department of Health and Human Services. 2000. Temporary Assistance for Needy Families (TANF) Program: Third Annual Report to Congress. Washington, DC: Administration for Children and Families, Office of Planning, Research and Evaluation.

U.S. Department of Health and Human Services. 2000. Bonus to Reward States for High Performance under the TANF Program; Final Rule. Federal Register, Vol 65, No. 169. Washington, DC: Administration for Children and Families.

U.S. Department of Labor. 1999. Consultation Paper on Awarding Incentive Grants and Applying Sanctions for Title I Programs Under Sections 503 and 136 of the Workforce Investment Act; Notice. Federal Register, Vol. 64, No. 80. Washington, DC: Employment and Training Administration.

U.S. Department of Labor. 1999. Consultation Paper on Performance Accountability Measurement for the Workforce Investment System Under Title I of the Workforce Investment Act; Notice. Federal Register, Vol. 64, No. 56. Washington, DC: Employment and Training Administration.

U.S. Department of Labor. 1999. Workforce Development Performance Measurement Initiative: Final Report. Washington, DC: Employment and Training Administration.

U.S. Department of Labor. 2000. Workforce Investment Act; Final Rule. Federal Register, Vol 65, No. 156. Washington, DC: Employment and Training Administration.

U.S. General Accounting Office. 1997a. Managing for Results: Analytic Challenges in Measuring Performance. Washington, DC: U.S. Government Printing Office.

U.S. General Accounting Office. 1997b. Managing for Results: Using GPRA to Assist Congressional and Executive Branch Decisionmaking; statement of James F. Hinchman, Acting Comptroller General of the United States, February 12, 1997, before the Committee on Government Reform and Oversight, House of Representatives. Washington, DC: U.S. Government Printing Office (GAO/T-GGS-97-43).

U.S. General Accounting Office. 1994. Welfare to Work: Current AFDC Program Not Sufficiently Focused on Employment. Washington, DC: U.S. Government Printing Office.

U.S. General Accounting Office. 1995. Welfare to Work: Measuring Outcomes for JOBS Participants. Washington, DC: U.S. Government Printing Office.

Yates, Jessica. 1997. Performance Management in Human Services. Washington, DC: Welfare Information Network.

Zornitsky, Jeffrey and Mary Rubin. 1988. Establishing a Performance Management System for Targeting Welfare Programs. Prepared by Abt Associates. Washington, DC: National Commission for Employment Policy.
