Status Report on Research on the Outcomes of Welfare Reform, 2001. Chapter 4 Evaluation Methods and Issues


Conclusion 4.1

Different questions of interest require different evaluation methods. Many questions are best addressed through the use of multiple methods. No single evaluation method can effectively and credibly address all the questions of interest for the evaluation of welfare reform.

Conclusion 4.2

Experimental methods could not have been used for evaluating the overall effects of PRWORA and are, in general, not appropriate for evaluating the overall effects of large-scale, system-wide changes in social programs.

Conclusion 4.3

Experimental methods are a powerful tool for evaluating the effects of broad components and detailed strategies within a fixed overall reform environment and for evaluating incremental changes in welfare programs. However, experimental methods have limitations and should be complemented with nonexperimental analyses to obtain a complete picture of the effects of reform.

Conclusion 4.4

Nonexperimental methods, primarily time-series and comparison group methods, are best suited for gauging the overall effect of welfare reform and least suited for gauging the effects of detailed reform strategies. They are as important as experiments for the evaluation of broad individual components. However, nonexperimental methods require good cross-area data on programs, area characteristics, and individual characteristics and outcomes.
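
As an illustration only, one common comparison group calculation is a difference-in-differences estimate: the change in an outcome in an area that adopted a reform, minus the change over the same period in a comparison area. The sketch below is not drawn from the panel's report. The function name and all numbers are hypothetical, and the estimate is credible only if the two areas would otherwise have followed similar trends.

```python
def diff_in_diff(reform_pre, reform_post, comp_pre, comp_post):
    """Change in the reform area minus change in the comparison area.

    Inputs are mean outcomes (for example, employment rates) before and
    after the reform. Assumes both areas would have followed the same
    trend had there been no reform.
    """
    return (reform_post - reform_pre) - (comp_post - comp_pre)


# Hypothetical employment rates, invented for illustration.
estimate = diff_in_diff(reform_pre=0.40, reform_post=0.50,
                        comp_pre=0.42, comp_post=0.47)
print(f"Estimated effect on employment rate: {estimate:.3f}")
```

The subtraction of the comparison area's change is what removes shared influences such as the economic environment, which is why such methods depend on good cross-area data.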

Recommendation 4.1

The panel recommends that ASPE sponsor methodological research on nonexperimental evaluation methods to explore the reliability of such methods for the evaluation of welfare programs. Specification testing, sensitivity testing, and validation studies that compare experimental estimates to nonexperimental ones are examples of the types of methodological studies needed.

Conclusion 4.5

Existing household surveys are of inadequate sample size to estimate all but the largest overall effects of welfare reform on individual outcomes using cross-state comparison methods. Research is needed to address this problem by considering the American Community Survey, state-level administrative data sets, and supplements and additions to the CPS or other surveys to increase their capacity to detect welfare reform impacts in the future.
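
To illustrate why sample size limits what can be detected, the sketch below computes a textbook minimum detectable effect for a difference in proportions between two groups. It is not the panel's calculation. The function name, the default significance level and power, the `design_effect` parameter, and the sample sizes are all assumptions chosen for illustration.

```python
from math import sqrt
from statistics import NormalDist


def minimum_detectable_effect(p, n_treat, n_comp,
                              alpha=0.05, power=0.80, design_effect=1.0):
    """Smallest true difference in proportions detectable at the given
    two-sided significance level and power.

    p is the assumed baseline proportion in both groups. design_effect
    (hypothetical here) inflates the variance for clustered samples;
    1.0 means simple random sampling.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    std_error = sqrt(design_effect * p * (1 - p)
                     * (1 / n_treat + 1 / n_comp))
    return (z_alpha + z_power) * std_error


# Hypothetical per-group sample sizes, invented for illustration.
for n in (100, 1000, 10000):
    mde = minimum_detectable_effect(0.5, n, n)
    print(f"n per group = {n:>6}: minimum detectable effect = {mde:.3f}")
```

Because the detectable effect shrinks only with the square root of the sample size, small state-level subsamples can detect only large effects, which is the motivation for considering larger surveys and administrative data.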

Conclusion 4.6

The problem of generalizability of the evidence from welfare reform evaluations on specific populations, areas, and relationships to more general populations, to a national level, and to new policies, has not been sufficiently addressed. More use of microsimulation models as a tool to address generalizability is needed. Microsimulation is also needed to assist in the synthesis of diverse types of results.

Recommendation 4.4

The panel recommends that the U.S. Department of Health and Human Services sponsor process research in a number of service delivery areas to better understand how service delivery administrations have implemented new welfare programs and what benefits and services families and children are receiving under these new programs.

Recommendation 4.5

Process and implementation studies have grown in number and importance in the evaluation of welfare reform but often have design defects and are insufficiently integrated with outcome evaluations. As a consequence, their potential use in evaluation has not been fully realized. The panel recommends that the U.S. Department of Health and Human Services sponsor methodological research on process and implementation studies to improve methods for systematizing the documentation of program policies and practices, to develop protocols and best practices, and to further integrate them with impact evaluations.

Recommendation 4.6

Qualitative and ethnographic studies of the low-income population and its relevant subpopulations and of social service agencies that provide services to these populations are an important part of the overall welfare program evaluation framework. The panel recommends the further use of well-designed qualitative and ethnographic studies in evaluations of welfare programs to complement other evaluation methods.

Recommendation 4.7

A welfare dynamics perspective should be incorporated into more welfare reform studies, including leaver studies. In general, more disaggregation by levels of heterogeneity among leavers and stayers is needed given the importance of disaggregation for outcomes on and off welfare.

Conclusion 4.8

Studies of the outcomes of welfare leavers contribute only one part of the story of welfare reform and, as an evaluation method, have been disproportionately emphasized relative to other methods. Studies that compare current leavers to those who left welfare prior to welfare reform and studies of divertees, applicants, and nonapplicant eligibles need more emphasis.

Recommendation 4.9

More methodological research is needed to assess and improve the credibility of the multiple cohort method of evaluating the overall effects of welfare reform. This research needs to study the best method to control for the time-series effects of other policies and the economic environment and how many cohorts are enough to do this.

Recommendation 4.10

Experimental methods are underused in current designs of new welfare policy evaluations and should be employed in future studies evaluating different detailed reform strategies and different individual broad components.

Recommendation 4.11

The federal government should take a proactive role in sponsoring experiments at the state and local levels and should encourage planned variation and cross-state comparability to yield the maximum general knowledge.

Conclusion 4.5

Caseload and other econometric models have produced a mixed set of results, partly because of data limitations and partly because of an inherent lack of policy variability. They have done somewhat better at producing ballpark estimates of the overall effects of welfare reform than at producing estimates of the effects of individual broad components.

Recommendation 4.12

The federal government, taking all agencies as a whole, has produced and funded a great deal of valuable monitoring research and a much smaller volume of evaluation research. A greater effort is needed to produce a comprehensive evaluation framework for social welfare programs that considers the major questions of interest and the evaluation methods appropriate for each. A comprehensive framework for evaluation should be developed and used to guide the evaluation efforts under way by private and other public evaluation organizations. This should be an ongoing effort as new issues emerge and is a responsibility that should be taken on by ASPE in the U.S. Department of Health and Human Services.

Recommendation 4.13

In its Annual Report to Congress, ASPE should review the existing landscape of evaluation methods, whether the appropriate balance of experimental and different nonexperimental methods is being achieved, and how evaluation methodology fits into its own research agenda.

Conclusion 4.6

The panel finds that state capacity and resources to conduct evaluations of their own welfare reform programs are often below the level needed for such an important change in policy.

Recommendation 4.14

The panel recommends that the U.S. Department of Health and Human Services continue and expand its efforts to build capacity for conducting high-quality program evaluations at the state level through the provision of technical assistance, the convening of research conferences, the promotion of the exchange of technical assistance among the states, and other capacity-building mechanisms.

Recommendation 4.15

The panel recommends that ASPE be the primary agency responsible for synthesizing findings from studies of the consequences of changes in welfare programs.