Approaches to Evaluating Welfare Reform: Lessons from Five State Demonstrations. 3. Quasi-Experimental Designs


In some circumstances, states may wish to pursue quasi-experimental designs for evaluating welfare reform programs. Motivations for pursuing these designs include the following:

  • The state may not wish to invest the resources needed to operate two programs and monitor random assignment.
  • The state may be reluctant to use an experimental design for political or ethical reasons.
  • The state may believe the welfare reform program must apply to all cases in a local area to be effective, because the program is intended to have substantial community effects, or because high levels of publicity necessarily imply that a control group would be affected.
  • The state may believe that large entry effects will result from the program intervention, thus calling into question the usefulness of random assignment. (Even if random assignment occurs at the point of application, it cannot capture entry effects, which occur before an application is made.) However, a nonexperimental analysis of entry effects may be coupled with an experimental design as well as a quasi-experimental design.

The rest of this subsection considers criteria for a strong quasi-experimental design and reviews the limitations of this design, even in the best of circumstances.

A quasi-experimental design uses a comparison group separated from the experimental group in time or space. The comparison group consists of a set of cases that have not been given the opportunity to participate in the reform program. Possible configurations include:

· Pre-Post Design. This design uses as a comparison group a set of cases in the same site as the new program (which could be the entire state), but from a period before the reforms were implemented. The analysis may be conducted at the case level or may use data aggregated by county or other geographic region. The problem with the pre-post design lies in distinguishing the effects of the intervention from the effects of any other factors that change at the same time, such as unemployment rates, demographic characteristics of the low-income population, or changes in related programs. The more periods of pre-program and post-program data that are available, the more potential there is to distinguish the effects of welfare reform from other changes. A major advantage of this type of quasi-experimental design is that it is inexpensive to implement if the data are available. However, it does require that the state maintain longitudinal data on welfare cases on a regular basis.

· Matched Comparison Site Design. The preferred method for implementing a matched comparison site design has two steps. The first step is to choose pairs of sites suitable for implementing the demonstration program, matched as closely as possible in terms of demographic and economic characteristics and characteristics of the program (other than the reforms being tested). The next step is to randomly pick one member of each pair to be a demonstration site and the other member to be a comparison site.(6) If, instead, demonstration sites are selected first from those willing to implement the demonstration, and the best matches are then selected from those not willing to implement it, the design is weaker, since the demonstration's success may be correlated with administrators' interest in being a demonstration site. Even with random selection among matched pairs, the small number of sites involved in most demonstrations implies that impact estimates may be biased if there are site differences not captured by the matching criteria, or if events (such as plant closings or openings or natural disasters) occur that lead to major changes at one of the sites in a pair.

· Combination Pre-Post/Matched Comparison Site Design. The strongest quasi-experimental design is a combination of the pre-post and comparison site designs. This involves a comparison site design, with pre-reform samples from both the demonstration and comparison sites. In such a design, the impact of the program is measured as a "difference in differences": the difference in outcomes before and after welfare reform in the demonstration sites is contrasted with the difference in outcomes over the same time period in the comparison sites. By comparing changes rather than levels, this approach "nets out" differences between the sites that are constant over time. However, differences between the sites that change over time may still be confounded with the effects of reform. For instance, a plant closing in a comparison site after program implementation may destroy the initial similarity between the two sites in a pair and, thus, lead to biased impact estimates.
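
The two-step procedure described under the matched comparison site design can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the county names are hypothetical, the pairs are assumed to have been matched already on demographic, economic, and program characteristics, and the function name `assign_pairs` is introduced here purely for the example.

```python
import random


def assign_pairs(pairs, seed=None):
    """For each matched pair of sites, randomly pick one member as the
    demonstration site; the other member becomes the comparison site."""
    rng = random.Random(seed)  # a seed makes the draw reproducible and auditable
    assignments = []
    for site_a, site_b in pairs:
        # A fair coin flip decides which member of the pair gets the program.
        if rng.random() < 0.5:
            demo, comparison = site_a, site_b
        else:
            demo, comparison = site_b, site_a
        assignments.append({"demonstration": demo, "comparison": comparison})
    return assignments


# Hypothetical site names, assumed to be matched pairs (step one).
matched_pairs = [
    ("County A", "County B"),
    ("County C", "County D"),
    ("County E", "County F"),
]

# Step two: random selection within each pair.
for assignment in assign_pairs(matched_pairs, seed=0):
    print(assignment)
```

Because the choice within each pair is random, a site's interest in running the demonstration cannot determine which member of the pair receives it, which is the weakness of selecting willing sites first.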

The WNW evaluation is based on a combined pre-post and matched comparison site design. A "difference in differences" analysis will be used for public assistance-related outcomes, for which data from five years before the implementation of the demonstration are available for all Wisconsin counties. These analyses will be conducted at both an aggregate level (with the county-month as the unit of analysis) and a disaggregate level (with the case-spell as the unit of analysis). Cross-site comparisons will be conducted of outcomes for which no pre-implementation data are available, such as employment.
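
To make the "difference in differences" contrast concrete, the following minimal sketch computes it from aggregate records, one per county-month. It is not the WNW evaluation's actual model: the caseload figures are invented for illustration, the field names (`site_type`, `period`, `outcome`) and the function `diff_in_diff` are assumptions made for this example, and no control variables are included.

```python
from statistics import mean


def diff_in_diff(records):
    """Return (post - pre change in demonstration sites) minus
    (post - pre change in comparison sites).

    Each record is a dict with keys 'site_type' ('demo' or 'comparison'),
    'period' ('pre' or 'post'), and 'outcome' (a number)."""

    def cell_mean(site_type, period):
        # Average outcome over all county-months in one of the four cells.
        values = [
            r["outcome"]
            for r in records
            if r["site_type"] == site_type and r["period"] == period
        ]
        return mean(values)

    demo_change = cell_mean("demo", "post") - cell_mean("demo", "pre")
    comparison_change = cell_mean("comparison", "post") - cell_mean("comparison", "pre")
    return demo_change - comparison_change


# Hypothetical county-month caseloads (invented numbers, for illustration only).
records = [
    {"site_type": "demo", "period": "pre", "outcome": 500},
    {"site_type": "demo", "period": "pre", "outcome": 520},
    {"site_type": "demo", "period": "post", "outcome": 400},
    {"site_type": "demo", "period": "post", "outcome": 420},
    {"site_type": "comparison", "period": "pre", "outcome": 600},
    {"site_type": "comparison", "period": "pre", "outcome": 620},
    {"site_type": "comparison", "period": "post", "outcome": 570},
    {"site_type": "comparison", "period": "post", "outcome": 590},
]

estimate = diff_in_diff(records)
print(estimate)
```

In this invented example the average caseload falls by 100 in the demonstration sites and by 30 in the comparison sites, so the estimate is -100 - (-30) = -70. The constant gap in caseload levels between the two groups drops out of the calculation, but any event that affected only one group during the post-reform period would still be attributed to the program.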

The WNW demonstration sites were selected before the comparison sites, and they were selected from sites with a particular interest in implementing the WNW model. There are only two demonstration counties; both are small and relatively prosperous. For each demonstration county, MAXIMUS (the evaluation contractor) selected two nonadjacent comparison counties that are similar in characteristics such as urbanicity, population, and caseload size. MAXIMUS will use multivariate statistical models with case-level control variables to attempt to control for remaining differences. It is unlikely, however, that matched comparison counties and statistical models will adequately control for the fact that the demonstration counties were preselected. It may not be possible to separate the effects of the program from the effects of being in a county where program staff and administrators were highly motivated to put clients to work.