Approaches to Evaluating Welfare Reform: Lessons from Five State Demonstrations

a. Distinguishing Impacts for Policy Changes Using Experimental Methods


The most rigorous way to distinguish impacts for specific policy changes is to employ an evaluation design with random assignment to multiple experimental groups. If there are several experimental groups, each exposed to different sets of policies, and a control group exposed to pre-reform policies, then the impacts of each set of policies can be measured and compared. Without such a design, the direction and relative size of impacts from two sets of policy changes can sometimes be inferred from the impact of both sets together.

Unlike an evaluation design with random assignment to a single experimental group, an evaluation design with random assignment to multiple experimental groups allows impacts to be estimated for multiple sets of policy changes, even if these changes were implemented simultaneously. For example, a welfare reform package may include both expanded earned income deductions and work requirements. Each set of provisions is likely to have positive impacts on employment rates. If an evaluation design included only one experimental group (X1), subject to both provisions, and a control group (N1), subject to neither provision, it would be impossible to distinguish separate impacts on employment. In contrast, if an evaluation design also included two partial experimental groups, one (X2) subject to the earnings incentives but not the work requirements, and the other (X3) subject to the work requirements but not the earnings incentives, the following impacts could be measured:

  • (X1 - N1) = combined impact of earnings incentives and work requirements
  • (X2 - N1) = impact of earnings incentives when work requirements are absent
  • (X1 - X3) = impact of earnings incentives when work requirements are present
  • (X3 - N1) = impact of work requirements when earnings incentives are absent
  • (X1 - X2) = impact of work requirements when earnings incentives are present
  • (X2 - X3) = impact of earnings incentives alone versus work requirements alone
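
The six contrasts above can be sketched in a few lines of Python. This is only an illustration: the employment rates in `group_means` are hypothetical numbers chosen for the example, not results from any demonstration, and the helper name `impact` is an assumption. Each impact is simply a difference in mean outcomes between two randomly assigned groups.

```python
# Hypothetical mean employment rates by research group (illustrative only,
# not drawn from any evaluation).
group_means = {
    "X1": 0.50,  # earnings incentives and work requirements
    "X2": 0.44,  # earnings incentives only
    "X3": 0.46,  # work requirements only
    "N1": 0.40,  # control group: pre-reform policies
}


def impact(first: str, second: str, means: dict) -> float:
    """Difference in mean outcomes between two randomly assigned groups."""
    return means[first] - means[second]


# Each label maps to the pair of groups being compared (first minus second).
contrasts = {
    "combined impact": ("X1", "N1"),
    "incentives, requirements absent": ("X2", "N1"),
    "incentives, requirements present": ("X1", "X3"),
    "requirements, incentives absent": ("X3", "N1"),
    "requirements, incentives present": ("X1", "X2"),
    "incentives alone vs. requirements alone": ("X2", "X3"),
}

impacts = {
    label: round(impact(a, b, group_means), 4)
    for label, (a, b) in contrasts.items()
}

for label, value in impacts.items():
    print(f"{label}: {value:+.2f}")
```

In an actual evaluation each difference would be accompanied by a standard error, since a contrast between two partial groups rests on smaller samples than a pooled comparison.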

Moving from a two-group experimental design to a four-group experimental design increases the number of impact estimates that can be obtained from the evaluation from one to six. If a three-group experimental design were used (for example, groups N1, X1, and X2), then three impact estimates could be obtained: (X1 - N1), (X2 - N1), and (X1 - X2). However, it is helpful to be able to estimate the impacts for X2 and X3 separately. Because earnings incentives and work requirements may interact, it will not necessarily be the case that (X1 - N1) = (X2 - N1) + (X3 - N1).
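
The counting and the interaction point can be written out explicitly. In the sketch below, X_1, X_2, X_3, and N_1 stand for the mean outcomes of the corresponding groups, and the symbol I for the interaction effect is introduced here for illustration; it does not appear in the original discussion.

```latex
% Number of pairwise impact estimates available from k randomly assigned groups
\binom{2}{2} = 1, \qquad \binom{3}{2} = 3, \qquad \binom{4}{2} = 6

% Interaction effect: the amount by which the combined impact differs
% from the sum of the two stand-alone impacts
\begin{align*}
I &= (X_1 - N_1) - \bigl[(X_2 - N_1) + (X_3 - N_1)\bigr] \\
  &= X_1 - X_2 - X_3 + N_1 \\
  &= (X_1 - X_3) - (X_2 - N_1)
\end{align*}
```

The last line shows that I is also the difference between the impact of earnings incentives when work requirements are present and their impact when work requirements are absent. The impacts are additive only when I = 0, and I can be estimated only when all four groups are available.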

When an experimental evaluation design includes only a single experimental group subject to all welfare reform policies, and a control group subject to none, distinguishing the impacts of separate policy changes is more difficult. Sometimes, even when two sets of policies are implemented simultaneously, there are theoretical grounds for attributing opposite signs (directions) to the impacts of each set. In this case, the sign of the combined impact estimate indicates which set of policies has the larger impact. For example, expanded earnings incentives are likely to increase welfare participation levels by broadening eligibility, while stricter work requirements are likely to decrease welfare participation levels by reducing leisure or by imposing financial sanctions for noncompliance. If the impact of the combined changes on welfare participation is positive, then the positive impact of expanded earnings incentives must be larger in magnitude than the negative impact of stricter work requirements.

In contrast, whenever the anticipated impacts from multiple policy changes are in the same direction, it is impossible to distinguish the impacts of specific changes with only one experimental group. For example, since expanded earnings incentives and stricter work requirements both are likely to lead to higher employment rates, the evaluator cannot assess the contribution of each policy change to the package's overall impact on employment, even with knowledge of the combined impact of these two provisions.

When welfare reform policies are implemented sequentially rather than simultaneously (usually for programmatic rather than evaluation reasons), additional opportunities may arise to infer impacts for separate policies in a two-group experimental design. For example, if expanded earnings incentives are implemented immediately, but stricter work requirements are added after 24 months, the impacts estimated for the first two years can be attributed to the expanded earnings incentives alone.
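
The attribution rule in this example can be sketched as follows. The sketch assumes, for simplicity, that every case was randomly assigned when the earnings incentives took effect, so that follow-up month equals months since initial implementation. The names `POLICY_START_MONTH` and `policies_in_effect` and the monthly impact figures are hypothetical and serve only to illustrate the logic.

```python
# Hypothetical schedule: a policy applies in follow-up months strictly after
# its start month, so months 1-24 are covered by earnings incentives alone.
POLICY_START_MONTH = {"earnings incentives": 0, "work requirements": 24}


def policies_in_effect(followup_month: int) -> list:
    """Return the policies the experimental group faces in a given month."""
    return [
        policy
        for policy, start in POLICY_START_MONTH.items()
        if followup_month > start
    ]


# Hypothetical experimental-control differences in employment rates,
# keyed by follow-up month (illustrative only).
monthly_impacts = {6: 0.03, 12: 0.04, 18: 0.04, 24: 0.05, 30: 0.08, 36: 0.09}

# Impacts attributable to earnings incentives alone.
incentives_only = {
    month: value
    for month, value in monthly_impacts.items()
    if policies_in_effect(month) == ["earnings incentives"]
}

# Impacts that reflect both policies together.
combined = {
    month: value
    for month, value in monthly_impacts.items()
    if len(policies_in_effect(month)) == 2
}

print("incentives only:", sorted(incentives_only))
print("both policies:", sorted(combined))
```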

Although the staggered implementation of welfare reform policies provides opportunities for inferring impacts of particular changes, care must be taken in determining the groups compared after the implementation of a second set of reforms. Welfare cases assigned to experimental or control groups before the second stage of implementation are likely to be affected by their initial exposure to only the first set of welfare reform policies. Evaluators should distinguish impacts on cases with staggered exposure to the two welfare reform packages from impacts on cases exposed to the packages in combination only.
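
One way to keep these two kinds of cases apart is to classify each case by its random assignment date and estimate impacts separately for each cohort. The following is a minimal sketch under stated assumptions: the second-stage date, the cohort labels, the function name `exposure_cohort`, and the toy case records are all hypothetical and do not come from any of the demonstrations.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical date on which the second set of reforms took effect.
SECOND_STAGE_START = date(1995, 7, 1)


def exposure_cohort(assignment_date: date) -> str:
    """Classify a case by whether it was assigned before the second stage."""
    if assignment_date < SECOND_STAGE_START:
        return "staggered"      # exposed to the first package alone, then both
    return "combined_only"      # exposed to both packages from the outset


# Toy case records: (assignment date, research group, employed at follow-up).
cases = [
    (date(1994, 3, 1), "experimental", 1),
    (date(1994, 5, 1), "experimental", 0),
    (date(1994, 4, 1), "control", 0),
    (date(1994, 6, 1), "control", 0),
    (date(1995, 9, 1), "experimental", 1),
    (date(1995, 10, 1), "experimental", 1),
    (date(1995, 8, 1), "control", 0),
    (date(1995, 11, 1), "control", 0),
]

# Collect outcomes by (cohort, research group).
outcomes = defaultdict(list)
for assigned, group, employed in cases:
    outcomes[(exposure_cohort(assigned), group)].append(employed)

# Experimental-control difference in mean outcomes, computed within each cohort.
impact_by_cohort = {
    cohort: mean(outcomes[(cohort, "experimental")])
    - mean(outcomes[(cohort, "control")])
    for cohort in ("staggered", "combined_only")
}

print(impact_by_cohort)
```

Because random assignment holds within each cohort, each difference remains an experimental estimate. Comparing impacts across cohorts is not experimental, however, since cases assigned at different times may differ in other respects.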