Even if the design of a welfare reform evaluation is fundamentally sound, the implementation of the evaluation is critical for its overall success. The successful implementation of an experimental welfare reform evaluation generally will include the following features:
Failure to achieve these goals when implementing an experimental welfare reform evaluation may lead to inaccurate estimates of the impacts of welfare reform policies.
This chapter addresses four issues concerning the implementation of experimental welfare reform evaluations:
An important issue in implementing an experimental evaluation design is determining when to perform random assignment of cases to the experimental and control groups. The timing of random assignment does not necessarily bias the impact estimates obtained through an experimental welfare reform evaluation. It does, however, determine the population of cases for which those estimates are applicable.
This section focuses on two questions:
For recipient cases, random assignment can occur either at a single point in time (when the welfare reform policies are first introduced in the research sites) or over time (as ongoing cases go through the redetermination process).
The chief advantage of random assignment at the time of redetermination is that, immediately following random assignment, cases can be informed in person of their experimental or control status. The provision of this information in person will ensure that recipients are informed of the programs and rules that apply to them and avoid any confusion about the policies to which they are subject.
The chief disadvantage of sampling recipients at the time of redetermination is that this approach excludes cases that do not remain on assistance long enough to reach this point. For example, when assignment occurs at redetermination, the recipient sample may include no cases that have been on assistance for less than six months.
When randomly assigning applicants to treatment or control status, the goal is to include all applicants who could be eligible for assistance under either experimental or control policies, but to exclude applicants who are twice-ineligible--ineligible under both sets of rules. The timing of random assignment before or after eligibility determination may crucially affect whether the sample meets this goal.
If welfare reform does not change eligibility rules at all, then eligibility determination should precede random assignment. Then, only approved applicants are kept in the sample.
If eligibility rules change, there is no ideal solution, unless it is feasible to determine eligibility under both sets of rules. When random assignment of applicants occurs before the determination of eligibility for benefits, then the sample of applicants will likely include twice-ineligible cases. These cases probably will not be affected by welfare reform programs, unless they later become eligible and reapply for welfare benefits. Ignoring the possibility of future changes in eligibility, it is reasonable to assume that the impact of welfare reform on twice-ineligible cases is zero. Under this assumption, the estimated impact of welfare reform on the entire sample of applicants will be smaller than the estimated impact of welfare reform on the sample of eligible applicants--those eligible under at least one set of rules. The extent to which impact estimates are diluted for applicants will depend on how large a portion of the sample of applicants is twice-ineligible.(1)
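The dilution described above is simple arithmetic, sketched below. The sketch assumes, as the text does, that the impact on twice-ineligible cases is exactly zero. The function name and the dollar figures are hypothetical illustrations and are not taken from any of the evaluations reviewed.

```python
def diluted_impact(impact_on_eligibles: float, share_twice_ineligible: float) -> float:
    """Average impact over the full applicant sample.

    Assumes the impact on twice-ineligible cases is zero, so the
    full-sample impact is the eligible-case impact scaled by the
    share of applicants who are eligible under at least one set of rules.
    """
    if not 0.0 <= share_twice_ineligible <= 1.0:
        raise ValueError("share_twice_ineligible must be between 0 and 1")
    return impact_on_eligibles * (1.0 - share_twice_ineligible)


# Hypothetical example: a $100 monthly earnings impact among eligible
# applicants, with 20 percent of the applicant sample twice-ineligible.
full_sample_impact = diluted_impact(100.0, 0.20)
print(round(full_sample_impact, 2))
```

Under these hypothetical figures, the full-sample estimate is one-fifth smaller than the impact on eligible applicants; the larger the twice-ineligible share, the greater the dilution.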
When eligibility rules change and random assignment of applicants occurs after the determination of eligibility for benefits, then the sample of applicants will be restricted to cases eligible for assistance under the rules used to determine eligibility prior to random assignment. The sample will contain all applicants eligible for assistance under either set of policies only under the following circumstances:
The first example is of a situation in which one eligibility calculation occurs prior to random assignment and another shortly thereafter, and the second is of a situation in which a dual eligibility calculation occurs prior to random assignment.
States generally will find it most convenient to perform a single eligibility calculation for each case, with no subsequent eligibility calculation occurring until the time of redetermination. Sequential or dual eligibility calculations usually will impose greater administrative burdens on states. For example, a state may prefer to have separate staff administer experimental and control policies to avoid confusion of case status or corruption of the random-assignment process. Performing a sequential or dual eligibility determination while maintaining separate staff could effectively double the staff time needed to determine a case's eligibility for welfare benefits.
Sequential eligibility determinations also may create awkward situations for states. For example, if welfare reform expands eligibility, and initial eligibility is determined under this broader set of rules, the state will need to recalculate eligibility and benefits under the narrower set of rules for cases subsequently assigned to the control group. This second eligibility calculation may result in benefits being lowered or eliminated entirely for cases in the control group. Specifically, to ensure that control cases receive only control group policies following random assignment, benefits would need to be reduced retroactively to account for the broader rules initially applied.
It might be feasible to perform dual eligibility calculations if a computer determined eligibility. This could be done in five steps:
In this section, we describe the timing of random assignment in the four evaluations featuring an experimental design. We consider random assignment of recipient and applicant cases separately.
Only Minnesota's evaluation implemented random assignment of recipients at the time of redetermination; the other evaluations implemented the random assignment of recipients at a single point in time. In California's evaluation, random assignment of recipients occurred about one month before the implementation of welfare reform throughout the state, with oversampling to allow for errors and attrition; recipients who did not receive welfare benefits during the month of implementation were deleted from the sample. In Colorado's evaluation, welfare reform was not implemented throughout the state, so recipients assigned to the experimental group were the only group subject to welfare reform policies. In Michigan's evaluation, random assignment of recipients occurred at the same time as the implementation of welfare reform throughout the state.
Two of the evaluations implemented random assignment of applicants before eligibility determination, while the other two implemented random assignment after eligibility determination. Both Michigan's and Minnesota's evaluations assigned applicants to experimental and control groups before determining eligibility. In Minnesota's evaluation, random assignment was immediately preceded by baseline data collection; it was immediately followed by eligibility determination (under the rules for the appropriate group) and then separate group orientation sessions for the experimental and control groups. Minnesota's research sample included denied applicants. Denied applicants who had gone through random assignment were excluded from Michigan's sample because of data limitations. Fortunately, Michigan's welfare reform and control group policies differed little in eligibility rules, so the comparability of the experimental and control groups probably was not seriously compromised by the exclusion of denied applicants from the sample.
In California's evaluation, random assignment of applicants occurred after intake and eligibility determination (under experimental group rules, which were identical to the control group rules for applicants). Originally, random assignment of applicants was to occur at intake, but this "proved to be too expensive and disruptive to county operations."(2) Control group members were subject to experimental policies for one or two months. A control case whose initial AFDC benefits were too low would receive a retroactive supplement. A control case whose initial AFDC benefits were too high (the rarer case) would not receive a retroactive reduction. Short-term AFDC recipients in the control group may have left AFDC without being exposed to control group policies. Control cases receiving AFDC-UP who began working over 100 hours after enrollment (but before random assignment) were dropped from the research sample rather than disenrolled from AFDC-UP. To preserve comparability of the experimental and control group samples, the corresponding AFDC-UP cases in the experimental group should also have been dropped from the research sample, but this action does not appear to have been taken.
In Colorado's evaluation, random assignment of applicants occurred one month after cases were observed in the state's data system. Only applicant cases approved for AFDC under the (slightly more restrictive) control group policies were included in the research sample. Experimental cases in the sample of approved applicants were subject to control group policies for one month before the application of experimental group policies.
None of the welfare evaluations we reviewed performed dual eligibility calculations. As states upgrade the computer systems used to administer their assistance programs, this sort of random-assignment process may be more feasible in the future.
We have reviewed the timing of random assignment in four evaluations featuring an experimental design. Here, we offer separate recommendations for the random assignment of recipients and the random assignment of applicants.
Only Minnesota's evaluation performed random assignment of recipients at the time of redetermination; the other evaluations assigned recipients to experimental and control groups at a single point in time. The latter approach has the disadvantage that it introduces a potential lag between random assignment and a case becoming aware of its experimental or control status. When recipients are all assigned at once, their first knowledge of their experimental or control status will probably be by letter, with an in-person explanation being perhaps weeks or months away. In contrast, random assignment at the time of redetermination allows recipients to be told in person about the policies to which they are subject and to be reminded that other cases in the same region may be subject to a different set of policies. For example, in Minnesota, group orientation sessions were held for recipients who had just gone through random assignment to explain the policies to which they were subject.
Because it is desirable that cases be informed of their experimental or control status immediately following random assignment, we recommend that states consider the option of performing random assignment of recipients at the time of redetermination. However, states should be aware that if they perform random assignment at redetermination, the sample of recipients will necessarily exclude cases that leave welfare between the time of implementation of welfare reform and their next time of redetermination.
If recipients are assigned to experimental or control groups over time, rather than all at once, the implementation of welfare reform in the research sites will necessarily be gradual. The gradual phasing in of welfare reform in the research sites may conflict with a state's desire for a dramatic implementation of welfare reform throughout the state. If a state prefers to implement welfare reform policies all at once in the research sites, then it would be helpful if the state informs recipients in advance that they may be assigned to one of two welfare programs--the welfare reform program or the preexisting welfare program. Recipients should not assume that they will necessarily be subject to welfare reform provisions. Upon going through random assignment, recipients would be notified of the particular policies that apply to them.
We recommend that, if welfare reform substantially changes the rules by which eligibility is determined, states perform dual eligibility calculations before random assignment of applicants. They should include in the sample of applicants all cases eligible for welfare under either experimental or control group rules but exclude cases eligible for welfare under neither set of rules. Dual eligibility calculations are particularly justified if twice-ineligible cases represent a large proportion of all applicants. Including twice-ineligible cases in the sample of applicants increases the number of observations for which the likely impact of welfare reform is zero and may make it more difficult to identify statistically significant impacts for eligible applicants. When eligibility rules are the same for the experimental and control groups, only a single eligibility calculation is necessary. All ineligible cases are excluded from the sample of applicants.
None of the random-assignment evaluations we reviewed included dual eligibility calculations before random assignment. Only one of these evaluations was of a waiver package in which the experimental and control groups faced identical rules for welfare eligibility, although in two other instances eligibility rules were only slightly different for the experimental and control groups.
We recognize that, for ease of administration, states may be prepared to perform only a single eligibility calculation, either before or after random assignment. When welfare reform does not change welfare eligibility rules, we recommend that random assignment of applicants occur after eligibility has been determined. When welfare reform changes eligibility rules, we recommend that random assignment of applicants occur before the determination of welfare eligibility and benefits. If a state follows this pattern, there will be no discrepancy between the initial determination of a case's welfare eligibility and benefits and its assigned experimental or control status. Because random assignment before eligibility determination is likely to introduce twice-ineligible cases into the sample of applicants, we recommend that evaluators track all applicant cases (approved and denied) and report cumulative denial rates for applicants in the experimental and control groups. For the group subject to the broader eligibility rules, the denial rate may provide an approximate measure of the share of applicant cases for which the likely impact of welfare reform is zero.
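The denial rates recommended above could be tabulated along the following lines. This is a minimal sketch that assumes each applicant case is recorded as a (group, approved) pair; the record layout and the function name are hypothetical and do not describe any state's data system.

```python
from collections import Counter


def denial_rates(cases):
    """Return the share of applicant cases denied, by research group.

    `cases` is an iterable of (group, approved) pairs, where `approved`
    is True for approved applicants and False for denied applicants.
    """
    totals = Counter()
    denials = Counter()
    for group, approved in cases:
        totals[group] += 1
        if not approved:
            denials[group] += 1
    return {group: denials[group] / totals[group] for group in totals}


# Hypothetical records for illustration only.
sample = [
    ("experimental", True),
    ("experimental", False),
    ("control", True),
    ("control", True),
]
print(denial_rates(sample))
```

Tracking denied as well as approved applicants is what makes this tabulation possible; a sample restricted to approved cases would show a denial rate of zero in both groups by construction.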
Another important issue in implementing an experimental evaluation design is determining how to perform random assignment of cases to the experimental and control groups. If an inappropriate method is chosen for performing random assignment, the experimental and control groups may not be truly comparable at baseline, and inferences about subsequent impacts from welfare reform may be biased.
This section focuses on two aspects of performing random assignment:
Random assignment should generate no systematic differences between the baseline characteristics of the experimental group and the baseline characteristics of the control group. The simplest way to accomplish this is to assign cases to the experimental or control group with probabilities equal to the proportion of cases in each group. For example, if the experimental group is expected to equal 35 percent of the cases passing through random assignment, each case passing through random assignment should have a 35 percent chance of selection into the experimental group. Neither the cases themselves nor the program staff administering the program should have any say over who is selected for the experimental group or who is selected for the control group. Otherwise, certain cases may be favored to receive particular policies, leading to systematic baseline differences between experimental cases and control cases.
Random assignment could be accomplished in a manner consistent with these principles in several ways. One approach would be to have a computer generate, for each case, a random number from a uniform distribution between 0 and 1. Suppose, for example, that the state wished to assign 35 percent of cases to the experimental group, 35 percent to the control group, and 30 percent to a nonresearch sample. The computer would compare the random number with the case's probability of selection into the experimental group (0.35 when the probability of selection equals 35 percent). If the number were less than or equal to the probability of selection, the case would be assigned to the experimental group. If the number were greater than the probability of selection, but less than or equal to the probability of being selected into either the experimental group or the control group (0.70 when the probability of selection equals 35 percent for each group), then the case would be assigned to the control group. Otherwise, the case would be assigned to the nonresearch sample.
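The comparison rule just described can be sketched as follows. The 35/35/30 split comes from the example above; the function names, the use of Python's standard `random` module, and the optional seed are our assumptions for illustration, not a description of any state's system.

```python
import random


def assign_group(u: float, p_experimental: float = 0.35, p_control: float = 0.35) -> str:
    """Map a uniform random number in [0, 1) to a research status."""
    if u <= p_experimental:
        return "experimental"
    if u <= p_experimental + p_control:
        return "control"
    return "nonresearch"


def assign_cases(case_ids, seed=None):
    """Draw one uniform random number per case and assign its status.

    A fixed seed makes the assignment reproducible for auditing; it is
    an illustrative convenience, not a requirement stated in the text.
    """
    rng = random.Random(seed)
    return {case_id: assign_group(rng.random()) for case_id in case_ids}


# Hypothetical case identifiers.
statuses = assign_cases(["A001", "A002", "A003", "A004"], seed=2024)
for case_id, status in statuses.items():
    print(case_id, status)
```

Because the draw is made by the computer at the moment of assignment, neither the case nor program staff can know or influence the outcome in advance.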
Another method of random assignment would be to assign cases by the social security number (SSN) of the case head. Since SSNs are not entirely random, only digits at or near the end of the number should be used for random assignment. For example, when the probabilities of selection to the experimental group and the control group are each 35 percent, random assignment could be on the basis of the last two digits of the case head's SSN (00 to 34 resulting in assignment to the experimental group, 35 to 69 resulting in assignment to the control group, and 70 to 99 resulting in assignment to the nonresearch sample). This approach carries a slight risk that, if potential applicants are informed of an SSN-based selection rule in advance, their decision about whether to apply for welfare or whom to identify as the case head might be affected, thereby corrupting the process of random assignment.
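The SSN-based rule in the example above can be sketched as follows. The cutoffs (00 to 34, 35 to 69, 70 to 99) are those given in the text; the function name and the input handling are hypothetical, and the numbers shown are fictitious placeholders rather than real SSNs.

```python
def assign_by_ssn(ssn: str) -> str:
    """Assign research status from the last two digits of the case head's SSN."""
    digits = [ch for ch in ssn if ch.isdigit()]
    if len(digits) != 9:
        raise ValueError("expected a nine-digit SSN")
    last_two = int("".join(digits[-2:]))
    if last_two <= 34:
        return "experimental"
    if last_two <= 69:
        return "control"
    return "nonresearch"


# Fictitious numbers used only to show the cutoffs.
for example in ("000-00-0034", "000-00-0035", "000-00-0070"):
    print(example, assign_by_ssn(example))
```

Unlike a computer-generated random number, this rule is fully determined before the case arrives, which is why advance knowledge of the rule could corrupt the assignment.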
A third way to accomplish random assignment is to assign cases on the basis of some other number used for administrative purposes, such as a case number. It would be important to determine the manner in which this number is being generated, and in particular whether program officials have any control over its value. Assuming that particular digits were not subject to the control of program administrators, but were indeed generated randomly, the process of random assignment could proceed in a manner similar to random assignment using the case head's SSN. It would be important to ensure that recipient cases not become aware of the selection rule in advance; otherwise, certain cases might decide to leave welfare prior to random assignment, thereby corrupting the selection process.
An evaluator can detect problems with the method of random assignment in two ways. First, as part of an implementation study, the evaluator can conduct interviews with program staff in the research sites. These interviews can shed more light on how random assignment was accomplished in practice, as well as whether caseworkers or prospective clients had any opportunity to influence (either intentionally or unintentionally) the odds of a case being assigned to the experimental group instead of the control group.
A second method of assessing the method of random assignment is to compare the baseline characteristics of the experimental and control groups as a preface to an impact study. These comparisons would include statistical tests of differences of average levels for experimental and control cases. On average, there should be few statistically significant differences between the baseline characteristics of the experimental group and those of the control group. The occasional detection of a statistically significant difference between experimental and control cases does not prove that random assignment was flawed. However, the more statistically significant differences that are detected between experimental and control cases at baseline, and the larger these differences, the more likely it is that the assignment of cases to the respective categories was not entirely random.(3)
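One form such a comparison could take is sketched below for a single baseline characteristic. The sketch assumes a large-sample normal approximation for the difference in means; the evaluations reviewed may have used other tests, and the function name and the data are hypothetical.

```python
import math
from statistics import mean, variance


def baseline_difference_test(experimental, control):
    """Compare group means for one baseline characteristic.

    Returns (difference in means, z statistic, two-sided p-value),
    using a large-sample normal approximation with unequal variances.
    """
    n_e, n_c = len(experimental), len(control)
    diff = mean(experimental) - mean(control)
    se = math.sqrt(variance(experimental) / n_e + variance(control) / n_c)
    if se == 0:
        raise ValueError("no variation in either group; test is undefined")
    z = diff / se
    # Two-sided p-value under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p_value


# Hypothetical baseline data: months on assistance before random assignment.
experimental_months = [6, 12, 18, 24, 30, 36, 9, 15]
control_months = [7, 11, 19, 23, 31, 35, 10, 14]
diff, z, p = baseline_difference_test(experimental_months, control_months)
print(diff, round(z, 3), round(p, 3))
```

When many characteristics are tested, some differences will appear significant by chance alone: at a 5 percent significance level, about 1 in 20 comparisons would be expected to do so even under flawless random assignment.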
If both interviews with program staff and comparisons of experimental and control cases uncover irregularities in the process of random assignment, then there is a good chance that the process of random assignment was flawed. Otherwise, the assumption that the experimental and control groups are comparable to each other generally can be maintained, and experimental-control differences in subsequent outcomes can be attributed to differential exposure to welfare reform policies.
In this section, we describe the method of random assignment in the four evaluations that feature an experimental design.
The four random-assignment evaluations used different methods of random assignment. Both Colorado's and Minnesota's evaluations performed random assignment using a random number generated by a computer.
California's evaluation assigned cases by sorting them by case number and then using a random start and interval sampling to determine membership in the experimental and control groups. Because case numbers are assigned sequentially in each county, this method ensured that the experimental and control samples of recipient cases had exactly the same proportion of cases of different welfare tenures. If another method of random assignment had been used, there would be no guarantee that the tenure distribution of cases would be identical for the experimental and control groups, although on average the distribution of tenures would be the same for both groups.
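Interval sampling with a random start can be sketched in general terms as follows. The source does not report the interval or the exact pattern used in California, so the rule below (within each block of `interval` consecutive cases, one is experimental, one is control, and the rest are nonresearch) is purely an assumption for illustration, as are the function name and the case numbers.

```python
import random


def interval_assign(case_numbers, interval, seed=None):
    """Assign statuses by interval sampling over cases sorted by case number.

    A random start in [0, interval) fixes which position in each block
    of `interval` consecutive cases goes to the experimental group; the
    next position goes to the control group. This pattern is hypothetical.
    """
    if interval < 2:
        raise ValueError("interval must be at least 2")
    rng = random.Random(seed)
    start = rng.randrange(interval)
    assignment = {}
    for position, case_number in enumerate(sorted(case_numbers)):
        offset = (position - start) % interval
        if offset == 0:
            assignment[case_number] = "experimental"
        elif offset == 1:
            assignment[case_number] = "control"
        else:
            assignment[case_number] = "nonresearch"
    return assignment


# Hypothetical sequential case numbers.
statuses = interval_assign(range(1000, 1100), interval=10, seed=7)
print(sum(1 for s in statuses.values() if s == "experimental"))
```

Because sequential case numbers order cases by when they entered assistance, sampling at fixed intervals along that order spreads both groups evenly across welfare tenures, which is the property noted above.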
Michigan's evaluation assigned recipient cases using the eligibility worker's caseload identification number rather than the case number. As a result, recipient cases were assigned experimental, control, or nonresearch status in groups rather than as individual cases. Caseworkers then specialized in administering the welfare rules to which their clients had been assigned. The main advantage of this method was that it ensured that recipients, and in particular recipients in the control group, did not experience the disruption of having to change caseworkers because of the implementation of the evaluation. Random assignment of applicant cases was performed using the last two digits of the head's SSN.
As required by the terms and conditions of the Section 1115 welfare waivers, all of the evaluators assessed the method of random assignment both through interviews with program staff and through statistical comparisons of the baseline characteristics of experimental and control cases. Neither Colorado's nor Minnesota's evaluation reported any concerns about the implementation of random assignment. California's evaluation experienced some problems in sampling applicants; originally, individuals (rather than cases) were sampled, and some previous recipients were sampled as "new" applicants. In Michigan's evaluation, while some statistically significant differences were detected between the baseline characteristics of the experimental group and the baseline characteristics of the control group, none of these differences exceeded two percentage points. Possible explanations for these differences include mere chance, the random assignment of recipients in groups rather than as individual cases, and the exclusion of denied applicants from the analysis sample.
We have reviewed the method of random assignment in four evaluations featuring an experimental design. Here, we present recommendations on two issues: (1) selection of a method of random assignment, and (2) assessment of the performance of the method of random assignment.
The four evaluations we examined differed in the method they used to perform random assignment. California's evaluation used an interval sampling approach relying on sequential case numbers with a random start. Colorado's and Minnesota's evaluations used computer-generated random numbers to perform random assignment. Michigan's evaluation used the eligibility worker's number to perform random assignment of recipients and the case head's SSN to perform random assignment of applicants.
All of these approaches appear to have been acceptable means of performing random assignment, but each has advantages and disadvantages. Use of a computer-generated random number best ensures that cases will not learn of their experimental or control status before random assignment, even when cases go through random assignment over a period of time. The main disadvantage of this approach is that program administrators may lack the resources to generate a random number for each case at the time of random assignment.
Interval sampling has the advantage of being a simple and straightforward approach. If case numbers are assigned sequentially, interval sampling will produce experimental and control groups in which cases of different welfare tenures are guaranteed to be represented in the same proportions in both groups. The disadvantage of this approach is that steps must be taken to ensure that the interval sampling begins with a random starting point and that the subsequent assignment of recipient and applicant cases to the list used in sampling is immune from manipulation by state employees or the cases themselves. An example of such manipulation would be program staff altering the sequence in which applicants are entered on the list.
Random assignment of recipients by eligibility worker number has the advantage of preserving the relationship between caseworkers and ongoing cases. This relationship is more likely to be disrupted by random-assignment approaches that assign recipients individually, since cases may then be moved to new eligibility workers or even to new welfare offices. The main disadvantage of random assignment by eligibility worker number is that cases are assigned experimental and control status in groups rather than on an individual basis, thereby increasing the possibility of large differences in the baseline characteristics of the experimental and control group samples.
Random assignment by the last two digits of the case head's SSN has the advantage of employing a number readily available from a state's administrative records. This approach introduces a slight risk of giving advance notice to recipients and potential applicants of the rules by which experimental or control status is determined. Anticipating their experimental or control status, recipients may decide to leave welfare before random assignment, and potential applicants may decide either not to apply for assistance or to apply with another individual identified as the case head.
In all the experimental evaluations reviewed, an implementation study documented program administrators' perspectives on the process of random assignment, and the impact study compared the baseline characteristics of cases in the experimental and control groups. We believe that both of these investigations are valuable in uncovering any irregularities in the method of selecting the experimental and control groups. We recommend that evaluators continue to use both approaches for monitoring the implementation of random-assignment evaluations.
Another important issue in implementing a welfare reform evaluation is ensuring that the members of the control group are subject to the welfare policies that would have been in place in the absence of the welfare reform program. Even if random assignment proceeds without error, the implementation of welfare reform may alter control group policies:
When control group policies are changed, the resulting measures of the impacts of the welfare reform program will be biased, because the control group is subject to policies and situations qualitatively different from those that would have existed in the absence of welfare reform. In spillover, impact estimates may be too small, since control group members receive some welfare reform policies. In displacement, impact estimates may be too large, since control group members fail to receive services they would have received in the absence of the welfare reform.
This section discusses two issues related to displacement, spillover, and other changes in control group policies:
In situations in which spillover occurs, control group members receive services they would not have received in the absence of welfare reform. A prime example of spillover is the presence of community effects, in which welfare reform affects members of the control group through changes in community institutions or norms. For example, if welfare reform is accompanied by strong community expectations that welfare recipients work, then control group members may be subject to additional social pressure to obtain employment, even if the formal policies to which they are subject have not changed.
The preceding example suggests that a certain degree of spillover is likely, since a welfare reform program usually will be accompanied by changing attitudes in the community. Indeed, these changes in attitudes and expectations are often both the cause and the intended consequence of a major welfare reform initiative. The resulting contamination of control group policies can be reduced by reminding members of the control group of the policies that apply to them, as well as the policies that do not apply to them (even if neighbors, friends, or relatives are subject to welfare reform provisions). Cases can be given these reminders through mailings, group orientation sessions, or regular meetings with the caseworker. The most effective approach probably is meetings with the caseworker, since they occur on a regular basis and provide information in person.
Spillover also can occur as the consequence of administrative error or manipulation, in which program staff administer welfare reform policies to individuals who are still officially in the control group. For example, a welfare reform-related instructional video could be shown to all research cases, because program staff fail to distinguish the experimental and control groups. This type of spillover can be reduced in several ways. To the extent possible, separate staff can administer welfare reform policies and control group policies. In addition, files for experimental and control cases can be clearly distinguished through the use of different colored folders or other measures, so welfare reform policies are never accidentally applied to control group members.
Displacement of control group policies is the opposite of spillover, since it involves control group members failing to receive services they would have received in the absence of welfare reform. An example of displacement would be reductions in job-training services to control group members because of longer waiting lists arising from welfare reform. Usually, any displacement is the unintentional consequence of administering two welfare systems at the same research site. On rare occasions, however, displacement may arise through the intentional actions of program administrators, who may devote greater attention to ensuring that welfare reform "works" and less attention to maintaining the "old" welfare system.
As with spillover, a certain degree of displacement is likely to occur in an experimental evaluation, since the administration of two programs in the same research site may reduce the resources available for administering services to members of the control group. Nonetheless, the risk of displacement may be reduced by seeking to preserve enough administrative resources for the continuation of control group policies and by maintaining separate staff to administer experimental and control group policies to avoid any manipulation by program administrators.
In this section, we consider state approaches to the challenges of minimizing the risk of spillover and minimizing the risk of displacement.
The spillover of welfare reform services to the control group was difficult to measure in the evaluations we reviewed. In Colorado's evaluation, welfare reform was implemented only in the demonstration counties; before implementation, there was concern that the provision of JOBS services to control cases would increase because welfare reform-related JOBS expansions would spill over to control cases in the research counties. Subsequent to implementation, the evaluator has been concerned that political pressures may be resulting in the displacement of services to control cases. The evaluators of Michigan's initiative suspected that some spillover had occurred because of the welfare reform program's strong emphasis on preparing individuals for employment, but they were unable to quantify these effects. (Presumably, the amount of spillover in Michigan's demonstration was reduced because caseworkers were assigned to serve only either experimental or control cases.) In Wisconsin, changing community institutions was a major goal of the WNW initiative; the likelihood of spillover to a control group through community effects was a major justification for the adoption of a nonexperimental design for the evaluation.
The evaluations studied differed greatly in the steps taken to reduce the risk of displacement. As noted earlier, Michigan's evaluation assigned recipients on the basis of eligibility worker numbers, thereby preserving relationships between caseworkers and ongoing cases in the control group. In contrast, in California's evaluation, recipients in the control group were frequently assigned to different caseworkers and sometimes to different welfare offices, since the control group was a very small proportion of the county caseload and welfare reform policies were implemented for all other cases. The specialized control group caseworkers, fewer in number than the caseworkers administering welfare reform policies, did not always know the language of their clients and were sometimes located far away from them. The implementation of welfare reform effectively displaced the relationships that had developed between caseworkers and clients. In addition, as a result of being assigned to new caseworkers, control cases were more likely to be "cleaned" (have eligibility reexamined) than experimental cases continuing with the same caseworkers.
Both spillover and displacement are threats to a welfare reform evaluation, because they change the policies received by members of the control group and bias estimates of the impacts from welfare reform. It is usually difficult or impossible to adjust impact estimates for this bias, so states implementing welfare reform evaluations should take steps to minimize spillover and displacement.
Certain measures can help states reduce the risk of the spillover of experimental policies to the control group and the displacement of control group policies as a consequence of welfare reform. In particular, we recommend that states keep experimental and control group members well informed, both in writing and in person, of the policies that apply to them. In this way, states can counteract any spillover of attitudes and impressions from the implementation of welfare reform. We also recommend that states administer experimental and control policies using separate but equivalent staff, with minimal disruption and displacement of the services already being provided to recipients in the control group.
Evaluators can gather evidence on the extent to which spillover or displacement has occurred through a study of the implementation of a welfare reform evaluation. Such a study could include interviews with program administrators in which these individuals are asked to provide the following information:
Evaluators can also gain useful insights through interviews with participants about their perceptions of the policies that apply to them and the services they received. The process evaluation of APDP/WPDP in California is one example of a study that carefully considers these issues. This information will not solve the problems of spillover or displacement, but it will provide evidence of the extent to which these problems are present in an evaluation.
Even if control group policies consistently represent the policies that would have existed in the absence of welfare reform, individuals originally belonging to the control group may be exposed to experimental policies (or vice versa) in some situations:
All of these situations are examples of cases crossing over from one experimental/control status to another. We distinguish migrant crossover cases as cases that experience a change in experimental/control status because of migration; merge/split crossover cases as cases that experience a change in experimental/control status because of a case merger or split; and administrative crossover cases as cases that experience a change in experimental/control status because of administrative error or manipulation. In general, crossover from control to experimental status is more likely when most of the welfare cases in the state are subject to welfare reform policies, while crossover from experimental to control status is more likely when most of the welfare cases in the state are not subject to welfare reform policies.
Regardless of how crossover occurs, the presence of crossover cases in the research sample may result in biased impact estimates because some cases receive the other group's policies instead of the policies to which they were originally assigned. In particular, impact estimates may be too small, since a fraction of original control cases becomes subject to welfare reform policies, and/or a fraction of original experimental cases becomes subject to control group policies. Statistical methods for adjusting for crossover exist (they are discussed in Chapter VI); however, these methods have certain theoretical and practical limitations, so it is in a state's interest to minimize crossover.
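The direction of this bias can be illustrated with simple arithmetic. The sketch below is not the adjustment method discussed in Chapter VI; it assumes, purely for illustration, that welfare reform has the same impact on every case and that a crossover case receives the other group's policies in full. The function name and the example figures are hypothetical.

```python
def observed_impact(true_impact, share_control_crossed, share_experimental_crossed):
    """Difference in mean outcomes between the groups as originally assigned.

    Assumes (for illustration only) a constant impact per case and full
    exposure of crossover cases to the other group's policies.
    """
    # Share of each original group actually subject to welfare reform policies.
    experimental_exposed = 1.0 - share_experimental_crossed
    control_exposed = share_control_crossed
    return true_impact * (experimental_exposed - control_exposed)


# Hypothetical figures: a true impact of 100 (say, dollars of monthly earnings),
# with 10 percent of control cases and 5 percent of experimental cases crossing over.
diluted = observed_impact(100.0, share_control_crossed=0.10, share_experimental_crossed=0.05)
print(round(diluted, 2))  # 85.0
```

Under these assumptions, a comparison of the groups as originally assigned recovers only 85 percent of the true impact, which is why crossover tends to make impact estimates too small.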
The terms and conditions of Section 1115 welfare waivers specify several steps designed to reduce the incidence of cases crossing over from control group policies to experimental group policies (or vice versa):
These standards do not address crossover that occurs through administrative errors in the classification of cases' experimental or control status.
Although the reported incidence of crossover is seldom high for welfare reform waiver evaluations, the evaluations we reviewed had some difficulties in minimizing the risk of crossover. Most research sites were not next to each other, because the desire to obtain a sample representative of the state took precedence over the desire to reduce migration to nonresearch sites. In the four states with experimental evaluation designs, the lack of contiguous research counties may have increased the risk of crossover through migration.
The absence of experimental/control status information in individual records in state administrative systems (as opposed to case records) often made crossover from splits and mergers more likely by failing to identify individuals with previous membership in a research case. In California's evaluation, for example, county-specific automated systems made it difficult for caseworkers to identify crossovers from other counties in the state; the evaluator was able to achieve this identification by relying on state Medicaid records noting receipt of AFDC during the previous 12 months. In Michigan's evaluation, the state was unable to identify the previous research status of individuals reapplying for assistance (although the evaluator later obtained this information by merging case- and individual-level files).
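The kind of match that identifies such individuals can be sketched as follows. This is a minimal illustration, assuming that each individual carries a unique identifier and that the research status of earlier cases is available; the record layout, field names, and function name are hypothetical and do not describe any state's actual system.

```python
def flag_possible_crossovers(prior_status_by_person, current_cases):
    """Return (case_id, person_id, prior_status, current_status) for each
    individual whose current case has a different research status from a
    research case the individual belonged to earlier."""
    flagged = []
    for case in current_cases:
        for person_id in case["member_ids"]:
            prior = prior_status_by_person.get(person_id)
            # Only individuals seen before, and only when the status changed.
            if prior is not None and prior != case["status"]:
                flagged.append((case["case_id"], person_id, prior, case["status"]))
    return flagged


# Hypothetical records: "E" = experimental, "C" = control.
prior_status_by_person = {"P1": "C", "P2": "E"}
current_cases = [
    {"case_id": "A", "status": "E", "member_ids": ["P1", "P3"]},  # P1 was a control
    {"case_id": "B", "status": "E", "member_ids": ["P2"]},        # status unchanged
]
print(flag_possible_crossovers(prior_status_by_person, current_cases))
# [('A', 'P1', 'C', 'E')]
```

A match of this kind depends on research status being recorded for individuals and not only for cases, which is the information the state systems described above often lacked.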
Crossover is a potentially serious threat to a welfare reform evaluation, because it blurs the distinction between experimental and control cases and can lead to biased estimates of the impacts from welfare reform. The terms and conditions of Section 1115 welfare waivers have devoted considerable attention to minimizing the risk of crossover, and we recommend that states seek to adhere to the waiver standards for administering experimental or control policies to cases that migrate, merge, or split.
States can reduce the incidence of crossover in welfare reform evaluations in at least three additional ways:
(1) The share of applicants that is twice-ineligible can often be approximated by calculating the share of applicants that has always been denied welfare benefits. If welfare reform eligibility rules are strictly broader than control group eligibility rules, then the cumulative denial rate for experimental applicants is a good estimate of the fraction of applicants that is twice-ineligible. If control group eligibility rules are strictly broader than welfare reform eligibility rules, then the cumulative denial rate for applicants in the control group is a good estimate of the fraction of applicants that is twice-ineligible. If one set of eligibility rules is not strictly broader than another, then the fraction of applicants that is twice-ineligible cannot be determined merely from cumulative denial rates.
(2) UC DATA, "Assistance Payments Demonstration Project: Process Evaluation: Phase I," p. 8.
(3) Statistically significant differences between the baseline characteristics of experimental and control cases will be easier to detect when the size of the research sample is larger.
(4) We distinguish this change in status from the situation, discussed in the last section as an example of spillover, of a case retaining its official status but receiving the other group's policies because of administrative error or manipulation. A change in a case's official experimental/control status (unlike spillover) should be readily apparent to the evaluator and will generate systematic changes in the policies applied to the case.
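The rule in note (1) can be written out as a short sketch. The function and argument names are hypothetical, and the denial rates in the example are made up for illustration.

```python
def twice_ineligible_share(denial_rate_experimental, denial_rate_control, broader_rules):
    """Approximate the share of applicants ineligible under both sets of rules.

    broader_rules is "experimental", "control", or "neither", naming the group
    whose eligibility rules are strictly broader.
    """
    if broader_rules == "experimental":
        # Anyone denied under the broader rules is also ineligible under the narrower ones.
        return denial_rate_experimental
    if broader_rules == "control":
        return denial_rate_control
    # Neither set of rules contains the other: denial rates alone are not enough.
    return None


print(twice_ineligible_share(0.20, 0.30, "experimental"))  # 0.2
print(twice_ineligible_share(0.20, 0.30, "neither"))       # None
```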
Updated 09/24/01