Approaches to Evaluating Welfare Reform: Lessons from Five State Demonstrations

b. Subgroup Sample Sizes


Other than stratification of the sample between applicants and recipients (discussed in Section B), the only explicit stratifications of the sample in the five studies examined were by site (or grouping of sites, such as urban versus rural) and by single-parent versus two-parent cases. The motivation behind these stratifications generally was to allow more precise estimates for subgroups; the implications for precision of the estimates for subgroups and overall were not explicitly drawn out.

All of the evaluations (except Wisconsin, which is not really comparable because of its quasi-experimental design) to some extent oversampled cases in smaller sites. In three instances, the motivation was to increase the precision for subgroup estimates; in one instance, it was to increase statewide representativeness:(13)

  • In California, the sample was allocated across counties as follows: 40 percent to Los Angeles (roughly proportional to its relative caseload), and 20 percent each to the remaining three counties of Alameda, San Bernardino, and San Joaquin. This allocation substantially oversamples the latter county in particular. The goal of being able to measure site-specific impacts justified this approach.
  • Colorado sampled at a higher rate in smaller sites and sampled the full caseload in the smallest county included in the study. The goal of a minimum of 330 experimental and 330 control group cases in each county determined the sampling rates, with additional cases from the largest counties selected to meet the overall sample size goals (and to improve overall precision).
  • In Michigan, the sample was selected from four offices: two in Wayne County (Detroit) and two in other parts of the state. The entire caseload in the two non-Wayne offices was assigned to the research sample, but only 70 percent of the caseload in the two Wayne County offices was assigned to it. The motivation for this allocation appears to have been to make the sample more representative of the state as a whole, since the proportion of the sample from Wayne County thus resembled the proportion of the state caseload from Wayne County.
  • Minnesota had an explicit stratification into urban versus rural sites: the full caseload was sampled in the rural sites, but not in the urban sites. The motivation for this allocation was to derive separate estimates for urban versus rural areas.
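
To give a sense of what a county-level floor such as Colorado's 330 experimental and 330 control cases implies for precision, the following is a minimal sketch of a minimum detectable effect calculation for a difference in proportions. It is not drawn from any of the evaluations. The function name, the 50 percent base rate (the most conservative case for a binary outcome), the 5 percent two-sided significance level, and the 80 percent power are all assumptions chosen for illustration.

```python
from statistics import NormalDist


def minimum_detectable_effect(n_treatment, n_control, base_rate,
                              alpha=0.05, power=0.80):
    """Smallest true difference in proportions that a two-sided test at
    level `alpha` would detect with probability `power`.

    Uses the normal approximation with a common outcome rate `base_rate`
    in both groups (an illustrative assumption, not a study parameter).
    """
    z = NormalDist()
    multiplier = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)
    std_error = (base_rate * (1 - base_rate)
                 * (1 / n_treatment + 1 / n_control)) ** 0.5
    return multiplier * std_error


# Colorado's stated minimum per county: 330 experimental, 330 control cases.
# The 0.5 base rate is hypothetical and gives the largest possible variance.
mde = minimum_detectable_effect(330, 330, base_rate=0.5)
print(f"Minimum detectable effect: {mde:.3f}")
```

Under these assumptions, a county with 330 cases in each group could reliably detect only impacts of roughly 11 percentage points on a binary outcome; outcomes with base rates further from 50 percent would allow somewhat smaller impacts to be detected.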

Two of the evaluations reviewed stratified explicitly by single-parent versus two-parent cases. California set up the sample so that one-third of cases sampled were two-parent cases (AFDC-UP cases), although such cases typically make up less than 15 percent of the caseload. Minnesota also explicitly oversampled two-parent cases (including cases on the state general assistance program and AFDC-UP), relative to its basic sample of single-parent cases in urban areas.(14) Again, no explicit power analyses were offered to justify these sample sizes, but the motivation was clearly to increase the precision of estimates for two-parent cases. This stratification seems sensible, since changes in rules for two-parent families were a major part of the reform packages in these states, and both states had relatively large sample sizes.

None of the evaluators appears to have considered the effects of oversampling of sites or other subgroups on the precision of the estimates for the overall research sample.
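
The size of this effect can be approximated with a standard design-effect calculation for disproportionate stratified sampling. The sketch below is illustrative only and does not reproduce any evaluator's analysis. It assumes equal outcome variance in every stratum, and it uses California's two-parent oversample as an example, taking two-parent cases to be exactly 15 percent of the caseload (the text gives this only as an upper bound) and one-third of the sample. The function name is hypothetical.

```python
def design_effect(caseload_shares, sample_shares):
    """Design effect from disproportionate stratified allocation.

    Returns the variance of the weighted overall mean relative to the
    variance under proportionate allocation with the same total sample
    size, assuming equal outcome variance in every stratum:
        deff = sum over strata of (caseload share)^2 / (sample share)
    """
    if (abs(sum(caseload_shares) - 1) > 1e-9
            or abs(sum(sample_shares) - 1) > 1e-9):
        raise ValueError("caseload and sample shares must each sum to 1")
    return sum(w * w / a for w, a in zip(caseload_shares, sample_shares))


# Hypothetical inputs: two-parent cases assumed to be 15 percent of the
# caseload but one-third of the sample; single-parent cases are the rest.
deff = design_effect([0.15, 0.85], [1 / 3, 2 / 3])
effective_fraction = 1 / deff
print(f"Design effect: {deff:.3f}")
print(f"Effective share of nominal sample: {effective_fraction:.3f}")
```

Under these assumptions the design effect is about 1.15, so estimates for the full caseload are about as precise as those from a proportionately allocated sample roughly 87 percent as large. The loss grows as the sample allocation departs further from the caseload distribution.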