Approaches to Evaluating Welfare Reform: Lessons from Five State Demonstrations

d. Trade-Offs Between Subgroup Analysis and Full-Sample Analysis


Oversampling key subgroups allows the evaluation to obtain more precise estimates of program impacts for the subgroups of interest. If total sample size is held constant, however, such oversampling also reduces the precision of the impact estimates for the full sample, because cases in the oversampled subgroups must be weighted down when the full-sample estimates are computed. This becomes less of a concern if there are enough resources to support larger-than-minimum sample sizes overall, since the gain in precision from a larger sample will at least partly offset the loss in precision from stratification.
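The size of this trade-off can be gauged with a simple design-effect calculation. The sketch below is illustrative only: it assumes the outcome variance is the same in every subgroup, ignores finite-population corrections, and uses hypothetical population shares and sample sizes that do not come from any of the demonstrations. Under those assumptions, the variance of the full-sample estimate is proportional to the sum over subgroups of (population share squared) / (subgroup sample size), and proportional allocation minimizes it for a fixed total sample.

```python
def design_effect(pop_shares, sample_sizes):
    """Variance of the weighted full-sample estimate, relative to a
    proportionally allocated sample of the same total size.

    Assumes equal outcome variance in every subgroup (an assumption of
    this sketch, not a result from the report).
    """
    n_total = sum(sample_sizes)
    return n_total * sum(w ** 2 / n for w, n in zip(pop_shares, sample_sizes))


# Hypothetical example: the subgroup of interest is 20% of the caseload.
shares = [0.8, 0.2]
proportional = [1600, 400]   # sample mirrors the caseload
oversampled = [1000, 1000]   # subgroup of interest is oversampled

deff_prop = design_effect(shares, proportional)
deff_over = design_effect(shares, oversampled)

# Scaling every subgroup sample by the design effect restores the
# full-sample precision of the proportional design.
n_needed = deff_over * sum(oversampled)

print(f"design effect, proportional: {deff_prop:.2f}")
print(f"design effect, oversampled:  {deff_over:.2f}")
print(f"total sample needed to offset oversampling: {n_needed:.0f}")
```

In this hypothetical case, a design effect above 1 for the oversampled allocation is the factor by which the total sample would have to grow to recover the full-sample precision that oversampling gave up.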

For example, suppose subgroups are defined as the individual demonstration sites. Samples may be allocated across the sites in three ways:

  1. No Stratification. If the population about which inferences are to be made is the caseload in the research sites only, sampling rates should be the same in all the sites, and the sample sizes in the sites should be proportional to the number of cases in those sites.
  2. Stratification to Increase Precision of Site-Level Impact Estimates. To make inferences about impacts in specific sites as well as the entire group of research sites, sample sizes should be set to balance the precision needs of the two types of estimates. In general, cases in the smaller sites will be oversampled in relation to cases in the larger sites. It still may be desirable to have larger samples in larger sites, however, to increase precision of the overall estimates, as long as the samples in the smaller sites meet a minimum standard for site-level precision.
  3. Stratification to Increase State-Level Representativeness. If the population about which inferences are to be made is the entire state caseload, the sampling process is appropriately conceived of as a two-stage sampling process, in which sites are selected first, then cases within sites. Such a design could, in principle, lead to oversampling of either large or small sites. In this setting, implications for precision are most appropriately evaluated in the context of the state as a whole.

These same three approaches can be applied to determining sample sizes for other subgroups.
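
As a rough illustration of how site-level and overall precision move against each other, the sketch below compares three ways of dividing a fixed sample across sites: proportional allocation (approach 1), equal allocation, and a compromise that guarantees each site a minimum sample and distributes the remainder in proportion to caseload (one possible reading of approach 2). The site caseloads, the total sample, the floor of 600 cases, and the function names are all hypothetical. The calculation assumes an even split between treatment and control groups within each site, a common outcome standard deviation of 1, and no finite-population correction. The two-stage design of approach 3 is not modeled.

```python
import math


def site_se(n_site, sigma=1.0):
    """Standard error of a treatment-control difference in means when
    n_site cases are split evenly between the two groups."""
    return math.sqrt(4.0 * sigma ** 2 / n_site)


def overall_se(caseloads, samples, sigma=1.0):
    """Standard error of the caseload-weighted average of site impacts."""
    total = sum(caseloads)
    variance = sum(
        (N / total) ** 2 * 4.0 * sigma ** 2 / n
        for N, n in zip(caseloads, samples)
    )
    return math.sqrt(variance)


def proportional(caseloads, n_total):
    """Same sampling rate in every site."""
    total = sum(caseloads)
    return [n_total * N / total for N in caseloads]


def equal(caseloads, n_total):
    """Same sample size in every site, regardless of caseload."""
    return [n_total / len(caseloads)] * len(caseloads)


def floor_plus_proportional(caseloads, n_total, floor):
    """Give every site `floor` cases, then split the rest by caseload."""
    total = sum(caseloads)
    remainder = n_total - floor * len(caseloads)
    return [floor + remainder * N / total for N in caseloads]


caseloads = [6000, 3000, 1000]   # hypothetical site caseloads
n_total = 3000                   # hypothetical total research sample

designs = {
    "proportional": proportional(caseloads, n_total),
    "equal": equal(caseloads, n_total),
    "floor + proportional": floor_plus_proportional(caseloads, n_total, 600),
}

for name, samples in designs.items():
    smallest = site_se(samples[-1])          # SE in the smallest site
    overall = overall_se(caseloads, samples)
    print(f"{name:22s} smallest-site SE={smallest:.4f}  overall SE={overall:.4f}")
```

Under these assumptions, proportional allocation gives the smallest overall standard error and the largest standard error in the smallest site, equal allocation reverses that ordering, and the floor rule falls between the two on both measures.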