The most rigorous test of a program's impacts is an experimental evaluation in which enrollees are randomly assigned either to a program group or to a control group, and outcomes are compared at one or more points post-random assignment. Randomization yields groups expected to be identical, on average, in every way except exposure to the program. Therefore, any difference in outcomes between the program and control groups post-random assignment can be attributed to the program as a "program impact."
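This logic can be illustrated with a small simulation. All numbers below are invented for illustration: a hypothetical employment rate of 50 percent absent the program and a hypothetical true impact of 10 percentage points. Because assignment is a coin flip independent of any characteristic, the difference in group means recovers the true impact (up to sampling error):

```python
import random

random.seed(7)

N = 20_000
CONTROL_RATE = 0.50   # hypothetical employment rate absent the program
TRUE_IMPACT = 0.10    # hypothetical boost from program services

# Random assignment: a coin flip, independent of any characteristic.
assignments = ["program" if random.random() < 0.5 else "control"
               for _ in range(N)]

# Simulate employment outcomes; only the program group gets the boost.
def employed(group):
    rate = CONTROL_RATE + (TRUE_IMPACT if group == "program" else 0.0)
    return random.random() < rate

outcomes = {"program": [], "control": []}
for group in assignments:
    outcomes[group].append(employed(group))

rate = lambda xs: sum(xs) / len(xs)
impact = rate(outcomes["program"]) - rate(outcomes["control"])
print(f"program employment rate: {rate(outcomes['program']):.3f}")
print(f"control employment rate: {rate(outcomes['control']):.3f}")
print(f"estimated impact:        {impact:.3f}")  # close to the true 0.10
```

With a large enough sample, the estimated impact converges on the true effect; with a real evaluation, a significance test would accompany the point estimate.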
Although such rigorous tests of a program's average impact in a given sample are enormously informative, average impacts tell only part of the story. A fuller understanding of the impacts of social programs requires a thorough understanding of impacts in meaningful subgroups. The goal of subgroup analyses in program evaluation research is to identify whether a program is equally or differentially effective for various subgroups of participants. For example, even if no impacts are found in the full sample, there may be impacts in some subgroups but not others (masked subgroup impacts), or positive impacts in one subgroup and negative impacts in another (offsetting subgroup impacts). Even when impacts are found in the full sample, they may not be equally strong in all subgroups (non-uniform subgroup impacts), or they may be driven by impacts in a single subgroup, with no impacts in the complementary subgroup (isolated subgroup impacts). Addressing this more nuanced impact question can help inform the design and more cost-effective targeting of program services.
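One of these patterns, offsetting subgroup impacts, can be made concrete with hypothetical numbers. In this sketch (all rates invented for illustration), two equal-sized subgroups have impacts of +10 and -10 percentage points, so the full-sample impact is zero even though the program matters greatly within each subgroup:

```python
# Hypothetical employment rates (program vs. control) in two subgroups.
# Subgroup A benefits, subgroup B is harmed, and the full sample shows nothing.
subgroups = {
    "A": {"program": 0.60, "control": 0.50, "share": 0.5},  # +10 pp
    "B": {"program": 0.50, "control": 0.60, "share": 0.5},  # -10 pp
}

def impact(rates):
    return rates["program"] - rates["control"]

for name, rates in subgroups.items():
    print(f"subgroup {name} impact: {impact(rates):+.2f}")

# The full-sample impact is the share-weighted average of subgroup impacts.
full_sample = sum(r["share"] * impact(r) for r in subgroups.values())
print(f"full-sample impact: {full_sample:+.2f}")  # offsetting impacts cancel
```

The same weighted-average logic explains the other patterns: a masked impact arises when one subgroup's impact is diluted by a null impact elsewhere, and an isolated impact arises when a single subgroup drives the full-sample result.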
When subgroups of individuals are provided services that meet their needs, and when these services would not otherwise be accessible to them, it is reasonable to expect positive program impacts in these subgroups. It is important to be clear, however, that outcomes are not the same thing as impacts. Outcomes reflect the status or well-being of sample members, whereas impacts reflect the effect of an intervention. Oftentimes, changes in outcomes pre- and post-intervention are interpreted as impacts of that intervention. However, outcomes can naturally change over time, so it is important to understand what would have happened in the absence of the intervention. Failure to distinguish outcomes from impacts can lead to faulty conclusions regarding a program's effectiveness.
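The outcome-versus-impact distinction can also be shown with hypothetical numbers: a pre-post change in outcomes overstates a program's effect whenever outcomes would have improved anyway, and the control group reveals that counterfactual. All rates below are invented for illustration:

```python
# Hypothetical employment rates for a program group and a control group.
program_pre, program_post = 0.40, 0.55   # outcomes improved by 15 pp...
control_pre, control_post = 0.40, 0.50   # ...but would have risen 10 pp anyway

pre_post_change = program_post - program_pre   # naive "effect" from outcomes
impact = program_post - control_post           # experimental impact

print(f"pre-post change in outcomes: {pre_post_change:+.2f}")  # +0.15
print(f"program impact (vs control): {impact:+.2f}")           # +0.05
```

Here, interpreting the 15 percentage point improvement in outcomes as the program's effect would triple the true impact of 5 percentage points.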
Likewise, examining whether experimental impacts differ across subgroups is not the same as examining whether outcomes differ across subgroups. Examining outcomes entails comparing levels of outcomes in, say, low- versus high-risk subgroups; we would expect low-risk individuals to fare better than high-risk individuals. For example, the employment rate may be 80 percent for low-risk fathers but only 50 percent for high-risk fathers, a 30 percentage point difference in outcomes by risk status. However, in the context of an experimental evaluation, examining subgroup impacts entails comparing outcomes for program and control group members within each subgroup. For example, among low-risk fathers, those in the program group might have an employment rate of 83 percent compared to the 80 percent employment rate in the control group, an impact of 3 percentage points. And among high-risk fathers, those in the program group may have an employment rate of 60 percent compared to the 50 percent employment rate in the control group, an impact of 10 percentage points. Thus, in this example, low-risk fathers do better than high-risk fathers on employment outcomes, but the program had a larger impact for high-risk fathers (assuming the difference in impacts is statistically significant). The question addressed by subgroup impact analyses is whether a program improves outcomes for a subgroup compared to what subgroup members would achieve on their own. One can also examine whether a program is equally or differentially effective across complementary subgroups.
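The contrast in this example can be computed directly, using the rates given above (the subgroup labels and data structure are illustrative):

```python
# Employment rates from the example: program vs. control, by risk subgroup.
rates = {
    "low_risk":  {"program": 0.83, "control": 0.80},
    "high_risk": {"program": 0.60, "control": 0.50},
}

# Difference in OUTCOMES across subgroups (control-group levels).
outcome_gap = rates["low_risk"]["control"] - rates["high_risk"]["control"]
print(f"outcome gap by risk status: {outcome_gap:.0%}")  # 30 pp

# IMPACT within each subgroup: program rate minus control rate.
for name, r in rates.items():
    print(f"{name} impact: {r['program'] - r['control']:+.0%}")

# Differential effectiveness: the difference between subgroup impacts,
# which would still need a statistical significance test.
diff = (rates["high_risk"]["program"] - rates["high_risk"]["control"]) \
     - (rates["low_risk"]["program"] - rates["low_risk"]["control"])
print(f"impact difference (high - low): {diff:+.0%}")  # +7 pp
```

The outcome gap (30 percentage points favoring low-risk fathers) and the impact gap (7 percentage points favoring high-risk fathers) point in opposite directions, which is exactly why the two questions must not be conflated.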