Data on participation, degree receipt, job quality, income, transitional benefits, health care coverage, child care, child outcomes, and several other measures used in this report come from responses to the Five-Year Client Survey, Child Outcomes Study (COS) survey, and Teacher Survey. This appendix assesses the reliability of impact results for the three survey samples. It also considers whether impacts estimated for sample members who responded to the survey may be generalized to the "eligible" sample, that is, to sample members who were randomly assigned during the months when the sample was chosen and who met the criteria for selection.
The analysis addresses the following questions for each sample:
- Are the response rates (sample members interviewed as a percentage of sample members chosen to be interviewed) high enough to satisfy the usual standards of impact analysis?
- Are differences in response rates across research groups small enough to indicate that comparisons between those groups will yield unbiased impact estimates?
- Are impact estimates based on unemployment insurance earnings records and welfare payment records similar for the respondent and eligible samples?
For respondents in the Client Survey and COS survey samples, the findings from these multiple tests were generally, but not entirely, positive. For the LFA and HCD programs in Atlanta and Grand Rapids, impact results for respondents appear reliable and representative of effects for members of the eligible sample. Somewhat greater caution is required in generalizing about the impacts for Riverside LFA survey respondents, especially for the COS sample, because (1) COS response rates in Riverside were relatively low and (2) relatively large differences were found when comparing five-year impacts on welfare payments and (for the COS) total earnings for the respondent and eligible samples. More extreme disparities were found in comparing impacts for the eligible and respondent samples for Portland (Client Survey) and for Riverside HCD (both surveys). These results suggest that impacts on survey outcomes for Riverside HCD and Portland are not representative of effects for the eligible sample. It is also uncertain whether results from the Teacher Survey represent program effects for the eligible sample, because of low response rates in all sites and, for several programs, inconsistency across samples in estimated earnings and welfare impacts.
The response analysis involves comparing background characteristics and impact results for the following samples drawn from the full research sample:
The survey-eligible sample ("eligibles"). Sample members in the full research sample who were randomly assigned during months in which the survey sample was selected and who met the criteria for inclusion.
The fielded sample ("fieldeds"). Members of the eligible sample who were chosen to be interviewed.
The respondent sample ("respondents"). Members of the eligible sample, chosen to be interviewed (that is, fieldeds), who were interviewed.
The nonrespondent sample ("nonrespondents"). Members of the eligible sample, chosen to be interviewed (that is, fieldeds), who were not interviewed. They could not be located or declined to be interviewed.
As discussed in Chapter 2, in Atlanta, Grand Rapids, and Riverside, the samples for the Five-Year Client Survey, COS survey, and Teacher Survey were nested: COS sample members make up a part of the larger Client Survey sample, and teacher surveys were collected for a portion of the COS sample.
With very few exceptions, the eligible samples originally chosen for the Two-Year Client Survey and COS survey were the same for the later surveys.(1) Sample members were randomly assigned during some, but not all, months of sample intake. (See Chapter 2, Table 2.2.) Limiting the eligible sample in this way can introduce "cohort effects": impact estimates that are especially large or small for sample members randomly assigned during particular months. A cohort effect may occur because members of the survey-eligible sample differ in measured or unmeasured background characteristics from persons randomly assigned in other months. Changes in area labor markets or in program implementation that occur at some point after the start-up of random assignment may also introduce cohort effects, for example, by increasing or decreasing a program's relative success in moving welfare recipients from welfare to work. In addition, the research strategy in Atlanta, Grand Rapids, and Riverside required exclusion of sample members with certain background characteristics: teen parents, parents with children under age 3 (in Atlanta and Riverside), men with children aged 3 to 5, people who did not speak either English or Spanish, and people who did not provide information on their education status and children's ages prior to random assignment. Among survey eligibles in Atlanta, Grand Rapids, and Riverside, women with at least one child aged 3 to 5 at the time of random assignment were also eligible for selection to the COS sample.
Differences of moderate size, that is, not large enough to change the overall findings about a program, were found for the Atlanta LFA and HCD, Grand Rapids LFA, and Portland programs when impacts for the survey-eligible and full samples were compared. For instance, five-year earnings impacts for both Atlanta programs were about $1,000 larger when calculated for the eligible sample rather than the full sample. Earnings gains averaged about $700 more for LFA-eligibles in Grand Rapids. In contrast, impacts for program group members in the Portland survey-eligible sample fell about $900 below the increase for the full sample. Differences in impacts were smaller for Riverside LFA and HCD and Grand Rapids HCD.
The fielded samples for the Five-Year Client Survey and COS survey were selected from among the sample members originally fielded for the two-year surveys. Sample selection for these earlier surveys occurred in the following way: The eligible samples in Atlanta, Grand Rapids, and Riverside were divided into strata according to sample members' research group, date of random assignment, age of youngest child, and pre-random assignment educational attainment. In Portland, the strata were defined by sample members' research group and date of random assignment. For research purposes, different sampling ratios, ranging from 16 to 100 percent, were used in selecting members of the fielded sample from within each stratum. (The sampling ratio is the percentage of eligible sample members selected.) Sample members were chosen at random within each stratum. Although these differences in sampling ratios were corrected for, as discussed below, they may still affect survey impact estimates. For instance, unless the total sample size is large, different sampling ratios increase the likelihood that persons chosen in one research group differ (perhaps in unmeasured characteristics) from persons chosen in another research group.
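The mechanics of selecting a fielded sample at random within strata can be sketched in a few lines. The stratum labels and sampling ratios below are hypothetical, chosen only to illustrate the procedure, and this is not the code used in the original sample selection:

```python
import random

def select_fielded_sample(eligibles, ratios, seed=0):
    """Randomly select a fielded sample within each stratum.

    eligibles: dict mapping stratum label -> list of sample member IDs
    ratios: dict mapping stratum label -> sampling ratio (e.g., 0.16 to 1.0)
    """
    rng = random.Random(seed)  # fixed seed for reproducibility
    fielded = {}
    for stratum, members in eligibles.items():
        # Number fielded = sampling ratio times stratum size
        k = round(ratios[stratum] * len(members))
        fielded[stratum] = rng.sample(members, k)
    return fielded

# Hypothetical strata defined by research group and age of youngest child
eligibles = {
    ("program", "child 3-5"): list(range(100)),
    ("control", "child 3-5"): list(range(100, 150)),
}
ratios = {("program", "child 3-5"): 0.25, ("control", "child 3-5"): 1.0}
fielded = select_fielded_sample(eligibles, ratios)
```

A stratum with a 25 percent sampling ratio contributes a quarter of its eligibles to the fielded sample, while a stratum with a 100 percent ratio contributes all of them.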
In Portland and Grand Rapids all sample members fielded at two years were again fielded at five years, whereas in Atlanta and Riverside funding limitations made it necessary to select a subsample from the original fielded sample. For research purposes, MDRC gave priority to members of the Two-Year COS survey fielded sample in selecting the Five-Year Client Survey sample.
Members of the five-year fielded samples included respondents and nonrespondents to the earlier survey interviews. In addition, the boy or girl chosen as the "focal child" for the two-year COS survey continued as the focal child for the five-year study.
By the strictest definition, the fielded sample for the Five-Year COS survey also constituted the fielded sample for the Five-Year Teacher Survey, because researchers were prepared to interview teachers of every COS focal child. However, surveys were attempted only for COS focal children whose mother was interviewed in person at five years and who then gave her written permission to contact her child's school. Thus, an alternative definition of the fielded sample for the Teacher Survey would limit the sample to focal children of COS survey respondents, or even to children whose mothers signed a permission form during their interview.
Survey respondents are members of the fielded sample who completed an interview (or a sufficient portion to be usable for research). The concept is straightforward for the Five-Year Client Survey, but not for the COS survey. Sample members fielded only for the Client Survey could be interviewed by phone or in person. However, sample members fielded for the COS survey were supposed to answer the Client Survey and additional COS survey questions during an in-person session that also included observations of interactions between the COS mother and focal child, an interview with the focal child, and administration of a standard assessment of the focal child's intellectual development. As noted in Chapter 2, 203 mothers in the COS fielded sample had moved too far away to be visited by interviewers or could not participate in a COS in-person interview for other reasons. They did, however, answer the Client Survey by phone, including the questions on child care and child outcomes asked of all respondents to the Client Survey. These sample members are counted as respondents to the Client Survey, but not as respondents to the COS survey. Thus, the COS respondent sample is limited to mothers who participated in the in-person interviews, observations, and assessments. Finally, for this analysis, the "respondent" sample for the Teacher Survey includes COS survey respondents with a completed Teacher Survey about the focal child.
For this report, weights were applied to the survey respondent sample to correct for differences in sampling ratios between the strata. In the unweighted fielded survey sample in these sites, strata (that is, sample members who share background characteristics and have the same sampling ratio) with high sampling ratios are overrepresented and strata with low sampling ratios are underrepresented. To make the fielded sample more closely replicate the background characteristics of survey eligibles, weights for each stratum were set to equal the inverse of the sampling ratio for that stratum. For example, a stratum in which 1 eligible person in 4 was chosen would receive a weight of 4 (or 4/1), whereas a stratum in which every eligible person was chosen would receive a weight of 1 (or 1/1). The same weights are used for the respondent sample.
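The inverse-ratio weighting described above can be expressed directly. This is an illustrative sketch of the weighting rule and a weighted average, not the actual estimation code; the function names are hypothetical:

```python
def stratum_weight(sampling_ratio):
    """Weight for a stratum = inverse of its sampling ratio."""
    return 1.0 / sampling_ratio

def weighted_mean(values, weights):
    """Weighted average of an outcome across sample members."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

# A stratum in which 1 eligible person in 4 was chosen gets weight 4;
# a stratum in which every eligible person was chosen gets weight 1.
w_quarter = stratum_weight(0.25)   # 4.0
w_full = stratum_weight(1.0)       # 1.0
```

Applying these weights when averaging outcomes restores the relative sizes of the strata in the eligible sample, so that heavily sampled strata no longer dominate the estimates.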
It should be noted that under some conditions impacts for a weighted respondent sample may still be different from those for the eligible sample. For example, this result could occur if very different proportions of program and control group fieldeds answered the survey, or if members of a subgroup within one research group were more likely to be interviewed than their counterparts in a different research group. These issues are addressed in the next section.
Table G.1 shows the response rate, the percentage of the fielded sample who responded, by survey sample, program, and research group. The goal of each survey effort was to obtain responses from at least 70 percent of the fielded sample in every research group. For the Five-Year Client Survey, response rates exceed 70 percent for all programs and research groups (and 80 percent in Atlanta and Grand Rapids) and are high enough to suggest that the survey probably represents the eligible sample. These results inspire confidence in the impacts for respondents.
Table G.1: Response Rates, by Survey Sample, Program, and Research Group

| Site and Program | Client Survey: Number Fielded | Client Survey: Response Rate (%) | COS Survey: Number Fielded | COS Survey: Response Rate (%) | Teacher Survey: Number Fielded | Teacher Survey: Response Rate (%) |
|---|---|---|---|---|---|---|
| Atlanta Labor Force Attachment | 597 | 86.9 | 349 | 82.8 | 349 | 52.7 |
| Atlanta Human Capital Development | 717 | 82.8 | 473 | 77.6 | 473 | 47.8 |
| Grand Rapids Labor Force Attachment | 605 | 88.4 | 252 | 84.5 | 252 | 57.1 |
| Grand Rapids Human Capital Development | 619 | 88.4 | 244 | 80.3 | 244 | 49.2 |
| Grand Rapids Control | 606 | 92.7 | 249 | 85.9 | 249 | 57.8 |
| Riverside Labor Force Attachment | 680 | 73.4 | 294 | 62.9 | 294 | 36.7 |
| Riverside Human Capital Development | 509 | 73.9 | 309 | 67.3 | 309 | 42.4 |

SOURCES: MDRC calculations from the Five-Year Client Survey, Child Outcomes Study survey, and Teacher Survey.
NOTES: See Appendix A.2.
Response rates also exceed the 70 percent threshold for the COS sample fielded in Atlanta and Grand Rapids. In Riverside, however, COS response rates fell below this standard, ranging from about 63 percent (LFA) to 67 percent (HCD).(2) Therefore, greater caution is required when interpreting results for the COS sample in Riverside.
Response rates were much lower for the Teacher Survey, using the strictest definition of the fielded sample: about 40 percent for Riverside, 50 percent for Atlanta and Grand Rapids HCD, and 58 percent for Grand Rapids LFA and control group members. In each site, response rates increase about 10 percentage points when COS survey respondents constitute the fielded sample. (Results not shown.) These data should be considered the least reliable indicators of program effects on children.
Different response rates among research groups are a potential source of bias in research group comparisons. Such differences suggest that the groups may differ in unobservable characteristics that cannot be controlled for and may affect impact estimates. The results for the Five-Year Client Survey fielded samples indicate that within each site response rates for each research group differ by 5 percentage points or less, a favorable result. (See Table G.1.) Variation in response rates was only slightly greater for the smaller COS survey and Teacher Survey fielded samples.
An additional concern when estimating impacts from survey responses is that research groups may differ in background characteristics that affect future employment, welfare receipt, and other outcomes. Differences in these observable characteristics can be corrected for in the regression impact model and do not pose a large problem. These differences, however, may indicate variation in unobservable characteristics that, as noted above, cannot be controlled for in the impact analysis. The following results show that background characteristics differ by research group in three programs.
To determine whether there are any observable program-control differences within the survey respondent sample, the 0/1 dummy variable indicating membership in the program group was regressed on pre-random assignment demographic information for the fielded and the respondent samples. A statistically significant p-value of the R-square of the regression described above indicates that research groups have different background characteristics and that greater caution is required in interpreting impact results. The results show that differences in demographic characteristics are evident in Atlanta HCD, Riverside LFA, and Portland among respondents to the Client Survey and in Riverside LFA among respondents to the COS survey. However, this problem was not severe. Even in programs for which program-control group differences were found, these differences rarely exceeded 5 percentage points.(3) (Results not shown.) No statistically significant differences were found for respondents to the Teacher Survey. (Results not shown.)
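The omnibus test described above, regressing the program-group dummy on pre-random assignment characteristics and testing the significance of the regression's overall fit, can be sketched in a few lines. This is a minimal illustration under the stated design, not the code used in the original analysis; the toy data at the end are deliberately balanced:

```python
import numpy as np
from scipy.stats import f as f_dist

def balance_test(group, covariates):
    """Regress a 0/1 program-group indicator on baseline covariates.

    Returns (R-squared, p-value of the overall F-test). A small
    p-value indicates that the research groups differ in measured
    background characteristics.
    """
    y = np.asarray(group, dtype=float)
    X = np.column_stack([np.ones(len(y)), np.asarray(covariates, dtype=float)])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    ss_res = resid @ resid
    ss_tot = ((y - y.mean()) ** 2).sum()
    r2 = 1.0 - ss_res / ss_tot
    k = X.shape[1] - 1                # number of covariates
    df_resid = len(y) - k - 1
    F = (r2 / k) / ((1.0 - r2) / df_resid)
    p = f_dist.sf(F, k, df_resid)
    return r2, p

# Balanced toy data: the covariate has the same distribution in both
# research groups, so the regression explains nothing (R-squared = 0).
r2, p = balance_test([1, 1, 0, 0], [[1], [2], [1], [2]])
```

In a balanced sample like the toy data, the F-test cannot reject the hypothesis that the covariates are unrelated to group membership; a significant result would instead signal the kind of program-control differences reported for Atlanta HCD, Riverside LFA, and Portland.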
Impacts on five-year earnings and welfare payments based on administrative records were estimated for the eligible and respondent samples for the Client Survey, COS survey, and Teacher Survey. The results are summarized in Figures G.1-G.3. In these figures, impacts for the eligible sample (for which weighting is not required) are compared with the weighted impacts for the respondent sample. Programs that fall near the 45-degree line drawn on these figures have similar impacts for the respondent and eligible samples, whereas programs that fall well above or below the line show large variation in impacts. Similarity in results suggests that impacts estimated with survey data for respondents represent the effects that would be found for the eligible sample, especially on measures that are often affected by employment and welfare levels, such as child care use, health care coverage, and child outcomes. Differences in impact estimates suggest the opposite, though some variation in impacts should be expected because of differences in sample sizes.
For these comparisons a problematic result is considered to have occurred when the impact for respondents exceeded or fell below the impact for the eligible sample by an amount sufficient to change the findings. For example, results would be problematic if a program led to an unusually large impact on total earnings or welfare payments for the eligible sample, based on the range of impacts found in previous evaluations of welfare-to-work programs, but to an unusually small impact when calculated for the respondent sample (or vice versa).(4) Similarly, within a site, results would be problematic if the relative effects of LFA and HCD differed substantially when calculated for the two samples. Under these circumstances, findings on other outcomes may not represent the likely impact for the larger sample.
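The classification applied in these comparisons can be illustrated with a short sketch. The cutpoints follow footnote 4, but the function names and the simplified flagging rule are hypothetical; in particular, this sketch captures only shifts between sharply different impact categories, not the report's additional criterion that within-site reversals of the relative LFA and HCD effects are also problematic:

```python
def classify_earnings_impact(annual_impact):
    """Classify a per-year earnings impact using the footnote 4
    thresholds: $900+ large, $300-$900 moderate, $100-$300 small,
    under $100 no impact."""
    if annual_impact >= 900:
        return "large"
    if annual_impact >= 300:
        return "moderate"
    if annual_impact >= 100:
        return "small"
    return "none"

def is_problematic(eligible_impact, respondent_impact):
    """Flag a comparison as problematic when the two samples' impacts
    land in sharply different categories (e.g., 'small' for one
    sample but 'large' for the other)."""
    cats = {classify_earnings_impact(eligible_impact),
            classify_earnings_impact(respondent_impact)}
    return ({"small", "large"} <= cats) or ({"none", "large"} <= cats)
```

For example, an eligible-sample impact of $950 per year paired with a respondent-sample impact of $250 per year would be flagged, whereas $500 versus $450 per year would not.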
For the Client Survey eligible and respondent samples, five-year earnings impacts were nearly identical for the Riverside LFA program and relatively close (within $450 to $550) for the Atlanta and Grand Rapids HCD programs (Figure G.1). Impact estimates for the two samples varied by a somewhat larger amount for Atlanta and Grand Rapids LFA ($800 to $900). For both programs, impacts were smaller for the respondent sample, but not by enough to change the overall conclusions that the two programs led to moderately large impacts on earnings.(5)
Figure G.1: Five-Year Earnings and Welfare Impacts for Five-Year Client Survey Respondents and Eligibles
SOURCE: MDRC calculations from the Five-Year Client Survey.
NOTES: See Appendix A.2.
Earnings impacts differed dramatically and are problematic for Riverside HCD and for Portland. Portland led to a substantial impact on cumulative earnings for the eligible sample (averaging about $850 per year), but only a small effect (of about $225 per year) for the respondent sample.(6) The opposite result occurred for Riverside HCD. Five-year earnings impacts for the respondent sample (of about $700 per year) greatly exceed impacts estimated for the eligible sample (about $180 per year). Furthermore, impacts for HCD respondents in Riverside also exceed effects for LFAs, a very different result from what was found for the full sample.(7) For both of these programs, these differences in impacts change the conclusion about program effects and create considerable doubt as to whether other effects estimated with survey data may be generalized to the eligible sample.(8)
Comparisons of impacts on total welfare payments show greater consistency between the eligible and respondent samples. The Riverside LFA and Portland programs led to noticeably larger reductions in average welfare payments over five years when estimated for the respondent sample. However, reductions for the eligible samples in both programs averaged more than 15 percent below control group levels. Therefore, these differences in impacts do not change the overall finding (discussed in Chapter 5) that the Riverside LFA and Portland programs led to large decreases in welfare payments.(9)
1. COS Respondents
Five-year earnings impacts were very consistent for Atlanta and Grand Rapids HCD when calculated for the COS eligible and respondent samples. Estimated impacts for Grand Rapids LFA differed to a greater extent (about $1,500 higher for respondents) but still support a conclusion that these programs led to a moderate impact on earnings. Impacts varied more dramatically for the two programs in Riverside, especially for Riverside HCD. For COS survey respondents in Riverside, each program led to unusually large gains in earnings above control group levels that averaged more than $1,200 per year. For Riverside LFA, this change in impacts is less of a problem, because LFAs in the eligible sample averaged close to $800 per year more than control group members, an impact that few programs have achieved for parents of young children. For Riverside HCDs, however, estimated earnings gains more than tripled when calculated for COS survey respondents, leading to a conclusion that the program was much more successful than suggested by impacts for the eligible sample.(10)
Less variation was found in comparing five-year impacts on welfare payments for the COS survey eligible and respondent samples (Figure G.2, lower panel). Reductions in welfare dollars were noticeably different for the respondent sample only for Riverside LFA. However, the impact for the eligible sample was already large for Riverside LFA.
Figure G.2: Five-Year Earnings and Welfare Impacts for Five-Year Child Outcomes Study Respondents and Eligibles
SOURCE: MDRC calculations from the Five-Year Client Survey.
NOTES: See Appendix A.2.
2. Teacher Survey Respondents
As noted above, the "respondent" sample for the Teacher Survey is made up of the mothers in the COS survey respondent sample whose focal child was the subject of a completed Teacher Survey. For most sites and programs, the comparison of five-year impacts on total earnings and welfare payments yielded similar results as the comparison for COS eligibles and all COS respondents discussed above (Figure G.3). The main exception is the Grand Rapids LFA program, which led to earnings impacts that averaged nearly $1,000 per year for Teacher Survey respondents, but only $350 per year for COS survey eligibles. In addition, the Grand Rapids HCD program resulted in a large decrease in welfare payments for COS survey eligibles, but only a small reduction for Teacher Survey respondents.(11)
Figure G.3: Five-Year Earnings and Welfare Impacts for Five-Year Teacher Survey Respondents and Eligibles
SOURCE: MDRC calculations from the Five-Year Client Survey.
NOTES: See Appendix A.2.
1. See Freedman et al., 2000a, Appendix E, for a response analysis of the Two-Year Client Survey. A small number of sample members who were chosen to be surveyed were subsequently discovered to have background characteristics (such as nonproficiency in English or Spanish) that made them ineligible. These sample members were dropped from the eligible samples for the survey at five years. In addition, it was decided to drop single fathers from the Five-Year Client Survey eligible sample, although some were interviewed at two years.
2. In Riverside, response rates would exceed 75 percent for all research groups if the COS mothers who answered only the Client Survey were counted as respondents.
3. An important exception occurred in Portland, where whites made up about 63 percent of program group respondents but 71 percent of control group respondents. As noted in Chapter 7, impacts on earnings were much larger for whites than for African-Americans among members of the full sample.
4. More specifically, programs are considered to have led to a "large" impact on earnings if program group members' average earnings exceeded the control group average by $900 or more per year. Earnings impacts of between $300 and $900 per year are considered "moderate"; and impacts of between $100 and $300 are considered "small." Programs that lead to earnings impacts of less than $100 per year are considered to have resulted in no impact. Thus, in comparing impacts for the respondent and eligible samples, a difference of impacts in earnings of $600 or more per year, or $3,000 or more over five years, could change a finding of "small" impacts to "large" impacts, or vice versa. This difference in impacts would be considered problematic. In contrast, a variation in impacts of $1,000 or less over five years would be unlikely to change the overall findings about a program's relative success in increasing earnings. Differences in impacts of between $1,000 and $3,000 may also cause concern, particularly if one sample shows no effects on earnings and the other sample shows a moderate gain or loss compared with the control group.
Reductions in welfare expenditures of 10 percent or more below the control group average may be considered "large," whereas reductions of between 2 and 5 percent may be considered "small."
5. By standards described in footnote 4, earnings impacts would be characterized as moderate for Grand Rapids LFAs in the eligible sample, but as small for LFAs in the respondent sample. However, the change in impacts is not that large: from about $450 per year for the eligible sample to about $290 per year for the respondent sample.
6. Earnings impacts were only slightly smaller for the eligible sample in Portland than for the full sample. (Results not shown.)
7. Earnings impacts for Riverside LFA and HCD eligible samples are much closer to impacts estimated for the full samples. (Results not shown.)
8. In Portland, two-year impact estimates were much closer for the eligible and respondent samples. See Freedman et al., 2000, Appendix E.
9. For Portland, there is also a large discrepancy in impacts on combined income for the eligible sample (+$1,600) and respondent sample (-$3,900). This result shows again that findings for respondents on survey outcomes are likely not representative of impacts for the eligible (or full) samples. The difference in impacts on combined income was also substantial for Riverside HCD (-$3,000 eligibles; -$500 respondents) and Riverside LFA (-$1,300 eligibles; -$3,300 respondents). (Results not shown.)
10. The same degree of variation in earnings impacts was also seen for Riverside LFAs in need of basic education. (Results not shown.)
11. The discrepancy in earnings impacts is less extreme for Grand Rapids LFA when COS respondents are considered as the eligible sample: $1,000 per year above the control group compared with about $650 per year. For Grand Rapids HCD, however, impact estimates were only slightly closer when COS respondents were compared with Teacher Survey respondents.