National Evaluation of Welfare-to-Work Strategies

Impact on Young Children and Their Families Two Years After Enrollment:

Chapter 2:
Methods:
How Did We Study Impacts on Children?

Overview

Chapter 1 addressed the question of why it is important to focus on the effects of JOBS programs on children. We turn now to the question of how child impacts were studied in the NEWWS Child Outcomes Study and, more specifically, in the two-year follow-up wave of the study.

We begin by providing the context for the two-year follow-up. An overview is provided of the design of the National Evaluation of Welfare-to-Work Strategies (NEWWS), and of the NEWWS Child Outcomes Study that is embedded within it. We then describe the procedures and measures specific to the two-year follow-up survey of the Child Outcomes Study, noting especially how children's developmental outcomes were assessed. We conclude with an overview of our strategy for analysis of the data regarding child outcomes.

The focus here is on design, procedures, measures, and analysis strategy. Chapters 3 and 4 complement the present chapter: Chapter 3 by providing a description of the three study sites (demographic context as well as information about how the various types of welfare-to-work program strategies, under the auspices of the JOBS Program, were implemented at each site), and Chapter 4 by providing a detailed description of the study sample at each site.

Key Questions Addressed in Chapter 2

Design of the National Evaluation of Welfare-to-Work Strategies

As noted in Chapter 1, the Family Support Act of 1988 recommended an experimental evaluation of the impacts of the JOBS Program, that is, an evaluation involving random assignment of families to control and experimental groups. In such a design, it can be assumed (and documented) that the families in the different research groups did not differ in terms of background characteristics prior to their assignment to the different research groups. After random assignment, apart from the experiences associated with the research group they are assigned to, families in each site are all exposed to the same broad context, for example, in terms of the job market and local economy. Given that the families did not differ initially, because the assignment to research groups is done randomly rather than according to the backgrounds of the families, and because the families reside in the same broad economic and social context within each study site, significant differences between research and control groups detected upon following up the families can be assumed to reflect the differences in their experiences associated with assignment to differing research groups.

The National Evaluation of Welfare-to-Work Strategies, currently being conducted by the Manpower Demonstration Research Corporation (MDRC), follows such an experimental research design. The purpose of the NEWWS is to assess the impacts of various types of welfare-to-work strategies under the auspices of the JOBS Program on adult human capital and economic outcomes, including program effects on:

Findings regarding adult human capital and economic impacts in three of the seven sites in the evaluation have been reported on previously (Hamilton, Brock, Farrell, Friedlander, and Harknett, 1997), and findings on economic impacts in the full set of eleven programs in the seven sites are being released in parallel with the present report (Freedman, Friedlander, Hamilton, Rock, Mitchell, Nudelman, Schweder, and Storto, 2000). In addition to the study of program impacts, the NEWWS also includes components examining the implementation of the JOBS program in differing sites (Hamilton and Brock, 1994; Hamilton et al., 1997) and a cost-benefit analysis (see Hamilton et al., 1997).

In the three NEWWS sites in which the Child Outcomes Study is embedded, there are two experimental groups. Each of the experimental groups involves a different program approach within JOBS: the labor force attachment (LFA) approach emphasizes a quick transition into the labor force through job search activities, while the human capital development (HCD) approach emphasizes enhancing welfare recipients' skills through education and training, as a means to obtaining employment at higher wages and with better prospects of advancement. As noted in Chapter 1, these approaches represent differing views on how best to foster economic self-sufficiency in welfare recipients (Hamilton et al., 1997). The labor force attachment approach assumes that participating in the workplace is the best way to learn work behaviors and skills. The human capital development approach assumes that building "human capital" (skills related to employment) is an important step to take prior to employment, one that will help assure higher earnings and greater job stability. These three study sites thus use a "planned variation" research design, in which the outcomes of contrasting program approaches can be compared within each of the sites. A more detailed description of the program approaches can be found in the publications reporting on the NEWWS thus far (Freedman et al., 2000; Hamilton and Brock, 1994; Freedman and Friedlander, 1995; Hamilton et al., 1997), and is also provided in Chapter 3 for the particular sites included in the NEWWS Child Outcomes Study.

The NEWWS is being carried out in seven sites across the country. In three of the seven sites (Atlanta, Georgia; Grand Rapids, Michigan; and Riverside, California), families in the evaluation were randomly assigned to one of three research groups: the two program groups (the human capital development group or the labor force attachment group), or to a control group. In a fourth site (Columbus, Ohio), two different forms of case management were contrasted with a control group: an integration of case management for income maintenance and JOBS participation, and case management that focused on these separately. In three further sites (Detroit, Michigan; Oklahoma City and surrounding counties in Oklahoma; and Portland, Oregon), families were randomly assigned to only one of two groups: the site's pre-existing welfare-to-work program or the control group.

Once the income maintenance case worker had reached a decision that a welfare recipient or applicant was not exempt from legislatively mandated JOBS participation, the recipient then received notification to report to a JOBS program orientation. In the Child Outcomes Study sites (Atlanta, Grand Rapids, and Riverside), random assignment to a JOBS program occurred at the orientation. At random assignment, recipients were given a presentation about the evaluation, an assessment of their basic reading and math skills was administered, and they were asked to provide background information. Those who met the criteria for inclusion in the evaluation (noted below) were then randomly assigned to a research group within the evaluation. As we will note below, the random assignment process in the Riverside site involved one more step than in the other two research sites.

Families were considered eligible for inclusion in the NEWWS when they met the following criteria:

It is important to note a variation on the random assignment process that occurred only at the Riverside site, and was necessitated by program regulations at the state level. At this site, regulations required that a distinction be made between those deemed "in need of basic education" and those deemed "not in need of basic education." Individuals were considered in need of basic education when they met any one of the following conditions: they (1) did not have a high school diploma or General Educational Development (GED) degree, (2) had a low score (214 or below) on either the reading or math component of the assessment (the GAIN Appraisal test), or (3) required remediation in English. In a further step in the random assignment process in this site only, individuals in need of basic education and those not in need of basic education then went through different random assignment processes. Those who were considered to be in need of basic education were randomly assigned to any one of the three research groups. However, those considered not in need of basic education could be randomly assigned only to the labor force attachment or control groups (see Hamilton et al., 1997).
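As an illustration only, the Riverside rule described above can be sketched in Python. The function and field names are hypothetical, and the equal assignment probabilities are an assumption; the text does not state the ratios actually used.

```python
import random

GAIN_CUTOFF = 214  # "a low score (214 or below)", per the text


def in_need_of_basic_education(has_diploma_or_ged, reading_score,
                               math_score, needs_english_remediation):
    """Return True if any one of the three conditions in the text holds."""
    return bool(
        not has_diploma_or_ged
        or reading_score <= GAIN_CUTOFF
        or math_score <= GAIN_CUTOFF
        or needs_english_remediation
    )


def eligible_groups(in_need):
    """Research groups a Riverside sample member could be assigned to."""
    if in_need:
        return ["HCD", "LFA", "control"]
    return ["LFA", "control"]  # HCD was not open to this subgroup


def assign(in_need, rng):
    # Assumption: equal probability across the eligible groups.
    return rng.choice(eligible_groups(in_need))


rng = random.Random(0)
need = in_need_of_basic_education(True, 210, 230, False)
print(need, assign(need, rng))
```

Under this sketch, a sample member with a diploma and scores above the cutoff can never land in the human capital development group, which is why that group in Riverside consists entirely of those in need of basic education.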

As a result, when contrasts of research groups are carried out in the Riverside site, those in the human capital development group are compared to control group members who are likewise considered in need of basic education, whereas members of the labor force attachment group (who could be in need or not in need) are compared to all control group members. The use of a subset of the control group in the Riverside site for comparisons with the human capital development group is apparent in the program impact tables in Chapters 6, 7, and 9. The fact that all families in the human capital development group in Riverside were in need of basic education, whereas this was not the case for the human capital development groups in the other two sites, should be kept in mind when looking across the three sites at the findings for the human capital development program.

For mothers assigned to either the labor force attachment group or the human capital development group, participation in the activities of the JOBS Program was mandatory. That is, the mother was required to participate in JOBS program activities, or she faced the possibility of sanctioning (reduction in welfare benefits). Mothers in the control group, while eligible for Aid to Families with Dependent Children benefits, were not required to participate in any JOBS activities. Control group members were, however, free to seek out education and training programs in their communities of their own volition, and were guaranteed child care while participating in such approved activities, as required by the Family Support Act provisions.

It is important to note that the experimental evaluation of the JOBS program does not focus on the effects of participating in the JOBS program per se. Rather, the experimental evaluation assesses the impact of assignment to a JOBS experimental group, and thus exposure to program messages, services, and mandate to participate. The evaluation carefully documents how many mothers in each experimental group participated in JOBS program activities, and also considers the implications of participation for the major outcomes. However, when each of the experimental groups is contrasted with the control group in the analyses of program impacts, the experimental groups include all those who were assigned to those groups, whether or not they actually participated in the program. That is, the experimental groups include individuals who did not participate in any JOBS activities despite the mandate to participate, and might have been sanctioned as a result, as well as those who did participate.
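The distinction between assignment and participation can be made concrete with a small sketch. The records below are invented for illustration and the names are hypothetical; the point is only that the experimental contrast groups families by assignment, not by whether they took part in JOBS activities.

```python
from statistics import mean

# Hypothetical records: (assigned_group, participated_in_jobs, outcome_score)
records = [
    ("LFA", True, 52.0),
    ("LFA", False, 46.0),
    ("LFA", True, 54.0),
    ("control", False, 47.0),
    ("control", False, 49.0),
    ("control", False, 48.0),
]


def impact_by_assignment(records, program_group):
    """Mean outcome of everyone assigned to the program group minus the
    mean outcome of everyone assigned to the control group."""
    program = [y for g, _, y in records if g == program_group]
    control = [y for g, _, y in records if g == "control"]
    return mean(program) - mean(control)


# The experimental contrast: all assigned families, participants or not.
itt = impact_by_assignment(records, "LFA")

# A participants-only contrast, shown for comparison. It is NOT the
# experimental estimate, because participants are a self-selected subset.
participants_only = (
    mean(y for g, took_part, y in records if g == "LFA" and took_part)
    - mean(y for g, _, y in records if g == "control")
)
print(round(itt, 2), participants_only)
```

In these made-up data the participants-only contrast is larger than the contrast by assignment, which shows how restricting the comparison to participants could give a different answer from the experimental one.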

For all of the sample members in the NEWWS, background information and attitudinal data are available from information collected just prior to random assignment (standard client characteristics and Private Opinion Survey). In addition, immediately prior to random assignment, recipients were given an assessment of reading and math skills. Because these data were collected prior to random assignment, they provide us with "baseline data" about the families, or data unaffected by assignment to a research group. In addition, administrative data are available for each of the families in the sample from county and state Aid to Families with Dependent Children records, and from state unemployment insurance records.

While baseline and administrative data are available for all sample members in the NEWWS, a subset of the full evaluation sample is also participating in two follow-up survey waves: one completed approximately two years after random assignment and another five years after random assignment. The sample for the client survey is a stratified random sample of the full evaluation sample; that is, those participating in the client survey were randomly selected, with certain subgroups systematically oversampled to permit analyses of specific subgroups. Only respondents who spoke English or Spanish, and thus could be interviewed in one of these languages, were included in the client survey sample. Analyses of the client survey data are weighted to permit generalization to the full population of individuals eligible for the NEWWS at each site. The two- and five-year follow-up surveys provide maternal response measures on such issues as participation in educational and training activities and perceptions of such activities, educational attainment, employment, earnings, receipt of benefits, and use of child care while the mother is employed.
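The text does not describe the weighting scheme itself. A minimal sketch of one common approach, inverse-probability weighting by stratum, is given below as an assumption; the stratum names and counts are invented for illustration.

```python
def sampling_weights(population_counts, sample_counts):
    """Weight for each stratum = population size / number sampled.
    Assumption: simple inverse-probability weights within strata."""
    return {s: population_counts[s] / sample_counts[s] for s in sample_counts}


def weighted_mean(values_by_stratum, weights):
    """Mean of all sampled values, each weighted by its stratum weight."""
    total = 0.0
    weight_sum = 0.0
    for stratum, values in values_by_stratum.items():
        for value in values:
            total += weights[stratum] * value
            weight_sum += weights[stratum]
    return total / weight_sum


# Hypothetical strata: one subgroup is oversampled (50 of 100 drawn),
# the rest of the population is sampled more thinly (90 of 900 drawn).
population = {"oversampled": 100, "other": 900}
sample = {"oversampled": 50, "other": 90}
weights = sampling_weights(population, sample)

# A yes/no measure that is 1 for everyone in the oversampled stratum.
values = {"oversampled": [1.0] * 50, "other": [0.0] * 90}
print(weighted_mean(values, weights))
```

Without weights, the oversampled subgroup would make up 50 of 140 respondents; with weights, it counts for 100 of 1,000, its share of the hypothetical population.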

Design of the NEWWS Child Outcomes Study

The JOBS program departed from earlier welfare-to-work programs in that it mandated the participation of parents with children as young as age three (or younger at state option). Previous welfare-to-work programs were often voluntary, and had focused their attention on mothers with school-age children. In the context of the JOBS Program, preschool-age children were expected to be particularly likely to experience changes in their daily routines and child care situations. The NEWWS Child Outcomes Study, being carried out by Child Trends under subcontract to the Manpower Demonstration Research Corporation, was launched as a special substudy within the larger NEWWS, in order to study whether and how the development of preschool-age children was affected over time when their mothers were assigned to a JOBS program.

As noted in Chapter 1, there are reasonable bases for hypothesizing quite divergent program impacts for children (ranging from negative impacts, to neutral impacts, to positive impacts, or to impacts only for specified subgroups). Thus, the NEWWS Child Outcomes Study does not begin with a specific hypothesis about the direction of effects on children but rather seeks to document the full range of impacts both in the aggregate and for specified subgroups. At the same time, a priority is placed in our examination of the impacts for children on assessing whether the JOBS Program had unfavorable impacts on children (the "harm hypothesis"). For policy makers, two important bases for assessing the impacts of the JOBS Program are whether it had positive effects on family economic self-sufficiency and, at a minimum, did not harm children.

In the present study, we report all program impacts on children that are statistically significant. These program impacts are reliable: they are very unlikely to have occurred just on the basis of chance. As such, these program impacts warrant continued monitoring. In the Child Outcomes Study, we will want especially to monitor whether the kinds of measures on which statistically significant impacts were found at the two-year point continue to show differences at the final follow-up (five years after the families enrolled in the evaluation), and whether such differences grow in magnitude.

We also report on whether a statistically significant result meets a further criterion: that of "policy relevance." At the start of the study and as the study proceeded, researchers and policy makers met to grapple with the question of the point at which child impact findings should be taken into account in considerations about policy. A decision was made that statistically significant findings that were of a particular magnitude should be considered relevant to policy discussions: specifically, statistically significant child impact findings of at least a third of a standard deviation.

This threshold sets aside impact findings that are so small that, while they are reliable statistically and warrant continued monitoring over time, they may at this point in time have limited importance in terms of children's development. At the same time, the threshold for policy relevance does not require that an impact be large in magnitude (1) in order to meet the criterion. By setting the threshold in this way, we can be reasonably confident that we are being inclusive in identifying instances of possible harm (as well as of possible beneficial effects on children), without focusing on effects that are so small as to be of limited importance for children's development.

In presenting results, we go beyond consideration of significant and policy relevant effects to discuss the patterning of findings. We also identify those impacts for which effect sizes substantially exceeded the threshold for policy relevance, in that effect sizes were .50 or larger. The strongest evidence on which to base conclusions about impacts on children is a consistent patterning of impact results, particularly when impacts meet or exceed the criterion for policy relevance. A patterning of results, for example, might show consistently favorable impacts for families in a particular site, or a particular program approach. A patterning of results might also pertain to a type of child outcome, with findings in one aspect of development (such as health) consistently affected favorably (or unfavorably) across programs.
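The tiers described above can be sketched as a simple classification. Several details here are assumptions rather than statements from the text: the effect size is taken to be the impact divided by a standard deviation of the outcome (the text does not say which standard deviation is used), the significance level of .05 is assumed, and a third of a standard deviation is treated as a minimum, as the phrase "meet or exceed" suggests.

```python
POLICY_RELEVANCE_THRESHOLD = 1 / 3  # a third of a standard deviation
LARGER_EFFECT_THRESHOLD = 0.50      # ".50 or larger", per the text


def effect_size(impact, outcome_sd):
    """Impact expressed in standard deviation units (assumed definition)."""
    return impact / outcome_sd


def classify(impact, outcome_sd, p_value, alpha=0.05):
    """Place an impact estimate in one of the tiers discussed in the text.
    alpha=0.05 is an assumption; the text does not state the level used."""
    if p_value >= alpha:
        return "not statistically significant"
    size = abs(effect_size(impact, outcome_sd))
    if size >= LARGER_EFFECT_THRESHOLD:
        return "policy relevant, effect size .50 or larger"
    if size >= POLICY_RELEVANCE_THRESHOLD:
        return "policy relevant"
    return "statistically significant only"


print(classify(impact=2.0, outcome_sd=5.0, p_value=0.01))
```

The absolute value is used so that unfavorable impacts (possible harm) and favorable impacts are held to the same threshold.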

The NEWWS Child Outcomes Study is being carried out in three of the seven sites of the full evaluation. These sites -- Atlanta (Fulton County), Georgia; Grand Rapids (Kent County), Michigan; and Riverside (Riverside County), California -- were chosen on the grounds that they each conducted at least one round of random assignment at the JOBS office, involve a contrast of all three research groups (labor force attachment, human capital development and control groups), and permit an examination of the JOBS Program as implemented in differing regions of the country (with differing populations and differing economic, as well as social, contexts). Chapter 3 includes a discussion of the site characteristics and a description of how the JOBS Program was implemented in each of the three sites of the NEWWS Child Outcomes Study.

The NEWWS Child Outcomes Study is "embedded" within the larger NEWWS; that is, each of the families in the Child Outcomes Study completed the procedures of the full evaluation, including the component of the full evaluation that involved collection of survey data. Thus, we have baseline data, administrative data, and the two-year follow-up survey data from the full evaluation for these families, and we will eventually have five-year follow-up surveys as well.

For the families participating in the NEWWS Child Outcomes Study, the two- and five-year follow-up surveys are more extensive than for other families in the survey sample, including extra sections focusing on the development of the child and on aspects of family life and child care that may be important to child outcomes. In addition, families in the NEWWS Child Outcomes Study are asked at the time of the five-year follow-up for their permission to contact the focal child's primary teacher, in order to ask that the teacher complete a mailed questionnaire concerning the child's academic progress and behavior in school (the "Children's School Progress Survey").

In order to be eligible for inclusion in the NEWWS Child Outcomes Study, families participating in the NEWWS in the Atlanta, Grand Rapids, and Riverside evaluation sites had to meet these additional criteria:

In all, 5,905 families were identified as eligible for inclusion in the NEWWS Child Outcomes Study. Of these, 3,670 families were selected to be interviewed for the two-year follow-up. Overall, a total of 3,194 (or 87 percent) of selected families completed the two-year follow-up survey, with response rates ranging from 80 percent (in Riverside) to 91 percent (in Atlanta and Grand Rapids).

Four further criteria were established in order for families to be included in the analyses of the two-year follow-up data for the Child Outcomes Study:

A total of 176 (or 5.5 percent of) respondents to the two-year follow-up survey were dropped from the Child Outcomes Study analysis sample. Thus, the sample for the present analyses of the NEWWS Child Outcomes Study includes a total of 3,018 families. Of these families, 1,422 are from the Atlanta site of the evaluation, 646 are from the Grand Rapids site, and 950 are from the Riverside site. Chapter 4 describes the characteristics of these families at the time they entered the NEWWS.(6)
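The sample counts reported above can be checked with a few lines of arithmetic. All figures are taken from the text; only the variable names are ours.

```python
eligible = 5905    # families identified as eligible
selected = 3670    # families selected for the two-year follow-up
completed = 3194   # families completing the two-year follow-up survey
dropped = 176      # respondents dropped from the analysis sample
site_counts = {"Atlanta": 1422, "Grand Rapids": 646, "Riverside": 950}

response_rate = completed / selected   # reported as 87 percent
drop_rate = dropped / completed        # reported as 5.5 percent
analysis_sample = completed - dropped  # reported as 3,018

print(f"response rate: {response_rate:.1%}")
print(f"dropped: {drop_rate:.1%}")
print(f"analysis sample: {analysis_sample}")
print(f"sum of site counts: {sum(site_counts.values())}")
```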

For the present report, which focuses on child outcomes at the time of the two-year follow-up, we will rely upon data from the following sources:

Core component. The core interview provides us with maternal report measures of participation in education and training programs, educational attainment, employment, earnings, benefits, and child care use while the mother was employed. Some measures in the core interview also concern the well-being of all of the children in the family. Respondents who were determined at baseline to be in need of education, and who were in the human capital development group or the control group, completed math and literacy tests as part of the two-year follow-up as well.

Child Outcomes Study component. The component of the interview specific to the Child Outcomes Study is the source of maternal report measures of the focal child's health and social development, as well as of a direct assessment of the focal child's cognitive school readiness. The specific child outcome measures are described below. This component of the survey also provides maternal report and interviewer rating measures of the home environment and of the mother-child relationship, and maternal report measures of the focal child's child care participation, the mother's psychological well-being, household composition, and the receipt of child support and the child's contact with the father.

While the present report focuses on outcomes two years after random assignment in all three of the sites included in the NEWWS Child Outcomes Study, we note that two special studies have been conducted that involve a subset of the Child Outcomes Study sample specifically in the Atlanta site. In order to provide a descriptive portrayal of families with young children close to the start of the evaluation, 790 families in the Atlanta site participated in an additional wave of data collection called the Descriptive Survey, on average three months after baseline. A descriptive account of these families and of the children's development shortly after the start of the evaluation is presented in a report entitled How Well Are They Faring? AFDC Families with Preschool-Aged Children in Atlanta at the Outset of the JOBS Evaluation (Moore, Zaslow, Coiro, Miller, and Magenheim, 1995).

A second special study, the JOBS Observational Study, is also being conducted in the Atlanta site among a subsample of families from the Descriptive Study. This study is supported by a consortium of private and public funders including the Foundation for Child Development, the William T. Grant Foundation, the George Gund Foundation, an anonymous funder, and (for pretest work only) the U.S. Department of Health and Human Services. The study seeks to provide detailed and sensitive measures of mother-child interaction at two points in time: soon after baseline (4-6 months), and a period of years after baseline, when longer-term effects of the program can be assumed to have occurred (4 ½ years after baseline). The goals of this study are to ask whether the JOBS program affects mother-child interaction during the early months of the program, and at a point in time when longer-term adaptations to the program have been made; to examine the role of parenting behavior in shaping any impacts of JOBS on children's development; and to assess the contributions of different approaches to measuring parenting behavior (observational; interview-based) in the context of an evaluation study.

Procedures of the Two-Year Follow-Up Survey

All families participating in the survey waves of the NEWWS, including the families in the Child Outcomes Study, were told at the time of random assignment that they would be contacted for follow-up interviews. Families were then sent letters when it was time to contact them for the two-year follow-up survey. Interviewers set up appointments for the follow-up survey either by contacting respondents by phone, or by going directly to the respondent's home (for example, because the respondent did not have a telephone, or because the interviewer had difficulty reaching the respondent by phone). Interviews were conducted in the respondents' homes. The in-home survey, including the core as well as the component specific to the Child Outcomes Study, lasted about one and a half hours (ranging from half an hour to four and a half hours). During the visit:

Respondents were given a $20 incentive for their participation, and the focal children in the study were also given a small gift (a travel Etch-a-Sketch).

In fielding the study, efforts were made to recruit interviewers from ethnic/racial backgrounds similar to those of the respondents in each site, and all interviewers were female. Bilingual interviewers (Spanish-English) were available in the Grand Rapids and Riverside sites, and all survey instruments were translated into Spanish (with checks on the translation carried out via oral back-translation). If a respondent in one of these sites indicated a preference for the interview to be conducted in Spanish, this was done. The child assessment, the Bracken Basic Concept Scale/School Readiness Composite, was also available in a Spanish version. An adult respondent who did not speak English was not asked to complete the literacy and math tests, because these involved assessing these skills as they would be used in an English-speaking context.

Interviewers all participated in a three-day training meeting in which they received instruction in the administration of the survey modules, as well as intensive training in administering the child cognitive assessment and the adult literacy assessment and in completing the ratings of the home environment and mother-child interaction. Training for the ratings of mother-child interaction employed a training videotape, and training for ratings of the home environment used photographs of home settings as exemplars.

Data quality was monitored intensively during the first months of fielding through detailed review of completed surveys (including assessments and ratings) in each of the three sites by staff from Child Trends, Manpower Demonstration Research Corporation, and the fielding organization, Response Analysis Corporation. Interviewers were contacted directly regarding any problems in their administration of the follow-up. After the initial months of fielding, ongoing quality control involved review of key questionnaire items and sections on every interview. If information was missing or inconsistent with data provided elsewhere in the interview, staff from Response Analysis Corporation re-contacted the respondent directly or had the original interviewer re-contact the respondent. Discussions occurred periodically between the fielding organization and the research staff at Child Trends regarding any particular child-related questions or issues that emerged during fielding. Response Analysis Corporation also verified the completion of 20-40 percent of each interviewer's sessions.

Child Outcome Measures Included in the Two-Year Follow-Up

The two-year follow-up included measures of children's development in three broad aspects, or "domains," of development:

We note that some measures of development are available for the focal child in each family only, and some are available pertaining to any child in the family. Below, we briefly describe the measures used to assess development in each of these domains. A number of the measures pertaining to any child in the family come from the core instrument, and thus were asked of all families in the full NEWWS sample participating in the survey component.(7) We note that other measures central to our analysis of child outcomes are described in detail in other chapters. In particular, Chapter 8 lists and describes the particular measures we use in assessing how child impacts come about (i.e., asking whether variables such as total family income, maternal depression, and participation in child care changed in response to JOBS, and whether changes on these variables help explain impacts on child outcomes).

I. Measures of Cognitive Development and Academic Achievement

II. Measures of Behavioral and Emotional Adjustment

III. Measures of Child Health and Safety

Strategy of Analysis

The data analyses that we will present in the following chapters follow a progression across:

Below we briefly describe the aim and approach for each of these types of data analyses.

I. Descriptive Analyses

The goal of descriptive analyses is to provide a portrayal of the families and children in the sample apart from any effects of JOBS. Descriptive data on sample characteristics (presented in Chapter 4) are based on the information collected from respondents prior to random assignment.(10) For these analyses, as for all analyses in this report, we present findings separately for the Atlanta, Grand Rapids, and Riverside research sites. However, since we are relying on data collected before respondents were randomly assigned for these particular descriptive analyses, we combine the data across the three research groups (labor force attachment, human capital development, and control groups), and present summary figures. Thus, for example, we present the percentage of mothers with differing levels of educational attainment in each site, and we summarize the mean ages of mothers and children in each site, using baseline data.

Descriptive data on the developmental status of the children (presented in Chapter 5) are based on the child outcome measures described above that were collected as part of the two-year follow-up. Because the intent of providing descriptive data on the children is to portray their development apart from the effects of the program (in order to provide a context for interpreting subsequent findings on child impacts), we restrict our focus here to children in the control groups, the group in each site unaffected by exposure to JOBS. In presenting this descriptive portrayal of the developmental status of the children apart from JOBS, we will sometimes draw upon "benchmark data," or data for the same child outcome measures collected in other samples. For example, the measure of child behavior problems used at the two-year follow-up, the Behavior Problems Index, was also used in a national survey, the National Longitudinal Survey of Youth-Child Supplement. Behavior Problems Index findings for children of the same ages from the National Longitudinal Survey of Youth-Child Supplement can help us get a sense of whether the children in the control group of our sample have more or less frequent behavior problems, compared to children in a national sample.

II. Examination of Aggregate Impacts at Each Site

Having given a descriptive portrayal of the families in the sample and of the developmental status of the children apart from JOBS, we turn to an examination of program impacts on the children's developmental outcomes. A program impact reflects the average difference between families in an experimental group and families in the control group on a given outcome measure. Our examination of program impacts will contrast each experimental group (labor force attachment, human capital development) with the control group separately. We will carry out these contrasts separately within each site.

In examining program impacts, we first consider aggregate impacts. An aggregate impact reflects the difference, in a given site, between the average score on a particular measure for all of the families in one of the program groups, and all of the families in the control group. That is, in examining aggregate impacts we are asking whether, for a particular measure, there is a program impact for a research group as a whole, in a given site. In section III below, we describe analyses aimed at assessing whether program impacts occur in specified subgroups, in addition to or rather than for a research group as a whole, in each site.

We include in these analyses all of the families assigned at random assignment to the research groups of interest. Thus, for example, we consider all of the families assigned to the human capital development group whether or not they actually participated in basic education, job training, or employment activities, and contrast this group with all of the families assigned to the control group.(11) These group contrasts thus reflect, on the average, experiences of families in the different research groups in light of whether they were assigned to a JOBS program group, rather than according to their actual participation in program components.
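The logic of contrasting everyone assigned to a group, regardless of participation, can be sketched as follows. This is an illustration only, not the code used for this report: the records, the field layout, and the function name `aggregate_impact` are hypothetical, and the sketch computes an unadjusted, unweighted difference in means, whereas the impacts reported here are regression-adjusted and weighted.

```python
from statistics import mean

# Hypothetical records: (assigned_group, participated, outcome_score).
# All values are invented for illustration; none come from the study.
families = [
    ("hcd", True, 48.0),
    ("hcd", False, 44.0),
    ("hcd", True, 50.0),
    ("control", False, 45.0),
    ("control", False, 43.0),
    ("control", False, 47.0),
]


def aggregate_impact(records, program_group, control_group="control"):
    """Mean outcome for everyone assigned to program_group minus the
    control-group mean. Participation status is deliberately ignored."""
    program = [score for group, _, score in records if group == program_group]
    control = [score for group, _, score in records if group == control_group]
    return mean(program) - mean(control)


impact = aggregate_impact(families, "hcd")
```

Note that the family assigned to the program group that did not participate still counts toward the program-group mean.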

All analyses of aggregate impacts will be reported separately by site and by program approach.(12) When the examination of impacts involves a continuous dependent variable (for example, children's scores on the assessment of cognitive school readiness), we have carried out ordinary least squares multiple regression. In these analyses, we examined each child outcome measure separately as a dependent variable, and included an experimental comparison "dummy" variable (i.e., either labor force attachment vs. control, or human capital development vs. control) as an independent variable to test program impacts. In each of these analyses we used a common set of covariates to improve the precision of the impact estimate by controlling for variation on background characteristics.(13) These covariates were chosen in communication with researchers at the Manpower Demonstration Research Corporation, so as to coordinate the present analyses of child outcomes with the analyses of economic outcomes at the two-year follow-up point being carried out with the larger NEWWS sample.
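A minimal sketch of this kind of regression is given below, assuming a single covariate and no weights. The data are invented and constructed so that the true program coefficient is exactly 3; the helper names `solve_linear_system` and `ols` are hypothetical and are not drawn from the software used for this report. The coefficient on the program dummy is the regression-adjusted impact estimate.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build an augmented matrix so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Each row: [intercept, program dummy (1 = program, 0 = control), covariate].
# Hypothetical data built so that y = 40 + 3*dummy + 0.5*covariate exactly.
design = [
    [1.0, 1.0, 10.0],
    [1.0, 1.0, 20.0],
    [1.0, 1.0, 30.0],
    [1.0, 0.0, 12.0],
    [1.0, 0.0, 22.0],
    [1.0, 0.0, 26.0],
]
scores = [40 + 3 * d + 0.5 * c for _, d, c in design]

intercept, impact, covariate_slope = ols(design, scores)
```

In practice a statistical package would be used, and it would also supply the standard errors needed for significance tests; the sketch shows only where the impact estimate comes from.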

Where the examination of aggregate impacts involved a dichotomous child outcome variable (for example, in examining whether or not the focal child had any academic problems) rather than a continuous measure, the analysis was carried out using logistic regression. Again, each experimental group was contrasted with the control group separately; analyses employed the common set of covariates; each child outcome was examined in a separate analysis; and all analyses were carried out separately by site and program approach.
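To give a sense of what the program coefficient means for a dichotomous outcome, the sketch below uses the fact that, in a model with no covariates, the coefficient on the program dummy equals the log odds ratio between the two groups. This is a simplification made for illustration: the analyses in this report include covariates, so the reported coefficients are adjusted. The data and the function name `log_odds_ratio` are hypothetical.

```python
import math


def log_odds_ratio(program_outcomes, control_outcomes):
    """Log odds ratio for a 0/1 outcome. This equals the program-dummy
    coefficient of a logistic regression that has no covariates."""
    p1 = sum(program_outcomes) / len(program_outcomes)
    p0 = sum(control_outcomes) / len(control_outcomes)
    return math.log(p1 / (1 - p1)) - math.log(p0 / (1 - p0))


# Hypothetical 0/1 indicators (1 = one or more academic problems).
program = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]  # 20 percent with a problem
control = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]  # 40 percent with a problem

coefficient = log_odds_ratio(program, control)
difference_in_proportions = (
    sum(program) / len(program) - sum(control) / len(control)
)
```

A negative coefficient here corresponds to a lower proportion of children with the problem in the program group than in the control group.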

When we report that JOBS had a statistically significant impact on a child outcome, this indicates that the mean difference on a continuous outcome variable (for example, on the assessment of child cognitive development), or a difference in the proportion of children receiving a rating of one on a dichotomous variable (for example, the proportion of children with one or more academic problems), is unlikely to have arisen simply by chance. We will follow the convention of reporting an effect as statistically significant when data analyses indicate that there was a smaller than 10 percent probability that the finding could have arisen by chance, that is, reflected random variation in individuals' scores.

Tables reporting on aggregate impacts will note with a "+" superscript those effects that have less than a 10 percent probability of having arisen by chance. One asterisk will indicate effects that have less than a 5 percent probability of having arisen by chance, two asterisks will indicate a less than 1 percent probability, and three asterisks a less than one-tenth percent probability.(14)
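The table notation just described amounts to a simple mapping from a p-value to a marker, sketched below. The function name `significance_marker` is hypothetical.

```python
def significance_marker(p_value):
    """Return the table marker for a two-tailed p-value, following the
    convention described in the text."""
    if p_value < 0.001:
        return "***"  # less than one-tenth of 1 percent
    if p_value < 0.01:
        return "**"   # less than 1 percent
    if p_value < 0.05:
        return "*"    # less than 5 percent
    if p_value < 0.10:
        return "+"    # less than 10 percent
    return ""         # not statistically significant


markers = [significance_marker(p) for p in (0.0005, 0.004, 0.03, 0.07, 0.20)]
```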

As noted above, this report will also examine impacts on children from the point of view of whether they are of sufficient magnitude for policy makers to consider when developing policy. Such a "policy relevant" impact was defined, statistically, as one in which the effect size was at least one-third of a standard deviation on a given measure.(15),(16) While the "harm hypothesis" directs us to identify unfavorable, and especially "policy relevant," program impacts on children, we acknowledge that policy relevant program impacts may occur in a positive as well as negative direction.
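The policy-relevance criterion can be illustrated with a short sketch. The control-group scores below are invented, the function names are hypothetical, and the use of the sample standard deviation is an assumption; the text does not specify which form of the standard deviation was computed.

```python
from statistics import stdev

POLICY_RELEVANT_THRESHOLD = 1 / 3  # one-third of a standard deviation


def effect_size(impact, control_scores):
    """Impact expressed in control-group standard deviation units."""
    return impact / stdev(control_scores)


def is_policy_relevant(impact, control_scores):
    """True when the impact, in either direction, is at least one-third
    of the control group's standard deviation."""
    return abs(effect_size(impact, control_scores)) >= POLICY_RELEVANT_THRESHOLD


# Hypothetical control-group scores on a continuous measure.
control_scores = [40.0, 44.0, 48.0, 52.0, 56.0]

favorable = is_policy_relevant(2.5, control_scores)
unfavorable = is_policy_relevant(-2.5, control_scores)
too_small = is_policy_relevant(1.0, control_scores)
```

Taking the absolute value reflects the point made above that policy relevant impacts may occur in a positive as well as a negative direction.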

In all discussions of impacts, the patterning of results (according to developmental domain, site, and/or program approach) will also be taken into account. Thus, while we present all statistically significant program impacts on children, we concentrate our discussion on impacts that show a distinct pattern, as well as impacts that are of sufficient magnitude to meet the criterion for policy relevance.

III. Examination of Subgroup Impacts at Each Site

As noted in Chapter 1, an important possibility is that JOBS will affect subgroups of families differently. In order to examine this possibility, we will go beyond the consideration of aggregate impacts to consideration of impacts for specified subgroups. Subgroups are delineated according to characteristics of the families at baseline and are categorized into "lower-risk" and "higher-risk" based on these variables. In an effort to minimize the number of subgroups examined and maximize the clarity of findings, information from ten baseline variables was drawn upon in creating higher and lower-risk subgroups of four different types. We refer to each approach to defining higher and lower-risk subgroups as a "risk composite" because each is based on multiple rather than individual baseline variables:

Thus, for example, we ask whether JOBS programs had effects on children in families in which the mother was at higher and lower risk in terms of indicators of psychological distress at baseline; in which the mother was at higher and lower educational risk; in which the mother was at higher and lower work risk; and in families with more or closely spaced children and with fewer or less closely spaced children. For each of the composite risk measures, families were categorized in a mutually exclusive way, as at either higher or lower risk. Families could be categorized as at higher risk on more than one of the risk composites.

The particular baseline variables that formed the basis of the composite risk measures were chosen from the far longer list of available baseline variables on two grounds:

Thus, for example, mothers showing few or many indicators of psychological distress at baseline might well differ in their ability to mobilize to respond to the requirements of JOBS. At the same time, there is ample evidence to indicate that maternal psychological distress is an important predictor of children's developmental outcomes (Downey and Coyne, 1990).

In addition to creating these composite risk measures, a summary index of cumulative risk was created, reflecting the number of composite risk factors at baseline for which a family was at higher risk. The risk summary score could range from 0 to 4, with a point assigned when:

Families experiencing none or one of these baseline composite risks were considered to be at lower cumulative risk, while families with two to four of these composite risks were considered to be at higher cumulative risk.
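The scoring rule can be sketched as follows, assuming each family has already been classified as "higher" or "lower" on each of the four risk composites. The dictionary keys and the function name `cumulative_risk` are hypothetical labels chosen for illustration.

```python
# The four risk composites named in this chapter (key names are hypothetical).
COMPOSITES = (
    "sibling_configuration",
    "educational",
    "work",
    "psychological_well_being",
)


def cumulative_risk(family):
    """Return (score, category): score counts the composites on which the
    family is at higher risk (0 to 4); two or more means higher cumulative
    risk, none or one means lower cumulative risk."""
    score = sum(1 for name in COMPOSITES if family[name] == "higher")
    category = "higher" if score >= 2 else "lower"
    return score, category


# Hypothetical families.
family_a = {
    "sibling_configuration": "lower",
    "educational": "higher",
    "work": "higher",
    "psychological_well_being": "lower",
}
family_b = dict(family_a, work="lower")

result_a = cumulative_risk(family_a)
result_b = cumulative_risk(family_b)
```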

In addition to the creation of higher and lower-risk subgroups in terms of sibling configuration, educational risk, work risk, maternal psychological well-being risk, and cumulative risk, we examined three further approaches to delineating risk on a more exploratory basis: age of child, maternal attitudes about working, and maternal attitudes toward school. While theoretically important, there is less empirical evidence to suggest that these constructs provide meaningful bases for identifying risk within the present sample. These variables allow us to distinguish among families in which the mother had more and fewer reservations about working (with more reservations about working hypothesized to reflect higher risk); more and less positive attitudes toward school (with less positive attitudes about school hypothesized to reflect higher risk); and in families in which the focal child was the median age or younger at baseline or older than the median age at baseline (with younger child age hypothesized to reflect higher risk). As will be seen in Chapter 7, analyses of child outcome measures for control group families supported the use of only one of these more exploratory bases for grouping families as a risk measure: "attitudes toward work" risk. For this but not the other more exploratory measures, children's scores in the three sites' control groups consistently went in a direction indicating less favorable development in the group hypothesized to be at greater risk.

Table A-1 (in Appendix A) provides the definition and sample sizes for each of the baseline subgroups at each of the research sites.

The examination of subgroup impacts focuses on effects within a particular higher or lower-risk subgroup. For example, we consider impacts on child outcome measures in the subgroup of families at higher educational risk. Within this baseline subgroup, we ask whether families in one of the experimental groups have mean or proportion scores on child outcome measures that differ significantly from the scores of families in the control group. We then ask the same question for the subgroup of families at lower educational risk. In the same way, we ask whether there is evidence of significant program impacts within the higher and lower-risk subgroups in terms of work risk, maternal psychological well-being risk, sibling configuration risk, cumulative risk, and the more exploratory approaches to delineating risk (especially "attitudes toward work" risk).

Apart from the delineation of a particular subsample to focus upon as the sample for each subgroup impact analysis, we follow the same strategy here as was noted for aggregate impacts. For example, we use the same set of covariates in all analyses; we carry out ordinary least squares multiple regression or logistic regression in keeping with the nature of the outcome variable examined; and we report findings separately by site and program approach.

IV. Explanatory Analyses

Having identified the child outcomes for which there are significant aggregate impacts and impacts for specified subgroups, the focus of analysis will shift to the question of what underlies the program impact findings for children. In a modest set of non-experimental analyses, we will examine the pathways through which particular JOBS programs appear to have affected children using mediation analyses (Baron and Kenny, 1986). The first step requires identifying the child impacts that we wish to examine. We do not attempt to explain all significant program impacts on children; rather, for these mediational analyses, at least one aggregate impact in each developmental domain (i.e., cognitive development and academic achievement; behavioral and emotional adjustment; physical health and safety) was selected that generally illustrates the pattern of results for that domain. In order to conclude that a program impact on a targeted or non-targeted adult outcome helps to explain statistically, or "mediates," the same program's impact on a given child outcome, three conditions must hold (see Baron and Kenny, 1986): (1) the adult outcome must, itself, be affected by the JOBS program being considered; (2) the adult outcome must predict the child outcome (with the JOBS program dummy also in the model); and (3) with this adult outcome variable in the model, the previous impact of a JOBS program on the given child outcome must be smaller than without this variable in the model.
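The three conditions can be written as a set of regression equations. The notation below is ours, introduced for illustration, and the common covariates are omitted for brevity: T is the program dummy, M is the adult outcome examined as a possible mediator, and Y is the child outcome.

```latex
% T: program dummy (1 = program group, 0 = control group)
% M: adult outcome examined as a possible mediator
% Y: child outcome
\begin{align}
M &= \alpha_0 + a\,T + \varepsilon_1
  && \text{(condition 1: } a \neq 0\text{)} \\
Y &= \gamma_0 + c\,T + \varepsilon_2
  && \text{(program impact without the mediator)} \\
Y &= \beta_0 + c'\,T + b\,M + \varepsilon_3
  && \text{(condition 2: } b \neq 0\text{; condition 3: } |c'| < |c|\text{)}
\end{align}
```

Here c is the program impact on the child outcome estimated without the adult outcome in the model, and c' is the impact that remains once the adult outcome is included.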

We should emphasize that, while we draw conclusions regarding the degree to which adult impacts appear to have led to impacts on children, the adult impacts we examine as possible mediators of program impacts on children were measured concurrently with children's outcomes; that is, both adult and child outcomes were measured at the two-year follow-up. Thus, any causal conclusions regarding the pathways through which children were affected by their mother's assignment to a JOBS welfare-to-work program must be made cautiously. Information from the five-year follow-up will allow us to examine the chronological nature of program impacts. This subsequent wave of data, combined with more rigorous statistical techniques that allow the direct testing of alternative hypotheses regarding pathways of program impacts, will improve our ability to identify the ways in which children were affected by JOBS welfare-to-work programs.


Endnotes

1.  Researchers in the behavioral sciences often rely on Cohen's (1988) characterization of effect sizes (in standard deviation units) of .20 as "small," .50 as "medium," and .80 as "large."

2.  Sixty-four cases were dropped because the focal child was not the respondent's biological or adoptive child. (There is only one adoptive child in the Child Outcomes Study sample.)

3.  Two children were too old, and one child was too young, to be focal children; child ages must have been incorrectly reported at baseline and their families should not have been selected for the Child Outcomes Study in the first place.

4.  A total of 69 families, all in Riverside, were dropped from the sample because they had moved 100 or more miles away from Riverside County.

5.  A total of 67 mothers reported living away from the focal child for at least three months at the time of the two-year follow-up.

6.  Despite the fact that not all families eligible and randomly assigned at baseline are contained in the sample for the present report, the "fidelity" of random assignment was maintained; that is, there is no systematic difference between the experimental and control groups on baseline characteristics, with one exception. In Riverside, among those identified as "in need" of basic education, those assigned to the labor force attachment (LFA) program differed from those assigned to the control group on a few background characteristics. However, neither group can be considered uniformly more advantaged or less advantaged since, on some characteristics (e.g., prior employment), the control group appeared more advantaged, whereas on other characteristics (e.g., maternal psychological well-being), the LFA group appeared more advantaged. Moreover, these differences were controlled statistically in all impacts analyses by including the variables on which these groups differed as covariates.

7.  A special "synthesis" report (Hamilton, with Freedman and McGroder, 2000) draws together the findings relating to any child in the family from the present Child Outcomes Study report, and these "any child in the family" items from the full NEWWS sample.

8.  Internal consistency reliability indicates the extent to which the individual items that make up a scale, all of which should reflect the same hypothetical underlying construct, are interrelated or "hang together" statistically. The measure used to reflect internal consistency reliability, Cronbach's alpha, has a possible range of 0 to 1.0, with higher scores indicating better internal consistency reliability.

9.  See the National Health Interview Survey, the National Health and Nutrition Examination Survey, the Rand Health Insurance Experiment, the Medical Outcomes Study, and the Child Health Questionnaire (Krause and Jay, 1994; Landgraf, Abetz, and Ware, 1996).

10.  Missing baseline data occurred on selected items from the Private Opinion Survey (POS), which measured clients' attitudes toward welfare, their psychological well-being, and the barriers to employment they faced. Because these baseline variables were important for impacts analyses -- both as covariates and, in subgroup impacts analyses, in defining baseline subgroups -- we imputed values where data were missing. In addition to relying on information regarding site in imputing these data, we selected other POS attitudinal variables to use as the basis for imputation, after examining which particular POS variables were most highly correlated with the variables for which we were imputing scores. Specifically, imputation was done based on data regarding site, JOBS office at random assignment, number of baseline risks, and high school degree status. The descriptive portrayal of families in Chapter 4 does not rely on imputed data.

11.  As we have noted, however, in the Riverside site, members of the human capital development group were contrasted only with members of the control group considered, at baseline, to be in need of basic education (i.e., those without a high school diploma or GED, who demonstrated lower levels of literacy, and/or were not proficient in English at baseline; Hamilton et al., 1997).

12.  Impact analyses were weighted to adjust for cohort differences in the assignment of clients to a treatment stream or to the control group (to preserve the experimental design), as well as to allow generalizations to populations from which the evaluation sample was drawn, namely, the county's AFDC-eligible population. Additional factors entering into the weighting were the number of JOBS offices (Riverside had more than one), high school/GED status, and cohort differences in Atlanta. Weights were decided upon in collaboration with researchers at MDRC, to assure common analytic approaches in the Child Outcomes Study and the NEWWS.

13.  Model covariates included were: marital status, number of children, race, mother's age, average AFDC benefit per month, number of months received AFDC in prior year, focal child's age and gender, high school diploma or GED, literacy, numeracy, time on welfare, work history, depressive symptoms, locus of control, sources of support, family barriers, and number of baseline risks.

14.  All tests of program impacts were "two tailed." That is, we did not begin with a hypothesis about direction of effects (for example, that scores for children in the human capital development group would be better than those in the control group), but rather considered the possibility of effects in either a positive or negative direction.

15.  Standard deviations were calculated separately for each site's control group(s), yielding a criterion for policy-relevance that is identical in a relative sense (i.e., .33 of a standard deviation) but that varies in an absolute sense, depending on the distribution of the measure in the particular site's control group.

16.  For example, on the Bracken Basic Concept Scale/School Readiness Composite, which ranges from 0 to 61, a difference as small as 3.6 points (in Atlanta), 3.8 points (in Grand Rapids), 4.2 points (for the impact of Riverside's LFA program), and 4.3 points (for the impact of Riverside's HCD program) -- representing about four school readiness concepts relating to colors, letters, numbers and counting, comparisons, and shapes -- would be considered policy relevant. As another example, regarding the proportion of focal children in "very good" or "excellent health," a difference of at least 13.5 percentage points (in Atlanta), 13.1 percentage points (in Grand Rapids), 13.2 percentage points (for Riverside's LFA program), and 13.4 percentage points (for Riverside's HCD program) is considered policy-relevant. In fact, for dichotomous outcomes, one-third of a standard deviation actually represents a relatively large impact in absolute terms.
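The point made in endnote 16 about dichotomous outcomes can be illustrated with a short calculation. The sketch below uses the population formula for the standard deviation of a 0/1 variable, the square root of p(1 - p); this is an assumption, since the endnotes do not state which form of the standard deviation was used, and the proportions shown are hypothetical rather than taken from the study. The function name `dichotomous_threshold` is likewise hypothetical.

```python
import math


def dichotomous_threshold(proportion):
    """One-third of the standard deviation of a 0/1 outcome, expressed as
    a difference in proportions. Uses sd = sqrt(p * (1 - p))."""
    return math.sqrt(proportion * (1 - proportion)) / 3


# Hypothetical control-group proportions of children with the outcome.
threshold_at_80 = dichotomous_threshold(0.80)  # about 13.3 percentage points
threshold_at_50 = dichotomous_threshold(0.50)  # about 16.7 percentage points
```

Under this formula the threshold can never exceed about 16.7 percentage points, and it remains above 10 percentage points whenever the control-group proportion lies between roughly 10 and 90 percent, which is why the criterion is demanding for dichotomous measures.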


Updated: 09/19/01