National Evaluation of Welfare-to-Work Strategies: 2-Year Child Outcomes Study (COS) Files: Guide to Using 2-Year Child Outcomes Study

02/21/2002

        GUIDE TO USING THE U.S. DEPARTMENT OF HEALTH AND HUMAN SERVICES
          NATIONAL EVALUATION OF WELFARE-TO-WORK STRATEGIES (NEWWS)
                TWO-YEAR CHILD OUTCOMES STUDY PUBLIC USE FILE

This CD-ROM contains a public use analysis file (N2PC1730.TXT) and
documentation for research on the two-year experimental impacts on outcomes
for children of six welfare-to-work programs.  The programs were operated
in 3 of the 7 NEWWS sites: Atlanta, Georgia; Grand Rapids, Michigan; and
Riverside, California during the early- to mid-1990s.

The file contains the sample, original survey items, and outcome measures
analyzed for U.S. Department of Health and Human Services and U.S. Department
of Education, Impacts on Children and Families Two Years After Enrollment:
Findings from the Child Outcomes Study (2000).

The report and this public use file were prepared by Child Trends, which is
conducting the Child Outcomes Study under subcontract to the Manpower
Demonstration Research Corporation (MDRC).  MDRC is conducting the NEWWS
Evaluation under a contract with the U.S. Department of Health and Human
Services (HHS), funded by HHS under a competitive award, Contract
No. HHS-100-89-0030.  HHS is also receiving funding for the evaluation from the
U.S. Department of Education.  The study of one of the sites in the evaluation,
Riverside County (California), is also conducted under a contract from the
California Department of Social Services (CDSS).  CDSS, in turn, is receiving
funding from the California State Job Training Coordinating Council, the
California Department of Education, HHS, and the Ford Foundation.


I. DATA FILE

N2PC1730.TXT includes 1,380 variables on 14 records and a total of 3,018 sample
members. (See table below for the sample sizes of individual sites.)

N2PC1730.TXT contains, in ASCII format, data on survey-based measures of
children's developmental outcomes and maternal and family outcomes (for example,
maternal psychological well-being and parenting).  It also contains

1) Targeted outcome variables

2) Non-targeted outcome variables

3) Covariates used in statistical models

4) Subgroups used in subgroup impacts analyses

See Section V. of this memo for more information.

IMPORTANT: Values of some measures have been changed to protect sample
members' confidentiality. (See the notes in the codebook.) The data file N2RC1326.TXT
contains the original values and is available to researchers (on a restricted
basis) at the National Center for Health Statistics.  See
www.aspe.hhs.gov/hsp/newws/data-info.htm for more information.


II. SAMPLE

Site           Sample size

Atlanta            1,422
Grand Rapids         646
Riverside            950
TOTAL              3,018

Members of the Child Outcomes Study (COS) sample were all, as of baseline,
single-mother recipients of AFDC with at least one 3- to 5-year-old
child. A child aged 3 to 5 was chosen as the COS "focal child" and is the
subject of most of the survey questions on this file.  In most cases, this was
the mother's only or youngest child, except in the Grand Rapids site, where
about one-third of families also had a 1- to 2-year-old at baseline. Sample
member identifiers and other information that could be used to identify
individuals have been deleted from this file.

All Child Outcomes Study sample members are also members of the NEWWS
2-Year Client Survey sample (N=9,675) and the Full Impact sample (N=44,569).

For all research group sample sizes, refer to the sample size table,
SAMPTBL1.TXT, located in the \Tables directory.


III. USING THE DATA FILE

A. Creating a COS analysis file

IMPORTANT:  The data on N2PC1730.TXT are intended to be analyzed together with

1) Responses by COS sample members to additional survey questions that were
asked of all respondents in the Two-Year Client Survey (CD #2).

2) Additional information on background characteristics; responses to a Private
Opinion Survey (POS) administered at baseline; scores on baseline literacy and
math tests; and earnings, welfare, and Food Stamp data recorded from
administrative records.  These data were recorded for all members of the NEWWS
Full Impact sample (CD #1).

Each record of all 3 files contains a unique sample member IDNUMBER
(which varies from 1 to 44569 on the full impact sample).  For any sample
member, the same IDNUMBER is used on each file.  Researchers may build one
or several analysis files, depending on their research interests,  by
identifying the samples (and records) they wish to study and then merging files
BY IDNUMBER.

The key subsamples are identified on the SAMPLES record of the Full Impact
Sample file.  Their members have a value of 1 on the following variables:

FULLSAMP:  Full impact sample

SRV2RESP:  2-Year Survey respondent

COS2RESP:  2-Year Child Outcomes Study respondent
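
As a minimal sketch of the merge described above, assume the three ASCII files
have already been read into SAS data sets with one observation per sample
member.  The data set names FULLIMP, SURVEY2, COS2, and COSANAL are
hypothetical; only IDNUMBER and COS2RESP come from this guide.

```sas
/* Hypothetical data set names:                          */
/*   FULLIMP = Full Impact Sample file (CD #1)           */
/*   SURVEY2 = Two-Year Client Survey file (CD #2)       */
/*   COS2    = Child Outcomes Study file (this CD-ROM)   */
proc sort data=fullimp; by idnumber; run;
proc sort data=survey2; by idnumber; run;
proc sort data=cos2;    by idnumber; run;

data cosanal;                 /* hypothetical name for the analysis file */
  merge fullimp survey2 cos2;
  by idnumber;
  if cos2resp = 1;            /* keep 2-Year COS respondents only */
run;
```

If the same variable name appears on more than one file, the value from the
data set listed last in the MERGE statement is the one kept, so users may wish
to check for shared names before merging.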


B. Review documentation and test the file

We strongly suggest that users of this file do the following before conducting
any further analyses:

1. Read C2README.TXT, which gives a brief description of all files
included on the CD-ROM.

2. Read the report, particularly Chapter 2, which describes the research
design, samples, and data sources; Chapter 3, which describes the program models
and sites; Chapters 6 and 7, which summarize the impacts of each program on
children's developmental outcomes (in the aggregate and in subgroups,
respectively), and Chapter 9, which summarizes the impacts of each program on
family outcomes (both in the aggregate and in subgroups).

3. Review the following SAS output located within the \OUTPUT subdirectory.

a) BRAKHLTH.LST presents output showing the mean values for the Bracken Basic
Concepts Scale/School Readiness Composite (BBCS/SRC) raw scores for focal
children of control group members, and the proportion rated in very good or
excellent health, in each of the 3 COS sites. (These means can be found in Table
5.1 in Chapter 5 of McGroder et al., 2000.)  Reproducing these means will give
the user experience running and interpreting means on both continuous and
dichotomous outcomes.

b) CSEMPLOY.LST presents output showing mean values on selected two-year
employment-related outcomes for control group members in each of the 3 COS
sites.  (These means can be found in Table 8.1 in Chapter 8 of McGroder et al.,
2000.)   These variables are stored on the Full Impact Sample file
(CD #1) and must be merged with the COS data on this file:


JEMPYN1 is a dummy variable indicating whether the respondent was
employed in the month prior to the survey, and was calculated on all COS
respondents in each site.  JHRLYPAY represents the hourly wage received in the
job held in the month prior to the survey, and was calculated only for those
still employed at the time of the two-year interview (to yield a meaningful wage
rate for descriptive purposes).  JCOVHEAL is a dummy variable indicating whether
the respondent had health insurance at her current/most recent job, and was
calculated only for those still employed at the interview.  VFMEDCOV is a dummy
variable indicating whether the respondent had used any transitional Medicaid
benefits since random assignment, and was calculated only for those reporting
any paid work since random assignment.  Running these means will give the user
experience running and interpreting means on both continuous and dichotomous
outcomes, and will give the user an understanding of how, for descriptive
purposes only, to select the appropriate subsample for whom certain employment-
related variables apply.  Note, however, that to maintain the experimental
comparison for impacts analyses, all sample members are retained (e.g., even
those without a job are retained in impacts analyses of wage rate, with zeroes
assigned to non-employed respondents).


c) GLMBRAKN.LST presents output showing two-year impacts of each of the 6
programs on the Bracken assessment of focal children's cognitive school
readiness.  (These means can be found in Table 6.2 in Chapter 6 of McGroder et
al., 2000.)  Reproducing these impacts will give the user experience running and
interpreting impacts on a continuous child outcome measure.

d) LOGHLTH.LST presents output showing two-year impacts of each of the 6
programs on the proportion of focal children rated by mothers as being in very
good or excellent health.  (These means can be found in Table 6.2 in Chapter 6
of McGroder et al., 2000.)  Reproducing these impacts will give the user
experience running and interpreting impacts on a dichotomous child outcome
measure.

e) EDURSKSG.LST presents output showing two-year impacts of each of the 6
programs on the Bracken assessment of focal children's cognitive school
readiness, for the subgroup of children in each site whose mothers had no
educational risk factors and for the subgroup of children whose mothers had any
of three educational risk factors at baseline.  (These means can be found in
Table 7.1 in Appendix C of McGroder et al., 2000.)  Reproducing these impacts
will give the user experience running and interpreting impacts on a continuous
child outcome measure for a lower- and higher-risk subgroup, defined here
according to the presence of any of three educational risks.

f) CUMRSKSG.LST presents output showing two-year impacts of each of the 6
programs on the proportion of focal children rated by mothers as being in very
good or excellent health, for the subgroup of children in each site whose
mothers had none or 1 of four composite risks and for the subgroup of children
whose mothers had 2-4 of four composite risks at baseline.  (These means can be
found in Table 7.1 in Appendix C of McGroder et al., 2000.)  Reproducing these
impacts will give the user experience running and interpreting impacts on a
dichotomous child outcome measure for a lower- and higher-risk subgroup, defined
here according to the cumulative number of composite family risks.


NOTE: The standard weight variable used for estimating program impacts for the
Two-Year Client Survey and COS Survey samples is called FIELDWGT.  It is stored
in the Two-Year Client Survey data set (CD #2).  The .LST files described above
show output from impact calculations that used a slightly different version of
FIELDWGT for each site.  Weighting by FIELDWGT will produce the same impact
results as weighting by these different versions.


4. Review the rest of the documentation on the public use file, including
N2PC_CBK.TXT, the COS file codebook, and N2PCVARS.TXT, which provides
additional background information on the measures on this file.


5. Replicate the means and frequencies on the CD-ROM. Recreating these results
will familiarize the user with the samples, outcome measures, and the regression
models used to estimate program impacts.  If the user cannot replicate the
output, there is a danger of producing inaccurate results that may lead to
inappropriate conclusions.


IV. THE IMPACT SAMPLE AND RANDOM ASSIGNMENT DESIGN

To test the effectiveness of welfare-to-work program strategies, NEWWS
conducted a random assignment experiment:  In each research site,
people who were required to participate in the program were assigned, by
chance, to either a program group that had access to employment and training
services and whose members were required to participate in the program or risk
a reduction in their monthly AFDC grant, or to a control group, which received
no services through the program but whose members could seek out such services
on their own from the community. This random assignment design assures that
there are no systematic differences between the background characteristics of
people in the program and control groups when they enter the study. Thus, any
subsequent differences in outcomes between the groups can be attributed with
confidence to the effects of the program. These differences are called impacts.

In the three Child Outcomes Study sites (Atlanta, Grand Rapids, and Riverside),
two different types of welfare-to-work programs were operated side by side: a
strongly employment-focused approach, called Labor Force Attachment (or LFA), and
a strongly education-focused approach, called Human Capital Development (or
HCD). Sample members in Atlanta and Grand Rapids were randomly assigned to an
LFA group, an HCD group, or to a control group. Riverside implemented different
random assignment designs to study the effects of its LFA and HCD programs.
Following program intake procedures established by California's welfare
department, Riverside determined each sample member's "need for basic education"
just prior to random assignment.  Those who had a high school diploma or GED
certificate, and scored above minimum levels on both the math and the literacy
sections of the GAIN Appraisal test, and were proficient in English, were
determined not to need basic education.  This group was randomly assigned only
to the LFA or control group.  Those without a high school diploma or GED
certificate, or who scored below minimum levels on either section of the GAIN
Appraisal test, or who did not speak English, were determined by the program to
be in need of basic education.  Individuals in this group were randomly assigned
to any of the three Riverside research groups, including the HCD group.  Thus,
the effects of the LFA approach were tested on the entire sample, but the
effects of the HCD approach were tested only on sample members determined to
need basic education.  Comparisons can be made between outcomes for individuals
assigned to each of the program groups and outcomes for those assigned to the
control group (LFA versus control; HCD versus control), enabling one to estimate
the added benefit of either of these approaches above what the individuals would
achieve in the absence of a welfare-to-work program. Additionally, a direct
comparison can be made between outcomes for individuals randomly assigned to the
two program groups (LFA versus HCD), enabling one to estimate the relative
benefits of one welfare-to-work strategy over the other.

In the Child Outcomes Study sites, random assignment took place at the JOBS
office; that is, sample members were randomly assigned as they attended a
program orientation.  Random assignment for the different sites took place as
indicated by the table below. The impact samples used for the two-year child
impacts report (McGroder, Zaslow, Moore, & LeMenestrel, 2000), and therefore for
the HHS (2001) data file, include the full single-mother impact samples in the
three Child Outcomes Study sites.


Site and Random Assignment Period for Child Outcomes Study Sample

Site            Random assignment period

Atlanta         03/12/92-01/27/94
Grand Rapids    03/25/92-01/31/94
Riverside       09/03/91-06/30/93


The proportion of sample members randomly assigned to the program and control
groups differed across sites.  In Atlanta and Grand Rapids, the proportion of
sample members in each of the three research groups is roughly equal.  Because
of Riverside's dual random assignment design, the proportion of sample members
in each of the three research groups is not equal.  The Riverside research
design has several implications for making comparisons between research groups.
First, comparisons between the LFA and the HCD groups in Riverside should
include only sample members determined to need basic education as of random
assignment (DIPLOMA=0).  That is, researchers should select HCDs, LFAs,  and
control group members determined to be in need of basic education when
estimating the impacts of Riverside's HCD program, or when comparing the
relative effectiveness of the LFA versus the HCD approach in Riverside.
Second, Riverside's design also affects the comparability of the HCD
research groups to other education-focused programs, particularly to the HCD
programs in Atlanta and Grand Rapids.  Researchers should select Atlanta and
Grand Rapids HCDs and control group members who had not completed high school or
received a GED certificate before random assignment (DIPLOMA=0) when
comparing results to those of the HCD approach in Riverside.
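
The Riverside selection rule above might be sketched as follows.  This is an
illustration only: COSANAL is a hypothetical merged analysis data set, and
RIVERSIDE is a hypothetical 0/1 flag that the user would derive from ALPHSITE
after checking the actual site codes in the codebook.  Only DIPLOMA=0 is taken
from this guide.

```sas
/* RIVERSIDE is a hypothetical 0/1 site flag derived by the user from ALPHSITE. */
data rivneed;
  set cosanal;                        /* hypothetical merged analysis file   */
  if riverside = 1 and diploma = 0;   /* determined to need basic education  */
run;
```

The resulting subsample would then be used for HCD-versus-control and
LFA-versus-HCD comparisons in Riverside.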

The Riverside design also has implications for calculating LFA impacts in that
site.  In Riverside, a sample member determined not to need basic education had
a 50-50 chance of becoming an LFA (because those "not in need" were not assigned
to the HCD group), whereas a sample member determined to need basic education
had only a 1 in 3 chance of becoming an LFA. Therefore, those not in need of
basic education are overrepresented among the LFAs and control group members,
and outcomes for those determined not to need basic education unduly influence
unweighted LFA-control group comparisons. Thus, the most accurate estimates of
LFA impacts use weighted averages of the outcomes for LFAs found by program
staff to be in need of basic education at baseline and LFAs who were determined
not to need basic education.   This additional weighting procedure is
required even after the sample is weighted by FIELDWGT, the standard weight
variable for the Two-Year Client Survey and COS Survey data sets.


V. VARIABLES INCLUDED ON THE FILE

The key variables on the public use file are outlined below:

1.Sample member baseline characteristics

As noted above, each sample member is identified by IDNUMBER.  Variables
available for all sample members in the Two-Year Client Survey sample
and the Full Impact sample (for example, RES2 [research group], ALPHSITE
[site]) are also available for the Child Outcomes Study sample.  They can be
accessed by merging (by IDNUMBER) the Child Outcomes Study data files with the
Full Impact Sample and 2-Year Client Survey data files.


Variables available only for Child Outcomes Study (COS) sample members include:

a) Covariates used in COS impacts models.  Demographic characteristics are
dummy-coded; measures derived from the Private Opinion Survey pertaining to
such variables as psychological well-being, social support, and barriers to
employment are trichotomized.  Covariates created by Child Trends used the
following naming convention:  CTxxxTRB, with CT indicating a Child Trends-
created variable; xxx indicating a shorthand for the particular measure (e.g.,
DEP for baseline depressive symptoms); TR denoting a trichotomous measure;
and B indicating a baseline measure. (See N2PC_CBK.TXT and N2PCVARS.TXT
for a list of covariates, with variable names and a brief description.  See
N2PCCOVA.TXT for how these composited covariates were created.)

b) Subgroups, for which impacts were run separately.  (See N2PC_CBK.TXT and
N2PCVARS.TXT  for a list of subgroups, with variable names and a
brief description.  See N2PCCOVA.TXT for how these composited subgroup variables
were created.)

For McGroder, Zaslow, Moore, & LeMenestrel (2000), researchers imputed values
for sample members with missing data on baseline characteristics and attitudes
used in creating covariates for the impacts models and in creating baseline
subgroups for examining the impacts of these six programs for lower-risk and
higher-risk families.  (Eight dichotomous composite subgroup variables were
created by combining conceptually related covariates.  See N2PC_CBK.TXT and
N2PCVARS.TXT for a list of subgroups and N2PCCOVA.TXT for how these composited
subgroup variables were created.)  Multivariate mean
substitution was used for imputing values. The imputation method and variables
used in the mean substitution were identical to the method and variables used
for Freedman et al., 2000, though an additional variable (CTRSKTRB) was used to
impute baseline covariates, with the premise that the number of family risks at
baseline (CTRSKTRB) would increase the precision of the imputation, particularly
of baseline attitudinal variables used as covariates in the Child Outcomes
Study.  (The baseline family risks counted were mothers' lack of a high
school degree or GED, mothers' low literacy, mothers' low numeracy, mothers'
limited work history, numerous and frequent depressive symptoms, a relatively
external locus of control, lacking any of three sources of support, three or
more children in the family, welfare receipt exceeding 5 years, and between 4
and 7 (of 7) family barriers to work.  CTRSKTRB = 0 for families having 0-3 of
these risks, 1 for families having 4 or 5 of these risks, and 2 for families
having 6-10 of these risks.  See N2PCCOVA.TXT for how this family risk variable
was created.)
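
The three-level coding of the family risk variable can be illustrated with a
short sketch.  NRISKS (a count of the ten risks, 0-10), RSKCHK, and the data
set names are hypothetical and are used here only for illustration; the actual
derivation of CTRSKTRB is documented in N2PCCOVA.TXT.

```sas
/* NRISKS is a hypothetical count (0-10) of the baseline family risks. */
data risk;
  set cosanal;                              /* hypothetical analysis file */
  if missing(nrisks) then rskchk = .;
  else if nrisks <= 3 then rskchk = 0;      /* 0-3 risks   */
  else if nrisks <= 5 then rskchk = 1;      /* 4 or 5 risks */
  else rskchk = 2;                          /* 6-10 risks  */
run;
```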

Covariates that have been imputed have values between 0 and 1 (or, for
trichotomous covariates, between 0 and 2), which are stored permanently on the
file.  Subgroups that have been imputed have values of either 0 (denoting
"lower-risk" families) or 1 (denoting "higher-risk" families).

Researchers who choose to substitute other values or return the measures to
missing can merge (by IDNUMBER) the original values of covariates used in the
impacts analyses for both the Child Outcomes Study report (McGroder et al.,
2000) and the full NEWWS report  (Freedman et al., 2000): namely, MARSTAT, BLACK,
GYRADC, YRKREC, YREMP, YREARN, YREARNSQ, YRREC from the Full Impact Sample File
(CD #1).


2. Survey-based Outcome Variables

Child Outcome Variables

From the continuous measures pertaining to an assessment of academic school
readiness and to maternal ratings of focal child behavior problems, positive
behaviors, and overall health, dichotomous measures were created to indicate
the proportion of children scoring in the top and bottom 25th percentiles on
these measures.  Thus, in addition to examining experimental impacts on mean
levels of a child outcome measure, impacts analyses were also conducted on these
"distributional" outcomes, to assess whether a program shifted the distribution
of scores. (See N2PC_CBK.TXT and N2PCVARS.TXT for a complete list and
description of variables.)
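
One way such "distributional" indicators could be built is sketched below.  It
is not necessarily the procedure used for the report: SCORE, LOW25, HIGH25, and
the data set names are hypothetical, and the sketch makes its own assumptions
about the cut points (unweighted, pooled across sites, with ties at the cut
point counted in the tail).  The indicators actually analyzed are already on
the file and are described in the codebook.

```sas
/* Find the 25th and 75th percentiles of a hypothetical continuous outcome. */
proc univariate data=cosanal noprint;
  var score;
  output out=cuts pctlpts=25 75 pctlpre=p;   /* creates P25 and P75 */
run;

data dist;
  if _n_ = 1 then set cuts;          /* carry P25 and P75 onto every record */
  set cosanal;
  if not missing(score) then do;
    low25  = (score <= p25);         /* 1 = bottom quarter of scores */
    high25 = (score >= p75);         /* 1 = top quarter of scores    */
  end;
run;
```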

Targeted "Intervening Mechanism" Variables

The majority of outcomes targeted by these welfare-to-work programs are included
in CD #1 and CD #2.  However, two additional outcomes were created (from two of
these key targeted outcomes) because of their potential relevance to children's
developmental outcomes:  MINWAGE2 is a dummy variable, coded 1 if the wage
reported by mothers for their current/most recent job (JHRLYPAY) was below the
minimum wage in the early 1990s ($4.25).  MANYHRS2 is a dummy variable, coded 1
if the number of weekly hours reportedly worked by mothers in their current/most
recent job (JWRK_HRS) was more than 40. (See N2PC_CBK.TXT and N2PCVARS.TXT.)
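
The two definitions above can be sketched in a DATA step for checking purposes.
MINCHK, HRSCHK, and the data set names are hypothetical, chosen so that the
file's own MINWAGE2 and MANYHRS2 are not overwritten.  How the file codes
respondents with no wage or hours report is not assumed here; this sketch
simply leaves such cases missing, so the check may not match the file exactly.

```sas
data wagechk;
  set cosanal;                       /* hypothetical merged analysis file */
  /* SAS treats a missing value as smaller than any number, so test for  */
  /* missing first; otherwise missing wages would be coded as below $4.25. */
  if not missing(jhrlypay) then minchk = (jhrlypay < 4.25);
  if not missing(jwrk_hrs) then hrschk = (jwrk_hrs > 40);
run;

/* Compare the sketch with the variables stored on the file. */
proc freq data=wagechk;
  tables minwage2*minchk manyhrs2*hrschk / missing;
run;
```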

Non-Targeted "Intervening Mechanism" Variables

These outcomes were not targeted by these welfare-to-work programs, but they may
nevertheless be affected by mothers' involvement in these programs and may have
implications for children's developmental outcomes. (See N2PC_CBK.TXT and
N2PCVARS.TXT for a complete list and description of variables.)

The user should note that, to retain the experimental design, as many of the
3,018 cases as possible should be included in impacts analyses.  Most
importantly, respondents who were appropriately skipped out of questions because
they did not apply to them should be assigned a "0" on the skipped items,
thereby retaining these cases in impacts analyses.  For example, variables
representing the use of child care while employed last month (named EMPxxxxx)
contain missing data for respondents reporting only mother care (BBCHCAR24=1)
and/or reporting they were not employed at any point in the month prior to the
survey (JEMPYN1=0).  The user seeking to run impacts on child care variables
should assign the value "0" to child care variables for these cases.  Likewise,
child support award and amount variables contain missing data for respondents
reporting that the focal child's biological father was deceased (CCPACUR=1) or
currently living in the household (CCPACUR=2).  The user seeking to run impacts
on child support variables should assign the value "0" to child support
variables for these cases.
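
A minimal sketch of the zero assignment for the child care variables follows.
It assumes that every variable whose name begins with EMP is a numeric child
care variable (users should confirm this against the codebook before relying
on the EMP: name prefix) and that COSANAL is a hypothetical merged analysis
data set.  The array name CCVARS and the index I are likewise hypothetical.

```sas
data impacts;
  set cosanal;                       /* hypothetical merged analysis file */
  array ccvars {*} emp: ;            /* all variables named EMPxxxxx (assumed numeric) */
  if bbchcar24 = 1 or jempyn1 = 0 then do;
    do i = 1 to dim(ccvars);
      /* only appropriately skipped (missing) items are set to 0 */
      if missing(ccvars{i}) then ccvars{i} = 0;
    end;
  end;
  drop i;
run;
```

The same pattern would apply to the child support variables, using the
condition CCPACUR in (1, 2) and the child support variable names listed in the
codebook.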

In addition, impacts on dichotomous survey-based outcomes with missing values
were calculated (i.e., with PROC LOGISTIC; see below) as though these missing
values were 0s. Impacts on continuous survey-based outcomes with missing values
were calculated (i.e., with PROC GLM; see below) by listwise deleting cases with
missing values on the particular outcome.  Note that missing values on outcome
measures were never "hard-coded" to 0 in the data file.
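
Because missing values were never hard-coded to 0 on the file, a user following
the approach described above for dichotomous outcomes would create an analysis
copy of the outcome.  In the sketch below, YDICHOT, Y0, and the data set names
are hypothetical.

```sas
data logit;
  set cosanal;                      /* hypothetical merged analysis file   */
  y0 = ydichot;                     /* YDICHOT: hypothetical 0/1 outcome   */
  if missing(y0) then y0 = 0;       /* analysis copy only; YDICHOT is unchanged */
run;
```

No comparable step is needed for continuous outcomes, since PROC GLM omits
observations with a missing dependent variable.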

3. Original Survey Items

Sections of the two-year follow-up survey administered only to Child Outcomes
Study families were denoted with a double letter (i.e., modules AA, BB, CC, DD,
EE, FF, GG, and HH); variable names for the original items in these sections
begin with the appropriate double letter.  Variable names for the original
items in the interviewer assessment likewise begin with "IA." (See N2PC_CBK.TXT
and N2PCVARS.TXT for a complete list and description of variables.)


VI. TUTORIAL:  REPRODUCING OUTPUT


All impacts analyses were run separately within site, selecting only the
applicable program group (b = HCD group; j = LFA group) and the applicable
control group (all Cs in Atlanta, all Cs in Grand Rapids, all Cs when assessing
the impacts of Riverside's LFA program, and only "in-need" Cs when assessing the
impacts of Riverside's HCD program).

Means and cross-site comparisons of means appearing in Chapter 5 (child
outcomes) and Chapter 8 (adult outcomes) are unadjusted but weighted and
were obtained using the following SAS programming:

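/* Replace <outcome> with the outcome variable of interest. */
PROC GLM; CLASS SITE;
MODEL <outcome> = SITE;
LSMEANS SITE/PDIFF STDERR;   /* site means, standard errors, pairwise tests */
WEIGHT FIELDWGT;             /* weighted but otherwise unadjusted */
RUN;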

Impacts analyses of continuous outcome measures used OLS regression methods
(PROC GLM, in SAS), and impacts analyses of dichotomous outcome measures used
logistic regression methods (PROC LOGISTIC, in SAS).  Logistic regression models
are fit iteratively and must converge in order to obtain reliable results.
Logistic regression models were allowed up to 100 iterations; models that did
not converge were not interpreted nor reported in impacts tables.

The SAS language used to run all impacts analyses of continuous outcomes, with
"b" used in models testing the impact of HCD programs, and "j" used in models
testing the impact of LFA programs is shown below.  NOTE that the user will not
be able to replicate exactly the impact tables in the report for the following
reason:  Focal child gender did not become available until the five-year follow-
up survey, so for the two-year analyses, child gender was "assigned" by same-
race/same-ethnicity raters based on the child's first name. This estimation
method was over 90 percent accurate.  Nevertheless, now that actual focal child
gender is available, the user will want to use this variable (FCGENDER) in
subsequent analyses.


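/* Replace <outcome> with the outcome variable.        */
/* Use j in place of b when testing an LFA program.    */
PROC GLM; CLASS b;
MODEL <outcome> = b
marstat ctnchtrb black agep gyradc yrkrec
fcgender cthsgrkb ctlitrkb ctnumrkb
ctwlftrb ctdeptrb ctloctrb ctsuprkb ctbartrb ctwrkrkb
ctrsktrb/ solution;           /* SOLUTION prints the parameter estimates */
LSMEANS b/PDIFF STDERR;       /* adjusted group means and their difference */
weight fieldwgt;
run;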


The following SAS language was used to run all impacts analyses, and print means
(probabilities, "probxxx") for each program group, on dichotomous outcomes.
Note for SAS users: Each model, testing a single outcome, must define a
separate, unique "prob" variable; otherwise, PROC MEANS may print the means
(probabilities) of a PROBXXX variable left over from a prior run.

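/* Replace <outcome> and <probXXX>; use j in place of b for LFA programs. */
PROC logistic descending;     /* DESCENDING models the probability of a 1 */
MODEL <outcome> = black agep gyradc yrkrec
fcgender marstat ctnchtrb cthsgrkb ctlitrkb
ctnumrkb ctwlftrb ctwrkrkb ctdeptrb ctloctrb ctsuprkb ctbartrb
ctrsktrb b
   /maxiter = 100;            /* allow up to 100 iterations */
weight fieldwgt; output pred=<probXXX>;   /* save predicted probabilities */

proc sort tagsort ; BY b;
proc means ; var <probXXX>; BY b; run;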


NOTE:

1) Calculations for McGroder et al. 2000 include an additional covariate
(CHAGERAD: focal child's age at random assignment, in months) that is
available only in the restricted access version of this file.

2) The covariates, MARSTAT, BLACK, AGEP, GYRADC, YRKREC were collected for all
members of the Full Impact Sample (N=44,569) and are stored in the Full Impact
Sample File (CD #1).

3) However, their values have been changed slightly to protect sample members'
confidentiality.  (See NPBCOVER.TXT for details.)  For this reason, researchers
will obtain results slightly different from those that appear in tables in
McGroder et al., 2000.


MISSING VALUES FOR COVARIATES

1) FCGENDER is missing for 17 COS respondents.  Researchers will need to impute
values for FCGENDER when using it as a covariate; otherwise, these sample
members will be dropped from the calculations due to listwise deletion. Here
are the suggested values for imputing FCGENDER, based on inferences from
reading the focal child's first name:

  IDNUMBER    FCGENDER  (1=male, 2=female)

     1153         2
     4306         2
     8669         2
    11016         2
    13305         2
    14420         2
    16762         2
    18956         1
    22806         2
    23644         2
    24073         2
    28321         2
    29258         1
    32191         2
    35016         2
    40618         1
    41433         2


2) Some covariates collected for the Full Impact Sample also have missing
values for COS respondents.  Specifically, in the model used by Child Trends,
the measure BLACK is missing for 6 respondents.  (MARSTAT, AGEP, GYRADC, and
YRKREC do not have missing values for COS respondents.)  Researchers will need
to impute values for BLACK and, possibly, for other covariates that researchers
choose to add. The measure XBLACK contains imputed values (through mean
substitution by site and level of educational attainment for the Full Impact
Sample). It is stored in CD #1.  Child Trends imputed values in a slightly
different way (as described above). The number of sample members involved
is too small for results to be affected by this minor variation in imputation
procedures.