Baseline data describe the characteristics and experiences of experimental (or demonstration) and control (or comparison) group members before the intervention occurs for the experimental or demonstration group, and before the comparable follow-up period begins for the control or comparison group. Such data may be obtained either from administrative records or from special data collection efforts.

Baseline data are critical to nonexperimental evaluations, since they are needed to control for preexisting differences between the demonstration and comparison groups. In an experimental evaluation, random assignment ensures that the experimental and control groups are the same, on average, in their background characteristics, so controlling for background characteristics is not critical to obtaining unbiased impact estimates.

However, baseline data collection merits attention in experimental evaluations for several reasons. Baseline data (1) provide a check on the integrity of random assignment, (2) improve the precision of impact estimates, (3) define subgroups for analysis, and (4) are critical to any nonexperimental analyses (for example, of welfare recidivism, which must be analyzed using nonexperimental methods since not all experimental and control group members leave welfare during the follow-up period). Baseline data may also be important sources of contact information for follow-up surveys. Nonetheless, there have been no explicit standards or requirements for baseline data collection in the federal waiver process.
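The first two uses of baseline data in an experimental evaluation (checking random assignment and improving precision) can be illustrated with a small simulation. The sketch below is not drawn from any of the state evaluations; the data are simulated, and the variable names, sample sizes, and the assumed impact of 200 are all hypothetical. It computes a standardized difference in a baseline covariate between groups as a simple balance check, and compares an unadjusted difference in mean outcomes with an estimate adjusted for the baseline covariate (an ANCOVA-style adjustment using the pooled within-group slope).

```python
import random
import statistics

TRUE_IMPACT = 200.0  # hypothetical impact on follow-up earnings


def simulate(rng, n=200):
    """Simulate one randomized sample; returns (x_t, y_t, x_c, y_c)."""
    baseline = [rng.gauss(1000.0, 300.0) for _ in range(n)]  # baseline earnings
    order = list(range(n))
    rng.shuffle(order)  # random assignment: first half of the shuffle is treated
    treated = set(order[: n // 2])
    x_t, y_t, x_c, y_c = [], [], [], []
    for i, x in enumerate(baseline):
        in_t = i in treated
        y = 0.8 * x + (TRUE_IMPACT if in_t else 0.0) + rng.gauss(0.0, 100.0)
        (x_t if in_t else x_c).append(x)
        (y_t if in_t else y_c).append(y)
    return x_t, y_t, x_c, y_c


def standardized_difference(x_t, x_c):
    """Balance check: difference in baseline means over the pooled SD."""
    pooled_sd = ((statistics.variance(x_t) + statistics.variance(x_c)) / 2) ** 0.5
    return (statistics.mean(x_t) - statistics.mean(x_c)) / pooled_sd


def unadjusted_impact(y_t, y_c):
    """Simple difference in mean outcomes."""
    return statistics.mean(y_t) - statistics.mean(y_c)


def adjusted_impact(x_t, y_t, x_c, y_c):
    """Difference in means, corrected for chance baseline differences."""
    def centered(values):
        m = statistics.mean(values)
        return [v - m for v in values]

    cx_t, cy_t, cx_c, cy_c = map(centered, (x_t, y_t, x_c, y_c))
    sxy = sum(a * b for a, b in zip(cx_t, cy_t)) + sum(a * b for a, b in zip(cx_c, cy_c))
    sxx = sum(a * a for a in cx_t) + sum(a * a for a in cx_c)
    slope = sxy / sxx  # pooled within-group slope of outcome on baseline
    return unadjusted_impact(y_t, y_c) - slope * (
        statistics.mean(x_t) - statistics.mean(x_c)
    )


rng = random.Random(0)
smd, unadj, adj = [], [], []
for _ in range(200):  # repeat the hypothetical experiment 200 times
    x_t, y_t, x_c, y_c = simulate(rng)
    smd.append(standardized_difference(x_t, x_c))
    unadj.append(unadjusted_impact(y_t, y_c))
    adj.append(adjusted_impact(x_t, y_t, x_c, y_c))

print("mean standardized baseline difference:", round(statistics.mean(smd), 3))
print("spread of unadjusted estimates:", round(statistics.stdev(unadj), 1))
print("spread of adjusted estimates:  ", round(statistics.stdev(adj), 1))
```

Under these assumptions both estimators center on the assumed impact, but the adjusted estimates vary less across replications, which is the sense in which baseline data improve precision. A large standardized difference in a real sample would be a signal to examine how random assignment was carried out.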
2. State Approaches
Of the four state evaluations reviewed that have experimental designs, three rely primarily on baseline data from administrative records:
- In California, UC DATA has built a longitudinal file with up to five years of preimplementation data on all cases in the welfare reform research sample, on the basis of data recorded in the state's Medicaid data system. Variables include participation in Medicaid, AFDC, and other programs related to Medicaid eligibility. UC DATA has also assembled more than five years of historical UI records data on employment and earnings for individuals in research sample cases. Constructing these longitudinal files required a major investment but has produced a research infrastructure that now supports a wide range of research projects. Another database records demographic characteristics of individuals from the time the case enters the research sample (generally two months after application for new applicants), using data extracted from the county-level AFDC data systems.
- In Colorado, baseline data from administrative records reflected case characteristics in the month of random assignment, although there was also a plan to obtain UI records data for a period before random assignment.
- In the Michigan TSMF evaluation, information was available from the state's AFDC data system, by person, on basic demographics and on AFDC/SFA participation in the 24 months prior to random assignment. By case, information was available on active welfare status, welfare participation before random assignment, number of children, number of adults, presence of earnings, and so forth, by month. However, little information was available on cases denied for both AFDC and SFA; in the end, these cases were dropped from the sample. The evaluator argued that this exclusion is not a concern, because the intervention largely affects whether a family is approved for AFDC versus SFA, but not whether it is denied for both.
Special forms or surveys were not used to collect baseline information in Colorado and Michigan. California supplemented the administrative data with a telephone survey, and Minnesota relied completely on an intake form:
- In California, the plan was to conduct the first telephone survey within a few months after random assignment, but the start of the survey was substantially delayed, limiting its usefulness as a baseline survey. The delay was caused in part by the time it took to obtain sample frame information from county-level automated data systems. Delays in instrument development were also a factor, as many stakeholders were involved in reviewing and adding to the instrument. In practice, the "baseline" or Wave I survey took place about a year after random assignment began for ongoing cases and has continued to lag random assignment substantially for applicant cases. Because of this, the survey results are being used only as descriptive background information on the survey sample, not to provide independent variables for the impact analysis.
- Minnesota used a special baseline data collection form, administered to all ongoing and applicant cases just before random assignment. The individual applying for assistance or subject to redetermination would meet with an intake worker to fill out the Background Information Form. The form took about 10 minutes to fill out, and the response rate was 99 percent. For those already or previously on assistance, some data on their public assistance history were entered by intake staff from the automated system. In addition, the client was given a self-administered Private Opinion Survey on issues such as barriers to work, and attitudes toward work and welfare. The response rate for the Private Opinion Survey was 83 percent.
3. Analysis and Recommendations
An ideal evaluation would combine California's pre-program longitudinal data with Minnesota's baseline information form. Not all states have the resources to do this. However, the following steps toward collecting better baseline data should receive priority.
First, we recommend that states conducting random-assignment evaluations collect at least minimal baseline information at intake; DHHS could develop a prototype form to be adapted to each state's needs. The form should be brief and should focus on basic background information, identification information for all family members, and contact information. It should be filled out by the applicant or recipient jointly with a welfare agency staff member just before random assignment and eligibility determination. Use of a baseline form would require recipients to go through random assignment at redetermination (as recommended for other reasons in Chapter IV). If possible, the staff person responsible for these forms should be someone other than the eligibility worker, and obtaining good data on these forms should be designated as a key part of this person's job.
Second, we recommend that states maintain historical data on program participation and benefits in such a way that the data may be linked to create longitudinal files. If feasible, we recommend that states create the longitudinal files. This effort could be linked to the new requirements for lifetime limits on cash assistance, which will require states to move in this direction. When historical administrative data on outcomes are available, states should use these data in their welfare reform evaluations.
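As a rough illustration of what "linked to create longitudinal files" can mean in practice, the sketch below combines monthly administrative extracts into one history per case and then counts months of assistance before random assignment. The record layout, case identifiers, and dates are hypothetical and do not reflect any state's actual data system.

```python
from collections import defaultdict

# Hypothetical monthly extracts: (case_id, "YYYY-MM", received_assistance)
monthly_records = [
    ("A1", "1995-01", True),
    ("A1", "1995-02", False),
    ("A1", "1995-03", True),
    ("A1", "1995-04", True),
    ("B2", "1995-02", True),
    ("B2", "1995-03", True),
]

# Hypothetical month of random assignment for each case
random_assignment_month = {"A1": "1995-04", "B2": "1995-03"}


def build_longitudinal(records):
    """Link monthly records into one month-by-month history per case."""
    history = defaultdict(dict)
    for case_id, month, received in records:
        history[case_id][month] = received
    return history


def months_on_assistance_before(history, case_id, ra_month):
    """Count months of receipt strictly before random assignment.

    "YYYY-MM" strings sort chronologically, so plain string comparison works.
    """
    case_history = history.get(case_id, {})
    return sum(1 for month, received in case_history.items()
               if received and month < ra_month)


history = build_longitudinal(monthly_records)
baseline_months = {
    case_id: months_on_assistance_before(history, case_id, ra_month)
    for case_id, ra_month in random_assignment_month.items()
}
print(baseline_months)
```

The key design point is that each monthly extract carries a stable case identifier and a date; without both, the monthly snapshots cannot be assembled into a history after the fact.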