With passage of the Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA) of 1996 and the expansions of the Earned Income Tax Credit (EITC) over the past decade, increasing attention has been paid to the employment experiences, labor market earnings, and transfer income received by disadvantaged individuals and households. This attention, prompted by explicit performance goals in PRWORA and implicit goals of the EITC expansions, focuses on whether low-income households can achieve self-sufficiency without resorting to Temporary Assistance for Needy Families (TANF) or other public assistance programs. Although income and employment levels are only partial indicators of the well-being of households, they continue to be the ones most often used to assess the consequences, intended and unintended, of welfare reform.
More broadly, good measures of income and employment for low-income families are necessary to (1) assess the well-being and labor market attachment of low-income and welfare populations at the national, state, and local levels; (2) evaluate welfare reform and learn the effects of specific policies, such as time limits and sanctions; and (3) meet reporting requirements under TANF and aid in the administration of welfare programs.
There are two data sources for measuring employment and incomes of the disadvantaged: survey data and administrative data. Surveys have been the mainstay of evaluating welfare programs and of monitoring changes in income and employment for decades. These include national surveys--such as the U.S. Censuses of Population, the Current Population Survey (CPS), the Survey of Income and Program Participation (SIPP), the National Longitudinal Surveys (NLS), and the Panel Study of Income Dynamics (PSID)--and more specialized surveys that gather data for targeted groups, such as current or former welfare recipients, and at the state or local level. (1) Although survey data continue to be important, the use of administrative data sources to measure income and employment has grown dramatically over the past 30 years. Data on wages and salaries from state Unemployment Insurance (UI) systems, for example, have been used to measure the earnings and employment of individuals who participated in state AFDC/TANF programs, manpower training, and other social programs. Data on earnings (and employment) from Social Security Administration (SSA) records have been linked with the records of welfare and social program participants.
What type of data one uses to measure income and employment among current and past welfare participants and welfare-eligible households may have important consequences for implementing and evaluating recent welfare reforms. Recent debates between the states and the federal government, for example, over employment targets and associated sanctions mandated under PRWORA hinged crucially on exactly how the fraction of a state's caseload that is employed would be measured. Furthermore, the conclusions of several recent assessments of the impacts of welfare reform and caseload decline appear to depend on how income and employment of welfare leavers and welfare-eligible populations are measured.(2)
In this paper we assess the strengths and weaknesses of using survey or administrative data to measure the employment and income of low-income populations. We review a number of studies, most of which have been conducted in the past 10-15 years,(3) that assess the comparability of income and employment measures derived from surveys and administrative records. Clearly the primary criterion for evaluating data sources is their accuracy or reliability. Ideally one would compare the income and employment measures derived from either surveys or administrative data sources with their true values in order to determine which source of data is the most accurate.
Unfortunately, this ideal is rarely achieved. One seldom, if ever, has access to the true values for any outcome at the individual level. At best, one can only determine the relative differences in measures of a particular outcome across data sources. In this paper, we try to summarize the evidence on these relative differences and the state of knowledge as to why they differ. These studies point to several important dimensions along which surveys and administrative records differ and, as such, are likely to account for some, if not all, of the differences in the measures of income and employment derived from each. These include the following:
- Population Coverage: Surveys generally sample the population while administrative data typically cover the population of individuals or households who are enrolled in some program. In each case issues arise about the sizes of samples at state or substate levels and sample designs that may limit the issues that can be examined.
- Reporting Units: Different data sources focus on individuals, households, tax-filing units, or case units. Differences in reporting units hinder the ability to move across data sources to obtain measures of income and complicate efforts to evaluate the differential quality of income data across data sets. Furthermore, differences in reporting units may have important consequences for the comprehensiveness of income measures, an issue especially relevant when attempting to assess the well-being, and changes in the well-being, of disadvantaged populations.
- Sources of Income: Data sources differ in the breadth of the sources of individual or household income they collect. Surveys such as the CPS and, especially, the SIPP, attempt to gather a comprehensive set of income elements, including labor earnings, cash benefits derived from social programs, and income from assets. In contrast, administrative data sources often contain only information on a single type of income (as in the case of UI earnings) or only those sources of income needed for the purposes of a particular record-keeping system.
- Measurement Error: Different data sources may be subject to different sources of measurement problems, including item nonresponse, imputation error, and measurement error with respect to employment and income (by source). Furthermore, issues such as locating respondents, respondent refusals, and sample attrition are important in conducting surveys on low-income populations.
- Incentives Associated with Data-Gathering Mechanisms: Data sources also may differ with respect to the incentives associated with the gathering of information. In the case of surveys, respondents' cooperation may depend on a comparison of the financial remuneration for a survey with the respondent "burden" associated with completing it. In the case of administrative data, the incentives relate to the administrative functions and purposes for which the information is obtained. What is important is attempting to anticipate the potential for and likelihood of biases in measures of income and employment that may result from such incentives.
The importance of the various strengths and weaknesses of different data sources for measuring employment and income generally will depend on the purpose to which these measures are put. We note five considerations. First, when conducting an experimental evaluation of a program, the criterion for judging data sources is whether they yield different estimates of program impact, which generally depends on differences in income (employment) between treatment and control groups. In this case, errors in measuring the level of income, if similar for treatment and control groups, could have little effect on the evaluation. Alternatively, suppose one's objective is to describe what happened to households who left welfare. In this case, researchers will be interested in the average levels of postwelfare earnings (or employment). We discuss results from Kornfeld and Bloom (1999) where UI data appear to understate the level of income and employment of treatments and controls in an evaluation of the Job Training Partnership Act (JTPA), but differences between the two groups appear to give accurate measures of program impacts. Depending on the question of interest, the UI data may be suitable or badly biased.
Second, surveys, and possibly tax return data, can provide information on family resources while UI data provide information on individual outcomes. When assessing the well-being of case units who leave welfare, we often are interested in knowing the resources available to the family. When thinking about the effects of a specific training program, we often are interested in the effects on the individual who received training.
Third, data sets differ in their usefulness in measuring outcomes over time versus at a point in time. UI data, for example, make it relatively straightforward to examine employment and earnings over time, while it is impossible to do this with surveys unless they have a longitudinal design.
Fourth, sample frames differ between administrative data and surveys. Researchers cannot use administrative data from AFDC/TANF programs, for example, to examine program take-up decisions because the data cover only families who already receive benefits. Surveys, on the other hand, generally have representative rather than targeted or "choice-based" samples.
Fifth, data sources are likely to have different costs. These include the costs of producing the data and implicit costs associated with gaining access. The issue of access is often an important consideration for certain sources of administrative data, particularly data from tax returns.
The remainder of this paper is organized as follows: We characterize the strengths and weaknesses of income and employment measures derived from surveys, with particular emphasis on national surveys, from UI wage records, and from tax returns. For each data source, we summarize the findings of studies that directly compare the income and employment measures derived from that source with measures derived from at least one other data source. We conclude the paper by identifying the "gaps" in existing knowledge about the survey and administrative data sources for measuring income and employment for low-income and welfare-eligible populations. We offer several recommendations for future research that might help to close these gaps.