V. Joseph Hotz and John Karl Scholz
With passage of the Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA) of 1996 and the expansions of the Earned Income Tax Credit (EITC) over the past decade, increasing attention has been paid to the employment experiences, labor market earnings, and transfer income received by disadvantaged individuals and households. This attention, prompted by explicit performance goals in PRWORA and implicit goals of the EITC expansions, focuses on whether low-income households can achieve self-sufficiency without resorting to Temporary Assistance for Needy Families (TANF) or other public assistance programs. Although income and employment levels are only partial indicators of the well-being of households, they continue to be the ones most often used to assess the consequences, intended and unintended, of welfare reform.
More broadly, good measures of income and employment for low-income families are necessary to (1) assess the well-being and labor market attachment of low-income and welfare populations at the national, state, and local levels; (2) evaluate welfare reform and learn the effects of specific policies, such as time limits and sanctions; and (3) meet reporting requirements under TANF and aid in the administration of welfare programs.
There are two data sources for measuring employment and incomes of the disadvantaged: survey data and administrative data. Surveys have been the mainstay of evaluating welfare programs and of monitoring changes in income and employment for decades. These include national surveys--such as the U.S. Censuses of Population, the Current Population Survey (CPS), the Survey of Income and Program Participation (SIPP), the National Longitudinal Surveys (NLS), and the Panel Study of Income Dynamics (PSID)--and more specialized surveys that gather data for targeted groups, such as current or former welfare recipients, and at the state or local level.(1) Although survey data continue to be important, the use of administrative data sources to measure income and employment has grown dramatically over the past 30 years. Data on wages and salaries from state Unemployment Insurance (UI) systems, for example, have been used to measure the earnings and employment of individuals who participated in state AFDC/TANF programs, manpower training, and other social programs. Data on earnings (and employment) from Social Security Administration (SSA) records have been linked with the records of welfare and social program participants.
What type of data one uses to measure income and employment among current and past welfare participants and welfare-eligible households may have important consequences for implementing and evaluating recent welfare reforms. Recent debates between the states and the federal government, for example, over employment targets and associated sanctions mandated under PRWORA hinged crucially on exactly how the fraction of a state's caseload that is employed would be measured. Furthermore, the conclusions of several recent assessments of the impacts of welfare reform and caseload decline appear to depend on how income and employment of welfare leavers and welfare-eligible populations are measured.(2)
In this paper we assess the strengths and weaknesses of using survey or administrative data to measure the employment and income of low-income populations. We review a number of studies, most of which have been conducted in the past 10-15 years,(3) that assess the comparability of income and employment measures derived from surveys and administrative records. Clearly the primary criterion for evaluating data sources is their accuracy or reliability. Ideally one would compare the income and employment measures derived from either surveys or administrative data sources with their true values in order to determine which source of data is the most accurate.
Unfortunately this ideal is rarely achieved. One seldom, if ever, has access to the true values for any outcome at the individual level. At best, one only can determine the relative differences in measures of a particular outcome across data sources. In this paper, we try to summarize the evidence on these relative differences and the state of knowledge as to why they differ. These studies point to several important dimensions along which surveys and administrative records differ and, as such, are likely to account for some, if not all, of the differences in the measures of income and employment derived from each. These include the following:
The importance of various strengths and weaknesses of different data sources for measuring employment and income generally will depend on the purpose to which these measures are put. We note five considerations. First, when conducting an experimental evaluation of a program, the criterion for judging data sources is whether they yield different estimates of program impact, which generally depends on differences in income (employment) between treatment and control groups. In this case, errors in measuring the level of income that affect treatment and control groups alike could have little effect on the evaluation. Alternatively, suppose one's objective is to describe what happened to households who left welfare. In this case, researchers will be interested in the average levels of postwelfare earnings (or employment). We discuss results from Kornfeld and Bloom (1999) where UI data appear to understate the level of income and employment of treatments and controls in an evaluation of the Job Training Partnership Act (JTPA), but differences between the two groups appear to give accurate measures of program impacts. Depending on the question of interest, the UI data may be suitable or badly biased.
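The distinction between levels and differences can be made concrete with a small numerical sketch. The earnings figures and the constant understatement below are hypothetical values chosen only for illustration; they are not taken from Kornfeld and Bloom (1999) or any other study. Under the assumption that the administrative measure misses the same amount for every person, the estimated impact is unaffected while the estimated level is biased.

```python
# Hypothetical annual earnings; none of these numbers come from a study.
true_treatment = [5200.0, 6100.0, 4800.0, 7000.0]
true_control = [4700.0, 5600.0, 4300.0, 6500.0]

# Assumed constant amount of earnings missed by the administrative source.
UNDERSTATEMENT = 900.0


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def understate(values, shortfall):
    """Apply the same level error to every observation."""
    return [v - shortfall for v in values]


# Impact = difference in group means.
true_impact = mean(true_treatment) - mean(true_control)

measured_treatment = understate(true_treatment, UNDERSTATEMENT)
measured_control = understate(true_control, UNDERSTATEMENT)
measured_impact = mean(measured_treatment) - mean(measured_control)

# Bias in the measured level for the treatment group.
level_bias = mean(measured_treatment) - mean(true_treatment)

print(f"true impact:     {true_impact:.1f}")
print(f"measured impact: {measured_impact:.1f}")
print(f"level bias:      {level_bias:.1f}")
```

If the measurement error differed between the two groups, the cancellation would no longer hold, which is why the suitability of a data source depends on the question being asked.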
Second, surveys, and possibly tax return data, can provide information on family resources while UI data provide information on individual outcomes. When assessing the well-being of case units who leave welfare, we often are interested in knowing the resources available to the family. When thinking about the effects of a specific training program, we often are interested in the effects on the individual who received training.
Third, data sets differ in their usefulness in measuring outcomes over time versus at a point in time. UI data, for example, make it relatively straightforward to examine employment and earnings over time, while it is impossible to do this with surveys unless they have a longitudinal design.
Fourth, sample frames differ between administrative data and surveys. Researchers cannot use administrative data from AFDC/TANF programs, for example, to examine program take-up decisions because the data only cover families who already receive benefits. Surveys, on the other hand, generally have representative rather than targeted or "choice-based" samples.
Fifth, data sources are likely to have different costs. These include the costs of producing the data and implicit costs associated with gaining access. The issue of access is often an important consideration for certain sources of administrative data, particularly data from tax returns.
The remainder of this paper is organized as follows: We characterize the strengths and weaknesses of income and employment measures derived from surveys, with particular emphasis on national surveys, from UI wage records, and from tax returns. For each data source, we summarize the findings of studies that directly compare the income and employment measures derived from that source with measures derived from at least one other data source. We conclude the paper by identifying the "gaps" in existing knowledge about the survey and administrative data sources for measuring income and employment for low-income and welfare-eligible populations. We offer several recommendations for future research that might help to close these gaps.
In this section, we discuss the strengths and weaknesses of measuring income and employment status for low-income populations using survey data. Most of our analysis focuses on the use of national surveys--CPS and SIPP in particular--because of the availability of several high-quality studies that compare their income and employment measures to other data sources. Where available, we also summarize studies that assess income and employment measurement with more targeted surveys.
The key features of the national surveys for the purposes of this paper are summarized in Table 9-1.
| Feature | Current Population Survey (CPS) | Survey of Income and Program Participation (SIPP) | Panel Study of Income Dynamics (PSID) | National Longitudinal Survey of Youth, 1979 |
|---|---|---|---|---|
| Nationally representative sample? | Yes | Yes | Only at sample inception in 1968 | No, but representative for cohorts covered at sample inception |
| Primary unit of analysis | Household | Household | Household | Individual |
| Longitudinal data? | No | Yes | Yes | Yes |
| Typical sample size | 60,000 households | 21,000 households | 8,700 households | 11,400 individuals |
| Capacity for state and local analysis | For all but small states | For large states only | Limited | Limited |
| Coverage of income sources | Broad | Very broad | Broad | Very broad |
| Accuracy of earnings data(a) | 97% | 92% | -- | -- |
| Accuracy of AFDC data(b) | -- | -- | | |
| Timeliness of data | Several months | 2+ years | 2-year lag | 1-2 year lag |

Notes: (a) For 1990; see Table 9-3. (b) AFDC = Aid to Families with Dependent Children; for 1990, see Tables 9-2 and 9-3.
The CPS and SIPP are vital data sets for understanding the functioning of low-wage labor markets and the effects of antipoverty programs. These data get high marks on many of the concerns mentioned in the introduction. They have a national sampling frame covering program participants and nonparticipants that makes these data valuable for developing a broad perspective on developments in low-wage labor markets. An example of this type of study is Primus et al. (1999), which uses CPS data to show that AFDC/TANF and Food Stamp Program participation rates declined considerably faster than poverty rates between 1993 and 1997. They further report that incomes of poor single mothers fell between 1995 and 1997 (after rising between 1993 and 1995), and that the safety net is lifting fewer children from poverty than in the past. Concerns arise with this study, some of which are mentioned in the text that follows. Nonetheless, the CPS and the SIPP are the only data sets that would allow analysts to address the important issues that Primus et al. examine on a national scale.
The other national data sets that have been used to analyze the employment and income status of low-income populations are the National Longitudinal Survey (particularly the National Longitudinal Survey of Youth 1979) and the PSID. Both of these data sets have the additional feature that they are longitudinal surveys so that one can obtain information on earnings and employment status over time for the same person (and household). (4) The PSID has surveyed, until very recently, its respondents and the "splitoffs" of initial respondent households on an annual basis since 1968. Similarly, until 1994 the NLSY79 conducted annual surveys of a random sample of individuals who were 14-21 years of age in 1979. Both of these surveys gathered detailed information on labor market earnings and employment status of respondents, earnings and some employment information on other adult household members, and some information on other sources of income, including income from various public assistance programs. One of the advantages of longitudinal data sets such as SIPP, PSID, and NLSY is that they allow one to monitor the entry into and exit from welfare or other social programs and the factors related to welfare dynamics, including changes in earnings and family structure.
The CPS, SIPP, and PSID, in addition to having nationally representative samples, focus on households as the unit of analysis, and include information on all adult household members.(5) Given the general presumption that families pool resources, data sets that focus on families or households (and include information on cohabiting partners) are valuable. A calculation in Meyer and Cancian (1998) illustrates the usefulness of having data on family, as well as individual, incomes. Their study examines the economic well-being of women in the 5 years after leaving AFDC. They show that in the first year upon exit from AFDC, 79 percent of the women have incomes below the poverty line, but when family income is considered, a smaller share, 55.5 percent, have income below the (correspondingly larger) poverty line. After 5 years, 64.2 percent of the women still have incomes below the poverty line, while only 40.5 percent of the broader family units have income below the poverty line.
The nationally representative surveys provide information on multiple sources of income, especially in the SIPP, either through separate questions or prompting of specific income sources. By asking specific questions about, for example, welfare receipt or food stamps, the data identify participants and (eligible) nonparticipants, so the data can be used to study program entry effects.
The national surveys also measure income and employment in a comparable fashion both over time and across geographical locations, though in January 1994 the way that earnings information was elicited in the CPS was changed (Polivka, 1997).(6)
Another strength of the nationally representative surveys is that questions can be modified to reflect changing circumstances. For example, the U.S. Census Bureau periodically conducts cognitive interviews of respondents to the CPS in order to assess how they responded to different CPS income- and welfare-related questions. Such studies are used to determine which of the CPS questions were confusing and how respondents interpreted questions. Results from these cognitive interviews are used to improve the way questions are asked, with the goal of improving the quality of the data on key variables such as income and program participation.(7) Typically, this sort of sophisticated assessment can only be done on large-scale, national surveys.
To summarize, there are several potential strengths of using survey data to measure income and employment. These include the following:
Three general concerns arise with the nationally representative surveys that keep them from being the solution, or "core" data, for understanding the effects of welfare reform. The most important issue is that sample sizes and sampling frames are such that these data cannot be used to examine certain subpopulations of interest, such as welfare recipients in a particular state (perhaps with the exception of the largest states, such as California, New York, and Texas). A distinguishing feature of welfare reform is that program responsibility now largely rests with states and even counties within a state. The nationally representative data sets do not have sample designs and sample sizes that allow analysts to examine behavior at a level that corresponds to where program decisions are being made.
Second, there appear to be systematic changes in the coverage of low-income populations in the CPS. Studies have found that AFDC and Food Stamp Program benefits and the number of recipients in the CPS have declined over time relative to estimates of participants from administrative records. This issue of coverage is a serious concern for studies that use the CPS for measuring the income of welfare populations.(8) In Table 9-2, we reproduce comparisons of aggregate AFDC/TANF and Food Stamp Program benefits between CPS and administrative data sources from the Primus et al. (1999) study. It shows there has been a sharp decline between 1990 and 1997 in the percentage of AFDC/TANF and Food Stamp Program benefits reported in the CPS compared to amounts reported in administrative data.(9) The reduction in coverage of AFDC/TANF (or family assistance) benefits also is consistent with Roemer's (2000: Table 3b) calculations from the CPS for 1990 through 1996. Interestingly, the apparent decline in AFDC/TANF coverage does not show up in the SIPP, though the SIPP appears to capture only about three-quarters of aggregate benefits.
| Year | AFDC/TANF Benefits*: CPS Data (billions $) | AFDC/TANF Benefits*: Administrative Data (billions $) | AFDC/TANF Benefits*: Ratio (%) | Food Stamp Benefits: CPS Data (billions $) | Food Stamp Benefits: Administrative Data (billions $) | Food Stamp Benefits: Ratio (%) |
|---|---|---|---|---|---|---|
| 1990 | 14.259 | 18.855 | 75.6 | 10.335 | 13.556 | 76.2 |
| 1991 | 15.554 | 20.804 | 74.8 | 12.373 | 16.551 | 74.8 |
| 1992 | 15.362 | 22.258 | 69.0 | 13.394 | 20.014 | 66.9 |
| 1993 | 17.540 | 22.307 | 78.6 | 15.010 | 22.253 | 67.5 |
| 1994 | 17.145 | 22.753 | 75.4 | 15.317 | 22.701 | 67.5 |
| 1995 | 15.725 | 21.524 | 73.1 | 14.542 | 22.712 | 64.0 |
| 1996 | 13.494 | 19.710 | 68.5 | 14.195 | 22.440 | 63.3 |
| 1997 | 10.004 | 15.893 | 62.9 | 12.274 | 19.570 | 62.7 |

Source: Primus et al. (1999:65), which in turn gives the sources as HHS and USDA administrative records and CBPP tabulations of CPS data. *Acronyms: AFDC = Aid to Families with Dependent Children; TANF = Temporary Assistance for Needy Families; HHS = Health and Human Services; USDA = United States Department of Agriculture; CBPP = Center on Budget and Policy Priorities.
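The ratio columns in Table 9-2 are the CPS aggregate divided by the administrative aggregate, expressed in percent. The short sketch below recomputes the ratios for the first and last years shown, using the CPS and administrative totals from the table; the variable and function names are ours and are used only for illustration.

```python
# (CPS total, administrative total) for 1990 and 1997, copied from Table 9-2.
# Names below are illustrative, not from Primus et al. (1999).
AFDC_TANF = {1990: (14.259, 18.855), 1997: (10.004, 15.893)}
FOOD_STAMPS = {1990: (10.335, 13.556), 1997: (12.274, 19.570)}


def coverage_ratio(survey_total, admin_total):
    """Survey aggregate as a percentage of the administrative aggregate."""
    return 100.0 * survey_total / admin_total


def ratios_by_year(table):
    """Map each year to its coverage ratio, rounded to one decimal place."""
    return {
        year: round(coverage_ratio(cps, admin), 1)
        for year, (cps, admin) in table.items()
    }


afdc_ratios = ratios_by_year(AFDC_TANF)
food_stamp_ratios = ratios_by_year(FOOD_STAMPS)

# Change in coverage between 1990 and 1997, in percentage points.
afdc_decline = coverage_ratio(*AFDC_TANF[1990]) - coverage_ratio(*AFDC_TANF[1997])

print(afdc_ratios)
print(food_stamp_ratios)
print(f"AFDC/TANF coverage decline: {afdc_decline:.1f} percentage points")
```

The same calculation applies to any survey aggregate and benchmark, although a falling ratio by itself does not distinguish between fewer recipients being captured and smaller amounts being reported per recipient.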
Polivka (1998) compares the monthly average number of AFDC recipients in the March CPS to the monthly average reported to the Department of Health and Human Services (prior to quality control). She finds there has been a modest decrease in the proportion of total months on AFDC as measured in the CPS. The ratio of the CPS estimate to the administrative count (excluding Guam, the Virgin Islands, and Puerto Rico), in percent, is 83.0 (1989), 86.7 (1990), 86.0 (1991), 82.5 (1992), 84.2 (1993), 78.5 (1994), 75.5 (1995), and 79.6 (1996). The timing of the drop in the ratio corresponds to changes in the March CPS survey instrument. Taken together, the Primus et al. (1999) and Polivka (1998) results suggest that the decline in benefits reported in the CPS results both from a reduction in the coverage of families receiving AFDC and from an underrepresentation of benefits conditional on receipt, though the second factor seems quantitatively more important than the first.
The third potential weakness of national surveys is that there is little or no "cost" to respondents of misreporting of income, employment, or other circumstances.(10)
Some specific potential weaknesses associated with the PSID and NLSY79 are of potential relevance for obtaining information on the income and employment status of low-income populations. Most notable is the fact that they are not, by design, representative of the general population over time. Both data sets began with samples that were representative of their targeted groups--young adults in the case of the NLSY79 and the national population as of 1968 in the case of the PSID--but are not designed to be representative of the national population, or even of the age group covered in the NLSY79, in subsequent years. This feature can result in biased measures of summary statistics on income and employment vis-à-vis the nation as a whole in more recent years.
The other feature of the NLSY79 and PSID relevant for assessing the income and employment status of low-income populations is their respective sample sizes. The original sample for the NLSY79 was 12,686 young men and women, of which approximately 90 percent remains today. The original sample in the PSID was 5,000 U.S. households in 1968 and, because of its growth through the accumulation of additional households through splitoffs from original households, it contained more than 8,700 households in 1995. Although these are not small sample sizes, the sizes of low-income samples at a point in time are relatively small compared to both the CPS (which contains some 60,000 households at a point in time) and most waves of the SIPP (which, in its larger waves, contains data on 21,000 households). The sizes of the low-income or welfare subsamples in the NLSY79 and PSID for even the largest states are generally too small to derive reliable measures on income and employment, let alone other outcomes.
To summarize, there are two primary potential weaknesses with using national survey data to measure income and employment of low-income populations. They are the following:
Moore et al. (1997) conducted a general survey of income reporting in the CPS and SIPP, and Roemer (2000) assesses trends in SIPP and CPS income reporting between 1990 and 1996.(11) A central finding in Moore et al. (1997) and Roemer (2000) is that there is underreporting of many types of income in surveys. The reasons for this and, hence, solutions in the design of effective surveys are complex. The magnitudes of CPS and SIPP underreporting for selected years are given in Tables 9-3a and 9-3b, taken from the two papers. (Note that differences may be the result of flawed benchmarks rather than flawed surveys.)
Surveys of Income Reporting in the SIPP and CPS
The understatement of certain types of income, such as interest and dividend receipts, is probably not critical for low-income populations because low-income families typically receive small amounts of income from these sources. Based on the evidence presented in Tables 9-3a and 9-3b, it appears that wages and salaries are fairly accurately reported in the CPS, although less accurately in the SIPP. But Moore et al. (1997) note that 26.2 percent (35,205,000 out of 134,135,000 total weighted cases) of the wage and salary "responses" in CPS surveys are imputed from cases where the respondent did not give an answer, replied "don't know," or refused to answer the question. They also report that 7 to 8 percent of households refuse to participate in the CPS, so imputation and its quality are clearly critical elements of survey quality.
The apparent accuracy of wage and salary reporting in Tables 9-3a and 9-3b does not fully resolve concerns that we have about data accuracy for low-income populations, because we do not know much about the characteristics of families that underreport their incomes. If, for example, most of the underreporting of income occurs among the disadvantaged, the findings of Moore et al. (1997) and Roemer (2000) on wage and salary reporting in the CPS and SIPP may be of little comfort. Roemer, for example, shows there are significantly more aggregate dollars reported below family income of $25,000 in the SIPP relative to the March CPS. He suggests that the SIPP does a better job than the CPS of capturing the incomes of low earners and a worse job of capturing the incomes of high earners. Learning more about the nature of underreporting would appear to be a high priority for future research.
Matching Studies of Wage and Salary Income
Roemer (2000) examines the accuracy of CPS wage and salary reports by matching CPS data to Internal Revenue Service (IRS) tax returns in selected years for the first half of the 1990s. The sample is limited to nonjoint returns and selected joint returns where each filer matches a March CPS person. The sample is restricted further to observations with no imputed wages in the CPS. He finds that in the middle of the income distribution (from $15,000 to $150,000), at least half the CPS and tax reports are within 10 percent of each other. Anywhere from 60 to 80 percent of the observations are within 15 percent of one another. Discrepancies appear much larger in the bottom and very top of the income distribution. Below $10,000 and above $150,000, at least half the observations have discrepancies exceeding 20 percent, and most are larger than that. Discrepancies are both positive and negative, though, as expected, CPS incomes tend to be larger than incomes reported on tax returns in the bottom of the income distribution, and CPS incomes tend to be smaller than incomes reported on tax returns in the top of the income distribution.
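A matching comparison of this kind reduces to computing, for each matched person, the relative gap between the survey report and the administrative report, and then tabulating the share of cases that fall within a given tolerance. The sketch below illustrates that calculation on a handful of hypothetical matched pairs. The data are invented, and measuring the gap relative to the tax report is our assumption; Roemer (2000) may define the discrepancy differently.

```python
# Hypothetical matched (survey wage, tax-return wage) pairs; not real data.
matched_pairs = [
    (20000.0, 21000.0),
    (35000.0, 35500.0),
    (8000.0, 5000.0),
    (50000.0, 44000.0),
    (15000.0, 15000.0),
]


def relative_gap(survey, admin):
    """Absolute discrepancy as a fraction of the administrative report.

    Using the administrative value as the base is an assumption made
    for this illustration.
    """
    return abs(survey - admin) / admin


def share_within(pairs, tolerance):
    """Fraction of matched pairs whose relative gap is at most `tolerance`."""
    usable = [(s, a) for s, a in pairs if a > 0]  # skip zero bases
    close = sum(1 for s, a in usable if relative_gap(s, a) <= tolerance)
    return close / len(usable)


within_10 = share_within(matched_pairs, 0.10)
within_15 = share_within(matched_pairs, 0.15)

print(f"within 10 percent: {within_10:.0%}")
print(f"within 15 percent: {within_15:.0%}")
```

Tabulating the same shares separately by income band would show whether agreement deteriorates in the tails of the distribution, as the matched CPS and tax data suggest.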
| Source of income | 1984: Indep. Estimate (billions $) | 1984: SIPP (%) | 1984: CPS (%) | 1990: Indep. Estimate (billions $) | 1990: SIPP (%) | 1990: CPS (%) |
|---|---|---|---|---|---|---|
| Employment income: | | | | | | |
| Wages and salaries | $1,820.1 | 91.4 | 97.3 | $2,695.6 | 91.8 | 97.0 |
| Self-employment | 192.6 | 103.1 | 70.2 | 341.4 | 78.4 | 66.8 |
| Asset income: | | | | | | |
| Interest | $244.8 | 48.3 | 56.7 | $282.8 | 53.3 | 61.1 |
| Dividends | 59.3 | 65.9 | 51.8 | 126.3 | 46.1 | 31.3 |
| Rents and royalties | 19.4 | 211.3 | 95.4 | 44.1 | 102.9 | 87.8 |
| Govt. transfer income: | | | | | | |
| Social Security | $160.5 | 96.2 | 91.9 | $225.5 | 98.3 | 93.0 |
| Railroad Retirement | 5.6 | 96.4 | 71.4 | 6.9 | 95.7 | 66.7 |
| SSI | 9.9 | 88.9 | 84.8 | 13.6 | 94.9 | 89.0 |
| AFDC | 13.9 | 83.5 | 78.4 | 19.7 | 70.1 | 71.6 |
| Other cash welfare | 2.0 | 135.0 | 120.0 | 2.9 | 86.2 | 80.2 |
| Unemployment Ins. | 16.3 | 76.1 | 74.8 | 17.7 | 84.2 | 80.2 |
| Workers' Comp. | 14.1 | 56.7 | 48.2 | 14.6 | 86.3 | 94.5 |
| Vets' pens. and comp. | 13.9 | 82.0 | 59.7 | 13.8 | 84.1 | 77.5 |
| Retirement income: | | | | | | |
| Private pensions | $65.2 | 63.8 | 57.2 | $70.2 | 107.1 | 110.8 |
| Federal employee pens. | 20.3 | 98.0 | 84.7 | 30.4 | 73.4 | 82.6 |
| Military retirement | 15.6 | 105.1 | 98.1 | 20.4 | 92.2 | 89.2 |
| S&L employee pens. | 21.9 | 88.1 | 71.7 | 36.1 | 75.1 | 80.1 |
| Miscellaneous income: | | | | | | |
| Alimony | $2.7 | 100.0 | 81.5 | $2.5 | 116.0 | 124.0 |

Source: These figures are adapted from Coder and Scoon-Rogers (1996).
Beyond the cited studies, there appears to be little recent work on the accuracy of the wage and salary income in the SIPP, CPS, or related national surveys.(12) The dates of the citations for American work on this topic (there also is one Canadian study) are 1958, 1970, and 1980. In each case there seemed to be a small (on the order of 5 percent) incidence of non-reporting of wage and salary income.(13) Coder (1992) compares a restricted set of SIPP households with tax data (married couples with valid Social Security numbers who file joint returns and have positive wage and salary income in either the SIPP or on tax returns) and finds a roughly 5-percent discrepancy in the existence of wage and salary income. Moore et al. (1996) examine a sample of SIPP households working for specific employers and find that respondents sometimes drop months of wage and salary receipt over a 4-month interview cycle, though virtually all accurately reported the presence of a job during the wave.
| Source of income | 1990: Indep. Estimate (billions $) | 1990: SIPP (%) | 1990: CPS (%) | 1996: Indep. Estimate (billions $) | 1996: SIPP (%) | 1996: CPS (%) |
|---|---|---|---|---|---|---|
| Employment income: | | | | | | |
| Wages and salaries | $2,727.7 | 90.1 | 95.9 | $3,592.3 | 91.0 | 101.9 |
| Self-employment | 333.5 | 85.1 | 68.5 | 475.9 | 69.1 | 52.6 |
| Asset income: | | | | | | |
| Interest | $258.5 | 56.7 | 67.1 | $187.1 | 50.2 | 83.8 |
| Dividends | 96.8 | 65.8 | 40.9 | 129.4 | 51.0 | 59.4 |
| Rents and royalties | 45.6 | 113.1 | 85.0 | 76.2 | 82.0 | 58.6 |
| Govt. transfer income: | | | | | | |
| Social Security and Railroad Retirement | $283.4 | 97.1 | 90.6 | $332.2 | 87.9 | 91.7 |
| SSI | 15.3 | 83.1 | 78.9 | 26.5 | 101.4 | 84.2 |
| Family assistance | 18.9 | 75.6 | 74.4 | 19.8 | 76.3 | 67.7 |
| Other cash welfare | 2.9 | 81.9 | 85.6 | 3.4 | 114.0 | 80.5 |
| Unemployment Ins. | 17.9 | 77.5 | 79.9 | 21.6 | 69.4 | 81.6 |
| Workers' Comp. | 15.4 | 67.8 | 89.5 | 17.0 | 71.7 | 62.7 |
| Vets' pens. and comp. | 14.5 | 83.1 | 73.9 | 17.8 | 72.9 | 89.6 |
| Retirement income: | | | | | | |
| Private pensions | $68.5 | 91.8 | 98.3 | $98.7 | 98.1 | 93.1 |
| Federal employee pens. | 30.5 | 75.9 | 82.7 | 38.8 | 75.6 | 80.8 |
| Military retirement | 21.4 | 87.4 | 85.6 | 28.3 | 101.6 | 58.2 |
| S&L employee pens. | 36.9 | 76.8 | 78.7 | 66.0 | 67.8 | 57.3 |

Source: These figures are from Roemer (2000). The independent estimates are the mean values of the implied independent estimates from the SIPP and the CPS (from Tables 2a, 2b, 3a, and 3b in Roemer, 2000).
Several other studies assess the quality of income and earnings measurement based on matching survey data with various types of administrative data. First, Bound and Krueger (1991) match CPS data from 1977 and 1978 with SSA earnings records and find essentially zero net bias in CPS income reports for those whose incomes did not exceed the SSA's earnings maximum cutoff. In fact, more than 10 percent of the CPS sample matched their Social Security reported earnings to the dollar, and 40 percent were within 2.5 percent. Second, Rodgers et al. (1993) examine wage records in the PSID for unionized men working full time at an hourly rate in one specific durable goods manufacturing firm in 1983 and 1987. These authors examine three common measures of earnings: earnings from the previous week, from the previous year, and "usual" earnings. They find annual earnings are reported fairly reliably, but this is less true for the other two measures. They also find for each measure that there is a tendency for workers with lower than average earnings to overreport and for workers with higher than average earnings to underreport.(14)
Studies of Program Participation and Transfer Income
The previous discussion focused on income reporting. There are also several studies of transfer program reporting in surveys, though the cited studies are old (dates for the citations are 1940, 1962, 1969, 1969, 1971, 1975, 1978, 1980, and 1984). These are not "complete" design studies, in that they typically focus on a sample of recipients and examine whether or not they report benefits. Complete designs also would look at nonrecipients and see if they falsely report receipt. More recent studies do the latter. Most, but not all, of these studies find fairly substantial underreporting of transfer program receipt.
Marquis and Moore (1990), using two waves of the 1984 SIPP panel, did a comprehensive study of the accuracy of reporting of transfer program participation. They discuss evidence of substantial underreporting of program participation among true program participants, on the order of 50 percent for Workers' Compensation and AFDC, 39 percent for UI, and 23 percent for food stamps and Supplemental Security Income. Overall participation rates for transfer programs, however, were quite close to what would be expected from administrative controls.
Subsequent work by Moore et al. (1996) on a sample of households from Milwaukee found smaller underreporting among true recipients, and found that most error, when it exists, is due to participants' failures to report the sources of income, rather than a failure to report all months of participation.
Bollinger and David (2001) examine food stamp underreporting in the 1984 SIPP panel in detail. They find that the high rate of underreporting for food stamps arises in part from failures to locate the person legally certified within the household. About half of the underreports within a household were offset by an overreport from another household member. The net effect was underreporting of food stamp receipt of 12 to 13 percent in the 1984 SIPP panel. Bollinger and David (2001) also document the important point that nonresponse and false answers are correlated across survey waves in the SIPP.
Finally, Yen and Nelson (1996) examine survey and administrative records from Washington state and find that 93 percent of the nearly 49,000 person-months are reported correctly, and net overreports roughly equal net underreports.
Assessment of Income and Transfer Program Reporting in National Surveys
Moore et al. (1997:12) conclude their survey of what is known about income measurement in surveys by stating that:
Wage and salary income response bias estimates from a wide variety of studies are generally small and without consistent sign, and indicators of unreliability (random error) are quite low. Bias estimates for transfer income amount reporting vary in magnitude but are generally negative, indicating underreporting, and random error also is an important problem.
They conclude, "in general we find that the additional data continue to support the conclusion of very little bias in survey reports of wage and salary income, and little random error as well." They conclude that studies that match administrative records of transfer programs and survey data "suggest a general tendency for transfer program income to be at least modestly--and in some instances substantially--under reported" (p. 16).
Based on our review of available assessments of income and employment measurement in national surveys, we think the above quotation is still correct. The CPS, SIPP, NLS, and PSID surveys provide:
We now consider the evidence on using UI wage records to measure the income and employment status of low-income populations. UI wage records contain the earnings reported by employers (on a quarterly basis) to state UI agencies for each employee. As we noted above, UI data often are linked to information on targeted samples, such as participants in evaluations of specific welfare or training programs. Thus, the populations whose income and employment are measured with UI wage data vary with the particular investigation being conducted. We report on several of these studies, attempting to draw some general conclusions about the strengths and weaknesses of this data source.
Using UI wage records to measure income and employment has several potential advantages. The first is that wages reported to state UI programs are thought to include most of the wage earnings of individuals. By law, any employer paying $1,500 in wages during a calendar quarter to one or more employees is subject to a state UI tax and, hence, must report quarterly what is paid to each employee, including regular earnings, overtime, and tips and bonuses. Agricultural employers must report earnings if they have either a quarterly payroll of at least $20,000 or have hired 10 or more employees in each of 20 or more weeks during the preceding calendar year. Employers of paid household help must report wages if they pay at least $1,000 in cash wages during any quarter. In a study of the use of UI wage records to measure the post enrollment earnings of JTPA recipients, Baj et al. (1991) claim that, "Virtually all jobs that most observers would consider appropriate targets for JTPA terminee placement are covered by the UI reporting system." (More on this study follows.)
A second potential advantage of UI wage data is their presumed accuracy. Hill et al. (1999), for example, made the following, perhaps incorrect, argument: "Employers are liable for taxes up to an earnings threshold. Because this threshold is quite low, there appears to be little incentive for employers to underreport earnings for most employees. Moreover, employers' reports are used to determine unemployment benefits. Discrepancies between employer and employee reports upon application of unemployment benefits can result in employer sanctions." Baj, Trott and Stevens (1991:10) write, "The accuracy of the reporting of money wages is unknown. However, relatively few corrections occur in the routine processing of individual unemployment insurance claims. In addition electronic payroll processing is increasing, electronic cross-matching capabilities are expanding, and new revenue quality control practices have been introduced. Thus, there is reason to think that the accuracy of UI data is higher than that of most self-reported sources of earnings information. Intentional underreporting of wages constitutes fraud, which is subject to sanctions. Unintentional misreporting is subject to penalty payments."
A third presumed advantage of using UI data to measure employment and wage income of individuals is its ready availability, at least for certain authorized studies, and the ability to link this data with information from other administrative or survey data sources. (Note that state UI authorities control access to UI wage records and the Social Security numbers necessary to link these data to other data sources for individuals, in order to safeguard the confidentiality of this information.) UI wage records are commonly used in state-level evaluations of welfare reform and other social programs. As Baj et al. (1991) conclude in their study of the feasibility of using UI wage data from different states to monitor the post training earnings outcomes of individuals who received training services in JTPA:
The findings from the first phase of this project indicate that JTPA and any other program [emphasis added] whose goal is to increase the employment and earnings of participants can use UI wage-record data with confidence. Obtaining post-program information from state UI systems is not only a viable option, it is far more cost-effective than the current practice of gathering this information through contact with participants. Furthermore, UI data are of higher quality than corresponding survey-based information. (p. 30)
They found, for example, that the response rate to the survey was 70.2 percent for those who were employed at termination compared to 49.6 percent for those who were not. Based on these results, they concluded that using UI wage data was preferred to obtaining data via surveys, especially given the cost of conducting surveys on this population.
To summarize, using UI wage records to measure income and employment has several potential strengths. These include the following:
Relying on UI wage records to measure employment and income for low-income populations has two potentially serious weaknesses. The first arises because UI wage records do not cover all forms of employment. In particular, state UI systems typically do not cover the employment of self-employed persons, most independent contractors, military personnel, federal government workers, railroad employees, some part-time employees of nonprofit institutions, employees of religious orders, and some students employed by their schools. Therefore, wage earnings from these types of employment are not contained in state UI wage records.
The importance of these exemptions is unclear. In at least two places in the literature, an assertion is made that 90 percent of workers in the U.S. economy are in jobs covered by the UI system (Baj et al., 1991; Kornfeld and Bloom, 1999).(16) As noted in the following paragraphs, this statistic is challenged by the results of Blakemore et al. (1996) and Burgess et al. (1998), but even if true, it is not clear how comforting it should be if the topic of interest is low-wage labor markets. If, for example, 8 percent of all jobs are missing from UI wage records, but all 8 percent are held by low-income workers (and thus represent a much larger fraction of all low-income workers), the usefulness of UI data in monitoring the effects of welfare reform would be severely eroded.
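The arithmetic behind this concern can be sketched in a few lines. The 8 percent figure is the illustrative one used in the text; the low-income share of the workforce (20 percent below) is purely an assumption chosen for illustration, not an estimate from the literature, and the function name is hypothetical.

```python
def share_of_low_income_missing(missing_share_all, low_income_share):
    """Fraction of low-income workers absent from UI records, assuming
    every missing job is held by a low-income worker."""
    if not 0 < low_income_share <= 1:
        raise ValueError("low_income_share must be in (0, 1]")
    # The missing jobs are concentrated in the low-income group, so the
    # economy-wide share is scaled up by 1 / low_income_share (capped at 1).
    return min(1.0, missing_share_all / low_income_share)


# Illustrative text value (8 percent) and an ASSUMED low-income share (20 percent).
missing = share_of_low_income_missing(0.08, 0.20)
print(f"Share of low-income workers missing: {missing:.0%}")
```

Under these assumed inputs, a gap that looks small for the workforce as a whole would remove a large share of exactly the population a welfare evaluation cares about.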
Blakemore et al. (1996) and Burgess et al. (1998) report results of a fascinating study of 875 Illinois employers from 1987 that were subjected to detailed audits of their UI reports. As part of the data set, routine information such as the employment size of the firm, the statutory UI tax rate for each firm, one-digit Standard Industrial Classification codes, and UI reporting punctuality were compiled. They also have unique audit information on unreported workers, underreported total and taxable wages, and UI taxes due on these unreported wages. They also merged information on the total number of independent contractors that each firm reported to the IRS. The data set does not attempt to identify employers who are part of the underground economy.
If the results for Illinois are projected nationally,(17) employers failed to report the presence of 11.1 million UI-eligible workers and $70.6 billion in wages to state UI agencies in 1987. This is 13.6 percent of all workers. Some of the undercoverage arose from failure to report casual or part-time workers, and failure to report tips, bonuses, or other types of irregular compensation. By far the largest problem (accounting for roughly 50 percent of the discrepancy), however, was with independent contractors. Issues surrounding independent contractors are among the most vexing in tax administration and labor law. In brief (and at the risk of oversimplification), in tax law there is a somewhat subjective, 20-part test to define a worker as a regular employee or independent contractor. Elements of the test include (from IRS Publication 15A: Employer's Supplemental Tax Guide) whether the business has "behavioral control" of the worker (does the business give instructions and train the worker?); financial control (can the worker make a profit or loss, does the worker have unreimbursed business expenses, or does the worker make services available to a broad market?); and type of relationship (does the job have benefits, is it permanent, are the tasks a key aspect of the regular business of the company?). If a worker is treated as an independent contractor, an employer does not have to withhold income taxes, withhold and pay Social Security and Medicare taxes, or pay UI taxes.
It is not clear if the issues raised in the Illinois UI audits are associated strictly with independent contractors (in the technical sense) or more broadly with flexible staffing arrangements. Houseman (1999) provides a nice introduction to issues associated with flexible staffing arrangements. She reports data from the February 1997 CPS Supplement on Contingent and Alternative Work Arrangements showing that 6.7 percent of workers were "independent contractors," 1 percent were "agency temporaries," 1.6 percent were "on-call or day laborers," 0.6 percent were "contract company workers," and 2.6 percent were "other direct-hire temporaries." These categories compose 12.5 percent of the workforce. The use of flexible staffing arrangements appears to have been growing sharply over time, but detailed information on its growth is not available. Houseman (1999) reports that the IRS estimates it loses billions in tax revenues each year due to misclassification of employees.
Houseman (1999) also reports information on the incomes of "flexible workers" drawn from the February 1995 CPS Supplement on Contingent and Alternative Work Arrangements, matched to the March 1995 CPS. Of "regular" employees, 7.5 percent had incomes below 125 percent of poverty. The corresponding figures were 21.7 percent for agency temporaries; 16.2 percent for on-call or day laborers; 10.8 percent for independent contractors; 11.5 percent for contract company workers; and 15.1 percent for other short-term direct hires. Consequently, a failure of UI data to fully capture workers in flexible staffing arrangements could be a major problem for studies that rely exclusively on UI data to measure the income and employment of low-income workers.
In many industries, employers have considerable flexibility in designating the status of workers. At least in the Illinois audit study, employers aggressively overused the independent contractor designation. In all, 45 percent of employers made some underreporting error. This included nearly 500,000 cases in which workers were excluded erroneously, which resulted in $2.6 billion in wages being underreported. Smaller firms were estimated to underreport 14 percent of their taxable wages and 56 percent of their UI-covered workforce. In statistical models, the percentage of workers on the payroll who are independent contractors and the turnover of the firms' workers are two key explanatory variables. The effective tax rate, while related to turnover, also appears to be positively associated with compliance. The characteristics of firms that make errors on UI reports would appear to be positively correlated with the type of employers who disproportionately hire workers with low levels of human capital.
Hence, we view the Blakemore et al. (1996) and Burgess et al. (1998) studies as raising a serious concern about the coverage of UI data, and hence its suitability as the exclusive source of data with which to evaluate welfare reform. In our conclusions, we recommend that at least one additional study be conducted along the lines of the Illinois study to assess UI coverage. It is our impression, based on casual, anecdotal evidence, that the use of independent contractors has increased fairly substantially over time, and thus the work based on 1987 Illinois data may understate the problem.
The second potentially major weakness with using UI data for evaluating welfare reform is that they contain limited accompanying demographic information on individuals, and, more importantly, may not allow one to form an accurate measure of family income. In assessing the impacts of welfare reform, many argue that it is important to assess how these changes affect the well-being of children and the families in which they reside. As such, families constitute the natural "unit of analysis" for such assessments and family income often is used as an indicator of this unit's well-being.
The potential problem of relying on earnings data from UI wage records when the objective is to assess the level of family resources in studying the impact of welfare reform recently has been highlighted by Rolston (1999). Based on past research, Rolston notes that changes in individual income account for only 40 to 50 percent of exits from welfare. Thus, to have a complete picture of the effects of welfare reform, analysts need information on other economic and demographic changes occurring in the family. Given this, the problem is clear. Income as reported through UI records fails to include sources of nonemployment income and income of partners that is available to a family. Income sources that are not in UI data may result in a family not receiving cash assistance or being ineligible.
The calculations from Meyer and Cancian (1998) suggest the concern raised by Rolston (1999) is economically important. Recall that Meyer and Cancian found, for example, that 5 years after leaving welfare, 64.2 percent of the women still have incomes below the poverty line, while, when considering the broader family unit, only 40.5 percent have income below the poverty line. In a related calculation, however, Primus et al. (1999) do an analysis that shows "for most single-mother families, including the income of unrelated male individuals does not materially change the picture drawn of a decline in overall disposable income between 1995 and 1997." More needs to be learned about the importance of the issue raised by Rolston in assessing the level and trend in family well-being following welfare reform.
To summarize, using UI wage records to measure income and employment has two potential weaknesses. These are as follows:
In this section, we review two sets of studies that make direct comparisons of income and employment measurements across several data sources for the same individual and/or family. We first consider the results of a comparison of measures of income and employment gathered from UI records in 11 states and from a survey for a sample of 42,564 adults who left JTPA programs during the 1986 program year. The findings from this study are described in Baj et al. (1991) and Baj et al. (1992).(18) Of those terminees, 27,721 responded to all three of the questions that were mandatory for "terminees" of JTPA-sponsored programs, giving an overall response rate of 65.1 percent. The investigators had access to separate data files containing UI wage records for the full sample of terminees, where the latter information was drawn from the UI systems for the 11 Midwestern states included in this study. Baj et al. (1991) drew the following conclusions about estimating the post enrollment incomes of these JTPA terminees with these two alternative data sources:
There are two major conclusions to be drawn from these analyses. First, there is ample evidence to suggest that the post-program survey data is substantially affected by the presence of non-response bias. While this conclusion is based largely on the examination of post-program employment experiences, it is suspected that the same conclusion would hold if the focus was on post-program earnings. The second conclusion is that the major source of this bias, i.e., the different post-program employment experiences of respondents and non-respondents who were employed at termination, is not addressed through current non-response adjustment procedures. The implication of these findings is that the estimates of post-program performance based on the information gathered through the post-program survey are not a true reflection of the actual post-program experiences of all JTPA terminees. (p. 35)
The survey they examined was not constructed in a way that allows comparisons of earnings reports. Instead, the presence of employment in a given quarter was compared across the survey and UI data. To do this, they sharply restrict the sample to people leaving Title II-A (JTPA) a week prior to the week containing the starting date of a fiscal quarter. For data reasons three states also were dropped from the sample.(19) This left 1,285 participants, of which 863 responded to the survey. Even with these sample restrictions, employment comparisons are not completely straightforward because UI earnings are reported for the quarter in which they are paid, not the quarter in which they are earned. With these issues in mind, Table 9-4 shows the result of the comparisons.
The diagonal elements in Table 9-4 show that 81.7 percent (72.8 percent + 8.9 percent) of the UI-survey observations are in agreement on employment status. The lower off diagonal element indicates that 5.1 percent of the matched sample report that they were unemployed during the quarter, yet they had UI earnings. One might think welfare recipients would be reluctant to report earnings, but they were only slightly (5.4 percent) more likely to not report earnings (when they had positive UI earnings) than nonrecipients (4.4 percent). This result has two potential explanations. First, respondents may have earned the UI wages reported for the quarter during the previous quarter and subsequently lost their jobs. Second, respondents may have provided inaccurate reports. Given that many of these 44 cases were employed at the time they left JTPA, Baj et al. (1991) suggest the second explanation is more likely than the first.
| Post Program Survey Status | First Quarter UI Status: Employed | First Quarter UI Status: Unemployed | Total |
|---|---|---|---|
| Employed | 628 (72.8%) | 114 (13.2%) | 742 (86%) |
| Unemployed | 44 (5.1%) | 77 (8.9%) | 121 (14%) |
| Total | 672 (77.9%) | 191 (22.1%) | 863 (100%) |
| Source: Baj et al. (199:39). | | | |
The upper off-diagonal element shows that 13.2 percent of the sample report being employed yet have no UI wages.(20) Again, it is possible that the timing of UI wage reports can partially account for this discrepancy, though most of these people were employed at the time they left JTPA, so this again is an unlikely explanation. Instead, it is likely that some of these people were employed out of state, and that others had jobs that were not covered by UI.(21)
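The agreement and disagreement rates quoted from Table 9-4 follow directly from the four cell counts. The short sketch below recomputes them; the variable and function names are ours, and only the counts come from the table.

```python
# Cell counts from Table 9-4, keyed by (survey status, UI status).
counts = {
    ("employed", "employed"): 628,
    ("employed", "unemployed"): 114,
    ("unemployed", "employed"): 44,
    ("unemployed", "unemployed"): 77,
}

total = sum(counts.values())  # matched survey respondents


def pct(n):
    """Share of the matched sample in percent, rounded to one decimal."""
    return round(100 * n / total, 1)


# Agreement: both sources report the same status (the diagonal cells).
agreement = pct(sum(n for (survey, ui), n in counts.items() if survey == ui))
# Survey reports employment but there are no UI wages (upper off-diagonal).
survey_only = pct(counts[("employed", "unemployed")])
# UI wages exist but the respondent reports no employment (lower off-diagonal).
ui_only = pct(counts[("unemployed", "employed")])

print(f"agreement: {agreement}%  survey only: {survey_only}%  UI only: {ui_only}%")
```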
Baj et al. (1992) update the Baj et al. (1991) calculations and provide more detail on the potential sources of discrepancy between UI data and the survey that was administered. In 1987, 11.3 percent of the sample report being unemployed for the quarter but have UI data (the corresponding figure from the earlier study was 5.1 percent), and 9.1 percent have no UI record but report they are employed (the corresponding figure from the earlier study was 13.2 percent). Baj et al. (1992) discuss three possible reasons to explain cases that claim to be employed but show no UI record.(22) The respondent may have been employed out of state; may have been employed in the quarter but not paid until the next quarter; or may have been employed in a job not covered by UI or by an employer that failed to report UI wages.
| Reason for Mismatch | Number of Cases | Percent |
|---|---|---|
| Employed out of state | 517 | 15.3 |
| Self-employed | 51 | 1.5 |
| Federal employment | 172 | 5.1 |
| Within program UI record | 81 | 2.4 |
| 1st-quarter UI record | 608 | 18.0 |
| 2nd-quarter UI record | 93 | 2.7 |
| No related UI record | 1,865 | 55.0 |
| No UI record | 1,325 | 39.1 |
| Mismatched employers | 540 | 15.9 |
| Total | 3,387 | 100.0 |
| Source: Baj et al.(1992:142). | ||
To look at these factors, the authors used data from the Illinois JTPA management information system, which gives detailed information on the employment status at termination of the program and compares that to UI status at termination. The analysis focuses on 3,387 cases (13.1 percent of the sample) that reported that JTPA participants were employed at termination, but there was no UI record for the termination quarter. Table 9-5 suggests some explanations for the mismatches (at the termination quarter). The table shows that out-of-state employment accounts for 15.3 percent of the discrepancies (line 1). Identifiable employment in uncovered (self-employed and federal appointments) sectors accounts for 6.6 percent of the discrepancy (lines 2 and 3). The next three rows of the table--the within, first-quarter, and second-quarter UI entries--are supposed to reflect timing differences in the data. Collectively these account for 23.1 percent of the discrepancy (lines 4, 5, and 6). Another 15.9 percent of the discrepancies seem to result from name mismatches between employers that could be reconciled fairly easily. This still leaves 39.1 percent of the mismatched cases unexplained. This group of 1,325 participants worked for 1,108 different employers. The potential explanations for the discrepancy that Baj et al. (1992) offer include: errors in reporting the Social Security number on the JTPA or UI data systems, an employer's neglect of UI reporting requirements, and reporting errors by JTPA operators.
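The grouping of Table 9-5 rows into the explanations discussed above can be reproduced as follows. The counts are taken from the table; the group labels and variable names are ours and are only a convenient way to organize the rows.

```python
# Row counts from Table 9-5 (the "No related UI record" row is the sum of
# "No UI record" and "Mismatched employers", so it is not listed separately).
mismatches = {
    "Employed out of state": 517,
    "Self-employed": 51,
    "Federal employment": 172,
    "Within program UI record": 81,
    "1st-quarter UI record": 608,
    "2nd-quarter UI record": 93,
    "No UI record": 1325,
    "Mismatched employers": 540,
}

# Explanations as grouped in the text (labels are illustrative).
groups = {
    "out of state": ["Employed out of state"],
    "uncovered sectors": ["Self-employed", "Federal employment"],
    "timing": ["Within program UI record", "1st-quarter UI record",
               "2nd-quarter UI record"],
    "employer name mismatch": ["Mismatched employers"],
    "unexplained": ["No UI record"],
}

total_cases = sum(mismatches.values())

# Percent of all mismatched cases attributed to each explanation.
group_shares = {
    label: round(100 * sum(mismatches[row] for row in rows) / total_cases, 1)
    for label, rows in groups.items()
}

for label, share in group_shares.items():
    print(f"{label:>24}: {share}%")
```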
Baj et al. (1991) and Baj et al. (1992) examine the existence of employment in survey and UI data, but do not provide comparisons of earnings as their survey did not elicit information on earnings. Kornfeld and Bloom (1999) look at both employment and earnings. They describe their study as attempting "to determine whether wage records reported by employers to state unemployment insurance agencies provide a valid alternative to more costly retrospective sample surveys of individuals as the basis for measuring the impacts of employment and training programs for low-income persons" (p. 168). Kornfeld and Bloom (1999) is based on data covering 12,318 people from 12 sites around the country in which an experimental evaluation of JTPA training programs was conducted. For each site, they had access to data from both UI wage records and follow-up surveys of experimental (received JTPA services) and control (did not receive JTPA services) group members. In their analysis, they dropped observations with missing or imputed data, but included observations where earnings were recorded as zeros in the follow-up surveys.
Another, and slightly different, comparison of measurement of employment status and wage income across two different data sources for a sample of individuals who were provided access to JTPA services is found in Kornfeld and Bloom (1999). They assess how UI and survey data differ, where the latter was conducted as part of the National JTPA Study, in estimating the levels of earnings and the differences in mean earnings and employment rates between experimental and control group members, where control group members were denied access to JTPA services. Although the primary objective of Kornfeld and Bloom (1999) is to assess how the estimated impacts of JTPA services on income and employment status vary by data source--they found virtually no difference in the estimates of impact by data source--we shall focus on what they found with respect to differences in levels of earnings across the two sources of income and employment data available to them.
Table 9-6, drawn from their study, shows that employment rates calculated from the two data sources are quite close. The discrepancies between employment data derived from their survey versus from UI records range anywhere from employment being 1 percent lower in surveys to being 11 percent more. At the same time, Kornfeld and Bloom find that the discrepancies in the level of earnings for JTPA participants are much greater. In particular, they consistently find that the level of earnings from survey data is higher than those found in UI data. The nature of this discrepancy in earnings measures is different from the one raised in Rolston (1999). Recall that Rolston is concerned that using UI wage data to measure the earnings of welfare leavers tends to be biased because such data do not include the income of other family members. Rolston argues that this lack of inclusion of the earnings of other family members is important given evidence that suggests that many exits from welfare are coincident with changes in family structure. The comparison in Table 9-6 from Kornfeld and Bloom (1999) focuses on only earnings reports for individuals. It documents systematic discrepancies of UI and survey data, where income reported by UI data is always substantially lower (in one case, by half) than that reported in survey data. Because the employment rates are comparable, Kornfeld and Bloom conclude that the earnings differences must reflect either differences in hours of work for JTPA participants who are recorded as being employed in a quarter, differences in the rate of pay recorded for this work, or both.
| | Treatment Earnings ($) | Control Earnings ($) | Treatment Employment Rate | Control Employment Rate |
|---|---|---|---|---|
| Adult women (4,943; 18,275; 8,916) | ||||
| Survey data | $1,294 | $1,141 | 59.2% | 54.5% |
| UI data | $1,048 | $922 | 57.6% | 54.1% |
| Ratio (survey/UI) | 1.23 | 1.24 | 1.03 | 1.01 |
| Adult men (3,651; 13,329; 6,482) | ||||
| Survey data | $1,917 | $1,824 | 65.8% | 63.5% |
| UI data | $1,456 | $1,398 | 61.7% | 60.7% |
| Ratio (survey/UI) | 1.32 | 1.30 | 1.07 | 1.05 |
| Female youth (2,113; 9,452; 4,316) | ||||
| Survey data | $951 | $949 | 51.3% | 50.6% |
| UI data | $701 | $700 | 50.6% | 51.2% |
| Ratio (survey/UI) | 1.36 | 1.36 | 1.01 | 0.99 |
| Male youths without a prior arrest (1,225; 5,009; 2,442) | ||||
| Survey data | $1,556 | $1,655 | 65.5% | 69.3% |
| UI data | $1,015 | $1,103 | 61.3% | 63.2% |
| Ratio (survey/UI) | 1.53 | 1.50 | 1.07 | 1.10 |
| Male youths with a prior arrest (386; 1,646; 705) | ||||
| Survey data | $1,282 | $1,531 | 58.0% | 61.3% |
| UI data | $759 | $760 | 52.8% | 55.0% |
| Ratio (survey/UI) | 1.69 | 2.01 | 1.10 | 1.11 |
| Source: Kornfeld and Bloom (1999), Tables 1 and 2. Numbers after each panel heading give the number of persons represented (for example, 4,943 adult women), followed by the number of person-quarters in the treatment and control groups. | ||||
Kornfeld and Bloom (1999) also condition on whether a JTPA participant was receiving AFDC benefits during a particular quarter and find that, while the level of earnings is lower, the discrepancy between survey and UI data is strikingly similar. Survey earnings reports for adult women and female youth are 24 to 34 percent higher than reported UI earnings levels. There was also wide variation across JTPA sites in the size of earnings discrepancies between survey and UI data, but the survey always yielded larger numbers than did the UI data. The "ratio range" was 1.15 to 1.40 for adult women, 1.16 to 1.72 for adult men, 1.16 to 1.76 for female youth and even larger for male youth. Whatever the mechanism is generating these discrepancies, it exists across all 12 geographically diverse JTPA sites.
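The survey/UI earnings ratios in Table 9-6 are simple quotients of the mean earnings in the two sources. The sketch below recomputes them from the table's dollar figures; the data structure and function name are ours.

```python
# Mean quarterly earnings from Table 9-6 as (treatment, control) pairs.
table_9_6 = {
    "Adult women": {"survey": (1294, 1141), "ui": (1048, 922)},
    "Adult men": {"survey": (1917, 1824), "ui": (1456, 1398)},
    "Female youth": {"survey": (951, 949), "ui": (701, 700)},
    "Male youths without a prior arrest": {"survey": (1556, 1655), "ui": (1015, 1103)},
    "Male youths with a prior arrest": {"survey": (1282, 1531), "ui": (759, 760)},
}


def survey_to_ui_ratios(row):
    """Return (treatment, control) ratios of survey to UI mean earnings."""
    return tuple(round(s / u, 2) for s, u in zip(row["survey"], row["ui"]))


ratios = {group: survey_to_ui_ratios(row) for group, row in table_9_6.items()}

for group, (treatment, control) in ratios.items():
    print(f"{group}: treatment {treatment:.2f}, control {control:.2f}")
```

Every ratio exceeds 1, which is the pattern the text describes: mean survey earnings are higher than mean UI earnings in each group.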
The dispersion of earnings discrepancies is very large, so the means mask large variations across earnings reports. We do not know, of course, which measure of earnings more closely resembles the truth. If survey data tend to be more accurate, however, the discrepancies shown in Table 9-7 would give one pause in using UI data to assess the economic well-being of families following welfare reform. The table shows that more than 10 percent of women and 20 percent of men have discrepancies that exceed $1,000 in a quarter.(23)
| Mean Survey - Mean UI | Adult Women | Adult Men | Female Youth | Male Youth No Arrest | Male Youth With Arrest |
|---|---|---|---|---|---|
| $2,001 or more | 3.5 | 9.9 | 2.0 | 8.7 | 9.3 |
| $1,001 to $2,000 | 7.7 | 11.1 | 7.7 | 15.0 | 16.1 |
| $601 to $1,000 | 7.9 | 9.4 | 9.1 | 12.4 | 13.2 |
| $401 to $600 | 6.8 | 5.9 | 8.3 | 8.2 | 7.3 |
| $201 to $400 | 10.4 | 8.6 | 13.0 | 11.9 | 9.6 |
| $1 to $200 | 17.3 | 12.7 | 20.6 | 13.2 | 14.8 |
| $0 | 14.3 | 8.1 | 10.0 | 3.8 | 5.7 |
| -$1 to -$200 | 16.2 | 13.2 | 17.3 | 10.8 | 11.1 |
| -$201 to -$400 | 6.2 | 6.2 | 5.3 | 5.6 | 6.0 |
| -$401 to -$600 | 3.3 | 4.2 | 3.3 | 3.4 | 1.6 |
| -$601 to -$1,000 | 3.4 | 4.6 | 2.0 | 4.5 | 3.4 |
| -$1,001 to -$2,000 | 2.3 | 4.1 | 1.1 | 1.7 | 1.6 |
| -$2,001 or less | 0.9 | 2.0 | 0.2 | 0.8 | 0.5 |
| Mean diff ($) | 228 | 451 | 256 | 547 | 605 |
| Source: Table 5 from Kornfeld and Bloom (1999). | | | | | |
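The shares of large discrepancies can be read off the tails of Table 9-7. The sketch below sums the two top and two bottom rows for each group, treating the table entries as percentages of each group's sample; the dictionary layout and names are ours.

```python
# Tail rows of Table 9-7 per group, in percent:
# (survey higher by over $2,000, survey higher by $1,001-$2,000,
#  UI higher by $1,001-$2,000, UI higher by over $2,000)
table_9_7_tails = {
    "Adult women": (3.5, 7.7, 2.3, 0.9),
    "Adult men": (9.9, 11.1, 4.1, 2.0),
    "Female youth": (2.0, 7.7, 1.1, 0.2),
    "Male youth, no arrest": (8.7, 15.0, 1.7, 0.8),
    "Male youth, with arrest": (9.3, 16.1, 1.6, 0.5),
}

# Share with survey earnings exceeding UI earnings by more than $1,000.
survey_higher = {g: round(t[0] + t[1], 1) for g, t in table_9_7_tails.items()}
# Share with a gap of more than $1,000 in either direction.
either_direction = {g: round(sum(t), 1) for g, t in table_9_7_tails.items()}

for group in table_9_7_tails:
    print(f"{group}: survey higher {survey_higher[group]}%, "
          f"either direction {either_direction[group]}%")
```

Counting only cases where the survey figure is higher already gives more than 10 percent of adult women and more than 20 percent of adult men; including cases where UI earnings are higher raises both shares further.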
Kornfeld and Bloom (1999) also examine those JTPA participants for whom they found positive earnings in one data source but not the other. "When only the survey reported employment (and UI data presumably missed it), mean earnings were more than twice what they were when only UI data reported employment (and the surveys presumably missed it). This suggests that surveys are more likely to miss 'low earnings' quarters, perhaps because respondents forget about minor, or short-term, jobs. In contrast, UI data appear more likely to miss 'average earnings' quarters--where mean earnings are similar to when both data sources report employment. This might be due to random errors in matching UI wage records, out-of-state jobs, jobs that are not covered by UI, and/or earnings that are 'off the books.'" (p. 184)
The above-noted discrepancies could arise between the data sources because some jobs are uncovered, some jobs may be located out of state, some payments may go unreported because of unintentional or intentional noncompliance, or Social Security numbers may be misreported. To provide further insight, Kornfeld and Bloom compare the earnings reports that employers make about their employees to state UI systems with those they make to the IRS. Although employers have an incentive to underreport earnings to the UI system (and hence avoid paying UI taxes), they have no incentive to conceal earnings when reporting to the IRS, because wages are a business expense that will lower tax payments. The sample for doing this comparison is smaller than the previous samples because each observation needs to be there for 4 consecutive quarters, corresponding to the calendar year. The ratio of mean IRS earnings to mean UI earnings ranged from 1.14 for adult women to 1.25 for male youth, so UI wage records clearly are missing earnings from some jobs.
Kornfeld and Bloom draw the following conclusions from their investigation.(24) Judging from the IRS data, approximately half of the survey-UI earnings difference reflects earnings that are missing from UI wage records. Out-of-state jobs do not explain why UI wage records reported lower earnings than sample surveys. Uncovered jobs account for only a small part of the survey/UI earnings difference. There is little evidence consistent with recall bias in the survey data. There is no evidence that large survey discrepancies result from survey reports of "unusually good" jobs or odd reports of industry of employment. Survey discrepancies also do not appear to be driven by overtime or odd pay periods.
From the direct comparisons of the data sources used to measure income and employment status found in the studies reviewed above, we draw the following tentative conclusions about the differences between using survey versus UI data:
Our review of the literature has pointed to three critical concerns that arise with using UI data to measure the earnings and employment of low-income and welfare-eligible populations. The concerns are as follows:
Although not widely used in past evaluations, wage and salary data from federal and state income tax returns represent an alternative to UI data for measuring the income and employment of low-income populations. Here we outline the potential strengths and weaknesses of these data sources and briefly summarize a recent comparison of UI wage and tax return data for a disadvantaged population drawn from the AFDC caseload in California.
Compared to using surveys or UI records, using tax return data for measuring income and employment has at least two potential advantages. These are the following:
The data are accurate. Taxpayers provide information under the threat of audit and there is third-party information reporting, so employers as well as recipients are reporting wage and salary information.
The definition of income that is reported is broader than that provided by unemployment insurance data, including, most importantly, self-employment income and, in cases where a person is married and files a joint return, spousal income.(25)
Several potential weaknesses are associated with using tax return data to measure income and employment. Some of these weaknesses apply to the general population, while others are more relevant for low-income populations. First, access by researchers to tax return data is extremely limited because of Section 6103 of the Internal Revenue Code. Section 6103 explicitly states that tax data cannot be released, except to organizations specifically designated in Section 6103(j). The exceptions are the Department of Commerce (but only as it relates to the Census and National Income Accounts), the Federal Trade Commission, the Department of the Treasury, and the Department of Agriculture (for conducting the Census of Agriculture). Penalties for unauthorized disclosure are severe, including jail terms of up to 5 years.
Second, tax return data also contain only limited information on demographic characteristics of taxpayers. For example, the tax system does not collect information on the race or education of tax filers.
Third, tax-filing units differ from both families and individuals. Married couples can file either a joint return or separate returns (as "married filing separate"). Cohabiting couples, even if fully sharing resources, will file separate returns as individuals or head of household (generally meaning the filer is a single parent with dependents). In general we believe families pool resources so families are the best unit of analysis for assessing economic well-being. Hence, case units probably are the most useful unit of analysis.
Fourth, there also are differences between tax return data and other data sources in the frequency of reporting. Unemployment insurance wages are reported quarterly. Transfer program information is reported monthly. Tax returns are filed annually. Because shorter periods can be aggregated into longer ones and there can be major changes in family composition over time, the annual frequency of tax reporting is less appealing than monthly or quarterly reporting in other data sets. To the extent that family structure changes over these intervals, problems may arise when trying to link different data sets to assess well-being.
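The point that shorter reporting periods can be aggregated into longer ones can be shown with a small sketch. The record layout and the function `annual_ui_wages` are assumptions made for illustration and do not describe any state's UI file format.

```python
from collections import defaultdict

# Hypothetical quarterly UI wage records: (person_id, year, quarter, wages).
ui_records = [
    ("p1", 1993, 1, 900.0),
    ("p1", 1993, 2, 1100.0),
    ("p1", 1993, 4, 700.0),
    ("p2", 1993, 3, 2500.0),
    ("p1", 1994, 1, 1200.0),
]


def annual_ui_wages(records):
    """Sum quarterly UI wages to calendar-year totals per person."""
    totals = defaultdict(float)
    for person_id, year, _quarter, wages in records:
        totals[(person_id, year)] += wages
    return dict(totals)


annual = annual_ui_wages(ui_records)
for (person_id, year), wages in sorted(annual.items()):
    print(person_id, year, wages)
```

The reverse operation is not possible: an annual tax return total cannot be split back into quarters or months, which is why annual data are harder to align with changes in family composition during the year.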
A fifth concern relates to the incidence and accuracy of tax filing by individuals and households, especially among low-income populations. This concern takes two forms: (1) whether people file any tax return, and (2) if they file, whether they report all sources of income to the IRS (or state taxing authorities).(26) We consider each in turn.
If large fractions of low-income taxpayers do not file tax returns, then tax return data have very limited value. Unfortunately, there is not a lot of information on the filing propensities of people with low income. Information from the early 1990s (Scholz, 1994) suggests that 14 to 20 percent of those entitled to the earned income tax credit at the time failed to receive it, meaning that they failed to file tax returns.(27) Later, we discuss one recent study on the tax filing propensities of a low-income population that sheds some preliminary light on this issue.
Among filing units, it is also possible that members do not report all of their sources of income on their tax returns. For example, individuals may fail to report income received as independent contractors. Although firms or individuals who use independent contractors are obligated to report payments to such contractors to the IRS, failures to do so generally are difficult to detect. Again, we know little about the incidence of underreporting of various income sources for low-income populations.
To summarize, using tax return data to measure income and employment has several potential weaknesses. These are the following:
In a recent study of the EITC for a sample of assistance units on the California caseload, Hill et al. (1999) compared UI wage data with linked data from the sample members' IRS tax returns. The study used data from the California Work Pays Demonstration Project (CWPDP), which was conducted in four counties (Alameda, Los Angeles, San Bernardino, and San Joaquin) starting in 1992. The data consisted of two sets of assistance units drawn from the caseloads in these counties. One set, which is used for the sample in Table 9-8, consisted of a random sample drawn from a caseload at a particular date in 1992. Although this sample is representative of the caseload at that time, recall that the study by Bane and Ellwood (1983) showed that random samples from the existing caseload of AFDC are disproportionately made up of assistance units that are "welfare dependent."
The second set of assistance units, which is the sample used for Table 9-9, is a random sample of new entrants to the caseload in 1993. Bane and Ellwood (1983) and others have found that a significant proportion of new entrants remain on welfare for only a relatively short period.(28) Furthermore, Gritz and MaCurdy (1991) find that most new entrants exit from AFDC to employment. We also break both samples up into female-headed households (Aid to Families with Dependent Children-Family Group [AFDC-FG] cases) and two-parent households (AFDC-U). We report on annual earnings information for the year after the samples were drawn, that is, 1993 for the random sample of the caseload and 1994 for the new entrants sample.(29)
The first two lines of each panel of each table give estimates of the employment rates of each sample of AFDC recipients. As expected, employment rates of the point-in-time caseload (Table 9-8) are lower than those of the sample of new entrants (Table 9-9). Employment rates of one-parent cases (AFDC-FG) are lower than the employment rates of two-parent cases (Aid to Families with Dependent Children-Unemployed Parent [AFDC-U]). What is striking and not necessarily expected, however, is that the implied employment rates using UI data and using tax return data are nearly identical. From Table 9-8, employment rates of the point-in-time AFDC-FG caseload were 26 percent using UI data and 22 percent using tax return data. The corresponding rates for AFDC-U cases were 31 percent for both data sources. From Table 9-9, employment rates of the AFDC-FG new entrants were 37 percent using UI data and 33 percent using tax returns. The corresponding rates for the AFDC-U new entrants were 48 percent using UI data and 49 percent using tax returns.
Table 9-8: Random sample of the 1992 caseload, earnings for 1993

| | Those Filing Tax Returns | Full Sample |
|---|---|---|
| AFDC-FG cases | ||
| % of households with UI earnings | 26 | |
| % of households that filed tax returns | 22 | |
| Average UI earnings of adults in household | $4,514 | $1,242 |
| Average adjusted gross earnings on tax returns | $10,589 | $2,378 |
| Average wage & salary earnings on tax returns (Line 7) | $9,748 | $2,189 |
| Average income reported to AFDC | $1,222 | $360 |
| % of households with no UI earnings, but filed tax return | 5.89 | |
| % of households with UI earnings, but filed no tax return | 11.41 | |
| % of households for which AGI < UI wages | 12.61 | |
| % of households for which AGI = UI wages | 78.59 | |
| % of households for which AGI > UI wages | 8.80 | |
| % of households for which AGI < UI wages, for UI wages > 0 | 3.39 | |
| % of households for which AGI > UI wages, for AGI > 0 | 40.47 | |
| Self-employment income reported on tax returns | ||
| Fraction of filers reporting any | 0.06 | |
| Average amount reported | $357 | |
| AFDC-U cases | ||
| % of households with UI earnings | 31 | |
| % of households that filed tax returns | 31 | |
| Average UI earnings of adults in household | $5,223 | $1,792 |
| Average adjusted gross earnings on tax returns | $8,482 | $2,595 |
| Average wage & salary earnings on tax returns (Line 7) | $7,554 | $2,311 |
| Average income reported to AFDC | $2,513 | $894 |
| % of households with no UI earnings, but filed tax return | 7.07 | |
| % of households with UI earnings, but filed no tax return | 8.21 | |
| % of households for which AGI < UI wages | 9.26 | |
| % of households for which AGI = UI wages | 78.39 | |
| % of households for which AGI > UI wages | 12.05 | |
| % of households for which AGI < UI wages, for UI wages > 0 | 3.97 | |
| % of households for which AGI > UI wages, for AGI > 0 | 40.53 | |
| Self-employment income reported on tax returns | ||
| Fraction of filers reporting any | 0.12 | |
| Average amount reported | $562 | |
| Source: Hill et al. (1999) | ||
Although tax return data and UI data would give similar perspectives about employment patterns of the 4-county California sample, it is clear that each source covers workers that the other misses. For example, in the top panel of Table 9-8 (AFDC-FG cases from the point-in-time sample), roughly one-quarter of people (5.89/22) who filed tax returns had no corresponding UI record.(30) Over 40 percent (11.41/26) of those with positive UI earnings did not file taxes.(31) Of those with both UI and tax return earnings, more than 40 percent reported more earnings on tax returns than would be expected based on UI data. Similar figures apply to each of the other groups, though for AFDC-U cases, only about 20 percent of the cases with UI earnings do not file tax returns.
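The ratios quoted above (5.89/22 and 11.41/26) can be reproduced for every panel of the two tables. The sketch below uses the percentages reported in Tables 9-8 and 9-9; the dictionary keys and the function `coverage_gaps` are labels chosen here for illustration.

```python
# Percent of households, as reported in Tables 9-8 and 9-9 (Hill et al., 1999).
# Key names are illustrative labels, not terms from the source.
groups = {
    "Table 9-8 AFDC-FG": {"ui": 26, "filed": 22, "filed_no_ui": 5.89, "ui_no_return": 11.41},
    "Table 9-8 AFDC-U": {"ui": 31, "filed": 31, "filed_no_ui": 7.07, "ui_no_return": 8.21},
    "Table 9-9 AFDC-FG": {"ui": 37, "filed": 33, "filed_no_ui": 8.34, "ui_no_return": 13.55},
    "Table 9-9 AFDC-U": {"ui": 48, "filed": 49, "filed_no_ui": 10.45, "ui_no_return": 7.94},
}


def coverage_gaps(g):
    """Share of each source's earners that the other source misses."""
    return {
        # Among households that filed a return, share with no UI earnings.
        "filers_missing_from_ui": g["filed_no_ui"] / g["filed"],
        # Among households with UI earnings, share that filed no return.
        "ui_earners_not_filing": g["ui_no_return"] / g["ui"],
    }


gaps = {name: coverage_gaps(g) for name, g in groups.items()}
for name, gap in gaps.items():
    print(
        f"{name}: {gap['filers_missing_from_ui']:.0%} of filers lack UI earnings, "
        f"{gap['ui_earners_not_filing']:.0%} of UI earners filed no return"
    )
```

Because the employment and filing rates in the tables are rounded to whole percentages, these shares are approximate.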
Table 9-9: Random sample of new entrants to the caseload in 1993, earnings for 1994

| | Those Filing Tax Returns | Full Sample |
|---|---|---|
| AFDC-FG cases | ||
| % of households with UI earnings | 37 | |
| % of households that filed tax returns | 33 | |
| Average UI earnings of adults in household | $6,769 | $2,868 |
| Average adjusted gross earnings on tax returns | $13,185 | $4,342 |
| Average wage & salary earnings on tax returns (Line 7) | $12,575 | $4,141 |
| Average income reported to AFDC | $1,625 | $709 |
| % of households with no UI earnings, but filed tax return | 8.34 | |
| % of households with UI earnings, but filed no tax return | 13.55 | |
| % of households for which AGI < UI wages | 15.69 | |
| % of households for which AGI = UI wages | 71.31 | |
| % of households for which AGI > UI wages | 13.00 | |
| % of households for which AGI < UI wages, for UI wages > 0 | 4.21 | |
| % of households for which AGI > UI wages, for AGI > 0 | 39.88 | |
| Self-employment income reported on tax returns | ||
| Fraction of filers reporting any | 0.04 | |
| Average amount reported | $95 | |
| AFDC-U cases | ||
| % of households with UI earnings | 48 | |
| % of households that filed tax returns | 49 | |
| Average UI earnings of adults in household | $8,516 | $5,138 |
| Average adjusted gross earnings on tax returns | $12,970 | $6,360 |
| Average wage & salary earnings on tax returns (Line 7) | $11,421 | $5,601 |
| Average income reported to AFDC | $3,264 | $1,831 |
| % of households with no UI earnings, but filed tax return | 10.45 | |
| % of households with UI earnings, but filed no tax return | 7.94 | |
| % of households for which AGI < UI wages | 11.77 | |
| % of households for which AGI = UI wages | 64.71 | |
| % of households for which AGI > UI wages | 23.51 | |
| % of households for which AGI < UI wages, for UI wages > 0 | 6.12 | |
| % of households for which AGI > UI wages, for AGI > 0 | 46.83 | |
| Self-employment income reported on tax returns | ||
| Fraction of filers reporting any | 0.11 | |
| Average amount reported | $512 | |
| Source: Hill et al. (1999) | ||
Across all four groups (two samples, and AFDC-FG and AFDC-U cases), tax return income exceeded UI income in roughly 40 percent or more of the cases with positive earnings from both sources. This pattern is consistent with households from this welfare-based population having earnings that are not from covered employment. It does not seem to be explained by people leaving welfare (through changes in family structure). Among AFDC-FG cases, only 1 to 13 percent of these households had no months on AFDC during the tax reference year and between 56 and 83 percent were on welfare for 9 to 12 months during that year. There is also little evidence that self-employment income plays an important role in earnings differences between tax return and UI income.
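The household-level comparisons reported in Tables 9-8 and 9-9 can be sketched as a simple classification. The records below are hypothetical, and the rule that a household with no return is treated as having an AGI of zero is an assumption made here; the source does not state how Hill et al. (1999) handled nonfilers in these rows.

```python
from collections import Counter

# Hypothetical household records (not data from Hill et al.): annual UI wages
# and AGI, where agi=None means no tax return was filed.
households = [
    {"ui": 0.0, "agi": None},
    {"ui": 0.0, "agi": 3200.0},
    {"ui": 4100.0, "agi": None},
    {"ui": 5000.0, "agi": 7400.0},
    {"ui": 6000.0, "agi": 5200.0},
    {"ui": 0.0, "agi": None},
]


def classify(household):
    """Return the comparison categories that one household falls into."""
    ui = household["ui"]
    filed = household["agi"] is not None
    # Assumption: a household with no return is treated as having AGI of zero.
    agi = household["agi"] if filed else 0.0
    labels = []
    if filed and ui == 0:
        labels.append("no_ui_but_filed")
    if ui > 0 and not filed:
        labels.append("ui_but_no_return")
    if agi < ui:
        labels.append("agi_below_ui")
    elif agi > ui:
        labels.append("agi_above_ui")
    else:
        labels.append("agi_equals_ui")
    return labels


def percent_by_category(households):
    """Percent of all households in each category."""
    counts = Counter(label for h in households for label in classify(h))
    n = len(households)
    return {label: 100.0 * count / n for label, count in counts.items()}


shares = percent_by_category(households)
for label, pct in sorted(shares.items()):
    print(f"{label}: {pct:.1f}%")
```

Under this assumption, households with neither UI earnings nor a tax return fall into the "AGI equals UI" category, which is one possible reading of why that row is so large in both tables.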
Based on comparisons between UI and tax return data, we offer several tentative conclusions:
Taking into account all of the features of a data source, including not only its accuracy but also its cost and ease of access, it appears that no single source can be declared "preferred." The inability to find a preferred data source is inevitable given the differences in the desired uses of data, the constraints imposed by budgets for data collection, and the access limitations to data. The fact that UI wage data are inexpensive, timely to obtain, and available at the state level, for example, implies that they will continue to be a focal data set for state-level evaluations of welfare reform. But our review raises a number of serious questions about UI data. In the remainder of this paper, we highlight selected issues that we believe need further attention in the hopes of encouraging future research on at least some of them.
Certain questions related to welfare reform can only be answered with nationally representative data sets, such as the CPS or SIPP. While Moore et al. (1990) and Roemer (1999a) conclude that income, especially labor earnings, are measured well in the CPS and SIPP, there are, in our view, several important questions that remain with respect to income and employment measurements for low-income populations with national surveys. The questions are as follows:
Recommendation 1: We would like to see further work on antipoverty program underreporting and its origins in nationally representative survey data.
Plans are under way for some of the needed work. Professor Hotz is a principal investigator on a project recently approved by the U.S. Census Bureau to match data from UI wage records and administrative data on AFDC/TANF participation for the California subsamples of several waves of the SIPP.(32) The work of this project should yield some more recent information on both the welfare participation underreporting and income reporting issues. This study--or comparable ones done with matches of the SIPP with administrative data for the subsamples from other states--also may provide some insight into the impact of changes in family structure on income reporting for welfare leavers by exploiting the (limited) panel structure of the SIPP.
Further research also is needed on the use of UI wage records to measure the income of low-income and welfare-prone populations. While the Kornfeld and Bloom (1999) evaluation suggested that UI wage data and survey data produced similar estimates of the impact of a social program (i.e., JTPA-funded training programs) on earnings and employment, their study also found that average earnings of JTPA-eligible individuals based on UI wage data were consistently lower than those based on survey data. Furthermore, the study by Hill et al. (1999) found that UI wage data produced substantially lower estimates of earnings than did tax return data for a welfare-based population drawn from the California AFDC caseload. Learning more about the quality of this data source for measuring income is extremely important because UI wage data presumably will continue to be a core resource in state and local evaluations of the effects of welfare reform.
Several issues related to UI wage data appear to need further scrutiny. First, the studies by Burgess and his coauthors raise important concerns about the "coverage" of UI and tax returns, particularly for the low-income population.
Recommendation 2: It would be extremely useful to follow the helpful lead of the various Burgess studies to closely examine the coverage and trends in coverage of low-income populations with UI data. Such an examination could be aided by using a match of UI data with respondents in a national survey, such as the SIPP, so that one could learn more about the demographic characteristics of individuals (and households) that report labor market earnings on a survey that are not recorded in UI wage records data.
Second, more work is needed to understand the extent to which UI wage data provide a misleading measure of the earnings available to low-income households. This problem arises in short- and long-term follow-up analyses of earnings for welfare samples drawn from state caseloads. One can use UI data to measure subsequent earnings for individuals who were in assistance units as long as they remain on welfare. However, as noted by Rolston (1999), one may not be able to accurately measure household income after assistance units leave the rolls because it is difficult to keep track of the identities of household members. The evidence provided in the Meyer and Cancian (1998) and Hill et al. (1999) studies suggests that this may be a serious problem.
Recommendation 3: To learn more about family well-being, it will be necessary to continue to rely on targeted follow-up surveys to monitor samples of welfare leavers. Unfortunately, surveys are expensive. We recommend that a pilot study be undertaken to devise a survey that is designed just to obtain Social Security numbers of other adults in a household, which can then be used to obtain UI wage earnings for these family members.
A third issue relates to the possibility that wage earnings are missed because individuals move out of the state from which UI wage data are drawn or because workers earn part of their income in other states. Again, comparisons of UI wage data with data from federal tax returns may help us to assess the importance of this problem and, more importantly, the biases that it imparts on measures of individual and household income. To learn more, it may be useful to take a closer look at what is known about the interstate mobility of disadvantaged and welfare-prone populations, such as the work done on movements of welfare populations in response to "welfare magnets," as in Meyer (1999) and the citations therein, and the implications this mobility has for the coverage of low-income workers in UI data.
1998 Note on the Possible Effects of Welfare Reform on Labor Market Activities: What Can Be Gleaned from the March CPS. Unpublished paper, Bureau of Labor Statistics. December 1.
1999 The Initial Impacts of Welfare Reform on the Incomes of Single-Mother Families. Washington, DC: Center for Budget and Policy Priorities. Rockefeller Institute of Government.
2000 Reconciling March CPS Money Income with the National Income and Product Accounts: An Evaluation of CPS Quality. Unpublished paper, Income Statistics Branch, Bureau of the Census, August 10.