Income Data for Policy Analysis: A Comparative Assessment of Eight Surveys. Overview of surveys and similarities and differences


Beyond the design features already discussed, several other features of the eight surveys should be noted.

While the CPS is the official source of monthly data on the labor force and employment, the survey has also collected income data for almost 60 years. The Annual Social and Economic (ASEC) Supplement, sponsored by the Census Bureau, is the source of official estimates of income and poverty, and is widely used for policy analysis and legislative cost estimates. Once a year, the CPS collects detailed annual income information for the prior calendar year. The basic purpose of the CPS, collecting labor force information, suggests that it will be most accurate in the areas of wages and salaries and earned income generally.

SIPP is a longitudinal survey sponsored by the Census Bureau that collects a broad range of information relevant to public policy formulation for income security, retirement and health programs, including within-year patterns of income and program participation. SIPP was designed to address a wide range of policy-analytic needs, including estimation of the number of persons eligible for means-tested programs. Panel households are interviewed three times a year at strict four-month intervals to collect month-by-month information for each person. In SIPP, annual income is obtained by adding up 12 months of data for each person. SIPP questionnaires and field methods are intended to maximize the accuracy of income data, especially for lower income persons with intermittent or irregular income sources, and persons with public program benefits. SIPP is unique among the eight surveys in supporting detailed analysis of short-term behavioral dynamics. This project is especially timely for SIPP, which is undergoing a major redesign that may produce a substantially altered survey early in the next decade.
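
The way SIPP builds annual income from monthly reports can be illustrated with a minimal sketch. The record layout (person identifier, year, month, amount) and the function name are hypothetical, and the rule of omitting persons with fewer than 12 reported months is an assumption made for the example; the text does not say how the survey itself treats incomplete years.

```python
from collections import defaultdict

def annual_income(monthly_records, year):
    """Sum 12 monthly amounts per person for the given calendar year.

    monthly_records: iterable of (person_id, year, month, amount) tuples
    (a hypothetical layout, not the actual SIPP file format).
    Persons without all 12 months are omitted (an assumption of this sketch).
    """
    by_person = defaultdict(dict)
    for person_id, rec_year, month, amount in monthly_records:
        if rec_year == year:
            # A later record for the same person and month overwrites an earlier one.
            by_person[person_id][month] = amount

    all_months = set(range(1, 13))
    return {
        person_id: sum(months.values())
        for person_id, months in by_person.items()
        if set(months) == all_months
    }

# Person A reports 100 in every month; person B reports only 8 months.
records = [("A", 2004, m, 100) for m in range(1, 13)]
records += [("B", 2004, m, 250) for m in range(1, 9)]
print(annual_income(records, 2004))  # {'A': 1200}; B is omitted
```

Because each interview covers four months, a full calendar year for one person draws on reports from more than one interview wave, which is why the sketch works from month-level records rather than interview-level totals.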

The ACS, which is also conducted by the Census Bureau, was designed to replace the decennial census long form by collecting the same type of data on a rolling basis rather than only once every ten years. As of 2005 the ACS collects data from 2 million households each year, with an annual sample of group quarters added in 2006. Like the long form the ACS will make available a common set of variables—mandated by law—down to very small levels of geography. The ACS will provide annual estimates for states and the largest counties and municipalities plus three-year and five-year rolling averages for smaller areas of geography.

MEPS is an annual longitudinal survey sponsored by the Agency for Healthcare Research and Quality (AHRQ) with field work conducted by Westat; it replaces earlier one-time longitudinal surveys to provide detailed information on health status, health care, and health care costs. The MEPS sample frame consists of households that participated in the prior year NHIS. MEPS collects annual income information for the prior calendar year once a year, and the Full Year files combine contemporaneous health and income data from overlapping two-year panels for cross-section analysis. MEPS is designed for policy analysis requiring income data, as well as data on health care costs, health insurance coverage, and third-party payments.

The NHIS is a cross-section survey sponsored by the National Center for Health Statistics (NCHS) with field work conducted by the Census Bureau. It is the primary source of information on health status and health care in the United States and is widely used for health-related analysis—particularly of trends. The NHIS is in the field continuously during the year, with an annual sample (consisting of four, nonoverlapping representative panels) that is assigned, first, to four calendar quarters and then, within quarters, to individual weeks. Each weekly subsample is representative of the target population. From this rolling sample the NHIS collects summary annual income information for the prior calendar year. Historically, the NHIS has collected only limited information on personal and family income.

The PSID is sponsored by ASPE and the National Science Foundation and is conducted by the Survey Research Center of the Institute for Social Research at the University of Michigan. The PSID was initiated in 1968 with a sample of approximately 5,000 families selected from two sample frames. Members of this initial sample and all of the families that they have created or joined have been followed continuously, with annual interviews through 1997 and biennial interviews starting in 1999. A Latino supplement was added in 1990 to help compensate for the survey’s under-representation of part of the immigrant population. This supplement was later dropped, due to insufficient funding, but a new and more broadly representative sample of immigrants was added in 1997. Where the SIPP was designed to support analysis of short-term dynamics of income, program participation, and related characteristics, the PSID was designed to study long-term dynamics.

The HRS, which is also a panel survey, began with a sample of households containing at least one individual born between 1931 and 1941. Sample members were first interviewed in 1992 and have been reinterviewed every two years since then. A companion survey, the Asset and Health Dynamics Among the Oldest Old Survey (AHEAD), was started in 1993 with a sample of persons born in 1923 and earlier. Two further cohorts were added in 1998: the “war babies,” born 1942 to 1947, and the “children of the depression,” born from 1924 through 1930, who fill the gap between the AHEAD and original HRS cohorts. All of the cohorts have since been shifted to the same interview schedule to facilitate pooling of the data across cohorts. With these additions the HRS/AHEAD sample became representative of the U.S. resident population born before 1948, that is, persons 51 and older by the end of 1998. A new cohort was added in 2004 representing persons born between 1948 and 1954. The HRS has employed a number of survey methodological innovations with respect to the collection of data on income and wealth. The income detail that it collects falls between that of the ACS and the CPS, so the HRS demonstrates what can be accomplished with a moderate number of questions.
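
The birth-year arithmetic behind the HRS/AHEAD coverage statement can be checked with a short sketch. The cohort ranges and years of introduction are taken from the paragraph above; the data structure, the cohort labels, the function name, and the reading of "between 1931 and 1941" as an inclusive range are assumptions made for illustration.

```python
# Birth-year ranges (treated as inclusive) and year of introduction, as stated
# in the text. born_from=None marks the open-ended oldest cohort.
COHORTS = [
    {"name": "AHEAD", "born_from": None, "born_to": 1923, "added": 1993},
    {"name": "children of the depression", "born_from": 1924, "born_to": 1930, "added": 1998},
    {"name": "original HRS", "born_from": 1931, "born_to": 1941, "added": 1992},
    {"name": "war babies", "born_from": 1942, "born_to": 1947, "added": 1998},
    {"name": "2004 cohort", "born_from": 1948, "born_to": 1954, "added": 2004},
]

def latest_birth_year_covered(as_of_year):
    """Latest birth year covered without gaps by cohorts present in as_of_year.

    Returns None when the open-ended oldest cohort is not yet in the sample.
    """
    active = sorted(
        (c for c in COHORTS if c["added"] <= as_of_year),
        key=lambda c: c["born_to"],
    )
    if not active or active[0]["born_from"] is not None:
        return None
    latest = active[0]["born_to"]
    for cohort in active[1:]:
        if cohort["born_from"] > latest + 1:
            break  # a gap in birth years ends the contiguous coverage
        latest = max(latest, cohort["born_to"])
    return latest

latest = latest_birth_year_covered(1998)
print(latest, 1998 - latest)  # 1947 51
```

Run for 1997, the same function returns 1923, because persons born from 1924 through 1930 were not yet represented; this is the gap that the 1998 additions closed.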

The MCBS is sponsored by the Centers for Medicare and Medicaid Services (CMS) and is a longitudinal survey of Medicare beneficiaries. A new sample is drawn every year, and sample members are interviewed 12 times over a four-year period. MCBS data are released in annual files that pool four consecutive cohorts. MCBS is unique in that the survey data are not the final product. Cost and utilization data from Medicare claims files are added to the survey data along with information on non-covered medical services. Income data are limited to a single total.  

Only two of the surveys—the SIPP and the ASEC Supplement to the CPS—were designed explicitly to measure income, but income is also a major focus of the data collection in both the PSID and HRS. The ACS income data are much more limited than what is collected in the CPS or the SIPP, but income is still considered one of the most important characteristics collected by the survey. By contrast, the measurement of income in the MEPS, the NHIS, and the MCBS is decidedly secondary to the main objectives of each survey. MEPS, nevertheless, collects more detailed income data than the ACS while NHIS collects just total family income and personal earnings (along with receipt of multiple sources) and MCBS collects only the sample member’s total income, including that of a spouse.

Five of the eight surveys can be described as general population surveys. But while all five cover essentially the same universe (the full civilian, noninstitutionalized population resident in the United States), no two surveys represent this population at the same point in time. In fact, only the ASEC supplement comes close to capturing the population at a single point in time. CPS-ASEC respondents are interviewed primarily in mid-March of each year, but some supplemental interviews (part of a 2001 sample expansion) are conducted in mid-February and mid-April. The survey is weighted to March 1 population controls. The SIPP fully represents the population only in the first wave of each panel. Over the length of a SIPP panel, people who leave the survey universe are no longer represented, and new entrants through birth are almost fully represented, but immigrants, people returning from abroad, and people released from institutions and the military are represented only if they move in with persons who were included in the SIPP universe at the start of a panel. For cross-sectional estimates, the SIPP is weighted to the full civilian noninstitutionalized population in each month, but this becomes a less accurate reflection of the survey’s true universe with each passing month.

As noted, the ACS and the NHIS both use a rolling sample that covers the entire year. For simplicity, the ACS is weighted to mid-year (July 1) population controls while the NHIS is weighted to quarterly population controls to enable users to estimate disease prevalence at different times of the year. The MEPS is a subsample of the NHIS, drawn from completed interviews; MEPS respondents are interviewed multiple times over a two-year period to provide data for the two calendar years following the NHIS survey year from which they were drawn. A single MEPS panel represents the survivors of the population represented by the NHIS sample from which they were drawn, plus births to this population. However, AHRQ also releases annual files that pool two adjacent MEPS panels; the combined sample is weighted to population totals for that calendar year.

Cross-sectional estimation is not the purpose of the PSID, so concerns about how well it continues to represent the general population after 40 years detract only marginally from its value. They do require caution, however, whenever comparisons with other surveys are used to draw inferences about the quality of data in either the PSID or the other surveys. The PSID is included in this project in large part because certain features of its collection of data on income and program participation are being considered in the redesign of the SIPP, but only PSID’s use of an annual interview to collect monthly data proved relevant to our findings, and those findings do not provide any insights into the effective capture of monthly information with an annual interview.
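
The overlapping-panel structure of the MEPS described above (each panel covers the two calendar years after its NHIS source year, and each annual file pools two adjacent panels) can be made concrete with a small sketch. Panels are labeled here by their NHIS source year, which is a convention adopted only for this illustration and not AHRQ's own panel numbering; the function names are likewise hypothetical.

```python
def meps_calendar_years(nhis_year):
    """Calendar years covered by the MEPS panel drawn from a given NHIS year."""
    return (nhis_year + 1, nhis_year + 2)

def panels_pooled_for(calendar_year):
    """NHIS source years of the two adjacent panels covering a calendar year."""
    return (calendar_year - 2, calendar_year - 1)

# The panel drawn from the 2001 NHIS supplies data for 2002 and 2003.
print(meps_calendar_years(2001))  # (2002, 2003)

# A pooled file for 2003 combines the second year of the panel drawn from the
# 2001 NHIS with the first year of the panel drawn from the 2002 NHIS.
print(panels_pooled_for(2003))  # (2001, 2002)
```

The two functions are inverses in the sense that every panel returned by `panels_pooled_for(y)` lists `y` among its calendar years, which is what allows the pooled file to be weighted to population totals for a single calendar year.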

The remaining two surveys, the MCBS and the HRS, represent restricted populations—that is, subsets of the general population—and will be used in this project to help assess the quality of income data on persons 65 and older and persons 51 and older, respectively.
