Income Data for Policy Analysis: A Comparative Assessment of Eight Surveys. FOOTNOTES


  1. The CPS design is longitudinal as well, in that sample addresses are revisited seven additional times after the first interview. Unlike the other longitudinal surveys, however, the CPS does not follow sample members who move to new addresses.
  2. Both MEPS and MCBS employ overlapping panels, so annual estimates should be less susceptible to these influences.
  3. See <>; the Technical Documentation pp. 9-1 through 9-12 available at <>; and Technical Paper 66 Chapter 5 available at <>.
  4. In MCBS these are individual enrollees.
  5. The description applies to the preliminary cross-section weights available when this report was prepared and may not apply fully to the final weights.
  6. For those wishing to replicate any study calculations on any of the eight surveys, copies of the file extracts and all of the SAS programs used by the study will be delivered to the government on completion of the study and should be requested from ASPE.
  7. How the ACS is post-stratified will affect this result. The Census Bureau post-stratifies the CPS to population controls representing the civilian non-institutional population. Residents of non-institutional group quarters must be excluded from such controls to match the ACS universe.  The Census Bureau can use group quarters population data for this purpose, but we are not certain how the changing size of the college dormitory population over the calendar year is handled.
  8. The ACS has 0.7 million fewer members of the armed forces and their families but 0.7 million more unrelated children under 15. The greater number of the latter is due to the ACS’s not identifying family relationships among persons unrelated to the householder. Children who would be classified as members of unrelated subfamilies in the SIPP and CPS are identified as unrelated individuals in the ACS.
  9. This estimate is based on a tabulation of the 2003 CPS ASEC supplement.
  10. Among those who were in-scope and had nonzero person weights, sample persons in families with missing members are identified by FMRS1231 < 1, and those with missing family heads are identified by FMRS1231 = -1 or, alternatively, FCSZ1231 or CPS family size = -1 (inapplicable or unknown).
  11. By contrast, MEPS person weights are assigned only to original sample members—that is, those who were selected from the respondents to the NHIS—and to new family members (primarily newborns) who were not in the survey universe at the time the sample was selected. See Chapter II.
  12. These respondents have been weighted to Census Bureau population controls for that month.
  13. The distribution of reference months over the 12 interview months resembles a pyramid, with the middle month, December 2001, occurring in all 12 reference periods and the outermost months, January 2001 and November 2002, occurring in only one reference period each.
  14. To determine poverty status in the ACS, the Census Bureau compares unadjusted incomes to adjusted thresholds. This is mathematically equivalent to comparing incomes that have been adjusted to reflect price levels during the calendar year to thresholds for that same calendar year as long as the same inflation index is used for both.
  15. There is one additional wrinkle in the CPS family definition, but we do not apply it to the other surveys even though it has implications for poverty measurement. If a family does not include the householder (or, by implication, relatives of the householder), making it an “unrelated subfamily” in Census Bureau terminology, its membership is restricted to the family reference person, a spouse, and/or one or more never-married children under 18.  The restrictive composition of an unrelated subfamily is a function of limitations in the relationship data collected in the CPS.
  16. Based on our own tabulations, fewer than 15 percent of the unmarried partners of householders or unrelated subfamily reference persons in the 2003 CPS ASEC supplement were 50 and older. Two-thirds of the unmarried partners were under 40.
  17. About 40 percent of the minor children in these families are biological children of both the reference person and partner, but their relationship codes identify them as children of the reference person. In the relatively few cases where a child is identified in the relationship code as a child of the partner, typically that child belongs to the partner but not the reference person. In the CPS, until recently, a biological child of two unmarried partners would be identified as the child of only one partner. In NHIS both biological parents are identified as such.
  18. A family qualified as a non-CPS family if any member was identified as the unmarried partner of another member or if any member was identified as a foster child with no biological or adoptive parent present. That is, in a two-parent family the child had to be the foster child of both parents since nearly half of the children in husband-wife families who were coded as the foster child of one parent were coded as the biological child of the other parent.
  19. In the vast majority of cases, as noted above, there were only two adults.
  20. The inclusion of these additional non-relatives does not follow a fixed set of rules, although survey documentation suggests that there may be patterns.  For example, an unmarried, same-sex partner is included in the family as a non-relative.
  21. We relied on the relationship code to determine which other family members were related to the head and which family members were related to the “wife” (or partner).  A shared child would have been assigned in this code to either the head or wife.
  22. Designating one family head as the combined family head was necessary for subsequent tabulations that counted only heads of families.
  23. As noted in the table, families include individuals living alone or with non-relatives.
  24. The NHIS has five files of independently calculated income imputations.  We performed each analysis five times, using each of the five files in turn, and averaged the results. This procedure was followed for all tabulations on NHIS.
  25. Within each survey, each of the five quintiles contains the same number of people (weighted) except when the numbers are affected by heaping at quintile boundaries.
  26. We suspect that this higher proportion with less than a high school education is incorrect. The people we assigned to this status were identified as a residual and may include persons with missing data or otherwise undefined educational attainment.
  27. Near poor does not have a standard definition. We use the term to give a name to those who are low income (below 200 percent of poverty) but not poor. Elsewhere, near poor is sometimes used to identify persons between 100 and 125 percent of poverty.
  28. Also, our independently calculated poverty status differs occasionally from the status on the MEPS public use file due to an apparent error in the algorithm used in creating the recode for the public use file.
  29. Creating CPS-like families by splitting NHIS families with unmarried partners increases the NHIS poverty rate. An estimate of the impact is reported in Chapter V.
  30. The NHIS does not collect wage and salary or self-employment income separately from total earnings. MEPS does collect both sources of earnings, and the estimate of persons with earnings in Table IV.14 is based on sample members reporting income from one or both sources. However, employment and annual income are collected in different sections of the MEPS instrument, and a comparison of reported employment and income by source suggests that most self-employed sample members report their business income as wages and salaries. For this reason we do not break down MEPS earnings or persons with earnings by type of employment except to illustrate (in the next section) the impact of reclassifying reported wage and salary income for those reporting self-employment as their sole work activity.
  31. In 2003 (for tax year 2002), taxpayers filed 18.6 million returns with non-farm sole proprietorships, which exclude self-employed farmers and those with partnerships, LLCs, or S corporations. Even the SIPP estimate appears to understate the self-employed by a substantial margin.
  32. In 2003 the initial CPS question in the section on public assistance asked about “cash assistance from a state or county welfare program such as [state program name]” and the follow-on question asked about payments received on behalf of children.  The more detailed SIPP questions mentioned AFDC or TANF, general assistance, payments for foster children, and “other” welfare.  We combine estimates of welfare and Food Stamps because most persons receiving income from welfare also receive Food Stamps.
  33. With the notable exception of family income, employment, and earnings, NCHS does not allocate missing values on the NHIS file. Generally, item non-response rates are very low, however. For instance, only about 1 percent of the sample could not be classified as insured or not.  We have not compensated for item non-response in any way, so our estimates of the number of program participants and their share of the population will be very slightly lower than if the missing values had been allocated.
  34. SIPP respondents are asked to report coverage in each of the preceding four calendar months. MEPS respondents are asked when coverage started and stopped over a variable reference period, and these reports are used to determine coverage by month.
  35. NCHS does not allocate missing values for the health insurance items, and we have not compensated for non-response on these items.  As we noted earlier, however, only about 1 percent of the sample could not be classified as insured or not.  Consequently, our estimates of the number of persons and percent of the population without health insurance coverage will be very slightly lower than if the missing values had been allocated.
  36. Two additional factors, one from the MCBS and one from the other surveys, may contribute to the observed differences. The identification of non-institutionalized beneficiaries in the MCBS is based on those interviewed with community rather than facility forms. This distinction is frequently a matter of fieldwork convenience and is not equivalent to the Census Bureau’s identification of non-institutionalized persons. It is possible, then, that non-institutionalized beneficiaries are understated in the MCBS. For the other surveys, persons 65 and older may be overstated due to respondents under 65 rounding up their ages whereas MCBS beneficiaries would have had to document their ages to qualify for Medicare.
  37. For the 2004 HRS, an age-eligible sample member or spouse was born before 1954.
  38. These estimates are based on the NHIS family definition, as this is the family unit for which family income is reported.
  39. This estimate was derived by assigning the excess to the family reference person and summing the amounts using the reference person’s weight.
  40. Unpaid activity may be addressed separately.
  41. As noted earlier, the JOBS data are derived from interviews conducted two to three times a year whereas the annual income data are from a single interview conducted after the end of a year.
  42. See Chapter III, section A.3.a.
  43. The estimate from the NHIS has an upper and lower bound. As explained in Chapter III, we developed two alternative allocations of NHIS family income to the CPS families that we created. One alternative allocated family income in excess of total personal earnings in such a way as to maximize the number of poor in the CPS families. The other alternative allocated the excess family income in such a way as to minimize the number of poor in the CPS families. Ultimately, we averaged the two results, but the difference in the total number of poor persons with one alternative versus the other was 460,000, so our estimate of 2.6 million has a range of plus or minus 230,000.
  44. To show this effect, we do not change the classification of a single parent when we include an unmarried partner in the family. However, the partner who is brought into the family with the broader family concept is generally classified as a single under the CPS family concept and as an “other” person (that is, not belonging to any of the specified categories) under the NHIS/MEPS family concept. With the broader family concept the number of persons classified as “other” grows by several million, which explains why the number of poor in this category increases between the two family concepts while the poverty rate declines.
  45. This will happen, for example, if the principal earner in an unrelated subfamily has an income that is above the poverty threshold for a family of size one but below the poverty threshold for that person’s actual family size.
  46. This issue is relevant not only to poverty estimates but also to simulated tax calculations, which must approximate the composition of the family at the end of the tax year and, for some dependents, must apply more elaborate residency and support rules.
  47. Specifically, we used the Census Bureau’s third longitudinal weight, which the Census Bureau created for panel members who were present from the beginning through the end of the panel or until they left the survey universe.
  48. We used 2001 as the income reference year so that we could have a full 12 months of SIPP data following the end of the reference year. The 2001 panel did not collect 12 months of data from 2003 for the entire sample.
  49. Monthly poverty thresholds are reported on the SIPP public use file. They are indexed for inflation, consistent with the official annual thresholds.
  50. For example, if a family member had data for only 10 of the 12 months, we multiplied the sum of that person’s monthly incomes over the 10 months by the ratio, 12/10, to obtain a 2001 calendar year income for that person.
  51. A calendar year 2001 annual poverty threshold was used for this purpose.
  52. This text is from the ACS questionnaire. The Census Bureau tabulations on the internal file, as well as the ADJUST factor and POVPIP measure on the public use file, assign income to the 12 months prior to the month in which the questionnaire is completed.
  53. These estimates were prepared by the U.S. Census Bureau using the 2003 ACS because the monthly samples in the 2002 ACS were not of uniform size.
  54. The CPS instrument asks very generally about “pension or retirement income from a previous employer or union, or any other type of retirement income other than Social Security or VA benefits” but includes among the sources that a respondent may identify “regular payments from IRA, Keogh, or 401(k) accounts.”
  55. For surveys in which the collection of income data is not a major objective, this can become a rationale for limiting the content of their income questions or relegating them to the end of the interview, which minimizes their potential impact on the response rates to other questions and on the likelihood that respondents break off their interviews. However, relegating them to the end of the interview may further reduce their response rates and adversely affect the quality of the responses, as appears to be the case with the CPS health insurance questions.
  56. We follow the Census Bureau in using the term allocation in a generic sense to describe any method of replacing a missing value with a generated value.  For all three Census Bureau surveys included in this study, the flags indicating how missing responses were filled in are identified in the survey documentation as allocation flags.  Frequently, the methods of allocation listed in the survey documentation are characterized as types of imputation.  In other respects, however, the terms allocation and imputation appear to be used interchangeably.
  57. The quality of allocation methods can be defined in terms of unbiasedness and their addition of minimal “noise” or variance to the variables being allocated.
  58. Limiting the number of significant digits in reported incomes reduces their uniqueness, making them less identifiable. The ACS uses a very well-defined rounding rule described in Chapter II.
  59. Income allocations for the 2003 PSID were either limited to earnings or not fully flagged. Income allocations in the 2004 HRS were performed at the source level, but we cannot assess their contribution to total family income from the variables provided in the RAND data release used in our analysis. In the MCBS, allocated amounts were not flagged.
  60. What is reported as wage and salary income in MEPS appears to include most of the employment-related income reported by the self-employed, so we do not report allocation rates for self-employment income (see Chapter IV, Section C.2).  In addition, nonzero self-employment income in MEPS was allocated only to persons who responded that they received income from self-employment but did not give a dollar amount.  That is, persons who did not respond to the recipiency question for self-employment income were allocated no income from this source. This is a departure from how income was allocated to the other sources in MEPS and how income was allocated to self-employment (as well as the other sources) in the other surveys.  Given that our estimates of allocation rates are based on nonzero dollars, this allocation strategy would depress an estimated allocation rate for self-employment income.
  61. NHIS does not collect income by source, so these comparisons are based on four surveys.
  62. Checking the allocation flags in the prior wave could have been informative, but we were not certain that an amount that was allocated using the labor force procedure in one wave but imputed in the prior wave should unequivocally be regarded as imputed in the current wave.
  63. Identifying all allocations from wage rates and hours worked as using partial information is not entirely correct. The allocated amounts are generated from a regression model that uses either reported or imputed wage rates and weeks worked during the calendar year, if such information is available, based on questions asked in the employment section of the interview. In the 2002 data, regression estimates were generated for 3 million weighted persons who have only self-employment recorded in the JOBS file and, therefore, would not have been asked for an hourly rate and usual hours worked.
  64. Revisions to the NHIS income questions for the 2007 survey have increased the percentage of respondents providing brackets.
  65. We selected $52,500 in order that we might examine the frequency of rounding up to levels of $50,000. The ACS public use file has incomes of $50,000 or greater rounded to the nearest $1,000.
  66. The Census Bureau also applies topcoding to the individual sources before summing them to calculate total personal and total family income. In this way, the totals are assured of being consistent with their components.
  67. The study uses income data for 2002 (HRS and MCBS income for 2003 was deflated with the CPI-U). The data cover a calendar year, except in the ACS, whose rolling reference period spans 23 months.
  68. While the CPS is the official source of monthly labor force statistics, this status does not extend to annual estimates. The CPS collects much less detailed information on labor force activity in the prior year than in the reference period for the official monthly statistics.
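
The pyramid described in footnote 13 can be reproduced with a short count. The sketch below is illustrative only. It assumes, as footnote 52 suggests, that each interview month's reference period is the 12 calendar months immediately preceding it, and that interviews ran from January through December 2002. The function names are hypothetical and are not taken from the study's SAS programs.

```python
from collections import Counter


def month_add(year, month, offset):
    """Return the (year, month) that lies `offset` months from the given month."""
    index = year * 12 + (month - 1) + offset
    return index // 12, index % 12 + 1


def reference_month_counts(interview_year=2002):
    """Count how many 12-month reference periods include each calendar month.

    Assumption: the reference period for an interview month is the
    12 calendar months immediately before that month.
    """
    counts = Counter()
    for interview_month in range(1, 13):
        for lag in range(1, 13):
            counts[month_add(interview_year, interview_month, -lag)] += 1
    return counts


counts = reference_month_counts()
for (year, month), n in sorted(counts.items()):
    print(f"{year}-{month:02d}: {n}")
```

Under these assumptions the counts rise by one per month from January 2001 to a peak at December 2001 and then fall by one per month through November 2002.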
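
The equivalence asserted in footnote 14 can be written out. The symbols are introduced here for illustration and do not appear in the report: $y$ is reported (unadjusted) income, $T$ is the calendar-year threshold, and $k > 0$ is the inflation factor that converts dollars of the reporting period into calendar-year dollars, so that the adjusted threshold is $T/k$ and the adjusted income is $ky$.

```latex
\[
  \frac{y}{T/k} \;=\; \frac{k\,y}{T},
  \qquad\text{so}\qquad
  y < \frac{T}{k} \iff k\,y < T .
\]
```

Both comparisons give the same income-to-poverty ratio and the same poverty status, provided the same factor $k$ is used on both sides.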
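
The procedure in footnote 24 (run each analysis on all five NHIS imputation files, then average) can be sketched as follows. This is a minimal illustration, not the study's code: the data are made up, the record layout of (income, weight) pairs is an assumption, and both function names are hypothetical.

```python
def weighted_mean_income(records):
    """Weighted mean of (income, weight) pairs."""
    total_weight = sum(weight for _, weight in records)
    return sum(income * weight for income, weight in records) / total_weight


def average_over_imputations(imputation_files, estimator):
    """Apply the estimator to each imputation file and average the results."""
    estimates = [estimator(records) for records in imputation_files]
    return sum(estimates) / len(estimates)


# Made-up data: five files that differ only in one imputed income value.
imputation_files = [
    [(10000 + 1000 * i, 1.0), (30000, 1.0)] for i in range(5)
]

combined = average_over_imputations(imputation_files, weighted_mean_income)
print(combined)
```

Any tabulation can be passed in place of `weighted_mean_income`, which mirrors the footnote's statement that the same procedure was followed for all NHIS tabulations.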
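
Footnote 25 notes that weighted quintiles are equal in size except where incomes heap at a boundary. The sketch below shows one way this can arise. The report does not say how ties were handled, so the tie rule here (keeping all records with the same income in one quintile, placed by the weighted midpoint of the group) is an assumption, and the function name is hypothetical.

```python
def assign_weighted_quintiles(records):
    """Assign quintiles 1-5 to (income, weight) records, in input order.

    Assumption: records with identical incomes stay in the same quintile,
    chosen by where the midpoint of their combined weight falls.
    """
    total = sum(weight for _, weight in records)
    order = sorted(range(len(records)), key=lambda i: records[i][0])
    quintiles = [0] * len(records)
    cumulative = 0.0
    i = 0
    while i < len(order):
        # Collect every record that shares this income value.
        j = i
        group_weight = 0.0
        while j < len(order) and records[order[j]][0] == records[order[i]][0]:
            group_weight += records[order[j]][1]
            j += 1
        midpoint_share = (cumulative + group_weight / 2) / total
        quintile = min(int(midpoint_share * 5) + 1, 5)
        for k in range(i, j):
            quintiles[order[k]] = quintile
        cumulative += group_weight
        i = j
    return quintiles


# Distinct incomes, equal weights: two records per quintile.
even = [(income, 1.0) for income in range(10, 110, 10)]
print(assign_weighted_quintiles(even))

# Four records heaped at 20: the second quintile absorbs all of them.
heaped = [(10, 1.0), (20, 1.0), (20, 1.0), (20, 1.0), (20, 1.0),
          (30, 1.0), (40, 1.0), (50, 1.0), (60, 1.0), (70, 1.0)]
print(assign_weighted_quintiles(heaped))
```

In the heaped example the first quintile ends up with one record and the second with four, which is the kind of departure from equal sizes the footnote describes.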
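
The annualization in footnote 50 is a simple ratio adjustment. A minimal sketch, assuming missing months are represented as `None` (a representation chosen here for illustration, as is the function name):

```python
def annualize_income(monthly_incomes, months_in_year=12):
    """Scale the sum of observed monthly incomes up to a full year.

    Months with no data are passed as None and are excluded from the sum.
    Returns None when no month was observed.
    """
    observed = [amount for amount in monthly_incomes if amount is not None]
    if not observed:
        return None
    return sum(observed) * months_in_year / len(observed)


# Ten observed months of 1,000 each and two missing months.
months = [1000] * 10 + [None, None]
annual = annualize_income(months)
print(annual)
```

With 10 observed months the sum is multiplied by 12/10, as in the footnote's example.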
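
Footnote 66 explains that topcoding is applied to each income source before the sources are summed, which keeps totals consistent with their components. The sketch below illustrates the order of operations only. The source names and topcode values are hypothetical and are not the Census Bureau's actual limits, and only upper limits are handled.

```python
def topcode_then_sum(amounts, topcodes):
    """Cap each income source at its topcode, then total the capped amounts.

    Sources without an entry in `topcodes` are left uncapped.
    """
    capped = {
        source: min(amount, topcodes.get(source, float("inf")))
        for source, amount in amounts.items()
    }
    return capped, sum(capped.values())


# Hypothetical amounts and topcodes, for illustration only.
amounts = {"wages": 250000, "interest": 5000}
topcodes = {"wages": 200000, "interest": 25000}

capped, total = topcode_then_sum(amounts, topcodes)
print(capped, total)
```

Because the total is built from the capped components, it always equals their sum, which would not be guaranteed if the total were topcoded separately.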
