Using National Survey Data to Analyze Children’s Health Insurance Coverage: An Assessment of Issues. 3. Limitations of Survey Data


A full simulation of Medicaid eligibility requires, at the minimum, data on seven basic sets of characteristics:

  1. Income by source, with additional information on those expenditures that are applicable to the calculation of disregards
  2. Resources--specifically, financial assets along with the number and value of noncommercial vehicles
  3. Participation in certain other assistance programs
  4. Age and school enrollment of children
  5. Family unit membership
  6. Medical expenditures
  7. State of residence

We discuss the limitations of survey data with respect to each of these types of data below.

a. Income and Disregards

As we noted, Medicaid eligibility is based on monthly income, and portions of this income may be disregarded based on the amount that is attributable to earnings and the level of expenditures on such items as child care, shelter costs, and transportation to work. In simulating eligibility with survey data, it is important, therefore, to have good measures of earned and unearned income as well as the relevant types of expenditures. Measuring unearned income matters not because the population targeted by Medicaid has income from many sources (although public assistance income is often important) but because people who are in fact ineligible and rely on unearned income for support may appear to be eligible if their unearned income is not measured adequately.
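The effect of mismeasured unearned income can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the function names, the disregard amounts, and the income limit are placeholders, not actual program parameters, and real disregard rules vary by state and eligibility pathway.

```python
def countable_monthly_income(earned, unearned, child_care,
                             earned_disregard=90.0, child_care_cap=175.0):
    """Monthly income net of illustrative disregards (placeholder amounts)."""
    # The earnings disregard cannot reduce earnings below zero.
    net_earned = max(0.0, earned - earned_disregard)
    # The child care deduction is limited by a cap and by remaining earnings.
    child_care_deduction = min(child_care, child_care_cap, net_earned)
    # Unearned income receives no disregard in this simplified sketch.
    return (net_earned - child_care_deduction) + unearned


def appears_eligible(earned, unearned, child_care, income_limit):
    """True if countable income is at or below the (hypothetical) limit."""
    return countable_monthly_income(earned, unearned, child_care) <= income_limit


INCOME_LIMIT = 500.0  # hypothetical monthly limit

# A person living on $800 per month of unearned income is not eligible...
true_result = appears_eligible(0.0, 800.0, 0.0, INCOME_LIMIT)
# ...but appears eligible if the survey fails to capture that income.
missed_result = appears_eligible(0.0, 0.0, 0.0, INCOME_LIMIT)
```

Under these placeholder values, `true_result` is `False` and `missed_result` is `True`, which is the misclassification described above.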

The surveys that we have described provide little if any data on the expenditures that are taken into consideration in calculating disregards. SIPP is clearly the best, but it collects expenditure data only once or twice in the life of a panel, so the expenditures are not measured concurrently with income except at those one or two times. The CPS collects annual rather than monthly income. Researchers who conduct the most sophisticated simulations with the CPS construct monthly income streams to improve their eligibility simulations, but in doing so they are still not able to measure other components of eligibility (such as family composition) concurrently.
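To show why a monthly income stream can matter, the sketch below spreads annual earnings over an estimated number of months worked. This is only one possible allocation rule, assumed here for illustration; the source does not describe the methods researchers actually use, and the function name, the weeks-to-months conversion, and the placement of working months at the start of the year are all hypothetical.

```python
def monthly_earnings_stream(annual_earnings, weeks_worked):
    """Allocate annual earnings evenly across an estimated number of months worked.

    Assumption: months worked is approximated as weeks_worked * 12 / 52, and the
    working months are placed consecutively from January. Neither choice comes
    from the survey; both are simplifications.
    """
    if annual_earnings <= 0:
        return [0.0] * 12
    # At least one month must carry the earnings if any were reported.
    months_worked = min(12, max(1, round(weeks_worked * 12 / 52)))
    per_month = annual_earnings / months_worked
    return [per_month] * months_worked + [0.0] * (12 - months_worked)


# Part-year worker: $12,000 earned over 26 weeks.
stream = monthly_earnings_stream(12000.0, 26)
annual_average = 12000.0 / 12
```

In this example the annual average is $1,000 in every month, whereas the allocated stream shows $2,000 in six months and nothing in the other six. A simulation based on the average would miss any months of eligibility that occur while the person is not working.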

b. Resources

Data on resources, or assets, are very limited. Only the SIPP captures detailed data on asset balances--including the value of vehicles, which are a major component of the applicable asset holdings of the low-income population.(26) The SIPP data are collected in two survey waves a year apart, so researchers must interpolate or extrapolate to other months. Reported assets vary so substantially between the two waves, however, that such interpolation is difficult--and the variation suggests low quality in the data (Czajka 1999). Vehicular assets appear to turn over very rapidly as well. Some researchers deal with the limitations of CPS data in this regard by applying an assumed rate of return to reported asset income to impute the unreported balances.
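Both workarounds mentioned above reduce to simple arithmetic, sketched below. The function names, the 5 percent rate of return, and the use of linear interpolation are assumptions for illustration only; the source does not specify the rate or the interpolation method that researchers use.

```python
def impute_balance(annual_asset_income, assumed_rate=0.05):
    """Impute an asset balance from reported asset income.

    Uses balance = income / rate, where the rate of return is an assumption.
    """
    if assumed_rate <= 0:
        raise ValueError("assumed_rate must be positive")
    return annual_asset_income / assumed_rate


def interpolate_balance(balance_first, balance_second, months_since_first,
                        months_between=12):
    """Linearly interpolate a balance between two waves a year apart."""
    fraction = months_since_first / months_between
    return balance_first + fraction * (balance_second - balance_first)


# $50 of reported annual interest at an assumed 5 percent return
# implies a balance of roughly $1,000.
imputed = impute_balance(50.0)

# Balances of $1,000 and $4,000 reported a year apart give a
# midpoint estimate of $2,500 six months after the first wave.
midpoint = interpolate_balance(1000.0, 4000.0, 6)
```

The imputed balance is highly sensitive to the assumed rate: halving the rate doubles the balance. Likewise, a large gap between the two reported balances makes any interpolated value for the intervening months correspondingly uncertain.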

c. Participation in Assistance Programs

Receipt of AFDC and Supplemental Security Income (SSI) benefits is relevant to Medicaid eligibility determination. Prior to welfare reform, which replaced the AFDC program with TANF, AFDC participants were identified in the federal surveys, but AFDC participation was underreported by as much as 25 percent. Data on SSI participation have been collected less regularly and with lower accuracy. Prior to the 1996 panel, the SIPP did not identify individual SSI recipients within a family. Data users could employ information on disability--reported in one survey wave--to infer which child or children in an SSI family may have been the SSI recipient(s). In comparing SIPP 1992 panel estimates of SSI children with published administrative records, however, we found that the survey estimates were quite low and failed to capture a significant upward trend in SSI enrollment (Czajka 1999).

d. Age and School Enrollment of Children

Survey data on these characteristics of children are generally quite adequate for the purposes of Medicaid eligibility simulation.

e. Family Unit Membership

While the official federal poverty levels are designed to be applied to all related persons living in the same household (a “census family”), Medicaid eligibility may be based on just a subset of family members. Some family members (and their incomes) are automatically excluded when determining the eligibility of the remaining family members or the children--for example, SSI recipients and adult children of the family head. In addition, to maximize potential eligibility, many states allow their caseworkers considerable latitude in defining the family unit for Medicaid income eligibility determinations (Lewis and Ellwood 1998). Including or excluding one particular family member can make the difference between the remaining members being eligible or not. Therefore, the composition of the survey household, the family relationships among household members, and the income available to individual members at a point in time are needed to assign family members to Medicaid eligibility units. The SIPP data are the strongest in this regard, but the simulation of eligibility units is exceedingly complex (even when the applicable rules are well documented, which they frequently are not). The CPS data are weak because family composition is measured at the time of the survey (March) while the income data refer to the previous calendar year. No data are collected on who was actually present in the household at any time during the previous year.
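The difference between a census family and a Medicaid eligibility unit can be made concrete with a simplified sketch. It applies only the two automatic exclusions named above (SSI recipients and adult children of the family head). The `Person` fields, the adult age cutoff of 19, and the example family are all hypothetical, and actual state rules are considerably more complex.

```python
from dataclasses import dataclass


@dataclass
class Person:
    name: str
    age: int
    relationship: str      # relationship to family head: "head", "child", ...
    monthly_income: float
    receives_ssi: bool = False


ADULT_AGE = 19  # hypothetical cutoff for an "adult child"


def eligibility_unit(family):
    """Drop SSI recipients and adult children of the head from the family."""
    unit = []
    for person in family:
        if person.receives_ssi:
            continue
        if person.relationship == "child" and person.age >= ADULT_AGE:
            continue
        unit.append(person)
    return unit


def total_income(people):
    """Sum monthly income over a list of persons."""
    return sum(p.monthly_income for p in people)


family = [
    Person("head", 45, "head", 900.0),
    Person("young child", 8, "child", 0.0),
    Person("adult child", 22, "child", 1200.0),
    Person("child on SSI", 10, "child", 500.0, receives_ssi=True),
]
unit = eligibility_unit(family)
```

In this example the census family has four members and $2,600 in monthly income, while the eligibility unit has two members and $900. Comparing each to an income standard for the corresponding family size could easily produce different eligibility outcomes.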

f. Medical Expenditures

Medical expenditure data are the weakest element among the data collected by the CPS, the SIPP, and the NHIS. The near-total lack of information on medical expenditures makes it exceedingly difficult to develop a credible simulation of eligibility under the medically needy provisions of Medicaid. Researchers who do attempt to simulate this component of eligibility must resort to imputing medical expenditures based on other surveys--such as MEPS.

g. State of Residence

Identification of the state of residence of survey households is essential to replicating the state variation in Medicaid eligibility rules. While all of the major surveys that we have discussed except MEPS identify the state of residence of most sample members, at least one of the surveys groups sets of states in order to protect the confidentiality of respondents--and possibly to discourage estimates for states that are not adequately represented in the sample. Nine small states are not individually identified in SIPP files prior to the 1996 panel; they are combined into groups of two, three, and four states. In order to simulate features of the Medicaid programs for these nine states, it is necessary to assign respondents to individual states in some manner. One must assume that other characteristics reported on the SIPP files are of limited value in predicting the actual state of residence for sample households reported in one of the three state groups, or else the confidentiality of the state data would be compromised. Ultimately, therefore, the assignment of respondents to individual states must rely heavily on randomization. This implies the introduction of some additional error into Medicaid simulations, which may contribute, in turn, to mismatches between simulated eligibility and reported participation.
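A minimal sketch of such a random assignment follows. The state labels are placeholders, because the groupings are not listed here, and weighting the draw by each state's share of the group's population is an assumption rather than a documented procedure.

```python
import random


def assign_state(group_states, population_weights, rng):
    """Randomly assign a household to one state within its reported group.

    The probability of each state is proportional to its weight (assumed
    here to be its share of the group's population).
    """
    return rng.choices(group_states, weights=population_weights, k=1)[0]


# Hypothetical three-state group with placeholder labels and weights.
GROUP = ["State A", "State B", "State C"]
WEIGHTS = [0.5, 0.3, 0.2]

rng = random.Random(0)  # fixed seed so the assignment is reproducible
assignments = [assign_state(GROUP, WEIGHTS, rng) for _ in range(1000)]
```

Because the draw ignores everything else known about the household, some households will be simulated under the wrong state's rules. Fixing the seed makes a given simulation reproducible but does not remove this error.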