Survey data will play an important role in the evaluations of the Children’s Health Insurance Program (CHIP) because program administrative data cannot tell us what is happening to the number of uninsured children. This report discusses key analytic issues in the use of national survey data to estimate and analyze children’s health insurance coverage. One goal of this report is to provide staff in the Office of the Assistant Secretary for Planning and Evaluation (ASPE) with information that will be helpful in reconciling or at least understanding the reasons for the diverse findings reported in the literature on uninsured children. The second major objective is to outline for the broader research community the factors that need to be considered in designing or using surveys to evaluate the number and characteristics of uninsured children. We examine four areas:
· Identifying uninsured children in surveys
· Using survey data to simulate Medicaid eligibility
· Medicaid underreporting in surveys
· Analysis of longitudinal data
We focus on national surveys, but many of our observations will apply equally to the design of surveys at the state level.
IDENTIFYING UNINSURED CHILDREN IN SURVEYS
Most of what is known about the health insurance coverage of children in the United States has been derived from sample surveys of households. Three ongoing federal surveys--the annual March supplement to the Current Population Survey (CPS), the National Health Interview Survey (NHIS), and the Survey of Income and Program Participation (SIPP)-- provide a steady source of information on trends in coverage and support in-depth analyses of issues in health care coverage. Periodically the federal government and private foundations sponsor additional, specialized surveys to gather more detailed information on particular topics. Three such surveys are the Medical Expenditure Panel Survey (MEPS), the Community Tracking Study (CTS), and the National Survey of America’s Families (NSAF). Table 1 presents recent estimates of uninsured children from all six surveys. It is easy to see from this table why policymakers are frustrated in their attempts to understand the level and trends over time in the proportion of children who are uninsured.
Estimates of the incidence or frequency of uninsurance are typically reported in one of three ways: (1) the number who were uninsured at a specific point in time, (2) the number who were ever uninsured during a year, or (3) the number who were uninsured for the entire year. Point-in-time estimates are the most commonly cited. With the exception of the MEPS estimate, all of the estimates reported in Table 1 represent estimates of children uninsured at a point in time, or they are widely interpreted that way. Of the six surveys, only the SIPP and MEPS are able to provide all three types of estimates. With the 1992 SIPP panel we estimated that 13.1 percent of children under 19 were uninsured in September 1993, 21.7 percent were ever uninsured during the year, and 6.3 percent were uninsured for the entire year. Clearly, the choice of time period makes a big difference in the estimated proportion of children who were uninsured.
TABLE 1
ESTIMATES OF THE PERCENTAGE OF CHILDREN WITHOUT HEALTH INSURANCE, 1993-1997

Source of Estimate   1993   1994   1995   1996   1997
CPS                  14.1   14.4   14.0   15.1   15.2
NHIS                 14.1   15.3   13.6   13.4    --
SIPP                 13.9   13.3    --     --     --
MEPS                  --     --     --    15.4    --
CTS                   --     --     --    11.7    --
NSAF                  --     --     --     --    11.9

Notes: Estimates from the CPS and SIPP are based on tabulations of public use files by Mathematica Policy Research, Inc., and refer to children under 19 years of age. Estimates from the other surveys apply to children under 18. The NHIS estimates were reported in NCHS (1998). The estimate from MEPS refers to children who were "uninsured throughout the first half of 1996," meaning three to six months depending on the interview date; the estimate was reported in Weigers et al. (1998). The CTS estimate, reported in Rosenbach and Lewis (1998), is based on interviews conducted between July 1996 and July 1997. The NSAF estimate, reported in Brennan et al. (1999), is based on interviews conducted between February and November 1997.
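As an aside on mechanics, the three measures of uninsurance distinguished above can be computed directly from monthly coverage indicators when a longitudinal survey such as SIPP or MEPS supplies them. The sketch below uses entirely hypothetical records for three invented children; it is an illustration of the arithmetic, not of any survey's actual processing.

```python
# Illustrative sketch with hypothetical data: deriving the three common
# measures of uninsurance (point-in-time, ever during the year, entire
# year) from monthly coverage indicators.

# Each record holds 12 booleans, True = uninsured in that month.
children = {
    "child_a": [False] * 12,              # insured all year
    "child_b": [True] * 12,               # uninsured all year
    "child_c": [False] * 8 + [True] * 4,  # loses coverage in September
}

REF_MONTH = 8  # September (0-indexed) as the point-in-time reference

n = len(children)
point_in_time = sum(m[REF_MONTH] for m in children.values()) / n
ever_in_year = sum(any(m) for m in children.values()) / n
entire_year = sum(all(m) for m in children.values()) / n
```

Even in this toy example the three measures diverge (two-thirds, two-thirds, and one-third of the sample, respectively), which is the same pattern, writ small, as the 13.1, 21.7, and 6.3 percent SIPP figures cited above.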
The estimate of uninsured children provided annually by the March CPS has become the most widely accepted and frequently cited estimate of the uninsured. At this point, only the CPS provides annual estimates with relatively little lag, and only the CPS is able to provide state-level estimates, albeit with considerable imprecision. But what exactly does the CPS measure? CPS respondents are supposed to report any insurance coverage that they had over the past year. There is little reason to doubt that the CPS respondents are answering the health insurance questions in the manner that was intended--that is, they are reporting coverage that they ever had in the year. For example, CPS estimates of Medicaid enrollment match very closely the SIPP estimates of children ever covered by Medicaid in a year whereas the CPS estimates exceed the SIPP estimates of children covered by Medicaid at a point in time by about 27 percent. How, then, can the CPS estimates of children ever uninsured during the year match other survey estimates of children uninsured at a point in time? The answer, we suggest, lies in the extent to which insurance coverage for the year is underreported by the CPS. Is it simply by chance that the CPS numbers approximate estimates of the uninsured at a point in time, or is there something more systematic? The more the phenomenon is due to chance, the less confident we can be that the CPS will correctly track the changes in the number of uninsured children over time or correctly represent the characteristics of the uninsured.
Multiple sources of error may affect all of the major surveys, including the CPS, and make it difficult to compare their estimates of the uninsured. These include the sensitivity of responses to question design; the impact of basic survey design features; the possibility that respondents may not be aware of the source of their coverage or even its very existence; and the bias introduced by respondents’ imperfect recall.
Typically, surveys identify the uninsured as a “residual.” They ask respondents if they are covered by health insurance of various kinds and then identify the uninsured as those who report no health insurance of any kind. Both the CTS and the NSAF have employed a variant on this approach. First, they collect information on insurance coverage, and then they ask whether people who appear to be uninsured really were without coverage or had some coverage that was not reported. In both surveys this latter “verification question” reduced the estimated proportion of children who were without health insurance. These findings make a strong case for incorporating a verification question into the measurement of health insurance coverage. The NHIS introduced such a question in 1997, and the SIPP is testing this approach.
The sensitivity of responses to question design is further illustrated by the Census Bureau’s experience in testing a series of questions intended to identify people uninsured at a point in time. These questions yielded much higher estimates than other, comparable surveys. The Bureau’s experience sends a powerful message that questions about health insurance coverage can yield unanticipated results. Researchers fielding surveys that attempt to measure health insurance coverage would be well-advised to be wary of constructing new questions unless they can also conduct very extensive pretesting.
Other survey design decisions can also have a major impact on the estimates of the uninsured, including the choice of the survey universe and the proportion of the target population that is actually represented, the response rate among eligible households, the use of proxy respondents, the choice of interview mode, the use of editing to correct improbable responses, and the use of imputation to fill in missing responses. Both the CTS and NSAF were conducted as samples of telephone numbers, with complementary samples of households without telephones. This difference in methodology between these surveys and the CPS, NHIS, and SIPP has drawn less attention than the use of a verification question, but it may be as important in accounting for the lower estimates of the proportion of children who are uninsured.
Which estimate reported in Table 1 is the most correct? There is no agreement in the research community. Clearly, the CPS estimate has been the most widely cited, but its timeliness and consistency probably account for this more than any presumption that it is the most accurate. When the estimate from the CTS was first announced, it was greeted with skepticism. Now that the NSAF, using similar survey methods, has produced a nearly identical estimate, the CTS’ credibility has been enhanced, and the CTS number, in turn, has paved the way for broader acceptance of the NSAF estimate. Yet neither survey has addressed what was felt to be the biggest source of overestimation of the uninsured in the federal surveys: namely, the apparent, substantial underreporting of Medicaid enrollment, discussed below. Much attention has focused on the impact of the verification questions in the CTS and NSAF, but the effect was much greater in the NSAF than in the CTS even though the end results were the same. The NHIS will soon be able to show the effects of introducing a verification question into that survey, but we suspect that significant differences in the estimates will remain. We conclude that a more detailed evaluation of the potential impact of sample design on the differences between the CTS and NSAF, on the one hand, and the federal surveys, on the other, may be necessary if we are to understand the differences that we see in Table 1.
USING SURVEY DATA TO SIMULATE MEDICAID ELIGIBILITY
There are two principal reasons for simulating Medicaid eligibility in the context of studying children’s health insurance coverage. The first is to obtain denominators for the calculation of Medicaid participation rates--for all eligible children and for subgroups of this population. The second is to estimate how many uninsured children--and what percentage of the total--may be eligible for Medicaid but not participating. The regulations governing eligibility for the Medicaid program are exceedingly complex, however. There are numerous routes by which a child may qualify for enrollment, and many of the eligibility provisions and parameters vary by state. Even the most sophisticated simulations of Medicaid eligibility employ many simplifications. More typically, simulations are highly simplified and exclude many eligible children. A full simulation requires data on many types of characteristics, but even the most comprehensive surveys lack key sets of variables.
A Medicaid participation rate is formed by dividing the number of participants (people enrolled) by the number of people estimated to be eligible. Because surveys underreport participation in means-tested entitlement programs, it has become a common practice to substitute administrative counts for survey estimates of participants when calculating participation rates. This strategy merits consideration in calculating Medicaid participation rates as well, but the limitations of Medicaid eligibility simulations imply that this must be done carefully. In addition, there are issues of comparability between survey and administrative data on Medicaid enrollment that affect the substitution of the latter for the former in the calculation of participation rates and even the use of administrative data to evaluate the survey data. Problems with using administrative data include:
- The limited age detail that is available from published statistics
- The duplicate counting of children who may have been enrolled in different states
- The fact that the administrative data provide counts of children ever enrolled in a year while eligibility is estimated at a point in time
- The difficulty of removing institutionalized children--who are not in the survey data-- from the administrative numbers
- Inconsistencies in the quality of the administrative data across states and over time
Attempts to combine administrative data with survey data in calculating participation rates must also address problems of comparability created by undercoverage of the population in sample surveys and the implications of survey estimates of persons who report participation in Medicaid but are simulated to be ineligible.
A further issue affecting participation rates is how to treat children who report other insurance. With SIPP data we found that 18 percent of the children we simulated to be eligible for Medicaid reported having some form of insurance coverage other than Medicaid. Excluding them from the calculation raised the Medicaid participation rate from 65 percent to 79 percent.
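The arithmetic behind that adjustment can be sketched as follows. The counts are hypothetical (the text reports rates, not counts), chosen only to be consistent with the 65 and 79 percent figures above.

```python
# Sketch of the participation-rate arithmetic described above.
# The counts are hypothetical; only the rates mirror the text.

eligible = 1000.0            # children simulated eligible for Medicaid
participants = 650.0         # of those, reported enrolled in Medicaid
other_insured_share = 0.18   # eligible children reporting other coverage

baseline_rate = participants / eligible                   # 65 percent
adjusted_eligible = eligible * (1 - other_insured_share)  # shrink denominator
adjusted_rate = participants / adjusted_eligible          # about 79 percent
```

Because the exclusion shrinks only the denominator, removing the 18 percent of eligibles with other coverage mechanically raises the rate from 0.65 to 0.65/0.82, or roughly 79 percent.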
MEDICAID UNDERREPORTING IN SURVEYS
Comparisons with administrative data suggest that the CPS and the SIPP may underestimate Medicaid enrollment by 13 to 25 percent. The underreporting of Medicaid enrollment may lead to an overstating of the number and proportion of children who are without insurance. But the impact of Medicaid underreporting on survey estimates of the uninsured is far from clear. Indeed, even assuming that these estimates of Medicaid underreporting are accurate, the potential impact of a Medicaid undercount on estimates of the uninsured depends on how the underreporting occurs. First, some Medicaid enrollees may report to survey takers, incorrectly, that they are covered by a private insurance plan or a public plan other than Medicaid. Such people will not be counted as Medicaid participants, but neither will they be counted among the uninsured. Second, some children in families that report Medicaid coverage may be inadvertently excluded from the list of persons covered. In the SIPP we found that 7 percent of uninsured children appeared to have a parent covered by Medicaid. Any such children actually covered by Medicaid will be counted instead as uninsured. Third, some children covered by Medicaid may fail to report any coverage at all and be in families with no reported Medicaid coverage either; these children, too, will be counted incorrectly as uninsured. Fourth, some of the undercount of Medicaid enrollees may be due to underrepresentation of parts of the population in surveys, although survey undercoverage may have a greater impact on understating the number of uninsured children. This problem has not been addressed at all in the literature, and we are not aware of any estimates of how many uninsured children may be simply missing from the survey estimates. In sum, the potential impact of the underreporting of Medicaid enrollment on estimates of the uninsured is difficult to assess without information on how the undercount is distributed among different causes.
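To make the dependence on the undercount's composition concrete, the sketch below varies the share of unreported enrollees who misreport some other coverage (and so were never counted as uninsured). All figures are invented for illustration; they are not estimates from any of the surveys discussed here.

```python
# Hypothetical illustration: the same Medicaid undercount implies very
# different corrections to the uninsured estimate depending on how it
# is distributed among the mechanisms above. All figures are invented.

true_enrolled = 20.0        # millions actually enrolled (administrative)
reported_enrolled = 16.0    # millions reporting Medicaid in the survey
undercount = true_enrolled - reported_enrolled   # 4 million

reported_uninsured = 11.0   # millions counted as uninsured in the survey

def corrected_uninsured(share_misreporting_other_coverage):
    """Enrollees who misreport private or other public coverage were never
    counted as uninsured; only the remainder inflates the uninsured count."""
    wrongly_uninsured = undercount * (1 - share_misreporting_other_coverage)
    return reported_uninsured - wrongly_uninsured
```

If every unreported enrollee claimed some other coverage, the uninsured estimate would need no correction at all; if none did, it would fall by the full 4 million. The true correction lies somewhere between, which is exactly why the distribution of the undercount matters.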
In using administrative estimates of Medicaid enrollment, it is important that the reference period of the data match the reference period of the survey estimates. HCFA (now known as CMS) reports Medicaid enrollment in terms of the number of people who were ever enrolled in a fiscal year. This number is considerably higher than the number who are enrolled at any one time. Therefore, the HCFA estimates of people ever enrolled in a year should not be used to correct survey estimates of Medicaid coverage at a point in time because this results in a substantial over-correction.
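A hypothetical numerical example makes the over-correction visible. The enrollment figures below are invented; only the mechanism (annual-ever counts exceeding point-in-time counts) comes from the discussion above.

```python
# Invented figures illustrating the reference-period mismatch: comparing
# a point-in-time survey estimate to annual-ever administrative counts
# overstates the apparent undercount.

admin_ever_enrolled = 20.0   # millions ever enrolled during the fiscal year
admin_avg_monthly = 16.0     # millions enrolled in an average month
survey_point_in_time = 14.0  # survey estimate of enrollment at one point

# Mismatched reference periods: apparent undercount of 30 percent
undercount_mismatched = 1 - survey_point_in_time / admin_ever_enrolled

# Matched reference periods: apparent undercount of 12.5 percent
undercount_matched = 1 - survey_point_in_time / admin_avg_monthly
```

Scaling a point-in-time survey figure up by the mismatched 30 percent rather than the matched 12.5 percent would, in this invented example, more than double the size of the correction.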
The CPS presents a special problem. We have demonstrated that while the CPS estimate of uninsured children is commonly interpreted as a point-in-time estimate, the reported Medicaid coverage that this estimate reflects is clearly annual-ever enrollment. Adjusting the CPS estimate of the uninsured to compensate for the underreporting of annual-ever Medicaid enrollment produces a large reduction. What this adjustment accomplishes, however, is to move the CPS estimate of the uninsured closer to what it purports to be--namely, an estimate of the number of people who were uninsured for the entire year. Applying an adjustment based on annual-ever enrollment but continuing to interpret the CPS estimate of the uninsured as a point-in-time estimate is clearly inappropriate. Adjusting the Medicaid enrollment reported in the CPS to an average monthly estimate of Medicaid enrollment yields a much smaller adjustment and a correspondingly smaller impact on the uninsured, but it involves reinterpreting the reported enrollment figure as a point-in-time estimate--which it is clearly not. Invariably, efforts to “fix” the CPS estimates run into problems such as these because the CPS estimate of the uninsured is ultimately not what people interpret it to be but, instead, an estimate--with very large measurement error--of something else. We would do better to focus our attention on true point-in-time estimates, such as those provided by SIPP, NHIS, the CTS, and NSAF. But until the turnaround in the release of SIPP and NHIS estimates can be improved substantially, policy analysts will continue to gravitate toward the CPS as their best source of information on what is happening to the population of uninsured children.
ANALYSIS OF LONGITUDINAL DATA
Given the difficulties that respondents experience in accurately reporting insurance coverage held more than a few months in the past, panel surveys with more than one interview per year seem essential to obtaining good estimates of the duration of uninsurance and the frequency with which children experience spells of uninsurance over a period of time. Longitudinal data are even more essential if we are to understand children’s patterns of movement into and out of uninsurance and into and out of Medicaid enrollment. At the same time, however, longitudinal data present many challenges for analysts. These include the complexity of measuring the characteristics of a population over time, the effects of sample loss and population dynamics on the representativeness of panel samples, and issues that must be addressed in measuring spell duration.
Perhaps the single most important lesson to draw from this review is how much our estimates of the number and characteristics of uninsured children are affected by measurement error. Some of this error is widely acknowledged--such as the underreporting of Medicaid enrollment in surveys--but much of it is not. Even when the presence of error is recognized, analysts and policymakers may not know how to take it into account. We may know, for example, that Medicaid enrollment is underreported by 24 percent in a particular survey, but how does that affect the estimate of the uninsured? And how much does the apparent, substantial underreporting of Medicaid contribute to the perception that Medicaid is failing to reach millions of uninsured children? Until we can make progress in separating the measurement error from the reality of uninsurance, our policy solutions will continue to be inefficient, and our ability to measure our successes will continue to be limited.
As federal and state policy analysts ponder how to evaluate the impact of the Children’s Health Insurance Program (CHIP) initiatives authorized by Congress, attention is turning to ways to utilize ongoing surveys as well as to the possibility of states funding their own surveys. Survey data certainly will play an important role in the CHIP evaluations. While administrative data can and will be used to document the enrollment of children in these new programs as well as the expanded Medicaid program, administrative data cannot tell us what is happening to the number of uninsured children. In this context it is important to consider what we know about the use of surveys to measure the incidence of uninsurance among children.
The purpose of this report is to discuss key analytic issues in the use of national survey data to estimate and analyze children’s health insurance coverage. The issues include many that emerged in the course of preparing a literature review on uninsured children (Lewis, Ellwood, and Czajka 1997, 1998) and in conducting analyses of children’s health insurance coverage with the Survey of Income and Program Participation (SIPP) (Czajka 1999). One goal of this report is to provide staff in the Office of the Assistant Secretary for Planning and Evaluation (ASPE) with information that will be helpful in reconciling or at least understanding the reasons for the diverse findings reported in the literature on uninsured children. The second major objective is to outline for the broader research community the factors that need to be considered in designing or using surveys to evaluate the number and characteristics of uninsured children. While we focus on national surveys, many of our observations will apply equally well to the design of surveys at the state level.
Section A discusses how uninsured children have been identified in the major national surveys. It compares alternative approaches, discusses a number of measurement problems that have emerged as important, and concludes with comments on the interpretation of uninsurance as measured in the Current Population Survey (CPS)--the national survey most widely cited with respect to the number of uninsured children. Section B looks at the problem of simulating eligibility for the Medicaid program. Estimates developed with different underlying assumptions suggest that anywhere from 1.5 million to 4 million uninsured children at various points in the 1990s may have been eligible for but not participating in Medicaid. In part because the estimates vary so widely, and also because even the lowest estimate of this population is sizable, the problem of simulating Medicaid eligibility merits extended discussion. Building on this discussion, Section C then examines strategies for calculating participation rates for the Medicaid program. We review issues relating to estimating the number of participants with administrative versus survey data and making legitimate comparisons with estimates of the number of people who were actually eligible to participate in Medicaid. We include a discussion of the problem presented by people who report participation but appear to be ineligible. Section D examines how the underreporting of Medicaid participation in surveys may affect survey estimates of the uninsured, and Section E discusses issues related to the use of longitudinal data to investigate health insurance coverage in general and uninsurance in particular. Finally, Section F reviews our major conclusions.