by John L. Czajka and Kimball Lewis
Mathematica Policy Research, Inc.
600 Maryland Ave., S.W. Suite 550
Washington, DC 20024
The authors wish to express their gratitude to Marilyn Ellwood for her helpful comments on an earlier draft and her many other contributions to this effort. We also want to thank Elizabeth Hurley for her editing of the manuscript, and Melanie Lynch for carrying out the final production.
Survey data will play an important role in the evaluations of the Children’s Health Insurance Program (CHIP) because program administrative data cannot tell us what is happening to the number of uninsured children. This report discusses key analytic issues in the use of national survey data to estimate and analyze children’s health insurance coverage. One goal of this report is to provide staff in the Office of the Assistant Secretary for Planning and Evaluation (ASPE) with information that will be helpful in reconciling or at least understanding the reasons for the diverse findings reported in the literature on uninsured children. The second major objective is to outline for the broader research community the factors that need to be considered in designing or using surveys to evaluate the number and characteristics of uninsured children. We examine four areas:
· Identifying uninsured children in surveys
· Using survey data to simulate Medicaid eligibility
· Medicaid underreporting in surveys
· Analysis of longitudinal data
We focus on national surveys, but many of our observations will apply equally to the design of surveys at the state level.
IDENTIFYING UNINSURED CHILDREN IN SURVEYS
Most of what is known about the health insurance coverage of children in the United States has been derived from sample surveys of households. Three ongoing federal surveys--the annual March supplement to the Current Population Survey (CPS), the National Health Interview Survey (NHIS), and the Survey of Income and Program Participation (SIPP)-- provide a steady source of information on trends in coverage and support in-depth analyses of issues in health care coverage. Periodically the federal government and private foundations sponsor additional, specialized surveys to gather more detailed information on particular topics. Three such surveys are the Medical Expenditure Panel Survey (MEPS), the Community Tracking Study (CTS), and the National Survey of America’s Families (NSAF). Table 1 presents recent estimates of uninsured children from all six surveys. It is easy to see from this table why policymakers are frustrated in their attempts to understand the level and trends over time in the proportion of children who are uninsured.
Estimates of the incidence or frequency of uninsurance are typically reported in one of three ways: (1) the number who were uninsured at a specific point in time, (2) the number who were ever uninsured during a year, or (3) the number who were uninsured for the entire year. Point-in-time estimates are the most commonly cited. With the exception of the MEPS estimate, all of the estimates reported in Table 1 represent estimates of children uninsured at a point in time, or they are widely interpreted that way. Of the six surveys, only the SIPP and MEPS are able to provide all three types of estimates. With the 1992 SIPP panel we estimated that 13.1 percent of children under 19 were uninsured in September 1993, 21.7 percent were ever uninsured during the year, and 6.3 percent were uninsured for the entire year. Clearly, the choice of time period makes a big difference in the estimated proportion of children who were uninsured.
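The relationship among the three measures can be sketched in a few lines of code. This is an illustrative example only; the records below are invented, not drawn from the SIPP. Each child is represented by twelve monthly coverage indicators, and all three estimates are computed from the same data.

```python
# Illustrative sketch: three ways to count the uninsured from the same
# monthly coverage data. True = insured in that month.
children = [
    [True] * 12,                 # insured all year
    [False] * 12,                # uninsured all year
    [True] * 9 + [False] * 3,    # loses coverage in the fall
]

reference_month = 8  # e.g., September (0-indexed)

# (1) uninsured at a point in time
point_in_time = sum(not c[reference_month] for c in children) / len(children)
# (2) ever uninsured during the year
ever_uninsured = sum(any(not m for m in c) for c in children) / len(children)
# (3) uninsured for the entire year
full_year = sum(all(not m for m in c) for c in children) / len(children)
```

By construction, the full-year count can never exceed the point-in-time count, which in turn can never exceed the ever-in-year count, which is the ordering the SIPP figures above display.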
TABLE 1

ESTIMATES OF THE PERCENTAGE OF CHILDREN WITHOUT HEALTH INSURANCE, 1993-1997

Source of Estimate    1993    1994    1995    1996    1997
CPS                   14.1    14.4    14.0    15.1    15.2
NHIS                  14.1    15.3    13.6    13.4     --
SIPP                  13.9    13.3     --      --      --
MEPS                   --      --      --     15.4     --
CTS                    --      --      --     11.7     --
NSAF                   --      --      --      --     11.9

Notes: Estimates from the CPS and SIPP are based on tabulations of public use files by Mathematica Policy Research, Inc., and refer to children under 19 years of age. Estimates from the other surveys apply to children under 18. The NHIS estimates were reported in NCHS (1998). The estimate from MEPS refers to children who were "uninsured throughout the first half of 1996," meaning three to six months depending on the interview date; the estimate was reported in Weigers et al. (1998). The CTS estimate, reported in Rosenbach and Lewis (1998), is based on interviews conducted between July 1996 and July 1997. The NSAF estimate, reported in Brennan et al. (1999), is based on interviews conducted between February and November 1997.
The estimate of uninsured children provided annually by the March CPS has become the most widely accepted and frequently cited estimate of the uninsured. At this point, only the CPS provides annual estimates with relatively little lag, and only the CPS is able to provide state-level estimates, albeit with considerable imprecision. But what exactly does the CPS measure? CPS respondents are supposed to report any insurance coverage that they had over the past year. There is little reason to doubt that the CPS respondents are answering the health insurance questions in the manner that was intended--that is, they are reporting coverage that they ever had in the year. For example, CPS estimates of Medicaid enrollment match very closely the SIPP estimates of children ever covered by Medicaid in a year whereas the CPS estimates exceed the SIPP estimates of children covered by Medicaid at a point in time by about 27 percent. How, then, can the CPS estimates of children ever uninsured during the year match other survey estimates of children uninsured at a point in time? The answer, we suggest, lies in the extent to which insurance coverage for the year is underreported by the CPS. Is it simply by chance that the CPS numbers approximate estimates of the uninsured at a point in time, or is there something more systematic? The more the phenomenon is due to chance, the less confident we can be that the CPS will correctly track the changes in the number of uninsured children over time or correctly represent the characteristics of the uninsured.
Multiple sources of error may affect all of the major surveys, including the CPS, and make it difficult to compare their estimates of the uninsured. These include the sensitivity of responses to question design; the impact of basic survey design features; the possibility that respondents may not be aware of the source of their coverage or even its very existence; and the bias introduced by respondents’ imperfect recall.
Typically, surveys identify the uninsured as a “residual.” They ask respondents if they are covered by health insurance of various kinds and then identify the uninsured as those who report no health insurance of any kind. Both the CTS and the NSAF have employed a variant on this approach. First, they collect information on insurance coverage, and then they ask whether people who appear to be uninsured really were without coverage or had some coverage that was not reported. In both surveys this latter “verification question” reduced the estimated proportion of children who were without health insurance. These findings make a strong case for incorporating a verification question into the measurement of health insurance coverage. The NHIS introduced such a question in 1997, and the SIPP is testing this approach.
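The residual approach with a verification follow-up can be sketched as follows. The function and field names here are hypothetical, not taken from any survey instrument; the point is only that the verification question can move apparent nonreporters back into the insured category, never the reverse.

```python
# Hypothetical sketch of "residual" identification of the uninsured,
# with a CTS/NSAF-style verification follow-up.
def is_uninsured(record):
    # Coverage types asked about directly (illustrative list).
    coverage_types = ("employer", "medicaid", "medicare", "chip",
                      "other_public", "private_nongroup")
    if any(record.get(t, False) for t in coverage_types):
        return False  # insured: reported at least one type of coverage
    # Verification question: respondents who reported no coverage are
    # asked to confirm; some recall coverage only at this point.
    return not record.get("verified_coverage_on_followup", False)
```

A record that reports employer coverage is classified as insured; a record with no reported coverage is uninsured unless the follow-up question recovers a missed plan.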
The sensitivity of responses to question design is further illustrated by the Census Bureau’s experience in testing a series of questions intended to identify people uninsured at a point in time. These questions yielded much higher estimates than other, comparable surveys. The Bureau’s experience sends a powerful message that questions about health insurance coverage can yield unanticipated results. Researchers fielding surveys that attempt to measure health insurance coverage would be well-advised to be wary of constructing new questions unless they can also conduct very extensive pretesting.
Other survey design decisions can also have a major impact on estimates of the uninsured, including the choice of the survey universe and the proportion of the target population that is actually represented, the response rate among eligible households, the use of proxy respondents, the choice of interview mode, the use of editing to correct improbable responses, and the use of imputation to fill in missing responses. Both the CTS and NSAF were conducted as samples of telephone numbers, with complementary samples of households without telephones. This difference in methodology between these surveys and the CPS, NHIS, and SIPP has drawn less attention than the use of a verification question, but it may be as important in accounting for the lower estimate of the proportion of children who are uninsured.
Which estimate reported in Table 1 is the most correct? There is no agreement in the research community. Clearly, the CPS estimate has been the most widely cited, but its timeliness and consistency probably account for this more than any presumption that it is the most accurate. When the estimate from the CTS was first announced, it was greeted with skepticism. Now that the NSAF, using similar survey methods, has produced a nearly identical estimate, the CTS’ credibility has been enhanced, and the CTS number, in turn, has paved the way for broader acceptance of the NSAF estimate. Yet neither survey has addressed what was felt to be the biggest source of overestimation of the uninsured in the federal surveys: namely, the apparent, substantial underreporting of Medicaid enrollment, discussed below. Much attention has focused on the impact of the verification questions in the CTS and NSAF, but the effect was much greater in the NSAF than in the CTS even though the end results were the same. The NHIS will soon be able to show the effects of introducing a verification question into that survey, but we suspect that significant differences in the estimates will remain. We conclude that a more detailed evaluation of the potential impact of sample design on the differences between the CTS and NSAF, on the one hand, and the federal surveys, on the other, may be necessary if we are to understand the differences that we see in Table 1.
USING SURVEY DATA TO SIMULATE MEDICAID ELIGIBILITY
There are two principal reasons for simulating Medicaid eligibility in the context of studying children’s health insurance coverage. The first is to obtain denominators for the calculation of Medicaid participation rates--for all eligible children and for subgroups of this population. The second is to estimate how many uninsured children--and what percentage of the total--may be eligible for Medicaid but not participating. The regulations governing eligibility for the Medicaid program are exceedingly complex, however. There are numerous routes by which a child may qualify for enrollment, and many of the eligibility provisions and parameters vary by state. Even the most sophisticated simulations of Medicaid eligibility employ many simplifications. More typically, simulations are highly simplified and exclude many eligible children. A full simulation requires data on many types of characteristics, but even the most comprehensive surveys lack key sets of variables.
A Medicaid participation rate is formed by dividing the number of participants (people enrolled) by the number of people estimated to be eligible. Because surveys underreport participation in means-tested entitlement programs, it has become a common practice to substitute administrative counts for survey estimates of participants when calculating participation rates. This strategy merits consideration in calculating Medicaid participation rates as well, but the limitations of Medicaid eligibility simulations imply that this must be done carefully. In addition, there are issues of comparability between survey and administrative data on Medicaid enrollment that affect the substitution of the latter for the former in the calculation of participation rates and even the use of administrative data to evaluate the survey data. Problems with using administrative data include:
- The limited age detail that is available from published statistics
- The duplicate counting of children who may have been enrolled in different states
- The fact that the administrative data provide counts of children ever enrolled in a year while eligibility is estimated at a point in time
- The difficulty of removing institutionalized children--who are not in the survey data-- from the administrative numbers
- Inconsistencies in the quality of the administrative data across states and over time
Attempts to combine administrative data with survey data in calculating participation rates must also address problems of comparability created by undercoverage of the population in sample surveys and the implications of survey estimates of persons who report participation in Medicaid but are simulated to be ineligible.
A further issue affecting participation rates is how to treat children who report other insurance. With SIPP data we found that 18 percent of the children we simulated to be eligible for Medicaid reported having some form of insurance coverage other than Medicaid. Excluding them from the calculation raised the Medicaid participation rate from 65 percent to 79 percent.
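The arithmetic behind this adjustment is straightforward and can be verified with the rounded figures cited above. A minimal sketch: excluding the 18 percent of simulated eligibles who reported other coverage shrinks the denominator of the participation rate, raising it from 65 percent to roughly 79 percent.

```python
# Back-of-the-envelope check of the SIPP figures cited in the text.
eligible = 1.00        # simulated Medicaid-eligible children (normalized)
participating = 0.65   # share of eligibles reporting Medicaid coverage
other_insured = 0.18   # share of eligibles reporting other coverage

# Excluding the other-insured children from the denominator:
adjusted_rate = participating / (eligible - other_insured)  # ~0.79
```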
MEDICAID UNDERREPORTING IN SURVEYS
Comparisons with administrative data suggest that the CPS and the SIPP may underestimate Medicaid enrollment by 13 to 25 percent. The underreporting of Medicaid enrollment may lead to an overstating of the number and proportion of children who are without insurance. But the impact of Medicaid underreporting on survey estimates of the uninsured is far from clear. Indeed, even assuming that these estimates of Medicaid underreporting are accurate, the potential impact of a Medicaid undercount on estimates of the uninsured depends on how the underreporting occurs. First, some Medicaid enrollees may report to survey takers, incorrectly, that they are covered by a private insurance plan or a public plan other than Medicaid. Such people will not be counted as Medicaid participants, but neither will they be counted among the uninsured. Second, some children in families that report Medicaid coverage may be inadvertently excluded from the list of persons covered. In the SIPP we found that 7 percent of uninsured children appeared to have a parent covered by Medicaid. Any such children actually covered by Medicaid will be counted instead as uninsured. Third, some children covered by Medicaid may fail to report any coverage at all and be in families with no reported Medicaid coverage either; these children, too, will be counted incorrectly as uninsured. Fourth, some of the undercount of Medicaid enrollees may be due to underrepresentation of parts of the population in surveys, although survey undercoverage may have a greater impact on understating the number of uninsured children. This problem has not been addressed at all in the literature, and we are not aware of any estimates of how many uninsured children may be simply missing from the survey estimates. In sum, the potential impact of the underreporting of Medicaid enrollment on estimates of the uninsured is difficult to assess without information on how the undercount is distributed among different causes.
In using administrative estimates of Medicaid enrollment, it is important that the reference period of the data match the reference period of the survey estimates. HCFA (now known as CMS) reports Medicaid enrollment in terms of the number of people who were ever enrolled in a fiscal year. This number is considerably higher than the number who are enrolled at any one time. Therefore, the HCFA estimates of people ever enrolled in a year should not be used to correct survey estimates of Medicaid coverage at a point in time because this results in a substantial over-correction.
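The over-correction can be illustrated with hypothetical numbers; none of the figures below come from HCFA or from any of the surveys. Because ever-in-year enrollment exceeds average monthly enrollment, inflating a point-in-time survey count by the ever-in-year ratio overshoots.

```python
# Hypothetical illustration of the reference-period mismatch.
admin_ever_enrolled = 22.0   # millions ever enrolled in the fiscal year
admin_avg_monthly = 17.0     # millions enrolled in an average month
survey_point_in_time = 15.0  # millions reporting Medicaid "now" in a survey

# Matching reference periods: adjust point-in-time survey data with
# average monthly administrative enrollment.
correct_factor = admin_avg_monthly / survey_point_in_time   # ~1.13
# Mismatched reference periods: the ever-in-year count over-corrects.
wrong_factor = admin_ever_enrolled / survey_point_in_time   # ~1.47
```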
The CPS presents a special problem. We have demonstrated that while the CPS estimate of uninsured children is commonly interpreted as a point-in-time estimate, the reported Medicaid coverage that this estimate reflects is clearly annual-ever enrollment. Adjusting the CPS estimate of the uninsured to compensate for the underreporting of annual-ever Medicaid enrollment produces a large reduction. What this adjustment accomplishes, however, is to move the CPS estimate of the uninsured closer to what it purports to be--namely, an estimate of the number of people who were uninsured for the entire year. Applying an adjustment based on annual-ever enrollment but continuing to interpret the CPS estimate of the uninsured as a point-in-time estimate is clearly inappropriate. Adjusting the Medicaid enrollment reported in the CPS to an average monthly estimate of Medicaid enrollment yields a much smaller adjustment and a correspondingly smaller impact on the uninsured, but it involves reinterpreting the reported enrollment figure as a point-in-time estimate--which it is clearly not. Invariably, efforts to “fix” the CPS estimates run into problems such as these because the CPS estimate of the uninsured is ultimately not what people interpret it to be but, instead, an estimate--with very large measurement error--of something else. We would do better to focus our attention on true point-in-time estimates, such as those provided by SIPP, NHIS, the CTS, and NSAF. But until the turnaround in the release of SIPP and NHIS estimates can be improved substantially, policy analysts will continue to gravitate toward the CPS as their best source of information on what is happening to the population of uninsured children.
ANALYSIS OF LONGITUDINAL DATA
Given the difficulties that respondents experience in providing accurate reports of their insurance coverage more than a few months ago, panel surveys with more than one interview per year seem essential to obtaining good estimates of the duration of uninsurance and the frequency with which children experience spells of uninsurance over a period of time. Longitudinal data are even more essential if we are to understand children’s patterns of movement into and out of uninsurance and into and out of Medicaid enrollment. At the same time, however, longitudinal data present many challenges for analysts. These include the complexity of measuring the characteristics of a population over time, the effects of sample loss and population dynamics on the representativeness of panel samples, and issues that must be addressed in measuring spell duration.
Perhaps the single most important lesson to draw from this review is how much our estimates of the number and characteristics of uninsured children are affected by measurement error. Some of this error is widely acknowledged--such as the underreporting of Medicaid enrollment in surveys--but much of it is not. Even when the presence of error is recognized, analysts and policymakers may not know how to take it into account. We may know, for example, that Medicaid enrollment is underreported by 24 percent in a particular survey, but how does that affect the estimate of the uninsured? And how much does the apparent, substantial underreporting of Medicaid contribute to the perception that Medicaid is failing to reach millions of uninsured children? Until we can make progress in separating the measurement error from the reality of uninsurance, our policy solutions will continue to be inefficient, and our ability to measure our successes will continue to be limited.
As federal and state policy analysts ponder how to evaluate the impact of the Children’s Health Insurance Program (CHIP) initiatives authorized by Congress, attention is turning to ways to utilize ongoing surveys as well as to the possibility of states funding their own surveys. Survey data certainly will play an important role in the CHIP evaluations. While administrative data can and will be used to document the enrollment of children in these new programs as well as the expanded Medicaid program, administrative data cannot tell us what is happening to the number of uninsured children. In this context it is important to consider what we know about the use of surveys to measure the incidence of uninsurance among children.
The purpose of this report is to discuss key analytic issues in the use of national survey data to estimate and analyze children’s health insurance coverage. The issues include many that emerged in the course of preparing a literature review on uninsured children (Lewis, Ellwood, and Czajka 1997, 1998) and in conducting analyses of children’s health insurance coverage with the Survey of Income and Program Participation (SIPP) (Czajka 1999). One goal of this report is to provide staff in the Office of the Assistant Secretary for Planning and Evaluation (ASPE) with information that will be helpful in reconciling or at least understanding the reasons for the diverse findings reported in the literature on uninsured children. The second major objective is to outline for the broader research community the factors that need to be considered in designing or using surveys to evaluate the number and characteristics of uninsured children. While we focus on national surveys, many of our observations will apply equally well to the design of surveys at the state level.
Section A discusses how uninsured children have been identified in the major national surveys. It compares alternative approaches, discusses a number of measurement problems that have emerged as important, and concludes with comments on the interpretation of uninsurance as measured in the Current Population Survey (CPS)--the national survey most widely cited with respect to the number of uninsured children. Section B looks at the problem of simulating eligibility for the Medicaid program. Estimates developed with different underlying assumptions suggest that anywhere from 1.5 million to 4 million uninsured children at various points in the 1990s may have been eligible for but not participating in Medicaid. In part because the estimates vary so widely, and also because even the lowest estimate of this population is sizable, the problem of simulating Medicaid eligibility merits extended discussion. Building on this discussion, Section C then examines strategies for calculating participation rates for the Medicaid program. We review issues relating to estimating the number of participants with administrative versus survey data and making legitimate comparisons with estimates of the number of people who were actually eligible to participate in Medicaid. We include a discussion of the problem presented by people who report participation but appear to be ineligible. Section D examines how the underreporting of Medicaid participation in surveys may affect survey estimates of the uninsured, and Section E discusses issues related to the use of longitudinal data to investigate health insurance coverage in general and uninsurance in particular. Finally, Section F reviews our major conclusions.
A. IDENTIFYING UNINSURED CHILDREN IN SURVEYS
Most of what is known about the health insurance coverage of children in the United States has been derived from sample surveys of households. Three ongoing federal surveys collect data on insurance coverage from nationally representative samples, thereby providing a steady source of information on trends in coverage as well as supporting in-depth analyses of issues in health care coverage. Periodically the federal government and private foundations sponsor additional, specialized surveys to gather more detailed information on particular topics. After a brief review of the major federal surveys and three recent specialized surveys, we outline the alternative approaches that are being used to identify uninsured children and consider some of the measurement problems that confront these efforts. We close this section with a discussion of the interpretation of estimates of the uninsured from the most widely cited of these surveys.
1. The Major Surveys
The CPS is a monthly survey whose chief purpose is to provide official estimates of unemployment and other labor force data. In an annual supplement administered each March, the CPS captures information on health insurance coverage. In large part because of the timely release of these data and their consistent measurement over time, the CPS has become the most widely cited source of information on the uninsured. The March supplement is also the source of the official estimates of poverty in the United States. The availability of the poverty measures along with the data on health insurance coverage and a large sample size--50,000 households--that can support state-level estimates have contributed to making the CPS an important resource for research on the uninsured.
The National Health Interview Survey (NHIS) collects data each week on the health status and related characteristics of the population. The principal purpose of the NHIS is to provide estimates of the incidence and prevalence of both acute and chronic morbidity. To achieve this objective, the entire year must be covered. To limit the impact of recall error and reduce respondent burden, the annual interviews (with more than 40,000 households) are distributed over 52 weeks, and respondents are asked to report on their current health status as well as recent utilization of health care services. The interviews include a battery of questions on health insurance coverage. These data can be aggregated over the year to produce an average weekly measure of insurance coverage. Despite some clear advantages of the NHIS measure over the CPS measure of the uninsured, however, the NHIS measure has been much less widely accepted and cited. Even its limitations are much less well known than those of the CPS measure. The long lag with which data from the NHIS are released, relative to the March CPS, is undoubtedly a major factor limiting use of these data on uninsurance.
The last of the three ongoing surveys, the SIPP, is a longitudinal survey that follows a sample of households--a “panel”--for two-and-a-half to four years. Sample households are interviewed every four months and asked to provide detailed monthly data on household composition, employment and income of household members, and other characteristics. Each interview includes a battery of questions on health insurance coverage. Until a major redesign, initiated in 1996, new panels were started every year. When combined, the overlapping panels yielded national samples that were about three-quarters the size of the CPS and NHIS samples. The 1996 panel, which is twice the size of its predecessors, will run for four years; the next panel is not scheduled to begin until 2000. While the enhanced sample size was intended to eliminate the need for overlapping panels, starting a new panel every year also provided a way to maintain the representativeness of SIPP data over time. The loss of overlapping panels, however, weakens the SIPP as a source of reliable data on national trends. Finally, while the redesign has also slowed the release of data from the 1996 panel, SIPP data have never been released in as timely a manner as March CPS data, and, as with the NHIS, this has limited their value as a source of current data on trends.(1)
All three of these surveys are conducted by the U.S. Bureau of the Census. The CPS is a collaborative effort with the Bureau of Labor Statistics (BLS), which bears ultimate responsibility for the labor force statistics. The March supplement and the SIPP, however, are entirely Census Bureau efforts. The NHIS is conducted for the National Center for Health Statistics (NCHS), with the Census Bureau serving, essentially, as the survey contractor.
Periodically, the Agency for Health Care Policy and Research (AHCPR) conducts a panel survey of households to collect detailed longitudinal data on the population’s utilization of the health care system, expenditures on medical care, and health status. The most recent of these efforts, the Medical Expenditure Panel Survey (MEPS), was drawn from households that responded to the NHIS during the middle quarters of 1995. The initial MEPS interviews were conducted by Westat. Like the SIPP, MEPS will collect data at subannual intervals, and new panels will overlap earlier panels, allowing data to be pooled to enhance sample size and improve representativeness (see Section E).
The federal government is not alone in sponsoring large-scale national surveys to measure health insurance coverage and aspects of health care utilization. Private foundations have sponsored a number of surveys as well. While none of these foundation-sponsored efforts has been repeated with sufficient regularity to provide a long-term source of data on trends, the two most prominent of the recent undertakings will collect data from at least two points in time. The household component of the Community Tracking Study (CTS) was conducted by Mathematica Policy Research for the Center for Studying Health System Change, with funding from the Robert Wood Johnson Foundation.(2) The survey was fielded between July 1996 and July 1997 and collected data on current health insurance coverage (that is, at the time of the interview). Interviews were completed with about 32,000 families representing the civilian noninstitutionalized population of the 48 contiguous states and the District of Columbia. More than a third of the sample was concentrated in 12 urban sites that will be the subject of intensive study. The second round survey, which includes both a longitudinal component and a new, independent sample of households, started in 1998 and will be completed in 1999.
In 1997 the Urban Institute, with sponsorship from a group of foundations, fielded the first wave of the National Survey of America’s Families (NSAF).(3) The total sample size of 44,000 households is comparable to that of the NHIS, although the nationally representative sample (except for Alaska and Hawaii) features large samples for 13 states. These 13 states, which account for one-half of the U.S. population, will be the subject of intensive study. The survey was conducted by Westat from February through November of 1997. A second interview with the same sample is currently in the field, and a third interview may be fielded as well. Both the CTS and the NSAF include extensive batteries of questions on health insurance coverage, and both incorporate significant methodological innovations in these measures, which we will describe shortly.
Table 1 presents estimates from each of these surveys of the proportion of children who were uninsured at different times between 1993 and 1997. With the exception of the MEPS estimate, discussed below, all of these estimates represent or are widely interpreted to represent children who were uninsured at a point in time. Estimates refer to children under 19 (CPS and SIPP) or children under 18.(4) We will refer back to this table as we discuss alternative approaches to measuring uninsurance and the sources of error in estimates of the uninsured. Briefly, however, the estimates from the CPS, which we have reported for all five years, show little movement over the first three years but then a 1.1 percentage point rise between 1995 and 1996, with essentially no change between 1996 and 1997. The NHIS estimate in 1993 equals the CPS estimate, but the NHIS series shows a 1.2 percentage point rise between 1993 and 1994, followed by a 1.7 percentage point drop
TABLE 1

ESTIMATES OF THE PERCENTAGE OF CHILDREN WITHOUT HEALTH INSURANCE, 1993-1997

Source of Estimate    1993    1994    1995    1996    1997
CPS                   14.1    14.4    14.0    15.1    15.2
NHIS                  14.1    15.3    13.6    13.4     --
SIPP                  13.9    13.3     --      --      --
MEPS                   --      --      --     15.4     --
CTS                    --      --      --     11.7     --
NSAF                   --      --      --      --     11.9

Notes: Estimates from the CPS and SIPP are based on tabulations of public use files by Mathematica Policy Research, Inc., and refer to children under 19 years of age. Estimates from the other surveys apply to children under 18. The NHIS estimates were reported in NCHS (1998). The estimate from MEPS refers to children who were "uninsured throughout the first half of 1996," meaning three to six months depending on the interview date; the estimate was reported in Weigers et al. (1998). The CTS estimate, reported in Rosenbach and Lewis (1998), is based on interviews conducted between July 1996 and July 1997. The NSAF estimate, reported in Brennan et al. (1999), is based on interviews conducted between February and November 1997.
between 1994 and 1995 and then essentially no change between 1995 and 1996, at which point the NHIS estimate is 1.7 percentage points below the CPS estimate. We should caution, however, that the 1996 NHIS estimate is a preliminary figure based on just the first 5/8 of the sample. For this reason it may not reflect the impact of the implementation of the Personal Responsibility and Work Opportunity Reconciliation Act (PRWORA)--the welfare reform law that went into effect in the late summer of 1996. Some observers have attributed the rise in the CPS estimate of uninsured children between 1995 and 1996 to a reduction in the Medicaid caseload that accompanied the implementation of welfare reform (Fronstin 1997). The SIPP estimate for September 1993, at 13.9 percent, lies within sampling error of the CPS and NHIS estimates for 1993, but the SIPP estimate drops between 1993 and 1994 while both the other series rise. Like the CPS estimate, the MEPS estimate of 15.4 percent purports to be children who were continuously uninsured over a period of time (three to six months in this case), but its value, which nearly equals the CPS estimate, is more consistent with point-in-time estimates. Finally, both the CTS and the NSAF yield estimates below 12 percent for the proportion of children who were uninsured. These estimates for the privately funded surveys lie substantially below the estimates from the federal surveys. In later sections we will explore possible reasons for this difference.
2. Alternative Approaches to Measuring Uninsurance
The surveys discussed in the preceding section have employed somewhat different approaches to measuring uninsurance among children, and other approaches are possible. Here we discuss two dimensions of the measurement of uninsurance: (1) whether uninsurance is measured directly or as a residual and (2) the choice of reference period.
a. Measuring Uninsurance Directly or as a Residual
There is a direct approach and a more commonly used indirect approach to identifying uninsured children in household surveys. The direct approach is to ask respondents if they and their children are currently without health insurance or have been uninsured in the recent past. The alternative, indirect approach is to ask respondents if they are covered by health insurance and then identify the uninsured as those who report no health insurance of any kind. Because interest in measuring the frequency of uninsurance is coupled, ordinarily, with interest in measuring the frequency with which children (or adults) are covered by particular types of health insurance, the more common approach is the indirect one--that is, identifying the uninsured as a “residual,” or those who are left when all children who are reported to be insured are removed. This is the approach used in the CPS, the SIPP, the NHIS, and, for some of its measures, MEPS.
We are not aware of any survey that has attempted to measure uninsurance by first asking if a child is or has been without health insurance.(5) However, both the CTS and the NSAF have employed a variant on the traditional approach that involves first collecting information on insurance coverage and then asking whether those people who appear to be uninsured really were without coverage or had some insurance that was not reported. For example, in the CTS, the sequence on insurance coverage ends with, “(Are you/any of you/either of you) covered by a health insurance plan that I have not mentioned?” Respondents who indicated “no” to every type of coverage were then asked:
According to the information we have, (NAME) does not have health care coverage of any kind. Does (he/she) have health insurance coverage through a plan I might have missed?
If necessary, the interviewer reviewed the eight general types of plans. The respondent could indicate coverage under any of these types of plans or could reaffirm that he or she was not covered by any plan. In the NSAF, each respondent under 65 who reported no coverage was asked,
According to the information you have provided, (NAME OF UNCOVERED FAMILY MEMBER UNDER 65) currently does not have health care coverage. Is that correct?
If the answer was yes, the question was repeated for the next uninsured person. If the answer was no, the respondent was then asked:
At this time, under which of the following plans or programs is (NAME) covered?
The sources of coverage were repeated, and the respondent was allowed to identify coverage that had been missed or to verify that there was indeed no coverage under any type of plan.
In both of these surveys, including this “verification” question converted nontrivial percentages of children from uninsured, initially, to insured. In the CTS, the responses to this question reduced the fraction of children (under 18) who were reported as uninsured from 12.7 percent to 11.7 percent (Rosenbach and Lewis 1998). In the NSAF, the verification question lowered the estimated share of children who were uninsured from about 15 percent to 11.9 percent.(6) While the uninsured are still identified as a residual, the findings from these two surveys suggest that giving respondents the opportunity to verify their status makes a difference in the proportion of children who are estimated to be without health insurance. Curiously, both the CTS and the NSAF end up with about the same proportion of children reported as uninsured. Without the verification question, however, the CTS would have estimated 2 percentage points fewer uninsured children than the NSAF. Is a verification question an equalizer across surveys, helping to compensate for differentially complete reporting of insurance coverage in the questions that precede it? Certainly that is a plausible interpretation of these findings from a survey methodological standpoint. In any event, the results from these two surveys make a strong case for including a verification question as a standard part of a battery of health insurance questions. The NHIS added such a question in 1997, although no results have been reported as yet. The Census Bureau is testing such a question in the SIPP setting. We would hope that these efforts to test the impact of a verification question would be accompanied by cognitive research that can help to explain why respondents change their responses. It would be preferable to improve the earlier questions than to rely on a verification question to change large numbers of responses.
b. The Choice of Reference Period

Estimates of the incidence or frequency of uninsurance are reported typically in one of three ways: (1) the number who were uninsured at a specific point in time, (2) the number who were ever uninsured during a year, or (3) the number who were uninsured for the entire year. Point-in-time estimates are sometimes reported not for a specific point in time, such as January 1, 1999, but for any time during a year. When described in this way, estimates should be interpreted as the average number uninsured at a point in time and not the number who were ever uninsured during the year.
Estimates of the number or percentage of children who were uninsured over different periods of time are useful for different purposes. Estimates of the number of children who were ever uninsured over a year indicate how prevalent uninsurance is. Estimates of children uninsured for an entire year demonstrate the magnitude of chronic uninsurance. Estimates of children uninsured at a point in time reflect a combination of prevalence and duration in that the more time children spend in the state of uninsurance, the more closely the number uninsured at a point in time will approach the number who were ever uninsured.
Table 2 presents estimates for all three types of reference periods, based on data from the 1992 SIPP panel. While 13.1 percent of children under 19 were uninsured in September 1993, 21.7 percent were ever uninsured during the year, and 6.3 percent were uninsured for the entire year.
Measuring uninsurance as a residual has implications for the length of time over which children are identified as uninsured. When a survey identifies the uninsured as a residual, the duration of uninsurance that is measured is generally synonymous with the reference period. That is, children for whom no insurance coverage is reported during the reference period are, by definition, uninsured for the entire period. To identify periods of uninsurance occurring within a reference period in which there were also periods of insurance coverage, it is necessary to do one of the following: (1) ask about such periods of uninsurance directly, (2) ask whether the insurance coverage extended to the entire period, or (3) break the total reference period into multiple, shorter periods, such as months, and establish whether a person was insured or uninsured in each month.(7)
In the March CPS, respondents are asked if they were ever covered by any of several types of insurance during the previous calendar year. Respondents can indicate that they had multiple types of coverage during the year. But because the survey instrument does not ask if respondents were ever uninsured, or how long they were covered, respondents cannot report that they were covered for part of the year and uninsured for the rest.
TABLE 2. ESTIMATES OF THE PROPORTION OF CHILDREN UNDER 19 WHO WERE UNINSURED FOR DIFFERENT PERIODS OF TIME
|Uninsured at a Point in Time (September 1993)||13.1%|
|Ever Uninsured in Year||21.7%|
|Uninsured Continuously throughout the Year||6.3%|
In the SIPP, respondents are asked to report whether they had any of several types of insurance coverage during each of the four preceding months. The month is the reference period. To be identified as uninsured during a given month, a child must be reported as having had no coverage during the month. Thus, a child is classified as uninsured during a month only if the child was uninsured for the entire month.(8) With the SIPP data, however, we can aggregate individual months into years or even longer periods, and we can identify children who were ever uninsured during the year, where being ever uninsured means being uninsured for at least one full calendar month.
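The monthly aggregation just described can be illustrated with a small sketch. The code below is purely illustrative (the data and function are hypothetical, not actual SIPP processing), but it shows how monthly coverage indicators roll up into the three reference-period measures discussed above.

```python
# Illustrative sketch: aggregating monthly insurance indicators into the
# three reference-period measures. Each child is represented by 12
# booleans, one per calendar month, True = insured that full month.

def uninsurance_measures(children, point_in_time_month=8):
    """Return (point-in-time, ever-in-year, full-year) uninsurance rates.

    children: list of 12-element lists of monthly insured flags.
    point_in_time_month: 0-based month index (8 = September, to mirror
        the SIPP estimates in Table 2).
    """
    n = len(children)
    point = sum(not c[point_in_time_month] for c in children) / n
    # "Ever uninsured" here means uninsured for at least one full month
    ever = sum(any(not m for m in c) for c in children) / n
    # "Full year" means uninsured in all 12 months
    full_year = sum(all(not m for m in c) for c in children) / n
    return point, ever, full_year

# Hypothetical data: one child insured all year, one uninsured all year,
# and one who loses coverage in July and regains it in October.
kids = [
    [True] * 12,
    [False] * 12,
    [True] * 6 + [False] * 3 + [True] * 3,
]
point, ever, full_year = uninsurance_measures(kids)
print(point, ever, full_year)  # point = ever = 2/3, full_year = 1/3 here
```

As the toy sample shows, the three measures can diverge substantially for the same children, which is why the choice of reference period matters.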
The redesigned NHIS, the CTS, and the NSAF all capture insurance status at the time of the interview--that is, literally at a point in time. Other things being equal, this approach would appear likely to yield the most error-free reports and, in addition, the least biased estimates of coverage. It also has the advantage of requiring no recall. Respondents are not asked to remember when coverage began or ended, only to indicate whether they currently have it or not.
The value of estimates for different types of reference periods depends, in part, on the accuracy with which they can be measured. If the number of children uninsured at a point in time can be measured more accurately than the number ever uninsured during a year or the number uninsured for the entire year, then there is a sense in which the point-in-time estimates are more valuable. In the next section we discuss measurement problems that affect estimates of the uninsured.
3. Sources of Error in Estimates of the Uninsured
There are a number of sources of error encountered in attempting to measure uninsurance, and these affect the comparability of estimates from different surveys. These include certain limitations inherent in measuring uninsurance as a residual, as it is usually done; the possibility that respondents may not be aware of existing coverage; the bias introduced by respondents’ imperfect recall; the sensitivity of responses to question design; and the impact of basic survey design choices.
a. Limitations Inherent in Measuring Uninsurance as a Residual
Perhaps the most significant problem with measuring uninsurance as a residual is that a small error rate in the reporting of insurance becomes a large error in the estimate of the uninsured. With the number of children insured at a point in time being eight to nine times the number without insurance, and the number ever insured during a year being 18 to 19 times the number never insured, errors in the reporting of insurance coverage are multiplied many times in their impact on estimates of the uninsured. Based on the SIPP estimates reported in Table 2, a 6 to 7 percent error in the reporting of children who ever had health insurance would double the estimated number who had no insurance. In Section 4, below, we argue that this is what accounts for the fact that the CPS estimate of the uninsured resembles an estimate of children uninsured at a point in time rather than children uninsured for the entire year, which is what the questions are designed to yield.(9)
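The arithmetic behind this amplification can be made concrete with a short sketch using the Table 2 figures; the function and numbers are illustrative only.

```python
# Sketch of the error-amplification arithmetic described above, using the
# SIPP figures from Table 2. Under residual measurement, any share of the
# truly insured whose coverage goes unreported is added directly to the
# (much smaller) uninsured residual.

def apparent_uninsured_rate(true_uninsured_rate, underreport_rate):
    """Rate of children classified as uninsured when a fraction of the
    insured fail to have their coverage reported."""
    insured = 1.0 - true_uninsured_rate
    return true_uninsured_rate + insured * underreport_rate

# Full-year uninsurance from Table 2: 6.3 percent.
true_rate = 0.063
# A roughly 6.7 percent reporting error among the ever-insured...
apparent = apparent_uninsured_rate(true_rate, 0.067)
print(round(apparent, 3))  # 0.126: about double the 6.3 percent true rate
```

This is why a reporting error that would be negligible for an estimate of the insured can dominate an estimate of the uninsured.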
Another implication of measuring uninsurance as a residual can be seen in the CPS estimates of the frequency of uninsurance among infants. The health insurance questions in the March CPS refer to coverage in the preceding calendar year--that is, the year ending December 31. If parents answer the CPS questions as intended, a child born after the end of the year cannot be identified as having had coverage during the previous year. With no reported coverage, such a child would be classified as uninsured. If all children born after the end of the year were classified as uninsured, this would add about one-sixth of all infants to the estimated number uninsured. Because the March CPS public use files lack a field indicating the month of birth, data users cannot identify infants born after the end of the year and cannot exclude them from their analyses. Is there any evidence that uninsurance is overstated among infants in the CPS? Table 3 addresses this question by comparing estimates of the rate of uninsurance for infants and older children, based on the March CPS and the SIPP. The CPS estimates of the proportion of infants who are uninsured are markedly higher than the SIPP estimates in both the 1993 and 1994 reference years: 11.5 versus 7.7 percent in 1993 and 17.3 versus 9.3 percent in 1994.
b. Awareness of Coverage
People may have insurance coverage without being aware that they have it. While this lack of awareness may seem improbable, both the CPS and SIPP provide direct evidence with respect to Medicaid coverage. Prior to welfare reform, families that received Aid to Families with Dependent Children (AFDC) were covered by Medicaid as well. Nevertheless, surveys that asked respondents about AFDC as well as Medicaid found that nontrivial numbers reported receiving AFDC but not being covered by Medicaid. Were such people unaware that they were covered by Medicaid, or did they know Medicaid by another name and not recognize the name(s) used in the surveys?(10)
We do not know the answer. To correct for such instances, the Census Bureau employs in both the CPS and SIPP a number of “logical imputations” or edits to reported health insurance coverage. All adult AFDC recipients and their children are assigned Medicaid coverage, for example. Of the 28.2 million people estimated to have had Medicaid coverage in 1996, based on the March 1997 CPS, 4.6 million or 16 percent had their Medicaid coverage logically imputed in this manner (Rosenbach and Lewis 1998). Most if not all of these 4.6 million would have been counted as uninsured if not for the Census Bureau’s edits. With AFDC, which accounted for half of Medicaid enrollment, being replaced by the smaller Temporary Assistance for Needy Families (TANF) program, the number of logical imputations will be reduced significantly, which could increase the number of children who in fact have Medicaid coverage but are counted in the CPS and SIPP as uninsured.(11)
Table 3. ESTIMATES OF THE PROPORTION OF CHILDREN UNINSURED BY AGE: COMPARISON OF MARCH CPS AND SIPP, SELECTED YEARS
|Survey and Date||less than 1||1 to 5||6 to 14||15 to 18||Total|
|CPS, March 1994||11.5||11.6||13.7||19.4||14.1|
|CPS, March 1995||17.3||13.2||14.0||16.5||14.4|
|CPS, March 1996||16.7||12.7||13.7||16.1||14.0|
|SIPP, September 1993||7.7||10.9||13.7||16.7||13.1|
|SIPP, September 1994||9.3||10.5||13.1||16.3||12.7|
|SOURCE: Tabulations of public use files, CPS and SIPP.|
c. Recall Error

It is well known among experienced survey researchers that respondent recall of events in the past is imperfect and that recall error grows with the length of time between the event and the present. Error also increases with the amount of change in people’s lives. Respondents with steady employment have less difficulty recalling details of their employment than do respondents with intermittent jobs and uneven hours of work. Similarly, respondents who have had continuous health insurance coverage can more easily recall their coverage history than respondents with intermittent coverage. Obtaining accurate reports from respondents with complex histories places demands upon the designers of surveys and those who conduct the interviews. Panel surveys that ascertain health insurance coverage (and other information) with repeated interviews covering short reference periods are much more likely to obtain reliable estimates of coverage over time than one-time surveys that ask respondents to recall the details of the past year or more.
d. Sensitivity to Question Design
Even when recall is not an issue, when insurance coverage is measured “at the present time,” survey questions that appear to request more or less the same information can generate markedly different responses. This point was demonstrated in dramatic fashion when the Census Bureau introduced some experimental questions into the CPS to measure current health insurance coverage. At the end of the sequence of questions used to measure insurance coverage during the preceding year, respondents were asked:
These next questions are about your CURRENT health insurance coverage, that is, health coverage last week. (Were you/Was anyone in this household) covered by ANY type of health insurance plan last week?
Those who answered in the affirmative were asked to identify who in the household was covered and then, for each such person, by what types of plans he or she was covered. This sequence of questions, which first appeared in the March 1994 survey, yielded an uninsured rate that was about double the rate measured by the NHIS and the SIPP, and the experimental questions were discontinued with the March 1998 supplement.
Even if these questions had not followed a lengthy sequence of items asking about several sources of coverage in the preceding year, it would have been difficult to imagine that they could have generated such low estimates of coverage. That they did so despite the questions that preceded them is hard to fathom, and it underscores the point that researchers cannot simply write out a set of health insurance coverage questions and expect to obtain the true measure of uninsurance--or even a good measure of uninsurance, necessarily. It is not at all clear why this should be so. Health insurance coverage appears to be straightforward enough. Generally, people either have it or they don’t. Yet the Census Bureau’s experience sends a powerful message that questions about health insurance coverage can yield rather unanticipated results. Researchers who are fielding surveys that attempt to measure health insurance coverage would be well-advised to be wary of constructing new questions unless they can also conduct very extensive pretesting. In the absence of thorough testing, it is better to borrow from existing and thoroughly tested question sets rather than construct new questions from scratch.
e. Impact of Survey Design and Implementation
While perhaps not as important as question wording, differences in the design and implementation of surveys can have a major impact on estimates of the uninsured. These differences include the choice of universe and the level of coverage achieved, the response rate among eligible households, the use of proxy respondents, the choice of interview mode, and the use of imputation.
Universe and Coverage. Surveys may differ in the universes that they sample and in how fully they cover these universes. Typically, surveys of the U.S. resident population exclude the homeless, the institutionalized population--that is, residents of nursing homes, mental hospitals, and correctional institutions, primarily--and members of the Armed Forces living in barracks. There may be other exclusions as well. For example, household surveys do not always include Alaska and Hawaii in their sampling frames.
All surveys--even the decennial census--suffer from undercoverage; that is, parts of the universe are unintentionally excluded from representation in the sample. In a household-based or “area frame” sample, undercoverage can be attributed to three principal causes: (1) failure to identify all street addresses in the sample area, (2) failure to identify all housing units within the listed addresses, and (3) failure to identify all household members within the sampled housing units. Nonresponse, discussed below, is not undercoverage, although the absence of household listings for nonresponding households can contribute to coverage errors (in either direction). The 1990 census undercounted U.S. residents by about 1.6 percent.(12) Sample surveys have much greater undercoverage. The Census Bureau has estimated the undercoverage of the civilian noninstitutionalized population in the monthly CPS to be about 8 percent in recent years. Undercoverage varies by demographic group. For children under 15, undercoverage is closer to 7 percent than to 8 percent. But among older teens it approaches 13 percent, and for black males within this group the rate of undercoverage reaches 25 to 30 percent.
To provide at least a nominal correction for undercoverage, the Census Bureau and other agencies or organizations adjust the sample weights so that they reproduce selected population totals. These population totals or “controls” may even incorporate adjustments for the census undercount.(13) This “post-stratification,” a statistical operation that serves other purposes as well, is based on a limited set of demographic characteristics--age, sex, race and Hispanic origin, typically, and sometimes state.(14) Other characteristics measured in the surveys are affected by this post-stratification to the extent that they covary with demographic characteristics. We know, for example, that Medicaid enrollment and uninsurance vary quite substantially by age, race, and Hispanic origin, so a coverage adjustment based on these demographic characteristics will improve the estimates of Medicaid enrollment and uninsurance. To the extent that people who are missing from the sampling frame differ from the covered population even within these demographic groups, however, the coverage adjustment will compensate only partially for the effects of undercoverage on the final estimates. It is quite plausible, for example, that the Hispanic children who are missed by the CPS have an even higher rate of uninsurance than those who are interviewed. We would suggest, therefore, that survey undercoverage, even with a demographic adjustment to population totals corrected for census undercount, contributes to underestimation of uninsured children.
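As an illustration of the ratio-adjustment idea behind post-stratification, the following sketch scales sample weights within demographic cells to match known population controls. The cell definitions and numbers are hypothetical; the Census Bureau's actual procedures are considerably more elaborate.

```python
# Minimal sketch of post-stratification (ratio adjustment) to demographic
# control totals. Cells and figures are hypothetical, not the Census
# Bureau's actual procedure.

def poststratify(weights, cells, controls):
    """Scale each respondent's weight so that weighted totals match the
    population control for the respondent's demographic cell.

    weights: initial sample weight per respondent
    cells: demographic cell label per respondent (e.g., an age-race group)
    controls: dict mapping cell label -> known population total
    """
    weighted_totals = {}
    for w, c in zip(weights, cells):
        weighted_totals[c] = weighted_totals.get(c, 0.0) + w
    # One ratio factor per cell: control total / survey-weighted total
    factors = {c: controls[c] / t for c, t in weighted_totals.items()}
    return [w * factors[c] for w, c in zip(weights, cells)]

# Hypothetical example: the survey's weighted count of teens falls short
# of the population control (undercoverage), so teen weights are inflated.
weights = [100.0, 100.0, 100.0, 100.0]
cells = ["child", "child", "teen", "teen"]
controls = {"child": 200.0, "teen": 250.0}
new_weights = poststratify(weights, cells, controls)
print(new_weights)  # teen weights scaled from 100 to 125
```

Note that the adjustment can only operate on the characteristics that define the cells; as the text explains, differences between covered and uncovered people within a cell are not corrected.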
Response Rate. Surveys differ in the fraction of their samples that they succeed in interviewing. Federal government survey agencies appear to enjoy a premium in this regard. The Census Bureau, which conducts both the CPS and the SIPP and carries out the field operations for the NHIS, reports the highest response rates among the surveys that provide our principal measures of health insurance coverage. For the 1997 March supplement to the CPS, the Census Bureau reported a response rate of 84 percent.(15) For the first interview of the 1992 SIPP panel the Bureau achieved a response rate of 91 percent, with the cumulative response rate falling to 74 percent by the ninth interview. The 1995 NHIS response rate for households that were eligible for selection into the MEPS was 94 percent (Cohen 1997). In contrast to these rates, MPR obtained a 65 percent response rate for the CTS, and Westat achieved a comparable percentage for the NSAF, which includes a substantial oversampling of lower income households. For the first round of the MEPS, Westat secured an 83 percent response rate among the 94 percent of eligible households that responded to the NHIS in the second and third quarters of 1995, yielding a joint response rate of 78 percent (Cohen 1997). These response rates are based on people with whom interviews were completed, but there may have been additional nonresponse to individual items in the health insurance sequence. However, unlike more sensitive items, such as those pertaining to income, health insurance questions do not appear to generate much item nonresponse.
The reported response rates also do not include undercoverage, which varies somewhat from survey to survey. Arguably, people who were omitted from the sampling frame never had an opportunity to respond and, therefore, may have less in common with those who refused to be interviewed than they do with respondents. Nevertheless, their absence from the collected data represents a potential source of bias and one for which some adjustment is desirable. Generally speaking, however, less is known about the characteristics of people omitted from the sampling frame than about those who were included in the sampling frame but could not be interviewed. Hence the adjustments for undercoverage, when they are carried out, tend to be based on more limited characteristics than the adjustments for nonresponse among sampled households.
How important is nonresponse as a source of bias in estimates of health insurance coverage? We are not aware of any information with which it is possible to address that question. Certainly the nearly 30 percent difference in response rates between the NHIS and the CTS or NSAF could have a marked impact on the estimated frequency of a characteristic (uninsurance) that occurs among less than 15 percent of all children, but we have no direct evidence that it does.
Proxy Respondents. Some members of a household may not be present when the household is interviewed. Surveys differ in whether and how readily they allow other household members to serve as “proxy” respondents. From the standpoint of data quality, the drawback of a proxy respondent is the increased likelihood that information will be misreported or that some information will not be reported at all. This is particularly true when the respondent and proxy are not members of the same family. For this reason some surveys restrict proxy respondents to family members. Ultimately, however, some responses are generally better than none, so it is rare that a survey will rule out particular types of proxy responses entirely. Rather, proxy responses may be limited to “last resort” situations--that is, as alternatives to closing out cases as unit nonrespondents. For this reason, it is important to compare not only how surveys differ with respect to their stated policies on proxy respondents but the actual frequency with which proxy respondents are used and the frequency with which household members are reported as missing.
Children represent a special case. While all the surveys we have discussed collect data on children, the surveys differ with respect to whether these children are treated as respondents per se or merely other members of the family or household, about whom information is collected only or largely indirectly. For example, both the CPS and SIPP define respondents as all household members 15 and older. Some information, such as income, is not collected for younger children at all while health insurance coverage is collected through questions that ask respondents who else in the household is included under specific plans. With this indirect approach, children are more susceptible to being missed.
Mode: Telephone Versus In-person. Surveys may be conducted largely or entirely by telephone or largely or entirely in-person.(16) There are two aspects of the survey mode that are important to recognize. The first bears on population coverage while the second pertains to how the data are collected.
Pure telephone surveys, which are limited to households with telephones, cover a biased subset of the universe that is covered by in-person surveys. Methodologies have been developed to adjust such surveys for their noncoverage of households that were without telephone service during the survey period. These methodologies use the responses from households that report having had their telephone service interrupted during some previous number of months to compensate for the exclusion of households that had no opportunity to appear in the sample. How effectively such adjustments substitute for actually including households without telephones is likely to vary across the characteristics being measured, and for this reason some telephone surveys include a complementary in-person sample to obtain responses from households without telephones.(17)
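A highly simplified sketch of this type of adjustment follows; the function and figures are hypothetical, and production adjustments are considerably more refined.

```python
# Hedged sketch of the telephone-noncoverage adjustment described above
# (not any survey's production code). Households that report a recent
# interruption in telephone service stand in for households with no
# telephone at all, so their weights are inflated to cover both groups.

def adjust_for_nonphone(weights, had_interruption, nonphone_population):
    """Inflate the weights of interrupted-service households so that they
    also represent the excluded no-telephone population.

    weights: base weight per responding (telephone) household
    had_interruption: flag per household (service interrupted recently)
    nonphone_population: assumed size of the no-telephone universe
    """
    transient_total = sum(w for w, f in zip(weights, had_interruption) if f)
    # Interrupted-service households now represent themselves plus the
    # no-telephone population.
    factor = (transient_total + nonphone_population) / transient_total
    return [w * factor if f else w for w, f in zip(weights, had_interruption)]

# Hypothetical example: two interrupted-service households absorb an
# assumed no-telephone population of 100.
new_w = adjust_for_nonphone([50.0, 50.0, 100.0, 100.0],
                            [True, True, False, False], 100.0)
print(new_w)  # [100.0, 100.0, 100.0, 100.0]
```

The adjustment is only as good as the assumption that interrupted-service households resemble no-telephone households, which, as the text notes, is likely to hold better for some characteristics than for others.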
In addition to the coverage issue, distinguishing telephone from in-person interviews is important because the use of one mode versus the other can affect the way in which information is collected and the reliability with which responses are reported. Telephone surveys preclude showing a respondent any printed material during the interview (such as lists of health insurance providers), and they limit the rapport that can develop between an interviewer and a respondent. Furthermore, the longer the interview, the more difficult it is to maintain the respondent’s attention on the telephone, so data quality in long interviews may suffer. On the other hand, conducting interviews by telephone may limit interviewer bias and make respondents feel less uncomfortable about reporting personal information. Moreover, until recently, only telephone interviewing allowed for the use of computer-based survey instruments that could minimize the risk of interviewer error in administering instruments with complex branching and skip patterns. For all of these reasons, survey researchers recognize that there can be “mode effects” on responses. The different modes may elicit different mean responses to the same questions, with neither mode being consistently more reliable than the other. To minimize differential mode effects when part of a telephone survey is conducted in person, survey organizations sometimes conduct the in-person interviews by cellular telephone, which field representatives loan to the respondents.
Panel surveys allow for another possibility: using a household-based sample design and conducting at least the initial interview in-person but using the telephone for subsequent interviews. Both the CPS and the SIPP have utilized this approach. In the CPS, the first and last of the eight interviews are conducted in person while the middle six are generally conducted by telephone. For any given month, then, about one-quarter of the interviews are conducted in person.(18)
The recent introduction of computer-assisted personal interviewing (CAPI) has created an important variation on the in-person mode and one with its own mode effects. In some respects, CAPI may be more like computer-assisted telephone interviewing than in-person interviewing with a paper and pencil instrument. The methodology is too new to have generated much information on its mode effects yet.
Imputation Methodology. Surveys differ in the extent to which they impute values to questions with missing responses and in the rigorousness of their imputation methodologies. For example, both the CPS and SIPP impute all missing responses, and they use methodologies that have been developed to do this very efficiently. For the SIPP imputation algorithms, over time the Census Bureau has made increasing use of the responses reported in adjacent waves of the survey. Generally, questions about health insurance coverage elicit very little nonresponse, so imputation strategies are less important than they are for more sensitive items, such as income. Nevertheless, in the March 1997 CPS, the Census Bureau imputed 10 percent of the “reported” Medicaid participants (Rosenbach and Lewis 1999).(19) In the NHIS, responses of “don’t know” are not replaced by imputed values, and in published tabulations the insurance coverage of people whose coverage cannot be determined is treated as unknown. While this may not have a large impact on the estimated rates of uninsurance among children or adults, this strategy does make it more difficult for data users to replicate published results.
4. Interpreting Uninsurance as Measured in the CPS
The estimate of uninsured children provided annually by the March supplement to the CPS has become the most widely accepted and frequently cited estimate of the uninsured. At this point, only the CPS provides annual estimates with relatively little lag, and only the CPS is able to provide state-level estimates, albeit with considerable imprecision.(20) But what, exactly, does the CPS measure? The renewed interest in the CPS as a source of state-level estimates for CHIP makes it important that we answer this question.(21) While the CPS health insurance questions ask about coverage over the course of the previous calendar year, implying that the estimate of uninsurance identifies people who had no insurance at all during that year, the magnitude of the estimate has moved researchers and policymakers to reinterpret the CPS measure of the uninsured as providing an indicator of uninsurance at a point in time.(22) How can this interpretation be reconciled with the wording of the questions themselves, and how far can we carry this interpretation in examining the time trend and other covariates of uninsurance? We consider these questions below.
a. In What Sense Does the CPS Measure Uninsurance at a Point in Time?
There is little reason to doubt that the CPS respondents are answering the health insurance questions in the manner that was intended. That is, they are attempting to report whether they ever had each type of coverage in the preceding year. We can say this, in part, because the health insurance questions appear near the end of the survey, after respondents have reported their employment status, sources and amounts of income, and other characteristics for the preceding year. By the time they get to the health insurance questions, respondents have become thoroughly familiar with the concept of “ever in the preceding year.” More importantly, however, there is empirical evidence that reported coverage is more consistent with annual coverage than with coverage at a point in time. Consider Medicaid, for example. Table 4 compares CPS and SIPP estimates of children under 19 who were reported to be covered by Medicaid in 1993 and 1994. The CPS estimates match very closely the SIPP estimates of children ever covered in a year whereas the CPS estimates exceed the SIPP estimates of children covered at a point in time by 26 to 28 percent.(23)
TABLE 4. CPS AND SIPP ESTIMATES OF CHILDREN UNDER 19 REPORTED COVERED BY MEDICAID

| Year | CPS | SIPP Annual Ever | SIPP Point in Time | CPS as Percent of SIPP: Annual Ever | CPS as Percent of SIPP: Point in Time |

NOTES: The SIPP annual estimates refer to the federal fiscal year. The point-in-time estimates refer to September of each year. The CPS estimates refer to the calendar year. Both sets of estimates were obtained by tabulating public use data files. The CPS estimates are from the March 1994 and March 1995 surveys. The SIPP estimates are from the 1992 panel. The SIPP estimates here actually understate what SIPP finds, as these estimates refer to the survivors of the population sampled in early 1992; SIPP also understates births. SIPP point-in-time estimates made with the calendar month weights would be higher, as those weights are controlled to the full population. Annual-ever estimates cannot be produced for the calendar month samples, however.
How, then, can the frequency with which the CPS respondents report no coverage during the year imply rates of uninsurance that are double what we would expect for children uninsured all year and about equal to what we would expect for children uninsured at a point in time? The answer, we suggest, lies in the extent to which coverage for the year is underreported. That is, CPS respondents answering questions in March either forget or otherwise fail to report health insurance coverage that they had in the previous year, and they do so with greater frequency than respondents to other surveys reporting more current coverage. Presumably, coverage that ended early in the year is more likely to be missed than coverage that ended later in the year or continued to the present. Coverage that started late in the year may be susceptible to underreporting as well, with respondents who are uncertain about the starting date having some tendency to project it into the current year. With more than 90 percent of the population having had coverage for at least some part of the year, only a small fraction--about 8 percent of those with any coverage--need to fail to report their coverage to account for the observed result.
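The arithmetic behind this explanation can be sketched with purely illustrative numbers; none of the rates below are actual CPS estimates, and the specific values are assumptions chosen only to mirror the magnitudes discussed in the text:

```python
# Hedged sketch: how a modest underreporting rate can roughly double the
# apparent all-year uninsured count. All rates here are illustrative.

true_all_year_uninsured = 0.08                       # hypothetical: 8% truly uninsured all year
covered_some_of_year = 1 - true_all_year_uninsured   # more than 90% had some coverage

failure_to_report = 0.08   # ~8% of those with any coverage report none at all

# Observed "uninsured" = truly uninsured all year, plus covered people
# whose coverage goes unreported and who thus appear uninsured
observed_uninsured = (true_all_year_uninsured
                      + covered_some_of_year * failure_to_report)

print(round(observed_uninsured, 3))  # ~0.154, roughly double the 8% all-year rate
```

Under these assumptions the observed rate lands near typical point-in-time estimates even though the question asks about the full year, which is the reconciliation the text proposes.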
Is it simply by chance that CPS respondents underreport their coverage in the previous year to such an extent that the number who appear to have had no coverage at all rises to the same level as independent estimates of the number who were without coverage at a point in time? Or is the phenomenon the result of a more systematic process that in some sense ensures the observed outcome? The answer is important because the more the phenomenon is due to chance, the less confident we can be that the CPS estimate of the uninsured will track the true number of uninsured children (or adults) over time. Similarly, the more the resemblance to a point-in-time estimate is due to chance, the less we can rely on the CPS estimate of the uninsured to tell us how uninsurance at a point in time varies by children’s characteristics--including state of residence.
b. Covariates of Uninsurance
Time is a critical covariate of uninsurance. The CPS measure of the uninsured is used by many policy analysts to assess the trend in uninsurance for the population as a whole and for subgroups of that population. But, in truth, how well does the CPS measure track the actual level of uninsurance? There is no definitive source on the uninsured, but both the NHIS and the SIPP provide annual estimates that can be compared with the CPS. Do these estimates show the same trends over time, even though the estimates themselves may differ? The estimates presented in Table 1 are inconclusive in this regard. The CPS time series is clearly less volatile than the NHIS time series, with the latter showing large swings between 1993 and 1994 and between 1994 and 1995. Between 1995 and 1996, the CPS uninsured rate shows an upswing that observers have interpreted as a response to the implementation of welfare reform. The NHIS estimate for 1996 predates this event, as we explained earlier. With a redesign of the survey and a revision of the health insurance questions in 1997, the continuation of the NHIS time series once the 1997 data are released will shed little if any light on the validity of the CPS series. The SIPP data are too limited to provide a useful point of comparison.(24)
Even if the CPS estimate of the uninsured were a sufficiently reliable proxy for point-in-time uninsured to provide an accurate indicator of trends, this gives us no assurance that the CPS measure can accurately reflect the relationship between point-in-time uninsurance and other covariates besides time. We have already presented evidence that for reasons related, no doubt, to the measurement of uninsurance as a residual, combined with the peculiar reference period of the survey, the CPS overstates the proportion of infants who are uninsured (see Table 3). How confident can we be that the CPS can provide adequate estimates of the relationship between children’s uninsurance and very complex variables, such as Medicaid eligibility? This is an important question but one that will require more research to answer.
As a final note, the success of verification questions in the CTS and NSAF is prompting consideration of including such questions in the SIPP and the CPS. In light of our discussion of the CPS measure, we must wonder what the impact would be of introducing a verification question into the CPS. Rather than improving the point-in-time representation of the CPS, might this not move the CPS much closer to estimating the number of people who truly were uninsured throughout the preceding year? Arguably, this would reduce the policy value of the CPS measure because uninsurance throughout the year is too limited a phenomenon to be embraced as our principal measure of uninsurance. Of course, policy analysts could choose not to use the verification question, but this would only make it that much more difficult to assert that the data being reported in the CPS provide a reliable measure of uninsurance at a point in time.
The estimates of the incidence of uninsurance among children presented in Table 1 raise an obvious question: Which estimate is most accurate? The short answer is that we do not know; there is no agreement in the research community. Clearly, the CPS estimate has been the most widely cited, but, as we explained, its timeliness and consistency account for this more than any presumption of superior accuracy. When the estimate from the CTS was first announced, it was greeted with skepticism. Now that the NSAF, using similar survey methods, has produced a nearly identical estimate, the CTS’ credibility has been enhanced, and the CTS number, in turn, has paved the way for broader acceptance of the NSAF estimate. Yet neither survey has addressed what was felt to be the biggest source of overestimation of the uninsured in the federal surveys: namely, the apparent, substantial underreporting of Medicaid enrollment, which we discuss in Section C.(25)
Much attention has focused on the impact of the verification questions in the CTS and NSAF. Indeed, Urban Institute researchers have indicated that without the verification question the estimate of uninsured children in the NSAF would be as high as it is in the CPS. Yet analysis of CTS data has shown that the CTS estimate would have been 2 percentage points below the CPS estimate even without the verification question (Rosenbach and Lewis 1998). With the addition of a verification question to the NHIS in 1997 and an experimental application to the SIPP questions under way, we will soon know if the presumably better reporting of coverage elicited by a verification question really does account for the lower estimates obtained by the CTS and NSAF. Our suspicion is that the verification question will not account for all or even most of the difference, which leads us to consider the differences in survey methodology detailed above. From a survey design perspective, the selection of a sample based on telephones is a very different exercise from the drawing of a household sample from a list frame (CPS and SIPP) or an area frame (NHIS). Both the CTS and NSAF include nontelephone households in their samples as well as additional corrections for potential bias. Kenney et al. (1999) have documented that the NSAF matches the income distribution and other characteristics of the population as reported by the CPS, so there are no clear differences that we can point to as evidence that the CTS and NSAF samples include too few households with uninsured children. Both surveys also sample children within households rather than collecting data on all children, but there is no evidence as yet that the sampling or subsequent weighting of these children was biased. 
Nevertheless, it would seem that more detailed evaluation of the potential impact of sample design on the differences between the CTS and NSAF, on the one hand, and the federal surveys, on the other, is warranted--and, indeed, necessary if we are to understand the differences that we see in Table 1.
B. SIMULATING MEDICAID ELIGIBILITY
The measurement or simulation of Medicaid eligibility among the members of a survey sample is important within the context of studying uninsurance because of the impact that Medicaid participation can have on the number of uninsured children. Because the rules for determining Medicaid eligibility are complex, however, simulating Medicaid eligibility is no small undertaking. Furthermore, replicating all of Medicaid’s complex provisions demands more from the data than any existing survey can provide.
1. Reasons for Simulating Eligibility
There are two principal reasons for simulating Medicaid eligibility in the context of studying children’s health insurance coverage. The first is to obtain denominators for the calculation of Medicaid participation rates--for all eligible children and for subgroups of this population. The second is to estimate how many uninsured children--and what percentage of the total--may be eligible for Medicaid but not participating.
2. Complexity of Eligibility Determination
The regulations governing eligibility for the Medicaid program are exceedingly complex. There are numerous routes by which a child may qualify for enrollment, and many of the eligibility provisions and parameters vary by state. Simply documenting the published eligibility criteria is a sizable chore, and operational aspects of the Medicaid eligibility determination--such as the definition and application of income disregards--may not be published or readily accessible. Relative to other means-tested programs, Medicaid presents a far greater challenge for simulation of program eligibility--both in terms of the complexity of the rules and the data requirements that they generate. Even the most sophisticated simulation models of Medicaid eligibility employ many simplifications (see, for example, Giannarelli 1992). More typically, simulations of Medicaid eligibility are highly simplified. For example, the General Accounting Office has reported findings based on simulating only the federally mandated poverty-related expansions (U.S. GAO 1995). While this captures the majority of Medicaid-eligible children born after September 30, 1983 (because eligibility via cash assistance programs has lower income thresholds), it attributes no eligibility at all to older children.
The data requirements for a Medicaid eligibility simulation are substantial. As in most means-tested entitlement programs, eligibility determinations are based on monthly income, and countable income reflects a number of potential disregards for which the source of income and various kinds of monthly expenditures may be relevant. Participation in certain programs makes families or children eligible for Medicaid, so data on program participation are needed. Because people who were eligible for AFDC were often eligible for Medicaid even if they did not participate in AFDC, the Medicaid eligibility determination incorporated the AFDC eligibility rules, and this has been extended in some form into the post-welfare reform era. To simulate AFDC and Medicaid eligibility it is necessary to construct several alternative family income measures, which must be compared against sets of state-specific parameters that vary by family size. AFDC and some of the other Medicaid provisions are limited to particular types of families, creating a need for family demographic and economic data. In addition, the AFDC unit may be a subset of the entire co-resident family, and other aspects of the Medicaid eligibility determination may exclude some family members as well, so additional family demographic data are needed, along with the economic characteristics of individual family members. Furthermore, AFDC imposed a resources test, and other components of the Medicaid program have resource limits as well, so a simulation must include measures of vehicles as well as financial resources. Finally, expenditures on health care are instrumental to the determination of eligibility under the medically needy provisions, so health care expenditure data are needed to fully simulate this component of the Medicaid program.
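To make these requirements concrete, the following is a minimal sketch of the kind of monthly income test a highly simplified simulation might apply, in the spirit of the poverty-related-expansions-only approach described above. Every threshold, disregard, and family parameter here is a hypothetical placeholder, not an actual program rule; a real simulation would use state-specific parameters, actual disregard definitions, and properly constructed eligibility units:

```python
# Hedged sketch of a highly simplified monthly Medicaid eligibility test.
# All parameters are hypothetical placeholders, not actual program rules.

from dataclasses import dataclass

# Hypothetical monthly poverty guideline by family size (placeholder values)
POVERTY_GUIDELINE = {1: 800, 2: 1080, 3: 1360, 4: 1640}

@dataclass
class Child:
    age: int
    family_size: int
    family_earned_income: float    # monthly
    family_unearned_income: float  # monthly

def countable_income(child: Child, earned_disregard: float = 90.0) -> float:
    """Gross monthly income less a flat earned-income disregard (placeholder)."""
    earned = max(child.family_earned_income - earned_disregard, 0)
    return earned + child.family_unearned_income

def poverty_related_eligible(child: Child) -> bool:
    """Mimic a poverty-related expansion: children under 6 up to 133% of a
    poverty guideline, older children up to 100% (illustrative thresholds)."""
    guideline = POVERTY_GUIDELINE.get(
        child.family_size, 1640 + 280 * (child.family_size - 4))
    limit = 1.33 * guideline if child.age < 6 else 1.00 * guideline
    return countable_income(child) <= limit

# Example: a 4-year-old in a family of three with modest monthly earnings
kid = Child(age=4, family_size=3, family_earned_income=1500,
            family_unearned_income=100)
print(poverty_related_eligible(kid))
```

Even this toy version requires monthly income by source, a disregard rule, family size, and the child's age; the full determination described in the text layers on resources, program participation, eligibility units, and medical expenditures.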
3. Limitations of Survey Data
A full simulation of Medicaid eligibility requires, at the minimum, data on seven basic sets of characteristics:
- Income by source, with additional information on those expenditures that are applicable to the calculation of disregards
- Resources--specifically, financial assets along with the number and value of non-commercial vehicles
- Participation in certain other assistance programs
- Age and school enrollment of children
- Family unit membership
- Medical expenditures
- State of residence
We discuss the limitations of survey data with respect to each of these types of data below.
a. Income and Disregards
As we noted, Medicaid eligibility is based on monthly income, and portions of this income may be disregarded based on the amount that is attributable to earnings and the level of expenditures on such items as child care, shelter costs, and transportation to work. In simulating eligibility with survey data, it is important, therefore, to have good measures of earned and unearned income as well as the relevant types of expenditures. The importance of measuring unearned income derives not from the fact that the population targeted by Medicaid has income from many sources (although public assistance income is often important) but from the fact that people who are in fact not eligible but rely on unearned income for support may appear to be eligible if their unearned income is not measured adequately.
The surveys that we have described provide little if any data on the expenditures that are taken into consideration in calculating disregards. SIPP is clearly the best, but it collects expenditure data only once or twice in the life of a panel, so the expenditures are not measured concurrently with income except at those one or two times. The CPS collects annual rather than monthly income. Researchers who conduct the most sophisticated simulations with the CPS construct monthly income streams to improve their eligibility simulations, but in doing so they are still not able to measure other components of eligibility (such as family composition) concurrently.
b. Resources
Data on resources, or assets, are very limited. Only the SIPP captures detailed data on asset balances--including the value of vehicles, which are a major component of the applicable asset holdings of the low income population.(26) The SIPP data are collected in two survey waves a year apart, so researchers must interpolate or extrapolate to other months. Reported assets vary so substantially between the two waves, however, that this is difficult--and it suggests low quality in the data (Czajka 1999). Vehicular assets appear to turn over very rapidly as well. Some researchers deal with the limitations of CPS data in this regard by applying an assumed rate of return to reported asset income to impute the unreported balances.
c. Participation in Assistance Programs
Receipt of AFDC and Supplemental Security Income (SSI) benefits is relevant to the Medicaid eligibility determination. Prior to welfare reform, which replaced the AFDC program with TANF, AFDC participants were identified in the federal surveys, but AFDC participation was underreported by as much as 25 percent. Data on SSI participation have been collected less regularly and with lower accuracy. Prior to the 1996 panel, the SIPP did not identify individual SSI recipients within a family. Data users could employ information on disability--reported in one survey wave--to infer which child or children in an SSI family may have been the SSI recipient(s). In comparing SIPP 1992 panel estimates of SSI children with published administrative records, however, we found that the survey estimates were quite low and failed to capture a significant upward trend in SSI enrollment (Czajka 1999).
d. Age and School Enrollment of Children
Survey data on these characteristics of children are generally quite adequate for the purposes of Medicaid eligibility simulation.
e. Family Unit Membership
While the official federal poverty levels are designed to be applied to all related persons living in the same household (a “census family”), Medicaid eligibility may be based on just a subset of family members. Some family members (and their incomes) are automatically excluded when determining the eligibility of the remaining family members or the children--for example, SSI recipients and adult children of the family head. In addition, to maximize potential eligibility many states allow their caseworkers considerable latitude in defining the family unit for Medicaid income eligibility determinations (Lewis and Ellwood 1998). Including or not including one particular family member can make the difference between the remaining members being eligible or not. Therefore, the composition of the survey household, the family relationships among household members, and the income available to individual members at a point in time are needed to assign family members to Medicaid eligibility units. The SIPP data are the strongest in this regard, but the simulation of eligibility units is exceedingly complex (even when the applicable rules are well-documented, which they frequently are not). The CPS data are weak because family composition is measured at the time of the survey (March) while the income data refer to the previous calendar year. No data are collected on who was actually present in the household at any time during the previous year.
f. Medical Expenditures
Medical expenditure data are the weakest element among the data collected by the CPS, SIPP, and the NHIS. The virtual lack of information on medical expenditures makes it exceedingly difficult to develop a credible simulation of eligibility under the medically needy provisions of Medicaid. Researchers who do attempt to simulate this component of eligibility must resort to imputing medical expenditures based on other surveys--such as MEPS.
g. State of Residence
Identification of the state of residence of survey households is essential to replicating the state variation in Medicaid eligibility rules. While all but MEPS among the major surveys that we have discussed identify the state of residence of most sample members, we are aware that at least one of the surveys groups sets of states in order to protect the confidentiality of respondents--and possibly to discourage estimates for states that are not adequately represented in the sample. There are nine small states that are not individually identified in SIPP files prior to the 1996 panel. These nine states are combined into groups of two, three, and four states. In order to simulate features of the Medicaid programs for these nine states, it is necessary to assign respondents to the nine states in some manner. One must assume that other characteristics reported on the SIPP files are of limited value in predicting the actual state of residence for sample households reported in one of the three state groups or else the confidentiality of the state data would be compromised. Ultimately, therefore, the assignment of respondents to individual states must rely heavily on randomization. This implies the introduction of some additional error into Medicaid simulations, which may contribute, in turn, to mismatches between simulated eligibility and reported participation.
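A minimal sketch of the randomized assignment just described might look like the following. The group composition and child-population weights are invented for illustration; an actual assignment would use the real state groupings on the SIPP file and published population figures:

```python
# Hedged sketch: randomly assigning respondents reported in a combined
# "state group" to individual states, with probability proportional to
# each state's (hypothetical) share of the group's child population.

import random

# Hypothetical group of three small states with made-up child populations
STATE_GROUP = {"State A": 150_000, "State B": 100_000, "State C": 50_000}

def assign_state(rng: random.Random) -> str:
    """Draw one state, weighted by its share of the group's child population."""
    states = list(STATE_GROUP)
    weights = [STATE_GROUP[s] for s in states]
    return rng.choices(states, weights=weights, k=1)[0]

rng = random.Random(12345)  # fixed seed so the sketch is reproducible
assignments = [assign_state(rng) for _ in range(10_000)]

# State A holds half the group's children, so about half the draws land there
share_a = assignments.count("State A") / len(assignments)
print(round(share_a, 2))
```

As the text notes, any such randomization injects error into the simulated state parameters for these respondents, which is one source of mismatch between simulated eligibility and reported participation.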
C. CALCULATING MEDICAID PARTICIPATION RATES
Properly calculated, a Medicaid participation rate is formed by dividing the number of participants (people enrolled) by the number of people estimated to be eligible. In the previous section we detailed many of the difficulties that are inherent in estimating the denominator. Here we discuss some complications associated with estimates of the numerator and satisfying the requirement that the people who have an opportunity to be included in the numerator be fully counted in the denominator.
1. Choice of a Numerator: Survey Versus Administrative Data
Survey estimates of participants in means-tested entitlement programs generally fall well short of the counts reported in program administrative statistics. As a rule, the survey estimates tend to run between 75 and 90 percent of the administrative counts even when the two are rendered as comparable as possible with respect to the universe that they cover. For this reason, it has become a common practice to substitute administrative counts for survey estimates of participants in calculating participation rates for food stamps and AFDC. The choice of a numerator is an issue with respect to Medicaid participation rates as well. Here we discuss a number of considerations that are relevant to using the administrative statistics in this context. The bottom line is that comparability between the administrative and survey data on participation is difficult to establish.
a. Underreporting of Medicaid and Related Program Participation
Table 5 compares CPS estimates of children under 15 who were ever enrolled in Medicaid during 1993, 1994, and 1995 with enrollment statistics reported by the Health Care Financing Administration (HCFA), now known as the Centers for Medicare and Medicaid Services (CMS). While a number of caveats apply to such a comparison, as we explain in the next subsection, the figures in the table give us a rough sense of how complete the CPS reports of Medicaid enrollment appear to be. In 1994 and 1995 the CPS figures lie between 75 and 76 percent of the HCFA estimates versus 83 percent in 1993. The decline in coverage would appear to be due to the CPS’s incomplete capture of a sizable growth in enrollment between 1993 and 1994. Elsewhere, with a more detailed comparison, we estimated that the SIPP captured between 85 and 87 percent of Medicaid enrollment among children in FY93 and FY94 (Czajka 1999). The apparent implication, then, is that participation rates will be understated by 13 to 25 percent if we rely exclusively on survey estimates of participation.
TABLE 5. COMPARISON OF CPS ESTIMATES AND ADMINISTRATIVE COUNTS OF CHILDREN UNDER 15 ENROLLED IN MEDICAID

| Year | CPS: Ever Enrolled in Calendar Year | HCFA Statistics: Ever Enrolled in Fiscal Year | CPS Estimate as Percent of HCFA |

SOURCE: March Current Population Survey, 1994 to 1996, and HCFA Medicaid enrollment statistics, FY93 to FY95.
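The implication of this undercoverage for participation rates can be illustrated with invented counts; these are not the actual CPS or HCFA figures, only an arithmetic sketch:

```python
# Hedged sketch of how survey undercoverage of enrollment understates a
# Medicaid participation rate. All counts here are illustrative.

admin_enrolled = 16_000_000   # hypothetical administrative count of enrolled children
survey_coverage = 0.80        # suppose the survey captures 80% of enrollment
survey_enrolled = admin_enrolled * survey_coverage

eligible = 20_000_000         # hypothetical simulated count of eligible children

rate_survey = survey_enrolled / eligible   # numerator from survey reports
rate_admin = admin_enrolled / eligible     # numerator from administrative counts

print(rate_survey, rate_admin)  # 0.64 versus 0.80
```

The survey-based rate is biased downward by exactly the survey's coverage ratio, which is the rationale for substituting administrative counts in the numerator.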
b. Issues in Comparing Survey and Administrative Estimates of Medicaid Enrollment
There are several issues in comparing survey and administrative estimates of Medicaid enrollment to evaluate coverage and, ultimately, to substitute the latter for the former in estimates of participation rates. These include unduplication across states, the existence of state-only programs, the reporting of average monthly versus annual ever enrollment, the limited age detail that is available from published statistics, the inclusion of institutionalized children, concerns about the quality of state Medicaid enrollment data, and retroactive eligibility.
Unduplication Across States. The Medicaid enrollment data published by HCFA are based on reports or data files submitted by the states. While researchers have at times expressed concern about duplicate counting of enrollees within states--the classic situation involving someone who is enrolled in Medicaid at the beginning of the fiscal year, leaves the program, then re-enrolls and is assigned a new, unique identification number--within-state duplication has been reduced by administrative improvements. The same cannot be said about duplication across states. People who start the year enrolled in Medicaid in one state, then move to another state and re-enroll, will be counted--legitimately--in both states’ ever enrollment figures. In the survey data, of course, such people will be counted only once--in the state in which they reside at the time of the interview. There are no data with which to estimate the possible magnitude of this cross-state duplication, which would require matching state administrative files at the person level or matching survey data to these same administrative data. We doubt, however, that such duplication amounts to more than a few percent of the total national caseload reported by HCFA, although this is purely speculative. About 16 percent of the total U.S. population moves in the course of a year, but only a small fraction of these moves are interstate.
State-Only Programs. A few states (New Jersey, for example) operate what are generally small programs that provide Medicaid coverage to children who do not qualify for federal matching dollars. These children are not included in the enrollment counts reported by HCFA, but they would presumably report themselves (or be reported) to a survey interviewer as covered by Medicaid. If no allowance is made for their differential inclusion in survey versus federal administrative data, their presence in the survey estimates will contribute to an overestimate of survey coverage of Medicaid enrollees.
Average Monthly versus Annual Ever Enrollment. HCFA reports annual (fiscal year) estimates of people ever enrolled in Medicaid by programmatic and demographic characteristics for each state and for the nation as a whole (an aggregate of the state numbers, which may include some duplication). For all people in each state (that is, with no further breakdown), HCFA also reports the number enrolled for all 12 months, the number enrolled for less than 12 months, and the total person-months of enrollment among the latter. With these data it is possible to calculate the average monthly enrollment--but only for all enrollees. Children and adults cannot be separated. Thus, the most readily available Medicaid administrative data on enrolled children can be used to evaluate only one type of survey estimate of Medicaid coverage: the number of children ever enrolled during a fiscal year. To evaluate survey estimates of Medicaid enrollment at a point in time requires that the researcher make some assumption about how the relationship between ever enrollment in a year and enrollment at a point in time differs between children and adults.
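The average-monthly calculation described above is straightforward arithmetic on the three reported quantities. The counts below are invented for illustration:

```python
# Hedged sketch: average monthly Medicaid enrollment from the per-state
# fields described in the text. All counts here are hypothetical.

enrolled_all_12_months = 900_000     # people enrolled the full fiscal year
person_months_partial = 2_400_000    # total person-months among part-year enrollees

# Full-year enrollees contribute 12 person-months each; part-year enrollees
# contribute their reported person-months. Dividing total person-months by
# 12 yields the average number enrolled in any one month.
average_monthly = (enrolled_all_12_months * 12 + person_months_partial) / 12

print(int(average_monthly))  # 1,100,000 in this example
```

Because the person-months fields are reported only for all enrollees combined, this average cannot be computed separately for children, which is the limitation the text identifies.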
The states can and do produce their own estimates of Medicaid enrollment. They can produce estimates of the number of people enrolled each month by demographic and programmatic characteristics. Such data are not compiled nationally, however. To obtain monthly enrollment estimates, the researcher would have to request these from every state. In practice, then, it is not possible to compare survey and administrative estimates of Medicaid enrollment at a point in time with the same precision that can be done with estimates of enrollment ever in a year.
Age Detail. HCFA reports Medicaid enrollees under 21 by the following age groups: infant (under 1), 1 to 5, 6 to 14, and 15 to 20. These age categories do not map exactly to children as commonly defined from survey data: all people under 18 or all people under 19. Because Medicaid enrollment declines over ages 15 to 20, allotting two-thirds of the reported number of enrollees in this age group to the ages 15 to 18 yields too few children. To obtain better administrative estimates of enrolled children, we recommend estimating from survey data the fraction of reported Medicaid enrollees 15 to 20 years of age who are 15 to 18 (or 15 to 17 if that is the needed group) and applying this fraction to the Medicaid administrative data. Another strategy, which we followed in preparing Table 5, is to base the comparison on just those ages that can be matched (that is, 0 through 14) and assume that the same rate of coverage applies to the entire population of child enrollees.(27)
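The recommended adjustment can be sketched as follows; the survey-based counts and the administrative count below are placeholders, not actual estimates:

```python
# Hedged sketch of the age-detail adjustment recommended in the text:
# estimate from survey data the share of enrollees aged 15-20 who are
# 15-18, then apply that share to the administrative count. All numbers
# here are invented for illustration.

survey_enrollees_15_18 = 2_100_000   # hypothetical survey estimate, ages 15-18
survey_enrollees_15_20 = 3_000_000   # hypothetical survey estimate, ages 15-20
admin_enrollees_15_20 = 3_500_000    # hypothetical administrative count, ages 15-20

# 0.70 here: above the naive two-thirds allocation, consistent with the
# text's point that enrollment declines with age over this range
share_15_18 = survey_enrollees_15_18 / survey_enrollees_15_20

admin_estimate_15_18 = admin_enrollees_15_20 * share_15_18
print(round(admin_estimate_15_18))  # 2,450,000 in this example
```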
We should note that there is a programmatic definition of “children” used in determining Medicaid eligibility, and that reported counts of “children” in some of the HCFA tabulations reflect this definition rather than the purely age-based definition used in survey-based research. Enrollees identified as “children” in HCFA reports are a subset of the full age group that would be defined as children in survey-based research. An individual who is under the age of 19 but responsible for a dependent child would be reported as an adult in tabulations by basis of eligibility.
Institutionalized Children. Administrative estimates of enrollees include some who are institutionalized, whereas the surveys that are used to estimate health insurance coverage exclude people in institutions from their sampling frames. Published HCFA tabulations do not report institutionalized enrollees by age, so it is not possible to exclude institutionalized children from the administrative counts of Medicaid enrollees--except crudely. This is not a large population, but other things being equal, failing to make some adjustment for its differential treatment in the two estimates will contribute to an underestimate of the survey coverage of Medicaid enrollment.
Quality of State Medicaid Enrollment Data. Researchers have raised concerns about the quality of state Medicaid enrollment data. As we noted above, one area of concern has been the potential multiple counting of individuals who left the program and re-entered within the same fiscal year, but the widespread use of unique “lifetime” identifiers is eliminating this problem. Indeed, analysis of case record data provides indirect support for this assertion in the form of frequent, identifiable instances of the same individuals exiting and then re-entering Medicaid within the year (Ellwood and Lewis 1999). At the same time, however, the state statistics reported by HCFA each year are accompanied by extensive caveats that point out omissions, inconsistencies, and other errors. At a minimum, users of the published enrollment data need to be aware that the data have known imperfections that may require some form of correction before they are used.
Retroactive Eligibility. Under certain circumstances, a Medicaid enrollee’s eligibility may be applied retroactively to cover medical costs that were incurred prior to official enrollment. Survey respondents interviewed just prior to their enrollment may correctly report their status as not covered, but the administrative statistics may later change this status. As a result, the administrative statistics would tend to run slightly higher than the reports obtained from surveys even if the latter were correct at the time they were recorded.
2. Incomplete Simulation of Eligibility
Because of the aforementioned data limitations, together with the complexity of the rules, simulations of Medicaid eligibility will almost invariably be incomplete, and even very good simulations may exclude as much as one-fifth of the eligible population. This may provide the strongest argument against substituting administrative estimates for survey estimates of participants in order to calculate participation rates. If the deficiencies of the eligibility simulation can be matched to the eligibility categories reported in the Medicaid statistics, however, it may be possible to construct an administrative count of participants that is reasonably consistent with the eligibility simulation. For example, we have noted that the medically needy component of Medicaid is the most difficult to simulate, and many analysts make little attempt to do so. Medically needy children under 21, where “children” are defined by the nature of their eligibility rather than their age, are reported in the annual statistics released by HCFA. They could therefore be subtracted from an administrative count of participants, yielding a participation rate that excludes the medically needy from both the numerator and denominator.
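The adjustment just described is simple arithmetic once the published statistics supply the medically needy count. All counts in this sketch are hypothetical placeholders; it assumes, as in the text, that the simulated-eligibles denominator already excludes the medically needy.

```python
# Construct a participation rate whose administrative numerator is
# consistent with an eligibility simulation that omits the medically
# needy. All counts below are hypothetical placeholders.

def participation_rate(admin_enrolled, admin_medically_needy,
                       simulated_eligibles):
    """Rate excluding the medically needy from the numerator.

    The denominator (simulated eligibles) is assumed to exclude the
    medically needy already, so both sides of the ratio are consistent.
    """
    numerator = admin_enrolled - admin_medically_needy
    return numerator / simulated_eligibles

rate = participation_rate(admin_enrolled=20_000_000,
                          admin_medically_needy=2_000_000,
                          simulated_eligibles=24_000_000)
```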
3. Simulated Eligibles Reporting Coverage Other Than Medicaid
In our research with SIPP data we found that 18 percent of the children we simulated to be eligible for Medicaid reported having some form of insurance coverage other than Medicaid (Czajka 1999). This other coverage could represent Medicaid being misreported as something else, or it could represent genuinely different coverage. In the former case, there are clear implications for the calculation of Medicaid participation rates; indeed, the misreported coverage would account for part of the Medicaid undercount. If there were a way to resolve this with the survey data and determine how many of the children who reported other coverage may have actually been covered by Medicaid, the quality of a survey-based participation rate could be improved. To the extent that it is not possible to discern the amount of misreporting and thereby correct the survey data, the argument for using administrative data for the numerator is strengthened.
The possibility that much if not most of the reported other coverage is truly something other than Medicaid suggests another strategy--perhaps one that is best viewed as a complementary strategy rather than an alternative one. Nonparticipation in Medicaid by eligible children who have other coverage carries very different policy implications than nonparticipation by those who are uninsured. Medicaid participation rates calculated for just those children who would otherwise be uninsured may provide a more meaningful indication of the success of Medicaid outreach than participation rates that count eligible children with other coverage as eligible but not participating. Even without adjusting for Medicaid underreporting, we found that the participation rate among eligible children with no other coverage was 79 percent, compared with 65 percent for all eligible children (Czajka 1999).
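The relationship between the two rates cited above follows directly from the shares of simulated eligibles. In this sketch, the Medicaid and other-coverage shares are those reported from Czajka (1999) as summarized in the text; the uninsured share is our inferred residual.

```python
# Reproduce the arithmetic behind the two participation rates cited
# in the text. Shares are of children simulated to be Medicaid-eligible.

medicaid_share = 0.65   # eligibles reporting Medicaid (Czajka 1999)
other_share = 0.18      # eligibles reporting non-Medicaid coverage
uninsured_share = 1.0 - medicaid_share - other_share  # residual, 0.17

# Rate over all eligibles: children with other coverage count as
# eligible nonparticipants.
rate_all = medicaid_share

# Rate over eligibles with no other coverage: the otherwise-uninsured.
rate_no_other = medicaid_share / (medicaid_share + uninsured_share)
```

Restricting the denominator to otherwise-uninsured eligibles raises the rate from 65 percent to about 79 percent, matching the figures in the text.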
4. Seemingly Ineligible Participants
Extensive research on simulating eligibility for food stamps and AFDC has identified a perplexing and thus far inescapable problem: nontrivial numbers of those who report participation are simulated to be ineligible.(28) The existence of such people may reflect the incompleteness or inaccuracy of the simulation model. This is particularly true of Medicaid, with its complex eligibility determination and the difficulties analysts face in documenting current state policies. Adding to the complexity are provisions that allow certain classes of participants to maintain their eligibility, once established, despite changes in income that would render other participants ineligible. Such provisions apply to pregnant women and infants and to families receiving transitional coverage after losing AFDC benefits because of increased earnings. The fraction of participants who experience extended eligibility is likely to increase as the Balanced Budget Act of 1997 gives states the option to guarantee coverage to children for up to 12 months after enrollment, regardless of changes in family income (Lewis and Ellwood 1998).
The appearance of simulated ineligible participants may also be due to misreported participation, incorrect edits or imputations, errors in the actual eligibility determination, or the failure of participants to report changes in their circumstances. These different explanations for the phenomenon of seemingly ineligible participants carry different implications for how such cases should be handled in calculating participation rates (Vaughan 1992). When the errors are due to the simulation, there is little question that such participants should be included in participation rates, provided that the total number of simulated eligibles can be corrected to offset any bias.(29) At the same time, however, people who participate when in fact ineligible, or who are incorrectly identified as participants, should be excluded from participation rates. Simply adding them to the denominator to offset their inclusion in the numerator gives them an implied participation rate of 100 percent, which is difficult to interpret. Clearly, it would be desirable to know more about the seemingly ineligible participants, but these cases have proven difficult to characterize. Limiting participation rates to simulated eligibles (which would also imply removing ineligible participants from administrative statistics if the latter are used to form the numerator) makes it unnecessary to address what may be unresolvable issues in the definition of the denominator, and it maintains the concept of a participation rate as the number of eligible participants divided by the total number of eligibles.
5. Population Undercoverage
Some of the apparent underreporting of Medicaid enrollment may be due to survey undercoverage--that is, the exclusion of complete households and individual members from the sample frame. Undercoverage affects the uninsured as well as Medicaid participants--and probably more so. The impact of undercoverage is difficult to gauge because the Census Bureau adjusts its survey estimates for undercoverage within categories defined by age, sex, race, and Hispanic ethnicity. That is, the missing households and people are included in the population estimates to which the survey is weighted, but their characteristics may be misrepresented. For example, if the undercoverage is concentrated in a subset of an adjustment category--such as the lowest income members--the adjustment will compensate only partially because it will increase the relative frequency of higher income people by the same amount that it increases the relative frequency of lower income people.
D. HOW DOES MEDICAID UNDERREPORTING AFFECT SURVEY ESTIMATES OF THE UNINSURED?
In the previous section we provided estimates of the amount by which the CPS and the SIPP may underestimate Medicaid enrollment--magnitudes between 13 and 25 percent. The underreporting of Medicaid enrollment may lead to an overstatement of the number and proportion of children who are without insurance. But the impact of Medicaid underreporting on survey estimates of the uninsured is far from clear. Indeed, even assuming that these estimates of Medicaid underreporting are accurate, the potential impact of Medicaid underreporting on estimates of the uninsured depends on how the underreporting occurs. In this section we consider several ways in which Medicaid underreporting might occur and then consider the implications for estimates of the uninsured.
1. Forms of Underreporting
It is important to differentiate among different sources or forms of underreporting, as they carry quite different implications for estimates of the uninsured.
a. Misreporting of Medicaid as Private or Other Public Coverage
In an effort to increase and maintain enrollment, a number of states have taken steps to give their Medicaid programs the appearance of private insurance plans. Such tactics may be successful to the point of confusing participants as to the source of their health insurance coverage. As a result, some Medicaid enrollees may report that they are covered by some type of private insurance plan or a public plan other than Medicaid and thus not get counted as Medicaid participants. Such people will not be counted among the uninsured, but their actions will contribute to Medicaid enrollment being understated relative to administrative estimates.
b. Incomplete Reporting of People Included in Family Coverage
The fact that multiple family members are often but not always included under the same coverage creates a measurement problem that different surveys approach in different ways--for example, going through the household person by person to ascertain each member’s source(s) of coverage versus identifying the person in whose name a particular coverage is held and listing all household members included in that coverage. However insurance coverage is measured, there exists the potential for individual household members to be omitted. The likelihood of such omissions may be greatest for children, whose coverage is often collected somewhat differently than that of adults.(30) A finding that is consistent across multiple surveys is that roughly one-fifth of uninsured children appear to have at least one insured parent in the household. Czajka (1999) finds that 7 percent of uninsured children as measured in the SIPP report having a parent covered by Medicaid while about 13 percent have a parent with employer-sponsored coverage. The 7 percent figure strikes us as high. While there are circumstances under which a parent could be Medicaid-eligible without the children also being eligible, the parent would still have to meet a means test, which in most cases would imply a family income low enough to qualify the children under the poverty-related criteria.(31) Furthermore, while there is nothing implausible about uninsured children having parents with employer-sponsored coverage, we suspect that if parents can report their own Medicaid coverage but omit that of one or more children, they can do likewise with employer-sponsored coverage. In short, much of the 7 percent of uninsured children with Medicaid-covered parents and at least some of the 13 percent with employer-insured parents may be misreported as uninsured.
c. Not Reporting Medicaid at All
Rather than misreporting Medicaid as another type of insurance or reporting Medicaid for only some of the family members who are actually covered, respondents may fail to report any Medicaid coverage at all. If such people report no other coverage during the reference period, they will be classified, incorrectly, as uninsured. If they do report other coverage, then they will be identified as insured, but they will still contribute to the underreporting of Medicaid. When the reference period is short, we suspect that nearly all of these cases will be recorded as uninsured. With a reference period as long as a year, however, a nontrivial share of those who fail to report their Medicaid coverage may have had--and reported--other coverage and, therefore, not been classified as uninsured.
d. Population Undercoverage
We have discussed how population undercoverage is endemic to surveys and how this type of error affects the low-income population disproportionately. While this almost certainly accounts for some of the underreporting of Medicaid, we suggest that population undercoverage may have a relatively greater impact on estimates of Medicaid-eligible nonparticipants. The circumstances that contribute to people being left off of household rosters--basically, their transiency or weak attachment to the household--may also make it unlikely that they would be covered by Medicaid if eligible. In general, population undercoverage may affect the uninsured more than the insured. Thus it may tend to push estimated Medicaid participation rates up rather than down, and to lower rather than inflate estimates of the proportion of children who are uninsured.
2. Implications for Adjusting the Number of Uninsured
Each of the four forms of underreporting that we have discussed implies, potentially, a different type of bias in estimates of the number of uninsured children. Therefore, each form of underreporting implies a different strategy for using administrative estimates of Medicaid enrollment to correct for Medicaid underreporting and, by implication, adjust the number of uninsured.
First, when Medicaid is misreported as private coverage or other public coverage, the number of children who are uninsured is not increased. Consequently, if we wish to adjust for Medicaid underreporting we must first determine how much of the Medicaid shortfall is due to Medicaid being misreported as something else. Such children would be found among those who are simulated to be eligible but report coverage other than Medicaid. Beyond this, however, there are no easy guidelines for determining which children or even how many have misreported their Medicaid coverage.

Second, when children are simply left out of a list of family members covered by Medicaid, they may indeed affect estimates of the uninsured. To determine the potential impact, we must first impute their Medicaid coverage based on the reports for other family members. In so doing, we will in effect “see” if these individual children were reported to have other coverage, and the change in the number of uninsured children after the imputation has been performed will indicate the net impact on estimates of the uninsured.

Third, when Medicaid is simply not reported at all, it is still possible that other coverage was reported. Such children will be found among those simulated to be eligible for Medicaid, but they may or may not have reported other coverage. Again, there are no easy guidelines for identifying which children or how many fall into this category.

Fourth, underreporting due to population undercoverage requires no adjustment of the survey data, but if administrative estimates of Medicaid enrollment are used to adjust for the other sources of error, then, in theory, estimates of Medicaid children who were excluded from the survey sample frame should be subtracted from the administrative counts. The bigger problem with population undercoverage, however, is the likely underestimation of the uninsured.
If we are lowering the estimated number of uninsured to compensate for Medicaid underreporting, we need to keep in mind that some of the uninsured may not be included in the initial estimate. This problem has not been addressed at all in the literature, and we are not aware of any estimates of how many uninsured children may be missing from the survey estimates.
In using administrative estimates of Medicaid enrollment, it is important that the reference period of the data match the reference period of the survey estimates. HCFA reports Medicaid enrollment in terms of the number of people who were ever enrolled in a fiscal year. This number is considerably higher than the number who are enrolled at any one time. Therefore, the HCFA estimates of people ever enrolled in a year should not be used to correct survey estimates of Medicaid coverage at a point in time because this results in a substantial over-correction. From the published HCFA tabulations it is possible to derive an estimate of average monthly enrollment--but only for the entire population of enrollees, not children or any other subgroup. These average monthly estimates can be compared to survey estimates of enrollment at a point in time (subject to all of the caveats discussed in the preceding section).
The CPS presents a special problem. We have demonstrated that while the CPS estimate of uninsured children is commonly interpreted as a point-in-time estimate, the reported Medicaid coverage that this estimate reflects is clearly annual-ever enrollment. Adjusting the CPS estimate of the uninsured to compensate for the underreporting of annual-ever Medicaid enrollment produces a large reduction. We have to recognize that what this adjustment accomplishes is to move the CPS estimate of the uninsured closer to what it purports to be--namely, an estimate of the number of people who were uninsured for the entire year. Applying an adjustment based on annual-ever enrollment but continuing to interpret the CPS estimate of the uninsured as a point-in-time estimate is clearly wrong. Adjusting the Medicaid enrollment reported in the CPS to an average monthly estimate of Medicaid enrollment yields a much smaller adjustment and a correspondingly smaller impact on the uninsured, but it involves reinterpreting the reported enrollment figure as a point-in-time estimate--which we have seen that it is not. Invariably, efforts to “fix” the CPS estimates run into problems such as these because the CPS estimate of the uninsured is ultimately not what people interpret it to be but, instead, an estimate--with very large measurement error--of something else. We would do better to focus our attention on true point-in-time estimates, such as those provided by SIPP, NHIS, the CTS, and NSAF. However, until the turnaround in the release of SIPP and NHIS estimates can be improved substantially, policy analysts will continue to gravitate toward the CPS as their best source of information on what is happening to the population of uninsured children.
E. ANALYSIS OF LONGITUDINAL DATA
Given the difficulties that respondents seem to experience in providing accurate reports of their insurance coverage more than a few months ago, panel surveys with more than one interview per year seem to be essential to obtaining good estimates of the duration of uninsurance and the frequency with which children experience spells of uninsurance over a period of time. Longitudinal data are even more essential if we are to understand children’s patterns of movement into and out of uninsurance and into and out of Medicaid enrollment. At the same time, however, longitudinal data present many challenges for analysts. These include the complexity of measuring the characteristics of a population over time, the effects of sample loss and population dynamics on the representativeness of panel samples, and the issues that must be addressed in measuring spell duration.
1. Overall Importance to Understanding Uninsurance and Medicaid Enrollment Patterns
The snapshots provided by cross-sectional surveys tell us how many children are without health insurance coverage at a point in time and how many of these appear to be eligible for Medicaid but, according to their families’ reports, are not enrolled. What the cross-sectional surveys cannot tell us, though, are the paths that children have taken to becoming uninsured and the paths that most of them will take in becoming insured (again). Nor can the cross-sectional surveys tell us, without considerable measurement error, how much of the uninsurance among children is transitional versus more chronic. Panel surveys such as the SIPP and MEPS can provide such information. By following children and their families over a two-to-three-year period and interviewing them multiple times during the year, these surveys capture transitions between sources of coverage and between insured and uninsured, and they also capture changes in family circumstances that may contribute to the changes that we observe in health insurance coverage. These changes in family circumstances include, most importantly, job loss or re-employment, marriage, and divorce. Smaller changes in income or household composition may also affect Medicaid eligibility and, ultimately, children’s insurance coverage.
2. Measuring the Characteristics of a Population over Time
Adding a time dimension to data introduces complications along with the benefits. We provide two illustrations with respect to what might appear to be very simple problems: (1) estimating the number of children who were ever without insurance over the course of a year and (2) evaluating survey data on children who were ever enrolled in Medicaid over a year. Then we examine what longitudinal data may or may not tell us about causal sequences.
a. First Illustration: How Many Children Are Uninsured in a Year?
Typically, analysts define children as all people who are under a particular age. For example, children may be defined as all people under 19, or all people under 18. However we choose to define the population of children, when we add a time dimension by looking at behavior over the course of a year (or some other period), we must recognize that the set of people who are defined as children changes over this period. Over the course of a 12-month period, a new cohort of children is born while the oldest cohort “ages out” of the population of children--by turning 19, for example, if children are defined as people under 19. While this “cohort succession” may not appreciably affect the number who are defined as children at any one time (if there is no net growth), it does affect the number who would ever meet the definition of children over the year. In the United States currently, the size of a birth cohort is about 4 million. Over the course of a year, then, the number of people who were ever under 19 is about 4 million, or more than 5 percent, larger than the number who are under 19 at any one time during the year.
In counting the number of children who were ever uninsured in a year, we can elect to include all of those who were ever defined as children during that span or we can limit the count to those who were children at a specific point in time--say, the beginning of the year or the end of the year. No one way of defining a population over time should be regarded as the “correct” way. What is important to recognize is that there are different ways to define the population of children over time, and this awareness must carry over to how we count the uninsured and how we compare different measures of the prevalence of uninsurance among children.
Table 6 contrasts two different ways of counting the number of children who were ever uninsured during a year. The first approach, shown in the upper half of the table, tabulates the number of children ever uninsured in FY93 for a fixed population of children--specifically, those who were under 19 on September 30, 1993. This population numbers 70,868,000 children, and the number who were ever uninsured during the year is 15,360,000. The second approach, shown in the lower half of the table, tabulates the number of children ever uninsured in FY93 for a dynamic population of children. This population, consisting of those who were ever under 19 during FY93, numbers 74,691,000, and the number of these children who were ever uninsured is 16,089,000. The two alternative populations of children show essentially identical proportions of children who were
Table 6. ESTIMATES OF CHILDREN EVER UNINSURED IN FY93, BASED ON ALTERNATIVE DEFINITIONS OF CHILD POPULATION OVER TIME
Description of Population and Estimate                               Estimate

Number of Children under 19 on September 30, 1993                  70,868,000
Number Uninsured in September 1993                                  9,271,000
Proportion Uninsured in September 1993                                  13.1%
Number Ever Uninsured in FY93                                      15,360,000
Proportion Ever Uninsured in FY93                                       21.7%

Number of Children Ever under 19 in Year Ending
  September 30, 1993                                               74,691,000
Number Uninsured in September 1993                                  9,271,000
Proportion Uninsured in September 1993                                     NA
Number Ever Uninsured in FY93                                      16,089,000
Proportion Ever Uninsured in FY93                                       21.5%

SOURCE: Survey of Income and Program Participation, 1992 Panel.
ever uninsured during the year, but the dynamic population includes about 3.8 million more children than the fixed population and about 700,000 more children who were ever uninsured.(32) Either approach provides a correct use of the data. A comparison of the two sets of tabulations illustrates that when we examine the incidence of uninsurance among children over time, part of why we may observe more children to have been uninsured than at a point in time is that, depending on how we define children for this purpose, more children may have been exposed to the risk of being uninsured.
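The fixed- versus dynamic-population arithmetic can be verified directly from the counts reported in Table 6:

```python
# Verify the fixed- versus dynamic-population comparison in Table 6
# (SIPP, FY93). All counts are as reported in the table.

fixed_pop, fixed_ever_unins = 70_868_000, 15_360_000
dynamic_pop, dynamic_ever_unins = 74_691_000, 16_089_000

# The dynamic population adds children who entered or aged out of the
# under-19 group during the year.
extra_children = dynamic_pop - fixed_pop                   # ~3.8 million
extra_uninsured = dynamic_ever_unins - fixed_ever_unins    # ~0.7 million

# Proportions ever uninsured are essentially identical either way.
pct_fixed = 100 * fixed_ever_unins / fixed_pop             # 21.7 percent
pct_dynamic = 100 * dynamic_ever_unins / dynamic_pop       # 21.5 percent
```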
b. Second Illustration: Age Distribution of Medicaid Enrollees?
The Medicaid enrollment data reported by HCFA represent the number of individuals who were ever enrolled over the course of a year. There are, in fact, a number of issues that arise in trying to compare HCFA statistics with survey estimates of Medicaid enrollment, but one of them illustrates the complexity of dealing with the characteristics of a population over time. The issue is how we classify children by age.
HCFA includes among its published tabulations a table that presents enrollment by age, and in evaluating the completeness of reporting of Medicaid coverage in survey data, an analyst could use this table to obtain an administrative count of all people under a given age who were ever enrolled in Medicaid during the year. With the published data the analyst has limited flexibility in defining the upper age boundary for children because HCFA reports ages in groups rather than single years. In particular, people 15 to 20 are reported in a single group. With the survey data the analyst has considerably more flexibility, of course. One approach that the analyst can take is to match the survey data to the published age categories in order to measure the completeness of reporting for those categories and then, if desired, extrapolate the findings to children at somewhat higher ages.
Over the course of a year every child experiences a birthday. For a year in which Medicaid enrollment is observed, therefore, each child for whom such coverage is reported can be assigned to either of two ages. Which is the more appropriate? More generally, is there a preferred strategy for assigning age to children who may have been enrolled in Medicaid at any time over the course of a year? What, exactly, does HCFA do?
It turns out that states employ at least two different conventions for reporting age to HCFA. States that participate in the Medicaid Statistical Information System (MSIS) and submit electronic case records to HCFA assign age as of the end of the fiscal year (HCFA 1994). States that submit 2082 reports instead of MSIS electronic case records are instructed to assign age as of the middle of the fiscal year. This latter system classifies about 50 percent more children as infants and, generally, shifts the coverage of each reported age group by one-half year. Thus in the non-MSIS states the reported population under 15 will include children who turned 15 in the second half of the fiscal year.
Many states, both MSIS and non-MSIS, take a few months to process the enrollment of newborn infants. In these states, infants may not appear on state Medicaid files until their second or third month of life. As a result, infants who are born in the final months of the fiscal year may not be counted in the 2082 data as enrolled in Medicaid for that year (Lewis and Ellwood 1998). This produces a net undercount of infants--and therefore all enrollees. The fact that these infants show up in the next year’s enrollment statistics does not compensate for their omission, as they would have appeared in the next year’s statistics anyway.
The impact of these differences in the reporting of enrollment counts by age can be seen in Table 7, which presents the reported FY93 enrollment counts for children under age 1 and children 1 to 5. States are sorted by the ratio of the infant enrollment count to the count of children 1 to 5. This ratio, expressed as a percentage, appears in the final column. There is a clear break between the MSIS states and most of the non-MSIS states. No MSIS state has a ratio as high as 30 percent whereas most of the non-MSIS states (all but seven) lie above this value, between 31.7 and 43.4 percent. The average ratio for the MSIS states is about 21 percent while the average ratio for the non-MSIS states, excluding the seven, is about 38 percent. The difference between these mean values is consistent with the non-MSIS states counting about 50 percent more children as infants, but it also suggests that MSIS states may be more likely to undercount their infants. The seven non-MSIS states that fall into or below the range of ratios exhibited by the MSIS states may very well be assigning ages with the same convention as the MSIS states--that is, defining age as of the end of the fiscal year rather than the middle.
Table 7. MEDICAID ENROLLMENT OF CHILDREN UNDER 6 BY STATE, FY93
                            Children Ever Enrolled       Under Age 1 as a
State                     Age under 1    Ages 1 to 5     Percent of Ages 1 to 5

1/ NEW HAMPSHIRE                3,301         18,238             18.1
1/ NEW JERSEY                  33,901        170,763             19.9
1/ NORTH DAKOTA                 2,718         12,711             21.4
DISTRICT OF COLUMBIA           10,581         29,761             35.6

SOURCE: HCFA (1994).

NOTES: "1/" designates a state that submits electronic case record data in lieu of 2082 tabulations. Rhode Island submitted no 2082 data for FY93 and is excluded.
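The sorting rule described above can be sketched with the Table 7 rows shown here. The 30 percent break is the one discussed in the text; applying it flags the District of Columbia, a 2082-reporting jurisdiction, as using the mid-year age convention.

```python
# Compute the infant-to-(ages 1-5) enrollment ratio used to sort the
# states in Table 7, and apply the 30 percent break that separates the
# MSIS states from most non-MSIS states. Counts are the FY93 figures
# shown in the table rows above.

rows = {
    "NEW HAMPSHIRE": (3_301, 18_238),          # MSIS
    "NEW JERSEY": (33_901, 170_763),           # MSIS
    "NORTH DAKOTA": (2_718, 12_711),           # MSIS
    "DISTRICT OF COLUMBIA": (10_581, 29_761),  # non-MSIS (2082 report)
}

ratios = {state: 100 * under1 / ages1to5
          for state, (under1, ages1to5) in rows.items()}

# States above the 30 percent break are presumed to assign age as of
# mid-year (the 2082 convention); those below, as of fiscal year end.
likely_midyear = [state for state, r in ratios.items() if r >= 30]
```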
The implications of these different conventions, as we said, are that in one set of states the reported enrollees under age 15 will include children who turned 15 in the second half of the year whereas in the other set of states the reported enrollees under age 15 will include no children who turned 15 during the fiscal year. If we wish to compare survey and administrative counts, therefore, we need to emulate these conventions in our survey tabulations or make some adjustment for the fact that the age categories used in the administrative records will line up with those in the survey data in about half the states but be somewhat more inclusive in the remaining states.
c. Causal Sequences
Longitudinal data with frequent observations allow us to examine not only changes in family circumstances but the sequence of changes. With that information it may be possible to infer causality. For example, we may observe that a child loses employer-sponsored coverage from one time period to the next. Why does that occur? Does a parent lose the job that provided coverage? Does the parent leave the household? Or does the parent remain in the job, suggesting that the parent or the employer simply dropped the coverage? In either case, does the parent lose coverage as well? These are the types of questions that longitudinal data may be able to answer.
In using longitudinal data to examine the sequence of events with an eye to inferring causality, analysts must be cognizant of the possibility of measurement error in the reported timing of events. Changes in children’s family circumstances and, in particular, their health insurance coverage may not be reported exactly as they occur; they may be reported late or even early. Such errors weaken the observed relationships between changes in economic circumstances and changes in children’s health insurance coverage, and analysts need to recognize this. With the SIPP, given the four-month frequency of interviews, a change reported within a four-month reference period could very well have occurred anywhere within that period, while changes reported at the boundary between reference periods almost certainly occurred earlier or later.(33)
Another aspect of the analysis of longitudinal data is that repeated measures of a characteristic over time may show inconsistencies or changes that are rapidly reversed. Some or even many of these inconsistencies may be due to reporting errors. With cross-sectional data, where there is but a single measure, such error is not evident. This does not mean that the error is not present. Rather, the same error may be present, but without the benefit of repeated measures we cannot detect it.
3. Representativeness of Panel Samples
Panel samples lose respondents at each interview--a phenomenon described as attrition. For analyses that span the duration of a panel or focus on the later interviews, the sample available for analysis is affected by the cumulative attrition. Because it tends to be nonrandom, attrition affects the representativeness of a panel sample. There is evidence from the SIPP, for example, that poor people are more likely to leave the sample than people at higher income levels. Major changes in life circumstances may also contribute to attrition. When this occurs, the changes that prompted the attrition may go unobserved, leaving no evidence to connect them to attrition. When differentials in attrition probabilities are observed, as with poverty, the sample weights of the remaining respondents can be adjusted to improve the overall representativeness of the sample. When factors that contribute to attrition are not observed, however, compensating adjustments cannot be made, and the representativeness of the sample is reduced.
A second way in which the representativeness of panel samples changes over time is aging. The 9-year-olds in the first year of the panel become the 10-year-olds of year two and the 11-year-olds of year three. This aging of the sample carries a number of implications, but we consider two of them. Each year an entire birth cohort “ages out” of the population of children while a new birth cohort enters at the bottom. This raises issues about how to define the study population. Can an uninsured child simply age out of the population of uninsured children? Indeed, that is what happens in the real population, just as infants are born into uninsurance. With its aging, a panel sample merely replicates real life. Nevertheless, the analyst must determine whether analytical objectives may dictate retaining such children in the study population past the point where they would otherwise exit. For example, in measuring the duration of spells of uninsurance among older children, it may be desirable to follow these spells past the point where the child leaves childhood.
A second aspect of aging in a panel database is that chance differences in sample sizes by age group will be carried forward along with real differences, which can affect the measured impact of age-sensitive phenomena. Thus if the weighted number of children age 10 is 20 percent larger than the weighted number of children age 9, then one year later this difference will be manifested in the relative numbers of 11- and 10-year-olds. Two groups for which there appears to be potential for systematic bias are infants and older teens. There is evidence that the SIPP underestimates births by as much as 25 percent, so that by the end of a three-year panel the number of children under age three is understated by that amount. Older teens appear to be underrepresented as well. Over time, the proportions of children who are estimated to be covered by Medicaid or to have no insurance at all are affected by these distortions in the age distribution, because the frequency of both Medicaid coverage and uninsurance varies by age. Adjusting the sample weights to match independent estimates of population size by single year of age can correct this problem, but it may alter the composition of the weighted sample on other dimensions.(34)
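The single-year-of-age weight adjustment described above is, in essence, a ratio adjustment: within each age, every weight is scaled so the weighted total matches an independent population control. The sketch below illustrates this with invented weights and controls; none of the numbers come from the SIPP.

```python
from collections import defaultdict

# Illustrative records: (age, sample_weight). All values are hypothetical.
sample = [(0, 2000.0), (0, 2200.0), (1, 2100.0), (9, 2050.0), (10, 2400.0)]

# Independent population controls by single year of age (hypothetical).
# Age 0 is deliberately under-weighted in `sample`, mimicking the
# underrepresentation of births discussed in the text.
controls = {0: 6300.0, 1: 2100.0, 9: 2050.0, 10: 2400.0}

# Sum the current weights within each single year of age.
weighted = defaultdict(float)
for age, weight in sample:
    weighted[age] += weight

# Ratio adjustment: one scaling factor per age group.
factors = {age: controls[age] / weighted[age] for age in weighted}
adjusted = [(age, weight * factors[age]) for age, weight in sample]

# After adjustment, the weighted total for each age matches its control,
# though the composition of the sample on other dimensions may shift.
```

The caveat at the end of the paragraph applies here too: forcing agreement by age can disturb the weighted distribution of other characteristics, which is why production surveys use more elaborate raking procedures.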
A third way in which the representativeness of panel samples may change over time is through their exclusion of new additions to the population. People who immigrate to the United States after the start of a SIPP panel are not represented in the panel sample. The same is true of people who leave institutions or the military and return to the household population. Over time, then, an ongoing SIPP panel represents a decreasing fraction of the universe that would be eligible for selection into a new sample. The magnitude of the decline in representativeness is not nearly as large as that due to the underrepresentation of births. We estimated that at the end of three years the 1992 SIPP panel underrepresented the number of children in the population by about 4 million (out of 74 million), but less than 2 million of this could be attributed to the exclusion of children who entered the population by a means other than birth (Czajka 1999).
4. Measuring Spell Duration
One aspect of longitudinal data that is particularly valuable for the analysis of health insurance coverage is their ability to provide reliable measures of duration. As with other aspects of the use of longitudinal data, however, measuring spell duration requires a number of choices on the part of the analyst. These include defining the universe of spells to be included in the measure of duration, determining when a spell begins or ends, resolving how to handle the censoring of spells at either end, and electing how to treat multiple spells by the same individuals.
a. Choice of Universe of Spells
Over the lifetime of a panel survey there are spells of uninsurance that begin, spells that end, spells that begin and end, and spells that remain in progress without beginning or ending. The same can be said of any sub-period within the life of the panel--for example, a calendar year or fiscal year. Which spells the analyst chooses to include in a distribution of spell durations can have a profound effect on the average length and other features that are attributed to these spells. Essentially, it is difficult to restrict the universe of spells without selecting, indirectly, on their length.
Consider the following. A natural way to restrict the universe of spells is to limit the sample to spells that are active in a given month--that is, the spells of all children who are uninsured in that month. It turns out, however, that this restriction results in spells being represented in direct proportion to their completed duration. Spells of one-month duration are represented solely by spells that began in the selected month. Spells of two-month duration are represented solely by spells that began in either of two months: the selected month or the preceding month. Spells of three-month duration are represented by spells that began in any of three months, and so on. Thus spells of 12-month duration are represented by spells that began in any of 12 months, while spells of 36-month duration are represented by spells that began in any of 36 months.(35)
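This length-biased selection can be demonstrated with a small simulation. The spell-generating process below is entirely artificial (a fixed number of new spells begins each month, with durations drawn uniformly from 1 to 36 months), but the contrast between the two selection rules holds regardless of the true duration distribution.

```python
import random

def simulate_selection(months=600, new_per_month=50, seed=7):
    """Contrast two ways of restricting a universe of uninsurance spells."""
    rng = random.Random(seed)
    # Each month, `new_per_month` spells begin; durations are drawn
    # uniformly from 1 to 36 months (an arbitrary assumption).
    spells = [(start, rng.randint(1, 36))
              for start in range(months)
              for _ in range(new_per_month)]
    m = months // 2  # the selected month
    # Universe A: spells active in month m -- length-biased, because a
    # spell of duration d is caught if it began in any of d months.
    active = [d for s, d in spells if s <= m < s + d]
    # Universe B: spells that began in month m -- no length bias.
    began = [d for s, d in spells if s == m]
    return sum(active) / len(active), sum(began) / len(began)

mean_active, mean_began = simulate_selection()
# The active-in-a-month sample over-represents long spells, so its mean
# duration exceeds the mean among spells selected by starting month.
```

With a uniform 1-to-36 distribution, the mean among spells beginning in a month is about 18.5 months, while the point-in-time sample averages roughly 24 months, illustrating the bias described above.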
As an alternative, consider the impact of restricting the universe to all spells that started in the same month. Selecting a subset of spells in this manner does not favor spells of any particular length except insofar as this particular month may differ from other months. The same can be said of a selection of spells that ended in the same month: no particular length is favored. This applies equally to spells that started or ended over a range of months--say, a particular year. Spells selected in any of these ways give an unbiased representation of all spells--again, barring any seasonal patterns or trends over time.
These observations on spell length do not imply that the distribution of durations among spells that were active in a given month or months provides no useful information. It may be exactly the information required to address a specific policy question or research issue--for example, how many children would be eligible for CHIP coverage if there were a six-month waiting period? But it is important to recognize that defining a subset of spells on the basis of when they were active, as opposed to when they began or ended, will yield a distribution with a longer average length than is true of the entire universe of spells. The shorter the period from which they are selected, the more pronounced the bias will be. If the intent is to represent “all spells” in some sense, then spells should be sampled by their starting date or ending date rather than by when they were active.
b. When Does a Spell Begin or End?
There are essentially two issues that an analyst must address in defining when spells of uninsurance begin or end. The first is whether spells are defined to end at the point that an individual leaves the population of children or whether they are followed past that point. A child turning 18 or 19, depending on the upper age limit of the child population, is no longer a child and therefore no longer an uninsured child, but the individual may still be uninsured. The analyst must determine what strategy for handling these situations is most consistent with the objectives of the analysis. The second issue is how to treat what appear to be brief interruptions of spells. For example, does a one- or two-month period of coverage constitute the end of a spell of uninsurance? Such interruptions may be nothing more than measurement error, but even if they are genuine there may be reasons to treat them as inconsequential and to regard the spell of uninsurance as continuing. In addition, when evaluating the accuracy of brief spells of coverage, it is important to take the survey design into consideration. The SIPP, for example, uses a four-month reference period, which creates the potential for erroneous reports to occur in blocks of four months. In the SIPP, an interruption that coincides exactly with a survey reference period should be viewed with suspicion. Ordinarily, evidence that the inconsistent information was provided by a proxy respondent would create a persuasive case for editing the reported spell of insurance to match the surrounding months.
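One plausible editing rule, sketched below, treats covered interruptions of up to two months as part of a continuing spell of uninsurance. The two-month threshold and the data representation are the analyst's choices for illustration, not a prescribed standard.

```python
def merge_short_interruptions(uninsured_months, max_gap=2):
    """Collapse runs of uninsured months separated by short covered gaps.

    uninsured_months: sorted list of month indices in which the child
    was uninsured. Covered gaps of `max_gap` months or fewer are treated
    as part of a continuing spell (an analyst's choice, not a fixed
    rule). Returns a list of (first_month, last_month) spells.
    """
    spells = []
    for month in uninsured_months:
        # A gap of k covered months means the indices differ by k + 1.
        if spells and month - spells[-1][1] <= max_gap + 1:
            spells[-1] = (spells[-1][0], month)  # extend the current spell
        else:
            spells.append((month, month))        # start a new spell
    return spells

# A two-month covered gap (months 4 and 5) is absorbed into one spell;
# the longer gap before month 12 starts a new spell.
print(merge_short_interruptions([1, 2, 3, 6, 7, 12]))  # → [(1, 7), (12, 12)]
```

Setting `max_gap=0` recovers the strict definition in which any covered month ends a spell.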
c. Censoring
Spells whose beginning or end lies outside the period of observation covered by the survey are described as “censored.” Spells whose starting point is observed but whose ending point is not are defined as “right-censored,” while spells whose ending point is observed but whose beginning is not are defined as “left-censored.” Censored spells outnumber uncensored spells near the beginning and end of a panel, of course. For types of spells that commonly run to lengths of a year or more, even a two-year panel may yield few completed spells.
Observing spells from beginning to end is important for measuring the full distribution of spell duration, although there are analytic techniques for estimating distributions of duration for censored spells. Data on complete spells are even more important to understanding the dynamics of how children enter and leave uninsurance. Without seeing both ends of a spell, we cannot infer how the circumstances that precede uninsurance may compare with those that follow. For example, we learn from examining the beginnings and endings of spells that a disproportionate number of children enter uninsurance from Medicaid and leave uninsurance to enroll in Medicaid. But without seeing both ends of spells we cannot determine to what extent it is the same children leaving and re-entering Medicaid who account for the rates that we see or to what extent these exits and entries are independent.
While certain research questions require observing both the beginnings and endings of spells, there are other questions for which partial information may be sufficient. For example, how long do spells last? We can discern a lot about the distribution of spell lengths and the characteristics of children who experience spells of different lengths by following spells from their beginning through their first 12 months. With a panel database covering 24 months we can identify all the spells that started in a 12-month period and follow each of them for at least 12 months. With this information we can determine what proportion of new spells last for 12 months or more and what differentiates the children who experience 12-month spells from those who experience very brief spells.
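This 12-month follow-up can be sketched as follows, assuming a simple vector of monthly coverage indicators for one child. The representation is hypothetical and much simpler than the actual SIPP file structure.

```python
def new_spells_lasting(coverage, window=12):
    """Identify new uninsurance spells and whether each lasts `window` months.

    coverage: list of booleans, one per month, True = insured that month.
    A spell "starts" in month t if the child is uninsured in t but was
    insured in t - 1; spells already in progress at month 0 are
    left-censored and excluded. Only starts that can be followed for the
    full window are examined. Returns (start_month, lasted_full_window).
    """
    results = []
    for t in range(1, len(coverage) - window + 1):
        if not coverage[t] and coverage[t - 1]:
            # The spell ran the full window if no insured month intervened.
            lasted = not any(coverage[t:t + window])
            results.append((t, lasted))
    return results

# 24-month history: a short uninsured spell in months 3-5, then a spell
# beginning in month 8 that runs to the end of the panel.
history = [True] * 3 + [False] * 3 + [True] * 2 + [False] * 16
print(new_spells_lasting(history))  # → [(3, False), (8, True)]
```

Aggregating the second element of these tuples across children gives the proportion of new spells lasting 12 months or more.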
d. Multiple Spells
While some of what appear to be multiple spells are nothing more than long spells with erroneous reports of intermittent coverage, multiple spells remain a phenomenon of some interest. Analysis of SIPP data suggests that as many as one-third of new spells of uninsurance among children represent the second or third such spells within a year (Czajka 1999). The frequency of multiple spells makes it essential when counting spells to distinguish between spells and the children who experience them. Between October 1992 and September 1994 we found that children started 19 million new spells of uninsurance. But these 19 million new spells represented 12 million rather than 19 million children. Both numbers are staggering in terms of the needs that they reflect. But understanding how they are different is important to understanding how we can best address the social problem that these numbers present.
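The distinction between counting spells and counting the children who experience them can be made concrete with a toy example; the records below are invented solely to show the two tallies diverging.

```python
# Each record is (child_id, start_month) for one new spell of
# uninsurance. Children "a" and "c" experience multiple spells.
new_spells = [("a", 1), ("a", 9), ("b", 4), ("c", 2), ("c", 7), ("c", 11)]

spell_count = len(new_spells)                          # 6 spells...
child_count = len({child for child, _ in new_spells})  # ...among 3 children

# As in the text, the number of spells can substantially exceed the
# number of children, so the two should never be conflated.
```

In the report's own figures, 19 million new spells over two years corresponded to only 12 million children.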
CONCLUSION
Perhaps the single most important lesson to draw from this review is how much our estimates of the number and, particularly, the characteristics of uninsured children are affected by measurement error. Some of this error is widely acknowledged--such as the underreporting of Medicaid enrollment in surveys--but much of it is not. And even when the presence of error is recognized, this does not mean that researchers and policymakers know how to take it into account. We may know, for example, that Medicaid enrollment is underreported by 24 percent in a particular survey, but how does that affect the estimate of the uninsured? And how much does the apparent, substantial underreporting of Medicaid contribute to the perception that Medicaid is failing to reach millions of uninsured children? Until we can make progress in separating the measurement error from the reality of uninsurance, our policy solutions will continue to be inefficient, and our ability to measure our successes will continue to be limited.
REFERENCES
Bilheimer, Linda T. “CBO Testimony on Proposals to Expand Health Coverage for Children.” Testimony before the Subcommittee on Health. U.S. House of Representatives, Committee on Ways and Means, Washington, DC, April 8, 1997.
Brennan, Niall, John Holahan and Genevieve Kenney. “Snapshots of America’s Families: Health Insurance Coverage of Children.” Washington, DC: The Urban Institute, January 1999.
Cohen, Steven B. “Sample Design of the 1996 Medical Expenditure Panel Survey Household Component.” MEPS Methodology Report No. 2. Rockville, MD: Agency for Health Care Policy and Research, 1997.
Czajka, John L. “Analysis of Children’s Health Insurance Patterns: Findings from the SIPP.” Washington, DC: Mathematica Policy Research, Inc., May 1999.
Czajka, John L., Scott Cody, and Larry Radbill. “Analysis of Whether Poverty Estimates Vary by the Month of Measurement.” Draft Report. Washington, DC: Mathematica Policy Research, Inc., 1998.
Ellwood, Marilyn R., and Kimball Lewis. “The Ins and Outs of Medicaid: Enrollment Patterns for California and Florida in 1995.” Washington, DC: Mathematica Policy Research, Inc., 1999.
Fronstin, Paul. “Sources of Health Insurance and Characteristics of the Uninsured: Analysis of the March 1997 Current Population Survey.” Employee Benefit Research Institute Issue Brief no. 192. Washington, DC: EBRI, December 1997.
Giannarelli, Linda. “An Analyst’s Guide to TRIM2--The Transfer Income Model, Version 2.” Washington, DC: The Urban Institute Press, 1992.
Health Care Financing Administration (now known as the Centers for Medicare and Medicaid Services (CMS)). Medicaid Statistics: Program and Financial Statistics, Fiscal Year 1993. HCFA Publication no. 10129. Washington, DC: U.S. Department of Health and Human Services, October 1994.
Kenney, Genevieve, Fritz Scheuren, and Kevin Wang. “National Survey of America’s Families: Survey Methods and Data Reliability.” Washington, DC: The Urban Institute, February 1999.
Lewis, Kimball. “Simulating Food Stamp Program Participation Using Single-month and Multiple- month Data.” Washington, DC: Mathematica Policy Research, Inc., June 1997.
Lewis, Kimball, and Marilyn Ellwood. “Using Medicaid Administrative Data to Examine Medicaid’s Role in WIC Eligibility.” Washington, DC: Mathematica Policy Research, Inc., July 1998.
Lewis, Kimball, Marilyn Ellwood, and John L. Czajka. “Counting the Uninsured: A Review of the Literature.” Assessing the New Federalism, Occasional Paper No. 8. Washington, DC: The Urban Institute, 1998.
Lewis, Kimball, Marilyn Ellwood, and John L. Czajka. “Children’s Health Insurance Patterns: A Review of the Literature.” Washington, DC: Mathematica Policy Research, Inc., December 1997.
National Center for Health Statistics. Health, United States, 1998 With Socioeconomic Status and Health Chartbook. Hyattsville, MD: NCHS, 1998.
Rosenbach, Margo, and Kimball Lewis. “Estimates of Health Insurance Coverage in the Community Tracking Study and the Current Population Survey.” Cambridge, MA: Mathematica Policy Research, Inc., November 1998.
Swartz, Katherine. “Interpreting the Estimates from Four National Surveys of the Number of People Without Health Insurance.” Journal of Economic and Social Measurement, vol. 14, 1986, pp. 233-243.
Ullman, Frank, Brian Bruen, and John Holahan. “The State Children’s Health Insurance Program: A Look at the Numbers.” Assessing the New Federalism, Occasional Paper No. 4. Washington, DC: The Urban Institute, 1998.
U.S. General Accounting Office. “Health Insurance for Children: Many Remain Uninsured Despite Medicaid Expansion.” GAO/HEHS-95-175. Washington, DC: GAO, 1995.
Vaughan, Denton R. “Discussion of Papers by Giannarelli and Young and Doyle, Cohen, and Beebout.” Proceedings of the Government Statistics Section. Alexandria, VA: American Statistical Association, 1992.
Weigers, Margaret E., Robin M. Weinick, and Joel W. Cohen. Children’s Health, 1996. MEPS Chartbook No. 1. AHCPR Pub. No. 98-0008. Rockville, MD: Agency for Health Care Policy and Research, 1998.
(1)As of May 1999, data from only the first two of the 12 waves have been released. These data cover five calendar months in early 1996.
(2)A description of the CTS is presented in Rosenbach and Lewis (1998).
(3)A brief description of the design of the NSAF is provided by Kenney, Scheuren, and Wang (1999), which can be found on the Urban Institute’s web page at http://newfederalism.urban.org/nsaf/design.html.
(4)For the CPS in 1996 the rate of uninsurance among children under 18 is 14.8 percent, or 0.3 percentage points lower than the rate for children under 19. We can assume that a comparable differential between these two alternative definitions of children exists across the other years.
(5)The MEPS instrument includes direct questions about periods of uninsurance in the past. The number reported in Table 1 and cited frequently in AHCPR reports is based on measuring uninsurance as a residual.
(6)Brennan et al. (1999) report that without the verification question the estimated proportion of people under 65 who lacked health insurance would have been “slightly greater than the uninsurance rate published by the Census Bureau.” If this applied to children under 18 as well, the uninsurance rate without the verification question would have been slightly above the 14.8 percent figure that the Census Bureau estimated from the March 1997 CPS.
(7)This last approach is the one used by the SIPP. The estimates in Table 2 of children ever uninsured or children uninsured for an entire year were constructed by aggregating individual monthly results in which uninsurance was measured as a residual.
(8)Technically, a child who was covered for only the first day and last day of a two-month period should be reported as covered for each of the two months. That is, despite a 58-day period of uninsurance, the child would not be identified in the SIPP as uninsured at all if the child (or parent) answers the SIPP questions correctly.
(9)Measuring insurance coverage as it is done in the CPS, SIPP, and NHIS involves, in effect, filling in for every sample household a matrix that includes a column for every distinct type of insurance that the survey takers wish to measure and a row for every household member. Failure to identify every household member who is included under a particular type of coverage may result in one or more members being classified as uninsured. The potential problems with this approach would be less severe if the survey instruments walked through the entire matrix cell by cell. But in the interest of saving valuable time in surveys that serve many purposes, the survey instruments do not do this. Without a verification question, then, there is no way to determine if a household member who appears to be uninsured was overlooked under a particular coverage type.
(10)The CPS and SIPP questionnaires use state program names in addition to the more generic “Medicaid” in their questionnaires. This is fairly common practice in the major surveys.
(11)Hypothetically, someone could qualify for TANF without being eligible for Medicaid, which suggests that imputing Medicaid to all respondents who report TANF but do not report Medicaid may not always be correct. In reality, however, there are probably very few TANF recipients who are not covered by Medicaid.
(12)This is a net figure representing the number of people actually missed by the census less those counted twice.
(13)The Census Bureau does not publish population estimates and projections that have been adjusted for the 1990 census undercount, but it uses adjusted estimates to weight its surveys, and it publishes the estimates of census undercount that are used to derive the sample weights. These estimates of the census undercount are available by age, sex, race, Hispanic origin, and state. Users can add these estimates of the census undercount to the published population estimates and projections in order to obtain undercount-adjusted figures.
(14)The CPS, which provides state-level estimates of unemployment, includes state as a dimension of its post-stratification. The NSAF employed state population controls for 13 states.
(15)Bureau field staff conduct the monthly employment questionnaire before beginning the March supplement. The response rate to the employment questionnaire was 93 percent in March 1997, but 9 percent of these respondents refused or otherwise failed to complete the supplement, producing the indicated total response rate (91 percent of 93 percent, or about 85 percent).
(16)We exclude surveys conducted by mail. Except for the decennial census, which uses telephone and in-person methodologies to complete interviews with the more than 35 percent of households that fail to return their questionnaires, mail surveys tend to be very limited in scope. Furthermore, their contribution to research on the uninsured has been minimal at best. We also exclude self-administered questionnaires that are included as part of an in-person interview. These represent a distinctly different mode and one that has proven effective as a means of collecting data on sensitive topics, such as drug use, but they have little relevance to the measurement of health insurance coverage.
(17)Both the NSAF and the CTS were telephone surveys. The NSAF included a complementary sample of nontelephone households. The CTS did so in the 12 intensive sites and relied on statistical adjustments to compensate for households without telephones elsewhere.
(18)The fact that nearly all respondents were introduced to the CPS with in-person interviews may reduce the mode differences between the telephone and in-person interviews.
(19)This is in addition to the 16 percent of reported participants whose Medicaid coverage was logically imputed or edited, as explained earlier.
(20)Estimates from the NHIS and the SIPP typically have not been released until more than two years after the end of the data year. With the move to CAPI and other changes, the NCHS has goals of reducing the lag in releasing NHIS data to as little as six months. There are no such objectives for the SIPP, however, and until an overlapping panel design is restored in 2001 or 2002, the representativeness of the SIPP sample over time presents a serious concern for the measurement of trends.
(21)Effective with the 1997 redesign, the NHIS will be able to provide state-level estimates, but for many if not most states the precision of these estimates will be too low to support policy analysis at the state level. It is likely that the SIPP will move to a fully state-representative design analogous to the CPS but almost certainly not before 2004.
(22)See, for example, the seminal paper by Swartz (1986) and, for a more recent perspective, Bilheimer (1997).
(23)The SIPP estimates of annual coverage are derived from responses in three or four consecutive interviews, so their face validity is high. The SIPP estimates refer to a somewhat smaller universe of children than the CPS estimates, which lowers them by a few percentage points.
(24)Estimates for 1995 would have to come from the 1993 SIPP panel, which shows significantly more poor children than the 1992 panel and is likely to show an upswing in uninsured children relative to the 1992 panel. SIPP data covering the final months of 1996--to pick up the implementation of welfare reform--will not be released for several months.
(25)In the CTS, only people who did not report employer-sponsored coverage were asked the questions on Medicaid coverage. Other surveys indicate that a nontrivial fraction of those who report Medicaid also report another source of coverage at the same time, so it is likely that the incidence of Medicaid enrollment among those respondents who were not asked the question is not negligible. Even with some allowance for this, however, the reported Medicaid enrollment is too low to suggest that better Medicaid reporting accounts for the relatively low estimate of the incidence of uninsurance. The quality of Medicaid reporting in the NSAF has not yet been documented.
(26)The value of a home is not counted in determining Medicaid eligibility or eligibility for the other major means-tested entitlement programs.
(27)Another issue with respect to the reporting of age in the HCFA (now known as CMS) data is discussed in Section E.2.b.
(28)Lewis (1997) found that about 10 percent of the households reporting the receipt of food stamp benefits in January 1992 were simulated to be ineligible. Official estimates placed the error rate in food stamp eligibility determinations at about 3 percent at the time, suggesting that errors in the simulation algorithm or the survey data used by the model accounted for most of the seemingly ineligible participants.
(29)For example, infants who no longer appear eligible but would be eligible if they had enrolled earlier should be included in the count of eligibles if the seemingly ineligible infants are added to the numerator (and denominator).
(30)Both the CPS and the SIPP differentiate between respondents, who include all household members 15 and older, and younger members of the household. Coverage is ascertained separately for each respondent, whether directly or by proxy, but coverage for children is measured with questions that ask who else is included under each respondent’s plan.
(31)Children born on or before September 30, 1983, are the exception. An important question, then, is whether the uninsured children who report parents covered by Medicaid are themselves eligible for Medicaid. If they are eligible, then the likelihood is high that they were in fact covered by Medicaid, and their true status was simply misreported. On the other hand, when they are not eligible, and their parents are neither pregnant nor covered by SSI, then perhaps it is the parents’ coverage that is misreported.
(32)Both populations include the same 9,271,000 uninsured children in September 1993, but for the dynamic population it is inappropriate to divide this number by the total population to obtain a proportion uninsured at a point in time because the dynamic population total includes people who would not have been defined as children in September 1993.
(33)Like all panel surveys, SIPP data exhibit a pronounced “seam” effect. Reported changes occur disproportionately between rather than within the four-month reference periods.
(34)Given the frequency of SIPP interviews, the underrepresentation of births is more likely the result of attrition by new mothers than the underreporting of births. If so, new mothers and any other children they may have are being underrepresented along with the newborns.
(35)Selecting all spells that were active during a 12-month period yields a much less skewed distribution than limiting the universe to spells that were active during a single month. Spells of one month duration could have started in any of 12 months, spells of two months duration could have started in any of 13 months, and spells of 12 months duration could have started in any of 23 months. In this case, then, spells of 12 months in length are represented at just under twice the relative frequency, rather than 12 times the relative frequency, of one-month spells.