The most current databases for each of the three surveys were examined to assess their ability to provide state-level estimates. For the CPS, the most current data are from the March 1996 survey. For SIPP the 1993 panel data are available.
In future years there will be data from the 1996 SIPP panel and from the SPD. The SPD combines the respondents from the 1992 and 1993 SIPP. However, since only approximately three-quarters of the original respondents to these two waves remain in the SPD, there is a strong potential for bias in some of the estimates produced from this survey. While a larger sample will be available from the 1996 SIPP panel, the basic structure will be similar to the 1993 panel. The new panel assures the inclusion of every state in the survey, but the procedure that was implemented still uses strata that cross state boundaries and does not improve the ability to produce direct estimates for every state. Low-income households have in general been oversampled, resulting in a larger number of poor persons being included in the survey. However, the differential weights resulting from the oversampling may significantly affect the gains in precision that would be expected to result from the oversampling. Thus, it is not possible to make clear generalizations from the 1993 panel to the newer data series based solely on the changes in sample sizes.
Because the most recent NHIS data are for 1993 and the NHIS sample was completely redesigned in 1995, no NHIS data are examined. However, some discussion of the ability of the NHIS to provide the desired estimates is included.
The Bureau of the Census is making plans to introduce the American Community Survey (ACS) beginning around 2002. This survey will collect information from more than one million households annually, using a revised version of the Census Long Form. If questions of interest to ASPE are included in the ACS, it can be expected to provide more accurate state-level estimates than those described below from the three smaller existing surveys. It is our understanding, however, that it is not certain that this survey will be annual.

A. The Proportion Nationally with the Characteristic

As mentioned in Section II, the accuracy of state-level estimates of proportions is a function of the proportion of the population with the characteristic and the effective sample size. Table 1 shows the proportion of the population in each state estimated by the March 1996 CPS to live below poverty, and the actual sample sizes from which the proportions are estimated. In general, approximately 15 percent of the population are estimated to live in poverty, with approximately double that rate for minorities. The overall rates vary across states, from six percent in New Hampshire to 27 percent in New Mexico. However, many of the state estimates for minorities that differ greatly from the national numbers may be a result of extremely small sample sizes. For example, all state estimates with less than 10 percent or 50 percent or more of their black or Hispanic populations living below the poverty line are based on samples of fewer than 50 minority respondents. The estimate of zero percent of blacks in North Dakota is based on a sample of only two blacks. This demonstrates why great caution is needed before using any state-level estimates. For purposes of this assessment we will use the national proportions, rather than the very unstable state estimates, when calculating precision for each state. For example, rather than using state-specific poverty rates to determine the minimum cell counts for each subpopulation for each state, we use the national poverty rate to determine the threshold applied to each subpopulation across all states. Detailed tables for each of the states are provided in the appendices.
Table 2 provides the national estimates of the proportions with each characteristic based on the CPS and the SIPP. It is important to remember that the estimates reflected in this table do not cover the same time period. These are the values used in the assessment of the surveys' ability to produce accurate state-level estimates.
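The reason for substituting the national proportion can be illustrated with a minimal sketch. It assumes the usual binomial standard error for a proportion, inflated by a design effect; the function name `standard_error` is illustrative and does not come from the report. For the North Dakota sample in Table 1 (two black respondents, observed poverty rate of zero), the plug-in formula returns a standard error of zero, which greatly overstates precision, whereas the national rate of 32 percent gives a more realistic value.

```python
import math


def standard_error(p: float, n: int, deff: float = 1.0) -> float:
    """Approximate standard error of a proportion under a design effect.

    Assumes the simple binomial variance p(1-p)/n scaled by deff.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(deff * p * (1.0 - p) / n)


# Inputs from Table 1 (North Dakota, black respondents) and Table 3 (poverty).
state_rate = 0.0       # observed state poverty rate
national_rate = 0.32   # national poverty rate for blacks
n_respondents = 2      # actual sample size
deff_poverty = 1.3     # state-level design effect for poverty

se_state = standard_error(state_rate, n_respondents, deff_poverty)
se_national = standard_error(national_rate, n_respondents, deff_poverty)

print(f"SE using the state rate:    {se_state:.3f}")
print(f"SE using the national rate: {se_national:.3f}")
```

The zero standard error from the state rate is an artifact of the tiny sample, not evidence of a precise estimate.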


B. Effective Sample Size

The effective sample size is the sample size from a simple random sample of respondents that would have equivalent precision to that achieved by the complex sample design actually used for the survey. Since standard statistical formulas assume simple random sampling, when using them to estimate the precision of estimates it is important to replace the actual sample size with the effective sample size.
The effective sample size is computed by dividing the actual sample size by a design effect that reflects the effect of the deviations from simple random sampling. Design effects may vary by subgroup (e.g., blacks versus whites) but will generally be fairly consistent across states for each subgroup. This is because in large national surveys, such as the three examined here, a similar sample design, including the number sampled from each PSU, is used in all states. Design effects will also vary by type of question; for example, respondents who live near each other (in the same sampled cluster) are likely to have similar poverty characteristics but are not likely to have similar disability characteristics.
From Westat's experience with these and similar surveys, we have estimated the state-level design effects shown in Table 3 for each of the four characteristics being estimated. National design effects for the CPS are higher than these because they take into account the oversampling of small states by each survey to increase the accuracy of state estimates. This assessment is only examining state estimates, and therefore is only concerned with the survey design within each state.^{2}
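The division described above can be sketched in a few lines. The function name and the sample of 1,000 completed interviews are hypothetical; the design effects are the four state-level values listed in Table 3.

```python
def effective_sample_size(actual_n: float, design_effect: float) -> float:
    """Return the simple-random-sample size with equivalent precision."""
    if design_effect <= 0:
        raise ValueError("design_effect must be positive")
    return actual_n / design_effect


# Hypothetical state sample of 1,000 completed interviews,
# evaluated under each of the Table 3 design effects.
for deff in (1.0, 1.1, 1.2, 1.3):
    n_eff = effective_sample_size(1000, deff)
    print(f"design effect {deff:.1f}: effective n = {n_eff:.0f}")
```

A design effect of 1.0 leaves the sample size unchanged, while larger values shrink the effective sample and widen the resulting confidence intervals.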
Design effects are a function of the average number of completed interviews for the domain of interest that are completed in each cluster. Thus, design effects for subpopulations tend to be smaller than for the entire population, assuming the subpopulations are spread fairly evenly throughout the population. Design effects for children and the elderly may therefore be smaller than those in Table 3. Given that blacks and Hispanics are not evenly distributed across the population, their design effects are not likely to differ from those in the table. For purposes of this assessment, we have assumed that the design effects in the table apply to all subpopulations.
The CPS does no oversampling within states, so there is no additional design effect from differential weighting. (The one exception is that on the March supplement Hispanics are oversampled at twice their normal rate. Given that they represent a small proportion of the total sample, the increase in design effect is not significant.) An absence of oversampling is also true of the 1993 and 1996 SIPP panels. However, the 1996 SPD will oversample low-income populations, resulting in an additional design effect for analyses from that survey. Beginning with the 1995 sample, the NHIS is oversampling blacks and Hispanics, so any analyses of the NHIS will also have to incorporate that design effect. Oversampling in these surveys will also result in larger sample sizes for these subpopulations than would otherwise be observed.
The sample sizes for the 1996 CPS and 1993 SIPP panel vary across states for all of the populations of interest. Table 4 provides the minimum and maximum actual state sample sizes for each survey for each of the populations of interest. These CPS sample sizes are based on respondents to the 1996 March supplement. Sample sizes for the main CPS questionnaire are a little larger since approximately 10 percent of respondents to the main questionnaire do not participate in the supplement, but Hispanic respondents to the previous November's CPS are asked the supplement questions in March. Thus, for questions asked on the main questionnaire (which does not include any of the four questions used in this assessment) the CPS sample sizes will be somewhat larger than used in this assessment. SIPP only asks those under age 70 about work disability, so for this question the minimum and maximum elderly SIPP sizes are 4 and 220. The appendices provide state level detail for sample sizes.


C. Necessary State Sample Sizes

The desired precision of estimates, and therefore the necessary sample size, is a function of the planned use of the estimates. It is therefore impossible to make a general statement on how big a sample is necessary in each state. Instead, it is possible to look at a few illustrative characteristics for each subgroup and examine how often the precision will meet an arbitrary cutoff.
As mentioned earlier, the National Center for Health Statistics (NCHS) tries to ensure that all of its reported values that are analyzed in NCHS reports have a coefficient of variation (cv) less than or equal to 30 percent. Thus, for estimating fairly rare diseases with incidence rates of around 1.0 percent, this rule ensures that the standard error is no greater than 0.30 percentage points, yielding a 95 percent confidence interval of 1.0% ± 0.60%. For proportions closer to 50 percent this rule allows for much larger standard errors. A cv of 30 percent on such an estimate yields a 95 percent confidence interval of 50% ± 30%. Thus, depending on the size of the proportion estimated from the CPS and SIPP, it may be preferable to use different cutoffs for different characteristics.
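The arithmetic behind these two examples can be checked with a short sketch. It assumes, as the figures in the paragraph imply, that the 95 percent interval is taken as roughly two standard errors on either side of the estimate; the function name is illustrative only.

```python
def ci_half_width_from_cv(p: float, cv: float, z: float = 2.0) -> float:
    """Half-width of the confidence interval implied by a cv ceiling.

    The standard error equals cv * p, and the interval is taken as
    z standard errors on each side (z = 2 is an assumption matching
    the worked figures in the text).
    """
    return z * cv * p


for p in (0.01, 0.50):
    half_width = ci_half_width_from_cv(p, cv=0.30)
    print(f"p = {p:.0%}: interval = {p:.1%} +/- {half_width:.1%}")
```

The same 30 percent cv therefore permits an interval fifty times wider at a proportion of 50 percent than at 1 percent, which is why a single cv cutoff is not used for every characteristic.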
Table 2 provided the estimated proportions for the characteristics in question. The proportion receiving AFDC and the proportion with a work disability (except for the elderly) are both generally around 10 percent or less. For these two characteristics, we used the NCHS rule of a cv not greater than 30 percent. For the other two characteristics and disabled elderly, a smaller cv would be desirable. The estimates for poverty and employer-provided health insurance range from 11 to 60 percent. We chose an arbitrary confidence interval width of less than or equal to ±10 percent on these estimates.
As an alternative, all cutoffs could be specified in terms of standard errors, with larger standard errors acceptable for larger estimated percentages. For example, estimates under 10 percent could have a confidence interval width of ±2 percent, estimates of 20 to 40 percent a width of ±4 percent, and larger percents a width of ±5 percent. Another alternative for each population and characteristic would be to examine the distribution of standard errors achieved by the existing state samples.


D. Summary Results for the Selected Subgroups and Variables

The estimated proportions in Table 2 are very similar for both the CPS and SIPP. Therefore, the following analyses apply to both surveys. Poverty and health insurance both use the "confidence interval width of ±10 percent or less" rule and are therefore discussed before the two characteristics using the "cv of less than or equal to 30 percent" rule. Please note that the SIPP data combine information for nine states. Therefore, we assessed the 41 states and the District of Columbia for a total of 42 possible "states" from the SIPP.^{3}
Poverty - The minimum effective sample size necessary to achieve a 95 percent confidence interval width of ±10 percent or less for each sample proportion, p, can be calculated by solving the following formula for the effective sample size n (where P is the population proportion with the characteristic):
$$1.96\sqrt{\frac{P(1-P)}{n}} \le 0.10$$
To convert to the actual sample size, it is necessary to multiply n by the design effect shown in Table 3. For poverty this is 1.3. This leads to a minimum actual number of approximately 70 respondents for the total population, 110 for blacks or Hispanics, 95 for children, and 55 for the elderly. The criteria differ slightly for the SIPP and the CPS. Both are presented in the appendices.
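The calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the report's own procedure: it assumes the normal-approximation interval, so that the effective sample size is n = (z / w)^2 P(1 - P) for half-width w, with z = 1.96 as an assumed multiplier. The function names are hypothetical, and the inputs are the rounded CPS poverty proportions from Table 2 and the poverty design effect from Table 3.

```python
import math


def min_effective_n(p: float, half_width: float = 0.10, z: float = 1.96) -> float:
    """Effective sample size giving a confidence interval of +/- half_width."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return (z / half_width) ** 2 * p * (1.0 - p)


def min_actual_n(p: float, deff: float, half_width: float = 0.10,
                 z: float = 1.96) -> int:
    """Actual sample size: effective size inflated by the design effect."""
    return math.ceil(deff * min_effective_n(p, half_width, z))


DEFF_POVERTY = 1.3  # Table 3

# Rounded national poverty proportions from the 1996 CPS (Table 2).
cps_poverty = {
    "total": 0.15,
    "black": 0.32,
    "hispanic": 0.33,
    "children": 0.24,
    "elderly": 0.11,
}

for group, p in cps_poverty.items():
    print(f"{group:>8}: minimum actual n = {min_actual_n(p, DEFF_POVERTY)}")
```

Because Table 2 reports rounded proportions and the multiplier is an assumption, the results land near, but not exactly on, the approximate minimums quoted in the text.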
From the CPS, every state meets these minima for the total population, children, and the elderly (Table 5). Only 24 of the states have a sufficient sample for blacks and 19 states for Hispanics. From SIPP, every state assessed meets these minima only for the total population. The minima are also met for blacks in 20 states, Hispanics in 7 states, children in 35 states, and the elderly in 32 states.
Health Insurance - The minimum actual sample size necessary to achieve a 95 percent confidence interval width of ±10 percent or less for the percentage receiving employer-provided health insurance is approximately 100 respondents for the total population and for each subpopulation. For the CPS, this is achieved for all states for the total population and children. The minimum is also met for blacks in 25 states, Hispanics in 20 states, and the elderly in 50 states. For the SIPP, this is achieved for all assessed states only for the total population. The minimum is also met for blacks in 20 states, Hispanics in 7 states, children in 34 states, and the elderly in 24 states.
AFDC - The minimum effective sample size necessary to achieve a cv of less than or equal to 30 percent for each proportion, P, can be calculated by solving the following formula for n:
$$\frac{\sqrt{P(1-P)/n}}{P} \le 0.30$$
Note that on AFDC rates near 10 percent, this cv rule results in confidence intervals of ±6 percent. To convert to the actual sample size, it is necessary to multiply n by the design effect shown in Table 3. For AFDC this is 1.2. For the two surveys this leads to a minimum actual number of between 240 (for the SIPP) and 303 (for the CPS) respondents in a state for the total population and between 73 (for blacks from the SIPP) and 133 (for Hispanics from the CPS) for each of the subgroups. AFDC is generally not available to the elderly and therefore that subgroup is not considered for this characteristic.
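The cv rule can be sketched in the same way. Under the assumption that cv is the standard error of the proportion divided by the proportion itself, the rule rearranges to n = (1 - P) / (cv^2 P) for the effective sample size. The function names are hypothetical; the inputs are rounded AFDC proportions from Table 2 and the AFDC design effect from Table 3.

```python
import math


def min_effective_n_cv(p: float, max_cv: float = 0.30) -> float:
    """Effective sample size giving a coefficient of variation of max_cv."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return (1.0 - p) / (max_cv ** 2 * p)


def min_actual_n_cv(p: float, deff: float, max_cv: float = 0.30) -> int:
    """Actual sample size: effective size inflated by the design effect."""
    return math.ceil(deff * min_effective_n_cv(p, max_cv))


DEFF_AFDC = 1.2  # Table 3

# Rounded AFDC proportions from Table 2.
examples = {
    "CPS Hispanic": 0.09,
    "SIPP black": 0.15,
    "SIPP total": 0.05,
}

for label, p in examples.items():
    print(f"{label:>12}: minimum actual n = {min_actual_n_cv(p, DEFF_AFDC)}")
```

The required sample grows quickly as the proportion falls, which is why the total population, with the lowest AFDC rate, needs the largest minimum. The results differ slightly from the figures in the text because Table 2 reports rounded proportions.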
From the CPS, every state meets these minima for the total population and children. Only 28 of the states have a sufficient sample for blacks and 16 states for Hispanics. From SIPP, the minima are met for the total population in 35 of the 42 assessed states, blacks in 20 states, Hispanics in 7 states, and children in 35 states.
Work Disability - The minimum actual sample size necessary to achieve a cv of less than or equal to 30 percent for the proportion with a work disability ranges from 100 to 175 for all populations except the elderly and children. Given that most children under 18 are not in the work force, their proportion with a work disability is very small. Thus, while few states have the necessary more than 1,000 completed interviews with children, it is unlikely that such estimates will be necessary.
Given their relatively high frequency of disability, the necessary number of completes for the elderly is only 30. This number of completes is available from all states for the CPS and 24 states for SIPP. However, the resulting cv of 30 percent yields a confidence interval of 27% ± 16%. To achieve a confidence interval on this estimate that is no wider than ±10 percent would require 76 elderly respondents, a level reached in all CPS states other than Alaska, but only in 9 of the assessed SIPP states.
For the remaining populations, a cv of less than or equal to 30 percent requires from 100 to 175 completes. For the CPS, this is achieved for the total population in all states and for 27 states for blacks and 14 states for Hispanics. For the SIPP, a large enough number of completes for the total population is found in all of the assessed states except New Mexico and the District of Columbia, while it is only achieved for blacks in 20 states, and in 6 states for Hispanics.
It is worth noting that the work disability question on the CPS is being redesigned to correspond with the more extensive disability questions planned for the 2000 Census long form. Work disability will still be asked, but other types of disability will also be captured. Once wording for the new questions is finalized, they could be compared against other sources to predict the proportion with that type of disability and, by using the formulas in this section, to estimate the number of states that would support accurate estimates.


E. Generalization of the Ability to Produce Accurate Direct Estimates at the State Level Using a Single Time Period

By examining the results of the previous section and the distribution of sample sizes across the states, it is possible to make some general comments on the ability to produce accurate state estimates from a single time period's data for the CPS and SIPP. Unfortunately, the lack of data from the redesigned NHIS makes it impossible to make statements about that survey, beyond the fact that for many states the NHIS sample sizes are so small that direct estimates from a single time period would be subject to large variability. This assessment has also not taken into consideration the effect that the lack of state stratification has on SIPP estimates. Research is currently being conducted on how that will affect state estimates. Table 5 summarizes the results found in the previous section. It is important to remember that the actual number of completes in a state is a random variable that will change with each round of data collection. Therefore, the exact numbers shown in Table 5 are only approximations for future survey rounds. This is particularly true for subpopulations. Again, the SIPP data combine information for nine states. Therefore, we assessed the 41 states and the District of Columbia for a total of 42 possible "states."
Given the relatively low precision requirements used in the previous section, it is possible to estimate the proportion of the total population in a state with a characteristic for almost all states from either survey. For the CPS, this is also true for children and, except for Alaska, the elderly. The CPS is only able to support estimates for blacks for about half of the states and 30 to 40 percent of the states for Hispanics, depending upon the measure. Given the smaller sample size of the SIPP, its ability to support such estimates for subpopulations is more limited than the CPS. For children and the elderly, the SIPP can support estimates for the majority of states. For blacks, it can produce estimates that meet these levels of precision for around 20 states and for Hispanics in less than 10 states.
If other characteristics of interest to ASPE are contained in the core CPS interview, it would be possible to increase the sample size in each state significantly by combining data from different months of the survey. (CPS respondents are interviewed in four successive months, then dropped for eight months, then interviewed again for the following four months.) Even when this is true, the respondents in a given state are generally all from just a few primary sampling units (PSUs). This results in state-level standard error estimates that are quite unstable. To estimate the precision of the estimates reliably, it would be necessary to use some form of generalized variance function model that smooths precision estimates derived from the different states.
In terms of specific states, the 1996 CPS permits analyses of all of the selected characteristics for the subgroups examined at the specified precision criteria for eight states: California, Florida, Illinois, Massachusetts, New Jersey, New York, Pennsylvania, and Texas. The SIPP permits analyses for six states: California, Florida, Illinois, New Jersey, New York, and Texas. The binding constraint for the data for a number of states is the sample size for Hispanics. If the selected characteristics for Hispanics are not included in assessing which states meet all of the criteria, 16 states are added for the CPS and three states are added for the SIPP. For the SIPP, work disability among those aged 65 to 69 also caused several states to fail to meet all of the criteria. Table 6 and the two maps (Exhibit 1 and Exhibit 2) provide summary information regarding the number of criteria met for the states. The appendices provide state-level detail for each of the selected characteristics and criteria.
It is important to repeat that the precision requirements used in Table 5 and Table 6 are quite arbitrary. If narrower confidence intervals are desired, the number of states meeting the cutoff will obviously be reduced.
TABLE 1. Percent Living in Poverty and Actual Sample Size by State, March 1996 CPS

| State | Percent in Poverty: Black | Percent in Poverty: Hispanic | Percent in Poverty: Other | Percent in Poverty: Total | Sample Size: Black | Sample Size: Hispanic | Sample Size: Other | Sample Size: Total |
|---|---|---|---|---|---|---|---|---|
| Alabama | 41% | 25% | 11% | 21% | 507 | 23 | 1,190 | 1,720 |
| Alaska | 21% | 8% | 7% | 8% | 64 | 48 | 1,405 | 1,517 |
| Arizona | 49% | 31% | 12% | 18% | 64 | 747 | 1,325 | 2,136 |
| Arkansas | 38% | 32% | 12% | 17% | 254 | 23 | 1,483 | 1,760 |
| California | 30% | 34% | 11% | 19% | 677 | 5,601 | 6,626 | 12,904 |
| Colorado | 28% | 25% | 7% | 10% | 55 | 311 | 1,418 | 1,784 |
| Connecticut | 32% | 48% | 4% | 11% | 102 | 187 | 1,016 | 1,305 |
| Delaware | 16% | 28% | 10% | 11% | 208 | 61 | 982 | 1,251 |
| District of Columbia | 31% | 21% | 9% | 25% | 761 | 80 | 320 | 1,161 |
| Florida | 36% | 29% | 11% | 18% | 772 | 1,599 | 4,169 | 6,540 |
| Georgia | 23% | 20% | 9% | 14% | 593 | 63 | 1,432 | 2,088 |
| Hawaii | 6% | 17% | 14% | 13% | 44 | 52 | 1,286 | 1,382 |
| Idaho | 0% | 43% | 13% | 15% | 12 | 207 | 1,623 | 1,842 |
| Illinois | 41% | 19% | 8% | 14% | 785 | 767 | 3,806 | 5,358 |
| Indiana | 23% | 17% | 10% | 11% | 91 | 46 | 1,461 | 1,598 |
| Iowa | 28% | 11% | 12% | 13% | 41 | 40 | 1,577 | 1,658 |
| Kansas | 21% | 26% | 11% | 12% | 99 | 89 | 1,447 | 1,635 |
| Kentucky | 44% | 8% | 14% | 16% | 108 | 20 | 1,465 | 1,593 |
| Louisiana | 41% | 28% | 13% | 22% | 458 | 45 | 1,152 | 1,655 |
| Maine | 0% | 31% | 12% | 12% | 3 | 7 | 1,278 | 1,288 |
| Maryland | 24% | 26% | 6% | 12% | 369 | 68 | 1,049 | 1,486 |
| Massachusetts | 31% | 49% | 8% | 11% | 168 | 215 | 2,498 | 2,881 |
| Michigan | 34% | 32% | 9% | 13% | 542 | 126 | 3,663 | 4,331 |
| Minnesota | 33% | 39% | 8% | 10% | 41 | 60 | 1,678 | 1,779 |
| Mississippi | 43% | 50% | 14% | 26% | 621 | 19 | 977 | 1,617 |
| Missouri | 26% | 16% | 10% | 12% | 128 | 40 | 1,316 | 1,484 |
| Montana | 32% | 30% | 16% | 16% | 6 | 41 | 1,660 | 1,707 |
| Nebraska | 31% | 28% | 10% | 11% | 51 | 89 | 1,537 | 1,677 |
| Nevada | 31% | 29% | 8% | 13% | 78 | 291 | 1,110 | 1,479 |
| New Hampshire | 0% | 18% | 6% | 6% | 7 | 20 | 1,202 | 1,229 |
| New Jersey | 18% | 27% | 6% | 9% | 363 | 677 | 2,965 | 4,005 |
| New Mexico | 37% | 35% | 22% | 27% | 24 | 1,206 | 1,137 | 2,367 |
| New York | 35% | 41% | 9% | 18% | 1,128 | 1,907 | 5,781 | 8,816 |
| North Carolina | 31% | 39% | 9% | 14% | 575 | 85 | 2,256 | 2,916 |
| North Dakota | 0% | 21% | 13% | 13% | 2 | 22 | 1,535 | 1,559 |
| Ohio | 33% | 25% | 10% | 13% | 530 | 104 | 4,040 | 4,674 |
| Oklahoma | 44% | 24% | 15% | 18% | 142 | 75 | 1,614 | 1,831 |
| Oregon | 14% | 32% | 11% | 13% | 16 | 138 | 1,455 | 1,609 |
| Pennsylvania | 39% | 36% | 10% | 13% | 526 | 214 | 4,673 | 5,413 |
| Rhode Island | 21% | 35% | 10% | 12% | 43 | 130 | 1,156 | 1,329 |
| South Carolina | 41% | 50% | 11% | 21% | 445 | 16 | 911 | 1,372 |
| South Dakota | 42% | 7% | 15% | 15% | 14 | 13 | 1,748 | 1,775 |
| Tennessee | 29% | 25% | 15% | 18% | 273 | 23 | 1,306 | 1,602 |
| Texas | 24% | 36% | 10% | 19% | 553 | 3,209 | 3,721 | 7,483 |
| Utah | 22% | 42% | 7% | 10% | 11 | 188 | 1,718 | 1,917 |
| Vermont | 0% | 51% | 11% | 11% | 3 | 10 | 1,261 | 1,274 |
| Virginia | 14% | 16% | 11% | 12% | 355 | 80 | 1,375 | 1,810 |
| Washington | 19% | 33% | 13% | 14% | 46 | 85 | 1,467 | 1,598 |
| West Virginia | 55% | 0% | 18% | 18% | 27 | 14 | 1,683 | 1,724 |
| Wisconsin | 49% | 48% | 8% | 11% | 92 | 49 | 1,769 | 1,910 |
| Wyoming | 6% | 4% | 11% | 13% | 16 | 131 | 1,500 | 1,647 |
| United States | 32% | 33% | 10% | 15% | 12,893 | 19,361 | 98,222 | 130,476 |

TABLE 2. National Estimates of the Proportions with each Characteristic Based on the CPS and the SIPP

| Survey | Variable | Total (%) | Black (%) | Hispanic (%) | Children (%) | Elderly (%) |
|---|---|---|---|---|---|---|
| 1996 CPS | Income below poverty | 15 | 32 | 33 | 24 | 11 |
| 1996 CPS | Receiving AFDC | 4 | 13 | 9 | 11 | 0 |
| 1996 CPS | Employer-provided health insurance | 60 | 45 | 39 | 59 | 35 |
| 1996 CPS | Work disability | 8 | 10 | 6 | 0 | 27 |
| 1993 SIPP | Income below poverty | 17 | 34 | 35 | 26 | 13 |
| 1993 SIPP | Receiving AFDC | 5 | 15 | 13 | 12 | 0 |
| 1993 SIPP | Employer-provided health insurance | 59 | 46 | 39 | 56 | 39 |
| 1993 SIPP | Work disability | 7 | 9 | 6 | 1 | 27 |

Note: The SIPP only asks work disability questions of individuals under age 70. Therefore, for the percentage of elderly with a work-related disability, these estimates reflect only those between the ages of 65 and 69.
TABLE 3. Estimated State-Level Design Effects for the CPS and SIPP

| Characteristic | Design Effect |
|---|---|
| Income below poverty | 1.3 |
| Receiving AFDC | 1.2 |
| Employer-provided health insurance | 1.1 |
| Work disability | 1.0 |

TABLE 4. Minimum and Maximum State Sample Sizes for Populations of Interest from the 1996 CPS and 1993 SIPP

| Survey | Statistic | Total | Black | Hispanic | Children | Elderly |
|---|---|---|---|---|---|---|
| CPS | Minimum | 1,161 (DC) | 2 (ND) | 7 (ME) | 276 (DC) | 59 (AK) |
| CPS | Maximum | 12,904 (CA) | 1,128 (NY) | 5,601 (CA) | 4,046 (CA) | 1,212 (CA) |
| SIPP | Minimum | 104 (DC) | 0 (*) | 0 (*) | 25 (DC) | 14 (*) |
| SIPP | Maximum | 6,454 (CA) | 435 (TX) | 1,752 (CA) | 1,990 (CA) | 685 (CA) |

* Multiple states.

TABLE 5. Number of States with the Sufficient Number of Completes to Provide Estimates of the Desired Level of Precision for Four Characteristics from the 1996 CPS and 1993 SIPP

| Survey | Variable | Total | Black | Hispanic | Children | Elderly |
|---|---|---|---|---|---|---|
| CPS | Income below poverty | 51 | 24 | 19 | 51 | 51 |
| CPS | Receiving AFDC | 51 | 28 | 16 | 51 | N/A |
| CPS | Employer-provided health insurance | 51 | 25 | 20 | 51 | 50 |
| CPS | Work disability | 51 | 27 | 14 | N/A | 50 |
| CPS | # of states meeting criteria for all 4 characteristics | 51 | 24 | 14 | N/A | N/A |
| CPS | # of states meeting criteria for only 3 characteristics | - | 1 | 2 | 51 | 50 |
| CPS | # of states meeting criteria for only 2 characteristics | - | 2 | 3 | - | - |
| CPS | # of states meeting criteria for only 1 characteristic | - | 1 | 1 | - | 1 |
| CPS | # of states meeting criteria for no characteristics | 11 | 23 | 31 | - | - |
| SIPP^{a} | Income below poverty | 42 | 20 | 7 | 35 | 32 |
| SIPP^{a} | Receiving AFDC | 35 | 20 | 7 | 35 | N/A |
| SIPP^{a} | Employer-provided health insurance | 42 | 20 | 7 | 34 | 24 |
| SIPP^{a} | Work disability | 40 | 20 | 6 | N/A | 9^{b} |
| SIPP^{a} | # of states meeting criteria for all 4 characteristics | 35 | 20 | 6 | N/A | N/A |
| SIPP^{a} | # of states meeting criteria for only 3 characteristics | 5 | - | 1 | 34 | 9 |
| SIPP^{a} | # of states meeting criteria for only 2 characteristics | 2 | - | - | 1 | 15 |
| SIPP^{a} | # of states meeting criteria for only 1 characteristic | - | - | - | - | 8 |
| SIPP^{a} | # of states meeting criteria for no characteristics | - | 22 | 35 | 7 | 10 |

a. SIPP does not provide separate state identifiers for nine states. Therefore the maximum number of states that could meet the desired criteria is 42.
b. The SIPP only provides a measure of work disability among the elderly for persons age 65 to 69. Therefore, we evaluated this variable in SIPP only for these ages. The criterion used was a confidence interval of 95 percent.
TABLE 6. Number of Selected Characteristics and Subgroup Combinations States Meet^{a}

| State | March 1996 CPS: All Groups (Max=18) | March 1996 CPS: Excluding Hispanics (Max=14) | 1993 SIPP^{b}: All Groups (Max=18) | 1993 SIPP^{b}: Excluding Hispanics (Max=14) |
|---|---|---|---|---|
| Alabama | 14 | 14 | 12 | 12 |
| Alaska | 8 | 8 | NA | NA |
| Arizona | 14 | 10 | 12 | 9 |
| Arkansas | 14 | 14 | 7 | 7 |
| California | 18 | 14 | 18 | 14 |
| Colorado | 14 | 10 | 7 | 7 |
| Connecticut | 15 | 12 | 9 | 9 |
| Delaware | 14 | 14 | 3 | 3 |
| District of Columbia | 14 | 14 | 2 | 2 |
| Florida | 18 | 14 | 18 | 14 |
| Georgia | 14 | 14 | 13 | 13 |
| Hawaii | 10 | 10 | 3 | 3 |
| Idaho | 14 | 10 | NA | NA |
| Illinois | 18 | 14 | 18 | 14 |
| Indiana | 10 | 10 | 13 | 13 |
| Iowa | 10 | 10 | NA | NA |
| Kansas | 12 | 12 | 9 | 9 |
| Kentucky | 13 | 13 | 8 | 8 |
| Louisiana | 14 | 14 | 12 | 12 |
| Maine | 10 | 10 | NA | NA |
| Maryland | 14 | 14 | 13 | 13 |
| Massachusetts | 18 | 14 | 9 | 9 |
| Michigan | 16 | 14 | 13 | 13 |
| Minnesota | 10 | 10 | 9 | 9 |
| Mississippi | 14 | 14 | 12 | 12 |
| Missouri | 14 | 14 | 13 | 13 |
| Montana | 10 | 10 | NA | NA |
| Nebraska | 10 | 10 | 7 | 7 |
| Nevada | 14 | 10 | 3 | 3 |
| New Hampshire | 10 | 10 | 3 | 3 |
| New Jersey | 18 | 14 | 18 | 14 |
| New Mexico | 14 | 10 | 2 | 2 |
| New York | 18 | 14 | 18 | 14 |
| North Carolina | 14 | 14 | 14 | 14 |
| North Dakota | 10 | 10 | NA | NA |
| Ohio | 15 | 14 | 14 | 14 |
| Oklahoma | 14 | 14 | 9 | 9 |
| Oregon | 13 | 10 | 9 | 9 |
| Pennsylvania | 18 | 14 | 14 | 14 |
| Rhode Island | 12 | 10 | 3 | 3 |
| South Carolina | 14 | 14 | 12 | 12 |
| South Dakota | 10 | 10 | NA | NA |
| Tennessee | 14 | 14 | 13 | 13 |
| Texas | 18 | 14 | 18 | 14 |
| Utah | 14 | 10 | 7 | 7 |
| Vermont | 10 | 10 | NA | NA |
| Virginia | 14 | 14 | 13 | 13 |
| Washington | 10 | 10 | 9 | 9 |
| West Virginia | 10 | 10 | 8 | 8 |
| Wisconsin | 11 | 11 | 8 | 8 |
| Wyoming | 12 | 10 | NA | NA |

a. The nine states for which the SIPP does not provide individual identifiers are: Alaska, Idaho, Iowa, Maine, Montana, North Dakota, South Dakota, Vermont, and Wyoming.
b. The maximum number of combinations for all groups evaluated is 18 (three characteristics each for the elderly and children and four characteristics each for the total population, blacks and Hispanics). Removing Hispanics results in 14 combinations.
EXHIBIT 1. Number of Estimates for Each State from CPS Data
EXHIBIT 2. Number of Estimates for Each State from SIPP Data
