
Assessment of Major Federal Data Sets for Analyses of Hispanic and Asian or Pacific Islander Subgroups and Native Americans:
Extending the Utility of Federal Data Bases

3. Ability of Current Surveys to Provide Data with Adequate Precision

Contents

  1. Standards for Precision
  2. Effective Sample Sizes Required to Meet Precision Levels
  3. Survey Design Effects
  4. Nominal and Effective Sample Sizes
  5. Surveys and Race/Ethnicity Groups Meeting Standards for Precision

3.1  Standards for Precision

Sections 2.1 and 2.2 discussed ways of focusing on the precision levels required for various analytic uses. Since most U.S. Government surveys cover a broad array of data items, no single standard of precision is likely to satisfy all potential uses of the data, and some compromises are necessary. It is particularly difficult to create standards of precision for a group of unrelated surveys whose specific analyses have yet to be developed. Under these circumstances, it seems sensible to use the standards for "generic" prevalence rates that were established for the study of the feasibility of producing state data from the NHIS. However, we reiterate our caveat that the standards may not satisfy all requirements; if new and critically important needs for statistical information on the subpopulations arise, the standards should be reviewed.

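The precision levels that were examined for NHIS state-level generic estimates were described in Section 2.2. We repeat them here: coefficients of variation (CVs) of 30, 20, and 10 percent, each applied to prevalence rates of .01, .05, .10, .15, and .20.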

To make these prevalence rates less abstract, we show some examples reported in recent U.S. Government-sponsored surveys:

Percent of U.S. population age 15 and over with earnings under $10,000 in 1996: 24.9% (1)
U.S. poverty rate in 1998: 12.7% (2)
Percent of U.S. population without health insurance coverage during all of 1998: 16.3% (3)
Percent of persons of Hispanic origin without health insurance in 1998: 35.3% (3)
Cocaine use by adults employed full time: 0.7% (4)
Cocaine use by adults employed part time: 0.9% (4)
Cocaine use by unemployed adults: 2.4% (4)
1 Source: March 1997 CPS; U.S. Census Bureau Report P60, No. 206.
2 Source: March 1999 CPS; U.S. Census Bureau Report P60, No. 207.
3 Source: March 1999 CPS; U.S. Census Bureau Report P60, No. 208.
4 Source: National Household Survey on Drug Abuse.

It may be useful to convert the generic coefficients of variation to standard errors and confidence intervals for a clearer view of the effects of the sampling errors on the statistics. They are shown below.
Prevalence rate   Standard error   68% confidence interval   95% confidence interval
30% CV
.01               .003             .007-.013                 .004-.016
.05               .015             .035-.065                 .020-.080
.10               .030             .070-.130                 .040-.160
.15               .045             .105-.195                 .060-.240
.20               .060             .140-.260                 .080-.320
20% CV
.01               .002             .008-.012                 .006-.014
.05               .010             .040-.060                 .030-.070
.10               .020             .080-.120                 .060-.140
.15               .030             .120-.180                 .090-.210
.20               .040             .160-.240                 .120-.280
10% CV
.01               .001             .009-.011                 .008-.012
.05               .005             .045-.055                 .040-.060
.10               .010             .090-.110                 .080-.120
.15               .015             .135-.165                 .120-.180
.20               .020             .180-.220                 .160-.240
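
The conversion in the table above can be sketched in a few lines of Python. This is only an illustration: the function name `precision_summary` is hypothetical, and the intervals are assumed to be the estimate plus or minus one and two standard errors, which is what the tabulated values imply.

```python
def precision_summary(p, cv):
    """Return (standard error, +/-1 SE interval, +/-2 SE interval) for prevalence p.

    Assumes SE = CV * p and intervals of one and two standard errors
    around the estimate (an inference from the tabulated values).
    """
    se = cv * p
    one_se = (p - se, p + se)
    two_se = (p - 2 * se, p + 2 * se)
    return se, one_se, two_se


# Print one line per CV level and prevalence rate used in this section.
for cv in (0.30, 0.20, 0.10):
    for p in (0.01, 0.05, 0.10, 0.15, 0.20):
        se, (lo1, hi1), (lo2, hi2) = precision_summary(p, cv)
        print(f"CV={cv:.2f}  p={p:.2f}  SE={se:.3f}  "
              f"{lo1:.3f}-{hi1:.3f}  {lo2:.3f}-{hi2:.3f}")
```

For a prevalence rate of .10 and a 30 percent CV, for instance, the standard error is .030 and the two intervals are .070 to .130 and .040 to .160.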

Section 2.2 also notes the types of variables that are likely to have design effects beyond the average values used in this report. We have left them out of our discussion because we use a single average design effect (or, in a few cases, two) for each survey covered in this report.

Our examination assumes the prevalence rates are applied to the total of all persons in the subpopulation. In practice, analyses are frequently desired for subsets, e.g., adults or children, each sex separately, families rather than persons, persons below the poverty level, etc. Examining all possible uses of data would lead to such a wide variety of possibilities that no clear-cut decision could be made, and it seems sensible to restrict the alternatives. Basically, if subset analysis is considered of crucial importance for a survey, the sample size implied by each precision level can be thought of as applying to the subset, and the implications for the total sample for the survey can be calculated. For example, if a 30 percent CV requires a sample of 200 persons, then a sample of 400 persons is necessary if the same CV is desired for males and females separately.

Some examples of subsets and their associated prevalence rates are shown below.
Males 15 years and over, percent with income under $10,000: 17.2% (1)
Hispanic males 15 years and over, percent with income under $10,000: 21.4% (1)
Males 15-24 years of age, percent with income under $10,000: 40.8% (1)
Hispanic low-income persons (under 125% of poverty), percent without health insurance: 44.0% (2)
Hispanic children under 18 years of age, percent without health insurance: 30.0% (2)
Persons 65 years and over, percent without health insurance: 1.1% (2)
Hispanic males 65 years and over, percent with income below 50 percent of poverty: 4.7% (3)
1 Source: U.S. Census Bureau Report P60, No. 206.
2 Source: U.S. Census Bureau Report P60, No. 208.
3 Source: U.S. Census Bureau Report P60, No. 207.

3.2  Effective Sample Sizes Required to Meet Precision Levels

The effective sample sizes needed to provide the precision levels for the various prevalence rates are shown in Table 3-1. They were derived from the standard formula for simple random sampling,

$$n = \frac{1 - p}{p\,(\mathrm{CV})^2},$$

where n is the effective sample size, p is the prevalence rate, and CV is the coefficient of variation.
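
As a brief sketch of where this formula comes from: under simple random sampling, and ignoring any finite population correction (an assumption made here for simplicity), the standard error of an estimated proportion and its coefficient of variation are related as follows. The symbol \hat{p} for the sample estimate is notation introduced for this illustration.

```latex
\mathrm{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{n}}, \qquad
\mathrm{CV} = \frac{\mathrm{SE}(\hat{p})}{p} = \sqrt{\frac{1-p}{n\,p}}
\quad\Longrightarrow\quad
n = \frac{1-p}{p\,(\mathrm{CV})^2}.
```

For example, with p = .10 and CV = .20, n = .90 / (.10 × .04) = 225, which agrees with the corresponding entry in Table 3-1.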

Table 3-1.
Effective sample size needed for alternative prevalence rates and levels of precision
Prevalence rate   Precision level (CV)
                  .30      .20      .10
0.01              1,100    2,475    9,900
0.05              211      475      1,900
0.10              100      225      900
0.15              63       142      567
0.20              44       100      400
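
The entries in Table 3-1 can be reproduced with a short Python sketch of the simple random sampling formula. The function name `required_effective_n` is hypothetical, and rounding to the nearest whole number is an assumption chosen because it matches the tabulated values; a more conservative design would round up.

```python
def required_effective_n(p, cv):
    """Effective sample size needed to estimate prevalence p with the given CV,
    using n = (1 - p) / (p * CV^2) for simple random sampling."""
    return (1 - p) / (p * cv ** 2)


# One row per prevalence rate, one column per CV level, as in Table 3-1.
for p in (0.01, 0.05, 0.10, 0.15, 0.20):
    row = [round(required_effective_n(p, cv)) for cv in (0.30, 0.20, 0.10)]
    print(f"{p:.2f}", row)
```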

3.3 Survey Design Effects

Section 2.4 of this report briefly described features of sample designs that affect the sampling errors and thus contribute to design effects. It was also noted that design effects can differ among statistics gathered in a survey, sometimes dramatically. As a basis for decision-making, we have chosen to use an average design effect for each survey, one that is approximately midway between the high and low values. In a few cases, we have indicated an additional design effect that applies to specific race/ethnic groups. However, an analyst who is concerned with a specific subject in a survey might prefer to use a different design effect that is more appropriate to the items to be studied. This report cannot take into account all possible analyses that could be carried out. We have tried to include enough information to permit modifications of the results for special subpopulations or items.

The design effects shown in Table 3-2 come from a number of sources (see Appendix A). Wherever possible, we have used published reports. These reports usually do not show design effects, as such, but the information on sampling errors makes it possible to calculate average design effects. When published reports with the required information were not available, the design effects were estimated from the descriptions of the sample design or through discussions with statisticians at the agencies sponsoring the surveys.

3.4  Nominal and Effective Sample Sizes

Tables 3-3, 3-4, and 3-5 show the sample sizes for all of the race/ethnic subgroups; they are the same numbers reported in Tables A-1 to A-3 of the Task 2 report. As noted, these data are approximations of the number of sample cases for each subpopulation. They were obtained from published reports of the Federal agencies sponsoring the surveys, provided directly by the agencies, or derived by Westat. We refer to these numbers as the "nominal sample sizes" to distinguish them from the effective sample sizes. We have included all race/ethnic subgroups, including those that are not currently identified in the data set. The sources used to provide estimates of design effects are shown in Appendix A.

Table 3-2.
Average design effects(1) for minorities
Survey  Average design effect
Census
    Census 2000 1.0
    ACS 1.0
    CPS(2) 1.5
    SIPP – Hispanic 2.4
    SIPP – API and American Indians or Alaska Natives 1.6
NCHS/CDC
    NHIS – Hispanics and American Indians or Alaska Natives 1.5
    NHIS – API 1.3
    NSFG – Hispanics and American Indians or Alaska Natives 1.7
    NSFG – API 1.4
    NIS 1.3
    NHANES(3) – Mexican-American 2.2
    NHANES(3) – Other minorities 1.8
AHRQ
    MEPS – Hispanic 1.2
    MEPS – API and American Indians or Alaska Natives 2.1
HCFA
    MCBS 1.1
SAMHSA
    NHSDA 2.2
NCES
    NHES(4) 1.4
    ECLS-B 1.2
    ECLS-K(5) 2.5
1 Most of the surveys are based on household samples. The design effects apply to statistics that do not cluster strongly within households, e.g., health conditions, educational attainment, and labor force status. Items like poverty status, availability of health insurance, urban–rural residence, etc. generally are identical for all members of a household, and the design effects for such items are much larger, usually two to three times the ones shown in the table.
2 The design effects are approximately the same for the March CPS and for other months.
3 The design effects are those of statistics on data for the total of each race/ethnic group. Design effects for individual age–sex groups are lower.
4 The design effects shown apply to statistics on children who constitute the main focus of NHES. Data for adults are sometimes included in the survey, and they are subject to higher design effects.
5 The design effect shown, 2.5, applies to most social, economic, and related items. The design effect for test scores is about 5.

Many of the U.S. Government surveys are recurring; that is, they are carried out every year, several times a year, or, as in the case of CPS, every month. In most cases, the sample sizes shown in this report describe the annual sample as it was in the time period noted. The reader should be aware that sample sizes are sometimes changed because of budgetary restrictions or other causes. Before analyzing a data set, it would be useful to ascertain whether there is an important difference in the sample design between the time period analyzed and the reference date shown in Section 1.2. If so, the sample sizes should be modified accordingly. There are a few cases in which there may be some ambiguity in the sample size. A brief discussion of these cases follows:

Table 3-3.
Approximations of Hispanic sample cases in the data set
Data set  Total Hispanic  Mexican-Americans  Puerto Ricans  Cubans  Central or South American  Other Hispanic
Census
   Census 2000(1) 4,508,000 2,850,000 475,000 190,000 650,000 335,000
   ACS 900,000 570,000 95,000 38,000 130,000 67,000
   CPS-March 11,260 6,940 1,190 470 1,685 975
   CPS-Monthly 5,635 3,470 595 235 845 490
   SIPP 10,845 7,181 1,172 372 1,306 814
NCHS/CDC
   NHIS 22,145 13,869 2,353 1,165 2,093 4,758
   NSFG 2,097 1,330 221 88 302 156
   NIS 4,852 3,529 398 99 526 300
   NHANES 1,582 1,500 24 10 32 16
AHRQ
   MEPS 5,375 3,650 600 225 766 134
HCFA
   MCBS 464 254 42 67 52 50
SAMHSA
   NHSDA 5,000 3,170 527 211 721 372
NCES
   NHES 18,804 13,675 1,541 385 2,040 1,162
   ECLS-B 1,979 1,367 160 35 137 280
   ECLS-K 2,957 2,150 242 61 321 183
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

Table 3-4.
Approximations of Asian and Pacific Islander sample cases in the data set
Data set  Total Asian and PI  Chinese  Filipinos  Japanese  Asian Indian  Korean  Vietnamese  Hawaiian  Other
Census
    Census 2000(1) 1,580,000 375,000 300,000 180,000 175,000 175,000 135,000 45,000 195,000
    ACS 316,000 75,000 60,000 36,000 35,000 35,000 27,000 9,000 39,000
    CPS-March 4,555 995 850 515 495 485 375 125 565
    CPS-Monthly 4,555 995 850 515 495 485 375 125 565
    SIPP 3,293 745 637 386 370 362 280 95 421
NCHS/CDC
    NHIS 3,284 755 647 356 320 342 356 112 396
    NSFG 327 74 63 38 37 36 28 9 42
    NIS 1,172 265 227 137 131 129 100 33 150
    NHANES 113 27 22 13 12 12 10 3 14
AHRQ
    MEPS 750 152 170 62 111 96 45 17 97
HCFA
    MCBS 151 34 29 18 17 17 13 4 19
SAMHSA
    NHSDA 700 158 135 82 78 77 59 20 90
NCES
    NHES 4,420 999 855 517 495 486 376 128 566
    ECLS-B 2,483 705 467 134 282 278 217 74 325
    ECLS-K 1,870 423 362 219 209 206 159 54 239
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

Table 3-5.
Approximations of American Indian or Alaska Native sample cases in the data set
Data set American Indian and Alaska Native
Census
    Census 2000(1) 330,000
    ACS 67,000
    CPS-March 1,600
    CPS-Monthly 1,350
    SIPP 1,200
NCHS/CDC
    NHIS 978
    NSFG 77
    NIS 460
    NHANES 24
AHRQ
    MEPS 375
HCFA  
    MCBS 25
SAMHSA
    NHSDA 166
NCES
    NHES 1,675
    ECLS-B 50
    ECLS-K 364
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

The effective sample sizes are simply the nominal sample sizes divided by the design effects. They are shown in Tables 3-6 to 3-8. The effective sample sizes will be used to identify data sets that satisfy minimum standards of reliability.
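
As a minimal sketch of this calculation, the following Python snippet divides a nominal sample size by the survey's average design effect. The function name `effective_sample_size` is hypothetical; the inputs are taken from Tables 3-2 and 3-3.

```python
def effective_sample_size(nominal_n, design_effect):
    """Effective sample size = nominal sample size / design effect."""
    return nominal_n / design_effect


# Total Hispanic sample cases (Table 3-3) and average design effects (Table 3-2).
examples = {
    "CPS-March": (11260, 1.5),
    "SIPP": (10845, 2.4),
    "NHSDA": (5000, 2.2),
}
for survey, (nominal_n, deff) in examples.items():
    print(survey, round(effective_sample_size(nominal_n, deff)))
```

These three cases give 7,507, 4,519, and 2,273, which agree with the Total Hispanic column of Table 3-6.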

Table 3-6.
Effective sample sizes for Hispanics
Data set  Total Hispanic  Mexican-American  Puerto Rican  Cuban  Central or South American  Other Hispanic
Census
    Census 2000(1) 4,508,000 2,850,000 475,000 190,000 650,000 335,000
    ACS 900,000 570,000 95,000 38,000 130,000 67,000
    CPS-March 7,507 4,627 793 313 1,123 650
    CPS-monthly 3,757 2,313 397 157 563 327
    SIPP 4,519 2,992 488 155 544 339
NCHS/CDC
    NHIS 14,763 9,246 1,569 777 2,093 1,079
    NSFG 1,234 782 130 52 178 92
    NIS 3,732 2,715 306 76 405 231
    NHANES 727 682 12 6 18 9
AHRQ
    MEPS 4,479 3,042 500 188 637 112
HCFA
    MCBS 422 231 38 61 47 45
SAMHSA
    NHSDA 2,273 1,441 240 96 328 169
NCES
    NHES 13,431 9,768 1,101 275 1,457 266
    ECLS-B 1,649 1,139 133 29 114 233
    ECLS-K 1,183 860 97 24 128 73
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

Table 3-7.
Effective sample sizes for API
Data set  Total Asian and PI  Chinese  Filipinos  Japanese  Asian Indian  Korean  Vietnamese  Hawaiian  Other
Census
    Census 2000(1) 1,580,000 375,000 300,000 180,000 175,000 175,000 135,000 45,000 195,000
    ACS 316,000 75,000 60,000 36,000 35,000 35,000 27,000 9,000 39,000
    CPS-March 3,037 663 567 343 330 323 250 83 377
    CPS-Monthly 3,037 663 567 343 330 323 250 83 377
    SIPP 2,058 466 398 241 231 226 175 59 263
NCHS/CDC
    NHIS 2,433 559 479 264 237 253 264 83 293
    NSFG 234 53 45 27 26 26 20 6 30
    NIS 902 204 175 105 101 99 77 26 115
    NHANES 63 15 12 7 7 7 5 2 8
AHRQ
    MEPS 357 72 81 30 53 46 21 8 46
HCFA
    MCBS 137 31 26 16 15 15 12 4 17
SAMHSA
    NHSDA 318 72 61 37 35 35 27 9 41
NCES
    NHES 3,157 714 611 369 354 347 269 91 404
    ECLS-B 2,069 588 389 112 235 232 181 62 271
    ECLS-K 748 169 145 88 84 82 64 22 96
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

Table 3-8.
Effective sample sizes for American Indians or Alaska Natives
Data set Effective sample size
Census
    Census 2000(1) 330,000
    ACS 67,000
    CPS-March 1,067
    CPS-Monthly 1,067
    SIPP 1,000
NCHS/CDC
    NHIS 652
    NSFG 45
    NIS 354
    NHANES 12
AHRQ
    MEPS 179
HCFA
    MCBS 23
SAMHSA
    NHSDA 75
NCES
    NHES 1,196
    ECLS-B 148
    ECLS-K 146
1 Long form data

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The Task 2 descriptions of the respective data sets note the appropriate population coverage. The sample sizes are the number of sample persons in each subgroup, including those that are not identified in the data file.

3.5  Surveys and Race/Ethnicity Groups Meeting Standards for Precision

A comparison of the effective sample sizes in Tables 3-6 to 3-8 with the numbers needed to meet the alternative levels of precision shown in Table 3-1 indicates which race/ethnic subgroups meet these standards in each of the surveys.

We reiterate the caveats mentioned earlier in the discussion of these standards. First, the sample sizes in Table 3-1 will provide the indicated coefficient of variation for an estimate of prevalence based on the total population in the race/ethnic subgroup (or on the total target population of the survey, e.g., females 15-44 for NSFG or persons 65 or older for MCBS). If the contemplated analysis includes examining subsets of the total, such as individual age groups, urban-rural residence, or low-income vs. higher-income persons, much larger sample sizes are needed; essentially, each subset would require approximately the sample sizes shown in Table 3-1. Since the specific studies to be carried out have not yet been developed, this report does not contain a provision for subset analysis, but the possible need for such statistical breakdowns and their implications should be kept in mind.

Most of the surveys use an identical sampling rate for all persons in each race/ethnic group. In these surveys, the sample size for any subset can be estimated by applying the subset's share of the relevant population to the sample. For example, for analysis of data by gender, the male (and female) sample will be equal to about one-half the total sample. Similarly, for an age group containing about 20 percent of the relevant population, the sample will be 20 percent of the total sample in the race/ethnic subgroup. Similar relationships hold for other subsets, such as regional breakdowns, income classes, etc. For subset analyses, the nominal and effective sample sizes in the tables should be adjusted to reflect the portion of the subgroup to be analyzed.

There are a few exceptions to the use of a common sampling rate for all members of a subgroup. NHANES focuses on 52 age-sex-race/ethnicity subsets, and uses approximately the same sample sizes for each. The 52 groups are described in several reports on the methodology of NHANES, and analysts concerned with subsets of the race/ethnicity subgroups should refer to the NHANES publications for appropriate methods of estimating the sample sizes. SIPP oversamples persons in poverty. For subset analyses comprising persons in poverty (or items correlated with poverty), the analyst should obtain a description of the current SIPP sample and use it to estimate the sample size.

A second caveat is that the design effects in Table 3-2, which were inputs to the calculation of the effective sample sizes, basically apply to data that are not heavily clustered within households. Examples of statistics that are not clustered, or only moderately clustered, are smoking status, presence of specific chronic illnesses such as hypertension or arthritis, occupation, and very large expenditures for medical care during the year. For such items, members of a household are unlikely to have the same characteristics. On the other hand, as indicated in footnote 1 of Table 3-2, items such as poverty status, health insurance, urban-rural residence, etc. tend to be identical for all members of a household, and the design effects are usually two to three times as large as those in Table 3-2. Other examples of items with high clustering effects are mobility status, whether or not foreign born, and income class. Such items will tend to be identically reported within a household, so that obtaining the statistics from all members of a household is no more useful than an interview with only one household member. In such instances, the design effect is increased by a factor equal to the average household size, that is, by a factor of about 3.5 for Asian and Pacific Islanders, 4.3 for American Indians and Alaska Natives, and 3.6 for Hispanics. The average household size (and consequently the design effects) can differ among the subgroups that are the focus of this report. For example, the average household size for Hispanic subgroups varies from a low of 2.6 for Cubans to 3.9 for Mexicans. An analyst should check the household sizes of the subgroups to be studied if highly clustered items are important variables, and modify the design effects accordingly. An alternate way of accomplishing the same goal for highly clustered items is to treat the sample size as the number of households in the sample rather than the number of persons. The nominal and effective sample sizes in the various tables should then be divided by the average household size.
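
The two equivalent adjustments described above for highly clustered items can be sketched as follows. The function name `clustered_effective_n` is hypothetical, and the sketch assumes the stronger of the two rules of thumb in the text, namely that the design effect is multiplied by the full average household size; the "two to three times" figure would give a somewhat larger effective sample.

```python
def clustered_effective_n(nominal_n, design_effect, avg_household_size):
    """Effective sample size for an item that is identical within households.

    Assumes the design effect is inflated by the average household size,
    which is the same as counting households instead of persons.
    """
    adjusted_deff = design_effect * avg_household_size
    return nominal_n / adjusted_deff


# CPS-March, total Hispanic: 11,260 persons, design effect 1.5,
# average Hispanic household size of about 3.6 (figures from the text).
by_deff = clustered_effective_n(11260, 1.5, 3.6)

# Alternate route: convert persons to households first, then apply the
# original design effect. Both routes give the same answer.
by_households = (11260 / 3.6) / 1.5

print(round(by_deff), round(by_households))
```

Under these assumptions the effective sample for a household-level item such as poverty status falls from about 7,500 to roughly 2,100.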

Results of the comparisons of Tables 3-6 to 3-8 with Table 3-1 are summarized below. Table 3-1 indicates the sample size cut-offs for various levels of precision. Thus, an effective sample size of 500 satisfies requirements for a 20 percent CV for all prevalence rates except very rare ones (i.e., p = .01); an effective sample of 1,000 will provide a CV of .10 on prevalence rates greater than or equal to .10, as well as satisfying the criteria mentioned for a sample of 500; and an effective sample of about 2,500 will produce CVs of .20 or better for prevalence rates as low as .01 (a sample of 2,000 comes close, with a CV of about .22).
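
These rules of thumb can be checked by inverting the sample size formula to obtain the CV achieved by a given effective sample size. The sketch below assumes simple random sampling applied to the effective sample size; the function name `achieved_cv` is hypothetical.

```python
import math


def achieved_cv(effective_n, p):
    """CV of an estimated prevalence p for a given effective sample size,
    from CV = sqrt((1 - p) / (n * p))."""
    return math.sqrt((1 - p) / (effective_n * p))


# Effective sample sizes and prevalence rates mentioned in the text.
cases = [(500, 0.05), (500, 0.01), (1000, 0.10), (2000, 0.01), (2500, 0.01)]
for n, p in cases:
    print(f"n={n:>5}  p={p:.2f}  CV={achieved_cv(n, p):.3f}")
```

An effective sample of 500 gives a CV just under .20 at p = .05 but well above .20 at p = .01, and an effective sample of 1,000 gives a CV just under .10 at p = .10.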

The analysis above can be summarized as follows. The vital statistics records, Census 2000, and the ACS will permit detailed and complex analyses of all race/ethnic subpopulations. The March CPS, the NHIS, and NHES can produce quite accurate statistics for Mexican-Americans, moderately good data for Puerto Ricans and Central or South Americans, and acceptable data for the other Hispanic subgroups, with the possible exception of Cubans. Data for Chinese, Filipinos, and American Indians or Alaska Natives would be fairly reliable. Only limited analysis could be made of data for the remaining API subgroups. The monthly CPS and SIPP would be weaker for Hispanics, but mostly still acceptable. For the other surveys, acceptable precision is only possible for Mexican-Americans, and MCBS would not even be acceptable for that subgroup.

It is important to remember that the above analyses apply to the ability of the surveys to provide acceptable accuracy on prevalence rates (or percentage distributions) of total persons in each subpopulation. Many surveys require examination of important subsets of the population, as well as the total. For example, NHANES concentrates on age-sex-race/ethnicity subgroups, MEPS examines low-income persons as well as the total population, and one analytic group in the NSFG is teenagers, by race/ethnicity. For such analyses, each subset needs the sample sizes shown in Table 3-1. Thus, a simple four-way breakdown of the population, such as persons under or over 25 years by sex, would require a sample about four times as great as the numbers in Table 3-1.
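
The subset arithmetic can be sketched as below. The function name `total_needed_for_subsets` is hypothetical. The sketch assumes that every subset is sampled at the same rate, so the smallest subset determines the total; the "four times" rule in the text corresponds to four subsets of equal size.

```python
def total_needed_for_subsets(required_n, subset_shares):
    """Total effective sample needed so that every subset reaches required_n.

    subset_shares are the population shares of the subsets (summing to 1).
    With a common sampling rate, the smallest share is the binding one.
    """
    return required_n / min(subset_shares)


# Four equal subsets (e.g., under/over 25 years by sex), p = .10, CV = .20:
# Table 3-1 gives 225 per subset.
print(total_needed_for_subsets(225, [0.25, 0.25, 0.25, 0.25]))

# Two equal subsets (males and females) with 200 needed per subset.
print(total_needed_for_subsets(200, [0.5, 0.5]))
```

When the subsets are unequal, the required total grows accordingly; a subset holding 10 percent of the subgroup would call for ten times the Table 3-1 figure.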

Table 3-9 contains guidance on the ability of the various databases to provide acceptable precision levels, as follows:

  1. Detailed cross-classification is possible with reasonable precision;
  2. Some limited cross-classification is possible;
  3. Only simple distributions are possible; and
  4. No analysis is possible.

The classifications are subjective, and it is possible to reach different conclusions on the levels of precision that are reasonable. An analyst should determine how much error can be tolerated before reaching a conclusion on the detailed analysis to be carried out. Once again, given the possible changes in sample size or design, as well as the use of overlapping samples, we urge that the current sample sizes and design effects be verified before a particular data file is used.

Table 3-9.
Adequacy of databases for provision of data with acceptable precision
(see footnote* for description of codes used)
Database  Mexican-American  Puerto Rican  Cuban  Central & South American  Other Hispanic  American Indian or Alaska Native
Census
    Census 2000 A A A A A A
    ACS A A A A A A
    CPS-March A C C B C B
    CPS-Monthly B C D C C B
    SIPP B C D C C B
NCHS/CDC
    NHIS A B C B B C
    NSFG C D D D D D
    NIS B C D C C C
    NHANES C D D D D D
AHRQ
    MEPS B C D C C D
HCFA
    MCBS C D D D D D
SAMHSA
    NHSDA B C D C D D
NCES
    NHES A B C B C B
    ECLS-B B D D D C D
    ECLS-K C D D D D D
* Level of detail possible that can be attained with adequate precision Effective sample sizes
  A    Detailed cross-classification possible 4,000 or more
  B    Some limited cross-classification 1,000 to 3,999
  C    Only simple distributions 200 to 999
  D    Analysis not possible Under 200

Table 3-9. (continued)
Adequacy of databases for provision of data with acceptable precision
(see footnote* for description of codes used)
Data set Chinese Filipino Japanese Asian Indian Korean Vietnamese Hawaiian Other
Census
   Census 2000 A A A A A A A A
   ACS A A A A A A A A
   CPS-March C C C C C C D C
   CPS-Monthly C C C C C C D C
   SIPP C C C C C D D C
NCHS/CDC
   NHIS C C C C C C D C
   NSFG D D D D D D D D
   NIS C D D D D D D D
   NHANES D D D D D D D D
AHRQ
   MEPS D D D D D D D D
HCFA
   MCBS D D D D D D D D
SAMHSA
   NHSDA D D D D D D D D
NCES
   NHES C C C C C C D C
   ECLS-B C C D C C D D C
   ECLS-K D D D D D D D D
* Level of detail possible that can be attained with adequate precision Effective sample sizes
  A    Detailed cross-classification possible 4,000 or more
  B    Some limited cross-classification 1,000 to 3,999
  C    Only simple distributions 200 to 999
  D    Analysis not possible Under 200
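
The coding rule in the footnote to Table 3-9 can be expressed as a small Python function. This is a sketch of the stated cut-offs only; the function name `adequacy_code` is hypothetical, and, as the text notes, the classifications themselves are subjective.

```python
def adequacy_code(effective_n):
    """Map an effective sample size to the Table 3-9 adequacy code."""
    if effective_n >= 4000:
        return "A"  # detailed cross-classification possible
    if effective_n >= 1000:
        return "B"  # some limited cross-classification
    if effective_n >= 200:
        return "C"  # only simple distributions
    return "D"      # analysis not possible


# CPS-March effective sample sizes for Hispanic subgroups (Table 3-6).
cps_march = {
    "Mexican-American": 4627,
    "Puerto Rican": 793,
    "Cuban": 313,
    "Central or South American": 1123,
    "Other Hispanic": 650,
}
for subgroup, n in cps_march.items():
    print(subgroup, adequacy_code(n))
```

For the CPS-March row this yields A, C, C, B, and C, the same codes shown for the Hispanic subgroups in Table 3-9.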

The ability to produce acceptable data also depends on whether the survey collects the detailed race/ethnicity description of each sample person and enters the code in the data set. The Task 2 report indicated a few cases in which not all subpopulations were identified. Many of the surveys simply ask whether the sample person is an Asian or Pacific Islander without obtaining additional detail. The NVS, both natality and mortality, record the identification of Chinese, Japanese, Hawaiian, and Filipinos in all 50 states, but identify the other ethnic groups -- Vietnamese, Asian-Indian, Korean, Samoans, and Guamanians -- in only nine states which contain about two-thirds of the U.S. population in each of these groups. Obviously, the identifications and coding in the surveys and the NVS would need to be expanded to make tabulations possible.


Last updated 9/14/00