Assessment of Major Federal Data Sets for Analyses of Hispanic and Asian or Pacific Islander Subgroups and Native Americans: Inventory of Selected Existing Federal Databases

By:
Joseph Waksberg
Daniel Levine
David Marker

Submitted to:
U.S. Department of Health and Human Services
Office of the Assistant Secretary for Planning and Evaluation

Submitted by:
Westat
1650 Research Boulevard
Rockville, Maryland 20850

Content of Report

This Task 2 report is the first of the two substantive reports in the study to assess the capability of a number of federal surveys: (1) to provide data on the major subgroups of Hispanic, Asian or Pacific Islanders (API), and on American Indian or Alaska Natives, in order to analyze the health, education status, and social and economic well being of these groups; (2) to identify barriers to developing such data; and (3) to identify options for improving the capacity to obtain statistically reliable data about these populations. The report contains information on the applicable sample sizes, and an inventory of existing Federal databases for most of the major demographic, social, economic and health-related surveys carried out by or for U.S. Government agencies. Most of the databases consist of surveys that are carried out annually, or at other regular intervals, so that they provide reasonably current statistical information. However, two of the databases are somewhat different, and do not, strictly speaking, fall into the category of surveys. One is the decennial census; the other is the National Vital Statistics System, which contains data from the birth and death registration systems. These two databases are such important sources of information on demographic characteristics, economic status, and selected health items that it seemed appropriate to include them.

The ability of a survey to provide data on population subgroups with reasonable precision depends on two factors:

  1. The questionnaire or other instruments used for data collection must identify the subgroups and record the information. In turn, the detail also must appear in the microdata file. This is obviously essential and Appendix B describes both the specific questions on race and ethnicity used in each survey and the detailed race/ethnicity codes which are recorded; and
  2. The sampling errors on estimates of the characteristics of the subgroups need to be low enough for the statistics to be reasonably reliable. The sampling errors are mostly, but not exclusively, dependent on the sample size in each survey. Some of the surveys oversample Hispanics, which reduces sampling errors for the Hispanic subgroups. However, since the surveys operate with fixed budgets, the increased Hispanic samples result in a reduction in sample size for other population groups, which increases the sampling errors for Asian and Pacific Islanders and American Indians. In addition, the survey designs need to be taken into account in considering the appropriate sample sizes. For example, although labor force information is obtained monthly in the Current Population Survey (CPS) conducted by the Bureau of the Census, supplemental items designed to collect a wide variety of other social and economic information are added to individual months during the year. Consequently, only the monthly sample size applies to such information as income, family status, migration, school enrollment, etc.; in the case of the labor force data, on the other hand, information for different months can be combined to increase the sample size and produce quarterly, semi annual, or annual labor force estimates with improved reliability. In another example, the sample design for the National Health and Nutrition Examination Survey (NHANES) is focused on the need to analyze health conditions for rather narrow age-sex groups for Mexican-Americans, blacks, and all other groups. The precision of estimates for the total population is considered of secondary importance. The requirement for approximately equal sample sizes in the various age-sex domains influences the sampling errors for the statistics on the total population, and on data for all Hispanics and all Asians and Pacific Islanders.

The attached tables (Appendix Tables A-1 through A-3) provide detailed information on the sample sizes. The inventory (Appendix B) contains a concise description of the purpose of the survey, the kinds of data obtained, interview methods, and publication policy, as well as the agency website address for those desiring additional detail. Note that the inventory description is limited to what is collected and what is available on the micro-data file, since these are most relevant to the assessment. We also have included information on whether and how the subpopulations are identified, and whether bilingual interviewers are used. Table A-4 describes the availability of information on citizenship, year of immigration, and whether foreign born.

Subgroups and Databases Examined

The subgroups of interest are:

  1. Hispanic:
    • Mexican-American
    • Puerto Rican
    • Cuban
    • Central or South American
    • Other Hispanic
  2. Asian or Pacific Islander: (Note that the new OMB standards (Appendix C) split this category into "Asian" and "Native Hawaiian or Other Pacific Islander.")
    • Chinese
    • Filipino
    • Japanese
    • Asian Indian
    • Korean
    • Vietnamese
    • Hawaiian
    • Other Asian or Pacific Islander
  3. American Indian or Alaska Native

The databases examined and the appropriate reference dates are:

Data set                                                      Reference date

Census
   Census 2000                                                April 1, 2000
   American Community Survey                                  2003, proposed
   Current Population Survey-March                            March 1998
   Current Population Survey-Monthly                          Average month, 1998
   Survey of Income and Program Participation                 Wave 1, 1996 Panel
NCHS/CDC
   National Health Interview Survey                           1998
   National Vital Statistics System-Natality                  1997
   National Vital Statistics System-Mortality                 1997
   National Survey of Family Growth                           1995
   National Immunization Survey                               1999
   National Health and Nutrition Examination Survey           1999
AHRQ
   Medical Expenditure Panel Survey                           1999
HCFA
   Medicare Current Beneficiary Survey                        Early 1998, 4 panels
SAMHSA
   National Household Survey on Drug Abuse                    1998
NCES
   National Household Education Survey                        1996
   Early Childhood Longitudinal Survey - Birth Cohort         Year 1, 2000
   Early Childhood Longitudinal Survey - Kindergarten Cohort  Year 1, Fall 1998

Since both sample sizes and designs are subject to changes over time as a result of budget actions, congressional or programmatic initiatives, or baseline revisions, it is important that users or interested parties refer to current documentation or inquire of the appropriate agency whether any important changes in sample size or design have been made.

It is important to note that these reports are to serve as a general reference to a potential audience of analysts and policy makers seeking information on the possible uses of these databases as a source of data on race/ethnic groups of interest, rather than as technical handbooks. We would urge users to seek appropriate professional assistance or expertise, either from the relevant agency or from other sources, to deal with specific technical issues.

1  See page 2 of NCHS Report, Sample Design:  Third National Health and Nutrition Examination Survey, Series 2, No. 13, for a more detailed discussion.

Sources of Information

To the extent possible, information was obtained from the staff of the Government agency responsible for each survey. Westat contacted each agency initially with a structured set of questions, but with the understanding that the important objective was to obtain the desired information rather than to adhere rigidly to a fixed format of questions. Other sources were used as required to fill in any data gaps. For example, published or unpublished descriptions of the sample designs and survey procedures were consulted, as were Westat staff with personal knowledge of the content and procedures of many of the surveys. (Westat conducts some of the surveys under contract; in other cases, Westat helped develop the sample designs; further, some staff members previously held senior positions at the Census Bureau.)

We note that in a number of cases direct information on the sample sizes for the race/ethnicity subgroups was not available even though the total sample size was known, and frequently the number of all Hispanics and of all Asian and Pacific Islanders, as well. Further, the Census Bureau does not prepare independent current estimates for the Asian and Pacific Islander subpopulations, nor publish counts of the number of sample persons or households in each API subpopulation for the current surveys, such as the CPS. The Census Bureau does prepare annual population estimates for Hispanics, Native Americans, and total Asian and Pacific Islanders, by updating the most recent Census counts (currently, the 1990 Census) through the use of birth and death records, and estimates of net migration, including an allowance for illegal immigration, and the most recent estimates are shown in Table 2-1. However, the subpopulations are not included in this program. Similarly, most of the current surveys sponsored by the major statistical agencies do not publish data for the subpopulations. (The CPS does provide a limited amount of data annually for the larger Hispanic subpopulations but not for the Asian and Pacific Islander subgroups.)

The detail shown in Table 2-1 is derived from the March 1999 CPS. The Hispanic subpopulations are estimated directly from the survey; the API subpopulation detail, however, was obtained by applying the percent distributions for the subgroups as reported in the 1990 Census to Census Bureau estimates for March 1999 of the total number of Asian and Pacific Islanders.

Table 2-1.
Estimates of U.S. population in the race/ethnic subgroups examined in this report: March 1999

Race/ethnic group                      Total population1 (000’s)   Percent of total

Civilian noninstitutional population   271,743                     100.0

Hispanics, total                       31,689                      11.7
   Mexican-American                    20,652                      7.6
   Puerto Rican2                       3,039                       1.1
   Cuban                               1,370                       0.5
   Central or South American           4,536                       1.7
   Other Hispanic                      2,091                       0.8

Asian or Pacific Islanders, total      10,492                      3.8
   Chinese                             2,370                       0.9
   Filipinos                           2,028                       0.7
   Japanese                            1,227                       0.4
   Asian-Indian                        1,175                       0.4
   Korean                              1,154                       0.4
   Vietnamese                          892                         0.3
   Hawaiian                            304                         0.1
   Other                               1,342                       0.5

American Indian or Alaska Native       2,396                       0.9

1 Data for Hispanics and Hispanic subgroups are from the March 1999 Current Population Survey. Since current estimates for the Asian or Pacific Islander subgroups are not available, 1990 Census detail was adjusted to the March 1999 total for the group to produce an approximate distribution. The estimate for American Indian or Alaska Natives is for July 1999.

2 Does not include persons living in Puerto Rico.


There were several problems in preparing the sample sizes shown in Tables A-1 to A-3 for the current surveys (i.e., all data collection systems covered in this report except the decennial census, ACS, and the vital statistics records). For a number of surveys, there was no way of obtaining exact subpopulation sample sizes since the data records do not contain a subpopulation identifier. In other current surveys, the data records do indicate each sample person's subpopulation identity, but the detail was not tabulated. Requesting special tabulations would have been fairly expensive and would have significantly delayed the timetable for the project.

However, since it is necessary to have the complete distribution in order to assess the survey's ability to provide reasonably reliable data, and recognizing that reasonable approximations would serve the goals of this study adequately, we have estimated the appropriate subpopulation sample sizes where not available. Tables A-1 through A-3, therefore, contain approximations of the number of sample cases for each subpopulation in a given database. The numbers shown include those published by the Federal agencies responsible for the conduct of the survey, or provided separately, along with a number of derived estimates, prepared for the most part by using the distribution of the population in Table 2-1 as an approximation of the sample distribution. For surveys whose target populations were different from the total population (young children for NIS, ECLS-B, and ECLS-K, females 15 to 44 years for NSFG, and persons 65 years and over for MCBS), the population distribution for the target group, or for a reasonable approximation to this group, was used. Some of the surveys oversample Hispanics, and an allowance for the oversampling is included in the estimates.
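The basic derivation described above, applying a population distribution to a known total, can be sketched as follows. This is only an illustration: the function name `allocate_sample` and the total of 1,000 sample cases are hypothetical, the population figures are the API rows of Table 2-1, and the sketch ignores the oversampling allowances and target-population adjustments mentioned in the text, so it is not expected to reproduce the entries in Tables A-1 through A-3.

```python
# API subgroup population estimates from Table 2-1, in thousands.
API_POPULATION_THOUSANDS = {
    "Chinese": 2370,
    "Filipino": 2028,
    "Japanese": 1227,
    "Asian Indian": 1175,
    "Korean": 1154,
    "Vietnamese": 892,
    "Hawaiian": 304,
    "Other": 1342,
}


def allocate_sample(total_sample, population):
    """Split a total sample size across subgroups in proportion to population.

    Each subgroup count is rounded independently, so the parts may not add
    exactly to total_sample.
    """
    population_total = sum(population.values())
    return {
        group: round(total_sample * count / population_total)
        for group, count in population.items()
    }


# Hypothetical survey with 1,000 API sample persons in total.
approximate_cases = allocate_sample(1000, API_POPULATION_THOUSANDS)
for group, cases in approximate_cases.items():
    print(f"{group}: about {cases} sample cases")
```

Because the allocation is strictly proportional, any survey design that samples some groups at higher rates would need an additional adjustment factor per group.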

The derived estimates of sample sizes for those databases which do not currently identify all subpopulations also provide an indication of the potential value of the databases as a source of useful information were the appropriate agency to record the race/ethnic detail on the data file.

Although the estimates shown in Tables A-1 through A-3 may differ somewhat from the actual counts and, thus, introduce a degree of approximation in the analyses of these data, we do not believe this will affect in any important way the conclusions to be drawn. The reason for the emphasis on the sample sizes in both this and the Task 3 report is that the sample size is the most important factor in determining the standard errors of the survey estimates, which in turn establish the precision that is achieved. The standard error of a proportion estimated in a survey can be expressed as

$$\sigma = \sqrt{\frac{d\,p\,(1-p)}{n}}$$

where p is the proportion being estimated, d is the design effect, which depends on the sample design and the specific item being estimated, and n is the sample size. Because n enters only through a square root, the standard errors move rather slowly with changes in the value of n. For example, if the estimates of sample sizes in Tables A-1 to A-3 were off by 10 percent, this would cause an error of only about 5 percent in the estimates of the standard errors. Similarly, if the sample sizes were actually 20 percent higher or lower than those shown in Tables A-1 to A-3, the estimates of standard error would be in error by only about 10 percent (e.g., ±8.8 percent instead of ±8 percent). We do not believe that this would affect a decision on whether a survey can provide useful data on the subpopulations, or the amount of sample increase required to produce reliable data.
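The sensitivity argument above can be checked numerically. The following is a minimal sketch, not part of the original analysis; the function names are illustrative, and the error factors of 0.8 to 1.2 simply mirror the 10 and 20 percent cases discussed in the text.

```python
import math


def standard_error(p, n, d=1.0):
    """Standard error of an estimated proportion: sqrt(d * p * (1 - p) / n)."""
    return math.sqrt(d * p * (1.0 - p) / n)


def se_ratio(true_n_over_assumed_n):
    """Ratio of the true standard error to the one computed from the assumed n.

    p and d cancel, so the ratio depends only on the sample-size ratio.
    """
    return 1.0 / math.sqrt(true_n_over_assumed_n)


for factor in (0.8, 0.9, 1.1, 1.2):
    change_percent = (se_ratio(factor) - 1.0) * 100.0
    print(f"true n = {factor:.1f} x assumed n: "
          f"standard error changes by {change_percent:+.1f} percent")
```

Running the loop shows changes of roughly 5 percent for a 10 percent error in n and roughly 9 to 12 percent for a 20 percent error, in line with the rounded figures quoted in the text.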

We also note that standard errors will vary greatly among the statistics estimated in each survey because of the impact of the values of d and p in the expression for the standard error. For example, p may be of the order of 0.10 for the proportion of children without health insurance, but 0.30 for the proportion of children below poverty. Consequently, a decision regarding a survey's ability to provide "adequate reliability" will need to focus on either the reliability for a few specific items or on the average among a group of items. Thus, having reasonable approximations, as distinct from exact standard errors, will not appreciably influence any of the conclusions in this report.
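To illustrate how much the item being estimated matters, the sketch below compares the two proportions mentioned above at a fixed sample size. The sample size of 500 and the design effect of 1.5 are hypothetical values chosen only for illustration; the relative standard error (standard error divided by p) is a common summary added here and is not defined in the report.

```python
import math


def standard_error(p, n, d):
    """Standard error of an estimated proportion: sqrt(d * p * (1 - p) / n)."""
    return math.sqrt(d * p * (1.0 - p) / n)


def relative_standard_error(p, n, d):
    """Standard error expressed as a fraction of the estimate itself."""
    return standard_error(p, n, d) / p


SAMPLE_SIZE = 500      # hypothetical subgroup sample size
DESIGN_EFFECT = 1.5    # hypothetical design effect

for label, p in (("without health insurance", 0.10), ("below poverty", 0.30)):
    se = standard_error(p, SAMPLE_SIZE, DESIGN_EFFECT)
    rse = relative_standard_error(p, SAMPLE_SIZE, DESIGN_EFFECT)
    print(f"{label}: p = {p:.2f}, SE = {se:.4f}, relative SE = {rse:.1%}")
```

The absolute standard error is larger for the larger proportion, while the relative standard error is larger for the smaller one, which is one reason a single reliability threshold is hard to apply across items.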

Particular Issues Relating to Content of the Inventory

There are a number of issues that will affect the ability of the surveys to provide statistical data on the minority subgroups, or in some cases to permit data from several surveys to be combined for improved reliability. A detailed discussion of these issues will be included in the Task 3 report, but it seems useful to call attention to them now.

  • Some of the surveys do not ask Hispanic or Asian respondents to identify themselves or, in some cases, do not record the specific subgroups. In other cases, the question wordings vary among some of the surveys, although usually within a survey the questions are asked in a consistent way over long periods of time. It seems unfortunate that Government statistical agencies cannot agree on a common set of questions. The proposed revisions in the standards for the federal collection of race and ethnicity, developed by the Office of Management and Budget (OMB), should help resolve this issue. It is our understanding that the revised race/ethnic standards will be in use for virtually all federal surveys by mid-decade. The revised standards, as currently proposed, are set forth in Appendix C.

    Alternative wordings probably tend to identify race/ethnicity reasonably consistently for most members of the subpopulations, but there are important exceptions. For example, in the 1980 Census, many Hispanics in New Mexico, Arizona, and to a lesser extent in Texas and California, reported themselves as "Hispanic – Other" rather than Mexican-American, presumably because their ancestors resided in these areas at the time they were annexed by the United States in the mid-19th century, and they do not consider themselves of Mexican ancestry. (Further, until recently, the New Mexico birth certificate used the category, "U.S. Southwest", which referred to very long-time residents who had immigrated from Mexico.) We assume similar reporting in CPS and SIPP, since the questions essentially are the same as in the Census questionnaire. However, Hispanic HANES and NHANES III probed more intensively and classified such persons as Mexican-Americans. Another example of differences in the manner of identification exists in the questions on race/ethnicity in the National Vital Statistics System–Natality, as contrasted with those asked in the Census and most current surveys. Natality data are tabulated according to the race of the mother; information is not obtained on the race of the child, while the surveys collect race/ethnicity for each individual.

  • Surveys in which the subgroups are not identified are useless for subgroup analysis (e.g., Asian and Pacific Islander subgroups are not identified in SIPP). If the subject matter of a particular survey could shed light on important social, economic, or health conditions of minority groups, it might prove very useful and desirable to add the appropriate questions identifying the subpopulations to the survey instrument and, thus, record the full detail for the subgroups. Such a decision would add only trivially to the length of interview or the complexity of the survey, although even slight changes in survey content can create complications in software and take time to implement.
    For purposes of this study, we have developed approximations of the number of sample persons for the race/ethnic subgroups not identified on the data tapes and included them in Tables A-1 to A-3. These approximations also should prove informative in any consideration of the desirability of asking the sponsoring agency to add such identification.
  • We mentioned earlier that the sample size for the survey is a dominant factor in determining the applicable sampling errors, but it is not the only one. Some of the household surveys now oversample Hispanics in order to reduce sampling errors for this minority; the methods of oversampling are not the same, and they have varying effects on reducing the sampling errors. The NHANES focuses on providing specified sample sizes for a group of age-sex-race/ethnic groups, which sharply reduces the effective sample size for the total population. Whenever applicable, such features of the sample are described in the accompanying survey description. With the exception of the two Early Childhood Longitudinal Studies, none of the surveys oversamples Asian and Pacific Islanders.
  • An important feature of the sample design that influences the sampling error is the survey design effect. Design effects reflect increases in the variances arising both from the clustering and from departures from an equal probability sample, and decreases from the use of stratification and estimation procedures. (The increases usually dominate.) Varying sampling rates are sometimes used among geographic areas, age groups, or income class. The discussion in the Task 3 report on the precision of the survey estimates and actions that could be taken to obtain statistically reliable results for the designated subgroups will include detailed information on design effects, as well as on certain other features of the various designs which also influence the sampling errors.
  • When the sample size for a survey does not provide reasonable reliability for subgroup analyses, and combining several years of data (or cycles if it is not an annual survey) does not improve the statistics sufficiently, a solution is a large increase in the survey sample size, probably at least doubling or tripling the sample, in addition to introducing a massive field screening operation to locate a sufficient number of members of the subgroup. A model of how this can be done efficiently is the Hispanic Health and Nutrition Examination Survey (HHANES), carried out by Westat for the National Center for Health Statistics in 1982-84. It should be noted that a number of the identified surveys oversample blacks as well as Hispanics and other groups, and the supplementation for the race/ethnic subpopulations would have to be superimposed on the sample design in use. There are no technical difficulties in such an expansion of the sample, but it is not likely that any part of the supplementation can be compensated for by a reduction in the black or white sample.
  • There has been considerable discussion in the press, in Congress, and by statisticians and other social scientists, about the Census Bureau's proposed plans to conduct some of the follow-up on nonrespondents in the Year 2000 Census on a sample basis and, also, to adjust the Census data for the expected undercounts on the basis of a sample survey. The proposal for a sample follow-up of nonrespondents has been dropped as a consequence of a Supreme Court ruling that a full count was necessary for apportionment of Congressional seats among the states. The Bureau proposes to conduct a sample survey to measure undercoverage, but the results will not be used in the official population counts used for apportionment. However, adjusted figures may be prepared and both adjusted and unadjusted counts made available for use by state officials, and, more generally, by social scientists. It is unlikely such an adjustment procedure will take undercounts for the specific subpopulations into account; in the past, adjustments were for all Hispanics, or all Asian and Pacific Islanders as a group. The undercounts are only one source of quality problems in the Census. We note that many of the current surveys use census figures adjusted for undercoverage as the population controls for poststratification, but this practice has only a minor effect on the quality of the data for subpopulations.
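The points above about design effects and about doubling or tripling a sample can be made concrete with the standard-error expression given earlier. The sketch below is an illustration under stated assumptions: the helper names, the current sample of 400 cases, the design effect of 2.0, and the target standard error of 0.03 are all hypothetical, and the calculation ignores screening costs and any change in the design effect that a supplemented design might bring.

```python
import math


def effective_sample_size(n, d):
    """Size of a simple random sample with the same variance as n cases
    collected under a design with design effect d."""
    return n / d


def required_sample_size(p, d, target_se):
    """Smallest n for which sqrt(d * p * (1 - p) / n) is at most target_se."""
    return math.ceil(d * p * (1.0 - p) / target_se ** 2)


current_n = 400        # hypothetical subgroup sample size
design_effect = 2.0    # hypothetical design effect
p = 0.30               # proportion being estimated
target_se = 0.03       # hypothetical precision target

needed_n = required_sample_size(p, design_effect, target_se)
print(f"effective sample size now: {effective_sample_size(current_n, design_effect):.0f}")
print(f"sample needed for SE <= {target_se}: {needed_n}")
print(f"multiple of current sample: {needed_n / current_n:.1f}")
```

Because the required sample grows with the inverse square of the target standard error, halving the standard error requires roughly four times as many cases, which is why modest precision goals can imply large supplements.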

Appendix A: Tables A-1 Through A-4

The information shown in Tables A-1 through A-3 provides a reasonable approximation of the number of subpopulation sample cases contained in each survey, for Hispanics, Asians and Pacific Islanders, and for American Indians or Alaska Natives. The following should be noted:

  • Many of the numbers shown in the tables are approximations: they are not exact, but they provide a reasonable measure of magnitude. As might be expected, the sample sizes will vary somewhat among the months, waves, or rounds of the surveys, reflecting changes in response, coverage, and sampling variability. Such slight differences, however, do not affect the analyses or conclusions contained in either the Task 2 or Task 3 report.
  • In most cases, the information reflects the design in use and the sample sizes as of the data collection period noted earlier. The Census 2000 and ACS detail reflects the current understanding of the plans proposed for each.
  • As in recent censuses, Census 2000 consists of both a limited number of questions asked of the full population (the short form), and a more detailed set of questions asked of about 1 household in six (long form). For this report, the sample sizes shown in the appropriate tables are restricted to the long form, which covers the detailed economic and social information.
  • The entries in the tables are a combination of three possible sources:
    1. The group or subgroup is identified and counts of subgroups were available or provided;
    2. The group or subgroup was identified but not counted and estimates were prepared; and
    3. The subgroups were not identified but estimates were prepared based on the most recent population distributions available.
  • The sample cases shown for each data set reflect the population covered in the survey. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The descriptions of the respective data sets note the appropriate population coverage.
  • In describing the interviewing policy, the terms "CAPI" and "CATI" are used. CAPI refers to Computer Assisted Personal Interview, a technique for conducting a personal interview in which the questionnaire and all instructions are found on a personal computer, rather than in a paper version. CATI refers to Computer Assisted Telephone Interview, a technique for conducting an interview over the phone, in which the questionnaire and all instructions are found in a central computer, rather than in a paper version. Both CAPI and CATI have the advantage of ensuring adherence to the proper pattern of questions for the respondent, and the ability to conduct edits during the interview to ensure consistency in response.

Table A-4 indicates, for each survey or data collection effort, whether the following three items of information are available as of the appropriate reference date:

  1. U.S. Citizenship (Yes/No);
  2. Year of entry into the U.S.; and
  3. Whether foreign born (Yes/No).

Table A-1.
Approximations of Hispanic sample cases in the data set1

Data set           Total Hispanic  Mexican-American  Puerto Rican  Cuban    Central or South American  Other Hispanic

Census
   Census 2000     4,508,000       2,850,000         475,000       190,000  650,000                    335,000
   ACS             900,000         570,000           95,000        38,000   130,000                    67,000
   CPS-March       11,260          6,940             1,190         470      1,685                      975
   CPS-Monthly     5,635           3,470             595           235      845                        490
   SIPP            10,845          7,181             1,172         372      1,306                      814

NCHS/CDC
   NHIS            22,145          13,869            2,353         1,165    2,093                      4,758
   NVS-Natality2   710,000         500,000           55,000        13,000   97,000                     45,000
   NVS-Mortality2  95,000          53,000            13,000        10,000   8,000                      12,000
   NSFG            2,097           7,181             1,172         372      1,306                      814
   NIS             4,852           3,529             398           99       526                        300
   NHANES          1,582           1,500             24            10       32                         16

AHRQ
   MEPS            5,375           3,650             600           225      766                        134

HCFA
   MCBS            464             254               42            67       52                         50

SAMHSA
   NHSDA           5,000           3,170             527           211      721                        372

NCES
   NHES            18,804          13,675            1,541         385      2,040                      1,162
   ECLS-B          1,979           1,367             160           35       137                        280
   ECLS-K          2,957           2,150             242           61       321                        183

1 Data not directly available from the data set have been approximated (see Section 2).

2 Data represent a census of all registered events.

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The descriptions of the respective data sets note the appropriate population coverage.


Table A-2.
Approximations of Asian and Pacific Islander sample cases in the data set1

Data set           Total API  Chinese  Filipino  Japanese  Asian Indian  Korean   Vietnamese  Hawaiian  Other API

Census
   Census 2000     1,580,000  375,000  300,000   180,000   175,000       175,000  135,000     45,000    195,000
   ACS             316,000    75,000   60,000    36,000    35,000        35,000   27,000      9,000     39,000
   CPS-March       4,555      995      850       515       495           485      375         125       565
   CPS-Monthly     4,555      995      850       515       495           485      375         125       565
   SIPP            3,293      745      637       386       370           362      280         95        421

NCHS/CDC
   NHIS            3,284      755      647       356       320           342      356         112       396
   NVS-Natality2   170,000    28,000   32,000    91,000    18,000        8,000    13,000      6,000     56,000
   NVS-Mortality2  30,800     7,200    6,000     5,100     1,500         1,900    1,400       1,500     6,200
   NSFG            327        74       63        38        37            36       28          9         42
   NIS             1,172      265      227       137       131           129      100         34        150
   NHANES          113        27       27        13        12            12       10          3         14

AHRQ
   MEPS            750        152      170       62        111           96       45          17        97

HCFA
   MCBS            151        34       29        18        17            17       13          4         19

SAMHSA
   NHSDA           700        158      15        86        78            77       59          20        90

ED
   NHES            4,420      999      855       517       495           486      376         128       566
   ECLS-B          2,483      705      467       134       282           278      217         74        325
   ECLS-K          1,870      423      362       219       209           206      159         54        239

1 Data not directly available from the data set have been approximated (see Section 2).

2 Data represent a census of all registered events; (-) indicates the specific subpopulation detail is not available.

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The descriptions of the respective data sets note the appropriate population coverage.


Table A-3.
Approximations of American Indian or Alaska Native sample cases in the data set1

Data set           American Indian or Alaska Native

Census
   Census 2000     330,000
   ACS             67,000
   CPS-March       1,600
   CPS-Monthly     1,350
   SIPP            1,200

NCHS/CDC
   NHIS            978
   NVS-Natality2   39,000
   NVS-Mortality2  11,000
   NSFG            77
   NIS             460
   NHANES          24

AHRQ
   MEPS            375

HCFA
   MCBS            25

SAMHSA
   NHSDA           166

NCES
   NHES            1,675
   ECLS-B          50
   ECLS-K          364

1 Data not directly available from the data set have been approximated (see Section 2).

2 Data represent a census of all registered events.

NOTE:
The sample cases for each data set reflect the population coverage of the respective surveys. For example, CPS-March covers all persons in the civilian noninstitutional population, whereas NSFG covers women 15 to 44 years of age. The descriptions of the respective data sets note the appropriate population coverage.


Table A-4.
Information on citizenship, year of immigration, and foreign birth, by survey

Survey             U.S. Citizenship  Year entered U.S.  Whether foreign born

Census
   Census 2000     Yes               Yes                Yes
   ACS             Yes               Yes                Yes
   CPS-March       Yes               Yes                Yes
   CPS-Monthly     Yes               Yes                Yes
   SIPP            Yes               Yes                Yes
NCHS/CDC
   NHIS            Yes               Yes                Yes
   NVS-Natality    *                 *                  *
   NVS-Mortality   No                No                 Yes
   NSFG            No                Yes                Yes
   NIS             No                No                 No
   NHANES          Yes               Yes                Yes
AHRQ
   MEPS            No                No                 No
HCFA
   MCBS            No                No                 No
SAMHSA
   NHSDA           No                No                 No
NCES
   NHES            No                Yes                Yes
   ECLS-B          *                 *                  *
   ECLS-K          Yes               Yes                Yes
NOTE:
* Birth certificate contains information on mother’s and father’s place of birth.

Appendix B: Inventory of Selected Existing Federal Data Bases

Census 2000

Sponsoring agency

Bureau of the Census
Department of Commerce

Reference date

April 2000

Introduction

The decennial census is the oldest data collection effort in the United States. Initiated in 1790, and conducted thereafter at 10-year intervals in the year ending in zero, Census 2000 will be the 22nd decennial census. In addition to providing a "snapshot" of the United States, the decennial census provides information at all levels of geography, from the large to the small, ranging from political entities such as states, counties, cities, and local governments, to small areas such as blocks and tracts. The availability of results from past censuses provides historical series for a variety of characteristics, including race/ethnicity, which can be examined and analyzed over time, with allowance for differences in definitions and reporting. Decennial census results (sometimes adjusted for census undercount) either directly or extrapolated to post-census time periods also serve as denominators in the calculation of important social and economic indicators, such as birth and death rates, incidence rates for diseases, and crime rates. The data also are used in the design of population surveys and in adjusting survey results to known population parameters.

For the most part, decennial census data are collected through self-enumeration. The basic Census 2000 form (short form), to be filled out for each person in the nation as of April 2000, including the institutionalized population and the resident armed forces, will collect information on only seven subjects, including Hispanic origin and race. The Census long-form, to be completed by about 1 household in 6, includes a wide variety of additional questions on the social and economic status of the population. The results from the long-form will begin to become available in 2002.

Race/ethnicity

The race/ethnicity questions in Census 2000 reflect the results of extensive research and testing to develop new standards for collecting, tabulating, and presenting data on race and ethnicity. Census 2000 is one of the first activities to reflect this effort. The standards, when adopted fully, will be incorporated in virtually all surveys or data collection efforts supported by the Federal statistical system, beginning early in the next decade.

The specific questions and responses are as follows:

  • Is Person - Spanish/Hispanic/Latino?
    Mark (X) the "NO" box if not Spanish/Hispanic/Latino
    [ ] No, not Spanish/Hispanic/Latino [ ] Yes, Puerto Rican
    [ ] Yes, Mexican, Mexican-Am., Chicano [ ] Yes, Cuban
    [ ] Yes, other Spanish/Hispanic/Latino-Print group  

  • What is Person's race? Mark [X] one or more races to indicate what this person considers himself/herself to be.
    [ ] White
    [ ] Black, African-Am., or Negro
    [ ] American Indian or Alaska Native—Print name of enrolled or principal tribe

    [ ] Asian Indian [ ] Japanese [ ] Native Hawaiian
    [ ] Chinese [ ] Korean [ ] Guamanian or Chamorro
    [ ] Filipino [ ] Vietnamese [ ] Samoan
    [ ] Other Asian—Print race   [ ] Other Pacific Islander—Print race

    [ ] Some other race

Interviewing policy

As the Nation's oldest data gathering organization, the Census Bureau has long experience in interviewing non-English speaking respondents. The decennial census makes special efforts to hire indigenous interviewers, especially so in areas containing large numbers of non-English speaking respondents. Where a bilingual interviewer is not immediately available and another family member is unable to bridge the language gap, a callback visit is scheduled and the required language skill is located and made available. Partnerships also are established with the local community and with public interest groups in order to ensure the availability of the needed language skills, and to obtain assistance in seeking public cooperation in responding to the census. As in recent censuses, the public will be urged to call if they require assistance. To the extent possible, the Bureau plans to meet such needs, both through its own staff and through the efforts of the local community; the extent of the effort is still in development.

Census questionnaires will be available in five languages other than English--that is, in Chinese, Korean, Spanish, Tagalog, and Vietnamese--and will be provided if requested. In addition, questionnaire assistance booklets will be available in over 30 languages.

Sample size

The decennial census covers the total population resident in the United States as of April 1, 2000. The sample sizes shown in Appendix Tables A-1 to A-3 refer to the respective populations included in the long-form sample.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

A wide variety of detailed data will be available from the long-form, including extensive detail for the Hispanic population and subpopulations, for each of the Asian and Pacific Islander populations, and for American Indians or Alaska Natives. The amount of detail, of course, will be more limited for smaller geographic entities, such as towns or rural areas, because the small sample sizes may preclude presenting data, either for the group as a whole or for its components, with sufficient reliability. At the time of publication, Census 2000 will provide the most timely and most extensive national and small area information available for the populations of interest, with regard to demographic and socio-economic characteristics.

As in past censuses, public use micro-record files will be available. Following past practice, they should contain the full race/ethnic detail, including subpopulation identification for both Hispanics and Asians and Pacific Islanders.

Revised race/ethnic definitions

Census 2000 will utilize the revised race/ethnic definitions, consistent with the new OMB standards for collecting and presenting race/ethnic data.

Agency website address:

www.census.gov

American Community Survey (ACS)

 

Sponsoring agency

Bureau of the Census
Department of Commerce

 

Reference date

2003

Introduction

The American Community Survey (ACS) is planned as a continuing sample survey designed to replace the Census long form in 2010. It is intended to provide reliable annual estimates of the detailed social, economic, and housing characteristics for all states, and for cities, counties, metropolitan areas, and population groups of 65,000 persons or more. For smaller areas, multi-year average data covering the most recent 2 to 5 years will be used to generate the estimates. For the most part, the survey will be conducted by mail. The survey will sample about 250,000 households per month, or some 3 million different households per year (some 3 percent of all households). Thus, over a 5-year period, the cumulative sample will approximate the decennial long-form sample. Currently in the testing and developmental stage, it is planned for implementation in 2003. As part of the inquiry, extensive demographic detail will be collected, including detailed race and Hispanic origin.
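
The sample-size arithmetic in the paragraph above can be checked with a short sketch. The figures come from the text (250,000 households per month, about 3 percent of households per year, a long form going to about 1 household in 6, and the 65,000-person threshold). The total number of households is not stated in the report; it is only implied by the 3 percent figure, and the function name is hypothetical.

```python
HOUSEHOLDS_PER_MONTH = 250_000       # from the text
ANNUAL_SHARE_OF_HOUSEHOLDS = 0.03    # "some 3 percent of all households"
LONG_FORM_RATE = 1 / 6               # Census long form: about 1 household in 6
ANNUAL_ESTIMATE_THRESHOLD = 65_000   # population size needed for annual estimates

annual_sample = HOUSEHOLDS_PER_MONTH * 12
# Implied by the 3 percent figure; not a number given in the report.
implied_total_households = annual_sample / ANNUAL_SHARE_OF_HOUSEHOLDS
five_year_sample = annual_sample * 5
long_form_sample = implied_total_households * LONG_FORM_RATE


def estimate_type(population):
    """Kind of ACS estimate planned for an area or group of this size."""
    if population >= ANNUAL_ESTIMATE_THRESHOLD:
        return "annual"
    return "multi-year average (2 to 5 years)"


if __name__ == "__main__":
    print(f"Annual ACS sample:        {annual_sample:,} households")
    print(f"Five-year ACS sample:     {five_year_sample:,} households")
    print(f"Implied long-form sample: {long_form_sample:,.0f} households")
    print(f"Ratio (5-year ACS / long form): {five_year_sample / long_form_sample:.2f}")
    print("Area of 40,000 persons:", estimate_type(40_000))
```

Under these assumptions the five-year ACS sample comes to roughly nine-tenths of the implied long-form sample, which is the sense in which the two are described as approximately equal.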

Race/ethnicity

The ACS race/ethnicity questions and response categories follow:

  • Is this person - Spanish/Hispanic/Latino?
    Mark (X) the "NO" box if not Spanish/Hispanic/Latino

    • [ ]No, not Spanish/Hispanic/Latino
      [ ]Yes, Mexican, Mexican-Am., Chicano
      [ ]Yes, Puerto Rican
      [ ]Yes, Cuban
      [ ]Yes, other Spanish/Hispanic/Latino(Print group)

    What is this person's race? Mark (X) one or more races to indicate what this person considers himself/herself to be

    • [ ] White
      [ ] Black, African-Am., or Negro
      [ ] American Indian, or Alaska Native. Print name of enrolled or principal tribe.
      [ ] Asian Indian
      [ ] Chinese
      [ ] Filipino
      [ ] Japanese
      [ ] Korean
      [ ] Vietnamese
      [ ] Other Asian
      [ ] Native Hawaiian
      [ ] Guamanian or Chamorro
      [ ] Samoan
      [ ] Other Pacific Islander
      [ ] Some other race
      Print race
       

Interviewing policy

As the Nation's oldest data gathering organization, the Census Bureau has long experience in interviewing non-English speaking respondents. The ACS, however, as a continuing, monthly effort to be conducted principally by mail, does raise a number of new issues, which the Bureau is attempting to address as part of its current research and testing program. Given its large sample size and its dispersion across virtually all counties, it is not feasible to have resident interviewers in all locations. To the extent that follow-up is conducted by telephone, using Computer Assisted Telephone Interviewing (CATI), a variety of language skills will be available. Personal follow-up, however, will be accomplished by travelling interviewers, some of whom may lack the necessary language skills. In a large number of instances, bilingual family members will be able to assist in completing the interview, but in some cases it may be necessary for the Bureau to hire or otherwise recruit the necessary language skills.

The ACS will be available in paper form in a limited number of other languages. The exact number is yet to be determined, based both on the experience of Census 2000 and the testing of the ACS.

Sample size

As with the decennial census, the ACS will cover the total resident population, including both the armed forces and the institutionalized. The sample sizes shown in Appendix Tables A-1 to A-3 refer to the Hispanic, Asian and Pacific Islander, and Native American populations included in an annual ACS sample.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

When fully implemented, the ACS will become a major source of a wide variety of detailed data on the Hispanic population and its subgroups, as well as for each of the API populations, and for American Indians or Alaska Natives. The amount of detail, of course, will be more limited for smaller geographic entities, such as towns or rural areas, because the small sample sizes may preclude obtaining data, either for the group as a whole or for its components, with sufficient reliability. Further, as noted above, information for small demographic groups or for small geographic entities may require the accumulation of 2 to 5 years of sample data in order to provide acceptable levels of reliability. In sum, however, ACS will provide both timely and extensive national and small area information for the populations of interest.

It is expected that public use files containing micro-records will be made available regularly. Such files undoubtedly will contain the full scope of the race/ethnic information collected on the ACS, including identification of the Hispanic and API subpopulation groups.

Revised race/ethnic definitions

When implemented in 2003, the ACS will utilize the revised race/ethnic definitions and categories, including multiple reporting of race, consistent with the new OMB standards for collecting and presenting race/ethnic data.

Agency website address:

www.census.gov

Current Population Survey (CPS)

 

Sponsoring agency

Bureau of the Census
Department of Commerce

 

Reference date

March 1998
Average Month 1998

Introduction

The Current Population Survey (CPS) is a monthly survey conducted since 1942 by the Bureau of the Census to produce the official government statistics on the Nation's employment and unemployment. At the present time, some 48,000 households are interviewed. As part of the inquiry, extensive demographic detail is collected about those interviewed, including age, sex, race, Hispanic origin, educational attainment, and marital status. In addition, from time to time, supplementary questions are added to the survey to provide a wide variety of national information on such subjects as school enrollment, multiple job holding, immunization status, fertility, voting behavior, and computer ownership. In March of each year, the survey includes the questions found in the decennial long-form, such as income, work experience, and mobility, in order to provide post-censal updates for the socio-economic detail. The CPS has included questions on Hispanic origin since the early 1970's and, thus, serves to provide historical series for this group; information for subgroups is a more recent occurrence.

Race/ethnicity

Race and Hispanic origin are obtained for each person in the sample. The following series of questions is used currently (November 1999):

I am going to read a list of race categories: What is the race of each person in this household?

(If respondent seems unsure or is unable to provide an answer), ask,

Are you (Is he/she) White, Black, American Indian, Aleut or Eskimo, Asian or Pacific Islander, or something else?

I am going to read a list of origin categories: What is (name's/your) origin or descent?

The response categories are listed on flashcards which are handed to the respondent when appropriate. The Race flashcard entry for Asian or Pacific Islander displays parenthetically a number of subgroups (Japanese, Chinese, Filipino, Korean, Asian Indian, Vietnamese, Hawaiian, Guamanian, Samoan, other Asian) to assist the respondent, but these groups are not recorded. The Origin or Descent flashcard lists a number of the discrete Hispanic subgroups (Mexican-American, Chicano, Puerto Rican, Cuban, and Central or South American). Subgroups for the Asian or Pacific Islanders are not listed on the Origin or Descent flashcard. The detail on the flashcards for Hispanics appears on the tape files, including the Public Use files.

In connection with the extended March supplement, persons identified as Asian or Pacific Islanders are further classified into subgroups (e.g., Chinese, Japanese, Filipino, Asian Indian, Vietnamese, Guamanian, Hawaiian); at the moment, these results appear only on internal records.

Interviewing policy

The Census Bureau has long experience in interviewing non-English speaking respondents. Since a large proportion of CPS interviews use a Computer Assisted Telephone Interviewing system (CATI), the Bureau has located one of its telephone centers in Tucson, AZ, to take advantage of its large Spanish speaking population. In areas containing large numbers of non-English speaking respondents, the Bureau generally attempts to locate, hire, and train members of the group who are bilingual, and they are assigned as needed. Where a bilingual interviewer is not available, the interviewer attempts to locate another member of the family who is bilingual to assist in the interview, or arranges to call back when a translator can be obtained.

Since the CPS is a "computer based" survey, it is not available in other languages in paper form, although a Spanish-language version is resident on the CATI/CAPI systems. In a given month, only some 220 of the 3,400 Spanish-origin households (about 6.5 percent) are interviewed using the Spanish-language instrument.

Sample size

The CPS covers the civilian noninstitutional population. The sample sizes shown in Appendix Tables A-1 to A-3, both for CPS-March and for CPS-Monthly, cover all ages. As noted earlier, the CPS identifies the Hispanic subgroups; it identifies subgroups of Asian and Pacific Islanders only in connection with the March supplement. The March CPS oversamples Hispanics by a factor of 2; thus the sample sizes for Hispanics for CPS-March reflect the oversampling.
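
To illustrate why oversampled counts should not be read as population shares, the following sketch compares unweighted and weighted shares when one group is sampled at twice the base rate, as the text describes for Hispanics in the March CPS. The base sampling rate and the sample counts are hypothetical, and the sketch uses only inverse-probability base weights; actual CPS weighting involves further adjustments that are not modeled here.

```python
def base_weight(selection_probability):
    """Base (design) weight: the inverse of the probability of selection."""
    return 1.0 / selection_probability


BASE_RATE = 1 / 2000        # hypothetical base sampling rate, not a CPS figure
OVERSAMPLE_FACTOR = 2       # from the text: Hispanics oversampled by a factor of 2

weights = {
    "Hispanic": base_weight(BASE_RATE * OVERSAMPLE_FACTOR),
    "Other": base_weight(BASE_RATE),
}
sample_counts = {"Hispanic": 200, "Other": 900}   # hypothetical sample counts


def unweighted_share(group, counts):
    """Share of the raw sample falling in the group."""
    return counts[group] / sum(counts.values())


def weighted_share(group, counts, group_weights):
    """Estimated population share after applying base weights."""
    total = sum(counts[g] * group_weights[g] for g in counts)
    return counts[group] * group_weights[group] / total


if __name__ == "__main__":
    print(f"Unweighted share: {unweighted_share('Hispanic', sample_counts):.3f}")
    print(f"Weighted share:   {weighted_share('Hispanic', sample_counts, weights):.3f}")
```

In this hypothetical case the oversampled group makes up about 18 percent of the sample but about 10 percent of the weighted estimate, because each of its sample cases carries half the weight of the others.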

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Data from the March CPS supplement on the socio-economic status of the Hispanic population are published annually in a separate report in the Bureau's P-20 Series. Limited information is provided for the subpopulations. Data for the Asian and Pacific Islander population from the March CPS supplement are contained in the special report by race, issued by the Bureau; no information is provided for Native Americans. Some of the reports on other subject areas contain limited information on Hispanics and the API population. Information on the labor force behavior of Hispanics and Asians is published regularly by the Bureau of Labor Statistics.

Public use micro-data files are available containing data from the monthly CPS, as well as for the supplementary questions asked only in March. The race/ethnic data on these files conform to the collected detail; that is, Hispanic subpopulations are identified, but only the total Asian/Pacific Islander group is shown.

Revised race/ethnic definitions

The new OMB standards for collecting race/ethnic data will be introduced in the CPS in 2003. Current thinking suggests a separate item on Hispanic origin, by subgroups, followed by the question on race, which would have separate entries for Asian and Pacific Islander subgroups. Allowance will be made for multiple reporting. At the moment, the Bureau is not planning to extend the detail published, either for Hispanics or for the API population group.

Agency website address:

www.census.gov

Survey of Income and Program Participation (SIPP)

Sponsoring agency

Bureau of the Census
Department of Commerce

 

Reference date

Wave 1, 1996 Panel

Introduction

The Survey of Income and Program Participation (SIPP), initiated in late 1983, is a major continuing household survey, providing information on the detailed sources of income, on participation in a wide range of government programs, and on program eligibility. Extensive demographic detail is collected about the persons interviewed, including age, sex, race, Hispanic origin, educational attainment, and marital status. In addition to a core set of questions concerning labor market activity, earnings and income, and program participation and eligibility, from time to time, supplementary questions or topic modules are added to the survey to provide a wide variety of national information on such subjects as assets and liabilities, housing costs and energy usage, child care, welfare history, and disability. The current sample panel consists of some 40,000 households, which are interviewed at 4-month intervals. Longitudinal surveys, such as SIPP, are subject to cumulative nonresponse, which must be taken into account in using data derived over the life of a panel. Prior to 1996, the typical panel length was 32 months; the survey was redesigned in 1996, at which time the panel length was extended to 4 years.
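
The panel arithmetic and the idea of cumulative nonresponse can be sketched as follows. The 4-month interview interval and the 32-month and 4-year panel lengths come from the text. The per-wave retention rate is purely hypothetical, and treating it as constant across waves is a simplifying assumption, not a description of actual SIPP attrition.

```python
INTERVIEW_INTERVAL_MONTHS = 4   # from the text


def waves_in_panel(panel_months, interval_months=INTERVIEW_INTERVAL_MONTHS):
    """Number of interview waves in a panel of the given length."""
    return panel_months // interval_months


def cumulative_retention(per_wave_retention, n_waves):
    """Share of the original sample still responding after n_waves,
    assuming the same retention rate at every wave (a simplification)."""
    return per_wave_retention ** n_waves


if __name__ == "__main__":
    pre_1996_waves = waves_in_panel(32)        # typical panel before 1996
    redesigned_waves = waves_in_panel(4 * 12)  # 4-year panel after the 1996 redesign
    print("Waves, pre-1996 panel:", pre_1996_waves)
    print("Waves, 1996 panel:    ", redesigned_waves)

    hypothetical_retention = 0.97   # illustrative only, not a SIPP figure
    for waves in (pre_1996_waves, redesigned_waves):
        share = cumulative_retention(hypothetical_retention, waves)
        print(f"After {waves} waves: {share:.1%} of the original sample remains")
```

Even a small loss at each wave compounds over a longer panel, which is why the text cautions that cumulative nonresponse must be taken into account when using data derived over the life of a panel.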

Because of specific congressional mandates on how moneys flow to immigrants, SIPP has included subject modules on Migration History and Immigrant Status, topics which are particularly relevant for analysis of program participation.

Race/ethnicity

The questions shown are the current items for race/ethnicity (1996 Panel). Race is categorized into four major groups -- White; Black; American Indian, Eskimo, and Aleut; and Asian and Pacific Islander. The interviewer provides the respondent with a flashcard containing the categories, and asks,

"Which of the categories on this card best describes ....'s race?"

The race as reported by the respondent is entered. If the person reports a race not listed, the response is entered in the "other race" category, and subsequently edited into one of the four groups. If more than one race is reported or the respondent is uncertain, the interviewer next asks,

"Which race does...most closely identify with?"

and records the race reported. If the respondent is unable to provide a single response, the race of the person's mother is recorded. If the respondent reports a multiple race for the mother, the first race originally mentioned is recorded.
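
The recording rules described above can be read as a short decision procedure. The sketch below is one possible rendering, written only to make the order of the steps explicit. The function name, argument names, and the "Other race" placeholder are assumptions; the subsequent editing of "other race" entries into one of the four groups is not specified in the text and is not modeled.

```python
MAJOR_GROUPS = (
    "White",
    "Black",
    "American Indian, Eskimo, and Aleut",
    "Asian and Pacific Islander",
)


def _code(race):
    """Enter a listed race as reported; anything else goes to 'Other race'
    (later edited into one of the four groups, a step not modeled here)."""
    return race if race in MAJOR_GROUPS else "Other race"


def record_race(reported, closest=None, mother_races=None):
    """Hypothetical rendering of the SIPP race-recording rules.

    reported     -- races given in answer to the flashcard question
                    (empty list if the respondent is uncertain)
    closest      -- answer to "Which race does ... most closely identify with?",
                    or None if no single response could be given
    mother_races -- races reported for the person's mother, in order mentioned
    """
    if len(reported) == 1:
        return _code(reported[0])
    if closest is not None:
        return _code(closest)
    if mother_races:
        # If several races are reported for the mother, the first one mentioned is used.
        return _code(mother_races[0])
    return None  # the text does not say what happens if no race can be obtained


if __name__ == "__main__":
    print(record_race(["White"]))
    print(record_race(["White", "Black"], closest="Black"))
    print(record_race(["White", "Black"], mother_races=["Asian and Pacific Islander", "White"]))
```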

A separate flashcard is used in connection with the question on ethnicity. This card lists 34 different ethnic categories as options, including nine categories for Spanish origin, one for American Indians, Eskimo or Aleut, and one for African-American. None of the specific subgroups comprising Asians or Pacific Islanders is listed. The flashcard question is,

"Which of the categories on this card best describes ....'s origin or descent?"

The instructions state:

Enter the origin as reported by the respondent. If the person reports more than one origin, ask him/her to select only one choice, and enter that code.

The Hispanic categories listed on the flashcard are:

  • Mexican
    Mexican-American
    Chicano
    Puerto Rican
    Cuban
    Central American
    South American
    Dominican Republic
    Other Hispanic

All of these detailed codes are recorded on the public use file.

Interviewing policy

In areas containing large numbers of non-English speaking respondents, the Census Bureau generally attempts to locate, hire, and train members of the group who are bilingual, and they are assigned as needed. Where a bilingual speaker is not available, the interviewer attempts to locate another member of the household who is bilingual to assist in the interview, or arranges for a callback with a bilingual interviewer (or translator).

The survey is administered through a computer assisted personal interview (CAPI). A computerized Spanish-language version of the questionnaire is available.

Sample size

Since SIPP covers the civilian noninstitutional population, the sample sizes shown in Appendix Tables A-1 to A-3 are person counts for all ages. SIPP identifies the Hispanic subgroups but tabulates data only for the Hispanic total; it does not collect (or present) data for the separate API subgroups. SIPP does not oversample any of these race/ethnic groups. However, persons in poverty are oversampled, including some who are in these race/ethnic groups.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Only very limited data are made available either for Hispanics or APIs; no information is provided for Native Americans. At the moment, SIPP has no plans for extending the detail published, either for Hispanics or for APIs.

SIPP public use micro-data files identify the individual Hispanic subgroups; for the API group, however, only the API total is identified.

Revised race/ethnic definitions

Current thinking calls for introducing the new categories, including allowance for multiracial reporting, in 2003 or 2004, consistent with the revised OMB standards for collecting and presenting race/ethnic data.

Agency website address:

www.census.gov

National Health Interview Survey (NHIS)

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

 

Reference date

1998

Introduction

The National Health Interview Survey (NHIS) is an important source of national information on American health indicators, health care access and use, and health-related behaviors. Initiated in 1957, NHIS is a continuous survey of about 41,500 households per year, with data collected throughout the year from weekly subsamples. The survey covers the civilian non-institutionalized population of the United States through personal interviews with household members.

The questionnaire consists of two basic parts: a set of family, adult, and child core questions on health and demographic items, and one or more sets of questions on current health topics. The core items provide continuous information on basic health variables, such as limitations of activity, injuries, health insurance, access to health care, health care utilization, conditions, and behaviors such as tobacco use, physical activity, alcohol use and immunizations. Extensive demographic information for household members, including race/ethnicity, also is collected. Questions on special topical modules change from year to year, but examples of past interest include prevention, dental care, physician services, health insurance, cancer risk factors, child adoption, and functional limitations.

Race/ethnicity

The race/ethnic questions used in the Calendar 1998 NHIS were as follows:

  • (Show Flashcard) Do any of these groups represent {your} national origin or ancestry?
    (Where did {your} ancestors come from?)
    • Puerto Rican
      Cuban
      Cuban American
      Mexican/Mexicano
      Mexican-American
      Chicano
      Hispanic
      Other Latin American (specify)
      Other Spanish or Hispanic (specify)

    (Show Flashcard) What race {do you} consider {yourself} to be?

    • White
      Black/African American
      Indian (American)
      Eskimo
      Aleut
      Chinese
      Filipino
      Hawaiian
      Korean
      Vietnamese
      Japanese
      Asian Indian
      Samoan
      Guamanian
      Other Asian, Pacific Islander (specify)
      Other (specify)

    (If multiple entries in Race) Which one of these groups, that is (Read Groups), would you say BEST represents {your} race?

Interviewing policy

The NHIS policy regarding the use of bilingual interviewers parallels that of the Bureau of the Census, given that the survey is conducted for NCHS by the Bureau. Bilingual interviewers are recruited routinely for those areas known to be predominantly non-English speaking, with Spanish as the most important second language. Where feasible, other members of the household who are bilingual are asked to assist. Other language skills are provided, as the situation requires.

Sample size

The NHIS oversamples blacks and Hispanics, and the sample sizes shown in Appendix Tables A-1 to A-3 reflect the oversampling. Numbers shown are for all ages.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Reports prepared from the NHIS present limited data for the major race/ethnic groups; the relatively small samples for the subgroups, however, preclude the presentation of data for these individual groups. On occasion, NCHS has prepared special reports for a major race/ethnic group (e.g., Health Status of Asian Americans: United States, 1992-1994) by using multi-year average data.

Public use micro-record files are provided annually from the survey; the full race/Hispanic origin detail is among the items included on the individual record.

Revised race/ethnic definitions

All of the subgroups of interest were identified in the race/ethnic questions included in the 1998 NHIS. The Calendar 1999 NHIS race/ethnicity questions have been modified to reflect the changes introduced in the revised OMB standards for collecting race/ethnic data.

Agency website address:

www.cdc.gov/nchs/default.htm

National Vital Statistics System - Natality

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

 

Reference date

1997

Introduction

The birth component of the national vital statistics system in the United States is comprehensive, covering all political jurisdictions in the country, as well as Puerto Rico, the Virgin Islands, Guam, American Samoa, and the Commonwealth of the Northern Mariana Islands. National statistics derived from the system include the 50 states and the District of Columbia, and data are available for the other jurisdictions as well. The system is cooperative in the sense that local registration offices receive notices of vital events and are responsible for issuing certified copies of vital records (e.g., birth and death certificates), which are state adaptations of the model certificates (U.S. Standard Certificates) developed jointly by the states and the Public Health Service. The states, having received copies of the certificates from the registration officials, process and send them to the National Center for Health Statistics for consolidation into a national database. The standard certificate of birth contains a wide variety of information about the mother and child, including maternal and infant health characteristics, information on tobacco and alcohol use during pregnancy, obstetric procedures, method of delivery, and abnormal conditions of the newborn. The race or national origin of the mother and father also is obtained and, beginning in 1989, the certificate added a Hispanic identifier for the mother and father. This information is not obtained for the child.

Race/ethnicity

The following two items, completed separately for the mother and the father, appear on the Certificate of Live Birth:

  • Of Hispanic Origin? (Specify No or Yes—If Yes, specify Cuban, Mexican, Puerto Rican, etc.)

    Race—American Indian, Black, White, etc. (specify)

For statistical purposes, the following categories for race are separately identified:

  • White
    Black
    American Indian
    Chinese
    Japanese
    Hawaiian
    Filipino
    Other Asian or Pacific Islander

At present, a total of nine states, which contain about two-thirds of the U.S. population of these additional API groups, code births as Vietnamese, Asian Indian, Korean, Samoan, Guamanian, and other API groups. Subgroup data for Hispanics are available for all states and are tabulated.

Interviewing policy

Registration of births is an administrative system, thus not requiring the use of interviewers.

Sample size

The data shown in Appendix Tables A-1 to A-3 represent the total numbers of births reported for the respective race/ethnic group.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Data on births, by selected race/ethnic group, are released regularly in published and electronic form, including the Internet. NCHS has issued reports dealing with births to mothers of Hispanic origin, which show subgroup detail for Mexican, Puerto Rican, Cuban, and Central and South American mothers. More limited data are available for the API populations because population data for these groups, which are needed as denominators for rates, are available only in Census years, not annually. However, numbers and percent distributions of births by all characteristics are available for the API subgroups. A public use file is provided annually, which includes the demographic information from all unit records.

Revised race/ethnic definitions

NCHS has convened a panel of experts to develop a revised, standard certificate to serve as a model for states. It is hoped that this effort will result in race/ethnic detail consistent with the revised OMB standards for collecting race/ethnic data. The timing of the revision is yet to be determined.

Agency website address:

www.cdc.gov/nchs/default.htm

National Vital Statistics System - Mortality

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

Reference date

1997

Introduction

The death component of the national vital statistics system in the United States is comprehensive, covering all political jurisdictions in the country, as well as Puerto Rico, the Virgin Islands, Guam, American Samoa, and the Commonwealth of the Northern Mariana Islands. National statistics derived from the system include the 50 states and the District of Columbia, and data are available for the other jurisdictions as well. The system is cooperative in the sense that local registration offices receive notices of vital events and are responsible for issuing certified copies of vital records (e.g., birth and death certificates), which are state adaptations of the model certificates (U.S. Standard Certificates) developed jointly by the states and the Public Health Service. The states, having received copies of the certificates from the registration officials, process and send coded statistical information without individual identification to the National Center for Health Statistics for consolidation into a national database. The standard certificate of death contains medical and demographic information about the deceased, including age, race, sex, Hispanic origin, marital status, occupation and industry, educational attainment, place of death, and causes of death.

Race/ethnicity

The following two items appear on the Standard Certificate of Death, and are completed for the decedent:

  • Was decedent of Hispanic Origin?
    (Specify No or Yes—If Yes, specify Cuban, Mexican, Puerto Rican, etc.)

    Race-American Indian, Black, White, etc. (specify)

For statistical purposes, the following categories for race are separately identified:

White
Black
American Indian (incl. Eskimo and Aleut)
Chinese
Japanese
Hawaiian
Filipino
Other Asian or Pacific Islander

At present, a total of nine states, which contain about two-thirds of the U.S. population of these additional API groups, code deaths to additional API subgroups, including Vietnamese, Asian Indian, Korean, Samoan, Guamanian, and other API groups. Subgroup data for Hispanics are available for all states and are tabulated.

Interviewing policy

Death registration is an administrative system, thus not requiring the use of interviewers.

Sample size

The data shown in Appendix Tables A-1 to A-3 represent the total numbers of deaths reported for each race/ethnic group.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Data on deaths, by selected race/ethnic group, are released regularly in published and electronic form, including the Internet. NCHS has issued reports dealing with deaths among persons of Hispanic origin, which show subgroup detail for Mexican-American, Puerto Rican, Cuban, and Central and South American decedents, as well as a report on deaths for API subgroups. A public use file containing the universe of death records is provided annually, including the demographic and medical information from the unit records.

Revised race/ethnic definitions

NCHS has convened a panel of experts to develop a revised, standard certificate to serve as a model for states. It is hoped that this effort will result in race/ethnic detail consistent with the revised OMB standards for collecting race/ethnic data. The timing of the revision is yet to be determined.

Agency website address:

www.cdc.gov/nchs/default.htm

National Survey of Family Growth (NSFG)

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

 

Reference date

1995

Introduction

The National Survey of Family Growth (NSFG) is focused on factors affecting pregnancy--including sexual activity, contraceptive use, and infertility--the use of family planning and other medical services such as prenatal care, and the health of women and infants. The survey is conducted at periodic intervals among a national probability sample of some 12,000 civilian noninstitutionalized women 15 to 44 years of age. Beginning with the next cycle (Cycle VI, scheduled for calendar 2001), the study also will include some 7,200 completed interviews with men 15 to 49 years of age. A variety of demographic and socio-economic information also is included in the interview.

Race/ethnicity

The Cycle V survey, conducted in 1995, used the NHIS as the sampling frame. Thus, the detailed NHIS race/ethnicity information is available for NSFG, although not all of it is on the public use tape. The Cycle V survey includes questions on Hispanic origin, as well as on selected subpopulations, such as Mexican, Puerto Rican, Cuban, and Other. Specific wording follows:

  • Is (….) of Hispanic or Spanish origin?
    • YES NO

    Is (….)

    • Puerto Rican
      Cuban
      Mexican
      Or a member of some other group (specify)

The item on race instructs respondents to mark all that apply, but also to indicate the race that best describes the respondent, as follows:

  • Please look at Card. What is (....'s) race? (Code all that apply)
    Alaskan Native or American Indian
    Asian or Pacific Islander
    Black
    White

Interviewing policy

Some Spanish-English bilingual interviewers are hired and made available as needed. Respondents who cannot be interviewed in English or in Spanish are classified as eligible non-respondents. Because of the sensitive content of the interview, family members or other third-party translators are not allowed to be present during the interview. Thus, an eligible person who speaks only a language other than English or Spanish cannot be interviewed in the NSFG.

Sample size

The next cycle plans to oversample both Black and Hispanic men and women 15 to 44 years of age and Hispanic men 15 to 49 years of age, as well as Hispanic women 15 to 44 years, as in recent cycles. The exact amount of oversampling is yet to be determined. The sample sizes shown in Appendix Tables A-1 to A-3 are for Hispanic women 15 to 44 years who were included in Cycle V. The emphasis in this study has been on Black and Hispanic reproductive health.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

All reports contain data cross-tabulated by race and ethnicity. The oversampling of Black and Hispanic women, however, provides both greater reliability and greater detail for these groups. The detail in the reports includes the number of children women have had and the number they expect in the future, intended and unintended births, sexual intercourse, marriage and cohabitation, contraceptive use, infertility, health insurance coverage, family planning, smoking, HIV testing, and sex education.

Public use data files, containing individual records for respondents, are produced for each Cycle; the race/Hispanic data are included on the record.

Revised race/ethnic definitions

Cycle VI, scheduled for calendar 2001, will reflect the revised OMB standards for race/ethnic detail.

Agency website address:

www.cdc.gov/nchs/default.htm

National Immunization Survey (NIS)

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

 

Reference date

1999

Introduction

The National Immunization Survey (NIS) is conducted quarterly by telephone among about 9,000 households, in order to collect specific vaccination data for children 19 through 35 months of age. The sample is spread across some 78 areas, of which 28 are urban places participating in Immunization Action Plans (IAPs); the remainder are the 50 states or parts of states. For example, data for Texas might represent information collected from 4 IAPs in Texas, plus a sample in "the rest of" Texas.

Race/ethnicity

Beginning with the third quarter of 1999, the questions used to collect information on the race/ethnicity of the children and the mother were changed, consistent with the revised OMB standards. The current wording follows:

  • Is the person of Spanish, Hispanic, or Latino descent, that is,
    Mexican
    Mexican-American
    Central American
    South American
    Chicano
    Puerto Rican, or
    Cuban

    Is the person

    White
    Black or African-American
    Native American
    Alaska Native
    Asian
    Native Hawaiian, or
    Other Pacific Islander, or
    Another race (specify)
    (Check all that apply) Don't Know, Refused

    (If more than one answer, ask) Which do you feel best describes the person's race?

    (Categories as above)

Interviewing policy

English- and Spanish-speaking interviewers are used to collect the information. In addition, selected other language skills are available. A Spanish-language version of the questionnaire is available when required. Where the required language skills are not available, an effort is made to obtain the assistance of an English-speaking family member, or AT&T language line translators are used.

Sample size

The information in Appendix Tables A-1 to A-3 is for children 19 to 35 months of age. Even though subpopulation detail is collected, the subpopulations are not identified on the data files.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Historically, the race/ethnicity detail shown in publications has generally been restricted to total Hispanics, Whites, and Blacks. Occasionally, additional detail was provided for American Indians/Alaskan Natives and Asian/Pacific Islanders. The race/ethnic detail to be made available from the questions introduced in late 1999 is yet to be determined. Public use files are not yet available, but are under consideration.

Revised race/ethnic definitions

As noted, as of the third quarter of 1999, NIS is consistent with the revised OMB standards for collecting race/ethnic data. Since the results are published annually, a full four quarters of information is required before publication. However, as indicated in the Task 3 report, the sample sizes for many of the subpopulations are too small to permit meaningful analyses.

Agency website address:

www.cdc.gov/nchs/default.htm

National Health and Nutrition Examination Survey (NHANES)

Sponsoring agency

National Center for Health Statistics
Centers for Disease Control and Prevention
Department of Health and Human Services

 

Reference date

1999

Introduction

The National Health and Nutrition Examination Survey (NHANES) obtains information about the health and nutritional status of a representative national sample of the civilian, non-institutionalized population of all ages through direct interviews, physical examinations which obtain a wide variety of standardized medical information, and selected laboratory analyses. Data collected through NHANES assist in understanding and evaluating new public health issues and technology; risk factors for specific diseases; the relationship between diet, nutrition, and health; trends in risk behaviors and environmental exposures; the prevalence, awareness, treatment, and control of selected diseases; the number and percent of persons in the population and specific subgroups with diseases and risk factors; and in establishing a national probability sample of genetic and other materials.

Examples of the subjects covered in NHANES include prenatal care, birthweight, preschool/child care, current medical conditions, reported pain, physical functioning, immunization status, presence of selected diseases (TB, diabetes, cardiovascular disease, osteoporosis, kidney conditions, respiratory health), blood pressure, vision, audiometry, balance, oral health, diet behavior and nutrition, weight history, smoking and tobacco use, and use of dietary supplements and prescriptions. The medical examination includes a diagnostic interview, body measurements, bone densitometry, a dental examination, vision, hearing, physical fitness, physical functioning, and selected laboratory tests. Extensive demographic and socio-economic information also is collected.

Currently (1999), each year a new sample of 5,000 individuals of all ages is interviewed and examined; results can be aggregated across years to improve reliability of the estimates.

Race/ethnicity

The questions concerning race/ethnicity are consistent with the proposed OMB guidelines; that is, the item on Hispanic origin precedes the race question, and one or more categories may be selected. The specific questions are as follows:

  • (Do you) consider (yourself) Hispanic/Latino?
    • YES
      NO
      Refused

    Please give me the number of the group that represents (your) Hispanic origin or ancestry. Please select one or more categories. (Hand card)

    Puerto Rican
    Cuban/Cuban American
    Dominican (Republic)
    Mexican/Mexicano
    Mexican-American
    Chicano
    Other Latin American (specify)
    Other Hispanic (specify)

    What race (do you) consider (yourself) to be? Please select 1 or more of these categories.

    White
    Black/African American
    Indian (American)
    Alaska Native
    Native Hawaiian
    Guamanian
    Samoan
    Other PI (specify)
    Asian Indian
    Chinese
    Filipino
    Japanese
    Korean
    Vietnamese
    Other Asian (specify)
    Some other race (specify)

    (If more than 1 entry, continue) Which one of these groups, that is (display responses), would you say best represents (your) race?

    Enter race code
    Cannot choose 1 race
    Refused

Interviewing policy

Given the large extent of oversampling of Mexican-Americans, NHANES both ensures the availability of Spanish-speaking interviewers and (as required) makes special efforts to conduct the interview in Spanish. The basic survey form also is available in Spanish. Consequently, about half of all Hispanic households are interviewed in Spanish. If an interviewer encounters a respondent who speaks neither English nor Spanish, a bilingual member of the family, if available, is enlisted as a translator. As needed, other language skills are provided. Most NHANES interviewers are bilingual.

Sample size

Both NHANES III and the current NHANES oversample Mexican-Americans and blacks in order to permit separate analyses of these two race/ethnic groups. NHANES III produced reasonably reliable statistics for a set of 52 age-sex-race/ethnicity groups: 14 each for Mexican-Americans and blacks, and 24 for whites and all other persons. The current NHANES will provide similar detail when several years of data collection are combined. Achieving these goals required approximately equal sample sizes in the 52 groups. Superimposed on this was an oversampling of Mexican-Americans in geographic areas containing high concentrations of Mexican-Americans. The combination of these two features of the sample resulted in sampling rates for Mexican-Americans that varied over a range of 7.5 to 1. Asians and Pacific Islanders were sampled at the same rate as white persons, and their sampling rates varied over a range of 20 to 1. The range among all age-sex-race/ethnicity groups was over 120 to 1. This diversity in sampling rates contributed significantly to the design effects for NHANES statistics for all age-sex groups combined, so the effective sample size is much lower than the nominal sample size, which was already very small for APIs.
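The way unequal sampling rates shrink the effective sample size can be illustrated with a rough sketch. The Python below uses Kish's approximation for the design effect due to unequal weighting, deff = n * sum(w^2) / (sum(w))^2. That formula is an assumption introduced here for illustration; it is not the variance estimation method documented for NHANES, and it ignores clustering and stratification. The weights are hypothetical: half of a sample of 1,000 is given weight 1 and half weight 20, loosely mirroring a 20-to-1 range in sampling rates.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect from unequal weights.

    deff = n * sum(w^2) / (sum(w))^2, which equals 1.0 when all
    weights are equal and grows as the weights become more variable.
    """
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size divided by the approximate design effect."""
    return len(weights) / kish_design_effect(weights)


# Hypothetical weights (not NHANES data): 500 persons sampled at a high
# rate (weight 1) and 500 sampled at one-twentieth of that rate (weight 20).
hypothetical_weights = [1.0] * 500 + [20.0] * 500

print("nominal n:    ", len(hypothetical_weights))
print("design effect:", round(kish_design_effect(hypothetical_weights), 2))
print("effective n:  ", round(effective_sample_size(hypothetical_weights)))
```

Under these assumed weights, a nominal sample of 1,000 behaves roughly like a simple random sample of about 550, which is the kind of loss the paragraph above describes for statistics that combine groups sampled at very different rates.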

In spite of the diversity in sampling rates, the NHANES III sample sizes provide data with fairly good precision for Mexican-Americans in each of the 14 age-sex domains designated by NCHS for separate analysis. On the other hand, the sample size for even the total API population and for Native Americans was quite low, and it was trivial for individual age-sex groups. The current NHANES is expected to follow a similar pattern for combined years of data.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Given the relatively small overall sample size, separate data are not shown for any of the subgroups, other than for Mexican-Americans and Blacks. However, an examination of the feasibility of conducting studies on specific subpopulations in future surveys, to be called "Designated Population HANES," is currently underway. Public use micro-data files are released for each cycle with the race/Hispanic data grouped into categories for analytic purposes.

Revised race/ethnic definitions

As noted, the current NHANES utilizes the revised OMB standards for collecting race/ethnic data.

Agency website address:

www.cdc.gov/nchs/default.htm

Medical Expenditure Panel Survey (MEPS)

Sponsoring agency

Agency for Health Care Policy and Research
Department of Health and Human Services

 

Reference date

1999

Introduction

The Medical Expenditure Panel Survey (MEPS) is a nationally representative survey of health care use, medical expenditures, sources of payment, and insurance coverage for both the U.S. civilian noninstitutionalized population and nursing homes and their residents. Both individual and family level information on health care utilization and expenses are collected.

MEPS comprises four component surveys: the Household Component (HC), the Medical Provider Component (MPC), the Insurance Component (IC), and the Nursing Home Component (NHC). The HC serves as the core survey from which the MPC sample and part of the IC sample are derived. Data from the HC survey are then linked with additional information collected from medical providers, employers, and insurance providers. Respondent panels are drawn from persons interviewed in the National Health Interview Survey; currently, they are interviewed for five rounds over a 2½-year period.

The detailed information collected from households includes demographic characteristics (including race/ethnicity), health conditions, health status, use of medical care services, charges and payments, access to care, satisfaction with care, health insurance coverage, income, and employment. The MPC contacts medical providers identified by respondents to supplement and validate information on reported medical care events. The IC collects data on health insurance plans obtained through employers, unions, or other private health insurance sources. Lastly, the NHC develops information on the characteristics, health care use, and expenditures of nursing home residents and the characteristics of nursing home facilities.

Race/ethnicity

Since MEPS currently (1999) draws its sample from persons interviewed previously in the National Health Interview Survey, the race/ethnicity data collected during the NHIS interview are available. However, the information is also obtained directly during the initial interview. Potentially, then, MEPS has available to it the full race/ethnic detail collected in the NHIS, which is consistent with the revised OMB standards. MEPS currently provides data for selected Hispanic subpopulations, but none for the API subpopulation groups.

Interviewing policy

Bilingual interviewers, especially Spanish-speaking, are used regularly. Other language skills are located as required. The CAPI system contains a Spanish-language version of the interview form.

Sample size

The MEPS sample introduced in 1999 contained 18,028 persons of all ages, selected from the 1998 NHIS. The sample sizes shown in Appendix Tables A-1 to A-3 include both the 1999 panel (reflecting the oversampling of Hispanics and Blacks in the responding NHIS households) and some 12,000 persons from the 1998 panel, which was in its second year of data collection and overlapped the first year of the 1999 panel. Further, "peak" years with larger sample sizes are planned at 5-year intervals. Because of the planned variation in sample size from year to year, users of these data should verify the appropriate sample sizes for the years to be studied.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Published data from MEPS are provided for the major race/ethnicity groups; subpopulation data are not published. Public use data files, containing information for individual respondents, are issued regularly; Hispanic subpopulations (except for the category "Central or South American") are identified on these files, but API subpopulations are not.

Revised race/ethnic definitions

Since, as noted, the MEPS sample is drawn from the NHIS, the revised OMB standards for collecting race/ethnicity will be introduced concurrent with their adoption by NHIS. The detail to be available on the tape file is yet to be determined.

Agency website address:

www.ahcpr.gov

Medicare Current Beneficiary Survey (MCBS)

Sponsoring agency

Health Care Financing Administration
Department of Health and Human Services

 

Reference date

Early 1998

Introduction

The Medicare Current Beneficiary Survey (MCBS) is a continuous, multipurpose survey of a representative national sample of 16,000 beneficiaries, representing the Medicare population. The objective of the study is to determine expenditures and sources of payment for all services used by Medicare beneficiaries, including co-payments, deductibles, and noncovered services; to ascertain all types of health insurance coverage and relate coverage to sources of payments; and to trace processes over time, such as changes in health status, spending down to Medicaid eligibility, and the impacts of program changes.

MCBS covers the entire Medicare population, whether aged or disabled, living in the community or in institutions. It oversamples selected age groups, and it follows and reinterviews the sample to obtain a continuous longitudinal picture. Other features include collecting a wide variety of data on each sample person, including topical supplements; combining survey and administrative data; and being able to retrieve data to respond to urgent Medicare policy issues. Sampled beneficiaries are interviewed in person, using CAPI, three times a year over a 4-year period. The study was initiated in late 1991.

Race/ethnicity

Historically, the survey asked about race in five large groups--White, Black or Afro-American, Asian or Pacific Islander, American Indian, and Other. This was followed by a single question on Hispanic/Latino ethnicity. Beginning in Fall 1998, however, the questions were revised, consistent with the current OMB guidelines, as follows:

  • The next two questions are about ethnicity and race. Are (you) of Hispanic or Latino origin?
    YES
    NO
    Refused
    Don't Know

    Looking at this card, what is (your) race?
    (Code all that apply)

    • American Indian or Alaska Native
      Asian
      Black or African American
      Native Hawaiian or other Pacific Islander
      White
      Another Race (specify)

Interviewing policy

The interviewing staff includes resident bilingual interviewers, especially in areas with high concentrations of Spanish speakers, such as California, Florida, Texas, and Puerto Rico. The basic questionnaire is available in Spanish in hardcopy form, which is used as a guide, with responses entered into the CAPI instrument, which appears only in English. An item at the end of the interview captures whether the interview was conducted in Spanish. When the necessary language skills are not immediately available, translators are obtained. Experience has shown that the need for language skills other than Spanish is quite limited.

Sample size

The sample sizes shown in Appendix Tables A-1 to A-3 are for Medicare beneficiaries as of early 1998, and cover all ages. However, sampling rates vary by age, with overrepresentation of the disabled (generally those under 65 years of age) and the oldest-old (85 or more years of age).

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

There are no separate publications by race. A number of the regular reports contain distributions and cross classification by the major groups. Public use files are released regularly; the collected race/ethnic detail is available on the files.

Revised race/ethnic definitions

No consideration has yet been given to revising the race/ethnic questions further, beyond the changes already made for consistency with the revised OMB standards, in order to provide additional subpopulation detail.

Agency website address:

www.hcfa.gov

National Household Survey on Drug Abuse (NHSDA)

Sponsoring agency

Substance Abuse and
Mental Health Services Administration
Department of Health and Human Services

 

Reference date

1998

Introduction

The National Household Survey on Drug Abuse (NHSDA) provides statistical information on the use of illegal drugs, collected through interviews with a national household sample of persons 12 years old and older. A self-administered portion of the inquiry includes questions on recent use and frequency of use of various licit and illicit drugs, opinions about drugs, problems associated with drug use, perceived need and demand for drug abuse treatment, and drug abuse treatment experience. The interviewer-administered questions cover demographic characteristics and socio-economic background, as well as health status, adult mental health issues, health insurance, utilization of services, and access to health care. Data are collected throughout the calendar year.

The sample size in 1998 was about 24,500 persons; in 1999, the sample size will approach 70,000 CAPI interviews and as many as 15,000 "paper and pencil" interviews. In future years, plans call for limiting the survey to about 70,000 CAPI interviews.

Race/ethnicity

The survey includes the following questions concerning race/ethnic origin:

  • Are you of Spanish/Latino/Hispanic descent?
    YES NO

    Which best describes you? (Mark all)

    Mexican
    Mexican-American
    Mexicano
    Chicano
    Puerto Rican
    Central/South American
    Cuban
    Cuban/American
    Other (specify)

    Which group best describes you? (Mark all)

    White
    Black
    American Indian/Alaskan Native
    Hawaiian
    Pacific Islander
    Chinese
    Filipino
    Japanese
    Asian Indian
    Korean
    Vietnamese
    Other (specify)

Interviewing policy

The CAPI system displays both English- and Spanish-language versions of the interview form, and the staff includes interviewers with English/Spanish language capability. When bilingual capability is not available, other household members are asked to assist, or arrangements are made to call back in order to obtain the interview.

Sample size

The NHSDA sample design used in 1998 oversampled Blacks and Hispanics. Further, the 1998 survey oversampled residents of Arizona and California in order to provide direct survey estimates for these States. The sample sizes shown in Appendix Tables A-1 to A-3 are for 1998, and reflect the oversampling of Hispanics.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Data are presented for three major race/ethnic groups: whites, blacks, and Hispanics. A fourth category, Other, includes Asians and Pacific Islanders, American Indians and Alaskan Natives, and other groups. The Agency recently issued a special report containing extended detail for Hispanics, and may provide additional special reports for individual racial groups in the future. Public use micro-record data files are available; these records contain the detailed information on race/Hispanic origin.

Revised race/ethnic definitions

The current series of questions is consistent with the proposed OMB race/ethnic standards.

Agency website address:

www.samhsa.gov

National Household Education Survey (NHES)

Sponsoring agency

National Center for Education Statistics
Department of Education

 

Reference date

1996

Introduction

The National Household Education Survey (NHES) is designed to provide information on selected educational issues that are best addressed by contacting households directly, rather than schools or other educational institutions. NHES, a national telephone survey of the civilian noninstitutional population, has been conducted at varying periods since 1991 to address such topics as early childhood education, adult education, school readiness, school safety and discipline, parent and family involvement in education, library use, and civic involvement. Between 45,000 and 64,000 households were screened for each of these surveys, and individuals within households who met the predetermined criteria were interviewed to collect the desired information. Beginning in 1996, the potential of the screener was enhanced by an expansion which added a brief set of questions on issues of interest to education program administrators and policymakers. Members of all screened households also were asked to provide educational and demographic information, including race/ethnic detail, thus providing national and state estimates of household characteristics.

Race/ethnicity

The following questions are used currently to develop the information on race/ethnicity:

  • Are you… (If R gives ethnicity (e.g., Hispanic), probe for race)
    • White
      Black
      American Indian or Alaskan Native
      Asian or Pacific Islander
      Some other race

      Hispanic/Latino/Mexican/Spanish
      Puerto Rican

      More than one race/biracial/multiracial
      Other (specify)

    Are you of Hispanic origin?

    • YES NO

    A test of the proposed, revised race/ethnic questions was conducted in NHES:1999, using the following items:

    Please tell me, are you of Hispanic or Latino origin?

    • YES NO

    Now I am going to read you a list of racial groups. After you have heard the list, you may choose one or more that apply to you. Are you…

    • White
      Black or African American
      American Indian or Alaska Native
      Asian, or
      Native Hawaiian or Other Pacific Islander
      Other (specify)

Interviewing policy

NHES is conducted in English and in Spanish, as required. The questionnaires are available on the CATI system in a Spanish-language version, with bilingual interviewers trained to complete the interview in either English or Spanish. Telephone calls may be answered by someone who does not speak English. If the initial interviewer is functional in the respondent's language (usually Spanish), the interview is carried out immediately. If not, the interviewer notes the case as a "language problem" and, if the language is recognized, records it. In such cases, efforts are made to identify and locate an English- (or Spanish-) speaking household member to assist with the interview; failing that approach, translators or persons with the required language skill are used to complete the interview. In NHES:1996, only about 2 percent of the almost 56,000 screeners were not conducted in English.

Sample size

The sample sizes shown in Appendix Tables A-1 to A-3 are for the 1996 screening operation.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Key statistics are presented by the major race/ethnicity categories. Public use micro-record data files are prepared for each survey cycle; the race/Hispanic origin detail, as collected, is included on the public use file.

Revised race/ethnic definitions

As noted above, a test of the revised race/ethnic questions has been conducted. It is expected that this approach will be introduced into the cycle of NHES scheduled for 2001.

Agency website address:

www.nces.ed.gov

Early Childhood Longitudinal Study - Birth Cohort (ECLS-B)

Sponsoring agency

National Center for Education Statistics
Department of Education

 

Reference date

2000

Introduction

The Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), is designed to track the development of 15,000 children born in the year 2000 through their first grade year of school. To be initiated in Fall 2000, its objective is to study the "whole child," including health, early learning, physical, cognitive, social, emotional and early educational experiences of young children. In sum, the primary objectives of ECLS-B are:

  • To understand the growth and development of children in critical domains;
  • To understand how children transition to out-of-home programs and to school; and
  • To understand children's school readiness.

Multiple types of data will be collected at multiple points, with the first data collection taking place some 9 months after birth. Subsequent data collection will occur at 18 months, 30 months, 48 months, kindergarten, and first grade. In all, six data collection methods are planned--use of information contained on the birth certificate, a parent/guardian interview in the home at each data point, administration of a battery of assessments to the child, and information from care providers, preschool teachers, first grade teachers, and school administrators. During the study, information also will be obtained from residential fathers about their interaction with the children.

Race/ethnicity

Initially, race/ethnicity for the parents will be collected directly from the birth certificate. Since the certificate does not record the child's race, the child will be assigned the mother's race. At the first interview with the parent, race/ethnicity will be obtained directly. The specific questions and response categories are not yet available. However, the response items are expected to be, for the most part, the same as those used in ECLS-K.

Interviewing policy

Given that the study is still in an early stage, the exact policy for dealing with each of the groups to be contacted has not yet been finalized. As a general policy, however, bilingual staff will be used as needed to obtain information from non-English speaking respondents. At a minimum, both Spanish- and Chinese-speaking interviewers will be recruited; other language skills will be available as required. All CAPI instruments will be available in Spanish, as will paper instruments. A paper instrument in Chinese is under consideration.

Sample size

The sample sizes shown in Appendix Tables A-1 to A-3 refer to new births selected for the Year 1 sample. As noted above, the initial detail on race/ethnicity will be drawn from the birth certificate; since the information is not available for the child, the mother's detail will be ascribed to the child, and updated at the time of the first interview with the parent/guardian. The first interview also will obtain the subgroup detail for both Hispanics and Asian and Pacific Islanders (API). The API group as a whole was oversampled by about 10 percent, whereas the Chinese were oversampled by a factor of 3, in order to provide a sufficient sample for separate analysis.

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

Plans for tabulation and publication are still being developed, but it is expected that limited data will be available for Hispanics and Asians, as well as for Mexican-Americans, Pacific Islanders, and those who classify themselves in more than one racial category. It is too early to speculate about the availability of public use micro-data tapes.

Revised race/ethnic definitions

Given the longitudinal nature of this effort, the sponsor, NCES, will face the need to reflect the proposed revised OMB race/ethnic standards. The questions and response categories currently planned for use in this effort, however, appear consistent for the most part with the expected guidelines, including an allowance for the reporting of multiple racial/ethnic categories.

Agency website address:

www.nces.ed.gov

Early Childhood Longitudinal Study - Kindergarten (ECLS-K)

Sponsoring agency

National Center for Education Statistics
Department of Education

 

Reference date

Fall 1998

Introduction

The Early Childhood Longitudinal Study, Kindergarten Class of 1998-1999 (ECLS-K) is designed to track the performance of some 22,000 children from kindergarten through the fifth grade. Initiated in Fall 1998, its objective is to study the "whole child," including health, social and emotional development, and educational experiences. The study will explore the many factors, such as school, family, and child characteristics, that explain the differential levels of school success for specific subgroups. The conceptual model, collaboratively designed by health, human services, and education experts, frames the study. Information from this longitudinal effort will provide insight into the multiple ways that the family, child, school, and community interact to explain children's progress and development as they move from kindergarten through the third grade.

The study has three primary purposes and four issues of interest: To understand the roles of families and schools in collaboratively supporting children's education; to understand how teachers work with children from diverse backgrounds; to investigate teacher and school expectations. The issues of specific interest are: describing the levels of school skills and knowledge that children possess when they enter kindergarten; examining three crucial transitions--from home to kindergarten, from kindergarten to first grade, and from first to later grades; inspecting how kindergarten experiences relate to later school performance; and, describing growth in mathematics, reading, and general knowledge as it relates to teacher practices.

Information will be collected on four levels--from the child; from parents and guardians; from teachers; and from school administrators.

Race/ethnicity

The following questions develop the detail on race/ethnic origin:

  • Parent Interview (CAPI Protocol)

    Is (name) of Hispanic or Latino origin?    YES       NO

    • What is (name's) race?
      American Indian or Alaskan Native
      Asian
      African-American
      Native Hawaiian or other Pacific Islander
      White
      Other race (specify)

    First Grade Spring Parent Interview

    In the last interview, you indicated that (child/you/child's mother/child's father)(was/were) Hispanic. Which Spanish/Hispanic/Latino group best describes (child/you/child's mother/child's father)?

    • Mexican, Mexican-American, Chicano
      Puerto Rican
      Cuban
      Other Spanish/Hispanic/Latino Group (specify)

    In our last interview, you indicated that (reference person) was/were Asian. Which Asian group best describes (appropriate reference)?

    • Asian Indian
      Chinese
      Filipino
      Japanese
      Korean
      Vietnamese
      Other

    In our last interview, you indicated that (reference person) was/were a Pacific Islander. Which Pacific Islander group best describes (reference person)?

    • Native Hawaiian
      Guamanian or Chamorro
      Samoan
      Other

Interviewing policy

Language minority children will be identified through a Home Language Survey; classroom teachers also will be asked questions about the child's home language. Children with home language other than English will complete the Oral Language Development Scale (OLDS). Above a given score, children will be assessed in English; Hispanic children below the score will be assessed using Spanish language subtests. Current plans call for bilingual interviewers to be available as needed for all other interviews and, to the extent possible, in the required languages. A Spanish language version of the questionnaire is available in paper format; for other languages, the interviewer does direct translation for the respondent.
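The routing rule described above can be sketched as a short decision function. This is an illustration only, not the study's actual procedure: the function name, the return labels, and the `cut_score` parameter are hypothetical, since the text refers only to "a given score." Treating a score exactly at the cut as "above" is also an assumption, and the text does not say how non-Hispanic children below the cut score are assessed, so that case is returned as unspecified.

```python
from typing import Optional


def assessment_route(home_language_is_english: bool,
                     is_hispanic: bool,
                     olds_score: Optional[float],
                     cut_score: float) -> str:
    """Return a label for how one child would be assessed.

    cut_score is a hypothetical parameter; the report gives no value.
    """
    # Children whose home language is English do not take the OLDS.
    if home_language_is_english:
        return "english"

    # Children with another home language need an OLDS score first.
    if olds_score is None:
        raise ValueError("OLDS score is required for language minority children")

    # Assumption: a score at or above the cut counts as "above a given score".
    if olds_score >= cut_score:
        return "english"

    # Below the cut score, only Hispanic children have a stated route.
    if is_hispanic:
        return "spanish_subtests"

    # The report does not describe this case.
    return "not_specified"


# Usage with an arbitrary cut score of 20, chosen only for illustration.
print(assessment_route(False, True, olds_score=12, cut_score=20))
```

Passing the cut score in as an argument, rather than fixing it in the code, keeps the sketch from implying a value the report does not state.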

Sample size

The sample sizes shown in Appendix Tables A-1 to A-3 refer to children of kindergarten age. As noted above, the subgroup detail on race/ethnicity is not yet available, but will be collected as part of the First Grade interview. The Asian/PI group was oversampled by a factor of 3.
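To illustrate what oversampling "by a factor of 3" implies for analysis, the sketch below down-weights an oversampled group by its oversampling factor before computing its share of the population. It is a simplified illustration under stated assumptions: the sample counts are invented for the example, and actual ECLS-K weights would also reflect multi-stage selection probabilities and nonresponse adjustments that are not described here.

```python
from typing import Dict


def weighted_share(sample_counts: Dict[str, int],
                   oversampling_factors: Dict[str, float],
                   group: str) -> float:
    """Estimate a group's population share from an oversampled sample.

    Each group's count is divided by its oversampling factor (default 1.0),
    which is equivalent to applying a relative base weight of 1 / factor.
    """
    weighted = {}
    for name, count in sample_counts.items():
        factor = oversampling_factors.get(name, 1.0)
        if factor <= 0:
            raise ValueError("oversampling factor must be positive")
        weighted[name] = count / factor
    return weighted[group] / sum(weighted.values())


# Invented counts: 30 of 100 sampled children are in the oversampled group.
counts = {"Asian/PI": 30, "All other": 70}
factors = {"Asian/PI": 3.0}

print(weighted_share(counts, factors, "Asian/PI"))  # 0.125
```

In this invented example the group makes up 30 percent of the sample but only 12.5 percent of the weighted total, which is why unweighted tabulations of an oversampled design would overstate the group's share.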

Publications of data for Hispanics, Asians, Pacific Islanders, and Native Americans

The initial phase of data collection has just been completed; plans for tabulation and publication are still being developed, but it is expected that limited data for Hispanics and Asians will be made available. The availability of data for subpopulations and of public use micro-data tapes is yet to be determined.

Revised race/ethnic definitions

Given the longitudinal nature of this effort, the need to reflect the proposed revised OMB race/ethnic definitions will be faced as directed by NCES and OMB. The questions and response categories planned for use in this effort, however, appear consistent for the most part with the expected guidelines, including an allowance for the reporting of multiple racial/ethnic categories.

Agency website address:

www.nces.ed.gov

Appendix C: Standards for Maintaining, Collecting, and Presenting Federal Data on Race and Ethnicity

(Excerpt from Federal Register, October 30, 1997)

This classification provides a minimum standard for maintaining, collecting, and presenting data on race and ethnicity for all Federal reporting purposes. The categories in this classification are social-political constructs and should not be interpreted as being scientific or anthropological in nature. They are not to be used as determinants of eligibility for participation in any Federal program. The standards have been developed to provide a common language for uniformity and comparability in the collection and use of data on race and ethnicity by Federal agencies.

The standards have five categories for data on race: American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, and White. There are two categories for data on ethnicity: "Hispanic or Latino," and "Not Hispanic or Latino."

  1. Categories and Definitions

    The minimum categories for data on race and ethnicity for Federal statistics, program administrative reporting, and civil rights compliance reporting are defined as follows:

    • American Indian or Alaska Native. A person having origins in any of the original peoples of North and South America (including Central America), and who maintains tribal affiliation or community attachment.
    • Asian. A person having origins in any of the original peoples of the Far East, Southeast Asia, or the Indian subcontinent including, for example, Cambodia, China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam.
    • Black or African American. A person having origins in any of the black racial groups of Africa. Terms such as "Haitian" or "Negro" can be used in addition to "Black or African American."
    • Hispanic or Latino. A person of Cuban, Mexican, Puerto Rican, South or Central American, or other Spanish culture or origin, regardless of race. The term, "Spanish origin," can be used in addition to "Hispanic or Latino."
    • Native Hawaiian or Other Pacific Islander. A person having origins in any of the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands.
    • White. A person having origins in any of the original peoples of Europe, the Middle East, or North Africa.

    Respondents shall be offered the option of selecting one or more racial designations. Recommended forms for the instruction accompanying the multiple response question are "Mark one or more" and "Select one or more."

  2. Data Formats

    The standards provide two formats that may be used for data on race and ethnicity. Self-reporting or self-identification using two separate questions is the preferred method for collecting data on race and ethnicity. In situations where self-reporting is not practicable or feasible, the combined format may be used.
    In no case shall the provisions of the standards be construed to limit the collection of data to the categories described above. The collection of greater detail is encouraged; however, any collection that uses more detail shall be organized in such a way that the additional categories can be aggregated into these minimum categories for data on race and ethnicity.
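The aggregation requirement above can be sketched as a lookup from detailed categories to the minimum race categories. This is a minimal sketch, not an official crosswalk: the detailed groups are those listed in the ECLS-K parent interview earlier in this inventory, and the names `DETAIL_TO_MINIMUM` and `aggregate_to_minimum` are hypothetical. Rejecting any unmapped category is one way to check that a more detailed collection can still be rolled up.

```python
from collections import Counter
from typing import Dict

MINIMUM_RACE_CATEGORIES = (
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
)

# Illustrative mapping only; a real collection would define its own.
DETAIL_TO_MINIMUM = {
    "Asian Indian": "Asian",
    "Chinese": "Asian",
    "Filipino": "Asian",
    "Japanese": "Asian",
    "Korean": "Asian",
    "Vietnamese": "Asian",
    "Native Hawaiian": "Native Hawaiian or Other Pacific Islander",
    "Guamanian or Chamorro": "Native Hawaiian or Other Pacific Islander",
    "Samoan": "Native Hawaiian or Other Pacific Islander",
}


def aggregate_to_minimum(detail_counts: Dict[str, int]) -> Dict[str, int]:
    """Roll detailed category counts up to the minimum race categories."""
    totals: Counter = Counter()
    for category, count in detail_counts.items():
        if category in MINIMUM_RACE_CATEGORIES:
            minimum = category
        elif category in DETAIL_TO_MINIMUM:
            minimum = DETAIL_TO_MINIMUM[category]
        else:
            # A category that cannot be aggregated would not meet the standard.
            raise ValueError(f"no minimum category defined for {category!r}")
        totals[minimum] += count
    return dict(totals)


print(aggregate_to_minimum({"Chinese": 5, "Korean": 2, "Samoan": 1, "White": 4}))
```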
    With respect to tabulation, the procedures used by Federal agencies shall result in the production of as much detailed information on race and ethnicity as possible. However, Federal agencies shall not present data on detailed categories if doing so would compromise data quality or confidentiality standards.

    1. Two-question format

      To provide flexibility and ensure data quality, separate questions shall be used wherever feasible for reporting race and ethnicity. When race and ethnicity are collected separately, ethnicity shall be collected first. If race and ethnicity are collected separately, the minimum designations are:

      Race:

      • American Indian or Alaska Native
      • Asian
      • Black or African American
      • Native Hawaiian or Other Pacific Islander
      • White

      Ethnicity:

      • Hispanic or Latino
      • Not Hispanic or Latino

      When data on race and ethnicity are collected separately, provision shall be made to report the number of respondents in each racial category who are Hispanic or Latino.

      When aggregate data are presented, data producers shall provide the number of respondents who marked (or selected) only one category, separately for each of the five racial categories. In addition to these numbers, data producers are strongly encouraged to provide the detailed distributions, including all possible combinations, of multiple responses to the race question. If data on multiple responses are collapsed, at a minimum the total number of respondents reporting "more than one race" shall be made available.
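The minimum tabulations for the two-question format can be sketched as follows. This is an illustrative sketch under stated assumptions, not a prescribed procedure: the record layout (a set of marked races plus a Hispanic or Latino flag) and the function name are hypothetical, respondents who marked no race are simply skipped, and the Hispanic or Latino counts are shown only for single-race respondents, which is a simplification.

```python
from typing import Dict, Iterable, Set, Tuple

RACES = (
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
)


def tabulate_two_question(responses: Iterable[Tuple[Set[str], bool]]) -> Dict:
    """Count single-race responses by category and multiple-race responses.

    Each response is (races_marked, is_hispanic_or_latino).
    """
    single = {race: 0 for race in RACES}
    single_hispanic = {race: 0 for race in RACES}
    more_than_one = 0

    for races_marked, is_hispanic in responses:
        unknown = set(races_marked) - set(RACES)
        if unknown:
            raise ValueError(f"unrecognized race categories: {sorted(unknown)}")
        if len(races_marked) == 1:
            (race,) = races_marked
            single[race] += 1
            if is_hispanic:
                single_hispanic[race] += 1
        elif len(races_marked) > 1:
            # Collapsed count of everyone reporting "more than one race".
            more_than_one += 1

    return {
        "single_race": single,
        "single_race_hispanic": single_hispanic,
        "more_than_one_race": more_than_one,
    }


sample = [
    ({"Asian"}, False),
    ({"White"}, True),
    ({"White", "Asian"}, False),
    ({"White"}, False),
]
print(tabulate_two_question(sample)["more_than_one_race"])  # 1
```

A fuller tabulation would also report each combination of multiple responses, which the standards strongly encourage but do not require.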

    2. Combined format

      The combined format may be used, if necessary, for observer-collected data on race and ethnicity. Both race (including multiple responses) and ethnicity shall be collected when appropriate and feasible, although the selection of one category in the combined format is acceptable. If a combined format is used, there are six minimum categories:

      • American Indian or Alaska Native
      • Asian
      • Black or African American
      • Hispanic or Latino
      • Native Hawaiian or Other Pacific Islander
      • White

      When aggregate data are presented, data producers shall provide the number of respondents who marked (or selected) only one category, separately for each of the six categories. In addition to these numbers, data producers are strongly encouraged to provide the detailed distributions, including all possible combinations, of multiple responses. In cases where data on multiple responses are collapsed, the total number of respondents reporting "more than one race" (regardless of ethnicity) shall be provided.

  3. Use of the Standards for Record Keeping and Reporting

    The minimum standard categories shall be used for reporting as follows:

    1. Statistical reporting
      These standards shall be used at a minimum for all federally sponsored statistical data collections that include data on race and/or ethnicity, except when the collection involves a sample of such size that the data on the smaller categories would be unreliable, or when the collection effort focuses on a specific racial or ethnic group. Any other variation will have to be specifically authorized by the Office of Management and Budget (OMB) through the information collection clearance process. In those cases where the data collection is not subject to the information collection clearance process, a direct request for a variance shall be made to OMB.
    2. General program administrative and grant reporting
      These standards shall be used for all Federal administrative reporting or record keeping requirements that include data on race and ethnicity. Agencies that cannot follow these standards must request a variance from OMB. Variances will be considered if the agency can demonstrate that it is not reasonable for the primary reporter to determine racial or ethnic background in terms of the specified categories, that determination of racial or ethnic background is not critical to the administration of the program in question, or that the specific program is directed to only one or a limited number of racial or ethnic groups.
    3. Civil rights and other compliance reporting
      These standards shall be used by all Federal agencies in either the separate or combined format for civil rights and other compliance reporting from the public and private sectors and all levels of government. Any variation requiring less detailed data or data which cannot be aggregated into the basic categories must be specifically approved by OMB for executive agencies. More detailed reporting which can be aggregated to the basic categories may be used at the agencies’ discretion.
  4. Presentation of Data on Race and Ethnicity

    Displays of statistical, administrative, and compliance data on race and ethnicity shall use the categories listed above. The term "nonwhite" is not acceptable for use in the presentation of Federal Government data. It shall not be used in any publication or in the text of any report.
    In cases where the standard categories are considered inappropriate for presentation of data on particular programs or for particular regional areas, the sponsoring agency may use:

    • The designations "Black or African American and Other Races" or "All Other Races" as collective descriptions of minority races when the most summary distinction between the majority and minority races is appropriate.
    • The designations "White," "Black or African American," and "All Other Races" when the distinction among the majority race, the principal minority race, and other races is appropriate.
    • The designation of a particular minority race or races, and the inclusion of "White" with "All Other Races" when such a collective description is appropriate.

    In displaying detailed information that represents a combination of race and ethnicity, the description of the data being displayed shall clearly indicate that both bases of classification are being used.

    When the primary focus of a report is on two or more specific identifiable groups in the population, one or more of which is racial or ethnic, it is acceptable to display data for each of the particular groups separately and to describe data relating to the remainder of the population by an appropriate collective description.

  5. Effective Date

    The provisions of these standards are effective immediately for all new and revised record keeping or reporting requirements that include racial and/or ethnic information. All existing record keeping or reporting requirements shall be made consistent with these standards at the time they are submitted for extension, or not later than January 1, 2003.
