-
Introduction to Part I
-
A vast body of research shows important differences among segments of the population on virtually all aspects of health and health care, including patterns of disease and disability, use of services, and quality and outcomes of care. Documenting such differences is an essential starting point for a wide array of policies and interventions to improve people’s health. Biological, cultural, historical, and socioeconomic differences among segments of the population may create distinctive patterns of health care needs and differences in the use of and responses to medical services. Understanding these patterns and differences is impossible unless researchers can separate and compare data from various segments of the population. That is difficult when those population segments are small or hard to identify. This is a particular concern when the small population in question has special vulnerabilities or may be subject to inequitable treatment. To date, the federal government’s very substantial data collection efforts have not generated adequate data about some subpopulations because of their small size or their distribution (either great concentration or lack of concentration) or because of insufficiently standardized ways of identifying the population in a survey context.
The small size of some populations means they may not be included in numbers sufficient for separate analyses in federal surveys. Also, information identifying some small populations may not be routinely included in the medical records and insurance claims that are another source of data. To illustrate the different research and methodological challenges facing research on small populations, this report focuses on four case examples—Asian-American subpopulations; lesbian, gay, bisexual, and transgender (LGBT) populations; adolescents with autism spectrum disorders (ASDs); and residents of rural areas. This report is about why research is needed about small populations such as those that we have chosen and about the challenges that small populations pose for research; we make no attempt here to report comprehensively on the health and health care needs of the four populations. We also recognize that many other relatively small populations may have special health care needs or pose particular challenges to the health care system. Our cases are illustrative of a more general set of issues.
Advocacy organizations, as well as some researchers and policymakers, have pushed for the collection of more data about various small populations, including the examples we focus on in this report. With the growing use of electronic health data in the provision of medical care, the possibility that such data might be used for research that complements or supplements existing federal data collection activities merits consideration. That is the topic of Part II of this report. For purposes of this report, we define “research” broadly as addressing issues traditionally addressed through clinical, pharmaceutical, health services, public health, public policy, and evaluation research.
-
-
Methodology for Identifying and Exploring Small Populations in This Report
-
In selecting our example small populations, we targeted those that would illustrate a broad range of health and health care questions, as well as the challenges encountered in answering those questions with existing federal data sources and the potential of electronic data generated in the course of medical care.
Small populations that need study share characteristics with what are typically considered underserved populations: “poor; uninsured; have limited English language proficiency and/or lack familiarity with the health care delivery system; or live in locations where providers are not readily available to meet their needs.”15 To focus our study, we consulted with government officials at the Agency for Healthcare Research & Quality (AHRQ), the Centers for Disease Control and Prevention’s (CDC’s) National Center for Health Statistics (NCHS), and the Health Resources and Services Administration (HRSA) about populations for which they had received information requests that could not be answered from existing federal data sources. We also reviewed some related National Institutes of Health (NIH) projects, such as the Health Care System Research Collaboratory program.
Once the four study populations were selected, we reviewed past federal surveys to determine the extent to which these populations could be identified in available data sources, and we examined the existing literature for information about their characteristics and their health and health care issues, as well as the reasons why they have been difficult to study in existing federal surveys and with other sources of data.
In addition, we conducted tailored interviews with 16 expert informants whose work has focused on one of our small populations (see Table I.1). Topics in the interview guide were based on issues and concerns raised in available literature and by organizations that serve the populations in question. An initial purposive sample of experts was identified from published sources, advice from the governmental sources mentioned above, and the research team’s knowledge of the field, followed by some snowballing based on suggestions by the experts we were interviewing. Each person gave permission to have the interview recorded, and the interviews were summarized thematically. Particular attention was paid to areas of convergence and divergence among interviews, as well as between interviews and the literature.
-
-
Limitations in Federal Survey Data
-
There are a number of strengths to primary survey data compared to other primary data sources (e.g., focus groups, case studies) and secondary data (e.g., administrative and claims data). Survey data allow the researcher more control over who is included (i.e., sample frame and sample), the kinds of information that are collected from them (e.g., data domains, elements, or specific questions), and key aspects of data elements (e.g., standardization and quality) than do administrative, claims, or other secondary data sources. Consequently, it is often easier to generalize to the nation or other large populations and to replicate survey research.
All research approaches and data sources have limitations, and that is true of survey research. Although many important research questions (e.g., about outcomes of treatment or the consequences of being uninsured) require longitudinal data, most surveys are designed to collect cross-sectional data at a point in time. The Medical Expenditure Panel Survey (MEPS) is a two-year panel and a rare example of a study that attempts to follow cohorts (of households) over time. Such efforts are few and expensive. There are also limitations regarding the kinds of data that can be collected via survey research. For health matters, for example, surveys most often are limited to collecting self-reports about individuals’ overall health status, so the resulting data do not include the kinds of clinical information (e.g., about diagnoses, services and procedures, laboratory results, drugs, genetic information) needed for some kinds of studies. Selection bias, which results from survey respondents’ decisions about whether to participate, can lead to misleading data.16 Self-reported survey data have weaknesses resulting, for example, from limitations in knowledge or from recall bias. Finally, with the exception of highly specialized studies, surveys generally obtain data from too few people to break out separate results for small populations. As a result, even valid inferences drawn about the population (or major segments thereof) based on well-designed survey samples may not apply to small populations such as those we are considering in this report.
General problems with small populations do not necessarily stem from the absolute size of the population, but rather from its size relative to the total population (or sampling frame) from which the survey sample is drawn. Samples sized to collect information on the general population of Americans often include too few members of small populations to describe them accurately. This problem only increases when the aim is to study specific health conditions within these small populations. There are standard approaches to increasing the chances of including people from small populations, such as sampling from a list of group members or using screening questions to increase representation of the groups. However, these strategies are not typically used in national surveys.
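The point about relative size can be made concrete with a little arithmetic. The sketch below estimates how many members of a small population a sample would be expected to contain, and the approximate 95 percent margin of error for a proportion estimated within that subgroup. It assumes simple random sampling with no design effects or nonresponse, which is a simplification of how federal surveys are actually fielded; the function names and the example inputs (a sample of 20,000 and a 1 percent population share) are hypothetical.

```python
import math


def expected_subgroup_count(sample_size: int, population_share: float) -> float:
    """Expected number of subgroup members in a simple random sample."""
    if not 0.0 <= population_share <= 1.0:
        raise ValueError("population_share must be between 0 and 1")
    return sample_size * population_share


def margin_of_error(n: float, proportion: float = 0.5, z: float = 1.96) -> float:
    """Approximate 95% margin of error for a proportion estimated from n respondents."""
    if n <= 0:
        raise ValueError("n must be positive")
    return z * math.sqrt(proportion * (1.0 - proportion) / n)


# Hypothetical inputs: 20,000 sampled adults; subgroup is 1% of the sampling frame.
subgroup_n = expected_subgroup_count(20_000, 0.01)  # about 200 respondents
full_moe = margin_of_error(20_000)                  # under one percentage point
subgroup_moe = margin_of_error(subgroup_n)          # roughly seven percentage points

print(f"expected subgroup respondents: {subgroup_n:.0f}")
print(f"margin of error, full sample:  {full_moe:.3f}")
print(f"margin of error, subgroup:     {subgroup_moe:.3f}")
```

Under these assumptions, an estimate that is precise for the whole sample becomes about ten times less precise for the subgroup, before any further breakdown by health condition.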
Standard “solutions” for getting adequate numbers for analysis from small populations include oversampling17 and combining data from multiple years. But oversampling subgroups may require the researcher to screen out large numbers of people who do not fit the category in order to obtain the sought-after number of those who do. This becomes more costly as the target group’s presence in the population being screened becomes smaller and as the number of needed subgroups (e.g., age, gender, or those using different languages) increases. The smaller a group’s presence in the population being screened, the more calls are needed to obtain the desired number of respondents. Combining data from multiple years becomes problematic if year-to-year changes are taking place within that population or if survey questions change. A third alternative, sampling from an organization that specializes in service to the population in question, raises questions of representativeness.
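The screening burden described above follows from simple division. The sketch below estimates the expected number of people who must be contacted to reach a desired number of completed interviews with members of a target group. It assumes that every contact is screened, that the group's share of the screened population is known, and that a fixed fraction of eligible people agree to respond; the function name and all input values are hypothetical.

```python
def screening_contacts_needed(target_completes: int, prevalence: float,
                              response_rate: float = 1.0) -> float:
    """Expected number of people to screen to obtain target_completes
    eligible, responding members of the target group."""
    if target_completes < 0:
        raise ValueError("target_completes must be non-negative")
    if not 0.0 < prevalence <= 1.0:
        raise ValueError("prevalence must be in (0, 1]")
    if not 0.0 < response_rate <= 1.0:
        raise ValueError("response_rate must be in (0, 1]")
    return target_completes / (prevalence * response_rate)


# Hypothetical scenario: 500 completed interviews wanted; half of eligible people respond.
for share in (0.10, 0.05, 0.01):
    contacts = screening_contacts_needed(500, share, response_rate=0.5)
    print(f"prevalence {share:.0%}: about {contacts:,.0f} screening contacts")
```

Because the expected effort is inversely proportional to prevalence, a group that is one-tenth as common requires roughly ten times as many screening contacts, and each additional subgroup quota repeats that cost.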
In general, the limitations of national surveys for studying small populations can be summarized as issues related to coverage of the target population and issues related to data collection.18 These issues as they relate to our four example populations are presented in Table I.2 and are discussed in greater detail later in this report.
-
-
Population #1: Asian-American Subpopulations
-
“Asian” is one of the five race categories that must be used in the federal government’s surveys and administrative forms under rules of the Office of Management and Budget, but the Asian-American population is quite internally diverse. The 15.5 million Asian Americans who compose about 4.4 percent of the American population include more than 50 different Asian ethnicities and 100 languages. Asian Americans are concentrated in urban areas, particularly in California, New York, and Texas. Which Asian-American subpopulations are found in particular areas varies. Urban areas in California like Los Angeles and San Francisco, as well as eastern areas like New York City, have larger Chinese populations than any other Asian subpopulation, while urban areas in Texas have higher concentrations of Asian Indians and Vietnamese.26 Other local concentrations of Asian subpopulations can increasingly be found throughout the country.27 Between 2000 and 2010, there was a 46 percent increase in the Asian-American population, making it the fastest growing racial group.28
It has been well documented that racial and ethnic minorities receive lower quality health care than non-minorities even after accounting for access-related factors,29 but little of the research on racial/ethnic disparities has focused on Asian Americans. Their health care needs remain poorly understood due to inconsistent definitions used in data collection, lack of disaggregated data about ethnic subgroups, and the uneven geographic distribution of the Asian-American population.30
The commonplace view of Asian Americans as self-sufficient, educated, and upwardly mobile fails to recognize the health needs of Asians overall, as well as their diversity in terms of ethnic background, country of origin, length of time in the United States, and other factors that may affect health and health care.31
Figure I.1 comes from the Palo Alto Medical Foundation Research Institute’s Pan Asian Cohort Study (National Institutes of Health, National Institute of Diabetes and Digestive and Kidney Diseases grant 5R01DK81371), which primarily utilizes electronic health record (EHR) data. The figure shows diabetes prevalence among men in the San Francisco Bay area and provides a vivid example of the differences in health problems among subgroups of the Asian-American population.32 The prevalence rate among Filipino men is more than three times that of Japanese men. It is apparent from these and other data that health needs vary greatly within what is often treated in research as a single racial population.33
Figure I.1. Pan Asian Cohort Study—Preliminary Findings for Diabetes Prevalence
Source: Pan Asian Cohort Study. “Preliminary Findings for Diabetes Prevalence.” Palo Alto Medical Foundation. Accessed March 1, 2013. http://www.pamf.org/pacs/men.jpg.
There is also evidence of health care–related differences within the Asian-American population. Asian immigrants to the United States are less likely than U.S.-born Asians to have health insurance and use health care services.34 Linguistic isolation (living in a household in which no one above age 14 speaks English) may contribute to this. About one-quarter of Asian Americans live in linguistically isolated households, with rates ranging from 10 percent among Filipinos to 45 percent among Vietnamese.35 Not surprisingly, linguistically isolated households tend to be of low socioeconomic status and have poorer access to care and more deprivation of various kinds than do households in which English is spoken. New immigrants from all countries tend to locate near earlier immigrants. This pattern may facilitate access to various kinds of culturally specific goods and services but may produce isolation from the larger society as well as shared exposure to any environmental risk factors that are proximate to their locale.36
The language barriers and cultural differences associated with immigrant status create various complexities, including communications difficulties with health care providers, advice that is inconsistent with cultural beliefs and practices, and dissatisfaction with or distrust of medical advice.37 Imperfect language translation and nuance can create confusion. Language and cultural isolation of immigrant or non-English speaking groups may present barriers to care-seeking and treatment.38 Behavioral health issues—stress, smoking, domestic violence, alcohol abuse—may also be associated with these factors.
There is a need for better information about subpopulations of Asian Americans, as can be illustrated by considering the examples of Vietnamese and Filipinos in the United States.
-
-
Population #2: Lesbian, Gay, Bisexual, and Transgender People
-
The health and health needs of lesbian, gay, bisexual, and transgender people are not well documented. Even basic information is hard to come by. As a recent Institute of Medicine report puts it, “it has been an ongoing challenge for researchers to collect reliable data from sufficiently large samples to assess the demographic characteristics of LGBT populations.”82 This project mainly focuses on the health and health needs of lesbian, gay, and bisexual people. The transgender population has a host of separate issues around classification, health problems, and provider relations that are not well researched.83
To start with the basics, federal and non-federal survey-based estimates of numbers of lesbian, gay, bisexual, and transgender people have varied by gender, over time, and according to survey methods and question wording (see Table I.4 in the Appendix to Part I). Recent estimates put the percentage of the adult population who identify as homosexual, gay, lesbian, or bisexual at about 3.5 percent.84 No such information is available about transgender people. The percentage of adults who identify themselves as lesbian, gay, or bisexual to survey researchers is smaller than the percentage who report having same-sex partners or who report some desire for or attraction to a person of the same sex. The small size of LGBT populations and the sensitivity of results to the wording of questions are among the challenges to studying health issues in these populations via survey research. However, there are many indications that such research is needed.
-
-
Population #3: Adolescents with Autism Spectrum Disorders
-
Autism spectrum disorders (ASDs) are a group of developmental disabilities that range from mild to severe and are characterized by social impairment, difficulty communicating, and repetitive motions or other unusual behaviors.106 These characteristics are usually noticeable before the age of 3 and remain as a lifelong chronic condition with both medical and psychological implications.107 ASDs include autistic disorder, Asperger’s disorder, pervasive developmental disorder–not otherwise specified (PDD-NOS), Rett syndrome, and childhood disintegrative disorder.108 Based on 2008 data from the 14 sites in its Autism and Developmental Disabilities Monitoring Network, the Centers for Disease Control estimates 1 in 88 8-year-old children have ASDs.109 Prevalence in these sites had increased 23 percent from two years earlier and 78 percent since 2002. Although there is disagreement about whether the true prevalence has increased (since guidelines for diagnosis have changed, more services are available, and awareness of ASD has increased), the CDC numbers are based on evaluation records, not parental reports. Measuring ASD prevalence continues to be a challenge due to the complexity of the disorder, the lack of consistent and reliable diagnostic standards, and changes in the definition of such conditions.110 ASD prevalence is about five times higher in boys than in girls (ratio of 4.5 boys to 1 girl). Prevalence is also significantly higher among non-Hispanic white children than among black and Hispanic children. Intellectual ability is highly variable, with 38 percent reported as intellectually disabled, 24 percent as borderline, and 38 percent with average or above average intellectual ability.
There are controversies about what should be included in the category of autism spectrum disorders. The NIH classifies Rett syndrome as an ASD, but some argue that it is more similar to non-autistic spectrum disorders such as fragile X syndrome or Down syndrome. Unlike other ASDs, Rett syndrome also occurs almost exclusively in girls.111 There is also debate over whether Asperger’s disorder is a separate disorder or simply a less severe form of autism.112 The next revision of the American Psychiatric Association’s Diagnostic and Statistical Manual (DSM) will drop individual classifications for autistic disorder, Asperger’s disorder, childhood disintegrative disorder, and PDD-NOS, grouping all of them under “autism spectrum disorder,” a term that is already widely used. The APA has said this change will help “more accurately and consistently diagnose children with autism.” Rett syndrome will be dropped from the DSM altogether. There is concern among the Asperger’s and Rett communities that these changes will result in a loss of identity among individuals with these specific disorders and that they may affect health insurance coverage and school funding for special education.113
The exact causes of ASDs remain unknown, but research suggests genetics and environment both play important roles. Researchers are studying factors such as family medical conditions, parental age and other demographic factors, exposure to toxins, and complications during birth or pregnancy. CDC and IOM studies have found no link to childhood immunizations.114, 115, 116, 117
-
-
Population #4: Residents of Rural Communities
-
Depending on the definition used—particularly degree of proximity to urban areas—the proportion of the U.S. population described as rural ranges from 17 to 49 percent.150, 151, 152 Rural communities are far from uniform, but they are generally less densely populated and more geographically isolated than urban areas. These characteristics result in limited access to services and economic opportunities.153 Compared to the rest of the population, people in rural areas are more likely to live in poverty as a result of low wage jobs and less likely to be highly educated.154 Many rural areas face declining numbers due to the out-migration of younger residents.
-
-
Discussion/Conclusion
-
This report has focused on the need for health information about small populations and the challenges that meeting that need has posed for researchers. To explore these challenges, we considered populations defined by four types of characteristics (sexual orientation and behavior, geography, race and ethnicity, and a health-related condition) that were selected to illustrate the range of problems that face researchers when using existing federal surveys (see Table I.6). In Part II of this report, we examine the potential of data based on electronic health records and related electronic data sources to complement these surveys and overcome some of the problems researchers have historically faced.
In each of our four illustrative populations, we have presented evidence of distinctive health and health care issues that could usefully be better understood by research. Some of these issues pertain to problems and concerns that may characterize the population itself, as with the high rates of diabetes among Filipino Americans, the distance from specialty care that some rural populations face, or the problems posed by the transition to adulthood for adolescents with autism spectrum disorders. Some issues pertain to possible differences and possible disparities relative to other populations or the population at large regarding health conditions, services, or outcomes of care.
Research to address questions about small populations depends on several things. The most fundamental is the ability to identify the population of interest in the data. The second is having data on the independent and dependent variables of interest, as well as relevant co-variates (e.g., education, income) that need to be controlled for. Third, the value of many data sources can be enhanced if researchers are able to link to other data sources. Such linkage requires availability of a unique identifier or a matching algorithm that uses multiple variables. Fourth, some research questions require longitudinal data in which data about the same people can be linked over time. Finally, given resource realities and constraints, ways are needed to conduct research as efficiently and effectively as possible. Primary data collection strategies for getting sufficient numbers of people from small populations can be very expensive.
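Where no unique identifier is available, the multi-variable matching mentioned above can be sketched as an exact match on a key built from several fields. The following is a minimal illustration only, not a method used in this report: the field names (`last_name`, `birth_date`, `zip_code`), the record identifiers, and the sample records are all hypothetical, and real linkage work typically relies on probabilistic methods and privacy safeguards that are not shown here.

```python
from typing import Dict, List, Optional, Tuple

Record = Dict[str, str]
Key = Tuple[str, str, str]


def match_key(record: Record) -> Key:
    """Build a comparison key from several fields (field names are hypothetical)."""
    return (
        record["last_name"].strip().lower(),  # ignore case and stray spaces
        record["birth_date"].strip(),         # assumed to be in ISO format already
        record["zip_code"].strip()[:5],       # compare five-digit ZIP only
    )


def link_records(left: List[Record],
                 right: List[Record]) -> List[Tuple[Record, Optional[Record]]]:
    """Pair each left record with the single right record sharing its key, else None."""
    index: Dict[Key, List[Record]] = {}
    for rec in right:
        index.setdefault(match_key(rec), []).append(rec)
    linked = []
    for rec in left:
        candidates = index.get(match_key(rec), [])
        # Ambiguous keys (more than one candidate) are left unlinked rather than guessed.
        linked.append((rec, candidates[0] if len(candidates) == 1 else None))
    return linked


# Hypothetical records from two sources.
survey_records = [
    {"respondent_id": "S1", "last_name": " Nguyen", "birth_date": "1980-02-14", "zip_code": "94301-1234"},
    {"respondent_id": "S2", "last_name": "Santos", "birth_date": "1975-09-30", "zip_code": "10002"},
]
claims_records = [
    {"claim_id": "C1", "last_name": "NGUYEN", "birth_date": "1980-02-14", "zip_code": "94301"},
    {"claim_id": "C2", "last_name": "Santos", "birth_date": "1975-10-30", "zip_code": "10002"},
]

linked = link_records(survey_records, claims_records)
for left_rec, right_rec in linked:
    print(left_rec["respondent_id"], "->", right_rec["claim_id"] if right_rec else "no link")
```

In this example the first record links despite differences in capitalization and ZIP format, while the second does not because the birth dates disagree. That shows how sensitive exact matching is to data quality in any one field.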
Some national health survey data sets (including the National Survey of Family Growth, National Health and Nutrition Examination Survey, National Health Interview Survey, and Behavioral Risk Factor Surveillance System) contain information about the LGBT population or Asian subpopulations. Although such data may be collected, issues exist that make them difficult to use for research on small populations. Information (e.g., zip codes) that is needed to characterize an individual’s degree of rural-ness is not available in federal public use data sets because of concerns that deductive identification of individual people might be possible. Additionally, validity concerns can be raised about information reported by a parent in household surveys about a condition such as a child’s autism. Survey data may also not include the dependent variables and co-variates needed to answer questions about the health and health care of small populations. Data analysis also requires sufficient numbers, and this can be a problem in survey research and secondary data analysis for people in categories that appear only in small numbers in a large population. This is particularly true when co-variates are considered. The common solutions for this problem all have important drawbacks.
Combining data from surveys conducted in multiple years may yield a sufficiently large analytic sample, but it can produce misleading results if changes are occurring within the population over time. Oversampling a small population in survey research is often feasible, but it can be expensive. Two-stage sampling, starting with a targeted survey, and then a follow-up survey of the target population, can be expensive, and can only be used when the target population is stable and easily identified.201 Web-based surveys are another potential approach, but these are also limited by self-selection bias (due to high nonresponse rates), representativeness issues, and concerns about the reliability and validity of the data collected.202, 203 Finally, focusing the study on a region or setting in which there is a concentration of people who fit the category is an oft-used option for obtaining sufficiently large numbers, but the resulting data may not be representative of the larger population.
Available data sources also have other important limitations. Federal survey research is typically cross-sectional, lending itself poorly to research questions that have a longitudinal dimension. Additionally, survey domains, questions, and response categories may change over time, limiting the ability to use the data longitudinally. Data based on insurance claims may permit data analysis that has a longitudinal dimension, but insurance claims do not typically include information that would permit identifying someone as belonging to an LGBT or an Asian-American subpopulation, and the data are limited to billed services from particular payers.
In sum, policymakers, advocates, or researchers interested in the health and health needs of small populations encounter various barriers to research using existing federal surveys.
A great deal of hope has been placed in the possibility that electronic information generated in the patient care process in organizations that have electronic health records will provide data that can be used for research on small populations, even though the organizations that collect such information at this time are hardly representative. Electronic health records and associated electronic data (e.g., patient-reported health behavior or laboratory or prescription information) have a number of potential benefits, such as the possible inclusion of large numbers of individuals from small populations, the collection of rich information about key process-of-care and outcome variables of interest, the potential for longitudinal study of cohorts of people (e.g., regarding outcomes of care), and the ability to do all of this relatively inexpensively.
In Part II of this report, we explore how electronic health records and other electronic data can be used to strengthen research on these patient populations.
-
-
Appendix to Part I
-
Table I.1. Key Informant Interviews
Pre-Interviews (to identify target populations)
Agency for Healthcare Research & Quality
- Steve Cohen, PhD, Harvey Schwartz, PhD, Cecilia Casale, PhD, Ed Lomotan, MD, Gurvaneet Randhawa, MD, Jim Branscome, Joel Cohen, PhD
National Center for Health Statistics
- Virginia Cain, PhD, Vicki Burt, Don Malec, PhD
Maternal and Child Health Bureau, Health Resources and Services Administration
- Bonnie Strickland, PhD, Michael Kogan, PhD, Mary Kay Kenney, PhD, Marie Mann, MD
Office of Rural Health Policy, Health Resources and Services Administration
- Aaron Fischbach, Curt Mueller, PhD, Michelle Goodman, Tom Morris, Michael McNeely, Sarah Bryce
Target Population Interviews
LGBT
- Judith Bradford, PhD, The Fenway Institute
- Gary Gates, PhD, UCLA School of Law’s Williams Institute
- Stewart Landers, JD, John Snow, Inc.
- Harvey Makadon, MD, National LGBT Health Education Center, The Fenway Institute
- Shane Snowdon, Human Rights Campaign
Asian Americans
- Priscilla Huang, JD, Asian & Pacific Islander American Health Forum
- Latha Palaniappan, MD, Palo Alto Medical Foundation
- Marguerite Ro, DrPH, Public Health Dept., Seattle and King County, WA
- Chau Trinh-Shevrin, DrPH, Center for the Study of Asian American Health, Department of Medicine, NYU
Adolescents with Autism Spectrum Disorders
- Debra Lotstein, MD, UCLA School of Medicine
- Margaret (Peggy) McManus, National Alliance to Advance Adolescent Health
- Megumi Okumura, MD, UCSF School of Medicine
- Julie Lounds Taylor, PhD, Vanderbilt University School of Medicine
Individuals Living in Rural Areas
- Amy Brock-Martin, DrPH, South Carolina Rural Health Research Center
- David Hartley, PhD, University of Southern Maine
- Erika Ziller, PhD, University of Southern Maine
- Ira Moscovice, PhD, University of Minnesota
- Keith Mueller, PhD, University of Iowa
Table I.2. Limitations of National Surveys for Small Populations
Population | General Problem: Small n relative to frame | General Problem: Lack of approaches to increase sample | Frame Problem:* Telephone number frame | Frame Problem:* Area frame samples | Data Collection Problem: Unit nonresponse | Data Collection Problem: Item nonresponse | Data Collection Problem: Instrumentation
* These frame problems refer to specific challenges to constructing sampling frames based on telephone numbers or geographic areas. See the “Limitations in Survey Data” section for more information on general problems obtaining an adequate frame for small sample size groups relative to the rest of the population.
Asian Americans
X
X
X
X
X
LGBT
X
X
X
X
Adolescents on the autism spectrum
X
X
X
X
X
Rural populations
X
X
X
X
X
X
Table I.3. The Ability of Key National Surveys to Study Four Target Populations
Data Set | Availability | Sample Size | Population #1: Race | Population #1: Ethnicity/Nativity | Population #2: Sexual Orientation/Behavior | Population #3: Health/Disability Status | Population #4: Geographic Identifier
Current Population Survey (CPS)
19xx-2011
2011, 19-64: 121,520
White, Black, American Indian /Aleut /Eskimo, Asian, Hawaiian /Pacific Islander, and two or more races. Asian can be further classified into subgroups.
Hispanic origin (detailed), birthplace (state or country), mother’s birthplace, father’s birthplace, year of immigration, citizenship status
N/A
Self-reported health status, work disability, activity/functional limitations
State identifier; metro status; metro area identifier; some counties identified
American Community Survey (ACS)
Years with health insurance question: 2008-2011
2010, 19-64: 1,806,189
White, Black, American Indian or Alaska Native, Asian Indian, Chinese, Filipino, Korean, Vietnamese, Japanese, Other Asian or Pacific Islander, Other Race, two major races, three or more major races
Hispanic origin (detailed), birthplace (state or country), parent’s birthplaces, ancestry, year of immigration, year naturalized, citizenship status, language spoken at home, English fluency
N/A
Activity/functional limitations, work disability
State, super-PUMA, PUMA, metro status, metro area, Appalachian region, county sample drawn from
National Health Interview Survey (NHIS)
1997-2011
2010, 19-64: 54,177 full file; 21,396 sample adults
White, Black, American Indian, Alaska Native, Asian (subgroups: Chinese, Japanese, Vietnamese, Filipino, Asian Indian, Korean, other), Native Hawaiian or other Pacific Islander (Guamanian, Samoan, other). Asians were oversampled in the 2006-2009 surveys.
Hispanic ethnicity (detailed), number of years in U.S., citizenship status, global region of birth
Starting in 2013: http://www.hhs.gov/news/press/2011pres/06/20110629a.html
See NHIS documentation: Various health status, health condition, activity limitation, and health behavior variables
Region identifiers on public use; access to Census tract/block level and state identifiers at RDC
Medical Expenditure Panel Survey (MEPS)
19xx-2010
2010, 19-64: 21,596
Race/ethnicity data collected during the NHIS interview are available (MEPS draws sample from persons interviewed in prior NHIS survey).
Hispanic ethnicity (detailed), born in U.S., number of years in U.S., citizenship status
N/A
See MEPS documentation: Self-reported health status, health condition, activity limitation, and health behavior variables
Region only on public use; access to more detailed level at RDC
SLAITS-National Survey of Children with Special Health Care Needs
July 2009 - March 2011
2009-11, 0-17: 40,242 detailed CSHCN interviews
White, Black, other, multiple (in some states, Native Hawaiian/Pacific Islander, Asian, and American Indian/Alaska Native can be identified)
Hispanic ethnicity, citizenship, child born in U.S. and number of years, parents born in U.S. and number of years
N/A
See documentation: health condition/limitation/disability; behavioral, developmental, and emotional health variables; special health care needs
State, MSA status
National Health and Nutrition Examination Survey (NHANES)
1999-2012
2009-10, 19-64: 4,861
White, Black, American Indian/Alaska Native, Asian, Native Hawaiian/Pacific Islander, other. Respondents asked to classify themselves as Asian Indian, Chinese, Filipino, Korean, Vietnamese, Japanese, Other Asian or Pacific Islander
Hispanic ethnicity, country of birth, citizenship status, length of time in U.S.
Yes: http://www.cdc.gov/NCHS/nhanes/variable_tables/sexual_behavior.htm Cognitive testing report: http://wwwn.cdc.gov/qbank/report/Miller_NCHS_2001NHANESSexualityReport.pdf
See documentation: Medical examination data, health status, health conditions, behavioral health, etc…
National
National Survey of Family Growth
2006-2010
2006-2010: ~10,000 men and 12,000 women, 15-44 years old
White, Black, Hispanic, Asian, Pacific Islander
Hispanic ethnicity (Mexican vs. all other)
Sexual identity and attraction: http://www.cdc.gov/nchs/nsfg/abc_list_s.htm#sexualorientationandattraction
Men’s and women’s health as related to family life, marriage and divorce, pregnancy, infertility, use of contraception.
The geographic scope of the study is national. Detailed geographic identifiers are available on the restricted access contextual data file.
Behavioral Risk Factor Surveillance System (BRFSS)
1995-2011
2010, 19-64: 292,502
White, Black, Hispanic, American Indian or Alaska Native, and Asian or Pacific Islander
Hispanic ethnicity
About 19 states have included a question at one time or another, but not necessarily every year. In 2014 there is an approved optional module on sexual orientation and gender identity.
Self-reported health status, condition-specific measures, diet, physical activity, functional limitations
State (typically), MSA
National Survey on Drug Use and Health (NSDUH)
1994-2011
~60,000
White, Black, Hispanic, American Indian or Alaska Native, Native Hawaiian, other Pacific Islander, Chinese, Filipino, Japanese, Korean, Indian, Vietnamese, other Asian
Hispanic ethnicity
1996: “During the past 12 months, have you had sex with only males, only females, or with both males and females?”
Currently testing 2 questions on sexual orientation to be added in 2015 [204]
Drug and alcohol use, health care use, health conditions, mental health, health insurance
State (typically), urban/rural
National Immunization Survey
1994-2012
2010: 17,004
White, Black/African American, American Indian, Alaska Native, Asian, Native Hawaiian, Pacific Islander, Other
Hispanic, Mexican, Mexican-American, Central American, South American, Puerto Rican, Cuban/Cuban American, Spanish-Caribbean, Other Spanish/Hispanic
N/A
N/A
National, State, and selected large urban areas
SLAITS - Survey of Adult Transition and Health
2001, 2007
1,865
N/A (“derived”?)
Hispanic
N/A
Self-reported health status, disability, special health care needs, activity limitations
State, region, MSA
SLAITS - National Survey of Children’s Health
2003, 2007-2008, 2011-2012
2011-2012: 91,800
White/Caucasian, Black/African-American, American Indian/Native American, Alaska Native, Asian, Native Hawaiian, Pacific Islander, Other
Hispanic
N/A
Various disabilities and conditions, including autism, Asperger’s disorder, pervasive developmental disorder, or autism spectrum disorder
State, MSA
Medicare Current Beneficiary Survey
1991-
16,000 per year
American Indian or Alaska Native, Asian, Black or African American, Native Hawaiian or Other Pacific Islander, White, Some Other Race. More granular racial/ethnic categories will be added in 2014.
Hispanic
N/A
Self-reported general health, functional limitations
National
National Latino and Asian American Study
2002-2003
2,554 Latinos and 2,095 Asian Americans
Chinese, Vietnamese, Filipino, Other Asians (other subpopulations collected but too small for subgroup analysis)
Puerto Rican, Cuban, Mexican, Other Latinos
N/A
Various psychiatric disorders
National
National Longitudinal Study of Adolescent Health (Add Health)
1994-95, 1996, 2001-02, 2007-08
2008: 15,701
Same-sex relationships, sexual behavior
Self-reported health status and physical exam
National Adult Tobacco Survey
2009-2010
118,581
Non-Hispanic White, non-Hispanic Black, non-Hispanic Asian, non-Hispanic other (including American Indian or Alaska Native, Native Hawaiian or Pacific Islander, multiracial, or some other race)
Hispanic
Heterosexual-straight; lesbian, gay, bisexual, or transgender (LGBT); or not specified.
A new version of this survey is in the field; transgender identity is no longer captured after 2010.
General health, cigarette smoking, other tobacco use, cessation, secondhand smoke, chronic diseases
National, State
Table I.4. Estimated Percentage of People by Sexual Orientation and Behavior from Selected Federal and Non-Federal Sample Surveys
This table does not display the most recent estimates, but rather is presented to illustrate how federal and non-federal survey-based estimates of numbers of lesbian, gay, bisexual, and transgender people have varied by gender, over time, and according to survey methods and question wording. For more discussion, see the “Population #2: Lesbian, Gay, Bisexual, and Transgender People” section in Part I.
| Survey | Ages | Percent of Men Identifying as Homosexual, Gay, Lesbian, or Bisexual | Percent of Women Identifying as Homosexual, Gay, Lesbian, or Bisexual | Percent of Men Reporting Same-Sex Partners | Percent of Women Reporting Same-Sex Partners | Percent of Men Reporting Some Same-Sex Desire or Attraction | Percent of Women Reporting Some Same-Sex Desire or Attraction |
|---|---|---|---|---|---|---|---|
| National Survey of Sexual Health and Behavior, 2010 | 18+ | 6.8 | 4.5 | — | — | — | — |
| General Social Survey, 2008 | 18+ | 2.9 | 4.6 | — | — | — | — |
| General Social Survey, 2008 | 18 - 44 | 4.1 | 4.1 | 10.0 | 10.0 | — | — |
| National Survey of Family Growth, 2002 | 18 - 44 | 4.1 | 4.1 | 6.2 | 11.5 | 7.1 | 13.4 |
| National Health and Social Life Survey, 1992 | 18 - 59 | 2.8 | 1.4 | 7.1 | 3.8 | 7.7 | 7.5 |

Notes: Estimates are based on small sample sizes, resulting in large confidence intervals around the estimates; see the text for details. Also, differences in estimates can occur because of sampling error (that is, the estimates in the table are based on probability samples) and nonsampling error, errors due to differential nonresponse and coverage, differences in the target population (the cohorts surveyed), differences in the survey questionnaires used, year of implementation, mode of administration, and the survey respondent.
ORIGINAL SOURCE: Institute of Medicine. “The Health of Lesbian, Gay, Bisexual, and Transgender People.” March 31, 2011. http://www.iom.edu/Reports/2011/The-Health-of-Lesbian-Gay-Bisexual-and-Transgender-People.aspx
Table Sources: Herbenick et al. (2010), Table 1, for results from the NSSHB; Gates (2010), Figures 1 and 7, for results from the GSS; Mosher et al. (2005), Tables 12 and 13, for results from the NSFG; Laumann et al. (1994a), Table 8.2, for results from the 1992 NHSLS.
Table I.5. Common Rural Taxonomies Used by the Federal Government
| Taxonomy | Unit | Urban Definition (rural is what’s left) | Limitation |
|---|---|---|---|
| OMB Metropolitan and Nonmetropolitan Taxonomy | Counties | Defines metropolitan areas as counties with 1 or more urbanized area (based on population size) and counties economically tied to that core, measured by commuting to work. | County boundaries may over- or under-bound urban core |
| USDA Economic Research Service Urban Influence Codes (UIC) | Counties | Builds on OMB metro and nonmetro dichotomy to create continuum based on population size and adjacency/nonadjacency to metro counties | Frequently used for research but not for federal or state policy |
| Census Bureau Rural and Urban Taxonomy | Census tract | Urban clusters based on population size | Limited health-related data available at the census tract level, which is not stable over census years |
| Rural/Urban Commuting Area Taxonomy (RUCA) | Census tract | Based on work commuting flows | Difficult to link to health data, often collected at the county or zip code level. A zip-code based version has been developed for this purpose, but is complex to use. |

Source: Summarized from Hart 2005 [205].
Table I.6. Potential Areas for Further Research
| Population | Subpopulation | Health Issue | Challenges in Studying with Existing Federal Survey Data |
|---|---|---|---|
| Asian subpopulation | Vietnamese women | Cervical cancer | Difficulty disaggregating Vietnamese women and self-report of cervical cancer diagnosis |
| Asian subpopulation | Filipino | Diabetes | Difficulty disaggregating Filipino and self-report of diabetes diagnosis |
| Lesbian, Gay, Bisexual, Transgender | Lesbian women | Obesity | Limited data collected on sexual identity and self-reported weight |
| Lesbian, Gay, Bisexual, Transgender | LGBT Youth | Mental health | Limited data collected on sexual identity or potential unwillingness to respond to survey questions around mental health |
| Rural | Minorities | Access to care | Language barriers prevent adequate representation |
| Autism spectrum disorders | Adolescents in transition to adulthood | Transition to adulthood | Lack of longitudinal data and inconsistent definitions of disability between childhood and adulthood |