Understanding Disparities in Persons with Multiple Chronic Conditions: Research Approaches and Datasets

5.1 Quality of Race and Ethnicity Variables


The accuracy and completeness of demographic information are a concern in studying disparities. Race and ethnicity variables, in particular, have suffered from inconsistent measurement over time, evolving definitions and categories, insufficiently sensitive categories, and a variety of data collection challenges. The first U.S. census in 1790 recognized three racial categories: whites, blacks (each counted as three-fifths of a person), and Indians who paid taxes, an unbalanced and racially motivated classification scheme (Williams, 1999). Within the past decade, the Office of Management and Budget (OMB) has approved the use of increasing numbers of racial and ethnic categories, up to the current standard of 14 racial and 5 ethnic categories for use in federal data collection initiatives (Cunningham, 2012). Federal efforts to collect disparities data are also hindered by non-uniform data collection practices across states. Medicaid in particular lacks federal standards for disparities data collection, resulting in wide variation among states in the type and quality of disparities data collected. Even within individual states, the use of different healthcare provider organizations leads to further variability in the disparities data collected (Byrd & Verdier, 2011).

The quality of race and ethnicity variables is a limitation of most federal and private databases. For example, the Medicare enrollment database (EDB) at CMS contains race/ethnicity variables that are highly specific (low false positive rate) but insensitive (low true positive rate) for categories other than white or black. In other words, race/ethnicity coding is considerably more accurate for white and black beneficiaries than for other groups, such as Asian or American Indian beneficiaries (Waldo, 2005). The Hispanic ethnicity code in the EDB captures only one third of beneficiaries who identify as Hispanic, so counts based on that code substantially underestimate the Hispanic beneficiary population. Overall, minority populations are more likely to have missing or misclassified race/ethnicity information, and minority beneficiaries who are misclassified are most often coded as white (Waldo, 2005; Williams, 1999). Other databases that suffer from inadequate race/ethnicity coding include the National Ambulatory Medical Care Survey and the Healthcare Cost & Utilization Project Nationwide Inpatient Sample.
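The distinction between sensitivity and specificity can be made concrete with a small calculation. The following is a minimal sketch, not an analysis of actual EDB records: the function name and all four counts are hypothetical, chosen only so that the administrative code captures one third of self-identified members of a category while rarely assigning the code to anyone else.

```python
def sensitivity_specificity(tp, fn, fp, tn):
    """Return (sensitivity, specificity) for a single race/ethnicity code.

    tp: self-identified members of the category who carry the code
    fn: self-identified members who do not carry the code
    fp: non-members who carry the code
    tn: non-members who do not carry the code
    """
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # 1 - false positive rate
    return sensitivity, specificity


# Hypothetical counts (assumptions for illustration, not real data).
tp, fn, fp, tn = 300, 600, 10, 9090

sens, spec = sensitivity_specificity(tp, fn, fp, tn)

# Group size as seen through the administrative code versus self-report.
coded_count = tp + fp
self_identified_count = tp + fn

print(f"sensitivity = {sens:.3f}")
print(f"specificity = {spec:.3f}")
print(f"coded = {coded_count}, self-identified = {self_identified_count}")
```

Under these assumed counts, a code can be almost never wrong when it is assigned and still miss most of the group, which is why estimates based on it would undercount the population.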
