Understanding Disparities in Persons with Multiple Chronic Conditions: Research Approaches and Datasets. Appendix E – Review of National Datasets and Data Systems: Summary Tables

09/30/2013

Disclaimer:
The information contained in this appendix was compiled by Abt Associates Inc. under contract #HHSP2333700IT to the Assistant Secretary for Planning and Evaluation (ASPE) in September 2013. Abt and ASPE are not liable for the accuracy or completeness of the information contained in this document, as the specifications of each data system described below are subject to change. For the most up to date and accurate information on each data system, please visit the website or contact the sponsor for more detail.

 

Agency for Healthcare Research and Quality Datasets

Consumer Assessment of Healthcare Providers & Systems (CAHPS)
References
Agency for Healthcare Research & Quality. Consumer Assessment of Healthcare Providers and Systems (CAHPS). 2013. http://cahps.ahrq.gov/about.htm
Database Description
White Paper(s): Multiple Chronic Conditions and Disparities
Sponsorship: Agency for Healthcare Research and Quality
Description: CAHPS is a series of surveys that ask consumers and patients about their experiences with healthcare. These surveys cover a wide spectrum of topics, such as provider communication skills and healthcare access. The goal of CAHPS is two-fold: 1) to develop standardized patient surveys that can be used to compare results across providers over time and 2) to generate tools and resources users can use to create comparative information for all stakeholders. There are CAHPS surveys for a variety of care settings, including hospitals, home health care, health plans, in-center hemodialysis, and clinician groups.
Database
(Scope, Size, Setting, Population, Age Range)
CAHPS surveys are used at various levels in the healthcare delivery system, from individual practices to national samples.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Survey & Program Database. The CAHPS Database is a compilation of survey results from a large pool of healthcare consumers that are maintained in a national database.
Database Source/Origin: Survey Data
Date or Frequency of Data Collection: Annually, since 1995.
Longitudinal vs. Cross-sectional Database: Serial Cross-Sectional Survey
Data Collection Methodology: Data collection methodology varies by CAHPS sponsor and vendors administering the CAHPS survey. Surveys can be completed via the mail, telephone or internet.
Sampling Strategy: Sampling strategies for CAHPS vary by sponsor. CAHPS provides guidelines for sampling, including determining eligibility, calculating the estimated sample size needed for reporting, and creating a sub-sample of a specific patient population.
Unit of Analysis: Multiple (patients, providers, health plan, etc.) and dependent on survey type.
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
A patient’s principal diagnosis at discharge is used to determine whether he or she falls into a specific service line for CAHPS eligibility. Diagnosis is not captured on the survey itself.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED, CPT)
Principal diagnosis ICD-9 codes at discharge.
Number of Diagnoses Captured: Only the principal diagnosis at discharge is used to determine CAHPS eligibility.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
CAHPS does not include measures of cost.
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
CAHPS does not include measures of healthcare utilization, but the number of survey respondents can be used as a proxy for the number of discharges.
Measures of Healthcare Access: Ease of access to healthcare services.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Sex, Educational Attainment, Hispanic or Latino, Race/Ethnicity, Language
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
CAHPS does not include additional clinical information.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Health Literacy/Understanding
Site of Service Information: Limited - Department Based
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Self-reported health status, Self-reported mental health status, Quality of Care, Quality Measures and Patient Satisfaction
Strengths, Limitations & Feasibility
Data Strengths: Select CAHPS datasets contain a large number of minority respondents. Data are collected on key health policy issues, including health status.
Data Limitations: The CAHPS survey is not administered in a consistent fashion. The CAHPS database is a collection of surveys administered at various levels, and not all providers participate each year, so the mix of users varies across years. Sampling and data collection methods also vary by user, and the data are cross-sectional.
Data Access Restrictions: To access CAHPS data, a data release agreement, description of the planned research, and IRB documentation must be submitted to AHRQ. Survey instruments are publicly available.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
No unique identifiers. However, CAHPS surveys have been administered to Medicare Fee-for-Service patients, which may have resulted in a linked CAHPS-claim dataset.
Related Grouping Systems: n/a

 

Healthcare Cost & Utilization Project–Kids’ Inpatient Database
References
Overview of the Kids’ Inpatient Database (KID). 2013. http://www.hcup-us.ahrq.gov/kidoverview.jsp
Introduction to the HCUP Kids’ Inpatient Database (KID) 2009. Healthcare Cost and Utilization Project (HCUP). 2013. http://www.hcup-us.ahrq.gov/db/nation/kid/KID_2009_Introduction.pdf
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Agency for Healthcare Research & Quality
Description: The Kids' Inpatient Database (KID) is a unique and powerful database of hospital inpatient stays for children. The KID was specifically designed to permit researchers to study a broad range of conditions and procedures related to child health issues. Researchers and policymakers can use the KID to identify, track, and analyze national trends in health care utilization, access, charges, quality, and outcomes. It is the only all-payer inpatient claims database for children in the U.S.
Database
(Scope, Size, Setting, Population, Age Range)
National; Children and Adolescents Only (< 20 years old); 2-3 million records a year.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
A Federal-State-Industry database of Medicare, Medicaid, Private Insurance and Uninsured patient discharges.
Database Source/Origin: Administrative data from 4,121 community, non-rehabilitation hospitals in 44 states.
Date or Frequency of Data Collection: 1997-2009; updated every three years.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Discharge data submitted by participating organizations.
Sampling Strategy: The sampling frame is limited to pediatric discharges from community, non-rehabilitation hospitals in participating HCUP partner States. For sampling, pediatric discharges in participating States are stratified by uncomplicated birth, complicated birth, and all other cases. To ensure an accurate representation of each hospital’s case-mix, the discharges are sorted by State, hospital, DRG, and a random number within each DRG. Systematic random sampling is then used to select 10% of uncomplicated births and 80% of complicated births and other cases from each hospital.
Unit of Analysis: Multiple (patient, region, etc.)
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Number of Chronic Conditions (based on a list of 25 possible chronic condition indicators)
Primary and Secondary Diagnoses
Admission and Discharge Status
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9-CM codes
Number of Diagnoses Captured: KID contains up to 25 diagnoses per patient per record. This number can vary by State.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Expected Primary and Secondary Payer
Total Charges
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Admission Type
Procedure Type
ED Visits
Length of Stay
Number of Discharges
Measures of Healthcare Access: Database used to evaluate healthcare access through the use of geographic and hospital type variables (e.g. critical access).
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age at Admission
Gender
Race
Hospital Characteristics
Physician Identifiers
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Comorbidity Measures
Birth Weight
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Place of Residence
Median Household Income
Site of Service Information: Hospital Location (e.g. State, zip code, etc.); Site of Service Transition Information
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
In-Hospital Mortality
Disposition of Patient
Strengths, Limitations & Feasibility
Data Strengths: Representative of all insurance types. Large sample size that allows researchers to study rare conditions.
Data Limitations: Missing data values can compromise the quality of estimates. If the outcome for discharges with missing values is different from the outcome for discharges with valid values, then sample estimates for that outcome will be biased and inaccurately represent the discharge population. For example, race is missing on 15% of discharges in the 2009 KID because some hospitals and HCUP State Partners do not supply it.
Data Access Restrictions: Access to the KID is open to users who complete a Data Use Agreement and purchase the data. Uses are limited to research and aggregate statistical reporting.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
The database contains AHA hospital identifiers. However, many states do not report this information.
Related Grouping Systems: HCUP Clinical Classifications System (CCS)
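
The KID sampling strategy described above (sort within hospital by DRG and a random number, then sample systematically at 10% or 80%) can be sketched roughly as follows. This is an illustrative Python sketch only, not HCUP's actual procedure: the field names (`state`, `hospital`, `stratum`, `drg`), the stratum labels, and the function names are all assumptions introduced for the example, and the sketch ignores weighting entirely.

```python
import random
from collections import defaultdict

# Sampling rates taken from the summary above; stratum labels are hypothetical.
RATES = {"uncomplicated_birth": 0.10, "complicated_birth": 0.80, "other": 0.80}

def systematic_sample(items, rate, rng):
    """Select every (1/rate)-th item from an ordered list after a random start."""
    interval = 1.0 / rate
    start = rng.random() * interval  # random start in [0, interval)
    chosen = []
    k = 0
    while start + k * interval < len(items):
        chosen.append(items[int(start + k * interval)])
        k += 1
    return chosen

def sample_discharges(discharges, seed=0):
    """Stratify by (state, hospital, stratum), order by DRG with random
    tie-breaking, then sample systematically at the stratum's rate."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for d in discharges:
        groups[(d["state"], d["hospital"], d["stratum"])].append(d)
    sampled = []
    for key in sorted(groups):
        # Sort by DRG, then by a random number within each DRG.
        ordered = sorted(groups[key], key=lambda d: (d["drg"], rng.random()))
        sampled.extend(systematic_sample(ordered, RATES[key[2]], rng))
    return sampled
```

Sorting by DRG before taking every k-th record spreads the selections across the hospital's case mix, which is the stated purpose of the sort step; the random number within each DRG keeps the choice among similar discharges arbitrary.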

 

Healthcare Cost & Utilization Project - Nationwide Emergency Department Sample
References
Overview of the Nationwide Emergency Department Sample (NEDS). 2013. http://www.hcup-us.ahrq.gov/nedsoverview.jsp
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Agency for Healthcare Research & Quality
Description: The Nationwide Emergency Department Sample (NEDS) is a unique and powerful database that yields national estimates of emergency department (ED) visits. The NEDS was created to enable analyses of emergency department (ED) utilization patterns and support public health professionals, administrators, policymakers, and clinicians in their decision-making regarding this critical source of care. NEDS is the largest all-payer ED database in the U.S.
Database
(Scope, Size, Setting, Population, Age Range)
National; 25 - 30 million records
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
A Federal-State-Industry database of Medicare, Medicaid, Private Insurance and Uninsured ED patient discharge records.
Database Source/Origin: As of 2010, NEDS contains administrative data from over 961 hospitals in 28 States.
Date or Frequency of Data Collection: 2006-2010; updated yearly.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: NEDS is developed from data from ED visits submitted by participating States.
Sampling Strategy: Similar to the design of the Nationwide Inpatient Sample (NIS), NEDS is a 20% stratified sample of U.S. hospital-based EDs, drawn from EDs in the States that participate in the program (n=28). The sampling rate is 20% NEDS to Universe and 37.6% NEDS to Frame.
Unit of Analysis: Episode
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Number of Chronic Conditions
Primary and Secondary Diagnoses
Injury Descriptive Variables
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9-CM, CPT-4
Number of Diagnoses Captured: NEDS contains up to 15 diagnoses per record. This number may differ by State.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total ED charges and total hospital charges (for inpatient stays for those ED visits that result in admission).
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
ED Event Type/Number of Visits
Length of Stay
Number of Discharges
Measures of Healthcare Access: Database used to evaluate healthcare access through the use of geographic and hospital type variables (e.g. critical access).
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Gender, Age, Urban-Rural designation of patient residence, expected payment source (e.g. Medicare, Medicaid, self-pay)
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
ICD-9-CM and CPT-4 procedures and diagnoses
Identification of injury-related ED visits, including mechanism and intent of injury and severity of injury
Discharge status from the ED
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
National quartile of median household income (from patient’s ZIP Code)
Site of Service Information: Hospital location (e.g. State, zip code, etc.) and characteristics (e.g. teaching status, region, ownership type).
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge Status
Strengths, Limitations & Feasibility
Data Strengths: NEDS is the largest all-payer ED database in the U.S., with many research applications. It includes information on patients covered by all types of insurances.
Data Limitations: The NEDS contains event-level records, not patient-level records. This means that individual patients who visit the ED multiple times in one year may be present in NEDS multiple times. There is no uniform patient identifier available that would allow a patient-level analysis with the NEDS. In contrast, the HCUP state databases may be used for this type of analysis.
Data Access Restrictions: Access to NEDS is open to users who complete a Data Use Agreement and purchase the data. Uses are limited to research and aggregate statistical reporting.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
For most States, the NIS includes hospital identifiers that permit linkages to the American Hospital Association Annual Survey Database and county identifiers that permit linkages to the Area Resource File.
Related Grouping Systems: HCUP Clinical Classifications System (CCS)
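
The event-level limitation noted for the NEDS can be illustrated with a small Python sketch. The records and the `patient` field below are hypothetical and exist only to show the distinction; the NEDS itself carries no uniform patient identifier, so only the visit count would be computable from it.

```python
# Hypothetical ED visit records. The "patient" label is an assumption added
# for illustration; it is not a field available in the NEDS.
visits = [
    {"patient": "A", "diagnosis": "asthma"},
    {"patient": "A", "diagnosis": "asthma"},
    {"patient": "A", "diagnosis": "asthma"},
    {"patient": "B", "diagnosis": "asthma"},
    {"patient": "C", "diagnosis": "asthma"},
]

# Event-level count: what an event-level file supports directly.
n_visits = len(visits)

# Patient-level count: requires a patient identifier to de-duplicate.
n_patients = len({v["patient"] for v in visits})

print(f"visits: {n_visits}, distinct patients: {n_patients}")
```

Without the identifier, a frequent ED user is indistinguishable from several different patients, so counts from event-level data are best read as visits rather than as people.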

 

Healthcare Cost & Utilization Project - Nationwide Inpatient Sample
References
Overview of Nationwide Inpatient Sample (NIS). 2013. http://www.hcup-us.ahrq.gov/nisoverview.jsp
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Agency for Healthcare Research & Quality
Description: The Nationwide Inpatient Sample (NIS) is a unique and powerful database of hospital inpatient stays. Researchers and policymakers use the NIS to identify, track, and analyze national trends in health care utilization, access, charges, quality, and outcomes. It is the largest publicly available all-payer patient care database in the U.S.
Database
(Scope, Size, Setting, Population, Age Range)
National; Information available on approximately 8 million hospital stays per year.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
A Federal-State-Industry database of Medicare, Medicaid, Private Insurance and Uninsured patient discharges.
Database Source/Origin: Administrative data from 1,051 hospitals from 44 states.
Date or Frequency of Data Collection: 1988 - 2010; updated yearly
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: NIS contains clinical and resource use information included in a patient discharge abstract and is submitted to HCUP by over 1,000 hospitals in the U.S.
Sampling Strategy: The NIS is a stratified probability sample of hospitals, with sampling probabilities calculated to select 20% of the universe of community, non-rehabilitation hospitals in specific strata for ease of use. The entire sampling frame from 46 states includes >90% of hospitals and >95% of discharges from community hospitals.
Unit of Analysis: Multiple (patient, hospital, region, etc.)
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Major Diagnostic Category (MDC)
Primary and secondary diagnosis
Admission and discharge status
Number of Chronic Conditions
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9
Number of Diagnoses Captured: NIS contains up to 25 diagnoses per record (15 prior to the 2009 NIS). The number of diagnoses varies by State; some states provide as many as 66 diagnoses while other states provide as few as 9 diagnoses.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total Charges
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Length of Stay
Type of Admission
Number of Discharges
Measures of Healthcare Access: Database used to evaluate healthcare access through the use of geographic and hospital status variables (e.g. CAH status).
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Gender, age, race, median income for zip code, and Expected Primary and Secondary Payment Sources.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Primary and secondary procedures
Disease Severity Measures
Comorbidity Measures
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Place of Residence
Median household income for patient’s ZIP Code
Site of Service Information: Hospital location (e.g. State, zip code, etc.) and characteristics (e.g. teaching status, region, ownership type).
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Disposition of Patient
In-hospital Death
Strengths, Limitations & Feasibility
Data Strengths: The NIS is the largest publicly available all-payer inpatient care database in the U.S. with information from 45 states, comprising over 96% of the U.S. population. The NIS’ large sample size enables analyses of rare conditions, uncommon treatments, and special patient populations (such as the uninsured).
Data Limitations: Missing data values can compromise the quality of estimates. If the outcome for discharges with missing values is different from the outcome for discharges with valid values, then sample estimates for that outcome will be biased and inaccurately represent the discharge population. For example, race is missing on over 11% of discharges in the 2010 NIS because some hospitals and HCUP State Partners do not supply it. Not all states report patient identifiers and complete diagnostic information.
Data Access Restrictions: Access to NIS is open to users who complete a Data Use Agreement and purchase the data.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
The database contains AHA hospital identifiers. However, many states do not report this information.
Related Grouping Systems: HCUP Clinical Classifications System (CCS)
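
The missing-data concern described under Data Limitations can be made concrete with a small worked example in Python. All numbers below are invented for illustration (the 11-in-100 split loosely mirrors the share of discharges with missing race cited above); they are not NIS estimates.

```python
def mean(values):
    return sum(values) / len(values)

# Hypothetical discharges: 89 with race recorded, 11 with race missing.
# Length of stay (in days) is assumed to differ between the two groups.
records = (
    [{"race_recorded": True, "length_of_stay": 4}] * 89
    + [{"race_recorded": False, "length_of_stay": 8}] * 11
)

# Estimate using every discharge.
overall_mean = mean([r["length_of_stay"] for r in records])

# Estimate using only discharges with a valid race value, as happens
# when an analysis by race drops records with missing race.
complete_case_mean = mean(
    [r["length_of_stay"] for r in records if r["race_recorded"]]
)

print(overall_mean, complete_case_mean)  # 4.44 versus 4.0
```

Because the outcome differs between records with and without a valid value, the complete-case estimate understates the overall mean; if the two groups had the same outcome distribution, dropping the missing records would cost precision but would not bias the estimate.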

 

Medical Expenditure Panel Survey
References
Medical Expenditure Panel Survey (MEPS). 2013. http://meps.ahrq.gov/mepsweb/
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Agency for Healthcare Research and Quality
Description: The Medical Expenditure Panel Survey (MEPS) is a set of large-scale surveys of families and individuals, their medical providers, and employers across the United States. MEPS is the most complete source of data on the cost and use of health care and health insurance coverage.
Database
(Scope, Size, Setting, Population, Age Range)
National; approximately 35,000 persons interviewed annually.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)

Survey/Interviews
Two Primary Components

  • Household component - collects data from a sample of families and individuals in selected communities in the U.S.
  • Insurance component - collects data from a sample of private and public sector employers on the health insurance plans they offer their employees.
     
Database Source/Origin: Survey data from a set of large-scale surveys of families and individuals, their medical providers, and employers in the U.S.
Date or Frequency of Data Collection: 1996-2012; updated annually.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: For the Household Component, a panel survey design is used to collect data via multiple rounds of interviewing over a two-year period. For the Insurance Component, an annual survey of employers is conducted that collects information on health insurance offerings.
Sampling Strategy: The Household Component collects data from a sample of families and individuals in selected communities across the U.S., drawn from a nationally representative subsample of households that participated in the prior year’s National Health Interview Survey. The Insurance Component collects information from Household Component respondent employers or other non-related employers.
Unit of Analysis: Household or Employer
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-Reported Diagnosis transformed into ICD-9 Codes
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9
Number of Diagnoses Captured: MEPS identifies specific physical and mental health conditions, accidents, or injuries affecting each respondent. 670 clinical categories are created.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total Health Care Expenditures, Total Expenditures Paid by Insurance, Hospital Outpatient Expenditures, Hospital Emergency Room Expenditures, Hospital Inpatient Expenditures, Dental Expenditures, Home Health Care Expenditures, Vision Aid Expenditures, Other Medical Equipment and Service Expenditures, and Prescription Drug Expenditures
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Medical Provider Visits (Physician, etc.), Hospital Outpatient Visits, Hospital Emergency Room Visits, Hospital Inpatient Visits, Dental Visits, Home Health Care Visits, Number of Drugs Prescribed, and Length of Stay
Measures of Healthcare Access: Presence of provider who provides the usual source of care, reasons why members without usual care do not have it, various aspects of satisfaction with usual care providers, and problems experienced in obtaining needed health care
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Sex, Race/Ethnicity, Insurance Status, Marital Status, and Disability Status
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Prescribed Medicine, Pregnancy Detail
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Family Income as Percent of Poverty Line, Employment Status, Total Income, geographic location, and Size of Family
Site of Service Information: Type of Service (e.g. hospital, nursing home, etc.)
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Self-Reported Overall Health Status
Self-Reported Physical Health Status
Self-Reported Mental Health Status
Strengths, Limitations & Feasibility
Data Strengths: MEPS provides a level of breadth and depth of healthcare utilization information that is not captured in other surveys.
Data Limitations: Even after pooling several years of MEPS data, sample size limitations and confidentiality restrictions make MEPS data unsuitable for certain types of analysis. For example, the MEPS data do not support research on rare conditions. Moreover, information on conditions is household-reported and not verified by clinical records. All MEPS data are reported by one designated household respondent.
Data Access Restrictions: Some files are accessible to the public; however only researchers and users with approved access can gain access to restricted files.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Data can only be linked by survey number, which limits the feasibility of linking to non-MEPS-related data sources.
Related Grouping Systems: ICD-based grouping systems.

 

Centers for Disease Control and Prevention Datasets

Behavioral Risk Factor Surveillance System
References
Centers for Disease Control and Prevention. Behavioral Risk Factor Surveillance System. 2013. http://www.cdc.gov/brfss/
 
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Centers for Disease Control and Prevention
Description: The Behavioral Risk Factor Surveillance System (BRFSS) is the world’s largest, on-going telephone health survey system, tracking health conditions and risk behaviors in the United States yearly since 1984. Currently, data are collected monthly in all 50 states, the District of Columbia, Puerto Rico, the U.S. Virgin Islands, and Guam.
Database
(Scope, Size, Setting, Population, Age Range)
National; approximately 350,000 non-institutionalized adults (aged 18 years or older) are interviewed each year. One adult is interviewed per household.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Multi-mode survey (mail, landline, and cell phone)
Database Source/Origin: Initiated in 1984 with 15 states collecting surveillance data on risk behaviors through monthly telephone interviews. By 2001, the 50 states, the District of Columbia, Puerto Rico, and the Virgin Islands were participating in the BRFSS.
Date or Frequency of Data Collection: 1984–2012; survey conducted monthly and report compiled by the CDC annually
Longitudinal vs. Cross-sectional Database: Cross-sectional
Data Collection Methodology: With technical assistance from the CDC, state health departments use in-house interviewers or contract with telephone call centers or universities to conduct the BRFSS survey.
Sampling Strategy: The survey is conducted using Random Digit Dialing (RDD) techniques on both landlines and cell phones.
Unit of Analysis: Respondent
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-reported conditions
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
The BRFSS does not use diagnosis codes.
Number of Diagnoses Captured: BRFSS asks respondents about the following conditions: MI, CHD, Stroke, Asthma, Skin Cancer, Other Cancer, COPD, Arthritis, Depression, Kidney Disease, Vision Impairment, Diabetes, and HIV/AIDS.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
The BRFSS only asks if cost is a barrier to obtaining healthcare services for specific conditions.
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Utilization of preventive healthcare services information is collected.
Measures of Healthcare Access: Questions are included related to insurance, regular care provider, and last health checkup.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment, Income).
Age, Gender, Hispanic or Latino, Race, Military Status, Insurance Status/Type, Educational Attainment, Disability Status and Income.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Hypertension Status, High Cholesterol Status, Risky Health Behaviors (e.g., tobacco use), Pregnancy Status, Fruit and Vegetable Consumption, Physical Activity Level, and Immunizations.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Household Size, Employment Status, Household Income, Zip Code, and Own vs. Rent Home.
Site of Service Information: The BRFSS does not include information on site of service.
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Self-reported Health Status and Self-reported Health-Related Quality of Life.
Strengths, Limitations & Feasibility
Data Strengths: The BRFSS raking methodology includes categories of age by gender, detailed race and ethnicity groups, education levels, marital status, regions within states, gender by race and ethnicity, telephone source, renter/owner status, and age groups by race and ethnicity. In 2011, 50 states, the District of Columbia, Guam, and Puerto Rico collected samples of both landline and cell phone interviews, while the Virgin Islands collected a sample of landline-only interviews.
Data Limitations: Self-reported behaviors have limited reliability and validity; some are over-reported and others are under-reported. The survey is administered only in English and Spanish. An increasing number of households lack landlines.
Data Access Restrictions: BRFSS data is publicly available.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
No direct identifiers, except telephone number.
Related Grouping Systems: n/a
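The raking methodology noted under Data Strengths can be illustrated with a minimal sketch. Raking (iterative proportional fitting) repeatedly rescales respondent weights until the weighted sample matches known population shares on each margin. The respondents, variable names, and target shares below are hypothetical and are not BRFSS data or the CDC's production weighting code.

```python
def rake(rows, margins, iterations=100):
    """Adjust weights so weighted shares match each target margin in turn."""
    weights = [1.0] * len(rows)
    for _ in range(iterations):
        for var, targets in margins.items():
            total = sum(weights)
            # Current weighted total for each category of this variable.
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Rescale so each category's weighted share equals its target.
            weights = [
                w * targets[row[var]] * total / current[row[var]]
                for row, w in zip(rows, weights)
            ]
    return weights


def weighted_share(rows, weights, var, value):
    """Weighted proportion of respondents with row[var] == value."""
    matched = sum(w for row, w in zip(rows, weights) if row[var] == value)
    return matched / sum(weights)


# Hypothetical respondents and population targets (assumptions for illustration).
respondents = [
    {"sex": "F", "age": "18-44"},
    {"sex": "F", "age": "45+"},
    {"sex": "M", "age": "18-44"},
    {"sex": "M", "age": "18-44"},
    {"sex": "M", "age": "45+"},
]
targets = {
    "sex": {"F": 0.5, "M": 0.5},
    "age": {"18-44": 0.6, "45+": 0.4},
}
raked = rake(respondents, targets)
```

After raking, the weighted share of each category is close to its target on every margin at once, which a single post-stratification pass on one variable would not guarantee.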

 

National Ambulatory Medical Care Survey
References
Centers for Disease Control and Prevention. Ambulatory Health Care Data. 2013. http://www.cdc.gov/nchs/ahcd.htm
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations
Sponsorship: Centers for Disease Control and Prevention
Description: The National Ambulatory Medical Care Survey (NAMCS) is a national survey designed to provide information about the provision and use of ambulatory medical care services in the United States. Data are obtained on patients' symptoms, physicians' diagnoses, and medications ordered or provided. Information is also collected on services provided, including diagnostic procedures, patient management, and planned future treatment.
Database
(Scope, Size, Setting, Population, Age Range)
National; the NAMCS includes data on approximately 11,000 physicians from office-based settings and more than 6,000 CHC providers.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Survey of physicians and providers.
Database Source/Origin: Findings are based on a sample of visits to non-federally employed office-based physicians who are primarily engaged in direct patient care. Physicians in the specialties of anesthesiology, pathology, and radiology are excluded from the survey.
Date or Frequency of Data Collection: The survey was conducted annually from 1973 to 1981, in 1985, and annually since 1989.
Longitudinal vs. Cross-sectional Database: Cross-sectional.
Data Collection Methodology: Specially trained interviewers visit physicians prior to their participation in the survey in order to provide them with survey materials and instruct them on how to complete the forms. Data collection is from physicians, rather than from patients, which provides an analytic base that expands information on ambulatory care collected through other ambulatory surveys. Each physician is randomly assigned to a 1-week reporting period. During this period, data for a systematic random sample of visits are recorded by the physician or office staff on an encounter form provided for that purpose.
Sampling Strategy: Data are obtained from a sample of visits to non-federally employed office-based physicians who are primarily engaged in direct patient care.
Unit of Analysis: Physicians
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Common primary diagnosis.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9-CM. Drug data are coded using a unique classification scheme developed at NCHS.
Number of Diagnoses Captured: Information is collected on the following chronic conditions: Cerebrovascular disease, Congestive heart failure, Chronic renal failure, HIV, and diabetes.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Source of payment
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Number of past visits in the last 12 months, major reason for visit, time spent with the physician, previous care (seen in ED in the last 72 hours or discharged from hospital in the last 7 days), counseling/education/therapy, surgical procedures, whether the respondent is the patient’s primary care physician/provider, whether the patient was referred for the visit, and whether the patient has been seen before.
Measures of Healthcare Access: NAMCS does not have measures of healthcare access.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Sex, and Ethnicity/Race.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Pain level, Tobacco use, Respiratory rate, Episode of care, Glasgow coma scale (GCS), and On oxygen on arrival.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Place of residence
Site of Service Information: Hospitals and community health centers identified.
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge status
Strengths, Limitations & Feasibility
Data Strengths: Data are collected on key policy issues pertaining to health. There are multiple years of data available.
Data Limitations: The item nonresponse rate for ethnicity and race is approximately 20%.
Data Access Restrictions: Data are available to the public at no cost. Restricted files, which contain additional variables and non-masked data, can be accessed by applying to the NCHS Research Data Center and paying a fee.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
The NAMCS does not include unique identifiers to link patients.
Related Grouping Systems: ICD-based grouping systems.
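The Data Collection Methodology above states that a systematic random sample of visits is recorded during each physician's reporting week. A minimal sketch of systematic sampling follows: pick a random starting position, then take every k-th visit. The interval, the visit log, and the function name are assumptions for illustration; they are not the NCHS field procedure.

```python
import random


def systematic_sample(visits, interval, rng=None):
    """Select every `interval`-th visit after a random start position."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    rng = rng or random.Random()
    # The random start gives each visit the same 1-in-interval chance.
    start = rng.randrange(interval)
    return visits[start::interval]


# Hypothetical log of 40 visits in one reporting week, sampled 1 in 5.
week_visits = list(range(1, 41))
sampled = systematic_sample(week_visits, interval=5, rng=random.Random(0))
```

Because visits are the sampled unit, estimates built from such a sample describe visits rather than distinct patients.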

 

National Health Interview Survey
References
Centers for Disease Control and Prevention. National Health Interview Survey. 2013. http://www.cdc.gov/nchs/nhis.htm
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Centers for Disease Control and Prevention
Description: The National Health Interview Survey is the principal source of information on the health of the civilian non-institutionalized population of the United States and is one of the major data collection programs of the National Center for Health Statistics.
Database
(Scope, Size, Setting, Population, Age Range)
National; approximately 100,000 individuals.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Household survey
Database Source/Origin: Surveys of households.
Date or Frequency of Data Collection: Annually since 1957, with revisions every 10-15 years. Sampling and interviewing are continuous throughout the year.
Longitudinal vs. Cross-sectional Database: The National Health Interview Survey is a cross-sectional household interview survey.
Data Collection Methodology: Sampled by household – one child and one adult are selected to complete the Sample Adult and Sample Child components of the survey. Sampling methods are redesigned after every census.
Sampling Strategy: Sampling and interviewing are continuous throughout each year. The sampling plan follows a multistage area probability design that permits the representative sampling of households and non-institutional group quarters (e.g., college dormitories). The sampling plan is redesigned after every decennial census. The current sampling plan was implemented in 2006. It has many similarities to the previous sampling plan, which was in place from 1995 to 2005. The first stage of the current sampling plan consists of a sample of 428 primary sampling units (PSUs) drawn from approximately 1,900 geographically defined PSUs that cover the 50 States and the District of Columbia. A PSU consists of a county, a small group of contiguous counties, or a metropolitan statistical area.
Unit of Analysis: Households, Individuals and Geographic Region.
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-reported diagnosis information.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
Self-report diagnosis.
Number of Diagnoses Captured: Self-reported diagnosis information collected on: Hypertension/ high blood pressure, High cholesterol, Coronary heart disease, Angina, Heart attack, Heart condition/ heart disease, Stroke, Emphysema, COPD, Asthma, Ulcer, Cancer or malignancy of any kind/ benign tumors/cysts, Diabetes, Seizure disorder or epilepsy, Sinusitis, Chronic bronchitis, Weak or failing kidneys, bladder or renal problem, Liver condition, Fibromyalgia, lupus, Multiple Sclerosis, Muscular Dystrophy, Osteoporosis or tendinitis, Polio, paralysis, para/quadriplegia, Parkinson’s disease, other tremors, Hernia, Varicose veins, hemorrhoids, Thyroid problems, Graves’ disease, gout, Hearing problems, Depression, anxiety, or an emotional problem, Pain, ache, stiffness in or around a joint, bone injury, Arthritis, Birth defect, intellectual disability/ developmental problem, Senility, Weight problems, Missing limbs, Circulation problems / blood clots, Severe headache or migraine, Stomach or intestinal illness, Pregnant, Vision/ blindness, Teeth loss, Weak immune system (due to leukemia, lymphoma, HIV), Nerve damage/carpal tunnel syndrome, and Hepatitis.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Affordability of prescription medicines, Affordability of doctors, Affordability of dental care, and Affordability of insurance.
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Emergency room visit/ hospital visit, Asthma action plan/ class on managing asthma, Routine checkup for asthma, Taking insulin, Use hearing aid, Usual place to go when sick, Health care change due to health insurance change, Received home health visits, Received surgery, Received flu/ tetanus/ hepatitis/ HPV shot, and Pap smear/ mammogram.
Measures of Healthcare Access: Lack of transportation to health care, Lack of available doctors, Lack of doctors’ offices open at convenient times, Worried about paying medical bills, Health care coverage compared to past year, Skipped medication to save money, and Communicate with a healthcare provider online.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, sex, and sexual orientation.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Smoker status, Exercise, Drinker status, Height and Weight.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Employment status, Business/ industry, Activities at job, Size of business, Paid by the hour or salaried, Paid sick leave, Multiple jobs held, and time at current residence.
Site of Service Information: Site of service is not collected in the NHIS.
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Morbidity and Mortality.
Strengths, Limitations & Feasibility
Data Strengths: Includes questions that can be used to analyze demographic and socioeconomic characteristics and health trends.
Data Limitations: Cross-sectional data; it cannot be used to study patients over time. Sample sizes are too small to provide accurate state-level statistics.
Data Access Restrictions: NHIS data files are available to download at no charge. All files from 1963-2011 are available online.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
AHRQ provides a crosswalk to merge the MEPS and NHIS data. Mortality data, Medicare enrollment and claims data, and social security and benefit history data are all linked to NHIS data. The National Immunization Provider Records Check Survey is also linked to NHIS data.
Related Grouping Systems: n/a

 

National Health and Nutrition Examination Survey
References
Centers for Disease Control and Prevention. National Health and Nutrition Examination Survey (NHANES). 2013. http://www.cdc.gov/nchs/nhanes.htm
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities
Sponsorship: Centers for Disease Control and Prevention
Description: The National Health and Nutrition Examination Survey (NHANES) is a program of studies designed to assess the health and nutritional status of adults and children in the United States. The survey is unique in that it combines interviews and physical examinations. Findings from this survey are used to determine prevalence of major diseases and risk factors for diseases.
Database
(Scope, Size, Setting, Population, Age Range)
National; 5,000 surveys conducted annually.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Survey and Physical Examination
Database Source/Origin: Health interviews are conducted in respondents’ homes. Health measurements are performed in specially-designed and equipped mobile centers, which travel to locations throughout the country. The study team consists of a physician, medical and health technicians, as well as dietary and health interviewers.
Date or Frequency of Data Collection: As of 1999, NHANES has been conducted on an annual basis.
Longitudinal vs. Cross-sectional Database: Cross-sectional Survey
Data Collection Methodology: NHANES includes clinical examinations, selected medical and laboratory tests, and self-reported data. Medical examinations and laboratory tests follow very specific protocols and are as standard as possible to ensure comparability across sites and providers. Beginning in 1999, NHANES became a continuous, annual survey. Data are collected every year from a representative sample of the civilian non-institutionalized U.S. population, newborns and older, by in-home personal interviews and physical examinations in the mobile examination centers.
Sampling Strategy: The sample design is a complex, multistage, clustered design using unequal probabilities of selection. Low-income persons, adolescents 12-19 years of age, persons 60 years of age and over, African Americans, and persons of Mexican origin are oversampled. The sample is not designed to provide nationally representative estimates for the population of U.S. Hispanics.
Unit of Analysis: Respondent/Interviewee
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-Reported Conditions
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
Self-Reported Conditions
Number of Diagnoses Captured: NHANES primarily studies nine categories of conditions: Obesity, Cardiovascular Health, Oral Health, Arthritis/Body Pain, Bone Density/Osteoporosis, Pulmonary Function, Endocrine Health, Renal Disease, and Allergy Inflammation.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
NHANES does not capture information on cost.
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Hospital Utilization/Stays and ED Utilization.
Measures of Healthcare Access: NHANES includes specific questions on healthcare access.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Sex, Educational Attainment, Marital Status, Language, Race/Ethnicity (including subgroups), and Health Insurance Status.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Health Risk Behaviors, Health Risk Exposure Data, Weight History, and Oral Health History. Other clinical metrics (e.g., blood pressure) are obtained by clinicians during the examination.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Veteran Status, Occupation, Employment Status and Income.
Site of Service Information: For each condition, NHANES asks patients if they received care at a certain type of facility (ED, doctor’s office, etc.).
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Self-reported Health Status and Self-reported Physical Functioning.
Strengths, Limitations & Feasibility
Data Strengths: Estimates for previously undiagnosed conditions are produced from NHANES.
Data Limitations: A major limitation of NHANES is that it is not geographically representative of the U.S. The sample is selected to be demographically representative, but because two teams can only visit a total of 16 sites a year, it is impossible to achieve a good geographic spread. NHANES may not be optimal for detecting changes over time because observed changes may reflect geographic irregularities of the survey rather than true trends.
Data Access Restrictions: Certain public use data files are open to the public. Many survey data elements are not available for public use.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
NHANES data have been linked with multiple years of Social Security Administrative Data, CMS Medicare enrollment and claims files (including Part D data), and the National Death Index.
Related Grouping Systems: n/a

 

Centers for Medicare & Medicaid Services Datasets

CMS Chronic Conditions Warehouse
References
Chronic Conditions Data Warehouse. 2013. http://www.ccwdata.org/web/guest/home
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations
Sponsorship: Centers for Medicare & Medicaid Services
Description: The Chronic Condition Data Warehouse (CCW) is a research database designed to make Medicare, Medicaid, Assessments, and Part D Prescription Drug Event data more readily available to support research designed to improve the quality of care and reduce costs and utilization for chronic disease patients. Data is available across beneficiaries’ continuum of care.
Database
(Scope, Size, Setting, Population, Age Range)
National-Population-specific; All Medicare patients.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
The CMS Chronic Condition Warehouse is an amalgamation of linked datasets, including Medicare, Medicaid, and Part D Claims and Assessment data.
Database Source/Origin:

CCW contains the following 100% Medicare files for years 1999–2010:

  • Fee-for-service institutional and non-institutional claims
  • Enrollment/eligibility
  • Assessment data

100% Medicaid files for years 1999–2008, with 2009 available for some states.

100% Part D Prescription Drug Event data for years 2006–2010:

  • Plan characteristics
  • Pharmacy characteristics
  • Prescriber characteristics
Date or Frequency of Data Collection: Ongoing; Data from 1999–2010.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: CCW data are linked by a unique, unidentifiable beneficiary key, which allows researchers to analyze information across the continuum of care.
Sampling Strategy: All Medicare beneficiaries
Unit of Analysis: Medicare Beneficiary
Diagnosis Information  
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
CCW has a specific condition algorithm to determine chronic condition categories. For each chronic condition category, specific primary, principal or secondary diagnosis codes are used to “flag” the event.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9, CPT4, HCPCS codes
Number of Diagnoses Captured: Twenty-seven chronic conditions are maintained in the CCW.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Medicare & Medicaid Claims; Part D Prescription Drug Costs
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Number of Claims, Number of Visits, and Type of Procedure.
Measures of Healthcare Access: CCW includes an Access to Care File.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Sex, Race, Insurance Type, Dual Eligibility Status, Age, preferred language, marital status, etc.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
n/a
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Zip code
Site of Service Information: CCW includes information on site of service (hospital, nursing home, etc.)
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Mortality, morbidity, mobility, functional status, quality of life, quality measures, and quality of care.
Strengths, Limitations & Feasibility
Data Strengths: Links beneficiaries across multiple care settings and is representative of all Medicare patients.
Data Limitations: Since claims for most services provided to Medicare beneficiaries in managed care do not reach the claim data files, the CCW Medicare claims should be viewed as providing utilization information primarily for the fee-for-service population.
Data Access Restrictions: CCW data files may be requested for any of the predefined chronic condition cohorts, or users may request a customized cohort(s) specific to research focus areas.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
CCW files can be linked together via a single unique identifier for each beneficiary.
Related Grouping Systems: ICD-based grouping systems.
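As described above, the CCW flags chronic condition categories from diagnosis codes on claims, and its files are linked through a unique beneficiary key. The sketch below shows both ideas in a deliberately simplified form. The field names (`bene_key`, `dx_codes`), the records, and the single-code-family rule are hypothetical; the actual CCW condition algorithms also apply reference periods and claim-type and claim-count criteria that are not modeled here.

```python
# Assumption: ICD-9-CM codes are stored without the decimal point,
# so the diabetes mellitus family 250.xx appears as "250xx".
DIABETES_PREFIXES = ("250",)


def flag_condition(claims, prefixes):
    """Return beneficiary keys with any diagnosis code matching a prefix."""
    flagged = set()
    for claim in claims:
        if any(code.startswith(prefixes) for code in claim["dx_codes"]):
            flagged.add(claim["bene_key"])
    return flagged


# Hypothetical claim and enrollment records sharing a beneficiary key.
claims = [
    {"bene_key": "A1", "dx_codes": ["25000", "4019"]},
    {"bene_key": "B2", "dx_codes": ["4019"]},
    {"bene_key": "A1", "dx_codes": ["V700"]},
]
enrollment = {"A1": {"sex": "F"}, "B2": {"sex": "M"}}

flagged = flag_condition(claims, DIABETES_PREFIXES)
# Link the condition flag onto enrollment records via the shared key.
linked = {
    key: {**record, "diabetes": key in flagged}
    for key, record in enrollment.items()
}
```

Keying every file on the same beneficiary identifier is what allows a flag derived from claims to be attached to enrollment or assessment records without any direct identifiers.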

 

CMS Medicare Provider Analysis and Review (MedPAR) File
References
CMS MedPAR Hospital Data File. 2013. http://www.healthdatastore.com/cms-medpar-hospital-data-file.aspx#
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations
Sponsorship: Centers for Medicare & Medicaid Services
Description: The Medicare Provider Analysis and Review (MedPAR) File contains data from claims for all services provided to beneficiaries admitted to Medicare-certified inpatient hospitals and skilled nursing facilities (SNF).
Database
(Scope, Size, Setting, Population, Age Range)
National; representative of Medicare patients; 12 million inpatient visits
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Medicare Claims
Database Source/Origin: Medicare claims for inpatient visits from over 6,000 hospitals.
Date or Frequency of Data Collection: 1991–2012; updated yearly.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: The Centers for Medicare and Medicaid Services (CMS) collects and releases data for all U.S. hospital inpatient stays for Medicare beneficiaries. Each record in the MedPAR file represents an inpatient stay during the calendar year of the file and has information on diagnosis, procedure, charge, payment, provider and patient for the claim.
Sampling Strategy: All Medicare related inpatient hospital stays.
Unit of Analysis: Inpatient Stay
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Principal Diagnosis and Admission Diagnosis.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9-CM
Number of Diagnoses Captured: Up to 9 diagnoses and 6 surgical procedure codes are captured in the MedPAR file.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total Charges and Total Payments.
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Number of Inpatient Visits and Length of Stay.
Measures of Healthcare Access: n/a
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Gender and Race.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
n/a
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
State, County, and Zip Code
Site of Service Information: Hospital provider number can be used to identify geographic region.
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge Status
Strengths, Limitations & Feasibility
Data Strengths: Representative of all Medicare-related hospital inpatient admissions.
Data Limitations: MedPAR data are generally available with a one-year lag and cover around one-third of all hospital inpatients; almost all MedPAR patients are 65 or older. Consequently, some specialties, such as Pediatrics and Obstetrics, are practically absent.
Data Access Restrictions: Because of data use restrictions, CMS cannot sell access to the raw data, but can provide a wide array of tabulations and descriptive statistics.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
n/a
Related Grouping Systems: ICD-based grouping systems.

 

Medicare Health Outcomes Survey
References
Medicare Health Outcomes Survey. 2013. http://www.hosonline.org/Content/Default.aspx
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations
Sponsorship: Centers for Medicare & Medicaid Services
Description:

The Medicare HOS is the first outcomes measure used in Medicare managed care programs. The goal of the Medicare HOS program is to gather valid and reliable health status data in Medicare managed care for use in quality improvement activities, plan accountability, public reporting, and improving health. The Medicare HOS 2.0 contains four major components:

  • the Veterans RAND 12 Item Health Survey (VR-12)
  • questions to gather information for case-mix and risk-adjustment
  • four HEDIS® Effectiveness of Care measures
  • additional health questions
Database
(Scope, Size, Setting, Population, Age Range)
Medicare beneficiaries 18 years or older enrolled in Medicare Advantage Organizations with a minimum of 500 enrollees.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Survey
Database Source/Origin: Patient Survey Data
Date or Frequency of Data Collection: Once a year, starting in 1998.
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Data are collected from participating Medicare Advantage Organizations (MAOs) with a minimum of 500 enrollees.
Sampling Strategy: Each spring, a random sample of Medicare beneficiaries is drawn from each participating MAO that has a minimum of 500 enrollees and is surveyed (i.e., a survey is administered to a different baseline cohort, or group, each year). Two years later, these same respondents are surveyed again. Effective 2007, the MAO sample size was increased to 1,200.
Unit of Analysis: Respondent, MAO, etc.
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-reported diagnosis
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
Self-reported diagnosis
Number of Diagnoses Captured: Hypertension or high blood pressure; Angina pectoris or coronary artery disease; Congestive heart failure; Myocardial infarction or heart attack; Other heart conditions, such as problems with heart valves or the rhythm of heartbeat; Stroke; Emphysema, asthma, or COPD; Crohn’s disease, ulcerative colitis, or inflammatory bowel disease; Arthritis of the hip or knee; Arthritis of the hand or wrist; Osteoporosis; Sciatica; Diabetes, high blood sugar, or sugar in the urine; Any cancer other than skin cancer; and Poor eyesight.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
n/a
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Enrollment duration and Caregiving for others in household.
Measures of Healthcare Access: Difficulty of getting around
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Gender, Marital Status, Race, and Education.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
BMI, Depression screen indicator, History of pain, Height, History of falls, Comorbid Medical Conditions (Beneficiary reported)
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Annual household income, English language skills, Household size, Place of residence
Site of Service Information: n/a
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Health Status, Activity Level
Strengths, Limitations & Feasibility
Data Strengths: Data can be used to assess the performance of MAOs and to reward high performers. Data can be used by health researchers to advance the state of the science in functional health outcomes measurement. Data can be used by managed care organizations, providers, and quality improvement organizations to monitor and improve health care quality.
Data Limitations: Lacks cost information. Lacks information on chronic conditions besides the ones specifically inquired about.
Data Access Restrictions: Several types of Medicare HOS data files are available for research purposes. Medicare HOS data files are available as public use files, limited data sets, and research identifiable files.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Beneficiaries are identified through their health insurance claim (HIC) numbers. However, a beneficiary’s HIC number can change under special circumstances.
Related Grouping Systems: n/a

HMO Research Network Dataset

HMO Research Network Virtual Data Warehouse
References
National Cancer Institute. HMO Research Network. 2013. http://epi.grants.cancer.gov/pharm/pharmacoepi_db/hmorn.html
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities.
Sponsorship: HMO Research Network
Description: The HMORN Virtual Data Warehouse (VDW) is a series of datasets developed from data submitted by 19 healthcare delivery organizations with integrated research practices. The purpose of the HMORN VDW is to provide a means of conducting broad-spectrum, population-based research studies to ultimately improve patient health and transform health care practice. HMORN research includes the following topics: biostatistics, mental health, cancer research, comparative effectiveness research, complementary & alternative medicine, communication & health literacy research, dissemination & implementation, epidemiology, genetic research, disparities research, health informatics, health services, infectious & chronic disease surveillance, patient-centered care, pharmacoepidemiology, primary & secondary prevention, systems change and organizational behavior.
Database:
(Scope, Size, Setting, Population, Age Range)
The HMORN VDW is a consortium of 19 healthcare delivery systems that submit claims and EHR data for all patients.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Virtual Database - Data is housed at individual HMOs but can be accessed from anywhere.
Database Source/Origin: Administrative Data, Claims Data, & Electronic Health Record Data (which includes clinical data).
Date or Frequency of Data Collection: n/a
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Programmers at participating sites transform EHR and claims data elements from local data systems to a VDW standardized set of variable definitions, names, and codes. The common structure allows programming code developed at one site to be used at other sites to extract and analyze data for research throughout the network.
Sampling Strategy: All Patients
Unit of Analysis: Patient
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Primary and secondary diagnoses.
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9-CM (other: CPT-4 & HCPCS, NGC, CPI)
Number of Diagnoses Captured: n/a
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Insurance Claims
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Inpatient & Outpatient Visits
Measures of Healthcare Access: n/a
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, gender, race, ethnicity, insurance type, Hispanic vs. non-Hispanic, educational attainment.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Height, Weight, BMI, blood pressure, Laboratory Results, Tumor Status, Tumor Staging, prescription drug use.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
County, State, Zip, Income
Site of Service Information: Type of encounter, provider type, facility type.
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge Disposition
Strengths, Limitations & Feasibility
Data Strengths: Data submitted to this warehouse is continuously vetted and cleaned. Data maintained in this warehouse can be analyzed using programs written at any HMO.
Data Limitations: Data is only submitted from health plans in twelve states.
Data Access Restrictions: n/a
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Although demographic information is available, a special emphasis of this database is to keep records anonymous.
Related Grouping Systems: All ICD-related grouping systems.

National Institute on Aging Dataset

National Health & Aging Trends Study
References
Full bibliography available at http://www.nhats.org/scripts/biblioRep.htm
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities.
Sponsorship: National Institute on Aging
Description:

The National Health and Aging Trends Study (NHATS) is a new resource for the scientific study of functioning in later life. The NHATS is being conducted by the Johns Hopkins University Bloomberg School of Public Health, with data collection by Westat, and support from the National Institute on Aging. In design and content, NHATS is intended to foster research that will guide efforts to reduce disability, maximize health and independent functioning, and enhance quality of life at older ages.

The NHATS will gather information on a nationally representative sample of Medicare beneficiaries ages 65 and older. In-person interviews will be used to collect detailed information on activities of daily life, living arrangements, economic status and well-being, aspects of early life, and quality of life. Among the specific content areas included are: the general and technological environment of the home, health conditions, work status and participation in valued activities, mobility and use of assistive devices, cognitive functioning, and help provided with daily activities (self-care, household, and medical). Study participants will be re-interviewed every year in order to compile a record of change over time. The content and questions included in NHATS were developed by a multidisciplinary team of researchers from the fields of demography, geriatric medicine, epidemiology, health services research, economics, and gerontology.

As the population ages, NHATS will provide the basis for understanding trends in late-life functioning, how these differ for various population subgroups, and the economic and social consequences of aging and disability for individuals, families, and society.

Database:
(Scope, Size, Setting, Population, Age Range)
National; Medicare beneficiaries ages 65 and older; sample of over 8,000 respondents.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Survey
Database Source/Origin: Sample of Medicare beneficiaries
Date or Frequency of Data Collection: Annual (round 1 completed in 2011)
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Interview
Sampling Strategy: Sample of over 8,000 Medicare beneficiaries ages 65 and older living in the contiguous U.S. The sample is age-stratified so that persons are selected from 5-year age groups between the ages of 65 and 90, and from among persons age 90 and older. Persons in older age groups and persons whose race is listed as Black on the CMS enrollment file are oversampled. Replenishment of the sample to maintain the ability to represent the older Medicare population is planned at regular intervals.
Unit of Analysis: Patient
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Number of Chronic Conditions (based on a list of 25 possible chronic condition indicators); Primary and Secondary Diagnoses; Admission and Discharge Status
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
None (self-report by patient)
Number of Diagnoses Captured: 10 basic diagnoses (heart attack, heart disease, high blood pressure, arthritis, osteoporosis, diabetes, lung disease, stroke, dementia, cancer); more detailed questions are asked about each one if the interviewee reports having or having had one or more of these illnesses. Additional questionnaires ask about cognitive status, mobility, and sensory and physical impairments, and include the ACS disability questions.
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Out-of-pocket cost of home environment modifications
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Hospital stays/surgery, use of a medical doctor
Measures of Healthcare Access: Measures of ability to handle medical care activities by oneself, whether patient has a regular doctor
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, insurance, education
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Various indicators of physical, social, sensory and cognitive functioning
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Income, assets, housing, car ownership, labor force participation, helpers
Site of Service Information:  
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Mortality (year to year), mobility, ability to complete activities of daily living, functional status
Strengths, Limitations & Feasibility
Data Strengths: Survey, longitudinal
Data Limitations: Small sample size (8,000), little information about rarer conditions
Data Access Restrictions: Users must register before downloading the data. Registration is instant and free online.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Does not appear to be linkable to Medicare file.
Related Grouping Systems: N/A

 

Utah Department of Health Dataset

Utah All Payer Claims Database
References
Office of Health Care Statistics Utah Health Data Committee. The Utah All Payer Claims Database (APCD). 2013. http://health.utah.gov/hda/apd/
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities.
Sponsorship: Office of Health Care Statistics; Utah Health Data Committee; Utah Department of Health
Description: The Utah All Payer Claims Database (APCD) became the fifth operating APCD in the nation on September 13th, 2009, with the receipt of the very first data submissions. Participating plans submit enrollment, medical, and pharmacy files starting from 1/1/2007 until they are current. As of 2010, there are 11 plans in full production; that is, they have submitted all required historic data and are reporting new data on a determined schedule.
Database
(Scope, Size, Setting, Population, Age Range)
State of Utah; all-payer claims data.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Claims and administrative enrollment files. All payer claims database.
Database Source/Origin: Medicaid claims, CHIP, and PPOs and HMOs in Utah. Medicare claims are pending inclusion due to cost/infrastructure.
Date or Frequency of Data Collection: Inpatient Hospital Discharge Data (1992–2010); Ambulatory Surgery Data (1996–2009); Emergency Department Data (1996–2010)
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Health insurance carriers are required to submit health insurance files.
Sampling Strategy: All patients receiving and paying for healthcare services in the State of Utah.
Unit of Analysis: Patient
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Principal Diagnosis, Secondary Diagnosis
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9 or ICD-10
Number of Diagnoses Captured: Up to nine diagnoses are captured for each patient.
Cost, Utilization & Clinical Information  
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total Charges, Facility Charges, and Professional Charges
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Length of Stay
Discharges
Type of Procedure
Admissions/Hospitalizations
Measures of Healthcare Access: Yes, but specific measures not reported.
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Gender, Marital Status, and Race/Ethnicity.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Yes, extensive clinical data from EHRs.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Place of Residence
Site of Service Information: Zip Code, Residential County
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge Status, Patient Severity Subclass Values, Patient Risk of Mortality Values
Strengths, Limitations & Feasibility
Data Strengths: Large patient sample size; represents all types of payment sources.
Data Limitations: Only representative of the State of Utah; still in development and missing claims data for some periods of time.
Data Access Restrictions: Some files are publicly available. However, more advanced files for health care cost, quality and access must be purchased after IRB and HDC approval is obtained.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Patient and Physician Identifiers. Data is very easy to link; there are a number of personal identifiers.
Related Grouping Systems: All ICD-related grouping systems.

 

State of Colorado Dataset

Colorado All Payer Claims Database
References
Colorado All-Payer Claims Database. 2013. http://www.colorado.gov/cs/Satellite/HCPF/HCPF/1249996141729
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities.
Sponsorship: State of Colorado, Colorado Health Foundation, The Colorado Trust, Caring for Colorado Foundation, Rose Community Foundation and Kaiser Permanente Community Benefit Program; Center for Improving Value in Health Care (CIVHC).
Description: The APCD is a secure database that includes claims data from commercial health plans, Medicare and Medicaid. Created by legislation in 2010 and administered by the Center for Improving Value in Health Care (CIVHC), the APCD is the only comprehensive source of health care claims data from public and private payers in Colorado.
Database
(Scope, Size, Setting, Population, Age Range)
State All Payer Database (Commercial carriers, Medicaid, Medicare, Self-funded plans and small group). By 2014, the APCD will have collected claims data for 90% of Colorado’s 4.2 million insured.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
All Payer Claims Database
Database Source/Origin: All claims: commercial carriers, Medicaid, Medicare, self-funded plans and small group plans.
Date or Frequency of Data Collection: 2008–2011; updated regularly
Longitudinal vs. Cross-sectional Database: Longitudinal
Data Collection Methodology: Health insurance carriers are required to submit health insurance files.
Sampling Strategy: Information is collected on all Colorado healthcare expenditures.
Unit of Analysis: Patient
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Admitting Diagnosis
Principal Diagnosis
12 “Other Diagnosis” Categories
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
ICD-9
Number of Diagnoses Captured: n/a
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Total Cost
Inpatient Facility Cost
Outpatient Facility Cost (including ER cost)
Professional Cost
Drug Cost
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Hospital Admissions
Type of Service (ortho vs. pediatric)
Readmissions
Measures of Healthcare Access: Provider Density Variable
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Sex
Gender
Age
Insurance Status
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Yes, extensive clinical data from EHRs.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
n/a
Site of Service Information: Zip Code, County, Type of Service (inpatient vs. outpatient).
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Discharge Status, Readmissions
Strengths, Limitations & Feasibility
Data Strengths: Large patient sample size; represents all types of payment sources.
Data Limitations: Only representative of the State of Colorado; still in development and missing claims data for some periods of time.
Data Access Restrictions: Data is publicly available.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Social Security Number, Plan Number, Employee Number, Provider Number. Information is grouped by zip code or region to protect personal health information.
Related Grouping Systems: All ICD-related grouping systems.

 

University of Michigan Dataset

Health & Retirement Study
References
National Institute on Aging, National Institutes of Health, U.S. Department of Health and Human Services. Growing Older in America: The Health & Retirement Study. 2007. NIH Publication No. 07-5757.
Database Description
White Paper(s): Data Systems and the Prevalence of Chronic Disease Combinations & Multiple Chronic Conditions and Disparities.
Sponsorship: University of Michigan
Description: The University of Michigan Health and Retirement Study (HRS) is a longitudinal panel study that surveys a representative sample of more than 27,000 Americans over the age of 50 every two years. This study is supported by the National Institute on Aging and the Social Security Administration and is designed to examine changes in labor force participation and the health transitions that individuals experience at the end of their working lives and into the years that follow. It is the leading resource for data on combined health and economic circumstance of Americans over the age of 50.
Database
(Scope, Size, Setting, Population, Age Range)
The HRS surveys more than 27,000 Americans over the age of 50 who represent the Nation’s diversity of economic conditions, racial and ethnic backgrounds, health, marital histories and family compositions, occupations and employment histories, living arrangements, and other aspects of life. As individuals drop out of the sample, they are replaced by new participants in their 50s; the sample is nationally representative of the U.S. population over age 50.
Database Type:
(Survey, Registry, Research Study, Program Database, Claims, Administrative Data, and Clinical Databases)
Research study and associated database.
Database Source/Origin: Participant Interviews
Date or Frequency of Data Collection: Interviews are conducted every two years.
Longitudinal vs. Cross-sectional Database: This is a longitudinal panel survey that follows individuals over multiple years.
Data Collection Methodology: The majority of interviews are done by telephone, although exceptions are made when respondents have health limitations that would make an hour-long session on the telephone difficult or impossible. The preferred mode of data collection is face-to-face for the first wave of data collection, followed by subsequent waves conducted over the phone.
Sampling Strategy: HRS uses a national area probability sample of U.S. households with supplemental oversamples of Blacks, Hispanics and residents of the state of Florida. Participation in this study/survey is optional, but there are incentives.
Unit of Analysis: Individual
Diagnosis Information
Diagnosis Variable Type:
(Chronic Condition Status, Principal Diagnosis, Primary Diagnosis, Secondary Diagnosis, Admit/Discharge Diagnosis and Self-Reported Diagnosis)
Self-reported Diagnosis
Diagnosis Codes:
(ICD-9, ICD-10, SNOMED)
Self-reported Diagnosis
Number of Diagnoses Captured: n/a
Cost, Utilization & Clinical Information
Measures of Cost:
(Claims, Out-of-pocket expenses, Self-reported expenditures, and Prescription Drug Costs)
Out-of-pocket expenditures
Measures of Healthcare Utilization:
(Number of Visits, Any Procedures/Number of Procedures/Type of Procedure, Number of Admission/Type of Admission, Length of Stay, Hospitalizations, Emergency Department Utilization, etc.)
Health Service Use by Type (i.e. Hospital, Nursing Home, etc.), Number of visits, etc.
Measures of Healthcare Access: n/a
Demographic Information:
(Sex, Age, Race, Ethnicity, Marital Status, Disability Status, Language, Insurance Type, Educational Attainment).
Age, Educational Attainment, Disability Status, Race, Ethnicity, Language, Sex, and Marital Status.
Clinical Information:
(BMI, Medical Conditions [high blood pressure], Smoker Status, History of Various Conditions, Preventative Health Measures, Activities of Daily Living, Instrumental Activities of Daily Living)
Disease history, Medicare Use, Physical Activity, Height, Weight, Measurements of Lung Function, Blood Pressure, Grip Strength, and Walking Speed.
Measures of Socioeconomic Status:
(Occupation, Employment Status, Income, Wealth, Place of Residence, Household Size & Composition, geographic location)
Occupation, Employment Status, Income
Site of Service Information: Location of Health Service Type
Measures of Healthcare Outcomes:
(Mortality, Morbidity, Mobility, Functional Status, Quality of Life, Quality Measures, Quality of Care, Readmissions)
Self-reported health status and measures of functional status.
Strengths, Limitations & Feasibility
Data Strengths: There are multiple years of data available (longitudinal data). Comprehensive documentation is available for all respondents across a variety of key policy issues. There is a low sample attrition rate.
Data Limitations: Limited granularity in diagnosis coding, unless linked with Medicare claims data.
Data Access Restrictions: Data are available to the public at no cost. Detailed race/ethnicity data are available on a restricted basis.
Data Linking Feasibility
(Unique identifiers or sufficient demographics to allow for data linkages)
Respondent information can be linked to social security data, Medicare claims data and supplemental employer surveys.
Related Grouping Systems: n/a

 
