A National COVID-19 Longitudinal Research Database Linked to Centers for Medicare and Medicaid (CMS) Data

Agency
  • National Institutes of Health (NIH)/National Center for Advancing Translational Sciences (NCATS)
Start Date
  • 08/04/2021
OS-PCORTF Strategic Plan Alignment
  • Primary:
    • Goal 2. Data Standards and Linkages for Longitudinal Research
  • Secondary: 
    • Goal 1. Data Capacity for National Health Priorities
    • Goal 3. Technology Solutions to Advance Research
    • Goal 4. Person Centeredness

STATUS: Completed Project

BACKGROUND

The COVID-19 pandemic created an urgent need for a near-real-time, centralized research dataset to conduct patient-centered outcomes research (PCOR) on COVID-19 and generate evidence on effective interventions, especially in vulnerable patient subgroups. Linking clinical and claims data is necessary to address important research questions on COVID-19 therapeutics and vaccines, long-term complications of COVID-19, disparate impacts of COVID-19 on vulnerable populations, and the safety-net providers who serve them.

Researchers both within the federal government and in non-federal institutions have been hampered by the lack of timely data on health outcomes associated with COVID-19. This project built on several COVID-related efforts led independently by federal agencies, which brought health systems, payment policy, and clinical perspectives to the collaboration. The project also sought to bring together agencies across the Department of Health and Human Services (HHS) to engage in research to inform policy on the health system response to COVID-19 and to prepare for future emergencies. The final product is a privacy-preserving, comprehensive linked dataset on individual outcomes and health system response that can be used internally by HHS agencies and externally by non-federal researchers through the NCATS National COVID Cohort Collaborative (N3C).

PURPOSE

The overall goal of this project was to enhance COVID-19 data infrastructure for PCOR to produce a national research dataset on COVID-19 by:

  • Demonstrating the feasibility of linking clinical electronic health record (EHR) data with Medicare claims data using the proposed N3C data linkage strategy and engaging researchers interested in PCOR in using the linked research dataset. 
  • Linking PCOR Researchers to the NCATS N3C Data Enclave to access Medicare Claims Data linked to the N3C Clinical EHR Data. 
  • Producing PCOR COVID use cases demonstrating the utility of the linked Medicare claims-N3C clinical data to conduct patient-centered outcomes research on COVID-19, including potential evaluation of economic outcomes. 
  • Supporting the joint activities of the OS-PCORTF COVID Collaborative. 
  • Assessing the feasibility of linking clinical EHR data with Medicaid claims data using the proposed N3C data linkage strategy.
  • Linking PCOR Researchers to the NCATS N3C Data Enclave to access Medicaid Claims Data linked to the N3C Clinical EHR Data.
  • Producing PCOR COVID use cases demonstrating the utility of the linked Medicaid claims-N3C clinical data to conduct patient-centered outcomes research on COVID-19.

KEY IMPACTS

Improving the quality of data: Harmonizing Medicare and Medicaid data 
The project harmonized Medicare and Medicaid data using the Observational Medical Outcomes Partnership (OMOP) common data model (CDM) to lower the burden on other researchers of standardizing, linking, and using these data. The project created a Code Map Service and a Data Transformation Pipeline and validated the mapping.
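
The harmonization step described above can be illustrated with a minimal sketch. It is not the project's Code Map Service or Data Transformation Pipeline, whose internals are not described here. The lookup table, the `ClaimDiagnosis` record, and both function names are hypothetical, and the concept IDs are placeholders, not real OMOP concept IDs. The output field names follow the OMOP CDM `condition_occurrence` table, and concept ID 0 is the OMOP convention for a source code with no standard match.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup: (source vocabulary, source code) -> standard concept ID.
# The IDs are placeholders for illustration, not real OMOP concept IDs.
CODE_MAP = {
    ("ICD10CM", "U07.1"): 900001,   # COVID-19
    ("ICD10CM", "J12.82"): 900002,  # pneumonia due to COVID-19
}

UNMAPPED_CONCEPT_ID = 0  # OMOP convention for "no matching concept"


@dataclass
class ClaimDiagnosis:
    """Hypothetical, simplified diagnosis line from a claim."""
    beneficiary_key: str
    vocabulary: str
    code: str
    service_date: date


def to_condition_occurrence(claim: ClaimDiagnosis, person_id: int) -> dict:
    """Map one claim diagnosis to a row shaped like OMOP condition_occurrence."""
    concept_id = CODE_MAP.get((claim.vocabulary, claim.code), UNMAPPED_CONCEPT_ID)
    return {
        "person_id": person_id,
        "condition_concept_id": concept_id,
        "condition_start_date": claim.service_date,
        "condition_source_value": claim.code,  # keep the original code for audit
    }


def mapping_coverage(rows: list) -> float:
    """Share of rows that received a standard concept (a simple validation check)."""
    if not rows:
        return 0.0
    mapped = sum(1 for r in rows if r["condition_concept_id"] != UNMAPPED_CONCEPT_ID)
    return mapped / len(rows)
```

Keeping the source code alongside the mapped concept lets unmapped rows be reviewed later, and a coverage figure of this kind is one simple way a mapping could be checked.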

Providing more relevant, comprehensive data: CMS-N3C data linkage 
The project produced a privacy-preserving, comprehensive linked dataset that links Centers for Medicare and Medicaid Services (CMS) claims with clinical EHR data contributed to the N3C Data Enclave by 84 health systems, for use by PCOR researchers. The linked dataset improves the comprehensiveness of the clinical care history and longitudinal outcomes of patients in the N3C cohort by adding CMS claims data on therapeutics, comorbid diagnoses, vaccinations, health care utilization, and hospital data on deaths. It also supports the evaluation of post-COVID “long-haulers”.

Enhancing analytic resources: Establishing a privacy-preserving record linkage (PPRL) 
The project engaged subject matter experts across federal agencies to design a preliminary CMS-N3C data linkage strategy. Statistical experts also supported the development of plans for constructing statistical sampling weights to examine how well the linked dataset represents the US population. Finally, the project ensured that the CMS-N3C data were correctly linked.
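
The general idea behind privacy-preserving record linkage can be sketched as follows. This is a generic illustration, not the N3C linkage process, which is documented in the publications listed below. It assumes each data holder derives a keyed hash (HMAC-SHA256) from normalized identifiers using a shared secret, so that only tokens, never names or birth dates, are compared. The function names, the choice of identifiers, and the exact-match rule are all assumptions made for illustration.

```python
import hashlib
import hmac


def normalize(value: str) -> str:
    """Lowercase and drop non-alphanumeric characters so formatting differences do not block a match."""
    return "".join(ch for ch in value.lower() if ch.isalnum())


def make_token(secret: bytes, first: str, last: str, dob: str) -> str:
    """Derive a keyed-hash token from identifiers; the raw identifiers are never shared."""
    message = "|".join(normalize(part) for part in (first, last, dob)).encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


def link(ehr_tokens: dict, claims_tokens: dict) -> list:
    """Return sorted (ehr_id, claims_id) pairs whose tokens match exactly.

    Each argument maps token -> pseudonymous record ID held by that data source.
    """
    shared = ehr_tokens.keys() & claims_tokens.keys()
    return sorted((ehr_tokens[t], claims_tokens[t]) for t in shared)
```

In a sketch like this, the party performing the match sees only tokens and pseudonymous IDs. Exact matching on a single token is the simplest possible rule, and typographical errors in the identifiers would cause missed links.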

PUBLICATIONS

N3C Privacy-Preserving Record Linkage and Linked Data Website. This webpage summarizes PPRL and N3C's PPRL processes for end users. It also describes the technical and data governance architecture for the N3C PPRL linked data.

CMS Medicare-N3C Linked National Longitudinal COVID Research Dataset. This dataset contains de-identified billing data from the Centers for Medicare & Medicaid Services, formatted into the Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM) for National COVID Cohort Collaborative (N3C) researcher use. 

N3C Dashboard. This webpage publishes key statistics on the CMS Medicare-National COVID Cohort Collaborative (N3C) linked data, including numbers, characteristics, representativeness of the sample compared to the CMS population, and sampling weights.

N3C Privacy-Preserving Record Linkage and Linked Data Governance. This governance document gave sites participating in the N3C greater visibility into and context around the PPRL that occurred through the N3C and how linked data will be used downstream.

Project Final Report. This final report describes the project background, goals, accomplishments, challenges, and key outcomes and outputs of the National COVID-19 Longitudinal Research Database. It also outlines use cases for the linked dataset and methods documentation.