Data Linkage: Evaluating Privacy Preserving Record Linkage Methodology and Augmenting the National Hospital Care Survey with Medicaid Administrative Records

Assess Novel Privacy Advancements in Patient Centered Outcomes Research and Broaden the Scope of Available Data
  • Centers for Disease Control and Prevention (CDC)
Start Date
  • 5/12/2020
  • Use of Clinical Data for Research
  • Linking of Clinical and Other Data for Research
  • Use of Enhanced Publicly-Funded Data Systems for Research


STATUS: Active Project


This project builds upon previously funded OS-PCORTF efforts to expand data capacity for research studies. It focuses specifically on patient health outcomes across the continuum of care through the linking of disparate data sources. The first part of the project will focus on privacy preserving record linkage (PPRL). To ensure the accuracy of linked data sets, linkage algorithms rely on the exchange and matching of personally identifiable information (PII). While this heightens researchers’ capability to examine individual health outcomes, concerns remain regarding privacy and the exchange of identifiable information. In an effort to move away from reliance on PII, groups such as Datavant and the CDC’s Childhood Obesity Data Initiative (CODI) have been working to develop linkage techniques that use PPRL, which eliminates PII sharing among organizations. This project assesses and compares linkage results from PPRL against those from earlier algorithms that use PII, in order to identify discrepancies and their impact on subsequent analysis of the linked data.
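To illustrate the general idea of linking without sharing PII, the following is a minimal sketch of one common PPRL approach: each organization converts identifiers into keyed hash tokens locally, and only the tokens are exchanged and compared. It is not the Datavant or CODI method; the function names, the choice of fields (first name, last name, date of birth), and the use of HMAC-SHA256 with a shared secret key are all assumptions made for illustration.

```python
import hashlib
import hmac
import unicodedata


def normalize(value: str) -> str:
    """Strip accents, punctuation, and spacing so minor formatting
    differences do not produce different tokens (illustrative rule only)."""
    text = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in text if ch.isalnum()).upper()


def make_token(secret_key: bytes, first: str, last: str, dob: str) -> str:
    """Build a keyed hash of the normalized identifiers.
    The key is assumed to be shared by the data holders but not by the linker."""
    message = "|".join(normalize(part) for part in (first, last, dob))
    return hmac.new(secret_key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_by_token(tokens_a: dict, tokens_b: dict) -> list:
    """Match record IDs from two sources whose tokens are identical.
    Inputs map record ID -> token; no PII is needed at this step."""
    by_token = {token: rec_id for rec_id, token in tokens_b.items()}
    return sorted(
        (a_id, by_token[token])
        for a_id, token in tokens_a.items()
        if token in by_token
    )


if __name__ == "__main__":
    key = b"hypothetical-shared-secret"
    # Each organization tokenizes its own records locally.
    source_a = {"a1": make_token(key, "Ann", "O'Neil", "1980-01-02")}
    source_b = {
        "b7": make_token(key, " ann", "ONEIL", "19800102"),
        "b8": make_token(key, "Bob", "Lee", "1975-05-05"),
    }
    # Only the tokens are compared.
    print(link_by_token(source_a, source_b))
```

Exact token matching of this kind only links records whose normalized identifiers agree completely, which is one reason PPRL results can differ from those of PII-based algorithms that tolerate partial agreement.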

The second part of the project will create new linked data sets to support patient-centered outcomes research (PCOR). In 2017 and 2019, the National Center for Health Statistics (NCHS) supported the linkage of the National Hospital Care Survey (NHCS) with three different data sets: 1) the National Death Index (NDI); 2) data regarding participation in Department of Housing and Urban Development (HUD) housing programs; and 3) Medicare enrollment, prescription medication, and claims data from the Centers for Medicare & Medicaid Services (CMS). These linkages allow researchers to analyze individual-level outcomes using a broad array of patient-specific information. The linked data also enhance public understanding of the interplay between health determinants and post-hospitalization outcomes. This project seeks to create new data sets through the linkage of Medicaid claims data from the CMS Transformed Medicaid Statistical Information System (T-MSIS) and the 2014 and 2016 NHCS. This expansion of existing infrastructure diversifies and widens the range of PCOR topics that researchers can investigate. These topics include interventions for opioid use, evaluation of medication protocols, use of social programs as a health determinant, and health disparities among understudied demographic groups.


This project aims to evaluate privacy preserving record linkage methodologies and broaden existing data resources that are individually matched across factors that may influence health outcomes.

The project objectives are to:

  • Evaluate the PPRL technique using past PCORTF-funded NHCS-NDI linkages as a gold standard.

  • Disseminate output showing the suitability of PPRL as a linkage technique and the creation of new data sets to conduct PCOR.

  • Conduct patient-level record linkages of 2014 and 2016 NHCS hospital administrative claims and EHR data to CMS’ T-MSIS data from 2014-2017 (and 2018, if available).

  • Develop research and user guidance materials to aid the use of new and existing NHCS linked data sets in PCOR.
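
As a rough illustration of the first objective, comparing a new set of links against a gold standard can be sketched as below. This is a hypothetical example: the source does not specify which agreement measures the project uses, so precision and recall are assumed here, and the function name and record IDs are invented for illustration.

```python
def evaluate_linkage(candidate_pairs, gold_pairs):
    """Compare linked record pairs from a candidate method (for example PPRL)
    against gold-standard pairs (for example an earlier PII-based linkage)."""
    candidate = set(candidate_pairs)
    gold = set(gold_pairs)
    true_positives = len(candidate & gold)
    return {
        "true_positives": true_positives,
        # Links the candidate method made that the gold standard did not.
        "false_positives": len(candidate - gold),
        # Gold-standard links the candidate method failed to find.
        "missed_links": len(gold - candidate),
        "precision": true_positives / len(candidate) if candidate else 0.0,
        "recall": true_positives / len(gold) if gold else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical (hospital record ID, death record ID) pairs.
    pprl_links = [("h1", "d1"), ("h2", "d2"), ("h3", "d9")]
    gold_links = [("h1", "d1"), ("h2", "d2"), ("h4", "d4"), ("h5", "d5")]
    print(evaluate_linkage(pprl_links, gold_links))
```

The lists of false positives and missed links, not just the summary rates, are what would let an analyst check whether discrepancies are concentrated in particular groups and so could affect later analysis.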