Evaluation of Privacy-Preserving Record Linkage Solutions to Broaden Linkage Capabilities in Support of Patient-Centered Outcomes Research Objectives

Agency
  • National Center for Health Statistics at the Centers for Disease Control and Prevention (NCHS-CDC)
Start Date
  • 05/01/22
Functionality
  • Linking Clinical Data and Other Data for Research
  • Use of Clinical Data for Research

 

STATUS: Active Project

BACKGROUND

Privacy-preserving record linkage (PPRL) is a linkage methodology that mitigates privacy concerns when linking person-level data from disparate data sources. PPRL adds privacy protections by encrypting or masking the personally identifiable information (PII) used for person-level matching. Both open-source and commercial PPRL tools are available to researchers; however, different PPRL tools may produce different linkage results, which in turn can affect patient-centered outcomes research (PCOR) findings. To maximize the potential of PPRL for PCOR data resources, it is important to understand the attributes of available tools and their linkage accuracy.
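As a rough illustration of what "masking PII used for matching" can look like, the sketch below replaces identifying fields with a keyed hash token and links records on the token alone. It is a generic example, not the method of any tool evaluated in this project; the field names (`first_name`, `last_name`, `dob`, `id`), the shared key, and the helper functions are all hypothetical.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical secret shared by the data holders; not part of the project description.
SHARED_KEY = b"example-key-agreed-out-of-band"


def normalize(value: str) -> str:
    """Strip accents, punctuation, and whitespace, then uppercase."""
    decomposed = unicodedata.normalize("NFKD", value)
    return "".join(ch for ch in decomposed if ch.isalnum()).upper()


def make_token(record: dict, key: bytes = SHARED_KEY) -> str:
    """Build a keyed hash (HMAC-SHA256) over normalized PII fields."""
    parts = [normalize(record.get(field, "")) for field in ("first_name", "last_name", "dob")]
    message = "|".join(parts).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def link(source_a: list, source_b: list) -> list:
    """Pair record ids whose tokens are identical; raw PII is never compared."""
    index = {make_token(rec): rec["id"] for rec in source_b}
    pairs = []
    for rec in source_a:
        token = make_token(rec)
        if token in index:
            pairs.append((rec["id"], index[token]))
    return pairs
```

An exact-match token like this only links records whose normalized fields agree completely, so missing or inconsistent PII leads to missed links. That sensitivity is one reason different tools can produce different linkage results.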

Leveraging the extensive NCHS-CDC linked data repository, this project will assess linkage results from three PPRL tools currently in use or in development within the Department of Health and Human Services (HHS). NCHS-CDC will compare the linkage results obtained through the tools with linked data resources developed using gold-standard linkage methods. The project will assess a variety of scenarios, including PII that is non-standardized, incomplete (e.g., missing unique identification numbers), and of varying levels of quality.
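One common way to compare a tool's links against a gold-standard file is to treat each linked pair as a prediction and compute precision, recall, and F1. The project description does not say which metrics NCHS-CDC will use, so the sketch below is only an assumed example; the function name and the pair representation are hypothetical.

```python
def linkage_metrics(tool_pairs, gold_pairs):
    """Score a tool's linked pairs against gold-standard linked pairs.

    Each pair is a (record_id_a, record_id_b) tuple. Assumes both inputs
    use the same identifiers for the same underlying records.
    """
    tool, gold = set(tool_pairs), set(gold_pairs)
    true_links = len(tool & gold)  # links the tool found that the gold standard confirms

    precision = true_links / len(tool) if tool else 0.0
    recall = true_links / len(gold) if gold else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    return {
        "true_links": true_links,
        "false_links": len(tool - gold),   # tool links absent from the gold standard
        "missed_links": len(gold - tool),  # gold-standard links the tool did not find
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

Running the same calculation separately for each scenario (for example, records missing a unique identification number) would show where a tool's accuracy degrades.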

PROJECT PURPOSE & GOALS

To foster transparency and increase confidence in the validity of data resources created through PPRL, this project will assess open-source and commercial PPRL tools and gather lessons learned regarding working with PPRL.

The project has the following objectives:

  • Adapt open-source PPRL tools and obtain licenses for commercial PPRL tools.

  • Compare PPRL tools’ performance to the benchmark NCHS-CDC linked data files.

  • Analyze the security and re-identification risks of using the three PPRL tools to join records across multiple data sources.

  • Engage HHS stakeholders and disseminate findings of the project.