Harmonization of Various Common Data Models and Open Standards for Evidence Generation

Building data infrastructure for conducting patient-centered outcomes research using observational data derived from the delivery of health care in routine clinical settings.
  • Food and Drug Administration (FDA)
  • National Institutes of Health's National Library of Medicine (NLM), National Cancer Institute (NCI) and National Center for Advancing Translational Sciences (NCATS)
  • Office of the National Coordinator for Health Information Technology (ONC)
Start Date
  • 2/8/2017
  • Use of Clinical Data for Research
  • Standardized Collection of Standardized Clinical Data
  • Linking Clinical and Other Data for Research


STATUS: Completed Project


To achieve a sustainable data network infrastructure, promote interoperability, and foster the creation of a Learning Health System (LHS), there is a need to map and transform data across various Common Data Models (CDMs) and to leverage open-source standards. Mapping the data elements of the various CDMs, and building on existing PCORTF investments, makes it feasible to reuse the data, methods, and other resources from each network, giving PCOR researchers access to larger and more diverse types of observational data.


This project was a collaborative effort among the FDA, NCI, NIH/NCATS, ONC, and the NLM. The project’s goal was to build data infrastructure for conducting patient-centered outcomes research (PCOR) using observational data derived from the delivery of health care in routine clinical settings. The sources of these data include, but are not limited to, insurance billing claims, electronic health records (EHRs), and patient registries. A CDM organizes data into a standard structure, but that structure may differ across networks. This project harmonized several existing CDMs to support research and analyses across multiple data networks, advancing the utility of the data and its interoperability across networks to facilitate PCOR. The enhanced data infrastructure created through this project has the capacity to support evidence generation on patient-centered outcomes that can inform regulatory and clinical decision making within federal programs.

Project Objectives:

  • Develop a common data architecture as the intermediary between four CDMs within four networks, namely Sentinel, PCORnet, i2b2, and OHDSI.

  • Develop a flexible data model that can be used to create outbound data in multiple formats for multiple purposes.

  • Test the common data architecture by using it to study factors associated with the safety and effectiveness of newly approved oncology drugs that boost patients’ immune response to cancer. These drugs, known broadly as immune checkpoint inhibitors, are gaining approvals in a number of different indications, but it is unclear how safe they are in routine clinical care and how their effectiveness may vary across patient subpopulations or in combination with other effective agents for comorbid conditions, such as those that treat autoimmune disorders. In this 2-year project, the team focused on three agents in the programmed cell death protein 1 (PD1)/programmed death-ligand 1 (PDL1) class of oncology drugs, with a focus on patients who have both cancer and an autoimmune condition. To validate this specific use case, the statistical tests and methods in the Sentinel and OHDSI libraries were applied to the mapped CDMs.

  • Establish methods and develop processes, policies, and governance for ongoing curation, maintenance, and sustainability of the common data architecture, building upon existing resources, standards, and tools. Examples of existing resources include, but are not limited to, the Data Access Framework (DAF) developed by ONC to interface with various CDMs and the NIH Common Data Element Repository to register the harmonized, standardized data elements within each CDM.
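
The role of a common data architecture as an intermediary can be illustrated with a minimal Python sketch. It is not the project's implementation: the model names (`model_a`, `model_b`), the coded values, and the helper functions are all hypothetical. It shows only the general idea that each model is mapped once to and from a shared hub, so four models need 8 directional mappings rather than the 12 needed for pairwise translation.

```python
from typing import Dict

# Hypothetical codings for a single data element (patient sex) in two
# made-up models. These are illustrative and are not taken from any
# CDM specification.
TO_HUB: Dict[str, Dict[str, str]] = {
    "model_a": {"F": "female", "M": "male"},
    "model_b": {"8532": "female", "8507": "male"},
}

# Invert each model-to-hub table to get the hub-to-model direction.
FROM_HUB: Dict[str, Dict[str, str]] = {
    model: {hub: local for local, hub in codes.items()}
    for model, codes in TO_HUB.items()
}


def translate(value: str, source: str, target: str) -> str:
    """Translate a coded value from one model to another via the hub."""
    hub_value = TO_HUB[source][value]
    return FROM_HUB[target][hub_value]


def mappings_needed(n_models: int) -> Dict[str, int]:
    """Count directional mappings with and without an intermediary."""
    return {
        "pairwise": n_models * (n_models - 1),
        "via_hub": 2 * n_models,
    }


print(translate("F", "model_a", "model_b"))
print(mappings_needed(4))
```

Adding a fifth model in this arrangement would require only one new pair of mapping tables, which is the main reason to route translations through an intermediary.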


  • Along with leading an environmental scan of existing CDM artifacts, the FDA project team developed the oncology use case for the PCORnet 3.1 and 4.0 CDMs.
  • The NIH/NCATS team surveyed the market for an existing open-source extract, transform, and load (ETL) software tool to automate the data mapping process, and prepared a report on the selection process. The NIH/NCATS team also created a “Query Builder,” a front-end interface that offers researchers a simple way to construct and issue their research questions. “Query Transformation” transforms the query into a version that is compatible with each CDM. The CDM Harmonization Results Database and Viewer receives and analyzes the results of a query in one or more of the CDM formats. To process these results, the team created a tool that exports record-level results in the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) format.
  • The NIH/NCI team completed the metadata curation of four CDMs (Sentinel, PCORnet 4.0, OMOP, and i2b2 ACT) and completed the registration of the CDMs and the Biomedical Research Integrated Domain Group (BRIDG) model in the cancer Data Standards Registry and Repository (caDSR). The final CDM packages were sent to NIH/NLM, and the common data elements (CDEs) of the four CDMs have been uploaded to the NIH CDE Repository.
  • The ONC and NIH teams completed the mapping of the four data models to the NIH BRIDG conceptual model, and from BRIDG to Fast Healthcare Interoperability Resources (FHIR®). Pilot testing of the Common Data Models Harmonization (CDMH) Implementation Guide (IG) has been completed. The FHIR resource extensions and the CDMH IG have completed the first round of Health Level Seven (HL7) balloting and are in ballot reconciliation.
  • The NIH/NLM team developed a governance framework document that outlines suggested policies and practices for access to and use of the real-world data that are derived from data-sharing networks that connect CDMs.
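
The query transformation step described above can be sketched in Python. This is a minimal illustration under stated assumptions and does not reflect the project's actual tools: the `CohortQuery` class, the `cdm_x` and `cdm_y` identifiers, and every table and column name are hypothetical. It shows one model-neutral query being rendered as a parameterized SQL statement for each target model.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class CohortQuery:
    """A model-neutral request: patients with a given coded concept."""
    concept: str  # kind of clinical fact, e.g. "diagnosis"
    code: str     # coded value to match


# Hypothetical physical layouts for two made-up models:
# concept -> (table, patient id column, code column).
SCHEMA: Dict[str, Dict[str, Tuple[str, str, str]]] = {
    "cdm_x": {"diagnosis": ("diagnosis", "patient_id", "dx_code")},
    "cdm_y": {"diagnosis": ("condition", "person_id", "condition_code")},
}


def transform(query: CohortQuery, cdm: str) -> Tuple[str, Tuple[str, ...]]:
    """Render the neutral query as SQL text plus bound parameters."""
    table, id_col, code_col = SCHEMA[cdm][query.concept]
    sql = f"SELECT DISTINCT {id_col} FROM {table} WHERE {code_col} = ?"
    return sql, (query.code,)


query = CohortQuery(concept="diagnosis", code="C34.90")
for cdm in SCHEMA:
    print(cdm, transform(query, cdm))
```

Keeping the code value as a bound parameter rather than interpolating it into the SQL string avoids injection problems when queries come from a user-facing front end.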




Below is a list of ASPE-funded PCORTF projects that are related to this project:

Standardization and Querying of Data Quality Metrics and Characteristics for Electronic Health Data - Under the FDA, this project created and implemented a metadata standards data capture and querying system for data quality and characteristics, data source and institutional characteristics, and “fitness for use.” This project addresses the need to build bridges across networks and databases, so that information captured in each source can be combined and used for research.