Childhood Obesity Data Initiative (CODI): Integrated Data for Patient-Centered Outcomes Research Project

Expand Availability of Data for Childhood Obesity Outcomes Research
  • Centers for Disease Control and Prevention (CDC)
Start Date
  • 5/30/2018
  • Use of Clinical Data for Current Research
  • Standardized Collection of Standardized Clinical and Claims Data
  • Linking of Clinical and Other Data for Research


STATUS: Completed Project


Currently, distributed patient-centered outcomes research (PCOR) networks routinely gather data collected in health care settings and structure these data in a common way (i.e., with a Common Data Model, or CDM). Although these networks are capable of combining patient-level health intervention data with community-level data, they lack coding for children’s data, so these types of linkages have been limited. By building linkages and more advanced tools, CODI will help researchers fill the evidence gaps identified by the U.S. Preventive Services Task Force (USPSTF) in 2017. These data will help researchers answer questions such as whether all children are being screened appropriately for obesity and whether disparities exist by demographic group, and will help identify the weight management programs (WMPs) that are most effective.

The project piloted enhanced tools and services (e.g., patient record linkage and deduplication services) in the Colorado Health Observation Regional Data Service (CHORDS), a PCORnet Clinical Data Research Network (CDRN). CHORDS has initiatives to link patient health records to other data sources. To date, coding improvements and implementation of linkage services in large networks have been limited by a lack of resources for this purpose. The CODI project built data capacity for related research by linking data from health records, such as weight, height, and blood pressure, with data on WMP interventions and communities. This expanded the availability of data for childhood obesity PCOR, which helps health professionals develop and tailor interventions specific to the needs of children.


The purpose of this project was to link pediatric clinical electronic health record (EHR) data, WMP intervention data, and community-level census information to expand the availability of data for researchers.

Project Objectives:

  • Establish an end-user collaborative of 15 subject matter experts, childhood obesity researchers, and network representatives to capture the data needs of end-users and technical requirements needed to facilitate the exchange and linkage of the data identified.
  • Expand and standardize patient-level EHR and WMP intervention data, as well as community-level census data, by extending the PCORnet CDM to capture CODI clinical and intervention data elements.
  • Identify and describe the current business and technical processes and tasks for capturing the required childhood obesity data within health care and WMP intervention workflows, and then design the future ideal processes for the CODI project.
  • Expand linkage and de-duplication tools for integrating childhood obesity data.
  • Pilot the enhanced CODI information technology services and implementation guide to test the ability to link and query data in CHORDS to produce datasets for PCOR researchers to analyze.


  • The project team developed and piloted enhanced tools and services to 1) create individual-level, longitudinal records of linked multi-sector childhood obesity data and 2) promote sharing individual and community data while preserving privacy.
  • The team created standardized approaches to collecting childhood obesity data and developed Privacy Preserving Record Linkage (PPRL) techniques by linking child health-related data from Denver Health, Children’s Hospital Colorado, Kaiser Permanente Colorado, Girls on the Run of the Rockies, and Hunger Free Colorado.
  • The project team developed data cleaning tools for large EHR datasets, which will be made publicly available to help other organizations automate data cleaning and the standardization and sharing of data.
  • The team supported external pediatric obesity research by developing distributed queries that researchers and partners could use to examine 1) the local burden of childhood obesity, and 2) the impact of a service or an intervention on individuals’ health.
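
As a rough illustration of the distributed-query idea in the last item, the Python sketch below has each site compute aggregate counts by weight category locally and share only those counts for pooling. It is not the CODI query code. The record fields (`age_years`, `bmi_percentile`) and all function names are hypothetical. The categories use the CDC BMI-for-age percentile cut points for children and teens.

```python
from collections import Counter

def weight_category(bmi_percentile: float) -> str:
    """Map a BMI-for-age percentile to a CDC weight category."""
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile < 85:
        return "healthy weight"
    if bmi_percentile < 95:
        return "overweight"
    return "obesity"

def site_counts(records) -> Counter:
    """Run locally at one site; returns aggregate counts only, no patient rows.

    Each record is a dict with hypothetical keys 'age_years' and
    'bmi_percentile'. Children aged 2 through 19 years are included.
    """
    counts = Counter()
    for rec in records:
        if 2 <= rec["age_years"] < 20:
            counts[weight_category(rec["bmi_percentile"])] += 1
    return counts

def pooled_prevalence(per_site_counts) -> dict:
    """Combine per-site counts into an overall prevalence per category."""
    total = Counter()
    for counts in per_site_counts:
        total.update(counts)
    n = sum(total.values())
    return {category: total[category] / n for category in total} if n else {}
```

Pooling counts this way counts a child seen at two sites twice, which is one reason record linkage and deduplication matter before prevalence is estimated.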



  • The team produced a report summarizing the results of the business process analysis (BPA) used in the CODI project to define the landscape of childhood obesity screening and treatment.
  • The team conducted and produced summaries of the CODI data architecture gaps analyses and recommendations.
  • The project produced a report that presents the results of the evaluation conducted on different PPRL packages to identify the best fit for CODI.
  • The team produced the CODI enhanced data tools and services:
    • Data Owner Tools: Tools for Clinical and Community Data Initiative (CODI) Data Owners to extract personally identifiable information (PII) from the CODI Data Model and garble PII to send to the linkage agent for matching. These tools facilitate hashing and Bloom filter creation as part of a Privacy-Preserving Record Linkage (PPRL) process.
    • Linkage Agent Tools: Tools for the Childhood Obesity Data Initiative (CODI) Linkage Agent to accept garbled input from data owners and partners, perform matching, and generate network IDs. These can also be thought of as Semi-Trusted Third Party (STTP) tools. They facilitate a Privacy-Preserving Record Linkage (PPRL) process and build on the open source anonlink software package.
  • Longitudinal query code was developed for two use cases in CODI:
    • CODI Use Case 1.4 - Estimate the prevalence by weight category in children between the ages of 2 and 19 years in 2017, 2018, and 2019.
    • CODI Use Case 2.1 - Among children aged 2–19 years who participated in an intervention in 2017, is intervention dose associated with health outcomes?
  • The project team published the CODI Data Models Implementation Guide, which contains implementation guidance on the CODI Research Data Model (RDM) and CODI Record Linkage Data Model (RLDM). The guide is located on GitHub here: Data Model Implementation Guide.pdf
  • The team published the CODI PPRL Implementation Guide, which provides guidance on implementing PPRL for participating organizations, and includes a description of the PPRL process, specific guidance for each PPRL role, and performance evaluation guidance. The guide is available on GitHub here: PPRL Implementation Guide.pdf
  • The team developed a fact sheet that provides additional information on the CODI projects. The fact sheet is available here:
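
To make the hashing and Bloom filter step of PPRL described under the Data Owner and Linkage Agent tools more concrete, here is a minimal Python sketch of the general technique. It is not the CODI tools' implementation and does not use the anonlink API. The filter length, the number of hashes, the shared secret, and all function names are assumptions chosen for illustration. A data owner encodes a PII field as a set of bit positions derived from keyed hashes of its character bigrams, and a linkage agent compares two encodings with the Dice coefficient without seeing the original values.

```python
import hashlib
import hmac

FILTER_BITS = 1024  # hypothetical Bloom filter length
NUM_HASHES = 20     # hypothetical number of hash functions per bigram

def bigrams(value: str) -> set:
    """Split a normalized, padded string into overlapping two-character tokens."""
    padded = f"_{value.strip().lower()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}

def bloom_encode(value: str, secret: bytes) -> frozenset:
    """Data owner side: return the set of bit positions set in the filter."""
    positions = set()
    for gram in bigrams(value):
        for i in range(NUM_HASHES):
            message = f"{i}:{gram}".encode()
            digest = hmac.new(secret, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % FILTER_BITS)
    return frozenset(positions)

def dice(a: frozenset, b: frozenset) -> float:
    """Linkage agent side: Dice similarity between two encodings (0 to 1)."""
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))

# Usage: data owners share a secret that the linkage agent never receives.
secret = b"example-shared-secret"  # placeholder value
same = dice(bloom_encode("Katherine", secret), bloom_encode("katherine ", secret))
close = dice(bloom_encode("Katherine", secret), bloom_encode("Catherine", secret))
far = dice(bloom_encode("Katherine", secret), bloom_encode("Zbigniew", secret))
```

Because similar strings share most of their bigrams, their encodings overlap heavily, so `close` scores well above `far`. A linkage agent would typically treat pairs scoring above a chosen threshold as the same person and assign them a common network ID.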


Below is a list of ASPE-funded PCORTF projects that are related to this project.

Childhood Obesity Data Initiative (CODI): Integrated Data for Patient-Centered Outcomes Research Project 2.0 - Linked, longitudinal clinical and community child-specific data are needed to assess the appropriateness of childhood obesity screening programs and the effectiveness of pediatric weight management interventions (PWMIs), as well as sociodemographic factors that influence childhood obesity. Initiated in 2018, CODI 1.0 built linkage capabilities, coding upgrades, and other enhancements into established PCOR networks to enhance the pediatric data available to researchers. As a continuation of CODI 1.0, this project will carry on the work of addressing data needs as they relate to childhood PWMIs.

Harmonization of Various Common Data Models and Open Standards for Evidence Generation - This project was a collaborative effort among the Food and Drug Administration (FDA), National Cancer Institute (NCI), National Institutes of Health/National Center for Advancing Translational Sciences (NIH/NCATS), Office of the National Coordinator for Health Information Technology (ONC), and the National Library of Medicine (NLM). The project built a data infrastructure for conducting PCOR using observational data derived from the delivery of health care in routine clinical settings. The sources of these data include, but are not limited to, insurance billing claims, electronic health records (EHRs), and patient registries. In addition, the project team harmonized several existing common data models, including PCORnet and other networks.