Utilizing Natural Language Processing and Machine Learning to Enhance the Identification of Stimulant and Opioid-Involved Health Outcomes in the National Hospital Care Survey

Agency
  • Centers for Disease Control and Prevention (CDC)
  • National Center for Health Statistics (NCHS)
Start Date
  • 03/01/23
Functionality
  • Standardized Collection of Standardized Clinical Data
  • Use of Clinical Data for Research
  • Linking Clinical and Other Data for Research

 

STATUS: Active Project

BACKGROUND

Substance use continues to be a major national public health concern. Given the ongoing opioid epidemic, research on substance use has focused heavily on opioids. However, there has been a recent rise in the use of stimulants and in the co-use (simultaneous use) of stimulants and opioids. While population-level surveillance of substance use exists, few resources capture this information at the clinical level using all components of the electronic health record (EHR).

To address these data needs, the National Center for Health Statistics (NCHS) will develop an algorithm that uses natural language processing (NLP) and machine learning (ML) techniques to identify hospital encounters in the National Hospital Care Survey (NHCS) that involve the use of illicit stimulants, misuse of prescription stimulants, and the co-involvement of stimulants and opioids. The project will build upon the success of two previous OS-PCORTF projects that developed (1) an enhanced opioid identification algorithm and (2) an algorithm to identify co-occurring substance use disorders and mental health issues in patients with opioid-involved encounters. The new algorithm will be developed and validated using NHCS data and will improve the identification of stimulant- and opioid-involved hospital encounters from both structured and unstructured hospital data. This information is not readily available in other datasets for use by researchers, and the resulting products (algorithm and data files) will enhance access to robust, high-quality data on stimulant-associated health outcomes.
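As a rough illustration of what combining structured and unstructured data can mean, the sketch below flags an encounter when either its diagnosis codes or its clinical note text point to stimulant or opioid involvement. It is not the NCHS algorithm: the function name `flag_encounter`, the ICD-10-CM code prefixes chosen, the keyword lists, and the crude negation check are all assumptions made for this example, and a real system would rely on validated code lists and trained NLP/ML models.

```python
import re

# Illustrative ICD-10-CM prefixes with dots removed (assumption: not the project's code list).
# F14 cocaine-related disorders, F15 other stimulant-related disorders,
# T40.5 poisoning by cocaine, T43.62 poisoning by amphetamines.
STIMULANT_PREFIXES = ("F14", "F15", "T405", "T4362")
# F11 opioid-related disorders, T40.0-T40.4 and T40.6 opioid/narcotic poisonings.
OPIOID_PREFIXES = ("F11", "T400", "T401", "T402", "T403", "T404", "T406")

# Hypothetical keyword patterns for free-text notes.
STIMULANT_TERMS = re.compile(r"\b(cocaine|methamphetamine|amphetamine)\b", re.I)
OPIOID_TERMS = re.compile(r"\b(heroin|fentanyl|opioid|oxycodone)\b", re.I)
# Crude negation cue: a negating word earlier in the same sentence.
NEGATION = re.compile(r"\b(denies|negative for|no)\b", re.I)


def has_code(codes, prefixes):
    """True if any diagnosis code starts with one of the given prefixes."""
    return any(c.replace(".", "").upper().startswith(prefixes) for c in codes)


def has_mention(pattern, note):
    """True if the note has a non-negated keyword mention."""
    for sentence in re.split(r"[.\n]", note):
        for match in pattern.finditer(sentence):
            if not NEGATION.search(sentence[:match.start()]):
                return True
    return False


def flag_encounter(codes, note):
    """Combine structured codes and note text into involvement flags."""
    stimulant = has_code(codes, STIMULANT_PREFIXES) or has_mention(STIMULANT_TERMS, note)
    opioid = has_code(codes, OPIOID_PREFIXES) or has_mention(OPIOID_TERMS, note)
    return {"stimulant": stimulant, "opioid": opioid,
            "co_involved": stimulant and opioid}


# Toy encounters, invented for this example.
by_code = flag_encounter(["T40.5X1A"], "Patient denies opioid use.")
by_note = flag_encounter(["J18.9"], "UDS positive for methamphetamine. "
                                    "Patient reports fentanyl use.")
print(by_code)
print(by_note)
```

In this toy setup the first encounter is flagged from its diagnosis code alone, while the second is flagged only from the note text, which is the kind of case that structured data by itself would miss.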

PROJECT PURPOSE & GOALS

The overall goal of the project is to improve data capacity for patient-centered outcomes research related to substance use disorders via the following objectives:

  • Develop an algorithm using ML and NLP techniques to identify hospital encounters involving the use of illicit stimulants, misuse of prescription stimulants, and the co-involvement of stimulants and illicit opioids or misused prescription opioids
  • Apply the newly developed stimulant-involved algorithm and the previously developed opioid-involved algorithm to the 2020 NHCS data to produce:
    • A 2020 NHCS enhanced restricted-use dataset with additional information on patients with stimulant-involved hospital encounters, with associated documentation, available to researchers through the Federal and NCHS research data centers (RDCs)
  • Apply the enhanced stimulant identification algorithm to external datafiles of similar structure to produce:
    • An NCHS Vital and Health Statistics Series 2 Report summarizing the development of the enhanced stimulant identification algorithm and the findings from its application to external datafiles, including a description of each datafile, algorithm performance measures (where applicable), and recommended algorithm applications
  • Develop an interactive dashboard to display national estimates on stimulant-involved and opioid-involved hospital utilization and care related to drug-involved hospital encounters and update the Annual Hospital Report (AHR) portal