Automated Data Extraction for Clinical Databases using Natural Language Processing

Start: 2023-10-16T10:00:00Z
Location: INFORMS Annual Meeting 2023

Georgios Margaritis, Periklis Petridis, Dimitris Bertsimas, Robert Habib, David Shahian, Agni Orfanoudaki

Abstract

Electronic health records (EHRs) are rich sources of multimodal clinical data, yet they remain underutilized because they are largely unstructured and extracting structured information from them requires manual effort. This work focuses on automating the population of the Society of Thoracic Surgeons Adult Cardiac Surgery Database (STS ACSD), a widely used clinical registry with over 1,000 structured variables. Almost all US cardiac surgery programs currently populate the STS database on a regular basis, a process that requires extensive manual work and resources. We present an AI-assisted pipeline, trained and validated on Mass General Brigham (MGB) data, that can automatically populate 49.5% of the registry with accuracy exceeding 99%. External validation at Hartford Healthcare shows similar accuracy with a 43.2% completion rate. Our results highlight the potential of AI to automate EHR data abstraction at scale, enhancing the utility, scope, and availability of EHR-derived data for a variety of downstream applications.

Date

Oct 16, 2023 10:00 AM

Location

INFORMS Annual Meeting 2023

Phoenix, Arizona

Natural Language Processing, Deep Learning, Healthcare
