Surpassing GPT-4 Medical Coding with a Two-Stage Approach
Abstract
Recent advances in large language models (LLMs) show potential for clinical applications, such as clinical decision support and trial recommendations. However, the GPT-4 LLM predicts an excessive number of ICD codes for medical coding tasks, leading to high recall but low precision. To tackle this challenge, we introduce LLM-codex, a two-stage approach to predict ICD codes that first generates evidence proposals using an LLM and then employs an LSTM-based verification stage. The LSTM learns from both the LLM's high recall and human expert's high precision, using a custom loss function. Our model is the only approach that simultaneously achieves state-of-the-art results in medical coding accuracy, accuracy on rare codes, and sentence-level evidence identification to support coding decisions without training on human-annotated evidence according to experiments on the MIMIC dataset.
- Publication:
-
arXiv e-prints
- Pub Date:
- November 2023
- DOI:
- 10.48550/arXiv.2311.13735
- arXiv:
- arXiv:2311.13735
- Bibcode:
- 2023arXiv231113735Y
- Keywords:
-
- Computer Science - Computation and Language
- E-Print:
- Extended Abstract presented at Machine Learning for Health (ML4H) symposium 2023, December 10th, 2023, New Orleans, United States, 19 pages