A Survey of Large Language Models in Medicine: Progress, Application, and Challenge

2311.05112

Published 5/16/2024 by Hongjian Zhou, Fenglin Liu, Boyang Gu, Xinyu Zou, Jinfa Huang, Jinge Wu, Yiru Li, Sam S. Chen, Peilin Zhou, Junling Liu and 9 others

cs.CL cs.AI

Abstract

Large language models (LLMs), such as ChatGPT, have received substantial attention due to their capabilities for understanding and generating human language. While there has been a burgeoning trend in research focusing on the employment of LLMs in supporting different medical tasks (e.g., enhancing clinical diagnostics and providing medical education), a review of these efforts, particularly their development, practical applications, and outcomes in medicine, remains scarce. Therefore, this review aims to provide a detailed overview of the development and deployment of LLMs in medicine, including the challenges and opportunities they face. In terms of development, we provide a detailed introduction to the principles of existing medical LLMs, including their basic model structures, number of parameters, and sources and scales of data used for model development. It serves as a guide for practitioners in developing medical LLMs tailored to their specific needs. In terms of deployment, we offer a comparison of the performance of different LLMs across various medical tasks, and further compare them with state-of-the-art lightweight models, aiming to provide an understanding of the advantages and limitations of LLMs in medicine. Overall, in this review, we address the following questions: 1) What are the practices for developing medical LLMs? 2) How to measure the medical task performance of LLMs in a medical setting? 3) How have medical LLMs been employed in real-world practice? 4) What challenges arise from the use of medical LLMs? and 5) How to more effectively develop and deploy medical LLMs? By answering these questions, this review aims to provide insights into the opportunities for LLMs in medicine and serve as a practical resource. We also maintain a regularly updated list of practical guides on medical LLMs at: https://github.com/AI-in-Health/MedLLMsPracticalGuide.

Overview

This paper provides a comprehensive overview of the development and deployment of large language models (LLMs) in the medical field.
It addresses key questions around best practices for developing medical LLMs, measuring their performance, real-world applications, challenges, and strategies for more effective development and deployment.
The paper aims to offer insights into the opportunities and practical guidance for leveraging LLMs in medicine.

Plain English Explanation

Large language models (LLMs) like ChatGPT have gained significant attention for their ability to understand and generate human language. In the medical field, researchers have been exploring how to use LLMs to enhance various tasks, such as improving clinical diagnostics and providing better medical education.

This review paper takes a deep dive into the development and practical use of LLMs in medicine. It covers the key principles behind existing medical LLMs, including their model structures, the amount of data used to train them, and how they perform on different medical tasks when compared to other state-of-the-art models.

The paper also examines how medical LLMs have been employed in real-world settings, the challenges that have arisen, and strategies for more effectively developing and deploying these models in the future. The goal is to provide a comprehensive resource that offers insights into the potential of LLMs in healthcare and practical guidance for practitioners.

Technical Explanation

The paper begins by providing a detailed introduction to the principles underlying existing medical LLMs. This includes information about their basic model structures, the number of parameters they have, and the sources and scale of data used for their development. This serves as a guide for practitioners who want to build their own medical LLMs tailored to specific needs.

Next, the paper compares the performance of different LLMs across various medical tasks, and also compares them to state-of-the-art lightweight models. This helps to understand the advantages and limitations of using LLMs in a medical context.
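As a rough illustration of what such a comparison involves, the sketch below computes per-model, per-task accuracy from labeled predictions. It is only a minimal example under stated assumptions: the function name, the model and task labels, and the toy records are all hypothetical and are not taken from the paper, which may rely on different benchmarks and metrics.

```python
from collections import defaultdict


def accuracy_by_model_and_task(records):
    """Return {(model, task): accuracy} for (model, task, predicted, gold) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for model, task, predicted, gold in records:
        key = (model, task)
        total[key] += 1
        if predicted == gold:
            correct[key] += 1
    return {key: correct[key] / total[key] for key in total}


# Hypothetical toy records for illustration only; these are not results from the paper.
records = [
    ("general_llm", "multiple_choice_qa", "B", "B"),
    ("general_llm", "multiple_choice_qa", "A", "C"),
    ("lightweight_model", "multiple_choice_qa", "B", "B"),
    ("lightweight_model", "multiple_choice_qa", "C", "C"),
]

scores = accuracy_by_model_and_task(records)
for (model, task), acc in sorted(scores.items()):
    print(f"{model:20s} {task:20s} accuracy={acc:.2f}")
```

Exact-match accuracy of this kind suits multiple-choice tasks; free-text tasks such as report generation would need a different metric.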

The authors then explore how medical LLMs have been employed in real-world practice, the challenges that have arisen, and strategies for more effective development and deployment. This covers questions such as:

What are the best practices for developing medical LLMs?
How can the performance of medical LLMs be accurately measured in a medical setting?
What are the practical applications of medical LLMs in the real world?
What are the key challenges associated with using medical LLMs?
How can medical LLMs be more effectively developed and deployed?

By addressing these questions, the paper aims to provide a comprehensive resource that offers insights into the opportunities for LLMs in medicine and serves as a practical guide for practitioners.

Critical Analysis

The paper provides a thorough and well-researched overview of the development and deployment of LLMs in the medical field. However, it does acknowledge several limitations and areas for further research.

For example, the authors note that while the paper compares the performance of different LLMs, the specific benchmarks and evaluation metrics used may not fully capture the nuances of real-world medical tasks. Additional research may be needed to develop more comprehensive and clinically relevant assessment frameworks.

The paper also highlights the need for further exploration of the challenges and ethical considerations around the use of LLMs in sensitive medical domains, such as patient privacy and the potential for biased or inaccurate outputs. Broader research on the responsible development and deployment of LLMs could provide valuable insights in this area.

Overall, the paper serves as a valuable resource for understanding the current state of LLMs in medicine, but it also encourages readers to think critically about the limitations and potential risks involved, and to continue exploring ways to more effectively leverage these powerful models in the healthcare domain.

Conclusion

This comprehensive review paper provides a detailed overview of the development and practical applications of large language models (LLMs) in the medical field. It covers the key principles behind existing medical LLMs, their performance on various tasks, real-world use cases, challenges, and strategies for more effective deployment.

By addressing these critical questions, the paper offers valuable insights into the opportunities and limitations of LLMs in healthcare, and serves as a practical resource for practitioners looking to leverage these powerful models in their own work. The authors also maintain an updated list of guides on medical LLMs, further enhancing the utility of this review.

Overall, this paper represents a significant contribution to the ongoing research on the use of large language models in medical applications, and will likely be a valuable reference for researchers, clinicians, and developers working at the intersection of AI and healthcare.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Large Language Models for Medicine: A Survey

Yanxin Zheng, Wensheng Gan, Zefeng Chen, Zhenlian Qi, Qian Liang, Philip S. Yu

To address challenges in the digital economy's landscape of digital intelligence, large language models (LLMs) have been developed. Improvements in computational power and available resources have significantly advanced LLMs, allowing their integration into diverse domains for human life. Medical LLMs are essential application tools with potential across various medical scenarios. In this paper, we review LLM developments, focusing on the requirements and applications of medical LLMs. We provide a concise overview of existing models, aiming to explore advanced research directions and benefit researchers for future medical applications. We emphasize the advantages of medical LLMs in applications, as well as the challenges encountered during their development. Finally, we suggest directions for technical integration to mitigate challenges and potential research directions for the future of medical LLMs, aiming to meet the demands of the medical field better.

5/24/2024

cs.CL cs.AI cs.CY

A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine

Hanguang Xiao, Feizhong Zhou, Xingyue Liu, Tianqi Liu, Zhipeng Li, Xin Liu, Xiaoxuan Huang

Since the release of ChatGPT and GPT-4, large language models (LLMs) and multimodal large language models (MLLMs) have garnered significant attention due to their powerful and general capabilities in understanding, reasoning, and generation, thereby offering new paradigms for the integration of artificial intelligence with medicine. This survey comprehensively overviews the development background and principles of LLMs and MLLMs, as well as explores their application scenarios, challenges, and future directions in medicine. Specifically, this survey begins by focusing on the paradigm shift, tracing the evolution from traditional models to LLMs and MLLMs, summarizing the model structures to provide detailed foundational knowledge. Subsequently, the survey details the entire process from constructing and evaluating to using LLMs and MLLMs with a clear logic. Following this, to emphasize the significant value of LLMs and MLLMs in healthcare, we survey and summarize 6 promising applications in healthcare. Finally, the survey discusses the challenges faced by medical LLMs and MLLMs and proposes a feasible approach and direction for the subsequent integration of artificial intelligence with medicine. Thus, this survey aims to provide researchers with a valuable and comprehensive reference guide from the perspectives of the background, principles, and clinical applications of LLMs and MLLMs.

5/15/2024

cs.CL

Evaluating large language models in medical applications: a survey

Xiaolan Chen, Jiayang Xiang, Shanfu Lu, Yexin Liu, Mingguang He, Danli Shi

Large language models (LLMs) have emerged as powerful tools with transformative potential across numerous domains, including healthcare and medicine. In the medical domain, LLMs hold promise for tasks ranging from clinical decision support to patient education. However, evaluating the performance of LLMs in medical contexts presents unique challenges due to the complex and critical nature of medical information. This paper provides a comprehensive overview of the landscape of medical LLM evaluation, synthesizing insights from existing studies and highlighting evaluation data sources, task scenarios, and evaluation methods. Additionally, it identifies key challenges and opportunities in medical LLM evaluation, emphasizing the need for continued research and innovation to ensure the responsible integration of LLMs into clinical practice.

5/14/2024

cs.CL cs.AI

A Survey on Medical Large Language Models: Technology, Application, Trustworthiness, and Future Directions

Lei Liu, Xiaoyan Yang, Junchi Lei, Xiaoyang Liu, Yue Shen, Zhiqiang Zhang, Peng Wei, Jinjie Gu, Zhixuan Chu, Zhan Qin, Kui Ren

Large language models (LLMs), such as GPT series models, have received substantial attention due to their impressive capabilities for generating and understanding human-level language. More recently, LLMs have emerged as an innovative and powerful adjunct in the medical field, transforming traditional practices and heralding a new era of enhanced healthcare services. This survey provides a comprehensive overview of Medical Large Language Models (Med-LLMs), outlining their evolution from general to the medical-specific domain (i.e., Technology and Application), as well as their transformative impact on healthcare (e.g., Trustworthiness and Safety). Concretely, starting from the fundamental history and technology of LLMs, we first delve into the progressive adaptation and refinements of general LLM models in the medical domain, especially emphasizing the advanced algorithms that boost the LLMs' performance in handling complicated medical environments, including clinical reasoning, knowledge graph, retrieval-augmented generation, human alignment, and multi-modal learning. Secondly, we explore the extensive applications of Med-LLMs across domains such as clinical decision support, report generation, and medical education, illustrating their potential to streamline healthcare services and augment patient outcomes. Thirdly, recognizing the imperative of responsible innovation, we discuss the challenges of ensuring fairness, accountability, privacy, and robustness in Med-LLMs applications. Finally, we conduct a concise discussion for anticipating possible future trajectories of Med-LLMs, identifying avenues for the prudent expansion of Med-LLMs. By consolidating the above-mentioned insights, this review seeks to provide a comprehensive investigation of the potential strengths and limitations of Med-LLMs for professionals and researchers, ensuring a responsible landscape in the healthcare setting.

6/7/2024

cs.CL cs.LG