PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models

2406.19283

Published 6/28/2024 by Cathy Mengying Fang, Valdemar Danry, Nathan Whitmore, Andria Bao, Andrew Hutchison, Cayden Pierce, Pattie Maes

cs.HC

Abstract

We present PhysioLLM, an interactive system that leverages large language models (LLMs) to provide personalized health understanding and exploration by integrating physiological data from wearables with contextual information. Unlike commercial health apps for wearables, our system offers a comprehensive statistical analysis component that discovers correlations and trends in user data, allowing users to ask questions in natural language and receive generated personalized insights, and guides them to develop actionable goals. As a case study, we focus on improving sleep quality, given its measurability through physiological data and its importance to general well-being. Through a user study with 24 Fitbit watch users, we demonstrate that PhysioLLM outperforms both the Fitbit App alone and a generic LLM chatbot in facilitating a deeper, personalized understanding of health data and supporting actionable steps toward personal health goals.

Overview

The paper presents PhysioLLM, a system that combines wearable devices and large language models (LLMs) to provide personalized health insights.
It explores how LLMs can be leveraged to transform wearable sensor data into actionable information for users.
The system aims to support personalized health tracking and behavior change through a conversational interface.

Plain English Explanation

The paper discusses PhysioLLM, a system that uses wearable devices and large language models (LLMs) to provide people with personalized health insights. Wearable devices, like fitness trackers, can collect a lot of data about a person's physical activity, sleep, and other physiological measures. However, it can be challenging for people to understand and act on all this data.

PhysioLLM tries to address this by using powerful LLMs, which are AI models trained on vast amounts of text data. The researchers found that LLMs can be used to transform the data from wearable devices into easy-to-understand health information and recommendations. This is done through a conversational interface, where users can ask questions and get personalized responses about their health.

For example, a user might ask the system how their sleep patterns have been lately. The LLM would then analyze the user's recent sleep data from their wearable device and provide a summary, along with personalized suggestions for improving their sleep quality. The goal is to make it easier for people to track their health and make positive changes, supported by the insights generated by the LLM.
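
The sleep summary described above can be sketched in a few lines. This is a minimal illustration, not the system's actual code: the function name `summarize_sleep`, the slope cutoff of 0.05 hours per night, and the sample values are all assumptions made for the example.

```python
from statistics import mean

def summarize_sleep(hours_per_night):
    """Summarize recent sleep: average duration plus a simple trend label.

    Hypothetical helper; the 0.05 h/night cutoff is an arbitrary assumption.
    """
    n = len(hours_per_night)
    if n < 2:
        raise ValueError("need at least two nights of data")
    avg = mean(hours_per_night)
    # Least-squares slope of hours against night index (0, 1, 2, ...).
    x_mean = (n - 1) / 2
    num = sum((i - x_mean) * (h - avg) for i, h in enumerate(hours_per_night))
    den = sum((i - x_mean) ** 2 for i in range(n))
    slope = num / den
    if slope > 0.05:
        trend = "improving"
    elif slope < -0.05:
        trend = "declining"
    else:
        trend = "stable"
    return {"average_hours": round(avg, 2), "slope": round(slope, 3), "trend": trend}

# Made-up sample: hours slept over the last seven nights, oldest first.
recent = [6.5, 6.0, 7.0, 5.5, 6.0, 5.0, 5.5]
summary = summarize_sleep(recent)
print(summary)
```

A summary like this is the kind of compact, numeric context an LLM could turn into a plain-language answer.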

Overall, PhysioLLM aims to bridge the gap between the data collected by wearables and the practical health benefits people can gain from it, using the power of large language models.

Technical Explanation

The core of the PhysioLLM system is the integration of wearable sensor data with large language models (LLMs). The system takes physiological data from wearable devices, such as heart rate, activity levels, and sleep patterns, and combines it with contextual information. Per the abstract, a statistical analysis component first discovers correlations and trends in each user's data, and the LLM then draws on those results to generate personalized health insights and recommendations.
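
The abstract describes a statistical analysis component that discovers correlations in user data. A minimal sketch of that idea follows, assuming Pearson correlation over daily values. The metric names, the 0.5 threshold, and the sample numbers are hypothetical, and the source does not state which statistics the system actually computes.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant series: correlation is undefined, report none
    return cov / sqrt(vx * vy)

def find_correlations(daily, target, threshold=0.5):
    """Return (metric, r) pairs whose |r| with the target meets the threshold."""
    findings = []
    for name, values in daily.items():
        if name == target:
            continue
        r = pearson(values, daily[target])
        if abs(r) >= threshold:
            findings.append((name, round(r, 2)))
    # Strongest relationships first.
    return sorted(findings, key=lambda f: abs(f[1]), reverse=True)

# Made-up daily values for five days (assumed metric names).
daily = {
    "steps": [4000, 6000, 8000, 10000, 12000],
    "resting_hr": [66, 64, 63, 61, 60],
    "sleep_score": [70, 74, 78, 82, 86],
}
findings = find_correlations(daily, "sleep_score")
print(findings)
```

Correlation only flags candidate relationships, and with a handful of days it is easily misleading, so a real system would need more data and some significance check before presenting a finding to a user.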

The conversational interface lets users ask questions in natural language and receive generated, personalized insights, and it guides them toward actionable goals. Users therefore get tailored feedback rather than just passively viewing their data.
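
One way to connect precomputed statistics to a conversational model is to place them in the prompt next to the user's question. The sketch below shows only that assembly step. The prompt wording, the `build_prompt` helper, and the sample findings are assumptions for illustration; the source does not give the system's prompts, and no model is called here.

```python
def build_prompt(findings, question, target="sleep_score"):
    """Assemble a text prompt from (metric, r) findings and a user question.

    Hypothetical format; any chat-style LLM could consume the resulting string.
    """
    lines = [
        "You are a health assistant. Base your answer only on the findings below.",
        "Findings:",
    ]
    for metric, r in findings:
        direction = "higher" if r > 0 else "lower"
        lines.append(
            f"- {metric} vs {target} (r={r:+.2f}): "
            f"more {metric} tends to go with {direction} {target}."
        )
    lines.append(f"User question: {question}")
    return "\n".join(lines)

# Made-up findings in the form (metric name, correlation with the target).
sample_findings = [("steps", 1.0), ("resting_hr", -0.99)]
prompt = build_prompt(sample_findings, "Why was my sleep worse this week?")
print(prompt)
```

Keeping the numbers in the prompt, rather than asking the model to compute them, leaves the arithmetic to ordinary code and the explanation to the LLM.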

The system also integrates contextual information alongside the physiological data to further personalize the insights. Graph-augmented LLMs, which enrich prompts with a hierarchical graph of inter- and intra-patient relationships, pursue a similar goal, but that approach comes from the separate work by Subramanian et al. listed under Related Papers, not from the PhysioLLM abstract.

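The researchers evaluated PhysioLLM in a user study with 24 Fitbit watch users, using sleep quality as the case study. They report that PhysioLLM outperformed both the Fitbit App alone and a generic LLM chatbot in facilitating a deeper, personalized understanding of health data and in supporting actionable steps toward personal health goals.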

Critical Analysis

The paper presents a promising approach to leveraging LLMs and wearable data to provide personalized health insights. However, it also acknowledges several important caveats and limitations:

Data Quality and Reliability: The accuracy and reliability of the health insights generated by PhysioLLM are heavily dependent on the quality and completeness of the wearable sensor data. Issues with sensor accuracy or data gaps could lead to suboptimal recommendations.
Privacy and Security Concerns: The system requires users to share sensitive personal health data, raising important questions about privacy and data security that need to be addressed.
Generalizability and Scalability: The researchers tested PhysioLLM with a relatively small group of 24 Fitbit watch users and a single case study, sleep quality. Further research is needed to understand how well the system would scale and perform with a larger, more diverse population.
Clinical Validation: While the user studies showed promising results, the paper does not provide evidence of the system's clinical validity or its ability to produce medically relevant insights. Validation by healthcare professionals would be important for wider adoption.
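
The data quality caveat above can be made concrete with a small guard that runs before any analysis. This is a sketch under assumptions: missing readings are represented as `None`, and the five-day minimum is an arbitrary choice, not a value from the paper.

```python
def paired_complete_days(xs, ys, min_days=5):
    """Keep only days where both metrics were recorded.

    Returns None when too few complete days remain to analyze.
    Hypothetical helper; the min_days default is an assumption.
    """
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < min_days:
        return None
    return pairs

# Made-up week of data with gaps (for example, the watch was not worn).
steps = [5000, None, 7000, 8000, 6500, None, 9000]
sleep_hours = [6.5, 7.0, None, 7.5, 6.0, 6.5, 8.0]

usable = paired_complete_days(steps, sleep_hours, min_days=4)
too_strict = paired_complete_days(steps, sleep_hours, min_days=5)
print(usable, too_strict)
```

Returning `None` instead of a partial result lets the caller tell the user there is not enough data, rather than presenting an insight built on a handful of days.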

Overall, the PhysioLLM concept is an innovative approach to leveraging the power of LLMs and wearable data, but further research and development is needed to address these concerns and establish the system's real-world effectiveness and scalability.

Conclusion

The PhysioLLM paper presents an exciting vision for how large language models and wearable devices can be combined to provide personalized health insights and support behavior change. By transforming the data collected by wearables into easy-to-understand information and recommendations, the system aims to empower users to take a more active role in managing their health.

Pairing a conversational interface with statistical analysis of each user's own data to personalize the health insights is a particularly promising aspect of the research. If the challenges around data quality, privacy, and clinical validation can be addressed, PhysioLLM could have a significant impact on how people engage with and improve their health.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Towards a Personal Health Large Language Model

Justin Cosentino, Anastasiya Belyaeva, Xin Liu, Nicholas A. Furlotte, Zhun Yang, Chace Lee, Erik Schenck, Yojan Patel, Jian Cui, Logan Douglas Schneider, Robby Bryant, Ryan G. Gomes, Allen Jiang, Roy Lee, Yun Liu, Javier Perez, Jameson K. Rogers, Cathy Speed, Shyam Tailor, Megan Walker, Jeffrey Yu, Tim Althoff, Conor Heneghan, John Hernandez, Mark Malhotra, Leor Stern, Yossi Matias, Greg S. Corrado, Shwetak Patel, Shravya Shetty, Jiening Zhan, Shruthi Prabhakara, Daniel McDuff, Cory Y. McLean

In health, most large language model (LLM) research has focused on clinical tasks. However, mobile and wearable devices, which are rarely integrated into such tasks, provide rich, longitudinal data for personal health monitoring. Here we present Personal Health Large Language Model (PH-LLM), fine-tuned from Gemini for understanding and reasoning over numerical time-series personal health data. We created and curated three datasets that test 1) production of personalized insights and recommendations from sleep patterns, physical activity, and physiological responses, 2) expert domain knowledge, and 3) prediction of self-reported sleep outcomes. For the first task we designed 857 case studies in collaboration with domain experts to assess real-world scenarios in sleep and fitness. Through comprehensive evaluation of domain-specific rubrics, we observed that Gemini Ultra 1.0 and PH-LLM are not statistically different from expert performance in fitness and, while experts remain superior for sleep, fine-tuning PH-LLM provided significant improvements in using relevant domain knowledge and personalizing information for sleep insights. We evaluated PH-LLM domain knowledge using multiple choice sleep medicine and fitness examinations. PH-LLM achieved 79% on sleep and 88% on fitness, exceeding average scores from a sample of human experts. Finally, we trained PH-LLM to predict self-reported sleep quality outcomes from textual and multimodal encoding representations of wearable data, and demonstrate that multimodal encoding is required to match performance of specialized discriminative models. Although further development and evaluation are necessary in the safety-critical personal health domain, these results demonstrate both the broad knowledge and capabilities of Gemini models and the benefit of contextualizing physiological data for personal health applications as done with PH-LLM.

6/11/2024

cs.AI cs.CL

Transforming Wearable Data into Health Insights using Large Language Model Agents

Mike A. Merrill, Akshay Paruchuri, Naghmeh Rezaei, Geza Kovacs, Javier Perez, Yun Liu, Erik Schenck, Nova Hammerquist, Jake Sunshine, Shyam Tailor, Kumar Ayush, Hao-Wei Su, Qian He, Cory Y. McLean, Mark Malhotra, Shwetak Patel, Jiening Zhan, Tim Althoff, Daniel McDuff, Xin Liu

Despite the proliferation of wearable health trackers and the importance of sleep and exercise to health, deriving actionable personalized insights from wearable data remains a challenge because doing so requires non-trivial open-ended analysis of these data. The recent rise of large language model (LLM) agents, which can use tools to reason about and interact with the world, presents a promising opportunity to enable such personalized analysis at scale. Yet, the application of LLM agents in analyzing personal health is still largely untapped. In this paper, we introduce the Personal Health Insights Agent (PHIA), an agent system that leverages state-of-the-art code generation and information retrieval tools to analyze and interpret behavioral health data from wearables. We curate two benchmark question-answering datasets of over 4000 health insights questions. Based on 650 hours of human and expert evaluation we find that PHIA can accurately address over 84% of factual numerical questions and more than 83% of crowd-sourced open-ended questions. This work has implications for advancing behavioral health across the population, potentially enabling individuals to interpret their own wearable data, and paving the way for a new era of accessible, personalized wellness regimens that are informed by data-driven insights.

6/12/2024

cs.AI cs.CL

Health-LLM: Large Language Models for Health Prediction via Wearable Sensor Data

Yubin Kim, Xuhai Xu, Daniel McDuff, Cynthia Breazeal, Hae Won Park

Large language models (LLMs) are capable of many natural language tasks, yet they are far from perfect. In health applications, grounding and interpreting domain-specific and non-linguistic data is crucial. This paper investigates the capacity of LLMs to make inferences about health based on contextual information (e.g. user demographics, health knowledge) and physiological data (e.g. resting heart rate, sleep minutes). We present a comprehensive evaluation of 12 state-of-the-art LLMs with prompting and fine-tuning techniques on four public health datasets (PMData, LifeSnaps, GLOBEM and AW_FB). Our experiments cover 10 consumer health prediction tasks in mental health, activity, metabolic, and sleep assessment. Our fine-tuned model, HealthAlpaca exhibits comparable performance to much larger models (GPT-3.5, GPT-4 and Gemini-Pro), achieving the best performance in 8 out of 10 tasks. Ablation studies highlight the effectiveness of context enhancement strategies. Notably, we observe that our context enhancement can yield up to 23.8% improvement in performance. While constructing contextually rich prompts (combining user context, health knowledge and temporal information) exhibits synergistic improvement, the inclusion of health knowledge context in prompts significantly enhances overall performance.

4/30/2024

cs.CL cs.AI cs.LG

Graph-Augmented LLMs for Personalized Health Insights: A Case Study in Sleep Analysis

Ajan Subramanian, Zhongqi Yang, Iman Azimi, Amir M. Rahmani

Health monitoring systems have revolutionized modern healthcare by enabling the continuous capture of physiological and behavioral data, essential for preventive measures and early health intervention. While integrating this data with Large Language Models (LLMs) has shown promise in delivering interactive health advice, traditional methods like Retrieval-Augmented Generation (RAG) and fine-tuning often fail to fully utilize the complex, multi-dimensional, and temporally relevant data from wearable devices. These conventional approaches typically provide limited actionable and personalized health insights due to their inadequate capacity to dynamically integrate and interpret diverse health data streams. In response, this paper introduces a graph-augmented LLM framework designed to significantly enhance the personalization and clarity of health insights. Utilizing a hierarchical graph structure, the framework captures inter and intra-patient relationships, enriching LLM prompts with dynamic feature importance scores derived from a Random Forest Model. The effectiveness of this approach is demonstrated through a sleep analysis case study involving 20 college students during the COVID-19 lockdown, highlighting the potential of our model to generate actionable and personalized health insights efficiently. We leverage another LLM to evaluate the insights for relevance, comprehensiveness, actionability, and personalization, addressing the critical need for models that process and interpret complex health data effectively. Our findings show that augmenting prompts with our framework yields significant improvements in all 4 criteria. Through our framework, we can elicit well-crafted, more thoughtful responses tailored to a specific patient.

6/26/2024

cs.LG cs.AI