epfl-llm

Models by this creator


meditron-7b

epfl-llm

Total Score

204

meditron-7b is a 7 billion parameter model adapted to the medical domain from the Llama-2-7B model. It was developed by the EPFL LLM Team through continued pretraining on a curated medical corpus, including PubMed articles and abstracts, a new dataset of medical guidelines, and general domain data from RedPajama-v1. meditron-7b outperforms Llama-2-7B and PMC-Llama on multiple medical reasoning tasks. The larger meditron-70b model follows the same approach, scaling up to 70 billion parameters; it outperforms Llama-2-70B, GPT-3.5 (text-davinci-003), and Flan-PaLM on medical benchmarks.

Model inputs and outputs

Inputs

Text-only data: the model takes textual input only, with a context length of up to 2,048 tokens for meditron-7b and 4,096 tokens for meditron-70b.

Outputs

Text generation: the model generates text as output. It is not designed for other output modalities such as images or structured data.

Capabilities

The meditron models demonstrate strong performance on a variety of medical reasoning tasks, including medical exam question answering, supporting differential diagnosis, and providing disease information. Their medical domain-specific pretraining allows them to encode and apply relevant medical knowledge more effectively than general-purpose language models.

What can I use it for?

The meditron models are being made available for further testing and assessment as AI assistants to enhance clinical decision-making and improve access to large language models in healthcare. Potential use cases include:

Medical exam question answering
Supporting differential diagnosis
Providing disease information (symptoms, causes, treatments)
General health information queries

However, the maintainers advise against deploying these models directly in medical applications without extensive testing and alignment with specific use cases, as they have not yet been adapted to deliver medical knowledge appropriately, safely, or within professional constraints.
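As a rough illustration of the context-length limits above, the sketch below trims a prompt to fit a fixed token budget. A whitespace tokenizer stands in for the model's real Llama tokenizer, and the function name and `reserve` parameter are illustrative assumptions, not part of the meditron tooling.

```python
# Illustrative sketch: fit a prompt into a fixed context window.
# NOTE: whitespace "tokens" are a stand-in for real Llama tokenizer tokens.

def truncate_to_context(prompt: str, context_len: int = 2048, reserve: int = 256) -> str:
    """Keep the most recent tokens so prompt + generation fit the window.

    `reserve` leaves headroom for the tokens the model will generate.
    """
    budget = context_len - reserve
    tokens = prompt.split()            # stand-in tokenizer (assumption)
    if len(tokens) <= budget:
        return prompt
    return " ".join(tokens[-budget:])  # keep the tail: most recent context
```

Keeping the tail rather than the head is a design choice that suits chat-style use, where the most recent turns matter most; a summarization workload might truncate differently.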
Things to try

While it is possible to use the meditron models to generate text, which can be useful for experimentation, the maintainers strongly recommend against using the models directly for production or for work that may impact people. Instead, they suggest deploying them in a more controlled and interactive way, for example with a high-throughput, memory-efficient inference engine and a user interface that supports chat and text generation. The maintainers have provided a deployment guide using the FastChat platform with the vLLM inference engine, and have collected generations for qualitative analysis through the BetterChatGPT interactive UI.
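The serving route the maintainers describe (FastChat with the vLLM engine) can be sketched roughly as the commands below. The flags shown and the assumption that the weights are pulled from the `epfl-llm/meditron-7b` Hugging Face repo are illustrative; consult the maintainers' deployment guide for the authoritative steps.

```shell
# Rough sketch of a FastChat + vLLM serving stack (flags are illustrative).

# 1. Start the FastChat controller.
python3 -m fastchat.serve.controller &

# 2. Start a vLLM-backed model worker
#    (assumes weights are available as epfl-llm/meditron-7b).
python3 -m fastchat.serve.vllm_worker --model-path epfl-llm/meditron-7b &

# 3. Start a web UI that supports chat and text generation.
python3 -m fastchat.serve.gradio_web_server
```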


Updated 4/29/2024


meditron-70b

epfl-llm

Total Score

177

meditron-70b is a 70 billion parameter Large Language Model (LLM) developed by the EPFL LLM Team. It is adapted from the base Llama-2-70B model through continued pretraining on a curated medical corpus, including PubMed articles, abstracts, medical guidelines, and general domain data. This specialized pretraining allows meditron-70b to outperform Llama-2-70B, GPT-3.5, and Flan-PaLM on multiple medical reasoning tasks.

Model inputs and outputs

meditron-70b is a causal decoder-only transformer language model that takes text-only data as input and generates text as output. The model has a context length of 4,096 tokens.

Inputs

Text-only data

Outputs

Generated text

Capabilities

meditron-70b is designed to encode medical knowledge from high-quality sources. However, the model is not yet adapted to deliver this knowledge safely within professional, actionable constraints. Extensive use-case alignment, testing, and validation are recommended before deploying meditron-70b in medical applications.

What can I use it for?

Potential use cases for meditron-70b include medical exam question answering and supporting differential diagnosis, though the model should be used with caution. The EPFL LLM Team is making meditron-70b available for further testing and assessment as an AI assistant to enhance clinical decision-making and expand access to LLMs in healthcare.

Things to try

Researchers and developers are encouraged to experiment with meditron-70b to assess its capabilities and limitations in the medical domain. However, any outputs or applications should be thoroughly reviewed to ensure safe and responsible use of the model.
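The "causal decoder-only" generation process mentioned above can be illustrated with a toy loop: tokens are produced one at a time, each conditioned on everything generated so far, until a stop token appears or the context limit is reached. The `next_token` callable below is a stand-in for the real model's forward pass, not meditron itself.

```python
from typing import Callable, List

def generate(prompt: List[str],
             next_token: Callable[[List[str]], str],
             context_len: int = 4096,
             stop: str = "</s>") -> List[str]:
    """Toy autoregressive decoding loop (next_token stands in for the model)."""
    out = list(prompt)
    while len(out) < context_len:       # never exceed the context window
        tok = next_token(out)           # condition on the full sequence so far
        if tok == stop:
            break
        out.append(tok)
    return out

# Toy "model": repeats the prompt's last token once, then emits the stop token.
def toy_next(seq: List[str]) -> str:
    return seq[-1] if seq.count(seq[-1]) < 2 else "</s>"
```

For example, `generate(["fever"], toy_next)` returns `["fever", "fever"]`: one token is echoed, then the stop token ends decoding. The same loop shape underlies real sampling, where `next_token` would run the transformer and sample from its output distribution.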


Updated 4/28/2024