Lmsys

Models by this creator

🌿

vicuna-13b-delta-v0

lmsys

Total Score

454

The vicuna-13b-delta-v0 is a chat assistant model developed by LMSYS. It is fine-tuned from the LLaMA language model with supervised instruction fine-tuning on user-shared conversations collected from ShareGPT. The model is available in different versions, including vicuna-7b-delta-v0, vicuna-13b-v1.1, vicuna-7b-v1.3, and vicuna-33b-v1.3, each with its own training details and performance characteristics. These models are intended for research on large language models and chatbots, targeting researchers and hobbyists in natural language processing, machine learning, and artificial intelligence.

Model inputs and outputs

The vicuna-13b-delta-v0 model is an auto-regressive language model that takes in text and generates additional text as output. It can be used for a variety of natural language processing tasks, such as text generation, conversation, and question answering.

Inputs

Text prompts that the model uses to generate additional text.

Outputs

Coherent and contextually relevant text generated in response to the input prompts.

Capabilities

The vicuna-13b-delta-v0 model has been trained on a large corpus of conversational data and can engage in natural, engaging dialogue. It performs well on tasks such as open-ended conversation, task-oriented dialogue, and providing informative, helpful responses to a wide range of queries.

What can I use it for?

The primary use of the vicuna-13b-delta-v0 model is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use it to explore topics such as language generation, dialogue systems, and the societal impacts of AI. The model can also serve as a starting point for developing custom chatbots or virtual assistants for specific applications or domains.

Things to try

Researchers and hobbyists can experiment with the vicuna-13b-delta-v0 model in areas such as task-oriented dialogue, open-ended conversation, and knowledge-intensive question answering. They can also fine-tune the model on domain-specific data to adapt it for specialized applications, or use it as a starting point for developing more advanced chatbots and virtual assistants.


Updated 5/28/2024

📉

vicuna-13b-delta-v1.1

lmsys

Total Score

411

vicuna-13b-delta-v1.1 is a large language model developed by LMSYS. It is fine-tuned from the LLaMA model and trained on user-shared conversations collected from ShareGPT. This "delta model" cannot be used directly; it must be applied on top of the original LLaMA weights to obtain the actual Vicuna weights. Similar models include vicuna-13b-delta-v0, vicuna-7b-delta-v0, vicuna-13b-v1.1, and vicuna-7b-v1.3.

Model inputs and outputs

vicuna-13b-delta-v1.1 is an auto-regressive language model that takes in text and generates new text. It can be used for a variety of natural language processing tasks such as text generation, question answering, and conversational AI.

Inputs

Text prompts

Outputs

Generated text

Capabilities

vicuna-13b-delta-v1.1 has been trained to engage in open-ended dialogue and assist with a wide range of tasks. It demonstrates strong language understanding and generation, allowing it to provide informative and coherent responses. The model can be used for research on large language models and chatbots.

What can I use it for?

The primary use of vicuna-13b-delta-v1.1 is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use the model to explore advancements in these fields. To get started, users can access the model through the command line interface or APIs provided by the maintainer.

Things to try

Experiment with the model's language generation by providing a variety of prompts and observing the outputs. Assess its performance on natural language tasks and compare it to other language models. Explore ways to fine-tune or adapt the model for specific applications or domains.
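Mechanically, applying a delta is per-parameter addition: the published delta stores the difference between the fine-tuned Vicuna weights and the base LLaMA weights, so the release contains none of the original LLaMA weights. The sketch below illustrates the idea with plain Python lists standing in for tensors; the function name and toy checkpoints are hypothetical, and in practice you would run FastChat's weight-conversion tooling over the real checkpoints.

```python
# Sketch: a "delta" checkpoint stores (vicuna_weight - llama_weight) for each
# named parameter. Recovering the usable weights is elementwise addition,
# shown here with small lists of floats standing in for tensors.

def apply_delta(base_weights, delta_weights):
    """Reconstruct fine-tuned weights: target = base + delta, per parameter."""
    if base_weights.keys() != delta_weights.keys():
        raise ValueError("base and delta checkpoints must share parameter names")
    return {name: [b + d for b, d in zip(base_weights[name], delta_weights[name])]
            for name in base_weights}

# Toy "checkpoints": two named parameter vectors (hypothetical values).
llama = {"embed.weight": [0.10, -0.20], "lm_head.weight": [0.30, 0.40]}
delta = {"embed.weight": [0.01, 0.02], "lm_head.weight": [-0.05, 0.00]}

vicuna = apply_delta(llama, delta)
```

The same addition happens tensor-by-tensor over the full model when the real conversion tool runs.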


Updated 5/28/2024

📈

fastchat-t5-3b-v1.0

lmsys

Total Score

346

The fastchat-t5-3b-v1.0 is an open-source chatbot model developed by the lmsys team. It is based on Flan-T5-XL, a version of the T5 language model fine-tuned on a large set of instruction-following tasks. Compared to the original T5, the FLAN-T5 models have been further trained on over 1,000 additional tasks, giving them stronger few-shot and zero-shot performance. The fastchat-t5-3b-v1.0 model was trained by fine-tuning the Flan-T5-XL checkpoint on user-shared conversations from ShareGPT, which allows it to engage in more open-ended and contextual dialogue than the task-oriented FLAN-T5 models. Similar models include longchat-7b-v1.5-32k and the t5-small and t5-base checkpoints from the original T5 release.

Model inputs and outputs

Inputs

Text: The fastchat-t5-3b-v1.0 model takes natural language text as input, such as questions, statements, or instructions.

Outputs

Text: The model outputs generated text, which can be responses to the input, continuations of the input, or answers to questions.

Capabilities

The fastchat-t5-3b-v1.0 model can engage in open-ended dialogue and respond to a wide variety of prompts. It understands context and generates coherent, relevant responses. Because it has been fine-tuned on a large dataset of real conversations, it produces more natural and contextual language than the task-oriented FLAN-T5 models.

What can I use it for?

The primary intended use of the fastchat-t5-3b-v1.0 model is commercial chatbot and virtual assistant applications. Its conversational abilities make it well suited for customer service, virtual agents, and other interactive AI applications. Researchers in natural language processing and machine learning may also find it useful for exploring the capabilities and limitations of large language models.

Things to try

One interesting aspect of the fastchat-t5-3b-v1.0 model is its ability to engage in multi-turn dialogues and maintain context over the course of a conversation. You could provide the model with a series of related prompts and see how it builds upon the previous context. You could also give it open-ended instructions or tasks and observe how it interprets and carries them out.
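Since the model itself is stateless between requests, "maintaining context" in a multi-turn dialogue is usually the client's job: it accumulates the turn history and resends it with every new message. A minimal sketch of that loop, with a stub in place of real model inference and hypothetical Human:/Assistant: role labels (fastchat-t5's actual conversation template may differ):

```python
# Multi-turn context handling: resend the accumulated history each turn.
# generate() is a placeholder for a real model call.

def build_prompt(history, user_msg):
    """Flatten prior (user, assistant) turns plus the new message into one prompt."""
    turns = [f"Human: {h}\nAssistant: {a}" for h, a in history]
    turns.append(f"Human: {user_msg}\nAssistant:")
    return "\n".join(turns)

def chat_turn(history, user_msg, generate):
    prompt = build_prompt(history, user_msg)
    reply = generate(prompt)           # real code: model inference here
    history.append((user_msg, reply))  # context carried into the next turn
    return reply

# Stub "model" that just reports how much context it was given.
history = []
chat_turn(history, "What is T5?", lambda p: f"(saw {p.count('Human:')} turns)")
chat_turn(history, "Who made it?", lambda p: f"(saw {p.count('Human:')} turns)")
```

Each call sees one more turn of context than the last, which is all "memory" amounts to for a stateless chat model.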


Updated 5/28/2024

🧪

vicuna-33b-v1.3

lmsys

Total Score

285

vicuna-33b-v1.3 is an open-source chatbot developed by the Vicuna team at LMSYS. It is an auto-regressive language model based on the transformer architecture, fine-tuned from the LLaMA model on user-shared conversations collected from ShareGPT. This model builds on LLaMA with additional training to improve its conversational abilities. Similar models include vicuna-13b-v1.5-16K and stable-vicuna-13B-HF, which are also fine-tuned versions of LLaMA with different training data and techniques.

Model inputs and outputs

Inputs

Text prompts: The model takes text prompts as input, which can be questions, instructions, or conversational starters.

Outputs

Generated text: The model generates coherent and contextual text responses based on the input prompt. The responses aim to be helpful, detailed, and polite.

Capabilities

vicuna-33b-v1.3 can engage in open-ended conversations, answer questions, and provide informative responses on a wide range of topics. It demonstrates strong language understanding and generation, with the potential to assist users with tasks such as research, analysis, and creative writing.

What can I use it for?

The primary intended use of vicuna-33b-v1.3 is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use it to explore advancements in conversational AI. The model could also be fine-tuned or integrated into applications that require natural language interactions, such as virtual assistants, customer service chatbots, or educational tools.

Things to try

One interesting aspect of vicuna-33b-v1.3 is its ability to carry on back-and-forth conversations, understanding and responding to context. Try asking follow-up questions or providing additional context to see how the model adapts its responses. You can also experiment with different prompting strategies, such as giving specific instructions or framing the interaction as a collaborative task, to further explore the model's capabilities.


Updated 5/28/2024

🤷

vicuna-7b-v1.5

lmsys

Total Score

240

The vicuna-7b-v1.5 model is a chat assistant developed by LMSYS. It is an auto-regressive language model based on the transformer architecture, fine-tuned from Llama 2 on user-shared conversations collected from ShareGPT. The model is intended for research on large language models and chatbots, with the primary users being researchers and hobbyists in natural language processing, machine learning, and artificial intelligence. Similar models include vicuna-33b-v1.3, a Vicuna model fine-tuned from a larger LLaMA base, and vicuna-13B-v1.5-16K-GGML, a GGML version of the 13B Vicuna model optimized for CPU and GPU inference.

Model inputs and outputs

Inputs

Prompt: The model takes a free-form text prompt as input, which can be a question, instruction, or conversational message.

Outputs

Text response: The model generates a coherent text response based on the input prompt. The response aims to be helpful, detailed, and polite.

Capabilities

The vicuna-7b-v1.5 model can engage in open-ended conversations on a wide range of topics. It answers questions, provides explanations, and offers suggestions based on the input prompt. According to the Vicuna team, the model performs strongly on standard benchmarks, human preference tests, and LLM-as-a-judge evaluations, achieving around 90% of the quality of GPT-4.

What can I use it for?

The primary use case for the vicuna-7b-v1.5 model is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can experiment with the model, explore its capabilities, and use it as a starting point for further fine-tuning or development.

Things to try

One interesting aspect of the vicuna-7b-v1.5 model is its fine-tuning on user-shared conversations from ShareGPT. This means the model has been exposed to a diverse range of conversational styles and topics, which could allow it to engage in more natural and context-aware dialogue than models trained on more curated datasets. Experimenting with open-ended conversations on a variety of subjects can help uncover the model's strengths and limitations in real-world settings.
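When experimenting with prompts, formatting matters: Vicuna-family chat models were fine-tuned with a specific conversation template (a system preamble followed by alternating USER/ASSISTANT turns). The sketch below approximates that v1.5-style template; treat the exact strings and the `</s>` turn separator as assumptions, with FastChat's conversation templates being the authoritative source.

```python
# Approximate Vicuna conversation template: system preamble, then
# "USER: ... ASSISTANT: ..." turns. The exact wording is an assumption here.

SYSTEM = ("A chat between a curious user and an artificial intelligence "
          "assistant. The assistant gives helpful, detailed, and polite "
          "answers to the user's questions.")

def vicuna_prompt(turns, next_user_msg):
    """turns: list of (user, assistant) pairs already exchanged."""
    parts = [SYSTEM]
    for user, assistant in turns:
        parts.append(f"USER: {user} ASSISTANT: {assistant}</s>")
    parts.append(f"USER: {next_user_msg} ASSISTANT:")  # model completes from here
    return " ".join(parts)

prompt = vicuna_prompt([("Hi!", "Hello! How can I help?")], "Tell me a joke.")
```

Generation then continues from the trailing "ASSISTANT:", so a mismatched template tends to degrade response quality noticeably.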


Updated 5/28/2024

🛠️

vicuna-13b-v1.5-16k

lmsys

Total Score

218

The vicuna-13b-v1.5-16k is a large language model developed by LMSYS, fine-tuned from the Llama 2 model on user-shared conversations collected from ShareGPT. It is an auto-regressive language model based on the transformer architecture. Similar models include vicuna-7b-v1.5, vicuna-33b-v1.3, and vicuna-13B-v1.5-16K-GGML, all of which are also fine-tuned versions of Llama or LLaMA models.

Model Inputs and Outputs

The vicuna-13b-v1.5-16k model is designed for text generation. It takes text prompts as input and generates relevant, coherent text as output, and can handle a wide range of prompts, from open-ended conversations to specific instructions and tasks.

Inputs

Text prompts of varying lengths, from a few words to multiple paragraphs

Outputs

Generated text that continues or responds to the input prompt, ranging from a few words to multiple paragraphs

Capabilities

The vicuna-13b-v1.5-16k model performs well on a variety of natural language tasks, including open-ended conversation, question answering, and task completion. It can engage in thoughtful and nuanced dialogue, drawing on a broad knowledge base to provide informative and contextually appropriate responses.

What Can I Use It For?

The primary use case for the vicuna-13b-v1.5-16k model is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use it to explore advancements in conversational AI, text generation, and related areas. The model can also be fine-tuned on specific datasets to adapt it for applications such as customer service, content creation, or educational assistants.

Things to Try

Experiment with the vicuna-13b-v1.5-16k model by providing a wide range of prompts, from open-ended questions to specific instructions, and observe how it responds. You can also fine-tune the model on your own datasets to see how it performs on more specialized tasks, or compare it with the similar models mentioned above to understand the tradeoffs between different fine-tuned versions of the Llama and LLaMA architectures.
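One practical consequence of an extended 16k-token context window is that it is still a hard budget: conversations or documents that outgrow it must be trimmed before each request. A minimal sketch of one common client-side strategy, dropping the oldest turns first; whitespace word counts stand in for real tokenizer counts, and all names here are hypothetical:

```python
# Keep the most recent conversation turns that fit a fixed context budget.
# Real clients would count tokens with the model's tokenizer, not words.

def trim_history(turns, new_msg, budget=16000):
    """Return the newest turns whose combined 'token' count fits the budget."""
    cost = lambda text: len(text.split())  # crude stand-in for tokenization
    used = cost(new_msg)                   # the new message always ships
    kept = []
    for turn in reversed(turns):           # walk newest-first
        c = cost(turn)
        if used + c > budget:
            break                          # everything older is dropped
        kept.append(turn)
        used += c
    return list(reversed(kept))            # restore chronological order

turns = ["turn one " * 10, "turn two " * 10, "turn three " * 10]  # 20 words each
kept = trim_history(turns, "latest question", budget=45)
```

With a 45-word budget and a 2-word new message, only the two newest 20-word turns survive; the oldest is dropped.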


Updated 5/28/2024

🌀

vicuna-7b-delta-v1.1

lmsys

Total Score

202

vicuna-7b-delta-v1.1 is a chat assistant developed by LMSYS. It is a fine-tuned version of the LLaMA language model, trained on user-shared conversations collected from ShareGPT.com. This "delta model" is meant to be applied on top of the original LLaMA weights to obtain the actual Vicuna weights. Newer versions of the Vicuna weights are available, so users should check the release instructions for the latest information.

Model inputs and outputs

vicuna-7b-delta-v1.1 is an auto-regressive language model based on the transformer architecture. It takes in text and generates text, making it suitable for a variety of natural language processing tasks.

Inputs

Text prompts

Outputs

Generated text continuations

Capabilities

The primary capability of vicuna-7b-delta-v1.1 is open-ended conversation and assistance with a variety of language-based tasks, such as question answering, summarization, and creative writing.

What can I use it for?

The primary use of vicuna-7b-delta-v1.1 is research on large language models and chatbots. The model is intended for researchers and hobbyists in natural language processing, machine learning, and artificial intelligence, and can be used through a command line interface or via APIs.

Things to try

Try fine-tuning vicuna-7b-delta-v1.1 on your own datasets to adapt it to specific use cases, or use it as a starting point for further research and development of large language models and chatbots.


Updated 5/28/2024

🧪

vicuna-13b-v1.5

lmsys

Total Score

191

vicuna-13b-v1.5 is a large language model developed by LMSYS. It is a 13 billion parameter chat assistant trained by fine-tuning the Llama 2 model on user-shared conversations collected from ShareGPT, and is licensed under the Llama 2 Community License Agreement. Similar models include vicuna-7b-v1.5, vicuna-13b-v1.5-16k, vicuna-7b-v1.5-16k, and vicuna-33b-v1.3.

Model inputs and outputs

vicuna-13b-v1.5 is an autoregressive language model that takes in text and generates text. It can be used for a variety of natural language processing tasks such as language generation, translation, and question answering.

Inputs

Text prompts

Outputs

Generated text responses

Capabilities

vicuna-13b-v1.5 has been trained to engage in open-ended conversation and provide helpful, informative, and coherent responses on a wide range of topics. It can be used for research on large language models and chatbots, as well as practical applications like customer service, content creation, and task assistance.

What can I use it for?

The primary use of vicuna-13b-v1.5 is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use the model to explore topics like conversational AI, language understanding, and knowledge representation. It can also power practical applications such as customer service chatbots, content generation, and task assistance.

Things to try

With vicuna-13b-v1.5, you can experiment with different prompting techniques, such as providing context-specific instructions or engaging the model in multi-turn dialogues. You can also explore its capabilities in language generation, question answering, and task completion. The Vicuna Model Card provides more details on the model's sources, training, and evaluation.


Updated 5/28/2024

🏋️

vicuna-13b-v1.3

lmsys

Total Score

190

The vicuna-13b-v1.3 model is a large language model developed by LMSYS, built by fine-tuning the LLaMA model on user-shared conversations collected from ShareGPT. It is an auto-regressive language model based on the transformer architecture. The model is available in several variants, including vicuna-7b-v1.3, vicuna-13b-v1.1, vicuna-7b-v1.1, and vicuna-33b-v1.3, which differ in size and training details.

Model inputs and outputs

The vicuna-13b-v1.3 model is a text-to-text model, taking in natural language text and generating natural language text in response. It can be used for a variety of tasks, such as question answering, text generation, and dialogue.

Inputs

Natural language text prompts

Outputs

Natural language text responses

Capabilities

The vicuna-13b-v1.3 model has been trained to engage in open-ended dialogue and assist with a wide range of tasks. It can answer questions, provide explanations, and generate creative content. The model performs well on various benchmarks and is particularly capable at understanding and responding to user instructions.

What can I use it for?

The primary use of the vicuna-13b-v1.3 model is research on large language models and chatbots, by researchers and hobbyists in natural language processing, machine learning, and artificial intelligence. Potential use cases include conversational AI assistants, language generation applications, and language understanding systems.

Things to try

Researchers and developers can integrate the vicuna-13b-v1.3 model into custom applications through the command line interface or API endpoints provided by the LMSYS team, using it to prototype and test new ideas in conversational AI and to explore its capabilities and limitations.


Updated 5/28/2024

🚀

vicuna-7b-delta-v0

lmsys

Total Score

162

vicuna-7b-delta-v0 is a chat assistant model developed by LMSYS. It is fine-tuned from the LLaMA model and trained on user-shared conversations from ShareGPT. The model is designed for research on large language models and chatbots, with the primary intended users being researchers and hobbyists in natural language processing, machine learning, and artificial intelligence. Similar models include vicuna-7b-v1.3, vicuna-13b-v1.1, vicuna-7b-v1.5, and vicuna-33b-v1.3, all fine-tuned from LLaMA or Llama 2 and trained on ShareGPT conversations; the differences between these models are detailed in the vicuna_weights_version.md file.

Model inputs and outputs

The vicuna-7b-delta-v0 model is an auto-regressive language model, meaning it generates text one token at a time based on the previous tokens. It takes in a prompt or conversation history as input and generates a response as output.

Inputs

Text prompt or conversation history

Outputs

Generated text, typically in the form of a conversational response

Capabilities

The vicuna-7b-delta-v0 model can engage in open-ended conversations on a wide range of topics. It understands and responds to natural language queries, provides explanations, generates creative content, and assists with tasks such as research, analysis, and problem-solving.

What can I use it for?

The primary use of the vicuna-7b-delta-v0 model is research on large language models and chatbots. Researchers and hobbyists in natural language processing, machine learning, and artificial intelligence can use it to explore topics such as language understanding, text generation, and conversational AI. The model could also be used for educational purposes, such as interactive learning experiences or personalized tutoring. Note, however, that the model is licensed for non-commercial use, so commercial applications would require further consideration.

Things to try

One interesting aspect of the vicuna-7b-delta-v0 model is its ability to engage in multi-turn conversations and maintain context throughout a dialogue. Researchers could test its performance on task-oriented conversations, where it must understand the user's intent and stay relevant and coherent over multiple exchanges. Another area to investigate is its versatility across different prompt types, such as open-ended questions, creative writing prompts, or problem-solving scenarios, which could shed light on its capabilities and limitations in various applications.


Updated 5/28/2024