manticore-13b

Maintainer: openaccess-ai-collective

Total Score

115

Last updated 5/28/2024

🌿

Property        Value
Run this model  Run on HuggingFace
API spec        View on HuggingFace
Github link     No Github link provided
Paper link      No paper link provided


Model overview

manticore-13b is a large language model fine-tuned by the OpenAccess AI Collective from the LLaMA 13B base model on a range of datasets including ShareGPT, WizardLM, and Wizard-Vicuna. It is comparable to other open-source 13B models such as Nous-Hermes-Llama2-13b and has demonstrated strong performance on a range of benchmarks.

Model inputs and outputs

manticore-13b is a text-to-text model, taking in natural language prompts as input and generating relevant, coherent text responses as output. The model can handle a wide variety of prompts, from open-ended questions to detailed instructions.

Inputs

  • Natural language prompts of varying length, from single sentences to multi-paragraph text
  • Prompts can cover a broad range of topics, from creative writing to analysis and problem-solving

Outputs

  • Coherent, relevant text responses generated to address the input prompts
  • Responses can range from short, concise answers to detailed, multi-paragraph outputs
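
To experiment with these inputs and outputs programmatically, here is a minimal sketch using the Hugging Face transformers library. The repo id openaccess-ai-collective/manticore-13b and the Vicuna-style USER:/ASSISTANT: prompt format are assumptions based on the model's Hugging Face page rather than details confirmed here.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "openaccess-ai-collective/manticore-13b"  # assumed Hugging Face repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # ~26 GB of weights for 13B parameters in fp16
    device_map="auto",          # requires the accelerate package
)

# Vicuna-style prompt format (an assumption for this model family)
prompt = "USER: Explain the difference between a list and a tuple in Python.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)

# Strip the prompt tokens and decode only the newly generated response
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```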

Capabilities

The manticore-13b model demonstrates strong capabilities across many domains, including question answering, task completion, and open-ended generation. It is able to draw upon its broad knowledge base to provide informative and insightful responses, and can also engage in more creative and speculative tasks.

What can I use it for?

manticore-13b can be a powerful tool for a variety of applications, such as:

  • Content generation: Generating original text content, such as articles, stories, or scripts
  • Dialogue systems: Building chatbots and virtual assistants that can engage in natural conversations
  • Question answering: Providing detailed and accurate answers to a wide range of questions
  • Task completion: Following complex instructions to complete tasks like research, analysis, or problem-solving

The model's versatility and strong performance make it a valuable resource for researchers, developers, and businesses looking to leverage large language models for their projects.

Things to try

One interesting aspect of manticore-13b is its ability to engage in more open-ended and speculative tasks, such as creative writing or thought experiments. Try prompting the model with ideas or scenarios and see how it responds, exploring the boundaries of its capabilities. You might be surprised by the novel and insightful suggestions it can generate.

Another interesting area to explore is the model's performance on specialized or technical tasks, such as programming, data analysis, or scientific reasoning. While it is a general-purpose language model, manticore-13b may be able to provide valuable assistance in these domains as well.
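
For chat-style experiments like those above, a minimal sketch of assembling a multi-turn prompt follows; the USER:/ASSISTANT: turn separators are the same assumed format as in the earlier generation example, not a confirmed specification.

```python
# Build a multi-turn prompt in an assumed Vicuna-style format.
def build_prompt(turns: list[tuple[str, str]], next_user_message: str) -> str:
    """Concatenate prior (user, assistant) turns plus a new user message."""
    parts = []
    for user_msg, assistant_msg in turns:
        parts.append(f"USER: {user_msg}\nASSISTANT: {assistant_msg}")
    # Leave the final ASSISTANT: open so the model continues from there.
    parts.append(f"USER: {next_user_message}\nASSISTANT:")
    return "\n".join(parts)

history = [("What is a binary search?", "Binary search repeatedly halves a sorted range...")]
print(build_prompt(history, "Now implement it in Python."))
```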



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

🏅

wizard-mega-13b

openaccess-ai-collective

Total Score

105

The wizard-mega-13b model, also known as the Manticore 13B model, is a large language model developed by the OpenAccess AI Collective. It is a fine-tuned version of the LLaMA 13B model, trained on datasets such as ShareGPT, WizardLM, and Wizard-Vicuna. These datasets have been filtered to remove responses where the model indicates it is an AI language model or declines to respond. The Manticore 13B model has also been fine-tuned on additional datasets, including a subset of Alpaca-CoT for roleplay and chain-of-thought prompts, GPT4-LLM-Cleaned, GPTeacher-General-Instruct, and various subsets of the MMLU dataset for specific subjects. This additional fine-tuning aims to produce more helpful, detailed, and polite responses than the original Wizard Mega 13B model.

Model inputs and outputs

Inputs

  • Free-form text prompts that the model uses to generate a response

Outputs

  • Generated text responses, ranging from short, concise answers to longer, more detailed responses depending on the prompt

Capabilities

The wizard-mega-13b model, or Manticore 13B, is capable of generating coherent and contextually appropriate text across a wide range of topics. It can be used for tasks such as question answering, summarization, language generation, and task completion. The model's fine-tuning on datasets like ShareGPT, WizardLM, and Wizard-Vicuna has equipped it to provide more helpful, detailed, and polite responses than the original Wizard Mega 13B model.

What can I use it for?

The Manticore 13B model can be used for a variety of natural language processing tasks, such as:

  • Question answering: answering questions on a wide range of topics with detailed and informative responses
  • Summarization: condensing longer text passages into concise, high-level summaries
  • Language generation: producing coherent and contextually appropriate text, such as stories, articles, or dialogues
  • Task completion: assisting with task-oriented activities, such as writing code, solving math problems, or providing step-by-step instructions

The Hugging Face Spaces demo allows you to try out the Manticore 13B model and see its capabilities in action.

Things to try

Some interesting things to try with the Manticore 13B model include:

  • Experimenting with different types of prompts, such as open-ended questions, specific task instructions, or creative writing prompts, to see the range of responses the model can generate
  • Evaluating the model's ability to provide detailed and helpful answers to questions on a variety of subjects, from science and history to current events and popular culture
  • Assessing the model's coherence and logical reasoning by asking it to break down complex problems or provide step-by-step solutions
  • Exploring the model's potential for creative writing or storytelling by giving it open-ended prompts and seeing the unique narratives it generates

By trying out these and other use cases, you can gain a better understanding of the Manticore 13B model's capabilities and find ways to integrate it into your own projects or workflows.

Read more


🏅

Manticore-13B-GGML

TheBloke

Total Score

66

Manticore-13B-GGML is a large language model released by the OpenAccess AI Collective and maintained by TheBloke. It is a 13-billion-parameter model trained on a diverse corpus of online data. TheBloke has provided a range of quantized versions of the model in the GGML format, allowing for efficient CPU and GPU inference using libraries like llama.cpp and text-generation-webui.

Model inputs and outputs

Inputs

  • Raw text prompts

Outputs

  • Coherent, fluent text generated in response to the input

Capabilities

Manticore-13B-GGML demonstrates strong natural language understanding and generation capabilities across a variety of tasks. It can be used for question answering, summarization, language translation, and open-ended text generation. The quantized GGML versions of the model enable efficient deployment on both CPU and GPU hardware.

What can I use it for?

The Manticore-13B-GGML model can be used for a wide range of natural language processing applications. Some potential use cases include:

  • Building chatbots and conversational agents
  • Generating creative content like stories, poems, or scripts
  • Automating content creation for blogs, social media, or marketing
  • Powering virtual assistants with natural language understanding

Things to try

One interesting aspect of the Manticore-13B-GGML model is the variety of quantization methods available, which allow different tradeoffs between model size, inference speed, and output quality. Experimenting with the different quantized versions is a good way to find the right balance for your specific use case and hardware setup.
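
For local CPU inference, here is a minimal sketch using the llama-cpp-python bindings. Note that recent llama.cpp builds expect the newer GGUF format rather than GGML, so an older llama-cpp-python release (or a converted model file) may be required; the quantized filename below is an assumption.

```python
from llama_cpp import Llama

llm = Llama(
    model_path="Manticore-13B.ggmlv3.q4_0.bin",  # assumed quantized filename
    n_ctx=2048,    # context window size
    n_threads=8,   # CPU threads to use for inference
)

# Same assumed USER:/ASSISTANT: prompt format as the main model
output = llm("USER: Summarize the plot of Hamlet.\nASSISTANT:", max_tokens=200)
print(output["choices"][0]["text"])
```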

Read more


🌐

wizard-mega-13B-GPTQ

TheBloke

Total Score

107

The wizard-mega-13B-GPTQ model is a 13-billion-parameter language model created by the OpenAccess AI Collective and quantized by TheBloke. It is an extension of the original Wizard Mega 13B model, with multiple quantized versions available to choose from based on desired performance and VRAM requirements. Similar models include the wizard-vicuna-13B-GPTQ and WizardLM-7B-GPTQ models, which provide alternative architectures and training datasets.

Model inputs and outputs

The wizard-mega-13B-GPTQ model is a text-to-text transformer model, taking natural language prompts as input and generating coherent, contextual responses. The model was trained on a large corpus of web data, allowing it to engage in open-ended conversations and tackle a wide variety of tasks.

Inputs

  • Natural language prompts or instructions
  • Conversational context, such as previous messages in a chat

Outputs

  • Coherent and contextual natural language responses
  • Continuations of provided prompts
  • Answers to questions or instructions

Capabilities

The wizard-mega-13B-GPTQ model is capable of engaging in open-ended dialogue, answering questions, and generating human-like text on a wide range of topics. It has demonstrated strong performance on language understanding and generation tasks, and can adapt its responses to the specific context and needs of the user.

What can I use it for?

The wizard-mega-13B-GPTQ model can be used for a variety of applications, such as building conversational AI assistants, generating creative writing, summarizing text, and providing explanations of complex topics. The quantized versions available from TheBloke allow for efficient deployment on both GPU and CPU hardware, making the model accessible for a wide range of use cases.

Things to try

One interesting aspect of the wizard-mega-13B-GPTQ model is its ability to engage in multi-turn conversations and adapt its responses based on the context. Try providing the model with a series of related prompts or questions and see how it builds on previous responses to maintain a coherent, natural dialogue. Additionally, experiment with different prompting techniques, such as providing instructions or persona information, to tailor the model's outputs to your specific needs.
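
As a starting point, here is a minimal sketch of loading a GPTQ-quantized checkpoint with Hugging Face transformers, which can read GPTQ repos directly when the optimum and auto-gptq packages are installed. The repo id TheBloke/wizard-mega-13B-GPTQ and the prompt format are assumptions.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/wizard-mega-13B-GPTQ"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)

# transformers detects the GPTQ quantization config in the repo and
# loads the already-quantized weights (requires optimum + auto-gptq).
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "USER: Write a haiku about quantization.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```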

Read more


📈

mythalion-13b

PygmalionAI

Total Score

133

The mythalion-13b model is a merge of the Pygmalion-2 13B and MythoMax L2 13B models, created in collaboration between PygmalionAI and Gryphe. According to the maintainers, this model seems to outperform MythoMax in roleplay and conversation tasks.

Model inputs and outputs

Inputs

  • The model can be prompted using both the Alpaca and Pygmalion/Metharme formatting; the latter uses special role tokens such as <|system|>, <|user|>, and <|model|> to indicate different roles and conversation flow

Outputs

  • Long-form text responses that aim to stay in character and continue the narrative, making the model suitable for fictional writing and roleplaying

Capabilities

The mythalion-13b model is focused on generating engaging, character-driven text for creative writing and roleplay scenarios. It has been trained on a mixture of instruction data, fictional stories, and conversational data to develop its capabilities in these areas.

What can I use it for?

The mythalion-13b model is well-suited for projects involving fictional writing, interactive storytelling, and character-driven roleplaying. This could include applications like interactive fiction, creative writing assistants, and open-ended chatbots. However, the maintainers note that the model was not fine-tuned to be safe or harmless, so it may generate content that is socially unacceptable or factually incorrect.

Things to try

One interesting aspect of the mythalion-13b model is its Pygmalion/Metharme prompting format, which lets the user set a character persona and guide the model's responses to stay in character. Experimenting with different character backgrounds and personas can lead to unique and engaging narrative experiences.
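
For reference, here is a minimal sketch of how a Pygmalion/Metharme-style prompt can be assembled. The role tokens follow the Pygmalion-2 model documentation; the persona and dialogue text are purely illustrative.

```python
# Assemble a Metharme-style prompt with the three role tokens.
persona = "Enter roleplay mode. You are Aria, a sardonic starship engineer."

prompt = (
    f"<|system|>{persona}"
    f"<|user|>The warp core is making a strange noise. What do you do?"
    f"<|model|>"
)

# Feed `prompt` to the model and generate until an end-of-turn token;
# the model continues in-character after the <|model|> token.
print(prompt)
```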

Read more
