openorca-platypus2-13b

Maintainer: niron1

Total Score: 1

Last updated: 9/19/2024

  • Run this model: Run on Replicate
  • API spec: View on Replicate
  • Github link: No Github link provided
  • Paper link: View on Arxiv


Model overview

openorca-platypus2-13b is a powerful language model that combines the capabilities of two prominent models, garage-bAInd/Platypus2-13B and Open-Orca/OpenOrcaxOpenChat-Preview2-13B. Developed by the team at Open-Orca, this merged model builds on the strengths of its predecessors to deliver impressive performance across a range of benchmarks.

Similar models include mistral-7b-openorca, nateraw's OpenOrca fine-tune of Mistral-7B-v0.1, and TheBloke's GGML quantization of OpenOrca Platypus2 13B. These models showcase the versatility and potential of the OpenOrca and Platypus frameworks.

Model inputs and outputs

openorca-platypus2-13b is an autoregressive language model that takes in a text prompt and generates a response. The key input parameters include:

Inputs

  • prompt: The text prompt to be used as input for the model.
  • temperature: A parameter that controls the randomness of the generated output, with higher values leading to more diverse but potentially less coherent responses.
  • max_new_tokens: The maximum number of new tokens the model will generate in response to the input prompt.
  • repetition_penalty: A parameter that penalizes the model for repeating the same words or phrases, encouraging more diverse output.
  • seed: A random number seed used to ensure reproducibility of the model's outputs.

Outputs

  • generated text: The model's response, which can be a continuation of the input prompt or a completely new passage of text.
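
As a concrete illustration, a minimal call through the Replicate Python client might look like the sketch below. The model identifier and input values are illustrative (Replicate usually expects an explicit version hash, omitted here); the parameter names come from the list above.

```python
import replicate  # pip install replicate; needs REPLICATE_API_TOKEN set

# The model identifier is illustrative: copy the exact "owner/name:version"
# string from the model's Replicate page before running.
output = replicate.run(
    "niron1/openorca-platypus2-13b",
    input={
        "prompt": "Explain the difference between nuclear fission and fusion.",
        "temperature": 0.7,
        "max_new_tokens": 256,
        "repetition_penalty": 1.1,
        "seed": 42,  # fixed seed for reproducible output
    },
)

# Replicate language models typically stream the response as an iterator of strings.
print("".join(output))
```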

Capabilities

The openorca-platypus2-13b model has demonstrated impressive performance on a variety of benchmarks, scoring 59.5 on Hendrycks MMLU (5-shot), 62.88 on ARC (25-shot), 83.19 on HellaSwag (10-shot), and 52.69 on TruthfulQA (0-shot).

The model also exhibits strong performance on the AGIEval and BigBench-Hard evaluations, outperforming its base OpenOrcaxOpenChat-Preview2-13B model by 12% and 5% respectively.

What can I use it for?

The openorca-platypus2-13b model can be used for a variety of natural language processing tasks, such as:

  • Content Generation: The model can be used to generate coherent and relevant text, making it useful for tasks like article writing, story generation, and creative writing.
  • Question Answering: With its strong performance on benchmarks like MMLU and TruthfulQA, the model can be used to answer a wide range of questions across various domains.
  • Summarization: The model's ability to generate concise and informative text could be leveraged for summarizing long-form content.
  • Dialogue Systems: The model's conversational capabilities make it a promising candidate for building chatbots and virtual assistants.
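
As a concrete example of the question-answering use case above, OpenOrca-Platypus2 variants are commonly prompted with an Alpaca-style instruction wrapper (the GPTQ model card below mentions the Alpaca-InstructOnly format). The exact template here is an assumption, so verify it against the model card:

```python
def make_alpaca_prompt(instruction: str) -> str:
    """Wrap a question in an Alpaca-style instruction template; the exact
    template the model expects should be verified against its model card."""
    return f"### Instruction:\n\n{instruction}\n\n### Response:\n"

print(make_alpaca_prompt("What causes the seasons on Earth?"))
```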

Things to try

One interesting aspect of the openorca-platypus2-13b model is its ability to handle instructions and engage in task-oriented dialogue. Try prompting the model with open-ended instructions or requests and observe the range and quality of its responses. Additionally, the model's strong performance on logical reasoning and STEM-focused tasks suggests it could be a valuable tool for scientific and technical applications.
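
One low-effort experiment along these lines is sweeping the temperature while holding the seed fixed, which isolates how sampling randomness changes the output. A hypothetical sketch, reusing the illustrative Replicate call from earlier:

```python
import replicate

for temperature in (0.2, 0.7, 1.2):
    output = replicate.run(
        "niron1/openorca-platypus2-13b",  # illustrative identifier
        input={
            "prompt": "Propose three unusual uses for a paperclip.",
            "temperature": temperature,
            "max_new_tokens": 128,
            "seed": 42,  # fixed seed so only temperature varies
        },
    )
    print(f"--- temperature={temperature} ---")
    print("".join(output))
```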



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models

OpenOrca-Platypus2-13B

Maintainer: Open-Orca

Total Score: 226

The OpenOrca-Platypus2-13B model is a merge of the garage-bAInd/Platypus2-13B and Open-Orca/OpenOrcaxOpenChat-Preview2-13B models. It combines the strengths of the Platypus2-13B model, which was trained on a STEM and logic-based dataset, with the capabilities of the OpenOrcaxOpenChat-Preview2-13B model, which was fine-tuned on a refined subset of the OpenOrca dataset.

Model inputs and outputs

The OpenOrca-Platypus2-13B model is an auto-regressive language model based on the Llama 2 transformer architecture. It takes in text prompts as input and generates coherent and contextual text as output.

Inputs

  • Text prompts of varying lengths

Outputs

  • Continuation of the input text in a natural and coherent manner
  • Responses to open-ended questions or instructions

Capabilities

The OpenOrca-Platypus2-13B model has demonstrated strong performance on a variety of benchmarks, including the HuggingFace Leaderboard, AGIEval, and BigBench-Hard evaluations. It consistently ranks near the top of the leaderboards for 13B models, showcasing its capabilities in areas like logical reasoning, general knowledge, and open-ended language understanding.

What can I use it for?

The OpenOrca-Platypus2-13B model can be used for a wide range of natural language processing tasks, such as:

  • General-purpose language generation, including creative writing, story generation, and dialogue systems
  • Question answering and information retrieval
  • Logical reasoning and problem-solving
  • Summarization and text comprehension

Given its strong performance on benchmarks, this model could be particularly useful for applications that require advanced language understanding and reasoning abilities, such as virtual assistants, educational tools, and scientific research.

Things to try

One interesting aspect of the OpenOrca-Platypus2-13B model is its ability to combine the strengths of its two parent models. By merging the STEM and logic-focused Platypus2-13B with the more general-purpose OpenOrcaxOpenChat-Preview2-13B, the resulting model may be able to excel at both specialized, technical tasks and open-ended language understanding. Prompts that require a mix of analytical and creative thinking could be a fruitful area to explore with this model.
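
Because this checkpoint lives on the Hugging Face Hub, a standard transformers loading path should apply. A minimal sketch, assuming the Open-Orca/OpenOrca-Platypus2-13B repo name and an Alpaca-style prompt (both worth verifying against the model card):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Open-Orca/OpenOrca-Platypus2-13B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # 13B parameters: fp16 needs roughly 26+ GB of GPU memory
    device_map="auto",
)

prompt = "### Instruction:\n\nName three branches of STEM and give one example problem from each.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```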


mistral-7b-openorca

Maintainer: nateraw

Total Score: 65

The mistral-7b-openorca is a large language model developed by Mistral AI and fine-tuned on the OpenOrca dataset. It is a 7 billion parameter model that has been trained to engage in open-ended dialogue and assist with a variety of tasks. This model can be seen as a successor to the Mistral-7B-v0.1 and Dolphin-2.1-Mistral-7B models, which were also based on the Mistral-7B architecture but fine-tuned on different datasets.

Model inputs and outputs

The mistral-7b-openorca model takes a text prompt as input and generates a response as output. The input prompt can be on any topic and the model will attempt to provide a relevant and coherent response. The output is returned as a list of string tokens.

Inputs

  • Prompt: The text prompt that the model will use to generate a response.
  • Max new tokens: The maximum number of tokens the model should generate as output.
  • Temperature: The value used to modulate the next token probabilities.
  • Top K: The number of highest probability tokens to consider for generating the output.
  • Top P: A probability threshold for generating the output, using nucleus filtering.
  • Presence penalty: A penalty applied to tokens based on their previous appearance in the output.
  • Frequency penalty: A penalty applied to tokens based on their overall frequency in the output.
  • Prompt template: A template used to format the input prompt, with a placeholder for the actual prompt text.

Outputs

  • Output: A list of string tokens representing the generated response.

Capabilities

The mistral-7b-openorca model is capable of engaging in open-ended dialogue on a wide range of topics. It can be used for tasks such as answering questions, providing summaries, and generating creative content. The model's performance is likely comparable to similar large language models, such as the Dolphin-2.2.1-Mistral-7B and Mistral-7B-Instruct-v0.2 models, which share the same underlying architecture.

What can I use it for?

The mistral-7b-openorca model can be used for a variety of applications, such as:

  • Chatbots and virtual assistants: The model's ability to engage in open-ended dialogue makes it well-suited for building conversational interfaces.
  • Content generation: The model can be used to generate creative writing, blog posts, or other types of textual content.
  • Question answering: The model can be used to answer questions on a wide range of topics.
  • Summarization: The model can be used to summarize long passages of text.

Things to try

One interesting aspect of the mistral-7b-openorca model is its ability to provide step-by-step reasoning for its responses. By using the provided prompt template, users can instruct the model to "Write out your reasoning step-by-step to be sure you get the right answers!" This can be a useful feature for understanding the model's decision-making process and for educational or analytical purposes.
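
To try the step-by-step behavior described above, the prompt_template input can carry the reasoning instruction. A sketch under stated assumptions: the identifier omits the required version hash, and the ChatML-style template shown is a guess; check the model's Replicate page for its actual default.

```python
import replicate

output = replicate.run(
    "nateraw/mistral-7b-openorca",  # illustrative; include the version hash in practice
    input={
        "prompt": "How many days are in a leap year, and why?",
        # The {prompt} placeholder is filled in by the model server.
        # This ChatML-style template is an assumption; check the model's
        # default prompt_template on its Replicate page.
        "prompt_template": (
            "<|im_start|>system\n"
            "Write out your reasoning step-by-step to be sure you get the right answers!<|im_end|>\n"
            "<|im_start|>user\n{prompt}<|im_end|>\n"
            "<|im_start|>assistant"
        ),
        "max_new_tokens": 256,
        "temperature": 0.7,
    },
)
print("".join(output))
```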


OpenOrca-Platypus2-13B-GPTQ

Maintainer: TheBloke

Total Score: 49

The OpenOrca-Platypus2-13B-GPTQ is a large language model created by Open-Orca and refined by TheBloke. It is based on the Llama 2 architecture and has been trained on a combination of the OpenOrca dataset and a custom dataset focused on STEM and logic tasks. This model builds on the previous OpenOrca Platypus2 13B model, incorporating improvements to its performance and capabilities.

The OpenOrca-Platypus2-13B-GPTQ model is available in various quantized versions optimized for different hardware and performance requirements. These include 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit GPTQ models, as well as 2-8 bit GGUF models for CPU and GPU inference.

Model inputs and outputs

Inputs

  • Prompts: The model takes in natural language prompts that describe a task or request.
  • Instructions: The model can also accept structured instruction-based prompts, such as the Alpaca-InstructOnly format.

Outputs

  • Text generation: The primary output of the model is generated text, which can range from short responses to long-form narratives.
  • Task completion: The model is capable of understanding and completing a variety of tasks described in the input prompts.

Capabilities

The OpenOrca-Platypus2-13B-GPTQ model excels at a wide range of language tasks, including creative writing, question answering, code generation, and more. It has demonstrated strong performance on various benchmarks, including the HuggingFace Leaderboard, AGIEval, and BigBench-Hard. Compared to the original OpenOrca Platypus2 13B model, this version offers improved performance, lower hallucination rates, and longer responses.

What can I use it for?

The OpenOrca-Platypus2-13B-GPTQ model can be used for a variety of applications, such as:

  • Content generation: Create engaging stories, articles, or product descriptions.
  • Conversational AI: Build chatbots and virtual assistants that can engage in natural language interactions.
  • Task completion: Develop applications that can understand and complete complex instructions, such as code generation, math problem-solving, or creative tasks.
  • Research and development: Use the model as a starting point for further fine-tuning or as a benchmark for comparing language model performance.

Things to try

One interesting aspect of the OpenOrca-Platypus2-13B-GPTQ model is its ability to generate long, detailed responses while maintaining coherence and factual accuracy. You can try providing the model with open-ended prompts or instructions and see how it responds. For example, you could ask it to write a story about llamas or solve a complex logic puzzle.

Another avenue to explore is the model's performance on specialized tasks, such as technical writing, scientific analysis, or legal document review. By fine-tuning the model on domain-specific data, you may be able to unlock new capabilities that are tailored to your specific needs.

Verifying the responses for safety and factual accuracy is also an important consideration when using large language models. Developing robust testing and monitoring procedures can help ensure the model is behaving as expected and not producing harmful or inaccurate outputs.
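
Recent transformers releases can load GPTQ checkpoints directly when the optimum and auto-gptq packages are installed. A minimal sketch, assuming the TheBloke/OpenOrca-Platypus2-13B-GPTQ repo name and the Alpaca-InstructOnly prompt mentioned above:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Requires: pip install optimum auto-gptq (for GPTQ support in transformers)
model_id = "TheBloke/OpenOrca-Platypus2-13B-GPTQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

prompt = "### Instruction:\n\nWrite a short haiku about llamas.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```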


OpenOrca-Platypus2-13B-GGML

Maintainer: TheBloke

Total Score: 54

The OpenOrca-Platypus2-13B-GGML is a large language model created by Open-Orca. It is an open-source model that has been trained on explain-tuned datasets, including the WizardLM, Alpaca, and Dolly-V2 datasets. The model has been optimized for reasoning tasks and is designed to excel at understanding the thought process behind answers.

The model is available in a range of quantized formats, including GPTQ and GGML, which allow for efficient inference on both CPUs and GPUs. These files were generously provided by TheBloke, who has also made quantized versions of similar models like the orca_mini_13B-GGML and orca_mini_3B-GGML available.

Model inputs and outputs

The OpenOrca-Platypus2-13B-GGML model is a text-to-text model, meaning it takes text as input and generates text as output. The model can be used for a variety of language tasks, such as question answering, summarization, and open-ended generation.

Inputs

  • Prompts: The model takes natural language prompts as input, which can include instructions, questions, or other text.

Outputs

  • Text generation: The model generates relevant and coherent text in response to the input prompts.

Capabilities

The OpenOrca-Platypus2-13B-GGML model has been designed to excel at reasoning tasks, with the goal of understanding and replicating the thought process behind answers. It has been trained on a diverse range of datasets, which allows it to handle a variety of language tasks with high accuracy.

What can I use it for?

The OpenOrca-Platypus2-13B-GGML model can be used for a wide range of applications, such as:

  • Question answering: The model can be used to answer questions on a variety of topics, drawing upon its broad knowledge base.
  • Summarization: The model can be used to generate concise summaries of longer text, capturing the key points and ideas.
  • Open-ended generation: The model can be used to generate creative, coherent text on a wide range of topics, making it useful for tasks like story writing or content creation.

Things to try

One interesting aspect of the OpenOrca-Platypus2-13B-GGML model is its focus on replicating the thought process behind answers. Users could try providing the model with prompts that require reasoning or explanation, and then analyze the generated responses to better understand how the model approaches these types of tasks.

Additionally, users could experiment with different quantization levels to find the right balance between model performance and resource requirements for their specific use case. The range of quantized models provided by TheBloke offers a variety of options to choose from.
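
For CPU-friendly GGML inference, the ctransformers library is one option. A minimal sketch, assuming the TheBloke/OpenOrca-Platypus2-13B-GGML repo and an illustrative quantized file name; pick an actual .ggmlv3 file from the repository's file list:

```python
from ctransformers import AutoModelForCausalLM  # pip install ctransformers

# model_file is illustrative; choose one of the quantized .bin files
# actually listed in the GGML repository.
llm = AutoModelForCausalLM.from_pretrained(
    "TheBloke/OpenOrca-Platypus2-13B-GGML",
    model_file="openorca-platypus2-13b.ggmlv3.q4_K_M.bin",
    model_type="llama",  # the checkpoint is Llama-2 based
)

prompt = "### Instruction:\n\nExplain, step by step, why the sky is blue.\n\n### Response:\n"
print(llm(prompt, max_new_tokens=200))
```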
