Smaug-34B-v0.1

Maintainer: abacusai

Total Score: 55

Last updated 5/28/2024


  • Run this model: Run on HuggingFace
  • API spec: View on HuggingFace
  • Github link: No Github link provided
  • Paper link: No paper link provided


Model overview

Smaug-34B-v0.1 is a large language model created by the AI research group abacusai. It is a fine-tuned version of jondurbin's bagel-34b-v0.2 model, developed using a new fine-tuning technique called DPO-Positive (DPOP).

The model was trained on a variety of datasets, including pairwise preference versions of ARC, HellaSwag, and MetaMath, as well as other existing datasets. The authors introduce DPOP in their paper "Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive," which shows how this new loss function and training procedure can outperform standard DPO across a wide range of tasks and datasets.
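For readers who want a concrete picture of DPOP, below is a minimal sketch of how such a loss could be computed in PyTorch: the standard DPO preference term plus a penalty that activates when the fine-tuned policy assigns the preferred completion a lower log-probability than the reference model does. The function name, argument layout, and default hyperparameters (beta, lambda_dpop) are illustrative assumptions, not the authors' training code; consult the paper for the exact formulation.

```python
import torch
import torch.nn.functional as F

def dpop_loss(policy_chosen_logps, policy_rejected_logps,
              ref_chosen_logps, ref_rejected_logps,
              beta=0.1, lambda_dpop=5.0):
    """Sketch of a DPO-Positive style loss (hypothetical helper, not the paper's code).

    Each argument is a 1-D tensor of summed log-probabilities of the chosen
    (preferred) or rejected completion under the policy or reference model.
    """
    # Standard DPO log-ratio terms for the preferred and rejected completions.
    chosen_logratio = policy_chosen_logps - ref_chosen_logps
    rejected_logratio = policy_rejected_logps - ref_rejected_logps

    # DPOP-style penalty: nonzero only when the policy is *less* likely than
    # the reference model to produce the preferred completion.
    penalty = torch.clamp(ref_chosen_logps - policy_chosen_logps, min=0.0)

    # Subtracting the penalty inside the sigmoid pushes the policy to keep
    # the preferred completion's likelihood from collapsing.
    logits = beta * (chosen_logratio - rejected_logratio - lambda_dpop * penalty)
    return -F.logsigmoid(logits).mean()
```

In a training loop, these log-probabilities would come from scoring each prompt-completion pair with the policy being trained and a frozen reference model.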

Model inputs and outputs

Inputs

  • Text-based prompts and instructions that the model uses to generate relevant responses.

Outputs

  • Generated text that responds to the input prompt or instruction.
  • The model can be used for a variety of text-to-text tasks, such as language generation, question answering, and task completion (see the usage sketch below).
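The sketch below shows one way to drive the model through the HuggingFace transformers API. It assumes the repository id abacusai/Smaug-34B-v0.1 (taken from the model name above); the prompt wording and generation settings are illustrative, the recommended prompt template should be checked on the model card, and a 34B model needs substantial GPU memory or quantization to run.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Smaug-34B-v0.1"  # repo id assumed from the model name above

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # shard layers across available GPUs
    torch_dtype="auto",  # load in the checkpoint's native precision
)

# Illustrative plain-text prompt; check the model card for the preferred template.
prompt = "Explain in two sentences how preference optimization differs from supervised fine-tuning."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```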

Capabilities

Smaug-34B-v0.1 demonstrates strong performance on a range of benchmarks, including ARC, HellaSwag, MMLU, TruthfulQA, Winogrande, and GSM8K. The authors report an average score of 77.29% across these evaluations.

The model also shows lower measured benchmark contamination than the reference jondurbin/bagel-34b-v0.2 model on ARC, TruthfulQA, and GSM8K.

What can I use it for?

Smaug-34B-v0.1 can be used for a variety of text-to-text tasks, such as language generation, question answering, and task completion. The model's strong performance on benchmarks like ARC and HellaSwag suggests it could be useful for tasks requiring reasoning and understanding, while its lower benchmark contamination gives more confidence that those results will carry over to real-world applications.

Things to try

The authors of Smaug-34B-v0.1 have released their paper and datasets, encouraging the open-source community to build on and improve the model. Researchers and developers interested in large language models, preference optimization, and overcoming failure modes in DPO may find the model and associated materials particularly interesting to explore.




Related Models

Smaug-72B-v0.1

Maintainer: abacusai

Total Score: 450

The Smaug-72B-v0.1 model is a powerful large language model developed by abacusai that has recently taken first place on the Open LLM Leaderboard by HuggingFace. It is the first open-source model to surpass an average score of 80%. Smaug-72B is fine-tuned directly from moreh/MoMo-72B-lora-1.8.7-DPO and is ultimately based on Qwen/Qwen-72B. The model was created using a new fine-tuning technique called DPO-Positive (DPOP) and new pairwise preference versions of datasets like ARC, HellaSwag, and MetaMath.

Model inputs and outputs

The Smaug-72B-v0.1 model is a text-to-text AI model, meaning it takes in text prompts and generates text outputs. The model can handle a wide variety of natural language tasks, from open-ended conversational responses to more structured outputs like answering questions or completing tasks.

Inputs

  • Text prompts: The model accepts free-form text prompts that describe the desired task or output.

Outputs

  • Generated text: The model outputs generated text that responds to or completes the input prompt.

Capabilities

The Smaug-72B-v0.1 model demonstrates impressive performance on a range of benchmarks, including achieving an average score of over 80% on the Open LLM Leaderboard. It excels at tasks like answering questions, generating coherent and relevant text, and reasoning about complex topics. The model's strong performance is attributed to its large size and the innovative DPOP fine-tuning technique used in its development.

What can I use it for?

The Smaug-72B-v0.1 model's capabilities make it well-suited for a variety of applications, such as:

  • Natural language generation: The model can be used to generate human-like text for chatbots, content creation, and other language-based applications.
  • Question answering: The model can be used to answer a wide range of questions on different topics, making it useful for educational and research purposes.
  • Task completion: The model can be fine-tuned or prompted to complete specific tasks, like summarizing text, translating between languages, or even generating code.

Things to try

One interesting aspect of the Smaug-72B-v0.1 model is its strong performance on math-based datasets like MetaMath, which the authors attribute to their new DPOP fine-tuning technique. You could try prompting the model with math-related questions or tasks to see its reasoning and problem-solving capabilities. Additionally, the model's high-quality text generation could be used for creative writing, storytelling, or other language-focused projects.
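To probe that math reasoning in practice, one could pose a word problem and ask for step-by-step working, as in the minimal sketch below. The repository id abacusai/Smaug-72B-v0.1 is inferred from the model name, the prompt is illustrative, and a 72B checkpoint realistically requires multi-GPU sharding or quantization to run.

```python
from transformers import pipeline

# Hypothetical setup: device_map="auto" shards the 72B model across GPUs.
generator = pipeline(
    "text-generation",
    model="abacusai/Smaug-72B-v0.1",  # repo id inferred from the model name
    device_map="auto",
)

prompt = (
    "A train travels 60 km in 45 minutes. At the same speed, "
    "how far does it travel in 2 hours? Show your reasoning step by step."
)

result = generator(prompt, max_new_tokens=200, do_sample=False)
print(result[0]["generated_text"])
```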



Llama-3-Smaug-8B

Maintainer: abacusai

Total Score: 80

Llama-3-Smaug-8B is a large language model developed by Abacus.AI using the Smaug recipe for improving performance on real-world multi-turn conversations. It is built on top of the meta-llama/Meta-Llama-3-8B-Instruct model. Compared to the base Meta-Llama-3-8B-Instruct model, this version uses new techniques and new data that allow it to outperform the base model on key benchmarks like MT-Bench.

Model inputs and outputs

The Llama-3-Smaug-8B model takes in text as input and generates text as output. It is designed for open-ended natural language tasks and can be used for a variety of applications, from language generation to question answering.

Inputs

  • Text prompts for the model to continue or respond to.

Outputs

  • Continuation of the input text.
  • Answers to questions.
  • Descriptions, summaries, or other text generation tasks.

Capabilities

The Llama-3-Smaug-8B model is capable of engaging in multi-turn conversations and performing well on a variety of language understanding and generation benchmarks. It outperforms the base Meta-Llama-3-8B-Instruct model on the MT-Bench evaluation, achieving higher scores on both the first and second turns.

What can I use it for?

The Llama-3-Smaug-8B model can be used for a wide range of natural language processing tasks, including:

  • Building conversational AI assistants
  • Generating human-like text for creative writing or content creation
  • Answering questions and providing information
  • Summarizing long-form text
  • Translating between languages

The model's strong performance on multi-turn conversations makes it well-suited for developing interactive chatbots and virtual assistants.

Things to try

One interesting thing to try with the Llama-3-Smaug-8B model is generating multi-turn dialogues. The model's ability to maintain context and coherence across turns allows for the creation of more natural and engaging conversations. You could also experiment with using the model for creative writing, task-oriented dialogue, or other applications that require sustained language generation.
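A simple way to exercise that multi-turn behaviour is to feed a short conversation through the tokenizer's chat template, as in the sketch below. The repository id abacusai/Llama-3-Smaug-8B is inferred from the model name, the messages are made up for illustration, and the chat template is assumed to be the Llama-3-Instruct one shipped with the tokenizer.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacusai/Llama-3-Smaug-8B"  # repo id inferred from the model name
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype="auto")

# A two-turn exchange: the second user turn depends on the first answer,
# which is exactly what multi-turn benchmarks like MT-Bench measure.
messages = [
    {"role": "user", "content": "Suggest a three-day itinerary for Kyoto."},
    {"role": "assistant", "content": "Day 1: Fushimi Inari and Gion. Day 2: Arashiyama. Day 3: Kinkaku-ji and Nishiki Market."},
    {"role": "user", "content": "Now adapt that plan for a rainy week on a tight budget."},
]

input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(input_ids, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][input_ids.shape[1]:], skip_special_tokens=True))
```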


bagel-dpo-7b-v0.1

Maintainer: jondurbin

Total Score: 42

The bagel-dpo-7b-v0.1 model is a fine-tuned version of jondurbin/bagel-7b-v0.1 that has been optimized using direct preference optimization (DPO). It was created by jondurbin and is based on the bagel framework. This DPO version aims to address issues where the original model may have refused requests, providing a more robust and uncensored assistant.

Model inputs and outputs

The bagel-dpo-7b-v0.1 model is a large language model capable of generating human-like text. It takes in natural language prompts as input and produces coherent, contextual responses.

Inputs

  • Free-form text prompts that can cover a wide range of topics and tasks, such as:
    • Questions or statements that require reasoning, analysis, or generation
    • Requests for creative writing, code generation, or task completion
    • Open-ended conversations

Outputs

  • Coherent, contextual text responses that aim to fulfill the given prompt or continue the conversation.
  • Responses can range from short phrases to multi-paragraph outputs.

Capabilities

The bagel-dpo-7b-v0.1 model demonstrates strong performance across a variety of benchmarks, including ARC Challenge, BoolQ, GSM8K, HellaSwag, MMLU, OpenBookQA, PIQA, TruthfulQA, and Winogrande. It outperforms the original bagel-7b-v0.1 model on many of these tasks.

What can I use it for?

The bagel-dpo-7b-v0.1 model can be used for a wide range of natural language processing and generation tasks, such as:

  • Question answering and information retrieval
  • Conversational AI and chatbots
  • Creative writing and storytelling
  • Code generation and programming assistance
  • Summarization and content generation

Given its improved performance over the original bagel-7b-v0.1 model, the bagel-dpo-7b-v0.1 may be particularly well-suited for applications that require more robust and uncensored responses.

Things to try

One interesting aspect of the bagel-dpo-7b-v0.1 model is its use of multiple prompt formats, including Vicuna, Llama-2 chat, and a ChatML-inspired format. This allows the model to generalize better to a variety of prompting styles. You could experiment with these different formats to see which works best for your specific use case. Additionally, the model was trained on a diverse set of data sources, including various instruction datasets, plain text, and DPO pairs. This broad training data may enable the model to excel at a wide range of tasks. You could try prompting the model with different types of queries and observe its performance.
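For illustration, here is roughly what those three prompt styles look like as raw strings. These follow the generic Vicuna, Llama-2 chat, and ChatML conventions rather than being copied from the bagel repository, so treat the exact wording, spacing, and special tokens as assumptions and verify them against the model card before relying on them.

```python
instruction = "List three unusual uses for a paperclip."

# Vicuna-style prompt: a system line followed by USER/ASSISTANT turns.
vicuna_prompt = (
    "A chat between a curious user and an artificial intelligence assistant.\n"
    f"USER: {instruction}\n"
    "ASSISTANT:"
)

# Llama-2 chat-style prompt: the instruction wrapped in [INST] ... [/INST].
llama2_prompt = f"[INST] {instruction} [/INST]"

# ChatML-inspired prompt: role blocks delimited by <|im_start|>/<|im_end|>.
chatml_prompt = (
    "<|im_start|>user\n"
    f"{instruction}<|im_end|>\n"
    "<|im_start|>assistant\n"
)
```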



bagel-dpo-34b-v0.2

Maintainer: jondurbin

Total Score: 96

bagel-dpo-34b-v0.2 is an experimental fine-tune of the yi-34b-200k model by maintainer jondurbin. It was created using the bagel tool and includes the toxic DPO dataset, aiming to produce less censored outputs than similar models. This version may be helpful for users seeking a more uncensored AI assistant.

Model inputs and outputs

Inputs

  • Text prompts and instructions provided to the model.

Outputs

  • Coherent, open-ended text responses to the provided prompts and instructions.

Capabilities

The bagel-dpo-34b-v0.2 model is capable of generating detailed, uncensored responses to a wide range of prompts and instructions. It demonstrates strong language understanding and generation abilities, and can be used for tasks like creative writing, open-ended dialogue, and even potentially sensitive or controversial topics.

What can I use it for?

The bagel-dpo-34b-v0.2 model could be useful for researchers, developers, or content creators who require a more uncensored AI assistant. It may be applicable for projects involving creative writing, interactive fiction, or even AI-powered chatbots. However, users should exercise caution, as the model's outputs may contain sensitive or objectionable content.

Things to try

One interesting aspect of the bagel-dpo-34b-v0.2 model is its potential to generate responses on controversial topics that other models may avoid or censor. You could try providing the model with prompts related to sensitive subjects and observe how it responds in an uncensored manner. Just keep in mind that the model's outputs may not always be suitable for all audiences.
