MN-12B-Celeste-V1.9

Maintainer: nothingiisreal

Total Score: 93

Last updated: 9/11/2024


  • Run this model: Run on HuggingFace
  • API spec: View on HuggingFace
  • Github link: No Github link provided
  • Paper link: No paper link provided


Model Overview

The MN-12B-Celeste-V1.9 is a story-writing and roleplaying model developed by nothingiisreal on top of the Mistral Nemo 12B Instruct base model. It was trained on a variety of datasets, including Reddit Writing Prompts, Kalo's Opus 25K Instruct, and c2 logs, to improve its NSFW handling, narration, and use of ChatML tokens.

The model is available in several variants, including Dynamic by Auri, EXL2 quants by Kingbri, and GGUF builds with both static and IMatrix quantizations. Hosted API endpoints are also available, including Featherless, Infermatic, and OpenRouter.
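If you want to try one of those hosted endpoints without running the model locally, the sketch below goes through OpenRouter's OpenAI-compatible API. The model slug is an assumption based on the model name and should be checked against the provider's model list before use.

```python
# Minimal sketch: calling the model through an OpenAI-compatible endpoint
# (OpenRouter shown here). The model slug below is assumed from the model
# name -- verify the exact identifier on the provider's model page.
from openai import OpenAI

client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key="YOUR_OPENROUTER_API_KEY",
)

response = client.chat.completions.create(
    model="nothingiisreal/mn-celeste-12b",  # assumed slug
    messages=[
        {"role": "system", "content": "You are a collaborative fiction co-writer."},
        {"role": "user", "content": "Continue the scene: the lighthouse keeper hears a knock at midnight."},
    ],
    temperature=1.0,
    max_tokens=400,
)
print(response.choices[0].message.content)
```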

Model Inputs and Outputs

Inputs

  • Text prompts for creative writing and roleplaying scenarios

Outputs

  • Coherent and engaging story continuations and responses to roleplaying prompts

Capabilities

The MN-12B-Celeste-V1.9 model excels at creative writing and roleplaying tasks. It can generate immersive narratives, develop complex characters, and respond to open-ended prompts with vivid and imaginative prose. The model's improved NSFW handling and active narration make it well-suited for writing stories with mature themes or fantasy/sci-fi settings.

What can I use it for?

The MN-12B-Celeste-V1.9 model is a great tool for authors, game developers, and creative writers looking to generate inspirational story ideas or expand on existing narratives. It could be used to brainstorm plots, flesh out characters, or generate content for interactive fiction or tabletop roleplaying games.

In a professional setting, this model could be leveraged to produce marketing copy, product descriptions, or other creative business content. Its ability to generate engaging and varied text makes it a potentially valuable asset for companies looking to enhance their digital presence and connect with customers in a more compelling way.

Things to try

One interesting aspect of the MN-12B-Celeste-V1.9 model is its ability to maintain coherence and continuity over multiple turns of a roleplaying scenario. Try engaging the model in an extended back-and-forth conversation, exploring different narrative arcs or character interactions. The model's improved narration and NSFW handling may also make it suitable for crafting more mature or fantastical stories.

Additionally, consider experimenting with the provided sampling settings, as the "Stable" and "Creative" options may result in different styles of output that could be suited to different writing tasks or creative goals.
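As a rough illustration, the sketch below runs a local GGUF quant twice with two different sampler presets via llama-cpp-python. The file name and the preset values are placeholders, not the actual "Stable" and "Creative" settings from the model card, so substitute the card's recommended numbers.

```python
# Sketch: comparing two sampler presets on a local GGUF quant with
# llama-cpp-python. The file name and preset values are placeholders,
# not the model card's published "Stable"/"Creative" settings.
from llama_cpp import Llama

llm = Llama(
    model_path="MN-12B-Celeste-V1.9.Q5_K_M.gguf",  # placeholder file name
    n_ctx=8192,
    chat_format="chatml",  # the model was trained with ChatML tokens
)

presets = {
    "stable":   {"temperature": 0.8, "min_p": 0.10},  # placeholder values
    "creative": {"temperature": 1.2, "min_p": 0.05},  # placeholder values
}

prompt = "Write the opening paragraph of a gothic mystery set on a derelict space station."
for name, params in presets.items():
    out = llm.create_chat_completion(
        messages=[{"role": "user", "content": prompt}],
        max_tokens=300,
        **params,  # min_p requires a reasonably recent llama-cpp-python
    )
    print(f"--- {name} ---")
    print(out["choices"][0]["message"]["content"])
```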



This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Models


MN-12B-Lyra-v1

Maintainer: Sao10K

Total Score: 57

The MN-12B-Lyra-v1 is an experimental general roleplaying model developed by Sao10K. It is a merge of two different Mistral-Nemo 12B fine-tunes, one focused on instruction-following and the other on roleplay and creative writing. The model scored well on EQ-Bench, ranking just below the Nemomix v4 model. Sao10K found that a temperature of 1.2 and a minimum probability (min_p) of 0.1 work well for this model, though they note it can also perform well at lower temperatures. The two constituent fine-tunes were trained on differently formatted datasets, one using Mistral Instruct and one using ChatML; Sao10K found that keeping the datasets separate and combining the resulting models with the della_linear merge method worked better than mixing the datasets together. They also note that the base Nemo 12B model was difficult to train on their datasets, and that they would likely need to do some stage-wise fine-tuning in the future.

Model Inputs and Outputs

Inputs

  • Either [INST] or ChatML input formats work well for this model.

Outputs

  • Text in a general roleplaying and creative writing style.

Capabilities

The MN-12B-Lyra-v1 model excels at general roleplaying tasks, with good performance on EQ-Bench. Sao10K notes that the model can handle a context length of up to 16K tokens, which is sufficient for most roleplaying use cases.

What can I use it for?

The MN-12B-Lyra-v1 model would be well-suited for creative writing, storytelling, and roleplaying applications. Its ability to generate coherent and engaging text could make it useful for interactive fiction, collaborative worldbuilding, or as a foundation for more advanced AI-driven narratives.

Things to try

Sao10K observes that the base Nemo 12B model was difficult to train on their datasets and that stage-wise fine-tuning will likely be needed, which suggests the model may benefit from a more iterative, multi-stage training process for specific tasks or datasets. They also note that the effective context length of 16K tokens may be a limitation for some applications and that further iterations aim to improve on it; experimenting with longer contexts or more advanced prompt engineering could be an interesting area of exploration.
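Those sampler settings translate directly into a generation call. A minimal sketch with Hugging Face transformers follows; the repository id is assumed from the model name, min_p sampling needs a recent transformers release, and the prompt is formatted through the tokenizer's chat template (fall back to a manual [INST] or ChatML prompt if the tokenizer does not ship one).

```python
# Sketch: generating with the settings mentioned above (temperature 1.2,
# min_p 0.1). The repo id is assumed from the model name.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Sao10K/MN-12B-Lyra-v1"  # assumed repo id -- verify on Hugging Face
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [{"role": "user", "content": "Play an innkeeper greeting a rain-soaked traveller."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(
    input_ids,
    max_new_tokens=300,
    do_sample=True,
    temperature=1.2,
    min_p=0.1,  # supported in recent transformers versions
)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```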


miquliz-120b-v2.0

Maintainer: wolfram

Total Score: 85

The miquliz-120b-v2.0 is a 120 billion parameter large language model created by interleaving layers of the miqu-1-70b-sf and lzlv_70b_fp16_hf models using the mergekit tool. It was improved from the previous v1.0 version by incorporating techniques from the TheProfessor-155b model. The model is inspired by goliath-120b and is maintained by Wolfram.

Model Inputs and Outputs

Inputs

  • Text prompts of up to 32,768 tokens in length

Outputs

  • Continuation of the provided text prompt, generating new relevant text

Capabilities

The miquliz-120b-v2.0 model is capable of impressive performance, achieving top ranks and double perfect scores in the maintainer's own language model comparisons and tests. It demonstrates strong general language understanding and generation abilities across a variety of tasks.

What can I use it for?

The large scale and high performance of the miquliz-120b-v2.0 model make it well-suited for language-related applications that require powerful text generation, such as content creation, question answering, and conversational AI. The model could be fine-tuned for specific domains or integrated into products via the CopilotKit open-source platform.

Things to try

Explore the model's capabilities by prompting it with a variety of tasks, from creative writing to analysis and problem solving. The model's size and breadth of knowledge make it an excellent starting point for developing custom language models tailored to your needs.
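A 120B model will not fit on a single consumer GPU, so a common way to experiment with it is 4-bit quantization plus automatic device placement. This is a generic loading sketch, not the maintainer's recommended setup, and the repository id is assumed from the model name.

```python
# Generic sketch: loading a very large merged model in 4-bit with
# bitsandbytes and spreading it across available devices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "wolfram/miquliz-120b-v2.0"  # assumed repo id -- verify on Hugging Face
quant_config = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, quantization_config=quant_config, device_map="auto"
)

prompt = "Summarize the trade-offs of layer-interleaved model merging in three sentences."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```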


NeuralHermes-2.5-Mistral-7B

Maintainer: mlabonne

Total Score: 148

The NeuralHermes-2.5-Mistral-7B model is a fine-tuned version of the OpenHermes-2.5-Mistral-7B model. It was developed by mlabonne and further trained using Direct Preference Optimization (DPO) on the mlabonne/chatml_dpo_pairs dataset. The model surpasses the original OpenHermes-2.5-Mistral-7B on most benchmarks, ranking as one of the best 7B models on the Open LLM leaderboard.

Model Inputs and Outputs

The NeuralHermes-2.5-Mistral-7B model is a text-to-text model that can be used for a variety of natural language processing tasks. It accepts text input and generates relevant text output.

Inputs

  • Text: prompts, questions, or instructions

Outputs

  • Text: responses, answers, or completions

Capabilities

The NeuralHermes-2.5-Mistral-7B model has demonstrated strong performance on a range of tasks, including instruction following, reasoning, and question answering. It can engage in open-ended conversations, provide creative responses, and assist with tasks like writing, analysis, and code generation.

What can I use it for?

The NeuralHermes-2.5-Mistral-7B model can be useful for a wide range of applications, such as:

  • Conversational AI: chatbots and virtual assistants that can engage in natural language interactions
  • Content generation: text-based content such as articles, stories, or product descriptions
  • Task assistance: support for research, analysis, code generation, and problem-solving
  • Educational applications: interactive learning tools and tutoring systems

Things to try

One interesting thing to try with the NeuralHermes-2.5-Mistral-7B model is to use the provided quantized models to explore the model's capabilities on different hardware setups. The quantized versions can be deployed on a wider range of devices, making the model more accessible for a variety of use cases.
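As a quick starting point, the sketch below chats with the model through the transformers text-generation pipeline. The repository id comes from the description above; passing a list of chat messages to the pipeline requires a recent transformers release, and whether the tokenizer ships a ChatML chat template should be confirmed on the model page.

```python
# Sketch: a short chat with NeuralHermes-2.5-Mistral-7B via the
# text-generation pipeline (chat-message input needs a recent transformers).
import torch
from transformers import pipeline

pipe = pipeline(
    "text-generation",
    model="mlabonne/NeuralHermes-2.5-Mistral-7B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

messages = [
    {"role": "system", "content": "You are a concise technical assistant."},
    {"role": "user", "content": "Explain what Direct Preference Optimization changes compared to plain supervised fine-tuning."},
]
result = pipe(messages, max_new_tokens=250)
print(result[0]["generated_text"][-1]["content"])  # last message is the reply
```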


mixtralnt-4x7b-test

Maintainer: chargoddard

Total Score: 56

The mixtralnt-4x7b-test model is an experimental AI model created by the maintainer chargoddard. It is a Sparse Mixture of Experts (MoE) model that combines parts from several pre-trained Mistral models, including Q-bert/MetaMath-Cybertron-Starling, NeverSleep/Noromaid-7b-v0.1.1, teknium/Mistral-Trismegistus-7B, meta-math/MetaMath-Mistral-7B, and PocketDoc/Dans-AdventurousWinds-Mk2-7b. The maintainer is experimenting with a hack to populate the MoE gates in order to take advantage of the experts.

Model Inputs and Outputs

The mixtralnt-4x7b-test model is a text-to-text model, meaning it takes text as input and generates text as output. The specific input and output formats are not clearly defined, but the maintainer suggests the model may use an "alpaca??? or chatml??? format".

Inputs

  • Text prompts in an unspecified format, potentially Alpaca- or ChatML-style

Outputs

  • Generated text in response to the input prompts

Capabilities

The mixtralnt-4x7b-test model is capable of generating coherent text, taking advantage of the experts from the combined Mistral models. However, the maintainer is still experimenting with the hack used to populate the MoE gates, so the full capabilities of the model are not yet known.

What can I use it for?

The mixtralnt-4x7b-test model could potentially be used for a variety of text generation tasks, such as creative writing, conversational responses, or other applications that require generating coherent text. However, since the model is still in an experimental stage, it's unclear how it would perform compared to more established language models.

Things to try

One interesting aspect of the mixtralnt-4x7b-test model is the maintainer's approach of combining parts of several pre-trained Mistral models into a Sparse Mixture of Experts. This technique could lead to improvements in the model's performance and capabilities, but the results are still unknown. It would be worth exploring the model's output quality, coherence, and consistency to see how it compares to other language models, starting with the two candidate prompt formats shown below.
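Since the maintainer is unsure whether an Alpaca-style or ChatML prompt works best, a simple first experiment is to try both. The templates below are standard community formats, not ones confirmed for this model.

```python
# Two candidate prompt formats to compare; neither is confirmed by the
# model card, so feed each to your inference stack and compare outputs.
instruction = "Describe the view from the crow's nest of a pirate ship at dawn."

alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

chatml_prompt = (
    "<|im_start|>user\n"
    f"{instruction}<|im_end|>\n"
    "<|im_start|>assistant\n"
)

print(alpaca_prompt)
print(chatml_prompt)
```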
